1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-24 11:42:57 +01:00
Commit Graph

103462 Commits

Author SHA1 Message Date
Jyotsna Verma
0b0b5038bf [Hexagon] Add new InstrItinClass to support timing classes.
This patch doesn't introduce any functionality change. Test cases will be
added later when v5 support is added.

llvm-svn: 208349
2014-05-08 18:47:08 +00:00
Rafael Espindola
ccc1932aa6 Use for range loops.
llvm-svn: 208348
2014-05-08 18:40:06 +00:00
Sebastian Pop
739ca9316a add testcase for r208237: do not collect undef terms
llvm-svn: 208347
2014-05-08 18:38:58 +00:00
Rafael Espindola
96c7b0effe Use range loop.
llvm-svn: 208346
2014-05-08 18:17:44 +00:00
Matt Arsenault
119209fcfe R600: Promote f64 vector load/stores to i64 for consistency
llvm-svn: 208344
2014-05-08 18:01:56 +00:00
Rafael Espindola
c6c3ed654b Use a range loop.
llvm-svn: 208343
2014-05-08 17:57:50 +00:00
Andrea Di Biagio
7e0e036a46 [X86] Add target specific combine rules to fold SSE2/AVX2 packed arithmetic shift intrinsics.
This patch teaches the backend how to combine packed SSE2/AVX2 arithmetic shift
intrinsics.

The rules are:
 - Always fold a packed arithmetic shift by zero to its first operand;
 - Convert a packed arithmetic shift intrinsic dag node into a ISD::SRA only if
   the shift count is known to be smaller than the vector element size.

This patch also teaches to function 'getTargetVShiftByConstNode' how fold
target specific vector shifts by zero.

Added two new tests to verify that the DAGCombiner is able to fold
sequences of SSE2/AVX2 packed arithmetic shift calls.

llvm-svn: 208342
2014-05-08 17:44:04 +00:00
Saleem Abdulrasool
d43e9730d0 test: fix test on Windows
When building on Windows, the default target is Windows.  Windows on ARM does
not support ARM mode compilation, resulting in test failures.  Simply specify a
triple to ensure that we are testing the correct behaviour.

llvm-svn: 208340
2014-05-08 17:11:29 +00:00
NAKAMURA Takumi
5143373603 Mark test/TableGen/listconcat.td as XFAIL:vg_leak. llvm-tblgen is ignorant of vg_leak.
llvm-svn: 208337
2014-05-08 17:06:10 +00:00
Daniel Sanders
ac1f965519 [mips] Add PredicateControl to InstAlias's
Summary:
No functional change

Depends on D3649

Reviewers: vmedic

Reviewed By: vmedic

Differential Revision: http://reviews.llvm.org/D3672

llvm-svn: 208334
2014-05-08 16:12:31 +00:00
Bradley Smith
e7c79c7b8a [ARM64] Add diagnostics for expected arithmetic shifts
llvm-svn: 208330
2014-05-08 15:40:39 +00:00
Bradley Smith
08ab47ca2d [ARM64] Re-work parsing of ADD/SUB shifted immediate operands
The parsing of ADD/SUB shifted immediates needs to be done explicitly so
that better diagnostics can be emitted, as a side effect this also
removes some of the hacks in the current method of handling this operand
type.

Additionally remove manual CMP aliasing to ADD/SUB and use InstAlias
instead.

llvm-svn: 208329
2014-05-08 15:39:58 +00:00
Daniel Sanders
9f2ecd428c [mips] Correct tests that are meant to test valid assembly. They were actually rejected by GAS.
Summary:
I've noticed a bug in my test generator script that caused 64-bit objects
to be disassembled as if it were using the O32 ABI, giving the wrong register
names. As a result, it generated assembly files that are rejected by GAS when
assembling for the correct ABI. This was caused by the generator setting the
ELF e_flags incorrectly before disassembling the object.

This patch corrects the invalid tests that have already been committed by
replacing the ABI-dependent register names with numeric registers. In addition
to fixing the tests this allows the 32-bit and 64-bit ISA tests to be easily diffed
to produce the invalid-*.s tests which test that instructions defined in later ISA's
are not accepted.

Depends on D3648

Reviewers: vmedic

Reviewed By: vmedic

Differential Revision: http://reviews.llvm.org/D3649

llvm-svn: 208327
2014-05-08 15:17:29 +00:00
Bradley Smith
40ea4329b1 [ARM64] Ensure immediates in extend operands are in a valid range
Also emit a more useful diagnostic when they are not.

llvm-svn: 208318
2014-05-08 14:12:12 +00:00
Bradley Smith
0bcfb4a0bc [ARM64] Check for proper immediate in shift/extend operands
llvm-svn: 208317
2014-05-08 14:11:16 +00:00
Christian Pirker
35d96c7f86 ARM big endian function argument passing
llvm-svn: 208316
2014-05-08 14:06:24 +00:00
Hal Finkel
faaba5686b Fix a spelling error
llvm-svn: 208314
2014-05-08 13:42:57 +00:00
Daniel Sanders
c6c9c916df [mips] Implement l[wd]c3, and s[wd]c3.
Summary:
These instructions were added in MIPS-I, and MIPS-II but were removed in
MIPS-III. Interestingly, GAS continues to accept them when assembling for
MIPS-III.

For the moment, these instructions will follow GAS and accept them for
MIPS-III and newer but this will be tightened up when the invalid-*.s
tests are added.

Depends on D3647

Reviewers: vmedic

Reviewed By: vmedic

Differential Revision: http://reviews.llvm.org/D3648

llvm-svn: 208311
2014-05-08 13:02:11 +00:00
Ed Maste
9040058e24 Add isOSFreeBSD triple test
For http://reviews.llvm.org/D3448

llvm-svn: 208309
2014-05-08 13:00:15 +00:00
Dario Domizioli
df090599ce Revert test commit. Removed blank line.
llvm-svn: 208308
2014-05-08 12:54:43 +00:00
James Molloy
294269a69e [ARM64-BE] Teach fast-isel about how to set up sub-word stack arguments for big endian calls.
SelectionDAG already knows about this, but fast-isel was ignorant.

llvm-svn: 208307
2014-05-08 12:53:50 +00:00
Daniel Sanders
8071a219e6 [mips] Marked up instructions added in MIPS-II and tested that IAS for -mcpu=mips1 does not accept them
Summary:
A small number of instructions are rejected with the wrong error message.
These have been placed in a separate test for now. There seems to be some
parsing quirk that triggers when these instructions are disabled.

Depends on D3571

Reviewers: vmedic

Reviewed By: vmedic

Differential Revision: http://reviews.llvm.org/D3647

llvm-svn: 208305
2014-05-08 12:40:48 +00:00
Daniel Sanders
94fae7d980 [mips] Implement tlbp, tlbr, tlbwi, and tlbwr
Reviewers: vmedic, dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D3571

llvm-svn: 208301
2014-05-08 11:51:18 +00:00
Dario Domizioli
97c0aef837 Test commit. Added blank line.
llvm-svn: 208298
2014-05-08 11:28:14 +00:00
Tim Northover
72838ce201 ARM64: make sure FastISel emits SSA MachineInstrs
We need to use a temporary register for a 2-step operation like REM.

llvm-svn: 208297
2014-05-08 10:30:56 +00:00
Evgeniy Stepanov
196ad52640 [asan] Preserve flags in asm instrumentation.
Patch by Yuri Gorshenin.

llvm-svn: 208296
2014-05-08 09:55:24 +00:00
Daniel Sanders
237454fcb1 Use a vector of unique_ptrs to fix a memory leak introduced in r208179.
Also removed an inaccurate comment that stated that a DenseMap was used as
storage for the ListInit*'s. It's currently using a FoldingSet.

I expect there's a better way to fix this but I haven't found it yet. FoldingSet
is incompatible with the Pool template and I'm not sure if FoldingSet can be
safely replaced with a DenseMap of computed FoldingSetID's to ListInit*'s.

llvm-svn: 208293
2014-05-08 09:29:28 +00:00
Hal Finkel
c52e65b830 Move late partial-unrolling thresholds into the processor definitions
The old method used by X86TTI to determine partial-unrolling thresholds was
messy (because it worked by testing target features), and also would not
correctly identify the target CPU if certain target features were disabled.
After some discussions on IRC with Chandler et al., it was decided that the
processor scheduling models were the right containers for this information
(because it is often tied to special uop dispatch-buffer sizes).

This does represent a small functionality change:
 - For generic x86-64 (which uses the SB model and, thus, will get some
   unrolling).
 - For AMD cores (because they still currently use the SB scheduling model)
 - For Haswell (based on benchmarking by Louis Gerbarg, it was decided to bump
   the default threshold to 50; we're working on a test case for this).
Otherwise, nothing has changed for any other targets. The logic, however, has
been moved into BasicTTI, so other targets may now also opt-in to this
functionality simply by setting LoopMicroOpBufferSize in their processor
model definitions.

llvm-svn: 208289
2014-05-08 09:14:44 +00:00
Tobias Grosser
358e9a97e7 Revert "SCEV: Use I = vector<>.erase(I) to iterate and delete at the same time"
as committed in r208282. The original commit was incorrect.

llvm-svn: 208286
2014-05-08 07:55:34 +00:00
Hao Liu
be513c440d AArch64/ARM64: Port NEON post-increment load/store with 2/3/4 vectors to ARM64 backend.
llvm-svn: 208284
2014-05-08 07:38:13 +00:00
Tobias Grosser
4c447db9fb SCEV: Use I = vector<>.erase(I) to iterate and delete at the same time
llvm-svn: 208282
2014-05-08 07:12:44 +00:00
Richard Smith
34ed6bf95c [modules] Add missing #include.
llvm-svn: 208276
2014-05-08 02:34:32 +00:00
Saleem Abdulrasool
fffc610ca8 test: fix silly typo
Oh silly Darwin and your case insensitive file system.

llvm-svn: 208274
2014-05-08 01:41:04 +00:00
Saleem Abdulrasool
84a61727f4 ARM: support FK_SecRel_2 relocations on WoA
This adds FK_SecRel_2 relocation support to ARM.  This enables the building of
object files for armv7-windows-msvc which enables CodeView line tables for
debugging as opposed to armv7-windows-itanium which currently uses DWARF.

llvm-svn: 208273
2014-05-08 01:35:57 +00:00
Richard Smith
90c10bd484 Simplify and fix incorrect comment. No functionality change.
llvm-svn: 208272
2014-05-08 01:08:43 +00:00
Filipe Cabecinhas
275860c4fd Lower certain build_vectors to insertps instructions
Summary:
Vectors built with zeros and elements in the same order as another
(source) vector are optimized to be built using a single insertps
instruction.
Also optimize when we move one element in a vector to a different place
in that vector while zeroing out some of the other elements.

Further optimizations are possible, described in TODO comments.
I will be implementing at least some of them in the near future.

Added some tests for different cases where this optimization triggers.

Reviewers: nadav, delena, craig.topper

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D3521

llvm-svn: 208271
2014-05-08 00:25:16 +00:00
Lang Hames
eb0e367f97 Back out r208257 while I investigate tester failures.
llvm-svn: 208267
2014-05-07 23:35:53 +00:00
Duncan P. N. Exon Smith
f5258e4d2c GlobalValue: Assert symbols with local linkage have default visibility
The change to ExtractGV.cpp has no functionality change except to avoid
the asserts.  Existing testcases already cover this, so I didn't add a
new one.

llvm-svn: 208264
2014-05-07 23:00:22 +00:00
Duncan P. N. Exon Smith
c74b2b0974 IR: Don't allow non-default visibility on local linkage
Visibilities of `hidden` and `protected` are meaningless for symbols
with local linkage.

  - Change the assembler to reject non-default visibility on symbols
    with local linkage.

  - Change the bitcode reader to auto-upgrade `hidden` and `protected`
    to `default` when the linkage is local.

  - Update LangRef.

<rdar://problem/16141113>

llvm-svn: 208263
2014-05-07 22:57:20 +00:00
Duncan P. N. Exon Smith
bba2550124 LTO: Assert visibility of local linkage when merging symbols
`ModuleLinker::getLinkageResult()` shouldn't create symbols with local
linkage and non-default visibility -- in fact, symbols with local
linkage shouldn't be merged at all.  Assert to that effect.

llvm-svn: 208262
2014-05-07 22:55:46 +00:00
Duncan P. N. Exon Smith
9739ca9e59 LTO: Check local linkage first
Since visibility is meaningless for symbols with local linkage, check
local linkage before visibility when setting symbol attributes.

When linkage is `internal` and the visibility is `hidden`, the exposed
attribute is now `LTO_SYMBOL_SCOPE_INTERNAL` instead of
`LTO_SYMBOL_SCOPE_HIDDEN`.  Although the bitfield allows *both* to be
specified, the combination is nonsense anyway.

Given changes (in progress) to drop visibility when a symbol has local
linkage, this almost has no functionality change: it's mostly a cleanup
to clarify the logic.

The exception is when something has `appending` linkage.  Before this
change, such symbols would be advertised as `LTO_SYMBOL_SCOPE_INTERNAL`;
now, they'll be given `LTO_SYMBOL_SCOPE_COMMON`.

Unfortunately this is really awkward to test.  This only changes what we
advertise to linkers (before running LTO), not what the final object
looks like.  In theory I could add `DEBUG` output to `llvm-lto` (and
test with "REQUIRES: asserts"), but follow-up commits to disallow
`internal hidden` simplify this anyway.

<rdar://problem/16141113>

llvm-svn: 208261
2014-05-07 22:53:14 +00:00
Quentin Colombet
548eb2e304 [X86] Add a test case for r208252.
Prior to r208252, the FMA 231 family was marked as isCommutable. However the
memory variants of this family are not commutable. Therefore, we did not
implemented the findCommutedOpIndices for those variants and missed that
the default implementation (more or less: commute indices 1 and 2) was
firing behind our back.
As a result, as demonstrated in the test case before the fix, we were
transforming a = b * c + a into a = a * c + b.

I.e., before r208252 we were generating for this test case:
vmovaps %xmm0, %xmm1
vmoss (%rsi), %xmm0
vfmadd231ss (%rdi), %xmm1, %xmm0

Instead of:
vmoss (%rsi), %xmm1
vfmadd231ss (%rdi), %xmm1, %xmm0

<rdar://problem/16800495> 

llvm-svn: 208260
2014-05-07 22:52:58 +00:00
Lang Hames
0dd4713eee [RuntimeDyld] Make RuntimeDyldImpl::resolveExternalSymbols preserve the
relocation entries it applies.

Prior to this patch, RuntimeDyldImpl::resolveExternalSymbols discarded
relocations for external symbols once they had been applied. This causes issues
if the client calls MCJIT::finalizeLoadedModules more than once, and updates the
location of any symbols in between (e.g. by calling MCJIT::mapSectionAddress).

No test case yet: None of our in-tree memory managers support moving sections
around. I'll have to hack up a dummy memory manager before I can write a unit
test.

Fixes <rdar://problem/16764378>

llvm-svn: 208257
2014-05-07 22:34:08 +00:00
Hal Finkel
a22ec95e68 [X86TTI] Remove the unrolling branch limits
The loop stream detector (LSD) on modern Intel cores, which optimizes the
execution of small loops, has limits on the number of taken branches in
addition to uop-count limits (modern AMD cores have similar limits).
Unfortunately, at the IR level, estimating the number of branches that will be
taken is difficult. For one thing, it strongly depends on later passes (block
placement, etc.). The original implementation took a conservative approach and
limited the maximal BB DFS depth of the loop.  However, fairly-extensive
benchmarking by several of us has revealed that this is the wrong approach. In
fact, there are zero known cases where the branch limit prevents a detrimental
unrolling (but plenty of cases where it does prevent beneficial unrolling).

While we could improve the current branch counting logic by incorporating
branch probabilities, this further complication seems unjustified without a
motivating regression. Instead, unless and until a regression appears, the
branch counting will be removed.

llvm-svn: 208255
2014-05-07 22:25:18 +00:00
Justin Bogner
81ce7e44c7 llvm-cov: Fix some funny indentation (NFC)
Noticed by Duncan Exon Smith. Thanks!

llvm-svn: 208253
2014-05-07 21:50:43 +00:00
Quentin Colombet
9b13d839be [X86] Selectively mark the FMA variants inside a family as isCommutable.
Given a FMA family (e.g., 213, 231), not all the variants (i.e., register or
memory) are commutable.
E.g., for the 213 family (with the syntax src1, src2, src3):
fmaXXX213 A, B, reg3/mem3 == fmaXXX213 B, A, reg3/mem3

Now consider the 231 family:
fmaXXX231 A, B, reg3 == fmaXXX231 A, reg3, B
But
fmaXXX231 A, B, mem3 != fmaXXX231 A, mem3, B
Indeed, mem3 cannot be the second argument of the memory variant of fmaXXX231.

Working on a reduced test case!

<rdar://problem/16800495>

llvm-svn: 208252
2014-05-07 21:43:35 +00:00
Eric Christopher
91701a28d0 Reformat a couple of functions for clarity.
llvm-svn: 208248
2014-05-07 21:05:47 +00:00
Nico Weber
4dc9c7ea6b Let OnDiskHashTable call the destructor of its Items.
OnDiskHashTable::insert() calls the Item constructor via placement new, but
nothing called the destructor.  This matters in cases when the Info template
parameter has key_type or data_type typedefs that have a destructor, for
example like IdentifierIndexWriterTrait in clang's GlobalModuleIndex.cpp.

This fixes a 5-year old bug that's been around since the OnDiskHashTable code
was added in r64192.  Bug found by LSan!

llvm-svn: 208243
2014-05-07 19:55:38 +00:00
Rafael Espindola
290b146df0 Replace a virtual with an override.
llvm-svn: 208242
2014-05-07 19:52:32 +00:00
Jyotsna Verma
c9cece4644 [Hexagon] Add New TSFlags to be used in the upcoming patches.
llvm-svn: 208239
2014-05-07 19:07:34 +00:00