1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-01-31 20:51:52 +01:00

24878 Commits

Author SHA1 Message Date
Lang Hames
321bebf72e [RuntimeDyld] Add a framework for testing relocation logic in RuntimeDyld.
This patch adds a "-verify" mode to the llvm-rtdyld utility. In verify mode,
llvm-rtdyld will test supplied expressions against the linked program images
that it creates in memory. This scheme can be used to verify the correctness
of the relocation logic applied by RuntimeDyld.

The expressions to test will be read out of files passed via the -check option
(there may be more than one of these). Expressions to check are extracted from
lines of the form:
# rtdyld-check: <expression>

This system is designed to fit the llvm-lit regression test workflow. It is
format and target agnostic, and supports verification of images linked for
remote targets. The expression language is defined in
llvm/include/llvm/RuntimeDyldChecker.h . Examples can be found in
test/ExecutionEngine/RuntimeDyld.

llvm-svn: 211956
2014-06-27 20:20:57 +00:00
Chandler Carruth
c99cbd9cc7 [x86] Fix another bug hit when bootstrapping with the new shuffle
lowering.

For maximum irony, I had already discovered this bug, diagnosed it, and
left FIXMEs about it in the test cases. =[ I just failed to go back over
those until after i had reduced a bootstrap miscompile down to a single
TU, stared at the assembly for an hour, and figured out the bug. Again.

Oh well.

llvm-svn: 211955
2014-06-27 20:07:40 +00:00
Justin Holewinski
2a38f449ff [NVPTX] Add reflect intrinsic (better than matching by function name)
Also clean up some of the logic in NVVMReflect.cpp while we're messing around in there.

llvm-svn: 211948
2014-06-27 18:36:11 +00:00
Justin Holewinski
3774707bba [NVPTX] Add 'b' asm constraint
llvm-svn: 211946
2014-06-27 18:36:06 +00:00
Justin Holewinski
caff2b4efe [NVPTX] Error out if initializer is given for variable in an address space that does not support initialization
llvm-svn: 211943
2014-06-27 18:36:01 +00:00
Justin Holewinski
004cf81d64 [NVPTX] Add support for .managed variables for UVM
llvm-svn: 211942
2014-06-27 18:35:58 +00:00
Justin Holewinski
6888ca5201 [NVPTX] Emit .weak linkage for link_once, weak, available_externally, and common linkage
llvm-svn: 211941
2014-06-27 18:35:56 +00:00
Justin Holewinski
0ed0e4b150 [NVPTX] Fix handling of ldg/ldu intrinsics.
The address space of the pointer must be global (1) for these intrinsics.  There must also be alignment metadata attached to the intrinsic calls, e.g.

%val = tail call i32 @llvm.nvvm.ldu.i.global.i32.p1i32(i32 addrspace(1)* %ptr), !align !0

!0 = metadata !{i32 4}

llvm-svn: 211939
2014-06-27 18:35:51 +00:00
Justin Holewinski
6fc5dab1cf [NVPTX] Clean up argument lowering code and properly handle alignment for structs and vectors
llvm-svn: 211938
2014-06-27 18:35:44 +00:00
Justin Holewinski
97b1e28f31 [NVPTX] Add support for [SHL,SRA,SRL]_PARTS
llvm-svn: 211936
2014-06-27 18:35:40 +00:00
Justin Holewinski
7663d1fd5a [NVPTX] Implement fma and imad contraction as target DAGCombiner patterns
This also introduces DAGCombiner patterns for mul.wide to multiply two smaller integers and produce a larger integer

llvm-svn: 211935
2014-06-27 18:35:37 +00:00
Justin Holewinski
1c8d752df2 [NVPTX] Add support for efficient rotate instructions on SM 3.2+
llvm-svn: 211934
2014-06-27 18:35:33 +00:00
Justin Holewinski
b7ecf6b4d6 [NVPTX] Add missing isel patterns for 64-bit atomics
llvm-svn: 211933
2014-06-27 18:35:30 +00:00
Justin Holewinski
2ffa2d24b0 [NVPTX] Add isel patterns for bit-field extract (bfe)
llvm-svn: 211932
2014-06-27 18:35:27 +00:00
Justin Holewinski
7c9cd5f566 [NVPTX] Add support for isspacep instruction
llvm-svn: 211931
2014-06-27 18:35:24 +00:00
Justin Holewinski
ab34c507cf [NVPTX] Add support for envreg reads
llvm-svn: 211930
2014-06-27 18:35:21 +00:00
Justin Holewinski
a3b8653f47 [NVPTX] Emit .weak when linkage is not external, internal, or private
llvm-svn: 211926
2014-06-27 18:35:10 +00:00
Chandler Carruth
ec4fb8a59b [x86] Fix a miscompile in the new shuffle lowering uncovered by
a bootstrap.

I managed to mis-remember how PACKUS worked on x86, and was using undef
for the high bytes instead of zero. The fix is fairly obvious.

llvm-svn: 211922
2014-06-27 18:25:23 +00:00
David Majnemer
abf7854d05 IR: Add COMDATs to the IR
This new IR facility allows us to represent the object-file semantic of
a COMDAT group.

COMDATs allow us to tie together sections and make the inclusion of one
dependent on another. This is required to implement features like MS
ABI VFTables and optimizing away certain kinds of initialization in C++.

This functionality is only representable in COFF and ELF, Mach-O has no
similar mechanism.

Differential Revision: http://reviews.llvm.org/D4178

llvm-svn: 211920
2014-06-27 18:19:56 +00:00
David Blaikie
06bae5c2fe Fix test so it doesn't try to write out temporary files into the test tree.
llvm-svn: 211916
2014-06-27 17:45:43 +00:00
David Majnemer
c2cdc28730 MC: Fix associative sections on COFF
COFF sections in MC were represented by a tuple of section-name and
COMDAT-name.  This is not sufficient to represent a .text section
associated with another .text section; we need a way to distinguish
between the key section and the one marked associative.

llvm-svn: 211913
2014-06-27 17:19:44 +00:00
Matt Arsenault
128df7aaf1 R600: Don't crash on unhandled instruction in promote alloca
llvm-svn: 211906
2014-06-27 16:52:49 +00:00
Ulrich Weigand
85c764c15a [PowerPC] Constrain base register in PPCRegisterInfo::resolveFrameIndex
I've run into a bug where current LLVM at -O0 (with fast-isel)
generated invalid code like:

        ld 0, 20936(1)                  # 8-byte Folded Reload
        stw 12, 10348(0)
        stw 12, 10344(0)

The underlying vreg had been introduced as base register by the
Local Stack Slot Allocation pass.  That register was constrained
to G8RC by PPCRegisterInfo::materializeFrameBaseRegister to match
the ADDI instruction used to set it, but it was *not* constrained
to G8RC_NOX0 to fit the *use* of the register in an address.

That should have happened in PPCRegisterInfo::resolveFrameIndex.
This patch adds an appropriate constrainRegClass call.

Reviewed by Hal Finkel.

llvm-svn: 211897
2014-06-27 13:04:12 +00:00
Chandler Carruth
e7aaf337d0 [x86] Teach the target combine step to aggressively fold pshufd insturcions.
Summary:
This allows it to fold pshufd instructions across intervening
half-shuffles and other noise. This pattern actually shows up in the
generic lowering tests, but I've also added direct tests using
intrinsics to make sure that the specific desired functionality is
working even if the lowering stuff changes in the future.

Differential Revision: http://reviews.llvm.org/D4292

llvm-svn: 211892
2014-06-27 11:40:13 +00:00
Simon Atanasyan
fb5fe85156 [ELF][Mips] Fix recognition of MIPS 64-bit arch in the ELFObjectFile:getArch() method.
llvm-svn: 211891
2014-06-27 11:36:45 +00:00
Chandler Carruth
127f1b532c [x86] Teach the target-specific combining how to aggressively fold
half-shuffles, even looking through intervening instructions in a chain.

Summary:
This doesn't happen to show up with any test cases I've found for the current
shuffle lowering, but previous attempts would benefit from this and it seems
generally useful. I've tested it directly using intrinsics, which also shows
that it will work with hand vectorized code as well.

Note that even though pshufd isn't directly used in these tests, it gets
exercised because we combine some of the half shuffles into a pshufd
first, and then merge them.

Differential Revision: http://reviews.llvm.org/D4291

llvm-svn: 211890
2014-06-27 11:34:40 +00:00
Chandler Carruth
c62b16ce89 [x86] Teach the X86 backend to DAG-combine SSE2 shuffles that are
trivially redundant.

This fixes several cases in the new vector shuffle lowering algorithm
which would generate redundant shuffle instructions for the sake of
simplicity.

I'm also deleting a testcase which was somewhat ridiculous. It was
checking for a bug in 2007 about incorrectly transforming shuffles by
looking for the string "-86" in the output of a pretty substantial
function. This test case doesn't seem to have any value at this point.

Differential Revision: http://reviews.llvm.org/D4240

llvm-svn: 211889
2014-06-27 11:27:52 +00:00
Chandler Carruth
e28f70fce3 [x86] Begin a significant overhaul of how vector lowering is done in the
x86 backend.

This sketches out a new code path for vector lowering, hidden behind an
off-by-default flag while it is under development. The fundamental idea
behind the new code path is to aggressively break down the problem space
in ways that ease selecting the odd set of instructions available on
x86, and carefully avoid scalarizing code even when forced to use older
ISAs. Notably, this starts off restricting itself to SSE2 and implements
the complete vector shuffle and blend space for 128-bit vectors in SSE2
without scalarizing. The plan is to layer on top of this ISA extensions
where we can bail out of the complex SSE2 lowering and opt for
a cheaper, specialized instruction (or set of instructions). It also
needs to be generalized to AVX and AVX512 vector widths.

Currently, this does a decent but not perfect job for SSE2. There are
some specific shortcomings that I plan to address:
- We need a peephole combine to fold together shuffles where possible.
  There are cases where a previous shuffle could be modified slightly to
  arrange for elements to be in the correct position and a later shuffle
  eliminated. Doing this eagerly added quite a bit of complexity, and
  so my plan is to combine away these redundancies afterward.
- There are a lot more clever ways to use unpck and pack that need to be
  added. This is essential for real world shuffles as it turns out...

Once SSE2 is polished a bit I should be able to get interesting numbers
on performance improvements on benchmarks conducive to vectorization.
All of this will be off by default until it is functionally equivalent
of course.

Differential Revision: http://reviews.llvm.org/D4225

llvm-svn: 211888
2014-06-27 11:23:44 +00:00
Dinesh Dwivedi
73e3709b2c Added instruction combine to transform few more negative values addition to subtraction (Part 3)
This patch enables transforms for

(x + (~(y | c) + 1) --> x - (y | c) if c is odd

Differential Revision: http://reviews.llvm.org/D4210

llvm-svn: 211881
2014-06-27 07:47:35 +00:00
David Majnemer
be156a8772 GlobalOpt: Fix constantfold-initializers.ll test
The test added in r211762 was sloppy, the correct initializer wasn't
added to @llvm.global_ctors

Spotted by Pasi Parviainen!

llvm-svn: 211879
2014-06-27 07:36:26 +00:00
David Blaikie
d6c9082735 Revert "Revert "Revert "PR20038: DebugInfo: Inlined call sites where the caller has debug info but the call itself has no debug location."""
Reverting this again, didn't mean to commit it - while r211872 fixes one
of the issues here, there are still others to figure out and address.

This reverts commit r211871.

llvm-svn: 211873
2014-06-27 05:34:05 +00:00
David Blaikie
232726db30 ArgumentPromotion: Propagate debug locations on calls for which arguments are promoted.
llvm-svn: 211872
2014-06-27 05:32:09 +00:00
David Blaikie
a4e2150a25 Revert "Revert "PR20038: DebugInfo: Inlined call sites where the caller has debug info but the call itself has no debug location.""
This reverts commit r211724.

llvm-svn: 211871
2014-06-27 05:31:49 +00:00
Andrew Trick
a8cb3163c0 MachineScheduler: add some book-keeping to fix an assert.
Fixe for Bug 20057 - Assertion failied in llvm::SUnit* llvm::SchedBoundary::pickOnlyChoice(): Assertion `i <= (HazardRec->getMaxLookAhead() + MaxObservedStall) && "permanent hazard"'

Thanks to Chad for the test case.

llvm-svn: 211865
2014-06-27 04:57:05 +00:00
Matt Arsenault
19ebc28a85 R600: Add some testcases for promote alloca pass.
More complicated GEPs are skipped. Add some tests to
actually stress this skipping.

llvm-svn: 211859
2014-06-27 03:55:55 +00:00
Adam Nemet
e14e099b2e [X86] AVX512: Add vbroadcasti*
For now I used a separate template for these sub-vector/tuple broadcasts
rather than sharing the mem variants with avx512_int_broadcast_rm.

<rdar://problem/17402869>

llvm-svn: 211828
2014-06-27 00:43:38 +00:00
Juergen Ributzka
4e8c3d809f [StackMaps] Enable patchpoint liveness analysis per default.
llvm-svn: 211817
2014-06-26 23:39:52 +00:00
Juergen Ributzka
f24160ef4c [Stackmaps] Remove the liveness calculation for stackmap intrinsics.
There is no need to calculate the liveness information for stackmaps. The
liveness information is still available for the patchpoint intrinsic and
that is also the intended usage model.

Related to <rdar://problem/17473725>

llvm-svn: 211816
2014-06-26 23:39:44 +00:00
Arnold Schwaighofer
7e23314405 GVN: Preserve invariant.load metadata
If both instructions to be replaced are marked invariant the resulting
instruction is invariant.

rdar://13358910

Fix by Erik Eckstein!

llvm-svn: 211801
2014-06-26 19:51:19 +00:00
Matt Arsenault
c2d6bb970a R600/SI: Add FP mode bits to binary.
The default rounding mode to initialize the mode register needs
to be reported to the runtime. Fill in other bits a kernel
may be interested in setting for future use.

llvm-svn: 211791
2014-06-26 17:22:30 +00:00
Renato Golin
9ca48ec4d1 Added parsing co-processor names starting with "cr"
Additional compliant GAS names for coprocessor register name
are enabled for all instruction with parameter MCK_CoprocReg:
LDC,LDC2,STC,STC2,CDP,CDP2,MCR,MCR2,MCRR,MCRR2,MRC,MRC2,MRRC,MRRC2

Patch by Andrey Kuharev.

llvm-svn: 211776
2014-06-26 13:10:53 +00:00
Andrea Di Biagio
3be8532e69 [X86] Improve the selection of SSE3/AVX addsub instructions.
This patch teaches the backend how to canonicalize a shuffle vectors
according to the rule:

 - (shuffle (FADD A, B), (FSUB A, B), Mask) ->
       (shuffle (FSUB A, -B), (FADD A, -B), Mask)

Where 'Mask' is:
  <0,5,2,7>            ;; for v4f32 and v4f64 shuffles.
  <0,3>                ;; for v2f64 shuffles.
  <0,9,2,11,4,13,6,15> ;; for v8f32 shuffles.

In general, ISel only knows how to pattern-match a canonical
'fadd + fsub + blendi' dag node sequence into an ADDSUB instruction.

This new rule allows to convert a non-canonical dag sequence into a
canonical one that will be matched by a single ADDSUB at ISel stage.

The idea of converting a non-canonical ADDSUB into a canonical one by
swapping the first two operands of the shuffle, and then negating the
second operand of the FADD and FSUB, was originally proposed by Hal Finkel.

llvm-svn: 211771
2014-06-26 10:45:21 +00:00
Dinesh Dwivedi
9d122cf780 This patch removed duplicate code for matching patterns
which are now handled in SimplifyUsingDistributiveLaws() 
(after r211261)

Differential Revision: http://reviews.llvm.org/D4253

llvm-svn: 211768
2014-06-26 08:57:33 +00:00
Dinesh Dwivedi
b98a2e9f49 Added instruction combine to transform few more negative values addition to subtraction (Part 2)
This patch enables transforms for

(x + (~(y | c) + 1)   -->   x - (y | c) if c is even

Differential Revision: http://reviews.llvm.org/D4209

llvm-svn: 211765
2014-06-26 05:40:22 +00:00
David Majnemer
96d60954b8 GlobalOpt: Don't optimize thread_local for initializers
Folding a reference to a thread_local variable into another global
variable's initializer is very problematic, there is no relocation that
exists to represent such an access.

llvm-svn: 211762
2014-06-26 03:02:19 +00:00
Matt Arsenault
435a7c1256 R600: Fix vector FMA
llvm-svn: 211757
2014-06-26 01:28:05 +00:00
Hans Wennborg
0b74da6314 Don't build switch tables for dllimport and TLS variables in GEPs
This is a follow-up to r211331, which failed to notice that we were
returning early from ValidLookupTableConstant for GEPs.

llvm-svn: 211753
2014-06-26 00:30:52 +00:00
Adam Nemet
aa04918ad4 [X86] AVX512: Fix asm syntax for packed vcmp
The *_alt defs for vcmp are used by the InstParser (the asm string in the main
def is used by the InstPrinter) .  The former was accepting vector registers
as destination rather than mask registers.

llvm-svn: 211750
2014-06-26 00:21:12 +00:00
Juergen Ributzka
87fbbdf13d [FastISel][X86] Only fold the cmp into the select when both instructions are in the same basic block.
If the cmp is in a different basic block, then it is possible that not all
operands of that compare have defined registers. This can happen when one of
the operands to the cmp is a load and the load gets folded into the cmp. In
this case FastISel will skip the load instruction and the vreg is never
defined.

llvm-svn: 211730
2014-06-25 20:06:12 +00:00
David Blaikie
ac824331fd Revert "PR20038: DebugInfo: Inlined call sites where the caller has debug info but the call itself has no debug location."
This reverts commit r211723.

Breaks the ASan/compiler-rt build... guess I didn't test very far at all
:/.

llvm-svn: 211724
2014-06-25 18:20:54 +00:00