1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-20 11:33:24 +02:00
Commit Graph

28943 Commits

Author SHA1 Message Date
Justin Holewinski
004cf81d64 [NVPTX] Add support for .managed variables for UVM
llvm-svn: 211942
2014-06-27 18:35:58 +00:00
Justin Holewinski
6888ca5201 [NVPTX] Emit .weak linkage for link_once, weak, available_externally, and common linkage
llvm-svn: 211941
2014-06-27 18:35:56 +00:00
Justin Holewinski
20bb9827a8 [NVPTX] Variables that start with llvm. or nvvm. are reserved and should not be emitted
llvm-svn: 211940
2014-06-27 18:35:53 +00:00
Justin Holewinski
0ed0e4b150 [NVPTX] Fix handling of ldg/ldu intrinsics.
The address space of the pointer must be global (1) for these intrinsics.  There must also be alignment metadata attached to the intrinsic calls, e.g.

%val = tail call i32 @llvm.nvvm.ldu.i.global.i32.p1i32(i32 addrspace(1)* %ptr), !align !0

!0 = metadata !{i32 4}

llvm-svn: 211939
2014-06-27 18:35:51 +00:00
Justin Holewinski
6fc5dab1cf [NVPTX] Clean up argument lowering code and properly handle alignment for structs and vectors
llvm-svn: 211938
2014-06-27 18:35:44 +00:00
Justin Holewinski
92e1b59c28 [NVPTX] Add missing boolean vector contents flag
llvm-svn: 211937
2014-06-27 18:35:42 +00:00
Justin Holewinski
97b1e28f31 [NVPTX] Add support for [SHL,SRA,SRL]_PARTS
llvm-svn: 211936
2014-06-27 18:35:40 +00:00
Justin Holewinski
7663d1fd5a [NVPTX] Implement fma and imad contraction as target DAGCombiner patterns
This also introduces DAGCombiner patterns for mul.wide to multiply two smaller integers and produce a larger integer

llvm-svn: 211935
2014-06-27 18:35:37 +00:00
Justin Holewinski
1c8d752df2 [NVPTX] Add support for efficient rotate instructions on SM 3.2+
llvm-svn: 211934
2014-06-27 18:35:33 +00:00
Justin Holewinski
b7ecf6b4d6 [NVPTX] Add missing isel patterns for 64-bit atomics
llvm-svn: 211933
2014-06-27 18:35:30 +00:00
Justin Holewinski
2ffa2d24b0 [NVPTX] Add isel patterns for bit-field extract (bfe)
llvm-svn: 211932
2014-06-27 18:35:27 +00:00
Justin Holewinski
7c9cd5f566 [NVPTX] Add support for isspacep instruction
llvm-svn: 211931
2014-06-27 18:35:24 +00:00
Justin Holewinski
ab34c507cf [NVPTX] Add support for envreg reads
llvm-svn: 211930
2014-06-27 18:35:21 +00:00
Justin Holewinski
6eaef88634 [NVPTX] Add target options for PTX 3.2/4.0 and SM 5.0 (Maxwell)
Default PTX version is set to PTX 3.2

llvm-svn: 211929
2014-06-27 18:35:18 +00:00
Justin Holewinski
1c2cdcb292 [NVPTX] Update sub-target feature detection
llvm-svn: 211928
2014-06-27 18:35:16 +00:00
Justin Holewinski
7d536fecc0 [NVPTX] Directly control the Machine SSA passes that are invoked for NVPTX.
NVPTX is a bit special in the optimizations it requires, so this gives
us better control over the backend optimization pipeline.

llvm-svn: 211927
2014-06-27 18:35:14 +00:00
Justin Holewinski
a3b8653f47 [NVPTX] Emit .weak when linkage is not external, internal, or private
llvm-svn: 211926
2014-06-27 18:35:10 +00:00
Justin Holewinski
b137c23ee9 [NVPTX] Just use getTypeAllocSize() when computing return value size for structures and vectors
llvm-svn: 211925
2014-06-27 18:35:08 +00:00
Chandler Carruth
ec4fb8a59b [x86] Fix a miscompile in the new shuffle lowering uncovered by
a bootstrap.

I managed to mis-remember how PACKUS worked on x86, and was using undef
for the high bytes instead of zero. The fix is fairly obvious.

llvm-svn: 211922
2014-06-27 18:25:23 +00:00
Matt Arsenault
5e70db4151 R600: Move trivial getters into header, use initializer list
llvm-svn: 211917
2014-06-27 17:57:00 +00:00
Juergen Ributzka
f7b405f472 [FastISel][X86] Fix typos.
llvm-svn: 211911
2014-06-27 17:16:34 +00:00
Matt Arsenault
128df7aaf1 R600: Don't crash on unhandled instruction in promote alloca
llvm-svn: 211906
2014-06-27 16:52:49 +00:00
Alexander Kornienko
3be4bd19cf Clean up unused variable warning in release build.
llvm-svn: 211902
2014-06-27 15:30:55 +00:00
Ulrich Weigand
85c764c15a [PowerPC] Constrain base register in PPCRegisterInfo::resolveFrameIndex
I've run into a bug where current LLVM at -O0 (with fast-isel)
generated invalid code like:

        ld 0, 20936(1)                  # 8-byte Folded Reload
        stw 12, 10348(0)
        stw 12, 10344(0)

The underlying vreg had been introduced as base register by the
Local Stack Slot Allocation pass.  That register was constrained
to G8RC by PPCRegisterInfo::materializeFrameBaseRegister to match
the ADDI instruction used to set it, but it was *not* constrained
to G8RC_NOX0 to fit the *use* of the register in an address.

That should have happened in PPCRegisterInfo::resolveFrameIndex.
This patch adds an appropriate constrainRegClass call.

Reviewed by Hal Finkel.

llvm-svn: 211897
2014-06-27 13:04:12 +00:00
Chandler Carruth
608f397a25 [x86] Clean up some unused variables, especially in release builds.
llvm-svn: 211894
2014-06-27 12:04:18 +00:00
Chandler Carruth
e7aaf337d0 [x86] Teach the target combine step to aggressively fold pshufd insturcions.
Summary:
This allows it to fold pshufd instructions across intervening
half-shuffles and other noise. This pattern actually shows up in the
generic lowering tests, but I've also added direct tests using
intrinsics to make sure that the specific desired functionality is
working even if the lowering stuff changes in the future.

Differential Revision: http://reviews.llvm.org/D4292

llvm-svn: 211892
2014-06-27 11:40:13 +00:00
Chandler Carruth
127f1b532c [x86] Teach the target-specific combining how to aggressively fold
half-shuffles, even looking through intervening instructions in a chain.

Summary:
This doesn't happen to show up with any test cases I've found for the current
shuffle lowering, but previous attempts would benefit from this and it seems
generally useful. I've tested it directly using intrinsics, which also shows
that it will work with hand vectorized code as well.

Note that even though pshufd isn't directly used in these tests, it gets
exercised because we combine some of the half shuffles into a pshufd
first, and then merge them.

Differential Revision: http://reviews.llvm.org/D4291

llvm-svn: 211890
2014-06-27 11:34:40 +00:00
Chandler Carruth
c62b16ce89 [x86] Teach the X86 backend to DAG-combine SSE2 shuffles that are
trivially redundant.

This fixes several cases in the new vector shuffle lowering algorithm
which would generate redundant shuffle instructions for the sake of
simplicity.

I'm also deleting a testcase which was somewhat ridiculous. It was
checking for a bug in 2007 about incorrectly transforming shuffles by
looking for the string "-86" in the output of a pretty substantial
function. This test case doesn't seem to have any value at this point.

Differential Revision: http://reviews.llvm.org/D4240

llvm-svn: 211889
2014-06-27 11:27:52 +00:00
Chandler Carruth
e28f70fce3 [x86] Begin a significant overhaul of how vector lowering is done in the
x86 backend.

This sketches out a new code path for vector lowering, hidden behind an
off-by-default flag while it is under development. The fundamental idea
behind the new code path is to aggressively break down the problem space
in ways that ease selecting the odd set of instructions available on
x86, and carefully avoid scalarizing code even when forced to use older
ISAs. Notably, this starts off restricting itself to SSE2 and implements
the complete vector shuffle and blend space for 128-bit vectors in SSE2
without scalarizing. The plan is to layer on top of this ISA extensions
where we can bail out of the complex SSE2 lowering and opt for
a cheaper, specialized instruction (or set of instructions). It also
needs to be generalized to AVX and AVX512 vector widths.

Currently, this does a decent but not perfect job for SSE2. There are
some specific shortcomings that I plan to address:
- We need a peephole combine to fold together shuffles where possible.
  There are cases where a previous shuffle could be modified slightly to
  arrange for elements to be in the correct position and a later shuffle
  eliminated. Doing this eagerly added quite a bit of complexity, and
  so my plan is to combine away these redundancies afterward.
- There are a lot more clever ways to use unpck and pack that need to be
  added. This is essential for real world shuffles as it turns out...

Once SSE2 is polished a bit I should be able to get interesting numbers
on performance improvements on benchmarks conducive to vectorization.
All of this will be off by default until it is functionally equivalent
of course.

Differential Revision: http://reviews.llvm.org/D4225

llvm-svn: 211888
2014-06-27 11:23:44 +00:00
Eric Christopher
666067451e Remove the caching of the target machine from SystemZTargetLowering.
Update all callers and uses accordingly.

llvm-svn: 211880
2014-06-27 07:38:01 +00:00
Eric Christopher
fe4c033d5f Remove target machine caching from SystemZInstrInfo and
SystemZRegisterInfo and replace it with the subtarget as that's
all they needed in the first place. Update all uses and calls
accordingly.

llvm-svn: 211877
2014-06-27 07:01:17 +00:00
Eric Christopher
702459da8e Have SystemZSelectionDAGInfo constructor take a DataLayout rather
than a target machine since it doesn't need anything past the
DataLayout.

llvm-svn: 211870
2014-06-27 05:26:28 +00:00
Craig Topper
9586f4812f Rename getX86ConditonCode -> getX86ConditionCode
llvm-svn: 211869
2014-06-27 05:18:21 +00:00
Eric Christopher
75468540c0 Have MipsSelectionDAGInfo constructor take a DataLayout rather
than a target machine since it doesn't need anything past the
DataLayout.

llvm-svn: 211863
2014-06-27 04:38:30 +00:00
Eric Christopher
d9c871322b Move NVPTX subtarget dependent variables from the target machine
to the subtarget.

llvm-svn: 211860
2014-06-27 04:33:14 +00:00
Eric Christopher
e3a87b8eff Use the target lowering we can get off of the DAG rather than off
of the cached target machine.

llvm-svn: 211858
2014-06-27 03:45:49 +00:00
Matt Arsenault
61dec09b87 Fix missing newline and simplify debug printing.
llvm-svn: 211850
2014-06-27 02:36:59 +00:00
Matt Arsenault
2939409b36 R600: Move load/store ReplaceNodeResults to common code.
Future patches will want to custom lower loads on SI.

llvm-svn: 211848
2014-06-27 02:33:47 +00:00
Eric Christopher
cf13130521 Move the constructor for NVPTXFrameLowering into the implementation
file in preparation for the subtarget move.

llvm-svn: 211847
2014-06-27 02:05:24 +00:00
Eric Christopher
56c143cc1d Remove unnecessary caching of the TargetMachine on NVPTXFrameLowering.
Adjust the constructor accordingly.

llvm-svn: 211846
2014-06-27 02:05:22 +00:00
Eric Christopher
6b5be1c423 Rework the logic for setting the TargetName. This appears to
be shorter and identical in goal.

llvm-svn: 211845
2014-06-27 02:05:19 +00:00
Eric Christopher
8010e246be Remove caching of the target machine in NVPTXInstrInfo and
update constructor accordingly.

llvm-svn: 211840
2014-06-27 01:27:08 +00:00
Eric Christopher
035afbc45f Remove comment that duplicated information in the constructor
that it's after.

llvm-svn: 211839
2014-06-27 01:27:06 +00:00
Eric Christopher
27757a2e74 Remove commented out code.
llvm-svn: 211838
2014-06-27 01:27:05 +00:00
Eric Christopher
9095101a3e Remove extraneous parens and extraneous const cast (and fix the
prototype for the function to patch what we were returning).

llvm-svn: 211837
2014-06-27 01:27:03 +00:00
Eric Christopher
e0bc241533 Move the subtarget dependent features from the target machine to
the subtarget for the MSP430 target.

llvm-svn: 211836
2014-06-27 01:14:54 +00:00
Eric Christopher
b041717dce Remove uses and caches of the target machine and subtarget from
both MSP430InstrInfo and MSP430RegisterInfo. Remove unused member
variable StackAlign from MSP430RegisterInfo. Update constructors
accordingly.

llvm-svn: 211835
2014-06-27 01:14:50 +00:00
Eric Christopher
b69f775d9c Remove caching of an unused subtarget from MSP430FrameLowering.
llvm-svn: 211830
2014-06-27 00:52:11 +00:00
Adam Nemet
e14e099b2e [X86] AVX512: Add vbroadcasti*
For now I used a separate template for these sub-vector/tuple broadcasts
rather than sharing the mem variants with avx512_int_broadcast_rm.

<rdar://problem/17402869>

llvm-svn: 211828
2014-06-27 00:43:38 +00:00
Eric Christopher
50d95a165d Remove unnecessary caching of variables by MSP430TargetLowering and
make the constructor more general since it only needs a target
machine.

llvm-svn: 211827
2014-06-27 00:37:59 +00:00