Vincent Lejeune
2100f94811
R600: Non-vector-only instructions can be scheduled on the trans unit
...
llvm-svn: 187514
2013-07-31 19:31:56 +00:00
Vincent Lejeune
dd19dcd43e
R600: Don't mix LDS and non-LDS instructions in the same group
...
There are a lot of restrictions on instruction groups that contain
LDS instructions, so for now we will be conservative and not packetize
anything else with them.
llvm-svn: 187513
2013-07-31 19:31:41 +00:00
Vincent Lejeune
5879083446
R600: Use SchedModel enum for is{Trans,Vector}Only functions
...
llvm-svn: 187512
2013-07-31 19:31:35 +00:00
Vincent Lejeune
14935c8687
R600: Remove predicated_break inst
...
We were using two instructions for a similar purpose: break and
predicated_break. Only predicated_break was emitted, and it was
lowered in R600ControlFlowFinalizer to JUMP;CF_BREAK;POP.
This commit simplifies the situation by making AMDILCFGStructurizer
emit IF_PREDICATE;BREAK;ENDIF; instead of predicated_break (which
is now removed).
There is no functionality change.
llvm-svn: 187510
2013-07-31 19:31:14 +00:00
Tom Stellard
0009a2cbb1
R600/SI: Expand vector fp <-> int conversions
...
llvm-svn: 187421
2013-07-30 14:31:03 +00:00
Quentin Colombet
dbdfe6759b
[R600] Replicate old DAGCombiner behavior in target specific DAG combine.
...
build_vector is lowered to REG_SEQUENCE, which is something the register
allocator does a good job at optimizing.
llvm-svn: 187397
2013-07-30 00:27:16 +00:00
Tom Stellard
8e98bf332b
SimplifyCFG: Use parallel-and and parallel-or mode to consolidate branch conditions
...
Merge consecutive if-regions if they contain identical statements.
Both transformations reduce the number of branches. The transformation
is guarded by a target hook and is currently enabled only for +R600,
but its correctness has been tested on the X86 target using a variety of
CPU benchmarks.
Patch by: Mei Ye
llvm-svn: 187278
2013-07-27 00:01:07 +00:00
Tom Stellard
cf58aac74c
DAGCombiner: Pass the correct type to TargetLowering::isF(Abs|Neg)Free
...
This commit also implements these functions for R600 and removes a test
case that was relying on the buggy behavior.
llvm-svn: 187007
2013-07-23 23:55:03 +00:00
Tom Stellard
7e06ca2a39
R600: Treat CONSTANT_ADDRESS loads like GLOBAL_ADDRESS loads when necessary
...
These are really the same address space in hardware. The only
difference is that CONSTANT_ADDRESS uses a special cache for faster
access. When we are unable to use the constant kcache for some reason
(e.g. smaller types or lack of indirect addressing) then the instruction
selector must use GLOBAL_ADDRESS loads instead.
llvm-svn: 187006
2013-07-23 23:54:56 +00:00
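The fallback described in this commit message can be pictured as a small decision function. The following is only a sketch under stated assumptions: `LoadKind`, `LoadInfo`, `pickLoadKind` and the 32-bit threshold for "smaller types" are hypothetical names and values chosen for illustration, not the instruction selector's actual code.

```cpp
#include <cassert>

// Hypothetical classification of how a CONSTANT_ADDRESS load gets selected.
enum class LoadKind { ConstantKCache, Global };

// Hypothetical description of a load; not a type from the R600 backend.
struct LoadInfo {
  unsigned bits;        // width of the loaded type in bits
  bool indirectAddress; // true if the address is not a compile-time constant
};

// Assumption: "smaller types" means narrower than 32 bits, and indirect
// addressing through the kcache is a per-target capability.
LoadKind pickLoadKind(const LoadInfo &load, bool targetHasIndirectKCache) {
  if (load.bits < 32)
    return LoadKind::Global; // kcache path unavailable for narrow types
  if (load.indirectAddress && !targetHasIndirectKCache)
    return LoadKind::Global; // kcache path cannot index dynamically
  return LoadKind::ConstantKCache; // same address space, faster cache
}
```

Because both address spaces refer to the same memory in hardware, falling back to the global path changes only which cache is used, not the value that is loaded.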
Tom Stellard
e096e3c298
R600: Add support for 24-bit MAD instructions
...
Reviewed-by: Vincent Lejeune <vljn at ovi.com>
llvm-svn: 186923
2013-07-23 01:48:49 +00:00
Tom Stellard
803a4c6e50
R600: Add support for 24-bit MUL instructions
...
Reviewed-by: Vincent Lejeune <vljn at ovi.com>
llvm-svn: 186922
2013-07-23 01:48:42 +00:00
Tom Stellard
705721da31
R600: Improve support for < 32-bit loads
...
Reviewed-by: Vincent Lejeune <vljn at ovi.com>
llvm-svn: 186921
2013-07-23 01:48:35 +00:00
Tom Stellard
fa0b944f18
R600: Rename AMDILISelDAGToDAG.cpp -> AMDGPUISelDAGToDAG.cpp
...
Reviewed-by: Vincent Lejeune <vljn at ovi.com>
llvm-svn: 186920
2013-07-23 01:48:29 +00:00
Tom Stellard
bc9deba8ad
R600: Move CONST_ADDRESS folding into AMDGPUDAGToDAGISel::Select()
...
This increases the number of opportunities we have for folding. With the
previous implementation we were unable to fold into any instructions
other than the first when multiple instructions were selected from a
single SDNode.
Reviewed-by: Vincent Lejeune <vljn at ovi.com>
llvm-svn: 186919
2013-07-23 01:48:24 +00:00
Tom Stellard
e60bb5f272
R600: Use KCache for kernel arguments
...
Reviewed-by: Vincent Lejeune <vljn at ovi.com>
llvm-svn: 186918
2013-07-23 01:48:18 +00:00
Tom Stellard
c35ac4a061
R600: Simplify assembly for KCache registers using the TableGen !add operator
...
Before:
MOV * T0.W, KC0[131-128].Y
After:
MOV * T0.W, KC0[3].Y
Reviewed-by: Vincent Lejeune <vljn at ovi.com>
llvm-svn: 186917
2013-07-23 01:48:08 +00:00
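The before/after assembly in this commit message amounts to folding a constant offset into the printed register index. The sketch below reproduces that arithmetic in plain C++ rather than TableGen. `kcacheOperand` and `kKCacheBase` are hypothetical names, and the base of 128 is an assumption read off the "131-128" in the example above.

```cpp
#include <cassert>
#include <string>

// Assumption: KCache registers are numbered starting at 128, as suggested
// by the "KC0[131-128]" form printed before this change.
constexpr int kKCacheBase = 128;

// Hypothetical helper: formats a KCache operand with the base offset
// already subtracted, e.g. KC0[3].Y instead of KC0[131-128].Y.
std::string kcacheOperand(int bank, int encodedIndex, char channel) {
  assert(encodedIndex >= kKCacheBase && "index below the assumed KCache base");
  const int folded = encodedIndex - kKCacheBase;
  return "KC" + std::to_string(bank) + "[" + std::to_string(folded) + "]." +
         channel;
}
```

Under these assumptions, `kcacheOperand(0, 131, 'Y')` yields `KC0[3].Y`, matching the "After" line in the commit message.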
Tom Stellard
6ca248c89f
R600: Use the same compute kernel calling convention for all GPUs
...
A side-effect of this is that now the compiler expects kernel arguments
to be 4-byte aligned.
Reviewed-by: Vincent Lejeune <vljn at ovi.com>
llvm-svn: 186916
2013-07-23 01:48:05 +00:00
Tom Stellard
ac10764020
R600: Use correct LoadExtType when lowering kernel arguments
...
Reviewed-by: Vincent Lejeune <vljn at ovi.com>
llvm-svn: 186915
2013-07-23 01:47:58 +00:00
Tom Stellard
d7dd88a3f7
R600: Clean up extended load patterns
...
Reviewed-by: Vincent Lejeune <vljn at ovi.com>
llvm-svn: 186914
2013-07-23 01:47:52 +00:00
Tom Stellard
25ce190913
R600: Expand vector FNEG
...
llvm-svn: 186913
2013-07-23 01:47:46 +00:00
Vincent Lejeune
946e631d0f
R600: Don't emit empty then clause and use alu_pop_after
...
llvm-svn: 186725
2013-07-19 21:45:15 +00:00
Vincent Lejeune
0872500c7d
R600: Simplify AMDILCFGStructurizer by removing templates and assuming single exit
...
llvm-svn: 186724
2013-07-19 21:45:06 +00:00
Vincent Lejeune
2bcf101e52
R600: Replace legacy debug code in AMDILCFGStructurizer.cpp
...
llvm-svn: 186723
2013-07-19 21:44:56 +00:00
Tom Stellard
2fd2f61532
R600/SI: Fix crash with VSELECT
...
https://bugs.freedesktop.org/show_bug.cgi?id=66175
llvm-svn: 186616
2013-07-18 21:43:53 +00:00
Tom Stellard
3f8cd2512a
R600/SI: Add support for v2f32 loads
...
llvm-svn: 186615
2013-07-18 21:43:48 +00:00
Tom Stellard
9268fc81d5
R600/SI: Add support for v2f32 stores
...
llvm-svn: 186614
2013-07-18 21:43:42 +00:00
Tom Stellard
73fcab4d3a
R600: Expand VSELECT for all types
...
llvm-svn: 186613
2013-07-18 21:43:35 +00:00
Craig Topper
0910c52f8e
Move string pointer from being a static class member to just a static global in the one file it's needed in.
...
llvm-svn: 186476
2013-07-17 00:31:35 +00:00
Craig Topper
b8260534f6
Add 'const' qualifiers to static const char* variables.
...
llvm-svn: 186371
2013-07-16 01:17:10 +00:00
Tom Stellard
5a5b5f2786
R600/SI: Add support for 64-bit loads
...
https://bugs.freedesktop.org/show_bug.cgi?id=65873
llvm-svn: 186339
2013-07-15 19:00:09 +00:00
Craig Topper
d2ea089f09
Make some arrays 'static const'
...
llvm-svn: 186307
2013-07-15 06:39:13 +00:00
Craig Topper
4e9457fd7d
Use llvm::array_lengthof to replace sizeof(array)/sizeof(array[0]).
...
llvm-svn: 186301
2013-07-15 04:27:47 +00:00
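For readers unfamiliar with the idiom this commit replaces, here is a minimal, self-contained sketch. `arrayLengthOf` is a local stand-in written for this example with the same general shape as `llvm::array_lengthof`; it is not LLVM's header, and `kOpcodes` is made-up data.

```cpp
#include <cassert>
#include <cstddef>

// Local stand-in (hypothetical name): deduces the element count N from a
// reference to a built-in array.
template <typename T, std::size_t N>
constexpr std::size_t arrayLengthOf(T (&)[N]) {
  return N;
}

// Made-up table, used only to compare the two idioms.
static const int kOpcodes[] = {3, 5, 8, 13};

// Old idiom: the array name is repeated, and the expression still compiles
// (with a wrong result) if kOpcodes is ever changed into a pointer.
constexpr std::size_t kOldCount = sizeof(kOpcodes) / sizeof(kOpcodes[0]);

// New idiom: only accepts real arrays, so a pointer is a compile error.
constexpr std::size_t kNewCount = arrayLengthOf(kOpcodes);

static_assert(kOldCount == kNewCount, "both idioms agree for a true array");
```

The practical benefit of the template form is the type check: passing a pointer fails to compile instead of silently producing the wrong count.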
Craig Topper
58fa7a9b4a
Use SmallVectorImpl& instead of SmallVector to avoid repeating small vector size.
...
llvm-svn: 186274
2013-07-14 04:42:23 +00:00
Benjamin Kramer
912dd0dc3a
R600: Remove unsafe type punning. No intended functionality change.
...
llvm-svn: 186196
2013-07-12 20:18:05 +00:00
Tom Stellard
977376a943
R600/SI: Add support for f64 kernel arguments
...
Patch by: Niels Ole Salscheider
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 186182
2013-07-12 18:15:26 +00:00
Tom Stellard
ce0acc677f
R600/SI: Implement select and compares for SI
...
Patch by: Niels Ole Salscheider
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 186181
2013-07-12 18:15:19 +00:00
Tom Stellard
f2a3075fdd
R600/SI: Add fsqrt pattern for SI
...
Patch by: Niels Ole Salscheider
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 186180
2013-07-12 18:15:13 +00:00
Tom Stellard
b7b09a29aa
R600/SI: Add double precision fsub pattern for SI
...
Patch by: Niels Ole Salscheider
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 186179
2013-07-12 18:15:08 +00:00
Tom Stellard
43c1f3d80d
R600/SI: SI support for 64-bit ConstantFP
...
Patch by: Niels Ole Salscheider
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 186178
2013-07-12 18:15:02 +00:00
Tom Stellard
8b6f62dcb2
R600/SI: Add initial double precision support for SI
...
Patch by: Niels Ole Salscheider
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 186177
2013-07-12 18:14:56 +00:00
Aaron Ballman
e3eec3e18c
Replacing an empty switch with its moral equivalent. No functional changes intended.
...
llvm-svn: 186017
2013-07-10 17:19:22 +00:00
Michel Danzer
68916ffa69
R600/SI: Initial local memory support
...
Enough for the radeonsi driver to use it for calculating derivatives.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 186012
2013-07-10 16:37:07 +00:00
Michel Danzer
229c2c8c0f
R600/SI: Add pattern for the AMDGPU.barrier.local intrinsic
...
lit test coverage to follow in the next commit.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 186011
2013-07-10 16:36:57 +00:00
Michel Danzer
c2e06ddf2d
R600/SI: Add intrinsic for retrieving the current thread ID
...
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 186010
2013-07-10 16:36:52 +00:00
Michel Danzer
89ebe756dd
R600/SI: Initial support for LDS/GDS instructions
...
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 186009
2013-07-10 16:36:43 +00:00
Michel Danzer
47a9f6685b
R600/SI: Add intrinsics for texture sampling with user derivatives
...
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 186008
2013-07-10 16:36:36 +00:00
Vincent Lejeune
5517f57c42
R600: Do not predicate basic blocks with multiple ALU clauses
...
A test is not included, as it would be several thousand lines long.
To test this functionality, a test case must generate at least 2 ALU clauses,
where an ALU clause is ~110 instructions long.
NOTE: This is a candidate for the stable branch.
llvm-svn: 185943
2013-07-09 15:03:33 +00:00
Vincent Lejeune
0c1224c533
R600: Fix a rare bug where swizzle optimization returns wrong values
...
llvm-svn: 185942
2013-07-09 15:03:25 +00:00
Vincent Lejeune
48ea85c102
R600: Fix wrong export reswizzling
...
llvm-svn: 185941
2013-07-09 15:03:19 +00:00
Vincent Lejeune
d29844ad2e
R600: Use DAG lowering pass to handle fcos/fsin
...
NOTE: This is a candidate for the stable branch.
llvm-svn: 185940
2013-07-09 15:03:11 +00:00