Jan Vesely
3b8464bc4e
R600: Expand vector fceil
...
Move fp64 fceil tests to fceil64.ll
v2: rebase
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 211194
2014-06-18 17:57:29 +00:00
Matt Arsenault
1c0d02231f
Work around ridiculous warning.
...
Apparently C++ doesn't really have hex floating point constants.
llvm-svn: 211192
2014-06-18 17:45:58 +00:00
Matt Arsenault
a46ba4c9d1
R600/SI: Add intrinsics for brev instructions
...
llvm-svn: 211187
2014-06-18 17:13:57 +00:00
Matt Arsenault
068d030935
R600: Implement f64 ftrunc, ffloor and fceil.
...
CI has instructions for these, so this fixes them for older hardware.
llvm-svn: 211183
2014-06-18 17:05:30 +00:00
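On hardware without native f64 rounding instructions, ftrunc can be built from integer operations on the IEEE-754 bit pattern, and fceil / ffloor can then be derived from it. The following is a minimal sketch of that general technique, not the code from this commit; the function names `truncF64`, `ceilF64` and `floorF64` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Bit-level reinterpretation helpers (memcpy avoids aliasing problems).
static std::uint64_t bitsOf(double x) {
  std::uint64_t b;
  std::memcpy(&b, &x, sizeof b);
  return b;
}

static double fromBits(std::uint64_t b) {
  double x;
  std::memcpy(&x, &b, sizeof x);
  return x;
}

// Hypothetical helper: round toward zero by clearing the mantissa bits
// that lie below the binary point.
double truncF64(double x) {
  const std::uint64_t bits = bitsOf(x);
  // Unbiased exponent taken from bits 62..52.
  const int exp = static_cast<int>((bits >> 52) & 0x7ff) - 1023;
  if (exp < 0)
    return fromBits(bits & (1ull << 63)); // |x| < 1: keep only the sign
  if (exp > 51)
    return x; // already integral, or infinity / NaN
  const std::uint64_t fractMask = 0x000fffffffffffffull >> exp;
  return fromBits(bits & ~fractMask);
}

// ceil(x) = trunc(x) + 1 when x is positive and not already integral.
double ceilF64(double x) {
  const double t = truncF64(x);
  return (x > 0.0 && x != t) ? t + 1.0 : t;
}

// floor(x) = trunc(x) - 1 when x is negative and not already integral.
double floorF64(double x) {
  const double t = truncF64(x);
  return (x < 0.0 && x != t) ? t - 1.0 : t;
}
```

The adjustment by 1.0 is exact because it only happens when the exponent is at most 51, that is, when the truncated value is below 2^52 in magnitude.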
Matt Arsenault
77f7e6fc35
R600: Custom lower f64 frint for pre-CI
...
llvm-svn: 211182
2014-06-18 17:05:26 +00:00
Matt Arsenault
71fb43e88e
R600/SI: Match ctlz_zero_undef
...
llvm-svn: 211115
2014-06-17 17:36:24 +00:00
Tom Stellard
a529beed9c
R600: Use LDS and vectors for private memory
...
llvm-svn: 211110
2014-06-17 16:53:14 +00:00
Tom Stellard
6987d06184
SelectionDAG: Expand i64 = FP_TO_SINT i32
...
llvm-svn: 211108
2014-06-17 16:53:07 +00:00
Matt Arsenault
8d575afe8e
Fix copy paste error
...
llvm-svn: 211003
2014-06-15 21:22:52 +00:00
Matt Arsenault
a88eef222c
R600: Remove a few more things from AMDILISelLowering
...
Try to keep all the setOperationActions for integer ops
together.
llvm-svn: 211001
2014-06-15 21:08:58 +00:00
Matt Arsenault
1f47d520f5
R600: Fix assert on vector sdiv
...
llvm-svn: 211000
2014-06-15 21:08:54 +00:00
Matt Arsenault
512b09be91
R600: Move / cleanup more leftover AMDIL stuff.
...
llvm-svn: 210998
2014-06-15 20:23:38 +00:00
Matt Arsenault
d4919ac014
R600: Move division custom lowering out of AMDILISelLowering
...
llvm-svn: 210997
2014-06-15 20:08:02 +00:00
Matt Arsenault
6f5ac69231
R600: Report that integer division is expensive.
...
Divides by weird constants now emit much better code.
llvm-svn: 210995
2014-06-15 19:48:16 +00:00
Matt Arsenault
b2c8575d08
R600: Fix asserts related to constant initializers
...
This would assert if a constant address space was extern
and therefore didn't have an initializer. If the initializer
was undef, it would hit the unreachable unhandled initializer case.
An extern global should never really occur since we don't have
machine linking, but bugpoint likes to remove initializers.
llvm-svn: 210967
2014-06-14 04:26:05 +00:00
Matt Arsenault
fd04db6d9e
R600: Use address space enum instead of value
...
llvm-svn: 210966
2014-06-14 04:26:01 +00:00
Matt Arsenault
28d84d0e7f
R600: Cleanup some old AMDIL stuff.
...
Move / delete some of the more obviously wrong
setOperationAction calls. Most of these are setting Expand
for types that aren't legal which is the default anyway.
Leave stuff that might require more thought on whether it's
junk or not as it is.
No functionality change.
llvm-svn: 210922
2014-06-13 17:20:53 +00:00
Matt Arsenault
e8c6185eba
R600/SI: Fix selection error on i64 rotl / rotr.
...
Evergreen is still broken due to missing shl_parts.
llvm-svn: 210885
2014-06-13 04:00:30 +00:00
Matt Arsenault
e19ddbd0dc
R600: Mostly remove remaining AMDIL intrinsics.
...
Delete all unused ones, and add new AMDGPU named intrinsics for
the ones that are. Handle the old AMDIL names for compatibility (although
remove their GCCBuiltin names) and add tests since there weren't any
for these before.
llvm-svn: 210827
2014-06-12 21:15:44 +00:00
Matt Arsenault
a75d166beb
R600/SI: Use v_cvt_f32_ubyte* instructions
...
This eliminates extra extract instructions when loading an i8 vector to
a float vector.
llvm-svn: 210666
2014-06-11 17:50:44 +00:00
Rafael Espindola
40744fa427
Try to fix the msvc build.
...
llvm-svn: 210636
2014-06-11 04:41:37 +00:00
Matt Arsenault
3090d98faf
Use cast instead of assert + dyn_cast
...
llvm-svn: 210628
2014-06-11 03:30:06 +00:00
Matt Arsenault
6728d3c17d
R600: Add helper functions.
...
Extract these from some of my other patches, since this
is the only thing really making them dependent on each other.
llvm-svn: 210627
2014-06-11 03:29:54 +00:00
Matt Arsenault
4f96643a42
R600: Use BCNT_INT for evergreen
...
llvm-svn: 210569
2014-06-10 19:18:28 +00:00
Matt Arsenault
8407076508
R600/SI: Use bcnt instruction for ctpop
...
llvm-svn: 210567
2014-06-10 19:18:21 +00:00
Matt Arsenault
d30b483e1a
R600: Handle fcopysign
...
llvm-svn: 210564
2014-06-10 19:00:20 +00:00
Matt Arsenault
a34a3c834c
R600: Fix selection failure for vector bswap
...
llvm-svn: 210475
2014-06-09 16:20:25 +00:00
Matt Arsenault
a36a2916ac
R600: Set all float vector expands in the same place
...
llvm-svn: 209988
2014-06-01 07:38:21 +00:00
Matt Arsenault
bfc007dbb5
R600: Try to convert BFE back to standard bit ops when possible.
...
This allows existing DAG combines to work on them, and then
we can re-match to BFE if necessary during instruction selection.
llvm-svn: 209462
2014-05-22 18:09:12 +00:00
Matt Arsenault
90d0fd2ea0
R600: Add dag combine for BFE
...
llvm-svn: 209461
2014-05-22 18:09:07 +00:00
Matt Arsenault
4ab9246e99
R600: Implement ComputeNumSignBitsForTargetNode for BFE
...
llvm-svn: 209460
2014-05-22 18:09:03 +00:00
Matt Arsenault
3728da5d51
R600: Implement computeMaskedBitsForTargetNode for BFE
...
llvm-svn: 209459
2014-05-22 18:09:00 +00:00
Matt Arsenault
e43426533f
R600: Add intrinsics for mad24
...
llvm-svn: 209456
2014-05-22 18:00:15 +00:00
Matt Arsenault
8ec42a3269
R600: Add comment describing problems with LowerConstantInitializer
...
llvm-svn: 209333
2014-05-21 22:59:17 +00:00
Matt Arsenault
094f9f1e9c
R600: Partially fix constant initializers for structs and vectors.
...
This should extend the current workaround to work with structs
that only contain legal, scalar types.
llvm-svn: 209331
2014-05-21 22:42:42 +00:00
Matt Arsenault
90ef7a5eaa
Use cast<> instead of unchecked dyn_cast
...
llvm-svn: 209310
2014-05-21 18:03:59 +00:00
Matt Arsenault
6a9e6f69e7
Use range for
...
llvm-svn: 208922
2014-05-15 21:44:05 +00:00
Jay Foad
e0eac700cb
Rename ComputeMaskedBits to computeKnownBits. "Masked" has been
...
inappropriate since it lost its Mask parameter in r154011.
llvm-svn: 208811
2014-05-14 21:14:37 +00:00
Matt Arsenault
c2251d492b
R600: Add mul24 intrinsics
...
llvm-svn: 208604
2014-05-12 17:49:57 +00:00
Matt Arsenault
8358bf5227
Fix return before else
...
llvm-svn: 208510
2014-05-11 21:24:41 +00:00
Tom Stellard
9562e8a6ba
R600: Expand i64 SELECT_CC
...
llvm-svn: 208430
2014-05-09 16:42:19 +00:00
Tom Stellard
83d3208148
R600: Move MIN/MAX matching from LowerOperation() to PerformDAGCombine()
...
llvm-svn: 208429
2014-05-09 16:42:16 +00:00
Matt Arsenault
119209fcfe
R600: Promote f64 vector load/stores to i64 for consistency
...
llvm-svn: 208344
2014-05-08 18:01:56 +00:00
Tom Stellard
1291d8f8c2
R600: Expand i64 ISD::SUB
...
llvm-svn: 208005
2014-05-05 21:47:15 +00:00
Tom Stellard
18ca382db4
R600: Expand vector sin and cos.
...
v2: move code to AMDGPUISelLowering.cpp
squash with tests (both EG and SI)
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 207845
2014-05-02 15:41:47 +00:00
Tom Stellard
05e86018ff
R600: Expand TruncStore i64 -> {i16,i8}
...
llvm-svn: 207844
2014-05-02 15:41:46 +00:00
Tom Stellard
9112e301ba
R600: optimize the UDIVREM 64 algorithm
...
This is a squash of several optimization commits:
- calculate DIV_Lo and DIV_Hi separately
- use BFE_U32 if we are operating on 32bit values
- use precomputed constants instead of shifting in UDIVREM
- skip the first 32 iterations of udivrem
v2: Check whether BFE is supported before using it
Patch by: Jan Vesely
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 207589
2014-04-29 23:12:46 +00:00
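Two of the optimizations listed above (computing the high and low quotient halves separately, and skipping the first 32 iterations) can be illustrated with a plain shift-and-subtract division. This is only a sketch of the idea under stated assumptions, not the code from the patch; `UDivRem64` and `udivrem64` are hypothetical names, and the divisor is assumed to be non-zero.

```cpp
#include <cassert>
#include <cstdint>

struct UDivRem64 {
  std::uint64_t quot;
  std::uint64_t rem;
};

// Hypothetical sketch: 64-bit unsigned divide/remainder built from
// 32-bit pieces. Precondition: den != 0.
UDivRem64 udivrem64(std::uint64_t num, std::uint64_t den) {
  const std::uint32_t numHi = static_cast<std::uint32_t>(num >> 32);
  const std::uint32_t numLo = static_cast<std::uint32_t>(num);
  const std::uint32_t denHi = static_cast<std::uint32_t>(den >> 32);
  const std::uint32_t denLo = static_cast<std::uint32_t>(den);

  // High half: if the divisor fits in 32 bits, a single 32-bit
  // divide/remainder replaces the first 32 loop iterations. Otherwise
  // den > numHi, so the high quotient is 0 and the running remainder
  // is simply numHi.
  std::uint32_t quotHi = 0;
  std::uint64_t rem = numHi;
  if (denHi == 0) {
    quotHi = numHi / denLo;
    rem = numHi % denLo;
  }

  // Low half: bring in the low dividend bits one at a time, most
  // significant first. rem starts below 2^32 and gains at most one bit
  // per step, so the shift cannot overflow 64 bits.
  std::uint32_t quotLo = 0;
  for (int i = 31; i >= 0; --i) {
    rem = (rem << 1) | ((numLo >> i) & 1u);
    if (rem >= den) {
      rem -= den;
      quotLo |= 1u << i;
    }
  }

  return {(static_cast<std::uint64_t>(quotHi) << 32) | quotLo, rem};
}
```

For example, `udivrem64(100, 7)` yields a quotient of 14 and a remainder of 2.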
Tom Stellard
42ce4f9d81
R600: Implement iterative algorithm for udivrem
...
Initial implementation, rather slow
Patch by: Jan Vesely
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 207588
2014-04-29 23:12:45 +00:00
Tom Stellard
cbed2c43ab
R600: Change UDIV/UREM to UDIVREM when legalizing types
...
When legalizing ops, with UDIV/UREM set to expand, they automatically
expand to UDIVREM (if legal or custom).
We need to do this manually for legalize types.
v2:
SI should be set to Expand because the type is legal, and it is
automatically lowered to UDIVREM if UDIVREM is Legal/Custom
R600 should set UDIV/UREM to Custom because it needs to lower them
during type legalization
Patch by: Jan Vesely
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 207587
2014-04-29 23:12:43 +00:00
Tom Stellard
ce760e4826
R600: remove unused variable
...
Patch by: Jan Vesely
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 207586
2014-04-29 23:12:38 +00:00
Craig Topper
9683cb114b
Convert more SelectionDAG functions to use ArrayRef.
...
llvm-svn: 207397
2014-04-28 05:57:50 +00:00
Craig Topper
536995c0a7
Convert SelectionDAG::getMergeValues to use ArrayRef.
...
llvm-svn: 207374
2014-04-27 19:20:57 +00:00
Craig Topper
1b1f54bcca
Convert SelectionDAG::getNode methods to use ArrayRef<SDValue>.
...
llvm-svn: 207327
2014-04-26 18:35:24 +00:00
Matt Arsenault
8deebf765d
R600: Fix function name printing in LowerCall
...
v2: Check both ExternalSymbol and GlobalAddress
Patch by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 207282
2014-04-25 22:22:01 +00:00
Craig Topper
6d411cb95a
[C++] Use 'nullptr'. Target edition.
...
llvm-svn: 207197
2014-04-25 05:30:21 +00:00
Matt Arsenault
f022fe68e4
R600: Emit error instead of unreachable on function call
...
llvm-svn: 206904
2014-04-22 16:42:00 +00:00
Matt Arsenault
c47555412c
R600: Change how vector truncating stores are packed.
...
Don't introduce new operations on an illegal sub 32-bit type.
Do the operations on a 32-bit value, and then use a truncating store.
llvm-svn: 206864
2014-04-22 04:11:14 +00:00
Matt Arsenault
01a0b32658
R600: Make sign_extend_inreg legal.
...
Don't know why I didn't just do this in the first place.
llvm-svn: 206862
2014-04-22 03:49:30 +00:00
Tom Stellard
a405a50d5f
R600: Add comment clarifying use of sext for result of MUL_U24
...
llvm-svn: 206501
2014-04-17 21:00:13 +00:00
Matt Arsenault
15d9205991
R600: Expand sign extension of vectors.
...
Setting vector types to expand will result in scalarization on pre SI hw,
as those gpus don't have vector shifts either.
Expand also i32 vectors, this helps llvm make the correct decision
about scalarizing the vector ops.
v2: move setOperation() calls to R600ISelLowering.cpp.
clean up the SI code to make it obvious that this patch is a nop for SI
Patch by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 206348
2014-04-16 01:41:30 +00:00
Matt Arsenault
a43cbe5951
R600/SI: Fix loads of i1
...
llvm-svn: 206330
2014-04-15 22:28:39 +00:00
Nick Lewycky
82ad9fc7c8
Break PseudoSourceValue out of the Value hierarchy. It is now the root of its own tree containing FixedStackPseudoSourceValue (which you can use isa/dyn_cast on) and MipsCallEntry (which you can't). Anything that needs to use either a PseudoSourceValue* or a Value* is strongly encouraged to use a MachinePointerInfo instead.
...
llvm-svn: 206255
2014-04-15 07:22:52 +00:00
Matt Arsenault
65fde80ac6
Move ExtractVectorElements to SelectionDAG.
...
This seems generally useful, and makes sense to
go along with SplitVector.
llvm-svn: 206041
2014-04-11 17:47:30 +00:00
Tom Stellard
557024a30d
R600: Match 24-bit arithmetic patterns in a Target DAGCombine
...
Moving these patterns from TableGen files to PerformDAGCombine()
should allow us to generate better code by eliminating unnecessary
shifts and extensions earlier.
This also fixes a bug where the MAD pattern was calling
SimplifyDemandedBits with a 24-bit mask on the first operand
even when the full pattern wasn't being matched. This occasionally
resulted in some instructions being incorrectly deleted from the
program.
v2:
- Fix bug with 64-bit mul
llvm-svn: 205731
2014-04-07 19:45:41 +00:00
Matt Arsenault
625d4b3956
Use .data() instead of &x[0]
...
llvm-svn: 205722
2014-04-07 16:44:24 +00:00
Matt Arsenault
c36c1df67d
R600: Compute masked bits for min and max
...
llvm-svn: 205242
2014-03-31 19:35:33 +00:00
Matt Arsenault
0d30a17857
R600: Add BFE, BFI, and BFM intrinsics to help with writing tests.
...
llvm-svn: 205236
2014-03-31 18:21:18 +00:00
Matt Arsenault
a8674ddad8
R600: Add target nodes for BFM and BFI
...
llvm-svn: 205235
2014-03-31 18:21:13 +00:00
Matt Arsenault
7f99777a74
R600: Implement isZExtFree.
...
This allows 64-bit operations that are truncated to be reduced
to 32-bit ones.
llvm-svn: 204946
2014-03-27 17:23:31 +00:00
Matt Arsenault
e42a0c31f3
R600/SI: Fix unreachable with a sext_in_reg to an illegal type.
...
llvm-svn: 204945
2014-03-27 17:23:24 +00:00
Matt Arsenault
97718f1b49
R600: Add a testcase for sext_in_reg I missed.
...
This sext_inreg i32 in i64 case was already handled, but not enabled.
llvm-svn: 204840
2014-03-26 18:31:06 +00:00
Matt Arsenault
63960a4cd8
R600: Move computeMaskedBitsForTargetNode out of AMDILISelLowering.cpp
...
Remove handling of select_cc, since it makes no sense to be there. This
now does nothing, but I'll be adding some handling of other target nodes
soon.
llvm-svn: 204743
2014-03-25 18:18:27 +00:00
Matt Arsenault
7ae7f52221
R600: Implement isNarrowingProfitable.
...
llvm-svn: 204658
2014-03-24 19:43:31 +00:00
Matt Arsenault
553297669c
R600: Match sign_extend_inreg to BFE instructions
...
llvm-svn: 204072
2014-03-17 18:58:11 +00:00
Matt Arsenault
114c69ce5a
R600: Remove unnecessary attempt to zext a pointer.
...
Private pointers are now always 32-bits.
llvm-svn: 203989
2014-03-15 00:08:26 +00:00
Matt Arsenault
001181bb1b
R600: Code cleanup.
...
Use sign_extend_inreg and getZeroExtendInReg instead of
using the bit operations they expand into.
llvm-svn: 203988
2014-03-15 00:08:22 +00:00
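For reference, sign_extend_inreg on a 32-bit value is equivalent to a left shift followed by an arithmetic right shift, and it produces the same result as a signed bitfield extract at offset 0. The sketch below illustrates both forms; the helper names are hypothetical, and `bfeI32` is only a software model of a signed bitfield extract (an assumed offset/width interface), not a statement of the hardware instruction's exact semantics.

```cpp
#include <cassert>
#include <cstdint>

// Sign-extend the low `width` bits of x (1 <= width <= 32) using the
// shift pair that sign_extend_inreg expands into. This relies on
// modular unsigned-to-signed conversion and an arithmetic right shift
// of negative values, both guaranteed since C++20 and near-universal
// on earlier compilers.
std::int32_t sextInRegShifts(std::uint32_t x, unsigned width) {
  const unsigned shift = 32 - width;
  return static_cast<std::int32_t>(x << shift) >> shift;
}

// Hypothetical model of a signed bitfield extract: take `width` bits
// starting at bit `offset` and sign-extend them.
// Assumes 1 <= width <= 32 and offset + width <= 32.
std::int32_t bfeI32(std::uint32_t x, unsigned offset, unsigned width) {
  const std::uint32_t mask = (width == 32) ? ~0u : ((1u << width) - 1u);
  const std::uint32_t field = (x >> offset) & mask;
  const std::uint32_t signBit = 1u << (width - 1);
  // Flipping and then subtracting the sign bit propagates it upward.
  return static_cast<std::int32_t>((field ^ signBit) - signBit);
}
```

With these definitions, `sextInRegShifts(x, w)` and `bfeI32(x, 0, w)` agree for every valid width, which is why one form can be matched to the other.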
Matt Arsenault
469ede65b2
R600: Fix trunc store from i64 to i1
...
llvm-svn: 203695
2014-03-12 18:45:52 +00:00
Matt Arsenault
16c4bdf77e
R600: Calculate store mask instead of using switch.
...
llvm-svn: 203527
2014-03-11 01:38:53 +00:00
Matt Arsenault
805d9618a9
Use .data() instead of &x[0]
...
llvm-svn: 203516
2014-03-11 00:01:31 +00:00
Matt Arsenault
8140d7d370
R600: Fix extloads from i8 / i16 to i64.
...
This appears to only be working for global loads. Private
and local break for other reasons.
llvm-svn: 203135
2014-03-06 17:34:12 +00:00
Matt Arsenault
f68a94e609
R600/SI: Expand selects on vectors.
...
llvm-svn: 203134
2014-03-06 17:34:03 +00:00
Matt Arsenault
64d5c4df35
Fix typo
...
llvm-svn: 203013
2014-03-05 21:47:22 +00:00
Matt Arsenault
a3de4dc001
R600/SI - Add new CI arithmetic instructions.
...
Does not yet include larger part required
to match v_mad_i64_i32 / v_mad_u64_u32.
llvm-svn: 202077
2014-02-24 21:01:28 +00:00
Matt Arsenault
9b1fec610f
Fix DOT4 missing from getTargetOpcodeName
...
llvm-svn: 202075
2014-02-24 21:01:21 +00:00
Tom Stellard
988925aeae
R600/SI: Expand all v8[if]32 operations
...
llvm-svn: 201371
2014-02-13 23:34:15 +00:00
Benjamin Kramer
b51d0de00f
R600: Always implement both versions of isTruncateFree and add a sanity check.
...
llvm-svn: 201222
2014-02-12 10:17:54 +00:00
Matt Arsenault
38609a2ae1
R600: Implement isTruncateFree
...
Truncation is just accessing a subregister for any multiple of
the register size, so it's free.
llvm-svn: 201107
2014-02-10 19:57:42 +00:00
Tom Stellard
d35a25ea14
R600/SI: Expand i1 BR_CC
...
This fixes crashes in the OpenCV test suite and also the scrypt
kernel in bfgminer.
I was unable to come up with a reduced test case for this.
https://bugs.freedesktop.org/show_bug.cgi?id=72785
llvm-svn: 200776
2014-02-04 17:18:43 +00:00
Tom Stellard
f4a180e50b
R600: Enable vector fpow.
...
The OpenCL specs say: "The vector versions of the math functions operate
component-wise. The description is per-component."
Patch by: Jan Vesely
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 200773
2014-02-04 17:18:37 +00:00
Tom Stellard
d424fe57e4
R600: Add support for global addresses with constant initializers
...
llvm-svn: 199825
2014-01-22 19:24:21 +00:00
Tom Stellard
369c33de20
R600/SI: Add support for i8 and i16 private loads/stores
...
llvm-svn: 199823
2014-01-22 19:24:14 +00:00
Tom Stellard
b39ac07c09
R600: Allow ftrunc
...
v2: Add ftrunc->TRUNC pattern instead of replacing int_AMDGPU_trunc
v3: move ftrunc pattern next to TRUNC definition, it's available since R600
Patch By: Jan Vesely
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 197783
2013-12-20 05:11:55 +00:00
Matt Arsenault
6520333e09
Don't manually calculate size in bytes
...
llvm-svn: 197327
2013-12-14 18:21:59 +00:00
Matt Arsenault
9db19365b4
Use llvm_unreachable instead of assert(0)
...
llvm-svn: 196971
2013-12-10 21:37:42 +00:00
Tom Stellard
95624c101d
R600: Expand vector FABS
...
NOTE: This is a candidate for the 3.4 branch.
llvm-svn: 195881
2013-11-27 21:23:39 +00:00
Tom Stellard
0a14ce13e1
R600: Add support for ISD::FROUND
...
NOTE: This is a candidate for the 3.4 branch.
llvm-svn: 195878
2013-11-27 21:23:20 +00:00
Matt Arsenault
084675c776
Add target hook to prevent folding some bitcasted loads.
...
This is to avoid this transformation in some cases:
fold (conv (load x)) -> (load (conv*)x)
On architectures that don't natively support some vector
loads efficiently, casting the load to a smaller vector of
larger types and loading is more efficient.
Patch by Micah Villmow.
llvm-svn: 194783
2013-11-15 04:42:23 +00:00
Tom Stellard
c38302be13
R600/SI: Add support for private address space load/store
...
Private address space is emulated using the register file with
MOVRELS and MOVRELD instructions.
llvm-svn: 194626
2013-11-13 23:36:50 +00:00
Vincent Lejeune
5f1f106136
R600: Fix LowerUDIVREM
...
llvm-svn: 194153
2013-11-06 17:36:04 +00:00
Tom Stellard
e058534e9a
R600: Custom lower f32 = uint_to_fp i64
...
llvm-svn: 193701
2013-10-30 17:22:05 +00:00