llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-23 21:13:02 +02:00

Author	SHA1	Message	Date
Matt Arsenault	d857ef4f47	R600/SI: Fix i64 truncate to i1 llvm-svn: 228273	2015-02-05 06:05:13 +00:00
Tom Stellard	e60a7f33b1	R600/SI: Enable subreg liveness by default llvm-svn: 228228	2015-02-04 23:14:18 +00:00
Tom Stellard	cf5e2de361	R600/SI: Expand misaligned 16-bit memory accesses llvm-svn: 228190	2015-02-04 20:49:52 +00:00
Tom Stellard	e523dc6621	R600/SI: Make more store operations legal v2i32, i32, trunc i32 to i16, and trunc i32 to i8 stores are legal for all address spaces. We had marked them as custom in order to lower them for the private address space, but this is no longer necessary. This enables lowering of misaligned stores of these types in the DAGLegalizer. llvm-svn: 228189	2015-02-04 20:49:51 +00:00
Tom Stellard	5ea00f06af	R600: Don't promote i64 stores to v2i32 during DAG legalization We take care of this during instruction selection now. This fixes a potential infinite loop when lowering misaligned stores. llvm-svn: 228188	2015-02-04 20:49:49 +00:00
Marek Olsak	fa2a952243	R600/SI: Remove the -CHECK suffix from all FileCheck prefixes in LIT tests llvm-svn: 228040	2015-02-03 21:53:27 +00:00
Marek Olsak	7724be694b	R600/SI: Fix B64 VALU shifts on VI SI only has standard versions. VI only has REV versions. Tested-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 228037	2015-02-03 21:53:01 +00:00
Marek Olsak	31489249b9	R600/SI: Don't generate non-existent LSHL, LSHR, ASHR B32 variants on VI This can happen when a REV instruction is commuted. The trick is not to define the _vi versions of instructions, which has these consequences: - code generation will always fail if a pseudo cannot be lowered (very useful to catch bugs where an unsupported instruction somehow makes it to the printer) - ability to query if a pseudo can be lowered, which is done in commuteOpcode to prevent REV from commuting to non-REV on VI Tested-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 227990	2015-02-03 17:38:12 +00:00
Marek Olsak	08103dc7bd	R600/SI: Fix dependency between instruction writing M0 and S_SENDMSG on VI (v2) This fixes a hang when using an empty geometry shader. v2: - don't add s_nop when followed by s_waitcnt - cosmetic changes Tested-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 227986	2015-02-03 17:37:52 +00:00
Tom Stellard	5e1403ebec	R600/SI: 64-bit and larger memory access must be at least 4-byte aligned This is true for SI only. CI+ supports unaligned memory accesses, but this requires driver support, so for now we disallow unaligned accesses for all GCN targets. llvm-svn: 227822	2015-02-02 18:02:28 +00:00
Tom Stellard	79712c884f	R600/SI: Merge two test files llvm-svn: 227821	2015-02-02 18:02:23 +00:00
Matt Arsenault	c7e64d4e6b	R600/SI: Only select cvt_flr/cvt_rpi with no NaNs. These have different behavior from cvt_i32_f32 on NaN. llvm-svn: 227693	2015-01-31 21:28:13 +00:00
Matt Arsenault	97c228de40	R600/SI: Implement enableAggressiveFMAFusion Add tests for the various combines. This should always be at least cycle neutral on all subtargets for f64, and faster on some. For f32 we should prefer selecting v_mad_f32 over v_fma_f32. llvm-svn: 227484	2015-01-29 19:34:32 +00:00
Tom Stellard	33ad0a78c5	R600/SI: Define a schedule model and enable the generic machine scheduler The schedule model is not complete yet, and could be improved. llvm-svn: 227461	2015-01-29 16:55:25 +00:00
Tom Stellard	4758e65ded	R600: Move DataLayout to AMDGPUTargetMachine This is a follow up to r227113. It is now required to use the amdgcn target for SI and newer GPUs. llvm-svn: 227316	2015-01-28 16:04:26 +00:00
Marek Olsak	385598a9ea	R600/SI: Enable all tests that pass on VI without changes llvm-svn: 227214	2015-01-27 17:27:15 +00:00
Matt Arsenault	4cd6c4c4c3	R600: Cleanup or test Fix broken check lines, use multiple check prefixes, add an additional test for i1 or. llvm-svn: 227137	2015-01-26 21:16:10 +00:00
Tom Stellard	2e48fd18ce	R600/SI: Emit .hsa.version section for amdhsa OS llvm-svn: 226970	2015-01-23 23:59:08 +00:00
Tom Stellard	27630189de	R600/SI: Move i64 -> v2i32 load promotion into AMDGPUDAGToDAGISel::Select() We used to do this promotion during DAG legalization, but this caused an infinite loop in ExpandUnalignedLoad() because it assumed that i64 loads were legal if i64 was a legal type. It also seems better to report i64 loads as legal, since they actually are and we were just promoting them to simplify our tablegen files. llvm-svn: 226945	2015-01-23 22:05:45 +00:00
Jan Vesely	042d995634	R600: Try to use lower types for 64bit division if possible v2: add and enable tests for SI Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Matt Arsenault <Matthew.Arsenault@amd.com> llvm-svn: 226881	2015-01-22 23:42:43 +00:00
Tim Northover	5b0e908c64	DAGCombine: fold (or (and X, M), (and X, N)) -> (and X, (or M, N)) It can help with argument juggling on some targets, and is generally a good idea. llvm-svn: 226740	2015-01-21 23:17:19 +00:00
Matt Arsenault	ab16558365	R600: Add checks for urem/srem by a constant Make sure this uses the faster expansion using magic constants to avoid the full division path. llvm-svn: 226734	2015-01-21 22:56:15 +00:00
Matt Arsenault	58bdd73a01	R600: Add missing tests for i64 srem llvm-svn: 226713	2015-01-21 22:43:19 +00:00
Matt Arsenault	3c5cd0f801	R600/SI: Custom lower fround This fixes it for SI. It also removes the pattern used previously for Evergreen for f32. I'm not sure if the new R600 output is better or not, but it uses 1 fewer instruction if BFI is available. llvm-svn: 226682	2015-01-21 18:18:25 +00:00
Tim Northover	c2963c8019	Revert "DAGCombine: fold (or (and X, M), (and X, N)) -> (and X, (or M, N))" It hadn't gone through review yet, but was still on my local copy. This reverts commit r226663 llvm-svn: 226665	2015-01-21 15:48:52 +00:00
Tim Northover	c9cc73b336	DAGCombine: fold (or (and X, M), (and X, N)) -> (and X, (or M, N)) llvm-svn: 226663	2015-01-21 15:43:28 +00:00
Tom Stellard	c7e82f2d5a	R600/SI: Fix simple-loop.ll test llvm-svn: 226596	2015-01-20 19:33:02 +00:00
Tom Stellard	59a54d03fe	R600/SI: Add kill flag when copying scratch offset to a register This allows us to re-use the same register for the scratch offset when accessing large private arrays. llvm-svn: 226585	2015-01-20 17:49:45 +00:00
Tom Stellard	01ff7eb626	R600/SI: Don't store scratch buffer frame index in MUBUF offset field We don't have a good way of legalizing this if the frame index offset is more than 12 bits, which is the size of MUBUF's offset field, so now we store the frame index in the vaddr field. llvm-svn: 226584	2015-01-20 17:49:43 +00:00
Matt Arsenault	e51db749de	R600: Remove redundant test This is already covered in ftrunc.ll llvm-svn: 226412	2015-01-18 19:30:32 +00:00
Matt Arsenault	ffd1bd1d5d	R600: Clean up floor tests These were using different naming schemes, not using multiple check prefixes and not using -LABEL. llvm-svn: 226333	2015-01-16 22:11:00 +00:00
Matt Arsenault	5500d52bba	R600/SI: Add patterns for v_cvt_{flr|rpi}_i32_f32 llvm-svn: 226230	2015-01-15 23:58:35 +00:00
Matt Arsenault	2f04c34f62	R600/SI: Fix trailing comma with modifiers Instructions with 1 operand can still use source modifiers, so make sure we don't print an extra comma afterwards. llvm-svn: 226226	2015-01-15 23:17:03 +00:00
Matt Arsenault	1851a2058d	R600/SI: Improve fpext / fptrunc test coverage llvm-svn: 226197	2015-01-15 19:39:42 +00:00
Marek Olsak	6dc07f6738	R600/SI: Use 64-bit encoding by default for opcodes that are VOP3-only on VI llvm-svn: 226190	2015-01-15 18:43:01 +00:00
Matt Arsenault	d3aa8b3a41	R600/SI: Remove some redundant load testcases. This reduces coverage for Evergreen, since the more complete tests have those run lines disabled. llvm-svn: 225927	2015-01-14 01:35:26 +00:00
Matt Arsenault	424a0025ca	R600/SI: Fix bad code with unaligned byte vector loads Don't do the v4i8 -> v4f32 combine if the load will need to be expanded due to alignment. This stops adding instructions to repack into a single register that the v_cvt_ubyteN_f32 instructions read. llvm-svn: 225926	2015-01-14 01:35:22 +00:00
Matt Arsenault	22a9f67443	Implement new way of expanding extloads. Now that the source and destination types can be specified, allow doing an expansion that doesn't use an EXTLOAD of the result type. Try to do a legal extload to an intermediate type and extend that if possible. This generalizes the special case custom lowering of extloads R600 has been using to work around this problem. This also happens to fix a bug that would incorrectly use more aligned loads than should be used. llvm-svn: 225925	2015-01-14 01:35:17 +00:00
Matt Arsenault	4cc5aca737	R600: Implement getRsqrtEstimate Only do for f32 since I'm unclear on both what this is expecting for the refinement steps in terms of accuracy, and what f64 instruction actually provides. llvm-svn: 225827	2015-01-13 20:53:18 +00:00
Matt Arsenault	ef34769c29	R600: Make cttz / ctlz cheap to speculate Speculating things is generally good. SI+ has instructions for these for 32-bit values. This is still probably better even with the expansion for 64-bit values, although it is odd that this callback doesn't have the size as a parameter. llvm-svn: 225822	2015-01-13 19:46:48 +00:00
Matt Arsenault	bdd98aed5a	Combine fcmp + select to fminnum / fmaxnum if no nans and legal Also require unsafe FP math for now since there isn't a way to test for signed zeros. llvm-svn: 225744	2015-01-13 00:43:00 +00:00
Tom Stellard	d556c00330	R600/SI: Use RegisterOperands to specify which operands can accept immediates There are some operands which can take either immediates or registers and we were previously using different register class to distinguish between operands that could take immediates and those that could not. This patch switches to using RegisterOperands which should simplify the backend by reducing the number of register classes and also make it easier to implement the assembler. llvm-svn: 225662	2015-01-12 19:33:18 +00:00
Tom Stellard	d8d9d6ab95	R600/SI: Remove SIISelLowering::legalizeOperands() Its functionality has been replaced by calling SIInstrInfo::legalizeOperands() from SIISelLowering::AdjustInstrPostInstrSelection() and running the SIFoldOperands and SIShrinkInstructions passes. llvm-svn: 225445	2015-01-08 15:08:17 +00:00
Matthias Braun	22b8fb521f	RegisterCoalescer: Fix valuesIdentical() in some subrange merge cases. I got confused and assumed SrcIdx/DstIdx of the CoalescerPair is a subregister index in SrcReg/DstReg, but they are actually subregister indices of the coalesced register that get you back to SrcReg/DstReg when applied. Fixed the bug, improved comments and simplified code accordingly. Testcase by Tom Stellard! llvm-svn: 225415	2015-01-07 23:58:38 +00:00
Tom Stellard	1e2b9bc41b	R600/SI: Commute instructions to enable more folding opportunities llvm-svn: 225410	2015-01-07 22:44:19 +00:00
Tom Stellard	8d2d692413	R600/SI: Only fold immediates that have one use Folding the same immediate into multiple instructions will increase program size, which can hurt performance. llvm-svn: 225405	2015-01-07 22:18:27 +00:00
Tom Stellard	0014f9fa76	R600/SI: Add a V_MOV_B64 pseudo instruction This is used to simplify the SIFoldOperands pass and make it easier to fold immediates. llvm-svn: 225373	2015-01-07 20:27:25 +00:00
Tom Stellard	cf5b7b89ba	R600/SI: Teach SIFoldOperands to split 64-bit constants when folding This allows folding of sequences like: s[0:1] = s_mov_b64 4 v_add_i32 v0, s0, v0 v_addc_u32 v1, s1, v1 into v_add_i32 v0, 4, v0 v_add_i32 v1, 0, v1 llvm-svn: 225369	2015-01-07 19:56:17 +00:00
Matt Arsenault	53120c2e9a	R600/SI: Add combine for isinfinite pattern llvm-svn: 225310	2015-01-06 23:00:46 +00:00
Matt Arsenault	b663657a06	R600/SI: Pattern match isinf to v_cmp_class instructions llvm-svn: 225307	2015-01-06 23:00:41 +00:00
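
The DAGCombine fold listed in the log (r226740), `(or (and X, M), (and X, N)) -> (and X, (or M, N))`, rests on bitwise AND distributing over OR. The following is a minimal sketch that checks the identity exhaustively on small bit widths; the function names `unfolded`, `folded`, and `identity_holds` are hypothetical and are not taken from LLVM.

```python
from itertools import product


def unfolded(x, m, n):
    # Original form: two ANDs joined by an OR.
    return (x & m) | (x & n)


def folded(x, m, n):
    # Folded form: one OR of the masks, then a single AND.
    return x & (m | n)


def identity_holds(bits=4):
    # Compare both forms for every (x, m, n) triple of `bits`-wide values.
    values = range(1 << bits)
    return all(
        unfolded(x, m, n) == folded(x, m, n)
        for x, m, n in product(values, repeat=3)
    )
```

Because the operation is bitwise, each bit position is independent, so agreement on narrow widths carries over to wider integer types.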

768 Commits