llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-24 05:23:45 +02:00

Author	SHA1	Message	Date
Matt Arsenault	628cc59d6b	R600/SI: f64 frint is legal on CI llvm-svn: 206475	2014-04-17 17:06:37 +00:00
Matt Arsenault	adccea7f1a	R600/SI: Fix zext from i1 to i64 llvm-svn: 206437	2014-04-17 02:03:08 +00:00
Matt Arsenault	f7ba7017b0	R600: Extend r600 sign_extend_inreg tests for EG Patch by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 206349	2014-04-16 01:41:34 +00:00
Matt Arsenault	f045301bf1	R600/SI: Print more immediates in hex format Print in decimal for inline immediates, and hex otherwise. Use hex always for offsets in addressing offsets. This approximately matches what the shader compiler does. llvm-svn: 206335	2014-04-15 22:32:49 +00:00
Matt Arsenault	a43cbe5951	R600/SI: Fix loads of i1 llvm-svn: 206330	2014-04-15 22:28:39 +00:00
Tom Stellard	9ea60803c4	SelectionDAG: Use helper function to improve legalization of ISD::MUL The TargetLowering::expandMUL() helper contains lowering code extracted from the DAGTypeLegalizer and allows the SelectionDAGLegalizer to expand more ISD::MUL patterns without having to use a library call. llvm-svn: 206037	2014-04-11 16:12:01 +00:00
Matt Arsenault	ffd08a2504	R600/SI: Match not instruction. llvm-svn: 205837	2014-04-09 07:16:16 +00:00
Tom Stellard	5e0d95bd87	R600/SI: Handle INSERT_SUBREG in SIFixSGPRCopies llvm-svn: 205732	2014-04-07 19:45:45 +00:00
Tom Stellard	557024a30d	R600: Match 24-bit arithmetic patterns in a Target DAGCombine Moving these patterns from TableGen files to PerformDAGCombine() should allow us to generate better code by eliminating unnecessary shifts and extensions earlier. This also fixes a bug where the MAD pattern was calling SimplifyDemandedBits with a 24-bit mask on the first operand even when the full pattern wasn't being matched. This occasionally resulted in some instructions being incorrectly deleted from the program. v2: - Fix bug with 64-bit mul llvm-svn: 205731	2014-04-07 19:45:41 +00:00
Tom Stellard	981d7d5d7e	R600: Correct opcode for BFE_INT According to AMD documentation, the correct opcode for BFE_INT is 0x5, not 0x4 Fixes Arithm/Absdiff.Mat/3 OpenCV test Patch by: Bruno Jiménez llvm-svn: 205562	2014-04-03 20:19:29 +00:00
Tom Stellard	76577a21a1	R600/SI: Lower 64-bit immediates using REG_SEQUENCE llvm-svn: 205561	2014-04-03 20:19:27 +00:00
Tom Stellard	7fed4dd0dd	TargetLibraryInfo: Disable memcpy and memset on R600 There are no implementations of these for R600. llvm-svn: 205455	2014-04-02 19:53:29 +00:00
Matt Arsenault	df61dc156f	Fix missing RUN line in test llvm-svn: 205341	2014-04-01 18:34:13 +00:00
Matt Arsenault	0062eb7871	Make isSetCCEquivalent respect the TargetBooleanContents llvm-svn: 205336	2014-04-01 18:13:26 +00:00
Matt Arsenault	c36c1df67d	R600: Compute masked bits for min and max llvm-svn: 205242	2014-03-31 19:35:33 +00:00
Matt Arsenault	0d30a17857	R600: Add BFE, BFI, and BFM intrinsics to help with writing tests. llvm-svn: 205236	2014-03-31 18:21:18 +00:00
Tom Stellard	c6c05561d5	R600/SI: Lower i64 SELECT by bitcasting to a vector type This allows us to replace ISD::EXTRACT_ELEMENT, which is lowered using shifts, with ISD::EXTRACT_VECTOR_ELT, which is a no-op. llvm-svn: 205187	2014-03-31 14:01:55 +00:00
Matt Arsenault	7f99777a74	R600: Implement isZExtFree. This allows 64-bit operations that are truncated to be reduced to 32-bit ones. llvm-svn: 204946	2014-03-27 17:23:31 +00:00
Matt Arsenault	e42a0c31f3	R600/SI: Fix unreachable with a sext_in_reg to an illegal type. llvm-svn: 204945	2014-03-27 17:23:24 +00:00
Matt Arsenault	97718f1b49	R600: Add a testcase for sext_in_reg I missed. This sext_inreg i32 in i64 case was already handled, but not enabled. llvm-svn: 204840	2014-03-26 18:31:06 +00:00
Matt Arsenault	a88c889ce0	R600: Add failing testcase for <3 x i32> stores. This is supposed to have the same store size and alignment as <4 x i32>, but currently is split into a 64-bit and 32-bit store. llvm-svn: 204729	2014-03-25 16:50:55 +00:00
Matt Arsenault	94cdf74a4b	R600/SI: Fix extra mov from legalizing 64-bit SALU ops. Check the register class of each operand individually to avoid an extra copy to a vgpr. llvm-svn: 204662	2014-03-24 20:08:13 +00:00
Matt Arsenault	3436234471	R600/SI: Sub-optimal fix for 64-bit immediates with SALU ops. No longer asserts, but now you get moves loading legal immediates into the split 32-bit operations. llvm-svn: 204661	2014-03-24 20:08:09 +00:00
Matt Arsenault	ed12a24627	R600/SI: Fix 64-bit bit ops that require the VALU. Try to match scalar and first like the other instructions. Expand 64-bit ands to a pair of 32-bit ands since that is not available on the VALU. llvm-svn: 204660	2014-03-24 20:08:05 +00:00
Matt Arsenault	7ae7f52221	R600: Implement isNarrowingProfitable. llvm-svn: 204658	2014-03-24 19:43:31 +00:00
Matt Arsenault	e063f39ed3	R600/SI: Fix 64-bit private loads. llvm-svn: 204630	2014-03-24 17:50:46 +00:00
Matt Arsenault	f0af6362fd	R600/SI: Move instruction patterns to scalar versions. Some of them also had the pattern on both, so this removes the duplication. llvm-svn: 204492	2014-03-21 18:01:18 +00:00
Tom Stellard	e5e3293278	R600/SI: Handle MUBUF instructions in SIInstrInfo::moveToVALU() llvm-svn: 204476	2014-03-21 15:51:57 +00:00
Tom Stellard	8078855521	R600/SI: Handle S_MOV_B64 in SIInstrInfo::moveToVALU() llvm-svn: 204475	2014-03-21 15:51:54 +00:00
Matt Arsenault	a604a1a412	R600/SI: Add support for 64-bit LDS writes llvm-svn: 204274	2014-03-19 22:19:54 +00:00
Matt Arsenault	35f86bd433	R600/SI: Add support for 64-bit LDS loads. v2: -Use correct opcode for DS_READ_64 llvm-svn: 204273	2014-03-19 22:19:52 +00:00
Matt Arsenault	38344ebbaf	R600/SI: Match i16 immediate offset of LDS instructions. llvm-svn: 204272	2014-03-19 22:19:49 +00:00
Matt Arsenault	194b9e9539	R600/SI: Fix test checking wrong instruction operand. The source and destination happen to be the same register. llvm-svn: 204271	2014-03-19 22:19:45 +00:00
Matt Arsenault	45311f1864	R600/SI: Don't display the GDS bit. It isn't actually used now, and probably never will be, plus it makes tests less annoying. I also think SC prints GDS instructions as a separate instruction name. llvm-svn: 204270	2014-03-19 22:19:43 +00:00
NAKAMURA Takumi	a0deabb112	CodeGen/R600/v_cndmask.ll: Relax an expression to unbreak msvcrt. V_CNDMASK_B32_e64 v0, v0, -1.#QNAN0e+00, s[2:3], 0, 0, 0, 0 FIXME: We really need to implement our formatter... llvm-svn: 204118	2014-03-18 06:17:22 +00:00
Kevin Enderby	b8221f3c03	Making a guess to fix the test case with r204056 to get the build bot working. llvm-svn: 204073	2014-03-17 19:00:03 +00:00
Matt Arsenault	553297669c	R600: Match sign_extend_inreg to BFE instructions llvm-svn: 204072	2014-03-17 18:58:11 +00:00
Tom Stellard	6f60ceca31	R600/SI: Fix implementation of isInlineConstant() used by the verifier The type of the immediates should not matter as long as the encoding is equivalent to the encoding of one of the legal inline constants. Tested-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 204056	2014-03-17 17:03:52 +00:00
Tom Stellard	6b4e505e41	R600/SI: Use correct dest register class for V_READFIRSTLANE_B32 This instruction writes to a 32-bit SGPR. This change required adding the 32-bit VCC_LO and VCC_HI registers, because the full VCC register is 64 bits. This fixes verifier errors on several of the indirect addressing piglit tests. Tested-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 204055	2014-03-17 17:03:51 +00:00
Tom Stellard	c33b600343	R600: LDS instructions shouldn't implicitly define OQAP LDS instructions are pseudo instructions which model the OQAP defs and uses within a single instruction. This fixes a hang in the opencv MedianFilter tests. llvm-svn: 203818	2014-03-13 17:13:04 +00:00
Matt Arsenault	469ede65b2	R600: Fix trunc store from i64 to i1 llvm-svn: 203695	2014-03-12 18:45:52 +00:00
Tom Stellard	bee4678d48	R600/SI: Using SGPRs is illegal for instructions that read carry-out from VCC Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 203281	2014-03-07 20:12:39 +00:00
Tom Stellard	230af572ff	R600/SI: Custom lower i1 stores These are sometimes created by the shrink to boolean optimization in the globalopt pass. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 203280	2014-03-07 20:12:33 +00:00
Matt Arsenault	8140d7d370	R600: Fix extloads from i8 / i16 to i64. This appears to only be working for global loads. Private and local break for other reasons. llvm-svn: 203135	2014-03-06 17:34:12 +00:00
Matt Arsenault	f68a94e609	R600/SI: Expand selects on vectors. llvm-svn: 203134	2014-03-06 17:34:03 +00:00
Matt Arsenault	394a9d104d	R600: Add failing control flow tests. Simple cases hit a variety of problems at -O0. llvm-svn: 202601	2014-03-01 21:45:41 +00:00
Tom Stellard	6280afdecd	R600/SI: Expand all v16[if]32 operations llvm-svn: 202543	2014-02-28 21:36:37 +00:00
Michel Danzer	8edacce1de	R600/SI: Optimize SI_KILL for constant operands If the SI_KILL operand is constant, we can either clear the exec mask if the operand is negative, or do nothing otherwise. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 202337	2014-02-27 01:47:09 +00:00
Michel Danzer	0ddce64f7c	R600/SI: Allow SI_KILL for geometry shaders Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 202336	2014-02-27 01:47:02 +00:00
Tom Stellard	3dafad8efc	R600/SI: Custom select 64-bit ADD llvm-svn: 202194	2014-02-25 21:36:18 +00:00
Matt Arsenault	a3de4dc001	R600/SI - Add new CI arithmetic instructions. Does not yet include larger part required to match v_mad_i64_i32 / v_mad_u64_u32. llvm-svn: 202077	2014-02-24 21:01:28 +00:00
Quentin Colombet	5c6ea83f97	[CodeGenPrepare] Fix the check of the legality of an instruction. The API expects an ISD opcode, not an IR opcode. Fixes a regression for R600. Related to <rdar://problem/15519855>. llvm-svn: 201923	2014-02-22 01:06:41 +00:00
Nico Rieck	f3b62a4af6	Fix more broken CHECK lines llvm-svn: 201493	2014-02-16 13:28:39 +00:00
Quentin Colombet	5700bbac29	[CodeGenPrepare][AddressingModeMatcher] Give up on type promotion if the transformation does not bring any immediate benefits and introduce an illegal operation. llvm-svn: 201439	2014-02-14 22:23:22 +00:00
Tom Stellard	a3a801780f	TargetLowering: n * r where n > 2 should be an illegal addressing mode llvm-svn: 201433	2014-02-14 21:10:34 +00:00
Tom Stellard	988925aeae	R600/SI: Expand all v8[if]32 operations llvm-svn: 201371	2014-02-13 23:34:15 +00:00
Tom Stellard	309a624102	R600/SI: Add a pattern for i32 anyext Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 201370	2014-02-13 23:34:13 +00:00
Tom Stellard	4b0c3551df	R600/SI: Completely Disable TypeRewriter on compute llvm-svn: 201369	2014-02-13 23:34:12 +00:00
Tom Stellard	4447febe55	R600/SI: Split global vector loads with more than 4 elements llvm-svn: 201368	2014-02-13 23:34:10 +00:00
Tom Stellard	de306d4a6d	R600/SI: Add ShaderType attribute to some tests llvm-svn: 201367	2014-02-13 23:34:07 +00:00
Matt Arsenault	cc13cc04ab	R600/SI: Fix assertion on infinite loops. This isn't the most useful case to fix in the real world, but bugpoint runs into this. llvm-svn: 201177	2014-02-11 21:12:38 +00:00
Tom Stellard	2712f90019	R600/SI: Initialize M0 and emit S_WQM_B64 whenever DS instructions are used DS instructions that access local memory can only use addresses that are less than or equal to the value of M0. When M0 is uninitialized, then we experience undefined behavior. This patch also changes the behavior to emit S_WQM_B64 on pixel shaders no matter what kind of DS instruction is used. llvm-svn: 201097	2014-02-10 16:58:30 +00:00
Matt Arsenault	accd6717ba	R600/SI: Add failing test for 3 x i64 vectors. Stores of <4 x i64> do work (although they do expand to 4 stores instead of 2), but 3 x i64 vectors fail to select. llvm-svn: 200989	2014-02-07 20:29:40 +00:00
Tom Stellard	1906c48d55	R600/SI: Add a MUBUF store pattern for Reg+Imm offsets llvm-svn: 200935	2014-02-06 18:36:41 +00:00
Tom Stellard	c690406420	R600/SI: Add a MUBUF store pattern for Imm offsets llvm-svn: 200934	2014-02-06 18:36:39 +00:00
Tom Stellard	2e3a1cc4d8	R600/SI: Add a MUBUF load pattern for Reg+Imm offsets llvm-svn: 200933	2014-02-06 18:36:38 +00:00
Tom Stellard	879ab71511	R600/SI: Use immediates offsets for SMRD instructions whenever possible There was a problem with the old pattern, so we were copying some larger immediates into registers when we could have been encoding them in the instruction. llvm-svn: 200932	2014-02-06 18:36:34 +00:00
Michel Danzer	ed3052cc54	R600/SI: Add pattern for zero-extending i1 to i32 Fixes opencl-example if_* tests with radeonsi. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74469 Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 200830	2014-02-05 09:48:05 +00:00
Tom Stellard	279daf2506	R600/SI: Custom lower i64 ISD::SELECT llvm-svn: 200774	2014-02-04 17:18:40 +00:00
Tom Stellard	f4a180e50b	R600: Enable vector fpow. The OpenCL specs say: "The vector versions of the math functions operate component-wise. The description is per-component." Patch by: Jan Vesely Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 200773	2014-02-04 17:18:37 +00:00
Michel Danzer	292dfb1151	R600/SI: Fix fneg for 0.0 V_ADD_F32 with source modifier does not produce -0.0 for this. Just manipulate the sign bit directly instead. Also add a pattern for (fneg (fabs ...)). Fixes a bunch of bit encoding piglit tests with radeonsi. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 200743	2014-02-04 07:12:38 +00:00
Matt Arsenault	f89553a645	Add some xfailed R600 tests for 64-bit private accesses. llvm-svn: 200620	2014-02-02 00:13:12 +00:00
Matt Arsenault	4198d962c0	R600/SI: Fix insertelement with dynamic indices. This didn't work for any integer vectors, and didn't work with some sizes of float vectors. This should now work with all sizes of float and i32 vectors. llvm-svn: 200619	2014-02-02 00:05:35 +00:00
Michel Danzer	71542b5f92	R600/SI: Add pattern for truncating i32 to i1 Fixes half a dozen piglit tests with radeonsi. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 200283	2014-01-28 03:01:16 +00:00
Michel Danzer	65a5397c22	R600/SI: Add intrinsic for BUFFER_LOAD_DWORD* instructions Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 200196	2014-01-27 07:20:51 +00:00
Michel Danzer	36dd8ac577	R600/SI: Add intrinsic for S_SENDMSG instruction Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 200195	2014-01-27 07:20:44 +00:00
Tom Stellard	e8c59f575b	R600: Disable the BFE pattern This pattern uses an SDNodeXForm, which isn't being emitted for some reason. I can get it to work by attaching the PatLeaf that has the XForm to the argument in the output pattern, but this results in an immediate being used in a register operand, which the backend can't handle yet. llvm-svn: 199918	2014-01-23 18:49:33 +00:00
Tom Stellard	25fa3e2b1d	R600: Correctly handle vertex fetch clauses that precede ENDIFs The control flow finalizer would sometimes use an ALU_POP_AFTER instruction before the vertex fetch clause instead of using a POP instruction after it. llvm-svn: 199917	2014-01-23 18:49:31 +00:00
Tom Stellard	ab9b18423b	R600: Unconditionally unroll loops that contain GEPs with alloca pointers Implement the getUnrollingPreferences() function for AMDGPUTargetTransformInfo so that loops that do address calculations on pointers derived from alloca are unconditionally unrolled. Unrolling these loops makes it more likely that SROA will be able to eliminate the allocas, which is a big win for R600 since memory allocated by alloca (private memory) is really slow. llvm-svn: 199916	2014-01-23 18:49:28 +00:00
Tom Stellard	6f13c22a7a	R600: Recommit 199842: Add work-around for the CF stack entry HW bug The unit test is now disabled on non-asserts builds. The CF stack can be corrupted if you use CF_ALU_PUSH_BEFORE, CF_ALU_ELSE_AFTER, CF_ALU_BREAK, or CF_ALU_CONTINUE when the number of sub-entries on the stack is greater than or equal to the stack entry size and sub-entries modulo 4 is either 0 or 3 (on cedar the bug is present when number of sub-entries modulo 8 is either 7 or 0) We choose to be conservative and always apply the work-around when the number of sub-entries is greater than or equal to the stack entry size, so that we can safely over-allocate the stack when we are unsure of the stack allocation rules. reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 199905	2014-01-23 16:18:02 +00:00
Tom Stellard	d5181ee67d	Revert "R600: Add work-around for the CF stack entry HW bug" This reverts commit 35b8331cad6eb512a2506adbc394201181da94ba. The -debug-only flag for llc doesn't appear to be available in all build configurations. llvm-svn: 199845	2014-01-22 22:20:54 +00:00
Tom Stellard	cd874ab98c	R600: Add work-around for the CF stack entry HW bug The CF stack can be corrupted if you use CF_ALU_PUSH_BEFORE, CF_ALU_ELSE_AFTER, CF_ALU_BREAK, or CF_ALU_CONTINUE when the number of sub-entries on the stack is greater than or equal to the stack entry size and sub-entries modulo 4 is either 0 or 3 (on cedar the bug is present when number of sub-entries modulo 8 is either 7 or 0) We choose to be conservative and always apply the work-around when the number of sub-entries is greater than or equal to the stack entry size, so that we can safely over-allocate the stack when we are unsure of the stack allocation rules. reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 199842	2014-01-22 21:55:46 +00:00
Tom Stellard	ae477cc774	R600: Refactor stack size calculation reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 199840	2014-01-22 21:55:43 +00:00
Tom Stellard	19af07fe92	R600: MOVA is vector only llvm-svn: 199827	2014-01-22 19:24:24 +00:00
Tom Stellard	0971c460b5	R600: Take alignment into account when calculating the stack offset llvm-svn: 199826	2014-01-22 19:24:23 +00:00
Tom Stellard	d424fe57e4	R600: Add support for global addresses with constant initializers llvm-svn: 199825	2014-01-22 19:24:21 +00:00
Tom Stellard	452996a15e	R600: Begin private memory at the second GPR. This way private memory does not over-write work group information stored in GPRs 0 and 1. llvm-svn: 199824	2014-01-22 19:24:19 +00:00
Tom Stellard	369c33de20	R600/SI: Add support for i8 and i16 private loads/stores llvm-svn: 199823	2014-01-22 19:24:14 +00:00
Benjamin Kramer	002aed9cb3	Fix broken CHECK lines. llvm-svn: 199016	2014-01-11 21:06:00 +00:00
Tom Stellard	b39ac07c09	R600: Allow ftrunc v2: Add ftrunc->TRUNC pattern instead of replacing int_AMDGPU_trunc v3: move ftrunc pattern next to TRUNC definition, it's available since R600 Patch By: Jan Vesely Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 197783	2013-12-20 05:11:55 +00:00
NAKAMURA Takumi	e67a3fe0ef	Add REQUIRES:asserts to 3 tests in llvm/test/CodeGen/R600 added in r192212. They are failing in assertions. llvm-svn: 197669	2013-12-19 10:41:12 +00:00
Matt Arsenault	e64331a159	R600/SI: Make private pointers be 32-bit. Different sized address spaces should theoretically work most of the time now, and since 64-bit add is currently disabled, using more 32-bit pointers fixes some cases. llvm-svn: 197659	2013-12-19 05:32:55 +00:00
Matt Arsenault	329326f031	R600/SI: Minor improvements to test. Use CHECK-LABEL, add an i64 version, check store instructions. llvm-svn: 197293	2013-12-14 00:38:04 +00:00
Matt Arsenault	0a974aca94	R600/SI: Add i64 cmp tests llvm-svn: 196960	2013-12-10 21:11:55 +00:00
Vincent Lejeune	73c6e97c15	R600: Fix an infinite loop when trying to reorganize export/tex vector input llvm-svn: 196923	2013-12-10 14:43:31 +00:00
Vincent Lejeune	cd5d9a9849	R600: Fix input modifiers lost for Cayman llvm-svn: 196922	2013-12-10 14:43:27 +00:00
Vincent Lejeune	8f22bc4540	Add a RequireStructuredCFG Field to TargetMachine. llvm-svn: 196634	2013-12-07 01:49:19 +00:00
Matt Arsenault	6f14dd54b4	R600/SI: Add comments for number of used registers. llvm-svn: 196467	2013-12-05 05:15:35 +00:00
Vincent Lejeune	26780e84f1	R600: Workaround for cayman loop bug llvm-svn: 196121	2013-12-02 17:29:37 +00:00
Tom Stellard	95624c101d	R600: Expand vector FABS NOTE: This is a candidate for the 3.4 branch. llvm-svn: 195881	2013-11-27 21:23:39 +00:00

439 Commits