llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-23 13:02:52 +02:00

Author	SHA1	Message	Date
Matt Arsenault	8df97c1243	AMDGPU: Support v2i16/v2f16 packed operations llvm-svn: 296396	2017-02-27 22:15:25 +00:00
Matt Arsenault	c230fcbb58	AMDGPU: Generalize matching of v_med3_f32 I think this is safe as long as no inputs are known to ever be nans. Also add an intrinsic for fmed3 to be able to handle all safe math cases. llvm-svn: 293598	2017-01-31 03:07:46 +00:00
Matt Arsenault	58721b2662	AMDGPU: Make i32 uaddo/usubo legal llvm-svn: 293514	2017-01-30 18:11:38 +00:00
Tom Stellard	4f3e674183	AMDGPU/SI: Move some ISel helpers into utils so they can be shared with GISel Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D29068 llvm-svn: 293321	2017-01-27 18:41:14 +00:00
Matt Arsenault	a73166cb45	AMDGPU: Remove modifiers from v_div_scale_* They seem to produce nonsense results when used. This should be applied to the release branch. llvm-svn: 292472	2017-01-19 06:04:12 +00:00
Jan Vesely	523782f6c1	AMDGPU/R600: Don't use REGISTER_{LOAD,STORE} ISD nodes This will make transition to SCRATCH_MEMORY easier Differential Revision: https://reviews.llvm.org/D24746 llvm-svn: 291279	2017-01-06 21:00:46 +00:00
Matt Arsenault	b1034a224d	AMDGPU: Select branch on undef to uniform scc branch llvm-svn: 289877	2016-12-15 21:57:11 +00:00
Eugene Zelenko	796f37f3bb	[AMDGPU, PowerPC, TableGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 289282	2016-12-09 22:06:55 +00:00
Tom Stellard	42afa7429f	AMDGPU : Add S_SETREG instructions to fix fdiv precision issues. Patch By: Wei Ding Summary: This patch fixes the fdiv precision issues. Reviewers: b-sumner, cfang, wdng, arsenm Subscribers: kzhuravl, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D26424 llvm-svn: 288879	2016-12-07 02:42:15 +00:00
Marek Olsak	30b976334f	AMDGPU/SI: Add back reverted SGPR spilling code, but disable it suggested as a better solution by Matt llvm-svn: 287942	2016-11-25 17:37:09 +00:00
Marek Olsak	c530d56272	Revert "AMDGPU: Make m0 unallocatable" This reverts commit 124ad83dae04514f943902446520c859adee0e96. llvm-svn: 287932	2016-11-25 16:03:15 +00:00
Matt Arsenault	9a257a9a17	AMDGPU: Make m0 unallocatable m0 may need to be written for spill code, so we don't want general code uses relying on the value stored in it. This introduces a few code quality regressions where copies from m0 are not coalesced into copies of a copy of m0. llvm-svn: 287841	2016-11-24 00:26:40 +00:00
Matt Arsenault	74174e09f8	AMDGPU: Remove unnecessary and on conditional branch The comment explaining why this was necessary is incorrect in its description of v_cmp's behavior for inactive workitems. llvm-svn: 286134	2016-11-07 19:09:33 +00:00
Matt Arsenault	3fd81027a1	AMDGPU: Handle CopyToReg in getOperandRegClass llvm-svn: 285768	2016-11-01 23:22:17 +00:00
Nicolai Haehnle	5c5844a79f	AMDGPU: Select 64-bit {ADD,SUB}{C,E} nodes Summary: This will be used for 64-bit MULHU, which is in turn used for the 64-bit divide-by-constant optimization (see D24822). Reviewers: arsenm, tstellarAMD Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D25289 llvm-svn: 284224	2016-10-14 10:30:00 +00:00
Konstantin Zhuravlyov	3cd0ba7fe2	[AMDGPU] Pass optimization level to SelectionDAGISel llvm-svn: 283133	2016-10-03 18:47:26 +00:00
Mehdi Amini	1fef2dd6b7	Use StringRef in Pass/PassManager APIs (NFC) llvm-svn: 283004	2016-10-01 02:56:57 +00:00
Matt Arsenault	17a8bb755d	AMDGPU: Fix broken FrameIndex handling We were trying to avoid using a FrameIndex operand in non-pointer operands in a convoluted way, and would break because of using TargetFrameIndex. The TargetFrameIndex should only be used in the case where it makes sense to fold it as part of the addressing mode, otherwise it requires materialization like a normal constant. This wasn't working reliably and failed in the added testcase, hitting the assert when processing the frame index. The TargetFrameIndex was coming from trying to produce an AssertZext limiting the maximum stack size. I'm not sure this was correct to begin with, because it is apparently possible to have a single workitem dispatch that requires all 4G of private memory. llvm-svn: 281824	2016-09-17 16:09:55 +00:00
Matt Arsenault	4beb31bd8d	AMDGPU: Use i64 scalar compare instructions VI added eq/ne for i64, so use them. llvm-svn: 281800	2016-09-17 02:02:19 +00:00
Matt Arsenault	a2689a506d	AMDGPU: Run LoadStoreVectorizer pass by default llvm-svn: 281112	2016-09-09 22:29:28 +00:00
Matthias Braun	91722d430e	MachineFunction: Return reference for getFrameInfo(); NFC getFrameInfo() never returns nullptr so we should use a reference instead of a pointer. llvm-svn: 277017	2016-07-28 18:40:00 +00:00
Matt Arsenault	6e95fdbb3e	AMDGPU: Remove analyzeImmediate This no longer uses the more complicated classification of constants. llvm-svn: 276945	2016-07-28 00:32:02 +00:00
Nicolai Haehnle	67daedb5d4	AMDGPU: Unify MOVRELSOffset and MOVRELDOffset Summary: Previously, constant index insertelements would be turned into SI_INDIRECT_DST, which is bound to prevent some optimization opportunities. Worse, it misled the heuristic that decides whether immediates should be lowered to S_MOV_B32 or V_MOV_B32 in a way that resulted in unnecessary v_readfirstlanes. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: http://reviews.llvm.org/D22217 llvm-svn: 275160	2016-07-12 08:12:16 +00:00
Matt Arsenault	32ea105667	AMDGPU: Improve offset folding for register indexing llvm-svn: 274954	2016-07-09 01:13:56 +00:00
Tom Stellard	222891328a	AMDGPU/SI: Remove address space query functions from AMDGPUDAGToDAGISel Summary: These have been replaced with TableGen code (except for isConstantLoad, which is still used for R600). The queries were broken for cases where MemOperand was a PseudoSourceValue. Reviewers: arsenm Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: http://reviews.llvm.org/D21684 llvm-svn: 274561	2016-07-05 16:10:44 +00:00
Tom Stellard	63fc45cc8d	AMDGPU/R600: Add PatFrags for selecting the correct vtx id for loads This moves the r600 logic out of isGlobalLoad() and into the TableGen files. Differential Revision: http://reviews.llvm.org/D21710 llvm-svn: 274527	2016-07-05 00:12:51 +00:00
Tom Stellard	99144bdc58	AMDGPU/SI: Remove hack for selecting < 32-bit loads to MUBUF instructions Summary: The isGlobalLoad() query was returning true for constant address space loads with memory types less than 32-bits, which is wrong. This logic has been replaced with PatFrag in the TableGen files, to provide the same functionality. Reviewers: arsenm Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: http://reviews.llvm.org/D21696 llvm-svn: 274521	2016-07-04 20:41:48 +00:00
Matt Arsenault	8603948f83	AMDGPU: Cleanup subtarget handling. Split AMDGPUSubtarget into amdgcn/r600 specific subclasses. This removes most of the static_casting of the basic codegen classes everywhere, and tries to restrict the features visible on the wrong target. llvm-svn: 273652	2016-06-24 06:30:11 +00:00
Matt Arsenault	856a55687d	AMDGPU: Fix gcc warnings Mostly removing dead code. Apparently gcc's warning for unused functions is better llvm-svn: 273363	2016-06-22 01:53:49 +00:00
Rafael Espindola	b8700788a8	Delete more dead code. Found by gcc 6. llvm-svn: 273322	2016-06-21 21:51:41 +00:00
Rafael Espindola	cd2c189f82	Delete some dead code. Found by gcc 6. llvm-svn: 273303	2016-06-21 19:48:12 +00:00
NAKAMURA Takumi	24b157d37a	Reformat blank lines. llvm-svn: 273131	2016-06-20 01:05:15 +00:00
NAKAMURA Takumi	01c29da292	Untabify. llvm-svn: 273129	2016-06-20 00:37:41 +00:00
Nicolai Haehnle	1772a3e0bc	AMDGPU: Fix MUBUF offset bugs affecting llvm.amdgcn.buffer.* intrinsics Summary: This fixes two related bugs. First, the generic optimization passes unfortunately generate negative constant offsets but the hardware treats SOffset as an unsigned value. Second, there is a hardware bug on SI and CI, where address clamping in MUBUF instructions does not work correctly when SOffset is larger than the buffer size. This patch works around this bug by never using SOffset. An alternative workaround would be to do the clamping manually when SOffset is too large, but generating the required code sequence during instruction selection would be rather involved, and in any case the resulting code would probably be worse. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96360 Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D21326 llvm-svn: 272761	2016-06-15 07:13:05 +00:00
Benjamin Kramer	e80783f62f	Pass DebugLoc and SDLoc by const ref. This used to be free, copying and moving DebugLocs became expensive after the metadata rewrite. Passing by reference eliminates a ton of track/untrack operations. No functionality change intended. llvm-svn: 272512	2016-06-12 15:39:02 +00:00
Tom Stellard	7240ff2203	AMDGPU/SI: Make sure to emit TargetConstant nodes when matching ds_permute Summary: This fixes a bug with ds_permute instructions where if it was passed a constant address, then the offset operand would get assigned a register operand instead of an immediate. Reviewers: scchan, arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19994 llvm-svn: 272349	2016-06-10 00:01:04 +00:00
Matt Arsenault	b7b3917848	AMDGPU: Fix flat atomics The flat atomics could already be selected, but only when using flat instructions for global memory. Add patterns for flat addresses. llvm-svn: 272345	2016-06-09 23:42:54 +00:00
Matt Arsenault	bcec847408	AMDGPU: Fix i64 global cmpxchg This was using extract_subreg sub0 to extract the low register of the result instead of sub0_sub1, producing an invalid copy. There doesn't seem to be a way to use the compound subreg indices in tablegen since those are generated, so manually select it. llvm-svn: 272344	2016-06-09 23:42:48 +00:00
Jan Vesely	8960f3da2c	AMDGPU/R600: Implement memory loads from constant AS Reviewers: tstellard Subscribers: arsenm Differential Revision: http://reviews.llvm.org/D19792 llvm-svn: 269479	2016-05-13 20:39:29 +00:00
Justin Bogner	e0b750ea0f	SDAG: Implement Select instead of SelectImpl in AMDGPUDAGToDAGISel - Where we were returning a node before, call ReplaceNode instead. - Where we would return null to fall back to another selector, rename the method to try* and return a bool for success. - Where we were calling SelectNodeTo, just return afterwards. Part of llvm.org/pr26808. llvm-svn: 269349	2016-05-12 21:03:32 +00:00
Simon Pilgrim	0f2ef1de0a	Fixed unused but set variable warning llvm-svn: 268931	2016-05-09 16:42:23 +00:00
Justin Bogner	9448d374ee	SDAG: Rename Select->SelectImpl and repurpose Select as returning void This is a step towards removing the rampant undefined behaviour in SelectionDAG, which is a part of llvm.org/PR26808. We rename SelectionDAGISel::Select to SelectImpl and update targets to match, and then change Select to return void and consolidate the sketchy behaviour we're trying to get away from there. Next, we'll update backends to implement `void Select(...)` instead of SelectImpl and eventually drop the base Select implementation. llvm-svn: 268693	2016-05-05 23:19:08 +00:00
Matt Arsenault	7932e530a0	AMDGPU: Make i64 loads/stores promote to v2i32 Now that unaligned access expansion should not attempt to produce i64 accesses, we can remove the hack in PreprocessISelDAG where this is done. This allows splitting i64 private accesses while allowing the new add nodes indexing the vector components to be folded with the base pointer arithmetic. llvm-svn: 268293	2016-05-02 20:07:26 +00:00
Tom Stellard	33134ca52e	AMDGPU/SI: Add offset field to ds_permute/ds_bpermute instructions Summary: These instructions can add an immediate offset to the address, like other ds instructions. Reviewers: arsenm Subscribers: arsenm, scchan Differential Revision: http://reviews.llvm.org/D19233 llvm-svn: 268043	2016-04-29 14:34:26 +00:00
Matt Arsenault	b60850cb10	AMDGPU: Implement addrspacecast llvm-svn: 267452	2016-04-25 19:27:24 +00:00
Matt Arsenault	3cbb6d7d74	AMDGPU: sext_inreg (srl x, K), vt -> bfe x, K, vt.Size llvm-svn: 267244	2016-04-22 22:59:16 +00:00
Nicolai Haehnle	25eef7cc0f	[StructurizeCFG] Annotate branches that were treated as uniform Summary: This fully solves the problem where the StructurizeCFG pass does not consider the same branches as uniform as the SIAnnotateControlFlow pass. The patch in D19013 helps with this problem, but is not sufficient (and, interestingly, causes a "regression" with one of the existing test cases). No tests included here, because tests in D19013 already cover this. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19018 llvm-svn: 266346	2016-04-14 17:42:35 +00:00
Matt Arsenault	f100af354c	AMDGPU: Add atomic_inc + atomic_dec intrinsics These are different than atomicrmw add 1 because they have an additional input value to clamp the result. llvm-svn: 266074	2016-04-12 14:05:04 +00:00
Jan Vesely	e71f70898a	AMDGPU/SI: Implement atomic load/store for i32 and i64 Standard load/store instructions with GLC bit set. Reviewers: tstellardAMD, arsenm Differential Revision: http://reviews.llvm.org/D18760 llvm-svn: 265709	2016-04-07 19:23:11 +00:00
Vasileios Kalintiris	f64e5bc3d4	Fix sequence point warning. NFC. llvm-svn: 264255	2016-03-24 10:53:28 +00:00
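
The combine named in commit 3cbb6d7d74, `sext_inreg (srl x, K), vt -> bfe x, K, vt.Size`, can be checked with a small bit-level model. The sketch below is illustrative only and is not LLVM code: the function names `srl`, `sext_inreg`, and `bfe_i32` are hypothetical Python stand-ins for the 32-bit operations, and it assumes `K + width <= 32`.

```python
MASK32 = 0xFFFFFFFF  # all values are modelled as unsigned 32-bit integers


def srl(value, shift):
    """Logical shift right on a 32-bit value."""
    return (value & MASK32) >> shift


def sext_inreg(value, width):
    """Sign-extend the low `width` bits of `value` to 32 bits."""
    low = value & ((1 << width) - 1)
    if low & (1 << (width - 1)):  # sign bit of the narrow field is set
        low -= 1 << width
    return low & MASK32


def bfe_i32(value, offset, width):
    """Signed bitfield extract: `width` bits starting at bit `offset`.

    Assumes offset + width <= 32 (an assumption of this sketch).
    """
    # Move the field to the top of the register, then shift back arithmetically.
    top = (value << (32 - offset - width)) & MASK32
    if top & 0x80000000:
        top -= 1 << 32  # reinterpret as a negative Python int
    return (top >> (32 - width)) & MASK32


def combine_holds(x, k, width):
    """True when both sides of the rewrite agree for these inputs."""
    return sext_inreg(srl(x, k), width) == bfe_i32(x, k, width)
```

For example, `combine_holds(0x0000F500, 8, 8)` compares the two forms on a field whose top bit is set, so both sides must sign-extend.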

88 Commits