llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-26 14:33:02 +02:00

Author	SHA1	Message	Date
Matt Arsenault	f3c657912f	AMDGPU: Add intrinsic for s_flbit_i32/v_ffbh_i32 llvm-svn: 275871	2016-07-18 18:35:05 +00:00
Matt Arsenault	a21ee6967c	AMDGPU: Remove dead code llvm-svn: 275369	2016-07-14 05:23:08 +00:00
Matt Arsenault	fcc3666c3b	AMDGPU: Expand unaligned accesses early Due to visit order problems, in the case of an unaligned copy the legalized DAG fails to eliminate extra instructions introduced by the expansion of both unaligned parts. llvm-svn: 274397	2016-07-01 22:55:55 +00:00
Matt Arsenault	14dbe93675	AMDGPU: Improve load/store of illegal types. There was a combine before to handle the simple copy case. Split this into handling loads and stores separately. We might want to change how this handles some of the vector extloads, since this can result in large code size increases. llvm-svn: 274394	2016-07-01 22:47:50 +00:00
Matt Arsenault	8603948f83	AMDGPU: Cleanup subtarget handling. Split AMDGPUSubtarget into amdgcn/r600 specific subclasses. This removes most of the static_casting of the basic codegen classes everywhere, and tries to restrict the features visible on the wrong target. llvm-svn: 273652	2016-06-24 06:30:11 +00:00
Matt Arsenault	66eec46266	AMDGPU: Fix verifier errors in SILowerControlFlow The main sin this was committing was using terminator instructions in the middle of the block, and then not updating the block successors / predecessors. Split the blocks up to avoid this and introduce new pseudo instructions for branches taken with exec masking. Also use a pseudo instead of emitting s_endpgm and erasing it in the special case of a non-void return. llvm-svn: 273467	2016-06-22 20:15:28 +00:00
Jan Vesely	c245f58037	AMDGPU: Add implicitarg.ptr intrinsic. Points to the start of implicit arguments (appended after explicit arguments) Differential Revision: http://reviews.llvm.org/D20297 llvm-svn: 273317	2016-06-21 20:46:20 +00:00
Tom Stellard	98822b77c0	AMDGPU/SI: Refactor fixup handling for constant addrspace variables Summary: We now use a standard fixup type applying the pc-relative address of constant address space variables, and we have the GlobalAddress lowering code add the required 4 byte offset to the global address rather than doing it as part of the fixup. This refactoring will make it easier to use the same code for global address space variables and also simplifies the code. Re-commit this after fixing a bug where we were trying to use a reference to a Triple object that had already been destroyed. Reviewers: arsenm, kzhuravl Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: http://reviews.llvm.org/D21154 llvm-svn: 272705	2016-06-14 20:29:59 +00:00
Tom Stellard	de6d6f0171	Revert "AMDGPU/SI: Refactor fixup handling for constant addrspace variables" This reverts commit r272675. llvm-svn: 272677	2016-06-14 15:16:35 +00:00
Tom Stellard	b56b2a98e2	AMDGPU/SI: Refactor fixup handling for constant addrspace variables Summary: We now use a standard fixup type applying the pc-relative address of constant address space variables, and we have the GlobalAddress lowering code add the required 4 byte offset to the global address rather than doing it as part of the fixup. This refactoring will make it easier to use the same code for global address space variables and also simplifies the code. Reviewers: arsenm, kzhuravl Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: http://reviews.llvm.org/D21154 llvm-svn: 272675	2016-06-14 15:11:01 +00:00
Benjamin Kramer	e80783f62f	Pass DebugLoc and SDLoc by const ref. This used to be free, copying and moving DebugLocs became expensive after the metadata rewrite. Passing by reference eliminates a ton of track/untrack operations. No functionality change intended. llvm-svn: 272512	2016-06-12 15:39:02 +00:00
Matt Arsenault	6e839f3775	AMDGPU: Remove custom load/store scalarization llvm-svn: 266385	2016-04-14 23:31:26 +00:00
Matt Arsenault	f100af354c	AMDGPU: Add atomic_inc + atomic_dec intrinsics These are different than atomicrmw add 1 because they have an additional input value to clamp the result. llvm-svn: 266074	2016-04-12 14:05:04 +00:00
Tom Stellard	07e93d8663	AMDGPU: Implement {BUFFER,FLAT}_ATOMIC_CMPSWAP{,_X2} Summary: Implement BUFFER_ATOMIC_CMPSWAP{,_X2} instructions on all GCN targets, and FLAT_ATOMIC_CMPSWAP{,_X2} on CI+. 32-bit instruction variants tested manually on Kabini and Bonaire. Tests and parts of code provided by Jan Veselý. Patch by: Vedran Miletić Reviewers: arsenm, tstellarAMD, nhaehnle Subscribers: jvesely, scchan, kanarayan, arsenm Differential Revision: http://reviews.llvm.org/D17280 llvm-svn: 265170	2016-04-01 18:27:37 +00:00
Matt Arsenault	0c86db8f5c	AMDGPU: R600 code splitting cleanup Move a few functions only used by R600 to R600 specific code, fix header macros to stop using R600, mark classes as final. llvm-svn: 263204	2016-03-11 08:00:27 +00:00
Matt Arsenault	e24a001b35	AMDGPU: Move function only used by R600 llvm-svn: 262853	2016-03-07 21:10:13 +00:00
Matt Arsenault	4ff4c396c1	AMDGPU: Rename intrinsic to better match instruction name Also fixes missing f32 test. llvm-svn: 260780	2016-02-13 01:03:00 +00:00
Matt Arsenault	cdd789021d	AMDGPU: Split R600 and SI store lowering These were only sharing some somewhat incorrect logic for when to scalarize or split vectors. llvm-svn: 260490	2016-02-11 05:32:46 +00:00
Matt Arsenault	2c8154f002	AMDGPU: Split R600 and SI load lowering These weren't actually sharing anything in the common LowerLOAD. llvm-svn: 260398	2016-02-10 18:21:39 +00:00
Matt Arsenault	28a667546a	AMDGPU: Match some med3 patterns llvm-svn: 259089	2016-01-28 20:53:42 +00:00
Matt Arsenault	fe8ee22547	AMDGPU: Remove more unused intrinsics Replace tests with lrp with basic IR expansion llvm-svn: 258612	2016-01-23 05:42:38 +00:00
Matt Arsenault	c463ea6063	AMDGPU: Remove abs intrinsic llvm-svn: 258343	2016-01-20 20:58:29 +00:00
Matt Arsenault	348623d27f	AMDGPU: Reduce 64-bit SRAs llvm-svn: 258096	2016-01-18 22:09:04 +00:00
Matt Arsenault	862bf93c73	AMDGPU: Split 64-bit and of constant up This breaks the tests that were meant for testing 64-bit inline immediates, so move those to shl where they won't be broken up. This should be repeated for the other related bit ops. llvm-svn: 258095	2016-01-18 22:01:13 +00:00
Matt Arsenault	e1a6e6ae7f	AMDGPU: Reduce 64-bit lshr by constant to 32-bit 64-bit shifts are very slow on some subtargets. llvm-svn: 258090	2016-01-18 21:43:36 +00:00
Marek Olsak	2b92f6022f	AMDGPU/SI: Add support for non-void functions Summary: Return values can be stored in SGPRs (i32) and VGPRs (f32). This will be used by functions which expect some bytecode or other binary to be appended at the end. It allows defining in which registers the return values will be stored. v2: don't do this for compute shaders Reviewers: tstellarAMD, arsenm Subscribers: arsenm Differential Revision: http://reviews.llvm.org/D16033 llvm-svn: 257621	2016-01-13 17:23:04 +00:00
Matt Arsenault	e2033b4ea1	AMDGPU: Implement {{s\|u}}int_to_fp i64 -> f32 The old lowering for uint_to_fp failed opencl conformance. It might be OK for fast math mode, but I'm not sure. llvm-svn: 257393	2016-01-11 22:01:48 +00:00
Matt Arsenault	b88ff2e112	AMDGPU: Pattern match ffbh pattern to instruction. The hardware instruction's output on 0 is -1 rather than 32. Eliminate a test and select to -1. This removes an extra instruction from the compatability function with HSAIL's firstbit instruction. llvm-svn: 257352	2016-01-11 17:02:00 +00:00
Matt Arsenault	9caa943926	AMDGPU: Custom lower i64 ctlz llvm-svn: 257348	2016-01-11 16:50:29 +00:00
Matt Arsenault	3ca8d64e75	AMDGPU: Use generic bitreverse intrinsic Also fix bug in vector legalization for bitreverse. llvm-svn: 255512	2015-12-14 17:25:38 +00:00
Matt Arsenault	85dd075020	DAGCombiner: Combine extract_vector_elt from build_vector This basic combine was surprisingly missing. AMDGPU legalizes many operations in terms of 32-bit vector components, so not doing this results in many extra copies and subregister extracts that need to be cleaned up later. InstCombine already does this for the hasOneUse case. The target hook is to fix a handful of tests which break (e.g. ARM/vmov.ll) which turn from a vector materialize repeated immediate instruction to a constant vector load with more scalar copies from it. llvm-svn: 250129	2015-10-12 23:59:50 +00:00
Matt Arsenault	9dd8b4ce6c	AMDGPU: Produce error on dynamic_stackalloc llvm-svn: 246048	2015-08-26 18:37:13 +00:00
Simon Pilgrim	3d6f76a00a	[AMDGPU] Use the general SMAX/SMIN/UMAX/UMIN pattern matching and remove the AMDGPU implementation D9746 added general SMAX/SMIN/UMAX/UMIN pattern matching to SelectionDAGBuilder::visitSelect. Differential Revision: http://reviews.llvm.org/D12007 llvm-svn: 244960	2015-08-13 21:40:02 +00:00
Matt Arsenault	27b16f9fd2	AMDGPU: Fix return type of getImplicitParameterOffset. Patch by Zoltan Gilian <zoltan.gilian@gmail.com> llvm-svn: 243459	2015-07-28 18:09:55 +00:00
Matt Arsenault	efaaa8cb7c	AMDGPU: Avoid using 64-bit shift for i64 (shl x, 32) This can be done only with moves which theoretically will optimize better later. Although this transform increases the instruction count, it should be code size / cycle count neutral in the worst VALU case. It also seems to slightly improve a couple of testcases due to other DAG combines this exposes. This is probably slightly worse for the SALU case, so it might be better to handle this during moveToVALU, although then you lose some simplifications like the load width reducing in the simple testcase. llvm-svn: 242177	2015-07-14 18:20:33 +00:00
Tom Stellard	073eb1265b	AMDGPU: Add helper function for implicit parameter offsets. Patch by: Zoltan Gilian llvm-svn: 241861	2015-07-09 21:20:37 +00:00
Mehdi Amini	abf873c623	Make TargetLowering::getPointerTy() taking DataLayout as an argument Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: jholewinski, ted, yaron.keren, rafael, llvm-commits Differential Revision: http://reviews.llvm.org/D11028 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241775	2015-07-09 02:09:04 +00:00
Tom Stellard	3f1708598e	R600 -> AMDGPU rename llvm-svn: 239657	2015-06-13 03:28:10 +00:00
Tom Stellard	39f7e52397	Revert "AMDGPU: Add core backend files for R600/SI codegen v6" This reverts commit 4ea70107c5e51230e9e60f0bf58a0f74aa4885ea. llvm-svn: 160303	2012-07-16 18:19:53 +00:00
Tom Stellard	9f326179fc	AMDGPU: Add core backend files for R600/SI codegen v6 llvm-svn: 160270	2012-07-16 14:17:08 +00:00

40 Commits