llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-24 21:42:54 +02:00

Author	SHA1	Message	Date
Tom Stellard	c05d813021	R600/SI: Use READ2/WRITE2 instructions for 64-bit mem ops with 32-bit alignment llvm-svn: 216279	2014-08-22 18:49:35 +00:00
Tom Stellard	d7ab32bbb1	R600/SI: Use a ComplexPattern for DS loads and stores llvm-svn: 216278	2014-08-22 18:49:33 +00:00
Tom Stellard	51057b05fa	R600/SI: Use eliminateFrameIndex() to expand SGPR spill pseudos This will simplify the SGPR spilling and also allow us to use MachineFrameInfo for calculating offsets, which should be more reliable than our custom code. This fixes a crash in some cases where a register would be spilled in a branch such that the VGPR defined for spilling did not dominate all the uses when restoring. This fixes a crash in an ocl conformance test. The test requires register spilling and is too big to include. llvm-svn: 216217	2014-08-21 20:40:54 +00:00
Matt Arsenault	babc0fbd04	R600/SI: Move all fabs / fneg handling to patterns llvm-svn: 215749	2014-08-15 18:42:22 +00:00
Matt Arsenault	134b10c2db	R600/SI: Use source modifiers for f64 fneg llvm-svn: 215748	2014-08-15 18:42:18 +00:00
Matt Arsenault	9832bdad0e	R600/SI: Refactor fneg / fabs patterns llvm-svn: 215746	2014-08-15 18:42:11 +00:00
Matt Arsenault	014e16538b	R600/SI: Add intrinsic for ldexp llvm-svn: 215734	2014-08-15 17:30:25 +00:00
Tom Stellard	fcb2bdc3e4	R600/SI: Add an _OFFEN variant MUBUF_STORE_* and use it for scratch writes llvm-svn: 215398	2014-08-11 22:18:14 +00:00
Matt Arsenault	1ecd6214f0	R600/SI: Add definitions for ds_read2st64_ / ds_write2st64_ llvm-svn: 214936	2014-08-05 23:53:20 +00:00
Matt Arsenault	1f6d88bf21	R600/SI: Fix definitions for ds_read2 / ds_write2 instructions. These were just wrong, using the wrong register classes and store2 was missing an operand. llvm-svn: 214756	2014-08-04 18:49:22 +00:00
Tom Stellard	827479f1ca	R600/SI: Do abs/neg folding with ComplexPatterns Abs/neg folding has moved out of foldOperands and into the instruction selection phase using complex patterns. As a consequence of this change, we now prefer to select the 64-bit encoding for most instructions and the modifier operands have been dropped from integer VOP3 instructions. llvm-svn: 214467	2014-08-01 00:32:39 +00:00
Matt Arsenault	a82949eb53	R600/SI: Remove redundant setting of bits on instructions. neverHasSideEffects is deprecated, and hasSideEffects = 0 is already set on the base classes of the basic ALU instruction classes. The base classes also already set mayLoad = 0 and mayStore = 0 llvm-svn: 214283	2014-07-30 03:18:57 +00:00
Tom Stellard	ed0ccca70d	R600/SI: Use scratch memory for large private arrays llvm-svn: 213551	2014-07-21 15:45:01 +00:00
Tom Stellard	7a57564546	R600/SI: Remove vaddr operand from BUFFER_LOAD_*_OFFSET instructions This operand is never used. llvm-svn: 213549	2014-07-21 15:44:55 +00:00
Tom Stellard	5bfbb25d6b	R600/SI: Store constant initializer data in constant memory This implements a solution for constant initializers suggested by Vadim Girlin, where we store the data after the shader code and then use the S_GETPC instruction to compute its address. This saves us the trouble of creating a new buffer for constant data and then having to pass the pointer to the kernel via user SGPRs or the input buffer. llvm-svn: 213530	2014-07-21 14:01:14 +00:00
Tom Stellard	cc6c170604	R600/SI: Add isCFDepth0 Predicate to SALU addc pattern llvm-svn: 213529	2014-07-21 14:01:12 +00:00
Tom Stellard	7f35eb40ab	R600/SI: Use VALU for i1 XOR llvm-svn: 213528	2014-07-21 14:01:10 +00:00
Tom Stellard	7efced1747	R600/SI: Use a custom encoding method for simm16 in SOPP branch instructions This allows us to explicitly define the type of fixup that is needed, so we can distinguish this from future fixup types. llvm-svn: 213527	2014-07-21 14:01:08 +00:00
Tom Stellard	cecb51057f	R600/SI: Rename SOPP operands to match the encoding fields llvm-svn: 213526	2014-07-21 14:01:05 +00:00
Matt Arsenault	840d57e330	R600/SI: implement range reduction for sin/cos These instructions can only take a limited input range, and return the constant value 1 out of range. We should do range reduction to be able to process arbitrary values. Use a FRACT instruction after normalization to achieve this. Also add a test for constant folding with the lowered code with unsafe-fp-math enabled. v2: use DAG lowering instead of intrinsic, adapt test v3: calculate constant, fold pattern into instruction definition v4: misc style fixes, add sin-fold testcase, cosmetics Patch by Grigori Goronzy llvm-svn: 213458	2014-07-19 18:44:39 +00:00
Tim Northover	eae1f1c8cc	CodeGen: extend f16 conversions to permit types > float. This makes the two intrinsics @llvm.convert.from.f16 and @llvm.convert.to.f16 accept types other than simple "float". This is only strictly needed for the truncate operation, since otherwise double rounding occurs and there's no way to represent the strict IEEE conversion. However, for symmetry we allow larger types in the extend too. During legalization, we can expand an "fp16_to_double" operation into two extends for convenience, but abort when the truncate isn't legal. A new libcall is probably needed here. Even after this commit, various target tweaks are needed to actually use the extended intrinsics. I've put these into separate commits for clarity, so there are no actual tests of f64 conversion here. llvm-svn: 213248	2014-07-17 10:51:23 +00:00
Matt Arsenault	15eb0d54b0	R600/SI: Allow using f32 rcp / rsq when denormals not handled. These are precise enough to use for OpenCL unless denormals are handled. llvm-svn: 213107	2014-07-15 23:50:10 +00:00
Matt Arsenault	1ceb5e82c1	R600/SI: Implement less wrong f32 fdiv Assuming single precision denormals and accurate sqrt/div are not reported, this passes the OpenCL conformance test. llvm-svn: 213089	2014-07-15 20:18:31 +00:00
Marek Olsak	7757b563f1	R600/SI: Use i32 vectors for resources and samplers This affects new intrinsics only. What surprises me is that v32i8 still works. llvm-svn: 212831	2014-07-11 17:11:52 +00:00
Marek Olsak	6db1789f95	R600/SI: add sample and image intrinsics exposing all instruction fields We need the intrinsics with offsets, so why not just add them all. The R128 parameter will also be useful for reducing SGPR usage. GL_ARB_image_load_store also adds some image GLSL modifiers like "coherent", so Mesa will probably translate those to slc, glc, etc. When LLVM 3.5 is released, I'll switch Mesa to these new intrinsics. llvm-svn: 212830	2014-07-11 17:11:46 +00:00
Matt Arsenault	834bcebaa6	R600/SI: Add support for llvm.convert.{to\|from}.fp16 llvm-svn: 212676	2014-07-10 03:22:20 +00:00
Tom Stellard	c4ab9c96da	R600/SI: Use a ComplexPattern for ADDR64 addressing of MUBUF loads llvm-svn: 212217	2014-07-02 20:53:56 +00:00
Tom Stellard	209c137768	R600: Promote i64 loads to v2i32 llvm-svn: 212216	2014-07-02 20:53:54 +00:00
Tom Stellard	1f2dabfbae	R600/SI: Add verifier check for immediates in register operands. llvm-svn: 212214	2014-07-02 20:53:44 +00:00
Tom Stellard	86f1137544	R600/SI: Use a ComplexPattern for MUBUF stores Now that non-leaf ComplexPatterns are allowed we can fold all the MUBUF store patterns into the instruction definition. We will also be able to reuse this new ComplexPattern for MUBUF loads and atomic operations. llvm-svn: 211644	2014-06-24 23:33:07 +00:00
Tom Stellard	840992bb71	R600: Promote i64 stores to v2i32 Now we need only one 64-bit pattern for stores. llvm-svn: 211643	2014-06-24 23:33:04 +00:00
Matt Arsenault	37d6d91b5b	R600: Fix inconsistency in rsq instructions. R600 was using a clamped version of rsq, but SI was not. Add a new rsq_clamped intrinsic and use them consistently. It's unclear to me from the documentation what behavior the R600 instructions have, so I assume they have the legacy behavior described by the SI documents. For R600, use RECIPSQRT_IEEE for both llvm.AMDGPU.rsq.legacy and llvm.AMDGPU.rsq. R600 also has RECIPSQRT_FF, which I'm not sure how it fits in here. llvm-svn: 211637	2014-06-24 22:13:39 +00:00
Matt Arsenault	7819e41b84	R600/SI: Move pattern to instruction definition llvm-svn: 211614	2014-06-24 17:17:06 +00:00
Matt Arsenault	0cbe5b2357	R600/SI: Fix div_scale intrinsic. The operand that must match one of the others does matter, and implement selecting for it. llvm-svn: 211523	2014-06-23 18:28:28 +00:00
Tom Stellard	ae7faa387d	R600/SI: Add patterns for ctpop inside a branch llvm-svn: 211378	2014-06-20 17:06:11 +00:00
Tom Stellard	2d56aec1cd	R600/SI: Add a pattern for f32 ftrunc llvm-svn: 211377	2014-06-20 17:06:09 +00:00
Tom Stellard	d4aa49ad5e	R600/SI: Add a VALU pattern for i64 xor llvm-svn: 211373	2014-06-20 17:05:57 +00:00
Matt Arsenault	b82983ef6a	R600/SI: Add intrinsics for various math instructions. These will be used for custom lowering and for library implementations of various math functions, so it's useful to expose these as builtins. llvm-svn: 211247	2014-06-19 01:19:19 +00:00
Marek Olsak	d80d0ca951	R600/SI: add gather4 and getlod intrinsics (v3) This contains all the previous patches + getlod support on top of it. It doesn't use SDNodes anymore, so it's quite small. It also adds v16i8 to SReg_128, which is used for the sampler descriptor. Reviewed-by: Tom Stellard llvm-svn: 211228	2014-06-18 22:00:29 +00:00
Matt Arsenault	a46ba4c9d1	R600/SI: Add intrinsics for brev instructions llvm-svn: 211187	2014-06-18 17:13:57 +00:00
Matt Arsenault	dc16a24358	R600/SI: Comparisons set vcc. llvm-svn: 211178	2014-06-18 16:53:48 +00:00
Matt Arsenault	508925b13a	R600/SI: Match cttz_zero_undef llvm-svn: 211116	2014-06-17 17:36:27 +00:00
Matt Arsenault	71fb43e88e	R600/SI: Match ctlz_zero_undef llvm-svn: 211115	2014-06-17 17:36:24 +00:00
Tom Stellard	a529beed9c	R600: Use LDS and vectors for private memory llvm-svn: 211110	2014-06-17 16:53:14 +00:00
Tom Stellard	4bdc400436	R600/SI: Add a pattern for llvm.AMDGPU.barrier.global llvm-svn: 211109	2014-06-17 16:53:09 +00:00
Tom Stellard	cc6701010d	R600: Remove AMDIL instruction and register definitions Most of these are no longer used any more. llvm-svn: 210915	2014-06-13 16:38:59 +00:00
Matt Arsenault	e19ddbd0dc	R600: Mostly remove remaining AMDIL intrinsics. Delete all unused ones, and add new AMDGPU named intrinsics for the ones that are. Handle the old AMDIL names for compatibility (although remove their GCCBuiltin names) and add tests since there weren't any for these before. llvm-svn: 210827	2014-06-12 21:15:44 +00:00
Matt Arsenault	2c82883c7a	R600/SI: Use a register set to -1 for data0 on ds_inc/ds_dec There is no such thing as a 0-data ds instruction, and the data operand needs to be a vgpr set to something meaningful. llvm-svn: 210756	2014-06-12 08:21:54 +00:00
Matt Arsenault	b6b9bb5978	R600/SI: Fix bitcast between v2i32 and f64 This is the same problem fixed in r210664 for more types. The test passes without this fix. For some reason I'm only hitting this when creating selects lowered to v2i32 selects. llvm-svn: 210692	2014-06-11 19:31:13 +00:00
Matt Arsenault	b8a7faa150	R600/SI: Update place using old subtarget predicate llvm-svn: 210683	2014-06-11 18:11:34 +00:00
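
Commit 840d57e330 above describes range reduction for sin/cos: normalize the input, then apply FRACT so that arbitrary values fall into the range the hardware instruction accepts. The following is a minimal Python sketch of that idea, not the actual lowering. It assumes that "normalization" means scaling radians by 1/(2*pi) into turns, models FRACT as x - floor(x), and uses math.sin as a stand-in for the hardware instruction; the function names are hypothetical.

```python
import math


def fract(x: float) -> float:
    """Model of a FRACT operation: x minus its floor (assumed semantics)."""
    return x - math.floor(x)


def reduced_sin(x: float) -> float:
    """Sine with range reduction, sketching the approach from the commit message."""
    # Assumption: normalization scales radians to turns (one turn = 2*pi radians).
    turns = x * (1.0 / (2.0 * math.pi))
    # Keep only the fractional turn, so the argument passed on is bounded.
    reduced = fract(turns)
    # math.sin stands in for the hardware SIN instruction here.
    return math.sin(2.0 * math.pi * reduced)
```

For large or negative inputs the reduced result should agree with math.sin up to rounding error, since sine is periodic in whole turns.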


261 Commits
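
Commit eae1f1c8cc above notes that truncating a double to f16 by way of float causes double rounding. The sketch below illustrates that effect in Python using the standard struct module, whose 'e' and 'f' formats round a Python float (a double) to half and single precision. The helper names and the chosen input value are illustrative assumptions, not taken from the commit.

```python
import struct


def to_half(x: float) -> float:
    """Round a double to IEEE half precision and widen it back."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_single(x: float) -> float:
    """Round a double to IEEE single precision and widen it back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# A double slightly above the midpoint between the adjacent half-precision
# values 1.0 and 1.0 + 2**-10. The extra 2**-25 is below single precision's
# resolution near 1.0, so the first rounding step discards it.
x = 1.0 + 2.0**-11 + 2.0**-25

direct = to_half(x)               # one rounding: double -> half
two_step = to_half(to_single(x))  # two roundings: double -> float -> half
```

Here the direct conversion rounds up to 1.0 + 2**-10, while the two-step path first lands exactly on the midpoint and then rounds to even, giving 1.0. That mismatch is the reason a truncate that accepts the wider type directly is needed for a strict IEEE conversion.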