llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 20:12:56 +02:00

Author	SHA1	Message	Date
Marek Olsak	319154a0b6	R600/SI: Expand fract to floor, then only select V_FRACT on CI V_FRACT is buggy on SI. R600-specific code is left intact. v2: drop the multiclass, use complex VOP3 patterns llvm-svn: 233075	2015-03-24 13:40:08 +00:00
Matt Arsenault	d59f9a3d0d	R600/SI: Remove v_sub_f64 pseudo The expansion code does the same thing. Since the operands were not defined with the correct types, this has the side effect of fixing operand folding since the expanded pseudo would never use SGPRs or inline immediates. llvm-svn: 230072	2015-02-20 22:10:45 +00:00
Matt Arsenault	82f523f0fa	R600: Use new fmad node. This enables a few useful combines that used to only use fma. Also since v_mad_f32 apparently does not support denormals, disable the existing cases that are custom handled if they are requested. llvm-svn: 230071	2015-02-20 22:10:41 +00:00
Matt Arsenault	9c682e9039	R600/SI: Fix implicit vcc operand to v_div_fmas_* This should allow finally fixing the f64 fdiv implementation. Test is disabled for VI since there seems to be a problem with one of the buffer load instructions on it. llvm-svn: 229236	2015-02-14 04:22:00 +00:00
Tom Stellard	e523dc6621	R600/SI: Make more store operations legal v2i32, i32, trunc i32 to i16, and trunc i32 to i8 stores are legal for all address spaces. We had marked them as custom in order to lower them for the private address space, but this is no longer necessary. This enables lowering of misaligned stores of these types in the DAGLegalizer. llvm-svn: 228189	2015-02-04 20:49:51 +00:00
Tom Stellard	5ea00f06af	R600: Don't promote i64 stores to v2i32 during DAG legalization We take care of this during instruction selection now. This fixes a potential infinite loop when lowering misaligned stores. llvm-svn: 228188	2015-02-04 20:49:49 +00:00
Eric Christopher	ce1de59ee6	Reuse a bunch of cached subtargets and remove getSubtarget calls without a Function argument. llvm-svn: 227638	2015-01-30 23:24:40 +00:00
Eric Christopher	aacfef65cf	Move DataLayout back to the TargetMachine from TargetSubtargetInfo derived classes. Since global data alignment, layout, and mangling is often based on the DataLayout, move it to the TargetMachine. This ensures that global data is going to be laid out and mangled consistently if the subtarget changes on a per function basis. Prior to this all targets() have had subtarget dependent code moved out and onto the TargetMachine. One target hasn't been migrated as part of this change: R600. The R600 port has, as a subtarget feature, the size of pointers and this affects global data layout. I've currently hacked in a FIXME to enable progress, but the port needs to be updated to either pass the 64-bitness to the TargetMachine, or fix the DataLayout to avoid subtarget dependent features. llvm-svn: 227113	2015-01-26 19:03:15 +00:00
Tom Stellard	27630189de	R600/SI: Move i64 -> v2i32 load promotion into AMDGPUDAGToDAGISel::Select() We used to do this promotion during DAG legalization, but this caused an infinite loop in ExpandUnalignedLoad() because it assumed that i64 loads were legal if i64 was a legal type. It also seems better to report i64 loads as legal, since they actually are and we were just promoting them to simplify our tablegen files. llvm-svn: 226945	2015-01-23 22:05:45 +00:00
Jan Vesely	042d995634	R600: Try to use lower types for 64bit division if possible v2: add and enable tests for SI Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Matt Arsenault <Matthew.Arsenault@amd.com> llvm-svn: 226881	2015-01-22 23:42:43 +00:00
Jan Vesely	ef787763d5	R600: Simplify LowerUDIVREM optimizations can handle removing the Hi part operations. The generated code is identical for R600, ~10% icount reduction for SI v2: rebase Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Matt Arsenault <Matthew.Arsenault@amd.com> llvm-svn: 226879	2015-01-22 23:42:39 +00:00
Matt Arsenault	3c5cd0f801	R600/SI: Custom lower fround This fixes it for SI. It also removes the pattern used previously for Evergreen for f32. I'm not sure if the new R600 output is better or not, but it uses 1 fewer instructions if BFI is available. llvm-svn: 226682	2015-01-21 18:18:25 +00:00
Matt Arsenault	22a9f67443	Implement new way of expanding extloads. Now that the source and destination types can be specified, allow doing an expansion that doesn't use an EXTLOAD of the result type. Try to do a legal extload to an intermediate type and extend that if possible. This generalizes the special case custom lowering of extloads R600 has been using to work around this problem. This also happens to fix a bug that would incorrectly use more aligned loads than should be used. llvm-svn: 225925	2015-01-14 01:35:17 +00:00
Matt Arsenault	fb153e33c0	R600: Implement getRecipEstimate This requires a new hook to prevent expanding sqrt in terms of rsqrt and reciprocal. v_rcp_f32, v_rsq_f32, and v_sqrt_f32 are all the same rate, so this expansion would just double the number of instructions and cycles. llvm-svn: 225828	2015-01-13 20:53:23 +00:00
Matt Arsenault	4cc5aca737	R600: Implement getRsqrtEstimate Only do for f32 since I'm unclear on both what this is expecting for the refinement steps in terms of accuracy, and what f64 instruction actually provides. llvm-svn: 225827	2015-01-13 20:53:18 +00:00
Matt Arsenault	ef34769c29	R600: Make cttz / ctlz cheap to speculate Speculating things is generally good. SI+ has instructions for these for 32-bit values. This is still probably better even with the expansion for 64-bit values, although it is odd that this callback doesn't have the size as a parameter. llvm-svn: 225822	2015-01-13 19:46:48 +00:00
Ahmed Bougacha	4150499cd1	[SelectionDAG] Allow targets to specify legality of extloads' result type (in addition to the memory type). The LoadExt legalization handling used to only have one type, the memory type. This forced users to assume that as long as the extload for the memory type was declared legal, and the result type was legal, the whole extload was legal. However, this isn't always the case. For instance, on X86, with AVX, this is legal: v4i32 load, zext from v4i8 but this isn't: v4i64 load, zext from v4i8 Whereas v4i64 is (arguably) legal, even without AVX2. Note that the same thing was done a while ago for truncstores (r46140), but I assume no one needed it yet for extloads, so here we go. Calls to getLoadExtAction were changed to add the value type, found manually in the surrounding code. Calls to setLoadExtAction were mechanically changed, by wrapping the call in a loop, to match previous behavior. The loop iterates over the MVT subrange corresponding to the memory type (FP vectors, etc...). I also pulled neighboring setTruncStoreActions into some of the loops; those shouldn't make a difference, as the additional types are illegal. (e.g., i128->i1 truncstores on PPC.) No functional change intended. Differential Revision: http://reviews.llvm.org/D6532 llvm-svn: 225421	2015-01-08 00:51:32 +00:00
Matt Arsenault	08086327f3	R600/SI: Add class intrinsic llvm-svn: 225305	2015-01-06 23:00:37 +00:00
Matt Arsenault	47c2c0f05a	R600: Remove outdated comment llvm-svn: 224648	2014-12-19 23:29:13 +00:00
Matt Arsenault	a3a9080dce	R600/SI: Only form min/max with 1 use. If the condition is used for something else, this increases the number of instructions. llvm-svn: 224646	2014-12-19 23:15:30 +00:00
Matt Arsenault	c1a6f36235	R600: Fix min/max matching problems with unordered compares The returned operand needs to be permuted for the unordered compares. Also fix incorrectly producing fmin_legacy / fmax_legacy for f64, which don't exist. llvm-svn: 224094	2014-12-12 02:30:37 +00:00
Matt Arsenault	b0274833dd	Add target hook for whether it is profitable to reduce load widths Add an option to disable optimization to shrink truncated larger type loads to smaller type loads. On SI this prevents using scalar load instructions in some cases, since there are no scalar extloads. llvm-svn: 224084	2014-12-12 00:00:24 +00:00
Marek Olsak	15875571f6	R600/SI: Update instruction conversions for VI There are 3 changes: - Convert 32-bit S_LSHL/LSHR/ASHR to their V_*REV variants for VI - Lower RSQ_CLAMP for VI - Don't generate MIN/MAX_LEGACY on VI llvm-svn: 223604	2014-12-07 12:19:03 +00:00
Matt Arsenault	796e0c24e7	R600/SI: Use ZeroOrNegativeOneBooleanContent This sort of doesn't matter since the setcc type is i1, but this previously was using the default UndefinedBooleanContent. This makes it more consistent with R600. This enables more optimizations which typically give up on UndefinedBooleanContent. For example, there is already a special case target DAG combine for setcc + sext which can be eliminated in favor of what the generic DAG combiner can do if it assumes boolean values are sign extended. Since -1 is an inline immediate, using it is basically free and the backend already uses it when a boolean value is needed in a wider type. llvm-svn: 222850	2014-11-26 21:23:15 +00:00
Matt Arsenault	417f5ceb20	R600: Fix assert on copy of an i1 on pre-SI i1 is not a legal type on Evergreen, so this combine proceeded and tried to produce a bitcast between i1 and i8. llvm-svn: 222630	2014-11-23 02:57:52 +00:00
Matt Arsenault	1298bf6e9c	R600: Permute operands when selecting legacy min/max This gets the correct NaN behavior based on the compare type the hardware uses. This now passes the new piglit test I have for this on SI. Add stricter tests for the operand order. llvm-svn: 222079	2014-11-15 05:02:57 +00:00
Tom Stellard	d8a0a4cc2b	R600: Fix 64-bit integer division This fixes a failure in one of the oclconform tests. Patch by: Jan Vesely llvm-svn: 222073	2014-11-15 01:07:57 +00:00
Tom Stellard	573a5f6172	R600: Factor i64 UDIVREM lowering into its own function This is so it could potentially be used by SI. However, the current implementation does not always produce correct results, so the IntegerDivisionPass is being used instead. llvm-svn: 222072	2014-11-15 01:07:53 +00:00
Matt Arsenault	633ce0ecd7	R600/SI: Combine min3/max3 instructions llvm-svn: 222032	2014-11-14 20:08:52 +00:00
Matt Arsenault	346bdd92b1	R600/SI: Match integer min / max instructions llvm-svn: 222015	2014-11-14 18:30:06 +00:00
Matt Arsenault	08233590d4	R600/SI: Fix fmin_legacy / fmax_legacy matching for SI select_cc is expanded on SI, so this was never matched. llvm-svn: 221941	2014-11-13 23:03:09 +00:00
Aditya Nandakumar	4d9c1ff994	We can get the TLOF from the TargetMachine - so constructor no longer requires TargetLoweringObjectFile to be passed. llvm-svn: 221926	2014-11-13 21:29:21 +00:00
Matt Arsenault	8f277a520f	R600: Error on initializer for LDS. Also give a proper error for other address spaces. llvm-svn: 221917	2014-11-13 19:56:13 +00:00
Aditya Nandakumar	b93fb292df	This patch changes the ownership of TLOF from TargetLoweringBase to TargetMachine so that different subtargets could share the TLOF effectively llvm-svn: 221878	2014-11-13 09:26:31 +00:00
Matt Arsenault	2257f6b589	Add minnum / maxnum codegen llvm-svn: 220342	2014-10-21 23:01:01 +00:00
Matt Arsenault	98d33a4281	R600/SI: Add missing parameter to div_fmas intrinsic llvm-svn: 220338	2014-10-21 22:20:55 +00:00
Matt Arsenault	75125bd463	R600: Fix nonsensical implementation of computeKnownBits for BFE This was resulting in invalid simplifications of sdiv llvm-svn: 219953	2014-10-16 20:07:40 +00:00
Matt Arsenault	c79fc2137a	R600: Remove dead function llvm-svn: 219879	2014-10-16 00:08:09 +00:00
Matt Arsenault	302179d41a	R600: Remove unnecessary part of computeKnownBitsForTargetNode Zero-width BFEs are combined away already, so there's no point in handling them. llvm-svn: 219868	2014-10-15 23:37:49 +00:00
Matt Arsenault	03564ece92	Move variable down to use llvm-svn: 219867	2014-10-15 23:37:42 +00:00
Matt Arsenault	1d906ecdac	R600: Fix miscompiles when BFE has multiple uses SimplifyDemandedBits would break the other uses of the operand. llvm-svn: 219819	2014-10-15 17:58:34 +00:00
Matt Arsenault	cb725dcde9	R600: Use existing variable llvm-svn: 219778	2014-10-15 05:07:00 +00:00
Matt Arsenault	04b9e0240c	R600: Remove outdated comment llvm-svn: 219777	2014-10-15 05:06:57 +00:00
Matt Arsenault	c421684bad	R600/SI: Custom lower f64 -> i64 conversions llvm-svn: 219038	2014-10-03 23:54:56 +00:00
Matt Arsenault	7b24655980	R600: Custom lower [s|u]int_to_fp for i64 -> f64 llvm-svn: 219037	2014-10-03 23:54:41 +00:00
Matt Arsenault	2456242394	R600/SI: Fix ftrunc f64 conformance failures. Re-add the tests since they were deleted at some point llvm-svn: 219036	2014-10-03 23:54:27 +00:00
Matt Arsenault	c05b3987ac	R600/SI: Add a note about the order of the operands to div_scale llvm-svn: 218534	2014-09-26 17:55:09 +00:00
Tom Stellard	8c1a958993	R600: Don't set BypassSlowDiv for 64-bit division BypassSlowDiv is used by codegen prepare to insert a run-time check to see if the operands to a 64-bit division are really 32-bit values and if they are it will do 32-bit division instead. This is not useful for R600, which has predicated control flow since both the 32-bit and 64-bit paths will be executed in most cases. It also increases code size which can lead to more instruction cache misses. llvm-svn: 218252	2014-09-22 15:35:32 +00:00
Tom Stellard	14b4dc8502	R600/SI: Use ISD::MUL instead of ISD::UMULO when lowering division ISD::MUL and ISD::UMULO are the same except that UMULO sets an overflow bit. Since we aren't using the overflow bit, we should use ISD::MUL. llvm-svn: 218251	2014-09-22 15:35:30 +00:00
Matt Arsenault	114b472f2c	R600: Better fix for bug 20982 Just do the left shift as unsigned to avoid the UB. llvm-svn: 218092	2014-09-19 00:42:06 +00:00
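
Note on the BypassSlowDiv entry above (llvm-svn: 218252): the commit message describes a run-time check that tests whether the operands of a 64-bit division are really 32-bit values and, if so, performs a 32-bit division instead. The sketch below shows roughly what that check amounts to when written by hand in C++. It is an illustration only, not code from LLVM; the function name `udiv64_with_bypass` is hypothetical, and the caller is assumed to guarantee a non-zero divisor.

```cpp
#include <cstdint>

// Hypothetical helper (not LLVM code): a hand-written equivalent of the
// run-time check described in the BypassSlowDiv commit message.
// Assumption: b != 0, as with any integer division.
uint64_t udiv64_with_bypass(uint64_t a, uint64_t b) {
    // If the upper 32 bits of both operands are zero, both values
    // fit in 32 bits and the narrower division gives the same result.
    if (((a | b) >> 32) == 0) {
        // Fast path: 32-bit unsigned division.
        return static_cast<uint32_t>(a) / static_cast<uint32_t>(b);
    }
    // Slow path: full 64-bit unsigned division.
    return a / b;
}
```

On a CPU the branch lets most calls skip the slow path. As the commit message points out, on hardware with predicated control flow both paths are executed in most cases, so the check only adds code size, which is why the commit stops setting it for R600.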


246 Commits