llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-01-31 20:51:52 +01:00

Author	SHA1	Message	Date
Francesco Petrogalli	f5046224cf	[llvm][AArch64] Simplify (and (sign_extend..) #bitmask). Fold VT = (and (sign_extend NarrowVT to VT) #bitmask) into VT = (zero_extend NarrowVT). With this combine, the test replaces a sign-extended load + an unsigned extension with a zero-extended load to render one of the operands of the last multiplication. BEFORE: f_i16_i32: .fnstart; ldrsh r0, [r0]; ldrsh r1, [r1]; smulbb r0, r1, r0; uxth r1, r1; mul r0, r0, r1; bx lr. AFTER: f_i16_i32: .fnstart; ldrh r1, [r1]; ldrsh r0, [r0]; smulbb r0, r0, r1; mul r0, r0, r1; bx lr. Reviewed By: resistor Differential Revision: https://reviews.llvm.org/D90605	2020-11-09 12:53:36 +00:00
Craig Topper	a5e7540f6f	[LegalizeTypes] Remove unnecessary if around switch in ScalarizeVectorOperand and SplitVectorOperand. NFC The if was checking !Res.getNode() but that's always true since Res was initialized to SDValue() and not touched before the if. This appears to be a leftover from a previous implementation of Custom legalization where Res was updated instead of returning immediately.	2020-11-05 11:00:51 -08:00
Simon Pilgrim	6d3173c2b6	[DAG] computeKnownBits - Replace ISD::SREM handling with KnownBits::srem to reduce code duplication	2020-11-05 17:12:58 +00:00
Simon Pilgrim	329a4a468b	[KnownBits] Move ValueTracking/SelectionDAG UREM KnownBits handling to KnownBits::urem. NFCI. Both these have the same implementation - so move them to a single KnownBits copy. GlobalISel will be able to use this as well with minimal effort.	2020-11-05 14:30:59 +00:00
Simon Pilgrim	499b0ffb24	[KnownBits] Move ValueTracking/SelectionDAG UDIV KnownBits handling to KnownBits::udiv. NFCI. Both these have the same implementation - so move them to a single KnownBits copy. GlobalISel will be able to use this as well with minimal effort.	2020-11-05 13:42:42 +00:00
Cameron McInally	4e8609f657	[SelectionDAG] Add legalizations for VECREDUCE_SEQ_FMUL Hook up legalizations for VECREDUCE_SEQ_FMUL. This is following up on the VECREDUCE_SEQ_FADD work from D90247. Differential Revision: https://reviews.llvm.org/D90644	2020-11-04 14:20:31 -06:00
Fraser Cormack	5f7fe12cc6	[DAGCombine] Fix bug in load scalarization Summary: For vector element types which are not byte-sized, we would generate incorrect scalar offsets and produce incorrect codegen. This optimization could potentially be supported in the future, e.g. by loading in bytes, then shifting and masking out the remaining bits of the vector element. However, without an upstream target to test against it's best to avoid the bad codegen in the simplest possible way. Related to this bug: https://bugs.llvm.org/show_bug.cgi?id=27600 Reviewed by: foad Differential Revision: https://reviews.llvm.org/D78568	2020-11-04 19:02:40 +00:00
Kerry McLaughlin	1de78af5c4	[SVE][CodeGen] Lower scalable integer vector reductions This patch uses the existing LowerFixedLengthReductionToSVE function to also lower scalable vector reductions. A separate function has been added to lower VECREDUCE_AND & VECREDUCE_OR operations with predicate types using ptest. Lowering scalable floating-point reductions will be addressed in a follow up patch, for now these will hit the assertion added to expandVecReduce() in TargetLowering. Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D89382	2020-11-04 11:38:49 +00:00
Simon Pilgrim	ad61fbdaba	[DAG] computeKnownBits - Replace ISD::MUL handling with the common KnownBits::computeForMul implementation	2020-11-04 11:32:08 +00:00
Simon Pilgrim	977dc8c300	[DAG] computeKnownBits - Move ISD::SRA handling into KnownBits::ashr As discussed on D90527, we should be trying to move shift handling functionality into KnownBits to avoid code duplication in SelectionDAG/GlobalISel/ValueTracking.	2020-11-03 18:09:33 +00:00
Simon Pilgrim	95f20bec93	[DAG] computeKnownBits - Move (most) ISD::SRL handling into KnownBits::lshr As discussed on D90527, we should be trying to move shift handling functionality into KnownBits to avoid code duplication in SelectionDAG/GlobalISel/ValueTracking. The refactor to use the KnownBits fixed/min/max constant helpers allows us to hit a couple of cases that we were missing before. We still need the getValidMinimumShiftAmountConstant case as KnownBits doesn't handle per-element vector cases.	2020-11-03 17:30:36 +00:00
Simon Pilgrim	fc543f8a95	[DAG] computeKnownBits - Move (most) ISD::SHL handling into KnownBits::shl As discussed on D90527, we should be trying to move shift handling functionality into KnownBits to avoid code duplication in SelectionDAG/GlobalISel/ValueTracking. The refactor to use the KnownBits fixed/min/max constant helpers allows us to hit a couple of cases that we were missing before. We still need the getValidMinimumShiftAmountConstant case as KnownBits doesn't handle per-element vector cases.	2020-11-03 14:22:28 +00:00
Qiu Chaofan	6e7484a58c	[PowerPC] Avoid unnecessary fadd for unsigned to ppcf128 Unsigned 32-bit or shorter integer to ppcf128 conversions are currently expanded as signed-to-double with an extra fadd to 'complement'. But on PowerPC we have a native instruction to directly convert unsigned to double since ISA v2.06. This patch exploits it. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D89786	2020-11-01 23:22:47 +08:00
Cameron McInally	7b7e236aab	[Legalize] Add legalizations for VECREDUCE_SEQ_FADD Add Legalization support for VECREDUCE_SEQ_FADD, so that we don't need to depend on ExpandReductionsPass. Differential Revision: https://reviews.llvm.org/D90247	2020-10-30 16:02:55 -05:00
Nikita Popov	71c3b986f4	[SDAG] Extract helper to determine neutral element (NFC) Make the existing VECREDUCE based code more generic, by expressing it in terms of the neutral value of the base opcode instead.	2020-10-29 22:05:06 +01:00
Nikita Popov	7839d472ab	[SDAG] Fix neutral value for vecreduce_fadd The neutral value for FADD is -0.0, not 0.0, so this is what we need to pad vectors with.	2020-10-29 21:27:59 +01:00
Nikita Popov	f87aa4f24a	[SDAG] Extract helper to get vecreduce base opcode (NFC)	2020-10-29 20:22:22 +01:00
Simon Pilgrim	25e5e32d22	[DAG] Move canFoldInAddressingMode before foldBinOpIntoSelect. NFC. Reduces the diff in D90113.	2020-10-28 12:16:05 +00:00
Sven van Haastregt	a77c70490e	[TargetLowering] Add i1 condition for bit comparison fold For i1 types, boolean false is represented identically regardless of the boolean content, so we can allow optimizations that otherwise would not be correct for booleans with false represented as a negative one. Patch by Erik Hogeman. Differential Revision: https://reviews.llvm.org/D90145	2020-10-27 12:22:20 +00:00
Peter Waller	ff2b8ca717	[SVE][CodeGen][DAGCombiner] Fix TypeSize warning in redundant store elimination The modified code in visitSTORE was missing a scalable vector check, and still using the now deprecated implicit cast of TypeSize to uint64_t through the overloaded operator. This patch fixes these issues. This brings the logic in line with the comment on the context line immediately above the added precondition. Add a test in sve-redundant-store.ll that the warning is not triggered. Differential Revision: https://reviews.llvm.org/D89701	2020-10-26 16:37:48 +00:00
Peter Waller	752c121e75	Revert "[SVE][CodeGen][DAGCombiner] Fix TypeSize warning in redundant store elimination" This reverts commit 4604441386dc5fcd3165f4b39f5fa2e2c600f1bc. Reverting because it was not the intended version of the patch, which follows this patch.	2020-10-26 16:37:00 +00:00
Peter Waller	dec44ead4c	[SVE][CodeGen][DAGCombiner] Fix TypeSize warning in redundant store elimination The modified code in visitSTORE was missing a scalable vector check, and still using the now deprecated implicit cast of TypeSize to uint64_t through the overloaded operator. This patch fixes these issues. This brings the logic in line with the comment on the context line immediately above the added precondition. Add a test in Redundantstores.ll that the warning is not triggered.	2020-10-26 16:23:42 +00:00
Simon Pilgrim	e7c5dcedac	[DAG] Add BuildVectorSDNode::getRepeatedSequence helper to recognise multi-element splat patterns Replace the X86 specific isSplatZeroExtended helper with a generic BuildVectorSDNode method. I've just used this to simplify the X86ISD::BROADCASTM lowering so far (and remove isSplatZeroExtended), but we should be able to use this in more places to lower to complex broadcast patterns. Differential Revision: https://reviews.llvm.org/D87930	2020-10-24 12:23:09 +01:00
Simon Pilgrim	fa551c490c	[LegalizeTypes] Legalize vector rotate operations Lower vector rotate operations as long as the legalization occurs outside of LegalizeVectorOps. This fixes https://bugs.llvm.org/show_bug.cgi?id=47320 Patch By: @rsanthir.quic (Ryan Santhirarajan) Differential Revision: https://reviews.llvm.org/D89497	2020-10-24 11:30:32 +01:00
Cameron McInally	94734c2c54	[SVE] Lower fixed length VECREDUCE_SEQ_FADD operation Differential Revision: https://reviews.llvm.org/D89162	2020-10-23 16:24:02 -05:00
Denis Antrushin	91be48b03e	Revert "[Statepoints] Allow deopt GC pointer on VReg if gc-live bundle is empty." Downstream testing revealed some problems with this patch. Reverting while investigating. This reverts commit 2b96dcebfae65485859d956954f10f409abaae79.	2020-10-23 21:55:06 +07:00
Craig Topper	db12ebf176	[FPEnv][X86][SystemZ] Use different algorithms for i64->double uint_to_fp under strictfp to avoid producing -0.0 when rounding toward negative infinity Some of our conversion algorithms produce -0.0 when converting unsigned i64 to double when the rounding mode is round toward negative. This switches them to other algorithms that don't have this problem. Since it is undefined behavior to change rounding mode with the non-strict nodes, this patch only changes the behavior for strict nodes. There are still problems with unsigned i32 conversions too which I'll try to fix in another patch. Fixes part of PR47393 Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D87115	2020-10-21 18:12:54 -07:00
Gaurav Jain	cc3db92d73	[NFC] Set return type of getStackPointerRegisterToSaveRestore to Register Differential Revision: https://reviews.llvm.org/D89858	2020-10-21 16:19:38 -07:00
Simon Pilgrim	a666d53743	[DAG] getNode(ISD::EXTRACT_SUBVECTOR) Drop unnecessary N2C null check - we assert that this isn't null and have already used the pointer. NFCI. Fixes cppcheck + null dereference warning.	2020-10-21 11:53:44 +01:00
Sven van Haastregt	763f532d46	[TargetLowering] Check boolean content when folding bit compare Updates an optimization that relies on boolean contents being either 0 or 1 to properly check for this before triggering. The following: (X & 8) != 0 --> (X & 8) >> 3 Produces unexpected results when a boolean 'true' value is represented by negative one. Patch by Erik Hogeman. Differential Revision: https://reviews.llvm.org/D89390	2020-10-21 11:46:55 +01:00
David Sherwood	294e08a633	[SVE][CodeGen] Replace use of TypeSize comparison operator in CreateStackTemporary We were previously relying upon the TypeSize comparison operators to obtain the maximum size of two types, however use of such operators is being deprecated in favour of making the caller aware that it could be dealing with scalable vector types. I have changed the code to assert that the two types have the same scalable property and thus we can simply take the maximum of the known minimum sizes instead. Differential Revision: https://reviews.llvm.org/D88563	2020-10-21 08:31:36 +01:00
Qiu Chaofan	c6aeda738c	[DAGCombiner] Tighten reassociation of visitFMA From LangRef, FMF contract should not enable reassociating to form arbitrary contractions. So it should not help rearrange nodes like (fma (fmul x, c1), c2, y) into (fma x, c1*c2, y). Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D89527	2020-10-20 10:13:01 +08:00
Craig Topper	907d0bc7f5	[SelectionDAG][X86] Enable SimplifySetCC CTPOP transforms for vector splats This enables these transforms for vectors: (ctpop x) u< 2 -> (x & x-1) == 0 (ctpop x) u> 1 -> (x & x-1) != 0 (ctpop x) == 1 --> (x != 0) && ((x & x-1) == 0) (ctpop x) != 1 --> (x == 0) \|\| ((x & x-1) != 0) All enabled if CTPOP isn't Legal. This differs from the scalar behavior where the first two are done unconditionally and the last two are done if CTPOP isn't Legal or Custom. The Legal check produced better results for vectors based on X86's custom handling. Might be worth re-visiting scalars here. I disabled the looking through truncate for vectors. The code that creates new setcc can use the same result VT as the original setcc even if we truncated the input. That may work for most scalars, but definitely wouldn't work for vectors unless it was a vector of i1. Fixes or at least improves PR47825 Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D89346	2020-10-19 12:56:59 -07:00
Amy Kwan	35e7ebf816	[DAGCombiner][PowerPC] Remove isMulhCheaperThanMulShift TLI hook, Use isOperationLegalOrCustom directly instead. MULH is often expanded on targets. This patch removes the isMulhCheaperThanMulShift hook and uses isOperationLegalOrCustom instead. Differential Revision: https://reviews.llvm.org/D80485	2020-10-19 12:23:04 -05:00
David Sherwood	f28f40ddb8	[SVE][CodeGen] Replace more TypeSize comparison operators with their scalar equivalents In certain places in llvm/lib/CodeGen we were relying upon the TypeSize comparison operators when in fact the code was only ever expecting either scalar values or fixed width vectors. This patch changes a few functions that were always expecting to work on scalar or fixed width types: 1. DAGCombiner::mergeTruncStores - deals with scalar integers only. 2. DAGCombiner::ReduceLoadWidth - not valid for vectors. 3. DAGCombiner::createBuildVecShuffle - should only be used for fixed width vectors. 4. SelectionDAGLegalize::ExpandFCOPYSIGN and SelectionDAGLegalize::getSignAsIntValue - only work on scalars. Differential Revision: https://reviews.llvm.org/D88562	2020-10-19 08:38:50 +01:00
David Sherwood	64f6cfdc9f	[SVE][CodeGen][NFC] Replace TypeSize comparison operators with their scalar equivalents In certain places in llvm/lib/CodeGen we were relying upon the TypeSize comparison operators when in fact the code was only ever expecting either scalar values or fixed width vectors. I've changed some of these places to use the equivalent scalar operator. Differential Revision: https://reviews.llvm.org/D88482	2020-10-19 08:30:31 +01:00
David Sherwood	6ea45a3a15	[SVE][CodeGen] Replace uses of TypeSize comparison operators In certain places in the code we can never end up in a situation where we're mixing fixed width and scalable vector types. For example, we can't have truncations and extends that change the lane count. Also, in other places such as GenWidenVectorStores and GenWidenVectorLoads we know from the behaviour of FindMemType that we can never choose a vector type with a different scalable property. In various places I have used EVT::bitsXY functions instead of TypeSize::isKnownXY, where it probably makes sense to keep an assert that scalable properties match. Differential Revision: https://reviews.llvm.org/D88654	2020-10-19 08:08:41 +01:00
Craig Topper	cf43cfffa7	[TargetLowering] Extract simplifySetCCs ctpop into a separate function. NFCI As requested in D89346. This allows us to add some early outs. I reordered some checks a little bit to make the more common bail outs happen earlier. Like checking opcode before checking hasOneUse. And I moved the bit width check to make sure it was safe to look through a truncate to the spot where we look through truncates instead of after. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D89494	2020-10-16 19:47:56 -07:00
Denis Antrushin	7b96615eab	[Statepoints] Remove MI limit on number of tied operands. After D87915 statepoint can have more than 15 tied operands. Remove this restriction from statepoint lowering code.	2020-10-15 19:02:38 +07:00
Craig Topper	9b01853207	[TargetLowering] Replace Log2_32_Ceil with Log2_32 in SimplifySetCC ctpop combine. This combine can look through (trunc (ctpop X)). When doing this it tries to make sure the trunc doesn't lose any information from the ctpop. It does this by checking that the truncated type has more bits than Log2_32_Ceil of the ctpop type. The Ceil is unnecessary and pessimizes non-power of 2 types. For example, ctpop of i256 requires 9 bits to represent the max value of 256. But ctpop of i255 only requires 8 bits to represent the max result of 255. Log2_32_Ceil of 256 and 255 both return 8 while Log2_32 returns 8 for 256 and 7 for 255. The code with popcnt enabled is a regression for this test case, but it does match what already happens with i256 truncated to i9. Since power of 2 is more likely, I don't think it should block this change. Differential Revision: https://reviews.llvm.org/D89412	2020-10-15 01:05:07 -07:00
Jeremy Morse	75a07ccbc2	[DebugInstrRef] Create DBG_INSTR_REFs in SelectionDAG When given the -experimental-debug-variable-locations option (via -Xclang or to llc), have SelectionDAG generate DBG_INSTR_REF instructions instead of DBG_VALUE. For now, this only happens in a limited circumstance: when the value referred to is not a PHI and is defined in the current block. Other situations introduce interesting problems, addressed in later patches. Practically, this patch hooks into InstrEmitter and if it can find a defining instruction for a value, gives it an instruction number, and points the DBG_INSTR_REF at that <instr, operand> pair. Differential Revision: https://reviews.llvm.org/D85747	2020-10-14 14:24:08 +01:00
Craig Topper	0120cd1285	[X86][SelectionDAG] Add SADDO_CARRY and SSUBO_CARRY to support multipart signed add/sub overflow legalization. This passes existing X86 test but I'm not sure if it handles all type legalization cases it needs to. Alternative to D89200 Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D89222	2020-10-12 23:18:29 -07:00
Simon Pilgrim	132f72d148	[DAG][ARM][MIPS][RISCV] Improve funnel shift promotion to use 'double shift' patterns Based on a discussion on D88783, if we're promoting a funnel shift to a width at least twice the size as the original type, then we can use the 'double shift' patterns (shifting the concatenated sources). Differential Revision: https://reviews.llvm.org/D89139	2020-10-12 14:11:02 +01:00
David Sherwood	9a012ebf36	[SVE] Make ElementCount and TypeSize use a new PolySize class I have introduced a new template PolySize class, where the template parameter determines the type of quantity, i.e. for an element count this is just an unsigned value. The ElementCount class is now just a simple derivation of PolySize<unsigned>, whereas TypeSize is more complicated because it still needs to contain the uint64_t cast operator, since there are still many places in the code that rely upon this implicit cast. As such the class also still needs some of its own operators. I've tried to minimise the amount of code in the base PolySize class, which led to a couple of changes: 1. In some places we were relying on '==' operator comparisons between ElementCounts and the scalar value 1. I didn't put this operator in the new PolySize class, and thought it was actually clearer to use the isScalar() function instead. 2. I removed the isByteSized function and replaced it with calls to isKnownMultipleOf(8). I've also renamed NextPowerOf2 to be coefficientNextPowerOf2 so that it's more consistent with coefficientDivideBy. Differential Revision: https://reviews.llvm.org/D88409	2020-10-12 08:23:38 +01:00
Krzysztof Parzyszek	6f882a320b	[SDAG] Remember to set UndefElts in isSplatValue for SPLAT_VECTOR	2020-10-10 19:42:24 -05:00
Denis Antrushin	2cab9515a5	[Statepoints] Allow deopt GC pointer on VReg if gc-live bundle is empty. Currently we allow passing pointers from deopt bundle on VReg only if they were seen in list of gc-live pointers passed on VRegs. This means that for the case of empty gc-live bundle we spill deopt bundle's pointers. This change allows lowering deopt pointers to VRegs in case of empty gc-live bundle. In case of non-empty gc-live bundle, behavior does not change. Reviewed By: skatkov Differential Revision: https://reviews.llvm.org/D88999	2020-10-10 14:58:08 +07:00
Esme-Yi	725cbaf080	[DAGCombiner] Add decomposition patterns for Mul-by-Imm. Summary: This patch is derived from D87384. In this patch we expand the existing decomposition of mul-by-constant to be more general by implementing 2 patterns: ``` mul x, (2^N + 2^M) --> (add (shl x, N), (shl x, M)) mul x, (2^N - 2^M) --> (sub (shl x, N), (shl x, M)) ``` The conversion will be triggered if the multiplier is a big constant that the target can't use a single multiplication instruction to handle. This is controlled by the hook `decomposeMulByConstant`. Moreover, the conversion benefits from an ILP improvement since the instructions are independent. A case with a sequence like the following also benefits, since a shift instruction is saved. ``` res1 = a * 0x8800; res2 = a * 0x8080; ``` Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D88201	2020-10-09 08:51:40 +00:00
Fangrui Song	ff30c5884e	Fix incorrect Register -> MCRegister conversion getReg returns a Register which may represent a virtual register.	2020-10-08 21:40:48 -07:00
Amara Emerson	e007b84419	Rename the VECREDUCE_STRICT_{FADD,FMUL} SDNodes to VECREDUCE_SEQ_{FADD,FMUL}. The STRICT was causing unnecessary confusion. I think SEQ is a more accurate name for what they actually do, and the other obvious option of "ORDERED" has the issue of already having a meaning in FP contexts. Differential Revision: https://reviews.llvm.org/D88791	2020-10-07 10:45:09 -07:00
Amara Emerson	59c2440372	[llvm][mlir] Promote the experimental reduction intrinsics to be first class intrinsics. This change renames the intrinsics to not have "experimental" in the name. The autoupgrader will handle legacy intrinsics. Relevant ML thread: http://lists.llvm.org/pipermail/llvm-dev/2020-April/140729.html Differential Revision: https://reviews.llvm.org/D88787	2020-10-07 10:36:44 -07:00
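
Note on f5046224cf: the fold of (and (sign_extend NarrowVT to VT) #bitmask) into (zero_extend NarrowVT) can be checked exhaustively for small widths. The sketch below is a Python model, not LLVM code; the helper names are hypothetical, and it assumes the bitmask covers exactly the low bits of the narrow type, which is one reading of the commit message rather than a statement of the precise condition the patch checks.

```python
def sign_extend(value: int, from_bits: int, to_bits: int) -> int:
    """Sign-extend the low `from_bits` of `value` to `to_bits` (unsigned result)."""
    value &= (1 << from_bits) - 1
    sign_bit = 1 << (from_bits - 1)
    signed = (value ^ sign_bit) - sign_bit      # interpret as two's complement
    return signed & ((1 << to_bits) - 1)        # wrap into the wide type


def zero_extend(value: int, from_bits: int) -> int:
    """Zero-extend the low `from_bits` of `value`."""
    return value & ((1 << from_bits) - 1)


def fold_holds(from_bits: int, to_bits: int, mask: int) -> bool:
    """True if (sext(x) & mask) == zext(x) for every narrow value x."""
    return all(
        (sign_extend(x, from_bits, to_bits) & mask) == zero_extend(x, from_bits)
        for x in range(1 << from_bits)
    )


# i16 -> i32 with a mask of exactly the low 16 bits: the fold is sound.
print(fold_holds(16, 32, 0xFFFF))
# A wider mask keeps a copied sign bit, so the fold would be wrong.
print(fold_holds(16, 32, 0x1FFFF))
```

In this model a mask wider than the narrow type lets sign bits through, and a narrower mask drops value bits, so only the exact-width mask makes the two sides agree for all inputs.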
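
Note on 7839d472ab: the claim that the neutral value for FADD is -0.0 rather than 0.0 follows from IEEE 754 signed-zero rules. A minimal Python sketch (helper names are hypothetical; Python floats are IEEE doubles under the default round-to-nearest mode) compares bit patterns so that 0.0 and -0.0 are told apart:

```python
import struct


def bits(x: float) -> bytes:
    """Raw IEEE 754 double encoding, so that 0.0 and -0.0 compare unequal."""
    return struct.pack("<d", x)


def is_neutral_for_fadd(pad: float, samples) -> bool:
    """True if pad + x reproduces x bit-for-bit for every sample."""
    return all(bits(pad + x) == bits(x) for x in samples)


SAMPLES = [0.0, -0.0, 1.5, -2.25, float("inf"), float("-inf")]

print(is_neutral_for_fadd(-0.0, SAMPLES))  # padding lanes with -0.0
print(is_neutral_for_fadd(0.0, SAMPLES))   # padding lanes with +0.0
```

The only sample that distinguishes the two candidates is x = -0.0: adding +0.0 to it yields +0.0, which changes the sign of the result, while adding -0.0 leaves every sample unchanged.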
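
Note on 763f532d46: the fold (X & 8) != 0 --> (X & 8) >> 3 only agrees with the comparison when 'true' is represented as 1. The following Python model is a sketch with hypothetical helper names; it treats values as 8-bit integers and represents a negative-one 'true' as 0xFF.

```python
def setcc_ne_zero(x: int, mask: int, true_value: int) -> int:
    """Model of (x & mask) != 0, producing the target's encoding of 'true'."""
    return true_value if (x & mask) != 0 else 0


def shifted_fold(x: int, mask: int) -> int:
    """Model of the folded form: shift the single tested bit down to bit 0."""
    shift = mask.bit_length() - 1   # mask is assumed to be a power of two
    return (x & mask) >> shift


def fold_matches(true_value: int, mask: int = 8) -> bool:
    """True if the fold equals the comparison for every 8-bit input."""
    return all(
        setcc_ne_zero(x, mask, true_value) == shifted_fold(x, mask)
        for x in range(256)
    )


print(fold_matches(true_value=1))      # zero-or-one booleans
print(fold_matches(true_value=0xFF))   # zero-or-negative-one booleans (i8 -1)
```

With a negative-one 'true', the comparison yields all ones while the shift yields 1, which is the mismatch the commit guards against by checking the boolean content first.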
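
Note on 907d0bc7f5: the four CTPOP rewrites listed in that commit are bit-trick identities and can be verified exhaustively for a small width. This Python sketch (hypothetical helper names, 8-bit values, with x-1 wrapping as it would in a fixed-width integer) checks each one:

```python
WIDTH = 8
MASK = (1 << WIDTH) - 1


def ctpop(x: int) -> int:
    """Population count of an unsigned value."""
    return bin(x).count("1")


def clear_lowest_set_bit(x: int) -> int:
    """x & (x - 1) with fixed-width wraparound for x == 0."""
    return x & ((x - 1) & MASK)


def check_ctpop_identities() -> bool:
    """Verify all four rewrites for every WIDTH-bit value."""
    for x in range(1 << WIDTH):
        low_cleared = clear_lowest_set_bit(x)
        if (ctpop(x) < 2) != (low_cleared == 0):
            return False
        if (ctpop(x) > 1) != (low_cleared != 0):
            return False
        if (ctpop(x) == 1) != (x != 0 and low_cleared == 0):
            return False
        if (ctpop(x) != 1) != (x == 0 or low_cleared != 0):
            return False
    return True


print(check_ctpop_identities())
```

The x == 0 case is why the last two rewrites need the extra zero test: x & (x-1) is also zero when no bit is set at all.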
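
Note on 9b01853207: the arithmetic in that commit message can be reproduced directly. The functions below are Python reimplementations written for illustration (floor and ceiling of log2 for positive integers), not the LLVM helpers themselves, and `trunc_is_lossless` is a hypothetical name for the check the message describes.

```python
def log2_32(value: int) -> int:
    """Floor of log2(value) for value >= 1."""
    return value.bit_length() - 1


def log2_32_ceil(value: int) -> int:
    """Ceiling of log2(value) for value >= 1."""
    return (value - 1).bit_length()


def trunc_is_lossless(src_bits: int, trunc_bits: int, log2_fn=log2_32) -> bool:
    """Model of the check: the truncated type has more bits than log2 of the
    ctpop type's width, so the largest ctpop result (src_bits) still fits."""
    return trunc_bits > log2_fn(src_bits)


for width in (255, 256):
    print(width, log2_32(width), log2_32_ceil(width))

# ctpop of i255 truncated to i8: the maximum result 255 fits in 8 bits.
print(trunc_is_lossless(255, 8))                        # floor-based check
print(trunc_is_lossless(255, 8, log2_fn=log2_32_ceil))  # ceil-based check
```

Because floor(log2(N)) + 1 is exactly the number of bits needed to represent N, the floor-based comparison accepts a truncation precisely when the maximum ctpop result fits, whereas the ceiling-based one rejects some non-power-of-2 widths that would have been safe.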
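
Note on 725cbaf080: the two mul-by-immediate patterns, mul x, (2^N + 2^M) and mul x, (2^N - 2^M), can be illustrated with a small Python sketch. This is not the DAGCombiner implementation; the function names are hypothetical, the target hook `decomposeMulByConstant` is not modelled, and plain powers of two are left alone on the assumption that a single shift already handles them.

```python
def decompose_mul_const(c: int):
    """Return ("add", N, M) if c == 2^N + 2^M, ("sub", N, M) if c == 2^N - 2^M,
    otherwise None. Only positive, non-power-of-two constants are considered."""
    if c <= 0:
        return None
    low = c & -c                     # lowest set bit, i.e. 2^M
    m = low.bit_length() - 1
    rest = c - low
    if rest == 0:
        return None                  # single power of two: one shift suffices
    if rest & (rest - 1) == 0:       # exactly one more bit set: 2^N + 2^M
        return ("add", rest.bit_length() - 1, m)
    total = c + low
    if total & (total - 1) == 0:     # contiguous run of ones: 2^N - 2^M
        return ("sub", total.bit_length() - 1, m)
    return None


def apply_plan(x: int, plan) -> int:
    """Evaluate the shift/add or shift/sub sequence described by `plan`."""
    op, n, m = plan
    return (x << n) + (x << m) if op == "add" else (x << n) - (x << m)


for const in (0x8800, 0x8080, 0x7F00):
    plan = decompose_mul_const(const)
    print(hex(const), plan, apply_plan(3, plan) == 3 * const)
```

For the two constants quoted in the commit message, 0x8800 and 0x8080, both plans share the term shl x, 15, which is consistent with the message's remark that a shift instruction is saved when the two multiplications appear together.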


11078 Commits