This reverts commit 0345d88de654259ae90494bf9b015416e2cccacb.
A Google-internal backend uses EntrySU; we are looking into removing
the dependency on it.
Differential Revision: https://reviews.llvm.org/D88018
This commit was originally reverted because it was suspected to cause a crash,
but a reproducer did not surface.
A crash that was exposed by this change was fixed in 1d8f2e52925b.
This reverts the revert commit 0581c0b0eeba03da590d1176a4580cf9b9e8d1e3.
This adds lowering for f32 values using vmov.f16, which zeroes the
top bits whilst setting the lower bits to a pattern. This range of
values does not often come up, except where an f16 constant value has
been converted to an f32.
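To illustrate (a minimal sketch; the constant and name are mine), such an f32 has a zero top half and an f16 bit pattern in the low half:
```
; f16 1.0 has bit pattern 0x3C00 (== 15360); as an f32 bit pattern with the
; top 16 bits zeroed, it is exactly the kind of value this lowering covers:
%f = bitcast i32 15360 to float
```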
Differential Revision: https://reviews.llvm.org/D87790
This does not result in changes for any of the current tests, but it might
improve debug information in some cases.
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D86522
SelectionDAGBuilder was inconsistently mangling values based on ABI
Calling Conventions when getting them through copyFromRegs in
SelectionDAGBuilder, causing duplicate value type conversions for
function arguments. The check for the mangling requirement was based
on the value's originating instruction and was performed outside of, and
in spite of, the regular Calling Convention Lowering.
The issue could be observed in a scenario such as:
```
%arg1 = load half, half* %const, align 2
%arg2 = call fastcc half @someFunc()
call fastcc void @otherFunc(half %arg1, half %arg2)
; Here, %arg2 was incorrectly mangled twice, as the CallConv data from
; the call to @someFunc() was taken into consideration for the check
; when getting the value for processing the call to @otherFunc(...),
; after the proper conversion had taken place when lowering the return
; value of the first call.
```
This patch fixes the issue by disregarding the Calling Convention
information for such copyFromRegs, making sure the ABI mangling is
properly contained in the Calling Convention Lowering.
This fixes Bugzilla #47454.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D87844
LSR claims to preserve MemorySSA, but we also have to make sure it is preserved
when splitting critical edges. This can be done by passing MSSAU to
SplitCriticalEdge.
Fixes PR47557.
This is a follow-up to D86605. For a strict DAG FP node, if its FP
exception behavior metadata is 'ignore', it should have the nofpexcept
flag. But during custom lowering, this flag isn't passed down.
This is also seen on the X86 target.
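For reference, a strict FP operation with ignored exception behavior looks like this in IR (a minimal sketch inside a strictfp function; the names are illustrative):
```
; the !"fpexcept.ignore" metadata means the resulting strict DAG node should
; carry the nofpexcept flag, and custom lowering must keep it:
%r = call float @llvm.experimental.constrained.fadd.f32(float %a, float %b, metadata !"round.dynamic", metadata !"fpexcept.ignore")
```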
Reviewed By: uweigand
Differential Revision: https://reviews.llvm.org/D87390
The scalar elements of the vXi1 build_vector will have been type legalized to i8 by padding with 0s. So we can't check for all ones. Instead we should just look at bit 0 of the constant.
Differential Revision: https://reviews.llvm.org/D87863
This adds simple constant folding for VMOVrh, to constant fold fp16
constants to integer values. It can help especially with soft calling
conventions, but some of the results are not optimal as we end up
loading using a vldr. This will be improved in a follow up patch.
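At the IR level, the analogous fold is a bitcast of an fp16 constant to its integer bit pattern (a hedged sketch; VMOVrh itself only appears during ISel):
```
; half 1.0 has bit pattern 0x3C00 == 15360:
%i = bitcast half 0xH3C00 to i16   ; folds to i16 15360
```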
Differential Revision: https://reviews.llvm.org/D87789
InstCombine likes to canonicalize comparisons of the form
X == C || X == C+1 into (X & -2) == C'. Make sure LVI can still
recover the value range from this. This can of course also be
useful for proper mask comparisons.
For the sake of clarity, the implementation goes through KnownBits
to compute the range.
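A small example of the canonicalized pattern (constants chosen for illustration):
```
; x == 8 || x == 9 is canonicalized by InstCombine to:
%m = and i32 %x, -2
%c = icmp eq i32 %m, 8
; when %c is true, LVI should recover the range [8, 10) for %x.
```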
Rewrite this in a way where the core logic is in a separate
function that is invoked with swapped operands. This makes it
easier to add handling for additional icmp patterns.
It should be possible to make this generic, but we're not great at checking legality of *_EXTEND_VECTOR_INREG ops, so I'm conservatively putting this inside X86ISelLowering.cpp
We do similar factorization folds in SimplifyUsingDistributiveLaws,
but that drops no-wrap properties. Propagating those optimally may
help solve:
https://llvm.org/PR47430
The propagation is all-or-nothing for these patterns: when all
3 incoming ops have nsw or nuw, the 2 new ops should have the
same no-wrap property:
https://alive2.llvm.org/ce/z/Dv8wsU
This also solves:
https://llvm.org/PR47584
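As an illustrative sketch (one shape such a factorization can take; the alive2 link above has the verified patterns), with all three incoming ops carrying nsw:
```
; (X << C) + (Y << C), all nsw:
%a = shl nsw i8 %x, 2
%b = shl nsw i8 %y, 2
%r = add nsw i8 %a, %b
; can be factored with nsw preserved on both new ops:
%s  = add nsw i8 %x, %y
%r2 = shl nsw i8 %s, 2
```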
The test (currently crashing) is reduced from the example provided
in the post-commit discussion in D87149.
Differential Revision: https://reviews.llvm.org/D87965
It should be possible to make this generic, but we're not great at checking legality of *_EXTEND_VECTOR_INREG ops, so I'm conservatively putting this inside X86ISelLowering.cpp
After moving, WidenedMask is in an undefined state, so reduce the scope of the variable so that it is reinitialized every iteration; we should still retain any memory allocation savings.
I want to export this function, and the current API was a bit
weird: it took an additional Alignment argument that didn't really
have anything to do with what the function does. Drop it, and
perform a max at the callsite.
Also rename it to tryEnforceAlignment().
Currently SCEVExpander creates inttoptr for non-integral pointers if,
for example, the base is a null constant. This results in invalid IR.
This patch changes InsertNoopCastOfTo to emit a GEP & bitcast to convert
to a non-integral pointer. First, a GEP of i8* null is generated and the
integral value is used as index. The GEP is then bitcasted to the target
type.
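A minimal sketch of the emitted pattern (assuming address space 3 is declared non-integral via 'ni:3' in the datalayout; the names are illustrative):
```
; instead of the invalid: %p = inttoptr i64 %offset to float addrspace(3)*
; emit a GEP off null with the integral value as index, then a bitcast:
%gep = getelementptr i8, i8 addrspace(3)* null, i64 %offset
%p = bitcast i8 addrspace(3)* %gep to float addrspace(3)*
```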
This was exposed by D71539.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D87827
The output here may not be optimal (yet), but it should be
consistent for commuted operands (it was not before) and
correct. We can do better by checking FMF and NaN if needed.
Code in InstSimplify generally assumes that we have already
folded code like this, so it was not handling 2 constant
inputs by commuting consistently.
Currently, newer clang-format options cannot be used in .clang-format files if not all users can be forced to use an updated clang-format version.
This patch tries to solve this by adding an option to clang-format that allows unknown (newer) options to be ignored.
Differential Revision: https://reviews.llvm.org/D86137
We were breaking out of the switch, which falls into the default
implementation of SimplifyDemandedBitsForTargetNode, a wrapper
around computeKnownBits. So we ended up doing the recursion
and known bits calculation all over again. Instead we should return
with the known bits we calculated in the switch.
When address sanitizing a function, stack unpoisoning code is inserted before each ret instruction. However, if the ret instruction is preceded by a musttail call, such a transformation breaks the musttail call contract and generates invalid IR.
This patch fixes the issue by moving the insertion point prior to the musttail call if there is one.
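For context, the musttail contract requires the call to be immediately followed by the ret (optionally through a bitcast), so nothing may be inserted between them (a sketch; the names are illustrative):
```
; valid: unpoisoning code goes above the musttail call, not below it
; <stack unpoisoning code here>
%r = musttail call i32 @callee(i32 %x)
ret i32 %r
```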
Differential Revision: https://reviews.llvm.org/D87777
The IRInstructionData structs are a different representation of the
program. This list treats the program as if it were "flattened",
with this list as the only parent. This lets us easily create ranges of
instructions.
Differential Revision: https://reviews.llvm.org/D86969
This patch implements the vec_gen[b|h|w|d|q]m function prototypes in altivec.h
in order to utilize the Move to VSR with Mask instructions introduced in Power10.
Differential Revision: https://reviews.llvm.org/D82725
If the mask of a pdep or pext instruction is a shifted mask (i.e. one contiguous block of ones), we need at most one AND and one shift to represent the operation without the intrinsic. On all platforms I know of, this is faster than the pdep/pext.
The cost modelling for multiple contiguous blocks might be worth exploring in a follow-up, but it's not relevant for my current use case. It would almost certainly be a win on AMD processors, where these are really, really slow, though.
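For example (a sketch with a concrete contiguous mask; the general lowering derives the shift amount from the mask):
```
; pdep with the contiguous mask 0xF0 deposits the low 4 bits of %x into
; bits 4..7:
%r = call i32 @llvm.x86.bmi.pdep.32(i32 %x, i32 240)
; which is equivalent to one shift and one and:
%s  = shl i32 %x, 4
%r2 = and i32 %s, 240
```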
Differential Revision: https://reviews.llvm.org/D87861