llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-01-31 20:51:52 +01:00

Author	SHA1	Message	Date
Bill Wendling	0816222e8f	Revert "Remove redundant "std::move"s in return statements" The build failed with error: call to deleted constructor of 'llvm::Error' errors. This reverts commit 1c2241a7936bf85aa68aef94bd40c3ba77d8ddf2.	2020-02-10 07:07:40 -08:00
James Henderson	f0566b59e2	[DebugInfo] Reject line tables of version > 5 If a debug line section with version of greater than 5 is encountered, prior to this change the parser would accept it and treat it as version 5. This might work to some extent, but then it might not at all, as it really depends on the format of the unspecified future version, which will be different (otherwise there would be no point in changing the version number). Any information we could provide has a good chance of being invalid, so we should just refuse to parse such tables. Reviewed by: dblaikie, MaskRay Differential Revision: https://reviews.llvm.org/D74204	2020-02-10 14:43:10 +00:00
Bill Wendling	e45b5f33f3	Remove redundant "std::move"s in return statements	2020-02-10 06:39:44 -08:00
Luke Geeson	fa507d6291	[AArch64] Make Read Write System Registers Read Only This patch makes the following System Registers Read Only: - CurrentEL - ICH_MISR_EL2 - PMBIDR_EL1 - PMSIDR_EL1 as found in: https://developer.arm.com/docs/ddi0595/e/aarch64-system-registers Relative line numbers were also added to the tests so we get more informative error messages on failure. Change-Id: I963b4f01ca5737b58f9e8e7abe9ca1d99e328758	2020-02-10 14:34:24 +00:00
Sebastian Neubauer	92565aa596	[SelectionDAG] Optimize build_vector of truncates and shifts Add a simplification to fuse a manual vector extract with shifts and truncate into a bitcast. Unpacking and packing values into vectors is only optimized with extractelement instructions, not when manually unpacked using shifts and truncates. This patch simplifies shifts and truncates into a bitcast if possible. Simplify (build_vec (trunc $1) (trunc (srl $1 width)) (trunc (srl $1 (2 * width))) ...) to (bitcast $1) Differential Revision: https://reviews.llvm.org/D73892	2020-02-10 15:04:07 +01:00
Kai Nacke	c31671a606	[SystemZ] Add implementation for the intrinsic llvm.read_register This change implements the llvm intrinsic llvm.read_register for the SystemZ platform which returns the value of the specified register (http://llvm.org/docs/LangRef.html#llvm-read-register-and-llvm-write-register-intrinsics). This implementation returns the value of the stack register, and can be extended to return the value of other registers. The implementation for this intrinsic exists on various other platforms including Power, x86, ARM, etc. but missing on SystemZ. Reviewers: uweigand Differential Revision: https://reviews.llvm.org/D73378	2020-02-10 08:19:10 -05:00
Hans Wennborg	c12abbadd4	Fix an unused variable warning	2020-02-10 14:08:18 +01:00
Mikael Holmen	e84339f6f7	Fix compiler warning when compiling without asserts [NFC]	2020-02-10 13:55:52 +01:00
Kadir Cetinkaya	aee66e3ccd	[OpenMP] Fix unused variable	2020-02-10 13:47:20 +01:00
Kerry McLaughlin	e2dc4155d9	[AArch64][SVE] SVE2 intrinsics for complex integer arithmetic Summary: Adds the following SVE2 intrinsics: - cadd & sqcadd - cmla & sqrdcmlah - saddlbt, ssublbt & ssubltb Reviewers: sdesmalen, dancgr, efriedma, cameron.mcinally, c-rhodes, rengolin Reviewed By: sdesmalen Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73636	2020-02-10 12:14:56 +00:00
Simon Pilgrim	4f15f7247f	Revert rGe82e17d4d4cac8b2df00094e80d5e1cb22795664 - [X86] Add lowerShuffleAsBitRotate (PR44379) As noted on PR44379, we didn't attempt to lower vector shuffles using bit rotations on XOP/AVX512F targets. This patch lowers to uniform ISD:ROTL nodes - ROTR isn't supported by XOP and they are interchangeable for constant values anyway. There might be cases where targets without ISD:ROTL support would benefit from this (expanding to SRL+SHL+OR), which I'll investigate in a future patch. Also, non-AVX512BW targets fail to concatenate 256-bit rotations back to 512-bits (split during shuffle lowering as they don't have v32i16/v64i8 types). --- Internal shuffle tests indicate theres a bug somewhere that I haven't been able to track down yet.	2020-02-10 12:14:26 +00:00
Florian Hahn	c7634d1407	[DSE] Add first version of MemorySSA-backed DSE (Bottom up walk). This patch adds a first version of a MemorySSA based DSE. It is missing a lot of features, which will get added as follow-ups, to help to keep the review manageable. The patch uses the following general approach: given a MemoryDef, walk upwards to find clobbering MemoryDefs that may be killed by the starting def. Then check that there are no uses that may read the location of the original MemoryDef in between both MemoryDefs. A bit more concretely: For all MemoryDefs StartDef: 1. Get the next dominating clobbering MemoryDef (DomAccess) by walking upwards. 2. Check that there no reads between DomAccess and the StartDef by checking all uses starting at DomAccess and walking until we see StartDef. 3. For each found DomDef, check that: 1. There are no barrier instructions between DomDef and StartDef (like throws or stores with ordering constraints). 2. StartDef is executed whenever DomDef is executed. 3. StartDef completely overwrites DomDef. 4. Erase DomDef from the function and MemorySSA. The patch uses a very simple approach to guarantee that no throwing instructions are between 2 stores: We only allow accesses to stack objects, access that are in the same basic block if the block does not contain any throwing instructions or accesses in functions that do not contain any throwing instructions. This will get lifted later. Besides adding support for the missing cases, there is plenty of additional potential for improvements as follow-up work, e.g. the way we visit stores (could be just a traversal of the MemorySSA, rather than collecting them up-front), using the alias information discovered during walking to optimize the MemorySSA. This is loosely based on D40480 by Dave Green. Reviewers: dmgreen, rnk, efriedma, bryant, asbirlea, Tyker Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D72700	2020-02-10 11:52:11 +00:00
Kerry McLaughlin	0f6e08381a	[AArch64][SVE] SVE2 intrinsics for character match & histogram generation Summary: Implements the following intrinsics: - @llvm.aarch64.sve.histcnt - @llvm.aarch64.sve.histseg - @llvm.aarch64.sve.match - @llvm.aarch64.sve.nmatch Reviewers: c-rhodes, sdesmalen, dancgr, efriedma, rengolin Reviewed By: c-rhodes Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74117	2020-02-10 11:08:00 +00:00
Kerry McLaughlin	05d274bf7e	[AArch64][SVE] Add SVE2 intrinsics for widening DSP operations Summary: Implements the following intrinsics: - @llvm.aarch64.sve.[s\|u]abalb - @llvm.aarch64.sve.[s\|u]abalt - @llvm.aarch64.sve.[s\|u]addlb - @llvm.aarch64.sve.[s\|u]addlt - @llvm.aarch64.sve.[s\|u]sublb - @llvm.aarch64.sve.[s\|u]sublt - @llvm.aarch64.sve.[s\|u]abdlb - @llvm.aarch64.sve.[s\|u]abdlt - @llvm.aarch64.sve.sqdmullb - @llvm.aarch64.sve.sqdmullt - @llvm.aarch64.sve.[s\|u]mullb - @llvm.aarch64.sve.[s\|u]mullt Reviewers: sdesmalen, dancgr, efriedma, cameron.mcinally, rengolin Reviewed By: sdesmalen Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73719	2020-02-10 10:37:59 +00:00
Florian Hahn	583b0cbdfc	[DSE] Add tests for MemorySSA based DSE. This copies the DSE tests into a MSSA subdirectory to test the MemorySSA backed DSE implementation, without disturbing the original tests. Differential Revision: https://reviews.llvm.org/D72145	2020-02-10 10:28:43 +00:00
Djordje Todorovic	290bca5867	[CSInfo] Fix the assertions regarding updating the CSInfo The call site info was not updated correctly when deleting corresponding call instructions. Differential Revision: https://reviews.llvm.org/D73700	2020-02-10 10:55:06 +01:00
Kai Nacke	455cbd72ce	[SytemZ] Disable vector ABI when using option -march=arch[8\|9\|10] When specifying -march=arch[8\|9\|10], those CPU types do NOT support the vector extension. In this case the vector ABI must be disabled. The generated data layout should NOT contain 64-v128. Reviewers: uweigand Differential Revision: https://reviews.llvm.org/D74146	2020-02-10 04:14:05 -05:00
Djordje Todorovic	f63712851c	[CSInfo] Use isCandidateForCallSiteEntry() when updating the CSInfo Use the isCandidateForCallSiteEntry(). This should mostly be an NFC, but there are some parts ensuring the moveCallSiteInfo() and copyCallSiteInfo() operate with call site entry candidates (both Src and Dest should be the call site entry candidates). Differential Revision: https://reviews.llvm.org/D74122	2020-02-10 10:03:14 +01:00
Sebastian Neubauer	b1879e506b	[AMDGPU] Add a16 feature to gfx10 Based on D72931 This adds a new feature called A16 which is enabled for gfx10. gfx9 keeps the R128A16 feature so it can share all the instruction encodings with gfx7/8. Differential Revision: https://reviews.llvm.org/D73956	2020-02-10 09:04:23 +01:00
Johannes Doerfert	bdfa790992	[Attributor] Simple casts preserve no-alias property This is a minimal but important advancement over the existing code. A cast with an operand that is only used in the cast retains the no-alias property of the operand.	2020-02-10 01:11:32 -06:00
Amara Emerson	b69eda2e73	[GlobalISel][CallLowering] Tighten constantexpr check for callee. I'm not sure there's a test case for this, but it's better to be safe.	2020-02-09 22:59:48 -08:00
Johannes Doerfert	bf5760c5ab	[Attributor] Allow PHI nodes in AAValueConstantRangeFloating Traversing PHI nodes is natural with the genericValueTraversal but also a bit tricky. The problem is similar to the ones we have seen in AAAlign and AADereferenceable, namely that we continue to increase the range in each iteration. We use a pessimistic approach here to stop the iterations. Nevertheless, optimistic information can now be propagated through a PHI node.	2020-02-10 00:55:10 -06:00
Johannes Doerfert	6ac6a81d33	[Attributor][FIX] Remove FIXME that seems outdated The change is performed as stated by the FIXME and the tests are adjusted. All changes look fine to me and values can be inferred as undef without it being an error.	2020-02-10 00:55:10 -06:00
Johannes Doerfert	6ad6b3038f	[Attributor] Allow SelectInst in AAValueConstantRangeFloating The genericValueTraversal will already handle SelectInst properly and we just needed to allow them in the initialize method.	2020-02-10 00:55:09 -06:00
Johannes Doerfert	894ac6470f	[Attributor] Look through (some) casts in AAValueConstantRangeFloating Casts can be handled natively by the ConstantRange class. We do limit it to extends for now as we assume an integer type in different locations. A TODO and a test case with a FIXME was added to remove that restriction in the future.	2020-02-10 00:38:01 -06:00
Johannes Doerfert	47c6de4aa0	[Attributor][FIX] Call right base method in AAValueConstantRangeFloating We now call the base class method as we should.	2020-02-10 00:38:01 -06:00
Craig Topper	d6b10b296b	[X86] Make (insert_vector_elt (v8i16 zerovec), i16 %x, 0) generate the same code as (v8i16 (build_vector %x, 0, 0, 0, 0, 0, 0, 0)). Instead of using a insrw to element 0, use movzx and movd. Same for v16i8.	2020-02-09 21:52:11 -08:00
Michael Liao	b43dfd2f41	Fix `-Wparentheses` warning. NFC.	2020-02-10 00:45:02 -05:00
Craig Topper	9d2a7779c9	[X86] Use MOVZX instead of MOVSX in f16_to_fp isel patterns. Using sign extend forces the adjacent element to either all zeros or all ones. But all ones is a NAN. So that doesn't seem like a great idea. Trying to work on supporting this with strict FP where NAN would definitely be bad.	2020-02-09 20:39:52 -08:00
Shiva Chen	93b04ecdf3	[RISCV] Fix incorrect FP base CFI offset for variable argument functions When the FP exists, the FP base CFI directive offset should take the size of variable arguments into account. Differential Revision: https://reviews.llvm.org/D73862	2020-02-10 11:56:08 +08:00
Fangrui Song	db8bcae104	[DebugInfo] Add a DWARFDataExtractor constructor that takes ArrayRef<uint8_t> Similar to D67797 (DataExtractor).	2020-02-09 17:45:32 -08:00
Matt Arsenault	fabdf2e6da	GlobalISel: Fix narrowScalar for G_{CTLZ\|CTTZ}_ZERO_UNDEF Narrow these for 64-bit VALU for AMDGPU.	2020-02-09 19:02:38 -05:00
Matt Arsenault	4dfb5b8a70	AMDGPU/GlobalISel: Split 64-bit G_CTPOP in RegBankSelect	2020-02-09 18:39:33 -05:00
Matt Arsenault	a025afb406	GlobalISel: Fix narrowing of G_CTLZ/G_CTTZ The result type is separate from the source type.	2020-02-09 18:11:43 -05:00
Matt Arsenault	49e5ee334c	AMDGPU/GlobalISel: Don't mis-select vector index on a constant Vector indexing with a constant index should be folded out in the legalizer, but this was accidentally falling through. This would produce the indexing operation with $noreg. Handle this case as a dynamic index just in case a bug like this happens again in the future.	2020-02-09 18:02:37 -05:00
Matt Arsenault	11aefea8e8	AMDGPU/GlobalISel: Look through casts when legalizing vector indexing We were failing to find constants that were casted. I feel like the artifact combiner should have folded the constant in the trunc before the custom lowering, but that doesn't happen.	2020-02-09 18:02:10 -05:00
Matt Arsenault	7f0a98ae35	AMDGPU: Remove dead kill handling At one point a custom node was used for kill handling, but now the intrinsic is directly selected. Remove leftover pattern machinery.	2020-02-09 17:59:24 -05:00
Matt Arsenault	780e89dd58	AMDGPU: Fix SI_IF lowering when the save exec reg has terminator uses Reverts part of 6524a7a2b9ca072bd7f7b4355d1230e70c679d2f. Since that commit, the expansion was ignoring the actual save exec register produced by the instruction, and looking at other instructions. I do not understand why it was looking at other instructions, but relying on this scan was wrong. Fixes verifier errors after SI_IF is tail duplicated, which should be correct to do. The results were fed into a phi, which was lowered to the S_MOV_B64_term instructions.	2020-02-09 17:59:19 -05:00
Simon Pilgrim	4d7c42b93d	[X86] combineConcatVectorOps - combine VROTLI/VROTRI ops Fix issue mentioned on rGe82e17d4d4ca - non-AVX512BW targets failed to concatenate 256-bit rotations back to 512-bits (split during shuffle lowering as they don't have v32i16/v64i8 types).	2020-02-09 21:50:10 +00:00
Craig Topper	d20adfcf42	[X86] Use custom isel for (X86sbb_flag 0, 0) so we can use 32-bit SBB for i8/i16. We were using MOV32r0 and an extract_subreg as an input. By using custom isel we can move the extract_subreg to after the SBB instead of on the input.	2020-02-09 13:19:35 -08:00
Craig Topper	11d30da96a	[X86] Add flag result VT to a MOV32r0 created in X86DAGToDAGISel::Select The flag isn't used, but I believe this matches the MOV32r0 that would be created by the table emitter. This should allow this node to be CSEed with any others created by the table.	2020-02-09 13:19:21 -08:00
Simon Pilgrim	1f239dd6be	[X86] Add lowerShuffleAsBitRotate (PR44379) As noted on PR44379, we didn't attempt to lower vector shuffles using bit rotations on XOP/AVX512F targets. This patch lowers to uniform ISD:ROTL nodes - ROTR isn't supported by XOP and they are interchangeable for constant values anyway. There might be cases where targets without ISD:ROTL support would benefit from this (expanding to SRL+SHL+OR), which I'll investigate in a future patch. Also, non-AVX512BW targets fail to concatenate 256-bit rotations back to 512-bits (split during shuffle lowering as they don't have v32i16/v64i8 types).	2020-02-09 21:15:03 +00:00
Craig Topper	ac35a8a35e	[X86] Use MVT::i32 for the type of a MOV32r0 created in X86DAGToDAGISel::Select. Not sure if this really matters. The VT isn't really used after this point. At best it might affect CSE.	2020-02-09 11:57:42 -08:00
Craig Topper	c35ae80199	[X86] Remove isel patterns that include a vselect/X86selects and a strict FP node. A vselect+strictfp node is not equivalent to a masked operation. The exceptions of the strictfp node are not masked by a vselect after it so we can't match it to a masked operation. We already had a hack in IsLegalToFold to prevent these patterns from matching. This patch removes that hack and removes the patterns.	2020-02-09 11:45:54 -08:00
Simon Pilgrim	9fac692aaa	[X86] Rename matchShuffleAsRotate - matchShuffleAsByteRotate. NFCI. A matchShuffleAsBitRotate variant will be added soon and we need to make the difference more obvious.	2020-02-09 18:35:50 +00:00
Sanjay Patel	35167a94b8	[VectorCombine] new IR transform pass for partial vector ops We have several bug reports that could be characterized as "reducing scalarization", and this topic was also raised on llvm-dev recently: http://lists.llvm.org/pipermail/llvm-dev/2020-January/138157.html ...so I'm proposing that we deal with these patterns in a new, lightweight IR vector pass that runs before/after other vectorization passes. There are 4 alternate options that I can think of to deal with this kind of problem (and we've seen various attempts at all of these), but they all have flaws: InstCombine - can't happen without TTI, but we don't want target-specific folds there. SDAG - too late to assist other vectorization passes; TLI is not equipped for these kind of cost queries; limited to a single basic block. CGP - too late to assist other vectorization passes; would need to re-implement basic cleanups like CSE/instcombine. SLP - doesn't fit with existing transforms; limited to a single basic block. This initial patch/transform is based on existing code in AggressiveInstCombine: we walk backwards through the function looking for a pattern match. But we diverge from that cost-independent IR canonicalization pass by using TTI to decide if the vector alternative is profitable. We probably have at least 10 similar bug reports/patterns (binops, constants, inserts, cheap shuffles, etc) that would fit in this pass as follow-up enhancements. It's possible that we could iterate on a worklist to fix-point like InstCombine does, but it's safer to start with a most basic case and evolve from there, so I didn't try to do anything fancy with this initial implementation. Differential Revision: https://reviews.llvm.org/D73480	2020-02-09 10:04:41 -05:00
Simon Pilgrim	c912faf386	Fix signed/unsigned warning.	2020-02-09 13:35:03 +00:00
Simon Pilgrim	d1b038d961	[X86] Recognise ROTLI/ROTRI rotations as faux shuffles Allows us to combine rotations with shuffles. One of many things necessary to fix PR44379 (lowering shuffles to rotations)	2020-02-09 12:25:49 +00:00
Ehud Katz	f6fd87fc2b	[LoopExtractor] Convert LoopExtractor from LoopPass to ModulePass The LoopExtractor created new functions (by definition), which violates the restrictions of a LoopPass. The correct implementation of this pass should be as a ModulePass. Includes reverting rL82990 implications on the LoopExtractor. Fixes PR3082 and PR8929. Differential Revision: https://reviews.llvm.org/D69069	2020-02-09 12:25:21 +02:00
serge_sans_paille	35ac7e7659	Support -fstack-clash-protection for x86 Implement protection against the stack clash attack [0] through inline stack probing. Probe stack allocation every PAGE_SIZE during frame lowering or dynamic allocation to make sure the page guard, if any, is touched when touching the stack, in a similar manner to GCC[1]. This extends the existing `probe-stack' mechanism with a special value `inline-asm'. Technically the former uses function call before stack allocation while this patch provides inlined stack probes and chunk allocation. Only implemented for x86. [0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt [1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html This a recommit of 39f50da2a357a8f685b3540246c5d762734e035f with proper LiveIn declaration, better option handling and more portable testing. Differential Revision: https://reviews.llvm.org/D68720	2020-02-09 10:42:45 +01:00

1 2 3 4 5 ...

131046 Commits