llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-22 18:54:02 +01:00

Author	SHA1	Message	Date
David Sherwood	ac8a8b4ad9	[SVE][CodeGen] Fix scalable vector issues in DAGTypeLegalizer::GenWidenVectorStores In DAGTypeLegalizer::GenWidenVectorStores the algorithm assumes it only ever deals with fixed width types, hence the offsets for each individual store never take 'vscale' into account. I've changed the main loop in that function to use TypeSize instead of unsigned for tracking the remaining store amount and offset increment. In addition, I've changed the loop to use the new IncrementPointer helper function for updating the addresses in each iteration, since this handles scalable vector types. Whilst fixing this function I also fixed a minor issue in IncrementPointer whereby we were not adding the no-unsigned-wrap flag for the add instruction in the same way as the fixed width case does. Also, I've added a report_fatal_error in GenWidenVectorTruncStores, since this code currently uses a sequence of element-by-element scalar stores. I've added new tests in CodeGen/AArch64/sve-intrinsics-stores.ll CodeGen/AArch64/sve-st1-addressing-mode-reg-imm.ll for the changes in GenWidenVectorStores. Differential Revision: https://reviews.llvm.org/D84937	2020-08-13 11:07:17 +01:00
David Sherwood	8fe44dff5f	[CodeGen] In narrowExtractedVectorLoad bail out for scalable vectors In narrowExtractedVectorLoad there is an optimisation that tries to combine extract_subvector with a narrowing vector load. At the moment this produces warnings due to the incorrect calls to getVectorNumElements() for scalable vector types. I've got this working for scalable vectors too when the extract subvector index is a multiple of the minimum number of elements. I have added a new variant of the function: MachineFunction::getMachineMemOperand that copies an existing MachineMemOperand, but replaces the pointer info with a null version since we cannot currently represent scaled offsets. I've added a new test for this particular case in: CodeGen/AArch64/sve-extract-subvector.ll Differential Revision: https://reviews.llvm.org/D83950	2020-08-13 10:46:18 +01:00
Florian Hahn	850e6d4aa1	[InstCombine] Precommit tests for PR47149.	2020-08-13 10:36:52 +01:00
Rainer Orth	8644290085	[test] XFAIL two tests with inlining debug info issues on Sparc Currently only two test failures remain on Sparc, both `sparcv9-sun-solaris2.11` and `sparc64-unknown-linux-gnu`: LLVM :: DebugInfo/Generic/debug-label-inline.ll LLVM :: Linker/subprogram-linkonce-weak.ll They seem related in that debug info isn't generated for instruction bundles (like `retl+add` in the delay slot). I've filed separate bugs for both files (Bug 47129 and 47131), though it's probably the same issue. This patch `XFAIL`s the tests. Tested on `sparcv9-sun-solaris2.11` and `amd64-pc-solaris2.11`. Differential Revision: https://reviews.llvm.org/D85827	2020-08-13 11:12:52 +02:00
Sebastian Neubauer	f6c931c6b8	[AMDGPU] Fix typo. NFC	2020-08-13 10:41:48 +02:00
Qiu Chaofan	c854603d04	[NFC] [PowerPC] Rename SPE strict conversion test	2020-08-13 15:02:07 +08:00
Ali Tamur	e7d6dfa5d7	Revert "[SCEV] Look through single value PHIs." This reverts commit e441b7a7a0a72c28daf5a8e594559c667e5b4534. This patch causes a compile error in tensorflow opensource project. The stack trace looks like: Point of crash: llvm/include/llvm/Analysis/LoopInfoImpl.h : line 35 (gdb) ptype this type = const class llvm::LoopBase<llvm::BasicBlock, llvm::Loop> [with BlockT = llvm::BasicBlock, LoopT = llvm::Loop] (gdb) p this $1 = {ParentLoop = 0x0, SubLoops = std::vector of length 0, capacity 0, Blocks = std::vector of length 0, capacity 1, DenseBlockSet = {<llvm::SmallPtrSetImpl<llvm::BasicBlock const>> = {<llvm::SmallPtrSetImplBase> = {<llvm::DebugEpochBase> = {Epoch = 3}, SmallArray = 0x1b2bf6c8, CurArray = 0x1b2bf6c8, CurArraySize = 8, NumNonEmpty = 0, NumTombstones = 0}, <No data fields>}, SmallStorage = {0xfffffffffffffffe, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}}, IsInvalid = true} (gdb) p this->DenseBlockSet->CurArray $2 = (const void *) 0xfffffffffffffffe I will try to get a case from tensorflow or use creduce to get a small case.	2020-08-12 23:13:24 -07:00
Nadav Rotem	5c8e661d88	[Clang options] Optimize optionMatches() runtime by removing mallocs The method optionMatches() constructs 9865 std::string instances when comparing different options. Many of these instances exceed the size of the internal storage and force memory allocations. This patch adds an early exit check that eliminates most of the string allocations while keeping the code simple. Example inputs: Prefix: /, Name: Fr Prefix: -, Name: Fr Prefix: -, Name: fsanitize-address-field-padding= Prefix: -, Name: fsanitize-address-globals-dead-stripping Prefix: -, Name: fsanitize-address-poison-custom-array-cookie Prefix: -, Name: fsanitize-address-use-after-scope Prefix: -, Name: fsanitize-address-use-odr-indicator Prefix: -, Name: fsanitize-blacklist= Differential Revision: D85538	2020-08-12 23:07:07 -07:00
Aditya Kumar	11fc43fd2a	[HotColdSplit] Fix variable name spelling	2020-08-12 22:50:08 -07:00
Carl Ritson	e685b4674e	[AMDGPU] Pre-commit test for D85872	2020-08-13 13:07:27 +09:00
Xing GUO	a3c52b27f2	[macho2yaml] Remove an unused variable. NFC.	2020-08-13 11:14:31 +08:00
Ruiling Song	9b5efb57d0	[AMDGPU] Fix crash when dag-combining bitcast From the code after the 'break', they are processing 64bit scalar and vector bitcast. So I think the break-condition should be (cond1 || cond2). This means we only execute following code if (64bit and dest-is-vector). Also remove a previous fix which is not needed with this new fix. (introduced in: 1349a04ef5f594dda705ec80474dda4837f26dba) Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D85804	2020-08-13 10:23:13 +08:00
Albion Fung	3972d9461d	[PowerPC] Implement Vector Shift Builtins This patch implements the builtins for the vector shifts (shl, srl, sra), and adds the appropriate test cases for these builtins. The builtins utilize the vector shift instructions introduced within ISA 3.1. Differential Revision: https://reviews.llvm.org/D83338	2020-08-12 18:26:58 -05:00
Nikita Popov	3cea08454d	[ValueTracking] Add abs intrinsics support to computeConstantRange() Implementation is the same as for SPF_ABS.	2020-08-12 22:28:46 +02:00
Nikita Popov	fe50645fc5	[InstSimplify] Add additional abs intrinsic icmp tests (NFC) While abs >= 0 already folds, some variations thereon don't.	2020-08-12 22:28:46 +02:00
Nikita Popov	5901f3abc5	[InstSimplify] Extract abs intrinsic tests into separate file (NFC) Also move some tests from InstCombine to InstSimplify, as they are already handled by InstSimplify.	2020-08-12 22:28:46 +02:00
Nikita Popov	2a8504a89f	[ValueTracking] Support min/max intrinsics in computeConstantRange() The implementation is the same as for the SPF_* case.	2020-08-12 22:07:29 +02:00
Nikita Popov	cfe9647681	[InstSimplify] Add tests for icmp of min/max with constants (NFC) Test the case where the constants are not the same, but the result is still known.	2020-08-12 22:07:29 +02:00
Sanjay Patel	2e6cebe140	[InstCombine] prefer xor with -1 because 'not' is easier to understand (PR32706) This is a retry of rL300977 which was reverted because of infinite loops. We have fixed all of the known places where that would happen, but there's still a chance that this patch will cause infinite loops. This matches the demanded bits behavior in the DAG and should fix: https://bugs.llvm.org/show_bug.cgi?id=32706 Differential Revision: https://reviews.llvm.org/D32255	2020-08-12 15:50:33 -04:00
Sanjay Patel	cb77fb312f	[InstCombine] add test for 'not' vs 'xor'; NFC	2020-08-12 15:50:33 -04:00
Roman Lebedev	a0d3c66adb	[NFC][InstCombine] Add FIXME's for getLogBase2() / visitUDivOperand() These are not correctness issues. In visitUDivOperand(), if the (potential) divisor is undef, then udiv is already UB, so it is not incorrect to keep undef as shift amount. But, that is suboptimal. We could instead simply drop that select, picking the other operand. Afterwards, getLogBase2() could assert that there is no undef in divisor.	2020-08-12 22:06:54 +03:00
Roman Lebedev	84dd80140a	[InstCombine] Sanitize undef vector constant to 1 in X*(2^C) with X << C (PR47133) While x*undef is undef, shift-by-undef is poison, which we must avoid introducing. Also log2(iN undef) is NOT iN undef, because log2(iN undef) u< N. See https://bugs.llvm.org/show_bug.cgi?id=47133	2020-08-12 22:06:53 +03:00
Francesco Petrogalli	6fee8f8f2e	[SVE][VLS] Don't combine logical AND. Testing is performed when targeting 128, 256 and 512-bit wide vectors. For 128-bit vectors, the original behavior of using NEON instructions is preserved. Differential Revision: https://reviews.llvm.org/D85479	2020-08-12 20:00:07 +01:00
Amara Emerson	a56d446c69	[GlobalISel] Implement bit-test switch table optimization. This is mostly a straight port from SelectionDAG. We re-use the actual bit-test analysis part from SwitchLoweringUtils, which was factored out earlier to support jump-tables. Differential Revision: https://reviews.llvm.org/D85233	2020-08-12 11:31:39 -07:00
Simon Pilgrim	523ad96399	Fix signed/unsigned comparison warnings. NFC.	2020-08-12 19:22:13 +01:00
Craig Topper	0308690c50	Recommit "[InstSimplify] Remove select ?, undef, X -> X and select ?, X, undef -> X transforms" and its follow up patches This recommits the following patches now that D85684 has landed 1cf6f210a2e [IR] Disable select ? C : undef -> C fold in ConstantFoldSelectInstruction unless we know C isn't poison. 469da663f2d [InstSimplify] Re-enable select ?, undef, X -> X transform when X is provably not poison 122b0640fc9 [InstSimplify] Don't fold vectors of partial undef in SimplifySelectInst if the non-undef element value might produce poison ac0af12ed2f [InstSimplify] Add test cases for opportunities to fold select ?, X, undef -> X when we can prove X isn't poison 9b1e95329af [InstSimplify] Remove select ?, undef, X -> X and select ?, X, undef -> X transforms	2020-08-12 10:45:27 -07:00
Christopher Tetreault	76b23db219	[SVE] Remove default-false VectorType::get Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D84212	2020-08-12 10:37:05 -07:00
David Green	51c55be3c2	[Scheduler] Fix typo in comments. NFC	2020-08-12 18:36:05 +01:00
David Green	23f12b7d77	[ARM] Predicated VFMA patterns Similar to the Two op + select patterns that were added recently, this adds some patterns for select + fma to turn them into predicated operations. Differential Revision: https://reviews.llvm.org/D85824	2020-08-12 18:35:01 +01:00
Simon Pilgrim	1c995f64f2	[X86][SSE] Pull out BUILD_VECTOR operand equivalence tests. NFC. Pull out element equivalence code from isShuffleEquivalent/isTargetShuffleEquivalent, I've also removed many of the index modulos where possible. First step toward simply adding some additional equivalence tests.	2020-08-12 18:20:18 +01:00
Craig Topper	9a34dd9cc6	[X86][GlobalISel] Legalize G_ICMP results to s8. We need to produce a setcc instruction which has an 8-bit result. This gets rid of a bunch of cases that were using the s1->s8/s16/s32/s64 handling in selectZExt. I'm not very familiar with GlobalISel yet so I'm not yet sure the best way to do things. I'd especially like feedback on the best way to handle the currently split 32-bit and 64-bit mode handling. Differential Revision: https://reviews.llvm.org/D85814	2020-08-12 10:13:59 -07:00
Cameron McInally	f5557910ab	[SVE] Lower fixed length FP minnum/maxnum Lower fixed length MINNUM/MAXNUM to scalable vectors. Cherry-picked from D71767 with added tests. Differential Revision: https://reviews.llvm.org/D85744	2020-08-12 12:02:52 -05:00
Johannes Doerfert	fdfc525664	[UpdateTestChecks][FIX] Python 2.7 compatibility and use right prefix	2020-08-12 11:58:08 -05:00
Ilya Leoshkevich	9b258b83db	[SanitizerCoverage] Use zeroext for cmp parameters on all targets Commit 9385aaa84851 ("[sancov] Fix PR33732") added zeroext to __sanitizer_cov_trace(_const)?_cmp[1248] parameters for x86_64 only, however, it is useful on other targets, in particular, on SystemZ: it fixes swap-cmp.test. Therefore, use it on all targets. This is safe: if target ABI does not require zero extension for a particular parameter, zeroext is simply ignored. A similar change has been implemented as part of commit 3bc439bdff8b ("[MSan] Add instrumentation for SystemZ"), and there were no problems with it. Reviewed By: morehouse Differential Revision: https://reviews.llvm.org/D85689	2020-08-12 18:38:12 +02:00
Stanislav Mekhanoshin	c9db20b4cf	[AMDGPU][test] Add dedicated llvm-readobj test. Differential Revision: https://reviews.llvm.org/D85683	2020-08-12 09:11:36 -07:00
Krzysztof Parzyszek	b46450a31b	[Hexagon] Return scalar size in getMinVectorRegisterBitWidth() when no HVX This fixes https://llvm.org/PR47128.	2020-08-12 10:13:58 -05:00
Anna Welker	7da989b1b5	[ARM][MVE] Enable tail predication for loops containing MVE gather/scatters Widen the scope of memory operations that are allowed to be tail predicated to include gathers and scatters, such that loops that are auto-vectorized with the option -enable-arm-maskedgatscat (and actually end up containing an MVE gather or scatter) can be tail predicated. Differential Revision: https://reviews.llvm.org/D85138	2020-08-12 15:32:37 +01:00
Matt Arsenault	a7f874dd31	AMDGPU/GlobalISel: Select llvm.amdgcn.global.atomic.fadd Remove the intermediate transform in the DAG path. I believe this is the last non-deprecated intrinsic that needs handling.	2020-08-12 10:04:53 -04:00
Matt Arsenault	e479314184	AMDGPU: Handle intrinsics in performMemSDNodeCombine This avoids a possible regression in a future patch	2020-08-12 10:04:53 -04:00
Xing GUO	748a235862	[DWARFYAML] Make the address size of compilation units optional. This patch makes the 'AddrSize' field optional. If the address size is missing, yaml2obj will infer it from the object file. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D85805	2020-08-12 21:47:32 +08:00
Xing GUO	d15e24388b	[MachOYAML] Simplify the section data emitting function. NFC. This patch simplifies some of the code in the writeSectionData() function. Reviewed By: jhenderson, grimar Differential Revision: https://reviews.llvm.org/D85821	2020-08-12 21:46:43 +08:00
Sanjay Patel	00ebcf5eea	[VectorCombine] early exit if target has no vector registers Based on post-commit discussion in: D81766 Other vectorization passes (SLP and Loop) use this TTI API similarly.	2020-08-12 09:22:31 -04:00
Sanjay Patel	575e40e45d	[VectorCombine] add test for x86 target with SSE disabled; NFC	2020-08-12 09:22:31 -04:00
David Green	40e668f84c	[ARM] Add additional predicated VFMA tests. NFC	2020-08-12 14:20:20 +01:00
Sanjay Patel	8a2dab8cbf	[InstCombine] eliminate a pointer cast around insertelement I'm not sure if this solves PR46839 completely, but reducing the casting should help: https://bugs.llvm.org/show_bug.cgi?id=46839 Differential Revision: https://reviews.llvm.org/D85647	2020-08-12 09:08:17 -04:00
Sanjay Patel	fe63c251a7	[VectorCombine] add test for Hexagon that would crash; NFC This test verifies the code change from: rGb0b95dab1ce2 (although that would not be true if PR47128 is fixed)	2020-08-12 08:38:20 -04:00
Kai Nacke	c6e90fc4cb	[SystemZ/ZOS] Implement computeHostNumPhysicalCores On z/OS, the information is stored in the Common System Data Area (CSD). It is the number of CPs allocated to the current LPAR. Reviewers: aganea, hubert.reinterpertcast, MaskRay Reviewed By: hubert.reinterpertcast Differential Revision: https://reviews.llvm.org/D85531	2020-08-12 08:31:33 -04:00
Sam Parker	0906f80f16	[LoopUnroll] Adjust CostKind query When TTI was updated to use an explicit cost, TCK_CodeSize was used although the default implicit cost would have been the hand-wavey cost of size and latency. So, revert back to this behaviour. This is not expected to have (much) impact on targets since most (all?) of them return the same value for SizeAndLatency and CodeSize. When optimising for size, the logic has been changed to query CodeSize costs instead of SizeAndLatency. This patch also adds a testing option in the unroller so that OptSize thresholds can be specified. Differential Revision: https://reviews.llvm.org/D85723	2020-08-12 12:56:09 +01:00
David Green	f1a40c93c6	[ARM] Commutative vmin/maxnma tests. NFC	2020-08-12 12:50:18 +01:00
Simon Pilgrim	4c91fdf3fa	[X86][SSE] Fold HOP(SHUFFLE(X),SHUFFLE(Y)) --> SHUFFLE(HOP(X,Y)) This is beginning to look like a canonicalization stage that could be performed as part of shuffle combining Another step towards PR41813	2020-08-12 12:16:36 +01:00

201853 Commits