llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 19:23:23 +01:00

Author	SHA1	Message	Date
Fraser Cormack	0be8ae2293	[RISCV] Adjust tested vor ops for more stable tests. NFC.	2020-12-28 19:33:25 +00:00
Arthur Eubanks	0c3d18722c	[NewPM][AMDGPU] Port amdgpu-simplifylib/amdgpu-usenative And add them to the pipeline via AMDGPUTargetMachine::registerPassBuilderCallbacks(), which mirrors AMDGPUTargetMachine::adjustPassManager(). These passes can't be unconditionally added to PassRegistry.def since they are only present when the AMDGPU backend is enabled. And there are no target-specific headers in llvm/include, so parsing these pass names must occur somewhere in the AMDGPU directory. I decided the best place was inside the TargetMachine, since the PassBuilder invokes TargetMachine::registerPassBuilderCallbacks() anyway. If we come up with a cleaner solution for target-specific passes in the future that's fine, but there aren't too many target-specific IR passes living in target-specific directories so it shouldn't be too bad to change in the future. Reviewed By: ychen, arsenm Differential Revision: https://reviews.llvm.org/D93863	2020-12-28 10:38:51 -08:00
Philip Reames	c5a5f5171e	Reapply "[LV] Vectorize (some) early and multiple exit loops"" w/fix for builder This reverts commit 4ffcd4fe9ac2ee948948f732baa16663eb63f1c7 thus restoring e4df6a40dad. The only change from the original patch is to add "llvm::" before the call to empty(iterator_range). This is a speculative fix for the ambiguity reported on some builders.	2020-12-28 10:13:28 -08:00
Arthur Eubanks	39bb979f4e	Revert "[LV] Vectorize (some) early and multiple exit loops" This reverts commit e4df6a40dad66e989a4333c11d39cf3ed9635135. Breaks Windows bots, e.g. http://45.33.8.238/win/30472/step_4.txt and http://lab.llvm.org:8011/#/builders/83/builds/2078/steps/5/logs/stdio	2020-12-28 10:05:41 -08:00
Philip Reames	11bdb2a5fc	[LV] Vectorize (some) early and multiple exit loops This patch is a major step towards supporting multiple exit loops in the vectorizer. This patch on its own extends the loop forms allowed in two ways: (1) single exit loops which are not bottom tested; (2) multiple exit loops w/ a single exit block reached from all exits and no phis in the exit block (because of LCSSA this implies no values defined in the loop used later). The restrictions on multiple exit loop structures will be removed in follow up patches; disallowing cases for now makes the code changes smaller and more obvious. As before, we can only handle loops with entirely analyzable exits. Removing that restriction is much harder, and is not part of currently planned efforts. The basic idea here is that we can force the last iteration to run in the scalar epilogue loop (if we have one). From the definition of SCEV's backedge taken count, we know that no earlier iteration can exit the vector body. As such, we can leave the decision on which exit to be taken to the scalar code and generate a bottom tested vector loop which runs all but the last iteration. The existing code already had the notion of requiring one iteration in the scalar epilogue, this patch is mainly about generalizing that support slightly, making sure we don't try to use this mechanism when tail folding, and updating the code to reflect the difference between a single exit block and a unique exit block (very mechanical). Differential Revision: https://reviews.llvm.org/D93317	2020-12-28 09:40:42 -08:00
Nikita Popov	78e2ca16be	[ValueTracking] Fix isKnownNonEqual() with constexpr mul Confusingly, BinaryOperator is not an Operator, OverflowingBinaryOperator is... We were implicitly assuming that the multiply is an Instruction here. This fixes the assertion failure reported in https://reviews.llvm.org/D92726#2472827.	2020-12-28 18:32:57 +01:00
Dmitry Preobrazhensky	c78e6a3a7b	[AMDGPU][MC][NFC] Split large asm tests into smaller chunks The following large tests have been split into smaller parts by instruction formats: gfx7_asm_all.s gfx8_asm_all.s gfx9_asm_all.s gfx10_asm_all.s This change results in noticeable lit testing speedup. For example, on a debug Windows build, split asm tests are run 3.5 times faster.	2020-12-28 20:22:38 +03:00
Roman Lebedev	7b3150a431	Revert "[benchmark] Fixed a build error when using CMake 3.15.1 + NDK-R20" Temporarily revert until a consensus on post-commit comments is achieved. This reverts commit a485a59d2172daaee1d5e734da54fbb243f7d54c.	2020-12-28 20:19:08 +03:00
Paul C. Anagnostopoulos	3c1ec6d0fb	[TableGen] Fix bug in !interleave operator I forgot to account for unresolved elements of the list. Differential Revision: https://reviews.llvm.org/D93814	2020-12-28 12:17:24 -05:00
Roman Lebedev	a8e9630846	[InstCombine] 'hoist xor-by-constant from xor-by-value': ignore constantexprs As it is being reported (in post-commit review) in https://reviews.llvm.org/D93857 this fold (as i expected, but failed to come up with test coverage despite trying) has issues with constant expressions. Since we only care about true constants, which constantexprs are not, don't perform such hoisting for constant expressions.	2020-12-28 20:15:20 +03:00
Gabriel Hjort Åkerlund	1c8a391cbc	[MIRPrinter] Fix incorrect output of unnamed stack names The MIRParser expects unnamed stack entries to have empty names (''). In case of unnamed alloca instructions, the MIRPrinter would output '<unnamed alloca>', which caused the MIRParser to reject the generated code. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D93685	2020-12-28 18:01:40 +01:00
Nemanja Ivanovic	bb2bb5b29c	[PowerPC] Remove redundant COPY_TO_REGCLASS introduced by 8a58f21f5b6c	2020-12-28 09:26:51 -06:00
alex-t	f210958b3b	[AMDGPU] Split edge to make si_if dominate end_cf Basic block containing "if" not necessarily dominates block that is the "false" target for the if. That "false" target block may have another predecessor besides the "if" block. IR value corresponding to the Exec mask is generated by the si_if intrinsic and then used by the end_cf intrinsic. In this case IR verifier complains that 'Def does not dominate all uses'. This change split the edge between the "if" block and "false" target block to make it dominated by the "if" block. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D91435	2020-12-28 17:14:02 +03:00
Zakk Chen	690bf6c8de	[RISCV] Define vmsbf.m/vmsif.m/vmsof.m/viota.m/vid.v intrinsics. Define those intrinsics and lower to V instructions. Use update_llc_test_checks.py for viota.m tests to check earlyclobber is applied correctly. mask viota.m tests uses the same argument as input and mask for avoid dependency of D93364. We work with @rogfer01 from BSC to come out this patch. Reviewed By: HsiangKai Differential Revision: https://reviews.llvm.org/D93823	2020-12-28 05:54:18 -08:00
Dmitry Preobrazhensky	c7d0eb0473	[AMDGPU][MC] Improved errors handling for v_interp* operands See bug 48596 (https://bugs.llvm.org/show_bug.cgi?id=48596) Reviewers: rampitec Differential Revision: https://reviews.llvm.org/D93757	2020-12-28 16:15:48 +03:00
Dmitry Preobrazhensky	0ef397453b	[AMDGPU][MC][NFC] Parser refactoring See bug 48515 (https://bugs.llvm.org/show_bug.cgi?id=48515) Reviewers: rampitec Differential Revision: https://reviews.llvm.org/D93756	2020-12-28 14:59:49 +03:00
AnZhong Huang	547707945a	[benchmark] Fixed a build error when using CMake 3.15.1 + NDK-R20 std::decay_t, used by llvm/utils/benchmark/include/benchmark/benchmark.h, is a C++14 feature, but the CMakelist uses C++11; this is the root cause of the build error. There are two options to fix the error: 1) change the CMakelist to support C++14; 2) change std::decay_t to std::decay, which is what this patch does. This bug can only be reproduced with CMake 3.15; we didn't observe the bug with CMake 3.16. But based on the code's logic, it's an obvious bug in LLVM. The upstream code is fine; the problem was introduced by rG1bd6123b781120c9190b9ba58b900cdcb718cdd1. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D93794	2020-12-28 11:24:56 +03:00
Fraser Cormack	1540d39c62	[RISCV] Pattern-match more vector-splatted constants This patch extends the pattern-matching capability of vector-splatted constants. When illegally-typed constants are legalized they are canonically sign-extended to XLenVT. This preserves the sign and allows us to match simm5. If they were zero-extended for whatever reason we'd lose that ability: e.g. `(i8 -1) -> (XLenVT 255)` would not be matched under the current logic. To address this we first manually sign-extend the splatted constant from the vector element type to int64_t. This preserves the semantics while removing any implicitly-truncated bits. The corresponding logic for uimm5 was not updated, the rationale being that neither sign- nor zero-extending a legal uimm5 immediate should change that (unless we expect actual "garbage" upper bits). Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D93837	2020-12-28 07:11:10 +00:00
Chen Zheng	e35f5bcc7e	[MachineSink] add threshold in machinesink pass to reduce compiling time.	2020-12-27 23:23:07 -05:00
Yevgeny Rouban	dd767b7278	[RS4GC] Lazily set changed flag when folding single entry phis The function FoldSingleEntryPHINodes() is changed to return if it has changed IR or not. This return value is used by RS4GC to set the MadeChange flag respectively. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D93810	2020-12-28 10:54:21 +07:00
Juneyoung Lee	a7f581c4d0	[InstCombine] use poison as placeholder for undemanded elems Currently undef is used as a don’t-care vector when constructing a vector using a series of insertelement. However, this is problematic because undef isn’t undefined enough. Especially, a sequence of insertelement can be optimized to shufflevector, but using undef as its placeholder makes shufflevector a poison-blocking instruction because undef cannot be optimized to poison. This makes a few straightforward optimizations incorrect, such as: ``` ; https://bugs.llvm.org/show_bug.cgi?id=44185 define <4 x float> @insert_not_undef_shuffle_translate_commute(float %x, <4 x float> %y, <4 x float> %q) { %xv = insertelement <4 x float> %q, float %x, i32 2 %r = shufflevector <4 x float> %y, <4 x float> %xv, <4 x i32> { 0, 6, 2, undef } ret <4 x float> %r ; %r[3] is undef } => define <4 x float> @insert_not_undef_shuffle_translate_commute(float %x, <4 x float> %y, <4 x float> %q) { %r = insertelement <4 x float> %y, float %x, i32 1 ret <4 x float> %r ; %r[3] = %y[3], incorrect if %y[3] = poison } Transformation doesn't verify! ERROR: Target is more poisonous than source ``` I’d like to suggest 1. Using poison as insertelement’s placeholder value (IRBuilder::CreateVectorSplat should be patched too) 2. Updating shufflevector’s semantics to return poison element if mask is undef Note that poison is currently lowered into UNDEF in SelDag, so codegen part is okay. m_Undef() matches PoisonValue as well, so existing optimizations will still fire. The only concern is hidden miscompilations that will go incorrect when poison constant is given. A conservative way is copying all tests having `insertelement undef` & replacing it with `insertelement poison` & run Alive2 on it, but it will create many tests and people won’t like it. :( Instead, I’ll simply locally maintain the tests and run Alive2. If there is any bug found, I’ll report it. Relevant links: https://bugs.llvm.org/show_bug.cgi?id=43958 , http://lists.llvm.org/pipermail/llvm-dev/2019-November/137242.html Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D93586	2020-12-28 08:58:15 +09:00
Juneyoung Lee	545caba6c2	[ValueTracking] Use m_LogicalAnd/Or to look into conditions This patch updates isImpliedCondition/isKnownNonZero to look into select form of and/or as well. See llvm.org/pr48353 and D93065 for more context Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D93845	2020-12-28 08:32:45 +09:00
Florian Hahn	a50a21617e	[GVN] Correctly set modified status when doing PRE on indices. This patch updates GVN to correctly return the modified status, if PRE is performed on indices. It fixes a crash when building the test-suite with EXPENSIVE_CHECKS and LTO.	2020-12-27 21:58:31 +00:00
Juneyoung Lee	0eb41a93d1	[ValueTracking] Add unit tests for isKnownNonZero, isImpliedCondition (NFC)	2020-12-28 06:32:57 +09:00
Juneyoung Lee	fb0fe9dd48	[EarlyCSE] Use m_LogicalAnd/Or matchers to handle branch conditions EarlyCSE's handleBranchCondition says: ``` // If the condition is AND operation, we can propagate its operands into the // true branch. If it is OR operation, we can propagate them into the false // branch. ``` This holds for the corresponding select patterns as well. This is a part of an ongoing work for disabling buggy select->and/or transformations. See llvm.org/pr48353 and D93065 for more context Proof: and: https://alive2.llvm.org/ce/z/MQWodU or: https://alive2.llvm.org/ce/z/9GLbB_ Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D93842	2020-12-28 05:36:26 +09:00
Juneyoung Lee	5e880dcc6f	[GVN] Use m_LogicalAnd/Or to propagate equality from branch conditions This patch makes GVN recognize `select c1, c2, false` as well as `select c1, true, c2` branch condition and propagate equality from these. See llvm.org/pr48353, D93065 Differential Revision: https://reviews.llvm.org/D93841	2020-12-28 05:28:38 +09:00
Juneyoung Lee	6a21ef716e	[EarlyCSE] Add tests for select form of and/or (NFC)	2020-12-28 04:19:22 +09:00
Juneyoung Lee	d6b69a0f4a	[GVN] Add tests for select form of and/or (NFC)	2020-12-28 03:39:57 +09:00
Florian Hahn	3e4c097377	[LV] Set up branch from middle block earlier. Previously the branch from the middle block to the scalar preheader & exit was being set-up at the end of skeleton creation in completeLoopSkeleton. Inserting SCEV or runtime checks may result in LCSSA phis being created, if they are required. Adjusting branches afterwards may break those PHIs. To avoid this, we can instead create the branch from the middle block to the exit after we created the middle block, so we have the final CFG before potentially adjusting/creating PHIs. This fixes a crash for the included test case. For the non-crashing case, this is almost a NFC with respect to the generated code. The only change is the order of the predecessors of the involved branch targets. Note an assertion was moved from LoopVersioning() to LoopVersioning::versionLoop. Adjusting the branches means loop-simplify form may be broken before constructing LoopVersioning. But LV only uses LoopVersioning to annotate the loop instructions with !noalias metadata, which does not require loop-simplify form. This is a fix for an existing issue uncovered by D93317.	2020-12-27 18:21:12 +00:00
Kazu Hirata	c8e09ced69	[Transforms] Use llvm::append_range (NFC)	2020-12-27 09:57:29 -08:00
Kazu Hirata	b3da88360d	[CodeGen, Transforms] Use *Map::lookup (NFC)	2020-12-27 09:57:27 -08:00
Kazu Hirata	41179ce945	[llvm-cov] Use is_contained (NFC)	2020-12-27 09:57:25 -08:00
Nikita Popov	bc7d9c8c8c	[PatternMatch][LVI] Handle select-form and/or in LVI Following the discussion in D93065, this adds m_LogicalAnd() and m_LogicalOr() matchers, that match A && B and A || B logical operations, either as bitwise operations or select expressions. As an example usage, LVI is adapted to use these matchers for its condition reasoning. The plan here is to switch other parts of LLVM that reason about and/or of conditions to also support the select forms, and then merge D93065 (or a variant thereof) to disable the poison-unsafe select to and/or transform. Differential Revision: https://reviews.llvm.org/D93827	2020-12-27 17:39:02 +01:00
Nikita Popov	f523232389	[AArch64] Fix legalization of i128 ctpop without neon If neon is disabled, LowerCTPOP will return SDValue() to indicate that normal legalization should be used. However, ReplaceNodeResults does not check for this and pushes the empty SDValue() onto the result vector, which will subsequently result in a crash. Differential Revision: https://reviews.llvm.org/D93825	2020-12-27 17:24:41 +01:00
David Green	4cea4fafb0	[AArch64] Add some anyextend testing. NFC This cleans up and regenerates the NEON addw/addl/subw/subl/mlal etc tests, adding some tests that turn the zext into anyextend using an and mask.	2020-12-27 13:36:03 +00:00
David Green	ff897025eb	[ARM] Add some NEON anyextend testing. NFC This cleans up and regenerates the NEON addw/addl/subw/subl/mlal etc tests, adding some tests that turn the zext into anyextend using an and mask.	2020-12-27 13:18:10 +00:00
Amara Emerson	87ab42c12b	[GlobalISel] Fix assertion failures after "GlobalISel: Return APInt from getConstantVRegVal" landed. APInt binary ops don't promote types but instead assert, which a combine was relying on.	2020-12-26 23:51:44 -08:00
Craig Topper	4bcdc4ccb1	[X86] Remove X86Fmadd SDNode from tablegen. Use standard fma instead. NFC I guess I missed this in 4252f7773a5b98b825d17e5f77c7d349cb2fb7c7 when I modified most patterns.	2020-12-26 23:41:46 -08:00
Craig Topper	dc370e1d14	[RISCV] Improve VMConstraint checking on more unary and nullary instructions. We weren't consistently marking unary instructions as OneInput and vid.v is really ZeroInput but we had no way to mark that. This patch improves this by removing the error-prone OneInput constraint. Instead we just always look for the mask in the last operand. It appears that the "CheckReg" variable used for the check on the broken instruction was uninitialized or garbage because it was also used for VS1/VS2 constraints. I've scoped the variable locally to each check now. I've gone through and set NoConstraint on instructions that don't have a real VMConstraint and don't have a mask as the last operand. I've also removed the unused enum values in RISCVBaseInfo.h. We never use them in C++ and we have separate versions in a td file. Reviewed By: HsiangKai Differential Revision: https://reviews.llvm.org/D93784	2020-12-26 18:47:59 -08:00
Arthur Eubanks	9d5dc949f8	[test] Pin some tests to legacy PM These all have NPM RUN lines.	2020-12-26 13:46:02 -08:00
Nikita Popov	600672ffcb	[CVP] Add tests for select form of and/or (NFC) This tests their handling inside LVI. See D93065 for wider context.	2020-12-26 21:48:24 +01:00
Kazu Hirata	4eb8419cf3	[llvm-cov, llvm-symbolizer] Use llvm::erase_if (NFC)	2020-12-26 12:06:27 -08:00
Kazu Hirata	46ef2cf21f	[TableGen] Use llvm::erase_if (NFC)	2020-12-26 12:06:26 -08:00
Kazu Hirata	45712d80ac	[llvm-pdbutil] Use llvm::is_contained (NFC)	2020-12-26 12:06:24 -08:00
Nathan James	72df03bec7	[NFC] Refactor some SourceMgr code	2020-12-26 17:53:32 +00:00
Sanjay Patel	3f4cf6da80	[SLP] rename reduction variables for readability; NFC I am hoping to extend the reduction matching code, and it is hard to distinguish "ReductionData" from "ReducedValueData". So extend the tree/root metaphor to include leaves. Another problem is that the name "OperationData" does not provide insight into its purpose. I'm not sure if we can alter that underlying data structure to make the code clearer.	2020-12-26 11:20:25 -05:00
Sanjay Patel	85a38d4e26	[SLP] use switch to improve readability; NFC This will get more complicated when we handle intrinsics like maxnum.	2020-12-26 10:59:45 -05:00
Nikita Popov	ebc3162468	[ValueTracking] Handle more non-trivial conditions in isKnownNonZero() In 35676a4f9a536a2aab768af63ddbb15bc722d7f9 I've added handling for non-trivial dominating conditions that imply non-zero on the true branch. This adds the same support for the false branch. The changes in pr45360.ll change block ordering and naming, but don't change the control flow. The urem is still guarded by a non-zero check correctly.	2020-12-26 15:48:04 +01:00
Nikita Popov	baf08259be	[ValueTracking] Add more known non zero tests (NFC) Add tests for non-trivial conditions that imply non-zero on the false branch rather than the true branch. The last case already folds due to canonicalization.	2020-12-26 15:48:04 +01:00
Monk Chiang	20dc5d0c2f	[RISCV] Define vector widening reduction intrinsic. Define vwredsumu/vwredsum/vfwredosum/vfwredsum We work with @rogfer01 from BSC to come out this patch. Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com> Co-Authored-by: Zakk Chen <zakk.chen@sifive.com> Differential Revision: https://reviews.llvm.org/D93807	2020-12-26 21:42:30 +08:00

208892 Commits