llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-26 04:32:44 +01:00

Author	SHA1	Message	Date
Roman Lebedev	438e79af49	[X86][CostModel] X86TTIImpl::getShuffleCost(): subvector insertions are cheap This is similar to the subvector extractions, except that the 0'th subvector isn't free to insert, because we generally don't know whether or not the upper elements need to be preserved: https://godbolt.org/z/rsxP5W4sW This is needed to avoid regressions in D100684 Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D100698	2021-04-19 13:24:58 +03:00
Fraser Cormack	d35a61570c	[RISCV] Lower vector shuffles to vrgather operations This patch extends the lowering of RVV fixed-length vector shuffles to avoid the default stack expansion and instead lower to vrgather instructions. For "permute"-style shuffles where one vector is swizzled, we can lower to one vrgather. For shuffles involving two vector operands, we lower to one unmasked vrgather (or splat, where appropriate) followed by a masked vrgather which blends in the second half. On occasion, when it's not possible to create a legal BUILD_VECTOR for the indices, we use vrgatherei16 instructions with 16-bit index types. For 8-bit element vectors where we may have indices over 255, we have a fairly blunt fallback to the stack expansion to avoid custom-splitting of the vector types. To enable the selection of masked vrgather instructions, this patch extends the various RISCVISD::VRGATHER nodes to take a passthru operand. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D100549	2021-04-19 11:13:13 +01:00
Kerry McLaughlin	23bab76d04	[NFC] Add tests for scalable vectorization of loops with in-order reductions D98435 added support for in-order reductions and included tests for fixed-width vectorization with the -enable-strict-reductions flag. This patch adds similar tests to verify support for scalable vectorization of loops with in-order reductions. Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D100385	2021-04-19 11:15:55 +01:00
OCHyams	9028e71ae0	[DebugInfo] Replace debug uses in replaceUsesOutsideBlock Value::replaceUsesOutsideBlock doesn't replace debug uses which leads to an unnecessary reduction in variable location coverage. Fix this, add a unittest for it, and add a regression test demonstrating the change through instcombine's replacedSelectWithOperand. Reviewed By: djtodoro Differential Revision: https://reviews.llvm.org/D99169	2021-04-19 11:06:53 +01:00
OCHyams	fbe80b0c21	[DebugInfo] Move the findDbg* functions into DebugInfo.cpp Move the findDbg* functions into lib/IR/DebugInfo.cpp from lib/Transforms/Utils/Local.cpp. D99169 adds a call to a function (findDbgUsers) that lives in lib/Transforms/Utils/Local.cpp (LLVMTransformUtils) from lib/IR/Value.cpp (LLVMCore). The Core lib doesn't include TransformUtils. The buildbots caught this here: https://lab.llvm.org/buildbot/#/builders/109/builds/12664. This patch moves the function, and the 3 similar ones for consistency, into DebugInfo.cpp which is part of LLVMCore. Reviewed By: dblaikie, rnk Differential Revision: https://reviews.llvm.org/D100632	2021-04-19 10:30:25 +01:00
Clement Courbet	09f6560512	[llvm-exegesis] Honor -mcpu in analysis mode. This is useful to set the baseline model for an unknown CPU. Fixes PR50013. Differential Revision: https://reviews.llvm.org/D100743	2021-04-19 10:44:28 +02:00
David Sherwood	40e085b6bd	[CodeGen] Improve code generation for clamping of constant indices with scalable vectors When trying to clamp a constant index into a scalable vector we can test if the index is less than the minimum number of elements in the vector. If so, we can simply return the index because we know it is guaranteed to fit inside the vector. Differential Revision: https://reviews.llvm.org/D100639	2021-04-19 08:34:17 +01:00
Serguei Katkov	3c8b0754c7	[GreedyRA ORE] Add stats for copy of virtual registers. Greedy RA adds copies of virtual registers when splitting live interval. This stat might be useful. Reviewers: reames, MatzeB, anemet, thegameg Reviewed By: reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D100017	2021-04-19 12:43:44 +07:00
Serguei Katkov	353cafc332	[Greedy RA] Add a check to MachineVerifier If Virtual Register is alive in landing pad its def must be before the call causing the exception or it should be statepoint instruction itself and in this case def actually means the relocation of gc pointer and is alive in landing pad. The test shows the triggering this check for an option under development use-registers-for-gc-values-in-landing-pad which is off by default until it is functionally correct. Reviewers: reames, void, jyknight, nickdesaulniers, efriedma, arsenm, rnk Reviewed By: rnk Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D100525	2021-04-19 12:31:18 +07:00
Evgeniy Brevnov	8e314ca696	[CVP] processCallSite returns wrong status Recently processMinMaxIntrinsic has been added and we started to observe a number of analyses getting invalidated after CVP. The problem is that CVP conservatively returns 'true' even if there were no modifications to the IR. I found one more place besides processMinMaxIntrinsic which has the same problem. I think processMinMaxIntrinsic and similar functions should have a boolean return status to prevent similar issues from reappearing in the future. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D100538	2021-04-19 12:13:22 +07:00
Xun Li	8ba502ea6f	Revert "[Coroutines] Set presplit attribute in Clang instead of CoroEarly pass" This reverts commit fa6b54c44ab1d5f579304eadb7ac8bd7e72d0e77. The committed patch broke mlir tests. It seems that mlir tests depend on coroutine function properties set in CoroEarly pass.	2021-04-18 17:22:28 -07:00
Craig Topper	59da2ddc9a	[TableGen] Pass SmallVector to union_modes instead of returning a std::vector. The number of modes is small so this should avoid a heap allocation. Also replace std::set with SmallSet.	2021-04-18 15:59:52 -07:00
Xun Li	fcafc059cd	[Coroutines] Set presplit attribute in Clang instead of CoroEarly pass Presplit coroutines cannot be inlined. During AlwaysInliner we check if a function is a presplit coroutine, if so we skip inlining. The presplit coroutine attributes are set in CoroEarly pass. However in O0 pipeline, AlwaysInliner runs before CoroEarly, so the attribute isn't set yet and will still inline the coroutine. This causes Clang to crash: https://bugs.llvm.org/show_bug.cgi?id=49920 To fix this, we set the attributes in the Clang front-end instead of in CoroEarly pass. Reviewed By: rjmccall, ChuanqiXu Differential Revision: https://reviews.llvm.org/D100282	2021-04-18 15:41:09 -07:00
Xun Li	72ec560e95	Revert "[Coroutines] Move CoroEarly pass to before AlwaysInliner" This reverts commit 2b50f5a4343f8fb06acaa5c36355bcf58092c9cd. Forgot to update the description of the commit to sync with phabricator. Going to redo the commit.	2021-04-18 15:38:19 -07:00
Xun Li	c42f8ec702	[Coroutines] Move CoroEarly pass to before AlwaysInliner Presplit coroutines cannot be inlined. During AlwaysInliner we check if a function is a presplit coroutine, if so we skip inlining. The presplit coroutine attributes are set in CoroEarly pass. However in O0 pipeline, AlwaysInliner runs before CoroEarly, so the attribute isn't set yet and will still inline the coroutine. This causes Clang to crash: https://bugs.llvm.org/show_bug.cgi?id=49920 Differential Revision: https://reviews.llvm.org/D100282	2021-04-18 14:54:04 -07:00
Martin Storsjö	86db67416f	[lit] Fix the return code for "not not" after evaluating "not" internally This fixes cases where "not not <command>" is supposed to return only the error codes 0 or 1, but after efee57925c3f46c74c6697, it passed the original error code through. This was visible on AIX in the shtest-output-printing.py testcase, where 'wc' returns 2, while it returns 1 on other platforms, and the test required "not not" to normalize it to 1.	2021-04-19 00:37:13 +03:00
Craig Topper	fdef0cf249	[TableGen] Use MachineValueTypeSet in place of SmallSet. MachineValueTypeSet is effectively a std::bitset<256>. This allows us to quickly insert into the set and check if a type is in the set.	2021-04-18 13:38:30 -07:00
Nikita Popov	bf2498bc3f	[LoopDeletion] Add test for PR49967 (NFC) Test case for a SCEV invalidation bug caused by D100264, which has since been reverted.	2021-04-18 22:08:51 +02:00
Craig Topper	bf5c787068	[TableGen] Use range-based for loop. NFC	2021-04-18 12:41:09 -07:00
Roman Lebedev	5a174914f6	Revert "[SCEV] Model `ashr exact x, C` as `(abs(x) EXACT/u (1<<C)) * signum(x)`" As being discussed in https://reviews.llvm.org/D100721, this modelling is lossy, we can't reconstruct `ashr`/`ashr exact` from it, which means that whenever we actually expand the IR, we've just pessimized the code. It would be good to model this pattern, after all it comes up every time you want to compute a distance between two pointers, but not at this cost. This reverts commit ec54867df5e7f20e12146e628af34f0384308bcb.	2021-04-18 16:26:45 +03:00
xgupta	79c9c7433b	[Docs] Correct Boehm collector weblink in GarbageCollection.rst	2021-04-18 17:30:17 +05:30
LLVM GN Syncbot	60e1274472	[gn build] Port 01ace074fcb6	2021-04-18 11:35:28 +00:00
Florian Hahn	a77ecd76f3	[IndVarSimplify] Add test requiring ashr expansion. Add test cases showing large ashr expansion during IndVarSimplify after ec54867df5e7.	2021-04-18 12:28:49 +01:00
Roman Lebedev	f06347058d	[NFC][X86][CostModel] Rewrite load_store.ll Test SSE41, since that added float/i64/i32/i8 inserts/extracts. Don't forget to test vectors of pointers. Do test byte-aligned loads/stores. Fixup test coverage to be rather more exhaustive, testing all reasonable element sizes vs element counts permutations that fit up to within ZMM.	2021-04-18 11:12:36 +03:00
Roman Lebedev	6861dd5bf7	[NFC][LoopVectorize] Autogenerate check lines in X86/gather_scatter.ll test	2021-04-18 10:26:16 +03:00
Juneyoung Lee	49d8db7466	Update InstCombine to use undef matcher instead This is a patch to use m_Undef() matcher instead of isa<UndefValue>(). As suggested in D100122, this update is separately committed.	2021-04-18 11:05:36 +09:00
Juneyoung Lee	e279a5783d	Update m_Undef to match vectors/aggrs with undefs and poisons mixed This fixes https://reviews.llvm.org/D93990#2666922 by teaching `m_Undef` to match vectors/aggrs with poison elements. As suggested, fixes in InstCombine files to use the `m_Undef` matcher instead of `isa<UndefValue>` will be followed. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D100122	2021-04-18 10:57:04 +09:00
Florian Hahn	bd2bad88ac	[ADT] Update RPOT to work with specializations of different types. At the moment, ReversePostOrderTraversal performs a post-order walk on the entry node of the passed in graph, rather than the graph type itself. If GT::NodeRef is the same as GraphT, everything works as expected and this is the case for the current uses in-tree. But it does not work as expected if GraphT != GT::NodeRef. In that case, we either fail to build (if there is no GraphTrait specialization for GT::NodeRef) or we pick the GraphTrait specialization for GT::NodeRef, instead of the specialization of GraphT. Both the depth-first and post-order iterators pick the expected specialization and this patch updates ReversePostOrderTraversal to delegate to po_begin & po_end to pick the right specialization, rather than forcing using GraphTraits<GT::NodeRef>, by first getting the entry node. This makes `ReversePostOrderTraversal<Graph<6>> RPOT(G);` build and work as expected in the test. Reviewed By: dexonsmith Differential Revision: https://reviews.llvm.org/D100169	2021-04-17 20:45:04 +01:00
Nikita Popov	ac5625799c	[LoopUnroll] Regenerate test checks (NFC)	2021-04-17 20:59:20 +02:00
Nikita Popov	cdf04cd5e0	[LoopUnroll] Make some tests more robust (NFC) Replace branch on undef by branch on unknown condition.	2021-04-17 20:59:20 +02:00
Lang Hames	c2126ffe7e	[JITLink] Add testcase that was accidentally left out of 19e402d2b34.	2021-04-17 11:55:55 -07:00
Alexandre Ganea	5098c69ed1	[Support] ThreadPool tests: silence warning unused variable 'It'	2021-04-17 14:22:50 -04:00
Craig Topper	e51b3ceb31	[TableGen] Remove local SmallSet from TypeSetByHwMode::insert. This keeps track of which modes are in VVT so we can find out if a mode is missing later. But we can just ask VVT whether it has a particular mode.	2021-04-17 10:48:57 -07:00
Florian Hahn	dce8472e88	[ADT] Take graph as const & in some post-order iterators (NFC). This patch updates a couple of functions that unnecessarily took the input graph by value, when it was not needed. They can take the graph by const-reference instead, which does not require GraphT to provide a copy constructor. Split off from D100169.	2021-04-17 17:05:24 +01:00
Yaxun (Sam) Liu	2596649ce8	[AMDGPU] Add GlobalDCE before internalization pass The internalization pass only internalizes global variables with no users. If the global variable has some dead user, the internalization pass will not internalize it. To be able to internalize global variables with dead users, a global dce pass is needed before the internalization pass. This patch adds that. Reviewed by: Artem Belevich, Matt Arsenault Differential Revision: https://reviews.llvm.org/D98783	2021-04-17 11:25:25 -04:00
Nikita Popov	875a81f11e	[LICM] Add more tests for promotion and capture (NFC) We could optimize the first case, as the pointer is captured only after the loop.	2021-04-17 16:57:15 +02:00
Florian Hahn	b26528306e	[SimplifyCFG] Skip dbg intrinsics when checking for branch-only BBs. Debug intrinsics are free to hoist and should be skipped when looking for terminator-only blocks. As a consequence, we have to delegate to the main hoisting loop to hoist any dbg intrinsics instead of jumping to the terminator case directly. This fixes PR49982. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D100640	2021-04-17 15:17:50 +01:00
Nikita Popov	174ac0349e	[Inline] Don't add noalias metadata to inaccessiblememonly calls It will not do anything useful for them, as we already know that they don't modref with any accessible memory. In particular, this prevents noalias metadata from being placed on noalias.scope.decl intrinsics. This reduces the amount of metadata needed, and makes it more likely that unnecessary decls can be eliminated.	2021-04-17 14:56:13 +02:00
Simon Pilgrim	47aaabf409	[Support] AbsoluteDifference - add brackets to appease static analyzer warning. NFCI.	2021-04-17 13:47:02 +01:00
Serge Guelton	b2fe6dcbc2	Normalize interaction with boolean attributes Such attributes can either be unset, or set to "true" or "false" (as string). Throughout the codebase, this led to inelegant checks ranging from if (Fn->getFnAttribute("no-jump-tables").getValueAsString() == "true") to if (Fn->hasAttribute("no-jump-tables") && Fn->getFnAttribute("no-jump-tables").getValueAsString() == "true") Introduce a getValueAsBool that normalizes the check, with the following behavior: no attribute or attribute set to "false" => return false; attribute set to "true" => return true Differential Revision: https://reviews.llvm.org/D99299	2021-04-17 08:17:33 +02:00
Craig Topper	da04af3736	[TableGen] Replace two SmallDenseSets with SmallSets. The key here is HwMode indices. They're going to be small numbers, contiguous, and only a few different values. I don't think we need to go through the SmallDenseSet hashing. A BitVector would be even better, but we don't have the upper bound here.	2021-04-16 17:57:53 -07:00
Nemanja Ivanovic	d506c40420	[PowerPC] Minor improvement for insert_vector_elt codegen For v2f64, all VSX subtargets can insert an element with a single XXPERMDI.	2021-04-16 18:52:37 -05:00
Philip Reames	e84c42802f	[inferattrs] Don't infer lib func attributes for nobuiltin functions If we have a nobuiltin function, we can't assume we know anything about the implementation. I noticed this when tracing through a log from an in-the-wild miscompile (https://github.com/emscripten-core/emscripten/issues/9443) triggered after 8666463. We were incorrectly assuming that a custom allocator could not free. (It's not clear yet this is the only problem in said issue.) I also noticed something similar mentioned in the commit message of ab243e when scrolling back through history. Though, from what I can tell, that commit fixed the symptom, not the root cause. The interface we have for library function detection is extremely error prone, but given the interaction between ``nobuiltin`` decls and ``builtin`` callsites, it's really hard to imagine something much cleaner. I may iterate on that, but it'll be invasive enough I didn't want to hold an obvious functional fix on it.	2021-04-16 15:36:15 -07:00
Nico Weber	af14859209	[gn build] (manually) port ca6751043d88 better	2021-04-16 18:16:29 -04:00
Craig Topper	d27dd156eb	[TableGen] Run GenerateVariants before ExpandHwModeBasedTypes. A large portion of the patterns are duplicated for HwMode on RISCV. If we expand HwMode first, we need to check nearly twice as many patterns for variants. HwModes shouldn't affect whether a variant is valid so we should be able to expand after. This also reduces the RISCV isel table by 539 bytes due to factoring working better on this pattern order. Unfortunately it increases Hexagon table size by ~50 bytes. But I think this is a reasonable trade.	2021-04-16 15:05:33 -07:00
Nico Weber	b82ca2a3f3	[gn build] (manually) port ca6751043d88	2021-04-16 18:03:44 -04:00
Philip Reames	9e6192bcdf	[funcattrs] Add the maximal set of implied attributes to definitions Have funcattrs expand all implied attributes into the IR. This expands the infrastructure from D100400, but for definitions not declarations this time. Somewhat subtly, this mostly isn't semantic. Because the accessors did the inference, any client which used the accessor was already getting the stronger result. Clients that directly checked presence of attributes (there are some), will see a stronger result now. The old behavior can end up quite confusing for two reasons: * Without this change, we have situations where function-attrs appears to fail when inferring an attribute (as seen by a human reading IR), but that consuming code will see that it should have been implied. As a human trying to sanity check test results and study IR for optimization possibilities, this is exceedingly error prone and confusing. (I'll note that I wasted several hours recently because of this.) * We can have transforms which trigger without the IR appearing (on inspection) to meet the preconditions. This change doesn't prevent this from happening (as the accessors still involve multiple checks), but it should make it less frequent. I'd argue in favor of deleting the extra checks out of the accessors after this lands, but I want that in its own review as a) it's purely stylistic, and b) I already know there's some disagreement. Once this lands, I'm also going to do a cleanup change which will delete some now redundant duplicate predicates in the inference code, but again, that deserves to be a change of its own. Differential Revision: https://reviews.llvm.org/D100226	2021-04-16 14:22:19 -07:00
serge-sans-paille	5678d281d2	Simplify BitVector code Instead of managing memory by hand, delegate it to std::vector. This makes the code much simpler, and also avoids repeatedly computing the storage size. According to valgrind --tool=callgrind, this also slightly decreases the instruction count, but by a small margin. This is a recommit of 82f0e3d3ea6bf927e3397b2fb423abbc5821a30f with one usage fixed in llvm/lib/CodeGen/RegisterScavenging.cpp. Note the slight API change: BitVector::clear() now has the same behavior as any other container: it does not free memory, but indeed sets the size of the BitVector to 0. It is thus incorrect to access its content right afterwards, a scenario which wasn't enforced in the previous implementation. Differential Revision: https://reviews.llvm.org/D100387	2021-04-16 22:48:33 +02:00
Fangrui Song	adec14b680	[TableGen] Fix -Wparentheses	2021-04-16 13:37:52 -07:00
Joe Nash	549eeb882a	[AMDGPU] Remove redundant field from DPP8 def These lines set the value to what it already was, so they are redundant. NFC Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D100664 Change-Id: Ibf6f27d50a7fa1f76c127f01b799821378bfd3b3	2021-04-16 16:23:52 -04:00
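
The boolean-attribute normalization described in commit b2fe6dcbc2 can be illustrated with a small standalone sketch. This is not LLVM's implementation: the AttrMap type and the attrAsBool and checkExamples helpers are hypothetical names invented here, and the attribute name "use-soft-float" is used only as sample data. The sketch reproduces just the behavior stated in the commit message: a missing attribute or the value "false" yields false, and the value "true" yields true.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a function's string attributes (not LLVM's API).
using AttrMap = std::map<std::string, std::string>;

// Collapses the three states (unset, "false", "true") into one boolean:
// only an attribute that is present and set to "true" returns true.
bool attrAsBool(const AttrMap &Attrs, const std::string &Name) {
  auto It = Attrs.find(Name);
  if (It == Attrs.end())
    return false; // unset attribute
  return It->second == "true";
}

// Usage example covering the three cases listed in the commit message.
void checkExamples() {
  AttrMap Attrs{{"no-jump-tables", "true"}, {"use-soft-float", "false"}};
  assert(attrAsBool(Attrs, "no-jump-tables"));  // set to "true"
  assert(!attrAsBool(Attrs, "use-soft-float")); // set to "false"
  assert(!attrAsBool(Attrs, "missing"));        // unset
}
```

With a helper of this shape, a caller needs a single call instead of pairing a presence check with a string comparison, which is the simplification the commit message describes.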


214318 Commits