llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-01-31 20:51:52 +01:00

Author	SHA1	Message	Date
Matt Arsenault	93d37e8051	OpaquePtr: Update more tests to use typed sret	2020-11-20 20:08:43 -05:00
Matt Arsenault	f8cfaf8c10	OpaquePtr: Bulk update tests to use typed sret	2020-11-20 17:58:26 -05:00
Matt Arsenault	4bf7d5872e	OpaquePtr: Bulk update tests to use typed byval Upgrade of the IR text tests should be the only thing blocking making typed byval mandatory. Partially done through regex and partially manual.	2020-11-20 14:00:46 -05:00
Sanjay Patel	0d725bd74e	[CostModel] mostly remove cost-kind predicate for intrinsics in basic TTI implementation This is re-applying a combination of f7eac51b9b3f and 8ec7ea3ddce7 as one patch to avoid regressions now that we have better testing in place. Those were reverted with 32dd5870ee31 because of crashing in experimental intrinsics. That bug should be fixed with 7ae346434. Paraphrased original commit messages: This is the last step in removing cost-kind as a consideration in the basic class model for intrinsics. See D89461 for the start of that. Subsequent commits dealt with each of the special-case intrinsics that had customization here in the basic class. This should remove a barrier to retrying D87188 (canonicalization to the abs intrinsic). The ARM and x86 cost diffs seen here may be wrong because the target-specific overrides have their own bugs, but we hope this is less wrong - if something has a significant throughput cost, then it should have a significant size / blended cost too by default. The only behavioral diff in current regression tests is shown in the x86 scatter-gather test (which is misplaced or broken because it runs the entire -O3 pipeline) - we unrolled less, and we assume that is an improvement. Exception: in general, we want the size cost for a scalar call to be cheap even if the other costs are expensive - we expect it to just be a branch with some optional stack manipulation. It is likely that we will want to carve out some exceptions/overrides to this rule as follow-up patches for calls that have some general and/or target-specific difference to the expected lowering. This was noticed as a regression in unrolling, so we have a test for that now along with a couple of direct cost model tests. If the assumed scalarization costs for the oversized vector calls are not realistic, that would be another follow-up refinement of the cost models. Differential Revision: https://reviews.llvm.org/D90554	2020-11-20 11:21:10 -05:00
Sanjay Patel	7e28c46bfe	[CostModel] avoid crashing while finding scalarization overhead The constrained intrinsics have metadata arguments, so the tests here were crashing as noted in D90554 (and that was reverted even though this bug exists independently of that change).	2020-11-20 10:18:29 -05:00
Sanjay Patel	5c4c695a34	[CostModel] add tests for math library calls; NFC This is a partial un-revert of 32dd5870ee31 (originally df09f82599 ). I'm adding back the baseline tests first, so we don't have to back-track as much in case there are still problems.	2020-11-20 08:24:49 -05:00
Max Kazantsev	39c3beff62	[Test] Auto-update checks in a test	2020-11-20 16:53:51 +07:00
Max Kazantsev	c19e519d70	[Test] Add tests demonstrating a bug in SCEV, PR48225 Slightly simplified version of original test reported by Congzhe Cao.	2020-11-20 15:59:22 +07:00
Eric Christopher	e7321a134d	Temporarily Revert "[CostModel] remove cost-kind predicate for intrinsics in basic TTI implementation" as it's causing crashes in the optimizer. A reduced testcase has been posted as a follow-up. This reverts commit f7eac51b9b3f780c96ca41913293851c5acb465b. Temporarily Revert "[CostModel] make default size cost for libcalls small (again)" as it depends upon the primary revert. This reverts commit 8ec7ea3ddce7379e13e8dfb4a5260a6d2004aa1c. Temporarily Revert "[CostModel] add tests for math library calls; NFC" as it depends upon the primary revert. This reverts commit df09f825995b10da03f148133c119f52c94fd6e4. Temporarily Revert "[LoopUnroll] add test for full unroll that is sensitive to cost-model; NFC" as it depends upon the primary revert. This reverts commit 618d555e8d926a83161774df2035519c387269db.	2020-11-19 22:10:23 -08:00
Artur Pilipenko	d3849bdc93	[BasicAA] Deoptimize intrinsics don't modify memory Similarly to assumes and guards deoptimize intrinsics are marked as writing to ensure proper control dependencies but they never modify any particular memory location. Differential Revision: https://reviews.llvm.org/D91658	2020-11-19 12:08:33 -08:00
Mircea Trofin	e1027a640d	[FileCheck] Disallow unused prefixes in llvm/test/Analysis This is achieved through a substitution of FileCheck in lit.cfg.py, where we explicitly set -allow-unused-prefixes to false. We also introduce a %FileCheckWithUnusedPrefixes% substitution that can be used in those cases where we want to allow unused prefixes, even if the folder policy is to disallow them. Differential Revision: https://reviews.llvm.org/D91275	2020-11-19 07:56:35 -08:00
Nikita Popov	86fecd5fbf	[BasicAA] Generalize base offset modulus handling The GEP aliasing implementation currently has two pieces of code that solve two different subsets of the same basic problem: If you have GEPs with offsets 4x + 0 and 4y + 1 (assuming access size 1), then they do not alias regardless of whether x and y are the same. One implementation is in aliasSameBasePointerGEPs(), which looks at this in a limited structural way. It requires both GEP base pointers to be exactly the same, then (optionally) a number of equal indexes, then an unknown index, then a non-equal index into a struct. This set of limitations works, but it's overly restrictive and hides the core property we're trying to exploit. The second implementation is part of aliasGEP() itself and tries to find a common modulus in the scales, so it can then check that the constant offset doesn't overlap under modular arithmetic. The second implementation has the right idea of what the general problem is, but effectively only considers power of two factors in the scales (while aliasSameBasePointerGEPs also works with non-pow2 struct sizes.) What this patch does is to adjust the aliasGEP() implementation to instead find the largest common factor in all the scales (i.e. the GCD) and use that as the modulus. Differential Revision: https://reviews.llvm.org/D91027	2020-11-18 21:48:49 +01:00
Nikita Popov	4d1322fde5	[BasicAA] Remove assert in AA evaluator As reported in https://reviews.llvm.org/D91383#2401825, this assert breaks external -aa-eval tests. We'll have to fix this case before re-enabling it.	2020-11-18 20:04:38 +01:00
Craig Topper	ce7901a517	[X86] Use GF2P8AFFINEQB to implement vector bitreverse. We can use GF2P8AFFINEQB to reverse bits in a byte. Shuffles are needed to reverse the bytes in elements larger than i8. LegalizeVectorOps takes care of inserting the shuffle for the larger element size. We already have Custom lowering for v16i8 with SSSE3, v32i8 with AVX, and v64i8 with AVX512BW. I think we might be able to use this for scalars too by moving into a vector and back. But I'll save that for a follow-up as it's a little more involved. Reviewed By: RKSimon, pengfei Differential Revision: https://reviews.llvm.org/D91515	2020-11-17 23:49:06 -08:00
Arthur Eubanks	7b22dbd090	[JumpThreading] Make -print-lvi-after-jump-threading work with NPM	2020-11-17 23:15:20 -08:00
Wei Wang	07da148882	[BPI] Look through bitcasts in calcZeroHeuristic Constant hoisting may hide the constant value behind bitcast for And's operand. Track down the constant to make the BFI result consistent regardless of hoisting. Differential Revision: https://reviews.llvm.org/D91450	2020-11-17 09:33:05 -08:00
Nikita Popov	0fa89d71cb	[BasicAA] Make alias GEP positive offset handling symmetric aliasGEP() currently implements some special handling for the case where all variable offsets are positive, in which case the constant offset can be taken as the minimal offset. However, it does not perform the same handling for the all-negative case. This means that the alias-analysis result between two GEPs is asymmetric: If GEP1 - GEP2 is all-positive, then GEP2 - GEP1 is all-negative, and the first will result in NoAlias, while the second will result in MayAlias. Apart from producing sub-optimal results for one order, this also violates our caching assumption. In particular, if BatchAA is used, the cached result depends on the order of the GEPs in the first query. This results in an inconsistency in BatchAA and AA results, which is how I noticed this issue in the first place. Differential Revision: https://reviews.llvm.org/D91383	2020-11-17 18:05:34 +01:00
Caroline Concatto	5d2395e6d6	[AArch64] Add check for widening instruction for SVE. This patch fixes the function isWideningInstruction for scalable vectors. Now the cost model can check the widening pattern for SVE. Differential Revision: https://reviews.llvm.org/D91260	2020-11-16 12:30:08 +00:00
Florian Hahn	a34e7402ed	[MemorySSA] Add pointer decrement loop clobber test case.	2020-11-15 18:00:01 +00:00
Nikita Popov	9dd51839f7	Revert "[SCEV] Factor out part of wrap flag detection logic [NFC-ish]" This reverts commit 1ec6e1eb8a084bffae8a40236eb9925d8026dd07. This change causes a significant compile-time regression: https://llvm-compile-time-tracker.com/compare.php?from=dd0b8b94d0796bd895cc998dd163b4fbebceb0b8&to=1ec6e1eb8a084bffae8a40236eb9925d8026dd07&stat=instructions I assume that this is due to the non-NFC part of the change, which now performs expensive nowrap inference even for nowrap flags that are not used by the particular code.	2020-11-15 10:19:44 +01:00
Philip Reames	3c9c657cb9	[SCEV] Factor out part of wrap flag detection logic [NFC-ish] In an effort to make code around flag determination more readable, and (possibly) prepare for a follow-up change, factor out some of the flag detection logic. In the process, reduce the number of locations we mutate wrap flags by a couple. Note that this isn't NFC. The old code tried for NSW xor (NUW || NW). That is, two different paths computed different sets of wrap flags. The new code will try for all three. The result is that some expressions end up with a few extra flags set.	2020-11-14 19:21:05 -08:00
Nikita Popov	7c3186ef2c	[BasicAA] Remove unnecessary size limitation We're dropping a common offset from both GEPs here. It's not necessary for the access sizes to be the same as well.	2020-11-14 16:51:31 +01:00
Sanjay Patel	47294cfe00	[CostModel] make default size cost for libcalls small (again) This was changed recently with D90554 / f7eac51b9b3f ...because we had a regression testing blindspot for intrinsics that are expected to be lowered to libcalls. In general, we want the size cost for a scalar call to be cheap even if the other costs are expensive - we expect it to just be a branch with some optional stack manipulation. It is likely that we will want to carve out some exceptions/overrides to this rule as follow-up patches for calls that have some general and/or target-specific difference to the expected lowering. This was noticed as a regression in unrolling, so we have a test for that now along with a couple of direct cost model tests. If the assumed scalarization costs for the oversized vector calls are not realistic, that would be another follow-up refinement of the cost models.	2020-11-14 08:15:35 -05:00
Sanjay Patel	1fb5766336	[CostModel] add tests for math library calls; NFC	2020-11-14 08:15:35 -05:00
Simon Pilgrim	ba4efb3482	[CostModel][X86] Remove unused CHECK prefixes Allows us to remove the "CHECK: {{^}}" hack and help simplify D91275	2020-11-13 17:31:48 +00:00
Nikita Popov	318f4a3446	[SCEV] Fix nsw flags for GEP expressions The SCEV code for constructing GEP expressions currently assumes that the addition of the base and all the offsets is nsw if the GEP is inbounds. While the addition of the offsets is indeed nsw, the addition to the base address is not, as the base address is interpreted as an unsigned value. Fix the GEP expression code to not assume nsw for the base+offset calculation. However, do assume nuw if we know that the offset is non-negative. With this, we use the same behavior as the construction of GEP addrecs does. (Modulo the fact that we disregard SCEV unification, as the pre-existing FIXME points out). Differential Revision: https://reviews.llvm.org/D90648	2020-11-13 18:19:32 +01:00
Nikita Popov	86a2a999fb	[BasicAA] Remove checks for GEP decomposition limit reached The GEP aliasing code currently checks for the GEP decomposition limit being reached (i.e., we did not reach the "final" underlying object). As far as I can see, these checks are not necessary. It is perfectly fine to work with a GEP whose base can still be further decomposed. Looking back through the commit history, these checks were originally introduced in 1a444489e9d90915cfdda0720489893896ef1503. However, I believe that the problem this was intended to address was later properly fixed with 1726fc698ccb85fe4bb23c200a50f28b57fc53cb, and the checks are no longer necessary since then (and were not the right fix in the first place). Differential Revision: https://reviews.llvm.org/D91010	2020-11-12 20:43:38 +01:00
Jamie Schmeiser	b7526c6b0c	Reland: Introduce -dot-cfg-mssa option which creates dot-cfg style file with mssa comments included in source Summary: Expand the print-memoryssa and print<memoryssa> passes with a new hidden option -cfg-dot-mssa that names a file. When set, a dot-cfg style file will be generated into the named file with the memoryssa comments retained and those blocks containing them shown in light pink. The option does nothing in isolation. Author: Jamie Schmeiser <schmeise@ca.ibm.com> Reviewed By: asbirlea (Alina Sbirlea), dblaikie (David Blaikie) Differential Revision: https://reviews.llvm.org/D90638	2020-11-12 17:39:14 +00:00
Anh Tuyen Tran	27b1315bd4	Revert "Introduce -dot-cfg-mssa option which creates dot-cfg style file with mssa comments included in source" This reverts commit 45d459e7522ddc512ac70c4c822d58d335099672 due to build issue in Poly.	2020-11-12 15:48:14 +00:00
Jamie Schmeiser	73817f396c	Introduce -dot-cfg-mssa option which creates dot-cfg style file with mssa comments included in source Summary: Expand the print-memoryssa and print<memoryssa> passes with a new hidden option -cfg-dot-mssa that names a file. When set, a dot-cfg style file will be generated into the named file with the memoryssa comments retained and those blocks containing them shown in light pink. The option does nothing in isolation. Author: Jamie Schmeiser <schmeise@ca.ibm.com> Reviewed By: asbirlea (Alina Sbirlea), dblaikie (David Blaikie) Differential Revision: https://reviews.llvm.org/D90638	2020-11-12 15:41:16 +00:00
Caroline Concatto	bc83bf6558	[AArch64] Add memory op cost model for SVE This patch adds/fixes memory op cost model for SVE with fixed-width vector. Differential Revision: https://reviews.llvm.org/D90950	2020-11-11 12:49:19 +00:00
Simon Pilgrim	bf3890c14f	[ValueTracking] assume-queries-counter.ll - remove unused check prefix	2020-11-10 14:31:03 +00:00
Simon Pilgrim	bd95b92516	[BasicAA] phi-values-usage.ll - remove unused check prefix	2020-11-10 14:31:03 +00:00
Simon Pilgrim	779cc24d90	[ScalarEvolution] Remove unused check prefixes	2020-11-10 14:31:02 +00:00
Sanjay Patel	6d6238699e	[CostModel] remove cost-kind predicate for intrinsics in basic TTI implementation This is the last step in removing cost-kind as a consideration in the basic class model for intrinsics. See D89461 for the start of that. Subsequent commits dealt with each of the special-case intrinsics that had customization here in the basic class. This should remove a barrier to retrying D87188 (canonicalization to the abs intrinsic). The ARM and x86 cost diffs seen here may be wrong because the target-specific overrides have their own bugs, but we hope this is less wrong - if something has a significant throughput cost, then it should have a significant size / blended cost too by default. The only behavioral diff in current regression tests is shown in the x86 scatter-gather test (which is misplaced or broken because it runs the entire -O3 pipeline) - we unrolled less, and we assume that is an improvement. Differential Revision: https://reviews.llvm.org/D90554	2020-11-10 08:19:31 -05:00
Simon Pilgrim	09e69baf05	[CostModel][ARM] Remove unused check-prefix	2020-11-10 13:10:12 +00:00
Simon Pilgrim	bee3f9451f	[CostModel][AArch64] Remove unused check-prefix	2020-11-10 13:10:11 +00:00
Simon Pilgrim	bb7e18f9de	[CostModel][X86] Remove unused check-prefixes	2020-11-10 12:48:35 +00:00
Max Kazantsev	b2b8b21ee4	[SCEV] Drop cached ranges of AddRecs after flag update Our range computation methods benefit from no-wrap flags. But if the ranges were first computed before the flags were set, the cached range will be too pessimistic. We need to drop cached ranges whenever we sharpen AddRec's no wrap flags. Differential Revision: https://reviews.llvm.org/D89847 Reviewed By: fhahn	2020-11-10 12:37:12 +07:00
Nikita Popov	d459bc8cde	[BasicAA] Add test for decomposition limit (NFC) Test behavior before/at/after the GEP decomposition limit.	2020-11-09 21:31:11 +01:00
Michael Liao	171536d3a1	[GlobalsAA] Teach to handle `addrspacecast`.	2020-11-09 00:04:52 -05:00
Sanjay Patel	a8868babff	[ARM] remove cost-kind predicate for cmp/sel costs This is the cmp/sel sibling to D90692. Again, the reasoning is: the throughput cost is number of instructions/uops, so size/blended costs are identical except in special cases (for example, fdiv or other known-expensive machine instructions or things like MVE that may require cracking into >1 uops). We need to check for a valid (non-null) condition type parameter because SimplifyCFG may pass nullptr for that (and so we will crash multiple regression tests without that check). I'm not sure if passing nullptr makes sense, but other code in the cost model does appear to check if that param is set or not. Differential Revision: https://reviews.llvm.org/D90781	2020-11-05 14:52:25 -05:00
Arthur Eubanks	e22b9f13f5	Port print-must-be-executed-contexts and print-mustexecute to NPM Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D90207	2020-11-03 21:06:46 -08:00
Sanjay Patel	4b28ca8f6f	[ARM] remove cost-kind predicate for most math op costs This is based on the same idea that I am using for the basic model implementation and what I have partly already done for x86: throughput cost is number of instructions/uops, so size/blended costs are identical except in special cases (for example, fdiv or other known-expensive machine instructions or things like MVE that may require cracking into >1 uop)). Differential Revision: https://reviews.llvm.org/D90692	2020-11-03 17:23:46 -05:00
Sanjay Patel	81f8bd9111	[CostModel] fix cost calc bug for sadd/ssub with overflow As noted in D90554, there's an opcode typo in using an easily misused cost model API: getCmpSelInstrCost(). Beyond that, the assumed sequence of ops is questionable, but that would be another patch. My guess is that the x86 test diffs show that we are probably wrong both before and after this change, so there will be no practical difference. As an example, I tried this test which shows a cost of '7' either way: define <4 x i32> @sadd(<4 x i32> %va, <4 x i32> %vb) { %V4I32 = call {<4 x i32>, <4 x i1>} @llvm.sadd.with.overflow.v4i32(<4 x i32> %va, <4 x i32> %vb) %ov = extractvalue {<4 x i32>, <4 x i1>} %V4I32, 1 %r = extractvalue {<4 x i32>, <4 x i1>} %V4I32, 0 %z = select <4 x i1> %ov, <4 x i32> <i32 42, i32 42, i32 42, i32 42>, <4 x i32> %r ret <4 x i32> %z } $ llc -o - sadd.ll -mattr=avx vpaddd %xmm1, %xmm0, %xmm2 vpcmpgtd %xmm2, %xmm0, %xmm0 vpxor %xmm0, %xmm1, %xmm0 vblendvps %xmm0, LCPI0_0(%rip), %xmm2, %xmm0 Differential Revision: https://reviews.llvm.org/D90681	2020-11-03 11:03:47 -05:00
David Green	41688b499e	[CostModel] Make target intrinsics cheap by default This patch changes the intrinsics cost model to assume that by default target intrinsics are cheap. This didn't seem to be the case for all intrinsics, and is potentially an MVE problem due to our scalarization overheads. Cheap seems to be a good default in general though. Differential Revision: https://reviews.llvm.org/D90597	2020-11-03 09:58:28 +00:00
Fangrui Song	c9829bfb08	[LazyCallGraph] Build SCCs of the reference graph in order ``` // The legacy PM CGPassManager discovers SCCs this way: for function in the source order tarjanSCC(function) // While the new PM CGSCCPassManager does: for function in the reversed source order [1] discover a reference graph SCC build call graph SCCs inside the reference graph SCC ``` In the common cases, reference graph ~= call graph, the new PM order is undesired because for `a | b | c` (3 independent functions), the new PM will process them in the reversed order: c, b, a. If `a <-> b <-> c`, we can see that `-print-after-all` will report the sole SCC as `scc: (c, b, a)`. This patch corrects the iteration order. The discovered SCC order will match the legacy PM in the common cases. For some tests (`Transforms/Inline/cgscc-*.ll` and `unittests/Analysis/CGSCCPassManagerTest.cpp`), the behaviors are dependent on the SCC discovery order and there are too many check lines for the particular order. This patch simply reverses the function order to avoid changing too many check lines. Differential Revision: https://reviews.llvm.org/D90566	2020-11-02 13:22:42 -08:00
David Green	7290582ceb	[ARM] Cost model test for target intrinsics. NFC	2020-11-02 17:46:48 +00:00
Sanjay Patel	389858bbc4	[x86] add AVX2 cost model entries for maxnum of 256-bit vectors As noticed in D90554 , the AVX2 costs for 256-bit vectors did not include FMAXNUM entries, so we fell back to AVX1 which assumes those ops will be split into 128-bit halves or something close to that. Differential Revision: https://reviews.llvm.org/D90613	2020-11-02 12:20:17 -05:00
Florian Hahn	1db8566f5e	Reland "[TTI] Add VecPred argument to getCmpSelInstrCost." This reverts the revert commit 408c4408facc3a79ee4ff7e9983cc972f797e176. This version of the patch includes a fix for a crash caused by treating ICmp/FCmp constant expressions as instructions. Original message: On some targets, like AArch64, vector selects can be efficiently lowered if the vector condition is a compare with a supported predicate. This patch adds a new argument to getCmpSelInstrCost, to indicate the predicate of the feeding select condition. Note that it is not sufficient to use the context instruction when querying the cost of a vector select starting from a scalar one, because the condition of the vector select could be composed of compares with different predicates. This change greatly improves modeling the costs of certain compare/select patterns on AArch64. I am also planning on putting up patches to make use of the new argument in SLPVectorizer & LV.	2020-11-02 15:39:29 +00:00
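
The modulus argument in the "[BasicAA] Generalize base offset modulus handling" entry (86fecd5fbf) can be illustrated with a small standalone sketch. This is not LLVM's code: the function name `provably_disjoint` and its parameters are hypothetical, and it assumes both pointers share a base, so that their address difference is a constant plus integer multiples of the listed scales.

```python
from functools import reduce
from math import gcd


def provably_disjoint(scales, const_offset, size1, size2):
    """Hypothetical helper, for illustration only.

    Models p1 - p2 = const_offset + sum(scale_i * x_i) for unknown
    integers x_i, where p1 is accessed with size1 bytes and p2 with
    size2 bytes. Returns True only when the two accesses can never
    overlap, whatever values the x_i take.
    """
    # Every variable term is a multiple of the GCD of the scales,
    # so the GCD serves as the modulus.
    modulus = reduce(gcd, (abs(s) for s in scales), 0)
    if modulus == 0:
        # No variable part; this sketch does not handle that case.
        return False

    # p1 - p2 is congruent to mod_offset, with 0 <= mod_offset < modulus.
    mod_offset = const_offset % modulus

    # The access at p2 must end before p1 begins (within one period),
    # and the access at p1 must end before the next possible p2.
    return mod_offset >= size2 and modulus - mod_offset >= size1


# Offsets 4x + 0 and 4y + 1 with access size 1: the difference is 4(x - y) - 1.
print(provably_disjoint([4, 4], -1, 1, 1))
# A 2-byte access at 4x covers 4x + 1, so disjointness cannot be proven.
print(provably_disjoint([4, 4], -1, 2, 1))
# Scales that are not powers of two: GCD(12, 6) = 6 is the modulus.
print(provably_disjoint([12, 6], 3, 2, 2))
```

Using the GCD rather than a power-of-two factor is what lets the last call succeed; a check restricted to power-of-two factors would only see a modulus of 2 there.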


2527 Commits