llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-22 04:22:57 +02:00

Author	SHA1	Message	Date
Simon Pilgrim	70f3672799	[X86][SSE] Add initial costs for vector CTTZ/CTLZ llvm-svn: 277716	2016-08-04 10:51:41 +00:00
George Burgess IV	648b4cdc02	[CFLAA] Remove modref queries from CFLAA. As it turns out, modref queries are broken with CFLAA. Specifically, the data source we were using for determining modref behaviors explicitly ignores operations on non-pointer values. So, it wouldn't note e.g. storing an i32 to an i32* (or loading an i64 from an i64). It also ignores external function calls, rather than acting conservatively for them. (N.B. These operations, where necessary, are* tracked by CFLAA; we just use a different mechanism to do so. Said mechanism is relatively imprecise, so it's unlikely that we can provide reasonably good modref answers with it as implemented.) Patch by Jia Chen. Differential Revision: https://reviews.llvm.org/D22978 llvm-svn: 277366	2016-08-01 18:47:28 +00:00
Adam Nemet	9449a00cc1	[BPI] Add new LazyBPI analysis Summary: The motivation is the same as in D22141: In order to add the hotness attribute to optimization remarks we need BFI to be available in all passes that emit optimization remarks. BFI depends on BPI so unless we make this lazy as well we would still compute BPI unconditionally. The solution is to use the new LazyBPI pass in LazyBFI and only compute BPI when computation of BFI is requested by the client. I extended the laziness test using a LoopDistribute test to also cover BPI. Reviewers: hfinkel, davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D22835 llvm-svn: 277083	2016-07-28 23:31:12 +00:00
George Burgess IV	628ac58686	[CFLAA] Add getModRefBehavior to CFLAnders. This patch lets CFLAnders respond to mod-ref queries. It also includes a small bugfix to CFLSteens. Patch by Jia Chen. Differential Revision: https://reviews.llvm.org/D22823 llvm-svn: 276939	2016-07-27 23:07:07 +00:00
Hans Wennborg	1ff36cfcf2	Revert r276136 "Use ValueOffsetPair to enhance value reuse during SCEV expansion." It causes Clang tests to fail after Windows self-host (PR28705). (Also reverts follow-up r276139.) llvm-svn: 276822	2016-07-26 23:25:13 +00:00
Wei Mi	5534e22fe5	Remove useless pass from the pipeline in test/Analysis/Dominators/2007-01-14-BreakCritEdges.ll. llvm-svn: 276644	2016-07-25 16:27:34 +00:00
Sanjoy Das	d06246dfe7	[SCEV] Make isImpliedCondOperandsViaRanges smarter This change lets us prove things like "{X,+,10} s< 5000" implies "{X+7,+,10} does not sign overflow" It does this by replacing replacing getConstantDifference by computeConstantDifference (which is smarter) in isImpliedCondOperandsViaRanges. llvm-svn: 276505	2016-07-23 00:54:36 +00:00
Wei Mi	95c770abfc	[PM] Port BreakCriticalEdges to the new PM. Differential Revision: https://reviews.llvm.org/D22688 llvm-svn: 276449	2016-07-22 18:04:25 +00:00
Wei Mi	b7c8cbfa86	Fix test/Analysis/ScalarEvolution/scev-expander-existing-value-offset.ll for rL276136. The content in this testcase was accidentally duplicated. Fix the error. llvm-svn: 276139	2016-07-20 16:54:58 +00:00
Wei Mi	6fe94448f1	Use ValueOffsetPair to enhance value reuse during SCEV expansion. In D12090, the ExprValueMap was added to reuse existing value during SCEV expansion. However, const folding and sext/zext distribution can make the reuse still difficult. A simplified case is: suppose we know S1 expands to V1 in ExprValueMap, and S1 = S2 + C_a S3 = S2 + C_b where C_a and C_b are different SCEVConstants. Then we'd like to expand S3 as V1 - C_a + C_b instead of expanding S2 literally. It is helpful when S2 is a complex SCEV expr and S2 has no entry in ExprValueMap, which is usually caused by the fact that S3 is generated from S1 after const folding. In order to do that, we represent ExprValueMap as a mapping from SCEV to ValueOffsetPair. We will save both S1->{V1, 0} and S2->{V1, C_a} into the ExprValueMap when we create SCEV for V1. When S3 is expanded, it will first expand S2 to V1 - C_a because of S2->{V1, C_a} in the map, then expand S3 to V1 - C_a + C_b. Differential Revision: https://reviews.llvm.org/D21313 llvm-svn: 276136	2016-07-20 16:40:33 +00:00
Simon Pilgrim	c494279f63	[X86][SSE] Add cost model values for CTPOP of vectors This patch adds costs for the vectorized implementations of CTPOP, the default values were seriously underestimating the cost of these and was encouraging vectorization on targets where serialized use of POPCNT would be much better. Differential Revision: https://reviews.llvm.org/D22456 llvm-svn: 276104	2016-07-20 10:41:28 +00:00
George Burgess IV	48af022908	[CFLAA] Make a test tell the truth. NFC. Dishonesty noted by Jia Chen. llvm-svn: 276028	2016-07-19 20:56:41 +00:00
George Burgess IV	fc60bb603d	[CFLAA] Add some interproc. analysis to CFLAnders. This patch adds function summary support to CFLAnders. It also comes with a lot of tests! Woohoo! Patch by Jia Chen. Differential Revision: https://reviews.llvm.org/D22450 llvm-svn: 276026	2016-07-19 20:47:15 +00:00
Simon Pilgrim	6883102e64	[X86] Add CTPOP/CTLZ/CTTZ scalar cost tests llvm-svn: 275725	2016-07-17 18:29:19 +00:00
George Burgess IV	1820150537	[CFLAA] Add attributes handling for CFLAnders. This patch adds proper handling of stratified attributes into our anders-style CFLAA implementation. It also comes bundled with more CFLAnders tests. :) Patch by Jia Chen. Differential Revision: https://reviews.llvm.org/D22325 llvm-svn: 275604	2016-07-15 20:02:49 +00:00
George Burgess IV	8b6295a5d8	[CFLAA] Add an initial CFLAnders implementation. This adds an incomplete anders-style implementation for CFLAA. It's incomplete in that it's missing interprocedural analysis, attrs handling, etc. and that it needs more tests. More tests and features will be added in future commits. Patch by Jia Chen. Differential Revision: https://reviews.llvm.org/D22291 llvm-svn: 275602	2016-07-15 19:53:25 +00:00
Tom Stellard	f071f36723	GlobalsAA: Functions with the argmemonly attribute won't read arbitrary globals Summary: In preparation for changing GlobalsAA to stop assuming that intrinsics can't read arbitrary globals, we need to make sure GlobalsAA is querying function attributes rather than relying on this assumption. This patch was inspired by: http://reviews.llvm.org/D20206 Reviewers: jmolloy, hfinkel Subscribers: eli.friedman, llvm-commits Differential Revision: https://reviews.llvm.org/D21318 llvm-svn: 275433	2016-07-14 15:50:27 +00:00
Adam Nemet	071e00e973	[BFI] Add new LazyBFI analysis pass Summary: This is necessary for D21771. In order to add the hotness attribute to optimization remarks we need BFI to be available in all passes that emit optimization remarks. However we don't want to pay for computing BFI unless the hotness attribute is requested. This is achieved by making BFI lazy at the very high-level through a new analysis pass -- BFI is not calculated unless requested. I am adding a test to check the laziness under D21771 where the first user of the analysis is added. Reviewers: hfinkel, dexonsmith, davidxl Subscribers: davidxl, dexonsmith, llvm-commits Differential Revision: http://reviews.llvm.org/D22141 llvm-svn: 275250	2016-07-13 05:01:48 +00:00
Keno Fischer	7919bf891e	Fix ScalarEvolutionExpander step scaling bug The expandAddRecExprLiterally function incorrectly transforms `[Start + Step * X]` into `Step * [Start + X]` instead of the correct transform of `[Step * X] + Start`. This caused https://github.com/JuliaLang/julia/issues/14704#issuecomment-174126219 due to what appeared to be sufficiently complicated loop interactions. Patch by Jameson Nash (jameson@juliacomputing.com). Reviewers: sanjoy Differential Revision: http://reviews.llvm.org/D16505 llvm-svn: 275239	2016-07-13 01:28:12 +00:00
Michael Kuperstein	7e6a08b33c	[X86] Make some cast costs more precise Make some AVX and AVX512 cast costs more precise. Based on part of a patch by Elena Demikhovsky (D15604). Differential Revision: http://reviews.llvm.org/D22064 llvm-svn: 275106	2016-07-11 21:39:44 +00:00
Hal Finkel	392961ad04	Teach isDereferenceablePointer to look through returned-argument functions For functions which are known to return their argument, isDereferenceableAndAlignedPointer can examine the argument value. Differential Revision: http://reviews.llvm.org/D9384 llvm-svn: 275038	2016-07-11 03:08:49 +00:00
Hal Finkel	f9c4041c84	Teach SCEV to look through returned-argument functions When building SCEVs, if a function is known to return its argument, then we can build the SCEV using the corresponding argument value. Differential Revision: http://reviews.llvm.org/D9381 llvm-svn: 275037	2016-07-11 02:48:23 +00:00
Hal Finkel	9dd10de0f9	BasicAA should look through functions with returned arguments Motivated by the work on the llvm.noalias intrinsic, teach BasicAA to look through returned-argument functions when answering queries. This is essential so that we don't loose all other AA information when supplementing with llvm.noalias. Differential Revision: http://reviews.llvm.org/D9383 llvm-svn: 275035	2016-07-11 01:32:20 +00:00
Adam Nemet	50b66ccab5	[LAA] Port test to the new PM This is a follow-on to r274452. The LAA with the new PM is a loop pass so we go from inner to outer loops. Also using a CHECK-NOT didn't make much sense because we print something in either case; whether an invariant is 'found' or 'not found'. llvm-svn: 274935	2016-07-08 21:24:06 +00:00
Justin Bogner	b7a198f7fd	NVPTX: Remove the legacy ptx intrinsics - Rename the ptx.read.* intrinsics to nvvm.read.ptx.sreg.* - some but not all of these registers were already accessible via the nvvm name. - Rename ptx.bar.sync nvvm.bar.sync, to match nvvm.bar0. There's a fair amount of code motion here, but it's all very mechanical. llvm-svn: 274769	2016-07-07 16:40:17 +00:00
Sean Silva	09ccac554e	[PM] Avoid getResult on a higher level in LoopAccessAnalysis Note that require<domtree> and require<loops> aren't needed because they come in implicitly via the loop pass manager. llvm-svn: 274712	2016-07-07 01:01:53 +00:00
Sanjay Patel	cd5177a158	[x86] fix cost of SINT_TO_FP for i32 --> float (PR21356, PR28434) This is "cvtdq2ps" which does not appear to be particularly slow on any CPU according to Agner's tables. Choosing "5" as a cost here as suggested in: https://llvm.org/bugs/show_bug.cgi?id=21356 ...but it seems very conservative given that the instruction is fully pipelined, and I think these costs are supposed to model throughput. Note that related costs are also most likely too high, but this fixes PR21356 and partly fixes PR28434. llvm-svn: 274658	2016-07-06 19:15:54 +00:00
Michael Kuperstein	d59d4568d1	[TTI] The cost model should not assume vector casts get completely scalarized The cost model should not assume vector casts get completely scalarized, since on targets that have vector support, the common case is a partial split up to the legal vector size. So, when a vector cast gets split, the resulting casts end up legal and cheap. Instead of pessimistically assuming scalarization, base TTI can use the costs the concrete TTI provides for the split vector, plus a fudge factor to account for the cost of the split itself. This fudge factor is currently 1 by default, except on AMDGPU where inserts and extracts are considered free. Differential Revision: http://reviews.llvm.org/D21251 llvm-svn: 274642	2016-07-06 17:30:56 +00:00
George Burgess IV	9f9488ba33	[CFLAA] Split into Anders+Steens analysis. StratifiedSets (as implemented) is very fast, but its accuracy is also limited. If we take a more aggressive andersens-like approach, we can be way more accurate, but we'll also end up being slower. So, we've decided to split CFLAA into CFLSteensAA and CFLAndersAA. Long-term, we want to end up in a place where CFLSteens is queried first; if it can provide an answer, great (since queries are basically map lookups). Otherwise, we'll fall back to CFLAnders, BasicAA, etc. This patch splits everything out so we can try to do something like that when we get a reasonable CFLAnders implementation. Patch by Jia Chen. Differential Revision: http://reviews.llvm.org/D21910 llvm-svn: 274589	2016-07-06 00:26:41 +00:00
Nemanja Ivanovic	f7fafe4ce2	[PowerPC] - Legalize vector types by widening instead of integer promotion This patch corresponds to review: http://reviews.llvm.org/D20443 It changes the legalization strategy for illegal vector types from integer promotion to widening. This only applies for vectors with elements of width that is a multiple of a byte since we have hardware support for vectors with 1, 2, 3, 8 and 16 byte elements. Integer promotion for vectors is quite expensive on PPC due to the sequence of breaking apart the vector, extending the elements and reconstituting the vector. Two of these operations are expensive. This patch causes between minor and major improvements in performance on most benchmarks. There are very few benchmarks whose performance regresses. These regressions can be handled in a subsequent patch with a DAG combine (similar to how this patch handles int -> fp conversions of illegal vector types). llvm-svn: 274535	2016-07-05 09:22:29 +00:00
Nicolai Haehnle	fe1657d8ae	Add writeonly IR attribute Summary: This complements the earlier addition of IntrWriteMem and IntrWriteArgMem LLVM intrinsic properties, see D18291. Also start using the attribute for memset, memcpy, and memmove intrinsics, and remove their special-casing in BasicAliasAnalysis. Reviewers: reames, joker.eph Subscribers: joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D18714 llvm-svn: 274485	2016-07-04 08:01:29 +00:00
Xinliang David Li	ad67ad9771	[PM] Port LoopAccessInfo analysis to new PM It is implemented as a LoopAnalysis pass as discussed and agreed upon. llvm-svn: 274452	2016-07-02 21:18:40 +00:00
Sanjoy Das	461b4dc08b	[SCEV] Compute max be count from shift operator only if all else fails In particular, check to see if we can compute a precise trip count by exhaustively simulating the loop first. llvm-svn: 274199	2016-06-30 02:47:28 +00:00
George Burgess IV	ab525e4a97	[CFLAA] Add support for ModRef queries. This patch makes CFLAA answer some ModRef queries. Because we don't distinguish between reading/writing when making StratifiedSets, we're unable to offer any of the readonly-related answers. Patch by Jia Chen. Differential Revision: http://reviews.llvm.org/D21858 llvm-svn: 274197	2016-06-30 02:11:26 +00:00
Artur Pilipenko	e2ddc2857d	Support arbitrary addrspace pointers in masked load/store intrinsics This is a resubmittion of 263158 change after fixing the existing problem with intrinsics mangling (see LTO and intrinsics mangling llvm-dev thread for details). This patch fixes the problem which occurs when loop-vectorize tries to use @llvm.masked.load/store intrinsic for a non-default addrspace pointer. It fails with "Calling a function with a bad signature!" assertion in CallInst constructor because it tries to pass a non-default addrspace pointer to the pointer argument which has default addrspace. The fix is to add pointer type as another overloaded type to @llvm.masked.load/store intrinsics. Reviewed By: reames Differential Revision: http://reviews.llvm.org/D17270 llvm-svn: 274043	2016-06-28 18:27:25 +00:00
Artur Pilipenko	f9a0655273	Revert -r273892 "Support arbitrary addrspace pointers in masked load/store intrinsics" since some of the clang tests don't expect to see the updated signatures. llvm-svn: 273895	2016-06-27 16:54:33 +00:00
Artur Pilipenko	5d29d9eab5	Support arbitrary addrspace pointers in masked load/store intrinsics This is a resubmittion of 263158 change after fixing the existing problem with intrinsics mangling (see LTO and intrinsics mangling llvm-dev thread for details). This patch fixes the problem which occurs when loop-vectorize tries to use @llvm.masked.load/store intrinsic for a non-default addrspace pointer. It fails with "Calling a function with a bad signature!" assertion in CallInst constructor because it tries to pass a non-default addrspace pointer to the pointer argument which has default addrspace. The fix is to add pointer type as another overloaded type to @llvm.masked.load/store intrinsics. Reviewed By: reames Differential Revision: http://reviews.llvm.org/D17270 llvm-svn: 273892	2016-06-27 16:29:26 +00:00
George Burgess IV	6298b8c82e	[CFLAA] Propagate StratifiedAttrs in interproc. analysis. This patch also has a refactor that kills StratifiedAttr, and leaves us with StratifiedAttrs, because having both was mildly redundant. This patch makes us correctly handle stratified attributes when doing interprocedural analysis. It also adds another attribute, AttrCaller, which acts like AttrUnknown. We can filter out AttrCaller values when during interprocedural analysis, since the caller should have information about what arguments it's passing to its callee. Patch by Jia Chen. Differential Revision: http://reviews.llvm.org/D21645 llvm-svn: 273636	2016-06-24 01:00:03 +00:00
George Burgess IV	bc4d541115	[CFLAA] Use better interprocedural function summaries. Previously, we just unified any arguments that seemed to be related to each other. With this patch, we now respect dereference levels, etc. which should make us substantially more accurate. Proper handling of StratifiedAttrs will be done in a later patch. Patch by Jia Chen. Differential Revision: http://reviews.llvm.org/D21536 llvm-svn: 273596	2016-06-23 18:55:23 +00:00
Michael Kuperstein	f3a6ce93de	[X86] Make arithmetic operations cost model test saner. NFC. llvm-svn: 273316	2016-06-21 20:41:40 +00:00
George Burgess IV	0898a41f88	[CFLAA] Be more aggressive with interprocedural analysis. This patch makes us perform interprocedural analysis on functions that don't have internal linkage. It also removes a test that should've been deleted in an earlier commit (since other tests now cover everything that the newly-removed test covers). Patch by Jia Chen. Differential Revision: http://reviews.llvm.org/D21513 llvm-svn: 273229	2016-06-21 01:42:47 +00:00
George Burgess IV	5e569e7990	[CFLAA] Add interprocedural function summaries. This patch adds function summaries, so that we don't need to recompute various properties about function parameters/return values at each callsite of a function. It also adds many interprocedural tests for CFLAA. Patch by Jia Chen. Differential Revision: http://reviews.llvm.org/D21475#inline-182390 llvm-svn: 273219	2016-06-20 23:10:56 +00:00
Simon Pilgrim	894b8d816e	[X86][SSE] Add cost model for BSWAP of vectors The BSWAP of vector types is quite efficiently implemented using vector shuffles on SSE/AVX targets, we should reflect the typical cost of this to encourage vectorization. Differential Revision: http://reviews.llvm.org/D21521 llvm-svn: 273217	2016-06-20 23:08:21 +00:00
Sanjoy Das	8576aee95c	[SCEV] Fix incorrect trip count computation The way we elide max expressions when computing trip counts is incorrect -- it breaks cases like this: ``` static int wrapping_add(int a, int b) { return (int)((unsigned)a + (unsigned)b); } void test() { volatile int end_buf = 2147483548; // INT_MIN - 100 int end = end_buf; unsigned counter = 0; for (int start = wrapping_add(end, 200); start < end; start++) counter++; print(counter); } ``` Note: the `NoWrap` variable that was being tested has little to do with the values flowing into the max expression; it is a property of the induction variable. test/Transforms/LoopUnroll/nsw-tripcount.ll was added to solely test functionality I'm reverting in this change, so I've deleted the test fully. llvm-svn: 273079	2016-06-18 04:38:31 +00:00
George Burgess IV	8ffcde5567	[CFLAA] Tag arguments as escaped instead of unknown. This patch also includes some refactoring. Prior to this patch, we tagged all CFLAA attributes as unknown. This is suboptimal, since it meant that any Value used as an argument would be considered to alias any other Value that existed. Now that we have the machinery to tag sets below the set for an arbitrary value with attributes, it's okay to be less conservative with arguments. (Specifically, we still tag the set under an argument with unknown). Patch by Jia Chen. Differential Revision: http://reviews.llvm.org/D21262 llvm-svn: 272690	2016-06-14 18:12:28 +00:00
Simon Pilgrim	2b98b76820	[CostModel][X86][SSE] Updated costs for vector BITREVERSE ops on SSSE3+ targets To account for the fast PSHUFB implementation now available llvm-svn: 272484	2016-06-11 19:23:02 +00:00
Michael Kuperstein	57fc4a3484	[X86] Add costs for SSE zext/sext to v4i64 to TTI The costs are somewhat hand-wavy, but should be much closer to the truth than what we get from BasicTTI. Differential Revision: http://reviews.llvm.org/D21156 llvm-svn: 272406	2016-06-10 17:01:05 +00:00
George Burgess IV	f70bbc0aad	[CFLAA] Handle global/arg attrs more sanely. Prior to this patch, we used argument/global stratified attributes in order to note that a value could have come from either dereferencing a global/arg, or from the assignment from a global/arg. Now, AttrUnknown is placed on sets when we see a dereference, instead of the global/arg attributes. This allows us to be more aggressive in the future when we see global/arg attributes without AttrUnknown. Patch by Jia Chen. Differential Revision: http://reviews.llvm.org/D21110 llvm-svn: 272335	2016-06-09 23:15:04 +00:00
Sanjoy Das	7ac8ad9906	Be wary of abnormal exits from loop when exploiting UB We can safely rely on a NoWrap add recurrence causing UB down the road only if we know the loop does not have a exit expressed in a way that is opaque to ScalarEvolution (e.g. by a function call that conditionally calls exit(0)). I believe with this change PR28012 is fixed. Note: I had to change some llvm-lit tests in LoopReroll, since it looks like they were depending on this incorrect behavior. llvm-svn: 272237	2016-06-09 01:13:59 +00:00
Sanjoy Das	c35e5710c9	[SCEV] Track no-abnormal-exits instead of no-throw calls Absence of may-unwind calls is not enough to guarantee that a UB-generating use of an add-rec poison in the loop latch will actually cause UB. We also need to guard against calls that terminate the thread or infinite loop themselves. This partially addresses PR28012. llvm-svn: 272181	2016-06-08 17:48:42 +00:00

1 2 3 4 5 ...

1024 Commits