llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 03:53:04 +02:00

Author	SHA1	Message	Date
Matthew Simpson	843e321f23	[LAA] Rename forwarding conflict detection option (NFC) This patch renames the option enabling the store-to-load forwarding conflict detection optimization. This change was requested in the review of D20241. llvm-svn: 269668	2016-05-16 17:00:56 +00:00
Xinliang David Li	22e4d41bc3	[PM] Port indirect call promotion pass to new pass manager llvm-svn: 269660	2016-05-16 16:31:07 +00:00
Matthew Simpson	0ad2094b84	[LV] Ensure safe VF for loops with interleaved accesses The selection of the vectorization factor currently doesn't consider interleaved accesses. The vectorization factor is based on the maximum safe dependence distance computed by LAA. However, for loops with interleaved groups, we should instead base the vectorization factor on the maximum safe dependence distance divided by the maximum interleave factor of all the interleaved groups. Interleaved accesses not in a group will be scalarized. Differential Revision: http://reviews.llvm.org/D20241 llvm-svn: 269659	2016-05-16 15:08:20 +00:00
Sanjay Patel	a7bf577009	add test to show missing optimization llvm-svn: 269601	2016-05-15 18:41:18 +00:00
Sanjay Patel	9f25dff4a3	regenerate checks llvm-svn: 269596	2016-05-15 18:05:10 +00:00
Elena Demikhovsky	8163cea3de	Vector GEP - fixed a crash on InstSimplify Pass. Vector GEP with mixed (vector and scalar) indices failed on the InstSimplify Pass when all indices are constants. Differential revision http://reviews.llvm.org/D20149 llvm-svn: 269590	2016-05-15 12:30:25 +00:00
Davide Italiano	7977d84dfd	[PM] Port LowerAtomic to the new pass manager. llvm-svn: 269511	2016-05-13 22:52:35 +00:00
Michael Zolotukhin	e7c1345927	Revert "Revert "[Unroll] Implement a conservative and monotonically increasing cost tracking system during the full unroll heuristic analysis that avoids counting any instruction cost until that instruction becomes "live" through a side-effect or use outside the..."" This reverts commit r269395. Try to reapply with a fix from chapuni. llvm-svn: 269486	2016-05-13 21:23:25 +00:00
Sanjay Patel	2f92ae6c43	regenerate checks and add a run to show missed shrinkage llvm-svn: 269449	2016-05-13 18:04:39 +00:00
Sanjay Patel	fa5165cbf4	regenerate checks llvm-svn: 269447	2016-05-13 18:02:16 +00:00
Sanjay Patel	e6e6eb6572	[InstCombine] handle zero constant vectors for LE/GE comparisons too Enhancement to: http://reviews.llvm.org/rL269426 With discussion in: http://reviews.llvm.org/D17859 This should complete the fixes for: PR26701, PR26819: https://llvm.org/bugs/show_bug.cgi?id=26701 https://llvm.org/bugs/show_bug.cgi?id=26819 llvm-svn: 269439	2016-05-13 17:28:12 +00:00
Jun Bum Lim	bfebf25704	[MemCpyOpt] Use MaxIntSize in byte instead of bit Summary: This change fixes the bug in isProfitableToUseMemset() where MaxIntSize should be in bytes, not bits. Reviewers: arsenm, joker.eph, mcrosier Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D20176 llvm-svn: 269433	2016-05-13 16:52:24 +00:00
Sanjay Patel	2c688c8b52	[InstCombine] canonicalize* LE/GE vector integer comparisons to LT/GT (PR26701, PR26819) *We don't currently handle the edge case constants (min/max values), so it's not a complete canonicalization. To fully solve the motivating bugs, we need to enhance this to recognize a zero vector too because that's a ConstantAggregateZero which is a ConstantData, not a ConstantVector or a ConstantDataVector. Differential Revision: http://reviews.llvm.org/D17859 llvm-svn: 269426	2016-05-13 15:10:46 +00:00
Michael Zolotukhin	5226965218	Revert "[Unroll] Implement a conservative and monotonically increasing cost tracking system during the full unroll heuristic analysis that avoids counting any instruction cost until that instruction becomes "live" through a side-effect or use outside the..." This reverts commit r269388. It caused some bots to fail, I'm reverting it until I investigate the issue. llvm-svn: 269395	2016-05-13 06:32:25 +00:00
Michael Zolotukhin	afd08c7313	[Unroll] Implement a conservative and monotonically increasing cost tracking system during the full unroll heuristic analysis that avoids counting any instruction cost until that instruction becomes "live" through a side-effect or use outside the... Summary: ...loop after the last iteration. This is really hard to do correctly. The core problem is that we need to model liveness through the induction PHIs from iteration to iteration in order to get the correct results, and we need to correctly de-duplicate the common subgraphs of instructions feeding some subset of the induction PHIs. All of this can be driven either from a side effect at some iteration or from the loop values used after the loop finishes. This patch implements this by storing the forward-propagating analysis of each instruction in a cache to recall whether it was free and whether it has become live and thus counted toward the total unroll cost. Then, at each sink for a value in the loop, we recursively walk back through every value that feeds the sink, including looping back through the iterations as needed, until we have marked the entire input graph as live. Because we cache this, we never visit instructions more than twice -- once when we analyze them and put them into the cache, and once when we count their cost towards the unrolled loop. Also, because the cache is only two bits and because we are dealing with relatively small iteration counts, we can store all of this very densely in memory to avoid this from becoming an excessively slow analysis. The code here is still pretty gross. I would appreciate suggestions about better ways to factor or split this up, I've stared too long at the algorithmic side to really have a good sense of what the design should probably look at. Also, it might seem like we should do all of this bottom-up, but I think that is a red herring. Specifically, the simplification power is much greater working top-down. 
We can forward propagate very effectively, even across strange and interesting recurrences around the backedge. Because we use data to propagate, this doesn't cause a state space explosion. Doing this level of constant folding, etc, would be very expensive to do bottom-up because it wouldn't be until the last moment that you could collapse everything. The current solution is essentially a top-down simplification with a bottom-up cost accounting which seems to get the best of both worlds. It makes the simplification incremental and powerful while leaving everything dead until we know it is needed. Finally, a core property of this approach is its monotonicity. At all times, the current UnrolledCost is a conservatively low estimate. This ensures that we will never early-exit from the analysis due to exceeding a threshold when if we had continued, the cost would have gone back below the threshold. These kinds of bugs can cause incredibly hard to track down random changes to behavior. We could use a technique similar (but much simpler) within the inliner as well to avoid considering speculated code in the inline cost. Reviewers: chandlerc Subscribers: sanjoy, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D11758 llvm-svn: 269388	2016-05-13 01:42:39 +00:00
Michael Zolotukhin	c6573bbc1b	[LoopUnrollAnalyzer] Don't treat gep-instructions with simplified offset as simplified. Summary: Currently we consider such instructions as simplified, which is incorrect, because if their user isn't simplified, we can't actually simplify them too. This biases our estimates of profitability: for instance the analyzer expects much more gains from unrolling memcpy loops than there actually are. Reviewers: hfinkel, chandlerc Subscribers: mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D17365 llvm-svn: 269387	2016-05-13 01:42:34 +00:00
David Majnemer	7230d61b65	[SCCP] Resolve shifts beyond the bitwidth to undef Shifts beyond the bitwidth are undef but SCCP resolved them to zero. Instead, DTRT and resolve them to undef. This reimplements the transform which caused PR27712. llvm-svn: 269269	2016-05-12 03:07:40 +00:00
Sanjoy Das	ee178ad6c3	All llvm.deoptimize declarations must use the same calling convention This new verifier rule lets us unambiguously pick a calling convention when creating a new declaration for `@llvm.experimental.deoptimize.<ty>`. It is also congruent with our lowering strategy -- since all calls to `@llvm.experimental.deoptimize` are lowered to calls to `__llvm_deoptimize`, it is reasonable to enforce a unique calling convention. Some of the tests that were breaking this verifier rule have had to be split up into different .ll files. The inliner was violating this rule as well, and has been fixed to avoid producing invalid IR. llvm-svn: 269261	2016-05-12 01:17:38 +00:00
Davide Italiano	9c6851f574	Revert "[SCCP] Partially propagate informations when the input is not fully defined." This reverts commit r269105 as it caused PR27712. llvm-svn: 269252	2016-05-11 23:06:10 +00:00
Easwaran Raman	bf64a7664e	Revert r269131 llvm-svn: 269138	2016-05-10 23:26:04 +00:00
Dehao Chen	735b361b8a	Propagate branch metadata when some branch probability is missing. Summary: In sample profile, some branches may have profile missing due to profile inaccuracy. We want existing branch probability still valid after propagation. Reviewers: hfinkel, davidxl, spatel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19948 llvm-svn: 269137	2016-05-10 23:07:19 +00:00
Easwaran Raman	80787ca12d	Reapply r266477 and r266488 llvm-svn: 269131	2016-05-10 22:03:23 +00:00
Xinliang David Li	cde638dcef	[PM]: port IR based profUse pass to new pass manager llvm-svn: 269129	2016-05-10 21:59:52 +00:00
Tim Northover	ae4c0a6787	Revert "MemCpyOpt: combine local load/store sequences into memcpy." This reverts commit r269125. It was in my tree when I ran "git svn dcommit". It's really still under review. llvm-svn: 269127	2016-05-10 21:49:40 +00:00
Tim Northover	3d78a3c8b7	MemCpyOpt: combine local load/store sequences into memcpy. Sort of the BB-local equivalent to idiom-recognizer: if we have a basic-block that really implements a memcpy operation, backends can benefit from seeing this. llvm-svn: 269125	2016-05-10 21:48:11 +00:00
Hans Wennborg	0d10a020ff	Loop unroller: set thresholds for optsize and minsize functions to zero Before r268509, Clang would disable the loop unroll pass when optimizing for size. That commit enabled it to be able to support unroll pragmas in -Os builds. However, this regressed binary size in one of Chromium's DLLs with ~100 KB. This restores the original behaviour of no unrolling at -Os, but doing it in LLVM instead of Clang makes more sense, and also allows the pragmas to keep working. Differential revision: http://reviews.llvm.org/D20115 llvm-svn: 269124	2016-05-10 21:45:55 +00:00
Lawrence Hu	84f170e1c6	Enable loopreroll for sext of loop control only IV This patch extends loopreroll to allow the instruction chain of a loop-control-only IV to have sext. Differential Revision: http://reviews.llvm.org/D19820 llvm-svn: 269121	2016-05-10 21:16:49 +00:00
Lawrence Hu	ebbd866e5b	Revert r269084: Enable loopreroll for sext of loop control only IV llvm-svn: 269119	2016-05-10 21:11:09 +00:00
Lawrence Hu	7230587586	Revert r269093: Enable loopreroll for sext of loop control only IV llvm-svn: 269117	2016-05-10 21:04:28 +00:00
Sanjay Patel	7746d548c9	[InstSimplify] use computeKnownBits on shift amount operands Do simplifications common to all shift instructions based on the amount shifted: 1. If the shift amount is known larger than the bitwidth, the result is undefined. 2. If the valid bits of the shift amount are all known to be 0, it's a shift by zero, so the shift operand is the result. Note that we could generalize the shift-by-zero transform into a shift-by-constant if all of the valid bits in the shift amount are known, but that would have to be done in InstCombine rather than here because it would mean we need to create a new shift instruction. Differential Revision: http://reviews.llvm.org/D19874 llvm-svn: 269114	2016-05-10 20:46:54 +00:00
Chad Rosier	1293f2b484	[InstCombine] Fold icmp ugt/ult (udiv i32 C2, X), C1. This patch adds support for two optimizations: icmp ugt (udiv C2, X), C1 -> icmp ule X, C2/(C1+1) icmp ult (udiv C2, X), C1 -> icmp ugt X, C2/C1 Differential Revision: http://reviews.llvm.org/D20123 llvm-svn: 269109	2016-05-10 20:22:09 +00:00
Davide Italiano	04252ed302	[SCCP] Partially propagate informations when the input is not fully defined. With this patch: %r1 = lshr i64 -1, 4294967296 -> undef Before this patch: %r1 = lshr i64 -1, 4294967296 -> 0 llvm-svn: 269105	2016-05-10 19:49:47 +00:00
Justin Bogner	b41d2978a0	LPM: Drop require<loops> from these tests, it's redundant. NFC The LoopPassManager needs to calculate the loops analysis in order to iterate over the loops at all. Requiring it is redundant and just adds noise to the RUN lines here. llvm-svn: 269097	2016-05-10 18:28:10 +00:00
Rafael Espindola	3560edede3	Make "@name =" mandatory for globals in .ll files. An oddity of the .ll syntax is that the "@var = " in @var = global i32 42 is optional. Writing just global i32 42 is equivalent to @0 = global i32 42 This means that there is a pretty big First set at the top level. The current implementation maintains it manually. I was trying to refactor it, but then started wondering why keep it at all. I personally find the above syntax confusing. It looks like something is missing. This patch removes the feature and simplifies the parser. llvm-svn: 269096	2016-05-10 18:22:45 +00:00
Lawrence Hu	df161bc96d	Enable loopreroll for sext of loop control only IV This patch extends loopreroll to allow the instruction chain of a loop-control-only IV to have sext. Differential Revision: http://reviews.llvm.org/D19820 llvm-svn: 269093	2016-05-10 18:00:42 +00:00
Rong Xu	cc0ee912d1	[PGO] resubmit r268969 Put the test into a target specific directory. llvm-svn: 269090	2016-05-10 17:45:33 +00:00
Lawrence Hu	8953864da8	Enable loopreroll for sext of loop control only IV This patch extends loopreroll to allow the instruction chain of a loop-control-only IV to have sext. llvm-svn: 269084	2016-05-10 17:42:27 +00:00
James Molloy	0bbc10508e	Revert "[VectorUtils] Query number of sign bits to allow more truncations" This was a fairly simple patch but on closer inspection was seriously flawed and caused PR27690. This reverts commit r268921. llvm-svn: 269051	2016-05-10 12:27:23 +00:00
Chuang-Yu Cheng	79a3fbfded	Update Debug Intrinsics in RewriteUsesOfClonedInstructions in LoopRotation Loop rotation clones instructions from the old header into the preheader. If there were uses of values produced by these instructions that were outside the loop, we have to insert PHI nodes to merge the two values. If the values are used by DbgIntrinsics they will be used as a MetadataAsValue of a ValueAsMetadata of the original values, and iterating all of the uses of the original value will not update the DbgIntrinsics. The new code checks if the values are used by DbgIntrinsics and if so, updates them using essentially the same logic as the original code. The attached testcase demonstrates the issue. Without the fix, the DbgIntrinsic outside the loop uses values computed inside the loop, even though these values do not dominate the DbgIntrinsic. Author: Thomas Jablin (tjablin) Reviewers: dblaikie aprantl kbarton hfinkel cycheng http://reviews.llvm.org/D19564 llvm-svn: 269034	2016-05-10 09:45:44 +00:00
Arnaud A. de Grandmaison	6c99fa9cb1	[InstCombine] Remove trivially empty va_start/va_end and va_copy/va_end ranges. When a va_start or va_copy is immediately followed by a va_end (ignoring debug information or other start/end in between), then it is safe to remove the pair. As this code shares some commonalities with the lifetime markers, this has been factored to helper functions. This InstCombine pattern kicks-in 3 times when running the LLVM test suite. llvm-svn: 269033	2016-05-10 09:24:49 +00:00
Renato Golin	477f18731f	Revert "[PGO] Fix __llvm_profile_raw_version linkage in MACHO IR instrumentation generates a COMDAT symbol __llvm_profile_raw_version to overwrite the same symbol in profile run-time to distinguish IR profiles from Clang generated profiles. In MACHO, LinkOnceODR linkage is used due to the lack of COMDAT support." This reverts commits r268969, r268979 and r268984. They had target specific test in generic directories without the correct specifiers and made it hard for us to come up with a good solution by rapidly committing untested changes. This test needs to be in a target specific directory or have the correct REQUIRED identifier. llvm-svn: 269027	2016-05-10 08:23:57 +00:00
Elena Demikhovsky	7866776645	[LoopVectorize] Handling induction variable with non-constant step. Allow vectorization when the step is a loop-invariant variable. This is the loop example that is getting vectorized after the patch: int int_inc; int bar(int init, int *restrict A, int N) { int x = init; for (int i=0;i<N;i++){ A[i] = x; x += int_inc; } return x; } "x" is an induction variable with loop-invariant step. But it is not a primary induction. Primary induction variable with non-constant step is not handled yet. Differential Revision: http://reviews.llvm.org/D19258 llvm-svn: 269023	2016-05-10 07:33:35 +00:00
Sanjoy Das	bbb7dedca9	[ValueTracking] Use guards to prove non-nullness of a value Reviewers: apilipenko, majnemer, reames Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D20044 llvm-svn: 269008	2016-05-10 02:35:44 +00:00
Evgeniy Stepanov	aba7343f5c	Don't inline functions with different SafeStack attributes. llvm-svn: 268999	2016-05-10 00:33:07 +00:00
Adam Nemet	c716a23b10	[LV] Hint at the new loop distribution pragma in optimization remark When we encounter unsafe memory dependencies, loop distribution could help. Even though the diagnostic is in LAA, it's only currently emitted in the vectorizer. llvm-svn: 268987	2016-05-09 23:03:44 +00:00
Rong Xu	c43404ab85	Fix buildbot failure from r268968. llvm-svn: 268984	2016-05-09 22:45:47 +00:00
Sanjay Patel	279f051ad7	[Inliner] don't assume that a Constant alloca size is a ConstantInt (PR27277) Differential Revision: http://reviews.llvm.org/D20077 llvm-svn: 268980	2016-05-09 21:51:53 +00:00
Rong Xu	a0b26189e6	Fix buildbot failure from r268968. llvm-svn: 268979	2016-05-09 21:51:50 +00:00
Rong Xu	0381dde382	[PGO] Fix __llvm_profile_raw_version linkage in MACHO IR instrumentation generates a COMDAT symbol __llvm_profile_raw_version to overwrite the same symbol in profile run-time to distinguish IR profiles from Clang generated profiles. In MACHO, LinkOnceODR linkage is used due to the lack of COMDAT support. But LinkOnceODR linkage might have .weak_def_can_be_hidden assembly directive, while the weak variable in run-time has a .weak_definition directive. Linker will not merge these two symbols even though they have the same name. The end result is IR profiles are not properly flagged in MACHO. This patch changes the linkage for __llvm_profile_raw_version in each module to LinkOnceAny so that it has the same .weak_definition directive as in the run-time. Differential Revision: http://reviews.llvm.org/D20078 llvm-svn: 268969	2016-05-09 21:03:06 +00:00
Chad Rosier	8b41062bd4	[InstCombine] Fold icmp eq/ne (udiv i32 A, B), 0 -> icmp ugt/ule B, A. Differential Revision: http://reviews.llvm.org/D20036 llvm-svn: 268960	2016-05-09 19:30:20 +00:00
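
The unsigned-division folds described in commits 1293f2b484 and 8b41062bd4 above can be sanity-checked by brute force. The following is only an illustrative sketch, not code from LLVM: the function name `udiv_fold_counterexamples` and the `limit` parameter are hypothetical, Python's unbounded integers stand in for i32 (so overflow of C1+1 is not modelled), and a zero divisor is skipped because udiv by zero is undefined.

```python
def udiv_fold_counterexamples(limit=64):
    """Return every case where a folded compare disagrees with the original.

    Hypothetical helper for illustration; operands range over [0, limit).
    """
    bad = []
    for c2 in range(limit):
        for x in range(1, limit):  # skip x == 0: udiv by zero is undefined
            q = c2 // x            # unsigned division of C2 by X
            for c1 in range(limit):
                # icmp ugt (udiv C2, X), C1 -> icmp ule X, C2/(C1+1)
                if (q > c1) != (x <= c2 // (c1 + 1)):
                    bad.append(("ugt", c2, x, c1))
                # icmp ult (udiv C2, X), C1 -> icmp ugt X, C2/C1
                # (C1 == 0 is skipped to avoid dividing by zero)
                if c1 != 0 and (q < c1) != (x > c2 // c1):
                    bad.append(("ult", c2, x, c1))
            # icmp eq (udiv A, B), 0 -> icmp ugt B, A   (A = c2, B = x)
            if (q == 0) != (x > c2):
                bad.append(("eq", c2, x))
            # icmp ne (udiv A, B), 0 -> icmp ule B, A
            if (q != 0) != (x <= c2):
                bad.append(("ne", c2, x))
    return bad
```

An empty result means that no disagreement was found in the searched range; it is a spot check rather than a proof for all i32 values.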


6701 Commits