llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 20:12:56 +02:00

Author	SHA1	Message	Date
Davide Italiano	4a84641b2a	[SimplifyLibCalls] New trick: pow(x, 0.5) -> sqrt(x) under -ffast-math. Differential Revision: http://reviews.llvm.org/D14466 llvm-svn: 253521	2015-11-18 23:21:32 +00:00
Mehdi Amini	c848f65f7c	Fix returned value for GVN: could return "false" even after modifying the IR This bug would manifest in some very specific cases where all the following conditions are fulfilled: - GVN didn't remove block - The regular GVN iteration didn't change the IR - PRE is enabled - PRE will not split critical edge - The last instruction processed by PRE didn't change the IR Because the CallGraph PassManager relies on this returned value to decide if it needs to recompute a node after the execution of Function passes, not returning the right value can lead to unexpected results. Fix for: https://llvm.org/bugs/show_bug.cgi?id=24715 Patch by Wenxiang Qiu <vincentqiuuu@gmail.com> From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 253518	2015-11-18 22:49:49 +00:00
Davide Italiano	d84ba23a15	[BuildLibCalls] EmitStrNLen() is dead code. Garbage collect. llvm-svn: 253514	2015-11-18 22:29:38 +00:00
Pete Cooper	aca4c5cdc6	Change memcpy/memset/memmove to have dest and source alignments. Note, this was reviewed (and more details are in) http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html These intrinsics currently have an explicit alignment argument which is required to be a constant integer. It represents the alignment of the source and dest, and so must be the minimum of those. This change allows source and dest to each have their own alignments by using the alignment attribute on their arguments. The alignment argument itself is removed. There are a few places in the code for which the code needs to be checked by an expert as to whether using only src/dest alignment is safe. For those places, they currently take the minimum of src/dest alignments which matches the current behaviour. For example, code which used to read: call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 500, i32 8, i1 false) will now read: call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 8 %dest, i8* align 8 %src, i32 500, i1 false) For out of tree owners, I was able to strip alignment from calls using sed by replacing: (call.*llvm\.memset.*)i32\ [0-9]*\,\ i1 false\) with: $1i1 false) and similarly for memmove and memcpy. I then added back in alignment to test cases which needed it. A similar commit will be made to clang which actually has many differences in alignment as now IRBuilder can generate different source/dest alignments on calls. In IRBuilder itself, a new argument was added. Instead of calling: CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, /* isVolatile */ false) you now call CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, SrcAlign, /* isVolatile */ false) There is a temporary class (IntegerAlignment) which takes the source alignment and rejects implicit conversion from bool. This is to prevent isVolatile here from passing its default parameter to the source alignment. Note, changes in future can now be made to codegen. I didn't change anything here, but this change should enable better memcpy code sequences. Reviewed by Hal Finkel. llvm-svn: 253511	2015-11-18 22:17:24 +00:00
Mike Aizatsky	45a03a44b9	Disable gvn non-local speculative loads under asan. Summary: Fix for https://llvm.org/bugs/show_bug.cgi?id=25550 Differential Revision: http://reviews.llvm.org/D14763 llvm-svn: 253498	2015-11-18 20:43:00 +00:00
Betul Buyukkurt	b3b3ea9a07	[PGO] Value profiling support This change introduces an instrumentation intrinsic instruction for value profiling purposes, the lowering of the instrumentation intrinsic and raw reader updates. The raw profile data files for llvm-profdata testing are updated. llvm-svn: 253484	2015-11-18 18:14:55 +00:00
Igor Laevsky	521a68ebf0	Revert "Revert "Strip metadata when speculatively hoisting instructions (r252604)" Failing clang test is now fixed by the r253458. llvm-svn: 253459	2015-11-18 14:50:18 +00:00
James Molloy	b92ba28077	[LTO] Add an early run of functionattrs Because we internalize early, we can potentially mark a bunch of functions as norecurse. Do this before globalopt. llvm-svn: 253451	2015-11-18 11:24:42 +00:00
Sanjoy Das	3bc2ba29a2	[OperandBundles] Tighten OperandBundleDef's interface; NFC llvm-svn: 253446	2015-11-18 08:30:07 +00:00
Craig Topper	f9c69f76b7	Replace dyn_cast with isa in places that weren't using the returned value for more than a boolean check. NFC. llvm-svn: 253441	2015-11-18 07:07:59 +00:00
Sanjoy Das	f5a4d357df	Teach the inliner to track deoptimization state Summary: This change teaches LLVM's inliner to track and suitably adjust deoptimization state (tracked via deoptimization operand bundles) as it inlines through call sites. The operation is described in more detail in the LangRef changes. Reviewers: reames, majnemer, chandlerc, dexonsmith Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14552 llvm-svn: 253438	2015-11-18 06:23:38 +00:00
Sanjay Patel	a5866e67e6	[InstCombine] refactor optimizeIntToFloatBitCast() ; NFCI The logic for handling the pattern without a shift is identical to the logic for handling the pattern with a shift if you set the shift amount to zero for the former. This should make it easier to see that we probably don't even need optimizeIntToFloatBitCast(). If we call something like foldVecTruncToExtElt() from visitTrunc(), we'll solve PR25543: https://llvm.org/bugs/show_bug.cgi?id=25543 llvm-svn: 253403	2015-11-18 00:00:04 +00:00
Andrew Kaylor	459ce58049	[EH] Keep filter clauses for types that have been caught. The instruction combiner previously removed types from filter clauses in Landing Pad instructions if the type had previously been seen in a catch clause. This is incorrect and prevents unexpected exception handlers from rethrowing the caught type. Differential Revision: http://reviews.llvm.org/D14669 llvm-svn: 253370	2015-11-17 20:13:04 +00:00
Elena Demikhovsky	0600baa03e	Vector of pointers in function attributes calculation While setting function attributes we check all instructions that may access memory. For a call instruction we check all arguments. The special check is required for pointers. I added vector-of-pointers to the call arguments types that should be checked. Differential Revision: http://reviews.llvm.org/D14693 llvm-svn: 253363	2015-11-17 19:30:51 +00:00
Sanjay Patel	4fcb285de4	fix typos; NFC llvm-svn: 253359	2015-11-17 18:46:56 +00:00
Sanjay Patel	bee424376a	use local variables; NFCI llvm-svn: 253356	2015-11-17 18:37:23 +00:00
Sanjay Patel	b04539d10d	function names start with a lower case letter; NFC llvm-svn: 253348	2015-11-17 17:24:08 +00:00
Chad Rosier	1d3f2c21a9	Typo. llvm-svn: 253336	2015-11-17 13:58:10 +00:00
Philip Reames	577e9b1072	[PRE] Preserve !invariant.load metadata Spotted via inspection. Test case included. llvm-svn: 253275	2015-11-17 00:15:09 +00:00
Sanjay Patel	36f22e5dda	use range-based for loop; NFCI llvm-svn: 253256	2015-11-16 22:16:52 +00:00
Michael Zolotukhin	1bc5fae202	[PR25538]: Fix a failure caused by r253126. In r253126 we stopped recomputing LCSSA after loop unrolling in all cases except when the unrolling is full and at least one of the loop exits is outside the parent loop. In other cases the transformation should not break LCSSA, but it turned out that we also call SimplifyLoop on the parent loop, which might break LCSSA by itself. This fix just triggers LCSSA recomputation in this case as well. I'm committing it without a test case for now, but I'll try to invent one. It's a bit tricky because in an isolated test LoopSimplify would be scheduled before LoopUnroll, and thus will change the test and hide the problem. llvm-svn: 253253	2015-11-16 21:17:26 +00:00
Owen Anderson	4d5ef8fb85	Add intermediate subtract instructions to reassociation worklist. We sometimes create intermediate subtract instructions during reassociation. Adding these to the worklist to revisit exposes many additional reassociation opportunities. Patch by Aditya Nandakumar. llvm-svn: 253240	2015-11-16 18:07:30 +00:00
David Majnemer	723dbfed60	[LoopStrengthReduce] Don't increment iterator past the end of the BB We tried to move the insertion point beyond instructions like landingpad and cleanuppad. However, we also tried to move past catchpad. This is problematic because catchpad is also a terminator. This fixes PR25541. llvm-svn: 253238	2015-11-16 17:37:58 +00:00
Davide Italiano	f6fa662c51	[SimplifyLibCalls] Generalize a comment. This doesn't apply only to sqrt. llvm-svn: 253224	2015-11-16 16:54:28 +00:00
Pavel Labath	0a618f50bf	Don't generate discriminators for calls to debug intrinsics Summary: This fails a check in Verifier.cpp, which checks for location matches between the declared variable and the !dbg attachments. Reviewers: dnovillo, dblaikie, danielcdh Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14657 llvm-svn: 253194	2015-11-16 10:40:38 +00:00
James Molloy	63f33470bb	[GlobalOpt] Address post-commit review comments on r253168 Address Duncan Exon Smith's comments on D14148, which were added after the patch had been LGTM'd and committed: * clang-format one area where whitespace diffs occurred. * Add a threshold to limit the store/load dominance checks as they are quadratic. llvm-svn: 253192	2015-11-16 10:16:22 +00:00
Benjamin Kramer	c51f76e89b	Move helper classes into anonymous namespaces. NFC. llvm-svn: 253189	2015-11-16 09:01:28 +00:00
Keno Fischer	fe589b62e3	Also map the personality function in CloneFunctionInto Summary: The Old personality function gets copied over, but the Materializer didn't have a chance to inspect it (e.g. to fix up references to the correct module for the target function). Also add a verifier check that makes sure the personality routine is in the same module as the function whose personality it is. Reviewers: majnemer Subscribers: jevinskie, llvm-commits Differential Revision: http://reviews.llvm.org/D14474 llvm-svn: 253183	2015-11-16 05:13:30 +00:00
Keno Fischer	6b30a9f86b	[Sink] Don't move landingpads Summary: Moving landingpads into successor basic blocks makes the verifier sad. Teach Sink that much like PHI nodes and terminator instructions, landingpads (and cleanuppads, etc.) may not be moved between basic blocks. Reviewers: majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14475 llvm-svn: 253182	2015-11-16 04:47:58 +00:00
Teresa Johnson	5dcb20f322	Fix mapping of unmaterialized global values during metadata linking Summary: The patch to move metadata linking after global value linking didn't correctly map unmaterialized global values to null as desired. They were in fact mapped to the source copy. It largely worked by accident since most module linker clients destroyed the source module which caused the source GVs to be replaced by null, but caused a failure with LTO linking on Windows: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312869.html The problem is that a null return value from materializeValueFor is handled by mapping the value to self. This is the desired behavior when materializeValueFor is passed a non-GlobalValue. The problem is how to distinguish that case from the case where we really do want to map to null. This patch addresses this by passing in a new flag to the value mapper indicating that unmapped global values should be mapped to null. Other Value types are handled as before. Note that the documented behavior of asserting on unmapped values when the flag RF_IgnoreMissingValues isn't set is currently disabled with FIXME notes due to bootstrap failures. I modified these disabled asserts so when they are eventually enabled again it won't assert for the unmapped values when the new RF_NullMapMissingGlobalValues flag is set. I also considered using a callback into the value materializer, but a flag seemed cleaner given that there are already existing flags. I also considered modifying materializeValueFor to return the input value when we want to map to source and then treat a null return to mean map to null. However, there are other value materializer subclasses that implement materializeValueFor, and they would all need to be audited and the return values possibly changed, which seemed error-prone. Reviewers: dexonsmith, joker.eph Subscribers: pcc, llvm-commits Differential Revision: http://reviews.llvm.org/D14682 llvm-svn: 253170	2015-11-15 14:50:14 +00:00
James Molloy	85bd37fc58	[GlobalOpt] Demote globals to locals more aggressively Global to local demotion can speed up programs that use globals a lot. It is particularly useful with LTO, when the entire call graph is known and most functions have been internalized. For a global to be demoted, it must only be accessed by one function and that function: 1. Must never recurse directly or indirectly, else the GV would be clobbered. 2. Must never rely on the value in GV at the start of the function (apart from the initializer). GlobalOpt can already do this, but it is hamstrung and only ever tries to demote globals inside "main", because C++ gives extra guarantees about how main is called - once and only once. In LTO mode, we can often prove the first property (if the function is internal by this point, we know enough about the callgraph to determine if it could possibly recurse). FunctionAttrs now infers the "norecurse" attribute for this reason. The second property can be proven for a subset of functions by proving that all loads from GV are dominated by a store to GV. This is conservative in the name of compile time - this only requires a DominatorTree which is fairly cheap in the grand scheme of things. We could do more fancy stuff with MemoryDependenceAnalysis too to catch more cases but this appears to catch most of the useful ones in my testing. llvm-svn: 253168	2015-11-15 14:21:37 +00:00
Elena Demikhovsky	d43b8f3050	Fixed GEP visitor in the InstCombine pass. The current implementation of GEP visitor in InstCombine fails with assertion on Vector GEP with mix of scalar and vector types, like this: getelementptr double, double* %a, <8 x i32> %i (It fails to create a "sext" from <8 x i32> to <8 x i64>) I fixed it and added some tests. Differential Revision: http://reviews.llvm.org/D14485 llvm-svn: 253162	2015-11-15 08:19:35 +00:00
Michael Zolotukhin	71a368d115	Don't recompute LCSSA after loop-unrolling when possible. Summary: Currently we always recompute LCSSA for outer loops after unrolling an inner loop. That leads to a compile-time problem when we have big loop nests, and we can solve it by avoiding unnecessary work. For instance, if we only do partial unrolling, we don't break LCSSA, so we don't need to rebuild it. Also, if all exits from the inner loop are inside the enclosing loop, then complete unrolling won't break LCSSA either. I replaced unconditional LCSSA recomputation with conditional recomputation + unconditional assert and added several tests, which were failing when I experimented with it. Soon I plan to follow up with a similar patch for recalculation of the dominator tree. Reviewers: hfinkel, dexonsmith, bogner, joker.eph, chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14526 llvm-svn: 253126	2015-11-14 05:51:41 +00:00
Chad Rosier	8328302c3f	[LIR] Add support for creating memcpys from loops with a negative stride. This allows us to transform the below loop into a memcpy. void test(unsigned *__restrict__ a, unsigned *__restrict__ b) { for (int i = 2047; i >= 0; --i) { a[i] = b[i]; } } This is the memcpy version of r251518, which added support for memset with negative strided loops. llvm-svn: 253091	2015-11-13 21:51:02 +00:00
Evgeniy Stepanov	cabb944249	[safestack] Rewrite isAllocaSafe using SCEV. Use ScalarEvolution to calculate memory access bounds. Handle function calls based on readnone/nocapture attributes. Handle memory intrinsics with constant size. This change improves both recall and precision of IsAllocaSafe. See the new tests (ex. BitCastWide) for the kind of code that was wrongly classified as safe. SCEV efficiency seems to be limited by the fact the SafeStack runs late (in CodeGenPrepare), and many loops are unrolled or otherwise not in LCSSA. llvm-svn: 253083	2015-11-13 21:21:42 +00:00
Chad Rosier	6e5d633af9	Add a comment that should have made my last commit. llvm-svn: 253063	2015-11-13 19:13:40 +00:00
Chad Rosier	4e30bfd5be	[LIR] Factor out the code to compute base ptr for negative strided loops. This will allow for the code to be reused in the memcpy optimization. llvm-svn: 253061	2015-11-13 19:11:07 +00:00
James Molloy	7d379efeea	[GlobalOpt] Make sure all debug lines end with '\n' GlobalVariable::print() used to emit a newline. It hasn't for a while now, but these debug lines weren't updated. llvm-svn: 253030	2015-11-13 11:05:13 +00:00
James Molloy	e22291bd4d	[GlobalOpt] Coding style - remove function names from doxygen comments Suggested by Mehdi in the review of D14148. llvm-svn: 253029	2015-11-13 11:05:07 +00:00
Akira Hatanaka	a41bf2744e	Revert r252990. Some of the buildbots are still failing. llvm-svn: 252999	2015-11-13 01:44:32 +00:00
Akira Hatanaka	5df31fbb8f	Provide a way to specify inliner's attribute compatibility and merging. This reapplies r252949. I've changed the type of FuncName to be std::string instead of StringRef in emitFnAttrCompatCheck. Original commit message for r252949: Provide a way to specify inliner's attribute compatibility and merging rules using table-gen. NFC. This commit adds new classes CompatRule and MergeRule to Attributes.td, which are used to generate code to check attribute compatibility and merge attributes of the caller and callee. rdar://problem/19836465 llvm-svn: 252990	2015-11-13 01:23:11 +00:00
Davide Italiano	24bc557e74	[SimplifyLibCalls] Make a function shorter. NFC. llvm-svn: 252970	2015-11-12 23:39:00 +00:00
Akira Hatanaka	8642e85c17	Revert r252949. It broke some of the bots including clang-x64-ninja-win7. llvm-svn: 252951	2015-11-12 21:19:18 +00:00
Akira Hatanaka	ca7dc7a319	Provide a way to specify inliner's attribute compatibility and merging rules using table-gen. NFC. This commit adds new classes CompatRule and MergeRule to Attributes.td, which are used to generate code to check attribute compatibility and merge attributes of the caller and callee. rdar://problem/19836465 llvm-svn: 252949	2015-11-12 20:59:43 +00:00
Tobias Grosser	a740869bb2	Revert "Fix bug 25440: GVN assertion after coercing loads" This reverts 252919 which broke LNT: MultiSource/Applications/SPASS llvm-svn: 252936	2015-11-12 20:04:21 +00:00
Chad Rosier	4dd6659eb0	[LIR] Minor refactoring. NFCI. This change prevents uninteresting stores from being inserted into the list of candidate stores for memset/memcpy conversion. llvm-svn: 252926	2015-11-12 19:09:16 +00:00
Weiming Zhao	134dff3b8a	Fix bug 25440: GVN assertion after coercing loads Summary: when coercing loads, it inserts some instructions, which have no GV assigned. https://llvm.org/bugs/show_bug.cgi?id=25440 Reviewers: hfinkel, dberlin Subscribers: dberlin, llvm-commits Differential Revision: http://reviews.llvm.org/D14479 llvm-svn: 252919	2015-11-12 18:19:59 +00:00
James Molloy	c1250c50be	[InstCombine] Add trivial folding (bitreverse (bitreverse x)) -> x There are plenty more instcombines we could probably do with bitreverse, but this seems like a very obvious and trivial starting point and was brought up by Hal in his review. llvm-svn: 252879	2015-11-12 12:39:41 +00:00
James Molloy	64e756bd07	Revert "Revert "[FunctionAttrs] Identify norecurse functions"" This reapplies this patch, with test fixes. llvm-svn: 252871	2015-11-12 10:55:20 +00:00
James Molloy	9cd723ec63	Revert "[FunctionAttrs] Identify norecurse functions" This reverts commit r252862. This introduced test failures and I'm reverting while I investigate how this happened. llvm-svn: 252863	2015-11-12 09:05:43 +00:00
James Molloy	4da975f03a	[FunctionAttrs] Identify norecurse functions A function can be marked as norecurse if: * The SCC to which it belongs has cardinality 1; and either a) It does not call any non-norecurse function. This includes self-recursion; or b) It only has one callsite and the function that callsite is within is marked norecurse. a) is best propagated bottom-up and b) is best propagated top-down. We build up the norecurse attributes bottom-up using the existing SCC pass, and mark functions with no obvious recursion (but not provably norecurse) to sweep later, top-down. llvm-svn: 252862	2015-11-12 08:53:04 +00:00
Chad Rosier	94688ac2d3	[LIR] General refactor to improve compile-time and simplify code. First create a list of candidates, then transform. This simplifies the code in that you don't have to worry that you may be using an invalidated iterator. Previously, each time we created a memset/memcpy we would reevaluate the entire loop, potentially resulting in lots of redundant work for large basic blocks. llvm-svn: 252817	2015-11-11 23:00:59 +00:00
David Majnemer	3deb8be573	[IR] Add support for empty tokens When working with tokens, it is often the case that one has instructions which consume a token and produce a new token. Currently, we have no mechanism to represent an initial token state. Instead, we can create a notional "empty token" by inventing a new constant which captures the semantics we would like. This new constant is called ConstantTokenNone and is written textually as "token none". Differential Revision: http://reviews.llvm.org/D14581 llvm-svn: 252811	2015-11-11 21:57:16 +00:00
Diego Novillo	d52939585a	SamplePGO - Fix PR 25482 - Do not rely on llvm.dbg.cu for discriminators The discriminators pass relied on the presence of llvm.dbg.cu to decide whether to add discriminators, but this fails in the case where debug info is only enabled partially when -fprofile-sample-use is active. The reason llvm.dbg.cu is not present in these cases is to prevent codegen from emitting debug info (as it is only used for the sample profile pass). This changes the discriminators pass to also emit discriminators even when debug info is not being emitted. llvm-svn: 252763	2015-11-11 17:54:37 +00:00
Charlie Turner	3bf913a172	[SLP] Enable -slp-vectorize-hor by default. Measurements primarily on AArch64 have shown this feature does not significantly affect compile-time. There are no significant perf changes in LNT, but for AArch64 at least, there are wins in third party benchmarks. As discussed on llvm-dev, we're going to try turning this on by default and see how other targets react to the change. llvm-svn: 252733	2015-11-11 15:03:46 +00:00
Yury Gribov	5cda6485bd	[ASan] Enable optional ASan recovery. Differential Revision: http://reviews.llvm.org/D14242 llvm-svn: 252719	2015-11-11 10:36:49 +00:00
Renato Golin	2143e4c7a3	Revert "Strip metadata when speculatively hoisting instructions" This reverts commit r252604, as it broke all ARM and AArch64 buildbots, as well as some x86, et al. llvm-svn: 252623	2015-11-10 18:01:16 +00:00
Igor Laevsky	747370b198	Strip metadata when speculatively hoisting instructions This is a fix for PR24059. When we are hoisting an instruction above some condition, it may turn out that metadata on this instruction was control dependent on the condition. This metadata becomes invalid and we need to drop it. This patch should cover the most obvious places of speculative execution (which I have found by grepping isSafeToSpeculativelyExecute). I think there are more cases but at least this change covers the severe ones. Differential Revision: http://reviews.llvm.org/D14398 llvm-svn: 252604	2015-11-10 14:10:31 +00:00
Adhemerval Zanella	0d3a676058	[sanitizer] Use same shadow offset for ASAN on aarch64 This patch makes ASAN for aarch64 use the same shadow offset for all currently supported VMAs (39 and 42 bits). The shadow offset is the same for 39-bit (36). Similar to ppc64 port, aarch64 transformation also requires to use an add instead of 'or' for 42-bit VMA. llvm-svn: 252495	2015-11-09 18:03:48 +00:00
Dehao Chen	d1d1c30073	Add discriminators for call instructions that are from the same line and same basic block. Summary: Call instructions that are from the same line and same basic block need to have separate discriminators to distinguish between different callsites. Reviewers: davidxl, dnovillo, dblaikie Subscribers: dblaikie, probinson, llvm-commits Differential Revision: http://reviews.llvm.org/D14464 llvm-svn: 252492	2015-11-09 17:30:38 +00:00
Chad Rosier	c47bdf955f	Simplify. NFC. llvm-svn: 252491	2015-11-09 16:56:06 +00:00
Oliver Stannard	989496fc9c	GlobalOpt should maintain externally_initialized when splitting aggregates When GlobalOpt splits an internal, global variable with an aggregate type, it should propagate the externally_initialized flag to the newly created globals. This makes the pass safe for our downstream use of this flag, while still allowing some useful optimisations (such as removing dead parts of the split aggregate) to be performed. Differential Revision: http://reviews.llvm.org/D13382 llvm-svn: 252490	2015-11-09 16:47:16 +00:00
James Molloy	697ec724f3	[LoopVectorize] Address post-commit feedback on r250032 Implemented as many of Michael's suggestions as were possible: * clang-format the added code while it is still fresh. * tried to change Value* to Instruction* in many places in computeMinimumValueSizes - unfortunately there are several places where Constants need to be handled so this wasn't possible. * Reduce the pass list on loop-vectorization-factors.ll. * Fix a bug where we were querying MinBWs for I->getOperand(0) but using MinBWs[I]. llvm-svn: 252469	2015-11-09 14:32:05 +00:00
Silviu Baranga	7e3cf64ceb	Allow LLE/LD and the loop versioning infrastructure to use SCEV predicates Summary: LAA currently generates a set of SCEV predicates that must be checked by users. In the case of Loop Distribute/Loop Load Elimination, no such predicates could have been emitted, since we don't allow stride versioning. However, in the future there could be SCEV predicates that will need to be checked. This change adds support for SCEV predicate versioning in the Loop Distribute, Loop Load Eliminate and the loop versioning infrastructure. Reviewers: anemet Subscribers: mssimpso, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D14240 llvm-svn: 252467	2015-11-09 13:26:09 +00:00
David Majnemer	ea1d0b6088	[LoopStrengthReduce] Don't bother fixing up PHIs from EH Pad preds We cannot really insert fixup code into a PHI's predecessor. This fixes PR25445. llvm-svn: 252416	2015-11-08 05:04:07 +00:00
Sanjoy Das	62bf9f3dd6	Unbreak the build My code clashed with some ilist iterator changes upstream. Fix by adding an explicit "&*" coercion. llvm-svn: 252392	2015-11-07 02:26:53 +00:00
Sanjoy Das	c550b2c217	[FunctionAttrs] Add comment and clarify assertion message; NFC llvm-svn: 252389	2015-11-07 01:56:07 +00:00
Sanjoy Das	6b6a5c9388	[FunctionAttrs] Add handling for operand bundles Summary: Teach the FunctionAttrs to do the right thing for IR with operand bundles. Reviewers: reames, chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14408 llvm-svn: 252387	2015-11-07 01:56:00 +00:00
Sanjoy Das	9dbcbc0397	[FunctionAttrs] Fix an iterator wraparound bug Summary: This change fixes an iterator wraparound bug in `determinePointerReadAttrs`. Ideally, ++'ing off the `end()` of an iplist should result in a failed assert, but currently iplist seems to silently wrap to the head of the list on `end()++`. This is why the bad behavior is difficult to demonstrate. Reviewers: chandlerc, reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14350 llvm-svn: 252386	2015-11-07 01:55:53 +00:00
David Majnemer	d5f26284b9	[InstCombine] Teach FoldPHIArgZextsIntoPHI about EHPads FoldPHIArgZextsIntoPHI cannot insert an instruction after the PHI if there is an EHPad in the BB. Doing so would result in an instruction inserted after a terminator. llvm-svn: 252377	2015-11-07 00:52:53 +00:00
Duncan P. N. Exon Smith	2c4e1a2f50	ADT: Remove last implicit ilist iterator conversions, NFC Some implicit ilist iterator conversions have crept back into Analysis, Transforms, Hexagon, and llvm-stress. This removes them. I'll commit a patch immediately after this to disallow them (in a separate patch so that it's easy to revert if necessary). llvm-svn: 252371	2015-11-07 00:01:16 +00:00
David Majnemer	48f3ee66bd	[InstCombine] Don't insert an instruction after a terminator We tried to insert a cast of a phi in a block whose terminator is an EHPad. This is invalid. Do not attempt the transform in these circumstances. llvm-svn: 252370	2015-11-06 23:59:23 +00:00
Akira Hatanaka	a73e1a6ef3	Add 'notail' marker for call instructions. This marker prevents optimization passes from adding 'tail' or 'musttail' markers to a call. It is used to prevent tail call optimization from being performed on the call. rdar://problem/22667622 Differential Revision: http://reviews.llvm.org/D12923 llvm-svn: 252368	2015-11-06 23:55:38 +00:00
David Majnemer	9ffc9c11c7	[InstCombine] Don't RAUW tokens with undef Let SimplifyCFG remove unreachable BBs which define token instructions. llvm-svn: 252343	2015-11-06 21:26:32 +00:00
Davide Italiano	0addd685df	[SimplifyLibCalls] Don't hardcode the function name. llvm-svn: 252342	2015-11-06 21:05:07 +00:00
Mehdi Amini	dd38378605	Fix SLPVectorizer commutativity reordering The SLPVectorizer had a very crude way of trying to benefit from associativity: it tried to optimize for splat/broadcast or in order to have the same operator on the same side. This is beneficial to the cost model and allows more vectorization to occur. This patch improves the logic and makes the detection optimal (locally, we don't look at the full tree but only at the immediate children). Should fix https://llvm.org/bugs/show_bug.cgi?id=25247 Reviewers: mzolotukhin Differential Revision: http://reviews.llvm.org/D13996 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 252337	2015-11-06 20:17:51 +00:00
Sanjoy Das	3c64437294	[ValueTracking] Add parameters to isImpliedCondition; NFC Summary: This change makes the `isImpliedCondition` interface similar to the rest of the functions in ValueTracking (in that it takes a DataLayout, AssumptionCache etc.). This is an NFC, intended to make a later diff less noisy. Depends on D14369 Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14391 llvm-svn: 252333	2015-11-06 19:01:08 +00:00
Chad Rosier	af14e3c44b	[LIR] Simplify code by making DataLayout globally accessible. NFC. llvm-svn: 252317	2015-11-06 16:33:57 +00:00
Peter Collingbourne	5b721561aa	DI: Reverse direction of subprogram -> function edge. Previously, subprograms contained a metadata reference to the function they described. Because most clients need to get or set a subprogram for a given function rather than the other way around, this created unneeded inefficiency. For example, many passes needed to call the function llvm::makeSubprogramMap() to build a mapping from functions to subprograms, and the IR linker needed to fix up function references in a way that caused quadratic complexity in the IR linking phase of LTO. This change reverses the direction of the edge by storing the subprogram as function-level metadata and removing DISubprogram's function field. Since this is an IR change, a bitcode upgrade has been provided. Fixes PR23367. An upgrade script for textual IR for out-of-tree clients is attached to the PR. Differential Revision: http://reviews.llvm.org/D14265 llvm-svn: 252219	2015-11-05 22:03:56 +00:00
Alexey Samsonov	31ce2b9337	[ASan] Disable instrumentation for inalloca variables. inalloca variables were not treated as static allocas, therefore didn't participate in regular stack instrumentation. We don't want them to participate in dynamic alloca instrumentation as well. llvm-svn: 252213	2015-11-05 21:18:41 +00:00
Davide Italiano	7186ee9479	[SimplifyLibCalls] Use hasFloatVersion(). NFCI. llvm-svn: 252186	2015-11-05 19:18:23 +00:00
James Molloy	3716696427	[SimplifyCFG] Tweak heuristic for merging conditional stores We were correctly skipping dbginfo intrinsics and terminators, but the initial bailout wasn't, causing it to bail out on almost any block. llvm-svn: 252152	2015-11-05 08:40:19 +00:00
Sanjoy Das	3c5b8b4566	[FunctionAttrs] Remove a loop, NFC refactor Summary: Remove the loop over the uses of the CallSite in ArgumentUsesTracker. Since we have the `Use *` for actual argument operand, we can just use pointer subtraction. The time complexity remains the same though (except for a vararg argument) -- `std::advance` is O(UseIndex) for the ArgumentList iterator. The real motivation is to make a later change adding support for operand bundles simpler. Reviewers: reames, chandlerc, nlewycky Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14363 llvm-svn: 252141	2015-11-05 03:04:40 +00:00
Xinliang David Li	3edbd8ba8a	[PGO] Use template file to define runtime structures With this change, instrumentation code and reader/writer code related to profile data structs are kept strictly in sync. This will be extended to cfe and compiler-rt references as well. Differential Revision: http://reviews.llvm.org/D13843 llvm-svn: 252113	2015-11-05 00:47:26 +00:00
Davide Italiano	c3b20ee04f	[SimplifyLibCalls] New transformation: tan(atan(x)) -> x This is enabled only under -ffast-math. So, instead of emitting: 4007b0: 50 push %rax 4007b1: e8 8a fd ff ff callq 400540 <atanf@plt> 4007b6: 58 pop %rax 4007b7: e9 94 fd ff ff jmpq 400550 <tanf@plt> 4007bc: 0f 1f 40 00 nopl 0x0(%rax) for: float mytan(float x) { return tanf(atanf(x)); } we emit a single retq. Differential Revision: http://reviews.llvm.org/D14302 llvm-svn: 252098	2015-11-04 23:36:56 +00:00
Eugene Zelenko	7c062ccc18	Fix some Clang-tidy modernize warnings, other minor fixes. Fixed warnings are: modernize-use-override, modernize-use-nullptr and modernize-redundant-void-arg. Differential revision: http://reviews.llvm.org/D14312 llvm-svn: 252087	2015-11-04 22:32:32 +00:00
James Molloy	17b2066590	[SimplifyCFG] Merge conditional stores We can often end up with conditional stores that cannot be speculated. They can come from fairly simple, idiomatic code: if (c & flag1) a = x; if (c & flag2) a = y; ... There is no dominating or post-dominating store to a, so it is not legal to move the store unconditionally to the end of the sequence and cache the intermediate result in a register, as we would like to. It is, however, legal to merge the stores together and do the store once: tmp = undef; if (c & flag1) tmp = x; if (c & flag2) tmp = y; if (c & flag1 || c & flag2) *a = tmp; The real power in this optimization is that it allows arbitrary length ladders such as these to be completely and trivially if-converted. The typical code I'd expect this to trigger on often uses binary-AND with constants as the condition (as in the above example), which means the ending condition can simply be truncated into a single binary-AND too: 'if (c & (flag1|flag2))'. As in the general case there are bitwise operators here, the ladder can often be optimized further too. This optimization involves potentially increasing register pressure. Even in the simplest case, the lifetime of the first predicate is extended. This can be elided in some cases such as using binary-AND on constants, but not in the general case. Threading 'tmp' through all branches can also increase register pressure. The optimization as in this patch is enabled by default but kept in a very conservative mode. It will only optimize if it thinks the resultant code should be if-convertible, and additionally if it can thread 'tmp' through at least one existing PHI, so it will only ever in the worst case create one more PHI and extend the lifetime of a predicate. This doesn't trigger much in LNT, unfortunately, but it does trigger in a big way in a third party test suite. llvm-svn: 252051	2015-11-04 15:28:04 +00:00
Philip Reames	d4059ff8b7	[CVP] Fold return values if possible In my previous change to CVP (251606), I made CVP much more aggressive about trying to constant fold comparisons. This patch is a reversal in direction. Rather than being aggressive about every compare, we restore the non-block local restriction for most, and then try hard for compares feeding returns. The motivation for this is twofold: * The more I thought about it, the less comfortable I got with the possible compile time impact of the other approach. There have been no reported issues, but after talking to a couple of folks, I've come to the conclusion the time probably isn't justified. * It turns out we need to know the context to leverage the full power of LVI. In particular, asking about something at the end of its block (the use of a compare in a return) will frequently get more precise results than something in the middle of a block. This is an implementation detail, but it's also hard to get around since mid-block queries have to reason about possible throwing instructions and don't get to use most of LVI's block focused infrastructure. This will become particularly important when combined with http://reviews.llvm.org/D14263. Differential Revision: http://reviews.llvm.org/D14271 llvm-svn: 252032	2015-11-04 01:43:54 +00:00
Adam Nemet	24ab61577e	Fix unused variable warning from r252017 llvm-svn: 252019	2015-11-04 00:10:33 +00:00
Adam Nemet	933537bc6b	LLE 6/6: Add LoopLoadElimination pass Summary: The goal of this pass is to perform store-to-load forwarding across the backedge of a loop. E.g.: for (i) A[i + 1] = A[i] + B[i] => T = A[0] for (i) T = T + B[i] A[i + 1] = T The pass relies on loop dependence analysis via LoopAccessAnalysis to find opportunities of loop-carried dependences with a distance of one between a store and a load. Since it's using LoopAccessAnalysis, it was easy to also add support for versioning away may-aliasing intervening stores that would otherwise prevent this transformation. This optimization is also performed by Load-PRE in GVN without the option of multi-versioning. As was discussed with Daniel Berlin in http://reviews.llvm.org/D9548, this is inferior to a more loop-aware solution applied here. Hopefully, we will be able to remove some complexity from GVN/MemorySSA as a consequence. In the long run, we may want to extend this pass (or create a new one if there is little overlap) to also eliminate loop-independent redundant loads and stores that require versioning due to may-aliasing intervening stores/loads. I have some motivating cases for store elimination. My plan right now is to wait for MemorySSA to come online first rather than using memdep for this. The main motivation for this pass is the 456.hmmer loop in SPECint2006 where after distributing the original loop and vectorizing the top part, we are left with the critical path exposed in the bottom loop. Being able to promote the memory dependence into a register dependence (even though the HW does perform store-to-load forwarding as well) results in a major gain (~20%). This gain also transfers over to x86: it's around 8-10%. Right now the pass is off by default and can be enabled with -enable-loop-load-elim. On the LNT testsuite, there are two performance changes (negative number -> improvement): 1. -28% in Polybench/linear-algebra/solvers/dynprog: the length of the critical paths is reduced 2. +2% in Polybench/stencils/adi: Unfortunately, I couldn't reproduce this outside of LNT The pass is scheduled after the loop vectorizer (which is after loop distribution). The rationale is to try to reuse LAA state, rather than recomputing it. The order between LV and LLE is not critical because normally LV does not touch scalar st->ld forwarding cases where vectorizing would inhibit the CPU's st->ld forwarding to kick in. LoopLoadElimination requires LAA to provide the full set of dependences (including forward dependences). LAA is known to omit loop-independent dependences in certain situations. The big comment before removeDependencesFromMultipleStores explains why this should not occur for the cases that we're interested in. Reviewers: dberlin, hfinkel Subscribers: junbuml, dberlin, mssimpso, rengolin, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D13259 llvm-svn: 252017	2015-11-03 23:50:08 +00:00
Fiona Glaser	abcc6a8ee2	InstCombine: fix sinking of convergent calls llvm-svn: 251991	2015-11-03 22:23:39 +00:00
Adam Nemet	8ce9fb467e	[LAA] LLE 3/6: Rename InterestingDependence to Dependences, NFC Summary: We now collect all types of dependences including lexically forward deps not just "interesting" ones. Reviewers: hfinkel Subscribers: rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D13256 llvm-svn: 251985	2015-11-03 21:39:52 +00:00
Davide Italiano	063a880856	[SimplifyLibCalls] Add a new transformation: pow(exp(x), y) -> exp(x*y) This one is enabled only under -ffast-math (due to rounding/overflows) but allows us to emit shorter code. Before (on FreeBSD x86-64): 4007f0: 50 push %rax 4007f1: f2 0f 11 0c 24 movsd %xmm1,(%rsp) 4007f6: e8 75 fd ff ff callq 400570 <exp2@plt> 4007fb: f2 0f 10 0c 24 movsd (%rsp),%xmm1 400800: 58 pop %rax 400801: e9 7a fd ff ff jmpq 400580 <pow@plt> 400806: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1) 40080d: 00 00 00 After: 4007b0: f2 0f 59 c1 mulsd %xmm1,%xmm0 4007b4: e9 87 fd ff ff jmpq 400540 <exp2@plt> 4007b9: 0f 1f 80 00 00 00 00 nopl 0x0(%rax) Differential Revision: http://reviews.llvm.org/D14045 llvm-svn: 251976	2015-11-03 20:32:23 +00:00
Elena Demikhovsky	45c8421c3f	LoopVectorizer - skip 'bitcast' between GEP and load. Skipping 'bitcast' in this case allows the load to be vectorized: %arrayidx = getelementptr inbounds double, double* %in, i64 %indvars.iv %tmp53 = bitcast double* %arrayidx to i64* %tmp54 = load i64, i64* %tmp53, align 8 Differential Revision: http://reviews.llvm.org/D14112 llvm-svn: 251907	2015-11-03 10:29:34 +00:00
Tobias Grosser	b12c4fa7a9	Revert "[IndVarSimplify] Rewrite loop exit values with their initial values from loop preheader" Commit 251839 triggers miscompiles on some bots: http://lab.llvm.org:8011/builders/perf-x86_64-penryn-O3-polly-fast/builds/13723 (The commit is listed in 13722, but due to an existing failure introduced in 13721 and reverted in 13723 the failure is only visible in 13723) To verify r251839 is indeed the only change that triggered the buildbot failures and to ensure the buildbots remain green while investigating I temporarily revert this commit. At the current state it is unclear if this commit introduced some miscompile or if it only exposed code to Polly that is subsequently miscompiled by Polly. llvm-svn: 251901	2015-11-03 07:14:39 +00:00
Teresa Johnson	90b0eac682	Restore "Support for ThinLTO function importing and symbol linking." This restores commit r251837, with the new library dependence added to llvm-link/Makefile to address bot failures. llvm-svn: 251866	2015-11-03 00:14:15 +00:00
Davide Italiano	6b8a532d6f	[SimplifyLibCalls] Remove variables that are not used. NFC. llvm-svn: 251852	2015-11-02 23:07:14 +00:00
Cong Hou	0b6d5e284f	Add a flag vectorizer-maximize-bandwidth in loop vectorizer to enable using larger vectorization factor. To be able to maximize the bandwidth during vectorization, this patch provides a new flag vectorizer-maximize-bandwidth. When it is turned on, the vectorizer will determine the vectorization factor (VF) using the smallest instead of widest type in the loop. To avoid increasing register pressure too much, estimates of the register usage for different VFs are calculated so that we only choose a VF when its register usage doesn't exceed the number of available registers. This is the second attempt to submit this patch. The first attempt got a test failure on ARM. This patch is updated to try to fix the failure (more specifically, by handling the case when VF=1). Differential revision: http://reviews.llvm.org/D8943 llvm-svn: 251850	2015-11-02 22:53:48 +00:00
Sanjay Patel	7f2d34165d	don't repeat function names in comments; NFC llvm-svn: 251846	2015-11-02 22:34:55 +00:00
Davide Italiano	fef9087e70	[SimplifyLibCalls] Merge two if statements. NFC. llvm-svn: 251845	2015-11-02 22:33:26 +00:00

13936 Commits