llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-26 22:42:46 +02:00

Author	SHA1	Message	Date
Chandler Carruth	47c6759da6	Try to move to a more reasonable set of naming conventions given the new implementation of the SROA algorithm. We were using the term 'partition' in many places that no longer ever represented an actual partition, but rather just an arbitrary slice of an alloca. No functionality change intended here. Mostly just renaming of types, functions, variables, and rewording of comments. Several comments were rewritten to make a lot more sense in the new structure of things. The stats are still weird and not reflective of how this really works. I'll fix those up in a separate patch as it is a touch more semantic of a change... llvm-svn: 186659	2013-07-19 09:13:58 +00:00
Chandler Carruth	51ced11831	A long overdue cleanup in SROA to use 'DL' instead of 'TD' for the DataLayout variables. llvm-svn: 186656	2013-07-19 07:21:28 +00:00
Chandler Carruth	dc65e5d0aa	Fix PR16651, an assert introduced in my recent re-work of the innards of SROA. The crux of the issue is that now we track uses of a partition of the alloca in two places: the iterators over the partitioning uses and the previously collected split uses vector. We weren't accounting for the fact that the split uses might invalidate integer widening in ways other than due to their width (in this case due to being volatile). Further reduced testcase added to the tests. llvm-svn: 186655	2013-07-19 07:12:23 +00:00
Chandler Carruth	b59978aea3	Reapply r186316 with a fix for one bug where the code could walk off the end of a vector. This was found with ASan. I've had one other report of a crasher, but thus far been unable to reproduce the crash. It may well be fixed with this version, and if not I'd like to get more information from the build bots about what is happening. See r186316 for the full commit log for the new implementation of the SROA algorithm. llvm-svn: 186565	2013-07-18 07:15:00 +00:00
Craig Topper	b8260534f6	Add 'const' qualifiers to static const char* variables. llvm-svn: 186371	2013-07-16 01:17:10 +00:00
Stephen Lin	b764d0591b	Remove trailing whitespace llvm-svn: 186333	2013-07-15 17:55:02 +00:00
Chandler Carruth	255d06c032	Revert r186316 while I track down an ASan failure and an assert from a bot. This reverts the commit which introduced a new implementation of the fancy SROA pass designed to reduce its overhead. I'll skip the huge commit log here, refer to r186316 if you're looking for how this all works and why it works that way. llvm-svn: 186332	2013-07-15 17:36:21 +00:00
Chandler Carruth	05be8f3230	Reimplement SROA yet again. Same fundamental principle, but a totally different core implementation strategy. Previously, SROA would build a relatively elaborate partitioning of an alloca, associate uses with each partition, and then rewrite the uses of each partition in an attempt to break apart the alloca into chunks that could be promoted. This was very wasteful in terms of memory and compile time because regardless of how complex the alloca or how much we're able to do in breaking it up, all of the datastructure work to analyze the partitioning was done up front. The new implementation attempts to form partitions of the alloca lazily and on the fly, rewriting the uses that make up that partition as it goes. This has a few significant effects: 1) Much simpler data structures are used throughout. 2) No more double walk of the recursive use graph of the alloca, only walk it once. 3) No more complex algorithms for associating a particular use with a particular partition. 4) PHI and Select speculation is simplified and happens lazily. 5) More precise information is available about a specific use of the alloca, removing the need for some side datastructures. Ultimately, I think this is a much better implementation. It removes about 300 lines of code, but arguably removes more like 500 considering that some code grew in the process of being factored apart and cleaned up for this all to work. I've re-used as much of the old implementation as possible, which includes the lion's share of code in the form of the rewriting logic. The interesting new logic centers around how the uses of a partition are sorted, and split into actual partitions. Each instruction using a pointer derived from the alloca gets a 'Partition' entry. This name is totally wrong, but I'll do a rename in a follow-up commit as there is already enough churn here. The entry describes the offset range accessed and the nature of the access. 
Once we have all of these entries we sort them in a very specific way: increasing order of begin offset, followed by whether they are splittable uses (memcpy, etc), followed by the end offset or whatever. Sorting by splittability is important as it simplifies the collection of uses into a partition. Once we have these uses sorted, we walk from the beginning to the end building up a range of uses that form a partition of the alloca. Overlapping unsplittable uses are merged into a single partition while splittable uses are broken apart and carried from one partition to the next. A partition is also introduced to bridge splittable uses between the unsplittable regions when necessary. I've looked at the performance PRs fairly closely. PR15471 no longer will even load (the module is invalid). Not sure what is up there. PR15412 improves by between 5% and 10%, however it is nearly impossible to know what is holding it up as SROA (the entire pass) takes less time than reading the IR for that test case. The analysis takes the same time as running mem2reg on the final allocas. I suspect (without much evidence) that the new implementation will scale much better however, and it is just the small nature of the test cases that makes the changes small and noisy. Either way, it is still simpler and cleaner I think. llvm-svn: 186316	2013-07-15 10:30:19 +00:00
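The sort order described in this commit message can be sketched in standalone C++. This is only an illustration, not the code from the commit: the `Slice` struct, `sliceLess`, and `sortSlices` are hypothetical names, and the direction of the two tie-breaks (unsplittable uses first, then larger end offset first) is an assumption, since the message only says "followed by the end offset or whatever".

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one use entry: the byte range it touches and
// whether the use is splittable (memcpy-like) or not (plain load/store).
struct Slice {
  std::uint64_t begin;
  std::uint64_t end;
  bool splittable;
};

// Strict weak ordering: increasing begin offset, then splittability, then
// end offset. The tie-break directions are assumptions for this sketch.
bool sliceLess(const Slice &a, const Slice &b) {
  if (a.begin != b.begin)
    return a.begin < b.begin;      // earlier start first
  if (a.splittable != b.splittable)
    return !a.splittable;          // assumed: unsplittable uses first
  return a.end > b.end;            // assumed: larger ranges first
}

// Returns a sorted copy so a single forward walk can group the uses.
std::vector<Slice> sortSlices(std::vector<Slice> slices) {
  std::sort(slices.begin(), slices.end(), sliceLess);
  return slices;
}
```

With the uses ordered this way, a walk from the first entry to the last can merge overlapping unsplittable ranges into one partition and carry splittable ones forward, as the message describes.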
Craig Topper	58fa7a9b4a	Use SmallVectorImpl& instead of SmallVector to avoid repeating small vector size. llvm-svn: 186274	2013-07-14 04:42:23 +00:00
Andrew Trick	b79ae09045	LFTR improvement to avoid truncation. This is a reimplementation of the patch originally in r186107. llvm-svn: 186215	2013-07-12 22:08:48 +00:00
Andrew Trick	064fb30cab	Cleanup LFTR logic. llvm-svn: 186214	2013-07-12 22:08:44 +00:00
Andrew Trick	aacd582de2	Cleanup: rename a variable to make the logic easier to follow. llvm-svn: 186213	2013-07-12 22:08:41 +00:00
Chandler Carruth	4163f3fe85	Revert "indvars: Improve LFTR by eliminating truncation when comparing against a constant." This reverts commit r186107. It didn't handle wrapping arithmetic in the loop correctly and thus caused the following C program to count from 0 to UINT64_MAX instead of from 0 to 255 as intended: #include <stdio.h> int main() { unsigned char first = 0, last = 255; do { printf("%d\n", first); } while (first++ != last); } Full test case and instructions to reproduce with just the -indvars pass sent to the original review thread rather than to r186107's commit. llvm-svn: 186152	2013-07-12 11:18:55 +00:00
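The loop from this revert message can be restated as a small C++ sketch that counts iterations instead of printing. The function names are hypothetical. The second function is an assumed illustration of what a correct wide-counter rewrite has to preserve (the same trip count); it is not the transformation from r186107.

```cpp
#include <cassert>
#include <cstdint>

// The loop from the commit message: an 8-bit counter compared against 255.
// The post-increment wraps `first` from 255 to 0 only after the final test.
unsigned countByteLoopIterations() {
  unsigned char first = 0, last = 255;
  unsigned iterations = 0;
  do {
    ++iterations;
  } while (first++ != last);
  return iterations;
}

// Assumed illustration: the same loop with a 64-bit counter compared against
// a widened constant. A correct rewrite must keep the trip count unchanged.
unsigned countWideLoopIterations() {
  std::uint64_t iv = 0;
  const std::uint64_t last = 255;
  unsigned iterations = 0;
  do {
    ++iterations;
  } while (iv++ != last);
  return iterations;
}
```

Both functions run the body for the values 0 through 255, which is the behavior the reverted patch failed to preserve according to the message.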
Andrew Trick	fe577c9f45	indvars: Improve LFTR by eliminating truncation when comparing against a constant. Patch by Michele Scandale! Adds a special handling of the case where, during the loop exit condition rewriting, the exit value is a constant of bitwidth lower than the type of the induction variable: instead of introducing a trunc operation in order to match correctly the operand types, it allows to convert the constant value to an equivalent constant, depending on the initial value of the induction variable and the trip count, in order to have an equivalent comparison between the induction variable and the new constant. llvm-svn: 186107	2013-07-11 17:08:59 +00:00
Michael Gottesman	722cf9dc9b	Teach TailRecursionElimination to handle certain cases of nocapture escaping allocas. Without the changes introduced into this patch, if TRE saw any allocas at all, TRE would not perform TRE or mark callsites with the tail marker. Because TRE runs after mem2reg, this inadequacy is not a death sentence. But given a callsite A without escaping alloca argument, A may not be able to have the tail marker placed on it due to a separate callsite B having a write-back parameter passed in via an argument with the nocapture attribute. Assume that B is the only other callsite besides A and B only has nocapture escaping alloca arguments (NOTE B may have other arguments that are not passed allocas). In this case not marking A with the tail marker is unnecessarily conservative since: 1. By assumption A has no escaping alloca arguments itself so it can not access the caller's stack via its arguments. 2. Since all of B's escaping alloca arguments are passed as parameters with the nocapture attribute, we know that B does not stash said escaping allocas in a manner that outlives B itself and thus could be accessed indirectly by A. With the changes introduced by this patch: 1. If we see any escaping allocas passed as a capturing argument, we do nothing and bail early. 2. If we do not see any escaping allocas passed as captured arguments but we do see escaping allocas passed as nocapture arguments: i. We do not perform TRE to avoid PR962 since the code generator produces significantly worse code for the dynamic allocas that would be created by the TRE algorithm. ii. If we do not return twice, mark call sites without escaping allocas with the tail marker. NOTE This excludes functions with escaping nocapture allocas. 3. If we do not see any escaping allocas at all (whether captured or not): i. If we do not have usage of setjmp, mark all callsites with the tail marker. ii. 
If there are no dynamic/variable sized allocas in the function, attempt to perform TRE on all callsites in the function. Based off of a patch by Nick Lewycky. rdar://14324281. llvm-svn: 186057	2013-07-11 04:40:01 +00:00
Benjamin Kramer	d9392561e7	Reassociate: Remove unnecessary default operator=. llvm-svn: 185757	2013-07-06 15:10:13 +00:00
Sylvestre Ledru	be24b2e69f	Remove useless declarations (found by scan-build) llvm-svn: 185709	2013-07-05 15:58:12 +00:00
Craig Topper	783617eba7	Use SmallVectorImpl::iterator/const_iterator instead of SmallVector to avoid specifying the vector size. llvm-svn: 185606	2013-07-04 01:31:24 +00:00
Craig Topper	9729e843cb	Use SmallVectorImpl::iterator/const_iterator instead of SmallVector to avoid specifying the vector size. llvm-svn: 185540	2013-07-03 15:07:05 +00:00
Nick Lewycky	e6e35eda7b	dbgs() << Instruction doesn't print a newline on the end any more. Update these debug statements to add a missing newline. Also canonicalize to '\n' instead of "\n"; the latter calls a function with a loop the former does not. llvm-svn: 184897	2013-06-26 00:30:18 +00:00
Bob Wilson	f1bf7886b8	Fix SROA to avoid unnecessary scalar conversions for 1-element vectors. When a 1-element vector alloca is promoted, a store instruction can often be rewritten without converting the value to a scalar and using an insertelement instruction to stuff it into the new alloca. This patch just adds a check to skip that conversion when it is unnecessary. This turns out to be really important for some ARM Neon operations where <1 x i64> is used to get around the fact that i64 is not a legal type. llvm-svn: 184870	2013-06-25 19:09:50 +00:00
Meador Inge	f58d6431f9	Remove the simplify-libcalls pass (finally) This commit completely removes what is left of the simplify-libcalls pass. All of the functionality has now been migrated to the instcombine and functionattrs passes. The following C API functions are now NOPs: 1. LLVMAddSimplifyLibCallsPass 2. LLVMPassManagerBuilderSetDisableSimplifyLibCalls llvm-svn: 184459	2013-06-20 19:48:07 +00:00
Bill Wendling	4d82ecded8	Access the TargetLoweringInfo from the TargetMachine object instead of caching it. The TLI may change between functions. No functionality change. llvm-svn: 184352	2013-06-19 21:07:11 +00:00
Matt Arsenault	fb5518e48b	Move StructurizeCFG out of R600 to generic Transforms. Register it with PassManager llvm-svn: 184343	2013-06-19 20:18:24 +00:00
Quentin Colombet	4633bd4a55	LSR: Fix the parameters used to compute the scaling factor cost. Prior to this change, the considered addressing modes may be invalid since the maximum and minimum offsets were not taken into account. This was causing an assertion failure. The added test case exercises that behavior. <rdar://problem/14199725> Assertion failed: (CurScaleCost >= 0 && "Legal addressing mode has an illegal cost!") llvm-svn: 184341	2013-06-19 19:59:41 +00:00
Jakub Staszak	6fe8ea3c17	Use 0 instead of NULL. llvm-svn: 184044	2013-06-15 12:20:44 +00:00
Shuxin Yang	63a223a0a0	Fix a potential bug in r183584. r183584 tries to derive some info from the code AFTER a call and apply these derived info to the code BEFORE the call, which is not always safe as the call in question may never return, and in this case, the derived info is invalid. Thank Duncan for pointing out this potential bug. rdar://14073661 llvm-svn: 183606	2013-06-08 04:56:05 +00:00
Shuxin Yang	7247dac833	Fix an assertion in MemCpyOpt pass. The MemCpyOpt pass is capable of optimizing: callee(&S); copy N bytes from S to D. into: callee(&D); subject to some legality constraints. Assertion is triggered when the compiler tries to evaluate "sizeof(typeof(D))", while D is an opaque-typed, 'sret' formal argument of function being compiled. i.e. the signature of the func being compiled is something like this: T caller(...,%opaque* noalias nocapture sret %D, ...) The fix is that when we come across such a situation, instead of calling some utility functions to get the size of D's type (which will crash), we simply assume D has at least N bytes as implied by the copy-instruction. rdar://14073661 llvm-svn: 183584	2013-06-07 22:45:21 +00:00
David Majnemer	f6b2c81f95	IndVarSimplify: check if loop invariant expansion can trap IndVarSimplify is willing to move divide instructions outside of their loop bodies if they are invariant of the loop. However, it may not be safe to expand them if we do not know if they can trap. Instead, check to see if it is not safe to expand the instruction and skip the expansion. This fixes PR16041. Testcase by Rafael Ávila de Espíndola. llvm-svn: 183239	2013-06-04 17:51:58 +00:00
Quentin Colombet	3e2682d134	Loop Strength Reduce: Scaling factor cost. Account for the cost of scaling factor in Loop Strength Reduce when rating the formulae. This uses a target hook. The default implementation of the hook is: if the addressing mode is legal, the scaling factor is free. <rdar://problem/13806271> llvm-svn: 183045	2013-05-31 21:29:03 +00:00
Quentin Colombet	c3a4f33cc1	Modify how the formulae are rated in Loop Strength Reduce. Namely, check if the target allows to fold more than one register in the addressing mode and if yes, adjust the cost accordingly. Prior to this commit, reg1 + scale * reg2 accesses were artificially preferred to reg1 + reg2 accesses. Indeed, the cost model wrongly assumed that reg1 + reg2 needs a temporary register for the computation, whereas it was correctly estimated for reg1 + scale * reg2. <rdar://problem/13973908> llvm-svn: 183021	2013-05-31 17:20:29 +00:00
Michael J. Spencer	c195b8a813	Replace Count{Leading,Trailing}Zeros_{32,64} with count{Leading,Trailing}Zeros. llvm-svn: 182680	2013-05-24 22:23:49 +00:00
Shuxin Yang	018fd6828f	[GVN] Split critical-edge on the fly, instead of postponing edge-splitting to next iteration. This is one step toward non-iterative GVN. My local hack suggests that getting rid of iteration will speed up GVN by 30%+ on a medium sized input (2k LOC, C++). I cannot explain why not 2x or more at this moment. llvm-svn: 181532	2013-05-09 18:34:27 +00:00
Nick Lewycky	a88ff03516	Fix a bug in codegenprep where it was losing track of values OptimizeMemoryInst by switching to a ValueMap. Patch by Andrea DiBiagio! llvm-svn: 181397	2013-05-08 09:00:10 +00:00
Andrew Trick	5d13ab6ea6	Rotate multi-exit loops even if the latch was simplified. Test case by Michele Scandale! Fixes PR10293: Load not hoisted out of loop with multiple exits. There are a few regressions with this patch, now tracked by rdar:13817079, and a roughly equal number of improvements. The regressions are almost certainly bad luck because LoopRotate has very little idea of whether rotation is profitable. Doing better requires a more comprehensive solution. This checkin is a quick fix that lacks generality (PR10293 has a counter-example). But it trivially fixes the case in PR10293 without interfering with other cases, and it does satisfy the criteria that LoopRotate is a loop canonicalization pass that should avoid heuristics and special cases. I can think of two approaches that would probably be better in the long run. Ultimately they may both make sense. (1) LoopRotate should check that the current header would make a good loop guard, and that the loop does not already have a sufficient guard. The artificial SimplifiedLoopLatch check would be unnecessary, and the design would be more general and canonical. Two difficulties: - We need a strong guarantee that we won't endlessly rotate, so the analysis would need to be precise in order to avoid the SimplifiedLoopLatch precondition. - Analyses like this are usually based on SCEV, which we don't want to rely on. (2) Rotate on-demand in late loop passes. This could even be done by shoving the loop back on the queue after the optimization that needs it. This could work well when we find LICM opportunities in multi-branch loops. This requires some work, and it doesn't really solve the problem of SCEV wanting a loop guard before the analysis. llvm-svn: 181230	2013-05-06 17:58:18 +00:00
Shuxin Yang	2e42a06bb1	Decompose GVN::processNonLocalLoad() (about 400 LOC) into smaller helper functions. No function change. This function consists of following steps: 1. Collect dependent memory accesses. 2. Analyze availability. 3. Perform fully redundancy elimination, or 4. Perform PRE, depending on the availability Step 2, 3 and 4 are now moved to three helper routines. llvm-svn: 181047	2013-05-03 19:17:26 +00:00
Shuxin Yang	df9f738a35	[GV] Remove dead code which is really difficult to decipher. Actually it took me a couple of hours trying to make sense of it, only to find it is dead code. I guess the original author used "allSingleSucc" to indicate if there are any critical edges emanating from some blocks, and tried to perform code motion (actually speculation) in the presence of these critical edges; but later on he/she changed his/her mind and decided to perform edge-splitting first. llvm-svn: 180951	2013-05-02 21:14:31 +00:00
Filip Pizlo	dd62846c56	This patch breaks up Wrap.h so that it does not have to include all of the things, and renames it to CBindingWrapping.h. I also moved CBindingWrapping.h into Support/. This new file just contains the macros for defining different wrap/unwrap methods. The calls to those macros, as well as any custom wrap/unwrap definitions (like for array of Values for example), are put into corresponding C++ headers. Doing this required some #include surgery, since some .cpp files relied on the fact that including Wrap.h implicitly caused the inclusion of a bunch of other things. This also now means that the C++ headers will include their corresponding C API headers; for example Value.h must include llvm-c/Core.h. I think this is harmless, since the C API headers contain just external function declarations and some C types, so I don't believe there should be any nasty dependency issues here. llvm-svn: 180881	2013-05-01 20:59:00 +00:00
Nadav Rotem	c0309431a1	SROA: Generate selects instead of shuffles when blending values because this is the canonical form. Shuffles are more difficult to lower and we usually don't touch them, while we do optimize selects more often. llvm-svn: 180875	2013-05-01 19:53:30 +00:00
Shuxin Yang	cb9d06c59b	Fix a XOR reassociation bug. When Reassociator optimizes "(x | C1)" ^ "(X & C2)", it may swap the two subexpressions, however, it forgot to swap cached constants (of C1 and C2) accordingly. rdar://13739160 llvm-svn: 180676	2013-04-27 18:02:12 +00:00
Eric Christopher	beec5d09da	Move C++ code out of the C headers and into either C++ headers or the C++ files themselves. This enables people to use just a C compiler to interoperate with LLVM. llvm-svn: 180063	2013-04-22 22:47:22 +00:00
Rafael Espindola	88a7961c64	Clarify that llvm.used can contain aliases. Also add a check for llvm.used in the verifier and simplify clients now that they can assume they have a ConstantArray. llvm-svn: 180019	2013-04-22 14:58:02 +00:00
Benjamin Kramer	47f18d3da1	SROA: Don't crash on a select with two identical operands. This is an edge case that can happen if we modify a chain of multiple selects. Update all operands in that case and remove the assert. PR15805. llvm-svn: 179982	2013-04-21 17:48:39 +00:00
Chris Lattner	a5d4b30d60	Fix a comment, PR15777. llvm-svn: 179775	2013-04-18 17:42:14 +00:00
Jim Grosbach	1b2d956f84	Fix a typo in comment. llvm-svn: 179542	2013-04-15 17:40:48 +00:00
Shuxin Yang	cc126626e3	Redo the fix Benjamin Kramer committed in r178793 about iterator invalidation in Reassociate. I brazenly think this change is slightly simpler than r178793 because: - no "state" in functor - "OpndPtrs[i]" looks simpler than "&Opnds[OpndIndices[i]]" While I can reproduce the problem in Valgrind, it is rather difficult to come up with a standalone testing case. The reason is that when an iterator is invalidated, the stale invalidated elements are not yet clobbered by nonsense data, so the optimizer can still proceed successfully. Thank Benjamin for fixing this bug and generously providing the test case. llvm-svn: 179062	2013-04-08 22:00:43 +00:00
Chandler Carruth	8d04726f54	Fix PR15674 (and PR15603): a SROA think-o. The fix for PR14972 in r177055 introduced a real think-o in the store side, likely because I was much more focused on the load side. While we can arbitrarily widen (or narrow) a loaded value, we can't arbitrarily widen a value to be stored, as that changes the width of memory access! Lock down the code path in the store rewriting which would do this to only handle the intended circumstance. All of the existing tests continue to pass, and I've added a test from the PR. llvm-svn: 178974	2013-04-07 11:47:54 +00:00
Shuxin Yang	5cf388a00f	Disable the optimization about promoting vector-element-access with symbolic index. This optimization is unstable at this moment; it 1) blocks us on a very important application 2) PR15200 3) test6 and test7 in test/Transforms/ScalarRepl/dynamic-vector-gep.ll (the CHECK command compares the output against the wrong result) I personally believe this optimization should not have any impact on the autovectorized code, as auto-vectorizer is supposed to put gather/scatter in a "right" way. Although in theory downstream optimizers might reveal some gather/scatter optimization opportunities, the chance is quite slim. For the hand-crafted vectorizing code, in terms of redundancy elimination, load-CSE, copy-propagation and DSE can collectively achieve the same result, but in a much simpler way. On the other hand, these optimizers are able to improve the code in an incremental way; in contrast, SROA is sort of an all-or-none approach. However, SROA might slightly win in stack size, as it tries to figure out a stretch of memory tightly covering the area accessed by the dynamic index. rdar://13174884 PR15200 llvm-svn: 178912	2013-04-05 21:07:08 +00:00
Benjamin Kramer	d4c69ec04b	Reassociate: Avoid iterator invalidation. OpndPtrs stored pointers into the Opnd vector that became invalid when the vector grows. Store indices instead. Sadly I only have a large testcase that only triggers under valgrind, so I didn't include it. llvm-svn: 178793	2013-04-04 21:15:42 +00:00
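The hazard named in this commit message, pointers into a vector going stale when the vector grows, and the index-based fix can be sketched with `std::vector`. This is a minimal standalone illustration, not the Reassociate code: `sumViaIndices` and the local variable names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Remembers operands by index rather than by pointer, so the references
// stay valid even when the underlying vector reallocates.
int sumViaIndices() {
  std::vector<int> opnds = {1, 2, 3};

  // Storing &opnds[0] and &opnds[2] here would dangle after the growth
  // below; positions within the vector do not change on reallocation.
  std::vector<std::size_t> opndIndices = {0, 2};

  // Grow the vector enough that a reallocation is all but certain.
  for (int i = 0; i < 1000; ++i)
    opnds.push_back(0);

  int sum = 0;
  for (std::size_t idx : opndIndices)
    sum += opnds[idx]; // re-resolve through the vector on every access
  return sum;
}
```

The trade-off is an extra indexing step on each access in exchange for references that survive growth.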
Shuxin Yang	74f54ae4b2	Correct assertion condition llvm-svn: 178484	2013-04-01 18:13:05 +00:00
Shuxin Yang	c53fc5dc4c	Implement XOR reassociation. It is based on following rules: rule 1: (x | c1) ^ c2 => (x & ~c1) ^ (c1^c2), only useful when c1=c2 rule 2: (x & c1) ^ (x & c2) = (x & (c1^c2)) rule 3: (x | c1) ^ (x | c2) = (x & c3) ^ c3 where c3 = c1 ^ c2 rule 4: (x | c1) ^ (x & c2) => (x & c3) ^ c1, where c3 = ~c1 ^ c2 It reduces an application's size (in terms of # of instructions) by 8.9%. Reviewed by Pete Cooper. Thanks a lot! rdar://13212115 llvm-svn: 178409	2013-03-30 02:15:01 +00:00
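The four rewrite rules in this commit message are plain bitwise identities, so they can be checked by brute force. The sketch below tests every combination of 8-bit values for x, c1, and c2; the function name `xorRulesHoldFor8Bit` is hypothetical and the check is independent of the Reassociate implementation.

```cpp
#include <cassert>
#include <cstdint>

// Exhaustively checks the four XOR reassociation rules over all 8-bit
// values of x, c1 and c2. Returns false on the first counterexample.
bool xorRulesHoldFor8Bit() {
  for (std::uint32_t x = 0; x < 256; ++x) {
    for (std::uint32_t c1 = 0; c1 < 256; ++c1) {
      for (std::uint32_t c2 = 0; c2 < 256; ++c2) {
        // rule 1: (x | c1) ^ c2 == (x & ~c1) ^ (c1 ^ c2)
        if (((x | c1) ^ c2) != ((x & ~c1) ^ (c1 ^ c2)))
          return false;
        // rule 2: (x & c1) ^ (x & c2) == x & (c1 ^ c2)
        if (((x & c1) ^ (x & c2)) != (x & (c1 ^ c2)))
          return false;
        // rule 3: (x | c1) ^ (x | c2) == (x & c3) ^ c3, with c3 = c1 ^ c2
        const std::uint32_t c3 = c1 ^ c2;
        if (((x | c1) ^ (x | c2)) != ((x & c3) ^ c3))
          return false;
        // rule 4: (x | c1) ^ (x & c2) == (x & c4) ^ c1, with c4 = ~c1 ^ c2
        const std::uint32_t c4 = ~c1 ^ c2;
        if (((x | c1) ^ (x & c2)) != ((x & c4) ^ c1))
          return false;
      }
    }
  }
  return true;
}
```

Rule 1 follows from `x | c1 == (x & ~c1) ^ c1`, since the two terms share no set bits; rules 3 and 4 follow by applying that same substitution and then rule 2.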
Jakub Staszak	760ea04733	Minor cleanups. No functionality change. llvm-svn: 177837	2013-03-24 09:56:28 +00:00
Jakub Staszak	8c92d0d919	Use dyn_cast instead of isa && cast. No functionality change. llvm-svn: 177836	2013-03-24 09:25:47 +00:00
Chandler Carruth	5dfc3ade1f	[SROA] Prefix names using a custom IRBuilder inserter. The key part of this is ensuring that name prefixes remain in a Twine form until we get to a point where we can nuke them under NDEBUG. This is tricky using the old APIs as they played fast and loose with Twine, which is prone to serious error. The inserter is much cleaner as it is actually in the call stack leading to the setName call, and so has a good opportunity to prepend the prefix. This matters more than you might imagine because most runs over an alloca find a single partition, and rewrite 3 or 4 instructions referring to it. As a consequence doing this lazily and exclusively with Twine allows the optimizer to delete more of it and shaves another 2% to 3% off of the release build's SROA run time for PR15412. I also think the APIs are cleaner, and the use of Twine is more reliable, so I consider it a win-win despite the churn required to reach this state. llvm-svn: 177631	2013-03-21 09:52:18 +00:00
Meador Inge	8c4638bcc3	simplify-libcalls: Removed unused variable The 'Modified' variable should have been removed from SimplifyLibCalls in r177619, but was missed. This commit removes it. llvm-svn: 177622	2013-03-21 02:44:07 +00:00
Meador Inge	30024047b3	Move library call prototype attribute inference to functionattrs The simplify-libcalls pass implemented a doInitialization hook to infer function prototype attributes for well-known functions. Given that the simplify-libcalls pass is going away and that the functionattrs pass is already in place to deduce function attributes, I am moving this logic to the functionattrs pass. This approach was discussed during patch review: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20121126/157465.html. llvm-svn: 177619	2013-03-21 00:55:59 +00:00
Chandler Carruth	6f1e6bc2dc	Fix a silly search-and-replace goof with r177495 that only broke non-release builds. llvm-svn: 177498	2013-03-20 07:40:56 +00:00
Chandler Carruth	13fa287d63	[SROA] Don't preserve the IR names in release builds. This is especially important because the new SROA pass goes to great lengths to provide helpful names for debugging, and as a consequence they can become very slow to render. Good for between 5% and 15% of the SROA runtime on some slow test cases such as the one in PR15412. llvm-svn: 177495	2013-03-20 07:30:36 +00:00
Chandler Carruth	16617f6650	Move the endif to the correct line so we don't have warnings about unused statistics variables. llvm-svn: 177494	2013-03-20 06:47:00 +00:00
Chandler Carruth	9248b3ae59	Introduce some new statistics to help track the exact behavior of the new SROA pass. llvm-svn: 177493	2013-03-20 06:30:46 +00:00
Quentin Colombet	268e28a41d	Update global merge pass according to Duncan's advices: - Remove useless includes - Change misleading comments - Move code into doFinalization llvm-svn: 177445	2013-03-19 21:46:49 +00:00
Arnaud A. de Grandmaison	092ac21f4f	IndVarSimplify: do not recompute an IV value outside of the loop if : - it is trivially known to be used inside the loop in a way that can not be optimized away - there is no use outside of the loop which can take advantage of the computation hoisting llvm-svn: 177432	2013-03-19 20:00:22 +00:00
Andrew Trick	2fcea6b47a	Revert "Cleanup some SCEV logic a bit." This reverts commit 82cd8f7382322bee7a71cdc31f7a923c44d37d32. Just add a comment instead! llvm-svn: 177377	2013-03-19 05:10:27 +00:00
Andrew Trick	256f5077d9	Cleanup some SCEV logic a bit. Make the code more obvious to scan-build and humans. llvm-svn: 177375	2013-03-19 04:14:59 +00:00
Andrew Trick	0dd5df1889	Tighten up an internal LSR API that should check for NULL. No test case, but should fix a scan_build warning. llvm-svn: 177374	2013-03-19 04:14:57 +00:00
Jakub Staszak	89b78a9580	Make method private. Keep coding standard. llvm-svn: 177348	2013-03-18 23:31:30 +00:00
Quentin Colombet	bb36556d97	Extend global merge pass to optionally consider global constant variables. Also add some checks to not merge globals used within landing pad instructions or marked as "used". llvm-svn: 177331	2013-03-18 22:30:07 +00:00
Chandler Carruth	1d83f79b3d	Mark internal classes as POD-like to get better behavior out of SmallVector and DenseMap. This speeds up SROA by 25% on PR15412. llvm-svn: 177259	2013-03-18 08:36:46 +00:00
Chandler Carruth	3d9eacc90b	PR14972: SROA vs. GVN exposed a really bad bug in SROA. The fundamental problem is that SROA didn't allow for overly wide loads where the bits past the end of the alloca were masked away and the load was sufficiently aligned to ensure there is no risk of page fault, or other trapping behavior. With such widened loads, SROA would delete the load entirely rather than clamping it to the size of the alloca in order to allow mem2reg to fire. This was exposed by a test case that neatly arranged for GVN to run first, widening certain loads, followed by an inline step, and then SROA which miscompiles the code. However, I see no reason why this hasn't been plaguing us in other contexts. It seems deeply broken. Diagnosing all of the above took all of 10 minutes of debugging. The really annoying aspect is that fixing this completely breaks the pass. ;] There was an implicit reliance on the fact that no loads or stores extended past the alloca once we decided to rewrite them in the final stage of SROA. This was used to encode information about whether the loads and stores had been split across multiple partitions of the original alloca. That required threading explicit tracking of whether a use of a partition is split across multiple partitions. Once that was done, another problem arose: we allowed splitting of integer loads and stores iff they were loads and stores to the entire alloca. This is a really arbitrary limitation, and splitting at least some integer loads and stores is crucial to maximize promotion opportunities. My first attempt was to start removing the restriction entirely, but currently that does Very Bad Things by causing many common alloca patterns to be fully decomposed into i8 operations and lots of or-ing together to produce larger integers on demand. The code bloat is terrifying. 
That is still the right end-goal, but substantial work must be done to either merge partitions or ensure that small i8 values are eagerly merged in some other pass. Sadly, figuring all this out took essentially all the time and effort here. So the end result is that we allow splitting only when the load or store at least covers the alloca. That ensures widened loads and stores don't hurt SROA, and that we don't rampantly decompose operations more than we have previously. All of this was already fairly well tested, and so I've just updated the tests to cover the wide load behavior. I can add a test that crafts the pass ordering magic which caused the original PR, but that seems really brittle and to provide little benefit. The fundamental problem is that widened loads should Just Work. llvm-svn: 177055	2013-03-14 11:32:24 +00:00
Dan Gohman	8d664849a4	Change the order of the operands in patchAndReplaceAllUsesWith so that they're more consistent with Value::replaceAllUsesWith. llvm-svn: 176872	2013-03-12 16:22:56 +00:00
Jakub Staszak	88061183a0	Keep coding standard. llvm-svn: 176661	2013-03-07 22:20:06 +00:00
Jakub Staszak	d94b12a75b	Don't create IRBuilder if we can return from the method earlier. llvm-svn: 176660	2013-03-07 22:10:33 +00:00
Preston Gurd	66b9c4fcf9	Bypass Slow Divides * Only apply divide bypass optimization when not optimizing for size. * Fixed bug caused by constant for 0 value of type Int32, used dividend type to generate the constant instead. * For atom x86-64 apply the divide bypass to use 16-bit divides instead of 64-bit divides when operand values are small enough. * Added lit tests for 64-bit divide bypass. Patch by Tyler Nowicki! llvm-svn: 176442	2013-03-04 18:13:57 +00:00
Benjamin Kramer	caa5d4f15d	CVP: If we have a PHI with an incoming select, try to skip the select. This is a common pattern with dyn_cast and similar constructs, when the PHI no longer depends on the select it can often be turned into a simpler construct or even get hoisted out of the loop. PR15340. llvm-svn: 175995	2013-02-24 15:34:43 +00:00
Bill Wendling	eecb534c87	Implement the NoBuiltin attribute. The 'nobuiltin' attribute is applied to call sites to indicate that LLVM should not treat the callee function as a built-in function. I.e., it shouldn't try to replace that function with different code. llvm-svn: 175835	2013-02-22 00:12:35 +00:00
Chad Rosier	d3dde59198	Remove dead code and whitespace. llvm-svn: 175804	2013-02-21 21:40:51 +00:00
Chad Rosier	60f8f47d44	Update a comment that looks to have been accidentally deleted many moons ago. llvm-svn: 175658	2013-02-20 20:15:55 +00:00
Jakub Staszak	06b3cdcb51	Remove unused variable. llvm-svn: 175568	2013-02-19 22:17:58 +00:00
Jakub Staszak	2cbda8564a	Minor cleanups. No functionality change. llvm-svn: 175567	2013-02-19 22:14:45 +00:00
Jakub Staszak	c60273ce06	Remove unneeded #includes. llvm-svn: 175565	2013-02-19 22:06:38 +00:00
Jakub Staszak	3b29b5aba1	Fix typos. llvm-svn: 175562	2013-02-19 22:02:21 +00:00
Jakub Staszak	18a6467836	Reduce indents in LSRInstance::NarrowSearchSpaceByCollapsingUnrolledCode method. No functionality change. llvm-svn: 175364	2013-02-16 16:08:15 +00:00
Dan Gohman	1ea13bd49f	Actually delete this code, since it's really not clear what it's trying to do. llvm-svn: 175014	2013-02-12 22:26:41 +00:00
Dan Gohman	a04e614c0f	Record PRE predecessors with a SmallVector instead of a DenseMap, and avoid a second pred_iterator traversal. llvm-svn: 175001	2013-02-12 19:49:10 +00:00
Dan Gohman	79869cd16c	When disabling PRE for a value is directly redundant with itself (through a loop), don't continue to iterate through the remaining predecessors. llvm-svn: 174994	2013-02-12 19:05:10 +00:00
Dan Gohman	6815bf69b1	Check that pointers are removed from maps before calling delete on the pointers, for tidiness' sake. llvm-svn: 174988	2013-02-12 18:44:43 +00:00
Dan Gohman	69f4c8b640	Minor code simplification. llvm-svn: 174985	2013-02-12 18:38:36 +00:00
Andrew Trick	b2b2884c68	LSR IVChain improvement. Handle chains in which the same offset is used for both loads and stores to the same array. Fixes rdar://11410078. llvm-svn: 174789	2013-02-09 01:11:01 +00:00
Jakub Staszak	ac504d22d0	Remove #includes from the commonly used LoopInfo.h. llvm-svn: 174786	2013-02-09 01:04:28 +00:00
Preston Gurd	fd12deb433	This patch aims to improve compile time performance by increasing the SCEV vector size in LoopStrengthReduce. It is observed that the BaseRegs vector size is 4 in most cases, and elements are frequently copied when it is initialized as SmallVector<const SCEV *, 2> BaseRegs. Our benchmark results show that the compilation time performance improved by ~0.5%. Patch by Wan Xiaofei. llvm-svn: 174219	2013-02-01 20:41:27 +00:00
Dan Gohman	7eac0c2694	Change GetPointerBaseWithConstantOffset's DataLayout argument from a reference to a pointer, so that it can handle the case where DataLayout is not available and behave conservatively. llvm-svn: 174024	2013-01-31 02:00:45 +00:00
Edwin Vane	fafd787d1f	Fixing warnings revealed by gcc release build Fixed set-but-not-used warnings. Reviewer: gribozavr llvm-svn: 173810	2013-01-29 17:42:24 +00:00
Michael Gottesman	3d8ed99b1f	Extracted ObjCARC.cpp into its own library libLLVMObjCARCOpts in preparation for refactoring the ARC Optimizer. llvm-svn: 173647	2013-01-28 01:35:51 +00:00
Michael Gottesman	307107e1ea	Renamed function IsPotentialUse to IsPotentialRetainableObjPtr. This name change does the following: 1. Causes the function name to use proper ARC terminology. 2. Makes it clear what the function truly does. llvm-svn: 173609	2013-01-27 06:19:48 +00:00
Michael Gottesman	fd9cebe07d	Added comment to ObjCARC elaborating what is meant by the term 'Provenance' in 'Provenance Analysis'. llvm-svn: 173374	2013-01-24 21:35:00 +00:00
Michael Gottesman	1b1c476a4c	Fixed typo. llvm-svn: 173202	2013-01-22 21:53:43 +00:00
Michael Gottesman	8eb9360a37	[ObjCARC] Refactored out the innermost 2-loops from PerformCodePlacement into the method ConnectTDBUTraversals. The method PerformCodePlacement was doing too much (i.e. 3x loops, lots of different checking). This refactoring separates the analysis section of the method into a separate function while leaving the actual code placement and analysis preparation in PerformCodePlacement. NOTE Really this part of ObjCARC should be refactored out of the main pass class into its own separate class/struct. But it is not time to make that change yet (don't want to make such an invasive change without fixing all of the bugs first). llvm-svn: 173201	2013-01-22 21:49:00 +00:00
Bill Wendling	82711f5390	More encapsulation work. Use the AttributeSet when we're talking about more than one attribute. Add a function that adds a single attribute. No functionality change intended. llvm-svn: 173196	2013-01-22 21:15:51 +00:00
Chandler Carruth	e7f6a7e82e	Begin fleshing out an interface in TTI for modelling the costs of generic function calls and intrinsics. This is somewhat overlapping with an existing intrinsic cost method, but that one seems targeted at vector intrinsics. I'll merge them or separate their names and use cases in a separate commit. This sinks the test of 'callIsSmall' down into TTI where targets can control it. The whole thing feels very hack-ish to me though. I've left a FIXME comment about the fundamental design problem this presents. It isn't yet clear to me what the users of this function really care about. I'll have to do more analysis to figure that out. Putting this here at least provides it access to proper analysis pass tools and other such. It also allows us to more cleanly implement the baseline cost interfaces in TTI. With this commit, it is now theoretically possible to simplify much of the inline cost analysis's handling of calls by calling through to this interface. That conversion will have to happen in subsequent commits as it requires more extensive restructuring of the inline cost analysis. The CodeMetrics class is now really only in the business of running over a block of code and aggregating the metrics on that block of code, with the actual cost evaluation done entirely in terms of TTI. llvm-svn: 173148	2013-01-22 11:26:02 +00:00
Chandler Carruth	632fcee01a	Switch CodeMetrics itself over to use TTI to determine if an instruction is free. The whole CodeMetrics API should probably be reworked more, but this is enough to allow deleting the duplicate code there for computing whether an instruction is free. All of the passes using this have been updated to pull in TTI and hand it to the CodeMetrics stuff. Further, a dead CodeMetrics API (analyzeFunction) is nuked for lack of users. llvm-svn: 173036	2013-01-21 13:04:33 +00:00
Michael Gottesman	e179dec59c	Improved comment. llvm-svn: 172864	2013-01-18 23:02:45 +00:00
Michael Gottesman	ed41df19ff	Fixed typo in comment. llvm-svn: 172863	2013-01-18 23:00:33 +00:00
Bill Wendling	7777bbbbf3	Use AttributeSet accessor methods instead of Attribute accessor methods. Further encapsulation of the Attribute object. Don't allow direct access to the Attribute object as an aggregate. llvm-svn: 172853	2013-01-18 21:53:16 +00:00
Benjamin Kramer	3fc02ad019	Silence GCC warning about dropping off a non-void function. llvm-svn: 172839	2013-01-18 19:45:22 +00:00
Michael Gottesman	51c56a28c1	Fixed 80+ violation. llvm-svn: 172782	2013-01-18 03:08:39 +00:00
Michael Gottesman	0f96f3f655	Added missing const from my last commit. llvm-svn: 172736	2013-01-17 18:36:17 +00:00
Michael Gottesman	54242440aa	[ObjCARC] Implemented operator<< for InstructionClass and changed a ``Visited'' Debug message to use it. llvm-svn: 172735	2013-01-17 18:32:34 +00:00
Michael Gottesman	56275116f2	[ObjCARC] Turn off ignoring unwind edges in ObjCARC when -fno-objc-arc-exception is enabled due to its effect on correctness. Specifically according to the semantics of ARC -fno-objc-arc-exception simply states that it is expected that the unwind path out of a call MAY not release objects. Thus we can have the situation where a release gets moved into a catch block which we ignore when we remove a retain/release pair resulting in (even though we assume the program is exiting anyways) the cleanup code path potentially blowing up before program exit. llvm-svn: 172599	2013-01-16 06:32:39 +00:00
Michael Gottesman	58db6d1cd7	Changed SmallPtrSet.count guard + SmallPtrSet.insert to just SmallPtrSet.insert. llvm-svn: 172452	2013-01-14 19:18:39 +00:00
Michael Gottesman	aabad66a3a	Fixed some 80+ violations. llvm-svn: 172374	2013-01-14 01:47:53 +00:00
Michael Gottesman	0ff0eb0f71	Updated the documentation in ObjCARC.cpp to fit the style guide better (i.e. use doxygen). Still some work to do though. llvm-svn: 172371	2013-01-14 00:35:14 +00:00
Michael Gottesman	c6a7902080	Fixed an infinite loop in the block escape analysis in ObjCARC caused by 2x blocks each assigned a value via a phi-node causing each to depend on the other. A test case is provided as well. llvm-svn: 172368	2013-01-13 22:12:06 +00:00
Michael Gottesman	1f32fece1e	[ObjCARC] Even more debug messages! llvm-svn: 172347	2013-01-13 07:47:32 +00:00
Michael Gottesman	ffdbdc6957	[ObjCARC] More debug messages. llvm-svn: 172346	2013-01-13 07:00:51 +00:00
Chandler Carruth	a094848fcc	Fix an editor goof in r171738 that Bill spotted. He may even have a test case, but looking at the diff this was an obviously unintended change. Thanks for the careful review Bill! =] llvm-svn: 172336	2013-01-12 23:46:04 +00:00
Michael Gottesman	ea2af5c74a	Fixed debug message in ObjCARC. llvm-svn: 172299	2013-01-12 03:45:49 +00:00
Michael Gottesman	a00a81d4cc	Fixed a few debug messages in ObjCARC and added one. llvm-svn: 172298	2013-01-12 02:57:16 +00:00
Michael Gottesman	822cc01174	Fixed bug in ObjCARC where we were changing a call from objc_autoreleaseRV => objc_autorelease but were not updating the InstructionClass to IC_Autorelease. llvm-svn: 172288	2013-01-12 01:25:19 +00:00
Michael Gottesman	c59f1a814a	Fixed a bug where we were tail calling objc_autorelease causing an object to not be placed into an autorelease pool. The reason that this occurs is that tail calling objc_autorelease eventually tail calls -[NSObject autorelease] which supports fast autorelease. This can cause us to violate the semantic guarantees of __autoreleasing variables that assignment to an __autoreleasing variable always yields an object that is placed into the innermost autorelease pool. The fix included in this patch works by: 1. In the peephole optimization function OptimizeIndividualFunctions, always remove tail call from objc_autorelease. 2. Whenever we convert to/from an objc_autorelease, set/unset the tail call keyword as appropriate. NOTE I also handled the case where objc_autorelease is converted in OptimizeReturns to an autoreleaseRV which still violates the ARC semantics. I will be removing that in a later patch and I wanted to make sure that the tree is in a consistent state vis-a-vis ARC always. Additionally some test cases are provided and all tests that have tail call marked objc_autorelease keywords have been modified so that tail call has been removed. NOTE One test fails due to a separate bug that I am going to commit soon. Thus I marked the check line TMP: instead of CHECK: so make check does not fail. llvm-svn: 172287	2013-01-12 01:25:15 +00:00
Shuxin Yang	5df00cb3b4	PR14904: Segmentation fault running pass 'Recognize loop idioms' The root cause is mistakenly taking for granted that "dyn_cast<Instruction>(a-Value)" returns a non-NULL instruction. llvm-svn: 172145	2013-01-10 23:32:01 +00:00
Michael Gottesman	0b51c8346d	[ObjCARC Debug Message] Added debug message when we convert an autorelease into an autoreleaseRV. llvm-svn: 172034	2013-01-10 02:03:50 +00:00
Michael Gottesman	4f1ddbf0fc	[ObjCARC Debug Messages] This is a squashed commit of 3x debug message commits ala echristo's suggestion. 1. Added debug messages when in OptimizeIndividualCalls we move calls into predecessors and then erase the original call. 2. Added debug messages when in the process of moving calls in ObjCARCOpt::MoveCalls we create new RR and delete old RR. 3. Added a debug message when we visit a specific retain instruction in ObjCARCOpt::PerformCodePlacement. llvm-svn: 171988	2013-01-09 19:23:24 +00:00
Benjamin Kramer	953e24a567	LICM: Hoist insertvalue/extractvalue out of loops. Fixes PR14854. llvm-svn: 171984	2013-01-09 18:12:03 +00:00
Michael Gottesman	75a619b8c2	Fixed EOL whitespace. llvm-svn: 171791	2013-01-07 21:26:07 +00:00
Chandler Carruth	b8c9b84572	Sink AddrMode back into TargetLowering, removing one of the most peculiar headers under include/llvm. This struct still doesn't make a lot of sense, but it makes more sense down in TargetLowering than it did before. llvm-svn: 171739	2013-01-07 15:14:13 +00:00
Chandler Carruth	2c7e0782e3	Remove LSR's use of the random AddrMode struct. These variables were already in a class, just inline the four of them. I suspect that this class could be simplified some to not always keep distinct variables for these things, but it wasn't clear to me how given the usage so I opted for a trivial and mechanical translation. This removes one of the two remaining users of a header in include/llvm which does nothing more than define a 4 member struct. llvm-svn: 171738	2013-01-07 15:04:40 +00:00
Chandler Carruth	0e84971ae2	Switch the SCEV expander and LoopStrengthReduce to use TargetTransformInfo rather than TargetLowering, removing one of the primary instances of the layering violation of Transforms depending directly on Target. This is a really big deal because LSR used to be a "special" pass that could only be tested fully using llc and by looking at the full output of it. It also couldn't run with any other loop passes because it had to be created by the backend. No longer is this true. LSR is now just a normal pass and we should probably lift the creation of LSR out of lib/CodeGen/Passes.cpp and into the PassManagerBuilder. =] I've not done this, or updated all of the tests to use opt and a triple, because I suspect someone more familiar with LSR would do a better job. This change should be essentially without functional impact for normal compilations, and only change behavior of targetless compilations. The conversion required changing all of the LSR code to refer to the TTI interfaces, which fortunately are very similar to TargetLowering's interfaces. However, it also allowed us to always expect to have some implementation around. I've pushed that simplification through the pass, and leveraged it to simplify code somewhat. It required some test updates for one of two things: either we used to skip some checks altogether but now we get the default "no" answer for them, or we used to have no information about the target and now we do have some. I've also started the process of removing AddrMode, as the TTI interface doesn't use it any longer. In some cases this simplifies code, and in others it adds some complexity, but I think it's not a bad tradeoff even there. Subsequent patches will try to clean this up even further and use other (more appropriate) abstractions. Yet again, almost all of the formatting changes brought to you by clang-format. =] llvm-svn: 171735	2013-01-07 14:41:08 +00:00
Silviu Baranga	0aae14e2d4	Make the MergeGlobals pass correctly handle the address space qualifiers of the global variables. We partition the set of globals by their address space, and apply the same transformation as before to merge them. llvm-svn: 171730	2013-01-07 12:31:25 +00:00
Chandler Carruth	f82e5f38d9	Switch LoopIdiom pass to directly require target transform information. I'm sorry for duplicating bad style here, but I wanted to keep consistency. I've pinged the code review thread where this style was reviewed and changes were requested. llvm-svn: 171714	2013-01-07 09:17:41 +00:00
Chandler Carruth	f8bff2bea0	Make SimplifyCFG simply depend upon TargetTransformInfo and pass it through as a reference rather than a pointer. There is always some implementation of this available, so this simplifies code by not having to test for whether it is available or not. Further, it turns out there were piles of places where SimplifyCFG was recursing and not passing down either TD or TTI. These are fixed to be more pedantically consistent even though I don't have any particular cases where it would matter. llvm-svn: 171691	2013-01-07 03:53:25 +00:00
Chandler Carruth	601fa4e996	Make the popcnt support enums and methods have clearer names and follow the coding conventions regarding enumerating a set of "kinds" of things. llvm-svn: 171687	2013-01-07 03:16:03 +00:00
Chandler Carruth	3c0f5d4efb	Move TargetTransformInfo to live under the Analysis library. This no longer would violate any dependency layering and it is in fact an analysis. =] llvm-svn: 171686	2013-01-07 03:08:10 +00:00
Michael Gottesman	4a300ff5a0	[ObjCARC Debug Message] - Added debug message when we fuse a retain/autorelease pair in ObjCARCContract::ContractAutorelease. llvm-svn: 171679	2013-01-07 00:31:26 +00:00
Michael Gottesman	4a96ee6b2a	[ObjCARC Debug Message] - Added debug message when we zap a matching retain/autorelease pair in ObjCARCOpt::OptimizeReturns. llvm-svn: 171678	2013-01-07 00:04:56 +00:00
Michael Gottesman	9c90fe632e	[ObjCARC Debug Message] - Added debug message when we erase ARC calls with null since they are no-ops. llvm-svn: 171677	2013-01-07 00:04:52 +00:00
Michael Gottesman	abee5cf395	[ObjCARC Debug Message] - Added debug message when we add a nounwind keyword to a function which can not throw. llvm-svn: 171676	2013-01-06 23:39:13 +00:00
Michael Gottesman	49f71df55e	[ObjCARC Debug Message] - Added debug message when we add a tail keyword to a function which can never be passed stack args. llvm-svn: 171675	2013-01-06 23:39:09 +00:00
Michael Gottesman	6a2ae38c56	[ObjCARC Debug Messages] - Added missing newline. llvm-svn: 171674	2013-01-06 22:56:54 +00:00
Michael Gottesman	44c08c9584	Added debug statement to ObjCARC when we replace objc_autorelease(x) with objc_release(x) when x is otherwise unused. llvm-svn: 171673	2013-01-06 22:56:50 +00:00
Michael Gottesman	023ea4f317	Added 2x Debug statements to ObjCARC that log when we handle the two undefined pointer-to-weak-pointer is NULL cases by replacing the given call inst with an undefined value. The reason that there are two cases is that the first case handles the unary cases and the second the binary cases. llvm-svn: 171672	2013-01-06 21:54:30 +00:00
Michael Gottesman	c862b0cd55	Added debug message in ObjCARC when we remove a no-op cast which has only special semantic meaning in the frontend and thus in the optimizer can be deleted. llvm-svn: 171670	2013-01-06 21:07:15 +00:00
Michael Gottesman	8107797e22	Added debug message to ObjCARC when we transform an objc_autoreleaseReturnValue => objc_autorelease due to its operand not being used as a return value. llvm-svn: 171669	2013-01-06 21:07:11 +00:00
Andrew Trick	8c90c2b2d7	Fix a crash in LSR replaceCongruentIVs. Indirect branch in the preheader crashes replaceCongruentIVs. Fixes rdar://12910141. llvm-svn: 171653	2013-01-06 05:59:39 +00:00
Michael Gottesman	b08d13fa04	Added debug message to ObjCARC when we transform objc_retainAutoreleasedReturnValue => objc_retain since the operand to said function is not a return value. llvm-svn: 171629	2013-01-05 17:55:42 +00:00
Michael Gottesman	3f22b59b75	Added debug message for ObjCARC when we zap an objc_autoreleaseReturnValue/objc_retainAutoreleasedValue pair. llvm-svn: 171628	2013-01-05 17:55:35 +00:00
Chris Lattner	561e5f6442	switch from pointer equality comparison to MDNode::getMostGenericTBAA when merging two TBAA tags, pointed out by Nuno. llvm-svn: 171627	2013-01-05 16:44:07 +00:00
Chandler Carruth	c37f873121	Switch LoopIdiomRecognize to directly use the TargetTransformInfo interface rather than the ScalarTargetTransformInterface. llvm-svn: 171616	2013-01-05 10:00:09 +00:00
Chandler Carruth	413d8d63a5	Sink the AddressingModeMatcher helper class into an anonymous namespace next to its only user. This helper relies on TargetLowering information that shouldn't be generally used throughout the Transforms library, and so it made little sense as a generic utility. This also consolidates the file where we need to remove the remaining uses of TargetLowering in favor of the IR-layer abstract interface in TargetTransformInfo. llvm-svn: 171590	2013-01-05 02:09:22 +00:00
Michael Gottesman	c63800aa94	Added DEBUG message to ObjCARC when we optimize objc_retain => objc_retainAutoreleasedReturnValue. llvm-svn: 171535	2013-01-04 21:30:38 +00:00
Michael Gottesman	0f781a1a5c	Fixed up some DEBUG messages where I was putting in the text of a message the method where it was being called when I should have just prefixed the actual message with Pass::Method. Additionally I fixed some whitespace issues. llvm-svn: 171534	2013-01-04 21:29:57 +00:00
Michael Gottesman	a67b835abe	Changed two debug statements that state that a queue had finished being processed when said queue was really a list to state a list had finished being processed. llvm-svn: 171465	2013-01-03 08:09:27 +00:00
Michael Gottesman	27dfca0765	Added DEBUG message for ObjCARC when we zap a push/pop pair in ObjCARCAPElim::OptimizeBB. llvm-svn: 171464	2013-01-03 08:09:17 +00:00
Michael Gottesman	5c76a62ee6	Added DEBUG message to ObjCARC when we transform objc_initWeak(p, null) => *p = null. llvm-svn: 171463	2013-01-03 07:32:53 +00:00
Michael Gottesman	fcffa87a15	Added DEBUG message for ObjCARC when an inline asm marker is inserted for architectures where this is required to perform a retainAutoreleasedReturnValue optimization. llvm-svn: 171462	2013-01-03 07:32:41 +00:00
Shuxin Yang	c985c304e2	- Add comment to two functions which might be considered as dead code. - Fix a typo llvm-svn: 171399	2013-01-02 18:26:31 +00:00
Chandler Carruth	4c1f3c24db	Move all of the header files which are involved in modelling the LLVM IR into their new header subdirectory: include/llvm/IR. This matches the directory structure of lib, and begins to correct a long standing point of file layout clutter in LLVM. There are still more header files to move here, but I wanted to handle them in separate commits to make tracking what files make sense at each layer easier. The only really questionable files here are the target intrinsic tablegen files. But that's a battle I'd rather not fight today. I've updated both CMake and Makefile build systems (I think, and my tests think, but I may have missed something). I've also re-sorted the includes throughout the project. I'll be committing updates to Clang, DragonEgg, and Polly momentarily. llvm-svn: 171366	2013-01-02 11:36:10 +00:00
Chandler Carruth	5f5c383ef1	Resort the #include lines in include/... and lib/... with the utils/sort_includes.py script. Most of these are updating the new R600 target and fixing up a few regressions that have crept in since the last time I sorted the includes. llvm-svn: 171362	2013-01-02 10:22:59 +00:00
Benjamin Kramer	ddae3440aa	Add IRBuilder::CreateVectorSplat and use it to simplify code. llvm-svn: 171349	2013-01-01 19:55:16 +00:00
Benjamin Kramer	6a165dfca3	SROA: Clean up unused assignment warnings from clang's analyzer. No functionality change. llvm-svn: 171348	2013-01-01 16:13:35 +00:00
Michael Gottesman	bb7c4c44f3	Added DEBUG message when ObjCARC replaces a call which returns its argument verbatim with its argument to temporarily undo an optimization. Specifically these calls return their argument verbatim, as a low-level optimization. However, this makes high-level optimizations harder. We undo any uses of this optimization that the front-end emitted. We redo them later in the contract pass. llvm-svn: 171346	2013-01-01 16:05:54 +00:00
Michael Gottesman	7d1661c22c	Added DEBUG messages to the top of several processing loops in ObjCARC.cpp that emit what instructions are being visited. This is a part of a larger effort of adding DEBUG messages to the ARC Optimizer Backend. llvm-svn: 171345	2013-01-01 16:05:48 +00:00
Chris Lattner	c9303c920d	Fix LICM's memory promotion optimization to preserve TBAA tags when promoting a store in a loop. This was noticed when working on PR14753, but isn't directly related. llvm-svn: 171281	2012-12-31 08:37:17 +00:00
Nuno Lopes	0873c9d511	convert a bunch of callers from DataLayout::getIndexedOffset() to GEP::accumulateConstantOffset(). The latter API is nicer than the former, and is correct regarding wrap-around offsets (if anyone cares). There are a few more places left with duplicated code, which I'll remove soon. llvm-svn: 171259	2012-12-30 16:25:48 +00:00
Bill Wendling	e0920e4122	Remove the Function::getFnAttributes method in favor of using the AttributeSet directly. This is in preparation for removing the use of the 'Attribute' class as a collection of attributes. That will shift to the AttributeSet class instead. llvm-svn: 171253	2012-12-30 10:32:01 +00:00
Evan Cheng	69cb91fa21	Every pass deserves a name, even codegenprep. llvm-svn: 170831	2012-12-21 01:48:14 +00:00
James Molloy	de926c367f	Add a new attribute, 'noduplicate'. If a function contains a noduplicate call, the call cannot be duplicated - Jump threading, loop unrolling, loop unswitching, and loop rotation are inhibited if they would duplicate the call. Similarly inlining of the function is inhibited, if that would duplicate the call (in particular inlining is still allowed when there is only one callsite and the function has internal linkage). llvm-svn: 170704	2012-12-20 16:04:27 +00:00
Bill Wendling	56d9c4b832	Rename the 'Attributes' class to 'Attribute'. It's going to represent a single attribute in the future. llvm-svn: 170502	2012-12-19 07:18:57 +00:00
Nadav Rotem	c22e8c34a7	SROA: Replace calls to getScalarSizeInBits to DataLayout's API because getScalarSizeInBits could not handle vectors of pointers. llvm-svn: 170412	2012-12-18 05:23:31 +00:00
Chandler Carruth	60738bca93	Fix another SROA crasher, PR14601. This was a silly oversight, we weren't pruning allocas which were used by variable-length memory intrinsics from the set that could be widened and promoted as integers. Fix that. llvm-svn: 170353	2012-12-17 18:48:07 +00:00
Chandler Carruth	91d886f61b	Teach the rewriting of memcpy calls to support subvector copies. This also cleans up a bit of the memcpy call rewriting by sinking some irrelevant code further down and making the call-emitting code a bit more concrete. Previously, memcpy of a subvector would actually miscompile (!!!) the copy into a single vector element copy. I have no idea how this ever worked. =/ This is the memcpy half of PR14478 which we probably weren't noticing previously because it didn't actually assert. The rewrite relies on the newly refactored insert- and extractVector functions to do the heavy lifting, and those are the same as used for loads and stores which makes the test coverage a bit more meaningful here. llvm-svn: 170338	2012-12-17 14:51:24 +00:00
Evgeniy Stepanov	5ecec98c2c	Optimize tree walking in markAliveBlocks. Check whether a BB is known as reachable before adding it to the worklist. This way BB's with multiple predecessors are added to the list no more than once. llvm-svn: 170335	2012-12-17 14:28:00 +00:00
Chandler Carruth	e576359bf4	Fix a secondary bug I introduced while fixing the first part of PR14478. The first half of fixing this bug was actually in r170328, but was entirely coincidental. It did however get me to realize the nature of the bug, and adapt the test case to test more interesting behavior. In turn, that uncovered the rest of the bug which I've fixed here. This should fix two new asserts that showed up in the vectorize nightly tester. llvm-svn: 170333	2012-12-17 14:03:01 +00:00
Chandler Carruth	2a27cb5523	Hoist a convertValue call to the two paths where it is needed. I noticed this while looking at r170328. We only ever do a vector rewrite when the alloca is the vector type, so it's good to not paper over bugs here by doing a convertValue that isn't needed. llvm-svn: 170331	2012-12-17 13:51:03 +00:00
Chandler Carruth	35ed75156a	Hoist the insertVector helper to be a static helper. This will allow its use inside of memcpy rewriting as well. This routine is more complex than extractVector, and some of its uses are not 100% where I want them to be so there is still some work to do here. While this can technically change the output in some cases, it shouldn't be a change that matters -- IE, it can leave some dead code lying around that prior versions did not, etc. Yet another step in the refactorings leading up to the solution to the last component of PR14478. llvm-svn: 170328	2012-12-17 13:41:21 +00:00
Chandler Carruth	bbd2c1ab94	Lift the extractVector helper all the way out to a static helper function. The method helpers all implicitly act upon the alloca, and what we really want is a fully generic helper. Doing memcpy rewrites is more special than all other rewrites because we are at times rewriting instructions which touch pointers other than the alloca. As a consequence all of the helpers needed by memcpy rewriting of sub-vector copies will need to be generalized fully. Note that all of these helpers ({insert,extract}{Integer,Vector}) are woefully uncommented. I'm going to go back through and document them once I get the factoring correct. No functionality changed. llvm-svn: 170325	2012-12-17 13:07:30 +00:00
Chandler Carruth	af6b524242	Factor the vector load rewriting into a more generic form. This makes it suitable for use in rewriting memcpy in the presence of subvector memcpy intrinsics. No functionality changed. llvm-svn: 170324	2012-12-17 12:50:21 +00:00
Chandler Carruth	a079cd0144	Fix the first part of PR14478: memset now works. PR14478 highlights a serious problem in SROA that simply wasn't being exercised due to a lack of vector input code mixed with C-library function calls. Part of SROA was written carefully to handle subvector accesses via memset and memcpy, but the rewriter never grew support for this. Fixing it required refactoring the subvector access code in other parts of SROA so it could be shared, and then fixing the splat formation logic and using subvector insertion (this patch). The PR isn't quite fixed yet, as memcpy is still broken in the same way. I'm starting on that series of patches now. Hopefully this will be enough to bring the bullet benchmark back to life with the bb-vectorizer enabled, but that may require fixing memcpy as well. llvm-svn: 170301	2012-12-17 04:07:37 +00:00
Chandler Carruth	dae722c19d	Extract the logic for inserting a subvector into a vector alloca. No functionality changed. Another step of refactoring toward solving PR14487. llvm-svn: 170300	2012-12-17 04:07:35 +00:00
Chandler Carruth	fefd557661	Lift the integer splat computation into a helper function. No functionality changed. Refactoring leading up to the fix for PR14478 which requires some significant changes to the memset and memcpy rewriting. llvm-svn: 170299	2012-12-17 04:07:30 +00:00
Chandler Carruth	0fa6260bcf	Relax an overly aggressive assert to fix PR14572. The alloca width is based on the alloc size, not the type size. llvm-svn: 170270	2012-12-15 09:26:06 +00:00
Patrik Hagglund	caaedc6ade	Revert EVT->MVT changes, r169836-169851, due to buildbot failures. llvm-svn: 169854	2012-12-11 11:14:33 +00:00
Patrik Hagglund	6c9d0f4058	Change TargetLowering::getLoadExtAction to take an MVT, instead of EVT. llvm-svn: 169840	2012-12-11 09:39:09 +00:00
Chandler Carruth	4686de879c	Add a new visitor for walking the uses of a pointer value. This visitor provides infrastructure for recursively traversing the use-graph of a pointer-producing instruction like an alloca or a malloc. It maintains a worklist of uses to visit, so it can handle very deep recursions. It automatically looks through instructions which simply translate one pointer to another (bitcasts and GEPs). It tracks the offset relative to the original pointer as long as that offset remains constant and exposes it during the visit as an APInt offset. Finally, it performs conservative escape analysis. However, currently it has some limitations that should be addressed going forward: 1) It doesn't handle vectors of pointers. 2) It doesn't provide a cheaper visitor when the constant offset tracking isn't needed. 3) It doesn't support non-instruction pointer values. The current functionality is exactly what is required to implement the SROA pointer-use visitors in terms of this one, rather than in terms of their own ad-hoc base visitor, which was always very poorly specified. SROA has been converted to use this, and the code there deleted which this utility now provides. Technically speaking, using this new visitor allows SROA to handle a few more cases than it previously did. It is now more aggressive in ignoring chains of instructions which look like they would defeat SROA, but in fact do not because they never result in a read or write of memory. While this is "neat", it shouldn't be interesting for real programs as any such chains should have been removed by other passes long before we get to SROA. As a consequence, I've not added any tests for these features -- it shouldn't be part of SROA's contract to perform such heroics. The goal is to extend the functionality of this visitor going forward, and re-use it from passes like ASan that can benefit from doing a detailed walk of the uses of a pointer. Thanks to Ben Kramer for the code review rounds and lots of help reviewing and debugging this patch. llvm-svn: 169728	2012-12-10 08:28:39 +00:00
Chandler Carruth	c9b6bd9712	Fix PR14548: SROA was crashing on a mixture of i1 and i8 loads and stores. When SROA was evaluating a mixture of i1 and i8 loads and stores, in just a particular case, it would tickle a latent bug where we compared bits to bytes rather than bits to bits. As a consequence of the latent bug, we would allow integers through which were not byte-size multiples, a situation the later rewriting code was never intended to handle. In release builds this could trigger all manner of oddities, but the reported issue in PR14548 was forming invalid bitcast instructions. The only downside of this fix is that it makes it more clear that SROA in its current form is not capable of handling mixed i1 and i8 loads and stores. Sometimes with the previous code this would work by luck, but usually it would crash, so I'm not terribly worried. I'll watch the LNT numbers just to be sure. llvm-svn: 169719	2012-12-10 00:54:45 +00:00
Chandler Carruth	1e72559a05	Switch SROA to pop Uses off the back of its visitors' queues. This will more closely match the behavior of the new PtrUseVisitor that I am adding. Hopefully this will not change the actual behavior in any way, but by making the processing order more similar help in debugging. llvm-svn: 169697	2012-12-09 11:56:01 +00:00
Shuxin Yang	7221b14d96	- Re-enable population count loop idiom recognition - fix a bug which causes a segfault. - add two test cases which were causing crashes llvm-svn: 169687	2012-12-09 03:12:46 +00:00
Chandler Carruth	329a5c1e03	Revert the patches adding a popcount loop idiom recognition pass. There are still bugs in this pass, as well as other issues that are being worked on, but the bugs are crashers that occur pretty easily in the wild. Test cases have been sent to the original commit's review thread. This reverts the commits: r169671: Fix a logic error. r169604: Move the popcnt tests to an X86 subdirectory. r168931: Initial commit adding the pass. llvm-svn: 169683	2012-12-08 22:18:29 +00:00
Shuxin Yang	d80db0a201	Fix an inadvertent typo error. llvm-svn: 169671	2012-12-08 05:00:59 +00:00
Bill Wendling	3f153ce37b	s/AttrListPtr/AttributeSet/g to better label what this class is going to be in the near future. llvm-svn: 169651	2012-12-07 23:16:57 +00:00
Bill Wendling	7119fab4de	Set the 'MadeChange' variable if we are deleting blocks. llvm-svn: 169455	2012-12-06 00:30:20 +00:00
Matt Beaumont-Gay	3e68d7d342	Add 'using' declarations to suppress -Woverloaded-virtual warnings. llvm-svn: 169214	2012-12-04 05:41:27 +00:00
Nadav Rotem	489fb9a4c3	Teach the jump threading optimization to stop scanning the basic block when calculating the cost after passing the threshold. llvm-svn: 169135	2012-12-03 17:34:44 +00:00
Chandler Carruth	a490793037	Use the new script to sort the includes of every file under lib. Sooooo many of these had incorrect or strange main module includes. I have manually inspected all of these, and fixed the main module include to be the nearest plausible thing I could find. If you own or care about any of these source files, I encourage you to take some time and check that these edits were sensible. I can't have broken anything (I strictly added headers, and reordered them, never removed), but they may not be the headers you'd really like to identify as containing the API being implemented. Many forward declarations and missing includes were added to a header files to allow them to parse cleanly when included first. The main module rule does in fact have its merits. =] llvm-svn: 169131	2012-12-03 16:50:05 +00:00
Chandler Carruth	dfd3102f8e	Remove some buggy and apparently unnecessary code from SROA. The partitioning logic attempted to handle uses of an alloca with an offset starting before the alloca so long as the use had some overlap with the alloca itself. However, there was a bug where we tested '(uint64_t)Offset >= AllocSize' without first checking whether 'Offset' was positive. As a consequence, essentially every negative offset (that is, starting before the alloca does) would be thrown out, even if it was overlapping. The subsequent code to throw out negative offsets which were actually non-overlapping was essentially dead. The code to handle overlapping negative offsets was actually dead! I've just removed all of this, and taught SROA to discard any uses which start prior to the alloca from the beginning. It has the lovely property of simplifying the code. =] All the tests still pass, and in fact no new tests are needed as this is already covered by our testsuite. Fixing the code so that negative offsets work the way the comments indicate they were supposed to work causes regressions. That's how I found this. Anyways, this is all progress in the correct direction -- tightening up SROA to be maximally aggressive. Some day, I really hope to turn out-of-bounds accesses to an alloca into 'unreachable'. llvm-svn: 169120	2012-12-03 10:59:55 +00:00
Benjamin Kramer	b7100504d2	SROA: Avoid struct and array types early to avoid creating an overly large integer type. Fixes PR14465. Differential Revision: http://llvm-reviews.chandlerc.com/D148 llvm-svn: 169084	2012-12-01 11:53:32 +00:00
Bill Wendling	2a966eebb7	Replace r168930 with a more reasonable patch. The original patch removed a bunch of code that the SjLjEHPrepare pass placed into the entry block if all of the landing pads were removed during the CodeGenPrepare class. The more natural way of doing things is to run the CGP before we run the SjLjEHPrepare pass. Make it so! llvm-svn: 169044	2012-11-30 22:08:55 +00:00
Meador Inge	6a05d05854	Move library call simplification statistic to instcombine The simplify-libcalls pass maintained a statistic to count the number of library calls that have been simplified. Now that library call simplification is being carried out in instcombine the statistic should be moved to there. llvm-svn: 168975	2012-11-30 04:05:06 +00:00
Chandler Carruth	15fed97f3e	Move the InstVisitor utility into VMCore where it belongs. It heavily depends on the IR infrastructure, there is no sense in it being off in Support land. This is in preparation to start working to expand InstVisitor into more special-purpose visitors that are still generic and can be re-used across different passes. The expansion will go into the Analysis tree though as nothing in VMCore needs it. llvm-svn: 168972	2012-11-30 03:08:41 +00:00
Shuxin Yang	a7c032d8b5	rdar://12100355 (part 1) This revision attempts to recognize the following population-count pattern: while(a) { c++; ... ; a &= a - 1; ... }, where <c> and <a> could be used multiple times in the loop body. TODO: On X8664 and ARM, __buildin_ctpop() are not expanded to an efficient instruction sequence, which needs to be improved in the following commits. Reviewed by Nadav, really appreciate! llvm-svn: 168931	2012-11-29 19:38:54 +00:00
Bill Wendling	18531926d1	Handle the situation where CodeGenPrepare removes a reference to a BB that has the last invoke instruction in the function. This also removes the last landing pad in a function. This is fine, but with SjLj EH code, we've already placed a bunch of code in the 'entry' block, which expects the landing pad to stick around. When we get to the situation where CGP has removed the last landing pad, go ahead and nuke the SjLj instructions from the 'entry' block. <rdar://problem/12721258> llvm-svn: 168930	2012-11-29 19:38:06 +00:00
Meador Inge	3524aece42	instcombine: Migrate puts optimizations This patch migrates the puts optimizations from the simplify-libcalls pass into the instcombine library call simplifier. All the simplifiers from simplify-libcalls have now been migrated to instcombine. Yay! Just a few other bits to migrate (prototype attribute inference and a few statistics) and simplify-libcalls can finally be put to rest. llvm-svn: 168925	2012-11-29 19:15:17 +00:00
Meador Inge	95a0f6df53	instcombine: Migrate fputs optimizations This patch migrates the fputs optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 168893	2012-11-29 15:45:43 +00:00
Meador Inge	787f51971a	instcombine: Migrate fwrite optimizations This patch migrates the fwrite optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 168892	2012-11-29 15:45:39 +00:00
Meador Inge	5553b265a0	instcombine: Migrate fprintf optimizations This patch migrates the fprintf optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 168891	2012-11-29 15:45:33 +00:00
Bill Wendling	176de1b36b	When we delete a dead basic block, see if any of its successors are dead and delete those as well. llvm-svn: 168829	2012-11-28 23:23:48 +00:00
Meador Inge	7ed4062656	instcombine: Migrate sprintf optimizations This patch migrates the sprintf optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 168677	2012-11-27 05:57:54 +00:00
Meador Inge	a9043a2ff6	instcombine: Migrate printf optimizations This patch migrates the printf optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 168604	2012-11-26 20:37:20 +00:00
Meador Inge	9d6c54b477	instcombine: Migrate toascii optimizations This patch migrates the toascii optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 168580	2012-11-26 03:38:52 +00:00
Meador Inge	d0fe640156	instcombine: Migrate isascii optimizations This patch migrates the isascii optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 168579	2012-11-26 03:10:07 +00:00
Meador Inge	b385d5577b	instcombine: Migrate isdigit optimizations This patch migrates the isdigit optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 168578	2012-11-26 02:31:59 +00:00
Meador Inge	a7db63a469	instcombine: Migrate abs optimizations This patch migrates the abs optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 168574	2012-11-26 00:24:07 +00:00
Meador Inge	0a86e4b5ec	instcombine: Migrate ffs* optimizations This patch migrates the ffs* optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 168571	2012-11-25 20:45:27 +00:00
Benjamin Kramer	403c120acc	CodeGenPrepare: Move ret duplication out of the instruction iteration loop. It can delete the block, and the loop continues on free'd memory. No change in output. Found by valgrind. llvm-svn: 168525	2012-11-23 19:17:06 +00:00
Chandler Carruth	17e363c242	PR14055: Implement support for sub-vector operations in SROA. Now if we can transform an alloca into a single vector value, but it has subvector, non-element accesses, we form the appropriate shufflevectors to allow SROA to proceed. This fixes PR14055 which pointed out a very common pattern that SROA couldn't handle -- mixed vec3 and vec4 operations on a single alloca. llvm-svn: 168418	2012-11-21 08:16:30 +00:00
Chandler Carruth	6bdeed2384	Use LLVM_ENABLE_DUMP for the variables used in printing as well as the printing functions themselves. Part of PR14324 (which should have just been a patch to the list, but hey...) llvm-svn: 168362	2012-11-20 10:23:07 +00:00
Chandler Carruth	42df021931	Fix PR14132 and handle OOB loads speculated through PHI nodes. The issue is that we may end up with newly OOB loads when speculating a load into the predecessors of a PHI node, and this confuses the new integer splitting logic in some cases, triggering an assertion failure. In fact, the branch in question must be dead code as it loads from a too-narrow alloca. Add code to handle this gracefully and leave the requisite FIXMEs for both optimizing more aggressively and doing more to aid sanitizing invalid code which triggers these patterns. llvm-svn: 168361	2012-11-20 10:02:19 +00:00
Chandler Carruth	dbe3a57c30	Add a comment to associate a FIXME with a PR where it matters. llvm-svn: 168347	2012-11-20 01:27:48 +00:00
Chandler Carruth	47187cb94a	Rework the rewriting of loads and stores for vector and integer allocas to properly handle the combinations of these with split integer loads and stores. This essentially replaces Evan's r168227 by refactoring the code in a different way, and trying to mirror that refactoring in both the load and store sides of the rewriting. Generally speaking there was some really problematic duplicated code here that led to poorly founded assumptions and then subtle bugs. Now much of the code actually flows through and follows a more consistent style and logical path. There is still a tiny bit of duplication on the store side of things, but it is much less bad. This also changes the logic to never re-use a load or store instruction as that was simply too error prone in practice. I've added a few tests (one a reduction of the one in Evan's original patch, which happened to be the same as the report in PR14349). I'm going to look at adding a few more tests for things I found and fixed in passing (such as the volatile tests in the vectorizable predicate). This patch has survived bootstrap, and modulo one bugfix survived Duncan's test suite, but let me know if anything else explodes. llvm-svn: 168346	2012-11-20 01:12:50 +00:00
Duncan Sands	666f7dee7a	Remove the last bit of constant folding from LinearizeExprTree (most of it was removed in commit 168035, but I missed this bit). llvm-svn: 168292	2012-11-18 20:15:36 +00:00
Duncan Sands	2d43cbfea0	Fix PR14060, an infinite loop in reassociate. The problem was that one of the operands of the expression being written was wrongly thought to be reusable as an inner node of the expression resulting in it turning up as both an inner node and a leaf, creating a cycle in the def-use graph. This would have caused the verifier to blow up if things had gotten that far, however it managed to provoke an infinite loop first. llvm-svn: 168291	2012-11-18 19:27:01 +00:00
Evan Cheng	3fb5893b5d	Teach SROA rewriteVectorizedStoreInst to handle cases when the loaded value is narrower than the stored value. rdar://12713675 llvm-svn: 168227	2012-11-17 00:05:06 +00:00
Duncan Sands	f05d8752a2	Fix a crash observed by Shuxin Yang. The issue here is that LinearizeExprTree, the utility for extracting a chain of operations from the IR, thought that it might as well combine any constants it came across (rather than just returning them along with everything else). On the other hand, the factorization code would like to see the individual constants (this is quite reasonable: it is much easier to pull a factor of 3 out of 2*3 than it is to pull it out of 6; you may think 6/3 isn't so hard, but due to overflow it's not as easy to undo multiplications of constants as it may at first appear). This patch therefore makes LinearizeExprTree stupider: it now leaves optimizing to the optimization part of reassociate, and sticks to just analysing the IR. llvm-svn: 168035	2012-11-15 09:58:38 +00:00
Meador Inge	a191db7d99	instcombine: Migrate math library call simplifications This patch migrates the math library call simplifications from the simplify-libcalls pass into the instcombine library call simplifier. I have typically migrated just one simplifier at a time, but the math simplifiers are interdependent because: 1. CosOpt, PowOpt, and Exp2Opt all depend on UnaryDoubleFPOpt. 2. CosOpt, PowOpt, Exp2Opt, and UnaryDoubleFPOpt all depend on the option -enable-double-float-shrink. These two factors made migrating each of these simplifiers individually more of a pain than it would be worth. So, I migrated them all together. llvm-svn: 167815	2012-11-13 04:16:17 +00:00
Shuxin Yang	9597b0a305	revert r167740 llvm-svn: 167787	2012-11-13 00:08:49 +00:00
Shuxin Yang	a699462f9d	This change is to fix rdar://12571717 which is about an assertion in the Reassociate pass. The assertion is triggered when the Reassociater tries to transform the expression ... + 2 * n * 3 + 2 * m + ... into: ... + 2 * (n*3 + m). In the process of the transformation, a helper routine folds the constant 2*3 into 6, confusing the optimizer, which is trying to eliminate the common factor 2 and cannot find 2 any more. Review is pending. But I'd like to commit first in order to help those who are waiting for this fix. llvm-svn: 167740	2012-11-12 19:34:11 +00:00
Meador Inge	1ed197fb71	Delete a stale comment. No functional change. llvm-svn: 167698	2012-11-12 00:28:15 +00:00
Meador Inge	ba025d5d90	instcombine: Migrate memset optimizations This patch migrates the memset optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 167689	2012-11-11 06:49:03 +00:00
Meador Inge	e093f6c41e	instcombine: Migrate memmove optimizations This patch migrates the memmove optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 167687	2012-11-11 06:22:40 +00:00
Meador Inge	bf03751391	instcombine: Migrate memcpy optimizations This patch migrates the memcpy optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 167686	2012-11-11 05:54:34 +00:00
Meador Inge	13e6be2fd6	instcombine: Migrate memcmp optimizations This patch migrates the memcmp optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 167683	2012-11-11 05:11:20 +00:00
Meador Inge	a062b17960	instcombine: Migrate strstr optimizations This patch migrates the strstr optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 167682	2012-11-11 03:51:48 +00:00
Meador Inge	a202e0c179	instcombine: Migrate strcspn optimizations This patch migrates the strcspn optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 167675	2012-11-10 15:16:48 +00:00
Meador Inge	28cefe8802	instcombine: Migrate strspn optimizations This patch migrates the strspn optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 167568	2012-11-08 01:33:50 +00:00
Chandler Carruth	1fc186f3bd	Revert the switch of loop-idiom to use the new dependence analysis. The new analysis is not yet ready for prime time. It has a critical flawed assumption, and some troubling shortages of testing. Until it's been hammered into better shape, let's stick with the working code. This should be easy to revert itself when the analysis is ready. Fixes PR14241, a miscompile of any memcpy-able loop which uses a pointer as the induction mechanism. If you have been seeing miscompiles in this revision range, you really want to test with this backed out. The results of this miscompile are a bit subtle as they can lead to downstream passes concluding things are impossible which are in fact possible. Thanks to David Blaikie for the majority of the reduction of this miscompile. I'll be checking in the test case in a non-revert commit. Revisions reverted here: r167045: LoopIdiom: Fix a serious missed optimization: we only turned top-level loops into memmove. r166877: LoopIdiom: Add checks to avoid turning memmove into an infinite loop. r166875: LoopIdiom: Recognize memmove loops. r166874: LoopIdiom: Replace custom dependence analysis with DependenceAnalysis. llvm-svn: 167286	2012-11-02 08:33:25 +00:00
Duncan Sands	f676ead13a	Fix an obvious typo that causes an assertion failure when running test/Transforms/GVN/rle.ll if the (currently disabled) check for a pointer type in getIntPtrType is turned on. llvm-svn: 167285	2012-11-02 07:49:32 +00:00
Chandler Carruth	0a6b99ee2b	Revert the majority of the next patch in the address space series: r165941: Resubmit the changes to llvm core to update the functions to support different pointer sizes on a per address space basis. Despite this commit log, this change primarily changed stuff outside of VMCore, and those changes do not carry any tests for correctness (or even plausibility), and we have consistently found questionable or flat out incorrect cases in these changes. Most of them are probably correct, but we need to devise a system that makes it more clear when we have handled the address space concerns correctly, and ideally each pass that gets updated would receive an accompanying test case that exercises that pass specifically w.r.t. alternate address spaces. However, from this commit, I have retained the new C API entry points. Those were an orthogonal change that probably should have been split apart, but they seem entirely good. In several places the changes were very obvious cleanups with no actual multiple address space code added; these I have not reverted when I spotted them. In a few other places there were merge conflicts due to a cleaner solution being implemented later, often not using address spaces at all. In those cases, I've preserved the new code which isn't address space dependent. This is part of my ongoing effort to clean out the partial address space code which carries high risk and low test coverage, and not likely to be finished before the 3.2 release looms closer. Duncan and I would both like to see the above issues addressed before we return to these changes. llvm-svn: 167222	2012-11-01 09:14:31 +00:00
Chandler Carruth	76f7f4a33e	Revert the series of commits starting with r166578 which introduced the getIntPtrType support for multiple address spaces via a pointer type, and also introduced a crasher bug in the constant folder reported in PR14233. These commits also contained several problems that should really be addressed before they are re-committed. I have avoided reverting various cleanups to the DataLayout APIs that are reasonable to have moving forward in order to reduce the amount of churn, and minimize the number of commits that were reverted. I've also manually updated merge conflicts and manually arranged for the getIntPtrType function to stay in DataLayout and to be defined in a plausible way after this revert. Thanks to Duncan for working through this exact strategy with me, and Nick Lewycky for tracking down the really annoying crasher this triggered. (Test case to follow in its own commit.) After discussing with Duncan extensively, and based on a note from Micah, I'm going to continue to back out some more of the more problematic patches in this series in order to ensure we go into the LLVM 3.2 branch with a reasonable story here. I'll send a note to llvmdev explaining what's going on and why. Summary of reverted revisions: r166634: Fix a compiler warning with an unused variable. r166607: Add some cleanup to the DataLayout changes requested by Chandler. r166596: Revert "Back out r166591, not sure why this made it through since I cancelled the command. Bleh, sorry about this! r166591: Delete a directory that wasn't supposed to be checked in yet. r166578: Add in support for getIntPtrType to get the pointer type based on the address space. llvm-svn: 167221	2012-11-01 08:07:29 +00:00
Jakub Staszak	131641627b	Don't insert and erase load instruction. Simply create (new) and delete it. llvm-svn: 167196	2012-11-01 01:10:43 +00:00
Meador Inge	ccbf761437	instcombine: Migrate strto* optimizations This patch migrates the strto* optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 167119	2012-10-31 14:58:26 +00:00
Meador Inge	b6984384bf	instcombine: Migrate strpbrk optimizations This patch migrates the strpbrk optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 167105	2012-10-31 04:29:58 +00:00
Meador Inge	5f906a50d3	instcombine: Migrate strlen optimizations This patch migrates the strlen optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 167103	2012-10-31 03:33:06 +00:00
Meador Inge	4d309f330c	instcombine: Migrate strncpy optimizations This patch migrates the strncpy optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 167102	2012-10-31 03:33:00 +00:00
Meador Inge	261da7dfde	instcombine: Migrate stpcpy optimizations This patch migrates the stpcpy optimizations from the simplify-libcalls pass into the instcombine library call simplifier. Note that the __stpcpy_chk simplifications were migrated in a previous commit. llvm-svn: 167083	2012-10-31 00:20:56 +00:00
Chandler Carruth	d3b4a83c9f	Fix PR14212: For some strange reason I treated vectors differently from integers in that the code to handle split alloca-wide integer loads or stores doesn't come first. It should, for the same reasons as with integers, and the PR attests to that. Also had to fix a busted assert that this test case also covers. llvm-svn: 167051	2012-10-30 20:52:40 +00:00
Benjamin Kramer	78cdbf2f16	LoopIdiom: Fix a serious missed optimization: we only turned top-level loops into memmove. Thanks to Preston Briggs for catching this! llvm-svn: 167045	2012-10-30 19:49:39 +00:00
Hans Wennborg	40eb1b4055	Use TargetTransformInfo to control switch-to-lookup table transformation When the switch-to-lookup tables transform landed in SimplifyCFG, it was pointed out that this could be inappropriate for some targets. Since there was no way at the time for the pass to know anything about the target, an awkward reverse-transform was added in CodeGenPrepare that turned lookup tables back into switches for some targets. This patch uses the new TargetTransformInfo to determine if a switch should be transformed, and removes CodeGenPrepare::ConvertLoadToSwitch. llvm-svn: 167011	2012-10-30 11:23:25 +00:00
Ulrich Weigand	445bd73056	In various places throughout the code generator, there were special checks to avoid performing compile-time arithmetic on PPCDoubleDouble. Now that APFloat supports arithmetic on PPCDoubleDouble, those checks are no longer needed, and we can treat the type like any other. llvm-svn: 166958	2012-10-29 18:35:49 +00:00
Duncan Sands	e6f6a2ecdc	Remove a wrapper around getIntPtrType added to GVN by Hal in commit 166624 (the wrapper returns a vector of integers when passed a vector of pointers) by having getIntPtrType itself return a vector of integers in this case. Outside of this wrapper, I didn't find anywhere in the codebase that was relying on the old behaviour for vectors of pointers, so give this a whirl through the buildbots. llvm-svn: 166939	2012-10-29 17:31:46 +00:00
Benjamin Kramer	00df4c1b61	LoopIdiom: Add checks to avoid turning memmove into an infinite loop. I don't think this is possible with the current implementation but that may change eventually. llvm-svn: 166877	2012-10-27 15:18:28 +00:00
Benjamin Kramer	8ba71ab2ab	LoopIdiom: Recognize memmove loops. This turns loops like for (unsigned i = 0; i != n; ++i) p[i] = p[i+1]; into memmove, which has a highly optimized implementation in most libcs. This was really easy with the new DependenceAnalysis :) llvm-svn: 166875	2012-10-27 14:25:51 +00:00

5970 Commits