llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-23 04:52:54 +02:00

Author	SHA1	Message	Date
Matt Arsenault	f608b07c6f	Fix GVN creating bitcast between address spaces llvm-svn: 193710	2013-10-30 19:05:41 +00:00
Andrew Trick	741b7a0cfc	Fix SCEVExpander: don't try to expand quadratic recurrences outside a loop. Partial fix for PR17459: wrong code at -O3 on x86_64-linux-gnu (affecting trunk and 3.3) When SCEV expands a recurrence outside of a loop it attempts to scale by the stride of the recurrence. Chained recurrences don't work that way. We could compute binomial coefficients, but would have to guarantee that the chained AddRec's are in a perfectly reduced form. llvm-svn: 193438	2013-10-25 21:35:56 +00:00
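The remark about binomial coefficients can be checked numerically. The sketch below is an illustration only and does not use the SCEV API: `binomial`, `evalChrec` and `stepChrec` are hypothetical helper names. It assumes the usual closed form for a chained recurrence {c0,+,c1,+,c2,...}, whose value at iteration n is the sum of c_k * C(n, k), and compares it with stepping the recurrence one iteration at a time.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Binomial coefficient C(n, k), computed with exact integer arithmetic.
std::uint64_t binomial(std::uint64_t n, std::uint64_t k) {
  if (k > n) return 0;
  std::uint64_t result = 1;
  for (std::uint64_t i = 1; i <= k; ++i)
    result = result * (n - k + i) / i;  // the division is exact at every step
  return result;
}

// Closed form: value of {c0,+,c1,+,c2,...} at iteration n is sum_k c_k * C(n, k).
std::int64_t evalChrec(const std::vector<std::int64_t>& coeffs, std::uint64_t n) {
  std::int64_t value = 0;
  for (std::size_t k = 0; k < coeffs.size(); ++k)
    value += coeffs[k] * static_cast<std::int64_t>(binomial(n, k));
  return value;
}

// Reference: advance the recurrence one iteration at a time.
// Each step adds the next-higher coefficient into the current one.
std::int64_t stepChrec(std::vector<std::int64_t> coeffs, std::uint64_t n) {
  for (std::uint64_t i = 0; i < n; ++i)
    for (std::size_t k = 0; k + 1 < coeffs.size(); ++k)
      coeffs[k] += coeffs[k + 1];
  return coeffs.empty() ? 0 : coeffs[0];
}
```

For the quadratic recurrence {0,+,1,+,1}, both `evalChrec({0, 1, 1}, 4)` and `stepChrec({0, 1, 1}, 4)` give 10, whereas scaling only the first stride by the trip count would give 4.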
Juergen Ributzka	db5b9c734f	Fix a bug in LinearFunctionTestReplace that created invalid loop exit checks. Reviewed by Andy llvm-svn: 193303	2013-10-24 05:29:56 +00:00
Andrew Trick	780bd06488	Clarify comments in genLoopLimit. llvm-svn: 193292	2013-10-24 00:43:38 +00:00
Matt Arsenault	fcca6dd732	Use more type helper functions llvm-svn: 193109	2013-10-21 19:43:56 +00:00
Bill Wendling	ab46af5515	Don't eliminate a partially redundant load if it's in a landing pad. A landing pad can be jumped to only by the unwind edge of an invoke instruction. If we eliminate a partially redundant load in a landing pad, it will create a basic block that violates this constraint. It then leads to other problems down the line if it tries to merge that basic block with the landing pad. Avoid this by not eliminating the load in a landing pad. PR17621 llvm-svn: 193064	2013-10-21 04:09:17 +00:00
Tom Stellard	b9cbfd3959	StructurizeCFG: Add dependency on LowerSwitch pass Switch instructions were crashing the StructurizeCFG pass, and it's probably easier anyway if we don't need to handle them in this pass. Reviewed-by: Christian König <christian.koenig@amd.com> llvm-svn: 191841	2013-10-02 17:04:59 +00:00
Chandler Carruth	ee12d58370	Remove the very substantial, largely unmaintained legacy PGO infrastructure. This was essentially work toward PGO based on a design that had several flaws, partially dating from a time when LLVM had a different architecture, and with an effort to modernize it abandoned without being completed. Since then, it has bitrotted for several years further. The result is nearly unusable, and isn't helping any of the modern PGO efforts. Instead, it is getting in the way, adding confusion about PGO in LLVM and distracting everyone with maintenance on essentially dead code. Removing it paves the way for modern efforts around PGO. Among other effects, this removes the last of the runtime libraries from LLVM. Those are being developed in the separate 'compiler-rt' project now, with somewhat different licensing specifically more appropriate for runtimes. llvm-svn: 191835	2013-10-02 15:42:23 +00:00
Robert Wilhelm	198f21deb3	Even more spelling fixes for "instruction". llvm-svn: 191611	2013-09-28 13:42:22 +00:00
Benjamin Kramer	109a525643	Push analysis passes to InstSimplify when they're around anyways. llvm-svn: 191309	2013-09-24 16:37:40 +00:00
Benjamin Kramer	7dae47d205	Drop spurious handle in comment. llvm-svn: 191172	2013-09-22 11:24:58 +00:00
Benjamin Kramer	190baf6fef	SROA: Handle casts involving vectors of pointers and integer scalars. SROA wants to convert any types of equivalent widths but it's not possible to convert vectors of pointers to an integer scalar with a single cast. As a workaround we add a bitcast to the corresponding int ptr type first. This type of cast used to be an edge case but has become common with SLP vectorization. Fixes PR17271. llvm-svn: 191143	2013-09-21 20:36:04 +00:00
Shuxin Yang	640fbb44a5	Resurrect r191017 "GVN proceeds in the presence of dead code" plus a fix for PR17307 and PR17308. The problem with r191017 is that when GVN fabricates a val-number for a dead instruction (in order to make the following expr-PRE happy), it forgets to fabricate a leader-table entry for it as well. llvm-svn: 191118	2013-09-20 23:12:57 +00:00
Joerg Sonnenberger	d4339cb110	Revert r191017, it results in segmentation faults in Qt. llvm-svn: 191104	2013-09-20 20:33:57 +00:00
Shuxin Yang	76ccefc5f8	GVN proceeds in the presence of dead code. This is how it ignores the dead code: 1) When a dead branch target, say block B, is identified, all the blocks dominated by B are dead as well. 2) The PHIs of those blocks in dominance-frontier(B) are updated such that the operands corresponding to dead predecessors are replaced by "UndefVal". In lattice jargon, "UndefVal" is in essence the "Top". A PHI node like "phi(v1 bb1, undef xx)" will be optimized into "v1" if v1 is a constant, or if v1 is an instruction which dominates this PHI node. 3) When analyzing the availability of a load L, all dead mem-ops which L depends on are disguised as loads which evaluate to exactly the same value as L. 4) The dead mem-ops will be materialized as "UndefVal" during code motion. llvm-svn: 191017	2013-09-19 17:22:51 +00:00
Matt Arsenault	513e7539be	MemCpyOptimizer: Use max legal int size instead of pointer size If there are no legal integers, assume 1 byte. This makes more sense than using the pointer size as a guess for the maximum GPR width. It is conceivable to want to use some 64-bit pointers on a target where 64-bit integers aren't legal. llvm-svn: 190817	2013-09-16 22:43:16 +00:00
Chandler Carruth	d47d52e219	Remove the long, long defunct IR block placement pass. This pass was based on the previous (essentially unused) profiling infrastructure and the assumption that by ordering the basic blocks at the IR level in a particular way, the correct layout would happen in the end. This sometimes worked, and mostly didn't. It also was a really naive implementation of the classical paper that dates from when branch predictors were primarily directional and when loop structure wasn't commonly available. It also didn't factor into the equation non-fallthrough branches and other machine level details. Anyways, for all of these reasons and more, I wrote MachineBlockPlacement, which completely supersedes this pass. It both uses modern profile information infrastructure, and actually works. =] llvm-svn: 190748	2013-09-14 09:28:14 +00:00
Hal Finkel	fe9daed60a	Add getUnrollingPreferences to TTI Allow targets to customize the default behavior of the generic loop unrolling transformation. This will be used by the PowerPC backend when targeting the A2 core (which is in-order with a deep pipeline), and using more aggressive defaults is important. llvm-svn: 190542	2013-09-11 19:25:43 +00:00
Matt Arsenault	d35880d234	Teach loop-idiom about address space pointer sizes llvm-svn: 190491	2013-09-11 05:09:42 +00:00
Matt Arsenault	8c0c236d92	Add braces llvm-svn: 190490	2013-09-11 05:09:35 +00:00
Eli Friedman	3718173153	Get rid of unused isPodLike definitions. llvm-svn: 190461	2013-09-11 00:36:54 +00:00
Eli Friedman	713110699b	Fix mistake in r190442. llvm-svn: 190446	2013-09-10 23:09:24 +00:00
Eli Friedman	89c35cdea8	Remove unused functions. llvm-svn: 190442	2013-09-10 22:42:31 +00:00
Matt Arsenault	a15663a1b4	Teach ScalarEvolution about pointer address spaces llvm-svn: 190425	2013-09-10 19:55:24 +00:00
Matt Arsenault	4c2083b14a	Use type helper functions. llvm-svn: 190113	2013-09-06 00:37:24 +00:00
Matt Arsenault	f658af2617	Teach CodeGenPrepare about address spaces llvm-svn: 190112	2013-09-06 00:18:43 +00:00
Hal Finkel	198ffea54f	Revert: r189565 - Add getUnrollingPreferences to TTI Revert unintentional commit (of an unreviewed change). Original commit message: Add getUnrollingPreferences to TTI Allow targets to customize the default behavior of the generic loop unrolling transformation. This will be used by the PowerPC backend when targeting the A2 core (which is in-order with a deep pipeline), and using more aggressive defaults is important. llvm-svn: 189566	2013-08-29 03:33:15 +00:00
Hal Finkel	04a990355c	Add getUnrollingPreferences to TTI Allow targets to customize the default behavior of the generic loop unrolling transformation. This will be used by the PowerPC backend when targeting the A2 core (which is in-order with a deep pipeline), and using more aggressive defaults is important. llvm-svn: 189565	2013-08-29 03:29:57 +00:00
Richard Sandiford	b195d89bde	Turn MipsOptimizeMathLibCalls into a target-independent scalar transform ...so that it can be used for z too. Most of the code is the same. The only real change is to use TargetTransformInfo to test when a sqrt instruction is available. The pass is opt-in because at the moment it only handles sqrt. llvm-svn: 189097	2013-08-23 10:27:02 +00:00
Nick Lewycky	eab287a60a	Revert r187191, which broke opt -mem2reg on the testcases included in PR16867. However, opt -O2 doesn't run mem2reg directly so nobody noticed until r188146 when SROA started sending more things directly down the PromoteMemToReg path. In order to revert r187191, I also revert dependent revisions r187296, r187322 and r188146. Fixes PR16867. Does not add the testcases from that PR, but both of them should get added for both mem2reg and sroa when this revert gets unreverted. llvm-svn: 188327	2013-08-13 22:51:58 +00:00
Peter Collingbourne	723e9b89ec	Reapply r188119 now that the bug it exposed is fixed. llvm-svn: 188217	2013-08-12 22:38:43 +00:00
Chandler Carruth	24ce5e9526	Re-instate r187323 which fast-tracks promotable allocas as soon as the SROA-based analysis has enough information. This should work now that both mem2reg and the SSAUpdater-based AllocaPromoter have been updated to be able to promote the types of allocas that the SROA analysis detects. I've included tests for the AllocaPromoter that were only possible to write once we fast-tracked promotable allocas without rewriting them. This includes a test both for r187347 and r188145. Original commit log for r187323: """ Now that mem2reg understands how to cope with a slightly wider set of uses of an alloca, we can pre-compute promotability while analyzing an alloca for splitting in SROA. That lets us short-circuit the common case of a bunch of trivially promotable allocas. This cuts 20% to 30% off the run time of SROA for typical frontend-generated IR sequences I'm seeing. It gets the new SROA to within 20% of ScalarRepl for such code. My current benchmark for these numbers is PR15412, but it fits the general pattern of IR emitted by Clang so it should be widely applicable. """ llvm-svn: 188146	2013-08-11 02:17:11 +00:00
Chandler Carruth	bc4fdfb024	Finish fixing the SSAUpdater-based AllocaPromoter strategy in SROA to cope with the more general set of patterns that are now handled by mem2reg and that we can detect quickly while doing SROA's initial analysis. Notably, this allows it to promote through no-op bitcast and GEP sequences. A core part of the SSAUpdater approach is the ability to test whether a particular instruction is part of the set being promoted. Testing this becomes significantly more complex in the world where the operand to every load and store isn't the alloca itself. I ended up using the approach of walking up the def-chain until we find the alloca. I benchmarked this against keeping a set of pointer operands and keeping a set of the loads and stores we care about, and this one seemed faster although the difference was very small. No test case yet because currently the rewriting always "fixes" the inputs to not require this. The next patch which re-enables early promotion of easy cases in SROA will include a test case that specifically exercises this aspect of the alloca promoter. llvm-svn: 188145	2013-08-11 01:56:15 +00:00
Chandler Carruth	5b16f00b28	Reformat some bits of AllocaPromoter and simplify the name and type of our visiting datastructures in the AllocaPromoter/SSAUpdater path of SROA. Also shift the order of clears around to be more consistent. No functionality changed here, this is just a cleanup. llvm-svn: 188144	2013-08-11 01:03:18 +00:00
Arnold Schwaighofer	3525aff8f6	Revert r188119 "Kill some duplicated code for removing unreachable BBs." It is breaking buildbots with libgmalloc enabled on Mac OS X. $ cd llvm ; mkdir release ; cd release $ ../configure --enable-optimized --prefix=$PWD/install $ make $ make check $ Release+Asserts/bin/llvm-lit -v --param use_gmalloc=1 --param \ gmalloc_path=/usr/lib/libgmalloc.dylib \ ../test/Instrumentation/DataFlowSanitizer/args-unreachable-bb.ll llvm-svn: 188142	2013-08-10 20:16:06 +00:00
Peter Collingbourne	0b56f9dd44	Kill some duplicated code for removing unreachable BBs. This moves removeUnreachableBlocksFromFn from SimplifyCFGPass.cpp to Utils/Local.cpp and uses it to replace the implementation of llvm::removeUnreachableBlocks, which appears to do a strict subset of what removeUnreachableBlocksFromFn does. Differential Revision: http://llvm-reviews.chandlerc.com/D1334 llvm-svn: 188119	2013-08-09 22:47:24 +00:00
Benjamin Kramer	fbbec483df	JumpThreading: Turn a select instruction into branching if it allows to thread one half of the select. This is a common pattern coming out of simplifycfg generating gross code. a: ; preds = %entry %sel = select i1 %cmp1, double %add, double 0.000000e+00 br label %b b: %cond5 = phi double [ %sel, %a ], [ %sub, %entry ] %cmp6 = fcmp oeq double %cond5, 0.000000e+00 br i1 %cmp6, label %if.then, label %if.end becomes a: br i1 %cmp1, label %b, label %if.then b: %cond5 = phi double [ %sub, %entry ], [ %add, %a ] %cmp6 = fcmp oeq double %cond5, 0.000000e+00 br i1 %cmp6, label %if.then, label %if.end Skipping block b completely if possible. llvm-svn: 187880	2013-08-07 10:29:38 +00:00
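The select-to-branch rewrite above can be modelled in plain C++ to see why skipping block b is safe. This is a sketch under assumptions, not LLVM code: the function names are hypothetical, and the `viaA` parameter stands in for the entry block's branch, which the commit message does not show. Each function returns true when control would reach %if.then and false for %if.end.

```cpp
// Before the transform: block a computes %sel with a select, then block b
// compares the merged value against 0.0.
bool beforeThreading(bool viaA, bool cmp1, double add, double sub) {
  double cond5;
  if (viaA) {
    cond5 = cmp1 ? add : 0.0;  // %sel = select i1 %cmp1, double %add, double 0.0
  } else {
    cond5 = sub;               // value arriving directly from %entry
  }
  return cond5 == 0.0;         // %cmp6 = fcmp oeq double %cond5, 0.0
}

// After the transform: block a branches on %cmp1. When %cmp1 is false the
// select would have produced 0.0, so the comparison in b is known to be true
// and control can go straight to %if.then.
bool afterThreading(bool viaA, bool cmp1, double add, double sub) {
  if (viaA && !cmp1) return true;   // a: br i1 %cmp1, label %b, label %if.then
  double cond5 = viaA ? add : sub;  // phi in b now sees %add instead of %sel
  return cond5 == 0.0;
}
```

Both functions agree for every combination of inputs; the second one simply avoids evaluating the comparison on the path where its result is already known.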
Jakub Staszak	06543ea089	Adjust file to the coding standard. llvm-svn: 187808	2013-08-06 17:03:42 +00:00
Tom Stellard	e4e3be6f50	Factor FlattenCFG out from SimplifyCFG Patch by: Mei Ye llvm-svn: 187764	2013-08-06 02:43:45 +00:00
Chandler Carruth	4093b4bbce	Teach the AllocaPromoter which is wrapped around the SSAUpdater infrastructure to do promotion without a domtree the same smarts about looking through GEPs, bitcasts, etc., that I just taught mem2reg about. This way, if SROA chooses to promote an alloca which still has some noisy instructions this code can cope with them. I've not used as principled of an approach here for two reasons: 1) This code doesn't really need it as we were already set up to zip through the instructions used by the alloca. 2) I view the code here as more of a hack, and hopefully a temporary one. The SSAUpdater path in SROA is a real sore point for me. It doesn't make a lot of architectural sense for many reasons: - We're likely to end up needing the domtree anyways in a subsequent pass, so why not compute it earlier and use it. - In the future we'll likely end up needing the domtree for parts of the inliner itself. - If we need to we could teach the inliner to preserve the domtree. Part of the re-work of the pass manager will allow this to be very powerful even in large SCCs with many functions. - Ultimately, computing a domtree has gotten significantly faster since the original SSAUpdater-using code went into ScalarRepl. We no longer use domfrontiers, and much of domtree is lazily done based on queries rather than eagerly. - At this point keeping the SSAUpdater-based promotion saves a total of 0.7% on a build of the 'opt' tool for me. That's not a lot of performance given the complexity! So I'm leaving this a bit ugly in the hope that eventually we just remove all of this nonsense. I can't even readily test this because this code isn't reachable except through SROA. When I re-instate the patch that fast-tracks allocas already suitable for promotion, I'll add a testcase there that failed before this change. Before that, SROA will fix any test case I give it. llvm-svn: 187347	2013-07-29 09:06:53 +00:00
Chandler Carruth	cfd2dee91f	Temporarily revert r187323 until I update SSAUpdater to match mem2reg. I forgot that we had two totally independent things here. :: sigh :: llvm-svn: 187327	2013-07-28 09:05:49 +00:00
Chandler Carruth	d3834e5bef	Now that mem2reg understands how to cope with a slightly wider set of uses of an alloca, we can pre-compute promotability while analyzing an alloca for splitting in SROA. That lets us short-circuit the common case of a bunch of trivially promotable allocas. This cuts 20% to 30% off the run time of SROA for typical frontend-generated IR sequences I'm seeing. It gets the new SROA to within 20% of ScalarRepl for such code. My current benchmark for these numbers is PR15412, but it fits the general pattern of IR emitted by Clang so it should be widely applicable. llvm-svn: 187323	2013-07-28 08:27:12 +00:00
Chandler Carruth	ca3736c604	Thread DataLayout through the callers and into mem2reg. This will be useful in a subsequent patch, but causes an unfortunate amount of noise, so I pulled it out into a separate patch. llvm-svn: 187322	2013-07-28 06:43:11 +00:00
Chandler Carruth	b3e0150179	Don't use all the #ifdefs to hide the stats counters and instead rely on their being optimized out in debug mode. Realistically, this just isn't going to be the slow part anyways. This also fixes unused variable warnings that are breaking LLD build bots. =/ I didn't see these at first, and kept losing track of the fact that they were broken. llvm-svn: 187297	2013-07-27 10:17:49 +00:00
Nick Lewycky	3d5df03884	Reimplement isPotentiallyReachable to make nocapture deduction much stronger. Adds unit tests for it too. Split BasicBlockUtils into an analysis-half and a transforms-half, and put the analysis bits into a new Analysis/CFG.{h,cpp}. Promote isPotentiallyReachable into llvm::isPotentiallyReachable and move it into Analysis/CFG. llvm-svn: 187283	2013-07-27 01:24:00 +00:00
Tom Stellard	8e98bf332b	SimplifyCFG: Use parallel-and and parallel-or mode to consolidate branch conditions Merge consecutive if-regions if they contain identical statements. Both transformations reduce number of branches. The transformation is guarded by a target-hook, and is currently enabled only for +R600, but the correctness has been tested on X86 target using a variety of CPU benchmarks. Patch by: Mei Ye llvm-svn: 187278	2013-07-27 00:01:07 +00:00
Benjamin Kramer	936fe15ecf	TRE: Move class into anonymous namespace. While there shrink a dangerously large SmallPtrSet. llvm-svn: 187050	2013-07-24 16:12:08 +00:00
Chandler Carruth	97ac149a21	Fix a problem I introduced in r187029 where we would over-eagerly schedule an alloca for another iteration in SROA. This only showed up with a mixture of promotable and unpromotable selects and phis. Added a test case for this. llvm-svn: 187031	2013-07-24 12:12:17 +00:00
Chandler Carruth	b086f93d75	Fix PR16687 where we were incorrectly promoting an alloca that had pending speculation for a phi node. The problem here is that we were using growth of the speculation set as an indicator of whether speculation would occur, and if the phi node is already in the set we don't see it grow. This is a symptom of the fact that this signal is a total hack. Unfortunately, I couldn't really come up with a non-hacky way of signaling that promotion remains valid after speculation occurs, such that we only speculate when all else looks good for promotion. In the end, I went with at least a much more explicit approach of doing the work of queuing inside the phi and select processing and setting a preposterously named flag to convey that we're in the special state of requiring speculating before promotion. Thanks to Richard Trieu and Nick Lewycky for the excellent work reducing a testcase for this from a pretty giant, nasty assert in a big application. =] The testcase was excellent. llvm-svn: 187029	2013-07-24 09:47:28 +00:00
Nick Lewycky	80c6b2a977	Remove extraneous null statement. No functionality change! llvm-svn: 186893	2013-07-22 23:38:27 +00:00
Jakub Staszak	ad1f6af5a0	Use switch instead of if. No functionality change. llvm-svn: 186892	2013-07-22 23:38:16 +00:00
Jakub Staszak	9e648c8d0e	OldPtr is llvm::Instruction. Remove unneeded cast<>. llvm-svn: 186880	2013-07-22 22:10:43 +00:00
Jakub Staszak	11159a5a91	Change tabs to spaces. llvm-svn: 186877	2013-07-22 21:11:30 +00:00
Matt Arsenault	f4ec7f5975	Fix spelling and grammar llvm-svn: 186858	2013-07-22 18:59:58 +00:00
Benjamin Kramer	78617fc38a	SROA: Microoptimization: Remove dead entries first, then sort. While there replace an explicit struct with std::mem_fun. llvm-svn: 186761	2013-07-20 08:38:34 +00:00
Chandler Carruth	684f6afef6	Cleanup the stats counters for the new implementation. These actually count the right things and have the right names. llvm-svn: 186667	2013-07-19 10:57:36 +00:00
Chandler Carruth	5d65964bbf	Fix another assert failure very similar to PR16651's test case. This test case came from Benjamin and found the parallel bug in the vector promotion code. llvm-svn: 186666	2013-07-19 10:57:32 +00:00
Chandler Carruth	47c6759da6	Try to move to a more reasonable set of naming conventions given the new implementation of the SROA algorithm. We were using the term 'partition' in many places that no longer ever represented an actual partition, but rather just an arbitrary slice of an alloca. No functionality change intended here. Mostly just renaming of types, functions, variables, and rewording of comments. Several comments were rewritten to make a lot more sense in the new structure of things. The stats are still weird and not reflective of how this really works. I'll fix those up in a separate patch as it is a touch more semantic of a change... llvm-svn: 186659	2013-07-19 09:13:58 +00:00
Chandler Carruth	51ced11831	A long overdue cleanup in SROA to use 'DL' instead of 'TD' for the DataLayout variables. llvm-svn: 186656	2013-07-19 07:21:28 +00:00
Chandler Carruth	dc65e5d0aa	Fix PR16651, an assert introduced in my recent re-work of the innards of SROA. The crux of the issue is that now we track uses of a partition of the alloca in two places: the iterators over the partitioning uses and the previously collected split uses vector. We weren't accounting for the fact that the split uses might invalidate integer widening in ways other than due to their width (in this case due to being volatile). Further reduced testcase added to the tests. llvm-svn: 186655	2013-07-19 07:12:23 +00:00
Chandler Carruth	b59978aea3	Reapply r186316 with a fix for one bug where the code could walk off the end of a vector. This was found with ASan. I've had one other report of a crasher, but thus far been unable to reproduce the crash. It may well be fixed with this version, and if not I'd like to get more information from the build bots about what is happening. See r186316 for the full commit log for the new implementation of the SROA algorithm. llvm-svn: 186565	2013-07-18 07:15:00 +00:00
Craig Topper	b8260534f6	Add 'const' qualifiers to static const char* variables. llvm-svn: 186371	2013-07-16 01:17:10 +00:00
Stephen Lin	b764d0591b	Remove trailing whitespace llvm-svn: 186333	2013-07-15 17:55:02 +00:00
Chandler Carruth	255d06c032	Revert r186316 while I track down an ASan failure and an assert from a bot. This reverts the commit which introduced a new implementation of the fancy SROA pass designed to reduce its overhead. I'll skip the huge commit log here, refer to r186316 if you're looking for how this all works and why it works that way. llvm-svn: 186332	2013-07-15 17:36:21 +00:00
Chandler Carruth	05be8f3230	Reimplement SROA yet again. Same fundamental principle, but a totally different core implementation strategy. Previously, SROA would build a relatively elaborate partitioning of an alloca, associate uses with each partition, and then rewrite the uses of each partition in an attempt to break apart the alloca into chunks that could be promoted. This was very wasteful in terms of memory and compile time because regardless of how complex the alloca or how much we're able to do in breaking it up, all of the datastructure work to analyze the partitioning was done up front. The new implementation attempts to form partitions of the alloca lazily and on the fly, rewriting the uses that make up that partition as it goes. This has a few significant effects: 1) Much simpler data structures are used throughout. 2) No more double walk of the recursive use graph of the alloca, only walk it once. 3) No more complex algorithms for associating a particular use with a particular partition. 4) PHI and Select speculation is simplified and happens lazily. 5) More precise information is available about a specific use of the alloca, removing the need for some side datastructures. Ultimately, I think this is a much better implementation. It removes about 300 lines of code, but arguably removes more like 500 considering that some code grew in the process of being factored apart and cleaned up for this all to work. I've re-used as much of the old implementation as possible, which includes the lion's share of code in the form of the rewriting logic. The interesting new logic centers around how the uses of a partition are sorted, and split into actual partitions. Each instruction using a pointer derived from the alloca gets a 'Partition' entry. This name is totally wrong, but I'll do a rename in a follow-up commit as there is already enough churn here. The entry describes the offset range accessed and the nature of the access. 
Once we have all of these entries we sort them in a very specific way: increasing order of begin offset, followed by whether they are splittable uses (memcpy, etc), followed by the end offset or whatever. Sorting by splittability is important as it simplifies the collection of uses into a partition. Once we have these uses sorted, we walk from the beginning to the end building up a range of uses that form a partition of the alloca. Overlapping unsplittable uses are merged into a single partition while splittable uses are broken apart and carried from one partition to the next. A partition is also introduced to bridge splittable uses between the unsplittable regions when necessary. I've looked at the performance PRs fairly closely. PR15471 no longer will even load (the module is invalid). Not sure what is up there. PR15412 improves by between 5% and 10%, however it is nearly impossible to know what is holding it up as SROA (the entire pass) takes less time than reading the IR for that test case. The analysis takes the same time as running mem2reg on the final allocas. I suspect (without much evidence) that the new implementation will scale much better however, and it is just the small nature of the test cases that makes the changes small and noisy. Either way, it is still simpler and cleaner I think. llvm-svn: 186316	2013-07-15 10:30:19 +00:00
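The sort order described for the use entries can be sketched as a comparator. This is a minimal illustration, not the code from the commit: `UseEntry`, `useEntryLess` and `sortedUses` are hypothetical names. The message only fixes the first key (increasing begin offset); placing unsplittable uses before splittable ones and larger end offsets first are assumptions made here for the sake of a concrete example.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// One entry per instruction that uses a pointer derived from the alloca.
struct UseEntry {
  std::uint64_t begin;  // first byte offset accessed
  std::uint64_t end;    // one past the last byte offset accessed
  bool splittable;      // true for uses such as memcpy that can be split
};

// Ordering: begin offset ascending, then unsplittable before splittable
// (assumed direction), then end offset descending (assumed tie-break).
bool useEntryLess(const UseEntry& a, const UseEntry& b) {
  if (a.begin != b.begin) return a.begin < b.begin;
  if (a.splittable != b.splittable) return !a.splittable;
  return a.end > b.end;
}

// Returns a sorted copy so the caller's vector is left untouched.
std::vector<UseEntry> sortedUses(std::vector<UseEntry> uses) {
  std::sort(uses.begin(), uses.end(), useEntryLess);
  return uses;
}
```

With the entries sorted this way, a single left-to-right walk can grow a partition while unsplittable uses overlap and carry splittable uses forward into the next partition, as the message describes.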
Craig Topper	58fa7a9b4a	Use SmallVectorImpl& instead of SmallVector to avoid repeating small vector size. llvm-svn: 186274	2013-07-14 04:42:23 +00:00
Andrew Trick	b79ae09045	LFTR improvement to avoid truncation. This is a reimplementation of the patch originally in r186107. llvm-svn: 186215	2013-07-12 22:08:48 +00:00
Andrew Trick	064fb30cab	Cleanup LFTR logic. llvm-svn: 186214	2013-07-12 22:08:44 +00:00
Andrew Trick	aacd582de2	Cleanup: rename a variable to make the logic easier to follow. llvm-svn: 186213	2013-07-12 22:08:41 +00:00
Chandler Carruth	4163f3fe85	Revert "indvars: Improve LFTR by eliminating truncation when comparing against a constant." This reverts commit r186107. It didn't handle wrapping arithmetic in the loop correctly and thus caused the following C program to count from 0 to UINT64_MAX instead of from 0 to 255 as intended: #include <stdio.h> int main() { unsigned char first = 0, last = 255; do { printf("%d\n", first); } while (first++ != last); } Full test case and instructions to reproduce with just the -indvars pass sent to the original review thread rather than to r186107's commit. llvm-svn: 186152	2013-07-12 11:18:55 +00:00
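The wrapping behaviour that the reverted patch mishandled can be reproduced in isolation. The sketch below is an adaptation of the program quoted above, with a hypothetical `countIterations` helper that counts loop trips instead of printing; it relies only on the well-defined wraparound of `unsigned char`.

```cpp
#include <cstdint>

// Counts how many times the body of the do/while loop from the commit
// message runs. The post-increment wraps from 255 back to 0, so the loop
// always terminates within 256 iterations.
std::uint64_t countIterations(unsigned char first, unsigned char last) {
  std::uint64_t iterations = 0;
  do {
    ++iterations;
  } while (first++ != last);  // compares the old value, then increments
  return iterations;
}
```

`countIterations(0, 255)` returns 256, matching the intended count from 0 to 255. A range that crosses the wrap point, such as `countIterations(250, 3)`, returns 10.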
Andrew Trick	fe577c9f45	indvars: Improve LFTR by eliminating truncation when comparing against a constant. Patch by Michele Scandale! Adds special handling of the case where, during the loop exit condition rewriting, the exit value is a constant of bitwidth lower than the type of the induction variable: instead of introducing a trunc operation in order to match the operand types, it converts the constant value to an equivalent constant, depending on the initial value of the induction variable and the trip count, in order to have an equivalent comparison between the induction variable and the new constant. llvm-svn: 186107	2013-07-11 17:08:59 +00:00
Michael Gottesman	722cf9dc9b	Teach TailRecursionElimination to handle certain cases of nocapture escaping allocas. Without the changes introduced into this patch, if TRE saw any allocas at all, TRE would not perform TRE or mark callsites with the tail marker. Because TRE runs after mem2reg, this inadequacy is not a death sentence. But given a callsite A without escaping alloca argument, A may not be able to have the tail marker placed on it due to a separate callsite B having a write-back parameter passed in via an argument with the nocapture attribute. Assume that B is the only other callsite besides A and B only has nocapture escaping alloca arguments (NOTE B may have other arguments that are not passed allocas). In this case not marking A with the tail marker is unnecessarily conservative since: 1. By assumption A has no escaping alloca arguments itself so it can not access the caller's stack via its arguments. 2. Since all of B's escaping alloca arguments are passed as parameters with the nocapture attribute, we know that B does not stash said escaping allocas in a manner that outlives B itself and thus could be accessed indirectly by A. With the changes introduced by this patch: 1. If we see any escaping allocas passed as a capturing argument, we do nothing and bail early. 2. If we do not see any escaping allocas passed as captured arguments but we do see escaping allocas passed as nocapture arguments: i. We do not perform TRE to avoid PR962 since the code generator produces significantly worse code for the dynamic allocas that would be created by the TRE algorithm. ii. If we do not return twice, mark call sites without escaping allocas with the tail marker. NOTE This excludes functions with escaping nocapture allocas. 3. If we do not see any escaping allocas at all (whether captured or not): i. If we do not have usage of setjmp, mark all callsites with the tail marker. ii. 
If there are no dynamic/variable sized allocas in the function, attempt to perform TRE on all callsites in the function. Based off of a patch by Nick Lewycky. rdar://14324281. llvm-svn: 186057	2013-07-11 04:40:01 +00:00
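The case analysis in this message can be summarised as a small decision function. This is a sketch of the rules as written above, not the pass's actual code: `TailMarking`, `TREPlan` and `planTRE` are hypothetical names, and treating steps 3.i and 3.ii as independent checks is an assumption about how to read the message.

```cpp
// Which call sites receive the 'tail' marker.
enum class TailMarking { None, CallsWithoutEscapingAllocas, AllCalls };

struct TREPlan {
  TailMarking marking;  // how call sites are marked
  bool attemptTRE;      // whether tail recursion elimination is tried
};

TREPlan planTRE(bool capturedEscapingAlloca,   // alloca passed as a capturing argument
                bool nocaptureEscapingAlloca,  // alloca passed only as nocapture
                bool returnsTwice,             // e.g. the function uses setjmp
                bool hasDynamicAllocas) {
  // 1. Any captured escaping alloca: do nothing and bail early.
  if (capturedEscapingAlloca)
    return {TailMarking::None, false};

  // 2. Only nocapture escaping allocas: never perform TRE (PR962), but mark
  //    call sites without escaping allocas unless the function returns twice.
  if (nocaptureEscapingAlloca)
    return {returnsTwice ? TailMarking::None
                         : TailMarking::CallsWithoutEscapingAllocas,
            false};

  // 3. No escaping allocas at all: mark every call site unless setjmp is
  //    used, and attempt TRE when there are no dynamic allocas.
  return {returnsTwice ? TailMarking::None : TailMarking::AllCalls,
          !hasDynamicAllocas};
}
```

For example, `planTRE(false, true, false, false)` yields `CallsWithoutEscapingAllocas` with `attemptTRE == false`, which is the case the patch newly handles.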
Benjamin Kramer	d9392561e7	Reassociate: Remove unnecessary default operator=. llvm-svn: 185757	2013-07-06 15:10:13 +00:00
Sylvestre Ledru	be24b2e69f	Remove useless declarations (found by scan-build) llvm-svn: 185709	2013-07-05 15:58:12 +00:00
Craig Topper	783617eba7	Use SmallVectorImpl::iterator/const_iterator instead of SmallVector to avoid specifying the vector size. llvm-svn: 185606	2013-07-04 01:31:24 +00:00
Craig Topper	9729e843cb	Use SmallVectorImpl::iterator/const_iterator instead of SmallVector to avoid specifying the vector size. llvm-svn: 185540	2013-07-03 15:07:05 +00:00
Nick Lewycky	e6e35eda7b	dbgs() << Instruction doesn't print a newline on the end any more. Update these debug statements to add a missing newline. Also canonicalize to '\n' instead of "\n"; the latter calls a function with a loop the former does not. llvm-svn: 184897	2013-06-26 00:30:18 +00:00
Bob Wilson	f1bf7886b8	Fix SROA to avoid unnecessary scalar conversions for 1-element vectors. When a 1-element vector alloca is promoted, a store instruction can often be rewritten without converting the value to a scalar and using an insertelement instruction to stuff it into the new alloca. This patch just adds a check to skip that conversion when it is unnecessary. This turns out to be really important for some ARM Neon operations where <1 x i64> is used to get around the fact that i64 is not a legal type. llvm-svn: 184870	2013-06-25 19:09:50 +00:00
Meador Inge	f58d6431f9	Remove the simplify-libcalls pass (finally) This commit completely removes what is left of the simplify-libcalls pass. All of the functionality has now been migrated to the instcombine and functionattrs passes. The following C API functions are now NOPs: 1. LLVMAddSimplifyLibCallsPass 2. LLVMPassManagerBuilderSetDisableSimplifyLibCalls llvm-svn: 184459	2013-06-20 19:48:07 +00:00
Bill Wendling	4d82ecded8	Access the TargetLoweringInfo from the TargetMachine object instead of caching it. The TLI may change between functions. No functionality change. llvm-svn: 184352	2013-06-19 21:07:11 +00:00
Matt Arsenault	fb5518e48b	Move StructurizeCFG out of R600 to generic Transforms. Register it with PassManager llvm-svn: 184343	2013-06-19 20:18:24 +00:00
Quentin Colombet	4633bd4a55	LSR: Fix the parameters used to compute the scaling factor cost. Prior to this change, the considered addressing modes may be invalid since the maximum and minimum offsets were not taken into account. This was causing an assertion failure. The added test case exercises that behavior. <rdar://problem/14199725> Assertion failed: (CurScaleCost >= 0 && "Legal addressing mode has an illegal cost!") llvm-svn: 184341	2013-06-19 19:59:41 +00:00
Jakub Staszak	6fe8ea3c17	Use 0 instead of NULL. llvm-svn: 184044	2013-06-15 12:20:44 +00:00
Shuxin Yang	63a223a0a0	Fix a potential bug in r183584. r183584 tries to derive some info from the code AFTER a call and apply these derived info to the code BEFORE the call, which is not always safe as the call in question may never return, and in this case, the derived info is invalid. Thank Duncan for pointing out this potential bug. rdar://14073661 llvm-svn: 183606	2013-06-08 04:56:05 +00:00
Shuxin Yang	7247dac833	Fix an assertion in MemCpyOpt pass. The MemCpyOpt pass is capable of optimizing: callee(&S); copy N bytes from S to D. into: callee(&D); subject to some legality constraints. Assertion is triggered when the compiler tries to evaluate "sizeof(typeof(D))", while D is an opaque-typed, 'sret' formal argument of function being compiled. i.e. the signature of the func being compiled is something like this: T caller(...,%opaque* noalias nocapture sret %D, ...) The fix is that when we come across such a situation, instead of calling some utility functions to get the size of D's type (which will crash), we simply assume D has at least N bytes as implied by the copy-instruction. rdar://14073661 llvm-svn: 183584	2013-06-07 22:45:21 +00:00
David Majnemer	f6b2c81f95	IndVarSimplify: check if loop invariant expansion can trap IndVarSimplify is willing to move divide instructions outside of their loop bodies if they are invariant of the loop. However, it may not be safe to expand them if we do not know if they can trap. Instead, check to see if it is not safe to expand the instruction and skip the expansion. This fixes PR16041. Testcase by Rafael Ávila de Espíndola. llvm-svn: 183239	2013-06-04 17:51:58 +00:00
Quentin Colombet	3e2682d134	Loop Strength Reduce: Scaling factor cost. Account for the cost of scaling factor in Loop Strength Reduce when rating the formulae. This uses a target hook. The default implementation of the hook is: if the addressing mode is legal, the scaling factor is free. <rdar://problem/13806271> llvm-svn: 183045	2013-05-31 21:29:03 +00:00
Quentin Colombet	c3a4f33cc1	Modify how the formulae are rated in Loop Strength Reduce. Namely, check if the target allows folding more than one register in the addressing mode and if yes, adjust the cost accordingly. Prior to this commit, reg1 + scale * reg2 accesses were artificially preferred to reg1 + reg2 accesses. Indeed, the cost model wrongly assumed that reg1 + reg2 needs a temporary register for the computation, whereas it was correctly estimated for reg1 + scale * reg2. <rdar://problem/13973908> llvm-svn: 183021	2013-05-31 17:20:29 +00:00
Michael J. Spencer	c195b8a813	Replace Count{Leading,Trailing}Zeros_{32,64} with count{Leading,Trailing}Zeros. llvm-svn: 182680	2013-05-24 22:23:49 +00:00
Shuxin Yang	018fd6828f	[GVN] Split critical-edge on the fly, instead of postponing edge-splitting to next iteration. This is one step toward non-iterative GVN. My local hack suggests that getting rid of iteration will speed up GVN by 30%+ on a medium sized input (2k LOC, C++). I cannot explain why not 2x or more at this moment. llvm-svn: 181532	2013-05-09 18:34:27 +00:00
Nick Lewycky	a88ff03516	Fix a bug in codegenprep where it was losing track of values in OptimizeMemoryInst by switching to a ValueMap. Patch by Andrea DiBiagio! llvm-svn: 181397	2013-05-08 09:00:10 +00:00
Andrew Trick	5d13ab6ea6	Rotate multi-exit loops even if the latch was simplified. Test case by Michele Scandale! Fixes PR10293: Load not hoisted out of loop with multiple exits. There are few regressions with this patch, now tracked by rdar:13817079, and a roughly equal number of improvements. The regressions are almost certainly bad luck because LoopRotate has very little idea of whether rotation is profitable. Doing better requires a more comprehensive solution. This checkin is a quick fix that lacks generality (PR10293 has a counter-example). But it trivially fixes the case in PR10293 without interfering with other cases, and it does satisfy the criteria that LoopRotate is a loop canonicalization pass that should avoid heuristics and special cases. I can think of two approaches that would probably be better in the long run. Ultimately they may both make sense. (1) LoopRotate should check that the current header would make a good loop guard, and that the loop does not already have a sufficient guard. The artificial SimplifiedLoopLatch check would be unnecessary, and the design would be more general and canonical. Two difficulties: - We need a strong guarantee that we won't endlessly rotate, so the analysis would need to be precise in order to avoid the SimplifiedLoopLatch precondition. - Analyses like this are usually based on SCEV, which we don't want to rely on. (2) Rotate on-demand in late loop passes. This could even be done by shoving the loop back on the queue after the optimization that needs it. This could work well when we find LICM opportunities in multi-branch loops. This requires some work, and it doesn't really solve the problem of SCEV wanting a loop guard before the analysis. llvm-svn: 181230	2013-05-06 17:58:18 +00:00
Shuxin Yang	2e42a06bb1	Decompose GVN::processNonLocalLoad() (about 400 LOC) into smaller helper functions. No functional change. This function consists of the following steps: 1. Collect dependent memory accesses. 2. Analyze availability. 3. Perform full redundancy elimination, or 4. Perform PRE, depending on the availability. Steps 2, 3 and 4 are now moved to three helper routines. llvm-svn: 181047	2013-05-03 19:17:26 +00:00
Shuxin Yang	df9f738a35	[GV] Remove dead code which is really difficult to decipher. Actually it took me a couple of hours trying to make sense of it, only to find it is dead code. I guess the original author used "allSingleSucc" to indicate if there are any critical edges emanating from some blocks, and tried to perform code motion (actually speculation) in the presence of these critical edges; but later on he/she changed their mind and decided to perform edge-splitting first. llvm-svn: 180951	2013-05-02 21:14:31 +00:00
Filip Pizlo	dd62846c56	This patch breaks up Wrap.h so that it does not have to include all of the things, and renames it to CBindingWrapping.h. I also moved CBindingWrapping.h into Support/. This new file just contains the macros for defining different wrap/unwrap methods. The calls to those macros, as well as any custom wrap/unwrap definitions (like for array of Values for example), are put into corresponding C++ headers. Doing this required some #include surgery, since some .cpp files relied on the fact that including Wrap.h implicitly caused the inclusion of a bunch of other things. This also now means that the C++ headers will include their corresponding C API headers; for example Value.h must include llvm-c/Core.h. I think this is harmless, since the C API headers contain just external function declarations and some C types, so I don't believe there should be any nasty dependency issues here. llvm-svn: 180881	2013-05-01 20:59:00 +00:00
Nadav Rotem	c0309431a1	SROA: Generate selects instead of shuffles when blending values because this is the canonical form. Shuffles are more difficult to lower and we usually don't touch them, while we do optimize selects more often. llvm-svn: 180875	2013-05-01 19:53:30 +00:00
Shuxin Yang	cb9d06c59b	Fix a XOR reassociation bug. When Reassociator optimizes "(x | C1)" ^ "(X & C2)", it may swap the two subexpressions, however, it forgot to swap cached constants (of C1 and C2) accordingly. rdar://13739160 llvm-svn: 180676	2013-04-27 18:02:12 +00:00
Eric Christopher	beec5d09da	Move C++ code out of the C headers and into either C++ headers or the C++ files themselves. This enables people to use just a C compiler to interoperate with LLVM. llvm-svn: 180063	2013-04-22 22:47:22 +00:00
Rafael Espindola	88a7961c64	Clarify that llvm.used can contain aliases. Also add a check for llvm.used in the verifier and simplify clients now that they can assume they have a ConstantArray. llvm-svn: 180019	2013-04-22 14:58:02 +00:00
Benjamin Kramer	47f18d3da1	SROA: Don't crash on a select with two identical operands. This is an edge case that can happen if we modify a chain of multiple selects. Update all operands in that case and remove the assert. PR15805. llvm-svn: 179982	2013-04-21 17:48:39 +00:00
Chris Lattner	a5d4b30d60	Fix a comment, PR15777. llvm-svn: 179775	2013-04-18 17:42:14 +00:00
Jim Grosbach	1b2d956f84	Fix a typo in comment. llvm-svn: 179542	2013-04-15 17:40:48 +00:00
Shuxin Yang	cc126626e3	Redo the fix Benjamin Kramer committed in r178793 about iterator invalidation in Reassociate. I brazenly think this change is slightly simpler than r178793 because: - no "state" in functor - "OpndPtrs[i]" looks simpler than "&Opnds[OpndIndices[i]]" While I can reproduce the problem in Valgrind, it is rather difficult to come up with a standalone test case. The reason is that when an iterator is invalidated, the stale invalidated elements are not yet clobbered by nonsense data, so the optimizer can still proceed successfully. Thank Benjamin for fixing this bug and generously providing the test case. llvm-svn: 179062	2013-04-08 22:00:43 +00:00
Chandler Carruth	8d04726f54	Fix PR15674 (and PR15603): a SROA think-o. The fix for PR14972 in r177055 introduced a real think-o in the store side, likely because I was much more focused on the load side. While we can arbitrarily widen (or narrow) a loaded value, we can't arbitrarily widen a value to be stored, as that changes the width of memory access! Lock down the code path in the store rewriting which would do this to only handle the intended circumstance. All of the existing tests continue to pass, and I've added a test from the PR. llvm-svn: 178974	2013-04-07 11:47:54 +00:00
Shuxin Yang	5cf388a00f	Disable the optimization about promoting vector-element-access with symbolic index. This optimization is unstable at this moment; it 1) blocks us on a very important application 2) PR15200 3) test6 and test7 in test/Transforms/ScalarRepl/dynamic-vector-gep.ll (the CHECK command compares the output against wrong result) I personally believe this optimization should not have any impact on the autovectorized code, as auto-vectorizer is supposed to put gather/scatter in a "right" way. Although in theory downstream optimizers might reveal some gather/scatter optimization opportunities, the chance is quite slim. For the hand-crafted vectorizing code, in terms of redundancy elimination, load-CSE, copy-propagation and DSE can collectively achieve the same result, but in a much simpler way. On the other hand, these optimizers are able to improve the code in an incremental way; in contrast, SROA is sort of an all-or-none approach. However, SROA might slightly win in stack size, as it tries to figure out a stretch of memory tightly covering the area accessed by the dynamic index. rdar://13174884 PR15200 llvm-svn: 178912	2013-04-05 21:07:08 +00:00
Benjamin Kramer	d4c69ec04b	Reassociate: Avoid iterator invalidation. OpndPtrs stored pointers into the Opnd vector that became invalid when the vector grows. Store indices instead. Sadly I only have a large testcase that only triggers under valgrind, so I didn't include it. llvm-svn: 178793	2013-04-04 21:15:42 +00:00
Shuxin Yang	74f54ae4b2	Correct assertion condition llvm-svn: 178484	2013-04-01 18:13:05 +00:00
Shuxin Yang	c53fc5dc4c	Implement XOR reassociation. It is based on following rules: rule 1: (x | c1) ^ c2 => (x & ~c1) ^ (c1^c2), only useful when c1=c2 rule 2: (x & c1) ^ (x & c2) = (x & (c1^c2)) rule 3: (x | c1) ^ (x | c2) = (x & c3) ^ c3 where c3 = c1 ^ c2 rule 4: (x | c1) ^ (x & c2) => (x & c3) ^ c1, where c3 = ~c1 ^ c2 It reduces an application's size (in terms of # of instructions) by 8.9%. Reviewed by Pete Cooper. Thanks a lot! rdar://13212115 llvm-svn: 178409	2013-03-30 02:15:01 +00:00
Jakub Staszak	760ea04733	Minor cleanups. No functionality change. llvm-svn: 177837	2013-03-24 09:56:28 +00:00
Jakub Staszak	8c92d0d919	Use dyn_cast instead of isa && cast. No functionality change. llvm-svn: 177836	2013-03-24 09:25:47 +00:00
Chandler Carruth	5dfc3ade1f	[SROA] Prefix names using a custom IRBuilder inserter. The key part of this is ensuring that name prefixes remain in a Twine form until we get to a point where we can nuke them under NDEBUG. This is tricky using the old APIs as they played fast and loose with Twine, which is prone to serious error. The inserter is much cleaner as it is actually in the call stack leading to the setName call, and so has a good opportunity to prepend the prefix. This matters more than you might imagine because most runs over an alloca find a single partition, and rewrite 3 or 4 instructions referring to it. As a consequence doing this lazily and exclusively with Twine allows the optimizer to delete more of it and shaves another 2% to 3% off of the release build's SROA run time for PR15412. I also think the APIs are cleaner, and the use of Twine is more reliable, so I consider it a win-win despite the churn required to reach this state. llvm-svn: 177631	2013-03-21 09:52:18 +00:00
Meador Inge	8c4638bcc3	simplify-libcalls: Removed unused variable The 'Modified' variable should have been removed from SimplifyLibCalls in r177619, but was missed. This commit removes it. llvm-svn: 177622	2013-03-21 02:44:07 +00:00
Meador Inge	30024047b3	Move library call prototype attribute inference to functionattrs The simplify-libcalls pass implemented a doInitialization hook to infer function prototype attributes for well-known functions. Given that the simplify-libcalls pass is going away and that the functionattrs pass is already in place to deduce function attributes, I am moving this logic to the functionattrs pass. This approach was discussed during patch review: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20121126/157465.html. llvm-svn: 177619	2013-03-21 00:55:59 +00:00
Chandler Carruth	6f1e6bc2dc	Fix a silly search-and-replace goof with r177495 that only broke non-release builds. llvm-svn: 177498	2013-03-20 07:40:56 +00:00
Chandler Carruth	13fa287d63	[SROA] Don't preserve the IR names in release builds. This is especially important because the new SROA pass goes to great lengths to provide helpful names for debugging, and as a consequence they can become very slow to render. Good for between 5% and 15% of the SROA runtime on some slow test cases such as the one in PR15412. llvm-svn: 177495	2013-03-20 07:30:36 +00:00
Chandler Carruth	16617f6650	Move the endif to the correct line so we don't have warnings about unused statistics variables. llvm-svn: 177494	2013-03-20 06:47:00 +00:00
Chandler Carruth	9248b3ae59	Introduce some new statistics to help track the exact behavior of the new SROA pass. llvm-svn: 177493	2013-03-20 06:30:46 +00:00
Quentin Colombet	268e28a41d	Update global merge pass according to Duncan's advice: - Remove useless includes - Change misleading comments - Move code into doFinalization llvm-svn: 177445	2013-03-19 21:46:49 +00:00
Arnaud A. de Grandmaison	092ac21f4f	IndVarSimplify: do not recompute an IV value outside of the loop if : - it is trivially known to be used inside the loop in a way that can not be optimized away - there is no use outside of the loop which can take advantage of the computation hoisting llvm-svn: 177432	2013-03-19 20:00:22 +00:00
Andrew Trick	2fcea6b47a	Revert "Cleanup some SCEV logic a bit." This reverts commit 82cd8f7382322bee7a71cdc31f7a923c44d37d32. Just add a comment instead! llvm-svn: 177377	2013-03-19 05:10:27 +00:00
Andrew Trick	256f5077d9	Cleanup some SCEV logic a bit. Make the code more obvious to scan-build and humans. llvm-svn: 177375	2013-03-19 04:14:59 +00:00
Andrew Trick	0dd5df1889	Tighten up an internal LSR API that should check for NULL. No test case, but should fix a scan_build warning. llvm-svn: 177374	2013-03-19 04:14:57 +00:00
Jakub Staszak	89b78a9580	Make method private. Keep coding standard. llvm-svn: 177348	2013-03-18 23:31:30 +00:00
Quentin Colombet	bb36556d97	Extend global merge pass to optionally consider global constant variables. Also add some checks to not merge globals used within landing pad instructions or marked as "used". llvm-svn: 177331	2013-03-18 22:30:07 +00:00
Chandler Carruth	1d83f79b3d	Mark internal classes as POD-like to get better behavior out of SmallVector and DenseMap. This speeds up SROA by 25% on PR15412. llvm-svn: 177259	2013-03-18 08:36:46 +00:00
Chandler Carruth	3d9eacc90b	PR14972: SROA vs. GVN exposed a really bad bug in SROA. The fundamental problem is that SROA didn't allow for overly wide loads where the bits past the end of the alloca were masked away and the load was sufficiently aligned to ensure there is no risk of page fault, or other trapping behavior. With such widened loads, SROA would delete the load entirely rather than clamping it to the size of the alloca in order to allow mem2reg to fire. This was exposed by a test case that neatly arranged for GVN to run first, widening certain loads, followed by an inline step, and then SROA which miscompiles the code. However, I see no reason why this hasn't been plaguing us in other contexts. It seems deeply broken. Diagnosing all of the above took all of 10 minutes of debugging. The really annoying aspect is that fixing this completely breaks the pass. ;] There was an implicit reliance on the fact that no loads or stores extended past the alloca once we decided to rewrite them in the final stage of SROA. This was used to encode information about whether the loads and stores had been split across multiple partitions of the original alloca. That required threading explicit tracking of whether a use of a partition is split across multiple partitions. Once that was done, another problem arose: we allowed splitting of integer loads and stores iff they were loads and stores to the entire alloca. This is a really arbitrary limitation, and splitting at least some integer loads and stores is crucial to maximize promotion opportunities. My first attempt was to start removing the restriction entirely, but currently that does Very Bad Things by causing many common alloca patterns to be fully decomposed into i8 operations and lots of or-ing together to produce larger integers on demand. The code bloat is terrifying. That is still the right end-goal, but substantial work must be done to either merge partitions or ensure that small i8 values are eagerly merged in some other pass. Sadly, figuring all this out took essentially all the time and effort here. So the end result is that we allow splitting only when the load or store at least covers the alloca. That ensures widened loads and stores don't hurt SROA, and that we don't rampantly decompose operations more than we have previously. All of this was already fairly well tested, and so I've just updated the tests to cover the wide load behavior. I can add a test that crafts the pass ordering magic which caused the original PR, but that seems really brittle and to provide little benefit. The fundamental problem is that widened loads should Just Work. llvm-svn: 177055	2013-03-14 11:32:24 +00:00
Dan Gohman	8d664849a4	Change the order of the operands in patchAndReplaceAllUsesWith so that they're more consistent with Value::replaceAllUsesWith. llvm-svn: 176872	2013-03-12 16:22:56 +00:00
Jakub Staszak	88061183a0	Keep coding standard. llvm-svn: 176661	2013-03-07 22:20:06 +00:00
Jakub Staszak	d94b12a75b	Don't create IRBuilder if we can return from the method earlier. llvm-svn: 176660	2013-03-07 22:10:33 +00:00
Preston Gurd	66b9c4fcf9	Bypass Slow Divides * Only apply divide bypass optimization when not optimizing for size. * Fixed bug caused by constant for 0 value of type Int32, used dividend type to generate the constant instead. * For atom x86-64 apply the divide bypass to use 16-bit divides instead of 64-bit divides when operand values are small enough. * Added lit tests for 64-bit divide bypass. Patch by Tyler Nowicki! llvm-svn: 176442	2013-03-04 18:13:57 +00:00
Benjamin Kramer	caa5d4f15d	CVP: If we have a PHI with an incoming select, try to skip the select. This is a common pattern with dyn_cast and similar constructs, when the PHI no longer depends on the select it can often be turned into a simpler construct or even get hoisted out of the loop. PR15340. llvm-svn: 175995	2013-02-24 15:34:43 +00:00
Bill Wendling	eecb534c87	Implement the NoBuiltin attribute. The 'nobuiltin' attribute is applied to call sites to indicate that LLVM should not treat the callee function as a built-in function. I.e., it shouldn't try to replace that function with different code. llvm-svn: 175835	2013-02-22 00:12:35 +00:00
Chad Rosier	d3dde59198	Remove dead code and whitespace. llvm-svn: 175804	2013-02-21 21:40:51 +00:00
Chad Rosier	60f8f47d44	Update a comment that looks to have been accidentally deleted many moons ago. llvm-svn: 175658	2013-02-20 20:15:55 +00:00
Jakub Staszak	06b3cdcb51	Remove unused variable. llvm-svn: 175568	2013-02-19 22:17:58 +00:00
Jakub Staszak	2cbda8564a	Minor cleanups. No functionality change. llvm-svn: 175567	2013-02-19 22:14:45 +00:00
Jakub Staszak	c60273ce06	Remove unneeded #includes. llvm-svn: 175565	2013-02-19 22:06:38 +00:00
Jakub Staszak	3b29b5aba1	Fix typos. llvm-svn: 175562	2013-02-19 22:02:21 +00:00
Jakub Staszak	18a6467836	Reduce indents in LSRInstance::NarrowSearchSpaceByCollapsingUnrolledCode method. No functionality change. llvm-svn: 175364	2013-02-16 16:08:15 +00:00
Dan Gohman	1ea13bd49f	Actually delete this code, since it's really not clear what it's trying to do. llvm-svn: 175014	2013-02-12 22:26:41 +00:00
Dan Gohman	a04e614c0f	Record PRE predecessors with a SmallVector instead of a DenseMap, and avoid a second pred_iterator traversal. llvm-svn: 175001	2013-02-12 19:49:10 +00:00
Dan Gohman	79869cd16c	When disabling PRE for a value is directly redundant with itself (through a loop), don't continue to iterate through the remaining predecessors. llvm-svn: 174994	2013-02-12 19:05:10 +00:00
Dan Gohman	6815bf69b1	Check that pointers are removed from maps before calling delete on the pointers, for tidiness' sake. llvm-svn: 174988	2013-02-12 18:44:43 +00:00
Dan Gohman	69f4c8b640	Minor code simplification. llvm-svn: 174985	2013-02-12 18:38:36 +00:00
Andrew Trick	b2b2884c68	LSR IVChain improvement. Handle chains in which the same offset is used for both loads and stores to the same array. Fixes rdar://11410078. llvm-svn: 174789	2013-02-09 01:11:01 +00:00
Jakub Staszak	ac504d22d0	Remove #includes from the commonly used LoopInfo.h. llvm-svn: 174786	2013-02-09 01:04:28 +00:00
Preston Gurd	fd12deb433	This patch aims to improve compile time performance by increasing the SCEV vector size in LoopStrengthReduce. It is observed that the BaseRegs vector size is 4 in most cases, and elements are frequently copied when it is initialized as SmallVector<const SCEV *, 2> BaseRegs. Our benchmark results show that the compilation time performance improved by ~0.5%. Patch by Wan Xiaofei. llvm-svn: 174219	2013-02-01 20:41:27 +00:00
Dan Gohman	7eac0c2694	Change GetPointerBaseWithConstantOffset's DataLayout argument from a reference to a pointer, so that it can handle the case where DataLayout is not available and behave conservatively. llvm-svn: 174024	2013-01-31 02:00:45 +00:00
Edwin Vane	fafd787d1f	Fixing warnings revealed by gcc release build Fixed set-but-not-used warnings. Reviewer: gribozavr llvm-svn: 173810	2013-01-29 17:42:24 +00:00
Michael Gottesman	3d8ed99b1f	Extracted ObjCARC.cpp into its own library libLLVMObjCARCOpts in preparation for refactoring the ARC Optimizer. llvm-svn: 173647	2013-01-28 01:35:51 +00:00
Michael Gottesman	307107e1ea	Renamed function IsPotentialUse to IsPotentialRetainableObjPtr. This name change does the following: 1. Causes the function name to use proper ARC terminology. 2. Makes it clear what the function truly does. llvm-svn: 173609	2013-01-27 06:19:48 +00:00
Michael Gottesman	fd9cebe07d	Added comment to ObjCARC elaborating what is meant by the term 'Provenance' in 'Provenance Analysis'. llvm-svn: 173374	2013-01-24 21:35:00 +00:00
Michael Gottesman	1b1c476a4c	Fixed typo. llvm-svn: 173202	2013-01-22 21:53:43 +00:00
Michael Gottesman	8eb9360a37	[ObjCARC] Refactored out the inner most 2-loops from PerformCodePlacement into the method ConnectTDBUTraversals. The method PerformCodePlacement was doing too much (i.e. 3x loops, lots of different checking). This refactoring separates the analysis section of the method into a separate function while leaving the actual code placement and analysis preparation in PerformCodePlacement. NOTE Really this part of ObjCARC should be refactored out of the main pass class into its own separate class/struct. But it is not time to make that change yet (don't want to make such an invasive change without fixing all of the bugs first). llvm-svn: 173201	2013-01-22 21:49:00 +00:00
Bill Wendling	82711f5390	More encapsulation work. Use the AttributeSet when we're talking about more than one attribute. Add a function that adds a single attribute. No functionality change intended. llvm-svn: 173196	2013-01-22 21:15:51 +00:00
Chandler Carruth	e7f6a7e82e	Begin fleshing out an interface in TTI for modelling the costs of generic function calls and intrinsics. This is somewhat overlapping with an existing intrinsic cost method, but that one seems targeted at vector intrinsics. I'll merge them or separate their names and use cases in a separate commit. This sinks the test of 'callIsSmall' down into TTI where targets can control it. The whole thing feels very hack-ish to me though. I've left a FIXME comment about the fundamental design problem this presents. It isn't yet clear to me what the users of this function really care about. I'll have to do more analysis to figure that out. Putting this here at least provides it access to proper analysis pass tools and other such. It also allows us to more cleanly implement the baseline cost interfaces in TTI. With this commit, it is now theoretically possible to simplify much of the inline cost analysis's handling of calls by calling through to this interface. That conversion will have to happen in subsequent commits as it requires more extensive restructuring of the inline cost analysis. The CodeMetrics class is now really only in the business of running over a block of code and aggregating the metrics on that block of code, with the actual cost evaluation done entirely in terms of TTI. llvm-svn: 173148	2013-01-22 11:26:02 +00:00
Chandler Carruth	632fcee01a	Switch CodeMetrics itself over to use TTI to determine if an instruction is free. The whole CodeMetrics API should probably be reworked more, but this is enough to allow deleting the duplicate code there for computing whether an instruction is free. All of the passes using this have been updated to pull in TTI and hand it to the CodeMetrics stuff. Further, a dead CodeMetrics API (analyzeFunction) is nuked for lack of users. llvm-svn: 173036	2013-01-21 13:04:33 +00:00
Michael Gottesman	e179dec59c	Improved comment. llvm-svn: 172864	2013-01-18 23:02:45 +00:00
Michael Gottesman	ed41df19ff	Fixed typo in comment. llvm-svn: 172863	2013-01-18 23:00:33 +00:00
Bill Wendling	7777bbbbf3	Use AttributeSet accessor methods instead of Attribute accessor methods. Further encapsulation of the Attribute object. Don't allow direct access to the Attribute object as an aggregate. llvm-svn: 172853	2013-01-18 21:53:16 +00:00
Benjamin Kramer	3fc02ad019	Silence GCC warning about dropping off a non-void function. llvm-svn: 172839	2013-01-18 19:45:22 +00:00
Michael Gottesman	51c56a28c1	Fixed 80+ violation. llvm-svn: 172782	2013-01-18 03:08:39 +00:00
Michael Gottesman	0f96f3f655	Added missing const from my last commit. llvm-svn: 172736	2013-01-17 18:36:17 +00:00
Michael Gottesman	54242440aa	[ObjCARC] Implemented operator<< for InstructionClass and changed a ``Visited'' Debug message to use it. llvm-svn: 172735	2013-01-17 18:32:34 +00:00
Michael Gottesman	56275116f2	[ObjCARC] Turn off ignoring unwind edges in ObjCARC when -fno-objc-arc-exception is enabled due to its effect on correctness. Specifically according to the semantics of ARC -fno-objc-arc-exception simply states that it is expected that the unwind path out of a call MAY not release objects. Thus we can have the situation where a release gets moved into a catch block which we ignore when we remove a retain/release pair resulting in (even though we assume the program is exiting anyways) the cleanup code path potentially blowing up before program exit. llvm-svn: 172599	2013-01-16 06:32:39 +00:00
Michael Gottesman	58db6d1cd7	Changed SmallPtrSet.count guard + SmallPtrSet.insert to just SmallPtrSet.insert. llvm-svn: 172452	2013-01-14 19:18:39 +00:00
Michael Gottesman	aabad66a3a	Fixed some 80+ violations. llvm-svn: 172374	2013-01-14 01:47:53 +00:00
Michael Gottesman	0ff0eb0f71	Updated the documentation in ObjCARC.cpp to fit the style guide better (i.e. use doxygen). Still some work to do though. llvm-svn: 172371	2013-01-14 00:35:14 +00:00
Michael Gottesman	c6a7902080	Fixed an infinite loop in the block escape in analysis in ObjCARC caused by 2x blocks each assigned a value via a phi-node causing each to depend on the other. A test case is provided as well. llvm-svn: 172368	2013-01-13 22:12:06 +00:00
Michael Gottesman	1f32fece1e	[ObjCARC] Even more debug messages! llvm-svn: 172347	2013-01-13 07:47:32 +00:00
Michael Gottesman	ffdbdc6957	[ObjCARC] More debug messages. llvm-svn: 172346	2013-01-13 07:00:51 +00:00
Chandler Carruth	a094848fcc	Fix an editor goof in r171738 that Bill spotted. He may even have a test case, but looking at the diff this was an obviously unintended change. Thanks for the careful review Bill! =] llvm-svn: 172336	2013-01-12 23:46:04 +00:00
Michael Gottesman	ea2af5c74a	Fixed debug message in ObjCARC. llvm-svn: 172299	2013-01-12 03:45:49 +00:00
Michael Gottesman	a00a81d4cc	Fixed a few debug messages in ObjCARC and added one. llvm-svn: 172298	2013-01-12 02:57:16 +00:00
Michael Gottesman	822cc01174	Fixed bug in ObjCARC where we were changing a call from objc_autoreleaseRV => objc_autorelease but were not updating the InstructionClass to IC_Autorelease. llvm-svn: 172288	2013-01-12 01:25:19 +00:00
Michael Gottesman	c59f1a814a	Fixed a bug where we were tail calling objc_autorelease causing an object to not be placed into an autorelease pool. The reason that this occurs is that tail calling objc_autorelease eventually tail calls -[NSObject autorelease] which supports fast autorelease. This can cause us to violate the semantic guarantees of __autoreleasing variables that assignment to an __autoreleasing variable always yields an object that is placed into the innermost autorelease pool. The fix included in this patch works by: 1. In the peephole optimization function OptimizeIndividualFunctions, always remove tail call from objc_autorelease. 2. Whenever we convert to/from an objc_autorelease, set/unset the tail call keyword as appropriate. NOTE I also handled the case where objc_autorelease is converted in OptimizeReturns to an autoreleaseRV which still violates the ARC semantics. I will be removing that in a later patch and I wanted to make sure that the tree is in a consistent state vis-a-vis ARC always. Additionally some test cases are provided and all tests that have tail call marked objc_autorelease keywords have been modified so that tail call has been removed. NOTE One test fails due to a separate bug that I am going to commit soon. Thus I marked the check line TMP: instead of CHECK: so make check does not fail. llvm-svn: 172287	2013-01-12 01:25:15 +00:00
Shuxin Yang	5df00cb3b4	PR14904: Segmentation fault running pass 'Recognize loop idioms' The root cause is mistakenly taking for granted that "dyn_cast<Instruction>(a-Value)" returns a non-NULL instruction. llvm-svn: 172145	2013-01-10 23:32:01 +00:00
Michael Gottesman	0b51c8346d	[ObjCARC Debug Message] Added debug message when we convert an autorelease into an autoreleaseRV. llvm-svn: 172034	2013-01-10 02:03:50 +00:00
Michael Gottesman	4f1ddbf0fc	[ObjCARC Debug Messages] This is a squashed commit of 3x debug message commits ala echristo's suggestion. 1. Added debug messages when in OptimizeIndividualCalls we move calls into predecessors and then erase the original call. 2. Added debug messages when in the process of moving calls in ObjCARCOpt::MoveCalls we create new RR and delete old RR. 3. Added a debug message when we visit a specific retain instruction in ObjCARCOpt::PerformCodePlacement. llvm-svn: 171988	2013-01-09 19:23:24 +00:00
Benjamin Kramer	953e24a567	LICM: Hoist insertvalue/extractvalue out of loops. Fixes PR14854. llvm-svn: 171984	2013-01-09 18:12:03 +00:00
Michael Gottesman	75a619b8c2	Fixed EOL whitespace. llvm-svn: 171791	2013-01-07 21:26:07 +00:00
Chandler Carruth	b8c9b84572	Sink AddrMode back into TargetLowering, removing one of the most peculiar headers under include/llvm. This struct still doesn't make a lot of sense, but it makes more sense down in TargetLowering than it did before. llvm-svn: 171739	2013-01-07 15:14:13 +00:00
Chandler Carruth	2c7e0782e3	Remove LSR's use of the random AddrMode struct. These variables were already in a class, just inline the four of them. I suspect that this class could be simplified some to not always keep distinct variables for these things, but it wasn't clear to me how given the usage so I opted for a trivial and mechanical translation. This removes one of the two remaining users of a header in include/llvm which does nothing more than define a 4 member struct. llvm-svn: 171738	2013-01-07 15:04:40 +00:00
Chandler Carruth	0e84971ae2	Switch the SCEV expander and LoopStrengthReduce to use TargetTransformInfo rather than TargetLowering, removing one of the primary instances of the layering violation of Transforms depending directly on Target. This is a really big deal because LSR used to be a "special" pass that could only be tested fully using llc and by looking at the full output of it. It also couldn't run with any other loop passes because it had to be created by the backend. No longer is this true. LSR is now just a normal pass and we should probably lift the creation of LSR out of lib/CodeGen/Passes.cpp and into the PassManagerBuilder. =] I've not done this, or updated all of the tests to use opt and a triple, because I suspect someone more familiar with LSR would do a better job. This change should be essentially without functional impact for normal compilations, and only change behavior of targetless compilations. The conversion required changing all of the LSR code to refer to the TTI interfaces, which fortunately are very similar to TargetLowering's interfaces. However, it also allowed us to always expect to have some implementation around. I've pushed that simplification through the pass, and leveraged it to simplify code somewhat. It required some test updates for one of two things: either we used to skip some checks altogether but now we get the default "no" answer for them, or we used to have no information about the target and now we do have some. I've also started the process of removing AddrMode, as the TTI interface doesn't use it any longer. In some cases this simplifies code, and in others it adds some complexity, but I think it's not a bad tradeoff even there. Subsequent patches will try to clean this up even further and use other (more appropriate) abstractions. Yet again, almost all of the formatting changes brought to you by clang-format. =] llvm-svn: 171735	2013-01-07 14:41:08 +00:00
Silviu Baranga	0aae14e2d4	Make the MergeGlobals pass correctly handle the address space qualifiers of the global variables. We partition the set of globals by their address space, and apply the same transformation as before to merge them. llvm-svn: 171730	2013-01-07 12:31:25 +00:00
Chandler Carruth	f82e5f38d9	Switch LoopIdiom pass to directly require target transform information. I'm sorry for duplicating bad style here, but I wanted to keep consistency. I've pinged the code review thread where this style was reviewed and changes were requested. llvm-svn: 171714	2013-01-07 09:17:41 +00:00
Chandler Carruth	f8bff2bea0	Make SimplifyCFG simply depend upon TargetTransformInfo and pass it through as a reference rather than a pointer. There is always some implementation of this available, so this simplifies code by not having to test for whether it is available or not. Further, it turns out there were piles of places where SimplifyCFG was recursing and not passing down either TD or TTI. These are fixed to be more pedantically consistent even though I don't have any particular cases where it would matter. llvm-svn: 171691	2013-01-07 03:53:25 +00:00
Chandler Carruth	601fa4e996	Make the popcnt support enums and methods have clearer names and follow the coding conventions regarding enumerating a set of "kinds" of things. llvm-svn: 171687	2013-01-07 03:16:03 +00:00
Chandler Carruth	3c0f5d4efb	Move TargetTransformInfo to live under the Analysis library. This no longer would violate any dependency layering and it is in fact an analysis. =] llvm-svn: 171686	2013-01-07 03:08:10 +00:00
Michael Gottesman	4a300ff5a0	[ObjCARC Debug Message] - Added debug message when we fuse a retain/autorelease pair in ObjCARCContract::ContractAutorelease. llvm-svn: 171679	2013-01-07 00:31:26 +00:00
Michael Gottesman	4a96ee6b2a	[ObjCARC Debug Message] - Added debug message when we zap a matching retain/autorelease pair in ObjCARCOpt::OptimizeReturns. llvm-svn: 171678	2013-01-07 00:04:56 +00:00
Michael Gottesman	9c90fe632e	[ObjCARC Debug Message] - Added debug message when we erase ARC calls with null since they are no-ops. llvm-svn: 171677	2013-01-07 00:04:52 +00:00
Michael Gottesman	abee5cf395	[ObjCARC Debug Message] - Added debug message when we add a nounwind keyword to a function which can not throw. llvm-svn: 171676	2013-01-06 23:39:13 +00:00
Michael Gottesman	49f71df55e	[ObjCARC Debug Message] - Added debug message when we add a tail keyword to a function which can never be passed stack args. llvm-svn: 171675	2013-01-06 23:39:09 +00:00
Michael Gottesman	6a2ae38c56	[ObjCARC Debug Messages] - Added missing newline. llvm-svn: 171674	2013-01-06 22:56:54 +00:00
Michael Gottesman	44c08c9584	Added debug statement to ObjCARC when we replace objc_autorelease(x) with objc_release(x) when x is otherwise unused. llvm-svn: 171673	2013-01-06 22:56:50 +00:00
Michael Gottesman	023ea4f317	Added 2x Debug statements to ObjCARC that log when we handle the two undefined pointer-to-weak-pointer is NULL cases by replacing the given call inst with an undefined value. The reason that there are two cases is that the first case handles the unary cases and the second the binary cases. llvm-svn: 171672	2013-01-06 21:54:30 +00:00
Michael Gottesman	c862b0cd55	Added debug message in ObjCARC when we remove a no-op cast which has only special semantic meaning in the frontend and thus in the optimizer can be deleted. llvm-svn: 171670	2013-01-06 21:07:15 +00:00
Michael Gottesman	8107797e22	Added debug message to ObjCARC when we transform an objc_autoreleaseReturnValue => objc_autorelease due to its operand not being used as a return value. llvm-svn: 171669	2013-01-06 21:07:11 +00:00
Andrew Trick	8c90c2b2d7	Fix a crash in LSR replaceCongruentIVs. Indirect branch in the preheader crashes replaceCongruentIVs. Fixes rdar://12910141. llvm-svn: 171653	2013-01-06 05:59:39 +00:00
Michael Gottesman	b08d13fa04	Added debug message to ObjCARC when we transform objc_retainAutoreleasedReturnValue => objc_retain since the operand to said function is not a return value. llvm-svn: 171629	2013-01-05 17:55:42 +00:00
Michael Gottesman	3f22b59b75	Added debug message for ObjCARC when we zap an objc_autoreleaseReturnValue/objc_retainAutoreleasedValue pair. llvm-svn: 171628	2013-01-05 17:55:35 +00:00
Chris Lattner	561e5f6442	switch from pointer equality comparison to MDNode::getMostGenericTBAA when merging two TBAA tags, pointed out by Nuno. llvm-svn: 171627	2013-01-05 16:44:07 +00:00
Chandler Carruth	c37f873121	Switch LoopIdiomRecognize to directly use the TargetTransformInfo interface rather than the ScalarTargetTransformInterface. llvm-svn: 171616	2013-01-05 10:00:09 +00:00
Chandler Carruth	413d8d63a5	Sink the AddressingModeMatcher helper class into an anonymous namespace next to its only user. This helper relies on TargetLowering information that shouldn't be generally used throughout the Transforms library, and so it made little sense as a generic utility. This also consolidates the file where we need to remove the remaining uses of TargetLowering in favor of the IR-layer abstract interface in TargetTransformInfo. llvm-svn: 171590	2013-01-05 02:09:22 +00:00
Michael Gottesman	c63800aa94	Added DEBUG message to ObjCARC when we optimize objc_retain => objc_retainAutoreleasedReturnValue. llvm-svn: 171535	2013-01-04 21:30:38 +00:00
Michael Gottesman	0f781a1a5c	Fixed up some DEBUG messages where I was putting in the text of a message the method where it was being called when I should have just prefixed the actual message with Pass::Method. Additionally I fixed some whitespace issues. llvm-svn: 171534	2013-01-04 21:29:57 +00:00
Michael Gottesman	a67b835abe	Changed two debug statements that state that a queue had finished being processed when said queue was really a list to state a list had finished being processed. llvm-svn: 171465	2013-01-03 08:09:27 +00:00
Michael Gottesman	27dfca0765	Added DEBUG message for ObjCARC when we zap a push/pop pair in ObjCARCAPElim::OptimizeBB. llvm-svn: 171464	2013-01-03 08:09:17 +00:00
Michael Gottesman	5c76a62ee6	Added DEBUG message to ObjCARC when we transform objc_initWeak(p, null) => *p = null. llvm-svn: 171463	2013-01-03 07:32:53 +00:00
Michael Gottesman	fcffa87a15	Added DEBUG message for ObjCARC when an inline asm marker is inserted for architectures where this is required to perform a retainAutoreleasedReturnValue optimization. llvm-svn: 171462	2013-01-03 07:32:41 +00:00
Shuxin Yang	c985c304e2	- Add comment to two functions which might be considered as dead code. - Fix a typo llvm-svn: 171399	2013-01-02 18:26:31 +00:00
Chandler Carruth	4c1f3c24db	Move all of the header files which are involved in modelling the LLVM IR into their new header subdirectory: include/llvm/IR. This matches the directory structure of lib, and begins to correct a long standing point of file layout clutter in LLVM. There are still more header files to move here, but I wanted to handle them in separate commits to make tracking what files make sense at each layer easier. The only really questionable files here are the target intrinsic tablegen files. But that's a battle I'd rather not fight today. I've updated both CMake and Makefile build systems (I think, and my tests think, but I may have missed something). I've also re-sorted the includes throughout the project. I'll be committing updates to Clang, DragonEgg, and Polly momentarily. llvm-svn: 171366	2013-01-02 11:36:10 +00:00
Chandler Carruth	5f5c383ef1	Re-sort the #include lines in include/... and lib/... with the utils/sort_includes.py script. Most of these are updating the new R600 target and fixing up a few regressions that have crept in since the last time I sorted the includes. llvm-svn: 171362	2013-01-02 10:22:59 +00:00
Benjamin Kramer	ddae3440aa	Add IRBuilder::CreateVectorSplat and use it to simplify code. llvm-svn: 171349	2013-01-01 19:55:16 +00:00
Benjamin Kramer	6a165dfca3	SROA: Clean up unused assignment warnings from clang's analyzer. No functionality change. llvm-svn: 171348	2013-01-01 16:13:35 +00:00
Michael Gottesman	bb7c4c44f3	Added DEBUG message when ObjCARC replaces a call which returns its argument verbatim with its argument to temporarily undo an optimization. Specifically these calls return their argument verbatim, as a low-level optimization. However, this makes high-level optimizations harder. We undo any uses of this optimization that the front-end emitted. We redo them later in the contract pass. llvm-svn: 171346	2013-01-01 16:05:54 +00:00
Michael Gottesman	7d1661c22c	Added DEBUG messages to the top of several processing loops in ObjCARC.cpp that emit what instructions are being visited. This is a part of a larger effort of adding DEBUG messages to the ARC Optimizer Backend. llvm-svn: 171345	2013-01-01 16:05:48 +00:00
Chris Lattner	c9303c920d	Fix LICM's memory promotion optimization to preserve TBAA tags when promoting a store in a loop. This was noticed when working on PR14753, but isn't directly related. llvm-svn: 171281	2012-12-31 08:37:17 +00:00
Nuno Lopes	0873c9d511	convert a bunch of callers from DataLayout::getIndexedOffset() to GEP::accumulateConstantOffset(). The latter API is nicer than the former, and is correct regarding wrap-around offsets (if anyone cares). There are a few more places left with duplicated code, which I'll remove soon. llvm-svn: 171259	2012-12-30 16:25:48 +00:00
Bill Wendling	e0920e4122	Remove the Function::getFnAttributes method in favor of using the AttributeSet directly. This is in preparation for removing the use of the 'Attribute' class as a collection of attributes. That will shift to the AttributeSet class instead. llvm-svn: 171253	2012-12-30 10:32:01 +00:00
Evan Cheng	69cb91fa21	Every pass deserves a name, even codegenprep. llvm-svn: 170831	2012-12-21 01:48:14 +00:00
James Molloy	de926c367f	Add a new attribute, 'noduplicate'. If a function contains a noduplicate call, the call cannot be duplicated - Jump threading, loop unrolling, loop unswitching, and loop rotation are inhibited if they would duplicate the call. Similarly inlining of the function is inhibited, if that would duplicate the call (in particular inlining is still allowed when there is only one callsite and the function has internal linkage). llvm-svn: 170704	2012-12-20 16:04:27 +00:00
Bill Wendling	56d9c4b832	Rename the 'Attributes' class to 'Attribute'. It's going to represent a single attribute in the future. llvm-svn: 170502	2012-12-19 07:18:57 +00:00
Nadav Rotem	c22e8c34a7	SROA: Replace calls to getScalarSizeInBits with DataLayout's API because getScalarSizeInBits could not handle vectors of pointers. llvm-svn: 170412	2012-12-18 05:23:31 +00:00
Chandler Carruth	60738bca93	Fix another SROA crasher, PR14601. This was a silly oversight, we weren't pruning allocas which were used by variable-length memory intrinsics from the set that could be widened and promoted as integers. Fix that. llvm-svn: 170353	2012-12-17 18:48:07 +00:00
Chandler Carruth	91d886f61b	Teach the rewriting of memcpy calls to support subvector copies. This also cleans up a bit of the memcpy call rewriting by sinking some irrelevant code further down and making the call-emitting code a bit more concrete. Previously, memcpy of a subvector would actually miscompile (!!!) the copy into a single vector element copy. I have no idea how this ever worked. =/ This is the memcpy half of PR14478 which we probably weren't noticing previously because it didn't actually assert. The rewrite relies on the newly refactored insert- and extractVector functions to do the heavy lifting, and those are the same as used for loads and stores which makes the test coverage a bit more meaningful here. llvm-svn: 170338	2012-12-17 14:51:24 +00:00
Evgeniy Stepanov	5ecec98c2c	Optimize tree walking in markAliveBlocks. Check whether a BB is known as reachable before adding it to the worklist. This way BB's with multiple predecessors are added to the list no more than once. llvm-svn: 170335	2012-12-17 14:28:00 +00:00
Chandler Carruth	e576359bf4	Fix a secondary bug I introduced while fixing the first part of PR14478. The first half of fixing this bug was actually in r170328, but was entirely coincidental. It did however get me to realize the nature of the bug, and adapt the test case to test more interesting behavior. In turn, that uncovered the rest of the bug which I've fixed here. This should fix two new asserts that showed up in the vectorize nightly tester. llvm-svn: 170333	2012-12-17 14:03:01 +00:00
Chandler Carruth	2a27cb5523	Hoist a convertValue call to the two paths where it is needed. I noticed this while looking at r170328. We only ever do a vector rewrite when the alloca is the vector type, so it's good to not paper over bugs here by doing a convertValue that isn't needed. llvm-svn: 170331	2012-12-17 13:51:03 +00:00
Chandler Carruth	35ed75156a	Hoist the insertVector helper to be a static helper. This will allow its use inside of memcpy rewriting as well. This routine is more complex than extractVector, and some of its uses are not 100% where I want them to be so there is still some work to do here. While this can technically change the output in some cases, it shouldn't be a change that matters -- IE, it can leave some dead code lying around that prior versions did not, etc. Yet another step in the refactorings leading up to the solution to the last component of PR14478. llvm-svn: 170328	2012-12-17 13:41:21 +00:00
Chandler Carruth	bbd2c1ab94	Lift the extractVector helper all the way out to a static helper function. The method helpers all implicitly act upon the alloca, and what we really want is a fully generic helper. Doing memcpy rewrites is more special than all other rewrites because we are at times rewriting instructions which touch pointers other than the alloca. As a consequence all of the helpers needed by memcpy rewriting of sub-vector copies will need to be generalized fully. Note that all of these helpers ({insert,extract}{Integer,Vector}) are woefully uncommented. I'm going to go back through and document them once I get the factoring correct. No functionality changed. llvm-svn: 170325	2012-12-17 13:07:30 +00:00
Chandler Carruth	af6b524242	Factor the vector load rewriting into a more generic form. This makes it suitable for use in rewriting memcpy in the presence of subvector memcpy intrinsics. No functionality changed. llvm-svn: 170324	2012-12-17 12:50:21 +00:00
Chandler Carruth	a079cd0144	Fix the first part of PR14478: memset now works. PR14478 highlights a serious problem in SROA that simply wasn't being exercised due to a lack of vector input code mixed with C-library function calls. Part of SROA was written carefully to handle subvector accesses via memset and memcpy, but the rewriter never grew support for this. Fixing it required refactoring the subvector access code in other parts of SROA so it could be shared, and then fixing the splat formation logic and using subvector insertion (this patch). The PR isn't quite fixed yet, as memcpy is still broken in the same way. I'm starting on that series of patches now. Hopefully this will be enough to bring the bullet benchmark back to life with the bb-vectorizer enabled, but that may require fixing memcpy as well. llvm-svn: 170301	2012-12-17 04:07:37 +00:00
Chandler Carruth	dae722c19d	Extract the logic for inserting a subvector into a vector alloca. No functionality changed. Another step of refactoring toward solving PR14487. llvm-svn: 170300	2012-12-17 04:07:35 +00:00
Chandler Carruth	fefd557661	Lift the integer splat computation into a helper function. No functionality changed. Refactoring leading up to the fix for PR14478 which requires some significant changes to the memset and memcpy rewriting. llvm-svn: 170299	2012-12-17 04:07:30 +00:00
Chandler Carruth	0fa6260bcf	Relax an overly aggressive assert to fix PR14572. The alloca width is based on the alloc size, not the type size. llvm-svn: 170270	2012-12-15 09:26:06 +00:00
Patrik Hagglund	caaedc6ade	Revert EVT->MVT changes, r169836-169851, due to buildbot failures. llvm-svn: 169854	2012-12-11 11:14:33 +00:00
Patrik Hagglund	6c9d0f4058	Change TargetLowering::getLoadExtAction to take an MVT, instead of EVT. llvm-svn: 169840	2012-12-11 09:39:09 +00:00
Chandler Carruth	4686de879c	Add a new visitor for walking the uses of a pointer value. This visitor provides infrastructure for recursively traversing the use-graph of a pointer-producing instruction like an alloca or a malloc. It maintains a worklist of uses to visit, so it can handle very deep recursions. It automatically looks through instructions which simply translate one pointer to another (bitcasts and GEPs). It tracks the offset relative to the original pointer as long as that offset remains constant and exposes it during the visit as an APInt offset. Finally, it performs conservative escape analysis. However, currently it has some limitations that should be addressed going forward: 1) It doesn't handle vectors of pointers. 2) It doesn't provide a cheaper visitor when the constant offset tracking isn't needed. 3) It doesn't support non-instruction pointer values. The current functionality is exactly what is required to implement the SROA pointer-use visitors in terms of this one, rather than in terms of their own ad-hoc base visitor, which was always very poorly specified. SROA has been converted to use this, and the code there deleted which this utility now provides. Technically speaking, using this new visitor allows SROA to handle a few more cases than it previously did. It is now more aggressive in ignoring chains of instructions which look like they would defeat SROA, but in fact do not because they never result in a read or write of memory. While this is "neat", it shouldn't be interesting for real programs as any such chains should have been removed by other passes long before we get to SROA. As a consequence, I've not added any tests for these features -- it shouldn't be part of SROA's contract to perform such heroics. The goal is to extend the functionality of this visitor going forward, and re-use it from passes like ASan that can benefit from doing a detailed walk of the uses of a pointer. Thanks to Ben Kramer for the code review rounds and lots of help reviewing and debugging this patch. llvm-svn: 169728	2012-12-10 08:28:39 +00:00
Chandler Carruth	c9b6bd9712	Fix PR14548: SROA was crashing on a mixture of i1 and i8 loads and stores. When SROA was evaluating a mixture of i1 and i8 loads and stores, in just a particular case, it would tickle a latent bug where we compared bits to bytes rather than bits to bits. As a consequence of the latent bug, we would allow integers through which were not byte-size multiples, a situation the later rewriting code was never intended to handle. In release builds this could trigger all manner of oddities, but the reported issue in PR14548 was forming invalid bitcast instructions. The only downside of this fix is that it makes it more clear that SROA in its current form is not capable of handling mixed i1 and i8 loads and stores. Sometimes with the previous code this would work by luck, but usually it would crash, so I'm not terribly worried. I'll watch the LNT numbers just to be sure. llvm-svn: 169719	2012-12-10 00:54:45 +00:00
Chandler Carruth	1e72559a05	Switch SROA to pop Uses off the back of its visitors' queues. This will more closely match the behavior of the new PtrUseVisitor that I am adding. Hopefully this will not change the actual behavior in any way, but by making the processing order more similar help in debugging. llvm-svn: 169697	2012-12-09 11:56:01 +00:00
Shuxin Yang	7221b14d96	- Re-enable population count loop idiom recognition - fix a bug which causes a segfault. - add two test cases which were causing crashes llvm-svn: 169687	2012-12-09 03:12:46 +00:00
Chandler Carruth	329a5c1e03	Revert the patches adding a popcount loop idiom recognition pass. There are still bugs in this pass, as well as other issues that are being worked on, but the bugs are crashers that occur pretty easily in the wild. Test cases have been sent to the original commit's review thread. This reverts the commits: r169671: Fix a logic error. r169604: Move the popcnt tests to an X86 subdirectory. r168931: Initial commit adding the pass. llvm-svn: 169683	2012-12-08 22:18:29 +00:00
Shuxin Yang	d80db0a201	Fix an inadvertent typo error. llvm-svn: 169671	2012-12-08 05:00:59 +00:00
Bill Wendling	3f153ce37b	s/AttrListPtr/AttributeSet/g to better label what this class is going to be in the near future. llvm-svn: 169651	2012-12-07 23:16:57 +00:00
Bill Wendling	7119fab4de	Set the 'MadeChange' variable if we are deleting blocks. llvm-svn: 169455	2012-12-06 00:30:20 +00:00
Matt Beaumont-Gay	3e68d7d342	Add 'using' declarations to suppress -Woverloaded-virtual warnings. llvm-svn: 169214	2012-12-04 05:41:27 +00:00
Nadav Rotem	489fb9a4c3	Teach the jump threading optimization to stop scanning the basic block when calculating the cost after passing the threshold. llvm-svn: 169135	2012-12-03 17:34:44 +00:00
Chandler Carruth	a490793037	Use the new script to sort the includes of every file under lib. Sooooo many of these had incorrect or strange main module includes. I have manually inspected all of these, and fixed the main module include to be the nearest plausible thing I could find. If you own or care about any of these source files, I encourage you to take some time and check that these edits were sensible. I can't have broken anything (I strictly added headers, and reordered them, never removed), but they may not be the headers you'd really like to identify as containing the API being implemented. Many forward declarations and missing includes were added to header files to allow them to parse cleanly when included first. The main module rule does in fact have its merits. =] llvm-svn: 169131	2012-12-03 16:50:05 +00:00

6027 Commits