llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-22 04:22:57 +02:00

Author	SHA1	Message	Date
Chris Lattner	bb93cd80d6	Enhance LICM to promote alias sets whose pointers themselves are stored, which doesn't affect the memory address being promoted. llvm-svn: 122172	2010-12-19 05:57:25 +00:00
Chris Lattner	71fcecf597	fix PR8602, a bug in an assertion: a volatile store of a pointer does not make the alias set for that pointer volatile, just stores to the pointer. llvm-svn: 122171	2010-12-19 05:51:54 +00:00
Chris Lattner	0965f3f76d	revert r122164, I'm going to go with a different approach. llvm-svn: 122168	2010-12-19 04:23:03 +00:00
Chris Lattner	14a3e26146	first step to fixing PR8642: don't fold away empty basic blocks which have trapping constant exprs in them due to PHI nodes. Eliminating them can cause the constant expr to be evaluated on new paths if the input edges are critical. llvm-svn: 122164	2010-12-19 03:02:34 +00:00
Chris Lattner	1cc35d2472	move this test into the ARM test so that it is only run when the arm backend is enabled. llvm-svn: 122163	2010-12-19 02:58:14 +00:00
Nate Begeman	063d88d6fb	Add vector versions of some existing scalar transforms to aid codegen in matching psign & pblend operations to the IR produced by clang/gcc for their C idioms. llvm-svn: 122105	2010-12-17 23:12:19 +00:00
Owen Anderson	6acf8c9125	Reapply r121905 (automatic synthesis of @llvm.sadd.with.overflow) with a fix for a bug that manifested itself on the DragonEgg self-host bot. Unfortunately, the testcase is pretty messy and doesn't reduce well due to interactions with other parts of InstCombine. llvm-svn: 122072	2010-12-17 18:08:00 +00:00
Benjamin Kramer	39b30b18fa	SimplifyCFG: Ranges can be larger than 64 bits. Fixes Release-selfhost build. llvm-svn: 122054	2010-12-17 10:48:14 +00:00
Chris Lattner	e92f8121d4	improve switch formation to handle small range comparisons formed by comparisons. For example, this: void foo(unsigned x) { if (x == 0 || x == 1 || x == 3 || x == 4 || x == 6) bar(); } compiles into: _foo: ## @foo ## BB#0: ## %entry cmpl $6, %edi ja LBB0_2 ## BB#1: ## %entry movl %edi, %eax movl $91, %ecx btq %rax, %rcx jb LBB0_3 instead of: _foo: ## @foo ## BB#0: ## %entry cmpl $2, %edi jb LBB0_4 ## BB#1: ## %switch.early.test cmpl $6, %edi ja LBB0_3 ## BB#2: ## %switch.early.test movl %edi, %eax movl $88, %ecx btq %rax, %rcx jb LBB0_4 This catches a bunch of cases in GCC, which look like this: %804 = load i32* @which_alternative, align 4, !tbaa !0 %805 = icmp ult i32 %804, 2 %806 = icmp eq i32 %804, 3 %or.cond121 = or i1 %805, %806 %807 = icmp eq i32 %804, 4 %or.cond124 = or i1 %or.cond121, %807 br i1 %or.cond124, label %.thread, label %808 turning this into a range comparison. llvm-svn: 122045	2010-12-17 06:20:15 +00:00
Dan Gohman	f8949c3d1a	Revert r64460. strtol and friends cannot be marked readonly, even with a null endptr argument, because they may write to errno. This fixes a selfhost miscompile observed on Linux targets when TBAA was enabled. llvm-svn: 122014	2010-12-17 01:09:43 +00:00
Duncan Sands	22de496ae3	Speculatively revert commit 121905 since it looks like it might have broken the dragonegg self-host buildbot. Original commit message: Add an InstCombine transform to recognize instances of manual overflow-safe addition (performing the addition in a wider type and explicitly checking for overflow), and fold them down to intrinsics. This currently only supports signed-addition, but could be generalized if someone works out the magic constant formulas for other operations. llvm-svn: 121965	2010-12-16 09:40:54 +00:00
Dan Gohman	a2fd4f2e22	Preserve TBAA tags when doing load PRE. llvm-svn: 121921	2010-12-15 23:53:55 +00:00
Owen Anderson	aefeb448a9	Add an InstCombine transform to recognize instances of manual overflow-safe addition (performing the addition in a wider type and explicitly checking for overflow), and fold them down to intrinsics. This currently only supports signed-addition, but could be generalized if someone works out the magic constant formulas for other operations. Fixes <rdar://problem/8558713>. llvm-svn: 121905	2010-12-15 22:32:38 +00:00
Frits van Bommel	83b7c3773f	Teach jump threading to "look through" a select when the branch direction of a terminator depends on it. When it sees a promising select it now tries to figure out whether the condition of the select is known in any of the predecessors and if so it maps the operands appropriately. llvm-svn: 121859	2010-12-15 09:51:20 +00:00
Owen Anderson	de42e1136e	Fix PR8790, another instance where unreachable code can cause instruction simplification to fail; this case involves a select that simplifies to itself. llvm-svn: 121817	2010-12-15 00:55:35 +00:00
Chris Lattner	c1aaf52608	- Insert new instructions before DomBlock's terminator, which is simpler than finding a place to insert in BB. - Don't perform the 'if condition hoisting' xform on certain i1 PHIs, as it interferes with switch formation. This re-fixes "example 7", without breaking the world hopefully. llvm-svn: 121764	2010-12-14 08:46:09 +00:00
Chris Lattner	22d4dc5a4d	fix two significant issues with FoldTwoEntryPHINode: first, it can kick in on blocks whose conditions have been folded to a constant, even though one of the edges will be trivially folded. second, it doesn't clean up the "if diamond" that it just eliminated away. This is a problem because other simplifycfg xforms kick in depending on the order of block visitation, causing pointless work. llvm-svn: 121762	2010-12-14 08:01:53 +00:00
Chris Lattner	5d4aea9791	fix yet another broken line llvm-svn: 121750	2010-12-14 06:09:07 +00:00
Chris Lattner	093b5b256d	reapply my recent change that disables a piece of the switch formation work, but fixes 400.perlbmk. llvm-svn: 121749	2010-12-14 05:57:30 +00:00
Owen Anderson	5536134dc4	Fix recent buildbot breakage by pulling SimplifyCFG back to its state as of r121694, the most recent state where I'm confident there were no crashes or miscompilations. XFAIL the test added since then for now. llvm-svn: 121733	2010-12-13 23:49:28 +00:00
Chris Lattner	dcba81d96f	temporarily disable part of my previous patch, which causes an iterator invalidation issue, causing a crash on some versions of perlbmk. llvm-svn: 121728	2010-12-13 23:02:19 +00:00
Benjamin Kramer	7f1cdac1e4	Fix sort predicate. qsort(3)'s predicate semantics differ from std::sort's. Fixes PR 8780. llvm-svn: 121705	2010-12-13 18:20:38 +00:00
Chris Lattner	0368bf7457	reinstate my patch: the miscompile was caused by an inverted branch in the 'and' case. llvm-svn: 121695	2010-12-13 08:12:19 +00:00
Chris Lattner	caad324345	Completely disable the optimization I added in r121680 until I can track down a miscompile. This should bring the buildbots back to life llvm-svn: 121693	2010-12-13 07:41:29 +00:00
Chris Lattner	5ce3e42d80	Make simplifycfg reprocess newly formed "br (cond1 | cond2)" conditions when simplifying, allowing them to be eagerly turned into switches. This is the last step required to get "Example 7" from this blog post: http://blog.regehr.org/archives/320 On X86, we now generate this machine code, which (to my eye) seems better than the ICC generated code: _crud: ## @crud ## BB#0: ## %entry cmpb $33, %dil jb LBB0_4 ## BB#1: ## %switch.early.test addb $-34, %dil cmpb $58, %dil ja LBB0_3 ## BB#2: ## %switch.early.test movzbl %dil, %eax movabsq $288230376537592865, %rcx ## imm = 0x400000017001421 btq %rax, %rcx jb LBB0_4 LBB0_3: ## %lor.rhs xorl %eax, %eax ret LBB0_4: ## %lor.end movl $1, %eax ret llvm-svn: 121690	2010-12-13 07:00:06 +00:00
Chris Lattner	ea15ce73be	fix a bug in r121680 that upset the various buildbots. llvm-svn: 121687	2010-12-13 05:34:18 +00:00
Chris Lattner	c331eb8e1e	make these tests a bit less fragile llvm-svn: 121682	2010-12-13 05:10:30 +00:00
Chris Lattner	5cbbcc56ad	enhance the "change or icmp's into switch" xform to handle one value in an 'or sequence' that it doesn't understand. This allows us to optimize something insane like this: int crud (unsigned char c, unsigned x) { if(((((((((( (int) c <= 32 || (int) c == 46) || (int) c == 44) || (int) c == 58) || (int) c == 59) || (int) c == 60) || (int) c == 62) || (int) c == 34) || (int) c == 92) || (int) c == 39) != 0) foo(); } into: define i32 @crud(i8 zeroext %c, i32 %x) nounwind ssp noredzone { entry: %cmp = icmp ult i8 %c, 33 br i1 %cmp, label %if.then, label %switch.early.test switch.early.test: ; preds = %entry switch i8 %c, label %if.end [ i8 39, label %if.then i8 44, label %if.then i8 58, label %if.then i8 59, label %if.then i8 60, label %if.then i8 62, label %if.then i8 46, label %if.then i8 92, label %if.then i8 34, label %if.then ] by pulling the < comparison out ahead of the newly formed switch. llvm-svn: 121680	2010-12-13 04:50:38 +00:00
Chris Lattner	e35f4b31f4	merge two tests llvm-svn: 121679	2010-12-13 04:45:56 +00:00
Chris Lattner	25b642edfd	Fix my previous patch to handle a degenerate case that the llvm-gcc bootstrap buildbot tripped over. llvm-svn: 121674	2010-12-13 03:43:57 +00:00
Chris Lattner	a21c02e807	fix a fairly serious oversight with switch formation from or'd conditions. Previously we'd compile something like this: int crud (unsigned char c) { return c == 62 || c == 34 || c == 92; } into: switch i8 %c, label %lor.rhs [ i8 62, label %lor.end i8 34, label %lor.end ] lor.rhs: ; preds = %entry %cmp8 = icmp eq i8 %c, 92 br label %lor.end lor.end: ; preds = %entry, %entry, %lor.rhs %0 = phi i1 [ true, %entry ], [ %cmp8, %lor.rhs ], [ true, %entry ] %lor.ext = zext i1 %0 to i32 ret i32 %lor.ext which failed to merge the compare-with-92 into the switch. With this patch we simplify this all the way to: switch i8 %c, label %lor.rhs [ i8 62, label %lor.end i8 34, label %lor.end i8 92, label %lor.end ] lor.rhs: ; preds = %entry br label %lor.end lor.end: ; preds = %entry, %entry, %entry, %lor.rhs %0 = phi i1 [ true, %entry ], [ false, %lor.rhs ], [ true, %entry ], [ true, %entry ] %lor.ext = zext i1 %0 to i32 ret i32 %lor.ext which is much better for codegen's switch lowering stuff. This kicks in 33 times on 176.gcc (for example) cutting 103 instructions off the generated code. llvm-svn: 121671	2010-12-13 03:18:54 +00:00
Benjamin Kramer	a638216447	Generalize the and-icmp-select instcombine further by allowing selects of the form (x & 2^n) ? 2^m+C : C we can offset both arms by C to get the "(x & 2^n) ? 2^m : 0" form, optimize the select to a shift and apply the offset afterwards. llvm-svn: 121609	2010-12-11 10:49:22 +00:00
Benjamin Kramer	5a1721f4ac	Factor the (x & 2^n) ? 2^m : 0 instcombine into its own method and generalize it to catch cases where n != m with a shift. llvm-svn: 121608	2010-12-11 09:42:59 +00:00
Chris Lattner	996691e79c	enhance memcpyopt to zap memcpy's that have the same src/dst. llvm-svn: 121362	2010-12-09 07:45:45 +00:00
Chris Lattner	4fef82afa0	fix PR8753, eliminating a case where we'd infinitely make a substitution because it doesn't actually change the IR. Patch by Jakub Staszak! llvm-svn: 121361	2010-12-09 07:39:50 +00:00
Dan Gohman	3d9fc7db03	Really check that the bits that will become zero are actually already zero before eliminating the operation that zeros them. This fixes rdar://8739316. llvm-svn: 121353	2010-12-09 02:52:17 +00:00
Chris Lattner	12c2c17ac7	reapply r121100 with a tweak to constant fold ConstExprs with TargetData (if available) as we go so that we get simple constantexprs not insane ones. This fixes the failure of clang/test/CodeGenCXX/virtual-base-ctor.cpp that the previous iteration of this patch had. llvm-svn: 121111	2010-12-07 04:33:29 +00:00
Eric Christopher	cab6997dc8	Temporarily revert r121100 as it's causing clang to fail CodeGenCXX/virtual-base-ctor.cpp. llvm-svn: 121102	2010-12-07 02:41:11 +00:00
Chris Lattner	5996a47663	fix PR8710 - teach global opt that some constantexprs are too complex to put in a global variable's initializer. llvm-svn: 121100	2010-12-07 01:59:32 +00:00
Frits van Bommel	1494a2f6fe	Implement jump threading of 'indirectbr' by keeping track of whether we're looking for ConstantInts or BlockAddresses. llvm-svn: 121066	2010-12-06 23:36:56 +00:00
Chris Lattner	db6c348f31	Fix PR8728, a miscompilation I recently introduced. When optimizing memcpy's like: memcpy(A, B) memcpy(A, C) we cannot delete the first memcpy as dead if A and C might be aliases. If so, we actually get: memcpy(A, B) memcpy(A, A) which is not correct to transform into: memcpy(A, A) This patch was heavily influenced by Jakub Staszak's patch in PR8728, thanks Jakub! llvm-svn: 120974	2010-12-06 01:48:06 +00:00
Frits van Bommel	31cf7b99f9	Teach SimplifyCFG to turn (indirectbr (select cond, blockaddress(@fn, BlockA), blockaddress(@fn, BlockB))) into (br cond, BlockA, BlockB). llvm-svn: 120943	2010-12-05 18:29:03 +00:00
Chris Lattner	c3112f1e94	fix a bozo bug I introduced in r119930, causing a miscompile of 20040709-1.c from the gcc testsuite. I was using the size of a pointer instead of the pointee. This fixes rdar://8713376 llvm-svn: 120519	2010-12-01 01:24:55 +00:00
Chris Lattner	c888f3ec58	Enhance DSE to handle the variable index case in PR8657. llvm-svn: 120498	2010-11-30 23:43:23 +00:00
Chris Lattner	b9c5a6fa04	teach DSE to use GetPointerBaseWithConstantOffset to analyze may-aliasing stores that partially overlap with different base pointers. This implements PR6043 and the non-variable part of PR8657 llvm-svn: 120485	2010-11-30 23:05:20 +00:00
Chris Lattner	41b6b286a3	enhance isRemovable to refuse to delete volatile mem transfers now that DSE hacks on them. This fixes a regression I introduced, by generalizing DSE to hack on transfers. llvm-svn: 120445	2010-11-30 19:12:10 +00:00
Chris Lattner	7d444d0682	Rewrite the main DSE loop to be written in terms of reasoning about pairs of AA::Location's instead of looking for MemDep's "Def" predicate. This is more powerful and general, handling memset/memcpy/store all uniformly, and implementing PR8701 and probably obsoleting parts of memcpyoptimizer. This also fixes an obscure bug with init.trampoline and i8 stores, but I'm not surprised it hasn't been hit yet. Enhancing init.trampoline to carry the size that it stores would allow DSE to be much more aggressive about optimizing them. llvm-svn: 120406	2010-11-30 07:23:21 +00:00
Anders Carlsson	67e9e6234c	Add a puts optimization that converts puts("") to putchar('\n'). llvm-svn: 120398	2010-11-30 06:19:18 +00:00
Anders Carlsson	2a46a03898	Fix a typo. llvm-svn: 120394	2010-11-30 06:03:55 +00:00
Anders Carlsson	a2ad88fb73	Rename this test to FPuts.ll since it actually tests fputs. llvm-svn: 120393	2010-11-30 05:59:26 +00:00
Chris Lattner	bea813875e	remove a use of llvm-dis llvm-svn: 120383	2010-11-30 02:04:15 +00:00
Chris Lattner	56b0cc6974	merge one more away llvm-svn: 120375	2010-11-30 01:06:43 +00:00
Chris Lattner	8e2909e4d8	I already merged partial-overwrite.ll -> PartialStore.ll Merge context-sensitive.ll -> simple.ll and upgrade it. llvm-svn: 120374	2010-11-30 01:05:07 +00:00
Chris Lattner	496eacefab	clean up DSE tests, removing some poorly reduced and useless old test, merging more into other larger .ll files, filecheckizing along the way. llvm-svn: 120373	2010-11-30 01:00:34 +00:00
Chris Lattner	083731f3d6	enhance basicaa to return "Mod" for a memcpy call when the queried location doesn't overlap the source, and add a testcase. llvm-svn: 120370	2010-11-30 00:43:16 +00:00
Chris Lattner	8ec1830a01	Teach basicaa that memset's modref set is at worst "mod" and never contains "ref". Enhance DSE to use a modref query instead of a store-specific hack to generalize the "ignore may-alias stores" optimization to handle memset and memcpy. llvm-svn: 120368	2010-11-30 00:28:45 +00:00
Chris Lattner	975dcf5ac8	my previous patch would cause us to start deleting some volatile stores, fix and add a testcase. llvm-svn: 120363	2010-11-30 00:12:39 +00:00
Benjamin Kramer	84bf47f2d8	Fix some broken CHECK lines. llvm-svn: 120332	2010-11-29 22:34:55 +00:00
Chris Lattner	2f8a9e1eac	fix PR8677, patch by Jakub Staszak! llvm-svn: 120325	2010-11-29 21:59:31 +00:00
Frits van Bommel	a59a8cf49f	Transform (extractvalue (load P), ...) to (load (gep P, 0, ...)) if the load has no other uses, shrinking the load. llvm-svn: 120323	2010-11-29 21:56:20 +00:00
Frits van Bommel	77f6750c83	Update this test to keep testing the -instcombine transform it's supposed to be testing instead of triggering the improved constant folding for insertvalue and extractvalue. llvm-svn: 120319	2010-11-29 20:55:40 +00:00
Frits van Bommel	6610b43890	Teach ConstantFoldInstruction() how to fold insertvalue and extractvalue. llvm-svn: 120316	2010-11-29 20:36:52 +00:00
Nick Lewycky	f6fa6b29f4	Treat a call of function pointer like a load of the pointer when considering whether the pointer can be replaced with the global variable it is a copy of. Fixes PR8680. llvm-svn: 120126	2010-11-24 22:04:20 +00:00
Benjamin Kramer	8d7096e8ca	The srem -> urem transform is not safe for any divisor that's not a power of two. E.g. -5 % 5 is 0 with srem and 1 with urem. Also addresses Frits van Bommel's comments. llvm-svn: 120049	2010-11-23 20:33:57 +00:00
Benjamin Kramer	c8e6037e7d	InstCombine: Reduce "X shift (A srem B)" to "X shift (A urem B)" iff B is positive. This allows to transform the rem in "1 << ((int)x % 8);" to an and. llvm-svn: 120028	2010-11-23 18:52:42 +00:00
Duncan Sands	555525adf4	Exploit distributive laws (eg: And distributes over Or, Mul over Add, etc) in a fairly systematic way in instcombine. Some of these cases were already dealt with, in which case I removed the existing code. The case of Add has a bunch of funky logic which covers some of this plus a few variants (considers shifts to be a form of multiplication), which I didn't touch. The simplification performed is: A*B+A*C -> A*(B+C). The improvement is to do this in cases that were not already handled [such as A*B-A*C -> A*(B-C), which was reported on the mailing list], and also to do it more often by not checking for "only one use" if "B+C" simplifies. llvm-svn: 120024	2010-11-23 14:23:47 +00:00
Chris Lattner	41281bd30f	duncan's spider sense was right, I completely reversed the condition on this instcombine xform. This fixes a miscompilation of 403.gcc. llvm-svn: 119988	2010-11-23 02:42:04 +00:00
Benjamin Kramer	b5a2a81094	InstCombine: Implement X - A*-B -> X + A*B. llvm-svn: 119984	2010-11-22 20:31:27 +00:00
Duncan Sands	73f0559779	If a GEP index simply advances by multiples of a type of zero size, then replace the index with zero. llvm-svn: 119974	2010-11-22 16:32:50 +00:00
Duncan Sands	c2b128ad7d	Add a rather pointless InstructionSimplify transform, inspired by recent constant folding improvements: if P points to a type of size zero, turn "gep P, N" into "P". More generally, if a gep index type has size zero, instcombine could replace the index with zero, but that is not done here. llvm-svn: 119942	2010-11-21 13:53:09 +00:00
Chris Lattner	3a0edfb37c	implement PR8576, deleting dead stores with intervening may-alias stores. llvm-svn: 119927	2010-11-21 07:34:32 +00:00
Chris Lattner	32a16bce7a	file checkize llvm-svn: 119926	2010-11-21 07:32:40 +00:00
Chris Lattner	908a01328c	optimize: void a(int x) { if (((1<<x)&8)==0) b(); } into "x != 3", which occurs over 100 times in 403.gcc but in no other program in llvm-test. llvm-svn: 119922	2010-11-21 06:44:42 +00:00
Chris Lattner	ba1cc33676	Implement PR8644: forwarding a memcpy value to a byval, allowing the memcpy to be eliminated. Unfortunately, the requirements on byval's without explicit alignment are really weak and impossible to predict in the mid-level optimizer, so this doesn't kick in much with current frontends. The fix is to change clang to set alignment on all byval arguments. llvm-svn: 119916	2010-11-21 00:28:59 +00:00
Owen Anderson	5ee547b9d5	Add a test for CodeGenPrepare's ability to look through PHI nodes when performing addressing mode folding, introduced in r119853. llvm-svn: 119857	2010-11-19 22:34:53 +00:00
Duncan Sands	4562d3b919	Factor code for testing whether replacing one value with another preserves LCSSA form out of ScalarEvolution and into the LoopInfo class. Use it to check that SimplifyInstruction simplifications are not breaking LCSSA form. Fixes PR8622. llvm-svn: 119727	2010-11-18 19:59:41 +00:00
Owen Anderson	c2db966e5e	Completely rework the data structure GVN uses to represent the value number to leader mapping. Previously, this was a tree of hashtables, and a query recursed into the table for the immediate dominator ad infinitum if the initial lookup failed. This led to really bad performance on tall, narrow CFGs. We can instead replace it with what is conceptually a multimap of value numbers to leaders (actually represented by a hashtable with a list of Value*'s as the value type), and then determine which leader from that set to use very cheaply thanks to the DFS numberings maintained by DominatorTree. Because there are typically few duplicates of a given value, this scan tends to be quite fast. Additionally, we use a custom linked list and BumpPtr allocation to avoid any unnecessary allocation in representing the value-side of the multimap. This change brings with it a 15% (!) improvement in the total running time of GVN on 403.gcc, which I think is pretty good considering that includes all the "real work" being done by MemDep as well. The one downside to this approach is that we can no longer use GVN to perform simple conditional propagation, but that seems like an acceptable loss since we now have LVI and CorrelatedValuePropagation to pick up the slack. If you see conditional propagation that's not happening, please file bugs against LVI or CVP. llvm-svn: 119714	2010-11-18 18:32:40 +00:00
Dan Gohman	ec75e876ab	Add support for PHI-translating sext, zext, and trunc instructions, enabling more PRE. PR8586. llvm-svn: 119704	2010-11-18 17:05:13 +00:00
Chris Lattner	c752718881	remove a pointless restriction from memcpyopt. It was refusing to optimize two memcpy's like this: copy A <- B copy C <- A if it couldn't prove that noalias(B,C). We can eliminate the copy by producing a memmove instead of memcpy. llvm-svn: 119694	2010-11-18 08:00:57 +00:00
Chris Lattner	1000d06bee	filecheckize, this is still not optimal, see PR8643 llvm-svn: 119693	2010-11-18 07:49:32 +00:00
Chris Lattner	6048697a30	allow eliminating an alloca that is just copied from a constant global if it is passed as a byval argument. The byval argument will just be a read, so it is safe to read from the original global instead. This allows us to promote away the %agg.tmp alloca in PR8582 llvm-svn: 119686	2010-11-18 06:41:51 +00:00
Chris Lattner	791e914b1b	enhance the "alloca is just a memcpy from constant global" to ignore calls that obviously can't modify the alloca because they are readonly/readnone. llvm-svn: 119683	2010-11-18 06:26:49 +00:00
Chris Lattner	44ccd4643d	fix a small oversight in the "eliminate memcpy from constant global" optimization. If the alloca that is "memcpy'd from constant" also has a memcpy from it, ignore it: it is a load. We now optimize the testcase to: define void @test2() { %B = alloca %T %a = bitcast %T* @G to i8* %b = bitcast %T* %B to i8* call void @llvm.memcpy.p0i8.p0i8.i64(i8* %b, i8* %a, i64 124, i32 4, i1 false) call void @bar(i8* %b) ret void } previously we would generate: define void @test() { %B = alloca %T %b = bitcast %T* %B to i8* %G.0 = getelementptr inbounds %T* @G, i32 0, i32 0 %tmp3 = load i8* %G.0, align 4 %G.1 = getelementptr inbounds %T* @G, i32 0, i32 1 %G.15 = bitcast [123 x i8]* %G.1 to i8* %1 = bitcast [123 x i8]* %G.1 to i984* %srcval = load i984* %1, align 1 %B.0 = getelementptr inbounds %T* %B, i32 0, i32 0 store i8 %tmp3, i8* %B.0, align 4 %B.1 = getelementptr inbounds %T* %B, i32 0, i32 1 %B.12 = bitcast [123 x i8]* %B.1 to i8* %2 = bitcast [123 x i8]* %B.1 to i984* store i984 %srcval, i984* %2, align 1 call void @bar(i8* %b) ret void } llvm-svn: 119682	2010-11-18 06:20:47 +00:00
Chris Lattner	c1e63bb987	filecheckize llvm-svn: 119681	2010-11-18 06:16:43 +00:00
Benjamin Kramer	1b330efb46	InstCombine: Add a missing irem identity (X % X -> 0). llvm-svn: 119538	2010-11-17 19:11:46 +00:00
Duncan Sands	825c7d7f79	In which I discover the existence of loops. Threading an operation over a phi node by applying it to each operand may be wrong if the operation and the phi node are mutually interdependent (the testcase has a simple example of this). So only do this transform if it would be correct to perform the operation in each predecessor of the block containing the phi, i.e. if the other operands all dominate the phi. This should fix the FFMPEG snow.c regression reported by İsmail Dönmez. llvm-svn: 119347	2010-11-16 12:16:38 +00:00
Duncan Sands	ccdccb2776	Teach InstructionSimplify the trick of skipping incoming phi values that are equal to the phi itself. llvm-svn: 119161	2010-11-15 17:52:45 +00:00
Duncan Sands	658bdcf094	Move PHI tests to phi.ll, out of select.ll. llvm-svn: 119153	2010-11-15 16:43:28 +00:00
Duncan Sands	617030ad18	Teach InstructionSimplify about phi nodes. I chose to have it simply offload the work to hasConstantValue rather than do something more complicated (such as handling mutually recursive phis) because (1) it is not clear it is worth it; and (2) if it is worth it, maybe such logic would be better placed in hasConstantValue. Adjust some GVN tests which are now cleaned up much further (eg: all phi nodes are removed). llvm-svn: 119043	2010-11-14 13:30:18 +00:00
Chris Lattner	7ade54ed02	rename test. llvm-svn: 119033	2010-11-14 07:03:49 +00:00
Chris Lattner	9b45357c09	filecheckize, remove an old and useless test llvm-svn: 119032	2010-11-14 07:03:38 +00:00
Chris Lattner	449c3bd7b5	this test is pretty pointless and "propogation" isn't a word (or so Misha claims). llvm-svn: 119031	2010-11-14 07:02:03 +00:00
Duncan Sands	47dddbe925	Testcase to go along with commit 118923 ("Have GVN simplify instructions as it goes"). Before -std-compile-opts only got it down to %a = tail call i32 @foo(i32 0) readnone %x = tail call i32 @foo(i32 %a) readnone %y = tail call i32 @foo(i32 %a) readnone %z = icmp eq i32 %x, %y ret i1 %z while now -basicaa -gvn alone reduce it to %a = call i32 @foo(i32 0) readnone %x = call i32 @foo(i32 %a) readnone ret i1 true llvm-svn: 119009	2010-11-13 21:33:19 +00:00
Duncan Sands	88fc6cd7fe	Generalize the reassociation transform in SimplifyCommutative (now renamed to SimplifyAssociativeOrCommutative) "(A op C1) op C2" -> "A op (C1 op C2)", which previously was only done if C1 and C2 were constants, to occur whenever "C1 op C2" simplifies (a la InstructionSimplify). Since the simplifying operand combination can no longer be assumed to be the right-hand terms, consider all of the possible permutations. When compiling "gcc as one big file", transform 2 (i.e. using right-hand operands) fires about 4000 times but it has to be said that most of the time the simplifying operands are both constants. Transforms 3, 4 and 5 each fired once. Transform 6, which is an existing transform that I didn't change, never fired. With this change, the testcase is now optimized perfectly with one run of instcombine (previously it required instcombine + reassociate + instcombine, and it may just have been luck that this worked). llvm-svn: 119002	2010-11-13 15:10:37 +00:00
Dan Gohman	bc5a716f10	Enhance DSE to handle the case where a free call makes more than one store dead. This is especially noticeable in SingleSource/Benchmarks/Shootout/objinst. llvm-svn: 118875	2010-11-12 02:19:17 +00:00
Dan Gohman	2cf0e523cb	Filecheckize. llvm-svn: 118874	2010-11-12 02:02:39 +00:00
Dan Gohman	b20b7d2ec2	Factor out Instruction::isSafeToSpeculativelyExecute's code for testing for dereferenceable pointers into a helper function, isDereferenceablePointer. Teach it how to reason about GEPs with simple non-zero indices. Also eliminate ArgumentPromotion's IsAlwaysValidPointer, which didn't check for weak externals or out of range gep indices. llvm-svn: 118840	2010-11-11 21:23:25 +00:00
Dan Gohman	2b4e8302a6	Enhance GVN to do more precise alias queries for non-local memory references. For example, this allows gvn to eliminate the load in this example: void foo(int n, int* p, int q) { p[0] = 0; p[1] = 1; if (n) { q = p[0]; } } llvm-svn: 118714	2010-11-10 20:37:15 +00:00
Duncan Sands	8531f14f5b	Teach InstructionSimplify how to look through PHI nodes. Since PHI nodes can be used in loops, this could result in infinite looping if there is no recursion limit, so add such a limit. It is also used for the SelectInst case because in theory there could be an infinite loop there too if the basic block is unreachable. llvm-svn: 118694	2010-11-10 18:23:01 +00:00
Dale Johannesen	24a82bdec2	When checking that the necessary bits are zero in order to reduce ((x<<30)>>24) to x<<6, check the correct bits. PR 8547. llvm-svn: 118665	2010-11-10 01:30:56 +00:00
Dan Gohman	1571dfc883	Make ModRefBehavior a lattice. Use this to clean up AliasAnalysis chaining and simplify FunctionAttrs' GetModRefBehavior logic. llvm-svn: 118660	2010-11-10 01:02:18 +00:00
Duncan Sands	3ced912bf8	Add an additional test for icmp of select folding. llvm-svn: 118441	2010-11-08 20:56:28 +00:00
Dan Gohman	6909ecf66e	Extend the AliasAnalysis::pointsToConstantMemory interface to allow it to optionally look for constant or local (alloca) memory. Teach BasicAliasAnalysis::pointsToConstantMemory to look through Select and Phi nodes, and to support looking for local memory. Remove FunctionAttrs' PointsToLocalOrConstantMemory function, now that AliasAnalysis knows all the tricks that it knew. llvm-svn: 118412	2010-11-08 16:45:26 +00:00
Dan Gohman	4e65d5ff92	Make FunctionAttrs use AliasAnalysis::getModRefBehavior, now that it knows about intrinsic functions. llvm-svn: 118410	2010-11-08 16:10:15 +00:00
Duncan Sands	0a5bcce250	Add simplification of floating point comparisons with the result of a select instruction, the same as already exists for integer comparisons. llvm-svn: 118379	2010-11-07 16:46:25 +00:00
Duncan Sands	c9c7d54930	Fix a README item: when doing a comparison with the result of a select instruction, see if doing the compare with the true and false values of the select gives the same result. If so, that can be used as the value of the comparison. llvm-svn: 118378	2010-11-07 16:12:23 +00:00
Owen Anderson	47f0efad86	When folding away a (shl (shr)) pair, we need to check that the bits that will BECOME the low bits are zero, not that the current low bits are zero. Fixes <rdar://problem/8606771>. llvm-svn: 117953	2010-11-01 21:08:20 +00:00
Duncan Sands	a7198342e7	If a function does a volatile load from a global constant, do not consider it to be readonly. In fact, don't even consider it to be readonly if it does a volatile load from an AllocaInst either (it is debatable as to whether readonly would be correct or not in this case; play safe for the moment). This fixes PR8279. llvm-svn: 117783	2010-10-30 12:59:44 +00:00
Bob Wilson	d7f24e831f	Change instcombine's getShuffleMask to represent undef with negative values. This code had previously used 2*N, where N is the mask length, to represent undef. That is not safe because the shufflevector operands may have more than N elements -- they don't have to match the result type. llvm-svn: 117721	2010-10-29 22:03:05 +00:00
Bob Wilson	996353fb5d	Make instcombine a little more aggressive in combining vector shuffles. Allow splats even if they don't match either of the original shuffles, possibly due to undef entries in the shuffle masks. Radar 8597790. Also fix some 80-column violations. llvm-svn: 117719	2010-10-29 22:02:50 +00:00
Owen Anderson	14cf6bfa0f	Update testcase since we're no longer doing the constant forwarding inline with correlated value propagation. llvm-svn: 117712	2010-10-29 21:18:23 +00:00
NAKAMURA Takumi	b89aaebdde	test/Transforms/SimplifyLibCalls/floor.ll: Mark as XFAIL:win32 due to lack of nearbyintf on MSVC. [PR8466] llvm-svn: 117529	2010-10-28 06:46:04 +00:00
Dale Johannesen	454b9243bd	Teach InstCombine not to use Add and Neg on FP. PR 8490. llvm-svn: 117510	2010-10-27 23:45:18 +00:00
Dan Gohman	96e34e87ca	Fix a case where instcombine was stripping metadata (and alignment) from stores when folding in bitcasts. llvm-svn: 117265	2010-10-25 16:16:27 +00:00
Duncan Sands	5b25503aab	Fix PR8445: a block with no predecessors may be the entry block, in which case it isn't unreachable and should not be zapped. The check for the entry block was missing in one case: a block containing an unwind instruction. While there, do some small cleanups: "M" is not a great name for a Function* (it would be more appropriate for a Module*), change it to "Fn"; use Fn in more places. llvm-svn: 117224	2010-10-24 12:23:30 +00:00
Bob Wilson	0290dbe7d4	Teach instcombine to set the alignment arguments for NEON load/store intrinsics. llvm-svn: 117154	2010-10-22 21:41:48 +00:00
Mikhail Glushenkov	0c09a4b97f	GlobalOpt: EvaluateFunction() must not evaluate stores to weak_odr globals. Fixes PR8389. llvm-svn: 116812	2010-10-19 16:47:23 +00:00
Dan Gohman	6aff5b94ff	Make BasicAliasAnalysis a normal AliasAnalysis implementation which does normal initialization and normal chaining. Change the default AliasAnalysis implementation to NoAlias. Update StandardCompileOpts.h and friends to explicitly request BasicAliasAnalysis. Update tests to explicitly request -basicaa. llvm-svn: 116720	2010-10-18 18:04:47 +00:00
Owen Anderson	4373b4516b	Generalize MemCpyOpt's handling of call slot forwarding to function properly when the call slot forwarding is implemented with a load/store pair rather than a memcpy. llvm-svn: 116637	2010-10-15 22:52:12 +00:00
Chris Lattner	27d8b68afa	fix a bug I introduced, no idea how this didn't repro right. llvm-svn: 116462	2010-10-14 00:30:00 +00:00
Chris Lattner	7c5912d186	hack to unbreak buildbots llvm-svn: 116461	2010-10-14 00:26:10 +00:00
Chris Lattner	451a0accb5	add uadd_ov/usub_ov to apint, consolidate constant folding logic to use the new APInt methods. Among other things this implements rdar://8501501 - llvm.smul.with.overflow.i32 should constant fold which comes from "clang -ftrapv", originally brought to my attention from PR8221. llvm-svn: 116457	2010-10-14 00:05:07 +00:00
Kenneth Uildriks	e9771f15f7	Now using a variant of the existing inlining heuristics to decide whether to create a given specialization of a function in PartialSpecialization. If the total performance bonus across all callsites passing the same constant exceeds the specialization cost, we create the specialization. llvm-svn: 116158	2010-10-09 22:06:36 +00:00
Devang Patel	35201e0fd6	Remove LoopIndexSplit pass. It is neither maintained nor used by anyone. llvm-svn: 116004	2010-10-07 23:29:37 +00:00
Owen Anderson	a88628cd72	Now that the profitable bits of EnableFullLoadPRE have been enabled by default, rip out the remainder. Anyone interested in more general PRE would be better served by implementing it separately, to get real anticipation calculation, etc. llvm-svn: 115337	2010-10-01 20:02:55 +00:00
Chris Lattner	bf0f375aba	fix PR8267 - Instcombine shouldn't optimize away volatile memcpy's. llvm-svn: 115296	2010-10-01 05:51:02 +00:00
Chris Lattner	c131f7d23b	upgrade this test. llvm-svn: 115295	2010-10-01 05:47:16 +00:00
Owen Anderson	5adba2c2ff	We do want to allow LoadPRE to perform LICM-like transformations: we already consider PHI nodes to be negligible for code size (making this transform code size neutral), and it allows us to hoist values out of loops, which is always a good thing. llvm-svn: 115205	2010-09-30 20:53:04 +00:00
Benjamin Kramer	2a44a539e2	Add constant folding for strspn and strcspn to SimplifyLibCalls. llvm-svn: 115116	2010-09-30 00:58:35 +00:00
Benjamin Kramer	476bfb7a10	Add strpbrk folding to SimplifyLibCalls. llvm-svn: 115111	2010-09-29 23:52:12 +00:00
Benjamin Kramer	cec2603ec2	Simplify the loop in StrChrOptimizer. FileCheckize test. llvm-svn: 115095	2010-09-29 22:29:12 +00:00
Benjamin Kramer	75a825ff6b	Teach SimplifyLibCalls how to optimize strrchr. llvm-svn: 115091	2010-09-29 21:50:51 +00:00
Owen Anderson	8e70968a13	Fix PR8247: JumpThreading can cause a block to become unreachable while still having predecessors, if it is part of a self-loop. Because of this, we cannot use the Simplify* APIs, as they can assert-fail on unreachable code. Since it's not easy to determine if a given threading will cause a block to become unreachable, simply defer simplification to later InstCombine and/or DCE passes. llvm-svn: 115082	2010-09-29 20:34:41 +00:00
Jakob Stoklund Olesen	c9755c5213	Don't try to constant fold libm functions with non-finite arguments. Usually we wouldn't do this anyway because llvm_fenv_testexcept would return an exception, but we have seen some cases where neither errno nor fenv detect an exception on arm-linux. llvm-svn: 114893	2010-09-27 21:29:20 +00:00
Owen Anderson	856fcd57d1	LoadPRE was not properly checking that the load it was PRE'ing post-dominated the block it was being hoisted to. Splitting critical edges at the merge point only addressed part of the issue; it is also possible for non-post-domination to occur when the path from the load to the merge has branches in it. Unfortunately, full anticipation analysis is time-consuming, so for now approximate it. This is strictly more conservative than real anticipation, so we will miss some cases that real PRE would allow, but we also no longer insert loads into paths where they didn't exist before. :-) This is a very slight net positive on SPEC for me (0.5% on average). Most of the benchmarks are largely unaffected, but when it pays off it pays off decently: 181.mcf improves by 4.5% on my machine. llvm-svn: 114785	2010-09-25 05:26:18 +00:00
Jakob Stoklund Olesen	eb9c5129d1	Be more precise when trying to XFAIL this tester: http://google1.osuosl.org:8011/builders/llvm-arm-linux llvm-svn: 114755	2010-09-24 20:34:49 +00:00
Dan Gohman	e96841854e	Attempt to XFAIL this test on arm-linux, which is inexplicably failing. llvm-svn: 114241	2010-09-18 00:04:37 +00:00
Dan Gohman	4487f42592	Fix this test to avoid an "inexact" fold. llvm-svn: 114202	2010-09-17 20:25:43 +00:00
Dan Gohman	0e6744d219	Fix this test so that folding doesn't depend on a potentially "inexact" result. llvm-svn: 114198	2010-09-17 20:15:53 +00:00
Dan Gohman	9dc559bdef	Fix the folding of floating-point math library calls, like sin(infinity), so that it detects errors on platforms where libm doesn't set errno. It's still subject to host libm details though. llvm-svn: 114148	2010-09-17 01:38:06 +00:00
Owen Anderson	37a6d67bd6	Add missing RUN line to this test. llvm-svn: 114106	2010-09-16 18:46:23 +00:00
Owen Anderson	6f3516065f	It is possible, under specific circumstances involving ptrtoint ConstantExpr's, for LVI to end up trying to merge a Constant into a ConstantRange. Handle this conservatively for now, rather than asserting. The testcase is more complex than I would like, but the manifestation of the problem is sensitive to iteration orders and the state of the LVI cache, and I have not been able to reproduce it with manually constructed or simplified cases. Fixes PR8162. llvm-svn: 114103	2010-09-16 18:28:33 +00:00
Owen Anderson	521e8dfef8	Fix PR8161, in which an unreachable loop causes recursive instruction simplification to try to replace an instruction with itself. Add a predicate to the simplifier to prevent this case. llvm-svn: 114097	2010-09-16 17:42:36 +00:00
Chris Lattner	8729e47b8f	fix PR8144, a bug where constant merge would merge globals marked attribute(used). llvm-svn: 113911	2010-09-15 00:30:11 +00:00
Owen Anderson	788b93febd	Remove dead option from tests. llvm-svn: 113855	2010-09-14 21:03:40 +00:00
Chris Lattner	0718ff9be2	fix PR8102, a case where we'd copyValue from a value that we already deleted. Fix this by doing the copyValue's before we delete stuff! The testcase only repros the problem on my system with valgrind. llvm-svn: 113820	2010-09-14 00:19:00 +00:00
Owen Anderson	e305c119b9	Add a reduced testcase for the infinite loop fixed in r113763. llvm-svn: 113770	2010-09-13 18:28:40 +00:00
Owen Anderson	9c34a7831d	Re-apply r113679, which was reverted in r113720, which added a pair of new instcombine transforms to expose greater opportunities for store narrowing in codegen. This patch fixes a potential infinite loop in instcombine caused by one of the introduced transforms being overly aggressive. llvm-svn: 113763	2010-09-13 17:59:27 +00:00
Eric Christopher	d4aaabfa74	Revert 113679, it was causing an infinite loop in a testcase that I've sent on to Owen. llvm-svn: 113720	2010-09-12 06:09:23 +00:00
Owen Anderson	d4ebde12ce	Invert and-of-or into or-of-and when doing so would allow us to clear bits of the and's mask. This can result in increased opportunities for store narrowing in code generation. Update a number of tests for this change. This fixes <rdar://problem/8285027>. Additionally, because this inverts the order of ors and ands, some patterns for optimizing or-of-and-of-or no longer fire in instances where they did originally. Add a simple transform which recaptures most of these opportunities: if we have an or-of-constant-or and have failed to fold away the inner or, commute the order of the two ors, to give the non-constant or a chance for simplification instead. llvm-svn: 113679	2010-09-11 05:48:06 +00:00
Benjamin Kramer	6110efbf8c	Teach InstructionSimplify to fold (A & B) & A -> A & B and (A | B) | A -> A | B. Reassociate does this but it doesn't catch all cases (e.g. if the operands are i1). llvm-svn: 113651	2010-09-10 22:39:55 +00:00
Owen Anderson	db6a08beef	Revert r113439, which relaxed the requirement that loops containing calls cannot be unrolled. After some discussion, there seems to be a better way to achieve the same effect. llvm-svn: 113528	2010-09-09 20:02:23 +00:00
Owen Anderson	956afdd1f2	Relax the "don't unroll loops containing calls" rule. Instead, when a loop contains a call, lower the unrolling threshold to the optimize-for-size threshold. Basically, for loops containing calls, unrolling can still be profitable as long as the loop is REALLY small. llvm-svn: 113439	2010-09-08 23:10:07 +00:00
Owen Anderson	c51d7d1a8d	Generalize instcombine's support for combining multiple bit checks into a single test. Patch by Dirk Steinke! llvm-svn: 113423	2010-09-08 22:16:17 +00:00
Chris Lattner	a58a97dafc	Fix a serious performance regression introduced by r108687 on linux: turning (fptrunc (sqrt (fpext x))) -> (sqrtf x) is great, but we have to delete the original sqrt as well. Not doing so causes us to do two sqrt's when building with -fmath-errno (the default on linux). llvm-svn: 113260	2010-09-07 20:01:38 +00:00
Chris Lattner	f2534b401f	rename test. llvm-svn: 113257	2010-09-07 19:57:06 +00:00
Chris Lattner	6e6a535055	fix PR8067, an over-aggressive assertion in LICM. llvm-svn: 113146	2010-09-06 05:11:24 +00:00
Chris Lattner	4100881939	Teach loop rotate to hoist trivially invariant instructions in the duplicated block instead of duplicating them. Duplicating them into the end of the loop and the preheader means that we got a phi node in the header of the loop, which prevented LICM from hoisting them. GVN would usually come around later and merge the duplicated instructions so we'd get reasonable output... except that anything dependent on the shoulda-been-hoisted value can't be hoisted. In PR5319 (which this fixes), a memory value didn't get promoted. llvm-svn: 113134	2010-09-06 01:10:22 +00:00
Chris Lattner	4dccd368f3	fix PR8063, a crash in globalopt in the malloc analysis code. llvm-svn: 113109	2010-09-05 17:20:46 +00:00
Dan Gohman	e1ad0ebbcc	Fix LoopSimplify to notify ScalarEvolution when splitting a loop backedge into an inner loop, as the new loop iteration may differ substantially. This fixes PR8078. llvm-svn: 113057	2010-09-04 02:42:48 +00:00
Chris Lattner	2b77a2a167	fix a bug in my licm rewrite when a load from the promoted memory location is being re-stored to the memory location. We would get a dangling pointer from the SSAUpdate data structure and miss a use. This fixes PR8068 llvm-svn: 113042	2010-09-04 00:12:30 +00:00
Owen Anderson	94d98b12c8	Propagate non-local comparisons. Fixes PR1757. llvm-svn: 113025	2010-09-03 22:47:08 +00:00
Owen Anderson	9161c79ffe	Add support for simplifying a load from a computed value to a load from a global when it is provable that they're equivalent. This fixes PR4855. llvm-svn: 112994	2010-09-03 19:08:37 +00:00
Owen Anderson	f700be9fb2	Add a test for PR4413, which was apparently fixed at some point in the past. llvm-svn: 112987	2010-09-03 18:33:08 +00:00
Owen Anderson	91cc1ae13c	Add PR number to test. llvm-svn: 112971	2010-09-03 16:58:25 +00:00
Chris Lattner	a1691fc6cb	more test cleanup llvm-svn: 112892	2010-09-02 22:38:56 +00:00
Chris Lattner	238f46d92e	remove some noise from tests. llvm-svn: 112889	2010-09-02 22:35:33 +00:00
Chris Lattner	1edeb00c72	fix more AST updating bugs, correcting miscompilation in PR8041 llvm-svn: 112878	2010-09-02 22:19:10 +00:00
Owen Anderson	86201cfdd9	Fix typo. I accidentally edited the wrong file before my last commit. llvm-svn: 112851	2010-09-02 19:52:06 +00:00
Owen Anderson	3206920eb2	Fix a bug in LazyValueInfo that CorrelatedValuePropagation exposed: In the LVI lattice, undef and the full set ConstantRange should not be treated as equivalent. llvm-svn: 112843	2010-09-02 18:23:58 +00:00
Duncan Sands	6b382d04f4	Print the number of uses of a function in the .ll since it can be informative and there seems to be no reason not to. llvm-svn: 112812	2010-09-02 08:52:23 +00:00
Chris Lattner	b1c861e28c	deepen my MMX/SRoA hack to avoid hurting non-x86 codegen. llvm-svn: 112763	2010-09-01 23:09:27 +00:00
Dan Gohman	d0dc80485c	Fix loop unswitching's assumption that a code path which either infinite loops or exits will eventually exit. This fixes PR5373. llvm-svn: 112745	2010-09-01 21:46:45 +00:00
Bill Wendling	988287ae88	The output of opt -stats must be sent to stderr. Patch by NAKAMURA Takumi! llvm-svn: 112724	2010-09-01 18:32:56 +00:00
Chris Lattner	9759e898f0	add a gross hack to work around a problem that Argiris reported on llvmdev: SRoA is introducing MMX datatypes like <1 x i64>, which then cause random problems because the X86 backend is producing mmx stuff without inserting proper emms calls. In the short term, force off MMX datatypes. In the long term, the X86 backend should not select generic vector types to MMX registers. This is being worked on, but won't be done in time for 2.8. rdar://8380055 llvm-svn: 112696	2010-09-01 05:14:33 +00:00
Chris Lattner	c7ae149253	filecheckize llvm-svn: 112695	2010-09-01 05:10:14 +00:00
Chris Lattner	5ed5e29575	licm is wasting time hoisting constant foldable operations, instead of hoisting them, just fold them away. This occurs in the testcase for PR8041, for example. llvm-svn: 112669	2010-08-31 23:00:16 +00:00
Owen Anderson	c9c199c531	Merge 2010-08-31-InfiniteRecursion.ll into crash.ll. llvm-svn: 112635	2010-08-31 20:27:17 +00:00
Owen Anderson	3ab91d56b4	Add a test for the duplicated-conditional situation illustrated by PR5652. llvm-svn: 112621	2010-08-31 18:49:12 +00:00
Chris Lattner	98902c15fa	merge two tests. llvm-svn: 112617	2010-08-31 18:44:03 +00:00
Owen Anderson	43ac4da8d1	Manually reduce this testcase. llvm-svn: 112615	2010-08-31 18:16:29 +00:00
Chris Lattner	8535204036	merge two tests and convert to filecheck. llvm-svn: 112613	2010-08-31 18:05:08 +00:00
Owen Anderson	e4af4b10f1	Add a micro-test for the transforms I added to JumpThreading. I have not been able to find a way to test each in isolation, for a few reasons: 1) The ability to look-through non-i1 BinaryOperator's requires the ability to look through non-constant ICmps in order for it to ever trigger. 2) The ability to do LVI-powered PHI value determination only matters in cases that ProcessBranchOnPHI can't handle. Since it already handles all the cases without other instructions in the def-use chain between the PHI and the branch, it requires the ability to look through ICmps and/or BinaryOperators as well. llvm-svn: 112611	2010-08-31 17:59:07 +00:00
Owen Anderson	e930c65b2c	Rename test directory to reflect new pass name. llvm-svn: 112592	2010-08-31 07:50:31 +00:00
Owen Anderson	ccaee65189	Rename ValuePropagation to a more descriptive CorrelatedValuePropagation. llvm-svn: 112591	2010-08-31 07:48:34 +00:00
Owen Anderson	ba28fe3dcb	More Chris-inspired JumpThreading fixes: use ConstantExpr to correctly constant-fold undef, and be more careful with its return value. This actually exposed an infinite recursion bug in ComputeValueKnownInPredecessors which theoretically already existed (in JumpThreading's handling of and/or of i1's), but never manifested before. This patch adds a tracking set to prevent this case. llvm-svn: 112589	2010-08-31 07:36:34 +00:00
Owen Anderson	bd9edea8a3	Remove r111665, which implemented store-narrowing in InstCombine. Chris discovered a miscompilation in it, and it's not easily fixable at the optimizer level. I'll investigate reimplementing it in DAGCombine. llvm-svn: 112575	2010-08-31 04:41:06 +00:00
Owen Anderson	18110f0db4	Combine these two tests, and make sure there's a newline at the end of the file. llvm-svn: 112554	2010-08-30 23:37:41 +00:00
Duncan Sands	254f8ff0a6	Correct bogus module triple specifications. llvm-svn: 112469	2010-08-30 10:48:29 +00:00
Chris Lattner	51639dea34	LICM does get dead instructions input to it. Instead of sinking them out of loops, just delete them. llvm-svn: 112451	2010-08-29 18:22:25 +00:00
Chris Lattner	4b49ada02c	remove the ABCD and SSI passes. They don't have any clients that I'm aware of, aren't maintained, and LVI will be replacing their value. nlewycky approved this on irc. llvm-svn: 112355	2010-08-28 03:51:24 +00:00
Chris Lattner	b61cf1e296	handle the constant case of vector insertion. For something like this: struct S { float A, B, C, D; }; struct S g; struct S bar() { struct S A = g; ++A.B; A.A = 42; return A; } we now generate: _bar: ## @bar ## BB#0: ## %entry movq _g@GOTPCREL(%rip), %rax movss 12(%rax), %xmm0 pshufd $16, %xmm0, %xmm0 movss 4(%rax), %xmm2 movss 8(%rax), %xmm1 pshufd $16, %xmm1, %xmm1 unpcklps %xmm0, %xmm1 addss LCPI1_0(%rip), %xmm2 pshufd $16, %xmm2, %xmm2 movss LCPI1_1(%rip), %xmm0 pshufd $16, %xmm0, %xmm0 unpcklps %xmm2, %xmm0 ret instead of: _bar: ## @bar ## BB#0: ## %entry movq _g@GOTPCREL(%rip), %rax movss 12(%rax), %xmm0 pshufd $16, %xmm0, %xmm0 movss 4(%rax), %xmm2 movss 8(%rax), %xmm1 pshufd $16, %xmm1, %xmm1 unpcklps %xmm0, %xmm1 addss LCPI1_0(%rip), %xmm2 movd %xmm2, %eax shlq $32, %rax addq $1109917696, %rax ## imm = 0x42280000 movd %rax, %xmm0 ret llvm-svn: 112345	2010-08-28 01:50:57 +00:00
Chris Lattner	c70b0c0ee7	optimize bitcasts from large integers to vector into vector element insertion from the pieces that feed into the vector. This handles a pattern that occurs frequently due to code generated for the x86-64 abi. We now compile something like this: struct S { float A, B, C, D; }; struct S g; struct S bar() { struct S A = g; ++A.A; ++A.C; return A; } into all nice vector operations: _bar: ## @bar ## BB#0: ## %entry movq _g@GOTPCREL(%rip), %rax movss LCPI1_0(%rip), %xmm1 movss (%rax), %xmm0 addss %xmm1, %xmm0 pshufd $16, %xmm0, %xmm0 movss 4(%rax), %xmm2 movss 12(%rax), %xmm3 pshufd $16, %xmm2, %xmm2 unpcklps %xmm2, %xmm0 addss 8(%rax), %xmm1 pshufd $16, %xmm1, %xmm1 pshufd $16, %xmm3, %xmm2 unpcklps %xmm2, %xmm1 ret instead of icky integer operations: _bar: ## @bar movq _g@GOTPCREL(%rip), %rax movss LCPI1_0(%rip), %xmm1 movss (%rax), %xmm0 addss %xmm1, %xmm0 movd %xmm0, %ecx movl 4(%rax), %edx movl 12(%rax), %esi shlq $32, %rdx addq %rcx, %rdx movd %rdx, %xmm0 addss 8(%rax), %xmm1 movd %xmm1, %eax shlq $32, %rsi addq %rax, %rsi movd %rsi, %xmm1 ret This resolves rdar://8360454 llvm-svn: 112343	2010-08-28 01:20:38 +00:00
Owen Anderson	dc4703bcd5	Add a prototype of a new peephole optimizing pass that uses LazyValue info to simplify PHIs and select's. This pass addresses the missed optimizations from PR2581 and PR4420. llvm-svn: 112325	2010-08-27 23:31:36 +00:00
Chris Lattner	08d2f26030	tidy up test. llvm-svn: 112321	2010-08-27 23:15:21 +00:00
Chris Lattner	3f880c2097	Enhance the shift propagator to handle the case when you have: A = shl x, 42 ... B = lshr ..., 38 which can be transformed into: A = shl x, 4 ... iff we can prove that the would-be-shifted-in bits are already zero. This eliminates two shifts in the testcase and allows elimination of the whole i128 chain in the real example. llvm-svn: 112314	2010-08-27 22:53:44 +00:00
Chris Lattner	80632e5fd9	Implement a pretty general logical shift propagation framework, which is good at ripping through bitfield operations. This generalizes a bunch of the existing xforms that instcombine does, such as (x << c) >> c -> and to handle intermediate logical nodes. This is useful for ripping up the "promote to large integer" code produced by SRoA. llvm-svn: 112304	2010-08-27 22:24:38 +00:00
Chris Lattner	1a15c898b9	merge and filecheckize test llvm-svn: 112289	2010-08-27 20:44:45 +00:00
Chris Lattner	a571568019	merge two tests llvm-svn: 112288	2010-08-27 20:42:10 +00:00
Chris Lattner	866b888095	teach the truncation optimization that an entire chain of computation can be truncated if it is fed by a sext/zext that doesn't have to be exactly equal to the truncation result type. llvm-svn: 112285	2010-08-27 20:32:06 +00:00
Chris Lattner	69a9143584	Add an instcombine to clean up a common pattern produced by the SRoA "promote to large integer" code, eliminating some type conversions like this: %94 = zext i16 %93 to i32 ; <i32> [#uses=2] %96 = lshr i32 %94, 8 ; <i32> [#uses=1] %101 = trunc i32 %96 to i8 ; <i8> [#uses=1] This also unblocks other xforms from happening, now clang is able to compile: struct S { float A, B, C, D; }; float foo(struct S A) { return A.A + A.B+A.C+A.D; } into: _foo: ## @foo ## BB#0: ## %entry pshufd $1, %xmm0, %xmm2 addss %xmm0, %xmm2 movdqa %xmm1, %xmm3 addss %xmm2, %xmm3 pshufd $1, %xmm1, %xmm0 addss %xmm3, %xmm0 ret on x86-64, instead of: _foo: ## @foo ## BB#0: ## %entry movd %xmm0, %rax shrq $32, %rax movd %eax, %xmm2 addss %xmm0, %xmm2 movapd %xmm1, %xmm3 addss %xmm2, %xmm3 movd %xmm1, %rax shrq $32, %rax movd %eax, %xmm0 addss %xmm3, %xmm0 ret This seems pretty close to optimal to me, at least without using horizontal adds. This also triggers in lots of other code, including SPEC. llvm-svn: 112278	2010-08-27 18:31:05 +00:00
Owen Anderson	35ff7a208e	Use LVI to eliminate conditional branches where we've tested a related condition previously. Update tests for this change. This fixes PR5652. llvm-svn: 112270	2010-08-27 17:12:29 +00:00
Chris Lattner	e9dafffae3	filecheckize llvm-svn: 112235	2010-08-26 22:23:39 +00:00
Chris Lattner	1efc631212	rename test. llvm-svn: 112234	2010-08-26 22:20:47 +00:00
Chris Lattner	d5d68438c1	optimize "integer extraction out of the middle of a vector" as produced by SRoA. This is part of rdar://7892780, but needs another xform to expose this. llvm-svn: 112232	2010-08-26 22:14:59 +00:00
Chris Lattner	19a5dc488b	optimize bitcast(trunc(bitcast(x))) where the result is a float and 'x' is a vector to be a vector element extraction. This allows clang to compile: struct S { float A, B, C, D; }; float foo(struct S A) { return A.A + A.B+A.C+A.D; } into: _foo: ## @foo ## BB#0: ## %entry movd %xmm0, %rax shrq $32, %rax movd %eax, %xmm2 addss %xmm0, %xmm2 movapd %xmm1, %xmm3 addss %xmm2, %xmm3 movd %xmm1, %rax shrq $32, %rax movd %eax, %xmm0 addss %xmm3, %xmm0 ret instead of: _foo: ## @foo ## BB#0: ## %entry movd %xmm0, %rax movd %eax, %xmm0 shrq $32, %rax movd %eax, %xmm2 addss %xmm0, %xmm2 movd %xmm1, %rax movd %eax, %xmm1 addss %xmm2, %xmm1 shrq $32, %rax movd %eax, %xmm0 addss %xmm1, %xmm0 ret ... eliminating half of the horribleness. llvm-svn: 112227	2010-08-26 21:55:42 +00:00
Chris Lattner	d1a8743984	filecheckize llvm-svn: 112225	2010-08-26 21:51:41 +00:00
Chris Lattner	3113ee607c	rename test llvm-svn: 112224	2010-08-26 21:50:56 +00:00
Owen Anderson	77fcf53657	Make JumpThreading smart enough to properly thread StrSwitch when it's compiled with clang++. llvm-svn: 112198	2010-08-26 17:40:24 +00:00
Devang Patel	05becf3ac5	DIGlobalVariable can be used to encode debug info for globals that are directly folded into a constant by FE. llvm-svn: 112072	2010-08-25 18:52:02 +00:00
Owen Anderson	e0cdfa265a	In the default address space, any GEP off of null results in a trap value if you try to load it. Thus, any load in the default address space that completes implies that the base value that it GEP'd from was not null. llvm-svn: 112015	2010-08-25 01:16:47 +00:00
Owen Anderson	678fd04aa5	Re-apply r111568 with a fix for the clang self-host. llvm-svn: 111665	2010-08-20 18:24:43 +00:00
Owen Anderson	7c1b4fbd3b	Previous revert failed to remove this file. llvm-svn: 111582	2010-08-19 23:45:15 +00:00
Owen Anderson	0e57acb623	Revert r111568 to unbreak clang self-host. llvm-svn: 111571	2010-08-19 23:25:16 +00:00
Owen Anderson	7f2852ba2d	When a set of bitmask operations, typically from a bitfield initialization, only modifies the low bytes of a value, we can narrow the store to only over-write the affected bytes. llvm-svn: 111568	2010-08-19 22:15:40 +00:00
Kenneth Uildriks	69cdd103c0	Fixed and reactivated a partial specialization test llvm-svn: 111516	2010-08-19 12:42:38 +00:00
Chris Lattner	ab876b6ce8	Fix PR7755: knowing something about an inval for a pred from the LHS should disable reconsidering that pred on the RHS. However, knowing something about the pred on the RHS shouldn't disable subsequent additions on the RHS from happening. llvm-svn: 111349	2010-08-18 03:14:36 +00:00
Eric Christopher	08e9f0250a	Temporarily revert r110987 as it's causing some miscompares in vector heavy code. I'll re-enable when we've tracked down the problem. llvm-svn: 111318	2010-08-17 22:55:27 +00:00
Dan Gohman	e26025ddd0	When rotating loops, put the original header at the bottom of the loop, making the resulting loop significantly less ugly. Also, zap its trivial PHI nodes, since it's easy. llvm-svn: 111255	2010-08-17 17:39:21 +00:00
Dan Gohman	9178d0792f	Instead, teach SimplifyCFG to trim non-address-taken blocks from indirectbr destination lists. llvm-svn: 111122	2010-08-16 14:41:14 +00:00
Dan Gohman	afb3db46d2	LoopSimplify shouldn't split loop backedges that use indirectbr. PR7867. llvm-svn: 111061	2010-08-14 00:43:09 +00:00
Dan Gohman	d04a608a73	Teach SimplifyCFG how to simplify indirectbr instructions. - Eliminate redundant successors. - Convert an indirectbr with one successor into a direct branch. Also, generalize SimplifyCFG to be able to be run on a function entry block. It knows quite a few simplifications which are applicable to the entry block, and it only needs a few checks to avoid trouble with the entry block. llvm-svn: 111060	2010-08-14 00:29:42 +00:00
Nate Begeman	e57074fc48	Reapply this transformation now that it is passing the external test which it previously failed. llvm-svn: 110987	2010-08-13 00:17:53 +00:00
Chris Lattner	fd40059e71	fix PR7876: If ipsccp decides that a function's address is taken before it rewrites the code, we need to use that in the post-rewrite pass. llvm-svn: 110962	2010-08-12 22:25:23 +00:00
Eric Christopher	34acdf57df	Temporarily revert 110737 and 110734, they were causing failures in an external testsuite. llvm-svn: 110905	2010-08-12 07:01:22 +00:00
Nate Begeman	36e284c2be	Add test for recent instcombine vector shuffle enhancement llvm-svn: 110737	2010-08-10 21:58:00 +00:00
Eli Friedman	7197d66ff1	PR7853: fix a silly mistake introduced in r101899, and add a test to make sure it doesn't regress again. llvm-svn: 110597	2010-08-09 20:49:43 +00:00
Dan Gohman	d108d2b2f8	Move x86-specific tests out of test/Transforms/LoopStrengthReduce and into test/CodeGen/X86, so that they aren't run when the x86 target is not enabled. Fix uglygep.ll to not be x86-specific. llvm-svn: 110343	2010-08-05 17:04:15 +00:00
Dan Gohman	a80f89dbc7	Make instcombine set explicit alignments on load or store instructions with alignment 0, so that subsequent passes don't need to bother checking the TargetData ABI size manually. llvm-svn: 110128	2010-08-03 18:20:32 +00:00
Peter Collingbourne	10c4f9d6bd	Add an atomic lowering pass llvm-svn: 110113	2010-08-03 16:19:16 +00:00
Owen Anderson	e957c57ebb	Re-apply the infamous r108614, with a fix pointed out by Dirk Steinke. llvm-svn: 110036	2010-08-02 09:32:13 +00:00
Daniel Dunbar	f2be238c99	Speculatively revert r108614, "Another attempt at getting the clang self-host to like my instcombine patch.", in an attempt to fix Clang i386 bootstrap. - Also PR7719. llvm-svn: 109953	2010-07-31 19:51:11 +00:00
Owen Anderson	647ac93b7d	Fix a test with malformed IR. Not sure why this didn't fail before. llvm-svn: 109422	2010-07-26 18:44:56 +00:00
Dan Gohman	9e0ae022d2	Fix SCEVExpander::visitAddRecExpr so that it remembers the induction variable it inserted rather than using LoopInfo::getCanonicalInductionVariable to rediscover it, since that doesn't work on non-canonical loops. This fixes infinite recursion on such loops; PR7562. llvm-svn: 109419	2010-07-26 18:28:14 +00:00
Dan Gohman	48bddf693c	Avoid depending on LCSSA implicitly pulling in LoopSimplify. llvm-svn: 109410	2010-07-26 18:00:43 +00:00
Owen Anderson	f66e1873ea	Testcase for r108687. llvm-svn: 108689	2010-07-19 08:14:26 +00:00
Owen Anderson	c8dc055b5e	Another attempt at getting the clang self-host to like my instcombine patch. llvm-svn: 108614	2010-07-17 06:56:35 +00:00
Nick Lewycky	1b4a83430b	Arrays and vectors with different numbers of elements are not equivalent. llvm-svn: 108517	2010-07-16 06:31:12 +00:00
Tobias Grosser	9c86be4570	LoopSimplify does not update domfrontier correctly. This fixes PR7649. llvm-svn: 108513	2010-07-16 05:59:45 +00:00
Eric Christopher	5eef314caf	Also revert 108422, it's causing some test failures. Working on testcases for Owen. llvm-svn: 108494	2010-07-16 01:36:12 +00:00
Dan Gohman	39c67ace89	Fix this test. llvm-svn: 108491	2010-07-16 01:28:45 +00:00
Dan Gohman	1705ac2740	Fix the order that SCEVExpander considers add operands in so that it doesn't miss an opportunity to form a GEP, regardless of the relative loop depths of the operands. This fixes rdar://8197217. llvm-svn: 108475	2010-07-15 23:38:13 +00:00
Owen Anderson	01a2992a91	Reapply r108378, with bugfixes, testcase, and improved comment formatting. This now passes LIT, nightly test, and llvm-gcc bootstrap on my machine. llvm-svn: 108422	2010-07-15 15:00:23 +00:00
Chris Lattner	2b2265a9c5	Fix PR7647, handling the case when 'To' ends up being mutated by recursive simplification. This also enhances ReplaceAndSimplifyAllUses to actually do a real RAUW at the end of it, which updates any value handles pointing to "From" to start pointing to "To". This seems useful for debug info and random other VH users. llvm-svn: 108415	2010-07-15 06:36:08 +00:00
Chris Lattner	38e6ecd9f1	revert r108320, I see the failures now... llvm-svn: 108322	2010-07-14 06:16:35 +00:00
Chris Lattner	5822d6d579	reapply benjamin's instcombine patch, I don't see anything wrong with it and can't repro any problems with a manual self-host. llvm-svn: 108320	2010-07-14 05:59:13 +00:00
Duncan Sands	8864383748	Handle the case of a tail recursion in which the tail call is followed by a return that returns a constant, while elsewhere in the function another return instruction returns a different constant. This is a special case of accumulator recursion, so just generalize the existing logic a bit. llvm-svn: 108241	2010-07-13 15:41:41 +00:00
Benjamin Kramer	cf8ad46899	Nope, still breaks the release selfhost bots :( llvm-svn: 108153	2010-07-12 16:38:48 +00:00
Benjamin Kramer	e391789246	Reapply the "or" half of r108136, which seems to be less problematic. llvm-svn: 108152	2010-07-12 16:15:48 +00:00
Benjamin Kramer	98c95e7743	Revert r108141 again, sigh. llvm-svn: 108148	2010-07-12 14:42:04 +00:00

2213 Commits