llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-01 16:33:37 +01:00

Author	SHA1	Message	Date
Chris Lattner	41b6b286a3	enhance isRemovable to refuse to delete volatile mem transfers now that DSE hacks on them. This fixes a regression I introduced, by generalizing DSE to hack on transfers. llvm-svn: 120445	2010-11-30 19:12:10 +00:00
Chris Lattner	7d444d0682	Rewrite the main DSE loop to be written in terms of reasoning about pairs of AA::Location's instead of looking for MemDep's "Def" predicate. This is more powerful and general, handling memset/memcpy/store all uniformly, and implementing PR8701 and probably obsoleting parts of memcpyoptimizer. This also fixes an obscure bug with init.trampoline and i8 stores, but I'm not surprised it hasn't been hit yet. Enhancing init.trampoline to carry the size that it stores would allow DSE to be much more aggressive about optimizing them. llvm-svn: 120406	2010-11-30 07:23:21 +00:00
Anders Carlsson	67e9e6234c	Add a puts optimization that converts puts() to putchar('\n'). llvm-svn: 120398	2010-11-30 06:19:18 +00:00
Anders Carlsson	2a46a03898	Fix a typo. llvm-svn: 120394	2010-11-30 06:03:55 +00:00
Anders Carlsson	a2ad88fb73	Rename this test to FPuts.ll since it actually tests fputs. llvm-svn: 120393	2010-11-30 05:59:26 +00:00
Chris Lattner	bea813875e	remove a use of llvm-dis llvm-svn: 120383	2010-11-30 02:04:15 +00:00
Chris Lattner	56b0cc6974	merge one more away llvm-svn: 120375	2010-11-30 01:06:43 +00:00
Chris Lattner	8e2909e4d8	I already merged partial-overwrite.ll -> PartialStore.ll Merge context-sensitive.ll -> simple.ll and upgrade it. llvm-svn: 120374	2010-11-30 01:05:07 +00:00
Chris Lattner	496eacefab	clean up DSE tests, removing some poorly reduced and useless old test, merging more into other larger .ll files, filecheckizing along the way. llvm-svn: 120373	2010-11-30 01:00:34 +00:00
Chris Lattner	083731f3d6	enhance basicaa to return "Mod" for a memcpy call when the queried location doesn't overlap the source, and add a testcase. llvm-svn: 120370	2010-11-30 00:43:16 +00:00
Chris Lattner	8ec1830a01	Teach basicaa that memset's modref set is at worst "mod" and never contains "ref". Enhance DSE to use a modref query instead of a store-specific hack to generalize the "ignore may-alias stores" optimization to handle memset and memcpy. llvm-svn: 120368	2010-11-30 00:28:45 +00:00
Chris Lattner	975dcf5ac8	my previous patch would cause us to start deleting some volatile stores, fix and add a testcase. llvm-svn: 120363	2010-11-30 00:12:39 +00:00
Benjamin Kramer	84bf47f2d8	Fix some broken CHECK lines. llvm-svn: 120332	2010-11-29 22:34:55 +00:00
Chris Lattner	2f8a9e1eac	fix PR8677, patch by Jakub Staszak! llvm-svn: 120325	2010-11-29 21:59:31 +00:00
Frits van Bommel	a59a8cf49f	Transform (extractvalue (load P), ...) to (load (gep P, 0, ...)) if the load has no other uses, shrinking the load. llvm-svn: 120323	2010-11-29 21:56:20 +00:00
Frits van Bommel	77f6750c83	Update this test to keep testing the -instcombine transform it's supposed to be testing instead of triggering the improved constant folding for insertvalue and extractvalue. llvm-svn: 120319	2010-11-29 20:55:40 +00:00
Frits van Bommel	6610b43890	Teach ConstantFoldInstruction() how to fold insertvalue and extractvalue. llvm-svn: 120316	2010-11-29 20:36:52 +00:00
Nick Lewycky	f6fa6b29f4	Treat a call of function pointer like a load of the pointer when considering whether the pointer can be replaced with the global variable it is a copy of. Fixes PR8680. llvm-svn: 120126	2010-11-24 22:04:20 +00:00
Benjamin Kramer	8d7096e8ca	The srem -> urem transform is not safe for any divisor that's not a power of two. E.g. -5 % 5 is 0 with srem and 1 with urem. Also addresses Frits van Bommel's comments. llvm-svn: 120049	2010-11-23 20:33:57 +00:00
Benjamin Kramer	c8e6037e7d	InstCombine: Reduce "X shift (A srem B)" to "X shift (A urem B)" iff B is positive. This allows to transform the rem in "1 << ((int)x % 8);" to an and. llvm-svn: 120028	2010-11-23 18:52:42 +00:00
Duncan Sands	555525adf4	Exploit distributive laws (eg: And distributes over Or, Mul over Add, etc) in a fairly systematic way in instcombine. Some of these cases were already dealt with, in which case I removed the existing code. The case of Add has a bunch of funky logic which covers some of this plus a few variants (considers shifts to be a form of multiplication), which I didn't touch. The simplification performed is: AB+AC -> A(B+C). The improvement is to do this in cases that were not already handled [such as AB-AC -> A(B-C), which was reported on the mailing list], and also to do it more often by not checking for "only one use" if "B+C" simplifies. llvm-svn: 120024	2010-11-23 14:23:47 +00:00
Chris Lattner	41281bd30f	duncan's spider sense was right, I completely reversed the condition on this instcombine xform. This fixes a miscompilation of 403.gcc. llvm-svn: 119988	2010-11-23 02:42:04 +00:00
Benjamin Kramer	b5a2a81094	InstCombine: Implement X - A-B -> X + AB. llvm-svn: 119984	2010-11-22 20:31:27 +00:00
Duncan Sands	73f0559779	If a GEP index simply advances by multiples of a type of zero size, then replace the index with zero. llvm-svn: 119974	2010-11-22 16:32:50 +00:00
Duncan Sands	c2b128ad7d	Add a rather pointless InstructionSimplify transform, inspired by recent constant folding improvements: if P points to a type of size zero, turn "gep P, N" into "P". More generally, if a gep index type has size zero, instcombine could replace the index with zero, but that is not done here. llvm-svn: 119942	2010-11-21 13:53:09 +00:00
Chris Lattner	3a0edfb37c	implement PR8576, deleting dead stores with intervening may-alias stores. llvm-svn: 119927	2010-11-21 07:34:32 +00:00
Chris Lattner	32a16bce7a	file checkize llvm-svn: 119926	2010-11-21 07:32:40 +00:00
Chris Lattner	908a01328c	optimize: void a(int x) { if (((1<<x)&8)==0) b(); } into "x != 3", which occurs over 100 times in 403.gcc but in no other program in llvm-test. llvm-svn: 119922	2010-11-21 06:44:42 +00:00
Chris Lattner	ba1cc33676	Implement PR8644: forwarding a memcpy value to a byval, allowing the memcpy to be eliminated. Unfortunately, the requirements on byval's without explicit alignment are really weak and impossible to predict in the mid-level optimizer, so this doesn't kick in much with current frontends. The fix is to change clang to set alignment on all byval arguments. llvm-svn: 119916	2010-11-21 00:28:59 +00:00
Owen Anderson	5ee547b9d5	Add a test for CodeGenPrepare's ability to look through PHI nodes when performing addressing mode folding, introduced in r119853. llvm-svn: 119857	2010-11-19 22:34:53 +00:00
Duncan Sands	4562d3b919	Factor code for testing whether replacing one value with another preserves LCSSA form out of ScalarEvolution and into the LoopInfo class. Use it to check that SimplifyInstruction simplifications are not breaking LCSSA form. Fixes PR8622. llvm-svn: 119727	2010-11-18 19:59:41 +00:00
Owen Anderson	c2db966e5e	Completely rework the datastructure GVN uses to represent the value number to leader mapping. Previously, this was a tree of hashtables, and a query recursed into the table for the immediate dominator ad infinitum if the initial lookup failed. This led to really bad performance on tall, narrow CFGs. We can instead replace it with what is conceptually a multimap of value numbers to leaders (actually represented by a hashtable with a list of Value*'s as the value type), and then determine which leader from that set to use very cheaply thanks to the DFS numberings maintained by DominatorTree. Because there are typically few duplicates of a given value, this scan tends to be quite fast. Additionally, we use a custom linked list and BumpPtr allocation to avoid any unnecessary allocation in representing the value-side of the multimap. This change brings with it a 15% (!) improvement in the total running time of GVN on 403.gcc, which I think is pretty good considering that includes all the "real work" being done by MemDep as well. The one downside to this approach is that we can no longer use GVN to perform simple conditional progation, but that seems like an acceptable loss since we now have LVI and CorrelatedValuePropagation to pick up the slack. If you see conditional propagation that's not happening, please file bugs against LVI or CVP. llvm-svn: 119714	2010-11-18 18:32:40 +00:00
Dan Gohman	ec75e876ab	Add support for PHI-translating sext, zext, and trunc instructions, enabling more PRE. PR8586. llvm-svn: 119704	2010-11-18 17:05:13 +00:00
Chris Lattner	c752718881	remove a pointless restriction from memcpyopt. It was refusing to optimize two memcpy's like this: copy A <- B copy C <- A if it couldn't prove that noalias(B,C). We can eliminate the copy by producing a memmove instead of memcpy. llvm-svn: 119694	2010-11-18 08:00:57 +00:00
Chris Lattner	1000d06bee	filecheckize, this is still not optimal, see PR8643 llvm-svn: 119693	2010-11-18 07:49:32 +00:00
Chris Lattner	6048697a30	allow eliminating an alloca that is just copied from an constant global if it is passed as a byval argument. The byval argument will just be a read, so it is safe to read from the original global instead. This allows us to promote away the %agg.tmp alloca in PR8582 llvm-svn: 119686	2010-11-18 06:41:51 +00:00
Chris Lattner	791e914b1b	enhance the "alloca is just a memcpy from constant global" to ignore calls that obviously can't modify the alloca because they are readonly/readnone. llvm-svn: 119683	2010-11-18 06:26:49 +00:00
Chris Lattner	44ccd4643d	fix a small oversight in the "eliminate memcpy from constant global" optimization. If the alloca that is "memcpy'd from constant" also has a memcpy from it, ignore it: it is a load. We now optimize the testcase to: define void @test2() { %B = alloca %T %a = bitcast %T* @G to i8* %b = bitcast %T* %B to i8* call void @llvm.memcpy.p0i8.p0i8.i64(i8* %b, i8* %a, i64 124, i32 4, i1 false) call void @bar(i8* %b) ret void } previously we would generate: define void @test() { %B = alloca %T %b = bitcast %T* %B to i8* %G.0 = getelementptr inbounds %T* @G, i32 0, i32 0 %tmp3 = load i8* %G.0, align 4 %G.1 = getelementptr inbounds %T* @G, i32 0, i32 1 %G.15 = bitcast [123 x i8]* %G.1 to i8* %1 = bitcast [123 x i8]* %G.1 to i984* %srcval = load i984* %1, align 1 %B.0 = getelementptr inbounds %T* %B, i32 0, i32 0 store i8 %tmp3, i8* %B.0, align 4 %B.1 = getelementptr inbounds %T* %B, i32 0, i32 1 %B.12 = bitcast [123 x i8]* %B.1 to i8* %2 = bitcast [123 x i8]* %B.1 to i984* store i984 %srcval, i984* %2, align 1 call void @bar(i8* %b) ret void } llvm-svn: 119682	2010-11-18 06:20:47 +00:00
Chris Lattner	c1e63bb987	filecheckize llvm-svn: 119681	2010-11-18 06:16:43 +00:00
Benjamin Kramer	1b330efb46	InstCombine: Add a missing irem identity (X % X -> 0). llvm-svn: 119538	2010-11-17 19:11:46 +00:00
Duncan Sands	825c7d7f79	In which I discover the existence of loops. Threading an operation over a phi node by applying it to each operand may be wrong if the operation and the phi node are mutually interdependent (the testcase has a simple example of this). So only do this transform if it would be correct to perform the operation in each predecessor of the block containing the phi, i.e. if the other operands all dominate the phi. This should fix the FFMPEG snow.c regression reported by İsmail Dönmez. llvm-svn: 119347	2010-11-16 12:16:38 +00:00
Duncan Sands	ccdccb2776	Teach InstructionSimplify the trick of skipping incoming phi values that are equal to the phi itself. llvm-svn: 119161	2010-11-15 17:52:45 +00:00
Duncan Sands	658bdcf094	Move PHI tests to phi.ll, out of select.ll. llvm-svn: 119153	2010-11-15 16:43:28 +00:00
Duncan Sands	617030ad18	Teach InstructionSimplify about phi nodes. I chose to have it simply offload the work to hasConstantValue rather than do something more complicated (such handling mutually recursive phis) because (1) it is not clear it is worth it; and (2) if it is worth it, maybe such logic would be better placed in hasConstantValue. Adjust some GVN tests which are now cleaned up much further (eg: all phi nodes are removed). llvm-svn: 119043	2010-11-14 13:30:18 +00:00
Chris Lattner	7ade54ed02	rename test. llvm-svn: 119033	2010-11-14 07:03:49 +00:00
Chris Lattner	9b45357c09	filecheckize, remove an old and useless test llvm-svn: 119032	2010-11-14 07:03:38 +00:00
Chris Lattner	449c3bd7b5	this test is pretty pointless and "propogation" isn't a word (or so Misha claims). llvm-svn: 119031	2010-11-14 07:02:03 +00:00
Duncan Sands	47dddbe925	Testcase to go along with commit 118923 ("Have GVN simplify instructions as it goes"). Before -std-compile-opts only got it down to %a = tail call i32 @foo(i32 0) readnone %x = tail call i32 @foo(i32 %a) readnone %y = tail call i32 @foo(i32 %a) readnone %z = icmp eq i32 %x, %y ret i1 %z while now -basicaa -gvn alone reduce it to %a = call i32 @foo(i32 0) readnone %x = call i32 @foo(i32 %a) readnone ret i1 true llvm-svn: 119009	2010-11-13 21:33:19 +00:00
Duncan Sands	88fc6cd7fe	Generalize the reassociation transform in SimplifyCommutative (now renamed to SimplifyAssociativeOrCommutative) "(A op C1) op C2" -> "A op (C1 op C2)", which previously was only done if C1 and C2 were constants, to occur whenever "C1 op C2" simplifies (a la InstructionSimplify). Since the simplifying operand combination can no longer be assumed to be the right-hand terms, consider all of the possible permutations. When compiling "gcc as one big file", transform 2 (i.e. using right-hand operands) fires about 4000 times but it has to be said that most of the time the simplifying operands are both constants. Transforms 3, 4 and 5 each fired once. Transform 6, which is an existing transform that I didn't change, never fired. With this change, the testcase is now optimized perfectly with one run of instcombine (previously it required instcombine + reassociate + instcombine, and it may just have been luck that this worked). llvm-svn: 119002	2010-11-13 15:10:37 +00:00
Dan Gohman	bc5a716f10	Enhance DSE to handle the case where a free call makes more than one store dead. This is especially noticeable in SingleSource/Benchmarks/Shootout/objinst. llvm-svn: 118875	2010-11-12 02:19:17 +00:00

1 2 3 4 5 ...

1968 Commits