1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-28 06:22:51 +01:00
Commit Graph

1967 Commits

Author SHA1 Message Date
Chris Lattner
7d444d0682 Rewrite the main DSE loop to be written in terms of reasoning
about pairs of AA::Location's instead of looking for MemDep's
"Def" predicate.  This is more powerful and general, handling
memset/memcpy/store all uniformly, and implementing PR8701 and
probably obsoleting parts of memcpyoptimizer.

This also fixes an obscure bug with init.trampoline and i8
stores, but I'm not surprised it hasn't been hit yet.  Enhancing
init.trampoline to carry the size that it stores would allow
DSE to be much more aggressive about optimizing them.

llvm-svn: 120406
2010-11-30 07:23:21 +00:00
Anders Carlsson
67e9e6234c Add a puts optimization that converts puts() to putchar('\n').
llvm-svn: 120398
2010-11-30 06:19:18 +00:00
Anders Carlsson
2a46a03898 Fix a typo.
llvm-svn: 120394
2010-11-30 06:03:55 +00:00
Anders Carlsson
a2ad88fb73 Rename this test to FPuts.ll since it actually tests fputs.
llvm-svn: 120393
2010-11-30 05:59:26 +00:00
Chris Lattner
bea813875e remove a use of llvm-dis
llvm-svn: 120383
2010-11-30 02:04:15 +00:00
Chris Lattner
56b0cc6974 merge one more away
llvm-svn: 120375
2010-11-30 01:06:43 +00:00
Chris Lattner
8e2909e4d8 I already merged partial-overwrite.ll -> PartialStore.ll
Merge context-sensitive.ll -> simple.ll and upgrade it.

llvm-svn: 120374
2010-11-30 01:05:07 +00:00
Chris Lattner
496eacefab clean up DSE tests, removing some poorly reduced and useless old test,
merging more into other larger .ll files, filecheckizing along the way.

llvm-svn: 120373
2010-11-30 01:00:34 +00:00
Chris Lattner
083731f3d6 enhance basicaa to return "Mod" for a memcpy call when the
queried location doesn't overlap the source, and add a testcase.

llvm-svn: 120370
2010-11-30 00:43:16 +00:00
Chris Lattner
8ec1830a01 Teach basicaa that memset's modref set is at worst "mod" and never
contains "ref".

Enhance DSE to use a modref query instead of a store-specific hack
to generalize the "ignore may-alias stores" optimization to handle
memset and memcpy.

llvm-svn: 120368
2010-11-30 00:28:45 +00:00
Chris Lattner
975dcf5ac8 my previous patch would cause us to start deleting some volatile
stores, fix and add a testcase.

llvm-svn: 120363
2010-11-30 00:12:39 +00:00
Benjamin Kramer
84bf47f2d8 Fix some broken CHECK lines.
llvm-svn: 120332
2010-11-29 22:34:55 +00:00
Chris Lattner
2f8a9e1eac fix PR8677, patch by Jakub Staszak!
llvm-svn: 120325
2010-11-29 21:59:31 +00:00
Frits van Bommel
a59a8cf49f Transform (extractvalue (load P), ...) to (load (gep P, 0, ...)) if the load has no other uses, shrinking the load.
llvm-svn: 120323
2010-11-29 21:56:20 +00:00
Frits van Bommel
77f6750c83 Update this test to keep testing the -instcombine transform it's supposed to be testing instead of triggering the improved constant folding for insertvalue and extractvalue.
llvm-svn: 120319
2010-11-29 20:55:40 +00:00
Frits van Bommel
6610b43890 Teach ConstantFoldInstruction() how to fold insertvalue and extractvalue.
llvm-svn: 120316
2010-11-29 20:36:52 +00:00
Nick Lewycky
f6fa6b29f4 Treat a call of function pointer like a load of the pointer when considering
whether the pointer can be replaced with the global variable it is a copy of.
Fixes PR8680.

llvm-svn: 120126
2010-11-24 22:04:20 +00:00
Benjamin Kramer
8d7096e8ca The srem -> urem transform is not safe for any divisor that's not a power of two.
E.g. -5 % 5 is 0 with srem and 1 with urem.

Also addresses Frits van Bommel's comments.

llvm-svn: 120049
2010-11-23 20:33:57 +00:00
Benjamin Kramer
c8e6037e7d InstCombine: Reduce "X shift (A srem B)" to "X shift (A urem B)" iff B is positive.
This allows to transform the rem in "1 << ((int)x % 8);" to an and.

llvm-svn: 120028
2010-11-23 18:52:42 +00:00
Duncan Sands
555525adf4 Exploit distributive laws (eg: And distributes over Or, Mul over Add, etc) in a
fairly systematic way in instcombine.  Some of these cases were already dealt
with, in which case I removed the existing code.  The case of Add has a bunch of
funky logic which covers some of this plus a few variants (considers shifts to be
a form of multiplication), which I didn't touch.  The simplification performed is:
A*B+A*C -> A*(B+C).  The improvement is to do this in cases that were not already
handled [such as A*B-A*C -> A*(B-C), which was reported on the mailing list], and
also to do it more often by not checking for "only one use" if "B+C" simplifies.

llvm-svn: 120024
2010-11-23 14:23:47 +00:00
Chris Lattner
41281bd30f duncan's spider sense was right, I completely reversed the condition
on this instcombine xform.  This fixes a miscompilation of 403.gcc.

llvm-svn: 119988
2010-11-23 02:42:04 +00:00
Benjamin Kramer
b5a2a81094 InstCombine: Implement X - A*-B -> X + A*B.
llvm-svn: 119984
2010-11-22 20:31:27 +00:00
Duncan Sands
73f0559779 If a GEP index simply advances by multiples of a type of zero size,
then replace the index with zero.

llvm-svn: 119974
2010-11-22 16:32:50 +00:00
Duncan Sands
c2b128ad7d Add a rather pointless InstructionSimplify transform, inspired by recent constant
folding improvements: if P points to a type of size zero, turn "gep P, N" into "P".
More generally, if a gep index type has size zero, instcombine could replace the
index with zero, but that is not done here.

llvm-svn: 119942
2010-11-21 13:53:09 +00:00
Chris Lattner
3a0edfb37c implement PR8576, deleting dead stores with intervening may-alias stores.
llvm-svn: 119927
2010-11-21 07:34:32 +00:00
Chris Lattner
32a16bce7a file checkize
llvm-svn: 119926
2010-11-21 07:32:40 +00:00
Chris Lattner
908a01328c optimize:
void a(int x) { if (((1<<x)&8)==0) b(); }

into "x != 3", which occurs over 100 times in 403.gcc but in no
other program in llvm-test.

llvm-svn: 119922
2010-11-21 06:44:42 +00:00
Chris Lattner
ba1cc33676 Implement PR8644: forwarding a memcpy value to a byval,
allowing the memcpy to be eliminated.

Unfortunately, the requirements on byval's without explicit 
alignment are really weak and impossible to predict in the 
mid-level optimizer, so this doesn't kick in much with current
frontends.  The fix is to change clang to set alignment on all
byval arguments.

llvm-svn: 119916
2010-11-21 00:28:59 +00:00
Owen Anderson
5ee547b9d5 Add a test for CodeGenPrepare's ability to look through PHI nodes when performing
addressing mode folding, introduced in r119853.

llvm-svn: 119857
2010-11-19 22:34:53 +00:00
Duncan Sands
4562d3b919 Factor code for testing whether replacing one value with another
preserves LCSSA form out of ScalarEvolution and into the LoopInfo
class.  Use it to check that SimplifyInstruction simplifications
are not breaking LCSSA form.  Fixes PR8622.

llvm-svn: 119727
2010-11-18 19:59:41 +00:00
Owen Anderson
c2db966e5e Completely rework the datastructure GVN uses to represent the value number to leader mapping. Previously,
this was a tree of hashtables, and a query recursed into the table for the immediate dominator ad infinitum
if the initial lookup failed.  This led to really bad performance on tall, narrow CFGs.

We can instead replace it with what is conceptually a multimap of value numbers to leaders (actually
represented by a hashtable with a list of Value*'s as the value type), and then
determine which leader from that set to use very cheaply thanks to the DFS numberings maintained by
DominatorTree.  Because there are typically few duplicates of a given value, this scan tends to be
quite fast.  Additionally, we use a custom linked list and BumpPtr allocation to avoid any unnecessary
allocation in representing the value-side of the multimap.

This change brings with it a 15% (!) improvement in the total running time of GVN on 403.gcc, which I
think is pretty good considering that includes all the "real work" being done by MemDep as well.

The one downside to this approach is that we can no longer use GVN to perform simple conditional progation,
but that seems like an acceptable loss since we now have LVI and CorrelatedValuePropagation to pick up
the slack.  If you see conditional propagation that's not happening, please file bugs against LVI or CVP.

llvm-svn: 119714
2010-11-18 18:32:40 +00:00
Dan Gohman
ec75e876ab Add support for PHI-translating sext, zext, and trunc instructions,
enabling more PRE. PR8586.

llvm-svn: 119704
2010-11-18 17:05:13 +00:00
Chris Lattner
c752718881 remove a pointless restriction from memcpyopt. It was
refusing to optimize two memcpy's like this:

copy A <- B
copy C <- A

if it couldn't prove that noalias(B,C).  We can eliminate
the copy by producing a memmove instead of memcpy.

llvm-svn: 119694
2010-11-18 08:00:57 +00:00
Chris Lattner
1000d06bee filecheckize, this is still not optimal, see PR8643
llvm-svn: 119693
2010-11-18 07:49:32 +00:00
Chris Lattner
6048697a30 allow eliminating an alloca that is just copied from an constant global
if it is passed as a byval argument.  The byval argument will just be a
read, so it is safe to read from the original global instead.  This allows
us to promote away the %agg.tmp alloca in PR8582

llvm-svn: 119686
2010-11-18 06:41:51 +00:00
Chris Lattner
791e914b1b enhance the "alloca is just a memcpy from constant global"
to ignore calls that obviously can't modify the alloca
because they are readonly/readnone.

llvm-svn: 119683
2010-11-18 06:26:49 +00:00
Chris Lattner
44ccd4643d fix a small oversight in the "eliminate memcpy from constant global"
optimization.  If the alloca that is "memcpy'd from constant" also has
a memcpy from *it*, ignore it: it is a load.  We now optimize the testcase to:

define void @test2() {
  %B = alloca %T
  %a = bitcast %T* @G to i8*
  %b = bitcast %T* %B to i8*
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %b, i8* %a, i64 124, i32 4, i1 false)
  call void @bar(i8* %b)
  ret void
}

previously we would generate:

define void @test() {
  %B = alloca %T
  %b = bitcast %T* %B to i8*
  %G.0 = getelementptr inbounds %T* @G, i32 0, i32 0
  %tmp3 = load i8* %G.0, align 4
  %G.1 = getelementptr inbounds %T* @G, i32 0, i32 1
  %G.15 = bitcast [123 x i8]* %G.1 to i8*
  %1 = bitcast [123 x i8]* %G.1 to i984*
  %srcval = load i984* %1, align 1
  %B.0 = getelementptr inbounds %T* %B, i32 0, i32 0
  store i8 %tmp3, i8* %B.0, align 4
  %B.1 = getelementptr inbounds %T* %B, i32 0, i32 1
  %B.12 = bitcast [123 x i8]* %B.1 to i8*
  %2 = bitcast [123 x i8]* %B.1 to i984*
  store i984 %srcval, i984* %2, align 1
  call void @bar(i8* %b)
  ret void
}

llvm-svn: 119682
2010-11-18 06:20:47 +00:00
Chris Lattner
c1e63bb987 filecheckize
llvm-svn: 119681
2010-11-18 06:16:43 +00:00
Benjamin Kramer
1b330efb46 InstCombine: Add a missing irem identity (X % X -> 0).
llvm-svn: 119538
2010-11-17 19:11:46 +00:00
Duncan Sands
825c7d7f79 In which I discover the existence of loops. Threading an operation
over a phi node by applying it to each operand may be wrong if the
operation and the phi node are mutually interdependent (the testcase
has a simple example of this).  So only do this transform if it would
be correct to perform the operation in each predecessor of the block
containing the phi, i.e. if the other operands all dominate the phi.
This should fix the FFMPEG snow.c regression reported by İsmail Dönmez.

llvm-svn: 119347
2010-11-16 12:16:38 +00:00
Duncan Sands
ccdccb2776 Teach InstructionSimplify the trick of skipping incoming phi
values that are equal to the phi itself.

llvm-svn: 119161
2010-11-15 17:52:45 +00:00
Duncan Sands
658bdcf094 Move PHI tests to phi.ll, out of select.ll.
llvm-svn: 119153
2010-11-15 16:43:28 +00:00
Duncan Sands
617030ad18 Teach InstructionSimplify about phi nodes. I chose to have it simply
offload the work to hasConstantValue rather than do something more
complicated (such handling mutually recursive phis) because (1) it is
not clear it is worth it; and (2) if it is worth it, maybe such logic
would be better placed in hasConstantValue.  Adjust some GVN tests
which are now cleaned up much further (eg: all phi nodes are removed).

llvm-svn: 119043
2010-11-14 13:30:18 +00:00
Chris Lattner
7ade54ed02 rename test.
llvm-svn: 119033
2010-11-14 07:03:49 +00:00
Chris Lattner
9b45357c09 filecheckize, remove an old and useless test
llvm-svn: 119032
2010-11-14 07:03:38 +00:00
Chris Lattner
449c3bd7b5 this test is pretty pointless and "propogation" isn't a word (or so Misha claims).
llvm-svn: 119031
2010-11-14 07:02:03 +00:00
Duncan Sands
47dddbe925 Testcase to go along with commit 118923 ("Have GVN simplify instructions
as it goes").  Before -std-compile-opts only got it down to
  %a = tail call i32 @foo(i32 0) readnone
  %x = tail call i32 @foo(i32 %a) readnone
  %y = tail call i32 @foo(i32 %a) readnone
  %z = icmp eq i32 %x, %y
  ret i1 %z
while now -basicaa -gvn alone reduce it to
  %a = call i32 @foo(i32 0) readnone
  %x = call i32 @foo(i32 %a) readnone
  ret i1 true

llvm-svn: 119009
2010-11-13 21:33:19 +00:00
Duncan Sands
88fc6cd7fe Generalize the reassociation transform in SimplifyCommutative (now renamed to
SimplifyAssociativeOrCommutative) "(A op C1) op C2" -> "A op (C1 op C2)",
which previously was only done if C1 and C2 were constants, to occur whenever
"C1 op C2" simplifies (a la InstructionSimplify).  Since the simplifying operand
combination can no longer be assumed to be the right-hand terms, consider all of
the possible permutations.  When compiling "gcc as one big file", transform 2
(i.e. using right-hand operands) fires about 4000 times but it has to be said
that most of the time the simplifying operands are both constants.  Transforms
3, 4 and 5 each fired once.  Transform 6, which is an existing transform that
I didn't change, never fired.  With this change, the testcase is now optimized
perfectly with one run of instcombine (previously it required instcombine +
reassociate + instcombine, and it may just have been luck that this worked).

llvm-svn: 119002
2010-11-13 15:10:37 +00:00
Dan Gohman
bc5a716f10 Enhance DSE to handle the case where a free call makes more than
one store dead. This is especially noticeable in
SingleSource/Benchmarks/Shootout/objinst.

llvm-svn: 118875
2010-11-12 02:19:17 +00:00
Dan Gohman
2cf0e523cb Filecheckize.
llvm-svn: 118874
2010-11-12 02:02:39 +00:00