1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-01 16:33:37 +01:00
Commit Graph

7774 Commits

Author SHA1 Message Date
Cameron Zwarich
0287e8ad22 Stop computing the number of uses twice per value in CodeGenPrepare's sinking of
addressing code. On 403.gcc this almost halves CodeGenPrepare time and reduces
total llc time by 9.5%. Unfortunately, getNumUses() is still the hottest function
in llc.

llvm-svn: 126782
2011-03-01 21:13:53 +00:00
Anders Carlsson
1eb388e6c3 Make InstCombiner::FoldAndOfICmps create a ConstantRange that's the
intersection of the LHS and RHS ConstantRanges and return "false" when
the range is empty.

This simplifies some code and catches some extra cases.

llvm-svn: 126744
2011-03-01 15:05:01 +00:00
Eli Friedman
c00dc3f262 Add an obvious missing safety check to DAE::RemoveDeadArgumentsFromCallers.
llvm-svn: 126720
2011-03-01 00:33:47 +00:00
Ted Kremenek
7b6189c848 Unbreak CMake build.
llvm-svn: 126715
2011-02-28 23:56:33 +00:00
Chris Lattner
77d6f9b20e update cmake
llvm-svn: 126694
2011-02-28 22:45:25 +00:00
Dan Gohman
91de965338 Delete the GEPSplitter experiment.
llvm-svn: 126671
2011-02-28 19:47:47 +00:00
Dan Gohman
db646bfdad Delete the SimplifyHalfPowrLibCalls pass, which was unused, and
only existed as the result of a misunderstanding.

llvm-svn: 126669
2011-02-28 19:41:14 +00:00
Frits van Bommel
4a2b658705 Teach SimplifyCFG that (switch (select cond, X, Y)) is better expressed as a branch.
Based on a patch by Alistair Lynn.

llvm-svn: 126647
2011-02-28 09:44:07 +00:00
Nick Lewycky
dcc97b5f44 srem doesn't actually have the same resulting sign as its numerator, you could
also have a zero when numerator = denominator. Reverts parts of r126635 and
r126637.

llvm-svn: 126644
2011-02-28 09:17:39 +00:00
Nick Lewycky
28f01da48e Teach InstCombine to fold "(shr exact X, Y) == 0" --> X == 0, fixing #1 from
PR9343.

llvm-svn: 126643
2011-02-28 08:31:40 +00:00
Nick Lewycky
e0f44d0aba The sign of an srem instruction is the sign of its dividend (the first
argument), regardless of the divisor. Teach instcombine about this and fix
test7 in PR9343!

llvm-svn: 126635
2011-02-28 06:20:05 +00:00
Benjamin Kramer
39a5d8596c Revert "SimplifyCFG: GEPs with just one non-constant index are also cheap."
Yes, there are other types than i8* and GEPs on them can produce an add+multiply.
We don't consider that cheap enough to be speculatively executed.

llvm-svn: 126481
2011-02-25 10:33:33 +00:00
Benjamin Kramer
44b43a85db SimplifyCFG: GEPs with just one non-constant index are also cheap.
llvm-svn: 126452
2011-02-24 23:26:09 +00:00
Benjamin Kramer
b5996b08b7 SimplifyCFG: GEPs with constant indices are cheap enough to be executed unconditionally.
llvm-svn: 126445
2011-02-24 22:46:11 +00:00
Devang Patel
29b946c0bd Do not use DIFactory. Use DIBuilder.
llvm-svn: 126398
2011-02-24 18:49:55 +00:00
Chris Lattner
e64e9c6f73 wire TargetLibraryInfo into simplify libcalls and use it in a couple of
trivial places.  This pass needs a lot of work.

llvm-svn: 126367
2011-02-24 07:16:14 +00:00
Chris Lattner
d1bb8c0277 move a massive amount of code out into its own helper function
to reduce nesting.  This needs to be turned into a table.

llvm-svn: 126366
2011-02-24 07:12:12 +00:00
Chris Lattner
72a2ebab6c change instcombine to not turn a call to non-varargs bitcast of
function prototype into a call to a varargs prototype.  We do
allow the xform if we have a definition, but otherwise we don't
want to risk that we're changing the abi in a subtle way.  On
X86-64, for example, varargs require passing stuff in %al.

llvm-svn: 126363
2011-02-24 05:10:56 +00:00
Cameron Zwarich
c5fa112a70 Make LoopDeletion work on loops with multiple edges, as long as the incoming
values from all of the loop's exiting blocks are equal. Patch by Andrew Clinton.

llvm-svn: 126253
2011-02-22 22:25:39 +00:00
Duncan Sands
17a277da6d If the phi node was used by an unreachable instruction that ends up using
itself without going via a phi node then we could return false here in
spite of making a change.  Also, tweak the comment because this method
can (and always could) return true without deleting the original phi node.
For example, if the phi node was used by a read-only invoke instruction
which is used by another phi node phi2 which is only used by and only uses
the invoke, then phi2 would be deleted but not the invoke instruction and
not the original phi node.

llvm-svn: 126129
2011-02-21 17:32:05 +00:00
Chris Lattner
83c60ae907 fix a crasher in disabled code (on variable stride loops)
llvm-svn: 126125
2011-02-21 17:02:55 +00:00
Duncan Sands
702e8b136e Simplify RecursivelyDeleteDeadPHINode. The only functionality change
should be that if the phi is used by a side-effect free instruction with
no uses then the phi and the instruction now get zapped (checked by the
unittest).

llvm-svn: 126124
2011-02-21 16:27:36 +00:00
Chris Lattner
9d2899115f Add some (disabled code) to print out negative strides.
llvm-svn: 126102
2011-02-21 02:08:54 +00:00
Nick Lewycky
ecae3aec02 Make RecursivelyDeleteDeadPHINode delete a phi node that has no users and add a
test for that. With this change, test/CodeGen/X86/codegen-dce.ll no longer finds
any instructions to DCE, so delete the test.

Also renamed J and JP to I and IP in RecursivelyDeleteDeadPHINode.

llvm-svn: 126088
2011-02-20 18:05:56 +00:00
Benjamin Kramer
85011c0273 Move "A | ~(A & ?) -> -1" from InstCombine to InstructionSimplify.
llvm-svn: 126082
2011-02-20 15:20:01 +00:00
Benjamin Kramer
50cd35c25e InstCombine: Add a bunch of combines of the form x | (y ^ z).
We usually catch this kind of optimization through InstSimplify's distributive
magic, but or doesn't distribute over xor in general.

"A | ~(A | B) -> A | ~B" hits 24 times on gcc.c.

llvm-svn: 126081
2011-02-20 13:23:43 +00:00
Nick Lewycky
4d7bb906df Teach RecursivelyDeleteDeadPHINodes to handle multiple self-references. Patch
by Andrew Clinton!

llvm-svn: 126077
2011-02-20 08:38:20 +00:00
Nick Lewycky
cd05f2a659 Instead of keeping two Value*->id# mappings, keep one Value->Value mapping and
one Value set. This is faster because we only need to use the set when there
isn't already an entry in the map. No functionality change!

llvm-svn: 126076
2011-02-20 08:11:03 +00:00
Eli Friedman
35ed1e5d6c PR9218: SimplifyDemandedVectorElts can return a non-null value that is not
the instruction passed in.  Make sure to account for this correctly, instead
of looping infinitely.

llvm-svn: 126058
2011-02-19 22:42:40 +00:00
Chris Lattner
cc2fa12fac rewrite the memset_pattern pattern generation stuff to accept any 2/4/8/16-byte
constant, including globals.  This makes us generate much more "pretty" pattern
globals as well because it doesn't break it down to an array of bytes all the
time.

This enables us to handle stores of relocatable globals.  This kicks in about
48 times in 254.gap, giving us stuff like this:

@.memset_pattern40 = internal constant [2 x %struct.TypHeader* (%struct.TypHeader*, %struct.TypHeader*)*] [%struct.TypHeader* (%struct.TypHeader*, %struct
.TypHeader*)* @IsFalse, %struct.TypHeader* (%struct.TypHeader*, %struct.TypHeader*)* @IsFalse], align 16

...
  call void @memset_pattern16(i8* %scevgep5859, i8* bitcast ([2 x %struct.TypHeader* (%struct.TypHeader*, %struct.TypHeader*)*]* @.memset_pattern40 to i8*
), i64 %tmp75) nounwind

llvm-svn: 126044
2011-02-19 19:56:44 +00:00
Chris Lattner
474215e713 Implement rdar://9009151, transforming strided loop stores of
unsplatable values into memset_pattern16 when it is available
(recent darwins).  This transforms lots of strided loop stores
of ints for example, like 5 in vpr:

  Formed memset:   call void @memset_pattern16(i8* %4, i8* getelementptr inbounds ([16 x i8]* @.memset_pattern9, i32 0, i32 0), i64 %tmp25)
    from store to: {%3,+,4}<%11> at:   store i32 3, i32* %scevgep, align 4, !tbaa !4

llvm-svn: 126040
2011-02-19 19:31:39 +00:00
Chris Lattner
2b183fca87 Make loop-idiom use TargetLibraryInfo to determine whether it is allowed
to hack on memset, memcpy etc.

llvm-svn: 125974
2011-02-18 22:22:15 +00:00
Oscar Fuentes
6e5d344a2e Move library stuff out of the toplevel CMakeLists.txt file.
llvm-svn: 125968
2011-02-18 22:06:14 +00:00
Duncan Sands
1ddd628de0 Add some transforms of the kind X-Y>X -> 0>Y which are valid when there is no
overflow.  These subsume some existing equality transforms, so zap those.

llvm-svn: 125843
2011-02-18 16:25:37 +00:00
Chris Lattner
eccec47f5c prevent jump threading from merging blocks when their address is
taken (and used!).  This prevents merging the blocks (invalidating
the block addresses) in a case like this:

#define _THIS_IP_  ({ __label__ __here; __here: (unsigned long)&&__here; })

void foo() {
  printf("%p\n", _THIS_IP_);
  printf("%p\n", _THIS_IP_);
  printf("%p\n", _THIS_IP_);
}

which fixes PR4151.

llvm-svn: 125829
2011-02-18 04:43:06 +00:00
Chris Lattner
cd170f7cf6 Don't unroll loops whose header block's address is taken.
This is part of a futile attempt to not "break" bizzaro
code like this:

 l1:
  printf("l1: %p\n", &&l1);
  ++x;

  if( x < 3 ) goto l1;


Previously we'd fold &&l1 to 1, which is fine per our semantics
but not helpful to the user.

llvm-svn: 125827
2011-02-18 04:25:21 +00:00
Chris Lattner
f9501b79f9 have instcombine preserve nsw/nuw/exact when sinking
common operations through a phi. 

llvm-svn: 125790
2011-02-17 23:01:49 +00:00
Chris Lattner
fd397180f3 fix typo
llvm-svn: 125787
2011-02-17 22:32:54 +00:00
Chris Lattner
fc8ee641a2 fix instcombine merging GEPs through a PHI to only make the
result inbounds if all of the inputs are inbounds.

llvm-svn: 125785
2011-02-17 22:21:26 +00:00
Chris Lattner
e797a8c29f add is always integer, thanks to Frits for noticing this.
llvm-svn: 125774
2011-02-17 20:55:29 +00:00
Duncan Sands
00610dbf64 Transform "A + B >= A + C" into "B >= C" if the adds do not wrap. Likewise for some
variations (some of these were already present so I unified the code).  Spotted by my
auto-simplifier as occurring a lot.

llvm-svn: 125734
2011-02-17 07:46:37 +00:00
Chris Lattner
6e936c247f preserve NUW/NSW when transforming add x,x
llvm-svn: 125711
2011-02-17 02:23:02 +00:00
Chris Lattner
828b97cdc2 fix PR9215, preventing -reassociate from clearing nsw/nuw when
it swaps the LHS/RHS of a single binop.

llvm-svn: 125700
2011-02-17 01:29:24 +00:00
Duncan Sands
061150ac1b Spelling fix: consequtive -> consecutive.
llvm-svn: 125563
2011-02-15 09:23:02 +00:00
Nadav Rotem
5306a4ae96 Fix 9216 - Endless loop in InstCombine pass.
The pattern "A&(A^B) -> A & ~B" recreated itself because ~B is
actually a xor -1.

llvm-svn: 125557
2011-02-15 07:13:48 +00:00
Devang Patel
091c6a8907 Do not forget DebugLoc!
llvm-svn: 125547
2011-02-15 02:02:30 +00:00
Chris Lattner
ccb24014c2 tidy up a bit.
llvm-svn: 125546
2011-02-15 01:56:08 +00:00
Chris Lattner
db204cbe42 convert ConstantVector::get to use ArrayRef.
llvm-svn: 125537
2011-02-15 00:14:00 +00:00
Devang Patel
4b1ea1ef94 Do not hoist @llvm.dbg.value. Here, @llvm.dbg.value is "referring" a value that is modified inside loop.
llvm-svn: 125529
2011-02-14 23:03:23 +00:00
Chris Lattner
ee7f7c2494 revert my ConstantVector patch, it seems to have made the llvm-gcc
builders unhappy.

llvm-svn: 125504
2011-02-14 18:15:46 +00:00
Chris Lattner
34f32cb4c2 Switch ConstantVector::get to use ArrayRef instead of a pointer+size
idiom.  Change various clients to simplify their code.

llvm-svn: 125487
2011-02-14 07:55:32 +00:00
Chris Lattner
dff71eae10 remove a now-unneccesary cast.
llvm-svn: 125464
2011-02-13 18:30:09 +00:00
Chris Lattner
72b78e11ba implement instcombine folding for things like (x >> c) < 42.
We were previously simplifying divisions, but not right shifts!

llvm-svn: 125454
2011-02-13 08:07:21 +00:00
Chris Lattner
dd3cc1b409 refactor some code out into a helper method.
llvm-svn: 125451
2011-02-13 07:43:07 +00:00
Daniel Dunbar
74c3b94237 SimplifyLibCalls: Add missing legalize check on various printf to puts and
putchar transforms, their return values are not compatible.

llvm-svn: 125442
2011-02-12 18:19:57 +00:00
Benjamin Kramer
793cd269de Also fold (A+B) == A -> B == 0 when the add is commuted.
llvm-svn: 125411
2011-02-11 21:46:48 +00:00
Chris Lattner
6c0014cd4a When lowering an inbounds gep, the intermediate adds can have
unsigned overflow (e.g. due to a negative array index), but
the scales on array size multiplications are known to not
sign wrap.

llvm-svn: 125409
2011-02-11 21:37:43 +00:00
Cameron Zwarich
ef11d9219e Make LoopUnswitch preserve ScalarEvolution by just forgetting everything about
a loop when unswitching it. It only does this in the complex case, because
everything should be fine already in the simple case.

llvm-svn: 125369
2011-02-11 06:08:28 +00:00
Cameron Zwarich
a71a5e4f90 LoopInstSimplify preserves ScalarEvolution.
llvm-svn: 125368
2011-02-11 06:08:25 +00:00
Cameron Zwarich
38343ff97c If we can't avoid running loop-simplify twice for now, at least avoid running
iv-users twice.

llvm-svn: 125318
2011-02-10 23:53:14 +00:00
Cameron Zwarich
9d8ab7d0f7 Rename 'loopsimplify' to 'loop-simplify'.
llvm-svn: 125317
2011-02-10 23:38:10 +00:00
Chris Lattner
d2c1936c14 implement the first part of PR8882: when lowering an inbounds
gep to explicit addressing, we know that none of the intermediate
computation overflows.

This could use review: it seems that the shifts certainly wouldn't
overflow, but could the intermediate adds overflow if there is a 
negative index?

Previously the testcase would instcombine to:

define i1 @test(i64 %i) {
  %p1.idx.mask = and i64 %i, 4611686018427387903
  %cmp = icmp eq i64 %p1.idx.mask, 1000
  ret i1 %cmp
}

now we get:

define i1 @test(i64 %i) {
  %cmp = icmp eq i64 %i, 1000
  ret i1 %cmp
}

llvm-svn: 125271
2011-02-10 07:11:16 +00:00
Chris Lattner
6e84f48cd8 Enhance a bunch of transformations in instcombine to start generating
exact/nsw/nuw shifts and have instcombine infer them when it can prove
that the relevant properties are true for a given shift without them.

Also, a variety of refactoring to use the new patternmatch logic thrown
in for good luck.  I believe that this takes care of a bunch of related
code quality issues attached to PR8862.

llvm-svn: 125267
2011-02-10 05:36:31 +00:00
Chris Lattner
72ac244f4e Enhance the "compare with shift" and "compare with div"
optimizations to be much more aggressive in the face of
exact/nsw/nuw div and shifts.  For example, these (which
are the same except the first is 'exact' sdiv:

define i1 @sdiv_icmp4_exact(i64 %X) nounwind {
  %A = sdiv exact i64 %X, -5   ; X/-5 == 0 --> x == 0
  %B = icmp eq i64 %A, 0
  ret i1 %B
}

define i1 @sdiv_icmp4(i64 %X) nounwind {
  %A = sdiv i64 %X, -5   ; X/-5 == 0 --> x == 0
  %B = icmp eq i64 %A, 0
  ret i1 %B
}

compile down to:

define i1 @sdiv_icmp4_exact(i64 %X) nounwind {
  %1 = icmp eq i64 %X, 0
  ret i1 %1
}

define i1 @sdiv_icmp4(i64 %X) nounwind {
  %X.off = add i64 %X, 4
  %1 = icmp ult i64 %X.off, 9
  ret i1 %1
}

This happens when you do something like:
  (ptr1-ptr2) == 42

where the pointers are pointers to non-unit types.

llvm-svn: 125266
2011-02-10 05:23:05 +00:00
Chris Lattner
0decae4bf7 more cleanups, notably bitcast isn't used for "signed to unsigned type
conversions". :)

llvm-svn: 125265
2011-02-10 05:17:27 +00:00
Chris Lattner
b974ff0c57 A bunch of cleanups and simplifications using the new PatternMatch predicates
and generally tidying things up.  Only very trivial functionality changes
like now doing (-1 - A) -> (~A) for vectors too.

 InstCombineAddSub.cpp |  296 +++++++++++++++++++++-----------------------------
 1 file changed, 126 insertions(+), 170 deletions(-)

llvm-svn: 125264
2011-02-10 05:14:58 +00:00
Chris Lattner
c741c5d744 teach SimplifyDemandedBits that exact shifts demand the bits they
are shifting out since they do require them to be zeros.  Similarly
for NUW/NSW bits of shl

llvm-svn: 125263
2011-02-10 05:09:34 +00:00
Eric Christopher
c86930fa03 Revert this in an attempt to bring the builders back.
llvm-svn: 125257
2011-02-10 01:48:24 +00:00
Cameron Zwarich
9e1e6813bd Turn this pass ordering:
Natural Loop Information
 Loop Pass Manager
   Canonicalize natural loops
 Scalar Evolution Analysis
 Loop Pass Manager
   Induction Variable Users
   Canonicalize natural loops
   Induction Variable Users
   Loop Strength Reduction

into this:

Scalar Evolution Analysis
Loop Pass Manager
  Canonicalize natural loops
  Induction Variable Users
  Loop Strength Reduction

This fixes <rdar://problem/8869639>. I also filed PR9184 on doing this sort of
thing automatically, but it seems easier to just change the ordering of the
passes if this is the only case.

llvm-svn: 125254
2011-02-10 01:07:54 +00:00
Chris Lattner
02088f3ab8 Teach instsimplify some tricks about exact/nuw/nsw shifts.
improve interfaces to instsimplify to take this info.

llvm-svn: 125196
2011-02-09 17:15:04 +00:00
Chris Lattner
7468ab4b90 Rework InstrTypes.h so to reduce the repetition around the NSW/NUW/Exact
versions of creation functions.  Eventually, the "insertion point" versions
of these should just be removed, we do have IRBuilder afterall.

Do a massive rewrite of much of pattern match.  It is now shorter and less
redundant and has several other widgets I will be using in other patches.
Among other changes, m_Div is renamed to m_IDiv (since it only matches 
integer divides) and m_Shift is gone (it used to match all binops!!) and
we now have m_LogicalShift for the one client to use.

Enhance IRBuilder to have "isExact" arguments to things like CreateUDiv
and reduce redundancy within IRbuilder by having these methods chain to
each other more instead of duplicating code.

llvm-svn: 125194
2011-02-09 17:00:45 +00:00
Nick Lewycky
b162446cda When removing a function from the function set and adding it to deferred, we
could end up removing a different function than we intended because it was
functionally equivalent, then end up with a comparison of a function against
itself in the next round of comparisons (the one in the function set and the
one on the deferred list). To fix this, I introduce a choice in the form of
comparison for ComparableFunctions, either normal or "pointer only" used to
find exact Function*'s in lookups.

Also add some debugging statements.

llvm-svn: 125180
2011-02-09 06:32:02 +00:00
Dan Gohman
ae7dba9ba8 Don't split any loop backedges, including backedges of loops other than
the active loop. This is generally desirable, and it avoids trouble
in situations such as the testcase in PR9123, though the failure
mode depends on use-list order, so it is infeasible to test.

llvm-svn: 125065
2011-02-08 00:55:13 +00:00
Benjamin Kramer
04249128ab SimplifyCFG: Track the number of used icmps when turning a icmp chain into a switch. If we used only one icmp, don't turn it into a switch.
Also prevent the switch-to-icmp transform from creating identity adds, noticed by Marius Wachtler.

llvm-svn: 125056
2011-02-07 22:37:28 +00:00
Chris Lattner
7b6a968f5d enhance vmcore to know that udiv's can be exact, and add a trivial
instcombine xform to exercise this.

Nothing forms exact udivs yet though.  This is progress on PR8862

llvm-svn: 124992
2011-02-06 21:44:57 +00:00
Nick Lewycky
7e863fb906 Simplify away redundant test, and document what's going on.
llvm-svn: 124977
2011-02-06 05:04:00 +00:00
Nick Lewycky
fb03aee332 Remove specialized comparison of InlineAsm objects. They're uniqued on creation
now, and this wasn't comparing some of their relevant bits anyhow.

llvm-svn: 124976
2011-02-06 04:33:50 +00:00
Benjamin Kramer
75785ec972 SimplifyCFG: Also transform switches that represent a range comparison but are not sorted into sub+icmp.
This transforms another 1000 switches in gcc.c.

llvm-svn: 124826
2011-02-03 22:51:41 +00:00
Benjamin Kramer
b739613711 SimplifyCFG: Turn switches into sub+icmp+branch if possible.
This makes the job of the later optzn passes easier, allowing the vast amount of
icmp transforms to chew on it.

We transform 840 switches in gcc.c, leading to a 16k byte shrink of the resulting
binary on i386-linux.

The testcase from README.txt now compiles into
  decl  %edi
  cmpl  $3, %edi
  sbbl  %eax, %eax
  andl  $1, %eax
  ret

llvm-svn: 124724
2011-02-02 15:56:22 +00:00
Nick Lewycky
4f38aaec24 Remove wasteful caching. This isn't needed for correctness because any function
that might have changed been affected by a merge elsewhere will have been
removed from the function set, and it isn't needed for performance because we
call grow() ahead of time to prevent reallocations.

llvm-svn: 124717
2011-02-02 05:31:01 +00:00
Dan Gohman
11acb5002d Conservatively, clear optional flags, such as nsw, when performing
reassociation. No testcase, because I wasn't able to create a testcase
which actually demonstrates a problem.

llvm-svn: 124713
2011-02-02 02:05:46 +00:00
Dan Gohman
4dc130ea78 Fix reassociate to clear optional flags, such as nsw.
llvm-svn: 124712
2011-02-02 02:02:34 +00:00
Anders Carlsson
f184e5de9a Recognize and simplify
(A+B) == A  ->  B == 0
A == (A+B)  ->  B == 0

llvm-svn: 124567
2011-01-30 22:01:13 +00:00
Francois Pichet
6aed3c72dc Unbreak the MSVC build.
The DEBUG() call at line 606 demands to see raw_ostream's definition. I have no idea why this seems to only break MSVC.

llvm-svn: 124545
2011-01-29 20:06:16 +00:00
Frits van Bommel
b1b70f2a44 Call SimplifyFDivInst() in InstCombiner::visitFDiv().
llvm-svn: 124535
2011-01-29 17:50:27 +00:00
Frits van Bommel
92dc04df67 Move InstCombine's knowledge of fdiv to SimplifyInstruction().
llvm-svn: 124534
2011-01-29 15:26:31 +00:00
Evan Cheng
20433f6339 Add a test for TCE return duplication.
llvm-svn: 124527
2011-01-29 04:53:35 +00:00
Evan Cheng
4af5487b74 Re-apply r124518 with fix. Watch out for invalidated iterator.
llvm-svn: 124526
2011-01-29 04:46:23 +00:00
Evan Cheng
1f943b9b13 Revert r124518. It broke Linux self-host.
llvm-svn: 124522
2011-01-29 02:43:04 +00:00
Evan Cheng
a1e4cb5f09 Re-commit r124462 with fixes. Tail recursion elim will now dup ret into unconditional predecessor to enable TCE on demand.
llvm-svn: 124518
2011-01-29 01:29:26 +00:00
Andrew Trick
72f17d97f3 Implementation of path profiling.
Modified patch by Adam Preuss.

This builds on the existing framework for block tracing, edge profiling and optimal edge profiling.
See -help-hidden for new flags.
For documentation, see the technical report "Implementation of Path Profiling..." in llvm.org/pubs.

llvm-svn: 124515
2011-01-29 01:09:53 +00:00
Duncan Sands
1a18d8df96 My auto-simplifier noticed that ((X/Y)*Y)/Y occurs several times in SPEC
benchmarks, and that it can be simplified to X/Y.  (In general you can only
simplify (Z*Y)/Y to Z if the multiplication did not overflow; if Z has the
form "X/Y" then this is the case).  This patch implements that transform and
moves some Div logic out of instcombine and into InstructionSimplify.
Unfortunately instcombine gets in the way somewhat, since it likes to change
(X/Y)*Y into X-(X rem Y), so I had to teach instcombine about this too.
Finally, thanks to the NSW/NUW flags, sometimes we know directly that "Z*Y"
does not overflow, because the flag says so, so I added that logic too.  This
eliminates a bunch of divisions and subtractions in 447.dealII, and has good
effects on some other benchmarks too.  It seems to have quite an effect on
tramp3d-v4 but it's hard to say if it's good or bad because inlining decisions
changed, resulting in massive changes all over.

llvm-svn: 124487
2011-01-28 16:51:11 +00:00
Nick Lewycky
f9a384e203 Rename functions to follow coding standard. Also rejiggers comments. No
functionality change.

llvm-svn: 124482
2011-01-28 08:43:14 +00:00
Nick Lewycky
edbe62c10d Add a doxygen comment for this class.
llvm-svn: 124480
2011-01-28 08:19:00 +00:00
Nick Lewycky
28f2b64333 Reorder for readability. (Chris, is this what you meant?)
llvm-svn: 124479
2011-01-28 07:36:21 +00:00
Evan Cheng
5b6c72e549 Revert r124462. There are a few big regressions that I need to fix first.
llvm-svn: 124478
2011-01-28 07:12:38 +00:00
Nick Lewycky
744bd3872f Reduce the number of functions we look at in the first pass, and preallocate
the function equality set.

llvm-svn: 124475
2011-01-28 05:48:15 +00:00
Nick Lewycky
74dfcccec4 Fold select + select where both selects are on the same condition.
llvm-svn: 124469
2011-01-28 03:28:10 +00:00
Evan Cheng
7031f450b3 - Stop simplifycfg from duplicating "ret" instructions into unconditional
branches. PR8575, rdar://5134905, rdar://8911460.
- Allow codegen tail duplication to dup small return blocks after register
  allocation is done.

llvm-svn: 124462
2011-01-28 02:19:21 +00:00
Benjamin Kramer
114c20e190 Unbreak the build.
llvm-svn: 124426
2011-01-27 20:30:54 +00:00