mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 03:53:04 +02:00
Commit Graph

4160 Commits

Author SHA1 Message Date
Jakub Staszak
2f69ff9223 Move includes to the .cpp file.
llvm-svn: 148342
2012-01-17 22:16:31 +00:00
Andrew Trick
071cb0a076 Fix a corner case hit by redundant phi elimination running after LSR.
Fixes PR11761: bad IR w/ redundant Phi elim

llvm-svn: 148177
2012-01-14 03:17:23 +00:00
Bill Wendling
771a417c3f A DenseMap of a std::map isn't a very good idea because the "grow()" method will
need to make a deep copy of each of the std::maps. Use a std::map of
std::maps instead. This improves the compile time of sqlite3 by ~2%.

llvm-svn: 148003
2012-01-12 01:41:03 +00:00
Bill Wendling
9974bf8911 Revert r147978. A DenseMap's iterators may become invalidated here.
llvm-svn: 147980
2012-01-11 23:43:34 +00:00
Bill Wendling
e0f4bb054e Use a DenseMap.
This appears to improve sqlite3's compile time by ~2%.

llvm-svn: 147978
2012-01-11 22:57:32 +00:00
Andrew Trick
1454836d05 Clarified the SCEV getSmallConstantTripCount interface with in-your-face comments.
This interface is misleading and dangerous, but it is actually what we need for unrolling.

llvm-svn: 147926
2012-01-11 06:52:55 +00:00
Eric Christopher
cc62a64250 Don't avoid recursing for pointer types, just reference types. Expand on
the comment.

Fixes constvars.exp on the gdb test builder.

llvm-svn: 147897
2012-01-11 00:01:29 +00:00
Chandler Carruth
844b5fc832 Cleanup these asserts to follow common LLVM style and coding
conventions. Also, clarify the grouping of one of the asserts to silence
-Wparentheses.

llvm-svn: 147863
2012-01-10 18:18:52 +00:00
David Blaikie
8d47bb30e3 Remove unnecessary default cases in switches that cover all enum values.
llvm-svn: 147855
2012-01-10 16:47:17 +00:00
Andrew Trick
db66631fb3 Enable LSR IV Chains with sufficient heuristics.
These heuristics are sufficient for enabling IV chains by
default. Performance analysis has been done for i386, x86_64, and
thumbv7. The optimization is rarely important, but can significantly
speed up certain cases by eliminating spill code within the
loop. Unrolled loops are prime candidates for IV chains. In many
cases, the final code could still be improved with more target
specific optimization following LSR. The goal of this feature is for
LSR to make the best choice of induction variables.

Instruction selection may not completely take advantage of this
feature yet. As a result, there could be cases of slight code size
increase.

Code size can be worse on x86 because it doesn't support postincrement
addressing. In fact, when chains are formed, you may see redundant
address plus stride addition in the addressing mode. GenerateIVChains
tries to compensate for the common cases.

On ARM, code size increase can be mitigated by using postincrement
addressing, but downstream codegen currently misses some opportunities.

llvm-svn: 147826
2012-01-10 01:45:08 +00:00
Devang Patel
bcde69f19c Update language check. Do not ignore DW_LANG_Python.
Patch by Joe Groff!

llvm-svn: 147781
2012-01-09 17:49:47 +00:00
Andrew Trick
b228a75332 Cleanup comments and argument types related to my previous replaceCongruentPhis checkin.
llvm-svn: 147709
2012-01-07 01:29:21 +00:00
Andrew Trick
8a5a1e603e Extended replaceCongruentPhis to handle mixed phi types.
llvm-svn: 147707
2012-01-07 01:12:09 +00:00
Andrew Trick
60f6beef61 Expose isNonConstantNegative to users of ScalarEvolution.
llvm-svn: 147700
2012-01-07 00:27:31 +00:00
Andrew Trick
d5553c6ad2 Put all IVUsers in the processed set. Allow querying IVUsers with isIVUserOrOperand.
llvm-svn: 147686
2012-01-06 21:41:55 +00:00
Andrew Trick
a2096be363 SCEVExpander: hoistStep should check strict dominance.
llvm-svn: 147683
2012-01-06 21:23:43 +00:00
Dan Gohman
f9fb5b3951 Generalize isSafeToSpeculativelyExecute to work on arbitrary
Values, rather than just Instructions, since it's interesting
for ConstantExprs too.

llvm-svn: 147560
2012-01-04 23:01:09 +00:00
Andrew Trick
6839d66ab3 Fix SCEVExpander to handle loops with no preheader when LSR gives it a
"phony" insertion point.

Fixes rdar://10619599: "SelectionDAGBuilder shouldn't visit PHI nodes!" assert

llvm-svn: 147439
2012-01-02 21:25:10 +00:00
Benjamin Kramer
fba4e88bc2 PatternMatch: Introduce a matcher for instructions with the "exact" bit. Use it to simplify a few matchers.
llvm-svn: 147403
2012-01-01 17:55:30 +00:00
Nick Lewycky
7425820374 Change CaptureTracking to pass a Use* instead of a Value* when a value is
captured. This allows the tracker to look at the specific use, which may be
especially interesting for function calls.

Use this to fix 'nocapture' deduction in FunctionAttrs. The existing one does
not iterate until a fixpoint and does not guarantee that it produces the same
result regardless of iteration order. The new implementation builds up a graph
of how arguments are passed from function to function, and uses a bottom-up walk
on the argument-SCCs to assign nocapture. This gets us nocapture more often, and
does so rather efficiently and independent of iteration order.

llvm-svn: 147327
2011-12-28 23:24:21 +00:00
Benjamin Kramer
b5e584392b ComputeMaskedBits: Make knownzero computation more aggressive for ctlz with undef zero.
unsigned foo(unsigned x) { return 31 - __builtin_clz(x); }
now compiles into a single "bsrl" instruction on x86.

llvm-svn: 147255
2011-12-24 17:31:46 +00:00
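A hedged IR rendering of the C one-liner above (a sketch of mine, not taken from the commit), using the two-argument form of the intrinsic introduced in the llvm.ctlz/llvm.cttz signature change further down. The point is that the zero-undef variant lets ComputeMaskedBits prove the result lies in [0, 31], which is what allows the subtraction to collapse into the single bsrl the commit mentions.

  define i32 @log2_floor(i32 %x) {
    ; zero-undef ctlz: the result is known to be in [0, 31], so the upper
    ; bits are known zero and "31 - %c" behaves like "31 xor %c"
    %c = call i32 @llvm.ctlz.i32(i32 %x, i1 true)
    %r = sub i32 31, %c
    ret i32 %r                  ; selects to a single bsrl on x86
  }
  declare i32 @llvm.ctlz.i32(i32, i1)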
Chandler Carruth
acdf352b77 Make the unreachable probability much much heavier. The previous
probability wouldn't be considered "hot" in some weird loop structures
or other compounding probability patterns. This makes it much harder to
confuse, but isn't really a principled fix. I'd actually like it if we
could model a zero probability, as it would make this much easier to
reason about. Suggestions for how to do this better are welcome.

llvm-svn: 147142
2011-12-22 09:26:37 +00:00
Nick Lewycky
cedb6b6ec4 Continue counting intrinsics as instructions (except when they aren't, such as
debug info) and for being vector operations. Fixes regression from r147037.

llvm-svn: 147093
2011-12-21 20:26:03 +00:00
Nick Lewycky
f147ddb0e6 Fix typo and spacing, no functionality change.
llvm-svn: 147092
2011-12-21 20:21:55 +00:00
Nick Lewycky
98758f0c39 A call to a function marked 'noinline' is not an inline candidate. The sole
call site of an intrinsic is also not an inline candidate. While here, make it
more obvious that this code ignores all intrinsics. Noticed by inspection!

llvm-svn: 147037
2011-12-21 06:06:30 +00:00
Nick Lewycky
9adbd36737 Make some intrinsics safe to speculatively execute.
llvm-svn: 147036
2011-12-21 05:52:02 +00:00
Jakub Staszak
6f3beda2b4 Add some constness to BranchProbabilityInfo and BlockFrequencyInfo.
llvm-svn: 146986
2011-12-20 20:03:10 +00:00
David Blaikie
576aba04f1 Unweaken vtables as per http://llvm.org/docs/CodingStandards.html#ll_virtual_anch
llvm-svn: 146960
2011-12-20 02:50:00 +00:00
Andrew Trick
a2555fd695 LSR: Fix another corner case in expansion of postinc users.
Fixes PR11571: Instruction does not dominate all uses

llvm-svn: 146950
2011-12-20 01:42:24 +00:00
Joerg Sonnenberger
8cf8d64d19 Allow inlining of functions with returns_twice calls, if they have the
attribute themselves.

llvm-svn: 146851
2011-12-18 20:35:43 +00:00
Eric Christopher
38b0b94ef2 When recursing for the original size of a type, stop if we are at a
pointer or a reference type - we actually just want the size of the
pointer then for that.

Fixes rdar://10335756

llvm-svn: 146785
2011-12-16 23:42:45 +00:00
Devang Patel
1b525a0c23 In DICompositeType, the reference to the derived type is either metadata or null.
llvm-svn: 146744
2011-12-16 17:51:31 +00:00
Devang Patel
9578694f5b Virtual table holder field is either metadata or null.
llvm-svn: 146665
2011-12-15 17:55:56 +00:00
Dan Gohman
1add31cc93 Move Instruction::isSafeToSpeculativelyExecute out of VMCore and
into Analysis as a standalone function, since there's no need for
it to be in VMCore. Also, update it to use isKnownNonZero and
other goodies available in Analysis, making it more precise,
enabling more aggressive optimization.

llvm-svn: 146610
2011-12-14 23:49:11 +00:00
Andrew Trick
9c88f32f94 LSR: Fold redundant bitcasts on-the-fly.
llvm-svn: 146597
2011-12-14 22:07:19 +00:00
Eli Friedman
760c0f359a Fix a stupid typo in MemDepPrinter.
llvm-svn: 146549
2011-12-14 02:54:39 +00:00
Daniel Dunbar
b72534060e LLVMBuild: Introduce a common section which currently has a list of the
subdirectories to traverse into.
 - Originally I wanted to avoid this and just autoscan, but this has one key
   flaw in that new subdirectories can not automatically trigger a rerun of the
   llvm-build tool. This is particularly a pain when switching back and forth
   between trees where one has added a subdirectory, as the dependencies will
   tend to be wrong. This will also eliminate the FIXME implicitly.

llvm-svn: 146436
2011-12-12 22:45:54 +00:00
Daniel Dunbar
30d6a45140 LLVMBuild: Remove trailing newline, which irked me.
llvm-svn: 146409
2011-12-12 19:48:00 +00:00
Chandler Carruth
083a91fab1 Switch llvm.cttz and llvm.ctlz to accept a second i1 parameter which
indicates whether the intrinsic has a defined result for a first
argument equal to zero. This will eventually allow these intrinsics to
accurately model the semantics of GCC's __builtin_ctz and __builtin_clz
and the X86 instructions (prior to AVX) which implement them.

This patch merely sets the stage by extending the signature of these
intrinsics and establishing auto-upgrade logic so that the old spelling
still works both in IR and in bitcode. The upgrade logic preserves the
existing (inefficient) semantics. This patch should not change any
behavior. CodeGen isn't updated because it can use the existing
semantics regardless of the flag's value.

Note that this will be followed by API updates to Clang and DragonEgg.

Reviewed by Nick Lewycky!

llvm-svn: 146357
2011-12-12 04:26:04 +00:00
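For reference, a minimal IR sketch of the signature change described in the commit above; the function name is made up, but the intrinsic spellings and the meaning of the i1 flag are as the commit describes them.

  ; old spelling, auto-upgraded on load:
  ;   %r = call i32 @llvm.ctlz.i32(i32 %x)
  ; new spelling: the second i1 argument says whether the result is
  ; undefined when the first argument is zero.
  declare i32 @llvm.ctlz.i32(i32, i1)
  declare i32 @llvm.cttz.i32(i32, i1)

  define i32 @count_leading(i32 %x) {
    ; i1 false = result defined at zero, preserving the old semantics
    %r = call i32 @llvm.ctlz.i32(i32 %x, i1 false)
    ret i32 %r
  }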
Chad Rosier
7096fea51c Probably not a good idea to convert a single vector load into a memcpy. We
don't do this now, but add a test case to prevent this from happening in the
future.
Additional test for rdar://9892684

llvm-svn: 145879
2011-12-06 00:19:08 +00:00
Nadav Rotem
1a91e4381d Add support for vectors of pointers.
llvm-svn: 145801
2011-12-05 06:29:09 +00:00
Benjamin Kramer
4ff0db784c Clear the new cache.
llvm-svn: 145771
2011-12-03 15:19:55 +00:00
Benjamin Kramer
ed1cd704e0 Add a "seen blocks" cache to LVI to avoid a linear scan over the whole cache just to find that there are no blocks to remove from the maps.
-15% on ARMDisassembler.cpp (Release build). It's not that great to add another
layer of caching to the caching-heavy LVI but I don't see a better way.

llvm-svn: 145770
2011-12-03 15:16:45 +00:00
Chad Rosier
d830d783e2 Add support for constant folding the pow intrinsic.
rdar://10514247

llvm-svn: 145730
2011-12-03 00:00:03 +00:00
Chad Rosier
fdca220a9e Fix a few more places where TargetData/TargetLibraryInfo is not being passed.
Add FIXMEs to places that are non-trivial to fix.

llvm-svn: 145661
2011-12-02 01:26:24 +00:00
Chad Rosier
a1a79d5df4 Abuse of mass replace isn't warranted even when the build is failing. Thanks
for the suggestion, Eric.

llvm-svn: 145643
2011-12-01 23:16:03 +00:00
Chad Rosier
c9879f3554 Fix build by not assuming TLI is guaranteed. Will have to track down cases where
TLI isn't being passed to ensure we don't miss opportunities to fold calls.

llvm-svn: 145641
2011-12-01 22:38:31 +00:00
Chad Rosier
4d25975a28 Prevent library calls from being folded if -fno-builtin has been specified.
rdar://10500969

llvm-svn: 145639
2011-12-01 22:14:50 +00:00
Chad Rosier
0b4bd4832a Last bit of TargetLibraryInfo propagation. Also fixed a case for TargetData
where it appeared beneficial to pass.
More of rdar://10500969

llvm-svn: 145630
2011-12-01 21:29:16 +00:00
Chad Rosier
49a66381f7 Propagate TargetLibraryInfo throughout ConstantFolding.cpp and
InstructionSimplify.cpp.  Other fixups as needed.
Part of rdar://10500969

llvm-svn: 145559
2011-12-01 03:08:23 +00:00
Nick Lewycky
4047e7c476 Make use of "getScalarType()". No functionality change.
llvm-svn: 145556
2011-12-01 02:39:36 +00:00
Andrew Trick
247f749767 LSR: handle the expansion of phi operands that use postinc forms of the IV.
Fixes PR11431: SCEVExpander::expandAddRecExprLiterally(const llvm::SCEVAddRecExpr*): Assertion `(!isa<Instruction>(Result) || SE.DT->dominates(cast<Instruction>(Result), Builder.GetInsertPoint())) && "postinc expansion does not dominate use"' failed.

llvm-svn: 145482
2011-11-30 06:07:54 +00:00
Daniel Dunbar
4e00f5f8fd build/CMake: Finish removal of add_llvm_library_dependencies.
llvm-svn: 145420
2011-11-29 19:25:30 +00:00
Duncan Sands
97cc6da56c Fix a theoretical problem (not seen in the wild): if different instances of a
weak variable are compiled by different compilers, such as GCC and LLVM, while
LLVM may increase the alignment to the preferred alignment there is no reason to
think that GCC will use anything more than the ABI alignment.  Since it is the
GCC version that might end up in the final program (as the linkage is weak), it
is wrong to increase the alignment of loads from the global up to the preferred
alignment as the alignment might only be the ABI alignment.

Increasing alignment up to the ABI alignment might be OK, but I'm not totally
convinced that it is.  It seems better to just leave the alignment of weak
globals alone.

llvm-svn: 145413
2011-11-29 18:26:38 +00:00
Andrew Trick
63f81b112e SCEV fix. In general, Add/Mul expressions should not inherit NSW/NUW.
This reverts r139450, fixes r139453, and adds much needed comments and a
unit test.

llvm-svn: 145367
2011-11-29 02:16:38 +00:00
Andrew Trick
975fe6c09b Make SCEV print <nsw><nuw> for Add/MulExpr.
llvm-svn: 145364
2011-11-29 02:06:35 +00:00
Eli Friedman
473a76a0df Make SelectionDAG::InferPtrAlignment use llvm::ComputeMaskedBits instead of duplicating the logic for globals. Make llvm::ComputeMaskedBits handle GlobalVariables slightly more aggressively, to match what InferPtrAlignment knew how to do.
llvm-svn: 145304
2011-11-28 22:48:22 +00:00
Andrew Trick
8c051c1949 Remove the temporary flag -disable-unroll-scev and dead code.
SCEV should now be used for trip count analysis, not LoopInfo.

llvm-svn: 145262
2011-11-28 19:22:09 +00:00
Benjamin Kramer
d861d825f2 Move code into anonymous namespaces.
llvm-svn: 145154
2011-11-26 23:01:57 +00:00
Benjamin Kramer
f7f6231b4b Validate the return type when checking if a function is malloc.
Fixes PR11426. Not sure if a test case with a "wrong" malloc would be useful.

llvm-svn: 145106
2011-11-23 17:58:47 +00:00
Duncan Sands
3c1878ef53 Fix a crash in which a multiplication was being reported as being both negative
and positive: positive, because it could be directly computed to be positive;
negative, because the nsw flags means it is either negative or undefined (the
multiplication always overflowed).

llvm-svn: 145104
2011-11-23 16:26:47 +00:00
Nick Lewycky
566ea855fd Fix crasher in GVN due to my recent capture tracking changes.
llvm-svn: 145047
2011-11-21 19:42:56 +00:00
Nick Lewycky
1eb3e73f1d Add virtual destructor. Whoops!
llvm-svn: 145044
2011-11-21 18:32:21 +00:00
Nick Lewycky
5cadaed864 Less template, more virtual! Refactoring suggested by Chris in code review.
llvm-svn: 145014
2011-11-20 19:37:06 +00:00
Nick Lewycky
39c6f0a5d5 Refactor code to use new attribute getters on CallSite for NoCapture and ByVal.
Suggested in code review by Eli.

That code in InstCombine looks kinda suspicious.

llvm-svn: 145013
2011-11-20 19:09:04 +00:00
Benjamin Kramer
fbd156b1d5 SCEV: Actually set overflow flags on add expressions.
setFlags doesn't modify its arguments.

llvm-svn: 145007
2011-11-20 10:24:36 +00:00
Andrew Trick
fe5f7fc3b8 Fix a corner case in updating LoopInfo after fully unrolling an outer loop.
The loop tree's inclusive block lists are painful and expensive to
update. (I have no idea why they're inclusive). The design was
supposed to handle this case but the implementation missed it and my
unit tests weren't thorough enough.

Fixes PR11335: loop unroll update.

llvm-svn: 144970
2011-11-18 03:42:41 +00:00
Andrew Trick
fe618116fc Fix SCEV overly optimistic back edge taken count for multi-exit loops.
Fixes PR11375: Different results for 'clang++ huh.cpp'...

llvm-svn: 144746
2011-11-16 00:52:40 +00:00
Benjamin Kramer
de467fc4df Missed some users of Value::getNameStr.
llvm-svn: 144656
2011-11-15 18:30:06 +00:00
Benjamin Kramer
a2f57dee6d Remove all remaining uses of Value::getNameStr().
llvm-svn: 144648
2011-11-15 16:27:03 +00:00
Benjamin Kramer
3eeef2e739 Twinify GraphWriter a little bit.
llvm-svn: 144647
2011-11-15 16:26:38 +00:00
Nick Lewycky
a0b2f7ca1d Refactor capture tracking (which already had a couple flags for whether returns
and stores capture) to permit the caller to see each capture point and decide
whether to continue looking.

Use this inside memdep to do an analysis that basicaa won't do. This lets us
solve another devirtualization case, fixing PR8908!

llvm-svn: 144580
2011-11-14 22:49:42 +00:00
Nick Lewycky
772024a00d Don't try to loop on iterators that are potentially invalidated inside the loop. Fixes PR11361!
llvm-svn: 144454
2011-11-12 03:09:12 +00:00
Nick Lewycky
bb1e607255 Fix typo in comment.
llvm-svn: 144236
2011-11-09 22:45:04 +00:00
Nick Lewycky
c08fa4916a Don't forget to check FlagNW when determining whether an AddRecExpr will wrap
or not. Patch by Brendon Cahoon!

llvm-svn: 144173
2011-11-09 07:11:37 +00:00
Eli Friedman
6bda990650 Fix code to match comment. Fixes PR11340, a regression from r143209.
llvm-svn: 144121
2011-11-08 21:08:02 +00:00
Dan Gohman
19a8523a2f Teach instsimplify to simplify calls to undef.
llvm-svn: 143719
2011-11-04 18:32:42 +00:00
Daniel Dunbar
3760ebeebb build: Add initial cut at LLVMBuild.txt files.
llvm-svn: 143634
2011-11-03 18:53:17 +00:00
Duncan Sands
1077c1fa88 Reapply commit 143214 with a fix: m_ICmp doesn't match conditions
with the given predicate, it matches any condition and returns the
predicate - d'oh!  Original commit message:
The expression icmp eq (select (icmp eq x, 0), 1, x), 0 folds to false.
Spotted by my super-optimizer in 186.crafty and 450.soplex.  We really
need a proper infrastructure for handling generalizations of this kind
of thing (which occur a lot), however this case is so simple that I decided
to go ahead and implement it directly.

llvm-svn: 143318
2011-10-30 19:56:36 +00:00
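The fold from the reapplied commit above, written out as a small IR function for clarity (a sketch, not the commit's own test case):

  define i1 @select_fold(i32 %x) {
    %cond = icmp eq i32 %x, 0
    %sel  = select i1 %cond, i32 1, i32 %x   ; 1 when %x is 0, otherwise %x, so never 0
    %res  = icmp eq i32 %sel, 0
    ret i1 %res                              ; simplifies to: ret i1 false
  }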
Eli Friedman
7c9bef9ba8 Revert r143214; it's breaking a bunch of stuff.
llvm-svn: 143265
2011-10-29 00:56:07 +00:00
Duncan Sands
7791a854c3 The expression icmp eq (select (icmp eq x, 0), 1, x), 0 folds to false.
Spotted by my super-optimizer in 186.crafty and 450.soplex.  We really
need a proper infrastructure for handling generalizations of this kind
of thing (which occur a lot), however this case is so simple that I decided
to go ahead and implement it directly.

llvm-svn: 143214
2011-10-28 19:01:20 +00:00
Duncan Sands
3483c23658 A shift of a power of two is a power of two or zero.
For completeness - not spotted in the wild.

llvm-svn: 143211
2011-10-28 18:30:05 +00:00
Duncan Sands
5730fe6a31 Fold icmp ugt (udiv X, Y), X to false. Spotted by my super-optimizer
in 186.crafty.

llvm-svn: 143209
2011-10-28 18:17:44 +00:00
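Again as a small IR sketch (mine, not from the commit): unsigned division can never produce a value larger than its dividend, so the comparison is always false.

  define i1 @udiv_ugt(i32 %x, i32 %y) {
    %d = udiv i32 %x, %y
    %r = icmp ugt i32 %d, %x
    ret i1 %r                   ; simplifies to: ret i1 false
  }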
Duncan Sands
ca325638c8 Reapply commit 143028 with a fix: the problem was casting a ConstantExpr Mul
using BinaryOperator (which only works for instructions) when it should have
been a cast to OverflowingBinaryOperator (which also works for constants).
While there, correct a few other dubious looking uses of BinaryOperator.
Thanks to Chad Rosier for the testcase.  Original commit message:
My super-optimizer noticed that we weren't folding this expression to
true: (x *nsw x) sgt 0, where x = (y | 1).  This occurs in 464.h264ref.

llvm-svn: 143125
2011-10-27 19:16:21 +00:00
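The expression from the commit above, spelled out as IR (a sketch of mine): %x is odd and therefore nonzero, and the nsw flag guarantees the square does not wrap, so it is strictly positive.

  define i1 @odd_square_positive(i32 %y) {
    %x = or i32 %y, 1             ; odd, hence nonzero
    %m = mul nsw i32 %x, %x       ; no signed wrap: this is the true square
    %r = icmp sgt i32 %m, 0
    ret i1 %r                     ; simplifies to: ret i1 true
  }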
Bob Wilson
2ca603d9b7 Revert Duncan's r143028 expression folding which appears to be the culprit
behind a compile failure on 483.xalancbmk.

llvm-svn: 143102
2011-10-27 15:47:25 +00:00
Duncan Sands
5c8fa99c32 The maximum power of 2 dividing a power of 2 is itself. This occurs
in 403.gcc and was spotted by my super-optimizer.

llvm-svn: 143054
2011-10-26 20:55:21 +00:00
Duncan Sands
c463f54342 My super-optimizer noticed that we weren't folding this expression to
true: (x *nsw x) sgt 0, where x = (y | 1).  This occurs in 464.h264ref.

llvm-svn: 143028
2011-10-26 15:31:51 +00:00
Duncan Sands
be9c2e6e13 Restore commits 142790 and 142843 - they weren't breaking the build
bots.  Original commit messages:
- Reapply r142781 with fix. Original message:

  Enhance SCEV's brute force loop analysis to handle multiple PHI nodes in the
  loop header when computing the trip count.

  With this, we now constant evaluate:
    struct ListNode { const struct ListNode *next; int i; };
    static const struct ListNode node1 = {0, 1};
    static const struct ListNode node2 = {&node1, 2};
    static const struct ListNode node3 = {&node2, 3};
    int test() {
      int sum = 0;
      for (const struct ListNode *n = &node3; n != 0; n = n->next)
        sum += n->i;
      return sum;
    }

- Now that we look at all the header PHIs, we need to consider all the header PHIs
  when deciding that the loop has stopped evolving. Fixes miscompile in the gcc
  torture testsuite!

llvm-svn: 142919
2011-10-25 12:28:52 +00:00
Chandler Carruth
3cbbc35715 Fix the API usage in loop probability heuristics. It was incorrectly
classifying many edges as exiting which were in fact not. These mainly
formed edges into sub-loops. It was also not correctly classifying all
returning edges out of loops as leaving the loop. With this match most
of the loop heuristics are more rational.

Several serious regressions on loop-intensive benchmarks like perlbench's
loop tests when built with -enable-block-placement are fixed by these
updated heuristics. Unfortunately they in turn uncover some other
regressions. There are still several improvements that should be made to
loop heuristics, including trip-count and early back-edge management.

llvm-svn: 142917
2011-10-25 09:47:41 +00:00
Duncan Sands
da835efa2a Speculatively revert commits 142790 and 142843 to see if it fixes
the dragonegg and llvm-gcc self-host buildbots.  Original commit
messages:
- Reapply r142781 with fix. Original message:

  Enhance SCEV's brute force loop analysis to handle multiple PHI nodes in the
  loop header when computing the trip count.

  With this, we now constant evaluate:
    struct ListNode { const struct ListNode *next; int i; };
    static const struct ListNode node1 = {0, 1};
    static const struct ListNode node2 = {&node1, 2};
    static const struct ListNode node3 = {&node2, 3};
    int test() {
      int sum = 0;
      for (const struct ListNode *n = &node3; n != 0; n = n->next)
        sum += n->i;
      return sum;
    }

- Now that we look at all the header PHIs, we need to consider all the header PHIs
when deciding that the loop has stopped evolving. Fixes miscompile in the gcc
torture testsuite!

llvm-svn: 142916
2011-10-25 09:26:43 +00:00
Nick Lewycky
289c30130a Now that we look at all the header PHIs, we need to consider all the header PHIs
when deciding that the loop has stopped evolving. Fixes miscompile in the gcc
torture testsuite!

llvm-svn: 142843
2011-10-24 21:02:38 +00:00
Chandler Carruth
d04f838629 Remove return heuristics from the static branch probabilities, and
introduce no-return or unreachable heuristics.

The return heuristics from the Ball and Larus paper don't work well in
practice as they pessimize early return paths. The only good hitrate
return heuristics are those for:
 - NULL return
 - Constant return
 - negative integer return

Only the last of these three can possibly require significant code for
the returning block, and even the last is fairly rare and usually also
a constant. As a consequence, even for the cold return paths, there is
little code on that return path, and so little code density to be gained
by sinking it. The places where sinking these blocks is valuable (inner
loops) will already be weighted appropriately as the edge is a loop-exit
branch.

All of this aside, early returns are nearly as common as all three of
these return categories, and should actually be predicted as taken!
Rather than muddy the waters of the static predictions, just remain
silent on returns and let the CFG itself dictate any layout or other
issues.

However, the return heuristic was flagging one very important case:
unreachable. Unfortunately it still gave a 1/4 chance of the
branch-to-unreachable occurring. It also didn't do a rigorous job of
finding those blocks which post-dominate an unreachable block.

This patch builds a more powerful analysis that should flag all branches
to blocks known to then reach unreachable. It also has better worst-case
runtime complexity by not looping through successors for each block. The
previous code would perform an N^2 walk in the event of a single entry
block branching to N successors with a switch where each successor falls
through to the next and they finally fall through to a return.

Test case added for noreturn heuristics. Also doxygen comments improved
along the way.

llvm-svn: 142793
2011-10-24 12:01:08 +00:00
Nick Lewycky
64d4e26aec Reapply r142781 with fix. Original message:
Enhance SCEV's brute force loop analysis to handle multiple PHI nodes in the
  loop header when computing the trip count.

  With this, we now constant evaluate:
    struct ListNode { const struct ListNode *next; int i; };
    static const struct ListNode node1 = {0, 1};
    static const struct ListNode node2 = {&node1, 2};
    static const struct ListNode node3 = {&node2, 3};
    int test() {
      int sum = 0;
      for (const struct ListNode *n = &node3; n != 0; n = n->next)
        sum += n->i;
      return sum;
    }

llvm-svn: 142790
2011-10-24 06:57:05 +00:00
Nick Lewycky
341bf1548e PHI nodes not in the loop header aren't part of the loop iteration initial
state. Furthermore, they might not have two operands. This fixes the underlying
issue behind the crashes introduced in r142781.

llvm-svn: 142788
2011-10-24 05:51:01 +00:00
Nick Lewycky
d72de74587 Speculatively revert r142781. Bots are showing
Assertion `i_nocapture < OperandTraits<PHINode>::operands(this) && "getOperand() out of range!"' failed.
coming out of indvars.

llvm-svn: 142786
2011-10-24 04:00:25 +00:00
Chandler Carruth
c5722dbec0 Simplify the design of BranchProbabilityInfo by collapsing it into
a single class. Previously it was split between two classes, one
internal and one external. The concern seemed to center around exposing
the weights used, but those can remain confined to the implementation
file.

Having a single class to maintain the state and analyses in use will
also simplify several of the enhancements I want to make to our static
heuristics.

llvm-svn: 142783
2011-10-24 01:40:45 +00:00
Nick Lewycky
5ab7948d71 Enhance SCEV's brute force loop analysis to handle multiple PHI nodes in the
loop header when computing the trip count.

With this, we now constant evaluate:
  struct ListNode { const struct ListNode *next; int i; };
  static const struct ListNode node1 = {0, 1};
  static const struct ListNode node2 = {&node1, 2};
  static const struct ListNode node3 = {&node2, 3};
  int test() {
    int sum = 0;
    for (const struct ListNode *n = &node3; n != 0; n = n->next)
      sum += n->i;
    return sum;
  }

llvm-svn: 142781
2011-10-23 23:43:14 +00:00
Chandler Carruth
93ee01158c Tidy up a loop to be more idiomatic for LLVM's codebase, and remove some
extraneous whitespace. Trying to clean-up this pass as much as I can
before I start making functional changes.

llvm-svn: 142780
2011-10-23 22:40:13 +00:00
Chandler Carruth
151d4fc273 Teach the BranchProbabilityInfo pass to print its results, and use that
to bring it under direct test instead of merely indirectly testing it in
the BlockFrequencyInfo pass.

The next step is to start adding tests for the various heuristics
employed, and to start fixing those heuristics once they're under test.

llvm-svn: 142778
2011-10-23 21:21:50 +00:00
Benjamin Kramer
9adc582e35 Add compare operators to BranchProbability and use it to determine if an edge is hot.
llvm-svn: 142751
2011-10-23 11:19:14 +00:00