1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-31 07:52:55 +01:00
Commit Graph

139 Commits

Author SHA1 Message Date
Patrik Hägglund
51776725b8 Fix the inliner so that the optsize function attribute don't alter the
inline threshold if the global inline threshold is lower (as for -Oz).

Reviewed by Chandler Carruth and Bill Wendling.

llvm-svn: 157323
2012-05-23 13:42:57 +00:00
Chandler Carruth
020a15db9d Sink the collection of return instructions until after *all*
simplification has been performed. This is a bit less efficient
(requires another ilist walk of the basic blocks) but shouldn't matter
in practice. More importantly, it's just too much work to keep track of
all the various ways the return instructions can be mutated while
simplifying them. This fixes yet another crasher, reported by Daniel
Dunbar.

llvm-svn: 154179
2012-04-06 17:21:31 +00:00
Chandler Carruth
352f98dd1e Tweak this test to ensure the inliner did indeed fire. Thanks to Richard
Smith for pointing this out in review.

llvm-svn: 154178
2012-04-06 17:21:28 +00:00
Chandler Carruth
bd8f18f828 Actually finish this sentence in the comment the way I intended. Thanks
Matt for pointing this out.

llvm-svn: 154158
2012-04-06 01:19:38 +00:00
Chandler Carruth
dc52b30dac Sink the return instruction collection until after we're done deleting
dead code, including dead return instructions in some cases. Otherwise,
we end up having a bogus poniter to a return instruction that blows up
much further down the road.

It turns out that this pattern is both simpler to code, easier to update
in the face of enhancements to the inliner cleanup, and likely cheaper
given that it won't add dead instructions to the list.

Thanks to John Regehr's numerous test cases for teasing this out.

llvm-svn: 154157
2012-04-06 01:11:52 +00:00
Chandler Carruth
bee52a9371 Add some more testing to cover the remaining two cases where
always-inlining is disabled: recursive functions and indirectbr.

llvm-svn: 153833
2012-04-01 10:36:17 +00:00
Chandler Carruth
1a2234d527 Fix a pretty scary bug I introduced into the always inliner with
a single missing character. Somehow, this had gone untested. I've added
tests for returns-twice logic specifically with the always-inliner that
would have caught this, and fixed the bug.

Thanks to Matt for the careful review and spotting this!!! =D

llvm-svn: 153832
2012-04-01 10:21:05 +00:00
Chandler Carruth
0d7e05e4f9 Replace four tiny tests with various uses of grep and not with a single
test and FileCheck.

llvm-svn: 153831
2012-04-01 10:11:17 +00:00
Chandler Carruth
8cacff57bf Initial commit for the rewrite of the inline cost analysis to operate
on a per-callsite walk of the called function's instructions, in
breadth-first order over the potentially reachable set of basic blocks.

This is a major shift in how inline cost analysis works to improve the
accuracy and rationality of inlining decisions. A brief outline of the
algorithm this moves to:

- Build a simplification mapping based on the callsite arguments to the
  function arguments.
- Push the entry block onto a worklist of potentially-live basic blocks.
- Pop the first block off of the *front* of the worklist (for
  breadth-first ordering) and walk its instructions using a custom
  InstVisitor.
- For each instruction's operands, re-map them based on the
  simplification mappings available for the given callsite.
- Compute any simplification possible of the instruction after
  re-mapping, and store that back int othe simplification mapping.
- Compute any bonuses, costs, or other impacts of the instruction on the
  cost metric.
- When the terminator is reached, replace any conditional value in the
  terminator with any simplifications from the mapping we have, and add
  any successors which are not proven to be dead from these
  simplifications to the worklist.
- Pop the next block off of the front of the worklist, and repeat.
- As soon as the cost of inlining exceeds the threshold for the
  callsite, stop analyzing the function in order to bound cost.

The primary goal of this algorithm is to perfectly handle dead code
paths. We do not want any code in trivially dead code paths to impact
inlining decisions. The previous metric was *extremely* flawed here, and
would always subtract the average cost of two successors of
a conditional branch when it was proven to become an unconditional
branch at the callsite. There was no handling of wildly different costs
between the two successors, which would cause inlining when the path
actually taken was too large, and no inlining when the path actually
taken was trivially simple. There was also no handling of the code
*path*, only the immediate successors. These problems vanish completely
now. See the added regression tests for the shiny new features -- we
skip recursive function calls, SROA-killing instructions, and high cost
complex CFG structures when dead at the callsite being analyzed.

Switching to this algorithm required refactoring the inline cost
interface to accept the actual threshold rather than simply returning
a single cost. The resulting interface is pretty bad, and I'm planning
to do lots of interface cleanup after this patch.

Several other refactorings fell out of this, but I've tried to minimize
them for this patch. =/ There is still more cleanup that can be done
here. Please point out anything that you see in review.

I've worked really hard to try to mirror at least the spirit of all of
the previous heuristics in the new model. It's not clear that they are
all correct any more, but I wanted to minimize the change in this single
patch, it's already a bit ridiculous. One heuristic that is *not* yet
mirrored is to allow inlining of functions with a dynamic alloca *if*
the caller has a dynamic alloca. I will add this back, but I think the
most reasonable way requires changes to the inliner itself rather than
just the cost metric, and so I've deferred this for a subsequent patch.
The test case is XFAIL-ed until then.

As mentioned in the review mail, this seems to make Clang run about 1%
to 2% faster in -O0, but makes its binary size grow by just under 4%.
I've looked into the 4% growth, and it can be fixed, but requires
changes to other parts of the inliner.

llvm-svn: 153812
2012-03-31 12:42:41 +00:00
Chandler Carruth
385a981fde Clean up the naming in this test. Someone pointed this out in review at
one point, and I forgot to go back and clean it up. Sorry about that. =/

llvm-svn: 153801
2012-03-31 10:38:48 +00:00
Chandler Carruth
15d7a6e00c FileCheck-ize this test, and generally tidy it up prior to changing
things around.

llvm-svn: 153799
2012-03-31 09:22:33 +00:00
Chandler Carruth
e507ddfe74 Switch to WeakVHs in the value mapper, and aggressively prune dead basic
blocks in the function cloner. This removes the last case of trivially
dead code that I've been seeing in the wild getting inlined, analyzed,
re-inlined, optimized, only to be deleted. Nukes a FIXME from the
cleanup tests.

llvm-svn: 153572
2012-03-28 08:38:27 +00:00
Chandler Carruth
fc1ee5b5d6 Teach the function cloner (and thus the inliner) to simplify PHINodes
aggressively. There are lots of dire warnings about this being expensive
that seem to predate switching to the TrackingVH-based value remapper
that is automatically updated on RAUW. This makes it easy to not just
prune single-entry PHIs, but to fully simplify PHIs, and to recursively
simplify the newly inlined code to propagate PHINode simplifications.

This introduces a bit of a thorny problem though. We may end up
simplifying a branch condition to a constant when we fold PHINodes, and
we would like to nuke any dead blocks resulting from this so that time
isn't wasted continually analyzing them, but this isn't easy. Deleting
basic blocks *after* they are fully cloned and mapped into the new
function currently requires manually updating the value map. The last
piece of the simplification-during-inlining puzzle will require either
switching to WeakVH mappings or some other piece of refactoring. I've
left a FIXME in the testcase about this.

llvm-svn: 153410
2012-03-25 10:34:54 +00:00
Chandler Carruth
2a69d3eac5 Move the instruction simplification of callsite arguments in the inliner
to instead rely on much more generic and powerful instruction
simplification in the function cloner (and thus inliner).

This teaches the pruning function cloner to use instsimplify rather than
just the constant folder to fold values during cloning. This can
simplify a large number of things that constant folding alone cannot
begin to touch. For example, it will realize that 'or' and 'and'
instructions with certain constant operands actually become constants
regardless of what their other operand is. It also can thread back
through the caller to perform simplifications that are only possible by
looking up a few levels. In particular, GEPs and pointer testing tend to
fold much more heavily with this change.

This should (in some cases) have a positive impact on compile times with
optimizations on because the inliner itself will simply avoid cloning
a great deal of code. It already attempted to prune proven-dead code,
but now it will be use the stronger simplifications to prove more code
dead.

llvm-svn: 153403
2012-03-25 04:03:40 +00:00
Chandler Carruth
c626d97320 FileCheck-ize this test. Note the FIXME I've introduced here: we've
regressed seriously here, we are no longer removing allocas during
inline cleanup. This appears to be because of lifetime markers "using"
them. =/ I'll look into this shortly.

llvm-svn: 153394
2012-03-24 21:24:19 +00:00
Chandler Carruth
e0a21944a1 Rip out support for 'llvm.noinline'. This thing has a strange history...
It was added in 2007 as the first cut at supporting no-inline
attributes, but we didn't have function attributes of any form at the
time. However, it was added without any mention in the LangRef or other
documentation.

Later on, in 2008, Devang added function notes for 'inline=never' and
then turned them into proper function attributes. From that point
onward, as far as I can tell, the world moved on, and no one has touched
'llvm.noinline' in any meaningful way since.

It's time has now come. We have had better mechanisms for doing this for
a long time, all the frontends I'm aware of use them, and this is just
holding back progress. Given that it was never a documented feature of
the IR, I've provided no auto-upgrade support. If people know of real,
in-the-wild bitcode that relies on this, yell at me and I'll add it, but
I *seriously* doubt anyone cares.

llvm-svn: 152904
2012-03-16 06:10:15 +00:00
Chandler Carruth
889ecbc0f8 Extend the inline cost calculation to account for bonuses due to
correlated pairs of pointer arguments at the callsite. This is designed
to recognize the common C++ idiom of begin/end pointer pairs when the
end pointer is a constant offset from the begin pointer. With the
C-based idiom of a pointer and size, the inline cost saw the constant
size calculation, and this provides the same level of information for
begin/end pairs.

In order to propagate this information we have to search for candidate
operations on a pair of pointer function arguments (or derived from
them) which would be simplified if the pointers had a known constant
offset. Then the callsite analysis looks for such pointer pairs in the
argument list, and applies the appropriate bonus.

This helps LLVM detect that half of bounds-checked STL algorithms
(such as hash_combine_range, and some hybrid sort implementations)
disappear when inlined with a constant size input. However, it's not
a complete fix due the inaccuracy of our cost metric for constants in
general. I'm looking into that next.

Benchmarks showed no significant code size change, and very minor
performance changes. However, specific code such as hashing is showing
significantly cleaner inlining decisions.

llvm-svn: 152752
2012-03-14 23:19:53 +00:00
Chandler Carruth
015ff468c2 When inlining a function and adding its inner call sites to the
candidate set for subsequent inlining, try to simplify the arguments to
the inner call site now that inlining has been performed.

The goal here is to propagate and fold constants through deeply nested
call chains. Without doing this, we loose the inliner bonus that should
be applied because the arguments don't match the exact pattern the cost
estimator uses.

Reviewed on IRC by Benjamin Kramer.

llvm-svn: 152556
2012-03-12 11:19:33 +00:00
Chandler Carruth
98464723a5 FileCheck-ize this test.
llvm-svn: 152554
2012-03-12 11:19:28 +00:00
Chandler Carruth
63f95ab839 Undo a previous restriction on the inline cost calculation which Nick
introduced. Specifically, there are cost reductions for all
constant-operand icmp instructions against an alloca, regardless of
whether the alloca will in fact be elligible for SROA. That means we
don't want to abort the icmp reduction computation when we abort the
SROA reduction computation. That in turn frees us from the need to keep
a separate worklist and defer the ICmp calculations.

Use this new-found freedom and some judicious function boundaries to
factor the innards of computing the cost factor of any given instruction
out of the loop over the instructions and into static helper functions.
This greatly simplifies the code, and hopefully makes it more clear what
is happening here.

Reviewed by Eric Christopher. There is some concern that we'd like to
ensure this doesn't get out of hand, and I plan to benchmark the effects
of this change over the next few days along with some further fixes to
the inline cost.

llvm-svn: 152368
2012-03-09 02:49:36 +00:00
Eli Bendersky
4afdeeb682 Replace all instances of dg.exp file with lit.local.cfg, since all tests are run with LIT now and now Dejagnu. dg.exp is no longer needed.
Patch reviewed by Daniel Dunbar. It will be followed by additional cleanup patches.

llvm-svn: 150664
2012-02-16 06:28:33 +00:00
Bill Wendling
7761976036 Remove all references to the old EH.
There was always the current EH. -- Ministry of Truth

llvm-svn: 149335
2012-01-31 02:09:07 +00:00
Bill Wendling
76beba7841 Update test to new EH model.
llvm-svn: 149333
2012-01-31 02:05:13 +00:00
Nick Lewycky
7693b940b6 Support pointer comparisons against constants, when looking at the inline-cost
savings from a pointer argument becoming an alloca. Sometimes callees will even
compare a pointer to null and then branch to an otherwise unreachable block!
Detect these cases and compute the number of saved instructions, instead of
bailing out and reporting no savings.

llvm-svn: 148941
2012-01-25 08:27:40 +00:00
Nick Lewycky
55814f0b32 Fix CountCodeReductionForAlloca to more accurately represent what SROA can and
can't handle. Also don't produce non-zero results for things which won't be
transformed by SROA at all just because we saw the loads/stores before we saw
the use of the address.

llvm-svn: 148536
2012-01-20 08:35:20 +00:00
Joerg Sonnenberger
8cf8d64d19 Allow inlining of functions with returns_twice calls, if they have the
attribute themselve.

llvm-svn: 146851
2011-12-18 20:35:43 +00:00
Chris Lattner
9d1e8420ff Upgrade syntax of tests using volatile instructions to use 'load volatile' instead of 'volatile load', which is archaic.
llvm-svn: 145171
2011-11-27 06:54:59 +00:00
Eli Friedman
5012ac7cc0 Remap blockaddress correctly when inlining a function. Fixes PR10162.
llvm-svn: 142684
2011-10-21 20:45:19 +00:00
Bill Wendling
47fa03b39d Replace more uses of 'unwind' in the tests with calls to landingpad and
resume. Note that some of these tests were basically dead.

llvm-svn: 140076
2011-09-19 22:11:35 +00:00
Bill Wendling
e821eb23cd This testcase is dead. It doesn't inline even if I add the 'alwaysinline'
attribute to the @foo function.

llvm-svn: 140067
2011-09-19 21:14:33 +00:00
Bill Wendling
145872a92c Try to eliminate the use of the 'unwind' instruction.
llvm-svn: 139046
2011-09-02 22:41:11 +00:00
Bill Wendling
a96c532f67 Update to new EH scheme.
llvm-svn: 138989
2011-09-02 01:25:11 +00:00
Bill Wendling
68ada987a1 Update to new EH scheme.
llvm-svn: 138928
2011-09-01 01:08:21 +00:00
Bill Wendling
b0412973be Auto upgrade the old EH scheme to use the new one. This is on a trial basis. If
things to disasterously over night, this can be reverted.

llvm-svn: 138702
2011-08-27 06:11:03 +00:00
Chris Lattner
ad5400fa72 rip out a ton of intrinsic modernization logic from AutoUpgrade.cpp, which is
for pre-2.9 bitcode files.  We keep x86 unaligned loads, movnt, crc32, and the
target indep prefetch change.

As usual, updating the testsuite is a PITA.

llvm-svn: 133337
2011-06-18 06:05:24 +00:00
Chris Lattner
9e7c036d09 remove parser support for the obsolete "multiple return values" syntax, which
was replaced with return of a "first class aggregate".

llvm-svn: 133245
2011-06-17 06:49:41 +00:00
John McCall
806ec47668 SplitCriticalEdge can sometimes split the edge from an invoke to a landing
pad, separating the exception and selector calls from the new lpad.  Teaching
it not to do that, or to properly adjust the CFG afterwards, is out of
scope because it would require the other edges to the landing pad to be split
as well (effectively).  Instead, just recover from the most likely cases
during inlining.  The best long-term solution is to change the exception
representation and commit to either requiring or not requiring the more
complex edge-splitting logic;  this is just a shorter-term hack.

llvm-svn: 132799
2011-06-09 20:06:24 +00:00
John McCall
18921da759 First, do no harm -- even if we can't find a selector for an enclosing
landing pad, forward llvm.eh.resume calls to it instead of turning them
invalidly into invokes.

llvm-svn: 132382
2011-06-01 02:17:11 +00:00
John McCall
701020df37 Add the test case for phis in the outer landing pad during the inliner's
forwarding of eh.resume that I promised yesterday.

llvm-svn: 132307
2011-05-30 01:08:04 +00:00
John McCall
119a0222f5 Implement and document the llvm.eh.resume intrinsic, which is
transformed by the inliner into a branch to the enclosing landing pad
(when inlined through an invoke).  If not so optimized, it is lowered
DWARF EH preparation into a call to _Unwind_Resume (or _Unwind_SjLj_Resume
as appropriate).  Its chief advantage is that it takes both the
exception value and the selector value as arguments, meaning that there
is zero effort in recovering these;  however, the frontend is required
to pass these down, which is not actually particularly difficult.

Also document the behavior of landing pads a bit better, and make it
clearer that it's okay that personality functions don't always land at
landing pads.  This is just a fact of life.  Don't write optimizations that
rely on pushing things over an unwind edge.

llvm-svn: 132253
2011-05-28 07:45:59 +00:00
John McCall
2f479c4d42 Fix the inliner to maintain the current de facto invoke semantics:
- the selector for the landing pad must provide all available information
    about the handlers, filters, and cleanups within that landing pad
  - calls to _Unwind_Resume must be converted to branches to the enclosing
    lpad so as to avoid re-entering the unwinder when the lpad claimed it
    was going to handle the exception in some way
This is quite specific to libUnwind-based unwinding.  In an effort to not
interfere too badly with other unwinders, and with existing hacks in frontends,
this only triggers on _Unwind_Resume (not _Unwind_Resume_or_Rethrow) and does
nothing with selectors if it cannot find a selector call for either lpad.

llvm-svn: 132200
2011-05-27 18:34:38 +00:00
Nick Lewycky
82ddea6aaa Commit test change, forgotten as part of r131838.
llvm-svn: 131839
2011-05-22 05:31:47 +00:00
Nick Lewycky
9f7644d549 Teach the inliner to emit llvm.lifetime.start/end, to scope the local variables
of the inlinee to the code representing the original function.

llvm-svn: 131838
2011-05-22 05:22:10 +00:00
Chris Lattner
de9ec03027 relax testcase a bit.
llvm-svn: 123433
2011-01-14 07:46:33 +00:00
Chris Lattner
ba962825a4 when eliding a byval copy due to inlining a readonly function, we have
to make sure that the reused alloca has sufficient alignment.

llvm-svn: 122236
2010-12-20 08:10:40 +00:00
Chris Lattner
c0a48df9f9 pull byval processing out to its own helper function.
llvm-svn: 122235
2010-12-20 07:57:41 +00:00
Chris Lattner
029952c844 fix PR8769, a miscompilation by inliner when inlining a function with a byval
argument.  The generated alloca has to have at least the alignment of the
byval, if not, the client may be making assumptions that the new alloca won't
satisfy.

llvm-svn: 122234
2010-12-20 07:45:28 +00:00
Chris Lattner
52149d6e21 merge two tests.
llvm-svn: 122233
2010-12-20 07:39:57 +00:00
Chris Lattner
2fa128c4c5 filecheckize
llvm-svn: 122232
2010-12-20 07:38:24 +00:00
Dan Gohman
6aff5b94ff Make BasicAliasAnalysis a normal AliasAnalysis implementation which
does normal initialization and normal chaining. Change the default
AliasAnalysis implementation to NoAlias.

Update StandardCompileOpts.h and friends to explicitly request
BasicAliasAnalysis.

Update tests to explicitly request -basicaa.

llvm-svn: 116720
2010-10-18 18:04:47 +00:00