Add a missing interface to be able to call findNearestCommonDominator
for a PostDominanceTree. The function itself is already implemented in
DominatorTreeBase. The interface however was only added to the
DominatorTree class, but not the PostDominatorClass.
llvm-svn: 97915
argument is non-null, pass it along to PHITranslateSubExpr so that it can
prefer using existing values that dominate the PredBB, instead of just
blindly picking the first equivalent value that it finds on a uselist.
Also when the DominatorTree is specified, have PHITranslateValue filter
out any result that does not dominate the PredBB. This is basically just
refactoring the check that used to be in GetAvailablePHITranslatedSubExpr
and also in GVN.
Despite my initial expectations, this change does not affect the results
of GVN for any testcases that I could find, but it should help compile time.
Before this change, if PHITranslateSubExpr picked a value that does not
dominate, PHITranslateWithInsertion would then insert a new value, which GVN
would later determine to be redundant and would replace. By picking a good
value to begin with, we save GVN the extra work of inserting and then
replacing a new value.
llvm-svn: 97010
a loop exit value, so that if a loop gets deleted, ScalarEvolution
isn't stick holding on to dangling SCEVAddRecExprs for that loop. This
fixes PR6339.
llvm-svn: 96626
insert location has become an "inserted" instruction since the time
it was saved. If so, advance to the first non-"inserted" instruction.
llvm-svn: 96203
current insertion point, advance the current insertion point.
This avoids a use-before-def situation in a testcase extracted
from clang which is difficult to reduce to a reasonable-sized
regression test.
llvm-svn: 96151
bug fixes, and with improved heuristics for analyzing foreign-loop
addrecs.
This change also flattens IVUsers, eliminating the stride-oriented
groupings, which makes it easier to work with.
llvm-svn: 95975
cases, and implement target-independent folding rules for alignof and
offsetof. Also, reassociate reassociative operators when it leads to
more folding.
Generalize ScalarEvolution's isOffsetOf to recognize offsetof on
arrays. Rename getAllocSizeExpr to getSizeOfExpr, and getFieldOffsetExpr
to getOffsetOfExpr, for consistency with analagous ConstantExpr routines.
Make the target-dependent folder promote GEP array indices to
pointer-sized integers, to make implicit casting explicit and exposed
to subsequent folding.
And add a bunch of testcases for this new functionality, and a bunch
of related existing functionality.
llvm-svn: 94987
use plain SCEVUnknowns with ConstantExpr::getSizeOf and
ConstantExpr::getOffsetOf constants. This eliminates a bunch of
special-case code.
Also add code for pattern-matching these expressions, for clients that
want to recognize them.
Move ScalarEvolution's logic for expanding array and vector sizeof
expressions into an element count times the element size, to expose
the multiplication to subsequent folding, into the regular constant
folder.
llvm-svn: 94737
After running a batch of measurements, it is clear that the inliner metrics
need some adjustments:
Own argument bonus: 20 -> 5
Outgoing argument penalty: 0 -> 5
Alloca bonus: 10 -> 5
Constant instr bonus: 7 -> 5
Dead successor bonus: 40 -> 5*(avg instrs/block)
The new cost metrics are generaly 25 points higher than before, so we may need
to move thresholds.
With this change, InlineConstants::CallPenalty becomes a political correction:
if (!isa<IntrinsicInst>(II) && !callIsSmall(CS.getCalledFunction()))
NumInsts += InlineConstants::CallPenalty + CS.arg_size();
The code size is accurately modelled by CS.arg_size(). CallPenalty is added
because calls tend to take a long time, so it may not be worth it to inline a
function with lots of calls.
All of the political corrections are in the InlineConstants namespace:
IndirectCallBonus, CallPenalty, LastCallToStaticBonus, ColdccPenalty,
NoreturnPenalty.
llvm-svn: 94615
This new version is much more aggressive about doing "full" reduction in
cases where it reduces register pressure, and also more aggressive about
rewriting induction variables to count down (or up) to zero when doing so
reduces register pressure.
It currently uses fairly simplistic algorithms for finding reuse
opportunities, but it introduces a new framework allows it to combine
multiple strategies at once to form hybrid solutions, instead of doing
all full-reduction or all base+index.
llvm-svn: 94061
form of an expression. This is the expression without the
post-increment adjustment made, which is useful in determining
which registers will be used by the expansion.
llvm-svn: 93921
Nodes that had children outside of the post dominator tree (infinite loops)
where removed from the post dominator tree. This seems to be wrong. Leave them
in the tree.
llvm-svn: 93633
Move the DOTGraphTraits dotty printer/viewer templates, that were developed for
the dominance tree into their own header file. This will allow reuse in future
passes.
llvm-svn: 93632
This patch also cleans up code that expects there to be a bitcast in the first argument and testcases that call llvm.dbg.declare.
It also strips old llvm.dbg.declare intrinsics that did not pass metadata as the first argument.
llvm-svn: 93531
Compare the dominance information calculated using a dominance tree walk to the
information calculated based on DFS numbers, if XDEBUG is enabled.
llvm-svn: 92969
Remove a FIXME and unify code that was necessary to work around broken
updateDFSNumbers(). Before updateDFSNumbers() did not work correctly for post
dominators.
llvm-svn: 92968
The DFS number calculation for postdominators was broken. In the case of
multiple exits that form the post dominator root nodes, do not iterate over
all exits, but start from the virtual root node. Otherwise bbs, that are not
post dominated by any exit but by the virtual root node, will never be assigned
a DFS number.
llvm-svn: 92967
memcpy, memset and other intrinsics that only access their arguments
to be readnone if the intrinsic's arguments all point to local memory.
This improves the testcase in the README to readonly, but it could in
theory be made readnone, however this would involve more sophisticated
analysis that looks through the memcpy.
llvm-svn: 92829
instead of stored. This reduces memdep memory usage, and also eliminates a bunch of
weakvh's. This speeds up gvn on gcc.c-torture/20001226-1.c from 23.9s to 8.45s (2.8x)
on a different machine than earlier.
llvm-svn: 91885
contains another loop, or an instruction. The loop form is
substantially more efficient on large loops than the typical
code it replaces.
llvm-svn: 91654
of 91296 that caused trouble -- the Processed list needs to be
preserved for the livetime of the pass, as AddUsersIfInteresting
is called from other passes.
llvm-svn: 91641
isPodLike type trait. This is a generally useful type trait for
more than just DenseMap, and we really care about whether something
acts like a pod, not whether it really is a pod.
llvm-svn: 91421
phi translation of complex expressions like &A[i+1]. This has the
following benefits:
1. The phi translation logic is all contained in its own class with
a strong interface and verification that it is self consistent.
2. The logic is more correct than before. Previously, if intermediate
expressions got PHI translated, we'd miss the update and scan for
the wrong pointers in predecessor blocks. @phi_trans2 is a testcase
for this.
3. We have a lot less code in memdep.
We can handle phi translation across blocks of things like @phi_trans3,
which is pretty insane :).
This patch should fix the miscompiles of 255.vortex, and I tested it
with a bootstrap of llvm-gcc, llvm-test and dejagnu of course.
llvm-svn: 90926
The semantics of llvm.dbg.value are that starting from where it is executed, an offset into the specified user source variable is specified to get a new value.
An example:
call void @llvm.dbg.value(metadata !{ i32 7 }, i64 0, metadata !2)
Here the user source variable associated with metadata #2 gets the value "i32 7" at offset 0.
llvm-svn: 90788
it may be used in contexts where preheader insertion may have failed due
to an indirectbr.
Make LoopSimplify's LoopSimplify::SeparateNestedLoop properly fail in
the case that it would require splitting an indirectbr edge.
These fix PR5502.
llvm-svn: 89484
if it is not ultimately captured. Teach BasicAliasAnalysis that a
local object address which does not escape and is never stored does
not alias with a value resulting from a load.
llvm-svn: 89398
cannot be folded into target cmp instruction.
- Avoid a phase ordering issue where early cmp optimization would prevent the
later count-to-zero optimization.
- Add missing checks which could cause LSR to reuse stride that does not have
users.
- Fix a bug in count-to-zero optimization code which failed to find the pre-inc
iv's phi node.
- Remove, tighten, loosen some incorrect checks disable valid transformations.
- Quite a bit of code clean up.
llvm-svn: 86969
This allows StringRef to skip controversial if(str) check in constructor.
Buildbots, wait for corresponding clang and llvm-gcc FE check-ins!
llvm-svn: 86914
start using them in a trivial way when -enable-jump-threading-lvi
is passed. enable-jump-threading-lvi will be my playground for
awhile.
llvm-svn: 86789
except that the result may not be a constant. Switch jump threading to
use it so that it gets things like (X & 0) -> 0, which occur when phi preds
are deleted and the remaining phi pred was a zero.
llvm-svn: 86637
This patch forbids implicit conversion of DenseMap::const_iterator to
DenseMap::iterator which was possible because DenseMapIterator inherited
(publicly) from DenseMapConstIterator. Conversion the other way around is now
allowed as one may expect.
The template DenseMapConstIterator is removed and the template parameter
IsConst which specifies whether the iterator is constant is added to
DenseMapIterator.
Actually IsConst parameter is not necessary since the constness can be
determined from KeyT but this is not relevant to the fix and can be addressed
later.
Patch by Victor Zverovich!
llvm-svn: 86636