I tried to be as close as possible to the strongest check that
existed before; cleaning these up properly is left for future work.
Differential Revision: http://reviews.llvm.org/D19469
llvm-svn: 267758
This change adds a new hook for estimating the cost of vector extracts followed
by zero- and sign-extensions. The motivating example for this change is the
SMOV and UMOV instructions on AArch64. These instructions move data from vector
to general purpose registers while performing the corresponding extension
(sign-extend for SMOV and zero-extend for UMOV) at the same time. For these
operations, TargetTransformInfo can assume the extensions are free and only
report the cost of the vector extract. The SLP vectorizer has been updated to
make use of the new hook.
Differential Revision: http://reviews.llvm.org/D18523
llvm-svn: 267725
Summary:
Refine the workaround from r266877 that attempts to prevent
renaming of locals in inline assembly, so that in addition to looking
for a llvm.used local value, that there is at least one inline assembly
call in the module. Otherwise, debug functions added to the llvm.used
can block importing/exporting unnecessarily.
Reviewers: joker.eph
Subscribers: llvm-commits, joker.eph
Differential Revision: http://reviews.llvm.org/D19573
llvm-svn: 267717
Extract a part of isDereferenceableAndAlignedPointer functionality to Value::getPointerDerferecnceableBytes. Currently it's a NFC, but in future I'm going to accumulate all the logic about value dereferenceability in this function similarly to Value::getPointerAlignment function (D16144).
Reviewed By: reames
Differential Revision: http://reviews.llvm.org/D17572
llvm-svn: 267708
This is required to use this function from isSafeToSpeculativelyExecute
Reviewed By: hfinkel
Differential Revision: http://reviews.llvm.org/D16231
llvm-svn: 267692
When encountering a non-local pointer, LVI would eagerly scan the block for dereferences of the given object to prove the pointer to be non null. That's all well and good, but *then* we'd go recurse through our input blocks. As a result, we could end up scanning each and every block we traverse, even if the final definition was obviously non null or we found a constant value somewhere up the chain. The previous code papered over this by using the isKnownNonNull routine from value tracking. This made the duplication less painful in the common case.
Instead, we know do the block scan only *after* we've gotten the recursive results back. This lets us stop scanning individual blocks as soon as we've determined it to be non-null in any predecessor block and use our usual merge rules to propagate that information cheaply through successor blocks. For a pointer which can be found non-null, this does strictly less work and sometimes substaintially so.
Note that the case where we *can't* prove something non-null is still the really expensive case. We end up scanning each and every block looking for a dereference and never end up finding one.
llvm-svn: 267642
Previously we were recursing on our operands for unary and binary operators regardless of whether we knew how to reason about the operator in question. This has the effect of doing a potentially large amount of work, only to throw it away. By checking whether the operation is one LVI can handle, we can cut short the search and return the (overdefined) answer more quickly. The quality of the results produced should not change.
llvm-svn: 267626
As pointed out by John Regehr over in http://reviews.llvm.org/D19485, LVI was being incredibly stupid about applying its transfer rules. Rather than gathering local facts from the expression itself, it was simply giving up entirely if one of the inputs was overdefined. This greatly impacts the precision of the overall analysis and makes it far more fragile as well.
This patch builds on 267609 which did the same thing for unary casts.
llvm-svn: 267620
Essentially, I was using the wrong size function. For types which were sized, but not primitive, I wasn't getting a useful size for the operand and failed an assert. I fixed this, and also added a guard that the input is a sized type. Test case is for the original mistake. I'm not sure how to actually exercise the sized type check.
llvm-svn: 267618
As pointed out by John Regehr over in http://reviews.llvm.org/D19485, LVI was being incredibly stupid about applying its transfer rules. Rather than gathering local facts from the expression itself, it was simply giving up entirely if one of the inputs was overdefined. This greatly impacts the precision of the overall analysis and makes it far more fragile as well.
This patch implements only the unary operation case. Once this is in, I'll implement the same for the binary operations.
Differential Revision: http://reviews.llvm.org/D19492
llvm-svn: 267609
There has been much recent confusion about the partition in the lattice between constant and non-constant values. Hopefully, documenting this will prevent confusion going forward.
llvm-svn: 267440
This function handled both unary and binary operators. Cloning and specializing leads to much easier to follow code with minimal duplicatation.
llvm-svn: 267438
Summary:
This implements a new method of run-time checking the NoWrap
SCEV predicates, which should be easier to optimize and nicer
for targets that don't correctly handle multiplication/addition
of large integer types (like i128).
If the AddRec is {a,+,b} and the backedge taken count is c,
the idea is to check that |b| * c doesn't have unsigned overflow,
and depending on the sign of b, that:
a + |b| * c >= a (b >= 0) or
a - |b| * c <= a (b <= 0)
where the comparisons above are signed or unsigned, depending on
the flag that we're checking.
The advantage of doing this is that we avoid extending to a larger
type and we avoid the multiplication of large types (multiplying
i128 can be expensive).
Reviewers: sanjoy
Subscribers: llvm-commits, mzolotukhin
Differential Revision: http://reviews.llvm.org/D19266
llvm-svn: 267389
Summary:
Remove the GlobalValueInfo and change the ModuleSummaryIndex to directly
reference summary objects. The info structure was there to support lazy
parsing of the combined index summary objects, which is no longer
needed and not supported.
Reviewers: joker.eph
Subscribers: joker.eph, llvm-commits
Differential Revision: http://reviews.llvm.org/D19462
llvm-svn: 267344
Right now it only contains the LinkageType, but will be extended
with "hasSection", "isOptSize", "hasInlineAssembly", etc.
Differential Revision: http://reviews.llvm.org/D19404
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 267319
The original commit was reverted because of a buildbot problem with LazyCallGraph::SCC handling (not related to the OptBisect handling).
Differential Revision: http://reviews.llvm.org/D19172
llvm-svn: 267231
This intrinsic takes two arguments, ``%ptr`` and ``%offset``. It loads
a 32-bit value from the address ``%ptr + %offset``, adds ``%ptr`` to that
value and returns it. The constant folder specifically recognizes the form of
this intrinsic and the constant initializers it may load from; if a loaded
constant initializer is known to have the form ``i32 trunc(x - %ptr)``,
the intrinsic call is folded to ``x``.
LLVM provides that the calculation of such a constant initializer will
not overflow at link time under the medium code model if ``x`` is an
``unnamed_addr`` function. However, it does not provide this guarantee for
a constant initializer folded into a function body. This intrinsic can be
used to avoid the possibility of overflows when loading from such a constant.
Differential Revision: http://reviews.llvm.org/D18367
llvm-svn: 267223
The relative vtable ABI (PR26723) needs PLT relocations to refer to virtual
functions defined in other DSOs. The unnamed_addr attribute means that the
function's address is not significant, so we're allowed to substitute it
with the address of a PLT entry.
Also includes a bonus feature: addends for COFF image-relative references.
Differential Revision: http://reviews.llvm.org/D17938
llvm-svn: 267211
Summary:
(... while still not using a PostDomTree)
The way we use isKnownNotFullPoison from SCEV today, the new CFG walking
logic will not trigger for any realistic cases -- it will kick in only
for situations where we could have merged the contiguous basic blocks
anyway[0], since the poison generating instruction dominates all of its
non-PHI uses (which are the only uses we consider right now).
However, having this change in place will allow a later bugfix to break
fewer llvm-lit tests.
[0]: i.e. cases where block A branches to block B and B is A's only
successor and A is B's only predecessor.
Reviewers: broune, bjarke.roune
Subscribers: mcrosier, llvm-commits
Differential Revision: http://reviews.llvm.org/D19212
llvm-svn: 267175
Summary:
Also adds a small comment blurb on control flow + no-wrap flags, since
that question came up a few days back on llvm-dev.
Reviewers: bjarke.roune, broune
Subscribers: sanjoy, mcrosier, llvm-commits, mzolotukhin
Differential Revision: http://reviews.llvm.org/D19209
llvm-svn: 267110
This patch implements a optimization bisect feature, which will allow optimizations to be selectively disabled at compile time in order to track down test failures that are caused by incorrect optimizations.
The bisection is enabled using a new command line option (-opt-bisect-limit). Individual passes that may be skipped call the OptBisect object (via an LLVMContext) to see if they should be skipped based on the bisect limit. A finer level of control (disabling individual transformations) can be managed through an addition OptBisect method, but this is not yet used.
The skip checking in this implementation is based on (and replaces) the skipOptnoneFunction check. Where that check was being called, a new call has been inserted in its place which checks the bisect limit and the optnone attribute. A new function call has been added for module and SCC passes that behaves in a similar way.
Differential Revision: http://reviews.llvm.org/D19172
llvm-svn: 267022
This change adds a couple of test cases to make sure FindAvailableLoadedValue does the right thing. At the moment, the code added is dead, but separating it makes follow on changes far more obvious.
llvm-svn: 266999
No matter what value you OR in to A, the result of (or A, B) is going to be UGE A. When A and B are positive, it's SGE too. If A is negative, OR'ing a value into it can't make it positive, but can increase its value closer to -1, therefore (or A, B) is SGE A. Working through all possible combinations produces this truth table:
```
A is
+, -, +/-
F F F + B is
T F ? -
? F ? +/-
```
The related optimizations are flipping the 'slt' for 'sge' which always NOTs the result (if the result is known), and swapping the LHS and RHS while swapping the comparison predicate.
There are more idioms left to implement (aren't there always!) but I've stopped here because any more would risk becoming unreasonable for reviewers.
llvm-svn: 266939
Summary:
This patch prevents importing from (and therefore exporting from) any
module with a "llvm.used" local value. Local values need to be promoted
and renamed when importing, and their presense on the llvm.used variable
indicates that there are opaque uses that won't see the rename. One such
example is a use in inline assembly.
See also the discussion at:
http://lists.llvm.org/pipermail/llvm-dev/2016-April/098047.html
As part of this, move collectUsedGlobalVariables out of Transforms/Utils
and into IR/Module so that it can be used more widely. There are several
other places in LLVM that used copies of this code that can be cleaned
up as a follow on NFC patch.
Reviewers: joker.eph
Subscribers: pcc, llvm-commits, joker.eph
Differential Revision: http://reviews.llvm.org/D18986
llvm-svn: 266877
The functionality contained within getIntrinsicIDForCall is two-fold: it
checks if a CallInst's callee is a vectorizable intrinsic. If it isn't
an intrinsic, it attempts to map the call's target to a suitable
intrinsic.
Move the mapping functionality into getIntrinsicForCallSite and rename
getIntrinsicIDForCall to getVectorIntrinsicIDForCall while
reimplementing it in terms of getIntrinsicForCallSite.
llvm-svn: 266801
This patch improves SimplifyCFG to catch cases like:
if (a < b) {
if (a > b) <- known to be false
unreachable;
}
Phabricator Revision: http://reviews.llvm.org/D18905
llvm-svn: 266767
Rather than checking for the SCEV type prior to calling
getContantPart, perform the checks in the function. This reduces
the number of places where the checks are needed.
Differential Revision: http://reviews.llvm.org/D19241
llvm-svn: 266759
Summary:
Need to use predecessors for reverse graph, successors for forward graph.
succ_iterator/pred_iterator are not compatible, this patch is all the work necessary to work around that (which is what everywhere else does). Not sure if there is a better way, so cc'ing some random folks to take a gander :)
Reviewers: dblaikie, qcolombet, echristo
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D18796
llvm-svn: 266718
Summary:
Calls to @llvm.experimental.deoptimize are expected to "never execute",
so optimize them as such.
Reviewers: chandlerc
Subscribers: junbuml, mcrosier, llvm-commits
Differential Revision: http://reviews.llvm.org/D19095
llvm-svn: 266654
This reverts commit r266477.
This commit introduces cyclic dependency. This commit has "Analysis" depend on "ProfileData",
while "ProfileData" depends on "Object", which depends on "BitCode", which
depends on "Analysis".
llvm-svn: 266619
Removed some unused headers, replaced some headers with forward class declarations.
Found using simple scripts like this one:
clear && ack --cpp -l '#include "llvm/ADT/IndexedMap.h"' | xargs grep -L 'IndexedMap[<]' | xargs grep -n --color=auto 'IndexedMap'
Patch by Eugene Kosov <claprix@yandex.ru>
Differential Revision: http://reviews.llvm.org/D19219
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266595
Adds an interface to get ProfileSummary for a module and makes InlineCost use ProfileSummary to get max function count.
Differential Revision: http://reviews.llvm.org/D18622
llvm-svn: 266477
Summary:
InlineCost's threshold is multiplied by this value. This lets us adjust
the inlining threshold up or down on a per-target basis. For example,
we might want to increase the threshold on targets where calls are
unusually expensive.
Reviewers: chandlerc
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D18560
llvm-svn: 266405
If the size of an AST entry changes, we also need to make sure we perform
necessary alias set merges, as the new size may overlap pointers in other sets.
We happen to run into this with memset, because memset allows an entry for a
i8* pointer to have a decidedly non-i8 size.
This fixes PR27262.
Differential Revision: http://reviews.llvm.org/D18939
llvm-svn: 266381
Some SIMD implementations are not IEEE-754 compliant, for example ARM's NEON.
This patch teaches the loop vectorizer to only allow transformations of loops
that either contain no floating-point operations or have enough allowance
flags supporting lack of precision (ex. -ffast-math, Darwin).
For that, the target description now has a method which tells us if the
vectorizer is allowed to handle FP math without falling into unsafe
representations, plus a check on every FP instruction in the candidate loop
to check for the safety flags.
This commit makes LLVM behave like GCC with respect to ARM NEON support, but
it stops short of fixing the underlying problem: sub-normals. Neither GCC
nor LLVM have a flag for allowing sub-normal operations. Before this patch,
GCC only allows it using unsafe-math flags and LLVM allows it by default with
no way to turn it off (short of not using NEON at all).
As a first step, we push this change to make it safe and in sync with GCC.
The second step is to discuss a new sub-normal's flag on both communitues
and come up with a common solution. The third step is to improve the FastMath
flags in LLVM to encode sub-normals and use those flags to restrict NEON FP.
Fixes PR16275.
llvm-svn: 266363