Summary:
Hoisting/sinking instruction out of a loop isn't always beneficial. Hoisting an instruction from a cold block inside a loop body out of the loop could hurt performance. This change makes Loop ICM profile aware - it now checks block frequency to make sure hoisting/sinking anly moves instruction to colder block.
Test Plan:
ninja check
Reviewers: asbirlea, sanjoy, reames, nikic, hfinkel, vsk
Reviewed By: asbirlea
Subscribers: fhahn, vsk, davidxl, xbolva00, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65060
llvm-svn: 368526
This is an extension of a transform that tries to produce positive floating-point
constants to improve canonicalization (and hopefully lead to more reassociation
and CSE).
The original patches were:
D4904
D5363 (rL221721)
But as the test diffs show, these were limited to basic patterns by walking from
an instruction to its single user rather than recursively moving up the def-use
sequence. No fast-math is required here because we're only rearranging implicit
FP negations in intermediate ops.
A motivating bug is:
https://bugs.llvm.org/show_bug.cgi?id=32939
Differential Revision: https://reviews.llvm.org/D65954
llvm-svn: 368512
The default behavior of Clang's indirect function call checker will replace
the address of each CFI-checked function in the output file's symbol table
with the address of a jump table entry which will pass CFI checks. We refer
to this as making the jump table `canonical`. This property allows code that
was not compiled with ``-fsanitize=cfi-icall`` to take a CFI-valid address
of a function, but it comes with a couple of caveats that are especially
relevant for users of cross-DSO CFI:
- There is a performance and code size overhead associated with each
exported function, because each such function must have an associated
jump table entry, which must be emitted even in the common case where the
function is never address-taken anywhere in the program, and must be used
even for direct calls between DSOs, in addition to the PLT overhead.
- There is no good way to take a CFI-valid address of a function written in
assembly or a language not supported by Clang. The reason is that the code
generator would need to insert a jump table in order to form a CFI-valid
address for assembly functions, but there is no way in general for the
code generator to determine the language of the function. This may be
possible with LTO in the intra-DSO case, but in the cross-DSO case the only
information available is the function declaration. One possible solution
is to add a C wrapper for each assembly function, but these wrappers can
present a significant maintenance burden for heavy users of assembly in
addition to adding runtime overhead.
For these reasons, we provide the option of making the jump table non-canonical
with the flag ``-fno-sanitize-cfi-canonical-jump-tables``. When the jump
table is made non-canonical, symbol table entries point directly to the
function body. Any instances of a function's address being taken in C will
be replaced with a jump table address.
This scheme does have its own caveats, however. It does end up breaking
function address equality more aggressively than the default behavior,
especially in cross-DSO mode which normally preserves function address
equality entirely.
Furthermore, it is occasionally necessary for code not compiled with
``-fsanitize=cfi-icall`` to take a function address that is valid
for CFI. For example, this is necessary when a function's address
is taken by assembly code and then called by CFI-checking C code. The
``__attribute__((cfi_jump_table_canonical))`` attribute may be used to make
the jump table entry of a specific function canonical so that the external
code will end up taking a address for the function that will pass CFI checks.
Fixes PR41972.
Differential Revision: https://reviews.llvm.org/D65629
llvm-svn: 368495
GlobalAlias and GlobalIFunc ought to be treated the same by the IR
linker, so we can generalize the code to be in terms of their common
base class GlobalIndirectSymbol.
Differential Revision: https://reviews.llvm.org/D55046
llvm-svn: 368357
For some targets the LICM pass can result in sub-optimal code in some
cases where it would be better not to run the pass, but it isn't
always possible to suppress the transformations heuristically.
Where the front-end has insight into such cases it is beneficial
to attach loop metadata to disable the pass - this change adds the
llvm.licm.disable metadata to enable that.
Differential Revision: https://reviews.llvm.org/D64557
llvm-svn: 368296
Summary:
The ever growing switch required Attribute::AttrKind values but they
might not be available for all abstract attributes we deduce. With the
new method we track statistics at the abstract attribute level. The
provided macros simplify the usage and make the messages uniform.
Reviewers: sstefan1, uenoku
Subscribers: hiraditya, bollu, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65732
llvm-svn: 368227
Summary:
The wrapper reduces boilerplate code and also provide a nice way to
determine the state type used by an abstract attributes statically via
AAType::StateType.
This was already discussed as part of the review of D65711.
Reviewers: sstefan1, uenoku
Subscribers: hiraditya, bollu, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65786
llvm-svn: 368224
Summary:
So far, whenever one wants to look at returned values, one had to deal
with the AAReturnedValues and potentially with the AAIsDead attribute.
In the same spirit as other checkForAllXXX methods, we add this
functionality now to the Attributor. By adopting the use sites we got
better results when return instructions were dead.
Reviewers: sstefan1, uenoku
Subscribers: hiraditya, bollu, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65733
llvm-svn: 368222
Commit r368064 was necessary after r367953 (D65712) broke the module
build. That happened, apparently, because the template class IRAttribute
defined in the header had a virtual method defined in the corresponding
source file (IRAttribute::manifest). To unbreak the situation this patch
introduces a helper function IRAttributeManifest::manifestAttrs which
is used to implement IRAttribute::manifest in the header. The deifnition
of the helper function is still in the source file.
Patch by jdoerfert (Johannes Doerfert)
Differential Revision: https://reviews.llvm.org/D65821
llvm-svn: 368076
Summary:
Similar to `Attributor::checkForAllCallSites`, we now provide such
functionality for instructions of a certain opcode through
`Attributor::checkForAllInstructions` and the convenient wrapper
`Attributor::checkForAllCallLikeInstructions`. This cleans up code,
avoids duplication, and simplifies the usage of liveness information.
Reviewers: sstefan1, uenoku
Subscribers: hiraditya, bollu, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65731
llvm-svn: 367961
Summary:
Certain properties, e.g., an AttrKind, are not shared among all abstract
attributes. This patch extracts the functionality into a helper struct.
Reviewers: uenoku, sstefan1
Subscribers: hiraditya, bollu, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65712
llvm-svn: 367953
To remove boilerplate, mostly passing through values to the
AbstractAttriubute base class, we extract the state into an IRPosition
helper. There is no function change intended but the IRPosition struct
will provide more functionality down the line.
Reviewers: sstefan1, uenoku
Subscribers: hiraditya, bollu, jfb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65711
llvm-svn: 367952
Summary:
The new scheme is similar to the pass manager and dyn_cast scheme where
we identify classes by the address of a static member. This is better
than the old scheme in which we had to "invent" new Attributor enums if
there was no corresponding one.
Reviewers: sstefan1, uenoku
Subscribers: hiraditya, bollu, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65710
llvm-svn: 367951
Summary:
Instead of storing the reference to the InformationCache we now pass it
whenever it might be needed.
Reviewers: sstefan1, uenoku
Subscribers: hiraditya, bollu, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65709
llvm-svn: 367950
When we remove instructions cached references could still be live. This
patch avoids removing invoke instructions that are replaced by calls and
instead keeps them around but in a dead block.
llvm-svn: 367933
Summary:
This contains various fixes:
- Explicitly determine and return the next noreturn instruction.
- If an invoke calls a noreturn function which is not nounwind we
keep the unwind destination live. This also means we require an
invoke. Though we can still add the unreachable to the normal
destination block.
- Check if the return instructions are dead after we look for calls
to avoid triggering an optimistic fixpoint in the presence of
assumed liveness information.
- Make the interface work with "const" pointers.
- Some simplifications
While additional tests are included, full coverage is achieved only with
D59978.
Reviewers: sstefan1, uenoku
Subscribers: hiraditya, bollu, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65701
llvm-svn: 367791
When a fixpoint is indicated the change status is known due to the
fixpoint kind. This simplifies a common code pattern by making the
connection explicit.
llvm-svn: 367790
Modifying other AbstractAttributes to use Liveness AA and skip dead instructions.
Reviewers: jdoerfert, uenoku
Subscribers: hiraditya, llvm-commits
Differential revision: https://reviews.llvm.org/D65243
llvm-svn: 367725
This patch adds support to the WholeProgramDevirt pass to perform
index-based WPD, which is invoked from ThinLTO during the thin link.
The ThinLTO backend (WPD import phase) behaves the same regardless of
whether the WPD decisions were made with the index-based or (the
existing) IR-based analysis.
Depends on D54815.
Reviewers: pcc
Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, arphaman, dang, llvm-commits
Differential Revision: https://reviews.llvm.org/D55153
llvm-svn: 367679
This patch adds an ability to disable profile based peeling
causing the peeling of all iterations and as a result prohibits
further unroll/peeling attempts on that loop.
The motivation to get an ability to separate peeling usage in
pipeline where in the first part we peel only separate iterations if needed
and later in pipeline we apply the full peeling which will prohibit further peeling.
Reviewers: reames, fhahn
Reviewed By: reames
Subscribers: hiraditya, zzheng, dmgreen, llvm-commits
Differential Revision: https://reviews.llvm.org/D64983
llvm-svn: 367668
This allows folding of the scalar epilogue loop (the tail) into the main
vectorised loop body when the loop is annotated with a "vector predicate"
metadata hint. To fold the tail, instructions need to be predicated (masked),
enabling/disabling lanes for the remainder iterations.
Differential Revision: https://reviews.llvm.org/D65197
llvm-svn: 367592
Summary:
While `-div-rem-pairs` pass can decompose rem in div+rem pair when div-rem pair
is unsupported by target, nothing performs the opposite fold.
We can't do that in InstCombine or DAGCombine since neither of those has access to TTI.
So it makes most sense to teach `-div-rem-pairs` about it.
If we matched rem in expanded form, we know we will be able to place div-rem pair
next to each other so we won't regress the situation.
Also, we shouldn't decompose rem if we matched already-decomposed form.
This is surprisingly straight-forward otherwise.
The original patch was committed in rL367288 but was reverted in rL367289
because it exposed pre-existing RAUW issues in internal data structures
of the pass; those now have been addressed in a previous patch.
https://bugs.llvm.org/show_bug.cgi?id=42673
Reviewers: spatel, RKSimon, efriedma, ZaMaZaN4iK, bogner
Reviewed By: bogner
Subscribers: bogner, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65298
llvm-svn: 367419
Summary:
`DivRemPairs` internally creates two maps:
* {sign, divident, divisor} -> div instruction
* {sign, divident, divisor} -> rem instruction
Then it iterates over rem map, and looks if there is an entry
in div map with the same key. Then depending on some internal logic
it may RAUW rem instruction with something else.
But if that rem instruction is an input to other div/rem,
then it was used as a key in these maps, so the old value (used in key)
is now dandling, because RAUW didn't update those maps.
And we can't even RAUW map keys in general, there's `ValueMap`,
but we don't have a single `Value` as key...
The bug was discovered via D65298, and the test there exists.
Now, i'm not sure how to expose this issue in trunk.
The bug is clearly there if i change the map keys to be `AssertingVH`/`PoisoningVH`,
but i guess this didn't miscompiled anything thus far?
I really don't think this is benin without that patch.
The fix is actually rather straight-forward - instead of trying to somehow
shoe-horn `ValueMap` here (doesn't fit, key isn't just `Value`), or writing a new
`ValueMap` with key being a struct of `Value`s, we can just have an intermediate
data structure - a vector, each entry containing matching `Div, Rem` pair,
and pre-filling it before doing any modifications.
This way we won't need to query map after doing RAUW, so no bug is possible.
Reviewers: spatel, bogner, RKSimon, craig.topper
Reviewed By: spatel
Subscribers: hiraditya, hans, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65451
llvm-svn: 367417
LoopInfo can be easily preserved by passing it to the functions that
modify the CFG (SplitCriticalEdge and MergeBlockIntoPredecessor.
SplitCriticalEdge also preserves LoopSimplify and LCSSA form when when passing in
LoopInfo. The test case shows that we preserve LoopSimplify and
LoopInfo. Adding addPreservedID(LCSSAID) did not preserve LCSSA for some
reason.
Also I am not sure if it is possible to preserve those in the new pass
manager, as they aren't analysis passes.
Reviewers: reames, hfinkel, davide, jdoerfert
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D65137
llvm-svn: 367332
test-suite/MultiSource/Benchmarks/DOE-ProxyApps-C/miniGMG broke:
Only PHI nodes may reference their own value!
%sub33 = srem i32 %sub33, %ranks_in_i
This reverts commit r367288.
llvm-svn: 367289
Summary:
While `-div-rem-pairs` pass can decompose rem in div+rem pair when div-rem pair
is unsupported by target, nothing performs the opposite fold.
We can't do that in InstCombine or DAGCombine since neither of those has access to TTI.
So it makes most sense to teach `-div-rem-pairs` about it.
If we matched rem in expanded form, we know we will be able to place div-rem pair
next to each other so we won't regress the situation.
Also, we shouldn't decompose rem if we matched already-decomposed form.
This is surprisingly straight-forward otherwise.
https://bugs.llvm.org/show_bug.cgi?id=42673
Reviewers: spatel, RKSimon, efriedma, ZaMaZaN4iK, bogner
Reviewed By: bogner
Subscribers: bogner, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65298
llvm-svn: 367288
To avoid duplicates in loop metadata, if the string to add is
already there, just update the value.
Reviewers: reames, Ashutosh
Reviewed By: reames
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D65265
llvm-svn: 367087
changes were made to the patch since then.
--------
[NewPM] Port Sancov
This patch contains a port of SanitizerCoverage to the new pass manager. This one's a bit hefty.
Changes:
- Split SanitizerCoverageModule into 2 SanitizerCoverage for passing over
functions and ModuleSanitizerCoverage for passing over modules.
- ModuleSanitizerCoverage exists for adding 2 module level calls to initialization
functions but only if there's a function that was instrumented by sancov.
- Added legacy and new PM wrapper classes that own instances of the 2 new classes.
- Update llvm tests and add clang tests.
llvm-svn: 367053
We do not need the SmallPtrSet to avoid adding duplicates to
OpsToRename, because we already keep a ValueInfo mapping. If we see an
op for the first time, Infos will be empty and we can also add it to
OpsToRename.
We process operands by visiting BBs depth-first and then iterate over
all instructions & users, so the order should be deterministic.
Therefore we can skip one round of sorting, which we purely needed for
guaranteeing a deterministic order when iterating over the SmallPtrSet.
Reviewers: efriedma, davide
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D64816
llvm-svn: 367028
Summary:
Deduce dereferenceable attribute in Attributor.
These will be added in a later patch.
* dereferenceable(_or_null)_globally (D61652)
* Deduction based on load instruction (similar to D64258)
Reviewers: jdoerfert, sstefan1
Reviewed By: jdoerfert
Subscribers: hiraditya, jfb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D64876
llvm-svn: 366788
[Attributor] Liveness analysis.
Liveness analysis abstract attribute used to indicate which BasicBlocks are dead and can therefore be ignored.
Right now we are only looking at noreturn calls.
Reviewers: jdoerfert, uenoku
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D64162
llvm-svn: 366769
[Attributor] Liveness analysis.
Liveness analysis abstract attribute used to indicate which BasicBlocks are dead and can therefore be ignored.
Right now we are only looking at noreturn calls.
Reviewers: jdoerfert, uenoku
Subscribers: hiraditya, llvm-commits
Differential revision: https://reviews.llvm.org/D64162
llvm-svn: 366753