Repeated inserts into AliasSetTracker have quadratic behavior - inserting a
pointer into AST is linear, since it requires walking over all "may" alias
sets and running an alias check vs. every pointer in the set.
We can avoid this by tracking the total number of pointers in "may" sets,
and when that number exceeds a threshold, declare the tracker "saturated".
This lumps all pointers into a single "may" set that aliases every other
pointer.
(This is a stop-gap solution until we migrate to MemorySSA)
This fixes PR28832.
Differential Revision: https://reviews.llvm.org/D23432
llvm-svn: 279274
Summary: We will need these in AMDGPU's new SchedStrategy implmentation.
Reviewers: MatzeB, atrick
Subscribers: llvm-commits, MatzeB
Differential Revision: https://reviews.llvm.org/D23679
llvm-svn: 279270
solve completely opaque MSVC build errors. It complains about lots of
stuff with this change without givin nearly enough information to even
try to fix.
llvm-svn: 279231
to run methods, both for transform passes and analysis passes.
This also allows the analysis manager to use a different set of extra
arguments from the pass manager where useful. Consider passes over
analysis produced units of IR like SCCs of the call graph or loops.
Passes of this nature will often want to refer to the analysis result
that was used to compute their IR units (the call graph or LoopInfo).
And for transformations, they may want to communicate special update
information to the outer pass manager. With this change, it becomes
possible to have a run method for a loop pass that looks more like:
PreservedAnalyses run(Loop &L, AnalysisManager<Loop, LoopInfo> &AM,
LoopInfo &LI, LoopUpdateRecord &UR);
And to query the analysis manager like:
AM.getResult<MyLoopAnalysis>(L, LI);
This makes accessing the known-available analyses convenient and clear,
and it makes passing customized data structures around easy.
My initial use case is going to be in updating the pass manager layers
when the analysis units of IR change. But there are more use cases here
such as having a layer that lets inner passes signal whether certain
additional passes should be run because of particular simplifications
made. Two desires for this have come up in the past: triggering
additional optimization after successfully unrolling loops, and
triggering additional inlining after collapsing indirect calls to direct
calls.
Despite adding this layer of generic extensibility, the *only* change to
existing, simple usage are for places where we forward declare the
AnalysisManager template. We really shouldn't be doing this because of
the fragility exposed here, but currently it makes coping with the
legacy PM code easier.
Differential Revision: http://reviews.llvm.org/D21462
llvm-svn: 279227
r279217 where it fails to select the path that other compilers select.
The workaround won't be as careful to produce an error when an analysis
result is incorrect, but we can rely on non-MSVC builds to catch such
errors it seems and MSVC doesn't seem to support the alternative
techniques.
Hoping this brings the windows bots back to life. If not, will have to
revert all of this.
llvm-svn: 279225
into the AnalysisManager class template.
Back when I first added this base class there were separate analysis
managers and some plausible reason why it would be a useful factoring of
common code between them. However, after a lot of refactoring cleaning,
we now have *entirely* shared code. The base class was just an arbitrary
division between code in one class template and a separate class
template. It didn't add anything and forced lots of indirection through
"derived_this" for no real gain.
We can always factor a base CRTP class out with common code if there is
ever some *other* analysis manager that wants to share a subset of
logic. But for now, folding things into the primary template is
a non-trivial simplification with no down sides I see. It shortens the
code considerably, removes an unhelpful abstraction, and will make
subsequent patches *dramatically* less complex which enhance the
analysis manager infrastructure to effectively cope with invalidation.
llvm-svn: 279221
its own invalidate method.
Previously, the technique would assume that if a result didn't have an
invalidate method that didn't exactly match the expected signature it
didn't have one at all. This is in fact not the case. And we had
analyses with incorrect signatures for the invalidate method in the
tree that would be erroneously invalidated in certain cases! Yikes.
Moreover a result might legitimately want to have multiple overloads for
the invalidate method, and if one changes or a new one is needed we
again really want a compiler error. For example in the tree we had not
added the overload for a *function* IR unit to the invalidate routine
for TLI. Doh.
So a new techique for the SFINAE detection here: if the result has *any*
member spelled "invalidate" we turn off the synthesis of a default
version. We don't care if it is a member function or a member variable
or how many overloads there are. Once a result has something by that
name it must provide suitable overloads for the contexts in which it is
used. This seems much more resilient and durable.
Huge props to Richard Smith who helped me figure out how on earth we
could even do this in C++. It took quite some doing. The technique is
remarkably clean however, and merely requires that the analysis results
are not *final* classes. I think that's a requirement we can live with
even if it is a bit odd.
I've fixed the two bad in-tree analysis results. And this will make my
next change which changes the API for invalidate much easier to
validate as correct.
llvm-svn: 279217
directly produce the index as the value type result.
This requires making the index movable which is straightforward. It
greatly simplifies things by allowing us to completely avoid the builder
API and the layers of abstraction inherent there. Instead both pass
managers can directly construct these when run by value. They still
won't be constructed truly eagerly thanks to the optional in the legacy
PM. The code that directly builds the index can also just share a direct
function.
A notable change here is that the result type of the analysis for the
new PM is no longer a reference type. This was really problematic when
making changes to how we handle result types to make our interface
requirements *much* more strict and precise. But I think this is an
overall improvement.
Differential Revision: https://reviews.llvm.org/D23701
llvm-svn: 279216
The ppc64 multistage bot fails on this.
This reverts commit r279124.
Also Revert "CodeGen: Add/Factor out LiveRegUnits class; NFCI" because it depends on the previous change
This reverts commit r279171.
llvm-svn: 279199
This is a little class template that just builds an inheritance chain of
empty classes. Despite how simple this is, it can be used to really
nicely create ranked overload sets. I've added a unittest as much to
document this as test it. You can pass an object of this type as an
argument to a function overload set an it will call the first viable and
enabled candidate at or below the rank of the object.
I'm planning to use this in a subsequent commit to more clearly rank
overload candidates used for SFINAE. All credit for this technique and
both lines of code here to Richard Smith who was helping me rewrite the
SFINAE check in question to much more effectively capture the intended
set of checks.
llvm-svn: 279197
Summary: Reduce store size to avoid leading and trailing zeros.
Reviewers: kcc, eugenis
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D23648
llvm-svn: 279178
This is a set of register units intended to track register liveness, it
is similar in spirit to LivePhysRegs.
You can also think of this as the liveness tracking parts of the
RegisterScavenger factored out into an own class.
This was proposed in http://llvm.org/PR27609
Differential Revision: http://reviews.llvm.org/D21916
llvm-svn: 279171
The names of the tablegen defs now match the names of the ISD nodes.
This makes the world a slightly saner place, as previously "fround" matched
ISD::FP_ROUND and not ISD::FROUND.
Differential Revision: https://reviews.llvm.org/D23597
llvm-svn: 279129
Re-apply r276044 with off-by-1 instruction fix for the reload placement.
This is a variant of scavengeRegister() that works for
enterBasicBlockEnd()/backward(). The benefit of the backward mode is
that it is not affected by incomplete kill flags.
This patch also changes
PrologEpilogInserter::doScavengeFrameVirtualRegs() to use the register
scavenger in backwards mode.
Differential Revision: http://reviews.llvm.org/D21885
llvm-svn: 279124
When running 'opt -O2 verify-uselistorder-nodbg.lto.bc', there are 33m allocations. 8.2m
come from std::string allocations in Intrinsic::getName(). Turns out this method only
returns a std::string because it needs to handle overloads, but that is not the common case.
This adds an overload of getName which just returns a StringRef when there are no overloads
and so saves on the allocations.
llvm-svn: 279113
This reverts commit r279086, reapplying r279084. I'm not sure what I
ran before, because the compile failure for ADTTests reproduced locally.
The problem is that TestRev is calling BidirectionalVector::rbegin()
when the BidirectionalVector is const, but rbegin() is always non-const.
I've updated BidirectionalVector::rbegin() to be callable from const.
Original commit message follows.
--
As a follow-up to r278991, add some tests that check that
decltype(reverse(R).begin()) == decltype(R.rbegin()), and get them
passing by adding std::remove_reference to has_rbegin.
I'm using static_assert instead of EXPECT_TRUE (and updated the other
has_rbegin check from r278991 in the same way) since I figure that's
more helpful.
llvm-svn: 279091
The original patch was breaking some buildbots due to an
incorrect ordering of function definitions which caused some
compilers to recognize a definition but others to not.
llvm-svn: 279089
As a follow-up to r278991, add some tests that check that
decltype(reverse(R).begin()) == decltype(R.rbegin()), and get them
passing by adding std::remove_reference to has_rbegin.
I'm using static_assert instead of EXPECT_TRUE (and updated the other
has_rbegin check from r278991 in the same way) since I figure that's
more helpful.
llvm-svn: 279084
The ARMv8*-A descriptions in the ARM and AArch64 TargetParsers are incorrect
architecturally and mismatched to the backend descriptions.
RAS is an optional extension to ARMv8-A and ARMv8.1-A and mandatory in
ARMv8.2-A. Correct the ARMTargetParser descriptions which had this as enabled
by default in the earlier versions.
The FP16 and SPE extensions are optional in ARMv8.2-A and the backend defaults
them as off. They are not available as extensions to earlier ARMv8-A versions.
Correct the AArch64TargetParser which had these as enabled by default in all
ARMv8-A definitions.
These macros are only used to define preprocessor macros. There are no macros
yet as ACLE has not caught up with ARMv8.2-A so not possible to add a test.
Differential Revision: https://reviews.llvm.org/D23500
llvm-svn: 279078
Summary:
Skip the merging of common symbols for ThinLTO modules, they will be
merged by the final native object link. Trying to merge the symbols and
add to a combined module will incorrectly enable the common symbol to be
internalized in the ThinLTO module. Additionally, we will not want to
create a combined module for ThinLTO distributed builds.
This fixes failures in 7 cpu2006 benchmarks from the new LTO API in
ThinLTO mode.
Reviewers: mehdi_amini
Subscribers: pcc, llvm-commits, mehdi_amini
Differential Revision: https://reviews.llvm.org/D23637
llvm-svn: 279023
Summary:
We are going to combine poisoning of red zones and scope poisoning.
PR27453
Reviewers: kcc, eugenis
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D23623
llvm-svn: 279020
Change the ilist traits to use decltype instead of sizeof, and add
HasObsoleteCustomization so that additions to this list don't need to be
added in two places.
I suspect this will now work with MSVC, since the trait tested in
r278991 seems to work. If for some reason it continues to fail on
Windows I'll follow up by adding back the #ifndef _MSC_VER.
llvm-svn: 279012
Summary: I later (after r278573) found that LoopIterator.h has some overlapping with LoopBodyTraits. It's good to use LoopBodyTraits because a *Traits struct is algorithm independent.
Reviewers: anemet, nadav, mkuper
Subscribers: mzolotukhin, llvm-commits
Differential Revision: https://reviews.llvm.org/D23529
llvm-svn: 278996
Duncan found that reverse worked on mutable rbegin(), but the has_rbegin
trait didn't work with a const method. See http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160815/382890.html
for more details.
Turns out this was already solved in clang with has_getDecl. Copied that and made it work for rbegin.
This includes the tests Duncan attached to that thread, including the traits test.
llvm-svn: 278991
Since I stopped writing empty export tries it causes LinkEdit to potentially be completely empty which results in invalid yaml being generated.
To prevent this we skip linkedit data if it is empty.
llvm-svn: 278985
This will allow tail duplication and tail merging during layout to have a
shared threshold to make sure that they don't overlap. No observable change
intended.
llvm-svn: 278981
This removes the undefined behaviour (UB) in ilist/ilist_node/etc.,
mainly by removing (gutting) the ilist_sentinel_traits customization
point and canonicalizing on a single, efficient memory layout. This
fixes PR26753.
The new ilist is a doubly-linked circular list.
- ilist_node_base has two ilist_node_base*: Next and Prev. Size-of: two
pointers.
- ilist_node<T> (size-of: two pointers) is a type-safe wrapper around
ilist_node_base.
- ilist_iterator<T> (size-of: two pointers) operates on an
ilist_node<T>*, and downcasts to T* on dereference.
- ilist_sentinel<T> (size-of: two pointers) is a wrapper around
ilist_node<T> that has some extra API for list management.
- ilist<T> (size-of: two pointers) has an ilist_sentinel<T>, whose
address is returned for end().
The new memory layout matches ilist_half_embedded_sentinel_traits<T>
exactly. The Head pointer that previously lived in ilist<T> is
effectively glued to the ilist_half_node<T> that lived in
ilist_half_embedded_sentinel_traits<T>, becoming the Next and Prev in
the ilist_sentinel_node<T>, respectively. sizeof(ilist<T>) is now the
size of two pointers, and there is never any additional storage for a
sentinel.
This is a much simpler design for a doubly-linked list, removing most of
the corner cases of list manipulation (add, remove, etc.). In follow-up
commits, I intend to move as many algorithms as possible into a
non-templated base class (ilist_base) to reduce code size.
Moreover, this fixes the UB in ilist_iterator/getNext/getPrev
operations. Previously, ilist_iterator<T> operated on a T*, even when
the sentinel was not of type T (i.e., ilist_embedded_sentinel_traits and
ilist_half_embedded_sentinel_traits). This added UB to all operations
involving end(). Now, ilist_iterator<T> operates on an ilist_node<T>*,
and only downcasts when the full type is guaranteed to be T*.
What did we lose? There used to be a crash (in some configurations) on
++end(). Curiously (via UB), ++end() would return begin() for users of
ilist_half_embedded_sentinel_traits<T>, but otherwise ++end() would
cause a nice dependable nullptr dereference, crashing instead of a
possible infinite loop. Options:
1. Lose that behaviour.
2. Keep it, by stealing a bit from Prev in asserts builds.
3. Crash on dereference instead, using the same technique.
Hans convinced me (because of the number of problems this and r278532
exposed on Windows) that we really need some assertion here, at least in
the short term. I've opted for #3 since I think it catches more bugs.
I added only a couple of unit tests to root out specific bugs I hit
during bring-up, but otherwise this is tested implicitly via the
extensive usage throughout LLVM.
Planned follow-ups:
- Remove ilist_*sentinel_traits<T>. Here I've just gutted them to
prevent build failures in sub-projects. Once I stop referring to them
in sub-projects, I'll come back and delete them.
- Add ilist_base and move algorithms there.
- Check and fix move construction and assignment.
Eventually, there are other interesting directions:
- Rewrite reverse iterators, so that rbegin().getNodePtr()==&*rbegin().
This allows much simpler logic when erasing elements during a reverse
traversal.
- Remove ilist_traits::createNode, by deleting the remaining API that
creates nodes. Intrusive lists shouldn't be creating nodes
themselves.
- Remove ilist_traits::deleteNode, by (1) asserting that lists are empty
on destruction and (2) changing API that calls it to take a Deleter
functor (intrusive lists shouldn't be in the memory management
business).
- Reconfigure the remaining callback traits (addNodeToList, etc.) to be
higher-level, pulling out a simple_ilist<T> that is much easier to
read and understand.
- Allow tags (e.g., ilist_node<T,tag1> and ilist_node<T,tag2>) so that T
can be a member of multiple intrusive lists.
llvm-svn: 278974
Summary:
This is part of the "NodeType* -> NodeRef" migration. Notice that since
GraphWriter prints object address as identity, I added a static_assert on
NodeRef to be a pointer type.
Reviewers: dblaikie
Subscribers: llvm-commits, MatzeB
Differential Revision: https://reviews.llvm.org/D23580
llvm-svn: 278966
Summary:
Looking at the implementation, GenericDomTree has more specific
requirements on NodeRef, e.g. NodeRefObject->getParent() should compile,
and NodeRef should be a pointer. We can remove the pointer requirement,
but it seems to have little gain, given the limited use cases.
Also changed GraphTraits<Inverse<Inverse<T>> to be more accurate.
Reviewers: dblaikie, chandlerc
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D23593
llvm-svn: 278961
This is used to mark functions with the C++11 [[ noreturn ]] or C11 _Noreturn
attributes.
Patch by Victor Leschuk!
https://reviews.llvm.org/D23167
llvm-svn: 278940
Refactored so that a LSRUse owns its fixups, as oppsed to letting the
LSRInstance own them. This makes it easier to rate formulas for
LSRUses, since the fixups are available directly. The Offsets vector
has been removed since it was no longer necessary.
New target hook isFoldableMemAccessOffset(), which is used during formula
rating.
For SystemZ, this is useful to express that loads and stores with
float or vector types with a big/negative offset should be avoided in
loops. Without this, LSR will generate a lot of negative offsets that
would require extra instructions for loading the address.
Updated tests:
test/CodeGen/SystemZ/loop-01.ll
Reviewed by: Quentin Colombet and Ulrich Weigand.
https://reviews.llvm.org/D19152
llvm-svn: 278927
In theory the indices of RC (and thus the index used for LiveRegs) may differ from the indices of OpRC.
Fixed the code to extract the correct RC index.
OpRC contains the first X consecutive elements of RC, and thus their indices are currently de facto the same, therefore a test cannot be added at this point.
Differential Revision: https://reviews.llvm.org/D23491
llvm-svn: 278923
Summary:
See D22198 for the motivation: We have a pass that uses LiveIntervals anyway,
and there is now a requirement to track a physical register that is not
usually tracked at this point of the compilation. The pass also introduces
instructions that affect this physical register, but we want to preserve
LiveIntervals.
Rather than add brittle and rarely exercised code to keep the tracking of
the physical register intact, we want to just remove the corresponding
LiveRange -- it didn't exist before anyway, and subsequent passes don't
expect it to be there.
Reviewers: MatzeB, arsenm
Subscribers: llvm-commits, MatzeB
Differential Revision: https://reviews.llvm.org/D22801
llvm-svn: 278920
can given the current __cplusplus definitions).
Without this, Clang triggers TONS of warnings about using a C++17
extension. I tried using LLVM_EXTENSION to turn these off and it doesn't
work.
Suggestions on a better approach are welcome, but at least this makes
the build usable for me again.
llvm-svn: 278909