1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-25 14:02:52 +02:00
Commit Graph

7019 Commits

Author SHA1 Message Date
Igor Laevsky
cf821eac6c [SCEV] Cache results during GetMinTrailingZeros query
Differential Revision: https://reviews.llvm.org/D29759

llvm-svn: 295060
2017-02-14 15:53:12 +00:00
Sanjay Patel
8e7e7e2058 [ValueTracking] use nonnull argument attribute to eliminate null checks
Enhancing value tracking's analysis of null-ness was suggested in D27855, so here's a first attempt at that.

This is part of solving:
https://llvm.org/bugs/show_bug.cgi?id=28430

Differential Revision: https://reviews.llvm.org/D28204

llvm-svn: 294897
2017-02-12 15:35:34 +00:00
Dorit Nuzman
15c8d7c6d1 [LV/LoopAccess] Check statically if an unknown dependence distance can be
proven larger than the loop-count

This fixes PR31098: Try to resolve statically data-dependences whose
compile-time-unknown distance can be proven larger than the loop-count, 
instead of resorting to runtime dependence checking (which are not always 
possible).

For vectorization it is sufficient to prove that the dependence distance 
is >= VF; But in some cases we can prune unknown dependence distances early,
and even before selecting the VF, and without a runtime test, by comparing 
the distance against the loop iteration count. Since the vectorized code 
will be executed only if LoopCount >= VF, proving distance >= LoopCount 
also guarantees that distance >= VF. This check is also equivalent to the 
Strong SIV Test.

Reviewers: mkuper, anemet, sanjoy

Differential Revision: https://reviews.llvm.org/D28044

llvm-svn: 294892
2017-02-12 09:32:53 +00:00
Peter Collingbourne
33fe886dfb IR: Function summary extensions for whole-program devirtualization pass.
The summary information includes all uses of llvm.type.test and
llvm.type.checked.load intrinsics that can be used to devirtualize calls,
including any constant arguments for virtual constant propagation.

Differential Revision: https://reviews.llvm.org/D29734

llvm-svn: 294795
2017-02-10 22:29:38 +00:00
Chandler Carruth
e56d7470a4 [PM/LCG] Teach LCG to support spurious reference edges.
Somewhat amazingly, this only requires teaching it to clean them up when
deleting a dead function from the graph. And we already have exactly the
necessary data structures to do that in the parent RefSCCs.

This allows ArgPromote to work in a much simpler way be merely letting
reference edges linger in the graph after the causing IR is deleted. We
will clean up these edges when we run any function pass over the IR, but
don't remove them eagerly.

This avoids all of the quadratic update issues both in the current pass
manager and in my previous attempt with the new pass manager.

Differential Revision: https://reviews.llvm.org/D29579

llvm-svn: 294663
2017-02-09 23:30:14 +00:00
Chandler Carruth
042041bdf3 [PM/LCG] Teach the LazyCallGraph how to replace a function without
disturbing the graph or having to update edges.

This is motivated by porting argument promotion to the new pass manager.
Because of how LLVM IR Function objects work, in order to change their
signature a new object needs to be created. This is efficient and
straight forward in the IR but previously was very hard to implement in
LCG. We could easily replace the function a node in the graph
represents. The challenging part is how to handle updating the edges in
the graph.

LCG previously used an edge to a raw function to represent a node that
had not yet been scanned for calls and references. This was the core
of its laziness. However, that model causes this kind of update to be
very hard:
1) The keys to lookup an edge need to be `Function*`s that would all
   need to be updated when we update the node.
2) There will be some unknown number of edges that haven't transitioned
   from `Function*` edges to `Node*` edges.

All of this complexity isn't necessary. Instead, we can always build
a node around any function, always pointing edges at it and always using
it as the key to lookup an edge. To maintain the laziness, we need to
sink the *edges* of a node into a secondary object and explicitly model
transitioning a node from empty to populated by scanning the function.
This design seems much cleaner in a number of ways, but importantly
there is now exactly *one* place where the `Function*` has to be
updated!

Some other cleanups that fall out of this include having something to
model the *entry* edges more accurately. Rather than hand rolling parts
of the node in the graph itself, we have an explicit `EdgeSequence`
object that gives us exactly the functionality needed. We also have
a consistent place to define the edge iterators and can use them for
both the entry edges and the internal edges of the graph.

The API used to model the separation between a node and its edges is
intentionally very thin as most clients are expected to deal with nodes
that have populated edges. We model this exactly as an optional does
with an additional method to populate the edges when that is
a reasonable thing for a client to do. This is based on API design
suggestions from Richard Smith and David Blaikie, credit goes to them
for helping pick how to model this without it being either too explicit
or too implicit.

The patch is somewhat noisy due to shifting around iterator types and
new syntax for walking the edges of a node, but most of the
functionality change is in the `Edge`, `EdgeSequence`, and `Node` types.

Differential Revision: https://reviews.llvm.org/D29577

llvm-svn: 294653
2017-02-09 23:24:13 +00:00
Daniel Berlin
76913a08c0 Drop graph_ prefix
llvm-svn: 294621
2017-02-09 20:37:46 +00:00
Daniel Berlin
911e120e71 GraphTraits: Add range versions of graph traits functions (graph_nodes, graph_children, inverse_graph_nodes, inverse_graph_children).
Summary:
Convert all obvious node_begin/node_end and child_begin/child_end
pairs to range based for.

Sending for review in case someone has a good idea how to make
graph_children able to be inferred. It looks like it would require
changing GraphTraits to be two argument or something. I presume
inference does not happen because it would have to check every
GraphTraits in the world to see if the noderef types matched.

Note: This change was 3-staged with clang as well, which uses
Dominators/etc from LLVM.

Reviewers: chandlerc, tstellarAMD, dblaikie, rsmith

Subscribers: arsenm, llvm-commits, nhaehnle

Differential Revision: https://reviews.llvm.org/D29767

llvm-svn: 294620
2017-02-09 20:37:24 +00:00
Vitaly Buka
ff45b198d1 LVI: Fix use-of-uninitialized-value after r294463
BlockValueStack can be reallocated making reference e invalid.

llvm-svn: 294572
2017-02-09 09:28:05 +00:00
Daniel Berlin
cfdd740e01 LVI: Add a per-value worklist limit to LazyValueInfo.
Summary:
LVI is now depth first, which is optimal for iteration strategy in
terms of work per call.  However, the way the results get cached means
it can still go very badly N^2 or worse right now.  The overdefined
cache is per-block, because LVI wants to try to get different results
for the same name in different blocks (IE solve the problem
PredicateInfo solves).  This means even if we discover a value is
overdefined after going very deep, it doesn't cache this information,
causing it to end up trying to rediscover it again and again.  The
same is true for values along the way.  In practice, overdefined
anywhere should mean overdefined everywhere (this is how, for example,
SCCP works).

Until we get around to reworking the overdefined cache, we need to
limit the worklist size we process.  Note that permanently reverting
the DFS strategy exploration seems the wrong strategy (temporarily
seems fine if we really want).  BFS is clearly the wrong approach, it
just gets luckier on some testcases.  It's also very hard to design
an effective throttle for BFS. For DFS, the throttle is directly related
to the depth of the CFG.  So really deep CFGs will get cutoff, smaller
ones will not. As the CFG simplifies, you get better results.
In BFS, the limit is it's related to the fan-out times average block size,
which is harder to reason about or make good choices for.

Bug being filed about the overdefined cache, but it will require major
surgery to fix it (plumbing predicateinfo through CVP or LVI).

Note: I did not make this number configurable because i'm not sure
anyone really needs to tweak this knob.  We run CVP 3 times. On the
testcases i have the slow ones happen in the middle, where CVP is
doing cleanup work other things are effective at.  Over the course of
3 runs, we don't see to have any real loss of performance.

I haven't gotten a minimized testcase yet, but just imagine in your
head a testcase where, going *up* the CFG, you have branches, one of
which leads 50000 blocks deep, and the other, to something where the
answer is overdefined immediately.  BFS would discover the overdefined
faster than DFS, but do more work to do so.  In practice, the right
answer is "once DFS discovers overdefined for a value, stop trying to
get more info about that value" (and so, DFS would normally cache the
overdefined results for every value it passed through in those 50k
blocks, and never do that work again. But it don't, because of the
naming problem)

Reviewers: chandlerc, djasper

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D29715

llvm-svn: 294463
2017-02-08 15:22:52 +00:00
Chandler Carruth
b07ec0321e Revert r293017 and fix the actual underlying issue.
The patch committed in r293017, as discussed on the list, doesn't really
make sense but was causing an actual issue to go away.

The issue turns out to be that in one place the extra template arguments
were dropped from the OuterAnalysisManagerProxy. This in turn caused the
types used in one set of places to access the key to be completely
different from the types used in another set of places for both Loop and
CGSCC cases where there are extra arguments.

I have literally no idea how anything seemed to work with this bug in
place. It blows my mind. But it did except for mingw64 in a DLL build.

I've added a really handy static assert that helps ensure we don't break
this in the future. It immediately diagnoses the issue with a compile
failure and a very clear error message. Much better that staring at
backtraces on a build bot. =]

llvm-svn: 294267
2017-02-07 01:50:48 +00:00
Philip Reames
7263af2894 [LVI] Switch from BFS to DFS exploration order
This patch changes the order in which LVI explores previously unexplored paths.

Previously, the code used an BFS strategy where each unexplored input was added to the search queue before any of them were explored. This has the effect of causing all inputs to be explored before returning to re-evaluate the merge point (non-local or phi node). This has the unfortunate property of doing redundant work if one of the inputs to the merge is found to be overdefined (i.e. unanalysable). If any input is overdefined, the result of the merge will be too; regardless of the values of other inputs.

The new code uses a DFS strategy where we re-evaluate the merge after evaluating each input. If we discover an overdefined input, we immediately return without exploring other inputs.

We have reports of large (4-10x) improvements of compile time with this patch and some reports of more precise analysis results as well.  See the review discussion for details.  The original motivating case was pr10584.

Differential Revision: https://reviews.llvm.org/D28190

llvm-svn: 294264
2017-02-07 00:25:24 +00:00
Chandler Carruth
e38cc318e0 [PM/LCG] Fix the no-asserts build after r294227. Sorry for the noise.
llvm-svn: 294235
2017-02-06 20:59:07 +00:00
Chandler Carruth
93759a2957 [PM/LCG] Remove the lazy RefSCC formation from the LazyCallGraph during
iteration.

The lazy formation of RefSCCs isn't really the most important part of
the laziness here -- that has to do with walking the functions
themselves -- and isn't essential to maintain. Originally, there were
incremental update algorithms that relied on updates happening
predominantly near the most recent RefSCC formed, but those have been
replaced with ones that have much tighter general case bounds at this
point. We do still perform asserts that only scale well due to this
incrementality, but those are easy to place behind EXPENSIVE_CHECKS.

Removing this simplifies the entire analysis by having a single up-front
step that builds all of the RefSCCs in a direct Tarjan walk. We can even
easily replace this with other or better algorithms at will and with
much less confusion now that there is no iterator-based incremental
logic involved. This removes a lot of complexity from LCG.

Another advantage of moving in this direction is that it simplifies
testing the system substantially as we no longer have to worry about
observing and mutating the graph half-way through the RefSCC formation.

We still need a somewhat special iterator for RefSCCs because we want
the iterator to remain stable in the face of graph updates. However,
this now merely involves relative indexing to the current RefSCC's
position in the sequence which isn't too hard.

Differential Revision: https://reviews.llvm.org/D29381

llvm-svn: 294227
2017-02-06 19:38:06 +00:00
Sanjay Patel
00cf3d4d68 [ValueTracking] emit a remark when we detect a conflicting assumption (PR31809)
This is a follow-up to D29395 where we try to be good citizens and let the user know that
we've probably gone off the rails.

This should allow us to resolve:
https://llvm.org/bugs/show_bug.cgi?id=31809

Differential Revision: https://reviews.llvm.org/D29404

llvm-svn: 294208
2017-02-06 18:26:06 +00:00
Daniil Fukalov
fd35b81460 [SCEV] limit recursion depth and operands number in getAddExpr
for a quite big function with source like

%add = add nsw i32 %mul, %conv
%mul1 = mul nsw i32 %add, %conv
%add2 = add nsw i32 %mul1, %add
%mul3 = mul nsw i32 %add2, %add
; repeat couple of thousands times
that can be produced by loop unroll, getAddExpr() tries to recursively construct SCEV and runs almost infinite time.

Added recursion depth restriction (with new parameter to set it)

Reviewers: sanjoy

Subscribers: hfinkel, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D28158

llvm-svn: 294181
2017-02-06 12:38:06 +00:00
Michael Kuperstein
f2f8127335 [SLP] Make sortMemAccesses explicitly return an error. NFC.
llvm-svn: 294029
2017-02-03 19:32:50 +00:00
Michael Kuperstein
f76d717429 [SLP] Use SCEV to sort memory accesses.
This generalizes memory access sorting to use differences between SCEVs,
instead of relying on constant offsets. That allows us to properly do
SLP vectorization of non-sequentially ordered loads within loops bodies.

Differential Revision: https://reviews.llvm.org/D29425

llvm-svn: 294027
2017-02-03 19:09:45 +00:00
Mehdi Amini
265b7f9709 Revert "[ThinLTO] Add an auto-hide feature"
This reverts commit r293970.

After more discussion, this belongs to the linker side and
there is no added value to do it at this level.

llvm-svn: 293993
2017-02-03 07:41:43 +00:00
Mehdi Amini
45f4286def [ThinLTO] Add an auto-hide feature
When a symbol is not exported outside of the
DSO, it is can be hidden. Usually we try to internalize
as much as possible, but it is not always possible, for
instance a symbol can be referenced outside of the LTO
unit, or there can be cross-module reference in ThinLTO.

This is a recommit of r293912 after fixing build failures,
and a recommit of r293918 after fixing LLD tests.

Differential Revision: https://reviews.llvm.org/D28978

llvm-svn: 293970
2017-02-03 00:32:38 +00:00
Mehdi Amini
47ba57f1d4 Revert "[ThinLTO] Add an auto-hide feature"
This reverts commit r293918, one lld test does not pass.

llvm-svn: 293961
2017-02-02 23:20:36 +00:00
Xinliang David Li
0bca950cbd [PGO] internal option cleanups
1. Added comments for options
2. Added missing option cl::desc field
3. Uniified function filter option for graph viewing.
   Now PGO count/raw-counts share the same
   filter option: -view-bfi-func-name=.

llvm-svn: 293938
2017-02-02 21:29:17 +00:00
Xinliang David Li
83423510b2 [PGO] make graph view internal options available for all builds
Differential Revision: https://reviews.llvm.org/D29259

llvm-svn: 293921
2017-02-02 19:18:56 +00:00
Mehdi Amini
f59f175fb8 [ThinLTO] Add an auto-hide feature
When a symbol is not exported outside of the
DSO, it is can be hidden. Usually we try to internalize
as much as possible, but it is not always possible, for
instance a symbol can be referenced outside of the LTO
unit, or there can be cross-module reference in ThinLTO.

This is a recommit of r293912 after fixing build failures.

Differential Revision: https://reviews.llvm.org/D28978

llvm-svn: 293918
2017-02-02 18:31:35 +00:00
Mehdi Amini
4dd531a1fc Revert "[ThinLTO] Add an auto-hide feature"
This reverts r293912, bots are broken.

llvm-svn: 293914
2017-02-02 18:24:37 +00:00
Mehdi Amini
2845cff029 [ThinLTO] Add an auto-hide feature
When a symbol is not exported outside of the
DSO, it is can be hidden. Usually we try to internalize
as much as possible, but it is not always possible, for
instance a symbol can be referenced outside of the LTO
unit, or there can be cross-module reference in ThinLTO.

Differential Revision: https://reviews.llvm.org/D28978

llvm-svn: 293912
2017-02-02 18:13:46 +00:00
Jun Bum Lim
296a001019 [JumpThread] Enhance finding partial redundant loads by continuing scanning single predecessor
Summary: While scanning predecessors to find an available loaded value, if the predecessor has a single predecessor, we can continue scanning through the single predecessor.

Reviewers: mcrosier, rengolin, reames, davidxl, haicheng

Reviewed By: rengolin

Subscribers: zzheng, llvm-commits

Differential Revision: https://reviews.llvm.org/D29200

llvm-svn: 293896
2017-02-02 15:12:34 +00:00
Adam Nemet
835438236b [LV] Also port failure remarks to new OptimizationRemarkEmitter API
llvm-svn: 293866
2017-02-02 05:41:51 +00:00
Sanjay Patel
01c2913892 [ValueTracking] remove a FIXME for something we don't want to do; NFC
The comment was added with:
https://reviews.llvm.org/rL293773
...but there would be a cost to implement this and possibly no payoff.

llvm-svn: 293823
2017-02-01 22:27:34 +00:00
Matthew Simpson
1981a68b4b [LV] Move interleaved access helper functions to VectorUtils (NFC)
This patch moves some helper functions related to interleaved access
vectorization out of LoopVectorize.cpp and into VectorUtils.cpp. We would like
to use these functions in a follow-on patch that improves interleaved load and
store lowering in (ARM/AArch64)ISelLowering.cpp. One of the functions was
already duplicated there and has been removed.

Differential Revision: https://reviews.llvm.org/D29398

llvm-svn: 293788
2017-02-01 17:45:46 +00:00
Sanjay Patel
57ac5b4c24 [ValueTracking] avoid crashing from bad assumptions (PR31809)
A program may contain llvm.assume info that disagrees with other analysis. 
This may be caused by UB in the program, so we must not crash because of that.

As noted in the code comments:
https://llvm.org/bugs/show_bug.cgi?id=31809
...we can do better, but this at least avoids the assert/crash in the bug report.

Differential Revision: https://reviews.llvm.org/D29395

llvm-svn: 293773
2017-02-01 15:41:32 +00:00
Eli Friedman
2fb6d274be [SCEV] Simplify/generalize howFarToZero solving.
Make SolveLinEquationWithOverflow take the start as a SCEV, so we can
solve more cases. With that implemented, get rid of the special case
for powers of two.

The additional functionality probably isn't particularly useful,
but it might help a little for certain cases involving pointer
arithmetic.

Differential Revision: https://reviews.llvm.org/D28884

llvm-svn: 293576
2017-01-31 00:42:42 +00:00
Matt Arsenault
dd128e79e5 NVPTX: Refactor NVPTXInferAddressSpaces to check TTI
Add a new TTI hook for getting the generic address space value.

llvm-svn: 293563
2017-01-30 23:02:12 +00:00
Sanjay Patel
4eb6691ce0 [ValueTracking] clean up lookThroughCast; NFCI
1. Use auto with dyn_cast.
2. Don't use else after return.
3. Convert chain of 'else if' to switch.
4. Improve variable names.

llvm-svn: 293432
2017-01-29 16:34:57 +00:00
Mohammad Shahid
a2646e67e7 [SLP] Vectorize loads of consecutive memory accesses, accessed in non-consecutive (jumbled) way.
The jumbled scalar loads will be sorted while building the tree and these accesses will be marked to generate shufflevector after the vectorized load with proper mask.

Reviewers: hfinkel, mssimpso, mkuper

Differential Revision: https://reviews.llvm.org/D26905

Change-Id: I9c0c8e6f91a00076a7ee1465440a3f6ae092f7ad
llvm-svn: 293386
2017-01-28 17:59:44 +00:00
Matthias Braun
5809e12d46 Cleanup dump() functions.
We had various variants of defining dump() functions in LLVM. Normalize
them (this should just consistently implement the things discussed in
http://lists.llvm.org/pipermail/cfe-dev/2014-January/034323.html

For reference:
- Public headers should just declare the dump() method but not use
  LLVM_DUMP_METHOD or #if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
- The definition of a dump method should look like this:
  #if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
  LLVM_DUMP_METHOD void MyClass::dump() {
    // print stuff to dbgs()...
  }
  #endif

llvm-svn: 293359
2017-01-28 02:02:38 +00:00
Peter Collingbourne
3703d8c3b2 Analysis: Add appropriate const qualification to functions in TypeMetadataUtils.cpp. NFC.
llvm-svn: 293341
2017-01-27 22:55:30 +00:00
Mehdi Amini
6aa2c55163 Fix BasicAA incorrect assumption on GEP
This is fixing pr31761: BasicAA is deducing NoAlias
on the result of the GEP if the base pointer is itself NoAlias.

This is possible only if the NoAlias on the base pointer is
deduced with a non-sized query: this should guarantee that
the pointers are belonging to different memory allocation
and that the GEP can't legally jump from one to another.

Differential Revision: https://reviews.llvm.org/D29216

llvm-svn: 293293
2017-01-27 16:12:22 +00:00
Justin Lebar
ed0bd55a60 [ValueTracking] Add comment that CannotBeOrderedLessThanZero does the wrong thing for powi.
Summary:
CannotBeOrderedLessThanZero(powi(x, exp)) returns true if
CannotBeOrderedLessThanZero(x).  But powi(-0, exp) is negative if exp is
odd, so we actually want to return SignBitMustBeZero(x).

Except that also isn't right, because we want to return true if x is
NaN, even if x has a negative sign bit.

What we really need in order to fix this is a consistent approach in
this function to handling the sign bit of NaNs.  Without this it's very
difficult to say what the correct behavior here is.

Reviewers: hfinkel, efriedma, sanjoy

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28927

llvm-svn: 293243
2017-01-27 00:58:34 +00:00
Daniil Fukalov
a685f298a9 [SCEV] Introduce add operation inlining limit
Inlining in getAddExpr() can cause abnormal computational time in some cases.
New parameter -scev-addops-inline-threshold is intruduced with default value 500.

Reviewers: sanjoy

Subscribers: mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D28812

llvm-svn: 293176
2017-01-26 13:33:17 +00:00
Chandler Carruth
579549a336 [PM] Use PoisoningVH correctly when merely deleting entries in a map
with it.

This code was dereferencing the PoisoningVH which isn't allowed once it
is poisoned. But the code itself really doesn't need to access the
pointer, it is just doing the safe stuff of clearing out data structures
keyed on the pointer value.

Change the code to use iterators to erase directly from a DenseMap. This
is also substantially more efficient as it avoids lots of hashing and
lookups to do the erasure. DenseMap supports iterating behind the
iteration which is fairly easy to implement.

Sadly, I don't have a test case here. I'm not even close and I don't
know that I ever will be. The issue is that several of the tricky
aspects of fixing this only show up when you cause the stack's
SmallVector to be in *EXACTLY* the right location. I only ever got
a reproduction for those with Clang, and only with *exactly* the right
command line flags. Any adjustment, even to seemingly unrelated flags,
would make partial and half-way solutions magically start to "work". In
good news, all of this was caught with the LLVM test suite. Also, there
is no *specific* code here that is untested, just that the old pattern
of code won't immediately fail on any test case I've managed to
contrive.

llvm-svn: 293160
2017-01-26 08:31:54 +00:00
Jonas Paulsson
1dc6fdc89f [TargetTransformInfo] Refactor and improve getScalarizationOverhead()
Refactoring to remove duplications of this method.

New method getOperandsScalarizationOverhead() that looks at the present unique
operands and add extract costs for them. Old behaviour was to just add extract
costs for one operand of the type always, which still happens in
getArithmeticInstrCost() if no operands are provided by the caller.

This is a good start of improving on this, but there are more places
that can be improved by using getOperandsScalarizationOverhead().

Review: Hal Finkel
https://reviews.llvm.org/D29017

llvm-svn: 293155
2017-01-26 07:03:25 +00:00
Adam Nemet
9b34221336 [llc] Add -pass-remarks-output
This is the opt/llc counterpart of -fsave-optimization-record to output
optimization remarks in a YAML file.

llvm-svn: 293121
2017-01-26 00:39:51 +00:00
Justin Lebar
f25dcc8534 [ValueTracking] Implement SignBitMustBeZero correctly for sqrt.
Summary:
Previously we assumed that the result of sqrt(x) always had 0 as its
sign bit.  But sqrt(-0) == -0.

Reviewers: hfinkel, efriedma, sanjoy

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28928

llvm-svn: 293115
2017-01-26 00:10:26 +00:00
Adam Nemet
eb46bca148 New OptimizationRemarkEmitter pass for MIR
This allows MIR passes to emit optimization remarks with the same level
of functionality that is available to IR passes.

It also hooks up the greedy register allocator to report spills.  This
allows for interesting use cases like increasing interleaving on a loop
until spilling of registers is observed.

I still need to experiment whether reporting every spill scales but this
demonstrates for now that the functionality works from llc
using -pass-remarks*=<pass>.

Differential Revision: https://reviews.llvm.org/D29004

llvm-svn: 293110
2017-01-25 23:20:33 +00:00
Adam Nemet
ab7818e0cc [OptDiag] Split code region out of DiagnosticInfoOptimizationBase
Code region is the only part of this class that is IR-specific.  Code
region is moved down in the inheritance tree to a new derived class,
called DiagnosticInfoIROptimization.

All the existing remarks are derived from this new class now.

This allows the new MIR pass-remark classes to be derived from
DiagnosticInfoOptimizationBase.

Also because we keep the name DiagnosticInfoOptimizationBase, the clang
parts don't need any adjustment.

Differential Revision: https://reviews.llvm.org/D29003

llvm-svn: 293109
2017-01-25 23:20:25 +00:00
whitequark
e14626baaf Mark @llvm.powi.* as safe to speculatively execute.
Floating point intrinsics in LLVM are generally not speculatively
executed, since most of them are defined to behave the same as libm
functions, which set errno.

However, the @llvm.powi.* intrinsics do not correspond to any libm
function, and lacks any defined error handling semantics in LangRef.
It most certainly does not alter errno.

llvm-svn: 293041
2017-01-25 09:32:30 +00:00
NAKAMURA Takumi
b8624b1643 Rewind instantiations of OuterAnalysisManagerProxy in r289317, r291651, and r291662.
I found root class should be instantiated for variadic tempate to instantiate static member explicitly.

This will fix failures in mingw DLL build.

llvm-svn: 293017
2017-01-25 04:26:29 +00:00
Sanjay Patel
7b87306d0f [InstSimplify] try to eliminate icmp Pred (add nsw X, C1), C2
I was surprised to see that we're missing icmp folds based on 'add nsw' in InstCombine, 
but we should handle the InstSimplify cases first because that could make the InstCombine
code simpler.

Here are Alive-based proofs for the logic:

Name: add_neg_constant
Pre: C1 < 0 && (C2 > ((1<<(width(C1)-1)) + C1))
%a = add nsw i7 %x, C1
%b = icmp sgt %a, C2
  =>
%b = false

Name: add_pos_constant
Pre: C1 > 0 && (C2 < ((1<<(width(C1)-1)) + C1 - 1))
%a = add nsw i6 %x, C1
%b = icmp slt %a, C2
  =>
%b = false

Name: nuw
Pre: C1 u>= C2
%a = add nuw i11 %x, C1
%b = icmp ult %a, C2
  =>
%b = false

Differential Revision: https://reviews.llvm.org/D29053

llvm-svn: 292952
2017-01-24 17:03:24 +00:00
Chandler Carruth
d7cc3d1b4a [PH] Replace uses of AssertingVH from members of analysis results with
a lazy-asserting PoisoningVH.

AssertVH is fundamentally incompatible with cache-invalidation of
analysis results. The invaliadtion happens after the AssertingVH has
already fired. Instead, use a PoisoningVH that will assert if the
dangling handle is ever used rather than merely be assigned or
destroyed.

This patch also removes all of the (numerous) doomed attempts to work
around this fundamental incompatibility. It is a pretty significant
simplification IMO.

The most interesting change is in the Inliner where we still do some
clearing because we don't want to rely on the coarse grained
invalidation strategy of the containing pass manager. However, I prefer
the approach that contains this logic to the cleanup phase of the
Inliner, and I think we could enhance the CGSCC analysis management
layer to make this even better in the future if desired.

The rest is straight cleanup.

I've also added a test for one of the harder cases to work around: when
a *module analysis* contains many AssertingVHes pointing at functions.

Differential Revision: https://reviews.llvm.org/D29006

llvm-svn: 292928
2017-01-24 12:55:57 +00:00