1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-22 02:33:06 +01:00
Commit Graph

573 Commits

Author SHA1 Message Date
Craig Topper
3454be975b [CallSite removal][TargetLibraryInfo] Replace ImmutableCallSite with CallBase in one of the getLibFunc signatures. NFC
Differential Revision: https://reviews.llvm.org/D78083
2020-04-15 22:43:41 -07:00
Juneyoung Lee
64eac7a6cd [ValueTracking] Implement canCreatePoison
Summary:
This PR adds `canCreatePoison(Instruction *I)` which returns true if `I` can generate poison from non-poison
operands.

Reviewers: spatel, nikic, lebedev.ri

Reviewed By: spatel

Subscribers: hiraditya, llvm-commits, regehr, nlopes

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77890
2020-04-15 05:58:06 +09:00
Aaron Puchert
ac540cb0fe [ValueLattice] Remove unused DataLayout parameter of mergeIn, NFC
Reviewed By: fhahn, echristo

Differential Revision: https://reviews.llvm.org/D78061
2020-04-14 13:32:53 +02:00
Tyker
b6c8c60f14 [AssumeBundles] adapt Assumption cache to assume bundles
Summary: change assumption cache to store an assume along with an index to the operand bundle containing the knowledge.

Reviewers: jdoerfert, hfinkel

Reviewed By: jdoerfert

Subscribers: hiraditya, mgrang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77402
2020-04-13 12:04:51 +02:00
Sanjay Patel
91701e7d4b [VectorUtils] add IR-level analysis for widening of shuffle mask
This is similar to the recent move/addition of "scaleShuffleMask" (D76508),
but there are a couple of differences:

1. The existing x86 helper (canWidenShuffleElements) always tries to
   divide-by-2, so it gets called iteratively and wouldn't handle the
   general case of non-pow-2 length.
2. The existing x86 code handles "SM_SentinelZero" - we don't have
   that in IR, but this code should be safe to use with that or other
   special (negative) values.

The motivation is to enable shuffle folds in instcombine/vector-combine
that are similar to D76844 and D76727, but in the reverse-bitcast direction.
Those patterns are visible in the tests for D40633.

Differential Revision: https://reviews.llvm.org/D77881
2020-04-12 10:14:19 -04:00
Sanjay Patel
8105dac4cc [VectorUtils] rename scaleShuffleMask to narrowShuffleMaskElts; NFC
As proposed in D77881, we'll have the related widening operation,
so this name becomes too vague.

While here, change the function signature to take an 'int' rather
than 'size_t' for the scaling factor, add an assert for overflow of
32-bits, and improve the documentation comments.
2020-04-11 10:05:49 -04:00
Johannes Doerfert
2941d88d79 [CallGraphUpdater] Remove dead constants before replacing a function
Dead constants might be left when a function is replaced, we can
gracefully handle this case and avoid complexity for the users who would
see an assertion otherwise.
2020-04-08 22:52:46 -05:00
Sanjay Patel
6f09e0c039 [ValueTracking] enhance matching of umin/umax with 'not' operands
The cmyk test is based on the known regression that resulted from:
rGf2fbdf76d8d0

This improves on the equivalent signed min/max change:
rG867f0c3c4d8c

The underlying icmp equivalence is:
  ~X pred ~Y --> Y pred X

For an icmp with constant, canonicalization results in a swapped pred:
  ~X < C -->  X > ~C
2020-04-06 11:51:59 -04:00
Sanjay Patel
9c389cc269 [ValueTracking] add/adjust tests for min/max; NFC 2020-04-06 11:14:05 -04:00
Sanjay Patel
e9fc5c66f0 [ValueTracking] enhance matching of smin/smax with 'not' operands
The cmyk tests are based on the known regression that resulted from:
rGf2fbdf76d8d0

So this improvement in analysis might be enough to restore that commit.
2020-04-05 08:54:12 -04:00
Sanjay Patel
1e300cdd8c [ValueTracking] add tests for smin/smax; NFC 2020-04-04 13:44:06 -04:00
Tyker
361dc8333f [NFC] Split Knowledge retention and place it more appropriatly
Summary:
Splitting Knowledge retention into Queries in Analysis and Builder into Transform/Utils
allows Queries and Transform/Utils to use Analysis.

Reviewers: jdoerfert, sstefan1

Reviewed By: jdoerfert

Subscribers: mgorny, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77171
2020-04-02 15:01:41 +02:00
Craig Topper
987e8eec73 [VectorUtils][X86] De-templatize scaleShuffleMask and 2 X86 shuffle mask helpers and move their implementation to cpp files
Summary: These were templated due to SelectionDAG using int masks for shuffles and IR using unsigned masks for shuffles. But now that D72467 has landed we have an int mask version of IRBuilder::CreateShuffleVector. So just use int instead of a template

Reviewers: spatel, efriedma, RKSimon

Reviewed By: efriedma

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D77183
2020-04-01 00:46:48 -07:00
Uday Bondhugula
336d326470 Introduce support for lib function aligned_alloc in TLI / memory builtins
Aligned_alloc is a standard lib function and has been in glibc since
2.16 and in the C11 standard. It has semantics similar to malloc/calloc
for several analyses/transforms. This patch introduces aligned_alloc
in target library info and memory builtins. Subsequent ones will
make other passes aware and fix https://bugs.llvm.org/show_bug.cgi?id=44062

This change will also be useful to LLVM generators that need to allocate
buffers of vector elements larger than 16 bytes (for eg. 256-bit ones),
element boundary alignment for which is not typically provided by glibc malloc.

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Differential Revision: https://reviews.llvm.org/D76970
2020-03-29 23:36:24 +05:30
Sanjay Patel
70825ea20a [Analysis] simplify code for scaleShuffleMask
This is NFC-ish. The results should be identical, but perf is hopefully
better with the fast-path for no scaling. Added a unit test for that.

The code is adapted from what used to be the DAGCombiner equivalent
function before D76508 (rG0eeee83d7513).
2020-03-23 11:47:14 -04:00
Sanjay Patel
405b09108f [VectorUtils] move x86's scaleShuffleMask to generic VectorUtils
We have some long-standing missing shuffle optimizations that could
use this transform via VectorCombine now:
https://bugs.llvm.org/show_bug.cgi?id=35454
(and we still don't get that case in the backend either)

This function is apparently templated because there's existing code
in IR that treats mask values as unsigned and backend code that
treats masks values as signed.

The mask values are not endian-dependent (as shown by the existing
bitcast transform from DAGCombiner).

Differential Revision: https://reviews.llvm.org/D76508
2020-03-23 09:58:55 -04:00
Hiroshi Yamauchi
fb601ea161 [PSI] Add tests for is(Hot|Cold)FunctionInCallGraphNthPercentile.
Summary:
Follow up on D75283.

Also remove the test code that was moved to another test and was to be removed.

Reviewers: davidxl

Subscribers: eraman, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75630
2020-03-10 08:21:10 -07:00
Whitney Tsang
509d3200f9 [LoopNest]: Analysis to discover properties of a loop nest.
Summary: This patch adds an analysis pass to collect loop nests and
summarize properties of the nest (e.g the nest depth, whether the nest
is perfect, what's the innermost loop, etc...).

The motivation for this patch was discussed at the latest meeting of the
LLVM loop group (https://ibm.box.com/v/llvm-loop-nest-analysis) where we
discussed
the unimodular loop transformation framework ( “A Loop Transformation
Theory and an Algorithm to Maximize Parallelism”, Michael E. Wolf and
Monica S. Lam, IEEE TPDS, October 1991). The unimodular framework
provides a convenient way to unify legality checking and code generation
for several loop nest transformations (e.g. loop reversal, loop
interchange, loop skewing) and their compositions. Given that the
unimodular framework is applicable to perfect loop nests this is one
property of interest we expose in this analysis. Several other utility
functions are also provided. In the future other properties of interest
can be added in a centralized place.
Authored By: etiotto
Reviewer: Meinersbur, bmahjour, kbarton, Whitney, dmgreen, fhahn,
reames, hfinkel, jdoerfert, ppc-slack
Reviewed By: Meinersbur
Subscribers: bryanpkc, ppc-slack, mgorny, hiraditya, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D68789
2020-03-03 18:25:19 +00:00
Whitney Tsang
920787f3cc Revert "[LoopNest]: Analysis to discover properties of a loop nest."
This reverts commit 3a063d68e3c97136d10a2e770f389e6c13c3b317.

Broke the build with modules enabled:
http://green.lab.llvm.org/green/job/lldb-cmake/10655/console .
2020-03-03 14:07:49 +00:00
Whitney Tsang
67c3362c2a [LoopNest]: Analysis to discover properties of a loop nest.
Summary: This patch adds an analysis pass to collect loop nests and
summarize properties of the nest (e.g the nest depth, whether the nest
is perfect, what's the innermost loop, etc...).

The motivation for this patch was discussed at the latest meeting of the
LLVM loop group (https://ibm.box.com/v/llvm-loop-nest-analysis) where we
discussed
the unimodular loop transformation framework ( “A Loop Transformation
Theory and an Algorithm to Maximize Parallelism”, Michael E. Wolf and
Monica S. Lam, IEEE TPDS, October 1991). The unimodular framework
provides a convenient way to unify legality checking and code generation
for several loop nest transformations (e.g. loop reversal, loop
interchange, loop skewing) and their compositions. Given that the
unimodular framework is applicable to perfect loop nests this is one
property of interest we expose in this analysis. Several other utility
functions are also provided. In the future other properties of interest
can be added in a centralized place.
Authored By: etiotto
Reviewer: Meinersbur, bmahjour, kbarton, Whitney, dmgreen, fhahn,
reames, hfinkel, jdoerfert, ppc-slack
Reviewed By: Meinersbur
Subscribers: bryanpkc, ppc-slack, mgorny, hiraditya, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D68789
2020-03-03 13:25:28 +00:00
Hiroshi Yamauchi
6e164c4b1d [PSI] Add the isCold query support with a given percentile value.
Summary: This follows up D67377 that added the isHot side.

Reviewers: davidxl

Subscribers: eraman, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75283
2020-03-02 12:50:15 -08:00
Brian Gesiak
cee5b6cc82 [LazyCallGraph] Fix ambiguous index value
After having committed https://reviews.llvm.org/D72226, 2 buildbots
running GCC 5.4.0 began failing. The cause was the order in which those
compilers evaluated the left- and right-hand sides of the expression
`RC.SCCIndices[C] = RC.SCCIndices.size();`. This commit splits the
expression into multiple statements to avoid ambiguity, and adds a test
case that exercises the code that caused the test failures on those
older compilers (which was originally included in the reviewed patch,
https://reviews.llvm.org/D72226).
2020-02-18 23:32:55 -05:00
Reid Kleckner
2a197a86b4 [IR] Lazily number instructions for local dominance queries
Essentially, fold OrderedBasicBlock into BasicBlock, and make it
auto-invalidate the instruction ordering when new instructions are
added. Notably, we don't need to invalidate it when removing
instructions, which is helpful when a pass mostly delete dead
instructions rather than transforming them.

The downside is that Instruction grows from 56 bytes to 64 bytes.  The
resulting LLVM code is substantially simpler and automatically handles
invalidation, which makes me think that this is the right speed and size
tradeoff.

The important change is in SymbolTableTraitsImpl.h, where the numbering
is invalidated. Everything else should be straightforward.

We probably want to implement a fancier re-numbering scheme so that
local updates don't invalidate the ordering, but I plan for that to be
future work, maybe for someone else.

Reviewed By: lattner, vsk, fhahn, dexonsmith

Differential Revision: https://reviews.llvm.org/D51664
2020-02-18 14:44:24 -08:00
Brian Gesiak
a9f4cb4ac7 Re-land "Add LazyCallGraph API to add function to RefSCC"
This re-commits https://reviews.llvm.org/D70927, which I reverted in
https://reviews.llvm.org/rG28213680b2a7d1fdeea16aa3f3a368879472c72a due
to a buildbot error:
http://lab.llvm.org:8011/builders/clang-cmake-x86_64-avx2-linux/builds/13251

I no longer include a test case that appears to crash when built with the
buildbot's compiler, GCC 5.4.0.
2020-02-17 16:59:25 -05:00
Brian Gesiak
175b662e49 Revert "Add LazyCallGraph API to add function to RefSCC"
This reverts commit https://reviews.llvm.org/rG449a13509190b1c57e5fcf5cd7e8f0f647f564b4,
due to buildbot failures such as
http://lab.llvm.org:8011/builders/clang-cmake-x86_64-avx2-linux/builds/13251.
2020-02-17 14:25:10 -05:00
Brian Gesiak
2dfb19d67e Add LazyCallGraph API to add function to RefSCC
Summary:
Depends on https://reviews.llvm.org/D70927.

`LazyCallGraph::addNewFunctionIntoSCC` allows users to insert a new
function node into a call graph, into a specific, existing SCC.

Extend this interface such that functions can be added even when they do
not belong in any existing SCC, but instead in a new SCC within an
existing RefSCC.

The ability to insert new functions as part of a RefSCC is necessary for
outlined functions that do not form a strongly connected cycle with the
function they are outlined from. An example of such a function would be the
coroutine funclets 'f.resume', etc., which are outlined from a coroutine 'f'.
Coroutine 'f' only references the funclets' addresses, it does not call
them directly.

Reviewers: jdoerfert, chandlerc, wenlei, hfinkel

Reviewed By: jdoerfert

Subscribers: hfinkel, JonChesterfield, mehdi_amini, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72226
2020-02-17 12:56:38 -05:00
Florian Hahn
0c03d6c1a0 [ValueLattice] Update markConstantRange to return false equal ranges.
Currently we always return true, when markConstantRange is used on an
object already containing a constant range. If NewR is equal to the
existing constant range however, nothing changes and we should return
false.

I also went ahead and added a clarifying comment and improved the
assertion.

Reviewers: efriedma, davide, nikic

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D73240
2020-02-15 22:06:55 +01:00
Ehud Katz
72c82a7786 [unittests] Fix TargetLibraryInfoTest.ValidProto 2020-02-12 14:13:14 +02:00
Johannes Doerfert
9686ad56e1 [FIX] Fix warning in LazyCallGraphTest caused by D70927 2020-02-08 18:58:16 -06:00
Johannes Doerfert
1b1628ecca Introduce a CallGraph updater helper class
The CallGraphUpdater is a helper that simplifies the process of updating
the call graph, both old and new style, while running an CGSCC pass.

The uses are contained in different commits, e.g. D70767.

More functionality is added as we need it.

Reviewed By: modocache, hfinkel

Differential Revision: https://reviews.llvm.org/D70927
2020-02-08 14:16:48 -06:00
George Burgess IV
3ad731fcef [SimplifyLibCalls] Add __strlen_chk.
Bionic has had `__strlen_chk` for a while. Optimizing that into a
constant is quite profitable, when possible.

Differential Revision: https://reviews.llvm.org/D74079
2020-02-08 11:51:00 -08:00
Sanjay Patel
945ea72b3a [Analysis] add query to get splat value from array of ints
I was debug stepping through an x86 shuffle lowering and
noticed we were doing an N^2 search for splat index. I
didn't find the equivalent functionality anywhere else in
LLVM, so here's a helper that takes an array of int and
returns a splatted index while ignoring undefs (any
negative value).

This might also be used inside existing
ShuffleVectorInst/ShuffleVectorSDNode functions and/or
help with D72467.

Differential Revision: https://reviews.llvm.org/D74064
2020-02-05 14:55:02 -05:00
Johannes Doerfert
f80ba61e01 [PM][CGSCC] Add a helper to update the call graph from SCC passes
With this patch new trivial edges can be added to an SCC in a CGSCC
pass via the updateCGAndAnalysisManagerForCGSCCPass method. It shares
almost all the code with the existing
updateCGAndAnalysisManagerForFunctionPass method but it implements the
first step towards the TODOs.

This was initially part of D70927.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D72025
2020-02-02 23:32:18 -06:00
Sanjay Patel
c19355e54c [Analysis] add optional index parameter to isSplatValue()
We want to allow splat value transforms to improve PR44588 and related bugs:
https://bugs.llvm.org/show_bug.cgi?id=44588
...but to do that, we need to know if values are splatted from the same,
specific index (lane) rather than splatted from an arbitrary index.

We can improve the undef handling with 1-liner follow-ups because the
Constant API optionally allow undefs now.

Differential Revision: https://reviews.llvm.org/D73549
2020-02-02 10:52:00 -05:00
Francesco Petrogalli
308e2db83d [llvm][VectorUtils] Tweak VFShape for scalable vector functions.
Summary:
This patch makes sure that the field VFShape.VF is greater than zero
when demangling the vector function name of scalable vector functions
encoded in the "vector-function-abi-variant" attribute.

This change is required to be able to provide instances of VFShape
that can be used to query the VFDatabase for the vectorization passes,
as such passes always require a positive value for the Vectorization Factor (VF)
needed by the vectorization process.

It is not possible to extract the value of VFShape.VF from the mangled
name of scalable vector functions, because it is encoded as
`x`. Therefore, the VFABI demangling function has been modified to
extract such information from the IR declaration of the vector
function, under the assumption that _all_ vectors in the signature of
the vector function have the same number of lanes. Such assumption is
valid because it is also assumed by the Vector Function ABI
specifications supported by the demangling function (x86, AArch64, and
LLVM internal one).

The unit tests that demangle scalable names have been modified by
adding the IR module that carries the declaration of the vector
function name being demangled.

In particular, the demangling function fails in the following cases:

1. When the declaration of the scalable vector function is not
    present in the module.

2. When the value of VFSHape.VF is not greater than 0.

Reviewers: jdoerfert, sdesmalen, andwar

Reviewed By: jdoerfert

Subscribers: mgorny, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73286
2020-01-30 05:53:56 +00:00
Hiroshi Yamauchi
547164a49a [Loads] Handle simple cases with same base pointer with constant offsets in FindAvailableLoadedValue when AA is null.
Summary:
This will help with devirtualization (store forwarding with vtable pointers in
the presence of other stores into members in the constructor.) During inlining,
we don't have AA.

Reviewers: davidxl

Subscribers: mgorny, Prazek, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71307
2020-01-29 13:05:46 -08:00
Benjamin Kramer
87d13166c7 Make llvm::StringRef to std::string conversions explicit.
This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.

This is mostly mechanical from a custom clang-tidy check, with a lot of
manual fixups. It uncovers a lot of minor inefficiencies.

This doesn't actually modify StringRef yet, I'll do that in a follow-up.
2020-01-28 23:25:25 +01:00
Alina Sbirlea
62ee34a755 [UnitTests] Add invalidate methods. 2020-01-17 10:47:52 -08:00
Francesco Petrogalli
f6e39fe1d1 [VectorUtils] Rework the Vector Function Database (VFDatabase).
Summary:
This commits is a rework of the patch in
https://reviews.llvm.org/D67572.

The rework was requested to prevent out-of-tree performance regression
when vectorizing out-of-tree IR intrinsics. The vectorization of such
intrinsics is enquired via the static function `isTLIScalarize`. For
detail see the discussion in https://reviews.llvm.org/D67572.

Reviewers: uabelho, fhahn, sdesmalen

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72734
2020-01-16 15:08:26 +00:00
Sjoerd Meijer
6cb7b580a7 [SVEV] Recognise hardware-loop intrinsic loop.decrement.reg
Teach SCEV about the @loop.decrement.reg intrinsic, which has exactly the same
semantics as a sub expression. This allows us to query hardware-loops, which
contain this @loop.decrement.reg intrinsic, so that we can calculate iteration
counts, exit values, etc. of hardwareloops.

This "int_loop_decrement_reg" intrinsic is defined as "IntrNoDuplicate". Thus,
while hardware-loops and tripcounts now become analysable by SCEV, this
prevents the usual loop transformations from applying transformations on
hardware-loops, which is what we want at this point, for which I have added
test cases for loopunrolling and IndVarSimplify and LFTR.

Differential Revision: https://reviews.llvm.org/D71563
2020-01-10 09:35:00 +00:00
James Henderson
91705af363 [NFC] Fix trivial typos in comments
Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D72143

Patch by Kazuaki Ishizaki.
2020-01-06 10:50:26 +00:00
Florian Hahn
c2f9eea17d Revert "[SCEV] Move ScalarEvolutionExpander.cpp to Transforms/Utils (NFC)."
This reverts commit 51ef53f3bd23559203fe9af82ff2facbfedc1db3, as it
breaks some bots.
2020-01-04 18:44:38 +00:00
Florian Hahn
088559d18d [SCEV] Move ScalarEvolutionExpander.cpp to Transforms/Utils (NFC).
SCEVExpander modifies the underlying function so it is more suitable in
Transforms/Utils, rather than Analysis. This allows using other
transform utils in SCEVExpander.

Reviewers: sanjoy.google, efriedma, reames

Reviewed By: sanjoy.google

Differential Revision: https://reviews.llvm.org/D71537
2020-01-04 18:29:35 +00:00
Francesco Petrogalli
16e0e6d447 Revert "[VectorUtils] Introduce the Vector Function Database (VFDatabase)."
This reverts commit 0be81968a283fd4161cb9ac9748d5ed200926292.

The VFDatabase needs some rework to be able to handle vectorization
and subsequent scalarization of intrinsics in out-of-tree versions of
the compiler. For more details, see the discussion in
https://reviews.llvm.org/D67572.
2019-12-13 19:42:04 +00:00
Kit Barton
6565d62ef3 Rename LoopInfo::isRotated() to LoopInfo::isRotatedForm().
This patch renames the LoopInfo::isRotated() method to LoopInfo::isRotatedForm()
to make it clear that the method checks whether the loop is in rotated form, not
whether the loop has been rotated by the LoopRotation pass.
2019-12-12 14:22:36 -05:00
Kit Barton
cd7e9ade1b [Loop] Add isRotated method to Loop class.
Summary:
This patch adds a method to determine if a loop is in rotated form (the latch is
an exiting block). It also modifies the getLoopGuardBranch method to use this
new method. This method can also be used in Loopfusion. Once this patch lands I
will make the corresponding changes there.

Reviewers: jdoerfert, Meinersbur, dmgreen, etiotto, Whitney, fhahn, hfinkel

Reviewed By: Meinersbur

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65958
2019-12-11 09:43:10 -05:00
Francesco Petrogalli
db175208b7 [VectorUtils] Introduce the Vector Function Database (VFDatabase).
This patch introduced the VFDatabase, the framework proposed in
http://lists.llvm.org/pipermail/llvm-dev/2019-June/133484.html. [*]

In this patch the VFDatabase is used to bridge the TargetLibraryInfo
(TLI) calls that were previously used to query for the availability of
vector counterparts of scalar functions.

The VFISAKind field `ISA` of VFShape have been moved into into VFInfo,
under the assumption that different vector ISAs may provide the same
vector signature. At the moment, the vectorizer accepts any of the
available ISAs as long as the signature provided by the VFDatabase
matches the one expected in the vectorization process. For example,
when targeting AVX or AVX2, which both have 256-bit registers, the IR
signature of the two vector functions associated to the two ISAs is
the same. The `getVectorizedFunction` method at the moment returns the
first available match. We will need to add more heuristics to the
search system to decide which of the available version (TLI, AVX,
AVX2, ...)  the system should prefer, when multiple versions with the
same VFShape are present.

Some of the code in this patch is based on the work done by Sumedh
Arani in https://reviews.llvm.org/D66025.

[*] Notice that in the proposal the VFDatabase was called SVFS. The
name VFDatabase is more in line with LLVM recommendations for
naming classes and variables.

Differential Revision: https://reviews.llvm.org/D67572
2019-12-10 16:36:44 +00:00
Francesco Petrogalli
3d7e1282ef [llvm][VFABI] Add more testing for LLVM internal mangling.
Summary:
The tests cover the internal mangling for:

1. Masked signatures.
2. Scalable signatures.
3. Masked scalable signatures with linear.

Reviewers: andwar

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71146
2019-12-09 15:49:45 +00:00
Francesco Petrogalli
4f5676a2c2 [fix][unittests][llvm] Fix running unit tests without assertions. [NFCI] 2019-12-05 03:28:19 +00:00
Francesco Petrogalli
a6a7edd0c0 [VectorUtils] API for VFShape, update VFInfo.
Summary:
This patch introduces an API to build and modify vector shapes.

The validity of a VFShape can be checked with the
`hasValidParameterList` method, which is also run in an assertion each
time a VFShape is modified.

The field VFISAKind has been moved to VFInfo under the assumption that
different ISAs can map to the same VFShape (as it can be in the case
of vector extensions with the same registers size, for example AVX and
AVX2).

Reviewers: sdesmalen, jdoerfert, simoll, hsaito

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70513
2019-12-04 20:40:05 +00:00
Heejin Ahn
1e28c741b7 [unittests] Add InitializePasses.h includes
Summary:
After D70211, Pass.h does not include InitializePasses.h anymore, so
these files need to include InitializePasses.h directly.

Reviewers: rnk

Subscribers: MatzeB, mehdi_amini, zzheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70217
2019-11-13 19:42:58 -08:00
Francesco Petrogalli
d054b2dd05 [VFABI] Add LLVM internal mangling for vector functions.
Summary:
This patch adds a custom ISA for vector functions for internal use
in LLVM. The <isa> token is set to "_LLVM_", and it is not attached
to any specific instruction Vector ISA, or Vector Function ABI.

The ISA is used as a token for handling Vector Function ABI-style
vectorization for those vector functions that are not directly
associated to any existing Vector Function ABI (for example, some of
the vector functions exposed by TargetLibraryInfo). The demangling
function for this ISA in a Vector Function ABI context is set to be
the same as the common one shared between X86 and AArch64.

Reviewers: jdoerfert, sdesmalen, simoll

Subscribers: kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70089
2019-11-13 03:26:39 +00:00
Francesco Petrogalli
cbc5510b51 [VFABI] Read/Write functions for the VFABI attribute.
The attribute is stored at the `FunctionIndex` attribute set, with the
name "vector-function-abi-variant".

The get/set methods of the attribute have assertion to verify that:

1. Each name in the attribute is a valid VFABI mangled name.

2. Each name in the attribute correspond to a function declared in the
   module.

Differential Revision: https://reviews.llvm.org/D69976
2019-11-12 03:40:42 +00:00
Whitney Tsang
a6f46eec3d [LOOPGUARD] Remove asserts in getLoopGuardBranch
Summary: The assertion in getLoopGuardBranch can be a 'return nullptr'
under if condition.
Authored By: DTharun
Reviewer: Whitney, fhahn
Reviewed By: Whitney, fhahn
Subscribers: fhahn, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D66084

llvm-svn: 373857
2019-10-06 16:39:43 +00:00
Whitney Tsang
c92726d91d [LOOPGUARD] Disable loop with multiple loop exiting blocks.
Summary: As discussed in the loop group meeting. With the current
definition of loop guard, we should not allow multiple loop exiting
blocks. For loops that has multiple loop exiting blocks, we can simply
unable to find the loop guard.
When getUniqueExitBlock() obtains a vector size not equals to one, that
means there is either no exit blocks or there exists more than one
unique block the loop exit to.
If we don't disallow loop with multiple loop exit blocks, then with our
current implementation, there can exist exit blocks don't post dominated
by the non pre-header successor of the guard block.
Reviewer: reames, Meinersbur, kbarton, etiotto, bmahjour
Reviewed By: Meinersbur, kbarton
Subscribers: fhahn, hiraditya, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D66529

llvm-svn: 373011
2019-09-26 20:20:42 +00:00
Artur Pilipenko
7644f0319b [SCEV] Disable canonical expansion for non-affine addrecs.
Reviewed By: apilipenko

Differential Revision: https://reviews.llvm.org/D65276

Patch by Evgeniy Brevnov (ybrevnov@azul.com)

llvm-svn: 372789
2019-09-24 23:21:07 +00:00
Hiroshi Yamauchi
8566ca7d16 [PGO][PGSO] ProfileSummary changes.
(Split of off D67120)

ProfileSummary changes for profile guided size optimization.

Differential Revision: https://reviews.llvm.org/D67377

llvm-svn: 372783
2019-09-24 22:17:51 +00:00
Francesco Petrogalli
bb6e7b26a0 [SVFS] Vector Function ABI demangling.
This patch implements the demangling functionality as described in the
Vector Function ABI. This patch will be used to implement the
SearchVectorFunctionSystem (SVFS) as described in the RFC:

http://lists.llvm.org/pipermail/llvm-dev/2019-June/133484.html

A fuzzer is added to test the demangling utility.

Patch by Sumedh Arani <sumedh.arani@arm.com>

Differential revision: https://reviews.llvm.org/D66024

llvm-svn: 372343
2019-09-19 17:47:32 +00:00
Teresa Johnson
0062c013da Change TargetLibraryInfo analysis passes to always require Function
Summary:
This is the first change to enable the TLI to be built per-function so
that -fno-builtin* handling can be migrated to use function attributes.
See discussion on D61634 for background. This is an enabler for fixing
handling of these options for LTO, for example.

This change should not affect behavior, as the provided function is not
yet used to build a specifically per-function TLI, but rather enables
that migration.

Most of the changes were very mechanical, e.g. passing a Function to the
legacy analysis pass's getTLI interface, or in Module level cases,
adding a callback. This is similar to the way the per-function TTI
analysis works.

There was one place where we were looking for builtins but not in the
context of a specific function. See FindCXAAtExit in
lib/Transforms/IPO/GlobalOpt.cpp. I'm somewhat concerned my workaround
could provide the wrong behavior in some corner cases. Suggestions
welcome.

Reviewers: chandlerc, hfinkel

Subscribers: arsenm, dschuff, jvesely, nhaehnle, mehdi_amini, javed.absar, sbc100, jgravelle-google, eraman, aheejin, steven_wu, george.burgess.iv, dexonsmith, jfb, asbirlea, gchatelet, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66428

llvm-svn: 371284
2019-09-07 03:09:36 +00:00
Jonas Devlieghere
2c693415b7 [llvm] Migrate llvm::make_unique to std::make_unique
Now that we've moved to C++14, we no longer need the llvm::make_unique
implementation from STLExtras.h. This patch is a mechanical replacement
of (hopefully) all the llvm::make_unique instances across the monorepo.

llvm-svn: 369013
2019-08-15 15:54:37 +00:00
Nikolai Bozhenov
bb986bef44 [SCEV] Return zero from computeConstantDifference(X, X)
Without this patch computeConstantDifference returns None for cases like
these:

  computeConstantDifference(%x, %x)
  computeConstantDifference({%x,+,16}, {%x,+,16})

Differential Revision: https://reviews.llvm.org/D65474

llvm-svn: 368193
2019-08-07 17:38:38 +00:00
Peter Smith
b8d8637f8d [AliasAnalysis] Initialize a member variable that may be used by unit test.
The unit tests in BasicAliasAnalysisTest use the alias analysis API
directly and do not call setAAResults to initalize AAR. This gives a
valgrind error "Conditional Jump depends on unitialized variable".

On most buildbots the variable is nullptr, but in some cases it can be
non nullptr leading to seemingly random failures.

These tests were disabled in r366986. With the initialization they can be
enabled again.

Fixes PR42719

Differential Revision: https://reviews.llvm.org/D65568

llvm-svn: 367662
2019-08-02 08:05:14 +00:00
Whitney Tsang
9a08056433 [LOOPINFO] Introduce the loop guard API.
Summary:
This is the first patch for the loop guard. We introduced
getLoopGuardBranch() and isGuarded().
This currently only works on simplified loop, as it requires a preheader
and a latch to identify the guard.
It will work on loops of the form:
/// GuardBB:
///   br cond1, Preheader, ExitSucc <== GuardBranch
/// Preheader:
///   br Header
/// Header:
///  ...
///   br Latch
/// Latch:
///   br cond2, Header, ExitBlock
/// ExitBlock:
///   br ExitSucc
/// ExitSucc:
Prior discussions leading upto the decision to introduce the loop guard
API: http://lists.llvm.org/pipermail/llvm-dev/2019-May/132607.html
Reviewer: reames, kbarton, hfinkel, jdoerfert, Meinersbur, dmgreen
Reviewed By: reames
Subscribers: wuzish, hiraditya, jsji, llvm-commits, bmahjour, etiotto
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D63885

llvm-svn: 367033
2019-07-25 16:13:18 +00:00
George Burgess IV
826c75b015 [BasicAA] Temporarily disable two tests
These tests are breaking three independent upstream buildbots (as well
downstream ones). These breakages have appeared mysteriously,
consistently, and during different revisions. Sadly, none of
{ASAN,TSAN,MSAN,UBSAN} flag anything, so the cause here is nonobvious.

Until we've figured this out, it seems best to disable these tests
entirely, so that the affected bots don't remain silent about any other,
unrelated failures.

Please see PR42719 for more information.

llvm-svn: 366986
2019-07-25 06:53:59 +00:00
Serguei Katkov
00cc875c77 [LoopInfo] Fix getUniqueNonLatchExitBlocks
It is possible that exit block has two predecessors and one of them is a latch
block while another is not.

Current algorithm is based on the assumption that all exits are dedicated
and therefore we can check only first predecessor of loop exit to find all unique
exits.

However if we do not consider latch block and it is first predecessor of some
exit then this exit will be found.

Regression test is added.

As a side effect of algorithm re-writing, the restriction that all exits are dedicated
is eliminated.

Reviewers: reames, fhahn, efriedma
Reviewed By: efriedma
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D64787

llvm-svn: 366294
2019-07-17 07:09:20 +00:00
Serguei Katkov
b0fdbb02d2 [LoopInfo] Introduce getUniqueNonLatchExitBlocks utility function
Extract the code from LoopUnrollRuntime into utility function to
re-use it in D63923.

Reviewers: reames, mkuper
Reviewed By: reames
Subscribers: fhahn, hiraditya, zzheng, dmgreen, llvm-commits
Differential Revision: https://reviews.llvm.org/D64548

llvm-svn: 366040
2019-07-15 05:51:10 +00:00
Vitaly Buka
efa8f57b0d isBytewiseValue checks ConstantVector element by element
Summary: Vector of the same value with few undefs will sill be considered "Bytewise"

Reviewers: eugenis, pcc, jfb

Reviewed By: jfb

Subscribers: dexonsmith, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64031

llvm-svn: 365971
2019-07-12 22:37:55 +00:00
Vitaly Buka
66c24fb668 Return Undef from isBytewiseValue for empty arrays or structs
Reviewers: pcc, eugenis

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64052

llvm-svn: 365864
2019-07-12 02:23:07 +00:00
Vitaly Buka
b875fd735d Handle IntToPtr in isBytewiseValue
Summary:
This helps with more efficient use of memset for pattern initialization

From @pcc prototype for -ftrivial-auto-var-init=pattern optimizations

Binary size change on CTMark, (with -fuse-ld=lld -Wl,--icf=all, similar results with default linker options)
```
                   master           patch      diff
Os           8.238864e+05    8.238864e+05       0.0
O3           1.054797e+06    1.054797e+06       0.0
Os zero      8.292384e+05    8.292384e+05       0.0
O3 zero      1.062626e+06    1.062626e+06       0.0
Os pattern   8.579712e+05    8.338048e+05 -0.030299
O3 pattern   1.090502e+06    1.067574e+06 -0.020481
```

Zero vs Pattern on master
```
               zero       pattern      diff
Os     8.292384e+05  8.579712e+05  0.036578
O3     1.062626e+06  1.090502e+06  0.025124
```

Zero vs Pattern with the patch
```
               zero       pattern      diff
Os     8.292384e+05  8.338048e+05  0.003333
O3     1.062626e+06  1.067574e+06  0.003193
```

Reviewers: pcc, eugenis

Subscribers: hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D63967

llvm-svn: 365858
2019-07-12 01:42:03 +00:00
Mikael Holmen
bca59d7f71 [test] Silence gcc 7.4 warning [NFC]
Without this gcc 7.4.0 complains with
 ../unittests/Analysis/ValueTrackingTest.cpp:937:66: error: ISO C++11 requires at least one argument for the "..." in a variadic macro [-Werror]
                          ::testing::ValuesIn(IsBytewiseValueTests));
                                                                   ^

llvm-svn: 365738
2019-07-11 07:07:23 +00:00
Vitaly Buka
1998d38e13 Add IsBytewiseValue unit test
Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63966

llvm-svn: 365710
2019-07-10 22:56:15 +00:00
Denis Bakhvalov
bdc34231f3 [SCEV] Fix for PR42397. SCEVExpander wrongly adds nsw to shl instruction.
Change-Id: I76c9f628c092ae3e6e78ebdaf55cec726e25d692
llvm-svn: 365363
2019-07-08 18:03:43 +00:00
Johannes Doerfert
6275dfc50e Use "willreturn" in isGuaranteedToTransferExecutionToSuccessor
The `willreturn` function attribute guarantees that a function call will
come back to the call site if the call is also known not to throw.
Therefore, this attribute can be used in
`isGuaranteedToTransferExecutionToSuccessor`.

Patch by Hideto Ueno (@uenoku)

Reviewers: jdoerfert, sstefan1

Reviewed By: jdoerfert

Subscribers: hiraditya, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63372

llvm-svn: 364580
2019-06-27 19:29:48 +00:00
Sanjay Patel
9db50a3254 [Analysis] add isSplatValue() for vectors in IR
We have the related getSplatValue() already in IR (see code just above the proposed addition).
But sometimes we only need to know that the value is a splat rather than capture the splatted
scalar value. Also, we have an isSplatValue() function already in SDAG.

Motivation - recent bugs that would potentially benefit from improved splat analysis in IR:
https://bugs.llvm.org/show_bug.cgi?id=37428
https://bugs.llvm.org/show_bug.cgi?id=42174

Differential Revision: https://reviews.llvm.org/D63138

llvm-svn: 363106
2019-06-11 22:25:18 +00:00
Sanjay Patel
7113ec52f0 [Analysis] add unit test file for VectorUtils; NFC
llvm-svn: 362972
2019-06-10 18:19:05 +00:00
Whitney Tsang
6bfb9aa78b [LOOPINFO] Extend Loop object to add utilities to get the loop bounds,
step, and loop induction variable.

Summary: This PR extends the loop object with more utilities to get loop
bounds, step, and loop induction variable. There already exists passes
which try to obtain the loop induction variable in their own pass, e.g.
loop interchange. It would be useful to have a common area to get these
information.

/// Example:
/// for (int i = lb; i < ub; i+=step)
///   <loop body>
/// --- pseudo LLVMIR ---
/// beforeloop:
///   guardcmp = (lb < ub)
///   if (guardcmp) goto preheader; else goto afterloop
/// preheader:
/// loop:
///   i1 = phi[{lb, preheader}, {i2, latch}]
///   <loop body>
///   i2 = i1 + step
/// latch:
///   cmp = (i2 < ub)
///   if (cmp) goto loop
/// exit:
/// afterloop:
///
/// getBounds
///   getInitialIVValue      --> lb
///   getStepInst            --> i2 = i1 + step
///   getStepValue           --> step
///   getFinalIVValue        --> ub
///   getCanonicalPredicate  --> '<'
///   getDirection           --> Increasing
/// getInductionVariable          --> i1
/// getAuxiliaryInductionVariable --> {i1}
/// isCanonical                   --> false

Reviewers: kbarton, hfinkel, dmgreen, Meinersbur, jdoerfert, syzaara,
fhahn
Reviewed By: kbarton
Subscribers: tvvikram, bmahjour, etiotto, fhahn, jsji, hiraditya,
llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D60565

llvm-svn: 362644
2019-06-05 20:42:47 +00:00
Whitney Tsang
3eaa8d09fc Revert "Title: [LOOPINFO] Extend Loop object to add utilities to get the loop"
This reverts commit d34797dfc26c61cea19f45669a13ea572172ba34.

llvm-svn: 362615
2019-06-05 15:32:56 +00:00
Whitney Tsang
e5741d7263 Title: [LOOPINFO] Extend Loop object to add utilities to get the loop
bounds, step, and loop induction variable.

Summary: This PR extends the loop object with more utilities to get loop
bounds, step, and loop induction variable. There already exists passes
which try to obtain the loop induction variable in their own pass, e.g.
loop interchange. It would be useful to have a common area to get these
information.

/// Example:
/// for (int i = lb; i < ub; i+=step)
///   <loop body>
/// --- pseudo LLVMIR ---
/// beforeloop:
///   guardcmp = (lb < ub)
///   if (guardcmp) goto preheader; else goto afterloop
/// preheader:
/// loop:
///   i1 = phi[{lb, preheader}, {i2, latch}]
///   <loop body>
///   i2 = i1 + step
/// latch:
///   cmp = (i2 < ub)
///   if (cmp) goto loop
/// exit:
/// afterloop:
///
/// getBounds
///   getInitialIVValue      --> lb
///   getStepInst            --> i2 = i1 + step
///   getStepValue           --> step
///   getFinalIVValue        --> ub
///   getCanonicalPredicate  --> '<'
///   getDirection           --> Increasing
/// getInductionVariable          --> i1
/// getAuxiliaryInductionVariable --> {i1}
/// isCanonical                   --> false

Reviewers: kbarton, hfinkel, dmgreen, Meinersbur, jdoerfert, syzaara,
fhahn
Reviewed By: kbarton
Subscribers: tvvikram, bmahjour, etiotto, fhahn, jsji, hiraditya,
llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D60565

llvm-svn: 362609
2019-06-05 14:34:12 +00:00
Artur Pilipenko
22e123c85a Add ScalarEvolutionsTest::SCEVExpandInsertCanonicalIV tests
Test insertion of canonical IV in canonical expansion mode.

llvm-svn: 362432
2019-06-03 18:26:45 +00:00
Erik Pilkington
97e1411c56 [SimplifyLibCalls] Fold more fortified functions into non-fortified variants
When the object size argument is -1, no checking can be done, so calling the
_chk variant is unnecessary. We already did this for a bunch of these
functions.

rdar://50797197

Differential revision: https://reviews.llvm.org/D62358

llvm-svn: 362272
2019-05-31 22:41:36 +00:00
Kit Barton
5d44ff623b Revert [LOOPINFO] Extend Loop object to add utilities to get the loop bounds, step, induction variable, and guard branch.
This reverts r361517 (git commit 2049e4dd8f61100f88f14db33bd95d197bcbfbbc)

llvm-svn: 361553
2019-05-23 20:53:05 +00:00
Kit Barton
c6f1beaa2c [LOOPINFO] Extend Loop object to add utilities to get the loop bounds, step, induction variable, and guard branch.
Summary:
    This PR extends the loop object with more utilities to get loop bounds, step, induction variable, and guard branch. There already exists passes which try to obtain the loop induction variable in their own pass, e.g. loop interchange. It would be useful to have a common area to get these information. Moreover, loop fusion (https://reviews.llvm.org/D55851) is planning to use getGuard() to extend the kind of loops it is able to fuse, e.g. rotated loop with non-constant upper bound, which would have a loop guard.

      /// Example:
      /// for (int i = lb; i < ub; i+=step)
      ///   <loop body>
      /// --- pseudo LLVMIR ---
      /// beforeloop:
      ///   guardcmp = (lb < ub)
      ///   if (guardcmp) goto preheader; else goto afterloop
      /// preheader:
      /// loop:
      ///   i1 = phi[{lb, preheader}, {i2, latch}]
      ///   <loop body>
      ///   i2 = i1 + step
      /// latch:
      ///   cmp = (i2 < ub)
      ///   if (cmp) goto loop
      /// exit:
      /// afterloop:
      ///
      /// getBounds
      ///   getInitialIVValue      --> lb
      ///   getStepInst            --> i2 = i1 + step
      ///   getStepValue           --> step
      ///   getFinalIVValue        --> ub
      ///   getCanonicalPredicate  --> '<'
      ///   getDirection           --> Increasing
      /// getGuard             --> if (guardcmp) goto loop; else goto afterloop
      /// getInductionVariable          --> i1
      /// getAuxiliaryInductionVariable --> {i1}
      /// isCanonical                   --> false

    Committed on behalf of @Whitney (Whitney Tsang).

    Reviewers: kbarton, hfinkel, dmgreen, Meinersbur, jdoerfert, syzaara, fhahn

    Reviewed By: kbarton

    Subscribers: tvvikram, bmahjour, etiotto, fhahn, jsji, hiraditya, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D60565

llvm-svn: 361517
2019-05-23 17:56:35 +00:00
Nick Desaulniers
fc4e6d1d1a [INLINER] allow inlining of blockaddresses if sole uses are callbrs
Summary:
It was supposed that Ref LazyCallGraph::Edge's were being inserted by
inlining, but that doesn't seem to be the case.  Instead, it seems that
there was no test for a blockaddress Constant in an instruction that
referenced the function that contained the instruction. Ex:

```
define void @f() {
  %1 = alloca i8*, align 8
2:
  store i8* blockaddress(@f, %2), i8** %1, align 8
  ret void
}
```

When iterating blockaddresses, do not add the function they refer to
back to the worklist if the blockaddress is referring to the contained
function (as opposed to an external function).

Because blockaddress has sligtly different semantics than GNU C's
address of labels, there are 3 cases that can occur with blockaddress,
where only 1 can happen in GNU C due to C's scoping rules:
* blockaddress is within the function it refers to (possible in GNU C).
* blockaddress is within a different function than the one it refers to
(not possible in GNU C).
* blockaddress is used in to declare a global (not possible in GNU C).

The second case is tested in:

```
$ ./llvm/build/unittests/Analysis/AnalysisTests \
  --gtest_filter=LazyCallGraphTest.HandleBlockAddress
```

This patch adjusts the iteration of blockaddresses in
LazyCallGraph::visitReferences to not revisit the blockaddresses
function in the first case.

The Linux kernel contains code that's not semantically valid at -O0;
specifically code passed to asm goto. It requires that asm goto be
inline-able. This patch conservatively does not attempt to handle the
more general case of inlining blockaddresses that have non-callbr users
(pr/39560).

https://bugs.llvm.org/show_bug.cgi?id=39560
https://bugs.llvm.org/show_bug.cgi?id=40722
https://github.com/ClangBuiltLinux/linux/issues/6
https://reviews.llvm.org/rL212077

Reviewers: jyknight, eli.friedman, chandlerc

Reviewed By: chandlerc

Subscribers: george.burgess.iv, nathanchance, mgorny, craig.topper, mengxu.gatech, void, mehdi_amini, E5ten, chandlerc, efriedma, eraman, hiraditya, haicheng, pirama, llvm-commits, srhines

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58260

llvm-svn: 361173
2019-05-20 16:48:09 +00:00
Kit Barton
3d8187d171 Save the induction binary operator in IVDescriptors for non FP induction variables.
Summary:
Currently InductionBinOps are only saved for FP induction variables, the PR extends it with non FP induction variable, so user of IVDescriptors can query the InductionBinOps for integer induction variables.

The changes in hasUnsafeAlgebra() and getUnsafeAlgebraInst() are required for the existing LIT test cases to pass. As described in the comment of the two functions, one of the requirement to return true is it is a FP induction variable. The checks was not needed because InductionBinOp was not set on non FP cases before.

https://reviews.llvm.org/D60565 depends on the patch.

Committed on behalf of @Whitney (Whitney Tsang).

Reviewers: jdoerfert, kbarton, fhahn, hfinkel, dmgreen, Meinersbur

Reviewed By: jdoerfert

Subscribers: mgorny, hiraditya, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61329

llvm-svn: 360671
2019-05-14 13:26:36 +00:00
Nick Lewycky
3deb5ec521 An unreachable block may have a route to a reachable block, don't fast-path return that it can't.
A block reachable from the entry block can't have any route to a block that's not reachable from the entry block (if it did, that route would make it reachable from the entry block). That is the intended performance optimization for isPotentiallyReachable. For the case where we ask whether an unreachable from entry block has a route to a reachable from entry block, we can't conclude one way or the other. Fix a bug where we claimed there could be no such route.

The fix in rL357425 ironically reintroduced the very bug it was fixing but only when a DominatorTree is provided. This fixes the remaining bug.

llvm-svn: 357734
2019-04-04 23:09:40 +00:00
Sam Clegg
19fa622579 Fix TargetLibraryInfoTest.ValidProto after rL357552
llvm-svn: 357559
2019-04-03 02:30:35 +00:00
Nick Lewycky
57763bfd31 Add an optional list of blocks to avoid when looking for a path in isPotentiallyReachable.
The leads to some ambiguous overloads, so update three callers.

Differential Revision: https://reviews.llvm.org/D60085

llvm-svn: 357447
2019-04-02 01:05:48 +00:00
Nick Lewycky
2509c81e51 Not all blocks are reachable from entry. Don't assume they are.
Fixes a bug in isPotentiallyReachable, noticed by inspection.

llvm-svn: 357425
2019-04-01 20:03:16 +00:00
Chandler Carruth
7cfefe1cbe [NewPM] Fix a nasty bug with analysis invalidation in the new PM.
The issue here is that we actually allow CGSCC passes to mutate IR (and
therefore invalidate analyses) outside of the current SCC. At a minimum,
we need to support mutating parent and ancestor SCCs to support the
ArgumentPromotion pass which rewrites all calls to a function.

However, the analysis invalidation infrastructure is heavily based
around not needing to invalidate the same IR-unit at multiple levels.
With Loop passes for example, they don't invalidate other Loops. So we
need to customize how we handle CGSCC invalidation. Doing this without
gratuitously re-running analyses is even harder. I've avoided most of
these by using an out-of-band preserved set to accumulate the cross-SCC
invalidation, but it still isn't perfect in the case of re-visiting the
same SCC repeatedly *but* it coming off the worklist. Unclear how
important this use case really is, but I wanted to call it out.

Another wrinkle is that in order for this to successfully propagate to
function analyses, we have to make sure we have a proxy from the SCC to
the Function level. That requires pre-creating the necessary proxy.

The motivating test case now works cleanly and is added for
ArgumentPromotion.

Thanks for the review from Philip and Wei!

Differential Revision: https://reviews.llvm.org/D59869

llvm-svn: 357137
2019-03-28 00:51:36 +00:00
Alina Sbirlea
3fc1696aca [AliasAnalysis] Second prototype to cache BasicAA / anyAA state.
Summary:
Adding contained caching to AliasAnalysis. BasicAA is currently the only one using it.

AA changes:
- This patch is pulling the caches from BasicAAResults to AAResults, meaning the getModRefInfo call benefits from the IsCapturedCache as well when in "batch mode".
- All AAResultBase implementations add the QueryInfo member to all APIs. AAResults APIs maintain wrapper APIs such that all alias()/getModRefInfo call sites are unchanged.
- AA now provides a BatchAAResults type as a wrapper to AAResults. It keeps the AAResults instance and a QueryInfo instantiated to batch mode. It delegates all work to the AAResults instance with the batched QueryInfo. More API wrappers may be needed in BatchAAResults; only the minimum needed is currently added.

MemorySSA changes:
- All walkers are now templated on the AA used (AliasAnalysis=AAResults or BatchAAResults).
- At build time, we optimize uses; now we create a local walker (lives only as long as OptimizeUses does) using BatchAAResults.
- All Walkers have an internal AA and only use that now, never the AA in MemorySSA. The Walkers receive the AA they will use when built.

- The walker we use for queries after the build is instantiated on AliasAnalysis and is built after building MemorySSA and setting AA.
- All static methods doing walking are now templated on AliasAnalysisType if they are used both during build and after. If used only during build, the method now only takes a BatchAAResults. If used only after build, the method now takes an AliasAnalysis.

Subscribers: sanjoy, arsenm, jvesely, nhaehnle, jlebar, george.burgess.iv, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59315

llvm-svn: 356783
2019-03-22 17:22:19 +00:00
Nikita Popov
8b6244e618 [ValueTracking] Known bits support for unsigned saturating add/sub
We have two sources of known bits:

1. For adds leading ones of either operand are preserved. For sub
leading zeros of LHS and leading ones of RHS become leading zeros in
the result.

2. The saturating math is a select between add/sub and an all-ones/
zero value. As such we can carry out the add/sub known bits
calculation, and only preseve the known one/zero bits respectively.

Differential Revision: https://reviews.llvm.org/D58329

llvm-svn: 355223
2019-03-01 20:07:04 +00:00
Alina Sbirlea
8601cb2c88 [MemorySSA] Make insertDef insert corresponding phi nodes.
Summary:
The original assumption for the insertDef method was that it would not
materialize Defs out of no-where, hence it will not insert phis needed
after inserting a Def.

However, when cloning an instruction (use case used in LICM), we do
materialize Defs "out of no-where". If the block receiving a Def has at
least one other Def, then no processing is needed. If the block just
received its first Def, we must check where Phi placement is needed.
The only new usage of insertDef is in LICM, hence the trigger for the bug.

But the original goal of the method also fails to apply for the move()
method. If we move a Def from the entry point of a diamond to either the
left or right blocks, then the merge block must add a phi.
While this usecase does not currently occur, or may be viewed as an
incorrect transformation, MSSA must behave corectly given the scenario.

Resolves PR40749 and PR40754.

Reviewers: george.burgess.iv

Subscribers: sanjoy, jlebar, Prazek, jdoerfert, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58652

llvm-svn: 355040
2019-02-27 22:20:22 +00:00
Chijun Sima
b98fe759b3 [DTU] Refine the interface and logic of applyUpdates
Summary:
This patch separates two semantics of `applyUpdates`:
1. User provides an accurate CFG diff and the dominator tree is updated according to the difference of `the number of edge insertions` and `the number of edge deletions` to infer the status of an edge before and after the update.
2. User provides a sequence of hints. Updates mentioned in this sequence might never happened and even duplicated.

Logic changes:

Previously, removing invalid updates is considered a side-effect of deduplication and is not guaranteed to be reliable. To handle the second semantic, `applyUpdates` does validity checking before deduplication, which can cause updates that have already been applied to be submitted again. Then, different calls to `applyUpdates` might cause unintended consequences, for example,
```
DTU(Lazy) and Edge A->B exists.
1. DTU.applyUpdates({{Delete, A, B}, {Insert, A, B}}) // User expects these 2 updates result in a no-op, but {Insert, A, B} is queued
2. Remove A->B
3. DTU.applyUpdates({{Delete, A, B}}) // DTU cancels this update with {Insert, A, B} mentioned above together (Unintended)
```
But by restricting the precondition that updates of an edge need to be strictly ordered as how CFG changes were made, we can infer the initial status of this edge to resolve this issue.

Interface changes:
The second semantic of `applyUpdates`  is separated to `applyUpdatesPermissive`.
These changes enable DTU(Lazy) to use the first semantic if needed, which is quite useful in `transforms/utils`.

Reviewers: kuhar, brzycki, dmgreen, grosser

Reviewed By: brzycki

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58170

llvm-svn: 354669
2019-02-22 13:48:38 +00:00
Chijun Sima
43cc9261be [DTU] Deprecate insertEdge*/deleteEdge*
Summary: This patch converts all existing `insertEdge*/deleteEdge*` to `applyUpdates` and marks `insertEdge*/deleteEdge*` as deprecated.

Reviewers: kuhar, brzycki

Reviewed By: kuhar, brzycki

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58443

llvm-svn: 354652
2019-02-22 05:41:43 +00:00
Richard Trieu
cc5fa2b650 Move DomTreeUpdater from IR to Analysis
DomTreeUpdater depends on headers from Analysis, but is in IR.  This is a
layering violation since Analysis depends on IR.  Relocate this code from IR
to Analysis to fix the layering violation.

llvm-svn: 353265
2019-02-06 02:52:52 +00:00
James Y Knight
d34c1cbe9e [opaque pointer types] Pass value type to GetElementPtr creation.
This cleans up all GetElementPtr creation in LLVM to explicitly pass a
value type rather than deriving it from the pointer's element-type.

Differential Revision: https://reviews.llvm.org/D57173

llvm-svn: 352913
2019-02-01 20:44:47 +00:00
James Y Knight
c8b30de05f [opaque pointer types] Pass value type to LoadInst creation.
This cleans up all LoadInst creation in LLVM to explicitly pass the
value type rather than deriving it from the pointer's element-type.

Differential Revision: https://reviews.llvm.org/D57172

llvm-svn: 352911
2019-02-01 20:44:24 +00:00
James Y Knight
846be29e5e [opaque pointer types] Add a FunctionCallee wrapper type, and use it.
Recommit r352791 after tweaking DerivedTypes.h slightly, so that gcc
doesn't choke on it, hopefully.

Original Message:
The FunctionCallee type is effectively a {FunctionType*,Value*} pair,
and is a useful convenience to enable code to continue passing the
result of getOrInsertFunction() through to EmitCall, even once pointer
types lose their pointee-type.

Then:
- update the CallInst/InvokeInst instruction creation functions to
  take a Callee,
- modify getOrInsertFunction to return FunctionCallee, and
- update all callers appropriately.

One area of particular note is the change to the sanitizer
code. Previously, they had been casting the result of
`getOrInsertFunction` to a `Function*` via
`checkSanitizerInterfaceFunction`, and storing that. That would report
an error if someone had already inserted a function declaraction with
a mismatching signature.

However, in general, LLVM allows for such mismatches, as
`getOrInsertFunction` will automatically insert a bitcast if
needed. As part of this cleanup, cause the sanitizer code to do the
same. (It will call its functions using the expected signature,
however they may have been declared.)

Finally, in a small number of locations, callers of
`getOrInsertFunction` actually were expecting/requiring that a brand
new function was being created. In such cases, I've switched them to
Function::Create instead.

Differential Revision: https://reviews.llvm.org/D57315

llvm-svn: 352827
2019-02-01 02:28:03 +00:00
James Y Knight
06da6dcca4 Revert "[opaque pointer types] Add a FunctionCallee wrapper type, and use it."
This reverts commit f47d6b38c7a61d50db4566b02719de05492dcef1 (r352791).

Seems to run into compilation failures with GCC (but not clang, where
I tested it). Reverting while I investigate.

llvm-svn: 352800
2019-01-31 21:51:58 +00:00
James Y Knight
fa51e33345 [opaque pointer types] Add a FunctionCallee wrapper type, and use it.
The FunctionCallee type is effectively a {FunctionType*,Value*} pair,
and is a useful convenience to enable code to continue passing the
result of getOrInsertFunction() through to EmitCall, even once pointer
types lose their pointee-type.

Then:
- update the CallInst/InvokeInst instruction creation functions to
  take a Callee,
- modify getOrInsertFunction to return FunctionCallee, and
- update all callers appropriately.

One area of particular note is the change to the sanitizer
code. Previously, they had been casting the result of
`getOrInsertFunction` to a `Function*` via
`checkSanitizerInterfaceFunction`, and storing that. That would report
an error if someone had already inserted a function declaraction with
a mismatching signature.

However, in general, LLVM allows for such mismatches, as
`getOrInsertFunction` will automatically insert a bitcast if
needed. As part of this cleanup, cause the sanitizer code to do the
same. (It will call its functions using the expected signature,
however they may have been declared.)

Finally, in a small number of locations, callers of
`getOrInsertFunction` actually were expecting/requiring that a brand
new function was being created. In such cases, I've switched them to
Function::Create instead.

Differential Revision: https://reviews.llvm.org/D57315

llvm-svn: 352791
2019-01-31 20:35:56 +00:00
Max Kazantsev
8a48aae360 [NFC] Fix warnings in unit test of r351725
llvm-svn: 351726
2019-01-21 07:27:47 +00:00
Max Kazantsev
f0c38d90c7 [SCEV][NFC] Introduces expression sizes estimation
This patch introduces the field `ExpressionSize` in SCEV. This field is
calculated only once on SCEV creation, and it represents the complexity of
this SCEV from arithmetical point of view (not from the point of the number
of actual different SCEV nodes that are used in the expression). Roughly
saying, it is the number of operands and operations symbols when we print this
SCEV.

A formal definition is following: if SCEV `X` has operands
  `Op1`, `Op2`, ..., `OpN`,
then
  Size(X) = 1 + Size(Op1) + Size(Op2) + ... + Size(OpN).
Size of SCEVConstant and SCEVUnknown is one.

Expression size may be used as a universal way to limit SCEV transformations
for huge SCEVs. Currently, we have a bunch of options that represents various
limits (such as recursion depth limit) that may not make any sense from the
point of view of a LLVM users who is not familiar with SCEV internals, and all
these different options pursue one goal. A more general rule that may
potentially allow us to get rid of this redundancy in options is "do not make
transformations with SCEVs of huge size". It can apply to all SCEV traversals
and transformations that may need to visit a SCEV node more than once, hence
they are prone to combinatorial explosions.

This patch only introduces SCEV sizes calculation as NFC, its utilization will
be introduced in follow-up patches.

Differential Revision: https://reviews.llvm.org/D35989
Reviewed By: reames

llvm-svn: 351725
2019-01-21 06:19:50 +00:00
Serge Guelton
b20ef5f960 Replace llvm::isPodLike<...> by llvm::is_trivially_copyable<...>
As noted in https://bugs.llvm.org/show_bug.cgi?id=36651, the specialization for
isPodLike<std::pair<...>> did not match the expectation of
std::is_trivially_copyable which makes the memcpy optimization invalid.

This patch renames the llvm::isPodLike trait into llvm::is_trivially_copyable.
Unfortunately std::is_trivially_copyable is not portable across compiler / STL
versions. So a portable version is provided too.

Note that the following specialization were invalid:

    std::pair<T0, T1>
    llvm::Optional<T>

Tests have been added to assert that former specialization are respected by the
standard usage of llvm::is_trivially_copyable, and that when a decent version
of std::is_trivially_copyable is available, llvm::is_trivially_copyable is
compared to std::is_trivially_copyable.

As of this patch, llvm::Optional is no longer considered trivially copyable,
even if T is. This is to be fixed in a later patch, as it has impact on a
long-running bug (see r347004)

Note that GCC warns about this UB, but this got silented by https://reviews.llvm.org/D50296.

Differential Revision: https://reviews.llvm.org/D54472

llvm-svn: 351701
2019-01-20 21:19:56 +00:00
Chandler Carruth
ae65e281f3 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636
2019-01-19 08:50:56 +00:00
Artur Pilipenko
76497de59d [CaptureTracking] Add a unit test for MaxUsesToExplore
llvm-svn: 350351
2019-01-03 20:16:33 +00:00
Nikita Popov
bb7c898cb7 [ValueTracking] Support funnel shifts in computeKnownBits()
If the shift amount is known, we can determine the known bits of the
output based on the known bits of two inputs.

This is essentially the same functionality as implemented in D54869,
but for ValueTracking rather than InstCombine SimplifyDemandedBits.

Differential Revision: https://reviews.llvm.org/D55140

llvm-svn: 348091
2018-12-02 14:14:11 +00:00
Nikita Popov
3a2addb4a0 [ValueTracking] Make unit tests easier to write; NFC
Generalize the existing MatchSelectPatternTest class to also work
with other types of tests. This reduces the amount of boilerplate
necessary to write ValueTracking tests in general, and computeKnownBits
tests in particular.

The inherited convention is that the function must be @test and the
tested instruction %A.

Differential Revision: https://reviews.llvm.org/D55141

llvm-svn: 348043
2018-11-30 22:22:30 +00:00
Vedant Kumar
bbc3a67102 [ProfileSummary] Standardize methods and fix comment
Every Analysis pass has a get method that returns a reference of the Result of
the Analysis, for example, BlockFrequencyInfo
&BlockFrequencyInfoWrapperPass::getBFI().  I believe that
ProfileSummaryInfo::getPSI() is the only exception to that, as it was returning
a pointer.

Another change is renaming isHotBB and isColdBB to isHotBlock and isColdBlock,
respectively.  Most methods use BB as the argument of variable names while
methods usually refer to Basic Blocks as Blocks, instead of BB.  For example,
Function::getEntryBlock, Loop:getExitBlock, etc.

I also fixed one of the comments.

Patch by Rodrigo Caetano Rocha!

Differential Revision: https://reviews.llvm.org/D54669

llvm-svn: 347182
2018-11-19 05:23:16 +00:00
Max Kazantsev
3032963e0b [SCEV][NFC] Verify IR in isLoop[Entry,Backedge]GuardedByCond
We have a lot of various bugs that are caused by misuse of SCEV (in particular in LV),
all of them can simply be described as "we ask SCEV to prove some fact on invalid IR".
Some of examples of those are PR36311, PR37221, PR39160.

The problem is that these failues manifest differently (what we saw was failure of various
asserts across SCEV, but there can also be miscompiles). This patch adds an assert into two
SCEV methods that strongly rely on correctness of the IR and are involved in known failues.
This will at least allow us to have a clear indication of what was wrong in this case.

This patch also fixes a unit test with incorrect IR that fails this verification.

Differential Revision: https://reviews.llvm.org/D52930
Reviewed By: fhahn

llvm-svn: 346389
2018-11-08 05:07:58 +00:00
Calixte Denizet
fe50815f32 Fix unit tests after patch https://reviews.llvm.org/rL346313
Summary: Tests are broken so fix them.

Reviewers: marco-c

Reviewed By: marco-c

Subscribers: sylvestre.ledru, llvm-commits

Differential Revision: https://reviews.llvm.org/D54208

llvm-svn: 346318
2018-11-07 14:46:26 +00:00
Sanjay Patel
7a0d37b0c0 [ValueTracking] determine sign of 0.0 from select when matching min/max FP
In PR39475:
https://bugs.llvm.org/show_bug.cgi?id=39475
..we may fail to recognize/simplify fabs() in some cases because we do not 
canonicalize fcmp with a -0.0 operand.

Adding that canonicalization can cause regressions on min/max FP tests, so 
that's this patch: for the purpose of determining whether something is min/max, 
let the value returned by the select determine how we treat a 0.0 operand in the fcmp.

This patch doesn't actually change the -0.0 to +0.0. It just changes the analysis, so 
we don't fail to recognize equivalent min/max patterns that only differ in the 
signbit of 0.0.

Differential Revision: https://reviews.llvm.org/D54001

llvm-svn: 346097
2018-11-04 14:28:48 +00:00
Sanjay Patel
72bbb14e2a [ValueTracking] peek through 2-input shuffles in ComputeNumSignBits
This patch gives the IR ComputeNumSignBits the same functionality as the 
DAG version (the code is derived from the existing code).

This an extension of the single input shuffle analysis added with D53659.

Differential Revision: https://reviews.llvm.org/D53987

llvm-svn: 346071
2018-11-03 13:18:55 +00:00
Sanjay Patel
8b1b1e2098 [ValueTracking] add test for non-canonical shuffle; NFC
llvm-svn: 346025
2018-11-02 18:14:24 +00:00
Sanjay Patel
5a1156b72d [ValueTracking] allow non-canonical shuffles when computing signbits
This possibility is noted in D53987 for a different case,
so we need to adjust the existing code.

llvm-svn: 345988
2018-11-02 15:51:47 +00:00
Alina Sbirlea
eb2015753b [AliasSetTracker] Misc cleanup (NFCI)
Summary: Remove two redundant checks, add one in the unit test. Remove an unused method. Fix computation of TotalMayAliasSetSize.
llvm-svn: 345911
2018-11-01 23:37:51 +00:00
Sanjay Patel
f773950be2 [ValueTracking] add tests for fmin/fmax; NFC
llvm-svn: 345777
2018-10-31 21:11:59 +00:00
Thomas Lively
e649dc137e [NFC] Rename minnan and maxnan to minimum and maximum
Summary:
Changes all uses of minnan/maxnan to minimum/maximum
globally. These names emphasize that the semantic difference between
these operations is more than just NaN-propagation.

Reviewers: arsenm, aheejin, dschuff, javed.absar

Subscribers: jholewinski, sdardis, wdng, sbc100, jgravelle-google, jrtc27, atanasyan, llvm-commits

Differential Revision: https://reviews.llvm.org/D53112

llvm-svn: 345218
2018-10-24 22:49:55 +00:00
Nicolai Haehnle
236a317c44 DivergenceAnalysisTest: fix use of uninitialized memory
Thanks to Simon Moll for chasing it down.

Change-Id: If188f07c4aaec217f40a7a2ca029818f9202f1cb
llvm-svn: 344738
2018-10-18 12:54:39 +00:00
Nicolai Haehnle
0888142443 [DA] DivergenceAnalysis for unstructured, reducible CFGs
Summary:
This is patch 2 of the new DivergenceAnalysis (https://reviews.llvm.org/D50433).

This patch contains a generic divergence analysis implementation for
unstructured, reducible Control-Flow Graphs. It contains two new classes.
The `SyncDependenceAnalysis` class lazily computes sync dependences, which
relate divergent branches to points of joining divergent control. The
`DivergenceAnalysis` class contains the generic divergence analysis
implementation.

Reviewers: nhaehnle

Reviewed By: nhaehnle

Subscribers: sameerds, kristina, nhaehnle, xbolva00, tschuett, mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D51491

llvm-svn: 344734
2018-10-18 09:38:44 +00:00
George Burgess IV
37a10a0055 Replace most users of UnknownSize with LocationSize::unknown(); NFC
Moving away from UnknownSize is part of the effort to migrate us to
LocationSizes (e.g. the cleanup promised in D44748).

This doesn't entirely remove all of the uses of UnknownSize; some uses
require tweaks to assume that UnknownSize isn't just some kind of int.
This patch is intended to just be a trivial replacement for all places
where LocationSize::unknown() will Just Work.

llvm-svn: 344186
2018-10-10 21:28:44 +00:00
George Burgess IV
e908528176 [Analysis] Make LocationSizes carry an 'imprecise' bit
There are places where we need to merge multiple LocationSizes of
different sizes into one, and get a sensible result.

There are other places where we want to optimize aggressively based on
the value of a LocationSizes (e.g. how can a store of four bytes be to
an area of storage that's only two bytes large?)

This patch makes LocationSize hold an 'imprecise' bit to note whether
the LocationSize can be treated as an upper-bound and lower-bound for
the size of a location, or just an upper-bound.

This concludes the series of patches leading up to this. The most recent
of which is r344108.

Fixes PR36228.

Differential Revision: https://reviews.llvm.org/D44748

llvm-svn: 344114
2018-10-10 06:39:40 +00:00
Thomas Lively
4754140dfc [ValueTracking] Allow select patterns to work on FP vectors
Summary:
This CL allows constant vectors of floats to be recognized as non-NaN
and non-zero in select patterns. This change makes
`matchSelectPattern` more powerful generally, but was motivated
specifically because I wanted fminnan and fmaxnan to be created for
vector versions of the scalar patterns they are created for.

Tested with check-all on all targets. A testcase in the WebAssembly
backend that tests the non-nan codepath is in an upcoming CL.

Reviewers: aheejin, dschuff

Subscribers: sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D52324

llvm-svn: 343364
2018-09-28 21:36:43 +00:00
Fangrui Song
c2791239be llvm::sort(C.begin(), C.end(), ...) -> llvm::sort(C, ...)
Summary: The convenience wrapper in STLExtras is available since rL342102.

Reviewers: dblaikie, javed.absar, JDevlieghere, andreadb

Subscribers: MatzeB, sanjoy, arsenm, dschuff, mehdi_amini, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, javed.absar, gbedwell, jrtc27, mgrang, atanasyan, steven_wu, george.burgess.iv, dexonsmith, kristina, jsji, llvm-commits

Differential Revision: https://reviews.llvm.org/D52573

llvm-svn: 343163
2018-09-27 02:13:45 +00:00
Fedor Sergeev
406a5f5b60 [New PM] Introducing PassInstrumentation framework
Pass Execution Instrumentation interface enables customizable instrumentation
of pass execution, as per "RFC: Pass Execution Instrumentation interface"
posted 06/07/2018 on llvm-dev@

The intent is to provide a common machinery to implement all
the pass-execution-debugging features like print-before/after,
opt-bisect, time-passes etc.

Here we get a basic implementation consisting of:
* PassInstrumentationCallbacks class that handles registration of callbacks
  and access to them.

* PassInstrumentation class that handles instrumentation-point interfaces
  that call into PassInstrumentationCallbacks.

* Callbacks accept StringRef which is just a name of the Pass right now.
  There were some ideas to pass an opaque wrapper for the pointer to pass instance,
  however it appears that pointer does not actually identify the instance
  (adaptors and managers might have the same address with the pass they govern).
  Hence it was decided to go simple for now and then later decide on what the proper
  mental model of identifying a "pass in a phase of pipeline" is.

* Callbacks accept llvm::Any serving as a wrapper for const IRUnit*, to remove direct dependencies
  on different IRUnits (e.g. Analyses).

* PassInstrumentationAnalysis analysis is explicitly requested from PassManager through
  usual AnalysisManager::getResult. All pass managers were updated to run that
  to get PassInstrumentation object for instrumentation calls.

* Using tuples/index_sequence getAnalysisResult helper to extract generic AnalysisManager's extra
  args out of a generic PassManager's extra args. This is the only way I was able to explicitly
  run getResult for PassInstrumentationAnalysis out of a generic code like PassManager::run or
  RepeatedPass::run.
  TODO: Upon lengthy discussions we agreed to accept this as an initial implementation
  and then get rid of getAnalysisResult by improving RepeatedPass implementation.

* PassBuilder takes PassInstrumentationCallbacks object to pass it further into
  PassInstrumentationAnalysis. Callbacks registration should be performed directly
  through PassInstrumentationCallbacks.

* new-pm tests updated to account for PassInstrumentationAnalysis being run

* Added PassInstrumentation tests to PassBuilderCallbacks unit tests.
  Other unit tests updated with registration of the now-required PassInstrumentationAnalysis.

  Made getName helper to return std::string (instead of StringRef initially) to fix
  asan builtbot failures on CGSCC tests.

Reviewers: chandlerc, philip.pfaffe
Differential Revision: https://reviews.llvm.org/D47858

llvm-svn: 342664
2018-09-20 17:08:45 +00:00
Eric Christopher
7b8d4f4b37 Temporarily Revert "[New PM] Introducing PassInstrumentation framework"
as it was causing failures in the asan buildbot.

This reverts commit r342597.

llvm-svn: 342616
2018-09-20 05:16:29 +00:00
Fedor Sergeev
817b214094 [New PM] Introducing PassInstrumentation framework
Pass Execution Instrumentation interface enables customizable instrumentation
of pass execution, as per "RFC: Pass Execution Instrumentation interface"
posted 06/07/2018 on llvm-dev@

The intent is to provide a common machinery to implement all
the pass-execution-debugging features like print-before/after,
opt-bisect, time-passes etc.

Here we get a basic implementation consisting of:
* PassInstrumentationCallbacks class that handles registration of callbacks
  and access to them.

* PassInstrumentation class that handles instrumentation-point interfaces
  that call into PassInstrumentationCallbacks.

* Callbacks accept StringRef which is just a name of the Pass right now.
  There were some ideas to pass an opaque wrapper for the pointer to pass instance,
  however it appears that pointer does not actually identify the instance
  (adaptors and managers might have the same address with the pass they govern).
  Hence it was decided to go simple for now and then later decide on what the proper
  mental model of identifying a "pass in a phase of pipeline" is.

* Callbacks accept llvm::Any serving as a wrapper for const IRUnit*, to remove direct dependencies
  on different IRUnits (e.g. Analyses).

* PassInstrumentationAnalysis analysis is explicitly requested from PassManager through
  usual AnalysisManager::getResult. All pass managers were updated to run that
  to get PassInstrumentation object for instrumentation calls.

* Using tuples/index_sequence getAnalysisResult helper to extract generic AnalysisManager's extra
  args out of a generic PassManager's extra args. This is the only way I was able to explicitly
  run getResult for PassInstrumentationAnalysis out of a generic code like PassManager::run or
  RepeatedPass::run.
  TODO: Upon lengthy discussions we agreed to accept this as an initial implementation
  and then get rid of getAnalysisResult by improving RepeatedPass implementation.

* PassBuilder takes PassInstrumentationCallbacks object to pass it further into
  PassInstrumentationAnalysis. Callbacks registration should be performed directly
  through PassInstrumentationCallbacks.

* new-pm tests updated to account for PassInstrumentationAnalysis being run

* Added PassInstrumentation tests to PassBuilderCallbacks unit tests.
  Other unit tests updated with registration of the now-required PassInstrumentationAnalysis.

Reviewers: chandlerc, philip.pfaffe
Differential Revision: https://reviews.llvm.org/D47858

llvm-svn: 342597
2018-09-19 22:42:57 +00:00
Fedor Sergeev
a39d896f9f Revert rL342544: [New PM] Introducing PassInstrumentation framework
A bunch of bots fail to compile unittests. Reverting.

llvm-svn: 342552
2018-09-19 14:54:48 +00:00
Fedor Sergeev
ee5fd89f4a [New PM] Introducing PassInstrumentation framework
Summary:
Pass Execution Instrumentation interface enables customizable instrumentation
of pass execution, as per "RFC: Pass Execution Instrumentation interface"
posted 06/07/2018 on llvm-dev@

The intent is to provide a common machinery to implement all
the pass-execution-debugging features like print-before/after,
opt-bisect, time-passes etc.

Here we get a basic implementation consisting of:
* PassInstrumentationCallbacks class that handles registration of callbacks
  and access to them.

* PassInstrumentation class that handles instrumentation-point interfaces
  that call into PassInstrumentationCallbacks.

* Callbacks accept StringRef which is just a name of the Pass right now.
  There were some ideas to pass an opaque wrapper for the pointer to pass instance,
  however it appears that pointer does not actually identify the instance
  (adaptors and managers might have the same address with the pass they govern).
  Hence it was decided to go simple for now and then later decide on what the proper
  mental model of identifying a "pass in a phase of pipeline" is.

* Callbacks accept llvm::Any serving as a wrapper for const IRUnit*, to remove direct dependencies
  on different IRUnits (e.g. Analyses).

* PassInstrumentationAnalysis analysis is explicitly requested from PassManager through
  usual AnalysisManager::getResult. All pass managers were updated to run that
  to get PassInstrumentation object for instrumentation calls.

* Using tuples/index_sequence getAnalysisResult helper to extract generic AnalysisManager's extra
  args out of a generic PassManager's extra args. This is the only way I was able to explicitly
  run getResult for PassInstrumentationAnalysis out of a generic code like PassManager::run or
  RepeatedPass::run.
  TODO: Upon lengthy discussions we agreed to accept this as an initial implementation
  and then get rid of getAnalysisResult by improving RepeatedPass implementation.

* PassBuilder takes PassInstrumentationCallbacks object to pass it further into
  PassInstrumentationAnalysis. Callbacks registration should be performed directly
  through PassInstrumentationCallbacks.

* new-pm tests updated to account for PassInstrumentationAnalysis being run

* Added PassInstrumentation tests to PassBuilderCallbacks unit tests.
  Other unit tests updated with registration of the now-required PassInstrumentationAnalysis.

Reviewers: chandlerc, philip.pfaffe
Differential Revision: https://reviews.llvm.org/D47858

llvm-svn: 342544
2018-09-19 12:25:52 +00:00
Johannes Doerfert
ac18bae23b [LoopInfo][FIX] Remove leftover dump in unit test
llvm-svn: 341968
2018-09-11 17:49:43 +00:00
Johannes Doerfert
218cfa8be2 [LoopInfo] Fix Loop::getLoopID() for loops with multiple latches
The previous implementation traversed all loop blocks and bailed if one
was not a latch block. Since we are only interested in latch blocks, we
should only traverse those.

llvm-svn: 341926
2018-09-11 11:44:17 +00:00
Alina Sbirlea
8e13e528fa API to update MemorySSA for cloned blocks and added CFG edges.
Summary:
End goal is to update MemorySSA in all loop passes. LoopUnswitch clones all blocks in a loop. SimpleLoopUnswitch clones some blocks. LoopRotate clones some instructions.
Some of these loop passes also make CFG changes.
This is an API based on what I found needed in LoopUnswitch, SimpleLoopUnswitch, LoopRotate, LoopInstSimplify, LoopSimplifyCFG.
Adding dependent patches using this API for context.

Reviewers: george.burgess.iv, dberlin

Subscribers: sanjoy, jlebar, Prazek, llvm-commits

Differential Revision: https://reviews.llvm.org/D45299

llvm-svn: 341855
2018-09-10 20:13:01 +00:00
Nico Weber
ec4bd508f5 Rename a few unittests/.../Foo.cpp files to FooTest.cpp
The convention for unit test sources is that they're called FooTest.cpp.

No behavior change.
https://reviews.llvm.org/D51579

llvm-svn: 341313
2018-09-03 12:43:26 +00:00
Max Kazantsev
12bc747451 [NFC] Move OrderedInstructions and InstructionPrecedenceTracking to Analysis
These classes don't make any changes to IR and have no reason to be in
Transform/Utils. This patch moves them to Analysis folder. This will allow
us reusing these classes in some analyzes, like MustExecute.

llvm-svn: 341015
2018-08-30 04:49:03 +00:00
George Burgess IV
510f86ea5c [MemorySSA] Fix def optimization handling
In order for more complex updates of MSSA to happen (e.g. those in
D45299), MemoryDefs need to be actual `Use`s of what they're optimized
to. This patch makes that happen.

In addition, this patch changes our optimization behavior for Defs
slightly: we'll now consider a Def optimization invalid if the
MemoryAccess it's optimized to changes. That we weren't doing this
before was a bug, but given that we were tracking these with a WeakVH
before, it was sort of difficult for that to matter.

We're already have both of these behaviors for MemoryUses. The
difference is that a MemoryUse's defining access is always its optimized
access, and defining accesses are always `Use`s (in the LLVM sense).

Nothing exploded when testing a stage3 clang+llvm locally, so...

This also includes the test-case promised in r340461.

llvm-svn: 340577
2018-08-23 21:29:11 +00:00
Easwaran Raman
89a18d93a9 [BFI] Use rounding while computing profile counts.
Summary:
Profile count of a block is computed by multiplying its block frequency
by entry count and dividing the result by entry block frequency. Do
rounded division in the last step and update test cases appropriately.

Reviewers: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D50822

llvm-svn: 339835
2018-08-16 00:26:59 +00:00
George Burgess IV
185ad9b7c5 [MemorySSA] "Fix" lifetime intrinsic handling
MemorySSA currently creates MemoryAccesses for lifetime intrinsics, and
sometimes treats them as clobbers. This may/may not be the best way
forward, but while we're doing it, we should consider
MayAlias/PartialAlias to be clobbers.

The ideal fix here is probably to remove all of this reasoning about
lifetimes from MemorySSA + put it into the passes that need to care. But
that's a wayyy broader fix that needs some consensus, and we have
miscompiles + a release branch today, and this should solve the
miscompiles just as well.

differential revision is D43269. Landing without an explicit LGTM (and
without using the special please-autoclose-this syntax) so we can still
use that revision as a place to decide what the right fix here is.

llvm-svn: 339411
2018-08-10 05:14:43 +00:00
Manoj Gupta
647946fa14 llvm: Add support for "-fno-delete-null-pointer-checks"
Summary:
Support for this option is needed for building Linux kernel.
This is a very frequently requested feature by kernel developers.

More details : https://lkml.org/lkml/2018/4/4/601

GCC option description for -fdelete-null-pointer-checks:
This Assume that programs cannot safely dereference null pointers,
and that no code or data element resides at address zero.

-fno-delete-null-pointer-checks is the inverse of this implying that
null pointer dereferencing is not undefined.

This feature is implemented in LLVM IR in this CL as the function attribute
"null-pointer-is-valid"="true" in IR (Under review at D47894).
The CL updates several passes that assumed null pointer dereferencing is
undefined to not optimize when the "null-pointer-is-valid"="true"
attribute is present.

Reviewers: t.p.northover, efriedma, jyknight, chandlerc, rnk, srhines, void, george.burgess.iv

Reviewed By: efriedma, george.burgess.iv

Subscribers: eraman, haicheng, george.burgess.iv, drinkcat, theraven, reames, sanjoy, xbolva00, llvm-commits

Differential Revision: https://reviews.llvm.org/D47895

llvm-svn: 336613
2018-07-09 22:27:23 +00:00
John Brawn
deeee518d4 [PhiValues] Adjust unit test to invalidate instructions before deleting them
This should fix a sanitizer buildbot failure.

llvm-svn: 335862
2018-06-28 15:17:07 +00:00
John Brawn
365ebcbf1a Add a PhiValuesAnalysis pass to calculate the underlying values of phis
This pass is being added in order to make the information available to BasicAA,
which can't do caching of this information itself, but possibly this information
may be useful for other passes.

Incorporates code based on Daniel Berlin's implementation of Tarjan's algorithm.

Differential Revision: https://reviews.llvm.org/D47893

llvm-svn: 335857
2018-06-28 14:13:06 +00:00
Florian Hahn
6b8383829f [ValueLattice] Return false if value range did not change in mergeIn.
llvm-svn: 335729
2018-06-27 12:57:51 +00:00
David Bolvansky
24dda2d3d2 [SimplifyLibcalls] Replace locked IO with unlocked IO
Summary: If file stream arg is not captured and source is fopen, we could replace IO calls by unlocked IO ("_unlocked" function variants) to gain better speed,

Reviewers: efriedma, RKSimon, spatel, sanjoy, hfinkel, majnemer, lebedev.ri, rja

Reviewed By: rja

Subscribers: rja, srhines, efriedma, lebedev.ri, llvm-commits

Differential Revision: https://reviews.llvm.org/D45736

llvm-svn: 332452
2018-05-16 11:39:52 +00:00
Mandeep Singh Grang
0068f19cfd [unittests] Change std::sort to llvm::sort in response to r327219
r327219 added wrappers to std::sort which randomly shuffle the container before
sorting.  This will help in uncovering non-determinism caused due to undefined
sorting order of objects having the same key.

To make use of that infrastructure we need to invoke llvm::sort instead of
std::sort.

Note: This patch is one of a series of patches to replace *all* std::sort to
llvm::sort.  Refer the comments section in D44363 for a list of all the
required patches.

llvm-svn: 329475
2018-04-07 01:29:45 +00:00
Eric Fiselier
bf060d4e97 [Analysis] Support aligned new/delete functions.
Summary:
Clang's __builtin_operator_new/delete was recently taught about the aligned allocation overloads (r328134). This patch makes LLVM aware of them as well.
This allows the compiler to perform certain optimizations including eliding new/delete calls.

Reviewers: rsmith, majnemer, dblaikie, vsk, bkramer

Reviewed By: bkramer

Subscribers: ckennelly, llvm-commits

Differential Revision: https://reviews.llvm.org/D44769

llvm-svn: 329218
2018-04-04 19:01:51 +00:00
Alina Sbirlea
0d82331cd7 Expose must/may alias info in MemorySSA.
Summary:
Building MemorySSA gathers alias information for Defs/Uses.
Store and expose this information when optimizing uses (when building MemorySSA),
and when optimizing defs or updating uses (getClobberingMemoryAccess).
Current patch does not propagate alias information through MemoryPhis.

Reviewers: gbiv, dberlin

Subscribers: Prazek, sanjoy, llvm-commits

Differential Revision: https://reviews.llvm.org/D38569

llvm-svn: 327035
2018-03-08 18:03:14 +00:00
Florian Hahn
2659b307d2 [IPSCCP] Add getCompare which returns either true, false, undef or null.
getCompare returns true, false or undef constants if the comparison can
be evaluated, or nullptr if it cannot. This is in line with what
ConstantExpr::getCompare returns. It also allows us to use
ConstantExpr::getCompare for comparing constants.

Reviewers: davide, mssimpso, dberlin, anna

Reviewed By: davide

Differential Revision: https://reviews.llvm.org/D43761

llvm-svn: 326720
2018-03-05 17:33:50 +00:00
George Burgess IV
e3a72f8453 [MemorySSA] Invalidate def caches on deletion
The only cases I can come up with where this invalidation needs to
happen is when there's a deletion somewhere. If we find more creative
test-cases, we can probably go with another approach mentioned on
PR36529.

Fixes PR36529.

llvm-svn: 326177
2018-02-27 07:20:49 +00:00
George Burgess IV
d07692f11b [MemorySSA] Fix a cache invalidation bug with removed accesses
I suspect there's a deeper issue here, but we probably shouldn't be
using INVALID_MEMORYSSA_ID as liveOnEntry's ID anyway.

llvm-svn: 325971
2018-02-23 23:07:18 +00:00
Serguei Katkov
d58a989008 [SCEV] Do not cache S -> V if S is not equivalent of V
SCEV tracks the correspondence of created SCEV to original instruction.
However during creation of SCEV it is possible that nuw/nsw/exact flags are
lost.

As a result during expansion of the SCEV the instruction with nuw/nsw/exact
will be used where it was expected and we produce poison incorreclty.

Reviewers: sanjoy, mkazantsev, sebpop, jbhateja
Reviewed By: sanjoy
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D41578

llvm-svn: 322058
2018-01-09 06:47:14 +00:00
Serguei Katkov
0e56998fde [SCEV] Be careful with nuw/nsw/exact in InsertBinop
InsertBinop tries to find an appropriate instruction instead of
creating a new instruction. When it checks whether instruction is
the same as we need to create it ignores nuw/nsw/exact flags.

It leads to invalid behavior when poison instruction can be used
when it was not expected. Specifically, for example Expander
expands the SCEV built for instruction
%a = add i32 %v, 1
It is possible that InsertBinop can find an instruction
% b = add nuw nsw i32 %v, 1
and will use it instead of version w/o nuw nsw.
It is incorrect.

The patch conservatively ignores all instructions with any of
poison flags installed.

Reviewers: sanjoy, mkazantsev, sebpop, jbhateja
Reviewed By: sanjoy
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D41576

llvm-svn: 321475
2017-12-27 08:26:22 +00:00
Hal Finkel
828901caa8 [SimplifyLibCalls] Inline calls to cabs when it's safe to do so
When unsafe algerbra is allowed calls to cabs(r) can be replaced by:

  sqrt(creal(r)*creal(r) + cimag(r)*cimag(r))

Patch by Paul Walker, thanks!

Differential Revision: https://reviews.llvm.org/D40069

llvm-svn: 320901
2017-12-16 01:26:25 +00:00
Michael Zolotukhin
f3262a2cdf Remove redundant includes from unittests.
llvm-svn: 320630
2017-12-13 21:31:05 +00:00
Simon Dardis
186f679a98 Infer lowest bits of an integer Multiply when the low bits of the operands are known
When the lowest bits of the operands to an integer multiply are known, the low bits of the result are deducible.
Code to deduce known-zero bottom bits already existed, but this change improves on that by deducing known-ones.

Patch by: Pedro Ferreira

Reviewers: craig.topper, sanjoy, efriedma

Differential Revision: https://reviews.llvm.org/D34029

llvm-svn: 320269
2017-12-09 23:25:57 +00:00
Alina Sbirlea
0e9a4ac953 [ModRefInfo] Make enum ModRefInfo an enum class [NFC].
Summary:
Make enum ModRefInfo an enum class. Changes to ModRefInfo values should
be done using inline wrappers.
This should prevent future bit-wise opearations from being added, which can be more error-prone.

Reviewers: sanjoy, dberlin, hfinkel, george.burgess.iv

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D40933

llvm-svn: 320107
2017-12-07 22:41:34 +00:00
Max Kazantsev
b69919b5fd [SCEV][NFC] Introduce isSafeToExpandAt function to SCEVExpander
This function checks that:
1) It is safe to expand a SCEV;
2) It is OK to materialize it at the specified location.
For example, attempt to expand a loop's AddRec to the same loop's preheader should fail.

Differential Revision: https://reviews.llvm.org/D39236

llvm-svn: 318377
2017-11-16 05:10:56 +00:00
Sanjoy Das
e62cbedf0c Revert "[SCEV] Maintain and use a loop->loop invalidation dependency"
This reverts commit r315713.  It causes PR34968.

I think I know what the problem is, but I don't think I'll have time to fix it
this week.

llvm-svn: 315962
2017-10-17 01:03:56 +00:00
Matthew Simpson
b85949d044 [SparsePropagation] Enable interprocedural analysis
This patch adds the ability to perform IPSCCP-like interprocedural analysis to
the generic sparse propagation solver. The patch gives clients the ability to
define their own custom LatticeKey types that the generic solver maps to custom
LatticeVal types. The custom lattice keys can be used, for example, to
distinguish among mappings for regular values, values returned from functions,
and values stored in global variables. Clients are responsible for defining how
to convert between LatticeKeys and LLVM Values by providing a specialization of
the LatticeKeyInfo template.

The added unit tests demonstrate how the generic solver can be used to perform
a simplified version of interprocedural constant propagation.

Differential Revision: https://reviews.llvm.org/D37353

llvm-svn: 315919
2017-10-16 17:44:17 +00:00
Sanjoy Das
7341e7e3ca [SCEV] Maintain and use a loop->loop invalidation dependency
Summary:
This change uses the loop use list added in the previous change to remember the
loops that appear in the trip count expressions of other loops; and uses it in
forgetLoop.  This lets us not scan every loop in the function on a forgetLoop
call.

With this change we no longer invalidate clear out backedge taken counts on
forgetValue.  I think this is fine -- the contract is that SCEV users must call
forgetLoop(L) if their change to the IR could have changed the trip count of L;
solely calling forgetValue on a value feeding into the backedge condition of L
is not enough.  Moreover, I don't think we can strengthen forgetValue to be
sufficient for invalidating trip counts without significantly re-architecting
SCEV.  For instance, if we have the loop:

  I = *Ptr;
  E = I + 10;
  do {
    // ...
  } while (++I != E);

then the backedge taken count of the loop is 9, and it has no reference to
either I or E, i.e. there is no way in SCEV today to re-discover the dependency
of the loop's trip count on E or I.  So a SCEV client cannot change E to (say)
"I + 20", call forgetValue(E) and expect the loop's trip count to be updated.

Reviewers: atrick, sunfish, mkazantsev

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D38435

llvm-svn: 315713
2017-10-13 17:13:44 +00:00
Sanjoy Das
9cb64b47cf [SCEV] Maintain loop use lists, and use them in forgetLoop
Summary:
Currently we do not correctly invalidate memoized results for add recurrences
that were created directly (i.e. they were not created from a `Value`).  This
change fixes this by keeping loop use lists and using the loop use lists to
determine which SCEV expressions to invalidate.

Here are some statistics on the number of uses of in the use lists of all loops
on a clang bootstrap (config: release, no asserts):

     Count: 731310
       Min: 1
      Mean: 8.555150
50th %time: 4
95th %tile: 25
99th %tile: 53
       Max: 433

Reviewers: atrick, sunfish, mkazantsev

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D38434

llvm-svn: 315672
2017-10-13 05:50:52 +00:00
Daniel Neilson
6a302cca5a [SCEV] Properly handle the case of a non-constant start with a zero accum in ScalarEvolution::createAddRecFromPHIWithCastsImpl
Summary:
 This patch fixes an error in the patch to ScalarEvolution::createAddRecFromPHIWithCastsImpl
made in D37265. In that patch we handle the cases where the either the start or accum values can be
zero after truncation. But, we assume that the start value must be a constant if the accum is
zero. This is clearly an erroneous assumption. This change removes that assumption.

Reviewers: sanjoy, dorit, mkazantsev

Reviewed By: sanjoy

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38814

llvm-svn: 315491
2017-10-11 19:05:14 +00:00
Florian Hahn
a812673b8c [LVI] Move LVILatticeVal class to separate header file (NFC).
Summary:
This allows sharing the lattice value code between LVI and SCCP (D36656). 

It also adds a `satisfiesPredicate` function, used by D36656.

Reviewers: davide, sanjoy, efriedma

Reviewed By: sanjoy

Subscribers: mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D37591

llvm-svn: 314411
2017-09-28 11:09:22 +00:00
Daniel Neilson
91bedd1195 [SCEV] Generalize folding of trunc(x)+n*trunc(y) into folding m*trunc(x)+n*trunc(y)
Summary:
A SCEV such as:
 {%v2,+,((-1 * (trunc i64 (-1 * %v1) to i32)) + (-1 * (trunc i64 %v1 to i32)))}<%loop>

can be folded into, simply, {%v2,+,0}. However, the current code in ::getAddExpr()
will not try to apply the simplification m*trunc(x)+n*trunc(y) -> trunc(trunc(m)*x+trunc(n)*y)
because it only keys off having a non-multiplied trunc as the first term in the simplification.

This patch generalizes this code to try to do a more generic fold of these trunc
expressions.

Reviewers: sanjoy

Reviewed By: sanjoy

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D37888

llvm-svn: 313988
2017-09-22 15:47:57 +00:00
Sanjoy Das
1139050cb2 [LoopInfo] Make LoopBase and Loop destructors non-public
Summary:
See comment for why I think this is a good idea.

This change also:

 - Removes an SCEV test case.  The SCEV test was not testing anything useful (most of it was `#if 0` ed out) and it would need to be updated to deal with a private ~Loop::Loop.
 - Updates the loop pass manager test case to deal with a private ~Loop::Loop.
 - Renames markAsRemoved to markAsErased to contrast with removeLoop, via the usual remove vs. erase idiom we already have for instructions and basic blocks.

Reviewers: chandlerc

Subscribers: mehdi_amini, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D37996

llvm-svn: 313695
2017-09-19 23:19:00 +00:00
Daniel Neilson
a7e6fd171a [SCEV] Ensure ScalarEvolution::createAddRecFromPHIWithCastsImpl properly handles out of range truncations of the start and accum values
Summary:
 When constructing the predicate P1 in ScalarEvolution::createAddRecFromPHIWithCastsImpl() it is possible
for the PHISCEV from which the predicate is constructed to be a SCEVConstant instead of a SCEVAddRec. If
this happens, then the cast<SCEVAddRec>(PHISCEV) in the code will assert.

 Such a PHISCEV is possible if either the start value or the accumulator value is a constant value
that not equal to its truncated value, and if the truncated value is zero.

 This patch adds tests that demonstrate the cast<> assertion, and fixes this problem by checking
whether the PHISCEV is a constant before constructing the P1 predicate; if it is, then P1 is
equivalent to one of P2 or P3. Additionally, if we know that the start value or accumulator
value are constants then we check whether the P2 and/or P3 predicates are known false at compile
time; if either is, then we bail out of constructing the AddRec.

Reviewers: sanjoy, mkazantsev, silviu.baranga

Reviewed By: mkazantsev

Subscribers: mkazantsev, llvm-commits

Differential Revision: https://reviews.llvm.org/D37265

llvm-svn: 312568
2017-09-05 19:54:03 +00:00
Chandler Carruth
10f8380767 [PM] Switch the CGSCC debug messages to use the standard LLVM debug
printing techniques with a DEBUG_TYPE controlling them.

It was a mistake to start re-purposing the pass manager `DebugLogging`
variable for generic debug printing -- those logs are intended to be
very minimal and primarily used for testing. More detailed and
comprehensive logging doesn't make sense there (it would only make for
brittle tests).

Moreover, we kept forgetting to propagate the `DebugLogging` variable to
various places making it also ineffective and/or unavailable. Switching
to `DEBUG_TYPE` makes this a non-issue.

llvm-svn: 310695
2017-08-11 05:47:13 +00:00
Chandler Carruth
d7fd660b9a [LCG] Switch one of the update methods for the LazyCallGraph to support
limited batch updates.

Specifically, allow removing multiple reference edges starting from
a common source node. There are a few constraints that play into
supporting this form of batching:

1) The way updates occur during the CGSCC walk, about the most we can
   functionally batch together are those with a common source node. This
   also makes the batching simpler to implement, so it seems
   a worthwhile restriction.
2) The far and away hottest function for large C++ files I measured
   (generated code for protocol buffers) showed a huge amount of time
   was spent removing ref edges specifically, so it seems worth focusing
   there.
3) The algorithm for removing ref edges is very amenable to this
   restricted batching. There are just both API and implementation
   special casing for the non-batch case that gets in the way. Once
   removed, supporting batches is nearly trivial.

This does modify the API in an interesting way -- now, we only preserve
the target RefSCC when the RefSCC structure is unchanged. In the face of
any splits, we create brand new RefSCC objects. However, all of the
users were OK with it that I could find. Only the unittest needed
interesting updates here.

How much does batching these updates help? I instrumented the compiler
when run over a very large generated source file for a protocol buffer
and found that the majority of updates are intrinsically updating one
function at a time. However, nearly 40% of the total ref edges removed
are removed as part of a batch of removals greater than one, so these
are the cases batching can help with.

When compiling the IR for this file with 'opt' and 'O3', this patch
reduces the total time by 8-9%.

Differential Revision: https://reviews.llvm.org/D36352

llvm-svn: 310450
2017-08-09 09:05:27 +00:00
Max Kazantsev
82dfccec6e Avoid comparison between signed and unsigned in SCEVExitLimitForget tests
llvm-svn: 310029
2017-08-04 06:03:51 +00:00
Max Kazantsev
d01b5fed62 Fix SCEVExitLimitForget tests to make Sanitizer happy
llvm-svn: 310023
2017-08-04 05:06:44 +00:00
Dehao Chen
26000e4af9 Do not want to use BFI to get profile count for sample pgo
Summary: For SamplePGO, we already record the callsite count in the call instruction itself. So we do not want to use BFI to get profile count as it is less accurate.

Reviewers: tejohnson, davidxl, eraman

Reviewed By: eraman

Subscribers: sanjoy, llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D36025

llvm-svn: 309964
2017-08-03 17:11:41 +00:00
Benjamin Kramer
d7066bff7d Fix use after free in unit test.
llvm-svn: 309952
2017-08-03 15:59:37 +00:00
Max Kazantsev
d38cadc268 Removed unused variabled from unit test
llvm-svn: 309929
2017-08-03 09:25:44 +00:00
Max Kazantsev
6d34bf3cc8 [SCEV] Re-enable "Cache results of computeExitLimit"
The patch rL309080 was reverted because it did not clean up the cache on "forgetValue"
method call. This patch re-enables this change, adds the missing check and introduces
two new unit tests that make sure that the cache is cleaned properly.

Differential Revision: https://reviews.llvm.org/D36087

llvm-svn: 309925
2017-08-03 08:41:30 +00:00
Alina Sbirlea
7b373d280b Allow None as a MemoryLocation to getModRefInfo
Summary:
Adding part of the changes in D30369 (needed to make progress):
Current patch updates AliasAnalysis and MemoryLocation, but does _not_ clean up MemorySSA.

Original summary from D30369, by dberlin:
Currently, we have instructions which affect memory but have no memory
location. If you call, for example, MemoryLocation::get on a fence,
it asserts. This means things specifically have to avoid that. It
also means we end up with a copy of each API, one taking a memory
location, one not.

This starts to fix that.

We add MemoryLocation::getOrNone as a new call, and reimplement the
old asserting version in terms of it.

We make MemoryLocation optional in the (Instruction, MemoryLocation)
version of getModRefInfo, and kill the old one argument version in
favor of passing None (it had one caller). Now both can handle fences
because you can just use MemoryLocation::getOrNone on an instruction
and it will return a correct answer.

We use all this to clean up part of MemorySSA that had to handle this difference.

Note that literally every actual getModRefInfo interface we have could be made private and replaced with:

getModRefInfo(Instruction, Optional<MemoryLocation>)
and
getModRefInfo(Instruction, Optional<MemoryLocation>, Instruction, Optional<MemoryLocation>)

and delegating to the right ones, if we wanted to.

I have not attempted to do this yet.

Reviewers: dberlin, davide, dblaikie

Subscribers: sanjoy, hfinkel, chandlerc, llvm-commits

Differential Revision: https://reviews.llvm.org/D35441

llvm-svn: 309641
2017-08-01 00:28:29 +00:00
Chandler Carruth
099fbc1e8e [PM/LCG] Teach the LazyCallGraph to maintain reference edges from every
function to every defined function known to LLVM as a library function.

LLVM can introduce calls to these functions either by replacing other
library calls or by recognizing patterns (such as memset_pattern or
vector math patterns) and replacing those with calls. When these library
functions are actually defined in the module, we need to have reference
edges to them initially so that we visit them during the CGSCC walk in
the right order and can effectively rebuild the call graph afterward.

This was discovered when building code with Fortify enabled as that is
a common case of both inline definitions of library calls and
simplifications of code into calling them.

This can in extreme cases of LTO-ing with libc introduce *many* more
reference edges. I discussed a bunch of different options with folks but
all of them are unsatisfying. They either make the graph operations
substantially more complex even when there are *no* defined libfuncs, or
they introduce some other complexity into the callgraph. So this patch
goes with the simplest possible solution of actual synthetic reference
edges. If this proves to be a memory problem, I'm happy to implement one
of the clever techniques to save memory here.

llvm-svn: 308088
2017-07-15 08:08:19 +00:00
Chandler Carruth
4a89b23644 [PM] Fix a silly bug in my recent update to the CG update logic.
I used the wrong variable to update. This was even covered by a unittest
I wrote, and the comments for the unittest were correct (if confusing)
but the test itself just matched the buggy behavior. =[

llvm-svn: 307764
2017-07-12 09:08:11 +00:00
Konstantin Zhuravlyov
d382d6f3fc Enhance synchscope representation
OpenCL 2.0 introduces the notion of memory scopes in atomic operations to
  global and local memory. These scopes restrict how synchronization is
  achieved, which can result in improved performance.

  This change extends existing notion of synchronization scopes in LLVM to
  support arbitrary scopes expressed as target-specific strings, in addition to
  the already defined scopes (single thread, system).

  The LLVM IR and MIR syntax for expressing synchronization scopes has changed
  to use *syncscope("<scope>")*, where <scope> can be "singlethread" (this
  replaces *singlethread* keyword), or a target-specific name. As before, if
  the scope is not specified, it defaults to CrossThread/System scope.

  Implementation details:
    - Mapping from synchronization scope name/string to synchronization scope id
      is stored in LLVM context;
    - CrossThread/System and SingleThread scopes are pre-defined to efficiently
      check for known scopes without comparing strings;
    - Synchronization scope names are stored in SYNC_SCOPE_NAMES_BLOCK in
      the bitcode.

Differential Revision: https://reviews.llvm.org/D21723

llvm-svn: 307722
2017-07-11 22:23:00 +00:00
NAKAMURA Takumi
6a1725f368 CGSCCPassManagerTest.cpp: Fix warnings. [-Wunused-variable]
llvm-svn: 307511
2017-07-09 23:06:05 +00:00
Chandler Carruth
fda9b703c9 [PM] Fix a nasty bug in the new PM where we failed to properly
invalidation of analyses when merging SCCs.

While I've added a bunch of testing of this, it takes something much
more like the inliner to really trigger this as you need to have
partially-analyzed SCCs with updates at just the right time. So I've
added a direct test for this using the inliner and verifying the
domtree. Without the changes here, this test ends up finding a stale
dominator tree.

However, to handle this properly, we need to invalidate analyses
*before* merging the SCCs. After talking to Philip and Sanjoy about this
they convinced me this was the right approach. To do this, we need
a callback mechanism when merging SCCs so we can observe the cycle that
will be merged before the merge happens. This API update ended up being
surprisingly easy.

With this commit, the new PM passes the test-suite again. It hadn't
since MemorySSA was enabled for EarlyCSE as that also will find this bug
very quickly.

llvm-svn: 307498
2017-07-09 13:45:11 +00:00
Chandler Carruth
267b806fd9 [PM] Add unittesting of the call graph update logic with complex
dependencies between analyses.

This uncovers even more issues with the proxies and the splitting apart
of SCCs which are fixed in this patch. I discovered this while trying to
add more rigorous testing for a change I'm making to the call graph
update invalidation logic.

llvm-svn: 307497
2017-07-09 13:16:55 +00:00
Chandler Carruth
e28f591d48 [PM] Finish implementing and fix a chain of bugs uncovered by testing
the invalidation propagation logic from an SCC to a Function.

I wrote the infrastructure to test this but didn't actually use it in
the unit test where it was designed to be used. =[ My bad. Once
I actually added it to the test case I discovered that it also hadn't
been properly implemented, so I've implemented it. The logic in the FAM
proxy for an SCC pass to propagate invalidation follows the same ideas
as the FAM proxy for a Module pass, but the implementation is a bit
different to reflect the fact that it is forwarding just for an SCC.

However, implementing this correctly uncovered a surprising "bug" (it
was conservatively correct but relatively very expensive) in how we
handle invalidation when splitting one SCC into multiple SCCs. We did an
eager invalidation when in reality we should be deferring invaliadtion
for the *current* SCC to the CGSCC pass manager and just invaliating the
newly constructed SCCs. Otherwise we end up invalidating too much too
soon. This was exposed by the inliner test case that I've updated. Now,
we invalidate *just* the split off '(test1_f)' SCC when doing the CG
update, and then the inliner finishes and invalidates the '(test1_g,
test1_h)' SCC's analyses. The first few attempts at fixing this hit
still more bugs, but all of those are covered by existing tests. For
example, the inliner should also preserve the FAM proxy to avoid
unnecesasry invalidation, and this is safe because the CG update
routines it uses handle any necessary adjustments to the FAM proxy.

Finally, the unittests for the CGSCC pass manager needed a bunch of
updates where we weren't correctly preserving the FAM proxy because it
hadn't been fully implemented and failing to preserve it didn't matter.

Note that this doesn't yet fix the current crasher due to MemSSA finding
a stale dominator tree, but without this the fix to that crasher doesn't
really make any sense when testing because it relies on the proxy
behavior.

llvm-svn: 307487
2017-07-09 03:59:31 +00:00
Xin Tong
06115fe958 [AST] Fix a bug in aliasesUnknownInst. Make sure we are comparing the unknown instructions in the alias set and the instruction interested in.
Summary:
Make sure we are comparing the unknown instructions in the alias set and the instruction interested in.
I believe this is clearly a bug (missed opportunity). I can also add some test cases if desired.

Reviewers: hfinkel, davide, dberlin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34597

llvm-svn: 306241
2017-06-25 12:55:11 +00:00
David Blaikie
285d868803 GlobalsModRef: Ensure optnone+readonly/readnone attributes are respected
llvm-svn: 304945
2017-06-07 21:37:39 +00:00
Alina Sbirlea
bbb2d68cae [mssa] Fix case when there is no definition in a block prior to an inserted use.
Summary:
Check that the first access before one being tested is valid.
Before this patch, if there was no definition prior to the Use being tested,
the first time Iter was deferenced, it hit the sentinel.

Reviewers: dberlin, gbiv

Subscribers: sanjoy, Prazek, llvm-commits

Differential Revision: https://reviews.llvm.org/D33950

llvm-svn: 304926
2017-06-07 16:46:53 +00:00
David Blaikie
73b29ebe81 GlobalsModRef+OptNone: Don't prove readnone/other properties from an optnone function
Seems like at least one reasonable interpretation of optnone is that the
optimizer never "looks inside" a function. This fix is consistent with
that interpretation.

Specifically this came up in the situation:

f3 calls f2 calls f1
f2 is always_inline
f1 is optnone

The application of readnone to f1 (& thus to f2) caused the inliner to
kill the call to f2 as being trivially dead (without even checking the
cost function, as it happens - not sure if that's also a bug).

llvm-svn: 304833
2017-06-06 20:51:15 +00:00
Chandler Carruth
87b8e94f84 Re-sort #include lines for unittests. This uses a slightly modified
clang-format (https://reviews.llvm.org/D33932) to keep primary headers
at the top and handle new utility headers like 'gmock' consistently with
other utility headers.

No other change was made. I did no manual edits, all of this is
clang-format.

This should allow other changes to have more clear and focused diffs,
and is especially motivated by moving some headers into more focused
libraries.

llvm-svn: 304786
2017-06-06 11:06:56 +00:00
Benjamin Kramer
f71ae721ed [OrderedBasicBlock] Return false for comesBefore(A, A)
So far it would return true for the first uncached query, then cached
queries return false.

llvm-svn: 304545
2017-06-02 13:10:31 +00:00
Gor Nishanov
349aaaf85e ScalarEvolution unit test: fix typo that breaks check-all
llvm-svn: 304063
2017-05-27 05:24:30 +00:00
Keno Fischer
c05ac10abe [SCEVExpander] Try harder to avoid introducing inttoptr
Summary:
This fixes introduction of an incorrect inttoptr/ptrtoint pair in
the included test case which makes use of non-integral pointers. I
suspect there are more cases like this left, but this takes care of
the one I was seeing at the moment.

Reviewers: sanjoy

Subscribers: mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D33129

llvm-svn: 304058
2017-05-27 03:22:55 +00:00
Xin Tong
83078569e3 Revert "Add pthread_self function prototype and make it speculatable."
This reverts commit 143d7445b5dfa2f6d6c45bdbe0433d9fc531be21.

Build breaking

llvm-svn: 303496
2017-05-21 00:37:55 +00:00
Xin Tong
14d596ecb2 Add pthread_self function prototype and make it speculatable.
Summary: This allows pthread_self to be pulled out of a loop by LICM.

Reviewers: hfinkel, arsenm, davide

Reviewed By: davide

Subscribers: davide, wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D32782

llvm-svn: 303495
2017-05-20 22:40:25 +00:00
Easwaran Raman
9567eabb4f Add hasProfileSummary and has{Sample|Instrumentation}Profile methods
ProfileSummaryInfo already checks whether the module has sample profile
in determining profile counts. This will also be useful in inliner to
clean up threshold updates.

llvm-svn: 303204
2017-05-16 20:14:39 +00:00
Andrew Kaylor
95445317bd [TLI] Add declarations for various math header file routines from math-finite.h that create '__<func>_finite as functions
Patch by Chris Chrulski

Differential Revision: https://reviews.llvm.org/D31787

llvm-svn: 302955
2017-05-12 22:11:12 +00:00
Teresa Johnson
81aa3660f7 Restrict call metadata based hotness detection to Sample PGO mode
Summary:
Don't use the metadata on call instructions for determining hotness
unless we are in sample PGO mode, where it is needed because profile
counts are not accurate. In instrumentation mode this is not necessary
and does more harm than good when calls have VP metadata that hasn't
been properly scaled after transformations or dropped after constant
prop based devirtualization (both should be fixed, but we don't need
to do this in the first place for instrumentation PGO).

This required adjusting a number of tests to distinguish between sample
and instrumentation PGO handling, and to add in profile summary metadata
so that getProfileCount can get the summary.

Reviewers: davidxl, danielcdh

Subscribers: aemerson, rengolin, mehdi_amini, Prazek, llvm-commits

Differential Revision: https://reviews.llvm.org/D32877

llvm-svn: 302844
2017-05-11 23:18:05 +00:00
Matthias Braun
85c90275ff TargetLibraryInfo: Introduce wcslen
wcslen is part of the C99 and C++98 standards.

- This introduces the function to TargetLibraryInfo.
- Also set attributes for wcslen in llvm::inferLibFuncAttributes().

Differential Revision: https://reviews.llvm.org/D32837

llvm-svn: 302278
2017-05-05 20:25:50 +00:00
Sanjoy Das
c75f995ef0 Teach SCEV normalization to de/normalize non-affine add recs
Summary:
Before this change, SCEV Normalization would incorrectly normalize
non-affine add recurrences.  To work around this there was (still is)
a check in place to make sure we only tried to normalize affine add
recurrences.

We recently found a bug in aforementioned check to bail out of
normalizing non-affine add recurrences.  However, instead of fixing
the bailout, I have decided to teach SCEV normalization to work
correctly with non-affine add recurrences, making the bailout
unnecessary (I'll remove it in a subsequent change).

I've also added some unit tests (which would have failed before this
change).

Reviewers: atrick, sunfish, efriedma

Reviewed By: atrick

Subscribers: mcrosier, mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D32104

llvm-svn: 301281
2017-04-25 00:09:19 +00:00
Wei Mi
a91f9f0c5f [SCEV] Add a local cache for getZeroExtendExpr and getSignExtendExpr to prevent
the exponential behavior.

The patch is to fix PR32043. Functions getZeroExtendExpr and getSignExtendExpr
may call themselves recursively more than once. This is potentially a 2^N
complexity behavior. The exponential behavior was not commonly exposed before
because of existing global cache mechnism like UniqueSCEVs or some early return
mechanism when flags FlagNSW or FlagNUW are seen. However, we still have case
which can expose the exponential behavior, like the case in PR32043, so we add
a local cache in getZeroExtendExpr and getSignExtendExpr. If the input of the
functions -- SCEV and type pair have been seen before, we can find the extended
expression directly in the local cache.

Differential Revision: https://reviews.llvm.org/D30350

llvm-svn: 300494
2017-04-17 20:40:05 +00:00
Sanjoy Das
0bffb56cbe Generalize SCEV's unit testing helper a bit
llvm-svn: 300379
2017-04-14 23:47:53 +00:00
Craig Topper
9f4fbdf31c [ValueTracking] Avoid undefined behavior in unittest by not making a named ArrayRef from a std::initializer_list
One of the ValueTracking unittests creates a named ArrayRef initialized by a std::initializer_list. The underlying array for an std::initializer_list is only guaranteed to have a lifetime as long as the initializer_list object itself. So this can leave the ArrayRef pointing at an array that no long exists.

This fixes this to just create an explicit array instead of an ArrayRef.

Differential Revision: https://reviews.llvm.org/D32089

llvm-svn: 300354
2017-04-14 17:59:19 +00:00
Sanjoy Das
01c58fa721 Add a unit test for SCEV Normalization
llvm-svn: 300332
2017-04-14 15:50:04 +00:00
Daniel Berlin
e485ce96aa MemorySSA: Move to Analysis, from Transforms/Utils. It's used as
Analysis, it has Analysis passes, and once NewGVN is made an Analysis,
this removes the cross dependency from Analysis to Transform/Utils.
NFC.

llvm-svn: 299980
2017-04-11 20:06:36 +00:00
Matt Arsenault
204d4c1d7b Allow DataLayout to specify addrspace for allocas.
LLVM makes several assumptions about address space 0. However,
alloca is presently constrained to always return this address space.
There's no real way to avoid using alloca, so without this
there is no way to opt out of these assumptions.

The problematic assumptions include:
- That the pointer size used for the stack is the same size as
  the code size pointer, which is also the maximum sized pointer.

- That 0 is an invalid, non-dereferencable pointer value.

These are problems for AMDGPU because alloca is used to
implement the private address space, which uses a 32-bit
index as the pointer value. Other pointers are 64-bit
and behave more like LLVM's notion of generic address
space. By changing the address space used for allocas,
we can change our generic pointer type to be LLVM's generic
pointer type which does have similar properties.

llvm-svn: 299888
2017-04-10 22:27:50 +00:00
Simon Pilgrim
b8aa529569 Fix signed/unsigned comparison warnings
llvm-svn: 297561
2017-03-11 13:02:31 +00:00
Dehao Chen
f131435479 Refactor the PSI to extract getCallSiteCount and remove checks for profile type.
Summary: There is no need to check profile count as only CallInst will have metadata attached.

Reviewers: eraman

Reviewed By: eraman

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30799

llvm-svn: 297500
2017-03-10 19:45:16 +00:00
Sanjoy Das
58bb799513 [SCEV] Decrease the recursion threshold for CompareValueComplexity
Fixes PR32142.

r287232 accidentally increased the recursion threshold for
CompareValueComplexity from 2 to 32.  This change reverses that change
by introducing a separate flag for CompareValueComplexity's threshold.

llvm-svn: 296992
2017-03-05 23:49:17 +00:00
Sanjoy Das
97ea3c33f1 Fix signed-unsigned comparison warning
llvm-svn: 296274
2017-02-25 22:25:48 +00:00
Sanjoy Das
059733f666 [ValueTracking] Don't do an unchecked shift in ComputeNumSignBits
Summary:
Previously we used to return a bogus result, 0, for IR like `ashr %val,
-1`.

I've also added an assert checking that `ComputeNumSignBits` at least
returns 1.  That assert found an already checked in test case where we
were returning a bad result for `ashr %val, -1`.

Fixes PR32045.

Reviewers: spatel, majnemer

Reviewed By: spatel, majnemer

Subscribers: efriedma, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D30311

llvm-svn: 296273
2017-02-25 20:30:45 +00:00
Chandler Carruth
e56d7470a4 [PM/LCG] Teach LCG to support spurious reference edges.
Somewhat amazingly, this only requires teaching it to clean them up when
deleting a dead function from the graph. And we already have exactly the
necessary data structures to do that in the parent RefSCCs.

This allows ArgPromote to work in a much simpler way be merely letting
reference edges linger in the graph after the causing IR is deleted. We
will clean up these edges when we run any function pass over the IR, but
don't remove them eagerly.

This avoids all of the quadratic update issues both in the current pass
manager and in my previous attempt with the new pass manager.

Differential Revision: https://reviews.llvm.org/D29579

llvm-svn: 294663
2017-02-09 23:30:14 +00:00
Chandler Carruth
042041bdf3 [PM/LCG] Teach the LazyCallGraph how to replace a function without
disturbing the graph or having to update edges.

This is motivated by porting argument promotion to the new pass manager.
Because of how LLVM IR Function objects work, in order to change their
signature a new object needs to be created. This is efficient and
straight forward in the IR but previously was very hard to implement in
LCG. We could easily replace the function a node in the graph
represents. The challenging part is how to handle updating the edges in
the graph.

LCG previously used an edge to a raw function to represent a node that
had not yet been scanned for calls and references. This was the core
of its laziness. However, that model causes this kind of update to be
very hard:
1) The keys to lookup an edge need to be `Function*`s that would all
   need to be updated when we update the node.
2) There will be some unknown number of edges that haven't transitioned
   from `Function*` edges to `Node*` edges.

All of this complexity isn't necessary. Instead, we can always build
a node around any function, always pointing edges at it and always using
it as the key to lookup an edge. To maintain the laziness, we need to
sink the *edges* of a node into a secondary object and explicitly model
transitioning a node from empty to populated by scanning the function.
This design seems much cleaner in a number of ways, but importantly
there is now exactly *one* place where the `Function*` has to be
updated!

Some other cleanups that fall out of this include having something to
model the *entry* edges more accurately. Rather than hand rolling parts
of the node in the graph itself, we have an explicit `EdgeSequence`
object that gives us exactly the functionality needed. We also have
a consistent place to define the edge iterators and can use them for
both the entry edges and the internal edges of the graph.

The API used to model the separation between a node and its edges is
intentionally very thin as most clients are expected to deal with nodes
that have populated edges. We model this exactly as an optional does
with an additional method to populate the edges when that is
a reasonable thing for a client to do. This is based on API design
suggestions from Richard Smith and David Blaikie, credit goes to them
for helping pick how to model this without it being either too explicit
or too implicit.

The patch is somewhat noisy due to shifting around iterator types and
new syntax for walking the edges of a node, but most of the
functionality change is in the `Edge`, `EdgeSequence`, and `Node` types.

Differential Revision: https://reviews.llvm.org/D29577

llvm-svn: 294653
2017-02-09 23:24:13 +00:00
Chandler Carruth
59e43d5713 [SCEV] Scale back the test added in r294181 as it goes quadratic in
SCEV.

This test was immediately the slowest test in 'check-llvm' even in an
optimized build and was driving up the total test time by 50% for me.

Sanjoy has filed a PR about the quadratic behavior in SCEV but it is
also concerning that the test still passes given that r294181 added
a threshold at 32 to SCEV. I've followed up on the original patch to
figure out how this test should work long-term, but for now I want to
get check-llvm to be fast again.

llvm-svn: 294241
2017-02-06 21:27:12 +00:00
Chandler Carruth
93759a2957 [PM/LCG] Remove the lazy RefSCC formation from the LazyCallGraph during
iteration.

The lazy formation of RefSCCs isn't really the most important part of
the laziness here -- that has to do with walking the functions
themselves -- and isn't essential to maintain. Originally, there were
incremental update algorithms that relied on updates happening
predominantly near the most recent RefSCC formed, but those have been
replaced with ones that have much tighter general case bounds at this
point. We do still perform asserts that only scale well due to this
incrementality, but those are easy to place behind EXPENSIVE_CHECKS.

Removing this simplifies the entire analysis by having a single up-front
step that builds all of the RefSCCs in a direct Tarjan walk. We can even
easily replace this with other or better algorithms at will and with
much less confusion now that there is no iterator-based incremental
logic involved. This removes a lot of complexity from LCG.

Another advantage of moving in this direction is that it simplifies
testing the system substantially as we no longer have to worry about
observing and mutating the graph half-way through the RefSCC formation.

We still need a somewhat special iterator for RefSCCs because we want
the iterator to remain stable in the face of graph updates. However,
this now merely involves relative indexing to the current RefSCC's
position in the sequence which isn't too hard.

Differential Revision: https://reviews.llvm.org/D29381

llvm-svn: 294227
2017-02-06 19:38:06 +00:00
Daniil Fukalov
fd35b81460 [SCEV] limit recursion depth and operands number in getAddExpr
for a quite big function with source like

%add = add nsw i32 %mul, %conv
%mul1 = mul nsw i32 %add, %conv
%add2 = add nsw i32 %mul1, %add
%mul3 = mul nsw i32 %add2, %add
; repeat couple of thousands times
that can be produced by loop unroll, getAddExpr() tries to recursively construct SCEV and runs almost infinite time.

Added recursion depth restriction (with new parameter to set it)

Reviewers: sanjoy

Subscribers: hfinkel, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D28158

llvm-svn: 294181
2017-02-06 12:38:06 +00:00
David L. Jones
268960185f [Analysis] Add LibFunc_ prefix to enums in TargetLibraryInfo. (NFC)
Summary:
The LibFunc::Func enum holds enumerators named for libc functions.
Unfortunately, there are real situations, including libc implementations, where
function names are actually macros (musl uses "#define fopen64 fopen", for
example; any other transitively visible macro would have similar effects).

Strictly speaking, a conforming C++ Standard Library should provide any such
macros as functions instead (via <cstdio>). However, there are some "library"
functions which are not part of the standard, and thus not subject to this
rule (fopen64, for example). So, in order to be both portable and consistent,
the enum should not use the bare function names.

The old enum naming used a namespace LibFunc and an enum Func, with bare
enumerators. This patch changes LibFunc to be an enum with enumerators prefixed
with "LibFFunc_". (Unfortunately, a scoped enum is not sufficient to override
macros.)

There are additional changes required in clang.

Reviewers: rsmith

Subscribers: mehdi_amini, mzolotukhin, nemanjai, llvm-commits

Differential Revision: https://reviews.llvm.org/D28476

llvm-svn: 292848
2017-01-23 23:16:46 +00:00
Chandler Carruth
aea315a631 [LoopInfo] Add helper methods to compute two useful orderings of the
loops in a function.

These are relatively confusing to talk about and compute correctly so it
seems really good to write down their implementation in one place. I've
replaced one place we needed this in the loop PM infrastructure and
I have another place in a pending patch that wants it.

We can't quite use this for the core loop PM walk because there we're
sometimes working on a sub-forest.

I'll add the expected unittests before committing this but wanted to
make sure folks were happy with these names / comments.

Credit goes to Richard Smith for the idea for naming the order where siblings
are in reverse program order but the tree traversal remains preorder.

Differential Revision: https://reviews.llvm.org/D28932

llvm-svn: 292569
2017-01-20 02:41:20 +00:00
Easwaran Raman
7a2d97369c Add an interface to scale the frequencies of a set of blocks.
The scaling is done with reference to the the new frequency of a reference block.

Differential Revision: https://reviews.llvm.org/D28535

llvm-svn: 292507
2017-01-19 18:53:16 +00:00
Xin Tong
acffedf36e Refactor out LoopInfo computation so that it can be used by
other test cases.

Summary: Refactor out LoopInfo computation so that it can be
used by other test cases.

So i am changing this test proactively for later commit, which will use
this function.

Reviewers: sanjoy, hfinkel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28778

llvm-svn: 292250
2017-01-17 20:24:39 +00:00
Ahmed Bougacha
c5c2e57ec2 [TLI] Add prototype checking for all remaining LibFuncs.
This is another step towards unifying all LibFunc prototype checks.
This work started in r267758 (D19469);  add the remaining checks.

Also add a unittest that checks each libfunc declared with a known-valid
and known-invalid prototype.  New libfuncs added in the future are
required to have prototype checking in place; the known-valid test will
fail otherwise.

Differential Revision: https://reviews.llvm.org/D28030

llvm-svn: 292188
2017-01-17 03:10:02 +00:00
Ahmed Bougacha
bb16a66093 [unittests] Alphabetize cmake file list. NFC.
llvm-svn: 292186
2017-01-17 03:09:55 +00:00
Xin Tong
97363b5028 Empty line. NFC.
llvm-svn: 292081
2017-01-15 23:32:11 +00:00
Xin Tong
9f93956e98 Use getLoopLatch in place of isLoopSimplifyForm
Summary:
Use getLoopLatch in place of isLoopSimplifyForm. we do not need
to know whether the loop has a preheader nor dedicated exits.

Reviewers: hfinkel, sanjoy, atrick, mkuper

Subscribers: mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D28724

llvm-svn: 292078
2017-01-15 21:17:52 +00:00
Xin Tong
b8f4943e70 Delete a dead argument. NFC
llvm-svn: 292074
2017-01-15 19:53:59 +00:00
Easwaran Raman
754f5f3612 Compute summary before calling extractProfTotalWeight
extractProfTotalWeight checks if the profile type is sample profile, but
before that we have to ensure that summary is available. Also expanded
the unittest to test the case where there is no summar

Differential Revision: https://reviews.llvm.org/D28708

llvm-svn: 291982
2017-01-14 00:32:37 +00:00
Easwaran Raman
50f9f008cb ProfileSummaryInfo improvements.
* Add is{Hot|Cold}CallSite methods
* Fix a bug in isHotBB where it was looking for MD_prof on a return instruction
* Use MD_prof data only if sample profiling was used to collect profiles.
* Add an unit test to ProfileSummaryInfo

Differential Revision: https://reviews.llvm.org/D28584

llvm-svn: 291878
2017-01-13 01:34:00 +00:00
Chandler Carruth
e59e4b3dc5 [PM] Separate the LoopAnalysisManager from the LoopPassManager and move
the latter to the Transforms library.

While the loop PM uses an analysis to form the IR units, the current
plan is to have the PM itself establish and enforce both loop simplified
form and LCSSA. This would be a layering violation in the analysis
library.

Fundamentally, the idea behind the loop PM is to *transform* loops in
addition to running passes over them, so it really seemed like the most
natural place to sink this was into the transforms library.

We can't just move *everything* because we also have loop analyses that
rely on a subset of the invariants. So this patch splits the the loop
infrastructure into the analysis management that has to be part of the
analysis library, and the transform-aware pass manager.

This also required splitting the loop analyses' printer passes out to
the transforms library, which makes sense to me as running these will
transform the code into LCSSA in theory.

I haven't split the unittest though because testing one component
without the other seems nearly intractable.

Differential Revision: https://reviews.llvm.org/D28452

llvm-svn: 291662
2017-01-11 09:43:56 +00:00
Chandler Carruth
47fe0b8a82 [PM] Take more drastic measures to work around MSVC's failure on this
code. If this doesn't work and I can't find someone to help who has MSVC
installed, I'll back everything out I guess. =[

llvm-svn: 291661
2017-01-11 09:20:24 +00:00
Chandler Carruth
9d614df7e2 [PM] Pull a lambda out of an argument into a named variable to try and
get a little more clarity about the nature of the issue MSVC is having
with this code.

llvm-svn: 291656
2017-01-11 08:23:29 +00:00
Chandler Carruth
c483768d91 [PM] Another attempt to satisfy MSVC.
llvm-svn: 291655
2017-01-11 07:53:12 +00:00
Chandler Carruth
15b99828b8 [PM] Try to appease MSVC by explicitly disambiguating a member name as
a template.

llvm-svn: 291654
2017-01-11 07:37:50 +00:00
Chandler Carruth
4855803b43 [PM] Rewrite the loop pass manager to use a worklist and augmented run
arguments much like the CGSCC pass manager.

This is a major redesign following the pattern establish for the CGSCC layer to
support updates to the set of loops during the traversal of the loop nest and
to support invalidation of analyses.

An additional significant burden in the loop PM is that so many passes require
access to a large number of function analyses. Manually ensuring these are
cached, available, and preserved has been a long-standing burden in LLVM even
with the help of the automatic scheduling in the old pass manager. And it made
the new pass manager extremely unweildy. With this design, we can package the
common analyses up while in a function pass and make them immediately available
to all the loop passes. While in some cases this is unnecessary, I think the
simplicity afforded is worth it.

This does not (yet) address loop simplified form or LCSSA form, but those are
the next things on my radar and I have a clear plan for them.

While the patch is very large, most of it is either mechanically updating loop
passes to the new API or the new testing for the loop PM. The code for it is
reasonably compact.

I have not yet updated all of the loop passes to correctly leverage the update
mechanisms demonstrated in the unittests. I'll do that in follow-up patches
along with improved FileCheck tests for those passes that ensure things work in
more realistic scenarios. In many cases, there isn't much we can do with these
until the loop simplified form and LCSSA form are in place.

Differential Revision: https://reviews.llvm.org/D28292

llvm-svn: 291651
2017-01-11 06:23:21 +00:00
Sanjoy Das
a2fc5cf9c6 Fix an issue with isGuaranteedToTransferExecutionToSuccessor
I'm not sure if this was intentional, but today
isGuaranteedToTransferExecutionToSuccessor returns true for readonly and
argmemonly calls that may throw.  This commit changes the function to
not implicitly infer nounwind this way.

Even if we eventually specify readonly calls as not throwing,
isGuaranteedToTransferExecutionToSuccessor is not the best place to
infer that.  We should instead teach FunctionAttrs or some other such
pass to tag readonly functions / calls as nounwind instead.

llvm-svn: 290794
2016-12-31 22:12:34 +00:00
Chandler Carruth
90f7376682 [PM] Teach the CGSCC's CG update utility to more carefully invalidate
analyses when we're about to break apart an SCC.

We can't wait until after breaking apart the SCC to invalidate things:
1) Which SCC do we then invalidate? All of them?
2) Even if we invalidate all of them, a newly created SCC may not have
   a proxy that will convey the invalidation to functions!

Previously we only invalidated one of the SCCs and too late. This led to
stale analyses remaining in the cache. And because the caching strategy
actually works, they would get used and chaos would ensue.

Doing invalidation early is somewhat pessimizing though if we *know*
that the SCC structure won't change. So it turns out that the design to
make the mutation API force the caller to know the *kind* of mutation in
advance was indeed 100% correct and we didn't do enough of it. So this
change also splits two cases of switching a call edge to a ref edge into
two separate APIs so that callers can clearly test for this and take the
easy path without invalidating when appropriate. This is particularly
important in this case as we expect most inlines to be between functions
in separate SCCs and so the common case is that we don't have to so
aggressively invalidate analyses.

The LCG API change in turn needed some basic cleanups and better testing
in its unittest. No interesting functionality changed there other than
more coverage of the returned sequence of SCCs.

While this seems like an obvious improvement over the current state, I'd
like to revisit the core concept of invalidating within the CG-update
layer at all. I'm wondering if we would be better served forcing the
callers to handle the invalidation beforehand in the cases that they
can handle it. An interesting example is when we want to teach the
inliner to *update and preserve* analyses. But we can cross that bridge
when we get there.

With this patch, the new pass manager an build all of the LLVM test
suite at -O3 and everything passes. =D I haven't bootstrapped yet and
I'm sure there are still plenty of bugs, but this gives a nice baseline
so I'm going to increasingly focus on fleshing out the missing
functionality, especially the bits that are just turned off right now in
order to let us establish this baseline.

llvm-svn: 290664
2016-12-28 10:34:50 +00:00
Chandler Carruth
ff1b18d787 [LCG] Teach the ref edge removal to handle a ref edge that is trivial
due to a call cycle.

This actually crashed the ref removal before.

I've added a unittest that covers this kind of interesting graph
structure and mutation.

llvm-svn: 290645
2016-12-28 02:24:58 +00:00
Chandler Carruth
99ebb6e7dc [PM] Introduce the facilities for registering cross-IR-unit dependencies
that require deferred invalidation.

This handles the other real-world invalidation scenario that we have
cases of: a function analysis which caches references to a module
analysis. We currently do this in the AA aggregation layer and might
well do this in other places as well.

Since this is relative rare, the technique is somewhat more cumbersome.
Analyses need to register themselves when accessing the outer analysis
manager's proxy. This proxy is already necessarily present to allow
access to the outer IR unit's analyses. By registering here we can track
and trigger invalidation when that outer analysis goes away.

To make this work we need to enhance the PreservedAnalyses
infrastructure to support a (slightly) more explicit model for "sets" of
analyses, and allow abandoning a single specific analyses even when
a set covering that analysis is preserved. That allows us to describe
the scenario of preserving all Function analyses *except* for the one
where deferred invalidation has triggered.

We also need to teach the invalidator API to support direct ID calls
instead of always going through a template to dispatch so that we can
just record the ID mapping.

I've introduced testing of all of this both for simple module<->function
cases as well as for more complex cases involving a CGSCC layer.

Much like the previous patch I've not tried to fully update the loop
pass management layer because that layer is due to be heavily reworked
to use similar techniques to the CGSCC to handle updates. As that
happens, we'll have a better testing basis for adding support like this.

Many thanks to both Justin and Sean for the extensive reviews on this to
help bring the API design and documentation into a better state.

Differential Revision: https://reviews.llvm.org/D27198

llvm-svn: 290594
2016-12-27 08:40:39 +00:00
Chandler Carruth
c51940f6af [LCG] Teach the LazyCallGraph to handle visiting the blockaddress
constant expression and to correctly form function reference edges
through them without crashing because one of the operands (the
`BasicBlock` isn't actually a constant despite being an operand of
a constant).

llvm-svn: 290581
2016-12-27 05:00:45 +00:00
George Burgess IV
6a02f0bf77 Don't consider allocsize functions to be allocation functions.
This patch fixes some ASAN unittest failures on FreeBSD. See the
cfe-commits email thread for r290169 for more on those.

According to the LangRef, the allocsize attribute only tells us about
the number of bytes that exist at the memory location pointed to by the
return value of a function. It does not necessarily mean that the
function will only ever allocate. So, we need to be very careful about
treating functions with allocsize as general allocation functions. This
patch makes us fully conservative in this regard, though I suspect that
we have room to be a bit more aggressive if we want.

This has a FIXME that can be fixed by a relatively straightforward
refactor; I just wanted to keep this patch minimal. If this sticks, I'll
come back and fix it in a few days.

llvm-svn: 290397
2016-12-23 01:18:09 +00:00
Chandler Carruth
2ed363a464 [PM] Introduce a reasonable port of the main per-module pass pipeline
from the old pass manager in the new one.

I'm not trying to support (initially) the numerous options that are
currently available to customize the pass pipeline. If we end up really
wanting them, we can add them later, but I suspect many are no longer
interesting. The simplicity of omitting them will help a lot as we sort
out what the pipeline should look like in the new PM.

I've also documented to the best of my ability *why* each pass or group
of passes is used so that reading the pipeline is more helpful. In many
cases I think we have some questionable choices of ordering and I've
left FIXME comments in place so we know what to come back and revisit
going forward. But for now, I've left it as similar to the current
pipeline as I could.

Lastly, I've had to comment out several places where passes are not
ported to the new pass manager or where the loop pass infrastructure is
not yet ready. I did at least fix a few bugs in the loop pass
infrastructure uncovered by running the full pipeline, but I didn't want
to go too far in this patch -- I'll come back and re-enable these as the
infrastructure comes online. But I'd like to keep the comments in place
because I don't want to lose track of which passes need to be enabled
and where they go.

One thing that seemed like a significant API improvement was to require
that we don't build pipelines for O0. It seems to have no real benefit.

I've also switched back to returning pass managers by value as at this
API layer it feels much more natural to me for composition. But if
others disagree, I'm happy to go back to an output parameter.

I'm not 100% happy with the testing strategy currently, but it seems at
least OK. I may come back and try to refactor or otherwise improve this
in subsequent patches but I wanted to at least get a good starting point
in place.

Differential Revision: https://reviews.llvm.org/D28042

llvm-svn: 290325
2016-12-22 06:59:15 +00:00
Daniel Jasper
162ffcacd6 Revert @llvm.assume with operator bundles (r289755-r289757)
This creates non-linear behavior in the inliner (see more details in
r289755's commit thread).

llvm-svn: 290086
2016-12-19 08:22:17 +00:00
Vedant Kumar
896c96fa1e Retry: [BPI] Use a safer constructor to calculate branch probabilities
BPI may trigger signed overflow UB while computing branch probabilities for
cold calls or to unreachables. For example, with our current choice of weights,
we'll crash if there are >= 2^12 branches to an unreachable.

Use a safer BranchProbability constructor which is better at handling fractions
with large denominators.

Changes since the initial commit:
  - Use explicit casts to ensure that multiplication operands are 64-bit
    ints.

rdar://problem/29368161

Differential Revision: https://reviews.llvm.org/D27862

llvm-svn: 290022
2016-12-17 01:02:08 +00:00
Vedant Kumar
2a685e1c7e Revert "[BPI] Use a safer constructor to calculate branch probabilities"
This reverts commit r290016. It breaks this bot, even though the test
passes locally:

  http://bb.pgr.jp/builders/ninja-x64-msvc-RA-centos6/builds/32956/

AnalysisTests: /home/bb/ninja-x64-msvc-RA-centos6/llvm-project/llvm/lib/Support/BranchProbability.cpp:52: static llvm::BranchProbability llvm::BranchProbability::getBranchProbability(uint64_t, uint64_t): Assertion `Numerator <= Denominator && "Probability cannot be bigger than 1!"' failed.
llvm-svn: 290019
2016-12-17 00:19:06 +00:00
Vedant Kumar
0a726263ad [BPI] Use a safer constructor to calculate branch probabilities
BPI may trigger signed overflow UB while computing branch probabilities
for cold calls or to unreachables. For example, with our current choice
of weights, we'll crash if there are >= 2^12 branches to an unreachable.

Use a safer BranchProbability constructor which is better at handling
fractions with large denominators.

rdar://problem/29368161

Differential Revision: https://reviews.llvm.org/D27862

llvm-svn: 290016
2016-12-17 00:09:51 +00:00
Hal Finkel
f224db75d2 Remove the AssumptionCache
After r289755, the AssumptionCache is no longer needed. Variables affected by
assumptions are now found by using the new operand-bundle-based scheme. This
new scheme is more computationally efficient, and also we need much less
code...

llvm-svn: 289756
2016-12-15 03:02:15 +00:00
Reid Kleckner
a3cd1e4795 Revert "[SCEVExpand] do not hoist divisions by zero (PR30935)"
Reverts r289412. It caused an OOB PHI operand access in instcombine when
ASan is enabled. Reduction in progress.

Also reverts "[SCEVExpander] Add a test case related to r289412"

llvm-svn: 289453
2016-12-12 18:52:32 +00:00
Sanjoy Das
1a88a7426f [SCEVExpander] Add a test case related to r289412
llvm-svn: 289435
2016-12-12 14:57:11 +00:00
Sebastian Pop
9402602249 [SCEVExpand] do not hoist divisions by zero (PR30935)
SCEVExpand computes the insertion point for the components of a SCEV to be code
generated.  When it comes to generating code for a division, SCEVexpand would
not be able to check (at compilation time) all the conditions necessary to avoid
a division by zero.  The patch disables hoisting of expressions containing
divisions by anything other than non-zero constants in order to avoid hoisting
these expressions past conditions that should hold before doing the division.

The patch passes check-all on x86_64-linux.

Differential Revision: https://reviews.llvm.org/D27216

llvm-svn: 289412
2016-12-12 02:52:51 +00:00
Sanjoy Das
fc775b1e49 [TBAA] Don't generate invalid TBAA when merging nodes
Summary:
Fix a corner case in `MDNode::getMostGenericTBAA` where we can sometimes
generate invalid TBAA metadata.

Reviewers: chandlerc, hfinkel, mehdi_amini, manmanren

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D26635

llvm-svn: 289403
2016-12-11 20:07:25 +00:00
Chandler Carruth
4fc787e70d [PM] Support invalidation of inner analysis managers from a pass over the outer IR unit.
Summary:
This never really got implemented, and was very hard to test before
a lot of the refactoring changes to make things more robust. But now we
can test it thoroughly and cleanly, especially at the CGSCC level.

The core idea is that when an inner analysis manager proxy receives the
invalidation event for the outer IR unit, it needs to walk the inner IR
units and propagate it to the inner analysis manager for each of those
units. For example, each function in the SCC needs to get an
invalidation event when the SCC gets one.

The function / module interaction is somewhat boring here. This really
becomes interesting in the face of analysis-backed IR units. This patch
effectively handles all of the CGSCC layer's needs -- both invalidating
SCC analysis and invalidating function analysis when an SCC gets
invalidated.

However, this second aspect doesn't really handle the
LoopAnalysisManager well at this point. That one will need some change
of design in order to fully integrate, because unlike the call graph,
the entire function behind a LoopAnalysis's results can vanish out from
under us, and we won't even have a cached API to access. I'd like to try
to separate solving the loop problems into a subsequent patch though in
order to keep this more focused so I've adapted them to the API and
updated the tests that immediately fail, but I've not added the level of
testing and validation at that layer that I have at the CGSCC layer.

An important aspect of this change is that the proxy for the
FunctionAnalysisManager at the SCC pass layer doesn't work like the
other proxies for an inner IR unit as it doesn't directly manage the
FunctionAnalysisManager and invalidation or clearing of it. This would
create an ever worsening problem of dual ownership of this
responsibility, split between the module-level FAM proxy and this
SCC-level FAM proxy. Instead, this patch changes the SCC-level FAM proxy
to work in terms of the module-level proxy and defer to it to handle
much of the updates. It only does SCC-specific invalidation. This will
become more important in subsequent patches that support more complex
invalidaiton scenarios.

Reviewers: jlebar

Subscribers: mehdi_amini, mcrosier, mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D27197

llvm-svn: 289317
2016-12-10 06:34:44 +00:00
Chandler Carruth
84780666b4 [PM] Extend the explicit 'invalidate' method API on analysis results to
accept an Invalidator that allows them to invalidate themselves if their
dependencies are in turn invalidated.

Rather than recording the dependency graph ahead of time when analysis
get results from other analyses, this simply lets each result trigger
the immediate invalidation of any analyses they actually depend on. They
do this in a way that has three nice properties:

1) They don't have to handle transitive dependencies because the
   infrastructure will recurse for them.
2) The invalidate methods are still called only once. We just
   dynamically discover the necessary topological ordering, everything
   is memoized nicely.
3) The infrastructure still provides a default implementation and can
   access it so that only analyses which have dependencies need to do
   anything custom.

To make this work at all, the invalidation logic also has to defer the
deletion of the result objects themselves so that they can remain alive
until we have collected the complete set of results to invalidate.

A unittest is added here that has exactly the dependency pattern we are
concerned with. It hit the use-after-free described by Sean in much
detail in the long thread about analysis invalidation before this
change, and even in an intermediate form of this change where we failed
to defer the deletion of the result objects.

There is an important problem with doing dependency invalidation that
*isn't* solved here: we don't *enforce* that results correctly
invalidate all the analyses whose results they depend on.

I actually looked at what it would take to do that, and it isn't as hard
as I had thought but the complexity it introduces seems very likely to
outweigh the benefit. The technique would be to provide a base class for
an analysis result that would be populated with other results, and
automatically provide the invalidate method which immediately does the
correct thing. This approach has some nice pros IMO:
- Handles the case we care about and nothing else: only *results*
  that depend on other analyses trigger extra invalidation.
- Localized to the result rather than centralized in the analysis
  manager.
- Ties the storage of the reference to another result to the triggering
  of the invalidation of that analysis.
- Still supports extending invalidation in customized ways.

But the down sides here are:
- Very heavy-weight meta-programming is needed to provide this base
  class.
- Requires a pretty awful API for accessing the dependencies.

Ultimately, I fear it will not pull its weight. But we can re-evaluate
this at any point if we start discovering consistent problems where the
invalidation and dependencies get out of sync. It will fit as a clean
layer on top of the facilities in this patch that we can add if and when
we need it.

Note that I'm not really thrilled with the names for these APIs... The
name "Invalidator" seems ok but not great. The method name "invalidate"
also. In review some improvements were suggested, but they really need
*other* uses of these terms to be updated as well so I'm going to do
that in a follow-up commit.

I'm working on the actual fixes to various analyses that need to use
these, but I want to try to get tests for each of them so we don't
regress. And those changes are seperable and obvious so once this goes
in I should be able to roll them out throughout LLVM.

Many thanks to Sean, Justin, and others for help reviewing here.

Differential Revision: https://reviews.llvm.org/D23738

llvm-svn: 288077
2016-11-28 22:04:31 +00:00
Chandler Carruth
49ee0f2c30 [PM] Add an ASCII-art diagram for the call graph in the CGSCC unit test.
No functionality changed.

llvm-svn: 288013
2016-11-28 03:40:33 +00:00
Chandler Carruth
dad102bcc9 [PM] Change the static object whose address is used to uniquely identify
analyses to have a common type which is enforced rather than using
a char object and a `void *` type when used as an identifier.

This has a number of advantages. First, it at least helps some of the
confusion raised in Justin Lebar's code review of why `void *` was being
used everywhere by having a stronger type that connects to documentation
about this.

However, perhaps more importantly, it addresses a serious issue where
the alignment of these pointer-like identifiers was unknown. This made
it hard to use them in pointer-like data structures. We were already
dodging this in dangerous ways to create the "all analyses" entry. In
a subsequent patch I attempted to use these with TinyPtrVector and
things fell apart in a very bad way.

And it isn't just a compile time or type system issue. Worse than that,
the actual alignment of these pointer-like opaque identifiers wasn't
guaranteed to be a useful alignment as they were just characters.

This change introduces a type to use as the "key" object whose address
forms the opaque identifier. This both forces the objects to have proper
alignment, and provides type checking that we get it right everywhere.
It also makes the types somewhat less mysterious than `void *`.

We could go one step further and introduce a truly opaque pointer-like
type to return from the `ID()` static function rather than returning
`AnalysisKey *`, but that didn't seem to be a clear win so this is just
the initial change to get to a reliably typed and aligned object serving
is a key for all the analyses.

Thanks to Richard Smith and Justin Lebar for helping pick plausible
names and avoid making this refactoring many times. =] And thanks to
Sean for the super fast review!

While here, I've tried to move away from the "PassID" nomenclature
entirely as it wasn't really helping and is overloaded with old pass
manager constructs. Now we have IDs for analyses, and key objects whose
address can be used as IDs. Where possible and clear I've shortened this
to just "ID". In a few places I kept "AnalysisID" to make it clear what
was being identified.

Differential Revision: https://reviews.llvm.org/D27031

llvm-svn: 287783
2016-11-23 17:53:26 +00:00
Chandler Carruth
8fd13bdcc8 [LCG] Start using SCC relationship predicates in the unittest.
This mostly gives us nice unittesting of the predicates themselves. I'll
start using them further in subsequent commits to help test the actual
operations performed on the graph.

llvm-svn: 287698
2016-11-22 20:35:32 +00:00
Daniil Fukalov
cb1b606dcd [SCEV] limit recursion depth of CompareSCEVComplexity
Summary:
CompareSCEVComplexity goes too deep (50+ on a quite a big unrolled loop) and runs almost infinite time.

Added cache of "equal" SCEV pairs to earlier cutoff of further estimation. Recursion depth limit was also introduced as a parameter.

Reviewers: sanjoy

Subscribers: mzolotukhin, tstellarAMD, llvm-commits

Differential Revision: https://reviews.llvm.org/D26389

llvm-svn: 287232
2016-11-17 16:07:52 +00:00
Nico Weber
443375fd3a Revert r286437 r286438, they caused PR30976
llvm-svn: 286483
2016-11-10 17:55:41 +00:00