1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-26 22:42:46 +02:00
Commit Graph

5734 Commits

Author SHA1 Message Date
Chandler Carruth
2f75ae919d [PM/AA] Delete the LibCallAliasAnalysis and all the associated
infrastructure.

This AA was never used in tree. It's infrastructure also completely
overlaps that of TargetLibraryInfo which is used heavily by BasicAA to
achieve similar goals to those stated for this analysis.

As has come up in several discussions, the use case here is still really
important, but this code isn't helping move toward that use case. Any
progress on better supporting rich AA information for runtime library
environments would likely be better off starting from scratch or
starting from TargetLibraryInfo than from this base.

Differential Revision: http://reviews.llvm.org/D12028

llvm-svn: 245155
2015-08-15 09:22:21 +00:00
David Majnemer
85a57db552 [IR] Give catchret an optional 'return value' operand
Some personality routines require funclet exit points to be clearly
marked, this is done by producing a token at the funclet pad and
consuming it at the corresponding ret instruction.  CleanupReturnInst
already had a spot for this operand but CatchReturnInst did not.
Other personality routines don't need to use this which is why it has
been made optional.

llvm-svn: 245149
2015-08-15 02:46:08 +00:00
Bjarke Hammersholt Roune
c5853844e0 [SCEV] Apply NSW and NUW flags via poison value analysis for sub, mul and shl
Summary:
http://reviews.llvm.org/D11212 made Scalar Evolution able to propagate NSW and NUW flags from instructions to SCEVs for add instructions. This patch expands that to sub, mul and shl instructions.

This change makes LSR able to generate pointer induction variables for loops like these, where the index is 32 bit and the pointer is 64 bit:

  for (int i = 0; i < numIterations; ++i)
    sum += ptr[i - offset];

  for (int i = 0; i < numIterations; ++i)
    sum += ptr[i * stride];

  for (int i = 0; i < numIterations; ++i)
    sum += ptr[3 * (i << 7)];


Reviewers: atrick, sanjoy

Subscribers: sanjoy, majnemer, hfinkel, llvm-commits, meheff, jingyue, eliben

Differential Revision: http://reviews.llvm.org/D11860

llvm-svn: 245118
2015-08-14 22:45:26 +00:00
James Molloy
025f427f26 Separate out BDCE's analysis into a separate DemandedBits analysis.
This allows other areas of the compiler to use BDCE's bit-tracking.
NFCI.

llvm-svn: 245039
2015-08-14 11:09:09 +00:00
David Majnemer
10f2d9234b [IR] Add token types
This introduces the basic functionality to support "token types".
The motivation stems from the need to perform operations on a Value
whose provenance cannot be obscured.

There are several applications for such a type but my immediate
motivation stems from WinEH.  Our personality routine enforces a
single-entry - single-exit regime for cleanups.  After several rounds of
optimizations, we may be left with a terminator whose "cleanup-entry
block" is not entirely clear because control flow has merged two
cleanups together.  We have experimented with using labels as operands
inside of instructions which are not terminators to indicate where we
came from but found that LLVM does not expect such exotic uses of
BasicBlocks.

Instead, we can use this new type to clearly associate the "entry point"
and "exit point" of our cleanup.  This is done by having the cleanuppad
yield a Token and consuming it at the cleanupret.
The token type makes it impossible to obscure or otherwise hide the
Value, making it trivial to track the relationship between the two
points.

What is the burden to the optimizer?  Well, it turns out we have already
paid down this cost by accepting that there are certain calls that we
are not permitted to duplicate, optimizations have to watch out for
such instructions anyway.  There are additional places in the optimizer
that we will probably have to update but early examination has given me
the impression that this will not be heroic.

Differential Revision: http://reviews.llvm.org/D11861

llvm-svn: 245029
2015-08-14 05:09:07 +00:00
Chandler Carruth
5effbccc9f [PM/AA] Extract the interface for GlobalsModRef into a header along with
its creation function.

This required shifting a bunch of method definitions to be out-of-line
so that we could leave most of the implementation guts in the .cpp file.

llvm-svn: 245021
2015-08-14 03:48:20 +00:00
Chandler Carruth
ac11f6dc12 [PM/AA] Hoist the interface to TBAA into a dedicated header along with
its creation function. Update the relevant includes accordingly.

llvm-svn: 245019
2015-08-14 03:33:48 +00:00
Chandler Carruth
b26f0ade4b [PM/AA] Run clang-format over TBAA code to normalize the formatting
before making substantial changes.

llvm-svn: 245017
2015-08-14 03:26:15 +00:00
Chandler Carruth
30822a92e0 [PM/AA] Clean up the SCEV-AA comment formatting and typos.
llvm-svn: 245015
2015-08-14 03:14:50 +00:00
Chandler Carruth
aa8337ce94 [PM/AA] Run clang-format over the SCEV-AA code to normalize the
formatting.

llvm-svn: 245014
2015-08-14 03:12:16 +00:00
Chandler Carruth
2e1bf8f70a [PM/AA] Hoist the SCEV-AA interface to its own header and pull the
creation function into that header.

llvm-svn: 245013
2015-08-14 03:11:16 +00:00
Chandler Carruth
0e1aede735 [PM/AA] Hoist ScopedNoAliasAA's interface into a header and move the
creation function there.

Same basic refactoring as the other alias analyses. Nothing special
required this time around.

llvm-svn: 245012
2015-08-14 02:55:50 +00:00
Chandler Carruth
ca17b4c203 [PM/AA] Hoist the value handle definition for CFLAA into the header to
satisfy libc++'s std::forward_list which requires the value type to be
complete.

llvm-svn: 245011
2015-08-14 02:50:34 +00:00
Chandler Carruth
3a54a873d9 [PM/AA] Run clang-format over the ScopedNoAliasAA pass prior to making
substantial changes to normalize any formatting.

llvm-svn: 245010
2015-08-14 02:46:07 +00:00
Chandler Carruth
394485dd25 [PM/AA] Extract a minimal interface for CFLAA to its own header file.
I've used forward declarations and reorderd the source code some to make
this reasonably clean and keep as much of the code as possible in the
source file, including all the stratified set details. Just the basic AA
interface and the create function are in the header file, and the header
file is now included into the relevant locations.

llvm-svn: 245009
2015-08-14 02:42:20 +00:00
Chandler Carruth
b1e225bebc [PM/AA] Sink all the actual code from AliasAnalysisCounter back into the
.cpp file to make the header much less noisy.

Also makes it easy to use a static helper rather than a public method
for printing lines of stats.

llvm-svn: 245006
2015-08-14 02:12:12 +00:00
Chandler Carruth
d21e2548b8 [PM/AA] Run clang-format over this code to establish a clean baseline
for subsequent changes.

llvm-svn: 245005
2015-08-14 02:07:05 +00:00
Chandler Carruth
5e05aded1c [PM/AA] Hoist the AA counter pass into a header to match the analysis
pattern.

Also hoist the creation routine out of the generic header and into the
pass header now that we have one.

I've worked to not make any changes, even formatting ones here. I'll
clean up the formatting and other things in a follow-up patch now that
the code is in the right place.

llvm-svn: 245004
2015-08-14 02:05:41 +00:00
Chandler Carruth
fe6f1d651e [PM/AA] Remove the function names and class names from doxygen comments
and generally clean up their formatting.

llvm-svn: 245002
2015-08-14 01:43:46 +00:00
Chandler Carruth
9f00c47578 [PM/AA] Move the LibCall AA creation routine declaration to that
analysis's header file to be more consistent with other analyses.

llvm-svn: 245001
2015-08-14 01:43:02 +00:00
Chandler Carruth
d3f9d78554 [PM/AA] Run clang-format over LibCallAliasAnalysis prior to making
substantial changes needed for the new pass manager's AA integration.

llvm-svn: 245000
2015-08-14 01:38:25 +00:00
Chandler Carruth
97b830a9d8 [PM/AA] Remove the AliasDebugger pass.
This debugger was designed to catch places where the old update API was
failing to be used correctly. As I've removed the update API, it no
longer serves any purpose. We can introduce new debugging aid passes
around any future work w.r.t. updating AAs.

Note that I've updated the documentation here, but really I need to
rewrite the documentation to carefully spell out the ideas around
stateful AA and how things are changing in the AA world. However, I'm
hoping to do that as a follow-up to the refactoring of the AA
infrastructure to work in both old and new pass managers so that I can
write the documentation specific to that world.

Differential Revision: http://reviews.llvm.org/D11984

llvm-svn: 244825
2015-08-12 22:54:47 +00:00
Chandler Carruth
da8f360fe7 [PM/AA] Add missing static dependency edges from DSE and memdep to TLI.
I forgot to add these in r244780 and r244778. Sorry about that.

Also order the static dependencies in a lexicographical order.

llvm-svn: 244787
2015-08-12 18:10:45 +00:00
Chandler Carruth
3fd9750c3e [PM/AA] Have memdep explicitly get and use TargetLibraryInfo rather than
relying on sneaking it out of its AliasAnalysis.

This abuse of AA (to shuffle TLI around rather than explicitly depending
on it) is going away with my refactor of AA.

llvm-svn: 244778
2015-08-12 17:47:44 +00:00
James Molloy
b13284a73a [ValueTracking] Tweak a comment slightly
Hal asked for this change in D11146, but I missed it when I committed originally.

llvm-svn: 244754
2015-08-12 15:11:43 +00:00
James Molloy
ecd6525b24 Add support for floating-point minnum and maxnum
The select pattern recognition in ValueTracking (as used by InstCombine
and SelectionDAGBuilder) only knew about integer patterns. This teaches
it about minimum and maximum operations.

matchSelectPattern() has been extended to return a struct containing the
existing Flavor and a new enum defining the pattern's behavior when
given one NaN operand.

C minnum() is defined to return the non-NaN operand in this case, but
the idiomatic C "a < b ? a : b" would return the NaN operand.

ARM and AArch64 at least have different instructions for these different cases.

llvm-svn: 244580
2015-08-11 09:12:57 +00:00
Michael Kuperstein
6d16ba7233 [GMR] Be a bit smarter about which globals don't alias when doing recursive lookups
Should hopefully fix the remainder of PR24288.

Differential Revision: http://reviews.llvm.org/D11900

llvm-svn: 244575
2015-08-11 08:06:44 +00:00
Adam Nemet
3f19b8eced [LAA] Change name from addRuntimeCheck to addRuntimeChecks, NFC
This was requested by Hal in D11205.

llvm-svn: 244540
2015-08-11 00:09:37 +00:00
Igor Laevsky
dc6b4a78a4 [IndVarSimplify] Make cost estimation in RewriteLoopExitValues smarter
Differential Revision: http://reviews.llvm.org/D11687

llvm-svn: 244474
2015-08-10 18:23:58 +00:00
Silviu Baranga
fccd898d4b [TTI] Add a hook for specifying per-target defaults for Interleaved Accesses
Summary:
This adds a hook to TTI which enables us to selectively turn on by default
interleaved access vectorization for targets on which we have have performed
the required benchmarking.

Reviewers: rengolin

Subscribers: rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D11901

llvm-svn: 244449
2015-08-10 14:50:54 +00:00
Michael Kruse
8153cc8a10 [RegionInfo] Add debug-time region viewer functions
Summary:
Analogously to Function::viewCFG(), RegionInfo::view() and RegionInfo::viewOnly() are meant to be called in debugging sessions. They open a viewer to show how RegionInfo currently understands the region hierarchy.

The functions viewRegion(Function*) and viewRegionOnly(Function*) invoke a fresh region analysis of the function in contrast to viewRegion(RegionInfo*) and viewRegionOnly(RegionInfo*) which show the current analysis result.

Reviewers: grosser

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11875

llvm-svn: 244444
2015-08-10 13:21:59 +00:00
Michael Kruse
3373ce735a [RegionInfo] Use RegionInfo* instead of RegionInfoPass* as graph type
This allows printing region graphs when only the RegionInfo (e.g. Region::getRegionInfo()), but no RegionInfoPass object is available.

Specifically, we will use this to print RegionInfo graphs in the debugger.

Differential version: http://reviews.llvm.org/D11874

Reviewed-by: grosser
llvm-svn: 244442
2015-08-10 12:57:23 +00:00
Adam Nemet
4451b15c71 [LAA] Remove unused pointer partition argument from needsChecking(), NFC
This is no longer used in any of the callers.  Also remove the logic of
handling this argument.

llvm-svn: 244421
2015-08-09 20:06:08 +00:00
Adam Nemet
03ba5040ec [LAA] Remove unused pointer partition argument from generateChecks, NFC
LoopDistribution does its own filtering now.

llvm-svn: 244420
2015-08-09 20:06:06 +00:00
David Majnemer
3ac7e03604 [PHITransAddr] Don't assume that instruction operands are translatable
We can only PHI translate instructions.  In our attempt to PHI translate
a bitcast, we attempt to translate its operand; however, the operand
might be an argument or a global instead of an instruction.  Benignly
bail out when this happens.

This fixes PR24397.

Differential Revision: http://reviews.llvm.org/D11879

llvm-svn: 244418
2015-08-09 15:43:02 +00:00
Benjamin Kramer
69a3fdb314 Fix some comment typos.
llvm-svn: 244402
2015-08-08 18:27:36 +00:00
Adam Nemet
c67a08de7f [LAA] Remove unused pointer partition argument from getNumberOfChecks, NFC
This is unused after filtering checks was moved to the clients.

As a result, we can just return the number of the checks in the
precomputed set.

llvm-svn: 244369
2015-08-07 22:44:21 +00:00
Adam Nemet
36760f098d [LAA] Make the set of runtime checks part of the state of LAA, NFC
This is the full set of checks that clients can further filter. IOW,
it's client-agnostic.  This makes LAA complete in the sense that it now
provides the two main results of its analysis precomputed:

1. memory dependences via getDepChecker().getInsterestingDependences()
2. run-time checks via getRuntimePointerCheck().getChecks()

However, as a consequence we now compute this information pro-actively.
Thus if the client decides to skip the loop based on the dependences
we've computed the checks unnecessarily.  In order to see whether this
was a significant overhead I checked compile time on SPEC2k6 LTO bitcode
files.  The change was in the noise.

The checks are generated in canCheckPtrAtRT, at the same place where we
used to call groupChecks to merge checks.

llvm-svn: 244368
2015-08-07 22:44:15 +00:00
Adam Nemet
59fd1cc3c9 [LAA] Remove unused pointer partition argument from print(), NFC
This is now handled in the client.  No need for LAA to provide this
variant.

llvm-svn: 244349
2015-08-07 19:44:48 +00:00
Sanjoy Das
85ee23b440 [IndVars] Fix PR24356.
Unsigned predicates increase or decrease agnostic of the signs of their
increments.

llvm-svn: 244265
2015-08-06 20:43:41 +00:00
Pete Cooper
598f1f2fd1 Convert a bunch of loops to foreach. NFC.
After r244074, we now have a successors() method to iterate over
all the successors of a TerminatorInst.  This commit changes a bunch
of eligible loops to use it.

llvm-svn: 244260
2015-08-06 20:22:46 +00:00
Nico Rieck
4acab3dcc9 Rename inst_range() to instructions() for consistency. NFC
llvm-svn: 244248
2015-08-06 19:10:45 +00:00
Quentin Colombet
4323bdacbd [Reassociation] Fix miscompile for va_arg arguments.
iisUnmovableInstruction() had a list of instructions hardcoded which are
considered unmovable. The list lacked (at least) an entry for the va_arg
and cmpxchg instructions.
Fix this by introducing a new Instruction::mayBeMemoryDependent()
instead of maintaining another instruction list.

Patch by Matthias Braun <matze@braunis.de>.

Differential Revision: http://reviews.llvm.org/D11577

rdar://problem/22118647

llvm-svn: 244244
2015-08-06 18:44:34 +00:00
Chandler Carruth
6cbdb95607 [PM/AA] Clean up and homogenize comments throughout basic-aa.
llvm-svn: 244200
2015-08-06 08:17:06 +00:00
Chandler Carruth
5ffeb1ea3f [PM/AA] Run clang-format over all of basic-aa before making more
substantive edits.

llvm-svn: 244198
2015-08-06 07:57:58 +00:00
Chandler Carruth
a0655c50ee [PM/AA] Hoist the interface for BasicAA into a header file.
This is the first mechanical step in preparation for making this and all
the other alias analysis passes available to the new pass manager. I'm
factoring out all the totally boring changes I can so I'm moving code
around here with no other changes. I've even minimized the formatting
churn.

I'll reformat and freshen comments on the interface now that its located
in the right place so that the substantive changes don't triger this.

llvm-svn: 244197
2015-08-06 07:33:15 +00:00
Chandler Carruth
595690977b [PM/AA] Simplify the AliasAnalysis interface by removing a wrapper
around a DataLayout interface in favor of directly querying DataLayout.

This wrapper specifically helped handle the case where this no
DataLayout, but LLVM now requires it simplifynig all of this. I've
updated callers to directly query DataLayout. This in turn exposed
a bunch of places where we should have DataLayout readily available but
don't which I've fixed. This then in turn exposed that we were passing
DataLayout around in a bunch of arguments rather than making it readily
available so I've also fixed that.

No functionality changed.

llvm-svn: 244189
2015-08-06 02:05:46 +00:00
Wei Mi
3d7c898206 Add a stat to show how often the limit to decompose GEPs in BasicAA is reached.
Differential Revision: http://reviews.llvm.org/D9689

llvm-svn: 244174
2015-08-05 23:40:30 +00:00
Chandler Carruth
b00f078956 [PM] Remove a failed attempt to port the CallGraph analysis to the new
pass manager.

This never worked, and won't ever work. It was actually why I ended up
building the LazyCallGraph set of code which is more more effectively
wired up to the new pass manager. This accidentally got committed when
I was trying to land a cleanup of the code organization in the other
parts of this file. =[ My bad, but fortunately Dave was keen eyed enough
to spot that this code couldn't possibly work. =]

llvm-svn: 244127
2015-08-05 21:04:31 +00:00
David Blaikie
4b09d11d27 -Wdeprecated cleanup: Make CallGraph movable by default by using unique_ptr members rather than raw pointers.
The only place that tries to return a CallGraph by value
(CallGraphAnalysis::run) doesn't seem to be used right now, but it's a
reasonable bit of cleanup anyway.

llvm-svn: 244122
2015-08-05 20:55:50 +00:00
Chandler Carruth
98500f2974 [TTI] Make the cost APIs in TargetTransformInfo consistently use 'int'
rather than 'unsigned' for their costs.

For something like costs in particular there is a natural "negative"
value, that of savings or saved cost. As a consequence, there is a lot
of code that subtracts or creates negative values based on cost, all of
which is prone to awkwardness or bugs when dealing with an unsigned
type. Similarly, we *never* want these values to wrap, as that would
cause Very Bad code generation (likely percieved as an infinite loop as
we try to emit over 2^32 instructions or some such insanity).

All around 'int' seems a much better fit for these basic metrics. I've
added asserts to ensure that at least the TTI interface never returns
negative numbers here. If we ever have a use case for negative numbers,
we can remove this, but this way a bug where someone used '-1' to
produce a 'very large' cost will be caught by the assert.

This passes all tests, and is also UBSan clean.

No functional change intended.

Differential Revision: http://reviews.llvm.org/D11741

llvm-svn: 244080
2015-08-05 18:08:10 +00:00
Chandler Carruth
9d06f58b97 [GMR] Teach the conservative path of GMR to catch even more easy cases.
In PR24288 it was pointed out that the easy case of a non-escaping
global and something that *obviously* required an escape sometimes is
hidden behind PHIs (or selects in theory). Because we have this binary
test, we can easily just check that all possible input values satisfy
the requirement. This is done with a (very small) recursion through PHIs
and selects. With this, the specific example from the PR is correctly
folded by GVN.

Differential Revision: http://reviews.llvm.org/D11707

llvm-svn: 244078
2015-08-05 17:58:30 +00:00
Benjamin Kramer
21b192d068 [AA] Use CallSite cast idiom. No functionality change.
llvm-svn: 244045
2015-08-05 14:16:44 +00:00
Adam Nemet
45d547aa42 [LAA] Remove unused pointer partition argument from addRuntimeCheck, NFC
This variant of addRuntimeCheck is only used now from the LoopVectorizer
which does not use this parameter.

llvm-svn: 243955
2015-08-04 05:16:20 +00:00
Adam Nemet
9cd964a03a [LAA] Remove unused needsAnyChecking(), NFC
llvm-svn: 243921
2015-08-03 23:33:03 +00:00
David Blaikie
95e59129ca -Wdeprecated-clean: Fix cases of violating the rule of 5 in ways that are deprecated in C++11
Various value handles needed to be copy constructible and copy
assignable (mostly for their use in DenseMap). But to avoid an API that
might allow accidental slicing, make these members protected in the base
class and make derived classes final (the special members become
implicitly public there - but disallowing further derived classes that
might be sliced to the intermediate type).

Might be worth having a warning a bit like -Wnon-virtual-dtor that
catches public move/copy assign/ctors in classes with virtual functions.
(suppressable in the same way - by making them protected in the base,
and making the derived classes final) Could be fancier and only diagnose
them when they're actually called, potentially.

Also allow a few default implementations where custom implementations
(especially with non-standard return types) were implemented.

llvm-svn: 243909
2015-08-03 22:30:24 +00:00
Craig Topper
bbb2ce25cc De-constify pointers to Type since they can't be modified. NFC
This was already done in most places a while ago. This just fixes the ones that crept in over time.

llvm-svn: 243842
2015-08-01 22:20:21 +00:00
David Blaikie
fb8a7d97d2 -Wdeprecated-clean: Fix cases of violating the rule of 5 in ways that are deprecated in C++11
llvm-svn: 243788
2015-07-31 21:37:09 +00:00
David Majnemer
34ee3789f3 New EH representation for MSVC compatibility
This introduces new instructions neccessary to implement MSVC-compatible
exception handling support.  Most of the middle-end and none of the
back-end haven't been audited or updated to take them into account.

Differential Revision: http://reviews.llvm.org/D11097

llvm-svn: 243766
2015-07-31 17:58:14 +00:00
Bruno Cardoso Lopes
8ce897de05 [CaptureTracker] Provide an ordered basic block to PointerMayBeCapturedBefore
This patch is a follow up from r240560 and is a step further into
mitigating the compile time performance issues in CaptureTracker.

By providing the CaptureTracker with a "cached ordered basic block"
instead of computing it every time, MemDepAnalysis can use this cache
throughout its calls to AA->callCapturesBefore, avoiding to recompute it
for every scanned instruction. In the same testcase used in r240560,
compile time is reduced from 2min to 30s.

This also fixes PR22348.

rdar://problem/19230319
Differential Revision: http://reviews.llvm.org/D11364

llvm-svn: 243750
2015-07-31 14:31:35 +00:00
Eric Christopher
520f118385 Rename hasCompatibleFunctionAttributes->areInlineCompatible based
on suggestions. Currently the function is only used for inline purposes
and this is more descriptive for the use.

llvm-svn: 243578
2015-07-29 22:09:48 +00:00
Jingyue Wu
a6a8a2d2b1 [SCEV] Apply NSW and NUW flags via poison value analysis
Summary:
Make Scalar Evolution able to propagate NSW and NUW flags from instructions to SCEVs in some cases. This is based on reasoning about when poison from instructions with these flags would trigger undefined behavior. This gives a 13% speed-up on some Eigen3-based Google-internal microbenchmarks for NVPTX.

There does not seem to be clear agreement about when poison should be considered to propagate through instructions. In this analysis, poison propagates only in cases where that should be uncontroversial.

This change makes LSR able to create induction variables for expressions like &ptr[i + offset] for loops like this:

  for (int i = 0; i < limit; ++i) {
    sum += ptr[i + offset];
  }

Here ptr is a 64 bit pointer and offset is a 32 bit integer. For NVPTX, LSR currently creates an induction variable for i + offset instead, which is not as fast. Improving this situation is what brings the 13% speed-up on some Eigen3-based Google-internal microbenchmarks for NVPTX.


There are more details in this discussion on llvmdev.
June: http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-June/thread.html#87234
July: http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-July/thread.html#87392

Patch by Bjarke Roune

Reviewers: eliben, atrick, sanjoy

Subscribers: majnemer, hfinkel, jingyue, meheff, llvm-commits

Differential Revision: http://reviews.llvm.org/D11212

llvm-svn: 243460
2015-07-28 18:22:40 +00:00
Bruno Cardoso Lopes
cdfacd96b4 [LVI] Cleanup whitespaces. NFC
llvm-svn: 243430
2015-07-28 15:53:21 +00:00
Silviu Baranga
97f890ebca [LAA] Add clarifying comments for the checking pointer grouping algorithm. NFC
llvm-svn: 243416
2015-07-28 13:44:08 +00:00
Chandler Carruth
d51b8c5480 [GMR] Teach GlobalsModRef to distinguish an important and safe case of
no-alias with non-addr-taken globals: they cannot alias a captured
pointer.

If the non-global underlying object would have been a capture were it to
alias the global, we can firmly conclude no-alias. It isn't reasonable
for a transformation to introduce a capture in a way observable by an
alias analysis. Consider, even if it were to temporarily capture one
globals address into another global and then restore the other global
afterward, there would be no way for the load in the alias query to
observe that capture event correctly. If it observes it then the
temporary capturing would have changed the meaning of the program,
making it an invalid transformation. Even instrumentation passes or
a pass which is synthesizing stores to global variables to expose race
conditions in programs could not trigger this unless it queried the
alias analysis infrastructure mid-transform, in which case it seems
reasonable to return results from before the transform started.

See the comments in the change for a more detailed outlining of the
theory here.

This should address the primary performance regression found when the
non-conservatively-correct path of the alias query was disabled.

Differential Revision: http://reviews.llvm.org/D11410

llvm-svn: 243405
2015-07-28 11:11:11 +00:00
Chandler Carruth
e831556fca [GMR] Fix a long-standing bug in GlobalsModRef where it failed to clear
out the per-function modref data structures when functions were deleted
or when globals were deleted.

I don't actually know how the global deletion side of this bug hasn't
been hit before, but for the other it just-so-happens that functions
aren't likely to be deleted in the particular part of the LTO pipeline
where we currently enable GMR, so we got lucky.

With this patch, I can self-host with GMR enabled in the normal pass
pipeline!

I was a bit concerned about the compile-time impact of this chang, which
is part of what motivated my prior string of patches to make the
per-function datastructure very dense and fast to walk. With those
changes in place, I can't measure a significant compile time difference
(the difference is around 0.1% which is *way* below the noise) before
and after this patch when building a linked bitcode for all of Clang.

Differential Revision: http://reviews.llvm.org/D11453

llvm-svn: 243385
2015-07-28 06:01:57 +00:00
Adam Nemet
7b14f72b61 [LAA] Split out a helper to print a collection of memchecks
This is effectively an NFC but we can no longer print the index of the
pointer group so instead I print its address.  This still lets us
cross-check the section that list the checks against the section that
list the groups (see how I modified the test).

E.g. before we printed this:

    Run-time memory checks:
    Check 0:
      Comparing group 0:
        %arrayidxC = getelementptr inbounds i16, i16* %c, i64 %store_ind
        %arrayidxC1 = getelementptr inbounds i16, i16* %c, i64 %store_ind_inc
      Against group 1:
        %arrayidxA = getelementptr i16, i16* %a, i64 %ind
        %arrayidxA1 = getelementptr i16, i16* %a, i64 %add
    ...
    Grouped accesses:
      Group 0:
        (Low: %c High: (78 + %c))
          Member: {%c,+,4}<%for.body>
          Member: {(2 + %c),+,4}<%for.body>

Now we print this (changes are underlined):

    Run-time memory checks:
    Check 0:
      Comparing group (0x7f9c6040c320):
                       ~~~~~~~~~~~~~~
        %arrayidxC1 = getelementptr inbounds i16, i16* %c, i64 %store_ind_inc
        %arrayidxC = getelementptr inbounds i16, i16* %c, i64 %store_ind
      Against group (0x7f9c6040c358):
                     ~~~~~~~~~~~~~~
        %arrayidxA1 = getelementptr i16, i16* %a, i64 %add
        %arrayidxA = getelementptr i16, i16* %a, i64 %ind
    ...
    Grouped accesses:
      Group 0x7f9c6040c320:
            ~~~~~~~~~~~~~~
        (Low: %c High: (78 + %c))
          Member: {(2 + %c),+,4}<%for.body>
          Member: {%c,+,4}<%for.body>

llvm-svn: 243354
2015-07-27 23:54:41 +00:00
Sanjoy Das
7c45e992cb [TargetTransformInfo][NFCI] Add TargetTransformInfo::isZExtFree.
Summary:
This function is not used in this change but will be used in a
subsequent change.

Reviewers: mcrosier, chandlerc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9180

llvm-svn: 243347
2015-07-27 23:27:43 +00:00
Sanjoy Das
4c063981c7 [IndVars] Make loop varying predicates loop invariant.
Summary:
Was D9784: "Remove loop variant range check when induction variable is
strictly increasing"

This change re-implements D9784 with the two differences:

 1. It does not use SCEVExpander and does not generate new
    instructions.  Instead, it does a quick local search for existing
    `llvm::Value`s that it needs when modifying the `icmp`
    instruction.

 2. It is more general -- it deals with both increasing and decreasing
    induction variables.

I've added all of the tests included with D9784, and two more.

As an example on what this change does (copied from D9784):

Given C code:

```
for (int i = M; i < N; i++) // i is known not to overflow
  if (i < 0) break;
  a[i] = 0;
}
```

This transformation produces:

```
for (int i = M; i < N; i++)
  if (M < 0) break;
  a[i] = 0;
}
```

Which can be unswitched into:

```
if (!(M < 0))
  for (int i = M; i < N; i++)
    a[i] = 0;
}
```

I went back and forth on whether the top level logic should live in
`SimplifyIndvar::eliminateIVComparison` or be put into its own
routine.  Right now I've put it under `eliminateIVComparison` because
even though the `icmp` is not *eliminated*, it no longer is an IV
comparison.  I'm open to putting it in its own helper routine if you
think that is better.

Reviewers: reames, nicholas, atrick

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11278

llvm-svn: 243331
2015-07-27 21:42:49 +00:00
Adam Nemet
d3a021d6fe [LAA] Upper-case variable names, NFC
llvm-svn: 243313
2015-07-27 19:38:50 +00:00
Adam Nemet
c7915a9342 [LAA] Split out a helper from addRuntimeCheck to generate the check, NFC
llvm-svn: 243312
2015-07-27 19:38:48 +00:00
Matt Arsenault
ff3dae692f Fix assert when inlining a constantexpr addrspacecast
The pointer size of the addrspacecasted pointer might not have matched,
so this would have hit an assert in accumulateConstantOffset.

I think this was here to allow constant folding of a load of an
addrspacecasted constant. Accumulating the offset through the
addrspacecast doesn't make much sense, so something else is necessary
to allow folding the load through this cast.

llvm-svn: 243300
2015-07-27 18:31:03 +00:00
NAKAMURA Takumi
1c7fecc1fe LoopAccessAnalysis.cpp: Tweak r243239 to avoid side effects. It caused different emissions between gcc and clang.
llvm-svn: 243258
2015-07-27 01:35:30 +00:00
Jingyue Wu
91cf96359e Roll forward r243250
r243250 appeared to break clang/test/Analysis/dead-store.c on one of the build
slaves, but I couldn't reproduce this failure locally. Probably a false
positive as I saw this test was broken by r243246 or r243247 too but passed
later without people fixing anything.

llvm-svn: 243253
2015-07-26 19:10:03 +00:00
Jingyue Wu
61ee29a54f Revert r243250
breaks tests

llvm-svn: 243251
2015-07-26 18:30:13 +00:00
Jingyue Wu
f4362fe267 [TTI/CostModel] improve TTI::getGEPCost and use it in CostModel::getInstructionCost
Summary:
This patch updates TargetTransformInfoImplCRTPBase::getGEPCost to consider
addressing modes. It now returns TCC_Free when the GEP can be completely folded
to an addresing mode.

I started this patch as I refactored SLSR. Function isGEPFoldable looks common
and is indeed used by some WIP of mine. So I extracted that logic to getGEPCost.

Furthermore, I noticed getGEPCost wasn't directly tested anywhere. The best
testing bed seems CostModel, but its getInstructionCost method invokes
getAddressComputationCost for GEPs which provides very coarse estimation. So
this patch also makes getInstructionCost call the updated getGEPCost for GEPs.
This change inevitably breaks some tests because the cost model changes, but
nothing looks seriously wrong -- if we believe the new cost model is the right
way to go, these tests should be updated.

This patch is not perfect yet -- the comments in some tests need to be updated.
I want to know whether this is a right approach before fixing those details.

Reviewers: chandlerc, hfinkel

Subscribers: aschwaighofer, llvm-commits, aemerson

Differential Revision: http://reviews.llvm.org/D9819

llvm-svn: 243250
2015-07-26 17:28:13 +00:00
Adam Nemet
1745f79e9b [LAA] Begin moving the logic of generating checks out of addRuntimeCheck
Summary:
The goal is to start moving us closer to the model where
RuntimePointerChecking will compute and store the checks.  Then a client
can filter the check according to its requirements and then use the
filtered list of checks with addRuntimeCheck.

Before the patch, this is all done in addRuntimeCheck.  So the patch
starts to split up addRuntimeCheck while providing the old API under
what's more or less a wrapper now.

The new underlying addRuntimeCheck takes a collection of checks now,
expands the code for the bounds then generates the code for the checks.

I am not completely happy with making expandBounds static because now it
needs so many explicit arguments but I don't want to make the type
PointerBounds part of LAI.  This should get fixed when addRuntimeCheck
is moved to LoopVersioning where it really belongs, IMO.

Audited the assembly diff of the testsuite (including externals).  There
is a tiny bit of assembly churn that is due to the different order the
code for the bounds is expanded now
(MultiSource/Benchmarks/Prolangs-C/bison/conflicts.s and with LoopDist
on 456.hmmer/fast_algorithms.s).

Reviewers: hfinkel

Subscribers: klimek, llvm-commits

Differential Revision: http://reviews.llvm.org/D11205

llvm-svn: 243239
2015-07-26 05:32:14 +00:00
Chandler Carruth
017dde1e72 [GMR] Switch the function info we store for every function to be a much
more dense datastructure. We actually only have 3 bits of information
and an often-null pointer here. This fits very nicely into a
pointer-size value in the DenseMap from Function -> Info. Then we take
one more pointer hop to get to a secondary DenseMap from GlobalValue ->
ModRefInfo when we actually have precise info for particular globals.

This is more code than I would really like to do this packing, but it
ended up reasonably cleanly laid out. It should ensure we don't hit
scaling limitations with more widespread use of GMR.

llvm-svn: 242991
2015-07-23 07:50:52 +00:00
Craig Topper
de990d5486 [ScalarEvolution] Change addRequired to addRequiredTransitive on two passes where ScalarEvolution stores long lived raw pointers to objects those passes own.
This prevents the pointers from dangling when those passes are freed.

http://reviews.llvm.org/D11236
Patch by Steve King.

llvm-svn: 242989
2015-07-23 07:33:48 +00:00
Chandler Carruth
18045e8fbf [GMR] Further improve the FunctionInfo API inside of GlobalsModRef, NFC.
This takes the operation of merging a callee's information into the
current information and embeds it into the FunctionInfo type itself.
This is much cleaner as now we don't need to expose iteration of the
globals, etc.

Also, switched all the uses of a raw integer two maintain the mod/ref
info during the SCC walk into just directly manipulating it in the
FunctionInfo object.

llvm-svn: 242976
2015-07-23 00:12:32 +00:00
Chandler Carruth
3c531f97b2 [GMR] Wrap all of the per-function information behind a more strongly
typed interface as a precursor to rewriting how it is stored.

This way we know that the access paths are controlled and it should be
easy to store these bits in a different way.

No functionality changed.

llvm-svn: 242974
2015-07-22 23:56:31 +00:00
Chandler Carruth
2e896f4f08 [PM/AA] Extract the ModRef enums from the AliasAnalysis class in
preparation for de-coupling the AA implementations.

In order to do this, they had to become fake-scoped using the
traditional LLVM pattern of a leading initialism. These can't be actual
scoped enumerations because they're bitfields and thus inherently we use
them as integers.

I've also renamed the behavior enums that are specific to reasoning
about the mod/ref behavior of functions when called. This makes it more
clear that they have a very narrow domain of applicability.

I think there is a significantly cleaner API for all of this, but
I don't want to try to do really substantive changes for now, I just
want to refactor the things away from analysis groups so I'm preserving
the exact original design and just cleaning up the names, style, and
lifting out of the class.

Differential Revision: http://reviews.llvm.org/D10564

llvm-svn: 242963
2015-07-22 23:15:57 +00:00
Chandler Carruth
3b397f6c67 [GMR] Continue my quest to remove linked datastructures from GMR, NFC.
This replaces the next-to-last std::map with a DenseMap. While DenseMap
doesn't yet make tons of sense (there are 32 bytes or so in the value
type), my next change will reduce the value type to a single pointer --
we only need a pointer and 3 bits, and that is exactly what we can have.

llvm-svn: 242956
2015-07-22 22:32:34 +00:00
David Majnemer
dd0c248e77 [ConstantFolding] Support folding loads from a GlobalAlias
The MSVC ABI requires that we generate an alias for the vtable which
means looking through a GlobalAlias which cannot be overridden improves
our ability to devirtualize.

Found while investigating PR20801.

Patch by Andrew Zhogin!

Differential Revision: http://reviews.llvm.org/D11306

llvm-svn: 242955
2015-07-22 22:29:30 +00:00
Chandler Carruth
bba928b0ab [GMR] Make the collection of readers and writers of globals much more
efficient, NFC.

Previously, we built up vectors of function pointers to track readers
and writers. The primary problem here is that we would add the same
function to this vector every time we found an instruction that reads or
writes to the pointer. This could be a *lot* of redudant function
pointers. Instead of doing that, we can use a SmallPtrSet.

This does more than just reduce the size of the list of readers or
writers. We walk the entire lists of each and do a map lookup for each
one. By having sets, we will only do one map lookup per reader or writer
function.

But only one user of the pointer analyzer actually needs this
information, so we can also skip accumulating it (and doing a lot of
heap allocations) for all the other pointer analysis. This is
particularly useful because there are very many more pointers in some of
the other cases.

llvm-svn: 242950
2015-07-22 22:10:05 +00:00
Hans Wennborg
34fee45808 Fix -Wextra-semi warnings.
Patch by Eugene Zelenko!

Differential Revision: http://reviews.llvm.org/D11400

llvm-svn: 242930
2015-07-22 20:46:11 +00:00
Chandler Carruth
a4c7c6c264 [GMR] Switch from std::set to SmallPtrSet. NFC.
This almost certainly doesn't matter in some deep sense, but std::set is
essentially always going to be slower here. Now the alias query should
be essentially constant time instead of having to chase the set tree
each time.

llvm-svn: 242893
2015-07-22 11:47:54 +00:00
Chandler Carruth
41327e670f [GMR] Only look in the associated allocs map for an underlying value if
it wasn't one of the indirect globals (which clearly cannot be an
allocation function call). Also only do a single lookup into this map
instead of two. NFC.

llvm-svn: 242892
2015-07-22 11:43:24 +00:00
Chandler Carruth
c9d371fb74 [GMR] Switch to a DenseMap and clean up the iteration loop. NFC.
Since we have to iterate this map not that infrequently, we should use
a map that is efficient for iteration. It is also almost certainly much
faster for lookups as well. There is more to do in terms of reducing the
wasted overhead of GMR's runtime though. Not sure how much is worthwhile
though.

The loop improvements should hopefully address the code review that
Duncan gave when he saw this code as I moved it around.

llvm-svn: 242891
2015-07-22 11:36:09 +00:00
Chandler Carruth
18b5a704b2 [PM/AA] Try to fix libc++ build bots which require the type used in
std::list to be complete by hoisting the entire definition into the
class. Ugly, but hopefully works.

llvm-svn: 242888
2015-07-22 11:10:41 +00:00
Chandler Carruth
cdb8301de0 [PM/AA] Remove the last of the legacy update API from AliasAnalysis as
part of simplifying its interface and usage in preparation for porting
to work with the new pass manager.

Note that this will likely expose that we have dead arguments, members,
and maybe even pass requirements for AA. I'll be cleaning those up in
seperate patches. This just zaps the actual update API.

Differential Revision: http://reviews.llvm.org/D11325

llvm-svn: 242881
2015-07-22 09:49:59 +00:00
Chandler Carruth
939d6a13a3 [PM/AA] Put the 'final' keyword in the correct place. And actually
succeed at compiling my change before committing it too!

llvm-svn: 242879
2015-07-22 09:34:18 +00:00
Chandler Carruth
5204193caa [PM/AA] Replace the only use of the AliasAnalysis::deleteValue API (in
GlobalsModRef) with CallbackVHs that trigger the same behavior.

This is technically more expensive, but in benchmarking some LTO runs,
it seems unlikely to even be above the noise floor. The only way I was
able to measure the performance of GMR at all was to run nothing else
but this one analysis on a linked clang bitcode file. The call graph
analysis still took 5x more time than GMR, and this change at most made
GMR 2% slower (this is well within the noise, so its hard for me to be
sure that this is an actual change). However, in a real LTO run over the
same bitcode, the GMR run takes so little time that the pass timers
don't measure it.

With this, I can remove the last update API from the AliasAnalysis
interface, but I'll actually remove the interface hook point in
a follow-up commit.

Differential Revision: http://reviews.llvm.org/D11324

llvm-svn: 242878
2015-07-22 09:27:58 +00:00
Jingyue Wu
98a3974775 [MDA] change BlockScanLimit into a command line option.
Summary:
In the benchmark (https://github.com/vetter/shoc) we are researching,
the duplicated load is not eliminated because MemoryDependenceAnalysis
hit the BlockScanLimit. This patch change it into a command line option
instead of a hardcoded value.

Patched by Xuetian Weng. 

Test Plan: test/Analysis/MemoryDependenceAnalysis/memdep-block-scan-limit.ll

Reviewers: jingyue, reames

Subscribers: reames, llvm-commits

Differential Revision: http://reviews.llvm.org/D11366

llvm-svn: 242842
2015-07-21 21:50:39 +00:00
Sanjoy Das
24f4212b35 [SCEV][NFC] Fix a typo in a comment.
llvm-svn: 242834
2015-07-21 20:59:22 +00:00
Karthik Bhat
bceb74b45e Constfold trunc,rint,nearbyint,ceil and floor using APFloat
A patch by Chakshu Grover!
This patch allows constfolding of trunc,rint,nearbyint,ceil and floor intrinsics using APFloat class.
Differential Revision: http://reviews.llvm.org/D11144

llvm-svn: 242763
2015-07-21 08:52:23 +00:00
Chandler Carruth
c58ca38a3a [PM/AA] Remove the addEscapingUse update API that won't be easy to
directly model in the new PM.

This also was an incredibly brittle and expensive update API that was
never fully utilized by all the passes that claimed to preserve AA, nor
could it reasonably have been extended to all of them. Any number of
places add uses of values. If we ever wanted to reliably instrument
this, we would want a callback hook much like we have with ValueHandles,
but doing this for every use addition seems *extremely* expensive in
terms of compile time.

The only user of this update mechanism is GlobalsModRef. The idea of
using this to keep it up to date doesn't really work anyways as its
analysis requires a symmetric analysis of two different memory
locations. It would be very hard to make updates be sufficiently
rigorous to *guarantee* symmetric analysis in this way, and it pretty
certainly isn't true today.

However, folks have been using GMR with this update for a long time and
seem to not be hitting the issues. The reported issue that the update
hook fixes isn't even a problem any more as other changes to
GetUnderlyingObject worked around it, and that issue stemmed from *many*
years ago. As a consequence, a prior patch provided a flag to control
the unsafe behavior of GMR, and this patch removes the update mechanism
that has questionable compile-time tradeoffs and is causing problems
with moving to the new pass manager. Note the lack of test updates --
not one test in tree actually requires this update, even for a contrived
case.

All of this was extensively discussed on the dev list, this patch will
just enact what that discussion decides on. I'm sending it for review in
part to show what I'm planning, and in part to show the *amazing* amount
of work this avoids. Every call to the AA here is something like three
to six indirect function calls, which in the non-LTO pipeline never do
any work! =[

Differential Revision: http://reviews.llvm.org/D11214

llvm-svn: 242605
2015-07-18 03:26:46 +00:00
Chandler Carruth
27209caf10 [PM/AA] Disable the core unsafe aspect of GlobalsModRef in the face of
basic changes to the IR such as folding pointers through PHIs, Selects,
integer casts, store/load pairs, or outlining.

This leaves the feature available behind a flag. This flag's default
could be flipped if necessary, but the real-world performance impact of
this particular feature of GMR may not be sufficiently significant for
many folks to want to run the risk.

Currently, the risk here is somewhat mitigated by half-hearted attempts
to update GlobalsModRef when the rest of the optimizer changes
something. However, I am currently trying to remove that update
mechanism as it makes migrating the AA infrastructure to a form that can
be readily shared between new and old pass managers very challenging.
Without this update mechanism, it is possible that this still unlikely
failure mode will start to trip people, and so I wanted to try to
proactively avoid that.

There is a lengthy discussion on the mailing list about why the core
approach here is flawed, and likely would need to look totally different
to be both reasonably effective and resilient to basic IR changes
occuring. This patch is essentially the first of two which will enact
the result of that discussion. The next patch will remove the current
update mechanism.

Thanks to lots of folks that helped look at this from different angles.
Especial thanks to Michael Zolotukhin for doing some very prelimanary
benchmarking of LTO without GlobalsModRef to get a rough idea of the
impact we could be facing here. So far, it looks very small, but there
are some concerns lingering from other benchmarking. The default here
may get flipped if performance results end up pointing at this as a more
significant issue.

Also thanks to Pete and Gerolf for reviewing!

Differential Revision: http://reviews.llvm.org/D11213

llvm-svn: 242512
2015-07-17 06:58:24 +00:00
Cong Hou
9e8d1a2460 Add new constructors for LoopInfo/DominatorTree/BFI/BPI
Those new constructors make it more natural to construct an object for a function. For example, previously to build a LoopInfo for a function, we need four statements:

DominatorTree DT;
LoopInfo LI;
DT.recalculate(F);
LI.analyze(DT);

Now we only need one statement:

LoopInfo LI(DominatorTree(F));

http://reviews.llvm.org/D11274

llvm-svn: 242486
2015-07-16 23:23:35 +00:00
Cong Hou
bd3b6b651a Rename LoopInfo::Analyze() to LoopInfo::analyze() and turn its parameter type to const&.
The benefit of turning the parameter of LoopInfo::analyze() to const& is that it now can accept a rvalue.

http://reviews.llvm.org/D11250

llvm-svn: 242426
2015-07-16 18:23:57 +00:00
Silviu Baranga
9659e5c213 Fix memcheck interval ends for pointers with negative strides
Summary:
The checking pointer grouping algorithm assumes that the
starts/ends of the pointers are well formed (start <= end).

The runtime memory checking algorithm also assumes this by doing:

 start0 < end1 && start1 < end0

to detect conflicts. This check only works if start0 <= end0 and
start1 <= end1.

This change correctly orders the interval ends by either checking
the stride (if it is constant) or by using min/max SCEV expressions.

Reviewers: anemet, rengolin

Subscribers: rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D11149

llvm-svn: 242400
2015-07-16 14:02:58 +00:00
Adam Nemet
da5196c5c1 [LAA] Split out a helper to check the pointer partitions, NFC
This is made a static public member function to allow the transition of
this logic from LAA to LoopDistribution.  (Technically, it could be an
implementation-local static function but then it would not be accessible
from LoopDistribution.)

llvm-svn: 242376
2015-07-16 02:48:05 +00:00
Cong Hou
be08cf7ca6 Create a wrapper pass for BranchProbabilityInfo.
This new wrapper pass is useful when we want to do branch probability analysis conditionally (e.g. only in PGO mode) but don't want to add one more pass dependence.

http://reviews.llvm.org/D11241

llvm-svn: 242349
2015-07-15 22:48:29 +00:00
Cong Hou
b7ea1783c2 Rename doFunction() in BFI to calculate() and change its parameters from pointers to references.
http://reviews.llvm.org/D11196

llvm-svn: 242322
2015-07-15 19:58:26 +00:00
Tobias Edler von Koch
2f242c1a89 Analyze recursive PHI nodes in BasicAA
Summary:
This patch allows phi nodes like
  %x = phi [ %incptr, ... ] [ %var, ... ]
  %incptr = getelementptr %x, 1
to be analyzed by BasicAliasAnalysis.

In aliasPHI, we can detect incoming values that are recursive GEPs with a
constant offset. Instead of trying to analyze a recursive GEP (and failing), 
we now ignore it and instead set the size of the memory referenced by
the PHINode to UnknownSize. This represents all the possible memory
locations the pointer represented by the PHINode could be advanced to
by the GEP.

For now, this new behavior is turned off by default to allow debugging of
performance degradations seen with SPEC/x86 and Hexagon benchmarks.
The flag -basicaa-recphi turns it on.


Reviewers: hfinkel, sanjoy

Subscribers: tobiasvk_caf, sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D10368

llvm-svn: 242320
2015-07-15 19:32:22 +00:00
Chandler Carruth
e6488cecb7 [PM/AA] Fix *numerous* serious bugs in GlobalsModRef found by
inspection.

While we want to handle calls specially in this code because they should
have been modeled by the call graph analysis that precedes it, we should
*not* be re-implementing the predicates for whether an instruction reads
or writes memory. Those are well defined already. Notably, at least the
following issues seem to be clearly missed before:
- Ordered atomic loads can "write" to memory by causing writes from other
  threads to become visible. Similarly for ordered atomic stores.
- AtomicRMW instructions quite obviously both read and write to memory.
- AtomicCmpXchg instructions also read and write to memory.
- Fences read and write to memory.
- Invokes of intrinsics or memory allocation functions.

I don't have any test cases, and I suspect this has never really come up
in the real world. But there is no reason why it wouldn't, and it makes
the code simpler to do this the right way.

While here, I've tried to make the loops significantly simpler as well
and added helpful comments as to what is going on.

llvm-svn: 242281
2015-07-15 08:53:29 +00:00
Chandler Carruth
3ce7d7f488 [PM/AA] Cleanup some loops to be range-based. NFC.
llvm-svn: 242275
2015-07-15 08:09:23 +00:00
Wei Mi
3e99add8f8 Create a wrapper pass for BlockFrequencyInfo.
This is useful when we want to do block frequency analysis
conditionally (e.g. only in PGO mode) but don't want to add
one more pass dependence.

Patch by congh.
Approved by dexonsmith.
Differential Revision: http://reviews.llvm.org/D11196

llvm-svn: 242248
2015-07-14 23:40:50 +00:00
Adam Nemet
2f11e017da [LAA] Introduce RuntimePointerChecking::PointerInfo, NFC
Turn this structure-of-arrays (i.e. the various pointer attributes) into
array-of-structures.

llvm-svn: 242219
2015-07-14 22:32:50 +00:00
Adam Nemet
b25d5963ee [LAA] Lift RuntimePointerCheck out of LoopAccessInfo, NFC
I am planning to add more nested classes inside RuntimePointerCheck so
all these triple-nesting would be hard to follow.

Also rename it to RuntimePointerChecking (i.e. append 'ing').

llvm-svn: 242218
2015-07-14 22:32:44 +00:00
Chandler Carruth
7be73698e4 [PM/AA] Reformat GlobalsModRef so that subsequent patches I make here
don't continually introduce formatting deltas. NFC

llvm-svn: 242129
2015-07-14 08:42:39 +00:00
Silviu Baranga
ae2a209a5b Cleanup after r241809 - remove uncessary call to std::sort
Summary:
The iteration order within a member of DepCands is deterministic
and therefore we don't have to sort the accesses within a member.
We also don't have to copy the indices of the pointers into a
vector, since we can iterate over the members of the class.

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11145

llvm-svn: 242033
2015-07-13 14:48:24 +00:00
Manuel Klimek
7a05b84c26 Revert r241981 "Revert "Revert r236894 "[BasicAA] Fix zext & sext handling"""
The repros from PR23626 still fail.

llvm-svn: 242025
2015-07-13 13:50:55 +00:00
Jingyue Wu
248d90dab7 [LSR] don't attempt to promote ephemeral values to indvars
Summary:
This at least saves compile time. I also encountered a case where
ephemeral values affect whether other variables are promoted, causing
performance issues. It may be a bug in LSR, but I didn't manage to
reduce it yet. Anyhow, I believe it's in general not worth considering
ephemeral values in LSR.

Reviewers: atrick, hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11115

llvm-svn: 242011
2015-07-13 03:28:53 +00:00
David Majnemer
31685d95b1 [InstSimplify] Teach InstSimplify how to simplify extractelement
llvm-svn: 242008
2015-07-13 01:15:53 +00:00
David Majnemer
bbc2fd617f [InstSimplify] Teach InstSimplify how to simplify extractvalue
llvm-svn: 242007
2015-07-13 01:15:46 +00:00
Hal Finkel
d776deb484 Revert "Revert r236894 "[BasicAA] Fix zext & sext handling""
r236894 caused PR23626 (Clang miscompiles webkit's base64 decoder), and was
reverted in r237984. This reapplies the patch with an additional test case for
PR23626 and the associated fix (both scales and offsets in the
BasicAliasAnalysis::constantOffsetHeuristic should initially be zero).

Patch by Nick White, thanks!

llvm-svn: 241981
2015-07-11 11:04:54 +00:00
Hal Finkel
ef984b764f Move getStrideFromPointer and friends from LoopVectorize to VectorUtils
The following functions are moved from the LoopVectorizer to VectorUtils:

  - getGEPInductionOperand
  - stripGetElementPtr
  - getUniqueCastUse
  - getStrideFromPointer

These used to be static functions in LoopVectorize, but will also be used by
the upcoming loop versioning LICM transformation.

Patch by Ashutosh Nema!

llvm-svn: 241980
2015-07-11 10:52:42 +00:00
Igor Laevsky
05bff16edd Add argmemonly attribute.
This change adds new attribute called "argmemonly". Function marked with this attribute can only access memory through it's argument pointers. This attribute directly corresponds to the "OnlyAccessesArgumentPointees" ModRef behaviour in alias analysis.

Differential Revision: http://reviews.llvm.org/D10398

llvm-svn: 241979
2015-07-11 10:30:36 +00:00
Chandler Carruth
c98a5f7bff [PM/AA] Completely remove the AliasAnalysis::copyValue interface.
No in-tree alias analysis used this facility, and it was not called in
any particularly rigorous way, so it seems unlikely to be correct.

Note that one of the only stateful AA implementations in-tree,
GlobalsModRef is completely broken currently (and any AA passes like it
are equally broken) because Module AA passes are not effectively
invalidated when a function pass that fails to update the AA stack runs.

Ultimately, it doesn't seem like we know how we want to build stateful
AA, and until then trying to support and maintain correctness for an
untested API is essentially impossible. To that end, I'm planning to rip
out all of the update API. It can return if and when we need it and know
how to build it on top of the new pass manager and as part of *tested*
stateful AA implementations in the tree.

Differential Revision: http://reviews.llvm.org/D10889

llvm-svn: 241975
2015-07-11 04:39:00 +00:00
Benjamin Kramer
90b7dad7cf [InstSimplify] Fold away ord/uno fcmps when nnan is present.
This is important to fold away the slow case of complex multiplies
emitted by clang.

llvm-svn: 241911
2015-07-10 14:02:02 +00:00
David Majnemer
80ac5e60bf Revert the new EH instructions
This reverts commits r241888-r241891, I didn't mean to commit them.

llvm-svn: 241893
2015-07-10 07:15:17 +00:00
David Majnemer
6310e08ce2 New EH representation for MSVC compatibility
Summary:
This introduces new instructions neccessary to implement MSVC-compatible
exception handling support.  Most of the middle-end and none of the
back-end haven't been audited or updated to take them into account.

Reviewers: rnk, JosephTremoulet, reames, nlewycky, rjmccall

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11041

llvm-svn: 241888
2015-07-10 07:00:44 +00:00
Adam Nemet
9e7561f620 [LAA] Fix grammar in debug output
llvm-svn: 241867
2015-07-09 22:17:41 +00:00
Adam Nemet
816459e704 [LAA] Hide NeedRTCheck logic completely inside canCheckPtrAtRT, NFC
Currently canCheckPtrAtRT returns two flags NeedRTCheck and CanDoRT.
NeedRTCheck says whether we need checks and CanDoRT whether we can
generate the checks.  The idea is to encode three states with these:

     Need/Can:
(1) false/dont-care: no checks are needed
(2) true/false: we need checks but can't generate them
(3) true/true: we need checks and we can generate them

This is pretty unnecessary since the caller (analyzeLoop) is only
interested in whether we can generate the checks if we actually need
them (i.e. 1 or 3).

So this change cleans up to return just that (CanDoRTIfNeeded) and pulls
all the underlying logic into canCheckPtrAtRT.

By doing all this, we simplify analyzeLoop which is the complex function
in LAA.

There is further room for improvement here by using RtCheck.Need
directly rather than a new local variable NeedRTCheck but that's for a
later patch.

llvm-svn: 241866
2015-07-09 22:17:38 +00:00
Silviu Baranga
3a8eca654a Don't rely on the DepCands iteration order when constructing checking pointer groups
Summary:
The checking pointer group construction algorithm relied on the iteration on DepCands.
We would need the same leaders across runs and the same iteration order over the underlying std::set for determinism.

This changes the algorithm to process the pointers in the order in which they were added to the runtime check, which is deterministic.
We need to update the tests, since the order in which pointers appear has changed.

No new tests were added, since it is impossible to test for non-determinism.

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11064

llvm-svn: 241809
2015-07-09 15:18:25 +00:00
Adam Nemet
0248df44fb [LAA] Fix line break in comment
llvm-svn: 241785
2015-07-09 06:47:21 +00:00
Adam Nemet
f7994e3356 [LAA] Rename IsRTNeeded to IsRTCheckAnalysisNeeded
The original name was too close to NeedRTCheck which is what the actual
memcheck analysis returns.  This flag, as the new name suggests, is only
used to whether to initiate that analysis.

Also a comment is added to answer one question I had about this code for
a long time.  Namely, how does this flag differ from
isDependencyCheckNeeded since they are seemingly set at the same time.

llvm-svn: 241784
2015-07-09 06:47:18 +00:00
Mehdi Amini
8fce925ea2 Make TargetTransformInfo keeping a reference to the Module DataLayout
DataLayout is no longer optional. It was initialized with or without
a DataLayout, and the DataLayout when supplied could have been the
one from the TargetMachine.

Summary:
This change is part of a series of commits dedicated to have a single
DataLayout during compilation by using always the one owned by the
module.

Reviewers: echristo

Subscribers: jholewinski, llvm-commits, rafael, yaron.keren

Differential Revision: http://reviews.llvm.org/D11021

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 241774
2015-07-09 02:08:42 +00:00
Adam Nemet
743520f4d8 [LAA] Fix misleading use of word 'consecutive'
Fix some places where the word consecutive is used but the code really
means constant-stride (i.e. not just unit stride).

llvm-svn: 241763
2015-07-09 00:03:22 +00:00
Adam Nemet
24c6a55664 [LAA] Revert a small part of r239295
This commit ([LAA] Fix estimation of number of memchecks) regressed the
logic a bit.  We shouldn't quit the analysis if we encounter a pointer
without known bounds *unless* we actually need to emit a memcheck for
it.

The original code was using NumComparisons which is now computed
differently.  Instead I compute NeedRTCheck from NumReadPtrChecks and
NumWritePtrChecks.

As side note, I find the separation of NeedRTCheck and CanDoRT
confusing, so I will try to merge them in a follow-up patch.

llvm-svn: 241756
2015-07-08 22:58:48 +00:00
Adam Nemet
15a02a12cd [LAA] Add missing debug output after r239285
r239285 ([LoopAccessAnalysis] Teach LAA to check the memory dependence
between strided accesses.) introduced a new case under
MemoryDepChecker::isDependent.  We normally have debug output for each
case.

llvm-svn: 241707
2015-07-08 18:47:38 +00:00
Silviu Baranga
f1d9b77877 [LAA] Merge memchecks for accesses separated by a constant offset
Summary:
Often filter-like loops will do memory accesses that are
separated by constant offsets. In these cases it is
common that we will exceed the threshold for the
allowable number of checks.

However, it should be possible to merge such checks,
sice a check of any interval againt two other intervals separated
by a constant offset (a,b), (a+c, b+c) will be equivalent with
a check againt (a, b+c), as long as (a,b) and (a+c, b+c) overlap.
Assuming the loop will be executed for a sufficient number of
iterations, this will be true. If not true, checking against
(a, b+c) is still safe (although not equivalent).

As long as there are no dependencies between two accesses,
we can merge their checks into a single one. We use this
technique to construct groups of accesses, and then check
the intervals associated with the groups instead of
checking the accesses directly.

Reviewers: anemet

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10386

llvm-svn: 241673
2015-07-08 09:16:33 +00:00
Karthik Bhat
e65d1c7f73 Allow constfolding of llvm.sin.* and llvm.cos.* intrinsics
This patch const folds llvm.sin.* and llvm.cos.* intrinsics whenever feasible.

Differential Revision: http://reviews.llvm.org/D10836

llvm-svn: 241665
2015-07-08 03:55:47 +00:00
Reid Kleckner
45072b933e Rename llvm.frameescape and llvm.framerecover to localescape and localrecover
Summary:
Initially, these intrinsics seemed like part of a family of "frame"
related intrinsics, but now I think that's more confusing than helpful.
Initially, the LangRef specified that this would create a new kind of
allocation that would be allocated at a fixed offset from the frame
pointer (EBP/RBP). We ended up dropping that design, and leaving the
stack frame layout alone.

These intrinsics are really about sharing local stack allocations, not
frame pointers. I intend to go further and add an `llvm.localaddress()`
intrinsic that returns whatever register (EBP, ESI, ESP, RBX) is being
used to address locals, which should not be confused with the frame
pointer.

Naming suggestions at this point are welcome, I'm happy to re-run sed.

Reviewers: majnemer, nicholas

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11011

llvm-svn: 241633
2015-07-07 22:25:32 +00:00
Peter Collingbourne
f49ef7d3ac IR: Do not consider available_externally linkage to be linker-weak.
From the linker's perspective, an available_externally global is equivalent
to an external declaration (per isDeclarationForLinker()), so it is incorrect
to consider it to be a weak definition.

Also clean up some logic in the dead argument elimination pass and clarify
its comments to better explain how its behavior depends on linkage,
introduce GlobalValue::isStrongDefinitionForLinker() and start using
it throughout the optimizers and backend.

Differential Revision: http://reviews.llvm.org/D10941

llvm-svn: 241413
2015-07-05 20:52:35 +00:00
Yaron Keren
5f11002151 Delete whitespace at start of line.
llvm-svn: 241265
2015-07-02 14:17:12 +00:00
Eric Christopher
99d0bcb7d3 Add a routine to TargetTransformInfo that will allow targets to look
at the attributes on a function to determine whether or not to allow
inlining.

llvm-svn: 241220
2015-07-02 01:11:47 +00:00
Tobias Grosser
ce9ed925b6 Move delinearization from SCEVAddRecExpr to ScalarEvolution
The expressions we delinearize do not necessarily have to have a SCEVAddRecExpr
at the outermost level. At this moment, the additional flexibility  is not
exploited in LLVM itself, but in Polly we will soon soonish use this
functionality. For LLVM, this change should not affect existing functionality
(which is covered by test/Analysis/Delinearization/)

llvm-svn: 240952
2015-06-29 14:42:48 +00:00
Philip Reames
3680a5e70e Teach InlineCost to account for a null check which can be folded away
If we have a caller that knows a particular argument can never be null, we can exploit this fact while simplifying values in the inline cost analysis. This has the effect of reducing the cost for inlining when a null check is present in the callee, but the value is known non null in the caller. In particular, any dependent control flow can be discounted from the cost estimate.

Note that we use the parameter attributes at the call site to memoize the analysis within the caller's code.  The setting of this attribute is done in InstCombine, the inline cost analysis just consumes it.  This is intentional and important because we want the inline cost analysis results to be easily cachable themselves.  We're not currently doing so, but initial results on LTO indicate this will quickly become important.

Differential Revision: http://reviews.llvm.org/D9129

llvm-svn: 240828
2015-06-26 20:51:17 +00:00
David Blaikie
6b1ed69851 Move VectorUtils from Transforms to Analysis to correct layering violation
llvm-svn: 240804
2015-06-26 18:02:52 +00:00
Adam Nemet
8912979930 [LAA] Try to prove non-wrapping of pointers if SCEV cannot
Summary:
Scalar evolution does not propagate the non-wrapping flags to values
that are derived from a non-wrapping induction variable because
the non-wrapping property could be flow-sensitive.

This change is a first attempt to establish the non-wrapping property in
some simple cases.  The main idea is to look through the operations
defining the pointer.  As long as we arrive to a non-wrapping AddRec via
a small chain of non-wrapping instruction, the pointer should not wrap
either.

I believe that this essentially is what Andy described in
http://article.gmane.org/gmane.comp.compilers.llvm.cvs/220731 as the way
forward.

Reviewers: aschwaighofer, nadav, sanjoy, atrick

Reviewed By: atrick

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10472

llvm-svn: 240798
2015-06-26 17:25:43 +00:00
Artur Pilipenko
ccbbd4db82 Take alignment into account in isSafeToLoadUnconditionally
Reviewed By: hfinkel

Differential Revision: http://reviews.llvm.org/D10475

llvm-svn: 240636
2015-06-25 12:18:43 +00:00
Jingyue Wu
8fd4d61a47 [LSR] canonicalize Prod*(1<<C) to Prod<<C
Summary:
Because LSR happens at a late stage where mul of a power of 2 is
typically canonicalized to shl, this canonicalization emits code that
can be better CSE'ed.

Test Plan:
Transforms/LoopStrengthReduce/shl.ll shows how this change makes GVN more
powerful. Fixes some existing tests due to this change.

Reviewers: sanjoy, majnemer, atrick

Reviewed By: majnemer, atrick

Subscribers: majnemer, llvm-commits

Differential Revision: http://reviews.llvm.org/D10448

llvm-svn: 240573
2015-06-24 19:28:40 +00:00
Bruno Cardoso Lopes
511568fdbd [CaptureTracking] Avoid long compilation time on large basic blocks
CaptureTracking becomes very expensive in large basic blocks while
calling PointerMayBeCaptured. PointerMayBeCaptured scans the BB the
number of times equal to the number of uses of 'BeforeHere', which is
currently capped at 20 and bails out with Tracker->tooManyUses().

The bottleneck here is the number of calls to PointerMayBeCaptured * the
basic block scan. In a testcase with a 82k instruction BB,
PointerMayBeCaptured is called 130k times, leading to 'shouldExplore'
taking 527k runs, this currently takes ~12min.

To fix this we locally (within PointerMayBeCaptured) number the
instructions in the basic block using a DenseMap to cache instruction
positions/numbers. We build the cache incrementally every time we need
to scan an unexplored part of the BB, improving compile time to only
take ~2min.

This triggers in the flow: DeadStoreElimination -> MepDepAnalysis ->
CaptureTracking.

Side note: after multiple runs in the test-suite I've seen no
performance nor compile time regressions, but could note a couple of
compile time improvements:

Performance Improvements - Compile Time Delta Previous  Current StdDev
SingleSource/Benchmarks/Misc-C++/bigfib -4.48%  0.8547  0.8164  0.0022
MultiSource/Benchmarks/TSVC/LoopRerolling-dbl/LoopRerolling-dbl -1.47% 1.3912  1.3707  0.0056

Differential Revision: http://reviews.llvm.org/D7010

llvm-svn: 240560
2015-06-24 17:53:17 +00:00
Alexander Kornienko
f993659b8f Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC)
Apparently, the style needs to be agreed upon first.

llvm-svn: 240390
2015-06-23 09:49:53 +00:00
Chandler Carruth
440d4e2329 [PM/AA] Hoist the AliasResult enum out of the AliasAnalysis class.
This will allow classes to implement the AA interface without deriving
from the class or referencing an internal enum of some other class as
their return types.

Also, to a pretty fundamental extent, concepts such as 'NoAlias',
'MayAlias', and 'MustAlias' are first class concepts in LLVM and we
aren't saving anything by scoping them heavily.

My mild preference would have been to use a scoped enum, but that
feature is essentially completely broken AFAICT. I'm extremely
disappointed. For example, we cannot through any reasonable[1] means
construct an enum class (or analog) which has scoped names but converts
to a boolean in order to test for the possibility of aliasing.

[1]: Richard Smith came up with a "solution", but it requires class
templates, and lots of boilerplate setting up the enumeration multiple
times. Something like Boost.PP could potentially bundle this up, but
even that would be quite painful and it doesn't seem realistically worth
it. The enum class solution would probably work without the need for
a bool conversion.

Differential Revision: http://reviews.llvm.org/D10495

llvm-svn: 240255
2015-06-22 02:16:51 +00:00
Chandler Carruth
745bc09941 [PM/AA] Rework the names and comments in AliasSetTracker to more
accurately describe what is being tracked.

While these two enums do track mod/ref information and aliasing
information, they don't represent the exact same things as either the
mod/ref enums or the alias result enum in AA. They're definitions are
dominated by the structure of their lattice and the bit's various
semantics. This patch just calls them what they are and tries to spell
out usefully distinct names for these things.

This will clear the path for using a raw unscoped enum to represent some
of these concepts across LLVM's analysis library.

No functionality changed here.

Differential Revision: http://reviews.llvm.org/D10494

llvm-svn: 240254
2015-06-22 02:12:52 +00:00
Sanjoy Das
3200fa8f55 [CallGraph] Given -print-callgraph a stable printing order.
Summary:
Since FunctionMap has llvm::Function pointers as keys, the order in
which the traversal happens can differ from run to run, causing spurious
FileCheck failures.  Have CallGraph::print sort the CallGraphNodes by
name before printing them.

Reviewers: bogner, chandlerc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10575

llvm-svn: 240191
2015-06-19 23:20:31 +00:00
Chad Rosier
d56c386ef6 Typo. NFC.
llvm-svn: 240141
2015-06-19 17:32:57 +00:00