1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-20 03:23:01 +02:00
Commit Graph

6829 Commits

Author SHA1 Message Date
Oliver Stannard
7be9f2d236 [ARM] Don't convert switches to lookup tables of pointers with ROPI/RWPI
With the ROPI and RWPI relocation models we can't always have pointers
to global data or functions in constant data, so don't try to convert switches
into lookup tables if any value in the lookup table would require a relocation.
We can still safely emit lookup tables of other values, such as simple
constants.

Differential Revision: https://reviews.llvm.org/D24462

llvm-svn: 283530
2016-10-07 08:48:24 +00:00
Henric Karlsson
a92296750c Test commit access (NFC)
llvm-svn: 283439
2016-10-06 10:58:41 +00:00
Bjorn Pettersson
403539ab64 [ValueTracking] Teach computeKnownBits and ComputeNumSignBits to look through ExtractElement.
Summary:
The computeKnownBits and ComputeNumSignBits functions in ValueTracking can now do a simple look-through of ExtractElement.

Reviewers: majnemer, spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D24955

llvm-svn: 283434
2016-10-06 09:56:21 +00:00
Mehdi Amini
b4869611fc Re-commit "Use StringRef in Support/Darf APIs (NFC)"
This reverts commit r283285 and re-commit r283275 with
a fix for format("%s", Str); where Str is a StringRef.

llvm-svn: 283298
2016-10-05 05:59:29 +00:00
Mehdi Amini
c494f9f824 Revert "Re-commit "Use StringRef in Support/Darf APIs (NFC)""
One test seems randomly broken: DebugInfo/X86/gnu-public-names.ll

llvm-svn: 283285
2016-10-05 01:04:02 +00:00
Mehdi Amini
89a7bf7e21 Re-commit "Use StringRef in Support/Darf APIs (NFC)"
This reverts commit r283278 and re-commit r283275 with
the update to fix the build on the LLDB side.

llvm-svn: 283281
2016-10-05 00:37:18 +00:00
Mehdi Amini
65317a7af9 Revert "Use StringRef in Support/Darf APIs (NFC)"
This reverts commit r283275, it broke LLDB Android debug server.

llvm-svn: 283278
2016-10-05 00:21:14 +00:00
Mehdi Amini
37c7e3e805 Use StringRef in Support/Darf APIs (NFC)
llvm-svn: 283275
2016-10-04 23:55:40 +00:00
Hal Finkel
95d4e72334 Don't filter diagnostics written as YAML to the output file
The purpose of the YAML diagnostic output file is to collect information on
optimizations performed, or not performed, for later processing by tools that
help users (and compiler developers) understand how code was optimized. As
such, the diagnostics that appear in the file should not be coupled to what a
user might want to see summarized for them as the compiler runs, and in fact,
because the user likely does not know what optimization diagnostics their tools
might want to use, the user cannot provide a useful filter regardless. As such,
we shouldn't filter the diagnostics going to the output file.

Differential Revision: https://reviews.llvm.org/D25224

llvm-svn: 283236
2016-10-04 18:13:45 +00:00
Adam Nemet
9b5496d078 Serialize remark argument as a mapping to get proper quotation for the value.
llvm-svn: 283231
2016-10-04 17:05:04 +00:00
Adam Nemet
3784a5beb7 Allow derived classes of OptimizationRemarkAnalysis in YAML
llvm-svn: 283230
2016-10-04 17:05:01 +00:00
Eli Friedman
6579a1a109 Make GlobalsAA ignore dead constant expressions.
Slightly improves the precision of GlobalsAA in certain situations, and
makes the behavior of optimization passes more predictable.

Differential Revision: https://reviews.llvm.org/D24104

llvm-svn: 283165
2016-10-04 00:03:55 +00:00
Volkan Keles
2c3720a7dd Add new target hooks for LoadStoreVectorizer
Summary: Added 6 new target hooks for the vectorizer in order to filter types, handle size constraints and decide how to split chains.

Reviewers: tstellarAMD, arsenm

Subscribers: arsenm, mzolotukhin, wdng, llvm-commits, nhaehnle

Differential Revision: https://reviews.llvm.org/D24727

llvm-svn: 283099
2016-10-03 10:31:34 +00:00
Sanjoy Das
0ab0771735 [SCEV] Rely on ConstantRange instead of custom logic; NFCI
This was first landed in rL283058 and subsequenlty reverted since a
change this depends on (rL283057) was buggy and had to be reverted.

llvm-svn: 283079
2016-10-02 20:59:10 +00:00
Sanjoy Das
ce09fb6c0a Revert r283057 and r283058
They've broken the sanitizer-bootstrap bots.  Reverting while I investigate.

Original commit messages:

r283057: "[ConstantRange] Make getEquivalentICmp smarter"

r283058: "[SCEV] Rely on ConstantRange instead of custom logic; NFCI"
llvm-svn: 283062
2016-10-02 02:40:27 +00:00
Sanjoy Das
48cc33630a Remove duplicated code; NFC
ICmpInst::makeConstantRange does exactly the same thing as
ConstantRange::makeExactICmpRegion.

llvm-svn: 283059
2016-10-02 00:09:57 +00:00
Sanjoy Das
09bd5d6637 [SCEV] Rely on ConstantRange instead of custom logic; NFCI
llvm-svn: 283058
2016-10-02 00:09:52 +00:00
Sanjoy Das
6b37e26252 [SCEV] Remove commented out code; NFC
llvm-svn: 283056
2016-10-02 00:09:45 +00:00
Mehdi Amini
1c713f7262 Use StringRef in TLI instead of raw pointer (NFC)
llvm-svn: 283005
2016-10-01 03:10:48 +00:00
Mehdi Amini
1fef2dd6b7 Use StringRef in Pass/PassManager APIs (NFC)
llvm-svn: 283004
2016-10-01 02:56:57 +00:00
Piotr Padlewski
40ebe4972e NFC fix doxygen comments
llvm-svn: 282950
2016-09-30 21:05:49 +00:00
Adam Nemet
abdb6ed3d0 [LAA, LV] Port to new streaming interface for opt remarks. Update LV
(Recommit after making sure IsVerbose gets properly initialized in
DiagnosticInfoOptimizationBase.  See previous commit that takes care of
this.)

OptimizationRemarkAnalysis directly takes the role of the report that is
generated by LAA.

Then we need the magic to be able to turn an LAA remark into an LV
remark.  This is done via a new OptimizationRemark ctor.

llvm-svn: 282813
2016-09-30 00:01:30 +00:00
Adam Nemet
3b79489357 Revert "[LAA, LV] Port to new streaming interface for opt remarks. Update LV"
This reverts commit r282758.

There are some clang failures I haven't seen.

llvm-svn: 282759
2016-09-29 20:17:37 +00:00
Adam Nemet
6a1c0c8759 [LAA, LV] Port to new streaming interface for opt remarks. Update LV
OptimizationRemarkAnalysis directly takes the role of the report that is
generated by LAA.

Then we need the magic to be able to turn an LAA remark into an LV
remark.  This is done via a new OptimizationRemark ctor.

llvm-svn: 282758
2016-09-29 20:12:18 +00:00
Dehao Chen
8c4cb5e152 Refactor the ProfileSummaryInfo to use doInitialization and doFinalization to handle Module update.
Summary: This refactors the change in r282616

Reviewers: davidxl, eraman, mehdi_amini

Subscribers: mehdi_amini, davide, llvm-commits

Differential Revision: https://reviews.llvm.org/D25041

llvm-svn: 282630
2016-09-28 21:00:58 +00:00
Dehao Chen
d6dfa42e25 Fix the bug when -compile-twice is specified, the PSI will be invalidated.
Summary:
When using llc with -compile-twice, module is generated twice, but getAnalysis<ProfileSummaryInfoWrapperPass>().getPSI will still get the old PSI with the original (invalidated) Module. This patch checks if the module has changed when calling getPSI, if yes, update the module and invalidate the Summary.
The bug does not show up in the current llc because PSI is not used in CodeGen yet. But with https://reviews.llvm.org/D24989, the bug will be exposed by test/CodeGen/PowerPC/pr26378.ll

Reviewers: eraman, davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D24993

llvm-svn: 282616
2016-09-28 18:41:14 +00:00
Artur Pilipenko
e07af89244 Don't look through addrspacecast in GetPointerBaseWithConstantOffset
Pointers in different addrspaces can have different sizes, so it's not valid to look through addrspace cast calculating base and offset for a value.

This is similar to D13008.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D24729

llvm-svn: 282612
2016-09-28 17:57:16 +00:00
Sanjoy Das
18c0120be6 [SCEV] Use a SmallPtrSet as a temporary union predicate; NFC
Summary:
Instead of creating and destroying SCEVUnionPredicate instances (which
internally creates and destroys a DenseMap), use temporary SmallPtrSet
instances of remember the set of predicates that will get reified into a
SCEVUnionPredicate.

Reviewers: silviu.baranga, sbaranga

Subscribers: sanjoy, mcrosier, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D25000

llvm-svn: 282606
2016-09-28 17:14:58 +00:00
Sanjay Patel
51c21d223b [InstSimplify] allow or-of-icmps folds with vector splat constants
llvm-svn: 282592
2016-09-28 14:27:21 +00:00
Sanjay Patel
aac597925f [InstSimplify] allow and-of-icmps folds with vector splat constants
llvm-svn: 282590
2016-09-28 13:53:13 +00:00
Adam Nemet
395f991d15 [LAA] Rename emitAnalysis to recordAnalys. NFC
Ever since LAA was split out into an analysis on its own, this function
stopped emitting the report directly.  Instead it stores it to be
retrieved by the client which can then emit it as its own report
(e.g. -Rpass-analysis=loop-vectorize).

llvm-svn: 282561
2016-09-28 00:58:36 +00:00
Adam Nemet
ec2292c80c [Inliner] Port all opt remarks to new streaming API
llvm-svn: 282559
2016-09-27 23:47:03 +00:00
Adam Nemet
c03a73efe2 Shorten DiagnosticInfoOptimizationRemark* to OptimizationRemark*. NFC
With the new streaming interface, these class names need to be typed a
lot and it's way too looong.

llvm-svn: 282544
2016-09-27 22:19:23 +00:00
Adam Nemet
f602aa8cdd Output optimization remarks in YAML
(Re-committed after moving the template specialization under the yaml
namespace.  GCC was complaining about this.)

This allows various presentation of this data using an external tool.
This was first recommended here[1].

As an example, consider this module:

  1 int foo();
  2 int bar();
  3
  4 int baz() {
  5   return foo() + bar();
  6 }

The inliner generates these missed-optimization remarks today (the
hotness information is pulled from PGO):

  remark: /tmp/s.c:5:10: foo will not be inlined into baz (hotness: 30)
  remark: /tmp/s.c:5:18: bar will not be inlined into baz (hotness: 30)

Now with -pass-remarks-output=<yaml-file>, we generate this YAML file:

  --- !Missed
  Pass:            inline
  Name:            NotInlined
  DebugLoc:        { File: /tmp/s.c, Line: 5, Column: 10 }
  Function:        baz
  Hotness:         30
  Args:
    - Callee: foo
    - String:  will not be inlined into
    - Caller: baz
  ...
  --- !Missed
  Pass:            inline
  Name:            NotInlined
  DebugLoc:        { File: /tmp/s.c, Line: 5, Column: 18 }
  Function:        baz
  Hotness:         30
  Args:
    - Callee: bar
    - String:  will not be inlined into
    - Caller: baz
  ...

This is a summary of the high-level decisions:

* There is a new streaming interface to emit optimization remarks.
E.g. for the inliner remark above:

   ORE.emit(DiagnosticInfoOptimizationRemarkMissed(
                DEBUG_TYPE, "NotInlined", &I)
            << NV("Callee", Callee) << " will not be inlined into "
            << NV("Caller", CS.getCaller()) << setIsVerbose());

NV stands for named value and allows the YAML client to process a remark
using its name (NotInlined) and the named arguments (Callee and Caller)
without parsing the text of the message.

Subsequent patches will update ORE users to use the new streaming API.

* I am using YAML I/O for writing the YAML file.  YAML I/O requires you
to specify reading and writing at once but reading is highly non-trivial
for some of the more complex LLVM types.  Since it's not clear that we
(ever) want to use LLVM to parse this YAML file, the code supports and
asserts that we're writing only.

On the other hand, I did experiment that the class hierarchy starting at
DiagnosticInfoOptimizationBase can be mapped back from YAML generated
here (see D24479).

* The YAML stream is stored in the LLVM context.

* In the example, we can probably further specify the IR value used,
i.e. print "Function" rather than "Value".

* As before hotness is computed in the analysis pass instead of
DiganosticInfo.  This avoids the layering problem since BFI is in
Analysis while DiagnosticInfo is in IR.

[1] https://reviews.llvm.org/D19678#419445

Differential Revision: https://reviews.llvm.org/D24587

llvm-svn: 282539
2016-09-27 20:55:07 +00:00
Sanjoy Das
07e9608e3a [SCEV] Replace a struct with a function; NFC
We can do this now thanks to C++11 lambdas.

llvm-svn: 282515
2016-09-27 18:01:48 +00:00
Sanjoy Das
cbfb142a0c [SCEV] Use find instead of find_as; NFC
We don't need the extra generality here.

llvm-svn: 282514
2016-09-27 18:01:46 +00:00
Sanjoy Das
539ce05e97 [SCEV] Reduce the scope of a struct; NFC
llvm-svn: 282513
2016-09-27 18:01:44 +00:00
Sanjoy Das
dafc679040 [SCEV] Remove custom RAII wrapper; NFC
Instead use the pre-existing `scope_exit` class.

llvm-svn: 282512
2016-09-27 18:01:42 +00:00
Sanjoy Das
2d300f40f9 [SCEV] Make PendingLoopPredicates more frugal; NFCI
I don't expect `PendingLoopPredicates` to have very many
elements (e.g. when -O3'ing the sqlite3 amalgamation,
`PendingLoopPredicates` has at most 3 elements).  So now we use a
`SmallPtrSet` for it instead of the more heavyweight `DenseSet`.

llvm-svn: 282511
2016-09-27 18:01:38 +00:00
Adam Nemet
5058aadaf2 Revert "Output optimization remarks in YAML"
This reverts commit r282499.

The GCC bots are failing

llvm-svn: 282503
2016-09-27 16:39:24 +00:00
Adam Nemet
b1d6f940c4 Output optimization remarks in YAML
This allows various presentation of this data using an external tool.
This was first recommended here[1].

As an example, consider this module:

  1 int foo();
  2 int bar();
  3
  4 int baz() {
  5   return foo() + bar();
  6 }

The inliner generates these missed-optimization remarks today (the
hotness information is pulled from PGO):

  remark: /tmp/s.c:5:10: foo will not be inlined into baz (hotness: 30)
  remark: /tmp/s.c:5:18: bar will not be inlined into baz (hotness: 30)

Now with -pass-remarks-output=<yaml-file>, we generate this YAML file:

  --- !Missed
  Pass:            inline
  Name:            NotInlined
  DebugLoc:        { File: /tmp/s.c, Line: 5, Column: 10 }
  Function:        baz
  Hotness:         30
  Args:
    - Callee: foo
    - String:  will not be inlined into
    - Caller: baz
  ...
  --- !Missed
  Pass:            inline
  Name:            NotInlined
  DebugLoc:        { File: /tmp/s.c, Line: 5, Column: 18 }
  Function:        baz
  Hotness:         30
  Args:
    - Callee: bar
    - String:  will not be inlined into
    - Caller: baz
  ...

This is a summary of the high-level decisions:

* There is a new streaming interface to emit optimization remarks.
E.g. for the inliner remark above:

   ORE.emit(DiagnosticInfoOptimizationRemarkMissed(
                DEBUG_TYPE, "NotInlined", &I)
            << NV("Callee", Callee) << " will not be inlined into "
            << NV("Caller", CS.getCaller()) << setIsVerbose());

NV stands for named value and allows the YAML client to process a remark
using its name (NotInlined) and the named arguments (Callee and Caller)
without parsing the text of the message.

Subsequent patches will update ORE users to use the new streaming API.

* I am using YAML I/O for writing the YAML file.  YAML I/O requires you
to specify reading and writing at once but reading is highly non-trivial
for some of the more complex LLVM types.  Since it's not clear that we
(ever) want to use LLVM to parse this YAML file, the code supports and
asserts that we're writing only.

On the other hand, I did experiment that the class hierarchy starting at
DiagnosticInfoOptimizationBase can be mapped back from YAML generated
here (see D24479).

* The YAML stream is stored in the LLVM context.

* In the example, we can probably further specify the IR value used,
i.e. print "Function" rather than "Value".

* As before hotness is computed in the analysis pass instead of
DiganosticInfo.  This avoids the layering problem since BFI is in
Analysis while DiagnosticInfo is in IR.

[1] https://reviews.llvm.org/D19678#419445

Differential Revision: https://reviews.llvm.org/D24587

llvm-svn: 282499
2016-09-27 16:15:16 +00:00
Piotr Padlewski
3152e057e1 [thinlto] Basic thinlto fdo heuristic
Summary:
This patch improves thinlto importer
by importing 3x larger functions that are called from hot block.

I compared performance with the trunk on spec, and there
were about 2% on povray and 3.33% on milc. These results seems
to be consistant and match the results Teresa got with her simple
heuristic. Some benchmarks got slower but I think they are just
noisy (mcf, xalancbmki, omnetpp)- running the benchmarks again with
more iterations to confirm. Geomean of all benchmarks including the noisy ones
were about +0.02%.

I see much better improvement on google branch with Easwaran patch
for pgo callsite inlining (the inliner actually inline those big functions)
Over all I see +0.5% improvement, and I get +8.65% on povray.
So I guess we will see much bigger change when Easwaran patch will land
(it depends on new pass manager), but it is still worth putting this to trunk
before it.

Implementation details changes:
- Removed CallsiteCount.
- ProfileCount got replaced by Hotness
- hot-import-multiplier is set to 3.0 for now,
didn't have time to tune it up, but I see that we get most of the interesting
functions with 3, so there is no much performance difference with higher, and
binary size doesn't grow as much as with 10.0.

Reviewers: eraman, mehdi_amini, tejohnson

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D24638

llvm-svn: 282437
2016-09-26 20:37:32 +00:00
Chandler Carruth
59981672f6 [SCEV] Fix the order of members in the initializer list.
Noticed due to the warning on this line. Sanjoy is on
a less-than-awesome internet connection, so committing on his behalf.

llvm-svn: 282380
2016-09-26 04:49:58 +00:00
Sanjoy Das
ef4a695251 [SCEV] Assign LoopPropertiesCache in the move constructor
In a previous change I collapsed two different caches into one.  When
doing that I noticed that ScalarEvolution's move constructor was not
moving those caches.

To keep the previous change simple, I've moved that bugfix into this
separate change.

llvm-svn: 282376
2016-09-26 02:44:10 +00:00
Sanjoy Das
b9f8e32d12 [SCEV] Combine two predicates into one; NFC
Both `loopHasNoSideEffects` and `loopHasNoAbnormalExits` involve walking
the loop and maintaining similar sorts of caches.  This commit changes
SCEV to compute both the predicates via a single walk, and maintain a
single cache instead of two.

llvm-svn: 282375
2016-09-26 02:44:07 +00:00
Sanjoy Das
4dc0994173 [SCEV] Make it obvious BackedgeTakenInfo's constructor steals storage
Specifically, it moves SCEVUnionPredicates from its input into its own
storage.  Make this obvious at the type level.

llvm-svn: 282374
2016-09-26 01:10:27 +00:00
Sanjoy Das
6926238887 [SCEV] Further isolate incidental data structure; NFC
llvm-svn: 282373
2016-09-26 01:10:25 +00:00
Sanjoy Das
2025c6e5ef [SCEV] Simplify BackedgeTakenInfo::getMax; NFC
llvm-svn: 282372
2016-09-26 01:10:22 +00:00
Sanjoy Das
761e5aabb2 [SCEV] Reserve space in SmallVector; NFC
llvm-svn: 282368
2016-09-25 23:12:08 +00:00
Sanjoy Das
23b6c4c2a8 [SCEV] Have ExitNotTakenInfo keep a pointer to its predicate; NFC
SCEVUnionPredicate is a "heavyweight" structure, so it is beneficial to
store the (optional) data out of line.

llvm-svn: 282366
2016-09-25 23:12:04 +00:00
Sanjoy Das
6bbb5f521c [SCEV] Simplify tracking ExitNotTakenInfo instances; NFC
This change simplifies a data structure optimization in the
`BackedgeTakenInfo` class for loops with exactly one computable exit.

I've sanity checked that this does not regress compile time performance,
using sqlite3's amalgamated build.

llvm-svn: 282365
2016-09-25 23:12:00 +00:00
Sanjoy Das
3b4c0c8086 [SCEV] Rename a couple of fields; NFC
llvm-svn: 282364
2016-09-25 23:11:57 +00:00
Sanjoy Das
e39d2ae7eb [SCEV] Remove incidental data structure; NFC
llvm-svn: 282363
2016-09-25 23:11:55 +00:00
Duncan P. N. Exon Smith
2f789947a5 Analysis: Return early for UndefValue in computeKnownBits
There is no benefit in looking through assumptions on UndefValue to
guess known bits.  Return early to avoid walking their use-lists, and
assert that all instances of ConstantData are handled here for similar
reasons (UndefValue was the only integer/pointer holdout).

llvm-svn: 282337
2016-09-24 20:42:02 +00:00
Duncan P. N. Exon Smith
b94ad7ec1b Analysis: Return early in isKnownNonNullAt for ConstantData
Check and return early for ConstantPointerNull and UndefValue
specifically in isKnownNonNullAt, and assert that ConstantData never
make it to isKnownNonNullFromDominatingCondition.

This confirms that isKnownNonNullFromDominatingCondition never walks
through the use-list of an instance of ConstantData.  Given that such
use-lists cross module boundaries, it never really made sense to do so,
and was potentially very expensive.

llvm-svn: 282333
2016-09-24 19:39:47 +00:00
Sanjay Patel
6d495bce9c [TLI] isdigit / isascii / toascii param type should match return type (PR30484)
We crash in LibCallSimplifier if we don't check the validity of the function signature properly.

llvm-svn: 282278
2016-09-23 18:44:09 +00:00
Jun Bum Lim
bd9da0f624 Enhance calcColdCallHeuristics for InvokeInst
Summary: When identifying cold blocks, consider only the edge to the normal destination if the terminator is InvokeInst and let calcInvokeHeuristics() decide edge weights for the InvokeInst.

Reviewers: mcrosier, hfinkel, davidxl

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D24868

llvm-svn: 282262
2016-09-23 17:26:14 +00:00
Adam Nemet
f1965e4c24 [LV] When reporting about a specific instruction without debug location use loop's
This can occur for example if some optimization drops the debug location.

llvm-svn: 282048
2016-09-21 03:14:20 +00:00
Michael Kuperstein
54c54e0c3e [InferAttributes] Don't access parameters that don't exist.
Check for the correct number of parameters before querying their type.
This fixes PR30455.

llvm-svn: 282038
2016-09-20 23:10:31 +00:00
Sanjay Patel
e9cfe9ee79 move variables closer to their uses; add FIXMEs; NFC
llvm-svn: 281972
2016-09-20 14:36:14 +00:00
Elena Demikhovsky
fdf2d14b30 [Loop Vectorizer] Consecutive memory access - fixed and simplified
Amended consecutive memory access detection in Loop Vectorizer.
Load/Store were not handled properly without preceding GEP instruction.

Differential Revision: https://reviews.llvm.org/D20789

llvm-svn: 281853
2016-09-18 13:56:08 +00:00
Sanjay Patel
11e8147804 [InstCombine] allow vector types for constant folding / computeKnownBits (PR24942)
computeKnownBits() already works for integer vectors, so allow vector types when calling that from InstCombine.

I don't think the change to use m_APInt in computeKnownBits is strictly necessary because we do check for 
ConstantVector later, but it's more efficient to handle the splat case without needing to loop on vector elements.

This should work with InstSimplify, but doesn't yet, so I made that a FIXME comment on the test for PR24942:
https://llvm.org/bugs/show_bug.cgi?id=24942

Differential Revision: https://reviews.llvm.org/D24677

llvm-svn: 281777
2016-09-16 21:20:36 +00:00
David L Kreitzer
d923ca0cb8 Reapplying r278731 after fixing the problem that caused it to be reverted.
Enhance SCEV to compute the trip count for some loops with unknown stride.

Patch by Pankaj Chawla

Differential Revision: https://reviews.llvm.org/D22377

llvm-svn: 281732
2016-09-16 14:38:13 +00:00
Chandler Carruth
f1fcfdd6f0 [LCG] Redesign the lazy post-order iteration mechanism for the
LazyCallGraph to support repeated, stable iterations, even in the face
of graph updates.

This is particularly important to allow the CGSCC pass manager to walk
the RefSCCs (and thus everything else) in a module more than once. Lots
of unittests and other tests were hard or impossible to write because
repeated CGSCC pass managers which didn't invalidate the LazyCallGraph
would conclude the module was empty after the first one. =[ Really,
really bad.

The interesting thing is that in many ways this simplifies the code. We
can now re-use the same code for handling reference edge insertion
updates of the RefSCC graph as we use for handling call edge insertion
updates of the SCC graph. Outside of adapting to the shared logic for
this (which isn't trivial, but is *much* simpler than the DFS it
replaces!), the new code involves putting newly created RefSCCs when
deleting a reference edge into the cached list in the correct way, and
to re-formulate the iterator to be stable and effective even in the face
of these kinds of updates.

I've updated the unittests for the LazyCallGraph to re-iterate the
postorder sequence and verify that this all works. We even check for
using alternating iterators to trigger the lazy formation of RefSCCs
after mutation has occured.

It's worth noting that there are a reasonable number of likely
simplifications we can make past this. It isn't clear that we need to
keep the "LeafRefSCCs" around any more. But I've not removed that mostly
because I want this to be a more isolated change.

Differential Revision: https://reviews.llvm.org/D24219

llvm-svn: 281716
2016-09-16 10:20:17 +00:00
Sriraman Tallam
0bea49c555 [PM] Port CFGViewer and CFGPrinter to the new Pass Manager
Differential Revision: https://reviews.llvm.org/D24592

llvm-svn: 281640
2016-09-15 18:35:27 +00:00
Wei Mi
0092b54e43 Add some shortcuts in LazyValueInfo to reduce compile time of Correlated Value Propagation.
The patch is to partially fix PR10584. Correlated Value Propagation queries LVI
to check non-null for pointer params of each callsite. If we know the def of
param is an alloca instruction, we know it is non-null and can return early from
LVI. Similarly, CVP queries LVI to check whether pointer for each mem access is
constant. If the def of the pointer is an alloca instruction, we know it is not
a constant pointer. These shortcuts can reduce the cost of CVP significantly.

Differential Revision: https://reviews.llvm.org/D18066

llvm-svn: 281586
2016-09-15 06:28:34 +00:00
Wei Mi
c1cf1864f4 Create a getelementptr instead of sub expr for ValueOffsetPair if the
value is a pointer.

This patch is to fix PR30213. When expanding an expr based on ValueOffsetPair,
if the value is of pointer type, we can only create a getelementptr instead
of sub expr.

Differential Revision: https://reviews.llvm.org/D24088

llvm-svn: 281439
2016-09-14 04:39:50 +00:00
Andrea Di Biagio
5f72d10163 [ConstantFold] Improve the bitcast folding logic for constant vectors.
The constant folder didn't know how to always fold bitcasts of constant integer
vectors. In particular, it was unable to handle the case where a constant vector
had some undef elements, and the resulting (i.e. bitcasted) vector type had more
elements than the original vector type.

Example:
  %cast = bitcast <2 x i64><i64 undef, i64 2> to <4 x i32>

On a little endian target, %cast could have been folded to:
  <4 x i32><i32 undef, i32 undef, i32 2, i32 0>

This patch improves the folding logic by teaching how to correctly propagate
undef elements in the folded vector.

Differential Revision: https://reviews.llvm.org/D24301

llvm-svn: 281343
2016-09-13 14:50:47 +00:00
Philip Reames
d1d1d7910e [LVI] Complete the abstract of the cache layer [NFCI]
Convert the previous introduced is-a relationship between the LVICache and LVIImple clases into a has-a relationship and hide all the implementation details of the cache from the lazy query layer.

The only slightly concerning change here is removing the addition of a queried block into the SeenBlock set in LVIImpl::getBlockValue.  As far as I can tell, this was effectively dead code.  I think it *used* to be the case that getCachedValueInfo wasn't const and might end up inserting elements in the cache during lookup.  That's no longer true and hasn't been for a while.  I did fixup the const usage to make that more obvious.

llvm-svn: 281272
2016-09-12 22:38:44 +00:00
Philip Reames
14f5e6cf51 [LVI] Sink a couple more cache manipulation routines into the cache itself [NFCI]
The only interesting bit here is the refactor of the handle callback and even that's pretty straight-forward.

llvm-svn: 281267
2016-09-12 22:03:36 +00:00
Philip Reames
c53c4f4788 [LVI] Abstract out the actual cache logic [NFCI]
Seperate the caching logic from the implementation of the lazy analysis.  For the moment, the lazy analysis impl has a is-a relationship with the cache; this will change to a has-a relationship shortly.  This was done as two steps merely to keep the changes simple and the diff understandable.

llvm-svn: 281266
2016-09-12 21:46:58 +00:00
Justin Lebar
2c71fc8bb7 Add handling of !invariant.load to PropagateMetadata.
Summary:
This will let e.g. the load/store vectorizer propagate this metadata
appropriately.

Reviewers: arsenm

Subscribers: tra, jholewinski, hfinkel, mzolotukhin

Differential Revision: https://reviews.llvm.org/D23479

llvm-svn: 281153
2016-09-11 01:39:08 +00:00
Dehao Chen
eb60ada52c Do not widen load for different variable in GVN.
Summary:
Widening load in GVN is too early because it will block other optimizations like PRE, LICM.

https://llvm.org/bugs/show_bug.cgi?id=29110

The SPECCPU2006 benchmark impact of this patch:

Reference: o2_nopatch
(1): o2_patched

           Benchmark             Base:Reference   (1)  
-------------------------------------------------------
spec/2006/fp/C++/444.namd                  25.2  -0.08%
spec/2006/fp/C++/447.dealII               45.92  +1.05%
spec/2006/fp/C++/450.soplex                41.7  -0.26%
spec/2006/fp/C++/453.povray               35.65  +1.68%
spec/2006/fp/C/433.milc                   23.79  +0.42%
spec/2006/fp/C/470.lbm                    41.88  -1.12%
spec/2006/fp/C/482.sphinx3                47.94  +1.67%
spec/2006/int/C++/471.omnetpp             22.46  -0.36%
spec/2006/int/C++/473.astar               21.19  +0.24%
spec/2006/int/C++/483.xalancbmk           36.09  -0.11%
spec/2006/int/C/400.perlbench             33.28  +1.35%
spec/2006/int/C/401.bzip2                 22.76  -0.04%
spec/2006/int/C/403.gcc                   32.36  +0.12%
spec/2006/int/C/429.mcf                   41.04  -0.41%
spec/2006/int/C/445.gobmk                 26.94  +0.04%
spec/2006/int/C/456.hmmer                  24.5  -0.20%
spec/2006/int/C/458.sjeng                    28  -0.46%
spec/2006/int/C/462.libquantum            55.25  +0.27%
spec/2006/int/C/464.h264ref               45.87  +0.72%

geometric mean                                   +0.23%

For most benchmarks, it's a wash, but we do see stable improvements on some benchmarks, e.g. 447,453,482,400.

Reviewers: davidxl, hfinkel, dberlin, sanjoy, reames

Subscribers: gberry, junbuml

Differential Revision: https://reviews.llvm.org/D24096

llvm-svn: 281074
2016-09-09 18:42:35 +00:00
Chandler Carruth
6a72044ca6 [LCG] Clean up and make NDEBUG verify calls more rigorous with
make_scope_exit now that we have that utility.

This makes the code much more clear and readable by isolating the check.
It also makes it easy to go through and make sure all the interesting
update routines have a start and end verify so we don't slowly let the
graph drift into an invalid state.

llvm-svn: 280619
2016-09-04 08:34:31 +00:00
Chandler Carruth
2f848e0e0f [LCG] A NFC refactoring to extract the logic for doing
a postorder-sequence based update after edge insertion into a generic
helper function.

This separates the SCC-specific logic into two fairly simple lambdas and
extracts the rest into a generic helper template function. I think this
is a net win on its own merits because it disentangles different pieces
of the algorithm. Now there is one place that does the two-step
partition to identify a set of newly connected components and at the
same time update the postorder sequence.

However, I'm also hoping to re-use this an upcoming patch to update
a cached post-order sequence of RefSCCs when doing the analogous update
to the RefSCC graph, and I don't want to have two copies.

The diff is quite messy but this really is just moving things around and
making types generic rather than specific.

llvm-svn: 280618
2016-09-04 08:34:24 +00:00
Andrea Di Biagio
c30204881e Simplify code a bit. No functional change intended.
We don't need to call `GetCompareTy(LHS)' every single time true or false is
returned from function SimplifyFCmpInst as suggested by Sanjay in review D24142.

llvm-svn: 280491
2016-09-02 15:55:25 +00:00
Andrea Di Biagio
12f309496c [instsimplify] Fix incorrect folding of an ordered fcmp with a vector of all NaN.
This patch fixes a crash caused by an incorrect folding of an ordered comparison
between a packed floating point vector and a splat vector of NaN.

An ordered comparison between a vector and a constant vector of NaN, should
always be folded into a constant vector where each element is i1 false.

Since revision 266175, SimplifyFCmpInst folds the ordered fcmp into a scalar
'false'. Later on, this would cause an assertion failure, since the value type
of the folded value doesn't match the expected value type of the uses of the
original instruction: "Assertion failed: New->getType() == getType() &&
"replaceAllUses of value with new value of different type!".

This patch fixes the issue and adds a test case to the already existing test
InstSimplify/floating-point-compares.ll.

Differential Revision: https://reviews.llvm.org/D24143

llvm-svn: 280488
2016-09-02 14:47:43 +00:00
Michael Zolotukhin
d2ab1fcb94 [LoopInfo] Add verification by recomputation.
Summary:
Current implementation of LI verifier isn't ideal and fails to detect
some cases when LI is incorrect. For instance, it checks that all
recorded loops are in a correct form, but it has no way to check if
there are no more other (unrecorded in LI) loops in the function. This
patch adds a way to detect such bugs.

Reviewers: chandlerc, sanjoy, hfinkel

Subscribers: llvm-commits, silvas, mzolotukhin

Differential Revision: https://reviews.llvm.org/D23437

llvm-svn: 280280
2016-08-31 19:26:19 +00:00
Chad Rosier
a07337bd30 Fix indent. NFC.
llvm-svn: 280270
2016-08-31 18:37:52 +00:00
Tim Shen
830cc5f6f2 s/static inline/static/ for headers I have changed in r279475. NFC.
llvm-svn: 280257
2016-08-31 16:48:13 +00:00
David Majnemer
d770d3137d [Loads] Properly populate the visited set in isDereferenceableAndAlignedPointer
There were paths where we wouldn't populate the visited set, causing us
to recurse forever if an SSA variable was defined in terms of itself.

This fixes PR30210.

llvm-svn: 280191
2016-08-31 03:22:32 +00:00
NAKAMURA Takumi
35c1fea391 Fixup r279618, instantiate *AnalysisManagerProxy<*AnalysisManager,LazyCallGraph::SCC>, instead of *AnalysisManagerProxy<*AnalysisManager,LazyCallGraph::SCC,LazyCallGraph&>, for PassID.
Or they were not instantiated as expected;

  llvm::InnerAnalysisManagerProxy<llvm::AnalysisManager<llvm::Function>, llvm::LazyCallGraph::SCC>::PassID
  llvm::InnerAnalysisManagerProxy<llvm::AnalysisManager<llvm::Function>, llvm::LazyCallGraph::SCC>::PassID

llvm-svn: 280105
2016-08-30 15:47:13 +00:00
Piotr Padlewski
f3f0f64ccf NFC: add early exit in ModuleSummaryAnalysis
Summary:
Changed this code because it was not very readable.
The one question that I got after changing it is, should we
count calls to intrinsics? We don't add them to caller summary,
so maybe we shouldn't also count them?

Reviewers: tejohnson, eraman, mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D23949

llvm-svn: 280036
2016-08-30 00:46:26 +00:00
Easwaran Raman
08d34d7c85 Fix a thinko in r278189.
llvm-svn: 280008
2016-08-29 20:45:51 +00:00
Elena Demikhovsky
eed0bb3d71 [Loop Vectorizer] Fixed memory confilict checks.
Fixed a bug in run-time checks for possible memory conflicts inside loop.
The bug is in Low <-> High boundaries calculation. The High boundary should be calculated as "last memory access pointer + element size".

Differential revision: https://reviews.llvm.org/D23176

llvm-svn: 279930
2016-08-28 08:53:53 +00:00
Adam Nemet
68515406cf [Inliner] Report when inlining fails because callee's def is unavailable
Summary:
This is obviously an interesting case because it may motivate code
restructuring or LTO.

Reporting this requires instantiation of ORE in the loop where the call
sites are first gathered.  I've checked compile-time
overhead *with* -Rpass-with-hotness and the worst slow-down was 6% in
mcf and quickly tailing off.  As before without -Rpass-with-hotness
there is no overhead.

Because this could be a pretty noisy diagnostics, it is currently
qualified as 'verbose'.  As of this patch, 'verbose' diagnostics are
only emitted with -Rpass-with-hotness, i.e. when the output is expected
to be filtered.

Reviewers: eraman, chandlerc, davidxl, hfinkel

Subscribers: tejohnson, Prazek, davide, llvm-commits

Differential Revision: https://reviews.llvm.org/D23415

llvm-svn: 279860
2016-08-26 20:21:05 +00:00
Bob Haarman
7dc400a765 limit the number of instructions per block examined by dead store elimination
Summary: Dead store elimination gets very expensive when large numbers of instructions need to be analyzed. This patch limits the number of instructions analyzed per store to the value of the memdep-block-scan-limit parameter (which defaults to 100). This resulted in no observed difference in performance of the generated code, and no change in the statistics for the dead store elimination pass, but improved compilation time on some files by more than an order of magnitude.

Reviewers: dexonsmith, bruno, george.burgess.iv, dberlin, reames, davidxl

Subscribers: davide, chandlerc, dberlin, davidxl, eraman, tejohnson, mbodart, llvm-commits

Differential Revision: https://reviews.llvm.org/D15537

llvm-svn: 279833
2016-08-26 16:34:27 +00:00
George Burgess IV
7056845909 Update a comment.
r279696, which changed `LLVM_CONSTEXPR AliasAttr` to `const AliasAttr`,
made this comment make less sense.

llvm-svn: 279699
2016-08-25 01:29:55 +00:00
George Burgess IV
dd6439368a Make some LLVM_CONSTEXPR variables const. NFC.
This patch changes LLVM_CONSTEXPR variable declarations to const
variable declarations, since LLVM_CONSTEXPR expands to nothing if the
current compiler doesn't support constexpr. In all of the changed
cases, it looks like the code intended the variable to be const instead
of sometimes-constexpr sometimes-not.

llvm-svn: 279696
2016-08-25 01:05:08 +00:00
Eugene Zelenko
5c80b0e4f8 Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes.
Differential revision: https://reviews.llvm.org/D23861

llvm-svn: 279695
2016-08-25 00:45:04 +00:00
Evgeny Stupachenko
a52e9dab82 The patch improves ValueTracking on left shift with nsw flag.
Summary:
The patch fixes PR28946.

Reviewers: majnemer, sanjoy

Differential Revision: http://reviews.llvm.org/D23296

From: Li Huang
llvm-svn: 279684
2016-08-24 23:01:33 +00:00
Chandler Carruth
94942f5c00 [PM] Introduce basic update capabilities to the new PM's CGSCC pass
manager, including both plumbing and logic to handle function pass
updates.

There are three fundamentally tied changes here:
1) Plumbing *some* mechanism for updating the CGSCC pass manager as the
   CG changes while passes are running.
2) Changing the CGSCC pass manager infrastructure to have support for
   the underlying graph to mutate mid-pass run.
3) Actually updating the CG after function passes run.

I can separate them if necessary, but I think its really useful to have
them together as the needs of #3 drove #2, and that in turn drove #1.

The plumbing technique is to extend the "run" method signature with
extra arguments. We provide the call graph that intrinsically is
available as it is the basis of the pass manager's IR units, and an
output parameter that records the results of updating the call graph
during an SCC passes's run. Note that "...UpdateResult" isn't a *great*
name here... suggestions very welcome.

I tried a pretty frustrating number of different data structures and such
for the innards of the update result. Every other one failed for one
reason or another. Sometimes I just couldn't keep the layers of
complexity right in my head. The thing that really worked was to just
directly provide access to the underlying structures used to walk the
call graph so that their updates could be informed by the *particular*
nature of the change to the graph.

The technique for how to make the pass management infrastructure cope
with mutating graphs was also something that took a really, really large
number of iterations to get to a place where I was happy. Here are some
of the considerations that drove the design:

- We operate at three levels within the infrastructure: RefSCC, SCC, and
  Node. In each case, we are working bottom up and so we want to
  continue to iterate on the "lowest" node as the graph changes. Look at
  how we iterate over nodes in an SCC running function passes as those
  function passes mutate the CG. We continue to iterate on the "lowest"
  SCC, which is the one that continues to contain the function just
  processed.

- The call graph structure re-uses SCCs (and RefSCCs) during mutation
  events for the *highest* entry in the resulting new subgraph, not the
  lowest. This means that it is necessary to continually update the
  current SCC or RefSCC as it shifts. This is really surprising and
  subtle, and took a long time for me to work out. I actually tried
  changing the call graph to provide the opposite behavior, and it
  breaks *EVERYTHING*. The graph update algorithms are really deeply
  tied to this particualr pattern.

- When SCCs or RefSCCs are split apart and refined and we continually
  re-pin our processing to the bottom one in the subgraph, we need to
  enqueue the newly formed SCCs and RefSCCs for subsequent processing.
  Queuing them presents a few challenges:
  1) SCCs and RefSCCs use wildly different iteration strategies at
     a high level. We end up needing to converge them on worklist
     approaches that can be extended in order to be able to handle the
     mutations.
  2) The order of the enqueuing need to remain bottom-up post-order so
     that we don't get surprising order of visitation for things like
     the inliner.
  3) We need the worklists to have set semantics so we don't duplicate
     things endlessly. We don't need a *persistent* set though because
     we always keep processing the bottom node!!!! This is super, super
     surprising to me and took a long time to convince myself this is
     correct, but I'm pretty sure it is... Once we sink down to the
     bottom node, we can't re-split out the same node in any way, and
     the postorder of the current queue is fixed and unchanging.
  4) We need to make sure that the "current" SCC or RefSCC actually gets
     enqueued here such that we re-visit it because we continue
     processing a *new*, *bottom* SCC/RefSCC.

- We also need the ability to *skip* SCCs and RefSCCs that get merged
  into a larger component. We even need the ability to skip *nodes* from
  an SCC that are no longer part of that SCC.

This led to the design you see in the patch which uses SetVector-based
worklists. The RefSCC worklist is always empty until an update occurs
and is just used to handle those RefSCCs created by updates as the
others don't even exist yet and are formed on-demand during the
bottom-up walk. The SCC worklist is pre-populated from the RefSCC, and
we push new SCCs onto it and blacklist existing SCCs on it to get the
desired processing.

We then *directly* update these when updating the call graph as I was
never able to find a satisfactory abstraction around the update
strategy.

Finally, we need to compute the updates for function passes. This is
mostly used as an initial customer of all the update mechanisms to drive
their design to at least cover some real set of use cases. There are
a bunch of interesting things that came out of doing this:

- It is really nice to do this a function at a time because that
  function is likely hot in the cache. This means we want even the
  function pass adaptor to support online updates to the call graph!

- To update the call graph after arbitrary function pass mutations is
  quite hard. We have to build a fairly comprehensive set of
  data structures and then process them. Fortunately, some of this code
  is related to the code for building the cal graph in the first place.
  Unfortunately, very little of it makes any sense to share because the
  nature of what we're doing is so very different. I've factored out the
  one part that made sense at least.

- We need to transfer these updates into the various structures for the
  CGSCC pass manager. Once those were more sanely worked out, this
  became relatively easier. But some of those needs necessitated changes
  to the LazyCallGraph interface to make it significantly easier to
  extract the changed SCCs from an update operation.

- We also need to update the CGSCC analysis manager as the shape of the
  graph changes. When an SCC is merged away we need to clear analyses
  associated with it from the analysis manager which we didn't have
  support for in the analysis manager infrsatructure. New SCCs are easy!
  But then we have the case that the original SCC has its shape changed
  but remains in the call graph. There we need to *invalidate* the
  analyses associated with it.

- We also need to invalidate analyses after we *finish* processing an
  SCC. But the analyses we need to invalidate here are *only those for
  the newly updated SCC*!!! Because we only continue processing the
  bottom SCC, if we split SCCs apart the original one gets invalidated
  once when its shape changes and is not processed farther so its
  analyses will be correct. It is the bottom SCC which continues being
  processed and needs to have the "normal" invalidation done based on
  the preserved analyses set.

All of this is mostly background and context for the changes here.

Many thanks to all the reviewers who helped here. Especially Sanjoy who
caught several interesting bugs in the graph algorithms, David, Sean,
and others who all helped with feedback.

Differential Revision: http://reviews.llvm.org/D21464

llvm-svn: 279618
2016-08-24 09:37:14 +00:00
David Majnemer
70561a7ec3 [ValueTracking] Use a function_ref to avoid multiple instantiations
No functional change intended, this should just be a code size
improvement.

llvm-svn: 279563
2016-08-23 20:52:00 +00:00
Sanjay Patel
af4e2d5037 [InstSimplify] allow icmp with constant folds for splat vectors, part 2
Completes the m_APInt changes for simplifyICmpWithConstant().

Other commits in this series:
https://reviews.llvm.org/rL279492
https://reviews.llvm.org/rL279530
https://reviews.llvm.org/rL279534
https://reviews.llvm.org/rL279538

llvm-svn: 279543
2016-08-23 18:00:51 +00:00
Sanjay Patel
16b202aa5c [InstSimplify] allow icmp with constant folds for splat vectors, part 1
llvm-svn: 279538
2016-08-23 17:30:56 +00:00
Sanjay Patel
9f87f9ae09 [InstSimplify] add helper function for SimplifyICmpInst(); NFCI
And add a FIXME because the helper excludes folds for vectors. It's
not clear yet how many of these are actually testable (and therefore
necessary?) because later analysis uses computeKnownBits and other
methods to catch many of these cases.

llvm-svn: 279492
2016-08-22 23:12:02 +00:00
Tim Shen
33e4d80307 [GraphTraits] Replace all NodeType usage with NodeRef
This should finish the GraphTraits migration.

Differential Revision: http://reviews.llvm.org/D23730

llvm-svn: 279475
2016-08-22 21:09:30 +00:00
Artur Pilipenko
7913289c6e Revert -r278267 [ValueTracking] An improvement to IR ValueTracking on Non-negative Integers
This change cause performance regression on MultiSource/Benchmarks/TSVC/Symbolics-flt/Symbolics-flt from LNT and some other bechmarks.

See https://reviews.llvm.org/D18777 for details.

llvm-svn: 279433
2016-08-22 13:14:07 +00:00
Tim Shen
823bde34b3 [GraphTraits] Make nodes_iterator dereference to NodeType*/NodeRef
Currently nodes_iterator may dereference to a NodeType* or a NodeType&. Make them all dereference to NodeType*, which is NodeRef later.

Differential Revision: https://reviews.llvm.org/D23704
Differential Revision: https://reviews.llvm.org/D23705

llvm-svn: 279326
2016-08-19 21:20:13 +00:00
Michael Kuperstein
2a0f74a4bf [AliasSetTracker] Degrade AliasSetTracker when may-alias sets get too large.
Repeated inserts into AliasSetTracker have quadratic behavior - inserting a
pointer into AST is linear, since it requires walking over all "may" alias
sets and running an alias check vs. every pointer in the set.

We can avoid this by tracking the total number of pointers in "may" sets,
and when that number exceeds a threshold, declare the tracker "saturated".
This lumps all pointers into a single "may" set that aliases every other
pointer.

(This is a stop-gap solution until we migrate to MemorySSA)

This fixes PR28832.
Differential Revision: https://reviews.llvm.org/D23432

llvm-svn: 279274
2016-08-19 17:05:22 +00:00
Chandler Carruth
f68dd1e089 [PM] Rework the new PM support for building the ModuleSummaryIndex to
directly produce the index as the value type result.

This requires making the index movable which is straightforward. It
greatly simplifies things by allowing us to completely avoid the builder
API and the layers of abstraction inherent there. Instead both pass
managers can directly construct these when run by value. They still
won't be constructed truly eagerly thanks to the optional in the legacy
PM. The code that directly builds the index can also just share a direct
function.

A notable change here is that the result type of the analysis for the
new PM is no longer a reference type. This was really problematic when
making changes to how we handle result types to make our interface
requirements *much* more strict and precise. But I think this is an
overall improvement.

Differential Revision: https://reviews.llvm.org/D23701

llvm-svn: 279216
2016-08-19 07:49:19 +00:00
Chandler Carruth
6e2f9dbe22 [Assumptions] Make collecting ephemeral values not quadratic in the
number of assume intrinsics.

The classical way to have a cache-friendly vector style container when
we need queue semantics for BFS instead of stack semantics for DFS is to
use an ever-growing vector and an index. Erasing from the front requires
O(size) work, and unless we expect the worklist to grow *very* large,
its probably cheaper to just grow and race down the list.

But that makes it more bad that we're putting the assume intrinsics in
this at all. We end up looking at the (by definition empty) use list to
see if they're ephemeral (when we've already put them in that set), etc.

Instead, directly populate the worklist with the operands when we mark
the assume intrinsics as ephemeral. Also, test the visited set *before*
putting things into the worklist so we don't accumulate the same value
in the list 100s of times.

It would be nice to use a set-vector for this but I think its useful to
test the set earlier to avoid repeatedly querying whether the same
instruction is safe to speculate.

Hopefully with these changes the number of values pushed onto the
worklist is smaller, and we avoid quadratic work by letting it grow as
necessary.

Differential Revision: https://reviews.llvm.org/D23396

llvm-svn: 279099
2016-08-18 17:51:24 +00:00
Hans Wennborg
2c5ccecba6 SCEV: Don't assert about non-SCEV-able value in isSCEVExprNeverPoison() (PR28932)
Differential Revision: https://reviews.llvm.org/D23594

llvm-svn: 278999
2016-08-17 22:50:18 +00:00
Justin Bogner
507d362929 Replace a few more "fall through" comments with LLVM_FALLTHROUGH
Follow up to r278902. I had missed "fall through", with a space.

llvm-svn: 278970
2016-08-17 20:30:52 +00:00
Tim Shen
6bdebc6a97 [GraphWriter] Change GraphWriter to use NodeRef in GraphTraits
Summary:
This is part of the "NodeType* -> NodeRef" migration. Notice that since
GraphWriter prints object address as identity, I added a static_assert on
NodeRef to be a pointer type.

Reviewers: dblaikie

Subscribers: llvm-commits, MatzeB

Differential Revision: https://reviews.llvm.org/D23580

llvm-svn: 278966
2016-08-17 20:07:29 +00:00
Jonas Paulsson
6ae041f182 [LoopStrenghtReduce] Refactoring and addition of a new target cost function.
Refactored so that a LSRUse owns its fixups, as oppsed to letting the
LSRInstance own them. This makes it easier to rate formulas for
LSRUses, since the fixups are available directly. The Offsets vector
has been removed since it was no longer necessary.

New target hook isFoldableMemAccessOffset(), which is used during formula
rating.

For SystemZ, this is useful to express that loads and stores with
float or vector types with a big/negative offset should be avoided in
loops. Without this, LSR will generate a lot of negative offsets that
would require extra instructions for loading the address.

Updated tests:
test/CodeGen/SystemZ/loop-01.ll

Reviewed by: Quentin Colombet and Ulrich Weigand.
https://reviews.llvm.org/D19152

llvm-svn: 278927
2016-08-17 13:24:19 +00:00
Justin Bogner
b5f5b0ef6d Replace "fallthrough" comments with LLVM_FALLTHROUGH
This is a mechanical change of comments in switches like fallthrough,
fall-through, or fall-thru to use the LLVM_FALLTHROUGH macro instead.

llvm-svn: 278902
2016-08-17 05:10:15 +00:00
Duncan P. N. Exon Smith
dbfd669893 ObjCARC: Don't increment or dereference end() when scanning args
When there's only one argument and it doesn't match one of the known
functions, return ARCInstKind::CallOrUser rather than falling through
to the two argument case.  The old behaviour both incremented past and
dereferenced end().

llvm-svn: 278881
2016-08-17 01:02:18 +00:00
Reid Kleckner
bffc0a3155 Revert "Enhance SCEV to compute the trip count for some loops with unknown stride."
This reverts commit r278731. It caused http://crbug.com/638314

llvm-svn: 278853
2016-08-16 21:02:04 +00:00
David Majnemer
4ca4b42bf0 [InstSimplify] Fold gep (gep V, C), (xor V, -1) to C-1
llvm-svn: 278779
2016-08-16 06:13:46 +00:00
Sanjoy Das
2169ec7470 Revert "[ValueTracking] Improve ValueTracking on left shift with nsw flag"
This reverts commit r278172.  It causes PR28946.

llvm-svn: 278740
2016-08-15 21:01:31 +00:00
David L Kreitzer
dbb1c574cf Enhance SCEV to compute the trip count for some loops with unknown stride.
Patch by Pankaj Chawla

Differential Revision: https://reviews.llvm.org/D22377

llvm-svn: 278731
2016-08-15 20:21:41 +00:00
David Majnemer
c1fea83220 [ScopedNoAliasAA] collectMDInDomain should be a free function
collectMDInDomain doesn't use any class members, making it a free
function is not a functional change.

llvm-svn: 278651
2016-08-15 03:56:06 +00:00
David Majnemer
c6ff9e44e5 [ScopedNoAliasAA] Only collect noalias nodes if we have alias.scope nodes
No functional change is intended.

llvm-svn: 278646
2016-08-15 02:23:50 +00:00
David Majnemer
034c8790fa [ScopedNoAliasAA] Replace !ScopeNodes.size() with ScopeNodes.empty()
No functional change is intended.

llvm-svn: 278645
2016-08-15 02:23:48 +00:00
David Majnemer
c243355212 Revert "[ScopedNoAliasAA] Remove an unneccesary set"
This reverts commit r278641.  I'm not sure why but this has upset the
multistage builders...

llvm-svn: 278644
2016-08-15 02:23:46 +00:00
David Majnemer
141d23b277 [ScopedNoAliasAA] Remove an unneccesary set
We are trying to prove that one group of operands is a subset of
another.  We did this by populating two Sets and determining that every
element within one was inside the other.

However, this is unnecessary.  We can simply construct a single set and
test if each operand is within it.

llvm-svn: 278641
2016-08-15 00:13:04 +00:00
Pete Cooper
6327dd4768 Constify ValueTracking. NFC.
Almost all of the method here are only analysing Value's as opposed to
mutating them.  Mark all of the easy ones as const.

llvm-svn: 278585
2016-08-13 01:05:32 +00:00
Eugene Zelenko
10633be3a7 Fix some Clang-tidy modernize-use-using and Include What You Use warnings.
Differential revision: https://reviews.llvm.org/D23478

llvm-svn: 278583
2016-08-13 00:50:41 +00:00
Ehsan Amiri
03643a9b1f [BasicAA] Avoid calling GetUnderlyingObject, when the result of a previous call can be reused.
Recursive calls to aliasCheck from alias[GEP|Select|PHI] may result in a second call to GetUnderlyingObject for a Value, whose underlying object is already computed. This patch ensures that in this situations, the underlying object is not computed again, and the result of the previous call is resued.

https://reviews.llvm.org/D22305

llvm-svn: 278519
2016-08-12 16:05:03 +00:00
Artur Pilipenko
595cd6dff6 [LVI] Take guards into account
Teach LVI to gather control dependant constraints from guards.

Reviewed By: sanjoy

Differential Revision: https://reviews.llvm.org/D23358

llvm-svn: 278518
2016-08-12 15:52:23 +00:00
Artur Pilipenko
ed70fe0f9d [LVI] Fix potential memory corruption in getValueFromCondition
Rewrite Visited[Cond] = getValueFromConditionImpl(..., Visited) statement which can lead to a memory corruption since getValueFromConditionImpl changes Visited map and invalidates the iterators.

llvm-svn: 278514
2016-08-12 15:08:15 +00:00
Teresa Johnson
3e48c70c20 [PM] Port ModuleSummaryIndex analysis to new pass manager
Summary:
Port the ModuleSummaryAnalysisWrapperPass to the new pass manager.
Use it in the ported BitcodeWriterPass (similar to how we use the
legacy ModuleSummaryAnalysisWrapperPass in the legacy WriteBitcodePass).

Also, pass the -module-summary opt flag through to the new pass
manager pipeline and through to the bitcode writer pass, and add
a test that uses it.

Reviewers: mehdi_amini

Subscribers: llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D23439

llvm-svn: 278508
2016-08-12 13:53:02 +00:00
Artur Pilipenko
bb0d3edfa5 [LVI] Take range metadata into account while calculating icmp condition constraints
Take range metadata into account for conditions like this:

%length = load i32, i32* %length_ptr, !range !{i32 0, i32 2147483647}
%cmp = icmp ult i32 %a, %length

This is a common pattern for range checks where the length of the array is dynamically loaded.

Reviewed By: sanjoy

Differential Revision: https://reviews.llvm.org/D23267

llvm-svn: 278496
2016-08-12 10:14:11 +00:00
Artur Pilipenko
62de453cec [LVI] Handle any predicate in comparisons like icmp <pred> (add Val, Offset), ...
Currently LVI can only gather value constraints from comparisons like:

* icmp <pred> Val, ...
* icmp ult (add Val, Offset), ...

In fact we can handle any predicate in latter comparisons.

Reviewed By: sanjoy

Differential Revision: https://reviews.llvm.org/D23357

llvm-svn: 278493
2016-08-12 10:05:11 +00:00
David Majnemer
95fedaaedc Use the range variant of transform instead of unpacking begin/end
No functionality change is intended.

llvm-svn: 278476
2016-08-12 04:32:42 +00:00
David Majnemer
9880e078f0 Use the range variant of remove_if instead of unpacking begin/end
No functionality change is intended.

llvm-svn: 278475
2016-08-12 04:32:37 +00:00
David Majnemer
319d420e44 Use the range variant of find/find_if instead of unpacking begin/end
If the result of the find is only used to compare against end(), just
use is_contained instead.

No functionality change is intended.

llvm-svn: 278469
2016-08-12 03:55:06 +00:00
Pete Cooper
fdd29b1f85 Refactor isValidAssumeForContext to reduce duplication and indentation. NFC.
This method had some duplicate code when we did or did not have a dom tree.  Refactor
it to remove the duplication, but also clean up the control flow to have less duplication.

llvm-svn: 278450
2016-08-12 01:00:15 +00:00
Xinliang David Li
58e24fa600 Add comment /NFC
llvm-svn: 278438
2016-08-11 23:09:56 +00:00
Pete Cooper
5b58e1879b Remove unnecessary extra version of isValidAssumeForContext. NFC.
There were 2 versions of this method.  A public one which takes a
const Instruction* and a private implementation which takes a mutable
Value* and casts to an Instruction*.

There was no need for the 2 versions as all callers pass a const Instruction*
and there was no need for a mutable pointer as we only do analysis here.

llvm-svn: 278434
2016-08-11 22:23:07 +00:00
David Majnemer
85242fb9f9 Use the range variant of find instead of unpacking begin/end
If the result of the find is only used to compare against end(), just
use is_contained instead.

No functionality change is intended.

llvm-svn: 278433
2016-08-11 22:21:41 +00:00
David Majnemer
5423e4bff5 Use range algorithms instead of unpacking begin/end
No functionality change is intended.

llvm-svn: 278417
2016-08-11 21:15:00 +00:00
Geoff Berry
0e7b9505a4 [SCEV] Update interface to handle SCEVExpander insert point motion.
Summary:
This is an extension of the fix in r271424.  That fix dealt with builder
insert points being moved by SCEV expansion, but only for the lifetime
of the expand call.  This change modifies the interface so that LSR can
safely call expand multiple times at the same insert point and do the
right thing if one of the expansions decides to move the original insert
point.

This is a fix for PR28719.

Reviewers: sanjoy

Subscribers: llvm-commits, mcrosier, mzolotukhin

Differential Revision: https://reviews.llvm.org/D23342

llvm-svn: 278413
2016-08-11 21:05:17 +00:00
Michael Kuperstein
838a77cab4 [AliasSetTracker] Delete dead code
Deletes unused remove() and containsPointer() interfaces. NFC.

Differential Revision: https://reviews.llvm.org/D23360

llvm-svn: 278365
2016-08-11 17:20:20 +00:00
Easwaran Raman
6747fed253 Make more fields of InlineParams Optional.
Differential revision: https://reviews.llvm.org/D23386

llvm-svn: 278312
2016-08-11 03:58:05 +00:00
Piotr Padlewski
08172a9e18 Changed sign of LastCallToStaticBouns
Summary:
I think it is much better this way.
When I firstly saw line:
  Cost += InlineConstants::LastCallToStaticBonus;
I though that this is a bug, because everywhere where the cost is being reduced
it is usuing -=.

Reviewers: eraman, tejohnson, mehdi_amini

Subscribers: llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D23222

llvm-svn: 278290
2016-08-10 21:15:22 +00:00
Andrew Kaylor
6c20f4ff06 [ValueTracking] An improvement to IR ValueTracking on Non-negative Integers
Patch by Li Huang

Differential Revision: https://reviews.llvm.org/D18777

llvm-svn: 278267
2016-08-10 18:47:19 +00:00
Artur Pilipenko
5f99d6c4a4 [LVI] Handle conditions in the form of (cond1 && cond2)
Teach LVI how to gather information from conditions in the form of (cond1 && cond2). Our out-of-tree front-end emits range checks in this form.

Reviewed By: sanjoy

Differential Revision: http://reviews.llvm.org/D23200

llvm-svn: 278231
2016-08-10 15:13:15 +00:00
Artur Pilipenko
a77e404d5d [LVI] NFC. Make getValueFromCondition return LVILatticeValue instead of changing reference argument
Instead of returning bool and setting LVILatticeValue reference argument return LVILattice value. Use overdefined value to denote the case when we didn't gather any information from the condition.

This change was separated from the review "[LVI] Handle conditions in the form of (cond1 && cond2)" (https://reviews.llvm.org/D23200#inline-199531). Once getValueFromCondition returns LVILatticeValue we can cache the result in Visited map.

llvm-svn: 278224
2016-08-10 13:38:07 +00:00
Artur Pilipenko
981d14fd94 [LVI] Relax the assertion about LVILatticeVal type in getConstantRange
The problem was triggered by my recent change in CVP (D23059). Current code expected that integer constants are represented by constantrange LVILatticeVal and never represented as LVILatticeVal with constant tag. That is true for ConstantInt constants, although ConstantExpr integer type constants are legally represented as constant LVILatticeVal.

This code fails with CVP change in:

@b = global i32 0, align 4
define void @test6(i32 %a) {
bb:
  %add = add i32 %a, ptrtoint (i32* @b to i32)
  ret void
}
Currently getConstantRange code is not executed by any of the upstream passes. I'm going to add a test case to test/Transforms/CorrelatedValuePropagation/add.ll once I resubmit the CVP change.

Reviewed By: sanjoy

Differential Revision: http://reviews.llvm.org/D23194

llvm-svn: 278217
2016-08-10 12:54:54 +00:00
Easwaran Raman
101580aa20 Do not directly use inline threshold cl options in cost analysis.
This adds an InlineParams struct which is populated from the command line options by getInlineParams and passed to getInlineCost for the call analyzer to use.

Differential revision: https://reviews.llvm.org/D22120

llvm-svn: 278189
2016-08-10 00:48:04 +00:00
Adam Nemet
975ffb9e11 [Inliner,OptDiag] Add hotness attribute to opt diagnostics
Summary:
The inliner not being a function pass requires the work-around of
generating the OptimizationRemarkEmitter and in turn BFI on demand.
This will go away after the new PM is ready.

BFI is only computed inside ORE if the user has requested hotness
information for optimization diagnostitics (-pass-remark-with-hotness at
the 'opt' level).  Thus there is no additional overhead without the
flag.

Reviewers: hfinkel, davidxl, eraman

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D22694

llvm-svn: 278185
2016-08-10 00:44:44 +00:00
Andrew Kaylor
3ceabfd932 [ValueTracking] Improve ValueTracking on left shift with nsw flag
Patch by Li Huang

Differential Revison: https://reviews.llvm.org/D23296

llvm-svn: 278172
2016-08-09 22:41:35 +00:00
Wei Mi
c62339e54c Fix the runtime error caused by "Use ValueOffsetPair to enhance value reuse during SCEV expansion".
The patch is to fix the bug in PR28705. It was caused by setting wrong return
value for SCEVExpander::findExistingExpansion. The return values of findExistingExpansion
have different meanings when the function is used in different ways so it is easy to make
mistake. The fix creates two new interfaces to replace SCEVExpander::findExistingExpansion,
and specifies where each interface is expected to be used.

Differential Revision: https://reviews.llvm.org/D22942

llvm-svn: 278161
2016-08-09 20:40:03 +00:00
Wei Mi
3da1a46d30 Recommit "Use ValueOffsetPair to enhance value reuse during SCEV expansion".
The fix for PR28705 will be committed consecutively.

In D12090, the ExprValueMap was added to reuse existing value during SCEV expansion.
However, const folding and sext/zext distribution can make the reuse still difficult.

A simplified case is: suppose we know S1 expands to V1 in ExprValueMap, and
  S1 = S2 + C_a
  S3 = S2 + C_b
where C_a and C_b are different SCEVConstants. Then we'd like to expand S3 as
V1 - C_a + C_b instead of expanding S2 literally. It is helpful when S2 is a
complex SCEV expr and S2 has no entry in ExprValueMap, which is usually caused
by the fact that S3 is generated from S1 after const folding.

In order to do that, we represent ExprValueMap as a mapping from SCEV to
ValueOffsetPair. We will save both S1->{V1, 0} and S2->{V1, C_a} into the
ExprValueMap when we create SCEV for V1. When S3 is expanded, it will first
expand S2 to V1 - C_a because of S2->{V1, C_a} in the map, then expand S3 to
V1 - C_a + C_b.

Differential Revision: https://reviews.llvm.org/D21313

llvm-svn: 278160
2016-08-09 20:37:50 +00:00
Anna Thomas
db9853f118 [AliasAnalysis] Treat invariant.start as read-memory
Summary:
We teach alias analysis that invariant.start is readonly.
This helps with GVN and memcopy optimizations that currently treat.
invariant.start as a clobber.
We need to treat this as readonly, so that DSE does not incorrectly
remove stores prior to the invariant.start

Reviewers: sanjoy, reames, majnemer, dberlin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D23214

llvm-svn: 278138
2016-08-09 17:18:05 +00:00
Artur Pilipenko
8501352578 [LVI] Make LVI smarter about comparisons with non-constants
Make LVI smarter about comparisons with a non-constant. For example, a s< b constraints a to be in [INT_MIN, INT_MAX) range. This is a part of https://llvm.org/bugs/show_bug.cgi?id=28620 fix.

Reviewed By: sanjoy

Differential Revision: https://reviews.llvm.org/D23205

llvm-svn: 278122
2016-08-09 14:50:08 +00:00
Artur Pilipenko
4ce2a7565d Revert 278107 which causes buildbot failures and in addition has wrong commit message
llvm-svn: 278109
2016-08-09 10:00:22 +00:00
Artur Pilipenko
6626464f9c Teach CorrelatedValuePropagation to mark adds as no wrap
Use LVI to prove that adds do not wrap. The change is motivated by https://llvm.org/bugs/show_bug.cgi?id=28620 bug and it's the first step to fix that problem.

Reviewed By: sanjoy

Differential Revision: http://reviews.llvm.org/D23059

llvm-svn: 278107
2016-08-09 09:41:34 +00:00