mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-23 21:13:02 +02:00
16716 Commits

Author SHA1 Message Date
Michael Kuperstein
1a4dfa3ce1 [LV] Sink tripcount query to where it's actually used. NFC.
llvm-svn: 290142
2016-12-19 22:47:52 +00:00
Sanjay Patel
11ebb18e12 [InstCombine] use commutative matcher for pattern with commutative operators
This is a case that was missed in:
https://reviews.llvm.org/rL290067
...and it would regress if we fix operand complexity (PR28296).

llvm-svn: 290127
2016-12-19 18:35:37 +00:00
Sanjay Patel
8dd6f085af [InstCombine] add folds for icmp (umin|umax X, Y), X
This is a follow-up to:
https://reviews.llvm.org/rL289855 (https://reviews.llvm.org/D27531)
https://reviews.llvm.org/rL290111

llvm-svn: 290118
2016-12-19 17:32:37 +00:00
Florian Hahn
f1e8664904 [LoopVersioning] Require loop-simplify form for loop versioning.
Summary:
Requiring loop-simplify form for loop versioning ensures that the
runtime check block always dominates the exit block.
    
This patch closes #30958 (https://llvm.org/bugs/show_bug.cgi?id=30958).

Reviewers: silviu.baranga, hfinkel, anemet, ashutosh.nema

Subscribers: ashutosh.nema, mzolotukhin, efriedma, hfinkel, llvm-commits

Differential Revision: https://reviews.llvm.org/D27469

llvm-svn: 290116
2016-12-19 17:13:37 +00:00
Sanjay Patel
534830670e [InstCombine] add folds for icmp (smax X, Y), X
This is a follow-up to:
https://reviews.llvm.org/rL289855 (D27531)

llvm-svn: 290111
2016-12-19 16:28:53 +00:00
Daniel Jasper
162ffcacd6 Revert @llvm.assume with operand bundles (r289755-r289757)
This creates non-linear behavior in the inliner (see more details in
r289755's commit thread).

llvm-svn: 290086
2016-12-19 08:22:17 +00:00
Sanjay Patel
f685720721 [InstCombine] use commutative matchers for patterns with commutative operators
Background/motivation - I was circling back around to:
https://llvm.org/bugs/show_bug.cgi?id=28296

I made a simple patch for that and noticed some regressions, so added test cases for
those with rL281055, and this is hopefully the minimal fix for just those cases.

But as you can see from the surrounding untouched folds, we are missing commuted patterns
all over the place, and of course there are no regression tests to cover any of those cases.

We could sprinkle "m_c_" dust all over this file and catch most of the missing folds, but 
then we still wouldn't have test coverage, and we'd still miss some fraction of commuted 
patterns because they require adjustments to the match order.

I'm aware of the concern about the potential compile-time performance impact of adding 
matches like this (currently being discussed on llvm-dev), but I don't think there's any
evidence yet to suggest that handling commutative pattern matching more thoroughly is not
a worthwhile goal of InstCombine.

Differential Revision: https://reviews.llvm.org/D24419

llvm-svn: 290067
2016-12-18 18:49:48 +00:00
Craig Topper
a4a4c54e49 [InstCombine] Simplify code slightly. NFC
llvm-svn: 290046
2016-12-17 18:10:04 +00:00
Evgeniy Stepanov
eaffeadc52 Revert "[GVNHoist] Move GVNHoist to function simplification part of pipeline."
This reverts r289696, which caused TSan perf regression.

See PR31382.

llvm-svn: 290030
2016-12-17 01:53:15 +00:00
Michael Kuperstein
37502008ed Preserve loop metadata when folding branches to a common destination.
Differential Revision: https://reviews.llvm.org/D27830

llvm-svn: 289992
2016-12-16 21:23:59 +00:00
Adrian Prantl
0ab6669d6d Revert "[IR] Remove the DIExpression field from DIGlobalVariable."
This reverts commit 289920 (again).
I forgot to implement a Bitcode upgrade for the case where a DIGlobalVariable
has no DIExpression. Unfortunately it is not possible to safely upgrade
these variables without adding a flag to the bitcode record indicating which
version they are.
My plan of record is to roll the planned follow-up patch that adds a
unit: field to DIGlobalVariable into this patch before recommitting.
This way we only need one Bitcode upgrade for both changes (with a
version flag in the bitcode record to safely distinguish the record
formats).

Sorry for the churn!

llvm-svn: 289982
2016-12-16 19:39:01 +00:00
Matthew Simpson
7fb206b6f3 Reapply "[LV] Enable vectorization of loops with conditional stores by default"
This patch reapplies r289863. The original patch was reverted because it
exposed a bug causing the loop vectorizer to crash in the Python runtime on
PPC. The underlying issue was fixed with r289958.

llvm-svn: 289975
2016-12-16 19:12:02 +00:00
Matthew Simpson
add3b6cc60 [LV] Don't attempt to type-shrink scalarized instructions
After r288909, instructions feeding predicated instructions may be scalarized
if profitable. Since these instructions will remain scalar, we shouldn't
attempt to type-shrink them. We should only truncate vector types to their
minimal bit widths. This bug was exposed by enabling the vectorization of loops
containing conditional stores by default.

llvm-svn: 289958
2016-12-16 16:52:35 +00:00
Chandler Carruth
9362cf3865 Revert r289863: [LV] Enable vectorization of loops with conditional
stores by default

This uncovers a crasher in the loop vectorizer on PPC when building the
Python runtime. I'll send the testcase to the review thread for the
original commit.

llvm-svn: 289934
2016-12-16 11:31:39 +00:00
Adrian Prantl
2345112c5b [IR] Remove the DIExpression field from DIGlobalVariable.
This patch implements PR31013 by introducing a
DIGlobalVariableExpression that holds a pair of DIGlobalVariable and
DIExpression.

Currently, DIGlobalVariable holds a DIExpression. This is not the
best way to model this:

(1) The DIGlobalVariable should describe the source level variable,
    not how to get to its location.

(2) It makes it unsafe/hard to update the expressions when we call
    replaceExpression on the DIGlobalVariable.

(3) It makes it impossible to represent a global variable that is in
    more than one location (e.g., a variable with multiple
    DW_OP_LLVM_fragment-s).  We also moved away from attaching the
    DIExpression to DILocalVariable for the same reasons.

This reapplies r289902 with additional testcase upgrades.

<rdar://problem/29250149>
https://llvm.org/bugs/show_bug.cgi?id=31013
Differential Revision: https://reviews.llvm.org/D26769

llvm-svn: 289920
2016-12-16 04:25:54 +00:00
Teresa Johnson
581c51fa17 [ThinLTO] Thin link efficiency: More efficient export list computation
Summary:
Instead of checking whether a global referenced by a function being
imported is defined in the same module, speculatively always add the
referenced globals to the module's export list. After all imports are
computed, for each module prune any not in its defined set from its
export list.

For a huge C++ app with aggressive importing thresholds, even with
D27687 we spent a lot of time invoking modulePath() from
exportGlobalInModule (modulePath() was still the 2nd hottest routine in
profile). The reason is that with comdat/linkonce the summary lists for
each GUID can be long. For the app in question, for example, we were
invoking exportGlobalInModule almost 2 million times, and we traversed
an average of 63 entries in the summary list each time.

This patch reduced the thin link time for the app by about 10% (on top
of D27687) when using aggressive importing thresholds, and about 3.5% on
average with default importing thresholds.

Reviewers: mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D27755

llvm-svn: 289918
2016-12-16 04:11:51 +00:00
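
The "add speculatively, prune afterwards" idea from the export list commit above (r289918) can be sketched with plain sets. This is a minimal illustration, not the LLVM implementation: the function name `compute_export_lists`, the tuple layout of `imported_functions`, and the use of small integers as GUIDs are all assumptions made for the example.

```python
from collections import defaultdict


def compute_export_lists(defined, imported_functions):
    """Toy model of the export list computation (hypothetical names).

    defined: module name -> set of GUIDs defined in that module.
    imported_functions: iterable of (exporting_module, func_guid, refs),
        one entry per imported function, where refs is the set of GUIDs
        that the function references.
    """
    exports = defaultdict(set)
    for exporting_module, func_guid, refs in imported_functions:
        exports[exporting_module].add(func_guid)
        # Speculative step: add every referenced global without checking
        # whether it is defined in the exporting module.
        exports[exporting_module].update(refs)
    # Pruning step: one set intersection per module, after all imports
    # have been computed.
    return {m: guids & defined.get(m, set()) for m, guids in exports.items()}


defined = {"a": {1, 2, 3}, "b": {4}}
imported = [("a", 1, {2, 4}), ("b", 4, {3})]
print(compute_export_lists(defined, imported))
```

The per-reference "is this defined here?" lookup is replaced by one intersection per module, which is the trade the commit message describes.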
Davide Italiano
c2513a6aa0 [SimplifyLibCalls] Use a lambda. NFCI.
llvm-svn: 289911
2016-12-16 02:28:38 +00:00
Adrian Prantl
daf4fef1f9 Revert "[IR] Remove the DIExpression field from DIGlobalVariable."
This reverts commit 289902 while investigating bot breakage.

llvm-svn: 289906
2016-12-16 01:00:30 +00:00
Peter Collingbourne
56f7d9cb53 Add missing library dep.
llvm-svn: 289903
2016-12-16 00:43:00 +00:00
Adrian Prantl
0eee52640f [IR] Remove the DIExpression field from DIGlobalVariable.
This patch implements PR31013 by introducing a
DIGlobalVariableExpression that holds a pair of DIGlobalVariable and
DIExpression.

Currently, DIGlobalVariable holds a DIExpression. This is not the
best way to model this:

(1) The DIGlobalVariable should describe the source level variable,
    not how to get to its location.

(2) It makes it unsafe/hard to update the expressions when we call
    replaceExpression on the DIGlobalVariable.

(3) It makes it impossible to represent a global variable that is in
    more than one location (e.g., a variable with multiple
    DW_OP_LLVM_fragment-s).  We also moved away from attaching the
    DIExpression to DILocalVariable for the same reasons.

<rdar://problem/29250149>
https://llvm.org/bugs/show_bug.cgi?id=31013
Differential Revision: https://reviews.llvm.org/D26769

llvm-svn: 289902
2016-12-16 00:36:43 +00:00
Peter Collingbourne
dec168cd58 IPO: Introduce ThinLTOBitcodeWriter pass.
This pass prepares a module containing type metadata for ThinLTO by splitting
it into regular and thin LTO parts if possible, and writing both parts to
a multi-module bitcode file. Modules that do not contain type metadata are
written unmodified as a single module.

All globals with type metadata are added to the regular LTO module, and
the rest are added to the thin LTO module.

Differential Revision: https://reviews.llvm.org/D27324

llvm-svn: 289899
2016-12-16 00:26:30 +00:00
Teresa Johnson
d81c3fc8d1 [ThinLTO] Thin link efficiency improvement: don't re-export globals (NFC)
Summary:
We were reinvoking exportGlobalInModule numerous times redundantly.
No need to re-export globals referenced by a global that was already
imported from its module. This resulted in a large speedup in the thin
link for a big application, particularly when importing aggressiveness
was cranked up.

Reviewers: mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D27687

llvm-svn: 289896
2016-12-15 23:50:06 +00:00
Davide Italiano
1b19bf8526 [SimplifyLibCalls] Lower fls() to llvm.ctlz().
Differential Revision:  https://reviews.llvm.org/D14590

llvm-svn: 289894
2016-12-15 23:45:11 +00:00
Davide Italiano
71113df20d [SimplifyLibCalls] Remove redundant folding logic for ffs().
Lowering to llvm.cttz() will result in constant folding anyway
if the argument to ffs is a constant. Pointed out by Eli for
fls() in D14590.

llvm-svn: 289888
2016-12-15 23:11:00 +00:00
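
The relationship between fls()/ffs() and the count-leading/trailing-zeros operations named in the two SimplifyLibCalls commits above can be sketched as follows. This is an arithmetic model only, assuming a 32-bit argument and a ctlz/cttz that return the bit width for a zero input; the exact IR the commits emit is not shown in this log, and the Python function names are stand-ins.

```python
WIDTH = 32  # assumed integer width for the example


def ctlz(x, width=WIDTH):
    """Count leading zeros of an unsigned value; returns width for x == 0."""
    return width - x.bit_length()


def cttz(x, width=WIDTH):
    """Count trailing zeros of an unsigned value; returns width for x == 0."""
    if x == 0:
        return width
    # x & -x isolates the lowest set bit.
    return (x & -x).bit_length() - 1


def fls(x, width=WIDTH):
    """Index (1-based) of the most significant set bit, 0 if x == 0."""
    return width - ctlz(x, width)


def ffs(x, width=WIDTH):
    """Index (1-based) of the least significant set bit, 0 if x == 0."""
    return cttz(x, width) + 1 if x != 0 else 0


print(fls(0x80000000), ffs(12))
```

Because both functions reduce to a bit count plus simple arithmetic, a constant argument folds through the generic constant folder, which is why a dedicated constant-folding path for ffs() is redundant.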
Teresa Johnson
f3799ec1f0 [ThinLTO] Revert part of r289843 that belonged to another patch.
The code change for D27687 accidentally got committed along with the
main change in r289843. Revert it temporarily, so that I can recommit it
along with its test as intended.

llvm-svn: 289875
2016-12-15 21:39:42 +00:00
Teresa Johnson
558bcc949a [ThinLTO] Remove stale comment (NFC)
This should have been removed with r288446.

llvm-svn: 289871
2016-12-15 20:53:31 +00:00
Teresa Johnson
bf0eb27ba0 [ThinLTO] Thin link efficiency: skip candidate added later with higher threshold (NFC)
Summary:
Thin link efficiency improvement. After adding an importing candidate to
the worklist we might have later added it again with a higher threshold.
Skip it when popped from the worklist if we recorded a higher threshold
than the current worklist entry; it will get processed again at the
higher threshold when that entry is popped.

This required adding the summary's GUID to the worklist, so that it can
be used to query the recorded highest threshold for it when we pop from the
worklist.

Reviewers: mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D27696

llvm-svn: 289867
2016-12-15 20:48:19 +00:00
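
The stale-entry check described in the worklist commit above (r289867) can be sketched as a small traversal. This is a toy model under stated assumptions, not the ThinLTO importer: the name `walk_imports`, the FIFO worklist, the string GUIDs, and the per-edge scaling factors are hypothetical, and factors are assumed to be at most 1.0 so that thresholds never grow around a cycle.

```python
from collections import deque


def walk_imports(call_graph, root, threshold):
    """Toy worklist walk that skips entries superseded by a higher threshold.

    call_graph: guid -> list of (callee_guid, factor), where factor scales
        the caller's threshold (assumed <= 1.0 so the walk terminates).
    Returns (processed, skipped) as lists of (guid, threshold) pairs.
    """
    best = {root: threshold}  # highest threshold recorded per GUID
    worklist = deque([(root, threshold)])
    processed, skipped = [], []
    while worklist:
        guid, entry_threshold = worklist.popleft()
        if best[guid] > entry_threshold:
            # A later entry with a higher threshold exists; this one is stale.
            skipped.append((guid, entry_threshold))
            continue
        processed.append((guid, entry_threshold))
        for callee, factor in call_graph.get(guid, []):
            new_threshold = entry_threshold * factor
            if new_threshold > best.get(callee, float("-inf")):
                best[callee] = new_threshold
                worklist.append((callee, new_threshold))
    return processed, skipped


graph = {"R": [("B", 1.0), ("A", 0.5)], "B": [("A", 1.0)]}
print(walk_imports(graph, "R", 100))
```

In the example, "A" is first queued at a decayed threshold through the direct edge from "R" and then re-queued at the full threshold through "B". Carrying the GUID in each worklist entry is what makes the lookup in `best` possible when the entry is popped.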
Matthew Simpson
765604b11b [LV] Enable vectorization of loops with conditional stores by default
This patch sets the default value of the "-enable-cond-stores-vec" command line
option to "true".

Differential Revision: https://reviews.llvm.org/D27814

llvm-svn: 289863
2016-12-15 20:11:05 +00:00
Andrea Di Biagio
9c41d674de [SimplifyCFG] Merge debug locations when hoisting an instruction from a then/else branch. NFC.
Now that a new API to merge debug locations has been committed at r289661 (see
review D26256 for more details), we can use it to "improve" the code added by
revision r280995.

Instead of nulling the debugloc of a commoned instruction, we use the 'merged'
debug location. At the moment, this is not a functional change, since
`DILocation::getMergedLocation()` is just a stub that always returns a
null location.

Differential Revision: https://reviews.llvm.org/D27804

llvm-svn: 289862
2016-12-15 20:01:26 +00:00
Sanjay Patel
c02483f504 [InstCombine] add folds for icmp (smin X, Y), X
Min/max canonicalization (r287585) exposes the fact that we're missing combines for min/max patterns. 
This patch won't solve the example that was attached to that thread, so something else still needs fixing.

The line between InstCombine and InstSimplify gets blurry here because sometimes the icmp instruction that
we want to fold to already exists, but sometimes it's the swapped form of what we want.

Corresponding changes for smax/umin/umax to follow.

Differential Revision: https://reviews.llvm.org/D27531

llvm-svn: 289855
2016-12-15 19:13:37 +00:00
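
The kind of equivalence behind folding `icmp (smin X, Y), X` can be checked exhaustively at a small bit width. The sketch below verifies four mathematical identities for signed min; it is an assumption that these correspond to the predicates the patch handles, since the commit message above does not list them. The helper names are hypothetical.

```python
from itertools import product

BITS = 4  # small width so the search is exhaustive


def to_signed(v, bits=BITS):
    """Interpret an unsigned bit pattern as a two's-complement integer."""
    return v - (1 << bits) if v & (1 << (bits - 1)) else v


def check_smin_identities(bits=BITS):
    """Return True if every identity holds for all pairs of bits-wide values."""
    for ux, uy in product(range(1 << bits), repeat=2):
        x, y = to_signed(ux, bits), to_signed(uy, bits)
        m = min(x, y)  # models smin(X, Y)
        ok = (
            ((m == x) == (x <= y))      # eq  behaves like sle X, Y
            and ((m != x) == (x > y))   # ne  behaves like sgt X, Y
            and ((m < x) == (x > y))    # slt behaves like sgt X, Y
            and ((m >= x) == (x <= y))  # sge behaves like sle X, Y
        )
        if not ok:
            return False
    return True


print(check_smin_identities())
```

In each case the comparison against the min collapses to a direct comparison of X and Y, so the min itself is no longer needed by the compare.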
Teresa Johnson
4afc73f0ba [ThinLTO] Ensure callees get hot threshold when first seen on cold path
This is split out from D27696, since it turned out to be a bug fix and
not part of the NFC efficiency change.

Keep the same adjusted (possibly decayed) threshold in both the worklist
and the ImportList. Otherwise if we encountered it first along a cold
path, the callee would be added to the worklist with a lower decayed
threshold than when it is later encountered along a hot path. But the
logic uses the threshold recorded in the ImportList entry to check if
we should re-add it, and without this patch the threshold recorded there
is the same along both paths so we don't re-add it. Using the
same possibly decayed threshold in the ImportList ensures we re-add it
later with the higher non-decayed hot path threshold.

llvm-svn: 289843
2016-12-15 18:21:01 +00:00
Robert Lougher
85d2c88291 Revert "[SimplifyCFG] In sinkLastInstruction correctly set debugloc of common inst"
Reverting as it is causing buildbot failures (address sanitizer).

llvm-svn: 289833
2016-12-15 16:59:13 +00:00
Robert Lougher
05e30a99f7 [SimplifyCFG] In sinkLastInstruction correctly set debugloc of "common" inst
Simplify CFG will try to sink the last instruction in a series of basic blocks,
creating a "common" instruction in the successor block (sinkLastInstruction).
When it does this, the debug location of the single instruction should be the
merged debug locations of the commoned instructions.

Differential Revision: https://reviews.llvm.org/D27590

llvm-svn: 289828
2016-12-15 16:17:53 +00:00
Ehsan Amiri
790f008233 [InstCombine] New opportunities for FoldAndOfICmp and FoldXorOfICmp
A number of new patterns for simplifying and/xor of icmp:

(icmp ne %x, 0) ^ (icmp ne %y, 0) => icmp ne %x, %y if the following is true:
1- (%x = and %a, %mask) and (%y = and %b, %mask)
2- %mask is a power of 2.

(icmp eq %x, 0) & (icmp ne %y, 0) => icmp ult %x, %y if the following is true:
1- (%x = and %a, %mask1) and (%y = and %b, %mask2)
2- Let %t be the smallest power of 2 where %mask1 & %t != 0. Then for any
   %s that is a power of 2 and %s & %mask2 != 0, we must have %s <= %t.
For example if %mask1 = 24 and %mask2 = 16, setting %s = 16 and %t = 8
violates condition (2) above. So this optimization cannot be applied.

llvm-svn: 289813
2016-12-15 12:25:13 +00:00
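
The first pattern in the commit above (the xor of two `icmp ne ..., 0` on values masked with the same power-of-2 mask) can be checked by brute force. This sketch covers only that first pattern and uses a 4-bit width chosen for the example; the function names are hypothetical.

```python
from itertools import product


def is_power_of_two(v):
    """True for 1, 2, 4, 8, ..."""
    return v > 0 and v & (v - 1) == 0


def xor_fold_holds(mask, bits=4):
    """Check (x != 0) ^ (y != 0) == (x != y) for x = a & mask, y = b & mask."""
    for a, b in product(range(1 << bits), repeat=2):
        x, y = a & mask, b & mask
        before = (x != 0) ^ (y != 0)  # original pair of compares
        after = x != y                # single folded compare
        if before != after:
            return False
    return True


for mask in range(1, 16):
    print(mask, is_power_of_two(mask), xor_fold_holds(mask))
```

With a power-of-2 mask, x and y can each only be 0 or the mask itself, so "exactly one is non-zero" is the same as "they differ". A mask with two set bits, such as 3, breaks this: x = 1 and y = 2 are both non-zero yet unequal.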
Craig Topper
589041bbd2 [AVX-512][InstCombine] Add masked scalar FMA intrinsics to SimplifyDemandedVectorElts.
llvm-svn: 289759
2016-12-15 03:49:45 +00:00
Hal Finkel
f224db75d2 Remove the AssumptionCache
After r289755, the AssumptionCache is no longer needed. Variables affected by
assumptions are now found by using the new operand-bundle-based scheme. This
new scheme is more computationally efficient, and also we need much less
code...

llvm-svn: 289756
2016-12-15 03:02:15 +00:00
Hal Finkel
502475d4f3 Make processing @llvm.assume more efficient by using operand bundles
There was an efficiency problem with how we processed @llvm.assume in
ValueTracking (and other places). The AssumptionCache tracked all of the
assumptions in a given function. In order to find assumptions relevant to
computing known bits, etc. we searched every assumption in the function. For
ValueTracking, that means that we did O(#assumes * #values) work in InstCombine
and other passes (with a constant factor that can be quite large because we'd
repeat this search at every level of recursion of the analysis).

Several of us discussed this situation at the last developers' meeting, and
this implements the discussed solution: Make the values that an assume might
affect operands of the assume itself. To avoid exposing this detail to
frontends and passes that need not worry about it, I've used the new
operand-bundle feature to add these extra call "operands" in a way that does
not affect the intrinsic's signature. I think this solution is relatively
clean. InstCombine adds these extra operands based on what ValueTracking, LVI,
etc. will need and then those passes need only search the users of the values
under consideration. This should fix the computational-complexity problem.

At this point, no passes depend on the AssumptionCache, and so I'll remove
that as a follow-up change.

Differential Revision: https://reviews.llvm.org/D27259

llvm-svn: 289755
2016-12-15 02:53:42 +00:00
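
The complexity argument in the commit above can be illustrated with a toy model: scanning every assumption for each queried value versus looking up only the assumptions attached to that value. This is not LLVM code; the dictionary of assume names to affected values and both function names are assumptions made for the example.

```python
from collections import defaultdict


def assumes_by_scan(assumes, value):
    """Old scheme (modelled): examine every assume in the function per query."""
    return [name for name, affected in assumes.items() if value in affected]


def build_user_index(assumes):
    """New scheme (modelled): record each assume under the values it affects,
    so a query only visits the assumes reachable from that value."""
    index = defaultdict(list)
    for name, affected in assumes.items():
        for v in affected:
            index[v].append(name)
    return index


assumes = {
    "assume0": {"%x", "%y"},
    "assume1": {"%y"},
    "assume2": {"%z"},
}
index = build_user_index(assumes)
print(assumes_by_scan(assumes, "%y"), index["%y"])
```

Both approaches return the same assumptions for a value, but the scan costs one pass over all assumes per query, which is where the O(#assumes * #values) term in the commit message comes from.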
Dehao Chen
4c1d21ec8c Only set profile summary when it was not preset.
Summary: SampleProfileLoader pass may be invoked twice by LTO. The 2nd pass should not append more summary info as it is already preset by the 1st pass.

Reviewers: eraman, davidxl

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D27733

llvm-svn: 289725
2016-12-14 22:06:49 +00:00
Dehao Chen
d24fcc64e6 Fix the bug in r289714 (NFC).
llvm-svn: 289724
2016-12-14 22:03:08 +00:00
Filipe Cabecinhas
0ed7767039 [asan] Don't skip instrumentation of masked load/store unless we've seen a full load/store on that pointer.
Reviewers: kcc, RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D27625

llvm-svn: 289718
2016-12-14 21:57:04 +00:00
Filipe Cabecinhas
978d341e1c [asan] Hook ClInstrumentWrites and ClInstrumentReads to masked operation instrumentation.
Reviewers: kcc

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D27548

llvm-svn: 289717
2016-12-14 21:56:59 +00:00
Dehao Chen
a6fafec460 Create SampleProfileLoader pass in llvm instead of clang
Summary: We used to create SampleProfileLoader pass in clang. This makes LTO/ThinLTO unable to add this pass in the linker plugin. This patch moves the SampleProfileLoader pass creation from clang to llvm pass manager builder.

Reviewers: tejohnson, davidxl, dnovillo

Subscribers: llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D27743

llvm-svn: 289714
2016-12-14 21:40:47 +00:00
Robert Lougher
2f625185c9 [InstCombine] Folding of a compare with RHS const should merge debug locations
If all the operands to a phi node are compares that have a RHS constant,
instcombine will try to pull them through the phi node, combining them into
a single operation. When it does this, the debug location of the new op
should be the merged debug locations of the phi node arguments.

Patch 8 of 8 for D26256.  Folding of a compare that has a RHS constant.

Differential Revision: https://reviews.llvm.org/D26256

llvm-svn: 289704
2016-12-14 20:27:22 +00:00
Robert Lougher
4f006e8400 [InstCombine] Folding of a binop with RHS const should merge the debug locations
If all the operands to a phi node are a binop with a RHS constant, instcombine
will try to pull them through the phi node, combining them into a single
operation. When it does this, the debug location of the new op should be the
merged debug locations of the phi node arguments.

Patch 7 of 8 for D26256.  Folding of a binop with RHS constant.

Differential Revision: https://reviews.llvm.org/D26256

llvm-svn: 289699
2016-12-14 20:07:49 +00:00
Geoff Berry
e0fc91243e [GVNHoist] Move GVNHoist to function simplification part of pipeline.
Summary:
Move GVNHoist to later in the optimization pipeline, specifically, to
the function simplification part of the pipeline.  The new pipeline
location allows GVNHoist to run on a function after its callees have
been inlined but before the function has been considered for inlining
into its callers, exposing more opportunities for hoisting.

Performance results on AArch64 kryo:
Improvements:
  Benchmarks/CoyoteBench/fftbench  -24.952%
  spec2006/bzip2                    -4.071%
  internal bmark                    -3.177%
  Benchmarks/PAQ8p/paq8p            -1.754%
  spec2000/perlbmk                  -1.328%
  spec2006/h264ref                  -1.140%

Regressions:
  internal bmark                    +1.818%
  Benchmarks/mafft/pairlocalalign   +1.084%

Reviewers: sebpop, dberlin, hiraditya

Subscribers: aemerson, mehdi_amini, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D27722

llvm-svn: 289696
2016-12-14 19:38:22 +00:00
Robert Lougher
af1853c43a [InstCombine] When folding casts through a phi node merge the debug locations
If all the operands to a phi node are a cast, instcombine will try to pull
them through the phi node, combining them into a single cast. When it does
this, the debug location of the new cast should be the merged debug locations
of the phi node arguments.

Patch 6 of 8 for D26256.  Folding of a cast operation.

Differential Revision: https://reviews.llvm.org/D26256

llvm-svn: 289693
2016-12-14 19:24:01 +00:00
Robert Lougher
66b5fbb84d [InstCombine] Folding loads through a phi node should merge the debug locations
If all the operands to a phi node are a load, instcombine will try to pull
them through the phi node, combining them into a single load. When it does
this, the debug location of the new load should be the merged debug locations
of the phi node arguments.

Patch 5 of 8 for D26256.  Folding of a load operation.

Differential Revision: https://reviews.llvm.org/D26256

llvm-svn: 289688
2016-12-14 19:02:14 +00:00
Robert Lougher
6783059cee [InstCombine] When folding GEP through a phi node merge the debug locations
If all the operands to a phi node are getelementptr, instcombine
will try to pull them through the phi node, combining them into a single
operation.  When it does this, the debug location of the new getelementptr
should be the merged debug locations of the phi node arguments.

Patch 4 of 8 for D26256.  Folding of a getelementptr operation.

Differential Revision: https://reviews.llvm.org/D26256

llvm-svn: 289684
2016-12-14 18:37:50 +00:00
Robert Lougher
ac0e540aa5 [InstCombine] Merge debug locations when folding through a phi node
If all the operands to a phi node are of the same operation, instcombine
will try to pull them through the phi node, combining them into a single
operation.  When it does this, the debug location of the operation should
be the merged debug locations of the phi node arguments.

Patch 3 of 8 for D26256.  Folding of a compare operation.

Differential Revision: https://reviews.llvm.org/D26256

llvm-svn: 289681
2016-12-14 18:14:57 +00:00
Robert Lougher
bf534d3173 [InstCombine] Merge debug locations when folding through a phi node
If all the operands to a phi node are of the same operation, instcombine
will try to pull them through the phi node, combining them into a single
operation.  When it does this, the debug location of the operation should
be the merged debug locations of the phi node arguments.

Patch 2 of 8 for D26256.  Folding of a binary operation.

Differential Revision: https://reviews.llvm.org/D26256

llvm-svn: 289679
2016-12-14 17:49:19 +00:00