1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-24 21:42:54 +02:00
Commit Graph

8170 Commits

Author SHA1 Message Date
Michael Kuperstein
147f6c96a5 [LoopUnroll] First form LCSSA, then loop-simplify
Running non-LCSSA-preserving LoopSimplify followed by LCSSA on (roughly) the
same loop is incorrect, since LoopSimplify may break LCSSA arbitrarily higher
in the loop nest. Instead, run LCSSA first, and then run LCSSA-preserving
LoopSimplify on the result.

This fixes PR31718.

Differential Revision: https://reviews.llvm.org/D29055

llvm-svn: 292854
2017-01-23 23:45:42 +00:00
Sanjay Patel
24068611a3 [InstSimplify] add tests to show missing folds from 'icmp (add nsw)'; NFC
llvm-svn: 292841
2017-01-23 22:42:55 +00:00
Alexey Bataev
812b7aeabd [SLP] Additional test with extra args in horizontal reductions.
llvm-svn: 292821
2017-01-23 19:28:23 +00:00
Piotr Padlewski
57e245c0ad [MemorySSA] Add new tests for invariant.groups
Summary:
Next round of extra tests for MSSA.
I have a prototype invariant.group handling implementation
that fixes all the FIXMEs, and I think it will be
easier to see what is the difference if I firstly
post this, and then only fix fixits.

Reviewers: george.burgess.iv, dberlin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D29022

llvm-svn: 292797
2017-01-23 16:38:10 +00:00
Simon Pilgrim
8221416b9d [InstCombine][X86] Add MULDQ/MULUDQ constant folding support
llvm-svn: 292793
2017-01-23 15:22:59 +00:00
Simon Pilgrim
6a557145af [InstCombine][X86] MULDQ/MULUDQ undef -> zero
Match generic mul behaviour so that <X x i64> multiply and muldq/muludq pattern act the same

llvm-svn: 292784
2017-01-23 12:07:32 +00:00
Alexey Bataev
5481977857 [SLP] Additional test for SLP vectorizer with 31 reduction elements.
llvm-svn: 292783
2017-01-23 11:53:16 +00:00
Simon Pilgrim
2a29239f10 [InstCombine][SSE] Tests showing missed opportunities to constant fold PMULDQ/PMULUDQ
llvm-svn: 292782
2017-01-23 10:57:39 +00:00
Chandler Carruth
c508becaa2 This test apparently requires an x86 target and is failing on numerous
bots ever since d0k fixed the CHECK lines so that it did something at
all.

It isn't actually testing SCEV directly but LSR, so move it into LSR and
the x86-specific tree of tests that already exists there. Target
dependence is common and unavoidable with the current design of LSR.

llvm-svn: 292774
2017-01-23 08:33:29 +00:00
Chandler Carruth
8965188357 [PM] Replace the hard invalidate in JumpThreading for LVI with correct
invalidation of deleted functions in GlobalDCE.

This was always testing a bug really triggered in GlobalDCE. Right now
we have analyses with asserting value handles into IR. As long as those
remain, when *deleting* an IR unit, we cannot wait for the normal
invalidation scheme to kick in even though it was designed to work
correctly in the face of these kinds of deletions. Instead, the pass
needs to directly handle invalidating the analysis results pointing at
that IR unit.

I've tought the Inliner about this and this patch teaches GlobalDCE.
This will handle the asserting VH case in the existing test as well as
other issues of the same fundamental variety. I've moved the test into
the GlobalDCE directory and added a comment explaining what is going on.

Note that we cannot simply require LVI here because LVI is too lazy.

llvm-svn: 292773
2017-01-23 08:33:24 +00:00
Chandler Carruth
b8dbb7e277 [PM] Add a dedicated test case for the issue fixed in r292770.
While this is covered by a clang test case, we should have something
locally to LLVM that immediately checks the inliner doesn't leave
analyses to dangling IR bodies.

llvm-svn: 292772
2017-01-23 07:53:20 +00:00
Benjamin Kramer
91e0605f7f Fix some broken CHECK lines.
The colon is important.

llvm-svn: 292761
2017-01-22 20:28:56 +00:00
Chandler Carruth
98c14daece [PM] Fix a really nasty bug introduced when adding PGO support to the
new PM's inliner.

The bug happens when we refine an SCC after having computed a proxy for
the FunctionAnalysisManager, and then proceed to compute fresh analyses
for functions in the *new* SCC using the manager provided by the old
SCC's proxy. *And* when we manage to mutate a function in this new SCC
in a way that invalidates those analyses. This can be... challenging to
reproduce.

I've managed to contrive a set of functions that trigger this and added
a test case, but it is a bit brittle. I've directly checked that the
passes run in the expected ways to help avoid the test just becoming
silently irrelevant.

This gets the new PM back to passing the LLVM test suite after the PGO
improvements landed.

llvm-svn: 292757
2017-01-22 10:34:01 +00:00
Piotr Padlewski
c81d41b203 [MemorySSA] Remove deprecated comment from test
llvm-svn: 292733
2017-01-21 22:14:02 +00:00
Piotr Padlewski
6fab7f63a9 [MemorySSA] Fix invariant.group test and add new
Summary:
This test had a bug: !llvm.invariant.group instead
of !invariant.group.

Also add some new test for future development.
All tests passes, when MSSA will support invariant.group
only the lines with FIXIT should be changed.

Reviewers: dberlin, george.burgess.iv

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28969

llvm-svn: 292730
2017-01-21 21:56:56 +00:00
Sanjay Patel
91f1806bc1 [InstCombine] use m_APInt to allow ashr folds for vectors with splat constants
We may be able to assert that no shl-shl or lshr-lshr pairs ever get here
because we should have already handled those in foldShiftedShift().

llvm-svn: 292726
2017-01-21 17:59:59 +00:00
Sanjay Patel
2afe78e0f9 [InstCombine] add tests for ashr-ashr; NFC
llvm-svn: 292724
2017-01-21 17:43:06 +00:00
Chandler Carruth
55a970e801 [PM] Teach the loop PM to run LoopSimplify prior to the loop pipeline.
This adds the last remaining core feature of the loop pass pipeline in
the new PM and removes the last of the really egregious hacks in the
LICM tests.

Sadly, this requires really substantial changes in the unittests in
order to provide and maintain simplified loops. This is particularly
hard because for example LoopSimplify will try to fold undef branches to
an ideal direction and simplify the loop accordingly.

Differential Revision: https://reviews.llvm.org/D28766

llvm-svn: 292709
2017-01-21 03:48:51 +00:00
Anmol P. Paralkar
abe4552937 MergeFunctions: Preserve debug info in thunks, under option -mergefunc-preserve-debug-info
Summary:
Under option -mergefunc-preserve-debug-info we:
- Do not create a new function for a thunk.
- Retain the debug info for a thunk's parameters (and associated
  instructions for the debug info) from the entry block.
  Note: -debug will display the algorithm at work.
- Create debug-info for the call (to the shared implementation) made by
  a thunk and its return value.
- Erase the rest of the function, retaining the (minimally sized) entry
  block to create a thunk.
- Preserve a thunk's call site to point to the thunk even when both occur
  within the same translation unit, to aid debugability. Note that this
  behaviour differs from the underlying -mergefunc implementation which
  modifies the thunk's call site to point to the shared implementation
  when both occur within the same translation unit.

Reviewers: echristo, eeckstein, dblaikie, aprantl, friss

Reviewed By: aprantl

Subscribers: davide, fhahn, jfb, mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D28075

llvm-svn: 292702
2017-01-21 02:02:56 +00:00
Justin Lebar
1a976ec6c1 [ConstantFold] Remove test checking that we don't constant-fold sqrt(-2).
This depended on libm's errno behavior (we constant fold iff libm's
sqrt(-2) does not set errno) and was breaking on mac.

llvm-svn: 292701
2017-01-21 02:02:27 +00:00
Justin Lebar
22610cd767 [ConstantFolding] Constant-fold llvm.sqrt(x) like other intrinsics.
Summary:
Currently we return undef, but we're in the process of changing the
LangRef so that llvm.sqrt behaves like the other math intrinsics,
matching the return value of the standard libcall but not setting errno.

This change is legal even without the LangRef change because currently
calling llvm.sqrt(x) where x is negative is spec'ed to be UB.  But in
practice it's also safe because we're simply constant-folding fewer
inputs: Inputs >= -0 get constant-folded as before, but inputs < -0 now
aren't constant-folded, because ConstantFoldFP aborts if the host math
function raises an fp exception.

Reviewers: hfinkel, efriedma, sanjoy

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28929

llvm-svn: 292692
2017-01-21 00:59:57 +00:00
Sanjay Patel
15f61be538 [InstCombine] auto-generate checks; NFC
llvm-svn: 292682
2017-01-20 23:39:01 +00:00
Peter Collingbourne
b815fa1b9f LowerTypeTests: Simplify; always create SizeM1 with type IntPtrTy, move initialization out of if statement.
llvm-svn: 292674
2017-01-20 23:22:28 +00:00
Dehao Chen
458d75fcda Add indirect call promotion to SamplePGO
Summary: This patch adds metadata for indirect call promotion in the sample profile loader.

Reviewers: xur, davidxl, dnovillo

Reviewed By: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28923

llvm-svn: 292672
2017-01-20 22:56:07 +00:00
Easwaran Raman
9e1b62af44 Improve PGO support for the new inliner
This adds the following to the new PM based inliner in PGO mode:

* Use block frequency analysis to derive callsite's profile count and use
that to adjust thresholds of hot and cold callsites.

* Incrementally update the BFI of the caller after a callee gets inlined
into it. This incremental update is only within an invocation of the run
method - BFI is not preserved across calls to run.
Update the function entry count of the callee after inlining it into a
caller.

* I've tuned the thresholds for the hot and cold callsites using a hacked
up version of the old inliner that explicitly computes BFI on a set of
internal benchmarks and spec. Once the new PM based pipeline stabilizes
(IIRC Chandler mentioned there are known issues) I'll benchmark this
again and adjust the thresholds if required.
Inliner PGO support.

Differential revision: https://reviews.llvm.org/D28331

llvm-svn: 292666
2017-01-20 22:44:04 +00:00
Sanjay Patel
fc6ca85023 [ValueTracking] recognize variations of 'clamp' to improve codegen (PR31693)
By enhancing value tracking, we allow an existing min/max canonicalization to
kick in and improve codegen for several targets that have min/max instructions.

Unfortunately, recognizing min/max in value tracking may cause us to hit
a hack in InstCombiner::visitICmpInst() more often:
http://lists.llvm.org/pipermail/llvm-dev/2017-January/109340.html
...but I'm hoping we can remove that soon.

Correctness proofs based on Alive:

Name: smaxmin
Pre: C1 < C2
%cmp2 = icmp slt i8 %x, C2
%min = select i1 %cmp2, i8 %x, i8 C2
%cmp3 = icmp slt i8 %x, C1
%r = select i1 %cmp3, i8 C1, i8 %min
=>
%cmp2 = icmp slt i8 %x, C2
%min = select i1 %cmp2, i8 %x, i8 C2
%cmp1 = icmp sgt i8 %min, C1
%r = select i1 %cmp1, i8 %min, i8 C1

Name: sminmax
Pre: C1 > C2
%cmp2 = icmp sgt i8 %x, C2
%max = select i1 %cmp2, i8 %x, i8 C2
%cmp3 = icmp sgt i8 %x, C1
%r = select i1 %cmp3, i8 C1, i8 %max
=>
%cmp2 = icmp sgt i8 %x, C2
%max = select i1 %cmp2, i8 %x, i8 C2
%cmp1 = icmp slt i8 %max, C1
%r = select i1 %cmp1, i8 %max, i8 C1

---------------------------------------- 
Optimization: smaxmin 
Done: 1 
Optimization is correct! 
---------------------------------------- 
Optimization: sminmax 
Done: 1 
Optimization is correct!



Name: umaxmin
Pre: C1 u< C2
%cmp2 = icmp ult i8 %x, C2
%min = select i1 %cmp2, i8 %x, i8 C2
%cmp3 = icmp ult i8 %x, C1
%r = select i1 %cmp3, i8 C1, i8 %min
=>
%cmp2 = icmp ult i8 %x, C2
%min = select i1 %cmp2, i8 %x, i8 C2
%cmp1 = icmp ugt i8 %min, C1
%r = select i1 %cmp1, i8 %min, i8 C1

Name: uminmax
Pre: C1 u> C2
%cmp2 = icmp ugt i8 %x, C2
%max = select i1 %cmp2, i8 %x, i8 C2
%cmp3 = icmp ugt i8 %x, C1
%r = select i1 %cmp3, i8 C1, i8 %max
=>
%cmp2 = icmp ugt i8 %x, C2
%max = select i1 %cmp2, i8 %x, i8 C2
%cmp1 = icmp ult i8 %max, C1
%r = select i1 %cmp1, i8 %max, i8 C1

---------------------------------------- 
Optimization: umaxmin 
Done: 1 
Optimization is correct! 
---------------------------------------- 
Optimization: uminmax 
Done: 1 
Optimization is correct! 
 

llvm-svn: 292660
2017-01-20 22:18:47 +00:00
Sanjay Patel
cbf458f9f0 [InstCombine] add tests to show missed canonicalization of min/max; NFC
Unfortunately, recognizing these in value tracking may cause us to hit
a hack in InstCombiner::visitICmpInst() more often:
http://lists.llvm.org/pipermail/llvm-dev/2017-January/109340.html

...but besides being the obviously Right Thing To Do, there's a clear 
codegen win from identifying these patterns for several targets.

llvm-svn: 292655
2017-01-20 21:49:41 +00:00
Peter Collingbourne
d621a84bb5 LowerTypeTests: Implement importing of type identifiers.
To import a type identifier we read the summary and create external
references to the symbols defined when exporting.

Differential Revision: https://reviews.llvm.org/D28546

llvm-svn: 292654
2017-01-20 21:49:34 +00:00
Daniel Berlin
b36b0f3e75 NewGVN: Remove pr31686.ll, it is tested by pr31594.ll, which is much smaller and simpler
llvm-svn: 292649
2017-01-20 21:04:58 +00:00
Daniel Berlin
99591fa12d NewGVN: Fix PR 31686 and PR 31698 by rewriting store leader handling.
Summary:

This rewrites store expression/leader handling.  We no longer use the
value operand as the leader, instead, we store it separately.  We also
now store the stored value as part of the expression, and compare it
when comparing stores for equality.  This enables us to get rid of a
bunch of our previous hacks and machinations, as the existing
machinery takes care of everything *except* updating the stored value
on classes.  The only time we have to update it is if the storecount
goes to 0, and when we do, we destroy it.

Since we no longer use the value operand as the leader, during elimination, we have to use the value operand.  Doing this also fixes a bunch of store forwarding cases we were missing.

Any value operand we use is guaranteed to either be updated by previous eliminations, or minimized by future ones.

(IE the fact that we don't use the most dominating value operand when it's not a constant does not affect anything).

Sadly, this change also exposes that we didn't pay attention to the
output of the pr31594.ll test, as it also very clearly exposes the
same store leader bug we are fixing here.

(I added pr31682.ll anyway, but maybe we think that's too large to be useful)

On the plus side, propagate-ir-flags.ll now passes due to the
corrected store forwarding.

This change was 3 stage'd on darwin and linux, with the full test-suite.

Reviewers:
davide
Subscribers:
llvm-commits

llvm-svn: 292648
2017-01-20 21:04:30 +00:00
Haicheng Wu
2e6ad7c3a0 Recommit "[InlineCost] Use TTI to check if GEP is free." #3
This is the third attemp to recommit r292526.

The original summary:

Currently, a GEP is considered free only if its indices are all constant.
TTI::getGEPCost() can give target-specific more accurate analysis. TTI is
already used for the cost of many other instructions.

llvm-svn: 292633
2017-01-20 18:51:22 +00:00
Alexey Bataev
f9f1030204 [SLP] Initial test for fix of PR31690.
llvm-svn: 292631
2017-01-20 18:40:21 +00:00
Simon Pilgrim
228925067e [InstCombine][X86] Add MULDQ/MULUDQ undef handling
llvm-svn: 292627
2017-01-20 18:20:30 +00:00
Alexey Bataev
9fc3631cf1 [SLP] A new test for horizontal vectorization for non-power-of-2
instructions.

llvm-svn: 292626
2017-01-20 18:04:29 +00:00
Simon Pilgrim
014fbda9bf [InstCombine][SSE] Tests showing missed opportunities to handle muldq/muludq with undef arguments
Fixed a typo in existing test names at the same time

llvm-svn: 292619
2017-01-20 17:06:38 +00:00
Haicheng Wu
2713d4f8e0 Revert "Recommit "[InlineCost] Use TTI to check if GEP is free." #2"
This reverts commit r292616 because the test case still has problem.

llvm-svn: 292618
2017-01-20 16:52:22 +00:00
Haicheng Wu
a5ea4d4f05 Recommit "[InlineCost] Use TTI to check if GEP is free." #2
This is the second attemp to recommit r292526.

The original summary:

Currently, a GEP is considered free only if its indices are all constant.
TTI::getGEPCost() can give target-specific more accurate analysis. TTI is
already used for the cost of many other instructions.

llvm-svn: 292616
2017-01-20 16:36:34 +00:00
Simon Pilgrim
81f5b7f9e4 [InstCombine][SSE] Tests showing missed opportunities to constant fold packss/packus
llvm-svn: 292609
2017-01-20 13:21:30 +00:00
Simon Pilgrim
78d53eb230 [InstCombine][SSE] Tests showing missed opportunities to handle packss/packus with undef arguments
llvm-svn: 292601
2017-01-20 11:28:07 +00:00
Simon Pilgrim
7f59f5cfb4 [InstCombine][SSE] Add DemandedElts support for PACKSS/PACKUS instructions
Simplify a packss/packus truncation based on the elements of the mask that are actually demanded.

Differential Revision: https://reviews.llvm.org/D28777

llvm-svn: 292591
2017-01-20 09:28:21 +00:00
Chandler Carruth
0e35152dbe [PM] Port LoopSink to the new pass manager.
Like several other loop passes (the vectorizer, etc) this pass doesn't
really fit the model of a loop pass. The critical distinction is that it
isn't intended to be pipelined together with other loop passes. I plan
to add some documentation to the loop pass manager to make this more
clear on that side.

LoopSink is also different because it doesn't really need a lot of the
infrastructure of our loop passes. For example, if there aren't loop
invariant instructions causing a preheader to exist, there is no need to
form a preheader. It also doesn't need LCSSA because this pass is
only involved in sinking invariant instructions from a preheader into
the loop, not reasoning about live-outs.

This allows some nice simplifications to the pass in the new PM where we
can directly walk the loops once without restructuring them.

Differential Revision: https://reviews.llvm.org/D28921

llvm-svn: 292589
2017-01-20 08:42:19 +00:00
Daniel Berlin
5b3e6ee286 NewGVN: Fix PR 31682, an overactive assert.
Part of the assert has been left active for further debugging.
The other part has been turned into a stat for tracking for the
moment.

llvm-svn: 292583
2017-01-20 06:38:41 +00:00
Mohammad Shahid
2d7822bf2e [SLP] Add a base test for jumbled store
Change-Id: I905ce08a02c76a6896dcfd9629547417c99adc4a
llvm-svn: 292581
2017-01-20 06:05:33 +00:00
Haicheng Wu
b21ae23526 Revert "Recommit "[InlineCost] Use TTI to check if GEP is free.""
This reverts commit r292570.  The test still has problem.

llvm-svn: 292572
2017-01-20 03:40:41 +00:00
Haicheng Wu
60b7f5f2cf Recommit "[InlineCost] Use TTI to check if GEP is free."
This recommits r292526 which is reverted in r292529 after fixing the test case.

The original summary:

Currently, a GEP is considered free only if its indices are all constant.
TTI::getGEPCost() can give target-specific more accurate analysis. TTI is
already used for the cost of many other instructions.

llvm-svn: 292570
2017-01-20 03:09:11 +00:00
Anna Thomas
aaa1cad9b5 [AliasAnalysis] Fences do not modify constant memory location
Summary:
Fence instructions are currently marked as `ModRef` for all memory locations.

We can improve this for constant memory locations (such as constant globals),
since fence instructions cannot modify these locations.

This helps us to forward constant loads across fences (added test case in GVN).
There were no changes in behaviour for similar test cases in early-cse and licm.

Reviewers: dberlin, sanjoy, reames

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28914

llvm-svn: 292546
2017-01-20 00:21:33 +00:00
Davide Italiano
65f612f852 [SCCP] Teach the pass how to handle div with overdefined operands.
This can prove that:

extern int f;
int g() {
    int x = 0;
    for (int i = 0; i < 365; ++i) {
        x /= f;
    }
    return x;
}

always returns zero. Thanks to Sanjoy for confirming this
transformation actually made sense (bugs are mine).

llvm-svn: 292531
2017-01-19 23:07:51 +00:00
Haicheng Wu
25c6947c15 Revert "[InlineCost] Use TTI to check if GEP is free."
This reverts commit r292526.  The test case has problem.

llvm-svn: 292529
2017-01-19 22:51:03 +00:00
Haicheng Wu
05949d3bbc [InlineCost] Use TTI to check if GEP is free.
Currently, a GEP is considered free only if its indices are all constant.
TTI::getGEPCost() can give target-specific more accurate analysis. TTI is
already used for the cost of many other instructions.

Differential Revision: https://reviews.llvm.org/D28693

llvm-svn: 292526
2017-01-19 22:28:34 +00:00
Xin Tong
8cb15cfc13 Improve what can be promoted in LICM.
Summary:
In case of non-alloca pointers, we check for whether it is a pointer
from malloc-like calls and it is not captured. In such case, we can
promote the pointer, as the caller will have no way to access this pointer
even if there is unwinding in middle of the loop.

Reviewers: hfinkel, sanjoy, reames, eli.friedman

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28834

llvm-svn: 292510
2017-01-19 19:31:40 +00:00