1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-01-31 20:51:52 +01:00

1905 Commits

Author SHA1 Message Date
Florian Hahn
5067814331 [ConstantFold] Add some tests for binops with constants and undefs.
Precommit tests for D70169.
2019-11-17 21:10:45 +00:00
Rachel Craik
e95e1acfea [LoopCacheAnalysis]: Fix assertion failure during cost computation
Ensure the stride and trip count have the same type before multiplying them during reference cost calculation

Reviewed By: jdoefert

Differential Revision: https://reviews.llvm.org/D70192
2019-11-15 14:56:26 -05:00
Simon Pilgrim
65d10d3de7 Remove commented out CHECK-NEXT to try and appease llvm-clang-x86_64-expensive-checks-win buildbot 2019-11-13 14:59:12 +00:00
Craig Topper
e9ebc959cb [X86] Remove setOperationAction for FP_TO_SINT v8i16.
This is no longer needed after widening legalization as we
custom legalize v8i8 ourselves.

Added entries to the cost model, but bumped the cost slightly
to account for the truncate shuffle that wasn't costed before.
2019-11-12 22:45:52 -08:00
Alina Sbirlea
a76fef4322 [GlobalsAA] Reenable test. 2019-11-12 16:53:28 -08:00
Alina Sbirlea
6f9d9bcfe8 Temporarily disable test. 2019-11-12 15:57:51 -08:00
Alina Sbirlea
28990514f5 [GlobalsAA] Restrict ModRef result if any internal method has its address taken.
Summary:
If there are any internal methods whose address was taken, conclude there is nothing known in relation of any other internal method and a global.

Reviewers: nlopes, sanjoy.google

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69690
2019-11-12 14:24:56 -08:00
bmahjour
d3fb929e1b [DDG] Data Dependence Graph - Pi Block
Summary:
    This patch adds Pi Blocks to the DDG. A pi-block represents a group of DDG
    nodes that are part of a strongly-connected component of the graph.
    Replacing all the SCCs with pi-blocks results in an acyclic representation
    of the DDG. For example if we have:
       {a -> b}, {b -> c, d}, {c -> a}
    the cycle a -> b -> c -> a is abstracted into a pi-block "p" as follows:
       {p -> d} with "p" containing: {a -> b}, {b -> c}, {c -> a}
    In this implementation the edges between nodes that are part of the pi-block
    are preserved. The crossing edges (edges where one end of the edge is in the
    set of nodes belonging to an SCC and the other end is outside that set) are
    replaced with corresponding edges to/from the pi-block node instead.

    Authored By: bmahjour

    Reviewer: Meinersbur, fhahn, myhsu, xtian, dmgreen, kbarton, jdoerfert

    Reviewed By: Meinersbur

    Subscribers: ychen, arphaman, simoll, a.elovikov, mgorny, hiraditya, jfb, wuzish, llvm-commits, jsji, Whitney, etiotto, ppc-slack

    Tag: #llvm

    Differential Revision: https://reviews.llvm.org/D68827
2019-11-08 15:46:08 -05:00
Tim Renouf
9054503f8d [CostModel] Fixed isExtractSubvectorMask for undef index off end
ShuffleVectorInst::isExtractSubvectorMask, introduced in
  [CostModel] Add SK_ExtractSubvector handling to getInstructionThroughput (PR39368)

erroneously thought that
%340 = shufflevector <4 x float> %339, <4 x float> undef, <3 x i32> <i32 2, i32 3, i32 undef>

is a subvector extract, even though it goes off the end of the parent
vector with the undef index. That then caused an assert in
BasicTTIImplBase::getExtractSubvectorOverhead.

This commit fixes that, by not considering the above a subvector
extract.

Differential Revision: https://reviews.llvm.org/D70005

Change-Id: I87b8b00b24bef19ffc9a1b82ef4eca3b8a246eaf
2019-11-08 15:40:09 +00:00
dfukalov
51fe2040ec [AMDGPU] Fix bug introduced in 47a5c36b37f0
Summary: [AMDGPU] Fix bug introduced in 47a5c36b37f0

Reviewers: foad, arsenm

Reviewed By: arsenm

Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69915
2019-11-07 11:50:14 +03:00
Simon Pilgrim
09d86aa16b [CostModel][X86] Improve add vXi64 + fadd vXf64 reduction tests for SLM
As noted on D59710 we weren't handling the high costs of these operations on SLM.
2019-11-06 17:55:38 +00:00
Simon Pilgrim
eec64adc30 [CostModel][X86] Add add/fadd reduction tests for SLM 2019-11-06 17:04:22 +00:00
dfukalov
3c6fc97f38 [AMDGPU] Improve code size cost model (part 2)
Summary: Added estimations for ShuffleVector, some cast and arithmetic instructions

Reviewers: rampitec

Reviewed By: rampitec

Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, zzheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69629
2019-11-06 13:55:48 +03:00
Craig Topper
8db440d712 [X86] Lower the cost of avx512 horizontal bool and/or reductions to 2*log2(bitwidth)+1 for legal types.
This better represents the kshift+binop we'd get for each stage
before the final extract. Its likely we'll do even better by
doing a kmov and a cmp with a GPR, but this is a good start.

The default handling was costing a worst case single source
permute shuffle of the vector before the binop. This worst
case assumes the shuffle might have to be emulated with
extracts and inserts. But since we know we're doing a reduction
we can assume we'll get kshift lowering.

There's still some room for improvement here, but this is
much better than it was.
2019-11-04 22:58:04 -08:00
Johannes Doerfert
ff3180c323 [MustExecute] Forward iterate over conditional branches
Summary:
If a conditional branch is encountered we can try to find a join block
where the execution is known to continue. This means finding a suitable
block, e.g., the immediate post dominator of the conditional branch, and
proofing control will always reach that block.

This patch implements different techniques that work with and without
provided analysis.

Reviewers: uenoku, sstefan1, hfinkel

Subscribers: hiraditya, bollu, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68933
2019-10-31 00:06:43 -05:00
Jay Foad
3ac69e43e9 [ConstantFold] Fold extractelement of getelementptr
Summary:
Getelementptr has vector type if any of its operands are vectors
(the scalar operands being implicitly broadcast to all vector elements).
Extractelement applied to a vector getelementptr can be folded by
applying the extractelement in turn to all of the vector operands.

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69379
2019-10-28 18:32:39 +00:00
Roman Lebedev
b9a64cd7b6 [Analysis] Update Analysis/LazyValueAnalysis/lvi-after-jumpthreading.ll
I should have updated it in 1f665046fbf3b9d47a229714f689cd941f6f1216
but i didn't even realize those tests were there.
2019-10-23 18:39:10 +03:00
Philip Reames
4834c845c0 [SCEV] Simplify umin/max of zext and sext of the same value
This is a common idiom which arises after induction variables are widened, and we have two or more exit conditions.  Interestingly, we don't have instcombine or instsimplify support for this either.

Differential Revision: https://reviews.llvm.org/D69006

llvm-svn: 375349
2019-10-19 17:23:02 +00:00
Philip Reames
097009bc85 [Test] Precommit test for D69006
llvm-svn: 375190
2019-10-17 23:32:35 +00:00
Daniil Fukalov
1fb9c274aa [AMDGPU] Improve code size cost model
Summary:
Added estimation for zero size insertelement, extractelement
and llvm.fabs operators.
Updated inline/unroll parameters default values.

Reviewers: rampitec, arsenm

Reviewed By: arsenm

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68881

llvm-svn: 375109
2019-10-17 12:15:35 +00:00
Alina Sbirlea
ecebed8b85 [MemorySSA] Update for partial unswitch.
Update MSSA for blocks cloned when doing partial unswitching.
Enable additional testing with MSSA.
Resolves PR43641.

llvm-svn: 374850
2019-10-14 23:52:39 +00:00
Philip Reames
54645b47ac [Tests] Add a SCEV analysis test for llvm.widenable.condition
Mostly because we don't appear to have one and a prototype patch I just saw would have broken the example committed.

llvm-svn: 374835
2019-10-14 22:42:35 +00:00
Simon Pilgrim
80157bd480 [CostModel][X86] Add CTLZ scalar costs
Add specific scalar costs for CTLZ instructions, we can't discriminate between CTLZ and CTLZ_ZERO_UNDEF so we have to assume the worst. Given how BSR is often a microcoded nightmare on some older targets we might still be underestimating it.

For targets supporting LZCNT (Intel Haswell+ or AMD Fam10+), we provide overrides that assume 1cy costs.

llvm-svn: 374786
2019-10-14 16:30:17 +00:00
Simon Pilgrim
92f62b5161 [CostModel][X86] Add CTPOP scalar costs (PR43656)
Add specific scalar costs for ctpop instructions, these are based on the llvm-mca's SLM throughput numbers (the oldest model we have).

For targets supporting POPCNT, we provide overrides that assume 1cy costs.

llvm-svn: 374775
2019-10-14 14:07:43 +00:00
Simon Pilgrim
a95636568c [CostModel][X86] Improve sum reduction costs.
I can't see any notable differences in costs between SSE2 and SSE42 arches for FADD/ADD reduction, so I've lowered the target to just SSE2.

I've also added vXi8 sum reduction costs in line with the PSADBW codegen and discussions on PR42674.

llvm-svn: 374655
2019-10-12 13:21:50 +00:00
Alina Sbirlea
f51788e894 [MemorySSA] Update Phi simplification.
When simplifying a Phi to the unique value found incoming, check that
there wasn't a Phi already created to break a cycle. If so, remove it.
Resolves PR43541.

Some additional nits included.

llvm-svn: 374471
2019-10-10 23:27:21 +00:00
Alina Sbirlea
4f6544812b [MemorySSA] Additional handling of unreachable blocks.
Summary:
Whenever we get the previous definition, the assumption is that the
recursion starts ina  reachable block.
If the recursion starts in an unreachable block, we may recurse
indefinitely. Handle this case by returning LoE if the block is
unreachable.

Resolves PR43426.

Reviewers: george.burgess.iv

Subscribers: Prazek, sanjoy.google, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68809

llvm-svn: 374447
2019-10-10 20:43:06 +00:00
Alina Sbirlea
5f91d66ebc [MemorySSA] Make the use of moveAllAfterMergeBlocks consistent.
Summary:
The rule for the moveAllAfterMergeBlocks API si for all instructions
from `From` to have been moved to `To`, while keeping the CFG edges (and
block terminators) unchanged.
Update all the callsites for moveAllAfterMergeBlocks to follow this.

Pending follow-up: since the same behavior is needed everytime, merge
all callsites into one. The common denominator may be the call to
`MergeBlockIntoPredecessor`.

Resolves PR43569.

Reviewers: george.burgess.iv

Subscribers: Prazek, sanjoy.google, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68659

llvm-svn: 374177
2019-10-09 15:54:24 +00:00
Simon Pilgrim
bad15ead01 [CostModel][X86] Add tests for insertelement to non-immediate vector element indices
llvm-svn: 374161
2019-10-09 12:36:34 +00:00
Simon Pilgrim
862f561ce8 [CostModel][X86] Add tests for extractelement from non-immediate vector element indices
llvm-svn: 374160
2019-10-09 12:36:22 +00:00
Alina Sbirlea
e0abdc6f76 [MemorySSA] Don't hoist stores if interfering uses (as calls) exist.
llvm-svn: 373674
2019-10-03 22:20:04 +00:00
Alina Sbirlea
9a90d3875f [MemorySSA] Update Phi creation when inserting a Def.
MemoryPhis should be added in the IDF of the blocks newly gaining Defs.
This includes the blocks that gained a Phi and the block gaining a Def,
if the block did not have one before.
Resolves PR43427.

llvm-svn: 373505
2019-10-02 18:42:33 +00:00
Bardia Mahjour
a49fa03628 [DDG] Data Dependence Graph - Root Node
Summary:
This patch adds Root Node to the DDG. The purpose of the root node is to create a single entry node that allows graph walk iterators to iterate through all nodes of the graph, making sure that no node is left unvisited during a graph walk (eg. SCC or DFS). Once the DDG is fully constructed it will have exactly one root node. Every node in the graph is reachable from the root. The algorithm for connecting the root node is based on depth-first-search that keeps track of visited nodes to try to avoid creating unnecessary edges.

Authored By: bmahjour

Reviewer: Meinersbur, fhahn, myhsu, xtian, dmgreen, kbarton, jdoerfert

Reviewed By: Meinersbur

Subscribers: ychen, arphaman, simoll, a.elovikov, mgorny, hiraditya, jfb, wuzish, llvm-commits, jsji, Whitney, etiotto, ppc-slack

Tag: #llvm

Differential Revision: https://reviews.llvm.org/D67970

llvm-svn: 373386
2019-10-01 19:32:42 +00:00
Alina Sbirlea
df5f9ba943 [MemorySSA] Check for unreachable blocks when getting last definition.
If a single predecessor is found, still check if the block is
unreachable. The test that found this had a self loop unreachable block.
Resolves PR43493.

llvm-svn: 373383
2019-10-01 19:09:50 +00:00
Alina Sbirlea
ffbdeadbf4 [MemorySSA] Update last_access_in_block check.
The check for "was there an access in this block" should be: is the last
access in this block and is it not a newly inserted phi.
Resolves new test in PR43438.

Also fix a typo when simplifying trivial Phis to match the comment.

llvm-svn: 373380
2019-10-01 18:34:39 +00:00
Evandro Menezes
4b361fd46d [ConstantFolding] Fold constant calls to log2()
Somehow, folding calls to `log2()` with a constant was missing.

Differential revision: https://reviews.llvm.org/D67300

llvm-svn: 373262
2019-09-30 20:53:23 +00:00
Tim Northover
e0b9ad8443 Revert "[SCEV] add no wrap flag for SCEVAddExpr."
This reverts r366419 because the analysis performed is within the context of
the loop and it's only valid to add wrapping flags to "global" expressions if
they're always correct.

llvm-svn: 373184
2019-09-30 07:46:52 +00:00
Simon Pilgrim
de70987193 [CostModel][X86] Fix SLM <2 x i64> icmp costs
SLM is 2 x slower for <2 x i64> comparison ops than other vector types, we should account for this like we do for SLM <2 x i64> add/sub/mul costs.

This should remove some of the SLM codegen diffs in D43582

llvm-svn: 372954
2019-09-26 10:14:38 +00:00
Alina Sbirlea
ec066ddf96 [MemorySSA] Avoid adding Phis in the presence of unreachable blocks.
Summary:
If a block has all incoming values with the same MemoryAccess (ignoring
incoming values from unreachable blocks), then use that incoming
MemoryAccess and do not create a Phi in the first place.

Revert IDF work-around added in rL372673; it should not be required unless
the Def inserted is the first in its block.

The patch also cleans up a series of tests, added during the many
iterations on insertDef.

The patch also fixes PR43438.
The same issue that occurs in insertDef with "adding phis, hence the IDF of
Phis is needed", can also occur in fixupDefs: the `getPreviousRecursive`
call only adds Phis walking on the predecessor edges, which means there
may be the case of a Phi added walking the CFG "backwards" which
triggers the needs for an additional Phi in successor blocks.
Such Phis are added during fixupDefs only in the presence of unreachable
blocks.
Hence this highlights the need to avoid adding Phis in blocks with
unreachable predecessors in the first place.

Reviewers: george.burgess.iv

Subscribers: Prazek, sanjoy.google, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67995

llvm-svn: 372932
2019-09-25 23:24:39 +00:00
Alina Sbirlea
366a27df61 [MemorySSA] Update Phi insertion.
Summary:
MemoryPhis may be needed following a Def insertion inthe IDF of all the
new accesses added (phis + potentially a def). Ensure this also  occurs when
only the new MemoryPhis are the defining accesses.

Note: The need for computing IDF here is because of new Phis added with
edges incoming from unreachable code, Phis that had previously been
simplified. The preferred solution is to not reintroduce such Phis.
This patch is the needed fix while working on the preferred solution.

Reviewers: george.burgess.iv

Subscribers: Prazek, sanjoy.google, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67927

llvm-svn: 372673
2019-09-23 23:50:16 +00:00
Simon Pilgrim
68b7321f18 [Cost][X86] Add more missing vector truncation costs
The AVX512 cases still need some work to correct recognise the PMOV truncation cases.

llvm-svn: 372514
2019-09-22 16:46:15 +00:00
Simon Pilgrim
8adc9b60c4 [Cost][X86] Add v2i64 truncation costs
We are missing costs for a lot of truncation cases, I'm hoping to address all the 'zero cost' cases in trunc.ll

I thought this was a vector widening side effect, but even before this we had some interesting LV decisions (notably over indvars) being made due to these zero costs.

llvm-svn: 372498
2019-09-22 12:04:38 +00:00
Ulrich Weigand
8f9591eb21 [SystemZ] Support z15 processor name
The recently announced IBM z15 processor implements the architecture
already supported as "arch13" in LLVM.  This patch adds support for
"z15" as an alternate architecture name for arch13.

The patch also uses z15 in a number of places where we used arch13
as long as the official name was not yet announced.

llvm-svn: 372435
2019-09-20 23:04:45 +00:00
Shoaib Meenai
08db8a450f [Analysis] Allow -scalar-evolution-max-iterations more than once
At present, `-scalar-evolution-max-iterations` is a `cl::Optional`
option, which means it demands to be passed exactly zero or one times.
Our build system makes it pretty tricky to guarantee this. We often
accidentally pass the flag more than once (but always with the same
value) which results in an error, after which compilation fails:

```
clang (LLVM option parsing): for the -scalar-evolution-max-iterations option: may only occur zero or one times!
```

It seems reasonable to allow -scalar-evolution-max-iterations to be
passed more than once. Quoting the [[ http://llvm.org/docs/CommandLine.html#controlling-the-number-of-occurrences-required-and-allowed | documentation ]]:

> The cl::ZeroOrMore modifier ... indicates that your program will allow the option to be specified zero or more times.
> ...
> If an option is specified multiple times for an option of the cl::opt class, only the last value will be retained.

Original patch by: Enrico Bern Hardy Tanuwidjaja <etanuwid@fb.com>

Differential Revision: https://reviews.llvm.org/D67512

llvm-svn: 372346
2019-09-19 18:21:32 +00:00
Bardia Mahjour
12d9e4e269 Data Dependence Graph Basics
Summary:
This is the first patch in a series of patches that will implement data dependence graph in LLVM. Many of the ideas used in this implementation are based on the following paper:
D. J. Kuck, R. H. Kuhn, D. A. Padua, B. Leasure, and M. Wolfe (1981). DEPENDENCE GRAPHS AND COMPILER OPTIMIZATIONS.
This patch contains support for a basic DDGs containing only atomic nodes (one node for each instruction). The edges are two fold: def-use edges and memory-dependence edges.
The implementation takes a list of basic-blocks and only considers dependencies among instructions in those basic blocks. Any dependencies coming into or going out of instructions that do not belong to those basic blocks are ignored.

The algorithm for building the graph involves the following steps in order:

  1. For each instruction in the range of basic blocks to consider, create an atomic node in the resulting graph.
  2. For each node in the graph establish def-use edges to/from other nodes in the graph.
  3. For each pair of nodes containing memory instruction(s) create memory edges between them. This part of the algorithm goes through the instructions in lexicographical order and creates edges in reverse order if the sink of the dependence occurs before the source of it.

Authored By: bmahjour

Reviewer: Meinersbur, fhahn, myhsu, xtian, dmgreen, kbarton, jdoerfert

Reviewed By: Meinersbur, fhahn, myhsu

Subscribers: ychen, arphaman, simoll, a.elovikov, mgorny, hiraditya, jfb, wuzish, llvm-commits, jsji, Whitney, etiotto

Tag: #llvm

Differential Revision: https://reviews.llvm.org/D65350

llvm-svn: 372238
2019-09-18 17:43:45 +00:00
Jay Foad
6c1703b074 [SDA] Don't stop divergence propagation at the IPD.
Summary:
This fixes B42473 and B42706.

This patch makes the SDA propagate branch divergence until the end of the RPO traversal. Before, the SyncDependenceAnalysis propagated divergence only until the IPD in rpo order. RPO is incompatible with post dominance in the presence of loops. This made the SDA crash because blocks were missed in the propagation.

Reviewers: foad, nhaehnle

Reviewed By: foad

Subscribers: jvesely, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65274

llvm-svn: 372223
2019-09-18 13:40:22 +00:00
Bardia Mahjour
2d2a856f02 Revert "Data Dependence Graph Basics"
This reverts commit c98ec60993a7aa65073692b62f6d728b36e68ccd, which broke the sphinx-docs build.

llvm-svn: 372168
2019-09-17 19:22:01 +00:00
Bardia Mahjour
3f478f329a Data Dependence Graph Basics
Summary:
This is the first patch in a series of patches that will implement data dependence graph in LLVM. Many of the ideas used in this implementation are based on the following paper:
D. J. Kuck, R. H. Kuhn, D. A. Padua, B. Leasure, and M. Wolfe (1981). DEPENDENCE GRAPHS AND COMPILER OPTIMIZATIONS.
This patch contains support for a basic DDGs containing only atomic nodes (one node for each instruction). The edges are two fold: def-use edges and memory-dependence edges.
The implementation takes a list of basic-blocks and only considers dependencies among instructions in those basic blocks. Any dependencies coming into or going out of instructions that do not belong to those basic blocks are ignored.

The algorithm for building the graph involves the following steps in order:

  1. For each instruction in the range of basic blocks to consider, create an atomic node in the resulting graph.
  2. For each node in the graph establish def-use edges to/from other nodes in the graph.
  3. For each pair of nodes containing memory instruction(s) create memory edges between them. This part of the algorithm goes through the instructions in lexicographical order and creates edges in reverse order if the sink of the dependence occurs before the source of it.

Authored By: bmahjour

Reviewer: Meinersbur, fhahn, myhsu, xtian, dmgreen, kbarton, jdoerfert

Reviewed By: Meinersbur, fhahn, myhsu

Subscribers: ychen, arphaman, simoll, a.elovikov, mgorny, hiraditya, jfb, wuzish, llvm-commits, jsji, Whitney, etiotto

Tag: #llvm

Differential Revision: https://reviews.llvm.org/D65350

llvm-svn: 372162
2019-09-17 18:55:44 +00:00
Alina Sbirlea
d100299e59 [MemorySSA] Fix phi insertion when inserting a def.
Summary:
When inserting a Def, the current algorithm is walking edges backward
and inserting new Phis where needed. There may be additional Phis needed
in the IDF of the newly inserted Def and Phis.
Adding Phis in the IDF of the Def was added ina  previous patch, but we
may also need other Phis in the IDF of the newly added Phis.

Reviewers: george.burgess.iv

Subscribers: Prazek, sanjoy.google, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67637

llvm-svn: 372138
2019-09-17 16:33:35 +00:00
Alina Sbirlea
4e2e1fe44b [MemorySSA] Update MSSA for non-conventional AA.
Summary:
Regularly when moving an instruction that may not read or write memory,
the instruction is not modelled in MSSA, so not action is necessary.
For a non-conventional AA pipeline, MSSA needs to explicitly check when
creating accesses, so as to not model instructions that may not read and
write memory.

Reviewers: george.burgess.iv

Subscribers: Prazek, sanjoy.google, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67562

llvm-svn: 372137
2019-09-17 16:31:37 +00:00