Commit Graph

1242 Commits

Author SHA1 Message Date
Simon Pilgrim
868518caa8 [X86][AVX512] Add 512-bit vector ctlz costs + tests
llvm-svn: 303300
2017-05-17 21:02:18 +00:00
Simon Pilgrim
135af85db2 [X86][AVX512] Add 512-bit vector cttz costs + tests
llvm-svn: 303293
2017-05-17 20:22:54 +00:00
Simon Pilgrim
877d6ce5ff [X86] Split ctpop/ctlz/cttz cost tests
This will make it a lot easier to test all the permutations of AVX512.

llvm-svn: 303290
2017-05-17 19:57:20 +00:00
Simon Pilgrim
52e72fd51b [X86][AVX512] Add 512-bit vector bitreverse costs + tests
llvm-svn: 303283
2017-05-17 19:20:20 +00:00
Jonas Paulsson
4273fbf74e [SystemZ] Modelling of costs of divisions with a constant power of 2.
Such divisions will eventually be implemented with shifts, which should
be reflected in the cost function.
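
For illustration, a minimal standalone C++ sketch (a toy of mine, not the
SystemZ backend code) of the shift sequence such a division lowers to, which
is why it should be costed roughly like a shift:

  #include <cassert>
  #include <cstdint>

  // Signed divide by 2^3: add a bias of 7 for negative dividends, then do an
  // arithmetic shift right.  ">>" on a negative signed value is an arithmetic
  // shift on the usual targets and is guaranteed to be one since C++20.
  int32_t sdiv_by_8(int32_t x) {
    int32_t bias = (x >> 31) & 7; // 7 if x is negative, 0 otherwise
    return (x + bias) >> 3;       // equals x / 8 with truncation toward zero
  }

  int main() {
    for (int32_t x = -1000; x <= 1000; ++x)
      assert(sdiv_by_8(x) == x / 8);
    return 0;
  }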

Review: Ulrich Weigand
llvm-svn: 303254
2017-05-17 12:46:26 +00:00
Max Kazantsev
66886e6d12 [SCEV] Fix sorting order for AddRecExprs
The existing sorting order defined in CompareSCEVComplexity sorts AddRecExprs
by loop depth, but does not pay attention to dominance of loops. This can
lead us to the following buggy situation:

for (...) { // loop1
  op1 = {A,+,B}
}
for (...) { // loop2
  op2 = {A,+,B}
  S = add op1, op2
}

In this case there is no guarantee that op2 comes before op1 in the operand
list of S (the loop depth is the same, so they will be sorted just
lexicographically), so we can incorrectly treat S as a recurrence of loop1,
which is wrong.

This patch changes the sorting logic so that it places the dominated recs
before the dominating recs. This ensures that when we pick the first recurrence
in the operand order, it will be the bottom-most in terms of the dominator tree.
The attached test set includes tests that produced incorrect SCEV
estimations or crashed with the old logic.
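
As an illustration only (toy data structures, not LLVM's CompareSCEVComplexity
or its dominator tree API), the intended ordering can be sketched like this:

  #include <algorithm>
  #include <cassert>
  #include <vector>

  // Each add-rec carries the loop it belongs to; loops know their immediate
  // dominator among loops.  The rule: a rec whose loop is dominated by the
  // other rec's loop sorts first, so the first operand is always the
  // bottom-most rec in the dominator tree.
  struct Loop { const Loop *IDom; };

  static bool dominates(const Loop *A, const Loop *B) {
    for (const Loop *L = B; L; L = L->IDom)
      if (L == A)
        return true;
    return false;
  }

  struct AddRec { const char *Name; const Loop *L; };

  int main() {
    Loop Loop1{nullptr};  // loop1 comes first in the function
    Loop Loop2{&Loop1};   // so it dominates loop2
    std::vector<AddRec> Ops = {{"op1 = {A,+,B}<loop1>", &Loop1},
                               {"op2 = {A,+,B}<loop2>", &Loop2}};

    // Dominated recs before dominating recs.
    std::sort(Ops.begin(), Ops.end(), [](const AddRec &X, const AddRec &Y) {
      return X.L != Y.L && dominates(Y.L, X.L);
    });

    assert(Ops.front().L == &Loop2); // S = add op1, op2 now sees op2 first
    return 0;
  }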

Reviewers: sanjoy, reames, apilipenko, anna

Reviewed By: sanjoy

Subscribers: llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D33121

llvm-svn: 303148
2017-05-16 07:27:06 +00:00
Daniel Jasper
72029fd63e Add '#' to test regex that I forgot in r303025.
llvm-svn: 303034
2017-05-15 04:58:27 +00:00
Daniel Jasper
9ff7dd68c1 Fix two tests that weren't correctly copied.
One didn't correctly find the regex variable; the other still had a RUN
line for FNOBUILTIN checks, which weren't copied to the file.

llvm-svn: 303025
2017-05-14 22:07:50 +00:00
Simon Pilgrim
58549a36e1 [X86][AVX1] Account for cost of extract/insert of 256-bit shifts
llvm-svn: 303023
2017-05-14 20:52:11 +00:00
Simon Pilgrim
3d4f03d434 [X86][AVX2] Fix costs for v4i64 ashr by splat
llvm-svn: 303022
2017-05-14 20:25:42 +00:00
Simon Pilgrim
662344253d [X86][AVX1] Account for cost of extract/insert of 256-bit shifts by splat
llvm-svn: 303021
2017-05-14 20:02:34 +00:00
Simon Pilgrim
9b072aaee8 [X86][AVX1] Account for cost of extract/insert of 256-bit SDIV/UDIV by mul sequences
llvm-svn: 303017
2017-05-14 18:52:15 +00:00
Simon Pilgrim
bd07f4d3ed [X86][XOP] XOP's general v16i8 shifts will be used instead of v8i16 shift + mask.
Tweak cost model to match what lowering actually does.

llvm-svn: 303013
2017-05-14 17:59:46 +00:00
Simon Pilgrim
d440f01082 [X86][SSE] Account for cost of extract/insert of v32i8 vector shifts
llvm-svn: 303012
2017-05-14 17:36:07 +00:00
Simon Pilgrim
75f3dd7e94 [X86][XOP] Account for cost of extract/insert of 256-bit vector shifts
llvm-svn: 303010
2017-05-14 13:38:53 +00:00
Justin Bogner
9e6beb3722 AA: Use generic intrinsics for tests instead of target specific ones
Update a few tests to use llvm.masked.load/store instead of arm neon
vector loads and stores, and move the tests that are actually specific
to those arm intrinsics to their own files. This lets us mark the
tests that use target specific intrinsics as requiring those targets.

llvm-svn: 302972
2017-05-13 00:12:52 +00:00
Serguei Katkov
4f8d50be22 [BPI] Ignore remainder while distributing the remaining probability from unreachable
This is a follow-up patch for https://reviews.llvm.org/rL300440
to address a review comment.

To keep the implementation consistent with the other cases, we simply ignore
the remainder after distributing the remaining probability between the
reachable edges.

If we reduce the probability of some edges coming to unreachable blocks, we
should distribute the remaining part across the other edges coming to reachable
blocks so that the sum of all probabilities stays equal to one. If this
remaining part is not evenly divisible by the number of "reachable" edges, we
are left with this remainder. The remainder probability should be pretty small.
Other cases already ignore a total probability that is not exactly one, so we
do the same.
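
A small standalone sketch, with made-up numbers, of what "ignore the
remainder" means here:

  #include <cstdint>
  #include <iostream>

  int main() {
    // Toy fixed-point probabilities with a denominator of 1 << 20.
    const uint32_t Denom = 1u << 20;

    // Say 1000 units of probability were taken away from an unreachable edge
    // and 3 reachable edges are left to absorb them.
    uint32_t Freed = 1000, NumReachable = 3;

    uint32_t PerEdge = Freed / NumReachable;   // 333 units added to each edge
    uint32_t Remainder = Freed % NumReachable; // 1 unit is simply dropped

    std::cout << "each reachable edge gains " << PerEdge << "/" << Denom
              << "; the remainder " << Remainder << "/" << Denom
              << " is ignored, so the sum may fall slightly short of 1\n";
    return 0;
  }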

Reviewers: chandlerc, sanjoy, vsk, junbuml, reames
Reviewed By: reames
Subscribers: reames, llvm-commits
Differential Revision: https://reviews.llvm.org/D32124

llvm-svn: 302883
2017-05-12 07:50:06 +00:00
Matt Arsenault
46f6718d8f AMDGPU: Make some packed shuffles free
VOP3P instructions can encode access to either
half of the register.

llvm-svn: 302730
2017-05-10 21:29:33 +00:00
Matthew Simpson
383f076bfc [AArch64] Consider widening instructions in cost calculations
The AArch64 instruction set has a few "widening" instructions (e.g., uaddl,
saddl, uaddw, etc.) that take one or more doubleword operands and produce
quadword results. The operands are automatically sign- or zero-extended as
appropriate. However, in LLVM IR, these extends are explicit. This patch
updates TTI to consider these widening instructions as single operations whose
cost is attached to the arithmetic instruction. It marks extends that are part
of a widening operation "free" and applies a sub-target specified overhead
(zero by default) to the arithmetic instructions.
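
A heavily simplified, hypothetical sketch of that accounting (toy types, not
the real TTI interface): an extend folded into a widening operation is charged
zero, and the arithmetic instruction absorbs the sub-target overhead:

  #include <cassert>

  enum class Op { SExt, ZExt, Add };

  struct Inst {
    Op Opcode;
    bool FeedsWideningAdd;   // for extends: the sole user is a widening add
    bool OperandsAreExtends; // for adds: both operands are half-width extends
  };

  int cost(const Inst &I, int WidenOverhead = 0) {
    switch (I.Opcode) {
    case Op::SExt:
    case Op::ZExt:
      return I.FeedsWideningAdd ? 0 : 1; // folded extend is free
    case Op::Add:
      return I.OperandsAreExtends ? 1 + WidenOverhead : 1;
    }
    return 1;
  }

  int main() {
    // saddl-style pattern: add(sext(a), sext(b)) widening i16 lanes to i32
    // is costed as a single operation.
    Inst Ext{Op::SExt, true, false};
    Inst Add{Op::Add, false, true};
    assert(cost(Ext) + cost(Ext) + cost(Add) == 1);
    return 0;
  }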

Differential Revision: https://reviews.llvm.org/D32706

llvm-svn: 302582
2017-05-09 20:18:12 +00:00
Simon Pilgrim
1e7b4fdb56 [X86][AVX1] Improve 256-bit vector costs for integer unary intrinsics.
Account for subvector extraction/insertion; this helps prevent the vectorizers from selecting 256-bit vectors that will have to be split anyway on AVX1 targets.

llvm-svn: 302378
2017-05-07 20:58:55 +00:00
Michael Zolotukhin
bf99505718 [SCEV] createAddRecFromPHI: Optimize for the most common case.
Summary:
The existing implementation creates a symbolic SCEV expression every
time we analyze a phi node and then has to remove it when the analysis
is finished. This is very expensive, and in most cases it's also
unnecessary. According to the data I collected, ~60-70% of analyzed phi
nodes (measured on SPEC) have the following form:
  PN = phi(Start, OP(Self, Constant))
Handling such cases separately significantly speeds this up.

Reviewers: sanjoy, pete

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32663

llvm-svn: 302096
2017-05-03 23:53:38 +00:00
Elad Cohen
3908e15a8b Support arbitrary address space pointers in masked gather/scatter intrinsics.
Fixes PR31789 - when loop-vectorize tries to use these intrinsics for a
non-default address space pointer, we fail with a "Calling a function with a
bad signature!" assertion. This patch solves this by adding the 'vector of
pointers' argument as an overloaded type, which will determine the address
space.

Differential revision: https://reviews.llvm.org/D31490

llvm-svn: 302018
2017-05-03 12:28:54 +00:00
Sanjoy Das
ddb6f73838 Use WeakVH instead of WeakTrackingVH in AliasSetTracker's UnknownInsts
In cases where an instruction (a call site, say) is RAUW'ed with some
other value (this is possible via the `returned` attribute, for
instance), we want the slot in UnknownInsts to point to the original
Instruction we wanted to track, not the value it got replaced by.

Fixes PR32587.

This relands r301426.

llvm-svn: 301814
2017-05-01 17:07:56 +00:00
Sanjoy Das
7b2e8503a6 Rename isKnownNotFullPoison to programUndefinedIfPoison; NFC
Summary:
programUndefinedIfPoison makes more sense, given what the function
does; and I'm about to add a function with a name similar to
isKnownNotFullPoison (so do the rename to avoid confusion).

Reviewers: broune, majnemer, bjarke.roune

Reviewed By: broune

Subscribers: mcrosier, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D30444

llvm-svn: 301776
2017-04-30 19:41:19 +00:00
Sanjoy Das
732f091d68 Reverts commits r301424, r301425 and r301426
Commits were:

"Use WeakVH instead of WeakTrackingVH in AliasSetTracker's UnknownInsts"
"Add a new WeakVH value handle; NFC"
"Rename WeakVH to WeakTrackingVH; NFC"

The changes assumed pointers are 8-byte aligned on all architectures.

llvm-svn: 301429
2017-04-26 16:37:05 +00:00
Sanjoy Das
25cc2dce97 Use WeakVH instead of WeakTrackingVH in AliasSetTracker's UnknownInsts
Summary:
In cases where an instruction (a call site, say) is RAUW'ed with some
other value (this is possible via the `returned` attribute, amongst
other things), we want the slot in UnknownInsts to point to the
original Instruction we wanted to track, not the value it got replaced
by.

Fixes PR32587.

Reviewers: davide

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D32268

llvm-svn: 301426
2017-04-26 16:21:02 +00:00
Sanjoy Das
9611d0a199 [IVUsers] Don't bail out of normalizing non-affine add recs
Summary:
In a previous change I changed SCEV's normalization / denormalization
to work with non-affine add recs.  So the bailout in IVUsers can be
removed.

Reviewers: atrick, efriedma

Reviewed By: atrick

Subscribers: davide, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D32105

llvm-svn: 301298
2017-04-25 06:53:25 +00:00
Sanjoy Das
4248d2608d [SCEV] Fix exponential time complexity by caching
llvm-svn: 301149
2017-04-24 00:09:46 +00:00
Eli Friedman
2e82973f03 Revert r300746 (SCEV analysis for or instructions).
There have been multiple reports of this causing problems: a
compile-time explosion on the LLVM testsuite, and a stack
overflow for an opencl kernel.

llvm-svn: 300928
2017-04-20 23:59:05 +00:00
Eli Friedman
c5a8782468 [SCEV] Make SCEV or modeling more aggressive.
Use haveNoCommonBitsSet to figure out whether an "or" instruction
is equivalent to addition. This handles more cases than just
checking for a constant on the RHS.
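
The identity this relies on, checked by a short standalone program: when two
values share no set bits there are no carries, so "or" and "add" agree:

  #include <cassert>
  #include <cstdint>

  int main() {
    // If a and b have no common set bits, a | b == a + b, which is what lets
    // an 'or' be modeled as an addition.
    for (uint32_t a = 0; a < 256; ++a)
      for (uint32_t b = 0; b < 256; ++b)
        if ((a & b) == 0)
          assert((a | b) == a + b);
    return 0;
  }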

Differential Revision: https://reviews.llvm.org/D32239

llvm-svn: 300746
2017-04-19 20:19:58 +00:00
Matt Arsenault
b3625be5da AMDGPU: Change DivergenceAnalysis for function arguments
Stop assuming all functions are kernels.

llvm-svn: 300719
2017-04-19 17:42:34 +00:00
Serguei Katkov
4c27f4ad1e [BPI] Use metadata info before any other heuristics
Metadata is potentially more precise than any heuristic we use, so it makes
sense to use metadata first when it is available. However, it still makes sense
to check it against other strong heuristics, such as the unreachable one.
If an edge coming to an unreachable block has a higher probability than the
unreachable heuristic expects, we use the heuristic, and the remaining
probability is distributed equally among the other reachable blocks.

An example where metadata can be stronger than the unreachable heuristic is
as follows: suppose there are two branches, and for branch A the metadata says
that its probability is (0, 2^25), while for branch B the probability is (1, 2^25).
The expectation is that the first edge of B is hotter than the first edge of A,
because the first edge of A was never observed to execute.
If the first edge of A points to an unreachable block, then using the unreachable
heuristic we would set the probability for A to (1, 2^20), and now the edge of A
becomes hotter than the edge of B.
This is unexpected behavior.
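
The arithmetic of the example above, as a small standalone check (reading
(k, 2^n) as the fraction k/2^n):

  #include <cassert>

  int main() {
    // Profile metadata: branch A's first edge taken 0 of 2^25 times,
    // branch B's first edge taken 1 of 2^25 times.
    double ProbA_Metadata = 0.0 / (1 << 25);
    double ProbB_Metadata = 1.0 / (1 << 25);

    // The unreachable heuristic would assign A's first edge 1 / 2^20.
    double ProbA_Heuristic = 1.0 / (1 << 20);

    // With metadata kept, B's edge is the hotter one, as the profile says...
    assert(ProbA_Metadata < ProbB_Metadata);
    // ...but the heuristic would make A's edge look hotter than B's,
    // which is the surprising outcome this patch avoids.
    assert(ProbA_Heuristic > ProbB_Metadata);
    return 0;
  }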

This fixes the biggest part of https://bugs.llvm.org/show_bug.cgi?id=32214

Reviewers: sanjoy, junbuml, vsk, chandlerc

Reviewed By: chandlerc

Subscribers: llvm-commits, reames, davidxl

Differential Revision: https://reviews.llvm.org/D30631

llvm-svn: 300440
2017-04-17 04:33:04 +00:00
Brian Gesiak
ee1fc77541 [Analysis] Support bitreverse in -demanded-bits pass
Summary:
* Add a bitreverse case in the demanded bits analysis pass.
* Add tests for the bitreverse (and bswap) intrinsic in the
  demanded bits pass.
* Add a test case to the BDCE tests showing that manipulations of
  high-order bits are eliminated once the bits are reversed
  and then right-shifted (see the sketch below).
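
A standalone sketch of the fact the BDCE test exercises (bitreverse32 below is
a hand-written stand-in for llvm.bitreverse.i32): after a reverse, a logical
right shift by 8 makes the top 8 bits of the original value dead:

  #include <cassert>
  #include <cstdint>

  // Portable 32-bit bit reversal.
  uint32_t bitreverse32(uint32_t x) {
    uint32_t r = 0;
    for (int i = 0; i < 32; ++i)
      r |= ((x >> i) & 1u) << (31 - i);
    return r;
  }

  int main() {
    // lshr(bitreverse(x), 8) only reads the low 24 bits of x, so flipping the
    // top 8 bits cannot change the result.
    for (uint32_t x = 0; x < 100000; x += 37) {
      uint32_t modified = x ^ 0xFF000000u; // touch only the high 8 bits
      assert((bitreverse32(x) >> 8) == (bitreverse32(modified) >> 8));
    }
    return 0;
  }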

Reviewers: mkuper, jmolloy, hfinkel, trentxintong

Reviewed By: jmolloy

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D31857

llvm-svn: 300215
2017-04-13 16:44:25 +00:00
Piotr Padlewski
cd98794503 Remove readnone from invariant.group.barrier
Summary:
The readnone attribute would allow CSE of two barriers with
the same argument, which is invalid, as the following example shows:

    struct Base {
        virtual int foo() { return 42; }
    };

    struct Derived1 : Base {
        int foo() override { return 50; }
    };

    struct Derived2 : Base {
        int foo() override { return 100; }
    };

    void foo() {
        Base *x = new Base{};
        new (x) Derived1{};
        int a = std::launder(x)->foo();
        new (x) Derived2{};
        int b = std::launder(x)->foo();
    }

Here the two calls to std::launder will each produce an
@llvm.invariant.group.barrier call, which would be merged into one call,
causing devirtualization to resolve the second call to Derived1::foo()
instead of Derived2::foo().

Reviewers: chandlerc, dberlin, hfinkel

Subscribers: llvm-commits, rsmith, amharc

Differential Revision: https://reviews.llvm.org/D31531

llvm-svn: 300101
2017-04-12 20:45:12 +00:00
Renato Golin
941c1f67e7 [SystemZ] Fix target specific tests
llvm-svn: 300078
2017-04-12 17:14:46 +00:00
Jonas Paulsson
06c4fb73cd [SystemZ] Updated test fp-cast.ll
This did not get included in the previous commit for SystemZ cost functions.

llvm-svn: 300053
2017-04-12 12:11:41 +00:00
Jonas Paulsson
90b172efa0 [SystemZ] TargetTransformInfo cost functions implemented.
getArithmeticInstrCost(), getShuffleCost(), getCastInstrCost(),
getCmpSelInstrCost(), getVectorInstrCost(), getMemoryOpCost(),
getInterleavedMemoryOpCost() implemented.

Interleaved access vectorization enabled.

BasicTTIImpl::getCastInstrCost() improved to check for legal extending loads,
in which case the cost of the z/sext instruction becomes 0.

Review: Ulrich Weigand, Renato Golin.
https://reviews.llvm.org/D29631

llvm-svn: 300052
2017-04-12 11:49:08 +00:00
Serguei Katkov
dbe48d9fbb [BPI] Refactor post domination calculation and simple fix for ColdCall
The collection of PostDominatedByUnreachable and PostDominatedByColdCall has been
split out of the heuristics themselves. The data is now updated for each basic
block (previously the update for PostDominatedByColdCall might be skipped if the
unreachable or metadata heuristic had already handled the basic block).

This separation allows re-ordering of the heuristics without losing
the post-domination information.

Reviewers: sanjoy, junbuml, vsk, chandlerc, reames

Reviewed By: chandlerc

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D31701

llvm-svn: 300029
2017-04-12 05:42:14 +00:00
Daniel Berlin
e485ce96aa MemorySSA: Move to Analysis, from Transforms/Utils. It's used as
an Analysis, it has Analysis passes, and once NewGVN is made an Analysis,
this removes the cross dependency from Analysis to Transforms/Utils.
NFC.

llvm-svn: 299980
2017-04-11 20:06:36 +00:00
Matt Arsenault
e0ead8d1f3 Add address space mangling to lifetime intrinsics
In preparation for allowing allocas to have non-0 addrspace.

llvm-svn: 299876
2017-04-10 20:18:21 +00:00
Max Kazantsev
ccddc942de [ScalarEvolution] Re-enable Predicate implication from operations
The patch rL298481 was reverted due to a crash on the clang-with-lto-ubuntu build.
The reason for the crash was a type mismatch between either a or b and RHS in the following situation:

  LHS = sext(a +nsw b) > RHS.

This is a rare but still possible situation. Normally we would need to cast all of {a, b, RHS} to their widest type.
But we try to avoid creating new SCEVs that are not constants, to avoid initiating recursive analysis that
can take a lot of time and/or cache a bad value for the iteration count. To deal with this, in this patch we
reject this case and will not try to analyze it if the type of the sum doesn't match the type of RHS. In this
situation we don't need to create any non-constant SCEVs.

This patch also adds an assertion to the method IsProvedViaContext so that we fail on it rather than
go further into range analysis etc. (because in some situations these analyses succeed even when the passed
arguments have the wrong types, which should not normally happen).

The patch also contains a fix for a problem with a too-narrow analysis scope caused by incorrect
usage of predicates in recursive invocations.

The regression test for the said failure: test/Analysis/ScalarEvolution/implied-via-addition.ll

Reviewers: reames, apilipenko, anna, sanjoy

Reviewed By: sanjoy

Subscribers: mzolotukhin, mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D31238

llvm-svn: 299205
2017-03-31 12:05:30 +00:00
Matt Arsenault
47e1a3f8ac AMDGPU: Add all atomicrmw fields to atomic.inc/dec
Add scope, order, isVolatile

llvm-svn: 299122
2017-03-30 22:21:40 +00:00
Max Kazantsev
556135911f Revert "[ScalarEvolution] Re-enable Predicate implication from operations"
This reverts commit rL298690

Causes failures on clang.

llvm-svn: 298693
2017-03-24 07:04:31 +00:00
Max Kazantsev
0c0a0b5621 [ScalarEvolution] Re-enable Predicate implication from operations
The patch rL298481 was reverted due to a crash on the clang-with-lto-ubuntu build.
The reason for the crash was a type mismatch between either a or b and RHS in the following situation:

  LHS = sext(a +nsw b) > RHS.

This is a rare but still possible situation. Normally we would need to cast all of {a, b, RHS} to their widest type.
But we try to avoid creating new SCEVs that are not constants, to avoid initiating recursive analysis that
can take a lot of time and/or cache a bad value for the iteration count. To deal with this, in this patch we
reject this case and will not try to analyze it if the type of the sum doesn't match the type of RHS. In this
situation we don't need to create any non-constant SCEVs.

This patch also adds an assertion to the method IsProvedViaContext so that we fail on it rather than
go further into range analysis etc. (because in some situations these analyses succeed even when the passed
arguments have the wrong types, which should not normally happen).

The patch also contains a fix for a problem with a too-narrow analysis scope caused by incorrect
usage of predicates in recursive invocations.

The regression test for the said failure: test/Analysis/ScalarEvolution/implied-via-addition.ll

llvm-svn: 298690
2017-03-24 06:19:00 +00:00
Zhaoshi Zheng
132f4f81f0 Model ashr(shl(x, n), m) as mul(x, 2^(n-m)) when n > m
Given the case below:

  %y = shl %x, n
  %z = ashr %y, m

when n = m, SCEV models it as sext(trunc(x)). This patch tries to handle
the case where n > m by using sext(mul(trunc(x), 2^(n-m))) as the SCEV
expression.
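
A standalone spot-check of the identity being modeled, with n = 16 and m = 8
on 32-bit values (relying on two's-complement narrowing and arithmetic right
shift, which are the usual behavior and guaranteed since C++20):

  #include <cassert>
  #include <cstdint>

  int main() {
    // ashr(shl(x, 16), 8) == sext_{i16->i32}(trunc_{i32->i16}(x)) * 2^8
    for (int64_t v = -40000; v <= 40000; v += 7) {
      int32_t x = static_cast<int32_t>(v);
      int32_t lhs = static_cast<int32_t>(static_cast<uint32_t>(x) << 16) >> 8;
      int32_t rhs = static_cast<int32_t>(static_cast<int16_t>(x)) * 256;
      assert(lhs == rhs);
    }
    return 0;
  }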

llvm-svn: 298631
2017-03-23 18:06:09 +00:00
Anna Thomas
39cb171e59 [LVI] Add an LVI printer pass to capture test LVI cache after transformations
Summary:
Add a printer pass that prints the LVI cache values after transformations
that use LVI.
This will help us identify cases where LVI
invariants are violated, or transforms that leave LVI in an incorrect state.
Right now, I have added two test cases to show that the printer pass is working.
I will be adding more test cases in a later change, once this change is
checked in upstream.

Reviewers: reames, dberlin, sanjoy, apilipenko

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30790

llvm-svn: 298542
2017-03-22 19:27:12 +00:00
Max Kazantsev
3c9d133cb1 Revert "[ScalarEvolution] Predicate implication from operations"
This reverts commit rL298481

Fails clang-with-lto-ubuntu build.

llvm-svn: 298489
2017-03-22 07:50:33 +00:00
Max Kazantsev
bcac1b9977 [ScalarEvolution] Predicate implication from operations
This patch allows SCEV predicate analysis to prove implication of some expression predicates
from context predicates related to arguments of those expressions.
It introduces three new rules:

For addition:
  (A > X && B >= 0) || (B >= 0 && A > X) ===> (A + B) > X.

For division:
  (A > X) && (0 < B <= X + 1) ===> (A / B > 0).
  (A > X) && (-B <= X < 0) ===> (A / B >= 0).

Using these rules, SCEV is able to prove facts like "if X > 1 then X / 2 > 0".
They can also be combined within the same context to prove more complex facts,
like "if X > 1 then X/2 + 1 > 1".

Differential Revision: https://reviews.llvm.org/D30887

Reviewed by: sanjoy

llvm-svn: 298481
2017-03-22 04:48:46 +00:00
Matt Arsenault
dd9ab77318 AMDGPU: Mark all unspecified CC functions in tests as amdgpu_kernel
Currently the default C calling convention functions are treated
the same as compute kernels. Make this explicit so the default
calling convention can be changed to a non-kernel.

Converted with perl -pi -e 's/define void/define amdgpu_kernel void/'
on the relevant test directories (and undoing it in the one place that actually
wanted a non-kernel).

llvm-svn: 298444
2017-03-21 21:39:51 +00:00
David Green
07f763034a [ConstantFolding] Fix to prevent constant folding having to repeatedly scan operands. NFCI
After the loop unroll threshold was increased in r295538, very
large constant expressions can be created. This change prevents them
from having to be recursively scanned, which previously led to a
compile-time blow-up.

Differential Revision: https://reviews.llvm.org/D30689

llvm-svn: 298356
2017-03-21 10:17:39 +00:00