1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-02-01 05:01:59 +01:00

1137 Commits

Author SHA1 Message Date
Chad Rosier
2e3e81f00b [AArch64] Reduce vector insert/extract cost for Falkor.
Differential Revision: https://reviews.llvm.org/D28403

llvm-svn: 291254
2017-01-06 18:03:26 +00:00
Simon Pilgrim
f29ec731bd [CostModel][X86] Fix 512-bit SDIV/UDIV 'big' costs.
Set the costs on the lowest target that supports the type.

llvm-svn: 291229
2017-01-06 11:12:53 +00:00
Simon Pilgrim
6597231ee4 [CostModel][X86] Add SDIV/UDIV cost tests for a wider range of targets
Added a test demonstrating bug in AVX512 division costs

llvm-svn: 291228
2017-01-06 11:02:40 +00:00
Simon Pilgrim
87a1dcdf6d [CostModel][X86] Include the cost of 256-bit upper subvector extract/insertion in AVX1 v4i64 MUL
Matches other MUL/ADD/SUB 256-bit case on AVX1

llvm-svn: 291149
2017-01-05 18:20:25 +00:00
Chad Rosier
54356c6a0e [AArch64][CostModel] Add coverage for bswap intrinsics.
llvm-svn: 291140
2017-01-05 16:55:32 +00:00
Simon Pilgrim
e00b6e9f42 [CostModel][X86] Add support for broadcast shuffle costs
Currently only for broadcasts with input and output of the same width.

Differential Revision: https://reviews.llvm.org/D27811

llvm-svn: 291122
2017-01-05 15:56:08 +00:00
Chad Rosier
2caaab6e8f [AArch64] Remove mcpu option as this test is not target specific. NFC.
llvm-svn: 291117
2017-01-05 15:05:03 +00:00
Chad Rosier
4891053009 [AArch64] Remove unused arguments from tests. NFC.
llvm-svn: 291112
2017-01-05 14:48:53 +00:00
Tobias Grosser
08b29b5d00 Add missing CHECK: line to test case added in 29097
Without this CHECK line, we may not detect incorrectly detected additional
regions at the end of the region tree.

llvm-svn: 290994
2017-01-04 19:35:38 +00:00
Tobias Grosser
6152532380 RegionInfo: add new test case
This test case has been reduced from test/Analysis/RegionInfo/mix_1.ll and
provides us with a minimal example of a test case which caused problems while
working on an improved version of the RegionInfo analysis. We upstream this
test case, as it certainly can be helpful in future debugging and optimization
tests.

Test case reduced by Pratik Bhatu <cs12b1010@iith.ac.in>

llvm-svn: 290974
2017-01-04 17:50:15 +00:00
Simon Pilgrim
f1fa399ee0 [CostModel][X86] Updated vXi8 and vXi16 Reverse/Alternate shuffle costs
Actual codegen is much better than the extract+insert patterns that was assumed.

llvm-svn: 290962
2017-01-04 14:01:33 +00:00
Elena Demikhovsky
49038bfad5 Fixed shuffle-reverse cost on AVX-512.
(This changed was approved in https://reviews.llvm.org/D28118, but Simon asked to submit it separately).

llvm-svn: 290812
2017-01-02 11:44:10 +00:00
Elena Demikhovsky
ac6dc0e1f0 AVX-512 Loop Vectorizer: Cost calculation for interleave load/store patterns.
X86 target does not provide any target specific cost calculation for interleave patterns.It uses the common target-independent calculation, which gives very high numbers. As a result, the scalar version is chosen in many cases. The situation on AVX-512 is even worse, since we have 3-src shuffles that significantly reduce the cost.

In this patch I calculate the cost on AVX-512. It will allow to compare interleave pattern with gather/scatter and choose a better solution (PR31426).

* Shiffle-broadcast cost will be changed in Simon's upcoming patch.

Differential Revision: https://reviews.llvm.org/D28118

llvm-svn: 290810
2017-01-02 10:37:52 +00:00
Sanjay Patel
6927f00f21 [ValueTracking] add tests for known-nonnull-at; NFC
llvm-svn: 290790
2016-12-31 19:23:26 +00:00
Sanjoy Das
8c493b5ff3 [TBAAVerifier] Be stricter around verifying scalar nodes
This fixes the issue exposed in PR31393, where we weren't trying
sufficiently hard to diagnose bad TBAA metadata.

This does reduce the variety in the error messages we print out, but I
think the tradeoff of verifying more, simply and quickly overrules the
need for more helpful error messags here.

llvm-svn: 290713
2016-12-29 15:47:05 +00:00
Chandler Carruth
888ae91182 [PM] Teach MemDep to invalidate its result object when its cached
analysis handles become invalid.

Add a test case for its invalidation logic.

llvm-svn: 290620
2016-12-27 19:33:04 +00:00
Chandler Carruth
9a136f0e41 [PM] Add more dedicated testing to cover the invalidation logic added to
BasicAA in r290603.

I've kept the basic testing in the new PM test file as that also covers
the AAManager invalidation logic. If/when there is a good place for
broader AA testing it could move there.

This test is somewhat unsatisfying as I can't get it to fail even with
ASan outside of explicit checks of the invalidation. Apparently we don't
yet have any test coverage of the BasicAA code paths using either the
domtree or loopinfo -- I made both of them always be null and check-llvm
passed.

llvm-svn: 290612
2016-12-27 17:59:22 +00:00
Bryant Wong
b0749ef47d [AliasAnalysis] Teach BasicAA about memcpy.
Differential Revision: https://reviews.llvm.org/D27034

llvm-svn: 290526
2016-12-25 22:42:27 +00:00
Simon Pilgrim
2882ed164e [X86][SSE] Improve lowering of vXi64 multiplies
As mentioned on PR30845, we were performing our vXi64 multiplication as:

AloBlo = pmuludq(a, b);
AloBhi = pmuludq(a, psrlqi(b, 32));
AhiBlo = pmuludq(psrlqi(a, 32), b);
return AloBlo + psllqi(AloBhi, 32)+ psllqi(AhiBlo, 32);

when we could avoid one of the upper shifts with:

AloBlo = pmuludq(a, b);
AloBhi = pmuludq(a, psrlqi(b, 32));
AhiBlo = pmuludq(psrlqi(a, 32), b);
return AloBlo + psllqi(AloBhi + AhiBlo, 32);

This matches the lowering on gcc/icc.

Differential Revision: https://reviews.llvm.org/D27756

llvm-svn: 290267
2016-12-21 20:00:10 +00:00
Michael Kuperstein
e4bd47a0d4 [ConstantFolding] Fix vector GEPs harder
For vector GEPs, CastGEPIndices can end up in an infinite recursion, because
we compare the vector type to the scalar pointer type, find them different,
and then try to cast a type to itself.

Differential Revision: https://reviews.llvm.org/D28009

llvm-svn: 290260
2016-12-21 17:34:21 +00:00
Daniel Jasper
4c6537eccb Add files I seem to have dropped in my revert (r290086).
Sorry!

llvm-svn: 290087
2016-12-19 08:32:13 +00:00
Daniel Jasper
162ffcacd6 Revert @llvm.assume with operator bundles (r289755-r289757)
This creates non-linear behavior in the inliner (see more details in
r289755's commit thread).

llvm-svn: 290086
2016-12-19 08:22:17 +00:00
Matthew Simpson
bf784fec18 [AArch64] Guard Misaligned 128-bit store penalty by subtarget feature
This patch checks that the SlowMisaligned128Store subtarget feature is set
when penalizing such stores in getMemoryOpCost.

Differential Revision: https://reviews.llvm.org/D27677

llvm-svn: 289845
2016-12-15 18:36:59 +00:00
Simon Pilgrim
97b5f6a46b [CostModel][X86] Updated reverse shuffle costs
llvm-svn: 289819
2016-12-15 14:24:07 +00:00
Simon Pilgrim
6219e36a4f [CostModel] Fix long standing bug with reverse shuffle mask detection
Incorrect 'undef' mask index matching meant that broadcast shuffles could be detected as reverse shuffles

llvm-svn: 289811
2016-12-15 12:12:45 +00:00
Simon Pilgrim
d5264e8f7c [CostModel][X86] Add tests for reverse shuffle costs
llvm-svn: 289800
2016-12-15 10:45:53 +00:00
Hal Finkel
f224db75d2 Remove the AssumptionCache
After r289755, the AssumptionCache is no longer needed. Variables affected by
assumptions are now found by using the new operand-bundle-based scheme. This
new scheme is more computationally efficient, and also we need much less
code...

llvm-svn: 289756
2016-12-15 03:02:15 +00:00
Hal Finkel
502475d4f3 Make processing @llvm.assume more efficient by using operand bundles
There was an efficiency problem with how we processed @llvm.assume in
ValueTracking (and other places). The AssumptionCache tracked all of the
assumptions in a given function. In order to find assumptions relevant to
computing known bits, etc. we searched every assumption in the function. For
ValueTracking, that means that we did O(#assumes * #values) work in InstCombine
and other passes (with a constant factor that can be quite large because we'd
repeat this search at every level of recursion of the analysis).

Several of us discussed this situation at the last developers' meeting, and
this implements the discussed solution: Make the values that an assume might
affect operands of the assume itself. To avoid exposing this detail to
frontends and passes that need not worry about it, I've used the new
operand-bundle feature to add these extra call "operands" in a way that does
not affect the intrinsic's signature. I think this solution is relatively
clean. InstCombine adds these extra operands based on what ValueTracking, LVI,
etc. will need and then those passes need only search the users of the values
under consideration. This should fix the computational-complexity problem.

At this point, no passes depend on the AssumptionCache, and so I'll remove
that as a follow-up change.

Differential Revision: https://reviews.llvm.org/D27259

llvm-svn: 289755
2016-12-15 02:53:42 +00:00
Sanjoy Das
a0e8011216 [Verifier] Add verification for TBAA metadata
Summary:
This change adds some verification in the IR verifier around struct path
TBAA metadata.

Other than some basic sanity checks (e.g. we get constant integers where
we expect constant integers), this checks:

 - That by the time an struct access tuple `(base-type, offset)` is
   "reduced" to a scalar base type, the offset is `0`.  For instance, in
   C++ you can't start from, say `("struct-a", 16)`, and end up with
   `("int", 4)` -- by the time the base type is `"int"`, the offset
   better be zero.  In particular, a variant of this invariant is needed
   for `llvm::getMostGenericTBAA` to be correct.

 - That there are no cycles in a struct path.

 - That struct type nodes have their offsets listed in an ascending
   order.

 - That when generating the struct access path, you eventually reach the
   access type listed in the tbaa tag node.

Reviewers: dexonsmith, chandlerc, reames, mehdi_amini, manmanren

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D26438

llvm-svn: 289402
2016-12-11 20:07:15 +00:00
Keno Fischer
ab41662e58 ConstantFolding: Don't crash when encountering vector GEP
ConstantFolding tried to cast one of the scalar indices to a vector
type. Instead, use the vector type only for the first index (which
is the only one allowed to be a vector) and use its scalar type
otherwise.

Fixes PR31250.

Reviewers: majnemer
Differential Revision: https://reviews.llvm.org/D27389

llvm-svn: 289073
2016-12-08 17:22:35 +00:00
Haicheng Wu
20ce778776 [AArch64] Correct the check of signed 9-bit imm in isLegalAddressingMode()
In the addressing mode, signed 9-bit imm is [-256, 255], not [-512, 511].

Differential Revision: https://reviews.llvm.org/D27480

llvm-svn: 288876
2016-12-07 01:45:04 +00:00
Haicheng Wu
c157a2a55a [TTI/CostModel] Correct the way getGEPCost() calls isLegalAddressingMode()
Fix a bug when we call isLegalAddressingMode() from getGEPCost().

Differential Revision: https://reviews.llvm.org/D27357

llvm-svn: 288569
2016-12-03 01:57:24 +00:00
Guozhi Wei
27571a5d37 [ppc] Correctly compute the cost of loading 32/64 bit memory into VSR
VSX has instructions lxsiwax/lxsdx that can load 32/64 bit value into VSX register cheaply. That patch makes it known to memory cost model, so the vectorization of the test case in pr30990 is beneficial.

Differential Revision: https://reviews.llvm.org/D26713

llvm-svn: 288560
2016-12-03 00:41:43 +00:00
Alexey Bataev
7135b5ad24 [SLP] Fixed cost model for horizontal reduction.
Currently when cost of scalar operations is evaluated the vector type is
used for scalar operations. Patch fixes this issue and fixes evaluation
of the vector operations cost.
Several test showed that vector cost model is too optimistic. It
allowed vectorization of 8 or less add/fadd operations, though scalar
code is faster. Actually, only for 16 or more operations vector code
provides better performance.

Differential Revision: https://reviews.llvm.org/D26277

llvm-svn: 288398
2016-12-01 18:42:42 +00:00
Alexey Bataev
68437dbc6c [SLP] Additional tests with the cost of vector operations.
llvm-svn: 288377
2016-12-01 17:26:54 +00:00
Alexey Bataev
ef66b3c144 Revert "[SLP] Additional tests with the cost of vector operations."
This reverts commit a61718435fc4118c82f8aa6133fd81f803789c1e.

llvm-svn: 288371
2016-12-01 16:45:04 +00:00
Alexey Bataev
7f4c68a45a [SLP] Additional tests with the cost of vector operations.
llvm-svn: 288369
2016-12-01 16:11:48 +00:00
Sanjay Patel
b2bc8129f9 [InstSimplify] allow integer vector types to use computeKnownBits
Note that the non-splat lshr+lshr test folded, but that does not
work in general. Something is missing or wrong in computeKnownBits
as the non-splat shl+shl test still shows.

llvm-svn: 288005
2016-11-27 21:07:28 +00:00
Sanjay Patel
7b435e81e5 add tests to show missing analysis; NFC
llvm-svn: 287998
2016-11-27 15:54:45 +00:00
Simon Pilgrim
b2804b00f4 [X86][AVX512] Add support for v2i64 fptosi/fptoui/sitofp/uitofp on AVX512DQ-only targets
Use 512-bit instructions with subvector insertion/extraction like we do in a number of similar circumstances

llvm-svn: 287882
2016-11-24 14:46:55 +00:00
Simon Pilgrim
5df905e2e1 [X86][AVX512] Add support for v4i64 fptosi/fptoui/sitofp/uitofp on AVX512DQ-only targets
Use 512-bit instructions with subvector insertion/extraction like we do in a number of similar circumstances

llvm-svn: 287762
2016-11-23 14:01:18 +00:00
Simon Pilgrim
1f99fea9ae [CostModel][X86] Add missing AVX512DQ v8i64 fptosi/sitofp costs
llvm-svn: 287760
2016-11-23 13:42:09 +00:00
Simon Pilgrim
648b765007 [CostModel][X86] Add v2f32 -> v2i64 fptosi/fptoui cost tests
llvm-svn: 287756
2016-11-23 11:43:00 +00:00
Simon Pilgrim
b94dcea1ae [CostModel][X86] Updated sitofp/uitofp scalar/vector cost tests
Better coverage of all legal types + special cases.

Removed old fptoui tests which are all handled in fptoui.ll

llvm-svn: 287678
2016-11-22 18:55:49 +00:00
Yaxun Liu
7bae0ef103 Fix known zero bits for addrspacecast.
Currently LLVM assumes that a pointer addrspacecasted to a different addr space is equivalent to trunc or zext bitwise, which is not true. For example, in amdgcn target, when a null pointer is addrspacecasted from addr space 4 to 0, its value is changed from i64 0 to i32 -1.

This patch teaches LLVM not to assume known bits of addrspacecast instruction to its operand.

Differential Revision: https://reviews.llvm.org/D26803

llvm-svn: 287545
2016-11-21 15:42:31 +00:00
Craig Topper
d5ba7319d4 [AVX-512] Support FCOPYSIGN for v16f32 and v8f64
Summary:
This extends FCOPYSIGN support to 512-bit vectors.

I've also added tests to show what the 128-bit and 256-bit cases look like with broadcast loads.

Reviewers: delena, zvi, RKSimon, spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26791

llvm-svn: 287298
2016-11-18 02:25:34 +00:00
Simon Pilgrim
50c2f1dd68 [CostModel][X86] Added mul costs for vXi8 vectors
More realistic v16i8/v32i8/v64i8 MUL costs - we have to extend to vXi16, use PMULLW and then truncate the result

llvm-svn: 286838
2016-11-14 15:54:24 +00:00
Simon Pilgrim
bf629ee6cc [X86][AVX] Fixed v16i16/v32i8 ADD/SUB costs on AVX1 subtargets
Add explicit v16i16/v32i8 ADD/SUB costs, matching the costs of v4i64/v8i32 - they were missing for some reason.

This has side effects on the LV max bandwidth tests (AVX1 now prefers 128-bit vectors vs AVX2 which still prefers 256-bit)

llvm-svn: 286832
2016-11-14 14:45:16 +00:00
Peter Collingbourne
fbb7ea5270 IR: Introduce inrange attribute on getelementptr indices.
If the inrange keyword is present before any index, loading from or
storing to any pointer derived from the getelementptr has undefined
behavior if the load or store would access memory outside of the bounds of
the element selected by the index marked as inrange.

This can be used, e.g. for alias analysis or to split globals at element
boundaries where beneficial.

As previously proposed on llvm-dev:
http://lists.llvm.org/pipermail/llvm-dev/2016-July/102472.html

Differential Revision: https://reviews.llvm.org/D22793

llvm-svn: 286514
2016-11-10 22:34:55 +00:00
Tobias Grosser
68e42a688d [RegionInfo] Add three tests that include infinite loops
These examples are variations that were inspired from a small subgraph taken
from paper.ll which are interesting as they show certain issues with infinite
loops.

llvm-svn: 286450
2016-11-10 13:56:19 +00:00