1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-25 05:52:53 +02:00
Commit Graph

1744 Commits

Author SHA1 Message Date
Sanjay Patel
57c134b91a [InstCombine] allow vector icmp bool transforms
llvm-svn: 271843
2016-06-05 17:49:45 +00:00
Sanjay Patel
94e6bdaa7f fix documentation comments and other clean-ups; NFC
llvm-svn: 271839
2016-06-05 16:46:18 +00:00
Sanjay Patel
11b22015f7 [InstCombine] less 'CI' confusion; NFC
Change the name of the ICmpInst to 'ICmp' and the Constant (was a ConstantInt) to 'C',
so that it's hopefully clearer that 'CI' refers to CastInst in this context.

While we're scrubbing, fix the documentation comment and use 'auto' with 'dyn_cast'.

llvm-svn: 271817
2016-06-05 00:12:32 +00:00
Sanjay Patel
dad6c47e6b [InstCombine] allow vector constants for cast+icmp fold
This is step 1 of unknown towards fixing PR28001:
https://llvm.org/bugs/show_bug.cgi?id=28001

llvm-svn: 271810
2016-06-04 22:04:05 +00:00
Sanjay Patel
6ada1cf739 clean-up; NFC
llvm-svn: 271807
2016-06-04 21:20:44 +00:00
Sanjay Patel
1388784f5b fix formatting, punctuation; NFC
llvm-svn: 271804
2016-06-04 20:39:22 +00:00
Simon Pilgrim
c995ee7b75 [InstCombine][MMX] Extend SimplifyDemandedUseBits MOVMSK support to MMX
Add the MMX implementation to the SimplifyDemandedUseBits SSE/AVX MOVMSK support added in D19614

Requires a minor tweak as llvm.x86.mmx.pmovmskb takes a x86_mmx argument - so we have to be explicit about the implied v8i8 vector type.

llvm-svn: 271789
2016-06-04 13:42:46 +00:00
Sanjay Patel
500fe712ac [InstCombine] look through bitcasts to find selects
There was concern that creating bitcasts for the simpler potential select pattern:

define <2 x i64> @vecBitcastOp1(<4 x i1> %cmp, <2 x i64> %a) {
  %a2 = add <2 x i64> %a, %a
  %sext = sext <4 x i1> %cmp to <4 x i32>
  %bc = bitcast <4 x i32> %sext to <2 x i64>
  %and = and <2 x i64> %a2, %bc
  ret <2 x i64> %and
}

might lead to worse code for some targets, so this patch is matching the larger
patterns seen in the test cases.

The motivating example for this patch is this IR produced via SSE intrinsics in C:

define <2 x i64> @gibson(<2 x i64> %a, <2 x i64> %b) {
  %t0 = bitcast <2 x i64> %a to <4 x i32>
  %t1 = bitcast <2 x i64> %b to <4 x i32>
  %cmp = icmp sgt <4 x i32> %t0, %t1
  %sext = sext <4 x i1> %cmp to <4 x i32>
  %t2 = bitcast <4 x i32> %sext to <2 x i64>
  %and = and <2 x i64> %t2, %a
  %neg = xor <4 x i32> %sext, <i32 -1, i32 -1, i32 -1, i32 -1>
  %neg2 = bitcast <4 x i32> %neg to <2 x i64>
  %and2 = and <2 x i64> %neg2, %b
  %or = or <2 x i64> %and, %and2
  ret <2 x i64> %or
}

For an AVX target, this is currently:

vpcmpgtd  %xmm1, %xmm0, %xmm2
vpand     %xmm0, %xmm2, %xmm0
vpandn    %xmm1, %xmm2, %xmm1
vpor      %xmm1, %xmm0, %xmm0
retq

With this patch, it becomes:

vpmaxsd   %xmm1, %xmm0, %xmm0

Differential Revision: http://reviews.llvm.org/D20774

llvm-svn: 271676
2016-06-03 14:42:07 +00:00
Sanjay Patel
02638731c1 transform obscured FP sign bit ops into a fabs/fneg using TLI hook
This is effectively a revert of:
http://reviews.llvm.org/rL249702 - [InstCombine] transform masking off of an FP sign bit into a fabs() intrinsic call (PR24886)
and:
http://reviews.llvm.org/rL249701 - [ValueTracking] teach computeKnownBits that a fabs() clears sign bits
and a reimplementation as a DAG combine for targets that have IEEE754-compliant fabs/fneg instructions.

This is intended to resolve the objections raised on the dev list:
http://lists.llvm.org/pipermail/llvm-dev/2016-April/098154.html
and:
https://llvm.org/bugs/show_bug.cgi?id=24886#c4

In the interest of patch minimalism, I've only partly enabled AArch64. PowerPC, MIPS, x86 and others can enable later.

Differential Revision: http://reviews.llvm.org/D19391

llvm-svn: 271573
2016-06-02 20:01:37 +00:00
Sanjay Patel
647e745fb4 [InstCombine] remove guard for generating a vector select
This is effectively NFC because we already do this transform after r175380:
http://reviews.llvm.org/rL175380

and also via foldBoolSextMaskToSelect().

This change should just make it a bit more efficient to match the pattern. 
The original guard was added in r95058:
http://reviews.llvm.org/rL95058

A sampling of codegen for current in-tree targets shows no problems. This
makes sense given that we're already producing the vector selects via the
other transforms.

llvm-svn: 271554
2016-06-02 18:03:05 +00:00
Saleem Abdulrasool
f85a029a33 X86: permit using SjLj EH on x86 targets as an option
This adds support to the backed to actually support SjLj EH as an exception
model.  This is *NOT* the default model, and requires explicitly opting into it
from the frontend.  GCC supports this model and for MinGW can still be enabled
via the `--using-sjlj-exceptions` options.

Addresses PR27749!

llvm-svn: 271244
2016-05-31 01:48:07 +00:00
Craig Topper
4f195e8edb [X86] Remove SSE/AVX unaligned store intrinsics as clang no longer uses them. Auto upgrade to native unaligned store instructions.
llvm-svn: 271236
2016-05-30 23:15:56 +00:00
Simon Pilgrim
6ec0f7efbc [X86][SSE] (Reapplied) Replace (V)PMOVSX and (V)PMOVZX integer extension intrinsics with generic IR (llvm)
This patch removes the llvm intrinsics VPMOVSX and (V)PMOVZX sign/zero extension intrinsics and auto-upgrades to SEXT/ZEXT calls instead. We already did this for SSE41 PMOVSX sometime ago so much of that implementation can be reused.

Reapplied now that the the companion patch (D20684) removes/auto-upgrade the clang intrinsics has been committed.

Differential Revision: http://reviews.llvm.org/D20686

llvm-svn: 271131
2016-05-28 18:03:41 +00:00
Sanjay Patel
dc7c644056 [InstCombine] move and/sext fold to helper function; NFCI
We need to enhance the pattern matching on these to look through bitcasts.

llvm-svn: 271051
2016-05-27 21:41:29 +00:00
Simon Pilgrim
99e3cf65ff Revert: r270973 - [X86][SSE] Replace (V)PMOVSX and (V)PMOVZX integer extension intrinsics with generic IR (llvm)
llvm-svn: 270976
2016-05-27 09:02:25 +00:00
Simon Pilgrim
c8925e270b [X86][SSE] Replace (V)PMOVSX and (V)PMOVZX integer extension intrinsics with generic IR (llvm)
This patch removes the llvm intrinsics VPMOVSX and (V)PMOVZX sign/zero extension intrinsics and auto-upgrades to SEXT/ZEXT calls instead. We already did this for SSE41 PMOVSX sometime ago so much of that implementation can be reused.

A companion patch (D20684) removes/auto-upgrade the clang intrinsics.

Differential Revision: http://reviews.llvm.org/D20686

llvm-svn: 270973
2016-05-27 08:49:15 +00:00
Chad Rosier
1b8b93b50a [InstCombine] Catch more bswap cases missed due to zext and truncs.
Fixes PR27824.
Differential Revision: http://reviews.llvm.org/D20591.

llvm-svn: 270853
2016-05-26 14:58:51 +00:00
Craig Topper
b6c098560d [X86] Add the AVX storeu intrinsics to InstCombine and LoopStrengthReduce in the same places that the SSE/SSE2 storeu intrinsics appear.
I don't really know how to test this. Just seemed like we should be consistent.

llvm-svn: 270819
2016-05-26 04:28:45 +00:00
Chad Rosier
8544c01533 Clarify that we match BSwap in InstCombine and BitReverse in CGP. NFC.
Also, rename recognizeBitReverseOrBSwapIdiom to recognizeBSwapOrBitReverseIdiom,
so the ordering of the MatchBSwaps and MatchBitReversals arguments are
consistent with the function name.

llvm-svn: 270715
2016-05-25 16:22:14 +00:00
Gerolf Hoflehner
02d0a11d91 [InstCombine] Fix assertion when bitcast is converted to gep
When an aggregate contains an opaque type its size cannot be
determined. This triggers an "Invalid GetElementPtrInst indices for type" assert
in function checkGEPType. The fix suppresses the conversion in this case.

http://reviews.llvm.org/D20319

llvm-svn: 270479
2016-05-23 19:23:17 +00:00
Sanjay Patel
305c2f05c5 reduce indent; NFC
llvm-svn: 270372
2016-05-22 17:08:52 +00:00
Guozhi Wei
dc5a37bcd2 [InstCombine] Avoid combining the bitcast of a var that is used as both address and result of load instructions
This patch fixes https://llvm.org/bugs/show_bug.cgi?id=27703.

If there is a sequence of one or more load instructions, each loaded value is used as address of later load instruction, bitcast is necessary to change the value type, don't optimize it.

llvm-svn: 270135
2016-05-19 21:07:01 +00:00
Sanjay Patel
a9e371ad52 [InstCombine] add another test for wrong icmp constant (PR27792)
It doesn't matter if the comparison is unsigned; the inc/dec is always signed.

llvm-svn: 269831
2016-05-17 20:20:40 +00:00
Sanjay Patel
2d43600215 [InstCombine] fix constant to be signed for signed comparisons
This bug was introduced in r269728 and is the likely cause of many stage 2 ubsan bot failures.
I'll add a test in a follow-up commit assuming this fixes things properly.

llvm-svn: 269797
2016-05-17 18:38:55 +00:00
Benjamin Kramer
36892682d9 [InstCombine] Don't crash when trying to take an element of a ConstantExpr.
Fixes PR27786.

llvm-svn: 269757
2016-05-17 12:08:55 +00:00
Sanjay Patel
fdf7fd7ebc try to avoid unused variable warning in release build; NFCI
llvm-svn: 269729
2016-05-17 01:12:31 +00:00
Sanjay Patel
59f49118ec [InstCombine] check vector elements before trying to transform LE/GE vector icmp (PR27756)
Fix a bug introduced with rL269426 :
[InstCombine] canonicalize* LE/GE vector integer comparisons to LT/GT (PR26701, PR26819)

We were assuming that a ConstantDataVector / ConstantVector / ConstantAggregateZero operand of
an ICMP was composed of ConstantInt elements, but it might have ConstantExpr or UndefValue 
elements. Handle those appropriately.

Also, refactor this function to join the scalar and vector paths and eliminate the switches.

Differential Revision: http://reviews.llvm.org/D20289

llvm-svn: 269728
2016-05-17 00:57:57 +00:00
Sanjay Patel
68b6b9da9e use 'match' for less indenting; NFCI
llvm-svn: 269494
2016-05-13 21:51:17 +00:00
Jun Bum Lim
309caf0d59 Rename getLargestLegalIntTypeSize to getLargestLegalIntTypeSizeInBits(). NFC.
Summary: Rename DataLayout::getLargestLegalIntTypeSize to DataLayout::getLargestLegalIntTypeSizeInBits() to prevent similar mistakes  fixed in r269433.

Reviewers: joker.eph, mcrosier

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D20248

llvm-svn: 269456
2016-05-13 18:38:35 +00:00
Sanjay Patel
e6e6eb6572 [InstCombine] handle zero constant vectors for LE/GE comparisons too
Enhancement to: http://reviews.llvm.org/rL269426
With discussion in: http://reviews.llvm.org/D17859

This should complete the fixes for: PR26701, PR26819:
https://llvm.org/bugs/show_bug.cgi?id=26701
https://llvm.org/bugs/show_bug.cgi?id=26819
 

llvm-svn: 269439
2016-05-13 17:28:12 +00:00
Sanjay Patel
2c688c8b52 [InstCombine] canonicalize* LE/GE vector integer comparisons to LT/GT (PR26701, PR26819)
*We don't currently handle the  edge case constants (min/max values), so it's not a complete
canonicalization.

To fully solve the motivating bugs, we need to enhance this to recognize a zero vector
too because that's a ConstantAggregateZero which is a ConstantData, not a ConstantVector
or a ConstantDataVector.

Differential Revision: http://reviews.llvm.org/D17859 

llvm-svn: 269426
2016-05-13 15:10:46 +00:00
Chad Rosier
1293f2b484 [InstCombine] Fold icmp ugt/ult (udiv i32 C2, X), C1.
This patch adds support for two optimizations:
icmp ugt (udiv C2, X), C1 -> icmp ule X, C2/(C1+1)
icmp ult (udiv C2, X), C1 -> icmp ugt X, C2/C1

Differential Revision: http://reviews.llvm.org/D20123

llvm-svn: 269109
2016-05-10 20:22:09 +00:00
Arnaud A. de Grandmaison
6c99fa9cb1 [InstCombine] Remove trivially empty va_start/va_end and va_copy/va_end ranges.
When a va_start or va_copy is immediately followed by a va_end (ignoring
debug information or other start/end in between), then it is safe to
remove the pair. As this code shares some commonalities with the lifetime
markers, this has been factored to helper functions.

This InstCombine pattern kicks-in 3 times when running the LLVM test
suite.

llvm-svn: 269033
2016-05-10 09:24:49 +00:00
Chad Rosier
e9eec305ad Typo. NFC.
llvm-svn: 268975
2016-05-09 21:37:43 +00:00
Chad Rosier
8b41062bd4 [InstCombine] Fold icmp eq/ne (udiv i32 A, B), 0 -> icmp ugt/ule B, A.
Differential Revision: http://reviews.llvm.org/D20036

llvm-svn: 268960
2016-05-09 19:30:20 +00:00
Philip Reames
28e49b3a15 Reapply 267210 with fix for PR27490
Original Commit Message
Extend load/store type canonicalization to handle unordered operations

Extend the type canonicalization logic to work for unordered atomic loads and stores.  Note that while this change itself is fairly simple and low risk, there's a reasonable chance this will expose problems in the backends by suddenly generating IR they wouldn't have seen before.  Anything of this nature will be an existing bug in the backend (you could write an atomic float load), but this will definitely change the frequency with which such cases are encountered.  If you see problems, feel free to revert this change, but please make sure you collect a test case. 

Note that the concern about lowering is now much less likely.  PR27490 proved that we already *were* mucking with the types of ordered atomics and volatiles.  As a result, this change doesn't introduce as much new behavior as originally thought.

llvm-svn: 268809
2016-05-06 22:17:01 +00:00
Balaram Makam
8a2614ed09 "Reapply r268521 "[InstCombine] Canonicalize icmp instructions based on dominating conditions.""
This reapplies commit r268521, that was reverted in r268530 due to a test failure in select-implied.ll
Modified the test case to reflect the new change.

llvm-svn: 268557
2016-05-04 21:32:14 +00:00
Balaram Makam
6df288fe78 Revert "[InstCombine] Canonicalize icmp instructions based on dominating conditions."
This reverts commit 573a40f79b35cf3e71db331bb00f6a84f03b835d.

llvm-svn: 268530
2016-05-04 18:37:35 +00:00
Balaram Makam
93aa5a1614 [InstCombine] Canonicalize icmp instructions based on dominating conditions.
Summary:
    This patch canonicalizes conditions based on the constant range information
    of the dominating branch condition.
    For example:

      %cmp = icmp slt i64 %a, 0
      br i1 %cmp, label %land.lhs.true, label %lor.rhs
      lor.rhs:
        %cmp2 = icmp sgt i64 %a, 0

    Would now be canonicalized into:

      %cmp = icmp slt i64 %a, 0
      br i1 %cmp, label %land.lhs.true, label %lor.rhs
      lor.rhs:
        %cmp2 = icmp ne i64 %a, 0

Reviewers: mcrosier, gberry, t.p.northover, llvm-commits, reames, hfinkel, sanjoy, majnemer

Subscribers: MatzeB, majnemer, mcrosier

Differential Revision: http://reviews.llvm.org/D18841

llvm-svn: 268521
2016-05-04 17:34:20 +00:00
Simon Pilgrim
b50a7dc851 [InstCombine][SSE] Added support to VPERMD/VPERMPS to shuffle combine to accept UNDEF elements.
llvm-svn: 268206
2016-05-01 20:43:02 +00:00
Simon Pilgrim
a6ec426d4d [InstCombine][SSE] Added support to VPERMILVAR to shuffle combine to accept UNDEF elements.
llvm-svn: 268204
2016-05-01 20:22:42 +00:00
Simon Pilgrim
f29cf3f78c [InstCombine][SSE] Added support to PSHUFB to shuffle combine to accept UNDEF elements.
llvm-svn: 268202
2016-05-01 19:26:21 +00:00
Simon Pilgrim
b8dbbcff7b [InstCombine][AVX2] Combine VPERMD/VPERMPS intrinsics with constant masks to shufflevector.
llvm-svn: 268199
2016-05-01 16:41:22 +00:00
Simon Pilgrim
895490c6cb [InstCombine][AVX] VPERMILVAR to shuffle combine to use general aggregate elements. NFCI.
Make use of Constant::getAggregateElement instead of checking constant types - first step towards adding support for UNDEF mask elements.

llvm-svn: 268158
2016-04-30 07:23:30 +00:00
Simon Pilgrim
35052f3ec3 [InstCombine][SSE] PSHUFB to shuffle combine to use general aggregate elements. NFCI.
Make use of Constant::getAggregateElement instead of checking constant types - first step towards adding support for UNDEF mask elements.

llvm-svn: 268115
2016-04-29 21:34:54 +00:00
Chad Rosier
5fffe70b4f [InstCombine] Determine the result of a select based on a dominating condition.
Differential Revision: http://reviews.llvm.org/D19550

llvm-svn: 268104
2016-04-29 21:12:31 +00:00
Sanjay Patel
b2e1fa0e6a [InstCombine] clean up; NFC
llvm-svn: 268099
2016-04-29 20:54:56 +00:00
Sanjay Patel
7526baa6fd [InstCombine] add helper function for ICmp with constant canonicalization; NFCI
As suggested in http://reviews.llvm.org/D17859 , we should enhance this
to support vectors.

llvm-svn: 268059
2016-04-29 16:22:25 +00:00
David Majnemer
8e5b7c88db [InstCombine] Propagate operand bundles
We neglected to transfer operand bundles for some transforms.  These
were found via inspection, I'll try to come up with some test cases.

llvm-svn: 268010
2016-04-29 08:07:20 +00:00
Ahmed Bougacha
7feaca3ef9 [InstCombine] Remove trailing whitespace. NFC.
r267873.

llvm-svn: 267887
2016-04-28 14:36:07 +00:00