1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-24 21:42:54 +02:00
Commit Graph

2581 Commits

Author SHA1 Message Date
Sanjay Patel
7ae16c985d fix typos; NFC
llvm-svn: 293816
2017-02-01 21:38:32 +00:00
Sanjay Patel
e50e307832 [InstCombine] move folds for shift-shift pairs; NFCI
Although this is 'no-functional-change-intended', I'm adding tests
for shl-shl and lshr-lshr pairs because there is no existing test 
coverage for those folds.

It seems like we should be able to remove some code from foldShiftedShift()
at this point because we're handling those patterns on the general path.

llvm-svn: 293814
2017-02-01 21:31:34 +00:00
Sanjoy Das
3cfb1e2af8 [InstCombine] Allow InstCombine to merge adjacent guards
Summary:
If there are two adjacent guards with different conditions, we can
remove one of them and include its condition into the condition of
another one. This patch allows InstCombine to merge them by the
following pattern:

    guard(a); guard(b) -> guard(a & b).

Reviewers: reames, apilipenko, igor-laevsky, anna, sanjoy

Reviewed By: sanjoy

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D29378

llvm-svn: 293778
2017-02-01 16:34:55 +00:00
Davide Italiano
14ea2e3d79 [Instcombine] Combine consecutive identical fences
Differential Revision:  https://reviews.llvm.org/D29314

llvm-svn: 293661
2017-01-31 18:09:05 +00:00
Arnold Schwaighofer
19f3324bf3 Don't combine stores to a swifterror pointer operand to a different type
llvm-svn: 293658
2017-01-31 17:53:49 +00:00
Sanjay Patel
6bac136936 [InstCombine] add test for possible zext-phi transform; NFC
The datalayout doesn't include i1, so we don't do a potential shrink and sink transform.

Example based on discussion here:
http://lists.llvm.org/pipermail/llvm-dev/2017-January/109631.html

llvm-svn: 293656
2017-01-31 17:43:00 +00:00
Silviu Baranga
ce8eb0df03 [InstCombine] Make sure that LHS and RHS have the same type in
transformToIndexedCompare

If they don't have the same type, the size of the constant
index would need to be adjusted (and this wouldn't be always
possible).

Alternatively we could try the analysis with the initial
RHS value, which would guarantee that the two sides have
the same type. However it is unlikely that in practice this
would pass our transformation requirements.

Fixes PR31808 (https://llvm.org/bugs/show_bug.cgi?id=31808).

llvm-svn: 293629
2017-01-31 14:04:15 +00:00
Sanjay Patel
c1ccdd6039 [InstCombine] enable (X <<nsw C1) >>s C2 --> X <<nsw (C1 - C2) for vectors with splat constants
llvm-svn: 293570
2017-01-30 23:35:52 +00:00
Sanjay Patel
22524fa47b [InstCombine] add vector test for (X <<nsw C1) >>s C2 --> X <<nsw (C1 - C2); NFC
llvm-svn: 293566
2017-01-30 23:26:17 +00:00
Sanjay Patel
94f616d48b [InstCombine] enable more lshr(shl X, C1), C2 folds for vectors with splat constants
llvm-svn: 293562
2017-01-30 23:01:05 +00:00
Sanjay Patel
2612342500 [InstCombine] add tests for more shift-shift patterns; NFC
llvm-svn: 293555
2017-01-30 22:24:36 +00:00
Sanjay Patel
c49aeb010f [InstCombine] enable (X >>?exact C1) << C2 --> X >>?exact (C1-C2) for vectors with splat constants
llvm-svn: 293524
2017-01-30 18:40:23 +00:00
Sanjay Patel
d5bd701535 [InstCombine] add vector splat tests for (X >>?exact C1) << C2 --> X >>?exact (C1-C2); NFC
llvm-svn: 293517
2017-01-30 18:17:14 +00:00
Sanjay Patel
36f9e16e69 [InstCombine] enable (X <<nsw C1) >>s C2 --> X <<nsw (C1-C2) for vectors with splat constants
llvm-svn: 293507
2017-01-30 17:19:32 +00:00
Sanjay Patel
dd64889b0b [InstCombine] fixed to propagate 'exact' on lshr
The original shift is bigger, so this may qualify as 'obvious', 
but here's an attempt at an Alive-based proof:

Name: exact
Pre: (C1 u< C2)
%a = shl i8 %x, C1
%b = lshr exact i8 %a, C2 
  =>
%c = lshr exact i8 %x, C2 - C1
%b = and i8 %c, ((1 << width(C1)) - 1) u>> C2

Optimization is correct!

llvm-svn: 293498
2017-01-30 16:53:03 +00:00
Sanjay Patel
d8b4f45a76 [InstCombine] add 'exact' to lshr to show that it got dropped; NFC
llvm-svn: 293496
2017-01-30 16:38:49 +00:00
Sanjay Patel
20af4c5918 [InstCombine] enable lshr(shl X, C1), C2 folds for vectors with splat constants
llvm-svn: 293489
2017-01-30 16:11:40 +00:00
Sanjay Patel
3c39c77f98 [InstCombine] add tests for shift-shift patterns; NFC
llvm-svn: 293487
2017-01-30 15:54:50 +00:00
Sanjay Patel
25491b6e66 [InstCombine] enable (X >>?,exact C1) << C2 --> X << (C2 - C1) for vectors with splats
llvm-svn: 293435
2017-01-29 17:11:18 +00:00
Sanjay Patel
a2a34d3386 [InstCombine] add tests for shl(shr X, C1), C2 transforms; NFC
llvm-svn: 293434
2017-01-29 16:52:59 +00:00
Sanjay Patel
3ff9c85b95 [InstCombine] move icmp transforms that might be recognized as min/max and inf-loop (PR31751)
This is a minimal patch to avoid the infinite loop in:
https://llvm.org/bugs/show_bug.cgi?id=31751

But the general problem is bigger: we're not canonicalizing all of the min/max forms reported
by value tracking's matchSelectPattern(), and we don't define min/max consistently. Some code
uses matchSelectPattern(), other code uses matchers like m_Umax, and others have their own
inline definitions which may be subtly different from any of the above.

The reason that the test cases in this patch need a cast op to trigger is because we don't
(yet) canonicalize all min/max forms based on matchSelectPattern() in 
canonicalizeMinMaxWithConstant(), but we do make min/max+cast transforms based on 
matchSelectPattern() in visitSelectInst().

The location of the icmp transforms that trigger the inf-loop seems arbitrary at best, so
I'm moving those behind the min/max fence in visitICmpInst() as the quick fix.

llvm-svn: 293345
2017-01-27 23:26:27 +00:00
Justin Lebar
1a917336a5 [NVPTX] Upgrade NVVM intrinsics in InstCombineCalls.
Summary:
There are many NVVM intrinsics that we can't entirely get rid of, but
that nonetheless often correspond to target-generic LLVM intrinsics.

For example, if flush denormals to zero (ftz) is enabled, we can convert
@llvm.nvvm.ceil.ftz.f to @llvm.ceil.f32.  On the other hand, if ftz is
disabled, we can't do this, because @llvm.ceil.f32 will be lowered to a
non-ftz PTX instruction.  In this case, we can, however, simplify the
non-ftz nvvm ceil intrinsic, @llvm.nvvm.ceil.f, to @llvm.ceil.f32.

These transformations are particularly useful because they let us
constant fold instructions that appear in libdevice, the bitcode library
that ships with CUDA and essentially functions as its libm.

Reviewers: tra

Subscribers: hfinkel, majnemer, llvm-commits

Differential Revision: https://reviews.llvm.org/D28794

llvm-svn: 293244
2017-01-27 00:58:58 +00:00
Sanjoy Das
a70ceeeee1 Revert a couple of InstCombine/Guard checkins
This change reverts:

r293061: "[InstCombine] Canonicalize guards for NOT OR condition"
r293058: "[InstCombine] Canonicalize guards for AND condition"

They miscompile cases like:

```
declare void @llvm.experimental.guard(i1, ...)

define void @test_guard_not_or(i1 %A, i1 %B) {
  %C = or i1 %A, %B
  %D = xor i1 %C, true
  call void(i1, ...) @llvm.experimental.guard(i1 %D, i32 20, i32 30)[ "deopt"() ]
  ret void
}
```

because they do transfer the `i32 20, i32 30` parameters to newly
created guard instructions.

llvm-svn: 293227
2017-01-26 23:38:11 +00:00
Sanjay Patel
66560a79a5 [InstCombine] fold (X >>u C) << C --> X & (-1 << C)
We already have this fold when the lshr has one use, but it doesn't need that
restriction. We may be able to remove some code from foldShiftedShift().

Also, move the similar:
(X << C) >>u C --> X & (-1 >>u C)
...directly into visitLShr to help clean up foldShiftByConstOfShiftByConst().

That whole function seems questionable since it is called by commonShiftTransforms(),
but there's really not much in common if we're checking the shift opcodes for every
fold.

llvm-svn: 293215
2017-01-26 22:08:10 +00:00
Sanjay Patel
afb8de4915 [InstCombine] use m_APInt to allow (X << C) >>u C --> X & (-1 >>u C) with splat vectors
llvm-svn: 293208
2017-01-26 20:52:27 +00:00
Sanjay Patel
a94fdc692e [InstCombine] add tests for shift-shift folds; NFC
llvm-svn: 293205
2017-01-26 20:10:55 +00:00
Artur Pilipenko
6b244ca01f [InstCombine] Canonicalize guards for NOT OR condition
This is a partial fix for Bug 31520 - [guards] canonicalize guards in instcombine

Reviewed By: apilipenko

Differential Revision: https://reviews.llvm.org/D29075

Patch by Maxim Kazantsev.

llvm-svn: 293061
2017-01-25 14:45:12 +00:00
Simon Pilgrim
6c22b528b6 [InstCombine][SSE] Add support for PACKSS/PACKUS constant folding
Differential Revision: https://reviews.llvm.org/D28949

llvm-svn: 293060
2017-01-25 14:37:24 +00:00
Artur Pilipenko
71132836b4 [InstCombine] Canonicalize guards for AND condition
This is a partial fix for Bug 31520 - [guards] canonicalize guards in instcombine

Reviewed By: apilipenko

Differential Revision: https://reviews.llvm.org/D29074

Patch by Maxim Kazantsev.

llvm-svn: 293058
2017-01-25 14:20:52 +00:00
Artur Pilipenko
8d5f1040c3 [InstCombine] Allow InstrCombine to remove one of adjacent guards if they are equivalent
This is a partial fix for Bug 31520 - [guards] canonicalize guards in instcombine

Reviewed By: majnemer, apilipenko

Differential Revision: https://reviews.llvm.org/D29071

Patch by Maxim Kazantsev.

llvm-svn: 293056
2017-01-25 14:12:12 +00:00
Gerolf Hoflehner
a64ca53972 [InstCombine] Added regression test to narrow-swich.ll
llvm-svn: 293018
2017-01-25 04:34:59 +00:00
Matt Arsenault
4249853cf1 SimplifyLibCalls: Replace more unary libcalls with intrinsics
llvm-svn: 292855
2017-01-23 23:55:08 +00:00
Simon Pilgrim
8221416b9d [InstCombine][X86] Add MULDQ/MULUDQ constant folding support
llvm-svn: 292793
2017-01-23 15:22:59 +00:00
Simon Pilgrim
6a557145af [InstCombine][X86] MULDQ/MULUDQ undef -> zero
Match generic mul behaviour so that <X x i64> multiply and muldq/muludq pattern act the same

llvm-svn: 292784
2017-01-23 12:07:32 +00:00
Simon Pilgrim
2a29239f10 [InstCombine][SSE] Tests showing missed opportunities to constant fold PMULDQ/PMULUDQ
llvm-svn: 292782
2017-01-23 10:57:39 +00:00
Benjamin Kramer
91e0605f7f Fix some broken CHECK lines.
The colon is important.

llvm-svn: 292761
2017-01-22 20:28:56 +00:00
Sanjay Patel
91f1806bc1 [InstCombine] use m_APInt to allow ashr folds for vectors with splat constants
We may be able to assert that no shl-shl or lshr-lshr pairs ever get here
because we should have already handled those in foldShiftedShift().

llvm-svn: 292726
2017-01-21 17:59:59 +00:00
Sanjay Patel
2afe78e0f9 [InstCombine] add tests for ashr-ashr; NFC
llvm-svn: 292724
2017-01-21 17:43:06 +00:00
Justin Lebar
1a976ec6c1 [ConstantFold] Remove test checking that we don't constant-fold sqrt(-2).
This depended on libm's errno behavior (we constant fold iff libm's
sqrt(-2) does not set errno) and was breaking on mac.

llvm-svn: 292701
2017-01-21 02:02:27 +00:00
Justin Lebar
22610cd767 [ConstantFolding] Constant-fold llvm.sqrt(x) like other intrinsics.
Summary:
Currently we return undef, but we're in the process of changing the
LangRef so that llvm.sqrt behaves like the other math intrinsics,
matching the return value of the standard libcall but not setting errno.

This change is legal even without the LangRef change because currently
calling llvm.sqrt(x) where x is negative is spec'ed to be UB.  But in
practice it's also safe because we're simply constant-folding fewer
inputs: Inputs >= -0 get constant-folded as before, but inputs < -0 now
aren't constant-folded, because ConstantFoldFP aborts if the host math
function raises an fp exception.

Reviewers: hfinkel, efriedma, sanjoy

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28929

llvm-svn: 292692
2017-01-21 00:59:57 +00:00
Sanjay Patel
15f61be538 [InstCombine] auto-generate checks; NFC
llvm-svn: 292682
2017-01-20 23:39:01 +00:00
Sanjay Patel
fc6ca85023 [ValueTracking] recognize variations of 'clamp' to improve codegen (PR31693)
By enhancing value tracking, we allow an existing min/max canonicalization to
kick in and improve codegen for several targets that have min/max instructions.

Unfortunately, recognizing min/max in value tracking may cause us to hit
a hack in InstCombiner::visitICmpInst() more often:
http://lists.llvm.org/pipermail/llvm-dev/2017-January/109340.html
...but I'm hoping we can remove that soon.

Correctness proofs based on Alive:

Name: smaxmin
Pre: C1 < C2
%cmp2 = icmp slt i8 %x, C2
%min = select i1 %cmp2, i8 %x, i8 C2
%cmp3 = icmp slt i8 %x, C1
%r = select i1 %cmp3, i8 C1, i8 %min
=>
%cmp2 = icmp slt i8 %x, C2
%min = select i1 %cmp2, i8 %x, i8 C2
%cmp1 = icmp sgt i8 %min, C1
%r = select i1 %cmp1, i8 %min, i8 C1

Name: sminmax
Pre: C1 > C2
%cmp2 = icmp sgt i8 %x, C2
%max = select i1 %cmp2, i8 %x, i8 C2
%cmp3 = icmp sgt i8 %x, C1
%r = select i1 %cmp3, i8 C1, i8 %max
=>
%cmp2 = icmp sgt i8 %x, C2
%max = select i1 %cmp2, i8 %x, i8 C2
%cmp1 = icmp slt i8 %max, C1
%r = select i1 %cmp1, i8 %max, i8 C1

---------------------------------------- 
Optimization: smaxmin 
Done: 1 
Optimization is correct! 
---------------------------------------- 
Optimization: sminmax 
Done: 1 
Optimization is correct!



Name: umaxmin
Pre: C1 u< C2
%cmp2 = icmp ult i8 %x, C2
%min = select i1 %cmp2, i8 %x, i8 C2
%cmp3 = icmp ult i8 %x, C1
%r = select i1 %cmp3, i8 C1, i8 %min
=>
%cmp2 = icmp ult i8 %x, C2
%min = select i1 %cmp2, i8 %x, i8 C2
%cmp1 = icmp ugt i8 %min, C1
%r = select i1 %cmp1, i8 %min, i8 C1

Name: uminmax
Pre: C1 u> C2
%cmp2 = icmp ugt i8 %x, C2
%max = select i1 %cmp2, i8 %x, i8 C2
%cmp3 = icmp ugt i8 %x, C1
%r = select i1 %cmp3, i8 C1, i8 %max
=>
%cmp2 = icmp ugt i8 %x, C2
%max = select i1 %cmp2, i8 %x, i8 C2
%cmp1 = icmp ult i8 %max, C1
%r = select i1 %cmp1, i8 %max, i8 C1

---------------------------------------- 
Optimization: umaxmin 
Done: 1 
Optimization is correct! 
---------------------------------------- 
Optimization: uminmax 
Done: 1 
Optimization is correct! 
 

llvm-svn: 292660
2017-01-20 22:18:47 +00:00
Sanjay Patel
cbf458f9f0 [InstCombine] add tests to show missed canonicalization of min/max; NFC
Unfortunately, recognizing these in value tracking may cause us to hit
a hack in InstCombiner::visitICmpInst() more often:
http://lists.llvm.org/pipermail/llvm-dev/2017-January/109340.html

...but besides being the obviously Right Thing To Do, there's a clear 
codegen win from identifying these patterns for several targets.

llvm-svn: 292655
2017-01-20 21:49:41 +00:00
Simon Pilgrim
228925067e [InstCombine][X86] Add MULDQ/MULUDQ undef handling
llvm-svn: 292627
2017-01-20 18:20:30 +00:00
Simon Pilgrim
014fbda9bf [InstCombine][SSE] Tests showing missed opportunities to handle muldq/muludq with undef arguments
Fixed a typo in existing test names at the same time

llvm-svn: 292619
2017-01-20 17:06:38 +00:00
Simon Pilgrim
81f5b7f9e4 [InstCombine][SSE] Tests showing missed opportunities to constant fold packss/packus
llvm-svn: 292609
2017-01-20 13:21:30 +00:00
Simon Pilgrim
78d53eb230 [InstCombine][SSE] Tests showing missed opportunities to handle packss/packus with undef arguments
llvm-svn: 292601
2017-01-20 11:28:07 +00:00
Simon Pilgrim
7f59f5cfb4 [InstCombine][SSE] Add DemandedElts support for PACKSS/PACKUS instructions
Simplify a packss/packus truncation based on the elements of the mask that are actually demanded.

Differential Revision: https://reviews.llvm.org/D28777

llvm-svn: 292591
2017-01-20 09:28:21 +00:00
Davide Italiano
97a0cbad8b [InstCombine] Simplify gep (gep p, a), (b-a)
Patch by Andrea Canciani.

Differential Revision:  https://reviews.llvm.org/D27413

llvm-svn: 292506
2017-01-19 18:51:56 +00:00
Sanjay Patel
70009a571f [InstCombine] icmp Pred (shl nsw X, C1), C0 --> icmp Pred X, C0 >> C1
Try harder to fold icmp with shl nsw as discussed here:
http://lists.llvm.org/pipermail/llvm-dev/2017-January/108749.html

This is similar to the 'shl nuw' transforms that were added with D25913.

This may eventually help solve:
https://llvm.org/bugs/show_bug.cgi?id=30773

Differential Revision: https://reviews.llvm.org/D28406

llvm-svn: 292492
2017-01-19 16:12:10 +00:00