1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-01 08:23:21 +01:00
Commit Graph

811 Commits

Author SHA1 Message Date
Nick Lewycky
2cbaf887bb Add more analysis of the sign bit of an srem instruction. If the LHS is negative
then the result could go either way. If it's provably positive then so is the
srem. Fixes PR9343 #7!

llvm-svn: 127146
2011-03-07 01:50:10 +00:00
Nick Lewycky
46bb763f35 ConstantInt has some getters which return ConstantInt's or ConstantVector's of
the value splatted into every element. Extend this to getTrue and getFalse which
by providing new overloads that take Types that are either i1 or <N x i1>. Use
it in InstCombine to add vector support to some code, fixing PR8469!

llvm-svn: 127116
2011-03-06 03:36:19 +00:00
Nick Lewycky
a2cb87f86d Thread comparisons over udiv/sdiv/ashr/lshr exact and lshr nuw/nsw whenever
possible. This goes into instcombine and instsimplify because instsimplify
doesn't need to check hasOneUse since it returns (almost exclusively) constants.

This fixes PR9343 #4 #5 and #8!

llvm-svn: 127064
2011-03-05 05:19:11 +00:00
Nick Lewycky
b2557b7cf1 Try once again to optimize "icmp (srem X, Y), Y" by turning the comparison into
true/false or "icmp slt/sge Y, 0".

llvm-svn: 127063
2011-03-05 04:28:48 +00:00
Nick Lewycky
3bc3a84ba8 Fold "icmp pred (srem X, Y), Y" like we do for urem. Handle signed comparisons
in the urem case, though not the other way around. This is enough to get #3 from
PR9343!

llvm-svn: 126991
2011-03-04 10:06:52 +00:00
Anders Carlsson
1eb388e6c3 Make InstCombiner::FoldAndOfICmps create a ConstantRange that's the
intersection of the LHS and RHS ConstantRanges and return "false" when
the range is empty.

This simplifies some code and catches some extra cases.

llvm-svn: 126744
2011-03-01 15:05:01 +00:00
Nick Lewycky
dcc97b5f44 srem doesn't actually have the same resulting sign as its numerator, you could
also have a zero when numerator = denominator. Reverts parts of r126635 and
r126637.

llvm-svn: 126644
2011-02-28 09:17:39 +00:00
Nick Lewycky
28f01da48e Teach InstCombine to fold "(shr exact X, Y) == 0" --> X == 0, fixing #1 from
PR9343.

llvm-svn: 126643
2011-02-28 08:31:40 +00:00
Nick Lewycky
e0f44d0aba The sign of an srem instruction is the sign of its dividend (the first
argument), regardless of the divisor. Teach instcombine about this and fix
test7 in PR9343!

llvm-svn: 126635
2011-02-28 06:20:05 +00:00
Chris Lattner
72a2ebab6c change instcombine to not turn a call to non-varargs bitcast of
function prototype into a call to a varargs prototype.  We do
allow the xform if we have a definition, but otherwise we don't
want to risk that we're changing the abi in a subtle way.  On
X86-64, for example, varargs require passing stuff in %al.

llvm-svn: 126363
2011-02-24 05:10:56 +00:00
Benjamin Kramer
50cd35c25e InstCombine: Add a bunch of combines of the form x | (y ^ z).
We usually catch this kind of optimization through InstSimplify's distributive
magic, but or doesn't distribute over xor in general.

"A | ~(A | B) -> A | ~B" hits 24 times on gcc.c.

llvm-svn: 126081
2011-02-20 13:23:43 +00:00
Eli Friedman
35ed1e5d6c PR9218: SimplifyDemandedVectorElts can return a non-null value that is not
the instruction passed in.  Make sure to account for this correctly, instead
of looping infinitely.

llvm-svn: 126058
2011-02-19 22:42:40 +00:00
Duncan Sands
1ddd628de0 Add some transforms of the kind X-Y>X -> 0>Y which are valid when there is no
overflow.  These subsume some existing equality transforms, so zap those.

llvm-svn: 125843
2011-02-18 16:25:37 +00:00
Chris Lattner
f9501b79f9 have instcombine preserve nsw/nuw/exact when sinking
common operations through a phi. 

llvm-svn: 125790
2011-02-17 23:01:49 +00:00
Chris Lattner
fc8ee641a2 fix instcombine merging GEPs through a PHI to only make the
result inbounds if all of the inputs are inbounds.

llvm-svn: 125785
2011-02-17 22:21:26 +00:00
Nadav Rotem
ad2fd4eada Enhance constant folding of bitcast operations on vectors of floats.
Add getAllOnesValue of FP numbers to Constants and APFloat.
Add more tests.

llvm-svn: 125776
2011-02-17 21:22:27 +00:00
Duncan Sands
00610dbf64 Transform "A + B >= A + C" into "B >= C" if the adds do not wrap. Likewise for some
variations (some of these were already present so I unified the code).  Spotted by my
auto-simplifier as occurring a lot.

llvm-svn: 125734
2011-02-17 07:46:37 +00:00
Chris Lattner
6e936c247f preserve NUW/NSW when transforming add x,x
llvm-svn: 125711
2011-02-17 02:23:02 +00:00
Chris Lattner
79947d56ea filecheckize
llvm-svn: 125710
2011-02-17 02:21:03 +00:00
Nick Lewycky
5c854580b2 Teach PatternMatch that splat vectors could be floating point as well as
integer. Fixes PR9228!

llvm-svn: 125613
2011-02-15 23:13:23 +00:00
Nadav Rotem
5306a4ae96 Fix 9216 - Endless loop in InstCombine pass.
The pattern "A&(A^B) -> A & ~B" recreated itself because ~B is
actually a xor -1.

llvm-svn: 125557
2011-02-15 07:13:48 +00:00
Nadav Rotem
98c5af8517 Fix test
llvm-svn: 125460
2011-02-13 16:13:16 +00:00
Nadav Rotem
3ce2bfbb55 Fix a regression from r125393;
It caused a crash in MultiSource/Benchmarks/Bullet.
Opt hit an assertion with "opt -std-compile-opts" because
Constant::getAllOnesValue doesn't know how to handle floats.

This patch added a test to reproduce the problem and a check that the
destination vector is of integer type.

Thank you Benjamin!

llvm-svn: 125459
2011-02-13 15:45:34 +00:00
Chris Lattner
4362c74d13 add PR#
llvm-svn: 125455
2011-02-13 08:27:31 +00:00
Chris Lattner
72b78e11ba implement instcombine folding for things like (x >> c) < 42.
We were previously simplifying divisions, but not right shifts!

llvm-svn: 125454
2011-02-13 08:07:21 +00:00
Benjamin Kramer
793cd269de Also fold (A+B) == A -> B == 0 when the add is commuted.
llvm-svn: 125411
2011-02-11 21:46:48 +00:00
Nadav Rotem
c2e51ee68a Fix 9173.
Add more folding patterns to constant expressions of vector selects and vector
bitcasts.

llvm-svn: 125393
2011-02-11 19:37:55 +00:00
Chris Lattner
d2c1936c14 implement the first part of PR8882: when lowering an inbounds
gep to explicit addressing, we know that none of the intermediate
computation overflows.

This could use review: it seems that the shifts certainly wouldn't
overflow, but could the intermediate adds overflow if there is a 
negative index?

Previously the testcase would instcombine to:

define i1 @test(i64 %i) {
  %p1.idx.mask = and i64 %i, 4611686018427387903
  %cmp = icmp eq i64 %p1.idx.mask, 1000
  ret i1 %cmp
}

now we get:

define i1 @test(i64 %i) {
  %cmp = icmp eq i64 %i, 1000
  ret i1 %cmp
}

llvm-svn: 125271
2011-02-10 07:11:16 +00:00
Chris Lattner
6e84f48cd8 Enhance a bunch of transformations in instcombine to start generating
exact/nsw/nuw shifts and have instcombine infer them when it can prove
that the relevant properties are true for a given shift without them.

Also, a variety of refactoring to use the new patternmatch logic thrown
in for good luck.  I believe that this takes care of a bunch of related
code quality issues attached to PR8862.

llvm-svn: 125267
2011-02-10 05:36:31 +00:00
Chris Lattner
72ac244f4e Enhance the "compare with shift" and "compare with div"
optimizations to be much more aggressive in the face of
exact/nsw/nuw div and shifts.  For example, these (which
are the same except the first is 'exact' sdiv:

define i1 @sdiv_icmp4_exact(i64 %X) nounwind {
  %A = sdiv exact i64 %X, -5   ; X/-5 == 0 --> x == 0
  %B = icmp eq i64 %A, 0
  ret i1 %B
}

define i1 @sdiv_icmp4(i64 %X) nounwind {
  %A = sdiv i64 %X, -5   ; X/-5 == 0 --> x == 0
  %B = icmp eq i64 %A, 0
  ret i1 %B
}

compile down to:

define i1 @sdiv_icmp4_exact(i64 %X) nounwind {
  %1 = icmp eq i64 %X, 0
  ret i1 %1
}

define i1 @sdiv_icmp4(i64 %X) nounwind {
  %X.off = add i64 %X, 4
  %1 = icmp ult i64 %X.off, 9
  ret i1 %1
}

This happens when you do something like:
  (ptr1-ptr2) == 42

where the pointers are pointers to non-unit types.

llvm-svn: 125266
2011-02-10 05:23:05 +00:00
Chris Lattner
0decae4bf7 more cleanups, notably bitcast isn't used for "signed to unsigned type
conversions". :)

llvm-svn: 125265
2011-02-10 05:17:27 +00:00
Chris Lattner
e29022d779 merge two tests.
llvm-svn: 125195
2011-02-09 17:06:41 +00:00
Chris Lattner
7b6a968f5d enhance vmcore to know that udiv's can be exact, and add a trivial
instcombine xform to exercise this.

Nothing forms exact udivs yet though.  This is progress on PR8862

llvm-svn: 124992
2011-02-06 21:44:57 +00:00
Anders Carlsson
f184e5de9a Recognize and simplify
(A+B) == A  ->  B == 0
A == (A+B)  ->  B == 0

llvm-svn: 124567
2011-01-30 22:01:13 +00:00
Duncan Sands
1a18d8df96 My auto-simplifier noticed that ((X/Y)*Y)/Y occurs several times in SPEC
benchmarks, and that it can be simplified to X/Y.  (In general you can only
simplify (Z*Y)/Y to Z if the multiplication did not overflow; if Z has the
form "X/Y" then this is the case).  This patch implements that transform and
moves some Div logic out of instcombine and into InstructionSimplify.
Unfortunately instcombine gets in the way somewhat, since it likes to change
(X/Y)*Y into X-(X rem Y), so I had to teach instcombine about this too.
Finally, thanks to the NSW/NUW flags, sometimes we know directly that "Z*Y"
does not overflow, because the flag says so, so I added that logic too.  This
eliminates a bunch of divisions and subtractions in 447.dealII, and has good
effects on some other benchmarks too.  It seems to have quite an effect on
tramp3d-v4 but it's hard to say if it's good or bad because inlining decisions
changed, resulting in massive changes all over.

llvm-svn: 124487
2011-01-28 16:51:11 +00:00
Nick Lewycky
9bbbb3e6f5 Clean up the tests a little, make sure we match an instruction in the right
test.

llvm-svn: 124473
2011-01-28 05:13:17 +00:00
Nick Lewycky
74dfcccec4 Fold select + select where both selects are on the same condition.
llvm-svn: 124469
2011-01-28 03:28:10 +00:00
Owen Anderson
6e3425c7e0 Just because we have determined that an (fcmp | fcmp) is true for A < B,
A == B, and A > B, does not mean we can fold it to true.  We still need to
check for A ? B (A unordered B).

llvm-svn: 123993
2011-01-21 19:39:42 +00:00
Chris Lattner
f225708ef1 fix PR9013, an infinite loop in instcombine.
llvm-svn: 123968
2011-01-21 05:29:50 +00:00
Nick Lewycky
c4300debc2 Don't try to pull vector bitcasts that change the number of elements through
a select. A vector select is pairwise on each element so we'd need a new
condition with the right number of elements to select on. Fixes PR8994.

llvm-svn: 123963
2011-01-21 02:30:43 +00:00
Chris Lattner
2067fb2a93 enhance FoldOpIntoPhi in instcombine to try harder when a phi has
multiple uses.  In some cases, all the uses are the same operation,
so instcombine can go ahead and promote the phi.  In the testcase
this pushes an add out of the loop.

llvm-svn: 123568
2011-01-16 05:28:59 +00:00
Chris Lattner
aba06ce448 fix PR8983, a broken assertion.
llvm-svn: 123562
2011-01-16 03:43:53 +00:00
Chris Lattner
74ed5d30ca implement an instcombine xform that canonicalizes casts outside of and-with-constant operations.
This fixes rdar://8808586 which observed that we used to compile:


union xy {
        struct x { _Bool b[15]; } x;
        __attribute__((packed))
        struct y {
                __attribute__((packed)) unsigned long b0to7;
                __attribute__((packed)) unsigned int b8to11;
                __attribute__((packed)) unsigned short b12to13;
                __attribute__((packed)) unsigned char b14;
        } y;
};

struct x
foo(union xy *xy)
{
        return xy->x;
}

into:

_foo:                                   ## @foo
	movq	(%rdi), %rax
	movabsq	$1095216660480, %rcx    ## imm = 0xFF00000000
	andq	%rax, %rcx
	movabsq	$-72057594037927936, %rdx ## imm = 0xFF00000000000000
	andq	%rax, %rdx
	movzbl	%al, %esi
	orq	%rdx, %rsi
	movq	%rax, %rdx
	andq	$65280, %rdx            ## imm = 0xFF00
	orq	%rsi, %rdx
	movq	%rax, %rsi
	andq	$16711680, %rsi         ## imm = 0xFF0000
	orq	%rdx, %rsi
	movl	%eax, %edx
	andl	$-16777216, %edx        ## imm = 0xFFFFFFFFFF000000
	orq	%rsi, %rdx
	orq	%rcx, %rdx
	movabsq	$280375465082880, %rcx  ## imm = 0xFF0000000000
	movq	%rax, %rsi
	andq	%rcx, %rsi
	orq	%rdx, %rsi
	movabsq	$71776119061217280, %r8 ## imm = 0xFF000000000000
	andq	%r8, %rax
	orq	%rsi, %rax
	movzwl	12(%rdi), %edx
	movzbl	14(%rdi), %esi
	shlq	$16, %rsi
	orl	%edx, %esi
	movq	%rsi, %r9
	shlq	$32, %r9
	movl	8(%rdi), %edx
	orq	%r9, %rdx
	andq	%rdx, %rcx
	movzbl	%sil, %esi
	shlq	$32, %rsi
	orq	%rcx, %rsi
	movl	%edx, %ecx
	andl	$-16777216, %ecx        ## imm = 0xFFFFFFFFFF000000
	orq	%rsi, %rcx
	movq	%rdx, %rsi
	andq	$16711680, %rsi         ## imm = 0xFF0000
	orq	%rcx, %rsi
	movq	%rdx, %rcx
	andq	$65280, %rcx            ## imm = 0xFF00
	orq	%rsi, %rcx
	movzbl	%dl, %esi
	orq	%rcx, %rsi
	andq	%r8, %rdx
	orq	%rsi, %rdx
	ret

We now compile this into:

_foo:                                   ## @foo
## BB#0:                                ## %entry
	movzwl	12(%rdi), %eax
	movzbl	14(%rdi), %ecx
	shlq	$16, %rcx
	orl	%eax, %ecx
	shlq	$32, %rcx
	movl	8(%rdi), %edx
	orq	%rcx, %rdx
	movq	(%rdi), %rax
	ret

A small improvement :-)

llvm-svn: 123520
2011-01-15 06:32:33 +00:00
Duncan Sands
44c273d907 Move some shift transforms out of instcombine and into InstructionSimplify.
While there, I noticed that the transform "undef >>a X -> undef" was wrong.
For example if X is 2 then the top two bits must be equal, so the result can
not be anything.  I fixed this in the constant folder as well.  Also, I made
the transform for "X << undef" stronger: it now folds to undef always, even
though X might be zero.  This is in accordance with the LangRef, but I must
admit that it is fairly aggressive.  Also, I added "i32 X << 32 -> undef"
following the LangRef and the constant folder, likewise fairly aggressive.

llvm-svn: 123417
2011-01-14 00:37:45 +00:00
Owen Anderson
4479341626 Fix a random missed optimization by making InstCombine more aggressive when determining which bits are demanded by
a comparison against a constant.

llvm-svn: 123203
2011-01-11 00:36:45 +00:00
Chandler Carruth
772e26df36 Teach instcombine about the rest of the SSE and SSE2 conversion
intrinsics element dependencies. Reviewed by Nick.

llvm-svn: 123161
2011-01-10 07:19:37 +00:00
Chandler Carruth
7f854ac9a9 Fold two related tests into the newly FileCheck-ized test, migrating
them to FileCheck as well.

llvm-svn: 123154
2011-01-10 02:53:58 +00:00
Chandler Carruth
7c332e5abd Clean up and FileCheck-ize a test.
llvm-svn: 123153
2011-01-10 02:53:54 +00:00
Tobias Grosser
9899845dd3 Instcombine: Fix pattern where the sext did not dominate the icmp using it
llvm-svn: 123121
2011-01-09 16:00:11 +00:00
Frits van Bommel
966cc00809 Fix a bug in r123034 (trying to sext/zext non-integers) and clean up a little.
llvm-svn: 123061
2011-01-08 10:51:36 +00:00