Commit Graph

781 Commits

Author SHA1 Message Date
Chris Lattner
0decae4bf7 more cleanups, notably bitcast isn't used for "signed to unsigned type
conversions". :)

llvm-svn: 125265
2011-02-10 05:17:27 +00:00
Chris Lattner
e29022d779 merge two tests.
llvm-svn: 125195
2011-02-09 17:06:41 +00:00
Chris Lattner
7b6a968f5d enhance vmcore to know that udivs can be exact, and add a trivial
instcombine xform to exercise this.

Nothing forms exact udivs yet though.  This is progress on PR8862
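
A sketch of the kind of fold the exact flag enables (illustrative only):
  %q = udiv exact i32 %x, 8
  ; "exact" asserts the division has no remainder, so this can be
  ; rewritten as:  %q = lshr exact i32 %x, 3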

llvm-svn: 124992
2011-02-06 21:44:57 +00:00
Anders Carlsson
f184e5de9a Recognize and simplify
(A+B) == A  ->  B == 0
A == (A+B)  ->  B == 0
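
In IR terms (an illustrative sketch):
  %sum = add i32 %a, %b
  %cmp = icmp eq i32 %sum, %a
  ; addition is invertible modulo 2^32, so this simplifies to:
  ; %cmp = icmp eq i32 %b, 0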

llvm-svn: 124567
2011-01-30 22:01:13 +00:00
Duncan Sands
1a18d8df96 My auto-simplifier noticed that ((X/Y)*Y)/Y occurs several times in SPEC
benchmarks, and that it can be simplified to X/Y.  (In general you can only
simplify (Z*Y)/Y to Z if the multiplication did not overflow; if Z has the
form "X/Y" then this is the case).  This patch implements that transform and
moves some Div logic out of instcombine and into InstructionSimplify.
Unfortunately instcombine gets in the way somewhat, since it likes to change
(X/Y)*Y into X-(X rem Y), so I had to teach instcombine about this too.
Finally, thanks to the NSW/NUW flags, sometimes we know directly that "Z*Y"
does not overflow, because the flag says so, so I added that logic too.  This
eliminates a bunch of divisions and subtractions in 447.dealII, and has good
effects on some other benchmarks too.  It seems to have quite an effect on
tramp3d-v4 but it's hard to say if it's good or bad because inlining decisions
changed, resulting in massive changes all over.
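
An illustrative IR instance of the new simplification (a sketch, not taken
from the patch itself):
  %div = udiv i32 %x, %y
  %mul = mul i32 %div, %y      ; (X/Y)*Y never exceeds X, so it cannot overflow
  %res = udiv i32 %mul, %y     ; simplifies to %div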

llvm-svn: 124487
2011-01-28 16:51:11 +00:00
Nick Lewycky
9bbbb3e6f5 Clean up the tests a little, and make sure we match an instruction in the right
test.

llvm-svn: 124473
2011-01-28 05:13:17 +00:00
Nick Lewycky
74dfcccec4 Fold select + select where both selects are on the same condition.
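A sketch of the fold (illustrative):
  %inner = select i1 %c, i32 %a, i32 %b
  %outer = select i1 %c, i32 %inner, i32 %d
  ; when %c is true %outer is %a, when false it is %d, so:
  ; %outer = select i1 %c, i32 %a, i32 %d
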
llvm-svn: 124469
2011-01-28 03:28:10 +00:00
Owen Anderson
6e3425c7e0 Just because we have determined that an (fcmp | fcmp) is true for A < B,
A == B, and A > B, does not mean we can fold it to true.  We still need to
check for A ? B (A unordered B).

llvm-svn: 123993
2011-01-21 19:39:42 +00:00
Chris Lattner
f225708ef1 fix PR9013, an infinite loop in instcombine.
llvm-svn: 123968
2011-01-21 05:29:50 +00:00
Nick Lewycky
c4300debc2 Don't try to pull vector bitcasts that change the number of elements through
a select. A vector select is pairwise on each element so we'd need a new
condition with the right number of elements to select on. Fixes PR8994.

llvm-svn: 123963
2011-01-21 02:30:43 +00:00
Chris Lattner
2067fb2a93 enhance FoldOpIntoPhi in instcombine to try harder when a phi has
multiple uses.  In some cases, all the uses are the same operation,
so instcombine can go ahead and promote the phi.  In the testcase
this pushes an add out of the loop.

llvm-svn: 123568
2011-01-16 05:28:59 +00:00
Chris Lattner
aba06ce448 fix PR8983, a broken assertion.
llvm-svn: 123562
2011-01-16 03:43:53 +00:00
Chris Lattner
74ed5d30ca implement an instcombine xform that canonicalizes casts outside of and-with-constant operations.
This fixes rdar://8808586 which observed that we used to compile:


union xy {
        struct x { _Bool b[15]; } x;
        __attribute__((packed))
        struct y {
                __attribute__((packed)) unsigned long b0to7;
                __attribute__((packed)) unsigned int b8to11;
                __attribute__((packed)) unsigned short b12to13;
                __attribute__((packed)) unsigned char b14;
        } y;
};

struct x
foo(union xy *xy)
{
        return xy->x;
}

into:

_foo:                                   ## @foo
	movq	(%rdi), %rax
	movabsq	$1095216660480, %rcx    ## imm = 0xFF00000000
	andq	%rax, %rcx
	movabsq	$-72057594037927936, %rdx ## imm = 0xFF00000000000000
	andq	%rax, %rdx
	movzbl	%al, %esi
	orq	%rdx, %rsi
	movq	%rax, %rdx
	andq	$65280, %rdx            ## imm = 0xFF00
	orq	%rsi, %rdx
	movq	%rax, %rsi
	andq	$16711680, %rsi         ## imm = 0xFF0000
	orq	%rdx, %rsi
	movl	%eax, %edx
	andl	$-16777216, %edx        ## imm = 0xFFFFFFFFFF000000
	orq	%rsi, %rdx
	orq	%rcx, %rdx
	movabsq	$280375465082880, %rcx  ## imm = 0xFF0000000000
	movq	%rax, %rsi
	andq	%rcx, %rsi
	orq	%rdx, %rsi
	movabsq	$71776119061217280, %r8 ## imm = 0xFF000000000000
	andq	%r8, %rax
	orq	%rsi, %rax
	movzwl	12(%rdi), %edx
	movzbl	14(%rdi), %esi
	shlq	$16, %rsi
	orl	%edx, %esi
	movq	%rsi, %r9
	shlq	$32, %r9
	movl	8(%rdi), %edx
	orq	%r9, %rdx
	andq	%rdx, %rcx
	movzbl	%sil, %esi
	shlq	$32, %rsi
	orq	%rcx, %rsi
	movl	%edx, %ecx
	andl	$-16777216, %ecx        ## imm = 0xFFFFFFFFFF000000
	orq	%rsi, %rcx
	movq	%rdx, %rsi
	andq	$16711680, %rsi         ## imm = 0xFF0000
	orq	%rcx, %rsi
	movq	%rdx, %rcx
	andq	$65280, %rcx            ## imm = 0xFF00
	orq	%rsi, %rcx
	movzbl	%dl, %esi
	orq	%rcx, %rsi
	andq	%r8, %rdx
	orq	%rsi, %rdx
	ret

We now compile this into:

_foo:                                   ## @foo
## BB#0:                                ## %entry
	movzwl	12(%rdi), %eax
	movzbl	14(%rdi), %ecx
	shlq	$16, %rcx
	orl	%eax, %ecx
	shlq	$32, %rcx
	movl	8(%rdi), %edx
	orq	%rcx, %rdx
	movq	(%rdi), %rax
	ret

A small improvement :-)

llvm-svn: 123520
2011-01-15 06:32:33 +00:00
Duncan Sands
44c273d907 Move some shift transforms out of instcombine and into InstructionSimplify.
While there, I noticed that the transform "undef >>a X -> undef" was wrong.
For example, if X is 2 then the top two bits of the result must be equal, so the
result cannot be an arbitrary value.  I fixed this in the constant folder as well.  Also, I made
the transform for "X << undef" stronger: it now folds to undef always, even
though X might be zero.  This is in accordance with the LangRef, but I must
admit that it is fairly aggressive.  Also, I added "i32 X << 32 -> undef"
following the LangRef and the constant folder, likewise fairly aggressive.
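
The cases in IR terms (illustrative):
  %a = ashr i32 undef, 2   ; NOT folded to undef: the top bits of any ashr
                           ; result are copies of the sign bit, so the result
                           ; cannot be arbitrary
  %b = shl i32 %x, undef   ; folded to undef
  %c = shl i32 %x, 32      ; shift amount >= bit width: folded to undef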

llvm-svn: 123417
2011-01-14 00:37:45 +00:00
Owen Anderson
4479341626 Fix a random missed optimization by making InstCombine more aggressive when determining which bits are demanded by
a comparison against a constant.

llvm-svn: 123203
2011-01-11 00:36:45 +00:00
Chandler Carruth
772e26df36 Teach instcombine about the rest of the SSE and SSE2 conversion
intrinsics element dependencies. Reviewed by Nick.

llvm-svn: 123161
2011-01-10 07:19:37 +00:00
Chandler Carruth
7f854ac9a9 Fold two related tests into the newly FileCheck-ized test, migrating
them to FileCheck as well.

llvm-svn: 123154
2011-01-10 02:53:58 +00:00
Chandler Carruth
7c332e5abd Clean up and FileCheck-ize a test.
llvm-svn: 123153
2011-01-10 02:53:54 +00:00
Tobias Grosser
9899845dd3 Instcombine: Fix pattern where the sext did not dominate the icmp using it
llvm-svn: 123121
2011-01-09 16:00:11 +00:00
Frits van Bommel
966cc00809 Fix a bug in r123034 (trying to sext/zext non-integers) and clean up a little.
llvm-svn: 123061
2011-01-08 10:51:36 +00:00
Tobias Grosser
48469b566a InstCombine: Match min/max hidden by sext/zext
X = sext x; x >s c ? X : C+1 --> X = sext x; X <s C+1 ? C+1 : X
X = sext x; x <s c ? X : C-1 --> X = sext x; X >s C-1 ? C-1 : X
X = zext x; x >u c ? X : C+1 --> X = zext x; X <u C+1 ? C+1 : X
X = zext x; x <u c ? X : C-1 --> X = zext x; X >u C-1 ? C-1 : X
X = sext x; x >u c ? X : C+1 --> X = sext x; X <u C+1 ? C+1 : X
X = sext x; x <u c ? X : C-1 --> X = sext x; X >u C-1 ? C-1 : X

Instead of calculating this with mixed types promote all to the
larger type. This enables scalar evolution to analyze this
expression. PR8866
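
For example, with i8 promoted to i32 (an illustrative sketch):
  %X    = sext i8 %x to i32
  %cmp  = icmp sgt i8 %x, 42               ; mixed-width computation
  %sel  = select i1 %cmp, i32 %X, i32 43
  ; becomes a computation entirely in i32:
  %cmp2 = icmp slt i32 %X, 43
  %sel2 = select i1 %cmp2, i32 43, i32 %X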

llvm-svn: 123034
2011-01-07 21:33:14 +00:00
Benjamin Kramer
62b5a4d14c Revert r122959; it needs more thought.  Add it back to README.txt with additional notes.
llvm-svn: 123030
2011-01-07 20:42:20 +00:00
Benjamin Kramer
fb2bb22b6f InstCombine: Turn _chk functions into the "unsafe" variant if length and max length are equal.
This happens when we take the (non-constant) length from a malloc.
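
A sketch of the situation (illustrative; the libcall signatures shown are an
assumption):
  %buf = call i8* @malloc(i64 %n)
  ; the object size of %buf is %n, so copying %n bytes cannot overflow it:
  %r = call i8* @__memcpy_chk(i8* %buf, i8* %src, i64 %n, i64 %n)
  ; folds to the unchecked variant:
  ; %r = call i8* @memcpy(i8* %buf, i8* %src, i64 %n)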

llvm-svn: 122961
2011-01-06 14:22:52 +00:00
Benjamin Kramer
5834b2bab8 InstCombine: If we call llvm.objectsize on a malloc call we can replace it with the size passed to malloc.
llvm-svn: 122959
2011-01-06 13:11:05 +00:00
Benjamin Kramer
d5e1c24646 InstCombine: Teach llvm.objectsize folding to look through GEPs.
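A sketch in the typed-pointer syntax of the time (illustrative):
  %buf = alloca [64 x i8]
  %p   = getelementptr inbounds [64 x i8]* %buf, i32 0, i32 16
  %sz  = call i32 @llvm.objectsize.i32(i8* %p, i1 false)
  ; looking through the GEP leaves 64 - 16 = 48 bytes, so %sz folds to i32 48
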
llvm-svn: 122958
2011-01-06 13:07:49 +00:00
Chris Lattner
83067bc3e7 implement constant folding support for an exotic constant expr:
ret i64 ptrtoint (i8* getelementptr ([1000 x i8]* @X, i64 1, i64 sub (i64 0, i64 ptrtoint ([1000 x i8]* @X to i64))) to i64)

to "ret i64 1000".  This allows us to correctly compute the trip count
on a loop in PR8883, which occurs with std::fill on a char array.  This
allows us to transform it into a memset with a constant size.

llvm-svn: 122950
2011-01-06 06:19:46 +00:00
Chris Lattner
dbb1b09731 fix an off-by-one bug that caused a crash analyzing
ashr's with huge shift amounts, PR8896

llvm-svn: 122814
2011-01-04 18:19:15 +00:00
Owen Anderson
6afd90810e When determining if we can fold (x >> C1) << C2, the bits that we need to verify are zero
are not the low bits of x, but the bits that WILL be the low bits after the operation completes.
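
Concretely (an illustrative sketch):
  %s = lshr i32 %x, 6
  %r = shl i32 %s, 4
  ; candidate fold:  %r = lshr i32 %x, 2
  ; valid only if bits 2..5 of %x are zero (the bits that become the low four
  ; bits of the shifted result), not bits 0..3 of %x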

llvm-svn: 122529
2010-12-23 23:56:24 +00:00
Benjamin Kramer
27d13684f5 InstCombine: creating selects from -1 and 0 is fine; they combine into a sext from i1.
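A sketch (illustrative):
  %sel = select i1 %c, i32 -1, i32 0
  ; combines into:
  ; %sel = sext i1 %c to i32
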
llvm-svn: 122453
2010-12-22 23:12:15 +00:00
Duncan Sands
e1522867e6 Make this test not depend on how the variable is named.
llvm-svn: 122413
2010-12-22 17:08:04 +00:00
Duncan Sands
922251757b Add a generic expansion transform: A op (B op' C) -> (A op B) op' (A op C)
if both A op B and A op C simplify.  This fires fairly often but doesn't
make that much difference.  On gcc-as-one-file it removes two "and"s and
turns one branch into a select.
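
One instance where both halves simplify (an illustrative sketch, not taken
from the patch):
  %notx = xor i32 %x, -1
  %xy   = and i32 %x, %y
  %t    = or i32 %notx, %xy
  %r    = and i32 %x, %t
  ; expands to (x & ~x) | (x & (x & y)) = 0 | (x & y), so %r is just %xy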

llvm-svn: 122399
2010-12-22 13:36:08 +00:00
Benjamin Kramer
bec7a6be15 Teach InstCombine to merge (icmp ult (X + CA), C1) | (icmp eq X, C2) into (icmp ult (X + CA), C1 + 1) if C2 + CA == C1.
InstCombine creates these so now we compile x == 23 || x == 24 || x == 25 to
  %x.off = add i32 %x, -23
  %1 = icmp ult i32 %x.off, 3
instead of
  %x.off = add i32 %x, -23
  %1 = icmp ult i32 %x.off, 2
  %cmp3 = icmp eq i32 %x, 25
  %ret2 = or i1 %1, %cmp3

llvm-svn: 122248
2010-12-20 16:18:51 +00:00
Duncan Sands
f72cfa961d Have SimplifyBinOp dispatch Xor, Add and Sub to the corresponding methods
(they had just been forgotten before).  Adding Xor causes "main" in the
existing testcase 2010-11-01-lshr-mask.ll to be hugely more simplified.

llvm-svn: 122245
2010-12-20 14:47:04 +00:00
Chris Lattner
b27b5d0a3a fix PR8807 by making transformConstExprCastCall aware of byval arguments.
llvm-svn: 122238
2010-12-20 08:36:38 +00:00
Mon P Wang
236fa96503 Test case for r122215 when InstCombine optimizes memset
llvm-svn: 122216
2010-12-20 01:06:23 +00:00
Chris Lattner
29475c23d0 X86 supports i8/i16 overflow ops (except i8 multiplies), so we should
generate them.

Now we compile:

define zeroext i8 @X(i8 signext %a, i8 signext %b) nounwind ssp {
entry:
  %0 = tail call %0 @llvm.sadd.with.overflow.i8(i8 %a, i8 %b)
  %cmp = extractvalue %0 %0, 1
  br i1 %cmp, label %if.then, label %if.end

into:

_X:                                     ## @X
## BB#0:                                ## %entry
	subl	$12, %esp
	movb	16(%esp), %al
	addb	20(%esp), %al
	jo	LBB0_2

Before we were generating:

_X:                                     ## @X
## BB#0:                                ## %entry
	pushl	%ebp
	movl	%esp, %ebp
	subl	$8, %esp
	movb	12(%ebp), %al
	testb	%al, %al
	setge	%cl
	movb	8(%ebp), %dl
	testb	%dl, %dl
	setge	%ah
	cmpb	%cl, %ah
	sete	%cl
	addb	%al, %dl
	testb	%dl, %dl
	setge	%al
	cmpb	%al, %ah
	setne	%al
	andb	%cl, %al
	testb	%al, %al
	jne	LBB0_2

llvm-svn: 122186
2010-12-19 20:03:11 +00:00
Chris Lattner
3bc741a0d2 recognize an unsigned add with overflow idiom into uadd.
This resolves a README entry and technically resolves PR4916,
but we still get poor code for the testcase in that PR because
GVN isn't CSE'ing uadd with add, filed as PR8817.
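
The idiom in IR terms (an illustrative sketch):
  %sum = add i64 %x, %y
  %ov  = icmp ult i64 %sum, %x     ; wrapped iff the sum is below an operand
  ; recognized as:
  ; %res = call { i64, i1 } @llvm.uadd.with.overflow.i64(i64 %x, i64 %y)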

Previously we got:

_test7:                                 ## @test7
	addq	%rsi, %rdi
	cmpq	%rdi, %rsi
	movl	$42, %eax
	cmovaq	%rsi, %rax
	ret

Now we get:

_test7:                                 ## @test7
	addq	%rsi, %rdi
	movl	$42, %eax
	cmovbq	%rsi, %rax
	ret

llvm-svn: 122182
2010-12-19 19:37:52 +00:00
Chris Lattner
faef9b6bfb optimize uadd(x, cst) into a comparison when the normal
result is dead.  This is required for my next patch to not
regress the testsuite.
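
For example (illustrative; the constant is arbitrary):
  %u  = call { i32, i1 } @llvm.uadd.with.overflow.i32(i32 %x, i32 42)
  %ov = extractvalue { i32, i1 } %u, 1
  ; if the sum (element 0) is dead, the overflow bit alone is:
  ; %ov = icmp ugt i32 %x, -43     ; i.e. %x > 0xFFFFFFD5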

llvm-svn: 122181
2010-12-19 19:35:32 +00:00
Chris Lattner
d1f114d8f2 generalize the sadd creation code to not require that the
sadd formed is half the size of the original type. We can
now compile this into a sadd.i8:

unsigned char X(char a, char b) {
  int res = a+b;
  if ((unsigned )(res+128) > 255U)
    abort();
  return res;
}

llvm-svn: 122178
2010-12-19 18:35:09 +00:00
Chris Lattner
bb0d067691 fix another miscompile in the llvm.sadd formation logic: it wasn't
checking to see if the high bits of the original add result were dead.
Inserting a smaller add and zexting back to that size is not good enough.

This is likely to be the fix for PR8816.

llvm-svn: 122177
2010-12-19 18:22:06 +00:00
Chris Lattner
c7876edb16 fix a bug (possibly PR8816) in the sadd forming xform: it isn't
profitable (or safe) to promote code when the add-with-constant
has other uses.

llvm-svn: 122175
2010-12-19 17:59:02 +00:00
Nate Begeman
063d88d6fb Add vector versions of some existing scalar transforms to aid codegen in matching psign & pblend operations to the IR produced by clang/gcc for their C idioms.
llvm-svn: 122105
2010-12-17 23:12:19 +00:00
Owen Anderson
6acf8c9125 Reapply r121905 (automatic synthesis of @llvm.sadd.with.overflow) with a fix for a bug that manifested itself
on the DragonEgg self-host bot.  Unfortunately, the testcase is pretty messy and doesn't reduce well due to
interactions with other parts of InstCombine.

llvm-svn: 122072
2010-12-17 18:08:00 +00:00
Duncan Sands
22de496ae3 Speculatively revert commit r121905 since it looks like it might have broken the
dragonegg self-host buildbot.  Original commit message:

Add an InstCombine transform to recognize instances of manual overflow-safe addition
(performing the addition in a wider type and explicitly checking for overflow), and
fold them down to intrinsics.  This currently only supports signed-addition, but could
be generalized if someone works out the magic constant formulas for other operations.

llvm-svn: 121965
2010-12-16 09:40:54 +00:00
Owen Anderson
aefeb448a9 Add an InstCombine transform to recognize instances of manual overflow-safe addition
(performing the addition in a wider type and explicitly checking for overflow), and
fold them down to intrinsics.  This currently only supports signed-addition, but could
be generalized if someone works out the magic constant formulas for other operations.
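
One shape of the manual idiom (an illustrative sketch; the patch may match
other equivalent forms):
  %xw = sext i32 %x to i64
  %yw = sext i32 %y to i64
  %s  = add i64 %xw, %yw
  %t  = trunc i64 %s to i32
  %tw = sext i32 %t to i64
  %ov = icmp ne i64 %s, %tw        ; overflow iff the sum does not fit in i32
  ; folds to:
  ; %r = call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 %x, i32 %y)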

Fixes <rdar://problem/8558713>.

llvm-svn: 121905
2010-12-15 22:32:38 +00:00
Benjamin Kramer
a638216447 Generalize the and-icmp-select instcombine further by allowing selects of the form
(x & 2^n) ? 2^m+C : C

we can offset both arms by C to get the "(x & 2^n) ? 2^m : 0" form, optimize the
select to a shift, and apply the offset afterwards.
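
With n = 2, m = 6, C = 8 (an illustrative sketch):
  %bit = and i32 %x, 4
  %tst = icmp ne i32 %bit, 0
  %sel = select i1 %tst, i32 72, i32 8
  ; offsetting both arms by 8 gives (x & 4) ? 64 : 0, a shift by m - n = 4:
  %shl = shl i32 %bit, 4
  %res = add i32 %shl, 8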

llvm-svn: 121609
2010-12-11 10:49:22 +00:00
Benjamin Kramer
5a1721f4ac Factor the (x & 2^n) ? 2^m : 0 instcombine into its own method and generalize it
to catch cases where n != m with a shift.

llvm-svn: 121608
2010-12-11 09:42:59 +00:00
Dan Gohman
3d9fc7db03 Really check that the bits that will become zero are actually already zero
before eliminating the operation that zeros them. This fixes rdar://8739316.

llvm-svn: 121353
2010-12-09 02:52:17 +00:00
Chris Lattner
bea813875e remove a use of llvm-dis
llvm-svn: 120383
2010-11-30 02:04:15 +00:00
Frits van Bommel
a59a8cf49f Transform (extractvalue (load P), ...) to (load (gep P, 0, ...)) if the load has no other uses, shrinking the load.
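A sketch in the typed-pointer syntax of the time (illustrative):
  %agg = load { i32, i64 }* %p
  %val = extractvalue { i32, i64 } %agg, 1
  ; if %agg has no other uses, load only the needed field:
  %gep  = getelementptr { i32, i64 }* %p, i32 0, i32 1
  %val2 = load i64* %gep
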
llvm-svn: 120323
2010-11-29 21:56:20 +00:00