Chris Lattner
69a9143584
Add an instcombine to clean up a common pattern produced
...
by the SRoA "promote to large integer" code, eliminating
some type conversions like this:
%94 = zext i16 %93 to i32 ; <i32> [#uses=2]
%96 = lshr i32 %94, 8 ; <i32> [#uses=1]
%101 = trunc i32 %96 to i8 ; <i8> [#uses=1]
This also unblocks other xforms from happening, now clang is able to compile:
struct S { float A, B, C, D; };
float foo(struct S A) { return A.A + A.B+A.C+A.D; }
into:
_foo: ## @foo
## BB#0: ## %entry
pshufd $1, %xmm0, %xmm2
addss %xmm0, %xmm2
movdqa %xmm1, %xmm3
addss %xmm2, %xmm3
pshufd $1, %xmm1, %xmm0
addss %xmm3, %xmm0
ret
on x86-64, instead of:
_foo: ## @foo
## BB#0: ## %entry
movd %xmm0, %rax
shrq $32, %rax
movd %eax, %xmm2
addss %xmm0, %xmm2
movapd %xmm1, %xmm3
addss %xmm2, %xmm3
movd %xmm1, %rax
shrq $32, %rax
movd %eax, %xmm0
addss %xmm3, %xmm0
ret
This seems pretty close to optimal to me, at least without
using horizontal adds. This also triggers in lots of other
code, including SPEC.
llvm-svn: 112278
2010-08-27 18:31:05 +00:00
Chris Lattner
e9dafffae3
filecheckize
...
llvm-svn: 112235
2010-08-26 22:23:39 +00:00
Chris Lattner
1efc631212
rename test.
...
llvm-svn: 112234
2010-08-26 22:20:47 +00:00
Chris Lattner
d5d68438c1
optimize "integer extraction out of the middle of a vector" as produced
...
by SRoA. This is part of rdar://7892780, but needs another xform to
expose this.
llvm-svn: 112232
2010-08-26 22:14:59 +00:00
Chris Lattner
19a5dc488b
optimize bitcast(trunc(bitcast(x))) where the result is a float and 'x'
...
is a vector to be a vector element extraction. This allows clang to
compile:
struct S { float A, B, C, D; };
float foo(struct S A) { return A.A + A.B+A.C+A.D; }
into:
_foo: ## @foo
## BB#0: ## %entry
movd %xmm0, %rax
shrq $32, %rax
movd %eax, %xmm2
addss %xmm0, %xmm2
movapd %xmm1, %xmm3
addss %xmm2, %xmm3
movd %xmm1, %rax
shrq $32, %rax
movd %eax, %xmm0
addss %xmm3, %xmm0
ret
instead of:
_foo: ## @foo
## BB#0: ## %entry
movd %xmm0, %rax
movd %eax, %xmm0
shrq $32, %rax
movd %eax, %xmm2
addss %xmm0, %xmm2
movd %xmm1, %rax
movd %eax, %xmm1
addss %xmm2, %xmm1
shrq $32, %rax
movd %eax, %xmm0
addss %xmm1, %xmm0
ret
... eliminating half of the horribleness.
llvm-svn: 112227
2010-08-26 21:55:42 +00:00
Chris Lattner
d1a8743984
filecheckize
...
llvm-svn: 112225
2010-08-26 21:51:41 +00:00
Chris Lattner
3113ee607c
rename test
...
llvm-svn: 112224
2010-08-26 21:50:56 +00:00
Owen Anderson
678fd04aa5
Re-apply r111568 with a fix for the clang self-host.
...
llvm-svn: 111665
2010-08-20 18:24:43 +00:00
Owen Anderson
7c1b4fbd3b
Previous revert failed to remove this file.
...
llvm-svn: 111582
2010-08-19 23:45:15 +00:00
Owen Anderson
0e57acb623
Revert r111568 to unbreak clang self-host.
...
llvm-svn: 111571
2010-08-19 23:25:16 +00:00
Owen Anderson
7f2852ba2d
When a set of bitmask operations, typically from a bitfield initialization, only modifies the low bytes of a value,
...
we can narrow the store to only over-write the affected bytes.
llvm-svn: 111568
2010-08-19 22:15:40 +00:00
Eric Christopher
08e9f0250a
Temporarily revert r110987 as it's causing some miscompares in
...
vector heavy code. I'll re-enable when we've tracked down the problem.
llvm-svn: 111318
2010-08-17 22:55:27 +00:00
Nate Begeman
e57074fc48
Reapply this transformation now that it is passing the external test which it previously failed.
...
llvm-svn: 110987
2010-08-13 00:17:53 +00:00
Eric Christopher
34acdf57df
Temporarily revert 110737 and 110734, they were causing failures
...
in an external testsuite.
llvm-svn: 110905
2010-08-12 07:01:22 +00:00
Nate Begeman
36e284c2be
Add test for recent instcombine vector shuffle enhancement
...
llvm-svn: 110737
2010-08-10 21:58:00 +00:00
Eli Friedman
7197d66ff1
PR7853: fix a silly mistake introduced in r101899, and add a test to make sure
...
it doesn't regress again.
llvm-svn: 110597
2010-08-09 20:49:43 +00:00
Dan Gohman
a80f89dbc7
Make instcombine set explicit alignments on load or store
...
instructions with alignment 0, so that subsequent passes don't
need to bother checking the TargetData ABI size manually.
llvm-svn: 110128
2010-08-03 18:20:32 +00:00
Owen Anderson
e957c57ebb
Re-apply the infamous r108614, with a fix pointed out by Dirk Steinke.
...
llvm-svn: 110036
2010-08-02 09:32:13 +00:00
Daniel Dunbar
f2be238c99
Speculatively revert r108614, "Another attempt at getting the clang self-host to
...
like my instcombine patch.", in an attempt to fix Clang i386 bootstrap.
- Also PR7719.
llvm-svn: 109953
2010-07-31 19:51:11 +00:00
Owen Anderson
f66e1873ea
Testcase for r108687.
...
llvm-svn: 108689
2010-07-19 08:14:26 +00:00
Owen Anderson
c8dc055b5e
Another attempt at getting the clang self-host to like my instcombine patch.
...
llvm-svn: 108614
2010-07-17 06:56:35 +00:00
Eric Christopher
5eef314caf
Also revert 108422, it's causing some test failures.
...
Working on testcases for Owen.
llvm-svn: 108494
2010-07-16 01:36:12 +00:00
Owen Anderson
01a2992a91
Reapply r108378, with bugfixes, testcase, and improved comment formatting.
...
This now passes LIT, nighty test, and llvm-gcc bootstrap on my machine.
llvm-svn: 108422
2010-07-15 15:00:23 +00:00
Chris Lattner
38e6ecd9f1
revert r108320, I see the failures now...
...
llvm-svn: 108322
2010-07-14 06:16:35 +00:00
Chris Lattner
5822d6d579
reapply benjamin's instcombine patch, I don't see anything wrong with it and can't repro any problems with a manual self-host.
...
llvm-svn: 108320
2010-07-14 05:59:13 +00:00
Benjamin Kramer
cf8ad46899
Nope, still breaks the release selfhost bots :(
...
llvm-svn: 108153
2010-07-12 16:38:48 +00:00
Benjamin Kramer
e391789246
Reapply the "or" half of r108136, which seems to be less problematic.
...
llvm-svn: 108152
2010-07-12 16:15:48 +00:00
Benjamin Kramer
98c95e7743
Revert r108141 again, sigh.
...
llvm-svn: 108148
2010-07-12 14:42:04 +00:00
Benjamin Kramer
c4f46375d3
Reapply 108136 with an ugly pasto fixed.
...
llvm-svn: 108141
2010-07-12 13:44:00 +00:00
Benjamin Kramer
d9bf737e62
Revert r108136 until I figure out why it broke selfhost.
...
llvm-svn: 108139
2010-07-12 12:35:49 +00:00
Benjamin Kramer
f00a49ceff
instcombine: fold (x & y) | (~x & z) and (x & y) ^ (~x & z) into ((y ^ z) & x) ^ z which is one instruction shorter. (PR6773)
...
before:
%and = and i32 %y, %x
%neg = xor i32 %x, -1
%and4 = and i32 %z, %neg
%xor = xor i32 %and4, %and
after:
%xor1 = xor i32 %z, %y
%and2 = and i32 %xor1, %x
%xor = xor i32 %and2, %z
llvm-svn: 108136
2010-07-12 11:54:45 +00:00
Chris Lattner
59bffe35a1
fix PR7311 by avoiding breaking casts when a bitcast from scalar->vector
...
is involved.
llvm-svn: 108117
2010-07-12 01:19:22 +00:00
Chris Lattner
d8288040c3
fix PR7429, a crash turning a load from a string into a float.
...
llvm-svn: 108113
2010-07-12 00:22:51 +00:00
Chris Lattner
68f5ec0fa2
convert to filechecconvert to filecheckk
...
llvm-svn: 108112
2010-07-12 00:21:10 +00:00
Chris Lattner
64eeea9044
merge two tests.
...
llvm-svn: 108111
2010-07-12 00:19:47 +00:00
Benjamin Kramer
27eb255a70
Teach instcombine to transform
...
(X >s -1) ? C1 : C2 and (X <s 0) ? C2 : C1
into ((X >>s 31) & (C2 - C1)) + C1, avoiding the conditional.
This optimization could be extended to take non-const C1 and C2 but we better
stay conservative to avoid code size bloat for now.
for
int sel(int n) {
return n >= 0 ? 60 : 100;
}
we now generate
sarl $31, %edi
andl $40, %edi
leal 60(%rdi), %eax
instead of
testl %edi, %edi
movl $60, %ecx
movl $100, %eax
cmovnsl %ecx, %eax
llvm-svn: 107866
2010-07-08 11:39:10 +00:00
Dan Gohman
50fffcaea3
Constant fold x == undef to undef.
...
llvm-svn: 107074
2010-06-28 21:30:07 +00:00
Rafael Espindola
d7a63bead9
Remove arm_apcscc from the test files. It is the default and doing this
...
matches what llvm-gcc and clang now produce.
llvm-svn: 106221
2010-06-17 15:18:27 +00:00
Dan Gohman
22d22caaed
Teach instcombine to promote alloca array sizes.
...
llvm-svn: 104945
2010-05-28 15:09:00 +00:00
Dan Gohman
bab79afa29
Add a testcase for getelementptr index promotion.
...
llvm-svn: 104944
2010-05-28 15:07:59 +00:00
Duncan Sands
32d3986765
Teach instCombine to remove malloc+free if malloc's only uses are comparisons
...
to null. Patch by Matti Niemenmaa.
llvm-svn: 104871
2010-05-27 19:09:06 +00:00
Chris Lattner
0b442d35da
Teach instcombine to transform a bitcast/(zext|trunc)/bitcast sequence
...
with a vector input and output into a shuffle vector. This sort of
sequence happens when the input code stores with one type and reloads
with another type and then SROA promotes to i96 integers, which make
everyone sad.
This fixes rdar://7896024
llvm-svn: 103354
2010-05-08 21:50:26 +00:00
Nick Lewycky
c639c07492
Fix declarations in a few more tests.
...
llvm-svn: 101676
2010-04-17 21:29:25 +00:00
Eric Christopher
6b38179ee2
Verify function prototypes before trying to optimize functions. We also
...
need TargetData, just return false if we don't have it.
Update testcases accordingly.
Fixes PR6807.
llvm-svn: 101011
2010-04-12 04:48:00 +00:00
Dan Gohman
5bf62639ed
Print empty structs as {} rather than { }.
...
llvm-svn: 100787
2010-04-08 18:03:05 +00:00
Chris Lattner
23334439e9
add newlines at the end of files.
...
llvm-svn: 100705
2010-04-07 22:53:17 +00:00
Mon P Wang
484bbe6aa9
Reapply address space patch after fixing an issue in MemCopyOptimizer.
...
Added support for address spaces and added a isVolatile field to memcpy, memmove, and memset,
e.g., llvm.memcpy.i32(i8*, i8*, i32, i32) -> llvm.memcpy.p0i8.p0i8.i32(i8*, i8*, i32, i32, i1)
llvm-svn: 100304
2010-04-04 03:10:48 +00:00
Mon P Wang
0ccf050ca3
Revert r100191 since it breaks objc in clang
...
llvm-svn: 100199
2010-04-02 18:43:02 +00:00
Mon P Wang
a01350755e
Reapply address space patch after fixing an issue in MemCopyOptimizer.
...
Added support for address spaces and added a isVolatile field to memcpy, memmove, and memset,
e.g., llvm.memcpy.i32(i8*, i8*, i32, i32) -> llvm.memcpy.p0i8.p0i8.i32(i8*, i8*, i32, i32, i1)
llvm-svn: 100191
2010-04-02 18:04:15 +00:00
Bob Wilson
aae933cc81
Revert Mon Ping's change 99928, since it broke all the llvm-gcc buildbots.
...
llvm-svn: 99948
2010-03-30 22:27:04 +00:00