Chris Lattner
bf0f375aba
fix PR8267 - Instcombine shouldn't optimizer away volatile memcpy's.
...
llvm-svn: 115296
2010-10-01 05:51:02 +00:00
Chris Lattner
c131f7d23b
upgrade this test.
...
llvm-svn: 115295
2010-10-01 05:47:16 +00:00
Jakob Stoklund Olesen
c9755c5213
Don't try to constant fold libm functions with non-finite arguments.
...
Usually we wouldn't do this anyway because llvm_fenv_testexcept would return an
exception, but we have seen some cases where neither errno nor fenv detect an
exception on arm-linux.
llvm-svn: 114893
2010-09-27 21:29:20 +00:00
Jakob Stoklund Olesen
eb9c5129d1
Be more precise when trying to XFAIL this tester: http://google1.osuosl.org:8011/builders/llvm-arm-linux
...
llvm-svn: 114755
2010-09-24 20:34:49 +00:00
Dan Gohman
e96841854e
Attempt to XFAIL this test on arm-linux, which is inexplicably failing.
...
llvm-svn: 114241
2010-09-18 00:04:37 +00:00
Dan Gohman
0e6744d219
Fix this test so that folding doesn't depend on a potentially
...
"inexact" result.
llvm-svn: 114198
2010-09-17 20:15:53 +00:00
Dan Gohman
9dc559bdef
Fix the folding of floating-point math library calls, like sin(infinity),
...
so that it detects errors on platforms where libm doesn't set errno.
It's still subject to host libm details though.
llvm-svn: 114148
2010-09-17 01:38:06 +00:00
Owen Anderson
e305c119b9
Add a reduced testcase for the infinite loop fixed in r113763.
...
llvm-svn: 113770
2010-09-13 18:28:40 +00:00
Owen Anderson
9c34a7831d
Re-apply r113679, which was reverted in r113720, which added a paid of new instcombine transforms
...
to expose greater opportunities for store narrowing in codegen. This patch fixes a potential
infinite loop in instcombine caused by one of the introduced transforms being overly aggressive.
llvm-svn: 113763
2010-09-13 17:59:27 +00:00
Eric Christopher
d4aaabfa74
Revert 113679, it was causing an infinite loop in a testcase that I've sent
...
on to Owen.
llvm-svn: 113720
2010-09-12 06:09:23 +00:00
Owen Anderson
d4ebde12ce
Invert and-of-or into or-of-and when doing so would allow us to clear bits of the and's mask.
...
This can result in increased opportunities for store narrowing in code generation. Update a number of
tests for this change. This fixes <rdar://problem/8285027>.
Additionally, because this inverts the order of ors and ands, some patterns for optimizing or-of-and-of-or
no longer fire in instances where they did originally. Add a simple transform which recaptures most of these
opportunities: if we have an or-of-constant-or and have failed to fold away the inner or, commute the order
of the two ors, to give the non-constant or a chance for simplification instead.
llvm-svn: 113679
2010-09-11 05:48:06 +00:00
Benjamin Kramer
6110efbf8c
Teach InstructionSimplify to fold (A & B) & A -> A & B and (A | B) | A -> A | B.
...
Reassociate does this but it doesn't catch all cases (e.g. if the operands are i1).
llvm-svn: 113651
2010-09-10 22:39:55 +00:00
Owen Anderson
c51d7d1a8d
Generalize instcombine's support for combining multiple bit checks into a single test. Patch by Dirk Steinke!
...
llvm-svn: 113423
2010-09-08 22:16:17 +00:00
Chris Lattner
a58a97dafc
Fix a serious performance regression introduced by r108687 on linux:
...
turning (fptrunc (sqrt (fpext x))) -> (sqrtf x) is great, but we have
to delete the original sqrt as well. Not doing so causes us to do
two sqrt's when building with -fmath-errno (the default on linux).
llvm-svn: 113260
2010-09-07 20:01:38 +00:00
Chris Lattner
f2534b401f
rename test.
...
llvm-svn: 113257
2010-09-07 19:57:06 +00:00
Owen Anderson
f700be9fb2
Add a test for PR4413, which was apparently fixed at some point in the past.
...
llvm-svn: 112987
2010-09-03 18:33:08 +00:00
Chris Lattner
a1691fc6cb
more test cleanup
...
llvm-svn: 112892
2010-09-02 22:38:56 +00:00
Owen Anderson
bd9edea8a3
Remove r111665, which implemented store-narrowing in InstCombine. Chris discovered a miscompilation in it, and it's not easily
...
fixable at the optimizer level. I'll investigate reimplementing it in DAGCombine.
llvm-svn: 112575
2010-08-31 04:41:06 +00:00
Chris Lattner
b61cf1e296
handle the constant case of vector insertion. For something
...
like this:
struct S { float A, B, C, D; };
struct S g;
struct S bar() {
struct S A = g;
++A.B;
A.A = 42;
return A;
}
we now generate:
_bar: ## @bar
## BB#0: ## %entry
movq _g@GOTPCREL(%rip), %rax
movss 12(%rax), %xmm0
pshufd $16, %xmm0, %xmm0
movss 4(%rax), %xmm2
movss 8(%rax), %xmm1
pshufd $16, %xmm1, %xmm1
unpcklps %xmm0, %xmm1
addss LCPI1_0(%rip), %xmm2
pshufd $16, %xmm2, %xmm2
movss LCPI1_1(%rip), %xmm0
pshufd $16, %xmm0, %xmm0
unpcklps %xmm2, %xmm0
ret
instead of:
_bar: ## @bar
## BB#0: ## %entry
movq _g@GOTPCREL(%rip), %rax
movss 12(%rax), %xmm0
pshufd $16, %xmm0, %xmm0
movss 4(%rax), %xmm2
movss 8(%rax), %xmm1
pshufd $16, %xmm1, %xmm1
unpcklps %xmm0, %xmm1
addss LCPI1_0(%rip), %xmm2
movd %xmm2, %eax
shlq $32, %rax
addq $1109917696, %rax ## imm = 0x42280000
movd %rax, %xmm0
ret
llvm-svn: 112345
2010-08-28 01:50:57 +00:00
Chris Lattner
c70b0c0ee7
optimize bitcasts from large integers to vector into vector
...
element insertion from the pieces that feed into the vector.
This handles a pattern that occurs frequently due to code
generated for the x86-64 abi. We now compile something like
this:
struct S { float A, B, C, D; };
struct S g;
struct S bar() {
struct S A = g;
++A.A;
++A.C;
return A;
}
into all nice vector operations:
_bar: ## @bar
## BB#0: ## %entry
movq _g@GOTPCREL(%rip), %rax
movss LCPI1_0(%rip), %xmm1
movss (%rax), %xmm0
addss %xmm1, %xmm0
pshufd $16, %xmm0, %xmm0
movss 4(%rax), %xmm2
movss 12(%rax), %xmm3
pshufd $16, %xmm2, %xmm2
unpcklps %xmm2, %xmm0
addss 8(%rax), %xmm1
pshufd $16, %xmm1, %xmm1
pshufd $16, %xmm3, %xmm2
unpcklps %xmm2, %xmm1
ret
instead of icky integer operations:
_bar: ## @bar
movq _g@GOTPCREL(%rip), %rax
movss LCPI1_0(%rip), %xmm1
movss (%rax), %xmm0
addss %xmm1, %xmm0
movd %xmm0, %ecx
movl 4(%rax), %edx
movl 12(%rax), %esi
shlq $32, %rdx
addq %rcx, %rdx
movd %rdx, %xmm0
addss 8(%rax), %xmm1
movd %xmm1, %eax
shlq $32, %rsi
addq %rax, %rsi
movd %rsi, %xmm1
ret
This resolves rdar://8360454
llvm-svn: 112343
2010-08-28 01:20:38 +00:00
Chris Lattner
08d2f26030
tidy up test.
...
llvm-svn: 112321
2010-08-27 23:15:21 +00:00
Chris Lattner
3f880c2097
Enhance the shift propagator to handle the case when you have:
...
A = shl x, 42
...
B = lshr ..., 38
which can be transformed into:
A = shl x, 4
...
iff we can prove that the would-be-shifted-in bits
are already zero. This eliminates two shifts in the testcase
and allows eliminate of the whole i128 chain in the real example.
llvm-svn: 112314
2010-08-27 22:53:44 +00:00
Chris Lattner
80632e5fd9
Implement a pretty general logical shift propagation
...
framework, which is good at ripping through bitfield
operations. This generalize a bunch of the existing
xforms that instcombine does, such as
(x << c) >> c -> and
to handle intermediate logical nodes. This is useful for
ripping up the "promote to large integer" code produced by
SRoA.
llvm-svn: 112304
2010-08-27 22:24:38 +00:00
Chris Lattner
1a15c898b9
merge and filecheckize test
...
llvm-svn: 112289
2010-08-27 20:44:45 +00:00
Chris Lattner
a571568019
merge two tests
...
llvm-svn: 112288
2010-08-27 20:42:10 +00:00
Chris Lattner
866b888095
teach the truncation optimization that an entire chain of
...
computation can be truncated if it is fed by a sext/zext that doesn't
have to be exactly equal to the truncation result type.
llvm-svn: 112285
2010-08-27 20:32:06 +00:00
Chris Lattner
69a9143584
Add an instcombine to clean up a common pattern produced
...
by the SRoA "promote to large integer" code, eliminating
some type conversions like this:
%94 = zext i16 %93 to i32 ; <i32> [#uses=2]
%96 = lshr i32 %94, 8 ; <i32> [#uses=1]
%101 = trunc i32 %96 to i8 ; <i8> [#uses=1]
This also unblocks other xforms from happening, now clang is able to compile:
struct S { float A, B, C, D; };
float foo(struct S A) { return A.A + A.B+A.C+A.D; }
into:
_foo: ## @foo
## BB#0: ## %entry
pshufd $1, %xmm0, %xmm2
addss %xmm0, %xmm2
movdqa %xmm1, %xmm3
addss %xmm2, %xmm3
pshufd $1, %xmm1, %xmm0
addss %xmm3, %xmm0
ret
on x86-64, instead of:
_foo: ## @foo
## BB#0: ## %entry
movd %xmm0, %rax
shrq $32, %rax
movd %eax, %xmm2
addss %xmm0, %xmm2
movapd %xmm1, %xmm3
addss %xmm2, %xmm3
movd %xmm1, %rax
shrq $32, %rax
movd %eax, %xmm0
addss %xmm3, %xmm0
ret
This seems pretty close to optimal to me, at least without
using horizontal adds. This also triggers in lots of other
code, including SPEC.
llvm-svn: 112278
2010-08-27 18:31:05 +00:00
Chris Lattner
e9dafffae3
filecheckize
...
llvm-svn: 112235
2010-08-26 22:23:39 +00:00
Chris Lattner
1efc631212
rename test.
...
llvm-svn: 112234
2010-08-26 22:20:47 +00:00
Chris Lattner
d5d68438c1
optimize "integer extraction out of the middle of a vector" as produced
...
by SRoA. This is part of rdar://7892780, but needs another xform to
expose this.
llvm-svn: 112232
2010-08-26 22:14:59 +00:00
Chris Lattner
19a5dc488b
optimize bitcast(trunc(bitcast(x))) where the result is a float and 'x'
...
is a vector to be a vector element extraction. This allows clang to
compile:
struct S { float A, B, C, D; };
float foo(struct S A) { return A.A + A.B+A.C+A.D; }
into:
_foo: ## @foo
## BB#0: ## %entry
movd %xmm0, %rax
shrq $32, %rax
movd %eax, %xmm2
addss %xmm0, %xmm2
movapd %xmm1, %xmm3
addss %xmm2, %xmm3
movd %xmm1, %rax
shrq $32, %rax
movd %eax, %xmm0
addss %xmm3, %xmm0
ret
instead of:
_foo: ## @foo
## BB#0: ## %entry
movd %xmm0, %rax
movd %eax, %xmm0
shrq $32, %rax
movd %eax, %xmm2
addss %xmm0, %xmm2
movd %xmm1, %rax
movd %eax, %xmm1
addss %xmm2, %xmm1
shrq $32, %rax
movd %eax, %xmm0
addss %xmm1, %xmm0
ret
... eliminating half of the horribleness.
llvm-svn: 112227
2010-08-26 21:55:42 +00:00
Chris Lattner
d1a8743984
filecheckize
...
llvm-svn: 112225
2010-08-26 21:51:41 +00:00
Chris Lattner
3113ee607c
rename test
...
llvm-svn: 112224
2010-08-26 21:50:56 +00:00
Owen Anderson
678fd04aa5
Re-apply r111568 with a fix for the clang self-host.
...
llvm-svn: 111665
2010-08-20 18:24:43 +00:00
Owen Anderson
7c1b4fbd3b
Previous revert failed to remove this file.
...
llvm-svn: 111582
2010-08-19 23:45:15 +00:00
Owen Anderson
0e57acb623
Revert r111568 to unbreak clang self-host.
...
llvm-svn: 111571
2010-08-19 23:25:16 +00:00
Owen Anderson
7f2852ba2d
When a set of bitmask operations, typically from a bitfield initialization, only modifies the low bytes of a value,
...
we can narrow the store to only over-write the affected bytes.
llvm-svn: 111568
2010-08-19 22:15:40 +00:00
Eric Christopher
08e9f0250a
Temporarily revert r110987 as it's causing some miscompares in
...
vector heavy code. I'll re-enable when we've tracked down the problem.
llvm-svn: 111318
2010-08-17 22:55:27 +00:00
Nate Begeman
e57074fc48
Reapply this transformation now that it is passing the external test which it previously failed.
...
llvm-svn: 110987
2010-08-13 00:17:53 +00:00
Eric Christopher
34acdf57df
Temporarily revert 110737 and 110734, they were causing failures
...
in an external testsuite.
llvm-svn: 110905
2010-08-12 07:01:22 +00:00
Nate Begeman
36e284c2be
Add test for recent instcombine vector shuffle enhancement
...
llvm-svn: 110737
2010-08-10 21:58:00 +00:00
Eli Friedman
7197d66ff1
PR7853: fix a silly mistake introduced in r101899, and add a test to make sure
...
it doesn't regress again.
llvm-svn: 110597
2010-08-09 20:49:43 +00:00
Dan Gohman
a80f89dbc7
Make instcombine set explicit alignments on load or store
...
instructions with alignment 0, so that subsequent passes don't
need to bother checking the TargetData ABI size manually.
llvm-svn: 110128
2010-08-03 18:20:32 +00:00
Owen Anderson
e957c57ebb
Re-apply the infamous r108614, with a fix pointed out by Dirk Steinke.
...
llvm-svn: 110036
2010-08-02 09:32:13 +00:00
Daniel Dunbar
f2be238c99
Speculatively revert r108614, "Another attempt at getting the clang self-host to
...
like my instcombine patch.", in an attempt to fix Clang i386 bootstrap.
- Also PR7719.
llvm-svn: 109953
2010-07-31 19:51:11 +00:00
Owen Anderson
f66e1873ea
Testcase for r108687.
...
llvm-svn: 108689
2010-07-19 08:14:26 +00:00
Owen Anderson
c8dc055b5e
Another attempt at getting the clang self-host to like my instcombine patch.
...
llvm-svn: 108614
2010-07-17 06:56:35 +00:00
Eric Christopher
5eef314caf
Also revert 108422, it's causing some test failures.
...
Working on testcases for Owen.
llvm-svn: 108494
2010-07-16 01:36:12 +00:00
Owen Anderson
01a2992a91
Reapply r108378, with bugfixes, testcase, and improved comment formatting.
...
This now passes LIT, nighty test, and llvm-gcc bootstrap on my machine.
llvm-svn: 108422
2010-07-15 15:00:23 +00:00
Chris Lattner
38e6ecd9f1
revert r108320, I see the failures now...
...
llvm-svn: 108322
2010-07-14 06:16:35 +00:00