Evan Cheng
0163d059e4
PHI elimination should not break back edge. It can cause some significant code placement issues. rdar://8263994
good:
LBB0_2:
mov r2, r0
. . .
mov r1, r2
bne LBB0_2
bad:
LBB0_2:
mov r2, r0
. . .
@ BB#3:
mov r1, r2
b LBB0_2
llvm-svn: 111221
2010-08-17 01:20:36 +00:00
Benjamin Kramer
0224854fdc
Test expects SSE, give him SSE.
llvm-svn: 111115
2010-08-15 23:32:03 +00:00
Benjamin Kramer
3116e6f58d
Restore arch on these tests; they fail on ARM.
llvm-svn: 111109
2010-08-15 20:42:56 +00:00
Dale Johannesen
6e5cf0f5b6
Mark as XFAIL on darwin 8. PR 7886.
llvm-svn: 111108
2010-08-15 19:40:29 +00:00
Dale Johannesen
3f9c148d0e
Revert 110491. While not wrong, it was based on a
misanalysis and is undesirable.
llvm-svn: 111028
2010-08-13 18:43:45 +00:00
Bruno Cardoso Lopes
7cb26cb8be
- Teach SSEDomainFix to switch between different levels of AVX instructions. Here we guess that AVX will have domain issues, so implement them for consistency; we can remove them in the future if they prove unnecessary.
- Make foldMemoryOperandImpl aware of 256-bit zero vectors folding and support the 128-bit counterparts of AVX too.
- Make sure MOV[AU]PS instructions are only selected when SSE1 is enabled, and duplicate the patterns to match AVX.
- Add a testcase for a simple 128-bit zero vector creation.
llvm-svn: 110946
2010-08-12 20:20:53 +00:00
Bruno Cardoso Lopes
bb491bd56c
Begin to support some vector operations for AVX 256-bit instructions. The long
term goal here is to be able to match enough of vector_shuffle and build_vector
so all avx intrinsics which aren't mapped to their own built-ins but to
shufflevector calls can be codegen'd. This is the first (baby) step, support
building zeroed vectors.
llvm-svn: 110897
2010-08-12 02:06:36 +00:00
Devang Patel
66fc7d88ae
This is an x86-only test.
llvm-svn: 110887
2010-08-12 00:17:38 +00:00
Bruno Cardoso Lopes
2051068483
Add testcases for all AVX 256-bit intrinsics added in the last couple days
llvm-svn: 110854
2010-08-11 21:12:09 +00:00
Bruno Cardoso Lopes
fa19084e79
Reapply r109881 using a more strict command line for llc.
llvm-svn: 110833
2010-08-11 17:39:23 +00:00
Jakob Stoklund Olesen
99402e857d
Fix test for more architectures. Patch by Tobias Grosser.
llvm-svn: 110685
2010-08-10 16:48:24 +00:00
Tobias Grosser
766f219db9
Fix failing testcase.
Those look like typos to me.
llvm-svn: 110664
2010-08-10 09:54:29 +00:00
Devang Patel
84f48b5483
Handle TAG_constant for integers.
llvm-svn: 110656
2010-08-10 07:11:13 +00:00
Dale Johannesen
23f9086dd3
Use sdmem and sse_load_f64 (etc.) for the vector
form of CMPSD (etc.) Matching a 128-bit memory
operand is wrong, the instruction uses only 64 bits
(same as ADDSD etc.) 8193553.
llvm-svn: 110491
2010-08-07 00:33:42 +00:00
Eric Christopher
cf17d8dfa7
Add an option to always emit realignment code for a particular module.
llvm-svn: 110404
2010-08-05 23:57:43 +00:00
Devang Patel
9801232716
Move x86 specific tests into test/CodeGen/X86.
llvm-svn: 110372
2010-08-05 20:25:37 +00:00
Dan Gohman
d108d2b2f8
Move x86-specific tests out of test/Transforms/LoopStrengthReduce and
into test/CodeGen/X86, so that they aren't run when the x86 target is
not enabled.
Fix uglygep.ll to not be x86-specific.
llvm-svn: 110343
2010-08-05 17:04:15 +00:00
Daniel Dunbar
c93cd33f41
tests: CodeGen/X86/GC tests require X86.
llvm-svn: 110338
2010-08-05 15:45:33 +00:00
Bill Wendling
446a54d234
The lower invoke pass needs to have unreachable code elimination run after it
because it could create such things. This fixes a MingW buildbot test failure.
llvm-svn: 110279
2010-08-04 23:36:02 +00:00
Eli Friedman
401dbe036d
PR7814: Truncates cannot be ignored for signed comparisons.
llvm-svn: 110268
2010-08-04 22:40:58 +00:00
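The point of the PR7814 entry above can be illustrated with a small sketch. It is not taken from the patch; the helper names `trunc` and `to_signed` are hypothetical, and the example assumes the wide value has all of its upper bits clear. Under that assumption an unsigned comparison gives the same answer before and after truncation, but a signed one may not, because truncation can move a set bit into the sign position.

```python
def trunc(value: int, bits: int) -> int:
    """Keep only the low `bits` bits (models an integer truncate)."""
    return value & ((1 << bits) - 1)


def to_signed(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value = trunc(value, bits)
    return value - (1 << bits) if value >> (bits - 1) else value


x = 200  # an i32 whose upper 24 bits are zero
c = 100

# Unsigned: comparing the i32 values or the truncated i8 values agrees.
wide_unsigned = x > c
narrow_unsigned = trunc(x, 8) > trunc(c, 8)

# Signed: as an i8, 200 is -56, so the comparison flips.
wide_signed = to_signed(x, 32) > to_signed(c, 32)
narrow_signed = to_signed(trunc(x, 8), 8) > to_signed(trunc(c, 8), 8)

print(wide_unsigned, narrow_unsigned, wide_signed, narrow_signed)
```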
Stuart Hastings
003c3778ff
call-imm.ll test case regex fix. Patch by Dimitry Andric!
llvm-svn: 110199
2010-08-04 15:31:35 +00:00
Jakob Stoklund Olesen
058d1fc5bd
OK, that's it. This test is going away now. But don't worry, I am taking it to a
nice farm in the country where it can play with other tests. And bunnies.
It is not clear what is being tested, and the revision history shows a bunch of
random changes to the expected instruction count. Clearly, we are just fudging
it to pass whenever it fails.
llvm-svn: 110118
2010-08-03 17:21:14 +00:00
Bob Wilson
43273fe746
Revert new AVX intrinsic tests. They are breaking buildbots and Bruno is
away from a computer now.
--- Reverse-merging r109881 into '.':
D test/CodeGen/X86/avx-intrinsics-x86.ll
D test/CodeGen/X86/avx-intrinsics-x86_64.ll
llvm-svn: 109959
2010-07-31 22:36:03 +00:00
Bruno Cardoso Lopes
f6ed26ef55
A *bunch* of tests for AVX intrinsics
llvm-svn: 109881
2010-07-30 19:57:56 +00:00
Eli Friedman
bea7c851cf
Fix for bug reported by Evzen Muller on llvm-commits: make sure to correctly
check the range of the constant when optimizing a comparison between a
constant and a sign_extend_inreg node.
llvm-svn: 109854
2010-07-30 06:44:31 +00:00
Nate Begeman
133820e806
Implement a vectorized algorithm for <16 x i8> << <16 x i8>
This is about 4x faster and smaller than the existing scalarization.
llvm-svn: 109566
2010-07-28 00:21:48 +00:00
Nate Begeman
068e932975
~40% faster vector shl <4 x i32> on SSE 4.1. Larger improvements for smaller types coming in future patches.
For:
define <2 x i64> @shl(<4 x i32> %r, <4 x i32> %a) nounwind readnone ssp {
entry:
  %shl = shl <4 x i32> %r, %a                     ; <<4 x i32>> [#uses=1]
  %tmp2 = bitcast <4 x i32> %shl to <2 x i64>     ; <<2 x i64>> [#uses=1]
  ret <2 x i64> %tmp2
}
We get:
_shl: ## @shl
pslld $23, %xmm1
paddd LCPI0_0, %xmm1
cvttps2dq %xmm1, %xmm1
pmulld %xmm1, %xmm0
ret
Instead of:
_shl: ## @shl
pshufd $3, %xmm0, %xmm2
movd %xmm2, %eax
pshufd $3, %xmm1, %xmm2
movd %xmm2, %ecx
shll %cl, %eax
movd %eax, %xmm2
pshufd $1, %xmm0, %xmm3
movd %xmm3, %eax
pshufd $1, %xmm1, %xmm3
movd %xmm3, %ecx
shll %cl, %eax
movd %eax, %xmm3
punpckldq %xmm2, %xmm3
movd %xmm0, %eax
movd %xmm1, %ecx
shll %cl, %eax
movd %eax, %xmm2
movhlps %xmm0, %xmm0
movd %xmm0, %eax
movhlps %xmm1, %xmm1
movd %xmm1, %ecx
shll %cl, %eax
movd %eax, %xmm0
punpckldq %xmm0, %xmm2
movdqa %xmm2, %xmm0
punpckldq %xmm3, %xmm0
ret
llvm-svn: 109549
2010-07-27 22:37:06 +00:00
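The short sequence in the entry above (pslld, paddd, cvttps2dq, pmulld) appears to turn each shift amount into a power of two through the float exponent field and then multiply. The following is a scalar sketch of one lane under that reading. It assumes the constant pool entry LCPI0_0 holds 0x3F800000 in every lane (the bit pattern of 1.0f), which the commit message does not show; the function names are hypothetical.

```python
import struct

ONE_AS_FLOAT_BITS = 0x3F800000  # assumed contents of LCPI0_0: bits of 1.0f


def pow2_via_float(a: int) -> int:
    """Build 2**a by placing `a` in a float's exponent field, then convert."""
    # pslld $23 moves the shift amount into the exponent field;
    # paddd adds the exponent bias, giving the bit pattern of float(2**a).
    bits = ((a << 23) + ONE_AS_FLOAT_BITS) & 0xFFFFFFFF
    (as_float,) = struct.unpack("<f", struct.pack("<I", bits))
    # cvttps2dq truncates to integer. For a == 31 the hardware result is
    # 0x80000000, which has the same bit pattern as 2**31.
    return int(as_float) & 0xFFFFFFFF


def shl_lane(r: int, a: int) -> int:
    """One 32-bit lane of r << a, computed as a multiply (pmulld keeps low 32 bits)."""
    return (r * pow2_via_float(a)) & 0xFFFFFFFF


def shl_vector(r, a):
    """Lane-wise shift of two equal-length sequences of 32-bit values."""
    return [shl_lane(ri, ai) for ri, ai in zip(r, a)]


print(shl_vector([1, 3, 0xFFFFFFFF, 5], [0, 4, 31, 16]))
```

The sketch only covers shift amounts 0 through 31; larger amounts are undefined for a 32-bit shl in LLVM IR, so no particular result is expected for them.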
Dan Gohman
1694c4352a
Use the proper type for shift counts. This fixes a bootstrap error.
llvm-svn: 109265
2010-07-23 21:08:12 +00:00
Dan Gohman
8859ab786b
DAGCombine (shl (anyext x, c)) to (anyext (shl x, c)) if the high bits
are not demanded. This often allows the anyext to be folded away.
llvm-svn: 109242
2010-07-23 18:03:30 +00:00
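Why the combine in the entry above is safe can be checked with a small model. This is a sketch, not the DAGCombine code: it models an 8-bit value any-extended to 32 bits, where the extended high bits are unspecified (represented here by a hypothetical `junk_high` argument), and assumes only the low 8 bits of the result are demanded.

```python
def shl_of_anyext(x8: int, junk_high: int, c: int) -> int:
    """(shl (anyext x), c), observed through the low 8 bits only."""
    wide = ((junk_high << 8) | (x8 & 0xFF)) & 0xFFFFFFFF  # high bits unspecified
    return ((wide << c) & 0xFFFFFFFF) & 0xFF


def anyext_of_shl(x8: int, c: int) -> int:
    """(anyext (shl x, c)), observed through the low 8 bits only."""
    return ((x8 & 0xFF) << c) & 0xFF


# Whatever the unspecified high bits are, the demanded low bits agree,
# because a left shift never moves high bits down into low positions.
agree = all(
    shl_of_anyext(x, junk, c) == anyext_of_shl(x, c)
    for x in range(256)
    for c in range(8)
    for junk in (0x000000, 0xABCDEF, 0xFFFFFF)
)
print(agree)
```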
Eric Christopher
4924d5fb93
Custom lower the memory barrier instructions and add support
for lowering without sse2. Add a couple of new testcases.
Fixes a few libgomp tests and latent bugs. Remove a few todos.
llvm-svn: 109078
2010-07-22 02:48:34 +00:00
Dan Gohman
45ba7b5c5c
Fix SCEV denormalization of expressions where the exit value from
one loop is involved in the increment of an addrec for another
loop. This fixes rdar://8168938.
llvm-svn: 108863
2010-07-20 17:06:20 +00:00
Duncan Sands
6203c740a9
The same problem was being tracked in PR7652.
llvm-svn: 108843
2010-07-20 15:52:32 +00:00
Dan Gohman
28f747a608
After a custom inserter, in a block which has constant instructions,
update the current basic block in addition to the current insert
position, so that they remain consistent. This fixes rdar://8204072.
llvm-svn: 108765
2010-07-19 22:48:56 +00:00
Owen Anderson
acd445be06
Remove r108639 now that it is handled by InstCombine instead.
llvm-svn: 108688
2010-07-19 08:10:24 +00:00
Owen Anderson
f1ae3b35e7
Add a testcase for r108639.
llvm-svn: 108640
2010-07-18 08:57:19 +00:00
Bill Wendling
85d6ed81b7
Consider this function:
void foo() { __builtin_unreachable(); }
It will output the following on Darwin X86:
_func1:
Leh_func_begin0:
pushq %rbp
Ltmp0:
movq %rsp, %rbp
Ltmp1:
Leh_func_end0:
This prolog adds a new Call Frame Information (CFI) row to the FDE with an
address that is not within the address range of the code it describes -- part is
equal to the end of the function -- and therefore results in an invalid EH
frame. If we emit a nop in this situation, then the CFI row is now within the
address range.
llvm-svn: 108568
2010-07-16 22:51:10 +00:00
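The address-range problem described in the entry above can be made concrete with a short sketch. It assumes the function's code occupies a half-open range from its start to its end, and it uses the standard x86-64 encodings of the two prologue instructions (1 byte for pushq %rbp, 3 bytes for movq %rsp, %rbp). The helper names are hypothetical, and the one-byte nop size is also an assumption.

```python
PUSHQ_RBP_SIZE = 1     # encoding 0x55
MOVQ_RSP_RBP_SIZE = 3  # encoding 0x48 0x89 0xE5
NOP_SIZE = 1           # assumed single-byte nop (0x90)


def layout(instruction_sizes):
    """Return the offset just past each instruction, plus the function end."""
    offsets = []
    pc = 0
    for size in instruction_sizes:
        pc += size
        offsets.append(pc)
    return offsets, pc


def rows_in_range(row_addresses, func_end):
    """A CFI row is valid only if its address lies in [0, func_end)."""
    return all(0 <= addr < func_end for addr in row_addresses)


prologue = [PUSHQ_RBP_SIZE, MOVQ_RSP_RBP_SIZE]

# Ltmp0 and Ltmp1 sit just past each prologue instruction: offsets 1 and 4.
cfi_rows, end_without_nop = layout(prologue)
valid_without_nop = rows_in_range(cfi_rows, end_without_nop)

# Emitting a nop pushes the function end to 5, so the row at 4 is inside.
_, end_with_nop = layout(prologue + [NOP_SIZE])
valid_with_nop = rows_in_range(cfi_rows, end_with_nop)

print(cfi_rows, end_without_nop, valid_without_nop, end_with_nop, valid_with_nop)
```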
Jakob Stoklund Olesen
858d6bb512
Remove the X86::FP_REG_KILL pseudo-instruction and the X86FloatingPointRegKill
pass that inserted it.
It is no longer necessary to limit the live ranges of FP registers to a single
basic block.
llvm-svn: 108536
2010-07-16 17:41:44 +00:00
Jakob Stoklund Olesen
114bab20ae
Add forgotten test case.
llvm-svn: 108506
2010-07-16 04:45:35 +00:00
Dan Gohman
5e485c833f
Use the source-order scheduler instead of the "fast" scheduler at -O0,
because it's more likely to keep debug line information in its original
order.
llvm-svn: 108496
2010-07-16 02:01:19 +00:00
Bill Wendling
756b0a4d45
Revert. This isn't the correct way to go.
llvm-svn: 108478
2010-07-15 23:42:21 +00:00
Bill Wendling
991234752d
Handle code gen for the unreachable instruction if it's the only instruction in
the function. We'll just turn it into a "trap" instruction instead.
The problem with not handling this is that it might generate a prologue without
the equivalent epilogue to go with it:
$ cat t.ll
define void @foo() {
entry:
  unreachable
}
$ llc -o - t.ll -relocation-model=pic -disable-fp-elim -unwind-tables
.section __TEXT,__text,regular,pure_instructions
.globl _foo
.align 4, 0x90
_foo: ## @foo
Leh_func_begin0:
## BB#0: ## %entry
pushq %rbp
Ltmp0:
movq %rsp, %rbp
Ltmp1:
Leh_func_end0:
...
The unwind tables then have bad data in them causing all sorts of problems.
Fixes <rdar://problem/8096481>.
llvm-svn: 108473
2010-07-15 23:32:40 +00:00
Evan Cheng
ffbae6ad52
Split -enable-finite-only-fp-math to two options:
-enable-no-nans-fp-math and -enable-no-infs-fp-math. All of the current codegen FP math optimizations only care whether the FP arithmetic arguments and results can never be NaN.
llvm-svn: 108465
2010-07-15 22:07:12 +00:00
Chris Lattner
e2f110cba5
fix the definitions of ConstTextCoalSection/ConstDataCoalSection
to keep "Text" in sync with the "pure instructions" section attribute.
Lack of this attribute was preventing the assembler from emitting
multibyte nop instructions for templates (and inlines, and other
coalesced stuff) and was causing the assembler to mismatch .o files.
This fixes rdar://8018335
llvm-svn: 108461
2010-07-15 21:22:00 +00:00
Devang Patel
3028e38bd8
Fix crash reported in PR7653.
llvm-svn: 108441
2010-07-15 18:45:27 +00:00
Dan Gohman
d75ba463e0
Watch out for a constant offset cancelling out a base register, forming
a zero. This situation arises in Fortran code with induction variables
that start at 1 instead of 0. This fixes PR7651.
llvm-svn: 108424
2010-07-15 15:14:45 +00:00
Devang Patel
8924b27a38
Make it a .ll test case.
llvm-svn: 108370
2010-07-14 23:12:52 +00:00
Dan Gohman
8e01a639c0
Delete fast-isel's trivial load optimization; it breaks debugging because
it can look past points where a debugger might modify user variables.
llvm-svn: 108336
2010-07-14 17:25:37 +00:00
Evan Cheng
f6478f489d
Fix for PR7193 was overly conservative. The only case where sibcall callee
address cannot be allocated a register is in 32-bit mode where the first
three arguments are marked inreg. In that case EAX, EDX, and ECX will be
used for argument passing.
This fixes PR7610.
llvm-svn: 108327
2010-07-14 06:44:01 +00:00
Evan Cheng
7ff31f22a4
Re-enable the test with fix.
llvm-svn: 108319
2010-07-14 05:49:23 +00:00
Chris Lattner
b800dea3b6
temporarily disable the test to fix buildbots.
llvm-svn: 108310
2010-07-14 02:21:59 +00:00