Dan Gohman
42337e0ee9
Generalize LSR's OptimizeMax to handle the new kinds of max expressions
...
that indvars may use, now that indvars is recognizing le and ge loops.
llvm-svn: 102235
2010-04-24 03:13:44 +00:00
Stuart Hastings
85b5c330f2
Per Chris, fuse four trivial tests using grep (r102199) into one that uses FileCheck.
...
llvm-svn: 102216
2010-04-23 22:12:57 +00:00
Dan Gohman
6a48222bd8
Change TargetData's algorithm for computing default vector type
...
alignment to match what's used in clang and GCC for __alignof, rather
than trying to guess what Legalize is going to be doing.
llvm-svn: 102206
2010-04-23 19:41:15 +00:00
Stuart Hastings
ad81819149
Add some missing x86 patterns for movdq2q. Fixes two (LLVM-)GCC DejaGNU testcases. Radar 6881029.
...
llvm-svn: 102199
2010-04-23 19:03:32 +00:00
Dan Gohman
38949c2f1f
Fix LSR to tolerate cases where ScalarEvolution initially
...
misses an opportunity to fold add operands, but folds them
after LSR has separated them out. This fixes rdar://7886751.
llvm-svn: 102157
2010-04-23 01:55:05 +00:00
Jim Grosbach
b9dccb6103
Update ARM DAGtoDAG for matching UBFX instruction for unsigned bitfield
...
extraction. This fixes PR5998.
llvm-svn: 102144
2010-04-22 23:24:18 +00:00
Evan Cheng
a324da99ae
Do not try to optimize a copy that has already been marked for deletion.
...
llvm-svn: 102027
2010-04-21 20:57:54 +00:00
Evan Cheng
dbfb7dc438
Implement -disable-non-leaf-fp-elim which disables frame pointer elimination
...
optimization for non-leaf functions. This will be hooked up to gcc's
-momit-leaf-frame-pointer option. rdar://7886181
llvm-svn: 101984
2010-04-21 03:18:23 +00:00
Evan Cheng
a0c4b2952f
- Clean up some crappy code which deals with coalescing of copies which look at
...
extract_subreg / insert_subreg, etc.
- Add support for more aggressive insert_subreg coalescing.
llvm-svn: 101971
2010-04-21 00:44:22 +00:00
Dan Gohman
570b621976
Add another variant of this test which found a place where
...
CodeGen's ComputeMaskedBits was being over-conservative when computing
bits for an ADD.
llvm-svn: 101963
2010-04-21 00:19:28 +00:00
Chris Lattner
6db0f451a7
teach the x86 address matching stuff to handle
...
(shl (or x,c), 3) the same as (shl (add x, c), 3)
when x doesn't have any bits from c set.
This finishes off PR1135. Before, we compiled the block to:
LBB0_3: ## %bb
cmpb $4, %dl
sete %dl
addb %dl, %cl
movb %cl, %dl
shlb $2, %dl
addb %r8b, %dl
shlb $2, %dl
movzbl %dl, %edx
movl %esi, (%rdi,%rdx,4)
leaq 2(%rdx), %r9
movl %esi, (%rdi,%r9,4)
leaq 1(%rdx), %r9
movl %esi, (%rdi,%r9,4)
addq $3, %rdx
movl %esi, (%rdi,%rdx,4)
incb %r8b
decb %al
movb %r8b, %dl
jne LBB0_1
Now we produce:
LBB0_3: ## %bb
cmpb $4, %dl
sete %dl
addb %dl, %cl
movb %cl, %dl
shlb $2, %dl
addb %r8b, %dl
shlb $2, %dl
movzbl %dl, %edx
movl %esi, (%rdi,%rdx,4)
movl %esi, 8(%rdi,%rdx,4)
movl %esi, 4(%rdi,%rdx,4)
movl %esi, 12(%rdi,%rdx,4)
incb %r8b
decb %al
movb %r8b, %dl
jne LBB0_1
llvm-svn: 101958
2010-04-20 23:18:40 +00:00
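The fold described in r101958 above rests on a bit-level identity: when x and c have no set bits in common, x | c and x + c yield the same value, so the shifted result is the same either way. The following is a minimal C sketch of that identity only, not of the x86 address matcher; the function names are hypothetical and do not come from the LLVM sources.

```c
#include <stdint.h>

/* Returns 1 when x and c share no set bits. This is the precondition the
   commit message states for treating `or` the same as `add`. */
int disjoint_bits(uint32_t x, uint32_t c) {
    return (x & c) == 0;
}

/* The `or` form: (shl (or x, c), 3). */
uint32_t shl_or(uint32_t x, uint32_t c) {
    return (x | c) << 3;
}

/* The `add` form: (shl (add x, c), 3). Because the shift distributes over
   the add, this equals (x << 3) + (c << 3), so the constant part can be
   split off from the variable part. */
uint32_t shl_add(uint32_t x, uint32_t c) {
    return (x + c) << 3;
}
```

With x = 0x10 and c = 0x3 the operands are disjoint and both forms agree. With x = 3 and c = 1 they overlap, the add carries, and the two forms differ, which is why the precondition matters.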
Bill Wendling
a87efb5d0f
Move CodeGen/X86/2010-04-19-DAGCombineCrash.ll into CodeGen/X86/crash.ll. Also
...
reduce.
llvm-svn: 101925
2010-04-20 18:14:47 +00:00
Chris Lattner
b66b0c36cd
Bill's change in r95336 broke empty aggregates embedded
...
in other types. fix this by only bumping zero-byte globals
up to a single byte if the *entire global* is zero size,
fixing PR6340.
This also fixes empty arrays etc to be handled correctly,
and only does this on subsection-via-symbols targets (aka
darwin) which is the only place where this matters.
llvm-svn: 101879
2010-04-20 06:20:21 +00:00
Chris Lattner
04fb51984f
teach cellspu how to return i8 and i16 from calls,
...
patch by Kalle Raiskila!
llvm-svn: 101875
2010-04-20 05:36:09 +00:00
Bill Wendling
887dac2aa6
The visitXOR method can return the same SDNode. If so, we don't want to delete
...
it as it's not dead.
llvm-svn: 101855
2010-04-20 01:25:01 +00:00
Bob Wilson
2e6cd50a50
Fix tests for Neon load/store intrinsics to match the i8* types expected by
...
the intrinsics. The reason for those i8* types is that the intrinsics are
overloaded on the vector type and we don't have a way to declare an intrinsic
where one argument is an overloaded vector type and another argument is a
pointer to the vector element type. The bitcasts added here will match what
the frontend will typically generate when these intrinsics are used.
llvm-svn: 101840
2010-04-20 00:17:16 +00:00
Nick Lewycky
c639c07492
Fix declarations in a few more tests.
...
llvm-svn: 101676
2010-04-17 21:29:25 +00:00
Chris Lattner
99d17acb35
fix PR6332, allowing an index of zero into a zero sized array
...
even if the element of the array has no size.
llvm-svn: 101662
2010-04-17 19:02:33 +00:00
Dan Gohman
5736cd1e47
Start function numbering at 0.
...
llvm-svn: 101638
2010-04-17 16:29:15 +00:00
Evan Cheng
d3d5e6793a
Add nounwind.
...
llvm-svn: 101613
2010-04-17 03:43:36 +00:00
Jakob Stoklund Olesen
7e77f60652
Add test case for machine-sink on critical edges
...
llvm-svn: 101416
2010-04-15 23:19:16 +00:00
Evan Cheng
c843326d60
Use default lowering of DYNAMIC_STACKALLOC. As far as I can tell, ARM isel is doing the right thing and codegen looks correct for both Thumb and Thumb2.
...
llvm-svn: 101410
2010-04-15 22:20:34 +00:00
Jakob Stoklund Olesen
a40915cc26
Fix PR6847. RegScavenger should ignore DebugValues.
...
llvm-svn: 101392
2010-04-15 20:28:39 +00:00
Evan Cheng
2f6d7ecd1b
ARM SelectDYN_ALLOC should emit a copy from SP rather than referencing SP directly. In cases where there are two dyn_alloc in the same BB it would have caused the old SP value to be reused and badness ensues. rdar://7493908
...
llvm is generating poor code for dynamic alloca; I'll fix that later.
llvm-svn: 101383
2010-04-15 18:42:28 +00:00
Chris Lattner
1b7ecfdf60
enhance the load/store narrowing optimization to handle a
...
tokenfactor in between the load/store. This allows us to
optimize test7 into:
_test7: ## @test7
## BB#0: ## %entry
movl (%rdx), %eax
## kill: SIL<def> ESI<kill>
movb %sil, 5(%rdi)
ret
instead of:
_test7: ## @test7
## BB#0: ## %entry
movl 4(%esp), %ecx
movl $-65281, %eax ## imm = 0xFFFFFFFFFFFF00FF
andl 4(%ecx), %eax
movzbl 8(%esp), %edx
shll $8, %edx
addl %eax, %edx
movl 12(%esp), %eax
movl (%eax), %eax
movl %edx, 4(%ecx)
ret
llvm-svn: 101355
2010-04-15 06:10:49 +00:00
Chris Lattner
8c5a5c9094
teach codegen to turn trunc(zextload) into load when possible.
...
This doesn't occur much at all; it only seems to be formed in the case
when the trunc optimization kicks in due to phase ordering. In that
case it saves a few bytes on x86-32.
llvm-svn: 101350
2010-04-15 05:40:59 +00:00
Chris Lattner
510d19e597
add a simple dag combine to replace trivial shl+lshr with
...
and. This happens with the store->load narrowing stuff.
llvm-svn: 101348
2010-04-15 05:28:43 +00:00
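The combine in r101348 above replaces a shift pair with a mask. A minimal C sketch of the arithmetic, assuming the pattern is a left shift followed by a logical right shift by the same amount (the commit message does not spell out the operand order); the function names are hypothetical.

```c
#include <stdint.h>

/* Shift pair: the left shift discards the top c bits, and the logical right
   shift moves the remaining bits back into place. Requires c < 32. */
uint32_t shl_then_lshr(uint32_t x, unsigned c) {
    return (x << c) >> c;
}

/* Equivalent single AND: keep only the low (32 - c) bits. Requires c < 32. */
uint32_t and_mask(uint32_t x, unsigned c) {
    return x & (UINT32_MAX >> c);
}
```

Both functions clear the top c bits of x and leave the rest untouched, so one `and` with a constant mask can stand in for the two shifts.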
Chris Lattner
3282f3d34f
Implement rdar://7860110 (also in target/readme.txt) narrowing
...
a load/or/and/store sequence into a narrower store when it is
safe. Daniel tells me that clang will start producing this sort
of thing with bitfields, and this does trigger a few dozen times
on 176.gcc produced by llvm-gcc even now.
This compiles code like CodeGen/X86/2009-05-28-DAGCombineCrash.ll
into:
movl %eax, 36(%rdi)
instead of:
movl $4294967295, %eax ## imm = 0xFFFFFFFF
andq 32(%rdi), %rax
shlq $32, %rcx
addq %rax, %rcx
movq %rcx, 32(%rdi)
and each of the testcases into a single store. Each of them used
to compile into craziness like this:
_test4:
movl $65535, %eax ## imm = 0xFFFF
andl (%rdi), %eax
shll $16, %esi
addl %eax, %esi
movl %esi, (%rdi)
ret
llvm-svn: 101343
2010-04-15 04:48:01 +00:00
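The _test4 assembly in r101343 above loads a 32-bit word, keeps its low 16 bits, merges a value into the high 16 bits, and stores the whole word back. A minimal C sketch of that source-level pattern and of the narrower store it can be reduced to; the function names are hypothetical, and the byte offset of the high half depends on the host byte order, which the sketch checks at run time.

```c
#include <stdint.h>
#include <string.h>

/* Wide form: load 32 bits, mask, shift, merge, store 32 bits. */
void store_wide(uint32_t *p, uint32_t x) {
    *p = (*p & 0xFFFFu) | (x << 16);
}

/* Returns 1 on a little-endian host, 0 on a big-endian one. */
int is_little_endian(void) {
    uint16_t one = 1;
    unsigned char first;
    memcpy(&first, &one, 1);
    return first == 1;
}

/* Narrow form: write only the two bytes that actually change. The high
   half of the word sits at byte offset 2 on little-endian hosts and at
   offset 0 on big-endian hosts. */
void store_narrow(uint32_t *p, uint32_t x) {
    uint16_t half = (uint16_t)x;
    size_t offset = is_little_endian() ? 2 : 0;
    memcpy((unsigned char *)p + offset, &half, sizeof half);
}
```

Both functions leave the low 16 bits of *p alone and replace the high 16 bits with the low 16 bits of x; the narrow form does so without the load, mask, and merge.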
Chris Lattner
553267e9cc
further tweak this to do something useful.
...
llvm-svn: 101341
2010-04-15 04:31:42 +00:00
Chris Lattner
a4b3756baf
remove undef control flow.
...
llvm-svn: 101340
2010-04-15 04:30:19 +00:00
Jakob Stoklund Olesen
7343a90490
Remove unneeded types from test.
...
llvm-svn: 101286
2010-04-14 20:56:09 +00:00
Bob Wilson
7b19d89e3a
Don't custom lower bit converts to ARM VMOVDRRD or VMOVDRR when the operand
...
does not have a legal type. The legalizer does not know how to handle those
nodes. Radar 7854640.
llvm-svn: 101282
2010-04-14 20:45:23 +00:00
Evan Cheng
172f2f9e2d
Add test for post-ra machine licm.
...
llvm-svn: 101182
2010-04-13 22:10:03 +00:00
Bob Wilson
526e615ff9
Handle a v2f64 formal parameter that is split between registers and memory
...
such that the entire second half is in memory. Radar 7855014.
llvm-svn: 101181
2010-04-13 22:03:22 +00:00
Evan Cheng
6ffb1ed4fb
Fix test on non-x86 hosts.
...
llvm-svn: 101163
2010-04-13 18:54:04 +00:00
Evan Cheng
b8861dcb04
Re-apply 101075 and fix it properly. Just reuse the debug info of the branch instruction being optimized. There is no need to do --I, which can dereference off the start of the BB.
...
llvm-svn: 101162
2010-04-13 18:50:27 +00:00
Eric Christopher
330ca0c937
Temporarily revert r101075, it's causing invalid iterator assertions
...
in a nightly tester.
llvm-svn: 101158
2010-04-13 18:37:58 +00:00
Chris Lattner
dabcd9738c
add llvm codegen support for -ffunction-sections and -fdata-sections,
...
patch by Sylvere Teissier!
llvm-svn: 101106
2010-04-13 00:36:43 +00:00
Evan Cheng
ec21a36774
Use .set expression for x86 pic jump table reference to reduce assembly relocation. rdar://7738756
...
llvm-svn: 101085
2010-04-12 23:07:17 +00:00
Bill Wendling
5a56f7fc20
Third time's a charm...
...
llvm-svn: 101081
2010-04-12 22:43:21 +00:00
Bill Wendling
e1bf74de52
Genericize the label test.
...
llvm-svn: 101079
2010-04-12 22:40:37 +00:00
Bill Wendling
fd58812c95
Correct test to test what I mean it to test.
...
llvm-svn: 101077
2010-04-12 22:25:42 +00:00
Bill Wendling
1f2e71928c
Micro-optimization:
...
If we have this situation:
jCC L1
jmp L2
L1:
...
L2:
...
We can get a small performance boost by emitting this instead:
jnCC L2
L1:
...
L2:
...
This testcase shows an example of this:
float func(float x, float y) {
double product = (double)x * y;
if (product == 0.0)
return product;
return product - 1.0;
}
llvm-svn: 101075
2010-04-12 22:19:57 +00:00
Evan Cheng
90788354c9
Enable post regalloc machine licm by default.
...
llvm-svn: 101023
2010-04-12 06:25:28 +00:00
Benjamin Kramer
f040734da3
Make sure this test tests something.
...
llvm-svn: 100879
2010-04-09 19:03:31 +00:00
Bob Wilson
ee7665078a
Add a testcase for svn r100568.
...
llvm-svn: 100876
2010-04-09 18:29:29 +00:00
Chris Lattner
5408e8a62b
"On SPU, variables in the .bss section that are allocated with the .lcomm directive are not aligned on 16 byte boundaries. This causes misaligned loads, as the generated assembly assumes this "default" alignment.
...
This patch disables .lcomm in favour of '.local .comm'."
Patch by Kalle Raiskila!
llvm-svn: 100875
2010-04-09 18:27:03 +00:00
Dan Gohman
a451b859f9
Merge a few fast-isel tests.
...
llvm-svn: 100860
2010-04-09 15:03:55 +00:00
Evan Cheng
619f1b8a94
Coalescer should not delete copy instructions whose defs are partially dead. e.g.
...
%RDI<def,dead> = MOV64rr %RAX<kill>, %EDI<imp-def>
llvm-svn: 100804
2010-04-08 20:02:37 +00:00
Evan Cheng
3fa0b6fb03
Avoid using f64 to lower memcpy from constant string. It's cheaper to use i32 store of immediates.
...
llvm-svn: 100751
2010-04-08 07:37:57 +00:00