Chris Lattner
3553a131d0
Implement InstCombine/vec_demanded_elts.ll:test2. This allows us to turn
unsigned test(float f) {
return _mm_cvtsi128_si32( (__m128i) _mm_set_ss( f*f ));
}
into:
_test:
movss 4(%esp), %xmm0
mulss %xmm0, %xmm0
movd %xmm0, %eax
ret
instead of:
_test:
movss 4(%esp), %xmm0
mulss %xmm0, %xmm0
xorps %xmm1, %xmm1
movss %xmm0, %xmm1
movd %xmm1, %eax
ret
GCC gets:
_test:
subl $28, %esp
movss 32(%esp), %xmm0
mulss %xmm0, %xmm0
xorps %xmm1, %xmm1
movss %xmm0, %xmm1
movaps %xmm1, %xmm0
movd %xmm0, 12(%esp)
movl 12(%esp), %eax
addl $28, %esp
ret
llvm-svn: 36020
2007-04-14 22:29:23 +00:00
Chris Lattner
7928216e38
avoid copying sets and vectors around.
llvm-svn: 36017
2007-04-14 22:10:17 +00:00
Jeff Cohen
5a502fb622
Fix PR1329.
llvm-svn: 36016
2007-04-14 21:50:21 +00:00
Chris Lattner
357a11fcbb
disable switch lowering using shift/and. It still breaks ppc bootstrap for
some reason. :( Will investigate.
llvm-svn: 36011
2007-04-14 19:39:41 +00:00
Chris Lattner
3cebebfdd4
avoid iterator invalidation.
llvm-svn: 36002
2007-04-14 18:06:52 +00:00
Jeff Cohen
6e724c5338
An even better fix.
llvm-svn: 35998
2007-04-14 17:18:29 +00:00
Jeff Cohen
114799eab9
Fix recent regression that broke several llvm-tests.
llvm-svn: 35996
2007-04-14 16:55:19 +00:00
Anton Korobeynikov
bdb4f560da
Fix PR1325: case range optimization was performed in a case where it
shouldn't be. Also fix a "latent" bug on 64-bit platforms.
llvm-svn: 35990
2007-04-14 13:25:55 +00:00
Chris Lattner
6e71d21892
disable shift/and lowering to work around PR1325 for now.
llvm-svn: 35985
2007-04-14 02:26:56 +00:00
Chris Lattner
a283acb406
Implement a few missing xforms: printf("foo\n") -> puts. printf("x") -> putchar
printf("") -> noop. Still need to do the xforms for fprintf.
This implements Transforms/SimplifyLibCalls/Printf.ll
llvm-svn: 35984
2007-04-14 01:17:48 +00:00
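The printf rewrites named in the entry above can be illustrated with a small decision function. This is a minimal sketch, not the SimplifyLibCalls code: `PrintfFold`, `FoldResult`, and `classifyPrintf` are hypothetical names, and the rule that any '%' blocks the rewrite is an assumption made here to keep the model safe.

```cpp
#include <cassert>
#include <string>

// Hypothetical model of the rewrite decision; not LLVM's actual code.
enum class PrintfFold { None, Noop, Putchar, Puts };

struct FoldResult {
  PrintfFold kind;
  std::string arg;  // the character for Putchar, the text minus '\n' for Puts
};

// Decide how printf(fmt), with a constant format and no other arguments,
// could be rewritten.
FoldResult classifyPrintf(const std::string &fmt) {
  // Assumption: any '%' may trigger real formatting, so leave the call alone.
  if (fmt.find('%') != std::string::npos)
    return {PrintfFold::None, ""};
  if (fmt.empty())
    return {PrintfFold::Noop, ""};       // printf("") -> nothing
  if (fmt.size() == 1)
    return {PrintfFold::Putchar, fmt};   // printf("x") -> putchar('x')
  // puts appends its own newline, so the trailing one is stripped.
  if (fmt.back() == '\n')
    return {PrintfFold::Puts, fmt.substr(0, fmt.size() - 1)};
  return {PrintfFold::None, ""};
}

// Usage: the three cases from the commit message.
void checkPrintfFolds() {
  assert(classifyPrintf("foo\n").kind == PrintfFold::Puts);
  assert(classifyPrintf("x").kind == PrintfFold::Putchar);
  assert(classifyPrintf("").kind == PrintfFold::Noop);
}
```

A string without a trailing newline, such as "foo", is left untouched in this sketch because puts would add a newline that printf did not print.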
Chris Lattner
25f2c932b7
in addition to merging, constantmerge should also delete trivially dead globals,
in order to clean up after simplifylibcalls.
llvm-svn: 35982
2007-04-14 01:11:54 +00:00
Chris Lattner
6f64f54168
Implement PR1201 and test/Transforms/InstCombine/malloc-free-delete.ll
llvm-svn: 35981
2007-04-14 00:20:02 +00:00
Chris Lattner
b97ff21db2
use an accessor to simplify code.
llvm-svn: 35979
2007-04-14 00:17:39 +00:00
Chris Lattner
5ed58fc4a9
add GetElementPtrInst::hasAllZeroIndices, a long-overdue helper method.
Writing it twice in the same day was too much for me.
llvm-svn: 35978
2007-04-14 00:12:57 +00:00
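The helper named in the entry above is not shown in the log. As a rough sketch of what an "all indices are zero" check looks like, the following models each index as either a known constant or an unknown value. The `Index` struct and the free function are assumptions for illustration; the real method is a member of GetElementPtrInst and works on LLVM operands.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a GEP index operand.
struct Index {
  bool isConstant;   // false models a run-time value
  long long value;   // meaningful only when isConstant is true
};

// True only if every index is a compile-time constant equal to zero.
bool hasAllZeroIndices(const std::vector<Index> &indices) {
  return std::all_of(indices.begin(), indices.end(), [](const Index &i) {
    return i.isConstant && i.value == 0;
  });
}

// Usage: one non-zero or non-constant index is enough to fail the check.
void checkAllZero() {
  std::vector<Index> zeros = {{true, 0}, {true, 0}};
  std::vector<Index> mixed = {{true, 0}, {false, 0}};
  assert(hasAllZeroIndices(zeros));
  assert(!hasAllZeroIndices(mixed));
}
```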
Reid Spencer
84c2475e77
We want the number of bits needed, not the power of 2.
llvm-svn: 35977
2007-04-14 00:00:10 +00:00
Jeff Cohen
3ffd34cac6
Silence VC++ warning.
llvm-svn: 35975
2007-04-13 22:52:03 +00:00
Chris Lattner
8477dd1722
Now that codegen prepare isn't defeating me, I can finally fix what I set
out to do! :)
This fixes a problem where LSR would insert a bunch of code into each MBB
that uses a particular subexpression (e.g. IV+base+C). The problem is that
this code cannot be CSE'd back together if inserted into different blocks.
This patch changes LSR to attempt to insert a single copy of this code and
share it, allowing codegenprepare to duplicate the code if it can be sunk
into various addressing modes. On CodeGen/ARM/lsr-code-insertion.ll,
for example, this gives us code like:
add r8, r0, r5
str r6, [r8, #+4]
..
ble LBB1_4 @cond_next
LBB1_3: @cond_true
str r10, [r8, #+4]
LBB1_4: @cond_next
...
LBB1_5: @cond_true55
ldr r6, LCPI1_1
str r6, [r8, #+4]
instead of:
add r10, r0, r6
str r8, [r10, #+4]
...
ble LBB1_4 @cond_next
LBB1_3: @cond_true
add r8, r0, r6
str r10, [r8, #+4]
LBB1_4: @cond_next
...
LBB1_5: @cond_true55
add r8, r0, r6
ldr r10, LCPI1_1
str r10, [r8, #+4]
Besides being smaller and more efficient, this makes it immediately
obvious that it is profitable to predicate LBB1_3 now :)
llvm-svn: 35972
2007-04-13 20:42:26 +00:00
Chris Lattner
bc03b6c341
Completely rewrite addressing-mode related sinking of code. In particular,
this fixes problems where codegenprepare would sink expressions into load/stores
that are not valid, and fixes cases where it would miss important valid ones.
This fixes several serious codesize and perf issues, particularly on targets
with complex addressing modes like arm and x86. For example, now we compile
CodeGen/X86/isel-sink.ll to:
_test:
movl 8(%esp), %eax
movl 4(%esp), %ecx
cmpl $1233, %eax
ja LBB1_2 #F
LBB1_1: #T
movl $4, (%ecx,%eax,4)
movl $141, %eax
ret
LBB1_2: #F
movl (%ecx,%eax,4), %eax
ret
instead of:
_test:
movl 8(%esp), %eax
leal (,%eax,4), %ecx
addl 4(%esp), %ecx
cmpl $1233, %eax
ja LBB1_2 #F
LBB1_1: #T
movl $4, (%ecx)
movl $141, %eax
ret
LBB1_2: #F
movl (%ecx), %eax
ret
llvm-svn: 35970
2007-04-13 20:30:56 +00:00
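For readers who find the assembly in the entry above hard to follow, here is a C++ function with the same shape. It is reconstructed from the listing (two arguments, a compare against 1233, a store of 4 and return of 141 on one path, a load on the other); it is an assumption about what the test exercises, not the contents of isel-sink.ll, and the name `iselSinkExample` is made up.

```cpp
#include <cassert>
#include <vector>

// The address &x[b] is computed once, before the branch, but is only used
// inside the two successor blocks. Sinking it lets each block fold the
// computation into its own load/store addressing mode.
int iselSinkExample(int *x, unsigned b) {
  int *p = &x[b];
  if (b < 1234) {   // "cmpl $1233 / ja" takes the other path when b > 1233
    *p = 4;         // movl $4, (%ecx,%eax,4)
    return 141;
  }
  return *p;        // movl (%ecx,%eax,4), %eax
}

// Usage: both paths touch the same element address.
void checkIselSink() {
  std::vector<int> buf(2000, 7);
  assert(iselSinkExample(buf.data(), 10) == 141);
  assert(buf[10] == 4);
  assert(iselSinkExample(buf.data(), 1500) == 7);
}
```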
Reid Spencer
6e7854339e
Implement a getBitsNeeded method to determine how many bits are needed to
represent a string in binary form by an APInt.
llvm-svn: 35968
2007-04-13 19:19:07 +00:00
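The log does not show how getBitsNeeded is computed. One simple way to bound the bit count of a digit string, sketched below, is to multiply the digit count by the bits per digit when the radix is a power of two. The function name and the decision to reject other radices are assumptions of this sketch; decimal is left out because its digits do not map to a whole number of bits.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical helper: upper bound on the bits needed to hold a number
// written with `numDigits` digits in the given radix.
std::size_t bitsNeededUpperBound(std::size_t numDigits, unsigned radix) {
  switch (radix) {
  case 2:
    return numDigits;       // one bit per binary digit
  case 8:
    return numDigits * 3;   // three bits per octal digit
  case 16:
    return numDigits * 4;   // four bits per hex digit
  default:
    throw std::invalid_argument("only radix 2, 8 and 16 are modeled");
  }
}

// Usage: "ff" has two hex digits, so eight bits are enough.
void checkBitsNeeded() {
  assert(bitsNeededUpperBound(2, 16) == 8);
}
```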
Devang Patel
d86d04983a
Remove use of SlowOperationInformer.
llvm-svn: 35967
2007-04-13 18:58:18 +00:00
Devang Patel
d01bb17f76
Undo previous check-in.
llvm-svn: 35966
2007-04-13 18:35:15 +00:00
Devang Patel
bfd8480bad
Hello uses LLVMSupport.a (SlowOperationInformer)
llvm-svn: 35965
2007-04-13 18:28:23 +00:00
Anton Korobeynikov
5bb6590218
Fix PR1323: we weren't updating phi nodes properly :)
llvm-svn: 35963
2007-04-13 06:53:51 +00:00
Chris Lattner
e7cab7b7a4
arm has r+r*s and r+i addr modes, but no r+i+r*s addr modes.
llvm-svn: 35962
2007-04-13 06:50:55 +00:00
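The rule stated in the entry above (r+r*s and r+i are available, r+i+r*s is not) can be written as a tiny legality predicate. This is a deliberately simplified sketch: `AddrMode` and `isLegalArmLikeAddrMode` are hypothetical names, a base register is assumed to be present, and real limits such as immediate ranges and allowed scale values are ignored.

```cpp
#include <cassert>

// Hypothetical description of an address: base + imm + index * scale.
struct AddrMode {
  long long imm;    // 0 means no immediate offset
  unsigned scale;   // 0 means no scaled index register
};

// Legal shapes in this model: r, r+i, r+r*s. Illegal: r+i+r*s.
bool isLegalArmLikeAddrMode(const AddrMode &am) {
  const bool hasImm = am.imm != 0;
  const bool hasIndex = am.scale != 0;
  return !(hasImm && hasIndex);
}

// Usage: combining an immediate with a scaled index is rejected.
void checkAddrModes() {
  assert(isLegalArmLikeAddrMode({4, 0}));    // r+i
  assert(isLegalArmLikeAddrMode({0, 4}));    // r+r*s
  assert(!isLegalArmLikeAddrMode({4, 4}));   // r+i+r*s
}
```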
Zhou Sheng
dedfc40044
Make the APInt construction more efficient.
llvm-svn: 35960
2007-04-13 05:57:32 +00:00
Chris Lattner
335f1cb1f8
CSE simple binary expressions when they are inserted. This makes LSR produce
less huge code that needs to be cleaned up by sdisel.
llvm-svn: 35959
2007-04-13 05:04:18 +00:00
Reid Spencer
d31093d340
Implement review feedback .. don't double search a set.
llvm-svn: 35957
2007-04-12 21:57:15 +00:00
Reid Spencer
f1154e6d96
Make sure intrinsics that are lowered to functions make the function weak
linkage so we only end up with one of them in a program. These are, after
all, overloaded and templatish in nature.
llvm-svn: 35956
2007-04-12 21:53:38 +00:00
Reid Spencer
0325471d3c
Provide support for intrinsics that lower themselves to a function body.
This can happen for intrinsics that are overloaded. In such cases it is
necessary to emit a function prototype before the body of the function
that calls the intrinsic and to ensure we don't emit it multiple times.
llvm-svn: 35954
2007-04-12 21:00:45 +00:00
Lauro Ramos Venancio
6c5f53f6ac
Implement Thread Local Storage (TLS) in CBackend.
llvm-svn: 35951
2007-04-12 18:42:08 +00:00
Lauro Ramos Venancio
a76c2806de
Implement the "thread_local" keyword.
llvm-svn: 35950
2007-04-12 18:32:50 +00:00
Reid Spencer
1e53c865c2
Fix bugs in generated code for part_select and part_set so that llc doesn't
barf when CBE is run with a program that contains these intrinsics.
llvm-svn: 35946
2007-04-12 13:30:14 +00:00
Reid Spencer
76e9a17f61
Fix a bug in PartSet. The replacement value needs to be zext or trunc to
the size of the value, not just zext. Also, give better names to two BBs.
llvm-svn: 35945
2007-04-12 12:46:33 +00:00
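The PartSet fix in the entry above turns on the difference between always zero-extending and choosing between zext and trunc by width. The sketch below models that choice on 64-bit integers. The function names are hypothetical and widths above 64 bits are not modeled, so this is not the lowering code itself.

```cpp
#include <cassert>
#include <cstdint>

// Mask with the low `bits` bits set (all ones when bits >= 64).
std::uint64_t lowBitsMask(unsigned bits) {
  return bits >= 64 ? ~std::uint64_t{0} : ((std::uint64_t{1} << bits) - 1);
}

// Resize an unsigned value from `fromBits` wide to `toBits` wide:
// truncation drops the high bits, zero extension leaves the value unchanged.
std::uint64_t zextOrTrunc(std::uint64_t value, unsigned fromBits,
                          unsigned toBits) {
  value &= lowBitsMask(fromBits);       // view the input at its own width
  return value & lowBitsMask(toBits);   // no-op when widening
}

// Usage: a 16-bit replacement written into an 8-bit value must be truncated.
void checkResize() {
  assert(zextOrTrunc(0x1FF, 16, 8) == 0xFF);   // trunc
  assert(zextOrTrunc(0xAB, 8, 32) == 0xAB);    // zext
}
```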
Chris Lattner
0da8de5848
the result of an inline asm copy can be an arbitrary VT that the register
class supports. In the case of vectors, this means we often get the wrong
type (e.g. we get v4f32 instead of v8i16). Make sure to convert the vector
result to the right type. This fixes CodeGen/X86/2007-04-11-InlineAsmVectorResult.ll
llvm-svn: 35944
2007-04-12 06:00:20 +00:00
Chris Lattner
7acaf64d70
fold noop vbitconvert instructions
llvm-svn: 35943
2007-04-12 05:58:43 +00:00
Chris Lattner
2f221a83ec
Fix weirdness handling single element vectors.
llvm-svn: 35941
2007-04-12 04:44:28 +00:00
Chris Lattner
2b6b79b896
Fix mmx paddq, add support for the 'y' register class, though it isn't tested.
llvm-svn: 35940
2007-04-12 04:14:49 +00:00
Reid Spencer
82da0eb67c
For PR1284:
Implement the "part_set" intrinsic.
llvm-svn: 35938
2007-04-12 02:48:46 +00:00
Chris Lattner
9564abbfb5
improve the patch for PR1318 to also support grouped options with custom
handlers (like the pass list). My previous fix only supported *new* command
line options, not additions to old ones.
This fixes test/Feature/load_module.ll
llvm-svn: 35935
2007-04-12 00:36:29 +00:00
Chris Lattner
b97b122176
Fix CodeGen/X86/2007-03-24-InlineAsmPModifier.ll
llvm-svn: 35926
2007-04-11 22:29:46 +00:00
Reid Spencer
2afe5c8354
Build Hello by default so it can be used in test cases.
llvm-svn: 35922
2007-04-11 21:03:37 +00:00
Chris Lattner
f29ad16397
fix an infinite loop compiling ldecod, noticed by JeffC.
llvm-svn: 35910
2007-04-11 16:51:53 +00:00
Chris Lattner
e9a9a3f172
Fix incorrect fall-throughs in addr mode code. This fixes CodeGen/ARM/arm-negative-stride.ll
llvm-svn: 35909
2007-04-11 16:17:12 +00:00
Chris Lattner
f7451ea3c2
Fix Transforms/ScalarRepl/union-pointer.ll
llvm-svn: 35906
2007-04-11 15:45:25 +00:00
Chris Lattner
32f6730bb1
Fix PR1318 by reacting appropriately to a mutating option list.
llvm-svn: 35905
2007-04-11 15:35:18 +00:00
Reid Spencer
bd2afc8391
Fix a bug where ICmpInst objects instantiated directly with a name would
not retain that name. Not noticed because AsmParser always sets name after
construction. However, llvm2cpp noticed.
llvm-svn: 35903
2007-04-11 13:04:48 +00:00
Reid Spencer
9b497be3c4
Fix an approximate calculation in an assertion not to give false negatives.
llvm-svn: 35901
2007-04-11 13:00:04 +00:00
Chris Lattner
27a80589de
Turn stuff like:
icmp slt i32 %X, 0 ; <i1>:0 [#uses=1]
sext i1 %0 to i32 ; <i32>:1 [#uses=1]
into:
%X.lobit = ashr i32 %X, 31 ; <i32> [#uses=1]
This implements InstCombine/icmp.ll:test[34]
llvm-svn: 35891
2007-04-11 06:57:46 +00:00
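The fold in the entry above relies on the fact that sign-extending the result of "x < 0" gives the same 32-bit value as an arithmetic shift right by 31. A minimal C++ sketch of the two forms follows. The function names are made up, and the shift version assumes `>>` on a negative signed value is an arithmetic shift, which C++20 guarantees and mainstream compilers have long implemented.

```cpp
#include <cassert>
#include <cstdint>

// icmp slt i32 %X, 0 followed by sext i1 to i32: all ones if negative, else 0.
std::int32_t signMaskViaCompare(std::int32_t x) {
  return x < 0 ? -1 : 0;
}

// ashr i32 %X, 31: the sign bit is copied into every bit position.
// Assumption: >> on a negative value is an arithmetic shift.
std::int32_t signMaskViaShift(std::int32_t x) {
  return x >> 31;
}

// Usage: both forms agree for negative, zero, and positive inputs.
void checkSignMask() {
  assert(signMaskViaCompare(-5) == signMaskViaShift(-5));
  assert(signMaskViaCompare(0) == signMaskViaShift(0));
  assert(signMaskViaCompare(5) == signMaskViaShift(5));
}
```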
Chris Lattner
b659c04f13
Simplify some comparisons to arithmetic, this implements:
Transforms/InstCombine/icmp.ll
llvm-svn: 35890
2007-04-11 06:53:04 +00:00
Chris Lattner
1d20292190
Fix this harder.
llvm-svn: 35888
2007-04-11 06:50:51 +00:00