1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-20 11:33:24 +02:00
Commit Graph

3129 Commits

Author SHA1 Message Date
Devang Patel
e5765bd115 Mem2Reg does not need TargetData.
llvm-svn: 36444
2007-04-25 18:32:35 +00:00
Devang Patel
b6ef7bc5d9 Remove unused function argument.
llvm-svn: 36441
2007-04-25 17:15:20 +00:00
Anton Korobeynikov
25dc9a61cb Implement aliases. This fixes PR1017 and it's dependent bugs. CFE part
will follow.

llvm-svn: 36435
2007-04-25 14:27:10 +00:00
Chris Lattner
e0e5106141 If an alloca only has two types of uses: 1) reads 2) a memcpy/memmove that
copies from a constant global, then we can change the reads to read from the
global instead of from the alloca.  This eliminates the alloca and the memcpy,
and promotes secondary optimizations (because the loads are now loads from
a constant global).

This is important for a common C idiom:

void foo() {
   int A[] = {1,2,3,4,5,6,7,8,9...};
   ... only reads of A ...
}

For some reason, people forget to mark the array static or const.

This triggers on these multisource benchmarks:
JM/ldecode: block_pos, [3 x [4 x [4 x i32]]]
FreeBench/mason: m, [18 x i32], inlined 4 times
MiBench/office-stringsearch: search_strings, [1332 x i8*]
MiBench/office-stringsearch: find_strings, [1333 x i8*]
Prolangs-C++/city: dirs, [9 x i8*], inlined 4 places

and these spec benchmarks:
177.mesa: message, [8 x [32 x i8]]
186.crafty: bias_rl45, [64 x i32]
186.crafty: diag_sq, [64 x i32]
186.crafty: empty, [9 x i8]
186.crafty: xlate, [15 x i8]
186.crafty: status, [13 x i8]
186.crafty: bdinfo, [25 x i8]
445.gobmk: routines, [16 x i8*]
458.sjeng: piece_rep, [14 x i8*]
458.sjeng: t, [13 x i32], inlined 4 places.
464.h264ref: block8x8_idx, [3 x [4 x [4 x i32]]]
464.h264ref: block_pos, [3 x [4 x [4 x i32]]]
464.h264ref: j_off_tab, [12 x i32]

This implements Transforms/ScalarRepl/memcpy-from-global.ll

llvm-svn: 36429
2007-04-25 06:40:51 +00:00
Chris Lattner
cf27f9705f refactor the SROA code out into its own method, no functionality change.
llvm-svn: 36426
2007-04-25 05:02:56 +00:00
Owen Anderson
d0af582609 Undo my previous changes. Since my approach to this problem is being revised,
this approach is no longer appropriate.

llvm-svn: 36421
2007-04-25 04:18:54 +00:00
Devang Patel
938e0db5e3 Fix
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20070423/048376.html

llvm-svn: 36417
2007-04-25 00:37:04 +00:00
Owen Anderson
df3ece8561 Rollback some changes that adversely affected performance. I'm currently rethinking
my approach to this, so hopefully I'll find a way to do this without making this slower.

llvm-svn: 36392
2007-04-24 06:40:39 +00:00
Devang Patel
6a9016dcf7 Fix
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20070423/048333.html

llvm-svn: 36380
2007-04-23 22:42:03 +00:00
Owen Anderson
2be28af998 Make PredicateSimplifier not use DominatorTree.
llvm-svn: 36300
2007-04-21 07:38:12 +00:00
Owen Anderson
9813f7fa80 Fix a comment.
llvm-svn: 36299
2007-04-21 07:12:44 +00:00
Jeff Cohen
2afa206eb3 Comment out usage of write() for now.
llvm-svn: 36287
2007-04-20 22:40:10 +00:00
Devang Patel
3c304ecdc9 Avoid recursion.
llvm-svn: 36272
2007-04-20 20:04:37 +00:00
Owen Anderson
41b527b75f Move more passes to using ETForest instead of DominatorTree.
llvm-svn: 36271
2007-04-20 06:27:13 +00:00
Zhou Sheng
9339daf407 Make use of ConstantInt::isZero instead of ConstantInt::isNullValue.
llvm-svn: 36261
2007-04-19 05:39:12 +00:00
Zhou Sheng
2cb77f2d6c Make the operations of APInt variables more efficient.
llvm-svn: 36260
2007-04-19 05:35:00 +00:00
Evan Cheng
91e0637ed3 Revert Owen's last check-in. This is breaking Mac OS X / PPC llvm-gcc bootstrap.
llvm-svn: 36258
2007-04-18 22:39:00 +00:00
Owen Anderson
16d1fd6ec8 Revert changes that caused breakage.
llvm-svn: 36255
2007-04-18 06:46:57 +00:00
Owen Anderson
781cbecd34 Switch more uses of DominatorTree over to ETForest.
llvm-svn: 36254
2007-04-18 05:43:13 +00:00
Owen Anderson
c3aa967684 Use ETForest instead of DominatorTree.
llvm-svn: 36252
2007-04-18 05:25:43 +00:00
Owen Anderson
c07f29d206 Use ETForest instead of DominatorTree.
llvm-svn: 36249
2007-04-18 04:55:33 +00:00
Owen Anderson
d8bb3aca7b Use new ETForest accessor.
llvm-svn: 36248
2007-04-18 04:46:35 +00:00
Owen Anderson
909895b564 Use ETForest instead of DominatorTree.
llvm-svn: 36247
2007-04-18 04:39:32 +00:00
Dan Gohman
a7e1f3c85b Spell doFinalization right, so that it is a proper virtual override and
gets called.

llvm-svn: 36208
2007-04-17 18:21:36 +00:00
Chris Lattner
8e28e7654d remove use of BasicBlock::getNext
llvm-svn: 36205
2007-04-17 18:09:47 +00:00
Chris Lattner
d503b61318 remove use of BasicBlock::getNext
llvm-svn: 36202
2007-04-17 17:54:12 +00:00
Chris Lattner
e5d747e0be eliminate use of Instruction::getNext()
llvm-svn: 36200
2007-04-17 17:51:03 +00:00
Chris Lattner
7b6a5d7956 remove use of Instruction::getNext
llvm-svn: 36199
2007-04-17 17:47:54 +00:00
Devang Patel
7d868316fd Fix
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20070416/047888.html

llvm-svn: 36182
2007-04-16 23:03:45 +00:00
Anton Korobeynikov
f3e62a428a Removed tabs everywhere except autogenerated & external files. Add make
target for tabs checking.

llvm-svn: 36146
2007-04-16 18:10:23 +00:00
Chris Lattner
601aa5b489 Fix PR1335 and Transforms/Inline/2007-04-15-InlineEH.ll
llvm-svn: 36090
2007-04-15 21:38:06 +00:00
Owen Anderson
ea857029ea Remove ImmediateDominator analysis. The same information can be obtained from DomTree. A lot of code for
constructing ImmediateDominator is now folded into DomTree construction.

This is part of the ongoing work for PR217.

llvm-svn: 36063
2007-04-15 08:47:27 +00:00
Chris Lattner
798749cafe fix SimplifyLibCalls/IsDigit.ll
llvm-svn: 36047
2007-04-15 05:38:40 +00:00
Chris Lattner
fe00dd8315 Extend store merging to support the 'if/then' version in addition to if/then/else.
This sinks the two stores in this example into a single store in cond_next.  In this
case, it allows elimination of the load as well:

        store double 0.000000e+00, double* @s.3060
        %tmp3 = fcmp ogt double %tmp1, 5.000000e-01             ; <i1> [#uses=1]
        br i1 %tmp3, label %cond_true, label %cond_next
cond_true:              ; preds = %entry
        store double 1.000000e+00, double* @s.3060
        br label %cond_next
cond_next:              ; preds = %entry, %cond_true
        %tmp6 = load double* @s.3060            ; <double> [#uses=1]

This implements Transforms/InstCombine/store-merge.ll:test2

llvm-svn: 36040
2007-04-15 01:02:18 +00:00
Chris Lattner
ecd0fda993 refactor some code, no functionality change.
llvm-svn: 36037
2007-04-15 00:07:55 +00:00
Chris Lattner
022c2bc0c3 fix long lines
llvm-svn: 36031
2007-04-14 23:32:02 +00:00
Chris Lattner
9764a3cf09 Implement Transforms/InstCombine/vec_extract_elt.ll, transforming:
define i32 @test(float %f) {
        %tmp7 = insertelement <4 x float> undef, float %f, i32 0
        %tmp17 = bitcast <4 x float> %tmp7 to <4 x i32>
        %tmp19 = extractelement <4 x i32> %tmp17, i32 0
        ret i32 %tmp19
}

into:

define i32 @test(float %f) {
        %tmp19 = bitcast float %f to i32                ; <i32> [#uses=1]
        ret i32 %tmp19
}

On PPC, this is the difference between:

_test:
        mfspr r2, 256
        oris r3, r2, 8192
        mtspr 256, r3
        stfs f1, -16(r1)
        addi r3, r1, -16
        addi r4, r1, -32
        lvx v2, 0, r3
        stvx v2, 0, r4
        lwz r3, -32(r1)
        mtspr 256, r2
        blr

and:

_test:
        stfs f1, -4(r1)
        nop
        nop
        nop
        lwz r3, -4(r1)
        blr

llvm-svn: 36025
2007-04-14 23:02:14 +00:00
Chris Lattner
3553a131d0 Implement InstCombine/vec_demanded_elts.ll:test2. This allows us to turn
unsigned test(float f) {
 return _mm_cvtsi128_si32( (__m128i) _mm_set_ss( f*f ));
}

into:

_test:
        movss 4(%esp), %xmm0
        mulss %xmm0, %xmm0
        movd %xmm0, %eax
        ret

instead of:

_test:
        movss 4(%esp), %xmm0
        mulss %xmm0, %xmm0
        xorps %xmm1, %xmm1
        movss %xmm0, %xmm1
        movd %xmm1, %eax
        ret

GCC gets:

_test:
        subl    $28, %esp
        movss   32(%esp), %xmm0
        mulss   %xmm0, %xmm0
        xorps   %xmm1, %xmm1
        movss   %xmm0, %xmm1
        movaps  %xmm1, %xmm0
        movd    %xmm0, 12(%esp)
        movl    12(%esp), %eax
        addl    $28, %esp
        ret

llvm-svn: 36020
2007-04-14 22:29:23 +00:00
Chris Lattner
7928216e38 avoid copying sets and vectors around.
llvm-svn: 36017
2007-04-14 22:10:17 +00:00
Chris Lattner
3cebebfdd4 avoid iterator invalidation.
llvm-svn: 36002
2007-04-14 18:06:52 +00:00
Jeff Cohen
6e724c5338 An even better fix.
llvm-svn: 35998
2007-04-14 17:18:29 +00:00
Jeff Cohen
114799eab9 Fix recent regression that broke several llvm-tests.
llvm-svn: 35996
2007-04-14 16:55:19 +00:00
Chris Lattner
a283acb406 Implement a few missing xforms: printf("foo\n") -> puts. printf("x") -> putchar
printf("") -> noop.  Still need to do the xforms for fprintf.

This implements Transforms/SimplifyLibCalls/Printf.ll

llvm-svn: 35984
2007-04-14 01:17:48 +00:00
Chris Lattner
25f2c932b7 in addition to merging, constantmerge should also delete trivially dead globals,
in order to clean up after simplifylibcalls.

llvm-svn: 35982
2007-04-14 01:11:54 +00:00
Chris Lattner
6f64f54168 Implement PR1201 and test/Transforms/InstCombine/malloc-free-delete.ll
llvm-svn: 35981
2007-04-14 00:20:02 +00:00
Chris Lattner
b97ff21db2 use an accessor to simplify code.
llvm-svn: 35979
2007-04-14 00:17:39 +00:00
Chris Lattner
8477dd1722 Now that codegen prepare isn't defeating me, I can finally fix what I set
out to do! :)

This fixes a problem where LSR would insert a bunch of code into each MBB
that uses a particular subexpression (e.g. IV+base+C).  The problem is that
this code cannot be CSE'd back together if inserted into different blocks.

This patch changes LSR to attempt to insert a single copy of this code and
share it, allowing codegenprepare to duplicate the code if it can be sunk
into various addressing modes.  On CodeGen/ARM/lsr-code-insertion.ll,
for example, this gives us code like:

        add r8, r0, r5
        str r6, [r8, #+4]
..
        ble LBB1_4      @cond_next
LBB1_3: @cond_true
        str r10, [r8, #+4]
LBB1_4: @cond_next
...
LBB1_5: @cond_true55
        ldr r6, LCPI1_1
        str r6, [r8, #+4]

instead of:

        add r10, r0, r6
        str r8, [r10, #+4]
...
        ble LBB1_4      @cond_next
LBB1_3: @cond_true
        add r8, r0, r6
        str r10, [r8, #+4]
LBB1_4: @cond_next
...
LBB1_5: @cond_true55
        add r8, r0, r6
        ldr r10, LCPI1_1
        str r10, [r8, #+4]

Besides being smaller and more efficient, this makes it immediately
obvious that it is profitable to predicate LBB1_3 now :)

llvm-svn: 35972
2007-04-13 20:42:26 +00:00
Chris Lattner
bc03b6c341 Completely rewrite addressing-mode related sinking of code. In particular,
this fixes problems where codegenprepare would sink expressions into load/stores
that are not valid, and fixes cases where it would miss important valid ones.

This fixes several serious codesize and perf issues, particularly on targets
with complex addressing modes like arm and x86.  For example, now we compile
CodeGen/X86/isel-sink.ll to:

_test:
        movl 8(%esp), %eax
        movl 4(%esp), %ecx
        cmpl $1233, %eax
        ja LBB1_2       #F
LBB1_1: #T
        movl $4, (%ecx,%eax,4)
        movl $141, %eax
        ret
LBB1_2: #F
        movl (%ecx,%eax,4), %eax
        ret

instead of:

_test:
        movl 8(%esp), %eax
        leal (,%eax,4), %ecx
        addl 4(%esp), %ecx
        cmpl $1233, %eax
        ja LBB1_2       #F
LBB1_1: #T
        movl $4, (%ecx)
        movl $141, %eax
        ret
LBB1_2: #F
        movl (%ecx), %eax
        ret

llvm-svn: 35970
2007-04-13 20:30:56 +00:00
Devang Patel
d86d04983a Remove use of SlowOperationInformer.
llvm-svn: 35967
2007-04-13 18:58:18 +00:00
Devang Patel
d01bb17f76 Undo previous check-in.
llvm-svn: 35966
2007-04-13 18:35:15 +00:00