mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-28 14:32:51 +01:00
Commit Graph

1624 Commits

Author SHA1 Message Date
Reid Spencer
3ac324ab41 Fix numerous inferred casts.
llvm-svn: 32479
2006-12-12 09:18:51 +00:00
Reid Spencer
562b83c7df Change inferred getCast into specific getCast. Passes all tests.
llvm-svn: 32469
2006-12-12 05:05:00 +00:00
Chris Lattner
a9b75a7e35 Patch for PR1045 and Transforms/ScalarRepl/2006-12-11-SROA-Crash.ll
llvm-svn: 32468
2006-12-12 04:24:41 +00:00
Chris Lattner
28e7eaf6b8 trunc to integer, not to FP.
llvm-svn: 32426
2006-12-11 01:17:00 +00:00
Chris Lattner
a8eec08185 implement promotion of unions containing two packed types of the same width.
This implements Transforms/ScalarRepl/union-packed.ll

llvm-svn: 32422
2006-12-11 00:35:08 +00:00
Chris Lattner
098fb42690 * Eliminate calls to CastInst::createInferredCast.
* Add support for promoting unions with fp values in them.  This produces
   our new int<->fp bitcast instructions, implementing
   Transforms/ScalarRepl/union-fp-int.ll

As an example, this allows us to compile this:

union intfloat { int i; float f; };
float invsqrt(const float arg_x) {
    union intfloat x = { .f = arg_x };
    const float xhalf = arg_x * 0.5f;
    x.i = 0x5f3759df - (x.i >> 1);
    return x.f * (1.5f - xhalf * x.f * x.f);
}

into:

_invsqrt:
        movss 4(%esp), %xmm0
        movd %xmm0, %eax
        sarl %eax
        movl $1597463007, %ecx
        subl %eax, %ecx
        movd %ecx, %xmm1
        mulss LCPI1_0, %xmm0
        mulss %xmm1, %xmm0
        movss LCPI1_1, %xmm2
        mulss %xmm1, %xmm0
        subss %xmm0, %xmm2
        movl 8(%esp), %eax
        mulss %xmm2, %xmm1
        movss %xmm1, (%eax)
        ret

instead of:

_invsqrt:
        subl $4, %esp
        movss 8(%esp), %xmm0
        movss %xmm0, (%esp)
        movl (%esp), %eax
        movl $1597463007, %ecx
        sarl %eax
        subl %eax, %ecx
        movl %ecx, (%esp)
        mulss LCPI1_0, %xmm0
        movss (%esp), %xmm1
        mulss %xmm1, %xmm0
        mulss %xmm1, %xmm0
        movss LCPI1_1, %xmm2
        subss %xmm0, %xmm2
        mulss %xmm2, %xmm1
        movl 12(%esp), %eax
        movss %xmm1, (%eax)
        addl $4, %esp
        ret

llvm-svn: 32418
2006-12-10 23:56:50 +00:00
Reid Spencer
069149765d Incorporate any changes in the successor blocks into the result of
MarkAliveBlocks.

llvm-svn: 32375
2006-12-08 21:52:01 +00:00
Bill Wendling
23b8b13c9d Removing even more <iostream> includes.
llvm-svn: 32320
2006-12-07 20:04:42 +00:00
Bill Wendling
a3246c4272 Changed llvm_ostream et al. to OStream. llvm_cerr, llvm_cout, and llvm_null are
now cerr, cout, and NullStream, respectively.

llvm-svn: 32298
2006-12-07 01:30:32 +00:00
Reid Spencer
ff6cd88f93 Update ConstantIntegral Max/Min tests for new interface.
llvm-svn: 32288
2006-12-06 20:39:57 +00:00
Chris Lattner
06ba0b8202 add missing #include
llvm-svn: 32280
2006-12-06 18:14:47 +00:00
Chris Lattner
a531ce882e Detemplatize the Statistic class. The only type it is instantiated with
is 'unsigned'.

llvm-svn: 32279
2006-12-06 17:46:33 +00:00
Chris Lattner
3d1758e08c Remove the 'printname' argument to WriteAsOperand. It is always true, and
passing false would make the asmprinter fail anyway.

llvm-svn: 32264
2006-12-06 06:16:21 +00:00
Chris Lattner
e0738f8f8b add an instcombine xform. This speeds up 462.libquantum from 9.78s to
7.48s.  This regression is due to unforeseen consequences of the cast patch.

llvm-svn: 32209
2006-12-05 01:26:29 +00:00
Devang Patel
ae17721f63 SCCP does not handle Packed Type properly. Disable Packed Type handling
for now.

llvm-svn: 32208
2006-12-04 23:54:59 +00:00
Reid Spencer
d727d239f8 Update call to CastInst::getCastOpcode for its new signature.
llvm-svn: 32166
2006-12-04 02:48:01 +00:00
Jeff Cohen
f99052befb Unbreak VC++ build.
llvm-svn: 32113
2006-12-02 02:22:01 +00:00
Chris Lattner
1629b0d995 disable transformations that are invalid for fp vectors. This fixes
Transforms/InstCombine/2006-12-01-BadFPVectorXform.ll

llvm-svn: 32112
2006-12-02 00:13:08 +00:00
Reid Spencer
529fb41272 Remove 4 FIXMEs to hack around cast-to-bool problems which no longer exist.
llvm-svn: 32051
2006-11-30 23:13:36 +00:00
Chris Lattner
2fd5719f50 implement cast.ll:test35. With this, we recognize:
unsigned short swp(unsigned short a) {
       return ((a & 0xff00) >> 8 | (a & 0x00ff) << 8);
}

as an idiom for bswap.

llvm-svn: 32011
2006-11-29 07:18:39 +00:00
Chris Lattner
03fdea2e74 Teach instcombine to turn trunc(srl x, c) -> srl (trunc(x), c) when safe.
This implements InstCombine/cast.ll:test34.  It fires hundreds of times on
176.gcc.

llvm-svn: 32009
2006-11-29 07:04:07 +00:00
Chris Lattner
0409f2c48d Implement Regression/Transforms/InstCombine/bswap-fold.ll,
folding   seteq (bswap(x)), c -> seteq(x,bswap(c))

llvm-svn: 32006
2006-11-29 05:02:16 +00:00
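
Editor's note (not part of the commit message above): a C-level sketch of the
seteq(bswap(x), c) fold, with hypothetical function names; __builtin_bswap32 is
the GCC/Clang builtin and stands in for the llvm.bswap intrinsic here.

int before(unsigned x) {
        return __builtin_bswap32(x) == 0x11223344u;   /* bswap runs at run time */
}
int after(unsigned x) {
        return x == 0x44332211u;   /* compare against the byte-swapped constant instead */
}
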
Reid Spencer
a866877d2f Join a split line.
llvm-svn: 31996
2006-11-29 01:11:01 +00:00
Reid Spencer
c48fe0fd4d Undo the last patch until 253.perlbmk passes with these changes.
llvm-svn: 31977
2006-11-28 20:23:51 +00:00
Reid Spencer
8587322e79 Remove 4 FIXME's from the CAST patch now that the back end is correctly
producing code for "trunc to bool". This passes all tests on Linux.

llvm-svn: 31963
2006-11-28 07:23:01 +00:00
Chris Lattner
b391cbb939 Fix PR1014 and InstCombine/2006-11-27-XorBug.ll.
llvm-svn: 31941
2006-11-27 19:55:07 +00:00
Reid Spencer
992d9788b3 For PR950:
The long awaited CAST patch. This introduces 12 new instructions into LLVM
to replace the cast instruction. Corresponding changes throughout LLVM are
provided. This passes llvm-test, llvm/test, and SPEC CPUINT2000 with the
exception of 175.vpr which fails only on a slight floating point output
difference.

llvm-svn: 31931
2006-11-27 01:05:10 +00:00
Bill Wendling
1b3a86000a Removed #include <iostream> and replaced with llvm_* streams.
llvm-svn: 31923
2006-11-26 09:46:52 +00:00
Nick Lewycky
cd25e651c2 Update to new predicate simplifier VRP design. Fixes PR966 and PR967.
Remove predicate simplifier from default gcc3 pipeline. New design is too
slow to enable by default.
Add new testcases for problems encountered in development.

llvm-svn: 31895
2006-11-22 23:49:16 +00:00
Chris Lattner
632c66b8ef This xform is handled by FoldOpIntoPhi in visitCastInst in a more elegant way.
llvm-svn: 31889
2006-11-21 17:05:13 +00:00
Chris Lattner
cc4df7e0ab If an indvar with a variable stride is used by the exit condition, go ahead
and handle it like constant stride vars.  This fixes some bad codegen in
variable stride cases.  For example, it compiles this:

void foo(int k, int i) {
  for (k=i+i; k <= 8192; k+=i)
    flags2[k] = 0;
}

to:

LBB1_1: #bb.preheader
        movl %eax, %ecx
        addl %ecx, %ecx
        movl L_flags2$non_lazy_ptr, %edx
LBB1_2: #bb
        movb $0, (%edx,%ecx)
        addl %eax, %ecx
        cmpl $8192, %ecx
        jle LBB1_2      #bb
LBB1_5: #return
        ret

or (if the array is local and we are in dynamic-nonpic or static mode):

LBB3_2: #bb
        movb $0, _flags2(%ecx)
        addl %eax, %ecx
        cmpl $8192, %ecx
        jle LBB3_2      #bb

and:

        lis r2, ha16(L_flags2$non_lazy_ptr)
        lwz r2, lo16(L_flags2$non_lazy_ptr)(r2)
        slwi r3, r4, 1
LBB1_2: ;bb
        li r5, 0
        add r6, r4, r3
        stbx r5, r2, r3
        cmpwi cr0, r6, 8192
        bgt cr0, LBB1_5 ;return

instead of:

        leal (%eax,%eax,2), %ecx
        movl %eax, %edx
        addl %edx, %edx
        addl L_flags2$non_lazy_ptr, %edx
        xorl %esi, %esi
LBB1_2: #bb
        movb $0, (%edx,%esi)
        movl %eax, %edi
        addl %esi, %edi
        addl %ecx, %esi
        cmpl $8192, %esi
        jg LBB1_5       #return

and:

        lis r2, ha16(L_flags2$non_lazy_ptr)
        lwz r2, lo16(L_flags2$non_lazy_ptr)(r2)
        mulli r3, r4, 3
        slwi r5, r4, 1
        li r6, 0
        add r2, r2, r5
LBB1_2: ;bb
        li r5, 0
        add r7, r3, r6
        stbx r5, r2, r6
        add r6, r4, r6
        cmpwi cr0, r7, 8192
        ble cr0, LBB1_2 ;bb

This speeds up Benchmarks/Shootout/sieve from 8.533s to 6.464s and
implements LoopStrengthReduce/var_stride_used_by_compare.ll

llvm-svn: 31809
2006-11-17 06:17:33 +00:00
Chris Lattner
0a2d29b345 Fix a gcc 4.2 warning.
llvm-svn: 31751
2006-11-15 04:53:24 +00:00
Chris Lattner
0114b0c20e implement InstCombine/shift-simplify.ll by transforming:
(X >> Z) op (Y >> Z)  -> (X op Y) >> Z

for all shifts and all ops={and/or/xor}.

llvm-svn: 31729
2006-11-14 07:46:50 +00:00
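
Editor's note (not part of the commit message above): a C-level sketch of the
(X >> Z) op (Y >> Z) -> (X op Y) >> Z rewrite for op = and, on unsigned values;
function names are hypothetical, and the same identity holds for or and xor.

unsigned before(unsigned x, unsigned y, unsigned z) {
        return (x >> z) & (y >> z);   /* two shifts */
}
unsigned after(unsigned x, unsigned y, unsigned z) {
        return (x & y) >> z;          /* one shift */
}
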
Chris Lattner
616335f272 implement InstCombine/and-compare.ll:test1. This compiles:
typedef struct { unsigned prefix : 4; unsigned code : 4; unsigned unsigned_p : 4; } tree_common;
int foo(tree_common *a, tree_common *b) { return a->code == b->code; }

into:

_foo:
        movl 4(%esp), %eax
        movl 8(%esp), %ecx
        movl (%eax), %eax
        xorl (%ecx), %eax
        # TRUNCATE movb %al, %al
        shrb $4, %al
        testb %al, %al
        sete %al
        movzbl %al, %eax
        ret

instead of:

_foo:
        movl 8(%esp), %eax
        movb (%eax), %al
        shrb $4, %al
        movl 4(%esp), %ecx
        movb (%ecx), %cl
        shrb $4, %cl
        cmpb %al, %cl
        sete %al
        movzbl %al, %eax
        ret

saving one cycle by eliminating a shift.

llvm-svn: 31727
2006-11-14 06:06:06 +00:00
Chris Lattner
65a873caa2 Fix InstCombine/2006-11-10-ashr-miscompile.ll, a miscompilation introduced
by the shr -> [al]shr patch.  This was reduced from 176.gcc.

llvm-svn: 31653
2006-11-10 23:38:52 +00:00
Chris Lattner
cd3c5f59e2 Teach ShrinkDemandedConstant how to handle X+C. This implements:
add.ll:test33, add.ll:test34, shift-sra.ll:test2

llvm-svn: 31586
2006-11-09 05:12:27 +00:00
Chris Lattner
d1bcd014a8 reenable factoring of GEP expressions, being more precise about the
case where it is bad to do so.

llvm-svn: 31563
2006-11-08 19:42:28 +00:00
Chris Lattner
77e0e67f23 make this code more efficient by not creating a phi node we are just going to
delete in the first place.  This also makes it simpler.

llvm-svn: 31562
2006-11-08 19:29:23 +00:00
Chris Lattner
f2cd0aeced disable this factoring optzn for GEPs for now; this severely pessimizes some
loops.

llvm-svn: 31560
2006-11-08 18:49:31 +00:00
Reid Spencer
da1f5b882a For PR950:
This patch converts the old SHR instruction into two instructions,
AShr (Arithmetic) and LShr (Logical). The Shr instructions now are not
dependent on the sign of their operands.

llvm-svn: 31542
2006-11-08 06:47:33 +00:00
Chris Lattner
924b2b109f scalarrepl should not split the two elements of the vsiidx array:
int func(vFloat v0, vFloat v1) {
        int ii;
        vSInt32 vsiidx[2];
        vsiidx[0] = _mm_cvttps_epi32(v0);
        vsiidx[1] = _mm_cvttps_epi32(v1);
        ii = ((int *) vsiidx)[4];
        return ii;
}

This fixes Transforms/ScalarRepl/2006-11-07-InvalidArrayPromote.ll

llvm-svn: 31524
2006-11-07 22:42:47 +00:00
Jeff Cohen
e1003da1a2 Unbreak VC++ build.
llvm-svn: 31464
2006-11-05 19:31:28 +00:00
Nick Lewycky
8b17d79f5e Remove commented line from earlier debugging.
llvm-svn: 31460
2006-11-05 14:19:40 +00:00
Andrew Lenharth
c2f822392c The wrong parameter was being tested to determine i32 vs i64
llvm-svn: 31431
2006-11-03 22:45:50 +00:00
Chris Lattner
8f8a1ed82e remove dead code
llvm-svn: 31398
2006-11-03 01:34:58 +00:00
Reid Spencer
4bafa71dc1 For PR786:
Turn on -Wunused and -Wno-unused-parameter. Clean up most of the resulting
fallout by removing unused variables. Remaining warnings have to do with
unused functions (I didn't want to delete code without review) and unused
variables in generated code. Maintainers should clean up the remaining
issues when they see them. All changes pass DejaGnu tests and Olden.

llvm-svn: 31380
2006-11-02 20:25:50 +00:00
Reid Spencer
1abf69e923 For PR950:
Replace the REM instruction with UREM, SREM and FREM.

llvm-svn: 31369
2006-11-02 01:53:59 +00:00
Devang Patel
ae6d81559a There can be more than one PHINode at the start of the block.
llvm-svn: 31362
2006-11-01 23:04:45 +00:00
Devang Patel
3f3161cee2 Handle PHINode with only one incoming value.
This fixes http://llvm.org/bugs/show_bug.cgi?id=979

llvm-svn: 31358
2006-11-01 22:26:43 +00:00
Chris Lattner
2cedfa5156 Factor gep instructions through phi nodes.
llvm-svn: 31346
2006-11-01 07:43:41 +00:00
Chris Lattner
61ea2af8fe Turn a phi of many loads into a phi of the address and a single load of the
result.  This can significantly shrink code and exposes identities more
aggressively.

llvm-svn: 31344
2006-11-01 07:13:54 +00:00
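
Editor's note (not part of the commit message above): the phi of loads is
sketched here as a C conditional purely for illustration; function names are
hypothetical.

int before(int c, int *p, int *q) {
        return c ? *p : *q;           /* conceptually, a phi of two loads */
}
int after(int c, int *p, int *q) {
        int *a = c ? p : q;           /* phi of the addresses */
        return *a;                    /* single load of the result */
}
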
Chris Lattner
e43c3b1681 Fix a bug in the previous patch
llvm-svn: 31342
2006-11-01 04:55:47 +00:00
Chris Lattner
7211110992 Fold things like "phi [add (a,b), add(c,d)]" into two phi's and one add.
This triggers thousands of times on multisource.

llvm-svn: 31341
2006-11-01 04:51:18 +00:00
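
Editor's note (not part of the commit message above): a C-level sketch of the
"phi of two adds" fold; phi nodes are shown as C conditionals and the function
names are hypothetical.

int before(int c, int a, int b, int d, int e) {
        return c ? (a + b) : (d + e);         /* an add in each arm */
}
int after(int c, int a, int b, int d, int e) {
        return (c ? a : d) + (c ? b : e);     /* two phis feeding one add */
}
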
Chris Lattner
a1da382ad3 break edges more intelligently
llvm-svn: 31257
2006-10-28 06:45:33 +00:00
Chris Lattner
41216d38c5 SplitCriticalEdge checks to see if an edge is critical; don't check twice
llvm-svn: 31255
2006-10-28 06:38:14 +00:00
Chris Lattner
93414d06c4 prepare for a change I'm about to make
llvm-svn: 31248
2006-10-28 00:59:20 +00:00
Reid Spencer
4665cb220d Simplify code a bit by changing instances of:
InsertNewInstBefore(new CastInst(Val, ValTy, Val->GetName()), I)
into:
   InsertCastBefore(Val, ValTy, I)

llvm-svn: 31204
2006-10-26 19:19:06 +00:00
Reid Spencer
6833ffe8b8 For PR950:
Make necessary changes to support DIV -> [SUF]Div. This changes llvm to
have three division instructions: signed, unsigned, floating point. The
bytecode and assembler are backwards compatible, however.

llvm-svn: 31195
2006-10-26 06:15:43 +00:00
Nick Lewycky
e7580b4a17 Fix 2006-10-25-AddSetCC. A relational operator (like setlt) can never
produce an EQ property.

llvm-svn: 31193
2006-10-26 02:35:18 +00:00
Nick Lewycky
be9829c45f Resurrect r1.25.
Fix and comment the "or", "and" and "xor" transformations.

llvm-svn: 31189
2006-10-25 23:48:24 +00:00
Chris Lattner
f4a5fcbb3a hide symbols properly
llvm-svn: 31184
2006-10-25 21:14:31 +00:00
Chris Lattner
0cf64c9469 Fix Transforms/ScalarRepl/2006-10-23-PointerUnionCrash.ll
llvm-svn: 31151
2006-10-24 06:26:32 +00:00
Chris Lattner
d13449ed3e Revert back to r1.21, which was the last revision of predsimplify that
passes llvm-gcc bootstrap.

llvm-svn: 31146
2006-10-24 00:36:21 +00:00
Chris Lattner
91e628645b Handle fallout from the recent branch-on-undef changes. This fixes
Prolangs-C/agrep and SCCP/2006-10-23-IPSCCP-Crash.ll

llvm-svn: 31132
2006-10-23 18:57:02 +00:00
Nick Lewycky
6830bee9b4 Remove the Backwards operation. Resolving now works at the time when a
property is added by running through the list of uses of the value and
adding resolved properties to the property set.

llvm-svn: 31126
2006-10-23 01:56:02 +00:00
Nick Lewycky
25e815f0a2 Fix similar missing optimization opportunity in XOR.
llvm-svn: 31123
2006-10-22 22:22:58 +00:00
Nick Lewycky
5eec4941d1 Whoops! Add missing NULL check.
llvm-svn: 31121
2006-10-22 21:38:24 +00:00
Nick Lewycky
b81c926e06 Handle "if ((x|y) != 0)" for ints like we do for bools. Fixes missed
optimization opportunity pointed out by Chris Lattner.

llvm-svn: 31118
2006-10-22 21:36:41 +00:00
Nick Lewycky
c680dabd94 AllocaInst can't return a null pointer. Fixes missed optimization
opportunity pointed out by Andrew Lewycky.

llvm-svn: 31115
2006-10-22 19:53:27 +00:00
Chris Lattner
51e762d4cb Add a workaround for PR962, disabling the more aggressive form of this
transformation.  This speeds up a C++ app 2.25x.

llvm-svn: 31113
2006-10-22 18:42:26 +00:00
Chris Lattner
50b3810d9a 3 Changes:
1. Better document what is going on here.
2. Only hack on one branch per iteration, making the results less conservative.
3. Handle the problematic case by marking edges executable instead of by
   playing with value lattice states.  This is far less pessimistic, and fixes
   SCCP/ipsccp-gvar.ll.

llvm-svn: 31106
2006-10-22 05:59:17 +00:00
Chris Lattner
6ea0134893 Fix an ugly problem in SCCP. This fixes Benchmarks/Misc-C++/mandel-text.cpp
llvm-svn: 31073
2006-10-20 20:19:08 +00:00
Chris Lattner
38ed7d9e49 Fix miscompilation of MallocBench/espresso which code review pointed out
but apparently didn't make it into the final patch.

llvm-svn: 31070
2006-10-20 18:20:21 +00:00
Reid Spencer
d414793dbc For PR950:
This patch implements the first increment for the Signless Types feature.
All changes pertain to removing the ConstantSInt and ConstantUInt classes
in favor of just using ConstantInt.

llvm-svn: 31063
2006-10-20 07:07:24 +00:00
Devang Patel
b030b91f4a While creating mask, use 1ULL instead of 1.
llvm-svn: 31062
2006-10-20 01:16:56 +00:00
Devang Patel
880a9d823f It is OK to remove the extra cast if the operation is EQ/NE, even though the source
and destination signs may not match, provided the other conditions are met.

llvm-svn: 31056
2006-10-19 20:59:13 +00:00
Devang Patel
88406a6e1e Typo Typo.
llvm-svn: 31055
2006-10-19 19:21:36 +00:00
Devang Patel
277990c79f Typo.
llvm-svn: 31054
2006-10-19 19:05:38 +00:00
Devang Patel
d9ade71cc7 Fix bug in PR454 resolution. Added new test case.
This fixes llvmAsmParser.cpp miscompile by llvm on PowerPC Darwin.

llvm-svn: 31053
2006-10-19 18:54:08 +00:00
Reid Spencer
c6aa794a41 Undo Chris' last patch, it caused a regression.
llvm-svn: 30991
2006-10-16 23:08:08 +00:00
Chris Lattner
fd983f91e7 fix a buggy check that accidentally disabled this xform
llvm-svn: 30967
2006-10-15 22:42:15 +00:00
Nick Lewycky
686cc9cacc Replace custom dispatch code with two uses of InstVisitor. Improves
compile-time performance.

llvm-svn: 30896
2006-10-12 02:02:44 +00:00
Chris Lattner
9f980ec2a1 Implement SROA of unions with mixed pointers/integers in them. This implements
PR892 and Transforms/ScalarRepl/union-pointer.ll:test2

llvm-svn: 30825
2006-10-08 23:53:04 +00:00
Chris Lattner
f8afa75cef Implement Transforms/ScalarRepl/union-pointer.ll:test
llvm-svn: 30823
2006-10-08 23:28:04 +00:00
Chris Lattner
513ba43053 add a new SimplifyDemandedVectorElts method, which works similarly to
SimplifyDemandedBits.  The idea is that some operations can be simplified if
not all of the computed elements are needed.  Some targets (like x86) have a
large number of intrinsics that operate on a single element, but pass other
elts through unmodified.  If those other elements are not needed, the
intrinsics can be simplified to scalar operations, and insertelement ops can
be removed.

This turns (f.e.):

ushort %Convert_sse(float %f) {
        %tmp = insertelement <4 x float> undef, float %f, uint 0                ; <<4 x float>> [#uses=1]
        %tmp10 = insertelement <4 x float> %tmp, float 0.000000e+00, uint 1             ; <<4 x float>> [#uses=1]
        %tmp11 = insertelement <4 x float> %tmp10, float 0.000000e+00, uint 2           ; <<4 x float>> [#uses=1]
        %tmp12 = insertelement <4 x float> %tmp11, float 0.000000e+00, uint 3           ; <<4 x float>> [#uses=1]
        %tmp28 = tail call <4 x float> %llvm.x86.sse.sub.ss( <4 x float> %tmp12, <4 x float> < float 1.000000e+00, float 0.000000e+00, float 0.000000e+00, float 0.000000e+00 > )               ; <<4 x float>> [#uses=1]
        %tmp37 = tail call <4 x float> %llvm.x86.sse.mul.ss( <4 x float> %tmp28, <4 x float> < float 5.000000e-01, float 0.000000e+00, float 0.000000e+00, float 0.000000e+00 > )               ; <<4 x float>> [#uses=1]
        %tmp48 = tail call <4 x float> %llvm.x86.sse.min.ss( <4 x float> %tmp37, <4 x float> < float 6.553500e+04, float 0.000000e+00, float 0.000000e+00, float 0.000000e+00 > )               ; <<4 x float>> [#uses=1]
        %tmp59 = tail call <4 x float> %llvm.x86.sse.max.ss( <4 x float> %tmp48, <4 x float> zeroinitializer )          ; <<4 x float>> [#uses=1]
        %tmp = tail call int %llvm.x86.sse.cvttss2si( <4 x float> %tmp59 )              ; <int> [#uses=1]
        %tmp69 = cast int %tmp to ushort                ; <ushort> [#uses=1]
        ret ushort %tmp69
}

into:

ushort %Convert_sse(float %f) {
entry:
        %tmp28 = sub float %f, 1.000000e+00             ; <float> [#uses=1]
        %tmp37 = mul float %tmp28, 5.000000e-01         ; <float> [#uses=1]
        %tmp375 = insertelement <4 x float> undef, float %tmp37, uint 0         ; <<4 x float>> [#uses=1]
        %tmp48 = tail call <4 x float> %llvm.x86.sse.min.ss( <4 x float> %tmp375, <4 x float> < float 6.553500e+04, float undef, float undef, float undef > )           ; <<4 x float>> [#uses=1]
        %tmp59 = tail call <4 x float> %llvm.x86.sse.max.ss( <4 x float> %tmp48, <4 x float> < float 0.000000e+00, float undef, float undef, float undef > )            ; <<4 x float>> [#uses=1]
        %tmp = tail call int %llvm.x86.sse.cvttss2si( <4 x float> %tmp59 )              ; <int> [#uses=1]
        %tmp69 = cast int %tmp to ushort                ; <ushort> [#uses=1]
        ret ushort %tmp69
}

which improves codegen from:

_Convert_sse:
        movss LCPI1_0, %xmm0
        movss 4(%esp), %xmm1
        subss %xmm0, %xmm1
        movss LCPI1_1, %xmm0
        mulss %xmm0, %xmm1
        movss LCPI1_2, %xmm0
        minss %xmm0, %xmm1
        xorps %xmm0, %xmm0
        maxss %xmm0, %xmm1
        cvttss2si %xmm1, %eax
        andl $65535, %eax
        ret

to:

_Convert_sse:
        movss 4(%esp), %xmm0
        subss LCPI1_0, %xmm0
        mulss LCPI1_1, %xmm0
        movss LCPI1_2, %xmm1
        minss %xmm1, %xmm0
        xorps %xmm1, %xmm1
        maxss %xmm1, %xmm0
        cvttss2si %xmm0, %eax
        andl $65535, %eax
        ret


This is just a first step, it can be extended in many ways.  Testcase here:
Transforms/InstCombine/vec_demanded_elts.ll

llvm-svn: 30752
2006-10-05 06:55:50 +00:00
Nick Lewycky
f5ad6f5e2e Simplify logic further.
Ensure that we copy KnownProperties before calling visitBasicBlock, else
we may leak properties into blocks where they don't belong.

llvm-svn: 30705
2006-10-03 17:36:01 +00:00
Nick Lewycky
89e4e147f3 Simplify, now that predsimplify depends on break-crit-edges.
Fix SwitchInst where dest-block is the same as one of the cases.

llvm-svn: 30700
2006-10-03 15:19:11 +00:00
Nick Lewycky
0da988d8a7 Move break-crit-edges before the predicate simplifier. Allows us to
optimize in more cases.

llvm-svn: 30699
2006-10-03 14:52:23 +00:00
Chris Lattner
ed1e28e373 Fix a bug from r1.391 of this file, where we checked the size instead of
the alignment when promoting allocations.  This implements
InstCombine/cast.ll:test32

llvm-svn: 30682
2006-10-01 19:40:58 +00:00
Chris Lattner
168061f13d Eliminate ConstantBool::True and ConstantBool::False. Instead, provide
ConstantBool::getTrue() and ConstantBool::getFalse().

llvm-svn: 30665
2006-09-28 23:35:22 +00:00
Chris Lattner
358e9432a1 set DEBUG_TYPE right
llvm-svn: 30623
2006-09-27 04:58:23 +00:00
Nick Lewycky
800fff3067 Style changes only. Remove dead code, fix a comment.
llvm-svn: 30588
2006-09-23 15:13:08 +00:00
Chris Lattner
e87cf1c708 Fix Transforms/IndVarsSimplify/2006-09-20-LFTR-Crash.ll
llvm-svn: 30555
2006-09-21 05:12:20 +00:00
Nick Lewycky
2aff202559 Don't rewrite ConstantExpr::get.
llvm-svn: 30552
2006-09-21 01:05:35 +00:00
Nick Lewycky
eb301d20a6 Once we're down to "setcc type constant1, constant2", at least come up
with the right answer.

llvm-svn: 30550
2006-09-20 23:02:24 +00:00
Nick Lewycky
99b3c50130 Use a total ordering to compare instructions.
Fixes infinite loop in resolve().

llvm-svn: 30540
2006-09-20 17:04:01 +00:00
Andrew Lenharth
cf0746ba2a simplify
llvm-svn: 30535
2006-09-20 15:37:57 +00:00
Chris Lattner
6ddcf6bba8 We went through all that trouble to compute whether it was safe to transform
this comparison, but never checked it.  Whoops, no wonder we miscompiled
177.mesa!

llvm-svn: 30511
2006-09-20 04:44:59 +00:00
Evan Cheng
a7347758f5 Back out Chris' last set of changes. This breaks 177.mesa and povray somehow.
llvm-svn: 30505
2006-09-20 01:39:40 +00:00
Evan Cheng
8652c13f13 80 col.
llvm-svn: 30504
2006-09-20 01:10:02 +00:00
Andrew Lenharth
0240d56eb6 If we have an add, do it in the pointer realm, not the int realm. This is critical in the linux kernel for pointer analysis correctness
llvm-svn: 30496
2006-09-19 18:24:51 +00:00
Chris Lattner
2d2d80a4c2 implement select.ll:test19-22
llvm-svn: 30482
2006-09-19 06:18:21 +00:00
Nick Lewycky
96939f2d94 Walk down the dominator tree instead of the control flow graph. That means
that we can't modify the CFG any more, at least not until it's possible
to update the dominator tree (PR217).

llvm-svn: 30469
2006-09-18 21:09:35 +00:00
Chris Lattner
1efde528d6 Fix an infinite loop building the CFE
llvm-svn: 30465
2006-09-18 18:27:05 +00:00
Chris Lattner
9c8bffb5e8 Implement InstCombine/cast.ll:test31. This speeds up 462.libquantum by 26%.
llvm-svn: 30456
2006-09-18 05:27:43 +00:00
Chris Lattner
f7e8879212 Implement Transforms/InstCombine/shift-sra.ll:test0
llvm-svn: 30450
2006-09-18 04:31:40 +00:00
Chris Lattner
6ee34e89bc Rewrite shift/and/compare sequences to promote better licm of the RHS.
Use isLogicalShift/isArithmeticShift to simplify code.

llvm-svn: 30448
2006-09-18 04:22:48 +00:00
Chris Lattner
a4689e489e Fix Transforms/InstCombine/2006-09-15-CastToBool.ll and PR913
llvm-svn: 30405
2006-09-16 03:14:10 +00:00
Nick Lewycky
d8a64a4b2a Add some more consistency checks.
llvm-svn: 30305
2006-09-13 19:32:53 +00:00
Nick Lewycky
29d605880a Fix unionSets so that it can merge correctly.
llvm-svn: 30304
2006-09-13 19:24:01 +00:00
Nick Lewycky
315cc49646 Erase dead instructions.
llvm-svn: 30298
2006-09-13 18:55:37 +00:00
Chris Lattner
c35e7175c3 A sinkable instruction may exist with uses, if those uses are in dead blocks.
Handle this.  This fixes PR908 and Transforms/LICM/2006-09-12-DeadUserOfSunkInstr.ll

llvm-svn: 30275
2006-09-12 19:17:09 +00:00
Chris Lattner
0cffa03571 Fix PR905 and InstCombine/2006-09-11-EmptyStructCrash.ll
llvm-svn: 30266
2006-09-11 21:43:16 +00:00
Nick Lewycky
f9acdaf05e Skip the linear search if the answer is already known.
llvm-svn: 30251
2006-09-11 17:23:34 +00:00
Chris Lattner
2921612126 Allow tail duplication in more cases, relaxing the previous restriction a
bit.  This fixes Regression/Transforms/TailDup/MergeTest.ll

llvm-svn: 30237
2006-09-10 18:17:58 +00:00
Nick Lewycky
3bfe103166 Replace EquivalenceClasses with a custom-built data structure. Many common
operations (like findProperties) should be faster, at the expense of
unionSets being slower in cases that are rare in practice.

Don't erase a dead Instruction. This fixes a memory corruption issue.

llvm-svn: 30235
2006-09-10 02:27:07 +00:00
Chris Lattner
91d21d85e8 Implement Transforms/InstCombine/hoist_instr.ll
llvm-svn: 30234
2006-09-09 22:02:56 +00:00
Chris Lattner
6847781b3e Turn div X, (Cond ? Y : 0) -> div X, Y
This implements select.ll::test18.

llvm-svn: 30230
2006-09-09 20:26:32 +00:00
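
Editor's note (not part of the commit message above): a C-level sketch of why
div X, (Cond ? Y : 0) -> div X, Y is valid; division by zero is undefined, so
the zero arm can never be taken on a well-defined execution. Function names are
hypothetical.

int before(int x, int y, int c) {
        return x / (c ? y : 0);
}
int after(int x, int y, int c) {
        return x / y;                 /* the select is dropped */
}
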
Chris Lattner
6aebff10e8 Throttle back tail duplication to avoid creating really ugly sequences of code.
For Transforms/TailDup/if-tail-dup.ll, f.e., it produces:

_foo:
        movl 8(%esp), %eax
        movl 4(%esp), %ecx
        testl $1, %ecx
        je LBB1_2       #cond_next
LBB1_1: #cond_true
        movl $1, (%eax)
LBB1_2: #cond_next
        testl $2, %ecx
        je LBB1_4       #cond_next10
LBB1_3: #cond_true6
        movl $1, 4(%eax)
LBB1_4: #cond_next10
        testl $4, %ecx
        je LBB1_6       #cond_next18
LBB1_5: #cond_true14
        movl $1, 8(%eax)
LBB1_6: #cond_next18
        testl $8, %ecx
        je LBB1_8       #return
LBB1_7: #cond_true22
        movl $1, 12(%eax)
        ret
LBB1_8: #return
        ret

instead of:

_foo:
        movl 4(%esp), %eax
        testl $2, %eax
        sete %cl
        movl 8(%esp), %edx
        testl $1, %eax
        je LBB1_2       #cond_next
LBB1_1: #cond_true
        movl $1, (%edx)
        testb %cl, %cl
        jne LBB1_4      #cond_next10
        jmp LBB1_3      #cond_true6
LBB1_2: #cond_next
        testb %cl, %cl
        jne LBB1_4      #cond_next10
LBB1_3: #cond_true6
        movl $1, 4(%edx)
        testl $4, %eax
        je LBB1_6       #cond_next18
        jmp LBB1_5      #cond_true14
LBB1_4: #cond_next10
        testl $4, %eax
        je LBB1_6       #cond_next18
LBB1_5: #cond_true14
        movl $1, 8(%edx)
        testl $8, %eax
        je LBB1_8       #return
        jmp LBB1_7      #cond_true22
LBB1_6: #cond_next18
        testl $8, %eax
        je LBB1_8       #return
LBB1_7: #cond_true22
        movl $1, 12(%edx)
        ret
LBB1_8: #return
        ret

llvm-svn: 30158
2006-09-07 21:30:15 +00:00
Nick Lewycky
26f5df3031 Improve handling of SelectInst.
Reorder operations to remove duplicated work.
Fix to leave floating-point types out of the optimization.
Add tests to predsimplify.ll for SwitchInst and SelectInst handling.

llvm-svn: 30055
2006-09-02 19:40:38 +00:00
Nick Lewycky
ebb3b930fd Don't confuse canonicalize and lookup. Fixes predsimplify.reg4.ll. Also
corrects missing optimization opportunity removing cases from a switch.

llvm-svn: 30009
2006-09-01 03:26:35 +00:00
Nick Lewycky
e31a5a1b20 Properties where both Values weren't in the union (as being equal to
another Value) weren't being found by findProperties.

This fixes predsimplify.ll test6, a missed optimization opportunity.

llvm-svn: 29991
2006-08-31 00:39:16 +00:00
Nick Lewycky
4a44c62fab Move to using the EquivalenceClass ADT. Removes SynSets.
If a branch's condition has become a ConstantBool, simplify it immediately.
Removing the edge saves work and exposes up more optimization opportunities
in the pass.
Add support for SelectInst.

llvm-svn: 29970
2006-08-30 02:46:48 +00:00
Devang Patel
a5bb9b49d3 Do not rely on std::sort and std::erase to get list of unique
exit blocks. The output is dependent on the addresses of basic blocks.

Add and use Loop::getUniqueExitBlocks.

llvm-svn: 29966
2006-08-29 22:29:16 +00:00
Owen Anderson
bbfa479f14 Clean up a bit.
llvm-svn: 29950
2006-08-29 06:10:56 +00:00
Nick Lewycky
9535a84c33 Add PredicateSimplifier pass. Collapses equal variables into one form
and simplifies expressions. This implements the optimization described
in PR807.

llvm-svn: 29947
2006-08-28 22:44:55 +00:00
Owen Anderson
ee603f511f Make LoopUnroll fold excessive BasicBlocks. This results in a significant speedup of
gccas on 252.eon

llvm-svn: 29936
2006-08-28 02:09:46 +00:00
Chris Lattner
a39dcb5377 eliminate RegisterOpt. It does the same thing as RegisterPass.
llvm-svn: 29925
2006-08-27 22:42:52 +00:00
Chris Lattner
33bd5dcfb7 s|llvm/Support/Visibility.h|llvm/Support/Compiler.h|
llvm-svn: 29911
2006-08-27 12:54:02 +00:00
Owen Anderson
aac2dbf9dd Fix a crash related to updating Phi nodes in the original header block. This was
causing a crash in 175.vpr

llvm-svn: 29887
2006-08-25 22:13:55 +00:00
Owen Anderson
e02cb4cda0 Add an assertion to check that we're really preserving LCSSA.
llvm-svn: 29886
2006-08-25 22:12:36 +00:00
Owen Anderson
b1d980f98a Reapply the indvars patch, since nothing blew up last night.
llvm-svn: 29874
2006-08-25 17:41:25 +00:00
Owen Anderson
596b22375a Revert my previous patch. Since there are some major changes that went in today,
I'm going to wait to put this in HEAD until tomorrow, so as not to clutter the nightly
tester.

llvm-svn: 29868
2006-08-25 03:45:57 +00:00
Owen Anderson
54c87a08ab Specify that indvars actually preserve LCSSA. This has been done for a while, but I
forgot to put in the analysis usage.

llvm-svn: 29867
2006-08-25 03:32:13 +00:00
Owen Anderson
0aa48d0522 Implement unrolling of multiblock loops. This significantly improves the
utility of the LoopUnroll pass.

Also, add a testcase for multiblock-loop unrolling.

llvm-svn: 29859
2006-08-24 21:28:19 +00:00
Reid Spencer
1ae3d19c51 Fix a grammaro in a comment.
llvm-svn: 29765
2006-08-18 09:01:07 +00:00
Chris Lattner
d9ce68d3ec Handle single-entry PHI nodes correctly. This fixes PR877 and
Transforms/CondProp/2006-08-14-SingleEntryPhiCrash.ll

llvm-svn: 29673
2006-08-14 21:38:05 +00:00
Chris Lattner
52419ac93e Changes:
  1. Update an obsolete comment.
  2. Make the sorting by base an explicit (though still N^2) step, so
     that the code is more clear on what it is doing.
  3. Partition uses so that uses inside the loop are handled before uses
     outside the loop.

Note that none of these changes currently changes the code inserted by LSR,
but they are a stepping stone to getting there.

This code is the result of some crazy pair programming with Nate. :)

llvm-svn: 29493
2006-08-03 06:34:50 +00:00
Chris Lattner
92a0b69813 Add some advice
llvm-svn: 29324
2006-07-27 04:24:14 +00:00
Chris Lattner
3890fa2c4a Minor comment tweaks
llvm-svn: 29226
2006-07-20 19:06:16 +00:00
Owen Anderson
3d84b9e0cc Add an assertion.
llvm-svn: 29199
2006-07-19 05:48:45 +00:00
Owen Anderson
7d68cbc39a Make LoopUnroll not die on LCSSA Phis. This makes lencod work again.
llvm-svn: 29198
2006-07-19 05:45:14 +00:00
Owen Anderson
8a36136176 Fix an error that hadn't yet caused any problems, but I'm sure it would have
somewhere down the road.

llvm-svn: 29197
2006-07-19 03:51:48 +00:00
Evan Cheng
725fc9e73d Only reuse a previous IV if it would not require a type conversion.
llvm-svn: 29186
2006-07-18 19:07:58 +00:00
Owen Anderson
715e6d06e6 Hopefully the final attempt at making IndVars preserve LCSSA.
This should fix PR 831.

llvm-svn: 29141
2006-07-14 18:49:15 +00:00
Chris Lattner
f404323e66 Revert this patch temporarily until PR831 is fixed.
llvm-svn: 29134
2006-07-13 19:05:20 +00:00
Owen Anderson
d4000ea452 IndVars now (correctly) preserves LCSSA form.
llvm-svn: 29126
2006-07-12 21:29:14 +00:00
Chris Lattner
8c650b87d6 Silence a warning produced in assertions-disabled mode
llvm-svn: 29108
2006-07-11 18:31:26 +00:00
Owen Anderson
0dd2844c05 Revert my indvars changes because they were breaking things. Unfortunately this
didn't start showing up until after the recent instcombine fixes.

llvm-svn: 29102
2006-07-11 07:25:33 +00:00
Owen Anderson
bc6e3cc1b3 Add a comment, and fix a typo that broke the build.
llvm-svn: 29094
2006-07-10 22:15:25 +00:00
Owen Anderson
ea91b4ae30 Don't indent the entire function.
llvm-svn: 29093
2006-07-10 22:03:18 +00:00
Chris Lattner
8c5c35af13 Recognize 16-bit bswaps by relaxing overconstrained pattern.
This implements Transforms/InstCombine/bswap.ll:test[34].

llvm-svn: 29087
2006-07-10 20:25:24 +00:00
Owen Anderson
81986ede26 Make instcombine not remove Phi nodes when LCSSA is live.
llvm-svn: 29083
2006-07-10 19:03:49 +00:00
Chris Lattner
496bd3fbf6 Use hidden visibility to make symbols in an anonymous namespace get
dropped.  This shrinks libllvmgcc.dylib another 67K

llvm-svn: 28975
2006-06-28 23:17:24 +00:00
Chris Lattner
3aac973374 Shrink libllvmgcc.dylib by another 23K
llvm-svn: 28972
2006-06-28 22:08:15 +00:00
Owen Anderson
8f95262124 Switch to a very conservative heuristic for determining when loop-unswitching
will be profitable.  This is mainly to remove some cases where excessive
unswitching would result in long compile times and/or huge generated code.

Once someone comes up with a better heuristic that avoids these cases, this
should be switched out.

llvm-svn: 28962
2006-06-28 17:47:50 +00:00
Chris Lattner
d6dbd6d552 Fix Transforms/InstCombine/2006-06-28-infloop.ll
llvm-svn: 28961
2006-06-28 17:34:50 +00:00
Chris Lattner
1d8a141786 Don't unswitch really large loops even if they are mostly filled with empty
blocks.

llvm-svn: 28959
2006-06-28 16:38:55 +00:00
Andrew Lenharth
764ee8eb29 Catch more function pointer casting problems.
Remove the Function pointer cast in these calls, converting it to
a cast of the argument.
%tmp60 = tail call int cast (int (ulong)* %str to int (int)*)( int 10 )
%tmp60 = tail call int cast (int (ulong)* %str to int (int)*)( uint %tmp51 )

llvm-svn: 28953
2006-06-28 01:01:52 +00:00
Owen Anderson
f9dbb7c834 Fix for 2006-06-27-DeadSwitchCase.ll
Be more careful when updating Phi nodes after eliminating dead switch cases.  Fix
proposed by Chris.

llvm-svn: 28947
2006-06-27 22:26:09 +00:00
Owen Anderson
1c3b04d485 De-pessimize the handling of LCSSA Phi nodes in IndVarSimplify. Hopefully this
will make Shootout-C/nestedloop faster.

llvm-svn: 28924
2006-06-27 02:17:08 +00:00
Chris Lattner
b12f94b14a random code cleanups, no functionality change
llvm-svn: 28914
2006-06-26 19:10:05 +00:00
Owen Anderson
71056f7113 Make LoopUnswitch able to unswitch loops with live-out values by taking advantage
of LCSSA.  This results in several times the number of unswitchings occurring on
tests such as timberwolfmc, unix-tbl, and ldecod.

llvm-svn: 28912
2006-06-26 07:44:36 +00:00
Chris Lattner
25b04b4249 Fix IndVarsSimplify/2006-06-16-Indvar-LCSSA-Crash.ll, a case where an
"LCSSA" phi node causes indvars to break dominance properties.  This fix
causes indvars to avoid inserting aggressive code in this case; instead,
indvars should be fixed to be more aggressive in the face of LCSSA phis.

llvm-svn: 28850
2006-06-17 01:02:31 +00:00
Chris Lattner
d99c49c826 Implement Transforms/InstCombine/bswap.ll, turning common shift/and/or bswap
idioms into bswap intrinsics.

llvm-svn: 28803
2006-06-15 19:07:26 +00:00
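
Editor's note (not part of the commit message above): one common 32-bit
shift/and/or idiom of the kind this pass recognizes, supplied by the editor for
illustration and assuming a 32-bit unsigned; it is not taken from the commit or
its testcase.

unsigned bswap32_idiom(unsigned x) {
        return ((x & 0x000000FFu) << 24) |
               ((x & 0x0000FF00u) <<  8) |
               ((x & 0x00FF0000u) >>  8) |
               ((x & 0xFF000000u) >> 24);     /* recognized as a single bswap */
}
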
Chris Lattner
68a0b5c8a0 Fix Transforms/LoopUnswitch/2006-06-13-SingleEntryPHI.ll, a loop unswitch
bug exposed by the recent lcssa work.

llvm-svn: 28779
2006-06-14 04:46:17 +00:00
Owen Anderson
276e728e4b Reapply my 6/9 changes. The bug Evan saw no longer occurs.
llvm-svn: 28759
2006-06-12 21:49:21 +00:00
Evan Cheng
d99c8e2e5f Back out Owen's 6/9 changes. They broke MultiSource/Benchmarks/Prolangs-C/bison (and perhaps others).
llvm-svn: 28747
2006-06-11 09:32:57 +00:00
Owen Anderson
4a0ceb1e6d Add LCSSA as a requirement for LoopUnswitch, and assert that LoopUnswitch preserves
LCSSA.

llvm-svn: 28739
2006-06-09 18:40:32 +00:00
Evan Cheng
6039769dc1 RewriteExpr, either the new PHI node of induction variable or the
post-increment value, should first be cast to the appropriate type (to the
type of the common expr). Otherwise, the rewrite of a use based on (common +
iv) may end up with an incorrect type.

llvm-svn: 28735
2006-06-09 00:12:42 +00:00
Reid Spencer
59137abcac Fix a spello in a comment.
llvm-svn: 28714
2006-06-07 21:24:10 +00:00
Chris Lattner
19394c3f30 Fix a bug in a recent patch. This fixes UnitTests/Vector/Altivec/casts.c on
PPC/altivec

llvm-svn: 28698
2006-06-06 22:26:02 +00:00
Chris Lattner
6cca762d5f Remove unneeded hook. Patch by Anton K. Thanks!
llvm-svn: 28664
2006-06-02 19:11:46 +00:00
Chris Lattner
339d8b5ba9 Silence a -pedantic warning.
llvm-svn: 28632
2006-06-01 17:16:21 +00:00
Chris Lattner
36f99c7c1d Swap the order of operands created here. For +&|^, the order doesn't matter,
but for sub, it really does!  This fixes a miscompilation of fibheap_cut in
llvmgcc4.

llvm-svn: 28600
2006-05-31 21:14:00 +00:00
Chris Lattner
0043931185 Implement Transforms/InstCombine/store.ll:test2.
llvm-svn: 28503
2006-05-26 19:19:20 +00:00
Chris Lattner
c6c2770e08 Transform things like (splat(splat)) -> splat
llvm-svn: 28490
2006-05-26 00:29:06 +00:00
Chris Lattner
261299e3f5 Introduce a helper function that simplifies interpretation of shuffle masks.
No functionality change.

llvm-svn: 28489
2006-05-25 23:48:38 +00:00
Chris Lattner
c678c720a7 Turn (cast (shuffle (cast))) -> shuffle (cast) if it reduces the # of casts in
the program.  This exposes more opportunities for the instcombiner, and implements
vec_shuffle.ll:test6

llvm-svn: 28487
2006-05-25 23:24:33 +00:00
Chris Lattner
5df88a112b extract element from a shuffle vector can be trivially turned into an
extractelement from the SV's source.  This implements vec_shuffle.ll:test[45]

llvm-svn: 28485
2006-05-25 22:53:38 +00:00
Chris Lattner
1ddd46999b Silence a bogus gcc warning
llvm-svn: 28422
2006-05-20 23:14:03 +00:00
Chris Lattner
cc9a99f371 Declare that lowerinvoke doesn't interact with other lowering passes.
Patch written by Domagoj Babic!

llvm-svn: 28367
2006-05-17 21:05:27 +00:00
Evan Cheng
111642322d Backing out last check-in for now. It's causing an infinite loop in gccas on lencode.
llvm-svn: 28284
2006-05-14 06:46:03 +00:00
Chris Lattner
c927dced9e Add/Sub/Mul are safe to promote here as well. Incrementing a single-bit
bitfield now gives this code:

_plus:
        lwz r2, 0(r3)
        rlwimi r2, r2, 0, 1, 31
        xoris r2, r2, 32768
        stw r2, 0(r3)
        blr

instead of this:

_plus:
        lwz r2, 0(r3)
        srwi r4, r2, 31
        slwi r4, r4, 31
        addis r4, r4, -32768
        rlwimi r2, r4, 0, 0, 0
        stw r2, 0(r3)
        blr

this can obviously still be improved.

llvm-svn: 28275
2006-05-13 02:16:08 +00:00
Chris Lattner
eea864472d Implement simple promotion for cast elimination in instcombine. This is
currently very limited, but can be extended in the future.  For example,
we now compile:

uint %test30(uint %c1) {
        %c2 = cast uint %c1 to ubyte
        %c3 = xor ubyte %c2, 1
        %c4 = cast ubyte %c3 to uint
        ret uint %c4
}

to:

_xor:
        movzbl 4(%esp), %eax
        xorl $1, %eax
        ret

instead of:

_xor:
        movb $1, %al
        xorb 4(%esp), %al
        movzbl %al, %eax
        ret

More impressively, we now compile:

struct B { unsigned bit : 1; };
void xor(struct B *b) { b->bit = b->bit ^ 1; }

To (X86/PPC):

_xor:
        movl 4(%esp), %eax
        xorl $-2147483648, (%eax)
        ret
_xor:
        lwz r2, 0(r3)
        xoris r2, r2, 32768
        stw r2, 0(r3)
        blr

instead of (X86/PPC):

_xor:
        movl 4(%esp), %eax
        movl (%eax), %ecx
        movl %ecx, %edx
        shrl $31, %edx
        # TRUNCATE movb %dl, %dl
        xorb $1, %dl
        movzbl %dl, %edx
        andl $2147483647, %ecx
        shll $31, %edx
        orl %ecx, %edx
        movl %edx, (%eax)
        ret

_xor:
        lwz r2, 0(r3)
        srwi r4, r2, 31
        xori r4, r4, 1
        rlwimi r2, r4, 31, 0, 0
        stw r2, 0(r3)
        blr

This implements InstCombine/cast.ll:test30.

llvm-svn: 28273
2006-05-13 02:06:03 +00:00
Chris Lattner
3fad520c62 Refactor some code, making it simpler.
When doing the initial pass of constant folding, if we get a constantexpr,
simplify the constant expr like we would do if the constant is folded in the
normal loop.

This fixes the missed-optimization regression in
Transforms/InstCombine/getelementptr.ll last night.

llvm-svn: 28224
2006-05-11 17:11:52 +00:00
Chris Lattner
e8fe3f2a08 Two changes:
1. Implement InstCombine/deadcode.ll by not adding instructions in unreachable
   blocks (due to constants in conditional branches/switches) to the worklist.
   This causes them to be deleted before instcombine starts up, leading to
   better optimization.

2. In the prepass over instructions, do trivial constprop/dce as we go.  This
   has the effect of improving the effectiveness of #1.  In addition, it
   *significantly* speeds up instcombine on test cases with large amounts of
   constant folding code (for example, that produced by code specialization
   or partial evaluation).  In one example, it speeds up instcombine from
   0.0589s to 0.0224s with a release build (a 2.6x speedup).

llvm-svn: 28215
2006-05-10 19:00:36 +00:00
Chris Lattner
f49a22d601 Patch to make some xforms preserve each other. Patch contributed by
Domagoj Babic!

llvm-svn: 28181
2006-05-09 04:13:41 +00:00
Chris Lattner
7661770087 Move some code around.
Make the "fold (and (cast A), (cast B)) -> (cast (and A, B))" transformation
only apply when both casts really will cause code to be generated.  If one or
both doesn't, then this xform doesn't remove a cast.

This fixes Transforms/InstCombine/2006-05-06-Infloop.ll

llvm-svn: 28141
2006-05-06 09:00:16 +00:00
Chris Lattner
73bfc4c2ea Fix an infinite loop compiling oggenc last night.
llvm-svn: 28128
2006-05-05 20:51:30 +00:00
Chris Lattner
95637c4889 Implement InstCombine/cast.ll:test29
llvm-svn: 28126
2006-05-05 06:39:07 +00:00
Chris Lattner
55938f67ae Fix Transforms/InstCombine/2006-05-04-DemandedBitCrash.ll
llvm-svn: 28101
2006-05-04 17:33:35 +00:00
Chris Lattner
1275941193 Add pass ID's for various passes, so they can be AddRequiredID. Patch by
Domagoj Babic!

llvm-svn: 28048
2006-05-02 04:24:36 +00:00
Chris Lattner
4c79d3b238 Fix InstCombine/2006-04-28-ShiftShiftLongLong.ll
llvm-svn: 28019
2006-04-28 22:21:41 +00:00
Chris Lattner
2118062b9c Fix Transforms/Reassociate/2006-04-27-ReassociateVector.ll
llvm-svn: 28007
2006-04-28 04:14:49 +00:00
Chris Lattner
3f6a151e2d Add support for inserting undef into a vector. This implements
Transforms/InstCombine/vec_insert_to_shuffle.ll

llvm-svn: 27997
2006-04-27 21:14:21 +00:00
Chris Lattner
01afcd337a Fix Transforms/ScalarRepl/2006-04-20-PromoteCrash.ll
llvm-svn: 27912
2006-04-20 20:48:50 +00:00
Andrew Lenharth
e2b150550a Make code match cvs commit message :)
llvm-svn: 27881
2006-04-20 15:41:37 +00:00
Andrew Lenharth
8f08647a6d If we can convert the return pointer type into an integer that IntPtrType
can be converted to losslessly, we can continue the conversion to a direct call.

llvm-svn: 27880
2006-04-20 14:56:47 +00:00
Chris Lattner
76beeb373d Turn x86 unaligned load/store intrinsics into aligned load/store instructions
if the pointer is known aligned.

llvm-svn: 27781
2006-04-17 22:26:56 +00:00
Chris Lattner
4422d3de1b Fix a bug in the 'shuffle(undef,x,mask) -> shuffle(x, undef,mask')' xform
Make the insert/extract elt -> shuffle code more aggressive.

This fixes CodeGen/PowerPC/vec_shuffle.ll

llvm-svn: 27728
2006-04-16 00:51:47 +00:00
Chris Lattner
da260db137 Canonicalize shuffle(undef,x,mask) -> shuffle(x, undef,mask').
llvm-svn: 27727
2006-04-16 00:03:56 +00:00
Chris Lattner
055889cfb9 significant cleanups to code that uses insert/extractelt heavily. This builds
maximal shuffles out of them where possible.

llvm-svn: 27717
2006-04-15 01:39:45 +00:00
Chris Lattner
b3cae60d0b Teach scalarrepl to promote unions of vectors and floats, producing
insert/extractelement operations.  This implements
Transforms/ScalarRepl/vector_promote.ll

llvm-svn: 27710
2006-04-14 21:42:41 +00:00
Reid Spencer
56aa7c79b7 Get rid of a signed/unsigned compare warning.
llvm-svn: 27625
2006-04-12 19:28:15 +00:00
Chris Lattner
7900e6da3b Turn casts into getelementptr's when possible. This enables SROA to be more
aggressive in some cases where LLVMGCC 4 is inserting casts for no reason.

This implements InstCombine/cast.ll:test27/28.

llvm-svn: 27620
2006-04-12 18:09:35 +00:00
Chris Lattner
ec4fbd3b41 Implement vec_shuffle.ll:test3
llvm-svn: 27573
2006-04-10 23:06:36 +00:00
Chris Lattner
42be18f65f Implement InstCombine/vec_shuffle.ll:test[12]
llvm-svn: 27571
2006-04-10 22:45:52 +00:00
Chris Lattner
a0a718c0cc Add support for shufflevector
llvm-svn: 27513
2006-04-08 01:19:12 +00:00
Chris Lattner
bc0489232b Lower vperm(x,y, mask) -> shuffle(x,y,mask) if mask is constant. This allows
us to compile oh-so-realistic stuff like this:

 vec_vperm(A, B, (vector unsigned char){14});

to:
        vspltb v0, v0, 14

instead of:

        vspltisb v0, 14
        vperm v0, v2, v1, v0

llvm-svn: 27452
2006-04-06 19:19:17 +00:00
Chris Lattner
42a1e621f1 vector casts of casts are eliminable. Transform this:
%tmp = cast <4 x uint> %tmp to <4 x int>                ; <<4 x int>> [#uses=1]
        %tmp = cast <4 x int> %tmp to <4 x float>               ; <<4 x float>> [#uses=1]

into:

        %tmp = cast <4 x uint> %tmp to <4 x float>              ; <<4 x float>> [#uses=1]

llvm-svn: 27355
2006-04-02 05:43:13 +00:00
Chris Lattner
3c994295fe Allow transforming this:
%tmp = cast <4 x uint>* %testData to <4 x int>*         ; <<4 x int>*> [#uses=1]
        %tmp = load <4 x int>* %tmp             ; <<4 x int>> [#uses=1]

to this:

        %tmp = load <4 x uint>* %testData               ; <<4 x uint>> [#uses=1]
        %tmp = cast <4 x uint> %tmp to <4 x int>                ; <<4 x int>> [#uses=1]

llvm-svn: 27353
2006-04-02 05:37:12 +00:00
Chris Lattner
cb26b2dfe8 Turn altivec lvx/stvx intrinsics into loads and stores. This allows the
elimination of one load from this:

int AreSecondAndThirdElementsBothNegative( vector float *in ) {
#define QNaN 0x7FC00000
const vector unsigned int testData = (vector unsigned int)( QNaN, 0, 0, QNaN );
vector float test = vec_ld( 0, (float*) &testData );
return ! vec_any_ge( test, *in );
}

Now generating:

_AreSecondAndThirdElementsBothNegative:
        mfspr r2, 256
        oris r4, r2, 49152
        mtspr 256, r4
        li r4, lo16(LCPI1_0)
        lis r5, ha16(LCPI1_0)
        addi r6, r1, -16
        lvx v0, r5, r4
        stvx v0, 0, r6
        lvx v1, 0, r3
        vcmpgefp. v0, v0, v1
        mfcr r3, 2
        rlwinm r3, r3, 27, 31, 31
        xori r3, r3, 1
        cntlzw r3, r3
        srwi r3, r3, 5
        mtspr 256, r2
        blr

llvm-svn: 27352
2006-04-02 05:30:25 +00:00
Chris Lattner
c2e9b030da Fix InstCombine/2006-04-01-InfLoop.ll
llvm-svn: 27330
2006-04-01 22:05:01 +00:00
Chris Lattner
497bbd4650 Fold A^(B&A) -> (B&A)^A
Fold (B&A)^A == ~B & A

This implements InstCombine/xor.ll:test2[56]

llvm-svn: 27328
2006-04-01 08:03:55 +00:00
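
Editor's note (not part of the commit message above): a C-level sketch of the
second identity, (B&A)^A == ~B & A; function names are hypothetical.

unsigned before(unsigned a, unsigned b) {
        return (b & a) ^ a;
}
unsigned after(unsigned a, unsigned b) {
        return ~b & a;                /* bitwise identical to the version above */
}
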
Chris Lattner
79819f52dc If we can look through vector operations to find the scalar version of an
extract_element'd value, do so.

llvm-svn: 27323
2006-03-31 23:01:56 +00:00
Chris Lattner
0af2e8be73 extractelement(undef,x) -> undef
llvm-svn: 27300
2006-03-31 18:25:14 +00:00
Chris Lattner
e57e873543 Fix Transforms/InstCombine/2006-03-30-ExtractElement.ll
llvm-svn: 27261
2006-03-30 22:02:40 +00:00
Chris Lattner
bc48efc7fa Don't crash on packed logical ops
llvm-svn: 27125
2006-03-25 21:58:26 +00:00
Chris Lattner
826dbd7173 Fix spello
llvm-svn: 27052
2006-03-24 07:14:34 +00:00
Chris Lattner
37951daad5 add the actual cost to the debug info
llvm-svn: 27051
2006-03-24 07:14:00 +00:00
Jim Laskey
5df6eab07d Can't combine anymore - we don't have a chain through llvm.dbg intrinsics.
llvm-svn: 26992
2006-03-23 18:10:42 +00:00
Chris Lattner
a779db7473 silence a bogus gcc warning
llvm-svn: 26953
2006-03-22 17:27:24 +00:00
Chris Lattner
986357c54c Teach cee to propagate through switch statements. This implements
Transforms/CorrelatedExprs/switch.ll

Patch contributed by Eric Kidd!

llvm-svn: 26872
2006-03-19 19:37:24 +00:00
Evan Cheng
7955a187b1 - Fixed a bogus if condition.
- Added more debugging info.
- Allow reuse of IV of negative stride. e.g. -4 stride == 2 * iv of -2 stride.

llvm-svn: 26841
2006-03-18 08:03:12 +00:00
Evan Cheng
afe1ee3496 Sort StrideOrder so we can process the smallest strides first. This allows
for more IV reuses.

llvm-svn: 26837
2006-03-18 00:44:49 +00:00
Evan Cheng
fa1b885135 Allow users of iv / stride to be rewritten with expression that is a multiply
of a smaller stride even if they have a common loop invariant expression part.

llvm-svn: 26828
2006-03-17 19:52:23 +00:00
Evan Cheng
db35180b27 For each loop, keep track of all the IV expressions inserted indexed by
stride. For a set of uses of the IV of a stride which is a multiple
of another stride, do not insert a new IV expression. Rather, reuse the
previous IV and rewrite the uses as uses of IV expression multiplied by
the factor.

e.g.
x = 0 ...; x ++
y = 0 ...; y += 4
then use of y can be rewritten as use of 4*x for x86.

llvm-svn: 26803
2006-03-16 21:53:05 +00:00
Chris Lattner
92090188c3 Implement a FIXME, recursively reassociating
A*A*B + A*A*C   -->   A*(A*B+A*C)   -->   A*(A*(B+C))

This implements Reassociate/mul-factor3.ll

llvm-svn: 26757
2006-03-14 16:04:29 +00:00
Chris Lattner
9ccb2201b8 extract some code into a method, no functionality change
llvm-svn: 26755
2006-03-14 07:11:11 +00:00
Chris Lattner
0e3d81d8ff Promote shifts by a constant to multiplies so that we can reassociate
(x<<1)+(y<<1) -> (X+Y)<<1.  This implements
Transforms/Reassociate/shift-factor.ll

llvm-svn: 26753
2006-03-14 06:55:18 +00:00
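
Editor's note (not part of the commit message above): a C-level sketch of the
(x<<1)+(y<<1) -> (X+Y)<<1 reassociation on unsigned values; function names are
hypothetical.

unsigned before(unsigned x, unsigned y) {
        return (x << 1) + (y << 1);   /* two shifts */
}
unsigned after(unsigned x, unsigned y) {
        return (x + y) << 1;          /* one shift after factoring */
}
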
Evan Cheng
692235499c Added target lowering hooks which LSR consults to make more intelligent
transformation decisions.

llvm-svn: 26738
2006-03-13 23:14:23 +00:00
Chris Lattner
256eff3ac6 Fix a miscompilation of 188.ammp with the new CFE. 188.ammp is accessing
arrays out of range in a horrible way, but we shouldn't break it anyway.
Details in the comments.

llvm-svn: 26606
2006-03-08 01:05:29 +00:00
Chris Lattner
8a548b3e7d Teach the alignment handling code to look through constant expr casts and GEPs
llvm-svn: 26580
2006-03-07 01:28:57 +00:00
Chris Lattner
58fe521b5b Teach instcombine to increase the alignment of memset/memcpy/memmove when
the pointer is known to come from either a global variable, alloca or
malloc.  This allows us to compile this:

  P = malloc(28);
  memset(P, 0, 28);

into explicit stores on PPC instead of a memset call.

llvm-svn: 26577
2006-03-06 20:18:44 +00:00
Chris Lattner
43e9ec760b Make vector narrowing more effective, implementing
Transforms/InstCombine/vec_narrow.ll.  This add support for narrowing
extract_element(insertelement) also.

llvm-svn: 26538
2006-03-05 00:22:33 +00:00
Chris Lattner
7694fbc4bb Add factoring of multiplications, e.g. turning A*A+A*B into A*(A+B).
Testcase here: Transforms/Reassociate/mulfactor.ll

llvm-svn: 26524
2006-03-04 09:31:13 +00:00
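
Editor's note (not part of the commit message above): the A*A+A*B -> A*(A+B)
factoring written out in C; function names are hypothetical.

int before(int a, int b) {
        return a * a + a * b;         /* two multiplies and an add */
}
int after(int a, int b) {
        return a * (a + b);           /* one multiply and an add */
}
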
Chris Lattner
f526a4e5f6 Canonicalize (X+C1)*C2 -> X*C2+C1*C2
This implements Transforms/InstCombine/add.ll:test31

llvm-svn: 26519
2006-03-04 06:04:02 +00:00
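
Editor's note (not part of the commit message above): the (X+C1)*C2 -> X*C2+C1*C2
canonicalization written out in C with arbitrary constants; function names are
hypothetical.

int before(int x) {
        return (x + 3) * 5;           /* (X+C1)*C2 */
}
int after(int x) {
        return x * 5 + 15;            /* X*C2 + C1*C2 */
}
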
Chris Lattner
45ee76ee68 Change this to work with renamed intrinsics.
llvm-svn: 26484
2006-03-03 01:34:17 +00:00
Chris Lattner
092112baa6 Generalize the REM folding code to handle another case Nick Lewycky
pointed out: realize the AND can provide factors and look through Casts.

llvm-svn: 26469
2006-03-02 06:50:58 +00:00
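
Editor's note (not part of the commit message above): one plausible reading of
"the AND can provide factors", sketched in C by the editor; the exact cases the
commit handles may differ.

int before(int x) {
        return (x & ~3) % 4;          /* the AND clears the low two bits, so the
                                         value is a multiple of 4 */
}
int after(int x) {
        return 0;                     /* the remainder folds away */
}
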
Chris Lattner
74e3523663 Fix a regression in a patch from a couple of days ago. This fixes
Transforms/InstCombine/2006-02-28-Crash.ll

llvm-svn: 26427
2006-02-28 19:47:20 +00:00
Chris Lattner
684fc3dc9d Implement rem.ll:test[7-9] and PR712
llvm-svn: 26415
2006-02-28 05:49:21 +00:00
Chris Lattner
63647f5028 Simplify some code now that the RHS of a rem can't be 0
llvm-svn: 26413
2006-02-28 05:40:55 +00:00
Chris Lattner
3b9fc06289 Rearrange some code, fold "rem X, 0", implementing rem.ll:test6
llvm-svn: 26411
2006-02-28 05:30:45 +00:00
Chris Lattner
792bfd8f28 Merge two almost-identical pieces of code.
Make this code more powerful by using ComputeMaskedBits instead of looking
for an AND operand.  This lets us fold this:

int %test23(int %a) {
        %tmp.1 = and int %a, 1
        %tmp.2 = seteq int %tmp.1, 0
        %tmp.3 = cast bool %tmp.2 to int  ;; xor tmp1, 1
        ret int %tmp.3
}

into: xor (and a, 1), 1
llvm-svn: 26396
2006-02-27 02:38:23 +00:00
Chris Lattner
c27bba037b Fold (A^B) == A -> B == 0
and  (A-B) == A  ->  B == 0

llvm-svn: 26394
2006-02-27 01:44:11 +00:00
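
Editor's note (not part of the commit message above): both folds written out in
C; function names are hypothetical.

int xor_case(int a, int b) {
        return (a ^ b) == a;          /* folds to b == 0 */
}
int sub_case(int a, int b) {
        return (a - b) == a;          /* folds to b == 0 */
}
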
Chris Lattner
80e2fa8a9d Fold (X|C1)^C2 -> X^(C1|C2) when possible. This implements
InstCombine/or.ll:test23.

llvm-svn: 26385
2006-02-26 19:57:54 +00:00
Chris Lattner
304aeda827 Fix a problem that Nate noticed that boils down to an over conservative check
in the code that does "select C, (X+Y), (X-Y) --> (X+(select C, Y, (-Y)))".
We now compile this loop:

LBB1_1: ; no_exit
        add r6, r2, r3
        subf r3, r2, r3
        cmpwi cr0, r2, 0
        addi r7, r5, 4
        lwz r2, 0(r5)
        addi r4, r4, 1
        blt cr0, LBB1_4 ; no_exit
LBB1_3: ; no_exit
        mr r3, r6
LBB1_4: ; no_exit
        cmpwi cr0, r4, 16
        mr r5, r7
        bne cr0, LBB1_1 ; no_exit

into this instead:

LBB1_1: ; no_exit
        srawi r6, r2, 31
        add r2, r2, r6
        xor r6, r2, r6
        addi r7, r5, 4
        lwz r2, 0(r5)
        addi r4, r4, 1
        add r3, r3, r6
        cmpwi cr0, r4, 16
        mr r5, r7
        bne cr0, LBB1_1 ; no_exit

llvm-svn: 26356
2006-02-24 18:05:58 +00:00
Chris Lattner
fafc1f9c51 Fix Regression/Transforms/LoopUnswitch/2006-02-22-UnswitchCrash.ll, which
caused SPASS to fail building last night.

We can't trivially unswitch a loop if the exit block has phi nodes in it,
because we don't know which predecessor to use.

llvm-svn: 26320
2006-02-22 23:55:00 +00:00
Chris Lattner
b539c70e07 Add some comments, simplify some code, and fix a bug that caused rewriting
to rewrite with the wrong value.

llvm-svn: 26311
2006-02-22 06:37:14 +00:00