Commit Graph

7947 Commits

Author SHA1 Message Date
Chris Lattner
306dd8a44a Enhance hasConstantValue to ignore undef values in phi nodes. This allows it
to think that PHI[4, undef] == 4.
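
As a hypothetical C-level illustration (the function f is not from the commit), this is the shape of code that produces such a phi after mem2reg:

int f(int c) {
    int x;          /* uninitialized on the fall-through path => undef */
    if (c)
        x = 4;
    return x;       /* merge point is PHI[4, undef], which folds to 4 */
}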

llvm-svn: 17096
2004-10-17 21:23:26 +00:00
Chris Lattner
ef0888e493 hasConstantValue will soon return instructions that don't dominate the PHI node,
so prepare for this.

llvm-svn: 17095
2004-10-17 21:22:38 +00:00
Chris Lattner
caf0d76a8a The first hunk corrects a bug when printing undef null values. We would print
0->field, which is illegal.  Now we print ((foo*)0)->field.

The second hunk is an optimization to not print undefined phi values.
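
For illustration, the cast form is the same idiom classic offsetof implementations use; a minimal sketch (struct foo and MY_OFFSETOF are hypothetical names):

#include <stddef.h>

struct foo { int field; };

/* 0->field does not parse, but casting the null constant to the
   struct pointer type first makes the expression legal C syntax: */
#define MY_OFFSETOF(type, member) ((size_t)&((type *)0)->member)

size_t field_offset(void) {
    return MY_OFFSETOF(struct foo, field);   /* 0 */
}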

llvm-svn: 17094
2004-10-17 17:48:59 +00:00
Chris Lattner
f96fb0c946 Don't print stuff out from the code generator. This broke the JIT horribly
last night. :)  bork!

llvm-svn: 17093
2004-10-17 17:40:50 +00:00
Reid Spencer
210d95cffb Make the library name SparcV9-specific
llvm-svn: 17089
2004-10-17 15:01:12 +00:00
Reid Spencer
7ece7ff509 Consolidate the definitions
llvm-svn: 17088
2004-10-17 15:00:26 +00:00
Reid Spencer
9a97056275 Use PPC32GenCodeEmitter instead of PowerPCGenCodeEmitter
llvm-svn: 17087
2004-10-17 14:59:38 +00:00
Chris Lattner
63e6bdd207 Rewrite support for cast uint -> FP. In particular, we used to compile this:
double %test(uint %X) {
        %tmp.1 = cast uint %X to double         ; <double> [#uses=1]
        ret double %tmp.1
}

into:

test:
        sub %ESP, 8
        mov %EAX, DWORD PTR [%ESP + 12]
        mov %ECX, 0
        mov DWORD PTR [%ESP], %EAX
        mov DWORD PTR [%ESP + 4], %ECX
        fild QWORD PTR [%ESP]
        add %ESP, 8
        ret

... which zero extends the value to 8 bytes, then does an fild of an
8-byte signed int.

Now we generate this:


test:
        sub %ESP, 4
        mov %EAX, DWORD PTR [%ESP + 8]
        mov DWORD PTR [%ESP], %EAX
        fild DWORD PTR [%ESP]
        shr %EAX, 31
        fadd DWORD PTR [.CPItest_0 + 4*%EAX]
        add %ESP, 4
        ret

        .section .rodata
        .align  4
.CPItest_0:
        .quad   5728578726015270912

This does a 32-bit signed integer load, then adds in an offset if the sign
bit of the integer was set.
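
The same trick, sketched in C (u2d is a hypothetical name; adj[] plays the role of .CPItest_0, whose quad packs the two floats 0.0f and 4294967296.0f):

double u2d(unsigned x) {
    /* Convert as a signed 32-bit integer, then add 2^32 back in
       when the sign bit was set (the table index is the sign bit). */
    static const float adj[2] = { 0.0f, 4294967296.0f };
    return (double)(int)x + adj[x >> 31];
}

(Casting an out-of-range unsigned to int is implementation-defined in portable C; the sketch assumes the usual two's complement wraparound.)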

It turns out that this is substantially faster than the preceding sequence.
Consider this testcase:

unsigned a[2]={1,2};
volatile double G;

int main() {
    int i;
    for (i = 0; i < 100000000; ++i)
        G += a[i&1];
}

On zion (a P4 Xeon, 3 GHz), this patch speeds up the testcase from 2.140s
to 0.94s.

On apoc, an Athlon MP 2100+, this patch speeds up the testcase from 1.72s
to 1.34s.

Note that the program takes 2.5s/1.97s on zion/apoc with GCC 3.3 -O3
-fomit-frame-pointer.

llvm-svn: 17083
2004-10-17 08:01:28 +00:00
Chris Lattner
bf114f32c0 Unify handling of constant pool indexes with the other code paths, allowing
us to use index registers for CPIs

llvm-svn: 17082
2004-10-17 07:49:45 +00:00
Chris Lattner
892b15538d Give the asmprinter the ability to print memrefs with a constant pool index,
index register, and scale

llvm-svn: 17081
2004-10-17 07:16:32 +00:00
Chris Lattner
2fdca0bc02 fold:
%X = and Y, constantint
%Z = setcc %X, 0

instead of emitting:

        and %EAX, 3
        test %EAX, %EAX
        je .LBBfoo2_2   # UnifiedReturnBlock

We now emit:

        test %EAX, 3
        je .LBBfoo2_2   # UnifiedReturnBlock

This triggers 581 times on 176.gcc for example.
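
A hypothetical C source for the pattern (the function name loosely echoes the .LBBfoo2 labels above):

int foo2(int y) {
    /* %X = and Y, 3 ; %Z = seteq %X, 0 -- now a single "test". */
    if ((y & 3) == 0)
        return 0;
    return 1;
}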

llvm-svn: 17080
2004-10-17 06:10:40 +00:00
Chris Lattner
3f095f3c33 All of these labels are off by one now that the unreachable instruction exists
llvm-svn: 17079
2004-10-17 05:37:47 +00:00
Nate Begeman
f9aac7846c Implement bitfield insert by recognizing the following pattern:
1. optional shift left
2. and x, immX
3. and y, immY
4. or z, x, y
==> rlwimi z, x, y, shift, mask begin, mask end

where immX == ~immY and immX is a run of set bits. This transformation
fires 32 times on voronoi, once on espresso, and probably several
dozen times on external benchmarks such as gcc.

To put this in terms of actual code generated for
struct B { unsigned a : 3; unsigned b : 2; };
void storeA (struct B *b, int v) { b->a = v;}
void storeB (struct B *b, int v) { b->b = v;}

Old:
_storeA:
        rlwinm r2, r4, 0, 29, 31
        lwz r4, 0(r3)
        rlwinm r4, r4, 0, 0, 28
        or r2, r4, r2
        stw r2, 0(r3)
        blr

_storeB:
        rlwinm r2, r4, 3, 0, 28
        rlwinm r2, r2, 0, 27, 28
        lwz r4, 0(r3)
        rlwinm r4, r4, 0, 29, 26
        or r2, r2, r4
        stw r2, 0(r3)
        blr

New:
_storeA:
        lwz r2, 0(r3)
        rlwimi r2, r4, 0, 29, 31
        stw r2, 0(r3)
        blr

_storeB:
        lwz r2, 0(r3)
        rlwimi r2, r4, 3, 27, 28
        stw r2, 0(r3)
        blr

llvm-svn: 17078
2004-10-17 05:19:20 +00:00
Chris Lattner
e5aa085c1d Fix constant folding relational operators with undef operands.
llvm-svn: 17077
2004-10-17 04:01:51 +00:00
Chris Lattner
dc55caa720 I forgot that Sparc no longer uses the shared asmwriter. Give it support
for undef.

llvm-svn: 17075
2004-10-17 02:44:45 +00:00
Chris Lattner
bb5b3f0b2f Add support for unreachable and undef
llvm-svn: 17074
2004-10-17 02:42:42 +00:00
Chris Lattner
8c86882a99 Implement constant folding of undef values.
llvm-svn: 17070
2004-10-16 23:31:32 +00:00
Chris Lattner
68f14bc09c Fix a type violation
llvm-svn: 17069
2004-10-16 23:28:04 +00:00
Nate Begeman
d4c970aa3d Finally fix one of the oldest FIXMEs in the PowerPC backend: correctly
flag rotate left word immediate then mask insert (rlwimi) as a two-address
instruction, and update the ISel usage of the instruction accordingly.

This will allow us to properly schedule rlwimi, and use it to efficiently
codegen bitfield operations.

llvm-svn: 17068
2004-10-16 20:43:38 +00:00
Chris Lattner
d12442c206 Kill the bogon that slipped into my buffer before I committed.
llvm-svn: 17067
2004-10-16 19:46:33 +00:00
Chris Lattner
b55574181d Implement InstCombine/getelementptr.ll:test9, which is the source of many
ugly and giant constant exprs in some programs.

llvm-svn: 17066
2004-10-16 19:44:59 +00:00
Chris Lattner
a0019a2104 Do not erroneously accept revision 6 bytecode files when the format hasn't
been defined yet!

llvm-svn: 17063
2004-10-16 18:56:02 +00:00
Chris Lattner
3a1215ce83 Fix fix fix
llvm-svn: 17057
2004-10-16 18:21:50 +00:00
Chris Lattner
2fae8a1ef9 Add support for unreachable
llvm-svn: 17056
2004-10-16 18:21:33 +00:00
Chris Lattner
8d479b62ad Add support for undef
llvm-svn: 17055
2004-10-16 18:19:26 +00:00
Chris Lattner
8336590b1f Add support for undef, unreachable, and function flags
llvm-svn: 17054
2004-10-16 18:18:16 +00:00
Chris Lattner
eb973c8226 Parse undef and unreachable
llvm-svn: 17053
2004-10-16 18:17:13 +00:00
Chris Lattner
cbdf19fed2 Add support
llvm-svn: 17052
2004-10-16 18:16:19 +00:00
Chris Lattner
5fac2c8212 Add support for undef and unreachable
llvm-svn: 17051
2004-10-16 18:14:10 +00:00
Chris Lattner
3662abfd5a Add support for undef and unreachable
llvm-svn: 17050
2004-10-16 18:13:47 +00:00
Chris Lattner
ae2e5f4de1 Teach the X86 backend about unreachable and undef. Among other things, we
now compile:

'foo() {}' into "ret" instead of "mov EAX, 0; ret"

llvm-svn: 17049
2004-10-16 18:13:05 +00:00
Chris Lattner
08ad95ec1f Add support for unreachable and undef
llvm-svn: 17048
2004-10-16 18:12:13 +00:00
Chris Lattner
3ebca6fb19 Optimize instructions involving undef values. For example X+undef == undef.
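
A toy sketch of the idea (not LLVM's actual API or data structures): since undef may take any value, an add with an undef operand can itself fold to undef.

enum kind { VAL, UNDEF };
struct val { enum kind k; int v; };

/* Toy fold: X + undef == undef, otherwise ordinary addition. */
struct val fold_add(struct val a, struct val b) {
    if (a.k == UNDEF || b.k == UNDEF)
        return (struct val){ UNDEF, 0 };
    return (struct val){ VAL, a.v + b.v };
}
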
llvm-svn: 17047
2004-10-16 18:11:37 +00:00
Chris Lattner
4fca8caaee Add support for UndefValue
llvm-svn: 17046
2004-10-16 18:10:31 +00:00
Chris Lattner
ca01f160ee When promoting mem2reg, make uninitialized values become undef instead of 0.
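
A hypothetical example of the difference (g is not from the commit):

int g(void) {
    int x;       /* alloca with no store before the load */
    return x;    /* previously promoted to 0; now becomes undef */
}
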
llvm-svn: 17045
2004-10-16 18:10:06 +00:00
Chris Lattner
80f963c30b Handle undef values as undefined on the constant lattice;
ignore unreachable instructions.

llvm-svn: 17044
2004-10-16 18:09:41 +00:00
Chris Lattner
c630ba08cf Add note
llvm-svn: 17043
2004-10-16 18:09:25 +00:00
Chris Lattner
4a37579191 Add support for the undef value. Implement a new optimization based on globals
that are initialized with undef.  When promoting malloc to a global, start out
initialized to undef.

llvm-svn: 17042
2004-10-16 18:09:00 +00:00
Chris Lattner
93bf5a8066 Add support for undef and unreachable
llvm-svn: 17041
2004-10-16 18:08:06 +00:00
Chris Lattner
8824a5ee1c Implement UndefValue class
llvm-svn: 17040
2004-10-16 18:07:16 +00:00
Chris Lattner
72d8078a36 Add a missing dependency
llvm-svn: 17031
2004-10-16 17:12:55 +00:00
Chris Lattner
b1e427f563 Fix file header
llvm-svn: 17030
2004-10-16 16:37:42 +00:00
Chris Lattner
b3a86dc93f Be more careful about looking for constants when we really want ConstantInts.
llvm-svn: 17029
2004-10-16 16:07:10 +00:00
Chris Lattner
d3844cc216 Move the implementation of the instructions' clone methods to this file so
that the vtables for these classes are only instantiated in this translation
unit, not in every translation unit in which they are used.

llvm-svn: 17026
2004-10-15 23:52:53 +00:00
Chris Lattner
33dd5f87b8 There is no reason not to build these in parallel
llvm-svn: 17023
2004-10-15 23:22:15 +00:00
Misha Brukman
209f5ea32b Add a space between the type and name of a value when printing an error message
llvm-svn: 17022
2004-10-15 23:08:50 +00:00
Chris Lattner
b346be57a2 Don't print a bunch of metrics that are meaningless for external functions
llvm-svn: 17017
2004-10-15 19:40:31 +00:00
Chris Lattner
25b5777485 Instruction select globals with offsets better. For example, on this test
case:

int C[100];
int foo() {
  return C[4];
}

We now codegen:

foo:
        mov %EAX, DWORD PTR [C + 16]
        ret

instead of:

foo:
        mov %EAX, OFFSET C
        mov %EAX, DWORD PTR [%EAX + 16]
        ret

Other impressive features may be coming later.

This patch is contributed by Jeff Cohen!

llvm-svn: 17011
2004-10-15 05:05:29 +00:00
Chris Lattner
38de76365d Give the X86 JIT the ability to encode global+disp constants. Patch
contributed by Jeff Cohen!

llvm-svn: 17010
2004-10-15 04:53:13 +00:00
Chris Lattner
812d56631a Give the X86 asm printer the ability to print out addressing modes that have
constant displacements from global variables.  Patch by Jeff Cohen!

llvm-svn: 17009
2004-10-15 04:44:53 +00:00