mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-22 20:43:44 +02:00
Commit Graph

840 Commits

Author SHA1 Message Date
Chris Lattner
3ed3e8669f Add debug-only=jit printout, so we see when lazily resolved symbols are
set up.

llvm-svn: 17862
2004-11-15 23:16:55 +00:00
Chris Lattner
9ef34d44e1 Simplify and rearrange long shift code
llvm-svn: 17861
2004-11-15 23:16:34 +00:00
Misha Brukman
8c1b4a5b9d GhostLinkage should not reach asm printing stage
llvm-svn: 17750
2004-11-14 21:03:49 +00:00
Chris Lattner
09b7f968e0 Don't print unneeded labels
llvm-svn: 17714
2004-11-13 23:27:11 +00:00
Chris Lattner
1cde11aa95 shld is a very high latency operation. Instead of emitting it for shifts of
two or three, open-code the equivalent operation, which is faster on Athlon
and P4 (by a substantial margin).

For example, instead of compiling this:

long long X2(long long Y) { return Y << 2; }

to:

X2:
        movl 4(%esp), %eax
        movl 8(%esp), %edx
        shldl $2, %eax, %edx
        shll $2, %eax
        ret

Compile it to:

X2:
        movl 4(%esp), %eax
        movl 8(%esp), %ecx
        movl %eax, %edx
        shrl $30, %edx
        leal (%edx,%ecx,4), %edx
        shll $2, %eax
        ret

Likewise, for << 3, compile to:

X3:
        movl 4(%esp), %eax
        movl 8(%esp), %ecx
        movl %eax, %edx
        shrl $29, %edx
        leal (%edx,%ecx,8), %edx
        shll $3, %eax
        ret

This matches icc, except that icc open-codes the shifts as adds on the P4.

llvm-svn: 17707
2004-11-13 20:48:57 +00:00
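
A minimal C sketch (hypothetical helper, not the backend code) of the identity
behind the shr/lea expansion: for a 64-bit left shift by a small constant n
(n = 2 here), the new high word is (hi << n) | (lo >> (32-n)); the lea computes
it as (lo >> 30) + hi*4, and since the two terms share no bits, add equals or.

#include <assert.h>
#include <stdint.h>

static uint64_t shl2_open_coded(uint64_t y) {
    uint32_t lo = (uint32_t)y, hi = (uint32_t)(y >> 32);
    uint32_t spill  = lo >> 30;        /* shrl $30, %edx: bits crossing into the high word */
    uint32_t new_hi = spill + hi * 4;  /* leal (%edx,%ecx,4), %edx */
    uint32_t new_lo = lo << 2;         /* shll $2, %eax */
    return ((uint64_t)new_hi << 32) | new_lo;
}

int main(void) {
    assert(shl2_open_coded(0xC0000001ull) == (0xC0000001ull << 2));
    return 0;
}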
Chris Lattner
c531e090db Add missing check
llvm-svn: 17706
2004-11-13 20:04:38 +00:00
Chris Lattner
d1381380ae Compile:
long long X3_2(long long Y) { return Y+Y; }
int X(int Y) { return Y+Y; }

into:

X3_2:
        movl 4(%esp), %eax
        movl 8(%esp), %edx
        addl %eax, %eax
        adcl %edx, %edx
        ret
X:
        movl 4(%esp), %eax
        addl %eax, %eax
        ret

instead of:

X3_2:
        movl 4(%esp), %eax
        movl 8(%esp), %edx
        shldl $1, %eax, %edx
        shll $1, %eax
        ret

X:
        movl 4(%esp), %eax
        shll $1, %eax
        ret

llvm-svn: 17705
2004-11-13 20:03:48 +00:00
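
A minimal C sketch (hypothetical helper, not the backend code) of why add/adc
suffices here: doubling a 64-bit value is a 32-bit add of the low halves plus
an add-with-carry of the high halves.

#include <assert.h>
#include <stdint.h>

static uint64_t double_via_add_adc(uint64_t y) {
    uint32_t lo = (uint32_t)y, hi = (uint32_t)(y >> 32);
    uint32_t new_lo = lo + lo;          /* addl %eax, %eax */
    uint32_t carry  = new_lo < lo;      /* carry out of the low add */
    uint32_t new_hi = hi + hi + carry;  /* adcl %edx, %edx */
    return ((uint64_t)new_hi << 32) | new_lo;
}

int main(void) {
    assert(double_via_add_adc(0x80000001u) == 0x100000002ull);
    return 0;
}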
John Criswell
402e338f11 Correct the name of stosd for the AT&T syntax:
It's stosl (l for long == 32 bit).

llvm-svn: 17658
2004-11-10 04:48:15 +00:00
John Criswell
97da76178c Fix compilation problem; make the cast and the LHS be the same type.
llvm-svn: 17488
2004-11-05 16:17:06 +00:00
Chris Lattner
499e1b16a7 Quiet VC++ warnings
llvm-svn: 17484
2004-11-05 04:50:59 +00:00
Chris Lattner
d9696aa7b8 Fix a warning
llvm-svn: 17431
2004-11-02 15:27:57 +00:00
Chris Lattner
10de12fd46 Add placeholder variable to make Win32 work, applied for Morten Ofstad
llvm-svn: 17406
2004-11-01 20:10:20 +00:00
Reid Spencer
d3f7233495 Change Library Names Not To Conflict With Others When Installed
llvm-svn: 17286
2004-10-27 23:18:45 +00:00
Reid Spencer
019621a1ea Adjust to changes in Makefile.rules
llvm-svn: 17167
2004-10-22 21:02:08 +00:00
Reid Spencer
e48ba34fd4 We won't use automake
llvm-svn: 17155
2004-10-22 03:35:04 +00:00
Reid Spencer
ce514b1c2c Initial automake generated Makefile template
llvm-svn: 17136
2004-10-18 23:55:41 +00:00
Chris Lattner
cac643c78f Improve compatibility with VC++, patch contributed by Morten Ofstad!
llvm-svn: 17126
2004-10-18 15:54:17 +00:00
Chris Lattner
f96fb0c946 Don't print stuff out from the code generator. This broke the JIT horribly
last night. :)  bork!

llvm-svn: 17093
2004-10-17 17:40:50 +00:00
Chris Lattner
63e6bdd207 Rewrite support for cast uint -> FP. In particular, we used to compile this:
double %test(uint %X) {
        %tmp.1 = cast uint %X to double         ; <double> [#uses=1]
        ret double %tmp.1
}

into:

test:
        sub %ESP, 8
        mov %EAX, DWORD PTR [%ESP + 12]
        mov %ECX, 0
        mov DWORD PTR [%ESP], %EAX
        mov DWORD PTR [%ESP + 4], %ECX
        fild QWORD PTR [%ESP]
        add %ESP, 8
        ret

... which basically zero-extends to 8 bytes, then does an fild for an
8-byte signed int.

Now we generate this:

test:
        sub %ESP, 4
        mov %EAX, DWORD PTR [%ESP + 8]
        mov DWORD PTR [%ESP], %EAX
        fild DWORD PTR [%ESP]
        shr %EAX, 31
        fadd DWORD PTR [.CPItest_0 + 4*%EAX]
        add %ESP, 4
        ret

        .section .rodata
        .align  4
.CPItest_0:
        .quad   5728578726015270912

This does a 32-bit signed integer load, then adds in an offset if the sign
bit of the integer was set.

It turns out that this is substantially faster than the preceding sequence.
Consider this testcase:

unsigned a[2]={1,2};
volatile double G;

int main() {
    int i;
    for (i=0; i<100000000; ++i )
        G += a[i&1];
}

On zion (a P4 Xeon, 3GHz), this patch speeds up the testcase from 2.140s
to 0.94s.

On apoc, an Athlon MP 2100+, this patch speeds up the testcase from 1.72s
to 1.34s.

Note that the program takes 2.5s/1.97s on zion/apoc with GCC 3.3 -O3
-fomit-frame-pointer.

llvm-svn: 17083
2004-10-17 08:01:28 +00:00
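
Reading the .quad as two little-endian 32-bit floats, 5728578726015270912 is
0x4F80000000000000, i.e. {0.0f, 4294967296.0f} back to back, so the fadd picks
0 or 2^32 keyed on the sign bit. A minimal C sketch of the same trick
(hypothetical function name):

#include <assert.h>
#include <stdint.h>

static double uint_to_double(uint32_t x) {
    static const float addend[2] = { 0.0f, 4294967296.0f };  /* 0 and 2^32 */
    return (double)(int32_t)x + addend[x >> 31];  /* signed load + conditional fixup */
}

int main(void) {
    assert(uint_to_double(0xFFFFFFFFu) == 4294967295.0);
    assert(uint_to_double(7u) == 7.0);
    return 0;
}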
Chris Lattner
bf114f32c0 Unify handling of constant pool indexes with the other code paths, allowing
us to use index registers for CPIs

llvm-svn: 17082
2004-10-17 07:49:45 +00:00
Chris Lattner
892b15538d Give the asmprinter the ability to print memrefs with a constant pool index,
index reg and scale

llvm-svn: 17081
2004-10-17 07:16:32 +00:00
Chris Lattner
2fdca0bc02 fold:
  %X = and Y, constantint
  %Z = setcc %X, 0

instead of emitting:

        and %EAX, 3
        test %EAX, %EAX
        je .LBBfoo2_2   # UnifiedReturnBlock

We now emit:

        test %EAX, 3
        je .LBBfoo2_2   # UnifiedReturnBlock

This triggers 581 times on 176.gcc for example.

llvm-svn: 17080
2004-10-17 06:10:40 +00:00
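
A tiny illustrative source pattern (hypothetical name) that triggers this fold;
an and-with-constant followed by a compare against zero sets the same ZF that a
single test instruction does:

/* (y & 3) == 0: previously and + test, now one test $3. */
int low_two_bits_clear(int y) {
    return (y & 3) == 0;
}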
Chris Lattner
ae2e5f4de1 Teach the X86 backend about unreachable and undef. Among other things, we
now compile:

'foo() {}' into "ret" instead of "mov EAX, 0; ret"

llvm-svn: 17049
2004-10-16 18:13:05 +00:00
Chris Lattner
25b5777485 Instruction select globals with offsets better. For example, on this test
case:

int C[100];
int foo() {
  return C[4];
}

We now codegen:

foo:
        mov %EAX, DWORD PTR [C + 16]
        ret

instead of:

foo:
        mov %EAX, OFFSET C
        mov %EAX, DWORD PTR [%EAX + 16]
        ret

Other impressive features may be coming later.

This patch is contributed by Jeff Cohen!

llvm-svn: 17011
2004-10-15 05:05:29 +00:00
Chris Lattner
38de76365d Give the X86 JIT the ability to encode global+disp constants. Patch
contributed by Jeff Cohen!

llvm-svn: 17010
2004-10-15 04:53:13 +00:00
Chris Lattner
812d56631a Give the X86 asm printer the ability to print out addressing modes that have
constant displacements from global variables.  Patch by Jeff Cohen!

llvm-svn: 17009
2004-10-15 04:44:53 +00:00
Chris Lattner
1b9a284e54 Allow X86 addressing modes to represent globals with offsets. Patch contributed
by Jeff Cohen!

llvm-svn: 17008
2004-10-15 04:43:20 +00:00
Reid Spencer
e6418ec30f Update to reflect changes in Makefile rules.
llvm-svn: 16950
2004-10-13 11:46:52 +00:00
Reid Spencer
1b7459b29d Initial version of automake Makefile.am file.
llvm-svn: 16893
2004-10-10 22:20:40 +00:00
Chris Lattner
2419e1d27e The person who was planning to add SSE support isn't anymore, so disable
the -sse* options (to avoid misleading people).

Also, the stack alignment of the target doesn't depend on whether SSE is
eventually implemented, so remove a comment.

llvm-svn: 16860
2004-10-08 22:41:46 +00:00
Chris Lattner
1291307d27 Fix a major regression from the bugfix for 2004-10-08-SelectSetCCFold.llx,
which prevented setcc's from being folded into branches.  It appears that
conditional branchinst's CC operand is actually operand(2), not operand(0)
as we might expect. :(

llvm-svn: 16859
2004-10-08 22:24:31 +00:00
Chris Lattner
1ac1e54bf9 Fix bug: 2004-10-08-SelectSetCCFold.llx. Normally this is hidden by the
instcombine xform, which is why we didn't notice it before.

llvm-svn: 16840
2004-10-08 16:34:13 +00:00
Chris Lattner
82aa8544a5 Remove debugging code, fix encoding problem. This fixes the problems
the JIT had last night.

llvm-svn: 16766
2004-10-06 14:31:50 +00:00
Chris Lattner
b0e465f0cb Codegen signed mod by 2 or -2 more efficiently. Instead of generating:
t:
        mov %EDX, DWORD PTR [%ESP + 4]
        mov %ECX, 2
        mov %EAX, %EDX
        sar %EDX, 31
        idiv %ECX
        mov %EAX, %EDX
        ret

Generate:
t:
        mov %ECX, DWORD PTR [%ESP + 4]
***     mov %EAX, %ECX
        cdq
        and %ECX, 1
        xor %ECX, %EDX
        sub %ECX, %EDX
***     mov %EAX, %ECX
        ret

Note that the two marked moves are redundant, and should be eliminated by the
register allocator, but aren't.

Compare this to GCC, which generates:

t:
        mov     %eax, DWORD PTR [%esp+4]
        mov     %edx, %eax
        shr     %edx, 31
        lea     %ecx, [%edx+%eax]
        and     %ecx, -2
        sub     %eax, %ecx
        ret

or ICC 8.0, which generates:

t:
        movl      4(%esp), %ecx                                 #3.5
        movl      $-2147483647, %eax                            #3.25
        imull     %ecx                                          #3.25
        movl      %ecx, %eax                                    #3.25
        sarl      $31, %eax                                     #3.25
        addl      %ecx, %edx                                    #3.25
        subl      %edx, %eax                                    #3.25
        addl      %eax, %eax                                    #3.25
        negl      %eax                                          #3.25
        subl      %eax, %ecx                                    #3.25
        movl      %ecx, %eax                                    #3.25
        ret                                                     #3.25

We would be in great shape if not for the moves.

llvm-svn: 16763
2004-10-06 05:01:07 +00:00
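
A minimal C sketch (hypothetical helper; assumes arithmetic right shift of
signed int, as on x86) of the cdq/and/xor/sub sequence: take the low bit, then
conditionally negate it by the sign mask to match C's truncating remainder.

#include <assert.h>

static int mod2(int x) {
    int sign = x >> 31;          /* cdq: 0 for x >= 0, -1 for x < 0 */
    int bit  = x & 1;            /* and %ECX, 1 */
    return (bit ^ sign) - sign;  /* xor/sub: negate the bit when x < 0 */
}

int main(void) {
    assert(mod2(-3) == -1 && mod2(3) == 1 && mod2(-4) == 0);
    return 0;
}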
Chris Lattner
09b6b3f514 Fix a scary bug with signed division by a power of two. We used to generate:
s:   ;; X / 4
        mov %EAX, DWORD PTR [%ESP + 4]
        mov %ECX, %EAX
        sar %ECX, 1
        shr %ECX, 30
        mov %EDX, %EAX
        add %EDX, %ECX
        sar %EAX, 2
        ret

When we really meant:

s:
        mov %EAX, DWORD PTR [%ESP + 4]
        mov %ECX, %EAX
        sar %ECX, 1
        shr %ECX, 30
        add %EAX, %ECX
        sar %EAX, 2
        ret

Hey, this also reduces register pressure :)

llvm-svn: 16761
2004-10-06 04:19:43 +00:00
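
A minimal C sketch (hypothetical helper) of the corrected sequence: signed
division truncates toward zero, so a negative dividend needs a bias of 3 added
before the arithmetic shift; the sar/shr pair materializes that bias from the
sign bit.

#include <assert.h>

static int div4(int x) {
    int bias = (int)((unsigned)(x >> 1) >> 30);  /* sar $1, shr $30: 3 if x < 0, else 0 */
    return (x + bias) >> 2;                      /* sar %EAX, 2 */
}

int main(void) {
    assert(div4(-7) == -1 && div4(7) == 1 && div4(-8) == -2);
    return 0;
}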
Chris Lattner
9258948b08 Codegen signed divides by 2 and -2 more efficiently. In particular
instead of:

s:   ;; X / 2
        movl 4(%esp), %eax
        movl %eax, %ecx
        shrl $31, %ecx
        movl %eax, %edx
        addl %ecx, %edx
        sarl $1, %eax
        ret

t:   ;; X / -2
        movl 4(%esp), %eax
        movl %eax, %ecx
        shrl $31, %ecx
        movl %eax, %edx
        addl %ecx, %edx
        sarl $1, %eax
        negl %eax
        ret

Emit:

s:
        movl 4(%esp), %eax
        cmpl $-2147483648, %eax
        sbbl $-1, %eax
        sarl $1, %eax
        ret

t:
        movl 4(%esp), %eax
        cmpl $-2147483648, %eax
        sbbl $-1, %eax
        sarl $1, %eax
        negl %eax
        ret

llvm-svn: 16760
2004-10-06 04:02:39 +00:00
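
A minimal C sketch (hypothetical helper) of what the cmp/sbb idiom computes:
the unsigned compare against 0x80000000 leaves carry clear exactly when x is
negative, so sbb $-1 adds a rounding bias of 1 to negative dividends before
the shift.

#include <assert.h>

static int div2(int x) {
    return (x + (x < 0)) >> 1;  /* cmpl $-2147483648 / sbbl $-1 / sarl $1 */
}

int main(void) {
    assert(div2(-3) == -1 && div2(3) == 1 && div2(-4) == -2);
    return 0;
}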
Chris Lattner
acd213fba3 Add some new instructions. Fix the asm string for sbb32rr
llvm-svn: 16759
2004-10-06 04:01:02 +00:00
Chris Lattner
0228f228df * Prune #includes
* Update comments
* Rearrange code a bit
* Finally ELIMINATE the GAS workaround emitter for Intel mode.  woot!

llvm-svn: 16647
2004-10-04 07:31:08 +00:00
Chris Lattner
581948c8f6 Add support for emitting AT&T style .s files, and make it the default. Users
may now choose their output format with the -x86-asm-syntax={intel|att} flag.

llvm-svn: 16646
2004-10-04 07:24:48 +00:00
Chris Lattner
5959f4a108 Convert some missed patterns to support AT&T style
llvm-svn: 16645
2004-10-04 07:23:07 +00:00
Chris Lattner
a05d9f53bb Apparently the GNU assembler has a HUGE hack to be compatible with really
old and broken AT&T syntax assemblers.  The problem with this hack is that
*SOME* forms of the fdiv and fsub instructions have the 'r' bit inverted.
This was a real pain to figure out, but is trivially easy to support: thus
we are now bug-compatible with gas and gcc.

llvm-svn: 16644
2004-10-04 07:08:46 +00:00
Chris Lattner
08098895db Fix incorrect suffix
llvm-svn: 16642
2004-10-04 05:20:16 +00:00
Chris Lattner
c2fc9597bd Fix some more missed suffixes and swapped operands
llvm-svn: 16641
2004-10-04 01:38:10 +00:00
Chris Lattner
7b15a84728 Add missing suffixes to FP instructions for AT&T mode
llvm-svn: 16640
2004-10-04 00:43:31 +00:00
Chris Lattner
8d44dcca97 Add support for the -x86-asm-syntax flag, which can be used to choose between
Intel and AT&T style assembly language.  The ultimate goal of this is to
eliminate the GasBugWorkaroundEmitter class, but for now AT&T style emission
is not fully operational.

llvm-svn: 16639
2004-10-03 20:36:57 +00:00
Chris Lattner
94780713a8 Add support to the instruction patterns for AT&T style output, which will
hopefully lead to the death of the 'GasBugWorkaroundEmitter'.  This also
includes changes to wrap the whole file to 80 columns! Woot! :)

Note that the AT&T style output has not been tested at all.

llvm-svn: 16638
2004-10-03 20:35:00 +00:00
Alkis Evlogimenos
3f8f30bcb8 The real x87 floating point registers should not be allocatable. They
are only used by the stackifier when transforming FPn register
allocations onto the real x87 stack registers.

llvm-svn: 16472
2004-09-21 21:22:11 +00:00
Misha Brukman
e877aacbaa s/ISel/X86ISel/ to have unique class names for debugging via gdb, because the C++
front-end in gcc does not mangle classes in anonymous namespaces correctly.

llvm-svn: 16469
2004-09-21 18:21:21 +00:00
Reid Spencer
c6a8d70cff Convert code to compile with vc7.1.
Patch contributed by Paolo Invernizzi. Thanks Paolo!

llvm-svn: 16368
2004-09-15 17:06:42 +00:00
Misha Brukman
9e5af08ef9 Fit long lines into 80 cols via creative space elimination
llvm-svn: 16353
2004-09-15 01:40:18 +00:00