1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-29 23:12:55 +01:00
Commit Graph

1076 Commits

Author SHA1 Message Date
Chris Lattner
79ba9d58fd Don't emit two comparisons when comparing a FP value against zero!
llvm-svn: 20651
2005-03-17 16:29:26 +00:00
Chris Lattner
c9a3ea81bf Fix the missing symbols problem Bill was hitting. Patch contributed by
Bill Wendling!!

llvm-svn: 20649
2005-03-17 15:38:16 +00:00
Chris Lattner
4b688a1c70 This mega patch converts us from using Function::a{iterator|begin|end} to
using Function::arg_{iterator|begin|end}.  Likewise Module::g* -> Module::global_*.

This patch is contributed by Gabor Greif, thanks!

llvm-svn: 20597
2005-03-15 04:54:21 +00:00
Reid Spencer
b0ca4aa8cd Patch to make assembly output compatible with mingw compilation (identical
to cygwin)

llvm-svn: 20520
2005-03-08 17:02:05 +00:00
Chris Lattner
a024984017 Fix spelling, patch contributed by Gabor Greif!
llvm-svn: 20343
2005-02-27 06:18:25 +00:00
Chris Lattner
9838ab1271 Silence some uninit variable warnings.
llvm-svn: 20284
2005-02-23 05:57:21 +00:00
Chris Lattner
ab92b92bc5 We can fold promoted and non-promoted loads into divs also!
llvm-svn: 19835
2005-01-25 20:35:10 +00:00
Chris Lattner
a9a0369879 Fold promoted loads into binary ops for FP, allowing us to generate m32 forms
of FP ops.

llvm-svn: 19834
2005-01-25 20:03:11 +00:00
Chris Lattner
6ff85c3152 Silence a warning.
llvm-svn: 19798
2005-01-23 23:20:06 +00:00
Chris Lattner
94952e0947 Allow the FP stackifier to completely ignore functions that do not use FP at
all.  This should speed up the X86 backend fairly significantly on integer
codes.  Now if only we didn't have to compute livevar still... ;-)

llvm-svn: 19796
2005-01-23 23:13:59 +00:00
Reid Spencer
5c7b6e83f0 Support Cygwin assembly generation. The cygwin version of Gnu ASsembler
doesn't support certain directives and symbols on cygwin are prefixed with
an underscore. This patch makes the necessary adjustments to the output.

llvm-svn: 19775
2005-01-23 03:52:14 +00:00
Chris Lattner
b4cf4ffb04 Speed up folding operations into loads.
llvm-svn: 19733
2005-01-21 21:43:02 +00:00
Chris Lattner
fd4d7f71ae The ever-important vanity pass name :)
llvm-svn: 19731
2005-01-21 21:35:14 +00:00
Chris Lattner
5f2fbeaa69 Fix a FIXME: realize that argument stores are all independent (don't alias)
llvm-svn: 19728
2005-01-21 19:46:38 +00:00
Chris Lattner
febeb380ae Implement ADD_PARTS/SUB_PARTS so that 64-bit integer add/sub work. This
fixes most of the remaining llc-beta failures.

llvm-svn: 19716
2005-01-20 18:53:00 +00:00
Chris Lattner
8b0a2a3251 Fix a crash compiling 134.perl.
llvm-svn: 19711
2005-01-20 16:50:16 +00:00
Chris Lattner
6534e1ede3 Fix a problem where were were literally selecting for INCREASED register
pressure, not decreases register pressure.  Fix problem where we accidentally
swapped the operands of SHLD, which caused fourinarow to fail.  This fixes
fourinarow.

llvm-svn: 19697
2005-01-19 17:24:34 +00:00
Chris Lattner
b75589131d When commuting these instructions, make sure to actually swap the operands too.
llvm-svn: 19694
2005-01-19 16:55:52 +00:00
Chris Lattner
fde1a5688b Implement Regression/CodeGen/X86/rotate.ll: emit rotate instructions (which
typically cost 1 cycle) instead of shld/shrd instruction (which are typically
6 or more cycles).  This also saves code space.

For example, instead of emitting:

rotr:
        mov %EAX, DWORD PTR [%ESP + 4]
        mov %CL, BYTE PTR [%ESP + 8]
        shrd %EAX, %EAX, %CL
        ret
rotli:
        mov %EAX, DWORD PTR [%ESP + 4]
        shrd %EAX, %EAX, 27
        ret

Emit:

rotr32:
        mov %CL, BYTE PTR [%ESP + 8]
        mov %EAX, DWORD PTR [%ESP + 4]
        ror %EAX, %CL
        ret
rotli32:
        mov %EAX, DWORD PTR [%ESP + 4]
        ror %EAX, 27
        ret

We also emit byte rotate instructions which do not have a sh[lr]d counterpart
at all.

llvm-svn: 19692
2005-01-19 08:07:05 +00:00
Chris Lattner
34757ff939 Add rotate instructions.
llvm-svn: 19690
2005-01-19 07:50:03 +00:00
Chris Lattner
e539ce8223 Match 16-bit shld/shrd instructions as well, implementing shift-double.llx:test5
llvm-svn: 19689
2005-01-19 07:37:26 +00:00
Chris Lattner
9d5ee289d7 Improve coverage of the X86 instruction set by adding 16-bit shift doubles.
llvm-svn: 19687
2005-01-19 07:31:24 +00:00
Chris Lattner
c03f360215 Teach the code generator that shrd/shld is commutable if it has an immediate.
This allows us to generate this:

foo:
        mov %EAX, DWORD PTR [%ESP + 4]
        mov %EDX, DWORD PTR [%ESP + 8]
        shld %EDX, %EDX, 2
        shl %EAX, 2
        ret

instead of this:

foo:
        mov %EAX, DWORD PTR [%ESP + 4]
        mov %ECX, DWORD PTR [%ESP + 8]
        mov %EDX, %EAX
        shrd %EDX, %ECX, 30
        shl %EAX, 2
        ret

Note the magically transmogrifying immediate.

llvm-svn: 19686
2005-01-19 07:11:01 +00:00
Chris Lattner
575e912fcf Codegen long >> 2 to this:
foo:
        mov %EAX, DWORD PTR [%ESP + 4]
        mov %EDX, DWORD PTR [%ESP + 8]
        shrd %EAX, %EDX, 2
        sar %EDX, 2
        ret

instead of this:

test1:
        mov %ECX, DWORD PTR [%ESP + 4]
        shr %ECX, 2
        mov %EDX, DWORD PTR [%ESP + 8]
        mov %EAX, %EDX
        shl %EAX, 30
        or %EAX, %ECX
        sar %EDX, 2
        ret

and long << 2 to this:

foo:
        mov %EAX, DWORD PTR [%ESP + 4]
        mov %ECX, DWORD PTR [%ESP + 8]
***     mov %EDX, %EAX
        shrd %EDX, %ECX, 30
        shl %EAX, 2
        ret

instead of this:

foo:
        mov %EAX, DWORD PTR [%ESP + 4]
        mov %ECX, %EAX
        shr %ECX, 30
        mov %EDX, DWORD PTR [%ESP + 8]
        shl %EDX, 2
        or %EDX, %ECX
        shl %EAX, 2
        ret

The extra copy (marked ***) can be eliminated when I teach the code generator
that shrd32rri8 is really commutative.

llvm-svn: 19681
2005-01-19 06:18:43 +00:00
Chris Lattner
419a5d213b X86 shifts mask the amount.
llvm-svn: 19678
2005-01-19 03:36:30 +00:00
Chris Lattner
6dec8cb829 Code to handle FP_EXTEND is dead now. X86 doesn't support any data types to
FP_EXTEND from!

llvm-svn: 19674
2005-01-18 20:05:56 +00:00
Chris Lattner
798e9c85d6 Remove more dead code.
llvm-svn: 19673
2005-01-18 19:50:08 +00:00
Chris Lattner
401814508f The selection dag code handles the promotions from F32 to F64 for us, so we
don't need to even think about F32 in the X86 code anymore.

llvm-svn: 19672
2005-01-18 19:46:54 +00:00
Chris Lattner
dc09e52b3e Fix 124.m88ksim.
llvm-svn: 19667
2005-01-18 17:35:28 +00:00
Chris Lattner
a04b1ee7a8 Do not emit loads multiple times, potentially in the wrong places.
llvm-svn: 19661
2005-01-18 04:18:32 +00:00
Chris Lattner
722ddeb86e Eliminate bad assertions.
llvm-svn: 19659
2005-01-18 04:00:54 +00:00
Chris Lattner
8f3a8d96e2 * Eliminate the TokenSet and just use the ExprMap for both tokens and values.
* Insert some really pedantic assertions that will notice when we emit the
  same loads more than one time, exposing bugs.  This turns a miscompilation in
  bzip2 into a compile-fail.  yaay.

llvm-svn: 19658
2005-01-18 03:51:59 +00:00
Chris Lattner
b3edb09ede Rely on the code in MatchAddress to do this work. Otherwise we fail to
match (X+Y)+(Z << 1), because we match the X+Y first, consuming the index
register, then there is no place to put the Z.

llvm-svn: 19652
2005-01-18 02:25:52 +00:00
Chris Lattner
ce2e0125dc Fix a problem where probing for addressing modes caused expressions to be
emitted too early.  In particular, this fixes
Regression/CodeGen/X86/regpressure.ll:regpressure3.

This also improves the 2nd basic block in 164.gzip:flush_block, which went from

.LBBflush_block_1:      # loopentry.1.i
        movzx %EAX, WORD PTR [dyn_ltree + 20]
        movzx %ECX, WORD PTR [dyn_ltree + 16]
        mov DWORD PTR [%ESP + 32], %ECX
        movzx %ECX, WORD PTR [dyn_ltree + 12]
        movzx %EDX, WORD PTR [dyn_ltree + 8]
        movzx %EBX, WORD PTR [dyn_ltree + 4]
        mov DWORD PTR [%ESP + 36], %EBX
        movzx %EBX, WORD PTR [dyn_ltree]
        add DWORD PTR [%ESP + 36], %EBX
        add %EDX, DWORD PTR [%ESP + 36]
        add %ECX, %EDX
        add DWORD PTR [%ESP + 32], %ECX
        add %EAX, DWORD PTR [%ESP + 32]
        movzx %ECX, WORD PTR [dyn_ltree + 24]
        add %EAX, %ECX
        mov %ECX, 0
        mov %EDX, %ECX

to

.LBBflush_block_1:      # loopentry.1.i
        movzx %EAX, WORD PTR [dyn_ltree]
        movzx %ECX, WORD PTR [dyn_ltree + 4]
        add %ECX, %EAX
        movzx %EAX, WORD PTR [dyn_ltree + 8]
        add %EAX, %ECX
        movzx %ECX, WORD PTR [dyn_ltree + 12]
        add %ECX, %EAX
        movzx %EAX, WORD PTR [dyn_ltree + 16]
        add %EAX, %ECX
        movzx %ECX, WORD PTR [dyn_ltree + 20]
        add %ECX, %EAX
        movzx %EAX, WORD PTR [dyn_ltree + 24]
        add %ECX, %EAX
        mov %EAX, 0
        mov %EDX, %EAX

... which results in less spilling in the function.

This change alone speeds up 164.gzip from 37.23s to 36.24s on apoc.  The
default isel takes 37.31s.

llvm-svn: 19650
2005-01-18 01:06:26 +00:00
Chris Lattner
a78f9ced61 Fix indentation.
llvm-svn: 19649
2005-01-17 23:25:45 +00:00
Chris Lattner
dff1e3e86f Don't bother using max here.
llvm-svn: 19647
2005-01-17 23:02:13 +00:00
Chris Lattner
2d86b43318 Do not give token factor nodes outrageous weights
llvm-svn: 19645
2005-01-17 22:56:09 +00:00
Chris Lattner
f2878ce8ba Two changes:
1. Fold  [mem] += (1|-1) into inc [mem]/dec [mem] to save some icache space.
 2. Do not let token factor nodes prevent forming '[mem] op= val' folds.

llvm-svn: 19643
2005-01-17 22:10:42 +00:00
Chris Lattner
40c0fca632 Refactor load/op/store folding into it's own method, no functionality changes.
llvm-svn: 19641
2005-01-17 19:25:26 +00:00
Chris Lattner
2348abc421 Fix a major regression last night that prevented us from producing [mem] op= reg
operations.

The body of the if is less indented but unmodified in this patch.

llvm-svn: 19638
2005-01-17 17:49:14 +00:00
Chris Lattner
adb669ab1f Codegen this:
int %foo(int %X) {
        %T = add int %X, 13
        %S = mul int %T, 3
        ret int %S
}

as this:

        mov %ECX, DWORD PTR [%ESP + 4]
        lea %EAX, DWORD PTR [%ECX + 2*%ECX + 39]
        ret

instead of this:

        mov %ECX, DWORD PTR [%ESP + 4]
        mov %EAX, %ECX
        add %EAX, 13
        imul %EAX, %EAX, 3
        ret

llvm-svn: 19633
2005-01-17 06:48:02 +00:00
Chris Lattner
51590b615c Fix test/Regression/CodeGen/X86/2005-01-17-CycleInDAG.ll and 132.ijpeg.
Do not fold a load into an operation if it will induce a cycle in the DAG.

Repeat after me: dAg.

llvm-svn: 19631
2005-01-17 06:26:58 +00:00
Chris Lattner
f1e85bec5a Do not fold a load into a comparison that is used by more than one place.
The comparison will probably be folded, so this is not ok to do.
This fixed 197.parser.

llvm-svn: 19624
2005-01-17 01:34:14 +00:00
Chris Lattner
1b8c8fe020 Do not codegen 'xor bool, true' as 'not reg'. not reg inverts the upper bits
of the bytereg.  This fixes yacr2, 300.twolf and probably others.

llvm-svn: 19622
2005-01-17 00:23:16 +00:00
Chris Lattner
46dac4394c Set up the shift and setcc types.
If we emit a load because we followed a token chain to get to it, try to
fold it into its single user if possible.

llvm-svn: 19620
2005-01-17 00:00:33 +00:00
Chris Lattner
9ffc59287e * Adjust to changes in TargetLowering interfaces.
* Remove custom promotion for bool and byte select ops.  Legalize now
  promotes them for us.
* Allow folding ConstantPoolIndexes into EXTLOAD's, useful for float immediates.
* Declare which operations are not supported better.
* Add some hacky code for TRUNCSTORE to pretend that we have truncstore
  for i16 types.  This is useful for testing promotion code because I can
  just remove 16-bit registers all together and verify that programs work.

llvm-svn: 19614
2005-01-16 07:34:08 +00:00
Chris Lattner
f3d950e816 Add support for truncstore and *extload.
llvm-svn: 19566
2005-01-15 05:22:24 +00:00
Chris Lattner
27c91fac94 Adjust to CopyFromREg changes.
llvm-svn: 19561
2005-01-14 22:37:41 +00:00
Chris Lattner
7a8788c9ac Add new ImplicitDef node, rename CopyRegSDNode class to RegSDNode.
llvm-svn: 19535
2005-01-13 20:50:02 +00:00
Chris Lattner
fce6a5439d Codegen factor nodes more intelligently according to perceived register pressure.
llvm-svn: 19532
2005-01-13 19:56:00 +00:00
Chris Lattner
cb4359465a Initial trivial (but stupid) codegen for this node.
llvm-svn: 19529
2005-01-13 18:01:36 +00:00
Chris Lattner
9a70166615 Add some really pedantic assertions to the load folding code. Fix a bunch
of cases where we accidentally emitted a load folded once and unfolded
elsewhere.

llvm-svn: 19522
2005-01-13 05:53:16 +00:00
Chris Lattner
2ab70aafe0 We can only fold a load into an op if there is exactly one use of the value.
Checking to see if the load has two uses is not equivalent, as the chain
value may have zero uses.

llvm-svn: 19518
2005-01-12 18:38:26 +00:00
Chris Lattner
4b03f0f99e Try both ways to fold an add together. This allows us to generate this code
imul %EAX, %EAX, 400
        add %ECX, %EAX
        add %ESI, DWORD PTR [%ECX + 4*%EDX]
        inc %EDX
        cmp %EDX, 100

instead of this:

        imul %EAX, %EAX, 400
        add %ECX, %EAX
        mov %EAX, %EDX
        shl %EAX, 2
        add %ECX, %EAX
        add %ESI, DWORD PTR [%ECX]
        inc %EDX
        cmp %EDX, 100

llvm-svn: 19513
2005-01-12 18:08:53 +00:00
Chris Lattner
61c572eb7f Fix a major miscompilation where we were overwriting the scale reg.
llvm-svn: 19511
2005-01-12 07:33:20 +00:00
Chris Lattner
5816f1a302 Do not use the type of the RHS constant to determine the type of the operation.
This fails for shifts because the constant is always 8 bits.

llvm-svn: 19508
2005-01-12 05:22:07 +00:00
Chris Lattner
89d6b21ae6 Do not lose the offset from teh global when peephole optimizing instructions.
This fixes FreeBench/pcompress

llvm-svn: 19507
2005-01-12 05:17:28 +00:00
Jeff Cohen
614a5ec22a Fix C++ more compilatiom errors
llvm-svn: 19504
2005-01-12 04:29:05 +00:00
Chris Lattner
5ef92f3a40 Fix a compile error with VC++, which things that static const arrays need
to be dynamically initialized. :(

llvm-svn: 19503
2005-01-12 04:23:22 +00:00
Chris Lattner
627c64e5e5 Fix a bug that caused us to crash on povray. We weren't emitting an FP_REG_KILL into a block that had a successor with a FP PHI node.
llvm-svn: 19502
2005-01-12 04:21:28 +00:00
Chris Lattner
a5f0ba59a0 Print a load of a null pointer (in intel mode) like this:
mov %AX, WORD PTR [0]

instead of like this:

        mov %AX, WORD PTR []

llvm-svn: 19501
2005-01-12 04:07:11 +00:00
Chris Lattner
360988bae2 Print a load of a null pointer like this:
movw 0, %ax

instead of like this:

        movw , %ax

llvm-svn: 19500
2005-01-12 04:05:19 +00:00
Chris Lattner
3c85c67c97 Fix a crash compiling povray on UINT_TO_FP from i16.
llvm-svn: 19499
2005-01-12 04:00:00 +00:00
Chris Lattner
4e72a2a000 There are no [mem] op= reg instructions for FP, so remove their entries.
llvm-svn: 19496
2005-01-12 03:16:09 +00:00
Chris Lattner
00cb0ace9b Fix a bug where we didn't insert FP_REG_KILL instructions into MBB's that
contain FP PHI nodes but no other FP defining instructions.  This fixes
183.equake

llvm-svn: 19495
2005-01-12 02:57:10 +00:00
Chris Lattner
92166ed1df Fold TRUNCATE (LOAD P) into a smaller load from P.
llvm-svn: 19494
2005-01-12 02:19:06 +00:00
Chris Lattner
258b23bd9d Be more careful about order of arg evalution for CopyToReg nodes. This shrinks
256.bzip2 from 7142 to 7103 lines of .s file.

Second, add initial support for folding loads into compares, though this code
is dynamically dead for now. :(

llvm-svn: 19493
2005-01-12 02:02:48 +00:00
Chris Lattner
604416e8f4 Fold some more [mem] op= val operators. This allows us to things like this
several times in 256.bzip2:

        mov %EAX, DWORD PTR [%ESP + 204]
-       mov %EAX, DWORD PTR [%EAX]
-       or %EAX, 2097152
-       mov %ECX, DWORD PTR [%ESP + 204]
-       mov DWORD PTR [%ECX], %EAX
+       or DWORD PTR [%EAX], 2097152

llvm-svn: 19492
2005-01-12 01:28:00 +00:00
Chris Lattner
e83ae1063f Fold loads into sign/zero extends. instead of:
mov %AL, BYTE PTR [%EDX + l18_length_code]
  movzx %EAX, %AL

Emit:

  movzx %EAX, BYTE PTR [%EDX + l18_length_code]

llvm-svn: 19489
2005-01-11 23:33:00 +00:00
Chris Lattner
87a38bd4a8 Comment out debug code :)
Select [mem] += Val operations.  For constants, we used to get:

  mov %ECX, -32768
  add %ECX, DWORD PTR [l4_match_start]
  mov DWORD PTR [l4_match_start], %ECX

Now we get:

  add DWORD PTR [l4_match_start], -32768

For other values we used to get:

  mov %EBP, %EDI   ;; because the add destroys the value
  add %EBP, DWORD PTR [l4_input_len]
  mov DWORD PTR [l4_input_len], %EBP

now we get:

  add DWORD PTR [l4_input_len], %EDI

Both of these use less registers than the alternative, are faster and smaller.

llvm-svn: 19488
2005-01-11 23:21:30 +00:00
Chris Lattner
282473a25d Handle the global address case here, not just the offset case.
llvm-svn: 19487
2005-01-11 22:58:43 +00:00
Chris Lattner
9eb2cc700b Treat int constants as not requiring a register, since they are almost always
folded into an instruction.

llvm-svn: 19486
2005-01-11 22:29:12 +00:00
Chris Lattner
7cb2220907 * Factor a bunch of binary operator cases into shared code.
* Fold loads into Add, sub, and, or, xor and mul when possible.
* Codegen shl X, 1 as add X, X

llvm-svn: 19483
2005-01-11 21:19:59 +00:00
Chris Lattner
b838c9748e Fold multiplies by 3,5,9 into addressing modes when possible.
llvm-svn: 19480
2005-01-11 19:37:02 +00:00
Chris Lattner
e7b1130b01 Instead of generating stuff like this:
mov %ECX, %EAX
        add %ECX, 32768
        mov %SI, WORD PTR [2*%ECX + l13_prev]

Generate this:

        mov %SI, WORD PTR [2*%ECX + l13_prev + 65536]

This occurs when you have a GEP instruction where an index is
"something + imm".

llvm-svn: 19472
2005-01-11 06:36:20 +00:00
Chris Lattner
bb63a09cd1 Implement MEMCPY natively in terms of rep movs*
llvm-svn: 19468
2005-01-11 06:19:26 +00:00
Chris Lattner
b2b08a8bc1 Implement memset -> rep stos*
llvm-svn: 19467
2005-01-11 06:14:36 +00:00
Chris Lattner
58816a9e81 Announce that we don't support mem ops yet.
llvm-svn: 19466
2005-01-11 05:57:36 +00:00
Chris Lattner
f867443d7e Teach the address selector to make 'reg+reg' addressing modes.
llvm-svn: 19457
2005-01-11 04:40:19 +00:00
Chris Lattner
edf06be50e Emit NOT instructions.
llvm-svn: 19455
2005-01-11 04:31:30 +00:00
Chris Lattner
4e4bef2d6c Fix a bug emitting branches that broke a lot of programs.
llvm-svn: 19452
2005-01-11 04:06:27 +00:00
Chris Lattner
4b51297a94 Be more careful where we set ContainsFPCode. We were missing a set in the
int -> FP casting code.  Note that we don't have to set it for FP operations
that take FP values as operands: whatever produces the FP value will set the
flag.

llvm-svn: 19451
2005-01-11 03:50:45 +00:00
Chris Lattner
0c4c4094e3 Fix a major bug in setcc/cmov folding, where we accidentally
inverted the sense of the comparison.

llvm-svn: 19450
2005-01-11 03:37:59 +00:00
Chris Lattner
d188e03011 Take register pressure into account when we have to decide whether to
evaluate the LHS or the RHS of an operation first.  This causes good things
to happen.  For example, instead of compiling a loop to this:

.LBBstrength_result7_1: # loopentry
        movl 16(%esp), %edi
        movl (%edi), %edi             ;;; LOAD
        movl (%ecx), %ebx
        movl $2, (%eax,%ebx,4)
        movl (%edx), %ebx
        movl %esi, %ebp
        addl $21, %ebp
        addl $42, %esi
        cmpl $0, %edi                 ;;; USE
        cmovne %esi, %ebp
        cmpl %ebp, %ebx
        movl %ebp, %esi
        jg .LBBstrength_result7_1

We now compile it to this:

.LBBstrength_result7_1: # loopentry
        movl %edi, %ebx
        addl $42, %ebx
        addl $21, %edi
        movl (%ecx), %ebp              ;; LOAD
        cmpl $0, %ebp                  ;; USE
        cmovne %ebx, %edi
        movl (%edx), %ebx
        movl $2, (%eax,%ebx,4)
        movl (%esi), %ebx
        cmpl %edi, %ebx
        jg .LBBstrength_result7_1

Which reduces register pressure enough (in this case) to avoid spilling in the
loop.

As another example, consider the CodeGen/X86/regpressure.ll testcase.  We
used to generate this code for both cases:

regpressure1:
        subl $32, %esp
        movl %esi, 12(%esp)
        movl %edi, 8(%esp)
        movl %ebx, 4(%esp)
        movl %ebp, (%esp)
        movl 36(%esp), %ecx
        movl (%ecx), %eax
        movl 4(%ecx), %edx
        movl %edx, 24(%esp)
        movl 8(%ecx), %edx
        movl %edx, 16(%esp)
        movl 12(%ecx), %edx
        movl 16(%ecx), %esi
        movl 20(%ecx), %edi
        movl 24(%ecx), %ebx
        movl %ebx, 28(%esp)
        movl 28(%ecx), %ebx
        movl 32(%ecx), %ebp
        movl %ebp, 20(%esp)
        movl 36(%ecx), %ecx
        imull 24(%esp), %eax
        imull 16(%esp), %eax
        imull %edx, %eax
        imull %esi, %eax
        imull %edi, %eax
        imull 28(%esp), %eax
        imull %ebx, %eax
        imull 20(%esp), %eax
        imull %ecx, %eax
        movl (%esp), %ebp
        movl 4(%esp), %ebx
        movl 8(%esp), %edi
        movl 12(%esp), %esi
        addl $32, %esp
        ret

This code is basically trying to do all of the loads first, then execute all
of the multiplies.  Because we run out of registers, lots of spill code happens.
We now generate this code for both cases:

regpressure1:
        movl 4(%esp), %ecx
        movl (%ecx), %eax
        movl 4(%ecx), %edx
        imull %edx, %eax
        movl 8(%ecx), %edx
        imull %edx, %eax
        movl 12(%ecx), %edx
        imull %edx, %eax
        movl 16(%ecx), %edx
        imull %edx, %eax
        movl 20(%ecx), %edx
        imull %edx, %eax
        movl 24(%ecx), %edx
        imull %edx, %eax
        movl 28(%ecx), %edx
        imull %edx, %eax
        movl 32(%ecx), %edx
        imull %edx, %eax
        movl 36(%ecx), %ecx
        imull %ecx, %eax
        ret

which is much nicer (when we fold loads into the muls it will be even better).
The old instruction selector used to produce the good code for regpressure1
but not for regpressure2, as it depended on the order of operations in the
LLVM code.

llvm-svn: 19449
2005-01-11 03:11:44 +00:00
Chris Lattner
497e24c885 Fold setcc instructions into selects.
llvm-svn: 19438
2005-01-10 22:10:13 +00:00
Chris Lattner
65d007ab62 Add conditional moves for the parity flag.
llvm-svn: 19437
2005-01-10 22:09:33 +00:00
Chris Lattner
d61491dea2 Implement 8-bit multiply for X86.
llvm-svn: 19435
2005-01-10 20:55:48 +00:00
Chris Lattner
fcab5f75c0 Codegen (Reg|imm)+&GV as an LEA, because we cannot put it into the immediate field
of an ADDri (due to current restrictions on MachineOperand :( ).  This allows
us to generate:

        leal Data+16000, %edx

instead of:

        movl $Data, %edx
        addl $16000, %edx

llvm-svn: 19420
2005-01-09 20:20:29 +00:00
Chris Lattner
35375c11bf Fix copy and pasto's for FP -> Int. This fixes fldry
llvm-svn: 19418
2005-01-09 19:49:59 +00:00
Chris Lattner
45155a3dee Initial implementation of FP->INT and INT->FP casts
Also, fix zero_extend from bool to i8, which fixes Shootout/objinst.

llvm-svn: 19414
2005-01-09 18:52:44 +00:00
Chris Lattner
9ca9b20447 Fix a subtle bug involving constant expr casts from int to fp
llvm-svn: 19410
2005-01-09 01:49:29 +00:00
Chris Lattner
c5e53c07fd Implement varargs and returnaddress/frameaddress intrinsics. With this
patch, all of SingleSource/UnitTests passes.

llvm-svn: 19408
2005-01-09 00:01:27 +00:00
Chris Lattner
ca81756527 Okay 15th time is the charm. Looking at the vector size is useless as it
gets clobbered by a previous statement.  This fixes all calls finally.

llvm-svn: 19399
2005-01-08 20:51:36 +00:00
Chris Lattner
85816cff9a Okay, my off by one was actually off by two. This fixes Generic/2003-07-07-BadLongConst.ll
llvm-svn: 19398
2005-01-08 20:39:31 +00:00
Chris Lattner
2d68cb6cf4 Fix off by one error
llvm-svn: 19396
2005-01-08 20:31:34 +00:00
Chris Lattner
c4d075cfa3 Adjust to changes in LowerCallTo interface
Minor bugfixes

llvm-svn: 19376
2005-01-08 19:28:19 +00:00
Chris Lattner
6c7d3bd8ea Wrap long line.
llvm-svn: 19367
2005-01-08 06:59:50 +00:00
Chris Lattner
473ec492f7 The X86 instruction selector already handles codegen of:
store float 123.45, float* %P

as an integer store.  This adds handling of float immediate stores as integers
for arguments passed function calls.

This is now tested by CodeGen/X86/store-fp-constant.ll

llvm-svn: 19364
2005-01-08 05:45:24 +00:00
Chris Lattner
2c398fc8f6 Allow the selection-dag based selector to be diabled with -disable-pattern-isel.
For now, this is the default, as the current selector is missing some big pieces.
To enable the new selector, pass -disable-pattern-isel=false to llc or lli.

llvm-svn: 19335
2005-01-07 07:50:50 +00:00
Chris Lattner
216198574d Reimplementation of the X86 pattern isel. This is still missing many large
pieces, but can already do amazing things in some cases.

llvm-svn: 19334
2005-01-07 07:49:41 +00:00
Chris Lattner
74019f517a This file is now dead.
llvm-svn: 19333
2005-01-07 07:49:05 +00:00
Chris Lattner
079b497982 Add a new prototype
llvm-svn: 19332
2005-01-07 07:48:33 +00:00
Chris Lattner
608dd77d6b Codegen -1 and -0.0 more efficiently. This implements CodeGen/X86/negatize_zero.ll
llvm-svn: 19313
2005-01-06 21:19:16 +00:00
Chris Lattner
6d651234d6 1. If a double FP constant must be put into a constant pool, but it can be
precisely represented as a float, put it into the constant pool as a
   float.
2. Use the cbw/cwd/cdq instructions instead of an explicit SAR for signed
   division.

llvm-svn: 19291
2005-01-05 16:30:14 +00:00
Chris Lattner
b438f5251f Minor optimization to allocate R8 registers in a better order.
llvm-svn: 19289
2005-01-05 16:09:16 +00:00
Jeff Cohen
36968ed8c1 Revert elimination of global variable hack... still needed.
llvm-svn: 19273
2005-01-03 16:34:19 +00:00
Chris Lattner
1aaf8cccb2 ADC and IMUL are also commutable.
llvm-svn: 19264
2005-01-03 01:27:59 +00:00
Jeff Cohen
1087b72875 Eliminate the use of the global variable hack in the X86 target that was used
to get Visual Studio to link in X86.lib to the executables that need it.  There
is another way of doing it.

llvm-svn: 19252
2005-01-02 04:23:12 +00:00
Chris Lattner
a78fd4726e Disable 2->3 address promotion of add and inc instructions to LEA's. In
addition to being three address, LEA's don't set the flags.

This fixes 186.crafty.

llvm-svn: 19251
2005-01-02 04:18:17 +00:00
Chris Lattner
3ef32da6c3 Add a new method.
llvm-svn: 19249
2005-01-02 02:38:18 +00:00
Chris Lattner
95f1e628ed Add support for SETNPr to lower to memory form.
llvm-svn: 19248
2005-01-02 02:37:46 +00:00
Chris Lattner
d6bc921fa8 Implement the convertToThreeAddress method, add support for inverting JP/JNP
branches.

llvm-svn: 19247
2005-01-02 02:37:07 +00:00
Chris Lattner
0d6f03e52b Two changes here:
1. Add new instructions for checking parity flags: JP, JNP, SETP, SETNP.
2. Set the isCommutable and isPromotableTo3Address bits on several
   instructions.

llvm-svn: 19246
2005-01-02 02:35:46 +00:00
Chris Lattner
3b78513843 Remove unused enum value
llvm-svn: 19024
2004-12-17 22:41:46 +00:00
Chris Lattner
d11ba51208 Change the sentinal
llvm-svn: 19007
2004-12-17 00:46:51 +00:00
Chris Lattner
59d0c02d2b Create a stack slot for the return address lazily instead of eagerly. This
save small amounts of time for functions that don't call llvm.returnaddress
or llvm.frameaddress (which is almost all functions).

llvm-svn: 19006
2004-12-17 00:07:46 +00:00
Chris Lattner
4b1d58bf4b Adjust to changes in asmwriter filenames
llvm-svn: 18987
2004-12-16 17:33:24 +00:00
Chris Lattner
4136428410 Set the rounding mode for the X86 FPU to 64-bits instead of 80-bits. We
don't support long double anyway, and this gives us FP results closer to
other targets.

This also speeds up 179.art from 41.4s to 18.32s, by eliminating a problem
with extra precision that causes an FP == comparison to fail (leading to
extra loop iterations).

llvm-svn: 18895
2004-12-13 17:23:11 +00:00
Chris Lattner
6131b06f73 Use the target triple to pick this target.
llvm-svn: 18830
2004-12-12 17:40:28 +00:00
Chris Lattner
99c8cf8ef8 Fix a regression caused by the previous patch
llvm-svn: 18449
2004-12-03 05:13:15 +00:00
Chris Lattner
316f923a9c Spill/restore X86 floating point stack registers with 64-bits of precision
instead of 80-bits of precision.  This fixes PR467.

This change speeds up fldry on X86 with LLC from 7.32s on apoc to 4.68s.

llvm-svn: 18433
2004-12-02 18:17:31 +00:00
Chris Lattner
dfdd49b7af Consider 64-bit registers to be FP as well.
llvm-svn: 18432
2004-12-02 17:57:21 +00:00
Tanya Lattner
893f987574 Reverting this patch:
http://mail.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20041122/021428.html

It broke Mutlisource/Applications/obsequi

llvm-svn: 18407
2004-12-01 18:27:03 +00:00
Chris Lattner
9c400f3b28 Revamp long/ulong comparisons to use a much more efficient sequence (thanks
to Brian and the Sun compiler for pointing out that the obvious works :)

This also enables folding all long comparisons into setcc and branch
instructions: before we could only do == and !=

For example, for:
void test(unsigned long long A, unsigned long long B) {
   if (A < B) foo();
 }

We now generate:

test:
        subl $4, %esp
        movl %esi, (%esp)
        movl 8(%esp), %eax
        movl 12(%esp), %ecx
        movl 16(%esp), %edx
        movl 20(%esp), %esi
        subl %edx, %eax
        sbbl %esi, %ecx
        jae .LBBtest_2  # UnifiedReturnBlock
.LBBtest_1:     # then
        call foo
        movl (%esp), %esi
        addl $4, %esp
        ret
.LBBtest_2:     # UnifiedReturnBlock
        movl (%esp), %esi
        addl $4, %esp
        ret

Instead of:

test:
        subl $12, %esp
        movl %esi, 8(%esp)
        movl %ebx, 4(%esp)
        movl 16(%esp), %eax
        movl 20(%esp), %ecx
        movl 24(%esp), %edx
        movl 28(%esp), %esi
        cmpl %edx, %eax
        setb %al
        cmpl %esi, %ecx
        setb %bl
        cmove %ax, %bx
        testb %bl, %bl
        je .LBBtest_2   # UnifiedReturnBlock
.LBBtest_1:     # then
        call foo
        movl 4(%esp), %ebx
        movl 8(%esp), %esi
        addl $12, %esp
        ret
.LBBtest_2:     # UnifiedReturnBlock
        movl 4(%esp), %ebx
        movl 8(%esp), %esi
        addl $12, %esp
        ret

llvm-svn: 18330
2004-11-29 05:55:24 +00:00
Chris Lattner
7a34cbf266 Do not push two return addresses on the stack when we call external functions who have their addresses taken. This fixes test-call.ll
llvm-svn: 18134
2004-11-22 22:25:30 +00:00
Chris Lattner
f4c8575535 There is no reason to emit function stubs for direct calls.
llvm-svn: 18082
2004-11-21 03:46:06 +00:00
Chris Lattner
6d1fb33657 ignore generated files
llvm-svn: 18073
2004-11-21 00:01:54 +00:00
Chris Lattner
4a340e281e Remove all JIT specific code and switch the code generator over to emitting
relocations for global references.

llvm-svn: 18068
2004-11-20 23:55:15 +00:00
Chris Lattner
b9a44893e9 Implement the X86 JIT interfaces
llvm-svn: 18067
2004-11-20 23:54:33 +00:00
Chris Lattner
8e33311566 Describe the X86 target-specific relocations.
llvm-svn: 18066
2004-11-20 23:54:19 +00:00
Chris Lattner
3c20464ad7 We implement these interfaces
llvm-svn: 18065
2004-11-20 23:53:56 +00:00
Chris Lattner
0c79788bc4 Dont' forget to switch back to decimal output
llvm-svn: 18010
2004-11-19 20:57:07 +00:00
Chris Lattner
a7eec14b04 Fix a major bug in the signed shr code, which apparently only breaks 134.perl!
llvm-svn: 17902
2004-11-16 18:40:52 +00:00
Chris Lattner
41d31d7461 Remove a dead function, which died when we got GAS emission working (phwew,
hold your nose!)

llvm-svn: 17869
2004-11-16 04:34:29 +00:00
Chris Lattner
b378786c97 Implement a simple FIXME: if we are emitting a basic block address that has
already been emitted, we don't have to remember it and deal with it later,
just emit it directly.

llvm-svn: 17868
2004-11-16 04:30:51 +00:00
Chris Lattner
3f73c77ace * Merge some win32 ifdefs together
* Get rid of "emitMaybePCRelativeValue", either we want to emit a PC relative
  value or not: drop the maybe BS.  As it turns out, the only places where
  the bool was a variable coming in, the bool was a dynamic constant.

llvm-svn: 17867
2004-11-16 04:21:18 +00:00
Chris Lattner
3ed3e8669f Add debug-only=jit printout, so we see when lazily resolved symbols are
set up.

llvm-svn: 17862
2004-11-15 23:16:55 +00:00
Chris Lattner
9ef34d44e1 Simplify and rearrange long shift code
llvm-svn: 17861
2004-11-15 23:16:34 +00:00
Misha Brukman
8c1b4a5b9d GhostLinkage should not reach asm printing stage
llvm-svn: 17750
2004-11-14 21:03:49 +00:00
Chris Lattner
09b7f968e0 Don't print unneeded labels
llvm-svn: 17714
2004-11-13 23:27:11 +00:00
Chris Lattner
1cde11aa95 shld is a very high latency operation. Instead of emitting it for shifts of
two or three, open code the equivalent operation which is faster on athlon
and P4 (by a substantial margin).

For example, instead of compiling this:

long long X2(long long Y) { return Y << 2; }

to:

X3_2:
        movl 4(%esp), %eax
        movl 8(%esp), %edx
        shldl $2, %eax, %edx
        shll $2, %eax
        ret

Compile it to:

X2:
        movl 4(%esp), %eax
        movl 8(%esp), %ecx
        movl %eax, %edx
        shrl $30, %edx
        leal (%edx,%ecx,4), %edx
        shll $2, %eax
        ret

Likewise, for << 3, compile to:

X3:
        movl 4(%esp), %eax
        movl 8(%esp), %ecx
        movl %eax, %edx
        shrl $29, %edx
        leal (%edx,%ecx,8), %edx
        shll $3, %eax
        ret

This matches icc, except that icc open codes the shifts as adds on the P4.

llvm-svn: 17707
2004-11-13 20:48:57 +00:00
Chris Lattner
c531e090db Add missing check
llvm-svn: 17706
2004-11-13 20:04:38 +00:00
Chris Lattner
d1381380ae Compile:
long long X3_2(long long Y) { return Y+Y; }
int X(int Y) { return Y+Y; }

into:

X3_2:
        movl 4(%esp), %eax
        movl 8(%esp), %edx
        addl %eax, %eax
        adcl %edx, %edx
        ret
X:
        movl 4(%esp), %eax
        addl %eax, %eax
        ret

instead of:

X3_2:
        movl 4(%esp), %eax
        movl 8(%esp), %edx
        shldl $1, %eax, %edx
        shll $1, %eax
        ret

X:
        movl 4(%esp), %eax
        shll $1, %eax
        ret

llvm-svn: 17705
2004-11-13 20:03:48 +00:00
John Criswell
402e338f11 Correct the name of stosd for the AT&T syntax:
It's stosl (l for long == 32 bit).

llvm-svn: 17658
2004-11-10 04:48:15 +00:00
John Criswell
97da76178c Fix compilation problem; make the cast and the LHS be the same type.
llvm-svn: 17488
2004-11-05 16:17:06 +00:00
Chris Lattner
499e1b16a7 Quiet VC++ warnings
llvm-svn: 17484
2004-11-05 04:50:59 +00:00
Chris Lattner
d9696aa7b8 Fix a warning
llvm-svn: 17431
2004-11-02 15:27:57 +00:00
Chris Lattner
10de12fd46 Add placeholder variable to make Win32 work, applied for Morten Ofstad
llvm-svn: 17406
2004-11-01 20:10:20 +00:00
Reid Spencer
d3f7233495 Change Library Names Not To Conflict With Others When Installed
llvm-svn: 17286
2004-10-27 23:18:45 +00:00
Reid Spencer
019621a1ea Adjust to changes in Makefile.rules
llvm-svn: 17167
2004-10-22 21:02:08 +00:00