Andrew Lenharth
f5b9a8fe57
Let me introduce you to the early stages of the LLVM backend for the Alpha processor
...
llvm-svn: 19764
2005-01-22 23:41:55 +00:00
Chris Lattner
b4cf4ffb04
Speed up folding operations into loads.
...
llvm-svn: 19733
2005-01-21 21:43:02 +00:00
Chris Lattner
fd4d7f71ae
The ever-important vanity pass name :)
...
llvm-svn: 19731
2005-01-21 21:35:14 +00:00
Chris Lattner
5f2fbeaa69
Fix a FIXME: realize that argument stores are all independent (don't alias)
...
llvm-svn: 19728
2005-01-21 19:46:38 +00:00
Chris Lattner
febeb380ae
Implement ADD_PARTS/SUB_PARTS so that 64-bit integer add/sub work. This
...
fixes most of the remaining llc-beta failures.
llvm-svn: 19716
2005-01-20 18:53:00 +00:00
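Note on r19716 above: a rough C++ model of what ADD_PARTS/SUB_PARTS have to compute, assuming the usual x86 lowering of a 64-bit add into an add on the low halves followed by an add-with-carry on the high halves. The commit does not spell out the instruction sequence; `Parts`, `addParts`, `subParts`, `split`, and `join` are names made up for this sketch.

```cpp
#include <cstdint>

// A 64-bit integer split into two 32-bit halves, as a 32-bit target sees it.
struct Parts {
  uint32_t lo;
  uint32_t hi;
};

// Add the low halves, then propagate the carry into the high halves
// (assumed to correspond to add followed by adc on x86).
inline Parts addParts(Parts a, Parts b) {
  uint32_t lo = a.lo + b.lo;
  uint32_t carry = lo < a.lo ? 1u : 0u;  // unsigned wraparound means a carry
  return {lo, a.hi + b.hi + carry};
}

// Subtract the low halves, then propagate the borrow into the high halves
// (assumed to correspond to sub followed by sbb on x86).
inline Parts subParts(Parts a, Parts b) {
  uint32_t borrow = a.lo < b.lo ? 1u : 0u;
  return {a.lo - b.lo, a.hi - b.hi - borrow};
}

// Helpers to convert to and from a native 64-bit value for checking.
inline Parts split(uint64_t v) {
  return {static_cast<uint32_t>(v), static_cast<uint32_t>(v >> 32)};
}

inline uint64_t join(Parts p) {
  return (static_cast<uint64_t>(p.hi) << 32) | p.lo;
}
```

Comparing `join(addParts(split(x), split(y)))` against a native 64-bit `x + y` is a quick way to convince yourself the carry handling is right.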
Chris Lattner
8b0a2a3251
Fix a crash compiling 134.perl.
...
llvm-svn: 19711
2005-01-20 16:50:16 +00:00
Chris Lattner
6534e1ede3
Fix a problem where we were literally selecting for INCREASED register
...
pressure, not decreased register pressure. Fix a problem where we accidentally
swapped the operands of SHLD, which caused fourinarow to fail. This fixes
fourinarow.
llvm-svn: 19697
2005-01-19 17:24:34 +00:00
Chris Lattner
b75589131d
When commuting these instructions, make sure to actually swap the operands too.
...
llvm-svn: 19694
2005-01-19 16:55:52 +00:00
Chris Lattner
fde1a5688b
Implement Regression/CodeGen/X86/rotate.ll: emit rotate instructions (which
...
typically cost 1 cycle) instead of shld/shrd instructions (which are typically
6 or more cycles). This also saves code space.
For example, instead of emitting:
rotr:
mov %EAX, DWORD PTR [%ESP + 4]
mov %CL, BYTE PTR [%ESP + 8]
shrd %EAX, %EAX, %CL
ret
rotli:
mov %EAX, DWORD PTR [%ESP + 4]
shrd %EAX, %EAX, 27
ret
Emit:
rotr32:
mov %CL, BYTE PTR [%ESP + 8]
mov %EAX, DWORD PTR [%ESP + 4]
ror %EAX, %CL
ret
rotli32:
mov %EAX, DWORD PTR [%ESP + 4]
ror %EAX, 27
ret
We also emit byte rotate instructions which do not have a sh[lr]d counterpart
at all.
llvm-svn: 19692
2005-01-19 08:07:05 +00:00
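Note on r19692 above: the replacement is valid because a double shift whose two register operands are the same value is a rotate. A minimal C++ sketch of that equivalence, for shift counts 1 to 31; `shrd32` and `ror32` are illustrative models written for this note, not LLVM code.

```cpp
#include <cassert>
#include <cstdint>

// Model of "shrd dst, src, n" for 1 <= n <= 31: shift dst right by n and
// fill the vacated high bits from the low bits of src.
inline uint32_t shrd32(uint32_t dst, uint32_t src, unsigned n) {
  assert(n >= 1 && n <= 31);
  return (dst >> n) | (src << (32 - n));
}

// Model of "ror x, n", written one bit at a time so that it does not simply
// restate the shrd formula.
inline uint32_t ror32(uint32_t x, unsigned n) {
  for (unsigned i = 0; i < n; ++i)
    x = (x >> 1) | ((x & 1u) << 31);  // move the low bit to the top
  return x;
}
```

With these definitions, `shrd32(x, x, n)` and `ror32(x, n)` agree for every `n` in range, which is the property the instruction selector relies on.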
Chris Lattner
34757ff939
Add rotate instructions.
...
llvm-svn: 19690
2005-01-19 07:50:03 +00:00
Chris Lattner
e539ce8223
Match 16-bit shld/shrd instructions as well, implementing shift-double.llx:test5
...
llvm-svn: 19689
2005-01-19 07:37:26 +00:00
Chris Lattner
9d5ee289d7
Improve coverage of the X86 instruction set by adding 16-bit shift doubles.
...
llvm-svn: 19687
2005-01-19 07:31:24 +00:00
Chris Lattner
c03f360215
Teach the code generator that shrd/shld is commutable if it has an immediate.
...
This allows us to generate this:
foo:
mov %EAX, DWORD PTR [%ESP + 4]
mov %EDX, DWORD PTR [%ESP + 8]
shld %EDX, %EDX, 2
shl %EAX, 2
ret
instead of this:
foo:
mov %EAX, DWORD PTR [%ESP + 4]
mov %ECX, DWORD PTR [%ESP + 8]
mov %EDX, %EAX
shrd %EDX, %ECX, 30
shl %EAX, 2
ret
Note the magically transmogrifying immediate.
llvm-svn: 19686
2005-01-19 07:11:01 +00:00
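Note on r19686 above: the "transmogrifying immediate" follows from the identity shld(a, b, n) == shrd(b, a, 32 - n), so commuting the operands also means switching the opcode and replacing n with 32 - n (2 becomes 30 and vice versa). A small C++ sketch, with `shld32`, `shrd32`, and `shldCommuted` as hypothetical helper names:

```cpp
#include <cassert>
#include <cstdint>

// Model of "shld dst, src, n" for 1 <= n <= 31: shift dst left by n and fill
// the vacated low bits from the high bits of src.
inline uint32_t shld32(uint32_t dst, uint32_t src, unsigned n) {
  assert(n >= 1 && n <= 31);
  return (dst << n) | (src >> (32 - n));
}

// Model of "shrd dst, src, n" for 1 <= n <= 31: shift dst right by n and
// fill the vacated high bits from the low bits of src.
inline uint32_t shrd32(uint32_t dst, uint32_t src, unsigned n) {
  assert(n >= 1 && n <= 31);
  return (dst >> n) | (src << (32 - n));
}

// The commuted form: swap the operands, switch the opcode, and replace the
// immediate n with 32 - n.
inline uint32_t shldCommuted(uint32_t dst, uint32_t src, unsigned n) {
  return shrd32(src, dst, 32 - n);
}
```

This only works for an immediate count, since the compiler must be able to compute 32 - n at compile time.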
Chris Lattner
33efebcdc8
Finegrainify namespacification
...
Add default impl of commuteInstruction
Add notes about ugly V9 code.
llvm-svn: 19684
2005-01-19 06:53:34 +00:00
Chris Lattner
575e912fcf
Codegen long >> 2 to this:
...
foo:
mov %EAX, DWORD PTR [%ESP + 4]
mov %EDX, DWORD PTR [%ESP + 8]
shrd %EAX, %EDX, 2
sar %EDX, 2
ret
instead of this:
test1:
mov %ECX, DWORD PTR [%ESP + 4]
shr %ECX, 2
mov %EDX, DWORD PTR [%ESP + 8]
mov %EAX, %EDX
shl %EAX, 30
or %EAX, %ECX
sar %EDX, 2
ret
and long << 2 to this:
foo:
mov %EAX, DWORD PTR [%ESP + 4]
mov %ECX, DWORD PTR [%ESP + 8]
*** mov %EDX, %EAX
shrd %EDX, %ECX, 30
shl %EAX, 2
ret
instead of this:
foo:
mov %EAX, DWORD PTR [%ESP + 4]
mov %ECX, %EAX
shr %ECX, 30
mov %EDX, DWORD PTR [%ESP + 8]
shl %EDX, 2
or %EDX, %ECX
shl %EAX, 2
ret
The extra copy (marked ***) can be eliminated when I teach the code generator
that shrd32rri8 is really commutative.
llvm-svn: 19681
2005-01-19 06:18:43 +00:00
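Note on r19681 above: a C++ model of the two-instruction sequences, assuming a 64-bit value held as two 32-bit halves and a constant shift count between 1 and 31. `Halves`, `sar64`, and `shl64` are names invented for this sketch; the sign fill is written out by hand rather than relying on the behavior of `>>` on negative values.

```cpp
#include <cassert>
#include <cstdint>

// A 64-bit value as the code generator sees it on a 32-bit target.
struct Halves {
  uint32_t lo;
  uint32_t hi;
};

// long >> n: "shrd lo, hi, n" pulls the low bits of hi into the top of lo,
// then "sar hi, n" shifts hi while replicating its sign bit.
inline Halves sar64(Halves v, unsigned n) {
  assert(n >= 1 && n <= 31);
  uint32_t lo = (v.lo >> n) | (v.hi << (32 - n));
  uint32_t signFill = (v.hi & 0x80000000u) ? (~0u << (32 - n)) : 0u;
  uint32_t hi = (v.hi >> n) | signFill;
  return {lo, hi};
}

// long << n: the new high half takes the top bits of lo (the double shift),
// then "shl lo, n" shifts the low half.
inline Halves shl64(Halves v, unsigned n) {
  assert(n >= 1 && n <= 31);
  uint32_t hi = (v.hi << n) | (v.lo >> (32 - n));
  return {v.lo << n, hi};
}
```

For example, shifting 0xFFFFFFFF00000004 right by 2 this way yields the halves 0xC0000001 and 0xFFFFFFFF, which is the expected sign-extended result.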
Chris Lattner
419a5d213b
X86 shifts mask the amount.
...
llvm-svn: 19678
2005-01-19 03:36:30 +00:00
Chris Lattner
fbd1f8e4fd
Add a hook to find out how the target handles shift amounts that are out of
...
range. Either they are undefined (the default), they mask the shift amount
to the size of the register (X86, Alpha, etc), or they extend the shift (PPC).
This defaults to undefined, which is conservatively correct.
llvm-svn: 19677
2005-01-19 03:36:14 +00:00
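Note on r19677 above: the three behaviors can be illustrated with a toy 32-bit left shift. This is a sketch only; the enum and function names below are made up for this note and are not claimed to match the hook's actual interface, and "extend" is read here as "amounts of 32 or more shift every bit out".

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of how a target treats out-of-range shift amounts.
enum class OutOfRangeShift { Undefined, Mask, Extend };

// Toy model of a 32-bit "shl" under each policy.
inline uint32_t shl32(uint32_t x, unsigned amt, OutOfRangeShift policy) {
  switch (policy) {
  case OutOfRangeShift::Mask:
    return x << (amt & 31u);           // only the low 5 bits of amt are used
  case OutOfRangeShift::Extend:
    return amt >= 32 ? 0u : x << amt;  // large amounts shift everything out
  case OutOfRangeShift::Undefined:
    break;
  }
  // Undefined: the optimizer may assume nothing, so callers must stay in range.
  assert(amt < 32 && "out-of-range shift has no defined result");
  return x << amt;
}
```

Under the masking policy a shift by 33 behaves like a shift by 1, which is why a transformation that is valid on one target can be wrong on another unless it assumes the undefined default.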
Chris Lattner
6dec8cb829
Code to handle FP_EXTEND is dead now. X86 doesn't support any data types to
...
FP_EXTEND from!
llvm-svn: 19674
2005-01-18 20:05:56 +00:00
Chris Lattner
798e9c85d6
Remove more dead code.
...
llvm-svn: 19673
2005-01-18 19:50:08 +00:00
Chris Lattner
401814508f
The selection dag code handles the promotions from F32 to F64 for us, so we
...
don't need to even think about F32 in the X86 code anymore.
llvm-svn: 19672
2005-01-18 19:46:54 +00:00
Chris Lattner
dc09e52b3e
Fix 124.m88ksim.
...
llvm-svn: 19667
2005-01-18 17:35:28 +00:00
Chris Lattner
a04b1ee7a8
Do not emit loads multiple times, potentially in the wrong places.
...
llvm-svn: 19661
2005-01-18 04:18:32 +00:00
Tanya Lattner
d3459278f2
Minor changes.
...
llvm-svn: 19660
2005-01-18 04:15:41 +00:00
Chris Lattner
722ddeb86e
Eliminate bad assertions.
...
llvm-svn: 19659
2005-01-18 04:00:54 +00:00
Chris Lattner
8f3a8d96e2
* Eliminate the TokenSet and just use the ExprMap for both tokens and values.
...
* Insert some really pedantic assertions that will notice when we emit the
same loads more than one time, exposing bugs. This turns a miscompilation in
bzip2 into a compile-fail. yaay.
llvm-svn: 19658
2005-01-18 03:51:59 +00:00
Chris Lattner
b3edb09ede
Rely on the code in MatchAddress to do this work. Otherwise we fail to
...
match (X+Y)+(Z << 1), because we match the X+Y first, consuming the index
register, then there is no place to put the Z.
llvm-svn: 19652
2005-01-18 02:25:52 +00:00
Chris Lattner
ce2e0125dc
Fix a problem where probing for addressing modes caused expressions to be
...
emitted too early. In particular, this fixes
Regression/CodeGen/X86/regpressure.ll:regpressure3.
This also improves the 2nd basic block in 164.gzip:flush_block, which went from
.LBBflush_block_1: # loopentry.1.i
movzx %EAX, WORD PTR [dyn_ltree + 20]
movzx %ECX, WORD PTR [dyn_ltree + 16]
mov DWORD PTR [%ESP + 32], %ECX
movzx %ECX, WORD PTR [dyn_ltree + 12]
movzx %EDX, WORD PTR [dyn_ltree + 8]
movzx %EBX, WORD PTR [dyn_ltree + 4]
mov DWORD PTR [%ESP + 36], %EBX
movzx %EBX, WORD PTR [dyn_ltree]
add DWORD PTR [%ESP + 36], %EBX
add %EDX, DWORD PTR [%ESP + 36]
add %ECX, %EDX
add DWORD PTR [%ESP + 32], %ECX
add %EAX, DWORD PTR [%ESP + 32]
movzx %ECX, WORD PTR [dyn_ltree + 24]
add %EAX, %ECX
mov %ECX, 0
mov %EDX, %ECX
to
.LBBflush_block_1: # loopentry.1.i
movzx %EAX, WORD PTR [dyn_ltree]
movzx %ECX, WORD PTR [dyn_ltree + 4]
add %ECX, %EAX
movzx %EAX, WORD PTR [dyn_ltree + 8]
add %EAX, %ECX
movzx %ECX, WORD PTR [dyn_ltree + 12]
add %ECX, %EAX
movzx %EAX, WORD PTR [dyn_ltree + 16]
add %EAX, %ECX
movzx %ECX, WORD PTR [dyn_ltree + 20]
add %ECX, %EAX
movzx %EAX, WORD PTR [dyn_ltree + 24]
add %ECX, %EAX
mov %EAX, 0
mov %EDX, %EAX
... which results in less spilling in the function.
This change alone speeds up 164.gzip from 37.23s to 36.24s on apoc. The
default isel takes 37.31s.
llvm-svn: 19650
2005-01-18 01:06:26 +00:00
Chris Lattner
a78f9ced61
Fix indentation.
...
llvm-svn: 19649
2005-01-17 23:25:45 +00:00
Chris Lattner
dff1e3e86f
Don't bother using max here.
...
llvm-svn: 19647
2005-01-17 23:02:13 +00:00
Chris Lattner
2d86b43318
Do not give token factor nodes outrageous weights
...
llvm-svn: 19645
2005-01-17 22:56:09 +00:00
Chris Lattner
f2878ce8ba
Two changes:
...
1. Fold [mem] += (1|-1) into inc [mem]/dec [mem] to save some icache space.
2. Do not let token factor nodes prevent forming '[mem] op= val' folds.
llvm-svn: 19643
2005-01-17 22:10:42 +00:00
Chris Lattner
40c0fca632
Refactor load/op/store folding into its own method, no functionality changes.
...
llvm-svn: 19641
2005-01-17 19:25:26 +00:00
Chris Lattner
2348abc421
Fix a major regression last night that prevented us from producing [mem] op= reg
...
operations.
The body of the if is less indented but unmodified in this patch.
llvm-svn: 19638
2005-01-17 17:49:14 +00:00
Chris Lattner
adb669ab1f
Codegen this:
...
int %foo(int %X) {
%T = add int %X, 13
%S = mul int %T, 3
ret int %S
}
as this:
mov %ECX, DWORD PTR [%ESP + 4]
lea %EAX, DWORD PTR [%ECX + 2*%ECX + 39]
ret
instead of this:
mov %ECX, DWORD PTR [%ESP + 4]
mov %EAX, %ECX
add %EAX, 13
imul %EAX, %EAX, 3
ret
llvm-svn: 19633
2005-01-17 06:48:02 +00:00
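Note on r19633 above: the lea form works because (X + 13) * 3 == X + 2*X + 39 in 32-bit wraparound arithmetic, so the add and the multiply both fit into one base + scale*index + displacement address. A short C++ check of the identity, with `viaAddMul` and `viaLea` as illustrative names:

```cpp
#include <cstdint>

// The computation as written in the LLVM source: add 13, then multiply by 3.
inline uint32_t viaAddMul(uint32_t x) { return (x + 13u) * 3u; }

// The computation as a single address expression:
// base = x, index = x, scale = 2, displacement = 13 * 3 = 39.
inline uint32_t viaLea(uint32_t x) { return x + 2u * x + 39u; }
```

Both functions agree for every input, including values where the intermediate sum wraps around.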
Tanya Lattner
5a10531cf8
Added tmp instructions to preserve ssa.
...
llvm-svn: 19632
2005-01-17 06:47:26 +00:00
Chris Lattner
51590b615c
Fix test/Regression/CodeGen/X86/2005-01-17-CycleInDAG.ll and 132.ijpeg.
...
Do not fold a load into an operation if it will induce a cycle in the DAG.
Repeat after me: dAg.
llvm-svn: 19631
2005-01-17 06:26:58 +00:00
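Note on r19631 above: folding a load into an operation merges two DAG nodes into one, so if the operation also reaches the load through some other operand, the merged node would depend on itself. A minimal C++ sketch of such a check follows. The `Dag` representation and both function names are hypothetical and are not taken from the LLVM sources; the search is a plain recursive walk with no visited set, which is fine for a toy graph but not for a real selector.

```cpp
#include <cstddef>
#include <vector>

// dag[n] lists the nodes that node n uses as operands.
using Dag = std::vector<std::vector<std::size_t>>;

// True if `to` can be reached from `from` by following operand edges.
inline bool reaches(const Dag &dag, std::size_t from, std::size_t to) {
  if (from == to)
    return true;
  for (std::size_t operand : dag[from])
    if (reaches(dag, operand, to))
      return true;
  return false;
}

// Folding `load` into `op` is unsafe if any other operand of `op` also
// depends on `load`: after the merge that operand would both feed and be fed
// by the combined node.
inline bool foldWouldCreateCycle(const Dag &dag, std::size_t op,
                                 std::size_t load) {
  for (std::size_t operand : dag[op])
    if (operand != load && reaches(dag, operand, load))
      return true;
  return false;
}
```

For instance, with node 0 as the load, node 1 using node 0, and node 2 using both, folding node 0 into node 2 is rejected.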
Chris Lattner
f1e85bec5a
Do not fold a load into a comparison that is used by more than one place.
...
The comparison will probably be folded, so this is not ok to do.
This fixed 197.parser.
llvm-svn: 19624
2005-01-17 01:34:14 +00:00
Chris Lattner
1b8c8fe020
Do not codegen 'xor bool, true' as 'not reg'. not reg inverts the upper bits
...
of the bytereg. This fixes yacr2, 300.twolf and probably others.
llvm-svn: 19622
2005-01-17 00:23:16 +00:00
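Note on r19622 above: a bool lives in an 8-bit register as 0 or 1, so inverting every bit does not produce a valid bool. A two-function C++ illustration, with `notReg` and `xorTrue` as names chosen for this note:

```cpp
#include <cstdint>

// What "not reg" does to a byte register: every bit is inverted.
inline uint8_t notReg(uint8_t b) { return static_cast<uint8_t>(~b); }

// What "xor bool, true" means: only the low bit is flipped.
inline uint8_t xorTrue(uint8_t b) { return static_cast<uint8_t>(b ^ 1u); }
```

Applying `notReg` to 1 gives 0xFE, which is nonzero and therefore still reads as true, whereas `xorTrue` gives 0.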
Chris Lattner
46dac4394c
Set up the shift and setcc types.
...
If we emit a load because we followed a token chain to get to it, try to
fold it into its single user if possible.
llvm-svn: 19620
2005-01-17 00:00:33 +00:00
Chris Lattner
4c88cc95ee
Shift and setcc types default to the pointer type.
...
llvm-svn: 19619
2005-01-16 23:59:48 +00:00
Tanya Lattner
fea188af7e
Added parameters to a few functions in order to allow me to change the functions to preserve SSA
...
llvm-svn: 19615
2005-01-16 08:51:10 +00:00
Chris Lattner
9ffc59287e
* Adjust to changes in TargetLowering interfaces.
...
* Remove custom promotion for bool and byte select ops. Legalize now
promotes them for us.
* Allow folding ConstantPoolIndexes into EXTLOAD's, useful for float immediates.
* Better declare which operations are not supported.
* Add some hacky code for TRUNCSTORE to pretend that we have truncstore
for i16 types. This is useful for testing promotion code because I can
just remove 16-bit registers altogether and verify that programs work.
llvm-svn: 19614
2005-01-16 07:34:08 +00:00
Chris Lattner
b49d2a7b0f
Use enums, move virtual dtor out of line.
...
llvm-svn: 19610
2005-01-16 07:28:11 +00:00
Chris Lattner
be2a427f51
cycles_t -> CycleCount_t
...
llvm-svn: 19604
2005-01-16 04:20:30 +00:00
Reid Spencer
afa1cb9e11
Rename BUILD_* to PROJ_*
...
llvm-svn: 19592
2005-01-16 02:21:29 +00:00
Tanya Lattner
66cf1a6f82
Fixed a couple of instructions that broke SSA.
...
llvm-svn: 19587
2005-01-16 02:14:17 +00:00
Chris Lattner
605b9a23a2
Improve compatibility with HPUX on Itanium, patch by Duraid Madina
...
llvm-svn: 19586
2005-01-16 01:31:31 +00:00
Chris Lattner
06c297f8ca
Set up identity transforms.
...
llvm-svn: 19584
2005-01-16 01:20:18 +00:00
Chris Lattner
1d0e1ffe02
Move some information out of LegalizeDAG into the generic Target interface.
...
llvm-svn: 19581
2005-01-16 01:10:58 +00:00
Chris Lattner
98611ce291
Add a new target-independent code generator flag.
...
llvm-svn: 19567
2005-01-15 06:00:32 +00:00
Chris Lattner
f3d950e816
Add support for truncstore and *extload.
...
llvm-svn: 19566
2005-01-15 05:22:24 +00:00
Chris Lattner
27c91fac94
Adjust to CopyFromReg changes.
...
llvm-svn: 19561
2005-01-14 22:37:41 +00:00
Chris Lattner
c032990335
Fix Regression/CodeGen/PowerPC/2005-01-14-UndefLong.ll
...
llvm-svn: 19557
2005-01-14 20:22:02 +00:00
Chris Lattner
b0b49268c4
Fix: Regression/CodeGen/PowerPC/2005-01-14-SetSelectCrash.ll
...
llvm-svn: 19555
2005-01-14 19:31:00 +00:00
Chris Lattner
7a8788c9ac
Add new ImplicitDef node, rename CopyRegSDNode class to RegSDNode.
...
llvm-svn: 19535
2005-01-13 20:50:02 +00:00
Chris Lattner
fce6a5439d
Codegen factor nodes more intelligently according to perceived register pressure.
...
llvm-svn: 19532
2005-01-13 19:56:00 +00:00
Chris Lattner
cb4359465a
Initial trivial (but stupid) codegen for this node.
...
llvm-svn: 19529
2005-01-13 18:01:36 +00:00
Chris Lattner
9a70166615
Add some really pedantic assertions to the load folding code. Fix a bunch
...
of cases where we accidentally emitted a load folded once and unfolded
elsewhere.
llvm-svn: 19522
2005-01-13 05:53:16 +00:00
Chris Lattner
2ab70aafe0
We can only fold a load into an op if there is exactly one use of the value.
...
Checking to see if the load has two uses is not equivalent, as the chain
value may have zero uses.
llvm-svn: 19518
2005-01-12 18:38:26 +00:00
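Note on r19518 above: a load node produces two results, the loaded value and an output chain, and each has its own users. The C++ sketch below contrasts the two checks; `LoadUses` and both predicates are hypothetical stand-ins for whatever the selector actually inspects.

```cpp
// Use counts for the two results of a load node.
struct LoadUses {
  unsigned valueUses;  // users of the loaded value
  unsigned chainUses;  // users of the output chain
};

// The check described as wrong: "the load has two uses in total".
inline bool hasTwoUsesInTotal(LoadUses u) {
  return u.valueUses + u.chainUses == 2;
}

// The check described as right: the value itself has exactly one user.
inline bool valueHasSingleUse(LoadUses u) { return u.valueUses == 1; }
```

The two disagree when the value has two users and the chain has none: the total is still two, but folding the load into one user would leave the other without its operand.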
Chris Lattner
4b03f0f99e
Try both ways to fold an add together. This allows us to generate this code
...
imul %EAX, %EAX, 400
add %ECX, %EAX
add %ESI, DWORD PTR [%ECX + 4*%EDX]
inc %EDX
cmp %EDX, 100
instead of this:
imul %EAX, %EAX, 400
add %ECX, %EAX
mov %EAX, %EDX
shl %EAX, 2
add %ECX, %EAX
add %ESI, DWORD PTR [%ECX]
inc %EDX
cmp %EDX, 100
llvm-svn: 19513
2005-01-12 18:08:53 +00:00
Chris Lattner
61c572eb7f
Fix a major miscompilation where we were overwriting the scale reg.
...
llvm-svn: 19511
2005-01-12 07:33:20 +00:00
Chris Lattner
5816f1a302
Do not use the type of the RHS constant to determine the type of the operation.
...
This fails for shifts because the constant is always 8 bits.
llvm-svn: 19508
2005-01-12 05:22:07 +00:00
Chris Lattner
89d6b21ae6
Do not lose the offset from the global when peephole optimizing instructions.
...
This fixes FreeBench/pcompress
llvm-svn: 19507
2005-01-12 05:17:28 +00:00
Jeff Cohen
614a5ec22a
Fix more C++ compilation errors
...
llvm-svn: 19504
2005-01-12 04:29:05 +00:00
Chris Lattner
5ef92f3a40
Fix a compile error with VC++, which thinks that static const arrays need
...
to be dynamically initialized. :(
llvm-svn: 19503
2005-01-12 04:23:22 +00:00
Chris Lattner
627c64e5e5
Fix a bug that caused us to crash on povray. We weren't emitting an FP_REG_KILL into a block that had a successor with a FP PHI node.
...
llvm-svn: 19502
2005-01-12 04:21:28 +00:00
Chris Lattner
a5f0ba59a0
Print a load of a null pointer (in Intel mode) like this:
...
mov %AX, WORD PTR [0]
instead of like this:
mov %AX, WORD PTR []
llvm-svn: 19501
2005-01-12 04:07:11 +00:00
Chris Lattner
360988bae2
Print a load of a null pointer like this:
...
movw 0, %ax
instead of like this:
movw , %ax
llvm-svn: 19500
2005-01-12 04:05:19 +00:00
Chris Lattner
3c85c67c97
Fix a crash compiling povray on UINT_TO_FP from i16.
...
llvm-svn: 19499
2005-01-12 04:00:00 +00:00
Chris Lattner
4e72a2a000
There are no [mem] op= reg instructions for FP, so remove their entries.
...
llvm-svn: 19496
2005-01-12 03:16:09 +00:00
Chris Lattner
00cb0ace9b
Fix a bug where we didn't insert FP_REG_KILL instructions into MBB's that
...
contain FP PHI nodes but no other FP defining instructions. This fixes
183.equake
llvm-svn: 19495
2005-01-12 02:57:10 +00:00
Chris Lattner
92166ed1df
Fold TRUNCATE (LOAD P) into a smaller load from P.
...
llvm-svn: 19494
2005-01-12 02:19:06 +00:00
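Note on r19494 above: on a little-endian target such as x86, the low bytes of a wide value sit at the same address as the value itself, so truncating a 32-bit load gives the same result as a 16-bit load from the same pointer. A C++ sketch of that property follows; it assumes a little-endian host, and the function names are illustrative.

```cpp
#include <cstdint>
#include <cstring>

// TRUNCATE (LOAD P): load 32 bits, then keep the low 16.
inline uint16_t truncOfWideLoad(const unsigned char *p) {
  uint32_t wide;
  std::memcpy(&wide, p, sizeof wide);
  return static_cast<uint16_t>(wide);
}

// The folded form: load only 16 bits from the same address.
inline uint16_t narrowLoad(const unsigned char *p) {
  uint16_t narrow;
  std::memcpy(&narrow, p, sizeof narrow);
  return narrow;
}
```

On a big-endian machine the two would differ, because the low-order bytes would be at the end of the wide value rather than at its start.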
Chris Lattner
258b23bd9d
Be more careful about order of arg evaluation for CopyToReg nodes. This shrinks
...
256.bzip2 from 7142 to 7103 lines of .s file.
Second, add initial support for folding loads into compares, though this code
is dynamically dead for now. :(
llvm-svn: 19493
2005-01-12 02:02:48 +00:00
Chris Lattner
604416e8f4
Fold some more [mem] op= val operators. This allows us to do things like this
...
several times in 256.bzip2:
mov %EAX, DWORD PTR [%ESP + 204]
- mov %EAX, DWORD PTR [%EAX]
- or %EAX, 2097152
- mov %ECX, DWORD PTR [%ESP + 204]
- mov DWORD PTR [%ECX], %EAX
+ or DWORD PTR [%EAX], 2097152
llvm-svn: 19492
2005-01-12 01:28:00 +00:00
Chris Lattner
e83ae1063f
Fold loads into sign/zero extends. Instead of:
...
mov %AL, BYTE PTR [%EDX + l18_length_code]
movzx %EAX, %AL
Emit:
movzx %EAX, BYTE PTR [%EDX + l18_length_code]
llvm-svn: 19489
2005-01-11 23:33:00 +00:00
Chris Lattner
87a38bd4a8
Comment out debug code :)
...
Select [mem] += Val operations. For constants, we used to get:
mov %ECX, -32768
add %ECX, DWORD PTR [l4_match_start]
mov DWORD PTR [l4_match_start], %ECX
Now we get:
add DWORD PTR [l4_match_start], -32768
For other values we used to get:
mov %EBP, %EDI ;; because the add destroys the value
add %EBP, DWORD PTR [l4_input_len]
mov DWORD PTR [l4_input_len], %EBP
now we get:
add DWORD PTR [l4_input_len], %EDI
Both of these use fewer registers than the alternative, and are faster and smaller.
llvm-svn: 19488
2005-01-11 23:21:30 +00:00
Chris Lattner
282473a25d
Handle the global address case here, not just the offset case.
...
llvm-svn: 19487
2005-01-11 22:58:43 +00:00
Chris Lattner
9eb2cc700b
Treat int constants as not requiring a register, since they are almost always
...
folded into an instruction.
llvm-svn: 19486
2005-01-11 22:29:12 +00:00
Chris Lattner
7cb2220907
* Factor a bunch of binary operator cases into shared code.
...
* Fold loads into Add, sub, and, or, xor and mul when possible.
* Codegen shl X, 1 as add X, X
llvm-svn: 19483
2005-01-11 21:19:59 +00:00
Chris Lattner
b1a72cb39a
Clear the whole array, always.
...
llvm-svn: 19482
2005-01-11 20:25:26 +00:00
Chris Lattner
b838c9748e
Fold multiplies by 3,5,9 into addressing modes when possible.
...
llvm-svn: 19480
2005-01-11 19:37:02 +00:00
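Note on r19480 above: multiplies by 3, 5, and 9 fit an x86 address because the address form base + scale*index allows scales of 2, 4, and 8, and using the same register as base and index gives x + scale*x. A small C++ sketch, with `leaScaleFor` and `baseScaleIndex` as hypothetical helpers:

```cpp
#include <cstdint>

// For a multiplier of 3, 5, or 9, return the scale s with x*mul == x + s*x.
// Returns 0 when the multiplier cannot be expressed this way.
inline unsigned leaScaleFor(unsigned mul) {
  switch (mul) {
  case 3: return 2;
  case 5: return 4;
  case 9: return 8;
  default: return 0;
  }
}

// Value computed by an address of the form [base + scale*index].
inline uint32_t baseScaleIndex(uint32_t base, unsigned scale, uint32_t index) {
  return base + scale * index;
}
```

For example, `baseScaleIndex(x, leaScaleFor(5), x)` equals `x * 5` for any 32-bit `x`.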
Chris Lattner
e7b1130b01
Instead of generating stuff like this:
...
mov %ECX, %EAX
add %ECX, 32768
mov %SI, WORD PTR [2*%ECX + l13_prev]
Generate this:
mov %SI, WORD PTR [2*%ECX + l13_prev + 65536]
This occurs when you have a GEP instruction where an index is
"something + imm".
llvm-svn: 19472
2005-01-11 06:36:20 +00:00
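Note on r19472 above: the fold rests on scale*(index + imm) + disp == scale*index + (disp + scale*imm), which is how an add of 32768 to the index becomes 65536 in the displacement when the scale is 2. A C++ sketch with a simplified address; `Address`, `effective`, and `foldIndexAdd` are names made up for this note, and the base register is left out.

```cpp
#include <cstdint>

// A simplified x86 address: scale*index + disp (base register omitted).
struct Address {
  unsigned scale;
  uint32_t index;
  uint32_t disp;
};

// The address the hardware would compute, with 32-bit wraparound.
inline uint32_t effective(Address a) { return a.scale * a.index + a.disp; }

// Absorb "index + imm" into the displacement, leaving the index register
// untouched so that no separate add instruction is needed.
inline Address foldIndexAdd(Address a, uint32_t imm) {
  a.disp += a.scale * imm;
  return a;
}
```

The folded address evaluates to the same location as the address built from the explicitly incremented index.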
Chris Lattner
bb63a09cd1
Implement MEMCPY natively in terms of rep movs*
...
llvm-svn: 19468
2005-01-11 06:19:26 +00:00
Chris Lattner
b2b08a8bc1
Implement memset -> rep stos*
...
llvm-svn: 19467
2005-01-11 06:14:36 +00:00
Chris Lattner
58816a9e81
Announce that we don't support mem ops yet.
...
llvm-svn: 19466
2005-01-11 05:57:36 +00:00
Chris Lattner
f867443d7e
Teach the address selector to make 'reg+reg' addressing modes.
...
llvm-svn: 19457
2005-01-11 04:40:19 +00:00
Chris Lattner
edf06be50e
Emit NOT instructions.
...
llvm-svn: 19455
2005-01-11 04:31:30 +00:00
Chris Lattner
4e4bef2d6c
Fix a bug emitting branches that broke a lot of programs.
...
llvm-svn: 19452
2005-01-11 04:06:27 +00:00
Chris Lattner
4b51297a94
Be more careful where we set ContainsFPCode. We were missing a set in the
...
int -> FP casting code. Note that we don't have to set it for FP operations
that take FP values as operands: whatever produces the FP value will set the
flag.
llvm-svn: 19451
2005-01-11 03:50:45 +00:00
Chris Lattner
0c4c4094e3
Fix a major bug in setcc/cmov folding, where we accidentally
...
inverted the sense of the comparison.
llvm-svn: 19450
2005-01-11 03:37:59 +00:00
Chris Lattner
d188e03011
Take register pressure into account when we have to decide whether to
...
evaluate the LHS or the RHS of an operation first. This causes good things
to happen. For example, instead of compiling a loop to this:
.LBBstrength_result7_1: # loopentry
movl 16(%esp), %edi
movl (%edi), %edi ;;; LOAD
movl (%ecx), %ebx
movl $2, (%eax,%ebx,4)
movl (%edx), %ebx
movl %esi, %ebp
addl $21, %ebp
addl $42, %esi
cmpl $0, %edi ;;; USE
cmovne %esi, %ebp
cmpl %ebp, %ebx
movl %ebp, %esi
jg .LBBstrength_result7_1
We now compile it to this:
.LBBstrength_result7_1: # loopentry
movl %edi, %ebx
addl $42, %ebx
addl $21, %edi
movl (%ecx), %ebp ;; LOAD
cmpl $0, %ebp ;; USE
cmovne %ebx, %edi
movl (%edx), %ebx
movl $2, (%eax,%ebx,4)
movl (%esi), %ebx
cmpl %edi, %ebx
jg .LBBstrength_result7_1
Which reduces register pressure enough (in this case) to avoid spilling in the
loop.
As another example, consider the CodeGen/X86/regpressure.ll testcase. We
used to generate this code for both cases:
regpressure1:
subl $32, %esp
movl %esi, 12(%esp)
movl %edi, 8(%esp)
movl %ebx, 4(%esp)
movl %ebp, (%esp)
movl 36(%esp), %ecx
movl (%ecx), %eax
movl 4(%ecx), %edx
movl %edx, 24(%esp)
movl 8(%ecx), %edx
movl %edx, 16(%esp)
movl 12(%ecx), %edx
movl 16(%ecx), %esi
movl 20(%ecx), %edi
movl 24(%ecx), %ebx
movl %ebx, 28(%esp)
movl 28(%ecx), %ebx
movl 32(%ecx), %ebp
movl %ebp, 20(%esp)
movl 36(%ecx), %ecx
imull 24(%esp), %eax
imull 16(%esp), %eax
imull %edx, %eax
imull %esi, %eax
imull %edi, %eax
imull 28(%esp), %eax
imull %ebx, %eax
imull 20(%esp), %eax
imull %ecx, %eax
movl (%esp), %ebp
movl 4(%esp), %ebx
movl 8(%esp), %edi
movl 12(%esp), %esi
addl $32, %esp
ret
This code is basically trying to do all of the loads first, then execute all
of the multiplies. Because we run out of registers, lots of spill code happens.
We now generate this code for both cases:
regpressure1:
movl 4(%esp), %ecx
movl (%ecx), %eax
movl 4(%ecx), %edx
imull %edx, %eax
movl 8(%ecx), %edx
imull %edx, %eax
movl 12(%ecx), %edx
imull %edx, %eax
movl 16(%ecx), %edx
imull %edx, %eax
movl 20(%ecx), %edx
imull %edx, %eax
movl 24(%ecx), %edx
imull %edx, %eax
movl 28(%ecx), %edx
imull %edx, %eax
movl 32(%ecx), %edx
imull %edx, %eax
movl 36(%ecx), %ecx
imull %ecx, %eax
ret
which is much nicer (when we fold loads into the muls it will be even better).
The old instruction selector used to produce the good code for regpressure1
but not for regpressure2, as it depended on the order of operations in the
LLVM code.
llvm-svn: 19449
2005-01-11 03:11:44 +00:00
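Note on r19449 above: the commit does not name the heuristic it uses. As an illustration of the general idea, here is the classic Sethi-Ullman numbering, swapped in as a stand-in: estimate how many registers each operand subtree needs and evaluate the hungrier side first. `Expr`, `regsNeeded`, and `evaluateLhsFirst` are hypothetical names, and every interior node is assumed to have exactly two operands.

```cpp
#include <algorithm>

// A binary expression tree; both pointers are null for a leaf (e.g. a load).
struct Expr {
  const Expr *lhs;
  const Expr *rhs;
};

// Sethi-Ullman style estimate of the registers needed to evaluate e.
inline unsigned regsNeeded(const Expr *e) {
  if (!e->lhs && !e->rhs)
    return 1;  // a leaf needs one register to hold its value
  unsigned l = regsNeeded(e->lhs);
  unsigned r = regsNeeded(e->rhs);
  // Evaluating the hungrier side first lets the other side reuse its
  // registers; a tie costs one extra register to hold the first result.
  return l == r ? l + 1 : std::max(l, r);
}

// Decide the evaluation order for a binary node: hungrier operand first.
inline bool evaluateLhsFirst(const Expr *e) {
  return regsNeeded(e->lhs) >= regsNeeded(e->rhs);
}
```

A left-deep chain of multiplies over ten loads, as in the regpressure testcase, needs only two registers under this estimate when each load is evaluated just before its multiply, which matches the shape of the improved code.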
Chris Lattner
497e24c885
Fold setcc instructions into selects.
...
llvm-svn: 19438
2005-01-10 22:10:13 +00:00
Chris Lattner
65d007ab62
Add conditional moves for the parity flag.
...
llvm-svn: 19437
2005-01-10 22:09:33 +00:00
Chris Lattner
d61491dea2
Implement 8-bit multiply for X86.
...
llvm-svn: 19435
2005-01-10 20:55:48 +00:00
Chris Lattner
fcab5f75c0
Codegen (Reg|imm)+&GV as an LEA, because we cannot put it into the immediate field
...
of an ADDri (due to current restrictions on MachineOperand :( ). This allows
us to generate:
leal Data+16000, %edx
instead of:
movl $Data, %edx
addl $16000, %edx
llvm-svn: 19420
2005-01-09 20:20:29 +00:00
Chris Lattner
35375c11bf
Fix copy and pasto's for FP -> Int. This fixes fldry
...
llvm-svn: 19418
2005-01-09 19:49:59 +00:00
Chris Lattner
45155a3dee
Initial implementation of FP->INT and INT->FP casts
...
Also, fix zero_extend from bool to i8, which fixes Shootout/objinst.
llvm-svn: 19414
2005-01-09 18:52:44 +00:00
Chris Lattner
9ca9b20447
Fix a subtle bug involving constant expr casts from int to fp
...
llvm-svn: 19410
2005-01-09 01:49:29 +00:00
Chris Lattner
c5e53c07fd
Implement varargs and returnaddress/frameaddress intrinsics. With this
...
patch, all of SingleSource/UnitTests passes.
llvm-svn: 19408
2005-01-09 00:01:27 +00:00
Chris Lattner
ca81756527
Okay 15th time is the charm. Looking at the vector size is useless as it
...
gets clobbered by a previous statement. This fixes all calls finally.
llvm-svn: 19399
2005-01-08 20:51:36 +00:00