Chris Lattner
e83ae1063f
Fold loads into sign/zero extends. instead of:
...
mov %AL, BYTE PTR [%EDX + l18_length_code]
movzx %EAX, %AL
Emit:
movzx %EAX, BYTE PTR [%EDX + l18_length_code]
llvm-svn: 19489
2005-01-11 23:33:00 +00:00
Chris Lattner
87a38bd4a8
Comment out debug code :)
...
Select [mem] += Val operations. For constants, we used to get:
mov %ECX, -32768
add %ECX, DWORD PTR [l4_match_start]
mov DWORD PTR [l4_match_start], %ECX
Now we get:
add DWORD PTR [l4_match_start], -32768
For other values we used to get:
mov %EBP, %EDI ;; because the add destroys the value
add %EBP, DWORD PTR [l4_input_len]
mov DWORD PTR [l4_input_len], %EBP
now we get:
add DWORD PTR [l4_input_len], %EDI
Both of these use less registers than the alternative, are faster and smaller.
llvm-svn: 19488
2005-01-11 23:21:30 +00:00
Chris Lattner
282473a25d
Handle the global address case here, not just the offset case.
...
llvm-svn: 19487
2005-01-11 22:58:43 +00:00
Chris Lattner
9eb2cc700b
Treat int constants as not requiring a register, since they are almost always
...
folded into an instruction.
llvm-svn: 19486
2005-01-11 22:29:12 +00:00
Chris Lattner
74fcfd5148
Print the value types in the nodes of the graph
...
llvm-svn: 19485
2005-01-11 22:21:04 +00:00
Chris Lattner
f588cdd51e
add an assertion, avoid creating copyfromreg/copytoreg pairs that are the
...
same for PHI nodes.
llvm-svn: 19484
2005-01-11 22:03:46 +00:00
Chris Lattner
7cb2220907
* Factor a bunch of binary operator cases into shared code.
...
* Fold loads into Add, sub, and, or, xor and mul when possible.
* Codegen shl X, 1 as add X, X
llvm-svn: 19483
2005-01-11 21:19:59 +00:00
Chris Lattner
b1a72cb39a
Clear the whole array, always.
...
llvm-svn: 19482
2005-01-11 20:25:26 +00:00
Chris Lattner
b838c9748e
Fold multiplies by 3,5,9 into addressing modes when possible.
...
llvm-svn: 19480
2005-01-11 19:37:02 +00:00
Chris Lattner
8de5a27681
Squelch optimized warning.
...
llvm-svn: 19475
2005-01-11 17:46:49 +00:00
Chris Lattner
e7b1130b01
Instead of generating stuff like this:
...
mov %ECX, %EAX
add %ECX, 32768
mov %SI, WORD PTR [2*%ECX + l13_prev]
Generate this:
mov %SI, WORD PTR [2*%ECX + l13_prev + 65536]
This occurs when you have a GEP instruction where an index is
"something + imm".
llvm-svn: 19472
2005-01-11 06:36:20 +00:00
Chris Lattner
bb63a09cd1
Implement MEMCPY natively in terms of rep movs*
...
llvm-svn: 19468
2005-01-11 06:19:26 +00:00
Chris Lattner
b2b08a8bc1
Implement memset -> rep stos*
...
llvm-svn: 19467
2005-01-11 06:14:36 +00:00
Chris Lattner
58816a9e81
Announce that we don't support mem ops yet.
...
llvm-svn: 19466
2005-01-11 05:57:36 +00:00
Chris Lattner
963af6652b
Teach legalize to lower MEMSET/MEMCPY/MEMMOVE operations if the target
...
does not support them.
llvm-svn: 19465
2005-01-11 05:57:22 +00:00
Chris Lattner
6b9082114f
Print new operations.
...
llvm-svn: 19464
2005-01-11 05:57:01 +00:00
Chris Lattner
7cde8a2658
Turn memset/memcpy/memmove into the corresponding operations.
...
llvm-svn: 19463
2005-01-11 05:56:49 +00:00
Chris Lattner
f867443d7e
Teach the address selector to make 'reg+reg' addressing modes.
...
llvm-svn: 19457
2005-01-11 04:40:19 +00:00
Reid Spencer
7e9642515c
Add the LOADABLE_MODULE=1 directive to indicate that this shared library is
...
intended to be a dlopenable module and not a "plain" shared library.
llvm-svn: 19456
2005-01-11 04:33:32 +00:00
Chris Lattner
edf06be50e
Emit NOT instructions.
...
llvm-svn: 19455
2005-01-11 04:31:30 +00:00
Chris Lattner
2eacd11a86
shift X, 0 -> X
...
llvm-svn: 19453
2005-01-11 04:25:13 +00:00
Chris Lattner
4e4bef2d6c
Fix a bug emitting branches that broke a lot of programs.
...
llvm-svn: 19452
2005-01-11 04:06:27 +00:00
Chris Lattner
4b51297a94
Be more careful where we set ContainsFPCode. We were missing a set in the
...
int -> FP casting code. Note that we don't have to set it for FP operations
that take FP values as operands: whatever produces the FP value will set the
flag.
llvm-svn: 19451
2005-01-11 03:50:45 +00:00
Chris Lattner
0c4c4094e3
Fix a major bug in setcc/cmov folding, where we accidentally
...
inverted the sense of the comparison.
llvm-svn: 19450
2005-01-11 03:37:59 +00:00
Chris Lattner
d188e03011
Take register pressure into account when we have to decide whether to
...
evaluate the LHS or the RHS of an operation first. This causes good things
to happen. For example, instead of compiling a loop to this:
.LBBstrength_result7_1: # loopentry
movl 16(%esp), %edi
movl (%edi), %edi ;;; LOAD
movl (%ecx), %ebx
movl $2, (%eax,%ebx,4)
movl (%edx), %ebx
movl %esi, %ebp
addl $21, %ebp
addl $42, %esi
cmpl $0, %edi ;;; USE
cmovne %esi, %ebp
cmpl %ebp, %ebx
movl %ebp, %esi
jg .LBBstrength_result7_1
We now compile it to this:
.LBBstrength_result7_1: # loopentry
movl %edi, %ebx
addl $42, %ebx
addl $21, %edi
movl (%ecx), %ebp ;; LOAD
cmpl $0, %ebp ;; USE
cmovne %ebx, %edi
movl (%edx), %ebx
movl $2, (%eax,%ebx,4)
movl (%esi), %ebx
cmpl %edi, %ebx
jg .LBBstrength_result7_1
Which reduces register pressure enough (in this case) to avoid spilling in the
loop.
As another example, consider the CodeGen/X86/regpressure.ll testcase. We
used to generate this code for both cases:
regpressure1:
subl $32, %esp
movl %esi, 12(%esp)
movl %edi, 8(%esp)
movl %ebx, 4(%esp)
movl %ebp, (%esp)
movl 36(%esp), %ecx
movl (%ecx), %eax
movl 4(%ecx), %edx
movl %edx, 24(%esp)
movl 8(%ecx), %edx
movl %edx, 16(%esp)
movl 12(%ecx), %edx
movl 16(%ecx), %esi
movl 20(%ecx), %edi
movl 24(%ecx), %ebx
movl %ebx, 28(%esp)
movl 28(%ecx), %ebx
movl 32(%ecx), %ebp
movl %ebp, 20(%esp)
movl 36(%ecx), %ecx
imull 24(%esp), %eax
imull 16(%esp), %eax
imull %edx, %eax
imull %esi, %eax
imull %edi, %eax
imull 28(%esp), %eax
imull %ebx, %eax
imull 20(%esp), %eax
imull %ecx, %eax
movl (%esp), %ebp
movl 4(%esp), %ebx
movl 8(%esp), %edi
movl 12(%esp), %esi
addl $32, %esp
ret
This code is basically trying to do all of the loads first, then execute all
of the multiplies. Because we run out of registers, lots of spill code happens.
We now generate this code for both cases:
regpressure1:
movl 4(%esp), %ecx
movl (%ecx), %eax
movl 4(%ecx), %edx
imull %edx, %eax
movl 8(%ecx), %edx
imull %edx, %eax
movl 12(%ecx), %edx
imull %edx, %eax
movl 16(%ecx), %edx
imull %edx, %eax
movl 20(%ecx), %edx
imull %edx, %eax
movl 24(%ecx), %edx
imull %edx, %eax
movl 28(%ecx), %edx
imull %edx, %eax
movl 32(%ecx), %edx
imull %edx, %eax
movl 36(%ecx), %ecx
imull %ecx, %eax
ret
which is much nicer (when we fold loads into the muls it will be even better).
The old instruction selector used to produce the good code for regpressure1
but not for regpressure2, as it depended on the order of operations in the
LLVM code.
llvm-svn: 19449
2005-01-11 03:11:44 +00:00
Chris Lattner
07a3ade230
Print SelectionDAGs bottom up, include extra info in the node labels
...
llvm-svn: 19447
2005-01-11 00:34:33 +00:00
Chris Lattner
1c273d3a14
Add a marker for the graph root.
...
llvm-svn: 19445
2005-01-10 23:52:04 +00:00
Chris Lattner
daa052a97e
Put the operation name in each node, put the function name on the graph.
...
llvm-svn: 19444
2005-01-10 23:26:00 +00:00
Chris Lattner
0307506841
Split out SDNode::getOperationName into its own method.
...
llvm-svn: 19443
2005-01-10 23:25:25 +00:00
Chris Lattner
8c13447254
Implement initial selectiondag printing support. This gets us a nice
...
graph with no labels! :)
llvm-svn: 19441
2005-01-10 23:08:40 +00:00
Chris Lattner
497e24c885
Fold setcc instructions into selects.
...
llvm-svn: 19438
2005-01-10 22:10:13 +00:00
Chris Lattner
65d007ab62
Add conditional moves for the parity flag.
...
llvm-svn: 19437
2005-01-10 22:09:33 +00:00
Chris Lattner
5433d8de29
Lower to the correct functions. This fixes FreeBench/fourinarow
...
llvm-svn: 19436
2005-01-10 21:02:37 +00:00
Chris Lattner
d61491dea2
Implement 8-bit multiply for X86.
...
llvm-svn: 19435
2005-01-10 20:55:48 +00:00
Chris Lattner
b35b30c283
Rework constant pool handling so that function constant pools are no longer
...
leaked to the system. Now they are destroyed with the JITMemoryManager is
destroyed.
llvm-svn: 19434
2005-01-10 18:23:22 +00:00
Jeff Cohen
8b03a55724
Apply feedback from Chris.
...
llvm-svn: 19432
2005-01-10 04:23:32 +00:00
Jeff Cohen
a7f1ae5dc0
Apply feed back from Chris:
...
1. Rename createLoaderPass to CreateProfileLoaderPass
2. Opt shouldn't use the pass registered in CodeGen.
llvm-svn: 19431
2005-01-10 03:56:27 +00:00
Chris Lattner
02236df007
Implement a couple of more simplifications. This lets us codegen:
...
int test2(int * P, int* Q, int A, int B) {
return P+A == P;
}
into:
test2:
movl 4(%esp), %eax
movl 12(%esp), %eax
shll $2, %eax
cmpl $0, %eax
sete %al
movzbl %al, %eax
ret
instead of:
test2:
movl 4(%esp), %eax
movl 12(%esp), %ecx
leal (%eax,%ecx,4), %ecx
cmpl %eax, %ecx
sete %al
movzbl %al, %eax
ret
ICC is producing worse code:
test2:
movl 4(%esp), %eax #8.5
movl 12(%esp), %edx #8.5
lea (%edx,%edx), %ecx #9.9
addl %ecx, %ecx #9.9
addl %eax, %ecx #9.9
cmpl %eax, %ecx #9.16
movl $0, %eax #9.16
sete %al #9.16
ret #9.16
as is GCC (looks like our old code):
test2:
movl 4(%esp), %edx
movl 12(%esp), %eax
leal (%edx,%eax,4), %ecx
cmpl %edx, %ecx
sete %al
movzbl %al, %eax
ret
llvm-svn: 19430
2005-01-10 02:03:02 +00:00
Chris Lattner
8d09b03ed1
Fix incorrect constant folds, fixing Stepanov after the SHR patch.
...
llvm-svn: 19429
2005-01-10 01:16:03 +00:00
Chris Lattner
9d479d4a34
Constant fold shifts, turning this loop:
...
.LBB_Z5test0PdS__3: # no_exit.1
fldl data(,%eax,8)
fldl 24(%esp)
faddp %st(1)
fstl 24(%esp)
incl %eax
movl $16000, %ecx
sarl $3, %ecx
cmpl %eax, %ecx
fstpl 16(%esp)
#FP_REG_KILL
jg .LBB_Z5test0PdS__3 # no_exit.1
into:
.LBB_Z5test0PdS__3: # no_exit.1
fldl data(,%eax,8)
fldl 24(%esp)
faddp %st(1)
fstl 24(%esp)
incl %eax
cmpl $2000, %eax
fstpl 16(%esp)
#FP_REG_KILL
jl .LBB_Z5test0PdS__3 # no_exit.1
llvm-svn: 19427
2005-01-10 00:07:15 +00:00
Reid Spencer
283688b80d
Rename Unix/*.cpp and Win32/*.cpp to have a *.inc suffix so that the silly
...
gdb debugger doesn't get confused on which file it is reading (the one in
lib/System or the one in lib/System/{Win32,Unix})
llvm-svn: 19426
2005-01-09 23:29:00 +00:00
Chris Lattner
59d7066da8
Add some folds for == and != comparisons. This allows us to
...
codegen this loop in stepanov:
no_exit.i: ; preds = %entry, %no_exit.i, %then.i, %_Z5checkd.exit
%i.0.0 = phi int [ 0, %entry ], [ %i.0.0, %no_exit.i ], [ %inc.0, %_Z5checkd.exit ], [ %inc.012, %then.i ] ; <int> [#uses=3]
%indvar = phi uint [ %indvar.next, %no_exit.i ], [ 0, %entry ], [ 0, %then.i ], [ 0, %_Z5checkd.exit ] ; <uint> [#uses=3]
%result_addr.i.0 = phi double [ %tmp.4.i.i, %no_exit.i ], [ 0.000000e+00, %entry ], [ 0.000000e+00, %then.i ], [ 0.000000e+00, %_Z5checkd.exit ] ; <double> [#uses=1]
%first_addr.0.i.2.rec = cast uint %indvar to int ; <int> [#uses=1]
%first_addr.0.i.2 = getelementptr [2000 x double]* %data, int 0, uint %indvar ; <double*> [#uses=1]
%inc.i.rec = add int %first_addr.0.i.2.rec, 1 ; <int> [#uses=1]
%inc.i = getelementptr [2000 x double]* %data, int 0, int %inc.i.rec ; <double*> [#uses=1]
%tmp.3.i.i = load double* %first_addr.0.i.2 ; <double> [#uses=1]
%tmp.4.i.i = add double %result_addr.i.0, %tmp.3.i.i ; <double> [#uses=2]
%tmp.2.i = seteq double* %inc.i, getelementptr ([2000 x double]* %data, int 0, int 2000) ; <bool> [#uses=1]
%indvar.next = add uint %indvar, 1 ; <uint> [#uses=1]
br bool %tmp.2.i, label %_Z10accumulateIPddET0_T_S2_S1_.exit, label %no_exit.i
To this:
.LBB_Z4testIPddEvT_S1_T0__1: # no_exit.i
fldl data(,%eax,8)
fldl 16(%esp)
faddp %st(1)
fstpl 16(%esp)
incl %eax
movl %eax, %ecx
shll $3, %ecx
cmpl $16000, %ecx
#FP_REG_KILL
jne .LBB_Z4testIPddEvT_S1_T0__1 # no_exit.i
instead of this:
.LBB_Z4testIPddEvT_S1_T0__1: # no_exit.i
fldl data(,%eax,8)
fldl 16(%esp)
faddp %st(1)
fstpl 16(%esp)
incl %eax
leal data(,%eax,8), %ecx
leal data+16000, %edx
cmpl %edx, %ecx
#FP_REG_KILL
jne .LBB_Z4testIPddEvT_S1_T0__1 # no_exit.i
llvm-svn: 19425
2005-01-09 20:52:51 +00:00
Jeff Cohen
f692cd303d
Add last four createXxxPass functions
...
llvm-svn: 19424
2005-01-09 20:42:52 +00:00
Jeff Cohen
91dd6d2d20
Fix VC++ compilation error
...
llvm-svn: 19423
2005-01-09 20:41:56 +00:00
Chris Lattner
fa06762d0e
Print the DAG out more like a DAG in nested format.
...
llvm-svn: 19422
2005-01-09 20:38:33 +00:00
Chris Lattner
e3b9f22967
Print out nodes sorted by their address to make it easier to find them in a list.
...
llvm-svn: 19421
2005-01-09 20:26:36 +00:00
Chris Lattner
fcab5f75c0
Codegen (Reg|imm)+&GV as an LEA, because we cannot put it into the immediate field
...
of an ADDri (due to current restrictions on MachineOperand :( ). This allows
us to generate:
leal Data+16000, %edx
instead of:
movl $Data, %edx
addl $16000, %edx
llvm-svn: 19420
2005-01-09 20:20:29 +00:00
Chris Lattner
82caa0dc2e
Add a simple transformation. This allows us to compile one of the inner
...
loops in stepanov to this:
.LBB_Z5test0PdS__2: # no_exit.1
fldl data(,%eax,8)
fldl 24(%esp)
faddp %st(1)
fstl 24(%esp)
incl %eax
cmpl $2000, %eax
fstpl 16(%esp)
#FP_REG_KILL
jl .LBB_Z5test0PdS__2
instead of this:
.LBB_Z5test0PdS__2: # no_exit.1
fldl data(,%eax,8)
fldl 24(%esp)
faddp %st(1)
fstl 24(%esp)
incl %eax
movl $data, %ecx
movl %ecx, %edx
addl $16000, %edx
subl %ecx, %edx
movl %edx, %ecx
sarl $2, %ecx
shrl $29, %ecx
addl %ecx, %edx
sarl $3, %edx
cmpl %edx, %eax
fstpl 16(%esp)
#FP_REG_KILL
jl .LBB_Z5test0PdS__2
The old instruction selector produced:
.LBB_Z5test0PdS__2: # no_exit.1
fldl 24(%esp)
faddl data(,%eax,8)
fstl 24(%esp)
movl %eax, %ecx
incl %ecx
incl %eax
leal data+16000, %edx
movl $data, %edi
subl %edi, %edx
movl %edx, %edi
sarl $2, %edi
shrl $29, %edi
addl %edi, %edx
sarl $3, %edx
cmpl %edx, %ecx
fstpl 16(%esp)
#FP_REG_KILL
jl .LBB_Z5test0PdS__2 # no_exit.1
Which is even worse!
llvm-svn: 19419
2005-01-09 20:09:57 +00:00
Chris Lattner
35375c11bf
Fix copy and pasto's for FP -> Int. This fixes fldry
...
llvm-svn: 19418
2005-01-09 19:49:59 +00:00
Chris Lattner
d674d08230
Fix a bug legalizing call instructions (make sure to remember all result
...
values), and eliminate some switch statements.
llvm-svn: 19417
2005-01-09 19:43:23 +00:00