Chris Lattner
3c0e683c5d
add prototype, remove dead proto
...
llvm-svn: 22835
2005-08-17 19:32:03 +00:00
Chris Lattner
a11bdf3abe
Fix a bug in RemoveDeadNodes where it would crash when its "optional"
...
argument is not specified.
Implement ReplaceAllUsesWith.
llvm-svn: 22834
2005-08-17 19:00:20 +00:00
Jim Laskey
7cdadb13d5
Switched to using BitsToDouble for int_to_float to avoid aliasing problem.
...
llvm-svn: 22831
2005-08-17 17:42:52 +00:00
Chris Lattner
82abf46492
Fix some bugs in the alpha backend, some of which I introduced yesterday,
...
and some that were preexisting. All alpha regtests pass now.
llvm-svn: 22829
2005-08-17 17:08:24 +00:00
Jim Laskey
2370cb4e85
Change hex float constants for the sake of VC++.
...
llvm-svn: 22828
2005-08-17 09:44:59 +00:00
Chris Lattner
dbfcba7565
Add a new beta option for critical edge splitting, to avoid a problem that
...
Nate noticed in yacr2 (and I know occurs in other places as well).
This is still rough, as the critical edge blocks are not intelligently placed
but is added to get some idea to see if this improves performance.
llvm-svn: 22825
2005-08-17 06:37:43 +00:00
Chris Lattner
969232d2ec
Use a new helper to split critical edges, making the code simpler.
...
Do not claim to not change the CFG. We do change the cfg to split critical
edges. This isn't causing us a problem now, but could likely do so in the
future.
llvm-svn: 22824
2005-08-17 06:35:16 +00:00
Chris Lattner
a103a2e9c6
Fix a regression on X86, where FP values can be promoted too.
...
llvm-svn: 22822
2005-08-17 06:06:25 +00:00
Chris Lattner
1035045433
Fix a few small typos I noticed when converting this over to the DAG->DAG
...
selector. Also, there is no difference between addSImm and addImm, so just
use addImm, folding some branches.
llvm-svn: 22819
2005-08-17 01:25:14 +00:00
Jim Laskey
09b3a01b90
Removed UINT_TO_FP and SINT_TO_FP from ISel outright.
...
llvm-svn: 22818
2005-08-17 01:14:38 +00:00
Andrew Lenharth
32d53db341
thinko. Should fix s4addl.ll regression
...
llvm-svn: 22817
2005-08-17 00:47:24 +00:00
Jim Laskey
7584bd6c69
Remove ISel code generation for UINT_TO_FP and SINT_TO_FP. Now asserts if
...
marked as legal.
llvm-svn: 22816
2005-08-17 00:41:40 +00:00
Jim Laskey
cdde9cec23
Make UINT_TO_FP and SINT_TO_FP use generic expansion.
...
llvm-svn: 22815
2005-08-17 00:40:22 +00:00
Jim Laskey
59b9ee0529
Added generic code expansion for [signed|unsigned] i32 to [f32|f64] casts in the
...
legalizer. PowerPC now uses this expansion instead of ISel version.
Example:
// signed integer to double conversion
double f1(signed x) {
return (double)x;
}
// unsigned integer to double conversion
double f2(unsigned x) {
return (double)x;
}
// signed integer to float conversion
float f3(signed x) {
return (float)x;
}
// unsigned integer to float conversion
float f4(unsigned x) {
return (float)x;
}
Byte Code:
internal fastcc double %_Z2f1i(int %x) {
entry:
%tmp.1 = cast int %x to double ; <double> [#uses=1]
ret double %tmp.1
}
internal fastcc double %_Z2f2j(uint %x) {
entry:
%tmp.1 = cast uint %x to double ; <double> [#uses=1]
ret double %tmp.1
}
internal fastcc float %_Z2f3i(int %x) {
entry:
%tmp.1 = cast int %x to float ; <float> [#uses=1]
ret float %tmp.1
}
internal fastcc float %_Z2f4j(uint %x) {
entry:
%tmp.1 = cast uint %x to float ; <float> [#uses=1]
ret float %tmp.1
}
internal fastcc double %_Z2g1i(int %x) {
entry:
%buffer = alloca [2 x uint] ; <[2 x uint]*> [#uses=3]
%tmp.0 = getelementptr [2 x uint]* %buffer, int 0, int 0 ; <uint*> [#uses=1]
store uint 1127219200, uint* %tmp.0
%tmp.2 = cast int %x to uint ; <uint> [#uses=1]
%tmp.3 = xor uint %tmp.2, 2147483648 ; <uint> [#uses=1]
%tmp.5 = getelementptr [2 x uint]* %buffer, int 0, int 1 ; <uint*> [#uses=1]
store uint %tmp.3, uint* %tmp.5
%tmp.9 = cast [2 x uint]* %buffer to double* ; <double*> [#uses=1]
%tmp.10 = load double* %tmp.9 ; <double> [#uses=1]
%tmp.13 = load double* cast (long* %signed_bias to double*) ; <double> [#uses=1]
%tmp.14 = sub double %tmp.10, %tmp.13 ; <double> [#uses=1]
ret double %tmp.14
}
internal fastcc double %_Z2g2j(uint %x) {
entry:
%buffer = alloca [2 x uint] ; <[2 x uint]*> [#uses=3]
%tmp.0 = getelementptr [2 x uint]* %buffer, int 0, int 0 ; <uint*> [#uses=1]
store uint 1127219200, uint* %tmp.0
%tmp.1 = getelementptr [2 x uint]* %buffer, int 0, int 1 ; <uint*> [#uses=1]
store uint %x, uint* %tmp.1
%tmp.4 = cast [2 x uint]* %buffer to double* ; <double*> [#uses=1]
%tmp.5 = load double* %tmp.4 ; <double> [#uses=1]
%tmp.8 = load double* cast (long* %unsigned_bias to double*) ; <double> [#uses=1]
%tmp.9 = sub double %tmp.5, %tmp.8 ; <double> [#uses=1]
ret double %tmp.9
}
internal fastcc float %_Z2g3i(int %x) {
entry:
%buffer = alloca [2 x uint] ; <[2 x uint]*> [#uses=3]
%tmp.0 = getelementptr [2 x uint]* %buffer, int 0, int 0 ; <uint*> [#uses=1]
store uint 1127219200, uint* %tmp.0
%tmp.2 = cast int %x to uint ; <uint> [#uses=1]
%tmp.3 = xor uint %tmp.2, 2147483648 ; <uint> [#uses=1]
%tmp.5 = getelementptr [2 x uint]* %buffer, int 0, int 1 ; <uint*> [#uses=1]
store uint %tmp.3, uint* %tmp.5
%tmp.9 = cast [2 x uint]* %buffer to double* ; <double*> [#uses=1]
%tmp.10 = load double* %tmp.9 ; <double> [#uses=1]
%tmp.13 = load double* cast (long* %signed_bias to double*) ; <double> [#uses=1]
%tmp.14 = sub double %tmp.10, %tmp.13 ; <double> [#uses=1]
%tmp.16 = cast double %tmp.14 to float ; <float> [#uses=1]
ret float %tmp.16
}
internal fastcc float %_Z2g4j(uint %x) {
entry:
%buffer = alloca [2 x uint] ; <[2 x uint]*> [#uses=3]
%tmp.0 = getelementptr [2 x uint]* %buffer, int 0, int 0 ; <uint*> [#uses=1]
store uint 1127219200, uint* %tmp.0
%tmp.1 = getelementptr [2 x uint]* %buffer, int 0, int 1 ; <uint*> [#uses=1]
store uint %x, uint* %tmp.1
%tmp.4 = cast [2 x uint]* %buffer to double* ; <double*> [#uses=1]
%tmp.5 = load double* %tmp.4 ; <double> [#uses=1]
%tmp.8 = load double* cast (long* %unsigned_bias to double*) ; <double> [#uses=1]
%tmp.9 = sub double %tmp.5, %tmp.8 ; <double> [#uses=1]
%tmp.11 = cast double %tmp.9 to float ; <float> [#uses=1]
ret float %tmp.11
}
PowerPC Code:
.machine ppc970
.const
.align 2
.CPIl1__Z2f1i_0: ; float 0x4330000080000000
.long 1501560836 ; float 4.5036e+15
.text
.align 2
.globl l1__Z2f1i
l1__Z2f1i:
.LBBl1__Z2f1i_0: ; entry
xoris r2, r3, 32768
stw r2, -4(r1)
lis r2, 17200
stw r2, -8(r1)
lfd f0, -8(r1)
lis r2, ha16(.CPIl1__Z2f1i_0)
lfs f1, lo16(.CPIl1__Z2f1i_0)(r2)
fsub f1, f0, f1
blr
.const
.align 2
.CPIl2__Z2f2j_0: ; float 0x4330000000000000
.long 1501560832 ; float 4.5036e+15
.text
.align 2
.globl l2__Z2f2j
l2__Z2f2j:
.LBBl2__Z2f2j_0: ; entry
stw r3, -4(r1)
lis r2, 17200
stw r2, -8(r1)
lfd f0, -8(r1)
lis r2, ha16(.CPIl2__Z2f2j_0)
lfs f1, lo16(.CPIl2__Z2f2j_0)(r2)
fsub f1, f0, f1
blr
.const
.align 2
.CPIl3__Z2f3i_0: ; float 0x4330000080000000
.long 1501560836 ; float 4.5036e+15
.text
.align 2
.globl l3__Z2f3i
l3__Z2f3i:
.LBBl3__Z2f3i_0: ; entry
xoris r2, r3, 32768
stw r2, -4(r1)
lis r2, 17200
stw r2, -8(r1)
lfd f0, -8(r1)
lis r2, ha16(.CPIl3__Z2f3i_0)
lfs f1, lo16(.CPIl3__Z2f3i_0)(r2)
fsub f0, f0, f1
frsp f1, f0
blr
.const
.align 2
.CPIl4__Z2f4j_0: ; float 0x4330000000000000
.long 1501560832 ; float 4.5036e+15
.text
.align 2
.globl l4__Z2f4j
l4__Z2f4j:
.LBBl4__Z2f4j_0: ; entry
stw r3, -4(r1)
lis r2, 17200
stw r2, -8(r1)
lfd f0, -8(r1)
lis r2, ha16(.CPIl4__Z2f4j_0)
lfs f1, lo16(.CPIl4__Z2f4j_0)(r2)
fsub f0, f0, f1
frsp f1, f0
blr
llvm-svn: 22814
2005-08-17 00:39:29 +00:00
Chris Lattner
bd8cbd4951
add a new TargetConstant node
...
llvm-svn: 22813
2005-08-17 00:34:06 +00:00
Nate Begeman
d549160121
Implement a couple improvements:
...
Remove dead code in ISD::Constant handling
Add support for add long, imm16
We now codegen 'long long foo(long long a) { return ++a; }'
as:
addic r4, r4, 1
addze r3, r3
blr
instead of:
li r2, 1
li r5, 0
addc r2, r4, r2
adde r3, r3, r5
blr
llvm-svn: 22811
2005-08-17 00:20:08 +00:00
Chris Lattner
8fcd135595
This is a dummy, it doesn't matter what the ValueType is
...
llvm-svn: 22809
2005-08-16 21:59:52 +00:00
Chris Lattner
9d651a0e3c
updates for changes in nodes
...
llvm-svn: 22808
2005-08-16 21:58:15 +00:00
Chris Lattner
583658a766
update the backends to work with the new CopyFromReg/CopyToReg/ImplicitDef nodes
...
llvm-svn: 22807
2005-08-16 21:56:37 +00:00
Chris Lattner
3b7e157005
Eliminate the RegSDNode class, which 3 nodes (CopyFromReg/CopyToReg/ImplicitDef)
...
used to tack a register number onto the node.
Instead of doing this, make a new node, RegisterSDNode, which is a leaf
containing a register number. These three operations just become normal
DAG nodes now, instead of requiring special handling.
Note that with this change, it is no longer correct to make illegal
CopyFromReg/CopyToReg nodes. The legalizer will not touch them, and this
is bad, so don't do it. :)
llvm-svn: 22806
2005-08-16 21:55:35 +00:00
Nate Begeman
f6b6378f23
Implement BR_CC and BRTWOWAY_CC. This allows the removal of a rather nasty
...
fixme from the PowerPC backend. Emit slightly better code for legalizing
select_cc.
llvm-svn: 22805
2005-08-16 19:49:35 +00:00
Chris Lattner
65b9983515
Allow passing a dag into dump and getOperationName. If one is available
...
when printing a node, use it to render target operations with their
target instruction name instead of "<<unknown>>".
llvm-svn: 22804
2005-08-16 18:33:07 +00:00
Chris Lattner
1b07a165e0
Use a extant helper to do this.
...
llvm-svn: 22802
2005-08-16 18:31:23 +00:00
Chris Lattner
73348d1e89
Add some methods for dag->dag isel.
...
Split RemoveNodeFromCSEMaps out of DeleteNodesIfDead to do it.
llvm-svn: 22801
2005-08-16 18:17:10 +00:00
Chris Lattner
a4c5954c52
Pull the LLVM -> DAG lowering code out of the pattern selector so that it
...
can be shared with the DAG->DAG selector.
llvm-svn: 22799
2005-08-16 17:14:42 +00:00
Chris Lattner
bc70c99aef
Fix a bad case in gzip where we put lots of things in registers across the
...
loop, because a IV-dependent value was used outside of the loop and didn't
have immediate-folding capability
llvm-svn: 22798
2005-08-16 00:38:11 +00:00
Chris Lattner
3bfebb1e8f
Fix Transforms/LoopStrengthReduce/2005-08-15-AddRecIV.ll
...
llvm-svn: 22797
2005-08-16 00:37:01 +00:00
Chris Lattner
543dd2e01c
Turn loop strength reduction on by default.
...
Only run createLowerConstantExpressionsPass for the simple isel. The DAG
isel has no need for it.
llvm-svn: 22794
2005-08-15 23:47:04 +00:00
Chris Lattner
c2cbe96e25
Teach LLVM to know how many times a loop executes when constructed with
...
a < expression, e.g.: for (i = m; i < n; ++i)
llvm-svn: 22793
2005-08-15 23:33:51 +00:00
Jim Laskey
c501f51bd2
Broke 80 column rule.
...
llvm-svn: 22792
2005-08-15 17:35:26 +00:00
Jim Laskey
92eccb9901
Changed code gen for int to f32 to use rounding. This makes FP results
...
consistent with gcc.
llvm-svn: 22791
2005-08-15 17:14:19 +00:00
Andrew Lenharth
b9c77079dc
isIntImmediate is a good Idea. Add a flavor that checks bounds while it is at it
...
llvm-svn: 22790
2005-08-15 14:31:37 +00:00
Nate Begeman
54423e60c6
Fix last night's PPC32 regressions by
...
1. Not selecting the false value of a select_cc in the false arm, which
isn't legal for nested selects.
2. Actually returning the node we created and Legalized in the FP_TO_UINT
Expander.
llvm-svn: 22789
2005-08-14 18:38:32 +00:00
Nate Begeman
6e0168fe5f
Fix last night's X86 regressions by putting code for SSE in the if(SSE)
...
block. nur.
llvm-svn: 22788
2005-08-14 18:37:02 +00:00
Andrew Lenharth
972c05771c
only build .a on alpha
...
llvm-svn: 22787
2005-08-14 15:14:34 +00:00
Nate Begeman
89f12b7721
Fix FP_TO_UINT with Scalar SSE2 now that the legalizer can handle it. We
...
now generate the relatively good code sequences:
unsigned short foo(float a) { return a; }
_foo:
movss 4(%esp), %xmm0
cvttss2si %xmm0, %eax
movzwl %ax, %eax
ret
and
unsigned bar(float a) { return a; }
_bar:
movss .CPI_bar_0, %xmm0
movss 4(%esp), %xmm1
movapd %xmm1, %xmm2
subss %xmm0, %xmm2
cvttss2si %xmm2, %eax
xorl $-2147483648, %eax
cvttss2si %xmm1, %ecx
ucomiss %xmm0, %xmm1
cmovb %ecx, %eax
ret
llvm-svn: 22786
2005-08-14 04:36:51 +00:00
Nate Begeman
9be6a214ff
Teach the legalizer how to legalize FP_TO_UINT.
...
Teach the legalizer to promote FP_TO_UINT to FP_TO_SINT if the wider
FP_TO_UINT is also illegal. This allows us on PPC to codegen
unsigned short foo(float a) { return a; }
as:
_foo:
.LBB_foo_0: ; entry
fctiwz f0, f1
stfd f0, -8(r1)
lwz r2, -4(r1)
rlwinm r3, r2, 0, 16, 31
blr
instead of:
_foo:
.LBB_foo_0: ; entry
fctiwz f0, f1
stfd f0, -8(r1)
lwz r2, -4(r1)
lis r3, ha16(.CPI_foo_0)
lfs f0, lo16(.CPI_foo_0)(r3)
fcmpu cr0, f1, f0
blt .LBB_foo_2 ; entry
.LBB_foo_1: ; entry
fsubs f0, f1, f0
fctiwz f0, f0
stfd f0, -16(r1)
lwz r2, -12(r1)
xoris r2, r2, 32768
.LBB_foo_2: ; entry
rlwinm r3, r2, 0, 16, 31
blr
llvm-svn: 22785
2005-08-14 01:20:53 +00:00
Nate Begeman
2426bb6589
Make FP_TO_UINT Illegal. This allows us to generate significantly better
...
codegen for FP_TO_UINT by using the legalizer's SELECT variant.
Implement a codegen improvement for SELECT_CC, selecting the false node in
the MBB that feeds the phi node. This allows us to codegen:
void foo(int *a, int b, int c) { int d = (a < b) ? 5 : 9; *a = d; }
as:
_foo:
li r2, 5
cmpw cr0, r4, r3
bgt .LBB_foo_2 ; entry
.LBB_foo_1: ; entry
li r2, 9
.LBB_foo_2: ; entry
stw r2, 0(r3)
blr
insted of:
_foo:
li r2, 5
li r5, 9
cmpw cr0, r4, r3
bgt .LBB_foo_2 ; entry
.LBB_foo_1: ; entry
or r2, r5, r5
.LBB_foo_2: ; entry
stw r2, 0(r3)
blr
llvm-svn: 22784
2005-08-14 01:17:16 +00:00
Andrew Lenharth
5f4a72a18e
Testing a variable before it is defined doesn't work so well. It is a fairly small thing, so just let everyone build the .a file
...
llvm-svn: 22783
2005-08-13 14:58:23 +00:00
Chris Lattner
57a2a74e99
Ooops, don't forget to clear this. The real inner loop is now:
...
.LBB_foo_3: ; no_exit.1
lfd f2, 0(r9)
lfd f3, 8(r9)
fmul f4, f1, f2
fmadd f4, f0, f3, f4
stfd f4, 8(r9)
fmul f3, f1, f3
fmsub f2, f0, f2, f3
stfd f2, 0(r9)
addi r9, r9, 16
addi r8, r8, 1
cmpw cr0, r8, r4
ble .LBB_foo_3 ; no_exit.1
llvm-svn: 22782
2005-08-13 07:42:01 +00:00
Chris Lattner
f59e855dbc
Recursively scan scev expressions for common subexpressions. This allows us
...
to handle nested loops much better, for example, by being able to tell that
these two expressions:
{( 8 + ( 16 * ( 1 + %Tmp11 + %Tmp12)) + %c_),+,( 16 * %Tmp 12)}<loopentry.1>
{(( 16 * ( 1 + %Tmp11 + %Tmp12)) + %c_),+,( 16 * %Tmp12)}<loopentry.1>
Have the following common part that can be shared:
{(( 16 * ( 1 + %Tmp11 + %Tmp12)) + %c_),+,( 16 * %Tmp12)}<loopentry.1>
This allows us to codegen an important inner loop in 168.wupwise as:
.LBB_foo_4: ; no_exit.1
lfd f2, 16(r9)
fmul f3, f0, f2
fmul f2, f1, f2
fadd f4, f3, f2
stfd f4, 8(r9)
fsub f2, f3, f2
stfd f2, 16(r9)
addi r8, r8, 1
addi r9, r9, 16
cmpw cr0, r8, r4
ble .LBB_foo_4 ; no_exit.1
instead of:
.LBB_foo_3: ; no_exit.1
lfdx f2, r6, r9
add r10, r6, r9
lfd f3, 8(r10)
fmul f4, f1, f2
fmadd f4, f0, f3, f4
stfd f4, 8(r10)
fmul f3, f1, f3
fmsub f2, f0, f2, f3
stfdx f2, r6, r9
addi r9, r9, 16
addi r8, r8, 1
cmpw cr0, r8, r4
ble .LBB_foo_3 ; no_exit.1
llvm-svn: 22781
2005-08-13 07:27:18 +00:00
Nate Begeman
021a5b3fe1
Remove an unncessary argument to SimplifySelectCC and add an additional
...
assert when creating a select_cc node.
llvm-svn: 22780
2005-08-13 06:14:17 +00:00
Nate Begeman
4e8f777256
Fix the fabs regression on x86 by abstracting the select_cc optimization
...
out into SimplifySelectCC. This allows both ISD::SELECT and ISD::SELECT_CC
to use the same set of simplifying folds.
llvm-svn: 22779
2005-08-13 06:00:21 +00:00
Nate Begeman
55d6c917c5
Remove support for 64b PPC, it's been broken for a long time. It'll be
...
back once a DAG->DAG ISel exists.
llvm-svn: 22778
2005-08-13 05:59:16 +00:00
Andrew Lenharth
d9a0c60da4
Fix oversized GOT problem with gcc-4 on alpha
...
llvm-svn: 22777
2005-08-13 05:09:50 +00:00
Chris Lattner
2681a83c43
Teach SplitCriticalEdge to update LoopInfo if it is alive. This fixes
...
a problem in LoopStrengthReduction, where it would split critical edges
then confused itself with outdated loop information.
llvm-svn: 22776
2005-08-13 01:38:43 +00:00
Chris Lattner
87bcd2794b
remove dead code. The exit block list is computed on demand, thus does not
...
need to be updated. This code is a relic from when it did.
llvm-svn: 22775
2005-08-13 01:30:36 +00:00
Chris Lattner
e06d2c3760
implement a couple of simple shift foldings.
...
e.g. (X & 7) >> 3 -> 0
llvm-svn: 22774
2005-08-12 23:54:58 +00:00
Jim Laskey
6b95280f3c
Fix for 2005-08-12-rlwimi-crash.ll. Make allowance for masks being shifted to
...
zero.
llvm-svn: 22773
2005-08-12 23:52:46 +00:00
Jim Laskey
7b32c9131d
1. This changes handles the cases of (~x)&y and x&(~y) yielding ANDC, and
...
(~x)|y and x|(~y) yielding ORC.
llvm-svn: 22771
2005-08-12 23:38:02 +00:00