Nate Begeman
ce63e383b8
Add a couple missing transforms in getSetCC that were triggering assertions
...
in the PPC Pattern ISel
llvm-svn: 21297
2005-04-14 08:56:52 +00:00
Duraid Madina
eb4c15b42c
we have zextloads, not sextloads!
...
llvm-svn: 21296
2005-04-14 08:37:32 +00:00
Nate Begeman
b707ec16b4
Add the necessary support to codegen condition register logical ops with
...
register allocated condition registers. Make sure that the printed
output is gas compatible.
llvm-svn: 21295
2005-04-14 03:20:38 +00:00
Nate Begeman
99a9840b56
Start allocating condition registers. Almost all explicit uses of CR0 are
...
now gone. Next step is to get rid of the remaining ones and then start
allocating bools to CRs where appropriate.
llvm-svn: 21294
2005-04-13 23:15:44 +00:00
Nate Begeman
ae49d52006
Implement the fold shift X, zext(Y) -> shift X, Y at the target level,
...
where it is safe to do so.
llvm-svn: 21293
2005-04-13 22:14:14 +00:00
Nate Begeman
20b3399465
Disbale the broken fold of shift + sz[ext] for now
...
Move the transform for select (a < 0) ? b : 0 into the dag from ppc isel
Enable the dag to fold and (setcc, 1) -> setcc for targets where setcc
always produces zero or one.
llvm-svn: 21291
2005-04-13 21:23:31 +00:00
Chris Lattner
89f7e115a4
fix an infinite loop
...
llvm-svn: 21289
2005-04-13 20:06:29 +00:00
Chris Lattner
475fe85ddf
fix some serious miscompiles on ia64, alpha, and ppc
...
llvm-svn: 21288
2005-04-13 19:53:40 +00:00
Chris Lattner
03d675414e
avoid work when possible, perhaps fix the problem nate and andrew are seeing
...
with != 0 comparisons vanishing.
llvm-svn: 21287
2005-04-13 19:41:05 +00:00
Andrew Lenharth
cbf7a52768
WOW, function calls still seem to work after this.
...
llvm-svn: 21286
2005-04-13 17:17:28 +00:00
Andrew Lenharth
d1fee6d24a
prepare for func call optimization
...
llvm-svn: 21285
2005-04-13 16:19:50 +00:00
Duraid Madina
b9d2d9ac63
* add the shladd instruction
...
* fold left shifts of 1, 2, 3 or 4 bits into adds
This doesn't save much now, but should get a serious workout once
multiplies by constants get converted to shift/add/sub sequences.
Hold on! :)
llvm-svn: 21282
2005-04-13 06:12:04 +00:00
Andrew Lenharth
ec33ab6a2f
add matches for SxADDL and company, as well as simplify the SxADDQ code
...
llvm-svn: 21281
2005-04-13 05:19:55 +00:00
Chris Lattner
9540cf8c7e
Implement expansion of unsigned i64 -> FP.
...
Note that this probably only works for little endian targets, but is enough
to get siod working :)
llvm-svn: 21280
2005-04-13 05:09:42 +00:00
Duraid Madina
67e553e521
* if ANDing with a constant of the form:
...
0x00000..00FFF..FF
^ ^
^ ^
any number of
0's followed by
some number of
1's
then we use dep.z to just paste zeros over the input. For the special
cases where this is zxt1/zxt2/zxt4, we use those instructions instead,
because we're all about readability!!!
that's what it's about!! readability!
*twitch* ;D
llvm-svn: 21279
2005-04-13 04:50:54 +00:00
Andrew Lenharth
510db15268
added all flavors of zap for anding
...
llvm-svn: 21276
2005-04-13 03:47:03 +00:00
Chris Lattner
1a6247ff51
Make expansion of uint->fp cast assert out instead of infinitely recurse.
...
llvm-svn: 21275
2005-04-13 03:42:14 +00:00
Chris Lattner
85c6a7bed0
Fix some mysteriously missing {}'s which cause the miscompilation of
...
Olden/mst, Ptrdist/bc, Obsequi, etc.
llvm-svn: 21274
2005-04-13 03:29:53 +00:00
Chris Lattner
63450e87d9
add back the optimization that Nate added for shl X, (zext_inreg y)
...
llvm-svn: 21273
2005-04-13 02:58:13 +00:00
Chris Lattner
759afe07d7
Oops, remove these too.
...
llvm-svn: 21272
2005-04-13 02:47:57 +00:00
Chris Lattner
8489ac991d
remove one more occurance of this that snuck in
...
llvm-svn: 21271
2005-04-13 02:46:17 +00:00
Chris Lattner
5fdb103328
Remove support for ZERO_EXTEND_INREG. This pessimizes code, genering stuff
...
like this:
ldah $1,1($31)
lda $1,-1($1)
and $0,$1,$24
instead of this:
zap $0,252,$24
To get this back, the selector should recognize the ISD::AND case where this
happens and emit the appropriate ZAP instruction.
llvm-svn: 21270
2005-04-13 02:43:40 +00:00
Chris Lattner
a2e92e69da
Remove special handling of ZERO_EXTEND_INREG. This pessimizes code, causing
...
things like this:
mov r9 = 65535;;
and r8 = r8, r9;;
To be emitted instead of:
zxt2 r8 = r8;;
To get this back, the selector for ISD::AND should recognize this case.
llvm-svn: 21269
2005-04-13 02:41:52 +00:00
Chris Lattner
26c7c9150a
Elimate handling of ZERO_EXTEND_INREG. This causes the PPC backend to emit
...
andi instructions instead of rlwinm instructions for zero extend, but they
seem like they would take the same time.
llvm-svn: 21268
2005-04-13 02:40:26 +00:00
Chris Lattner
f25fefd9cf
Z_E_I is gone
...
llvm-svn: 21267
2005-04-13 02:39:05 +00:00
Chris Lattner
4f188f949c
Instead of making ZERO_EXTEND_INREG nodes, use the helper method in
...
SelectionDAG to do the job with AND. Don't legalize Z_E_I anymore as
it is gone
llvm-svn: 21266
2005-04-13 02:38:47 +00:00
Chris Lattner
bce0030a88
Remove all foldings of ZERO_EXTEND_INREG, moving them to work for AND nodes
...
instead. OVerall, this increases the amount of folding we can do.
llvm-svn: 21265
2005-04-13 02:38:18 +00:00
Nate Begeman
38d8248a9e
Fold shift x, [sz]ext(y) -> shift x, y
...
llvm-svn: 21262
2005-04-12 23:32:28 +00:00
Nate Begeman
a56527ea5f
Fold shift by size larger than type size to undef
...
Make llvm undef values generate ISD::UNDEF nodes
llvm-svn: 21261
2005-04-12 23:12:17 +00:00
Nate Begeman
79c8b8fd1c
Implement setcc op, -1 sequences
...
Remove dead setcc op, 0 sequences
Coming later: generalization of op, imm
llvm-svn: 21260
2005-04-12 21:22:28 +00:00
Chris Lattner
58f72ab722
promote extload i1 -> extload i8
...
llvm-svn: 21258
2005-04-12 20:30:10 +00:00
Chris Lattner
7c88662870
add an argument to allow avoiding deleting phi nodes.
...
llvm-svn: 21255
2005-04-12 18:52:14 +00:00
Chris Lattner
ee06161a63
Get rid of this for_each loop
...
llvm-svn: 21253
2005-04-12 18:51:33 +00:00
Duraid Madina
39fcec1541
* OK, after changing to use liveIn/liveOut instead of IDEFs,
...
to avoid redundant mov out3=r44 type instructions, we need to
tell the register allocator the truth about out? registers.
FIXME: unfortunately, since the list of allocatable registers is immutable,
we can't simply 'delete r127' from the allocation order, say, if 'out0' is
used. The only correct thing we can do is have a linear order of regs:
out7, out6 ... out2, out1, out0, r32, r33, r34 ... r126, r127
and slide a 'window' of 96 registers along this line, depending on how many
of the out? regs a function actually uses. The only downside of this is
that the out? registers will be allocated _first_, which makes the
resulting assembly ugly. :( Note this in the README. Hope this gets fixed
soon. :) (note the 3rd person speech there)
llvm-svn: 21252
2005-04-12 18:42:59 +00:00
Andrew Lenharth
174d44f223
Get rid of idefs for arguments (oops)
...
llvm-svn: 21251
2005-04-12 17:47:57 +00:00
Andrew Lenharth
1b8a8331c9
Get rid of idefs for arguments
...
llvm-svn: 21250
2005-04-12 17:35:16 +00:00
Chris Lattner
be2dceff48
Put out* into the allocation order, allowing the register allocator to
...
coallesce moves into outgoing args.
llvm-svn: 21249
2005-04-12 15:12:51 +00:00
Chris Lattner
37712352c0
Make sure to realize that calls use their argument regs
...
llvm-svn: 21248
2005-04-12 15:12:19 +00:00
Duraid Madina
2821f99f19
stop emitting IDEFs for args - change to using liveIn/liveOut
...
llvm-svn: 21247
2005-04-12 14:54:44 +00:00
Nate Begeman
f96b42f1b6
Initial support for allocation condition registers
...
llvm-svn: 21246
2005-04-12 07:04:16 +00:00
Chris Lattner
f8d9224d8c
Fix a crash analyzing MultiSource/Benchmarks/MallocBench/gs
...
llvm-svn: 21245
2005-04-12 03:59:27 +00:00
Chris Lattner
cfc7093ca6
Remove some redundant checks, add a couple of new ones. This allows us to
...
compile this:
int foo (unsigned long a, unsigned long long g) {
return a >= g;
}
To:
foo:
movl 8(%esp), %eax
cmpl %eax, 4(%esp)
setae %al
cmpl $0, 12(%esp)
sete %cl
andb %al, %cl
movzbl %cl, %eax
ret
instead of:
foo:
movl 8(%esp), %eax
cmpl %eax, 4(%esp)
setae %al
movzbw %al, %cx
movl 12(%esp), %edx
cmpl $0, %edx
sete %al
movzbw %al, %ax
cmpl $0, %edx
cmove %cx, %ax
movzbl %al, %eax
ret
llvm-svn: 21244
2005-04-12 02:54:39 +00:00
Chris Lattner
61f353dbdc
Emit comparisons against the sign bit better. Codegen this:
...
bool %test1(long %X) {
%A = setlt long %X, 0
ret bool %A
}
like this:
test1:
cmpl $0, 8(%esp)
setl %al
movzbl %al, %eax
ret
instead of:
test1:
movl 8(%esp), %ecx
cmpl $0, %ecx
setl %al
movzbw %al, %ax
cmpl $0, 4(%esp)
setb %dl
movzbw %dl, %dx
cmpl $0, %ecx
cmove %dx, %ax
movzbl %al, %eax
ret
llvm-svn: 21243
2005-04-12 02:19:10 +00:00
Chris Lattner
6cbbb55967
Emit long comparison against -1 better. Instead of this (x86):
...
test2:
movl 8(%esp), %eax
notl %eax
movl 4(%esp), %ecx
notl %ecx
orl %eax, %ecx
cmpl $0, %ecx
sete %al
movzbl %al, %eax
ret
or this (PPC):
_test2:
nor r2, r4, r4
nor r3, r3, r3
or r2, r2, r3
cntlzw r2, r2
srwi r3, r2, 5
blr
Emit this:
test2:
movl 8(%esp), %eax
andl 4(%esp), %eax
cmpl $-1, %eax
sete %al
movzbl %al, %eax
ret
or this:
_test2:
.LBB_test2_0: ;
and r2, r4, r3
cmpwi cr0, r2, -1
li r3, 1
li r2, 0
beq .LBB_test2_2 ;
.LBB_test2_1: ;
or r3, r2, r2
.LBB_test2_2: ;
blr
it seems like the PPC isel could do better for R32 == -1 case.
llvm-svn: 21242
2005-04-12 01:46:05 +00:00
Chris Lattner
37534d43d0
canonicalize x <u 1 -> x == 0. On this testcase:
...
unsigned long long g;
unsigned long foo (unsigned long a) {
return (a >= g) ? 1 : 0;
}
It changes the ppc code from:
_foo:
.LBB_foo_0: ; entry
mflr r11
stw r11, 8(r1)
bl "L00000$pb"
"L00000$pb":
mflr r2
addis r2, r2, ha16(L_g$non_lazy_ptr-"L00000$pb")
lwz r2, lo16(L_g$non_lazy_ptr-"L00000$pb")(r2)
lwz r4, 0(r2)
lwz r2, 4(r2)
cmplw cr0, r3, r2
li r2, 1
li r3, 0
bge .LBB_foo_2 ; entry
.LBB_foo_1: ; entry
or r2, r3, r3
.LBB_foo_2: ; entry
cmplwi cr0, r4, 1
li r3, 1
li r5, 0
blt .LBB_foo_4 ; entry
.LBB_foo_3: ; entry
or r3, r5, r5
.LBB_foo_4: ; entry
cmpwi cr0, r4, 0
beq .LBB_foo_6 ; entry
.LBB_foo_5: ; entry
or r2, r3, r3
.LBB_foo_6: ; entry
rlwinm r3, r2, 0, 31, 31
lwz r11, 8(r1)
mtlr r11
blr
to:
_foo:
.LBB_foo_0: ; entry
mflr r11
stw r11, 8(r1)
bl "L00000$pb"
"L00000$pb":
mflr r2
addis r2, r2, ha16(L_g$non_lazy_ptr-"L00000$pb")
lwz r2, lo16(L_g$non_lazy_ptr-"L00000$pb")(r2)
lwz r4, 0(r2)
lwz r2, 4(r2)
cmplw cr0, r3, r2
li r2, 1
li r3, 0
bge .LBB_foo_2 ; entry
.LBB_foo_1: ; entry
or r2, r3, r3
.LBB_foo_2: ; entry
cntlzw r3, r4
srwi r3, r3, 5
cmpwi cr0, r4, 0
beq .LBB_foo_4 ; entry
.LBB_foo_3: ; entry
or r2, r3, r3
.LBB_foo_4: ; entry
rlwinm r3, r2, 0, 31, 31
lwz r11, 8(r1)
mtlr r11
blr
llvm-svn: 21241
2005-04-12 00:28:49 +00:00
Nate Begeman
a154deaaff
Implement bitfield clears
...
Implement divide by negative power of two
llvm-svn: 21240
2005-04-12 00:10:02 +00:00
Nate Begeman
f31b58f145
Update PPC readme. Remove things that are done or aren't ppc specific
...
llvm-svn: 21232
2005-04-11 20:48:57 +00:00
Chris Lattner
7f0f0854fa
Teach the dag mechanism that this:
...
long long test2(unsigned A, unsigned B) {
return ((unsigned long long)A << 32) + B;
}
is equivalent to this:
long long test1(unsigned A, unsigned B) {
return ((unsigned long long)A << 32) | B;
}
Now they are both codegen'd to this on ppc:
_test2:
blr
or this on x86:
test2:
movl 4(%esp), %edx
movl 8(%esp), %eax
ret
llvm-svn: 21231
2005-04-11 20:29:59 +00:00
Chris Lattner
71f3d4ce57
Fix expansion of shifts by exactly NVT bits on arch's (like X86) that have
...
masking shifts.
This fixes the miscompilation of this:
long long test1(unsigned A, unsigned B) {
return ((unsigned long long)A << 32) | B;
}
into this:
test1:
movl 4(%esp), %edx
movl %edx, %eax
orl 8(%esp), %eax
ret
allowing us to generate this instead:
test1:
movl 4(%esp), %edx
movl 8(%esp), %eax
ret
llvm-svn: 21230
2005-04-11 20:08:52 +00:00
Chris Lattner
55e620f08d
IA64 supports this operation.
...
llvm-svn: 21228
2005-04-11 18:55:36 +00:00
Chris Lattner
ee715b2abc
ORo sets CR0
...
llvm-svn: 21227
2005-04-11 15:03:48 +00:00
Chris Lattner
7d11f40ee2
Revert the previous patch, which I didn't mean to check in.
...
llvm-svn: 21226
2005-04-11 15:03:41 +00:00
Chris Lattner
d925c74452
Fix a minor bug (ORo didn't mark that it set CR0).
...
Refactor how . instructions are handled. In particular, instead of passing
the RC flag all the way up the inheritance hierarchy, just make a new tblgen
class 'DOT' which can be added to an instruction definition.
For example, instead of this:
-def AND : XForm_6<31, 28, 0, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB),
-let Defs = [CR0] in
-def ANDo : XForm_6<31, 28, 1, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB),
- "and. $rA, $rS, $rB">;
We now have this:
+def AND : XForm_6<31, 28, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB),
"and $rA, $rS, $rB">;
llvm-svn: 21225
2005-04-11 15:01:39 +00:00
Duraid Madina
01aaf77792
hmm, should probably change addImm() to take 64-bit arguments one day anyway.
...
llvm-svn: 21224
2005-04-11 07:16:39 +00:00
Nate Begeman
783fe2108e
Add recording variants of ISD::AND and ISD::OR. This kills almost 1000
...
(1.5%) instructions in 186.crafty
llvm-svn: 21222
2005-04-11 06:34:10 +00:00
Duraid Madina
d2ae9221c7
assorted fixes:
...
* clean up immediates (we use 14, 22 and 64 bit immediates now. sane.)
* fold r0/f0/f1 registers into comparisons against 0/0.0/1.0
* fix nasty thinko - didn't use two-address form of conditional add
for extending bools to integers, so occasionally there would be
garbage in the result. it's amazing how often zeros are just
sitting around in registers ;) - this should fix a bunch of tests.
llvm-svn: 21221
2005-04-11 05:55:56 +00:00
Jeff Cohen
1b89da675f
Eliminate tabs
...
llvm-svn: 21216
2005-04-11 03:44:22 +00:00
Nate Begeman
32163963cb
Fix libcall code to not pass a NULL Chain to LowerCallTo
...
Fix libcall code to not crash or assert looking for an ADJCALLSTACKUP node
when it is known that there is no ADJCALLSTACKDOWN to match.
Expand i64 multiply when ISD::MULHU is legal for the target.
llvm-svn: 21214
2005-04-11 03:01:51 +00:00
Chris Lattner
4f26677dc9
Don't bother sign/zext_inreg'ing the result of an and operation if we know
...
the result does change as a result of the extend.
This improves codegen for Alpha on this testcase:
int %a(ushort* %i) {
%tmp.1 = load ushort* %i
%tmp.2 = cast ushort %tmp.1 to int
%tmp.4 = and int %tmp.2, 1
ret int %tmp.4
}
Generating:
a:
ldgp $29, 0($27)
ldwu $0,0($16)
and $0,1,$0
ret $31,($26),1
instead of:
a:
ldgp $29, 0($27)
ldwu $0,0($16)
and $0,1,$0
addl $0,0,$0
ret $31,($26),1
btw, alpha really should switch to livein/outs for args :)
llvm-svn: 21213
2005-04-10 23:37:16 +00:00
Chris Lattner
c730ea00e2
Teach legalize to deal with targets that don't support some SEXTLOAD/ZEXTLOADs
...
llvm-svn: 21212
2005-04-10 22:54:25 +00:00
Chris Lattner
1b9e1e26cb
don't zextload fp values!
...
llvm-svn: 21209
2005-04-10 17:40:35 +00:00
Nate Begeman
34aa7ec9cb
Fix another fixme: factor out the constant fp generation code.
...
llvm-svn: 21207
2005-04-10 06:06:10 +00:00
Nate Begeman
b6c9b326e3
Fix 64 bit argument loading that straddles the args in regs / args on stack
...
boundary.
llvm-svn: 21206
2005-04-10 05:53:14 +00:00
Chris Lattner
0c089eae41
Until we have a dag combiner, promote using zextload's instead of extloads.
...
This gives the optimizer a bit of information about the top-part of the
value.
llvm-svn: 21205
2005-04-10 04:33:47 +00:00
Chris Lattner
9d13d0b958
Fold zext_inreg(zextload), likewise for sext's
...
llvm-svn: 21204
2005-04-10 04:33:08 +00:00
Chris Lattner
9c8fe594e5
add a simple xform
...
llvm-svn: 21203
2005-04-10 04:04:49 +00:00
Nate Begeman
0a43e95718
Remove unnecessary Implicit Defs. Since r0 is not in allocation, we do not
...
have to inform the register allocator it might be stepped on.
llvm-svn: 21202
2005-04-10 03:59:42 +00:00
Nate Begeman
f5cedbc812
Make sure that BRCOND branches can be converted into long branches too.
...
llvm-svn: 21198
2005-04-10 01:48:29 +00:00
Nate Begeman
ab8e705a52
Don't hand ISD::CALL nodes off to SelectExprFP. This fixes siod.
...
llvm-svn: 21197
2005-04-10 01:14:13 +00:00
Chris Lattner
b3518a838c
Fix a thinko. If the operand is promoted, pass the promoted value into
...
the new zero extend, not the original operand. This fixes cast bool -> long
on ppc.
Add an unrelated fixme
llvm-svn: 21196
2005-04-10 01:13:15 +00:00
Chris Lattner
c1bacbff9d
rename getPPCOpcodeForSetCCNumber -> getPPCOpcodeForSetCCOpode to be more
...
correct. Remove the EmitComparison retvalue, as it is always the first arg.
Fix a place where we incorrectly passed in the setcc opcode instead of the
setcc number, causing us to miscompile crafty. Crafty now works!
llvm-svn: 21195
2005-04-10 01:03:31 +00:00
Nate Begeman
a2374d39df
fix ISD::BRCONDTWOWAY codegen to not deference the end() iterator
...
llvm-svn: 21193
2005-04-09 23:35:05 +00:00
Chris Lattner
17c60891c1
Fix CodeGen/Generic/2005-05-09-GlobalInPHI.ll, which was reduced from 254.gap.
...
This caused the "use before a def" assertion on some programs.
With this patch, 254.gap now passes with the PPC backend.
llvm-svn: 21191
2005-04-09 22:05:17 +00:00
Chris Lattner
034716de24
add a little peephole optimization. This allows us to codegen:
...
int a(short i) {
return i & 1;
}
as
_a:
andi. r3, r3, 1
blr
instead of:
_a:
rlwinm r2, r3, 0, 16, 31
andi. r3, r2, 1
blr
on ppc. It should also help the other risc targets.
llvm-svn: 21189
2005-04-09 21:43:54 +00:00
Chris Lattner
b630949c2e
do not set the root to null if an argument is dead
...
llvm-svn: 21188
2005-04-09 21:23:24 +00:00
Nate Begeman
dda6155d19
Add rlwnm instruction for variable rotate
...
Generate rotate left/right immediate
Generate code for brcondtwoway
Use new livein/liveout functionality
llvm-svn: 21187
2005-04-09 20:09:12 +00:00
Chris Lattner
72b1964108
Fix a crash on 173.applu by asking for a constant bigger than 32-bits.
...
llvm-svn: 21185
2005-04-09 19:47:21 +00:00
Chris Lattner
c97f9f403f
Switch this instruction selector over to using liveins and liveouts, eliminating
...
implicit defs on entry to the function. yaay :)
llvm-svn: 21184
2005-04-09 16:32:30 +00:00
Chris Lattner
77ab286605
there is no need to remove this instruction, linscan does it already as it
...
removes noop moves.
llvm-svn: 21183
2005-04-09 16:24:20 +00:00
Chris Lattner
f408e9a07b
Adjust live intervals to support a livein set
...
llvm-svn: 21182
2005-04-09 16:17:50 +00:00
Chris Lattner
fd4d6e0847
Use live out sets for return values instead of imp_defs, which is cleaner and faster.
...
llvm-svn: 21181
2005-04-09 15:23:56 +00:00
Chris Lattner
1a9c8fc64a
Consider the livein/out set for a function, allowing targets to not have to
...
use ugly imp_def/imp_uses for arguments and return values.
llvm-svn: 21180
2005-04-09 15:23:25 +00:00
Duraid Madina
2d3e52576d
ok, the "ia64 has a boatload of registers" joke stopped being funny today ;)
...
* fix overallocation of integer (stacked) registers: we can't allocate
registers for local use if they are required as output registers
this fixes 'toast' in the test suite, and all sorts of larger programs
like bzip2 etc.
llvm-svn: 21178
2005-04-09 11:53:00 +00:00
Nate Begeman
98bcb13bfa
Optimize FSEL a bit for fneg arguments. This fixes the recently added test
...
case so that we emit
_test_fneg_sel:
.LBB_test_fneg_sel_0: ;
fsel f1, f1, f3, f2
blr
instead of:
_test_fneg_sel:
.LBB_test_fneg_sel_0: ;
fneg f0, f1
fneg f0, f0
fsel f1, f0, f3, f2
blr
llvm-svn: 21177
2005-04-09 09:33:07 +00:00
Chris Lattner
3fde68ea7a
Fix CodeGen/SparcV9/2005-05-09-GEP-Crash.ll a crash on some specfp program
...
lets hope this doesn't break other programs with induced entropy
llvm-svn: 21174
2005-04-09 06:27:14 +00:00
Chris Lattner
afa0001d54
recognize some patterns as fabs operations, so that fabs at the source level
...
is deconstructed then reconstructed here. This catches 19 fabs's in 177.mesa
9 in 168.wupwise, 5 in 171.swim, 3 in 172.mgrid, and 14 in 173.applu out of
specfp2000.
This allows the X86 code generator to make MUCH better code than before for
each of these and saves one instr on ppc.
This depends on the previous CFE patch to expose these correctly.
llvm-svn: 21171
2005-04-09 05:15:53 +00:00
Chris Lattner
8e6eafa8e1
Emit BRCONDTWOWAY when possible.
...
llvm-svn: 21167
2005-04-09 03:30:29 +00:00
Chris Lattner
55b73bda6c
Legalize BRCONDTWOWAY into a BRCOND/BR pair if a target doesn't support it.
...
llvm-svn: 21166
2005-04-09 03:30:19 +00:00
Chris Lattner
da902bdf1b
print and fold BRCONDTWOWAY correctly
...
llvm-svn: 21165
2005-04-09 03:27:28 +00:00
Chris Lattner
39f963f968
This target does not support/want ISD::BRCONDTWOWAY
...
llvm-svn: 21164
2005-04-09 03:22:37 +00:00
Chris Lattner
c80baf5567
This target does not yet support ISD::BRCONDTWOWAY
...
llvm-svn: 21163
2005-04-09 03:22:30 +00:00
Nate Begeman
99fb6814bd
64b: Expand S/UREM
...
32b: No longer pattern match fneg(fsub(fmul)) as fnmsub
Pattern match fsub a, mul(b, c) as fnmsub
Pattern match fadd a, mul(b, c) as fmadd
Those changes speed up hydro2d by 2.5%, distray by 6%, and scimark by 8%
llvm-svn: 21161
2005-04-09 03:05:51 +00:00
Chris Lattner
31170cd2ec
canonicalize a bunch of operations involving fneg
...
llvm-svn: 21160
2005-04-09 03:02:46 +00:00
Nate Begeman
95e1b860a1
Fix 64b shifts
...
llvm-svn: 21159
2005-04-08 23:45:01 +00:00
Nate Begeman
3fca499b8d
Match Mac OS X 64 bit calling conventions
...
llvm-svn: 21157
2005-04-08 21:26:05 +00:00
Andrew Lenharth
6eff2083b5
collect a few statistics, factor constants (constant loading and mult), fix logic operation pattern matchs, supress FP div when int dividing by a constant
...
llvm-svn: 21156
2005-04-08 17:28:49 +00:00
Duraid Madina
e7412561bf
fix bogus division-by-power-of-2 (was wrong for negative input, adds extr insn)
...
fix hack in division (clean up frcpa instruction)
llvm-svn: 21153
2005-04-08 10:01:48 +00:00
Chris Lattner
f82999fabe
Fix bug: InstCombine/2005-05-07-UDivSelectCrash.ll
...
llvm-svn: 21152
2005-04-08 04:03:26 +00:00
Nate Begeman
6875356db1
Optimized code sequences for setcc reg, 0
...
Optimized code sequence for (a < 0) ? b : 0
llvm-svn: 21150
2005-04-07 20:30:01 +00:00
Andrew Lenharth
b1b7ef0979
Alpha zero extends setcc results
...
llvm-svn: 21149
2005-04-07 20:11:32 +00:00
Chris Lattner
9a56ef5693
If a target zero or sign extends the result of its setcc, allow folding of
...
this into sign/zero extension instructions later.
On PPC, for example, this testcase:
%G = external global sbyte
implementation
void %test(int %X, int %Y) {
%C = setlt int %X, %Y
%D = cast bool %C to sbyte
store sbyte %D, sbyte* %G
ret void
}
Now codegens to:
cmpw cr0, r3, r4
li r3, 1
li r4, 0
blt .LBB_test_2 ;
.LBB_test_1: ;
or r3, r4, r4
.LBB_test_2: ;
addis r2, r2, ha16(L_G$non_lazy_ptr-"L00000$pb")
lwz r2, lo16(L_G$non_lazy_ptr-"L00000$pb")(r2)
stb r3, 0(r2)
instead of:
cmpw cr0, r3, r4
li r3, 1
li r4, 0
blt .LBB_test_2 ;
.LBB_test_1: ;
or r3, r4, r4
.LBB_test_2: ;
*** rlwinm r3, r3, 0, 31, 31
addis r2, r2, ha16(L_G$non_lazy_ptr-"L00000$pb")
lwz r2, lo16(L_G$non_lazy_ptr-"L00000$pb")(r2)
stb r3, 0(r2)
llvm-svn: 21148
2005-04-07 19:43:53 +00:00
Chris Lattner
352dd3e579
PowerPC zero extends setcc results
...
llvm-svn: 21147
2005-04-07 19:41:49 +00:00
Chris Lattner
583dcc66f2
X86 zero extends setcc results
...
llvm-svn: 21146
2005-04-07 19:41:46 +00:00
Chris Lattner
bbe0e9e9db
Remove somethign I had for testing
...
llvm-svn: 21144
2005-04-07 18:58:54 +00:00
Andrew Lenharth
c3e5c42b86
fix a small optimization opertunity and make gcc happy
...
llvm-svn: 21143
2005-04-07 18:15:28 +00:00
Chris Lattner
ee836c7b32
This patch does two things. First, it canonicalizes 'X >= C' -> 'X > C-1'
...
(likewise for <= >=u >=u).
Second, it implements a special case hack to turn 'X gtu SINTMAX' -> 'X lt 0'
On powerpc, for example, this changes this:
lis r2, 32767
ori r2, r2, 65535
cmplw cr0, r3, r2
bgt .LBB_test_2
into:
cmpwi cr0, r3, 0
blt .LBB_test_2
llvm-svn: 21142
2005-04-07 18:14:58 +00:00
Andrew Lenharth
469244a2b4
fixup magic constant making code. tested by thousands of random divisions.... by 10000. ok, so random divisors would be good too, but this at least fixes some things
...
llvm-svn: 21140
2005-04-07 17:19:16 +00:00
Andrew Lenharth
61215a78fb
lowercase instructions, makes diff happier
...
llvm-svn: 21139
2005-04-07 17:17:48 +00:00
Chris Lattner
05d79f36b1
Implement the following xforms:
...
(X-Y)-X --> -Y
A + (B - A) --> B
(B - A) + A --> B
llvm-svn: 21138
2005-04-07 17:14:51 +00:00
Chris Lattner
5b31ada26d
Implement InstCombine/add.ll:test28, transforming C1-(X+C2) --> (C1-C2)-X.
...
This occurs several dozen times in specint2k, particularly in crafty and gcc
apparently.
llvm-svn: 21136
2005-04-07 16:28:01 +00:00
Chris Lattner
ea1752ddf5
Transform X-(X+Y) == -Y and X-(Y+X) == -Y
...
llvm-svn: 21134
2005-04-07 16:15:25 +00:00
Andrew Lenharth
3d1228500d
It wasn't happy about this either
...
llvm-svn: 21133
2005-04-07 14:18:13 +00:00
Andrew Lenharth
ce0dcea720
Yea, it wasn't happy
...
llvm-svn: 21132
2005-04-07 13:55:53 +00:00
Duraid Madina
ab6e2cfaeb
teach asmprinter to print s8/s14 operands
...
llvm-svn: 21131
2005-04-07 12:34:36 +00:00
Duraid Madina
ed8eb83203
codegen immediate forms of add/sub/shift
...
llvm-svn: 21130
2005-04-07 12:33:38 +00:00
Duraid Madina
5c4a7b68b3
add immediate forms of add, sub, shift
...
llvm-svn: 21129
2005-04-07 12:32:24 +00:00
Chris Lattner
22bbc2351e
Fix a really scary bug that Nate found where we weren't deleting the right
...
elements auto of the autoCSE maps.
llvm-svn: 21128
2005-04-07 00:30:13 +00:00
Nate Begeman
b890f32ac9
Pattern match bitfield insert, which helps shift long by immediate, among
...
other things.
llvm-svn: 21127
2005-04-06 23:51:40 +00:00
Nate Begeman
6c5e4c3bb1
Fix some shift bugs
...
llvm-svn: 21126
2005-04-06 22:42:08 +00:00
Alkis Evlogimenos
242fe17bc4
Make these 64 bit constants so that this compiles on x86-32 as well.
...
llvm-svn: 21125
2005-04-06 22:09:40 +00:00
Andrew Lenharth
abf46a4f5e
added sdiv by 2^k and works for neg divisors also
...
llvm-svn: 21124
2005-04-06 22:03:13 +00:00
Chris Lattner
49d166c6b6
Don't make this require loopsimplify. It works BETTER with loop simplify
...
but should not require it.
llvm-svn: 21123
2005-04-06 21:45:00 +00:00
Nate Begeman
7898fc8cc8
Teach ExpandShift how to handle shifts by a constant. This allows targets
...
like PowerPC to codegen long shifts in many fewer instructions.
llvm-svn: 21122
2005-04-06 21:13:14 +00:00
Andrew Lenharth
5c71402202
fix copy/paste errors, and add imm support to SxADDQ and SxSUBQ
...
llvm-svn: 21121
2005-04-06 20:59:59 +00:00
Chris Lattner
40ee8a062b
Fix SingleSource/Regression/C/2005-05-06-LongLongSignedShift.c, we were not
...
properly sign extending the top of the result of a 64-bit shift right by
a constant > 32.
llvm-svn: 21120
2005-04-06 20:59:35 +00:00
Andrew Lenharth
9f65102f9b
Added Nate's div by constant stuff, also scaled operations!
...
llvm-svn: 21116
2005-04-06 20:25:34 +00:00
Chris Lattner
7c0786dda8
Fix a namespace issue, reported by Vladimir Merzliakov!
...
llvm-svn: 21115
2005-04-06 19:45:39 +00:00
Duraid Madina
5fa854a7fe
steal sampo's div-by-constant-power-of-2 stuff
...
thanks sampo!!
llvm-svn: 21113
2005-04-06 09:55:17 +00:00
Duraid Madina
1d156293cb
add fms instruction
...
llvm-svn: 21112
2005-04-06 09:54:09 +00:00
Nate Begeman
98251d6a1c
Fixed version of optimized integer divide is now fixed. Calculate the
...
quotient, not the remainder. Also, make sure to remove the old div operand
from the ExprMap and let SelectExpr insert the new one.
llvm-svn: 21111
2005-04-06 06:44:57 +00:00
Duraid Madina
2db29b8016
lie a bit and say that r1/r12 (GP/SP) _aren't_ callee-save, as we take
...
care of this ourselves
llvm-svn: 21110
2005-04-06 06:18:36 +00:00
Duraid Madina
62c4de8d62
make sure 'special' registers don't get allocated
...
llvm-svn: 21109
2005-04-06 06:17:54 +00:00
Chris Lattner
c87ff88efb
Add (untested) support for MULHS and MULHU.
...
llvm-svn: 21107
2005-04-06 04:21:07 +00:00
Chris Lattner
ba7cdbebb1
add signed versions of the extra precision multiplies
...
llvm-svn: 21106
2005-04-06 04:19:22 +00:00
Nate Begeman
aee0f81849
Turn off the div -> mul optimization until it works correctly 100% of the
...
time.
llvm-svn: 21105
2005-04-06 03:36:33 +00:00
Nate Begeman
b44597771c
Add support for MULHS and MULHU nodes
...
Have LegalizeDAG handle SREM and UREM for us
Codegen SDIV and UDIV by constant as a multiply by magic constant instead
of integer divide, which is very slow.
llvm-svn: 21104
2005-04-06 00:25:27 +00:00
Nate Begeman
4457b4994c
Expand SREM and UREM for targets that claim not to have them, like PowerPC
...
llvm-svn: 21103
2005-04-06 00:23:54 +00:00
Nate Begeman
12af81407b
Add MULHU and MULHS nodes for the high part of an (un)signed 32x32=64b
...
multiply.
llvm-svn: 21102
2005-04-05 22:36:56 +00:00
Andrew Lenharth
613a940af8
added lowerargs support for varargs
...
llvm-svn: 21101
2005-04-05 20:51:46 +00:00
Nate Begeman
82ff41c342
Behold, rlwinm with certain immediate arguments is printed as the much more
...
readable slwi or srwi (shift left/right word immediate).
llvm-svn: 21099
2005-04-05 18:19:50 +00:00
Nate Begeman
581553fd21
Fix cut & paste errors (32->64), and codegen float->int more optimally.
...
llvm-svn: 21098
2005-04-05 17:32:30 +00:00
Tanya Lattner
4d8f553a82
Updated to use dep analyzer.
...
llvm-svn: 21097
2005-04-05 16:36:44 +00:00
Nate Begeman
152dbbe856
Remove 64 bit simple ISel, it never worked correctly
...
Add initial (buggy) implementation of 64 bit pattern ISel
llvm-svn: 21096
2005-04-05 08:51:15 +00:00
Nate Begeman
a18a26f47c
Back out the previous change to SelectBranchCC, since there are cases it
...
could miscompile. A correct solution will be found in the near future.
llvm-svn: 21095
2005-04-05 04:32:16 +00:00
Nate Begeman
358dee806e
Rename canUseAsImmediateForOpcode to getImmediateForOpcode to better
...
indicate that it is not a boolean function.
Properly emit the pseudo instruction for conditional branch, so that we
can fix up conditional branches whose displacements are too large.
Reserve the right amount of opcode space for said pseudo instructions.
llvm-svn: 21094
2005-04-05 04:22:58 +00:00
Chris Lattner
c5b8fbe7f9
do not crash when using -debug
...
llvm-svn: 21092
2005-04-05 01:12:03 +00:00
Nate Begeman
ede4abc899
Implement SDIV by power of 2 as srawi/addze rather than load imm, divw
...
llvm-svn: 21091
2005-04-05 00:15:08 +00:00
Nate Begeman
00002553ba
Pattern match fp mul-add, mul-sub, neg-mul-add, and neg-mul-sub
...
llvm-svn: 21090
2005-04-04 23:40:36 +00:00
Nate Begeman
682fd51f9c
Add support for multiply-add, multiply-sub, and their negated versions
...
llvm-svn: 21089
2005-04-04 23:01:51 +00:00
Chris Lattner
57ea4daa2e
do not dereference an extra layer of pointers to determine if an external
...
call can modify a memory location. This fixes
test/Regression/Analysis/Andersens/modreftest.ll
llvm-svn: 21088
2005-04-04 22:23:21 +00:00