long long test2(unsigned A, unsigned B) {
return ((unsigned long long)A << 32) + B;
}
is equivalent to this:
long long test1(unsigned A, unsigned B) {
return ((unsigned long long)A << 32) | B;
}
Now they are both codegen'd to this on ppc:
_test2:
blr
or this on x86:
test2:
movl 4(%esp), %edx
movl 8(%esp), %eax
ret
llvm-svn: 21231
masking shifts.
This fixes the miscompilation of this:
long long test1(unsigned A, unsigned B) {
return ((unsigned long long)A << 32) | B;
}
into this:
test1:
movl 4(%esp), %edx
movl %edx, %eax
orl 8(%esp), %eax
ret
allowing us to generate this instead:
test1:
movl 4(%esp), %edx
movl 8(%esp), %eax
ret
llvm-svn: 21230
Refactor how . instructions are handled. In particular, instead of passing
the RC flag all the way up the inheritance hierarchy, just make a new tblgen
class 'DOT' which can be added to an instruction definition.
For example, instead of this:
-def AND : XForm_6<31, 28, 0, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB),
-let Defs = [CR0] in
-def ANDo : XForm_6<31, 28, 1, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB),
- "and. $rA, $rS, $rB">;
We now have this:
+def AND : XForm_6<31, 28, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB),
"and $rA, $rS, $rB">;
llvm-svn: 21225
* clean up immediates (we use 14, 22 and 64 bit immediates now. sane.)
* fold r0/f0/f1 registers into comparisons against 0/0.0/1.0
* fix nasty thinko - didn't use two-address form of conditional add
for extending bools to integers, so occasionally there would be
garbage in the result. it's amazing how often zeros are just
sitting around in registers ;) - this should fix a bunch of tests.
llvm-svn: 21221
Fix libcall code to not crash or assert looking for an ADJCALLSTACKUP node
when it is known that there is no ADJCALLSTACKDOWN to match.
Expand i64 multiply when ISD::MULHU is legal for the target.
llvm-svn: 21214
the result does change as a result of the extend.
This improves codegen for Alpha on this testcase:
int %a(ushort* %i) {
%tmp.1 = load ushort* %i
%tmp.2 = cast ushort %tmp.1 to int
%tmp.4 = and int %tmp.2, 1
ret int %tmp.4
}
Generating:
a:
ldgp $29, 0($27)
ldwu $0,0($16)
and $0,1,$0
ret $31,($26),1
instead of:
a:
ldgp $29, 0($27)
ldwu $0,0($16)
and $0,1,$0
addl $0,0,$0
ret $31,($26),1
btw, alpha really should switch to livein/outs for args :)
llvm-svn: 21213
correct. Remove the EmitComparison retvalue, as it is always the first arg.
Fix a place where we incorrectly passed in the setcc opcode instead of the
setcc number, causing us to miscompile crafty. Crafty now works!
llvm-svn: 21195
int a(short i) {
return i & 1;
}
as
_a:
andi. r3, r3, 1
blr
instead of:
_a:
rlwinm r2, r3, 0, 16, 31
andi. r3, r2, 1
blr
on ppc. It should also help the other risc targets.
llvm-svn: 21189
* fix overallocation of integer (stacked) registers: we can't allocate
registers for local use if they are required as output registers
this fixes 'toast' in the test suite, and all sorts of larger programs
like bzip2 etc.
llvm-svn: 21178
is deconstructed then reconstructed here. This catches 19 fabs's in 177.mesa
9 in 168.wupwise, 5 in 171.swim, 3 in 172.mgrid, and 14 in 173.applu out of
specfp2000.
This allows the X86 code generator to make MUCH better code than before for
each of these and saves one instr on ppc.
This depends on the previous CFE patch to expose these correctly.
llvm-svn: 21171
32b: No longer pattern match fneg(fsub(fmul)) as fnmsub
Pattern match fsub a, mul(b, c) as fnmsub
Pattern match fadd a, mul(b, c) as fmadd
Those changes speed up hydro2d by 2.5%, distray by 6%, and scimark by 8%
llvm-svn: 21161