them. This allows for elminination of redundant extends in the entry
blocks of functions on PowerPC.
Add support for i32 x i32 -> i64 multiplies, by recognizing when the inputs
to ISD::MUL in ExpandOp are actually just extended i32 values and not real
i64 values. this allows us to codegen
int mulhs(int a, int b) { return ((long long)a * b) >> 32; }
as:
_mulhs:
mulhw r3, r4, r3
blr
instead of:
_mulhs:
mulhwu r2, r4, r3
srawi r5, r3, 31
mullw r5, r4, r5
add r2, r2, r5
srawi r4, r4, 31
mullw r3, r4, r3
add r3, r2, r3
blr
with a similar improvement on x86.
llvm-svn: 23147
token chains first. For this C function:
int test() {
int i;
for (i = 0; i < 100000; ++i)
foo();
}
Instead of emitting this (condition before call)
.LBB_test_1: ; no_exit
addi r30, r30, 1
lis r2, 1
ori r2, r2, 34464
cmpw cr2, r30, r2
bl L_foo$stub
bne cr2, .LBB_test_1 ; no_exit
Emit this:
.LBB_test_1: ; no_exit
bl L_foo$stub
addi r30, r30, 1
lis r2, 1
ori r2, r2, 34464
cmpw cr0, r30, r2
bne cr0, .LBB_test_1 ; no_exit
Which makes it so we don't have to save/restore cr2 in the prolog/epilog of
the function.
This also makes the code much more similar to what the pattern isel produces.
llvm-svn: 23135
putting it into the constant pool. This allows the isel machinery to
create constants that it will end up deciding are not needed, without them
ending up in the resultant function constant pool.
llvm-svn: 23081
select. Also teach it that the bit count instructions can only set the low bits
of the result, depending on the size of the input.
This allows us to compile this:
int %eq0(int %a) {
%tmp.1 = seteq int %a, 0 ; <bool> [#uses=1]
%tmp.2 = cast bool %tmp.1 to int ; <int> [#uses=1]
ret int %tmp.2
}
To this:
_eq0:
cntlzw r2, r3
srwi r3, r2, 5
blr
instead of this:
_eq0:
cntlzw r2, r3
rlwinm r3, r2, 27, 31, 31
blr
when setcc is marked illegal on ppc (which restores parity to non-illegal
setcc). Thanks to Nate for pointing this out.
llvm-svn: 23013
Use this information to avoid doing expensive interval intersections for
registers that could not possible be interesting. This speeds up linscan
on ia64 compiling kc++ in release mode from taking 7.82s to 4.8s(!), total
itanium llc time on this program is 27.3s now. This marginally speeds up
PPC and X86, but they appear to be limited by other parts of linscan, not
this code.
On this program, on itanium, live intervals now takes 41% of llc time.
llvm-svn: 22986
number of regs (e.g. most riscs), many functions won't need to use callee
clobbered registers. Do a speculative check to see if we can get a free
register without processing the fixed list (which has all of these). This
saves a lot of time on machines with lots of callee clobbered regs (e.g.
ppc and itanium, also x86).
This reduces ppc llc compile time from 184s -> 172s on kc++. This is probably
worth FAR FAR more on itanium though.
llvm-svn: 22972
we spill out of the fast path. The scan of active_ and the calls to
updateSpillWeights don't need to happen unless a spill occurs. This reduces
debug llc time of kc++ with ppc from 187.3s to 183.2s.
llvm-svn: 22971