172 %ECX<def> = MOV32rr %reg1039<kill>
180 INLINEASM <es:subl $5,$1
sbbl $3,$0>, 10, %EAX<def>, 14, %ECX<earlyclobber,def>, 9, %EAX<kill>,
36, <fi#0>, 1, %reg0, 0, 9, %ECX<kill>, 36, <fi#1>, 1, %reg0, 0
188 %EAX<def> = MOV32rr %EAX<kill>
196 %ECX<def> = MOV32rr %ECX<kill>
204 %ECX<def> = MOV32rr %ECX<kill>
212 %EAX<def> = MOV32rr %EAX<kill>
220 %EAX<def> = MOV32rr %EAX
228 %reg1039<def> = MOV32rr %ECX<kill>
The early clobber operand ties ECX input to the ECX def.
The live interval of ECX is represented as this:
%reg20,inf = [46,47:1)[174,230:0) 0@174-(230) 1@46-(47)
The right way to represent this is something like
%reg20,inf = [46,47:2)[174,182:1)[181:230:0) 0@174-(182) 1@181-230 @2@46-(47)
Of course that won't work since that means overlapping live ranges defined by two val#.
The workaround for now is to add a bit to val# which says the val# is redefined by a early clobber def somewhere. This prevents the move at 228 from being optimized away by SimpleRegisterCoalescing::AdjustCopiesBackFrom.
llvm-svn: 61259
- Use SplitBlockPredecessors to factor out common predecessors of the critical edge destination. This is disabled for now due to some regressions.
llvm-svn: 61248
The EH_frame and .eh symbols are now private, except for darwin9 and earlier.
The patch also fixes the definition of PrivateGlobalPrefix on pcc linux.
llvm-svn: 61242
computation code. Also, avoid adding output-depenency edges when both
defs are dead, which frequently happens with EFLAGS defs.
Compute Depth and Height lazily, and always in terms of edge latency
values. For the schedulers that don't care about latency, edge latencies
are set to 1.
Eliminate Cycle and CycleBound, and LatencyPriorityQueue's Latencies array.
These are all subsumed by the Depth and Height fields.
llvm-svn: 61073
which are identical to the original patterns.
- Change the multiply with overflow so that we distinguish between signed and
unsigned multiplication. Currently, unsigned multiplication with overflow
isn't working!
llvm-svn: 60963
for promoted integer types, eg: i16 on ppc-32, or
i24 on any platform. Complete support for arbitrary
precision integers would require handling expanded
integer types, eg: i128, but I couldn't be bothered.
llvm-svn: 60834
overflow/carry from the "arithmetic with overflow" intrinsics. It searches the
machine basic block from bottom to top to find the SETO/SETC instruction that is
its conditional. If an instruction modifies EFLAGS before it reaches the
SETO/SETC instruction, then it defaults to the normal instruction emission.
llvm-svn: 60807
essential problem was that the DAG can contain
random unused nodes which were never analyzed.
When remapping a value of a node being processed,
such a node may become used and need to be analyzed;
however due to operands being transformed during
analysis the node may morph into a different one.
Users of the morphing node need to be updated, and
this wasn't happening. While there I added a bunch
of documentation and sanity checks, so I (or some
other poor soul) won't have to scratch their head
over this stuff so long trying to remember how it
was all supposed to work next time some obscure
problem pops up! The extra sanity checking exposed
a few places where invariants weren't being preserved,
so those are fixed too. Since some of the sanity
checking is expensive, I added a flag to turn it
on. It is also turned on when building with
ENABLE_EXPENSIVE_CHECKS=1.
llvm-svn: 60797
Fix the shift amount when unrolling a vector shift into scalar shifts.
Fix problem in getShuffleScalarElt where it assumes that the input of
a bit convert must be a vector.
llvm-svn: 60740
and use it in x86 address mode folding. Also, make
getRegForValue return 0 for illegal types even if it has a
ValueMap for them, because Argument values are put in the
ValueMap. This fixes PR3181.
llvm-svn: 60696
loops when they can be subsumed into addressing modes.
Change X86 addressing mode check to realize that
some PIC references need an extra register.
(I believe this is correct for Linux, if not, I'm sure
someone will tell me.)
llvm-svn: 60608
1. GlobalBaseReg may have been spilled.
2. It may not be live at the use.
3. Spiller doesn't know this is happening so it won't prevent GlobalBaseReg from being spilled later (That by itself is a nasty hack. It's needed because we don't insert the reload until later).
llvm-svn: 60595
aren't part of the test suite but are generally useful nonetheless, and can
be expanded later to test the backend against the actual Cell SPU system.
There's basically no other good place to put this code, so put it here for
the time being.
- vecoperations.c: Vector shuffles for all supported vector types, tests
for v16i8 add and multiply.
llvm-svn: 60566
foldMemoryOperand how to "fold" them, by converting them into constant-pool
loads. When they aren't folded, they use xorps/cmpeqd, but for example when
register pressure is high, they may now be folded as memory operands, which
reduces register pressure.
Also, mark V_SET0 isAsCheapAsAMove so that two-address-elimination will
remat it instead of copying zeros around (V_SETALLONES was already marked).
llvm-svn: 60461
delegates to the regular x86-32 convention which handles byval, but only
after it handles a few cases, and it's necessary to handle byval before
handling those cases. This fixes PR3122 (and rdar://6400815), llvm-gcc
miscompiling LLVM.
llvm-svn: 60453
1. ppcf128 select is expanded to f64 select's.
2. f64 select operand 0 is an i1 truncate, it's promoted to i32 zero_extend.
3. f64 select is updated. It's changed back to a "NewNode" and being re-analyzed.
4. f64 select operands are being processed. Operand 0 is a "NewNode". It's being expunged out of ReplacedValues map.
5. ExpungeNode tries to remap f64 select and notice it's a "NewNode" and assert.
Duncan, please take a look. Thanks.
llvm-svn: 60443
- Incorporate Tilmann Scheller's ISD::TRUNCATE custom lowering patch
- Update SPU calling convention info, even if it's not used yet (but can be
at some point or another)
- Ensure that any-extended f32 loads are custom lowered, especially when
they're promoted for use in printf.
llvm-svn: 60438
- LowerXADDO lowers [SU]ADDO into an ADD with an implicit EFLAGS define. The
EFLAGS are fed into a SETCC node which has the conditional COND_O or COND_C,
depending on the type of ADDO requested.
- LowerBRCOND now recognizes if it's coming from a SETCC node with COND_O or
COND_C set.
llvm-svn: 60388
figuring out the base of the IV. This produces better
code in the example. (Addresses use (IV) instead of
(BASE,IV) - a significant improvement on low-register
machines like x86).
llvm-svn: 60374