LiveVariables add implicit kills to correctly track partial register kills. This works well enough and is fairly accurate. But coalescer can make it impossible to maintain these markers. e.g.
BL <ga:sss1>, %R0<kill,undef>, %S0<kill>, %R0<imp-def>, %R1<imp-def,dead>, %R2<imp-def,dead>, %R3<imp-def,dead>, %R12<imp-def,dead>, %LR<imp-def,dead>, %D0<imp-def>, ...
...
%reg1031<def> = FLDS <cp#1>, 0, 14, %reg0, Mem:LD4[ConstantPool]
...
%S0<def> = FCPYS %reg1031<kill>, 14, %reg0, %D0<imp-use,kill>
When reg1031 and S0 are coalesced, the copy (FCPYS) will be eliminated the the implicit-kill of D0 is lost. In this case it's possible to move the marker to the FLDS. But in many cases, this is not possible. Suppose
%reg1031<def> = FOO <cp#1>, %D0<imp-def>
...
%S0<def> = FCPYS %reg1031<kill>, 14, %reg0, %D0<imp-use,kill>
When FCPYS goes away, the definition of S0 is the "FOO" instruction. However, transferring the D0 implicit-kill to FOO doesn't work since it is the def of D0 itself. We need to fix this in another time by introducing a "kill" pseudo instruction to track liveness.
Disabling the assertion is not ideal, but machine verifier is doing that job now. It's important to know double-def is not a miscomputation since it means a register should be free but it's not tracked as free. It's a performance issue instead.
llvm-svn: 82677
of the defs are processed.
Also fix a implicit_def propagation bug: a implicit_def of a physical register
should be applied to uses of the sub-registers.
llvm-svn: 82616
variable increment / decrement slighter high priority.
This has major impact on some micro-benchmarks. On MultiSource/Applications
and spec tests, it's a minor win. It also reduce 256.bzip instruction count
by 8%, 55 on 164.gzip on i386 / Darwin.
llvm-svn: 82485
And fix a bug with the behavior of min/max instructions formed from
fcmp uge comparisons.
Also, use FiniteOnlyFPMath() for this code instead of UnsafeFPMath,
as it is more specific.
llvm-svn: 82466
we pushed the beginning of the interval back 1, so the
interval would overlap with inputs that die. We were
also pushing the end of the interval back 1, though,
which means the earlyclobber didn't overlap with other
output operands. Don't do this. PR 4964.
llvm-svn: 82342
getSymbolForDwarfGlobalReference is smart enough to know that it
needs to register the stub it references with MachineModuleInfoMachO,
so that it gets emitted at the end of the file.
Move stub emission from X86ATTAsmPrinter::doFinalization to the
new X86ATTAsmPrinter::EmitEndOfAsmFile asmprinter hook. The important
thing here is that EmitEndOfAsmFile is called *after* the ehframes are
emitted, so we get all the stubs.
This allows us to remove a gross hack from the asmprinter where it would
"just know" that it needed to output stubs for personality functions.
Now this is all driven from a consistent interface.
The testcase change is just reordering the expected output now that the
stubs come out after the ehframe instead of before.
This also unblocks other changes that Bill wants to make.
llvm-svn: 82269
has multiple uses, as one of the other uses may be on a path
to a different node above the callseq_start, because that
leads to a cyclic graph. This problem is exposed when
-combiner-global-alias-analysis is used. This fixes PR4880.
llvm-svn: 81821
Change the picbase symbol on non-darwin systems from ".Lllvm$4.$piclabel" to
".L4$pb". The actual name doesn't matter and the darwin name is shorter.
llvm-svn: 81688
input filename so that opt doesn't print the input filename in the
output so that grep lines in the tests don't unintentionally match
strings in the input filename.
llvm-svn: 81537
safe. This can happen we a subreg_to_reg 0 has been coalesced. One
exception is when the instruction that folds the load is a move, then we
can simply turn it into a 32-bit load from the stack slot.
rdar://7170444
llvm-svn: 81494
to instructions instead of zero extended ones. This makes the asmprinter
print signed values more consistently. This apparently only really affects
the X86 backend.
llvm-svn: 81265
depth first order, so it wouldn't process unreachable blocks.
When compiling at -O0, late dead block elimination isn't done
and the bad instructions got to isel.
llvm-svn: 81187
from floating-point to integer first, and bitcast the result
back to floating-point. Previously, this test was passing by
falling back to SelectionDAG lowering. The resulting code isn't
as nice, but it's correct and CodeGen now stays on the fast path.
llvm-svn: 81171
linear scan reg alloc. This fixes a problem I ran into where extracting
a function from a larger file caused the generated code to change (masking
the problem I was trying to debug) because the allocator behaved differently.
This changes the results for two X86 regression checks. stack-color-with-reg
is improved, with one less instruction, but pr3495 is worse, with one more
copy. As far as I can tell, these tests were just getting lucky or unlucky,
so I've changed the expected results.
llvm-svn: 81060
tied to different source registers, the TwoAddressInstructionPass needs to
be smarter. Change it to check before replacing a source register whether
that source register is tied to a different destination register, and if so,
defer handling it until a subsequent iteration.
llvm-svn: 80654
makes an eggregious hack somewhat more palatable. Bringing the LSDA forward
and making it a GV available for reference would be even better, but is
beyond the scope of what I'm looking to solve at this point.
Objective C++ code could generate function names that broke the previous
scheme. This fixes that.
llvm-svn: 80649
moves. This avoids the need to promote the operands (or implicitly
extend them, a partial register update condition), and can reduce
i8 register pressure. This substantially speeds up code such as
write_hex in lib/Support/raw_ostream.cpp.
subclass-coalesce.ll is too trivial and no longer tests what it was
originally intended to test.
llvm-svn: 80184
leads to partial-register definitions. To help avoid redundant
zero-extensions, also teach the h-register matching patterns that
use movzbl to match anyext as well as zext.
llvm-svn: 80099
more and is much nicer to the OS.
- Dan, please check. If there are parts of the test you think I should strip
out so it doesn't cause random failures let me know (there are still some PIC
label numbers in it, for example).
llvm-svn: 80019
code, according to Anton (I'm not totally convinced, but we can always
resurrect patches if we need to do so.)
- Start moving CellSPU's tests to prefer FileCheck.
llvm-svn: 79958
all Darwin targets; could be split into separate tests for
the chip subdirectories, but from Chris' last mail on testing
I assume he'd rather have only one test. Generic seems to be
the best available, maybe there should be a Darwin subdirectory?
llvm-svn: 79877
When undoing a reuse in ReuseInfo::GetRegForReload, check if it was only a
sub-register being used. The MachineOperand::getSubReg() method is only valid
for virtual registers, so we have to recover the sub-register index manually.
llvm-svn: 79855