The KILL pseudo-instruction may survive to the asm printer pass, just like the IMPLICIT_DEF. Print the KILL as a comment instead of just leaving a blank line in the output.
With -asm-verbose=0, a blank line is printed, as is done for IMPLICIT_DEF.
llvm-svn: 86041
and extract_subreg as a "copy" that defines a valno.
Also fixes a typo. These two issues prevented a simple subreg coalescing from
happening before.
llvm-svn: 86022
This introduces a new pass, SlotIndexes, which is responsible for numbering
instructions for register allocation (and other clients). SlotIndexes numbering
is designed to match the existing scheme, so this patch should not cause any
changes in the generated code.
For consistency, and to avoid naming confusion, LiveIndex has been renamed
SlotIndex.
The processImplicitDefs method of the LiveIntervals analysis has been moved
into its own pass so that it can be run prior to SlotIndexes. This was
necessary to match the existing numbering scheme.
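For reference, here is a minimal sketch of how a client pass might query the new numbering. The header path and the getInstructionIndex call reflect my reading of the SlotIndexes interface at these revisions and should be treated as assumptions, not a quote of the pass:

  #include "llvm/CodeGen/SlotIndexes.h"
  using namespace llvm;

  // Inside a MachineFunctionPass that declared AU.addRequired<SlotIndexes>().
  void numberingExample(MachineInstr *MI, SlotIndexes &SI) {
    // Every instruction receives a SlotIndex; clients compare indexes to
    // order instructions instead of walking the instruction list.
    SlotIndex Idx = SI.getInstructionIndex(MI);
    (void)Idx;
  }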
llvm-svn: 85979
an unconditional branch (possibly from tail merging), this code is
trying to redirect all of its predecessors to go directly to the branch
target, but that isn't feasible for indirect branches. The other
predecessors (that don't end with indirect branches) could theoretically
still be handled, but that is not easily done right now.
The AnalyzeBranch interface doesn't currently let us distinguish jump table
branches from indirect branches, and this code is currently handling
jump tables. To avoid punting on address-taken blocks, we would have to give
up handling jump tables. That seems like a bad tradeoff.
llvm-svn: 85975
the loop preheader. Add instructions that are already in the preheader block and
that may be common expressions of those being hoisted out. This does get a few
more instructions CSE'd.
llvm-svn: 85799
- Be consistent when referring to MachineBasicBlocks: BB#0.
- Be consistent when referring to virtual registers: %reg1024.
- Be consistent when referring to unknown physical registers: %physreg10.
- Be consistent when referring to known physical registers: %RAX.
- Be consistent when referring to register 0: %reg0.
- Be consistent when printing alignments: align=16.
- Print jump table contents.
- Don't print host addresses, in general.
- and various other cleanups.
llvm-svn: 85682
previously running CodePlacementOpt. Also print headers before
each dump in -print-machineinstrs mode, so that it's clear which
dump is which.
llvm-svn: 85681
unfolding loads for hoisting. getOpcodeAfterMemoryUnfold returns the
opcode of the original operation without the load, not the load
itself. MachineLICM needs to know the operand index in order to get
the correct register class. Extend getOpcodeAfterMemoryUnfold to
return this information.
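As a rough sketch (not the MachineLICM code itself), the extended hook could be used like this to pick a register class for the unfolded load; the helper name is invented, and the out-parameter plus the TargetInstrDesc lookup are my reading of the TargetInstrInfo interface at these revisions:

  #include "llvm/CodeGen/MachineRegisterInfo.h"
  #include "llvm/Target/TargetInstrInfo.h"
  #include "llvm/Target/TargetRegisterInfo.h"
  using namespace llvm;

  // Illustrative helper, not the actual MachineLICM code.
  unsigned unfoldLoadForHoisting(const TargetInstrInfo *TII,
                                 const TargetRegisterInfo *TRI,
                                 MachineRegisterInfo *MRI,
                                 MachineInstr *MI) {
    unsigned LoadRegIndex;
    unsigned NewOpc =
        TII->getOpcodeAfterMemoryUnfold(MI->getOpcode(),
                                        /*UnfoldLoad=*/true,
                                        /*UnfoldStore=*/false,
                                        &LoadRegIndex);
    if (!NewOpc)
      return 0;
    // The returned operand index identifies the operand fed by the unfolded
    // load, which determines the register class for the new virtual register.
    const TargetInstrDesc &TID = TII->get(NewOpc);
    const TargetRegisterClass *RC = TID.OpInfo[LoadRegIndex].getRegClass(TRI);
    return MRI->createVirtualRegister(RC);
  }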
llvm-svn: 85622
bunch of associated comments, because it doesn't have anything to do
with DAGs or scheduling. This is another step in decoupling MachineInstr
emitting from scheduling.
llvm-svn: 85517
indexed via the stack pointer, even if a frame pointer is present. Update the
heuristic to place it nearest the stack pointer in that case, rather than
nearest the frame pointer.
llvm-svn: 85474
the second (store) instruction in SpillSlotToUsesMap
consistently. I don't think this matters functionally,
but it's cleaner and Evan wants it this way.
llvm-svn: 85463
to spill after all, we weren't handling 2-instruction
spill sequences correctly (PPC Altivec). We need to
remove the store in this case. Removing the other
instruction(s) would be goodness but is not needed for
correctness, and isn't done here. 7331562.
llvm-svn: 85437
use it to control tail merging when there is a tradeoff between performance
and code size. When there is only 1 instruction in the common tail, we have
been merging. That can be good for code size but is a definite loss for
performance. Now we will avoid tail merging in that case when the
optimization level is "Aggressive", i.e., "-O3". Radar 7338114.
Since the IfConversion pass invokes BranchFolding, it too needs to know
the optimization level. Note that I removed the RegisterPass instantiation
for IfConversion because it required a default constructor. If someone
wants to keep that for some reason, we can add a default constructor with
a hard-wired optimization level.
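The size/speed tradeoff described above can be pictured with a toy helper; only CodeGenOpt::Aggressive is taken from the change, while the helper name and the exact threshold are illustrative:

  #include "llvm/Target/TargetMachine.h"  // CodeGenOpt::Level
  using namespace llvm;

  // Illustrative only, not the BranchFolding code: require more than one
  // instruction in the common tail at -O3, where merging a single shared
  // instruction costs an extra branch for little size benefit.
  unsigned minCommonTailLength(CodeGenOpt::Level OptLevel) {
    return OptLevel == CodeGenOpt::Aggressive ? 2 : 1;
  }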
llvm-svn: 85346
machineinstr whether the aliased register is dead, rather than whether the
original register is dead. This allows it to get the correct answer when examining
an instruction like this:
CALLpcrel32 <ga:foo>, %AL<imp-def>, %EAX<imp-def,dead>
where EAX is dead but a subregister of it is still live. This fixes PR5294.
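A hedged sketch of the idea (the helper is hypothetical, and registerDefIsDead/getAliasSet are my reading of the MachineInstr and TargetRegisterInfo interfaces of that period):

  #include "llvm/CodeGen/MachineInstr.h"
  #include "llvm/Target/TargetRegisterInfo.h"
  using namespace llvm;

  // Illustrative helper: a def is only treated as dead if the instruction
  // also reports every alias as dead, so a live sub-register (%AL above)
  // keeps the wider register's value alive.
  bool defAndAliasesDead(const MachineInstr *MI, unsigned Reg,
                         const TargetRegisterInfo *TRI) {
    if (!MI->registerDefIsDead(Reg))
      return false;
    for (const unsigned *Alias = TRI->getAliasSet(Reg); *Alias; ++Alias)
      if (!MI->registerDefIsDead(*Alias))
        return false;
    return true;
  }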
llvm-svn: 85135
bootstrapping. It's not safe to leave identity subreg_to_reg and insert_subreg
around.
- Relax register scavenging to allow use of partially "not-live" registers. It's
common for targets to operate on registers where the top bits are undef. e.g.
s0 =
d0 = insert_subreg d0<undef>, s0, 1
...
= d0
When the insert_subreg is eliminated by the coalescer, the scavenger used to
complain. The previous fix was to keep the insert_subreg around. But that's
brittle and it's overly conservative when we want to use the scavenger to
allocate registers. It's actually legal and desirable for other instructions
to use the "undef" part of d0. e.g.
s0 =
d0 = insert_subreg d0<undef>, s0, 1
...
s1 =
= s1
= d0
We probably need to add a "partial-undef" marker on the machine operand so the
machine verifier will not complain.
llvm-svn: 85091
used elsewhere - an exit block is a block outside the loop branched to
from within the loop. An exiting block is a block inside the loop that
branches out.
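In code terms (using the loop queries as I recall them; treat the exact names as assumptions):

  #include "llvm/ADT/SmallVector.h"
  #include "llvm/CodeGen/MachineLoopInfo.h"
  using namespace llvm;

  // Illustrative only: exit blocks live outside the loop, exiting blocks
  // live inside it.
  void collectLoopBoundary(MachineLoop *L) {
    SmallVector<MachineBasicBlock*, 8> ExitBlocks;    // outside the loop
    SmallVector<MachineBasicBlock*, 8> ExitingBlocks; // inside the loop
    L->getExitBlocks(ExitBlocks);
    L->getExitingBlocks(ExitingBlocks);
  }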
llvm-svn: 85019
to break up CFG diamonds by banishing one of the blocks to the end of
the function, which is bad for code density and branch size.
This does pessimize MultiSource/Benchmarks/Ptrdist/yacr2, the
benchmark cited as the reason for the change; however, I've examined
the code and it looks more like a case of gaming a particular
branch than a generally applicable improvement.
llvm-svn: 84803
tracked. Instead of trying to manually keep track of these locations
while doing complex modifications, just recompute them when they're needed.
This fixes a bug in which the TopMBB and BotMBB were not correctly updated,
leading to invalid transformations.
llvm-svn: 84598
appropriate restore location for the spill as well as perform the actual
save and restore.
The Thumb1 target uses this to make sure R12 is not clobbered while a spilled
scavenger register is live there.
llvm-svn: 84554
stack slots and giving them different PseudoSourceValues did not fix the
problem of post-alloc scheduling miscompiling llvm itself.
- Apply Dan's conservative workaround by assuming any non-fixed stack slot can
alias other memory locations. This means a load from spill slot #1 cannot
move above a store of spill slot #2.
- Enable post-alloc scheduling for x86 at optimization level Default and above.
llvm-svn: 84424
to be more general and understand more varieties of loops.
Teach CodePlacementOpt to reorganize the basic blocks of a loop so that
they are contiguous. This also includes a fair amount of logic for preserving
fall-through edges while doing so. This fixes a BranchFolding-ism where blocks
which can't be made to use a fall-through edge and don't conveniently fit
anywhere nearby get tossed out to the end of the function.
llvm-svn: 84295
header is just the entry block to the loop, and it needn't be at
the top of the loop in the code layout.
Remove the code that suppressed loop alignment for outer loops,
so that outer loops are aligned.
llvm-svn: 84158
so get rid of eh.selector.i64 and rename eh.selector.i32 to eh.selector.
Likewise for eh.typeid.for. This aligns us with gcc, which always uses a
32 bit value for the selector on all platforms. My understanding is that
the register allocator used to assert if the selector intrinsic size didn't
match the pointer size, and this was the reason for introducing the two
variants. However my testing shows that this is no longer the case (I
fixed some bugs in selector lowering yesterday, and some more today in the
fastisel path; these might have caused the original problems).
llvm-svn: 84106
to remat non-load instructions as loads, and the remat code now uses
the UnmodeledSideEffects flags, MachineMemOperands, and similar things
to decide which instructions are valid for rematerialization.
llvm-svn: 84060
truncating an SDValue (depending on whether the target
type is bigger or smaller than the value's type); or zero
extending or truncating it. Use it in a few places (this
seems to be a popular operation, but I only modified cases
of it in SelectionDAGBuild). In particular, the eh_selector
lowering was doing this wrong due to a repeated rather than
inverted test; that is fixed with this change.
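For illustration only, a sketch of the helper's behavior; the function name is invented here, and the real utility lives in the SelectionDAG layer:

  #include "llvm/CodeGen/SelectionDAG.h"
  using namespace llvm;

  // Sign-extend when the destination type is wider, truncate when it is
  // narrower, and pass the value through unchanged when the types match.
  static SDValue sextOrTruncTo(SelectionDAG &DAG, DebugLoc dl,
                               SDValue Val, EVT DestVT) {
    EVT SrcVT = Val.getValueType();
    if (DestVT == SrcVT)
      return Val;
    unsigned Opc = DestVT.bitsGT(SrcVT) ? ISD::SIGN_EXTEND : ISD::TRUNCATE;
    return DAG.getNode(Opc, dl, DestVT, Val);
  }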
llvm-svn: 84027
bootstrap of FSF-style PPC, so there is some
reason to believe the original bug (which was
never analyzed) has been fixed, probably by
82266.
llvm-svn: 83871
into MachineInstrs. This is mostly just moving the code from
ScheduleDAGSDNodesEmit.cpp into a new class. This decouples MachineInstr
emitting from scheduling.
llvm-svn: 83699
is trivially rematerializable and integrate it into
TargetInstrInfo::isTriviallyReMaterializable. This way, all places that
need to know whether an instruction is rematerializable will get the
same answer.
This enables the useful parts of the aggressive-remat option by
default -- using AliasAnalysis to determine whether a memory location
is invariant, and removes the questionable parts -- rematting operations
with virtual register inputs that may not be live everywhere.
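A minimal usage sketch; the wrapper is hypothetical, and passing AliasAnalysis through reflects my reading of the consolidated hook:

  #include "llvm/Analysis/AliasAnalysis.h"
  #include "llvm/Target/TargetInstrInfo.h"
  using namespace llvm;

  // Illustrative wrapper: every client now asks the same question in the
  // same place, including the AliasAnalysis-based invariant-load check.
  static bool shouldRemat(const TargetInstrInfo *TII, const MachineInstr *MI,
                          AliasAnalysis *AA) {
    return TII->isTriviallyReMaterializable(MI, AA);
  }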
llvm-svn: 83687
While recording the beginning of a function, use scope info from the first location entry instead of just relying on the first location entry itself.
llvm-svn: 83684
to declare that they preserve other passes without needing to pull in
additional header file or library dependencies. Convert MachineFunctionPass
and CodeGenLICM to make use of this.
llvm-svn: 83555
implementations with a new MachineInstr::isInvariantLoad, which uses
MachineMemOperands and is target-independent. This brings MachineLICM
and other functionality to targets which previously lacked an
isInvariantLoad implementation.
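A hedged sketch of how a hoisting pass might use the new query; the helper name is made up, and the AliasAnalysis parameter is my reading of the interface:

  #include "llvm/Analysis/AliasAnalysis.h"
  #include "llvm/CodeGen/MachineInstr.h"
  using namespace llvm;

  // Illustrative helper: the answer comes from MachineMemOperands (constant
  // pool entries, invariant memory, ...) rather than per-target opcode lists.
  static bool isHoistableLoad(const MachineInstr *MI, AliasAnalysis *AA) {
    return MI->getDesc().mayLoad() && MI->isInvariantLoad(AA);
  }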
llvm-svn: 83475
a virtual register to eliminate a frame index, it can return that register
and the constant stored there to PEI to track. When scavenging to allocate
for those registers, PEI then tracks the last-used register and value, and
if it is still available and matches the value for the next index, reuses
the existing value rather than re-materializing, and removes the redundant
re-materialization instructions.
Fancier tracking and adjustment of scavenger allocations to keep more
values live for longer is possible, but not yet implemented and would likely
be better done via a different, less special-purpose, approach to the
problem.
eliminateFrameIndex() is modified so the target implementations can return
the registers they wish to be tracked for reuse.
ARM Thumb1 implements and utilizes the new mechanism. All other targets are
simply modified to adjust for the changed eliminateFrameIndex() prototype.
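Purely as an illustration of the bookkeeping described above (all names here are hypothetical, not the PEI implementation):

  #include <stdint.h>

  // Track the most recent frame-index materialization returned by the target.
  struct TrackedMaterialization {
    unsigned Reg;    // register the target used when eliminating the index
    int64_t Value;   // constant it materialized, e.g. a frame offset
    bool StillLive;  // false once the scavenger has reclaimed the register
  };

  // Reuse only when the previous value survives and matches exactly;
  // otherwise the target re-materializes as before.
  static bool canReuseMaterialization(const TrackedMaterialization &Last,
                                      int64_t NeededValue) {
    return Last.StillLive && Last.Value == NeededValue;
  }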
llvm-svn: 83467
verbose-asm mode, print comments instead. This eliminates a non-comment
difference between verbose-asm mode and non-verbose-asm mode.
Also, factor out the relevant code out of all the targets and into
target-independent code.
llvm-svn: 83392