When a physical register is in use, some alias of that register has a live
interval with a relevant live range. That is the sad state of intervals after
physreg coalescing of subregs, and it is good enough for correct register
allocation.
llvm-svn: 110452
This pass tries to remove comparison instructions when possible. For instance,
if you have this code:
sub r1, 1
cmp r1, 0
bz L1
and "sub" either sets the same flag as the "cmp" instruction or could be
converted to set the same flag, then we can eliminate the "cmp" instruction all
together. This is a important for ARM where the ALU instructions could set the
CPSR flag, but need a special suffix ('s') to do so.
llvm-svn: 110423
LiveVariables becomes horribly wrong while the coalescer is running, but the
analysis is not zapped until after the coalescer pass has run. This causes tons
of false reports when calling verify form the coalescer.
llvm-svn: 110402
We verify that the LiveInterval is live at uses and defs, and that all
instructions have a SlotIndex.
Stuff we don't check yet:
- Is the LiveInterval minimal?
- Do all defs correspond to instructions or phis?
- Do all defs dominate all their live ranges?
- Are all live ranges continually reachable from their def?
llvm-svn: 110386
be killed before being redefined.
These checks are usually disabled, and usually fail when enabled. We de facto
allow live registers to be redefined without a kill, the corresponding
assertions in RegScavenger were removed long ago.
llvm-svn: 110362
When the normalizeSpillWeights function was introduced, I forgot to remove this
normalization.
This change could affect register allocation. Hopefully for the better.
llvm-svn: 110119
multiple defs, like t2LDRSB_POST.
The first def could accidentally steal the physreg that the second, tied def was
required to be allocated to.
Now, the tied use-def is treated more like an early clobber, and the physreg is
reserved before allocating the other defs.
This would never be a problem when the tied def was the only def which is the
usual case.
This fixes MallocBench/gs for thumb2 -O0.
llvm-svn: 109715
protectors, to be near the stack protectors on the stack. Accomplish this by
tagging the stack object with a predicate that indicates that it would trigger
this. In the prolog-epilog inserter, assign these objects to the stack after the
stack protector but before the other objects.
llvm-svn: 109481
appropriate for targets without detailed instruction iterineries.
The scheduler schedules for increased instruction level parallelism in
low register pressure situation; it schedules to reduce register pressure
when the register pressure becomes high.
On x86_64, this is a win for all tests in CFP2000. It also sped up 256.bzip2
by 16%.
llvm-svn: 109300
to be of a different register class. For example, in Thumb1 if the live-in is
a high register, we want the vreg to be a low register. rdar://8224931
llvm-svn: 109291
it's too late to start backing off aggressive latency scheduling when most
of the registers are in use so the threshold should be a bit tighter.
- Correctly handle live out's and extract_subreg etc.
- Enable register pressure aware scheduling by default for hybrid scheduler.
For ARM, this is almost always a win on # of instructions. It's runtime
neutral for most of the tests. But for some kernels with high register
pressure it can be a huge win. e.g. 464.h264ref reduced number of spills by
54 and sped up by 20%.
llvm-svn: 109279
Make MDNode::destroy private.
Fix the one thing that used MDNode::destroy, outside of MDNode itself.
One should never delete or destroy an MDNode explicitly. MDNodes
implicitly go away when there are no references to them (implementation
details aside).
llvm-svn: 109028
This is a work in progress. So far we have some basic loop analysis to help
determine where it is useful to split a live range around a loop.
The actual loop splitting code from Splitter.cpp is also going to move in here.
llvm-svn: 108842
loop, for the reasons in the comments. This is a
major win on 253.perlbmk on ARM Darwin. I expect it
to be a good heuristic in general, but it's possible
some things will regress; I'll be watching.
7940152.
llvm-svn: 108792
- Unfortunate, but necessary for now to handle subtarget instruction matching. Eventually we should factor out the lower level target machine information so we don't need to do this.
llvm-svn: 108664
I am assured by people more knowledgeable than me that there are no rounding issues in eliminating this.
This fixed <rdar://problem/8197504>.
llvm-svn: 108639
Still very much under development. Comments and fixes will be forthcoming.
(This commit includes some small tweaks to LiveIntervals & LoopInfo to support the splitter)
llvm-svn: 108615
any command line paramater changed the register allocation produced by
PBQP.
Turns out variety is not the spice of life.
Fixed some comparators, added others. All good now.
llvm-svn: 108613
void foo() { __builtin_unreachable(); }
It will output the following on Darwin X86:
_func1:
Leh_func_begin0:
pushq %rbp
Ltmp0:
movq %rsp, %rbp
Ltmp1:
Leh_func_end0:
This prolog adds a new Call Frame Information (CFI) row to the FDE with an
address that is not within the address range of the code it describes -- part is
equal to the end of the function -- and therefore results in an invalid EH
frame. If we emit a nop in this situation, then the CFI row is now within the
address range.
llvm-svn: 108568
since it doesn't work for front-ends which don't emit column information
(which includes llvm-gcc in its present configuration), and doesn't
work for clang for K&R style variables where the variables are declared
in a different order from the parameter list.
Instead, make a separate pass through the instructions to collect the
llvm.dbg.declare instructions in order. This ensures that the debug
information for variables is emitted in this order.
llvm-svn: 108538
occasions, caused code to be generated in a different order.
All cases I've seen involved float softening in the type
legalizer, and this could be perhaps be fixed there, but
it's better not to generate things differently in the first
place. 7797940 (6/29/2010..7/15/2010).
llvm-svn: 108484
the function. We'll just turn it into a "trap" instruction instead.
The problem with not handling this is that it might generate a prologue without
the equivalent epilogue to go with it:
$ cat t.ll
define void @foo() {
entry:
unreachable
}
$ llc -o - t.ll -relocation-model=pic -disable-fp-elim -unwind-tables
.section __TEXT,__text,regular,pure_instructions
.globl _foo
.align 4, 0x90
_foo: ## @foo
Leh_func_begin0:
## BB#0: ## %entry
pushq %rbp
Ltmp0:
movq %rsp, %rbp
Ltmp1:
Leh_func_end0:
...
The unwind tables then have bad data in them causing all sorts of problems.
Fixes <rdar://problem/8096481>.
llvm-svn: 108473
-enable-no-nans-fp-math and -enable-no-infs-fp-math. All of the current codegen fp math optimizations only care whether the fp arithmetics arguments and results can never be NaN.
llvm-svn: 108465
to keep "Text" in sync with the "pure instructions" section attribute.
Lack of this attribute was preventing the assembler from emitting
multibyte noops instructions for templates (and inlines, and other
coalesced stuff) and was causing the assembler to mismatch .o files.
This fixes rdar://8018335
llvm-svn: 108461
independent of the order that isel happens to visit the dbg_declare
intrinsics. This fixes a bug in which the formal arguments were
being printed in reverse order, now that fast isel is going bottom up.
llvm-svn: 108369
LiveInterval::overlapsFrom dereferences end() if it is called on an empty
interval.
It would be reasonable to just return false - an empty interval doesn't overlap
anything, but I want to know who is doing it first.
llvm-svn: 108264
they already have one.
This fixes the himenobmtxpa miscompilation on ARM.
The PostRA scheduler got confused by the double memoperand and hoisted a stack
slot load above a store to the same slot.
llvm-svn: 108219
AggressiveAntiDepBreaker should not be using getPhysicalRegisterRegClass. An
instruction might be using a register that can only be replaced with one from
a subclass of getPhysicalRegisterRegClass.
With this patch we use getMinimalPhysRegClass. This is correct, but
conservative. We should check the uses of the register and select the
largest register class that can be used in all of them.
llvm-svn: 108122
physical register can be allocated in the class of the virtual are sufficient.
I think that the test for virtual registers is more strict than it needs to be,
it should be possible to coalesce two virtual registers the class of one
is a subclass of the other.
llvm-svn: 108118
getMinimalPhysRegClass. It was used to produce spills, and it is better to
use the most specific class if possible.
Update getLoadStoreRegOpcode to handle GR32_AD.
llvm-svn: 108115
Targets must now implement TargetInstrInfo::copyPhysReg instead. There is no
longer a default implementation forwarding to copyRegToReg.
llvm-svn: 108095
The first one was used just to call isSafeToMoveRegClassDefs. In
general, using a more specific reg class is better, in practice only
x86 implements that method and the results are always the same.
The second one is in FindFreeRegister and is used to check if a register
is in a register class, a much more direct call to contains is better as
it should cover more cases and is faster.
llvm-svn: 108093
assert()s, switching to void-casts. Removed an unneeded Compiler.h include as
a result. There are two other uses in LLVM, but they're not due to assert()s,
so I've left them alone.
llvm-svn: 108088
correct alignment information, which simplifies ExpandRes_VAARG a bit.
The patch introduces a new alignment information to TargetLoweringInfo. This is
needed since the two natural candidates cannot be used:
* The 's' in target data: If this is set to the minimal alignment of any
argument, getCallFrameTypeAlignment would return 4 for doubles on ARM for
example.
* The getTransientStackAlignment method. It is possible for an architecture to
have argument less aligned than what we maintain the stack pointer.
llvm-svn: 108072
The remaining copyRegToReg calls actually check the return value (shock!), so we
cannot trivially replace them with COPY instructions.
llvm-svn: 108069
if a block is split (by a custom inserter), the insert point may be in a
different block than it was originally. This fixes 32-bit llvm-gcc
bootstrap builds, and I haven't been able to reproduce it otherwise.
llvm-svn: 108060
ScheduleDAGEmit, TwoAddressLowering, and PHIElimination.
This switches the bulk of register copies to using COPY, but many less used
copyRegToReg calls remain.
llvm-svn: 108050
- Check getBytesToPopOnReturn().
- Eschew ST0 and ST1 for return values.
- Fix the PIC base register initialization so that it doesn't ever
fail to end up the top of the entry block.
llvm-svn: 108039
inserted in a MBB, and return an already inserted MI.
This target API change is necessary to allow foldMemoryOperand to call
storeToStackSlot and loadFromStackSlot when folding a COPY to a stack slot
reference in a target independent way.
The foldMemoryOperandImpl hook is going to change in the same way, but I'll wait
until COPY folding is actually implemented. Most targets only fold copies and
won't need to specialize this hook at all.
llvm-svn: 107991
U utils/TableGen/FastISelEmitter.cpp
--- Reverse-merging r107943 into '.':
U test/CodeGen/X86/fast-isel.ll
U test/CodeGen/X86/fast-isel-loads.ll
U include/llvm/Target/TargetLowering.h
U include/llvm/Support/PassNameParser.h
U include/llvm/CodeGen/FunctionLoweringInfo.h
U include/llvm/CodeGen/CallingConvLower.h
U include/llvm/CodeGen/FastISel.h
U include/llvm/CodeGen/SelectionDAGISel.h
U lib/CodeGen/LLVMTargetMachine.cpp
U lib/CodeGen/CallingConvLower.cpp
U lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
U lib/CodeGen/SelectionDAG/FunctionLoweringInfo.cpp
U lib/CodeGen/SelectionDAG/FastISel.cpp
U lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
U lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp
U lib/CodeGen/SelectionDAG/InstrEmitter.cpp
U lib/CodeGen/SelectionDAG/TargetLowering.cpp
U lib/Target/XCore/XCoreISelLowering.cpp
U lib/Target/XCore/XCoreISelLowering.h
U lib/Target/X86/X86ISelLowering.cpp
U lib/Target/X86/X86FastISel.cpp
U lib/Target/X86/X86ISelLowering.h
llvm-svn: 107987
disabled and then never turned back on again. Adjust some tests, one because
this change avoids an unnecessary instruction, and the other to make it
continue testing what it was intended to test.
llvm-svn: 107941
the simplification of frame index register scavenging to not have to check
for available registers directly and instead just let scavengeRegister()
handle it.
llvm-svn: 107880
EXTRACT_SUBREG no longer appears as a machine instruction. Use COPY instead.
Add isCopy() checks in many places using isMoveInstr() and isExtractSubreg().
The isMoveInstr hook will be removed later.
llvm-svn: 107879
This target hook is intended to replace copyRegToReg entirely, but for now it
calls copyRegToReg.
Any remaining calls to copyRegToReg wil be replaced by COPY instructions.
llvm-svn: 107854
(if there are any) and use the one which remains available for the longest
rather than just using the first one. This should help enable better re-use
of the loaded frame index values. rdar://7318760
llvm-svn: 107847
around everywhere, and also give it an InsertPt member, to enable isel
to operate at an arbitrary position within a block, rather than just
appending to a block.
llvm-svn: 107791
INSERT_SUBREG will now only appear in SSA machine instructions.
Fix the handling of partial redefs in ProcessImplicitDefs. This is now relevant
since partial redef COPY instructions appear.
llvm-svn: 107726
It is OK for an alias live range to overlap if there is a copy to or from the
physical register. CoalescerPair can work out if the copy is coalescable
independently of the alias.
This means that we can join with the actual destination interval instead of
using the getOrigDstReg() hack. It is no longer necessary to merge clobber
ranges into subregisters.
llvm-svn: 107695
This way *only* debug sections can be discarded, but not the opposite. Seems like the copy-and-pasto from ELF code, since there it contains the reverse flag ('alloc').
llvm-svn: 107658
The COPY instruction is intended to replace the target specific copy
instructions for virtual registers as well as the EXTRACT_SUBREG and
INSERT_SUBREG instructions in MachineFunctions. It won't we used in a selection
DAG.
COPY is lowered to native register copies by LowerSubregs.
llvm-svn: 107529
new basic blocks, and if used as a function argument, that can cause call frame
setup / destroy pairs to be split across a basic block boundary. That prevents
us from doing a simple assertion to check that the pairs match and alloc/
dealloc the same amount of space. Modify the assertion to only check the
amount allocated when there are matching pairs in the same basic block.
rdar://8022442
llvm-svn: 107517
- X86 unfolding should check if the instructions being unfolded has memoperands.
If there is no memoperands, then it must assume conservative alignment. If this
would introduce an expensive sse unaligned load / store, then unfoldMemoryOperand
etc. should not unfold the instruction.
llvm-svn: 107509
PrologEpilog code, and use it to determine whether
the asm forces stack alignment or not. gcc consistently
does not do this for GCC-style asms; Apple gcc inconsistently
sometimes does it for asm blocks. There is no
convenient place to put a bit in either the SDNode or
the MachineInstr form, so I've added an extra operand
to each; unlovely, but it does allow for expansion for
more bits, should we need it. PR 5125. Some
existing testcases are affected.
The operand lists of the SDNode and MachineInstr forms
are indexed with awesome mnemonics, like "2"; I may
fix this someday, but not now. I'm not making it any
worse. If anyone is inspired I think you can find all
the right places from this patch.
llvm-svn: 107506
This allows us to recognize the common case where all uses could be
rematerialized, and no stack slot allocation is necessary.
If some values could be fully rematerialized, remove them from the live range
before allocating a stack slot for the rest.
llvm-svn: 107492
Objective-C metadata types which should be marked as "weak", but which the
linker will remove upon final linkage. However, this linkage isn't specific to
Objective-C.
For example, the "objc_msgSend_fixup_alloc" symbol is defined like this:
.globl l_objc_msgSend_fixup_alloc
.weak_definition l_objc_msgSend_fixup_alloc
.section __DATA, __objc_msgrefs, coalesced
.align 3
l_objc_msgSend_fixup_alloc:
.quad _objc_msgSend_fixup
.quad L_OBJC_METH_VAR_NAME_1
This is different from the "linker_private" linkage type, because it can't have
the metadata defined with ".weak_definition".
Currently only supported on Darwin platforms.
llvm-svn: 107433