stuff now that we don't care about emulating the old broken
behavior of the old isel. This eliminates the
'CheckChainCompatible' check (along with IsChainCompatible) which
did an incorrect and inefficient scan *up* the chain nodes which
happened as the pattern was being formed and does the validation
at the end in HandleMergeInputChains when it forms a structural
pattern. This scans "down" the graph, which means that it is
quickly bounded by nodes already selected. This also handles
token factors that get "trapped" in the dag.
Removing the CheckChainCompatible nodes also shrinks the
generated tables by about 6K for X86 (down to 83K).
There are two pieces remaining before I can nuke PreprocessRMW:
1. I xfailed a test because we're now producing worse code in a
case that has nothing to do with the change: it turns out that
our use of MorphNodeTo will leave dead nodes in the graph
which (depending on how the graph is walked) end up causing
bogus uses of chains and blocking matches. This is really
bad for other reasons, so I'll fix this in a follow-up patch.
2. CheckFoldableChainNode needs to be improved to handle the TF.
llvm-svn: 97539
ComplexPattern at the root be generated multiple times, once
for each opcode they are part of. This encourages factoring
because the opcode checks get treated just like everything
else in the matcher.
llvm-svn: 97439
to a scope where every child starts with a CheckOpcode, but
executes more efficiently. Enhance DAGISelMatcherOpt to
form it.
This also fixes a bug in CheckOpcode: apparently the SDNodeInfo
objects are not pointer comparable, we have to compare the
enum name.
llvm-svn: 97438
specifies whether there is an output flag or not. Use this
instead of redundantly encoding the chain/flag results in the
output vtlist.
llvm-svn: 97419
even some the old isel didn't. There are several parts of
this that make me feel dirty, but it's no worse than the
old isel. I'll clean up the parts I can do without ripping
out the old one next.
llvm-svn: 97415
and restore the entire matcher stack by value. This is because children
we're testing could do moveparent or other things besides just
scribbling on additions to the stack.
llvm-svn: 97212
instead of to have a chained series of scope nodes. This makes
the generated table smaller, improves the efficiency of the
interpreter, and make the factoring optimization much more
reasonable to implement.
llvm-svn: 97160
the new isel: fold movechild+record+moveparent into a
single recordchild N node. This shrinks the X86 table
from 125443 to 117502 bytes.
llvm-svn: 97031
necessary to swap the operands to handle NaN and negative zero properly.
Also, reintroduce logic for checking for NaN conditions when forming
SSE min and max instructions, fixed to take into consideration NaNs and
negative zeros. This allows forming min and max instructions in more
cases.
llvm-svn: 97025
internal nodes with flag results. Record these with a new
OPC_MarkFlagResults opcode and use this to update the interior
nodes' flag results properly. This fixes CodeGen/X86/i256-add.ll
with the new isel.
llvm-svn: 97021
Previously, LiveIntervalAnalysis would infer phi joins by looking for multiply
defined registers. That doesn't work if the phi join is implicitly defined in
all but one of the predecessors.
llvm-svn: 96994
ridiculously ginormous patterns and need more than one byte
of displacement for encodings. This fixes CellSPU/fdiv.ll.
SPU is still doing something else ridiculous though.
llvm-svn: 96833
result nodes correctly. Note that this includes a horrible hack
in DAGISelHeader which cannot be fixed reasonably without
eliminating (parallel) from input patterns. That, in turn,
can't be done until we support writing multiple result patterns
for the X86and_flag and related multiple-result nodes.
llvm-svn: 96767
<4 x i32> with <4 x float> values if they end up the same
register class. This gets us up to 231 passes on the ppc
tests (only 7 fails).
llvm-svn: 96750
the point where it is to the 95% feature complete mark, it just
needs result updating to be done (then testing, optimization
etc).
More specificallly, this adds support for chain and flag handling
on the result nodes, support for sdnodexforms, support for variadic
nodes, memrefs, pinned physreg inputs, and probably lots of other
stuff.
In the old DAGISelEmitter, this deletes the dead code related to
OperatorMap, cleans up a variety of dead stuff handling "implicit
remapping" from things like globaladdr -> targetglobaladdr (which
is no longer used because globaladdr always needs to be legalized),
and some minor formatting fixes.
llvm-svn: 96716
Moderate the weight given to very small intervals.
The spill weight given to new intervals created when spilling was not
normalized in the same way as the original spill weights calculated by
CalcSpillWeights. That meant that restored registers would tend to hang around
because they had a much higher spill weight that unspilled registers.
This improves the runtime of a few tests by up to 10%, and there are no
significant regressions.
llvm-svn: 96613
CheckComplexPattern function. Though it is logically const,
I don't have the fortitude to clean up all the targets now,
and it not being const doesn't block anything.
llvm-svn: 96426
use and only call IsProfitableToFold/IsLegalToFold on the load
being folded, like the old dagiselemitter does. This
substantially simplifies the code and improves opportunities for
sharing.
llvm-svn: 96368
IsLegalToFold and IsProfitableToFold. The generic version of the later simply checks whether the folding candidate has a single use.
This allows the target isel routines more flexibility in deciding whether folding makes sense. The specific case we are interested in is folding constant pool loads with multiple uses.
llvm-svn: 96255
produce a table based matcher instead of gobs of C++ Code.
Though it's not done yet, the shrinkage seems promising,
the table for the X86 ISel is 75K and still has a lot of
optimization to come (compare to the ~1.5M of .o generated
the old way, much of which will go away).
The code is currently disabled by default (the #if 0 in
DAGISelEmitter.cpp). When enabled it generates a dead
SelectCode2 function in the DAGISel Header which will
eventually replace SelectCode.
There is still a lot of stuff left to do, which are
documented with a trail of FIXMEs.
llvm-svn: 96215
created. This ensures it's updated at all time. It means targets which perform
dynamic stack alignment would know whether it is required and whether frame
pointer register cannot be made available register allocation.
This is a fix for rdar://7625239. Sorry, I can't create a reasonably sized test
case.
llvm-svn: 96069
reduce down to a single value. InstCombine already does this transformation
but DAG legalization may introduce new opportunities. This has turned out to
be important for ARM where 64-bit values are split up during type legalization:
InstCombine is not able to remove the PHI cycles on the 64-bit values but
the separate 32-bit values can be optimized. I measured the compile time
impact of this (running llc on 176.gcc) and it was not significant.
llvm-svn: 95951
The major win of this is that the code is simpler and they
print on the same line as the instruction again:
movl %eax, 96(%esp) ## 4-byte Spill
movl 96(%esp), %eax ## 4-byte Reload
cmpl 92(%esp), %eax ## 4-byte Folded Reload
jl LBB7_86
llvm-svn: 95738
into TargetOpcodes.h. #include the new TargetOpcodes.h
into MachineInstr. Add new inline accessors (like isPHI())
to MachineInstr, and start using them throughout the
codebase.
llvm-svn: 95687
are from debug info. Add an iterator to MachineRegisterInfo
to skip Debug operands when walking the use list. No
functional change yet.
llvm-svn: 95473
Instruction selection for X86 now can choose an instruction
sequence that will fit any address of any symbol, no matter
the pointer width. X86-64 uses a mov+call-via-reg sequence
for this.
llvm-svn: 95323
"visit*" method is called, take the newly created nodes, walk them in a DFS
fashion, and if they don't have an ordering set, then give it one.
llvm-svn: 94757
This allows code gen and the exception table writer to cooperate to make sure
landing pads are associated with the correct invoke locations.
llvm-svn: 94726
Move the X86 implementation of function body emission up to
AsmPrinter::EmitFunctionBody, which works by calling the virtual
EmitInstruction method.
llvm-svn: 94716
which is more convenient, and change getPICJumpTableRelocBaseExpr
to take a MachineFunction to match.
Next, move the X86 code that create a PICBase symbol to
X86TargetLowering::getPICBaseSymbol from
X86MCInstLower::GetPICBaseSymbol, which was an asmprinter specific
library. This eliminates a 'gross hack', and allows us to
implement X86ISelLowering::getPICJumpTableRelocBaseExpr which now
calls it.
This in turn allows us to eliminate the
X86AsmPrinter::printPICJumpTableSetLabel method, which was the
only overload of printPICJumpTableSetLabel.
llvm-svn: 94526
MachineFunctionAnalysis dole them out, instead of having
AsmPrinter do both. Have the AsmPrinter::SetupMachineFunction
method set the 'AsmPrinter::MF' variable.
llvm-svn: 94509
1. MachineJumpTableInfo is now created lazily for a function the first time
it actually makes a jump table instead of for every function.
2. The encoding of jump table entries is now described by the
MachineJumpTableInfo::JTEntryKind enum. This enum is determined by the
TLI::getJumpTableEncoding() hook, instead of by lots of code scattered
throughout the compiler that "knows" that jump table entries are always
32-bits in pic mode (for example).
3. The size and alignment of jump table entries is now calculated based on
their kind, instead of at machinefunction creation time.
Future work includes using the EntryKind in more places in the compiler,
eliminating other logic that "knows" the layout of jump tables in various
situations.
llvm-svn: 94470
the '-pre-RA-sched' flag. It actually makes more sense to do it this way. Also,
keep track of the SDNode ordering by default. Eventually, we would like to make
this ordering a way to break a "tie" in the scheduler. However, doing that now
breaks the "CodeGen/X86/abi-isel.ll" test for 32-bit Linux.
llvm-svn: 94308
of int initializers), change some methods to be static functions,
use raw_ostream::write_hex instead of a smallstring dance with
APValue::toStringUnsigned(S, 16).
llvm-svn: 93991
function can support dynamic stack realignment. That's a much easier question
to answer at instruction selection stage than whether the function actually
will have dynamic alignment prologue. This allows the removal of the
stack alignment heuristic pass, and improves code quality for cases where
the heuristic would result in dynamic alignment code being generated when
it was not strictly necessary.
llvm-svn: 93885
doing global variable classification anymore) and hookized, sink almost
all target targets global variable emission code into AsmPrinter and out
of each target.
Some notes:
1. PIC16 does completely custom and crazy stuff, so it is not changed.
2. XCore has some custom handling for extra directives. I'll look at it next.
3. This switches linux/ppc to use .globl instead of .global. If .globl is
actually wrong, let me know and I'll fix it.
4. This makes linux/ppc get a lot of random cases right which were obviously
wrong before, it is probably now a bit healthier.
5. Blackfin will probably start getting .comm and other things that it didn't
before. If this is undesirable, it should explicitly opt out of these
things by clearing the relevant fields of MCAsmInfo.
This leads to a nice diffstat:
14 files changed, 127 insertions(+), 830 deletions(-)
llvm-svn: 93858
This makes a similar code dead in all the other targets, I'll clean it up
in a bit.
This also moves handling of lcomm up before acquisition of a section,
since lcomm never needs a section.
llvm-svn: 93851
print/dumpWithDepth allows one to dump a DAG up to N levels deep.
dump/printWithFullDepth prints the whole DAG, subject to a depth limit
on 100 in the default case (to prevent infinite recursion).
Have CannotYetSelect to a dumpWithFullDepth so it is clearer exactly
what the non-matching DAG looks like.
llvm-svn: 93538
Remove most of old Mach-O Writer support, it has been replaced by MCMachOStreamer
Further refactoring to completely remove MachOWriter and drive the object file
writer with the AsmPrinter MCInst/MCSection logic is forthcoming.
llvm-svn: 93527
For now, this pass is fairly conservative. It only perform the replacement when both the pre- and post- extension values are used in the block. It will miss cases where the post-extension values are live, but not used.
llvm-svn: 93278
(OP (trunc x), (trunc y)) -> (trunc (OP x, y))
Unfortunately this simple change causes dag combine to infinite looping. The problem is the shrink demanded ops optimization tend to canonicalize expressions in the opposite manner. That is badness. This patch disable those optimizations in dag combine but instead it is done as a late pass in sdisel.
This also exposes some deficiencies in dag combine and x86 setcc / brcond lowering. Teach them to look pass ISD::TRUNCATE in various places.
llvm-svn: 92849
An instruction like this:
%reg1097:1<def> = VMOVSR %R3<kill>, 14, %reg0
Must be replaced with this when substituting physical registers:
%S0<def> = VMOVSR %R3<kill>, 14, %reg0, %D0<imp-def>
llvm-svn: 92812
clear what information these functions are actually using.
This is also a micro-optimization, as passing a SDNode * around is
simpler than passing a { SDNode *, int } by value or reference.
llvm-svn: 92564
This fixes an in-place update bug where code inserted at the end of basic blocks may not be covered by existing intervals which were live across the entire block. It is also consistent with the way ranges are specified for live intervals.
llvm-svn: 91859
- Move DisableScheduling flag into TargetOption.h
- Move SDNodeOrdering into its own header file. Give it a minimal interface that
doesn't conflate construction with storage.
- Move assigning the ordering into the SelectionDAGBuilder.
This isn't used yet, so there should be no functional changes.
llvm-svn: 91727
LegalizeDAG.cpp. Unlike the code it replaces, which simply decrements the simple
type by one, getHalfSizedIntegerVT() searches for the smallest simple integer
type that is at least half the size of the type it is called on. This approach
has the advantage that it will continue working if a new value type (such as
i24) is added to MVT.
Also, in preparation for new value types, remove the assertions that
non-power-of-2 8-bit-mutiple types are Extended when legalizing extload and
truncstore operations.
llvm-svn: 91614
remove start/finishGVStub and the BufferState helper class from the
MachineCodeEmitter interface. It has the side-effect of not setting the
indirect global writable and then executable on ARM, but that shouldn't be
necessary.
llvm-svn: 91464
isPodLike type trait. This is a generally useful type trait for
more than just DenseMap, and we really care about whether something
acts like a pod, not whether it really is a pod.
llvm-svn: 91421
stuff isn't used just yet.
We want to model the GCC `-fno-schedule-insns' and `-fno-schedule-insns2'
flags. The hypothesis is that the people who use these flags know what they are
doing, and have hand-optimized the C code to reduce latencies and other
conflicts.
The idea behind our scheme to turn off scheduling is to create a map "on the
side" during DAG generation. It will order the nodes by how they appeared in the
code. This map is then used during scheduling to get the ordering.
llvm-svn: 91392
- Loosen the restrictions when checking of it branches to a landing pad.
- Make the loop more efficient by checking the '.insert' return value.
- Do cheaper checks first.
llvm-svn: 91101
more than one successor. Normally, these extra successors are dead. However,
some of them may branch to exception handling landing pads. If we remove those
successors, then the landing pads could go away if all predecessors to it are
removed. Before, it was checking if the direct successor was the landing
pad. But it could be the result of jumping through multiple basic blocks to get
to it. If we were to only check for the existence of an EH_LABEL in the basic
block and not remove successors if it's in there, then it could stop actually
dead basic blocks from being removed.
llvm-svn: 91092
The coalescer is supposed to clean these up, but when setting up parameters
for a function call, there may be copies to physregs. If the defining
instruction has been LICM'ed far away, the coalescer won't touch it.
The register allocation hint does not always work - when the register
allocator is backtracking, it clears the hints.
This patch is more conservative than r90502, and does not break
483.xalancbmk/i686. It still breaks the PowerPC bootstrap, so it is disabled
by default, and can be enabled with the -trivial-coalesce-ends option.
llvm-svn: 91049
When a call is placed to spill an interval this spiller will first try to
break the interval up into its component values. Single value intervals and
intervals which have already been split (or are the result of previous splits)
are spilled by the default spiller.
Splitting intervals as described above may improve the performance of generated
code in some circumstances. This work is experimental however, and it still
miscompiles many benchmarks. It's not recommended for general use yet.
llvm-svn: 90951
The coalescer is supposed to clean these up, but when setting up parameters
for a function call, there may be copies to physregs. If the defining
instruction has been LICM'ed far away, the coalescer won't touch it.
The register allocation hint does not always work - when the register
allocator is backtracking, it clears the hints.
This patch takes care of a few more cases that r90163 missed.
llvm-svn: 90502
We want LiveVariables clients to use methods rather than accessing the
getVarInfo data structure directly. That way it will be possible to change the
LiveVariables representation.
llvm-svn: 90240
running tail duplication when doing branch folding for if-conversion, and
we also want to be able to run tail duplication earlier to fix some
reg alloc problems. Move the CanFallThrough function from BranchFolding
to MachineBasicBlock so that it can be shared by TailDuplication.
llvm-svn: 89904
Note that "hasDotLocAndDotFile"-style debug info was already broken;
people wanting this functionality should implement it in the
AsmPrinter/DwarfWriter code.
llvm-svn: 89711
When splitting a critical edge, the registers live through the edge are:
- Used in a PHI instruction, or
- Live out from the predecessor, and
- Live in to the successor.
This allows the coalescer to eliminate even more phi joins.
llvm-svn: 89530
slots. The AsmPrinter will use this information to determine whether to
print a spill/reload comment.
Remove default argument values. It's too easy to pass a wrong argument
value when multiple arguments have default values. Make everything
explicit to trap bugs early.
Update all targets to adhere to the new interfaces..
llvm-svn: 87022
making it visible to clients and adding LLVM-style cast capability.
This will be used by AsmPrinter to determine when to emit spill comments
for an instruction.
llvm-svn: 87019