when the unconditional branch destination is the fallthrough block. The
canonicalization makes it easier to allow optimizations on DAGs to invert
conditional branches. The branch folding pass (and AnalyzeBranch) will clean up
the unnecessary unconditional branches later.
This is one of the patches leading up to disabling codegen prepare critical edge
splitting.
llvm-svn: 114630
Allocator instances can now be created by calling createPBQPRegisterAllocator.
Tidied up use of CoalescerPair as per Jakob's suggestions.
Made the new PBQPBuilder based construction process the default. The internal construction process
remains in-place and available via -pbqp-builder=false for now. It will be removed shortly if the new
process doesn't cause any regressions.
llvm-svn: 114626
new VariantKind to the MCSymbolExpr seems like overkill, but I'm not sure
there's a more straightforward way to get the printing difference captured.
(i.e., x86 uses @PLT, ARM uses (PLT)).
llvm-svn: 114613
CombineTo to avoid putting the result on the worklist. I don't think it makes
much difference for now, but it might help someday as we add more DAG
combine optimizations.
llvm-svn: 114595
of those. Refactor to share code for handling BUILD_VECTOR(VMOVRRD).
I don't have a testcase that exercises this, but it seems like an obvious
good thing to do.
llvm-svn: 114589
truncates are free only in the case where the extended type is legal but the
load type is not. If both types are illegal, such as when they are too big,
the load may not be legalized into an extended load.
llvm-svn: 114568
ARM cross-compiler on x86, because the MMO size did not match the type size.
This fixes the MMO size and also the size of the stack object to match the
type size.
llvm-svn: 114554
x86-32: 32-bit calls were named "call" not "calll". 64-bit calls were correctly
named "callq", so this only impacted x86-32.
This fixes rdar://8456370 - llvm-mc rejects 'calll'
This also exposes that mingw/64 is generating a 32-bit call instead of a 64-bit call,
I will file a bugzilla.
llvm-svn: 114534
creating it before and subtracting split ranges.
This way, the SSA update code in LiveIntervalMap can properly create and use new
phi values in dupli. Now it is possible to create split regions where a value
escapes along two different CFG edges, creating phi values outside the split
region.
This is a work in progress and probably quite broken.
llvm-svn: 114492
by having X86DAGToDAGISel::SelectAddr get passed in the parent node
of the operand match (the load/store/atomic op) and having it get
the address space from that, instead of having special FS/GS addr
mode operations that require duplicating the entire instruction set
to support.
This makes FS and GS relative accesses *far* more predictable and
work much better. It also simplifies the X86 backend a bit, more
to come.
There is still a pending issue with nodes like ISD::PREFETCH and
X86ISD::FLD, which really should be MemSDNode's but aren't.
llvm-svn: 114491
that complex patterns are matched after the entire pattern has
a structural match, therefore the NodeStack isn't in a useful
state when the actual call to the matcher happens.
llvm-svn: 114489
load when the type of the load is not legal, even if truncates are not free.
The load is going to be legalized to an extending load anyway.
llvm-svn: 114488
passed the root of the match, even though only a few patterns
actually needed this (one in X86, several in ARM [which should
be refactored anyway], and some in CellSPU that I don't feel
like detangling). Instead of requiring all ComplexPatterns to
take the dead root, have targets opt into getting the root by
putting SDNPWantRoot on the ComplexPattern.
llvm-svn: 114471
I think I've audited all uses, so it should be dependable for address spaces,
and the pointer+offset info should also be accurate when there.
llvm-svn: 114464
(sbbl x, x) sets the registers to 0 or ~0. Combined with two's complement arithmetic, we can fold
the intermediate AND and the ADD into a single SUB.
This fixes <rdar://problem/8449754>.
llvm-svn: 114460
instead of calling lower_bound or upper_bound directly.
This cleans up the search logic a bit because {lower,upper}_bound compare
LR->start by default, and it is usually simpler to search LR->end.
Funnelling all searches through one function also makes it possible to replace
the search algorithm with something faster than binary search.
llvm-svn: 114448
I am unable to write a test for this case, help is solicited, though...
What I did is to tickle the code in the debugger and verify that we do the right thing.
llvm-svn: 114430
into OptimizeCompareInstr.
This necessitates the passing of CmpValue around,
so widen the virtual functions to accomodate.
No functionality changes.
llvm-svn: 114428
"getFixedStack" on the MachinePointerInfo class. While
this isn't the problem I'm setting out to solve, it is the
right way to eliminate PseudoSourceValue, so lets go with it.
llvm-svn: 114406
MachinePointerInfo struct, no functionality change.
This also adds an assert to MachineMemOperand::MachineMemOperand
that verifies that the Value* is either null or is an IR pointer type.
llvm-svn: 114389
CombinerAA cannot assume that different FrameIndex's never alias, but can instead use
MachineFrameInfo to get the actual offsets of these slots and check for actual aliasing.
This fixes CodeGen/X86/2010-02-19-TailCallRetAddrBug.ll and CodeGen/X86/tailcallstack64.ll
when CombinerAA is enabled, modulo a different register allocation sequence.
llvm-svn: 114348
between the high and low registers for prologue/epilogue code. This was
a Darwin-only thing that wasn't providing a realistic benefit anymore.
Combining the save areas simplifies the compiler code and results in better
ARM/Thumb2 codegen.
For example, previously we would generate code like:
push {r4, r5, r6, r7, lr}
add r7, sp, #12
stmdb sp!, {r8, r10, r11}
With this change, we combine the register saves and generate:
push {r4, r5, r6, r7, r8, r10, r11, lr}
add r7, sp, #12
rdar://8445635
llvm-svn: 114340
For now the allocator still uses the old (internal) construction mechanism by default. This will be phased out soon assuming
no issues with the builder system come up.
To invoke the new construction mechanism just pass '-regalloc=pbqp -pbqp-builder' to llc. To provide custom constraints a
Target just needs to extend PBQPBuilder and pass an instance of their derived builder to the RegAllocPBQP constructor.
llvm-svn: 114272
NO path to the destination containing side effects, not that SOME path contains no side effects.
In practice, this only manifests with CombinerAA enabled, because otherwise the chain has little
to no branching, so "any" is effectively equivalent to "all".
llvm-svn: 114268
value should be in GPRs when it's going to be used as a scalar, and we use
VMOVRRD to make that happen, but if the value is converted back to a vector
we need to fold to a simple bit_convert. Radar 8407927.
llvm-svn: 114233
1) Do forward copy propagation. This makes it easier to estimate the cost of the
instruction being sunk.
2) Break critical edges on demand, including cases where the value is used by
PHI nodes.
Critical edge splitting is not yet enabled by default.
llvm-svn: 114227
expression to include the modifier.
- Gross, but this a corner case we don't expect to see often in practice, but
it is worth accepting.
- Also improves diagnostics on invalid modifiers.
llvm-svn: 114154
more careful not to call SimplifyInstructionsInBlock() on an unreachable block, the issue has been fixed at a higher level. Add
a big warning to SimplifyInstructionsInBlock() to hopefully prevent this in the future.
llvm-svn: 114117
walking the asm arguments once and stashing their Values. This is
wrong because the same memory location can be in the list twice, and
if the first one has a sunkaddr substituted, the stashed value for the
second one will be wrong (use-after-free). PR 8154.
llvm-svn: 114104
a Constant into a ConstantRange. Handle this conservatively for now, rather than asserting. The testcase is
more complex that I would like, but the manifestation of the problem is sensitive to iteration orders and the state of the
LVI cache, and I have not been able to reproduce it with manually constructed or simplified cases.
Fixes PR8162.
llvm-svn: 114103
This cleans up after the mess r108567 left in the CellSPU backend.
ORCvt-instruction were used to reinterpret registers, and the ORs were then
removed by isMoveInstr(). This patch now removes 350 instrucions of format:
or $3, $3, $3
(from the 52 testcases in CodeGen/CellSPU). One case of a nonexistant or is
checked for.
Some moves of the form 'ori $., $., 0' and 'ai $., $., 0' still remain.
llvm-svn: 114074
be possible to implement this very carefully to allow a lock-free implementation while still
avoiding illegal interleavings, but I haven't been able to figure one out.
llvm-svn: 114046
great deal because we don't have to worry about maintaining SSA form.
Unconditionally copy back to dupli when the register is live out of the split
range, even if the live-out value was defined outside the range. Skipping the
back-copy only makes sense when the live range is going to spill outside the
split range, and we don't know that it will. Besides, this was a hack to avoid
SSA update issues.
Clear up some confusion about the end point of a half-open LiveRange. Methinks
LiveRanges need to be closed so both start and end are included in the range.
The low bits of a SlotIndex are symbolic, so a half-open range doesn't really
make sense. This would be a pervasive change, though.
llvm-svn: 114043
that all the setup of this class currently happens at static initialization time, this misses the fact
that some later events can cause mutation of the PassRegistrationListeners list, and thus cause race issues.
llvm-svn: 114036
The ELF implementation now creates text, data and bss to match the gnu as
behavior.
The text streamer still has the old MachO specific behavior since
the testsuite checks that it will error when a directive is given
before a setting the current section for example.
A nice benefit is that -n is not required anymore when producing
ELF files.
llvm-svn: 114027
functions in ARMBaseInfo.h so it can be used in the MC library as well.
For anything bigger than this, we may want a means to have a small support
library for shared helper functions like this. Cross that bridge when we
come to it.
llvm-svn: 114016
VFP instructions use it for loading some constants, so implement that
handling.
Not thrilled with adding a member to MCOperand, but not sure there's much of
a better option that's not pretty fragile (like putting a double in the
union instead and just assuming that's good enough). Suggestions welcome...
llvm-svn: 113996
Recognize VLD1q64Pseudo as a stack slot load.
Reject these if they are loading or storing a subregister. The API (and
VirtRegRewriter) doesn't know how to deal with that.
llvm-svn: 113985
encountered while building llvm-gcc for arm. This is probably the same issue
that the ppc buildbot hit. llvm::prior works on a MachineBasicBlock::iterator,
not a plain MachineInstr.
llvm-svn: 113983
backing out following to get it back to green,
so I can investigate in peace:
svn merge -c -113840 llvm/test/CodeGen/ARM/arm-and-tst-peephole.ll
svn merge -c -113876 -c -113839 llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp
llvm-svn: 113980
"The register specified for a dregpair is the corresponding Q register, so to
get the pair, we need to look up the sub-regs based on the qreg. Create a
lookup function since we don't have access to TargetRegisterInfo here to
be able to use getSubReg(ARM::dsub_[01])."
Additionaly, fix the NEON VLD1* and VST1* instruction patterns not to use
the dregpair modifier for the 2xdreg versions. Explicitly specifying the two
registers as operands is more correct and more consistent with the other
instruction patterns. This enables further cleanup of special case code in the
disassembler as a nice side-effect.
llvm-svn: 113903
get the pair, we need to look up the sub-regs based on the qreg. Create a
lookup function since we don't have access to TargetRegisterInfo here to
be able to use getSubReg(ARM::dsub_[01]).
llvm-svn: 113875
isn't a good level of abstraction for memdep. Instead, generalize
AliasAnalysis::alias and related interfaces with a new Location
class for describing a memory location. For now, this is the same
Pointer and Size as before, plus an additional field for a TBAA tag.
Also, introduce a fixed MD_tbaa metadata tag kind.
llvm-svn: 113858
an argument, so that we can distinguish instructions with the same register
classes but different numbers of registers (e.g., vld3 and vld4). Fix some
of the non-pseudo NEON ld/st instruction itineraries to reflect the number
of registers loaded or stored, not just the opcode name.
llvm-svn: 113854
by morphing the 'and' to its recording form 'andS'.
This is basically a test commit into this area, to
see whether the bots like me. Several generalizations
can be applied and various avenues of code simplification
are open. I'll introduce those as I go.
I am aware of stylistic input from Bill Wendling, about
where put the analysis complexity, but I am positive
that we can move things around easily and will find a
satisfactory solution.
llvm-svn: 113839
non-function-local value, it may result in the metadata no longer needing to be
function-local. Check for this condition, and clear the isFunctionLocal flag, if
it's still in the uniquing map, since any node in the uniquing map needs to have
an accurate function-local flag.
Also, add an assert to help catch problematic cases.
llvm-svn: 113828