- Don't call malloc+free in the very hot forward().
- Don't call isTiedToDefOperand().
- Don't create BitVector temporaries.
- Merge DeadRegs into KillRegs.
- Eliminate the early clobber checks, they were irrelevant to scavenging.
- Remove unnecessary code from -Asserts builds.
This speeds up ARM PEI by 3.4x and overall llc -O0 codegen time by 11%.
llvm-svn: 149189
around within a basic block while maintaining live-intervals.
Updated ScheduleTopDownLive in MachineScheduler.cpp to use the moveInstr API
when reordering MIs.
llvm-svn: 149147
we're at it, allow PatternMatch's "neg" pattern to match integer
vector negations, and enhance ComputeNumSigned bits to handle
shl of vectors.
llvm-svn: 149082
The live range of the source register may be extended when a redundant
copy is eliminated. Make sure any kill flags between the two copies are
cleared.
This fixes PR11765.
llvm-svn: 149069
This enables the linker to match concrete relocation types (absolute or relative) with whatever library or C++ support code is being linked against.
llvm-svn: 149057
ConstantVector. Fix some outright bugs in the implementation of
ConstantArray and Constant struct, which would cause us to not make
one big UndefValue when asking for an array/struct with all undef
elements. Enhance Constant::isAllOnesValue to work with
ConstantDataVector.
llvm-svn: 149021
to reduce the number of cast<>'s we have. This allows someone to use
things like Ty->getVectorNumElements() instead of
cast<VectorType>(Ty)->getNumElements() when you know that a type is a
vector.
It would be a great general cleanup to move the codebase to use these,
I will do so in the code I'm touching.
llvm-svn: 148999
to 64-bits, and added a new attribute in bit #32. Specifically, remove
this new attribute from the enum used in the C API. It's not yet clear
what the best approach is for exposing these new attributes in the
C API, and several different proposals are on the table. Until then, we
can simply not expose this bit in the API at all.
Also, I've reverted a somewhat unrelated change in the same revision
which switched from "1 << 31" to "1U << 31" for the top enum. While "1
<< 31" is technically undefined behavior, implementations DTRT here.
However, MS and -pedantic mode warn about non-'int' type enumerator
values. If folks feel strongly about this I can put the 'U' back in, but
it seemed best to wait for the proper solution.
llvm-svn: 148937
add a ConstantDataArray::getString method that corresponds to the (to be
removed) StringRef version of ConstantArray::get, but is dramatically more
efficient.
llvm-svn: 148804
and clean up some other misc stuff. Unlike ConstantArray, we will
prefer to emit .fill directives for "String" arrays that all have
the same value, since they are denser than emitting a .ascii
llvm-svn: 148793
same semantics as ConstantArray's but much more efficient because they
don't have to return std::string's. The ConstantArray methods will
eventually be removed.
llvm-svn: 148792
out into a new ConstantFoldLoadThroughGEPIndices (more useful) function
and rewrite it to be simpler, more efficient, and to handle the new
ConstantDataSequential type.
Enhance ConstantFoldLoadFromConstPtr to handle ConstantDataSequential.
llvm-svn: 148786
violation -- MC cannot depend on CodeGen.
Specifically, the MCTargetDesc component of each target is actually
a subcomponent of the MC library. As such, it cannot depend on the
target-independent code generator, because MC itself cannot depend on
the target-independent code generator. This change moved a flag from the
ARM MCTargetDesc file ARMMCAsmInfo.cpp to the CodeGen layer in
ARMException.cpp, leaving behind an 'extern' to refer back to it. That
layering order isn't viable givin the constraints outlined above.
Commandline flags are designed to be static specifically to avoid these
types of bugs.
Fixing this is likely going to require some non-trivial refactoring.
llvm-svn: 148759
classes, per PR1324. Not all of their helper functions are implemented,
nothing creates them, and the rest of the compiler doesn't handle them yet.
llvm-svn: 148741
This change adds an new value to the --arm-enable-ehabi option that
disables emitting unwinding descriptors. This mode gives a working
backtrace() without the (currently broken) exception support.
llvm-svn: 148686
in a subclass named DyldELFObject. This class supports rebasing the object file
it represents by re-mapping section addresses to the actual memory addresses
the object was placed in. This is required for MC-JIT implementation on ELF with
debugging support.
Patch reviewed on llvm-commits.
Developed together with Ashok Thirumurthi and Andrew Kaylor.
llvm-svn: 148653
A register mask operand kills any live physreg that isn't preserved.
Unlike an implicit-def operand, the clobbered physregs are never live
afterwards.
This means LiveVariables has to track a much smaller number of live
physregs, and it should spend much less time in addRegisterDead().
llvm-svn: 148609
Problem: LLVM needs more function attributes than currently available (32 bits).
One such proposed attribute is "address_safety", which shows that a function is being checked for address safety (by AddressSanitizer, SAFECode, etc).
Solution:
- extend the Attributes from 32 bits to 64-bits
- wrap the object into a class so that unsigned is never erroneously used instead
- change "unsigned" to "Attributes" throughout the code, including one place in clang.
- the class has no "operator uint64 ()", but it has "uint64_t Raw() " to support packing/unpacking.
- the class has "safe operator bool()" to support the common idiom: if (Attributes attr = getAttrs()) useAttrs(attr);
- The CTOR from uint64_t is marked explicit, so I had to add a few explicit CTOR calls
- Add the new attribute "address_safety". Doing it in the same commit to check that attributes beyond first 32 bits actually work.
- Some of the functions from the Attribute namespace are worth moving inside the class, but I'd prefer to have it as a separate commit.
Tested:
"make check" on Linux (32-bit and 64-bit) and Mac (10.6)
built/run spec CPU 2006 on Linux with clang -O2.
This change will break clang build in lib/CodeGen/CGCall.cpp.
The following patch will fix it.
llvm-svn: 148553
LSR has gradually been improved to more aggressively reuse existing code, particularly existing phi cycles. This exposed problems with the SCEVExpander's sloppy treatment of its insertion point. I applied some rigor to the insertion point problem that will hopefully avoid an endless bug cycle in this area. Changes:
- Always used properlyDominates to check safe code hoisting.
- The insertion point provided to SCEV is now considered a lower bound. This is usually a block terminator or the use itself. Under no cirumstance may SCEVExpander insert below this point.
- LSR is reponsible for finding a "canonical" insertion point across expansion of different expressions.
- Robust logic to determine whether IV increments are in "expanded" form and/or can be safely hoisted above some insertion point.
Fixes PR11783: SCEVExpander assert.
llvm-svn: 148535
to instruction right after the last instruction in the bundle.
- Add a finalizeBundle() variant that doesn't specify LastMI. Instead, the code
will find the last instruction in the bundle by following the 'InsideBundle'
marker. This is useful in case bundles are formed early (i.e. during MI
scheduling) but finalized later (i.e. after register allocator has finished
rewriting virtual registers with physical registers).
llvm-svn: 148444
This SelectionDAG node will be attached to call nodes by LowerCall(),
and eventually becomes a MO_RegisterMask MachineOperand on the
MachineInstr representing the call instruction.
LowerCall() will attach a register mask that depends on the calling
convention.
llvm-svn: 148436
When set, this bit indicates that a register is completely defined by
the value of its sub-registers.
Use the CoveredBySubRegs property to infer which super-registers are
call-preserved given a list of callee-saved registers. For example, the
ARM registers D8-D15 are callee-saved. This now automatically implies
that Q4-Q7 are call-preserved.
Conversely, Win64 callees save XMM6-XMM15, but the corresponding
YMM6-YMM15 registers are not call-preserved because they are not fully
defined by their sub-registers.
llvm-svn: 148363
Targets can now add CalleeSavedRegs defs to their *CallingConv.td file.
TableGen will use this to create a *_SaveList array suitable for
returning from getCalleeSavedRegs() as well as a *_RegMask bit mask
suitable for returning from getCallPreservedMask().
llvm-svn: 148346
BitVector uses the native word size for its internal representation.
That doesn't work well for literal bit masks in source code.
This patch adds BitVector operations to efficiently apply literal bit
masks specified as arrays of uint32_t. Since each array entry always
holds exactly 32 bits, these portable bit masks can be source code
literals, probably produced by TableGen.
llvm-svn: 148272
Move to a by-section allocation and relocation scheme. This allows
better support for sections which do not contain externally visible
symbols.
Flesh out the relocation address vs. local storage address separation a
bit more as well. Remote process JITs use this to tell the relocation
resolution code where the code will live when it executes.
The startFunctionBody/endFunctionBody interfaces to the JIT and the
memory manager are deprecated. They'll stick around for as long as the
old JIT does, but the MCJIT doesn't use them anymore.
llvm-svn: 148258
Register masks will be used as a compact representation of large clobber
lists. Currently, an x86 call instruction has some 40 operands
representing call-clobbered registers. That's more than 1kB of useless
operands per call site.
A register mask operand references a bit mask of call-preserved
registers, everything else is clobbered. The bit mask will typically
come from TargetRegisterInfo::getCallPreservedMask().
By abandoning ImplicitDefs for call-clobbered registers, it also becomes
possible to share call instruction descriptions between calling
conventions, and we can get rid of the WINCALL* instructions.
This patch introduces the new operand kind. Future patches will add
RegMask support to target-independent passes before finally the fixed
clobber lists can be removed from call instruction descriptions.
llvm-svn: 148250
or Clang is using this, and it would be hard to use it correctly given
the thread hostility of the function. Also, it never checked the return
which is rather dangerous with chdir. If someone was in fact using this,
please let me know, as well as what the usecase actually is so that
I can add it back and make it more correct and secure to use. (That
said, it's never going to be "safe" per-se, but we could at least
document the risks...)
llvm-svn: 148211
The hook returns a bit-mask of call-preserved registers that will
eventually replace the current list of implicit defs on call
instructions. This will make it possible to support multiple calling
conventions without duplicating call instruction descriptors.
The call-preserved mask is slightly different from the list returned by
the getCalleeSavedRegs() hook, it includes all aliases that are
preserved by calls.
The hook takes a CallingConv::ID argument instead of a MachineFunction
pointer, so it can provide information about calls to extern functions,
and even indirect function calls.
TRI::getCalleeSavedRegs() returns information about the function
currently being compiled. TRI::getCallPreservedMask() returns
information about the functions it is calling.
llvm-svn: 148165
Consider this code:
int h() {
int x;
try {
x = f();
g();
} catch (...) {
return x+1;
}
return x;
}
The variable x is undefined on the first edge to the landing pad, but it
has the f() return value on the second edge to the landing pad.
SplitAnalysis::getLastSplitPoint() would assume that the return value
from f() was live into the landing pad when f() throws, which is of
course impossible.
Detect these cases, and treat them as if the landing pad wasn't there.
This allows spill code to be inserted after the function call to f().
<rdar://problem/10664933>
llvm-svn: 147912
Delete the alternative implementation in LiveIntervalAnalysis.
These functions computed the same thing, but SplitAnalysis caches the
result.
llvm-svn: 147911
with other symbols.
An object in the __cfstring section is suppoed to be filled with CFString
objects, which have a pointer to ___CFConstantStringClassReference followed by a
pointer to a __cstring. If we allow the object in the __cstring section to be
merged with another global, then it could end up in any section. Because the
linker is going to remove these symbols in the final executable, we shouldn't
bother to merge them.
<rdar://problem/10564621>
llvm-svn: 147899
functional change in r147860 to use DW_TAG_label's instead TAG_subprogram's.
This only changes names and updates comments. No functional change.
llvm-svn: 147877
of several newly un-defaulted switches. This also helps optimizers
(including LLVM's) recognize that every case is covered, and we should
assume as much.
llvm-svn: 147861
These heuristics are sufficient for enabling IV chains by
default. Performance analysis has been done for i386, x86_64, and
thumbv7. The optimization is rarely important, but can significantly
speed up certain cases by eliminating spill code within the
loop. Unrolled loops are prime candidates for IV chains. In many
cases, the final code could still be improved with more target
specific optimization following LSR. The goal of this feature is for
LSR to make the best choice of induction variables.
Instruction selection may not completely take advantage of this
feature yet. As a result, there could be cases of slight code size
increase.
Code size can be worse on x86 because it doesn't support postincrement
addressing. In fact, when chains are formed, you may see redundant
address plus stride addition in the addressing mode. GenerateIVChains
tries to compensate for the common cases.
On ARM, code size increase can be mitigated by using postincrement
addressing, but downstream codegen currently misses some opportunities.
llvm-svn: 147826
file error checking. Use that to error on an unfinished cfi_startproc.
The error is not nice, but is already better than a segmentation fault.
llvm-svn: 147717
opportunities that only present themselves after late optimizations
such as tail duplication .e.g.
## BB#1:
movl %eax, %ecx
movl %ecx, %eax
ret
The register allocator also leaves some of them around (due to false
dep between copies from phi-elimination, etc.)
This required some changes in codegen passes. Post-ra scheduler and the
pseudo-instruction expansion passes have been moved after branch folding
and tail merging. They were before branch folding before because it did
not always update block livein's. That's fixed now. The pass change makes
independently since we want to properly schedule instructions after
branch folding / tail duplication.
rdar://10428165
rdar://10640363
llvm-svn: 147716
The register allocators don't currently support adding reserved
registers while they are running. Extend the MRI API to keep track of
the set of reserved registers when register allocation started.
Target hooks like hasFP() and needsStackRealignment() can look at this
set to avoid reserving more registers during register allocation.
llvm-svn: 147577
Using DenseMap iterators isn't free as they have to check for empty
buckets. Dominator queries are common so this gives a minor speedup.
llvm-svn: 147544
Get back getHostTriple.
For JIT compilation, use the host triple instead of the default
target: this fixes some JIT testcases that used to fail when the
compiler has been configured as a cross compiler.
llvm-svn: 147542
captured. This allows the tracker to look at the specific use, which may be
especially interesting for function calls.
Use this to fix 'nocapture' deduction in FunctionAttrs. The existing one does
not iterate until a fixpoint and does not guarantee that it produces the same
result regardless of iteration order. The new implementation builds up a graph
of how arguments are passed from function to function, and uses a bottom-up walk
on the argument-SCCs to assign nocapture. This gets us nocapture more often, and
does so rather efficiently and independent of iteration order.
llvm-svn: 147327