A non-identity copy cannot be coalesced when the phi join destination register
is live at the copy site.
Also verify the condition that the PHI join source register is only used in
the PHI join. Otherwise the coalescing is invalid.
llvm-svn: 86322
Here is the original commit message:
This commit updates malloc optimizations to operate on malloc calls that have constant int size arguments.
Update CreateMalloc so that its callers specify the size to allocate:
MallocInst-autoupgrade users use non-TargetData-computed allocation sizes.
Optimization uses use TargetData to compute the allocation size.
Now that malloc calls can have constant sizes, update isArrayMallocHelper() to use TargetData to determine the size of the malloced type and the size of malloced arrays.
Extend getMallocType() to support malloc calls that have non-bitcast uses.
Update OptimizeGlobalAddressOfMalloc() to optimize malloc calls that have non-bitcast uses. The bitcast use of a malloc call has to be treated specially here because the uses of the bitcast need to be replaced and the bitcast needs to be erased (just like the malloc call) for OptimizeGlobalAddressOfMalloc() to work correctly.
Update PerformHeapAllocSRoA() to optimize malloc calls that have non-bitcast uses. The bitcast use of the malloc is not handled specially here because ReplaceUsesOfMallocWithGlobal replaces through the bitcast use.
Update OptimizeOnceStoredGlobal() to not care about the malloc calls' bitcast use.
Update all globalopt malloc tests to not rely on autoupgraded-MallocInsts, but instead use explicit malloc calls with correct allocation sizes.
llvm-svn: 86311
This assert was very conservative to begin with (the error condition is well
covered by tests elsewhere in the code) so we won't miss much by removing it.
llvm-svn: 86088
MallocInst-autoupgrade users use non-TargetData-computed allocation sizes.
Optimization uses use TargetData to compute the allocation size.
Now that malloc calls can have constant sizes, update isArrayMallocHelper() to use TargetData to determine the size of the malloced type and the size of malloced arrays.
Extend getMallocType() to support malloc calls that have non-bitcast uses.
Update OptimizeGlobalAddressOfMalloc() to optimize malloc calls that have non-bitcast uses. The bitcast use of a malloc call has to be treated specially here because the uses of the bitcast need to be replaced and the bitcast needs to be erased (just like the malloc call) for OptimizeGlobalAddressOfMalloc() to work correctly.
Update PerformHeapAllocSRoA() to optimize malloc calls that have non-bitcast uses. The bitcast use of the malloc is not handled specially here because ReplaceUsesOfMallocWithGlobal replaces through the bitcast use.
Update OptimizeOnceStoredGlobal() to not care about the malloc calls' bitcast use.
Update all globalopt malloc tests to not rely on autoupgraded-MallocInsts, but instead use explicit malloc calls with correct allocation sizes.
llvm-svn: 86077
The KILL pseudo-instruction may survive to the asm printer pass, just like the IMPLICIT_DEF. Print the KILL as a comment instead of just leaving a blank line in the output.
With -asm-verbose=0, a blank line is printed, like IMPLICIT?DEF.
llvm-svn: 86041
This introduces a new pass, SlotIndexes, which is responsible for numbering
instructions for register allocation (and other clients). SlotIndexes numbering
is designed to match the existing scheme, so this patch should not cause any
changes in the generated code.
For consistency, and to avoid naming confusion, LiveIndex has been renamed
SlotIndex.
The processImplicitDefs method of the LiveIntervals analysis has been moved
into its own pass so that it can be run prior to SlotIndexes. This was
necessary to match the existing numbering scheme.
llvm-svn: 85979
This makes both logical sense (see below) and increases the
number of functions marked readnone/readonly by about 1-2%
in practice. The number of functions marked nocapture goes
up by about 5-10%. The reason it makes sense is shown by
the following example: if you run -functionattrs -inline on
it, then no attributes are assigned. But if you instead run
-inline -functionattrs then @f is marked readnone because the
simplifications produced by the inliner eliminate the store.
@x = external global i32
define void @w(i1 %b) {
br i1 %b, label %write, label %return
write:
store i32 1, i32 *@x
br label %return
return:
ret void
}
define void @f() {
call void @w(i1 0)
ret void
}
llvm-svn: 85893
1. we'd run simplifycfg at the very start, even though
the per function passes have already cleaned this up.
2. In the main per-function pipeline that is interlaced with inlining
etc, we would do instcombine, jump threading, simplifycfg *before*
doing SROA. SROA is much more likely to expose opportunities for
these passes than they are for SROA, so move SRoA up earlier.
also add some comments.
llvm-svn: 85742
ipconstprop and doesn't take much time. Just run it in its place.
This adds a testcase for it, which I plan to expand to cover other
"integration" cases, where we expect the optimizer to be able to
eliminate various things. Due to phase order issues we've regressed
in a number of areas and integration tests are the only way I see to
prevent this.
llvm-svn: 85729