This time it's for real! I am going to hook this up in the frontends as well.
The inliner has some experimental heuristics for dealing with the inline hint.
When given a -respect-inlinehint option, functions marked with the inline
keyword are given a threshold just above the default for -O3.
We need some experiments to determine if that is the right thing to do.
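As a rough sketch of what the heuristic amounts to (the flag handling, function
name, and numbers below are illustrative only, not the actual inliner code):

  // Hypothetical sketch: with the option enabled, a function carrying the
  // inline hint gets a threshold just above the -O3 default.
  int getEffectiveInlineThreshold(bool HasInlineHint, bool RespectInlineHint) {
    const int DefaultO3Threshold = 250;   // illustrative value only
    if (RespectInlineHint && HasInlineHint)
      return DefaultO3Threshold + 25;     // "just above" the -O3 default
    return DefaultO3Threshold;
  }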
llvm-svn: 95466
xform it is checking to actually pass. There is no need to match
m_SelectCst<0, -1> since instcombine canonicalizes that into not(sext).
Add matches for sext(not(x)) in addition to not(sext(x)).
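For reference, the two forms can be matched with LLVM's PatternMatch helpers
roughly like this (simplified sketch, header path as in a current tree, not the
actual instcombine code):

  #include "llvm/IR/PatternMatch.h"
  #include "llvm/IR/Value.h"
  using namespace llvm;
  using namespace llvm::PatternMatch;

  // Both forms describe the same bool-to-mask idiom once instcombine has
  // canonicalized select(x, 0, -1) into not(sext(x)).
  static bool isBoolToMaskPattern(Value *V, Value *&X) {
    return match(V, m_Not(m_SExt(m_Value(X)))) ||   // not(sext(x))
           match(V, m_SExt(m_Not(m_Value(X))));     // sext(not(x))
  }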
llvm-svn: 95420
short-circuited conditions to AND/OR expressions, and those expressions
are often converted back to a short-circuited form in code gen. The
original source order may have been optimized to take advantage of the
expected values, and if we reassociate them, we change the order and
subvert that optimization. Radar 7497329.
llvm-svn: 95333
This makes the inliner about as aggressive as it was before my changes to the
inliner cost calculations. These levels give the same performance and slightly
smaller code than before.
llvm-svn: 95320
Fix bugs where we would compute an out-of-bounds access as in bounds, and where
we failed to account for the fact that the linker can override the size of an
array. Add a few new testcases, and change the existing testcase to use a
private global array instead of an extern one.
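A loose source-level picture of the distinction the testcase change is about
(illustrative only, not the actual testcase):

  // With external linkage, the definition that wins at link time may not match
  // what this translation unit sees, so the declared size is not a reliable
  // bound; a private (static) array's size is definitively known here.
  extern int ExternTable[];               // size owned elsewhere / by the linker
  static int PrivateTable[4] = {0};

  int readExtern(long i)  { return ExternTable[i]; }   // bounds not provable here
  int readPrivate(long i) { return PrivateTable[i]; }  // in bounds iff 0 <= i < 4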
llvm-svn: 95283
The SRThreshold value makes perfect sense for checking if an entire aggregate
should be promoted to a scalar integer, but it is not so good for splitting
an aggregate into its separate elements. A struct may contain a large embedded
array along with some scalar fields that would benefit from being split apart
by SROA. Even if the total aggregate size is large, it may still be good to
perform SROA. Thus, the most important piece of this patch is simply moving
the aggregate size comparison vs. SRThreshold so that it guards only the
aggregate promotion.
We have also been checking the number of elements to decide if an aggregate
should be split up. The limit of "SRThreshold/4" seemed rather arbitrary,
and I don't think it's very useful to derive this limit from SRThreshold
anyway. I've collected some data showing that the current default limit of
32 (since SRThreshold defaults to 128) is a reasonable cutoff for struct
types. One thing suggested by the data is that distinguishing between structs
and arrays might be useful. There are (obviously) a lot more large arrays
than large structs (as measured by the number of elements and not the total
size -- a large array inside a struct still counts as a single element given
the way we do SROA right now). Out of 8377 arrays where we successfully
performed SROA while compiling a large set of benchmarks, only 16 of them had
more than 8 elements. And, for those 16 arrays, it's not at all clear that
SROA was actually beneficial. So, to offset the compile time cost of
investigating more large structs for SROA, the patch lowers the limit on array
elements to 8.
This fixes Apple Radar 7563690.
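A sketch of the kind of aggregate the first change is aimed at (illustrative
only):

  // Total size is well past a threshold like 128 bytes, but the scalar fields
  // still benefit from being split out by SROA, and the embedded array counts
  // as just one element under the current accounting.
  struct MixedAggregate {
    int    Counter;        // worth promoting to a register
    double RunningSum;     // likewise
    char   Buffer[512];    // large embedded array: a single element
  };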
llvm-svn: 95224
disabled by default. This divides the existing load PRE code into 2 phases:
first it checks that it is safe to move the load to each of the predecessors
where it is unavailable, and then if it is safe, the code is changed to move
the load. Radar 7571861.
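A rough sketch of the two-phase shape (the helper names and types below are
stand-ins, not the actual GVN code):

  #include <vector>

  struct BasicBlock;                      // stand-ins, not the real LLVM types
  struct LoadInst;

  bool isSafeToPlaceLoadIn(BasicBlock *Pred, LoadInst *LI);    // hypothetical
  void insertLoadAtEndOfBlock(BasicBlock *Pred, LoadInst *LI); // hypothetical

  // Phase 1: prove the load can be placed in every predecessor where it is
  // unavailable.  Phase 2: only once that is known, actually move the load.
  void performLoadPRE(LoadInst *LI,
                      const std::vector<BasicBlock *> &UnavailablePreds) {
    for (BasicBlock *Pred : UnavailablePreds)
      if (!isSafeToPlaceLoadIn(Pred, LI))
        return;                            // phase 1 failed: change nothing
    for (BasicBlock *Pred : UnavailablePreds)
      insertLoadAtEndOfBlock(Pred, LI);    // phase 2: perform the motion
  }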
llvm-svn: 95007
of objc message send was getting marked arm_apcscc, but the prototype
wasn't. This is fine at runtime because objc_msgSend is implemented in
assembly. Only turn a mismatched caller and callee into 'unreachable'
if the callee is a definition.
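Condensed, the rule looks roughly like this (simplified sketch, header paths
and types as in a current LLVM tree, not the actual instcombine code):

  #include "llvm/IR/Function.h"
  #include "llvm/IR/InstrTypes.h"
  using namespace llvm;

  // Only a definition with a mismatched calling convention is turned into
  // 'unreachable'; a declaration may be implemented in assembly (as
  // objc_msgSend is) and still work at runtime.
  static bool shouldTurnIntoUnreachable(const Function &Callee,
                                        const CallBase &Call) {
    return Callee.getCallingConv() != Call.getCallingConv() &&
           !Callee.isDeclaration();
  }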
llvm-svn: 94986
case, instcombine can't zap the invoke for fear of changing the CFG.
However, we have to do something to prevent the next iteration of
instcombine from inserting another store -> undef before the invoke
thereby getting into infinite iteration between dead store elim and
store insertion.
Just zap the callee to null, which will prevent the next iteration
from doing anything.
llvm-svn: 94985
This bug was exposed by my inliner cost changes in r94615, and caused failures
of lencod on most architectures when building with LTO.
This patch fixes lencod and 464.h264ref on x86-64 (and likely others).
llvm-svn: 94858
create a testcase where this matters. The select+load transformation only
occurs when isSafeToLoadUnconditionally is true, and in those situations,
instcombine also changes the underlying objects to be aligned. This seems
like a good idea regardless, and I've verified that it doesn't pessimize
the subsequent realignment.
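For context, a source-level picture of the select+load transform referred to
above (illustrative only):

  // A load through a selected pointer ...
  int loadThroughSelect(bool c, int *p, int *q) {
    return *(c ? p : q);                  // load (select c, p, q)
  }

  // ... becomes a select of two loads when both are safe to execute
  // unconditionally; that is the form the alignment note above applies to.
  int afterTransform(bool c, int *p, int *q) {
    return c ? *p : *q;                   // select c, (load p), (load q)
  }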
llvm-svn: 94850
(via APInt &RHSKnownZero = KnownZero, etc) seems dangerous and confusing to me: it
is easy not to notice this, and then wonder why KnownZero/RHSKnownZero changed
underneath you when you modified RHSKnownZero/KnownZero etc. So get rid of this.
No intended functionality change (tested with "make check" + llvm-gcc bootstrap).
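A minimal example of the aliasing hazard being removed (not the actual code):

  #include "llvm/ADT/APInt.h"
  using llvm::APInt;

  void aliasingHazard() {
    APInt KnownZero(32, 0);
    APInt &RHSKnownZero = KnownZero;      // the alias is easy to overlook ...
    RHSKnownZero.setBit(0);               // ... and this also changes KnownZero
  }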
llvm-svn: 94802
This was already being done in SSAUpdater::GetValueAtEndOfBlock so I've
just changed SSAUpdater to check for existing PHIs in both places.
llvm-svn: 94690
Modules and ModuleProviders. Because the "ModuleProvider" simply materializes
GlobalValues now, and doesn't provide modules, it's renamed to
"GVMaterializer". Code that used to need a ModuleProvider to materialize
Functions can now materialize the Functions directly. Functions no longer use a
magic linkage to record that they're materializable; they simply ask the
GVMaterializer.
Because the C ABI must never change, we can't remove LLVMModuleProviderRef or
the functions that refer to it. Instead, because Module now exposes the same
functionality ModuleProvider used to, we store a Module* in any
LLVMModuleProviderRef and translate in the wrapper methods. The bindings to
other languages still use the ModuleProvider concept. It would probably be
worth some time to update them to follow the C++ API more closely, but I don't
intend to do it.
Fixes http://llvm.org/PR5737 and http://llvm.org/PR5735.
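A rough sketch of the C-API shim described above (names and details invented
for illustration; see the actual llvm-c headers for the real entry points):

  #include "llvm/IR/Module.h"

  // Illustrative only: the legacy handle simply carries the Module*, and the
  // wrapper methods translate between the two views.
  typedef struct HypotheticalOpaqueProvider *ProviderHandle;   // invented name

  static ProviderHandle wrapAsProvider(llvm::Module *M) {
    return reinterpret_cast<ProviderHandle>(M);   // the provider is the Module
  }
  static llvm::Module *unwrapProvider(ProviderHandle MP) {
    return reinterpret_cast<llvm::Module *>(MP);
  }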
llvm-svn: 94686