linear scan reg alloc. This fixes a problem I ran into where extracting
a function from a larger file caused the generated code to change (masking
the problem I was trying to debug) because the allocator behaved differently.
This changes the results for two X86 regression checks. stack-color-with-reg
is improved, with one less instruction, but pr3495 is worse, with one more
copy. As far as I can tell, these tests were just getting lucky or unlucky,
so I've changed the expected results.
llvm-svn: 81060
Constant uniquing tables. This allows distinct ConstantExpr objects
with the same operation and different flags.
Even though a ConstantExpr "a + b" is either always overflowing or
never overflowing (due to being a ConstantExpr), it's still necessary
to be able to represent it both with and without overflow flags at
the same time within the IR, because the safety of the flag may
depend on the context of the use. If the constant really does overflow,
it wouldn't ever be safe to use with the flag set, however the use
may be in code that is never actually executed.
This also makes it possible to merge all the flags tests into a single test.
llvm-svn: 80998
There's a bug with ocamlc that uses "char*" instead of "const char*" for
global string variables. This causes g++ to be very noisy when linking
ocamlc programs. That's why the ocaml test used to cat to /dev/null.
ocamlopt doesn't have this problem, so we can get rid of the >/dev/null,
which may obscure some problems.
llvm-svn: 80968
D test/Analysis/Profiling
--- Reverse-merging r80907 into '.':
U lib/Analysis/ProfileInfoLoaderPass.cpp
Attempt to remove failure in the self-hosting build bot.
llvm-svn: 80966
instead of a bool argument, and to do the dominator check itself.
This makes it eaiser to use when DominatorTree information is
available.
llvm-svn: 80920
simplifylibcalls optimization is thus valid for C++ but not C.
It's not important enough to worry about for C++ apps, so just
remove it.
rdar://7191924
llvm-svn: 80887
edge-profiling, this is more useful since the loading of the
optimal-edge-profiling is more complicated.
The edge-profiling is tested in edge-profiling.ll where only the
instrumentation is tested.
llvm-svn: 80791
for sanity. This didn't turn up any bugs.
Change CallGraphNode to maintain its "callsite" information in the
call edges list as a WeakVH instead of as an instruction*. This fixes
a broad class of dangling pointer bugs, and makes CallGraph have a number
of useful invariants again. This fixes the class of problem indicated
by PR4029 and PR3601.
llvm-svn: 80663
tied to different source registers, the TwoAddressInstructionPass needs to
be smarter. Change it to check before replacing a source register whether
that source register is tied to a different destination register, and if so,
defer handling it until a subsequent iteration.
llvm-svn: 80654
makes an eggregious hack somewhat more palatable. Bringing the LSDA forward
and making it a GV available for reference would be even better, but is
beyond the scope of what I'm looking to solve at this point.
Objective C++ code could generate function names that broke the previous
scheme. This fixes that.
llvm-svn: 80649
changes: SimplifyDemandedBits can't use the builder yet because it
has the wrong insertion point. This fixes a crash building
MultiSource/Benchmarks/PAQ8p
llvm-svn: 80537
indirect function pointer, inline it, then go to delete the body.
The problem is that the callgraph had other references to the function,
though the inliner had no way to know it, so we got a dangling pointer
and an invalid iterator out of the deal.
The fix to this is pretty simple: stop the inliner from deleting the
function by knowing that there are references to it. Do this by making
CallGraphNodes contain a refcount. This requires moving deletion of
available_externally functions to the module-level cleanup sweep where
it belongs.
llvm-svn: 80533
is itself a bitcast. Since we have gep(bitcast(bitcast(y))) in this
case, just wait for the two bitcasts to get zapped. This prevents
instcombine from confusing some aliasing stuff, and allows it to
directly eliminate the load in the testcase.
llvm-svn: 80508
- I'm still trying to figure out the cleanest way to implement this and match the assembler, currently there are some substantial differences.
llvm-svn: 80347
calls into a function and if the calls bring in arrays, try to merge
them together to reduce stack size. For example, in the testcase
we'd previously end up with 4 allocas, now we end up with 2 allocas.
As described in the comments, this is not really the ideal solution
to this problem, but it is surprisingly effective. For example, on
176.gcc, we end up eliminating 67 arrays at "gccas" time and another
24 at "llvm-ld" time.
One piece of concern that I didn't look into: at -O0 -g with
forced inlining this will almost certainly result in worse debug
info. I think this is acceptable though given that this is a case
of "debugging optimized code", and we don't want debug info to
prevent the optimizer from doing things anyway.
llvm-svn: 80215
moves. This avoids the need to promote the operands (or implicitly
extend them, a partial register update condition), and can reduce
i8 register pressure. This substantially speeds up code such as
write_hex in lib/Support/raw_ostream.cpp.
subclass-coalesce.ll is too trivial and no longer tests what it was
originally intended to test.
llvm-svn: 80184
- I moved section creation back into AsmParser. I think policy decisions like
this should be pushed higher, not lower, when possible (in addition the
assembler has flags which change this behavior, for example).
llvm-svn: 80162