changes:
* BytecodeReader::getType(...) used to return a null pointer
on error. This was only checked about half the time. Now we convert
it to throw an exception, and delete the half that checked for error.
This was checked in before, but psmith crashed and lost the change :(
* insertValue no longer returns -1 on error, so callers don't need to
check for it.
* Substantial rewrite of InstructionReader.cpp, to use more efficient,
simpler, data structures. This provides another 5% speedup. This also
makes the code much easier to read and understand.
llvm-svn: 8984
new, simpler, ForwardReferences data structure. This is just the first
simple replacement, subsequent changes will improve the code more.
This simple change improves the performance of loading a file from HDF5
(contributed by Bill) from 2.36s to 1.93s, a 22% improvement. This
presumably has to do with the fact that we only create ONE placeholder for
a particular forward referenced values, and also may be because the data
structure is much simpler.
llvm-svn: 8979
in the bytecode parser. Before we tried to shoehorn basic blocks into the
"getValue" code path with other types of values. For a variety of reasons
this was a bad idea, so this patch separates it out into its own data structure.
This simplifies the code, makes it fit in 80 columns, and is also much faster.
In a testcase provided by Bill, which has lots of PHI nodes, this patch speeds
up bytecode parsing from taking 6.9s to taking 2.32s. More speedups to
follow later.
llvm-svn: 8977
of a test that Bill Wendling sent me from 228.5s to 105s. Obviously there is
more improvement to be had, but this is a nice speedup which should be "felt"
by many programs.
llvm-svn: 8962
and TargetInstrDescriptor::ImplicitUses to always point to a null
terminated array and never be null. So there is no need to check for
pointer validity when iterating over those sets. Code that looked
like:
if (const unsigned* AS = TID.ImplicitDefs) {
for (int i = 0; AS[i]; ++i) {
// use AS[i]
}
}
was changed to:
for (const unsigned* AS = TID.ImplicitDefs; *AS; ++AS) {
// use *AS
}
llvm-svn: 8960
of callees between executions.
On eon, in release mode, this changes the inliner from taking 11.5712s
to taking 2.2066s. In debug mode, it went from taking 14.4148s to
taking 7.0745s. In release mode, this is a 24.7% speedup of gccas, in
debug mode, it's a total speedup of 11.7%.
This also makes it slightly more aggressive. This could be because we
are not judging the size of the functions quite as accurately as before.
When we start looking at the performance of the generated code, this can
be investigated further.
llvm-svn: 8893
Running the inliner on 252.eon used to take 48.4763s, now it takes 14.4148s.
In release mode, it went from taking 25.8741s to taking 11.5712s.
This also fixes a FIXME.
llvm-svn: 8890
"minimal" SSA form (in other words, it doesn't insert dead PHIs). This
speeds up the mem2reg pass very significantly because it doesn't have to
do a lot of frivolous work in many common cases.
In the 252.eon function I have been playing with, this doesn't even insert
the 120 PHI nodes that it used to which were trivially dead (in the process
of promoting 356 alloca instructions overall). This speeds up the mem2reg
pass from 1.2459s to 0.1284s. More significantly, the DCE pass used to take
2.4138s to remove the 120 dead PHI nodes that mem2reg constructed, now it
takes 0.0134s (which is the time to scan the function and decide that there
is nothing dead). So overall, on this one function, we speed things up a
total of 3.5179s, which is a 24.8x speedup! :)
This change is tested by the Mem2Reg/2003-10-05-DeadPHIInsertion.ll test,
which now passes.
llvm-svn: 8884
basic block. This is amazingly common in code generated by the C/C++ front-ends.
This change makes it not have to insert ANY phi nodes, whereas before it would insert
a ton of dead ones which DCE would have to clean up.
Thus, this fix improves compile-time performance of these trivial allocas in two ways:
1. It doesn't have to do the walking and book-keeping for renaming
2. It does not insert dead phi nodes for them which would have to
subsequently be cleaned up.
On my favorite testcase from 252.eon, this special case handles 305 out of
356 promoted allocas in the function. It speeds up the mem2reg pass from 7.5256s
to 1.2505s. It inserts 677 fewer dead PHI nodes, which speeds up a subsequent
-dce pass from 18.7524s to 2.4806s.
There are still 120 trivially dead PHI nodes being inserted for variables used
in multiple basic blocks, but they are not handled by this patch.
llvm-svn: 8881
*** Revamp the code which handled unreachable code in the function. Now the
code is much more efficient for high-degree basic blocks, such as those
that occur in the 252.eon SPEC benchmark.
For the interested, the time to promote a SINGLE alloca in _ZN7mrScene4ReadERSi
function used to be > 3.5s. Now it is < .075s. The function has a LOT of
allocas in it, so it appeared to be infinite looping, this should make it much
nicer. :)
llvm-svn: 8863
work-list of value definitions. This allows elimination of the explicit
'iterative' step of the algorithm, and also reuses temporary memory better.
llvm-svn: 8861
* Do not insert a new entry into NewPhiNodes during the rename pass if there are no PHIs in a block.
* Do not compute WriteSets in parallel
llvm-svn: 8858
* Eliminate the KillList instance variable, instead, just delete loads and
stores as they are "renamed", and delete allocas when they are done
* Make the 'visited' set an instance variable to avoid passing it on the stack.
llvm-svn: 8857
constants as necessary due to type resolution. With this change, the
following spec benchmarks now link: 176.gcc, 177.mesa, 252.eon,
253.perlbmk, & 300.twolf. IOW, all SPEC INT and FP benchmarks now link.
llvm-svn: 8853
machinery. This dramatically simplifies how things works, removes irritating
little corner cases, and overall improves speed and reliability.
Highlights of this change are:
1. The exponential algorithm built into the code is now gone. For example
the time to disassemble one bytecode file from the mesa benchmark went
from taking 12.5s to taking 0.16s.
2. The linker bugs should be dramatically reduced. The one remaining bug
has to do with constant handling, which I actually introduced in
"union-find" checkins.
3. The code is much easier to follow, as a result of fewer special cases.
It's probably also smaller. yaay.
llvm-svn: 8842
This makes use of the new PATypeHolder's to keep types from being deleted
prematurely, instead of the wierd "self reference" garbage. This is easier
to understand and more efficient as well.
llvm-svn: 8834
* Instead of a #define, use inline function
* Fix the name on the #define, errr... now inline function to be more logical:
it doesn't CHECK the alignment, it PERFORMS the alignment
* To get string name of a Type*, use getDescription(), not getName()
llvm-svn: 8683
* Make sure we align the buffer we're given
* Do not let exceptions propagate when the caller asks for a Module*
* Add doxygenified comments to wrapper functions
llvm-svn: 8682
because it can add a module ID which we do not have at this time.
* Check to see if the module has been initialized when materializing it.
llvm-svn: 8674
- no more passing around a string pointer to set errors
- no more returning booleans and checking for errors, we use C++ exceptions
* Broke functionality into 2 new classes, one reads from file, one from a stream
* Implemented lazy function streaming - the parser can read in a function at-a-time
llvm-svn: 8671
... by making sure to update PHI nodes to take into consideration the
extra edges we get if we inline a call instruction through an invoke.
llvm-svn: 8664
PhyRegAlloc.cpp:
Don't include TargetMachine.h or TargetRegInfo.h, because these are provided
by PhyRegAlloc.h.
Merge class RegisterAllocator into class PhyRegAlloc.
Simplify & move ctor, dtor to PhyRegAlloc.h.
Make some of PhyRegAlloc's reference members into pointer members,
so they can be more easily messed with.
MarkAllocatedRegs() becomes a member method, with fewer args.
PhyRegAlloc.h:
Include Pass.h, TargetMachine.h and TargetRegInfo.h. Don't declare
TargetRegInfo forward.
Give AddedInstrns the obvious clear() method.
Make some of PhyRegAlloc's reference members into pointer members,
so they can be more easily messed with.
Add prototype for markAllocatedRegs().
Remove unused inline void constructLiveRanges().
llvm-svn: 8641
functionality of FunctionInfo pass as doFinalization method.
Rename pass to match names of other passes like it.
Rename the pass creator fn to mimic the other creator fn names.
Include StringExtras for utostr().
Make symbol prologue/epilogue stuff redundant with
EmitBytecodeToAssembly, in preparation for refactoring.
llvm-svn: 8597
arguments and environment.
Perhaps it should be merged with the RunProgramWithTimeout function, but I'd
want to allow it to inherit the parent process's stdin and stdout.
I'll save that for a rainy day...
llvm-svn: 8577
in it being both shorter and more effective. It no longer depends on the
callgraph, so one FIXME has been fixed.
Additionally, this pass was not able to delete recursive (but dead) functions
if they were pointed to by global variables which were also dead. In fact
this pass had a lot of problems deleting functions which were only pointed
to by dead globals and other stuff.
Fixing this means that the entire EH library should be stripped away now from
programs that don't use sjlj or exceptions.
llvm-svn: 8567
the #define up there too
* Since we're including system headers, use the ones in include/llvm/Config
* While we're here, use the canonical LLVM header ordering algorithm
llvm-svn: 8463
unify all exit nodes of a function to compute post-dominance information.
This does not work with functions that have both unwind and return nodes,
because we cannot unify these blocks. The new implementation is better
anyway. :)
llvm-svn: 8460