it to be off. If it looks like it's completely unnecessary after testing, I
will remove it completely (which is the hope).
* Callers of the DSNode "copy ctor" can not choose to not copy links.
* Make node collapsing not create a garbage node in some cases, avoiding a
memory allocation, and a subsequent DNE.
* When merging types, allow two functions of different types to be merged
without collapsing.
* Use DSNodeHandle::isNull more often instead of DSNodeHandle::getNode() == 0,
as it is much more efficient.
*** Implement the new, more efficient reachability cloner class
In addition to only cloning nodes that are reachable from interesting
roots, this also fixes the huge inefficiency we had where we cloned lots
of nodes, only to merge them away immediately after they were cloned.
Now we only actually allocate a node if there isn't one to merge it into.
* Eliminate the now-obsolete cloneReachable* and clonePartiallyInto methods
* Rewrite updateFromGlobalsGraph to use the reachability cloner
* Rewrite mergeInGraph to use the reachability cloner
* Disable the scalar map scanning code in removeTriviallyDeadNodes. In large
SCC's, this is extremely expensive. We need a better data structure for the
scalar map, because we really want to scan the unique node handles, not ALL
of the scalars.
* Remove the incorrect SANER_CODE_FOR_CHECKING_IF_ALL_REFERRERS_ARE_FROM_SCALARMAP code.
* Move the code for eliminating integer nodes from the trivially dead
eliminator to the dead node eliminator.
* removeDeadNodes no longer uses removeTriviallyDeadNodes, as it contains a
superset of the node removal power.
* Only futz around with the globals graph in removeDeadNodes if it is modified
llvm-svn: 10987
map was only used to implement a marginal GlobalsGraph optimization, and it
actually slows the analysis down (due to the overhead of keeping it), so just
eliminate it entirely.
llvm-svn: 10955
in terms of it.
Though clonePartiallyInto is not cloning partial graphs yet, this change
dramatically speeds up inlining of graphs with many scalars. For example,
this change speeds up the BU pass on 253.perlbmk from 69s to 36s, because
it avoids iteration over the scalar map, which can get pretty large.
llvm-svn: 10951
used to eliminate the hard coded, hacked in, sparc specific, global TargetData.
Changing the TargetData used to actually match the code fixes problems, and
eliminates a crash.
llvm-svn: 9659
and (2) faster inlining by cloning only reachable nodes. In particular:
(1) Added DSGraph::cloneReachableSubgraph and DSGraph::cloneReachableNodes
to clone the subgraph reachable from a set of root nodes, into the
current graph, merging the global nodes into thos in the current graph.
The TD pass now uses this for faster inlining, and so does the
next function.
(2) Added DSGraph::updateFromGlobalGraph() to rematerialize nodes from the
globals graph into the current graph in both BU and TD passes.
(3) `I' flags are removed from all nodes in the globals graph, because they
are difficult to maintain correctly and are not needed anyway.
(4) Aux. function calls are only removed to the globals graph if they
will never be resovled. (This is what fixed gap.) The immediate
reason is that if we took these out of a function (and moved them to
the globals graph) we would need to rematerialize these nodes into the
function graph for every function in the BU pass. The longer term
problem is that we would need to find a way to remove them from the
globals graph iff they have been resolved on all paths through the
call graph.
llvm-svn: 7187
* Add new MultiObject flag to DSNode which keeps track of whether or not
multiple objects have been merged into the node, allowing must-alias info
to be tracked.
llvm-svn: 6794
This helps a lot of testcases, for example:
New Time New #Nodes Old Time Old #Nodes
254.gap: 91.1024 21605 91.1397 22657
povray31: 2.7807 8613 3.0152 10338
255.vortex: 1.2034 8153 1.2172 8822
moria: .6756 3150 .7054 3877
300.twolf: .1652 2010 .1851 3270
Typically, testcases which use long and ulong integers a lot get better, f.e. povray above.
llvm-svn: 5566