1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-25 05:52:53 +02:00
Commit Graph

27 Commits

Author SHA1 Message Date
Alp Toker
18115693f7 Fix typos
llvm-svn: 208839
2014-05-15 01:52:21 +00:00
Duncan P. N. Exon Smith
2eaef1aa01 Reapply "blockfreq: Approximate irreducible control flow"
This reverts commit r207287, reapplying r207286.

I'm hoping that declaring an explicit struct and instantiating
`addBlockEdges()` directly works around the GCC crash from r207286.
This is a lot more boilerplate, though.

llvm-svn: 207438
2014-04-28 20:02:29 +00:00
Duncan P. N. Exon Smith
c54b3a7e23 Revert "blockfreq: Approximate irreducible control flow"
This reverts commit r207286.  It causes an ICE on the
cmake-llvm-x86_64-linux buildbot [1]:

    llvm/lib/Analysis/BlockFrequencyInfo.cpp: In lambda function:
    llvm/lib/Analysis/BlockFrequencyInfo.cpp:182:1: internal compiler error: in get_expr_operands, at tree-ssa-operands.c:1035

[1]: http://bb.pgr.jp/builders/cmake-llvm-x86_64-linux/builds/12093/steps/build_llvm/logs/stdio

llvm-svn: 207287
2014-04-25 23:16:58 +00:00
Duncan P. N. Exon Smith
3189616c35 blockfreq: Approximate irreducible control flow
Previously, irreducible backedges were ignored.  With this commit,
irreducible SCCs are discovered on the fly, and modelled as loops with
multiple headers.

This approximation specifies the headers of irreducible sub-SCCs as its
entry blocks and all nodes that are targets of a backedge within it
(excluding backedges within true sub-loops).  Block frequency
calculations act as if we insert a new block that intercepts all the
edges to the headers.  All backedges and entries to the irreducible SCC
point to this imaginary block.  This imaginary block has an edge (with
even probability) to each header block.

The result is now reasonable enough that I've added a number of
testcases for irreducible control flow.  I've outlined in
`BlockFrequencyInfoImpl.h` ways to improve the approximation.

<rdar://problem/14292693>

llvm-svn: 207286
2014-04-25 23:08:57 +00:00
Duncan P. N. Exon Smith
6117699d7e blockfreq: Only one mass distribution per node
Remove the concepts of "forward" and "general" mass distributions, which
was wrong.  The split might have made sense in an early version of the
algorithm, but it's definitely wrong now.

<rdar://problem/14292693>

llvm-svn: 207195
2014-04-25 04:38:43 +00:00
Duncan P. N. Exon Smith
7be280eab6 blockfreq: Use better branch weights in multiexit test
The branch weights were even before.  Make them different.

<rdar://problem/14292693>

llvm-svn: 207193
2014-04-25 04:38:37 +00:00
Duncan P. N. Exon Smith
d2094f508a blockfreq: Clean up irreducible testcases
Strip irreducible testcases to pure control flow.  The function calls
made the branch weights more believable but cluttered it up a lot.
There isn't going to be any constant analysis here, so just use dumb
branch logic to clarify the important parts.

<rdar://problem/14292693>

llvm-svn: 207192
2014-04-25 04:38:35 +00:00
Duncan P. N. Exon Smith
e7f58108ee blockfreq: Skip irreducible backedges inside functions
The branch that skips irreducible backedges was only active when
propagating mass at the top-level.  In particular, when propagating mass
through a loop recognized by `LoopInfo` with irreducible control flow
inside, irreducible backedges would not be skipped.

Not sure where that idea came from, but the result was that mass was
lost until after loop exit.  Added a testcase that covers this case.

llvm-svn: 206860
2014-04-22 03:31:53 +00:00
Duncan P. N. Exon Smith
78dd4cd9af Reapply "blockfreq: Rewrite BlockFrequencyInfoImpl"
This reverts commit r206707, reapplying r206704.  The preceding commit
to CalcSpillWeights should have sorted out the failing buildbots.

<rdar://problem/14292693>

llvm-svn: 206766
2014-04-21 17:57:07 +00:00
Duncan P. N. Exon Smith
f65036e329 Revert "blockfreq: Rewrite BlockFrequencyInfoImpl"
This reverts commit r206704, as expected.

llvm-svn: 206707
2014-04-19 22:46:00 +00:00
Duncan P. N. Exon Smith
707997192f Reapply "blockfreq: Rewrite BlockFrequencyInfoImpl"
This reverts commit r206677, reapplying my BlockFrequencyInfo rewrite.

I've done a careful audit, added some asserts, and fixed a couple of
bugs (unfortunately, they were in unlikely code paths).  There's a small
chance that this will appease the failing bots [1][2].  (If so, great!)

If not, I have a follow-up commit ready that will temporarily add
-debug-only=block-freq to the two failing tests, allowing me to compare
the code path between what the failing bots and what my machines (and
the rest of the bots) are doing.  Once I've triggered those builds, I'll
revert both commits so the bots go green again.

[1]: http://bb.pgr.jp/builders/ninja-x64-msvc-RA-centos6/builds/1816
[2]: http://llvm-amd64.freebsd.your.org/b/builders/clang-i386-freebsd/builds/18445

<rdar://problem/14292693>

llvm-svn: 206704
2014-04-19 22:34:26 +00:00
Duncan P. N. Exon Smith
0ee9548e22 Revert "blockfreq: Rewrite BlockFrequencyInfoImpl" (#2)
This reverts commit r206666, as planned.

Still stumped on why the bots are failing.  Sanitizer bots haven't
turned anything up.  If anyone can help me debug either of the failures
(referenced in r206666) I'll owe them a beer.  (In the meantime, I'll be
auditing my patch for undefined behaviour.)

llvm-svn: 206677
2014-04-19 00:42:46 +00:00
Duncan P. N. Exon Smith
66e247e69c Reapply "blockfreq: Rewrite BlockFrequencyInfoImpl" (#2)
This reverts commit r206628, reapplying r206622 (and r206626).

Two tests are failing only on buildbots [1][2]: i.e., I can't reproduce
on Darwin, and Chandler can't reproduce on Linux.  Asan and valgrind
don't tell us anything, but we're hoping the msan bot will catch it.

So, I'm applying this again to get more feedback from the bots.  I'll
leave it in long enough to trigger builds in at least the sanitizer
buildbots (it was failing for reasons unrelated to my commit last time
it was in), and hopefully a few others.... and then I expect to revert a
third time.

[1]: http://bb.pgr.jp/builders/ninja-x64-msvc-RA-centos6/builds/1816
[2]: http://llvm-amd64.freebsd.your.org/b/builders/clang-i386-freebsd/builds/18445

llvm-svn: 206666
2014-04-18 22:30:03 +00:00
Duncan P. N. Exon Smith
80fdbd652d Revert "blockfreq: Rewrite BlockFrequencyInfoImpl" (#2)
This reverts commit r206622 and the MSVC fixup in r206626.

Apparently the remotely failing tests are still failing, despite my
attempt to fix the nondeterminism in r206621.

llvm-svn: 206628
2014-04-18 17:56:08 +00:00
Duncan P. N. Exon Smith
cf746f5ff0 Reapply "blockfreq: Rewrite BlockFrequencyInfoImpl"
This reverts commit r206556, effectively reapplying commit r206548 and
its fixups in r206549 and r206550.

In an intervening commit I've added target triples to the tests that
were failing remotely [1] (but passing locally).  I'm hoping the mystery
is solved?  I'll revert this again if the tests are still failing
remotely.

[1]: http://bb.pgr.jp/builders/ninja-x64-msvc-RA-centos6/builds/1816

llvm-svn: 206622
2014-04-18 17:22:25 +00:00
Duncan P. N. Exon Smith
79011f6e40 Revert "blockfreq: Rewrite BlockFrequencyInfoImpl"
This reverts commits r206548, r206549 and r206549.

There are some unit tests failing that aren't failing locally [1], so
reverting until I have time to investigate.

[1]: http://bb.pgr.jp/builders/ninja-x64-msvc-RA-centos6/builds/1816

llvm-svn: 206556
2014-04-18 02:17:43 +00:00
Duncan P. N. Exon Smith
78f8766db3 blockfreq: Rewrite BlockFrequencyInfoImpl
Rewrite the shared implementation of BlockFrequencyInfo and
MachineBlockFrequencyInfo entirely.

The old implementation had a fundamental flaw:  precision losses from
nested loops (or very wide branches) compounded past loop exits (and
convergence points).

The @nested_loops testcase at the end of
test/Analysis/BlockFrequencyAnalysis/basic.ll is motivating.  This
function has three nested loops, with branch weights in the loop headers
of 1:4000 (exit:continue).  The old analysis gives non-sensical results:

    Printing analysis 'Block Frequency Analysis' for function 'nested_loops':
    ---- Block Freqs ----
     entry = 1.0
     for.cond1.preheader = 1.00103
     for.cond4.preheader = 5.5222
     for.body6 = 18095.19995
     for.inc8 = 4.52264
     for.inc11 = 0.00109
     for.end13 = 0.0

The new analysis gives correct results:

    Printing analysis 'Block Frequency Analysis' for function 'nested_loops':
    block-frequency-info: nested_loops
     - entry: float = 1.0, int = 8
     - for.cond1.preheader: float = 4001.0, int = 32007
     - for.cond4.preheader: float = 16008001.0, int = 128064007
     - for.body6: float = 64048012001.0, int = 512384096007
     - for.inc8: float = 16008001.0, int = 128064007
     - for.inc11: float = 4001.0, int = 32007
     - for.end13: float = 1.0, int = 8

Most importantly, the frequency leaving each loop matches the frequency
entering it.

The new algorithm leverages BlockMass and PositiveFloat to maintain
precision, separates "probability mass distribution" from "loop
scaling", and uses dithering to eliminate probability mass loss.  I have
unit tests for these types out of tree, but it was decided in the review
to make the classes private to BlockFrequencyInfoImpl, and try to shrink
them (or remove them entirely) in follow-up commits.

The new algorithm should generally have a complexity advantage over the
old.  The previous algorithm was quadratic in the worst case.  The new
algorithm is still worst-case quadratic in the presence of irreducible
control flow, but it's linear without it.

The key difference between the old algorithm and the new is that control
flow within a loop is evaluated separately from control flow outside,
limiting propagation of precision problems and allowing loop scale to be
calculated independently of mass distribution.  Loops are visited
bottom-up, their loop scales are calculated, and they are replaced by
pseudo-nodes.  Mass is then distributed through the function, which is
now a DAG.  Finally, loops are revisited top-down to multiply through
the loop scales and the masses distributed to pseudo nodes.

There are some remaining flaws.

  - Irreducible control flow isn't modelled correctly.  LoopInfo and
    MachineLoopInfo ignore irreducible edges, so this algorithm will
    fail to scale accordingly.  There's a note in the class
    documentation about how to get closer.  See also the comments in
    test/Analysis/BlockFrequencyInfo/irreducible.ll.

  - Loop scale is limited to 4096 per loop (2^12) to avoid exhausting
    the 64-bit integer precision used downstream.

  - The "bias" calculation proposed on llvmdev is *not* incorporated
    here.  This will be added in a follow-up commit, once comments from
    this review have been handled.

llvm-svn: 206548
2014-04-18 01:57:45 +00:00
Daniel Dunbar
a496d61c01 [tests] Cleanup initialization of test suffixes.
- Instead of setting the suffixes in a bunch of places, just set one master
   list in the top-level config. We now only modify the suffix list in a few
   suites that have one particular unique suffix (.ml, .mc, .yaml, .td, .py).

 - Aside from removing the need for a bunch of lit.local.cfg files, this enables
   4 tests that were inadvertently being skipped (one in
   Transforms/BranchFolding, a .s file each in DebugInfo/AArch64 and
   CodeGen/PowerPC, and one in CodeGen/SI which is now failing and has been
   XFAILED).

 - This commit also fixes a bunch of config files to use config.root instead of
   older copy-pasted code.

llvm-svn: 188513
2013-08-16 00:37:11 +00:00
Jakob Stoklund Olesen
1c034ce174 Minimize precision loss when computing cyclic probabilities.
Allow block frequencies to exceed 32 bits by using the new
BlockFrequency division function.

llvm-svn: 185236
2013-06-28 22:40:43 +00:00
Jakob Stoklund Olesen
a4ca837638 Print block frequencies in decimal form.
This is easier to read than the internal fixed-point representation.

If anybody knows the correct algorithm for converting fixed-point
numbers to base 10, feel free to fix it.

llvm-svn: 184881
2013-06-25 21:57:38 +00:00
Benjamin Kramer
3b56c8dd50 BlockFrequency: Bump up the entry frequency a bit.
This is a band-aid to fix the most severe regressions we're seeing from basing
spill decisions on block frequencies, until we have a better solution.

llvm-svn: 184835
2013-06-25 13:34:40 +00:00
Benjamin Kramer
30c35d5305 Revert "BlockFrequency: Saturate at 1 instead of 0 when multiplying a frequency with a branch probability."
This reverts commit r184584. Breaks PPC selfhost.

llvm-svn: 184590
2013-06-21 20:20:27 +00:00
Benjamin Kramer
3315e168ee BlockFrequency: Saturate at 1 instead of 0 when multiplying a frequency with a branch probability.
Zero is used by BlockFrequencyInfo as a special "don't know" value. It also
causes a sink for frequencies as you can't ever get off a zero frequency with
more multiplies.

This recovers a 10% regression on MultiSource/Benchmarks/7zip. A zero frequency
was propagated into an inner loop causing excessive spilling.

PR16402.

llvm-svn: 184584
2013-06-21 19:30:05 +00:00
Eli Bendersky
4afdeeb682 Replace all instances of dg.exp file with lit.local.cfg, since all tests are run with LIT now and now Dejagnu. dg.exp is no longer needed.
Patch reviewed by Daniel Dunbar. It will be followed by additional cleanup patches.

llvm-svn: 150664
2012-02-16 06:28:33 +00:00
Chandler Carruth
12a645d6f6 Generalize the reading of probability metadata to work for both branches
and switches, with arbitrary numbers of successors. Still optimized for
the common case of 2 successors for a conditional branch.

Add a test case for switch metadata showing up in the BlockFrequencyInfo pass.

llvm-svn: 142493
2011-10-19 10:32:19 +00:00
Chandler Carruth
18a382b4b6 Teach the BranchProbabilityInfo analysis pass to read any metadata
encoding of probabilities. In the absense of metadata, it continues to
fall back on static heuristics.

This allows __builtin_expect, after lowering through llvm.expect
a branch instruction's metadata, to actually enter the branch
probability model. This is one component of resolving PR2577.

llvm-svn: 142492
2011-10-19 10:30:30 +00:00
Chandler Carruth
13b475d4f6 Add pass printing support to BlockFrequencyInfo pass. The implementation
layer already had support for printing the results of this analysis, but
the wiring was missing.

Now that printing the analysis works, actually bring some of this
analysis, and the BranchProbabilityInfo analysis that it wraps, under
test! I'm planning on fixing some bugs and doing other work here, so
having a nice place to add regression tests and a way to observe the
results is really useful.

llvm-svn: 142491
2011-10-19 10:12:41 +00:00