1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-01-31 20:51:52 +01:00
llvm-mirror/test/Transforms/HotColdSplit/outline-disjoint-diamonds.ll
Vedant Kumar 03b893cf5f [HotColdSplit] Introduce a cost model to control splitting behavior
The main goal of the model is to avoid *increasing* function size, as
that would eradicate any memory locality benefits from splitting. This
happens when:

  - There are too many inputs or outputs to the cold region. Argument
    materialization and reloads of outputs have a cost.

  - The cold region has too many distinct exit blocks, causing a large
    switch to be formed in the caller.

  - The code size cost of the split code is less than the cost of a
    set-up call.

A secondary goal is to prevent excessive overall binary size growth.

With the cost model in place, I experimented to find a splitting
threshold that works well in practice. To make warm & cold code easily
separable for analysis purposes, I moved split functions to a "cold"
section. I experimented with thresholds between [0, 4] and set the
default to the threshold which minimized geomean __text size.

Experiment data from building LNT+externals for X86 (N = 639 programs,
all sizes in bytes):

| Configuration | __text geom size | __cold geom size | TEXT geom size |
| **-Os**       | 1736.3           | 0, n=0           | 10961.6        |
| -Os, thresh=0 | 1740.53          | 124.482, n=134   | 11014          |
| -Os, thresh=1 | 1734.79          | 57.8781, n=90    | 10978.6        |
| -Os, thresh=2 | ** 1733.85 **    | 65.6604, n=61    | 10977.6        |
| -Os, thresh=3 | 1733.85          | 65.3071, n=61    | 10977.6        |
| -Os, thresh=4 | 1735.08          | 67.5156, n=54    | 10965.7        |
| **-Oz**       | 1554.4           | 0, n=0           | 10153          |
| -Oz, thresh=2 | ** 1552.2 **     | 65.633, n=61     | 10176          |
| **-O3**       | 2563.37          | 0, n=0           | 13105.4        |
| -O3, thresh=2 | ** 2559.49 **    | 71.1072, n=61    | 13162.4        |

Picking thresh=2 reduces the geomean __text section size by 0.14% at
-Os, -Oz, and -O3 and causes ~0.2% growth in the TEXT segment. Note that
TEXT size is page-aligned, whereas section sizes are byte-aligned.

Experiment data from building LNT+externals for ARM64 (N = 558 programs,
all sizes in bytes):

| Configuration | __text geom size | __cold geom size | TEXT geom size |
| **-Os**       | 1763.96          | 0, n=0           | 42934.9        |
| -Os, thresh=2 | ** 1760.9 **     | 76.6755, n=61    | 42934.9        |

Picking thresh=2 reduces the geomean __text section size by 0.17% at
-Os and causes no growth in the TEXT segment.

Measurements were done with D57082 (r352080) applied.

Differential Revision: https://reviews.llvm.org/D57125

llvm-svn: 352228
2019-01-25 18:30:37 +00:00

58 lines
1.1 KiB
LLVM

; RUN: opt -S -hotcoldsplit -hotcoldsplit-threshold=-1 < %s 2>&1 | FileCheck %s
; CHECK-LABEL: define {{.*}}@fun
; CHECK: call {{.*}}@fun.cold.2(
; CHECK-NEXT: ret void
; CHECK: call {{.*}}@fun.cold.1(
; CHECK-NEXT: ret void
define void @fun() {
entry:
br i1 undef, label %A.then, label %A.else
A.else:
br label %A.then4
A.then4:
br i1 undef, label %A.then5, label %A.end
A.then5:
br label %A.cleanup
A.end:
br label %A.cleanup
A.cleanup:
%A.cleanup.dest.slot.0 = phi i32 [ 1, %A.then5 ], [ 0, %A.end ]
unreachable
A.then:
br i1 undef, label %B.then, label %B.else
B.then:
ret void
B.else:
br label %B.then4
B.then4:
br i1 undef, label %B.then5, label %B.end
B.then5:
br label %B.cleanup
B.end:
br label %B.cleanup
B.cleanup:
%B.cleanup.dest.slot.0 = phi i32 [ 1, %B.then5 ], [ 0, %B.end ]
unreachable
}
; CHECK-LABEL: define {{.*}}@fun.cold.1(
; CHECK: %B.cleanup.dest.slot.0 = phi i32 [ 1, %B.then5 ], [ 0, %B.end ]
; CHECK-NEXT: unreachable
; CHECK-LABEL: define {{.*}}@fun.cold.2(
; CHECK: %A.cleanup.dest.slot.0 = phi i32 [ 1, %A.then5 ], [ 0, %A.end ]
; CHECK-NEXT: unreachable