mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-22 18:54:02 +01:00
llvm-mirror/test/tools/llvm-profgen/merge-cold-profile.test
wlei 6ee5e73092 [CSSPGO][llvm-profgen] Merge and trim profile for cold context to reduce profile size
This change allows llvm-profgen to merge and trim cold context profiles, which addresses the profile size bloat problem. When a context profile's total sample count is below a threshold (configurable via a switch), the profile is considered cold and is merged into a context-less base profile, which keeps the profile quality at least as good as the non-context-sensitive (non-CS) baseline.

For example, two input profiles:
 [main @ foo @ bar]:60
 [main @ bar]:50
With a threshold of 100, both profiles are cold, so they are merged into a single profile under the base context, giving:
 [bar]:110
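
The merge in this example can be sketched as follows. This is a minimal illustration in Python, not the llvm-profgen implementation: the helper name `merge_cold_contexts` and the dictionary representation of profiles are hypothetical, and only total sample counts are modeled (no per-line counts, call targets, or checksums).

```python
from collections import defaultdict


def merge_cold_contexts(profiles, cold_threshold=100):
    """Merge cold context profiles into context-less base profiles.

    `profiles` maps a context string such as "main @ foo @ bar" to its
    total sample count. Hypothetical helper, for illustration only.
    """
    merged = defaultdict(int)
    for context, total in profiles.items():
        if total < cold_threshold:
            # Cold: keep only the leaf function name as the base context.
            leaf = context.split(" @ ")[-1]
            merged[leaf] += total
        else:
            # Hot: keep the full calling context.
            merged[context] += total
    return dict(merged)


profiles = {"main @ foo @ bar": 60, "main @ bar": 50}
print(merge_cold_contexts(profiles, cold_threshold=100))
```

Run on the two input profiles above, the sketch prints `{'bar': 110}`, matching the merged result.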

Added two switches:
`--csprof-cold-thres=<value>`: Specifies the total sample threshold below which a context profile is considered cold, with 100 being the default. By default, any cold context profile is merged into the context-less base profile.
`--csprof-keep-cold`: Forces profile generation to keep cold context profiles instead of dropping them. By default, cold contexts are not written to the output profile.
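
One possible reading of how the two switches interact is sketched below in Python. This is an assumption inferred from the switch descriptions, not the actual llvm-profgen code: it assumes cold contexts are first merged into their base profile, and that a base profile which is still below the threshold after merging is dropped unless the keep-cold option is set. The function `trim_cold_profiles`, its parameters, and the sample data are all hypothetical.

```python
def trim_cold_profiles(profiles, cold_threshold=100, keep_cold=False):
    """Merge cold contexts into base profiles, then optionally trim.

    Assumption (not confirmed by the source): merged base profiles that
    remain below the threshold are dropped unless keep_cold is True.
    """
    merged = {}
    for context, total in profiles.items():
        # Cold contexts collapse to their leaf function; hot ones are kept.
        if total < cold_threshold:
            key = context.split(" @ ")[-1]
        else:
            key = context
        merged[key] = merged.get(key, 0) + total

    if keep_cold:
        # Mirrors the intent of --csprof-keep-cold: write everything out.
        return merged
    # Default: drop whatever is still cold after merging.
    return {key: total for key, total in merged.items()
            if total >= cold_threshold}


# Hypothetical sample data, not taken from the test inputs.
profiles = {"main @ foo @ bar": 60, "main @ bar": 50, "main @ baz": 30}
print(trim_cold_profiles(profiles))
print(trim_cold_profiles(profiles, keep_cold=True))
```

Under these assumptions, the first call keeps only `bar` (110 samples after merging), while the second call also keeps the still-cold `baz` profile.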

Results:
Although this has not yet been evaluated with the latest CSSPGO, results on our internal branch show neutral performance and a significant reduction in profile size. A detailed evaluation of llvm-profgen with CSSPGO will follow.

Differential Revision: https://reviews.llvm.org/D94111
2021-02-04 11:05:03 -08:00


; Uses the data from recursion-compression.test; refer to it for the unmerged output
; RUN: llvm-profgen --perfscript=%S/Inputs/recursion-compression-pseudoprobe.perfscript --binary=%S/Inputs/recursion-compression-pseudoprobe.perfbin --output=%t --compress-recursion=-1 --csprof-cold-thres=8
; RUN: FileCheck %s --input-file %t
; Test --csprof-keep-cold
; RUN: llvm-profgen --perfscript=%S/Inputs/recursion-compression-pseudoprobe.perfscript --binary=%S/Inputs/recursion-compression-pseudoprobe.perfbin --output=%t --compress-recursion=-1 --csprof-cold-thres=100 --csprof-keep-cold
; RUN: FileCheck %s --input-file %t --check-prefix=CHECK-KEEP-COLD
; CHECK: [fa]:14:4
; CHECK-NEXT: 1: 4
; CHECK-NEXT: 3: 4
; CHECK-NEXT: 4: 2
; CHECK-NEXT: 5: 1
; CHECK-NEXT: 7: 2 fb:2
; CHECK-NEXT: 8: 1 fa:1
; CHECK-NEXT: !CFGChecksum: 120515930909
; CHECK-NEXT:[main:2 @ foo:5 @ fa:8 @ fa:7 @ fb:5 @ fb]:13:4
; CHECK-NEXT: 1: 4
; CHECK-NEXT: 2: 3
; CHECK-NEXT: 3: 1
; CHECK-NEXT: 5: 4 fb:4
; CHECK-NEXT: 6: 1 fa:1
; CHECK-NEXT: !CFGChecksum: 72617220756
; CHECK-KEEP-COLD: [fb]:19:6
; CHECK-KEEP-COLD-NEXT: 1: 6
; CHECK-KEEP-COLD-NEXT: 2: 3
; CHECK-KEEP-COLD-NEXT: 3: 3
; CHECK-KEEP-COLD-NEXT: 5: 4 fb:4
; CHECK-KEEP-COLD-NEXT: 6: 3 fa:3
; CHECK-KEEP-COLD-NEXT: !CFGChecksum: 72617220756
; CHECK-KEEP-COLD-NEXT:[fa]:14:4
; CHECK-KEEP-COLD-NEXT: 1: 4
; CHECK-KEEP-COLD-NEXT: 3: 4
; CHECK-KEEP-COLD-NEXT: 4: 2
; CHECK-KEEP-COLD-NEXT: 5: 1
; CHECK-KEEP-COLD-NEXT: 7: 2 fb:2
; CHECK-KEEP-COLD-NEXT: 8: 1 fa:1
; CHECK-KEEP-COLD-NEXT: !CFGChecksum: 120515930909
; clang -O3 -fexperimental-new-pass-manager -fuse-ld=lld -fpseudo-probe-for-profiling
; -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -Xclang -mdisable-tail-calls
; -g test.c -o a.out
; Copied from recursion-compression.test
#include <stdio.h>

int fb(int n) {
  if(n > 10) return fb(n / 2);
  return fa(n - 1);
}

int fa(int n) {
  if(n < 2) return n;
  if(n % 2) return fb(n - 1);
  return fa(n - 1);
}

void foo() {
  int s, i = 0;
  while (i++ < 10000)
    s += fa(i);
  printf("sum is %d\n", s);
}

int main() {
  foo();
  return 0;
}