mirror of
https://github.com/RPCS3/llvm-mirror.git
synced 2024-11-22 18:54:02 +01:00
6ee5e73092
This change allows llvm-profgen to merge and trim cold context profiles, which addresses the profile size bloat problem. When a context profile's total sample count is below a threshold (configurable by a switch), the profile is considered cold and is merged into the base context-less profile. This keeps profile quality at least as good as the baseline (non-CS) profile.

For example, given two input profiles:

    [main @ foo @ bar]:60
    [main @ bar]:50

With threshold = 100, the two profiles are merged into one under the base context, giving:

    [bar]:110

Added two switches:

`--csprof-cold-thres=<value>`: Specifies the total-sample threshold below which a context profile is considered cold; the default is 100. Cold context profiles are merged into the context-less base profile by default.

`--csprof-keep-cold`: Forces profile generation to keep cold context profiles instead of dropping them. By default, cold contexts are not written to the output profile.

Results: This has not yet been evaluated with the latest CSSPGO, but on our internal branch it is neutral on performance and significantly reduces profile size. A detailed evaluation of llvm-profgen with CSSPGO will come later.

Differential Revision: https://reviews.llvm.org/D94111
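The merge rule described in the commit message can be sketched roughly as follows. This is an illustrative Python model only, not the llvm-profgen implementation. The function name `merge_cold_contexts`, the dictionary representation of profiles, and the choice to key the base profile by the leaf (last) frame of the context are assumptions made for the sketch. The `--csprof-keep-cold` behavior is not modeled.

```python
from collections import defaultdict


def merge_cold_contexts(profiles, cold_threshold=100):
    """Fold cold context profiles into context-less base profiles.

    `profiles` maps a context string such as "main @ foo @ bar" to its
    total sample count (a hypothetical representation for this sketch).
    A context whose total is strictly below `cold_threshold` is treated
    as cold and merged into the base profile of its leaf function.
    """
    merged = defaultdict(int)
    for context, total_samples in profiles.items():
        if total_samples < cold_threshold:
            # Cold: drop the calling context, keep only the leaf function.
            leaf = context.split(" @ ")[-1]
            merged[leaf] += total_samples
        else:
            # Hot: keep the full calling context unchanged.
            merged[context] += total_samples
    return dict(merged)


# The example from the commit message: both contexts are below 100,
# so both collapse into the base profile for `bar`.
profiles = {"main @ foo @ bar": 60, "main @ bar": 50}
print(merge_cold_contexts(profiles, cold_threshold=100))  # {'bar': 110}
```

Under this model a threshold of 0 marks nothing as cold, since no sample count is below zero. That is consistent with the test below, which passes `--csprof-cold-thres=0` and expects the full `[main:2 @ foo:8 @ bar]` context in the output.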
55 lines
1.5 KiB
Plaintext
; RUN: llvm-profgen --perfscript=%S/Inputs/inline-cs-pseudoprobe.perfscript --binary=%S/Inputs/inline-cs-pseudoprobe.perfbin --output=%t --show-unwinder-output --csprof-cold-thres=0 | FileCheck %s --check-prefix=CHECK-UNWINDER
; RUN: FileCheck %s --input-file %t

; CHECK: [main:2 @ foo]:74:0
; CHECK-NEXT: 2: 15
; CHECK-NEXT: 3: 15
; CHECK-NEXT: 4: 14
; CHECK-NEXT: 5: 1
; CHECK-NEXT: 6: 15
; CHECK-NEXT: 8: 14 bar:14
; CHECK-NEXT: !CFGChecksum: 138950591924
; CHECK-NEXT:[main:2 @ foo:8 @ bar]:56:14
; CHECK-NEXT: 1: 14
; CHECK-NEXT: 2: 14
; CHECK-NEXT: 3: 14
; CHECK-NEXT: 4: 14
; CHECK-NEXT: !CFGChecksum: 72617220756


; CHECK-UNWINDER: Binary(inline-cs-pseudoprobe.perfbin)'s Range Counter:
; CHECK-UNWINDER-EMPTY:
; CHECK-UNWINDER-NEXT: (800, 858): 1
; CHECK-UNWINDER-NEXT: (80e, 82b): 1
; CHECK-UNWINDER-NEXT: (80e, 858): 13

; CHECK-UNWINDER: Binary(inline-cs-pseudoprobe.perfbin)'s Branch Counter:
; CHECK-UNWINDER-EMPTY:
; CHECK-UNWINDER-NEXT: (82b, 800): 1
; CHECK-UNWINDER-NEXT: (858, 80e): 15

; clang -O3 -fexperimental-new-pass-manager -fuse-ld=lld -fpseudo-probe-for-profiling
; -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -Xclang -mdisable-tail-calls
; -g test.c -o a.out

#include <stdio.h>

int bar(int x, int y) {
  if (x % 3) {
    return x - y;
  }
  return x + y;
}

void foo() {
  int s, i = 0;
  while (i++ < 4000 * 4000)
    if (i % 91) s = bar(i, s); else s += 30;
  printf("sum is %d\n", s);
}

int main() {
  foo();
  return 0;
}