[CSSPGO][llvm-profgen] Pseudo probe decoding and disassembling
This change implements pseudo probe decoding and disassembling for llvm-profgen/CSSPGO. Please see https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s and https://reviews.llvm.org/D89707 for more context about CSSPGO and llvm-profgen.
**ELF section format**
Please see the encoding patch(https://reviews.llvm.org/D91878) for more details of the format, just copy the example here:
Two section(`.pseudo_probe_desc` and `.pseudoprobe` ) is emitted in ELF to support pseudo probe.
The format of `.pseudo_probe_desc` section looks like:
```
.section .pseudo_probe_desc,"",@progbits
.quad 6309742469962978389 // Func GUID
.quad 4294967295 // Func Hash
.byte 9 // Length of func name
.ascii "_Z5funcAi" // Func name
.quad 7102633082150537521
.quad 138828622701
.byte 12
.ascii "_Z8funcLeafi"
.quad 446061515086924981
.quad 4294967295
.byte 9
.ascii "_Z5funcBi"
.quad -2016976694713209516
.quad 72617220756
.byte 7
.ascii "_Z3fibi"
```
For each `.pseudoprobe` section, the encoded binary data consists of a single function record corresponding to an outlined function (i.e, a function with a code entry in the `.text` section). A function record has the following format :
```
FUNCTION BODY (one for each outlined function present in the text section)
GUID (uint64)
GUID of the function
NPROBES (ULEB128)
Number of probes originating from this function.
NUM_INLINED_FUNCTIONS (ULEB128)
Number of callees inlined into this function, aka number of
first-level inlinees
PROBE RECORDS
A list of NPROBES entries. Each entry contains:
INDEX (ULEB128)
TYPE (uint4)
0 - block probe, 1 - indirect call, 2 - direct call
ATTRIBUTE (uint3)
reserved
ADDRESS_TYPE (uint1)
0 - code address, 1 - address delta
CODE_ADDRESS (uint64 or ULEB128)
code address or address delta, depending on ADDRESS_TYPE
INLINED FUNCTION RECORDS
A list of NUM_INLINED_FUNCTIONS entries describing each of the inlined
callees. Each record contains:
INLINE SITE
GUID of the inlinee (uint64)
ID of the callsite probe (ULEB128)
FUNCTION BODY
A FUNCTION BODY entry describing the inlined function.
```
**Disassembling**
A switch `--show-pseudo-probe` is added to use along with `--show-disassembly` to print disassembly code with pseudo probe directives.
For example:
```
00000000002011a0 <foo2>:
2011a0: 50 push rax
2011a1: 85 ff test edi,edi
[Probe]: FUNC: foo2 Index: 1 Type: Block
2011a3: 74 02 je 2011a7 <foo2+0x7>
[Probe]: FUNC: foo2 Index: 3 Type: Block
[Probe]: FUNC: foo2 Index: 4 Type: Block
[Probe]: FUNC: foo Index: 1 Type: Block Inlined: @ foo2:6
2011a5: 58 pop rax
2011a6: c3 ret
[Probe]: FUNC: foo2 Index: 2 Type: Block
2011a7: bf 01 00 00 00 mov edi,0x1
[Probe]: FUNC: foo2 Index: 5 Type: IndirectCall
2011ac: ff d6 call rsi
[Probe]: FUNC: foo2 Index: 4 Type: Block
2011ae: 58 pop rax
2011af: c3 ret
```
**Implementation**
- `PseudoProbeDecoder` is added in ProfiledBinary as an infra for the decoding. It decoded the two section and generate two map: `GUIDProbeFunctionMap` stores all the `PseudoProbeFunction` which is the abstraction of a general function. `AddressProbesMap` stores all the pseudo probe info indexed by its address.
- All the inline info is encoded into binary as a trie(`PseudoProbeInlineTree`) and will be constructed from the decoding. Each pseudo probe can get its inline context(`getInlineContext`) by traversing its inline tree node backwards.
Test Plan:
ninja & ninja check-llvm
Differential Revision: https://reviews.llvm.org/D92334
2020-11-24 05:33:23 +01:00
|
|
|
//===--- PseudoProbe.cpp - Pseudo probe decoding utilities ------*- C++-*-===//
|
|
|
|
//
|
|
|
|
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
|
|
|
|
// See https://llvm.org/LICENSE.txt for license information.
|
|
|
|
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
|
|
|
|
//
|
|
|
|
//===----------------------------------------------------------------------===//
|
|
|
|
|
|
|
|
#include "PseudoProbe.h"
|
|
|
|
#include "ErrorHandling.h"
|
|
|
|
#include "llvm/Support/Endian.h"
|
|
|
|
#include "llvm/Support/LEB128.h"
|
|
|
|
#include "llvm/Support/raw_ostream.h"
|
|
|
|
#include <limits>
|
|
|
|
#include <memory>
|
|
|
|
|
|
|
|
using namespace llvm;
|
|
|
|
using namespace sampleprof;
|
|
|
|
using namespace support;
|
|
|
|
|
|
|
|
namespace llvm {
|
|
|
|
namespace sampleprof {
|
|
|
|
|
|
|
|
static StringRef getProbeFNameForGUID(const GUIDProbeFunctionMap &GUID2FuncMAP,
|
|
|
|
uint64_t GUID) {
|
|
|
|
auto It = GUID2FuncMAP.find(GUID);
|
|
|
|
assert(It != GUID2FuncMAP.end() &&
|
|
|
|
"Probe function must exist for a valid GUID");
|
|
|
|
return It->second.FuncName;
|
|
|
|
}
|
|
|
|
|
|
|
|
void PseudoProbeFuncDesc::print(raw_ostream &OS) {
|
|
|
|
OS << "GUID: " << FuncGUID << " Name: " << FuncName << "\n";
|
|
|
|
OS << "Hash: " << FuncHash << "\n";
|
|
|
|
}
|
|
|
|
|
[CSSPGO][llvm-profgen] Compress recursive cycles in calling context
This change compresses the context string by removing cycles due to recursive function for CS profile generation. Removing recursion cycles is a way to normalize the calling context which will be better for the sample aggregation and also make the context promoting deterministic.
Specifically for implementation, we recognize adjacent repeated frames as cycles and deduplicated them through multiple round of iteration.
For example:
Considering a input context string stack:
[“a”, “a”, “b”, “c”, “a”, “b”, “c”, “b”, “c”, “d”]
For first iteration,, it removed all adjacent repeated frames of size 1:
[“a”, “b”, “c”, “a”, “b”, “c”, “b”, “c”, “d”]
For second iteration, it removed all adjacent repeated frames of size 2:
[“a”, “b”, “c”, “a”, “b”, “c”, “d”]
So in the end, we get compressed output:
[“a”, “b”, “c”, “d”]
Compression will be called in two place: one for sample's context key right after unwinding, one is for the eventual context string id in the ProfileGenerator.
Added a switch `compress-recursion` to control the size of duplicated frames, default -1 means no size limit.
Added unit tests and regression test for this.
Differential Revision: https://reviews.llvm.org/D93556
2021-01-30 00:00:08 +01:00
|
|
|
void PseudoProbe::getInlineContext(SmallVectorImpl<std::string> &ContextStack,
|
[CSSPGO][llvm-profgen] Pseudo probe decoding and disassembling
This change implements pseudo probe decoding and disassembling for llvm-profgen/CSSPGO. Please see https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s and https://reviews.llvm.org/D89707 for more context about CSSPGO and llvm-profgen.
**ELF section format**
Please see the encoding patch(https://reviews.llvm.org/D91878) for more details of the format, just copy the example here:
Two section(`.pseudo_probe_desc` and `.pseudoprobe` ) is emitted in ELF to support pseudo probe.
The format of `.pseudo_probe_desc` section looks like:
```
.section .pseudo_probe_desc,"",@progbits
.quad 6309742469962978389 // Func GUID
.quad 4294967295 // Func Hash
.byte 9 // Length of func name
.ascii "_Z5funcAi" // Func name
.quad 7102633082150537521
.quad 138828622701
.byte 12
.ascii "_Z8funcLeafi"
.quad 446061515086924981
.quad 4294967295
.byte 9
.ascii "_Z5funcBi"
.quad -2016976694713209516
.quad 72617220756
.byte 7
.ascii "_Z3fibi"
```
For each `.pseudoprobe` section, the encoded binary data consists of a single function record corresponding to an outlined function (i.e, a function with a code entry in the `.text` section). A function record has the following format :
```
FUNCTION BODY (one for each outlined function present in the text section)
GUID (uint64)
GUID of the function
NPROBES (ULEB128)
Number of probes originating from this function.
NUM_INLINED_FUNCTIONS (ULEB128)
Number of callees inlined into this function, aka number of
first-level inlinees
PROBE RECORDS
A list of NPROBES entries. Each entry contains:
INDEX (ULEB128)
TYPE (uint4)
0 - block probe, 1 - indirect call, 2 - direct call
ATTRIBUTE (uint3)
reserved
ADDRESS_TYPE (uint1)
0 - code address, 1 - address delta
CODE_ADDRESS (uint64 or ULEB128)
code address or address delta, depending on ADDRESS_TYPE
INLINED FUNCTION RECORDS
A list of NUM_INLINED_FUNCTIONS entries describing each of the inlined
callees. Each record contains:
INLINE SITE
GUID of the inlinee (uint64)
ID of the callsite probe (ULEB128)
FUNCTION BODY
A FUNCTION BODY entry describing the inlined function.
```
**Disassembling**
A switch `--show-pseudo-probe` is added to use along with `--show-disassembly` to print disassembly code with pseudo probe directives.
For example:
```
00000000002011a0 <foo2>:
2011a0: 50 push rax
2011a1: 85 ff test edi,edi
[Probe]: FUNC: foo2 Index: 1 Type: Block
2011a3: 74 02 je 2011a7 <foo2+0x7>
[Probe]: FUNC: foo2 Index: 3 Type: Block
[Probe]: FUNC: foo2 Index: 4 Type: Block
[Probe]: FUNC: foo Index: 1 Type: Block Inlined: @ foo2:6
2011a5: 58 pop rax
2011a6: c3 ret
[Probe]: FUNC: foo2 Index: 2 Type: Block
2011a7: bf 01 00 00 00 mov edi,0x1
[Probe]: FUNC: foo2 Index: 5 Type: IndirectCall
2011ac: ff d6 call rsi
[Probe]: FUNC: foo2 Index: 4 Type: Block
2011ae: 58 pop rax
2011af: c3 ret
```
**Implementation**
- `PseudoProbeDecoder` is added in ProfiledBinary as an infra for the decoding. It decoded the two section and generate two map: `GUIDProbeFunctionMap` stores all the `PseudoProbeFunction` which is the abstraction of a general function. `AddressProbesMap` stores all the pseudo probe info indexed by its address.
- All the inline info is encoded into binary as a trie(`PseudoProbeInlineTree`) and will be constructed from the decoding. Each pseudo probe can get its inline context(`getInlineContext`) by traversing its inline tree node backwards.
Test Plan:
ninja & ninja check-llvm
Differential Revision: https://reviews.llvm.org/D92334
2020-11-24 05:33:23 +01:00
|
|
|
const GUIDProbeFunctionMap &GUID2FuncMAP,
|
|
|
|
bool ShowName) const {
|
|
|
|
uint32_t Begin = ContextStack.size();
|
|
|
|
PseudoProbeInlineTree *Cur = InlineTree;
|
|
|
|
// It will add the string of each node's inline site during iteration.
|
|
|
|
// Note that it won't include the probe's belonging function(leaf location)
|
2021-01-11 18:08:39 +01:00
|
|
|
while (Cur->hasInlineSite()) {
|
[CSSPGO][llvm-profgen] Pseudo probe decoding and disassembling
This change implements pseudo probe decoding and disassembling for llvm-profgen/CSSPGO. Please see https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s and https://reviews.llvm.org/D89707 for more context about CSSPGO and llvm-profgen.
**ELF section format**
Please see the encoding patch(https://reviews.llvm.org/D91878) for more details of the format, just copy the example here:
Two section(`.pseudo_probe_desc` and `.pseudoprobe` ) is emitted in ELF to support pseudo probe.
The format of `.pseudo_probe_desc` section looks like:
```
.section .pseudo_probe_desc,"",@progbits
.quad 6309742469962978389 // Func GUID
.quad 4294967295 // Func Hash
.byte 9 // Length of func name
.ascii "_Z5funcAi" // Func name
.quad 7102633082150537521
.quad 138828622701
.byte 12
.ascii "_Z8funcLeafi"
.quad 446061515086924981
.quad 4294967295
.byte 9
.ascii "_Z5funcBi"
.quad -2016976694713209516
.quad 72617220756
.byte 7
.ascii "_Z3fibi"
```
For each `.pseudoprobe` section, the encoded binary data consists of a single function record corresponding to an outlined function (i.e, a function with a code entry in the `.text` section). A function record has the following format :
```
FUNCTION BODY (one for each outlined function present in the text section)
GUID (uint64)
GUID of the function
NPROBES (ULEB128)
Number of probes originating from this function.
NUM_INLINED_FUNCTIONS (ULEB128)
Number of callees inlined into this function, aka number of
first-level inlinees
PROBE RECORDS
A list of NPROBES entries. Each entry contains:
INDEX (ULEB128)
TYPE (uint4)
0 - block probe, 1 - indirect call, 2 - direct call
ATTRIBUTE (uint3)
reserved
ADDRESS_TYPE (uint1)
0 - code address, 1 - address delta
CODE_ADDRESS (uint64 or ULEB128)
code address or address delta, depending on ADDRESS_TYPE
INLINED FUNCTION RECORDS
A list of NUM_INLINED_FUNCTIONS entries describing each of the inlined
callees. Each record contains:
INLINE SITE
GUID of the inlinee (uint64)
ID of the callsite probe (ULEB128)
FUNCTION BODY
A FUNCTION BODY entry describing the inlined function.
```
**Disassembling**
A switch `--show-pseudo-probe` is added to use along with `--show-disassembly` to print disassembly code with pseudo probe directives.
For example:
```
00000000002011a0 <foo2>:
2011a0: 50 push rax
2011a1: 85 ff test edi,edi
[Probe]: FUNC: foo2 Index: 1 Type: Block
2011a3: 74 02 je 2011a7 <foo2+0x7>
[Probe]: FUNC: foo2 Index: 3 Type: Block
[Probe]: FUNC: foo2 Index: 4 Type: Block
[Probe]: FUNC: foo Index: 1 Type: Block Inlined: @ foo2:6
2011a5: 58 pop rax
2011a6: c3 ret
[Probe]: FUNC: foo2 Index: 2 Type: Block
2011a7: bf 01 00 00 00 mov edi,0x1
[Probe]: FUNC: foo2 Index: 5 Type: IndirectCall
2011ac: ff d6 call rsi
[Probe]: FUNC: foo2 Index: 4 Type: Block
2011ae: 58 pop rax
2011af: c3 ret
```
**Implementation**
- `PseudoProbeDecoder` is added in ProfiledBinary as an infra for the decoding. It decoded the two section and generate two map: `GUIDProbeFunctionMap` stores all the `PseudoProbeFunction` which is the abstraction of a general function. `AddressProbesMap` stores all the pseudo probe info indexed by its address.
- All the inline info is encoded into binary as a trie(`PseudoProbeInlineTree`) and will be constructed from the decoding. Each pseudo probe can get its inline context(`getInlineContext`) by traversing its inline tree node backwards.
Test Plan:
ninja & ninja check-llvm
Differential Revision: https://reviews.llvm.org/D92334
2020-11-24 05:33:23 +01:00
|
|
|
std::string ContextStr;
|
|
|
|
if (ShowName) {
|
|
|
|
StringRef FuncName =
|
|
|
|
getProbeFNameForGUID(GUID2FuncMAP, std::get<0>(Cur->ISite));
|
|
|
|
ContextStr += FuncName.str();
|
|
|
|
} else {
|
|
|
|
ContextStr += Twine(std::get<0>(Cur->ISite)).str();
|
|
|
|
}
|
|
|
|
ContextStr += ":";
|
|
|
|
ContextStr += Twine(std::get<1>(Cur->ISite)).str();
|
|
|
|
ContextStack.emplace_back(ContextStr);
|
|
|
|
Cur = Cur->Parent;
|
|
|
|
}
|
|
|
|
// Make the ContextStack in caller-callee order
|
|
|
|
std::reverse(ContextStack.begin() + Begin, ContextStack.end());
|
|
|
|
}
|
|
|
|
|
|
|
|
std::string
|
|
|
|
PseudoProbe::getInlineContextStr(const GUIDProbeFunctionMap &GUID2FuncMAP,
|
|
|
|
bool ShowName) const {
|
|
|
|
std::ostringstream OContextStr;
|
|
|
|
SmallVector<std::string, 16> ContextStack;
|
|
|
|
getInlineContext(ContextStack, GUID2FuncMAP, ShowName);
|
|
|
|
for (auto &CxtStr : ContextStack) {
|
|
|
|
if (OContextStr.str().size())
|
|
|
|
OContextStr << " @ ";
|
|
|
|
OContextStr << CxtStr;
|
|
|
|
}
|
|
|
|
return OContextStr.str();
|
|
|
|
}
|
|
|
|
|
|
|
|
static const char *PseudoProbeTypeStr[3] = {"Block", "IndirectCall",
|
|
|
|
"DirectCall"};
|
|
|
|
|
|
|
|
void PseudoProbe::print(raw_ostream &OS,
|
|
|
|
const GUIDProbeFunctionMap &GUID2FuncMAP,
|
|
|
|
bool ShowName) {
|
|
|
|
OS << "FUNC: ";
|
|
|
|
if (ShowName) {
|
|
|
|
StringRef FuncName = getProbeFNameForGUID(GUID2FuncMAP, GUID);
|
|
|
|
OS << FuncName.str() << " ";
|
|
|
|
} else {
|
|
|
|
OS << GUID << " ";
|
|
|
|
}
|
|
|
|
OS << "Index: " << Index << " ";
|
|
|
|
OS << "Type: " << PseudoProbeTypeStr[static_cast<uint8_t>(Type)] << " ";
|
2021-06-17 20:09:13 +02:00
|
|
|
|
[CSSPGO][llvm-profgen] Pseudo probe decoding and disassembling
This change implements pseudo probe decoding and disassembling for llvm-profgen/CSSPGO. Please see https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s and https://reviews.llvm.org/D89707 for more context about CSSPGO and llvm-profgen.
**ELF section format**
Please see the encoding patch(https://reviews.llvm.org/D91878) for more details of the format, just copy the example here:
Two section(`.pseudo_probe_desc` and `.pseudoprobe` ) is emitted in ELF to support pseudo probe.
The format of `.pseudo_probe_desc` section looks like:
```
.section .pseudo_probe_desc,"",@progbits
.quad 6309742469962978389 // Func GUID
.quad 4294967295 // Func Hash
.byte 9 // Length of func name
.ascii "_Z5funcAi" // Func name
.quad 7102633082150537521
.quad 138828622701
.byte 12
.ascii "_Z8funcLeafi"
.quad 446061515086924981
.quad 4294967295
.byte 9
.ascii "_Z5funcBi"
.quad -2016976694713209516
.quad 72617220756
.byte 7
.ascii "_Z3fibi"
```
For each `.pseudoprobe` section, the encoded binary data consists of a single function record corresponding to an outlined function (i.e, a function with a code entry in the `.text` section). A function record has the following format :
```
FUNCTION BODY (one for each outlined function present in the text section)
GUID (uint64)
GUID of the function
NPROBES (ULEB128)
Number of probes originating from this function.
NUM_INLINED_FUNCTIONS (ULEB128)
Number of callees inlined into this function, aka number of
first-level inlinees
PROBE RECORDS
A list of NPROBES entries. Each entry contains:
INDEX (ULEB128)
TYPE (uint4)
0 - block probe, 1 - indirect call, 2 - direct call
ATTRIBUTE (uint3)
reserved
ADDRESS_TYPE (uint1)
0 - code address, 1 - address delta
CODE_ADDRESS (uint64 or ULEB128)
code address or address delta, depending on ADDRESS_TYPE
INLINED FUNCTION RECORDS
A list of NUM_INLINED_FUNCTIONS entries describing each of the inlined
callees. Each record contains:
INLINE SITE
GUID of the inlinee (uint64)
ID of the callsite probe (ULEB128)
FUNCTION BODY
A FUNCTION BODY entry describing the inlined function.
```
**Disassembling**
A switch `--show-pseudo-probe` is added to use along with `--show-disassembly` to print disassembly code with pseudo probe directives.
For example:
```
00000000002011a0 <foo2>:
2011a0: 50 push rax
2011a1: 85 ff test edi,edi
[Probe]: FUNC: foo2 Index: 1 Type: Block
2011a3: 74 02 je 2011a7 <foo2+0x7>
[Probe]: FUNC: foo2 Index: 3 Type: Block
[Probe]: FUNC: foo2 Index: 4 Type: Block
[Probe]: FUNC: foo Index: 1 Type: Block Inlined: @ foo2:6
2011a5: 58 pop rax
2011a6: c3 ret
[Probe]: FUNC: foo2 Index: 2 Type: Block
2011a7: bf 01 00 00 00 mov edi,0x1
[Probe]: FUNC: foo2 Index: 5 Type: IndirectCall
2011ac: ff d6 call rsi
[Probe]: FUNC: foo2 Index: 4 Type: Block
2011ae: 58 pop rax
2011af: c3 ret
```
**Implementation**
- `PseudoProbeDecoder` is added in ProfiledBinary as an infra for the decoding. It decoded the two section and generate two map: `GUIDProbeFunctionMap` stores all the `PseudoProbeFunction` which is the abstraction of a general function. `AddressProbesMap` stores all the pseudo probe info indexed by its address.
- All the inline info is encoded into binary as a trie(`PseudoProbeInlineTree`) and will be constructed from the decoding. Each pseudo probe can get its inline context(`getInlineContext`) by traversing its inline tree node backwards.
Test Plan:
ninja & ninja check-llvm
Differential Revision: https://reviews.llvm.org/D92334
2020-11-24 05:33:23 +01:00
|
|
|
std::string InlineContextStr = getInlineContextStr(GUID2FuncMAP, ShowName);
|
|
|
|
if (InlineContextStr.size()) {
|
|
|
|
OS << "Inlined: @ ";
|
|
|
|
OS << InlineContextStr;
|
|
|
|
}
|
|
|
|
OS << "\n";
|
|
|
|
}
|
|
|
|
|
|
|
|
template <typename T> T PseudoProbeDecoder::readUnencodedNumber() {
|
|
|
|
if (Data + sizeof(T) > End) {
|
|
|
|
exitWithError("Decode unencoded number error in " + SectionName +
|
|
|
|
" section");
|
|
|
|
}
|
|
|
|
T Val = endian::readNext<T, little, unaligned>(Data);
|
|
|
|
return Val;
|
|
|
|
}
|
|
|
|
|
|
|
|
template <typename T> T PseudoProbeDecoder::readUnsignedNumber() {
|
|
|
|
unsigned NumBytesRead = 0;
|
|
|
|
uint64_t Val = decodeULEB128(Data, &NumBytesRead);
|
|
|
|
if (Val > std::numeric_limits<T>::max() || (Data + NumBytesRead > End)) {
|
|
|
|
exitWithError("Decode number error in " + SectionName + " section");
|
|
|
|
}
|
|
|
|
Data += NumBytesRead;
|
|
|
|
return static_cast<T>(Val);
|
|
|
|
}
|
|
|
|
|
|
|
|
template <typename T> T PseudoProbeDecoder::readSignedNumber() {
|
|
|
|
unsigned NumBytesRead = 0;
|
|
|
|
int64_t Val = decodeSLEB128(Data, &NumBytesRead);
|
|
|
|
if (Val > std::numeric_limits<T>::max() || (Data + NumBytesRead > End)) {
|
|
|
|
exitWithError("Decode number error in " + SectionName + " section");
|
|
|
|
}
|
|
|
|
Data += NumBytesRead;
|
|
|
|
return static_cast<T>(Val);
|
|
|
|
}
|
|
|
|
|
|
|
|
StringRef PseudoProbeDecoder::readString(uint32_t Size) {
|
|
|
|
StringRef Str(reinterpret_cast<const char *>(Data), Size);
|
|
|
|
if (Data + Size > End) {
|
|
|
|
exitWithError("Decode string error in " + SectionName + " section");
|
|
|
|
}
|
|
|
|
Data += Size;
|
|
|
|
return Str;
|
|
|
|
}
|
|
|
|
|
|
|
|
void PseudoProbeDecoder::buildGUID2FuncDescMap(const uint8_t *Start,
|
|
|
|
std::size_t Size) {
|
|
|
|
// The pseudo_probe_desc section has a format like:
|
|
|
|
// .section .pseudo_probe_desc,"",@progbits
|
|
|
|
// .quad -5182264717993193164 // GUID
|
|
|
|
// .quad 4294967295 // Hash
|
|
|
|
// .uleb 3 // Name size
|
|
|
|
// .ascii "foo" // Name
|
|
|
|
// .quad -2624081020897602054
|
|
|
|
// .quad 174696971957
|
|
|
|
// .uleb 34
|
|
|
|
// .ascii "main"
|
|
|
|
#ifndef NDEBUG
|
|
|
|
SectionName = "pseudo_probe_desc";
|
|
|
|
#endif
|
|
|
|
Data = Start;
|
|
|
|
End = Data + Size;
|
|
|
|
|
|
|
|
while (Data < End) {
|
|
|
|
uint64_t GUID = readUnencodedNumber<uint64_t>();
|
|
|
|
uint64_t Hash = readUnencodedNumber<uint64_t>();
|
|
|
|
uint32_t NameSize = readUnsignedNumber<uint32_t>();
|
2021-03-04 05:37:19 +01:00
|
|
|
StringRef Name = FunctionSamples::getCanonicalFnName(readString(NameSize));
|
[CSSPGO][llvm-profgen] Pseudo probe decoding and disassembling
This change implements pseudo probe decoding and disassembling for llvm-profgen/CSSPGO. Please see https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s and https://reviews.llvm.org/D89707 for more context about CSSPGO and llvm-profgen.
**ELF section format**
Please see the encoding patch(https://reviews.llvm.org/D91878) for more details of the format, just copy the example here:
Two section(`.pseudo_probe_desc` and `.pseudoprobe` ) is emitted in ELF to support pseudo probe.
The format of `.pseudo_probe_desc` section looks like:
```
.section .pseudo_probe_desc,"",@progbits
.quad 6309742469962978389 // Func GUID
.quad 4294967295 // Func Hash
.byte 9 // Length of func name
.ascii "_Z5funcAi" // Func name
.quad 7102633082150537521
.quad 138828622701
.byte 12
.ascii "_Z8funcLeafi"
.quad 446061515086924981
.quad 4294967295
.byte 9
.ascii "_Z5funcBi"
.quad -2016976694713209516
.quad 72617220756
.byte 7
.ascii "_Z3fibi"
```
For each `.pseudoprobe` section, the encoded binary data consists of a single function record corresponding to an outlined function (i.e, a function with a code entry in the `.text` section). A function record has the following format :
```
FUNCTION BODY (one for each outlined function present in the text section)
GUID (uint64)
GUID of the function
NPROBES (ULEB128)
Number of probes originating from this function.
NUM_INLINED_FUNCTIONS (ULEB128)
Number of callees inlined into this function, aka number of
first-level inlinees
PROBE RECORDS
A list of NPROBES entries. Each entry contains:
INDEX (ULEB128)
TYPE (uint4)
0 - block probe, 1 - indirect call, 2 - direct call
ATTRIBUTE (uint3)
reserved
ADDRESS_TYPE (uint1)
0 - code address, 1 - address delta
CODE_ADDRESS (uint64 or ULEB128)
code address or address delta, depending on ADDRESS_TYPE
INLINED FUNCTION RECORDS
A list of NUM_INLINED_FUNCTIONS entries describing each of the inlined
callees. Each record contains:
INLINE SITE
GUID of the inlinee (uint64)
ID of the callsite probe (ULEB128)
FUNCTION BODY
A FUNCTION BODY entry describing the inlined function.
```
**Disassembling**
A switch `--show-pseudo-probe` is added to use along with `--show-disassembly` to print disassembly code with pseudo probe directives.
For example:
```
00000000002011a0 <foo2>:
2011a0: 50 push rax
2011a1: 85 ff test edi,edi
[Probe]: FUNC: foo2 Index: 1 Type: Block
2011a3: 74 02 je 2011a7 <foo2+0x7>
[Probe]: FUNC: foo2 Index: 3 Type: Block
[Probe]: FUNC: foo2 Index: 4 Type: Block
[Probe]: FUNC: foo Index: 1 Type: Block Inlined: @ foo2:6
2011a5: 58 pop rax
2011a6: c3 ret
[Probe]: FUNC: foo2 Index: 2 Type: Block
2011a7: bf 01 00 00 00 mov edi,0x1
[Probe]: FUNC: foo2 Index: 5 Type: IndirectCall
2011ac: ff d6 call rsi
[Probe]: FUNC: foo2 Index: 4 Type: Block
2011ae: 58 pop rax
2011af: c3 ret
```
**Implementation**
- `PseudoProbeDecoder` is added in ProfiledBinary as an infra for the decoding. It decoded the two section and generate two map: `GUIDProbeFunctionMap` stores all the `PseudoProbeFunction` which is the abstraction of a general function. `AddressProbesMap` stores all the pseudo probe info indexed by its address.
- All the inline info is encoded into binary as a trie(`PseudoProbeInlineTree`) and will be constructed from the decoding. Each pseudo probe can get its inline context(`getInlineContext`) by traversing its inline tree node backwards.
Test Plan:
ninja & ninja check-llvm
Differential Revision: https://reviews.llvm.org/D92334
2020-11-24 05:33:23 +01:00
|
|
|
|
|
|
|
// Initialize PseudoProbeFuncDesc and populate it into GUID2FuncDescMap
|
|
|
|
GUID2FuncDescMap.emplace(GUID, PseudoProbeFuncDesc(GUID, Hash, Name));
|
|
|
|
}
|
|
|
|
assert(Data == End && "Have unprocessed data in pseudo_probe_desc section");
|
|
|
|
}
|
|
|
|
|
|
|
|
void PseudoProbeDecoder::buildAddress2ProbeMap(const uint8_t *Start,
|
|
|
|
std::size_t Size) {
|
|
|
|
// The pseudo_probe section encodes an inline forest and each tree has a
|
|
|
|
// format like:
|
|
|
|
// FUNCTION BODY (one for each uninlined function present in the text
|
|
|
|
// section)
|
|
|
|
// GUID (uint64)
|
|
|
|
// GUID of the function
|
|
|
|
// NPROBES (ULEB128)
|
|
|
|
// Number of probes originating from this function.
|
|
|
|
// NUM_INLINED_FUNCTIONS (ULEB128)
|
|
|
|
// Number of callees inlined into this function, aka number of
|
|
|
|
// first-level inlinees
|
|
|
|
// PROBE RECORDS
|
|
|
|
// A list of NPROBES entries. Each entry contains:
|
|
|
|
// INDEX (ULEB128)
|
|
|
|
// TYPE (uint4)
|
|
|
|
// 0 - block probe, 1 - indirect call, 2 - direct call
|
|
|
|
// ATTRIBUTE (uint3)
|
2021-06-17 20:09:13 +02:00
|
|
|
// 1 - reserved
|
[CSSPGO][llvm-profgen] Pseudo probe decoding and disassembling
This change implements pseudo probe decoding and disassembling for llvm-profgen/CSSPGO. Please see https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s and https://reviews.llvm.org/D89707 for more context about CSSPGO and llvm-profgen.
**ELF section format**
Please see the encoding patch(https://reviews.llvm.org/D91878) for more details of the format, just copy the example here:
Two section(`.pseudo_probe_desc` and `.pseudoprobe` ) is emitted in ELF to support pseudo probe.
The format of `.pseudo_probe_desc` section looks like:
```
.section .pseudo_probe_desc,"",@progbits
.quad 6309742469962978389 // Func GUID
.quad 4294967295 // Func Hash
.byte 9 // Length of func name
.ascii "_Z5funcAi" // Func name
.quad 7102633082150537521
.quad 138828622701
.byte 12
.ascii "_Z8funcLeafi"
.quad 446061515086924981
.quad 4294967295
.byte 9
.ascii "_Z5funcBi"
.quad -2016976694713209516
.quad 72617220756
.byte 7
.ascii "_Z3fibi"
```
For each `.pseudoprobe` section, the encoded binary data consists of a single function record corresponding to an outlined function (i.e, a function with a code entry in the `.text` section). A function record has the following format :
```
FUNCTION BODY (one for each outlined function present in the text section)
GUID (uint64)
GUID of the function
NPROBES (ULEB128)
Number of probes originating from this function.
NUM_INLINED_FUNCTIONS (ULEB128)
Number of callees inlined into this function, aka number of
first-level inlinees
PROBE RECORDS
A list of NPROBES entries. Each entry contains:
INDEX (ULEB128)
TYPE (uint4)
0 - block probe, 1 - indirect call, 2 - direct call
ATTRIBUTE (uint3)
reserved
ADDRESS_TYPE (uint1)
0 - code address, 1 - address delta
CODE_ADDRESS (uint64 or ULEB128)
code address or address delta, depending on ADDRESS_TYPE
INLINED FUNCTION RECORDS
A list of NUM_INLINED_FUNCTIONS entries describing each of the inlined
callees. Each record contains:
INLINE SITE
GUID of the inlinee (uint64)
ID of the callsite probe (ULEB128)
FUNCTION BODY
A FUNCTION BODY entry describing the inlined function.
```
**Disassembling**
A switch `--show-pseudo-probe` is added to use along with `--show-disassembly` to print disassembly code with pseudo probe directives.
For example:
```
00000000002011a0 <foo2>:
2011a0: 50 push rax
2011a1: 85 ff test edi,edi
[Probe]: FUNC: foo2 Index: 1 Type: Block
2011a3: 74 02 je 2011a7 <foo2+0x7>
[Probe]: FUNC: foo2 Index: 3 Type: Block
[Probe]: FUNC: foo2 Index: 4 Type: Block
[Probe]: FUNC: foo Index: 1 Type: Block Inlined: @ foo2:6
2011a5: 58 pop rax
2011a6: c3 ret
[Probe]: FUNC: foo2 Index: 2 Type: Block
2011a7: bf 01 00 00 00 mov edi,0x1
[Probe]: FUNC: foo2 Index: 5 Type: IndirectCall
2011ac: ff d6 call rsi
[Probe]: FUNC: foo2 Index: 4 Type: Block
2011ae: 58 pop rax
2011af: c3 ret
```
**Implementation**
- `PseudoProbeDecoder` is added in ProfiledBinary as an infra for the decoding. It decoded the two section and generate two map: `GUIDProbeFunctionMap` stores all the `PseudoProbeFunction` which is the abstraction of a general function. `AddressProbesMap` stores all the pseudo probe info indexed by its address.
- All the inline info is encoded into binary as a trie(`PseudoProbeInlineTree`) and will be constructed from the decoding. Each pseudo probe can get its inline context(`getInlineContext`) by traversing its inline tree node backwards.
Test Plan:
ninja & ninja check-llvm
Differential Revision: https://reviews.llvm.org/D92334
2020-11-24 05:33:23 +01:00
|
|
|
// ADDRESS_TYPE (uint1)
|
|
|
|
// 0 - code address, 1 - address delta
|
|
|
|
// CODE_ADDRESS (uint64 or ULEB128)
|
|
|
|
// code address or address delta, depending on Flag
|
|
|
|
// INLINED FUNCTION RECORDS
|
|
|
|
// A list of NUM_INLINED_FUNCTIONS entries describing each of the
|
|
|
|
// inlined callees. Each record contains:
|
|
|
|
// INLINE SITE
|
|
|
|
// Index of the callsite probe (ULEB128)
|
|
|
|
// FUNCTION BODY
|
|
|
|
// A FUNCTION BODY entry describing the inlined function.
|
|
|
|
#ifndef NDEBUG
|
|
|
|
SectionName = "pseudo_probe";
|
|
|
|
#endif
|
|
|
|
Data = Start;
|
|
|
|
End = Data + Size;
|
|
|
|
|
|
|
|
PseudoProbeInlineTree *Root = &DummyInlineRoot;
|
|
|
|
PseudoProbeInlineTree *Cur = &DummyInlineRoot;
|
|
|
|
uint64_t LastAddr = 0;
|
|
|
|
uint32_t Index = 0;
|
|
|
|
// A DFS-based decoding
|
|
|
|
while (Data < End) {
|
2021-04-20 03:04:43 +02:00
|
|
|
if (Root == Cur) {
|
|
|
|
// Use a sequential id for top level inliner.
|
|
|
|
Index = Root->getChildren().size();
|
|
|
|
} else {
|
|
|
|
// Read inline site for inlinees
|
[CSSPGO][llvm-profgen] Pseudo probe decoding and disassembling
This change implements pseudo probe decoding and disassembling for llvm-profgen/CSSPGO. Please see https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s and https://reviews.llvm.org/D89707 for more context about CSSPGO and llvm-profgen.
**ELF section format**
Please see the encoding patch(https://reviews.llvm.org/D91878) for more details of the format, just copy the example here:
Two section(`.pseudo_probe_desc` and `.pseudoprobe` ) is emitted in ELF to support pseudo probe.
The format of `.pseudo_probe_desc` section looks like:
```
.section .pseudo_probe_desc,"",@progbits
.quad 6309742469962978389 // Func GUID
.quad 4294967295 // Func Hash
.byte 9 // Length of func name
.ascii "_Z5funcAi" // Func name
.quad 7102633082150537521
.quad 138828622701
.byte 12
.ascii "_Z8funcLeafi"
.quad 446061515086924981
.quad 4294967295
.byte 9
.ascii "_Z5funcBi"
.quad -2016976694713209516
.quad 72617220756
.byte 7
.ascii "_Z3fibi"
```
For each `.pseudoprobe` section, the encoded binary data consists of a single function record corresponding to an outlined function (i.e, a function with a code entry in the `.text` section). A function record has the following format :
```
FUNCTION BODY (one for each outlined function present in the text section)
GUID (uint64)
GUID of the function
NPROBES (ULEB128)
Number of probes originating from this function.
NUM_INLINED_FUNCTIONS (ULEB128)
Number of callees inlined into this function, aka number of
first-level inlinees
PROBE RECORDS
A list of NPROBES entries. Each entry contains:
INDEX (ULEB128)
TYPE (uint4)
0 - block probe, 1 - indirect call, 2 - direct call
ATTRIBUTE (uint3)
reserved
ADDRESS_TYPE (uint1)
0 - code address, 1 - address delta
CODE_ADDRESS (uint64 or ULEB128)
code address or address delta, depending on ADDRESS_TYPE
INLINED FUNCTION RECORDS
A list of NUM_INLINED_FUNCTIONS entries describing each of the inlined
callees. Each record contains:
INLINE SITE
GUID of the inlinee (uint64)
ID of the callsite probe (ULEB128)
FUNCTION BODY
A FUNCTION BODY entry describing the inlined function.
```
**Disassembling**
A switch `--show-pseudo-probe` is added to use along with `--show-disassembly` to print disassembly code with pseudo probe directives.
For example:
```
00000000002011a0 <foo2>:
2011a0: 50 push rax
2011a1: 85 ff test edi,edi
[Probe]: FUNC: foo2 Index: 1 Type: Block
2011a3: 74 02 je 2011a7 <foo2+0x7>
[Probe]: FUNC: foo2 Index: 3 Type: Block
[Probe]: FUNC: foo2 Index: 4 Type: Block
[Probe]: FUNC: foo Index: 1 Type: Block Inlined: @ foo2:6
2011a5: 58 pop rax
2011a6: c3 ret
[Probe]: FUNC: foo2 Index: 2 Type: Block
2011a7: bf 01 00 00 00 mov edi,0x1
[Probe]: FUNC: foo2 Index: 5 Type: IndirectCall
2011ac: ff d6 call rsi
[Probe]: FUNC: foo2 Index: 4 Type: Block
2011ae: 58 pop rax
2011af: c3 ret
```
**Implementation**
- `PseudoProbeDecoder` is added in ProfiledBinary as an infra for the decoding. It decoded the two section and generate two map: `GUIDProbeFunctionMap` stores all the `PseudoProbeFunction` which is the abstraction of a general function. `AddressProbesMap` stores all the pseudo probe info indexed by its address.
- All the inline info is encoded into binary as a trie(`PseudoProbeInlineTree`) and will be constructed from the decoding. Each pseudo probe can get its inline context(`getInlineContext`) by traversing its inline tree node backwards.
Test Plan:
ninja & ninja check-llvm
Differential Revision: https://reviews.llvm.org/D92334
2020-11-24 05:33:23 +01:00
|
|
|
Index = readUnsignedNumber<uint32_t>();
|
|
|
|
}
|
|
|
|
// Switch/add to a new tree node(inlinee)
|
2021-01-13 23:27:03 +01:00
|
|
|
Cur = Cur->getOrAddNode(std::make_tuple(Cur->GUID, Index));
|
[CSSPGO][llvm-profgen] Pseudo probe decoding and disassembling
This change implements pseudo probe decoding and disassembling for llvm-profgen/CSSPGO. Please see https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s and https://reviews.llvm.org/D89707 for more context about CSSPGO and llvm-profgen.
**ELF section format**
Please see the encoding patch(https://reviews.llvm.org/D91878) for more details of the format, just copy the example here:
Two section(`.pseudo_probe_desc` and `.pseudoprobe` ) is emitted in ELF to support pseudo probe.
The format of `.pseudo_probe_desc` section looks like:
```
.section .pseudo_probe_desc,"",@progbits
.quad 6309742469962978389 // Func GUID
.quad 4294967295 // Func Hash
.byte 9 // Length of func name
.ascii "_Z5funcAi" // Func name
.quad 7102633082150537521
.quad 138828622701
.byte 12
.ascii "_Z8funcLeafi"
.quad 446061515086924981
.quad 4294967295
.byte 9
.ascii "_Z5funcBi"
.quad -2016976694713209516
.quad 72617220756
.byte 7
.ascii "_Z3fibi"
```
For each `.pseudoprobe` section, the encoded binary data consists of a single function record corresponding to an outlined function (i.e, a function with a code entry in the `.text` section). A function record has the following format :
```
FUNCTION BODY (one for each outlined function present in the text section)
GUID (uint64)
GUID of the function
NPROBES (ULEB128)
Number of probes originating from this function.
NUM_INLINED_FUNCTIONS (ULEB128)
Number of callees inlined into this function, aka number of
first-level inlinees
PROBE RECORDS
A list of NPROBES entries. Each entry contains:
INDEX (ULEB128)
TYPE (uint4)
0 - block probe, 1 - indirect call, 2 - direct call
ATTRIBUTE (uint3)
reserved
ADDRESS_TYPE (uint1)
0 - code address, 1 - address delta
CODE_ADDRESS (uint64 or ULEB128)
code address or address delta, depending on ADDRESS_TYPE
INLINED FUNCTION RECORDS
A list of NUM_INLINED_FUNCTIONS entries describing each of the inlined
callees. Each record contains:
INLINE SITE
GUID of the inlinee (uint64)
ID of the callsite probe (ULEB128)
FUNCTION BODY
A FUNCTION BODY entry describing the inlined function.
```
**Disassembling**
A switch `--show-pseudo-probe` is added to use along with `--show-disassembly` to print disassembly code with pseudo probe directives.
For example:
```
00000000002011a0 <foo2>:
2011a0: 50 push rax
2011a1: 85 ff test edi,edi
[Probe]: FUNC: foo2 Index: 1 Type: Block
2011a3: 74 02 je 2011a7 <foo2+0x7>
[Probe]: FUNC: foo2 Index: 3 Type: Block
[Probe]: FUNC: foo2 Index: 4 Type: Block
[Probe]: FUNC: foo Index: 1 Type: Block Inlined: @ foo2:6
2011a5: 58 pop rax
2011a6: c3 ret
[Probe]: FUNC: foo2 Index: 2 Type: Block
2011a7: bf 01 00 00 00 mov edi,0x1
[Probe]: FUNC: foo2 Index: 5 Type: IndirectCall
2011ac: ff d6 call rsi
[Probe]: FUNC: foo2 Index: 4 Type: Block
2011ae: 58 pop rax
2011af: c3 ret
```
**Implementation**
- `PseudoProbeDecoder` is added in ProfiledBinary as an infra for the decoding. It decoded the two section and generate two map: `GUIDProbeFunctionMap` stores all the `PseudoProbeFunction` which is the abstraction of a general function. `AddressProbesMap` stores all the pseudo probe info indexed by its address.
- All the inline info is encoded into binary as a trie(`PseudoProbeInlineTree`) and will be constructed from the decoding. Each pseudo probe can get its inline context(`getInlineContext`) by traversing its inline tree node backwards.
Test Plan:
ninja & ninja check-llvm
Differential Revision: https://reviews.llvm.org/D92334
2020-11-24 05:33:23 +01:00
|
|
|
// Read guid
|
|
|
|
Cur->GUID = readUnencodedNumber<uint64_t>();
|
|
|
|
// Read number of probes in the current node.
|
|
|
|
uint32_t NodeCount = readUnsignedNumber<uint32_t>();
|
|
|
|
// Read number of direct inlinees
|
|
|
|
Cur->ChildrenToProcess = readUnsignedNumber<uint32_t>();
|
|
|
|
// Read all probes in this node
|
|
|
|
for (std::size_t I = 0; I < NodeCount; I++) {
|
|
|
|
// Read index
|
|
|
|
uint32_t Index = readUnsignedNumber<uint32_t>();
|
|
|
|
// Read type | flag.
|
|
|
|
uint8_t Value = readUnencodedNumber<uint8_t>();
|
|
|
|
uint8_t Kind = Value & 0xf;
|
|
|
|
uint8_t Attr = (Value & 0x70) >> 4;
|
|
|
|
// Read address
|
|
|
|
uint64_t Addr = 0;
|
|
|
|
if (Value & 0x80) {
|
|
|
|
int64_t Offset = readSignedNumber<int64_t>();
|
|
|
|
Addr = LastAddr + Offset;
|
|
|
|
} else {
|
|
|
|
Addr = readUnencodedNumber<int64_t>();
|
|
|
|
}
|
|
|
|
// Populate Address2ProbesMap
|
2021-04-20 03:04:43 +02:00
|
|
|
auto &Probes = Address2ProbesMap[Addr];
|
|
|
|
Probes.emplace_back(Addr, Cur->GUID, Index, PseudoProbeType(Kind), Attr,
|
|
|
|
Cur);
|
|
|
|
Cur->addProbes(&Probes.back());
|
[CSSPGO][llvm-profgen] Pseudo probe decoding and disassembling
This change implements pseudo probe decoding and disassembling for llvm-profgen/CSSPGO. Please see https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s and https://reviews.llvm.org/D89707 for more context about CSSPGO and llvm-profgen.
**ELF section format**
Please see the encoding patch(https://reviews.llvm.org/D91878) for more details of the format, just copy the example here:
Two section(`.pseudo_probe_desc` and `.pseudoprobe` ) is emitted in ELF to support pseudo probe.
The format of `.pseudo_probe_desc` section looks like:
```
.section .pseudo_probe_desc,"",@progbits
.quad 6309742469962978389 // Func GUID
.quad 4294967295 // Func Hash
.byte 9 // Length of func name
.ascii "_Z5funcAi" // Func name
.quad 7102633082150537521
.quad 138828622701
.byte 12
.ascii "_Z8funcLeafi"
.quad 446061515086924981
.quad 4294967295
.byte 9
.ascii "_Z5funcBi"
.quad -2016976694713209516
.quad 72617220756
.byte 7
.ascii "_Z3fibi"
```
For each `.pseudoprobe` section, the encoded binary data consists of a single function record corresponding to an outlined function (i.e, a function with a code entry in the `.text` section). A function record has the following format :
```
FUNCTION BODY (one for each outlined function present in the text section)
GUID (uint64)
GUID of the function
NPROBES (ULEB128)
Number of probes originating from this function.
NUM_INLINED_FUNCTIONS (ULEB128)
Number of callees inlined into this function, aka number of
first-level inlinees
PROBE RECORDS
A list of NPROBES entries. Each entry contains:
INDEX (ULEB128)
TYPE (uint4)
0 - block probe, 1 - indirect call, 2 - direct call
ATTRIBUTE (uint3)
reserved
ADDRESS_TYPE (uint1)
0 - code address, 1 - address delta
CODE_ADDRESS (uint64 or ULEB128)
code address or address delta, depending on ADDRESS_TYPE
INLINED FUNCTION RECORDS
A list of NUM_INLINED_FUNCTIONS entries describing each of the inlined
callees. Each record contains:
INLINE SITE
GUID of the inlinee (uint64)
ID of the callsite probe (ULEB128)
FUNCTION BODY
A FUNCTION BODY entry describing the inlined function.
```
**Disassembling**
A switch `--show-pseudo-probe` is added to use along with `--show-disassembly` to print disassembly code with pseudo probe directives.
For example:
```
00000000002011a0 <foo2>:
2011a0: 50 push rax
2011a1: 85 ff test edi,edi
[Probe]: FUNC: foo2 Index: 1 Type: Block
2011a3: 74 02 je 2011a7 <foo2+0x7>
[Probe]: FUNC: foo2 Index: 3 Type: Block
[Probe]: FUNC: foo2 Index: 4 Type: Block
[Probe]: FUNC: foo Index: 1 Type: Block Inlined: @ foo2:6
2011a5: 58 pop rax
2011a6: c3 ret
[Probe]: FUNC: foo2 Index: 2 Type: Block
2011a7: bf 01 00 00 00 mov edi,0x1
[Probe]: FUNC: foo2 Index: 5 Type: IndirectCall
2011ac: ff d6 call rsi
[Probe]: FUNC: foo2 Index: 4 Type: Block
2011ae: 58 pop rax
2011af: c3 ret
```
**Implementation**
- `PseudoProbeDecoder` is added in ProfiledBinary as an infra for the decoding. It decoded the two section and generate two map: `GUIDProbeFunctionMap` stores all the `PseudoProbeFunction` which is the abstraction of a general function. `AddressProbesMap` stores all the pseudo probe info indexed by its address.
- All the inline info is encoded into binary as a trie(`PseudoProbeInlineTree`) and will be constructed from the decoding. Each pseudo probe can get its inline context(`getInlineContext`) by traversing its inline tree node backwards.
Test Plan:
ninja & ninja check-llvm
Differential Revision: https://reviews.llvm.org/D92334
2020-11-24 05:33:23 +01:00
|
|
|
LastAddr = Addr;
|
|
|
|
}
|
|
|
|
|
|
|
|
// Look for the parent for the next node by subtracting the current
|
|
|
|
// node count from tree counts along the parent chain. The first node
|
|
|
|
// in the chain that has a non-zero tree count is the target.
|
|
|
|
while (Cur != Root) {
|
|
|
|
if (Cur->ChildrenToProcess == 0) {
|
|
|
|
Cur = Cur->Parent;
|
|
|
|
if (Cur != Root) {
|
|
|
|
assert(Cur->ChildrenToProcess > 0 &&
|
|
|
|
"Should have some unprocessed nodes");
|
|
|
|
Cur->ChildrenToProcess -= 1;
|
|
|
|
}
|
|
|
|
} else {
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
assert(Data == End && "Have unprocessed data in pseudo_probe section");
|
|
|
|
assert(Cur == Root &&
|
|
|
|
" Cur should point to root when the forest is fully built up");
|
|
|
|
}
|
|
|
|
|
|
|
|
void PseudoProbeDecoder::printGUID2FuncDescMap(raw_ostream &OS) {
|
|
|
|
OS << "Pseudo Probe Desc:\n";
|
|
|
|
// Make the output deterministic
|
|
|
|
std::map<uint64_t, PseudoProbeFuncDesc> OrderedMap(GUID2FuncDescMap.begin(),
|
|
|
|
GUID2FuncDescMap.end());
|
|
|
|
for (auto &I : OrderedMap) {
|
|
|
|
I.second.print(OS);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
void PseudoProbeDecoder::printProbeForAddress(raw_ostream &OS,
|
|
|
|
uint64_t Address) {
|
|
|
|
auto It = Address2ProbesMap.find(Address);
|
|
|
|
if (It != Address2ProbesMap.end()) {
|
|
|
|
for (auto &Probe : It->second) {
|
|
|
|
OS << " [Probe]:\t";
|
|
|
|
Probe.print(OS, GUID2FuncDescMap, true);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2020-11-26 05:33:17 +01:00
|
|
|
const PseudoProbe *
|
|
|
|
PseudoProbeDecoder::getCallProbeForAddr(uint64_t Address) const {
|
|
|
|
auto It = Address2ProbesMap.find(Address);
|
|
|
|
if (It == Address2ProbesMap.end())
|
|
|
|
return nullptr;
|
2021-04-20 03:04:43 +02:00
|
|
|
const auto &Probes = It->second;
|
2020-11-26 05:33:17 +01:00
|
|
|
|
|
|
|
const PseudoProbe *CallProbe = nullptr;
|
|
|
|
for (const auto &Probe : Probes) {
|
|
|
|
if (Probe.isCall()) {
|
|
|
|
assert(!CallProbe &&
|
|
|
|
"There should be only one call probe corresponding to address "
|
|
|
|
"which is a callsite.");
|
|
|
|
CallProbe = &Probe;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
return CallProbe;
|
|
|
|
}
|
|
|
|
|
2021-01-11 18:08:39 +01:00
|
|
|
const PseudoProbeFuncDesc *
|
|
|
|
PseudoProbeDecoder::getFuncDescForGUID(uint64_t GUID) const {
|
|
|
|
auto It = GUID2FuncDescMap.find(GUID);
|
|
|
|
assert(It != GUID2FuncDescMap.end() && "Function descriptor doesn't exist");
|
|
|
|
return &It->second;
|
|
|
|
}
|
|
|
|
|
2020-11-26 05:33:17 +01:00
|
|
|
void PseudoProbeDecoder::getInlineContextForProbe(
|
[CSSPGO][llvm-profgen] Compress recursive cycles in calling context
This change compresses the context string by removing cycles due to recursive function for CS profile generation. Removing recursion cycles is a way to normalize the calling context which will be better for the sample aggregation and also make the context promoting deterministic.
Specifically for implementation, we recognize adjacent repeated frames as cycles and deduplicated them through multiple round of iteration.
For example:
Considering a input context string stack:
[“a”, “a”, “b”, “c”, “a”, “b”, “c”, “b”, “c”, “d”]
For first iteration,, it removed all adjacent repeated frames of size 1:
[“a”, “b”, “c”, “a”, “b”, “c”, “b”, “c”, “d”]
For second iteration, it removed all adjacent repeated frames of size 2:
[“a”, “b”, “c”, “a”, “b”, “c”, “d”]
So in the end, we get compressed output:
[“a”, “b”, “c”, “d”]
Compression will be called in two place: one for sample's context key right after unwinding, one is for the eventual context string id in the ProfileGenerator.
Added a switch `compress-recursion` to control the size of duplicated frames, default -1 means no size limit.
Added unit tests and regression test for this.
Differential Revision: https://reviews.llvm.org/D93556
2021-01-30 00:00:08 +01:00
|
|
|
const PseudoProbe *Probe, SmallVectorImpl<std::string> &InlineContextStack,
|
2020-11-26 05:33:17 +01:00
|
|
|
bool IncludeLeaf) const {
|
|
|
|
Probe->getInlineContext(InlineContextStack, GUID2FuncDescMap, true);
|
2021-01-11 18:08:39 +01:00
|
|
|
if (!IncludeLeaf)
|
|
|
|
return;
|
|
|
|
// Note that the context from probe doesn't include leaf frame,
|
|
|
|
// hence we need to retrieve and prepend leaf if requested.
|
|
|
|
const auto *FuncDesc = getFuncDescForGUID(Probe->GUID);
|
|
|
|
InlineContextStack.emplace_back(FuncDesc->FuncName + ":" +
|
|
|
|
Twine(Probe->Index).str());
|
|
|
|
}
|
|
|
|
|
|
|
|
const PseudoProbeFuncDesc *
|
|
|
|
PseudoProbeDecoder::getInlinerDescForProbe(const PseudoProbe *Probe) const {
|
|
|
|
PseudoProbeInlineTree *InlinerNode = Probe->InlineTree;
|
|
|
|
if (!InlinerNode->hasInlineSite())
|
|
|
|
return nullptr;
|
|
|
|
return getFuncDescForGUID(std::get<0>(InlinerNode->ISite));
|
2020-11-26 05:33:17 +01:00
|
|
|
}
|
|
|
|
|
[CSSPGO][llvm-profgen] Pseudo probe decoding and disassembling
This change implements pseudo probe decoding and disassembling for llvm-profgen/CSSPGO. Please see https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s and https://reviews.llvm.org/D89707 for more context about CSSPGO and llvm-profgen.
**ELF section format**
Please see the encoding patch(https://reviews.llvm.org/D91878) for more details of the format, just copy the example here:
Two section(`.pseudo_probe_desc` and `.pseudoprobe` ) is emitted in ELF to support pseudo probe.
The format of `.pseudo_probe_desc` section looks like:
```
.section .pseudo_probe_desc,"",@progbits
.quad 6309742469962978389 // Func GUID
.quad 4294967295 // Func Hash
.byte 9 // Length of func name
.ascii "_Z5funcAi" // Func name
.quad 7102633082150537521
.quad 138828622701
.byte 12
.ascii "_Z8funcLeafi"
.quad 446061515086924981
.quad 4294967295
.byte 9
.ascii "_Z5funcBi"
.quad -2016976694713209516
.quad 72617220756
.byte 7
.ascii "_Z3fibi"
```
For each `.pseudoprobe` section, the encoded binary data consists of a single function record corresponding to an outlined function (i.e, a function with a code entry in the `.text` section). A function record has the following format :
```
FUNCTION BODY (one for each outlined function present in the text section)
GUID (uint64)
GUID of the function
NPROBES (ULEB128)
Number of probes originating from this function.
NUM_INLINED_FUNCTIONS (ULEB128)
Number of callees inlined into this function, aka number of
first-level inlinees
PROBE RECORDS
A list of NPROBES entries. Each entry contains:
INDEX (ULEB128)
TYPE (uint4)
0 - block probe, 1 - indirect call, 2 - direct call
ATTRIBUTE (uint3)
reserved
ADDRESS_TYPE (uint1)
0 - code address, 1 - address delta
CODE_ADDRESS (uint64 or ULEB128)
code address or address delta, depending on ADDRESS_TYPE
INLINED FUNCTION RECORDS
A list of NUM_INLINED_FUNCTIONS entries describing each of the inlined
callees. Each record contains:
INLINE SITE
GUID of the inlinee (uint64)
ID of the callsite probe (ULEB128)
FUNCTION BODY
A FUNCTION BODY entry describing the inlined function.
```
**Disassembling**
A switch `--show-pseudo-probe` is added to use along with `--show-disassembly` to print disassembly code with pseudo probe directives.
For example:
```
00000000002011a0 <foo2>:
2011a0: 50 push rax
2011a1: 85 ff test edi,edi
[Probe]: FUNC: foo2 Index: 1 Type: Block
2011a3: 74 02 je 2011a7 <foo2+0x7>
[Probe]: FUNC: foo2 Index: 3 Type: Block
[Probe]: FUNC: foo2 Index: 4 Type: Block
[Probe]: FUNC: foo Index: 1 Type: Block Inlined: @ foo2:6
2011a5: 58 pop rax
2011a6: c3 ret
[Probe]: FUNC: foo2 Index: 2 Type: Block
2011a7: bf 01 00 00 00 mov edi,0x1
[Probe]: FUNC: foo2 Index: 5 Type: IndirectCall
2011ac: ff d6 call rsi
[Probe]: FUNC: foo2 Index: 4 Type: Block
2011ae: 58 pop rax
2011af: c3 ret
```
**Implementation**
- `PseudoProbeDecoder` is added in ProfiledBinary as an infra for the decoding. It decoded the two section and generate two map: `GUIDProbeFunctionMap` stores all the `PseudoProbeFunction` which is the abstraction of a general function. `AddressProbesMap` stores all the pseudo probe info indexed by its address.
- All the inline info is encoded into binary as a trie(`PseudoProbeInlineTree`) and will be constructed from the decoding. Each pseudo probe can get its inline context(`getInlineContext`) by traversing its inline tree node backwards.
Test Plan:
ninja & ninja check-llvm
Differential Revision: https://reviews.llvm.org/D92334
2020-11-24 05:33:23 +01:00
|
|
|
} // end namespace sampleprof
|
|
|
|
} // end namespace llvm
|