Currently when .llvm.call-graph-profile is created by llvm it explicitly encodes the symbol indices. This section is basically a black box for post processing tools. For example, if we run strip -s on the object files the symbol table changes, but indices in that section do not. In non-visible behavior indices point to wrong symbols. The visible behavior indices point outside of Symbol table: "invalid symbol index".
This patch changes the format by using R_*_NONE relocations to indicate the from/to symbols. The Frequency (Weight) will still be in the .llvm.call-graph-profile, but symbol information will be in relocation section. In LLD information from both sections is used to reconstruct call graph profile. Relocations themselves will never be applied.
With this approach post processing tools that handle relocations correctly work for this section also. Tools can add/remove symbols and as long as they handle relocation sections with this approach information stays correct.
Doing a quick experiment with clang-13.
The size went up from 107KB to 322KB, aggregate of all the input sections. Size of clang-13 binary is ~118MB. For users of -fprofile-use/-fprofile-sample-use the size of object files will go up slightly, it will not impact final binary size.
Reviewed By: jhenderson, MaskRay
Differential Revision: https://reviews.llvm.org/D104080
This patch adds support for a new field in the FileHeader, which states
the name to use for the section header string table. This also allows
combining the string table with another string table in the object, e.g.
the symbol name string table. The field is optional. By default,
.shstrtab will continue to be used.
This partially fixes https://bugs.llvm.org/show_bug.cgi?id=50506.
Reviewed by: Higuoxing
Differential Revision: https://reviews.llvm.org/D104035
Add in the ability of parsing symbol table for 64 bit object.
Reviewed By: jhenderson, DiggerLin
Differential Revision: https://reviews.llvm.org/D85774
This change was originally landed in: 5000a1b4b9edeb9e994f2a5b36da8d48599bea49
It was reverted in: 061e071d8c9b98526f35cad55a918a4f1615afd4
This change adds support for a new WASM_SEG_FLAG_STRINGS flag in
the object format which works in a similar fashion to SHF_STRINGS
in the ELF world.
Unlike the ELF linker this support is currently limited:
- No support for SHF_MERGE (non-string merging)
- Always do full tail merging ("lo" can be merged with "hello")
- Only support single byte strings (p2align 0)
Like the ELF linker merging is only performed at `-O1` and above.
This fixes part of https://bugs.llvm.org/show_bug.cgi?id=48828,
although crucially it doesn't not currently support debug sections
because they are not represented by data segments (they are custom
sections)
Differential Revision: https://reviews.llvm.org/D97657
This reverts commit 5000a1b4b9edeb9e994f2a5b36da8d48599bea49.
Breaks tests, see https://reviews.llvm.org/D97657#2749151
Easily repros locally with `ninja check-llvm-mc-webassembly`.
This change adds support for a new WASM_SEG_FLAG_STRINGS flag in
the object format which works in a similar fashion to SHF_STRINGS
in the ELF world.
Unlike the ELF linker this support is currently limited:
- No support for SHF_MERGE (non-string merging)
- Always do full tail merging ("lo" can be merged with "hello")
- Only support single byte strings (p2align 0)
Like the ELF linker merging is only performed at `-O1` and above.
This fixes part of https://bugs.llvm.org/show_bug.cgi?id=48828,
although crucially it doesn't not currently support debug sections
because they are not represented by data segments (they are custom
sections)
Differential Revision: https://reviews.llvm.org/D97657
In future patches I will be setting the IsText parameter frequently so I will refactor the args to be in the following order. I have removed the FileSize parameter because it is never used.
```
static ErrorOr<std::unique_ptr<MemoryBuffer>>
getFile(const Twine &Filename, bool IsText = false,
bool RequiresNullTerminator = true, bool IsVolatile = false);
static ErrorOr<std::unique_ptr<MemoryBuffer>>
getFileOrSTDIN(const Twine &Filename, bool IsText = false,
bool RequiresNullTerminator = true);
static ErrorOr<std::unique_ptr<MB>>
getFileAux(const Twine &Filename, uint64_t MapSize, uint64_t Offset,
bool IsText, bool RequiresNullTerminator, bool IsVolatile);
static ErrorOr<std::unique_ptr<WritableMemoryBuffer>>
getFile(const Twine &Filename, bool IsVolatile = false);
```
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D99182
This patch renames the "Initial" member of WasmLimits to the name used
in the spec, "Minimum".
In the core WebAssembly specification, the Limits data type has one
required "min" member and one optional "max" member, indicating the
minimum required size of the corresponding table or memory, and the
maximum size, if any.
Although the WebAssembly spec does instantiate locally-defined tables
and memories with the initial size being equal to the minimum size, it
can't impose such a requirement for imports. It doesn't make sense to
require an initial size for a memory import, for example. The compiler
can only sensibly express the minimum and maximum sizes.
See
https://github.com/WebAssembly/js-types/blob/master/proposals/js-types/Overview.md#naming-of-size-limits
for a related discussion that agrees that the right name of "initial" is
"minimum" when querying the type of a table or memory from JavaScript.
(Of course it still makes sense for JS to speak in terms of an initial
size when it explicitly instantiates memories and tables.)
Differential Revision: https://reviews.llvm.org/D99186
With reference types, tables can have non-zero table numbers. This
commit adds support for element sections against these tables.
Differential Revision: https://reviews.llvm.org/D97923
As discussed in D95511, this allows us to encode invalid BBAddrMap
sections to be used in more rigorous testing.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D96831
As discussed in D95511, this allows us to encode invalid BBAddrMap
sections to be used in more rigorous testing.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D96831
This patch let the yaml encoding use Hex64 values for NumBlocks, BB AddressOffset, BB Size, and BB Metadata.
Additionally, it changes the decoded values in elf2yaml to uint64_t to match DataExtractor::getULEB128 return type.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D95767
This was discussed in D93678 thread.
Currently we have one special chunk - Fill.
This patch re implements the "SectionHeaderTable" key to become a special chunk too.
With that we are able to place the section header table at any location,
just like we place sections.
Differential revision: https://reviews.llvm.org/D95140
This makes the following improvements.
For `SHT_GNU_versym`:
* yaml2obj: set `sh_link` to index of `.dynsym` section automatically.
For `SHT_GNU_verdef`:
* yaml2obj: set `sh_link` to index of `.dynstr` section automatically.
* yaml2obj: set `sh_info` field automatically.
* obj2yaml: don't dump the `Info` field when its value matches the number of version definitions.
For `SHT_GNU_verneed`:
* yaml2obj: set `sh_link` to index of `.dynstr` section automatically.
* yaml2obj: set `sh_info` field automatically.
* obj2yaml: don't dump the `Info` field when its value matches the number of version dependencies.
Also, simplifies few test cases.
Differential revision: https://reviews.llvm.org/D94956
This reuses the code from yaml2obj (moves it to ELFYAML.h).
With it we can set the `sh_entsize` in a single place in `obj2yaml`.
Note that it also fixes a bug of `yaml2obj`: we do not
set the `sh_entsize` field for the `SHT_ARM_EXIDX` section properly.
Differential revision: https://reviews.llvm.org/D93858
Currently we crash when we have an object with SHT_SYMTAB/SHT_DYNSYM sections
of size 0.
With this patch instead of the crash we start to dump them properly.
Differential revision: https://reviews.llvm.org/D93697
We have the following issues related to group sections:
1) yaml2obj is unable to set the custom `sh_entsize` value, because the `EntSize`
key is currently ignored.
2) obj2yaml is unable to dump the group section which `sh_entsize != 4`.
3) obj2yaml always dumps the "EntSize" for group sections, though
usually we are trying to omit dumping default values when dumping keys.
I.e. we should not print the "EntSize" key when `sh_entsize` == 4.
This patch fixes (1),(3) and adds the test case to document the behavior of (2).
Differential revision: https://reviews.llvm.org/D93854
`getUniquedSectionName(const Elf_Shdr *Sec)` assumes that
`Sec` is not `nullptr`.
I've found one place in `getUniquedSymbolName` where it is
not true (because of that we crash when trying to dump
unnamed null section symbols).
Patch fixes the crash and changes the signature of the
`getUniquedSectionName` section to accept a reference.
Differential revision: https://reviews.llvm.org/D93754
This is similar to D93760.
When something is wrong with the hash table header we dump
its context as a raw data.
Currently we have the calculation overflow issue and it is possible to
bypass the validation we have (and crash).
The patch fixes it.
Differential revision: https://reviews.llvm.org/D93799
When something is wrong with the GNU hash table header we dump
its context as a raw data.
Currently we have the calculation overflow issue and it is possible to
bypass the validation we have (and crash).
The patch fixes it.
Differential revision: https://reviews.llvm.org/D93760
When a field is optional we can use the `=<none>` syntax in macros.
This patch makes `Value`/`Size` fields of `Symbol` optional
and adds test cases for them.
Differential revision: https://reviews.llvm.org/D93010
It is allowed to have multiple `SHT_SYMTAB_SHNDX` sections, though
we currently don't implement it.
The current implementation assumes that there is a maximum of one SHT_SYMTAB_SHNDX
section and that it is always linked with .symtab section.
This patch drops this limitations.
Differential revision: https://reviews.llvm.org/D92644
Allow sections to be placed into COMDAT groups, in addtion to functions and data
segments.
Also make section symbols unnamed, which allows sections with identical names
(section names are independent of their section symbols, but previously we
gave the symbols the same name as their sections, which results in collisions
when sections are identically-named).
Differential Revision: https://reviews.llvm.org/D92691
Currently when we dump sections, we dump them in the order,
which is specified in the sections header table.
With that the order in the output might not match the order in the file.
This patch starts sorting them by by file offsets when dumping.
When the order in the section header table doesn't match the order
in the file, we should emit the "SectionHeaderTable" key. This patch does it.
Differential revision: https://reviews.llvm.org/D91249
This patch starts emitting the `EShNum` key, when the `e_shnum = 0`
and the section header table exists.
`e_shnum` might be 0, when the the number of entries in the section header
table is larger than or equal to SHN_LORESERVE (0xff00).
In this case the real number of entries
in the section header table is held in the `sh_size`
member of the initial entry in section header table.
Currently, obj2yaml crashes when an object has `e_shoff != 0` and the `sh_size`
member of the initial entry in section header table is `0`.
This patch fixes it.
Differential revision: https://reviews.llvm.org/D92098
The following line asserts when `sh_addralign > MAX_UINT32 && (uint32_t)sh_addralign == 0`:
```
ExpectedOffset = alignTo(ExpectedOffset,
SecHdr.sh_addralign ? SecHdr.sh_addralign : 1);
```
it happens because `sh_addralign` is truncated to 32-bit value, but `alignTo`
doesn't accept `Align == 0`. We should change `1` to `1uLL`.
Differential revision: https://reviews.llvm.org/D92163
This commit factors out a WasmTableType definition from WasmTable, as is
the case for WasmGlobal and other data types. Also add support for
extracting the SymbolName for a table from the linking section's symbol
table.
Differential Revision: https://reviews.llvm.org/D91849
Currently we never dump the `sh_offset` key.
Though it sometimes an important information.
To reduce the noise this patch implements the following logic:
1) The "Offset" key for the first section is always emitted.
2) If we can derive the offset for a next section naturally,
then the "Offset" key is omitted.
By "naturally" I mean that section[X] offset is expected to be:
```
offsetOf(section[X]) == alignTo(section[X - 1].sh_offset + section[X - 1].sh_size, section[X].sh_addralign)
```
So, when it has the expected value, we omit it from the output.
Differential revision: https://reviews.llvm.org/D91152
Imagine we have a YAML declaration of few sections: `foo1`, `<unnamed 2>`, `foo3`, `foo4`.
To put them into segment we can do (1*):
```
Sections:
- Section: foo1
- Section: foo4
```
or we can use (2*):
```
Sections:
- Section: foo1
- Section: foo3
- Section: foo4
```
or (3*) :
```
Sections:
- Section: foo1
## "(index 2)" here is a name that we automatically created for a unnamed section.
- Section: (index 2)
- Section: foo3
- Section: foo4
```
It looks really confusing that we don't have to list all of sections.
At first I've tried to make this rule stricter and report an error when there is a gap
(i.e. when a section is included into segment, but not listed explicitly).
This did not work perfect, because such approach conflicts with unnamed sections/fills (see (3*)).
This patch drops "Sections" key and introduces 2 keys instead: `FirstSec` and `LastSec`.
Both are optional.
Differential revision: https://reviews.llvm.org/D90458
While generating yamls for my tests I noticed that the new debug_abbrev format (with multiple table support) was incorrectly assigning id's to the table because it was generating one per abbrev entry in the table. For instance, the first table would get id 4 when 5 abbrev entries existed in the table. By itself this is not a problem but the corresponding debug_info sections were still referencing id 0. This was introduced here: https://reviews.llvm.org/D83116.
Maybe a better fix is to actually correctly calculate the table id when emitting debug info? From a quick glance it seems to me the ID is just being calculated as the distance between the first DWARFAbbreviationDeclarationSet and the one the debug info entry points to, which means it's just its index and not the actual table id that was generated when emitting the debug_abbrev tables. With my fix I guess this is fine but on the diff that introduced this Pavel mentioned that he would like to have some sort of unique id between them but not necessarily +1 increasing, but for that to work we need to actually find the table ID, I guess by going directly to Y.DebugAbbrev but to honest I have no idea how to link the DWARFAbbreviationDeclarationSet and the Y.DebugAbbrev, so I just did this simple fix.
I also realized there's barely any tests for MachO so it might useful to invest on that if the tool is being reworked on.
Reviewed By: Higuoxing, jhenderson
Differential Revision: https://reviews.llvm.org/D87179
YAML support allows us to better test the feature in the subsequent patches. The implementation is quite similar to the .stack_sizes section.
Reviewed By: jhenderson, grimar
Differential Revision: https://reviews.llvm.org/D88717
`Link` is not an optional field currently.
Because of this it is not convenient to write macros.
This makes it optional and fixes corresponding test cases.
Differential revision: https://reviews.llvm.org/D90390
This teaches obj2yaml to dump valid regular (not thin) archives.
This also teaches yaml2obj to recognize archives YAML descriptions,
what allows to craft all different kinds of archives (valid and broken ones).
Differential revision: https://reviews.llvm.org/D89949
For testing purposes I need a way to build and install FileCheck and
yaml2obj. I had to choose between making FileCheck an LLVM tool and
making obj2yaml and yaml2obj utilities. I think the distinction is
rather arbitrary but my understanding is that tools are things meant for
the toolchain while utilities are more used for things like testing,
which is the case here.
The functional difference is that these tools now end up in the
${LLVM_UTILS_INSTALL_DIR}, which defaults to the ${LLVM_TOOLS_INSTALL_DIR}.
Unless you specified a different value or you added obj2yaml and
yaml2obj to ${LLVM_TOOLCHAIN_TOOLS}, this patch shouldn't change
anything.
Differential revision: https://reviews.llvm.org/D89357
Many sections either do not have a support of `Size`/`Content` or support just a
one of them, e.g only `Content`.
`Section` is the base class for sections. This patch adds `Content` and `Size` members
to it and removes similar members from derived classes. This allows to cleanup and
generalize the code and adds a support of these keys for all sections (`SHT_MIPS_ABIFLAGS`
is a only exception, it requires unrelated specific changes to be done).
I had to update/add many tests to test the new functionality properly.
Differential revision: https://reviews.llvm.org/D89039
Adds more testing in basic-assembly.s and a new test tables.s.
Adds support to yaml reading and writing of tables as well.
Differential Revision: https://reviews.llvm.org/D88815
This patch makes the opcode_base and the standard_opcode_lengths fields
of the line table optional. When both of them are not specified,
yaml2obj emits them according to the line table's version.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D88355
The `Group` class represents a group section and it is
named inconsistently with other sections which all has
the "Section" suffix. It is sometimes confusing,
this patch addresses the issue.
Differential revision: https://reviews.llvm.org/D88892