Two RUN lines produce outputs that each have some common parts and
some different parts. The common parts are checked under label A. The
differing parts are associated with a function and checked under labels B
and C, respectively.
When build_function_body_dictionary is called for the first RUN line, it
will attribute the function body to labels A and C. When the second RUN
is passed to build_function_body_dictionary, it sees that the function
body under A is different from what it has. If, in this second RUN line,
A is at the end of the prefixes list, A's body is still kept
associated with the first RUN's function.
When we output the function body (i.e. add_checks), we stop after
emitting for the first prefix matching that function. So we end up with
the wrong function body (first RUN's A-association).
There is no reason to special-case the last label in the prefixes list,
and the fix is to always clear a label association if we find a RUN line
where the body is different.
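The behaviour after the fix can be sketched as follows. This is a simplified model, not the actual script: `record_body` and `first_matching_body` are hypothetical stand-ins for build_function_body_dictionary and add_checks, and function bodies are plain strings.

```python
def record_body(func_dict, prefixes, func, body):
    """Record `body` for `func` under every prefix of one RUN line."""
    for prefix in prefixes:
        bodies = func_dict.setdefault(prefix, {})
        if func in bodies and bodies[func] != body:
            # Conflict with an earlier RUN line: clear the association,
            # regardless of where the prefix sits in the prefixes list.
            bodies[func] = None
            continue
        bodies[func] = body


def first_matching_body(func_dict, prefixes, func):
    """Return (prefix, body) for the first prefix that still has a body."""
    for prefix in prefixes:
        body = func_dict.get(prefix, {}).get(func)
        if body is not None:
            return prefix, body
    return None


func_dict = {}
record_body(func_dict, ["A", "C"], "f", "body from RUN 1")
record_body(func_dict, ["B", "A"], "f", "body from RUN 2")
run1_choice = first_matching_body(func_dict, ["A", "C"], "f")
run2_choice = first_matching_body(func_dict, ["B", "A"], "f")
```

In this model A ends up with no body, so the first RUN's body is emitted under C and the second RUN's body under B.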
Differential Revision: https://reviews.llvm.org/D93078
This is related to https://bugs.llvm.org/show_bug.cgi?id=40868.
Currently we don't print `OS Specific`/`Processor Specific`/`<unknown>`
prefixes when dumping the ELF file type. This is not consistent
with GNU readelf. The patch fixes it.
This patch also removes `types.test`: it duplicates `file-types.test`,
which tests more cases. The duplication was revealed while working on
this patch.
Differential revision: https://reviews.llvm.org/D93096
On macOS/arm, signature verification has kill semantics by default.
Signature verification is cached with a file's inode (actually, vnode),
and if a new executable is copied over an existing file (which reuses
the inode), the cache isn't invalidated. So when the new executable
is executed, the kernel still has the old content's signature cached
and kills the executable because the old signature doesn't match
the new contents (https://openradar.appspot.com/FB8914243).
As a workaround, rm the destination files first, to ensure they have
a fresh vnode (and hence no stale cached signature) after the copy.
Part of PR46647. See also e0e334a9c1ac for a similar change.
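A minimal sketch of the workaround, assuming a Python-based copy step. The helper name `copy_over_fresh` is hypothetical; the actual change applies the same idea in the build's own copy step.

```python
import os
import shutil


def copy_over_fresh(src, dst):
    """Copy src to dst, removing dst first so the copy creates a new file
    instead of overwriting the existing one in place."""
    try:
        os.remove(dst)
    except FileNotFoundError:
        pass  # Nothing to remove on the first copy.
    shutil.copyfile(src, dst)
    shutil.copymode(src, dst)  # Preserve the permission bits (e.g. +x).
```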
When llvm-rc loads an external file, it looks for it relative to
a number of include directories and the current working directory.
If the path is considered absolute, llvm-rc tries to open the
filename as such, and doesn't try to open it relative to other
paths.
On Windows, a path name like "\dir\file" isn't considered absolute
as it lacks the drive name, but by appending it on top of the search
dirs, it's not found.
LLVM's sys::path::append just appends such a path (same with a properly
absolute posix path) after the paths it's supposed to be relative to.
This fix doesn't handle the case where the resource script and the
external file are on a different drive than the current working
directory; to fix that, we'd have to make LLVM's sys::path::append
handle appending fully absolute and partially absolute paths (ones
lacking a drive prefix but containing a root directory), or switch
to C++17's std::filesystem.
Differential Revision: https://reviews.llvm.org/D92558
LLD and LLVMgold.so configured with -DENABLE_EXPERIMENTAL_NEW_PASS_MANAGER=on
use the new pass manager by default. Add an option to
use the legacy pass manager. This will also be used by the Clang driver
when -fno-new-pass-manager (D92915) / -fno-experimental-new-pass-manager is set.
Reviewed By: aeubanks, tejohnson
Differential Revision: https://reviews.llvm.org/D92916
This changes `printNotesHelper` to report warnings itself when
errors occur while dumping notes.
With that we can provide more content when reporting warnings about broken notes.
Differential revision: https://reviews.llvm.org/D92636
Multiple `SHT_SYMTAB_SHNDX` sections are allowed, though
we currently don't support that.
The current implementation assumes that there is at most one SHT_SYMTAB_SHNDX
section and that it is always linked with the .symtab section.
This patch drops these limitations.
Differential revision: https://reviews.llvm.org/D92644
MD5 is used.
Currently during sample profile loading, NameTable has to be loaded entirely
up front before any name string is retrieved. That is because NameTable is
stored using ULEB128 encoding and cannot be directly accessed like an array.
However, if MD5 is used to represent names in the NameTable, each entry has a
fixed length. If MD5 names are stored as uint64_t instead of ULEB128, the
NameTable can be accessed like an array, so in many cases only part of the
NameTable has to be read. This helps reduce compile time, especially when
a small source file is compiled. We find that after this change, the elapsed
time to build a large application distributively is reduced by 5% and the
cumulative CPU time used for building is also reduced by 5%. The size of
the profile is slightly reduced with this change, by ~0.2%, which also
indicates that encoding MD5 in ULEB128 doesn't save storage space.
Differential Revision: https://reviews.llvm.org/D92621
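The difference between the two encodings can be illustrated with a small model. This is only a sketch: the helper names are hypothetical, the hash values are made up, and little-endian storage of the fixed-width entries is an assumption.

```python
import struct


def encode_uleb128(value):
    """Encode a non-negative integer as ULEB128 (7 payload bits per byte)."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # More bytes follow.
        else:
            out.append(byte)
            return bytes(out)


def nth_uleb128(buf, index):
    """Decode entry `index`; every earlier entry has to be scanned first."""
    pos = 0
    for _ in range(index + 1):
        value, shift = 0, 0
        while True:
            byte = buf[pos]
            pos += 1
            value |= (byte & 0x7F) << shift
            shift += 7
            if not byte & 0x80:
                break
    return value


def nth_fixed(buf, index):
    """Read entry `index` directly from a table of 8-byte integers."""
    return struct.unpack_from("<Q", buf, index * 8)[0]


# Made-up 64-bit hash values standing in for MD5 names.
hashes = [0x0123456789ABCDEF, 0xFEDCBA9876543210, 0x8000000000000001]
uleb_table = b"".join(encode_uleb128(h) for h in hashes)
fixed_table = b"".join(struct.pack("<Q", h) for h in hashes)
```

Values that use most of their 64 bits need 9 or 10 ULEB128 bytes, so the fixed-width table is also the smaller one here.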
For an unknown reason, under sanitizer builds (asan/msan/ubsan) the output order of `std::unordered_map<string, ...>` is reversed, which makes the regression test fail.
This change works around the problem by using a sorted container to make the output deterministic.
Reviewed By: hoy, wenlei
Differential Revision: https://reviews.llvm.org/D92816
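The idea behind the workaround, shown as a loose Python analogue rather than the C++ change itself: emit entries in key order so the output no longer depends on the container's iteration order. The helper name `emit_sorted` is hypothetical.

```python
def emit_sorted(counts):
    """Render name/count pairs in key order, independent of container order."""
    return [f"{name}: {count}" for name, count in sorted(counts.items())]
```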
Our internal build bot hit a failure in llvm/test/tools/llvm-symbolizer/pdb/missing_pdb.test
because the test was checking for an error message that is emitted when a pdb file is
missing. But when the drive is mapped to a removable drive (such as a DVD drive) in
Windows, you get a different error message which causes the test to fail.
This fixes the test by changing the drive the missing pdb is expected to be on to C:\
instead of D:\ as that is the drive historically used to install Windows and thus
if present should be a hard drive.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D92787
This stack of changes introduces the `llvm-profgen` utility, which generates a profile data file from given perf script data files for sample-based PGO. It is part of (but not limited to) the CSSPGO work. Specifically, to support context-sensitive profiles with or without pseudo probes, it implements a series of functionalities including perf trace parsing, instruction symbolization, LBR stack/call frame stack unwinding, pseudo probe decoding, etc. High throughput is achieved by multiple levels of sample aggregation, and a compatible format with one stop is generated at the end. Please refer to https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s for the CSSPGO RFC.
This change adds context-sensitive profile data generation to llvm-profgen. With simultaneous sampling of LBR and call stack, we can identify the leaf of an LBR sample with the calling context from the stack sample. While deriving the fall-through path from LBR entries, we unwind the LBR by replaying all the calls and returns (including implicit calls/returns due to inlining) backwards on top of the sampled call stack. The state of the call stack as we unwind through the LBR then always represents the calling context of the current fall-through path.
We have two types of virtual unwinding: 1) LBR unwinding and 2) linear range unwinding.
Specifically, each LBR entry can be classified as a call, a return, or a regular branch. LBR unwinding replays the operation by pushing, popping, or switching the leaf frame of the call stack. Since the initial call stack is the most recently sampled state, the replay must be in anti-execution order: in the regular case, pop the call stack when the LBR entry is a call, and push a frame onto the call stack when the LBR entry is a return. After each LBR entry is processed, the unwinder also needs to align with the next LBR entry by going through the instructions from the previous LBR's target to the current LBR's source, which we call linear unwinding. As instructions in a linear range can come from different functions due to inlining, linear unwinding splits the range and records counters for each sub-range with the same inline context.
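The anti-execution replay can be sketched with a toy model. Everything here is an assumption made for illustration: locations are plain strings, the sampled stack is non-empty, the caller's location during a callee's execution is approximated by the return target, and linear unwinding and inlining are ignored. `unwind_lbr` is a hypothetical name, not the tool's API.

```python
def unwind_lbr(sampled_stack, lbr_entries):
    """Replay LBR entries backwards from the sampled call stack.

    sampled_stack: non-empty list of locations, leaf frame last.
    lbr_entries:   (kind, source, target) tuples, most recent entry first.
    Returns the call stack as it was just before each entry executed.
    """
    stack = list(sampled_stack)
    states = []
    for kind, source, target in lbr_entries:
        if kind == "call":
            # Before the call the callee frame did not exist: pop it and
            # place the caller at the call source.
            stack.pop()
            if stack:
                stack[-1] = source
            else:
                stack.append(source)
        elif kind == "return":
            # Before the return the callee frame existed: the caller sits
            # at the return target and the callee frame is pushed back.
            stack[-1] = target
            stack.append(source)
        else:
            # Regular branch: same frame, switch the leaf location.
            stack[-1] = source
        states.append(tuple(stack))
    return states


# Execution order was: branch inside bar, bar returns to main, main calls foo.
sampled = ["main:5", "foo:3"]
lbr = [
    ("call", "main:5", "foo:0"),
    ("return", "bar:9", "main:4"),
    ("branch", "bar:2", "bar:7"),
]
states = unwind_lbr(sampled, lbr)
```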
With each fall-through path from LBR unwinding, we aggregate each sample into counters by calling context and eventually generate a fully context-sensitive profile (without relying on inlining) to drive the compiler's PGO/FDO.
A breakdown of noteworthy changes:
- Added the `HybridSample` class as the abstraction of a perf sample, including LBR stack and call stack
- Extended `PerfReader` to auto-detect whether the input perf script output contains a CS profile and then parse it. Multiple `HybridSample` are extracted
- Sped up processing by aggregating `HybridSample` into `AggregatedSamples`
- Added `VirtualUnwinder`, which consumes aggregated `HybridSample` and implements unwinding of calls, returns, and linear paths that contain implicit calls/returns from inlining. Range and branch counters are aggregated by calling context. Here a calling context is a string; each frame is a pair of function name and callsite location info, and the whole context looks like `main:1 @ foo:2 @ bar`.
- Added `ProfileGenerator`, which accumulates counters by range unfolding or branch target mapping, then generates a context-sensitive function profile including the function body, inferring the callee's head samples and callsite target samples, and eventually records it into ProfileMap.
- Leveraged LLVM's built-in writer (`SampleProfWriter`) to support different serialization formats with no stop
- `getCanonicalFnName` for callee name and name from ELF section
- Added regression tests for both unwinding and profile generation
Test Plan:
ninja & ninja check-llvm
Reviewed By: hoy, wenlei, wmi
Differential Revision: https://reviews.llvm.org/D89723
This rewrites the logic to get rid of the "ELFSymbolRef" API where possible.
This allows us to handle possible errors better, improve the reported warnings, and add new ones.
Also, 'reportWarning' was replaced with 'reportUniqueWarning'.
Differential revision: https://reviews.llvm.org/D92545
Avoid calling getFlags on a non-existent symbol.
The way this is triggered is by calling strip -N on a binary, which sets
the MH_NLIST_OUTOFSYNC_WITH_DYLDINFO header flag. Then, in the
LC_FUNCTION_STARTS command, nm is trying to print the stripped symbols
and needs the proper checks.
TargetMachine::shouldAssumeDSOLocal currently implies dso_local for
Static. Split some tests so that these `external dso_local global`
will align with the Clang behavior.
They are currently implicit because TargetMachine::shouldAssumeDSOLocal implies
dso_local.
For external data, clang -fno-pic emits the dso_local specifier for ELF and
non-MinGW COFF. Adding explicit dso_local makes these tests align with the
clang behavior and helps implement an option to use GOT indirection for
external data access in -fno-pic mode (to avoid copy relocations).
This is similar to what we did earlier for fields of the Section class.
When a field is optional we can use the =<none> syntax in macros.
This was split from D92478.
Differential revision: https://reviews.llvm.org/D92565
This also teaches MachO writers/readers about the MachO cpu subtype,
beyond the minimal subtype reader support present at the moment.
This also defines a preprocessor macro to allow users to distinguish
__arm64__ from __arm64e__.
arm64e defaults to an "apple-a12" CPU, which supports v8.3a, allowing
pointer-authentication codegen.
It also currently defaults to ios14 and macos11.
Differential Revision: https://reviews.llvm.org/D87095
llvm-link should not rely on the '.a' file extension when deciding whether an
input file should be loaded as an archive. Archives may have other extensions
(e.g. .lib) or no extension at all. This patch changes llvm-link to use
llvm::file_magic to check whether an input file is an archive.
Reviewed By: RaviNarayanaswamy
Differential Revision: https://reviews.llvm.org/D92376
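A content-based check can be sketched as below. This is a Python illustration, not the llvm::file_magic implementation: it compares the first bytes against the standard `ar` magic strings for regular and thin archives, and `is_archive` is a hypothetical helper name.

```python
ARCHIVE_MAGICS = (b"!<arch>\n", b"!<thin>\n")


def is_archive(path):
    """Decide by content, not by extension, whether `path` is an archive."""
    with open(path, "rb") as f:
        return f.read(8) in ARCHIVE_MAGICS
```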
This implementation of `ELFDumper<ELFT>::printAttributes()` in llvm-readobj has issues:
1) It crashes when the content of the attribute section is empty.
2) It uses `unwrapOrError` and `reportWarning` calls, though
ideally we want to use `reportUniqueWarning`.
3) It contains a TODO about a redundant format version check.
`lib/Support/ELFAttributeParser.cpp` uses a hardcoded constant instead of the named constant.
This patch fixes all these issues.
Differential revision: https://reviews.llvm.org/D92318
This:
1) Changes `reportWarning` to `reportUniqueWarning` (no-op here).
2) Adds more context to the message.
3) Merges `broken-dynsym-link.test` into `dyn-symbols.test`, adds more testing.
Differential revision: https://reviews.llvm.org/D92380
The static_assert in "libcxx/include/memory" was the main offender here,
but then I figured I might as well `git grep -i instantat` and fix all
the instances I found. One was in user-facing HTML documentation;
the rest were in comments or tests.
Currently, when we dump sections, we dump them in the order
specified in the section header table.
As a result, the order in the output might not match the order in the file.
This patch starts sorting them by file offset when dumping.
When the order in the section header table doesn't match the order
in the file, we should emit the "SectionHeaderTable" key. This patch does that.
Differential revision: https://reviews.llvm.org/D91249
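The ordering rule can be sketched as follows, assuming sections are modelled as hypothetical `(name, file_offset)` pairs listed in section header table order. `dump_order` is an illustrative helper, not a function from the tool.

```python
def dump_order(sections):
    """Return sections sorted by file offset, plus a flag telling whether
    the "SectionHeaderTable" key is needed to preserve the header order."""
    by_offset = sorted(sections, key=lambda section: section[1])
    needs_table_key = by_offset != list(sections)
    return by_offset, needs_table_key
```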
This merges `invalid-attr-section-size.test` and `invalid-attr-version.test`
into `invalid-attributes-sec.test`.
This provides a single place where other related test cases can be added.
Differential revision: https://reviews.llvm.org/D92316
This is the first of two changes that make the remarks hotness threshold option
available in more tools. The changes also allow the threshold to sync with the
hotness threshold from the profile summary via the special value 'auto'.
This change modifies the interface of lto::setupLLVMOptimizationRemarks() to
accept remarks hotness threshold. Update all the tools that use it with remarks
hotness threshold options:
* lld: '--opt-remarks-hotness-threshold='
* llvm-lto2: '--pass-remarks-hotness-threshold='
* llvm-lto: '--lto-pass-remarks-hotness-threshold='
* gold plugin: '-plugin-opt=opt-remarks-hotness-threshold='
Differential Revision: https://reviews.llvm.org/D85809
If prefaced with a %, expand text macros and macro functions in any statement.
Also, prevent expanding text macros in the message of an ECHO directive unless expanded explicitly by the statement expansion operator.
Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D89740
The model was committed in 4b8ade837e36b7f0181ce86fc23f33851d0fdd35
but not yet enabled to allow for a few fix ups. This adds a few
of these fixes, and also an LLVM MCA test to check most instructions.
While I do have plans to look into some more tuning, it's time to
enable this as it is better than using the A53 schedule.
Differential Revision: https://reviews.llvm.org/D88017
This does the same as `--mcpu=help` but was only
documented in the user guide.
* Added a test for both options.
* Corrected the single dash in `-mcpu=help` text.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D92305
This patch starts emitting the `EShNum` key when `e_shnum = 0`
and the section header table exists.
`e_shnum` might be 0 when the number of entries in the section header
table is larger than or equal to SHN_LORESERVE (0xff00).
In this case the real number of entries
in the section header table is held in the `sh_size`
member of the initial entry in the section header table.
Currently, obj2yaml crashes when an object has `e_shoff != 0` and the `sh_size`
member of the initial entry in the section header table is `0`.
This patch fixes it.
Differential revision: https://reviews.llvm.org/D92098
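The rule for finding the real number of section headers can be sketched like this. It is a minimal model working on already-parsed integer fields; `section_header_count` is a hypothetical helper, not the obj2yaml code.

```python
def section_header_count(e_shoff, e_shnum, first_sh_size):
    """Return the number of entries in the section header table.

    e_shoff:       file offset of the section header table (0 if absent).
    e_shnum:       value of the ELF header field.
    first_sh_size: sh_size of the initial section header entry.
    """
    if e_shoff == 0:
        return 0  # No section header table at all.
    if e_shnum != 0:
        return e_shnum
    # e_shnum == 0 with a table present: the count lives in sh_size.
    return first_sh_size
```

A table with `e_shoff != 0`, `e_shnum == 0`, and an initial `sh_size` of 0 simply yields a count of 0 in this model.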
The following line asserts when `sh_addralign > MAX_UINT32 && (uint32_t)sh_addralign == 0`:
```
ExpectedOffset = alignTo(ExpectedOffset,
SecHdr.sh_addralign ? SecHdr.sh_addralign : 1);
```
This happens because `sh_addralign` is truncated to a 32-bit value, and `alignTo`
doesn't accept `Align == 0`. We should change `1` to `1uLL`.
Differential revision: https://reviews.llvm.org/D92163
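The effect of the truncation can be modelled in a few lines. This Python sketch only mirrors the arithmetic described above: `align_to` and `as_uint32` are hypothetical stand-ins, and the C++ type rules that cause the truncation are not modelled.

```python
def align_to(value, align):
    """Round `value` up to a multiple of `align`; zero is rejected."""
    if align == 0:
        raise ValueError("alignment must be non-zero")
    return (value + align - 1) // align * align


def as_uint32(value):
    """Keep only the low 32 bits, as a conversion to uint32_t would."""
    return value & 0xFFFFFFFF


# Larger than the 32-bit maximum, with all low 32 bits equal to zero.
sh_addralign = 0x100000000
truncated = as_uint32(sh_addralign)       # Becomes 0 after truncation.
aligned = align_to(0x10, sh_addralign)    # Fine with the full 64-bit value.
```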
This is related to MIPS. Currently we might report an error and exit,
even though we could report a warning and try to continue dumping
an object. The code uses `MipsGOTParser<ELFT> Parser`, which is isolated
in this method.
Differential revision: https://reviews.llvm.org/D92090
In text-item contexts, %expr expands to a string containing the results of evaluating `expr`.
Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D89736
This starts using `reportUniqueWarning` instead of `reportError`
in the code that is responsible for dumping notes.
Differential revision: https://reviews.llvm.org/D92021
`notes_begin()` is used for iterating over notes. In some cases this API prints
the section type and index. At the same time, during iteration the `Elf_Note_Iterator`
might omit them, as it doesn't have this info.
Because of this, warnings might contain redundant, duplicated information
(see D92021):
```
warning: '[[FILE]]': unable to read notes from the SHT_NOTE section with index 1: SHT_NOTE section [index 1] has invalid offset (0x40) or size (0xffff0000)
```
This change stops reporting section index/type in Object/ELF.h/notes_begin().
(FTR, this was introduced by me for llvm-readobj in D64470).
Instead we can describe sections/program headers on the caller side.
Differential revision: https://reviews.llvm.org/D92081