This reduces the number of check-llvm failures by 500.
Ideally we'd have a codegen version of PassRegistry.def, or have all the
codegen passes ported and put into PassRegistry.def. But since that
doesn't exist yet, hardcode the list of codegen IR passes.
There are still codegen passes missing from this list, I'll add them
later as I stumble upon them.
Reviewed By: asbirlea, ychen
Differential Revision: https://reviews.llvm.org/D84872
This introduces the printRelocationsHelper() which now contains the common
code used by both GNU and LLVM output styles.
Differential revision: https://reviews.llvm.org/D83935
This reverts commit d054c7ee2e9f4f98af7f22a5b00a941eb919bd59.
There are discussions about the utility name, its functionality and user interface.
Revert before we reach consensus.
This patch refactors the llvm tools namely, llvm-stress and sancov,
as well as the llvm TableGen utility, to use the new InitLLVM
interface which encapsulates PrettyStackTrace.
This is from https://reviews.llvm.org/D70702, but only for LLVM.
Reviewed-by: Kai
Differential Revision: https://reviews.llvm.org/D83484
Currently, when dumping section headers, llvm-readelf
prints "RELR" for SHT_ANDROID_RELR/SHT_RELR sections.
The behavior was introduced in D47919 and revealed in D84330.
But "SHT_ANDROID_RELR" has a different value from "SHT_RELR".
Also, "SHT_ANDROID_REL/SHT_ANDROID_RELA" are printed as "ANDROID_REL/ANDROID_RELA",
what makes the handling of the "SHT_ANDROID_RELR" inconsistent.
This patch makes llvm-readelf to print "ANDROID_RELR" instead of "RELR".
Differential revision: https://reviews.llvm.org/D84393
PGO profile is usually more precise than sample profile. However, PGO profile
needs to be collected from loadtest and loadtest may not be representative
enough to the production workload. Sample profile collected from production
can be used as a supplement -- for functions cold in loadtest but warm/hot
in production, we can scale up the related function in PGO profile if the
function is warm or hot in sample profile.
The implementation contains changes in compiler side and llvm-profdata side.
Given an instr profile and a sample profile, for a function cold in PGO
profile but warm/hot in sample profile, llvm-profdata will either mark
all the counters in the profile to be -1 or scale up the max count in the
function to be above hot threshold, depending on the zero counter ratio in
the profile. The assumption is if there are too many counters being zero
in the function profile, the profile is more likely to cause harm than good,
then llvm-profdata will mark all the counters to be -1 indicating the
function is hot but the profile is unaccountable. In compiler side, if a
function profile with all -1 counters is seen, the function entry count will
be set to be above hot threshold but its internal profile will be dropped.
In the long run, it may be useful to let compiler support using PGO profile
and sample profile at the same time, but that requires more careful design
and more substantial changes to make two profiles work seamlessly. The patch
here serves as a simple intermediate solution.
Differential Revision: https://reviews.llvm.org/D81981
This patch helps teach llvm-readelf to emit a correct number spaces when
dumping in hex format.
Before this patch, when the hex data doesn't fill the 4th column, some
spaces are missing.
```
Hex dump of section '.sec':
0x00000000 00000000 00000000 00000000 00000000 ................
0x00000010 00000000 00000000 00000000 0000 ..............
```
After this patch:
```
Hex dump of section '.sec':
0x00000000 00000000 00000000 00000000 00000000 ................
0x00000010 00000000 00000000 00000000 0000 ..............
```
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D84640
Starting with Skylake, the LBR contains the precise number of cycles between the two
consecutive branches.
Making use of this will hopefully make the measurements more precise than the
existing methods of using RDTSC.
Differential Revision: https://reviews.llvm.org/D77422
New change: check for existence of field `cycles` in perf_branch_entry before enabling this mode.
This should prevent compilation errors when building for older kernel whose headers don't support it.
Much like with function reduction, there may be remaining unhandled uses
of function, in particular in blockaddress. And in constants we can't
RAUW it with undef, because undef is not a function.
Instead, let's try to pretent that in the remaining cases, the new
signature didn't change, by bitcasting it.
A new (previously crashing) test case added.
There may be other users of a function other than CallInsts,
but what's more important, we can't actually replace function pointer
with undef, because for constants, that would not preserve the type
and RAUW would assert.
In particular, that affects blockaddress, however it proves to be
prohibitively complex to come up with a good test involving blockaddress:
we'd need to both ensure that the function body survives until
this pass, and is not interesting in this pass.
Summary:
Recursion detection can be non-trivial. Currently, the state-of-the-art for LLVM,
as far as i'm concerned, is D72362 `[clang-tidy] misc-no-recursion: a new check`.
However, it is quite limited:
* It does very basic call-graph based analysis, in the sense it will report even dynamically-unreachable recursion.
* It is inherently limited to a single TU
* It is hard to gauge how problematic each recursion is in practice.
Some of that can be addressed by adding clang analyzer-based check,
then it would at least support multiple TU's.
However, we can approach this problem from another angle - dynamic run-time analysis.
We already have means to capture a run-time callgraph (XRay, duh),
and there are already means to reconstruct it within `llvm-xray` tool.
This proposes to add a `-recursive-calls-only` switch to the `account` tool.
When the switch is on, when re-constructing callgraph for latency reconstruction,
each time we enter/leave some function, we increment/decrement an entry for the function
in a "recursion depth" map. If, when we leave the function, said entry was at `1`,
then that means the function didn't call itself, however if it is at `2` or more,
then that means the function (possibly indirectly) called itself.
If the depth is 1, we don't account the time spent there,
unless within this call stack the function already recursed into itself.
Note that we don't pay for recursion depth tracking when `recursive-calls-only` is not on,
and the perf impact is insignificant (+0.3% regression)
The overhead of the option is actually negative, around -5.26% user time on a medium-sized (3.5G) XRay log.
As a practical example, that 3.5G log is a capture of the entire middle-end opt pipeline
at `-O3` for RawSpeed unity build. There are total of `5500` functions in the log,
however `-recursive-calls-only` says that `269`, or 5%, are recursive.
Having this functionality could be helpful for recursion eradication.
Reviewers: dberris, mboerger
Reviewed By: dberris
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D84582
It doesn't really need to know where Timings are stored, it just needs
to be able to sort them, so MutableArrayRef is enough.
That uncovers an interesting quirk that it relied on
implicit double->int conversion for calculating percentiles.
We can happily turn function definitions into declarations,
thus obscuring their argument from being elided by this pass.
I don't believe there is a good reason to just ignore declarations.
likely even proper llvm intrinsics ones,
at worst the input becomes uninteresting.
The other question here is that all these transforms are all-or-nothing.
In some cases, should we be treating each use separately?
The main blocker here seemed to be that llvm::CloneFunctionInto()
does `&OldFunc->front()`, which inserts a nullptr into a densemap,
which is not happy about it and asserts.
replaceFunctionCalls() is very non-exhaustive, it only handles
CallInst's. Which means, by the time we drop old function,
there may still be uses of it lurking around.
Let's instead whack-a-mole them by all by replacing with undef.
I'm not sure this is the best handling, especially for calls, but IMO
poorly reduced input is much better than crashing reduction tool.
A (previously-crashing!) test added.
Fixes https://bugs.llvm.org/show_bug.cgi?id=46819
Terminator may have returned value, so we need to replace uses,
and in general handle invoke as a branch inst.
I'm not sure this is the best handling, but IMO poorly reduced
input is much better than crashing reduction tool.
A (previously-crashing!) test added.
Fixes https://bugs.llvm.org/show_bug.cgi?id=46818
ReduceFunctions could do it, but it also replaces *all* calls with undef,
so if any of undef replacements makes reduction uninteresting,
it won't work.
ReduceBasicBlocks also could do it, but well, it may take many guesses
for all the blocks of a function to happen to be out-of-chunk,
which is not a very efficient way to go about it.
So let's just do this first.
See https://lists.llvm.org/pipermail/llvm-dev/2020-July/143373.html
"[llvm-dev] Multiple documents in one test file" for some discussions.
`extract part filename` splits the input file into multiple parts separated by
regex `^(.|//)--- ` and extract the specified part to stdout or the
output file (if specified).
Use case A (organizing input of different formats (e.g. linker
script+assembly) in one file).
```
// RUN: extract lds %s -o %t.lds
// RUN: extract asm %s -o %t.s
// RUN: llvm-mc %t.s -o %t.o
// RUN: ld.lld -T %t.lds %t.o -o %t
This is sometimes better than the %S/Inputs/ approach because the user
can see the auxiliary files immediately and don't have to open another file.
```
Use case B (for utilities which don't have built-in input splitting
feature):
```
// RUN: extract case1 %s | llc | FileCheck %s --check-prefix=CASE1
// RUN: extract case2 %s | llc | FileCheck %s --check-prefix=CASE2
Combing tests prudently can improve readability.
This is sometimes better than having multiple test files.
```
Since this is a new utility, there is no git history concerns for
UpperCase variable names. I use lowerCase variable names like mlir/lld.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D83834
It is used for printing section headers in the GNU style
and the implementation can be simplified.
Differential revision: https://reviews.llvm.org/D84330
Summary:
If there was a single target to begin with, because a single target
can only occupy a single chunk, we couldn't increase granularity.
and would immediately give up.
Likewise, if we had multiple targets, if by the end we'd end up with
a single target, we wouldn't finish reducing it, it would always
end up being "interesting"
Reviewers: dblaikie, nickdesaulniers, diegotf
Reviewed By: dblaikie
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D84318
This patch includes the supporting code that enables always
instrumenting the function entry block by default.
This patch will NOT the default behavior.
It adds a variant bit in the profile version, adds new directives in
text profile format, and changes llvm-profdata tool accordingly.
This patch is a split of D83024 (https://reviews.llvm.org/D83024)
Many test changes from D83024 are also included.
Differential Revision: https://reviews.llvm.org/D84261
It was requested in D84173 thread to not do it, because otherwise we extract and
check the name of the symbol table in LLVM style, but do not use it and
might report a warning which perhaps might be confusing.
Differential revision: https://reviews.llvm.org/D84231
These functions can be used to generate strings like
"SHT_?? section with index ?" to describe sections in error/warning messages,
what helps to simplify and generalize them.
Also this allows to isolate the following common code pattern:
`&Sec - &cantFail(Obj->sections()).front();`
Differential revision: https://reviews.llvm.org/D84240
The current behavior was introduced by me in D37567 and it is a bit strange. It prints the
"Error: ...." message to the errs() manually and stops dumping the group section which has this error.
This behavior is consistent with GNU though, but it is very inconsistent with what the regular llvm-readelf
code usually does/prints, so I suggest to change the implementation:
1) Instead of printing "Error: ...." to errs() - just report a warning.
2) Try to continue dumping the section.
3) Merge broken-group.test to group.text.
This is what this patch does.
Differential revision: https://reviews.llvm.org/D84170
We have an issue currently: we are trying to read the name of the SHT_DYNSYM section
very early and using `unwrapOrError` call for that.
The name is needed only for the GNU output. Because of the current logic, the tool
fails to dump the whole object when something is wrong with the name of the .dynsym section.
This patch delays reading the name and also allows it to be broken.
Differential revision: https://reviews.llvm.org/D84173
Add support for flattening archives while creating static libraries.
Hence, can now pass archives as input in addition to Mach-O binaries.
Furthermore, archives themselves must only conatain Mach-O binaries. As
per cctools' libtool's behavior, llvm-libtool-darwin does not flatten
archives recursively.
Reviewed by alexshap, smeenai, jhenderson
Differential Revision: https://reviews.llvm.org/D83520
Add support for creating static libraries when the input includes only
Mach-O binaries (and not libraries/archives themselves).
Reviewed by alexshap, Ktwu, smeenai, jhenderson, MaskRay, mtrent
Differential Revision: https://reviews.llvm.org/D83002
OptNoneInstrumentation is part of StandardInstrumentations. It skips
functions (or loops) that are marked optnone.
The feature of skipping optional passes for optnone functions under NPM
is gated on a -enable-npm-optnone flag. Currently it is by default
false. That is because we still need to mark all required passes to be
required. Otherwise optnone functions will start having incorrect
semantics. After that is done in following changes, we can remove the
flag and always enable this.
Reviewed By: ychen
Differential Revision: https://reviews.llvm.org/D83519
In an object file, a "PC Begin" field in a FDE is usually relocated by a
PC-relative relocation. Use a relocation-aware DWARFDataExtractor overload (with
DWARFContext and a reference to its internal .eh_frame representation) to decode
addresses correctly. In an object file, most sections have addresses of zero. So
the displayed addresses are almost always offsets relative to the start of the
associated text section.
DWARFContext::create handles .eh_frame and .rela.eh_frame by itself, so if there
are more than one .eh_frame (technically possible, but almost always erronerous
in practice), this will only handle the first one. Supporting multiple
.eh_frame is beyond the scope of this patch.
Reviewed By: grimar, jhenderson
Differential Revision: https://reviews.llvm.org/D84106
In addition, move the definition of the class into the Debugify.h,
so we can use it from different levels.
The motivation for this is D82547.
Differential Revision: https://reviews.llvm.org/D83391
Newly-added test previously crashed.
While it is up for debate whether or not instruction reduction
should be indiscriminate in instruction dropping (there you can
just ensure that the test case is still -verify'ies), here
if we drop terminator, CloneFunctionInto() will immediately crash.
So let's not do that :)
This matches LLD and fixes https://sourceware.org/bugzilla/show_bug.cgi?id=26262#c1
.o is a bad choice for save-temps output because it is easy to override the bitcode file (*.o)
```
# Use bfd for the example, -fuse-ld=gold is similar.
clang -flto -c a.c # generate bitcode file a.o
clang -fuse-ld=bfd -flto a.o -o a -Wl,-plugin-opt=save-temps # override a.o
# The user repeats the command but get surprised, because a.o is now a combined module.
clang -fuse-ld=bfd -flto a.o -o a -Wl,-plugin-opt=save-temps
```
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D84132
This patch changes llvm-readelf (and llvm-readobj for consistency)
behavior to print an error when executed with no input files.
Reading from stdin can be achieved via a '-' for the input
object.
Fixes https://bugs.llvm.org/show_bug.cgi?id=46400
Differential Revision: https://reviews.llvm.org/D83704
Reviewed by: jhenderson, MaskRay, sbc, jyknight
There is a strange "feature" of the code: it handles all relocations as `Elf_Rela`.
For handling `Elf_Rel` it converts them to `Elf_Rela` and passes `bool IsRela` to
specify the real type everywhere.
A related issue is that the
`decode_relrs` helper in lib/Object has to return `Expected<std::vector<Elf_Rela>>`
because of that, though it could return a vector of `Elf_Rel`.
I think we should just start using templates for relocation types, it makes the code
cleaner and shorter. This patch does it.
Differential revision: https://reviews.llvm.org/D83871
It fixes/improves the following:
1) Some code was duplicated.
2) A "The .MIPS.abiflags section has a wrong size" error was not reported as a warning,
but was printed to stdout for the LLVM style. Also, it was reported as an error for the GNU style.
This patch changes the behavior to be consistent and to report warnings.
3) `unwrapOrError()` was used before, now a warning is reported instead.
Differential revision: https://reviews.llvm.org/D84033
The function extractArgumentsFromModule() was passing a one-based index to,
but replaceFunctionCalls() was expecting a zero-based argument index. This
resulted in assertion errors when reducing function call arguments with
different types. Additionally, the
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D84099
.gcno, .gcda and source files can be modified while we are reading them. If the
concurrent modification of a file being read nullifies the NUL terminator
assumption, llvm-cov can trip over an assertion failure in MemoryBuffer::init.
This is not so rare - the source files can be in an editor and .gcda can be
written by an running process (if the process forks, when .gcda gets written is
probably more unpredictable).
There is no accompanying test because an assertion failure requires data
races with some involved setting.
Under NPM, the asan-globals-md analysis is required but cannot be run
within the asan function pass due to module analyses not being able to
run from a function pass. So this pins all tests using "-asan" to the
legacy PM and adds a corresponding RUN line with
-passes='require<asan-globals-md>,function(asan)'.
Now all tests in Instrumentation/AddressSanitizer pass when
-enable-new-pm is by default on.
Tests were automatically converted using the following python script and
failures were manually fixed up.
import sys
for i in sys.argv:
with open(i, 'r') as f:
s = f.read()
with open(i, 'w') as f:
for l in s.splitlines():
if "RUN:" in l and ' -asan -asan-module ' in l and '\\' not in l:
f.write(l.replace(' -asan -asan-module ', ' -asan -asan-module -enable-new-pm=0 '))
f.write('\n')
f.write(l.replace(' -asan -asan-module ', " -passes='require<asan-globals-md>,function(asan),module(asan-module)' "))
f.write('\n')
elif "RUN:" in l and ' -asan ' in l and '\\' not in l:
f.write(l.replace(' -asan ', ' -asan -enable-new-pm=0 '))
f.write('\n')
f.write(l.replace(' -asan ', " -passes='require<asan-globals-md>,function(asan)' "))
f.write('\n')
else:
f.write(l)
f.write('\n')
See https://bugs.llvm.org/show_bug.cgi?id=46611.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D83921
This diff starts the implementation of llvm-libtool-darwin
(an llvm based replacement of cctool's libtool).
Libtool is used for creating static and dynamic libraries
from a bunch of object files given as input.
Reviewed by alexshap, smeenai, jhenderson, MaskRay
Differential Revision: https://reviews.llvm.org/D82923