Passing '-dlopen <library-path>' to lli will cause the specified library to be
loaded (via llvm::sys::DynamicLibrary::LoadLibraryPermanently) before JIT'd code
is executed, making the library's symbols accessible to JIT'd code.
This avoids questionable code such as taking address of current
range-based for variable and comparing it with vector begin iterator.
While this may not be a problem in itself, it can be written more consice.
This was initially suggested by @aaronpuchert.
It seems like gcc 5.5 wants to iterate over the new variable instead
of the container that lives outside the loop. But of course this
new container is empty.
Plus using a different variable names makes the code more readable.
The goal of this patch is to maximize CPU utilization on multi-socket or high core count systems, so that parallel computations such as LLD/ThinLTO can use all hardware threads in the system. Before this patch, on Windows, a maximum of 64 hardware threads could be used at most, in some cases dispatched only on one CPU socket.
== Background ==
Windows doesn't have a flat cpu_set_t like Linux. Instead, it projects hardware CPUs (or NUMA nodes) to applications through a concept of "processor groups". A "processor" is the smallest unit of execution on a CPU, that is, an hyper-thread if SMT is active; a core otherwise. There's a limit of 32-bit processors on older 32-bit versions of Windows, which later was raised to 64-processors with 64-bit versions of Windows. This limit comes from the affinity mask, which historically is represented by the sizeof(void*). Consequently, the concept of "processor groups" was introduced for dealing with systems with more than 64 hyper-threads.
By default, the Windows OS assigns only one "processor group" to each starting application, in a round-robin manner. If the application wants to use more processors, it needs to programmatically enable it, by assigning threads to other "processor groups". This also means that affinity cannot cross "processor group" boundaries; one can only specify a "preferred" group on start-up, but the application is free to allocate more groups if it wants to.
This creates a peculiar situation, where newer CPUs like the AMD EPYC 7702P (64-cores, 128-hyperthreads) are projected by the OS as two (2) "processor groups". This means that by default, an application can only use half of the cores. This situation could only get worse in the years to come, as dies with more cores will appear on the market.
== The problem ==
The heavyweight_hardware_concurrency() API was introduced so that only *one hardware thread per core* was used. Once that API returns, that original intention is lost, only the number of threads is retained. Consider a situation, on Windows, where the system has 2 CPU sockets, 18 cores each, each core having 2 hyper-threads, for a total of 72 hyper-threads. Both heavyweight_hardware_concurrency() and hardware_concurrency() currently return 36, because on Windows they are simply wrappers over std:🧵:hardware_concurrency() -- which can only return processors from the current "processor group".
== The changes in this patch ==
To solve this situation, we capture (and retain) the initial intention until the point of usage, through a new ThreadPoolStrategy class. The number of threads to use is deferred as late as possible, until the moment where the std::threads are created (ThreadPool in the case of ThinLTO).
When using hardware_concurrency(), setting ThreadCount to 0 now means to use all the possible hardware CPU (SMT) threads. Providing a ThreadCount above to the maximum number of threads will have no effect, the maximum will be used instead.
The heavyweight_hardware_concurrency() is similar to hardware_concurrency(), except that only one thread per hardware *core* will be used.
When LLVM_ENABLE_THREADS is OFF, the threading APIs will always return 1, to ensure any caller loops will be exercised at least once.
Differential Revision: https://reviews.llvm.org/D71775
Replace use of widenPath in comparePaths with UTF8ToUTF16. widenPath
does a lot more than just conversion from UTF-8 to UTF-16. This is not
necessary for CompareStringOrdinal and could possibly even cause
problems.
Differential Revision: https://reviews.llvm.org/D74477
function_ref is non-owning, so if we get it as a parameter in constructor,
our reference goes out-of-scope as soon as constructor returns.
Instead, let's just take it as a parameter to the actual `generate()` call
Summary:
Currently, we only have nice exploration for LEA instruction,
while for the rest, we rely on `randomizeUnsetVariables()`
to sometimes generate something interesting.
While that works, it isn't very reliable in coverage :)
Here, i'm making an assumption that while we may want to explore
multi-instruction configs, we are most interested in the
characteristics of the main instruction we were asked about.
Which we can do, by taking the existing `randomizeMCOperand()`,
and turning it on it's head - instead of relying on it to randomly fill
one of the interesting values, let's pregenerate all the possible interesting
values for the variable, and then generate as much `InstructionTemplate`
combinations of these possible values for variables as needed/possible.
Of course, that requires invasive changes to no longer pass just the
naked `Instruction`, but sometimes partially filled `InstructionTemplate`.
As it can be seen from the test, this allows us to explore
`X86::OperandType::OPERAND_COND_CODE` for instructions
that take such an operand.
I'm hoping this will greatly simplify exploration.
Reviewers: courbet, gchatelet
Reviewed By: gchatelet
Subscribers: orodley, mgorny, sdardis, tschuett, jrtc27, atanasyan, mstojanovic, andreadb, RKSimon, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74156
Summary:
GNU objdump prints the file format in lowercase, e.g. `elf64-x86-64`. llvm-objdump prints `ELF64-x86-64` right now, even though piping that into llvm-objcopy refuses that as a valid arch to use.
As an example of a problem this causes, see: https://github.com/ClangBuiltLinux/linux/issues/779
Reviewers: MaskRay, jhenderson, alexshap
Reviewed By: MaskRay
Subscribers: tpimh, sbc100, grimar, jvesely, nhaehnle, kerbowa, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D74433
Summary:
Simplifies the C++11-style "-> decltype(...)" return-type deduction.
Note that you have to be careful about whether the function return type
is `auto` or `decltype(auto)`. The difference is that bare `auto`
strips const and reference, just like lambda return type deduction. In
some cases that's what we want (or more likely, we know that the return
type is a value type), but whenever we're wrapping a templated function
which might return a reference, we need to be sure that the return type
is decltype(auto).
No functional change.
Subscribers: dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74383
Followup to D74085.
Replace the use of `report_fatal_error()` with returning the error to
`llvm-exegesis.cpp` and handling it there.
To facilitate this, a new `Error` type has been added which is only used
to log errors to the yaml output.
Differential Revision: https://reviews.llvm.org/D74215
Summary: Commit 141915963b6ab36ee4e577d1b27673fa4d05b409 was reverted in
abe01e17f648a97666d4fbed41f0861686a17972 because it broke builds testing
without libpfm. A preparatory commit <commit_sha1> was added to enable
this recommit.
Original commit message:
Followup to D74085.
Replace the use of `report_fatal_error()` with returning the error to
`llvm-exegesis.cpp` and handling it there.
Differential Revision: https://reviews.llvm.org/D74113
Summary: Commit b3576f60ebc8f660afad8120a72473be47517573 was reverted in
abe01e17f648a97666d4fbed41f0861686a17972 because it broke builds testing
without libpfm. A preparatory commit <commit_sha1> was added to enable
this recommit.
Original commit message:
Fix inconsistencies in error reporting created by mixing
`report_fatal_error()` and `ExitOnErr()`, and add additional information
to the error message to make it more user friendly. Minimize the use
`report_fatal_error()` because it's meant for use in very rare cases and
it results in low information density of the error messages.
Summary of the new design:
* For command line argument errors output `llvm-exegesis: <error_message>`,
which is consistent with the error output format emitted by the backend
which checks correctness of the command line arguments.
* For other errors the format `llvm-exegesis error: <error_message>` is used.
** If the error occurred during file access `<error_message>` will have
of two parts: `'<file_name>': <rest_of_the_error_message>`
Differential Revision: https://reviews.llvm.org/D74085
All errors of type `Failure` are `StringError`s. In order for exit code
mapping to detect that specifically a clustering error has occurred it
needs to have a different type.
This patch also prepares D74085 where termination `report_fatal_error()`
will be replaced with emitting `StringError`s.
Differential Revision: https://reviews.llvm.org/D74124
It broke e.g. all tests under tools/llvm-exegesis/X86/ when libpfm is
not available, see comment on D74085.
This reverts commit b3576f60ebc8f660afad8120a72473be47517573 and
141915963b6ab36ee4e577d1b27673fa4d05b409.
Followup to D74085.
Replace the use of `report_fatal_error()` with returning the error to
`llvm-exegesis.cpp` and handling it there.
Differential Revision: https://reviews.llvm.org/D74113
Fix inconsistencies in error reporting created by mixing
`report_fatal_error()` and `ExitOnErr()`, and add additional information
to the error message to make it more user friendly. Minimize the use
`report_fatal_error()` because it's meant for use in very rare cases and
it results in low information density of the error messages.
Summary of the new design:
* For command line argument errors output `llvm-exegesis: <error_message>`,
which is consistent with the error output format emitted by the backend
which checks correctness of the command line arguments.
* For other errors the format `llvm-exegesis error: <error_message>` is used.
** If the error occurred during file access `<error_message>` will have
of two parts: `'<file_name>': <rest_of_the_error_message>`
Differential Revision: https://reviews.llvm.org/D74085
* Hide unrelated options.
* Add "OVERVIEW: " to yaml2obj -h/--help.
* Place options under a yaml2obj category.
* Disallow -docnum. Currently -docnum is the only yaml2obj specific long option that is affected.
* Specify `cl::init("-")` and `cl::Prefix` for OutputFilename. The
latter allows `-ofile`
Reviewed By: grimar, jhenderson
Differential Revision: https://reviews.llvm.org/D73982
Summary:
The output from llvm-reduce still has significantly more attributes than
bugpoint does. Teach llvm-reduce to remove attributes.
Reviewers: diegotf, dblaikie, george.burgess.iv
Subscribers: mgorny, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73853
Previously the description allowed to describe symbols with use of
`Name` and `Index` keys. This patch removes them and now it is still
possible to use either names or symbol indexes, but the code is simpler
and the format is slightly different.
Such a change will be useful for another patches, e.g:
https://reviews.llvm.org/D73788#inline-671077
Differential revision: https://reviews.llvm.org/D73888
This extends the RemarkStreamer to allow for other emitters (e.g.
frontends, SIL, etc.) to emit remarks through a common interface.
See changes in llvm/docs/Remarks.rst for motivation and design choices.
Differential Revision: https://reviews.llvm.org/D73676
Summary:
llvm-objdump -macho will no longer print "Contents of" headers when
disassembling section contents when -no-leading-headers is specified.
For historical reasons, this flag is independent of -no-leading-addr.
Reviewers: ab, pete, jhenderson
Reviewed By: jhenderson
Subscribers: rupprecht, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73574
Summary:
llvm-objdump started warning when asked to disassemble a section that
isn't present in the input files, in Yuanfang Chen's change:
d16c162c9453db855503134fe29ae4a3c0bec936. The problem is that the
logic was restricted only to the generic llvm-objdump parser, not to the
Mach-O-specific parser used for Apple toolchain compatibility. The
solution is to log section names from the Mach-O parser.
The macho-cstring-dump.test has been updated to fail if it encounters
this new warning in the future.
Reviewers: pete, ab, lhames, jhenderson, grimar, MaskRay, ychen
Reviewed By: jhenderson, grimar
Subscribers: rupprecht, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73586
Summary:
It turns out that CUR_DIRECTION is just an internal placeholder, not an actual
valid encoded value.
Reviewers: gchatelet
Subscribers: tschuett, mstojanovic, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73343