The goal of this patch is to maximize CPU utilization on multi-socket or high core count systems, so that parallel computations such as LLD/ThinLTO can use all hardware threads in the system. Before this patch, on Windows, a maximum of 64 hardware threads could be used at most, in some cases dispatched only on one CPU socket.
== Background ==
Windows doesn't have a flat cpu_set_t like Linux. Instead, it projects hardware CPUs (or NUMA nodes) to applications through a concept of "processor groups". A "processor" is the smallest unit of execution on a CPU, that is, an hyper-thread if SMT is active; a core otherwise. There's a limit of 32-bit processors on older 32-bit versions of Windows, which later was raised to 64-processors with 64-bit versions of Windows. This limit comes from the affinity mask, which historically is represented by the sizeof(void*). Consequently, the concept of "processor groups" was introduced for dealing with systems with more than 64 hyper-threads.
By default, the Windows OS assigns only one "processor group" to each starting application, in a round-robin manner. If the application wants to use more processors, it needs to programmatically enable it, by assigning threads to other "processor groups". This also means that affinity cannot cross "processor group" boundaries; one can only specify a "preferred" group on start-up, but the application is free to allocate more groups if it wants to.
This creates a peculiar situation, where newer CPUs like the AMD EPYC 7702P (64-cores, 128-hyperthreads) are projected by the OS as two (2) "processor groups". This means that by default, an application can only use half of the cores. This situation could only get worse in the years to come, as dies with more cores will appear on the market.
== The problem ==
The heavyweight_hardware_concurrency() API was introduced so that only *one hardware thread per core* was used. Once that API returns, that original intention is lost, only the number of threads is retained. Consider a situation, on Windows, where the system has 2 CPU sockets, 18 cores each, each core having 2 hyper-threads, for a total of 72 hyper-threads. Both heavyweight_hardware_concurrency() and hardware_concurrency() currently return 36, because on Windows they are simply wrappers over std:🧵:hardware_concurrency() -- which can only return processors from the current "processor group".
== The changes in this patch ==
To solve this situation, we capture (and retain) the initial intention until the point of usage, through a new ThreadPoolStrategy class. The number of threads to use is deferred as late as possible, until the moment where the std::threads are created (ThreadPool in the case of ThinLTO).
When using hardware_concurrency(), setting ThreadCount to 0 now means to use all the possible hardware CPU (SMT) threads. Providing a ThreadCount above to the maximum number of threads will have no effect, the maximum will be used instead.
The heavyweight_hardware_concurrency() is similar to hardware_concurrency(), except that only one thread per hardware *core* will be used.
When LLVM_ENABLE_THREADS is OFF, the threading APIs will always return 1, to ensure any caller loops will be exercised at least once.
Differential Revision: https://reviews.llvm.org/D71775
This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.
This is mostly mechanical from a custom clang-tidy check, with a lot of
manual fixups. It uncovers a lot of minor inefficiencies.
This doesn't actually modify StringRef yet, I'll do that in a follow-up.
Summary:
The primary goal of this refactoring is to separate DWARF optimizing part.
So that it could be reused by linker or by any other client.
There was a thread on llvm-dev discussing the necessity of such a refactoring:
http://lists.llvm.org/pipermail/llvm-dev/2019-September/135068.html.
This is a final part from series of patches for dsymutil.
Previous patches : D71068, D71839, D72476. This patch:
1. Creates lib/DWARFLinker interface :
void addObjectFile(DwarfLinkerObjFile &ObjFile);
bool link();
void setOptions;
1. Moves all linking logic from tools/dsymutil/DwarfLinkerForBinary
into lib/DWARFLinker.
2. Renames RelocationManager into AddressesManager.
3. Remarks creation logic moved from separate parallel execution
into object file loading routine.
Testing: it passes "check-all" lit testing. MD5 checksum for clang .dSYM bundle
matches for the dsymutil with/without that patch.
Reviewers: JDevlieghere, friss, dblaikie, aprantl, jdoerfert
Reviewed By: JDevlieghere
Subscribers: merge_guards_bot, hiraditya, jfb, llvm-commits, probinson, thegameg
Tags: #llvm, #debug-info
Differential Revision: https://reviews.llvm.org/D72915
It's been an empty target since r360498 and friends
(`git log --grep='Move InstPrinter files to MCTargetDesc.' llvm/lib/Target`),
but due to hwo the way these targets are structured it was silently
an empty target without anyone noticing.
No behavior change.
This reverts D53469, which changed llvm's DWARF emission to emit
DW_AT_call_return_pc as a function-local offset. Such an encoding is not
compatible with post-link block re-ordering tools and isn't standards-
compliant.
In addition to reverting back to the original DW_AT_call_return_pc
encoding, teach lldb how to fix up DW_AT_call_return_pc when the address
comes from an object file pointed-to by a debug map. While doing this I
noticed that lldb's support for tail calls that cross a DSO/object file
boundary wasn't covered, so I added tests for that. This latter case
exercises the newly added return PC fixup.
The dsymutil changes in this patch were originally included in D49887:
the associated test should be sufficient to test DW_AT_call_return_pc
encoding purely on the llvm side.
Differential Revision: https://reviews.llvm.org/D72489
Summary:
This is the next portion of patches for dsymutil.
Create DwarfEmitter interface to generate all debug info tables.
Put DwarfEmitter into DwarfLinker library and make tools/dsymutil/DwarfStreamer
to be child of DwarfEmitter.
It passes check-all testing. MD5 checksum for clang .dSYM bundle matches
for the dsymutil with/without that patch.
Reviewers: JDevlieghere, friss, dblaikie, aprantl
Reviewed By: JDevlieghere
Subscribers: merge_guards_bot, hiraditya, thegameg, probinson, llvm-commits
Tags: #llvm, #debug-info
Differential Revision: https://reviews.llvm.org/D72476
Summary:
This patch relands D71271. The problem with D71271 is that it has cyclic dependency:
CodeGen->AsmPrinter->DebugInfoDWARF->CodeGen. To avoid cyclic dependency this patch
puts implementation for DWARFOptimizer into separate library: lib/DWARFLinker.
Thus the difference between this patch and D71271 is in that DWARFOptimizer renamed into
DWARFLinker and it`s files are put into lib/DWARFLinker.
Reviewers: JDevlieghere, friss, dblaikie, aprantl
Reviewed By: JDevlieghere
Subscribers: thegameg, merge_guards_bot, probinson, mgorny, hiraditya, llvm-commits
Tags: #llvm, #debug-info
Differential Revision: https://reviews.llvm.org/D71839
Static archives contain object files which contain sections pointing to
external remark files.
When static archives are shipped without the remark files, dsymutil
shouldn't generate an error.
Instead, generate a warning to inform the user that remarks for that
library won't be available in the .dSYM.
as it causes a layering violation/dependency cycle:
llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp -> llvm/DebugInfo/DWARF/DWARFExpression.h
llvm/include/llvm/DebugInfo/DWARF/DWARFOptimizer.h -> llvm/CodeGen/NonRelocatableStringpool.h
This reverts commit abc7f6800df8a1f40e1e2c9ccce826abb0208284.
That patch is extracted from the D70709. It moves CompileUnit, DeclContext
into llvm/DebugInfo/DWARF. It also adds new file DWARFOptimizer with
AddressesMap class. AddressesMap generalizes functionality
from RelocationManager.
Differential Revision: https://reviews.llvm.org/D71271
That refactoring moves NonRelocatableStringpool into common CodeGen folder.
So that NonRelocatableStringpool could be used not only inside dsymutil.
Differential Revision: https://reviews.llvm.org/D71068
The functions lookForDIEsToKeep and keepDIEAndDependencies are mutually
recursive and can cause a stackoverflow for large projects. While this
has always been the case, it became a bigger issue when we parallelized
dsymutil, because threads get only a fraction of the stack space.
This patch removes the final recursive call from lookForDIEsToKeep. The
call was used to look at the current DIE's parent chain and mark
everything as kept.
This was tested by running dsymutil on clang built in debug (both with
and without modules) and comparing the MD5 hash of the generated dSYM
companion file.
Differential revision: https://reviews.llvm.org/D70994
The functions lookForDIEsToKeep and keepDIEAndDependencies are mutually
recursive and can cause a stackoverflow for large projects. While this
has always been the case, it became a bigger issue when we parallelized
dsymutil, because threads get only a fraction of the stack space.
In an attempt to tackle this issue, we removed part of the recursion in
r338536 by introducing a worklist. Processing of child DIEs was no
longer recursive. However, we still received bug reports where we'd run
out of stack space.
This patch removes another recursive call from lookForDIEsToKeep. The
call was used to look at DIEs that reference the current DIE. To make
this possible, we inlined keepDIEAndDependencies and added this work to
the existing worklist. Because the function is not tail recursive, we
needed to add two more types of worklist entries to perform the
subsequent work.
This was tested by running dsymutil on clang built in debug (both with
and without modules) and comparing the MD5 hash of the generated dSYM
companion file.
Differential revision: https://reviews.llvm.org/D70990
This adds support to dsymutil for linking remark files and placing them
in the final .dSYM bundle.
The result will be placed in:
* a.out.dSYM/Contents/Resources/Remarks/a.out
or
* a.out.dSYM/Contents/Resources/Remarks/a.out-<arch> for universal binaries
When multi-threaded, this runs a third thread which loops over all the
object files and parses remarks as it finds __remarks sections.
Testing this involves running dsymutil on pre-built binaries and object
files, then running llvm-bcanalyzer on the final result to check for
remarks.
Differential Revision: https://reviews.llvm.org/D69142
Ensure we walk the children of common blocks when deciding what DIEs to
keep. Otherwise we might incorrectly discard them leading to missing
variables in the linked debug info.
This also sorts the list of DW_TAGs alphabetically.
MipsMCAsmInfo was using '$' prefix for Mips32 and '.L' for Mips64
regardless of -target-abi option. By passing MCTargetOptions to MCAsmInfo
we can find out Mips ABI and pick appropriate prefix.
Tags: #llvm, #clang, #lldb
Differential Revision: https://reviews.llvm.org/D66795
After changing dsymutil to use libOption, we lost error reporting for
missing required arguments (input files). Additionally, we stopped
complaining about unknown arguments. This patch fixes both and adds a
test.
llvm-svn: 375044
Since r374600 clang emits base address selection entries. Currently
dsymutil does not support these entries and incorrectly interprets them
as location list entries.
This patch adds support for base address selection entries in dsymutil
and makes sure they are relocated correctly.
Thanks to Dave for coming up with the test case!
Differential revision: https://reviews.llvm.org/D69005
llvm-svn: 374957
The original patch got reverted because it hit a long-standing legacy
issue on Windows that prevents files from being named `com`. Thanks
Kristina & Jeremy for pointing this out.
llvm-svn: 374178
The added test files ("com", "com1.o", "com2.o") are reserved names on
Windows, and makes 'git checkout' fail with a filesystem error.
llvm-svn: 374144
For common symbols the linker emits only a single symbol entry in the
debug map. This caused dsymutil to not relocate common symbols when
linking DWARF coming form object files that did not have this entry.
This patch fixes that by keeping track of common symbols in the object
files and synthesizing a debug map entry for them using the address from
the main binary.
Differential revision: https://reviews.llvm.org/D68680
llvm-svn: 374139
The verbose output for finding relocations assumed that we'd always dump
the DIE after (which starts with a newline) and therefore didn't include
one itself. However, this isn't always true, leading to garbled output.
This patch adds a newline to the verbose output and adds a line that
says that the DIE is being kept (which isn't obvious otherwise). It also
adds a 0x prefix to the relocations.
llvm-svn: 374123
The lambda is taking the stack-allocated Verify boolean by reference and
it would go out of scope on the next iteration. Moving it out of the
loop should fix the issue.
Fixes https://bugs.llvm.org/show_bug.cgi?id=43549
llvm-svn: 373683
The dsymutil implementation file has a using-directive for the llvm
namespace. This patch just removes redundant namespace qualifiers.
llvm-svn: 373623
This patch reimplements command line option parsing in dsymutil with
Tablegen and libOption. The main motivation for this change is to
prevent clashes with other cl::opt options defined in llvm. Although
it's a bit more heavyweight, it has some nice advantages such as no
global static initializers and better separation between the code and
the option definitions.
I also used this opportunity to improve how dsymutil deals with
incompatible options. Instead of having checks spread across the code,
everything is now grouped together in verifyOptions. The fact that the
options are no longer global means that we need to pass them around a
bit more, but I think it's worth the trade-off.
Differential revision: https://reviews.llvm.org/D68361
llvm-svn: 373622
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Reviewers: courbet, JDevlieghere, alexshap, rupprecht, jhenderson
Subscribers: sdardis, nemanjai, hiraditya, kbarton, jakehehrlich, jrtc27, MaskRay, atanasyan, jsji, seiya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D67499
llvm-svn: 371742
Now that we've moved to C++14, we no longer need the llvm::make_unique
implementation from STLExtras.h. This patch is a mechanical replacement
of (hopefully) all the llvm::make_unique instances across the monorepo.
llvm-svn: 369013
Changes: no changes. A fix for the clang code will be landed right on top.
Original commit message:
SectionRef::getName() returns std::error_code now.
Returning Expected<> instead has multiple benefits.
For example, it forces user to check the error returned.
Also Expected<> may keep a valuable string error message,
what is more useful than having a error code.
(Object\invalid.test was updated to show the new messages printed.)
This patch makes a change for all users to switch to Expected<> version.
Note: in a few places the error returned was ignored before my changes.
In such places I left them ignored. My intention was to convert the interface
used, and not to improve and/or the existent users in this patch.
(Though I think this is good idea for a follow-ups to revisit such places
and either remove consumeError calls or comment each of them to clarify why
it is OK to have them).
Differential revision: https://reviews.llvm.org/D66089
llvm-svn: 368826
SectionRef::getName() returns std::error_code now.
Returning Expected<> instead has multiple benefits.
For example, it forces user to check the error returned.
Also Expected<> may keep a valuable string error message,
what is more useful than having a error code.
(Object\invalid.test was updated to show the new messages printed.)
This patch makes a change for all users to switch to Expected<> version.
Note: in a few places the error returned was ignored before my changes.
In such places I left them ignored. My intention was to convert the interface
used, and not to improve and/or the existent users in this patch.
(Though I think this is good idea for a follow-ups to revisit such places
and either remove consumeError calls or comment each of them to clarify why
it is OK to have them).
Differential revision: https://reviews.llvm.org/D66089
llvm-svn: 368812
Some of these names were abbreviated, some were not, some pluralised,
some not. Made the API difficult to use - since it's an exact 1:1
mapping to the DWARF sections - use those names (changing underscore
separation for camel casing).
llvm-svn: 368189
This updates all libraries and tools in LLVM Core to use 64-bit offsets
which directly or indirectly come to DataExtractor.
Differential Revision: https://reviews.llvm.org/D65638
llvm-svn: 368014
In r367348, I changed dsymutil to pass the LinkOptions by value isntead
of by const reference. However, the options were still captured by
reference in the LinkLambda. This patch fixes that by passing them in by
value.
llvm-svn: 367635