mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-22 18:54:02 +01:00
llvm-mirror/lib/Support
Alexandre Ganea ae05eb086d [Support] On Windows, ensure hardware_concurrency() extends to all CPU sockets and all NUMA groups
The goal of this patch is to maximize CPU utilization on multi-socket or high core count systems, so that parallel computations such as LLD/ThinLTO can use all hardware threads in the system. Before this patch, on Windows, at most 64 hardware threads could be used, in some cases dispatched on only one CPU socket.

== Background ==
Windows doesn't have a flat cpu_set_t like Linux. Instead, it projects hardware CPUs (or NUMA nodes) to applications through a concept of "processor groups". A "processor" is the smallest unit of execution on a CPU, that is, a hyper-thread if SMT is active; a core otherwise. There's a limit of 32 processors on older 32-bit versions of Windows, which was later raised to 64 processors with 64-bit versions of Windows. This limit comes from the affinity mask, which historically has the width of a pointer, that is, sizeof(void*) bytes. Consequently, the concept of "processor groups" was introduced for dealing with systems with more than 64 hyper-threads.

By default, the Windows OS assigns only one "processor group" to each starting application, in a round-robin manner. If the application wants to use more processors, it needs to programmatically enable it, by assigning threads to other "processor groups". This also means that affinity cannot cross "processor group" boundaries; one can only specify a "preferred" group on start-up, but the application is free to allocate more groups if it wants to.

This creates a peculiar situation, where newer CPUs like the AMD EPYC 7702P (64 cores, 128 hyper-threads) are projected by the OS as two (2) "processor groups". This means that by default, an application can only use half of the cores. This situation is likely to get worse in the years to come, as dies with more cores appear on the market.
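
As a rough illustration of the group arithmetic above, here is a minimal sketch. It assumes a group holds at most 64 processors (the affinity mask width on 64-bit Windows); the constant and the helper name are hypothetical and not part of the patch. The real OS also takes sockets and NUMA nodes into account when forming groups, so this only gives a lower bound.

```cpp
#include <cassert>

// Assumption: one processor group covers at most 64 processors,
// matching the 64-bit affinity mask described above.
const unsigned MaxProcessorsPerGroup = 64;

// Hypothetical helper: the minimum number of processor groups needed to
// expose every logical processor. Windows may create more groups than this.
inline unsigned minProcessorGroups(unsigned LogicalProcessors) {
  assert(LogicalProcessors > 0 && "a machine has at least one processor");
  // Round up: a partially filled group still counts as a group.
  return (LogicalProcessors + MaxProcessorsPerGroup - 1) / MaxProcessorsPerGroup;
}
```

With the EPYC 7702P figures quoted above (128 hyper-threads), the helper yields two groups, so a process confined to its starting group sees only half of the machine.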

== The problem ==
The heavyweight_hardware_concurrency() API was introduced so that only *one hardware thread per core* was used. Once that API returns, the original intention is lost; only the number of threads is retained. Consider a situation, on Windows, where the system has 2 CPU sockets, 18 cores each, each core having 2 hyper-threads, for a total of 72 hyper-threads. Both heavyweight_hardware_concurrency() and hardware_concurrency() currently return 36, because on Windows they are simply wrappers over std::thread::hardware_concurrency() -- which can only return processors from the current "processor group".
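
The numbers in this example can be checked with a small sketch. The Machine struct and both helpers are hypothetical names used only for illustration, and the sketch assumes the OS maps each socket to its own "processor group", which is what the value 36 in the example suggests.

```cpp
#include <cassert>

// Hypothetical description of the machine in the example above.
struct Machine {
  unsigned Sockets;
  unsigned CoresPerSocket;
  unsigned ThreadsPerCore;
};

// All hyper-threads in the system, across every socket.
inline unsigned totalHardwareThreads(Machine M) {
  assert(M.Sockets > 0 && M.CoresPerSocket > 0 && M.ThreadsPerCore > 0);
  return M.Sockets * M.CoresPerSocket * M.ThreadsPerCore;
}

// Assumption: one processor group per socket, so a query limited to the
// current group only counts the hyper-threads of a single socket.
inline unsigned threadsVisibleInOneGroup(Machine M) {
  assert(M.CoresPerSocket > 0 && M.ThreadsPerCore > 0);
  return M.CoresPerSocket * M.ThreadsPerCore;
}
```

For Machine{2, 18, 2} the total is 72 while a single group exposes 36. That 36 also happens to equal the number of physical cores in the whole system, which is why the returned count alone cannot tell the two intentions apart.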

== The changes in this patch ==
To solve this situation, we capture (and retain) the initial intention until the point of usage, through a new ThreadPoolStrategy class. Computing the number of threads to use is deferred as late as possible, until the moment the std::threads are created (ThreadPool in the case of ThinLTO).

When using hardware_concurrency(), setting ThreadCount to 0 now means using all the available hardware CPU (SMT) threads. Providing a ThreadCount above the maximum number of threads has no effect; the maximum is used instead.
heavyweight_hardware_concurrency() is similar to hardware_concurrency(), except that only one thread per hardware *core* is used.
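
The rules above can be sketched as follows. This is not the ThreadPoolStrategy implementation from the patch; StrategySketch, its fields and resolveThreadCount() are hypothetical names, and the core and SMT counts are passed in by hand rather than queried from the OS.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for the retained intention: how many threads were
// asked for, and whether SMT threads may be used.
struct StrategySketch {
  unsigned ThreadsRequested; // 0 means "use the maximum"
  bool UseHyperThreads;      // false: one thread per hardware core
};

// Resolve the final thread count as late as possible, from the retained
// intention plus the machine topology (assumed known by the caller).
inline unsigned resolveThreadCount(StrategySketch S, unsigned Cores,
                                   unsigned ThreadsPerCore) {
  assert(Cores > 0 && ThreadsPerCore > 0);
  const unsigned Max = S.UseHyperThreads ? Cores * ThreadsPerCore : Cores;
  if (S.ThreadsRequested == 0)
    return Max;                            // 0 selects every available thread
  return std::min(S.ThreadsRequested, Max); // requests above the maximum are capped
}
```

On the 36-core, 2-way SMT machine from the earlier example, a request of 0 resolves to 72 threads with hyper-threads enabled and to 36 with one thread per core; a request of 100 is capped the same way.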

When LLVM_ENABLE_THREADS is OFF, the threading APIs will always return 1, to ensure any caller loops will be exercised at least once.

Differential Revision: https://reviews.llvm.org/D71775
2020-02-14 10:24:22 -05:00
Unix [Support] On Windows, ensure hardware_concurrency() extends to all CPU sockets and all NUMA groups 2020-02-14 10:24:22 -05:00
Windows [Support] On Windows, ensure hardware_concurrency() extends to all CPU sockets and all NUMA groups 2020-02-14 10:24:22 -05:00
AArch64TargetParser.cpp Fix some more -Wrange-loop-analysis warnings in AArch64TargetParser 2020-02-04 16:57:49 -08:00
ABIBreak.cpp
Allocator.cpp Revert "[Support] Explicitly instantiate BumpPtrAllocatorImpl" 2020-01-18 09:33:00 -08:00
AMDGPUMetadata.cpp [AMDGPU] add support for hostcall buffer pointer as hidden kernel argument 2019-11-20 15:53:55 +05:30
APFloat.cpp [APFloat] Fix FP remainder operation 2020-02-12 10:42:55 +02:00
APInt.cpp Make llvm::StringRef to std::string conversions explicit. 2020-01-28 23:25:25 +01:00
APSInt.cpp
ARMAttributeParser.cpp ARMAttributeParser - fix shadow variable name warnings from decodeULEB128 calls. NFCI. 2019-11-02 20:12:59 +00:00
ARMBuildAttrs.cpp
ARMTargetParser.cpp [ARM][TargetParser] Improve handling of dependencies between target features 2020-02-05 16:07:51 +00:00
ARMWinEH.cpp
Atomic.cpp
BinaryStreamError.cpp
BinaryStreamReader.cpp BinaryStream - fix static analyzer warnings. NFCI. 2019-11-08 13:20:24 +00:00
BinaryStreamRef.cpp
BinaryStreamWriter.cpp
BlockFrequency.cpp
BranchProbability.cpp
BuryPointer.cpp
CachePruning.cpp
Chrono.cpp
circular_raw_ostream.cpp
CMakeLists.txt Revert "[CMake] Link against ZLIB::ZLIB" 2020-02-06 13:55:28 -08:00
CodeGenCoverage.cpp
COM.cpp
CommandLine.cpp Make llvm::StringRef to std::string conversions explicit. 2020-01-28 23:25:25 +01:00
Compression.cpp build: reduce CMake handling for zlib 2020-01-02 11:19:12 -08:00
ConvertUTF.cpp
ConvertUTFWrapper.cpp
COPYRIGHT.regex
CrashRecoveryContext.cpp Fix MSVC build with C++ EH enabled 2020-02-11 15:56:10 -08:00
CRC.cpp Make llvm::crc32() work also for input sizes larger than 32 bits. 2020-02-05 21:32:11 +01:00
DAGDeltaAlgorithm.cpp
DataExtractor.cpp
Debug.cpp Make llvm::StringRef to std::string conversions explicit. 2020-01-28 23:25:25 +01:00
DebugCounter.cpp Make llvm::StringRef to std::string conversions explicit. 2020-01-28 23:25:25 +01:00
DeltaAlgorithm.cpp
DJB.cpp Revert "Forward declare Optional<T> in STLExtras.h" 2019-11-13 16:36:21 -08:00
DynamicLibrary.cpp
Errno.cpp
Error.cpp [Error] Remove a broken code fragment accidentally included in 76bcbaafab2. 2019-11-20 17:50:22 -08:00
ErrorHandling.cpp Revert "Revert "Reland "[Support] make report_fatal_error abort instead of exit""" 2020-02-13 10:16:06 -08:00
FileCheck.cpp Revert "Remove redundant "std::move"s in return statements" 2020-02-10 07:07:40 -08:00
FileCheckImpl.h FileCheck [9/12]: Add support for matching formats 2020-01-24 14:15:28 +00:00
FileCollector.cpp Make llvm::StringRef to std::string conversions explicit. 2020-01-28 23:25:25 +01:00
FileOutputBuffer.cpp [LLD][ELF] Support --[no-]mmap-output-file with F_no_mmap 2019-10-29 15:49:08 -07:00
FileUtilities.cpp
FoldingSet.cpp
FormattedStream.cpp
FormatVariadic.cpp
GlobPattern.cpp Reapply r375051: [support] GlobPattern: add support for \ and [!...], and allow ] in more places 2019-10-17 18:09:05 +00:00
GraphWriter.cpp Make llvm::StringRef to std::string conversions explicit. 2020-01-28 23:25:25 +01:00
Hashing.cpp
Host.cpp [Support] On Windows, ensure hardware_concurrency() extends to all CPU sockets and all NUMA groups 2020-02-14 10:24:22 -05:00
InitLLVM.cpp [Signal] Allow llvm clients to opt into one-shot SIGPIPE handling 2019-11-18 10:27:27 -08:00
IntEqClasses.cpp
IntervalMap.cpp
ItaniumManglingCanonicalizer.cpp Use std::foo_t rather than std::foo in LLVM. 2020-02-11 15:12:51 -08:00
JSON.cpp Revert "Remove redundant "std::move"s in return statements" 2020-02-10 07:07:40 -08:00
KnownBits.cpp [NFC][KnownBits] Add getMinValue() / getMaxValue() methods 2019-12-03 20:04:51 +03:00
LEB128.cpp
LineIterator.cpp
LLVMBuild.txt
Locale.cpp
LockFileManager.cpp Make llvm::StringRef to std::string conversions explicit. 2020-01-28 23:25:25 +01:00
LowLevelType.cpp
ManagedStatic.cpp
MathExtras.cpp
MD5.cpp
Memory.cpp
MemoryBuffer.cpp Revert "Remove redundant "std::move"s in return statements" 2020-02-10 07:07:40 -08:00
NativeFormatting.cpp Use std::foo_t rather than std::foo in LLVM. 2020-02-11 15:12:51 -08:00
Optional.cpp
Parallel.cpp [Support] On Windows, ensure hardware_concurrency() extends to all CPU sockets and all NUMA groups 2020-02-14 10:24:22 -05:00
Path.cpp Revert "Remove redundant "std::move"s in return statements" 2020-02-10 07:07:40 -08:00
PluginLoader.cpp
PrettyStackTrace.cpp
Process.cpp [Clang][Driver] After default -fintegrated-cc1, make llvm::report_fatal_error() generate preprocessed source + reproducer.sh again. 2020-02-11 10:17:30 -05:00
Program.cpp
RandomNumberGenerator.cpp
raw_os_ostream.cpp
raw_ostream.cpp raw_ostream - fix static analyzer warnings. NFCI. 2019-11-08 15:09:55 +00:00
regcomp.c
regengine.inc
regerror.c
regex2.h
regex_impl.h
Regex.cpp Make llvm::StringRef to std::string conversions explicit. 2020-01-28 23:25:25 +01:00
regexec.c
regfree.c
regstrlcpy.c
regutils.h
RWMutex.cpp
ScaledNumber.cpp
ScopedPrinter.cpp
SHA1.cpp [Support] Optimize SHA1 implementation 2019-11-11 22:14:28 -08:00
Signals.cpp Make llvm::StringRef to std::string conversions explicit. 2020-01-28 23:25:25 +01:00
Signposts.cpp
SmallPtrSet.cpp
SmallVector.cpp
SourceMgr.cpp Make llvm::StringRef to std::string conversions explicit. 2020-01-28 23:25:25 +01:00
SpecialCaseList.cpp Make llvm::StringRef to std::string conversions explicit. 2020-01-28 23:25:25 +01:00
Statistic.cpp Statistic - Fix MSVC shadow warning against global PrintOnExit static variable. NFC. 2019-11-21 12:08:01 +00:00
StringExtras.cpp Print quoted backslashes in LLVM IR as \\ instead of \5C 2019-10-10 18:31:57 +00:00
StringMap.cpp
StringPool.cpp
StringRef.cpp [APFloat] Fix checked error assert failures 2020-01-09 09:42:32 +02:00
StringSaver.cpp
SymbolRemappingReader.cpp
SystemUtils.cpp
TargetParser.cpp [NFC] Fixes -Wrange-loop-analysis warnings 2020-01-01 20:01:37 +01:00
TargetRegistry.cpp
TarWriter.cpp Make llvm::StringRef to std::string conversions explicit. 2020-01-28 23:25:25 +01:00
Threading.cpp [Support] On Windows, ensure hardware_concurrency() extends to all CPU sockets and all NUMA groups 2020-02-14 10:24:22 -05:00
ThreadLocal.cpp
ThreadPool.cpp [Support] On Windows, ensure hardware_concurrency() extends to all CPU sockets and all NUMA groups 2020-02-14 10:24:22 -05:00
TimeProfiler.cpp [Support] Wrap extern TLS variable in getter function 2020-01-31 11:32:36 +02:00
Timer.cpp Make llvm::StringRef to std::string conversions explicit. 2020-01-28 23:25:25 +01:00
ToolOutputFile.cpp Make llvm::StringRef to std::string conversions explicit. 2020-01-28 23:25:25 +01:00
TrigramIndex.cpp
Triple.cpp [NFC,format] Sort switch cases alphabetically 2020-01-09 18:37:24 +01:00
Twine.cpp
Unicode.cpp
UnicodeCaseFold.cpp
Valgrind.cpp
VersionTuple.cpp
VirtualFileSystem.cpp [VFS] More consistent support for Windows 2020-02-05 11:38:20 -08:00
Watchdog.cpp
WithColor.cpp
xxhash.cpp
YAMLParser.cpp [llvm] Replace SmallStr.str().str() with std::string conversion operator. 2020-01-29 21:16:46 -08:00
YAMLTraits.cpp Revert "Remove redundant "std::move"s in return statements" 2020-02-10 07:07:40 -08:00
Z3Solver.cpp