llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 11:13:28 +01:00

Author	SHA1	Message	Date
dfukalov	b7b67e3e9a	[NFC] Reduce include files dependency and AA header cleanup (part 2). Continuing work started in https://reviews.llvm.org/D92489: Removed a bunch of includes from "AliasAnalysis.h" and "LoopPassManager.h". Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D92852	2020-12-17 14:04:48 +03:00
Ahmed Bougacha	fe6a3c2668	[Triple][MachO] Define "arm64e", an AArch64 subarch for Pointer Auth. This also teaches MachO writers/readers about the MachO cpu subtype, beyond the minimal subtype reader support present at the moment. This also defines a preprocessor macro to allow users to distinguish __arm64__ from __arm64e__. arm64e defaults to an "apple-a12" CPU, which supports v8.3a, allowing pointer-authentication codegen. It also currently defaults to ios14 and macos11. Differential Revision: https://reviews.llvm.org/D87095	2020-12-03 07:53:59 -08:00
dfukalov	b944ac9e0a	[NFC] Reduce include files dependency. 1. Removed #include "...AliasAnalysis.h" in other headers and modules. 2. Cleaned up includes in AliasAnalysis.h. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D92489	2020-12-03 18:25:05 +03:00
Arthur Eubanks	3801be51e1	[LTO][NewPM] Run verifier when doing LTO This matches the legacy PM. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D92138	2020-12-01 10:14:53 -08:00
Wei Wang	d0b74589e5	[Remarks][1/2] Expand remarks hotness threshold option support in more tools This is the #1 of 2 changes that make remarks hotness threshold option available in more tools. The changes also allow the threshold to sync with hotness threshold from profile summary with special value 'auto'. This change modifies the interface of lto::setupLLVMOptimizationRemarks() to accept remarks hotness threshold. Update all the tools that use it with remarks hotness threshold options: * lld: '--opt-remarks-hotness-threshold=' * llvm-lto2: '--pass-remarks-hotness-threshold=' * llvm-lto: '--lto-pass-remarks-hotness-threshold=' * gold plugin: '-plugin-opt=opt-remarks-hotness-threshold=' Differential Revision: https://reviews.llvm.org/D85809	2020-11-30 21:55:49 -08:00
Ella Ma	59b89a3124	[llvm][clang][mlir] Add checks for the return values from Target::createXXX to prevent potential null deref All these potential null pointer dereferences are reported by my static analyzer for null smart pointer dereferences, which has a different implementation from `alpha.cplusplus.SmartPtr`. The checked pointers in this patch are initialized by Target::createXXX functions. When the creator function pointer is not correctly set, a null pointer will be returned, or the creator function may originally return a null pointer. Some of them may not make sense as they may be checked before entering the function, but I fixed them all in this patch. I submit this fix because 1) similar checks are found in some other places in the LLVM codebase for the same return value of the function; and, 2) some of the pointers are dereferenced before they are checked, which may definitely trigger a null pointer dereference if the return value is nullptr. Reviewed By: tejohnson, MaskRay, jpienaar Differential Revision: https://reviews.llvm.org/D91410	2020-11-21 21:04:12 -08:00
serge-sans-paille	82b6e6053d	llvmbuildectomy - replace llvm-build by plain cmake No longer rely on an external tool to build the llvm component layout. Instead, leverage the existing `add_llvm_componentlibrary` cmake function and introduce `add_llvm_component_group` to accurately describe component behavior. These functions store extra properties in the created targets. These properties are processed once all components are defined to resolve library dependencies and produce the header expected by llvm-config. Differential Revision: https://reviews.llvm.org/D90848	2020-11-13 10:35:24 +01:00
Arthur Eubanks	3102160c9b	[NFC] Clean up PassBuilder Make DebugLogging a member variable so that users of PassBuilder don't need to pass it around so much. Move call to TargetMachine::registerPassBuilderCallbacks() within PassBuilder so users don't need to remember to call it. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D90437	2020-10-30 10:03:59 -07:00
Mircea Trofin	5f21904e7d	[ThinLTO] Fix .llvmcmd emission llvm::EmbedBitcodeInModule needs (what used to be called) EmbedMarker set, in order to emit .llvmcmd. EmbedMarker is really about embedding the command line, so renamed the parameter accordingly, too. This was not caught at test because the check-prefix was incorrect, but FileCheck does not report that when multiple prefixes are provided. A separate patch will address that. Differential Revision: https://reviews.llvm.org/D90278	2020-10-28 17:45:30 -07:00
Mircea Trofin	6b13330c44	[NFC][ThinLTO] Change command line passing to EmbedBitcodeInModule Changing to pass by ref - less null checks to worry about. Differential Revision: https://reviews.llvm.org/D90330	2020-10-28 12:33:39 -07:00
Alexandre Ganea	41c38f2370	Re-land [ThinLTO] Re-order modules for optimal multi-threaded processing This reverts 9b5b3050237db3642ed7ab1bdb3ffa2202511b99 and fixes the unwanted re-ordering when generating ThinLTO indexes. The goal of this patch is to better balance thread utilization during ThinLTO in-process linking (in llvm-lto2 or in LLD). Before this patch, large modules would often be scheduled late during execution, taking a long time to complete, thus starving the thread pool. We now sort modules in descending order, based on each module's bitcode size, so that larger modules are processed first. By doing so, smaller modules have a better chance to keep the thread pool active, and thus avoid starvation when the bitcode compilation is almost complete. In our case (on dual Intel Xeon Gold 6140, Windows 10 version 2004, two-stage build), this saves 15 sec when linking `clang.exe` with LLD & -flto=thin, /opt:lldltojobs=all, no ThinLTO cache, -DLLVM_INTEGRATED_CRT_ALLOC=d:\git\rpmalloc. Before patch: 100 sec After patch: 85 sec Inspired by the work done by David Callahan in D60495. Differential Revision: https://reviews.llvm.org/D87966	2020-10-13 21:54:15 -04:00
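
The ordering described in the entry above (largest bitcode first, so that small modules fill the tail of the schedule) can be illustrated with a minimal sketch. This is not the code from D87966; the function name `orderModulesLargestFirst` and the plain vector of sizes are assumptions made for illustration only.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: BitcodeSizes[i] is the size in bytes of module i.
// Returns the module indices in the order they would be handed to the
// thread pool, largest module first.
std::vector<std::size_t>
orderModulesLargestFirst(const std::vector<std::size_t> &BitcodeSizes) {
  std::vector<std::size_t> Order(BitcodeSizes.size());
  std::iota(Order.begin(), Order.end(), 0); // 0, 1, 2, ...
  // Descending by size; stable_sort keeps the input order for equal sizes.
  std::stable_sort(Order.begin(), Order.end(),
                   [&](std::size_t L, std::size_t R) {
                     return BitcodeSizes[L] > BitcodeSizes[R];
                   });
  return Order;
}
```

Sorting indices rather than the modules themselves leaves the original module order untouched, which matters for anything that depends on it (the re-land above mentions index generation as one such case).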
Jordan Rupprecht	5f9071dec6	Temporarily revert "[ThinLTO] Re-order modules for optimal multi-threaded processing" This reverts commit 6537004913f3009d896bc30856698e7d22199ba7. This is causing test failures internally, and while a few of the cases turned out to be bad user code (relying on a specific order of static initialization across translation units), some cases are less clear. Temporarily reverting for now, and Teresa is going to follow up with more details.	2020-10-09 14:36:20 -07:00
Mircea Trofin	3c9a842f53	[ThinLTO] Option to bypass function importing. This completes the circle, complementing -lto-embed-bitcode (specifically, post-merge-pre-opt). Using -thinlto-assume-merged skips function importing. The index file is still needed for the other data it contains. Differential Revision: https://reviews.llvm.org/D87949	2020-09-22 13:12:11 -07:00
Alexandre Ganea	09fd7108ad	[ThinLTO] Re-order modules for optimal multi-threaded processing Re-use an optimization from the old LTO API (used by ld64). This sorts modules in descending order, based on bitcode size, so that larger modules are processed first. This allows smaller modules to be processed last, to better fill free thread 'slots', and thus allows for better multi-thread load balancing. In our case (on dual Intel Xeon Gold 6140, Windows 10 version 2004, two-stage build), this saves 15 sec when linking `clang.exe` with LLD & `-flto=thin`, `/opt:lldltojobs=all`, no ThinLTO cache, -DLLVM_INTEGRATED_CRT_ALLOC=d:\git\rpmalloc. Before patch: 102 sec After patch: 85 sec Inspired by the work done by David Callahan in D60495. Differential Revision: https://reviews.llvm.org/D87966	2020-09-22 11:25:59 -04:00
Mircea Trofin	2d0a6945c4	[ThinLTO] add post-thinlto-merge option to -lto-embed-bitcode This will embed bitcode after (Thin)LTO merge, but before optimizations. In the case the thinlto backend is called from clang, the .llvmcmd section is also produced. Doing so in the case where the caller is the linker doesn't yet have a motivation, and would require plumbing through command line args. Differential Revision: https://reviews.llvm.org/D87636	2020-09-15 15:56:11 -07:00
Mircea Trofin	a6aa03251f	[ThinLTO] Make -lto-embed-bitcode an enum The current behavior of -lto-embed-bitcode is not quite the same as that of -fembed-bitcode. While both populate .llvmbc with bitcode, the latter populates it with pre-optimized bitcode(*), while the former with post-optimized. The scenarios driving them are different - the latter's goal is to allow re-compilation, while the former, IIUC, is execution. I plan to add a third mode for thinlto cases, closely-related to -fembed-bitcode's scenario: adding the bitcode pre-optimization, but post-merging. This would allow re-compilation without requiring the other .bc files that were merged (akin to how -fembed-bitcode allows recompilation without all the .h files) The third mode can't co-exist with the current -lto-embed-bitcode mode, because the latter would overwrite it. For clarity, we change -lto-embed-bitcode to be an enum. (*) That's the compiler semantics. The driver splits compilation in 2 phases, so if -fembed-bitcode is given to the driver, the .llvmbc is optimized bitcode; if the option is passed to the compiler (after -cc1), the section is pre-optimized. Differential Revision: https://reviews.llvm.org/D87477	2020-09-11 13:24:54 -07:00
Mircea Trofin	f35266cc9a	[NFC][ThinLTO] Let llvm::EmbedBitcodeInModule handle serialization. llvm::EmbedBitcodeInModule handles serializing the passed-in module, if the provided MemoryBufferRef is invalid. This is already the path taken in one of the uses of the API - clang::EmbedBitcode, when called from BackendConsumer::HandleTranslationUnit - so might as well do the same here and reduce (by very little) code duplication. The only difference this patch introduces is that the serialization happens with ShouldPreserveUseListOrder set to true. Differential Revision: https://reviews.llvm.org/D87339	2020-09-10 10:25:00 -07:00
Mircea Trofin	1513446701	[NFC][ThinLTO] EmbedBitcodeSection doesn't need the Config Instead, passing in the command line options, initialized to nullptr. In an upcoming patch, we can then use the parameter to pass actual command line options. Differential Revision: https://reviews.llvm.org/D87336	2020-09-08 17:14:44 -07:00
Steven Wu	a48b964bbb	[ThinLTO][Legacy] Fix StringRef assertion from ThinLTO bots This is a presumed fix for the Firefox ThinLTO bot, which hits an assertion failure for an invalid index when accessing a StringRef. Technically, `IRName` in the symtab should not be an empty string for the entries we care about, but this will help to fix the bot before more information can be provided. Otherwise, NFCI.	2020-09-04 12:30:09 -07:00
Steven Wu	0298cbfe6d	[LTO] Don't apply LTOPostLink module flag during writeMergedModule For `ld64` which uses legacy LTOCodeGenerator, it relies on writeMergedModule to perform `ld -r` (generates a linked object file). If all the inputs to `ld -r` are fullLTO bitcode, `ld64` will link the bitcode module, internalize all the symbols and write out another fullLTO bitcode object file. This bitcode file doesn't have all the bitcode inputs and it should not have the LTOPostLink module flag. It will also cause an error when this bitcode object file is linked with another LTO object file. Fix the issue by not applying the LTOPostLink flag during the writeMergedModule function. The flag should only be added when all the bitcode is linked and ready to be optimized. rdar://problem/58462798 Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D84789	2020-08-26 11:17:45 -07:00
Steven Wu	244658bfb9	[ThinLTO][Legacy] Compute PreservedGUID based on IRName in Symtab Instead of computing the GUID based on some assumption about the symbol mangling rule from IRName to symbol name, look up the IRName in all the symtabs from all the input files to see if any matching symbol entry provides the IRName for GUID computation. rdar://65853754 Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D84803	2020-08-26 10:15:00 -07:00
Wei Wang	7dabc836ba	[FIX] Avoid creating BFI when emitting remarks for dead functions Dead function has its body stripped away, and can cause various analyses to panic. Also it does not make sense to apply analyses on such function. Reviewed By: xazax.hun, MaskRay, wenlei, hoy Differential Revision: https://reviews.llvm.org/D84715	2020-08-25 11:12:38 -07:00
Arthur Eubanks	c136b2bbc8	[NewPM] Support optnone under new pass manager OptNoneInstrumentation is part of StandardInstrumentations. It skips functions (or loops) that are marked optnone. The feature of skipping optional passes for optnone functions under NPM is gated on a -enable-npm-optnone flag. Currently it is by default false. That is because we still need to mark all required passes to be required. Otherwise optnone functions will start having incorrect semantics. After that is done in following changes, we can remove the flag and always enable this. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D83519	2020-07-21 09:53:43 -07:00
Eli Friedman	9d315e1c2b	Remove GlobalValue::getAlignment(). This function is deceptive at best: it doesn't return what you'd expect. If you have an arbitrary GlobalValue and you want to determine the alignment of that pointer, Value::getPointerAlignment() returns the correct value. If you want the actual declared alignment of a function or variable, GlobalObject::getAlignment() returns that. This patch switches all the users of GlobalValue::getAlignment to an appropriate alternative. Differential Revision: https://reviews.llvm.org/D80368	2020-06-23 19:13:42 -07:00
Momchil Velikov	a064519335	[LTO] Use StringRef instead of C-style strings in setCodeGenDebugOptions Fixes an issue with missing nul-terminators and saves us some string copying, compared to a version which would insert nul-terminators. Differential Revision: https://reviews.llvm.org/D82033	2020-06-22 11:22:18 +01:00
romanova-ekaterina	8f7a8fd1c3	Error related to ThinLTO caching needs to be downgraded to a remark This is a fix for PR #46392 (Diagnostic message (error) related to ThinLTO caching needs to be downgraded to a remark). There are diagnostic messages related to ThinLTO caching that contain the word "error", but they are really just notices/remarks for users, and they don't cause a build failure. The word "error" appearing can be confusing to users, and may even cause deeper problems. User's build system might be designed to interpret any error messages (even a benign error message as the one above) reported by the compiler as a build failure, thus causing the build to fail "needlessly". In short, the term "error" in this diagnostic is misleading at best, and may be causing build systems to fail at worst. Differential Revision: https://reviews.llvm.org/D82138	2020-06-19 16:03:29 -07:00
Vitaly Buka	5eed995b68	[StackSafety] Run ThinLTO Summary: ThinLTO linking runs dataflow processing on collected function parameters. Then StackSafetyGlobalInfoWrapperPass in ThinLTO backend will run as usual looking up to external symbol in the summary if needed. Depends on D80985. Reviewers: eugenis, pcc Reviewed By: eugenis Subscribers: inglorion, hiraditya, steven_wu, dexonsmith, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D81242	2020-06-12 18:11:29 -07:00
Vitaly Buka	436769363c	[StackSafety] Pass summary into codegen Summary: The patch wraps ThinLTO index into immutable pass which can be used by StackSafety analysis. Reviewers: eugenis, pcc Reviewed By: eugenis Subscribers: hiraditya, steven_wu, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80985	2020-06-10 21:02:54 -07:00
Hongtao Yu	9bce9489e7	[LLD][ThinLTO] Add --thinlto-single-module to allow compiling partial modules. This change introduces an LLD switch --thinlto-single-module to allow compiling only a part of the input modules. This specifically enables: 1. Fast investigating/debugging modules of interest without spending time on compiling unrelated modules. 2. Compiler debug dump with -mllvm -debug-only= for specific modules. It will be useful for large applications which have 1K+ input modules for thinLTO. The switch can be combined with `--lto-obj-path=` or `--lto-emit-asm` to obtain intermediate object files or assembly files. So far the module name matching is implemented as a fuzzy name lookup where the modules with name containing the switch value are compiled. E.g, Command: ld.lld main.o thin.a --thinlto-single-module=thin.a --lto-obj-path=single.o log: [ThinLTO] Selecting thin.a(thin1.o at 168) to compile [ThinLTO] Selecting thin.a(thin2.o at 228) to compile Command: ld.lld main.o thin.a --thinlto-single-module=thin1.o --lto-obj-path=single.o log: [ThinLTO] Selecting thin.a(thin1.o at 168) to compile Differential Revision: https://reviews.llvm.org/D80406	2020-06-10 15:32:30 -07:00
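
The "fuzzy name lookup" in the entry above amounts to a substring test on the module name. A minimal sketch of that rule, assuming a hypothetical `selectModules` helper and plain strings for module names (LLD's real implementation is not shown here):

```cpp
#include <string>
#include <vector>

// Hypothetical helper: keeps every module whose name contains SwitchValue,
// mirroring the rule "modules with name containing the switch value are
// compiled". An empty SwitchValue matches every module in this sketch.
std::vector<std::string>
selectModules(const std::vector<std::string> &ModuleNames,
              const std::string &SwitchValue) {
  std::vector<std::string> Selected;
  for (const std::string &Name : ModuleNames)
    if (Name.find(SwitchValue) != std::string::npos)
      Selected.push_back(Name);
  return Selected;
}
```

With the module names from the log above, the value `thin.a` selects both archive members, while `thin1.o` selects only `thin.a(thin1.o at 168)`.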
romanova-ekaterina	7e98b6c7a5	Fixed false ThinLTO cache misses problem (PR 45819). We relied on the assumption that iterators walk through the elements of a DenseSet in a deterministic order (which is not true). This caused ThinLTO cache misses. This patch addresses this problem. See PR 45819 for additional information https://bugs.llvm.org/show_bug.cgi?id=45819 Differential Revision: https://reviews.llvm.org/D79772	2020-06-10 12:41:41 -07:00
Hiroshi Yamauchi	c65f25e192	[PGO] Improve the working set size heuristics under the partial sample PGO. Summary: The working set size heuristics (ProfileSummaryInfo::hasHugeWorkingSetSize) under the partial sample PGO may not be accurate because the profile is partial and the number of hot profile counters in the ProfileSummary may not reflect the actual working set size of the program being compiled. To improve this, the (approximated) ratio of the number of profile counters of the program being compiled to the number of profile counters in the partial sample profile is computed (which is called the partial profile ratio) and the working set size of the profile is scaled by this ratio to reflect the working set size of the program being compiled and used for the working set size heuristics. The partial profile ratio is approximated based on the number of the basic blocks in the program and the NumCounts field in the ProfileSummary and computed through the thin LTO indexing. This means that there is the limitation that the scaled working set size is available to the thin LTO post link passes only. Reviewers: davidxl Subscribers: mgorny, eraman, hiraditya, steven_wu, dexonsmith, arphaman, dang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79831	2020-06-01 10:29:23 -07:00
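
The entry above does not say how the patch removes the dependence on iteration order. One common approach is to copy the set into a vector and sort it before hashing. The sketch below shows that idea only; `std::unordered_set` stands in for `DenseSet`, FNV-1a stands in for the real cache-key hasher, and `stableKey` is a hypothetical name.

```cpp
#include <algorithm>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Hypothetical helper: derives a key from a set of ids that does not depend
// on the container's iteration order.
std::uint64_t stableKey(const std::unordered_set<std::uint64_t> &Ids) {
  // Impose a deterministic order before feeding the elements to the hash.
  std::vector<std::uint64_t> Sorted(Ids.begin(), Ids.end());
  std::sort(Sorted.begin(), Sorted.end());

  // 64-bit FNV-1a over the bytes of each id, least significant byte first.
  std::uint64_t Hash = 14695981039346656037ULL;
  for (std::uint64_t Id : Sorted)
    for (int Byte = 0; Byte < 8; ++Byte) {
      Hash ^= (Id >> (8 * Byte)) & 0xFF;
      Hash *= 1099511628211ULL;
    }
  return Hash;
}
```

Two sets holding the same ids produce the same key regardless of insertion order, which is the property a cache key needs to avoid false misses.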
Simon Pilgrim	3909a36379	TargetLowering.h - remove unnecessary TargetMachine.h include. NFC Replace with forward declaration and move dependency down to source files that actually need it. Both TargetLowering.h and TargetMachine.h are 2 of the most expensive headers (top 10) in the ClangBuildAnalyzer report when building llc.	2020-05-23 19:49:38 +01:00
Craig Topper	c8f290ffea	[Align] Remove operations on MaybeAlign that asserted that it had a defined value. If the caller needs to be responsible for making sure the MaybeAlign has a value, then we should just make the caller convert it to an Align with operator*. I explicitly deleted the relational comparison operators that were being inherited from Optional. It's unclear what the meaning of comparing two MaybeAligns where one is defined and the other isn't should be. So make the caller responsible for defining the behavior. I left the ==/!= operators from Optional. But now that exposed a weird quirk that ==/!= between Align and MaybeAlign required the MaybeAlign to be defined. But now we use the operator== from Optional that takes an Optional and the Value. Differential Revision: https://reviews.llvm.org/D80455	2020-05-22 21:54:28 -07:00
Zakk Chen	6bb0aa513e	[LTO] Suppress emission of empty combined module by default Summary: Unless the user requested an output object (--lto-obj-path), an unused empty combined module is not emitted. This change is helpful for some targets (e.g. RISC-V) which encode the ABI info in IR module flags (target-abi). An empty unused module has no ABI info, so the linker would report a linking error when merging incompatible ABIs. Reviewers: tejohnson, espindola, MaskRay Subscribers: emaste, inglorion, arichardson, hiraditya, simoncook, MaskRay, steven_wu, dexonsmith, PkmX, dang, lenary, s.egerton, luismarques, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78988	2020-05-04 18:31:09 -07:00
serge-sans-paille	69169b30b2	Update compiler extension integration into the build system The approach here is to create a new (empty) component, `Extensions', where all statically compiled extensions dynamically register their dependencies. That way we're more natively compatible with LLVMBuild and llvm-config. Fixes: https://bugs.llvm.org/show_bug.cgi?id=44870 Differential Revision: https://reviews.llvm.org/D78192	2020-04-24 09:40:14 +02:00
Eli Friedman	82b54aea79	Enable new passmanager plugin support for LTO. This should make both static and dynamic NewPM plugins work with LTO. And as a bonus, it makes static linking of OldPM plugins more reliable for plugins with both an OldPM and NewPM interface. I only implemented the command-line flag to specify NewPM plugins in llvm-lto2, to show it works. Support can be added for other tools later. Differential Revision: https://reviews.llvm.org/D76866	2020-04-14 15:07:07 -07:00
Mehdi Amini	e27edda00f	Revert "Move ModuleSummaryAnalysis from libAnalysis to libObject to break the dependency from Analysis to Object" This reverts commit 10df1563d608323a3144afc5f6038ecb81869b92. Some buildbots are broken.	2020-04-14 00:27:08 +00:00
Mehdi Amini	c54ae056de	Move ModuleSummaryAnalysis from libAnalysis to libObject to break the dependency from Analysis to Object ModuleSummaryAnalysis is the only file in libAnalysis that brings a dependency on the CodeGen layer from libAnalysis, moving it breaks this dependency. Differential Revision: https://reviews.llvm.org/D77994	2020-04-13 23:12:11 +00:00
Fangrui Song	d41ca54775	[ThinLTO] Drop dso_local if a GlobalVariable satisfies isDeclarationForLinker() dso_local leads to direct access even if the definition is not within this compilation unit (it is still in the same linkage unit). On ELF, such a relocation (e.g. R_X86_64_PC32) referencing a STB_GLOBAL STV_DEFAULT object can cause a linker error in a -shared link. If the linkage is changed to available_externally, the dso_local flag should be dropped, so that no direct access will be generated. The current behavior is benign, because -fpic does not assume dso_local (clang/lib/CodeGen/CodeGenModule.cpp:shouldAssumeDSOLocal). If we do that for -fno-semantic-interposition (D73865), there will be an R_X86_64_PC32 linker error without this patch. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D74751	2020-04-07 15:46:01 -07:00
Benjamin Kramer	abbd22c44e	[LTO] Replace hand-rolled endian conversion with support::endian. NFCI.	2020-04-06 13:23:27 +02:00
Florian Hahn	18e00645a2	Revert "[Darwin] Respect -fno-unroll-loops during LTO." As per post-commit comment at https://reviews.llvm.org/D76916, this should better be done at the TU level. This reverts commit 9ce198d6ed371399e9bd9ba8b48fbab0f4e60240.	2020-03-30 15:20:30 +01:00
Florian Hahn	b61c82964d	[Darwin] Respect -fno-unroll-loops during LTO. Currently -fno-unroll-loops is ignored when doing LTO on Darwin. This patch adds a new -lto-no-unroll-loops option to the LTO code generator and forwards it to the linker if -fno-unroll-loops is passed. Reviewers: thegameg, steven_wu Reviewed By: thegameg Differential Revision: https://reviews.llvm.org/D76916	2020-03-27 22:19:03 +00:00
Alexandre Ganea	61ed3dc5bf	[ThinLTO] Allow usage of all hardware threads in the system Before this patch, it wasn't possible to extend the ThinLTO threads to all SMT/CMT threads in the system. Only one thread per core was allowed, instructed by usage of llvm::heavyweight_hardware_concurrency() in the ThinLTO code. Any number passed to the LLD flag /opt:lldltojobs=..., or any other ThinLTO-specific flag, was previously interpreted in the context of llvm::heavyweight_hardware_concurrency(), which means SMT disabled. One can now say in LLD: /opt:lldltojobs=0 -- Use one std::thread / hardware core in the system (no SMT). Default value if flag not specified. /opt:lldltojobs=N -- Limit usage to N threads, regardless of usage of heavyweight_hardware_concurrency(). /opt:lldltojobs=all -- Use all hardware threads in the system. Equivalent to /opt:lldltojobs=$(nproc) on Linux and /opt:lldltojobs=%NUMBER_OF_PROCESSORS% on Windows. When an affinity mask is set for the process, threads will be created only for the cores selected by the mask. When N > number-of-hardware-threads-in-the-system, the threads in the thread pool will be dispatched equally on all CPU sockets (tested only on Windows). When N <= number-of-hardware-threads-on-a-CPU-socket, the threads will remain on the CPU socket where the process started (only on Windows). Differential Revision: https://reviews.llvm.org/D75153	2020-03-27 10:20:58 -04:00
Fangrui Song	4cdc54cceb	[LTO] Config::addSaveTemps: clear ResolutionFile upon an error Otherwise ld.lld -save-temps will crash when writing to ResolutionFile. llvm-lto2 -save-temps does not crash because it exits immediately. Reviewed By: evgeny777 Differential Revision: https://reviews.llvm.org/D75426	2020-03-02 17:49:04 -08:00
Francis Visoiu Mistrih	67c4e6afe6	[LTO][Legacy] Add explicit dependency on BinaryFormat This fixes some windows bots.	2020-02-28 15:50:43 -08:00
Francis Visoiu Mistrih	b4279d9528	[LTO][Legacy] Add new API to query Mach-O CPU (sub)type Tools working with object files on Darwin (e.g. lipo) may need to know properties like the CPU type and subtype of a bitcode file. The logic of converting a triple to a Mach-O CPU_(SUB_)TYPE should be provided by LLVM instead of relying on tools to re-implement it. Differential Revision: https://reviews.llvm.org/D75067	2020-02-28 12:56:05 -08:00
Alexandre Ganea	e8dc2577fb	Improve comments after 8404aeb56a73ab24f9b295111de3b37a37f0b841.	2020-02-18 14:25:21 -05:00
Alexandre Ganea	ae05eb086d	[Support] On Windows, ensure hardware_concurrency() extends to all CPU sockets and all NUMA groups The goal of this patch is to maximize CPU utilization on multi-socket or high core count systems, so that parallel computations such as LLD/ThinLTO can use all hardware threads in the system. Before this patch, on Windows, at most 64 hardware threads could be used, in some cases dispatched only on one CPU socket. == Background == Windows doesn't have a flat cpu_set_t like Linux. Instead, it projects hardware CPUs (or NUMA nodes) to applications through a concept of "processor groups". A "processor" is the smallest unit of execution on a CPU, that is, a hyper-thread if SMT is active; a core otherwise. There's a limit of 32 processors on older 32-bit versions of Windows, which later was raised to 64 processors with 64-bit versions of Windows. This limit comes from the affinity mask, which historically is represented by the sizeof(void*). Consequently, the concept of "processor groups" was introduced for dealing with systems with more than 64 hyper-threads. By default, the Windows OS assigns only one "processor group" to each starting application, in a round-robin manner. If the application wants to use more processors, it needs to programmatically enable it, by assigning threads to other "processor groups". This also means that affinity cannot cross "processor group" boundaries; one can only specify a "preferred" group on start-up, but the application is free to allocate more groups if it wants to. This creates a peculiar situation, where newer CPUs like the AMD EPYC 7702P (64-cores, 128-hyperthreads) are projected by the OS as two (2) "processor groups". This means that by default, an application can only use half of the cores. This situation could only get worse in the years to come, as dies with more cores will appear on the market. 
== The problem == The heavyweight_hardware_concurrency() API was introduced so that only one hardware thread per core was used. Once that API returns, that original intention is lost, only the number of threads is retained. Consider a situation, on Windows, where the system has 2 CPU sockets, 18 cores each, each core having 2 hyper-threads, for a total of 72 hyper-threads. Both heavyweight_hardware_concurrency() and hardware_concurrency() currently return 36, because on Windows they are simply wrappers over std::thread::hardware_concurrency() -- which can only return processors from the current "processor group". == The changes in this patch == To solve this situation, we capture (and retain) the initial intention until the point of usage, through a new ThreadPoolStrategy class. The number of threads to use is deferred as late as possible, until the moment where the std::threads are created (ThreadPool in the case of ThinLTO). When using hardware_concurrency(), setting ThreadCount to 0 now means to use all the possible hardware CPU (SMT) threads. Providing a ThreadCount above the maximum number of threads will have no effect; the maximum will be used instead. The heavyweight_hardware_concurrency() is similar to hardware_concurrency(), except that only one thread per hardware core will be used. When LLVM_ENABLE_THREADS is OFF, the threading APIs will always return 1, to ensure any caller loops will be exercised at least once. Differential Revision: https://reviews.llvm.org/D71775	2020-02-14 10:24:22 -05:00
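
The rules stated in the entry above (0 means "use the maximum", larger requests are clamped, the heavyweight variant counts one thread per core) can be written down as a small sketch. This is not LLVM's ThreadPoolStrategy; `StrategySketch`, `computeThreadCount`, and the explicit core and thread counts passed in are assumptions for illustration.

```cpp
#include <algorithm>

// Hypothetical stand-in for the "intention" that is retained until threads
// are actually created.
struct StrategySketch {
  unsigned ThreadsRequested; // 0 means "use the maximum available"
  bool OneThreadPerCore;     // true models the heavyweight variant
};

// Resolves the strategy against the machine as late as possible.
// PhysicalCores and HardwareThreads are supplied by the caller in this
// sketch; a real implementation would query the OS.
unsigned computeThreadCount(const StrategySketch &S, unsigned PhysicalCores,
                            unsigned HardwareThreads) {
  unsigned Max = S.OneThreadPerCore ? PhysicalCores : HardwareThreads;
  Max = std::max(Max, 1u); // never report fewer than one thread
  if (S.ThreadsRequested == 0)
    return Max;
  // Requests above the maximum have no effect; the maximum is used.
  return std::min(S.ThreadsRequested, Max);
}
```

For the 2-socket example in the entry (36 cores, 72 hyper-threads), a request of 0 resolves to 72 in the plain case and to 36 in the one-thread-per-core case.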
Bill Wendling	0816222e8f	Revert "Remove redundant "std::move"s in return statements" The build failed with error: call to deleted constructor of 'llvm::Error' errors. This reverts commit 1c2241a7936bf85aa68aef94bd40c3ba77d8ddf2.	2020-02-10 07:07:40 -08:00
Bill Wendling	e45b5f33f3	Remove redundant "std::move"s in return statements	2020-02-10 06:39:44 -08:00

755 Commits