//===- ThinLTOBitcodeWriter.cpp - Bitcode writing pass for ThinLTO --------===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#include "llvm/Transforms/IPO/ThinLTOBitcodeWriter.h"
#include "llvm/Analysis/BasicAliasAnalysis.h"
#include "llvm/Analysis/ModuleSummaryAnalysis.h"
#include "llvm/Analysis/ProfileSummaryInfo.h"
#include "llvm/Analysis/TypeMetadataUtils.h"
#include "llvm/Bitcode/BitcodeWriter.h"
#include "llvm/IR/Constants.h"
#include "llvm/IR/DebugInfo.h"
#include "llvm/IR/Instructions.h"
#include "llvm/IR/Intrinsics.h"
#include "llvm/IR/Module.h"
#include "llvm/IR/PassManager.h"
#include "llvm/InitializePasses.h"
#include "llvm/Object/ModuleSymbolTable.h"
#include "llvm/Pass.h"
#include "llvm/Support/ScopedPrinter.h"
#include "llvm/Support/raw_ostream.h"
#include "llvm/Transforms/IPO.h"
#include "llvm/Transforms/IPO/FunctionAttrs.h"
#include "llvm/Transforms/IPO/FunctionImport.h"
#include "llvm/Transforms/IPO/LowerTypeTests.h"
#include "llvm/Transforms/Utils/Cloning.h"
#include "llvm/Transforms/Utils/ModuleUtils.h"

using namespace llvm;

namespace {

// Determine if a promotion alias should be created for a symbol name.
static bool allowPromotionAlias(const std::string &Name) {
  // Promotion aliases are used only in inline assembly. It's safe to
  // simply skip unusual names. Subset of MCAsmInfo::isAcceptableChar()
  // and MCAsmInfoXCOFF::isAcceptableChar().
  for (const char &C : Name) {
    if (isAlnum(C) || C == '_' || C == '.')
      continue;
    return false;
  }
  return true;
}

// Promote each local-linkage entity defined by ExportM and used by ImportM by
// changing visibility and appending the given ModuleId.
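// For example, an internal symbol "foo" is renamed to "foo" + ModuleId and
// given hidden visibility in both modules, so the two halves of the split
// module keep referring to the same definition after the rename.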
void promoteInternals(Module &ExportM, Module &ImportM, StringRef ModuleId,
                      SetVector<GlobalValue *> &PromoteExtra) {
  DenseMap<const Comdat *, Comdat *> RenamedComdats;
  for (auto &ExportGV : ExportM.global_values()) {
    if (!ExportGV.hasLocalLinkage())
      continue;

    auto Name = ExportGV.getName();
    GlobalValue *ImportGV = nullptr;
    if (!PromoteExtra.count(&ExportGV)) {
      ImportGV = ImportM.getNamedValue(Name);
      if (!ImportGV)
        continue;
      ImportGV->removeDeadConstantUsers();
      if (ImportGV->use_empty()) {
        ImportGV->eraseFromParent();
        continue;
      }
    }

    std::string OldName = Name.str();
    std::string NewName = (Name + ModuleId).str();

    if (const auto *C = ExportGV.getComdat())
      if (C->getName() == Name)
        RenamedComdats.try_emplace(C, ExportM.getOrInsertComdat(NewName));

    ExportGV.setName(NewName);
    ExportGV.setLinkage(GlobalValue::ExternalLinkage);
    ExportGV.setVisibility(GlobalValue::HiddenVisibility);

    if (ImportGV) {
      ImportGV->setName(NewName);
      ImportGV->setVisibility(GlobalValue::HiddenVisibility);
    }

    if (isa<Function>(&ExportGV) && allowPromotionAlias(OldName)) {
      // Create a local alias with the original name to avoid breaking
      // references from inline assembly.
      std::string Alias = ".set " + OldName + "," + NewName + "\n";
      ExportM.appendModuleInlineAsm(Alias);
    }
  }

  if (!RenamedComdats.empty())
    for (auto &GO : ExportM.global_objects())
      if (auto *C = GO.getComdat()) {
        auto Replacement = RenamedComdats.find(C);
        if (Replacement != RenamedComdats.end())
          GO.setComdat(Replacement->second);
      }
}

// Promote all internal (i.e. distinct) type ids used by the module by replacing
// them with external type ids formed using the module id.
//
// Note that this needs to be done before we clone the module because each clone
// will receive its own set of distinct metadata nodes.
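// A promoted type id is the concatenation of a per-module counter and the
// module id, e.g. "1" + ModuleId for the first distinct node encountered.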
void promoteTypeIds(Module &M, StringRef ModuleId) {
  DenseMap<Metadata *, Metadata *> LocalToGlobal;
  auto ExternalizeTypeId = [&](CallInst *CI, unsigned ArgNo) {
    Metadata *MD =
        cast<MetadataAsValue>(CI->getArgOperand(ArgNo))->getMetadata();

    if (isa<MDNode>(MD) && cast<MDNode>(MD)->isDistinct()) {
      Metadata *&GlobalMD = LocalToGlobal[MD];
      if (!GlobalMD) {
        std::string NewName = (Twine(LocalToGlobal.size()) + ModuleId).str();
        GlobalMD = MDString::get(M.getContext(), NewName);
      }

      CI->setArgOperand(ArgNo,
                        MetadataAsValue::get(M.getContext(), GlobalMD));
    }
  };
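
  // The type id operand is argument 1 of llvm.type.test(ptr, metadata) and
  // argument 2 of llvm.type.checked.load(ptr, offset, metadata).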
  if (Function *TypeTestFunc =
          M.getFunction(Intrinsic::getName(Intrinsic::type_test))) {
    for (const Use &U : TypeTestFunc->uses()) {
      auto CI = cast<CallInst>(U.getUser());
      ExternalizeTypeId(CI, 1);
    }
  }

  if (Function *TypeCheckedLoadFunc =
          M.getFunction(Intrinsic::getName(Intrinsic::type_checked_load))) {
    for (const Use &U : TypeCheckedLoadFunc->uses()) {
      auto CI = cast<CallInst>(U.getUser());
      ExternalizeTypeId(CI, 2);
    }
  }

  for (GlobalObject &GO : M.global_objects()) {
    SmallVector<MDNode *, 1> MDs;
    GO.getMetadata(LLVMContext::MD_type, MDs);

    GO.eraseMetadata(LLVMContext::MD_type);
    for (auto MD : MDs) {
      auto I = LocalToGlobal.find(MD->getOperand(1));
      if (I == LocalToGlobal.end()) {
        GO.addMetadata(LLVMContext::MD_type, *MD);
        continue;
      }
      GO.addMetadata(
          LLVMContext::MD_type,
          *MDNode::get(M.getContext(), {MD->getOperand(0), I->second}));
    }
  }
}

// Drop unused globals, and drop type information from function declarations.
// FIXME: If we made functions typeless then there would be no need to do this.
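// Declarations that survive are re-created with the empty signature "void()"
// and their uses are rewritten through a bitcast to the original pointer type,
// which keeps the IR valid while discarding the parameter types.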
void simplifyExternals(Module &M) {
  FunctionType *EmptyFT =
      FunctionType::get(Type::getVoidTy(M.getContext()), false);

  for (auto I = M.begin(), E = M.end(); I != E;) {
    Function &F = *I++;
    if (F.isDeclaration() && F.use_empty()) {
      F.eraseFromParent();
      continue;
    }

    if (!F.isDeclaration() || F.getFunctionType() == EmptyFT ||
        // Changing the type of an intrinsic may invalidate the IR.
        F.getName().startswith("llvm."))
      continue;

    Function *NewF =
        Function::Create(EmptyFT, GlobalValue::ExternalLinkage,
                         F.getAddressSpace(), "", &M);
    NewF->copyAttributesFrom(&F);
    // Only copy function attributes.
    NewF->setAttributes(
        AttributeList::get(M.getContext(), AttributeList::FunctionIndex,
                           F.getAttributes().getFnAttributes()));
    NewF->takeName(&F);
    F.replaceAllUsesWith(ConstantExpr::getBitCast(NewF, F.getType()));
    F.eraseFromParent();
  }

  for (auto I = M.global_begin(), E = M.global_end(); I != E;) {
    GlobalVariable &GV = *I++;
    if (GV.isDeclaration() && GV.use_empty()) {
      GV.eraseFromParent();
      continue;
    }
  }
}
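
// Replace every definition for which ShouldKeepDefinition returns false with a
// declaration, erasing the value outright when it cannot be converted (aliases,
// for instance, cannot be turned into declarations).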
static void
filterModule(Module *M,
             function_ref<bool(const GlobalValue *)> ShouldKeepDefinition) {
  std::vector<GlobalValue *> V;
  for (GlobalValue &GV : M->global_values())
    if (!ShouldKeepDefinition(&GV))
      V.push_back(&GV);

  for (GlobalValue *GV : V)
    if (!convertToDeclaration(*GV))
      GV->eraseFromParent();
}
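
// Call Fn on every function referenced (possibly through nested constant
// expressions and aggregates) by the given constant, which is how virtual
// functions are found in vtable initializers.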
void forEachVirtualFunction(Constant *C, function_ref<void(Function *)> Fn) {
  if (auto *F = dyn_cast<Function>(C))
    return Fn(F);
  if (isa<GlobalValue>(C))
    return;
  for (Value *Op : C->operands())
    forEachVirtualFunction(cast<Constant>(Op), Fn);
}

// Clone any @llvm[.compiler].used over to the new module and append
// values whose defs were cloned into that module.
static void cloneUsedGlobalVariables(const Module &SrcM, Module &DestM,
                                     bool CompilerUsed) {
  SmallVector<GlobalValue *, 4> Used, NewUsed;
  // First collect those in the llvm[.compiler].used set.
  collectUsedGlobalVariables(SrcM, Used, CompilerUsed);
  // Next build a set of the equivalent values defined in DestM.
  for (auto *V : Used) {
    auto *GV = DestM.getNamedValue(V->getName());
    if (GV && !GV->isDeclaration())
      NewUsed.push_back(GV);
  }
  // Finally, add them to a llvm[.compiler].used variable in DestM.
  if (CompilerUsed)
    appendToCompilerUsed(DestM, NewUsed);
  else
    appendToUsed(DestM, NewUsed);
}

// If it's possible to split M into regular and thin LTO parts, do so and write
// a multi-module bitcode file with the two parts to OS. Otherwise, write only a
// regular LTO bitcode file to OS.
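// The regular LTO part receives the globals that carry type metadata (e.g.
// vtables), everything required to keep their comdats intact, and copies of
// functions eligible for virtual constant propagation; the rest stays in the
// thin LTO part.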
void splitAndWriteThinLTOBitcode(
    raw_ostream &OS, raw_ostream *ThinLinkOS,
    function_ref<AAResults &(Function &)> AARGetter, Module &M) {
  std::string ModuleId = getUniqueModuleId(&M);
  if (ModuleId.empty()) {
    // We couldn't generate a module ID for this module, write it out as a
    // regular LTO module with an index for summary-based dead stripping.
    ProfileSummaryInfo PSI(M);
    M.addModuleFlag(Module::Error, "ThinLTO", uint32_t(0));
    ModuleSummaryIndex Index = buildModuleSummaryIndex(M, nullptr, &PSI);
    WriteBitcodeToFile(M, OS, /*ShouldPreserveUseListOrder=*/false, &Index);

    if (ThinLinkOS)
      // We don't have a ThinLTO part, but still write the module to the
      // ThinLinkOS if requested so that the expected output file is produced.
      WriteBitcodeToFile(M, *ThinLinkOS, /*ShouldPreserveUseListOrder=*/false,
                         &Index);

    return;
  }

  promoteTypeIds(M, ModuleId);

  // Returns whether a global or its associated global has attached type
  // metadata. The former may participate in CFI or whole-program
  // devirtualization, so they need to appear in the merged module instead of
  // the thin LTO module. Similarly, globals that are associated with globals
  // with type metadata need to appear in the merged module because they will
  // reference the global's section directly.
  auto HasTypeMetadata = [](const GlobalObject *GO) {
    if (MDNode *MD = GO->getMetadata(LLVMContext::MD_associated))
      if (auto *AssocVM = dyn_cast_or_null<ValueAsMetadata>(MD->getOperand(0)))
        if (auto *AssocGO = dyn_cast<GlobalObject>(AssocVM->getValue()))
          if (AssocGO->hasMetadata(LLVMContext::MD_type))
            return true;
    return GO->hasMetadata(LLVMContext::MD_type);
  };

  // Collect the set of virtual functions that are eligible for virtual constant
  // propagation. Each eligible function must not access memory, must return
  // an integer of width <=64 bits, must take at least one argument, must not
  // use its first argument (assumed to be "this") and all arguments other than
  // the first one must be of <=64 bit integer type.
  //
  // Note that we test whether this copy of the function is readnone, rather
  // than testing function attributes, which must hold for any copy of the
  // function, even a less optimized version substituted at link time. This is
  // sound because the virtual constant propagation optimizations effectively
  // inline all implementations of the virtual function into each call site,
  // rather than using function attributes to perform local optimization.
  DenseSet<const Function *> EligibleVirtualFns;
  // If any member of a comdat lives in MergedM, put all members of that
  // comdat in MergedM to keep the comdat together.
  DenseSet<const Comdat *> MergedMComdats;
  for (GlobalVariable &GV : M.globals())
    if (HasTypeMetadata(&GV)) {
      if (const auto *C = GV.getComdat())
        MergedMComdats.insert(C);
      forEachVirtualFunction(GV.getInitializer(), [&](Function *F) {
        auto *RT = dyn_cast<IntegerType>(F->getReturnType());
        if (!RT || RT->getBitWidth() > 64 || F->arg_empty() ||
            !F->arg_begin()->use_empty())
          return;
        for (auto &Arg : drop_begin(F->args())) {
          auto *ArgT = dyn_cast<IntegerType>(Arg.getType());
          if (!ArgT || ArgT->getBitWidth() > 64)
            return;
        }
        if (!F->isDeclaration() &&
            computeFunctionBodyMemoryAccess(*F, AARGetter(*F)) == MAK_ReadNone)
          EligibleVirtualFns.insert(F);
      });
    }

  ValueToValueMapTy VMap;
  std::unique_ptr<Module> MergedM(
      CloneModule(M, VMap, [&](const GlobalValue *GV) -> bool {
        if (const auto *C = GV->getComdat())
          if (MergedMComdats.count(C))
            return true;
        if (auto *F = dyn_cast<Function>(GV))
          return EligibleVirtualFns.count(F);
        if (auto *GVar = dyn_cast_or_null<GlobalVariable>(GV->getBaseObject()))
          return HasTypeMetadata(GVar);
        return false;
      }));
  StripDebugInfo(*MergedM);
  MergedM->setModuleInlineAsm("");

  // Clone any llvm.*used globals to ensure the included values are
  // not deleted.
  cloneUsedGlobalVariables(M, *MergedM, /*CompilerUsed*/ false);
  cloneUsedGlobalVariables(M, *MergedM, /*CompilerUsed*/ true);

  for (Function &F : *MergedM)
    if (!F.isDeclaration()) {
      // Reset the linkage of all functions eligible for virtual constant
      // propagation. The canonical definitions live in the thin LTO module so
      // that they can be imported.
      F.setLinkage(GlobalValue::AvailableExternallyLinkage);
      F.setComdat(nullptr);
    }

  SetVector<GlobalValue *> CfiFunctions;
  for (auto &F : M)
    if ((!F.hasLocalLinkage() || F.hasAddressTaken()) && HasTypeMetadata(&F))
      CfiFunctions.insert(&F);

  // Remove all globals with type metadata, globals with comdats that live in
  // MergedM, and aliases pointing to such globals from the thin LTO module.
  filterModule(&M, [&](const GlobalValue *GV) {
    if (auto *GVar = dyn_cast_or_null<GlobalVariable>(GV->getBaseObject()))
      if (HasTypeMetadata(GVar))
        return false;
    if (const auto *C = GV->getComdat())
      if (MergedMComdats.count(C))
        return false;
    return true;
  });

  promoteInternals(*MergedM, M, ModuleId, CfiFunctions);
  promoteInternals(M, *MergedM, ModuleId, CfiFunctions);

  auto &Ctx = MergedM->getContext();
  SmallVector<MDNode *, 8> CfiFunctionMDs;
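  // Each cfi.functions entry has the form !{!"name", i8 linkage, type
  // metadata...}, recording the CFI functions whose bodies remain in the thin
  // LTO module for use during the regular LTO link.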
  for (auto V : CfiFunctions) {
    Function &F = *cast<Function>(V);
    SmallVector<MDNode *, 2> Types;
    F.getMetadata(LLVMContext::MD_type, Types);

    SmallVector<Metadata *, 4> Elts;
    Elts.push_back(MDString::get(Ctx, F.getName()));
    CfiFunctionLinkage Linkage;
    if (lowertypetests::isJumpTableCanonical(&F))
      Linkage = CFL_Definition;
    else if (F.hasExternalWeakLinkage())
      Linkage = CFL_WeakDeclaration;
    else
      Linkage = CFL_Declaration;
    Elts.push_back(ConstantAsMetadata::get(
        llvm::ConstantInt::get(Type::getInt8Ty(Ctx), Linkage)));
    append_range(Elts, Types);
    CfiFunctionMDs.push_back(MDTuple::get(Ctx, Elts));
  }

  if (!CfiFunctionMDs.empty()) {
    NamedMDNode *NMD = MergedM->getOrInsertNamedMetadata("cfi.functions");
    for (auto MD : CfiFunctionMDs)
      NMD->addOperand(MD);
  }

  SmallVector<MDNode *, 8> FunctionAliases;
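  // Aliases to functions are recorded in the "aliases" named metadata as
  // !{!"alias name", !"aliasee name", i8 visibility, i8 weak-for-linker} for
  // use during the regular LTO link.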
  for (auto &A : M.aliases()) {
    if (!isa<Function>(A.getAliasee()))
      continue;

    auto *F = cast<Function>(A.getAliasee());

    Metadata *Elts[] = {
        MDString::get(Ctx, A.getName()),
        MDString::get(Ctx, F->getName()),
        ConstantAsMetadata::get(
            ConstantInt::get(Type::getInt8Ty(Ctx), A.getVisibility())),
        ConstantAsMetadata::get(
            ConstantInt::get(Type::getInt8Ty(Ctx), A.isWeakForLinker())),
    };

    FunctionAliases.push_back(MDTuple::get(Ctx, Elts));
  }

  if (!FunctionAliases.empty()) {
    NamedMDNode *NMD = MergedM->getOrInsertNamedMetadata("aliases");
    for (auto MD : FunctionAliases)
      NMD->addOperand(MD);
  }

  SmallVector<MDNode *, 8> Symvers;
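  // Module-level asm ".symver" directives are recorded as !{!"name", !"alias"}
  // under the "symvers" named metadata, since the merged module's inline asm
  // has been cleared above.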
  ModuleSymbolTable::CollectAsmSymvers(M, [&](StringRef Name, StringRef Alias) {
    Function *F = M.getFunction(Name);
    if (!F || F->use_empty())
      return;

    Symvers.push_back(MDTuple::get(
        Ctx, {MDString::get(Ctx, Name), MDString::get(Ctx, Alias)}));
  });

  if (!Symvers.empty()) {
    NamedMDNode *NMD = MergedM->getOrInsertNamedMetadata("symvers");
    for (auto MD : Symvers)
      NMD->addOperand(MD);
  }

  simplifyExternals(*MergedM);

  // FIXME: Try to re-use BSI and PFI from the original module here.
  ProfileSummaryInfo PSI(M);
  ModuleSummaryIndex Index = buildModuleSummaryIndex(M, nullptr, &PSI);

  // Mark the merged module as requiring full LTO. We still want an index for
  // it though, so that it can participate in summary-based dead stripping.
  MergedM->addModuleFlag(Module::Error, "ThinLTO", uint32_t(0));
  ModuleSummaryIndex MergedMIndex =
      buildModuleSummaryIndex(*MergedM, nullptr, &PSI);

  SmallVector<char, 0> Buffer;

  BitcodeWriter W(Buffer);
  // Save the module hash produced for the full bitcode, which will
  // be used in the backends, and use that in the minimized bitcode
  // produced for the thin link.
  ModuleHash ModHash = {{0}};
  W.writeModule(M, /*ShouldPreserveUseListOrder=*/false, &Index,
                /*GenerateHash=*/true, &ModHash);
  W.writeModule(*MergedM, /*ShouldPreserveUseListOrder=*/false, &MergedMIndex);
  W.writeSymtab();
  W.writeStrtab();
  OS << Buffer;

  // If a minimized bitcode module was requested for the thin link, write only
  // the information needed by the thin link to the given stream (the merged
  // module is still written as usual).
  if (ThinLinkOS) {
    Buffer.clear();
    BitcodeWriter W2(Buffer);
    StripDebugInfo(M);
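    // writeThinLinkBitcode emits the pieces the thin link consumes (module
    // info, the summary index, and the module hash saved above) rather than
    // the full IR of M.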
    W2.writeThinLinkBitcode(M, Index, ModHash);
    W2.writeModule(*MergedM, /*ShouldPreserveUseListOrder=*/false,
                   &MergedMIndex);
    W2.writeSymtab();
    W2.writeStrtab();
    *ThinLinkOS << Buffer;
  }
}

// Check if the LTO Unit splitting has been enabled.
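// The "EnableSplitLTOUnit" module flag is set by the frontend when splitting
// is required (e.g. under -fsplit-lto-unit, -fsanitize=cfi or
// -fwhole-program-vtables).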
|
|
|
|
bool enableSplitLTOUnit(Module &M) {
|
[LTO] Record whether LTOUnit splitting is enabled in index
Summary:
Records in the module summary index whether the bitcode was compiled
with the option necessary to enable splitting the LTO unit
(e.g. -fsanitize=cfi, -fwhole-program-vtables, or -fsplit-lto-unit).
The information is passed down to the ModuleSummaryIndex builder via a
new module flag "EnableSplitLTOUnit", which is propagated onto a flag
on the summary index.
This is then used during the LTO link to check whether all linked
summaries were built with the same value of this flag. If not, an error
is issued when we detect a situation requiring whole program visibility
of the class hierarchy. This is the case when both of the following
conditions are met:
1) We are performing LowerTypeTests or Whole Program Devirtualization.
2) There are type tests or type checked loads in the code.
Note I have also changed the ThinLTOBitcodeWriter to also gate the
module splitting on the value of this flag.
Reviewers: pcc
Subscribers: ormris, mehdi_amini, Prazek, inglorion, eraman, steven_wu, dexonsmith, arphaman, dang, llvm-commits
Differential Revision: https://reviews.llvm.org/D53890
llvm-svn: 350948
2019-01-11 19:31:57 +01:00
|
|
|
bool EnableSplitLTOUnit = false;
|
|
|
|
if (auto *MD = mdconst::extract_or_null<ConstantInt>(
|
|
|
|
M.getModuleFlag("EnableSplitLTOUnit")))
|
|
|
|
EnableSplitLTOUnit = MD->getZExtValue();
  return EnableSplitLTOUnit;
}
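
// Illustrative only (not code from this pass): a bitcode producer typically
// opts a module into LTO unit splitting by setting the module flag read above,
// e.g.:
//
//   M.addModuleFlag(Module::Error, "EnableSplitLTOUnit", 1);
//
// When the flag is absent or zero, enableSplitLTOUnit() returns false and the
// module is written unsplit, relying on index-based WPD instead.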

// Returns whether this module needs to be split because it uses type metadata.
bool hasTypeMetadata(Module &M) {
  for (auto &GO : M.global_objects()) {
    if (GO.hasMetadata(LLVMContext::MD_type))
      return true;
  }
  return false;
}
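
// Illustrative only: type metadata is attached to vtable globals by frontends
// for CFI/whole-program devirtualization. A minimal sketch of tagging a
// global, assuming a hypothetical GlobalVariable *GV and LLVMContext &Ctx:
//
//   Metadata *Ops[] = {ConstantAsMetadata::get(ConstantInt::get(
//                          Type::getInt64Ty(Ctx), /*Offset=*/0)),
//                      MDString::get(Ctx, "_ZTS1A")};
//   GV->addMetadata(LLVMContext::MD_type, *MDNode::get(Ctx, Ops));
//
// Any global object carrying such metadata makes hasTypeMetadata() return
// true.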

void writeThinLTOBitcode(raw_ostream &OS, raw_ostream *ThinLinkOS,
                         function_ref<AAResults &(Function &)> AARGetter,
                         Module &M, const ModuleSummaryIndex *Index) {
  std::unique_ptr<ModuleSummaryIndex> NewIndex = nullptr;
  // See if this module has any type metadata. If so, we try to split it
  // or at least promote type ids to enable WPD.
  if (hasTypeMetadata(M)) {
    if (enableSplitLTOUnit(M))
      return splitAndWriteThinLTOBitcode(OS, ThinLinkOS, AARGetter, M);
    // Promote type ids as needed for index-based WPD.
    std::string ModuleId = getUniqueModuleId(&M);
    if (!ModuleId.empty()) {
      promoteTypeIds(M, ModuleId);
      // Need to rebuild the index so that it contains type metadata
      // for the newly promoted type ids.
      // FIXME: Probably should not bother building the index at all
      // in the caller of writeThinLTOBitcode (which does so via the
      // ModuleSummaryIndexAnalysis pass), since we have to rebuild it
      // anyway whenever there is type metadata (here or in
      // splitAndWriteThinLTOBitcode). Just always build it once via
      // buildModuleSummaryIndex when the Module(s) are ready.
      ProfileSummaryInfo PSI(M);
      NewIndex = std::make_unique<ModuleSummaryIndex>(
          buildModuleSummaryIndex(M, nullptr, &PSI));
      Index = NewIndex.get();
    }
  }
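
  // At this point the module is written whole: either it had no type metadata,
  // or LTO unit splitting was disabled and type ids were promoted where
  // possible (rebuilding the summary index) so the thin link can still perform
  // index-based WPD.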

  // Write it out as an unsplit ThinLTO module.

  // Save the module hash produced for the full bitcode, which will
  // be used in the backends, and use that in the minimized bitcode
  // produced for the thin link.
  ModuleHash ModHash = {{0}};
  WriteBitcodeToFile(M, OS, /*ShouldPreserveUseListOrder=*/false, Index,
                     /*GenerateHash=*/true, &ModHash);

  // If a minimized bitcode module was requested for the thin link, only
  // the information that is needed by the thin link will be written to the
  // given OS.
  if (ThinLinkOS && Index)
    WriteThinLinkBitcodeToFile(M, *ThinLinkOS, *Index, ModHash);
}
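
// This helper is shared by the two entry points below: the legacy
// WriteThinLTOBitcode pass and the new pass manager ThinLTOBitcodeWriterPass
// both forward their output streams and summary index here.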

class WriteThinLTOBitcode : public ModulePass {
  raw_ostream &OS; // raw_ostream to print on
  // The output stream on which to emit a minimized module for use
  // just in the thin link, if requested.
  raw_ostream *ThinLinkOS;

public:
  static char ID; // Pass identification, replacement for typeid

  WriteThinLTOBitcode() : ModulePass(ID), OS(dbgs()), ThinLinkOS(nullptr) {
    initializeWriteThinLTOBitcodePass(*PassRegistry::getPassRegistry());
  }

  explicit WriteThinLTOBitcode(raw_ostream &o, raw_ostream *ThinLinkOS)
      : ModulePass(ID), OS(o), ThinLinkOS(ThinLinkOS) {
    initializeWriteThinLTOBitcodePass(*PassRegistry::getPassRegistry());
  }

  StringRef getPassName() const override { return "ThinLTO Bitcode Writer"; }

  bool runOnModule(Module &M) override {
    const ModuleSummaryIndex *Index =
        &(getAnalysis<ModuleSummaryIndexWrapperPass>().getIndex());
    writeThinLTOBitcode(OS, ThinLinkOS, LegacyAARGetter(*this), M, Index);
    return true;
  }

  void getAnalysisUsage(AnalysisUsage &AU) const override {
    AU.setPreservesAll();
    AU.addRequired<AssumptionCacheTracker>();
    AU.addRequired<ModuleSummaryIndexWrapperPass>();
    AU.addRequired<TargetLibraryInfoWrapperPass>();
  }
};
} // anonymous namespace

char WriteThinLTOBitcode::ID = 0;
INITIALIZE_PASS_BEGIN(WriteThinLTOBitcode, "write-thinlto-bitcode",
                      "Write ThinLTO Bitcode", false, true)
INITIALIZE_PASS_DEPENDENCY(AssumptionCacheTracker)
INITIALIZE_PASS_DEPENDENCY(ModuleSummaryIndexWrapperPass)
INITIALIZE_PASS_DEPENDENCY(TargetLibraryInfoWrapperPass)
INITIALIZE_PASS_END(WriteThinLTOBitcode, "write-thinlto-bitcode",
                    "Write ThinLTO Bitcode", false, true)

ModulePass *llvm::createWriteThinLTOBitcodePass(raw_ostream &Str,
                                                raw_ostream *ThinLinkOS) {
  return new WriteThinLTOBitcode(Str, ThinLinkOS);
}
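
// Illustrative legacy pass manager usage (a sketch; OutOS and ThinLinkOutOS
// are hypothetical raw_ostreams owned by the caller, not names from this
// file):
//
//   legacy::PassManager PM;
//   PM.add(createWriteThinLTOBitcodePass(OutOS, &ThinLinkOutOS));
//   PM.run(M);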

PreservedAnalyses
llvm::ThinLTOBitcodeWriterPass::run(Module &M, ModuleAnalysisManager &AM) {
  FunctionAnalysisManager &FAM =
      AM.getResult<FunctionAnalysisManagerModuleProxy>(M).getManager();
  writeThinLTOBitcode(OS, ThinLinkOS,
                      [&FAM](Function &F) -> AAResults & {
                        return FAM.getResult<AAManager>(F);
                      },
                      M, &AM.getResult<ModuleSummaryIndexAnalysis>(M));
  return PreservedAnalyses::all();
}
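
// Illustrative new pass manager usage (a sketch; assumes a
// ModuleAnalysisManager MAM with the standard analyses, including
// ModuleSummaryIndexAnalysis, registered via PassBuilder, and the same
// hypothetical output streams as above):
//
//   ModulePassManager MPM;
//   MPM.addPass(ThinLTOBitcodeWriterPass(OutOS, &ThinLinkOutOS));
//   MPM.run(M, MAM);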