1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-18 18:42:46 +02:00
Commit Graph

292 Commits

Author SHA1 Message Date
Fangrui Song
2174d3b961 [LTO] Add SelectionKind to IRSymtab and use it in ld.lld/LLVMgold
In PGO, a C++ external linkage function `foo` has a private counter
`__profc_foo` and a private `__profd_foo` in a `comdat nodeduplicate`.

A `__attribute__((weak))` function `foo` has a weak hidden counter `__profc_foo`
and a private `__profd_foo` in a `comdat nodeduplicate`.

In `ld.lld a.o b.o`, say a.o defines an external linkage `foo` and b.o
defines a weak `foo`. Currently we treat `comdat nodeduplicate` as `comdat any`,
ld.lld will incorrectly consider `b.o:__profc_foo` non-prevailing.  In the worst
case when `b.o:__profd_foo` is retained and `b.o:__profc_foo` isn't, there will
be dangling reference causing an `undefined hidden symbol` error.

Add SelectionKind to `Comdat` in IRSymtab and let linkers ignore nodeduplicate comdat.

Differential Revision: https://reviews.llvm.org/D106228
2021-07-20 13:22:00 -07:00
Sami Tolvanen
dd2d6bb1ca LTO: Export functions referenced by non-canonical CFI jump tables
LowerTypeTests pass adds functions with a non-canonical jump table
to cfiFunctionDecls instead of cfiFunctionDefs. As the jump table
is in the regular LTO object, these functions will also need to be
exported. This change fixes the non-canonical jump table case and
adds a test similar to the existing one for canonical jump tables.

Reviewed By: pcc

Differential Revision: https://reviews.llvm.org/D103120
2021-06-08 14:57:43 -07:00
serge-sans-paille
73bc91a5e6 Revert "[NFC] remove explicit default value for strboolattr attribute in tests"
This reverts commit bda6e5bee04c75b1f1332b4fd1ac4e8ef6c3c247.

See https://lab.llvm.org/buildbot/#/builders/109/builds/15424 for instance
2021-05-24 19:43:40 +02:00
serge-sans-paille
1f63b26006 [NFC] remove explicit default value for strboolattr attribute in tests
Since d6de1e1a71406c75a4ea4d5a2fe84289f07ea3a1, no attributes is quivalent to
setting attribute to false.

This is a preliminary commit for https://reviews.llvm.org/D99080
2021-05-24 19:31:04 +02:00
Nick Lewycky
a634617e94 Remove dead CHECK-ERR line. 2021-03-30 09:31:00 -07:00
Florian Hahn
c376195fed Revert "[LV] Move runtime pointer size check to LVP::plan()."
This reverts commit 25fbe803d4dbcf8ff3a3a9ca161f5b9a68353ed0.

This breaks a clang test which filters for the wrong remark type.
2021-03-29 14:41:53 +01:00
Florian Hahn
53ccebfadc [LV] Move runtime pointer size check to LVP::plan().
This removes the need for the remaining doesNotMeet check and instead
directly checks if there are too many runtime checks for vectorization
in the planner.

A subsequent patch will adjust the logic used to decide whether to
vectorize with runtime to consider their cost more accurately.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D98634
2021-03-29 14:12:29 +01:00
Yuanfang Chen
f44fb33098 [LTO][MC] Discard non-prevailing defined symbols in module-level assembly
This is the alternative approach to D96931.

In LTO, for each module with inlineasm block, prepend directive ".lto_discard <sym>, <sym>*" to the beginning of the inline
asm.  ".lto_discard" is both a module inlineasm block marker and (optionally) provides a list of symbols to be discarded.

In MC while emitting for inlineasm, discard symbol binding & symbol
definitions according to ".lto_disard".

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D98762
2021-03-18 15:33:42 -07:00
Nikita Popov
93b123f786 [DCE] Don't remove non-willreturn calls
In both ADCE and BDCE (via DemandedBits) we should not remove
instructions that are not guaranteed to return. This issue was
pointed out by fhahn in the recent llvm-dev thread.

Differential Revision: https://reviews.llvm.org/D96993
2021-02-19 12:35:40 +01:00
Florian Hahn
40e47d0fef Recommit "[LTO] Use lto::backend for code generation."
This version of the patch includes a fix for the cfi failures.

(undoes the revert commit 7db390cc7738a9ba0ed7d4ca59ab6ea2e69c47e9)

It also undoes reverts of follow-up patches that also needed reverting
originally:

  * [LTO] Add option enable NewPM with LTOCodeGenerator.
    (undoes revert commit 0a17664b47c153aa26a0d31b4835f26375440ec6)

  * [LTOCodeGenerator] Use lto::Config for options (NFC)."
    (undoes revert commit b0a8e41cfff717ff067bf63412d6edb0280608cd)
2021-02-15 10:05:42 +00:00
Fangrui Song
d33006d182 ELFObjectWriter: Don't sort non-local symbols
As we don't sort local symbols, don't sort non-local symbols.  This makes
non-local symbols appear in their register order, which matches GNU as. The
register order is nice in that you can write tests with interleaved CHECK
prefixes, e.g.

```
// CHECK: something about foo
.globl foo
foo:
// CHECK: something about bar
.globl bar
bar:
```

With the lexicographical order, the user needs to place lexicographical smallest
symbol first or keep CHECK prefixes in one place.
2021-02-13 10:32:27 -08:00
Florian Hahn
b7ef828be9 Revert "[LTO] Use lto::backend for code generation."
This reverts commit 6a59f0560648b43324b5aed51b9ef996404a25e0, because
it is causing failures on green dragon.
2021-02-03 22:49:30 +00:00
Florian Hahn
66a5e06679 Revert "[LTO] Add option enable NewPM with LTOCodeGenerator."
This reverts commit 7a6a2cc81aaf064e6f5bc9a9a16973f552d2bdc2 because
it is causing failures on green dragon.
2021-02-03 22:49:20 +00:00
Florian Hahn
d993d158ec [LTO] Add option enable NewPM with LTOCodeGenerator.
This patch adds an option to enable the new pass manager in
LTOCodeGenerator. It also updates a few tests with legacy PM specific
tests, which started failing after 6a59f0560648 when
LLVM_ENABLE_NEW_PASS_MANAGER=true.
2021-01-30 11:54:20 +00:00
Florian Hahn
da0198959b [LTO] Use lto::backend for code generation.
This patch updates LTOCodeGenerator to use the utilities provided by
LTOBackend to run middle-end optimizations and backend code generation.

This is a first step towards unifying the code used by libLTO's C API
and the newer, C++ interface (see PR41541).

The immediate motivation is to allow using the new pass manager when
doing LTO using libLTO's C API, which is used on Darwin, among others.

With the changes, there are no codegen/stats differences when building
MultiSource/SPEC2000/SPEC2006 on Darwin X86 with LTO, compared
to without the patch.

Reviewed By: steven_wu

Differential Revision: https://reviews.llvm.org/D94487
2021-01-30 10:09:55 +00:00
Florian Hahn
962b770084 [LTO] Add support for existing Config::Freestanding option.
lto::Config has a field to control whether the build is "freestanding"
(no builtins) or not, but it is not hooked up to the code actually
running the passes.

This patch adds support for the flag to both the code that runs
optimization with the new and old pass managers, by explicitly adding a
TargetLibraryInfo instance. If Freestanding is true, all library functions
are disabled.

Reviewed By: steven_wu

Differential Revision: https://reviews.llvm.org/D94630
2021-01-22 13:45:39 +00:00
Florian Hahn
2bcad0829e [LTO] Add test for freestanding LTO option.
This patch adds a test for the -lto-freestanding option, similar to
llvm/test/ThinLTO/X86/tli-nobuiltin.ll.
2021-01-13 20:43:22 +00:00
Florian Hahn
04553aba9f [LTO] Add test to ensure objc-arc-contract is executed.
This test adds additional test coverage for upcoming refactorings.
2021-01-13 12:18:17 +00:00
Teresa Johnson
01add29a48 [ICP] Don't promote when target not defined in module
This guards against cases where the symbol was dead code eliminated in
the binary by ThinLTO, and we have a sample profile collected for one
binary but used to optimize another.

Most of the benefit from ICP comes from inlining the target, which we
can't do with only a declaration anyway. If this is in the pre-ThinLTO
link step (e.g. for instrumentation based PGO), we will attempt the
promotion again in the ThinLTO backend after importing anyway, and we
don't need the early promotion to facilitate that.

Differential Revision: https://reviews.llvm.org/D92804
2020-12-08 07:45:36 -08:00
Wei Wang
d0b74589e5 [Remarks][1/2] Expand remarks hotness threshold option support in more tools
This is the #1 of 2 changes that make remarks hotness threshold option
available in more tools. The changes also allow the threshold to sync with
hotness threshold from profile summary with special value 'auto'.

This change modifies the interface of lto::setupLLVMOptimizationRemarks() to
accept remarks hotness threshold. Update all the tools that use it with remarks
hotness threshold options:

* lld: '--opt-remarks-hotness-threshold='
* llvm-lto2: '--pass-remarks-hotness-threshold='
* llvm-lto: '--lto-pass-remarks-hotness-threshold='
* gold plugin: '-plugin-opt=opt-remarks-hotness-threshold='

Differential Revision: https://reviews.llvm.org/D85809
2020-11-30 21:55:49 -08:00
Yuanfang Chen
728cc71092 [IRMover] Avoid materializing global value that belongs to not-yet-linked module
We saw the same assertion failure mentioned here
https://bugs.llvm.org/show_bug.cgi?id=42063 in our internal tests.

The failure happens in the same circumstance as D47898 and D66814 where
uniqueing of DICompositeTypes causes `Mapper::mapValue` to be called on
GlobalValues(`G`) from a not-yet-linked module(`M`). The following type-mapping for
`G` may not complete correctly (fail to unique types etc.  depending on the
the complexity of the types) because IRLinker::computeTypeMapping is not done for `M`
in this path.

D47898 and D66814 fixed some type-mapping issue after Mapper::mapValue
is called on `G`. However, it seems it did not handle some complex cases. I
think we should delay linking globals like `G` until its owing module is
linked. In this way, we could save unnecessary type mapping and prune
these corner cases. It is also supposed to reduce the total number of structs
ending up in the combined module.

D47898 is reverted (its test is kept) because it regresses the test case here.
D66814 could also be reverted (the `check-all` looks good). But it looks reasonable
anyway, so I thought I should keep it.

Also tested the patch with clang self-host regularLTO/ThinLTO build, things look
good as well.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D87001
2020-10-07 18:14:07 -07:00
Mircea Trofin
2d0a6945c4 [ThinLTO] add post-thinlto-merge option to -lto-embed-bitcode
This will embed bitcode after (Thin)LTO merge, but before optimizations.
In the case the thinlto backend is called from clang, the .llvmcmd
section is also produced. Doing so in the case where the caller is the
linker doesn't yet have a motivation, and would require plumbing through
command line args.

Differential Revision: https://reviews.llvm.org/D87636
2020-09-15 15:56:11 -07:00
Mircea Trofin
a6aa03251f [ThinLTO] Make -lto-embed-bitcode an enum
The current behavior of -lto-embed-bitcode is not quite the same as that
of -fembed-bitcode. While both populate .llvmbc with bitcode, the latter
populates it with pre-optimized bitcode(*), while the former with
post-optimized. The scenarios driving them are different - the latter's
goal is to allow re-compilation, while the former, IIUC, is execution.

I plan to add a third mode for thinlto cases, closely-related to
-fembed-bitcode's scenario: adding the bitcode pre-optimization, but
post-merging. This would allow re-compilation without requiring the
other .bc files that were merged (akin to how -fembed-bitcode allows
recompilation without all the .h files)

The third mode can't co-exist with the current -lto-embed-bitcode mode,
because the latter would overwrite it. For clarity, we change
-lto-embed-bitcode to be an enum.

(*) That's the compiler semantics. The driver splits compilation in 2
phases, so if -fembed-bitcode is given to the driver, the .llvmbc is
optimized bitcode; if the option is passed to the compiler (after -cc1),
the section is pre-optimized.

Differential Revision: https://reviews.llvm.org/D87477
2020-09-11 13:24:54 -07:00
Steven Wu
0298cbfe6d [LTO] Don't apply LTOPostLink module flag during writeMergedModule
For `ld64` which uses legacy LTOCodeGenerator, it relies on
writeMergedModule to perform `ld -r` (generates a linked object file).
If all the inputs to `ld -r` is fullLTO bitcode, `ld64` will linked the
bitcode module, internalize all the symbols and write out another
fullLTO bitcode object file. This bitcode file doesn't have all the
bitcode inputs and it should not have LTOPostLink module flag. It will
also cause error when this bitcode object file is linked with other LTO
object file.
Fix the issue by not applying LTOPostLink flag during writeMergedModule
function. The flag should only be added when all the bitcode are linked
and ready to be optimized.

rdar://problem/58462798

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D84789
2020-08-26 11:17:45 -07:00
Fangrui Song
0c234a7fc0 [TargetLoweringObjectFileImpl] Make .llvmbc and .llvmcmd non-SHF_ALLOC
There are two ways .llvmbc can be produced:

* clang -c -fembed-bitcode=all (which also produces .llvmcmd)
* LTO backend: ld.lld -mllvm -lto-embed-bitcode or -plugin-opt=-lto-embed-bitcode

.llvmbc and .llvmcmd have the SHF_ALLOC flag, so they can be dropped by
--gc-sections.

This patch sets SectionKind::Metadata to drop the SHF_ALLOC flag. This
is conceptually correct: the two sections are not part of the process
image, so SHF_ALLOC is not appropriate.

`test/LTO/X86/embed-bitcode.ll`: changed `llvm-objcopy -O binary --only-section` to
`llvm-objcopy --dump-section`. `-O binary` does not dump non-SHF_ALLOC sections.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D86374
2020-08-25 13:37:29 -07:00
Wei Wang
7dabc836ba [FIX] Avoid creating BFI when emitting remarks for dead functions
Dead function has its body stripped away, and can cause various
analyses to panic. Also it does not make sense to apply analyses on
such function.

Reviewed By: xazax.hun, MaskRay, wenlei, hoy

Differential Revision: https://reviews.llvm.org/D84715
2020-08-25 11:12:38 -07:00
Wei Mi
93d80a543f [SampleFDO] Add use-sample-profile function attribute.
When sampleFDO is enabled, people may expect they can use
-fno-profile-sample-use to opt-out using sample profile for a certain file.
That could be either for debugging purpose or for performance tuning purpose.
However, when thinlto is enabled, if a function in file A compiled with
-fno-profile-sample-use is imported to another file B compiled with
-fprofile-sample-use, the inlined copy of the function in file B may still
get its profile annotated.

The inconsistency may even introduce profile unused warning because if the
target is not compiled with explicit debug information flag, the function
in file A won't have its debug information enabled (debug information will
be enabled implicitly only when -fprofile-sample-use is used). After it is
imported into file B which is compiled with -fprofile-sample-use, profile
annotation for the outline copy of the function will fail because the
function has no debug information, and that will trigger  profile unused
warning.

We add a new attribute use-sample-profile to control whether a function
will use its sample profile no matter for its outline or inline copies.
That will make the behavior of -fno-profile-sample-use consistent.

Differential Revision: https://reviews.llvm.org/D79959
2020-06-02 17:23:17 -07:00
Eli Friedman
f5d3346387 Infer alignment of unmarked loads in IR/bitcode parsing.
For IR generated by a compiler, this is really simple: you just take the
datalayout from the beginning of the file, and apply it to all the IR
later in the file. For optimization testcases that don't care about the
datalayout, this is also really simple: we just use the default
datalayout.

The complexity here comes from the fact that some LLVM tools allow
overriding the datalayout: some tools have an explicit flag for this,
some tools will infer a datalayout based on the code generation target.
Supporting this properly required plumbing through a bunch of new
machinery: we want to allow overriding the datalayout after the
datalayout is parsed from the file, but before we use any information
from it. Therefore, IR/bitcode parsing now has a callback to allow tools
to compute the datalayout at the appropriate time.

Not sure if I covered all the LLVM tools that want to use the callback.
(clang? lli? Misc IR manipulation tools like llvm-link?). But this is at
least enough for all the LLVM regression tests, and IR without a
datalayout is not something frontends should generate.

This change had some sort of weird effects for certain CodeGen
regression tests: if the datalayout is overridden with a datalayout with
a different program or stack address space, we now parse IR based on the
overridden datalayout, instead of the one written in the file (or the
default one, if none is specified). This broke a few AVR tests, and one
AMDGPU test.

Outside the CodeGen tests I mentioned, the test changes are all just
fixing CHECK lines and moving around datalayout lines in weird places.

Differential Revision: https://reviews.llvm.org/D78403
2020-05-14 13:03:50 -07:00
Craig Topper
8757f48ecf [IR] Replace all uses of CallBase::getCalledValue() with getCalledOperand().
This method has been commented as deprecated for a while. Remove
it and replace all uses with the equivalent getCalledOperand().

I also made a few cleanups in here. For example, to removes use
of getElementType on a pointer when we could just use getFunctionType
from the call.

Differential Revision: https://reviews.llvm.org/D78882
2020-04-27 22:17:03 -07:00
Fangrui Song
d41ca54775 [ThinLTO] Drop dso_local if a GlobalVariable satisfies isDeclarationForLinker()
dso_local leads to direct access even if the definition is not within this compilation unit (it is
still in the same linkage unit). On ELF, such a relocation (e.g. R_X86_64_PC32) referencing a
STB_GLOBAL STV_DEFAULT object can cause a linker error in a -shared link.

If the linkage is changed to available_externally, the dso_local flag should be dropped, so that no
direct access will be generated.

The current behavior is benign, because -fpic does not assume dso_local
(clang/lib/CodeGen/CodeGenModule.cpp:shouldAssumeDSOLocal).
If we do that for -fno-semantic-interposition (D73865), there will be an
R_X86_64_PC32 linker error without this patch.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D74751
2020-04-07 15:46:01 -07:00
Fangrui Song
3d2508df1f [X86InstPrinter] Change printPCRelImm to print the target address in hexadecimal form
```
// llvm-objdump -d output (before)
400000: e8 0b 00 00 00   callq 11
400005: e8 0b 00 00 00   callq 11

// llvm-objdump -d output (after)
400000: e8 0b 00 00 00  callq 0x400010
400005: e8 0b 00 00 00  callq 0x400015

// GNU objdump -d. The lack of 0x is not ideal because the result cannot be re-assembled
400000: e8 0b 00 00 00  callq 400010
400005: e8 0b 00 00 00  callq 400015
```

In llvm-objdump, we pass the address of the next MCInst. Ideally we
should just thread the address of the current address, unfortunately we
cannot call X86MCCodeEmitter::encodeInstruction (X86MCCodeEmitter
requires MCInstrInfo and MCContext) to get the length of the MCInst.

MCInstPrinter::printInst has other callers (e.g llvm-mc -filetype=asm, llvm-mca) which set Address to 0.
They leave MCInstPrinter::PrintBranchImmAsAddress as false and this change is a no-op for them.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D76580
2020-03-26 08:28:59 -07:00
Fangrui Song
e44e8f46a9 ThinLTOBitcodeWriter: drop dso_local when a GlobalVariable is converted to a declaration
If we infer the dso_local flag for -fpic, dso_local should be dropped
when we convert a GlobalVariable a declaration. dso_local causes the
generation of direct access (e.g. R_X86_64_PC32). Such relocations referencing
STB_GLOBAL STV_DEFAULT objects are not allowed in a -shared link.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D74749
2020-03-05 18:09:33 -08:00
Fangrui Song
25a0241f66 [llvm-objdump] -d: print 00000000 <foo>: instead of 00000000 foo:
The new behavior matches GNU objdump. A pair of angle brackets makes tests slightly easier.

`.foo:` is not unique and thus cannot be used in a `CHECK-LABEL:` directive.
Without `-LABEL`, the CHECK line can match the `Disassembly of section`
line and causes the next `CHECK-NEXT:` to fail.

```
Disassembly of section .foo:

0000000000001634 .foo:
```

Bdragon: <> has metalinguistic connotation. it just "feels right"

Reviewed By: rupprecht

Differential Revision: https://reviews.llvm.org/D75713
2020-03-05 18:05:28 -08:00
Francis Visoiu Mistrih
b4279d9528 [LTO][Legacy] Add new API to query Mach-O CPU (sub)type
Tools working with object files on Darwin (e.g. lipo) may need to know
properties like the CPU type and subtype of a bitcode file. The logic of
converting a triple to a Mach-O CPU_(SUB_)TYPE should be provided by
LLVM instead of relying on tools to re-implement it.

Differential Revision: https://reviews.llvm.org/D75067
2020-02-28 12:56:05 -08:00
Yuanfang Chen
dd53274771 Revert "Revert "Reland "[Support] make report_fatal_error abort instead of exit"""
This reverts commit 80a34ae31125aa46dcad47162ba45b152aed968d with fixes.

Previously, since bots turning on EXPENSIVE_CHECKS are essentially turning on
MachineVerifierPass by default on X86 and the fact that
inline-asm-avx-v-constraint-32bit.ll and inline-asm-avx512vl-v-constraint-32bit.ll
are not expected to generate functioning machine code, this would go
down to `report_fatal_error` in MachineVerifierPass. Here passing
`-verify-machineinstrs=0` to make the intent explicit.
2020-02-13 10:16:06 -08:00
Yuanfang Chen
2dbac841f9 Revert "Revert "Revert "Reland "[Support] make report_fatal_error abort instead of exit""""
This reverts commit bb51d243308dbcc9a8c73180ae7b9e47b98e68fb.
2020-02-13 10:08:05 -08:00
Yuanfang Chen
93e82c22ef Revert "Revert "Reland "[Support] make report_fatal_error abort instead of exit"""
This reverts commit 80a34ae31125aa46dcad47162ba45b152aed968d with fixes.

On bots llvm-clang-x86_64-expensive-checks-ubuntu and
llvm-clang-x86_64-expensive-checks-debian only,
llc returns 0 for these two tests unexpectedly. I tweaked the RUN line a little
bit in the hope that LIT is the culprit since this change is not in the
codepath these tests are testing.
llvm\test\CodeGen\X86\inline-asm-avx-v-constraint-32bit.ll
llvm\test\CodeGen\X86\inline-asm-avx512vl-v-constraint-32bit.ll
2020-02-13 10:02:53 -08:00
Yuanfang Chen
c7fb4c55c4 Revert "Reland "[Support] make report_fatal_error abort instead of exit""
This reverts commit rGcd5b308b828e, rGcd5b308b828e, rG8cedf0e2994c.

There are issues to be investigated for polly bots and bots turning on
EXPENSIVE_CHECKS.
2020-02-11 20:41:53 -08:00
Yuanfang Chen
83a2f3c1ba Reland "[Support] make report_fatal_error abort instead of exit"
Summary:
Reland D67847 after D73742 is committed. Replace `sys::Process::Exit(1)`
with `abort` in `report_fatal_error`.

After this patch, for tools turning on `CrashRecoveryContext`,
crash handler installed by `CrashRecoveryContext` is called unless
they installed a non-returning handler using `llvm::install_fatal_error_handler`
like `cc1_main` currently does.

Reviewers: rnk, MaskRay, aganea, hans, espindola, jhenderson

Subscribers: jholewinski, qcolombet, dschuff, jyknight, emaste, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, zzheng, edward-jones, atanasyan, steven_wu, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, rupprecht, jocewei, jsji, Jim, dmgreen, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, kerbowa, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D74456
2020-02-11 18:20:40 -08:00
Gabor Horvath
fe44f75159 [LTO] Add optimization remarks for removed functions
This only works with regular LTO for now.

Differential Revision: https://reviews.llvm.org/D73597
2020-01-29 15:53:51 -08:00
Yuanfang Chen
b1c09bbef0 Revert "[Support] make report_fatal_error abort instead of exit"
This reverts commit 647c3f4e47de8a850ffcaa897db68702d8d2459a.

Got bots failure from sanitizer-windows and maybe others.
2020-01-15 17:52:25 -08:00
Yuanfang Chen
725cd0da61 [Support] make report_fatal_error abort instead of exit
Summary:
This patch could be treated as a rebase of D33960. It also fixes PR35547.
A fix for `llvm/test/Other/close-stderr.ll` is proposed in D68164. Seems
the consensus is that the test is passing by chance and I'm not
sure how important it is for us. So it is removed like in D33960 for now.
The rest of the test fixes are just adding `--crash` flag to `not` tool.

** The reason it fixes PR35547 is

`exit` does cleanup including calling class destructor whereas `abort`
does not do any cleanup. In multithreading environment such as ThinLTO or JIT,
threads may share states which mostly are ManagedStatic<>. If faulting thread
tearing down a class when another thread is using it, there are chances of
memory corruption. This is bad 1. It will stop error reporting like pretty
stack printer; 2. The memory corruption is distracting and nondeterministic in
terms of error message, and corruption type (depending one the timing, it
could be double free, heap free after use, etc.).

Reviewers: rnk, chandlerc, zturner, sepavloff, MaskRay, espindola

Reviewed By: rnk, MaskRay

Subscribers: wuzish, jholewinski, qcolombet, dschuff, jyknight, emaste, sdardis, nemanjai, jvesely, nhaehnle, sbc100, arichardson, jgravelle-google, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, lenary, s.egerton, pzheng, cfe-commits, MaskRay, filcab, davide, MatzeB, mehdi_amini, hiraditya, steven_wu, dexonsmith, rupprecht, seiya, llvm-commits

Tags: #llvm, #clang

Differential Revision: https://reviews.llvm.org/D67847
2020-01-15 17:05:13 -08:00
James Henderson
91705af363 [NFC] Fix trivial typos in comments
Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D72143

Patch by Kazuaki Ishizaki.
2020-01-06 10:50:26 +00:00
Fangrui Song
8e823f1246 [llvm-nm] Display STT_GNU_IFUNC as 'i'
Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D71803
2019-12-25 09:47:53 -08:00
Fangrui Song
d9c5df08b1 Migrate function attribute "no-frame-pointer-elim" to "frame-pointer"="all" as cleanups after D56351 2019-12-24 15:57:33 -08:00
Teresa Johnson
3a2a2bb628 [LTO] Support for embedding bitcode section during LTO
Summary:
This adds support for embedding bitcode in a binary during LTO. The libLTO gains supports the `-lto-embed-bitcode` flag. The option allows users of the LTO library to embed a bitcode section. For example, LLD can pass the option via `ld.lld -mllvm=-lto-embed-bitcode`.

This feature allows doing something comparable to `clang -c -fembed-bitcode`, but on the (LTO) linker level. Having bitcode alongside native code has many use-cases. To give an example, the MacOS linker can create a `-bitcode_bundle` section containing bitcode. Also, having this feature built into LLVM is an alternative to 3rd party tools such as [[ https://github.com/travitch/whole-program-llvm | wllvm ]] or [[ https://github.com/SRI-CSL/gllvm | gllvm ]]. As with these tools, this feature simplifies creating "whole-program" llvm bitcode files, but in contrast to wllvm/gllvm it does not rely on a specific llvm frontend/driver.

Patch by Josef Eisl <josef.eisl@oracle.com>

Reviewers: #llvm, #clang, rsmith, pcc, alexshap, tejohnson

Reviewed By: tejohnson

Subscribers: tejohnson, mehdi_amini, inglorion, hiraditya, aheejin, steven_wu, dexonsmith, dang, cfe-commits, llvm-commits, #llvm, #clang

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D68213
2019-12-12 12:34:19 -08:00
Oliver Stannard
6aaf81e821 Reland: Dead Virtual Function Elimination
Remove dead virtual functions from vtables with
replaceNonMetadataUsesWith, so that CGProfile metadata gets cleaned up
correctly.

Original commit message:

Currently, it is hard for the compiler to remove unused C++ virtual
functions, because they are all referenced from vtables, which are referenced
by constructors. This means that if the constructor is called from any live
code, then we keep every virtual function in the final link, even if there
are no call sites which can use it.

This patch allows unused virtual functions to be removed during LTO (and
regular compilation in limited circumstances) by using type metadata to match
virtual function call sites to the vtable slots they might load from. This
information can then be used in the global dead code elimination pass instead
of the references from vtables to virtual functions, to more accurately
determine which functions are reachable.

To make this transformation safe, I have changed clang's code-generation to
always load virtual function pointers using the llvm.type.checked.load
intrinsic, instead of regular load instructions. I originally tried writing
this using clang's existing code-generation, which uses the llvm.type.test
and llvm.assume intrinsics after doing a normal load. However, it is possible
for optimisations to obscure the relationship between the GEP, load and
llvm.type.test, causing GlobalDCE to fail to find virtual function call
sites.

The existing linkage and visibility types don't accurately describe the scope
in which a virtual call could be made which uses a given vtable. This is
wider than the visibility of the type itself, because a virtual function call
could be made using a more-visible base class. I've added a new
!vcall_visibility metadata type to represent this, described in
TypeMetadata.rst. The internalization pass and libLTO have been updated to
change this metadata when linking is performed.

This doesn't currently work with ThinLTO, because it needs to see every call
to llvm.type.checked.load in the linkage unit. It might be possible to
extend this optimisation to be able to use the ThinLTO summary, as was done
for devirtualization, but until then that combination is rejected in the
clang driver.

To test this, I've written a fuzzer which generates random C++ programs with
complex class inheritance graphs, and virtual functions called through object
and function pointers of different types. The programs are spread across
multiple translation units and DSOs to test the different visibility
restrictions.

I've also tried doing bootstrap builds of LLVM to test this. This isn't
ideal, because only classes in anonymous namespaces can be optimised with
-fvisibility=default, and some parts of LLVM (plugins and bugpoint) do not
work correctly with -fvisibility=hidden. However, there are only 12 test
failures when building with -fvisibility=hidden (and an unmodified compiler),
and this change does not cause any new failures for either value of
-fvisibility.

On the 7 C++ sub-benchmarks of SPEC2006, this gives a geomean code-size
reduction of ~6%, over a baseline compiled with "-O2 -flto
-fvisibility=hidden -fwhole-program-vtables". The best cases are reductions
of ~14% in 450.soplex and 483.xalancbmk, and there are no code size
increases.

I've also run this on a set of 8 mbed-os examples compiled for Armv7M, which
show a geomean size reduction of ~3%, again with no size increases.

I had hoped that this would have no effect on performance, which would allow
it to awlays be enabled (when using -fwhole-program-vtables). However, the
changes in clang to use the llvm.type.checked.load intrinsic are causing ~1%
performance regression in the C++ parts of SPEC2006. It should be possible to
recover some of this perf loss by teaching optimisations about the
llvm.type.checked.load intrinsic, which would make it worth turning this on
by default (though it's still dependent on -fwhole-program-vtables).

Differential revision: https://reviews.llvm.org/D63932

llvm-svn: 375094
2019-10-17 09:58:57 +00:00
Jorge Gorbe Moya
6a19d78c0a Revert "Dead Virtual Function Elimination"
This reverts commit 9f6a873268e1ad9855873d9d8007086c0d01cf4f.

llvm-svn: 374844
2019-10-14 23:25:25 +00:00
Oliver Stannard
901c588c1f Dead Virtual Function Elimination
Currently, it is hard for the compiler to remove unused C++ virtual
functions, because they are all referenced from vtables, which are referenced
by constructors. This means that if the constructor is called from any live
code, then we keep every virtual function in the final link, even if there
are no call sites which can use it.

This patch allows unused virtual functions to be removed during LTO (and
regular compilation in limited circumstances) by using type metadata to match
virtual function call sites to the vtable slots they might load from. This
information can then be used in the global dead code elimination pass instead
of the references from vtables to virtual functions, to more accurately
determine which functions are reachable.

To make this transformation safe, I have changed clang's code-generation to
always load virtual function pointers using the llvm.type.checked.load
intrinsic, instead of regular load instructions. I originally tried writing
this using clang's existing code-generation, which uses the llvm.type.test
and llvm.assume intrinsics after doing a normal load. However, it is possible
for optimisations to obscure the relationship between the GEP, load and
llvm.type.test, causing GlobalDCE to fail to find virtual function call
sites.

The existing linkage and visibility types don't accurately describe the scope
in which a virtual call could be made which uses a given vtable. This is
wider than the visibility of the type itself, because a virtual function call
could be made using a more-visible base class. I've added a new
!vcall_visibility metadata type to represent this, described in
TypeMetadata.rst. The internalization pass and libLTO have been updated to
change this metadata when linking is performed.

This doesn't currently work with ThinLTO, because it needs to see every call
to llvm.type.checked.load in the linkage unit. It might be possible to
extend this optimisation to be able to use the ThinLTO summary, as was done
for devirtualization, but until then that combination is rejected in the
clang driver.

To test this, I've written a fuzzer which generates random C++ programs with
complex class inheritance graphs, and virtual functions called through object
and function pointers of different types. The programs are spread across
multiple translation units and DSOs to test the different visibility
restrictions.

I've also tried doing bootstrap builds of LLVM to test this. This isn't
ideal, because only classes in anonymous namespaces can be optimised with
-fvisibility=default, and some parts of LLVM (plugins and bugpoint) do not
work correctly with -fvisibility=hidden. However, there are only 12 test
failures when building with -fvisibility=hidden (and an unmodified compiler),
and this change does not cause any new failures for either value of
-fvisibility.

On the 7 C++ sub-benchmarks of SPEC2006, this gives a geomean code-size
reduction of ~6%, over a baseline compiled with "-O2 -flto
-fvisibility=hidden -fwhole-program-vtables". The best cases are reductions
of ~14% in 450.soplex and 483.xalancbmk, and there are no code size
increases.

I've also run this on a set of 8 mbed-os examples compiled for Armv7M, which
show a geomean size reduction of ~3%, again with no size increases.

I had hoped that this would have no effect on performance, which would allow
it to awlays be enabled (when using -fwhole-program-vtables). However, the
changes in clang to use the llvm.type.checked.load intrinsic are causing ~1%
performance regression in the C++ parts of SPEC2006. It should be possible to
recover some of this perf loss by teaching optimisations about the
llvm.type.checked.load intrinsic, which would make it worth turning this on
by default (though it's still dependent on -fwhole-program-vtables).

Differential revision: https://reviews.llvm.org/D63932

llvm-svn: 374539
2019-10-11 11:59:55 +00:00
Pirama Arumuga Nainar
96d22d6fc4 [IRMover] Don't map globals if their types are the same
Summary:
During IR Linking, if the types of two globals in destination and source
modules are the same, it can only be because the global in the
destination module is originally from the source module and got added to
the destination module from a shared metadata.

We shouldn't map this type to itself in case the type's components get
remapped to a new type from the destination (for instance, during the
loop over SrcM->getIdentifiedStructTypes() further below in
IRLinker::computeTypeMapping()).

Fixes PR40312.

Reviewers: tejohnson, pcc, srhines

Subscribers: mehdi_amini, hiraditya, steven_wu, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66814

llvm-svn: 371643
2019-09-11 18:35:49 +00:00