1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-26 12:43:36 +01:00
Commit Graph

350 Commits

Author SHA1 Message Date
Roman Lebedev
347c3e4e9c [InstCombine] Always try to invert non-canonical predicate of an icmp
Summary:
The actual transform i was going after was:
https://rise4fun.com/Alive/Tp9H
```
Name: zz
Pre: isPowerOf2(C0) && isPowerOf2(C1) && C1 == C0
%t0 = and i8 %x, C0
%r = icmp eq i8 %t0, C1
  =>
%t = icmp eq i8 %t0, 0
%r = xor i1 %t, -1

Name: zz
Pre: isPowerOf2(C0)
%t0 = and i8 %x, C0
%r = icmp ne i8 %t0, 0
  =>
%t = icmp eq i8 %t0, 0
%r = xor i1 %t, -1
```
but as it can be seen from the current tests, we already canonicalize most of it,
and we are only missing handling multi-use non-canonical icmp predicates.

If we have both `!=0` and `==0`, even though we can CSE them,
we end up being stuck with them. We should canonicalize to the `==0`.

I believe this is one of the cleanup steps i'll need after `-scalarizer`
if i end up proceeding with my WIP alloca promotion helper pass.

Reviewers: spatel, jdoerfert, nikic

Reviewed By: nikic

Subscribers: zzheng, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D83139
2020-07-04 18:12:04 +03:00
Teresa Johnson
15d602b26a [WPD] Allow virtual calls to be analyzed with multiple type tests
Summary:
In D52514 I had fixed a bug with WPD after indirect call promotion, by
checking that a type test being analyzed dominates potential virtual
calls. With that fix I included a small effiency enhancement to avoid
processing a devirt candidate multiple times (when there are multiple
type tests). This latter change wasn't in response to any measured
efficiency issues, it was merely theoretical. Unfortuantely, it turns
out to limit optimization opportunities after inlining.

Specifically, consider code that looks like:

class A {
  virtual void foo();
};
class B : public A {
  void foo();
}
void callee(A *a) {
  a->foo(); // Call 1
}
void caller(B *b) {
  b->foo(); // Call 2
  callee(b);
}

After inlining callee into caller, because of the existing call to
b->foo() in caller there will be 2 type tests in caller for the vtable
pointer of b: the original type test against B from Call 2, and the
inlined type test against A from Call 1. If the code was compiled with
-fstrict-vtable-pointers, then after optimization WPD will see that
both type tests are associated with the inlined virtual Call 1.
With my earlier change to only process a virtual call against one type
test, we may only consider virtual Call 1 against the base class A type
test, which can't be devirtualized. With my change here to remove this
restriction, it also gets considered for the type test against the
derived class B type test, where it can be devirtualized.

Note that if caller didn't include it's own earlier virtual call
b->foo() we will not be able to devirtualize after inlining callee even
after this fix, since there would not be a type test against B in the
IR. As a future enhancement we can consider inserting type tests at call
sites that pass pointers to classes with virtual calls, to enable
context-sensitive devirtualization after inlining.

Reviewers: pcc, vitalybuka, evgeny777

Subscribers: Prazek, hiraditya, steven_wu, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79235
2020-06-24 10:51:24 -07:00
Hiroshi Yamauchi
9896d761e8 [ThinLTO] Compute the basic block count across modules.
Summary:
Count the per-module number of basic blocks when the module summary is computed
and sum them up during Thin LTO indexing.

This is used to estimate the working set size under the partial sample PGO.

This is split off of D79831.

Reviewers: davidxl, espindola

Subscribers: emaste, inglorion, hiraditya, MaskRay, steven_wu, dexonsmith, arphaman, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80403
2020-05-28 10:33:05 -07:00
Zakk Chen
6bb0aa513e [LTO] Suppress emission of empty combined module by default
Summary:
That unless the user requested an output object (--lto-obj-path), the an
unused empty combined module is not emitted.

This changed is helpful for some target (ex. RISCV-V) which encoded the
ABI info in IR module flags (target-abi). Empty unused module has no ABI
info so the linker would get the linking error during merging
incompatible ABIs.

Reviewers: tejohnson, espindola, MaskRay

Subscribers: emaste, inglorion, arichardson, hiraditya, simoncook, MaskRay, steven_wu, dexonsmith, PkmX, dang, lenary, s.egerton, luismarques, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78988
2020-05-04 18:31:09 -07:00
Fangrui Song
d41ca54775 [ThinLTO] Drop dso_local if a GlobalVariable satisfies isDeclarationForLinker()
dso_local leads to direct access even if the definition is not within this compilation unit (it is
still in the same linkage unit). On ELF, such a relocation (e.g. R_X86_64_PC32) referencing a
STB_GLOBAL STV_DEFAULT object can cause a linker error in a -shared link.

If the linkage is changed to available_externally, the dso_local flag should be dropped, so that no
direct access will be generated.

The current behavior is benign, because -fpic does not assume dso_local
(clang/lib/CodeGen/CodeGenModule.cpp:shouldAssumeDSOLocal).
If we do that for -fno-semantic-interposition (D73865), there will be an
R_X86_64_PC32 linker error without this patch.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D74751
2020-04-07 15:46:01 -07:00
Fangrui Song
bb43600d6c [Object] Change ELFObjectFile<ELFT>::getFileFormatName() to use BFD names
Follow-up for D74433

What the function returns are almost standard BFD names, except that "ELF" is
in uppercase instead of lowercase.

This patch changes "ELF" to "elf" and changes ARM/AArch64 to use their BFD names.
MIPS and PPC64 have endianness differences as well, but this patch does not intend to address them.

Advantages:

* llvm-objdump: the "file format " line matches GNU objdump on ARM/AArch64 objects
* "file format " line can be extracted and fed into llvm-objcopy -O literally.
  (https://github.com/ClangBuiltLinux/linux/issues/779 has such a use case)

Affected tools: llvm-readobj, llvm-objdump, llvm-dwarfdump, MCJIT (internal implementation detail, not exposed)

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D76046
2020-03-16 07:42:04 -07:00
evgeny
dcf4590f35 [WPD] Provide a way to prevent functions from being devirtualized
Differential revision: https://reviews.llvm.org/D75617
2020-03-09 14:05:15 +03:00
Teresa Johnson
7a274552f2 Revert "Restore "[WPD/LowerTypeTests] Delay lowering/removal of type tests until after ICP""
This reverts commit 80d0a137a5aba6998fadb764f1e11cb901aae233, and the
follow on fix in 873c0d0786dcf22f4af39f65df824917f70f2170. It is
causing test failures after a multi-stage clang bootstrap. See
discussion on D73242 and D75201.
2020-03-02 14:02:13 -08:00
Teresa Johnson
c20158c036 [ThinLTO/LowerTypeTests] Handle unpromoted local type ids
Summary:
Fixes an issue that cropped up after the changes in D73242 to delay
the lowering of type tests. LTT couldn't handle any type tests with
non-string type id (which happens for local vtables, which we try to
promote during the compile step but cannot always when there are no
exported symbols).

We can simply treat the same as having an Unknown resolution, which
delays their lowering, still allowing such type tests to be used in
subsequent optimization (e.g. planned usage during ICP). The final
lowering which simply removes these handles them fine.

Beefed up an existing ThinLTO test for such unpromoted type ids so that
the internal vtable isn't removed before lower type tests, which hides
the problem.

Reviewers: evgeny777, pcc

Subscribers: inglorion, hiraditya, steven_wu, dexonsmith, aganea, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75201
2020-03-02 09:31:44 -08:00
Stefanos Baziotis
d11bc38c04 Fix [ADT][NFC] SCCIterator: Change hasLoop() to hasCycle() 2020-03-01 19:35:58 +02:00
Teresa Johnson
b852bbbbae Restore "[WPD/LowerTypeTests] Delay lowering/removal of type tests until after ICP"
This restores commit 748bb5a0f1964d20dfb3891b0948ab6c66236c70, along
with a fix for a Chromium test suite build issue (and a new test for
that case).

Differential Revision: https://reviews.llvm.org/D73242
2020-02-11 10:48:05 -08:00
Teresa Johnson
c96a29c643 [ThinLTO] Disable "Always import constants" due to compile time issues
Summary:
Disable the always importing of constants introduced in D70404 by
default under a new internal option, since it is causing order of
magnitude compile time regressions during the thin link. Will continue
investigating why the regressions occur.

Reviewers: evgeny777, wmi

Subscribers: mehdi_amini, inglorion, hiraditya, steven_wu, dexonsmith, arphaman, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73724
2020-01-30 10:12:48 -08:00
Teresa Johnson
32209014dc Restore "[LTO/WPD] Enable aggressive WPD under LTO option"
This restores 59733525d37cf9ad88b5021b33ecdbaf2e18911c (D71913), along
with bot fix 19c76989bb505c3117730c47df85fd3800ea2767.

The bot failure should be fixed by D73418, committed as
af954e441a5170a75687699d91d85e0692929d43.

I also added a fix for non-x86 bot failures by requiring x86 in new test
lld/test/ELF/lto/devirt_vcall_vis_public.ll.
2020-01-27 07:55:05 -08:00
Teresa Johnson
b8d82ed595 Revert "[LTO/WPD] Enable aggressive WPD under LTO option"
This reverts commit 59733525d37cf9ad88b5021b33ecdbaf2e18911c.

There is a windows sanitizer bot failure in one of the cfi tests
that I will need some time to figure out:
http://lab.llvm.org:8011/builders/sanitizer-windows/builds/57155/steps/stage%201%20check/logs/stdio
2020-01-23 17:29:24 -08:00
Teresa Johnson
7a368427db [LTO/WPD] Enable aggressive WPD under LTO option
Summary:
Third part in series to support Safe Whole Program Devirtualization
Enablement, see RFC here:
http://lists.llvm.org/pipermail/llvm-dev/2019-December/137543.html

This patch adds type test metadata under -fwhole-program-vtables,
even for classes without hidden visibility. It then changes WPD to skip
devirtualization for a virtual function call when any of the compatible
vtables has public vcall visibility.

Additionally, internal LLVM options as well as lld and gold-plugin
options are added which enable upgrading all public vcall visibility
to linkage unit (hidden) visibility during LTO. This enables the more
aggressive WPD to kick in based on LTO time knowledge of the visibility
guarantees.

Support was added to all flavors of LTO WPD (regular, hybrid and
index-only), and to both the new and old LTO APIs.

Unfortunately it was not simple to split the first and second parts of
this part of the change (the unconditional emission of type tests and
the upgrading of the vcall visiblity) as I needed a way to upgrade the
public visibility on legacy WPD llvm assembly tests that don't include
linkage unit vcall visibility specifiers, to avoid a lot of test churn.

I also added a mechanism to LowerTypeTests that allows dropping type
test assume sequences we now aggressively insert when we invoke
distributed ThinLTO backends with null indexes, which is used in testing
mode, and which doesn't invoke the normal ThinLTO backend pipeline.

Depends on D71907 and D71911.

Reviewers: pcc, evgeny777, steven_wu, espindola

Subscribers: emaste, Prazek, inglorion, arichardson, hiraditya, MaskRay, dexonsmith, dang, davidxl, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D71913
2020-01-23 16:09:44 -08:00
Fangrui Song
e480c416f6 Reland "[llvm-nm] Don't report "no symbols" error for files that contain symbols" 2020-01-17 10:08:42 -08:00
Sam Clegg
8e90006f5a Revert "[llvm-nm] Don't report "no symbols" error for files that contain symbols"
This reverts commit ab974161ba699534f3e30b1f4b036eec9c33053c.

This change broke several tests, and the pre-commit bot even warning
me that it would. Doh!
2020-01-17 09:57:32 -08:00
Sam Clegg
e162b6ec2d [llvm-nm] Don't report "no symbols" error for files that contain symbols
Previously we were reporting this error if we were list no symbols
which is not the same thing as the file containing no symbols.

Also, always report the filename when printing errors.

This matches the GNU nm behaviour.

This a followup to https://reviews.llvm.org/D52810

Differential Revision: https://reviews.llvm.org/D72658
2020-01-17 09:30:55 -08:00
evgeny
eb9d6d122a [ThinLTO] Always import constants
This patch imports constant variables even when they can't be internalized
(which results in promotion). This offers some extra constant folding
opportunities.

Differential revision: https://reviews.llvm.org/D70404
2020-01-15 19:29:01 +03:00
Teresa Johnson
6e7f4a36bb [ThinLTO/WPD] Remove an overly-aggressive assert
Summary:
An assert added to the index-based WPD was trying to verify that we only
have multiple vtables for a given guid when they are all non-external
linkage. This is too conservative because we may have multiple external
vtable with the same guid when they are in comdat. Remove the assert,
as we don't have comdat information in the index, the linker should
issue an error in this case.

See discussion on D71040 for more information.

Reviewers: evgeny777, aganea

Subscribers: mehdi_amini, inglorion, hiraditya, steven_wu, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72648
2020-01-14 10:57:14 -08:00
Teresa Johnson
4bd379a0d0 [ThinLTO] Handle variable with twice promoted name (Rust)
Summary:
Ensure that we can internalize values produced from two rounds of
promotion.

Note that this cannot happen currently via clang, but in other use cases
such as the Rust compiler which does a first round of ThinLTO on library
code, producing bitcode, and a second round on the final binary.

In particular this can happen if a function is exported and promoted,
ending up with a ".llvm.${hash}" suffix, and then goes through a round
of optimization creating an internal switch table expansion variable
that is internal and contains the promoted name of the enclosing
function. This variable will be promoted in the second round of ThinLTO
if @foo is imported again, and therefore ends up with two
".llvm.${hash}" suffixes. Only the final one should be stripped when
consulting the index to locate the summary.

Reviewers: wmi

Subscribers: mehdi_amini, inglorion, hiraditya, JDevlieghere, steven_wu, dexonsmith, arphaman, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72711
2020-01-14 10:54:03 -08:00
Teresa Johnson
6720ba0105 [ThinLTO/WPD] Fix index-based WPD for alias vtables
Summary:
A recent fix in D69452 fixed index based WPD in the presence of
available_externally vtables. It added a cast of the vtable def
summary to a GlobalVarSummary. However, in some cases one def may be an
alias, in which case we need to get the base object before casting,
otherwise we will crash.

Reviewers: evgeny777, steven_wu, aganea

Subscribers: mehdi_amini, inglorion, hiraditya, dexonsmith, arphaman, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71040
2020-01-13 13:38:26 -08:00
David Herzka
192cb1f651 Make lazyload_metadata.ll resilient to the addition of new metadata kinds
Summary: The specific number of records loaded depends on the number of kinds, but the difference between the lazy and not lazy cases does not.

Reviewers: modocache

Subscribers: llvm-commits, dexonsmith, steven_wu, hiraditya, mehdi_amini

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71882
2019-12-26 18:23:08 -05:00
Reid Kleckner
d2b4fa9ec1 Revert "Make lazyload_metadata.ll resilient to the addition of new metadata kinds"
This reverts commit be4704bd41a4dd8bb5c4dd5a614744c69fb3cf8e.

This test fails on Windows without awk.
2019-12-26 14:29:11 -08:00
David Herzka
3caead3cf9 Make lazyload_metadata.ll resilient to the addition of new metadata kinds
Summary: The specific number of records loaded depends on the number of kinds, but the difference between the lazy and not lazy cases does not.

Reviewers: modocache

Subscribers: llvm-commits, dexonsmith, steven_wu, hiraditya, mehdi_amini

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71882
2019-12-26 13:56:07 -05:00
David Herzka
6d5d0bb01c Revert "Make lazyload_metadata.ll resilient to the addition of new metadata kinds"
This reverts commit 2e6c15d1e7a47f11fab2dd3a40fcff01906923ae.

It causes the test to fail on Windows
2019-12-25 19:52:17 -05:00
David Herzka
65a463a420 Make lazyload_metadata.ll resilient to the addition of new metadata kinds
Summary: The specific number of records loaded depends on the number of kinds, but the difference between the lazy and not lazy cases does not.

Reviewers: modocache

Subscribers: mehdi_amini, hiraditya, steven_wu, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71730
2019-12-25 18:59:21 -05:00
evgeny
0c4768b2e9 [ThinLTO] Show preserved symbols in DOT files
Differential revision: https://reviews.llvm.org/D71608
2019-12-18 18:33:15 +03:00
Teresa Johnson
a5dac4e339 [ThinLTO] Add option to disable readonly/writeonly attribute propagation
Summary:
Add an option to allow the attribute propagation on the index to be
disabled, to allow a workaround for issues (such as that fixed by
D70977).

Also move the setting of the WithAttributePropagation flag on the index
into propagateAttributes(), and remove some old stale code that predated
this flag and cleared the maybe read/write only bits when we need to
disable the propagation (previously only when importing disabled, now
also when the new option disables it).

Reviewers: evgeny777, steven_wu

Subscribers: mehdi_amini, inglorion, hiraditya, dexonsmith, arphaman, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70984
2019-12-05 16:33:54 -08:00
Teresa Johnson
cacc24d853 [ThinLTO] Fix importing of writeonly variables in distributed ThinLTO
Summary:
D69561/dde5893 enabled importing of readonly variables with references,
however, it introduced a bug relating to importing/internalization of
writeonly variables with references.

A fix for this was added in D70006/7f92d66. But this didn't work in
distributed ThinLTO mode. The reason is that the fix (importing the
writeonly var with a zeroinitializer) was only applied when there were
references on the writeonly var summary. In distributed ThinLTO mode,
where we only have a small slice of the index, we will not have the
references on the importing side if we are not importing those
referenced values. Rather than changing this handshaking (which will
require a lot of other changes, since that's how we know what to import
in the distributed backend clang invocation), we can simply always give
the writeonly variable a zero initializer.

Reviewers: evgeny777, steven_wu

Subscribers: mehdi_amini, inglorion, hiraditya, dexonsmith, arphaman, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70977
2019-12-04 14:59:27 -08:00
Francis Visoiu Mistrih
a0d8c69089 [Remarks][ThinLTO] Use the correct file extension based on the format
Since we now have multiple formats, the ThinLTO remark files should also
respect that.
2019-12-02 13:04:43 -08:00
evgeny
fa92ba961c [ThinLTO] Make ValueInfo::operator bool() explicit
Differential revision: https://reviews.llvm.org/D70383
2019-11-19 12:46:09 +03:00
Teresa Johnson
0d09270c42 ThinLTO: Fix assembler to emit alwaysInline in the summary
Summary: The earlier commit (https://reviews.llvm.org/D70014) missed this one : If Always_Inline happens to be the only entry in FuncFlags, then the assembler will not print it in the summary.

Patch by Bharathi Seshadri <bseshadr@cisco.com>

Reviewers: tejohnson

Reviewed By: tejohnson

Subscribers: mehdi_amini, inglorion, hiraditya, steven_wu, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70323
2019-11-18 15:02:13 -08:00
Teresa Johnson
1ab13376f4 ThinLTO : Import always_inline functions irrespective of the threshold
Summary: A user can force a function to be inlined by specifying the always_inline attribute. Currently, thinlto implementation is not aware of always_inline functions and does not guarantee import of such functions, which in turn can prevent inlining of such functions.

Patch by Bharathi Seshadri <bseshadr@cisco.com>

Reviewers: tejohnson

Reviewed By: tejohnson

Subscribers: mehdi_amini, inglorion, hiraditya, steven_wu, dexonsmith, arphaman, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70014
2019-11-08 17:02:01 -08:00
evgeny
2729417b4d [ThinLTO] Fix bug when importing writeonly variables
Patch enables import of write-only variables with non-trivial initializers
to fix linker errors. Initializers of imported variables are converted to
'zeroinitializer' to avoid promotion of referenced objects.

Differential revision: https://reviews.llvm.org/D70006
2019-11-08 20:50:34 +03:00
evgeny
7372ca12be [ThinLTO] Import readonly vars with refs
Patch allows importing declarations of functions and variables, referenced
by the initializer of some other readonly variable.
Differential revision: https://reviews.llvm.org/D69561
2019-11-07 15:13:35 +03:00
Teresa Johnson
e9043f8a26 Recommit "[ThinLTO] Handle GUID collision in import global processing""
This recommits cc0b9647b76178bc3869bbfff80535ad86366472 which was
reverted in d39d1a2f87aca3cfabe58ecfa5146879baa70096.

I added a fix for an issue found when testing via distributed ThinLTO,
and added a test case for that failure.
2019-11-01 13:57:01 -07:00
Teresa Johnson
7143e76354 Revert "[LLD][ThinLTO] Handle GUID collision in import global processing"
This reverts commit cc0b9647b76178bc3869bbfff80535ad86366472.

The commit is causing a failure in internal testing. Will recommit with
a fix later.
2019-11-01 10:02:58 -07:00
Teresa Johnson
0230a0a970 [ThinLTO/WPD] Fix index-based WPD for available_externally vtables
Summary:
Clang does not add type metadata to available_externally vtables. When
choosing a summary to look at for virtual function definitions, make
sure we skip summaries for any available externally vtables as they will
not describe any virtual function functions, which are only summarized
in the presence of type metadata on the vtable def. Simply look for the
corresponding strong def's summary.

Also add handling for same-named local vtables with the same GUID
because of same-named files without enough distinguishing path.
In that case we return a conservative result with no devirtualization.

Reviewers: pcc, davidxl, evgeny777

Subscribers: mehdi_amini, inglorion, hiraditya, steven_wu, dexonsmith, arphaman, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69452
2019-10-30 17:59:08 -07:00
Teresa Johnson
a378559c07 [LLD][ThinLTO] Handle GUID collision in import global processing
Summary:
If there are a GUID collision between two globals checking the
summarylist from the import index to make assumption can be dangerous.

Do not assume that a GlobalValue that has a GlobalVarSummary
actually is a GlobalVariable as it can be another GlobalValue with
the same GUID that the summary is connected to.

Patch by Joel Klinghed (the_jk@opera.com)

Reviewers: evgeny777, tejohnson

Reviewed By: tejohnson

Subscribers: tejohnson, dblaikie, MaskRay, mehdi_amini, inglorion, hiraditya, steven_wu, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67322
2019-10-25 12:36:01 -07:00
Eugene Leviant
b54476ab94 [ThinLTO] Don't internalize during promotion
Differential revision: https://reviews.llvm.org/D69107

llvm-svn: 375493
2019-10-22 09:24:12 +00:00
Eugene Leviant
1a39c6003e One more attempt to fix PS4 buildbot after r375219
PS4 buildbot seems to be dropping variable names for some reason

llvm-svn: 375237
2019-10-18 14:11:19 +00:00
Eugene Leviant
9c01211772 Attempt to fix PS4 buildbot after r375219
llvm-svn: 375235
2019-10-18 13:52:51 +00:00
Eugene Leviant
7787d23a97 [ThinLTOCodeGenerator] Add support for index-based WPD
Differential revision: https://reviews.llvm.org/D68950

llvm-svn: 375219
2019-10-18 10:54:14 +00:00
Oliver Stannard
6aaf81e821 Reland: Dead Virtual Function Elimination
Remove dead virtual functions from vtables with
replaceNonMetadataUsesWith, so that CGProfile metadata gets cleaned up
correctly.

Original commit message:

Currently, it is hard for the compiler to remove unused C++ virtual
functions, because they are all referenced from vtables, which are referenced
by constructors. This means that if the constructor is called from any live
code, then we keep every virtual function in the final link, even if there
are no call sites which can use it.

This patch allows unused virtual functions to be removed during LTO (and
regular compilation in limited circumstances) by using type metadata to match
virtual function call sites to the vtable slots they might load from. This
information can then be used in the global dead code elimination pass instead
of the references from vtables to virtual functions, to more accurately
determine which functions are reachable.

To make this transformation safe, I have changed clang's code-generation to
always load virtual function pointers using the llvm.type.checked.load
intrinsic, instead of regular load instructions. I originally tried writing
this using clang's existing code-generation, which uses the llvm.type.test
and llvm.assume intrinsics after doing a normal load. However, it is possible
for optimisations to obscure the relationship between the GEP, load and
llvm.type.test, causing GlobalDCE to fail to find virtual function call
sites.

The existing linkage and visibility types don't accurately describe the scope
in which a virtual call could be made which uses a given vtable. This is
wider than the visibility of the type itself, because a virtual function call
could be made using a more-visible base class. I've added a new
!vcall_visibility metadata type to represent this, described in
TypeMetadata.rst. The internalization pass and libLTO have been updated to
change this metadata when linking is performed.

This doesn't currently work with ThinLTO, because it needs to see every call
to llvm.type.checked.load in the linkage unit. It might be possible to
extend this optimisation to be able to use the ThinLTO summary, as was done
for devirtualization, but until then that combination is rejected in the
clang driver.

To test this, I've written a fuzzer which generates random C++ programs with
complex class inheritance graphs, and virtual functions called through object
and function pointers of different types. The programs are spread across
multiple translation units and DSOs to test the different visibility
restrictions.

I've also tried doing bootstrap builds of LLVM to test this. This isn't
ideal, because only classes in anonymous namespaces can be optimised with
-fvisibility=default, and some parts of LLVM (plugins and bugpoint) do not
work correctly with -fvisibility=hidden. However, there are only 12 test
failures when building with -fvisibility=hidden (and an unmodified compiler),
and this change does not cause any new failures for either value of
-fvisibility.

On the 7 C++ sub-benchmarks of SPEC2006, this gives a geomean code-size
reduction of ~6%, over a baseline compiled with "-O2 -flto
-fvisibility=hidden -fwhole-program-vtables". The best cases are reductions
of ~14% in 450.soplex and 483.xalancbmk, and there are no code size
increases.

I've also run this on a set of 8 mbed-os examples compiled for Armv7M, which
show a geomean size reduction of ~3%, again with no size increases.

I had hoped that this would have no effect on performance, which would allow
it to awlays be enabled (when using -fwhole-program-vtables). However, the
changes in clang to use the llvm.type.checked.load intrinsic are causing ~1%
performance regression in the C++ parts of SPEC2006. It should be possible to
recover some of this perf loss by teaching optimisations about the
llvm.type.checked.load intrinsic, which would make it worth turning this on
by default (though it's still dependent on -fwhole-program-vtables).

Differential revision: https://reviews.llvm.org/D63932

llvm-svn: 375094
2019-10-17 09:58:57 +00:00
Eugene Leviant
37fc87859f [ThinLTO] Import virtual method with single implementation in hybrid mode
Differential revision: https://reviews.llvm.org/D68782

llvm-svn: 375083
2019-10-17 07:46:18 +00:00
Jorge Gorbe Moya
6a19d78c0a Revert "Dead Virtual Function Elimination"
This reverts commit 9f6a873268e1ad9855873d9d8007086c0d01cf4f.

llvm-svn: 374844
2019-10-14 23:25:25 +00:00
Oliver Stannard
901c588c1f Dead Virtual Function Elimination
Currently, it is hard for the compiler to remove unused C++ virtual
functions, because they are all referenced from vtables, which are referenced
by constructors. This means that if the constructor is called from any live
code, then we keep every virtual function in the final link, even if there
are no call sites which can use it.

This patch allows unused virtual functions to be removed during LTO (and
regular compilation in limited circumstances) by using type metadata to match
virtual function call sites to the vtable slots they might load from. This
information can then be used in the global dead code elimination pass instead
of the references from vtables to virtual functions, to more accurately
determine which functions are reachable.

To make this transformation safe, I have changed clang's code-generation to
always load virtual function pointers using the llvm.type.checked.load
intrinsic, instead of regular load instructions. I originally tried writing
this using clang's existing code-generation, which uses the llvm.type.test
and llvm.assume intrinsics after doing a normal load. However, it is possible
for optimisations to obscure the relationship between the GEP, load and
llvm.type.test, causing GlobalDCE to fail to find virtual function call
sites.

The existing linkage and visibility types don't accurately describe the scope
in which a virtual call could be made which uses a given vtable. This is
wider than the visibility of the type itself, because a virtual function call
could be made using a more-visible base class. I've added a new
!vcall_visibility metadata type to represent this, described in
TypeMetadata.rst. The internalization pass and libLTO have been updated to
change this metadata when linking is performed.

This doesn't currently work with ThinLTO, because it needs to see every call
to llvm.type.checked.load in the linkage unit. It might be possible to
extend this optimisation to be able to use the ThinLTO summary, as was done
for devirtualization, but until then that combination is rejected in the
clang driver.

To test this, I've written a fuzzer which generates random C++ programs with
complex class inheritance graphs, and virtual functions called through object
and function pointers of different types. The programs are spread across
multiple translation units and DSOs to test the different visibility
restrictions.

I've also tried doing bootstrap builds of LLVM to test this. This isn't
ideal, because only classes in anonymous namespaces can be optimised with
-fvisibility=default, and some parts of LLVM (plugins and bugpoint) do not
work correctly with -fvisibility=hidden. However, there are only 12 test
failures when building with -fvisibility=hidden (and an unmodified compiler),
and this change does not cause any new failures for either value of
-fvisibility.

On the 7 C++ sub-benchmarks of SPEC2006, this gives a geomean code-size
reduction of ~6%, over a baseline compiled with "-O2 -flto
-fvisibility=hidden -fwhole-program-vtables". The best cases are reductions
of ~14% in 450.soplex and 483.xalancbmk, and there are no code size
increases.

I've also run this on a set of 8 mbed-os examples compiled for Armv7M, which
show a geomean size reduction of ~3%, again with no size increases.

I had hoped that this would have no effect on performance, which would allow
it to awlays be enabled (when using -fwhole-program-vtables). However, the
changes in clang to use the llvm.type.checked.load intrinsic are causing ~1%
performance regression in the C++ parts of SPEC2006. It should be possible to
recover some of this perf loss by teaching optimisations about the
llvm.type.checked.load intrinsic, which would make it worth turning this on
by default (though it's still dependent on -fwhole-program-vtables).

Differential revision: https://reviews.llvm.org/D63932

llvm-svn: 374539
2019-10-11 11:59:55 +00:00
Teresa Johnson
d2f6522f79 [ThinLTO/WPD] Ensure devirtualized targets use promoted symbol when necessary
Summary:
This fixes a hole in the handling of devirtualized targets that were
local but need promoting due to devirtualization in another module. We
were not correctly referencing the promoted symbol in some cases. Make
sure the code that updates the name also looks at the ExportedGUIDs set
by utilizing a callback that checks all conditions (the callback
utilized by the internalization/promotion code).

Reviewers: pcc, davidxl, hiraditya

Subscribers: mehdi_amini, Prazek, inglorion, steven_wu, dexonsmith, dang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68159

llvm-svn: 373485
2019-10-02 16:36:59 +00:00
Petr Hosek
24e3a769e6 Reland "clang-misexpect: Profile Guided Validation of Performance Annotations in LLVM"
This patch contains the basic functionality for reporting potentially
incorrect usage of __builtin_expect() by comparing the developer's
annotation against a collected PGO profile. A more detailed proposal and
discussion appears on the CFE-dev mailing list
(http://lists.llvm.org/pipermail/cfe-dev/2019-July/062971.html) and a
prototype of the initial frontend changes appear here in D65300

We revised the work in D65300 by moving the misexpect check into the
LLVM backend, and adding support for IR and sampling based profiles, in
addition to frontend instrumentation.

We add new misexpect metadata tags to those instructions directly
influenced by the llvm.expect intrinsic (branch, switch, and select)
when lowering the intrinsics. The misexpect metadata contains
information about the expected target of the intrinsic so that we can
check against the correct PGO counter when emitting diagnostics, and the
compiler's values for the LikelyBranchWeight and UnlikelyBranchWeight.
We use these branch weight values to determine when to emit the
diagnostic to the user.

A future patch should address the comment at the top of
LowerExpectIntrisic.cpp to hoist the LikelyBranchWeight and
UnlikelyBranchWeight values into a shared space that can be accessed
outside of the LowerExpectIntrinsic pass. Once that is done, the
misexpect metadata can be updated to be smaller.

In the long term, it is possible to reconstruct portions of the
misexpect metadata from the existing profile data. However, we have
avoided this to keep the code simple, and because some kind of metadata
tag will be required to identify which branch/switch/select instructions
are influenced by the use of llvm.expect

Patch By: paulkirth
Differential Revision: https://reviews.llvm.org/D66324

llvm-svn: 371635
2019-09-11 16:19:50 +00:00