mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 02:52:53 +02:00

190697 Commits

Author SHA1 Message Date
Fangrui Song
45420d05a1 Add function attribute "patchable-function-prefix" to support -fpatchable-function-entry=N,M where M>0
Similar to the function attribute `prefix` (prefix data),
"patchable-function-prefix" inserts data (M NOPs) before the function
entry label.

-fpatchable-function-entry=2,1 (1 NOP before entry, 1 NOP after entry)
will look like:

```
  .type	foo,@function
.Ltmp0:               # @foo
  nop
foo:
.Lfunc_begin0:
  # optional `bti c` (AArch64 Branch Target Identification) or
  # `endbr64` (Intel Indirect Branch Tracking)
  nop

  .section  __patchable_function_entries,"awo",@progbits,foo,unique,0
  .p2align  3
  .quad .Ltmp0
```

-fpatchable-function-entry=N,0 + -mbranch-protection=bti/-fcf-protection=branch has two reasonable
placements (https://gcc.gnu.org/ml/gcc-patches/2020-01/msg01185.html):

```
(a)         (b)

func:       func:
.Ltmp0:     bti c
  bti c     .Ltmp0:
  nop       nop
```

(a) needs no additional code. If the consensus is to go for (b), we will
need more code in AArch64BranchTargets.cpp / X86IndirectBranchTracking.cpp .

Differential Revision: https://reviews.llvm.org/D73070
2020-01-23 17:02:27 -08:00
Changpeng Fang
d65c44efb7 AMDGPU: Implement FDIV optimizations in AMDGPUCodeGenPrepare
Summary:
    RCP has limited accuracy. If the !fpmath metadata on an FDIV requires high
    accuracy, rcp may not meet the requirement. However, fpmath information is
    lost in DAG lowering, so we may generate either inaccurate rcp-based
    computation or slow code for fdiv.

    This patch implements the fdiv optimizations in AMDGPUCodeGenPrepare, where
    the !fpmath information is still available.

     FastUnsafeRcpLegal: We determine whether it is legal to use rcp based on
                         unsafe-fp-math, fast math flags, denormals and fpmath
                         accuracy request.

     RCP Optimizations:
       1/x -> rcp(x) when fast unsafe rcp is legal or fpmath >= 2.5ULP with
                                                      denormals flushed.
       a/b -> a*rcp(b) when fast unsafe rcp is legal.

     Use fdiv.fast:
       a/b -> fdiv.fast(a, b) when RCP optimization is not performed and
                              fpmath >= 2.5ULP with denormals flushed.

       1/x -> fdiv.fast(1,x)  when RCP optimization is not performed and
                              fpmath >= 2.5ULP with denormals.

    Reviewers:
      arsenm

    Differential Revision:
      https://reviews.llvm.org/D71293
2020-01-23 16:57:43 -08:00
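The rules listed in this message can be read as a small decision table. The following is only a sketch of that reading, not code from AMDGPUCodeGenPrepare: the names `FDivQuery`, `FDivLowering` and `chooseFDivLowering` are hypothetical, `FastUnsafeRcpLegal` is taken as an already computed input, and "with denormals" in the last rule is assumed to mean that denormals are not flushed.

```cpp
#include <cassert>

// Hypothetical model of the choices described in the commit message.
// None of these names come from the LLVM sources.
enum class FDivLowering { Rcp, MulRcp, FDivFast, Keep };

struct FDivQuery {
  bool NumeratorIsOne;     // true for 1/x, false for a general a/b
  bool FastUnsafeRcpLegal; // assumed to be computed elsewhere
  float FPMathULP;         // accuracy allowed by !fpmath; 0 if none was given
  bool DenormalsFlushed;   // true if denormals are flushed to zero
};

FDivLowering chooseFDivLowering(const FDivQuery &Q) {
  assert(Q.FPMathULP >= 0.0f && "allowed error must be non-negative");
  const bool Relaxed = Q.FPMathULP >= 2.5f;

  // RCP optimizations.
  if (Q.NumeratorIsOne &&
      (Q.FastUnsafeRcpLegal || (Relaxed && Q.DenormalsFlushed)))
    return FDivLowering::Rcp;    // 1/x -> rcp(x)
  if (!Q.NumeratorIsOne && Q.FastUnsafeRcpLegal)
    return FDivLowering::MulRcp; // a/b -> a*rcp(b)

  // fdiv.fast, only reached when no RCP optimization was performed.
  if (Relaxed) {
    if (!Q.NumeratorIsOne && Q.DenormalsFlushed)
      return FDivLowering::FDivFast; // a/b -> fdiv.fast(a, b)
    if (Q.NumeratorIsOne && !Q.DenormalsFlushed)
      return FDivLowering::FDivFast; // 1/x -> fdiv.fast(1, x)
  }
  return FDivLowering::Keep; // leave the fdiv untouched
}
```

Under this reading, a general `a/b` with a 2.5 ULP allowance but denormals enabled is left as a plain fdiv.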
Johannes Doerfert
8a70392072 [Attributor] Avoid REQUIRED dependences in favor of OPTIONAL ones
When we use information only to short-cut deduction or improve it, we
can use OPTIONAL dependences instead of REQUIRED ones to avoid cascading
pessimistic fixpoints.

We also need to track dependences only when we use assumed information,
e.g., we act on assumed liveness information.
2020-01-23 18:42:46 -06:00
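To illustrate why OPTIONAL dependences avoid cascading pessimistic fixpoints, here is a toy model. It is not the Attributor's data structure: `Node`, `Dep`, `DepKind` and `invalidate` are hypothetical names, and the model assumes that a REQUIRED dependent is forced pessimistic together with its source, while an OPTIONAL dependent is merely scheduled for another update.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class DepKind { Required, Optional };

// An edge from a source node to a node that queried it.
struct Dep {
  std::size_t Dependent;
  DepKind Kind;
};

struct Node {
  bool Pessimistic = false; // reached a pessimistic fixpoint
  bool NeedsUpdate = false; // scheduled for re-evaluation
  std::vector<Dep> Dependents;
};

// Marks Start as pessimistic and propagates along the recorded dependences.
void invalidate(std::vector<Node> &G, std::size_t Start) {
  assert(Start < G.size() && "node index out of range");
  std::vector<std::size_t> Work{Start};
  while (!Work.empty()) {
    std::size_t I = Work.back();
    Work.pop_back();
    if (G[I].Pessimistic)
      continue; // already handled
    G[I].Pessimistic = true;
    for (const Dep &D : G[I].Dependents) {
      if (D.Kind == DepKind::Required)
        Work.push_back(D.Dependent); // cascades to the dependent
      else
        G[D.Dependent].NeedsUpdate = true; // dependent may still succeed
    }
  }
}
```

In this model, a node that only used the source to short-cut its own deduction keeps its optimistic state and gets a chance to recompute.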
Johannes Doerfert
d776e30a85 [Attributor] Record dependences only when necessary
If we use assumed information from AAValueSimplify we need to record
an OPTIONAL dependence, otherwise we do not.
2020-01-23 18:42:45 -06:00
Johannes Doerfert
3af2a054ca [Attributor][FIX] Avoid dangling pointers during code deletion
It can happen that we have instructions in the ToBeDeletedInsts set
which are deleted earlier already. To avoid dangling pointers we use
weak tracking handles.
2020-01-23 18:42:45 -06:00
Johannes Doerfert
bc11333dff [Attributor][FIX] Handle non-pointers when following uses
When we follow uses, e.g., in AAMemoryBehavior or AANoCapture, we need
to make sure the value is a pointer before we ask for abstract
attributes only valid for pointers. This happens because we follow
pointers through calls that do not capture but may return the value.
2020-01-23 18:42:45 -06:00
Johannes Doerfert
1004835efd [Attributor][NFC] Do not (try to) simplify void values
We might accidentally ask AAValueSimplify to simplify a void value. That
can lead to very interesting, and very wrong, results. We now handle
this case gracefully.
2020-01-23 18:42:45 -06:00
Alina Sbirlea
c4e8c14e79 [LoopStrengthReduce] Reuse utility method to clean dead instructions. [NFCI]
Create a utility wrapper for the RecursivelyDeleteTriviallyDeadInstructions utility
method, which sets to nullptr the instructions that are not trivially
dead. Use the new method in LoopStrengthReduce.
Alternative: add a bool to the same method; this option adds a marginal
amount of overhead to the other callers, and the method needs to be
updated to return a bool status when it removes/doesn't remove
instructions.
2020-01-23 16:27:32 -08:00
Johannes Doerfert
d57534ea24 [Attributor][FIX][Alignment] Do not report a change if there was none
If alignment was manifested but it is actually only as good as the
data-layout provided one we should not report it as a change.

For testing purposes we still manifest the information.
2020-01-23 18:13:52 -06:00
Johannes Doerfert
a6c480c3be [Attributor][NFC] Add an assertion 2020-01-23 18:13:52 -06:00
Johannes Doerfert
87f66bcbeb [Attributor][NFC] Fix spelling 2020-01-23 18:13:52 -06:00
Johannes Doerfert
ecccd25e1e [Attributor] byval arguments are always noalias
`byval` introduces a local copy of the argument. That copy cannot alias
anything.
2020-01-23 18:13:52 -06:00
Johannes Doerfert
738356d128 [Attributor][FIX] Store alignment only holds for the pointer value
We accidentally used the store alignment for the value operand as well,
which is incorrect and crashed the SPASS application in the test suite.
2020-01-23 18:13:52 -06:00
Teresa Johnson
7a368427db [LTO/WPD] Enable aggressive WPD under LTO option
Summary:
Third part in series to support Safe Whole Program Devirtualization
Enablement, see RFC here:
http://lists.llvm.org/pipermail/llvm-dev/2019-December/137543.html

This patch adds type test metadata under -fwhole-program-vtables,
even for classes without hidden visibility. It then changes WPD to skip
devirtualization for a virtual function call when any of the compatible
vtables has public vcall visibility.

Additionally, internal LLVM options as well as lld and gold-plugin
options are added which enable upgrading all public vcall visibility
to linkage unit (hidden) visibility during LTO. This enables the more
aggressive WPD to kick in based on LTO time knowledge of the visibility
guarantees.

Support was added to all flavors of LTO WPD (regular, hybrid and
index-only), and to both the new and old LTO APIs.

Unfortunately it was not simple to split the first and second parts of
this part of the change (the unconditional emission of type tests and
the upgrading of the vcall visibility) as I needed a way to upgrade the
public visibility on legacy WPD llvm assembly tests that don't include
linkage unit vcall visibility specifiers, to avoid a lot of test churn.

I also added a mechanism to LowerTypeTests that allows dropping type
test assume sequences we now aggressively insert when we invoke
distributed ThinLTO backends with null indexes, which is used in testing
mode, and which doesn't invoke the normal ThinLTO backend pipeline.

Depends on D71907 and D71911.

Reviewers: pcc, evgeny777, steven_wu, espindola

Subscribers: emaste, Prazek, inglorion, arichardson, hiraditya, MaskRay, dexonsmith, dang, davidxl, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D71913
2020-01-23 16:09:44 -08:00
Alina Sbirlea
1565d298b3 [Utils] Use WeakTrackingVH in vector used as scratch storage.
The utility method RecursivelyDeleteTriviallyDeadInstructions receives
as input a vector of Instructions, where all inputs are valid
instructions. This same vector is used as a scratch storage (per the
header comment) to recursively delete instructions. If an instruction is
added as an operand of multiple other instructions, it may be added twice,
then deleted once, then the second reference in the vector is invalid.
Switch to using a Vector<WeakTrackingVH>.
This change facilitates a clean-up in LoopStrengthReduction.
2020-01-23 16:04:57 -08:00
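The duplicate-entry problem described here can be reproduced outside LLVM. The sketch below uses `std::weak_ptr` as a stand-in for a weak tracking handle; it is an analogy under that assumption, not the `WeakTrackingVH` API, and `Inst`, `Block` and `deleteAll` are hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct Inst {
  int Id;
};

// Owner of all instructions; erasing an entry models deleting the instruction.
using Block = std::vector<std::shared_ptr<Inst>>;

// Deletes every instruction referenced by the scratch worklist and returns
// how many instructions were actually deleted.
int deleteAll(Block &B, std::vector<std::weak_ptr<Inst>> &Worklist) {
  int Deleted = 0;
  while (!Worklist.empty()) {
    std::weak_ptr<Inst> W = Worklist.back();
    Worklist.pop_back();
    std::shared_ptr<Inst> I = W.lock();
    if (!I)
      continue; // already deleted through an earlier duplicate entry
    for (auto It = B.begin(); It != B.end(); ++It) {
      if (*It == I) {
        B.erase(It); // the block gives up ownership
        break;
      }
    }
    ++Deleted;
    // I goes out of scope here, so any remaining duplicates expire.
  }
  return Deleted;
}
```

With raw pointers in the worklist, the second copy of a doubly added instruction would dangle after the first deletion; with weak handles it simply reads as null and is skipped.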
Amara Emerson
cb539bcc0e [AArch64][GlobalISel] Remove duplicate attribute lookup code that was supposed to be cached. NFC.
When I cached this a long time ago it seems I forgot to remove the locally
declared variable of the same name in select(), so the caching wasn't having
any compile time benefit. Doh.
2020-01-23 15:50:08 -08:00
Hubert Tong
7b776a1886 [tests] Use host-based XFAIL for test/MC/AMDGPU/hsa-gfx10-v3.s
Summary:
This patch applies D60551 to an additional file. In particular, the test
is currently marked XFAIL for a number of big-endian targets; however,
the failure is actually dependent on the host endianness instead. The
test actually specifies a specific target triple.

Reviewers: rampitec, xingxue, daltenty

Reviewed By: rampitec

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, fedor.sergeev, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73192
2020-01-23 17:55:32 -05:00
Amara Emerson
d942a7b79b [AArch64][GlobalISel] Fallback if the +strict-align target feature is given.
Works around PR44246.
2020-01-23 14:45:29 -08:00
Florian Hahn
ebce5bf21c [IPSCCP] Use ParamState for arguments at call sites.
We currently use integer ranges to merge concrete function arguments.
We use the ParamState range for those, but we only look up concrete
values in the regular state. For concrete function arguments that are
themselves arguments of the containing function, we can use the param
state directly and improve the precision in some cases.

Besides improving the results in some cases, this is also a small step towards
switching to ValueLatticeElement, by allowing D60582 to be a NFC.

Reviewers: efriedma, davide

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D71836
2020-01-23 13:55:42 -08:00
Matt Arsenault
c5e9e39558 GlobalISel: Add MIPatternMatch for G_ICMP/G_FCMP 2020-01-23 13:30:47 -08:00
Matt Arsenault
95d5499a92 AMDGPU/GlobalISel: Fix RegBankSelect for llvm.amdgcn.exp.compr
This wasn't updated for the immarg handling change. We really need a
verifier for this.
2020-01-23 13:30:46 -08:00
Teresa Johnson
dae138eca8 [ThinLTO] Summarize vcall_visibility metadata
Summary:
Second patch in series to support Safe Whole Program Devirtualization
Enablement, see RFC here:
http://lists.llvm.org/pipermail/llvm-dev/2019-December/137543.html

Summarize vcall_visibility metadata in ThinLTO global variable summary.

Depends on D71907.

Reviewers: pcc, evgeny777, steven_wu

Subscribers: mehdi_amini, Prazek, inglorion, hiraditya, dexonsmith, arphaman, ostannard, llvm-commits, cfe-commits, davidxl

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D71911
2020-01-23 13:19:56 -08:00
Reid Kleckner
af435f4dd3 [PDB] Simplify API for making section map, NFC
Prevents API misuse described in PR44495
2020-01-23 12:15:21 -08:00
Matt Arsenault
37207e405d AMDGPU: Fix ubsan error
Since register classes go up to 1024 bits (32 elements), all mask bits are
needed and a 32-bit shift by 32 is illegal. We didn't have any
instructions theoretically using a 32 element VGPR before
d1dbb5e4718a8f845abf0783513a33a55429470b
2020-01-23 15:05:47 -05:00
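For readers unfamiliar with this class of ubsan report: shifting a 32-bit value by 32 is undefined in C++, so a mask covering all 32 elements cannot be built as `(1u << 32) - 1`. The helper below is a minimal sketch of one common workaround (widening before the shift); `lowBitsMask` is a hypothetical name and this is not necessarily how the commit fixed it.

```cpp
#include <cassert>
#include <cstdint>

// Returns a mask with the low NumElts bits set, for 0 <= NumElts <= 32.
uint32_t lowBitsMask(unsigned NumElts) {
  assert(NumElts <= 32 && "at most 32 elements fit in a 32-bit mask");
  // Shift in 64 bits so that NumElts == 32 stays well defined, then truncate.
  return static_cast<uint32_t>((uint64_t{1} << NumElts) - 1);
}
```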
Roman Lebedev
f72d7af9b1 [IR] Attribute/AttrBuilder: use Value::MaximumAlignment magic constant
Summary:
I initially encountered those assertions when trying to create
this IR `alignment` attribute from clang's `__attribute__((assume_aligned(imm)))`,
because until D72994 there is no sanity checking for the value of `imm`.

But even then, we have `llvm::Value::MaximumAlignment` constant (which is `536870912`),
which is enforced for clang attributes, and then there is another magic constant
(`0x40000000` i.e. `1073741824` i.e. `2 * 536870912`) in
`Attribute::getWithAlignment()`/`AttrBuilder::addAlignmentAttr()`.

I strongly suspect that `0x40000000` is incorrect,
and that also should be `llvm::Value::MaximumAlignment`.

Reviewers: erichkeane, hfinkel, jdoerfert, gchatelet, courbet

Reviewed By: erichkeane

Subscribers: hiraditya, cfe-commits, llvm-commits

Tags: #llvm, #clang

Differential Revision: https://reviews.llvm.org/D72998
2020-01-23 22:50:49 +03:00
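The arithmetic behind the two constants quoted in this message can be checked directly. This is a standalone sketch: the constants are copied from the message text, and `isValidAlignment` is a hypothetical helper rather than an LLVM function.

```cpp
#include <cassert>
#include <cstdint>

// Values quoted in the commit message.
constexpr uint64_t MaximumAlignment = 536870912; // llvm::Value::MaximumAlignment
constexpr uint64_t OldAttrLimit = 0x40000000;    // previous attribute limit

static_assert(MaximumAlignment == (uint64_t{1} << 29),
              "536870912 is 2^29");
static_assert(OldAttrLimit == 1073741824, "0x40000000 is 1073741824");
static_assert(OldAttrLimit == 2 * MaximumAlignment,
              "the old limit is twice the maximum alignment");

// Hypothetical check: a non-zero power of two that does not exceed Limit.
constexpr bool isValidAlignment(uint64_t A,
                                uint64_t Limit = MaximumAlignment) {
  return A != 0 && (A & (A - 1)) == 0 && A <= Limit;
}
```

Under the old limit an alignment of `0x40000000` would be accepted even though it exceeds the maximum stated for values, which is the inconsistency the message points out.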
Teresa Johnson
8920d6a40a [WPD/VFE] Always emit vcall_visibility metadata for -fwhole-program-vtables
Summary:
First patch to support Safe Whole Program Devirtualization Enablement,
see RFC here: http://lists.llvm.org/pipermail/llvm-dev/2019-December/137543.html

Always emit !vcall_visibility metadata under -fwhole-program-vtables,
and not just for -fvirtual-function-elimination. The vcall visibility
metadata will (in a subsequent patch) be used to communicate to WPD
which vtables are safe to devirtualize, and we will optionally convert
the metadata to hidden visibility at link time. Subsequent follow on
patches will help enable this by adding vcall_visibility metadata to the
ThinLTO summaries, and always emit type test intrinsics under
-fwhole-program-vtables (and not just for vtables with hidden
visibility).

In order to do this safely with VFE, since for VFE all vtable loads must
be type checked loads which will no longer be the case, this patch adds
a new "Virtual Function Elim" module flag to communicate to GlobalDCE
whether to perform VFE using the vcall_visibility metadata.

One additional advantage of using the vcall_visibility metadata to drive
more WPD at LTO link time is that we can use the same mechanism to
enable more aggressive VFE at LTO link time as well. The link time
option proposed in the RFC will convert vcall_visibility metadata to
hidden (aka linkage unit visibility), which combined with
-fvirtual-function-elimination will allow it to be done more
aggressively at LTO link time under the same conditions.

Reviewers: pcc, ostannard, evgeny777, steven_wu

Subscribers: mehdi_amini, Prazek, hiraditya, dexonsmith, davidxl, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D71907
2020-01-23 11:36:01 -08:00
Alina Sbirlea
8cfea297a5 [LoopIdiomRecognize] Teach LoopIdiomRecognize to preserve MemorySSA. 2020-01-23 11:31:12 -08:00
Alina Sbirlea
22d4f4cbf9 [IndVarSimplify] Fix for MemorySSA preserve. 2020-01-23 11:06:16 -08:00
Fangrui Song
0b827d4aa9 [AArch64][test] Fix MC/AArch64 tests after D72799 2020-01-23 10:47:50 -08:00
Fangrui Song
6e3b58e0b9 [AArch64][test] Fix tests after D72799 2020-01-23 10:45:15 -08:00
Justin Bogner
626b423640 [LoopUnroll] Avoid UB when converting from WeakVH to Value *
Calling `operator*` on a WeakVH with a null value yields a null
reference, which is UB. Avoid this by implicitly converting the WeakVH
to a `Value *` rather than dereferencing and then taking the address
for the type conversion.

Differential Revision: https://reviews.llvm.org/D73280
2020-01-23 10:36:39 -08:00
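The difference between the two conversions can be shown with a small stand-in class. `Handle` below is a hypothetical type that only mimics the two operators mentioned in the message; it is not LLVM's `WeakVH`.

```cpp
#include <cassert>

struct Value {
  int Id;
};

// Hypothetical stand-in for a weak value handle.
class Handle {
  Value *V = nullptr;

public:
  Handle() = default;
  explicit Handle(Value *P) : V(P) {}
  operator Value *() const { return V; }  // fine even when V is null
  Value &operator*() const { return *V; } // undefined behaviour when V is null
};

// Uses the implicit conversion, so no reference to a null object is formed.
Value *toPointer(const Handle &H) {
  Value *P = H; // rather than &*H
  return P;
}
```

Writing `&*H` instead would bind a reference through a null pointer before taking its address, which is the pattern the commit removes.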
Danilo Carvalho Grael
698ac54e59 [SVE] Add SVE2 patterns for unpredicated multiply instructions
Summary:
Add patterns for SVE2 unpredicated multiply instructions:
- mul, smulh, umulh, pmul, sqdmulh, sqrdmulh

Reviewers: sdesmalen, huntergr, efriedma, c-rhodes, kmclaughlin, rengolin

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits, amehsan

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72799
2020-01-23 13:20:53 -05:00
Simon Pilgrim
7a10fa3e59 [X86] LowerRotate - early out for vector rotates by zero 2020-01-23 17:48:09 +00:00
Simon Pilgrim
4c18f5d376 [X86] Add test showing failure to remove vector rotate by zero 2020-01-23 17:48:08 +00:00
Simon Pilgrim
3a410151bf [X86] Add AVX512 tests for vector rotations 2020-01-23 17:48:08 +00:00
Simon Pilgrim
9d10b6a6c7 [SelectionDAG] ComputeNumSignBits - add ISD::ADD demanded elts support 2020-01-23 17:48:07 +00:00
Sam Parker
acdd8b62e2 [RDA] Skip debug values
Skip debug instructions when iterating through a block to find uses.

Differential Revision: https://reviews.llvm.org/D73273
2020-01-23 17:04:54 +00:00
Matt Arsenault
3170227425 AMDGPU/GlobalISel: Select V_ADD3_U32/V_XOR3_B32
The other 3-op patterns should also be theoretically handled, but
currently there's a bug in the inferred pattern complexity.

I'm not sure what the error handling strategy should be for potential
constant bus violations. I think the correct strategy is to never
produce mixed SGPR and VGPR operands in a typical VOP instruction,
which will trivially avoid them. However, it's possible to still have
hand written MIR (or erroneously transformed code) with these
operands. When these fold, the restriction will be violated. We
currently don't have any verifiers for reg bank legality. For now,
just ignore the restriction.

It might be worth triggering a DAG fallback on verifier error.
2020-01-23 12:04:20 -05:00
Matt Arsenault
9a7dbde109 GlobalISel: Use Register 2020-01-23 12:04:20 -05:00
Simon Pilgrim
f1b4b4c1bb [SelectionDAG] ComputeNumSignBits - add ISD::ADD vector support
Add missing handling for (ADD (AND X, 1), -1) uniform vectors
2020-01-23 16:42:12 +00:00
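The reason this pattern matters for sign-bit counting is that `(X & 1) + -1` can only be 0 or -1, so every bit of the result equals the sign bit. The scalar sketch below checks that on 32-bit integers; `numSignBits` and `addAndOneMinusOne` are hypothetical helpers, not the SelectionDAG implementation.

```cpp
#include <cassert>
#include <cstdint>

// Counts the leading bits of V that equal its sign bit, including the sign
// bit itself.
unsigned numSignBits(int32_t V) {
  uint32_t U = static_cast<uint32_t>(V);
  uint32_t Sign = U >> 31;
  unsigned N = 0;
  for (int Bit = 31; Bit >= 0 && ((U >> Bit) & 1u) == Sign; --Bit)
    ++N;
  assert(N >= 1 && "the sign bit always matches itself");
  return N;
}

// Scalar form of (ADD (AND X, 1), -1).
int32_t addAndOneMinusOne(int32_t X) { return (X & 1) + -1; }
```

For an even `X` the result is -1 and for an odd `X` it is 0; both have 32 sign bits.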
Simon Pilgrim
94ace0b9c4 [X86][SSE] Add ComputeNumSignBits tests for (ADD (AND X, 1), -1) vectors 2020-01-23 16:42:11 +00:00
Guillaume Chatelet
2efa9bb646 [Alignment][NFC] Use Align with CreateAlignedStore
Summary:
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet, bollu

Subscribers: arsenm, jvesely, nhaehnle, hiraditya, kerbowa, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D73274
2020-01-23 17:34:32 +01:00
Matt Arsenault
7f79f10d3e AMDGPU: Check for other uses when looking through casted select
Fixes mesa regression on ext_transform_feedback-max-varyings
2020-01-23 11:31:24 -05:00
Sam Parker
ee505c7a11 [NFC][ARM] Add test 2020-01-23 16:21:52 +00:00
Simon Pilgrim
10568c1873 [SelectionDAG] ComputeNumSignBits - add ISD::SUB demanded elts support 2020-01-23 16:20:48 +00:00
Simon Pilgrim
aa8536e4c8 [X86][AVX] Add AVX1/AVX2 ashr vector tests 2020-01-23 16:20:48 +00:00
Michael Liao
2c5a63a728 Fix GCC warning/error '-fpermissive'. NFC. 2020-01-23 10:45:02 -05:00
Krzysztof Parzyszek
0d83f9aecc [Hexagon] Remove unused operand definitions: s10_0Imm and s10_6Imm 2020-01-23 09:38:54 -06:00
Sergej Jaskiewicz
22b60a6c62 Revert "[tablegen] Emit string literals instead of char arrays"
This reverts commit ce23515f5ab01161c98449d833b3ae013b553aa8.

That commit broke some builds on Windows:
http://lab.llvm.org:8011/builders/clang-x64-windows-msvc/builds/13870
2020-01-23 18:22:22 +03:00
Alexey Lapshin
0d5d418268 [Dsymutil][Debuginfo][NFC] #4 Refactor dsymutil to separate DWARF optimizing part.
Summary:
The primary goal of this refactoring is to separate the DWARF optimizing part
so that it can be reused by the linker or by any other client.
There was a thread on llvm-dev discussing the necessity of such a refactoring:

http://lists.llvm.org/pipermail/llvm-dev/2019-September/135068.html.

This is the final part of a series of patches for dsymutil.
Previous patches: D71068, D71839, D72476. This patch:

1. Creates the lib/DWARFLinker interface:

   void addObjectFile(DwarfLinkerObjFile &ObjFile);
   bool link();
   void setOptions;

2. Moves all linking logic from tools/dsymutil/DwarfLinkerForBinary
   into lib/DWARFLinker.
3. Renames RelocationManager into AddressesManager.
4. Moves the remarks creation logic from separate parallel execution
   into the object file loading routine.

Testing: it passes "check-all" lit testing. The MD5 checksum of the clang .dSYM
bundle is the same for dsymutil with and without this patch.

Reviewers: JDevlieghere, friss, dblaikie, aprantl, jdoerfert

Reviewed By: JDevlieghere

Subscribers: merge_guards_bot, hiraditya, jfb, llvm-commits, probinson, thegameg

Tags: #llvm, #debug-info

Differential Revision: https://reviews.llvm.org/D72915
2020-01-23 18:16:32 +03:00