1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-22 10:42:39 +01:00
Commit Graph

219299 Commits

Author SHA1 Message Date
Michael Liao
b2d24acf01 [amdgpu] Add 64-bit PC support when expanding unconditional branches.
Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D106445
2021-07-26 14:50:30 -04:00
Craig Topper
837c44ce91 [TypePromotion] Remove redundant if. NFC
The same condition was checked in the previous if. Maybe this was
a bad merge resolution?
2021-07-26 11:47:25 -07:00
Kevin P. Neal
800de3f8d5 [FPEnv][InstSimplify] Enable more folds for constrained fadd
Precommit tests, try 2. My tree is up-to-date as of this morning so this
should go better than my first try.
2021-07-26 14:06:21 -04:00
Amara Emerson
6a4c66376a [AArch4][GlobalISel] Post-legalize combine s64 = G_MERGE s32, 0 -> G_ZEXT.
These are generated as a byproduce of legalization.

Differential Revision: https://reviews.llvm.org/D106768
2021-07-26 10:58:04 -07:00
Eli Friedman
2211738092 [LLVM IR] Allow volatile stores to trap.
Proposed alternative to D105338.

This is ugly, but short-term I think it's the best way forward: first,
let's formalize the hacks into a coherent model. Then we can consider
extensions of that model (we could have different flavors of volatile
with different rules).

Differential Revision: https://reviews.llvm.org/D106309
2021-07-26 10:51:00 -07:00
Amara Emerson
d0d4c1578a [AArch64][GlobalISel] Enable some select combines after legalization.
The legalizer generates selects for some operations, which can have constant
condition values, resulting in lots of dead code if it's not folded away.

Differential Revision: https://reviews.llvm.org/D106762
2021-07-26 10:40:32 -07:00
Amara Emerson
b09f2e63d9 [GlobalISel] Add combine for merge(unmerge) and use AArch64 postlegal-combiner.
Differential Revision: https://reviews.llvm.org/D106761
2021-07-26 10:37:31 -07:00
Heejin Ahn
cd521f45d9 [WebAssembly] Improve pseudocode in LowerEmscriptenEHSjLj
Both `__THREW__` and `__threwValue` are global variables, and we have
been distinguishing the global variable `__THREW__` and the loaded value
`%__THREW__.val` in comments but not doing it for `__threwValue`. Made
the pseudocode comments consistent for both variables.

Reviewed By: dschuff

Differential Revision: https://reviews.llvm.org/D106524
2021-07-26 10:13:28 -07:00
Simon Pilgrim
e79fa78700 [X86][AVX] Add PR50053 test case 2021-07-26 17:57:38 +01:00
Florian Hahn
3ef6cc6cad [LAA] Remove RuntimeCheckingPtrGroup::RtCheck member (NFC).
This patch removes RtCheck from RuntimeCheckingPtrGroup to make it
possible to construct RuntimeCheckingPtrGroup objects without a
RuntimePointerChecking object. This should make it easier to
re-use the code to generate runtime checks, e.g. in D102834.

RtCheck was only used to access the pointer info for a given index.
Instead, the start and end expressions can be passed directly.

For code-gen, we also need to know the address space to use. This can
also be explicitly passed at construction.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D105481
2021-07-26 17:38:10 +01:00
Stephen Tozer
27b2ac2965 [DebugInfo] Correctly update debug users of SSA values in tail duplication
During tail duplication, SSA values may be updated and have their uses
replaced with a virtual register, and any debug instructions that use
that value are deleted. This patch fixes the implementation of the debug
instruction deletion to work correctly for debug instructions that use
the SSA value multiple times, by batching deletions so that we don't
attempt to delete the same instruction twice.

Differential Revision: https://reviews.llvm.org/D106557
2021-07-26 17:27:57 +01:00
Simon Pilgrim
e67bfc0f97 [Analysis] Fix getOrderedReductionCost to call target's getArithmeticInstrCost implementation
The getOrderedReductionCost implementation introduced in D105432 calls the CRTP base version getArithmeticInstrCost instead of the redirecting to the target version.

Differential Revision: https://reviews.llvm.org/D106795
2021-07-26 17:15:43 +01:00
Sander de Smalen
2b39df3750 [LV] Remove assert that VF cannot be scalable in setCostBasedWideningDecision.
Scalarization for scalable vectors is not (yet) supported, so the
LV discards a VF when scalarization is chosen as the widening
decision. It should therefore not assert that the VF is not scalable
when it computes the decision to scalarize.

The code can get here when both the interleave-cost, gather/scatter cost
and scalarization-cost are all illegal. This may e.g. happen for SVE
when the VF=1, to avoid generating `<vscale x 1 x eltty>` types that
the code-generator cannot yet handle.

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D106656
2021-07-26 17:11:45 +01:00
Nikita Popov
c0e218377f [MergeICmps] Collect block instructions once (NFC)
Collect the relevant instructions for a given BCECmpBlock once
on construction, rather than repeating this logic in multiple
places.
2021-07-26 18:07:20 +02:00
Fangrui Song
834bd6611c [llvm-objcopy] Drop GRP_COMDAT if the group signature is localized
See [GRP_COMDAT group with STB_LOCAL signature](https://groups.google.com/g/generic-abi/c/2X6mR-s2zoc)
objcopy PR: https://sourceware.org/bugzilla/show_bug.cgi?id=27931

GRP_COMDAT deduplication is purely based on the signature symbol name in
ld.lld/GNU ld/gold. The local/global status is not part of the equation.

If the signature symbol is localized by --localize-hidden or
--keep-global-symbol, the intention is likely to make the group fully
localized. Drop GRP_COMDAT to suppress deduplication.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D106782
2021-07-26 09:05:18 -07:00
Fangrui Song
f0aeef47e5 [yaml2obj][MachO] Rename PayloadString to Content
The new name is conciser and matches yaml2obj ELF & DWARF.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D106759
2021-07-26 09:04:51 -07:00
Nikita Popov
fa9a368e39 [MergeICmps] Try to fix MSVC build failure
Apparently this fails to line up the types -- try to sidestep the
issue entirely by writing the code in a more reasonable way: Walk
over the operands and perform a set lookup, rather than walking
over the set and performing an operand scan.
2021-07-26 17:31:27 +02:00
Kazu Hirata
24ba96f75f [AsmParser] Remove MDRef (NFC)
The last use was removed on Jan 12, 2015 in commit
ab617d597708fcf3c4b829bf595e9d990ca66c07.
2021-07-26 08:29:33 -07:00
Paul Walker
6d505ef4dd [SVE] Use reg+reg addressing mode for immediate offsets.
For reg+imm SVE addressing mode imm is implictly scaled by VL,
making them impractical for truely immediate offsets.  However, if
the offset can be unscaled based on the storage element type we
can use the reg+reg SVE addressing mode and thus either reduce the
number of generate add instructions or replace them with a mov
instruction that can be hoisted from the hot code path.

Differential Revision: https://reviews.llvm.org/D106744
2021-07-26 16:24:16 +01:00
Sanjay Patel
2b04c07ca1 [SimplifyLibCalls] avoid crash on pointer math
We could try harder to screen out libcalls by
function signature (and that would be a much larger
change than for sprintf alone), but that might make
the transition to type-less pointers more difficult.

https://llvm.org/PR51200
2021-07-26 11:08:45 -04:00
Sanjay Patel
a1170d1e3b [SimplifyLibCalls] reduce code duplication; NFC 2021-07-26 11:08:45 -04:00
Nikita Popov
6ebcde8cc9 [MergeICmps] Separate out BCECmp and use Optional (NFC)
Separate out the BCECmp part from BCECmpBlock, which just stores
the comparison atoms without the branch instruction. At the same
time switch the code to return Optional<> rather than objects in
invalid state and partially constructed objects.
2021-07-26 17:06:43 +02:00
Sander de Smalen
6651df41b7 [LV] Don't assume isScalarAfterVectorization if one of the uses needs widening.
This fixes an issue that was found in D105199, where a GEP instruction
is used both as the address of a store, as well as the value of a store.
For the former, the value is scalar after vectorization, but the latter
(as value) requires widening.

Other code in that function seems to prevent similar cases from happening,
but it seems this case was missed.

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D106164
2021-07-26 16:01:55 +01:00
Bradley Smith
5e51e7ed64 [AArch64][SVE] Break false dependencies for inactive lanes of unary operations
Differential Revision: https://reviews.llvm.org/D105889
2021-07-26 15:01:21 +00:00
Ulrich Weigand
81afdbc83c [SystemZ] Add support for new cpu architecture - arch14
This patch adds support for the next-generation arch14
CPU architecture to the SystemZ backend.

This includes:
- Basic support for the new processor and its features.
- Detection of arch14 as host processor.
- Assembler/disassembler support for new instructions.
- New LLVM intrinsics for certain new instructions.
- Support for low-level builtins mapped to new LLVM intrinsics.
- New high-level intrinsics in vecintrin.h.
- Indicate support by defining  __VEC__ == 10304.

Note: No currently available Z system supports the arch14
architecture.  Once new systems become available, the
official system name will be added as supported -march name.
2021-07-26 16:57:28 +02:00
Florian Hahn
ca0aa2b075 Recommit "[VPlan] Add recipe for first-order rec phis, make splicing explicit."
This reverts the revert commit b1777b04dc4b1a9fee0e7effa7e177892ab32ef0.

The patch originally got reverted due to a crash:
https://bugs.chromium.org/p/chromium/issues/detail?id=1232798#c2

The underlying issue was that we were not using the stored values from
the modified memory recipes, but the out-of-date values directly from
the IR (accessed via the VPlan). This should be fixed in d995d6376. A
reduced version of the reproducer has been added in 93664503be6b.
2021-07-26 15:50:30 +01:00
Nikita Popov
bab200ac44 [IR] Consider non-willreturn as side effect (PR50511)
This adjusts mayHaveSideEffect() to return true for !willReturn()
instructions. Just like other side-effects, non-willreturn calls
(aka "divergence") cannot be removed and cannot be reordered relative
to other side effects. This fixes a number of bugs where
non-willreturn calls are either incorrectly dropped or moved. In
particular, it also fixes the last open problem in
https://bugs.llvm.org/show_bug.cgi?id=50511.

I performed a cursory review of all current mayHaveSideEffect()
uses, which convinced me that these are indeed the desired default
semantics. Places that do not want to consider non-willreturn as a
sideeffect generally do not want mayHaveSideEffect() semantics at
all. I identified two such cases, which are addressed by D106591
and D106742. Finally, there is a use in SCEV for which we don't
really have an appropriate API right now -- what it wants is
basically "would this be considered forward progress". I've just
spelled out the previous semantics there.

Differential Revision: https://reviews.llvm.org/D106749
2021-07-26 16:35:14 +02:00
Benjamin Kramer
9ae7d5aa56 Simplify away some SmallVector copies. NFCI.
The lifetime of the initializer list is the full expression, so we can
skip storing it in a temporary vector.
2021-07-26 16:33:38 +02:00
Jeremy Morse
9b82c651ff [InstrRef][AArch64][1/4] Accept constant physreg variable locations
Late in SelectionDAG we join up instruction numbers with their defining
instructions, if it couldn't be done during the main part of SelectionDAG.
One exception is function arguments, where we have to point a DBG_PHI
instruction at the incoming live register, as they don't have a defining
instruction. This patch adds another exception, for constant physregs, like
aarch64 has.

It may seem wasteful to use two instructions where we could use a single
DBG_VALUE, however the whole point of instruction referencing is to
decouple the identification of values from the specification of where
variable location ranges start.

(Part of my aarch64 work to ease adoption of  instruction referencing, as
in the meta comment on D104520)

Differential Revision: https://reviews.llvm.org/D104520
2021-07-26 15:26:15 +01:00
Florian Hahn
d00a7edec9 [LV] Add test to store a first-order rec via interleave group.
This is a reduced version of the reproducer from
https://bugs.chromium.org/p/chromium/issues/detail?id=1232798#c2
2021-07-26 15:20:04 +01:00
Alexey Bataev
37023003ea [SLP]Fix costs calculations.
Need to fix several cost-related problems. The final type may be defined
incorrectly because of to early definition (we may end up with the wider
type), the CommonCost should not be redefined in ExtractElements
cost related calculations and the shuffle of the final insertelements
vectors should be calculated as a cost of single vector permutations
+ costs of two vector permutations for other n-1 incoming vectors.

Differential Revision: https://reviews.llvm.org/D106578
2021-07-26 07:14:03 -07:00
gbreynoo
40ac5ec4fa [llvm-readobj] Display multiple function names for stack size entries
The current implementation of displaying .stack_size information
presumes that each entry represents a single function but this is not
always the case. For example with the use of ICF multiple functions can
be represented with the same code, meaning that the address found in a
.stack_size entry corresponds to multiple function symbols.
This change allows multiple function names to be displayed when
appropriate.

Differential Revision: https://reviews.llvm.org/D105884
2021-07-26 14:49:53 +01:00
Jay Foad
453349344b [AMDGPU][GISel] Fix MMO for raw/struct buffer access with non-constant offset
Codegen for the raw/struct buffer access intrinsics would update the
offset in the MMO to reflect the combined offset, if it was known to be
constant. If the combined offset was not known to be constant, or if
there was an index, it would set the offset in the MMO to 0. This is
unsafe because it makes it look like the access does not alias with
another access with a fixed non-zero offset.

Fix these cases by setting the pointer in the MMO to null, to reflect
the fact that we do not have any known IR value pointer + constant
offset for the access.

D106284 did this for SelectionDAG. This is the corresponding fix for
GlobalISel.

Differential Revision: https://reviews.llvm.org/D106451
2021-07-26 14:27:30 +01:00
Jay Foad
eaeb8d4f70 [AMDGPU] Pre-commit global-isel test case for D106451
This test case shows the scheduler wrongly reordering two buffer
accesses that might alias.
2021-07-26 14:27:30 +01:00
Jay Foad
7f0f3d6b7b [AMDGPU] Fix MMO for raw/struct buffer access with non-constant offset
Codegen for the raw/struct buffer access intrinsics would update the
offset in the MMO to reflect the combined offset, if it was known to be
constant. If the combined offset was not known to be constant, or if
there was an index, it would set the offset in the MMO to 0. This is
unsafe because it makes it look like the access does not alias with
another access with a fixed non-zero offset.

Fix these cases by setting the pointer in the MMO to null, to reflect
the fact that we do not have any known IR value pointer + constant
offset for the access.

Differential Revision: https://reviews.llvm.org/D106284
2021-07-26 14:27:30 +01:00
David Green
1cb5f9c5d7 [ARM] Ensure correct regclass in distributing postinc
The register class required for some MVE loads/stores is more
constrained than the register we use when creating postinc. Make sure we
constrain the register class to keep the code correct.
2021-07-26 14:26:38 +01:00
Tim Northover
c8cc09ffa5 AArch64: support i128 (& larger) returns in GlobalISel 2021-07-26 14:16:35 +01:00
Nikita Popov
845ad210b0 [SimplifyCFG] Improve store speculation check
isSafeToSpeculateStore() looks for a preceding store to the same
location to make sure that introducing a new store of the same
value is safe. It currently bails on intervening mayHaveSideEffect()
instructions. However, I believe just checking mayWriteToMemory()
is sufficient there -- we just need to make sure that we know which
value was stored, we don't care if we can unwind in the meantime.

While looking into this, I started having some doubts about the
correctness of the transform with regard to thread safety. While
we don't try to hoist non-simple stores, I believe we also need
to make sure that the preceding store is simple as well. Otherwise
we could introduce a spurious non-atomic write after an atomic write
-- under our memory model this would result in a subsequent undef
atomic read, even if the second write stores the same value as the
first.

Example: https://alive2.llvm.org/ce/z/q_3YAL

Differential Revision: https://reviews.llvm.org/D106742
2021-07-26 15:01:00 +02:00
Kerry McLaughlin
d43867c3c3 [SVE] Fix casts to <FixedVectorType> in truncateToMinimalBitwidths
Fixes more casts to `<FixedVectorType>` for the cases where the
instruction is a Insert/ExtractElementInst.

For fixed-width, this part of truncateToMinimalBitWidths is tested by
AArch64/type-shrinkage-insertelt.ll. I attempted to write a test case for this part
of truncateToMinimalBitWidths which uses scalable vectors, but was unable to add
one. The tests in type-shrinkage-insertelt.ll rely on scalarization to create extract
element instructions for instance, which is not possible for scalable vectors.

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D106163
2021-07-26 13:44:51 +01:00
Alexey Bataev
7d6020c9b9 Revert "[SLP]Fix costs calculations."
This reverts commit a053afed49897aa34e08287f91c5255efa4e5131 to fix
buildbots.
2021-07-26 05:42:34 -07:00
Caroline Concatto
7e3be1e953 [AArch65][SVE] Remove vector_splice from AddedComplexity pattern
The pattern for vector_splice with Index equal or bigger than
zero was misplaced in the AddedComplexity = 1 pattern in the AArch64
tablegen file. This patch fixes it by removing vector_splice pattern
from inside AddedComplexity = 1.
2021-07-26 13:35:51 +01:00
Alexey Bataev
d88801e0e6 [SLP]Fix costs calculations.
Need to fix several cost-related problems. The final type may be defined
incorrectly because of to early definition (we may end up with the wider
type), the CommonCost should not be redefined in ExtractElements
cost related calculations and the shuffle of the final insertelements
vectors should be calculated as a cost of single vector permutations
+ costs of two vector permutations for other n-1 incoming vectors.

Differential Revision: https://reviews.llvm.org/D106578
2021-07-26 04:37:22 -07:00
Paul Walker
fc75021aa7 [NFC] Change VFShape so it contains an ElementCount rather than seperate VF and IsScalable properties.
Differential Revision: https://reviews.llvm.org/D106750
2021-07-26 12:25:46 +01:00
Philipp Krones
d7917544a3 [Inliner] Make the CallPenalty configurable
Tests with multiple benchmarks, like Embench [1], showed that the
CallPenalty magic number has the most influence on inlining decisions
when optimizing for size.

On the other hand, there was no good default value for this parameter.
Some benchmarks profited strongly from a reduced call penalty. On
example is the picojpeg benchmark compiled for RISC-V, which got 6%
smaller with a CallPenalty of 10 instead of 12. Other benchmarks
increased in size, like matmult.

This commit makes the compromise of turning the magic number constant of
CallPenalty into a configurable value. This introduces the flag
`--inline-call-penalty`. With that flag users can fine tune the inliner
to their needs.

The CallPenalty constant was also used for loops. This commit replaces
the CallPenalty constant with a new LoopPenalty constant that is now
used instead.

This is a slimmed down version of https://reviews.llvm.org/D30899

[1]: https://github.com/embench/embench-iot

Differential Revision: https://reviews.llvm.org/D105976
2021-07-26 12:07:49 +01:00
Florian Hahn
a0b07d2e54 [VPlan] Use stored value from recipes for interleave groups.
Instead of getting the VPValue for the stored IR values through the
current plan, use the stored value of the recipes directly.

This way, the correct VPValues are used if the store recipes have been
modified in the VPlan and the IR value is not correct any longer. This
can happen, e.g. due to D105008.
2021-07-26 12:05:23 +01:00
Dylan Fleming
6f6b3d4f7a [SVE] Add support for folding for select + masked loads
Add folds to instcombine to support the removal of select instruction when the masked_load is guaranteed to zero the same lanes, i.e. select(mask, mload(,,mask,0), 0) -> mload(,,mask,0).

Patch originally authored by @paulwalker-arm

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D106376
2021-07-26 11:58:41 +01:00
Caroline Concatto
9af238dfc2 [SVE][AArch64] Improve code generation for vector_splice for Imm > 0
This patch implements vector_splice in tablegen for all cases when the
Immediate is positive and lower than the known minimum value of
a scalable vector.
Vector_splice can be implemented using SVE instruction EXT.
For instance :
    @llvm.experimental.vector.splice(Vector_1, Vector_2, Imm)
    @llvm.experimental.vector.splice(<A,B,C,D>, <E,F,G,H>, 1) ==> <B, C, D, E>
        EXT  Vector_1, Vector_2, Imm              // Vector_1 = B, C, D + Vector_2 = E

Depends on D105633

Differential Revision: https://reviews.llvm.org/D106273
2021-07-26 11:45:46 +01:00
David Sherwood
fd6a38b569 Fix test failures caused by 0aff1798b5721d5f95d16f465b99d357012bb8d1 2021-07-26 11:40:26 +01:00
Caroline Concatto
d9b910d9d2 [AArch64][SVE] Improve code generation for vector_splice for Imm == -1
This patch implements vector_splice in tablegen for:
  a) when the immediate is equal to -1 (Imm==1) and uses:
       INSR  +  LASTB
For instance :
@llvm.experimental.vector.splice(Vector_1, Vector_2, -1)
@llvm.experimental.vector.splice(<A,B,C,D>, <E,F,G,H>, 1) ==> <D, E, F, G>
    LAST   RegLast, Vector_1                 // RegLast = D
    INSR   Res, (Vector_1 >> 1), RegLast     // Res = D + E, F, G

Differential Revision: https://reviews.llvm.org/D105633
2021-07-26 11:25:01 +01:00
Simon Pilgrim
96fce2fe63 [X86][AVX] Prefer vinsertf128 to vperm2f128 on AVX1 targets
Splatting the lower xmm with vinsertf128 is at least as quick as vperm2f128, and a lot faster on some AMD targets.

First step towards PR50053
2021-07-26 11:11:56 +01:00