1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-18 18:42:46 +02:00
Commit Graph

213344 Commits

Author SHA1 Message Date
Oliver Stannard
0914bea32c Revert "Reapply "OpaquePtr: Turn inalloca into a type attribute""
Reverting because test 'Bindings/Go/go.test' is failing on most
buildbots.

This reverts commit fc9df309917e57de704f3ce4372138a8d4a23d7a.
2021-03-29 11:32:22 +01:00
Simon Pilgrim
f3d685c6a2 [X86][F16C] Add F16C -O0 test coverage
Ensure the duplicate conversions noticed in D48614 have gone
2021-03-29 11:31:20 +01:00
Simon Pilgrim
da2d2f6455 [X86] Regenerate tests to add missing @PLT 2021-03-29 11:31:19 +01:00
Simon Pilgrim
b49e0cd2a5 [X86][SSE] combineHorizOpWithShuffle - consistently use getTargetShuffleInputs to decode shuffles
Minor cleanup before I start trying to merge the unary/binary shuffle combining paths.
2021-03-29 11:31:19 +01:00
Nashe Mncube
762a55526d [SVE][Analysis]Instruction costs for ops on scalable-vec
The following operations have no associated cost for them
when applied to scalable vectors, and as a consequence
can trigger a crash when a call is made to
AArch64TTIImpl::getCastInstrCost():
- fptrunc
- trunc
- fpext
- fpto(u,s)i

This patch adds costs for these operations and
relevant regression tests.

Differential Revision: https://reviews.llvm.org/D98934
2021-03-29 11:15:50 +01:00
Stefan Gränitz
cc45a8982f [Orc][tests] Moving one MCJIT test over to Orc to make sure the PowerPC fix worked
The PowerPC fix landed in d9069dd9b576. This is in preparation for D98931.
2021-03-29 11:53:03 +02:00
Jingu Kang
9d66510fb1 [NFC][LoopUnswitch] Move hasPartialIVCondition to LoopUtils
Differential revision: https://reviews.llvm.org/D99490
2021-03-29 10:29:45 +01:00
Petar Avramovic
4315e26388 [AMDGPU] Extend gfx10 test coverage. NFC.
Differential Revision: https://reviews.llvm.org/D99267
2021-03-29 11:13:55 +02:00
David Green
895ea4b8bd [ARM] Extend MVE lane interleaving to handle other non-instruction leaves
This extends the recent MVE lane interleaving passto handle other
non-instruction leaves, for which a new shuffle is added. This helps
especially for constants and potentially for arguments.

Differential Revision: https://reviews.llvm.org/D97289
2021-03-29 09:05:45 +01:00
Lang Hames
859b69b17f [ORC][C-bindings] Fix some ORC C bindings function names and signatures.
LLVMOrcDisposeObjectLayer and LLVMOrcExecutionSessionGetJITDylibByName did not
have matching signatures between the C-API header and binding implementations.
Fixes http://llvm.org/PR49745.

Patch by Mats Larsen. Thanks Mats!

Reviewed by: lhames

Differential Revision: https://reviews.llvm.org/D99478
2021-03-28 16:30:47 -07:00
Craig Topper
7c7e47bb5a [RISCV] Add a RV64 mulhsu test case. NFC 2021-03-28 15:54:44 -07:00
David Green
44947e83f7 [ARM] Fix the Changed value in the MVE lane interleaving pass. 2021-03-28 23:47:53 +01:00
Nikita Popov
ab6e561d30 [BasicAA] Make sure types match in constant offset heuristic
This can only happen if offset types that are larger than the
pointer size are involved. The previous implementation did not
assert in this case because it initialized the APInts to the
width of one of the variables -- though I strongly suspect it
did not compute correct results in this case.

Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=32621
reported by fhahn.
2021-03-28 21:38:09 +02:00
Craig Topper
99c3c72646 [X86] Add phase ordering test for the problem D99427 is trying to solve. NFC 2021-03-28 12:14:30 -07:00
Craig Topper
de0e08c72b [X86] Optimize vXi8 MULHS on targets where we can't sign_extend to the next register size.
For these cases we need to extract the upper or lower elements,
multiply them using 16-bit multiplies and repack them.

Previously we used punpcklbw/punpckhbw+psraw or pmovsxbw+pshudfd to
extract and sign extend so we could use pmullw to compute the 16-bit
product and then shift down the high bits.

We can avoid the need to sign extend if we unpack the bytes into
the high byte of each word and fill the lower byte with 0 using
pxor. This puts the sign bit of each byte into the sign bit of
each word. Since the LHS and RHS have 8 trailing zeros, the full
32-bit product of those 16-bit values will have 16 trailing zeros.
This means the 16-bit product of the original bytes is in the upper
16 bits which we can calculate using pmulhw.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D98587
2021-03-28 11:41:29 -07:00
Craig Topper
6e75027132 [X86][update_llc_test_checks] Use a less greedy regular expression for replacing constant pool labels in tests.
While working on D97208 I noticed that these greedy regular
expressions prevent tests from failing when (%rip) appears after
a constant pool label when it didn't before.

Reviewed By: RKSimon, pengfei

Differential Revision: https://reviews.llvm.org/D99460
2021-03-28 11:39:46 -07:00
LLVM GN Syncbot
4a9e2678d2 [gn build] Port 7b6f760fcd19 2021-03-28 18:35:33 +00:00
David Green
d2d877079e [ARM] MVE vector lane interleaving
MVE does not have a single sext/zext or trunc instruction that takes the
bottom half of a vector and extends to a full width, like NEON has with
MOVL. Instead it is expected that this happens through top/bottom
instructions. So the MVE equivalent VMOVLT/B instructions take either
the even or odd elements of the input and extend them to the larger
type, producing a vector with half the number of elements each of double
the bitwidth. As there is no simple instruction for a normal extend, we
often have to expand sext/zext/trunc into a series of lane moves (or
stack loads/stores, which we do not do yet).

This pass takes vector code that starts at truncs, looks for
interconnected blobs of operations that end with sext/zext and
transforms them by adding shuffles so that the lanes are interleaved and
the MVE VMOVL/VMOVN instructions can be used. This is done pre-ISel so
that it can work across basic blocks.

This initial version of the pass just handles a limited set of
instructions, not handling constants or splats or FP, which can all come
as extensions to this base.

Differential Revision: https://reviews.llvm.org/D95804
2021-03-28 19:34:58 +01:00
Craig Topper
39af919139 [RISCV] Add test case for mulhsu.
We don't yet use mulhsu, but we should.
2021-03-28 11:03:39 -07:00
Matt Arsenault
403cadc380 Reapply "OpaquePtr: Turn inalloca into a type attribute"
This reverts commit 20d5c42e0ef5d252b434bcb610b04f1cb79fe771.
2021-03-28 13:35:21 -04:00
Sanjay Patel
2bdb86c857 [InstCombine] sink min/max intrinsics with common op after select
This is another step towards parity with cmp+select min/max idioms.

See D98152.
2021-03-28 13:13:04 -04:00
Sanjay Patel
fbffae5729 [InstCombine] add tests for select of min/max intrinsics; NFC 2021-03-28 13:13:04 -04:00
Nico Weber
755e1b95c9 Revert "OpaquePtr: Turn inalloca into a type attribute"
This reverts commit 4fefed65637ec46c8c2edad6b07b5569ac61e9e5.
Broke check-clang everywhere.
2021-03-28 13:02:52 -04:00
Zakk Chen
56db174a0b [RISCV][Clang] Update new overloading rules for RVV intrinsics.
RVV intrinsics has new overloading rule, please see
82aac7dad4

Changed:
1. Rename `generic` to `overloaded` because the new rule is not using C11 generic.
2. Change HasGeneric to HasNoMaskedOverloaded because all masked operations
   support overloading api.
3. Add more overloaded tests due to overloading rule changed.

Differential Revision: https://reviews.llvm.org/D99189
2021-03-28 09:04:35 -07:00
Stefan Gränitz
cd6dd68cfd [Orc][examples] Add missing dependency to OrcShared in LLJITWithRemoteDebugging 2021-03-28 17:48:28 +02:00
Stefan Gränitz
8130b8a5a1 [Orc][examples] Add LLJITWithRemoteDebugging example 2021-03-28 17:25:09 +02:00
Matt Arsenault
600ba40bd7 AArch64/GlobalISel: Remove IR section from test 2021-03-28 11:12:59 -04:00
Matt Arsenault
9b63996812 OpaquePtr: Turn inalloca into a type attribute
I think byval/sret and the others are close to being able to rip out
the code to support the missing type case. A lot of this code is
shared with inalloca, so catch this up to the others so that can
happen.
2021-03-28 11:12:23 -04:00
Florian Hahn
f051f52528 [LV] Mark a few more cost-model members as const (NFC). 2021-03-28 14:59:48 +01:00
Nikita Popov
bdb84921a1 [BasicAA] Handle gep with unknown sizes earlier (NFCI)
If the sizes of both memory locations are unknown, we can only
perform a check on the underlying objects. There's no point in
going through GEP decomposition in this case.
2021-03-28 15:48:49 +02:00
Florian Hahn
3a9af0ee66 [SelDag] Add isIntOrFPConstant helper function.
This patch adds a new isIntOrFPConstant  helper function to check if a
SDValue is a integer of FP constant. This pattern is used in various
places.

There also are places that incorrectly just check for integer constants,
e.g. D99384, so hopefully this helper will help people avoid that issue.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D99428
2021-03-28 12:48:58 +01:00
Hsiangkai Wang
3039a6f04c [RISCV] Add vfabs.v pseudo instruction.
Differential Revision: https://reviews.llvm.org/D99454
2021-03-28 10:24:05 +08:00
Vaivaswatha Nagaraj
60a545b91a [OCaml][Test] Fix and enable debuginfo.ml test
`get_or_create_type_array` was used on a non-type MDNode.
Add interface for `get_or_create_array` and use that instead.

Differential Revision: https://reviews.llvm.org/D99450
2021-03-28 06:25:39 +05:30
Craig Topper
519ec4e15e [X86] Regenerate a bunch of tests to pick up @PLT
I'm prepping another patch to the same tests and this just adds
noise to my diff.
2021-03-27 16:41:35 -07:00
Craig Topper
9c604c891f [RISCV] Add a pattern for (sext_inreg (mul (and X, 0xffffffff), (and Y, 0xffffffff)), i32) to suppress MULW formation
We have a special pattern for
(mul (and X, 0xffffffff), (and Y, 0xffffffff)), to optimize the
ANDs to shift. But if a sext_inreg coms first, we'll form a MULW
and limit the effectiveness of the special match. So this patch
adds a larger pattern to suppress the MULW formation by emitting
a sext.w and then the same output we use for the
(mul (and X, 0xffffffff), (and Y, 0xffffffff)). This should all
get CSEd.

This is the issue I was trying to fix with D99029, but that affected
many more tests.
2021-03-27 15:37:18 -07:00
Nikita Popov
f2b0645b2f [BasicAA] Refactor linear expression decomposition
The current linear expression decomposition handles zext/sext by
decomposing the casted operand, and then checking NUW/NSW flags
to determine whether the extension can be distributed. This has
some disadvantages:

First, it is not possible to perform a partial decomposition. If
we have zext((x + C1) +<nuw> C2) then we will fail to decompose
the expression entirely, even though it would be safe and
profitable to decompose it to zext(x + C1) +<nuw> zext(C2)

Second, we may end up performing unnecessary decompositions,
which will later be discarded because they lack nowrap flags
necessary for extensions.

Third, correctness of the code is not entirely obvious: At a high
level, we encounter zext(x -<nuw> C) in the form of a zext on the
linear expression x + (-C) with nuw flag set. Notably, this case
must be treated as zext(x) + -zext(C) rather than zext(x) + zext(-C).
The code handles this correctly by speculatively zexting constants
to the final bitwidth, and performing additional fixup if the
actual extension turns out to be an sext. This was not immediately
obvious to me.

This patch inverts the approach: An ExtendedValue represents a
zext(sext(V)), and linear expression decomposition will try to
decompose V further, either by absorbing another sext/zext into the
ExtendedValue, or by distributing zext(sext(x op C)) over a binary
operator with appropriate nsw/nuw flags. At each step we can
determine whether distribution is legal and abort with a partial
decomposition if not. We also know which extensions we need to
apply to constants, and don't need to speculate or fixup.
2021-03-27 23:31:58 +01:00
Florian Hahn
4ba4438b35 [LV] Fix formatting from 2f9d68c3f12a. 2021-03-27 21:29:56 +00:00
Florian Hahn
5447475d44 [LV] Mark some methods as const (NFC).
Mark a few methods as const, as they do not modify any state.
2021-03-27 21:27:53 +00:00
Alex Reinking
e8bc4b3fec [CMake] Use write_basic_package_version_file for LLVM
Use the CMake 3.13 features of CMakeConfigPackageHelpers to generate
LLVMConfigVersion.cmake with proper architecture detection, major+minor
version matching, etc.

Differential Revision: https://reviews.llvm.org/D99451
2021-03-27 21:02:20 +00:00
Nico Weber
47177685fc [gn build] rewrap a comment to 80 cols 2021-03-27 12:50:33 -04:00
Simon Pilgrim
441e65c23a [X86][SSE] foldShuffleOfHorizOp - remove broadcast handling.
Remove VBROADCAST/MOVDDUP/splat-shuffle handling from foldShuffleOfHorizOp

This can all be handled by canonicalizeShuffleMaskWithHorizOp along as we check that the HADD/SUB are only used once (to prevent infinite loops on slow-horizop targets which will try to reuse the nodes again followed by a post-hop shuffle).
2021-03-27 15:09:23 +00:00
Joel E. Denny
1d86d8b8f3 [FileCheck] Try to fix buildbot failures caused by c7c542e8f306
For example,

<https://lab.llvm.org/buildbot/#/builders/132/builds/3929>

has this diagnostic:

```
/opt/gcc/9.3.0/snos/include/g++/bits/stl_tree.h:780:8: error: static assertion failed: comparison object must be invocable as const
  780 |        is_invocable_v<const _Compare&, const _Key&, const _Key&>,
      |        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
```
2021-03-27 11:03:10 -04:00
Joel E. Denny
fcd363afd3 [FileCheck] Fix -dump-input per-pattern diagnostic indexing
In input dump annotations, `check:2'1` indicates diagnostic 1 for the
`CHECK` directive on check file line 2.  Without this patch,
`-dump-input` computes the diagnostic index with the assumption that
FileCheck *consecutively* produces all diagnostics for the same
pattern.  Already, that can be a false assumption, as in the examples
below.  Moreover, it seems like a brittle assumption as FileCheck
evolves.  Finally, it actually complicates the implementation even if
it makes it slightly more efficient.

This patch avoids that assumption.  Examples below show results after
applying this patch.  Before applying this patch, `'N` is omitted
throughout these examples because the implementation doesn't notice
there's more than one diagnostic per pattern.

First, `CHECK-LABEL` violates the assumption because `CHECK-LABEL`
tries to match twice, and other directives can match in between:

```
$ cat check
CHECK: foobar
CHECK-LABEL: foobar

$ FileCheck -vv check < input |& tail -8
<<<<<<
           1: text
           2: foobar
label:2'0     ^~~~~~
check:1       ^~~~~~
label:2'1           X error: no match found
           3: text
>>>>>>
```

Second, `--implicit-check-not` is obviously processed many times among
other directives:

```
$ cat check
CHECK: foo
CHECK: foo

$ FileCheck -vv -dump-input=always -implicit-check-not=foo \
            check < input |& tail -16
<<<<<<
            1: text
not:imp1'0     X~~~~
            2: foo
check:1        ^~~
not:imp1'1        X
            3: text
not:imp1'1     ~~~~~
            4: foo
check:2        ^~~
not:imp1'2        X
            5: text
not:imp1'2     ~~~~~
            6:
eof:2          ^
>>>>>>
```

Reviewed By: thopre, jhenderson

Differential Revision: https://reviews.llvm.org/D97813
2021-03-27 10:36:21 -04:00
Nikita Popov
4cf39be5a9 [BasicAA] Correct handle implicit sext in decomposition
While explicit sext instructions were handled correctly, the
implicit sext that occurs if the offset is smaller than the
pointer size blindly assumed that sext(X * Scale + Offset) is the
same as sext(X) * Scale + Offset, which is obviously not correct.

Fix this by extracting the code that handles linear expression
extension and reusing it for the implicit sext as well.
2021-03-27 15:15:47 +01:00
Nikita Popov
13fbfb64c9 [BasicAA] Clarify entry values of GetLinearExpression() (NFC)
A number of variables need to be correctly initialized on entry
to GetLinearExpression() for the implementation to behave reasonably.

The fact that SExtBits can currenlty be non-zero on entry is a bug,
as demonstrated by the added test: For implicit sexts by the GEP,
we do currently skip legality checks.
2021-03-27 14:50:09 +01:00
Nikita Popov
50e408c21a [BasicAA] Bail out earlier for invalid shift amount
Currently, we'd produce an incorrect decomposition, because we
already recursively called GetLinearExpression(), so the Scale=1,
Offset=0 will not necessarily be relative to the shl itself.

Now, this doesn't actually matter for functional correctness,
because such a shift is poison anyway, so its okay to return
an incorrect decomposition. It's still unnecessarily confusing
though, and we can easily avoid this by checking the bitwidth
earlier.
2021-03-27 12:41:16 +01:00
Nikita Popov
f99cc32fd4 [BasicAA] Retain shl nowrap flags in GetLinearExpression()
Nowrap flags between mul and shl differ in that mul nsw allows
multiplication of 1 * INT_MIN, while shl nsw does not. This means
that it is always fine to transfer shl nowrap flags to muls, but
not necessarily the other way around. In this case the NUW/NSW
results refer to mul/add operations, so it's fine to retain the
flags from the shl.
2021-03-27 12:26:22 +01:00
Simon Pilgrim
cdae6be18a [X86][SSE] combineX86ShuffleChain - attempt to recognise 'hidden' identity shuffles
See if the combined shuffle mask is equivalent to an identity shuffle, typically this is due to repeated LHS/RHS ops in horiz-ops, but isTargetShuffleEquivalent might see other patterns as well.

This is another small step towards getting rid of foldShuffleOfHorizOp and relying on canonicalizeShuffleMaskWithHorizOp and generic shuffle combining.
2021-03-27 11:09:30 +00:00
Juneyoung Lee
c9eefe2dea Make FoldBranchToCommonDest poison-safe by default
This is a small patch to make FoldBranchToCommonDest poison-safe by default.
After fc3f0c9c, only two syntactic changes are needed to fix unit tests.
This does not cause any assembly difference in testsuite as well (-O3, X86-64 Manjaro).

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D99452
2021-03-27 19:05:12 +09:00
Sanjay Patel
64dc4c67c5 [x86] prevent crashing while matching pmaddwd
This could crash in 2 ways: either one or both of
the input vectors could be a different size than
the math ops.

https://llvm.org/PR49716
2021-03-27 05:27:14 -04:00