mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 02:52:53 +02:00
Commit Graph

215981 Commits

Author SHA1 Message Date
Stanislav Mekhanoshin
af64ca04f5 [AMDGPU] Add support for architected flat scratch
Add support for the read-only flat scratch register initialized
by the SPI.

Differential Revision: https://reviews.llvm.org/D102432
2021-05-14 10:53:48 -07:00
Nico Weber
8a2b8ef8d1 [gn build] (manually) merge b7d1ab75cf47
No check-hwasan-lam target yet, though.
2021-05-14 13:51:10 -04:00
Tomasz Miąsko
19ed746cf3 [Demangle][Rust] Parse integer constants
Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D102179
2021-05-14 19:47:19 +02:00
Philip Reames
72f2c7d2ee Do actual DCE in LoopUnroll (try 2)
Recommitting after addressing a missed review comment, and updating an aarch64 test I'd missed.

LoopUnroll does a limited DCE pass after unrolling, but if you have a chain of dead instructions, it only deletes the last one. Improve the code to recursively delete all trivially dead instructions.

Differential Revision: https://reviews.llvm.org/D102511
2021-05-14 10:42:36 -07:00
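
The recursive deletion described in the commit message above can be pictured with a toy model. The sketch below is not LLVM code and does not claim to show how the commit is implemented: `Inst`, `addInst`, and `deleteDeadChain` are hypothetical names, and use counts are tracked by hand. It only illustrates why a worklist removes a whole chain of dead instructions rather than just the last one.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Toy instruction (hypothetical, not an LLVM type).
struct Inst {
  std::vector<std::size_t> operands; // indices of the instructions this one uses
  bool hasSideEffects = false;       // e.g. a store or call; never deleted
  unsigned numUses = 0;              // how many operand slots refer to this one
  bool erased = false;
};

// Append an instruction and bump the use count of each operand.
std::size_t addInst(std::vector<Inst> &insts, std::vector<std::size_t> operands,
                    bool hasSideEffects = false) {
  for (std::size_t op : operands)
    ++insts[op].numUses;
  Inst inst;
  inst.operands = std::move(operands);
  inst.hasSideEffects = hasSideEffects;
  insts.push_back(std::move(inst));
  return insts.size() - 1;
}

// Trivially dead: still present, unused, and free of side effects.
static bool isTriviallyDead(const Inst &inst) {
  return !inst.erased && inst.numUses == 0 && !inst.hasSideEffects;
}

// Erase `seed` if it is dead, then revisit every operand whose last use
// just disappeared. Returns the number of instructions erased.
std::size_t deleteDeadChain(std::vector<Inst> &insts, std::size_t seed) {
  assert(seed < insts.size() && "seed must name an existing instruction");
  std::vector<std::size_t> worklist{seed};
  std::size_t erasedCount = 0;
  while (!worklist.empty()) {
    std::size_t idx = worklist.back();
    worklist.pop_back();
    if (!isTriviallyDead(insts[idx]))
      continue;
    insts[idx].erased = true;
    ++erasedCount;
    // Dropping this instruction releases its operands; any operand left
    // without uses becomes a new deletion candidate.
    for (std::size_t op : insts[idx].operands)
      if (--insts[op].numUses == 0)
        worklist.push_back(op);
  }
  return erasedCount;
}
```

With a chain `a <- b <- c` where only `c` starts out unused, calling `deleteDeadChain` on `c` erases all three. A single non-recursive check would stop after `c`, because `b` becomes dead only once `c` is gone.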
Benjamin Kramer
e9a9f45f1e Document updated googletest + modifications 2021-05-14 19:26:12 +02:00
Matt Arsenault
15058e16a1 AMDGPU: Fix assert when rewriting saddr d16 loads
moveOperands does not handle moving tied operands since it would
generally have to fix up the tied operand references. Avoid the assert
by untying and retying after the modification. These in-place
modifications really aren't manageable.
2021-05-14 13:24:19 -04:00
Roman Lebedev
7a6506cfad [NFC][X86][MCA] Add pseudo-zero-idiom vperm2f128/vperm2i128 tests - don't break deps
While the btver2 model states that this pattern is a zero-cycle zero-idiom
on Jaguar, that does not appear to be the case on Znver3. There it measures
as not being recognized as a dep-breaking zero-idiom,
let alone a zero-cycle one.
2021-05-14 20:23:05 +03:00
Roman Lebedev
a8537d3144 [X86] AMD Zen 3: same-reg AVX YMM VPCMPGT{B,W,D,Q} is a zero-cycle(!) dep-breaking zero-idiom
As measured by exegesis, and confirmed by ref docs.
2021-05-14 20:23:05 +03:00
Roman Lebedev
d28d04de75 [X86] AMD Zen 3: same-reg AVX XMM VPCMPGT{B,W,D,Q} is a zero-cycle(!) dep-breaking zero-idiom
As measured by exegesis, and confirmed by ref docs.
2021-05-14 20:23:04 +03:00
Roman Lebedev
2ff7114732 [X86] AMD Zen 3: same-reg SSE XMM PCMPGT{B,W,D,Q} is a 1-cycle(!) dep-breaking zero-idiom
As measured by exegesis, and confirmed by ref docs.
2021-05-14 20:23:04 +03:00
Roman Lebedev
ff2ff878b4 [NFC][X86][MCA] AMD Zen 3: add same-reg AVX YMM VPCMPGT{B,W,D,Q} tests 2021-05-14 20:23:04 +03:00
Roman Lebedev
317197c4a8 [NFC][X86][MCA] AMD Zen 3: add same-reg AVX XMM VPCMPGT{B,W,D,Q} tests 2021-05-14 20:23:04 +03:00
Roman Lebedev
069fb685b6 [NFC][X86][MCA] AMD Zen 3: add same-reg SSE XMM PCMPGT{B,W,D,Q} tests 2021-05-14 20:23:03 +03:00
Roman Lebedev
6e90e82fb2 [X86] AMD Zen 3: same-reg AVX YMM VPSUBUS{B,W} is a 1-cycle(!) dep-breaking zero-idiom
Not really mentioned in ref docs, but measures as such.
Yes, this one is also not zero-cycle.
2021-05-14 20:23:03 +03:00
Roman Lebedev
e641283c37 [X86] AMD Zen 3: same-reg AVX XMM VPSUBUS{B,W} is a 1-cycle(!) dep-breaking zero-idiom
Not really mentioned in ref docs, but measures as such.
Yes, this one is also not zero-cycle.
2021-05-14 20:23:03 +03:00
Roman Lebedev
7224d53fef [X86] AMD Zen 3: same-reg SSE XMM PSUBUS{B,W} is a 1-cycle(!) dep-breaking zero-idiom
Not really mentioned in ref docs, but measures as such.
2021-05-14 20:23:03 +03:00
Roman Lebedev
103eef39fa [NFC][X86][MCA] AMD Zen 3: add same-reg AVX YMM VPSUBUS{B,W} tests 2021-05-14 20:23:03 +03:00
Roman Lebedev
747b0319b1 [NFC][X86][MCA] AMD Zen 3: add same-reg AVX XMM VPSUBUS{B,W} tests 2021-05-14 20:23:02 +03:00
Roman Lebedev
d2246847e9 [NFC][X86][MCA] AMD Zen 3: add same-reg SSE XMM PSUBUS{B,W} tests 2021-05-14 20:23:02 +03:00
Roman Lebedev
19263a16d8 [X86] AMD Zen 3: same-reg AVX YMM VPSUBS{B,W} is a 1-cycle(!) dep-breaking zero-idiom
Not really mentioned in ref docs, but measures as such.
Yes, this one is also not zero-cycle.
2021-05-14 20:23:02 +03:00
Roman Lebedev
139c28fa45 [X86] AMD Zen 3: same-reg AVX XMM VPSUBS{B,W} is a 1-cycle(!) dep-breaking zero-idiom
Not really mentioned in ref docs, but measures as such.
Yes, this one is also not zero-cycle.
2021-05-14 20:23:02 +03:00
Roman Lebedev
fe32a28378 [X86] AMD Zen 3: same-reg SSE XMM PSUBS{B,W} is a 1-cycle(!) dep-breaking zero-idiom
Not really mentioned in ref docs, but measures as such.
2021-05-14 20:23:02 +03:00
Roman Lebedev
9a5a00fd92 [NFC][X86][MCA] AMD Zen 3: add same-reg AVX YMM VPSUBS{B,W} tests 2021-05-14 20:23:01 +03:00
Roman Lebedev
c27c048587 [NFC][X86][MCA] AMD Zen 3: add same-reg AVX XMM VPSUBS{B,W} tests 2021-05-14 20:23:01 +03:00
Roman Lebedev
10c74938fb [NFC][X86][MCA] AMD Zen 3: add same-reg SSE XMM PSUBS{B,W} tests 2021-05-14 20:23:01 +03:00
Roman Lebedev
0d11d5063f [X86] AMD Zen 3: same-reg AVX YMM VPSUB{B,W,D,Q} is a zero-cycle(!) dep-breaking zero-idiom
As confirmed by the exegesis measurements, and ref docs.
2021-05-14 20:23:01 +03:00
Roman Lebedev
4181fdd9ec [X86] AMD Zen 3: same-reg AVX XMM VPSUB{B,W,D,Q} is a zero-cycle(!) dep-breaking zero-idiom
As confirmed by the exegesis measurements, and ref docs.
2021-05-14 20:23:01 +03:00
Roman Lebedev
5b956e1952 [X86] AMD Zen 3: same-reg SSE XMM PSUB{B,W,D,Q} is a 1-cycle(!) dep-breaking zero-idiom
As confirmed by the exegesis measurements, and ref docs.
2021-05-14 20:23:00 +03:00
Roman Lebedev
554780708a [NFC][X86][MCA] AMD Zen 3: add same-reg AVX YMM VPSUB{B,W,D,Q} tests 2021-05-14 20:23:00 +03:00
Roman Lebedev
7a8c45f4d0 [NFC][X86][MCA] AMD Zen 3: add same-reg AVX XMM VPSUB{B,W,D,Q} tests 2021-05-14 20:23:00 +03:00
Roman Lebedev
530148d646 [NFC][X86][MCA] AMD Zen 3: add same-reg SSE XMM PSUB{B,W,D,Q} tests 2021-05-14 20:23:00 +03:00
Roman Lebedev
522d03976e [X86] AMD Zen 3: same-reg AVX YMM VPANDN is a zero-cycle(!) dep-breaking zero-idiom
As confirmed by exegesis measurements, and ref docs.
2021-05-14 20:23:00 +03:00
Roman Lebedev
d334fd8763 [X86] AMD Zen 3: same-reg AVX XMM VPANDN is a zero-cycle(!) dep-breaking zero-idiom
As confirmed by exegesis measurements, and ref docs.
2021-05-14 20:23:00 +03:00
Roman Lebedev
747aa83d9d [X86] AMD Zen 3: same-reg SSE XMM PANDN is a 1-cycle(!) dep-breaking zero-idiom
As confirmed by the exegesis measurements, and ref docs.
2021-05-14 20:22:59 +03:00
Roman Lebedev
f96afc073b [NFC][X86][MCA] AMD Zen 3: add same-reg AVX YMM VPANDN tests 2021-05-14 20:22:59 +03:00
Roman Lebedev
7cdc1e03f2 [NFC][X86][MCA] AMD Zen 3: add same-reg AVX XMM VPANDN tests 2021-05-14 20:22:59 +03:00
Roman Lebedev
106fd4d50d [NFC][X86][MCA] AMD Zen 3: add same-reg SSE XMM PANDN tests 2021-05-14 20:22:59 +03:00
Roman Lebedev
1fc967929a [X86] AMD Zen 3: same-reg AVX YMM VPXOR is a zero-cycle(!) dep-breaking zero-idiom
As confirmed by exegesis measurements, and ref docs.
2021-05-14 20:22:59 +03:00
Roman Lebedev
3b69b7222f [X86] AMD Zen 3: same-reg AVX XMM VPXOR is a zero-cycle(!) dep-breaking zero-idiom
As confirmed by exegesis measurements, and ref docs.
2021-05-14 20:22:58 +03:00
Roman Lebedev
722c0e895f [X86] AMD Zen 3: same-reg SSE XMM PXOR is a 1-cycle(!) dep-breaking zero-idiom
As confirmed by the exegesis measurements, and ref docs.
2021-05-14 20:22:58 +03:00
Roman Lebedev
2d84799d26 [NFC][X86][MCA] AMD Zen 3: add same-reg AVX YMM VPXOR tests 2021-05-14 20:22:58 +03:00
Roman Lebedev
42e170ffb7 [NFC][X86][MCA] AMD Zen 3: add same-reg AVX XMM VPXOR tests 2021-05-14 20:22:58 +03:00
Roman Lebedev
7e11b78748 [NFC][X86][MCA] AMD Zen 3: add same-reg SSE XMM PXOR tests 2021-05-14 20:22:58 +03:00
Benjamin Kramer
62a029fa79 Bump googletest to 1.10.0 2021-05-14 19:16:31 +02:00
Philip Reames
658d86d1c9 Revert "Do actual DCE in LoopUnroll"
This reverts commit 9d1a61e695eb01298e26c76867d65592f1e1968c.

I'd missed some review feedback, and had missed updating an aarch64 test.  Reverting while I fix both.
2021-05-14 10:15:30 -07:00
Philip Reames
8371b39ae3 Do actual DCE in LoopUnroll
LoopUnroll does a limited DCE pass after unrolling, but if you have a chain of dead instructions, it only deletes the last one. Improve the code to recursively delete all trivially dead instructions.

Differential Revision: https://reviews.llvm.org/D102511
2021-05-14 10:05:25 -07:00
Philip Reames
143d78fe72 Autogen a test for ease of update 2021-05-14 09:33:17 -07:00
Florian Hahn
0a57e484b4 [LV] Add a few more complex first-order recurrence tests. 2021-05-14 17:27:17 +01:00
Bradley Smith
626d5e84ba [AArch64][SVE] Combine cntp intrinsics with add/sub to produce incp/decp
Depends on D101062

Differential Revision: https://reviews.llvm.org/D102077
2021-05-14 17:16:06 +01:00
Simon Pilgrim
95a1c7f9d0 [X86][SSE] Pull out combineToHorizontalAddSub helper from inside (F)ADD/SUB combines. NFCI.
The intention is to be able to run this from additional locations (such as shuffle combining) in the future.
2021-05-14 16:52:55 +01:00