mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 11:02:59 +02:00
215762 Commits

Author SHA1 Message Date
Roman Lebedev
19263a16d8 [X86] AMD Zen 3: same-reg AVX YMM VPSUBS{B,W} is a 1-cycle(!) dep-breaking zero-idiom
Not really mentioned in ref docs, but measures as such.
Yes, this one is also not zero-cycle.
2021-05-14 20:23:02 +03:00
Roman Lebedev
139c28fa45 [X86] AMD Zen 3: same-reg AVX XMM VPSUBS{B,W} is a 1-cycle(!) dep-breaking zero-idiom
Not really mentioned in ref docs, but measures as such.
Yes, this one is also not zero-cycle.
2021-05-14 20:23:02 +03:00
Roman Lebedev
fe32a28378 [X86] AMD Zen 3: same-reg SSE XMM PSUBS{B,W} is a 1-cycle(!) dep-breaking zero-idiom
Not really mentioned in ref docs, but measures as such.
2021-05-14 20:23:02 +03:00
Roman Lebedev
9a5a00fd92 [NFC][X86][MCA] AMD Zen 3: add same-reg AVX YMM VPSUBS{B,W} tests 2021-05-14 20:23:01 +03:00
Roman Lebedev
c27c048587 [NFC][X86][MCA] AMD Zen 3: add same-reg AVX XMM VPSUBS{B,W} tests 2021-05-14 20:23:01 +03:00
Roman Lebedev
10c74938fb [NFC][X86][MCA] AMD Zen 3: add same-reg SSE XMM PSUBS{B,W} tests 2021-05-14 20:23:01 +03:00
Roman Lebedev
0d11d5063f [X86] AMD Zen 3: same-reg AVX YMM VPSUB{B,W,D,Q} is a zero-cycle(!) dep-breaking zero-idiom
As confirmed by the exegesis measurements, and ref docs.
2021-05-14 20:23:01 +03:00
Roman Lebedev
4181fdd9ec [X86] AMD Zen 3: same-reg AVX XMM VPSUB{B,W,D,Q} is a zero-cycle(!) dep-breaking zero-idiom
As confirmed by the exegesis measurements, and ref docs.
2021-05-14 20:23:01 +03:00
Roman Lebedev
5b956e1952 [X86] AMD Zen 3: same-reg SSE XMM PSUB{B,W,D,Q} is a 1-cycle(!) dep-breaking zero-idiom
As confirmed by the exegesis measurements, and ref docs.
2021-05-14 20:23:00 +03:00
Roman Lebedev
554780708a [NFC][X86][MCA] AMD Zen 3: add same-reg AVX YMM VPSUB{B,W,D,Q} tests 2021-05-14 20:23:00 +03:00
Roman Lebedev
7a8c45f4d0 [NFC][X86][MCA] AMD Zen 3: add same-reg AVX XMM VPSUB{B,W,D,Q} tests 2021-05-14 20:23:00 +03:00
Roman Lebedev
530148d646 [NFC][X86][MCA] AMD Zen 3: add same-reg SSE XMM PSUB{B,W,D,Q} tests 2021-05-14 20:23:00 +03:00
Roman Lebedev
522d03976e [X86] AMD Zen 3: same-reg AVX YMM VPANDN is a zero-cycle(!) dep-breaking zero-idiom
As confirmed by exegesis measurements, and ref docs.
2021-05-14 20:23:00 +03:00
Roman Lebedev
d334fd8763 [X86] AMD Zen 3: same-reg AVX XMM VPANDN is a zero-cycle(!) dep-breaking zero-idiom
As confirmed by exegesis measurements, and ref docs.
2021-05-14 20:23:00 +03:00
Roman Lebedev
747aa83d9d [X86] AMD Zen 3: same-reg SSE XMM PANDN is a 1-cycle(!) dep-breaking zero-idiom
As confirmed by the exegesis measurements, and ref docs.
2021-05-14 20:22:59 +03:00
Roman Lebedev
f96afc073b [NFC][X86][MCA] AMD Zen 3: add same-reg AVX YMM VPANDN tests 2021-05-14 20:22:59 +03:00
Roman Lebedev
7cdc1e03f2 [NFC][X86][MCA] AMD Zen 3: add same-reg AVX XMM VPANDN tests 2021-05-14 20:22:59 +03:00
Roman Lebedev
106fd4d50d [NFC][X86][MCA] AMD Zen 3: add same-reg SSE XMM PANDN tests 2021-05-14 20:22:59 +03:00
Roman Lebedev
1fc967929a [X86] AMD Zen 3: same-reg AVX YMM VPXOR is a zero-cycle(!) dep-breaking zero-idiom
As confirmed by exegesis measurements, and ref docs.
2021-05-14 20:22:59 +03:00
Roman Lebedev
3b69b7222f [X86] AMD Zen 3: same-reg AVX XMM VPXOR is a zero-cycle(!) dep-breaking zero-idiom
As confirmed by exegesis measurements, and ref docs.
2021-05-14 20:22:58 +03:00
Roman Lebedev
722c0e895f [X86] AMD Zen 3: same-reg SSE XMM PXOR is a 1-cycle(!) dep-breaking zero-idiom
As confirmed by the exegesis measurements, and ref docs.
2021-05-14 20:22:58 +03:00
Roman Lebedev
2d84799d26 [NFC][X86][MCA] AMD Zen 3: add same-reg AVX YMM VPXOR tests 2021-05-14 20:22:58 +03:00
Roman Lebedev
42e170ffb7 [NFC][X86][MCA] AMD Zen 3: add same-reg AVX XMM VPXOR tests 2021-05-14 20:22:58 +03:00
Roman Lebedev
7e11b78748 [NFC][X86][MCA] AMD Zen 3: add same-reg SSE XMM PXOR tests 2021-05-14 20:22:58 +03:00
Benjamin Kramer
62a029fa79 Bump googletest to 1.10.0 2021-05-14 19:16:31 +02:00
Philip Reames
658d86d1c9 Revert "Do actual DCE in LoopUnroll"
This reverts commit 9d1a61e695eb01298e26c76867d65592f1e1968c.

I'd missed some review feedback, and had missed updating an aarch64 test.  Reverting while I fix both.
2021-05-14 10:15:30 -07:00
Philip Reames
8371b39ae3 Do actual DCE in LoopUnroll
LoopUnroll does a limited DCE pass after unrolling, but if you have a chain of dead instructions, it only deletes the last one. Improve the code to recursively delete all trivially dead instructions.

Differential Revision: https://reviews.llvm.org/D102511
2021-05-14 10:05:25 -07:00
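The recursive deletion described in the commit message above can be illustrated with a worklist over a toy IR. This is a minimal sketch and not LLVM's code: the dictionary-based instruction format and the name `delete_trivially_dead` are hypothetical, chosen only for illustration. An instruction counts as trivially dead here when it has no remaining users and no side effects, and deleting it may make its operands dead in turn.

```python
from collections import deque

def delete_trivially_dead(instrs, seeds):
    """Delete dead instructions reachable from `seeds`, following operand chains.

    instrs: dict mapping an instruction name to
            {"ops": [operand names], "side_effects": bool}; mutated in place.
    seeds:  names of instructions to start checking from.
    Returns the names of deleted instructions in deletion order.
    (Hypothetical toy IR, not LLVM's data structures.)
    """
    # Count how many operand slots refer to each instruction.
    uses = {name: 0 for name in instrs}
    for inst in instrs.values():
        for op in inst["ops"]:
            if op in uses:
                uses[op] += 1

    worklist = deque(seeds)
    removed = []
    while worklist:
        name = worklist.popleft()
        inst = instrs.get(name)
        # Skip instructions already deleted, still used, or with side effects.
        if inst is None or inst["side_effects"] or uses[name] != 0:
            continue
        del instrs[name]
        removed.append(name)
        # Deleting this instruction may leave its operands without users.
        for op in inst["ops"]:
            if op in uses:
                uses[op] -= 1
                if uses[op] == 0:
                    worklist.append(op)
    return removed

# A dead chain a <- b <- c next to a live store that keeps e alive.
example = {
    "a": {"ops": ["x"], "side_effects": False},
    "b": {"ops": ["a"], "side_effects": False},
    "c": {"ops": ["b"], "side_effects": False},
    "e": {"ops": ["x"], "side_effects": False},
    "store": {"ops": ["e"], "side_effects": True},
}
deleted = delete_trivially_dead(example, ["c"])
```

Seeding with only the last instruction of the chain removes the whole chain, whereas a single non-recursive check would remove only `c`.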
Philip Reames
143d78fe72 Autogen a test for ease of update 2021-05-14 09:33:17 -07:00
Florian Hahn
0a57e484b4 [LV] Add a few more complex first-order recurrence tests. 2021-05-14 17:27:17 +01:00
Bradley Smith
626d5e84ba [AArch64][SVE] Combine cntp intrinsics with add/sub to produce incp/decp
Depends on D101062

Differential Revision: https://reviews.llvm.org/D102077
2021-05-14 17:16:06 +01:00
Simon Pilgrim
95a1c7f9d0 [X86][SSE] Pull out combineToHorizontalAddSub helper from inside (F)ADD/SUB combines. NFCI.
The intention is to be able to run this from additional locations (such as shuffle combining) in the future.
2021-05-14 16:52:55 +01:00
Benjamin Kramer
5b498c5293 Bump googletest to 1.8.1
We've accumulated a scary amount of local patches to this directory. I
tried to merge them all, but if your favorite change is missing please
reapply it manually (and send it upstream).
2021-05-14 17:20:06 +02:00
Bradley Smith
75a6360f5c [AArch64][SVE] Add unpredicated vector BIC ISD node
Addition of this node allows us to better utilize the different forms of
the SVE BIC instructions, including using the alias to an AND (immediate).

Differential Revision: https://reviews.llvm.org/D101831
2021-05-14 16:12:13 +01:00
Philip Reames
c7e075736a [rs4gc] Strip memory related attributes consistently
I noticed that rs4gc is not stripping a number of memory aliasing related attributes. We do strip some from call sites, but don't strip the same ones from declarations or parameters.

Why do we need to strip these? Two answers:

    1. Safepoints conceptually read and write to the entire garbage collected heap in the physical model. We need this to preserve ordering of all loads and stores with respect to possible relocation.
    2. We can infer other attributes from these. For instance, readnone can imply both nofree and nosync, neither of which holds after physical rewriting.

Note: This exposed a latent issue which was fixed a couple weeks back in 01801d5274.

Differential Revision: https://reviews.llvm.org/D99802
2021-05-14 07:54:56 -07:00
David Green
fe87644b94 [ARM] Expand predecessor search to multiple blocks when reverting WhileLoopStarts
We were previously only searching a single preheader for call
instructions when reverting WhileLoopStarts to DoLoopStarts. This
extends that to multiple blocks that can come up when, for example, a
loop is expanded from a memcpy. It also expands the instructions from
just Call's to also include other LoopStarts, to catch other low
overhead loops in the preheader.

Differential Revision: https://reviews.llvm.org/D102269
2021-05-14 15:08:14 +01:00
David Green
e4fe455507 [ARM] Define CPSR on MEMCPY pseudos
These pseudos are converted post-isel into t2WhileLoopStart and
t2LoopEnd/LoopDec instructions, which themselves are defined to clobber
CPSR. Doing the same with the MEMCPY nodes will make sure they are
scheduled correctly to not end up with incorrect uses.
2021-05-14 15:06:59 +01:00
Hsiangkai Wang
2f0f65c71c [RISCV] Add the DebugLoc parameter to getVLENFactoredAmount().
The MachineBasicBlock::iterator is continuously changing during
generating the frame handling instructions. We should use the DebugLoc
from the caller, instead of getting it from the changing iterator.

If the prologue instructions are located in a basic block without any other
instructions after these prologue instructions, the iterator will be
updated to the boundary of the basic block and it is invalid to use the
iterator to access DebugLoc. This patch also fixes the crash when
accessing DebugLoc using the iterator.

Differential Revision: https://reviews.llvm.org/D102386
2021-05-14 21:31:06 +08:00
Dmitry Preobrazhensky
cb8494eadb [AMDGPU][MC][NFC][DOC] Updated AMD GPU assembler syntax description.
Summary of changes:
- added description of GFX90A;
- minor bugfixing and improvements.
2021-05-14 16:13:30 +03:00
Sanjay Patel
1a03ebd2a9 [SDAG] reduce code duplication for extend_vec_inreg combines; NFC
These are identical so far, and I was looking at adding a fold
for a pattern with scalar_to_vector which would also end up duplicated.
2021-05-14 08:29:57 -04:00
Djordje Todorovic
742e2e5ef7 [Transforms][Debugify] Fix "Missing line" false alarm on PHI nodes
This is a fix for https://bugs.llvm.org/show_bug.cgi?id=49959

The "Missing line" false alarm was introduced in D75242.

Patch by Yilong Guo <yilong.guo@intel.com>

Differential Revision: https://reviews.llvm.org/D100446
2021-05-14 14:06:13 +02:00
Jay Foad
b171d1e0c0 [TableGen] Remove unneeded forward defs. NFC. 2021-05-14 12:36:20 +01:00
Roman Lebedev
d6092a32f1 [X86] AMD Zen 3: same-reg AVX YMM VANDNPD is a zero-cycle(!) dep-breaking zero-idiom
As confirmed by exegesis measurements, and ref docs.
2021-05-14 14:06:24 +03:00
Roman Lebedev
d568fd158a [X86] AMD Zen 3: same-reg AVX XMM VANDNPD is a zero-cycle(!) dep-breaking zero-idiom
As confirmed by exegesis measurements, and ref docs.
2021-05-14 14:06:24 +03:00
Roman Lebedev
6b8db228ae [X86] AMD Zen 3: same-reg SSE XMM ANDNPD is a 1-cycle(!) dep-breaking zero-idiom
As confirmed by exegesis measurements, and ref docs.
2021-05-14 14:06:24 +03:00
Roman Lebedev
26c4e61f3c [NFC][X86][MCA] AMD Zen 3: add same-reg AVX YMM VANDNPD tests 2021-05-14 14:06:24 +03:00
Roman Lebedev
874a23ed56 [NFC][X86][MCA] AMD Zen 3: add same-reg AVX XMM VANDNPD tests 2021-05-14 14:06:24 +03:00
Roman Lebedev
fcc1b61e41 [NFC][X86][MCA] AMD Zen 3: add same-reg SSE XMM ANDNPD tests 2021-05-14 14:06:24 +03:00
Roman Lebedev
c43be7e3ef [X86] AMD Zen 3: same-reg AVX YMM VANDNPS is a zero-cycle(!) dep-breaking zero-idiom
As confirmed by exegesis measurements, and ref docs.
2021-05-14 14:06:24 +03:00
Roman Lebedev
c3c0fbe384 [X86] AMD Zen 3: same-reg AVX XMM VANDNPS is a zero-cycle(!) dep-breaking zero-idiom
As confirmed by exegesis measurements, and ref docs.
2021-05-14 14:06:23 +03:00
Roman Lebedev
0764782fd5 [X86] AMD Zen 3: same-reg SSE XMM ANDNPS is a 1-cycle(!) dep-breaking zero-idiom
Same as SSE XMM XORPS/XORPD, it is not zero-cycle, even though it breaks the deps.
As confirmed by the exegesis measurements, and ref docs.
2021-05-14 14:06:23 +03:00
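The distinction these commit messages draw between "dep-breaking" and "zero-cycle" can be seen in a toy timing model. This is only a sketch under stated assumptions: `ready_times` and the tuple format are hypothetical, the 10-cycle producer latency is made up, and the model ignores ports, throughput, and everything else a real scheduling model or llvm-mca accounts for. A dependency-breaking idiom does not wait for its source registers, while its own latency (0 or 1 cycle) still determines when the zeroed result is available.

```python
def ready_times(program):
    """Return the cycle at which each instruction's result becomes available.

    program: list of (dest, srcs, latency, dep_breaking) tuples.
    (Hypothetical toy model; not how llvm-mca computes timing.)
    """
    ready = {}  # register -> cycle at which its latest value is available
    out = []
    for dest, srcs, latency, dep_breaking in program:
        if dep_breaking:
            # A dependency-breaking idiom ignores its source registers.
            start = 0
        else:
            start = max((ready.get(r, 0) for r in srcs), default=0)
        ready[dest] = start + latency
        out.append(ready[dest])
    return out

# A slow producer writes xmm0 (assumed latency of 10), then xmm0 is zeroed
# by a same-register operation on xmm0.
slow = ("xmm0", [], 10, False)
one_cycle_idiom = ready_times([slow, ("xmm0", ["xmm0"], 1, True)])
zero_cycle_idiom = ready_times([slow, ("xmm0", ["xmm0"], 0, True)])
no_idiom = ready_times([slow, ("xmm0", ["xmm0"], 1, False)])
```

In this model the 1-cycle idiom has its result ready at cycle 1 rather than cycle 11, so the dependency on the slow producer is gone, but the result is still not free the way the zero-cycle forms are.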