mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 03:02:36 +01:00

[X86] AMD Zen 3: SSE XMM moves are zero-cycle

I've verified this with llvm-exegesis.
The move elimination is not limited to registers known to be zero.

Refs:
AMD SOG 19h, 2.9.4 Zero Cycle Move
The processor is able to execute certain register to register
mov operations with zero cycle delay.

Agner,
22.13 Instructions with no latency
Register-to-register move instructions are resolved at
the register rename stage without using any execution units.
These instructions have zero latency. It is possible to do six such
register renamings per clock cycle, and it is even possible to
rename the same register multiple times in one clock cycle.
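
As a rough illustration of the renaming behaviour both quotes describe, here is a toy model of a rename table in which a register-to-register move only copies a mapping. It is a sketch under simplifying assumptions, not a description of AMD's implementation: the names `RenameTable`, `write`, and `move` are hypothetical, and real hardware also tracks reference counts, rename limits per cycle, and freeing of physical registers.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class RenameTable:
    """Toy rename stage: architectural register name -> physical register id."""
    mapping: dict = field(default_factory=dict)
    _next_phys: count = field(default_factory=count)
    uops_issued: int = 0  # operations that needed an execution unit in this model

    def write(self, dst: str) -> None:
        """Ordinary producer: allocate a fresh physical register, count one uop."""
        self.mapping[dst] = next(self._next_phys)
        self.uops_issued += 1

    def move(self, dst: str, src: str) -> None:
        """Eliminated move: alias dst to src's physical register, no uop issued."""
        self.mapping[dst] = self.mapping[src]


rt = RenameTable()
rt.write("xmm0")         # some instruction produces a value in xmm0
rt.move("xmm1", "xmm0")  # modelled after a MOVAPSrr-style register copy
rt.move("xmm2", "xmm1")  # a chained copy is also resolved by aliasing
rt.write("xmm0")         # overwriting the source leaves the aliases intact
```

In this model the two copies add nothing to `uops_issued`, which is the property the scheduling model encodes by listing an opcode under `IsOptimizableRegisterMove`.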
Roman Lebedev 2021-05-07 16:15:43 +03:00
parent e720a8cc78
commit 8c8821fc73
2 changed files with 1246 additions and 1220 deletions


@@ -1464,7 +1464,7 @@ defm : Zn3WriteResYMM<WriteVecMoveY, [Zn3FPFMisc0123], 0, [1], 1>;
 def : IsOptimizableRegisterMove<[
 InstructionEquivalenceClass<[
 // GPR variants.
-MOV32rr, MOV64rr
+MOV32rr, MOV64rr,
 // FIXME: MOVSXD32rr, but it is only supported in disassembler.
 // FIXME: XCHG32rr/XCHG64rr after MCA is fixed
@@ -1472,7 +1472,9 @@ def : IsOptimizableRegisterMove<[
 // MMX moves are *NOT* eliminated.
 // SSE variants.
-// FIXME
+MOVAPSrr, MOVUPSrr,
+MOVAPDrr, MOVUPDrr,
+MOVDQArr, MOVDQUrr
 // AVX variants.
 // FIXME

File diff suppressed because it is too large.