Simon Pilgrim
a6e815c9bd
[X86][Atom] Fix vector integer multiplication resource/throughputs
...
Match what's documented in the Intel AOM (Agner and instlatx64 agree): vector integer multiplies are pipelined, all on Port0, with throughput = 2 @ 128 bits, 1 @ 64 bits.
Noticed while checking reduction costs. Now that we can use in-order models in llvm-mca, the Atom model is the "worst case scenario" we have in x86.
2021-05-15 14:25:48 +01:00
Roman Lebedev
7a6506cfad
[NFC][X86][MCA] Add pseudo-zero-idiom vperm2f128/vperm2i128 tests - don't break deps
...
While the btver2 model states that this pattern is a zero-cycle zero-idiom
on Jaguar, that does not appear to be the case on Znver3:
there it measures as not being recognized as a dep-breaking zero-idiom,
let alone a zero-cycle one.
2021-05-14 20:23:05 +03:00
Roman Lebedev
a8537d3144
[X86] AMD Zen 3: same-reg AVX YMM VPCMPGT{B,W,D,Q} is a zero-cycle(!) dep-breaking zero-idiom
...
As measured by exegesis, and confirmed by ref docs.
2021-05-14 20:23:05 +03:00
Roman Lebedev
d28d04de75
[X86] AMD Zen 3: same-reg AVX XMM VPCMPGT{B,W,D,Q} is a zero-cycle(!) dep-breaking zero-idiom
...
As measured by exegesis, and confirmed by ref docs.
2021-05-14 20:23:04 +03:00
Roman Lebedev
2ff7114732
[X86] AMD Zen 3: same-reg SSE XMM PCMPGT{B,W,D,Q} is a 1-cycle(!) dep-breaking zero-idiom
...
As measured by exegesis, and confirmed by ref docs.
2021-05-14 20:23:04 +03:00
Roman Lebedev
ff2ff878b4
[NFC][X86][MCA] AMD Zen 3: add same-reg AVX YMM VPCMPGT{B,W,D,Q} tests
2021-05-14 20:23:04 +03:00
Roman Lebedev
317197c4a8
[NFC][X86][MCA] AMD Zen 3: add same-reg AVX XMM VPCMPGT{B,W,D,Q} tests
2021-05-14 20:23:04 +03:00
Roman Lebedev
069fb685b6
[NFC][X86][MCA] AMD Zen 3: add same-reg SSE XMM PCMPGT{B,W,D,Q} tests
2021-05-14 20:23:03 +03:00
Roman Lebedev
6e90e82fb2
[X86] AMD Zen 3: same-reg AVX YMM VPSUBUS{B,W} is a 1-cycle(!) dep-breaking zero-idiom
...
Not really mentioned in ref docs, but measures as such.
Yes, this one is also not zero-cycle.
2021-05-14 20:23:03 +03:00
Roman Lebedev
e641283c37
[X86] AMD Zen 3: same-reg AVX XMM VPSUBUS{B,W} is a 1-cycle(!) dep-breaking zero-idiom
...
Not really mentioned in ref docs, but measures as such.
Yes, this one is also not zero-cycle.
2021-05-14 20:23:03 +03:00
Roman Lebedev
7224d53fef
[X86] AMD Zen 3: same-reg SSE XMM PSUBUS{B,W} is a 1-cycle(!) dep-breaking zero-idiom
...
Not really mentioned in ref docs, but measures as such.
2021-05-14 20:23:03 +03:00
Roman Lebedev
103eef39fa
[NFC][X86][MCA] AMD Zen 3: add same-reg AVX YMM VPSUBUS{B,W} tests
2021-05-14 20:23:03 +03:00
Roman Lebedev
747b0319b1
[NFC][X86][MCA] AMD Zen 3: add same-reg AVX XMM VPSUBUS{B,W} tests
2021-05-14 20:23:02 +03:00
Roman Lebedev
d2246847e9
[NFC][X86][MCA] AMD Zen 3: add same-reg SSE XMM PSUBUS{B,W} tests
2021-05-14 20:23:02 +03:00
Roman Lebedev
19263a16d8
[X86] AMD Zen 3: same-reg AVX YMM VPSUBS{B,W} is a 1-cycle(!) dep-breaking zero-idiom
...
Not really mentioned in ref docs, but measures as such.
Yes, this one is also not zero-cycle.
2021-05-14 20:23:02 +03:00
Roman Lebedev
139c28fa45
[X86] AMD Zen 3: same-reg AVX XMM VPSUBS{B,W} is a 1-cycle(!) dep-breaking zero-idiom
...
Not really mentioned in ref docs, but measures as such.
Yes, this one is also not zero-cycle.
2021-05-14 20:23:02 +03:00
Roman Lebedev
fe32a28378
[X86] AMD Zen 3: same-reg SSE XMM PSUBS{B,W} is a 1-cycle(!) dep-breaking zero-idiom
...
Not really mentioned in ref docs, but measures as such.
2021-05-14 20:23:02 +03:00
Roman Lebedev
9a5a00fd92
[NFC][X86][MCA] AMD Zen 3: add same-reg AVX YMM VPSUBS{B,W} tests
2021-05-14 20:23:01 +03:00
Roman Lebedev
c27c048587
[NFC][X86][MCA] AMD Zen 3: add same-reg AVX XMM VPSUBS{B,W} tests
2021-05-14 20:23:01 +03:00
Roman Lebedev
10c74938fb
[NFC][X86][MCA] AMD Zen 3: add same-reg SSE XMM PSUBS{B,W} tests
2021-05-14 20:23:01 +03:00
Roman Lebedev
0d11d5063f
[X86] AMD Zen 3: same-reg AVX YMM VPSUB{B,W,D,Q} is a zero-cycle(!) dep-breaking zero-idiom
...
As confirmed by the exegesis measurements, and ref docs.
2021-05-14 20:23:01 +03:00
Roman Lebedev
4181fdd9ec
[X86] AMD Zen 3: same-reg AVX XMM VPSUB{B,W,D,Q} is a zero-cycle(!) dep-breaking zero-idiom
...
As confirmed by the exegesis measurements, and ref docs.
2021-05-14 20:23:01 +03:00
Roman Lebedev
5b956e1952
[X86] AMD Zen 3: same-reg SSE XMM PSUB{B,W,D,Q} is a 1-cycle(!) dep-breaking zero-idiom
...
As confirmed by the exegesis measurements, and ref docs.
2021-05-14 20:23:00 +03:00
Roman Lebedev
554780708a
[NFC][X86][MCA] AMD Zen 3: add same-reg AVX YMM VPSUB{B,W,D,Q} tests
2021-05-14 20:23:00 +03:00
Roman Lebedev
7a8c45f4d0
[NFC][X86][MCA] AMD Zen 3: add same-reg AVX XMM VPSUB{B,W,D,Q} tests
2021-05-14 20:23:00 +03:00
Roman Lebedev
530148d646
[NFC][X86][MCA] AMD Zen 3: add same-reg SSE XMM PSUB{B,W,D,Q} tests
2021-05-14 20:23:00 +03:00
Roman Lebedev
522d03976e
[X86] AMD Zen 3: same-reg AVX YMM VPANDN is a zero-cycle(!) dep-breaking zero-idiom
...
As confirmed by exegesis measurements, and ref docs.
2021-05-14 20:23:00 +03:00
Roman Lebedev
d334fd8763
[X86] AMD Zen 3: same-reg AVX XMM VPANDN is a zero-cycle(!) dep-breaking zero-idiom
...
As confirmed by exegesis measurements, and ref docs.
2021-05-14 20:23:00 +03:00
Roman Lebedev
747aa83d9d
[X86] AMD Zen 3: same-reg SSE XMM PANDN is a 1-cycle(!) dep-breaking zero-idiom
...
As confirmed by the exegesis measurements, and ref docs.
2021-05-14 20:22:59 +03:00
Roman Lebedev
f96afc073b
[NFC][X86][MCA] AMD Zen 3: add same-reg AVX YMM VPANDN tests
2021-05-14 20:22:59 +03:00
Roman Lebedev
7cdc1e03f2
[NFC][X86][MCA] AMD Zen 3: add same-reg AVX XMM VPANDN tests
2021-05-14 20:22:59 +03:00
Roman Lebedev
106fd4d50d
[NFC][X86][MCA] AMD Zen 3: add same-reg SSE XMM PANDN tests
2021-05-14 20:22:59 +03:00
Roman Lebedev
1fc967929a
[X86] AMD Zen 3: same-reg AVX YMM VPXOR is a zero-cycle(!) dep-breaking zero-idiom
...
As confirmed by exegesis measurements, and ref docs.
2021-05-14 20:22:59 +03:00
Roman Lebedev
3b69b7222f
[X86] AMD Zen 3: same-reg AVX XMM VPXOR is a zero-cycle(!) dep-breaking zero-idiom
...
As confirmed by exegesis measurements, and ref docs.
2021-05-14 20:22:58 +03:00
Roman Lebedev
722c0e895f
[X86] AMD Zen 3: same-reg SSE XMM PXOR is a 1-cycle(!) dep-breaking zero-idiom
...
As confirmed by the exegesis measurements, and ref docs.
2021-05-14 20:22:58 +03:00
Roman Lebedev
2d84799d26
[NFC][X86][MCA] AMD Zen 3: add same-reg AVX YMM VPXOR tests
2021-05-14 20:22:58 +03:00
Roman Lebedev
42e170ffb7
[NFC][X86][MCA] AMD Zen 3: add same-reg AVX XMM VPXOR tests
2021-05-14 20:22:58 +03:00
Roman Lebedev
7e11b78748
[NFC][X86][MCA] AMD Zen 3: add same-reg SSE XMM PXOR tests
2021-05-14 20:22:58 +03:00
Roman Lebedev
d6092a32f1
[X86] AMD Zen 3: same-reg AVX YMM VANDNPD is a zero-cycle(!) dep-breaking zero-idiom
...
As confirmed by exegesis measurements, and ref docs.
2021-05-14 14:06:24 +03:00
Roman Lebedev
d568fd158a
[X86] AMD Zen 3: same-reg AVX XMM VANDNPD is a zero-cycle(!) dep-breaking zero-idiom
...
As confirmed by exegesis measurements, and ref docs.
2021-05-14 14:06:24 +03:00
Roman Lebedev
6b8db228ae
[X86] AMD Zen 3: same-reg SSE XMM ANDNPD is a 1-cycle(!) dep-breaking zero-idiom
...
As confirmed by exegesis measurements, and ref docs.
2021-05-14 14:06:24 +03:00
Roman Lebedev
26c4e61f3c
[NFC][X86][MCA] AMD Zen 3: add same-reg AVX YMM VANDNPD tests
2021-05-14 14:06:24 +03:00
Roman Lebedev
874a23ed56
[NFC][X86][MCA] AMD Zen 3: add same-reg AVX XMM VANDNPD tests
2021-05-14 14:06:24 +03:00
Roman Lebedev
fcc1b61e41
[NFC][X86][MCA] AMD Zen 3: add same-reg SSE XMM ANDNPD tests
2021-05-14 14:06:24 +03:00
Roman Lebedev
c43be7e3ef
[X86] AMD Zen 3: same-reg AVX YMM VANDNPS is a zero-cycle(!) dep-breaking zero-idiom
...
As confirmed by exegesis measurements, and ref docs.
2021-05-14 14:06:24 +03:00
Roman Lebedev
c3c0fbe384
[X86] AMD Zen 3: same-reg AVX XMM VANDNPS is a zero-cycle(!) dep-breaking zero-idiom
...
As confirmed by exegesis measurements, and ref docs.
2021-05-14 14:06:23 +03:00
Roman Lebedev
0764782fd5
[X86] AMD Zen 3: same-reg SSE XMM ANDNPS is a 1-cycle(!) dep-breaking zero-idiom
...
Same as SSE XMM XORPS/XORPD, it is not zero-cycle, even though it breaks the deps.
As confirmed by the exegesis measurements, and ref docs.
2021-05-14 14:06:23 +03:00
Roman Lebedev
823fdcbc30
[NFC][X86][MCA] AMD Zen 3: add same-reg AVX YMM VANDNPS tests
2021-05-14 14:06:23 +03:00
Roman Lebedev
dd7ffd482a
[NFC][X86][MCA] AMD Zen 3: add same-reg AVX XMM VANDNPS tests
2021-05-14 14:06:23 +03:00
Roman Lebedev
e17513d36b
[NFC][X86][MCA] AMD Zen 3: add same-reg SSE XMM ANDNPS tests
2021-05-14 14:06:23 +03:00
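
Note on the same-reg forms named in the titles above: they are called zero-idioms because the result is zero whatever the register holds, which is why hardware can treat them as dep-breaking. The following is a minimal sketch of that arithmetic on a single 8-bit lane. The helper names (`pxor`, `pandn`, `psub`, `psubus`, `psubs`, `pcmpgt`) are hypothetical Python models written for illustration; they are not LLVM, llvm-mca, or llvm-exegesis APIs, and they say nothing about cycle counts on Zen 3.

```python
# Illustrative single-lane models of the packed operations; names are
# hypothetical and chosen only to mirror the mnemonics in the log.
BITS = 8
MASK = (1 << BITS) - 1
SMIN = -(1 << (BITS - 1))
SMAX = (1 << (BITS - 1)) - 1


def to_signed(x):
    """Interpret an unsigned lane value as two's-complement signed."""
    return x - (1 << BITS) if x & (1 << (BITS - 1)) else x


def pxor(a, b):
    return (a ^ b) & MASK


def pandn(a, b):
    # ANDN computes (NOT a) AND b.
    return (~a & b) & MASK


def psub(a, b):
    # Wrapping subtraction.
    return (a - b) & MASK


def psubus(a, b):
    # Unsigned saturating subtraction: clamps at 0.
    return max(a - b, 0)


def psubs(a, b):
    # Signed saturating subtraction: clamps to [SMIN, SMAX].
    r = to_signed(a) - to_signed(b)
    return max(SMIN, min(SMAX, r)) & MASK


def pcmpgt(a, b):
    # Signed greater-than: all-ones lane if true, else zero.
    return MASK if to_signed(a) > to_signed(b) else 0


SAME_REG_OPS = {
    "pxor": pxor,
    "pandn": pandn,
    "psub": psub,
    "psubus": psubus,
    "psubs": psubs,
    "pcmpgt": pcmpgt,
}

# For every possible lane value x, op(x, x) must be 0.
results = {
    name: all(op(x, x) == 0 for x in range(1 << BITS))
    for name, op in SAME_REG_OPS.items()
}

if __name__ == "__main__":
    for name, ok in results.items():
        print(f"{name}: zero for all same-input lanes = {ok}")
```

The check is exhaustive over one lane, which is enough because each of these operations acts on lanes independently (the bitwise ones act on bits independently).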