mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 20:12:56 +02:00
llvm-mirror/include/llvm/Target
Andrea Di Biagio c19db3b1d5 [llvm-mca][BtVer2] teach how to identify false dependencies on partially written
registers.

The goal of this patch is to improve the throughput analysis in llvm-mca for the
case where instructions perform partial register writes.

On x86, partial register writes are quite difficult to model, mainly because
different processors tend to implement different register merging schemes in
hardware.

When the code contains partial register writes, the IPC (instructions per
cycle) estimated by llvm-mca tends to diverge quite significantly from the
IPC observed with perf.

Modern AMD processors (at least, from Bulldozer onwards) don't rename partial
registers. Quoting Agner Fog's microarchitecture.pdf:
"The processor always keeps the different parts of an integer register together.
For example, AL and AH are not treated as independent by the out-of-order
execution mechanism. An instruction that writes to part of a register will
therefore have a false dependence on any previous write to the same register or
any part of it."
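
The behaviour quoted above can be illustrated with a small sketch. This is a toy
model, not llvm-mca code: the register tables and the false_dependencies helper
are hypothetical names introduced only for this example. It assumes that the
parts of a register are never renamed separately, and that only 8-bit and
16-bit writes count as partial.

```python
# Toy model (not llvm-mca code). Maps each register name to the full
# register it belongs to; names listed in PARTIAL update only some bits.
FULL_REGISTER = {
    "AL": "RAX", "AH": "RAX", "AX": "RAX", "EAX": "RAX", "RAX": "RAX",
    "BL": "RBX", "BH": "RBX", "BX": "RBX", "EBX": "RBX", "RBX": "RBX",
}
PARTIAL = {"AL", "AH", "AX", "BL", "BH", "BX"}  # 8-bit and 16-bit views


def false_dependencies(written_registers):
    """Return (consumer, producer) index pairs for partial writes.

    written_registers[i] is the destination register of instruction i.
    Assumption: the parts of a register are kept together, so a partial
    write must wait for the latest write to the same full register.
    """
    last_writer = {}  # full register -> index of its most recent writer
    dependencies = []
    for index, register in enumerate(written_registers):
        full = FULL_REGISTER[register]
        if register in PARTIAL and full in last_writer:
            dependencies.append((index, last_writer[full]))
        last_writer[full] = index
    return dependencies


# Destinations of: write EAX, write AL, write BL, write AH
print(false_dependencies(["EAX", "AL", "BL", "AH"]))  # [(1, 0), (3, 1)]
```

In this sketch the write to AL depends on the earlier write to EAX, and the
write to AH depends on the write to AL, even though neither instruction reads
the old value. The write to BL has no producer, so it is free to issue.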

This patch is a first important step towards improving the analysis of partial
register updates. It changes the semantics of RegisterFile descriptors in
tablegen, and teaches llvm-mca how to identify false dependences in the presence
of partial register writes (for more details, see the new code comments on class
RegisterFile in include/llvm/Target/TargetSchedule.td).

This patch doesn't address the case where a write to a part of a register is
followed by a read from the whole register. On Intel chips, high8 registers
(AH/BH/CH/DH) can be stored in separate physical registers. However, a later
(dirty) read of the full register (example: AX/EAX) triggers a merge uOp, which
adds extra latency (and potentially affects the pipe usage).
The following Stack Overflow question covers the subject in detail, with a very
informative answer from Peter Cordes:
https://stackoverflow.com/questions/45660139/how-exactly-do-partial-registers-on-haswell-skylake-perform-writing-al-seems-to
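
As a rough illustration of the dirty-read case described above, the following
sketch counts how many merge uOps a sequence of register accesses would
trigger. It is a toy model and not part of this patch: the tables and the
count_merge_uops helper are hypothetical, no real latencies are modelled, and
it assumes that a write to the wider register clears the dirty state.

```python
# Toy model (not llvm-mca code). A high8 register is assumed to live in its
# own physical register until the wider register is read or overwritten.
HIGH8_PARENT = {"AH": "RAX", "BH": "RBX", "CH": "RCX", "DH": "RDX"}
WIDE_PARENT = {
    "AX": "RAX", "EAX": "RAX", "RAX": "RAX",
    "BX": "RBX", "EBX": "RBX", "RBX": "RBX",
    "CX": "RCX", "ECX": "RCX", "RCX": "RCX",
    "DX": "RDX", "EDX": "RDX", "RDX": "RDX",
}


def count_merge_uops(operations):
    """Count merge uOps for a list of ("read" | "write", register) pairs."""
    dirty = set()  # full registers whose high8 part was written separately
    merges = 0
    for kind, register in operations:
        if kind == "write" and register in HIGH8_PARENT:
            dirty.add(HIGH8_PARENT[register])
        elif register in WIDE_PARENT:
            parent = WIDE_PARENT[register]
            if kind == "read" and parent in dirty:
                merges += 1  # dirty read: the parts must be merged first
            # Assumption: a merge or a wide write leaves the register clean.
            dirty.discard(parent)
    return merges


# Write AH, then read EAX twice: only the first read needs a merge.
print(count_merge_uops([("write", "AH"), ("read", "EAX"), ("read", "EAX")]))  # 1
```

A future RegisterFile extension would also need to attach a cost to each merge,
which this sketch deliberately leaves out.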

In the future, the definition of RegisterFile can be extended with extra
information that may be used to identify delays caused by merge uOps triggered
by a dirty read of a partial write.

Differential Revision: https://reviews.llvm.org/D49196

llvm-svn: 337123
2018-07-15 11:01:38 +00:00
..
GlobalISel [globalisel] Update GlobalISel emitter to match new representation of extending loads 2018-05-05 20:53:24 +00:00
CodeGenCWrappers.h Fix layering of MachineValueType.h by moving it from CodeGen to Support 2018-03-23 23:58:25 +00:00
GenericOpcodes.td [GISel]: Add G_ADDRSPACE_CAST Opcode 2018-06-22 20:58:51 +00:00
Target.td [cfi-verify] Support AArch64. 2018-07-13 15:19:33 +00:00
TargetCallingConv.td Swift Calling Convention: add swifterror attribute. 2016-04-01 21:41:15 +00:00
TargetInstrPredicate.td [RFC][Patch 1/3] Add a new class of predicates for variant scheduling classes. 2018-05-25 15:55:37 +00:00
TargetIntrinsicInfo.h GlobalISel: support translation of intrinsic calls. 2016-07-29 22:32:36 +00:00
TargetItinerary.td [NFC] Fix comment of class InstrStage 2018-02-12 15:02:49 +00:00
TargetLoweringObjectFile.h Remove \brief commands from doxygen comments. 2018-05-01 15:54:18 +00:00
TargetMachine.h [MachineOutliner] Add support for target-default outlining. 2018-06-30 03:56:03 +00:00
TargetOptions.h [MachineOutliner] Add support for target-default outlining. 2018-06-30 03:56:03 +00:00
TargetSchedule.td [llvm-mca][BtVer2] teach how to identify false dependencies on partially written 2018-07-15 11:01:38 +00:00
TargetSelectionDAG.td [TableGen] Support multi-alternative pattern fragments 2018-07-13 13:18:00 +00:00