1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 03:53:04 +02:00
Commit Graph

24 Commits

Author SHA1 Message Date
Andrea Di Biagio
f4b2508e93 [llvm-mca] A write latency cannot be a negative value. NFC
llvm-svn: 336437
2018-07-06 13:46:10 +00:00
Andrea Di Biagio
46e908a592 [llvm-mca] improve the instruction issue logic implemented by the Scheduler.
This patch modifies the Scheduler heuristic used to select the next instruction
to issue to the pipelines.

The motivating example is test X86/BtVer2/add-sequence.s, for which llvm-mca
wrongly reported an estimated IPC of 1.50. According to perf, the actual IPC for
that test should have been ~2.00.
It turns out that an IPC of 2.00 for test add-sequence.s cannot possibly be
predicted by a Scheduler that only prioritizes instructions based on their
"age". A similar issue also affected test X86/BtVer2/dependent-pmuld-paddd.s,
for which llvm-mca wrongly estimated an IPC of 0.84 instead of an IPC of 1.00.

Instructions in the ReadyQueue are now ranked based on two factors:
 - The "age" of an instruction.
 - The number of unique users of writes associated with an instruction.

The new logic still prioritizes older instructions over younger instructions to
minimize the pressure on the reorder buffer. However, the number of users of an
instruction now also affects the overall rank. This potentially increases the
ability of the Scheduler to extract instruction level parallelism.  This patch
fixes the problem with the wrong IPC reported for test add-sequence.s and test
dependent-pmuld-paddd.s.

llvm-svn: 336420
2018-07-06 08:08:30 +00:00
Andrea Di Biagio
a5b30c5c97 [llvm-mca] Remove field HasReadAdvanceEntries from class ReadDescriptor.
This simplifies the logic that updates RAW dependencies in the DispatchStage.
There is no advantage in storing that flag in the ReadDescriptor; we should
simply rely on the call to `STI.getReadAdvanceCycles()` to obtain the
ReadAdvance cycles. If there are no read-advance entries, then method
`getReadAdvanceCycles()` quickly returns 0.

No functional change intended.

llvm-svn: 335977
2018-06-29 14:24:46 +00:00
Andrea Di Biagio
e6cf14997b [llvm-mca] Use a WriteRef to describe register writes in class RegisterFile.
This patch introduces a new class named WriteRef. A WriteRef is used by the
RegisterFile to keep track of register definitions. Internally it wraps a
WriteState, as well as the source index of the defining instruction.

This patch allows the tool to propagate additional information to support future
analysis on data dependencies.

llvm-svn: 335867
2018-06-28 15:50:26 +00:00
Andrea Di Biagio
bb10807000 [llvm-mca] Avoid calling method update() on instructions that are already in the IS_READY state. NFCI
When promoting instructions from the wait queue to the ready queue, we should
check if an instruction has already reached the IS_READY state before
calling method update().

llvm-svn: 335722
2018-06-27 11:17:07 +00:00
Andrea Di Biagio
a58b32e07f [llvm-mca] Remove unused header files and correctly guard some include headers under NDEBUG. NFC
llvm-svn: 335589
2018-06-26 10:44:12 +00:00
Matt Davis
12a71b7b2e [llvm-mca] Rename Backend to Pipeline. NFC.
Summary:
This change renames the Backend and BackendPrinter to Pipeline and PipelinePrinter respectively. 
Variables and comments have also been updated to reflect this change.

The reason for this rename, is to be slightly more correct about what MCA is modeling.  MCA models a Pipeline, which implies some logical sequence of stages. 

Reviewers: andreadb, courbet, RKSimon

Reviewed By: andreadb, courbet

Subscribers: mgorny, javed.absar, tschuett, gbedwell, llvm-commits

Differential Revision: https://reviews.llvm.org/D48496

llvm-svn: 335496
2018-06-25 16:53:00 +00:00
Andrea Di Biagio
6a1ca757a7 [llvm-mca] Set the operand ID for implicit register reads/writes. NFC
Also, move the definition of InstRef at the end of Instruction.h to avoid a
forward declaration.

llvm-svn: 335363
2018-06-22 16:37:05 +00:00
Andrea Di Biagio
4893e095df [llvm-mca][X86] Teach how to identify register writes that implicitly clear the upper portion of a super-register.
This patch teaches llvm-mca how to identify register writes that implicitly zero
the upper portion of a super-register.

On X86-64, a general purpose register is implemented in hardware as a 64-bit
register. Quoting the Intel 64 Software Developer's Manual: "an update to the
lower 32 bits of a 64 bit integer register is architecturally defined to zero
extend the upper 32 bits".  Also, a write to an XMM register performed by an AVX
instruction implicitly zeroes the upper 128 bits of the aliasing YMM register.

This patch adds a new method named clearsSuperRegisters to the MCInstrAnalysis
interface to help identify instructions that implicitly clear the upper portion
of a super-register.  The rest of the patch teaches llvm-mca how to use that new
method to obtain the information, and update the register dependencies
accordingly.

I compared the kernels from tests clear-super-register-1.s and
clear-super-register-2.s against the output from perf on btver2.  Previously
there was a large discrepancy between the estimated IPC and the measured IPC.
Now the differences are mostly in the noise.

Differential Revision: https://reviews.llvm.org/D48225

llvm-svn: 335113
2018-06-20 10:08:11 +00:00
Andrea Di Biagio
0fae23f1ce [llvm-mca] Correctly update the CyclesLeft of a register read in the presence of partial register updates.
This patch fixe the logic in ReadState::cycleEvent(). That method was not
correctly updating field `TotalCycles`.

Added extra code comments in class ReadState to better describe each field.

llvm-svn: 334028
2018-06-05 17:12:02 +00:00
Matt Davis
efffaf5cb4 [llvm-mca] Introduce a pipeline Stage class and FetchStage.
Summary:
    This is just an idea, really two ideas.  I expect some push-back,
    but I realize that posting a diff is the most comprehensive way to express
    these concepts.

    This patch introduces a Stage class which represents the
    various stages of an instruction pipeline.  As a start, I have created a simple
    FetchStage that is based on existing logic for how MCA produces
    instructions, but now encapsulated in a Stage.  The idea should become more concrete
    once we introduce additional stages.  The idea being, that when a stage completes,
    the next stage in the pipeline will be executed.  Stages are chained together
    as a singly linked list to closely model a real pipeline. For now there is only one stage,
    so the stage-to-stage flow of instructions isn't immediately obvious.

    Eventually, Stage will also handle event notifications, but that functionality
    is not complete, and not destined for this patch.  Ideally, an interested party 
    can register for notifications from a particular stage.  Callbacks will be issued to
    these listeners at various points in the execution of the stage.  
    For now, eventing functionality remains similar to what it has been in mca::Backend. 
    We will be building-up the Stage class as we move on, such as adding debug output.

    This patch also removes the unique_ptr<Instruction> return value from
    InstrBuilder::createInstruction.  An Instruction pointer is still produced,
    but now it's up to the caller to decide how that item should be managed post-allocation
    (e.g., smart pointer).  This allows the Fetch stage to create instructions and
    manage the lifetime of those instructions as it wishes, and not have to be bound to any
    specific managed pointer type.  Other callers of createInstruction might have different 
    requirements, and thus can manage the pointer to fit their needs.  Another idea would be to push the
   ownership to the RCU. 

    Currently, the FetchStage will wrap the Instruction
    pointer in a shared_ptr.  This allows us to remove the Instruction container in
    Backend, which was probably going to disappear, or move, at some point anyways.
    Note that I did run these changes through valgrind, to make sure we are not leaking
    memory.  While the shared_ptr comes with some additional overhead it relieves us
    from having to manage a list of generated instructions, and/or make lookup calls
    to remove the instructions. 

    I realize that both the Stage class and the Instruction pointer management
    (mentioned directly above) are separate but related ideas, and probably should
    land as separate patches; I am happy to do that if either idea is decent.
    The main reason these two ideas are together is that
    Stage::execute() can mutate an InstRef. For the fetch stage, the InstRef is populated
    as the primary action of that stage (execute()).  I didn't want to change the Stage interface
    to support the idea of generating an instruction.  Ideally, instructions are to
    be pushed through the pipeline.  I didn't want to draw too much of a
    specialization just for the fetch stage.  Excuse the word-salad.

Reviewers: andreadb, courbet, RKSimon

Reviewed By: andreadb

Subscribers: llvm-commits, mgorny, javed.absar, tschuett, gbedwell

Differential Revision: https://reviews.llvm.org/D46741

llvm-svn: 332390
2018-05-15 20:21:04 +00:00
Andrea Di Biagio
c6b21b2d69 [llvm-mca] Improved support for dependency-breaking instructions.
The tool assumes that a zero-latency instruction that doesn't consume hardware
resources is an optimizable dependency-breaking instruction. That means, it
doesn't have to wait on register input operands, and it doesn't consume any
physical register. The PRF knows how to optimize it at register renaming stage.

llvm-svn: 332249
2018-05-14 15:08:22 +00:00
Matt Davis
3e43e69679 [llvm-mca] Avoid exposing index values in the MCA interfaces.
Summary:
This patch eliminates many places where we originally needed to  pass index
values to represent an instruction.  The index is still used as a key, in various parts of 
MCA.  I'm  not comfortable eliminating the index just yet.    By burying the index in
the instruction, we can avoid exposing that value in many places.

Eventually, we should consider removing the Instructions list in the Backend 
all together,   it's only used to hold and reclaim the memory for the allocated 
Instruction instances.  Instead we could pass around a smart pointer.  But that's
a separate discussion/patch.

Reviewers: andreadb, courbet, RKSimon

Reviewed By: andreadb

Subscribers: javed.absar, tschuett, gbedwell, llvm-commits

Differential Revision: https://reviews.llvm.org/D46367

llvm-svn: 331660
2018-05-07 18:29:15 +00:00
Adrian Prantl
076a6683eb Remove \brief commands from doxygen comments.
We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers into our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.

Patch produced by

  for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done

Differential Revision: https://reviews.llvm.org/D46290

llvm-svn: 331272
2018-05-01 15:54:18 +00:00
Andrea Di Biagio
f16ca8a8c6 [llvm-mca] Remove method Instruction::isZeroLatency(). NFCI
llvm-svn: 330807
2018-04-25 09:38:58 +00:00
Andrea Di Biagio
b5a92f83f3 [llvm-mca] Do not assume that implicit reads cannot be associated with ReadAdvance entries.
Before, the instruction builder incorrectly assumed that only explicit reads
could have been associated with ReadAdvance entries.
This patch fixes the issue and adds a test to verify it.

llvm-svn: 328972
2018-04-02 13:46:49 +00:00
Andrea Di Biagio
46429367d5 [llvm-mca] Correctly set the ReadAdvance information for register use operands.
The tool was passing the wrong operand index to method
MCSubtargetInfo::getReadAdvanceCycles(). That method requires a "UseIdx", and
not the operand index. This was found when testing X86 code where instructions
had a memory folded operand.

This patch fixes the issue and adds test read-advance-1.s to ensure that
the ReadAfterLd (a ReadAdvance of 3cy) information is correctly used.

llvm-svn: 328790
2018-03-29 14:26:56 +00:00
Andrea Di Biagio
2b25cb95ef [llvm-mca] run clang-format on all files.
This also addresses Simon's review comment in D44839.

llvm-svn: 328428
2018-03-24 16:05:36 +00:00
Andrea Di Biagio
739c509f1d [llvm-mca] Minor refactoring. NFCI
Also, removed a couple of unused methods from class Instruction.

llvm-svn: 328198
2018-03-22 14:14:49 +00:00
Andrea Di Biagio
660fc0f64f [llvm-mca] Simplify (and better standardize) the Instruction interface.
llvm-svn: 328190
2018-03-22 11:39:34 +00:00
Andrea Di Biagio
92f9994efa [llvm-mca] Simplify code. NFC
llvm-svn: 328187
2018-03-22 10:19:20 +00:00
Andrea Di Biagio
3cce5c6f4b [llvm-mca] Use llvm::make_unique in a few places. NFC
Also, clang-format a couple of DEBUG functions.

llvm-svn: 327978
2018-03-20 12:58:34 +00:00
Andrea Di Biagio
29d48bb255 [llvm-mca] Move the logic that updates the register files from InstrBuilder to DispatchUnit. NFCI
Before this patch, the register file was always updated at instruction creation
time. That means, new read-after-write dependencies, and new temporary registers
were allocated at instruction creation time.

This patch refactors the code in InstrBuilder, and move all the logic that
updates the register file into the dispatch unit. We only want to update the
register file when instructions are effectively dispatched (not before).

This refactoring also helps removing a bad dependency between the InstrBuilder
and the DispatchUnit.

No functional change intended.

llvm-svn: 327514
2018-03-14 14:57:23 +00:00
Andrea Di Biagio
45f0e5261e [llvm-mca] LLVM Machine Code Analyzer.
llvm-mca is an LLVM based performance analysis tool that can be used to
statically measure the performance of code, and to help triage potential
problems with target scheduling models.

llvm-mca uses information which is already available in LLVM (e.g. scheduling
models) to statically measure the performance of machine code in a specific cpu.
Performance is measured in terms of throughput as well as processor resource
consumption. The tool currently works for processors with an out-of-order
backend, for which there is a scheduling model available in LLVM.

The main goal of this tool is not just to predict the performance of the code
when run on the target, but also help with diagnosing potential performance
issues.

Given an assembly code sequence, llvm-mca estimates the IPC (instructions per
cycle), as well as hardware resources pressure. The analysis and reporting style
were mostly inspired by the IACA tool from Intel.

This patch is related to the RFC on llvm-dev visible at this link:
http://lists.llvm.org/pipermail/llvm-dev/2018-March/121490.html

Differential Revision: https://reviews.llvm.org/D43951

llvm-svn: 326998
2018-03-08 13:05:02 +00:00