1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 03:53:04 +02:00
Commit Graph

138 Commits

Author SHA1 Message Date
Matt Davis
12a71b7b2e [llvm-mca] Rename Backend to Pipeline. NFC.
Summary:
This change renames the Backend and BackendPrinter to Pipeline and PipelinePrinter respectively. 
Variables and comments have also been updated to reflect this change.

The reason for this rename, is to be slightly more correct about what MCA is modeling.  MCA models a Pipeline, which implies some logical sequence of stages. 

Reviewers: andreadb, courbet, RKSimon

Reviewed By: andreadb, courbet

Subscribers: mgorny, javed.absar, tschuett, gbedwell, llvm-commits

Differential Revision: https://reviews.llvm.org/D48496

llvm-svn: 335496
2018-06-25 16:53:00 +00:00
Matt Davis
e6c725e139 [llvm-mca] Remove unnecessary include and forward decl in RCU. NFC.
The DispatchUnit is no longer a dependency of RCU, so this patch removes a
stale include and forward decl.  This patch also cleans up some comments.

llvm-svn: 335392
2018-06-22 21:35:26 +00:00
Andrea Di Biagio
fa489a9fd0 [llvm-mca] Remove redundant call. NFC
llvm-svn: 335368
2018-06-22 17:03:40 +00:00
Andrea Di Biagio
6a1ca757a7 [llvm-mca] Set the operand ID for implicit register reads/writes. NFC
Also, move the definition of InstRef at the end of Instruction.h to avoid a
forward declaration.

llvm-svn: 335363
2018-06-22 16:37:05 +00:00
Matt Davis
85ee56f371 [llvm-mca] Introduce a sequential container of Stages
Summary:
Remove explicit stages and introduce a list of stages.

A pipeline should be composed of an arbitrary list of stages, and not any
 predefined list of stages in the Backend.  The Backend should not know of any
 particular stage, rather it should only be concerned that it has a list of
 stages, and that those stages will fulfill the contract of what it means to be
 a Stage (namely pre/post/execute a given instruction).

For now, we leave the original set of stages defined in the Backend ctor;
however, I imagine these will be moved out at a later time.

This patch makes an adjustment to the semantics of Stage::isReady.
Specifically, what the Backend really needs to know is if a Stage has
unfinished work.  With that said, it is more appropriately renamed
Stage::hasWorkToComplete().  This change will clean up the check in
Backend::run(), allowing us to query each stage to see if there is unfinished
work, regardless of what subclass a stage might be.  I feel that this change
simplifies the semantics too, but that's a subjective statement.

Given how RetireStage and ExecuteStage handle data in their preExecute(), I've
had to change the order of Retire and Execute in our stage list.  Retire must
complete any of its preExecute actions before ExecuteStage's preExecute can
take control.  This is mainly because both stages utilize the RCU.  In the
meantime, I want to see if I can adjust that or remove that coupling.

Reviewers: andreadb, RKSimon, courbet

Reviewed By: andreadb

Subscribers: tschuett, gbedwell, llvm-commits

Differential Revision: https://reviews.llvm.org/D46907

llvm-svn: 335361
2018-06-22 16:17:26 +00:00
Andrea Di Biagio
371b1bd7b2 [llvm-mca] Updates comment in code, and remove some stale comments. NFC
Also, rename fields `TotalMappings` and `NumUsedMappings` in struct
RegisterMappingTracker into `NumPhysRegs` and `NumUsedPhysRegs`.

llvm-svn: 335219
2018-06-21 12:14:49 +00:00
Andrea Di Biagio
17e1fccadc [llvm-mca] use APint::operator[] to obtain the bit value. NFC
llvm-svn: 335131
2018-06-20 14:30:17 +00:00
Andrea Di Biagio
4893e095df [llvm-mca][X86] Teach how to identify register writes that implicitly clear the upper portion of a super-register.
This patch teaches llvm-mca how to identify register writes that implicitly zero
the upper portion of a super-register.

On X86-64, a general purpose register is implemented in hardware as a 64-bit
register. Quoting the Intel 64 Software Developer's Manual: "an update to the
lower 32 bits of a 64 bit integer register is architecturally defined to zero
extend the upper 32 bits".  Also, a write to an XMM register performed by an AVX
instruction implicitly zeroes the upper 128 bits of the aliasing YMM register.

This patch adds a new method named clearsSuperRegisters to the MCInstrAnalysis
interface to help identify instructions that implicitly clear the upper portion
of a super-register.  The rest of the patch teaches llvm-mca how to use that new
method to obtain the information, and update the register dependencies
accordingly.

I compared the kernels from tests clear-super-register-1.s and
clear-super-register-2.s against the output from perf on btver2.  Previously
there was a large discrepancy between the estimated IPC and the measured IPC.
Now the differences are mostly in the noise.

Differential Revision: https://reviews.llvm.org/D48225

llvm-svn: 335113
2018-06-20 10:08:11 +00:00
Matt Davis
8c00f4b3c2 [llvm-mca] Cleanup the header syntax line. Fix a comment. NFC.
This patch removes a few dashes from the header comment to make room for the syntax line.

llvm-svn: 334986
2018-06-18 21:38:38 +00:00
Andrea Di Biagio
f6e62b21be [llvm-mca] Use an ordered map to collect hardware statistics. NFC.
Histogram entries are now ordered by key.  This should improves their
readability when statistics are printed.

llvm-svn: 334961
2018-06-18 17:04:56 +00:00
Roman Lebedev
4ec9cfed2e [MCA] Add -summary-view option
Summary:
While that is indeed a quite interesting summary stat,
there are cases where it does not really add anything
other than consuming extra lines.

Declutters the output of D48190.

Reviewers: RKSimon, andreadb, courbet, craig.topper

Reviewed By: andreadb

Subscribers: javed.absar, gbedwell, llvm-commits

Differential Revision: https://reviews.llvm.org/D48209

llvm-svn: 334833
2018-06-15 14:01:43 +00:00
Matt Davis
e567892d10 [llvm-mca] Clean up the header comment. NFC.
This change removes a few dashes to make room for the header syntax string.

llvm-svn: 334770
2018-06-14 20:58:54 +00:00
Matt Davis
0594746649 [llvm-mca] Introduce the ExecuteStage (was originally the Scheduler class).
Summary: This patch transforms the Scheduler class into the ExecuteStage.  Most of the logic remains.  

Reviewers: andreadb, RKSimon, courbet

Reviewed By: andreadb

Subscribers: mgorny, javed.absar, tschuett, gbedwell, llvm-commits

Differential Revision: https://reviews.llvm.org/D47246

llvm-svn: 334679
2018-06-14 01:20:18 +00:00
Andrea Di Biagio
7cc34df875 [llvm-mca] Fixed a bug in the logic that checks if a memory operation is ready to execute.
Fixes PR37790.

In some (very rare) cases, the LSUnit (Load/Store unit) was wrongly marking a
load (or store) as "ready to execute" effectively bypassing older memory barrier
instructions.

To reproduce this bug, the memory barrier must be the first instruction in the
input assembly sequence, and it doesn't have to perform any register writes.

llvm-svn: 334633
2018-06-13 18:30:14 +00:00
Andrea Di Biagio
b297f1e738 Revert: [llvm-mca] Flush the output stream before we start the analysis of a new code region. NFC
Not sure why, but it breaks buildbot clang-cmake-armv8-full.
It causes a failure in TEST 'Xray-armhf-linux :: TestCases/Posix/profiling-single-threaded.cc'.

llvm-svn: 334617
2018-06-13 16:33:52 +00:00
Andrea Di Biagio
24185018fb [llvm-mca] Flush the output stream before we start the analysis of a new code region. NFC
llvm-svn: 334610
2018-06-13 15:43:56 +00:00
Andrea Di Biagio
0fae23f1ce [llvm-mca] Correctly update the CyclesLeft of a register read in the presence of partial register updates.
This patch fixe the logic in ReadState::cycleEvent(). That method was not
correctly updating field `TotalCycles`.

Added extra code comments in class ReadState to better describe each field.

llvm-svn: 334028
2018-06-05 17:12:02 +00:00
Andrea Di Biagio
8b4cab0d54 [RFC][patch 3/3] Add support for variant scheduling classes in llvm-mca.
This patch is the last of a sequence of three patches related to LLVM-dev RFC
"MC support for variant scheduling classes".
http://lists.llvm.org/pipermail/llvm-dev/2018-May/123181.html

This fixes PR36672.

The main goal of this patch is to teach llvm-mca how to solve variant scheduling
classes.  This patch does that, plus it adds new variant scheduling classes to
the BtVer2 scheduling model to identify so-called zero-idioms (i.e. so-called
dependency breaking instructions that are known to generate zero, and that are
optimized out in hardware at register renaming stage).

Without the BtVer2 change, this patch would not have had any meaningful tests.
This patch is effectively the union of two changes:
 1) a change that teaches llvm-mca how to resolve variant scheduling classes.
 2) a change to the BtVer2 scheduling model that allows us to special-case
    packed XOR zero-idioms (this partially fixes PR36671).

Differential Revision: https://reviews.llvm.org/D47374 

llvm-svn: 333909
2018-06-04 15:43:09 +00:00
Andrea Di Biagio
9cf08c77bf [llvm-mca] Track cycles contributed by resources that are in a 'Super' relationship.
This is required if we want to correctly match the behavior of method
SubtargetEmitter::ExpandProcResource() in Tablegen. When computing the set of
"consumed" processor resources and resource cycles, the logic in
ExpandProcResource() doesn't update the number of resource cycles contributed by
a "Super" resource to a group.  We need to take this into account when a model
declares a processor resource which is part of a 'processor resource group', and
it is also used as the "Super" of other resources.

llvm-svn: 333892
2018-06-04 12:23:07 +00:00
Andrea Di Biagio
0110f33d56 [llvm-mca] Move the logic that computes the block throughput into Support.h. NFC
This will allow us to share the logic that computes the block throughput with
other views.

llvm-svn: 333755
2018-06-01 14:35:21 +00:00
Andrea Di Biagio
93f3fc5db3 [llvm-mca] Fixed a problem caused by an invalid use of a processor resource mask in the Scheduler.
The lambda functions used by method ResourceManager::mustIssueImmediately() was
incorrectly truncating masks of buffered processor resources to 32-bit quantities.
The invalid mask values were then used to access a map of processor
resource descriptors.

Fixes PR37643.

llvm-svn: 333692
2018-05-31 20:27:46 +00:00
Matt Davis
d559fd8c73 [llvm-mca] Update the header's guard name. NFC.
This patch also places a comment at the end of the header guard.

llvm-svn: 333297
2018-05-25 18:45:43 +00:00
Matt Davis
3be2788247 [llvm-mca] Update DispatchStage header comment. NFC.
Updated the comment to be a wee bit more descriptive.

llvm-svn: 333296
2018-05-25 18:31:28 +00:00
Matt Davis
61885d41df [llvm-mca] Add the RetireStage.
Summary:
This class maintains the same logic as the original RetireControlUnit.

This is just an intermediate patch to make the RCU a Stage.  Future patches will remove the dependency on the DispatchStage, and then more properly populate the pre/execute/post Stage interface.  

Reviewers: andreadb, RKSimon, courbet

Reviewed By: andreadb, courbet

Subscribers: javed.absar, mgorny, tschuett, gbedwell, llvm-commits

Differential Revision: https://reviews.llvm.org/D47244

llvm-svn: 333292
2018-05-25 18:00:25 +00:00
Andrea Di Biagio
2aa4f29a7e [llvm-mca] Fix a rounding problem in SummaryView.cpp exposed by r333204.
Before printing the block reciprocal throughput, ensure that the floating point
number is always rounded the same way on every target.
No functional change intended.

llvm-svn: 333210
2018-05-24 17:22:14 +00:00
Matt Davis
4c2c0e5a34 [llvm-mca] Fix header comments. NFC.
llvm-svn: 333096
2018-05-23 16:15:06 +00:00
Andrea Di Biagio
d476292a08 [llvm-mca] Print the "Block RThroughput" in the SummaryView.
This patch implements the "block reciprocal throughput" computation in the
SummaryView.

The block reciprocal throughput is computed as the MAX of:
  - NumMicroOps / DispatchWidth
  - Resource Cycles / #Units   (for every resource consumed).

The block throughput is bounded from above by the hardware dispatch throughput.
That is because the DispatchWidth is an upper bound on how many opcodes can be part
of a single dispatch group.

The block throughput is also limited by the amount of hardware parallelism. The
number of available resource units affects how the resource pressure is
distributed, and also how many blocks can be delivered every cycle.

llvm-svn: 333095
2018-05-23 15:59:27 +00:00
Matt Davis
127f57c750 [llvm-mca] Move DispatchStage::cycleEvent to preExecute. NFC.
Summary:
This is an intermediate change, it moves the non-notification logic from
Backend::notifyCycleBegin to runCycle().
    
Once the scheduler becomes part of the Execution stage
the explicit call to Scheduler::cycleEvent will disappear.
    
The logic for Dispatch::cycleEvent() can be in
the preExecute phase, which this patch addresses.


Reviewers: andreadb, RKSimon, courbet

Reviewed By: andreadb

Subscribers: tschuett, gbedwell, llvm-commits

Differential Revision: https://reviews.llvm.org/D47213

llvm-svn: 333029
2018-05-22 20:51:58 +00:00
Andrea Di Biagio
36a508f010 [llvm-mca] Removed an empty line generated by the timeline view. NFC.
Also, regenerate all tests.

llvm-svn: 332853
2018-05-21 17:11:56 +00:00
Matt Davis
d16c1af4bc [llvm-mca] Make Dispatch a subclass of Stage.
Summary:
The logic of dispatch remains the same, but now DispatchUnit is a Stage (DispatchStage).

This change has the benefit of simplifying the backend runCycle() code.
The same logic applies, but it belongs to different components now.  This is just a start,
eventually we will need to remove the call to the DispatchStage in Scheduler.cpp, but
that will be a separate patch.  This change is mostly a renaming and moving of existing logic.

This change also encouraged me to remove the Subtarget (STI) member from the
Backend class.  That member was used to initialize the other members of Backend
and to eventually call DispatchUnit::dispatch().  Now that we have Stages, we
can eliminate this by instantiating the DispatchStage with everything it needs
at the time of construction (e.g., Subtarget).  That change allows us to call
DispatchStage::execute(IR) as we expect to call execute() for all other stages.

Once we add the Stage list (D46907) we can more cleanly call preExecute() on
all of the stages, DispatchStage, will probably wrap cycleEvent() in that
case.

Made some formatting and minor cleanups to README.txt.  Some of the text
was re-flowed to stay within 80 cols.


Reviewers: andreadb, courbet, RKSimon

Reviewed By: andreadb, courbet

Subscribers: mgorny, javed.absar, tschuett, gbedwell, llvm-commits

Differential Revision: https://reviews.llvm.org/D46983

llvm-svn: 332652
2018-05-17 19:22:29 +00:00
Andrea Di Biagio
f5e3a66963 [llvm-mca] Hide unrelated flags from the -help output.
llvm-svn: 332615
2018-05-17 15:35:14 +00:00
Andrea Di Biagio
8bbb45c175 [llvm-mca] add flag -all-views and flag -all-stats.
Flag -all-views enables all the views.
Flag -all-stats enables all the views that print hardware statistics.

llvm-svn: 332602
2018-05-17 12:27:03 +00:00
Matt Davis
78d832622d [llvm-mca] Move the RegisterFile class into its own translation unit. NFC
Summary: This change will help us turn the DispatchUnit into its own stage.

Reviewers: andreadb, RKSimon, courbet

Reviewed By: andreadb, courbet

Subscribers: mgorny, tschuett, gbedwell, llvm-commits

Differential Revision: https://reviews.llvm.org/D46916

llvm-svn: 332493
2018-05-16 17:07:08 +00:00
Andrea Di Biagio
5b328a4e1c [llvm-mca] Move definitions in FetchStage.cpp inside namespace mca. NFC
Also, get rid of a redundant include in FetchStage.h and FetchStage.cpp.

llvm-svn: 332468
2018-05-16 13:38:17 +00:00
Andrea Di Biagio
857c92f90f [llvm-mca] Fix perf regression after r332390.
Revision 332390 introduced a FetchStage class in llvm-mca.
By design, FetchStage owns all the instructions in-flight in the OoO Backend.

Before this change, new instructions were added to a DenseMap indexed by
instruction id. The problem with using a DenseMap is that elements are not
ordered by key. This was causing a massive slow down in method
FetchStage::postExecute(), which searches for instructions retired that can be
deleted.

This patch replaces the DenseMap with a std::map ordered by instruction index.
At the end of every cycle, we search for the first instruction which is not
marked as "retired", and we remove all the previous instructions before it.
This works well because instructions are retired in-order.

Before this patch, a debug build of llvm-mca (on my Ryzen linux machine) took
~8.0 seconds to simulate 3000 iterations of a x86 dot-product (a `vmulps,
vpermilps, vaddps, vpermilps, vaddps` sequence). With this patch, it now takes
~0.8s to run all the 3000 iterations.

llvm-svn: 332461
2018-05-16 12:33:09 +00:00
Andrea Di Biagio
38049fdba1 [llvm-mca] Remove redundant includes in Stage.h.
This patch also makes Stage::isReady() a const method.

No functional change.

llvm-svn: 332443
2018-05-16 09:24:38 +00:00
Matt Davis
efffaf5cb4 [llvm-mca] Introduce a pipeline Stage class and FetchStage.
Summary:
    This is just an idea, really two ideas.  I expect some push-back,
    but I realize that posting a diff is the most comprehensive way to express
    these concepts.

    This patch introduces a Stage class which represents the
    various stages of an instruction pipeline.  As a start, I have created a simple
    FetchStage that is based on existing logic for how MCA produces
    instructions, but now encapsulated in a Stage.  The idea should become more concrete
    once we introduce additional stages.  The idea being, that when a stage completes,
    the next stage in the pipeline will be executed.  Stages are chained together
    as a singly linked list to closely model a real pipeline. For now there is only one stage,
    so the stage-to-stage flow of instructions isn't immediately obvious.

    Eventually, Stage will also handle event notifications, but that functionality
    is not complete, and not destined for this patch.  Ideally, an interested party 
    can register for notifications from a particular stage.  Callbacks will be issued to
    these listeners at various points in the execution of the stage.  
    For now, eventing functionality remains similar to what it has been in mca::Backend. 
    We will be building-up the Stage class as we move on, such as adding debug output.

    This patch also removes the unique_ptr<Instruction> return value from
    InstrBuilder::createInstruction.  An Instruction pointer is still produced,
    but now it's up to the caller to decide how that item should be managed post-allocation
    (e.g., smart pointer).  This allows the Fetch stage to create instructions and
    manage the lifetime of those instructions as it wishes, and not have to be bound to any
    specific managed pointer type.  Other callers of createInstruction might have different 
    requirements, and thus can manage the pointer to fit their needs.  Another idea would be to push the
   ownership to the RCU. 

    Currently, the FetchStage will wrap the Instruction
    pointer in a shared_ptr.  This allows us to remove the Instruction container in
    Backend, which was probably going to disappear, or move, at some point anyways.
    Note that I did run these changes through valgrind, to make sure we are not leaking
    memory.  While the shared_ptr comes with some additional overhead it relieves us
    from having to manage a list of generated instructions, and/or make lookup calls
    to remove the instructions. 

    I realize that both the Stage class and the Instruction pointer management
    (mentioned directly above) are separate but related ideas, and probably should
    land as separate patches; I am happy to do that if either idea is decent.
    The main reason these two ideas are together is that
    Stage::execute() can mutate an InstRef. For the fetch stage, the InstRef is populated
    as the primary action of that stage (execute()).  I didn't want to change the Stage interface
    to support the idea of generating an instruction.  Ideally, instructions are to
    be pushed through the pipeline.  I didn't want to draw too much of a
    specialization just for the fetch stage.  Excuse the word-salad.

Reviewers: andreadb, courbet, RKSimon

Reviewed By: andreadb

Subscribers: llvm-commits, mgorny, javed.absar, tschuett, gbedwell

Differential Revision: https://reviews.llvm.org/D46741

llvm-svn: 332390
2018-05-15 20:21:04 +00:00
Andrea Di Biagio
5c841698f9 [llvm-mca] use a formatted_raw_ostream to insert padding and get rid of tabs. NFC
llvm-svn: 332381
2018-05-15 18:11:45 +00:00
Andrea Di Biagio
09c9c3e7d1 [llvm-mca] Strip leading tabs and spaces from instruction strings before printing. NFC
llvm-svn: 332361
2018-05-15 15:18:05 +00:00
Andrea Di Biagio
ee73b51886 [llvm-mca] Remove unused include header files. NFC
Also, run clang-format on RetireControlUnit.cpp.

llvm-svn: 332337
2018-05-15 10:30:39 +00:00
Andrea Di Biagio
e5c03fa0a4 [llvm-mca] Add file header to RetireControlUnit.cpp.
Strictly speaking, this is not necessary for .cpp files. However, other .cpp
files from this same tool have it. This also matches what we do in other tools.

llvm-svn: 332334
2018-05-15 09:31:32 +00:00
Andrea Di Biagio
c6b21b2d69 [llvm-mca] Improved support for dependency-breaking instructions.
The tool assumes that a zero-latency instruction that doesn't consume hardware
resources is an optimizable dependency-breaking instruction. That means, it
doesn't have to wait on register input operands, and it doesn't consume any
physical register. The PRF knows how to optimize it at register renaming stage.

llvm-svn: 332249
2018-05-14 15:08:22 +00:00
Nicola Zaghen
9667127c14 Rename DEBUG macro to LLVM_DEBUG.
The DEBUG() macro is very generic so it might clash with other projects.
The renaming was done as follows:
- git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g'
- git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM
- Manual change to APInt
- Manually chage DOCS as regex doesn't match it.

In the transition period the DEBUG() macro is still present and aliased
to the LLVM_DEBUG() one.

Differential Revision: https://reviews.llvm.org/D43624

llvm-svn: 332240
2018-05-14 12:53:11 +00:00
David Blaikie
3c91607119 Move standard library inclusions to after internal inclusions.
llvm-svn: 332124
2018-05-11 19:21:40 +00:00
David Blaikie
08c63c1dfe llvm-mca: Add missing includes
Move the header include in the primary source file to the top to
validate that it doesn't depend on any other inclusions.

llvm-svn: 331897
2018-05-09 17:28:10 +00:00
Matt Davis
3e43e69679 [llvm-mca] Avoid exposing index values in the MCA interfaces.
Summary:
This patch eliminates many places where we originally needed to  pass index
values to represent an instruction.  The index is still used as a key, in various parts of 
MCA.  I'm  not comfortable eliminating the index just yet.    By burying the index in
the instruction, we can avoid exposing that value in many places.

Eventually, we should consider removing the Instructions list in the Backend 
all together,   it's only used to hold and reclaim the memory for the allocated 
Instruction instances.  Instead we could pass around a smart pointer.  But that's
a separate discussion/patch.

Reviewers: andreadb, courbet, RKSimon

Reviewed By: andreadb

Subscribers: javed.absar, tschuett, gbedwell, llvm-commits

Differential Revision: https://reviews.llvm.org/D46367

llvm-svn: 331660
2018-05-07 18:29:15 +00:00
Andrea Di Biagio
c6d4a11d75 [llvm-mca] removes flag -instruction-tables from the "View Options" category.
This patch also improves the description of a couple of flags in the view
options. With this change, the -help now specifies which views are enabled by
default.

llvm-svn: 331594
2018-05-05 15:36:47 +00:00
Andrea Di Biagio
bf53c91505 [llvm-mca] minor tweak to the resource pressure printing functionality. NFC.
llvm-svn: 331590
2018-05-05 12:21:54 +00:00
Matt Davis
1b4230edab [llvm-mca] Add descriptive names for the TimelineView report characters. NFC.
Summary:
This change makes the TimelineView source simpler to read and easier to modify in the future.
This patch introduces a class of static chars used as the display values in the TimelineView report, this change just eliminates a few magic characters.

Reviewers: andreadb, courbet, RKSimon

Reviewed By: andreadb

Subscribers: tschuett, gbedwell, llvm-commits

Differential Revision: https://reviews.llvm.org/D46409

llvm-svn: 331540
2018-05-04 17:19:40 +00:00
Andrea Di Biagio
62309fcfae [llvm-mca] use colors for warnings and notes generated by InstrBuilder.
llvm-svn: 331517
2018-05-04 13:52:12 +00:00