1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-20 19:42:54 +02:00
Commit Graph

332 Commits

Author SHA1 Message Date
Matt Davis
4818ffc307 [llvm-mca] Reformat a few lines (fix spacing). NFC.
llvm-svn: 340065
2018-08-17 18:06:01 +00:00
Andrea Di Biagio
fb9304872a [llvm-mca] Removed references to HWStallEvent in Scheduler.h. NFCI
class Scheduler should not know anything of hardware event listeners and
hardware stall events (HWStallEvent).  HWStallEvent objects should only be
constructed by pipeline stages to notify listeners of hardware events.

No functional change intended.

llvm-svn: 340036
2018-08-17 15:01:37 +00:00
Andrea Di Biagio
1dc15644b9 [llvm-mca] Fix -Wpessimizing-move warnings introduced by r339923.
Reported by buildbot `clang-with-lto-ubuntu` ( build #9858 ).

llvm-svn: 339928
2018-08-16 19:45:13 +00:00
Andrea Di Biagio
081e1ff89b [llvm-mca] Refactor how execution is orchestrated by the Pipeline.
This patch changes how instruction execution is orchestrated by the Pipeline.
In particular, this patch makes it more explicit how instructions transition
through the various pipeline stages during execution.

The main goal is to simplify both the stage API and the Pipeline execution.  At
the same time, this patch fixes some design issues which are currently latent,
but that are likely to cause problems in future if people start defining custom
pipelines.

The new design assumes that each pipeline stage knows the "next-in-sequence".
The Stage API has gained three new methods:
 -   isAvailable(IR)
 -   checkNextStage(IR)
 -   moveToTheNextStage(IR).

An instruction IR can be executed by a Stage if method `Stage::isAvailable(IR)`
returns true.
Instructions can move to next stages using method moveToTheNextStage(IR).
An instruction cannot be moved to the next stage if method checkNextStage(IR)
(called on the current stage) returns false.
Stages are now responsible for moving instructions to the next stage in sequence
if necessary.

Instructions are allowed to transition through multiple stages during a single
cycle (as long as stages are available, and as long as all the calls to
`checkNextStage(IR)` returns true).

Methods `Stage::preExecute()` and `Stage::postExecute()` have now become
redundant, and those are removed by this patch.

Method Pipeline::runCycle() is now simpler, and it correctly visits stages
on every begin/end of cycle.

Other changes:
 - DispatchStage no longer requires a reference to the Scheduler.
 - ExecuteStage no longer needs to directly interact with the
   RetireControlUnit. Instead, executed instructions are now directly moved to the
   next stage (i.e. the retire stage).
 - RetireStage gained an execute method. This allowed us to remove the
   dependency with the RCU in ExecuteStage.
 - FecthStage now updates the "program counter" during cycleBegin() (i.e.
   before we start executing new instructions).
 - We no longer need Stage::Status to be returned by method execute(). It has
   been dropped in favor of a more lightweight llvm::Error.

Overally, I measured a ~11% performance gain w.r.t. the previous design.  I also
think that the Stage interface is probably easier to read now.  That being said,
code comments have to be improved, and I plan to do it in a follow-up patch.

Differential revision: https://reviews.llvm.org/D50849

llvm-svn: 339923
2018-08-16 19:00:48 +00:00
Andrea Di Biagio
da0d5626a8 [llvm-mca] Small refactoring in preparation for another patch that will improve the modularity of the Pipeline. NFCI
The main difference is that now `cycleStart()` and `cycleEnd()` return an
llvm::Error.

This patch implements a few minor style changes, and adds missing 'const' to
some methods.

llvm-svn: 339885
2018-08-16 15:43:09 +00:00
Andrea Di Biagio
10a086e001 [llvm-mca] Minor style changes. NFC
llvm-svn: 339823
2018-08-15 22:11:05 +00:00
Andrea Di Biagio
128c334e6d [llvm-mca] Fix PR38575: Avoid an invalid implicit truncation of a processor resource mask (an uint64_t value) to unsigned.
This patch fixes a regression introduced at revision 338702.

A processor resource mask was incorrectly implicitly truncated to an unsigned
quantity. Later on, the truncated mask was used to initialize an element of a
vector of processor resource descriptors.
On targets with more than 32 processor resources, some elements of the vector
are left uninitialized. As a consequence, this bug might have eventually caused
a crash due to null dereference in the Scheduler.

This patch fixes PR38575, and adds a test for it.

llvm-svn: 339768
2018-08-15 12:53:38 +00:00
Matt Davis
8f7fca4891 [llvm-mca] Propagate fatal llvm-mca errors from library classes to driver.
Summary:
This patch introduces error handling to propagate the errors from llvm-mca library classes (or what will become library classes) up to the driver.  This patch also introduces an enum to make clearer the intention of the return value for Stage::execute.

This supports PR38101.

Reviewers: andreadb, courbet, RKSimon

Reviewed By: andreadb

Subscribers: llvm-commits, tschuett, gbedwell

Differential Revision: https://reviews.llvm.org/D50561

llvm-svn: 339594
2018-08-13 18:11:48 +00:00
Matt Davis
ef7fb7a970 [llvm-mca] Make InstrBuilder::getOrCreateInstrDesc private. NFC.
llvm-svn: 339468
2018-08-10 20:24:27 +00:00
Petr Hosek
a98aa0866c [ADT] Normalize empty triple components
LLVM triple normalization is handling "unknown" and empty components
differently; for example given "x86_64-unknown-linux-gnu" and
"x86_64-linux-gnu" which should be equivalent, triple normalization
returns "x86_64-unknown-linux-gnu" and "x86_64--linux-gnu". autoconf's
config.sub returns "x86_64-unknown-linux-gnu" for both
"x86_64-linux-gnu" and "x86_64-unknown-linux-gnu". This changes the
triple normalization to behave the same way, replacing empty triple
components with "unknown".

This addresses PR37129.

Differential Revision: https://reviews.llvm.org/D50219

llvm-svn: 339294
2018-08-08 22:23:57 +00:00
Andrea Di Biagio
3d390604c7 [llvm-mca] Speed up the computation of the wait/ready/issued sets in the Scheduler.
This patch is a follow-up to r338702.

We don't need to use a map to model the wait/ready/issued sets. It is much more
efficient to use a vector instead.

This patch gives us an average 7.5% speedup (on top of the ~12% speedup obtained
after r338702).

llvm-svn: 338883
2018-08-03 12:55:28 +00:00
Andrea Di Biagio
8b13141f92 [llvm-mca] Use a vector to store ResourceState objects in the ResourceManager.
We don't need to use a map to store ResourceState objects. The number of
processor resources is known statically from the scheduling model. We can
therefore use a vector, and reserve a slot for each processor resource that we
want to simulate.
Every time the ResourceManager queries the ResourceState vector, the index to
the vector of ResourceState objects can be easily computed from the processor
resource mask.

This drastically reduces the time complexity of method ResourceManager::use() and
method ResourceManager::release(). This patch gives an average speedup of 12%.

llvm-svn: 338702
2018-08-02 11:12:35 +00:00
Andrea Di Biagio
692173433e [llvm-mca] Correctly update the rank in Scheduler::select().
Found by inspection.

llvm-svn: 338579
2018-08-01 16:06:33 +00:00
Andrea Di Biagio
e75426e0b8 [llvm-mca] Improve code comments. NFC.
llvm-svn: 338513
2018-08-01 10:49:01 +00:00
Matt Davis
d0dbf0dd79 [llvm-mca] Update the help text to reflect "physical" registers. NFC.
llvm-svn: 338430
2018-07-31 20:05:08 +00:00
Andrea Di Biagio
ec4329b38b [llvm-mca] Remove README.txt
A detailed description of the tool has been recently added by Matt to
CommandGuide/llvm-mca.rst. File README.txt is now redundant and can be removed;
all the relevant user-guide information has been improved and then moved to
llvm-mca.rst.

In future, we should add another .rst for the "llvm-mca developer manual" to
provide infromation about:
 - llvm-mca internals.
 - How to add custom stages to the simulated pipeline.
 - How to provide extra processor info in the scheduling model to improve the
   analysis performed by llvm-mca.

llvm-svn: 338386
2018-07-31 14:23:49 +00:00
Andrea Di Biagio
0e53532aeb [llvm-mca][BtVer2] Teach how to identify dependency-breaking idioms.
This patch teaches llvm-mca how to identify dependency breaking instructions on
btver2.

An example of dependency breaking instructions is the zero-idiom XOR (example:
`XOR %eax, %eax`), which always generates zero regardless of the actual value of
the input register operands.
Dependency breaking instructions don't have to wait on their input register
operands before executing. This is because the computation is not dependent on
the inputs.

Not all dependency breaking idioms are also zero-latency instructions. For
example, `CMPEQ %xmm1, %xmm1` is independent on
the value of XMM1, and it generates a vector of all-ones.
That instruction is not eliminated at register renaming stage, and its opcode is
issued to a pipeline for execution. So, the latency is not zero. 

This patch adds a new method named isDependencyBreaking() to the MCInstrAnalysis
interface. That method takes as input an instruction (i.e. MCInst) and a
MCSubtargetInfo.
The default implementation of isDependencyBreaking() conservatively returns
false for all instructions. Targets may override the default behavior for
specific CPUs, and return a value which better matches the subtarget behavior.

In future, we should teach to Tablegen how to automatically generate the body of
isDependencyBreaking from scheduling predicate definitions. This would allow us
to expose the knowledge about dependency breaking instructions to the machine
schedulers (and, potentially, other codegen passes).

Differential Revision: https://reviews.llvm.org/D49310

llvm-svn: 338372
2018-07-31 13:21:43 +00:00
Dean Michael Berris
79adceb6c6 [MCA] Avoid an InstrDesc copy in mca::LSUnit::reserve.
Summary:
InstrDesc contains 4 vectors (as well as some other data), so it's
expensive to copy.

Authored By: orodley

Reviewers: andreadb, mattd, dberris

Reviewed By: mattd, dberris

Subscribers: dberris, gbedwell, llvm-commits

Differential Revision: https://reviews.llvm.org/D49775

llvm-svn: 337985
2018-07-26 00:02:54 +00:00
Andrea Di Biagio
c19db3b1d5 [llvm-mca][BtVer2] teach how to identify false dependencies on partially written
registers.

The goal of this patch is to improve the throughput analysis in llvm-mca for the
case where instructions perform partial register writes.

On x86, partial register writes are quite difficult to model, mainly because
different processors tend to implement different register merging schemes in
hardware.

When the code contains partial register writes, the IPC (instructions per
cycles) estimated by llvm-mca tends to diverge quite significantly from the
observed IPC (using perf).

Modern AMD processors (at least, from Bulldozer onwards) don't rename partial
registers. Quoting Agner Fog's microarchitecture.pdf:
" The processor always keeps the different parts of an integer register together.
For example, AL and AH are not treated as independent by the out-of-order
execution mechanism. An instruction that writes to part of a register will
therefore have a false dependence on any previous write to the same register or
any part of it."

This patch is a first important step towards improving the analysis of partial
register updates. It changes the semantic of RegisterFile descriptors in
tablegen, and teaches llvm-mca how to identify false dependences in the presence
of partial register writes (for more details: see the new code comments in
include/Target/TargetSchedule.h - class RegisterFile).

This patch doesn't address the case where a write to a part of a register is
followed by a read from the whole register.  On Intel chips, high8 registers
(AH/BH/CH/DH)) can be stored in separate physical registers. However, a later
(dirty) read of the full register (example: AX/EAX) triggers a merge uOp, which
adds extra latency (and potentially affects the pipe usage).
This is a very interesting article on the subject with a very informative answer
from Peter Cordes:
https://stackoverflow.com/questions/45660139/how-exactly-do-partial-registers-on-haswell-skylake-perform-writing-al-seems-to

In future, the definition of RegisterFile can be extended with extra information
that may be used to identify delays caused by merge opcodes triggered by a dirty
read of a partial write.

Differential Revision: https://reviews.llvm.org/D49196

llvm-svn: 337123
2018-07-15 11:01:38 +00:00
Matt Davis
7599dd7f50 [llvm-mca] Turn InstructionTables into a Stage.
Summary:
This patch converts the InstructionTables class into a subclass of mca::Stage.  This change allows us to use the Stage's inherited Listeners for event notifications.  This also allows us to create a simple pipeline for viewing the InstructionTables report.

I have been working on a follow on patch that should cleanup addView in InstructionTables.  Right now, addView adds the view to both the Listener list and Views list.  The follow-on patch addresses the fact that we don't really need two lists in this case.  That change is not specific to just InstructionTables, so it will be a separate patch. 

Reviewers: andreadb, courbet, RKSimon

Reviewed By: andreadb

Subscribers: tschuett, gbedwell, llvm-commits

Differential Revision: https://reviews.llvm.org/D49329

llvm-svn: 337113
2018-07-14 23:52:50 +00:00
Matt Davis
4391ce3837 [llvm-mca] Remove unused InstRef formal from pre and post execute callbacks. NFC.
llvm-svn: 337077
2018-07-14 00:10:42 +00:00
Andrea Di Biagio
642e2e539c [llvm-mca] Improve a few debug prints. NFC
llvm-svn: 337003
2018-07-13 14:55:47 +00:00
Andrea Di Biagio
2c65d72a0f [llvm-mca] Simplify the Pipeline constructor. NFC
llvm-svn: 336984
2018-07-13 09:31:02 +00:00
Andrea Di Biagio
3af21f0841 [llvm-mca] Removed unused arguments from methods in class Pipeline. NFC
llvm-svn: 336983
2018-07-13 09:27:34 +00:00
Matt Davis
bcf4db053e [llvm-mca] Constify SourceMgr::hasNext. NFC.
llvm-svn: 336961
2018-07-12 23:19:30 +00:00
Matt Davis
3de6f07d48 [llvm-mca] Add cycleBegin/cycleEnd callbacks to mca::Stage.
Summary:
This patch  clears up some of the semantics within the Stage class.  Now, preExecute
can be called multiple times per simulated cycle.  Previously preExecute was
only called once per cycle, and postExecute could have been called multiple
times.

Now, cycleStart/cycleEnd are called only once per simulated cycle.
preExecute/postExecute can be called multiple times per cycle.  This
occurs because multiple execution events can occur during a single cycle.

When stages are executed (Pipeline::runCycle), the postExecute hook will
be called only if all Stages return a success from their 'execute' callback.

Reviewers: andreadb, courbet, RKSimon

Reviewed By: andreadb

Subscribers: tschuett, gbedwell, llvm-commits

Differential Revision: https://reviews.llvm.org/D49250

llvm-svn: 336959
2018-07-12 22:59:53 +00:00
Matt Davis
9919114951 [llvm-mca] Simplify eventing by adding an onEvent templated method.
Summary:
This patch eliminates some redundancy in iterating across Listeners for the
Instruction and Stall HWEvents, by introducing a template onEvent routine.
This change was suggested by @courbet in https://reviews.llvm.org/D48576.  I
 hope that this patch addresses that suggestion appropriately.  I do like this
change better than what we had previously.


Reviewers: andreadb, courbet, RKSimon

Reviewed By: andreadb, courbet

Subscribers: javed.absar, tschuett, gbedwell, llvm-commits, courbet

Differential Revision: https://reviews.llvm.org/D48672

llvm-svn: 336916
2018-07-12 16:56:17 +00:00
Andrea Di Biagio
e2a4194fea [llvm-mca] Use a different character to flag instructions with side-effects in the Instruction Info View. NFC
This makes easier to identify changes in the instruction info flags.  It also
helps spotting potential regressions similar to the one recently introduced at
r336728.

Using the same character to mark MayLoad/MayStore/HasSideEffects is problematic
for llvm-lit. When pattern matching substrings, llvm-lit consumes tabs and
spaces. A change in position of the flag marker may not trigger a test failure.

This patch only changes the character used for flag `hasSideEffects`. The reason
why I didn't touch other flags is because I want to avoid spamming the mailing
because of the massive diff due to the numerous tests affected by this change.

In future, each instruction flag should be associated with a different character
in the Instruction Info View.

llvm-svn: 336797
2018-07-11 12:44:44 +00:00
Andrea Di Biagio
aaaf1382f8 [llvm-mca] report an error if the assembly sequence contains an unsupported instruction.
This is a short-term fix for PR38093.
For now, we llvm::report_fatal_error if the instruction builder finds an
unsupported instruction in the instruction stream.

We need to revisit this fix once we start addressing PR38101.
Essentially, we need a better framework for error handling.

llvm-svn: 336543
2018-07-09 12:30:55 +00:00
Matt Davis
7464e6c913 [llvm-mca] Add HardwareUnit and Context classes.
This patch moves the construction of the default backend from llvm-mca.cpp and
into mca::Context. The Context class is responsible for holding ownership of
the simulated hardware components. These components are subclasses of
HardwareUnit. Right now the HardwareUnit is pretty bare-bones, but eventually
we might want to add some common functionality across all hardware components,
such as isReady() or something similar.

I have a feeling this patch will probably need some updates, but it's a start.
One thing I am not particularly fond of is the rather large interface for
createDefaultPipeline. That convenience routine takes a rather large set of
inputs from the llvm-mca driver, where many of those inputs are generated via
command line options.

One item I think we might want to change is the separating of ownership of
hardware components (owned by the context) and the pipeline (which owns
Stages). In short, a Pipeline owns Stages, a Context (currently) owns hardware.
The Pipeline's Stages make use of the components, and thus there is a lifetime
dependency generated. The components must outlive the pipeline. We could solve
this by having the Context also own the Pipeline, and not return a
unique_ptr<Pipeline>. Now that I think about it, I like that idea more.

Differential Revision: https://reviews.llvm.org/D48691

llvm-svn: 336456
2018-07-06 18:03:14 +00:00
Andrea Di Biagio
f4b2508e93 [llvm-mca] A write latency cannot be a negative value. NFC
llvm-svn: 336437
2018-07-06 13:46:10 +00:00
Andrea Di Biagio
46e908a592 [llvm-mca] improve the instruction issue logic implemented by the Scheduler.
This patch modifies the Scheduler heuristic used to select the next instruction
to issue to the pipelines.

The motivating example is test X86/BtVer2/add-sequence.s, for which llvm-mca
wrongly reported an estimated IPC of 1.50. According to perf, the actual IPC for
that test should have been ~2.00.
It turns out that an IPC of 2.00 for test add-sequence.s cannot possibly be
predicted by a Scheduler that only prioritizes instructions based on their
"age". A similar issue also affected test X86/BtVer2/dependent-pmuld-paddd.s,
for which llvm-mca wrongly estimated an IPC of 0.84 instead of an IPC of 1.00.

Instructions in the ReadyQueue are now ranked based on two factors:
 - The "age" of an instruction.
 - The number of unique users of writes associated with an instruction.

The new logic still prioritizes older instructions over younger instructions to
minimize the pressure on the reorder buffer. However, the number of users of an
instruction now also affects the overall rank. This potentially increases the
ability of the Scheduler to extract instruction level parallelism.  This patch
fixes the problem with the wrong IPC reported for test add-sequence.s and test
dependent-pmuld-paddd.s.

llvm-svn: 336420
2018-07-06 08:08:30 +00:00
Andrea Di Biagio
c51f4304c7 [llvm-mca] Fix RegisterFile debug prints. NFC
llvm-svn: 336367
2018-07-05 16:13:49 +00:00
Andrea Di Biagio
9758b05b3b [llvm-mca] Clear the content of map VariantDescriptors in InstrBuilder before we start analyzing a new CodeBlock. NFCI.
Different CodeBlocks don't overlap. The same MCInst cannot appear in more than
one code block because all blocks are instantiated before the simulation is run.

We should always clear the content of map VariantDescriptors before every
simulation, since VariantDescriptors cannot possibly store useful information
for the next blocks. It is also "safer" to clear its content because `MCInst*`
is used as the key type for map VariantDescriptors.

llvm-svn: 336142
2018-07-02 20:39:57 +00:00
Francis Visoiu Mistrih
bb4cf24f14 [MC] Error on a .zerofill directive in a non-virtual section
On darwin, all virtual sections have zerofill type, and having a
.zerofill directive in a non-virtual section is not allowed. Instead of
asserting, show a nicer error.

In order to use the equivalent of .zerofill in a non-virtual section,
the usage of .zero of .space is required.

This patch replaces the assert with an error.

Differential Revision: https://reviews.llvm.org/D48517

llvm-svn: 336127
2018-07-02 17:29:43 +00:00
Andrea Di Biagio
a5b30c5c97 [llvm-mca] Remove field HasReadAdvanceEntries from class ReadDescriptor.
This simplifies the logic that updates RAW dependencies in the DispatchStage.
There is no advantage in storing that flag in the ReadDescriptor; we should
simply rely on the call to `STI.getReadAdvanceCycles()` to obtain the
ReadAdvance cycles. If there are no read-advance entries, then method
`getReadAdvanceCycles()` quickly returns 0.

No functional change intended.

llvm-svn: 335977
2018-06-29 14:24:46 +00:00
Matt Davis
50390608b5 [llvm-mca] Delete Pipeline's copy ctor and assignement operator.
Prevent copying of the Pipeline.

llvm-svn: 335885
2018-06-28 17:33:24 +00:00
Andrea Di Biagio
e6cf14997b [llvm-mca] Use a WriteRef to describe register writes in class RegisterFile.
This patch introduces a new class named WriteRef. A WriteRef is used by the
RegisterFile to keep track of register definitions. Internally it wraps a
WriteState, as well as the source index of the defining instruction.

This patch allows the tool to propagate additional information to support future
analysis on data dependencies.

llvm-svn: 335867
2018-06-28 15:50:26 +00:00
Andrea Di Biagio
1a03456a00 [llvm-mca] Refactor method RegisterFile::collectWrites(). NFCI
Rather than calling std::find in a loop, just sort the vector and remove
duplicate entries at the end of the function.

Also, move the debug print at the end of the function, and query the
MCRegisterInfo to print register names rather than physreg IDs.

No functional change intended.

llvm-svn: 335837
2018-06-28 11:20:14 +00:00
Matt Davis
92f0c6b4ce [llvm-mca] Register listeners with stages; remove Pipeline dependency from Stage.
Summary:
This patch removes a few callbacks from Pipeline.  It comes at the cost of
registering Listeners with all Stages.  Not all stages need listeners or issue
callbacks, this registration is a bit redundant.  However, as we build-out the
API, this redundancy can disappear.

The main purpose here is to move callback code from the Pipeline and into the
stages that actually issue those callbacks. This removes the back-pointer to
the Pipeline that was put into a few Stage subclasses.

Reviewers: andreadb, courbet, RKSimon

Reviewed By: andreadb, courbet

Subscribers: tschuett, gbedwell, llvm-commits

Differential Revision: https://reviews.llvm.org/D48576

llvm-svn: 335748
2018-06-27 16:09:33 +00:00
Andrea Di Biagio
bb10807000 [llvm-mca] Avoid calling method update() on instructions that are already in the IS_READY state. NFCI
When promoting instructions from the wait queue to the ready queue, we should
check if an instruction has already reached the IS_READY state before
calling method update().

llvm-svn: 335722
2018-06-27 11:17:07 +00:00
Matt Davis
7c9f26f009 [llvm-mca] Add a comment to Stage::execute and fix a spelling error. NFC.
llvm-svn: 335697
2018-06-27 00:54:11 +00:00
Andrea Di Biagio
95b54979b5 [llvm-mca] Removed wrong NDEBUG guards introduced by my last commit.
This partially reverts r335589.

llvm-svn: 335592
2018-06-26 11:00:21 +00:00
Andrea Di Biagio
a58b32e07f [llvm-mca] Remove unused header files and correctly guard some include headers under NDEBUG. NFC
llvm-svn: 335589
2018-06-26 10:44:12 +00:00
Matt Davis
12a71b7b2e [llvm-mca] Rename Backend to Pipeline. NFC.
Summary:
This change renames the Backend and BackendPrinter to Pipeline and PipelinePrinter respectively. 
Variables and comments have also been updated to reflect this change.

The reason for this rename, is to be slightly more correct about what MCA is modeling.  MCA models a Pipeline, which implies some logical sequence of stages. 

Reviewers: andreadb, courbet, RKSimon

Reviewed By: andreadb, courbet

Subscribers: mgorny, javed.absar, tschuett, gbedwell, llvm-commits

Differential Revision: https://reviews.llvm.org/D48496

llvm-svn: 335496
2018-06-25 16:53:00 +00:00
Matt Davis
e6c725e139 [llvm-mca] Remove unnecessary include and forward decl in RCU. NFC.
The DispatchUnit is no longer a dependency of RCU, so this patch removes a
stale include and forward decl.  This patch also cleans up some comments.

llvm-svn: 335392
2018-06-22 21:35:26 +00:00
Andrea Di Biagio
fa489a9fd0 [llvm-mca] Remove redundant call. NFC
llvm-svn: 335368
2018-06-22 17:03:40 +00:00
Andrea Di Biagio
6a1ca757a7 [llvm-mca] Set the operand ID for implicit register reads/writes. NFC
Also, move the definition of InstRef at the end of Instruction.h to avoid a
forward declaration.

llvm-svn: 335363
2018-06-22 16:37:05 +00:00
Matt Davis
85ee56f371 [llvm-mca] Introduce a sequential container of Stages
Summary:
Remove explicit stages and introduce a list of stages.

A pipeline should be composed of an arbitrary list of stages, and not any
 predefined list of stages in the Backend.  The Backend should not know of any
 particular stage, rather it should only be concerned that it has a list of
 stages, and that those stages will fulfill the contract of what it means to be
 a Stage (namely pre/post/execute a given instruction).

For now, we leave the original set of stages defined in the Backend ctor;
however, I imagine these will be moved out at a later time.

This patch makes an adjustment to the semantics of Stage::isReady.
Specifically, what the Backend really needs to know is if a Stage has
unfinished work.  With that said, it is more appropriately renamed
Stage::hasWorkToComplete().  This change will clean up the check in
Backend::run(), allowing us to query each stage to see if there is unfinished
work, regardless of what subclass a stage might be.  I feel that this change
simplifies the semantics too, but that's a subjective statement.

Given how RetireStage and ExecuteStage handle data in their preExecute(), I've
had to change the order of Retire and Execute in our stage list.  Retire must
complete any of its preExecute actions before ExecuteStage's preExecute can
take control.  This is mainly because both stages utilize the RCU.  In the
meantime, I want to see if I can adjust that or remove that coupling.

Reviewers: andreadb, RKSimon, courbet

Reviewed By: andreadb

Subscribers: tschuett, gbedwell, llvm-commits

Differential Revision: https://reviews.llvm.org/D46907

llvm-svn: 335361
2018-06-22 16:17:26 +00:00
Andrea Di Biagio
371b1bd7b2 [llvm-mca] Updates comment in code, and remove some stale comments. NFC
Also, rename fields `TotalMappings` and `NumUsedMappings` in struct
RegisterMappingTracker into `NumPhysRegs` and `NumUsedPhysRegs`.

llvm-svn: 335219
2018-06-21 12:14:49 +00:00
Andrea Di Biagio
17e1fccadc [llvm-mca] use APint::operator[] to obtain the bit value. NFC
llvm-svn: 335131
2018-06-20 14:30:17 +00:00
Andrea Di Biagio
4893e095df [llvm-mca][X86] Teach how to identify register writes that implicitly clear the upper portion of a super-register.
This patch teaches llvm-mca how to identify register writes that implicitly zero
the upper portion of a super-register.

On X86-64, a general purpose register is implemented in hardware as a 64-bit
register. Quoting the Intel 64 Software Developer's Manual: "an update to the
lower 32 bits of a 64 bit integer register is architecturally defined to zero
extend the upper 32 bits".  Also, a write to an XMM register performed by an AVX
instruction implicitly zeroes the upper 128 bits of the aliasing YMM register.

This patch adds a new method named clearsSuperRegisters to the MCInstrAnalysis
interface to help identify instructions that implicitly clear the upper portion
of a super-register.  The rest of the patch teaches llvm-mca how to use that new
method to obtain the information, and update the register dependencies
accordingly.

I compared the kernels from tests clear-super-register-1.s and
clear-super-register-2.s against the output from perf on btver2.  Previously
there was a large discrepancy between the estimated IPC and the measured IPC.
Now the differences are mostly in the noise.

Differential Revision: https://reviews.llvm.org/D48225

llvm-svn: 335113
2018-06-20 10:08:11 +00:00
Matt Davis
8c00f4b3c2 [llvm-mca] Cleanup the header syntax line. Fix a comment. NFC.
This patch removes a few dashes from the header comment to make room for the syntax line.

llvm-svn: 334986
2018-06-18 21:38:38 +00:00
Andrea Di Biagio
f6e62b21be [llvm-mca] Use an ordered map to collect hardware statistics. NFC.
Histogram entries are now ordered by key.  This should improves their
readability when statistics are printed.

llvm-svn: 334961
2018-06-18 17:04:56 +00:00
Roman Lebedev
4ec9cfed2e [MCA] Add -summary-view option
Summary:
While that is indeed a quite interesting summary stat,
there are cases where it does not really add anything
other than consuming extra lines.

Declutters the output of D48190.

Reviewers: RKSimon, andreadb, courbet, craig.topper

Reviewed By: andreadb

Subscribers: javed.absar, gbedwell, llvm-commits

Differential Revision: https://reviews.llvm.org/D48209

llvm-svn: 334833
2018-06-15 14:01:43 +00:00
Matt Davis
e567892d10 [llvm-mca] Clean up the header comment. NFC.
This change removes a few dashes to make room for the header syntax string.

llvm-svn: 334770
2018-06-14 20:58:54 +00:00
Matt Davis
0594746649 [llvm-mca] Introduce the ExecuteStage (was originally the Scheduler class).
Summary: This patch transforms the Scheduler class into the ExecuteStage.  Most of the logic remains.  

Reviewers: andreadb, RKSimon, courbet

Reviewed By: andreadb

Subscribers: mgorny, javed.absar, tschuett, gbedwell, llvm-commits

Differential Revision: https://reviews.llvm.org/D47246

llvm-svn: 334679
2018-06-14 01:20:18 +00:00
Andrea Di Biagio
7cc34df875 [llvm-mca] Fixed a bug in the logic that checks if a memory operation is ready to execute.
Fixes PR37790.

In some (very rare) cases, the LSUnit (Load/Store unit) was wrongly marking a
load (or store) as "ready to execute" effectively bypassing older memory barrier
instructions.

To reproduce this bug, the memory barrier must be the first instruction in the
input assembly sequence, and it doesn't have to perform any register writes.

llvm-svn: 334633
2018-06-13 18:30:14 +00:00
Andrea Di Biagio
b297f1e738 Revert: [llvm-mca] Flush the output stream before we start the analysis of a new code region. NFC
Not sure why, but it breaks buildbot clang-cmake-armv8-full.
It causes a failure in TEST 'Xray-armhf-linux :: TestCases/Posix/profiling-single-threaded.cc'.

llvm-svn: 334617
2018-06-13 16:33:52 +00:00
Andrea Di Biagio
24185018fb [llvm-mca] Flush the output stream before we start the analysis of a new code region. NFC
llvm-svn: 334610
2018-06-13 15:43:56 +00:00
Andrea Di Biagio
0fae23f1ce [llvm-mca] Correctly update the CyclesLeft of a register read in the presence of partial register updates.
This patch fixe the logic in ReadState::cycleEvent(). That method was not
correctly updating field `TotalCycles`.

Added extra code comments in class ReadState to better describe each field.

llvm-svn: 334028
2018-06-05 17:12:02 +00:00
Andrea Di Biagio
8b4cab0d54 [RFC][patch 3/3] Add support for variant scheduling classes in llvm-mca.
This patch is the last of a sequence of three patches related to LLVM-dev RFC
"MC support for variant scheduling classes".
http://lists.llvm.org/pipermail/llvm-dev/2018-May/123181.html

This fixes PR36672.

The main goal of this patch is to teach llvm-mca how to solve variant scheduling
classes.  This patch does that, plus it adds new variant scheduling classes to
the BtVer2 scheduling model to identify so-called zero-idioms (i.e. so-called
dependency breaking instructions that are known to generate zero, and that are
optimized out in hardware at register renaming stage).

Without the BtVer2 change, this patch would not have had any meaningful tests.
This patch is effectively the union of two changes:
 1) a change that teaches llvm-mca how to resolve variant scheduling classes.
 2) a change to the BtVer2 scheduling model that allows us to special-case
    packed XOR zero-idioms (this partially fixes PR36671).

Differential Revision: https://reviews.llvm.org/D47374 

llvm-svn: 333909
2018-06-04 15:43:09 +00:00
Andrea Di Biagio
9cf08c77bf [llvm-mca] Track cycles contributed by resources that are in a 'Super' relationship.
This is required if we want to correctly match the behavior of method
SubtargetEmitter::ExpandProcResource() in Tablegen. When computing the set of
"consumed" processor resources and resource cycles, the logic in
ExpandProcResource() doesn't update the number of resource cycles contributed by
a "Super" resource to a group.  We need to take this into account when a model
declares a processor resource which is part of a 'processor resource group', and
it is also used as the "Super" of other resources.

llvm-svn: 333892
2018-06-04 12:23:07 +00:00
Andrea Di Biagio
0110f33d56 [llvm-mca] Move the logic that computes the block throughput into Support.h. NFC
This will allow us to share the logic that computes the block throughput with
other views.

llvm-svn: 333755
2018-06-01 14:35:21 +00:00
Andrea Di Biagio
93f3fc5db3 [llvm-mca] Fixed a problem caused by an invalid use of a processor resource mask in the Scheduler.
The lambda functions used by method ResourceManager::mustIssueImmediately() was
incorrectly truncating masks of buffered processor resources to 32-bit quantities.
The invalid mask values were then used to access a map of processor
resource descriptors.

Fixes PR37643.

llvm-svn: 333692
2018-05-31 20:27:46 +00:00
Matt Davis
d559fd8c73 [llvm-mca] Update the header's guard name. NFC.
This patch also places a comment at the end of the header guard.

llvm-svn: 333297
2018-05-25 18:45:43 +00:00
Matt Davis
3be2788247 [llvm-mca] Update DispatchStage header comment. NFC.
Updated the comment to be a wee bit more descriptive.

llvm-svn: 333296
2018-05-25 18:31:28 +00:00
Matt Davis
61885d41df [llvm-mca] Add the RetireStage.
Summary:
This class maintains the same logic as the original RetireControlUnit.

This is just an intermediate patch to make the RCU a Stage.  Future patches will remove the dependency on the DispatchStage, and then more properly populate the pre/execute/post Stage interface.  

Reviewers: andreadb, RKSimon, courbet

Reviewed By: andreadb, courbet

Subscribers: javed.absar, mgorny, tschuett, gbedwell, llvm-commits

Differential Revision: https://reviews.llvm.org/D47244

llvm-svn: 333292
2018-05-25 18:00:25 +00:00
Andrea Di Biagio
2aa4f29a7e [llvm-mca] Fix a rounding problem in SummaryView.cpp exposed by r333204.
Before printing the block reciprocal throughput, ensure that the floating point
number is always rounded the same way on every target.
No functional change intended.

llvm-svn: 333210
2018-05-24 17:22:14 +00:00
Matt Davis
4c2c0e5a34 [llvm-mca] Fix header comments. NFC.
llvm-svn: 333096
2018-05-23 16:15:06 +00:00
Andrea Di Biagio
d476292a08 [llvm-mca] Print the "Block RThroughput" in the SummaryView.
This patch implements the "block reciprocal throughput" computation in the
SummaryView.

The block reciprocal throughput is computed as the MAX of:
  - NumMicroOps / DispatchWidth
  - Resource Cycles / #Units   (for every resource consumed).

The block throughput is bounded from above by the hardware dispatch throughput.
That is because the DispatchWidth is an upper bound on how many opcodes can be part
of a single dispatch group.

The block throughput is also limited by the amount of hardware parallelism. The
number of available resource units affects how the resource pressure is
distributed, and also how many blocks can be delivered every cycle.

llvm-svn: 333095
2018-05-23 15:59:27 +00:00
Matt Davis
127f57c750 [llvm-mca] Move DispatchStage::cycleEvent to preExecute. NFC.
Summary:
This is an intermediate change, it moves the non-notification logic from
Backend::notifyCycleBegin to runCycle().
    
Once the scheduler becomes part of the Execution stage
the explicit call to Scheduler::cycleEvent will disappear.
    
The logic for Dispatch::cycleEvent() can be in
the preExecute phase, which this patch addresses.


Reviewers: andreadb, RKSimon, courbet

Reviewed By: andreadb

Subscribers: tschuett, gbedwell, llvm-commits

Differential Revision: https://reviews.llvm.org/D47213

llvm-svn: 333029
2018-05-22 20:51:58 +00:00
Andrea Di Biagio
36a508f010 [llvm-mca] Removed an empty line generated by the timeline view. NFC.
Also, regenerate all tests.

llvm-svn: 332853
2018-05-21 17:11:56 +00:00
Matt Davis
d16c1af4bc [llvm-mca] Make Dispatch a subclass of Stage.
Summary:
The logic of dispatch remains the same, but now DispatchUnit is a Stage (DispatchStage).

This change has the benefit of simplifying the backend runCycle() code.
The same logic applies, but it belongs to different components now.  This is just a start,
eventually we will need to remove the call to the DispatchStage in Scheduler.cpp, but
that will be a separate patch.  This change is mostly a renaming and moving of existing logic.

This change also encouraged me to remove the Subtarget (STI) member from the
Backend class.  That member was used to initialize the other members of Backend
and to eventually call DispatchUnit::dispatch().  Now that we have Stages, we
can eliminate this by instantiating the DispatchStage with everything it needs
at the time of construction (e.g., Subtarget).  That change allows us to call
DispatchStage::execute(IR) as we expect to call execute() for all other stages.

Once we add the Stage list (D46907) we can more cleanly call preExecute() on
all of the stages, DispatchStage, will probably wrap cycleEvent() in that
case.

Made some formatting and minor cleanups to README.txt.  Some of the text
was re-flowed to stay within 80 cols.


Reviewers: andreadb, courbet, RKSimon

Reviewed By: andreadb, courbet

Subscribers: mgorny, javed.absar, tschuett, gbedwell, llvm-commits

Differential Revision: https://reviews.llvm.org/D46983

llvm-svn: 332652
2018-05-17 19:22:29 +00:00
Andrea Di Biagio
f5e3a66963 [llvm-mca] Hide unrelated flags from the -help output.
llvm-svn: 332615
2018-05-17 15:35:14 +00:00
Andrea Di Biagio
8bbb45c175 [llvm-mca] add flag -all-views and flag -all-stats.
Flag -all-views enables all the views.
Flag -all-stats enables all the views that print hardware statistics.

llvm-svn: 332602
2018-05-17 12:27:03 +00:00
Matt Davis
78d832622d [llvm-mca] Move the RegisterFile class into its own translation unit. NFC
Summary: This change will help us turn the DispatchUnit into its own stage.

Reviewers: andreadb, RKSimon, courbet

Reviewed By: andreadb, courbet

Subscribers: mgorny, tschuett, gbedwell, llvm-commits

Differential Revision: https://reviews.llvm.org/D46916

llvm-svn: 332493
2018-05-16 17:07:08 +00:00
Andrea Di Biagio
5b328a4e1c [llvm-mca] Move definitions in FetchStage.cpp inside namespace mca. NFC
Also, get rid of a redundant include in FetchStage.h and FetchStage.cpp.

llvm-svn: 332468
2018-05-16 13:38:17 +00:00
Andrea Di Biagio
857c92f90f [llvm-mca] Fix perf regression after r332390.
Revision 332390 introduced a FetchStage class in llvm-mca.
By design, FetchStage owns all the instructions in-flight in the OoO Backend.

Before this change, new instructions were added to a DenseMap indexed by
instruction id. The problem with using a DenseMap is that elements are not
ordered by key. This was causing a massive slow down in method
FetchStage::postExecute(), which searches for instructions retired that can be
deleted.

This patch replaces the DenseMap with a std::map ordered by instruction index.
At the end of every cycle, we search for the first instruction which is not
marked as "retired", and we remove all the previous instructions before it.
This works well because instructions are retired in-order.

Before this patch, a debug build of llvm-mca (on my Ryzen linux machine) took
~8.0 seconds to simulate 3000 iterations of a x86 dot-product (a `vmulps,
vpermilps, vaddps, vpermilps, vaddps` sequence). With this patch, it now takes
~0.8s to run all the 3000 iterations.

llvm-svn: 332461
2018-05-16 12:33:09 +00:00
Andrea Di Biagio
38049fdba1 [llvm-mca] Remove redundant includes in Stage.h.
This patch also makes Stage::isReady() a const method.

No functional change.

llvm-svn: 332443
2018-05-16 09:24:38 +00:00
Matt Davis
efffaf5cb4 [llvm-mca] Introduce a pipeline Stage class and FetchStage.
Summary:
    This is just an idea, really two ideas.  I expect some push-back,
    but I realize that posting a diff is the most comprehensive way to express
    these concepts.

    This patch introduces a Stage class which represents the
    various stages of an instruction pipeline.  As a start, I have created a simple
    FetchStage that is based on existing logic for how MCA produces
    instructions, but now encapsulated in a Stage.  The idea should become more concrete
    once we introduce additional stages.  The idea being, that when a stage completes,
    the next stage in the pipeline will be executed.  Stages are chained together
    as a singly linked list to closely model a real pipeline. For now there is only one stage,
    so the stage-to-stage flow of instructions isn't immediately obvious.

    Eventually, Stage will also handle event notifications, but that functionality
    is not complete, and not destined for this patch.  Ideally, an interested party 
    can register for notifications from a particular stage.  Callbacks will be issued to
    these listeners at various points in the execution of the stage.  
    For now, eventing functionality remains similar to what it has been in mca::Backend. 
    We will be building-up the Stage class as we move on, such as adding debug output.

    This patch also removes the unique_ptr<Instruction> return value from
    InstrBuilder::createInstruction.  An Instruction pointer is still produced,
    but now it's up to the caller to decide how that item should be managed post-allocation
    (e.g., smart pointer).  This allows the Fetch stage to create instructions and
    manage the lifetime of those instructions as it wishes, and not have to be bound to any
    specific managed pointer type.  Other callers of createInstruction might have different 
    requirements, and thus can manage the pointer to fit their needs.  Another idea would be to push the
   ownership to the RCU. 

    Currently, the FetchStage will wrap the Instruction
    pointer in a shared_ptr.  This allows us to remove the Instruction container in
    Backend, which was probably going to disappear, or move, at some point anyways.
    Note that I did run these changes through valgrind, to make sure we are not leaking
    memory.  While the shared_ptr comes with some additional overhead it relieves us
    from having to manage a list of generated instructions, and/or make lookup calls
    to remove the instructions. 

    I realize that both the Stage class and the Instruction pointer management
    (mentioned directly above) are separate but related ideas, and probably should
    land as separate patches; I am happy to do that if either idea is decent.
    The main reason these two ideas are together is that
    Stage::execute() can mutate an InstRef. For the fetch stage, the InstRef is populated
    as the primary action of that stage (execute()).  I didn't want to change the Stage interface
    to support the idea of generating an instruction.  Ideally, instructions are to
    be pushed through the pipeline.  I didn't want to draw too much of a
    specialization just for the fetch stage.  Excuse the word-salad.

Reviewers: andreadb, courbet, RKSimon

Reviewed By: andreadb

Subscribers: llvm-commits, mgorny, javed.absar, tschuett, gbedwell

Differential Revision: https://reviews.llvm.org/D46741

llvm-svn: 332390
2018-05-15 20:21:04 +00:00
Andrea Di Biagio
5c841698f9 [llvm-mca] use a formatted_raw_ostream to insert padding and get rid of tabs. NFC
llvm-svn: 332381
2018-05-15 18:11:45 +00:00
Andrea Di Biagio
09c9c3e7d1 [llvm-mca] Strip leading tabs and spaces from instruction strings before printing. NFC
llvm-svn: 332361
2018-05-15 15:18:05 +00:00
Andrea Di Biagio
ee73b51886 [llvm-mca] Remove unused include header files. NFC
Also, run clang-format on RetireControlUnit.cpp.

llvm-svn: 332337
2018-05-15 10:30:39 +00:00
Andrea Di Biagio
e5c03fa0a4 [llvm-mca] Add file header to RetireControlUnit.cpp.
Strictly speaking, this is not necessary for .cpp files. However, other .cpp
files from this same tool have it. This also matches what we do in other tools.

llvm-svn: 332334
2018-05-15 09:31:32 +00:00
Andrea Di Biagio
c6b21b2d69 [llvm-mca] Improved support for dependency-breaking instructions.
The tool assumes that a zero-latency instruction that doesn't consume hardware
resources is an optimizable dependency-breaking instruction. That means, it
doesn't have to wait on register input operands, and it doesn't consume any
physical register. The PRF knows how to optimize it at register renaming stage.

llvm-svn: 332249
2018-05-14 15:08:22 +00:00
Nicola Zaghen
9667127c14 Rename DEBUG macro to LLVM_DEBUG.
The DEBUG() macro is very generic so it might clash with other projects.
The renaming was done as follows:
- git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g'
- git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM
- Manual change to APInt
- Manually chage DOCS as regex doesn't match it.

In the transition period the DEBUG() macro is still present and aliased
to the LLVM_DEBUG() one.

Differential Revision: https://reviews.llvm.org/D43624

llvm-svn: 332240
2018-05-14 12:53:11 +00:00
David Blaikie
3c91607119 Move standard library inclusions to after internal inclusions.
llvm-svn: 332124
2018-05-11 19:21:40 +00:00
David Blaikie
08c63c1dfe llvm-mca: Add missing includes
Move the header include in the primary source file to the top to
validate that it doesn't depend on any other inclusions.

llvm-svn: 331897
2018-05-09 17:28:10 +00:00
Matt Davis
3e43e69679 [llvm-mca] Avoid exposing index values in the MCA interfaces.
Summary:
This patch eliminates many places where we originally needed to  pass index
values to represent an instruction.  The index is still used as a key, in various parts of 
MCA.  I'm  not comfortable eliminating the index just yet.    By burying the index in
the instruction, we can avoid exposing that value in many places.

Eventually, we should consider removing the Instructions list in the Backend 
all together,   it's only used to hold and reclaim the memory for the allocated 
Instruction instances.  Instead we could pass around a smart pointer.  But that's
a separate discussion/patch.

Reviewers: andreadb, courbet, RKSimon

Reviewed By: andreadb

Subscribers: javed.absar, tschuett, gbedwell, llvm-commits

Differential Revision: https://reviews.llvm.org/D46367

llvm-svn: 331660
2018-05-07 18:29:15 +00:00
Andrea Di Biagio
c6d4a11d75 [llvm-mca] removes flag -instruction-tables from the "View Options" category.
This patch also improves the description of a couple of flags in the view
options. With this change, the -help now specifies which views are enabled by
default.

llvm-svn: 331594
2018-05-05 15:36:47 +00:00
Andrea Di Biagio
bf53c91505 [llvm-mca] minor tweak to the resource pressure printing functionality. NFC.
llvm-svn: 331590
2018-05-05 12:21:54 +00:00
Matt Davis
1b4230edab [llvm-mca] Add descriptive names for the TimelineView report characters. NFC.
Summary:
This change makes the TimelineView source simpler to read and easier to modify in the future.
This patch introduces a class of static chars used as the display values in the TimelineView report, this change just eliminates a few magic characters.

Reviewers: andreadb, courbet, RKSimon

Reviewed By: andreadb

Subscribers: tschuett, gbedwell, llvm-commits

Differential Revision: https://reviews.llvm.org/D46409

llvm-svn: 331540
2018-05-04 17:19:40 +00:00
Andrea Di Biagio
62309fcfae [llvm-mca] use colors for warnings and notes generated by InstrBuilder.
llvm-svn: 331517
2018-05-04 13:52:12 +00:00
Andrea Di Biagio
433c861c97 [llvm-mca] remove unused argument from method InstrBuilder::createInstrDescImpl.
We don't need to pass the instruction index to the method that constructs new
instruction descriptors.

No functional change intended.

llvm-svn: 331516
2018-05-04 13:10:10 +00:00
Matt Davis
d3d52a368f [llvm-mca] Lift the logic of the RetireControlUnit from the Dispatch translation unit into its own translation unit. NFC
The logic remains the same.  Eventually, I see the RCU acting as its own separate stage in the instruction pipeline.

Differential Revision: https://reviews.llvm.org/D46331

llvm-svn: 331316
2018-05-01 23:04:01 +00:00
Adrian Prantl
076a6683eb Remove \brief commands from doxygen comments.
We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers into our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.

Patch produced by

  for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done

Differential Revision: https://reviews.llvm.org/D46290

llvm-svn: 331272
2018-05-01 15:54:18 +00:00
Andrea Di Biagio
ce90d0cc15 [llvm-mca] Correctly handle zero-latency stores that consume pipeline resources.
This fixes PR37293.

We can have scheduling classes with no write latency entries, that still consume
processor resources. We don't want to treat those instructions as zero-latency
instructions; they still have to be issued to the underlying pipelines, so they
still consume resource cycles.

This is likely to be a regression which I have accidentally introduced at
revision 330807. Now, if an instruction has a non-empty set of write processor
resources, we conservatively treat it as a normal (i.e. non zero-latency)
instruction.

llvm-svn: 331193
2018-04-30 15:55:04 +00:00
Andrea Di Biagio
db2c85fda5 [llvm-mca] Support for in-order CPU for -instruction-tables testing.
Added Intel Atom tests to verify that the tool correctly generates instruction
tables even if the CPU is in-order.

Fixes PR37282.

llvm-svn: 331169
2018-04-30 12:05:34 +00:00
Matt Davis
58a43d9e66 [MCA] [NFC] Remove unused Index formal from ResourceManager::issueInstruction
Summary: The instruction index was never referenced in the body.  Just a minor cleanup.

Reviewers: andreadb

Reviewed By: andreadb

Subscribers: javed.absar, gbedwell, llvm-commits

Differential Revision: https://reviews.llvm.org/D46142

llvm-svn: 331001
2018-04-26 22:30:40 +00:00
Filipe Cabecinhas
2b2e443153 [llvm-mca] Make ViewOptions static. NFCI
llvm-svn: 330829
2018-04-25 14:39:16 +00:00
Andrea Di Biagio
25c9a42543 [llvm-mca] Add a new option category for views.
With this patch, options to add/tweak views are all grouped together in the
-help output.

The new "View Options" category looks like this:

```
  View Options:

    -dispatch-stats                 - Print dispatch statistics
    -instruction-info               - Print the instruction info view
    -instruction-tables             - Print instruction tables
    -register-file-stats            - Print register file statistics
    -resource-pressure              - Print the resource pressure view
    -retire-stats                   - Print retire control unit statistics
    -scheduler-stats                - Print scheduler statistics
    -timeline                       - Print the timeline view
    -timeline-max-cycles=<uint>     - Maximum number of cycles in the timeline view. Defaults to 80 cycles
    -timeline-max-iterations=<uint> - Maximum number of iterations to print in timeline view
```

llvm-svn: 330816
2018-04-25 11:33:14 +00:00
Andrea Di Biagio
f2ed62e68e [llvm-mca] run clang-format on a bunch of files. NFC
llvm-svn: 330811
2018-04-25 10:27:30 +00:00
Andrea Di Biagio
517a29f95b [llvm-mca] Default to the native host cpu if flag -mcpu is not specified.
llvm-svn: 330809
2018-04-25 10:18:25 +00:00
Andrea Di Biagio
f16ca8a8c6 [llvm-mca] Remove method Instruction::isZeroLatency(). NFCI
llvm-svn: 330807
2018-04-25 09:38:58 +00:00
Andrea Di Biagio
11167dc4b6 [llvm-mca] Remove unused flag -verbose. NFC
I forgot to remove it at r329794.

llvm-svn: 330757
2018-04-24 19:14:56 +00:00
Andrea Di Biagio
a416598970 [llvm-mca] Default the output asm dialect used by the instruction printer to the input asm dialect.
The instruction printer used by llvm-mca to generate the performance report now
defaults the output assembly format to the format used for the input assembly
file.

On x86, the asm format can be either AT&T or Intel, depending on the
presence/absence of directive `.intel_syntax`.

Users can still specify a different assembly dialect with the command line flag
-output-asm-variant=<uint>.

llvm-svn: 330733
2018-04-24 16:19:08 +00:00
Andrea Di Biagio
15cc258b1e [llvm-mca] Refactor the Scheduler interface in preparation for PR36663.
Zero latency instructions are now scheduled the same way as other instructions.
Before this patch, there was a specialzed code path for those instructions.

All scheduler events are now generated from method `scheduleInstruction()` and
from method `cycleEvent()`. This will make easier to implement a "execution
stage", and let that stage publish all the scheduler events.

No functional change intended.

llvm-svn: 330723
2018-04-24 14:53:16 +00:00
Jonas Devlieghere
339d1c2aa6 [llvm-mca] Use WithColor for printing errors
Use convenience helpers in WithColor to print errors and notes.

Differential revision: https://reviews.llvm.org/D45666

llvm-svn: 330267
2018-04-18 15:26:51 +00:00
Rui Ueyama
04fb9911c5 Define InitLLVM to do common initialization all at once.
We have a few functions that virtually all command wants to run on
process startup/shutdown. This patch adds InitLLVM class to do that
all at once, so that we don't need to copy-n-paste boilerplate code
to each llvm command's main() function.

Differential Revision: https://reviews.llvm.org/D45602

llvm-svn: 330046
2018-04-13 18:26:06 +00:00
Andrea Di Biagio
8de05aaa85 [llvm-mca] Ensure that instructions with a schedule read-advance are always issued in the right order.
Normally, the Scheduler prioritizes older instructions over younger instructions
during the instruction issue stage. In one particular case where a dependent
instruction had a schedule read-advance associated to one of the input operands,
this rule was not correctly applied.

This patch fixes the issue and adds a test to verify that we don't regress that
particular case.

llvm-svn: 330032
2018-04-13 15:19:07 +00:00
Andrea Di Biagio
b8c627c79e [llvm-mca] Removed unused argument from cycleEvent. NFC
llvm-svn: 329895
2018-04-12 10:49:40 +00:00
Andrea Di Biagio
aa4e3e3fd7 [llvm-mca] Let the Scheduler notify dispatch stall events caused by the lack of scheduling resources.
This patch moves part of the logic that notifies dispatch stall events from the
DispatchUnit to the Scheduler.

The main goal of this patch is to remove (yet another) dependency between the
DispatchUnit and the Scheduler. Before this patch, the DispatchUnit had to know
about `Scheduler::Event` and how to classify stalls due to the lack of scheduling
resources. This patch removes that knowledge and simplifies the logic in
DispatchUnit::checkScheduler.

This is another change done in preparation for the work to fix PR36663.

No functional change intended.

llvm-svn: 329835
2018-04-11 18:05:23 +00:00
Andrea Di Biagio
5d1dd9a325 Revert "[llvm-mca][CMake] Remove unused libraries from set LLVM_LINK_COMPONENTS"
It caused a buildbot failure (clang-ppc64le-linux-multistage - build #6424)

llvm-svn: 329812
2018-04-11 14:35:23 +00:00
Andrea Di Biagio
9b07b84d73 [llvm-mca][CMake] Remove unused libraries from set LLVM_LINK_COMPONENTS.
llvm-svn: 329807
2018-04-11 13:52:42 +00:00
Andrea Di Biagio
74100cfc77 [llvm-mca] Minor code cleanup. NFC
llvm-svn: 329796
2018-04-11 12:31:44 +00:00
Andrea Di Biagio
d9344b8000 [llvm-mca] Renamed BackendStatistics to RetireControlUnitStatistics.
Also, removed flag -verbose in favor of flag -retire-stats.

llvm-svn: 329794
2018-04-11 12:12:53 +00:00
Andrea Di Biagio
b46fe1126f [llvm-mca] Move the logic that prints scheduler statistics from BackendStatistics to its own view.
Added flag -scheduler-stats to print scheduler related statistics.

llvm-svn: 329792
2018-04-11 11:37:46 +00:00
Andrea Di Biagio
3dcc1b8616 [llvm-mca] Simplify code. NFC
llvm-svn: 329711
2018-04-10 15:14:15 +00:00
Andrea Di Biagio
7af89ed924 [llvm-mca] Move the logic that prints dispatch unit statistics from BackendStatistics to its own view.
This patch moves the logic that collects and analyzes dispatch events to the
DispatchStatistics view.

Added flag -dispatch-stats to print statistics related to the dispatch logic.

llvm-svn: 329708
2018-04-10 14:55:14 +00:00
Andrea Di Biagio
af24ba4a16 [llvm-mca] Increase the default number of iterations to 100.
llvm-svn: 329694
2018-04-10 12:50:03 +00:00
Andrea Di Biagio
ae2973804a Reapply "[llvm-mca] Do not separate iterations with a newline in the timeline view."
This reapplies r329403 with a fix for the floating point rounding issue.

llvm-svn: 329680
2018-04-10 09:55:33 +00:00
Fangrui Song
2cd5141737 [llvm-mca] Fix MCACommentConsumer
llvm-svn: 329592
2018-04-09 17:06:57 +00:00
Andrea Di Biagio
66e474bb74 [llvm-mca] Add the ability to mark regions of code for analysis (PR36875)
This patch teaches llvm-mca how to parse code comments in search for special
"markers" used to select regions of code.

Example:

# LLVM-MCA-BEGIN My Code Region
  ....
# LLVM-MCA-END

The MCAsmLexer now delegates to an object of class MCACommentParser (i.e. an
AsmCommentConsumer) the parsing of code comments to search for begin/end code
region markers.

A comment starting with substring "LLVM-MCA-BEGIN" marks the beginning of a new
region of code.  A comment starting with substring "LLVM-MCA-END" marks the end
of the last region.

This implementation doesn't allow regions to overlap. Each region can have a
optional description; internally, each region is identified by a range of source
code locations (SMLoc).

MCInst objects are added to a region R only if the source location for the
MCInst is in the range of locations specified by R.

By default, the tool allocates an implicit "Default" code region which contains
every source location.  See new tests llvm-mca-marker-*.s for a few examples.

A new Backend object is created for every region. So, the analysis is conducted
on every parsed code region.  The final report is the union of the reports
generated for every code region.  Note that empty regions are skipped.

Special "[#] Code Region - ..." strings are used in the report to mark the
portion which is specific to a code region only. For example, see
llvm-mca-markers-5.s.

Differential Revision: https://reviews.llvm.org/D45433

llvm-svn: 329590
2018-04-09 16:39:52 +00:00
Hans Wennborg
5413d62d98 Revert r329403 "[llvm-mca] Do not separate iterations with a newline in the timeline view."
This made AArch64/CortexA57/direct-branch.s fail on Windows, e.g.
http://lab.llvm.org:8011/builders/clang-x86-windows-msvc2015/builds/11251

> Also, update a few tests to minimize the diff in D45369.
> No functional change intended.

llvm-svn: 329569
2018-04-09 13:53:41 +00:00
Andrea Di Biagio
6e48c08339 [llvm-mca] Simplify code. NFC
llvm-svn: 329532
2018-04-08 15:10:19 +00:00
Andrea Di Biagio
917f5d5fcd [llvm-mca] Do not separate iterations with a newline in the timeline view.
Also, update a few tests to minimize the diff in D45369.
No functional change intended.

llvm-svn: 329403
2018-04-06 15:30:02 +00:00
Andrea Di Biagio
ee5b4f2814 [documentation][llvm-mca] Update the documentation.
Scheduling models can now describe processor register files and retire control
units. This updates the existing documentation and the README file.

llvm-svn: 329311
2018-04-05 16:42:32 +00:00
Andrea Di Biagio
01d033c489 [MC][Tablegen] Allow models to describe the retire control unit for llvm-mca.
This patch adds the ability to describe properties of the hardware retire
control unit.

Tablegen class RetireControlUnit has been added for this purpose (see
TargetSchedule.td).

A RetireControlUnit specifies the size of the reorder buffer, as well as the
maximum number of opcodes that can be retired every cycle.

A zero (or negative) value for the reorder buffer size means: "the size is
unknown". If the size is unknown, then llvm-mca defaults it to the value of
field SchedMachineModel::MicroOpBufferSize.  A zero or negative number of
opcodes retired per cycle means: "there is no restriction on the number of
instructions that can be retired every cycle".

Models can optionally specify an instance of RetireControlUnit. There can only
be up-to one RetireControlUnit definition per scheduling model.

Information related to the RCU (RetireControlUnit) is stored in (two new fields
of) MCExtraProcessorInfo.  llvm-mca loads that information when it initializes
the DispatchUnit / RetireControlUnit (see Dispatch.h/Dispatch.cpp).

This patch fixes PR36661.

Differential Revision: https://reviews.llvm.org/D45259

llvm-svn: 329304
2018-04-05 15:41:41 +00:00
Andrea Di Biagio
56f9bc1f61 [llvm-mca] Remove flag -max-retire-per-cycle, and update the docs.
This is done in preparation for D45259.
With D45259, models can specify the size of the reorder buffer, and the retire
throughput directly via tablegen.

llvm-svn: 329274
2018-04-05 11:36:50 +00:00
Andrea Di Biagio
e2bef9a902 [llvm-mca] Move the logic that prints register file statistics to its own view. NFCI
Before this patch, the "BackendStatistics" view was responsible for printing the
register file usage (as well as many other statistics).

Now users can enable register file usage statistics using the command line flag
`-register-file-stats`. By default, the tool doesn't print register file
statistics.

llvm-svn: 329083
2018-04-03 16:46:23 +00:00
Andrea Di Biagio
ebedc80ebe [llvm-mca] Remove redundant include from BackendStatistics.h. NFC
Also use llvm::DenseMap for Histograms (instead of std::map).

llvm-svn: 329074
2018-04-03 15:36:15 +00:00
Andrea Di Biagio
f425ba9f0f [MC][Tablegen] Allow the definition of processor register files in the scheduling model for llvm-mca
This patch allows the description of register files in processor scheduling
models. This addresses PR36662.

A new tablegen class named 'RegisterFile' has been added to TargetSchedule.td.
Targets can optionally describe register files for their processors using that
class. In particular, class RegisterFile allows to specify:
 - The total number of physical registers.
 - Which target registers are accessible through the register file.
 - The cost of allocating a register at register renaming stage.

Example (from this patch - see file X86/X86ScheduleBtVer2.td)

  def FpuPRF : RegisterFile<72, [VR64, VR128, VR256], [1, 1, 2]>

Here, FpuPRF describes a register file for MMX/XMM/YMM registers. On Jaguar
(btver2), a YMM register definition consumes 2 physical registers, while MMX/XMM
register definitions only cost 1 physical register.

The syntax allows to specify an empty set of register classes.  An empty set of
register classes means: this register file models all the registers specified by
the Target.  For each register class, users can specify an optional register
cost. By default, register costs default to 1.  A value of 0 for the number of
physical registers means: "this register file has an unbounded number of
physical registers".

This patch is structured in two parts.

* Part 1 - MC/Tablegen *

A first part adds the tablegen definition of RegisterFile, and teaches the
SubtargetEmitter how to emit information related to register files.

Information about register files is accessible through an instance of
MCExtraProcessorInfo.
The idea behind this design is to logically partition the processor description
which is only used by external tools (like llvm-mca) from the processor
information used by the llvm machine schedulers.
I think that this design would make easier for targets to get rid of the extra
processor information if they don't want it.

* Part 2 - llvm-mca related *

The second part of this patch is related to changes to llvm-mca.

The main differences are:
 1) class RegisterFile now needs to take into account the "cost of a register"
when allocating physical registers at register renaming stage.
 2) Point 1. triggered a minor refactoring which lef to the removal of the
"maximum 32 register files" restriction.
 3) The BackendStatistics view has been updated so that we can print out extra
details related to each register file implemented by the processor.

The effect of point 3. is also visible in tests register-files-[1..5].s.

Differential Revision: https://reviews.llvm.org/D44980

llvm-svn: 329067
2018-04-03 13:36:24 +00:00
Andrea Di Biagio
b5a92f83f3 [llvm-mca] Do not assume that implicit reads cannot be associated with ReadAdvance entries.
Before, the instruction builder incorrectly assumed that only explicit reads
could have been associated with ReadAdvance entries.
This patch fixes the issue and adds a test to verify it.

llvm-svn: 328972
2018-04-02 13:46:49 +00:00
Mandeep Singh Grang
3cf1a994cb [tools] Change std::sort to llvm::sort in response to r327219
Summary:
r327219 added wrappers to std::sort which randomly shuffle the container before sorting.
This will help in uncovering non-determinism caused due to undefined sorting
order of objects having the same key.

To make use of that infrastructure we need to invoke llvm::sort instead of std::sort.

Note: This patch is one of a series of patches to replace *all* std::sort to llvm::sort.
Refer the comments section in D44363 for a list of all the required patches.

Reviewers: JDevlieghere, zturner, echristo, dberris, friss

Reviewed By: echristo

Subscribers: gbedwell, llvm-commits

Differential Revision: https://reviews.llvm.org/D45141

llvm-svn: 328943
2018-04-01 21:24:53 +00:00
Andrea Di Biagio
46429367d5 [llvm-mca] Correctly set the ReadAdvance information for register use operands.
The tool was passing the wrong operand index to method
MCSubtargetInfo::getReadAdvanceCycles(). That method requires a "UseIdx", and
not the operand index. This was found when testing X86 code where instructions
had a memory folded operand.

This patch fixes the issue and adds test read-advance-1.s to ensure that
the ReadAfterLd (a ReadAdvance of 3cy) information is correctly used.

llvm-svn: 328790
2018-03-29 14:26:56 +00:00
Andrea Di Biagio
85c1e6c6b2 [llvm-mca] pass the correct set of used registers in checkRAT.
We were incorrectly initializing the array of used registers in method checkRAT.
As a consequence, the number of register file stalls was misreported.

Added a test to cover this case.

llvm-svn: 328629
2018-03-27 15:23:41 +00:00
Andrea Di Biagio
bfa0362dd4 [llvm-mca] Fix how views are added to the InstructionTables.
This should fix the stack-use-after-scope reported by the asan buildbots after
revision 328493.

llvm-svn: 328499
2018-03-26 14:25:52 +00:00
Andrea Di Biagio
49f7508452 [llvm-mca] Add a flag -instruction-info to enable/disable the instruction info view.
llvm-svn: 328493
2018-03-26 13:44:54 +00:00
Andrea Di Biagio
a309a8b1e0 [llvm-mca] Add flag -instruction-tables to print the theoretical resource pressure distribution for instructions (PR36874)
The goal of this patch is to address most of PR36874.  To fully fix PR36874 we
need to split the "InstructionInfo" view from the "SummaryView". That would make
easy to check the latency and rthroughput as well.

The patch reuses all the logic from ResourcePressureView to print out the
"instruction tables".

We have an entry for every instruction in the input sequence. Each entry reports
the theoretical resource pressure distribution. Resource pressure is uniformly
distributed across all the processor resource units of a group.

At the moment, the backend pipeline is not configurable, so the only way to fix
this is by creating a different driver that simply sends instruction events to
the resource pressure view.  That means, we don't use the Backend interface.
Instead, it is simpler to just have a different code-path for when flag
-instruction-tables is specified.

Once Clement addresses bug 36663, then we can port the "instruction tables"
logic into a stage of our configurable pipeline.

Updated the BtVer2 test cases (thanks Simon for the help). Now we pass flag
-instruction-tables to each modified test.

Differential Revision: https://reviews.llvm.org/D44839

llvm-svn: 328487
2018-03-26 12:04:53 +00:00
Andrea Di Biagio
2b25cb95ef [llvm-mca] run clang-format on all files.
This also addresses Simon's review comment in D44839.

llvm-svn: 328428
2018-03-24 16:05:36 +00:00
Andrea Di Biagio
9fb8bf4a44 [llvm-mca] Remove unused field in InstrBuilder. NFC
llvm-svn: 328427
2018-03-24 15:48:25 +00:00
Andrea Di Biagio
90cef4c604 [llvm-mca] Split the InstructionInfoView from the SummaryView.
llvm-svn: 328358
2018-03-23 19:40:04 +00:00
Andrea Di Biagio
2c0ac399a9 [llvm-mca] update the ResourcePressureView after r328335. NFC.
This should have been part of r328335. I forgot to svn add these files.

llvm-svn: 328340
2018-03-23 17:53:02 +00:00
Andrea Di Biagio
a63fa9f9be [llvm-mca] Make the resource cost a double.
This is done in preparation for the fix for PR36874.
The number of cycles consumed for each pipe is now a double quantity. This
allows reuse of the resource pressure view to print out instruction tables.

llvm-svn: 328335
2018-03-23 17:36:07 +00:00
Andrea Di Biagio
2106090194 [llvm-mca] Pass the InstrBuilder to the constructor of Backend.
This is done in preparation for the fix for PR36784.
No functional change.

llvm-svn: 328306
2018-03-23 11:50:43 +00:00
Andrea Di Biagio
bad7a877bc [llvm-mca] Add flag -resource-pressure to enable/disable printing of the resource pressure view.
By default, the tool always enables the resource pressure view.
This flag lets user specify whether they want to add that view or not.

llvm-svn: 328305
2018-03-23 11:33:09 +00:00
Andrea Di Biagio
739c509f1d [llvm-mca] Minor refactoring. NFCI
Also, removed a couple of unused methods from class Instruction.

llvm-svn: 328198
2018-03-22 14:14:49 +00:00
Andrea Di Biagio
660fc0f64f [llvm-mca] Simplify (and better standardize) the Instruction interface.
llvm-svn: 328190
2018-03-22 11:39:34 +00:00
Andrea Di Biagio
92f9994efa [llvm-mca] Simplify code. NFC
llvm-svn: 328187
2018-03-22 10:19:20 +00:00
Andrea Di Biagio
8050b0610b [llvm-mca] Move the logic that computes the register file usage to the BackendStatistics view.
With this patch, the "instruction dispatched" event now provides information
related to the number of microarchitectural registers used in each register
file. Similarly, the "instruction retired" event is now able to tell how may
registers are freed in each register file.

Currently, the BackendStatistics view is the only consumer of register
usage/pressure information. BackendStatistics uses that info to print out a few
general statistics (i.e. max number of mappings used; total mapping created).
Before this patch, the BackendStatistics was forced to query the Backend to
obtain the register pressure information.

This helps removes that dependency. Now views are completely independent from
the Backend.  As a consequence, it should be easier to address PR36663 and
further modularize the pipeline.

Added a couple of test cases in the BtVer2 specific directory.

llvm-svn: 328129
2018-03-21 18:11:05 +00:00
Andrea Di Biagio
f0b727c01d [llvm-mca] Clean up some code. NFC
Removed a couple of methods from DispatchUnit.

llvm-svn: 328094
2018-03-21 12:49:07 +00:00
Andrea Di Biagio
7c7bcf05c6 [llvm-mca] add keyword override to a couple of methods in BackendStatistics.
This should fix the buildbots after r328011.

llvm-svn: 328029
2018-03-20 20:18:36 +00:00
Andrea Di Biagio
e4e0a138c6 [llvm-mca] Remove const from a bunch of ArrayRef. NFC
llvm-svn: 328018
2018-03-20 19:06:34 +00:00
Andrea Di Biagio
06f01a6c2d [llvm-mca] Move the logic that computes the scheduler's queue usage to the BackendStatistics view.
This patch introduces two new callbacks in the event listener interface to
handle the "buffered resource reserved" event and the "buffered resource
released" event. Every time a buffered resource is used, an event is generated.

Before this patch, the Scheduler (with the help of the ResourceManager) was
responsible for tracking the scheduler's queue usage. However, that design
forced the Scheduler to 'publish' scheduler's queue pressure information through
the Backend interface.

The goal of this patch is to break the dependency between the BackendStatistics
view, and the Backend. Now the Scheduler knows how to notify "buffer
reserved/released" events.  The scheduler's queue usage analysis has been moved
to the BackendStatistics.

Differential Revision: https://reviews.llvm.org/D44686

llvm-svn: 328011
2018-03-20 18:20:39 +00:00
Andrea Di Biagio
3cce5c6f4b [llvm-mca] Use llvm::make_unique in a few places. NFC
Also, clang-format a couple of DEBUG functions.

llvm-svn: 327978
2018-03-20 12:58:34 +00:00
Andrea Di Biagio
cf05a35161 [llvm-mca] Move the routine that computes processor resource masks to its own file.
Function computeProcResourceMasks is used by the ResourceManager (owned by the
Scheduler) to compute resource masks for processor resources.  Before this
refactoring, there was an implicit dependency between the Scheduler and the
InstrBuilder. That is because InstrBuilder has to know about resource masks when
computing the set of processor resources consumed by a new instruction.

With this patch, the functionality that computes resource masks has been
extracted from the ResourceManager, and moved to a separate file (Support.h). 
This helps removing the dependency between the Scheduler and the InstrBuilder.

No functional change intended.

llvm-svn: 327973
2018-03-20 12:25:54 +00:00
Andrea Di Biagio
66c09621c6 [llvm-mca] Remove unused method from ResourceManager. NFC
llvm-svn: 327888
2018-03-19 19:14:06 +00:00
Andrea Di Biagio
06ba64d31f [llvm-mca] Simplify code. NFC
llvm-svn: 327886
2018-03-19 19:09:38 +00:00
Andrea Di Biagio
ed37aa59f3 [llvm-mca] Add pipeline stall events.
This patch introduces a new class named HWStallEvent (see HWEventListener.h),
and updates the event listener interface. A HWStallEvent represents a pipeline
stall caused by the lack of hardware resources. Similarly to HWInstructionEvent,
the event type is an unsigned, and the exact meaning depends on the subtarget.
At the moment, HWStallEvent supports a few generic dispatch events.

The main goals of this patch is to remove the logic that counts dispatch stalls
from the DispatchUnit to the BackendStatistics view.

Previously, DispatchUnit was responsible for counting and classifying dispatch
stall events. With this patch, we delegate the task of counting and classifying
stall events to the listeners (i.e. in our case, it is view
"BackendStatistics"). So, the DispatchUnit doesn't have to do extra
(unnecessary) bookkeeping.

This patch also helps futher simplifying the Backend interface. Now class
BackendStatistics no longer has to query the Backend interface to obtain the
number of dispatch stalls. As a consequence, we can get rid of all the
'getNumXXX()' methods from class Backend.
The long term goal is to remove all the remaining dependencies between the
Backend and the BackendStatistics interface.

Differential Revision: https://reviews.llvm.org/D44621

llvm-svn: 327837
2018-03-19 13:23:07 +00:00
Andrea Di Biagio
48886c59e8 [llvm-mca] Allow the definition of multiple register files.
This is a refactoring in preparation for other two changes that will allow
scheduling models to define multiple register files. This is the first step
towards fixing PR36662.

class RegisterFile (in Dispatch.h) now can emulate multiple register files.
Internally, it tracks the number of available physical registers in each
register file (described by class RegisterFileInfo).

Each register file is associated to a list of MCRegisterClass indices. Knowing
the register class indices allows to map physical registers to register files.

The long term goal is to allow processor models to optionally specify how many
register files are implemented via tablegen.

Differential Revision: https://reviews.llvm.org/D44488

llvm-svn: 327798
2018-03-18 15:33:27 +00:00
Andrea Di Biagio
20562bf853 [llvm-mca] Remove method getSchedModel() from the Backend.
llvm-svn: 327756
2018-03-16 22:21:52 +00:00
Andrea Di Biagio
d9e4d8c0eb [llvm-mca] Remove unused methods from Backend. NFC
llvm-svn: 327749
2018-03-16 22:02:47 +00:00
Andrea Di Biagio
905db8a3d6 [llvm-mca] Simplify code. NFC.
Now both method DispatchUnit::checkRAT() and DispatchUnit::canDispatch take as
input an Instruction refrence instead of an instruction descriptor.
This was requested by Simon in D44488 to simplify the diff.

llvm-svn: 327640
2018-03-15 16:13:12 +00:00
Andrea Di Biagio
77ab4ed0a3 [llvm-mca] Remove unused variable from InstrBuilder.cpp. NFC
This was causing a buildbot failure.

llvm-svn: 327517
2018-03-14 15:19:47 +00:00
Andrea Di Biagio
29d48bb255 [llvm-mca] Move the logic that updates the register files from InstrBuilder to DispatchUnit. NFCI
Before this patch, the register file was always updated at instruction creation
time. That means, new read-after-write dependencies, and new temporary registers
were allocated at instruction creation time.

This patch refactors the code in InstrBuilder, and move all the logic that
updates the register file into the dispatch unit. We only want to update the
register file when instructions are effectively dispatched (not before).

This refactoring also helps removing a bad dependency between the InstrBuilder
and the DispatchUnit.

No functional change intended.

llvm-svn: 327514
2018-03-14 14:57:23 +00:00
Andrea Di Biagio
5efc27c249 [llvm-mca] Remove the logic that computes the reciprocal throughput, and make the SummaryView independent from the Backend. NFCI
Since r327420, the tool can query the MCSchedModel interface to obtain the
reciprocal throughput information.
As a consequence, method `ResourceManager::getRThroughput`, and
method `Backend::getRThroughput` are no longer needed.

This patch simplifies the code by removing the custom RThroughput computation.
This patch also refactors class SummaryView by removing the dependency with
the Backend object.

No functional change intended.

llvm-svn: 327425
2018-03-13 17:24:32 +00:00
Andrea Di Biagio
a627e4f7be [llvm-mca] Simplify code that computes the latency of an instruction in
InstrBuilder. NFCI

This was possible because of r327406, which added function`computeInstrLatency`
to MCSchedModel.

llvm-svn: 327415
2018-03-13 15:59:59 +00:00
Andrea Di Biagio
6917de9a43 [llvm-mca] Use a const ArrayRef in a few places. NFC
llvm-svn: 327396
2018-03-13 13:58:02 +00:00
Clement Courbet
0c8c714bb4 [llvm-mca] Fix unused variable warning in opt mode.
llvm-svn: 327394
2018-03-13 13:44:18 +00:00
Clement Courbet
767a644ff6 [llvm-mca] Refactor event listeners to make the backend agnostic to event types.
Summary: This is a first step towards making the pipeline configurable.

Subscribers: llvm-commits, andreadb

Differential Revision: https://reviews.llvm.org/D44309

llvm-svn: 327389
2018-03-13 13:11:01 +00:00
Andrea Di Biagio
63ab384f89 [llvm-mca] Fix use-of-uninitialized-value error reported by the MemorySanitizer.
This should make the buildbots green again.

llvm-svn: 327223
2018-03-10 20:52:59 +00:00
Andrea Di Biagio
17857e9071 [llvm-mca] BackendStatistics: early exit from method printSchedulerUsage if the
no scheduler resources were consumed.

llvm-svn: 327215
2018-03-10 17:40:25 +00:00
Andrea Di Biagio
13307dcaad [llvm-mca] Views are now independent from resource masks. NFCI
This change removes method Backend::getProcResourceMasks() and simplifies some
logic in the Views. This effectively removes yet another dependency between the
views and the Backend.
No functional change intended.

llvm-svn: 327214
2018-03-10 16:55:07 +00:00
Andrea Di Biagio
0a931ef145 [llvm-mca] Move the logic that prints the summary into its own view. NFCI
llvm-svn: 327128
2018-03-09 13:52:03 +00:00
Andrea Di Biagio
9d4e3cac5b [llvm-mca] Run clang-format on the source code. NFC
llvm-svn: 327125
2018-03-09 12:50:42 +00:00
Andrea Di Biagio
9519f9efed [llvm-mca] Fix handling of zero-latency instructions.
This patch fixes a problem found when testing zero latency instructions on
target AArch64 -mcpu=exynos-m3 / -mcpu=exynos-m1.

On Exynos-m3/m1, direct branches are zero-latency instructions that don't consume
any processor resources.  The DispatchUnit marks zero-latency instructions as
"executed", so that no scheduling is required.  The event of instruction
executed is then notified to all the listeners, and the reorder buffer (managed
by the RetireControlUnit) is updated. In particular, the entry associated to the
zero-latency instruction in the reorder buffer is marked as executed.

Before this patch, the DispatchUnit forgot to assign a retire control unit token
(RCUToken) to the zero-latency instruction. As a consequence, the RCUToken was
used uninitialized. This was causing a crash in the RetireControlUnit logic.

Fixes PR36650.

llvm-svn: 327056
2018-03-08 20:21:55 +00:00
Andrea Di Biagio
90adf354a8 [llvm-mca] add override keyword to method ResourcePressureView::printView().
NFC.

llvm-svn: 327027
2018-03-08 17:02:28 +00:00
Andrea Di Biagio
b9a5c5c635 [llvm-mca] HWEventListener is a class, not struct.
This should appease the buildbots.

llvm-svn: 327025
2018-03-08 16:34:19 +00:00
Andrea Di Biagio
84af6ac04f [llvm-mca] Unify the API for the various views. NFCI
This allows the customization of the performance report.

Users can specify their own custom sequence of views.
Each view contributes a portion of the performance report generated by the
BackendPrinter.

Internally, class BackendPrinter keeps a sequence of views; views are printed
out in sequence when method 'printReport()' is called. 

This patch addresses one of the two review comments from Clement in D43951.

llvm-svn: 327018
2018-03-08 16:08:43 +00:00
Andrea Di Biagio
e9587cc537 [llvm-mca] Emit the 'Instruction Info' table before the resource pressure view.
In future, both the summary information and the 'instruction info' table should
be moved into a separate "Summary" view.

llvm-svn: 327010
2018-03-08 15:34:38 +00:00
Andrea Di Biagio
45f0e5261e [llvm-mca] LLVM Machine Code Analyzer.
llvm-mca is an LLVM based performance analysis tool that can be used to
statically measure the performance of code, and to help triage potential
problems with target scheduling models.

llvm-mca uses information which is already available in LLVM (e.g. scheduling
models) to statically measure the performance of machine code in a specific cpu.
Performance is measured in terms of throughput as well as processor resource
consumption. The tool currently works for processors with an out-of-order
backend, for which there is a scheduling model available in LLVM.

The main goal of this tool is not just to predict the performance of the code
when run on the target, but also help with diagnosing potential performance
issues.

Given an assembly code sequence, llvm-mca estimates the IPC (instructions per
cycle), as well as hardware resources pressure. The analysis and reporting style
were mostly inspired by the IACA tool from Intel.

This patch is related to the RFC on llvm-dev visible at this link:
http://lists.llvm.org/pipermail/llvm-dev/2018-March/121490.html

Differential Revision: https://reviews.llvm.org/D43951

llvm-svn: 326998
2018-03-08 13:05:02 +00:00