This patch teaches llvm-mca how to parse code comments in search for special
"markers" used to select regions of code.
Example:
# LLVM-MCA-BEGIN My Code Region
....
# LLVM-MCA-END
The MCAsmLexer now delegates to an object of class MCACommentParser (i.e. an
AsmCommentConsumer) the parsing of code comments to search for begin/end code
region markers.
A comment starting with substring "LLVM-MCA-BEGIN" marks the beginning of a new
region of code. A comment starting with substring "LLVM-MCA-END" marks the end
of the last region.
This implementation doesn't allow regions to overlap. Each region can have a
optional description; internally, each region is identified by a range of source
code locations (SMLoc).
MCInst objects are added to a region R only if the source location for the
MCInst is in the range of locations specified by R.
By default, the tool allocates an implicit "Default" code region which contains
every source location. See new tests llvm-mca-marker-*.s for a few examples.
A new Backend object is created for every region. So, the analysis is conducted
on every parsed code region. The final report is the union of the reports
generated for every code region. Note that empty regions are skipped.
Special "[#] Code Region - ..." strings are used in the report to mark the
portion which is specific to a code region only. For example, see
llvm-mca-markers-5.s.
Differential Revision: https://reviews.llvm.org/D45433
llvm-svn: 329590
Scheduling models can now describe processor register files and retire control
units. This updates the existing documentation and the README file.
llvm-svn: 329311
This patch adds the ability to describe properties of the hardware retire
control unit.
Tablegen class RetireControlUnit has been added for this purpose (see
TargetSchedule.td).
A RetireControlUnit specifies the size of the reorder buffer, as well as the
maximum number of opcodes that can be retired every cycle.
A zero (or negative) value for the reorder buffer size means: "the size is
unknown". If the size is unknown, then llvm-mca defaults it to the value of
field SchedMachineModel::MicroOpBufferSize. A zero or negative number of
opcodes retired per cycle means: "there is no restriction on the number of
instructions that can be retired every cycle".
Models can optionally specify an instance of RetireControlUnit. There can only
be up-to one RetireControlUnit definition per scheduling model.
Information related to the RCU (RetireControlUnit) is stored in (two new fields
of) MCExtraProcessorInfo. llvm-mca loads that information when it initializes
the DispatchUnit / RetireControlUnit (see Dispatch.h/Dispatch.cpp).
This patch fixes PR36661.
Differential Revision: https://reviews.llvm.org/D45259
llvm-svn: 329304
This is done in preparation for D45259.
With D45259, models can specify the size of the reorder buffer, and the retire
throughput directly via tablegen.
llvm-svn: 329274
Before this patch, the "BackendStatistics" view was responsible for printing the
register file usage (as well as many other statistics).
Now users can enable register file usage statistics using the command line flag
`-register-file-stats`. By default, the tool doesn't print register file
statistics.
llvm-svn: 329083
This patch allows the description of register files in processor scheduling
models. This addresses PR36662.
A new tablegen class named 'RegisterFile' has been added to TargetSchedule.td.
Targets can optionally describe register files for their processors using that
class. In particular, class RegisterFile allows to specify:
- The total number of physical registers.
- Which target registers are accessible through the register file.
- The cost of allocating a register at register renaming stage.
Example (from this patch - see file X86/X86ScheduleBtVer2.td)
def FpuPRF : RegisterFile<72, [VR64, VR128, VR256], [1, 1, 2]>
Here, FpuPRF describes a register file for MMX/XMM/YMM registers. On Jaguar
(btver2), a YMM register definition consumes 2 physical registers, while MMX/XMM
register definitions only cost 1 physical register.
The syntax allows to specify an empty set of register classes. An empty set of
register classes means: this register file models all the registers specified by
the Target. For each register class, users can specify an optional register
cost. By default, register costs default to 1. A value of 0 for the number of
physical registers means: "this register file has an unbounded number of
physical registers".
This patch is structured in two parts.
* Part 1 - MC/Tablegen *
A first part adds the tablegen definition of RegisterFile, and teaches the
SubtargetEmitter how to emit information related to register files.
Information about register files is accessible through an instance of
MCExtraProcessorInfo.
The idea behind this design is to logically partition the processor description
which is only used by external tools (like llvm-mca) from the processor
information used by the llvm machine schedulers.
I think that this design would make easier for targets to get rid of the extra
processor information if they don't want it.
* Part 2 - llvm-mca related *
The second part of this patch is related to changes to llvm-mca.
The main differences are:
1) class RegisterFile now needs to take into account the "cost of a register"
when allocating physical registers at register renaming stage.
2) Point 1. triggered a minor refactoring which lef to the removal of the
"maximum 32 register files" restriction.
3) The BackendStatistics view has been updated so that we can print out extra
details related to each register file implemented by the processor.
The effect of point 3. is also visible in tests register-files-[1..5].s.
Differential Revision: https://reviews.llvm.org/D44980
llvm-svn: 329067
Before, the instruction builder incorrectly assumed that only explicit reads
could have been associated with ReadAdvance entries.
This patch fixes the issue and adds a test to verify it.
llvm-svn: 328972
Summary:
r327219 added wrappers to std::sort which randomly shuffle the container before sorting.
This will help in uncovering non-determinism caused due to undefined sorting
order of objects having the same key.
To make use of that infrastructure we need to invoke llvm::sort instead of std::sort.
Note: This patch is one of a series of patches to replace *all* std::sort to llvm::sort.
Refer the comments section in D44363 for a list of all the required patches.
Reviewers: JDevlieghere, zturner, echristo, dberris, friss
Reviewed By: echristo
Subscribers: gbedwell, llvm-commits
Differential Revision: https://reviews.llvm.org/D45141
llvm-svn: 328943
The tool was passing the wrong operand index to method
MCSubtargetInfo::getReadAdvanceCycles(). That method requires a "UseIdx", and
not the operand index. This was found when testing X86 code where instructions
had a memory folded operand.
This patch fixes the issue and adds test read-advance-1.s to ensure that
the ReadAfterLd (a ReadAdvance of 3cy) information is correctly used.
llvm-svn: 328790
We were incorrectly initializing the array of used registers in method checkRAT.
As a consequence, the number of register file stalls was misreported.
Added a test to cover this case.
llvm-svn: 328629
The goal of this patch is to address most of PR36874. To fully fix PR36874 we
need to split the "InstructionInfo" view from the "SummaryView". That would make
easy to check the latency and rthroughput as well.
The patch reuses all the logic from ResourcePressureView to print out the
"instruction tables".
We have an entry for every instruction in the input sequence. Each entry reports
the theoretical resource pressure distribution. Resource pressure is uniformly
distributed across all the processor resource units of a group.
At the moment, the backend pipeline is not configurable, so the only way to fix
this is by creating a different driver that simply sends instruction events to
the resource pressure view. That means, we don't use the Backend interface.
Instead, it is simpler to just have a different code-path for when flag
-instruction-tables is specified.
Once Clement addresses bug 36663, then we can port the "instruction tables"
logic into a stage of our configurable pipeline.
Updated the BtVer2 test cases (thanks Simon for the help). Now we pass flag
-instruction-tables to each modified test.
Differential Revision: https://reviews.llvm.org/D44839
llvm-svn: 328487
This is done in preparation for the fix for PR36874.
The number of cycles consumed for each pipe is now a double quantity. This
allows reuse of the resource pressure view to print out instruction tables.
llvm-svn: 328335
By default, the tool always enables the resource pressure view.
This flag lets user specify whether they want to add that view or not.
llvm-svn: 328305
With this patch, the "instruction dispatched" event now provides information
related to the number of microarchitectural registers used in each register
file. Similarly, the "instruction retired" event is now able to tell how may
registers are freed in each register file.
Currently, the BackendStatistics view is the only consumer of register
usage/pressure information. BackendStatistics uses that info to print out a few
general statistics (i.e. max number of mappings used; total mapping created).
Before this patch, the BackendStatistics was forced to query the Backend to
obtain the register pressure information.
This helps removes that dependency. Now views are completely independent from
the Backend. As a consequence, it should be easier to address PR36663 and
further modularize the pipeline.
Added a couple of test cases in the BtVer2 specific directory.
llvm-svn: 328129
This patch introduces two new callbacks in the event listener interface to
handle the "buffered resource reserved" event and the "buffered resource
released" event. Every time a buffered resource is used, an event is generated.
Before this patch, the Scheduler (with the help of the ResourceManager) was
responsible for tracking the scheduler's queue usage. However, that design
forced the Scheduler to 'publish' scheduler's queue pressure information through
the Backend interface.
The goal of this patch is to break the dependency between the BackendStatistics
view, and the Backend. Now the Scheduler knows how to notify "buffer
reserved/released" events. The scheduler's queue usage analysis has been moved
to the BackendStatistics.
Differential Revision: https://reviews.llvm.org/D44686
llvm-svn: 328011
Function computeProcResourceMasks is used by the ResourceManager (owned by the
Scheduler) to compute resource masks for processor resources. Before this
refactoring, there was an implicit dependency between the Scheduler and the
InstrBuilder. That is because InstrBuilder has to know about resource masks when
computing the set of processor resources consumed by a new instruction.
With this patch, the functionality that computes resource masks has been
extracted from the ResourceManager, and moved to a separate file (Support.h).
This helps removing the dependency between the Scheduler and the InstrBuilder.
No functional change intended.
llvm-svn: 327973
This patch introduces a new class named HWStallEvent (see HWEventListener.h),
and updates the event listener interface. A HWStallEvent represents a pipeline
stall caused by the lack of hardware resources. Similarly to HWInstructionEvent,
the event type is an unsigned, and the exact meaning depends on the subtarget.
At the moment, HWStallEvent supports a few generic dispatch events.
The main goals of this patch is to remove the logic that counts dispatch stalls
from the DispatchUnit to the BackendStatistics view.
Previously, DispatchUnit was responsible for counting and classifying dispatch
stall events. With this patch, we delegate the task of counting and classifying
stall events to the listeners (i.e. in our case, it is view
"BackendStatistics"). So, the DispatchUnit doesn't have to do extra
(unnecessary) bookkeeping.
This patch also helps futher simplifying the Backend interface. Now class
BackendStatistics no longer has to query the Backend interface to obtain the
number of dispatch stalls. As a consequence, we can get rid of all the
'getNumXXX()' methods from class Backend.
The long term goal is to remove all the remaining dependencies between the
Backend and the BackendStatistics interface.
Differential Revision: https://reviews.llvm.org/D44621
llvm-svn: 327837
This is a refactoring in preparation for other two changes that will allow
scheduling models to define multiple register files. This is the first step
towards fixing PR36662.
class RegisterFile (in Dispatch.h) now can emulate multiple register files.
Internally, it tracks the number of available physical registers in each
register file (described by class RegisterFileInfo).
Each register file is associated to a list of MCRegisterClass indices. Knowing
the register class indices allows to map physical registers to register files.
The long term goal is to allow processor models to optionally specify how many
register files are implemented via tablegen.
Differential Revision: https://reviews.llvm.org/D44488
llvm-svn: 327798
Now both method DispatchUnit::checkRAT() and DispatchUnit::canDispatch take as
input an Instruction refrence instead of an instruction descriptor.
This was requested by Simon in D44488 to simplify the diff.
llvm-svn: 327640
Before this patch, the register file was always updated at instruction creation
time. That means, new read-after-write dependencies, and new temporary registers
were allocated at instruction creation time.
This patch refactors the code in InstrBuilder, and move all the logic that
updates the register file into the dispatch unit. We only want to update the
register file when instructions are effectively dispatched (not before).
This refactoring also helps removing a bad dependency between the InstrBuilder
and the DispatchUnit.
No functional change intended.
llvm-svn: 327514
Since r327420, the tool can query the MCSchedModel interface to obtain the
reciprocal throughput information.
As a consequence, method `ResourceManager::getRThroughput`, and
method `Backend::getRThroughput` are no longer needed.
This patch simplifies the code by removing the custom RThroughput computation.
This patch also refactors class SummaryView by removing the dependency with
the Backend object.
No functional change intended.
llvm-svn: 327425
Summary: This is a first step towards making the pipeline configurable.
Subscribers: llvm-commits, andreadb
Differential Revision: https://reviews.llvm.org/D44309
llvm-svn: 327389