1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-21 18:22:53 +01:00
Commit Graph

94 Commits

Author SHA1 Message Date
Patrick Holland
0114258120 [MCA] [In-order pipeline] Fix for 0 latency instruction causing assertion to fail.
0 latency instructions now get processed and retired properly within the in-order pipeline. Had to fix a bug within TimelineView.cpp as well that would show up when a 0 latency instruction was the first instruction in the source.

Differential Revision: https://reviews.llvm.org/D104675
2021-06-22 10:18:39 -07:00
Patrick Holland
5d008e4862 [MCA] [RegisterFile] Allow for skipping Defs with RegID of 0 (rather than assert(RegID) like we do before this patch).
This patch will allow developers to remove unwanted instruction Defs (most likely from within a target specific InstrPostProcess) by setting that Def's RegisterID to 0.
Differential Revision: https://reviews.llvm.org/D104433
2021-06-17 11:52:43 -07:00
Min-Yih Hsu
341aadfd56 [MCA] Anchoring the vtable of CustomBehaviour
Put the dtor of mca::CustomBehaviour into the cpp file to avoid
undefined vtable when linking libLLVMMCACustomBehaviourAMDGPU as shared
library.

Differential Revision: https://reviews.llvm.org/D104401
2021-06-16 12:43:58 -07:00
Patrick Holland
449e2cbd5e Reapply "[MCA] Adding the CustomBehaviour class to llvm-mca".
The original change was pushed in main as commit f7a23ecece52.
It was then reverted by commit a04f01bab2 because it caused linker failures
on buildbots that don't build the AMDGPU target.

--

Some instructions are not defined well enough within the target’s scheduling
model for llvm-mca to be able to properly simulate its behaviour. The ideal
solution to this situation is to modify the scheduling model, but that’s not
always a viable strategy. Maybe other parts of the backend depend on that
instruction being modelled the way that it is. Or maybe the instruction is quite
complex and it’s difficult to fully capture its behaviour with tablegen. The
CustomBehaviour class (which I will refer to as CB frequently) is designed to
provide intuitive scaffolding for developers to implement the correct modelling
for these instructions.

More details are available in the original commit log message (f7a23ecece52).

Differential Revision: https://reviews.llvm.org/D104149
2021-06-16 16:54:48 +01:00
Andrea Di Biagio
a8b232ce81 [MCA][InstrBuilder] Always check for implicit uses of resource units (PR50725).
When instructions are issued to the underlying pipeline resources, the
mca::ResourceManager should also check for the presence of extra uses induced by
the explicit consumption of multiple partially overlapping group resources.

Fixes PR50725
2021-06-16 14:51:12 +01:00
Andrea Di Biagio
f1dc7da2e3 Revert "[MCA] Adding the CustomBehaviour class to llvm-mca"
This reverts commit f7a23ecece524564a0c3e09787142cc6061027bb.

It appears to breaks buildbots that don't build the AMDGPU backend.
2021-06-15 21:41:36 +01:00
Patrick Holland
e52d4f2208 [MCA] Adding the CustomBehaviour class to llvm-mca
Some instructions are not defined well enough within the target’s scheduling
model for llvm-mca to be able to properly simulate its behaviour. The ideal
solution to this situation is to modify the scheduling model, but that’s not
always a viable strategy. Maybe other parts of the backend depend on that
instruction being modelled the way that it is. Or maybe the instruction is quite
complex and it’s difficult to fully capture its behaviour with tablegen. The
CustomBehaviour class (which I will refer to as CB frequently) is designed to
provide intuitive scaffolding for developers to implement the correct modelling
for these instructions.

Implementation details:

llvm-mca does its best to extract relevant register, resource, and memory
information from every MCInst when lowering them to an mca::Instruction. It then
uses this information to detect dependencies and simulate stalls within the
pipeline. For some instructions, the information that gets captured within the
mca::Instruction is not enough for mca to simulate them properly. In these
cases, there are two main possibilities:

1. The instruction has a dependency that isn’t detected by mca.
2. mca is incorrectly enforcing a dependency that shouldn’t exist.

For the rest of this discussion, I will be focusing on (1), but I have put some
thought into (2) and I may revisit it in the future.

So we have an instruction that has dependencies that aren’t picked up by mca.
The basic idea for both pipelines in mca is that when an instruction wants to be
dispatched, we first check for register hazards and then we check for resource
hazards. This is where CB is injected. If no register or resource hazards have
been detected, we make a call to CustomBehaviour::checkCustomHazard() to give
the target specific CB the chance to detect and enforce any custom dependencies.

The return value for checkCustomHazaard() is an unsigned int representing the
(minimum) number of cycles that the instruction needs to stall for. It’s fine to
underestimate this value because when StallCycles gets down to 0, we’ll end up
checking for all the hazards again before the instruction is actually
dispatched. However, it’s important not to overestimate the value and the more
accurate your estimate is, the more efficient mca’s execution can be.

In general, for checkCustomHazard() to be able to detect these custom
dependencies, it needs information about the current instruction and also all of
the instructions that are still executing within the pipeline. The mca pipeline
uses mca::Instruction rather than MCInst and the current information encoded
within each mca::Instruction isn’t sufficient for my use cases. I had to add a
few extra attributes to the mca::Instruction class and have them get set by the
MCInst during instruction building. For example, the current mca::Instruction
doesn’t know its opcode, and it also doesn’t know anything about its immediate
operands (both of which I had to add to the class).

With information about the current instruction, a list of all currently
executing instructions, and some target specific objects (MCSubtargetInfo and
MCInstrInfo which the base CB class has references to), developers should be
able to detect and enforce most custom dependencies within checkCustomHazard. If
you need more information than is present in the mca::Instruction, feel free to
add attributes to that class and have them set during the lowering sequence from
MCInst.

Fortunately, in the in-order pipeline, it’s very convenient for us to pass these
arguments to checkCustomHazard. The hazard checking is taken care of within
InOrderIssueStage::canExecute(). This function takes a const InstRef as a
parameter (representing the instruction that currently wants to be dispatched)
and the InOrderIssueStage class maintains a SmallVector<InstRef, 4> which holds
all of the currently executing instructions. For the out-of-order pipeline, it’s
a bit trickier to get the list of executing instructions and this is why I have
held off on implementing it myself. This is the main topic I will bring up when
I eventually make a post to discuss and ask for feedback.

CB is a base class where targets implement their own derived classes. If a
target specific CB does not exist (or we pass in the -disable-cb flag), the base
class is used. This base class trivially returns 0 from its checkCustomHazard()
implementation (meaning that the current instruction needs to stall for 0 cycles
aka no hazard is detected). For this reason, targets or users who choose not to
use CB shouldn’t see any negative impacts to accuracy or performance (in
comparison to pre-patch llvm-mca).

Differential Revision: https://reviews.llvm.org/D104149
2021-06-15 21:30:48 +01:00
Andrea Di Biagio
ad80113e58 [MCA][InstrBuilder] Check for the presence of flag VariadicOpsAreDefs.
This patch fixes the logic that checks for variadic register definitions,

Before llvm-svn 348114 (commit 4cf35b4ab0b), it was not possible to explicitly
mark variadic operands as definitions. By default, variadic operands of an
MCInst were always assumed to be uses. A number of had-hoc checks were
introduced in the InstrBuilder to fix the processing of variadic register
operands of ARM ldm/stm variants.

This patch simply replaces those old (and buggy) checks with a much simpler (and
correct) check for MCID::Flag::VariadicOpsAreDefs.
2021-06-15 09:52:38 +01:00
Andrea Di Biagio
44ebb7579c [MCA][NFCI] Minor changes to InstrBuilder and Instruction.
This is based on the assumption that most simulated instructions don't define
more than one or two registers. This is true for example on x86, where
most instruction definitions don't declare more than one register write.

The default code region size has been increased from 8 to 16. This is based on
the assumption that, for small microbenchmarks, the typical code snippet size is
often less than 16 instructions.

mca::Instruction now uses bitfields to pack flags.
No functional change intended.
2021-05-31 17:05:13 +01:00
Andrea Di Biagio
f7539d249a [MCA] Minor changes to the InOrderIssueStage. NFC
The constructor of InOrderIssueStage no longer takes as input a reference to the
target scheduling model. The stage can always query the subtarget to obtain a
reference to the scheduling model.
The ResourceManager is no longer stored internally as a unique_ptr.
Moved a couple of method definitions to the .cpp file.
2021-05-28 00:33:59 +01:00
Andrea Di Biagio
8de4c75a32 [MCA] Refactor the InOrderIssueStage stage. NFCI
Moved the logic that checks for RAW hazards from the InOrderIssueStage to the
RegisterFile.

Changed how the InOrderIssueStage keeps track of backend stalls. Stall events
are now generated from method notifyStallEvent().

No functional change intended.
2021-05-27 22:28:04 +01:00
Andrea Di Biagio
c31b5f70ca [MCA][InOrderIssueStage] Fix LastWriteBackCycle computation.
Conservatively use the instruction latency to compute the last write-back cycle.
Before this patch, the last write cycle computation was incorrect for store
instructions that didn't declare any register writes.
2021-05-26 14:17:43 +01:00
Andrea Di Biagio
9789db4ce7 [MCA][RegisterFile] Refactor the move elimination logic to address PR50258.
This patch lifts the restriction on the number of read/write registers for a
move elimination candidate.  With this patch, move elimination candidates with
exactly two reads and two writes are treated like register swap operations for
the purpose of move elimination.

This patch currently doesn't affect any upstream model. However, it should help
unblock the progress on PR50258.
2021-05-08 18:10:35 +01:00
Andrea Di Biagio
ba5ec98548 [MCA][RegisterFile] Fix register class check for move elimination (PR50265)
The register file should always check if the destination register is from a
register class that allows move elimination.

Before this change, the check on the register class was only performed in a few
very specific cases. However, it should have always been performed.
This patch fixes the issue.

Note that none of the upstream scheduling models is currently affected by this
bug, so there is no test for it. The issue was found by Roman while working on
the znver3 model. I was able to reproduce the issue locally by tweaking the
btver2 model. I then verified that this patch fixes the issue.
2021-05-07 21:30:25 +01:00
Andrea Di Biagio
de24fa879c [MCA] Fix CarryOver check in the DispatchStage (PR50174).
Early exit from method DispatchStage::isAvailable() if the dispatch group is
already full. Not all instructions declare at least one uOP.
Fixes PR50174.
2021-04-30 14:26:46 +01:00
Andrea Di Biagio
f4f953260a [MCA][LSUnit] Fix a potential use after free in the logic that updates memory groups.
Make sure that the `CriticalMemoryInstruction` of a memory group is invalidated
if it references an already executed instruction.  This avoids a potential
use-after-free if the critical memory info becomes stale, and the value is
read after the instruction has executed.
2021-04-20 13:30:45 +01:00
Andrew Savonichev
e73adcc6fa [MCA] Support carry-over instructions for in-order processors
Instructions that have more uops than the processor's IssueWidth are
issued in multiple cycles.

The patch fixes PR49712.

Differential Revision: https://reviews.llvm.org/D99339
2021-03-26 00:06:19 +03:00
Andrea Di Biagio
2d533d6c56 [MCA] Fix for uninitialised member in constructor. NFC 2021-03-24 11:21:59 +00:00
Andrew Savonichev
182b0cd903 [MCA] Disable RCU for InOrderIssueStage
This is a follow-up for:
D98604 [MCA] Ensure that writes occur in-order

When instructions are aligned by the order of writes, they retire
in-order naturally. There is no need for an RCU, so it is disabled.

Differential Revision: https://reviews.llvm.org/D98628
2021-03-24 13:54:04 +03:00
Andrea Di Biagio
766c0d3596 [MCA] Improved handling of negative read-advance cycles.
Before this patch, register writes were always invalidated by the
RegisterFile at instruction commit stage. So,
the RegisterFile was often losing the knowledge about the `execute
cycle` of writes already committed. While this was not problematic
for non-delayed reads, this was sometimes leading to inaccurate read
latency computations in the presence of negative read-advance cycles.

This patch fixes the issue by changing how the RegisterFile component
internally keeps track of the `execute cycle` information of each
write. On every instruction executed, the RegisterFile gets notified
by the RetireStage, so that it can internally record the execute
cycle of each executed write.
The `execute cycle` information is stored within WriteRef itself, and
it is not invalidated when the write is committed.
2021-03-23 14:47:23 +00:00
Andrew Savonichev
f475941ee5 [MCA] Ensure that writes occur in-order
Delay the issue of a new instruction if that leads to out-of-order
commits of writes.

This patch fixes the problem described in:
https://bugs.llvm.org/show_bug.cgi?id=41796#c3

Differential Revision: https://reviews.llvm.org/D98604
2021-03-18 17:10:20 +03:00
Jay Foad
c6bd271cd0 [MCA] Support in-order CPUs with MicroOpBufferSize=1
Differential Revision: https://reviews.llvm.org/D98356
2021-03-11 10:12:54 +00:00
Andrew Savonichev
064cc1a22c [MCA] Add support for in-order CPUs
This patch adds a pipeline to support in-order CPUs such as ARM
Cortex-A55.

In-order pipeline implements a simplified version of Dispatch,
Scheduler and Execute stages as a single stage. Entry and Retire
stages are common for both in-order and out-of-order pipelines.

Differential Revision: https://reviews.llvm.org/D94928
2021-03-04 14:08:19 +03:00
Kazu Hirata
f118370581 [llvm] Use llvm::find (NFC) 2021-01-19 20:19:14 -08:00
Kazu Hirata
3109d596ee [llvm] Use llvm::append_range (NFC) 2021-01-06 18:27:33 -08:00
Evgeny Leviant
ec72518728 [llvm-mca] Fix processing thumb instruction set
Differential revision: https://reviews.llvm.org/D91704
2020-11-24 18:27:59 +03:00
serge-sans-paille
82b6e6053d llvmbuildectomy - replace llvm-build by plain cmake
No longer rely on an external tool to build the llvm component layout.

Instead, leverage the existing `add_llvm_componentlibrary` cmake function and
introduce `add_llvm_component_group` to accurately describe component behavior.

These function store extra properties in the created targets. These properties
are processed once all components are defined to resolve library dependencies
and produce the header expected by llvm-config.

Differential Revision: https://reviews.llvm.org/D90848
2020-11-13 10:35:24 +01:00
Andrea Di Biagio
7a72a7505a [MCA][LSUnit] Correctly update the internal group flags on store barrier execution. Fixes PR48024.
This is likely to be a regressigion introduced by my last refactoring of the
LSUnit (commit 5578ec32f9c4f). Before this patch, the
"CurrentStoreBarrierGroupID" index was not correctly reset on store barrier
executions.  This was leading to unexpected crashes like the one reported as
PR48024.
2020-10-31 11:57:27 +00:00
Evgeny Leviant
343576899e [ARM][SchedModels] Convert IsPredicatedPred to MCSchedPredicate
Differential revision: https://reviews.llvm.org/D89553
2020-10-19 11:37:54 +03:00
Jay Foad
53df1fb508 [APInt] New member function setBitVal
Differential Revision: https://reviews.llvm.org/D87033
2020-09-02 21:40:31 +01:00
Andrea Di Biagio
79ed672fa3 [MCA][InstrBuilder] Correctly mark reserved resources in initializeUsedResources.
This fixes a bug reported by Alex Renda on LLVMDev where mca did not correctly
mark a resource group as "reserved".
(See http://lists.llvm.org/pipermail/llvm-dev/2020-May/141485.html).

The issue was caused by a wrong check in function `initializeUsedResources`.
As a consequence of this, a resource group was left unreserved, and its field
`NumUnits` incorrectly reported an unrealistic number of consumed resource
units.

This patch fixes the issue with the handling of reserved resources in the
InstrBuilder class, and adds a simple test for it.  Ideally, as suggested by
Andy Trick, most of these problems will disappear if in the future we will
introduce a (optional) DelayCycles vector for SchedWriteRes.
2020-05-10 19:25:54 +01:00
Andrea Di Biagio
52f56e2249 [MCA] Fixed a bug where loads and stores were sometimes incorrectly marked as depedent. Fixes PR45793.
This fixes a regression introduced by a very old commit 280ac1fd1dc35 (was
llvm-svn 361950).

Commit 280ac1fd1dc35 redesigned the logic in the LSUnit with the goal of
speeding up isReady() queries, and stabilising the LSUnit API (while also making
the load store unit more customisable).

The concept of MemoryGroup (effectively an alias set) was added by that commit
to better describe and track dependencies between memory operations.  However,
that concept was not just used for alias dependencies, but it was also used for
describing memory "order" dependencies (enforced by the memory consistency
model).

Instructions of a same memory group were considered "equivalent" as in:
independent operations that can potentially execute in parallel.  The problem
was that the cost of a dependency (in terms of number of cycles) should have
been different for "order" dependency. Instructions in an order dependency
simply have to have to wait until their predecessors are "issued" to an
underlying pipeline (rather than having to wait until predecessors have beeng
fully executed). For simple "order" dependencies, this was effectively
introducing an artificial delay on the "issue" of independent loads and stores.

This patch fixes the issue and adds a new test named 'independent-load-stores.s'
to a bunch of x86 targets. That test contains the reproducible posted by Fabian
Ritter on PR45793.

I had to rerun the update-mca-tests script on several files. To avoid expected
regressions on some Exynos tests, I have added a -noalias=false flag (to match
the old strict behavior on latencies).

Some tests for processor Barcelona are improved/fixed by this change and they
now show better results.  In a few tests we were incorrectly counting the time
spent by instructions in a scheduler queue.  In one case in particular we now
correctly see a store executed out of order.  That test was affected by the same
underlying issue reported as PR45793.

Reviewers: mattd

Differential Revision: https://reviews.llvm.org/D79351
2020-05-05 10:25:36 +01:00
Shengchen Kan
88597bc560 [MC][Bugfix] Remove redundant parameter for relaxInstruction
Summary:
Before this patch, `relaxInstruction` takes three arguments, the first
argument refers to the instruction before relaxation and the third
argument is the output instruction after relaxation. There are two quite
strange things:
  1) The first argument's type is `const MCInst &`, the third
  argument's type is `MCInst &`, but they may be aliased to the same
  variable
  2) The backends of ARM, AMDGPU, RISC-V, Hexagon assume that the third
  argument is a fresh uninitialized `MCInst` even if `relaxInstruction`
  may be called like `relaxInstruction(Relaxed, STI, Relaxed)` in a
  loop.

In this patch, we drop the thrid argument, and let `relaxInstruction`
directly modify the given instruction. Also, this patch fixes the bug https://bugs.llvm.org/show_bug.cgi?id=45580, which is introduced by D77851, and
breaks the assumption of ARM, AMDGPU, RISC-V, Hexagon.

Reviewers: Razer6, MaskRay, jyknight, asb, luismarques, enderby, rtaylor, colinl, bcain

Reviewed By: Razer6, MaskRay, bcain

Subscribers: bcain, nickdesaulniers, nathanchance, wuzish, annita.zhang, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, tpr, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78364
2020-04-21 11:06:55 +08:00
Bill Wendling
0816222e8f Revert "Remove redundant "std::move"s in return statements"
The build failed with

  error: call to deleted constructor of 'llvm::Error'

errors.

This reverts commit 1c2241a7936bf85aa68aef94bd40c3ba77d8ddf2.
2020-02-10 07:07:40 -08:00
Bill Wendling
e45b5f33f3 Remove redundant "std::move"s in return statements 2020-02-10 06:39:44 -08:00
Andrea Di Biagio
6d0e7e12a0 [MCA] Remove verification check on MayLoad and MayStore. NFCI
Field NumMicroOpcodes is currently used by mca to model the number of uOPs
dispatched from the uOp-Queue to the out of order backend.  From a 'dispatch'
point of view, an instruction with zero opcodes is still valid; it simply
doesn't consume any dispatch group slots.

However, mca doesn't expect an instruction with zero uOPs to consume pipeline
resources because it is seen as a contradiction.  In practice, it only makes
sense if such an instruction is eliminated and never really executed. It may be
that mca is being too conservative here. However I believe that mca is right,
and we should probably check that inconsistency in CodeGenSchedule.cpp (when we
also verify scheduling classes in general).

This patch removes the check for MayLoad and MayStore in mca.  That check is
probably too conservative: we are already checking if a zero-uops instruction
consumes any processor resources. Note also that instructions with unmodelled
side-effects also tend to set the MayLoad/MayStore flags even if - theoretically
speaking - they might not even consume any hw resources in practice.

In future we may want to implement different checks (possibly outside of mca)
and potentially revisit the logic in mca that verifies instructions.
For that reason I have raised PR44797.
2020-02-05 13:50:01 +00:00
Benjamin Kramer
87d13166c7 Make llvm::StringRef to std::string conversions explicit.
This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.

This is mostly mechanical from a custom clang-tidy check, with a lot of
manual fixups. It uncovers a lot of minor inefficiencies.

This doesn't actually modify StringRef yet, I'll do that in a follow-up.
2020-01-28 23:25:25 +01:00
Mark de Wever
be4b8874f7 [NFC] Fixes -Wrange-loop-analysis warnings
This avoids new warnings due to D68912 adds -Wrange-loop-analysis to -Wall.

Differential Revision: https://reviews.llvm.org/D71857
2020-01-01 20:01:37 +01:00
Tom Stellard
28bf7f3536 [cmake] Explicitly mark libraries defined in lib/ as "Component Libraries"
Summary:
Most libraries are defined in the lib/ directory but there are also a
few libraries defined in tools/ e.g. libLLVM, libLTO.  I'm defining
"Component Libraries" as libraries defined in lib/ that may be included in
libLLVM.so.  Explicitly marking the libraries in lib/ as component
libraries allows us to remove some fragile checks that attempt to
differentiate between lib/ libraries and tools/ libraires:

1. In tools/llvm-shlib, because
llvm_map_components_to_libnames(LIB_NAMES "all") returned a list of
all libraries defined in the whole project, there was custom code
needed to filter out libraries defined in tools/, none of which should
be included in libLLVM.so.  This code assumed that any library
defined as static was from lib/ and everything else should be
excluded.

With this change, llvm_map_components_to_libnames(LIB_NAMES, "all")
only returns libraries that have been added to the LLVM_COMPONENT_LIBS
global cmake property, so this custom filtering logic can be removed.
Doing this also fixes the build with BUILD_SHARED_LIBS=ON
and LLVM_BUILD_LLVM_DYLIB=ON.

2. There was some code in llvm_add_library that assumed that
libraries defined in lib/ would not have LLVM_LINK_COMPONENTS or
ARG_LINK_COMPONENTS set.  This is only true because libraries
defined lib lib/ use LLVMBuild.txt and don't set these values.
This code has been fixed now to check if the library has been
explicitly marked as a component library, which should now make it
easier to remove LLVMBuild at some point in the future.

I have tested this patch on Windows, MacOS and Linux with release builds
and the following combinations of CMake options:

- "" (No options)
- -DLLVM_BUILD_LLVM_DYLIB=ON
- -DLLVM_LINK_LLVM_DYLIB=ON
- -DBUILD_SHARED_LIBS=ON
- -DBUILD_SHARED_LIBS=ON -DLLVM_BUILD_LLVM_DYLIB=ON
- -DBUILD_SHARED_LIBS=ON -DLLVM_LINK_LLVM_DYLIB=ON

Reviewers: beanz, smeenai, compnerd, phosek

Reviewed By: beanz

Subscribers: wuzish, jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, mgorny, mehdi_amini, sbc100, jgravelle-google, hiraditya, aheejin, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, steven_wu, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, dang, Jim, lenary, s.egerton, pzheng, sameer.abuasal, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70179
2019-11-21 10:48:08 -08:00
Greg Bedwell
8a5c963586 [MCA] Fix a spelling mistake in a comment. NFC 2019-10-27 10:06:22 +00:00
Andrea Di Biagio
13160fb6a6 [MCA][LSUnit] Track loads and stores until retirement.
Before this patch, loads and stores were only tracked by their corresponding
queues in the LSUnit from dispatch until execute stage. In practice we should be
more conservative and assume that memory opcodes leave their queues at
retirement stage.

Basically, loads should leave the load queue only when they have completed and
delivered their data. We conservatively assume that a load is completed when it
is retired. Stores should be tracked by the store queue from dispatch until
retirement. In practice, stores can only leave the store queue if their data can
be written to the data cache.

This is mostly a mechanical change. With this patch, the retire stage notifies
the LSUnit when a memory instruction is retired. That would triggers the release
of LDQ/STQ entries.  The only visible change is in memory tests for the bdver2
model. That is because bdver2 is the only model that defines the load/store
queue size.

This patch partially addresses PR39830.

Differential Revision: https://reviews.llvm.org/D68266

llvm-svn: 374034
2019-10-08 10:46:01 +00:00
Andrea Di Biagio
62f731223b [MCA] Use references to LSUnitBase in class Scheduler and add helper methods to acquire/release LS queue entries. NFCI
llvm-svn: 373236
2019-09-30 17:24:25 +00:00
Andrea Di Biagio
cf4aba9b59 [Tblgen][MCA] Add the ability to mark groups as LoadQueue and StoreQueue. NFCI
Before this patch, users were not allowed to optionally mark processor resource
groups as load/store queues. That is because tablegen class MemoryQueue was
originally declared as expecting a ProcResource template argument (instead of a
more generic ProcResourceKind).

That was an oversight, since the original intention from D54957 was to let user
mark any processor resource as either load/store queue.  This patch adds the
ability to use processor resource groups in MemoryQueue definitions. This is not
a user visible change.

Differential Revision: https://reviews.llvm.org/D66810

llvm-svn: 370091
2019-08-27 18:20:34 +00:00
Andrea Di Biagio
14754dff2b [MCA] consistently use MCPhysReg instead of unsigned as register type. NFCI
llvm-svn: 369648
2019-08-22 13:32:17 +00:00
Jonas Devlieghere
2c693415b7 [llvm] Migrate llvm::make_unique to std::make_unique
Now that we've moved to C++14, we no longer need the llvm::make_unique
implementation from STLExtras.h. This patch is a mechanical replacement
of (hopefully) all the llvm::make_unique instances across the monorepo.

llvm-svn: 369013
2019-08-15 15:54:37 +00:00
Andrea Di Biagio
0029025259 [MCA] Slightly refactor class RetireControlUnit, and add the ability to override the mask of used buffered resources in class mca::Instruction. NFCI
This patch teaches the RCU how to peek 'next' RCUTokens. A new method has been
added to the RetireControlUnit class with the goal of minimizing the complexity
of follow-up patches that will enable macro-fusion support in mca.

This patch also adds method Instruction::getNumMicroOpcodes() to simplify common
interactions with the instruction descriptor (a pattern quite common in some
pipeline stages).

Added the ability to override the default set of consumed scheduler resources
(this -again- is to simplify future patches that add support for macro-op fusion).

No functional change intended.

llvm-svn: 369010
2019-08-15 15:27:40 +00:00
Andrea Di Biagio
278dc66e41 [MCA] Slightly refactor the logic in ResourceManager. NFCI
This patch slightly changes the API in the attempt to simplify resource buffer
queries. It is done in preparation for a patch that will enable support for
macro fusion.

llvm-svn: 368994
2019-08-15 12:39:55 +00:00
Andrea Di Biagio
9bbf3a5aeb [MCA] Add flag -show-encoding to llvm-mca.
Flag -show-encoding enables the printing of instruction encodings as part of the
the instruction info view.

Example (with flags -mtriple=x86_64--  -mcpu=btver2):

Instruction Info:
[1]: #uOps
[2]: Latency
[3]: RThroughput
[4]: MayLoad
[5]: MayStore
[6]: HasSideEffects (U)
[7]: Encoding Size

[1]    [2]    [3]    [4]    [5]    [6]    [7]    Encodings:     Instructions:
 1      2     1.00                         4     c5 f0 59 d0    vmulps   %xmm0, %xmm1, %xmm2
 1      4     1.00                         4     c5 eb 7c da    vhaddps  %xmm2, %xmm2, %xmm3
 1      4     1.00                         4     c5 e3 7c e3    vhaddps  %xmm3, %xmm3, %xmm4

In this example, column Encoding Size is the size in bytes of the instruction
encoding. Column Encodings reports the actual instruction encodings as byte
sequences in hex (objdump style).

The computation of encodings is done by a utility class named mca::CodeEmitter.

In future, I plan to expose the CodeEmitter to the instruction builder, so that
information about instruction encoding sizes can be used by the simulator. That
would be a first step towards simulating the throughput from the decoders in the
hardware frontend.

Differential Revision: https://reviews.llvm.org/D65948

llvm-svn: 368432
2019-08-09 11:26:27 +00:00
Andrea Di Biagio
57fdb21037 [MCA] Remove dependency from InstrBuilder in mca::Context. NFC
InstrBuilder is not required to construct the default pipeline.

llvm-svn: 368275
2019-08-08 10:30:58 +00:00
Andrea Di Biagio
9eeb43075a [MCA] Ignore invalid processor resource writes of zero cycles. NFCI
In debug mode, the tool also raises a warning and prints out a message which
helps identify the problematic MCWriteProcResEntry from the scheduling class.
This message would have been useful to have when triaging PR42282.

llvm-svn: 363387
2019-06-14 13:31:21 +00:00