mirror of
https://github.com/RPCS3/llvm-mirror.git
synced 2024-11-25 04:02:41 +01:00
[llvm] NFC: Fix trivial typo in rst and td files
Differential Revision: https://reviews.llvm.org/D77469
This commit is contained in:
parent
3d964a5a2b
commit
7ce19394dc
@@ -6603,8 +6603,8 @@ after the source language arguments in the following order:
 The values come from the initial kernel execution state. See
 :ref:`amdgpu-amdhsa-vgpr-register-set-up-order-table`.

-.. table:: Work-item implict argument layout
-   :name: amdgpu-amdhsa-workitem-implict-argument-layout-table
+.. table:: Work-item implicit argument layout
+   :name: amdgpu-amdhsa-workitem-implicit-argument-layout-table

 ======= ======= ==============
 Bits    Size    Field Name
@@ -417,7 +417,7 @@ Introduces a function ID that can be used with ``.cv_loc``. Includes
 caller, whether the caller is a real function or another inlined call site.

 Syntax:
-``.cv_inline_site_id`` *FunctionId* ``within`` *Function* ``inlined_at`` *FileNumber Line* [ *Colomn* ]
+``.cv_inline_site_id`` *FunctionId* ``within`` *Function* ``inlined_at`` *FileNumber Line* [ *Column* ]

 ``.cv_loc`` Directive
 ^^^^^^^^^^^^^^^^^^^^^
@@ -28,7 +28,7 @@ describe all the instructions using that model. TableGen parses all the relation
 models and uses the information to construct relation tables which relate
 instructions with each other. These tables are emitted in the
 ``XXXInstrInfo.inc`` file along with the functions to query them. Following
-is the definition of ``InstrMapping`` class definied in Target.td file:
+is the definition of ``InstrMapping`` class defined in Target.td file:

 .. code-block:: text

@@ -1828,7 +1828,7 @@ example:
 ``"preserve-sign"``, or ``"positive-zero"``. The first entry
 indicates the flushing mode for the result of floating point
 operations. The second indicates the handling of denormal inputs
-to floating point instructions. For compatability with older
+to floating point instructions. For compatibility with older
 bitcode, if the second value is omitted, both input and output
 modes will assume the same mode.

@@ -1879,7 +1879,7 @@ example:
 ``shadowcallstack``
 This attribute indicates that the ShadowCallStack checks are enabled for
 the function. The instrumentation checks that the return address for the
-function has not changed between the function prolog and eiplog. It is
+function has not changed between the function prolog and epilog. It is
 currently x86_64-specific.

 Call Site Attributes
@@ -17194,7 +17194,7 @@ The first three arguments to the '``llvm.experimental.constrained.fmuladd``'
 intrinsic must be floating-point or vector of floating-point values.
 All three arguments must have identical types.

-The fourth and fifth arguments specifiy the rounding mode and exception behavior
+The fourth and fifth arguments specify the rounding mode and exception behavior
 as described above.

 Semantics:
@@ -3664,7 +3664,7 @@ Important Subclasses of the ``Instruction`` class
 * ``CmpInst``

 This subclass represents the two comparison instructions,
-`ICmpInst <LangRef.html#i_icmp>`_ (integer opreands), and
+`ICmpInst <LangRef.html#i_icmp>`_ (integer operands), and
 `FCmpInst <LangRef.html#i_fcmp>`_ (floating point operands).

 .. _m_Instruction:
@@ -3966,7 +3966,7 @@ Important Public Members of the ``GlobalVariable`` class

 * ``bool hasInitializer()``

-Returns true if this ``GlobalVariable`` has an intializer.
+Returns true if this ``GlobalVariable`` has an initializer.

 * ``Constant *getInitializer()``

@@ -712,7 +712,7 @@ clang's tree actually looks like in ``Lclang1``.
 Even so, the edge ``U3 -> Llld1`` could be problematic for future
 merges from upstream. git will think that we've already merged from
 ``U3``, and we have, except for the state of the clang tree. One
-possible migitation strategy is to manually diff clang between ``U2``
+possible mitigation strategy is to manually diff clang between ``U2``
 and ``U3`` and apply those updates to ``local/zip``. Another,
 possibly simpler strategy is to freeze local work on downstream
 branches and merge all submodules from the latest upstream before
@@ -921,7 +921,7 @@ ecosystem, essentially extending it with new tools. If such
 repositories are tightly coupled with LLVM, it may make sense to
 import them into your local mirror of the monorepo.

-If such repositores participated in the umbrella repository used
+If such repositories participated in the umbrella repository used
 during the zipping process above, they will automatically be added to
 the monorepo. For downstream repositories that don't participate in
 an umbrella setup, the ``import-downstream-repo.py`` tool at
@@ -136,7 +136,7 @@ TableGen's top-level production consists of "objects".
 TemplateArgList: "<" `Declaration` ("," `Declaration`)* ">"

 A ``class`` declaration creates a record which other records can inherit
-from. A class can be parametrized by a list of "template arguments", whose
+from. A class can be parameterized by a list of "template arguments", whose
 values can be used in the class body.

 A given class can only be defined once. A ``class`` declaration is
@@ -250,7 +250,7 @@ each function just passes through code-gen. If we both optimize and code-gen
 lazily we can start executing the first function more quickly, but we will have
 longer pauses as each function has to be both optimized and code-gen'd when it
 is first executed. Things become even more interesting if we consider
-interproceedural optimizations like inlining, which must be performed eagerly.
+interprocedural optimizations like inlining, which must be performed eagerly.
 These are complex trade-offs, and there is no one-size-fits all solution to
 them, but by providing composable layers we leave the decisions to the person
 implementing the JIT, and make it easy for them to experiment with different
@@ -1833,7 +1833,7 @@ def int_aarch64_sve_ld1_gather_scalar_offset : AdvSIMD_GatherLoad_VS_Intrinsic;
 // First-faulting gather loads: scalar base + vector offsets
 //

-// 64 bit unscalled offsets
+// 64 bit unscaled offsets
 def int_aarch64_sve_ldff1_gather : AdvSIMD_GatherLoad_SV_64b_Offsets_Intrinsic;

 // 64 bit scaled offsets
@@ -1080,7 +1080,7 @@ def int_arm_mve_vmull_poly: Intrinsic<

 // The first two parameters are compile-time constants:
 // * Halving: 0 means halving (vhcaddq), 1 means non-halving (vcaddq)
-// instruction. Note: the flag is inverted to match the corresonding
+// instruction. Note: the flag is inverted to match the corresponding
 // bit in the instruction encoding
 // * Rotation angle: 0 mean 90 deg, 1 means 180 deg
 defm int_arm_mve_vcaddq : MVEMXPredicated<
@@ -476,7 +476,7 @@ let TargetPrefix = "ppc" in { // All PPC intrinsics start with "llvm.ppc.".
 Intrinsic<[llvm_v4f32_ty], [llvm_v4f32_ty,
 llvm_v4f32_ty, llvm_v4f32_ty], [IntrNoMem]>;

-// Vector Multiply Sum Intructions.
+// Vector Multiply Sum Instructions.
 def int_ppc_altivec_vmsummbm : GCCBuiltin<"__builtin_altivec_vmsummbm">,
 Intrinsic<[llvm_v4i32_ty], [llvm_v16i8_ty, llvm_v16i8_ty,
 llvm_v4i32_ty], [IntrNoMem]>;
@@ -496,7 +496,7 @@ let TargetPrefix = "ppc" in { // All PPC intrinsics start with "llvm.ppc.".
 Intrinsic<[llvm_v4i32_ty], [llvm_v8i16_ty, llvm_v8i16_ty,
 llvm_v4i32_ty], [IntrNoMem]>;

-// Vector Multiply Intructions.
+// Vector Multiply Instructions.
 def int_ppc_altivec_vmulesb : GCCBuiltin<"__builtin_altivec_vmulesb">,
 Intrinsic<[llvm_v8i16_ty], [llvm_v16i8_ty, llvm_v16i8_ty],
 [IntrNoMem]>;
@@ -535,7 +535,7 @@ let TargetPrefix = "ppc" in { // All PPC intrinsics start with "llvm.ppc.".
 Intrinsic<[llvm_v2i64_ty], [llvm_v4i32_ty, llvm_v4i32_ty],
 [IntrNoMem]>;

-// Vector Sum Intructions.
+// Vector Sum Instructions.
 def int_ppc_altivec_vsumsws : GCCBuiltin<"__builtin_altivec_vsumsws">,
 Intrinsic<[llvm_v4i32_ty], [llvm_v4i32_ty, llvm_v4i32_ty],
 [IntrNoMem]>;
@@ -284,7 +284,7 @@ let TargetPrefix = "x86" in { // All intrinsics start with "llvm.x86.".
 def int_x86_sse_ldmxcsr :
 Intrinsic<[], [llvm_ptr_ty],
 [IntrReadMem, IntrArgMemOnly, IntrHasSideEffects,
-// FIXME: LDMXCSR does not actualy write to memory,
+// FIXME: LDMXCSR does not actually write to memory,
 // but Fast and DAG Isel both use writing to memory
 // as a proxy for having side effects.
 IntrWriteMem]>;
@@ -224,7 +224,7 @@ class RegisterClass<string namespace, list<ValueType> regTypes, int alignment,
 list<ValueType> RegTypes = regTypes;

 // Size - Specify the spill size in bits of the registers. A default value of
-// zero lets tablgen pick an appropriate size.
+// zero lets tablegen pick an appropriate size.
 int Size = 0;

 // Alignment - Specify the alignment required of the registers when they are
@@ -703,7 +703,7 @@ class Requires<list<Predicate> preds> {
 /// ops definition - This is just a simple marker used to identify the operand
 /// list for an instruction. outs and ins are identical both syntactically and
 /// semantically; they are used to define def operands and use operands to
-/// improve readibility. This should be used like this:
+/// improve readability. This should be used like this:
 /// (outs R32:$dst), (ins R32:$src1, R32:$src2) or something similar.
 def ops;
 def outs;
@@ -1,4 +1,4 @@
-//===- TargetItinerary.td - Target Itinierary Description --*- tablegen -*-===//
+//===- TargetItinerary.td - Target Itinerary Description --*- tablegen -*-====//
 //
 // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
 // See https://llvm.org/LICENSE.txt for license information.
@@ -99,7 +99,7 @@ class SchedMachineModel {
 bit CompleteModel = 1;

 // Indicates that we should do full overlap checking for multiple InstrRWs
-// definining the same instructions within the same SchedMachineModel.
+// defining the same instructions within the same SchedMachineModel.
 // FIXME: Remove when all in tree targets are clean with the full check
 // enabled.
 bit FullInstRWOverlapCheck = 1;
@@ -163,7 +163,7 @@ class ProcResourceKind;
 // differently. Here we refer to stage between decoding into micro-ops
 // and moving them into a reservation station.) Normally NumMicroOps
 // is sufficient to limit dispatch/issue groups. However, some
-// processors can form groups of with only certain combinitions of
+// processors can form groups of with only certain combinations of
 // instruction types. e.g. POWER7.
 //
 // Use BufferSize = 1 for in-order execution units. This is used for
@@ -729,7 +729,7 @@ def NOOP_SDNodeXForm : SDNodeXForm<imm, [{}]>;
 /// PatFrags - Represents a set of pattern fragments. Each single fragment
 /// can match something on the DAG, from a single node to multiple nested other
 /// fragments. The whole set of fragments matches if any of the single
-/// fragemnts match. This allows e.g. matching and "add with overflow" and
+/// fragments match. This allows e.g. matching and "add with overflow" and
 /// a regular "add" with the same fragment set.
 ///
 class PatFrags<dag ops, list<dag> frags, code pred = [{}],
@@ -42,11 +42,11 @@ def FeatureAES : SubtargetFeature<
 "Enable AES support", [FeatureNEON]>;

 // Crypto has been split up and any combination is now valid (see the
-// crypto defintions above). Also, crypto is now context sensitive:
+// crypto definitions above). Also, crypto is now context sensitive:
 // it has a different meaning for e.g. Armv8.4 than it has for Armv8.2.
 // Therefore, we rely on Clang, the user interacing tool, to pass on the
 // appropriate crypto options. But here in the backend, crypto has very little
-// meaning anymore. We kept the Crypto defintion here for backward
+// meaning anymore. We kept the Crypto definition here for backward
 // compatibility, and now imply features SHA2 and AES, which was the
 // "traditional" meaning of Crypto.
 def FeatureCrypto : SubtargetFeature<"crypto", "HasCrypto", "true",
@@ -878,7 +878,7 @@ def : ProcessorModel<"generic", NoSchedModel, [
 FeatureNEON,
 FeaturePerfMon,
 FeaturePostRAScheduler,
-// ETE and TRBE are future architecture extensions. We temporariliy enable them
+// ETE and TRBE are future architecture extensions. We temporarily enable them
 // by default for users targeting generic AArch64, until it is decided in which
 // armv8.x-a architecture revision they will end up. The extensions do not
 // affect code generated by the compiler and can be used only by explicitly
@@ -883,7 +883,7 @@ def imm0_31 : Operand<i64>, ImmLeaf<i64, [{
 }

 // timm0_31 predicate - same ass imm0_31, but use TargetConstant (TimmLeaf)
-// instead of Contant (ImmLeaf)
+// instead of Constant (ImmLeaf)
 def timm0_31 : Operand<i64>, TImmLeaf<i64, [{
 return ((uint64_t)Imm) < 32;
 }]> {
@@ -787,7 +787,7 @@ def Z30 : AArch64Reg<30, "z30", [Q30, Z30_HI]>, DwarfRegNum<[126]>;
 def Z31 : AArch64Reg<31, "z31", [Q31, Z31_HI]>, DwarfRegNum<[127]>;
 }

-// Enum descibing the element size for destructive
+// Enum describing the element size for destructive
 // operations.
 class ElementSizeEnum<bits<3> val> {
 bits<3> Value = val;
@@ -338,7 +338,7 @@ def : PState<"PAN", 0b00100>;
 // v8.2a "User Access Override" extension-specific PStates
 let Requires = [{ {AArch64::FeaturePsUAO} }] in
 def : PState<"UAO", 0b00011>;
-// v8.4a timining insensitivity of data processing instructions
+// v8.4a timing insensitivity of data processing instructions
 let Requires = [{ {AArch64::FeatureDIT} }] in
 def : PState<"DIT", 0b11010>;
 // v8.5a Spectre Mitigation
@@ -1358,7 +1358,7 @@ def : RWSysReg<"MPAMVPM7_EL2", 0b11, 0b100, 0b1010, 0b0110, 0b111>;
 def : ROSysReg<"MPAMIDR_EL1", 0b11, 0b000, 0b1010, 0b0100, 0b100>;
 } //FeatureMPAM

-// v8.4a Activitiy Monitor registers
+// v8.4a Activity Monitor registers
 // Op0 Op1 CRn CRm Op2
 let Requires = [{ {AArch64::FeatureAM} }] in {
 def : RWSysReg<"AMCR_EL0", 0b11, 0b011, 0b1101, 0b0010, 0b000>;
@@ -1424,7 +1424,7 @@ def : RWSysReg<"TRFCR_EL2", 0b11, 0b100, 0b0001, 0b0010, 0b001>;
 def : RWSysReg<"TRFCR_EL12", 0b11, 0b101, 0b0001, 0b0010, 0b001>;
 } //FeatureTRACEV8_4

-// v8.4a Timining insensitivity of data processing instructions
+// v8.4a Timing insensitivity of data processing instructions
 // DIT: Data Independent Timing instructions
 // Op0 Op1 CRn CRm Op2
 let Requires = [{ {AArch64::FeatureDIT} }] in {
@@ -265,7 +265,7 @@ multiclass GISelVop2IntrPat <

 def : GISelVop2Pat <node, inst, dst_vt, src_vt>;

-// FIXME: Intrinsics aren't marked as commutable, so we need to add an explcit
+// FIXME: Intrinsics aren't marked as commutable, so we need to add an explicit
 // pattern to handle commuting. This is another reason why legalizing to a
 // generic machine instruction may be better that matching the intrinsic
 // directly.
@@ -6,7 +6,7 @@
 //
 //===----------------------------------------------------------------------===//
 //
-// This file contains DAG node defintions for the AMDGPU target.
+// This file contains DAG node definitions for the AMDGPU target.
 //
 //===----------------------------------------------------------------------===//

@@ -287,7 +287,7 @@ def AMDGPUffbh_i32_impl : SDNode<"AMDGPUISD::FFBH_I32", SDTIntBitCountUnaryOp>;
 def AMDGPUffbl_b32_impl : SDNode<"AMDGPUISD::FFBL_B32", SDTIntBitCountUnaryOp>;

 // Signed and unsigned 24-bit multiply. The highest 8-bits are ignore
-// when performing the mulitply. The result is a 32-bit value.
+// when performing the multiply. The result is a 32-bit value.
 def AMDGPUmul_u24_impl : SDNode<"AMDGPUISD::MUL_U24", SDTIntBinOp,
 [SDNPCommutative, SDNPAssociative]
 >;
@@ -375,7 +375,7 @@ def AMDGPUret_flag : SDNode<"AMDGPUISD::RET_FLAG", SDTypeProfile<0, 1, [SDTCisPt


 //===----------------------------------------------------------------------===//
-// Intrinsic/Custom node compatability PatFrags
+// Intrinsic/Custom node compatibility PatFrags
 //===----------------------------------------------------------------------===//

 def AMDGPUrcp : PatFrags<(ops node:$src), [(int_amdgcn_rcp node:$src),
@@ -1,4 +1,4 @@
-//===-- BUFInstructions.td - Buffer Instruction Defintions ----------------===//
+//===-- BUFInstructions.td - Buffer Instruction Definitions ---------------===//
 //
 // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
 // See https://llvm.org/LICENSE.txt for license information.
@@ -1,4 +1,4 @@
-//===-- DSInstructions.td - DS Instruction Defintions ---------------------===//
+//===-- DSInstructions.td - DS Instruction Definitions --------------------===//
 //
 // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
 // See https://llvm.org/LICENSE.txt for license information.
@@ -1,4 +1,4 @@
-//===-- FLATInstructions.td - FLAT Instruction Defintions -----------------===//
+//===-- FLATInstructions.td - FLAT Instruction Definitions ----------------===//
 //
 // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
 // See https://llvm.org/LICENSE.txt for license information.
@@ -100,7 +100,7 @@ class FLAT_Real <bits<7> op, FLAT_Pseudo ps> :
 !if(ps.is_flat_scratch, 0b01, 0));

 // Signed offset. Highest bit ignored for flat and treated as 12-bit
-// unsigned for flat acceses.
+// unsigned for flat accesses.
 bits<13> offset;
 bits<1> nv = 0; // XXX - What does this actually do?

@@ -1,4 +1,4 @@
-//===-- MIMGInstructions.td - MIMG Instruction Defintions -----------------===//
+//===-- MIMGInstructions.td - MIMG Instruction Definitions ----------------===//
 //
 // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
 // See https://llvm.org/LICENSE.txt for license information.
@@ -966,7 +966,7 @@ def VOPDstS64orS32 : BoolRC {
 }

 // SCSrc_i1 is the operand for pseudo instructions only.
-// Boolean immeadiates shall not be exposed to codegen instructions.
+// Boolean immediates shall not be exposed to codegen instructions.
 def SCSrc_i1 : RegisterOperand<SReg_1_XEXEC> {
 let OperandNamespace = "AMDGPU";
 let OperandType = "OPERAND_REG_IMM_INT32";
@@ -1,4 +1,4 @@
-//===-- SIInstructions.td - SI Instruction Defintions ---------------------===//
+//===-- SIInstructions.td - SI Instruction Definitions --------------------===//
 //
 // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
 // See https://llvm.org/LICENSE.txt for license information.
@@ -66,7 +66,7 @@ def VINTRPDst : VINTRPDstOperand <VGPR_32>;

 let Uses = [M0, EXEC] in {

-// FIXME: Specify SchedRW for VINTRP insturctions.
+// FIXME: Specify SchedRW for VINTRP instructions.

 multiclass V_INTERP_P1_F32_m : VINTRP_m <
 0x00000000,
@@ -751,7 +751,7 @@ def SReg_1024 : RegisterClass<"AMDGPU", [v32i32, v32f32], 32,
 let AllocationPriority = 20;
 }

-// Register class for all vector registers (VGPRs + Interploation Registers)
+// Register class for all vector registers (VGPRs + Interpolation Registers)
 class VRegClass<int numRegs, list<ValueType> regTypes, dag regList> :
 RegisterClass<"AMDGPU", regTypes, 32, regList> {
 let Size = !mul(numRegs, 32);
@@ -1,4 +1,4 @@
-//===-- SISchedule.td - SI Scheduling definitons -------------------------===//
+//===-- SISchedule.td - SI Scheduling definitions -------------------------===//
 //
 // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
 // See https://llvm.org/LICENSE.txt for license information.
@@ -136,9 +136,9 @@ multiclass SICommonWriteRes {
 def : ReadAdvance<MIVGPRRead, -2>;
 def : InstRW<[Write64Bit, MIReadVGPR], (instregex "^V_ACCVGPR_WRITE_B32$")>;

-// Technicaly mfma reads can be from 0 to 4 cycles but that does not make
+// Technically mfma reads can be from 0 to 4 cycles but that does not make
 // sense to model because its register setup is huge. In particular if we
-// properly model read advanice as -2 for a vgpr read it will result in a
+// properly model read advance as -2 for a vgpr read it will result in a
 // bad scheduling of acc writes before that mfma. To avoid it we would
 // need to consume 2 or 4 more vgprs to be initialized before the acc
 // write sequence. Just assume worst case here.
@@ -1,4 +1,4 @@
-//===---- SMInstructions.td - Scalar Memory Instruction Defintions --------===//
+//===---- SMInstructions.td - Scalar Memory Instruction Definitions -------===//
 //
 // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
 // See https://llvm.org/LICENSE.txt for license information.
@@ -1,4 +1,4 @@
-//===-- SOPInstructions.td - SOP Instruction Defintions -------------------===//
+//===-- SOPInstructions.td - SOP Instruction Definitions ------------------===//
 //
 // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
 // See https://llvm.org/LICENSE.txt for license information.
@@ -1,4 +1,4 @@
-//===-- VIInstructions.td - VI Instruction Defintions ---------------------===//
+//===-- VIInstructions.td - VI Instruction Definitions --------------------===//
 //
 // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
 // See https://llvm.org/LICENSE.txt for license information.
@@ -1,4 +1,4 @@
-//===-- VOP1Instructions.td - Vector Instruction Defintions ---------------===//
+//===-- VOP1Instructions.td - Vector Instruction Definitions --------------===//
 //
 // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
 // See https://llvm.org/LICENSE.txt for license information.
@@ -1,4 +1,4 @@
-//===-- VOP2Instructions.td - Vector Instruction Defintions ---------------===//
+//===-- VOP2Instructions.td - Vector Instruction Definitions --------------===//
 //
 // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
 // See https://llvm.org/LICENSE.txt for license information.
@@ -1,4 +1,4 @@
-//===-- VOP3Instructions.td - Vector Instruction Defintions ---------------===//
+//===-- VOP3Instructions.td - Vector Instruction Definitions --------------===//
 //
 // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
 // See https://llvm.org/LICENSE.txt for license information.
@@ -1,4 +1,4 @@
-//===-- VOP3PInstructions.td - Vector Instruction Defintions --------------===//
+//===-- VOP3PInstructions.td - Vector Instruction Definitions -------------===//
 //
 // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
 // See https://llvm.org/LICENSE.txt for license information.
@@ -35,7 +35,7 @@ class VOP3_VOP3PInst<string OpName, VOPProfile P, bit UseTiedOutput = 0,
 // FIXME: clampmod0 misbehaves with the non-default vdst_in
 // following it. For now workaround this by requiring clamp
 // in tied patterns. This should use undef_tied_input, but it
-// seems underdeveloped and doesns't apply the right register
+// seems underdeveloped and doesn't apply the right register
 // class constraints.
 !if(UseTiedOutput, (ins clampmod:$clamp, VGPR_32:$vdst_in),
 (ins clampmod0:$clamp))),
@@ -1,4 +1,4 @@
-//===-- VOPCInstructions.td - Vector Instruction Defintions ---------------===//
+//===-- VOPCInstructions.td - Vector Instruction Definitions --------------===//
 //
 // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
 // See https://llvm.org/LICENSE.txt for license information.
@@ -1,4 +1,4 @@
-//===-- VOPInstructions.td - Vector Instruction Defintions ----------------===//
+//===-- VOPInstructions.td - Vector Instruction Definitions ---------------===//
 //
 // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
 // See https://llvm.org/LICENSE.txt for license information.
@@ -160,7 +160,7 @@ class VOP3_Real <VOP_Pseudo ps, int EncodingFamily> :
 VOPProfile Pfl = ps.Pfl;
 }

-// XXX - Is there any reason to distingusih this from regular VOP3
+// XXX - Is there any reason to distinguish this from regular VOP3
 // here?
 class VOP3P_Real<VOP_Pseudo ps, int EncodingFamily> :
 VOP3_Real<ps, EncodingFamily>;
@@ -34,7 +34,7 @@ def ARCGAWrapper : SDNode<"ARCISD::GAWRAPPER", SDT_ARCmov, []>;
 // Comparison
 def ARCcmp : SDNode<"ARCISD::CMP", SDT_ARCcmptst, [SDNPOutGlue]>;

-// Conditionanal mov
+// Conditional mov
 def ARCcmov : SDNode<"ARCISD::CMOV", SDT_ARCcmov, [SDNPInGlue]>;

 // Conditional Branch
@@ -206,7 +206,7 @@ multiclass ArcBinaryEXT5Inst<bits<6> mincode, string opasm> :
 multiclass ArcUnaryGEN4Inst<bits<6> mincode, string opasm> :
 ArcUnaryInst<0b00100, mincode, opasm>;

-// Pattern generation for differnt instruction variants.
+// Pattern generation for different instruction variants.
 multiclass MultiPat<SDPatternOperator InFrag,
 Instruction RRR, Instruction RRU6, Instruction RRLImm> {
 def _rrr : Pat<(InFrag i32:$B, i32:$C), (RRR i32:$B, i32:$C)>;
@@ -215,7 +215,7 @@ multiclass MultiPat<SDPatternOperator InFrag,
 }

 // ---------------------------------------------------------------------------
-// Instruction defintions and patterns for 3 operand binary instructions.
+// Instruction definitions and patterns for 3 operand binary instructions.
 // ---------------------------------------------------------------------------

 // Definitions for 3 operand binary instructions.
@@ -255,7 +255,7 @@ class CDE_FP_Vec_Instr<bit vec, bit acc, dag oops, dag iops, string asm, string
 let Inst{6} = vec;
 }

-// Base class for floating-point variants of CDE VCX* intructions
+// Base class for floating-point variants of CDE VCX* instructions
 class CDE_FP_Instr<bit acc, bit sz, dag oops, dag iops, string asm, string cstr>
 : CDE_FP_Vec_Instr<0b0, acc, oops, iops, asm, cstr> {
 let Inst{24} = sz;
@@ -2819,7 +2819,7 @@ multiclass AI2_ldridx<bit isByte, string opc,
 }

 let mayLoad = 1, hasSideEffects = 0 in {
-// FIXME: for LDR_PRE_REG etc. the itineray should be either IIC_iLoad_ru or
+// FIXME: for LDR_PRE_REG etc. the itinerary should be either IIC_iLoad_ru or
 // IIC_iLoad_siu depending on whether it the offset register is shifted.
 defm LDR : AI2_ldridx<0, "ldr", IIC_iLoad_iu, IIC_iLoad_ru>;
 defm LDRB : AI2_ldridx<1, "ldrb", IIC_iLoad_bh_iu, IIC_iLoad_bh_ru>;
@@ -3100,7 +3100,7 @@ multiclass AI2_stridx<bit isByte, string opc,
 }

 let mayStore = 1, hasSideEffects = 0 in {
-// FIXME: for STR_PRE_REG etc. the itineray should be either IIC_iStore_ru or
+// FIXME: for STR_PRE_REG etc. the itinerary should be either IIC_iStore_ru or
 // IIC_iStore_siu depending on whether it the offset register is shifted.
 defm STR : AI2_stridx<0, "str", IIC_iStore_iu, IIC_iStore_ru>;
 defm STRB : AI2_stridx<1, "strb", IIC_iStore_bh_iu, IIC_iStore_bh_ru>;
@@ -5754,7 +5754,7 @@ def : ARMPat<(ARMthread_pointer), (MRC 15, 0, 13, 0, 3)>,
 // when we get here from a longjmp(). We force everything out of registers
 // except for our own input by listing the relevant registers in Defs. By
 // doing so, we also cause the prologue/epilogue code to actively preserve
-// all of the callee-saved resgisters, which is exactly what we want.
+// all of the callee-saved registers, which is exactly what we want.
 // A constant value is passed in $val, and we use the location as a scratch.
 //
 // These are pseudo-instructions and are lowered to individual MC-insts, so
@@ -4256,7 +4256,7 @@ let Predicates = [HasMVEFloat] in {
 }


-// Extra "worst case" and/or/xor partterns, going into and out of GRP
+// Extra "worst case" and/or/xor patterns, going into and out of GRP
 multiclass two_predops<SDPatternOperator opnode, Instruction insn> {
 def v16i1 : Pat<(v16i1 (opnode (v16i1 VCCR:$p1), (v16i1 VCCR:$p2))),
 (v16i1 (COPY_TO_REGCLASS
@@ -4986,7 +4986,7 @@ multiclass MVE_vec_scalar_fp_pat_m<SDNode unpred_op, Intrinsic pred_int,
 (v8f16 (instr_f16 (v8f16 MQPR:$Qm), (i32 rGPR:$val),
 ARMVCCThen, (v8i1 VCCR:$mask),
 (v8f16 MQPR:$inactive)))>;
-// Preicated F32
+// Predicated F32
 def : Pat<(v4f32 (pred_int (v4f32 MQPR:$Qm), (v4f32 (ARMvdup rGPR:$val)),
 (v4i1 VCCR:$mask), (v4f32 MQPR:$inactive))),
 (v4f32 (instr_f32 (v4f32 MQPR:$Qm), (i32 rGPR:$val),
@@ -7320,7 +7320,7 @@ def : Pat<(arm_vmovsr GPR:$a),
 Requires<[HasNEON, DontUseVMOVSR]>;

 //===----------------------------------------------------------------------===//
-// Non-Instruction Patterns or Endiness - Revert Patterns
+// Non-Instruction Patterns or Endianess - Revert Patterns
 //===----------------------------------------------------------------------===//

 // bit_convert
@@ -1513,7 +1513,7 @@ def tTPsoft : tPseudoInst<(outs), (ins), 4, IIC_Br,
 // tromped upon when we get here from a longjmp(). We force everything out of
 // registers except for our own input by listing the relevant registers in
 // Defs. By doing so, we also cause the prologue/epilogue code to actively
-// preserve all of the callee-saved resgisters, which is exactly what we want.
+// preserve all of the callee-saved registers, which is exactly what we want.
 // $val is a scratch register for our use.
 let Defs = [ R0, R1, R2, R3, R4, R5, R6, R7, R12, CPSR ],
 hasSideEffects = 1, isBarrier = 1, isCodeGenOnly = 1,
@ -3757,7 +3757,7 @@ def : T2Pat<(stlex_2 (and GPR:$Rt, 0xffff), addr_offset_none:$addr),
|
||||
// when we get here from a longjmp(). We force everything out of registers
|
||||
// except for our own input by listing the relevant registers in Defs. By
|
||||
// doing so, we also cause the prologue/epilogue code to actively preserve
|
||||
// all of the callee-saved resgisters, which is exactly what we want.
|
||||
// all of the callee-saved registers, which is exactly what we want.
|
||||
// $val is a scratch register for our use.
|
||||
let Defs =
|
||||
[ R0, R1, R2, R3, R4, R5, R6, R7, R8, R9, R10, R11, R12, LR, CPSR,
|
||||
@ -4196,7 +4196,7 @@ def t2LDRpci_pic : PseudoInst<(outs rGPR:$dst), (ins i32imm:$addr, pclabel:$cp),
|
||||
imm:$cp))]>,
|
||||
Requires<[IsThumb2]>;
|
||||
|
||||
// Pseudo isntruction that combines movs + predicated rsbmi
|
||||
// Pseudo instruction that combines movs + predicated rsbmi
|
||||
// to implement integer ABS
|
||||
let usesCustomInserter = 1, Defs = [CPSR], hasNoSchedulingInfo = 1 in {
|
||||
def t2ABS : PseudoInst<(outs rGPR:$dst), (ins rGPR:$src),
|
||||
|
@ -744,7 +744,7 @@ let SchedModel = SwiftModel in {
SwiftWriteLM14CyNo, SwiftWriteLM14CyNo,
SwiftWriteLM14CyNo, SwiftWriteLM14CyNo,
SwiftWriteP01OneCycle, SwiftVLDMPerm5]>,
// Inaccurate: reuse describtion from 9 S registers.
// Inaccurate: reuse description from 9 S registers.
SchedVar<SwiftLMAddr11Pred,[SwiftWriteLM9Cy, SwiftWriteLM10Cy,
SwiftWriteLM13Cy, SwiftWriteLM14CyNo,
SwiftWriteLM17CyNo, SwiftWriteLM18CyNo,

@ -760,7 +760,7 @@ let SchedModel = SwiftModel in {
SwiftWriteLM11CyNo, SwiftWriteLM11CyNo,
SwiftWriteLM11CyNo, SwiftWriteLM11CyNo,
SwiftWriteP01OneCycle, SwiftVLDMPerm3]>,
// Inaccurate: reuse describtion from 9 S registers.
// Inaccurate: reuse description from 9 S registers.
SchedVar<SwiftLMAddr13Pred, [SwiftWriteLM9Cy, SwiftWriteLM10Cy,
SwiftWriteLM13Cy, SwiftWriteLM14CyNo,
SwiftWriteLM17CyNo, SwiftWriteLM18CyNo,

@ -958,7 +958,7 @@ let SchedModel = SwiftModel in {
def : InstRW<[SwiftWriteLM7Cy, SwiftWriteP01OneCycle, SwiftWriteLM8Cy,
SwiftWriteLM8Cy, SwiftExt1xP0, SwiftVLDMPerm3],
(instregex "VLD3(LN|DUP)(d|q)(8|16|32)Pseudo_UPD")>;
// Four element struture.
// Four element structure.
def : InstRW<[SwiftWriteLM8Cy, SwiftWriteLM9Cy, SwiftWriteLM10CyNo,
SwiftWriteLM10CyNo, SwiftExt1xP0, SwiftVLDMPerm5],
(instregex "VLD4(LN|DUP)(d|q)(8|16|32)$",
@ -7,7 +7,7 @@
//===----------------------------------------------------------------------===//

// These itinerary class descriptions are based on the instruction timing
// classes as per V62. Curretnly, they are just extracted from
// classes as per V62. Currently, they are just extracted from
// HexagonScheduleV62.td but will soon be auto-generated by HexagonGen.py.

class PseudoItin {

@ -37,7 +37,7 @@ def HVXVectorAccess : MemAccessSize<5>;
// Instruction Class Declaration +
//===----------------------------------------------------------------------===//

// "Parse" bits are explicity NOT defined in the opcode space to prevent
// "Parse" bits are explicitly NOT defined in the opcode space to prevent
// TableGen from using them for generation of the decoder tables.
class OpcodeHexagon {
field bits<32> Inst = ?; // Default to an invalid insn.

@ -11,13 +11,13 @@
//===----------------------------------------------------------------------===//

//----------------------------------------------------------------------------//
// Hexagon Intruction Flags +
// Hexagon Instruction Flags +
//
// *** Must match BaseInfo.h ***
//----------------------------------------------------------------------------//

//----------------------------------------------------------------------------//
// Intruction Classes Definitions +
// Instruction Classes Definitions +
//----------------------------------------------------------------------------//

class CVI_VA_Resource_NoOpcode<dag outs, dag ins, string asmstr,

@ -490,7 +490,7 @@ def TFRI64_V4 : InstHexagon<(outs DoubleRegs:$dst),
A2_combineii.Itinerary, TypeALU32_2op>, OpcodeHexagon;

// Hexagon doesn't have a vector multiply with C semantics.
// Instead, generate a pseudo instruction that gets expaneded into two
// Instead, generate a pseudo instruction that gets expanded into two
// scalar MPYI instructions.
// This is expanded by ExpandPostRAPseudos.
let isPseudo = 1 in
@ -6,7 +6,7 @@
//
//===----------------------------------------------------------------------===//
//
// This files descributes the formats of the microMIPS instruction set.
// This files describes the formats of the microMIPS instruction set.
//
//===----------------------------------------------------------------------===//


@ -6,7 +6,7 @@
//
//===----------------------------------------------------------------------===//
//
// This files describes the defintions of the microMIPSr3 instructions.
// This files describes the definitions of the microMIPSr3 instructions.
//
//===----------------------------------------------------------------------===//


@ -1642,7 +1642,7 @@ def : Mips16Pat<(select (i32 (setle CPU16Regs:$a, CPU16Regs:$b)),
CPU16Regs:$b, CPU16Regs:$a)>;

//
// unnsigned
// unsigned
// x = (a <= b)? x : y
//
// if (b < a) x = y

@ -498,7 +498,7 @@ class MADD4 {
list<Predicate> AdditionalPredicates = [HasMadd4];
}

// Classses used for separating expansions that differ based on the ABI in
// Classes used for separating expansions that differ based on the ABI in
// use.
class ABI_N64 {
list<Predicate> AdditionalPredicates = [IsN64];

@ -1286,7 +1286,7 @@ def LUiORiPred : PatLeaf<(imm), [{
return isInt<32>(SVal) && (SVal & 0xffff);
}]>;

// Mips Address Mode! SDNode frameindex could possibily be a match
// Mips Address Mode! SDNode frameindex could possibly be a match
// since load and store instructions from stack used it.
def addr :
ComplexPattern<iPTR, 2, "selectIntAddr", [frameindex]>;
@ -244,7 +244,7 @@ def FeaturePCRelativeMemops :
// !listconcat(FutureProcessorInheritableFeatures,
// FutureProcessorSpecificFeatures)

// Makes it explicit and obvious what is new in FutureProcesor vs. Power8 as
// Makes it explicit and obvious what is new in FutureProcessor vs. Power8 as
// well as providing a single point of definition if the feature set will be
// used elsewhere.
def ProcessorFeatures {

@ -928,7 +928,7 @@ def spe2dis : Operand<iPTR> { // SPE displacement where the imm is 2-aligned.
}

// A single-register address. This is used with the SjLj
// pseudo-instructions which tranlates to LD/LWZ. These instructions requires
// pseudo-instructions which translates to LD/LWZ. These instructions requires
// G8RC_NOX0 registers.
def memr : Operand<iPTR> {
let MIOperandInfo = (ops ptr_rc_nor0:$ptrreg);

@ -965,11 +965,11 @@ def iaddrX16 : ComplexPattern<iPTR, 2, "SelectAddrImmX16", [], []>; // "stxv"

// Below forms are all x-form addressing mode, use three different ones so we
// can make a accurate check for x-form instructions in ISEL.
// x-form addressing mode whose associated diplacement form is D.
// x-form addressing mode whose associated displacement form is D.
def xaddr : ComplexPattern<iPTR, 2, "SelectAddrIdx", [], []>; // "stbx"
// x-form addressing mode whose associated diplacement form is DS.
// x-form addressing mode whose associated displacement form is DS.
def xaddrX4 : ComplexPattern<iPTR, 2, "SelectAddrIdxX4", [], []>; // "stdx"
// x-form addressing mode whose associated diplacement form is DQ.
// x-form addressing mode whose associated displacement form is DQ.
def xaddrX16 : ComplexPattern<iPTR, 2, "SelectAddrIdxX16", [], []>; // "stxvx"

def xoaddr : ComplexPattern<iPTR, 2, "SelectAddrIdxOnly",[], []>;

@ -3645,7 +3645,7 @@ let AddedComplexity = 400, Predicates = [HasP9Vector] in {
(EXTRACT_SUBREG (VEXTRACTUB Idx, $src), sub_64)))>;
}

// Unsiged int in vsx register -> QP
// Unsigned int in vsx register -> QP
def : Pat<(f128 (uint_to_fp (i32 (PPCmfvsr f64:$src)))),
(f128 (XSCVUDQP
(XXEXTRACTUW (SUBREG_TO_REG (i64 1), $src, sub_64), 4)))>;

@ -3716,7 +3716,7 @@ let AddedComplexity = 400, Predicates = [HasP9Vector] in {
(VEXTRACTUB !head(!tail(Idx)), $src), sub_64)))>;
}

// Unsiged int in vsx register -> QP
// Unsigned int in vsx register -> QP
def : Pat<(f128 (uint_to_fp (i32 (PPCmfvsr f64:$src)))),
(f128 (XSCVUDQP
(XXEXTRACTUW (SUBREG_TO_REG (i64 1), $src, sub_64), 8)))>;

@ -156,7 +156,7 @@ foreach Index = 32-63 in {
def VSX#Index : VSXReg<Index, "vs"#Index>;
}

// The reprsentation of r0 when treated as the constant 0.
// The representation of r0 when treated as the constant 0.
def ZERO : GPR<0, "0">, DwarfRegAlias<R0>;
def ZERO8 : GP8<ZERO, "0">, DwarfRegAlias<X0>;


@ -20,7 +20,7 @@ def P9Model : SchedMachineModel {

// Load latency is 4 or 5 cycles depending on the load. This latency assumes
// that we have a cache hit. For a cache miss the load latency will be more.
// There are two instructions (lxvl, lxvll) that have a latencty of 6 cycles.
// There are two instructions (lxvl, lxvll) that have a latency of 6 cycles.
// However it is not worth bumping this value up to 6 when the vast majority
// of instructions are 4 or 5 cycles.
let LoadLatency = 5;

@ -40,7 +40,7 @@ def P9Model : SchedMachineModel {

let CompleteModel = 1;

// Do not support QPX (Quad Processing eXtension), SPE (Signal Procesing
// Do not support QPX (Quad Processing eXtension), SPE (Signal Processing
// Engine), prefixed instructions on Power 9 or PC relative mem ops.
let UnsupportedFeatures = [HasQPX, HasSPE, PrefixInstrs, PCRelativeMemops];

@ -784,7 +784,7 @@ def : MnemonicAlias<"sbreak", "ebreak">;
//
// Naming convention: For 'generic' pattern classes, we use the naming
// convention PatTy1Ty2. For pattern classes which offer a more complex
// expension, prefix the class name, e.g. BccPat.
// expansion, prefix the class name, e.g. BccPat.
//===----------------------------------------------------------------------===//

/// Generic pattern classes

@ -67,7 +67,7 @@ def RetCC_Sparc32 : CallingConv<[
// bits of an integer register while the float goes in a floating point
// register.
//
// The difference is encoded in LLVM IR using the inreg atttribute on function
// The difference is encoded in LLVM IR using the inreg attribute on function
// arguments:
//
// C: void f(float, float);

@ -1080,7 +1080,7 @@ let hasSideEffects = 1, rd = 0, rs1 = 0b01111, rs2 = 0 in
def STBAR : F3_1<2, 0b101000, (outs), (ins), "stbar", []>;


// Section B.31 - Unimplmented Instruction
// Section B.31 - Unimplemented Instruction
let rd = 0 in
def UNIMP : F2_1<0b000, (outs), (ins i32imm:$imm22),
"unimp $imm22", []>;

@ -1,4 +1,4 @@
//===-- SparcSchedule.td - Describe the Sparc Itineries ----*- tablegen -*-===//
//===-- SparcSchedule.td - Describe the Sparc Itineraries ----*- tablegen -*-=//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.

@ -4379,7 +4379,7 @@ class TernaryVRRdGeneric<string mnemonic, bits<16> opcode>
}

// Ternary operation where the assembler mnemonic has an extra operand to
// optionally allow specifiying arbitrary M6 values.
// optionally allow specifying arbitrary M6 values.
multiclass TernaryExtraVRRd<string mnemonic, bits<16> opcode,
SDPatternOperator operator,
TypedReg tr1, TypedReg tr2, bits<4> type> {

@ -5135,7 +5135,7 @@ multiclass MemorySS<string mnemonic, bits<8> opcode,
}
}

// The same, but setting a CC result as comparion operator.
// The same, but setting a CC result as comparison operator.
multiclass CompareMemorySS<string mnemonic, bits<8> opcode,
SDPatternOperator sequence, SDPatternOperator loop> {
def "" : SideEffectBinarySSa<mnemonic, opcode>;

@ -1564,7 +1564,7 @@ def : VectorReplicateScalar<v16i8, VREPB, 7>;
def : VectorReplicateScalar<v8i16, VREPH, 3>;
def : VectorReplicateScalar<v4i32, VREPF, 1>;

// i64 replications are just a single isntruction.
// i64 replications are just a single instruction.
def : Pat<(v2i64 (z_replicate GR64:$scalar)),
(VLVGP GR64:$scalar, GR64:$scalar)>;


@ -167,7 +167,7 @@ class FPConversion<Instruction insn, SDPatternOperator operator, TypedReg tr1,
: Pat<(tr1.vt (operator (tr2.vt tr2.op:$vec))),
(insn tr2.op:$vec, suppress, mode)>;

// Use INSN to perform mininum/maximum operation OPERATOR on type TR.
// Use INSN to perform minimum/maximum operation OPERATOR on type TR.
// FUNCTION is the type of minimum/maximum function to perform.
class FPMinMax<Instruction insn, SDPatternOperator operator, TypedReg tr,
bits<4> function>
@ -9,7 +9,7 @@
// Processor definitions.
//
// For compatibility with other compilers on the platform, each model can
// be identifed either by the system name (e.g. z10) or the level of the
// be identified either by the system name (e.g. z10) or the level of the
// architecture the model supports, as identified by the edition level
// of the z/Architecture Principles of Operation document (e.g. arch8).
//

@ -28,7 +28,7 @@ let Uses = [SP32, SP64], isCall = 1 in {

// CALL should take both variadic arguments and produce variadic results, but
// this is not possible to model directly. Instead, we select calls to a
// CALL_PARAMS taking variadic aguments linked with a CALL_RESULTS that handles
// CALL_PARAMS taking variadic arguments linked with a CALL_RESULTS that handles
// producing the call's variadic results. We recombine the two in a custom
// inserter hook after DAG ISel, so passes over MachineInstrs will only ever
// observe CALL nodes with all of the expected variadic uses and defs.

@ -7,7 +7,7 @@
//===----------------------------------------------------------------------===//
///
/// \file
/// WebAssembly refence type operand codegen constructs.
/// WebAssembly reference type operand codegen constructs.
///
//===----------------------------------------------------------------------===//


@ -997,7 +997,7 @@ def ProcessorFeatures {
class Proc<string Name, list<SubtargetFeature> Features>
: ProcessorModel<Name, GenericModel, Features>;

// NOTE: CMPXCHG8B is here for legacy compatbility so that it is only disabled
// NOTE: CMPXCHG8B is here for legacy compatibility so that it is only disabled
// if i386/i486 is specifically requested.
def : Proc<"generic", [FeatureX87, FeatureSlowUAMem16,
FeatureCMPXCHG8B, FeatureInsertVZEROUPPER]>;
@ -248,7 +248,7 @@ multiclass AVX512_maskable_common<bits<8> O, Format F, X86VectorVTInfo _,

// This multiclass generates the unconditional/non-masking, the masking and
// the zero-masking variant of the vector instruction. In the masking case, the
// perserved vector elements come from a new dummy input operand tied to $dst.
// preserved vector elements come from a new dummy input operand tied to $dst.
// This version uses a separate dag for non-masking and masking.
multiclass AVX512_maskable_split<bits<8> O, Format F, X86VectorVTInfo _,
dag Outs, dag Ins, string OpcodeStr,

@ -270,7 +270,7 @@ multiclass AVX512_maskable_split<bits<8> O, Format F, X86VectorVTInfo _,

// This multiclass generates the unconditional/non-masking, the masking and
// the zero-masking variant of the vector instruction. In the masking case, the
// perserved vector elements come from a new dummy input operand tied to $dst.
// preserved vector elements come from a new dummy input operand tied to $dst.
multiclass AVX512_maskable<bits<8> O, Format F, X86VectorVTInfo _,
dag Outs, dag Ins, string OpcodeStr,
string AttSrcAsm, string IntelSrcAsm,

@ -5673,7 +5673,7 @@ multiclass avx512_vptest<bits<8> opc, string OpcodeStr,
X86FoldableSchedWrite sched, X86VectorVTInfo _,
string Name> {
// NOTE: Patterns are omitted in favor of manual selection in X86ISelDAGToDAG.
// There are just too many permuations due to commutability and bitcasts.
// There are just too many permutations due to commutability and bitcasts.
let ExeDomain = _.ExeDomain, hasSideEffects = 0 in {
defm rr : AVX512_maskable_cmp<opc, MRMSrcReg, _, (outs _.KRC:$dst),
(ins _.RC:$src1, _.RC:$src2), OpcodeStr,

@ -7493,7 +7493,7 @@ multiclass avx512_cvt_fp_scalar<bits<8> opc, string OpcodeStr, X86VectorVTInfo _
}
}

// Scalar Coversion with SAE - suppress all exceptions
// Scalar Conversion with SAE - suppress all exceptions
multiclass avx512_cvt_fp_sae_scalar<bits<8> opc, string OpcodeStr, X86VectorVTInfo _,
X86VectorVTInfo _Src, SDNode OpNodeSAE,
X86FoldableSchedWrite sched> {

@ -7634,7 +7634,7 @@ let Uses = [MXCSR], mayRaiseFPException = 1 in {
EVEX, EVEX_B, Sched<[sched.Folded]>;
}
}
// Coversion with SAE - suppress all exceptions
// Conversion with SAE - suppress all exceptions
multiclass avx512_vcvt_fp_sae<bits<8> opc, string OpcodeStr, X86VectorVTInfo _,
X86VectorVTInfo _Src, SDNode OpNodeSAE,
X86FoldableSchedWrite sched> {

@ -12256,7 +12256,7 @@ multiclass avx512_binop_all2<bits<8> opc, string OpcodeStr,

let ExeDomain = SSEPackedSingle in
defm VCVTNE2PS2BF16 : avx512_binop_all2<0x72, "vcvtne2ps2bf16",
SchedWriteCvtPD2PS, //FIXME: Shoulod be SchedWriteCvtPS2BF
SchedWriteCvtPD2PS, //FIXME: Should be SchedWriteCvtPS2BF
avx512vl_f32_info, avx512vl_i16_info,
X86cvtne2ps2bf16, HasBF16, 0>, T8XD;


@ -847,7 +847,7 @@ defm LCMPXCHG8B : LCMPXCHG_UnOp<0xC7, MRM1m, "cmpxchg8b", X86cas8, i64mem>;
// it. In other words, the register will not fix the clobbering of
// RBX that will happen when setting the arguments for the instrucion.
//
// Unlike the actual related instuction, we mark that this one
// Unlike the actual related instruction, we mark that this one
// defines EBX (instead of using EBX).
// The rationale is that we will define RBX during the expansion of
// the pseudo. The argument feeding EBX is ebx_input.

@ -166,7 +166,7 @@ let ExeDomain = SSEPackedDouble in {
}

// All source register operands of FMA opcodes defined in fma3s_rm multiclass
// can be commuted. In many cases such commute transformation requres an opcode
// can be commuted. In many cases such commute transformation requires an opcode
// adjustment, for example, commuting the operands 1 and 2 in FMA*132 form
// would require an opcode change to FMA*231:
// FMA*132* reg1, reg2, reg3; // reg1 * reg3 + reg2;

@ -283,7 +283,7 @@ multiclass fma3s_rm_int<bits<8> opc, string OpcodeStr,
[]>, Sched<[sched.Folded, sched.ReadAfterFold, sched.ReadAfterFold]>;
}

// The FMA 213 form is created for lowering of scalar FMA intrinscis
// The FMA 213 form is created for lowering of scalar FMA intrinsics
// to machine instructions.
// The FMA 132 form can trivially be get by commuting the 2nd and 3rd operands
// of FMA 213 form.

@ -286,7 +286,7 @@ defm MUL : FPBinary_rr<any_fmul>;
defm DIV : FPBinary_rr<any_fdiv>;
}

// Sets the scheduling resources for the actual NAME#_F<size>m defintions.
// Sets the scheduling resources for the actual NAME#_F<size>m definitions.
let SchedRW = [WriteFAddLd] in {
defm ADD : FPBinary<any_fadd, MRM0m, "add">;
defm SUB : FPBinary<any_fsub, MRM4m, "sub">;
@ -7485,7 +7485,7 @@ void X86InstrInfo::setExecutionDomain(MachineInstr &MI, unsigned Domain) const {
assert((Subtarget.hasDQI() || Domain >= 3) && "Requires AVX-512DQ");
table = lookupAVX512(MI.getOpcode(), dom, ReplaceableInstrsAVX512DQ);
// Don't change integer Q instructions to D instructions and
// use D intructions if we started with a PS instruction.
// use D instructions if we started with a PS instruction.
if (table && Domain == 3 && (dom == 1 || table[3] == MI.getOpcode()))
Domain = 4;
}

@ -3232,7 +3232,7 @@ def PAUSE : I<0x90, RawFrm, (outs), (ins),

let SchedRW = [WriteFence] in {
// Load, store, and memory fence
// TODO: As with mfence, we may want to ease the availablity of sfence/lfence
// TODO: As with mfence, we may want to ease the availability of sfence/lfence
// to include any 64-bit target.
def SFENCE : I<0xAE, MRM_F8, (outs), (ins), "sfence", [(int_x86_sse_sfence)]>,
PS, Requires<[HasSSE1]>;

@ -7434,7 +7434,7 @@ def : Pat<(X86Blendi (loadv2i64 addr:$src2), VR128:$src1, timm:$src3),

// For insertion into the zero index (low half) of a 256-bit vector, it is
// more efficient to generate a blend with immediate instead of an insert*128.
// NOTE: We're using FP instructions here, but exeuction domain fixing should
// NOTE: We're using FP instructions here, but execution domain fixing should
// take care of using integer instructions when profitable.
let Predicates = [HasAVX] in {
def : Pat<(insert_subvector (v8i32 VR256:$src1), (v4i32 VR128:$src2), (iPTR 0)),

@ -498,7 +498,7 @@ def GR64_NOREX_NOSP : RegisterClass<"X86", [i64], 64,
// which we do not have right now.
def LOW32_ADDR_ACCESS : RegisterClass<"X86", [i32], 32, (add GR32, RIP)>;

// When RBP is used as a base pointer in a 32-bit addresses environement,
// When RBP is used as a base pointer in a 32-bit addresses environment,
// this is also safe to use the full register to access addresses.
// Since RBP will never be spilled, stick to a 32 alignment to save
// on memory consumption.

@ -1593,7 +1593,7 @@ def: InstRW<[BWWriteResGroup202], (instrs FSTENVm)>;
def: InstRW<[WriteZero], (instrs CLC)>;


// Intruction variants handled by the renamer. These might not need execution
// Instruction variants handled by the renamer. These might not need execution
// ports in certain conditions.
// See Agner's Fog "The microarchitecture of Intel, AMD and VIA CPUs",
// section "Haswell and Broadwell Pipeline" > "Register allocation and

@ -1838,7 +1838,7 @@ def: InstRW<[HWWriteResGroup190], (instrs VGATHERQPSrm, VPGATHERQDrm)>;
def: InstRW<[WriteZero], (instrs CLC)>;


// Intruction variants handled by the renamer. These might not need execution
// Instruction variants handled by the renamer. These might not need execution
// ports in certain conditions.
// See Agner's Fog "The microarchitecture of Intel, AMD and VIA CPUs",
// section "Haswell and Broadwell Pipeline" > "Register allocation and

@ -1106,7 +1106,7 @@ def: InstRW<[SBWriteResGroupVzeroupper], (instrs VZEROUPPER)>;

def: InstRW<[WriteZero], (instrs CLC)>;

// Intruction variants handled by the renamer. These might not need execution
// Instruction variants handled by the renamer. These might not need execution
// ports in certain conditions.
// See Agner's Fog "The microarchitecture of Intel, AMD and VIA CPUs",
// section "Sandy Bridge and Ivy Bridge Pipeline" > "Register allocation and

@ -1744,7 +1744,7 @@ def: InstRW<[SKLWriteResGroup223], (instrs FSTENVm)>;
def: InstRW<[WriteZero], (instrs CLC)>;


// Intruction variants handled by the renamer. These might not need execution
// Instruction variants handled by the renamer. These might not need execution
// ports in certain conditions.
// See Agner's Fog "The microarchitecture of Intel, AMD and VIA CPUs",
// section "Skylake Pipeline" > "Register allocation and renaming".

@ -2447,7 +2447,7 @@ def: InstRW<[SKXWriteResGroup267], (instrs PAUSE)>;
def: InstRW<[WriteZero], (instrs CLC)>;


// Intruction variants handled by the renamer. These might not need execution
// Instruction variants handled by the renamer. These might not need execution
// ports in certain conditions.
// See Agner's Fog "The microarchitecture of Intel, AMD and VIA CPUs",
// section "Skylake Pipeline" > "Register allocation and renaming".
@ -91,7 +91,7 @@ def TestTarget : Target;
// CHECK-LABEL: SubRegIndex sub0:
// CHECK-LABEL: SubRegIndex sub1:
// CHECK-LABEL: SubRegIndex sub2:
// Check infered indexes:
// Check inferred indexes:
// CHECK: SubRegIndex ssub1_ssub2:
// CHECK: SubRegIndex ssub3_ssub4:
// CHECK: SubRegIndex ssub0_ssub1_ssub2_ssub3:

@ -2,7 +2,7 @@

class C;

// TableGen prints records in alpabetical order.
// TableGen prints records in alphabetical order.
// CHECK-NOT: def ifdef_disabled1
// CHECK-NOT: def ifdef_disabled2
// CHECK: def ifdef_disabled3