mirror of
https://github.com/RPCS3/llvm-mirror.git
synced 2025-01-31 12:41:49 +01:00
[AMDGPU][MC][DOC] Updated AMD GPU assembler description.
Summary of changes:
- Updated to reflect recent changes in assembler;
- Minor bugfixing and improvements.

llvm-svn: 372857
parent ce36c01e6a
commit ebbe05934d
@ -566,7 +566,7 @@ SOPC
|
|||||||
s_cmp_lg_u64 :ref:`ssrc0<amdgpu_synid8_ssrc64_0>`, :ref:`ssrc1<amdgpu_synid8_ssrc64_0>`
|
s_cmp_lg_u64 :ref:`ssrc0<amdgpu_synid8_ssrc64_0>`, :ref:`ssrc1<amdgpu_synid8_ssrc64_0>`
|
||||||
s_cmp_lt_i32 :ref:`ssrc0<amdgpu_synid8_ssrc32_0>`, :ref:`ssrc1<amdgpu_synid8_ssrc32_0>`
|
s_cmp_lt_i32 :ref:`ssrc0<amdgpu_synid8_ssrc32_0>`, :ref:`ssrc1<amdgpu_synid8_ssrc32_0>`
|
||||||
s_cmp_lt_u32 :ref:`ssrc0<amdgpu_synid8_ssrc32_0>`, :ref:`ssrc1<amdgpu_synid8_ssrc32_0>`
|
s_cmp_lt_u32 :ref:`ssrc0<amdgpu_synid8_ssrc32_0>`, :ref:`ssrc1<amdgpu_synid8_ssrc32_0>`
|
||||||
s_set_gpr_idx_on :ref:`ssrc<amdgpu_synid8_ssrc32_0>`, :ref:`imm4<amdgpu_synid8_imm4>`
|
s_set_gpr_idx_on :ref:`ssrc<amdgpu_synid8_ssrc32_0>`, :ref:`imask<amdgpu_synid8_imask>`
|
||||||
s_setvskip :ref:`ssrc0<amdgpu_synid8_ssrc32_0>`, :ref:`ssrc1<amdgpu_synid8_ssrc32_0>`
|
s_setvskip :ref:`ssrc0<amdgpu_synid8_ssrc32_0>`, :ref:`ssrc1<amdgpu_synid8_ssrc32_0>`
|
||||||
|
|
||||||
SOPK
|
SOPK
|
||||||
@ -624,7 +624,7 @@ SOPP
|
|||||||
s_nop :ref:`imm16<amdgpu_synid8_bimm16>`
|
s_nop :ref:`imm16<amdgpu_synid8_bimm16>`
|
||||||
s_sendmsg :ref:`msg<amdgpu_synid8_msg>`
|
s_sendmsg :ref:`msg<amdgpu_synid8_msg>`
|
||||||
s_sendmsghalt :ref:`msg<amdgpu_synid8_msg>`
|
s_sendmsghalt :ref:`msg<amdgpu_synid8_msg>`
|
||||||
s_set_gpr_idx_mode :ref:`imm4<amdgpu_synid8_imm4>`
|
s_set_gpr_idx_mode :ref:`imask<amdgpu_synid8_imask>`
|
||||||
s_set_gpr_idx_off
|
s_set_gpr_idx_off
|
||||||
s_sethalt :ref:`imm16<amdgpu_synid8_bimm16>`
|
s_sethalt :ref:`imm16<amdgpu_synid8_bimm16>`
|
||||||
s_setkill :ref:`imm16<amdgpu_synid8_bimm16>`
|
s_setkill :ref:`imm16<amdgpu_synid8_bimm16>`
|
||||||
@ -1756,7 +1756,7 @@ VOPC
|
|||||||
gfx8_fimm16
|
gfx8_fimm16
|
||||||
gfx8_fimm32
|
gfx8_fimm32
|
||||||
gfx8_hwreg
|
gfx8_hwreg
|
||||||
gfx8_imm4
|
gfx8_imask
|
||||||
gfx8_label
|
gfx8_label
|
||||||
gfx8_msg
|
gfx8_msg
|
||||||
gfx8_param
|
gfx8_param
|
||||||
|
@ -736,7 +736,7 @@ SOPC
|
|||||||
s_cmp_lg_u64 :ref:`ssrc0<amdgpu_synid9_ssrc64_0>`, :ref:`ssrc1<amdgpu_synid9_ssrc64_0>`
|
s_cmp_lg_u64 :ref:`ssrc0<amdgpu_synid9_ssrc64_0>`, :ref:`ssrc1<amdgpu_synid9_ssrc64_0>`
|
||||||
s_cmp_lt_i32 :ref:`ssrc0<amdgpu_synid9_ssrc32_0>`, :ref:`ssrc1<amdgpu_synid9_ssrc32_0>`
|
s_cmp_lt_i32 :ref:`ssrc0<amdgpu_synid9_ssrc32_0>`, :ref:`ssrc1<amdgpu_synid9_ssrc32_0>`
|
||||||
s_cmp_lt_u32 :ref:`ssrc0<amdgpu_synid9_ssrc32_0>`, :ref:`ssrc1<amdgpu_synid9_ssrc32_0>`
|
s_cmp_lt_u32 :ref:`ssrc0<amdgpu_synid9_ssrc32_0>`, :ref:`ssrc1<amdgpu_synid9_ssrc32_0>`
|
||||||
s_set_gpr_idx_on :ref:`ssrc<amdgpu_synid9_ssrc32_0>`, :ref:`imm4<amdgpu_synid9_imm4>`
|
s_set_gpr_idx_on :ref:`ssrc<amdgpu_synid9_ssrc32_0>`, :ref:`imask<amdgpu_synid9_imask>`
|
||||||
s_setvskip :ref:`ssrc0<amdgpu_synid9_ssrc32_0>`, :ref:`ssrc1<amdgpu_synid9_ssrc32_0>`
|
s_setvskip :ref:`ssrc0<amdgpu_synid9_ssrc32_0>`, :ref:`ssrc1<amdgpu_synid9_ssrc32_0>`
|
||||||
|
|
||||||
SOPK
|
SOPK
|
||||||
@ -796,7 +796,7 @@ SOPP
|
|||||||
s_nop :ref:`imm16<amdgpu_synid9_bimm16>`
|
s_nop :ref:`imm16<amdgpu_synid9_bimm16>`
|
||||||
s_sendmsg :ref:`msg<amdgpu_synid9_msg>`
|
s_sendmsg :ref:`msg<amdgpu_synid9_msg>`
|
||||||
s_sendmsghalt :ref:`msg<amdgpu_synid9_msg>`
|
s_sendmsghalt :ref:`msg<amdgpu_synid9_msg>`
|
||||||
s_set_gpr_idx_mode :ref:`imm4<amdgpu_synid9_imm4>`
|
s_set_gpr_idx_mode :ref:`imask<amdgpu_synid9_imask>`
|
||||||
s_set_gpr_idx_off
|
s_set_gpr_idx_off
|
||||||
s_sethalt :ref:`imm16<amdgpu_synid9_bimm16>`
|
s_sethalt :ref:`imm16<amdgpu_synid9_bimm16>`
|
||||||
s_setkill :ref:`imm16<amdgpu_synid9_bimm16>`
|
s_setkill :ref:`imm16<amdgpu_synid9_bimm16>`
|
||||||
@ -2010,7 +2010,7 @@ VOPC
|
|||||||
gfx9_fimm16
|
gfx9_fimm16
|
||||||
gfx9_fimm32
|
gfx9_fimm32
|
||||||
gfx9_hwreg
|
gfx9_hwreg
|
||||||
gfx9_imm4
|
gfx9_imask
|
||||||
gfx9_label
|
gfx9_label
|
||||||
gfx9_msg
|
gfx9_msg
|
||||||
gfx9_param
|
gfx9_param
|
||||||
|
@ -10,5 +10,5 @@
|
|||||||
imm16
|
imm16
|
||||||
===========================
|
===========================
|
||||||
|
|
||||||
An :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 16 bits.
|
A 16-bit :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range -32768..65535.
|
||||||
|
|
||||||
|
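The new range rule for this operand (-32768..65535) can be illustrated with a small Python sketch. It is not the assembler's code: the helper name `encode_imm16` is hypothetical, and keeping the low 16 bits of an accepted value is an assumption about how the value is stored.

```python
def encode_imm16(value: int) -> int:
    """Check the documented range and return a 16-bit field value.

    Assumption: an accepted value is stored as its low 16 bits
    (two's complement for negative numbers).
    """
    if not -32768 <= value <= 65535:
        raise ValueError(f"imm16 value {value} is outside -32768..65535")
    return value & 0xFFFF
```

Under this assumption `encode_imm16(-1)` and `encode_imm16(65535)` give the same field value, while 65536 is rejected rather than truncated, which is the behavioral change the updated description states.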
@ -10,5 +10,5 @@
|
|||||||
imm32
|
imm32
|
||||||
===========================
|
===========================
|
||||||
|
|
||||||
An :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 32 bits.
|
An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value is truncated to 32 bits.
|
||||||
|
|
||||||
|
@ -21,7 +21,7 @@ Optionally may serve as an output data:
|
|||||||
* :ref:`dmask<amdgpu_synid_dmask>` may specify 2 data elements for 32-bit-per-pixel surfaces or 4 data elements for 64-bit-per-pixel surfaces. Each data element occupies 1 dword.
|
* :ref:`dmask<amdgpu_synid_dmask>` may specify 2 data elements for 32-bit-per-pixel surfaces or 4 data elements for 64-bit-per-pixel surfaces. Each data element occupies 1 dword.
|
||||||
* :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
|
* :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
|
||||||
|
|
||||||
Note. The surface data format is indicated in the image resource constant but not in the instruction.
|
Note: the surface data format is indicated in the image resource constant but not in the instruction.
|
||||||
|
|
||||||
|
|
||||||
*Operands:* :ref:`v<amdgpu_synid_v>`
|
*Operands:* :ref:`v<amdgpu_synid_v>`
|
||||||
|
@ -21,6 +21,6 @@ Optionally may serve as an output data:
|
|||||||
* :ref:`dmask<amdgpu_synid_dmask>` may specify 1 data element for 32-bit-per-pixel surfaces or 2 data elements for 64-bit-per-pixel surfaces. Each data element occupies 1 dword.
|
* :ref:`dmask<amdgpu_synid_dmask>` may specify 1 data element for 32-bit-per-pixel surfaces or 2 data elements for 64-bit-per-pixel surfaces. Each data element occupies 1 dword.
|
||||||
* :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
|
* :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
|
||||||
|
|
||||||
Note. The surface data format is indicated in the image resource constant but not in the instruction.
|
Note: the surface data format is indicated in the image resource constant but not in the instruction.
|
||||||
|
|
||||||
*Operands:* :ref:`v<amdgpu_synid_v>`
|
*Operands:* :ref:`v<amdgpu_synid_v>`
|
||||||
|
@ -10,5 +10,6 @@
|
|||||||
imm32
|
imm32
|
||||||
===========================
|
===========================
|
||||||
|
|
||||||
An :ref:`integer_number<amdgpu_synid_integer_number>` or a :ref:`floating-point_number<amdgpu_synid_floating-point_number>`. The number is converted to *f16* as described :ref:`here<amdgpu_synid_lit_conv>`.
|
A :ref:`floating-point_number<amdgpu_synid_floating-point_number>`, an :ref:`integer_number<amdgpu_synid_integer_number>`, or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`.
|
||||||
|
The value is converted to *f16* as described :ref:`here<amdgpu_synid_fp_conv>`.
|
||||||
|
|
||||||
|
@ -10,5 +10,6 @@
|
|||||||
imm32
|
imm32
|
||||||
===========================
|
===========================
|
||||||
|
|
||||||
An :ref:`integer_number<amdgpu_synid_integer_number>` or a :ref:`floating-point_number<amdgpu_synid_floating-point_number>`. The value is converted to *f32* as described :ref:`here<amdgpu_synid_lit_conv>`.
|
A :ref:`floating-point_number<amdgpu_synid_floating-point_number>`, an :ref:`integer_number<amdgpu_synid_integer_number>`, or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`.
|
||||||
|
The value is converted to *f32* as described :ref:`here<amdgpu_synid_fp_conv>`.
|
||||||
|
|
||||||
|
@ -14,18 +14,21 @@ Bits of a hardware register being accessed.
|
|||||||
|
|
||||||
The bits of this operand have the following meaning:
|
The bits of this operand have the following meaning:
|
||||||
|
|
||||||
============ ===================================
|
======= ===================== ============
|
||||||
Bits Description
|
Bits Description Value Range
|
||||||
============ ===================================
|
======= ===================== ============
|
||||||
5:0 Register *id*.
|
5:0 Register *id*. 0..63
|
||||||
10:6 First bit *offset* (0..31).
|
10:6 First bit *offset*. 0..31
|
||||||
15:11 *Size* in bits (1..32).
|
15:11 *Size* in bits. 1..32
|
||||||
============ ===================================
|
======= ===================== ============
|
||||||
|
|
||||||
This operand may be specified as a positive 16-bit :ref:`integer_number<amdgpu_synid_integer_number>` or using the syntax described below.
|
This operand may be specified as one of the following:
|
||||||
|
|
||||||
|
* An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range 0..0xFFFF.
|
||||||
|
* An *hwreg* value described below.
|
||||||
|
|
||||||
==================================== ============================================================================
|
==================================== ============================================================================
|
||||||
Syntax Description
|
Hwreg Value Syntax Description
|
||||||
==================================== ============================================================================
|
==================================== ============================================================================
|
||||||
hwreg({0..63}) All bits of a register indicated by its *id*.
|
hwreg({0..63}) All bits of a register indicated by its *id*.
|
||||||
hwreg(<*name*>) All bits of a register indicated by its *name*.
|
hwreg(<*name*>) All bits of a register indicated by its *name*.
|
||||||
@ -33,7 +36,8 @@ This operand may be specified as a positive 16-bit :ref:`integer_number<amdgpu_s
|
|||||||
hwreg(<*name*>, {0..31}, {1..32}) Register bits indicated by register *name*, first bit *offset* and *size*.
|
hwreg(<*name*>, {0..31}, {1..32}) Register bits indicated by register *name*, first bit *offset* and *size*.
|
||||||
==================================== ============================================================================
|
==================================== ============================================================================
|
||||||
|
|
||||||
Register *id*, *offset* and *size* must be specified as positive :ref:`integer numbers<amdgpu_synid_integer_number>`.
|
Numeric values may be specified as positive :ref:`integer numbers<amdgpu_synid_integer_number>`
|
||||||
|
or :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
|
||||||
|
|
||||||
Defined register *names* include:
|
Defined register *names* include:
|
||||||
|
|
||||||
@ -62,7 +66,16 @@ Examples:
|
|||||||
|
|
||||||
.. parsed-literal::
|
.. parsed-literal::
|
||||||
|
|
||||||
s_getreg_b32 s2, 0x6
|
reg = 1
|
||||||
|
offset = 2
|
||||||
|
size = 4
|
||||||
|
hwreg_enc = reg | (offset << 6) | ((size - 1) << 11)
|
||||||
|
|
||||||
|
s_getreg_b32 s2, 0x1881
|
||||||
|
s_getreg_b32 s2, hwreg_enc // the same as above
|
||||||
|
s_getreg_b32 s2, hwreg(1, 2, 4) // the same as above
|
||||||
|
s_getreg_b32 s2, hwreg(reg, offset, size) // the same as above
|
||||||
|
|
||||||
s_getreg_b32 s2, hwreg(15)
|
s_getreg_b32 s2, hwreg(15)
|
||||||
s_getreg_b32 s2, hwreg(51, 1, 31)
|
s_getreg_b32 s2, hwreg(51, 1, 31)
|
||||||
s_getreg_b32 s2, hwreg(HW_REG_LDS_ALLOC, 0, 1)
|
s_getreg_b32 s2, hwreg(HW_REG_LDS_ALLOC, 0, 1)
|
||||||
|
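The `hwreg_enc` expression in the example above follows directly from the bit table (id in bits 5:0, offset in bits 10:6, size minus one in bits 15:11). A minimal Python sketch of the same packing is given here for illustration. The function name `encode_hwreg` is hypothetical, and the defaults `offset=0, size=32` are an assumption about what "all bits of a register" means.

```python
def encode_hwreg(reg_id: int, offset: int = 0, size: int = 32) -> int:
    """Pack a hwreg operand the way the example's hwreg_enc expression does."""
    if not 0 <= reg_id <= 63:
        raise ValueError("register id must be in 0..63")
    if not 0 <= offset <= 31:
        raise ValueError("offset must be in 0..31")
    if not 1 <= size <= 32:
        raise ValueError("size must be in 1..32")
    # The size field stores size - 1, as in the example's expression.
    return reg_id | (offset << 6) | ((size - 1) << 11)
```

With the example's values, `encode_hwreg(1, 2, 4)` yields `0x1881`, matching the numeric operand used in the first `s_getreg_b32` line.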
@ -12,19 +12,26 @@ label
|
|||||||
|
|
||||||
A branch target which is a 16-bit signed integer treated as a PC-relative dword offset.
|
A branch target which is a 16-bit signed integer treated as a PC-relative dword offset.
|
||||||
|
|
||||||
This operand may be specified as:
|
This operand may be specified as one of the following:
|
||||||
|
|
||||||
* An :ref:`integer_number<amdgpu_synid_integer_number>`. The number is truncated to 16 bits.
|
* An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range -32768..65535.
|
||||||
* An :ref:`absolute_expression<amdgpu_synid_absolute_expression>` which must start with an :ref:`integer_number<amdgpu_synid_integer_number>`. The value of the expression is truncated to 16 bits.
|
* A :ref:`symbol<amdgpu_synid_symbol>` (for example, a label) representing a relocatable address in the same compilation unit where it is referred from. The value is handled as a 16-bit PC-relative dword offset to be resolved by a linker.
|
||||||
* A :ref:`symbol<amdgpu_synid_symbol>` (for example, a label). The value is handled as a 16-bit PC-relative dword offset to be resolved by a linker.
|
|
||||||
|
|
||||||
Examples:
|
Examples:
|
||||||
|
|
||||||
.. parsed-literal::
|
.. parsed-literal::
|
||||||
|
|
||||||
offset = 30
|
offset = 30
|
||||||
s_branch loop_end
|
label_1:
|
||||||
s_branch 2 + offset
|
label_2 = . + 4
|
||||||
s_branch 32
|
|
||||||
loop_end:
|
s_branch 32
|
||||||
|
s_branch offset + 2
|
||||||
|
s_branch label_1
|
||||||
|
s_branch label_2
|
||||||
|
s_branch label_3
|
||||||
|
s_branch label_4
|
||||||
|
|
||||||
|
label_3 = label_2 + 4
|
||||||
|
label_4:
|
||||||
|
|
||||||
|
@ -12,24 +12,29 @@ msg
|
|||||||
|
|
||||||
A 16-bit message code. The bits of this operand have the following meaning:
|
A 16-bit message code. The bits of this operand have the following meaning:
|
||||||
|
|
||||||
============ ======================================================
|
============ =============================== ===============
|
||||||
Bits Description
|
Bits Description Value Range
|
||||||
============ ======================================================
|
============ =============================== ===============
|
||||||
3:0 Message *type*.
|
3:0 Message *type*. 0..15
|
||||||
6:4 Optional *operation*.
|
6:4 Optional *operation*. 0..7
|
||||||
9:7 Optional *parameters*.
|
7:7 Unused. \-
|
||||||
15:10 Unused.
|
9:8 Optional *stream*. 0..3
|
||||||
============ ======================================================
|
15:10 Unused. \-
|
||||||
|
============ =============================== ===============
|
||||||
|
|
||||||
This operand may be specified as a positive 16-bit :ref:`integer_number<amdgpu_synid_integer_number>` or using the syntax described below:
|
This operand may be specified as one of the following:
|
||||||
|
|
||||||
======================================== ========================================================================
|
* An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range 0..0xFFFF.
|
||||||
Syntax Description
|
* A *sendmsg* value described below.
|
||||||
======================================== ========================================================================
|
|
||||||
sendmsg(<*type*>) A message identified by its *type*.
|
==================================== ====================================================
|
||||||
sendmsg(<*type*>, <*op*>) A message identified by its *type* and *operation*.
|
Sendmsg Value Syntax Description
|
||||||
sendmsg(<*type*>, <*op*>, <*stream*>) A message identified by its *type* and *operation* with a stream *id*.
|
==================================== ====================================================
|
||||||
======================================== ========================================================================
|
sendmsg(<*type*>) A message identified by its *type*.
|
||||||
|
sendmsg(<*type*>,<*op*>) A message identified by its *type* and *operation*.
|
||||||
|
sendmsg(<*type*>,<*op*>,<*stream*>) A message identified by its *type* and *operation*
|
||||||
|
with a stream *id*.
|
||||||
|
==================================== ====================================================
|
||||||
|
|
||||||
*Type* may be specified using message *name* or message *id*.
|
*Type* may be specified using message *name* or message *id*.
|
||||||
|
|
||||||
@ -37,7 +42,8 @@ This operand may be specified as a positive 16-bit :ref:`integer_number<amdgpu_s
|
|||||||
|
|
||||||
Stream *id* is an integer in the range 0..3.
|
Stream *id* is an integer in the range 0..3.
|
||||||
|
|
||||||
Message *id*, operation *id* and stream *id* must be specified as positive :ref:`integer numbers<amdgpu_synid_integer_number>`.
|
Numeric values may be specified as positive :ref:`integer numbers<amdgpu_synid_integer_number>`
|
||||||
|
or :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
|
||||||
|
|
||||||
Each message type supports specific operations:
|
Each message type supports specific operations:
|
||||||
|
|
||||||
@ -60,16 +66,32 @@ Each message type supports specific operations:
|
|||||||
\ SYSMSG_OP_TTRACE_PC 4 \-
|
\ SYSMSG_OP_TTRACE_PC 4 \-
|
||||||
================= ========== ============================== ============ ==========
|
================= ========== ============================== ============ ==========
|
||||||
|
|
||||||
|
*Sendmsg* arguments are validated depending on how *type* value is specified:
|
||||||
|
|
||||||
|
* If message *type* is specified by name, argument values must satisfy the limitations detailed in the table above.
|
||||||
|
* If message *type* is specified as a number, each argument must not exceed the corresponding value range (see the first table).
|
||||||
|
|
||||||
Examples:
|
Examples:
|
||||||
|
|
||||||
.. parsed-literal::
|
.. parsed-literal::
|
||||||
|
|
||||||
|
// numeric message code
|
||||||
|
msg = 0x10
|
||||||
s_sendmsg 0x12
|
s_sendmsg 0x12
|
||||||
|
s_sendmsg msg + 2
|
||||||
|
|
||||||
|
// sendmsg with strict arguments validation
|
||||||
s_sendmsg sendmsg(MSG_INTERRUPT)
|
s_sendmsg sendmsg(MSG_INTERRUPT)
|
||||||
s_sendmsg sendmsg(MSG_GET_DOORBELL)
|
|
||||||
s_sendmsg sendmsg(2, GS_OP_CUT)
|
|
||||||
s_sendmsg sendmsg(MSG_GS, GS_OP_EMIT)
|
s_sendmsg sendmsg(MSG_GS, GS_OP_EMIT)
|
||||||
s_sendmsg sendmsg(MSG_GS, 2)
|
s_sendmsg sendmsg(MSG_GS, 2)
|
||||||
s_sendmsg sendmsg(MSG_GS_DONE, GS_OP_EMIT_CUT, 1)
|
s_sendmsg sendmsg(MSG_GS_DONE, GS_OP_EMIT_CUT, 1)
|
||||||
s_sendmsg sendmsg(MSG_SYSMSG, SYSMSG_OP_TTRACE_PC)
|
s_sendmsg sendmsg(MSG_SYSMSG, SYSMSG_OP_TTRACE_PC)
|
||||||
|
s_sendmsg sendmsg(MSG_GET_DOORBELL)
|
||||||
|
|
||||||
|
// sendmsg with validation of value range only
|
||||||
|
msg = 2
|
||||||
|
op = 3
|
||||||
|
stream = 1
|
||||||
|
s_sendmsg sendmsg(msg, op, stream)
|
||||||
|
s_sendmsg sendmsg(2, GS_OP_CUT)
|
||||||
|
|
||||||
|
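The range-only validation case can be sketched in Python from the first table of the new text (type in bits 3:0, operation in bits 6:4, stream in bits 9:8). This is an illustration, not the assembler's implementation. The name `encode_sendmsg` is hypothetical, and defaulting an omitted operation or stream to 0 is an assumption.

```python
def encode_sendmsg(msg_type: int, op: int = 0, stream: int = 0) -> int:
    """Pack a numeric sendmsg operand, checking value ranges only."""
    if not 0 <= msg_type <= 15:
        raise ValueError("message type must be in 0..15")
    if not 0 <= op <= 7:
        raise ValueError("operation must be in 0..7")
    if not 0 <= stream <= 3:
        raise ValueError("stream must be in 0..3")
    # Bit 7 and bits 15:10 are unused and left as zero.
    return msg_type | (op << 4) | (stream << 8)
```

For the values in the last example (`msg = 2`, `op = 3`, `stream = 1`) this gives `0x132`. The per-message restrictions that apply when the type is given by name are not modeled.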
@ -12,7 +12,8 @@ imm3
|
|||||||
|
|
||||||
A bit mask which indicates request permissions.
|
A bit mask which indicates request permissions.
|
||||||
|
|
||||||
This operand must be specified as an :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 7 bits, but only 3 low bits are significant.
|
This operand must be specified as an :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`.
|
||||||
|
The value is truncated to 7 bits, but only 3 low bits are significant.
|
||||||
|
|
||||||
============ ==============================
|
============ ==============================
|
||||||
Bit Number Description
|
Bit Number Description
|
||||||
|
@ -10,5 +10,5 @@
|
|||||||
imm16
|
imm16
|
||||||
===========================
|
===========================
|
||||||
|
|
||||||
An :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 16 bits and then sign-extended to 32 bits.
|
An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range -32768..65535.
|
||||||
|
|
||||||
|
@ -10,5 +10,5 @@
|
|||||||
imm16
|
imm16
|
||||||
===========================
|
===========================
|
||||||
|
|
||||||
An :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 16 bits and then zero-extended to 32 bits.
|
An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range 0..65535.
|
||||||
|
|
||||||
|
@ -14,30 +14,31 @@ Counts of outstanding instructions to wait for.
|
|||||||
|
|
||||||
The bits of this operand have the following meaning:
|
The bits of this operand have the following meaning:
|
||||||
|
|
||||||
============ ======================================================
|
========== ========= ================================================ ============
|
||||||
Bits Description
|
High Bits Low Bits Description Value Range
|
||||||
============ ======================================================
|
========== ========= ================================================ ============
|
||||||
3:0 VM_CNT: vector memory operations count, lower bits.
|
15:14 3:0 VM_CNT: vector memory operations count. 0..63
|
||||||
6:4 EXP_CNT: export count.
|
\- 6:4 EXP_CNT: export count. 0..7
|
||||||
11:8 LGKM_CNT: LDS, GDS, Constant and Message count.
|
\- 11:8 LGKM_CNT: LDS, GDS, Constant and Message count. 0..15
|
||||||
15:14 VM_CNT: vector memory operations count, upper bits.
|
========== ========= ================================================ ============
|
||||||
============ ======================================================
|
|
||||||
|
|
||||||
This operand may be specified as a positive 16-bit :ref:`integer_number<amdgpu_synid_integer_number>`
|
This operand may be specified as one of the following:
|
||||||
or as a combination of the following symbolic helpers:
|
|
||||||
|
* An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range 0..0xFFFF.
|
||||||
|
* A combination of *vmcnt*, *expcnt*, *lgkmcnt* and other values described below.
|
||||||
|
|
||||||
====================== ======================================================================
|
====================== ======================================================================
|
||||||
Syntax Description
|
Syntax Description
|
||||||
====================== ======================================================================
|
====================== ======================================================================
|
||||||
vmcnt(<*N*>) VM_CNT value. *N* must not exceed the largest VM_CNT value.
|
vmcnt(<*N*>) A VM_CNT value. *N* must not exceed the largest VM_CNT value.
|
||||||
expcnt(<*N*>) EXP_CNT value. *N* must not exceed the largest EXP_CNT value.
|
expcnt(<*N*>) An EXP_CNT value. *N* must not exceed the largest EXP_CNT value.
|
||||||
lgkmcnt(<*N*>) LGKM_CNT value. *N* must not exceed the largest LGKM_CNT value.
|
lgkmcnt(<*N*>) An LGKM_CNT value. *N* must not exceed the largest LGKM_CNT value.
|
||||||
vmcnt_sat(<*N*>) VM_CNT value computed as min(*N*, the largest VM_CNT value).
|
vmcnt_sat(<*N*>) A VM_CNT value computed as min(*N*, the largest VM_CNT value).
|
||||||
expcnt_sat(<*N*>) EXP_CNT value computed as min(*N*, the largest EXP_CNT value).
|
expcnt_sat(<*N*>) An EXP_CNT value computed as min(*N*, the largest EXP_CNT value).
|
||||||
lgkmcnt_sat(<*N*>) LGKM_CNT value computed as min(*N*, the largest LGKM_CNT value).
|
lgkmcnt_sat(<*N*>) An LGKM_CNT value computed as min(*N*, the largest LGKM_CNT value).
|
||||||
====================== ======================================================================
|
====================== ======================================================================
|
||||||
|
|
||||||
These helpers may be specified in any order. Ampersands and commas may be used as optional separators.
|
These values may be specified in any order. Spaces, ampersands and commas may be used as optional separators.
|
||||||
|
|
||||||
*N* is either an
|
*N* is either an
|
||||||
:ref:`integer number<amdgpu_synid_integer_number>` or an
|
:ref:`integer number<amdgpu_synid_integer_number>` or an
|
||||||
@ -47,10 +48,18 @@ Examples:
|
|||||||
|
|
||||||
.. parsed-literal::
|
.. parsed-literal::
|
||||||
|
|
||||||
s_waitcnt 0
|
vm_cnt = 1
|
||||||
|
exp_cnt = 2
|
||||||
|
lgkm_cnt = 3
|
||||||
|
cnt = vm_cnt | (exp_cnt << 4) | (lgkm_cnt << 8)
|
||||||
|
|
||||||
|
s_waitcnt cnt
|
||||||
|
s_waitcnt 1 | (2 << 4) | (3 << 8) // the same as above
|
||||||
|
s_waitcnt vmcnt(1) expcnt(2) lgkmcnt(3) // the same as above
|
||||||
|
s_waitcnt vmcnt(vm_cnt) expcnt(exp_cnt) lgkmcnt(lgkm_cnt) // the same as above
|
||||||
|
|
||||||
s_waitcnt vmcnt(1)
|
s_waitcnt vmcnt(1)
|
||||||
s_waitcnt expcnt(2) lgkmcnt(3)
|
s_waitcnt expcnt(2) lgkmcnt(3)
|
||||||
s_waitcnt vmcnt(1) expcnt(2) lgkmcnt(3)
|
|
||||||
s_waitcnt vmcnt(1), expcnt(2), lgkmcnt(3)
|
s_waitcnt vmcnt(1), expcnt(2), lgkmcnt(3)
|
||||||
s_waitcnt vmcnt(1) & lgkmcnt_sat(100) & expcnt(2)
|
s_waitcnt vmcnt(1) & lgkmcnt_sat(100) & expcnt(2)
|
||||||
|
|
||||||
|
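The updated table splits VM_CNT into low bits 3:0 and high bits 15:14, which the `cnt` expression in the example does not need because its `vm_cnt` fits in four bits. A Python sketch of the full packing follows. The name `encode_waitcnt` is hypothetical, and all three counters are required because the treatment of omitted counters is not covered here.

```python
def encode_waitcnt(vm_cnt: int, exp_cnt: int, lgkm_cnt: int) -> int:
    """Pack the three counters using the bit positions from the table."""
    if not 0 <= vm_cnt <= 63:
        raise ValueError("VM_CNT must be in 0..63")
    if not 0 <= exp_cnt <= 7:
        raise ValueError("EXP_CNT must be in 0..7")
    if not 0 <= lgkm_cnt <= 15:
        raise ValueError("LGKM_CNT must be in 0..15")
    vm_low = vm_cnt & 0xF   # bits 3:0
    vm_high = vm_cnt >> 4   # bits 15:14
    return vm_low | (exp_cnt << 4) | (lgkm_cnt << 8) | (vm_high << 14)
```

For the example's counters, `encode_waitcnt(1, 2, 3)` equals `1 | (2 << 4) | (3 << 8)`, that is `0x321`. A `vm_cnt` above 15 additionally sets bits 15:14.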
@ -10,5 +10,5 @@
|
|||||||
imm16
|
imm16
|
||||||
===========================
|
===========================
|
||||||
|
|
||||||
An :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 16 bits.
|
A 16-bit :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range -32768..65535.
|
||||||
|
|
||||||
|
@ -10,5 +10,5 @@
|
|||||||
imm32
|
imm32
|
||||||
===========================
|
===========================
|
||||||
|
|
||||||
An :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 32 bits.
|
An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value is truncated to 32 bits.
|
||||||
|
|
||||||
|
@ -21,7 +21,7 @@ Optionally may serve as an output data:
|
|||||||
* :ref:`dmask<amdgpu_synid_dmask>` may specify 2 data elements for 32-bit-per-pixel surfaces or 4 data elements for 64-bit-per-pixel surfaces. Each data element occupies 1 dword.
|
* :ref:`dmask<amdgpu_synid_dmask>` may specify 2 data elements for 32-bit-per-pixel surfaces or 4 data elements for 64-bit-per-pixel surfaces. Each data element occupies 1 dword.
|
||||||
* :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
|
* :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
|
||||||
|
|
||||||
Note. The surface data format is indicated in the image resource constant but not in the instruction.
|
Note: the surface data format is indicated in the image resource constant but not in the instruction.
|
||||||
|
|
||||||
|
|
||||||
*Operands:* :ref:`v<amdgpu_synid_v>`
|
*Operands:* :ref:`v<amdgpu_synid_v>`
|
||||||
|
@ -21,6 +21,6 @@ Optionally may serve as an output data:
|
|||||||
* :ref:`dmask<amdgpu_synid_dmask>` may specify 1 data element for 32-bit-per-pixel surfaces or 2 data elements for 64-bit-per-pixel surfaces. Each data element occupies 1 dword.
|
* :ref:`dmask<amdgpu_synid_dmask>` may specify 1 data element for 32-bit-per-pixel surfaces or 2 data elements for 64-bit-per-pixel surfaces. Each data element occupies 1 dword.
|
||||||
* :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
|
* :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
|
||||||
|
|
||||||
Note. The surface data format is indicated in the image resource constant but not in the instruction.
|
Note: the surface data format is indicated in the image resource constant but not in the instruction.
|
||||||
|
|
||||||
*Operands:* :ref:`v<amdgpu_synid_v>`
|
*Operands:* :ref:`v<amdgpu_synid_v>`
|
||||||
|
@ -10,5 +10,6 @@
|
|||||||
imm32
|
imm32
|
||||||
===========================
|
===========================
|
||||||
|
|
||||||
An :ref:`integer_number<amdgpu_synid_integer_number>` or a :ref:`floating-point_number<amdgpu_synid_floating-point_number>`. The value is converted to *f32* as described :ref:`here<amdgpu_synid_lit_conv>`.
|
A :ref:`floating-point_number<amdgpu_synid_floating-point_number>`, an :ref:`integer_number<amdgpu_synid_integer_number>`, or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`.
|
||||||
|
The value is converted to *f32* as described :ref:`here<amdgpu_synid_fp_conv>`.
|
||||||
|
|
||||||
|
@ -14,18 +14,21 @@ Bits of a hardware register being accessed.
|
|||||||
|
|
||||||
The bits of this operand have the following meaning:
|
The bits of this operand have the following meaning:
|
||||||
|
|
||||||
============ ===================================
|
======= ===================== ============
|
||||||
Bits Description
|
Bits Description Value Range
|
||||||
============ ===================================
|
======= ===================== ============
|
||||||
5:0 Register *id*.
|
5:0 Register *id*. 0..63
|
||||||
10:6 First bit *offset* (0..31).
|
10:6 First bit *offset*. 0..31
|
||||||
15:11 *Size* in bits (1..32).
|
15:11 *Size* in bits. 1..32
|
||||||
============ ===================================
|
======= ===================== ============
|
||||||
|
|
||||||
This operand may be specified as a positive 16-bit :ref:`integer_number<amdgpu_synid_integer_number>` or using the syntax described below.
|
This operand may be specified as one of the following:
|
||||||
|
|
||||||
|
* An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range 0..0xFFFF.
|
||||||
|
* An *hwreg* value described below.
|
||||||
|
|
||||||
==================================== ============================================================================
|
==================================== ============================================================================
|
||||||
Syntax Description
|
Hwreg Value Syntax Description
|
||||||
==================================== ============================================================================
|
==================================== ============================================================================
|
||||||
hwreg({0..63}) All bits of a register indicated by its *id*.
|
hwreg({0..63}) All bits of a register indicated by its *id*.
|
||||||
hwreg(<*name*>) All bits of a register indicated by its *name*.
|
hwreg(<*name*>) All bits of a register indicated by its *name*.
|
||||||
@ -33,7 +36,8 @@ This operand may be specified as a positive 16-bit :ref:`integer_number<amdgpu_s
|
|||||||
hwreg(<*name*>, {0..31}, {1..32}) Register bits indicated by register *name*, first bit *offset* and *size*.
|
hwreg(<*name*>, {0..31}, {1..32}) Register bits indicated by register *name*, first bit *offset* and *size*.
|
||||||
==================================== ============================================================================
|
==================================== ============================================================================
|
||||||
|
|
||||||
Register *id*, *offset* and *size* must be specified as positive :ref:`integer numbers<amdgpu_synid_integer_number>`.
|
Numeric values may be specified as positive :ref:`integer numbers<amdgpu_synid_integer_number>`
|
||||||
|
or :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
|
||||||
|
|
||||||
Defined register *names* include:
|
Defined register *names* include:
|
||||||
|
|
||||||
@ -53,7 +57,16 @@ Examples:
|
|||||||
|
|
||||||
.. parsed-literal::
|
.. parsed-literal::
|
||||||
|
|
||||||
s_getreg_b32 s2, 0x6
|
reg = 1
|
||||||
|
offset = 2
|
||||||
|
size = 4
|
||||||
|
hwreg_enc = reg | (offset << 6) | ((size - 1) << 11)
|
||||||
|
|
||||||
|
s_getreg_b32 s2, 0x1881
|
||||||
|
s_getreg_b32 s2, hwreg_enc // the same as above
|
||||||
|
s_getreg_b32 s2, hwreg(1, 2, 4) // the same as above
|
||||||
|
s_getreg_b32 s2, hwreg(reg, offset, size) // the same as above
|
||||||
|
|
||||||
s_getreg_b32 s2, hwreg(15)
|
s_getreg_b32 s2, hwreg(15)
|
||||||
s_getreg_b32 s2, hwreg(51, 1, 31)
|
s_getreg_b32 s2, hwreg(51, 1, 31)
|
||||||
s_getreg_b32 s2, hwreg(HW_REG_LDS_ALLOC, 0, 1)
|
s_getreg_b32 s2, hwreg(HW_REG_LDS_ALLOC, 0, 1)
|
||||||
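The `hwreg_enc` expression in the examples above can be checked with a small sketch. The helper name `encode_hwreg` and its defaults (offset 0, size 32 for the one-argument form that selects all bits of a register) are assumptions made for illustration; they are not part of the assembler.

```python
def encode_hwreg(reg_id, offset=0, size=32):
    """Pack a hwreg operand into 16 bits (hypothetical helper, not an LLVM API)."""
    if not 0 <= reg_id <= 63:
        raise ValueError("register id must be in 0..63")
    if not 0 <= offset <= 31:
        raise ValueError("offset must be in 0..31")
    if not 1 <= size <= 32:
        raise ValueError("size must be in 1..32")
    # The size field stores size - 1, matching the hwreg_enc expression above.
    return reg_id | (offset << 6) | ((size - 1) << 11)


if __name__ == "__main__":
    # Same values as the example: reg = 1, offset = 2, size = 4.
    print(hex(encode_hwreg(1, 2, 4)))  # 0x1881
```

Under these assumptions, `hwreg(1, 2, 4)` and the literal `0x1881` describe the same operand, which is what the "the same as above" comments in the examples state.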
|
@ -12,19 +12,26 @@ label
|
|||||||
|
|
||||||
A branch target which is a 16-bit signed integer treated as a PC-relative dword offset.
|
A branch target which is a 16-bit signed integer treated as a PC-relative dword offset.
|
||||||
|
|
||||||
This operand may be specified as:
|
This operand may be specified as one of the following:
|
||||||
|
|
||||||
* An :ref:`integer_number<amdgpu_synid_integer_number>`. The number is truncated to 16 bits.
|
* An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range -32768..65535.
|
||||||
* An :ref:`absolute_expression<amdgpu_synid_absolute_expression>` which must start with an :ref:`integer_number<amdgpu_synid_integer_number>`. The value of the expression is truncated to 16 bits.
|
* A :ref:`symbol<amdgpu_synid_symbol>` (for example, a label) representing a relocatable address in the same compilation unit where it is referred from. The value is handled as a 16-bit PC-relative dword offset to be resolved by a linker.
|
||||||
* A :ref:`symbol<amdgpu_synid_symbol>` (for example, a label). The value is handled as a 16-bit PC-relative dword offset to be resolved by a linker.
|
|
||||||
|
|
||||||
Examples:
|
Examples:
|
||||||
|
|
||||||
.. parsed-literal::
|
.. parsed-literal::
|
||||||
|
|
||||||
offset = 30
|
offset = 30
|
||||||
s_branch loop_end
|
label_1:
|
||||||
s_branch 2 + offset
|
label_2 = . + 4
|
||||||
s_branch 32
|
|
||||||
loop_end:
|
s_branch 32
|
||||||
|
s_branch offset + 2
|
||||||
|
s_branch label_1
|
||||||
|
s_branch label_2
|
||||||
|
s_branch label_3
|
||||||
|
s_branch label_4
|
||||||
|
|
||||||
|
label_3 = label_2 + 4
|
||||||
|
label_4:
|
||||||
|
|
||||||
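One way to read the accepted range -32768..65535 is that a value may be written either as a signed or as an unsigned 16-bit quantity. The sketch below illustrates that reading only; the helper names are hypothetical, and keeping the low 16 bits is an assumption about the encoding rather than something stated in the text above.

```python
def encode_imm16(value):
    """Check the documented range and keep the low 16 bits (assumed encoding)."""
    if not -32768 <= value <= 65535:
        raise ValueError("value must be in the range -32768..65535")
    return value & 0xFFFF


def as_signed16(bits):
    """Interpret a 16-bit pattern as a signed integer."""
    return bits - 0x10000 if bits & 0x8000 else bits


if __name__ == "__main__":
    print(encode_imm16(32))                    # 32
    print(hex(encode_imm16(-1)))               # 0xffff
    print(as_signed16(encode_imm16(65535)))    # -1
```

Under this reading, `-1` and `65535` produce the same bit pattern, so a branch offset written as `65535` would be treated as a backward offset of one dword.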
|
@ -12,24 +12,29 @@ msg
|
|||||||
|
|
||||||
A 16-bit message code. The bits of this operand have the following meaning:
|
A 16-bit message code. The bits of this operand have the following meaning:
|
||||||
|
|
||||||
============ ======================================================
|
============ =============================== ===============
|
||||||
Bits Description
|
Bits Description Value Range
|
||||||
============ ======================================================
|
============ =============================== ===============
|
||||||
3:0 Message *type*.
|
3:0 Message *type*. 0..15
|
||||||
6:4 Optional *operation*.
|
6:4 Optional *operation*. 0..7
|
||||||
9:7 Optional *parameters*.
|
7:7 Unused. \-
|
||||||
15:10 Unused.
|
9:8 Optional *stream*. 0..3
|
||||||
============ ======================================================
|
15:10 Unused. \-
|
||||||
|
============ =============================== ===============
|
||||||
|
|
||||||
This operand may be specified as a positive 16-bit :ref:`integer_number<amdgpu_synid_integer_number>` or using the syntax described below:
|
This operand may be specified as one of the following:
|
||||||
|
|
||||||
======================================== ========================================================================
|
* An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range 0..0xFFFF.
|
||||||
Syntax Description
|
* A *sendmsg* value described below.
|
||||||
======================================== ========================================================================
|
|
||||||
sendmsg(<*type*>) A message identified by its *type*.
|
==================================== ====================================================
|
||||||
sendmsg(<*type*>, <*op*>) A message identified by its *type* and *operation*.
|
Sendmsg Value Syntax Description
|
||||||
sendmsg(<*type*>, <*op*>, <*stream*>) A message identified by its *type* and *operation* with a stream *id*.
|
==================================== ====================================================
|
||||||
======================================== ========================================================================
|
sendmsg(<*type*>) A message identified by its *type*.
|
||||||
|
sendmsg(<*type*>,<*op*>) A message identified by its *type* and *operation*.
|
||||||
|
sendmsg(<*type*>,<*op*>,<*stream*>) A message identified by its *type* and *operation*
|
||||||
|
with a stream *id*.
|
||||||
|
==================================== ====================================================
|
||||||
|
|
||||||
*Type* may be specified using message *name* or message *id*.
|
*Type* may be specified using message *name* or message *id*.
|
||||||
|
|
||||||
@ -37,7 +42,8 @@ This operand may be specified as a positive 16-bit :ref:`integer_number<amdgpu_s
|
|||||||
|
|
||||||
Stream *id* is an integer in the range 0..3.
|
Stream *id* is an integer in the range 0..3.
|
||||||
|
|
||||||
Message *id*, operation *id* and stream *id* must be specified as positive :ref:`integer numbers<amdgpu_synid_integer_number>`.
|
Numeric values may be specified as positive :ref:`integer numbers<amdgpu_synid_integer_number>`
|
||||||
|
or :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
|
||||||
|
|
||||||
Each message type supports specific operations:
|
Each message type supports specific operations:
|
||||||
|
|
||||||
@ -58,15 +64,31 @@ Each message type supports specific operations:
|
|||||||
\ SYSMSG_OP_TTRACE_PC 4 \-
|
\ SYSMSG_OP_TTRACE_PC 4 \-
|
||||||
================= ========== ============================== ============ ==========
|
================= ========== ============================== ============ ==========
|
||||||
|
|
||||||
|
*Sendmsg* arguments are validated depending on how the *type* value is specified:
|
||||||
|
|
||||||
|
* If message *type* is specified by name, argument values must satisfy the limitations detailed in the table above.
|
||||||
|
* If message *type* is specified as a number, each argument must not exceed the corresponding value range (see the first table).
|
||||||
|
|
||||||
Examples:
|
Examples:
|
||||||
|
|
||||||
.. parsed-literal::
|
.. parsed-literal::
|
||||||
|
|
||||||
|
// numeric message code
|
||||||
|
msg = 0x10
|
||||||
s_sendmsg 0x12
|
s_sendmsg 0x12
|
||||||
|
s_sendmsg msg + 2
|
||||||
|
|
||||||
|
// sendmsg with strict arguments validation
|
||||||
s_sendmsg sendmsg(MSG_INTERRUPT)
|
s_sendmsg sendmsg(MSG_INTERRUPT)
|
||||||
s_sendmsg sendmsg(2, GS_OP_CUT)
|
|
||||||
s_sendmsg sendmsg(MSG_GS, GS_OP_EMIT)
|
s_sendmsg sendmsg(MSG_GS, GS_OP_EMIT)
|
||||||
s_sendmsg sendmsg(MSG_GS, 2)
|
s_sendmsg sendmsg(MSG_GS, 2)
|
||||||
s_sendmsg sendmsg(MSG_GS_DONE, GS_OP_EMIT_CUT, 1)
|
s_sendmsg sendmsg(MSG_GS_DONE, GS_OP_EMIT_CUT, 1)
|
||||||
s_sendmsg sendmsg(MSG_SYSMSG, SYSMSG_OP_TTRACE_PC)
|
s_sendmsg sendmsg(MSG_SYSMSG, SYSMSG_OP_TTRACE_PC)
|
||||||
|
|
||||||
|
// sendmsg with validation of value range only
|
||||||
|
msg = 2
|
||||||
|
op = 3
|
||||||
|
stream = 1
|
||||||
|
s_sendmsg sendmsg(msg, op, stream)
|
||||||
|
s_sendmsg sendmsg(2, GS_OP_CUT)
|
||||||
|
|
||||||
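The bit layout in the first table (type in bits 3:0, operation in bits 6:4, stream in bits 9:8) can be sketched as follows. The helper `encode_sendmsg` is hypothetical and performs only the value-range validation described for numeric message types; it does not model the per-message limitations that apply when a type is given by name.

```python
def encode_sendmsg(msg_type, op=0, stream=0):
    """Pack sendmsg fields into a 16-bit message code (hypothetical helper)."""
    if not 0 <= msg_type <= 15:
        raise ValueError("message type must be in 0..15")
    if not 0 <= op <= 7:
        raise ValueError("operation must be in 0..7")
    if not 0 <= stream <= 3:
        raise ValueError("stream must be in 0..3")
    # Bit 7 and bits 15:10 are unused and stay zero.
    return msg_type | (op << 4) | (stream << 8)


if __name__ == "__main__":
    # msg = 2, op = 3, stream = 1, as in the range-only validation example.
    print(hex(encode_sendmsg(2, 3, 1)))  # 0x132
    print(hex(encode_sendmsg(2, 1)))     # 0x12
```

With this layout, the numeric code `0x12` from the examples corresponds to message type 2 with operation 1 and no stream.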
|
@ -10,5 +10,5 @@
|
|||||||
imm16
|
imm16
|
||||||
===========================
|
===========================
|
||||||
|
|
||||||
An :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 16 bits and then sign-extended to 32 bits.
|
An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range -32768..65535.
|
||||||
|
|
||||||
|
@ -10,5 +10,5 @@
|
|||||||
imm16
|
imm16
|
||||||
===========================
|
===========================
|
||||||
|
|
||||||
An :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 16 bits and then zero-extended to 32 bits.
|
An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range 0..65535.
|
||||||
|
|
||||||
|
@ -14,29 +14,31 @@ Counts of outstanding instructions to wait for.
|
|||||||
|
|
||||||
The bits of this operand have the following meaning:
|
The bits of this operand have the following meaning:
|
||||||
|
|
||||||
============ ======================================================
|
===== ================================================ ============
|
||||||
Bits Description
|
Bits Description Value Range
|
||||||
============ ======================================================
|
===== ================================================ ============
|
||||||
3:0 VM_CNT: vector memory operations count.
|
3:0 VM_CNT: vector memory operations count. 0..15
|
||||||
6:4 EXP_CNT: export count.
|
6:4 EXP_CNT: export count. 0..7
|
||||||
12:8 LGKM_CNT: LDS, GDS, Constant and Message count.
|
12:8 LGKM_CNT: LDS, GDS, Constant and Message count. 0..31
|
||||||
============ ======================================================
|
===== ================================================ ============
|
||||||
|
|
||||||
This operand may be specified as a positive 16-bit :ref:`integer_number<amdgpu_synid_integer_number>`
|
This operand may be specified as one of the following:
|
||||||
or as a combination of the following symbolic helpers:
|
|
||||||
|
* An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range 0..0xFFFF.
|
||||||
|
* A combination of *vmcnt*, *expcnt*, *lgkmcnt* and other values described below.
|
||||||
|
|
||||||
====================== ======================================================================
|
====================== ======================================================================
|
||||||
Syntax Description
|
Syntax Description
|
||||||
====================== ======================================================================
|
====================== ======================================================================
|
||||||
vmcnt(<*N*>) VM_CNT value. *N* must not exceed the largest VM_CNT value.
|
vmcnt(<*N*>) A VM_CNT value. *N* must not exceed the largest VM_CNT value.
|
||||||
expcnt(<*N*>) EXP_CNT value. *N* must not exceed the largest EXP_CNT value.
|
expcnt(<*N*>) An EXP_CNT value. *N* must not exceed the largest EXP_CNT value.
|
||||||
lgkmcnt(<*N*>) LGKM_CNT value. *N* must not exceed the largest LGKM_CNT value.
|
lgkmcnt(<*N*>) An LGKM_CNT value. *N* must not exceed the largest LGKM_CNT value.
|
||||||
vmcnt_sat(<*N*>) VM_CNT value computed as min(*N*, the largest VM_CNT value).
|
vmcnt_sat(<*N*>) A VM_CNT value computed as min(*N*, the largest VM_CNT value).
|
||||||
expcnt_sat(<*N*>) EXP_CNT value computed as min(*N*, the largest EXP_CNT value).
|
expcnt_sat(<*N*>) An EXP_CNT value computed as min(*N*, the largest EXP_CNT value).
|
||||||
lgkmcnt_sat(<*N*>) LGKM_CNT value computed as min(*N*, the largest LGKM_CNT value).
|
lgkmcnt_sat(<*N*>) An LGKM_CNT value computed as min(*N*, the largest LGKM_CNT value).
|
||||||
====================== ======================================================================
|
====================== ======================================================================
|
||||||
|
|
||||||
These helpers may be specified in any order. Ampersands and commas may be used as optional separators.
|
These values may be specified in any order. Spaces, ampersands and commas may be used as optional separators.
|
||||||
|
|
||||||
*N* is either an
|
*N* is either an
|
||||||
:ref:`integer number<amdgpu_synid_integer_number>` or an
|
:ref:`integer number<amdgpu_synid_integer_number>` or an
|
||||||
@ -46,10 +48,18 @@ Examples:
|
|||||||
|
|
||||||
.. parsed-literal::
|
.. parsed-literal::
|
||||||
|
|
||||||
s_waitcnt 0
|
vm_cnt = 1
|
||||||
|
exp_cnt = 2
|
||||||
|
lgkm_cnt = 3
|
||||||
|
cnt = vm_cnt | (exp_cnt << 4) | (lgkm_cnt << 8)
|
||||||
|
|
||||||
|
s_waitcnt cnt
|
||||||
|
s_waitcnt 1 | (2 << 4) | (3 << 8) // the same as above
|
||||||
|
s_waitcnt vmcnt(1) expcnt(2) lgkmcnt(3) // the same as above
|
||||||
|
s_waitcnt vmcnt(vm_cnt) expcnt(exp_cnt) lgkmcnt(lgkm_cnt) // the same as above
|
||||||
|
|
||||||
s_waitcnt vmcnt(1)
|
s_waitcnt vmcnt(1)
|
||||||
s_waitcnt expcnt(2) lgkmcnt(3)
|
s_waitcnt expcnt(2) lgkmcnt(3)
|
||||||
s_waitcnt vmcnt(1) expcnt(2) lgkmcnt(3)
|
|
||||||
s_waitcnt vmcnt(1), expcnt(2), lgkmcnt(3)
|
s_waitcnt vmcnt(1), expcnt(2), lgkmcnt(3)
|
||||||
s_waitcnt vmcnt(1) & lgkmcnt_sat(100) & expcnt(2)
|
s_waitcnt vmcnt(1) & lgkmcnt_sat(100) & expcnt(2)
|
||||||
|
|
||||||
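The relation between the numeric form and the symbolic form can be sketched as below, using the field positions and value ranges from the table in this file (LGKM_CNT in bits 12:8, so its largest value is 31). The names `saturate` and `encode_waitcnt` are hypothetical, and the sketch requires all three counters explicitly because the value used for an omitted counter is not stated here.

```python
VM_CNT_MAX = 15    # bits 3:0
EXP_CNT_MAX = 7    # bits 6:4
LGKM_CNT_MAX = 31  # bits 12:8 in this file's table


def saturate(n, largest):
    """Model the *_sat helpers: min(N, the largest counter value)."""
    return min(n, largest)


def encode_waitcnt(vmcnt, expcnt, lgkmcnt):
    """Pack the three counters into one operand value (hypothetical helper)."""
    if not 0 <= vmcnt <= VM_CNT_MAX:
        raise ValueError("vmcnt out of range")
    if not 0 <= expcnt <= EXP_CNT_MAX:
        raise ValueError("expcnt out of range")
    if not 0 <= lgkmcnt <= LGKM_CNT_MAX:
        raise ValueError("lgkmcnt out of range")
    return vmcnt | (expcnt << 4) | (lgkmcnt << 8)


if __name__ == "__main__":
    # Same values as the example: vm_cnt = 1, exp_cnt = 2, lgkm_cnt = 3.
    print(hex(encode_waitcnt(1, 2, 3)))  # 0x321
    # lgkmcnt_sat(100) clamps to the largest LGKM_CNT value.
    print(hex(encode_waitcnt(1, 2, saturate(100, LGKM_CNT_MAX))))  # 0x1f21
```

A target whose LGKM_CNT field is narrower would need a smaller `LGKM_CNT_MAX`; the constant is kept separate for that reason.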
|
@ -10,5 +10,5 @@
|
|||||||
imm16
|
imm16
|
||||||
===========================
|
===========================
|
||||||
|
|
||||||
An :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 16 bits.
|
A 16-bit :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range -32768..65535.
|
||||||
|
|
||||||
|
@ -10,5 +10,5 @@
|
|||||||
imm32
|
imm32
|
||||||
===========================
|
===========================
|
||||||
|
|
||||||
An :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 32 bits.
|
An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value is truncated to 32 bits.
|
||||||
|
|
||||||
|
@ -21,7 +21,7 @@ Optionally may serve as an output data:
|
|||||||
* :ref:`dmask<amdgpu_synid_dmask>` may specify 2 data elements for 32-bit-per-pixel surfaces or 4 data elements for 64-bit-per-pixel surfaces. Each data element occupies 1 dword.
|
* :ref:`dmask<amdgpu_synid_dmask>` may specify 2 data elements for 32-bit-per-pixel surfaces or 4 data elements for 64-bit-per-pixel surfaces. Each data element occupies 1 dword.
|
||||||
* :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
|
* :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
|
||||||
|
|
||||||
Note. The surface data format is indicated in the image resource constant but not in the instruction.
|
Note: the surface data format is indicated in the image resource constant but not in the instruction.
|
||||||
|
|
||||||
|
|
||||||
*Operands:* :ref:`v<amdgpu_synid_v>`
|
*Operands:* :ref:`v<amdgpu_synid_v>`
|
||||||
|
@ -21,6 +21,6 @@ Optionally may serve as an output data:
|
|||||||
* :ref:`dmask<amdgpu_synid_dmask>` may specify 1 data element for 32-bit-per-pixel surfaces or 2 data elements for 64-bit-per-pixel surfaces. Each data element occupies 1 dword.
|
* :ref:`dmask<amdgpu_synid_dmask>` may specify 1 data element for 32-bit-per-pixel surfaces or 2 data elements for 64-bit-per-pixel surfaces. Each data element occupies 1 dword.
|
||||||
* :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
|
* :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
|
||||||
|
|
||||||
Note. The surface data format is indicated in the image resource constant but not in the instruction.
|
Note: the surface data format is indicated in the image resource constant but not in the instruction.
|
||||||
|
|
||||||
*Operands:* :ref:`v<amdgpu_synid_v>`
|
*Operands:* :ref:`v<amdgpu_synid_v>`
|
||||||
|
@ -10,5 +10,6 @@
|
|||||||
imm32
|
imm32
|
||||||
===========================
|
===========================
|
||||||
|
|
||||||
An :ref:`integer_number<amdgpu_synid_integer_number>` or a :ref:`floating-point_number<amdgpu_synid_floating-point_number>`. The number is converted to *f16* as described :ref:`here<amdgpu_synid_lit_conv>`.
|
A :ref:`floating-point_number<amdgpu_synid_floating-point_number>`, an :ref:`integer_number<amdgpu_synid_integer_number>`, or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`.
|
||||||
|
The value is converted to *f16* as described :ref:`here<amdgpu_synid_fp_conv>`.
|
||||||
|
|
||||||
|
@ -10,5 +10,6 @@
|
|||||||
imm32
|
imm32
|
||||||
===========================
|
===========================
|
||||||
|
|
||||||
An :ref:`integer_number<amdgpu_synid_integer_number>` or a :ref:`floating-point_number<amdgpu_synid_floating-point_number>`. The value is converted to *f32* as described :ref:`here<amdgpu_synid_lit_conv>`.
|
A :ref:`floating-point_number<amdgpu_synid_floating-point_number>`, an :ref:`integer_number<amdgpu_synid_integer_number>`, or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`.
|
||||||
|
The value is converted to *f32* as described :ref:`here<amdgpu_synid_fp_conv>`.
|
||||||
|
|
||||||
|
@ -14,18 +14,21 @@ Bits of a hardware register being accessed.
|
|||||||
|
|
||||||
The bits of this operand have the following meaning:
|
The bits of this operand have the following meaning:
|
||||||
|
|
||||||
============ ===================================
|
======= ===================== ============
|
||||||
Bits Description
|
Bits Description Value Range
|
||||||
============ ===================================
|
======= ===================== ============
|
||||||
5:0 Register *id*.
|
5:0 Register *id*. 0..63
|
||||||
10:6 First bit *offset* (0..31).
|
10:6 First bit *offset*. 0..31
|
||||||
15:11 *Size* in bits (1..32).
|
15:11 *Size* in bits. 1..32
|
||||||
============ ===================================
|
======= ===================== ============
|
||||||
|
|
||||||
This operand may be specified as a positive 16-bit :ref:`integer_number<amdgpu_synid_integer_number>` or using the syntax described below.
|
This operand may be specified as one of the following:
|
||||||
|
|
||||||
|
* An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range 0..0xFFFF.
|
||||||
|
* An *hwreg* value described below.
|
||||||
|
|
||||||
==================================== ============================================================================
|
==================================== ============================================================================
|
||||||
Syntax Description
|
Hwreg Value Syntax Description
|
||||||
==================================== ============================================================================
|
==================================== ============================================================================
|
||||||
hwreg({0..63}) All bits of a register indicated by its *id*.
|
hwreg({0..63}) All bits of a register indicated by its *id*.
|
||||||
hwreg(<*name*>) All bits of a register indicated by its *name*.
|
hwreg(<*name*>) All bits of a register indicated by its *name*.
|
||||||
@ -33,7 +36,8 @@ This operand may be specified as a positive 16-bit :ref:`integer_number<amdgpu_s
|
|||||||
hwreg(<*name*>, {0..31}, {1..32}) Register bits indicated by register *name*, first bit *offset* and *size*.
|
hwreg(<*name*>, {0..31}, {1..32}) Register bits indicated by register *name*, first bit *offset* and *size*.
|
||||||
==================================== ============================================================================
|
==================================== ============================================================================
|
||||||
|
|
||||||
Register *id*, *offset* and *size* must be specified as positive :ref:`integer numbers<amdgpu_synid_integer_number>`.
|
Numeric values may be specified as positive :ref:`integer numbers<amdgpu_synid_integer_number>`
|
||||||
|
or :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
|
||||||
|
|
||||||
Defined register *names* include:
|
Defined register *names* include:
|
||||||
|
|
||||||
@ -53,7 +57,16 @@ Examples:
|
|||||||
|
|
||||||
.. parsed-literal::
|
.. parsed-literal::
|
||||||
|
|
||||||
s_getreg_b32 s2, 0x6
|
reg = 1
|
||||||
|
offset = 2
|
||||||
|
size = 4
|
||||||
|
hwreg_enc = reg | (offset << 6) | ((size - 1) << 11)
|
||||||
|
|
||||||
|
s_getreg_b32 s2, 0x1881
|
||||||
|
s_getreg_b32 s2, hwreg_enc // the same as above
|
||||||
|
s_getreg_b32 s2, hwreg(1, 2, 4) // the same as above
|
||||||
|
s_getreg_b32 s2, hwreg(reg, offset, size) // the same as above
|
||||||
|
|
||||||
s_getreg_b32 s2, hwreg(15)
|
s_getreg_b32 s2, hwreg(15)
|
||||||
s_getreg_b32 s2, hwreg(51, 1, 31)
|
s_getreg_b32 s2, hwreg(51, 1, 31)
|
||||||
s_getreg_b32 s2, hwreg(HW_REG_LDS_ALLOC, 0, 1)
|
s_getreg_b32 s2, hwreg(HW_REG_LDS_ALLOC, 0, 1)
|
||||||
|
66
docs/AMDGPU/gfx8_imask.rst
Normal file
@ -0,0 +1,66 @@
|
|||||||
|
..
|
||||||
|
**************************************************
|
||||||
|
* *
|
||||||
|
* Automatically generated file, do not edit! *
|
||||||
|
* *
|
||||||
|
**************************************************
|
||||||
|
|
||||||
|
.. _amdgpu_synid8_imask:
|
||||||
|
|
||||||
|
imask
|
||||||
|
===========================
|
||||||
|
|
||||||
|
This operand is a mask which controls indexing mode for operands of subsequent instructions.
|
||||||
|
Bits 0, 1 and 2 control indexing of *src0*, *src1* and *src2*, while bit 3 controls indexing of *dst*.
|
||||||
|
Value 1 enables indexing and value 0 disables it.
|
||||||
|
|
||||||
|
===== ========================================
|
||||||
|
Bit Meaning
|
||||||
|
===== ========================================
|
||||||
|
0 Enables or disables *src0* indexing.
|
||||||
|
1 Enables or disables *src1* indexing.
|
||||||
|
2 Enables or disables *src2* indexing.
|
||||||
|
3 Enables or disables *dst* indexing.
|
||||||
|
===== ========================================
|
||||||
|
|
||||||
|
This operand may be specified as one of the following:
|
||||||
|
|
||||||
|
* An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range 0..15.
|
||||||
|
* A *gpr_idx* value described below.
|
||||||
|
|
||||||
|
==================================== ===========================================
|
||||||
|
Gpr_idx Value Syntax Description
|
||||||
|
==================================== ===========================================
|
||||||
|
gpr_idx(*<operands>*) Enable indexing for specified *operands*
|
||||||
|
and disable it for the rest.
|
||||||
|
*Operands* is a comma-separated list of
|
||||||
|
values which may include:
|
||||||
|
|
||||||
|
* "SRC0" - enable *src0* indexing.
|
||||||
|
|
||||||
|
* "SRC1" - enable *src1* indexing.
|
||||||
|
|
||||||
|
* "SRC2" - enable *src2* indexing.
|
||||||
|
|
||||||
|
* "DST" - enable *dst* indexing.
|
||||||
|
|
||||||
|
Each of these values may be specified only
|
||||||
|
once.
|
||||||
|
|
||||||
|
*Operands* list may be empty; this syntax
|
||||||
|
disables indexing for all operands.
|
||||||
|
==================================== ===========================================
|
||||||
|
|
||||||
|
Examples:
|
||||||
|
|
||||||
|
.. parsed-literal::
|
||||||
|
|
||||||
|
s_set_gpr_idx_mode 0
|
||||||
|
s_set_gpr_idx_mode gpr_idx() // the same as above
|
||||||
|
|
||||||
|
s_set_gpr_idx_mode 15
|
||||||
|
s_set_gpr_idx_mode gpr_idx(DST,SRC0,SRC1,SRC2) // the same as above
|
||||||
|
s_set_gpr_idx_mode gpr_idx(SRC0,SRC1,SRC2,DST) // the same as above
|
||||||
|
|
||||||
|
s_set_gpr_idx_mode gpr_idx(DST,SRC1)
|
||||||
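The mapping from a *gpr_idx* operand list to the 4-bit mask can be sketched as follows, using the bit assignments from the table above. The helper `encode_gpr_idx` and the `GPR_IDX_BITS` dictionary are hypothetical names used only for illustration.

```python
# Bit positions taken from the imask table: src0, src1, src2, dst.
GPR_IDX_BITS = {"SRC0": 0, "SRC1": 1, "SRC2": 2, "DST": 3}


def encode_gpr_idx(*operands):
    """Build the indexing mask from operand names (hypothetical helper)."""
    mask = 0
    seen = set()
    for name in operands:
        if name not in GPR_IDX_BITS:
            raise ValueError("unknown operand: " + name)
        if name in seen:
            # Each value may be specified only once.
            raise ValueError("operand specified twice: " + name)
        seen.add(name)
        mask |= 1 << GPR_IDX_BITS[name]
    return mask


if __name__ == "__main__":
    print(encode_gpr_idx())                               # 0
    print(encode_gpr_idx("DST", "SRC0", "SRC1", "SRC2"))  # 15
    print(encode_gpr_idx("DST", "SRC1"))                  # 10
```

The order of the names does not affect the result, which matches the examples where `gpr_idx(DST,SRC0,SRC1,SRC2)` and `gpr_idx(SRC0,SRC1,SRC2,DST)` are both equivalent to 15.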
|
|
@ -1,25 +0,0 @@
|
|||||||
..
|
|
||||||
**************************************************
|
|
||||||
* *
|
|
||||||
* Automatically generated file, do not edit! *
|
|
||||||
* *
|
|
||||||
**************************************************
|
|
||||||
|
|
||||||
.. _amdgpu_synid8_imm4:
|
|
||||||
|
|
||||||
imm4
|
|
||||||
===========================
|
|
||||||
|
|
||||||
A positive :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 4 bits.
|
|
||||||
|
|
||||||
This operand is a mask which controls indexing mode for operands of subsequent instructions. Value 1 enables indexing and value 0 disables it.
|
|
||||||
|
|
||||||
============ ========================================
|
|
||||||
Bit Meaning
|
|
||||||
============ ========================================
|
|
||||||
0 Enables or disables *src0* indexing.
|
|
||||||
1 Enables or disables *src1* indexing.
|
|
||||||
2 Enables or disables *src2* indexing.
|
|
||||||
3 Enables or disables *dst* indexing.
|
|
||||||
============ ========================================
|
|
||||||
|
|
@ -12,19 +12,26 @@ label
|
|||||||
|
|
||||||
A branch target which is a 16-bit signed integer treated as a PC-relative dword offset.
|
A branch target which is a 16-bit signed integer treated as a PC-relative dword offset.
|
||||||
|
|
||||||
This operand may be specified as:
|
This operand may be specified as one of the following:
|
||||||
|
|
||||||
* An :ref:`integer_number<amdgpu_synid_integer_number>`. The number is truncated to 16 bits.
|
* An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range -32768..65535.
|
||||||
* An :ref:`absolute_expression<amdgpu_synid_absolute_expression>` which must start with an :ref:`integer_number<amdgpu_synid_integer_number>`. The value of the expression is truncated to 16 bits.
|
* A :ref:`symbol<amdgpu_synid_symbol>` (for example, a label) representing a relocatable address in the same compilation unit where it is referred from. The value is handled as a 16-bit PC-relative dword offset to be resolved by a linker.
|
||||||
* A :ref:`symbol<amdgpu_synid_symbol>` (for example, a label). The value is handled as a 16-bit PC-relative dword offset to be resolved by a linker.
|
|
||||||
|
|
||||||
Examples:
|
Examples:
|
||||||
|
|
||||||
.. parsed-literal::
|
.. parsed-literal::
|
||||||
|
|
||||||
offset = 30
|
offset = 30
|
||||||
s_branch loop_end
|
label_1:
|
||||||
s_branch 2 + offset
|
label_2 = . + 4
|
||||||
s_branch 32
|
|
||||||
loop_end:
|
s_branch 32
|
||||||
|
s_branch offset + 2
|
||||||
|
s_branch label_1
|
||||||
|
s_branch label_2
|
||||||
|
s_branch label_3
|
||||||
|
s_branch label_4
|
||||||
|
|
||||||
|
label_3 = label_2 + 4
|
||||||
|
label_4:
|
||||||
|
|
||||||
|
@ -12,24 +12,29 @@ msg
|
|||||||
|
|
||||||
A 16-bit message code. The bits of this operand have the following meaning:
|
A 16-bit message code. The bits of this operand have the following meaning:
|
||||||
|
|
||||||
============ ======================================================
|
============ =============================== ===============
|
||||||
Bits Description
|
Bits Description Value Range
|
||||||
============ ======================================================
|
============ =============================== ===============
|
||||||
3:0 Message *type*.
|
3:0 Message *type*. 0..15
|
||||||
6:4 Optional *operation*.
|
6:4 Optional *operation*. 0..7
|
||||||
9:7 Optional *parameters*.
|
7:7 Unused. \-
|
||||||
15:10 Unused.
|
9:8 Optional *stream*. 0..3
|
||||||
============ ======================================================
|
15:10 Unused. \-
|
||||||
|
============ =============================== ===============
|
||||||
|
|
||||||
This operand may be specified as a positive 16-bit :ref:`integer_number<amdgpu_synid_integer_number>` or using the syntax described below:
|
This operand may be specified as one of the following:
|
||||||
|
|
||||||
======================================== ========================================================================
|
* An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range 0..0xFFFF.
|
||||||
Syntax Description
|
* A *sendmsg* value described below.
|
||||||
======================================== ========================================================================
|
|
||||||
sendmsg(<*type*>) A message identified by its *type*.
|
==================================== ====================================================
|
||||||
sendmsg(<*type*>, <*op*>) A message identified by its *type* and *operation*.
|
Sendmsg Value Syntax Description
|
||||||
sendmsg(<*type*>, <*op*>, <*stream*>) A message identified by its *type* and *operation* with a stream *id*.
|
==================================== ====================================================
|
||||||
======================================== ========================================================================
|
sendmsg(<*type*>) A message identified by its *type*.
|
||||||
|
sendmsg(<*type*>,<*op*>) A message identified by its *type* and *operation*.
|
||||||
|
sendmsg(<*type*>,<*op*>,<*stream*>) A message identified by its *type* and *operation*
|
||||||
|
with a stream *id*.
|
||||||
|
==================================== ====================================================
|
||||||
|
|
||||||
*Type* may be specified using message *name* or message *id*.
|
*Type* may be specified using message *name* or message *id*.
|
||||||
|
|
||||||
@ -37,7 +42,8 @@ This operand may be specified as a positive 16-bit :ref:`integer_number<amdgpu_s
|
|||||||
|
|
||||||
Stream *id* is an integer in the range 0..3.
|
Stream *id* is an integer in the range 0..3.
|
||||||
|
|
||||||
Message *id*, operation *id* and stream *id* must be specified as positive :ref:`integer numbers<amdgpu_synid_integer_number>`.
|
Numeric values may be specified as positive :ref:`integer numbers<amdgpu_synid_integer_number>`
|
||||||
|
or :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
|
||||||
|
|
||||||
Each message type supports specific operations:
|
Each message type supports specific operations:
|
||||||
|
|
||||||
@ -58,15 +64,31 @@ Each message type supports specific operations:
|
|||||||
\ SYSMSG_OP_TTRACE_PC 4 \-
|
\ SYSMSG_OP_TTRACE_PC 4 \-
|
||||||
================= ========== ============================== ============ ==========
|
================= ========== ============================== ============ ==========
|
||||||
|
|
||||||
|
*Sendmsg* arguments are validated depending on how the *type* value is specified:
|
||||||
|
|
||||||
|
* If message *type* is specified by name, argument values must satisfy the limitations detailed in the table above.
|
||||||
|
* If message *type* is specified as a number, each argument must not exceed the corresponding value range (see the first table).
|
||||||
|
|
||||||
Examples:
|
Examples:
|
||||||
|
|
||||||
.. parsed-literal::
|
.. parsed-literal::
|
||||||
|
|
||||||
|
// numeric message code
|
||||||
|
msg = 0x10
|
||||||
s_sendmsg 0x12
|
s_sendmsg 0x12
|
||||||
|
s_sendmsg msg + 2
|
||||||
|
|
||||||
|
// sendmsg with strict arguments validation
|
||||||
s_sendmsg sendmsg(MSG_INTERRUPT)
|
s_sendmsg sendmsg(MSG_INTERRUPT)
|
||||||
s_sendmsg sendmsg(2, GS_OP_CUT)
|
|
||||||
s_sendmsg sendmsg(MSG_GS, GS_OP_EMIT)
|
s_sendmsg sendmsg(MSG_GS, GS_OP_EMIT)
|
||||||
s_sendmsg sendmsg(MSG_GS, 2)
|
s_sendmsg sendmsg(MSG_GS, 2)
|
||||||
s_sendmsg sendmsg(MSG_GS_DONE, GS_OP_EMIT_CUT, 1)
|
s_sendmsg sendmsg(MSG_GS_DONE, GS_OP_EMIT_CUT, 1)
|
||||||
s_sendmsg sendmsg(MSG_SYSMSG, SYSMSG_OP_TTRACE_PC)
|
s_sendmsg sendmsg(MSG_SYSMSG, SYSMSG_OP_TTRACE_PC)
|
||||||
|
|
||||||
|
// sendmsg with validation of value range only
|
||||||
|
msg = 2
|
||||||
|
op = 3
|
||||||
|
stream = 1
|
||||||
|
s_sendmsg sendmsg(msg, op, stream)
|
||||||
|
s_sendmsg sendmsg(2, GS_OP_CUT)
|
||||||
|
|
||||||
|
@ -12,7 +12,8 @@ imm3
|
|||||||
|
|
||||||
A bit mask which indicates request permissions.
|
A bit mask which indicates request permissions.
|
||||||
|
|
||||||
This operand must be specified as an :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 7 bits, but only 3 low bits are significant.
|
This operand must be specified as an :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`.
|
||||||
|
The value is truncated to 7 bits, but only 3 low bits are significant.
|
||||||
|
|
||||||
============ ==============================
|
============ ==============================
|
||||||
Bit Number Description
|
Bit Number Description
|
||||||
|
@ -10,5 +10,5 @@
|
|||||||
imm16
|
imm16
|
||||||
===========================
|
===========================
|
||||||
|
|
||||||
An :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 16 bits and then sign-extended to 32 bits.
|
An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range -32768..65535.
|
||||||
|
|
||||||
|
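The range -32768..65535 is the union of the signed and unsigned 16-bit ranges, so a negative value and its unsigned equivalent end up as the same 16 bits. A small sketch of that check follows; the helper name `encode_imm16` is hypothetical, and masking with 0xFFFF is an assumption about how an in-range value is encoded, not a description of the assembler's implementation.

```python
def encode_imm16(value):
    """Range-check an imm16 operand and return its 16-bit pattern."""
    if not -32768 <= value <= 65535:
        raise ValueError("imm16 must be in the range -32768..65535")
    # Assumption: the encoded field is the low 16 bits of the value.
    return value & 0xFFFF


print(hex(encode_imm16(-1)))      # same bits as 65535
print(hex(encode_imm16(65535)))
print(hex(encode_imm16(-32768)))
```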
@ -10,5 +10,5 @@
|
|||||||
imm16
|
imm16
|
||||||
===========================
|
===========================
|
||||||
|
|
||||||
An :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 16 bits and then zero-extended to 32 bits.
|
An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range 0..65535.
|
||||||
|
|
||||||
|
@ -14,29 +14,31 @@ Counts of outstanding instructions to wait for.
|
|||||||
|
|
||||||
The bits of this operand have the following meaning:
|
The bits of this operand have the following meaning:
|
||||||
|
|
||||||
============ ======================================================
|
===== ================================================ ============
|
||||||
Bits Description
|
Bits Description Value Range
|
||||||
============ ======================================================
|
===== ================================================ ============
|
||||||
3:0 VM_CNT: vector memory operations count.
|
3:0 VM_CNT: vector memory operations count. 0..15
|
||||||
6:4 EXP_CNT: export count.
|
6:4 EXP_CNT: export count. 0..7
|
||||||
11:8 LGKM_CNT: LDS, GDS, Constant and Message count.
|
11:8 LGKM_CNT: LDS, GDS, Constant and Message count. 0..15
|
||||||
============ ======================================================
|
===== ================================================ ============
|
||||||
|
|
||||||
This operand may be specified as a positive 16-bit :ref:`integer_number<amdgpu_synid_integer_number>`
|
This operand may be specified as one of the following:
|
||||||
or as a combination of the following symbolic helpers:
|
|
||||||
|
* An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range 0..0xFFFF.
|
||||||
|
* A combination of *vmcnt*, *expcnt*, *lgkmcnt* and other values described below.
|
||||||
|
|
||||||
====================== ======================================================================
|
====================== ======================================================================
|
||||||
Syntax Description
|
Syntax Description
|
||||||
====================== ======================================================================
|
====================== ======================================================================
|
||||||
vmcnt(<*N*>) VM_CNT value. *N* must not exceed the largest VM_CNT value.
|
vmcnt(<*N*>) A VM_CNT value. *N* must not exceed the largest VM_CNT value.
|
||||||
expcnt(<*N*>) EXP_CNT value. *N* must not exceed the largest EXP_CNT value.
|
expcnt(<*N*>) An EXP_CNT value. *N* must not exceed the largest EXP_CNT value.
|
||||||
lgkmcnt(<*N*>) LGKM_CNT value. *N* must not exceed the largest LGKM_CNT value.
|
lgkmcnt(<*N*>) An LGKM_CNT value. *N* must not exceed the largest LGKM_CNT value.
|
||||||
vmcnt_sat(<*N*>) VM_CNT value computed as min(*N*, the largest VM_CNT value).
|
vmcnt_sat(<*N*>) A VM_CNT value computed as min(*N*, the largest VM_CNT value).
|
||||||
expcnt_sat(<*N*>) EXP_CNT value computed as min(*N*, the largest EXP_CNT value).
|
expcnt_sat(<*N*>) An EXP_CNT value computed as min(*N*, the largest EXP_CNT value).
|
||||||
lgkmcnt_sat(<*N*>) LGKM_CNT value computed as min(*N*, the largest LGKM_CNT value).
|
lgkmcnt_sat(<*N*>) An LGKM_CNT value computed as min(*N*, the largest LGKM_CNT value).
|
||||||
====================== ======================================================================
|
====================== ======================================================================
|
||||||
|
|
||||||
These helpers may be specified in any order. Ampersands and commas may be used as optional separators.
|
These values may be specified in any order. Spaces, ampersands and commas may be used as optional separators.
|
||||||
|
|
||||||
*N* is either an
|
*N* is either an
|
||||||
:ref:`integer number<amdgpu_synid_integer_number>` or an
|
:ref:`integer number<amdgpu_synid_integer_number>` or an
|
||||||
@ -46,10 +48,18 @@ Examples:
|
|||||||
|
|
||||||
.. parsed-literal::
|
.. parsed-literal::
|
||||||
|
|
||||||
s_waitcnt 0
|
vm_cnt = 1
|
||||||
|
exp_cnt = 2
|
||||||
|
lgkm_cnt = 3
|
||||||
|
cnt = vm_cnt | (exp_cnt << 4) | (lgkm_cnt << 8)
|
||||||
|
|
||||||
|
s_waitcnt cnt
|
||||||
|
s_waitcnt 1 | (2 << 4) | (3 << 8) // the same as above
|
||||||
|
s_waitcnt vmcnt(1) expcnt(2) lgkmcnt(3) // the same as above
|
||||||
|
s_waitcnt vmcnt(vm_cnt) expcnt(exp_cnt) lgkmcnt(lgkm_cnt) // the same as above
|
||||||
|
|
||||||
s_waitcnt vmcnt(1)
|
s_waitcnt vmcnt(1)
|
||||||
s_waitcnt expcnt(2) lgkmcnt(3)
|
s_waitcnt expcnt(2) lgkmcnt(3)
|
||||||
s_waitcnt vmcnt(1) expcnt(2) lgkmcnt(3)
|
|
||||||
s_waitcnt vmcnt(1), expcnt(2), lgkmcnt(3)
|
s_waitcnt vmcnt(1), expcnt(2), lgkmcnt(3)
|
||||||
s_waitcnt vmcnt(1) & lgkmcnt_sat(100) & expcnt(2)
|
s_waitcnt vmcnt(1) & lgkmcnt_sat(100) & expcnt(2)
|
||||||
|
|
||||||
|
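The relation between the symbolic and numeric forms of this operand can be sketched in Python. This is an illustration under the bit layout and value ranges from the table above (VM_CNT in bits 3:0, EXP_CNT in bits 6:4, LGKM_CNT in bits 11:8); the function name and the `saturate` flag are hypothetical and only mimic the difference between the plain and `_sat` helpers.

```python
# Largest counter values, taken from the "Value Range" column.
VM_CNT_MAX, EXP_CNT_MAX, LGKM_CNT_MAX = 15, 7, 15


def encode_waitcnt(vmcnt, expcnt, lgkmcnt, saturate=False):
    """Pack three counters into the numeric s_waitcnt operand."""
    packed = []
    for name, value, largest in (("vmcnt", vmcnt, VM_CNT_MAX),
                                 ("expcnt", expcnt, EXP_CNT_MAX),
                                 ("lgkmcnt", lgkmcnt, LGKM_CNT_MAX)):
        if value < 0:
            raise ValueError(name + " must not be negative")
        if saturate:
            value = min(value, largest)   # behaves like the *_sat helpers
        elif value > largest:
            raise ValueError(name + " exceeds the largest value")
        packed.append(value)
    vm, exp, lgkm = packed
    return vm | (exp << 4) | (lgkm << 8)


print(hex(encode_waitcnt(1, 2, 3)))                   # vmcnt(1) expcnt(2) lgkmcnt(3)
print(hex(encode_waitcnt(1, 2, 100, saturate=True)))  # lgkmcnt clamped to 15
```

The sketch always takes all three counters; how the assembler fills in counters that are omitted from the operand is not covered here.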
@ -10,5 +10,5 @@
|
|||||||
imm16
|
imm16
|
||||||
===========================
|
===========================
|
||||||
|
|
||||||
An :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 16 bits.
|
A 16-bit :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range -32768..65535.
|
||||||
|
|
||||||
|
@ -10,5 +10,5 @@
|
|||||||
imm32
|
imm32
|
||||||
===========================
|
===========================
|
||||||
|
|
||||||
An :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 32 bits.
|
An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value is truncated to 32 bits.
|
||||||
|
|
||||||
|
@ -21,7 +21,7 @@ Optionally may serve as an output data:
|
|||||||
* :ref:`dmask<amdgpu_synid_dmask>` may specify 2 data elements for 32-bit-per-pixel surfaces or 4 data elements for 64-bit-per-pixel surfaces. Each data element occupies 1 dword.
|
* :ref:`dmask<amdgpu_synid_dmask>` may specify 2 data elements for 32-bit-per-pixel surfaces or 4 data elements for 64-bit-per-pixel surfaces. Each data element occupies 1 dword.
|
||||||
* :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
|
* :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
|
||||||
|
|
||||||
Note. The surface data format is indicated in the image resource constant but not in the instruction.
|
Note: the surface data format is indicated in the image resource constant but not in the instruction.
|
||||||
|
|
||||||
|
|
||||||
*Operands:* :ref:`v<amdgpu_synid_v>`
|
*Operands:* :ref:`v<amdgpu_synid_v>`
|
||||||
|
@ -21,6 +21,6 @@ Optionally may serve as an output data:
|
|||||||
* :ref:`dmask<amdgpu_synid_dmask>` may specify 1 data element for 32-bit-per-pixel surfaces or 2 data elements for 64-bit-per-pixel surfaces. Each data element occupies 1 dword.
|
* :ref:`dmask<amdgpu_synid_dmask>` may specify 1 data element for 32-bit-per-pixel surfaces or 2 data elements for 64-bit-per-pixel surfaces. Each data element occupies 1 dword.
|
||||||
* :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
|
* :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
|
||||||
|
|
||||||
Note. The surface data format is indicated in the image resource constant but not in the instruction.
|
Note: the surface data format is indicated in the image resource constant but not in the instruction.
|
||||||
|
|
||||||
*Operands:* :ref:`v<amdgpu_synid_v>`
|
*Operands:* :ref:`v<amdgpu_synid_v>`
|
||||||
|
@ -10,5 +10,6 @@
|
|||||||
imm32
|
imm32
|
||||||
===========================
|
===========================
|
||||||
|
|
||||||
An :ref:`integer_number<amdgpu_synid_integer_number>` or a :ref:`floating-point_number<amdgpu_synid_floating-point_number>`. The number is converted to *f16* as described :ref:`here<amdgpu_synid_lit_conv>`.
|
A :ref:`floating-point_number<amdgpu_synid_floating-point_number>`, an :ref:`integer_number<amdgpu_synid_integer_number>`, or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`.
|
||||||
|
The value is converted to *f16* as described :ref:`here<amdgpu_synid_fp_conv>`.
|
||||||
|
|
||||||
|
@ -10,5 +10,6 @@
|
|||||||
imm32
|
imm32
|
||||||
===========================
|
===========================
|
||||||
|
|
||||||
An :ref:`integer_number<amdgpu_synid_integer_number>` or a :ref:`floating-point_number<amdgpu_synid_floating-point_number>`. The value is converted to *f32* as described :ref:`here<amdgpu_synid_lit_conv>`.
|
A :ref:`floating-point_number<amdgpu_synid_floating-point_number>`, an :ref:`integer_number<amdgpu_synid_integer_number>`, or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`.
|
||||||
|
The value is converted to *f32* as described :ref:`here<amdgpu_synid_fp_conv>`.
|
||||||
|
|
||||||
|
@ -14,18 +14,21 @@ Bits of a hardware register being accessed.
|
|||||||
|
|
||||||
The bits of this operand have the following meaning:
|
The bits of this operand have the following meaning:
|
||||||
|
|
||||||
============ ===================================
|
======= ===================== ============
|
||||||
Bits Description
|
Bits Description Value Range
|
||||||
============ ===================================
|
======= ===================== ============
|
||||||
5:0 Register *id*.
|
5:0 Register *id*. 0..63
|
||||||
10:6 First bit *offset* (0..31).
|
10:6 First bit *offset*. 0..31
|
||||||
15:11 *Size* in bits (1..32).
|
15:11 *Size* in bits. 1..32
|
||||||
============ ===================================
|
======= ===================== ============
|
||||||
|
|
||||||
This operand may be specified as a positive 16-bit :ref:`integer_number<amdgpu_synid_integer_number>` or using the syntax described below.
|
This operand may be specified as one of the following:
|
||||||
|
|
||||||
|
* An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range 0..0xFFFF.
|
||||||
|
* An *hwreg* value described below.
|
||||||
|
|
||||||
==================================== ============================================================================
|
==================================== ============================================================================
|
||||||
Syntax Description
|
Hwreg Value Syntax Description
|
||||||
==================================== ============================================================================
|
==================================== ============================================================================
|
||||||
hwreg({0..63}) All bits of a register indicated by its *id*.
|
hwreg({0..63}) All bits of a register indicated by its *id*.
|
||||||
hwreg(<*name*>) All bits of a register indicated by its *name*.
|
hwreg(<*name*>) All bits of a register indicated by its *name*.
|
||||||
@ -33,7 +36,8 @@ This operand may be specified as a positive 16-bit :ref:`integer_number<amdgpu_s
|
|||||||
hwreg(<*name*>, {0..31}, {1..32}) Register bits indicated by register *name*, first bit *offset* and *size*.
|
hwreg(<*name*>, {0..31}, {1..32}) Register bits indicated by register *name*, first bit *offset* and *size*.
|
||||||
==================================== ============================================================================
|
==================================== ============================================================================
|
||||||
|
|
||||||
Register *id*, *offset* and *size* must be specified as positive :ref:`integer numbers<amdgpu_synid_integer_number>`.
|
Numeric values may be specified as positive :ref:`integer numbers<amdgpu_synid_integer_number>`
|
||||||
|
or :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
|
||||||
|
|
||||||
Defined register *names* include:
|
Defined register *names* include:
|
||||||
|
|
||||||
@ -54,7 +58,16 @@ Examples:
|
|||||||
|
|
||||||
.. parsed-literal::
|
.. parsed-literal::
|
||||||
|
|
||||||
s_getreg_b32 s2, 0x6
|
reg = 1
|
||||||
|
offset = 2
|
||||||
|
size = 4
|
||||||
|
hwreg_enc = reg | (offset << 6) | ((size - 1) << 11)
|
||||||
|
|
||||||
|
s_getreg_b32 s2, 0x1881
|
||||||
|
s_getreg_b32 s2, hwreg_enc // the same as above
|
||||||
|
s_getreg_b32 s2, hwreg(1, 2, 4) // the same as above
|
||||||
|
s_getreg_b32 s2, hwreg(reg, offset, size) // the same as above
|
||||||
|
|
||||||
s_getreg_b32 s2, hwreg(15)
|
s_getreg_b32 s2, hwreg(15)
|
||||||
s_getreg_b32 s2, hwreg(51, 1, 31)
|
s_getreg_b32 s2, hwreg(51, 1, 31)
|
||||||
s_getreg_b32 s2, hwreg(HW_REG_LDS_ALLOC, 0, 1)
|
s_getreg_b32 s2, hwreg(HW_REG_LDS_ALLOC, 0, 1)
|
||||||
|
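The `hwreg_enc` expression from the example can be wrapped into a pair of helpers to check encodings by hand. This is a sketch only: the function names are hypothetical, and it assumes, as the example expression suggests, that the 5-bit size field in bits 15:11 stores *size* minus one. The defaults model `hwreg(<id>)` as "all bits", that is offset 0 and size 32.

```python
def encode_hwreg(reg_id, offset=0, size=32):
    """Pack register id, first bit offset and size into a 16-bit operand."""
    if not 0 <= reg_id <= 63:
        raise ValueError("register id must be in the range 0..63")
    if not 0 <= offset <= 31:
        raise ValueError("offset must be in the range 0..31")
    if not 1 <= size <= 32:
        raise ValueError("size must be in the range 1..32")
    # Assumption: bits 15:11 hold size - 1, as in the hwreg_enc example.
    return reg_id | (offset << 6) | ((size - 1) << 11)


def decode_hwreg(value):
    """Inverse of encode_hwreg: return (id, offset, size)."""
    return value & 0x3F, (value >> 6) & 0x1F, ((value >> 11) & 0x1F) + 1


print(hex(encode_hwreg(1, 2, 4)))  # matches the 0x1881 example
print(decode_hwreg(0x1881))
```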
66
docs/AMDGPU/gfx9_imask.rst
Normal file
@ -0,0 +1,66 @@
|
|||||||
|
..
|
||||||
|
**************************************************
|
||||||
|
* *
|
||||||
|
* Automatically generated file, do not edit! *
|
||||||
|
* *
|
||||||
|
**************************************************
|
||||||
|
|
||||||
|
.. _amdgpu_synid9_imask:
|
||||||
|
|
||||||
|
imask
|
||||||
|
===========================
|
||||||
|
|
||||||
|
This operand is a mask which controls indexing mode for operands of subsequent instructions.
|
||||||
|
Bits 0, 1 and 2 control indexing of *src0*, *src1* and *src2*, while bit 3 controls indexing of *dst*.
|
||||||
|
Value 1 enables indexing and value 0 disables it.
|
||||||
|
|
||||||
|
===== ========================================
|
||||||
|
Bit Meaning
|
||||||
|
===== ========================================
|
||||||
|
0 Enables or disables *src0* indexing.
|
||||||
|
1 Enables or disables *src1* indexing.
|
||||||
|
2 Enables or disables *src2* indexing.
|
||||||
|
3 Enables or disables *dst* indexing.
|
||||||
|
===== ========================================
|
||||||
|
|
||||||
|
This operand may be specified as one of the following:
|
||||||
|
|
||||||
|
* An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range 0..15.
|
||||||
|
* A *gpr_idx* value described below.
|
||||||
|
|
||||||
|
==================================== ===========================================
|
||||||
|
Gpr_idx Value Syntax Description
|
||||||
|
==================================== ===========================================
|
||||||
|
gpr_idx(*<operands>*) Enable indexing for specified *operands*
|
||||||
|
and disable it for the rest.
|
||||||
|
*Operands* is a comma-separated list of
|
||||||
|
values which may include:
|
||||||
|
|
||||||
|
* "SRC0" - enable *src0* indexing.
|
||||||
|
|
||||||
|
* "SRC1" - enable *src1* indexing.
|
||||||
|
|
||||||
|
* "SRC2" - enable *src2* indexing.
|
||||||
|
|
||||||
|
* "DST" - enable *dst* indexing.
|
||||||
|
|
||||||
|
Each of these values may be specified only
|
||||||
|
once.
|
||||||
|
|
||||||
|
*Operands* list may be empty; this syntax
|
||||||
|
disables indexing for all operands.
|
||||||
|
==================================== ===========================================
|
||||||
|
|
||||||
|
Examples:
|
||||||
|
|
||||||
|
.. parsed-literal::
|
||||||
|
|
||||||
|
s_set_gpr_idx_mode 0
|
||||||
|
s_set_gpr_idx_mode gpr_idx() // the same as above
|
||||||
|
|
||||||
|
s_set_gpr_idx_mode 15
|
||||||
|
s_set_gpr_idx_mode gpr_idx(DST,SRC0,SRC1,SRC2) // the same as above
|
||||||
|
s_set_gpr_idx_mode gpr_idx(SRC0,SRC1,SRC2,DST) // the same as above
|
||||||
|
|
||||||
|
s_set_gpr_idx_mode gpr_idx(DST,SRC1)
|
||||||
|
|
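The mapping from a *gpr_idx* operand list to the numeric mask can be sketched as follows. The bit positions come from the table above (bit 0 for *src0*, bit 1 for *src1*, bit 2 for *src2*, bit 3 for *dst*); the Python function `gpr_idx` is a hypothetical stand-in for the assembler syntax, not part of any tool.

```python
GPR_IDX_BITS = {"SRC0": 0, "SRC1": 1, "SRC2": 2, "DST": 3}


def gpr_idx(*operands):
    """Return the 4-bit mask that enables indexing for the named operands."""
    mask = 0
    for name in operands:
        bit = 1 << GPR_IDX_BITS[name]  # KeyError for an unknown name
        if mask & bit:
            raise ValueError(name + " may be specified only once")
        mask |= bit
    return mask


print(gpr_idx())                              # all indexing disabled
print(gpr_idx("DST", "SRC0", "SRC1", "SRC2"))  # all indexing enabled
print(gpr_idx("DST", "SRC1"))
```

The order of the names does not matter, which is why both four-operand examples above are equivalent to the value 15.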
@ -1,25 +0,0 @@
|
|||||||
..
|
|
||||||
**************************************************
|
|
||||||
* *
|
|
||||||
* Automatically generated file, do not edit! *
|
|
||||||
* *
|
|
||||||
**************************************************
|
|
||||||
|
|
||||||
.. _amdgpu_synid9_imm4:
|
|
||||||
|
|
||||||
imm4
|
|
||||||
===========================
|
|
||||||
|
|
||||||
A positive :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 4 bits.
|
|
||||||
|
|
||||||
This operand is a mask which controls indexing mode for operands of subsequent instructions. Value 1 enables indexing and value 0 disables it.
|
|
||||||
|
|
||||||
============ ========================================
|
|
||||||
Bit Meaning
|
|
||||||
============ ========================================
|
|
||||||
0 Enables or disables *src0* indexing.
|
|
||||||
1 Enables or disables *src1* indexing.
|
|
||||||
2 Enables or disables *src2* indexing.
|
|
||||||
3 Enables or disables *dst* indexing.
|
|
||||||
============ ========================================
|
|
||||||
|
|
@ -12,19 +12,26 @@ label
|
|||||||
|
|
||||||
A branch target which is a 16-bit signed integer treated as a PC-relative dword offset.
|
A branch target which is a 16-bit signed integer treated as a PC-relative dword offset.
|
||||||
|
|
||||||
This operand may be specified as:
|
This operand may be specified as one of the following:
|
||||||
|
|
||||||
* An :ref:`integer_number<amdgpu_synid_integer_number>`. The number is truncated to 16 bits.
|
* An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range -32768..65535.
|
||||||
* An :ref:`absolute_expression<amdgpu_synid_absolute_expression>` which must start with an :ref:`integer_number<amdgpu_synid_integer_number>`. The value of the expression is truncated to 16 bits.
|
* A :ref:`symbol<amdgpu_synid_symbol>` (for example, a label) representing a relocatable address in the same compilation unit where it is referred from. The value is handled as a 16-bit PC-relative dword offset to be resolved by a linker.
|
||||||
* A :ref:`symbol<amdgpu_synid_symbol>` (for example, a label). The value is handled as a 16-bit PC-relative dword offset to be resolved by a linker.
|
|
||||||
|
|
||||||
Examples:
|
Examples:
|
||||||
|
|
||||||
.. parsed-literal::
|
.. parsed-literal::
|
||||||
|
|
||||||
offset = 30
|
offset = 30
|
||||||
s_branch loop_end
|
label_1:
|
||||||
s_branch 2 + offset
|
label_2 = . + 4
|
||||||
s_branch 32
|
|
||||||
loop_end:
|
s_branch 32
|
||||||
|
s_branch offset + 2
|
||||||
|
s_branch label_1
|
||||||
|
s_branch label_2
|
||||||
|
s_branch label_3
|
||||||
|
s_branch label_4
|
||||||
|
|
||||||
|
label_3 = label_2 + 4
|
||||||
|
label_4:
|
||||||
|
|
||||||
|
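To see what a numeric *label* operand means, the 16-bit field can be interpreted as a signed dword offset. The sketch below is an illustration with hypothetical helper names. It also assumes that the offset is applied relative to the instruction following the branch (PC + 4), which is an assumption taken from general knowledge of the ISA and not from this text.

```python
def signed16(field):
    """Interpret a 16-bit field as a two's complement signed integer."""
    field &= 0xFFFF
    return field - 0x10000 if field & 0x8000 else field


def branch_target(pc, field):
    """Byte address reached by a branch at byte address pc.

    Assumption: target = pc + 4 + offset * 4, where offset is the
    signed dword offset stored in the 16-bit field.
    """
    return pc + 4 + signed16(field) * 4


print(hex(branch_target(0x100, 32)))      # forward by 32 dwords
print(hex(branch_target(0x100, 0xFFFF)))  # offset -1
```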
@ -12,24 +12,29 @@ msg
|
|||||||
|
|
||||||
A 16-bit message code. The bits of this operand have the following meaning:
|
A 16-bit message code. The bits of this operand have the following meaning:
|
||||||
|
|
||||||
============ ======================================================
|
============ =============================== ===============
|
||||||
Bits Description
|
Bits Description Value Range
|
||||||
============ ======================================================
|
============ =============================== ===============
|
||||||
3:0 Message *type*.
|
3:0 Message *type*. 0..15
|
||||||
6:4 Optional *operation*.
|
6:4 Optional *operation*. 0..7
|
||||||
9:7 Optional *parameters*.
|
7:7 Unused. \-
|
||||||
15:10 Unused.
|
9:8 Optional *stream*. 0..3
|
||||||
============ ======================================================
|
15:10 Unused. \-
|
||||||
|
============ =============================== ===============
|
||||||
|
|
||||||
This operand may be specified as a positive 16-bit :ref:`integer_number<amdgpu_synid_integer_number>` or using the syntax described below:
|
This operand may be specified as one of the following:
|
||||||
|
|
||||||
======================================== ========================================================================
|
* An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range 0..0xFFFF.
|
||||||
Syntax Description
|
* A *sendmsg* value described below.
|
||||||
======================================== ========================================================================
|
|
||||||
sendmsg(<*type*>) A message identified by its *type*.
|
==================================== ====================================================
|
||||||
sendmsg(<*type*>, <*op*>) A message identified by its *type* and *operation*.
|
Sendmsg Value Syntax Description
|
||||||
sendmsg(<*type*>, <*op*>, <*stream*>) A message identified by its *type* and *operation* with a stream *id*.
|
==================================== ====================================================
|
||||||
======================================== ========================================================================
|
sendmsg(<*type*>) A message identified by its *type*.
|
||||||
|
sendmsg(<*type*>,<*op*>) A message identified by its *type* and *operation*.
|
||||||
|
sendmsg(<*type*>,<*op*>,<*stream*>) A message identified by its *type* and *operation*
|
||||||
|
with a stream *id*.
|
||||||
|
==================================== ====================================================
|
||||||
|
|
||||||
*Type* may be specified using message *name* or message *id*.
|
*Type* may be specified using message *name* or message *id*.
|
||||||
|
|
||||||
@ -37,7 +42,8 @@ This operand may be specified as a positive 16-bit :ref:`integer_number<amdgpu_s
|
|||||||
|
|
||||||
Stream *id* is an integer in the range 0..3.
|
Stream *id* is an integer in the range 0..3.
|
||||||
|
|
||||||
Message *id*, operation *id* and stream *id* must be specified as positive :ref:`integer numbers<amdgpu_synid_integer_number>`.
|
Numeric values may be specified as positive :ref:`integer numbers<amdgpu_synid_integer_number>`
|
||||||
|
or :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
|
||||||
|
|
||||||
Each message type supports specific operations:
|
Each message type supports specific operations:
|
||||||
|
|
||||||
@ -60,16 +66,32 @@ Each message type supports specific operations:
|
|||||||
\ SYSMSG_OP_TTRACE_PC 4 \-
|
\ SYSMSG_OP_TTRACE_PC 4 \-
|
||||||
================= ========== ============================== ============ ==========
|
================= ========== ============================== ============ ==========
|
||||||
|
|
||||||
|
*Sendmsg* arguments are validated depending on how *type* value is specified:
|
||||||
|
|
||||||
|
* If message *type* is specified by name, argument values must satisfy limitations detailed in the table above.
|
||||||
|
* If message *type* is specified as a number, each argument must not exceed the corresponding value range (see the first table).
|
||||||
|
|
||||||
Examples:
|
Examples:
|
||||||
|
|
||||||
.. parsed-literal::
|
.. parsed-literal::
|
||||||
|
|
||||||
|
// numeric message code
|
||||||
|
msg = 0x10
|
||||||
s_sendmsg 0x12
|
s_sendmsg 0x12
|
||||||
|
s_sendmsg msg + 2
|
||||||
|
|
||||||
|
// sendmsg with strict argument validation
|
||||||
s_sendmsg sendmsg(MSG_INTERRUPT)
|
s_sendmsg sendmsg(MSG_INTERRUPT)
|
||||||
s_sendmsg sendmsg(MSG_GET_DOORBELL)
|
|
||||||
s_sendmsg sendmsg(2, GS_OP_CUT)
|
|
||||||
s_sendmsg sendmsg(MSG_GS, GS_OP_EMIT)
|
s_sendmsg sendmsg(MSG_GS, GS_OP_EMIT)
|
||||||
s_sendmsg sendmsg(MSG_GS, 2)
|
s_sendmsg sendmsg(MSG_GS, 2)
|
||||||
s_sendmsg sendmsg(MSG_GS_DONE, GS_OP_EMIT_CUT, 1)
|
s_sendmsg sendmsg(MSG_GS_DONE, GS_OP_EMIT_CUT, 1)
|
||||||
s_sendmsg sendmsg(MSG_SYSMSG, SYSMSG_OP_TTRACE_PC)
|
s_sendmsg sendmsg(MSG_SYSMSG, SYSMSG_OP_TTRACE_PC)
|
||||||
|
s_sendmsg sendmsg(MSG_GET_DOORBELL)
|
||||||
|
|
||||||
|
// sendmsg with validation of value range only
|
||||||
|
msg = 2
|
||||||
|
op = 3
|
||||||
|
stream = 1
|
||||||
|
s_sendmsg sendmsg(msg, op, stream)
|
||||||
|
s_sendmsg sendmsg(2, GS_OP_CUT)
|
||||||
|
|
||||||
|
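The range-only validation that applies when message *type* is given as a number can be sketched in Python. The layout and ranges are taken from the first table of this operand (type in bits 3:0 with range 0..15, operation in bits 6:4 with range 0..7, stream in bits 9:8 with range 0..3); the function `sendmsg` is a hypothetical model of the syntax, and the stricter per-name checks are not modelled.

```python
def sendmsg(msg_type, op=0, stream=0):
    """Pack numeric type, operation and stream into a 16-bit message code."""
    for name, value, largest in (("type", msg_type, 15),
                                 ("operation", op, 7),
                                 ("stream", stream, 3)):
        if not 0 <= value <= largest:
            raise ValueError("%s must be in the range 0..%d" % (name, largest))
    return msg_type | (op << 4) | (stream << 8)


msg, op, stream = 2, 3, 1
print(hex(sendmsg(msg, op, stream)))  # numeric form of sendmsg(msg, op, stream)
print(hex(sendmsg(2, 1)))             # same bits as "s_sendmsg 0x12"
```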
@ -12,7 +12,8 @@ imm3
|
|||||||
|
|
||||||
A bit mask which indicates request permissions.
|
A bit mask which indicates request permissions.
|
||||||
|
|
||||||
This operand must be specified as an :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 7 bits, but only 3 low bits are significant.
|
This operand must be specified as an :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`.
|
||||||
|
The value is truncated to 7 bits, but only 3 low bits are significant.
|
||||||
|
|
||||||
============ ==============================
|
============ ==============================
|
||||||
Bit Number Description
|
Bit Number Description
|
||||||
|
@ -10,5 +10,5 @@
|
|||||||
imm16
|
imm16
|
||||||
===========================
|
===========================
|
||||||
|
|
||||||
An :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 16 bits and then sign-extended to 32 bits.
|
An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range -32768..65535.
|
||||||
|
|
||||||
|
@ -10,5 +10,5 @@
|
|||||||
imm16
|
imm16
|
||||||
===========================
|
===========================
|
||||||
|
|
||||||
An :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 16 bits and then zero-extended to 32 bits.
|
An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range 0..65535.
|
||||||
|
|
||||||
|
@ -14,30 +14,31 @@ Counts of outstanding instructions to wait for.
|
|||||||
|
|
||||||
The bits of this operand have the following meaning:
|
The bits of this operand have the following meaning:
|
||||||
|
|
||||||
============ ======================================================
|
========== ========= ================================================ ============
|
||||||
Bits Description
|
High Bits Low Bits Description Value Range
|
||||||
============ ======================================================
|
========== ========= ================================================ ============
|
||||||
3:0 VM_CNT: vector memory operations count, lower bits.
|
15:14 3:0 VM_CNT: vector memory operations count. 0..63
|
||||||
6:4 EXP_CNT: export count.
|
\- 6:4 EXP_CNT: export count. 0..7
|
||||||
11:8 LGKM_CNT: LDS, GDS, Constant and Message count.
|
\- 11:8 LGKM_CNT: LDS, GDS, Constant and Message count. 0..15
|
||||||
15:14 VM_CNT: vector memory operations count, upper bits.
|
========== ========= ================================================ ============
|
||||||
============ ======================================================
|
|
||||||
|
|
||||||
This operand may be specified as a positive 16-bit :ref:`integer_number<amdgpu_synid_integer_number>`
|
This operand may be specified as one of the following:
|
||||||
or as a combination of the following symbolic helpers:
|
|
||||||
|
* An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range 0..0xFFFF.
|
||||||
|
* A combination of *vmcnt*, *expcnt*, *lgkmcnt* and other values described below.
|
||||||
|
|
||||||
====================== ======================================================================
|
====================== ======================================================================
|
||||||
Syntax Description
|
Syntax Description
|
||||||
====================== ======================================================================
|
====================== ======================================================================
|
||||||
vmcnt(<*N*>) VM_CNT value. *N* must not exceed the largest VM_CNT value.
|
vmcnt(<*N*>) A VM_CNT value. *N* must not exceed the largest VM_CNT value.
|
||||||
expcnt(<*N*>) EXP_CNT value. *N* must not exceed the largest EXP_CNT value.
|
expcnt(<*N*>) An EXP_CNT value. *N* must not exceed the largest EXP_CNT value.
|
||||||
lgkmcnt(<*N*>) LGKM_CNT value. *N* must not exceed the largest LGKM_CNT value.
|
lgkmcnt(<*N*>) An LGKM_CNT value. *N* must not exceed the largest LGKM_CNT value.
|
||||||
vmcnt_sat(<*N*>) VM_CNT value computed as min(*N*, the largest VM_CNT value).
|
vmcnt_sat(<*N*>) A VM_CNT value computed as min(*N*, the largest VM_CNT value).
|
||||||
expcnt_sat(<*N*>) EXP_CNT value computed as min(*N*, the largest EXP_CNT value).
|
expcnt_sat(<*N*>) An EXP_CNT value computed as min(*N*, the largest EXP_CNT value).
|
||||||
lgkmcnt_sat(<*N*>) LGKM_CNT value computed as min(*N*, the largest LGKM_CNT value).
|
lgkmcnt_sat(<*N*>) An LGKM_CNT value computed as min(*N*, the largest LGKM_CNT value).
|
||||||
====================== ======================================================================
|
====================== ======================================================================
|
||||||
|
|
||||||
These helpers may be specified in any order. Ampersands and commas may be used as optional separators.
|
These values may be specified in any order. Spaces, ampersands and commas may be used as optional separators.
|
||||||
|
|
||||||
*N* is either an
|
*N* is either an
|
||||||
:ref:`integer number<amdgpu_synid_integer_number>` or an
|
:ref:`integer number<amdgpu_synid_integer_number>` or an
|
||||||
@ -47,10 +48,18 @@ Examples:
|
|||||||
|
|
||||||
.. parsed-literal::
|
.. parsed-literal::
|
||||||
|
|
||||||
s_waitcnt 0
|
vm_cnt = 1
|
||||||
|
exp_cnt = 2
|
||||||
|
lgkm_cnt = 3
|
||||||
|
cnt = vm_cnt | (exp_cnt << 4) | (lgkm_cnt << 8)
|
||||||
|
|
||||||
|
s_waitcnt cnt
|
||||||
|
s_waitcnt 1 | (2 << 4) | (3 << 8) // the same as above
|
||||||
|
s_waitcnt vmcnt(1) expcnt(2) lgkmcnt(3) // the same as above
|
||||||
|
s_waitcnt vmcnt(vm_cnt) expcnt(exp_cnt) lgkmcnt(lgkm_cnt) // the same as above
|
||||||
|
|
||||||
s_waitcnt vmcnt(1)
|
s_waitcnt vmcnt(1)
|
||||||
s_waitcnt expcnt(2) lgkmcnt(3)
|
s_waitcnt expcnt(2) lgkmcnt(3)
|
||||||
s_waitcnt vmcnt(1) expcnt(2) lgkmcnt(3)
|
|
||||||
s_waitcnt vmcnt(1), expcnt(2), lgkmcnt(3)
|
s_waitcnt vmcnt(1), expcnt(2), lgkmcnt(3)
|
||||||
s_waitcnt vmcnt(1) & lgkmcnt_sat(100) & expcnt(2)
|
s_waitcnt vmcnt(1) & lgkmcnt_sat(100) & expcnt(2)
|
||||||
|
|
||||||
|
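In this variant VM_CNT is split across two bit ranges, so the plain shift-and-or expression from the examples only covers values up to 15. A sketch of the full packing, following the table above (VM_CNT low bits in 3:0 and high bits in 15:14, EXP_CNT in 6:4, LGKM_CNT in 11:8), is given below; the function name is hypothetical.

```python
def encode_waitcnt_split_vmcnt(vmcnt, expcnt, lgkmcnt):
    """Pack counters when VM_CNT occupies bits 3:0 and 15:14."""
    if not 0 <= vmcnt <= 63:
        raise ValueError("vmcnt must be in the range 0..63")
    if not 0 <= expcnt <= 7:
        raise ValueError("expcnt must be in the range 0..7")
    if not 0 <= lgkmcnt <= 15:
        raise ValueError("lgkmcnt must be in the range 0..15")
    vm_low = vmcnt & 0xF   # goes to bits 3:0
    vm_high = vmcnt >> 4   # goes to bits 15:14
    return vm_low | (expcnt << 4) | (lgkmcnt << 8) | (vm_high << 14)


print(hex(encode_waitcnt_split_vmcnt(1, 2, 3)))    # same as 1 | (2 << 4) | (3 << 8)
print(hex(encode_waitcnt_split_vmcnt(16, 0, 0)))   # only a high VM_CNT bit is set
print(hex(encode_waitcnt_split_vmcnt(63, 7, 15)))  # every counter at its largest value
```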
@ -34,19 +34,21 @@ Specifies an immediate unsigned 8-bit offset, in bytes. The default value is 0.
|
|||||||
|
|
||||||
Used with DS instructions which have 2 addresses.
|
Used with DS instructions which have 2 addresses.
|
||||||
|
|
||||||
=================== =====================================================
|
=================== ====================================================================
|
||||||
Syntax Description
|
Syntax Description
|
||||||
=================== =====================================================
|
=================== ====================================================================
|
||||||
offset:{0..0xFF} Specifies an unsigned 8-bit offset as a positive
|
offset:{0..0xFF} Specifies an unsigned 8-bit offset as a positive
|
||||||
:ref:`integer number <amdgpu_synid_integer_number>`.
|
:ref:`integer number <amdgpu_synid_integer_number>`
|
||||||
=================== =====================================================
|
or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
|
||||||
|
=================== ====================================================================
|
||||||
|
|
||||||
Examples:
|
Examples:
|
||||||
|
|
||||||
.. parsed-literal::
|
.. parsed-literal::
|
||||||
|
|
||||||
offset:255
|
|
||||||
offset:0xff
|
offset:0xff
|
||||||
|
offset:2-x
|
||||||
|
offset:-x-y
|
||||||
|
|
||||||
.. _amdgpu_synid_ds_offset16:
|
.. _amdgpu_synid_ds_offset16:
|
||||||
|
|
||||||
@ -57,12 +59,13 @@ Specifies an immediate unsigned 16-bit offset, in bytes. The default value is 0.
|
|||||||
|
|
||||||
Used with DS instructions which have 1 address.
|
Used with DS instructions which have 1 address.
|
||||||
|
|
||||||
==================== ======================================================
|
==================== ====================================================================
|
||||||
Syntax Description
|
Syntax Description
|
||||||
==================== ======================================================
|
==================== ====================================================================
|
||||||
offset:{0..0xFFFF} Specifies an unsigned 16-bit offset as a positive
|
offset:{0..0xFFFF} Specifies an unsigned 16-bit offset as a positive
|
||||||
:ref:`integer number <amdgpu_synid_integer_number>`.
|
:ref:`integer number <amdgpu_synid_integer_number>`
|
||||||
==================== ======================================================
|
or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
|
||||||
|
==================== ====================================================================
|
||||||
|
|
||||||
Examples:
|
Examples:
|
||||||
|
|
||||||
@ -70,6 +73,7 @@ Examples:
|
|||||||
|
|
||||||
offset:65535
|
offset:65535
|
||||||
offset:0xffff
|
offset:0xffff
|
||||||
|
offset:-x-y
|
||||||
|
|
||||||
.. _amdgpu_synid_sw_offset16:
|
.. _amdgpu_synid_sw_offset16:
|
||||||
|
|
||||||
@ -95,7 +99,7 @@ See AMD documentation for more information.
|
|||||||
|
|
||||||
*mask* is a 5 character sequence which
|
*mask* is a 5 character sequence which
|
||||||
specifies how to transform the bits of the
|
specifies how to transform the bits of the
|
||||||
lane *id*.
|
lane *id*.
|
||||||
|
|
||||||
The following characters are allowed:
|
The following characters are allowed:
|
||||||
|
|
||||||
@ -116,7 +120,7 @@ See AMD documentation for more information.
|
|||||||
size and must be equal to 2, 4, 8, 16 or 32.
|
size and must be equal to 2, 4, 8, 16 or 32.
|
||||||
|
|
||||||
The second numeric parameter is an index of the
|
The second numeric parameter is an index of the
|
||||||
lane being broadcasted.
|
lane being broadcasted.
|
||||||
|
|
||||||
The index must not exceed group size.
|
The index must not exceed group size.
|
||||||
offset:swizzle(SWAP,{1..16}) Specifies a swap mode.
|
offset:swizzle(SWAP,{1..16}) Specifies a swap mode.
|
||||||
@ -128,7 +132,7 @@ See AMD documentation for more information.
|
|||||||
Reverses the lanes for groups of 2, 4, 8, 16 or 32 lanes.
|
Reverses the lanes for groups of 2, 4, 8, 16 or 32 lanes.
|
||||||
======================================================= ===========================================================
|
======================================================= ===========================================================
|
||||||
|
|
||||||
Numeric parameters may be specified as either :ref:`integer numbers<amdgpu_synid_integer_number>` or
|
Note: numeric values may be specified as either :ref:`integer numbers<amdgpu_synid_integer_number>` or
|
||||||
:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
|
:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
|
||||||
|
|
||||||
Examples:
|
Examples:
|
||||||
@ -137,7 +141,7 @@ Examples:
|
|||||||
|
|
||||||
offset:255
|
offset:255
|
||||||
offset:0xffff
|
offset:0xffff
|
||||||
offset:swizzle(QUAD_PERM, 0, 1, 2 ,3)
|
offset:swizzle(QUAD_PERM, 0, 1, 2, 3)
|
||||||
offset:swizzle(BITMASK_PERM, "01pi0")
|
offset:swizzle(BITMASK_PERM, "01pi0")
|
||||||
offset:swizzle(BROADCAST, 2, 0)
|
offset:swizzle(BROADCAST, 2, 0)
|
||||||
offset:swizzle(SWAP, 8)
|
offset:swizzle(SWAP, 8)
|
||||||
@ -212,19 +216,20 @@ Specifies an immediate unsigned 12-bit offset, in bytes. The default value is 0.
|
|||||||
|
|
||||||
Cannot be used with *global/scratch* opcodes. GFX9 only.
|
Cannot be used with *global/scratch* opcodes. GFX9 only.
|
||||||
|
|
||||||
================= ======================================================
|
================= ====================================================================
|
||||||
Syntax Description
|
Syntax Description
|
||||||
================= ======================================================
|
================= ====================================================================
|
||||||
offset:{0..4095} Specifies a 12-bit unsigned offset as a positive
|
offset:{0..4095} Specifies a 12-bit unsigned offset as a positive
|
||||||
:ref:`integer number <amdgpu_synid_integer_number>`.
|
:ref:`integer number <amdgpu_synid_integer_number>`
|
||||||
================= ======================================================
|
or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
|
||||||
|
================= ====================================================================
|
||||||
|
|
||||||
Examples:
|
Examples:
|
||||||
|
|
||||||
.. parsed-literal::
|
.. parsed-literal::
|
||||||
|
|
||||||
offset:4095
|
offset:4095
|
||||||
offset:0xff
|
offset:x-0xff
|
||||||
|
|
||||||
.. _amdgpu_synid_flat_offset13s:
|
.. _amdgpu_synid_flat_offset13s:
|
||||||
|
|
||||||
@ -235,12 +240,13 @@ Specifies an immediate signed 13-bit offset, in bytes. The default value is 0.
|
|||||||
|
|
||||||
Can be used with *global/scratch* opcodes only. GFX9 only.
|
Can be used with *global/scratch* opcodes only. GFX9 only.
|
||||||
|
|
||||||
============================ =======================================================
|
===================== ====================================================================
|
||||||
Syntax Description
|
Syntax Description
|
||||||
============================ =======================================================
|
===================== ====================================================================
|
||||||
offset:{-4096..4095} Specifies a 13-bit signed offset as an
|
offset:{-4096..4095} Specifies a 13-bit signed offset as an
|
||||||
:ref:`integer number <amdgpu_synid_integer_number>`.
|
:ref:`integer number <amdgpu_synid_integer_number>`
|
||||||
============================ =======================================================
|
or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
|
||||||
|
===================== ====================================================================
|
||||||
|
|
||||||
Examples:
|
Examples:
|
||||||
|
|
||||||
@ -248,6 +254,7 @@ Examples:
|
|||||||
|
|
||||||
offset:-4000
|
offset:-4000
|
||||||
offset:0x10
|
offset:0x10
|
||||||
|
offset:-x
|
||||||
|
|
||||||
.. _amdgpu_synid_flat_offset12s:
|
.. _amdgpu_synid_flat_offset12s:
|
||||||
|
|
||||||
@ -260,12 +267,13 @@ Can be used with *global/scratch* opcodes only.
|
|||||||
|
|
||||||
GFX10 only.
|
GFX10 only.
|
||||||
|
|
||||||
============================ =======================================================
|
===================== ====================================================================
|
||||||
Syntax Description
|
Syntax Description
|
||||||
============================ =======================================================
|
===================== ====================================================================
|
||||||
offset:{-2048..2047} Specifies a 12-bit signed offset as an
|
offset:{-2048..2047} Specifies a 12-bit signed offset as an
|
||||||
:ref:`integer number <amdgpu_synid_integer_number>`.
|
:ref:`integer number <amdgpu_synid_integer_number>`
|
||||||
============================ =======================================================
|
or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
|
||||||
|
===================== ====================================================================
|
||||||
|
|
||||||
Examples:
|
Examples:
|
||||||
|
|
||||||
@ -273,6 +281,7 @@ Examples:
|
|||||||
|
|
||||||
offset:-2000
|
offset:-2000
|
||||||
offset:0x10
|
offset:0x10
|
||||||
|
offset:-x+y
|
||||||
|
|
||||||
.. _amdgpu_synid_flat_offset11:
|
.. _amdgpu_synid_flat_offset11:
|
||||||
|
|
||||||
@ -285,19 +294,20 @@ Cannot be used with *global/scratch* opcodes.
|
|||||||
|
|
||||||
GFX10 only.
|
GFX10 only.
|
||||||
|
|
||||||
================= ======================================================
|
================= ====================================================================
|
||||||
Syntax Description
|
Syntax Description
|
||||||
================= ======================================================
|
================= ====================================================================
|
||||||
offset:{0..2047} Specifies an 11-bit unsigned offset as a positive
|
offset:{0..2047} Specifies an 11-bit unsigned offset as a positive
|
||||||
:ref:`integer number <amdgpu_synid_integer_number>`.
|
:ref:`integer number <amdgpu_synid_integer_number>`
|
||||||
================= ======================================================
|
or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
|
||||||
|
================= ====================================================================
|
||||||
|
|
||||||
Examples:
|
Examples:
|
||||||
|
|
||||||
.. parsed-literal::
|
.. parsed-literal::
|
||||||
|
|
||||||
offset:2047
|
offset:2047
|
||||||
offset:0xff
|
offset:x+0xff
|
||||||
|
|
||||||
dlc
|
dlc
|
||||||
~~~
|
~~~
|
||||||
@ -340,19 +350,18 @@ dmask
|
|||||||
Specifies which channels (image components) are used by the operation. By default, no channels
|
Specifies which channels (image components) are used by the operation. By default, no channels
|
||||||
are used.
|
are used.
|
||||||
|
|
||||||
=============== =====================================================
|
=============== ====================================================================
|
||||||
Syntax Description
|
Syntax Description
|
||||||
=============== =====================================================
|
=============== ====================================================================
|
||||||
dmask:{0..15} Specifies image channels as a positive
|
dmask:{0..15} Specifies image channels as a positive
|
||||||
:ref:`integer number <amdgpu_synid_integer_number>`.
|
:ref:`integer number <amdgpu_synid_integer_number>`
|
||||||
|
or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
|
||||||
|
|
||||||
Each bit corresponds to one of 4 image
|
Each bit corresponds to one of 4 image components (RGBA).
|
||||||
components (RGBA).
|
|
||||||
|
|
||||||
If the specified bit value
|
If the specified bit value is 0, the component is not used,
|
||||||
is 0, the component is not used, value 1 means
|
value 1 means that the component is used.
|
||||||
that the component is used.
|
=============== ====================================================================
|
||||||
=============== =====================================================
|
|
||||||
|
|
||||||
This modifier has some limitations depending on instruction kind:
|
This modifier has some limitations depending on instruction kind:
|
||||||
|
|
||||||
@ -373,7 +382,7 @@ Examples:
|
|||||||
|
|
||||||
dmask:0xf
|
dmask:0xf
|
||||||
dmask:0b1111
|
dmask:0b1111
|
||||||
dmask:3
|
dmask:x|y|z
|
||||||
|
|
||||||
.. _amdgpu_synid_unorm:
|
.. _amdgpu_synid_unorm:
|
||||||
|
|
||||||
@ -468,7 +477,7 @@ Specifies data size: 16 or 32 bits (32 bits by default). Not supported by GFX7.
|
|||||||
Each 16-bit data element occupies 1 VGPR.
|
Each 16-bit data element occupies 1 VGPR.
|
||||||
|
|
||||||
GFX8.1, GFX9 and GFX10 support data packing.
|
GFX8.1, GFX9 and GFX10 support data packing.
|
||||||
Each pair of 16-bit data elements
|
Each pair of 16-bit data elements
|
||||||
occupies 1 VGPR.
|
occupies 1 VGPR.
|
||||||
======================================== ================================================
|
======================================== ================================================
|
||||||
|
|
||||||
@ -684,18 +693,19 @@ offset12
|
|||||||
|
|
||||||
Specifies an immediate unsigned 12-bit offset, in bytes. The default value is 0.
|
Specifies an immediate unsigned 12-bit offset, in bytes. The default value is 0.
|
||||||
|
|
||||||
=============================== ======================================================
|
================== ====================================================================
|
||||||
Syntax Description
|
Syntax Description
|
||||||
=============================== ======================================================
|
================== ====================================================================
|
||||||
offset:{0..0xFFF} Specifies a 12-bit unsigned offset as a positive
|
offset:{0..0xFFF} Specifies a 12-bit unsigned offset as a positive
|
||||||
:ref:`integer number <amdgpu_synid_integer_number>`.
|
:ref:`integer number <amdgpu_synid_integer_number>`
|
||||||
=============================== ======================================================
|
or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
|
||||||
|
================== ====================================================================
|
||||||
|
|
||||||
Examples:
|
Examples:
|
||||||
|
|
||||||
.. parsed-literal::
|
.. parsed-literal::
|
||||||
|
|
||||||
offset:0
|
offset:x+y
|
||||||
offset:0x10
|
offset:0x10
|
||||||
|
|
||||||
glc
|
glc
|
||||||
@ -782,14 +792,18 @@ GFX10 only.
|
|||||||
dpp8_sel
|
dpp8_sel
|
||||||
~~~~~~~~
|
~~~~~~~~
|
||||||
|
|
||||||
Selects which lane to pull data from, within a group of 8 lanes. This is a mandatory modifier.
|
Selects which lanes to pull data from, within a group of 8 lanes. This is a mandatory modifier.
|
||||||
There is no default value.
|
There is no default value.
|
||||||
|
|
||||||
GFX10 only.
|
GFX10 only.
|
||||||
|
|
||||||
The *dpp8_sel* modifier must specify exactly 8 values, each ranging from 0 to 7.
|
The *dpp8_sel* modifier must specify exactly 8 values.
|
||||||
First value selects which lane to read from to supply data into lane 0.
|
First value selects which lane to read from to supply data into lane 0.
|
||||||
Second value controls value for lane 1 and so on.
|
Second value controls lane 1 and so on.
|
||||||
|
|
||||||
|
Each value may be specified as either
|
||||||
|
an :ref:`integer number<amdgpu_synid_integer_number>` or
|
||||||
|
an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
|
||||||
|
|
||||||
=============================================================== ===========================
|
=============================================================== ===========================
|
||||||
Syntax Description
|
Syntax Description
|
||||||
@ -811,7 +825,7 @@ fi
|
|||||||
|
|
||||||
Controls interaction with inactive lanes for *dpp8* instructions. The default value is zero.
|
Controls interaction with inactive lanes for *dpp8* instructions. The default value is zero.
|
||||||
|
|
||||||
Note. *Inactive* lanes are those whose :ref:`exec<amdgpu_synid_exec>` mask bit is zero.
|
Note: *inactive* lanes are those whose :ref:`exec<amdgpu_synid_exec>` mask bit is zero.
|
||||||
|
|
||||||
GFX10 only.
|
GFX10 only.
|
||||||
|
|
||||||
@ -822,6 +836,9 @@ GFX10 only.
|
|||||||
fi:1 Fetch pre-exist values from inactive lanes.
|
fi:1 Fetch pre-exist values from inactive lanes.
|
||||||
==================================== =====================================================
|
==================================== =====================================================
|
||||||
|
|
||||||
|
Note: numeric values may be specified as either :ref:`integer numbers<amdgpu_synid_integer_number>` or
|
||||||
|
:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
|
||||||
|
|
||||||
DPP/DPP16 Modifiers
|
DPP/DPP16 Modifiers
|
||||||
-------------------
|
-------------------
|
||||||
|
|
||||||
@ -837,7 +854,7 @@ There is no default value.
|
|||||||
|
|
||||||
GFX8 and GFX9 only. Use :ref:`dpp16_ctrl<amdgpu_synid_dpp16_ctrl>` for GFX10.
|
GFX8 and GFX9 only. Use :ref:`dpp16_ctrl<amdgpu_synid_dpp16_ctrl>` for GFX10.
|
||||||
|
|
||||||
Note. The lanes of a wavefront are organized in four *rows* and four *banks*.
|
Note: the lanes of a wavefront are organized in four *rows* and four *banks*.
|
||||||
|
|
||||||
======================================== ================================================
|
======================================== ================================================
|
||||||
Syntax Description
|
Syntax Description
|
||||||
@ -856,7 +873,7 @@ Note. The lanes of a wavefront are organized in four *rows* and four *banks*.
|
|||||||
row_ror:{1..15} Row rotate right by 1-15 threads.
|
row_ror:{1..15} Row rotate right by 1-15 threads.
|
||||||
======================================== ================================================
|
======================================== ================================================
|
||||||
|
|
||||||
Note: Numeric parameters may be specified as either
|
Note: numeric values may be specified as either
|
||||||
:ref:`integer numbers<amdgpu_synid_integer_number>` or
|
:ref:`integer numbers<amdgpu_synid_integer_number>` or
|
||||||
:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
|
:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
|
||||||
|
|
||||||
@ -877,7 +894,7 @@ There is no default value.
|
|||||||
|
|
||||||
GFX10 only. Use :ref:`dpp_ctrl<amdgpu_synid_dpp_ctrl>` for GFX8 and GFX9.
|
GFX10 only. Use :ref:`dpp_ctrl<amdgpu_synid_dpp_ctrl>` for GFX8 and GFX9.
|
||||||
|
|
||||||
Note. The lanes of a wavefront are organized in four *rows* and four *banks*.
|
Note: the lanes of a wavefront are organized in four *rows* and four *banks*.
|
||||||
(There are only two rows in *wave32* mode.)
|
(There are only two rows in *wave32* mode.)
|
||||||
|
|
||||||
======================================== ====================================================
|
======================================== ====================================================
|
||||||
@ -894,7 +911,7 @@ Note. The lanes of a wavefront are organized in four *rows* and four *banks*.
|
|||||||
row_ror:{1..15} Row rotate right by 1-15 threads.
|
row_ror:{1..15} Row rotate right by 1-15 threads.
|
||||||
======================================== ====================================================
|
======================================== ====================================================
|
||||||
|
|
||||||
Note: Numeric parameters may be specified as either
|
Note: numeric values may be specified as either
|
||||||
:ref:`integer numbers<amdgpu_synid_integer_number>` or
|
:ref:`integer numbers<amdgpu_synid_integer_number>` or
|
||||||
:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
|
:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
|
||||||
|
|
||||||
@ -912,21 +929,21 @@ row_mask
|
|||||||
|
|
||||||
Controls which rows are enabled for data sharing. By default, all rows are enabled.
|
Controls which rows are enabled for data sharing. By default, all rows are enabled.
|
||||||
|
|
||||||
Note. The lanes of a wavefront are organized in four *rows* and four *banks*.
|
Note: the lanes of a wavefront are organized in four *rows* and four *banks*.
|
||||||
(There are only two rows in *wave32* mode.)
|
(There are only two rows in *wave32* mode.)
|
||||||
|
|
||||||
======================================== =====================================================
|
================= ====================================================================
|
||||||
Syntax Description
|
Syntax Description
|
||||||
======================================== =====================================================
|
================= ====================================================================
|
||||||
row_mask:{0..15} Specifies a *row mask* as a positive
|
row_mask:{0..15} Specifies a *row mask* as a positive
|
||||||
:ref:`integer number <amdgpu_synid_integer_number>`.
|
:ref:`integer number <amdgpu_synid_integer_number>`
|
||||||
|
or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
|
||||||
|
|
||||||
Each of 4 bits in the mask controls one
|
Each of 4 bits in the mask controls one row
|
||||||
row (0 - disabled, 1 - enabled).
|
(0 - disabled, 1 - enabled).
|
||||||
|
|
||||||
In *wave32* mode the values should be limited to
|
In *wave32* mode the values should be limited to 0..7.
|
||||||
{0..7}.
|
================= ====================================================================
|
||||||
======================================== =====================================================
|
|
||||||
|
|
||||||
Examples:
|
Examples:
|
||||||
|
|
||||||
@ -934,7 +951,7 @@ Examples:
|
|||||||
|
|
||||||
row_mask:0xf
|
row_mask:0xf
|
||||||
row_mask:0b1010
|
row_mask:0b1010
|
||||||
row_mask:0b1111
|
row_mask:x|y
|
||||||
|
|
||||||
.. _amdgpu_synid_bank_mask:
|
.. _amdgpu_synid_bank_mask:
|
||||||
|
|
||||||
@ -943,18 +960,19 @@ bank_mask
|
|||||||
|
|
||||||
Controls which banks are enabled for data sharing. By default, all banks are enabled.
|
Controls which banks are enabled for data sharing. By default, all banks are enabled.
|
||||||
|
|
||||||
Note. The lanes of a wavefront are organized in four *rows* and four *banks*.
|
Note: the lanes of a wavefront are organized in four *rows* and four *banks*.
|
||||||
(There are only two rows in *wave32* mode.)
|
(There are only two rows in *wave32* mode.)
|
||||||
|
|
||||||
======================================== =======================================================
|
================== ====================================================================
|
||||||
Syntax Description
|
Syntax Description
|
||||||
======================================== =======================================================
|
================== ====================================================================
|
||||||
bank_mask:{0..15} Specifies a *bank mask* as a positive
|
bank_mask:{0..15} Specifies a *bank mask* as a positive
|
||||||
:ref:`integer number <amdgpu_synid_integer_number>`.
|
:ref:`integer number <amdgpu_synid_integer_number>`
|
||||||
|
or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
|
||||||
|
|
||||||
Each of 4 bits in the mask controls one
|
Each of 4 bits in the mask controls one bank
|
||||||
bank (0 - disabled, 1 - enabled).
|
(0 - disabled, 1 - enabled).
|
||||||
======================================== =======================================================
|
================== ====================================================================
|
||||||
|
|
||||||
Examples:
|
Examples:
|
||||||
|
|
||||||
@ -962,7 +980,7 @@ Examples:
|
|||||||
|
|
||||||
bank_mask:0x3
|
bank_mask:0x3
|
||||||
bank_mask:0b0011
|
bank_mask:0b0011
|
||||||
bank_mask:0b1111
|
bank_mask:x&y
|
||||||
|
|
||||||
.. _amdgpu_synid_bound_ctrl:
|
.. _amdgpu_synid_bound_ctrl:
|
||||||
|
|
||||||
@ -988,7 +1006,7 @@ fi
|
|||||||
|
|
||||||
Controls interaction with *inactive* lanes for *dpp16* instructions. The default value is zero.
|
Controls interaction with *inactive* lanes for *dpp16* instructions. The default value is zero.
|
||||||
|
|
||||||
Note. *Inactive* lanes are those whose :ref:`exec<amdgpu_synid_exec>` mask bit is zero.
|
Note: *inactive* lanes are those whose :ref:`exec<amdgpu_synid_exec>` mask bit is zero.
|
||||||
|
|
||||||
GFX10 only.
|
GFX10 only.
|
||||||
|
|
||||||
@ -1001,6 +1019,9 @@ GFX10 only.
|
|||||||
fi:1 Fetch pre-exist values from inactive lanes.
|
fi:1 Fetch pre-exist values from inactive lanes.
|
||||||
======================================== ==================================================
|
======================================== ==================================================
|
||||||
|
|
||||||
|
Note: numeric values may be specified as either :ref:`integer numbers<amdgpu_synid_integer_number>` or
|
||||||
|
:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
|
||||||
|
|
||||||
SDWA Modifiers
|
SDWA Modifiers
|
||||||
--------------
|
--------------
|
||||||
|
|
||||||
@ -1037,7 +1058,6 @@ Selects which bits in the destination are affected. By default, all bits are aff
|
|||||||
dst_sel:WORD_1 Use bits 31:16.
|
dst_sel:WORD_1 Use bits 31:16.
|
||||||
======================================== ================================================
|
======================================== ================================================
|
||||||
|
|
||||||
|
|
||||||
.. _amdgpu_synid_dst_unused:
|
.. _amdgpu_synid_dst_unused:
|
||||||
|
|
||||||
dst_unused
|
dst_unused
|
||||||
@ -1151,7 +1171,7 @@ operands (both source and destination). First value controls src0, second value
|
|||||||
and so on, except that the last value controls destination.
|
and so on, except that the last value controls destination.
|
||||||
The value 0 selects the low bits, while 1 selects the high bits.
|
The value 0 selects the low bits, while 1 selects the high bits.
|
||||||
|
|
||||||
Note. op_sel modifier affects 16-bit operands only. For 32-bit operands the value specified
|
Note: op_sel modifier affects 16-bit operands only. For 32-bit operands the value specified
|
||||||
by op_sel must be 0.
|
by op_sel must be 0.
|
||||||
|
|
||||||
GFX9 and GFX10 only.
|
GFX9 and GFX10 only.
|
||||||
@ -1164,6 +1184,10 @@ GFX9 and GFX10 only.
|
|||||||
op_sel:[{0..1},{0..1},{0..1},{0..1}] Select operand bits for instructions with 3 source operands.
|
op_sel:[{0..1},{0..1},{0..1},{0..1}] Select operand bits for instructions with 3 source operands.
|
||||||
======================================== ============================================================
|
======================================== ============================================================
|
||||||
|
|
||||||
|
Note: numeric values may be specified as either
|
||||||
|
:ref:`integer numbers<amdgpu_synid_integer_number>` or
|
||||||
|
:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
|
||||||
|
|
||||||
Examples:
|
Examples:
|
||||||
|
|
||||||
.. parsed-literal::
|
.. parsed-literal::
|
||||||
@ -1189,7 +1213,7 @@ Integer clamping is not supported by GFX7.
|
|||||||
For floating point operations, clamp modifier indicates that the result must be clamped
|
For floating point operations, clamp modifier indicates that the result must be clamped
|
||||||
to the range [0.0, 1.0]. By default, there is no clamping.
|
to the range [0.0, 1.0]. By default, there is no clamping.
|
||||||
|
|
||||||
Note. Clamp modifier is applied after :ref:`output modifiers<amdgpu_synid_omod>` (if any).
|
Note: clamp modifier is applied after :ref:`output modifiers<amdgpu_synid_omod>` (if any).
|
||||||
|
|
||||||
======================================== ================================================
|
======================================== ================================================
|
||||||
Syntax Description
|
Syntax Description
|
||||||
@ -1205,12 +1229,12 @@ omod
|
|||||||
Specifies if an output modifier must be applied to the result.
|
Specifies if an output modifier must be applied to the result.
|
||||||
By default, no output modifiers are applied.
|
By default, no output modifiers are applied.
|
||||||
|
|
||||||
Note. Output modifiers are applied before :ref:`clamping<amdgpu_synid_clamp>` (if any).
|
Note: output modifiers are applied before :ref:`clamping<amdgpu_synid_clamp>` (if any).
|
||||||
|
|
||||||
Output modifiers are valid for f32 and f64 floating point results only.
|
Output modifiers are valid for f32 and f64 floating point results only.
|
||||||
They must not be used with f16.
|
They must not be used with f16.
|
||||||
|
|
||||||
Note. *v_cvt_f16_f32* is an exception. This instruction produces f16 result
|
Note: *v_cvt_f16_f32* is an exception. This instruction produces f16 result
|
||||||
but accepts output modifiers.
|
but accepts output modifiers.
|
||||||
|
|
||||||
======================================== ================================================
|
======================================== ================================================
|
||||||
@ -1221,6 +1245,16 @@ but accepts output modifiers.
|
|||||||
div:2 Multiply the result by 0.5.
|
div:2 Multiply the result by 0.5.
|
||||||
======================================== ================================================
|
======================================== ================================================
|
||||||
|
|
||||||
|
Note: numeric values may be specified as either :ref:`integer numbers<amdgpu_synid_integer_number>` or
|
||||||
|
:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
|
||||||
|
|
||||||
|
Examples:
|
||||||
|
|
||||||
|
.. parsed-literal::
|
||||||
|
|
||||||
|
mul:2
|
||||||
|
mul:x // x must be equal to 2 or 4
|
||||||
|
|
||||||
.. _amdgpu_synid_vop3_operand_modifiers:
|
.. _amdgpu_synid_vop3_operand_modifiers:
|
||||||
|
|
||||||
VOP3 Operand Modifiers
|
VOP3 Operand Modifiers
|
||||||
@ -1233,15 +1267,19 @@ Operand modifiers are not used separately. They are applied to source operands.
|
|||||||
abs
|
abs
|
||||||
~~~
|
~~~
|
||||||
|
|
||||||
Computes absolute value of its operand. Applied before :ref:`neg<amdgpu_synid_neg>` (if any).
|
Computes the absolute value of its operand. Must be applied before :ref:`neg<amdgpu_synid_neg>`
|
||||||
Valid for floating point operands only.
|
(if any). Valid for floating point operands only.
|
||||||
|
|
||||||
======================================== ================================================
|
======================================== ====================================================
|
||||||
Syntax Description
|
Syntax Description
|
||||||
======================================== ================================================
|
======================================== ====================================================
|
||||||
abs(<operand>) Get absolute value of operand.
|
abs(<operand>) Get the absolute value of a floating-point operand.
|
||||||
\|<operand>| The same as above.
|
\|<operand>| The same as above (an SP3 syntax).
|
||||||
======================================== ================================================
|
======================================== ====================================================
|
||||||
|
|
||||||
|
Note: avoid using SP3 syntax with operands specified as expressions because the trailing '|'
|
||||||
|
may be misinterpreted. Such operands should be enclosed into additional parentheses as shown
|
||||||
|
in examples below.
|
||||||
|
|
||||||
Examples:
|
Examples:
|
||||||
|
|
||||||
@ -1249,28 +1287,50 @@ Examples:
|
|||||||
|
|
||||||
abs(v36)
|
abs(v36)
|
||||||
\|v36|
|
\|v36|
|
||||||
|
abs(x|y) // ok
|
||||||
|
\|(x|y)| // additional parentheses are required
|
||||||
|
|
||||||
.. _amdgpu_synid_neg:
|
.. _amdgpu_synid_neg:
|
||||||
|
|
||||||
neg
|
neg
|
||||||
~~~
|
~~~
|
||||||
|
|
||||||
Computes negative value of its operand. Applied after :ref:`abs<amdgpu_synid_abs>` (if any).
|
Computes the negative value of its operand. Must be applied after :ref:`abs<amdgpu_synid_abs>`
|
||||||
Valid for floating point operands only.
|
(if any). Valid for floating point operands only.
|
||||||
|
|
||||||
======================================== ================================================
|
================== ====================================================
|
||||||
Syntax Description
|
Syntax Description
|
||||||
======================================== ================================================
|
================== ====================================================
|
||||||
neg(<operand>) Get negative value of operand.
|
neg(<operand>) Get the negative value of a floating-point operand.
|
||||||
-<operand> The same as above.
|
The operand may include an optional
|
||||||
======================================== ================================================
|
:ref:`abs<amdgpu_synid_abs>` modifier.
|
||||||
|
-<operand> The same as above (an SP3 syntax).
|
||||||
|
================== ====================================================
|
||||||
|
|
||||||
|
Note: SP3 syntax is supported with limitations because of a potential ambiguity.
|
||||||
|
Currently it is allowed in the following cases:
|
||||||
|
|
||||||
|
* Before a register.
|
||||||
|
* Before an :ref:`abs<amdgpu_synid_abs>` modifier.
|
||||||
|
* Before an SP3 :ref:`abs<amdgpu_synid_abs>` modifier.
|
||||||
|
|
||||||
|
In all other cases "-" is handled as a part of an expression that follows the sign.
|
||||||
|
|
||||||
Examples:
|
Examples:
|
||||||
|
|
||||||
.. parsed-literal::
|
.. parsed-literal::
|
||||||
|
|
||||||
|
// Operands with negate modifiers
|
||||||
neg(v[0])
|
neg(v[0])
|
||||||
-v4
|
neg(1.0)
|
||||||
|
neg(abs(v0))
|
||||||
|
-v5
|
||||||
|
-abs(v5)
|
||||||
|
-\|v5|
|
||||||
|
|
||||||
|
// Operands without negate modifiers
|
||||||
|
-1
|
||||||
|
-x+y
|
||||||
|
|
||||||
VOP3P Modifiers
|
VOP3P Modifiers
|
||||||
---------------
|
---------------
|
||||||
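Reviewer note on the *neg* hunk above: the new text says the SP3 ``-`` syntax is treated as a negate modifier only before a register, before an ``abs(...)`` modifier, or before an SP3 ``|...|`` modifier, and is otherwise part of the following expression. The rule can be sketched in Python as below. This is an illustrative sketch only, not the assembler's implementation: the function name and the register pattern are hypothetical, only ``v``/``s``/``ttmp`` registers are recognized, and whitespace after the sign is not handled.

```python
import re

# Hypothetical, simplified register pattern: v/s/ttmp followed by a decimal
# index or an opening bracket. The real assembler recognizes more registers.
_REGISTER = re.compile(r"(?:v|s|ttmp)(?:\d+\b|\[)")


def leading_minus_is_neg_modifier(operand: str) -> bool:
    """Return True if a leading '-' would act as a neg modifier.

    Follows the three cases listed in the documentation hunk:
    before a register, before abs(...), before SP3 |...|.
    """
    text = operand.strip()
    if not text.startswith("-"):
        return False  # neg(...) and plain operands are not covered here
    rest = text[1:]
    if rest.startswith("abs("):
        return True   # before an abs modifier
    if rest.startswith("|"):
        return True   # before an SP3 abs modifier
    # Before a register; anything else is part of an expression.
    return _REGISTER.match(rest) is not None


if __name__ == "__main__":
    for op in ["-v5", "-abs(v5)", "-|v5|", "-1", "-x+y"]:
        print(op, leading_minus_is_neg_modifier(op))
```

The five operands in the loop are the ones from the documentation examples: the first three fall under "operands with negate modifiers", the last two under "operands without negate modifiers".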
@ -1304,6 +1364,10 @@ The value 0 selects the low bits, while 1 selects the high bits.
|
|||||||
op_sel:[{0..1},{0..1},{0..1}] Select operand bits for instructions with 3 source operands.
|
op_sel:[{0..1},{0..1},{0..1}] Select operand bits for instructions with 3 source operands.
|
||||||
================================= =============================================================
|
================================= =============================================================
|
||||||
|
|
||||||
|
Note: numeric values may be specified as either
|
||||||
|
:ref:`integer numbers<amdgpu_synid_integer_number>` or
|
||||||
|
:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
|
||||||
|
|
||||||
Examples:
|
Examples:
|
||||||
|
|
||||||
.. parsed-literal::
|
.. parsed-literal::
|
||||||
@ -1333,6 +1397,10 @@ The value 0 selects the low bits, while 1 selects the high bits.
|
|||||||
op_sel_hi:[{0..1},{0..1},{0..1}] Select operand bits for instructions with 3 source operands.
|
op_sel_hi:[{0..1},{0..1},{0..1}] Select operand bits for instructions with 3 source operands.
|
||||||
=================================== =============================================================
|
=================================== =============================================================
|
||||||
|
|
||||||
|
Note: numeric values may be specified as either
|
||||||
|
:ref:`integer numbers<amdgpu_synid_integer_number>` or
|
||||||
|
:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
|
||||||
|
|
||||||
Examples:
|
Examples:
|
||||||
|
|
||||||
.. parsed-literal::
|
.. parsed-literal::
|
||||||
@ -1367,6 +1435,10 @@ This modifier is valid for floating point operands only.
|
|||||||
neg_lo:[{0..1},{0..1},{0..1}] Select affected operands for instructions with 3 source operands.
|
neg_lo:[{0..1},{0..1},{0..1}] Select affected operands for instructions with 3 source operands.
|
||||||
================================ ==================================================================
|
================================ ==================================================================
|
||||||
|
|
||||||
|
Note: numeric values may be specified as either
|
||||||
|
:ref:`integer numbers<amdgpu_synid_integer_number>` or
|
||||||
|
:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
|
||||||
|
|
||||||
Examples:
|
Examples:
|
||||||
|
|
||||||
.. parsed-literal::
|
.. parsed-literal::
|
||||||
@ -1401,6 +1473,10 @@ This modifier is valid for floating point operands only.
|
|||||||
neg_hi:[{0..1},{0..1},{0..1}] Select affected operands for instructions with 3 source operands.
|
neg_hi:[{0..1},{0..1},{0..1}] Select affected operands for instructions with 3 source operands.
|
||||||
=============================== ==================================================================
|
=============================== ==================================================================
|
||||||
|
|
||||||
|
Note: numeric values may be specified as either
|
||||||
|
:ref:`integer numbers<amdgpu_synid_integer_number>` or
|
||||||
|
:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
|
||||||
|
|
||||||
Examples:
|
Examples:
|
||||||
|
|
||||||
.. parsed-literal::
|
.. parsed-literal::
|
||||||
@ -1419,7 +1495,7 @@ VOP3P V_MAD_MIX Modifiers
|
|||||||
-------------------------
|
-------------------------
|
||||||
|
|
||||||
*v_mad_mix_f32*, *v_mad_mixhi_f16* and *v_mad_mixlo_f16* instructions
|
*v_mad_mix_f32*, *v_mad_mixhi_f16* and *v_mad_mixlo_f16* instructions
|
||||||
use *op_sel* and *op_sel_hi* modifiers
|
use *op_sel* and *op_sel_hi* modifiers
|
||||||
in a manner different from *regular* VOP3P instructions.
|
in a manner different from *regular* VOP3P instructions.
|
||||||
|
|
||||||
See a description below.
|
See a description below.
|
||||||
@ -1449,6 +1525,10 @@ By default, low bits are used for all operands.
|
|||||||
op_sel:[{0..1},{0..1},{0..1}] Select location of each 16-bit source operand.
|
op_sel:[{0..1},{0..1},{0..1}] Select location of each 16-bit source operand.
|
||||||
=============================== ================================================
|
=============================== ================================================
|
||||||
|
|
||||||
|
Note: numeric values may be specified as either
|
||||||
|
:ref:`integer numbers<amdgpu_synid_integer_number>` or
|
||||||
|
:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
|
||||||
|
|
||||||
Examples:
|
Examples:
|
||||||
|
|
||||||
.. parsed-literal::
|
.. parsed-literal::
|
||||||
@ -1477,6 +1557,10 @@ The location of 16 bits in the operand may be specified by
|
|||||||
op_sel_hi:[{0..1},{0..1},{0..1}] Select size of each source operand.
|
op_sel_hi:[{0..1},{0..1},{0..1}] Select size of each source operand.
|
||||||
======================================== ====================================
|
======================================== ====================================
|
||||||
|
|
||||||
|
Note: numeric values may be specified as either
|
||||||
|
:ref:`integer numbers<amdgpu_synid_integer_number>` or
|
||||||
|
:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
|
||||||
|
|
||||||
Examples:
|
Examples:
|
||||||
|
|
||||||
.. parsed-literal::
|
.. parsed-literal::
|
||||||
|
@ -38,7 +38,8 @@ Assembler currently supports sequences of 1, 2, 3, 4, 8 and 16 *vector* register
|
|||||||
=================================================== ====================================================================
|
=================================================== ====================================================================
|
||||||
**v**\<N> A single 32-bit *vector* register.
|
**v**\<N> A single 32-bit *vector* register.
|
||||||
|
|
||||||
*N* must be a decimal integer number.
|
*N* must be a decimal
|
||||||
|
:ref:`integer number<amdgpu_synid_integer_number>`.
|
||||||
**v[**\ <N>\ **]** A single 32-bit *vector* register.
|
**v[**\ <N>\ **]** A single 32-bit *vector* register.
|
||||||
|
|
||||||
*N* may be specified as an
|
*N* may be specified as an
|
||||||
@ -51,10 +52,11 @@ Assembler currently supports sequences of 1, 2, 3, 4, 8 and 16 *vector* register
|
|||||||
or :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
|
or :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
|
||||||
**[v**\ <N>, \ **v**\ <N+1>, ... **v**\ <K>\ **]** A sequence of (\ *K-N+1*\ ) *vector* registers.
|
**[v**\ <N>, \ **v**\ <N+1>, ... **v**\ <K>\ **]** A sequence of (\ *K-N+1*\ ) *vector* registers.
|
||||||
|
|
||||||
Register indices must be specified as decimal integer numbers.
|
Register indices must be specified as decimal
|
||||||
|
:ref:`integer numbers<amdgpu_synid_integer_number>`.
|
||||||
=================================================== ====================================================================
|
=================================================== ====================================================================
|
||||||
|
|
||||||
Note. *N* and *K* must satisfy the following conditions:
|
Note: *N* and *K* must satisfy the following conditions:
|
||||||
|
|
||||||
* *N* <= *K*.
|
* *N* <= *K*.
|
||||||
* 0 <= *N* <= 255.
|
* 0 <= *N* <= 255.
|
||||||
@ -77,26 +79,27 @@ Examples:
|
|||||||
|
|
||||||
.. _amdgpu_synid_nsa:
|
.. _amdgpu_synid_nsa:
|
||||||
|
|
||||||
*Image* instructions may use special *NSA* (Non-Sequential Address) syntax for *image addresses*:
|
GFX10 *Image* instructions may use special *NSA* (Non-Sequential Address) syntax for *image addresses*:
|
||||||
|
|
||||||
=================================================== ====================================================================
|
===================================== =================================================
|
||||||
Syntax Description
|
Syntax Description
|
||||||
=================================================== ====================================================================
|
===================================== =================================================
|
||||||
**[v**\ <A>, \ **v**\ <B>, ... **v**\ <X>\ **]** A sequence of *vector* registers. At least one register
|
**[Vm**, \ **Vn**, ... **Vk**\ **]** A sequence of 32-bit *vector* registers.
|
||||||
must be specified.
|
Each register may be specified using a syntax
|
||||||
|
defined :ref:`above<amdgpu_synid_v>`.
|
||||||
|
|
||||||
In contrast with standard syntax described above, registers in
|
In contrast with standard syntax, registers
|
||||||
this sequence are not required to have consecutive indices.
|
in *NSA* sequence are not required to have
|
||||||
Moreover, the same register may appear in the list more than once.
|
consecutive indices. Moreover, the same register
|
||||||
=================================================== ====================================================================
|
may appear in the list more than once.
|
||||||
|
===================================== =================================================
|
||||||
Note. Reqister indices must be in the range 0..255. They must be specified as decimal integer numbers.
|
|
||||||
|
|
||||||
Examples:
|
Examples:
|
||||||
|
|
||||||
.. parsed-literal::
|
.. parsed-literal::
|
||||||
|
|
||||||
[v32,v1,v2]
|
[v32,v1,v[2]]
|
||||||
|
[v[32],v[1:1],[v2]]
|
||||||
[v4,v4,v4,v4]
|
[v4,v4,v4,v4]
|
||||||
|
|
||||||
.. _amdgpu_synid_s:
|
.. _amdgpu_synid_s:
|
||||||
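Reviewer note on the *NSA* hunk above: the new text allows each element of an NSA list to use any of the single-register spellings, so ``[v32,v1,v[2]]`` and ``[v[32],v[1:1],[v2]]`` should name the same registers. A minimal Python sketch of that reading follows. It rests on assumptions: the function names are hypothetical, indices are plain decimal integers (the assembler also accepts absolute expressions inside brackets), and the 0..255 bound is taken from the register index limits stated elsewhere in the document.

```python
import re

_SINGLE = re.compile(r"v(\d+)$")           # v<N>
_INDEXED = re.compile(r"v\[(\d+)\]$")      # v[<N>]
_LIST1 = re.compile(r"\[v(\d+)\]$")        # [v<N>]
_RANGE = re.compile(r"v\[(\d+):(\d+)\]$")  # v[<N>:<K>], only with N == K


def parse_single_vgpr(text: str) -> int:
    """Parse one 32-bit vector register and return its index."""
    text = text.strip()
    index = None
    for pattern in (_SINGLE, _INDEXED, _LIST1):
        match = pattern.match(text)
        if match:
            index = int(match.group(1))
            break
    if index is None:
        match = _RANGE.match(text)
        if not match or int(match.group(1)) != int(match.group(2)):
            raise ValueError(f"not a single 32-bit vector register: {text!r}")
        index = int(match.group(1))
    if not 0 <= index <= 255:
        raise ValueError(f"register index out of range: {index}")
    return index


def parse_nsa(text: str) -> list:
    """Parse an NSA list into register indices, keeping order and repeats."""
    text = text.strip()
    if not (text.startswith("[") and text.endswith("]")):
        raise ValueError("an NSA list must be enclosed in brackets")
    parts, depth, current = [], 0, ""
    for ch in text[1:-1]:
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append(current)  # split only on top-level commas
            current = ""
        else:
            current += ch
    parts.append(current)
    return [parse_single_vgpr(part) for part in parts]


if __name__ == "__main__":
    for nsa in ["[v32,v1,v[2]]", "[v[32],v[1:1],[v2]]", "[v4,v4,v4,v4]"]:
        print(nsa, parse_nsa(nsa))
```

Unlike the standard sequence syntax, the sketch does not check that indices are consecutive and accepts repeated registers, which matches the description in the hunk.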
@ -126,7 +129,9 @@ Sequences of 4 and more *scalar* registers must be quad-aligned.
|
|||||||
======================================================== ====================================================================
|
======================================================== ====================================================================
|
||||||
**s**\ <N> A single 32-bit *scalar* register.
|
**s**\ <N> A single 32-bit *scalar* register.
|
||||||
|
|
||||||
*N* must be a decimal integer number.
|
*N* must be a decimal
|
||||||
|
:ref:`integer number<amdgpu_synid_integer_number>`.
|
||||||
|
|
||||||
**s[**\ <N>\ **]** A single 32-bit *scalar* register.
|
**s[**\ <N>\ **]** A single 32-bit *scalar* register.
|
||||||
|
|
||||||
*N* may be specified as an
|
*N* may be specified as an
|
||||||
@ -137,12 +142,14 @@ Sequences of 4 and more *scalar* registers must be quad-aligned.
|
|||||||
*N* and *K* may be specified as
|
*N* and *K* may be specified as
|
||||||
:ref:`integer numbers<amdgpu_synid_integer_number>`
|
:ref:`integer numbers<amdgpu_synid_integer_number>`
|
||||||
or :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
|
or :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
|
||||||
|
|
||||||
**[s**\ <N>, \ **s**\ <N+1>, ... **s**\ <K>\ **]** A sequence of (\ *K-N+1*\ ) *scalar* registers.
|
**[s**\ <N>, \ **s**\ <N+1>, ... **s**\ <K>\ **]** A sequence of (\ *K-N+1*\ ) *scalar* registers.
|
||||||
|
|
||||||
Register indices must be specified as decimal integer numbers.
|
Register indices must be specified as decimal
|
||||||
|
:ref:`integer numbers<amdgpu_synid_integer_number>`.
|
||||||
======================================================== ====================================================================
|
======================================================== ====================================================================
|
||||||
|
|
||||||
Note. *N* and *K* must satisfy the following conditions:
|
Note: *N* and *K* must satisfy the following conditions:
|
||||||
|
|
||||||
* *N* must be properly aligned based on sequence size.
|
* *N* must be properly aligned based on sequence size.
|
||||||
* *N* <= *K*.
|
* *N* <= *K*.
|
||||||
@ -210,7 +217,8 @@ Sequences of 4 and more *ttmp* registers must be quad-aligned.
|
|||||||
============================================================= ====================================================================
|
============================================================= ====================================================================
|
||||||
**ttmp**\ <N> A single 32-bit *ttmp* register.
|
**ttmp**\ <N> A single 32-bit *ttmp* register.
|
||||||
|
|
||||||
*N* must be a decimal integer number.
|
*N* must be a decimal
|
||||||
|
:ref:`integer number<amdgpu_synid_integer_number>`.
|
||||||
**ttmp[**\ <N>\ **]** A single 32-bit *ttmp* register.
|
**ttmp[**\ <N>\ **]** A single 32-bit *ttmp* register.
|
||||||
|
|
||||||
*N* may be specified as an
|
*N* may be specified as an
|
||||||
@ -223,10 +231,11 @@ Sequences of 4 and more *ttmp* registers must be quad-aligned.
|
|||||||
or :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
|
or :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
|
||||||
**[ttmp**\ <N>, \ **ttmp**\ <N+1>, ... **ttmp**\ <K>\ **]** A sequence of (\ *K-N+1*\ ) *ttmp* registers.
|
**[ttmp**\ <N>, \ **ttmp**\ <N+1>, ... **ttmp**\ <K>\ **]** A sequence of (\ *K-N+1*\ ) *ttmp* registers.
|
||||||
|
|
||||||
Register indices must be specified as decimal integer numbers.
|
Register indices must be specified as decimal
|
||||||
|
:ref:`integer numbers<amdgpu_synid_integer_number>`.
|
||||||
============================================================= ====================================================================
|
============================================================= ====================================================================
|
||||||
|
|
||||||
Note. *N* and *K* must satisfy the following conditions:
|
Note: *N* and *K* must satisfy the following conditions:
|
||||||
|
|
||||||
* *N* must be properly aligned based on sequence size.
|
* *N* must be properly aligned based on sequence size.
|
||||||
* *N* <= *K*.
|
* *N* <= *K*.
|
||||||
@ -266,8 +275,8 @@ Trap base address, 64-bits wide. Holds the pointer to the current trap handler p
|
|||||||
Syntax Description Availability
|
Syntax Description Availability
|
||||||
================== ======================================================================= =============
|
================== ======================================================================= =============
|
||||||
tba 64-bit *trap base address* register. GFX7, GFX8
|
tba 64-bit *trap base address* register. GFX7, GFX8
|
||||||
[tba] 64-bit *trap base address* register (an alternative syntax). GFX7, GFX8
|
[tba] 64-bit *trap base address* register (an SP3 syntax). GFX7, GFX8
|
||||||
[tba_lo,tba_hi] 64-bit *trap base address* register (an alternative syntax). GFX7, GFX8
|
[tba_lo,tba_hi] 64-bit *trap base address* register (an SP3 syntax). GFX7, GFX8
|
||||||
================== ======================================================================= =============
|
================== ======================================================================= =============
|
||||||
|
|
||||||
High and low 32 bits of *trap base address* may be accessed as separate registers:
|
High and low 32 bits of *trap base address* may be accessed as separate registers:
|
||||||
@ -277,8 +286,8 @@ High and low 32 bits of *trap base address* may be accessed as separate register
|
|||||||
================== ======================================================================= =============
|
================== ======================================================================= =============
|
||||||
tba_lo Low 32 bits of *trap base address* register. GFX7, GFX8
|
tba_lo Low 32 bits of *trap base address* register. GFX7, GFX8
|
||||||
tba_hi High 32 bits of *trap base address* register. GFX7, GFX8
|
tba_hi High 32 bits of *trap base address* register. GFX7, GFX8
|
||||||
[tba_lo] Low 32 bits of *trap base address* register (an alternative syntax). GFX7, GFX8
|
[tba_lo] Low 32 bits of *trap base address* register (an SP3 syntax). GFX7, GFX8
|
||||||
[tba_hi] High 32 bits of *trap base address* register (an alternative syntax). GFX7, GFX8
|
[tba_hi] High 32 bits of *trap base address* register (an SP3 syntax). GFX7, GFX8
|
||||||
================== ======================================================================= =============
|
================== ======================================================================= =============
|
||||||
|
|
||||||
Note that *tba*, *tba_lo* and *tba_hi* are not accessible as assembler registers in GFX9 and GFX10,
|
Note that *tba*, *tba_lo* and *tba_hi* are not accessible as assembler registers in GFX9 and GFX10,
|
||||||
@ -295,8 +304,8 @@ Trap memory address, 64-bits wide.
|
|||||||
Syntax Description Availability
|
Syntax Description Availability
|
||||||
================= ======================================================================= ==================
|
================= ======================================================================= ==================
|
||||||
tma 64-bit *trap memory address* register. GFX7, GFX8
|
tma 64-bit *trap memory address* register. GFX7, GFX8
|
||||||
[tma] 64-bit *trap memory address* register (an alternative syntax). GFX7, GFX8
|
[tma] 64-bit *trap memory address* register (an SP3 syntax). GFX7, GFX8
|
||||||
[tma_lo,tma_hi] 64-bit *trap memory address* register (an alternative syntax). GFX7, GFX8
|
[tma_lo,tma_hi] 64-bit *trap memory address* register (an SP3 syntax). GFX7, GFX8
|
||||||
================= ======================================================================= ==================
|
================= ======================================================================= ==================
|
||||||
|
|
||||||
High and low 32 bits of *trap memory address* may be accessed as separate registers:
|
High and low 32 bits of *trap memory address* may be accessed as separate registers:
|
||||||
@ -306,8 +315,8 @@ High and low 32 bits of *trap memory address* may be accessed as separate regist
|
|||||||
================= ======================================================================= ==================
|
================= ======================================================================= ==================
|
||||||
tma_lo Low 32 bits of *trap memory address* register. GFX7, GFX8
|
tma_lo Low 32 bits of *trap memory address* register. GFX7, GFX8
|
||||||
tma_hi High 32 bits of *trap memory address* register. GFX7, GFX8
|
tma_hi High 32 bits of *trap memory address* register. GFX7, GFX8
|
||||||
[tma_lo] Low 32 bits of *trap memory address* register (an alternative syntax). GFX7, GFX8
|
[tma_lo] Low 32 bits of *trap memory address* register (an SP3 syntax). GFX7, GFX8
|
||||||
[tma_hi] High 32 bits of *trap memory address* register (an alternative syntax). GFX7, GFX8
|
[tma_hi] High 32 bits of *trap memory address* register (an SP3 syntax). GFX7, GFX8
|
||||||
================= ======================================================================= ==================
|
================= ======================================================================= ==================
|
||||||
|
|
||||||
Note that *tma*, *tma_lo* and *tma_hi* are not accessible as assembler registers in GFX9 and GFX10,
|
Note that *tma*, *tma_lo* and *tma_hi* are not accessible as assembler registers in GFX9 and GFX10,
|
||||||
@ -324,8 +333,8 @@ Flat scratch address, 64-bits wide. Holds the base address of scratch memory.
|
|||||||
Syntax Description
|
Syntax Description
|
||||||
================================== ================================================================
|
================================== ================================================================
|
||||||
flat_scratch 64-bit *flat scratch* address register.
|
flat_scratch 64-bit *flat scratch* address register.
|
||||||
[flat_scratch] 64-bit *flat scratch* address register (an alternative syntax).
|
[flat_scratch] 64-bit *flat scratch* address register (an SP3 syntax).
|
||||||
[flat_scratch_lo,flat_scratch_hi] 64-bit *flat scratch* address register (an alternative syntax).
|
[flat_scratch_lo,flat_scratch_hi] 64-bit *flat scratch* address register (an SP3 syntax).
|
||||||
================================== ================================================================
|
================================== ================================================================
|
||||||
|
|
||||||
High and low 32 bits of *flat scratch* address may be accessed as separate registers:
|
High and low 32 bits of *flat scratch* address may be accessed as separate registers:
|
||||||
@ -335,8 +344,8 @@ High and low 32 bits of *flat scratch* address may be accessed as separate regis
|
|||||||
========================= =========================================================================
|
========================= =========================================================================
|
||||||
flat_scratch_lo Low 32 bits of *flat scratch* address register.
|
flat_scratch_lo Low 32 bits of *flat scratch* address register.
|
||||||
flat_scratch_hi High 32 bits of *flat scratch* address register.
|
flat_scratch_hi High 32 bits of *flat scratch* address register.
|
||||||
[flat_scratch_lo] Low 32 bits of *flat scratch* address register (an alternative syntax).
|
[flat_scratch_lo] Low 32 bits of *flat scratch* address register (an SP3 syntax).
|
||||||
[flat_scratch_hi] High 32 bits of *flat scratch* address register (an alternative syntax).
|
[flat_scratch_hi] High 32 bits of *flat scratch* address register (an SP3 syntax).
|
||||||
========================= =========================================================================
|
========================= =========================================================================
|
||||||
|
|
||||||
.. _amdgpu_synid_xnack:
|
.. _amdgpu_synid_xnack:
|
||||||
@ -355,8 +364,8 @@ received an *XNACK* due to a vector memory operation.
|
|||||||
Syntax Description
|
Syntax Description
|
||||||
============================== =====================================================
|
============================== =====================================================
|
||||||
xnack_mask 64-bit *xnack mask* register.
|
xnack_mask 64-bit *xnack mask* register.
|
||||||
[xnack_mask] 64-bit *xnack mask* register (an alternative syntax).
|
[xnack_mask] 64-bit *xnack mask* register (an SP3 syntax).
|
||||||
[xnack_mask_lo,xnack_mask_hi] 64-bit *xnack mask* register (an alternative syntax).
|
[xnack_mask_lo,xnack_mask_hi] 64-bit *xnack mask* register (an SP3 syntax).
|
||||||
============================== =====================================================
|
============================== =====================================================
|
||||||
|
|
||||||
High and low 32 bits of *xnack mask* may be accessed as separate registers:
|
High and low 32 bits of *xnack mask* may be accessed as separate registers:
|
||||||
@ -366,8 +375,8 @@ High and low 32 bits of *xnack mask* may be accessed as separate registers:
|
|||||||
===================== ==============================================================
|
===================== ==============================================================
|
||||||
xnack_mask_lo Low 32 bits of *xnack mask* register.
|
xnack_mask_lo Low 32 bits of *xnack mask* register.
|
||||||
xnack_mask_hi High 32 bits of *xnack mask* register.
|
xnack_mask_hi High 32 bits of *xnack mask* register.
|
||||||
[xnack_mask_lo] Low 32 bits of *xnack mask* register (an alternative syntax).
|
[xnack_mask_lo] Low 32 bits of *xnack mask* register (an SP3 syntax).
|
||||||
[xnack_mask_hi] High 32 bits of *xnack mask* register (an alternative syntax).
|
[xnack_mask_hi] High 32 bits of *xnack mask* register (an SP3 syntax).
|
||||||
===================== ==============================================================
|
===================== ==============================================================
|
||||||
|
|
||||||
.. _amdgpu_synid_vcc:
|
.. _amdgpu_synid_vcc:
|
||||||
@ -385,8 +394,8 @@ Note that GFX10 H/W does not use high 32 bits of *vcc* in *wave32* mode.
|
|||||||
Syntax Description
|
Syntax Description
|
||||||
================ =========================================================================
|
================ =========================================================================
|
||||||
vcc 64-bit *vector condition code* register.
|
vcc 64-bit *vector condition code* register.
|
||||||
[vcc] 64-bit *vector condition code* register (an alternative syntax).
|
[vcc] 64-bit *vector condition code* register (an SP3 syntax).
|
||||||
[vcc_lo,vcc_hi] 64-bit *vector condition code* register (an alternative syntax).
|
[vcc_lo,vcc_hi] 64-bit *vector condition code* register (an SP3 syntax).
|
||||||
================ =========================================================================
|
================ =========================================================================
|
||||||
|
|
||||||
High and low 32 bits of *vector condition code* may be accessed as separate registers:
|
High and low 32 bits of *vector condition code* may be accessed as separate registers:
|
||||||
@ -396,8 +405,8 @@ High and low 32 bits of *vector condition code* may be accessed as separate regi
|
|||||||
================ =========================================================================
|
================ =========================================================================
|
||||||
vcc_lo Low 32 bits of *vector condition code* register.
|
vcc_lo Low 32 bits of *vector condition code* register.
|
||||||
vcc_hi High 32 bits of *vector condition code* register.
|
vcc_hi High 32 bits of *vector condition code* register.
|
||||||
[vcc_lo] Low 32 bits of *vector condition code* register (an alternative syntax).
|
[vcc_lo] Low 32 bits of *vector condition code* register (an SP3 syntax).
|
||||||
[vcc_hi] High 32 bits of *vector condition code* register (an alternative syntax).
|
[vcc_hi] High 32 bits of *vector condition code* register (an SP3 syntax).
|
||||||
================ =========================================================================
|
================ =========================================================================
|
||||||
|
|
||||||
.. _amdgpu_synid_m0:
|
.. _amdgpu_synid_m0:
|
||||||
@ -412,7 +421,7 @@ including register indexing and bounds checking.
|
|||||||
Syntax Description
|
Syntax Description
|
||||||
=========== ===================================================
|
=========== ===================================================
|
||||||
m0 A 32-bit *memory* register.
|
m0 A 32-bit *memory* register.
|
||||||
[m0] A 32-bit *memory* register (an alternative syntax).
|
[m0] A 32-bit *memory* register (an SP3 syntax).
|
||||||
=========== ===================================================
|
=========== ===================================================
|
||||||
|
|
||||||
.. _amdgpu_synid_exec:
|
.. _amdgpu_synid_exec:
|
||||||
@ -430,8 +439,8 @@ Note that GFX10 H/W does not use high 32 bits of *exec* in *wave32* mode.
|
|||||||
Syntax Description
|
Syntax Description
|
||||||
===================== =================================================================
|
===================== =================================================================
|
||||||
exec 64-bit *execute mask* register.
|
exec 64-bit *execute mask* register.
|
||||||
[exec] 64-bit *execute mask* register (an alternative syntax).
|
[exec] 64-bit *execute mask* register (an SP3 syntax).
|
||||||
[exec_lo,exec_hi] 64-bit *execute mask* register (an alternative syntax).
|
[exec_lo,exec_hi] 64-bit *execute mask* register (an SP3 syntax).
|
||||||
===================== =================================================================
|
===================== =================================================================
|
||||||
|
|
||||||
High and low 32 bits of *execute mask* may be accessed as separate registers:
|
High and low 32 bits of *execute mask* may be accessed as separate registers:
|
||||||
@ -441,8 +450,8 @@ High and low 32 bits of *execute mask* may be accessed as separate registers:
|
|||||||
===================== =================================================================
|
===================== =================================================================
|
||||||
exec_lo Low 32 bits of *execute mask* register.
|
exec_lo Low 32 bits of *execute mask* register.
|
||||||
exec_hi High 32 bits of *execute mask* register.
|
exec_hi High 32 bits of *execute mask* register.
|
||||||
[exec_lo] Low 32 bits of *execute mask* register (an alternative syntax).
|
[exec_lo] Low 32 bits of *execute mask* register (an SP3 syntax).
|
||||||
[exec_hi] High 32 bits of *execute mask* register (an alternative syntax).
|
[exec_hi] High 32 bits of *execute mask* register (an SP3 syntax).
|
||||||
===================== =================================================================
|
===================== =================================================================
|
||||||
|
|
||||||
.. _amdgpu_synid_vccz:
|
.. _amdgpu_synid_vccz:
|
||||||
@ -452,7 +461,7 @@ vccz
|
|||||||
|
|
||||||
A single bit flag indicating that the :ref:`vcc<amdgpu_synid_vcc>` is all zeros.
|
A single bit flag indicating that the :ref:`vcc<amdgpu_synid_vcc>` is all zeros.
|
||||||
|
|
||||||
Note. When GFX10 operates in *wave32* mode, this register reflects state of :ref:`vcc_lo<amdgpu_synid_vcc_lo>`.
|
Note: when GFX10 operates in *wave32* mode, this register reflects state of :ref:`vcc_lo<amdgpu_synid_vcc_lo>`.
|
||||||
|
|
||||||
.. _amdgpu_synid_execz:
|
.. _amdgpu_synid_execz:
|
||||||
|
|
||||||
@ -461,7 +470,7 @@ execz
|
|||||||
|
|
||||||
A single bit flag indicating that the :ref:`exec<amdgpu_synid_exec>` is all zeros.
|
A single bit flag indicating that the :ref:`exec<amdgpu_synid_exec>` is all zeros.
|
||||||
|
|
||||||
Note. When GFX10 operates in *wave32* mode, this register reflects state of :ref:`exec_lo<amdgpu_synid_exec>`.
|
Note: when GFX10 operates in *wave32* mode, this register reflects state of :ref:`exec_lo<amdgpu_synid_exec>`.
|
||||||
|
|
||||||
.. _amdgpu_synid_scc:
|
.. _amdgpu_synid_scc:
|
||||||
|
|
||||||
@ -495,19 +504,20 @@ GFX10 only.
|
|||||||
|
|
||||||
.. _amdgpu_synid_constant:
|
.. _amdgpu_synid_constant:
|
||||||
|
|
||||||
constant
|
inline constant
|
||||||
--------
|
---------------
|
||||||
|
|
||||||
A set of integer and floating-point *inline* constants and values:
|
An *inline constant* is an integer or a floating-point value encoded as a part of an instruction.
|
||||||
|
Compare *inline constants* with :ref:`literals<amdgpu_synid_literal>`.
|
||||||
|
|
||||||
|
Inline constants include:
|
||||||
|
|
||||||
* :ref:`iconst<amdgpu_synid_iconst>`
|
* :ref:`iconst<amdgpu_synid_iconst>`
|
||||||
* :ref:`fconst<amdgpu_synid_fconst>`
|
* :ref:`fconst<amdgpu_synid_fconst>`
|
||||||
* :ref:`ival<amdgpu_synid_ival>`
|
* :ref:`ival<amdgpu_synid_ival>`
|
||||||
|
|
||||||
In contrast with :ref:`literals<amdgpu_synid_literal>`, these operands are encoded as a part of instruction.
|
|
||||||
|
|
||||||
If a number may be encoded as either
|
If a number may be encoded as either
|
||||||
a :ref:`literal<amdgpu_synid_literal>` or
|
a :ref:`literal<amdgpu_synid_literal>` or
|
||||||
a :ref:`constant<amdgpu_synid_constant>`,
|
a :ref:`constant<amdgpu_synid_constant>`,
|
||||||
assembler selects the latter encoding as more efficient.
|
assembler selects the latter encoding as more efficient.
|
||||||
|
|
||||||
@ -516,17 +526,14 @@ assembler selects the latter encoding as more efficient.
|
|||||||
iconst
|
iconst
|
||||||
~~~~~~
|
~~~~~~
|
||||||
|
|
||||||
An :ref:`integer number<amdgpu_synid_integer_number>`
|
An :ref:`integer number<amdgpu_synid_integer_number>` or
|
||||||
|
an :ref:`absolute expression<amdgpu_synid_absolute_expression>`
|
||||||
encoded as an *inline constant*.
|
encoded as an *inline constant*.
|
||||||
|
|
||||||
Only a small fraction of integer numbers may be encoded as *inline constants*.
|
Only a small fraction of integer numbers may be encoded as *inline constants*.
|
||||||
They are enumerated in the table below.
|
They are enumerated in the table below.
|
||||||
Other integer numbers have to be encoded as :ref:`literals<amdgpu_synid_literal>`.
|
Other integer numbers have to be encoded as :ref:`literals<amdgpu_synid_literal>`.
|
||||||
|
|
||||||
Integer *inline constants* are converted to
|
|
||||||
:ref:`expected operand type<amdgpu_syn_instruction_type>`
|
|
||||||
as described :ref:`here<amdgpu_synid_int_const_conv>`.
|
|
||||||
|
|
||||||
================================== ====================================
|
================================== ====================================
|
||||||
Value Note
|
Value Note
|
||||||
================================== ====================================
|
================================== ====================================
|
||||||
@ -548,10 +555,6 @@ Only a small fraction of floating-point numbers may be encoded as *inline consta
|
|||||||
They are enumerated in the table below.
|
They are enumerated in the table below.
|
||||||
Other floating-point numbers have to be encoded as :ref:`literals<amdgpu_synid_literal>`.
|
Other floating-point numbers have to be encoded as :ref:`literals<amdgpu_synid_literal>`.
|
||||||
|
|
||||||
Floating-point *inline constants* are converted to
|
|
||||||
:ref:`expected operand type<amdgpu_syn_instruction_type>`
|
|
||||||
as described :ref:`here<amdgpu_synid_fp_const_conv>`.
|
|
||||||
|
|
||||||
===================== ===================================================== ==================
|
===================== ===================================================== ==================
|
||||||
Value Note Availability
|
Value Note Availability
|
||||||
===================== ===================================================== ==================
|
===================== ===================================================== ==================
|
||||||
@ -594,21 +597,18 @@ These operands provide read-only access to H/W registers.
|
|||||||
literal
|
literal
|
||||||
-------
|
-------
|
||||||
|
|
||||||
A literal is a 64-bit value which is encoded as a separate 32-bit dword in the instruction stream.
|
A *literal* is a 64-bit value encoded as a separate 32-bit dword in the instruction stream.
|
||||||
|
Compare *literals* with :ref:`inline constants<amdgpu_synid_constant>`.
|
||||||
|
|
||||||
If a number may be encoded as either
|
If a number may be encoded as either
|
||||||
a :ref:`literal<amdgpu_synid_literal>` or
|
a :ref:`literal<amdgpu_synid_literal>` or
|
||||||
an :ref:`inline constant<amdgpu_synid_constant>`,
|
an :ref:`inline constant<amdgpu_synid_constant>`,
|
||||||
assembler selects the latter encoding as more efficient.
|
assembler selects the latter encoding as more efficient.
|
||||||
|
|
||||||
Literals may be specified as :ref:`integer numbers<amdgpu_synid_integer_number>`,
|
Literals may be specified as :ref:`integer numbers<amdgpu_synid_integer_number>`,
|
||||||
:ref:`floating-point numbers<amdgpu_synid_floating-point_number>` or
|
:ref:`floating-point numbers<amdgpu_synid_floating-point_number>`,
|
||||||
:ref:`expressions<amdgpu_synid_expression>`
|
:ref:`absolute expressions<amdgpu_synid_absolute_expression>` or
|
||||||
(expressions are currently supported for 32-bit operands only).
|
:ref:`relocatable expressions<amdgpu_synid_relocatable_expression>`.
|
||||||
|
|
||||||
A 64-bit literal value is converted by assembler
|
|
||||||
to an :ref:`expected operand type<amdgpu_syn_instruction_type>`
|
|
||||||
as described :ref:`here<amdgpu_synid_lit_conv>`.
|
|
||||||
|
|
||||||
An instruction may use only one literal but several operands may refer the same literal.
|
An instruction may use only one literal but several operands may refer the same literal.
|
||||||
|
|
||||||
@ -617,30 +617,38 @@ An instruction may use only one literal but several operands may refer the same
|
|||||||
uimm8
|
uimm8
|
||||||
-----
|
-----
|
||||||
|
|
||||||
A 8-bit positive :ref:`integer number<amdgpu_synid_integer_number>`.
|
A 8-bit :ref:`integer number<amdgpu_synid_integer_number>`
|
||||||
The value is encoded as part of the opcode so it is free to use.
|
or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
|
||||||
|
The value must be in the range 0..0xFF.
|
||||||
|
|
||||||
.. _amdgpu_synid_uimm32:
|
.. _amdgpu_synid_uimm32:
|
||||||
|
|
||||||
uimm32
|
uimm32
|
||||||
------
|
------
|
||||||
|
|
||||||
A 32-bit positive :ref:`integer number<amdgpu_synid_integer_number>`.
|
A 32-bit :ref:`integer number<amdgpu_synid_integer_number>`
|
||||||
The value is stored as a separate 32-bit dword in the instruction stream.
|
or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
|
||||||
|
The value must be in the range 0..0xFFFFFFFF.
|
||||||
|
|
||||||
.. _amdgpu_synid_uimm20:
|
.. _amdgpu_synid_uimm20:
|
||||||
|
|
||||||
uimm20
|
uimm20
|
||||||
------
|
------
|
||||||
|
|
||||||
A 20-bit positive :ref:`integer number<amdgpu_synid_integer_number>`.
|
A 20-bit :ref:`integer number<amdgpu_synid_integer_number>`
|
||||||
|
or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
|
||||||
|
|
||||||
|
The value must be in the range 0..0xFFFFF.
|
||||||
|
|
||||||
.. _amdgpu_synid_uimm21:
|
.. _amdgpu_synid_uimm21:
|
||||||
|
|
||||||
uimm21
|
uimm21
|
||||||
------
|
------
|
||||||
|
|
||||||
A 21-bit positive :ref:`integer number<amdgpu_synid_integer_number>`.
|
A 21-bit :ref:`integer number<amdgpu_synid_integer_number>`
|
||||||
|
or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
|
||||||
|
|
||||||
|
The value must be in the range 0..0x1FFFFF.
|
||||||
|
|
||||||
.. WARNING:: Assembler currently supports 20-bit offsets only. Use :ref:`uimm20<amdgpu_synid_uimm20>` as a replacement.
|
.. WARNING:: Assembler currently supports 20-bit offsets only. Use :ref:`uimm20<amdgpu_synid_uimm20>` as a replacement.
|
||||||
|
|
||||||
@ -649,7 +657,10 @@ A 21-bit positive :ref:`integer number<amdgpu_synid_integer_number>`.
|
|||||||
simm21
|
simm21
|
||||||
------
|
------
|
||||||
|
|
||||||
A 21-bit :ref:`integer number<amdgpu_synid_integer_number>`.
|
A 21-bit :ref:`integer number<amdgpu_synid_integer_number>`
|
||||||
|
or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
|
||||||
|
|
||||||
|
The value must be in the range -0x100000..0x0FFFFF.
|
||||||
|
|
||||||
.. WARNING:: Assembler currently supports 20-bit unsigned offsets only. Use :ref:`uimm20<amdgpu_synid_uimm20>` as a replacement.
|
.. WARNING:: Assembler currently supports 20-bit unsigned offsets only. Use :ref:`uimm20<amdgpu_synid_uimm20>` as a replacement.
|
||||||
|
|
||||||
@ -678,27 +689,20 @@ Integer Numbers
|
|||||||
---------------
|
---------------
|
||||||
|
|
||||||
Integer numbers are 64 bits wide.
|
Integer numbers are 64 bits wide.
|
||||||
They may be specified in binary, octal, hexadecimal and decimal formats:
|
They are converted to :ref:`expected operand type<amdgpu_syn_instruction_type>`
|
||||||
|
as described :ref:`here<amdgpu_synid_int_conv>`.
|
||||||
|
|
||||||
============== ====================================
|
Integer numbers may be specified in binary, octal, hexadecimal and decimal formats:
|
||||||
Format Syntax
|
|
||||||
============== ====================================
|
|
||||||
Decimal [-]?[1-9][0-9]*
|
|
||||||
Binary [-]?0b[01]+
|
|
||||||
Octal [-]?0[0-7]+
|
|
||||||
Hexadecimal [-]?0x[0-9a-fA-F]+
|
|
||||||
\ [-]?[0x]?[0-9][0-9a-fA-F]*[hH]
|
|
||||||
============== ====================================
|
|
||||||
|
|
||||||
Examples:
|
============ =============================== ========
|
||||||
|
Format Syntax Example
|
||||||
.. parsed-literal::
|
============ =============================== ========
|
||||||
|
Decimal [-]?[1-9][0-9]* -1234
|
||||||
-1234
|
Binary [-]?0b[01]+ 0b1010
|
||||||
0b1010
|
Octal [-]?0[0-7]+ 010
|
||||||
010
|
Hexadecimal [-]?0x[0-9a-fA-F]+ 0xff
|
||||||
0xff
|
\ [-]?[0x]?[0-9][0-9a-fA-F]*[hH] 0ffh
|
||||||
0ffh
|
============ =============================== ========
|
||||||
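The integer formats in the table above can be sketched as a small recognizer. This is only an illustration in Python, not the assembler's actual lexer: `parse_integer` is a hypothetical helper, the patterns are copied from the table, and the trailing-``h`` form drops the optional ``[0x]?`` prefix on the assumption that a leading ``0`` is already covered by ``[0-9]``.

```python
import re

# (pattern, base) pairs taken from the syntax table; order matters because
# "0b..." and "0x..." must be tried before the octal and decimal forms.
_PREFIXED_FORMATS = [
    (re.compile(r"-?0b[01]+"), 2),           # Binary
    (re.compile(r"-?0x[0-9a-fA-F]+"), 16),   # Hexadecimal with 0x prefix
    (re.compile(r"-?0[0-7]+"), 8),           # Octal
    (re.compile(r"-?[1-9][0-9]*"), 10),      # Decimal
]

# Hexadecimal with an h/H suffix (assumption: "[0x]?" prefix omitted).
_HEX_SUFFIX = re.compile(r"-?[0-9][0-9a-fA-F]*[hH]")


def parse_integer(text):
    """Return the value of an integer written in one of the table's formats."""
    for pattern, base in _PREFIXED_FORMATS:
        if pattern.fullmatch(text):
            return int(text, base)
    if _HEX_SUFFIX.fullmatch(text):
        # Drop the trailing h/H and read the rest as hexadecimal digits.
        return int(text[:-1], 16)
    raise ValueError("not an integer number: " + text)
```

Applied to the table's own examples, ``parse_integer("010")`` reads the text as octal and ``parse_integer("0ffh")`` reads it as hexadecimal.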
|
|
||||||
.. _amdgpu_synid_floating-point_number:
|
.. _amdgpu_synid_floating-point_number:
|
||||||
|
|
||||||
@ -706,31 +710,29 @@ Floating-Point Numbers
|
|||||||
----------------------
|
----------------------
|
||||||
|
|
||||||
All floating-point numbers are handled as double (64 bits wide).
|
All floating-point numbers are handled as double (64 bits wide).
|
||||||
|
They are converted to
|
||||||
|
:ref:`expected operand type<amdgpu_syn_instruction_type>`
|
||||||
|
as described :ref:`here<amdgpu_synid_fp_conv>`.
|
||||||
|
|
||||||
Floating-point numbers may be specified in hexadecimal and decimal formats:
|
Floating-point numbers may be specified in hexadecimal and decimal formats:
|
||||||
|
|
||||||
============== ======================================================== ========================================================
|
============ ======================================================== ====================== ====================
|
||||||
Format Syntax Note
|
Format Syntax Examples Note
|
||||||
============== ======================================================== ========================================================
|
============ ======================================================== ====================== ====================
|
||||||
Decimal [-]?[0-9]*[.][0-9]*([eE][+-]?[0-9]*)? Must include either a decimal separator or an exponent.
|
Decimal [-]?[0-9]*[.][0-9]*([eE][+-]?[0-9]*)? -1.234, 234e2 Must include either
|
||||||
Hexadecimal [-]0x[0-9a-fA-F]*(.[0-9a-fA-F]*)?[pP][+-]?[0-9a-fA-F]+
|
a decimal separator
|
||||||
============== ======================================================== ========================================================
|
or an exponent.
|
||||||
|
Hexadecimal [-]0x[0-9a-fA-F]*(.[0-9a-fA-F]*)?[pP][+-]?[0-9a-fA-F]+ -0x1afp-10, 0x.1afp10
|
||||||
Examples:
|
============ ======================================================== ====================== ====================
|
||||||
|
|
||||||
.. parsed-literal::
|
|
||||||
|
|
||||||
-1.234
|
|
||||||
234e2
|
|
||||||
-0x1afp-10
|
|
||||||
0x.1afp10
|
|
||||||
|
|
||||||
.. _amdgpu_synid_expression:
|
.. _amdgpu_synid_expression:
|
||||||
|
|
||||||
Expressions
|
Expressions
|
||||||
===========
|
===========
|
||||||
|
|
||||||
An expression specifies an address or a numeric value.
|
An expression is evaluated to a 64-bit integer.
|
||||||
|
Note that floating-point expressions are not supported.
|
||||||
|
|
||||||
There are two kinds of expressions:
|
There are two kinds of expressions:
|
||||||
|
|
||||||
* :ref:`Absolute<amdgpu_synid_absolute_expression>`.
|
* :ref:`Absolute<amdgpu_synid_absolute_expression>`.
|
||||||
@ -741,10 +743,14 @@ There are two kinds of expressions:
|
|||||||
Absolute Expressions
|
Absolute Expressions
|
||||||
--------------------
|
--------------------
|
||||||
|
|
||||||
The value of an absolute expression remains the same after program relocation.
|
The value of an absolute expression does not change after program relocation.
|
||||||
Absolute expressions must not include unassigned and relocatable values
|
Absolute expressions must not include unassigned and relocatable values
|
||||||
such as labels.
|
such as labels.
|
||||||
|
|
||||||
|
Absolute expressions are evaluated to 64-bit integer values and converted to
|
||||||
|
:ref:`expected operand type<amdgpu_syn_instruction_type>`
|
||||||
|
as described :ref:`here<amdgpu_synid_int_conv>`.
|
||||||
|
|
||||||
Examples:
|
Examples:
|
||||||
|
|
||||||
.. parsed-literal::
|
.. parsed-literal::
|
||||||
@ -760,46 +766,39 @@ Relocatable Expressions
|
|||||||
The value of a relocatable expression depends on program relocation.
|
The value of a relocatable expression depends on program relocation.
|
||||||
|
|
||||||
Note that use of relocatable expressions is limited with branch targets
|
Note that use of relocatable expressions is limited with branch targets
|
||||||
and 32-bit :ref:`literals<amdgpu_synid_literal>`.
|
and 32-bit integer operands.
|
||||||
|
|
||||||
Addition information about relocation may be found :ref:`here<amdgpu-relocation-records>`.
|
A relocatable expression is evaluated to a 64-bit integer value
|
||||||
|
which depends on operand kind and :ref:`relocation type<amdgpu-relocation-records>`
|
||||||
Examples:
|
of symbol(s) used in the expression. For example, if an instruction refers a label,
|
||||||
|
this reference is evaluated to an offset from the address after the instruction
|
||||||
|
to the label address:
|
||||||
|
|
||||||
.. parsed-literal::
|
.. parsed-literal::
|
||||||
|
|
||||||
y = x + 10 // x is not yet defined. Undefined symbols are assumed to be PC-relative.
|
label:
|
||||||
z = .
|
v_add_co_u32_e32 v0, vcc, label, v1 // 'label' operand is evaluated to -4
|
||||||
|
|
||||||
Expression Data Type
|
Note that values of relocatable expressions are usually unknown at assembly time;
|
||||||
--------------------
|
they are resolved later by a linker and converted to
|
||||||
|
:ref:`expected operand type<amdgpu_syn_instruction_type>`
|
||||||
|
as described :ref:`here<amdgpu_synid_rl_conv>`.
|
||||||
|
|
||||||
Expressions and operands of expressions are interpreted as 64-bit integers.
|
Operands and Operations
|
||||||
|
-----------------------
|
||||||
|
|
||||||
Expressions may include 64-bit :ref:`floating-point numbers<amdgpu_synid_floating-point_number>` (double).
|
Expressions are composed of 64-bit integer operands and operations.
|
||||||
However these operands are also handled as 64-bit integers
|
Operands include :ref:`integer numbers<amdgpu_synid_integer_number>`
|
||||||
using binary representation of specified floating-point numbers.
|
and :ref:`symbols<amdgpu_synid_symbol>`.
|
||||||
No conversion from floating-point to integer is performed.
|
|
||||||
|
|
||||||
Examples:
|
|
||||||
|
|
||||||
.. parsed-literal::
|
|
||||||
|
|
||||||
x = 0.1 // x is assigned an integer 4591870180066957722 which is a binary representation of 0.1.
|
|
||||||
y = x + x // y is a sum of two integer values; it is not equal to 0.2!
|
|
||||||
|
|
||||||
Syntax
|
|
||||||
------
|
|
||||||
|
|
||||||
Expressions are composed of
|
|
||||||
:ref:`symbols<amdgpu_synid_symbol>`,
|
|
||||||
:ref:`integer numbers<amdgpu_synid_integer_number>`,
|
|
||||||
:ref:`floating-point numbers<amdgpu_synid_floating-point_number>`,
|
|
||||||
:ref:`binary operators<amdgpu_synid_expression_bin_op>`,
|
|
||||||
:ref:`unary operators<amdgpu_synid_expression_un_op>` and subexpressions.
|
|
||||||
|
|
||||||
Expressions may also use "." which is a reference to the current PC (program counter).
|
Expressions may also use "." which is a reference to the current PC (program counter).
|
||||||
|
|
||||||
|
:ref:`Unary<amdgpu_synid_expression_un_op>` and :ref:`binary<amdgpu_synid_expression_bin_op>`
|
||||||
|
operations produce 64-bit integer results.
|
||||||
|
|
||||||
|
Syntax of Expressions
|
||||||
|
---------------------
|
||||||
|
|
||||||
The syntax of expressions is shown below::
|
The syntax of expressions is shown below::
|
||||||
|
|
||||||
expr ::= expr binop expr | primaryexpr ;
|
expr ::= expr binop expr | primaryexpr ;
|
||||||
@ -887,7 +886,7 @@ They operate on and produce 64-bit integers.
|
|||||||
Symbols
|
Symbols
|
||||||
-------
|
-------
|
||||||
|
|
||||||
A symbol is a named 64-bit value, representing a relocatable
|
A symbol is a named 64-bit integer value, representing a relocatable
|
||||||
address or an absolute (non-relocatable) number.
|
address or an absolute (non-relocatable) number.
|
||||||
|
|
||||||
Symbol names have the following syntax:
|
Symbol names have the following syntax:
|
||||||
@ -907,128 +906,78 @@ The table below provides several examples of syntax used for symbol definition.
|
|||||||
A symbol may be used before it is declared or assigned;
|
A symbol may be used before it is declared or assigned;
|
||||||
unassigned symbols are assumed to be PC-relative.
|
unassigned symbols are assumed to be PC-relative.
|
||||||
|
|
||||||
Addition information about symbols may be found :ref:`here<amdgpu-symbols>`.
|
Additional information about symbols may be found :ref:`here<amdgpu-symbols>`.
|
||||||
|
|
||||||
.. _amdgpu_synid_conv:
|
.. _amdgpu_synid_conv:
|
||||||
|
|
||||||
Conversions
|
Type and Size Conversion
|
||||||
===========
|
========================
|
||||||
|
|
||||||
This section describes what happens when a 64-bit
|
This section describes what happens when a 64-bit
|
||||||
:ref:`integer number<amdgpu_synid_integer_number>`, a
|
:ref:`integer number<amdgpu_synid_integer_number>`, a
|
||||||
:ref:`floating-point numbers<amdgpu_synid_floating-point_number>` or a
|
:ref:`floating-point number<amdgpu_synid_floating-point_number>` or an
|
||||||
:ref:`symbol<amdgpu_synid_symbol>`
|
:ref:`expression<amdgpu_synid_expression>`
|
||||||
is used for an operand which has a different type or size.
|
is used for an operand which has a different type or size.
|
||||||
|
|
||||||
Depending on operand kind, this conversion is performed by either assembler or AMDGPU H/W:
|
.. _amdgpu_synid_int_conv:
|
||||||
|
|
||||||
* Values encoded as :ref:`inline constants<amdgpu_synid_constant>` are handled by H/W.
|
Conversion of Integer Values
|
||||||
* Values encoded as :ref:`literals<amdgpu_synid_literal>` are converted by assembler.
|
----------------------------
|
||||||
|
|
||||||
.. _amdgpu_synid_const_conv:
|
Instruction operands may be specified as 64-bit :ref:`integer numbers<amdgpu_synid_integer_number>` or
|
||||||
|
:ref:`absolute expressions<amdgpu_synid_absolute_expression>`. These values are converted to
|
||||||
|
the :ref:`expected operand type<amdgpu_syn_instruction_type>` using the following steps:
|
||||||
|
|
||||||
Inline Constants
|
1. *Validation*. Assembler checks if the input value may be truncated without loss to the required *truncation width*
|
||||||
----------------
|
(see the table below). There are two cases when this operation is enabled:
|
||||||
|
|
||||||
.. _amdgpu_synid_int_const_conv:
|
* The truncated bits are all 0.
|
||||||
|
* The truncated bits are all 1 and the value after truncation has its MSB bit set.
|
||||||
|
|
||||||
Integer Inline Constants
|
In all other cases assembler triggers an error.
|
||||||
~~~~~~~~~~~~~~~~~~~~~~~~
|
|
||||||
|
|
||||||
Integer :ref:`inline constants<amdgpu_synid_constant>`
|
2. *Conversion*. The input value is converted to the expected type as described in the table below.
|
||||||
may be thought of as 64-bit
|
Depending on operand kind, this conversion is performed by either assembler or AMDGPU H/W (or both).
|
||||||
:ref:`integer numbers<amdgpu_synid_integer_number>`;
|
|
||||||
when used as operands they are truncated to the size of
|
|
||||||
:ref:`expected operand type<amdgpu_syn_instruction_type>`.
|
|
||||||
No data type conversions are performed.
|
|
||||||
|
|
||||||
Examples:
|
============== ================= =============== ====================================================================
|
||||||
|
Expected type Truncation Width Conversion Description
|
||||||
|
============== ================= =============== ====================================================================
|
||||||
|
i16, u16, b16 16 num.u16 Truncate to 16 bits.
|
||||||
|
i32, u32, b32 32 num.u32 Truncate to 32 bits.
|
||||||
|
i64 32 {-1,num.i32} Truncate to 32 bits and then sign-extend the result to 64 bits.
|
||||||
|
u64, b64 32 {0,num.u32} Truncate to 32 bits and then zero-extend the result to 64 bits.
|
||||||
|
f16 16 num.u16 Use low 16 bits as an f16 value.
|
||||||
|
f32 32 num.u32 Use low 32 bits as an f32 value.
|
||||||
|
f64 32 {num.u32,0} Use low 32 bits of the number as high 32 bits
|
||||||
|
of the result; low 32 bits of the result are zeroed.
|
||||||
|
============== ================= =============== ====================================================================
|
||||||
|
|
||||||
|
Examples of enabled conversions:
|
||||||
|
|
||||||
.. parsed-literal::
|
.. parsed-literal::
|
||||||
|
|
||||||
// GFX9
|
// GFX9
|
||||||
|
|
||||||
v_add_u16 v0, -1, 0 // v0 = 0xFFFF
|
v_add_u16 v0, -1, 0 // src0 = 0xFFFF
|
||||||
v_add_f16 v0, -1, 0 // v0 = 0xFFFF (NaN)
|
v_add_f16 v0, -1, 0 // src0 = 0xFFFF (NaN)
|
||||||
|
//
|
||||||
|
v_add_u32 v0, -1, 0 // src0 = 0xFFFFFFFF
|
||||||
|
v_add_f32 v0, -1, 0 // src0 = 0xFFFFFFFF (NaN)
|
||||||
|
//
|
||||||
|
v_add_u16 v0, 0xff00, v0 // src0 = 0xff00
|
||||||
|
v_add_u16 v0, 0xffffffffffffff00, v0 // src0 = 0xff00
|
||||||
|
v_add_u16 v0, -256, v0 // src0 = 0xff00
|
||||||
|
//
|
||||||
|
s_bfe_i64 s[0:1], 0xffefffff, s3 // src0 = 0xffffffffffefffff
|
||||||
|
s_bfe_u64 s[0:1], 0xffefffff, s3 // src0 = 0x00000000ffefffff
|
||||||
|
v_ceil_f64_e32 v[0:1], 0xffefffff // src0 = 0xffefffff00000000 (-1.7976922776554302e308)
|
||||||
|
//
|
||||||
|
x = 0xffefffff //
|
||||||
|
s_bfe_i64 s[0:1], x, s3 // src0 = 0xffffffffffefffff
|
||||||
|
s_bfe_u64 s[0:1], x, s3 // src0 = 0x00000000ffefffff
|
||||||
|
v_ceil_f64_e32 v[0:1], x // src0 = 0xffefffff00000000 (-1.7976922776554302e308)
|
||||||
|
|
||||||
v_add_u32 v0, -1, 0 // v0 = 0xFFFFFFFF
|
Examples of disabled conversions:
|
||||||
v_add_f32 v0, -1, 0 // v0 = 0xFFFFFFFF (NaN)
|
|
||||||
|
|
||||||
.. _amdgpu_synid_fp_const_conv:
|
|
||||||
|
|
||||||
Floating-Point Inline Constants
|
|
||||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
||||||
|
|
||||||
Floating-point :ref:`inline constants<amdgpu_synid_constant>`
|
|
||||||
may be thought of as 64-bit
|
|
||||||
:ref:`floating-point numbers<amdgpu_synid_floating-point_number>`;
|
|
||||||
when used as operands they are converted to a floating-point number of
|
|
||||||
:ref:`expected operand size<amdgpu_syn_instruction_type>`.
|
|
||||||
|
|
||||||
Examples:
|
|
||||||
|
|
||||||
.. parsed-literal::
|
|
||||||
|
|
||||||
// GFX9
|
|
||||||
|
|
||||||
v_add_f16 v0, 1.0, 0 // v0 = 0x3C00 (1.0)
|
|
||||||
v_add_u16 v0, 1.0, 0 // v0 = 0x3C00
|
|
||||||
|
|
||||||
v_add_f32 v0, 1.0, 0 // v0 = 0x3F800000 (1.0)
|
|
||||||
v_add_u32 v0, 1.0, 0 // v0 = 0x3F800000
|
|
||||||
|
|
||||||
|
|
||||||
.. _amdgpu_synid_lit_conv:
|
|
||||||
|
|
||||||
Literals
|
|
||||||
--------
|
|
||||||
|
|
||||||
.. _amdgpu_synid_int_lit_conv:
|
|
||||||
|
|
||||||
Integer Literals
|
|
||||||
~~~~~~~~~~~~~~~~
|
|
||||||
|
|
||||||
Integer :ref:`literals<amdgpu_synid_literal>`
|
|
||||||
are specified as 64-bit :ref:`integer numbers<amdgpu_synid_integer_number>`.
|
|
||||||
|
|
||||||
When used as operands they are converted to
|
|
||||||
:ref:`expected operand type<amdgpu_syn_instruction_type>` as described below.
|
|
||||||
|
|
||||||
============== ============== =============== ====================================================================
|
|
||||||
Expected type Condition Result Note
|
|
||||||
============== ============== =============== ====================================================================
|
|
||||||
i16, u16, b16 cond(num,16) num.u16 Truncate to 16 bits.
|
|
||||||
i32, u32, b32 cond(num,32) num.u32 Truncate to 32 bits.
|
|
||||||
i64 cond(num,32) {-1,num.i32} Truncate to 32 bits and then sign-extend the result to 64 bits.
|
|
||||||
u64, b64 cond(num,32) { 0,num.u32} Truncate to 32 bits and then zero-extend the result to 64 bits.
|
|
||||||
f16 cond(num,16) num.u16 Use low 16 bits as an f16 value.
|
|
||||||
f32 cond(num,32) num.u32 Use low 32 bits as an f32 value.
|
|
||||||
f64 cond(num,32) {num.u32,0} Use low 32 bits of the number as high 32 bits
|
|
||||||
of the result; low 32 bits of the result are zeroed.
|
|
||||||
============== ============== =============== ====================================================================
|
|
||||||
|
|
||||||
The condition *cond(X,S)* indicates if a 64-bit number *X*
|
|
||||||
can be converted to a smaller size *S* by truncation of upper bits.
|
|
||||||
There are two cases when the conversion is possible:
|
|
||||||
|
|
||||||
* The truncated bits are all 0.
|
|
||||||
* The truncated bits are all 1 and the value after truncation has its MSB bit set.
|
|
||||||
|
|
||||||
Examples of valid literals:
|
|
||||||
|
|
||||||
.. parsed-literal::
|
|
||||||
|
|
||||||
// GFX9
|
|
||||||
// Literal value after conversion:
|
|
||||||
v_add_u16 v0, 0xff00, v0 // 0xff00
|
|
||||||
v_add_u16 v0, 0xffffffffffffff00, v0 // 0xff00
|
|
||||||
v_add_u16 v0, -256, v0 // 0xff00
|
|
||||||
// Literal value after conversion:
|
|
||||||
s_bfe_i64 s[0:1], 0xffefffff, s3 // 0xffffffffffefffff
|
|
||||||
s_bfe_u64 s[0:1], 0xffefffff, s3 // 0x00000000ffefffff
|
|
||||||
v_ceil_f64_e32 v[0:1], 0xffefffff // 0xffefffff00000000 (-1.7976922776554302e308)
|
|
||||||
|
|
||||||
Examples of invalid literals:
|
|
||||||
|
|
||||||
.. parsed-literal::
|
.. parsed-literal::
|
||||||
|
|
||||||
@ -1037,49 +986,57 @@ Examples of invalid literals:
|
|||||||
v_add_u16 v0, 0x1ff00, v0 // truncated bits are not all 0 or 1
|
v_add_u16 v0, 0x1ff00, v0 // truncated bits are not all 0 or 1
|
||||||
v_add_u16 v0, 0xffffffffffff00ff, v0 // truncated bits do not match MSB of the result
|
v_add_u16 v0, 0xffffffffffff00ff, v0 // truncated bits do not match MSB of the result
|
||||||
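The two steps described for integer values (validation against the truncation width, then conversion) can be modelled roughly as follows. This is a sketch in Python under stated assumptions, not the assembler's implementation: `can_truncate`, `convert_integer` and the `_RULES` table are hypothetical names, and the widths and conversions are transcribed from the new conversion table in this change.

```python
MASK64 = (1 << 64) - 1

# expected type -> (truncation width, conversion kind), per the table
_RULES = {
    "i16": (16, "plain"), "u16": (16, "plain"), "b16": (16, "plain"),
    "i32": (32, "plain"), "u32": (32, "plain"), "b32": (32, "plain"),
    "i64": (32, "sign_extend"),
    "u64": (32, "zero_extend"), "b64": (32, "zero_extend"),
    "f16": (16, "plain"), "f32": (32, "plain"),
    "f64": (32, "high_half"),
}


def can_truncate(value, width):
    """Validation: truncated bits are all 0, or all 1 with the result's MSB set."""
    value &= MASK64                      # view the input as a 64-bit pattern
    upper = value >> width               # bits that truncation would discard
    msb = (value >> (width - 1)) & 1     # MSB of the truncated result
    return upper == 0 or (upper == (MASK64 >> width) and msb == 1)


def convert_integer(value, expected_type):
    """Conversion: return the operand bits for the expected type."""
    width, kind = _RULES[expected_type]
    if not can_truncate(value, width):
        raise ValueError("value does not fit in %d bits" % width)
    low = value & ((1 << width) - 1)
    if kind == "sign_extend" and (low >> 31) & 1:
        return low | 0xFFFFFFFF00000000  # {-1, num.i32}
    if kind == "high_half":
        return low << 32                 # {num.u32, 0}
    return low                           # plain truncation or zero-extension
```

With this model, ``convert_integer(-256, "u16")`` yields ``0xff00`` and ``convert_integer(0x1ff00, "u16")`` is rejected, matching the enabled and disabled examples in the diff.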
|
|
||||||
.. _amdgpu_synid_fp_lit_conv:
|
.. _amdgpu_synid_fp_conv:
|
||||||
|
|
||||||
Floating-Point Literals
|
Conversion of Floating-Point Values
|
||||||
~~~~~~~~~~~~~~~~~~~~~~~
|
-----------------------------------
|
||||||
|
|
||||||
Floating-point :ref:`literals<amdgpu_synid_literal>` are specified as 64-bit
|
Instruction operands may be specified as 64-bit :ref:`floating-point numbers<amdgpu_synid_floating-point_number>`.
|
||||||
:ref:`floating-point numbers<amdgpu_synid_floating-point_number>`.
|
These values are converted to the :ref:`expected operand type<amdgpu_syn_instruction_type>` using the following steps:
|
||||||
|
|
||||||
When used as operands they are converted to
|
1. *Validation*. Assembler checks if the input f64 number can be converted
|
||||||
:ref:`expected operand type<amdgpu_syn_instruction_type>` as described below.
|
to the *required floating-point type* (see the table below) without overflow or underflow.
|
||||||
|
Precision lost is allowed. If this conversion is not possible, assembler triggers an error.
|
||||||
|
|
||||||
============== ============== ================= =================================================================
|
2. *Conversion*. The input value is converted to the expected type as described in the table below.
|
||||||
Expected type Condition Result Note
|
Depending on operand kind, this is performed by either assembler or AMDGPU H/W (or both).
|
||||||
============== ============== ================= =================================================================
|
|
||||||
i16, u16, b16 cond(num,16) f16(num) Convert to f16 and use bits of the result as an integer value.
|
|
||||||
i32, u32, b32 cond(num,32) f32(num) Convert to f32 and use bits of the result as an integer value.
|
|
||||||
i64, u64, b64 false \- Conversion disabled because of an unclear semantics.
|
|
||||||
f16 cond(num,16) f16(num) Convert to f16.
|
|
||||||
f32 cond(num,32) f32(num) Convert to f32.
|
|
||||||
f64 true {num.u32.hi,0} Use high 32 bits of the number as high 32 bits of the result;
|
|
||||||
zero-fill low 32 bits of the result.
|
|
||||||
|
|
||||||
Note that the result may differ from the original number.
|
============== ================ ================= =================================================================
|
||||||
============== ============== ================= =================================================================
|
Expected type Required FP Type Conversion Description
|
||||||
|
============== ================ ================= =================================================================
|
||||||
|
i16, u16, b16 f16 f16(num) Convert to f16 and use bits of the result as an integer value.
|
||||||
|
i32, u32, b32 f32 f32(num) Convert to f32 and use bits of the result as an integer value.
|
||||||
|
i64, u64, b64 \- \- Conversion disabled.
|
||||||
|
f16 f16 f16(num) Convert to f16.
|
||||||
|
f32 f32 f32(num) Convert to f32.
|
||||||
|
f64 f64 {num.u32.hi,0} Use high 32 bits of the number as high 32 bits of the result;
|
||||||
|
zero-fill low 32 bits of the result.
|
||||||
|
|
||||||
The condition *cond(X,S)* indicates if an f64 number *X* can be converted
|
Note that the result may differ from the original number.
|
||||||
to a smaller *S*-bit floating-point type without overflow or underflow.
|
============== ================ ================= =================================================================
|
||||||
Precision lost is allowed.
|
|
||||||
|
|
||||||
Examples of valid literals:
|
Examples of enabled conversions:
|
||||||
|
|
||||||
.. parsed-literal::
|
.. parsed-literal::
|
||||||
|
|
||||||
// GFX9
|
// GFX9
|
||||||
|
|
||||||
v_add_f16 v1, 65500.0, v2
|
v_add_f16 v0, 1.0, 0 // src0 = 0x3C00 (1.0)
|
||||||
v_add_f32 v1, 65600.0, v2
|
v_add_u16 v0, 1.0, 0 // src0 = 0x3C00
|
||||||
|
//
|
||||||
|
v_add_f32 v0, 1.0, 0 // src0 = 0x3F800000 (1.0)
|
||||||
|
v_add_u32 v0, 1.0, 0 // src0 = 0x3F800000
|
||||||
|
|
||||||
// Literal value before conversion: 1.7976931348623157e308 (0x7fefffffffffffff)
|
// src0 before conversion:
|
||||||
// Literal value after conversion: 1.7976922776554302e308 (0x7fefffff00000000)
|
// 1.7976931348623157e308 = 0x7fefffffffffffff
|
||||||
|
// src0 after conversion:
|
||||||
|
// 1.7976922776554302e308 = 0x7fefffff00000000
|
||||||
v_ceil_f64 v[0:1], 1.7976931348623157e308
|
v_ceil_f64 v[0:1], 1.7976931348623157e308
|
||||||
|
|
||||||
Examples of invalid literals:
|
v_add_f16 v1, 65500.0, v2 // ok for f16.
|
||||||
|
v_add_f32 v1, 65600.0, v2 // ok for f32, but would result in overflow for f16.
|
||||||
|
|
||||||
|
Examples of disabled conversions:
|
||||||
|
|
||||||
.. parsed-literal::
|
.. parsed-literal::
|
||||||
|
|
||||||
@ -1087,25 +1044,35 @@ Examples of invalid literals:
|
|||||||
|
|
||||||
v_add_f16 v1, 65600.0, v2 // overflow
|
v_add_f16 v1, 65600.0, v2 // overflow
|
||||||
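The floating-point rules can be approximated with Python's ``struct`` module, which rounds a double to f16 or f32 and raises ``OverflowError`` when the value is too large. This is a hedged sketch: `convert_fp` and `convert_f64` are hypothetical helpers, and the underflow test (an inexact result below the smallest normal number) is an assumption about what "underflow" means here, not a statement of the assembler's exact check.

```python
import struct

# required FP type -> (pack format, same-size integer format, smallest normal)
_FP_FORMATS = {
    "f16": ("<e", "<H", 2.0 ** -14),
    "f32": ("<f", "<I", 2.0 ** -126),
}


def convert_fp(num, required):
    """Convert a double to f16/f32 bits; precision loss is allowed."""
    pack_fmt, int_fmt, min_normal = _FP_FORMATS[required]
    try:
        raw = struct.pack(pack_fmt, num)       # rounds to the narrower type
    except OverflowError:
        raise ValueError("overflow converting to " + required)
    (rounded,) = struct.unpack(pack_fmt, raw)
    # Assumed underflow rule: inexact result that is zero or subnormal.
    if rounded != num and abs(rounded) < min_normal:
        raise ValueError("underflow converting to " + required)
    return struct.unpack(int_fmt, raw)[0]


def convert_f64(num):
    """f64 operand: keep the high 32 bits of the double, zero the low 32 bits."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", num))
    return bits & 0xFFFFFFFF00000000
```

In this model ``65500.0`` converts to f16 while ``65600.0`` is rejected as an overflow, and ``convert_f64(1.7976931348623157e308)`` gives ``0x7fefffff00000000``, in line with the examples shown in the diff.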
|
|
||||||
.. _amdgpu_synid_exp_conv:
|
.. _amdgpu_synid_rl_conv:
|
||||||
|
|
||||||
Expressions
|
Conversion of Relocatable Values
|
||||||
~~~~~~~~~~~
|
--------------------------------
|
||||||
|
|
||||||
Expressions operate with and result in 64-bit integers.
|
:ref:`Relocatable expressions<amdgpu_synid_relocatable_expression>`
|
||||||
|
may be used with 32-bit integer operands and jump targets.
|
||||||
|
|
||||||
When used as operands they are truncated to
|
When the value of a relocatable expression is resolved by a linker, it is
|
||||||
:ref:`expected operand size<amdgpu_syn_instruction_type>`.
|
converted as needed and truncated to the operand size. The conversion depends
|
||||||
No data type conversions are performed.
|
on :ref:`relocation type<amdgpu-relocation-records>` and operand kind.
|
||||||
|
|
||||||
Examples:
|
For example, when a 32-bit operand of an instruction refers a relocatable expression *expr*,
|
||||||
|
this reference is evaluated to a 64-bit offset from the address after the
|
||||||
|
instruction to the address being referenced, *counted in bytes*.
|
||||||
|
Then the value is truncated to 32 bits and encoded as a literal:
|
||||||
|
|
||||||
.. parsed-literal::
|
.. parsed-literal::
|
||||||
|
|
||||||
// GFX9
|
expr = .
|
||||||
|
v_add_co_u32_e32 v0, vcc, expr, v1 // 'expr' operand is evaluated to -4
|
||||||
|
// and then truncated to 0xFFFFFFFC
|
||||||
|
|
||||||
x = 0.1
|
As another example, when a branch instruction refers a label,
|
||||||
v_sqrt_f32 v0, x // v0 = [low 32 bits of 0.1 (double)]
|
this reference is evaluated to an offset from the address after the
|
||||||
v_sqrt_f32 v0, (0.1 + 0) // the same as above
|
instruction to the label address, *counted in dwords*.
|
||||||
v_sqrt_f32 v0, 0.1 // v0 = [0.1 (double) converted to float]
|
Then the value is truncated to 16 bits:
|
||||||
|
|
||||||
|
.. parsed-literal::
|
||||||
|
|
||||||
|
label:
|
||||||
|
s_branch label // 'label' operand is evaluated to -1 and truncated to 0xFFFF
|
||||||
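The two truncations described for relocatable values can be illustrated with plain arithmetic. This is a minimal sketch, assuming the resolved offset (in bytes, from the address after the instruction to the referenced address) is already known; `literal_operand` and `branch_operand` are hypothetical helper names, and no linker behaviour is modelled.

```python
def literal_operand(byte_offset):
    """32-bit operand: the byte offset truncated to 32 bits."""
    return byte_offset & 0xFFFFFFFF


def branch_operand(byte_offset):
    """Branch target: the offset counted in dwords, truncated to 16 bits."""
    return (byte_offset // 4) & 0xFFFF
```

For an offset of -4 bytes, as in both examples above, the literal form gives ``0xFFFFFFFC`` and the branch form gives ``0xFFFF``.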
|