mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-01-31 12:41:49 +01:00

[AMDGPU][MC][DOC] Updated AMD GPU assembler description.

Summary of changes:
- Updated to reflect recent changes in assembler;
- Minor bugfixing and improvements.

llvm-svn: 372857
Dmitry Preobrazhensky 2019-09-25 12:38:35 +00:00
parent ce36c01e6a
commit ebbe05934d
58 changed files with 1058 additions and 709 deletions

@ -566,7 +566,7 @@ SOPC
s_cmp_lg_u64 :ref:`ssrc0<amdgpu_synid8_ssrc64_0>`, :ref:`ssrc1<amdgpu_synid8_ssrc64_0>` s_cmp_lg_u64 :ref:`ssrc0<amdgpu_synid8_ssrc64_0>`, :ref:`ssrc1<amdgpu_synid8_ssrc64_0>`
s_cmp_lt_i32 :ref:`ssrc0<amdgpu_synid8_ssrc32_0>`, :ref:`ssrc1<amdgpu_synid8_ssrc32_0>` s_cmp_lt_i32 :ref:`ssrc0<amdgpu_synid8_ssrc32_0>`, :ref:`ssrc1<amdgpu_synid8_ssrc32_0>`
s_cmp_lt_u32 :ref:`ssrc0<amdgpu_synid8_ssrc32_0>`, :ref:`ssrc1<amdgpu_synid8_ssrc32_0>` s_cmp_lt_u32 :ref:`ssrc0<amdgpu_synid8_ssrc32_0>`, :ref:`ssrc1<amdgpu_synid8_ssrc32_0>`
s_set_gpr_idx_on :ref:`ssrc<amdgpu_synid8_ssrc32_0>`, :ref:`imm4<amdgpu_synid8_imm4>` s_set_gpr_idx_on :ref:`ssrc<amdgpu_synid8_ssrc32_0>`, :ref:`imask<amdgpu_synid8_imask>`
s_setvskip :ref:`ssrc0<amdgpu_synid8_ssrc32_0>`, :ref:`ssrc1<amdgpu_synid8_ssrc32_0>` s_setvskip :ref:`ssrc0<amdgpu_synid8_ssrc32_0>`, :ref:`ssrc1<amdgpu_synid8_ssrc32_0>`
SOPK SOPK
@ -624,7 +624,7 @@ SOPP
s_nop :ref:`imm16<amdgpu_synid8_bimm16>` s_nop :ref:`imm16<amdgpu_synid8_bimm16>`
s_sendmsg :ref:`msg<amdgpu_synid8_msg>` s_sendmsg :ref:`msg<amdgpu_synid8_msg>`
s_sendmsghalt :ref:`msg<amdgpu_synid8_msg>` s_sendmsghalt :ref:`msg<amdgpu_synid8_msg>`
s_set_gpr_idx_mode :ref:`imm4<amdgpu_synid8_imm4>` s_set_gpr_idx_mode :ref:`imask<amdgpu_synid8_imask>`
s_set_gpr_idx_off s_set_gpr_idx_off
s_sethalt :ref:`imm16<amdgpu_synid8_bimm16>` s_sethalt :ref:`imm16<amdgpu_synid8_bimm16>`
s_setkill :ref:`imm16<amdgpu_synid8_bimm16>` s_setkill :ref:`imm16<amdgpu_synid8_bimm16>`
@ -1756,7 +1756,7 @@ VOPC
gfx8_fimm16 gfx8_fimm16
gfx8_fimm32 gfx8_fimm32
gfx8_hwreg gfx8_hwreg
gfx8_imm4 gfx8_imask
gfx8_label gfx8_label
gfx8_msg gfx8_msg
gfx8_param gfx8_param

@ -736,7 +736,7 @@ SOPC
s_cmp_lg_u64 :ref:`ssrc0<amdgpu_synid9_ssrc64_0>`, :ref:`ssrc1<amdgpu_synid9_ssrc64_0>` s_cmp_lg_u64 :ref:`ssrc0<amdgpu_synid9_ssrc64_0>`, :ref:`ssrc1<amdgpu_synid9_ssrc64_0>`
s_cmp_lt_i32 :ref:`ssrc0<amdgpu_synid9_ssrc32_0>`, :ref:`ssrc1<amdgpu_synid9_ssrc32_0>` s_cmp_lt_i32 :ref:`ssrc0<amdgpu_synid9_ssrc32_0>`, :ref:`ssrc1<amdgpu_synid9_ssrc32_0>`
s_cmp_lt_u32 :ref:`ssrc0<amdgpu_synid9_ssrc32_0>`, :ref:`ssrc1<amdgpu_synid9_ssrc32_0>` s_cmp_lt_u32 :ref:`ssrc0<amdgpu_synid9_ssrc32_0>`, :ref:`ssrc1<amdgpu_synid9_ssrc32_0>`
s_set_gpr_idx_on :ref:`ssrc<amdgpu_synid9_ssrc32_0>`, :ref:`imm4<amdgpu_synid9_imm4>` s_set_gpr_idx_on :ref:`ssrc<amdgpu_synid9_ssrc32_0>`, :ref:`imask<amdgpu_synid9_imask>`
s_setvskip :ref:`ssrc0<amdgpu_synid9_ssrc32_0>`, :ref:`ssrc1<amdgpu_synid9_ssrc32_0>` s_setvskip :ref:`ssrc0<amdgpu_synid9_ssrc32_0>`, :ref:`ssrc1<amdgpu_synid9_ssrc32_0>`
SOPK SOPK
@ -796,7 +796,7 @@ SOPP
s_nop :ref:`imm16<amdgpu_synid9_bimm16>` s_nop :ref:`imm16<amdgpu_synid9_bimm16>`
s_sendmsg :ref:`msg<amdgpu_synid9_msg>` s_sendmsg :ref:`msg<amdgpu_synid9_msg>`
s_sendmsghalt :ref:`msg<amdgpu_synid9_msg>` s_sendmsghalt :ref:`msg<amdgpu_synid9_msg>`
s_set_gpr_idx_mode :ref:`imm4<amdgpu_synid9_imm4>` s_set_gpr_idx_mode :ref:`imask<amdgpu_synid9_imask>`
s_set_gpr_idx_off s_set_gpr_idx_off
s_sethalt :ref:`imm16<amdgpu_synid9_bimm16>` s_sethalt :ref:`imm16<amdgpu_synid9_bimm16>`
s_setkill :ref:`imm16<amdgpu_synid9_bimm16>` s_setkill :ref:`imm16<amdgpu_synid9_bimm16>`
@ -2010,7 +2010,7 @@ VOPC
gfx9_fimm16 gfx9_fimm16
gfx9_fimm32 gfx9_fimm32
gfx9_hwreg gfx9_hwreg
gfx9_imm4 gfx9_imask
gfx9_label gfx9_label
gfx9_msg gfx9_msg
gfx9_param gfx9_param

@ -10,5 +10,5 @@
imm16 imm16
=========================== ===========================
An :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 16 bits. A 16-bit :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range -32768..65535.

@ -10,5 +10,5 @@
imm32 imm32
=========================== ===========================
An :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 32 bits. An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value is truncated to 32 bits.

@ -21,7 +21,7 @@ Optionally may serve as an output data:
* :ref:`dmask<amdgpu_synid_dmask>` may specify 2 data elements for 32-bit-per-pixel surfaces or 4 data elements for 64-bit-per-pixel surfaces. Each data element occupies 1 dword. * :ref:`dmask<amdgpu_synid_dmask>` may specify 2 data elements for 32-bit-per-pixel surfaces or 4 data elements for 64-bit-per-pixel surfaces. Each data element occupies 1 dword.
* :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified. * :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
Note. The surface data format is indicated in the image resource constant but not in the instruction. Note: the surface data format is indicated in the image resource constant but not in the instruction.
*Operands:* :ref:`v<amdgpu_synid_v>` *Operands:* :ref:`v<amdgpu_synid_v>`

@ -21,6 +21,6 @@ Optionally may serve as an output data:
* :ref:`dmask<amdgpu_synid_dmask>` may specify 1 data element for 32-bit-per-pixel surfaces or 2 data elements for 64-bit-per-pixel surfaces. Each data element occupies 1 dword. * :ref:`dmask<amdgpu_synid_dmask>` may specify 1 data element for 32-bit-per-pixel surfaces or 2 data elements for 64-bit-per-pixel surfaces. Each data element occupies 1 dword.
* :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified. * :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
Note. The surface data format is indicated in the image resource constant but not in the instruction. Note: the surface data format is indicated in the image resource constant but not in the instruction.
*Operands:* :ref:`v<amdgpu_synid_v>` *Operands:* :ref:`v<amdgpu_synid_v>`

@ -10,5 +10,6 @@
imm32 imm32
=========================== ===========================
An :ref:`integer_number<amdgpu_synid_integer_number>` or a :ref:`floating-point_number<amdgpu_synid_floating-point_number>`. The number is converted to *f16* as described :ref:`here<amdgpu_synid_lit_conv>`. A :ref:`floating-point_number<amdgpu_synid_floating-point_number>`, an :ref:`integer_number<amdgpu_synid_integer_number>`, or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`.
The value is converted to *f16* as described :ref:`here<amdgpu_synid_fp_conv>`.

@ -10,5 +10,6 @@
imm32 imm32
=========================== ===========================
An :ref:`integer_number<amdgpu_synid_integer_number>` or a :ref:`floating-point_number<amdgpu_synid_floating-point_number>`. The value is converted to *f32* as described :ref:`here<amdgpu_synid_lit_conv>`. A :ref:`floating-point_number<amdgpu_synid_floating-point_number>`, an :ref:`integer_number<amdgpu_synid_integer_number>`, or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`.
The value is converted to *f32* as described :ref:`here<amdgpu_synid_fp_conv>`.

@ -14,18 +14,21 @@ Bits of a hardware register being accessed.
The bits of this operand have the following meaning: The bits of this operand have the following meaning:
============ =================================== ======= ===================== ============
Bits Description Bits Description Value Range
============ =================================== ======= ===================== ============
5:0 Register *id*. 5:0 Register *id*. 0..63
10:6 First bit *offset* (0..31). 10:6 First bit *offset*. 0..31
15:11 *Size* in bits (1..32). 15:11 *Size* in bits. 1..32
============ =================================== ======= ===================== ============
This operand may be specified as a positive 16-bit :ref:`integer_number<amdgpu_synid_integer_number>` or using the syntax described below. This operand may be specified as one of the following:
* An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range 0..0xFFFF.
* An *hwreg* value described below.
==================================== ============================================================================ ==================================== ============================================================================
Syntax Description Hwreg Value Syntax Description
==================================== ============================================================================ ==================================== ============================================================================
hwreg({0..63}) All bits of a register indicated by its *id*. hwreg({0..63}) All bits of a register indicated by its *id*.
hwreg(<*name*>) All bits of a register indicated by its *name*. hwreg(<*name*>) All bits of a register indicated by its *name*.
@ -33,7 +36,8 @@ This operand may be specified as a positive 16-bit :ref:`integer_number<amdgpu_s
hwreg(<*name*>, {0..31}, {1..32}) Register bits indicated by register *name*, first bit *offset* and *size*. hwreg(<*name*>, {0..31}, {1..32}) Register bits indicated by register *name*, first bit *offset* and *size*.
==================================== ============================================================================ ==================================== ============================================================================
Register *id*, *offset* and *size* must be specified as positive :ref:`integer numbers<amdgpu_synid_integer_number>`. Numeric values may be specified as positive :ref:`integer numbers<amdgpu_synid_integer_number>`
or :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
Defined register *names* include: Defined register *names* include:
@ -62,7 +66,16 @@ Examples:
.. parsed-literal:: .. parsed-literal::
s_getreg_b32 s2, 0x6 reg = 1
offset = 2
size = 4
hwreg_enc = reg | (offset << 6) | ((size - 1) << 11)
s_getreg_b32 s2, 0x1881
s_getreg_b32 s2, hwreg_enc // the same as above
s_getreg_b32 s2, hwreg(1, 2, 4) // the same as above
s_getreg_b32 s2, hwreg(reg, offset, size) // the same as above
s_getreg_b32 s2, hwreg(15) s_getreg_b32 s2, hwreg(15)
s_getreg_b32 s2, hwreg(51, 1, 31) s_getreg_b32 s2, hwreg(51, 1, 31)
s_getreg_b32 s2, hwreg(HW_REG_LDS_ALLOC, 0, 1) s_getreg_b32 s2, hwreg(HW_REG_LDS_ALLOC, 0, 1)
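The bit layout in the table above and the `hwreg_enc` expression in the new examples can be cross-checked with a short Python sketch. The function names `encode_hwreg` and `decode_hwreg` are hypothetical and not part of the assembler. Defaulting to offset 0 and size 32 when only a register id is given is an assumption based on the "all bits of a register" description.

```python
def encode_hwreg(reg_id, offset=0, size=32):
    """Pack a hwreg operand: bits 5:0 = id, 10:6 = offset, 15:11 = size - 1."""
    if not 0 <= reg_id <= 63:
        raise ValueError("register id must be in the range 0..63")
    if not 0 <= offset <= 31:
        raise ValueError("offset must be in the range 0..31")
    if not 1 <= size <= 32:
        raise ValueError("size must be in the range 1..32")
    # The size field stores size - 1, as in the hwreg_enc example.
    return reg_id | (offset << 6) | ((size - 1) << 11)


def decode_hwreg(value):
    """Split a 16-bit hwreg operand into (id, offset, size)."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("operand must be in the range 0..0xFFFF")
    reg_id = value & 0x3F
    offset = (value >> 6) & 0x1F
    size = ((value >> 11) & 0x1F) + 1
    return reg_id, offset, size


print(hex(encode_hwreg(1, 2, 4)))  # 0x1881
```

With the values from the example (`reg = 1`, `offset = 2`, `size = 4`) the sketch yields `0x1881`, which matches the numeric operand shown in the diff.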

@ -12,19 +12,26 @@ label
A branch target which is a 16-bit signed integer treated as a PC-relative dword offset. A branch target which is a 16-bit signed integer treated as a PC-relative dword offset.
This operand may be specified as: This operand may be specified as one of the following:
* An :ref:`integer_number<amdgpu_synid_integer_number>`. The number is truncated to 16 bits. * An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range -32768..65535.
* An :ref:`absolute_expression<amdgpu_synid_absolute_expression>` which must start with an :ref:`integer_number<amdgpu_synid_integer_number>`. The value of the expression is truncated to 16 bits. * A :ref:`symbol<amdgpu_synid_symbol>` (for example, a label) representing a relocatable address in the same compilation unit where it is referred from. The value is handled as a 16-bit PC-relative dword offset to be resolved by a linker.
* A :ref:`symbol<amdgpu_synid_symbol>` (for example, a label). The value is handled as a 16-bit PC-relative dword offset to be resolved by a linker.
Examples: Examples:
.. parsed-literal:: .. parsed-literal::
offset = 30 offset = 30
s_branch loop_end label_1:
s_branch 2 + offset label_2 = . + 4
s_branch 32
loop_end: s_branch 32
s_branch offset + 2
s_branch label_1
s_branch label_2
s_branch label_3
s_branch label_4
label_3 = label_2 + 4
label_4:
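How a resolved label becomes a 16-bit PC-relative dword offset can be sketched in Python. This is an illustration only: the helper name `branch_simm16` is hypothetical, and the assumption that the offset is counted from the instruction following the 4-byte branch comes from general knowledge of the ISA rather than from this commit.

```python
def branch_simm16(branch_addr, target_addr):
    """Return the 16-bit field for a branch at branch_addr to target_addr.

    Both arguments are byte addresses.
    Assumption: target_addr == branch_addr + 4 + 4 * offset.
    """
    delta = target_addr - (branch_addr + 4)
    if delta % 4 != 0:
        raise ValueError("target must be a whole number of dwords away")
    offset = delta // 4
    if not -32768 <= offset <= 32767:
        raise ValueError("target is out of range for a 16-bit signed dword offset")
    # Two's-complement encoding of the signed offset in 16 bits.
    return offset & 0xFFFF


print(branch_simm16(0, 132))  # 32
```

A backward branch produces a negative offset, which the sketch stores in two's-complement form in the 16-bit field.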

@ -12,24 +12,29 @@ msg
A 16-bit message code. The bits of this operand have the following meaning: A 16-bit message code. The bits of this operand have the following meaning:
============ ====================================================== ============ =============================== ===============
Bits Description Bits Description Value Range
============ ====================================================== ============ =============================== ===============
3:0 Message *type*. 3:0 Message *type*. 0..15
6:4 Optional *operation*. 6:4 Optional *operation*. 0..7
9:7 Optional *parameters*. 7:7 Unused. \-
15:10 Unused. 9:8 Optional *stream*. 0..3
============ ====================================================== 15:10 Unused. \-
============ =============================== ===============
This operand may be specified as a positive 16-bit :ref:`integer_number<amdgpu_synid_integer_number>` or using the syntax described below: This operand may be specified as one of the following:
======================================== ======================================================================== * An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range 0..0xFFFF.
Syntax Description * A *sendmsg* value described below.
======================================== ========================================================================
sendmsg(<*type*>) A message identified by its *type*. ==================================== ====================================================
sendmsg(<*type*>, <*op*>) A message identified by its *type* and *operation*. Sendmsg Value Syntax Description
sendmsg(<*type*>, <*op*>, <*stream*>) A message identified by its *type* and *operation* with a stream *id*. ==================================== ====================================================
======================================== ======================================================================== sendmsg(<*type*>) A message identified by its *type*.
sendmsg(<*type*>,<*op*>) A message identified by its *type* and *operation*.
sendmsg(<*type*>,<*op*>,<*stream*>) A message identified by its *type* and *operation*
with a stream *id*.
==================================== ====================================================
*Type* may be specified using message *name* or message *id*. *Type* may be specified using message *name* or message *id*.
@ -37,7 +42,8 @@ This operand may be specified as a positive 16-bit :ref:`integer_number<amdgpu_s
Stream *id* is an integer in the range 0..3. Stream *id* is an integer in the range 0..3.
Message *id*, operation *id* and stream *id* must be specified as positive :ref:`integer numbers<amdgpu_synid_integer_number>`. Numeric values may be specified as positive :ref:`integer numbers<amdgpu_synid_integer_number>`
or :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
Each message type supports specific operations: Each message type supports specific operations:
@ -60,16 +66,32 @@ Each message type supports specific operations:
\ SYSMSG_OP_TTRACE_PC 4 \- \ SYSMSG_OP_TTRACE_PC 4 \-
================= ========== ============================== ============ ========== ================= ========== ============================== ============ ==========
*Sendmsg* arguments are validated depending on how the *type* value is specified:
* If the message *type* is specified by name, argument values must satisfy the limitations detailed in the table above.
* If the message *type* is specified as a number, each argument must be within the corresponding value range (see the first table).
Examples: Examples:
.. parsed-literal:: .. parsed-literal::
// numeric message code
msg = 0x10
s_sendmsg 0x12 s_sendmsg 0x12
s_sendmsg msg + 2
// sendmsg with strict arguments validation
s_sendmsg sendmsg(MSG_INTERRUPT) s_sendmsg sendmsg(MSG_INTERRUPT)
s_sendmsg sendmsg(MSG_GET_DOORBELL)
s_sendmsg sendmsg(2, GS_OP_CUT)
s_sendmsg sendmsg(MSG_GS, GS_OP_EMIT) s_sendmsg sendmsg(MSG_GS, GS_OP_EMIT)
s_sendmsg sendmsg(MSG_GS, 2) s_sendmsg sendmsg(MSG_GS, 2)
s_sendmsg sendmsg(MSG_GS_DONE, GS_OP_EMIT_CUT, 1) s_sendmsg sendmsg(MSG_GS_DONE, GS_OP_EMIT_CUT, 1)
s_sendmsg sendmsg(MSG_SYSMSG, SYSMSG_OP_TTRACE_PC) s_sendmsg sendmsg(MSG_SYSMSG, SYSMSG_OP_TTRACE_PC)
s_sendmsg sendmsg(MSG_GET_DOORBELL)
// sendmsg with validation of value range only
msg = 2
op = 3
stream = 1
s_sendmsg sendmsg(msg, op, stream)
s_sendmsg sendmsg(2, GS_OP_CUT)
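The updated bit table (type in bits 3:0, operation in bits 6:4, stream in bits 9:8) can be turned into a small Python sketch that mirrors the "value range only" validation. The names `encode_sendmsg` and `decode_sendmsg` are hypothetical, and the sketch uses numeric ids only; it does not model the per-message checks applied when *type* is given by name.

```python
def encode_sendmsg(msg_type, op=0, stream=0):
    """Pack a message code: bits 3:0 = type, 6:4 = operation, 9:8 = stream."""
    if not 0 <= msg_type <= 15:
        raise ValueError("message type must be in the range 0..15")
    if not 0 <= op <= 7:
        raise ValueError("operation must be in the range 0..7")
    if not 0 <= stream <= 3:
        raise ValueError("stream id must be in the range 0..3")
    # Bit 7 and bits 15:10 are unused and stay zero.
    return msg_type | (op << 4) | (stream << 8)


def decode_sendmsg(value):
    """Split a 16-bit message code into (type, operation, stream)."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("operand must be in the range 0..0xFFFF")
    return value & 0xF, (value >> 4) & 0x7, (value >> 8) & 0x3


print(hex(encode_sendmsg(2, 3, 1)))  # 0x132
```

Under these assumptions the numeric operand `0x12` from the examples decodes to type 2 with operation 1.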

@ -12,7 +12,8 @@ imm3
A bit mask which indicates request permissions. A bit mask which indicates request permissions.
This operand must be specified as an :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 7 bits, but only 3 low bits are significant. This operand must be specified as an :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`.
The value is truncated to 7 bits, but only 3 low bits are significant.
============ ============================== ============ ==============================
Bit Number Description Bit Number Description

@ -10,5 +10,5 @@
imm16 imm16
=========================== ===========================
An :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 16 bits and then sign-extended to 32 bits. An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range -32768..65535.

@ -10,5 +10,5 @@
imm16 imm16
=========================== ===========================
An :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 16 bits and then zero-extended to 32 bits. An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range 0..65535.
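The two 16-bit operand ranges introduced in this commit (-32768..65535 for operands that accept negative values and 0..65535 for unsigned ones) can be illustrated with a Python sketch. The function `check_imm16` and its `allow_negative` parameter are hypothetical. Mapping an accepted value to its low 16 bits is an assumption about how the value is encoded, not something the commit states.

```python
def check_imm16(value, allow_negative=True):
    """Validate a 16-bit immediate and return its 16-bit field (assumed encoding)."""
    low = -32768 if allow_negative else 0
    if not low <= value <= 65535:
        raise ValueError(f"value must be in the range {low}..65535")
    # Assumption: negative values are stored in two's-complement form.
    return value & 0xFFFF


print(hex(check_imm16(-1)))  # 0xffff
```

Under this assumption `-1` and `65535` produce the same 16-bit field, which is consistent with one range covering both signed and unsigned spellings of a value.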

@ -14,30 +14,31 @@ Counts of outstanding instructions to wait for.
The bits of this operand have the following meaning: The bits of this operand have the following meaning:
============ ====================================================== ========== ========= ================================================ ============
Bits Description High Bits Low Bits Description Value Range
============ ====================================================== ========== ========= ================================================ ============
3:0 VM_CNT: vector memory operations count, lower bits. 15:14 3:0 VM_CNT: vector memory operations count. 0..63
6:4 EXP_CNT: export count. \- 6:4 EXP_CNT: export count. 0..7
11:8 LGKM_CNT: LDS, GDS, Constant and Message count. \- 11:8 LGKM_CNT: LDS, GDS, Constant and Message count. 0..15
15:14 VM_CNT: vector memory operations count, upper bits. ========== ========= ================================================ ============
============ ======================================================
This operand may be specified as a positive 16-bit :ref:`integer_number<amdgpu_synid_integer_number>` This operand may be specified as one of the following:
or as a combination of the following symbolic helpers:
* An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range 0..0xFFFF.
* A combination of *vmcnt*, *expcnt*, *lgkmcnt* and other values described below.
====================== ====================================================================== ====================== ======================================================================
Syntax Description Syntax Description
====================== ====================================================================== ====================== ======================================================================
vmcnt(<*N*>) VM_CNT value. *N* must not exceed the largest VM_CNT value. vmcnt(<*N*>) A VM_CNT value. *N* must not exceed the largest VM_CNT value.
expcnt(<*N*>) EXP_CNT value. *N* must not exceed the largest EXP_CNT value. expcnt(<*N*>) An EXP_CNT value. *N* must not exceed the largest EXP_CNT value.
lgkmcnt(<*N*>) LGKM_CNT value. *N* must not exceed the largest LGKM_CNT value. lgkmcnt(<*N*>) An LGKM_CNT value. *N* must not exceed the largest LGKM_CNT value.
vmcnt_sat(<*N*>) VM_CNT value computed as min(*N*, the largest VM_CNT value). vmcnt_sat(<*N*>) A VM_CNT value computed as min(*N*, the largest VM_CNT value).
expcnt_sat(<*N*>) EXP_CNT value computed as min(*N*, the largest EXP_CNT value). expcnt_sat(<*N*>) An EXP_CNT value computed as min(*N*, the largest EXP_CNT value).
lgkmcnt_sat(<*N*>) LGKM_CNT value computed as min(*N*, the largest LGKM_CNT value). lgkmcnt_sat(<*N*>) An LGKM_CNT value computed as min(*N*, the largest LGKM_CNT value).
====================== ====================================================================== ====================== ======================================================================
These helpers may be specified in any order. Ampersands and commas may be used as optional separators. These values may be specified in any order. Spaces, ampersands and commas may be used as optional separators.
*N* is either an *N* is either an
:ref:`integer number<amdgpu_synid_integer_number>` or an :ref:`integer number<amdgpu_synid_integer_number>` or an
@ -47,10 +48,18 @@ Examples:
.. parsed-literal:: .. parsed-literal::
s_waitcnt 0 vm_cnt = 1
exp_cnt = 2
lgkm_cnt = 3
cnt = vm_cnt | (exp_cnt << 4) | (lgkm_cnt << 8)
s_waitcnt cnt
s_waitcnt 1 | (2 << 4) | (3 << 8) // the same as above
s_waitcnt vmcnt(1) expcnt(2) lgkmcnt(3) // the same as above
s_waitcnt vmcnt(vm_cnt) expcnt(exp_cnt) lgkmcnt(lgkm_cnt) // the same as above
s_waitcnt vmcnt(1) s_waitcnt vmcnt(1)
s_waitcnt expcnt(2) lgkmcnt(3) s_waitcnt expcnt(2) lgkmcnt(3)
s_waitcnt vmcnt(1) expcnt(2) lgkmcnt(3)
s_waitcnt vmcnt(1), expcnt(2), lgkmcnt(3) s_waitcnt vmcnt(1), expcnt(2), lgkmcnt(3)
s_waitcnt vmcnt(1) & lgkmcnt_sat(100) & expcnt(2) s_waitcnt vmcnt(1) & lgkmcnt_sat(100) & expcnt(2)
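The split VM_CNT field (low bits 3:0, high bits 15:14) and the `*_sat` forms can be sketched in Python. The names `encode_waitcnt` and `saturate` are hypothetical. Treating an omitted counter as its largest value is an assumption made for this sketch, and unused bits are left as zero.

```python
VM_CNT_MAX = 63    # 6 bits: 3:0 (low) and 15:14 (high)
EXP_CNT_MAX = 7    # bits 6:4
LGKM_CNT_MAX = 15  # bits 11:8


def saturate(n, limit):
    """Model the *_sat forms: min(N, the largest counter value)."""
    return min(n, limit)


def encode_waitcnt(vmcnt=VM_CNT_MAX, expcnt=EXP_CNT_MAX, lgkmcnt=LGKM_CNT_MAX):
    """Pack the three counters into a 16-bit s_waitcnt operand."""
    for name, value, limit in (
        ("vmcnt", vmcnt, VM_CNT_MAX),
        ("expcnt", expcnt, EXP_CNT_MAX),
        ("lgkmcnt", lgkmcnt, LGKM_CNT_MAX),
    ):
        if not 0 <= value <= limit:
            raise ValueError(f"{name} must be in the range 0..{limit}")
    return (
        (vmcnt & 0xF)            # VM_CNT low bits  -> 3:0
        | (expcnt << 4)          # EXP_CNT          -> 6:4
        | (lgkmcnt << 8)         # LGKM_CNT         -> 11:8
        | ((vmcnt >> 4) << 14)   # VM_CNT high bits -> 15:14
    )


print(hex(encode_waitcnt(1, 2, 3)))  # 0x321
```

For `vm_cnt` values below 16 this agrees with the `vm_cnt | (exp_cnt << 4) | (lgkm_cnt << 8)` expression in the examples; larger VM_CNT values also need the high bits placed at 15:14.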

@ -10,5 +10,5 @@
imm16 imm16
=========================== ===========================
An :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 16 bits. A 16-bit :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range -32768..65535.

@ -10,5 +10,5 @@
imm32 imm32
=========================== ===========================
An :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 32 bits. An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value is truncated to 32 bits.

@ -21,7 +21,7 @@ Optionally may serve as an output data:
* :ref:`dmask<amdgpu_synid_dmask>` may specify 2 data elements for 32-bit-per-pixel surfaces or 4 data elements for 64-bit-per-pixel surfaces. Each data element occupies 1 dword. * :ref:`dmask<amdgpu_synid_dmask>` may specify 2 data elements for 32-bit-per-pixel surfaces or 4 data elements for 64-bit-per-pixel surfaces. Each data element occupies 1 dword.
* :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified. * :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
Note. The surface data format is indicated in the image resource constant but not in the instruction. Note: the surface data format is indicated in the image resource constant but not in the instruction.
*Operands:* :ref:`v<amdgpu_synid_v>` *Operands:* :ref:`v<amdgpu_synid_v>`

@ -21,6 +21,6 @@ Optionally may serve as an output data:
* :ref:`dmask<amdgpu_synid_dmask>` may specify 1 data element for 32-bit-per-pixel surfaces or 2 data elements for 64-bit-per-pixel surfaces. Each data element occupies 1 dword. * :ref:`dmask<amdgpu_synid_dmask>` may specify 1 data element for 32-bit-per-pixel surfaces or 2 data elements for 64-bit-per-pixel surfaces. Each data element occupies 1 dword.
* :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified. * :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
Note. The surface data format is indicated in the image resource constant but not in the instruction. Note: the surface data format is indicated in the image resource constant but not in the instruction.
*Operands:* :ref:`v<amdgpu_synid_v>` *Operands:* :ref:`v<amdgpu_synid_v>`

@ -10,5 +10,6 @@
imm32 imm32
=========================== ===========================
An :ref:`integer_number<amdgpu_synid_integer_number>` or a :ref:`floating-point_number<amdgpu_synid_floating-point_number>`. The value is converted to *f32* as described :ref:`here<amdgpu_synid_lit_conv>`. A :ref:`floating-point_number<amdgpu_synid_floating-point_number>`, an :ref:`integer_number<amdgpu_synid_integer_number>`, or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`.
The value is converted to *f32* as described :ref:`here<amdgpu_synid_fp_conv>`.

@ -14,18 +14,21 @@ Bits of a hardware register being accessed.
The bits of this operand have the following meaning: The bits of this operand have the following meaning:
============ =================================== ======= ===================== ============
Bits Description Bits Description Value Range
============ =================================== ======= ===================== ============
5:0 Register *id*. 5:0 Register *id*. 0..63
10:6 First bit *offset* (0..31). 10:6 First bit *offset*. 0..31
15:11 *Size* in bits (1..32). 15:11 *Size* in bits. 1..32
============ =================================== ======= ===================== ============
This operand may be specified as a positive 16-bit :ref:`integer_number<amdgpu_synid_integer_number>` or using the syntax described below. This operand may be specified as one of the following:
* An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range 0..0xFFFF.
* An *hwreg* value described below.
==================================== ============================================================================ ==================================== ============================================================================
Syntax Description Hwreg Value Syntax Description
==================================== ============================================================================ ==================================== ============================================================================
hwreg({0..63}) All bits of a register indicated by its *id*. hwreg({0..63}) All bits of a register indicated by its *id*.
hwreg(<*name*>) All bits of a register indicated by its *name*. hwreg(<*name*>) All bits of a register indicated by its *name*.
@ -33,7 +36,8 @@ This operand may be specified as a positive 16-bit :ref:`integer_number<amdgpu_s
hwreg(<*name*>, {0..31}, {1..32}) Register bits indicated by register *name*, first bit *offset* and *size*. hwreg(<*name*>, {0..31}, {1..32}) Register bits indicated by register *name*, first bit *offset* and *size*.
==================================== ============================================================================ ==================================== ============================================================================
Register *id*, *offset* and *size* must be specified as positive :ref:`integer numbers<amdgpu_synid_integer_number>`. Numeric values may be specified as positive :ref:`integer numbers<amdgpu_synid_integer_number>`
or :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
Defined register *names* include: Defined register *names* include:
@ -53,7 +57,16 @@ Examples:
.. parsed-literal:: .. parsed-literal::
s_getreg_b32 s2, 0x6 reg = 1
offset = 2
size = 4
hwreg_enc = reg | (offset << 6) | ((size - 1) << 11)
s_getreg_b32 s2, 0x1881
s_getreg_b32 s2, hwreg_enc // the same as above
s_getreg_b32 s2, hwreg(1, 2, 4) // the same as above
s_getreg_b32 s2, hwreg(reg, offset, size) // the same as above
s_getreg_b32 s2, hwreg(15) s_getreg_b32 s2, hwreg(15)
s_getreg_b32 s2, hwreg(51, 1, 31) s_getreg_b32 s2, hwreg(51, 1, 31)
s_getreg_b32 s2, hwreg(HW_REG_LDS_ALLOC, 0, 1) s_getreg_b32 s2, hwreg(HW_REG_LDS_ALLOC, 0, 1)

@ -12,19 +12,26 @@ label
A branch target which is a 16-bit signed integer treated as a PC-relative dword offset. A branch target which is a 16-bit signed integer treated as a PC-relative dword offset.
This operand may be specified as: This operand may be specified as one of the following:
* An :ref:`integer_number<amdgpu_synid_integer_number>`. The number is truncated to 16 bits. * An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range -32768..65535.
* An :ref:`absolute_expression<amdgpu_synid_absolute_expression>` which must start with an :ref:`integer_number<amdgpu_synid_integer_number>`. The value of the expression is truncated to 16 bits. * A :ref:`symbol<amdgpu_synid_symbol>` (for example, a label) representing a relocatable address in the same compilation unit where it is referred from. The value is handled as a 16-bit PC-relative dword offset to be resolved by a linker.
* A :ref:`symbol<amdgpu_synid_symbol>` (for example, a label). The value is handled as a 16-bit PC-relative dword offset to be resolved by a linker.
Examples: Examples:
.. parsed-literal:: .. parsed-literal::
offset = 30 offset = 30
s_branch loop_end label_1:
s_branch 2 + offset label_2 = . + 4
s_branch 32
loop_end: s_branch 32
s_branch offset + 2
s_branch label_1
s_branch label_2
s_branch label_3
s_branch label_4
label_3 = label_2 + 4
label_4:

@ -12,24 +12,29 @@ msg
A 16-bit message code. The bits of this operand have the following meaning: A 16-bit message code. The bits of this operand have the following meaning:
============ ====================================================== ============ =============================== ===============
Bits Description Bits Description Value Range
============ ====================================================== ============ =============================== ===============
3:0 Message *type*. 3:0 Message *type*. 0..15
6:4 Optional *operation*. 6:4 Optional *operation*. 0..7
9:7 Optional *parameters*. 7:7 Unused. \-
15:10 Unused. 9:8 Optional *stream*. 0..3
============ ====================================================== 15:10 Unused. \-
============ =============================== ===============
This operand may be specified as a positive 16-bit :ref:`integer_number<amdgpu_synid_integer_number>` or using the syntax described below: This operand may be specified as one of the following:
======================================== ======================================================================== * An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range 0..0xFFFF.
Syntax Description * A *sendmsg* value described below.
======================================== ========================================================================
sendmsg(<*type*>) A message identified by its *type*. ==================================== ====================================================
sendmsg(<*type*>, <*op*>) A message identified by its *type* and *operation*. Sendmsg Value Syntax Description
sendmsg(<*type*>, <*op*>, <*stream*>) A message identified by its *type* and *operation* with a stream *id*. ==================================== ====================================================
======================================== ======================================================================== sendmsg(<*type*>) A message identified by its *type*.
sendmsg(<*type*>,<*op*>) A message identified by its *type* and *operation*.
sendmsg(<*type*>,<*op*>,<*stream*>) A message identified by its *type* and *operation*
with a stream *id*.
==================================== ====================================================
*Type* may be specified using message *name* or message *id*. *Type* may be specified using message *name* or message *id*.
@ -37,7 +42,8 @@ This operand may be specified as a positive 16-bit :ref:`integer_number<amdgpu_s
Stream *id* is an integer in the range 0..3. Stream *id* is an integer in the range 0..3.
Message *id*, operation *id* and stream *id* must be specified as positive :ref:`integer numbers<amdgpu_synid_integer_number>`. Numeric values may be specified as positive :ref:`integer numbers<amdgpu_synid_integer_number>`
or :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
Each message type supports specific operations: Each message type supports specific operations:
@ -58,15 +64,31 @@ Each message type supports specific operations:
\ SYSMSG_OP_TTRACE_PC 4 \- \ SYSMSG_OP_TTRACE_PC 4 \-
================= ========== ============================== ============ ========== ================= ========== ============================== ============ ==========
*Sendmsg* arguments are validated depending on how *type* value is specified:
* If message *type* is specified by name, arguments values must satisfy limitations detailed in the table above.
* If message *type* is specified as a number, each argument must not exceed corresponding value range (see the first table).
Examples: Examples:
.. parsed-literal:: .. parsed-literal::
// numeric message code
msg = 0x10
s_sendmsg 0x12 s_sendmsg 0x12
s_sendmsg msg + 2
// sendmsg with strict arguments validation
s_sendmsg sendmsg(MSG_INTERRUPT) s_sendmsg sendmsg(MSG_INTERRUPT)
s_sendmsg sendmsg(2, GS_OP_CUT)
s_sendmsg sendmsg(MSG_GS, GS_OP_EMIT) s_sendmsg sendmsg(MSG_GS, GS_OP_EMIT)
s_sendmsg sendmsg(MSG_GS, 2) s_sendmsg sendmsg(MSG_GS, 2)
s_sendmsg sendmsg(MSG_GS_DONE, GS_OP_EMIT_CUT, 1) s_sendmsg sendmsg(MSG_GS_DONE, GS_OP_EMIT_CUT, 1)
s_sendmsg sendmsg(MSG_SYSMSG, SYSMSG_OP_TTRACE_PC) s_sendmsg sendmsg(MSG_SYSMSG, SYSMSG_OP_TTRACE_PC)
// sendmsg with validation of value range only
msg = 2
op = 3
stream = 1
s_sendmsg sendmsg(msg, op, stream)
s_sendmsg sendmsg(2, GS_OP_CUT)
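
The bit layout in the first table can be illustrated with a small Python sketch. It is not the assembler's implementation: the function name `encode_sendmsg` is hypothetical, and it only applies the per-field value ranges (type 0..15, operation 0..7, stream 0..3), which corresponds to the "validation of value range only" case above.

```python
def encode_sendmsg(msg_type, op=0, stream=0):
    """Pack a message code: type in bits 3:0, operation in 6:4, stream in 9:8."""
    # Range checks taken from the "Value Range" column of the bit table.
    if not 0 <= msg_type <= 15:
        raise ValueError("message type must be in the range 0..15")
    if not 0 <= op <= 7:
        raise ValueError("operation must be in the range 0..7")
    if not 0 <= stream <= 3:
        raise ValueError("stream must be in the range 0..3")
    # Bit 7 and bits 15:10 are unused and stay zero.
    return msg_type | (op << 4) | (stream << 8)


print(hex(encode_sendmsg(2, 1)))     # type 2, operation 1
print(hex(encode_sendmsg(2, 3, 1)))  # type 2, operation 3, stream 1
```

Under this layout, type 2 with operation 1 packs to 0x12, the same numeric code used in the first example.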


@ -10,5 +10,5 @@
imm16 imm16
=========================== ===========================
An :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 16 bits and then sign-extended to 32 bits. An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range -32768..65535.


@ -10,5 +10,5 @@
imm16 imm16
=========================== ===========================
An :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 16 bits and then zero-extended to 32 bits. An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range 0..65535.


@ -14,29 +14,31 @@ Counts of outstanding instructions to wait for.
The bits of this operand have the following meaning: The bits of this operand have the following meaning:
============ ====================================================== ===== ================================================ ============
Bits Description Bits Description Value Range
============ ====================================================== ===== ================================================ ============
3:0 VM_CNT: vector memory operations count. 3:0 VM_CNT: vector memory operations count. 0..15
6:4 EXP_CNT: export count. 6:4 EXP_CNT: export count. 0..7
12:8 LGKM_CNT: LDS, GDS, Constant and Message count. 12:8 LGKM_CNT: LDS, GDS, Constant and Message count. 0..31
============ ====================================================== ===== ================================================ ============
This operand may be specified as a positive 16-bit :ref:`integer_number<amdgpu_synid_integer_number>` This operand may be specified as one of the following:
or as a combination of the following symbolic helpers:
* An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range 0..0xFFFF.
* A combination of *vmcnt*, *expcnt*, *lgkmcnt* and other values described below.
====================== ====================================================================== ====================== ======================================================================
Syntax Description Syntax Description
====================== ====================================================================== ====================== ======================================================================
vmcnt(<*N*>) VM_CNT value. *N* must not exceed the largest VM_CNT value. vmcnt(<*N*>) A VM_CNT value. *N* must not exceed the largest VM_CNT value.
expcnt(<*N*>) EXP_CNT value. *N* must not exceed the largest EXP_CNT value. expcnt(<*N*>) An EXP_CNT value. *N* must not exceed the largest EXP_CNT value.
lgkmcnt(<*N*>) LGKM_CNT value. *N* must not exceed the largest LGKM_CNT value. lgkmcnt(<*N*>) An LGKM_CNT value. *N* must not exceed the largest LGKM_CNT value.
vmcnt_sat(<*N*>) VM_CNT value computed as min(*N*, the largest VM_CNT value). vmcnt_sat(<*N*>) A VM_CNT value computed as min(*N*, the largest VM_CNT value).
expcnt_sat(<*N*>) EXP_CNT value computed as min(*N*, the largest EXP_CNT value). expcnt_sat(<*N*>) An EXP_CNT value computed as min(*N*, the largest EXP_CNT value).
lgkmcnt_sat(<*N*>) LGKM_CNT value computed as min(*N*, the largest LGKM_CNT value). lgkmcnt_sat(<*N*>) An LGKM_CNT value computed as min(*N*, the largest LGKM_CNT value).
====================== ====================================================================== ====================== ======================================================================
These helpers may be specified in any order. Ampersands and commas may be used as optional separators. These values may be specified in any order. Spaces, ampersands and commas may be used as optional separators.
*N* is either an *N* is either an
:ref:`integer number<amdgpu_synid_integer_number>` or an :ref:`integer number<amdgpu_synid_integer_number>` or an
@ -46,10 +48,18 @@ Examples:
.. parsed-literal:: .. parsed-literal::
s_waitcnt 0 vm_cnt = 1
exp_cnt = 2
lgkm_cnt = 3
cnt = vm_cnt | (exp_cnt << 4) | (lgkm_cnt << 8)
s_waitcnt cnt
s_waitcnt 1 | (2 << 4) | (3 << 8) // the same as above
s_waitcnt vmcnt(1) expcnt(2) lgkmcnt(3) // the same as above
s_waitcnt vmcnt(vm_cnt) expcnt(exp_cnt) lgkmcnt(lgkm_cnt) // the same as above
s_waitcnt vmcnt(1) s_waitcnt vmcnt(1)
s_waitcnt expcnt(2) lgkmcnt(3) s_waitcnt expcnt(2) lgkmcnt(3)
s_waitcnt vmcnt(1) expcnt(2) lgkmcnt(3)
s_waitcnt vmcnt(1), expcnt(2), lgkmcnt(3) s_waitcnt vmcnt(1), expcnt(2), lgkmcnt(3)
s_waitcnt vmcnt(1) & lgkmcnt_sat(100) & expcnt(2) s_waitcnt vmcnt(1) & lgkmcnt_sat(100) & expcnt(2)
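
As a rough illustration of how the symbolic counters map onto the numeric operand, here is a Python sketch based on the bit table above (VM_CNT in 3:0, EXP_CNT in 6:4, LGKM_CNT starting at bit 8). The names `encode_waitcnt` and `sat` are hypothetical, and the sketch requires all three counters to be given explicitly; it makes no claim about how the assembler treats omitted counters.

```python
VM_MAX = 15   # largest VM_CNT value (bits 3:0)
EXP_MAX = 7   # largest EXP_CNT value (bits 6:4)


def sat(n, largest):
    """Mirror the *_sat helpers: clamp n to the largest value of the field."""
    return min(n, largest)


def encode_waitcnt(vm_cnt, exp_cnt, lgkm_cnt, lgkm_max=31):
    """Pack the three counters; lgkm_max is 31 for the 12:8 layout shown here."""
    fields = (
        ("vmcnt", vm_cnt, VM_MAX),
        ("expcnt", exp_cnt, EXP_MAX),
        ("lgkmcnt", lgkm_cnt, lgkm_max),
    )
    for name, value, largest in fields:
        if not 0 <= value <= largest:
            raise ValueError(f"{name}({value}) must be in the range 0..{largest}")
    return vm_cnt | (exp_cnt << 4) | (lgkm_cnt << 8)


print(hex(encode_waitcnt(1, 2, 3)))             # same fields as vmcnt(1) expcnt(2) lgkmcnt(3)
print(hex(encode_waitcnt(1, 2, sat(100, 31))))  # lgkmcnt_sat(100) clamps to 31
```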


@ -10,5 +10,5 @@
imm16 imm16
=========================== ===========================
An :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 16 bits. A 16-bit :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range -32768..65535.


@ -10,5 +10,5 @@
imm32 imm32
=========================== ===========================
An :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 32 bits. An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value is truncated to 32 bits.


@ -21,7 +21,7 @@ Optionally may serve as an output data:
* :ref:`dmask<amdgpu_synid_dmask>` may specify 2 data elements for 32-bit-per-pixel surfaces or 4 data elements for 64-bit-per-pixel surfaces. Each data element occupies 1 dword. * :ref:`dmask<amdgpu_synid_dmask>` may specify 2 data elements for 32-bit-per-pixel surfaces or 4 data elements for 64-bit-per-pixel surfaces. Each data element occupies 1 dword.
* :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified. * :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
Note. The surface data format is indicated in the image resource constant but not in the instruction. Note: the surface data format is indicated in the image resource constant but not in the instruction.
*Operands:* :ref:`v<amdgpu_synid_v>` *Operands:* :ref:`v<amdgpu_synid_v>`


@ -21,6 +21,6 @@ Optionally may serve as an output data:
* :ref:`dmask<amdgpu_synid_dmask>` may specify 1 data element for 32-bit-per-pixel surfaces or 2 data elements for 64-bit-per-pixel surfaces. Each data element occupies 1 dword. * :ref:`dmask<amdgpu_synid_dmask>` may specify 1 data element for 32-bit-per-pixel surfaces or 2 data elements for 64-bit-per-pixel surfaces. Each data element occupies 1 dword.
* :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified. * :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
Note. The surface data format is indicated in the image resource constant but not in the instruction. Note: the surface data format is indicated in the image resource constant but not in the instruction.
*Operands:* :ref:`v<amdgpu_synid_v>` *Operands:* :ref:`v<amdgpu_synid_v>`


@ -10,5 +10,6 @@
imm32 imm32
=========================== ===========================
An :ref:`integer_number<amdgpu_synid_integer_number>` or a :ref:`floating-point_number<amdgpu_synid_floating-point_number>`. The number is converted to *f16* as described :ref:`here<amdgpu_synid_lit_conv>`. A :ref:`floating-point_number<amdgpu_synid_floating-point_number>`, an :ref:`integer_number<amdgpu_synid_integer_number>`, or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`.
The value is converted to *f16* as described :ref:`here<amdgpu_synid_fp_conv>`.


@ -10,5 +10,6 @@
imm32 imm32
=========================== ===========================
An :ref:`integer_number<amdgpu_synid_integer_number>` or a :ref:`floating-point_number<amdgpu_synid_floating-point_number>`. The value is converted to *f32* as described :ref:`here<amdgpu_synid_lit_conv>`. A :ref:`floating-point_number<amdgpu_synid_floating-point_number>`, an :ref:`integer_number<amdgpu_synid_integer_number>`, or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`.
The value is converted to *f32* as described :ref:`here<amdgpu_synid_fp_conv>`.


@ -14,18 +14,21 @@ Bits of a hardware register being accessed.
The bits of this operand have the following meaning: The bits of this operand have the following meaning:
============ =================================== ======= ===================== ============
Bits Description Bits Description Value Range
============ =================================== ======= ===================== ============
5:0 Register *id*. 5:0 Register *id*. 0..63
10:6 First bit *offset* (0..31). 10:6 First bit *offset*. 0..31
15:11 *Size* in bits (1..32). 15:11 *Size* in bits. 1..32
============ =================================== ======= ===================== ============
This operand may be specified as a positive 16-bit :ref:`integer_number<amdgpu_synid_integer_number>` or using the syntax described below. This operand may be specified as one of the following:
* An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range 0..0xFFFF.
* An *hwreg* value described below.
==================================== ============================================================================ ==================================== ============================================================================
Syntax Description Hwreg Value Syntax Description
==================================== ============================================================================ ==================================== ============================================================================
hwreg({0..63}) All bits of a register indicated by its *id*. hwreg({0..63}) All bits of a register indicated by its *id*.
hwreg(<*name*>) All bits of a register indicated by its *name*. hwreg(<*name*>) All bits of a register indicated by its *name*.
@ -33,7 +36,8 @@ This operand may be specified as a positive 16-bit :ref:`integer_number<amdgpu_s
hwreg(<*name*>, {0..31}, {1..32}) Register bits indicated by register *name*, first bit *offset* and *size*. hwreg(<*name*>, {0..31}, {1..32}) Register bits indicated by register *name*, first bit *offset* and *size*.
==================================== ============================================================================ ==================================== ============================================================================
Register *id*, *offset* and *size* must be specified as positive :ref:`integer numbers<amdgpu_synid_integer_number>`. Numeric values may be specified as positive :ref:`integer numbers<amdgpu_synid_integer_number>`
or :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
Defined register *names* include: Defined register *names* include:
@ -53,7 +57,16 @@ Examples:
.. parsed-literal:: .. parsed-literal::
s_getreg_b32 s2, 0x6 reg = 1
offset = 2
size = 4
hwreg_enc = reg | (offset << 6) | ((size - 1) << 11)
s_getreg_b32 s2, 0x1881
s_getreg_b32 s2, hwreg_enc // the same as above
s_getreg_b32 s2, hwreg(1, 2, 4) // the same as above
s_getreg_b32 s2, hwreg(reg, offset, size) // the same as above
s_getreg_b32 s2, hwreg(15) s_getreg_b32 s2, hwreg(15)
s_getreg_b32 s2, hwreg(51, 1, 31) s_getreg_b32 s2, hwreg(51, 1, 31)
s_getreg_b32 s2, hwreg(HW_REG_LDS_ALLOC, 0, 1) s_getreg_b32 s2, hwreg(HW_REG_LDS_ALLOC, 0, 1)
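
The `hwreg_enc` expression in the examples can be checked with a short Python sketch. The function name `encode_hwreg` is hypothetical. The defaults (offset 0, size 32) are an assumption made here to model "all bits of a register"; they are not taken from the assembler sources.

```python
def encode_hwreg(reg, offset=0, size=32):
    """Pack a hwreg operand: id in bits 5:0, offset in 10:6, (size - 1) in 15:11."""
    if not 0 <= reg <= 63:
        raise ValueError("register id must be in the range 0..63")
    if not 0 <= offset <= 31:
        raise ValueError("offset must be in the range 0..31")
    if not 1 <= size <= 32:
        raise ValueError("size must be in the range 1..32")
    # The size field stores size - 1, so that 1..32 fits in five bits.
    return reg | (offset << 6) | ((size - 1) << 11)


print(hex(encode_hwreg(1, 2, 4)))  # matches the literal 0x1881 used above
print(hex(encode_hwreg(15)))       # whole register, under the assumed defaults
```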


@ -0,0 +1,66 @@
..
**************************************************
* *
* Automatically generated file, do not edit! *
* *
**************************************************
.. _amdgpu_synid8_imask:
imask
===========================
This operand is a mask which controls indexing mode for operands of subsequent instructions.
Bits 0, 1 and 2 control indexing of *src0*, *src1* and *src2*, while bit 3 controls indexing of *dst*.
Value 1 enables indexing and value 0 disables it.
===== ========================================
Bit Meaning
===== ========================================
0 Enables or disables *src0* indexing.
1 Enables or disables *src1* indexing.
2 Enables or disables *src2* indexing.
3 Enables or disables *dst* indexing.
===== ========================================
This operand may be specified as one of the following:
* An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range 0..15.
* A *gpr_idx* value described below.
==================================== ===========================================
Gpr_idx Value Syntax Description
==================================== ===========================================
gpr_idx(*<operands>*) Enable indexing for specified *operands*
and disable it for the rest.
*Operands* is a comma-separated list of
values which may include:
* "SRC0" - enable *src0* indexing.
* "SRC1" - enable *src1* indexing.
* "SRC2" - enable *src2* indexing.
* "DST" - enable *dst* indexing.
Each of these values may be specified only
once.
*Operands* list may be empty; this syntax
disables indexing for all operands.
==================================== ===========================================
Examples:
.. parsed-literal::
s_set_gpr_idx_mode 0
s_set_gpr_idx_mode gpr_idx() // the same as above
s_set_gpr_idx_mode 15
s_set_gpr_idx_mode gpr_idx(DST,SRC0,SRC1,SRC2) // the same as above
s_set_gpr_idx_mode gpr_idx(SRC0,SRC1,SRC2,DST) // the same as above
s_set_gpr_idx_mode gpr_idx(DST,SRC1)
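
The relation between the *gpr_idx* syntax and the numeric mask can be sketched in Python as follows. This is only an illustration of the bit table above; the names `GPR_IDX_BITS` and `gpr_idx` are chosen for the example and do not refer to anything in the assembler.

```python
# Bit assigned to each operand name, per the table above.
GPR_IDX_BITS = {"SRC0": 1 << 0, "SRC1": 1 << 1, "SRC2": 1 << 2, "DST": 1 << 3}


def gpr_idx(*operands):
    """Build the 4-bit mask: listed operands are enabled, all others disabled."""
    mask = 0
    for name in operands:
        if name not in GPR_IDX_BITS:
            raise ValueError(f"unknown operand {name!r}")
        bit = GPR_IDX_BITS[name]
        if mask & bit:
            # Each value may be specified only once.
            raise ValueError(f"{name} specified more than once")
        mask |= bit
    return mask


print(gpr_idx())                               # empty list disables all indexing
print(gpr_idx("DST", "SRC0", "SRC1", "SRC2"))  # order does not matter
print(gpr_idx("DST", "SRC1"))
```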


@ -1,25 +0,0 @@
..
**************************************************
* *
* Automatically generated file, do not edit! *
* *
**************************************************
.. _amdgpu_synid8_imm4:
imm4
===========================
A positive :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 4 bits.
This operand is a mask which controls indexing mode for operands of subsequent instructions. Value 1 enables indexing and value 0 disables it.
============ ========================================
Bit Meaning
============ ========================================
0 Enables or disables *src0* indexing.
1 Enables or disables *src1* indexing.
2 Enables or disables *src2* indexing.
3 Enables or disables *dst* indexing.
============ ========================================


@ -12,19 +12,26 @@ label
A branch target which is a 16-bit signed integer treated as a PC-relative dword offset. A branch target which is a 16-bit signed integer treated as a PC-relative dword offset.
This operand may be specified as: This operand may be specified as one of the following:
* An :ref:`integer_number<amdgpu_synid_integer_number>`. The number is truncated to 16 bits. * An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range -32768..65535.
* An :ref:`absolute_expression<amdgpu_synid_absolute_expression>` which must start with an :ref:`integer_number<amdgpu_synid_integer_number>`. The value of the expression is truncated to 16 bits. * A :ref:`symbol<amdgpu_synid_symbol>` (for example, a label) representing a relocatable address in the same compilation unit where it is referred from. The value is handled as a 16-bit PC-relative dword offset to be resolved by a linker.
* A :ref:`symbol<amdgpu_synid_symbol>` (for example, a label). The value is handled as a 16-bit PC-relative dword offset to be resolved by a linker.
Examples: Examples:
.. parsed-literal:: .. parsed-literal::
offset = 30 offset = 30
s_branch loop_end label_1:
s_branch 2 + offset label_2 = . + 4
s_branch 32
loop_end: s_branch 32
s_branch offset + 2
s_branch label_1
s_branch label_2
s_branch label_3
s_branch label_4
label_3 = label_2 + 4
label_4:


@ -12,24 +12,29 @@ msg
A 16-bit message code. The bits of this operand have the following meaning: A 16-bit message code. The bits of this operand have the following meaning:
============ ====================================================== ============ =============================== ===============
Bits Description Bits Description Value Range
============ ====================================================== ============ =============================== ===============
3:0 Message *type*. 3:0 Message *type*. 0..15
6:4 Optional *operation*. 6:4 Optional *operation*. 0..7
9:7 Optional *parameters*. 7:7 Unused. \-
15:10 Unused. 9:8 Optional *stream*. 0..3
============ ====================================================== 15:10 Unused. \-
============ =============================== ===============
This operand may be specified as a positive 16-bit :ref:`integer_number<amdgpu_synid_integer_number>` or using the syntax described below: This operand may be specified as one of the following:
======================================== ======================================================================== * An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range 0..0xFFFF.
Syntax Description * A *sendmsg* value described below.
======================================== ========================================================================
sendmsg(<*type*>) A message identified by its *type*. ==================================== ====================================================
sendmsg(<*type*>, <*op*>) A message identified by its *type* and *operation*. Sendmsg Value Syntax Description
sendmsg(<*type*>, <*op*>, <*stream*>) A message identified by its *type* and *operation* with a stream *id*. ==================================== ====================================================
======================================== ======================================================================== sendmsg(<*type*>) A message identified by its *type*.
sendmsg(<*type*>,<*op*>) A message identified by its *type* and *operation*.
sendmsg(<*type*>,<*op*>,<*stream*>) A message identified by its *type* and *operation*
with a stream *id*.
==================================== ====================================================
*Type* may be specified using message *name* or message *id*. *Type* may be specified using message *name* or message *id*.
@ -37,7 +42,8 @@ This operand may be specified as a positive 16-bit :ref:`integer_number<amdgpu_s
Stream *id* is an integer in the range 0..3. Stream *id* is an integer in the range 0..3.
Message *id*, operation *id* and stream *id* must be specified as positive :ref:`integer numbers<amdgpu_synid_integer_number>`. Numeric values may be specified as positive :ref:`integer numbers<amdgpu_synid_integer_number>`
or :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
Each message type supports specific operations: Each message type supports specific operations:
@ -58,15 +64,31 @@ Each message type supports specific operations:
\ SYSMSG_OP_TTRACE_PC 4 \- \ SYSMSG_OP_TTRACE_PC 4 \-
================= ========== ============================== ============ ========== ================= ========== ============================== ============ ==========
*Sendmsg* arguments are validated depending on how *type* value is specified:
* If message *type* is specified by name, arguments values must satisfy limitations detailed in the table above.
* If message *type* is specified as a number, each argument must not exceed corresponding value range (see the first table).
Examples: Examples:
.. parsed-literal:: .. parsed-literal::
// numeric message code
msg = 0x10
s_sendmsg 0x12 s_sendmsg 0x12
s_sendmsg msg + 2
// sendmsg with strict arguments validation
s_sendmsg sendmsg(MSG_INTERRUPT) s_sendmsg sendmsg(MSG_INTERRUPT)
s_sendmsg sendmsg(2, GS_OP_CUT)
s_sendmsg sendmsg(MSG_GS, GS_OP_EMIT) s_sendmsg sendmsg(MSG_GS, GS_OP_EMIT)
s_sendmsg sendmsg(MSG_GS, 2) s_sendmsg sendmsg(MSG_GS, 2)
s_sendmsg sendmsg(MSG_GS_DONE, GS_OP_EMIT_CUT, 1) s_sendmsg sendmsg(MSG_GS_DONE, GS_OP_EMIT_CUT, 1)
s_sendmsg sendmsg(MSG_SYSMSG, SYSMSG_OP_TTRACE_PC) s_sendmsg sendmsg(MSG_SYSMSG, SYSMSG_OP_TTRACE_PC)
// sendmsg with validation of value range only
msg = 2
op = 3
stream = 1
s_sendmsg sendmsg(msg, op, stream)
s_sendmsg sendmsg(2, GS_OP_CUT)


@ -12,7 +12,8 @@ imm3
A bit mask which indicates request permissions. A bit mask which indicates request permissions.
This operand must be specified as an :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 7 bits, but only 3 low bits are significant. This operand must be specified as an :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`.
The value is truncated to 7 bits, but only 3 low bits are significant.
============ ============================== ============ ==============================
Bit Number Description Bit Number Description


@ -10,5 +10,5 @@
imm16 imm16
=========================== ===========================
An :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 16 bits and then sign-extended to 32 bits. An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range -32768..65535.


@ -10,5 +10,5 @@
imm16 imm16
=========================== ===========================
An :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 16 bits and then zero-extended to 32 bits. An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range 0..65535.


@ -14,29 +14,31 @@ Counts of outstanding instructions to wait for.
The bits of this operand have the following meaning: The bits of this operand have the following meaning:
============ ====================================================== ===== ================================================ ============
Bits Description Bits Description Value Range
============ ====================================================== ===== ================================================ ============
3:0 VM_CNT: vector memory operations count. 3:0 VM_CNT: vector memory operations count. 0..15
6:4 EXP_CNT: export count. 6:4 EXP_CNT: export count. 0..7
11:8 LGKM_CNT: LDS, GDS, Constant and Message count. 11:8 LGKM_CNT: LDS, GDS, Constant and Message count. 0..15
============ ====================================================== ===== ================================================ ============
This operand may be specified as a positive 16-bit :ref:`integer_number<amdgpu_synid_integer_number>` This operand may be specified as one of the following:
or as a combination of the following symbolic helpers:
* An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range 0..0xFFFF.
* A combination of *vmcnt*, *expcnt*, *lgkmcnt* and other values described below.
====================== ====================================================================== ====================== ======================================================================
Syntax Description Syntax Description
====================== ====================================================================== ====================== ======================================================================
vmcnt(<*N*>) VM_CNT value. *N* must not exceed the largest VM_CNT value. vmcnt(<*N*>) A VM_CNT value. *N* must not exceed the largest VM_CNT value.
expcnt(<*N*>) EXP_CNT value. *N* must not exceed the largest EXP_CNT value. expcnt(<*N*>) An EXP_CNT value. *N* must not exceed the largest EXP_CNT value.
lgkmcnt(<*N*>) LGKM_CNT value. *N* must not exceed the largest LGKM_CNT value. lgkmcnt(<*N*>) An LGKM_CNT value. *N* must not exceed the largest LGKM_CNT value.
vmcnt_sat(<*N*>) VM_CNT value computed as min(*N*, the largest VM_CNT value). vmcnt_sat(<*N*>) A VM_CNT value computed as min(*N*, the largest VM_CNT value).
expcnt_sat(<*N*>) EXP_CNT value computed as min(*N*, the largest EXP_CNT value). expcnt_sat(<*N*>) An EXP_CNT value computed as min(*N*, the largest EXP_CNT value).
lgkmcnt_sat(<*N*>) LGKM_CNT value computed as min(*N*, the largest LGKM_CNT value). lgkmcnt_sat(<*N*>) An LGKM_CNT value computed as min(*N*, the largest LGKM_CNT value).
====================== ====================================================================== ====================== ======================================================================
These helpers may be specified in any order. Ampersands and commas may be used as optional separators. These values may be specified in any order. Spaces, ampersands and commas may be used as optional separators.
*N* is either an *N* is either an
:ref:`integer number<amdgpu_synid_integer_number>` or an :ref:`integer number<amdgpu_synid_integer_number>` or an
@ -46,10 +48,18 @@ Examples:
.. parsed-literal:: .. parsed-literal::
s_waitcnt 0 vm_cnt = 1
exp_cnt = 2
lgkm_cnt = 3
cnt = vm_cnt | (exp_cnt << 4) | (lgkm_cnt << 8)
s_waitcnt cnt
s_waitcnt 1 | (2 << 4) | (3 << 8) // the same as above
s_waitcnt vmcnt(1) expcnt(2) lgkmcnt(3) // the same as above
s_waitcnt vmcnt(vm_cnt) expcnt(exp_cnt) lgkmcnt(lgkm_cnt) // the same as above
s_waitcnt vmcnt(1) s_waitcnt vmcnt(1)
s_waitcnt expcnt(2) lgkmcnt(3) s_waitcnt expcnt(2) lgkmcnt(3)
s_waitcnt vmcnt(1) expcnt(2) lgkmcnt(3)
s_waitcnt vmcnt(1), expcnt(2), lgkmcnt(3) s_waitcnt vmcnt(1), expcnt(2), lgkmcnt(3)
s_waitcnt vmcnt(1) & lgkmcnt_sat(100) & expcnt(2) s_waitcnt vmcnt(1) & lgkmcnt_sat(100) & expcnt(2)
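
For the layout in this file, where LGKM_CNT occupies bits 11:8 (range 0..15), a numeric operand can be split back into its counters. The Python sketch below is illustrative only; `decode_waitcnt` is a hypothetical name and the masks simply restate the bit table above.

```python
def decode_waitcnt(value):
    """Split a numeric s_waitcnt operand into counters (LGKM_CNT in bits 11:8)."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("operand must be in the range 0..0xFFFF")
    return {
        "vmcnt": value & 0xF,           # bits 3:0
        "expcnt": (value >> 4) & 0x7,   # bits 6:4
        "lgkmcnt": (value >> 8) & 0xF,  # bits 11:8
    }


cnt = 1 | (2 << 4) | (3 << 8)
print(decode_waitcnt(cnt))  # recovers the vm_cnt, exp_cnt and lgkm_cnt fields
```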


@ -10,5 +10,5 @@
imm16 imm16
=========================== ===========================
An :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 16 bits. A 16-bit :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range -32768..65535.


@ -10,5 +10,5 @@
imm32 imm32
=========================== ===========================
An :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 32 bits. An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value is truncated to 32 bits.


@ -21,7 +21,7 @@ Optionally may serve as an output data:
* :ref:`dmask<amdgpu_synid_dmask>` may specify 2 data elements for 32-bit-per-pixel surfaces or 4 data elements for 64-bit-per-pixel surfaces. Each data element occupies 1 dword. * :ref:`dmask<amdgpu_synid_dmask>` may specify 2 data elements for 32-bit-per-pixel surfaces or 4 data elements for 64-bit-per-pixel surfaces. Each data element occupies 1 dword.
* :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified. * :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
Note. The surface data format is indicated in the image resource constant but not in the instruction. Note: the surface data format is indicated in the image resource constant but not in the instruction.
*Operands:* :ref:`v<amdgpu_synid_v>` *Operands:* :ref:`v<amdgpu_synid_v>`


@ -21,6 +21,6 @@ Optionally may serve as an output data:
* :ref:`dmask<amdgpu_synid_dmask>` may specify 1 data element for 32-bit-per-pixel surfaces or 2 data elements for 64-bit-per-pixel surfaces. Each data element occupies 1 dword. * :ref:`dmask<amdgpu_synid_dmask>` may specify 1 data element for 32-bit-per-pixel surfaces or 2 data elements for 64-bit-per-pixel surfaces. Each data element occupies 1 dword.
* :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified. * :ref:`tfe<amdgpu_synid_tfe>` adds 1 dword if specified.
Note. The surface data format is indicated in the image resource constant but not in the instruction. Note: the surface data format is indicated in the image resource constant but not in the instruction.
*Operands:* :ref:`v<amdgpu_synid_v>` *Operands:* :ref:`v<amdgpu_synid_v>`


@ -10,5 +10,6 @@
imm32 imm32
=========================== ===========================
An :ref:`integer_number<amdgpu_synid_integer_number>` or a :ref:`floating-point_number<amdgpu_synid_floating-point_number>`. The number is converted to *f16* as described :ref:`here<amdgpu_synid_lit_conv>`. A :ref:`floating-point_number<amdgpu_synid_floating-point_number>`, an :ref:`integer_number<amdgpu_synid_integer_number>`, or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`.
The value is converted to *f16* as described :ref:`here<amdgpu_synid_fp_conv>`.


@ -10,5 +10,6 @@
imm32 imm32
=========================== ===========================
An :ref:`integer_number<amdgpu_synid_integer_number>` or a :ref:`floating-point_number<amdgpu_synid_floating-point_number>`. The value is converted to *f32* as described :ref:`here<amdgpu_synid_lit_conv>`. A :ref:`floating-point_number<amdgpu_synid_floating-point_number>`, an :ref:`integer_number<amdgpu_synid_integer_number>`, or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`.
The value is converted to *f32* as described :ref:`here<amdgpu_synid_fp_conv>`.


@ -14,18 +14,21 @@ Bits of a hardware register being accessed.
The bits of this operand have the following meaning: The bits of this operand have the following meaning:
============ =================================== ======= ===================== ============
Bits Description Bits Description Value Range
============ =================================== ======= ===================== ============
5:0 Register *id*. 5:0 Register *id*. 0..63
10:6 First bit *offset* (0..31). 10:6 First bit *offset*. 0..31
15:11 *Size* in bits (1..32). 15:11 *Size* in bits. 1..32
============ =================================== ======= ===================== ============
This operand may be specified as a positive 16-bit :ref:`integer_number<amdgpu_synid_integer_number>` or using the syntax described below. This operand may be specified as one of the following:
* An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range 0..0xFFFF.
* An *hwreg* value described below.
==================================== ============================================================================ ==================================== ============================================================================
Syntax Description Hwreg Value Syntax Description
==================================== ============================================================================ ==================================== ============================================================================
hwreg({0..63}) All bits of a register indicated by its *id*. hwreg({0..63}) All bits of a register indicated by its *id*.
hwreg(<*name*>) All bits of a register indicated by its *name*. hwreg(<*name*>) All bits of a register indicated by its *name*.
@ -33,7 +36,8 @@ This operand may be specified as a positive 16-bit :ref:`integer_number<amdgpu_s
hwreg(<*name*>, {0..31}, {1..32}) Register bits indicated by register *name*, first bit *offset* and *size*. hwreg(<*name*>, {0..31}, {1..32}) Register bits indicated by register *name*, first bit *offset* and *size*.
==================================== ============================================================================ ==================================== ============================================================================
Register *id*, *offset* and *size* must be specified as positive :ref:`integer numbers<amdgpu_synid_integer_number>`. Numeric values may be specified as positive :ref:`integer numbers<amdgpu_synid_integer_number>`
or :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
Defined register *names* include: Defined register *names* include:
@ -54,7 +58,16 @@ Examples:
.. parsed-literal:: .. parsed-literal::
s_getreg_b32 s2, 0x6 reg = 1
offset = 2
size = 4
hwreg_enc = reg | (offset << 6) | ((size - 1) << 11)
s_getreg_b32 s2, 0x1881
s_getreg_b32 s2, hwreg_enc // the same as above
s_getreg_b32 s2, hwreg(1, 2, 4) // the same as above
s_getreg_b32 s2, hwreg(reg, offset, size) // the same as above
s_getreg_b32 s2, hwreg(15) s_getreg_b32 s2, hwreg(15)
s_getreg_b32 s2, hwreg(51, 1, 31) s_getreg_b32 s2, hwreg(51, 1, 31)
s_getreg_b32 s2, hwreg(HW_REG_LDS_ALLOC, 0, 1) s_getreg_b32 s2, hwreg(HW_REG_LDS_ALLOC, 0, 1)
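The numeric form of this operand can be reproduced outside the assembler. The following Python sketch packs the three fields according to the bit table above (*id* in bits 5:0, *offset* in bits 10:6, *size* minus one in bits 15:11). The helper name `encode_hwreg` is hypothetical, and the defaults of offset 0 and size 32 for the one-argument form are an assumption based on the "all bits of a register" description.

```python
def encode_hwreg(reg_id: int, offset: int = 0, size: int = 32) -> int:
    """Pack a hwreg operand into its 16-bit numeric form.

    Hypothetical helper, not part of the assembler. The defaults
    (offset=0, size=32) are an assumption meaning "all bits".
    """
    if not 0 <= reg_id <= 63:
        raise ValueError("register id must be in the range 0..63")
    if not 0 <= offset <= 31:
        raise ValueError("offset must be in the range 0..31")
    if not 1 <= size <= 32:
        raise ValueError("size must be in the range 1..32")
    # The size field stores (size - 1), as in the hwreg_enc expression above.
    return reg_id | (offset << 6) | ((size - 1) << 11)


if __name__ == "__main__":
    print(hex(encode_hwreg(1, 2, 4)))
```

Under these assumptions, `encode_hwreg(1, 2, 4)` yields the same value as the `0x1881` literal used in the examples.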

@ -0,0 +1,66 @@
..
**************************************************
* *
* Automatically generated file, do not edit! *
* *
**************************************************
.. _amdgpu_synid9_imask:
imask
===========================
This operand is a mask which controls indexing mode for operands of subsequent instructions.
Bits 0, 1 and 2 control indexing of *src0*, *src1* and *src2*, while bit 3 controls indexing of *dst*.
Value 1 enables indexing and value 0 disables it.
===== ========================================
Bit Meaning
===== ========================================
0 Enables or disables *src0* indexing.
1 Enables or disables *src1* indexing.
2 Enables or disables *src2* indexing.
3 Enables or disables *dst* indexing.
===== ========================================
This operand may be specified as one of the following:
* An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range 0..15.
* A *gpr_idx* value described below.
==================================== ===========================================
Gpr_idx Value Syntax Description
==================================== ===========================================
gpr_idx(*<operands>*) Enable indexing for specified *operands*
and disable it for the rest.
*Operands* is a comma-separated list of
values which may include:
* "SRC0" - enable *src0* indexing.
* "SRC1" - enable *src1* indexing.
* "SRC2" - enable *src2* indexing.
* "DST" - enable *dst* indexing.
Each of these values may be specified only
once.
*Operands* list may be empty; this syntax
disables indexing for all operands.
==================================== ===========================================
Examples:
.. parsed-literal::
s_set_gpr_idx_mode 0
s_set_gpr_idx_mode gpr_idx() // the same as above
s_set_gpr_idx_mode 15
s_set_gpr_idx_mode gpr_idx(DST,SRC0,SRC1,SRC2) // the same as above
s_set_gpr_idx_mode gpr_idx(SRC0,SRC1,SRC2,DST) // the same as above
s_set_gpr_idx_mode gpr_idx(DST,SRC1)
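To see how a *gpr_idx* value relates to its numeric mask, the mapping in the bit table can be sketched in Python. The function name `encode_gpr_idx` and the `GPR_IDX_BITS` table are hypothetical illustrations, not assembler internals; the sketch only assumes the bit assignments listed above and the rule that each operand name may appear once.

```python
# Bit positions taken from the imask table: SRC0..SRC2 are bits 0..2, DST is bit 3.
GPR_IDX_BITS = {"SRC0": 0, "SRC1": 1, "SRC2": 2, "DST": 3}


def encode_gpr_idx(*operands: str) -> int:
    """Return the 4-bit mask for a gpr_idx(...) operand list (hypothetical helper)."""
    mask = 0
    for name in operands:
        if name not in GPR_IDX_BITS:
            raise ValueError(f"unknown operand: {name}")
        bit = 1 << GPR_IDX_BITS[name]
        if mask & bit:
            # Each operand name may be specified only once.
            raise ValueError(f"operand specified twice: {name}")
        mask |= bit
    return mask


if __name__ == "__main__":
    print(encode_gpr_idx("DST", "SRC1"))
```

The order of the names does not affect the result, which matches the two equivalent `gpr_idx(...)` spellings of 15 in the examples.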

@ -1,25 +0,0 @@
..
**************************************************
* *
* Automatically generated file, do not edit! *
* *
**************************************************
.. _amdgpu_synid9_imm4:
imm4
===========================
A positive :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 4 bits.
This operand is a mask which controls indexing mode for operands of subsequent instructions. Value 1 enables indexing and value 0 disables it.
============ ========================================
Bit Meaning
============ ========================================
0 Enables or disables *src0* indexing.
1 Enables or disables *src1* indexing.
2 Enables or disables *src2* indexing.
3 Enables or disables *dst* indexing.
============ ========================================

@ -12,19 +12,26 @@ label
A branch target which is a 16-bit signed integer treated as a PC-relative dword offset. A branch target which is a 16-bit signed integer treated as a PC-relative dword offset.
This operand may be specified as: This operand may be specified as one of the following:
* An :ref:`integer_number<amdgpu_synid_integer_number>`. The number is truncated to 16 bits. * An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range -32768..65535.
* An :ref:`absolute_expression<amdgpu_synid_absolute_expression>` which must start with an :ref:`integer_number<amdgpu_synid_integer_number>`. The value of the expression is truncated to 16 bits. * A :ref:`symbol<amdgpu_synid_symbol>` (for example, a label) representing a relocatable address in the same compilation unit where it is referred from. The value is handled as a 16-bit PC-relative dword offset to be resolved by a linker.
* A :ref:`symbol<amdgpu_synid_symbol>` (for example, a label). The value is handled as a 16-bit PC-relative dword offset to be resolved by a linker.
Examples: Examples:
.. parsed-literal:: .. parsed-literal::
offset = 30 offset = 30
s_branch loop_end label_1:
s_branch 2 + offset label_2 = . + 4
s_branch 32
loop_end: s_branch 32
s_branch offset + 2
s_branch label_1
s_branch label_2
s_branch label_3
s_branch label_4
label_3 = label_2 + 4
label_4:
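The accepted range -32768..65535 covers both the signed and the unsigned reading of a 16-bit field. A rough Python sketch of that check follows; the name `encode_imm16` is hypothetical, and storing the low 16 bits in two's-complement form is an assumption about how the value ends up in the field, not a statement about the assembler's implementation.

```python
def encode_imm16(value: int) -> int:
    """Range-check a numeric branch offset and return its 16-bit field value.

    Hypothetical helper. Assumes the field holds the low 16 bits of the
    value in two's-complement form.
    """
    if not -32768 <= value <= 65535:
        raise ValueError("value must be in the range -32768..65535")
    return value & 0xFFFF


if __name__ == "__main__":
    offset = 30
    print(encode_imm16(offset + 2))
```

Under this assumption, -1 and 65535 produce the same field contents, which is why both ends of the range can be accepted.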

@ -12,24 +12,29 @@ msg
A 16-bit message code. The bits of this operand have the following meaning: A 16-bit message code. The bits of this operand have the following meaning:
============ ====================================================== ============ =============================== ===============
Bits Description Bits Description Value Range
============ ====================================================== ============ =============================== ===============
3:0 Message *type*. 3:0 Message *type*. 0..15
6:4 Optional *operation*. 6:4 Optional *operation*. 0..7
9:7 Optional *parameters*. 7:7 Unused. \-
15:10 Unused. 9:8 Optional *stream*. 0..3
============ ====================================================== 15:10 Unused. \-
============ =============================== ===============
This operand may be specified as a positive 16-bit :ref:`integer_number<amdgpu_synid_integer_number>` or using the syntax described below: This operand may be specified as one of the following:
======================================== ======================================================================== * An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range 0..0xFFFF.
Syntax Description * A *sendmsg* value described below.
======================================== ========================================================================
sendmsg(<*type*>) A message identified by its *type*. ==================================== ====================================================
sendmsg(<*type*>, <*op*>) A message identified by its *type* and *operation*. Sendmsg Value Syntax Description
sendmsg(<*type*>, <*op*>, <*stream*>) A message identified by its *type* and *operation* with a stream *id*. ==================================== ====================================================
======================================== ======================================================================== sendmsg(<*type*>) A message identified by its *type*.
sendmsg(<*type*>,<*op*>) A message identified by its *type* and *operation*.
sendmsg(<*type*>,<*op*>,<*stream*>) A message identified by its *type* and *operation*
with a stream *id*.
==================================== ====================================================
*Type* may be specified using message *name* or message *id*. *Type* may be specified using message *name* or message *id*.
@ -37,7 +42,8 @@ This operand may be specified as a positive 16-bit :ref:`integer_number<amdgpu_s
Stream *id* is an integer in the range 0..3. Stream *id* is an integer in the range 0..3.
Message *id*, operation *id* and stream *id* must be specified as positive :ref:`integer numbers<amdgpu_synid_integer_number>`. Numeric values may be specified as positive :ref:`integer numbers<amdgpu_synid_integer_number>`
or :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
Each message type supports specific operations: Each message type supports specific operations:
@ -60,16 +66,32 @@ Each message type supports specific operations:
\ SYSMSG_OP_TTRACE_PC 4 \- \ SYSMSG_OP_TTRACE_PC 4 \-
================= ========== ============================== ============ ========== ================= ========== ============================== ============ ==========
*Sendmsg* arguments are validated depending on how the *type* value is specified:
* If message *type* is specified by name, argument values must satisfy the limitations detailed in the table above.
* If message *type* is specified as a number, each argument must not exceed the corresponding value range (see the first table).
Examples: Examples:
.. parsed-literal:: .. parsed-literal::
// numeric message code
msg = 0x10
s_sendmsg 0x12 s_sendmsg 0x12
s_sendmsg msg + 2
// sendmsg with strict arguments validation
s_sendmsg sendmsg(MSG_INTERRUPT) s_sendmsg sendmsg(MSG_INTERRUPT)
s_sendmsg sendmsg(MSG_GET_DOORBELL)
s_sendmsg sendmsg(2, GS_OP_CUT)
s_sendmsg sendmsg(MSG_GS, GS_OP_EMIT) s_sendmsg sendmsg(MSG_GS, GS_OP_EMIT)
s_sendmsg sendmsg(MSG_GS, 2) s_sendmsg sendmsg(MSG_GS, 2)
s_sendmsg sendmsg(MSG_GS_DONE, GS_OP_EMIT_CUT, 1) s_sendmsg sendmsg(MSG_GS_DONE, GS_OP_EMIT_CUT, 1)
s_sendmsg sendmsg(MSG_SYSMSG, SYSMSG_OP_TTRACE_PC) s_sendmsg sendmsg(MSG_SYSMSG, SYSMSG_OP_TTRACE_PC)
s_sendmsg sendmsg(MSG_GET_DOORBELL)
// sendmsg with validation of value range only
msg = 2
op = 3
stream = 1
s_sendmsg sendmsg(msg, op, stream)
s_sendmsg sendmsg(2, GS_OP_CUT)
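The numeric message code can be derived from the first table: *type* in bits 3:0, *operation* in bits 6:4 and *stream* in bits 9:8. The Python sketch below performs only the value-range validation that applies when *type* is given as a number; it does not model the per-message restrictions that apply to named types. The helper name `encode_sendmsg` is hypothetical.

```python
def encode_sendmsg(msg_type: int, op: int = 0, stream: int = 0) -> int:
    """Pack a sendmsg operand into its 16-bit code (hypothetical helper).

    Only the value ranges from the bit table are checked; the
    per-message operation rules for named types are not modelled.
    """
    if not 0 <= msg_type <= 15:
        raise ValueError("type must be in the range 0..15")
    if not 0 <= op <= 7:
        raise ValueError("operation must be in the range 0..7")
    if not 0 <= stream <= 3:
        raise ValueError("stream must be in the range 0..3")
    # Bit 7 and bits 15:10 are unused and stay zero.
    return msg_type | (op << 4) | (stream << 8)


if __name__ == "__main__":
    msg, op, stream = 2, 3, 1
    print(hex(encode_sendmsg(msg, op, stream)))
```

With these fields, the literal `0x12` from the examples corresponds to *type* 2 with *operation* 1.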

@ -12,7 +12,8 @@ imm3
A bit mask which indicates request permissions. A bit mask which indicates request permissions.
This operand must be specified as an :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 7 bits, but only 3 low bits are significant. This operand must be specified as an :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`.
The value is truncated to 7 bits, but only 3 low bits are significant.
============ ============================== ============ ==============================
Bit Number Description Bit Number Description

@ -10,5 +10,5 @@
imm16 imm16
=========================== ===========================
An :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 16 bits and then sign-extended to 32 bits. An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range -32768..65535.

@ -10,5 +10,5 @@
imm16 imm16
=========================== ===========================
An :ref:`integer_number<amdgpu_synid_integer_number>`. The value is truncated to 16 bits and then zero-extended to 32 bits. An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range 0..65535.

@ -14,30 +14,31 @@ Counts of outstanding instructions to wait for.
The bits of this operand have the following meaning: The bits of this operand have the following meaning:
============ ====================================================== ========== ========= ================================================ ============
Bits Description High Bits Low Bits Description Value Range
============ ====================================================== ========== ========= ================================================ ============
3:0 VM_CNT: vector memory operations count, lower bits. 15:14 3:0 VM_CNT: vector memory operations count. 0..63
6:4 EXP_CNT: export count. \- 6:4 EXP_CNT: export count. 0..7
11:8 LGKM_CNT: LDS, GDS, Constant and Message count. \- 11:8 LGKM_CNT: LDS, GDS, Constant and Message count. 0..15
15:14 VM_CNT: vector memory operations count, upper bits. ========== ========= ================================================ ============
============ ======================================================
This operand may be specified as a positive 16-bit :ref:`integer_number<amdgpu_synid_integer_number>` This operand may be specified as one of the following:
or as a combination of the following symbolic helpers:
* An :ref:`integer_number<amdgpu_synid_integer_number>` or an :ref:`absolute_expression<amdgpu_synid_absolute_expression>`. The value must be in the range 0..0xFFFF.
* A combination of *vmcnt*, *expcnt*, *lgkmcnt* and other values described below.
====================== ====================================================================== ====================== ======================================================================
Syntax Description Syntax Description
====================== ====================================================================== ====================== ======================================================================
vmcnt(<*N*>) VM_CNT value. *N* must not exceed the largest VM_CNT value. vmcnt(<*N*>) A VM_CNT value. *N* must not exceed the largest VM_CNT value.
expcnt(<*N*>) EXP_CNT value. *N* must not exceed the largest EXP_CNT value. expcnt(<*N*>) An EXP_CNT value. *N* must not exceed the largest EXP_CNT value.
lgkmcnt(<*N*>) LGKM_CNT value. *N* must not exceed the largest LGKM_CNT value. lgkmcnt(<*N*>) An LGKM_CNT value. *N* must not exceed the largest LGKM_CNT value.
vmcnt_sat(<*N*>) VM_CNT value computed as min(*N*, the largest VM_CNT value). vmcnt_sat(<*N*>) A VM_CNT value computed as min(*N*, the largest VM_CNT value).
expcnt_sat(<*N*>) EXP_CNT value computed as min(*N*, the largest EXP_CNT value). expcnt_sat(<*N*>) An EXP_CNT value computed as min(*N*, the largest EXP_CNT value).
lgkmcnt_sat(<*N*>) LGKM_CNT value computed as min(*N*, the largest LGKM_CNT value). lgkmcnt_sat(<*N*>) An LGKM_CNT value computed as min(*N*, the largest LGKM_CNT value).
====================== ====================================================================== ====================== ======================================================================
These helpers may be specified in any order. Ampersands and commas may be used as optional separators. These values may be specified in any order. Spaces, ampersands and commas may be used as optional separators.
*N* is either an *N* is either an
:ref:`integer number<amdgpu_synid_integer_number>` or an :ref:`integer number<amdgpu_synid_integer_number>` or an
@ -47,10 +48,18 @@ Examples:
.. parsed-literal:: .. parsed-literal::
s_waitcnt 0 vm_cnt = 1
exp_cnt = 2
lgkm_cnt = 3
cnt = vm_cnt | (exp_cnt << 4) | (lgkm_cnt << 8)
s_waitcnt cnt
s_waitcnt 1 | (2 << 4) | (3 << 8) // the same as above
s_waitcnt vmcnt(1) expcnt(2) lgkmcnt(3) // the same as above
s_waitcnt vmcnt(vm_cnt) expcnt(exp_cnt) lgkmcnt(lgkm_cnt) // the same as above
s_waitcnt vmcnt(1) s_waitcnt vmcnt(1)
s_waitcnt expcnt(2) lgkmcnt(3) s_waitcnt expcnt(2) lgkmcnt(3)
s_waitcnt vmcnt(1) expcnt(2) lgkmcnt(3)
s_waitcnt vmcnt(1), expcnt(2), lgkmcnt(3) s_waitcnt vmcnt(1), expcnt(2), lgkmcnt(3)
s_waitcnt vmcnt(1) & lgkmcnt_sat(100) & expcnt(2) s_waitcnt vmcnt(1) & lgkmcnt_sat(100) & expcnt(2)
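Because VM_CNT is split between bits 3:0 and bits 15:14, a plain `vm_cnt | ...` expression only matches the symbolic form while VM_CNT is below 16. The Python sketch below packs all three counters using the layout in the table above. The names `encode_waitcnt` and `saturate` are hypothetical, and defaulting omitted counters to their largest value is an assumption about how a partial list such as `vmcnt(1)` is completed.

```python
VM_CNT_MAX = 63
EXP_CNT_MAX = 7
LGKM_CNT_MAX = 15


def saturate(n: int, largest: int) -> int:
    """Model the *_sat forms: min(N, the largest counter value)."""
    return min(n, largest)


def encode_waitcnt(vmcnt: int = VM_CNT_MAX,
                   expcnt: int = EXP_CNT_MAX,
                   lgkmcnt: int = LGKM_CNT_MAX) -> int:
    """Pack the counters into a 16-bit s_waitcnt operand (hypothetical helper).

    Assumption: counters that are not specified keep their largest value.
    """
    if not 0 <= vmcnt <= VM_CNT_MAX:
        raise ValueError("vmcnt must be in the range 0..63")
    if not 0 <= expcnt <= EXP_CNT_MAX:
        raise ValueError("expcnt must be in the range 0..7")
    if not 0 <= lgkmcnt <= LGKM_CNT_MAX:
        raise ValueError("lgkmcnt must be in the range 0..15")
    vm_low = vmcnt & 0xF   # lower four bits go to bits 3:0
    vm_high = vmcnt >> 4   # upper two bits go to bits 15:14
    return vm_low | (expcnt << 4) | (lgkmcnt << 8) | (vm_high << 14)


if __name__ == "__main__":
    print(hex(encode_waitcnt(1, 2, 3)))
    print(hex(encode_waitcnt(vmcnt=1, expcnt=2,
                             lgkmcnt=saturate(100, LGKM_CNT_MAX))))
```

For the values used in the examples (1, 2 and 3), the result equals `1 | (2 << 4) | (3 << 8)`.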

@ -34,19 +34,21 @@ Specifies an immediate unsigned 8-bit offset, in bytes. The default value is 0.
Used with DS instructions which have 2 addresses. Used with DS instructions which have 2 addresses.
=================== ===================================================== =================== ====================================================================
Syntax Description Syntax Description
=================== ===================================================== =================== ====================================================================
offset:{0..0xFF} Specifies an unsigned 8-bit offset as a positive offset:{0..0xFF} Specifies an unsigned 8-bit offset as a positive
:ref:`integer number <amdgpu_synid_integer_number>`. :ref:`integer number <amdgpu_synid_integer_number>`
=================== ===================================================== or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
=================== ====================================================================
Examples: Examples:
.. parsed-literal:: .. parsed-literal::
offset:255
offset:0xff offset:0xff
offset:2-x
offset:-x-y
.. _amdgpu_synid_ds_offset16: .. _amdgpu_synid_ds_offset16:
@ -57,12 +59,13 @@ Specifies an immediate unsigned 16-bit offset, in bytes. The default value is 0.
Used with DS instructions which have 1 address. Used with DS instructions which have 1 address.
==================== ====================================================== ==================== ====================================================================
Syntax Description Syntax Description
==================== ====================================================== ==================== ====================================================================
offset:{0..0xFFFF} Specifies an unsigned 16-bit offset as a positive offset:{0..0xFFFF} Specifies an unsigned 16-bit offset as a positive
:ref:`integer number <amdgpu_synid_integer_number>`. :ref:`integer number <amdgpu_synid_integer_number>`
==================== ====================================================== or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
==================== ====================================================================
Examples: Examples:
@ -70,6 +73,7 @@ Examples:
offset:65535 offset:65535
offset:0xffff offset:0xffff
offset:-x-y
.. _amdgpu_synid_sw_offset16: .. _amdgpu_synid_sw_offset16:
@ -95,7 +99,7 @@ See AMD documentation for more information.
*mask* is a 5 character sequence which *mask* is a 5 character sequence which
specifies how to transform the bits of the specifies how to transform the bits of the
lane *id*. lane *id*.
The following characters are allowed: The following characters are allowed:
@ -116,7 +120,7 @@ See AMD documentation for more information.
size and must be equal to 2, 4, 8, 16 or 32. size and must be equal to 2, 4, 8, 16 or 32.
The second numeric parameter is an index of the The second numeric parameter is an index of the
lane being broadcasted. lane being broadcasted.
The index must not exceed group size. The index must not exceed group size.
offset:swizzle(SWAP,{1..16}) Specifies a swap mode. offset:swizzle(SWAP,{1..16}) Specifies a swap mode.
@ -128,7 +132,7 @@ See AMD documentation for more information.
Reverses the lanes for groups of 2, 4, 8, 16 or 32 lanes. Reverses the lanes for groups of 2, 4, 8, 16 or 32 lanes.
======================================================= =========================================================== ======================================================= ===========================================================
Numeric parameters may be specified as either :ref:`integer numbers<amdgpu_synid_integer_number>` or Note: numeric values may be specified as either :ref:`integer numbers<amdgpu_synid_integer_number>` or
:ref:`absolute expressions<amdgpu_synid_absolute_expression>`. :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
Examples: Examples:
@ -137,7 +141,7 @@ Examples:
offset:255 offset:255
offset:0xffff offset:0xffff
offset:swizzle(QUAD_PERM, 0, 1, 2 ,3) offset:swizzle(QUAD_PERM, 0, 1, 2, 3)
offset:swizzle(BITMASK_PERM, "01pi0") offset:swizzle(BITMASK_PERM, "01pi0")
offset:swizzle(BROADCAST, 2, 0) offset:swizzle(BROADCAST, 2, 0)
offset:swizzle(SWAP, 8) offset:swizzle(SWAP, 8)
@ -212,19 +216,20 @@ Specifies an immediate unsigned 12-bit offset, in bytes. The default value is 0.
Cannot be used with *global/scratch* opcodes. GFX9 only. Cannot be used with *global/scratch* opcodes. GFX9 only.
================= ====================================================== ================= ====================================================================
Syntax Description Syntax Description
================= ====================================================== ================= ====================================================================
offset:{0..4095} Specifies a 12-bit unsigned offset as a positive offset:{0..4095} Specifies a 12-bit unsigned offset as a positive
:ref:`integer number <amdgpu_synid_integer_number>`. :ref:`integer number <amdgpu_synid_integer_number>`
================= ====================================================== or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
================= ====================================================================
Examples: Examples:
.. parsed-literal:: .. parsed-literal::
offset:4095 offset:4095
offset:0xff offset:x-0xff
.. _amdgpu_synid_flat_offset13s: .. _amdgpu_synid_flat_offset13s:
@ -235,12 +240,13 @@ Specifies an immediate signed 13-bit offset, in bytes. The default value is 0.
Can be used with *global/scratch* opcodes only. GFX9 only. Can be used with *global/scratch* opcodes only. GFX9 only.
============================ ======================================================= ===================== ====================================================================
Syntax Description Syntax Description
============================ ======================================================= ===================== ====================================================================
offset:{-4096..4095} Specifies a 13-bit signed offset as an offset:{-4096..4095} Specifies a 13-bit signed offset as an
:ref:`integer number <amdgpu_synid_integer_number>`. :ref:`integer number <amdgpu_synid_integer_number>`
============================ ======================================================= or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
===================== ====================================================================
Examples: Examples:
@ -248,6 +254,7 @@ Examples:
offset:-4000 offset:-4000
offset:0x10 offset:0x10
offset:-x
.. _amdgpu_synid_flat_offset12s: .. _amdgpu_synid_flat_offset12s:
@ -260,12 +267,13 @@ Can be used with *global/scratch* opcodes only.
GFX10 only. GFX10 only.
============================ ======================================================= ===================== ====================================================================
Syntax Description Syntax Description
============================ ======================================================= ===================== ====================================================================
offset:{-2048..2047} Specifies a 12-bit signed offset as an offset:{-2048..2047} Specifies a 12-bit signed offset as an
:ref:`integer number <amdgpu_synid_integer_number>`. :ref:`integer number <amdgpu_synid_integer_number>`
============================ ======================================================= or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
===================== ====================================================================
Examples: Examples:
@ -273,6 +281,7 @@ Examples:
offset:-2000 offset:-2000
offset:0x10 offset:0x10
offset:-x+y
.. _amdgpu_synid_flat_offset11: .. _amdgpu_synid_flat_offset11:
@ -285,19 +294,20 @@ Cannot be used with *global/scratch* opcodes.
GFX10 only. GFX10 only.
================= ====================================================== ================= ====================================================================
Syntax Description Syntax Description
================= ====================================================== ================= ====================================================================
offset:{0..2047} Specifies an 11-bit unsigned offset as a positive offset:{0..2047} Specifies an 11-bit unsigned offset as a positive
:ref:`integer number <amdgpu_synid_integer_number>`. :ref:`integer number <amdgpu_synid_integer_number>`
================= ====================================================== or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
================= ====================================================================
Examples: Examples:
.. parsed-literal:: .. parsed-literal::
offset:2047 offset:2047
offset:0xff offset:x+0xff
dlc dlc
~~~ ~~~
@ -340,19 +350,18 @@ dmask
Specifies which channels (image components) are used by the operation. By default, no channels Specifies which channels (image components) are used by the operation. By default, no channels
are used. are used.
=============== ===================================================== =============== ====================================================================
Syntax Description Syntax Description
=============== ===================================================== =============== ====================================================================
dmask:{0..15} Specifies image channels as a positive dmask:{0..15} Specifies image channels as a positive
:ref:`integer number <amdgpu_synid_integer_number>`. :ref:`integer number <amdgpu_synid_integer_number>`
or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
Each bit corresponds to one of 4 image Each bit corresponds to one of 4 image components (RGBA).
components (RGBA).
If the specified bit value If the specified bit value is 0, the component is not used,
is 0, the component is not used, value 1 means value 1 means that the component is used.
that the component is used. =============== ====================================================================
=============== =====================================================
This modifier has some limitations depending on instruction kind: This modifier has some limitations depending on instruction kind:
@ -373,7 +382,7 @@ Examples:
dmask:0xf dmask:0xf
dmask:0b1111 dmask:0b1111
dmask:3 dmask:x|y|z
.. _amdgpu_synid_unorm: .. _amdgpu_synid_unorm:
@ -468,7 +477,7 @@ Specifies data size: 16 or 32 bits (32 bits by default). Not supported by GFX7.
Each 16-bit data element occupies 1 VGPR. Each 16-bit data element occupies 1 VGPR.
GFX8.1, GFX9 and GFX10 support data packing. GFX8.1, GFX9 and GFX10 support data packing.
Each pair of 16-bit data elements Each pair of 16-bit data elements
occupies 1 VGPR. occupies 1 VGPR.
======================================== ================================================ ======================================== ================================================
@ -684,18 +693,19 @@ offset12
Specifies an immediate unsigned 12-bit offset, in bytes. The default value is 0. Specifies an immediate unsigned 12-bit offset, in bytes. The default value is 0.
=============================== ====================================================== ================== ====================================================================
Syntax Description Syntax Description
=============================== ====================================================== ================== ====================================================================
offset:{0..0xFFF} Specifies a 12-bit unsigned offset as a positive offset:{0..0xFFF} Specifies a 12-bit unsigned offset as a positive
:ref:`integer number <amdgpu_synid_integer_number>`. :ref:`integer number <amdgpu_synid_integer_number>`
=============================== ====================================================== or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
================== ====================================================================
Examples: Examples:
.. parsed-literal:: .. parsed-literal::
offset:0 offset:x+y
offset:0x10 offset:0x10
glc glc
@ -782,14 +792,18 @@ GFX10 only.
dpp8_sel dpp8_sel
~~~~~~~~ ~~~~~~~~
Selects which lane to pull data from, within a group of 8 lanes. This is a mandatory modifier. Selects which lanes to pull data from, within a group of 8 lanes. This is a mandatory modifier.
There is no default value. There is no default value.
GFX10 only. GFX10 only.
The *dpp8_sel* modifier must specify exactly 8 values, each ranging from 0 to 7. The *dpp8_sel* modifier must specify exactly 8 values.
First value selects which lane to read from to supply data into lane 0. First value selects which lane to read from to supply data into lane 0.
Second value controls value for lane 1 and so on. Second value controls lane 1 and so on.
Each value may be specified as either
an :ref:`integer number<amdgpu_synid_integer_number>` or
an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
=============================================================== =========================== =============================================================== ===========================
Syntax Description Syntax Description
@ -811,7 +825,7 @@ fi
Controls interaction with inactive lanes for *dpp8* instructions. The default value is zero. Controls interaction with inactive lanes for *dpp8* instructions. The default value is zero.
Note. *Inactive* lanes are those whose :ref:`exec<amdgpu_synid_exec>` mask bit is zero. Note: *inactive* lanes are those whose :ref:`exec<amdgpu_synid_exec>` mask bit is zero.
GFX10 only. GFX10 only.
@ -822,6 +836,9 @@ GFX10 only.
fi:1 Fetch pre-existing values from inactive lanes. fi:1 Fetch pre-existing values from inactive lanes.
==================================== ===================================================== ==================================== =====================================================
Note: numeric values may be specified as either :ref:`integer numbers<amdgpu_synid_integer_number>` or
:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
DPP/DPP16 Modifiers DPP/DPP16 Modifiers
------------------- -------------------
@ -837,7 +854,7 @@ There is no default value.
GFX8 and GFX9 only. Use :ref:`dpp16_ctrl<amdgpu_synid_dpp16_ctrl>` for GFX10. GFX8 and GFX9 only. Use :ref:`dpp16_ctrl<amdgpu_synid_dpp16_ctrl>` for GFX10.
Note. The lanes of a wavefront are organized in four *rows* and four *banks*. Note: the lanes of a wavefront are organized in four *rows* and four *banks*.
======================================== ================================================ ======================================== ================================================
Syntax Description Syntax Description
@ -856,7 +873,7 @@ Note. The lanes of a wavefront are organized in four *rows* and four *banks*.
row_ror:{1..15} Row rotate right by 1-15 threads. row_ror:{1..15} Row rotate right by 1-15 threads.
======================================== ================================================ ======================================== ================================================
Note: Numeric parameters may be specified as either Note: numeric values may be specified as either
:ref:`integer numbers<amdgpu_synid_integer_number>` or :ref:`integer numbers<amdgpu_synid_integer_number>` or
:ref:`absolute expressions<amdgpu_synid_absolute_expression>`. :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
@ -877,7 +894,7 @@ There is no default value.
GFX10 only. Use :ref:`dpp_ctrl<amdgpu_synid_dpp_ctrl>` for GFX8 and GFX9. GFX10 only. Use :ref:`dpp_ctrl<amdgpu_synid_dpp_ctrl>` for GFX8 and GFX9.
Note. The lanes of a wavefront are organized in four *rows* and four *banks*. Note: the lanes of a wavefront are organized in four *rows* and four *banks*.
(There are only two rows in *wave32* mode.) (There are only two rows in *wave32* mode.)
======================================== ==================================================== ======================================== ====================================================
@ -894,7 +911,7 @@ Note. The lanes of a wavefront are organized in four *rows* and four *banks*.
row_ror:{1..15} Row rotate right by 1-15 threads. row_ror:{1..15} Row rotate right by 1-15 threads.
======================================== ==================================================== ======================================== ====================================================
Note: Numeric parameters may be specified as either Note: numeric values may be specified as either
:ref:`integer numbers<amdgpu_synid_integer_number>` or :ref:`integer numbers<amdgpu_synid_integer_number>` or
:ref:`absolute expressions<amdgpu_synid_absolute_expression>`. :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
@ -912,21 +929,21 @@ row_mask
Controls which rows are enabled for data sharing. By default, all rows are enabled. Controls which rows are enabled for data sharing. By default, all rows are enabled.
Note. The lanes of a wavefront are organized in four *rows* and four *banks*. Note: the lanes of a wavefront are organized in four *rows* and four *banks*.
(There are only two rows in *wave32* mode.) (There are only two rows in *wave32* mode.)
======================================== ===================================================== ================= ====================================================================
Syntax Description Syntax Description
======================================== ===================================================== ================= ====================================================================
row_mask:{0..15} Specifies a *row mask* as a positive row_mask:{0..15} Specifies a *row mask* as a positive
:ref:`integer number <amdgpu_synid_integer_number>`. :ref:`integer number <amdgpu_synid_integer_number>`
or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
Each of 4 bits in the mask controls one Each of 4 bits in the mask controls one row
row (0 - disabled, 1 - enabled). (0 - disabled, 1 - enabled).
In *wave32* mode the values should be limited to In *wave32* mode the values should be limited to 0..7.
{0..7}. ================= ====================================================================
======================================== =====================================================
Examples: Examples:
@ -934,7 +951,7 @@ Examples:
row_mask:0xf row_mask:0xf
row_mask:0b1010 row_mask:0b1010
row_mask:0b1111 row_mask:x|y
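
As a rough illustration of how a *row mask* value is interpreted, the sketch below lists the rows enabled by a given mask. The helper name ``enabled_rows`` is hypothetical, and the mapping of bit *i* to row *i* is an assumption; the description above only states that each of the 4 bits controls one row.

```python
from typing import List


def enabled_rows(row_mask: int, wave32: bool = False) -> List[int]:
    """Return the indices of the rows enabled by a row mask.

    Assumption: bit i of the mask controls row i.
    """
    if not 0 <= row_mask <= 15:
        raise ValueError("row_mask must be in the range 0..15")
    # The description limits the mask to 0..7 in wave32 mode.
    if wave32 and row_mask > 7:
        raise ValueError("row_mask must be in the range 0..7 in wave32 mode")
    # A set bit means "enabled", a cleared bit means "disabled".
    return [row for row in range(4) if (row_mask >> row) & 1]


print(enabled_rows(0xf))     # all four rows enabled
print(enabled_rows(0b1010))  # only the rows whose bits are set
```

The same decoding would apply to *bank_mask*, with banks in place of rows.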
.. _amdgpu_synid_bank_mask: .. _amdgpu_synid_bank_mask:
@ -943,18 +960,19 @@ bank_mask
Controls which banks are enabled for data sharing. By default, all banks are enabled. Controls which banks are enabled for data sharing. By default, all banks are enabled.
Note. The lanes of a wavefront are organized in four *rows* and four *banks*. Note: the lanes of a wavefront are organized in four *rows* and four *banks*.
(There are only two rows in *wave32* mode.) (There are only two rows in *wave32* mode.)
======================================== ======================================================= ================== ====================================================================
Syntax Description Syntax Description
======================================== ======================================================= ================== ====================================================================
bank_mask:{0..15} Specifies a *bank mask* as a positive bank_mask:{0..15} Specifies a *bank mask* as a positive
:ref:`integer number <amdgpu_synid_integer_number>`. :ref:`integer number <amdgpu_synid_integer_number>`
or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
Each of 4 bits in the mask controls one Each of 4 bits in the mask controls one bank
bank (0 - disabled, 1 - enabled). (0 - disabled, 1 - enabled).
======================================== ======================================================= ================== ====================================================================
Examples: Examples:
@ -962,7 +980,7 @@ Examples:
bank_mask:0x3 bank_mask:0x3
bank_mask:0b0011 bank_mask:0b0011
bank_mask:0b1111 bank_mask:x&y
.. _amdgpu_synid_bound_ctrl: .. _amdgpu_synid_bound_ctrl:
@ -988,7 +1006,7 @@ fi
Controls interaction with *inactive* lanes for *dpp16* instructions. The default value is zero. Controls interaction with *inactive* lanes for *dpp16* instructions. The default value is zero.
Note. *Inactive* lanes are those whose :ref:`exec<amdgpu_synid_exec>` mask bit is zero. Note: *inactive* lanes are those whose :ref:`exec<amdgpu_synid_exec>` mask bit is zero.
GFX10 only. GFX10 only.
@ -1001,6 +1019,9 @@ GFX10 only.
fi:1 Fetch pre-existing values from inactive lanes. fi:1 Fetch pre-existing values from inactive lanes.
======================================== ================================================== ======================================== ==================================================
Note: numeric values may be specified as either :ref:`integer numbers<amdgpu_synid_integer_number>` or
:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
SDWA Modifiers SDWA Modifiers
-------------- --------------
@ -1037,7 +1058,6 @@ Selects which bits in the destination are affected. By default, all bits are aff
dst_sel:WORD_1 Use bits 31:16. dst_sel:WORD_1 Use bits 31:16.
======================================== ================================================ ======================================== ================================================
.. _amdgpu_synid_dst_unused: .. _amdgpu_synid_dst_unused:
dst_unused dst_unused
@ -1151,7 +1171,7 @@ operands (both source and destination). First value controls src0, second value
and so on, except that the last value controls destination. and so on, except that the last value controls destination.
The value 0 selects the low bits, while 1 selects the high bits. The value 0 selects the low bits, while 1 selects the high bits.
Note. op_sel modifier affects 16-bit operands only. For 32-bit operands the value specified Note: op_sel modifier affects 16-bit operands only. For 32-bit operands the value specified
by op_sel must be 0. by op_sel must be 0.
GFX9 and GFX10 only. GFX9 and GFX10 only.
@ -1164,6 +1184,10 @@ GFX9 and GFX10 only.
op_sel:[{0..1},{0..1},{0..1},{0..1}] Select operand bits for instructions with 3 source operands. op_sel:[{0..1},{0..1},{0..1},{0..1}] Select operand bits for instructions with 3 source operands.
======================================== ============================================================ ======================================== ============================================================
Note: numeric values may be specified as either
:ref:`integer numbers<amdgpu_synid_integer_number>` or
:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
Examples: Examples:
.. parsed-literal:: .. parsed-literal::
@ -1189,7 +1213,7 @@ Integer clamping is not supported by GFX7.
For floating point operations, clamp modifier indicates that the result must be clamped For floating point operations, clamp modifier indicates that the result must be clamped
to the range [0.0, 1.0]. By default, there is no clamping. to the range [0.0, 1.0]. By default, there is no clamping.
Note. Clamp modifier is applied after :ref:`output modifiers<amdgpu_synid_omod>` (if any). Note: clamp modifier is applied after :ref:`output modifiers<amdgpu_synid_omod>` (if any).
======================================== ================================================ ======================================== ================================================
Syntax Description Syntax Description
@ -1205,12 +1229,12 @@ omod
Specifies if an output modifier must be applied to the result. Specifies if an output modifier must be applied to the result.
By default, no output modifiers are applied. By default, no output modifiers are applied.
Note. Output modifiers are applied before :ref:`clamping<amdgpu_synid_clamp>` (if any). Note: output modifiers are applied before :ref:`clamping<amdgpu_synid_clamp>` (if any).
Output modifiers are valid for f32 and f64 floating point results only. Output modifiers are valid for f32 and f64 floating point results only.
They must not be used with f16. They must not be used with f16.
Note. *v_cvt_f16_f32* is an exception. This instruction produces f16 result Note: *v_cvt_f16_f32* is an exception. This instruction produces f16 result
but accepts output modifiers. but accepts output modifiers.
======================================== ================================================ ======================================== ================================================
@ -1221,6 +1245,16 @@ but accepts output modifiers.
div:2 Multiply the result by 0.5. div:2 Multiply the result by 0.5.
======================================== ================================================ ======================================== ================================================
Note: numeric values may be specified as either :ref:`integer numbers<amdgpu_synid_integer_number>` or
:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
Examples:
.. parsed-literal::
mul:2
mul:x // x must be equal to 2 or 4
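
The ordering of output modifiers and clamping described above can be sketched as follows. This is only a model of the documented semantics for floating point results, not the hardware behavior in full; the helper name ``apply_result_modifiers`` and the string keys are hypothetical.

```python
# Scale factors for the output modifiers: mul:2, mul:4 and div:2.
OMOD_FACTORS = {"mul:2": 2.0, "mul:4": 4.0, "div:2": 0.5}


def apply_result_modifiers(result, omod=None, clamp=False):
    """Apply an optional output modifier, then an optional clamp."""
    # Output modifiers are applied first.
    if omod is not None:
        result = result * OMOD_FACTORS[omod]
    # Clamping is applied afterwards and limits the value to [0.0, 1.0].
    if clamp:
        result = min(max(result, 0.0), 1.0)
    return result


print(apply_result_modifiers(0.75, omod="mul:2", clamp=True))
print(apply_result_modifiers(3.0, omod="div:2"))
```

Because the modifier is applied first, a value such as 0.75 with ``mul:2`` and ``clamp`` is scaled to 1.5 and then clamped to 1.0.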
.. _amdgpu_synid_vop3_operand_modifiers: .. _amdgpu_synid_vop3_operand_modifiers:
VOP3 Operand Modifiers VOP3 Operand Modifiers
@ -1233,15 +1267,19 @@ Operand modifiers are not used separately. They are applied to source operands.
abs abs
~~~ ~~~
Computes absolute value of its operand. Applied before :ref:`neg<amdgpu_synid_neg>` (if any). Computes the absolute value of its operand. Must be applied before :ref:`neg<amdgpu_synid_neg>`
Valid for floating point operands only. (if any). Valid for floating point operands only.
======================================== ================================================ ======================================== ====================================================
Syntax Description Syntax Description
======================================== ================================================ ======================================== ====================================================
abs(<operand>) Get absolute value of operand. abs(<operand>) Get the absolute value of a floating-point operand.
\|<operand>| The same as above. \|<operand>| The same as above (an SP3 syntax).
======================================== ================================================ ======================================== ====================================================
Note: avoid using SP3 syntax with operands specified as expressions because the trailing '|'
may be misinterpreted. Such operands should be enclosed in additional parentheses as shown
in examples below.
Examples: Examples:
@ -1249,28 +1287,50 @@ Examples:
abs(v36) abs(v36)
\|v36| \|v36|
abs(x|y) // ok
\|(x|y)| // additional parentheses are required
.. _amdgpu_synid_neg: .. _amdgpu_synid_neg:
neg neg
~~~ ~~~
Computes negative value of its operand. Applied after :ref:`abs<amdgpu_synid_abs>` (if any). Computes the negative value of its operand. Must be applied after :ref:`abs<amdgpu_synid_abs>`
Valid for floating point operands only. (if any). Valid for floating point operands only.
======================================== ================================================ ================== ====================================================
Syntax Description Syntax Description
======================================== ================================================ ================== ====================================================
neg(<operand>) Get negative value of operand. neg(<operand>) Get the negative value of a floating-point operand.
-<operand> The same as above. The operand may include an optional
======================================== ================================================ :ref:`abs<amdgpu_synid_abs>` modifier.
-<operand> The same as above (an SP3 syntax).
================== ====================================================
Note: SP3 syntax is supported with limitations because of a potential ambiguity.
Currently it is allowed in the following cases:
* Before a register.
* Before an :ref:`abs<amdgpu_synid_abs>` modifier.
* Before an SP3 :ref:`abs<amdgpu_synid_abs>` modifier.
In all other cases "-" is handled as a part of an expression that follows the sign.
Examples: Examples:
.. parsed-literal:: .. parsed-literal::
// Operands with negate modifiers
neg(v[0]) neg(v[0])
-v4 neg(1.0)
neg(abs(v0))
-v5
-abs(v5)
-\|v5|
// Operands without negate modifiers
-1
-x+y
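
The rules for the SP3 form of *neg* listed above can be approximated with a small classifier. This is a simplified sketch, not the assembler's parser: the helper name ``has_sp3_neg_modifier`` is hypothetical, only *v*, *s* and *ttmp* register names are recognized, and the operand is assumed to be written without the reStructuredText escape before ``|``.

```python
import re

# Assumption: a register is v, s or ttmp followed by a decimal index or "[".
_REGISTER = re.compile(r"(v|s|ttmp)(\d+\b|\[)")


def has_sp3_neg_modifier(operand: str) -> bool:
    """Return True if a leading "-" acts as an SP3 neg modifier."""
    text = operand.strip()
    if not text.startswith("-"):
        return False
    rest = text[1:]
    # Allowed before an abs modifier, in either syntax.
    if rest.startswith("abs(") or rest.startswith("|"):
        return True
    # Allowed before a register; otherwise "-" belongs to an expression.
    return _REGISTER.match(rest) is not None


for operand in ["-v5", "-abs(v5)", "-|v5|", "-1", "-x+y"]:
    print(operand, has_sp3_neg_modifier(operand))
```

The first three operands are classified as carrying a negate modifier, while ``-1`` and ``-x+y`` are treated as plain expressions, matching the examples above.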
VOP3P Modifiers VOP3P Modifiers
--------------- ---------------
@ -1304,6 +1364,10 @@ The value 0 selects the low bits, while 1 selects the high bits.
op_sel:[{0..1},{0..1},{0..1}] Select operand bits for instructions with 3 source operands. op_sel:[{0..1},{0..1},{0..1}] Select operand bits for instructions with 3 source operands.
================================= ============================================================= ================================= =============================================================
Note: numeric values may be specified as either
:ref:`integer numbers<amdgpu_synid_integer_number>` or
:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
Examples: Examples:
.. parsed-literal:: .. parsed-literal::
@ -1333,6 +1397,10 @@ The value 0 selects the low bits, while 1 selects the high bits.
op_sel_hi:[{0..1},{0..1},{0..1}] Select operand bits for instructions with 3 source operands. op_sel_hi:[{0..1},{0..1},{0..1}] Select operand bits for instructions with 3 source operands.
=================================== ============================================================= =================================== =============================================================
Note: numeric values may be specified as either
:ref:`integer numbers<amdgpu_synid_integer_number>` or
:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
Examples: Examples:
.. parsed-literal:: .. parsed-literal::
@ -1367,6 +1435,10 @@ This modifier is valid for floating point operands only.
neg_lo:[{0..1},{0..1},{0..1}] Select affected operands for instructions with 3 source operands. neg_lo:[{0..1},{0..1},{0..1}] Select affected operands for instructions with 3 source operands.
================================ ================================================================== ================================ ==================================================================
Note: numeric values may be specified as either
:ref:`integer numbers<amdgpu_synid_integer_number>` or
:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
Examples: Examples:
.. parsed-literal:: .. parsed-literal::
@ -1401,6 +1473,10 @@ This modifier is valid for floating point operands only.
neg_hi:[{0..1},{0..1},{0..1}] Select affected operands for instructions with 3 source operands. neg_hi:[{0..1},{0..1},{0..1}] Select affected operands for instructions with 3 source operands.
=============================== ================================================================== =============================== ==================================================================
Note: numeric values may be specified as either
:ref:`integer numbers<amdgpu_synid_integer_number>` or
:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
Examples: Examples:
.. parsed-literal:: .. parsed-literal::
@ -1419,7 +1495,7 @@ VOP3P V_MAD_MIX Modifiers
------------------------- -------------------------
*v_mad_mix_f32*, *v_mad_mixhi_f16* and *v_mad_mixlo_f16* instructions *v_mad_mix_f32*, *v_mad_mixhi_f16* and *v_mad_mixlo_f16* instructions
use *op_sel* and *op_sel_hi* modifiers use *op_sel* and *op_sel_hi* modifiers
in a manner different from *regular* VOP3P instructions. in a manner different from *regular* VOP3P instructions.
See a description below. See a description below.
@ -1449,6 +1525,10 @@ By default, low bits are used for all operands.
op_sel:[{0..1},{0..1},{0..1}] Select location of each 16-bit source operand. op_sel:[{0..1},{0..1},{0..1}] Select location of each 16-bit source operand.
=============================== ================================================ =============================== ================================================
Note: numeric values may be specified as either
:ref:`integer numbers<amdgpu_synid_integer_number>` or
:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
Examples: Examples:
.. parsed-literal:: .. parsed-literal::
@ -1477,6 +1557,10 @@ The location of 16 bits in the operand may be specified by
op_sel_hi:[{0..1},{0..1},{0..1}] Select size of each source operand. op_sel_hi:[{0..1},{0..1},{0..1}] Select size of each source operand.
======================================== ==================================== ======================================== ====================================
Note: numeric values may be specified as either
:ref:`integer numbers<amdgpu_synid_integer_number>` or
:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
Examples: Examples:
.. parsed-literal:: .. parsed-literal::


@ -38,7 +38,8 @@ Assembler currently supports sequences of 1, 2, 3, 4, 8 and 16 *vector* register
=================================================== ==================================================================== =================================================== ====================================================================
**v**\<N> A single 32-bit *vector* register. **v**\<N> A single 32-bit *vector* register.
*N* must be a decimal integer number. *N* must be a decimal
:ref:`integer number<amdgpu_synid_integer_number>`.
**v[**\ <N>\ **]** A single 32-bit *vector* register. **v[**\ <N>\ **]** A single 32-bit *vector* register.
*N* may be specified as an *N* may be specified as an
@ -51,10 +52,11 @@ Assembler currently supports sequences of 1, 2, 3, 4, 8 and 16 *vector* register
or :ref:`absolute expressions<amdgpu_synid_absolute_expression>`. or :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
**[v**\ <N>, \ **v**\ <N+1>, ... **v**\ <K>\ **]** A sequence of (\ *K-N+1*\ ) *vector* registers. **[v**\ <N>, \ **v**\ <N+1>, ... **v**\ <K>\ **]** A sequence of (\ *K-N+1*\ ) *vector* registers.
Register indices must be specified as decimal integer numbers. Register indices must be specified as decimal
:ref:`integer numbers<amdgpu_synid_integer_number>`.
=================================================== ==================================================================== =================================================== ====================================================================
Note. *N* and *K* must satisfy the following conditions: Note: *N* and *K* must satisfy the following conditions:
* *N* <= *K*. * *N* <= *K*.
* 0 <= *N* <= 255. * 0 <= *N* <= 255.
@ -77,26 +79,27 @@ Examples:
.. _amdgpu_synid_nsa: .. _amdgpu_synid_nsa:
*Image* instructions may use special *NSA* (Non-Sequential Address) syntax for *image addresses*: GFX10 *Image* instructions may use special *NSA* (Non-Sequential Address) syntax for *image addresses*:
=================================================== ==================================================================== ===================================== =================================================
Syntax Description Syntax Description
=================================================== ==================================================================== ===================================== =================================================
**[v**\ <A>, \ **v**\ <B>, ... **v**\ <X>\ **]** A sequence of *vector* registers. At least one register **[Vm**, \ **Vn**, ... **Vk**\ **]** A sequence of 32-bit *vector* registers.
must be specified. Each register may be specified using a syntax
defined :ref:`above<amdgpu_synid_v>`.
In contrast with standard syntax described above, registers in In contrast with standard syntax, registers
this sequence are not required to have consecutive indices. in *NSA* sequence are not required to have
Moreover, the same register may appear in the list more than once. consecutive indices. Moreover, the same register
=================================================== ==================================================================== may appear in the list more than once.
===================================== =================================================
Note. Register indices must be in the range 0..255. They must be specified as decimal integer numbers.
Examples: Examples:
.. parsed-literal:: .. parsed-literal::
[v32,v1,v2] [v32,v1,v[2]]
[v[32],v[1:1],[v2]]
[v4,v4,v4,v4] [v4,v4,v4,v4]
.. _amdgpu_synid_s: .. _amdgpu_synid_s:
@ -126,7 +129,9 @@ Sequences of 4 and more *scalar* registers must be quad-aligned.
======================================================== ==================================================================== ======================================================== ====================================================================
**s**\ <N> A single 32-bit *scalar* register. **s**\ <N> A single 32-bit *scalar* register.
*N* must be a decimal integer number. *N* must be a decimal
:ref:`integer number<amdgpu_synid_integer_number>`.
**s[**\ <N>\ **]** A single 32-bit *scalar* register. **s[**\ <N>\ **]** A single 32-bit *scalar* register.
*N* may be specified as an *N* may be specified as an
@ -137,12 +142,14 @@ Sequences of 4 and more *scalar* registers must be quad-aligned.
*N* and *K* may be specified as *N* and *K* may be specified as
:ref:`integer numbers<amdgpu_synid_integer_number>` :ref:`integer numbers<amdgpu_synid_integer_number>`
or :ref:`absolute expressions<amdgpu_synid_absolute_expression>`. or :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
**[s**\ <N>, \ **s**\ <N+1>, ... **s**\ <K>\ **]** A sequence of (\ *K-N+1*\ ) *scalar* registers. **[s**\ <N>, \ **s**\ <N+1>, ... **s**\ <K>\ **]** A sequence of (\ *K-N+1*\ ) *scalar* registers.
Register indices must be specified as decimal integer numbers. Register indices must be specified as decimal
:ref:`integer numbers<amdgpu_synid_integer_number>`.
======================================================== ==================================================================== ======================================================== ====================================================================
Note. *N* and *K* must satisfy the following conditions: Note: *N* and *K* must satisfy the following conditions:
* *N* must be properly aligned based on sequence size. * *N* must be properly aligned based on sequence size.
* *N* <= *K*. * *N* <= *K*.
@ -210,7 +217,8 @@ Sequences of 4 and more *ttmp* registers must be quad-aligned.
============================================================= ==================================================================== ============================================================= ====================================================================
**ttmp**\ <N> A single 32-bit *ttmp* register. **ttmp**\ <N> A single 32-bit *ttmp* register.
*N* must be a decimal integer number. *N* must be a decimal
:ref:`integer number<amdgpu_synid_integer_number>`.
**ttmp[**\ <N>\ **]** A single 32-bit *ttmp* register. **ttmp[**\ <N>\ **]** A single 32-bit *ttmp* register.
*N* may be specified as an *N* may be specified as an
@ -223,10 +231,11 @@ Sequences of 4 and more *ttmp* registers must be quad-aligned.
or :ref:`absolute expressions<amdgpu_synid_absolute_expression>`. or :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
**[ttmp**\ <N>, \ **ttmp**\ <N+1>, ... **ttmp**\ <K>\ **]** A sequence of (\ *K-N+1*\ ) *ttmp* registers. **[ttmp**\ <N>, \ **ttmp**\ <N+1>, ... **ttmp**\ <K>\ **]** A sequence of (\ *K-N+1*\ ) *ttmp* registers.
Register indices must be specified as decimal integer numbers. Register indices must be specified as decimal
:ref:`integer numbers<amdgpu_synid_integer_number>`.
============================================================= ==================================================================== ============================================================= ====================================================================
Note. *N* and *K* must satisfy the following conditions: Note: *N* and *K* must satisfy the following conditions:
* *N* must be properly aligned based on sequence size. * *N* must be properly aligned based on sequence size.
* *N* <= *K*. * *N* <= *K*.
@ -266,8 +275,8 @@ Trap base address, 64-bits wide. Holds the pointer to the current trap handler p
Syntax Description Availability Syntax Description Availability
================== ======================================================================= ============= ================== ======================================================================= =============
tba 64-bit *trap base address* register. GFX7, GFX8 tba 64-bit *trap base address* register. GFX7, GFX8
[tba] 64-bit *trap base address* register (an alternative syntax). GFX7, GFX8 [tba] 64-bit *trap base address* register (an SP3 syntax). GFX7, GFX8
[tba_lo,tba_hi] 64-bit *trap base address* register (an alternative syntax). GFX7, GFX8 [tba_lo,tba_hi] 64-bit *trap base address* register (an SP3 syntax). GFX7, GFX8
================== ======================================================================= ============= ================== ======================================================================= =============
High and low 32 bits of *trap base address* may be accessed as separate registers: High and low 32 bits of *trap base address* may be accessed as separate registers:
@ -277,8 +286,8 @@ High and low 32 bits of *trap base address* may be accessed as separate register
================== ======================================================================= ============= ================== ======================================================================= =============
tba_lo Low 32 bits of *trap base address* register. GFX7, GFX8 tba_lo Low 32 bits of *trap base address* register. GFX7, GFX8
tba_hi High 32 bits of *trap base address* register. GFX7, GFX8 tba_hi High 32 bits of *trap base address* register. GFX7, GFX8
[tba_lo] Low 32 bits of *trap base address* register (an alternative syntax). GFX7, GFX8 [tba_lo] Low 32 bits of *trap base address* register (an SP3 syntax). GFX7, GFX8
[tba_hi] High 32 bits of *trap base address* register (an alternative syntax). GFX7, GFX8 [tba_hi] High 32 bits of *trap base address* register (an SP3 syntax). GFX7, GFX8
================== ======================================================================= ============= ================== ======================================================================= =============
Note that *tba*, *tba_lo* and *tba_hi* are not accessible as assembler registers in GFX9 and GFX10, Note that *tba*, *tba_lo* and *tba_hi* are not accessible as assembler registers in GFX9 and GFX10,
@ -295,8 +304,8 @@ Trap memory address, 64-bits wide.
Syntax Description Availability Syntax Description Availability
================= ======================================================================= ================== ================= ======================================================================= ==================
tma 64-bit *trap memory address* register. GFX7, GFX8 tma 64-bit *trap memory address* register. GFX7, GFX8
[tma] 64-bit *trap memory address* register (an alternative syntax). GFX7, GFX8 [tma] 64-bit *trap memory address* register (an SP3 syntax). GFX7, GFX8
[tma_lo,tma_hi] 64-bit *trap memory address* register (an alternative syntax). GFX7, GFX8 [tma_lo,tma_hi] 64-bit *trap memory address* register (an SP3 syntax). GFX7, GFX8
================= ======================================================================= ================== ================= ======================================================================= ==================
High and low 32 bits of *trap memory address* may be accessed as separate registers: High and low 32 bits of *trap memory address* may be accessed as separate registers:
@ -306,8 +315,8 @@ High and low 32 bits of *trap memory address* may be accessed as separate regist
================= ======================================================================= ================== ================= ======================================================================= ==================
tma_lo Low 32 bits of *trap memory address* register. GFX7, GFX8 tma_lo Low 32 bits of *trap memory address* register. GFX7, GFX8
tma_hi High 32 bits of *trap memory address* register. GFX7, GFX8 tma_hi High 32 bits of *trap memory address* register. GFX7, GFX8
[tma_lo] Low 32 bits of *trap memory address* register (an alternative syntax). GFX7, GFX8 [tma_lo] Low 32 bits of *trap memory address* register (an SP3 syntax). GFX7, GFX8
[tma_hi] High 32 bits of *trap memory address* register (an alternative syntax). GFX7, GFX8 [tma_hi] High 32 bits of *trap memory address* register (an SP3 syntax). GFX7, GFX8
================= ======================================================================= ================== ================= ======================================================================= ==================
Note that *tma*, *tma_lo* and *tma_hi* are not accessible as assembler registers in GFX9 and GFX10, Note that *tma*, *tma_lo* and *tma_hi* are not accessible as assembler registers in GFX9 and GFX10,
@ -324,8 +333,8 @@ Flat scratch address, 64-bits wide. Holds the base address of scratch memory.
Syntax Description Syntax Description
================================== ================================================================ ================================== ================================================================
flat_scratch 64-bit *flat scratch* address register. flat_scratch 64-bit *flat scratch* address register.
[flat_scratch] 64-bit *flat scratch* address register (an alternative syntax). [flat_scratch] 64-bit *flat scratch* address register (an SP3 syntax).
[flat_scratch_lo,flat_scratch_hi] 64-bit *flat scratch* address register (an alternative syntax). [flat_scratch_lo,flat_scratch_hi] 64-bit *flat scratch* address register (an SP3 syntax).
================================== ================================================================ ================================== ================================================================
High and low 32 bits of *flat scratch* address may be accessed as separate registers: High and low 32 bits of *flat scratch* address may be accessed as separate registers:
@ -335,8 +344,8 @@ High and low 32 bits of *flat scratch* address may be accessed as separate regis
========================= ========================================================================= ========================= =========================================================================
flat_scratch_lo Low 32 bits of *flat scratch* address register. flat_scratch_lo Low 32 bits of *flat scratch* address register.
flat_scratch_hi High 32 bits of *flat scratch* address register. flat_scratch_hi High 32 bits of *flat scratch* address register.
[flat_scratch_lo] Low 32 bits of *flat scratch* address register (an alternative syntax). [flat_scratch_lo] Low 32 bits of *flat scratch* address register (an SP3 syntax).
[flat_scratch_hi] High 32 bits of *flat scratch* address register (an alternative syntax). [flat_scratch_hi] High 32 bits of *flat scratch* address register (an SP3 syntax).
========================= ========================================================================= ========================= =========================================================================
.. _amdgpu_synid_xnack: .. _amdgpu_synid_xnack:
@ -355,8 +364,8 @@ received an *XNACK* due to a vector memory operation.
Syntax Description Syntax Description
============================== ===================================================== ============================== =====================================================
xnack_mask 64-bit *xnack mask* register. xnack_mask 64-bit *xnack mask* register.
[xnack_mask] 64-bit *xnack mask* register (an alternative syntax). [xnack_mask] 64-bit *xnack mask* register (an SP3 syntax).
[xnack_mask_lo,xnack_mask_hi] 64-bit *xnack mask* register (an alternative syntax). [xnack_mask_lo,xnack_mask_hi] 64-bit *xnack mask* register (an SP3 syntax).
============================== ===================================================== ============================== =====================================================
High and low 32 bits of *xnack mask* may be accessed as separate registers: High and low 32 bits of *xnack mask* may be accessed as separate registers:
@ -366,8 +375,8 @@ High and low 32 bits of *xnack mask* may be accessed as separate registers:
===================== ============================================================== ===================== ==============================================================
xnack_mask_lo Low 32 bits of *xnack mask* register. xnack_mask_lo Low 32 bits of *xnack mask* register.
xnack_mask_hi High 32 bits of *xnack mask* register. xnack_mask_hi High 32 bits of *xnack mask* register.
[xnack_mask_lo] Low 32 bits of *xnack mask* register (an alternative syntax). [xnack_mask_lo] Low 32 bits of *xnack mask* register (an SP3 syntax).
[xnack_mask_hi] High 32 bits of *xnack mask* register (an alternative syntax). [xnack_mask_hi] High 32 bits of *xnack mask* register (an SP3 syntax).
===================== ============================================================== ===================== ==============================================================
.. _amdgpu_synid_vcc: .. _amdgpu_synid_vcc:
@ -385,8 +394,8 @@ Note that GFX10 H/W does not use high 32 bits of *vcc* in *wave32* mode.
Syntax Description Syntax Description
================ ========================================================================= ================ =========================================================================
vcc 64-bit *vector condition code* register. vcc 64-bit *vector condition code* register.
[vcc] 64-bit *vector condition code* register (an alternative syntax). [vcc] 64-bit *vector condition code* register (an SP3 syntax).
[vcc_lo,vcc_hi] 64-bit *vector condition code* register (an alternative syntax). [vcc_lo,vcc_hi] 64-bit *vector condition code* register (an SP3 syntax).
================ ========================================================================= ================ =========================================================================
High and low 32 bits of *vector condition code* may be accessed as separate registers: High and low 32 bits of *vector condition code* may be accessed as separate registers:
@ -396,8 +405,8 @@ High and low 32 bits of *vector condition code* may be accessed as separate regi
================ ========================================================================= ================ =========================================================================
vcc_lo Low 32 bits of *vector condition code* register. vcc_lo Low 32 bits of *vector condition code* register.
vcc_hi High 32 bits of *vector condition code* register. vcc_hi High 32 bits of *vector condition code* register.
[vcc_lo] Low 32 bits of *vector condition code* register (an alternative syntax). [vcc_lo] Low 32 bits of *vector condition code* register (an SP3 syntax).
[vcc_hi] High 32 bits of *vector condition code* register (an alternative syntax). [vcc_hi] High 32 bits of *vector condition code* register (an SP3 syntax).
================ ========================================================================= ================ =========================================================================
.. _amdgpu_synid_m0: .. _amdgpu_synid_m0:
@ -412,7 +421,7 @@ including register indexing and bounds checking.
Syntax Description Syntax Description
=========== =================================================== =========== ===================================================
m0 A 32-bit *memory* register. m0 A 32-bit *memory* register.
[m0] A 32-bit *memory* register (an alternative syntax). [m0] A 32-bit *memory* register (an SP3 syntax).
=========== =================================================== =========== ===================================================
.. _amdgpu_synid_exec: .. _amdgpu_synid_exec:
@ -430,8 +439,8 @@ Note that GFX10 H/W does not use high 32 bits of *exec* in *wave32* mode.
Syntax Description Syntax Description
===================== ================================================================= ===================== =================================================================
exec 64-bit *execute mask* register. exec 64-bit *execute mask* register.
[exec] 64-bit *execute mask* register (an alternative syntax). [exec] 64-bit *execute mask* register (an SP3 syntax).
[exec_lo,exec_hi] 64-bit *execute mask* register (an alternative syntax). [exec_lo,exec_hi] 64-bit *execute mask* register (an SP3 syntax).
===================== ================================================================= ===================== =================================================================
High and low 32 bits of *execute mask* may be accessed as separate registers: High and low 32 bits of *execute mask* may be accessed as separate registers:
@ -441,8 +450,8 @@ High and low 32 bits of *execute mask* may be accessed as separate registers:
===================== ================================================================= ===================== =================================================================
exec_lo Low 32 bits of *execute mask* register. exec_lo Low 32 bits of *execute mask* register.
exec_hi High 32 bits of *execute mask* register. exec_hi High 32 bits of *execute mask* register.
[exec_lo] Low 32 bits of *execute mask* register (an alternative syntax). [exec_lo] Low 32 bits of *execute mask* register (an SP3 syntax).
[exec_hi] High 32 bits of *execute mask* register (an alternative syntax). [exec_hi] High 32 bits of *execute mask* register (an SP3 syntax).
===================== ================================================================= ===================== =================================================================
.. _amdgpu_synid_vccz: .. _amdgpu_synid_vccz:
@ -452,7 +461,7 @@ vccz
A single bit flag indicating that the :ref:`vcc<amdgpu_synid_vcc>` is all zeros. A single bit flag indicating that the :ref:`vcc<amdgpu_synid_vcc>` is all zeros.
Note. When GFX10 operates in *wave32* mode, this register reflects state of :ref:`vcc_lo<amdgpu_synid_vcc_lo>`. Note: when GFX10 operates in *wave32* mode, this register reflects state of :ref:`vcc_lo<amdgpu_synid_vcc_lo>`.
.. _amdgpu_synid_execz: .. _amdgpu_synid_execz:
@ -461,7 +470,7 @@ execz
A single bit flag indicating that the :ref:`exec<amdgpu_synid_exec>` is all zeros. A single bit flag indicating that the :ref:`exec<amdgpu_synid_exec>` is all zeros.
Note. When GFX10 operates in *wave32* mode, this register reflects state of :ref:`exec_lo<amdgpu_synid_exec>`. Note: when GFX10 operates in *wave32* mode, this register reflects state of :ref:`exec_lo<amdgpu_synid_exec>`.
.. _amdgpu_synid_scc: .. _amdgpu_synid_scc:
@ -495,19 +504,20 @@ GFX10 only.
.. _amdgpu_synid_constant: .. _amdgpu_synid_constant:
constant inline constant
-------- ---------------
A set of integer and floating-point *inline* constants and values: An *inline constant* is an integer or a floating-point value encoded as a part of an instruction.
Compare *inline constants* with :ref:`literals<amdgpu_synid_literal>`.
Inline constants include:
* :ref:`iconst<amdgpu_synid_iconst>` * :ref:`iconst<amdgpu_synid_iconst>`
* :ref:`fconst<amdgpu_synid_fconst>` * :ref:`fconst<amdgpu_synid_fconst>`
* :ref:`ival<amdgpu_synid_ival>` * :ref:`ival<amdgpu_synid_ival>`
In contrast with :ref:`literals<amdgpu_synid_literal>`, these operands are encoded as a part of instruction.
If a number may be encoded as either If a number may be encoded as either
a :ref:`literal<amdgpu_synid_literal>` or a :ref:`literal<amdgpu_synid_literal>` or
a :ref:`constant<amdgpu_synid_constant>`, a :ref:`constant<amdgpu_synid_constant>`,
assembler selects the latter encoding as more efficient. assembler selects the latter encoding as more efficient.
@ -516,17 +526,14 @@ assembler selects the latter encoding as more efficient.
iconst iconst
~~~~~~ ~~~~~~
An :ref:`integer number<amdgpu_synid_integer_number>` An :ref:`integer number<amdgpu_synid_integer_number>` or
an :ref:`absolute expression<amdgpu_synid_absolute_expression>`
encoded as an *inline constant*. encoded as an *inline constant*.
Only a small fraction of integer numbers may be encoded as *inline constants*. Only a small fraction of integer numbers may be encoded as *inline constants*.
They are enumerated in the table below. They are enumerated in the table below.
Other integer numbers have to be encoded as :ref:`literals<amdgpu_synid_literal>`. Other integer numbers have to be encoded as :ref:`literals<amdgpu_synid_literal>`.
Integer *inline constants* are converted to
:ref:`expected operand type<amdgpu_syn_instruction_type>`
as described :ref:`here<amdgpu_synid_int_const_conv>`.
================================== ==================================== ================================== ====================================
Value Note Value Note
================================== ==================================== ================================== ====================================
@ -548,10 +555,6 @@ Only a small fraction of floating-point numbers may be encoded as *inline consta
They are enumerated in the table below. They are enumerated in the table below.
Other floating-point numbers have to be encoded as :ref:`literals<amdgpu_synid_literal>`. Other floating-point numbers have to be encoded as :ref:`literals<amdgpu_synid_literal>`.
Floating-point *inline constants* are converted to
:ref:`expected operand type<amdgpu_syn_instruction_type>`
as described :ref:`here<amdgpu_synid_fp_const_conv>`.
===================== ===================================================== ================== ===================== ===================================================== ==================
Value Note Availability Value Note Availability
===================== ===================================================== ================== ===================== ===================================================== ==================
@ -594,21 +597,18 @@ These operands provide read-only access to H/W registers.
literal literal
------- -------
A literal is a 64-bit value which is encoded as a separate 32-bit dword in the instruction stream. A *literal* is a 64-bit value encoded as a separate 32-bit dword in the instruction stream.
Compare *literals* with :ref:`inline constants<amdgpu_synid_constant>`.
If a number may be encoded as either If a number may be encoded as either
a :ref:`literal<amdgpu_synid_literal>` or a :ref:`literal<amdgpu_synid_literal>` or
an :ref:`inline constant<amdgpu_synid_constant>`, an :ref:`inline constant<amdgpu_synid_constant>`,
assembler selects the latter encoding as more efficient. assembler selects the latter encoding as more efficient.
Literals may be specified as :ref:`integer numbers<amdgpu_synid_integer_number>`, Literals may be specified as :ref:`integer numbers<amdgpu_synid_integer_number>`,
:ref:`floating-point numbers<amdgpu_synid_floating-point_number>` or :ref:`floating-point numbers<amdgpu_synid_floating-point_number>`,
:ref:`expressions<amdgpu_synid_expression>` :ref:`absolute expressions<amdgpu_synid_absolute_expression>` or
(expressions are currently supported for 32-bit operands only). :ref:`relocatable expressions<amdgpu_synid_relocatable_expression>`.
A 64-bit literal value is converted by assembler
to an :ref:`expected operand type<amdgpu_syn_instruction_type>`
as described :ref:`here<amdgpu_synid_lit_conv>`.
An instruction may use only one literal but several operands may refer to the same literal. An instruction may use only one literal but several operands may refer to the same literal.
@ -617,30 +617,38 @@ An instruction may use only one literal but several operands may refer the same
uimm8 uimm8
----- -----
An 8-bit positive :ref:`integer number<amdgpu_synid_integer_number>`. An 8-bit :ref:`integer number<amdgpu_synid_integer_number>`
The value is encoded as part of the opcode so it is free to use. or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
The value must be in the range 0..0xFF.
.. _amdgpu_synid_uimm32: .. _amdgpu_synid_uimm32:
uimm32 uimm32
------ ------
A 32-bit positive :ref:`integer number<amdgpu_synid_integer_number>`. A 32-bit :ref:`integer number<amdgpu_synid_integer_number>`
The value is stored as a separate 32-bit dword in the instruction stream. or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
The value must be in the range 0..0xFFFFFFFF.
.. _amdgpu_synid_uimm20: .. _amdgpu_synid_uimm20:
uimm20 uimm20
------ ------
A 20-bit positive :ref:`integer number<amdgpu_synid_integer_number>`. A 20-bit :ref:`integer number<amdgpu_synid_integer_number>`
or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
The value must be in the range 0..0xFFFFF.
.. _amdgpu_synid_uimm21: .. _amdgpu_synid_uimm21:
uimm21 uimm21
------ ------
A 21-bit positive :ref:`integer number<amdgpu_synid_integer_number>`. A 21-bit :ref:`integer number<amdgpu_synid_integer_number>`
or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
The value must be in the range 0..0x1FFFFF.
.. WARNING:: Assembler currently supports 20-bit offsets only. Use :ref:`uimm20<amdgpu_synid_uimm20>` as a replacement. .. WARNING:: Assembler currently supports 20-bit offsets only. Use :ref:`uimm20<amdgpu_synid_uimm20>` as a replacement.
@ -649,7 +657,10 @@ A 21-bit positive :ref:`integer number<amdgpu_synid_integer_number>`.
simm21 simm21
------ ------
A 21-bit :ref:`integer number<amdgpu_synid_integer_number>`. A 21-bit :ref:`integer number<amdgpu_synid_integer_number>`
or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
The value must be in the range -0x100000..0x0FFFFF.
.. WARNING:: Assembler currently supports 20-bit unsigned offsets only. Use :ref:`uimm20<amdgpu_synid_uimm20>` as a replacement. .. WARNING:: Assembler currently supports 20-bit unsigned offsets only. Use :ref:`uimm20<amdgpu_synid_uimm20>` as a replacement.
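The ranges stated for the immediate operands above can be collected into a single range check. This is only an illustrative sketch: the table ``IMMEDIATE_RANGES`` and the function ``check_immediate`` are hypothetical names, not part of the assembler, and the sketch ignores the current limitation that only 20-bit unsigned offsets are supported.

```python
# Hypothetical helper: ranges copied from the operand descriptions above.
IMMEDIATE_RANGES = {
    "uimm8": (0, 0xFF),
    "uimm20": (0, 0xFFFFF),
    "uimm21": (0, 0x1FFFFF),
    "uimm32": (0, 0xFFFFFFFF),
    "simm21": (-0x100000, 0x0FFFFF),
}


def check_immediate(kind, value):
    """Return value if it fits the documented range for kind, else raise ValueError."""
    low, high = IMMEDIATE_RANGES[kind]
    if not low <= value <= high:
        raise ValueError(f"{value:#x} is out of range for {kind}")
    return value
```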
@ -678,27 +689,20 @@ Integer Numbers
--------------- ---------------
Integer numbers are 64 bits wide. Integer numbers are 64 bits wide.
They may be specified in binary, octal, hexadecimal and decimal formats: They are converted to :ref:`expected operand type<amdgpu_syn_instruction_type>`
as described :ref:`here<amdgpu_synid_int_conv>`.
============== ==================================== Integer numbers may be specified in binary, octal, hexadecimal and decimal formats:
Format Syntax
============== ====================================
Decimal [-]?[1-9][0-9]*
Binary [-]?0b[01]+
Octal [-]?0[0-7]+
Hexadecimal [-]?0x[0-9a-fA-F]+
\ [-]?[0x]?[0-9][0-9a-fA-F]*[hH]
============== ====================================
Examples: ============ =============================== ========
Format Syntax Example
.. parsed-literal:: ============ =============================== ========
Decimal [-]?[1-9][0-9]* -1234
-1234 Binary [-]?0b[01]+ 0b1010
0b1010 Octal [-]?0[0-7]+ 010
010 Hexadecimal [-]?0x[0-9a-fA-F]+ 0xff
0xff \ [-]?[0x]?[0-9][0-9a-fA-F]*[hH] 0ffh
0ffh ============ =============================== ========
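The integer syntax table can be read as a small set of regular expressions. The sketch below transcribes the patterns from the table as written and converts a matching string to a Python integer. The names ``INTEGER_PATTERNS`` and ``parse_integer`` are hypothetical, and this is not how the assembler is implemented; in particular, the sketch accepts only strings matched by the table patterns and does not apply any 64-bit width check.

```python
import re

# (pattern, base) pairs transcribed from the syntax table.
# Group 1 is the optional sign, group 2 holds the digits.
INTEGER_PATTERNS = [
    (r"(-?)([1-9][0-9]*)", 10),                 # Decimal
    (r"(-?)0b([01]+)", 2),                      # Binary
    (r"(-?)0([0-7]+)", 8),                      # Octal
    (r"(-?)0x([0-9a-fA-F]+)", 16),              # Hexadecimal, 0x prefix
    (r"(-?)[0x]?([0-9][0-9a-fA-F]*)[hH]", 16),  # Hexadecimal, h suffix
]


def parse_integer(text):
    """Convert text written in one of the table formats to an int."""
    for pattern, base in INTEGER_PATTERNS:
        match = re.fullmatch(pattern, text)
        if match:
            sign, digits = match.groups()
            value = int(digits, base)
            return -value if sign else value
    raise ValueError(f"not an integer number: {text!r}")


# The examples from the table.
for example in ["-1234", "0b1010", "010", "0xff", "0ffh"]:
    print(example, "=", parse_integer(example))
```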
.. _amdgpu_synid_floating-point_number: .. _amdgpu_synid_floating-point_number:
@ -706,31 +710,29 @@ Floating-Point Numbers
---------------------- ----------------------
All floating-point numbers are handled as double (64 bits wide). All floating-point numbers are handled as double (64 bits wide).
They are converted to
:ref:`expected operand type<amdgpu_syn_instruction_type>`
as described :ref:`here<amdgpu_synid_fp_conv>`.
Floating-point numbers may be specified in hexadecimal and decimal formats: Floating-point numbers may be specified in hexadecimal and decimal formats:
============== ======================================================== ======================================================== ============ ======================================================== ====================== ====================
Format Syntax Note Format Syntax Examples Note
============== ======================================================== ======================================================== ============ ======================================================== ====================== ====================
Decimal [-]?[0-9]*[.][0-9]*([eE][+-]?[0-9]*)? Must include either a decimal separator or an exponent. Decimal [-]?[0-9]*[.][0-9]*([eE][+-]?[0-9]*)? -1.234, 234e2 Must include either
Hexadecimal [-]0x[0-9a-fA-F]*(.[0-9a-fA-F]*)?[pP][+-]?[0-9a-fA-F]+ a decimal separator
============== ======================================================== ======================================================== or an exponent.
Hexadecimal [-]0x[0-9a-fA-F]*(.[0-9a-fA-F]*)?[pP][+-]?[0-9a-fA-F]+ -0x1afp-10, 0x.1afp10
Examples: ============ ======================================================== ====================== ====================
.. parsed-literal::
-1.234
234e2
-0x1afp-10
0x.1afp10
.. _amdgpu_synid_expression: .. _amdgpu_synid_expression:
Expressions Expressions
=========== ===========
An expression specifies an address or a numeric value. An expression is evaluated to a 64-bit integer.
Note that floating-point expressions are not supported.
There are two kinds of expressions: There are two kinds of expressions:
* :ref:`Absolute<amdgpu_synid_absolute_expression>`. * :ref:`Absolute<amdgpu_synid_absolute_expression>`.
@ -741,10 +743,14 @@ There are two kinds of expressions:
Absolute Expressions Absolute Expressions
-------------------- --------------------
The value of an absolute expression remains the same after program relocation. The value of an absolute expression does not change after program relocation.
Absolute expressions must not include unassigned and relocatable values Absolute expressions must not include unassigned and relocatable values
such as labels. such as labels.
Absolute expressions are evaluated to 64-bit integer values and converted to
:ref:`expected operand type<amdgpu_syn_instruction_type>`
as described :ref:`here<amdgpu_synid_int_conv>`.
Examples: Examples:
.. parsed-literal:: .. parsed-literal::
@ -760,46 +766,39 @@ Relocatable Expressions
The value of a relocatable expression depends on program relocation. The value of a relocatable expression depends on program relocation.
Note that use of relocatable expressions is limited to branch targets Note that use of relocatable expressions is limited to branch targets
and 32-bit :ref:`literals<amdgpu_synid_literal>`. and 32-bit integer operands.
Addition information about relocation may be found :ref:`here<amdgpu-relocation-records>`. A relocatable expression is evaluated to a 64-bit integer value
which depends on operand kind and :ref:`relocation type<amdgpu-relocation-records>`
Examples: of symbol(s) used in the expression. For example, if an instruction refers to a label,
this reference is evaluated to an offset from the address after the instruction
to the label address:
.. parsed-literal:: .. parsed-literal::
y = x + 10 // x is not yet defined. Undefined symbols are assumed to be PC-relative. label:
z = . v_add_co_u32_e32 v0, vcc, label, v1 // 'label' operand is evaluated to -4
Expression Data Type Note that values of relocatable expressions are usually unknown at assembly time;
-------------------- they are resolved later by a linker and converted to
:ref:`expected operand type<amdgpu_syn_instruction_type>`
as described :ref:`here<amdgpu_synid_rl_conv>`.
Expressions and operands of expressions are interpreted as 64-bit integers. Operands and Operations
-----------------------
Expressions may include 64-bit :ref:`floating-point numbers<amdgpu_synid_floating-point_number>` (double). Expressions are composed of 64-bit integer operands and operations.
However these operands are also handled as 64-bit integers Operands include :ref:`integer numbers<amdgpu_synid_integer_number>`
using binary representation of specified floating-point numbers. and :ref:`symbols<amdgpu_synid_symbol>`.
No conversion from floating-point to integer is performed.
Examples:
.. parsed-literal::
x = 0.1 // x is assigned an integer 4591870180066957722 which is a binary representation of 0.1.
y = x + x // y is a sum of two integer values; it is not equal to 0.2!
Syntax
------
Expressions are composed of
:ref:`symbols<amdgpu_synid_symbol>`,
:ref:`integer numbers<amdgpu_synid_integer_number>`,
:ref:`floating-point numbers<amdgpu_synid_floating-point_number>`,
:ref:`binary operators<amdgpu_synid_expression_bin_op>`,
:ref:`unary operators<amdgpu_synid_expression_un_op>` and subexpressions.
Expressions may also use "." which is a reference to the current PC (program counter). Expressions may also use "." which is a reference to the current PC (program counter).
:ref:`Unary<amdgpu_synid_expression_un_op>` and :ref:`binary<amdgpu_synid_expression_bin_op>`
operations produce 64-bit integer results.
Syntax of Expressions
---------------------
The syntax of expressions is shown below:: The syntax of expressions is shown below::
expr ::= expr binop expr | primaryexpr ; expr ::= expr binop expr | primaryexpr ;
@ -887,7 +886,7 @@ They operate on and produce 64-bit integers.
Symbols Symbols
------- -------
A symbol is a named 64-bit value, representing a relocatable A symbol is a named 64-bit integer value, representing a relocatable
address or an absolute (non-relocatable) number. address or an absolute (non-relocatable) number.
Symbol names have the following syntax: Symbol names have the following syntax:
@ -907,128 +906,78 @@ The table below provides several examples of syntax used for symbol definition.
A symbol may be used before it is declared or assigned; A symbol may be used before it is declared or assigned;
unassigned symbols are assumed to be PC-relative. unassigned symbols are assumed to be PC-relative.
Addition information about symbols may be found :ref:`here<amdgpu-symbols>`. Additional information about symbols may be found :ref:`here<amdgpu-symbols>`.
.. _amdgpu_synid_conv: .. _amdgpu_synid_conv:
Conversions Type and Size Conversion
=========== ========================
This section describes what happens when a 64-bit This section describes what happens when a 64-bit
:ref:`integer number<amdgpu_synid_integer_number>`, a :ref:`integer number<amdgpu_synid_integer_number>`, a
:ref:`floating-point numbers<amdgpu_synid_floating-point_number>` or a :ref:`floating-point number<amdgpu_synid_floating-point_number>` or an
:ref:`symbol<amdgpu_synid_symbol>` :ref:`expression<amdgpu_synid_expression>`
is used for an operand which has a different type or size. is used for an operand which has a different type or size.
Depending on operand kind, this conversion is performed by either assembler or AMDGPU H/W: .. _amdgpu_synid_int_conv:
* Values encoded as :ref:`inline constants<amdgpu_synid_constant>` are handled by H/W. Conversion of Integer Values
* Values encoded as :ref:`literals<amdgpu_synid_literal>` are converted by assembler. ----------------------------
.. _amdgpu_synid_const_conv: Instruction operands may be specified as 64-bit :ref:`integer numbers<amdgpu_synid_integer_number>` or
:ref:`absolute expressions<amdgpu_synid_absolute_expression>`. These values are converted to
the :ref:`expected operand type<amdgpu_syn_instruction_type>` using the following steps:
Inline Constants 1. *Validation*. Assembler checks if the input value may be truncated without loss to the required *truncation width*
---------------- (see the table below). There are two cases when this operation is enabled:
.. _amdgpu_synid_int_const_conv: * The truncated bits are all 0.
* The truncated bits are all 1 and the value after truncation has its MSB bit set.
Integer Inline Constants In all other cases assembler triggers an error.
~~~~~~~~~~~~~~~~~~~~~~~~
Integer :ref:`inline constants<amdgpu_synid_constant>` 2. *Conversion*. The input value is converted to the expected type as described in the table below.
may be thought of as 64-bit Depending on operand kind, this conversion is performed by either assembler or AMDGPU H/W (or both).
:ref:`integer numbers<amdgpu_synid_integer_number>`;
when used as operands they are truncated to the size of
:ref:`expected operand type<amdgpu_syn_instruction_type>`.
No data type conversions are performed.
Examples: ============== ================= =============== ====================================================================
Expected type Truncation Width Conversion Description
============== ================= =============== ====================================================================
i16, u16, b16 16 num.u16 Truncate to 16 bits.
i32, u32, b32 32 num.u32 Truncate to 32 bits.
i64 32 {-1,num.i32} Truncate to 32 bits and then sign-extend the result to 64 bits.
u64, b64 32 {0,num.u32} Truncate to 32 bits and then zero-extend the result to 64 bits.
f16 16 num.u16 Use low 16 bits as an f16 value.
f32 32 num.u32 Use low 32 bits as an f32 value.
f64 32 {num.u32,0} Use low 32 bits of the number as high 32 bits
of the result; low 32 bits of the result are zeroed.
============== ================= =============== ====================================================================
Examples of enabled conversions:
.. parsed-literal:: .. parsed-literal::
// GFX9 // GFX9
v_add_u16 v0, -1, 0 // v0 = 0xFFFF v_add_u16 v0, -1, 0 // src0 = 0xFFFF
v_add_f16 v0, -1, 0 // v0 = 0xFFFF (NaN) v_add_f16 v0, -1, 0 // src0 = 0xFFFF (NaN)
//
v_add_u32 v0, -1, 0 // src0 = 0xFFFFFFFF
v_add_f32 v0, -1, 0 // src0 = 0xFFFFFFFF (NaN)
//
v_add_u16 v0, 0xff00, v0 // src0 = 0xff00
v_add_u16 v0, 0xffffffffffffff00, v0 // src0 = 0xff00
v_add_u16 v0, -256, v0 // src0 = 0xff00
//
s_bfe_i64 s[0:1], 0xffefffff, s3 // src0 = 0xffffffffffefffff
s_bfe_u64 s[0:1], 0xffefffff, s3 // src0 = 0x00000000ffefffff
v_ceil_f64_e32 v[0:1], 0xffefffff // src0 = 0xffefffff00000000 (-1.7976922776554302e308)
//
x = 0xffefffff //
s_bfe_i64 s[0:1], x, s3 // src0 = 0xffffffffffefffff
s_bfe_u64 s[0:1], x, s3 // src0 = 0x00000000ffefffff
v_ceil_f64_e32 v[0:1], x // src0 = 0xffefffff00000000 (-1.7976922776554302e308)
v_add_u32 v0, -1, 0 // v0 = 0xFFFFFFFF Examples of disabled conversions:
v_add_f32 v0, -1, 0 // v0 = 0xFFFFFFFF (NaN)
.. _amdgpu_synid_fp_const_conv:
Floating-Point Inline Constants
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Floating-point :ref:`inline constants<amdgpu_synid_constant>`
may be thought of as 64-bit
:ref:`floating-point numbers<amdgpu_synid_floating-point_number>`;
when used as operands they are converted to a floating-point number of
:ref:`expected operand size<amdgpu_syn_instruction_type>`.
Examples:
.. parsed-literal::
// GFX9
v_add_f16 v0, 1.0, 0 // v0 = 0x3C00 (1.0)
v_add_u16 v0, 1.0, 0 // v0 = 0x3C00
v_add_f32 v0, 1.0, 0 // v0 = 0x3F800000 (1.0)
v_add_u32 v0, 1.0, 0 // v0 = 0x3F800000
.. _amdgpu_synid_lit_conv:
Literals
--------
.. _amdgpu_synid_int_lit_conv:
Integer Literals
~~~~~~~~~~~~~~~~
Integer :ref:`literals<amdgpu_synid_literal>`
are specified as 64-bit :ref:`integer numbers<amdgpu_synid_integer_number>`.
When used as operands they are converted to
:ref:`expected operand type<amdgpu_syn_instruction_type>` as described below.
============== ============== =============== ====================================================================
Expected type Condition Result Note
============== ============== =============== ====================================================================
i16, u16, b16 cond(num,16) num.u16 Truncate to 16 bits.
i32, u32, b32 cond(num,32) num.u32 Truncate to 32 bits.
i64 cond(num,32) {-1,num.i32} Truncate to 32 bits and then sign-extend the result to 64 bits.
u64, b64 cond(num,32) { 0,num.u32} Truncate to 32 bits and then zero-extend the result to 64 bits.
f16 cond(num,16) num.u16 Use low 16 bits as an f16 value.
f32 cond(num,32) num.u32 Use low 32 bits as an f32 value.
f64 cond(num,32) {num.u32,0} Use low 32 bits of the number as high 32 bits
of the result; low 32 bits of the result are zeroed.
============== ============== =============== ====================================================================
The condition *cond(X,S)* indicates if a 64-bit number *X*
can be converted to a smaller size *S* by truncation of upper bits.
There are two cases when the conversion is possible:
* The truncated bits are all 0.
* The truncated bits are all 1 and the value after truncation has its MSB bit set.
Examples of valid literals:
.. parsed-literal::
// GFX9
// Literal value after conversion:
v_add_u16 v0, 0xff00, v0 // 0xff00
v_add_u16 v0, 0xffffffffffffff00, v0 // 0xff00
v_add_u16 v0, -256, v0 // 0xff00
// Literal value after conversion:
s_bfe_i64 s[0:1], 0xffefffff, s3 // 0xffffffffffefffff
s_bfe_u64 s[0:1], 0xffefffff, s3 // 0x00000000ffefffff
v_ceil_f64_e32 v[0:1], 0xffefffff // 0xffefffff00000000 (-1.7976922776554302e308)
Examples of invalid literals:
.. parsed-literal:: .. parsed-literal::
@ -1037,49 +986,57 @@ Examples of invalid literals:
v_add_u16 v0, 0x1ff00, v0 // truncated bits are not all 0 or 1 v_add_u16 v0, 0x1ff00, v0 // truncated bits are not all 0 or 1
v_add_u16 v0, 0xffffffffffff00ff, v0 // truncated bits do not match MSB of the result v_add_u16 v0, 0xffffffffffff00ff, v0 // truncated bits do not match MSB of the result
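The truncation rule and the per-type conversions for integer values can be sketched in a few lines. This is an illustration of the rules as stated in the table and the two bullet points, not the assembler's code; ``truncate_checked`` and ``convert_integer`` are hypothetical names, and the type names are passed as plain strings.

```python
MASK64 = (1 << 64) - 1


def truncate_checked(value, width):
    """Return the low `width` bits of a 64-bit value; raise if bits would be lost."""
    bits = value & MASK64                  # two's-complement view of the input
    low = bits & ((1 << width) - 1)
    high = bits >> width                   # the bits that truncation discards
    all_ones = (1 << (64 - width)) - 1
    msb_set = (low >> (width - 1)) & 1
    # Allowed: discarded bits are all 0, or all 1 with the MSB of the result set.
    if high == 0 or (high == all_ones and msb_set):
        return low
    raise ValueError(f"{bits:#x} cannot be truncated to {width} bits")


def convert_integer(value, expected_type):
    """Convert a 64-bit integer to the bit pattern used for expected_type."""
    if expected_type in ("i16", "u16", "b16", "f16"):
        return truncate_checked(value, 16)
    if expected_type in ("i32", "u32", "b32", "f32"):
        return truncate_checked(value, 32)
    low = truncate_checked(value, 32)
    if expected_type == "i64":             # sign-extend the 32-bit result
        return low | (0xFFFFFFFF00000000 if low & 0x80000000 else 0)
    if expected_type in ("u64", "b64"):    # zero-extend the 32-bit result
        return low
    if expected_type == "f64":             # value becomes the high 32 bits
        return low << 32
    raise ValueError(f"unknown type {expected_type!r}")


print(hex(convert_integer(-256, "u16")))
print(hex(convert_integer(0xFFEFFFFF, "i64")))
print(hex(convert_integer(0xFFEFFFFF, "f64")))
```

With these definitions, ``0x1ff00`` and ``0xffffffffffff00ff`` are both rejected for a 16-bit operand, which matches the two disabled examples above.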
.. _amdgpu_synid_fp_lit_conv: .. _amdgpu_synid_fp_conv:
Floating-Point Literals Conversion of Floating-Point Values
~~~~~~~~~~~~~~~~~~~~~~~ -----------------------------------
Floating-point :ref:`literals<amdgpu_synid_literal>` are specified as 64-bit Instruction operands may be specified as 64-bit :ref:`floating-point numbers<amdgpu_synid_floating-point_number>`.
:ref:`floating-point numbers<amdgpu_synid_floating-point_number>`. These values are converted to the :ref:`expected operand type<amdgpu_syn_instruction_type>` using the following steps:
When used as operands they are converted to 1. *Validation*. Assembler checks if the input f64 number can be converted
:ref:`expected operand type<amdgpu_syn_instruction_type>` as described below. to the *required floating-point type* (see the table below) without overflow or underflow.
Precision loss is allowed. If this conversion is not possible, assembler triggers an error.
============== ============== ================= ================================================================= 2. *Conversion*. The input value is converted to the expected type as described in the table below.
Expected type Condition Result Note Depending on operand kind, this is performed by either assembler or AMDGPU H/W (or both).
============== ============== ================= =================================================================
i16, u16, b16 cond(num,16) f16(num) Convert to f16 and use bits of the result as an integer value.
i32, u32, b32 cond(num,32) f32(num) Convert to f32 and use bits of the result as an integer value.
i64, u64, b64 false \- Conversion disabled because of an unclear semantics.
f16 cond(num,16) f16(num) Convert to f16.
f32 cond(num,32) f32(num) Convert to f32.
f64 true {num.u32.hi,0} Use high 32 bits of the number as high 32 bits of the result;
zero-fill low 32 bits of the result.
Note that the result may differ from the original number. ============== ================ ================= =================================================================
============== ============== ================= ================================================================= Expected type Required FP Type Conversion Description
============== ================ ================= =================================================================
i16, u16, b16 f16 f16(num) Convert to f16 and use bits of the result as an integer value.
i32, u32, b32 f32 f32(num) Convert to f32 and use bits of the result as an integer value.
i64, u64, b64 \- \- Conversion disabled.
f16 f16 f16(num) Convert to f16.
f32 f32 f32(num) Convert to f32.
f64 f64 {num.u32.hi,0} Use high 32 bits of the number as high 32 bits of the result;
zero-fill low 32 bits of the result.
The condition *cond(X,S)* indicates if an f64 number *X* can be converted Note that the result may differ from the original number.
to a smaller *S*-bit floating-point type without overflow or underflow. ============== ================ ================= =================================================================
Precision lost is allowed.
Examples of valid literals: Examples of enabled conversions:
.. parsed-literal:: .. parsed-literal::
// GFX9 // GFX9
v_add_f16 v1, 65500.0, v2 v_add_f16 v0, 1.0, 0 // src0 = 0x3C00 (1.0)
v_add_f32 v1, 65600.0, v2 v_add_u16 v0, 1.0, 0 // src0 = 0x3C00
//
v_add_f32 v0, 1.0, 0 // src0 = 0x3F800000 (1.0)
v_add_u32 v0, 1.0, 0 // src0 = 0x3F800000
// Literal value before conversion: 1.7976931348623157e308 (0x7fefffffffffffff) // src0 before conversion:
// Literal value after conversion: 1.7976922776554302e308 (0x7fefffff00000000) // 1.7976931348623157e308 = 0x7fefffffffffffff
// src0 after conversion:
// 1.7976922776554302e308 = 0x7fefffff00000000
v_ceil_f64 v[0:1], 1.7976931348623157e308 v_ceil_f64 v[0:1], 1.7976931348623157e308
Examples of invalid literals: v_add_f16 v1, 65500.0, v2 // ok for f16.
v_add_f32 v1, 65600.0, v2 // ok for f32, but would result in overflow for f16.
Examples of disabled conversions:
.. parsed-literal:: .. parsed-literal::
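Note on the hunk above: the two-step procedure it documents (validation, then conversion) can be sketched in Python. This is only an illustration of the table, not the assembler's implementation. The function name `convert_fp_operand` and its string type tags are hypothetical, and the underflow check (rejecting a nonzero input that rounds to zero) is a simplifying assumption that may differ from the assembler's exact criterion.

```python
import struct

# struct format codes for IEEE half/single floats and same-sized unsigned ints.
_FP_FORMAT = {16: "<e", 32: "<f"}
_INT_FORMAT = {16: "<H", 32: "<I"}


def convert_fp_operand(num: float, expected: str) -> int:
    """Return operand bits for an f64 input and a type tag such as 'f16' or 'u32'.

    Hypothetical helper: a sketch of the documented table, nothing more.
    """
    kind, size = expected[0], int(expected[1:])

    if size == 64:
        if kind != "f":
            # i64, u64, b64: the table lists this conversion as disabled.
            raise ValueError("conversion disabled for 64-bit integer operands")
        # f64: keep the high 32 bits of the number, zero-fill the low 32 bits.
        bits = struct.unpack("<Q", struct.pack("<d", num))[0]
        return bits & 0xFFFFFFFF00000000

    # Validation: the value must fit the required FP type (f16 or f32).
    try:
        packed = struct.pack(_FP_FORMAT[size], num)
    except OverflowError as exc:
        raise ValueError(f"{num} overflows f{size}") from exc
    narrowed = struct.unpack(_FP_FORMAT[size], packed)[0]
    if num != 0.0 and narrowed == 0.0:
        # Assumed underflow criterion: a nonzero input collapsed to zero.
        raise ValueError(f"{num} underflows f{size}")

    # Conversion: the bits of the narrowed value are used as-is, also for
    # integer operand types (i16/u16/b16, i32/u32/b32).
    return struct.unpack(_INT_FORMAT[size], packed)[0]
```

With this sketch, `convert_fp_operand(1.0, "u16")` yields the same bits as the `f16` case, matching the `v_add_u16` example in the hunk, and `convert_fp_operand(65600.0, "f16")` is rejected as an overflow.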
@ -1087,25 +1044,35 @@ Examples of invalid literals:
 v_add_f16 v1, 65600.0, v2 // overflow
-.. _amdgpu_synid_exp_conv:
-Expressions
-~~~~~~~~~~~
-Expressions operate with and result in 64-bit integers.
-When used as operands they are truncated to
-:ref:`expected operand size<amdgpu_syn_instruction_type>`.
-No data type conversions are performed.
-Examples:
+.. _amdgpu_synid_rl_conv:
+Conversion of Relocatable Values
+--------------------------------
+:ref:`Relocatable expressions<amdgpu_synid_relocatable_expression>`
+may be used with 32-bit integer operands and jump targets.
+When the value of a relocatable expression is resolved by a linker, it is
+converted as needed and truncated to the operand size. The conversion depends
+on :ref:`relocation type<amdgpu-relocation-records>` and operand kind.
+For example, when a 32-bit operand of an instruction refers a relocatable expression *expr*,
+this reference is evaluated to a 64-bit offset from the address after the
+instruction to the address being referenced, *counted in bytes*.
+Then the value is truncated to 32 bits and encoded as a literal:
 .. parsed-literal::
-// GFX9
-x = 0.1
-v_sqrt_f32 v0, x // v0 = [low 32 bits of 0.1 (double)]
-v_sqrt_f32 v0, (0.1 + 0) // the same as above
-v_sqrt_f32 v0, 0.1 // v0 = [0.1 (double) converted to float]
+expr = .
+v_add_co_u32_e32 v0, vcc, expr, v1 // 'expr' operand is evaluated to -4
+                                   // and then truncated to 0xFFFFFFFC
+As another example, when a branch instruction refers a label,
+this reference is evaluated to an offset from the address after the
+instruction to the label address, *counted in dwords*.
+Then the value is truncated to 16 bits:
+.. parsed-literal::
+label:
+s_branch label // 'label' operand is evaluated to -1 and truncated to 0xFFFF