llvm-mirror/test/CodeGen/AMDGPU/soft-clause-dbg-value.mir
Matt Arsenault f1ba4465de AMDGPU: Use kill instruction to hint soft clause live ranges
Previously we would use a bundle to hint the register allocator not to
overwrite the pointers in a sequence of loads, to avoid breaking soft
clauses. This bundling was based on a fuzzy register pressure
heuristic, so we could not guarantee it would not demand more
registers than are really available. This would result in the register
allocator failing on unsatisfiable bundles. Use a kill to artificially
extend the live ranges instead, so register allocation can always
succeed, even if it means extra spills in the worst case.

This seems to capture most of the benefit of the bundle while avoiding
most of its risk. However, the lit tests do show a handful of
regressions. In some cases with sequences of volatile loads, unused
load components end up getting reallocated to the next load, which
forces a wait between them. There are also a few small scheduling
regressions where a hazard used to be avoided, and one spill torture
test whose stack usage nearly doubles for reasons that are not clear.
There is also a bit of noise from leftover kills (it may make sense
for post-RA pseudos to strip all of these out).
2021-02-26 18:26:40 -05:00
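
In essence, the pass now ends each soft clause with a KILL of the shared base pointer instead of wrapping the loads in a BUNDLE. A minimal before/after sketch, distilled from the test's CHECK lines below (not the pass's literal output):

  %0:sreg_64 = COPY $sgpr4_sgpr5
  ; Soft clause: consecutive loads through the same base pointer %0.
  %1:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM %0, 0, 0, 0 :: (load 4, addrspace 4)
  %2:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM %0, 8, 0, 0 :: (load 4, addrspace 4)
  ; Previously the loads above were bundled; now a KILL artificially
  ; extends %0's live range so the allocator cannot reuse it mid-clause.
  KILL %0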


# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx906 -mattr=+xnack -run-pass=si-form-memory-clauses -verify-machineinstrs -o - %s | FileCheck %s
# Make sure that debug instructions do not change clause formation, and
# that dbg_values interleaved with the loads are left intact rather than
# breaking the clause.
---
name: sgpr_clause_dbg_value
tracksRegLiveness: true
body: |
  bb.0:
    liveins: $sgpr4_sgpr5, $vgpr0, $vgpr1, $vgpr2, $vgpr3, $vgpr4, $vgpr5, $vgpr6, $vgpr7, $vgpr8, $vgpr9, $vgpr10, $vgpr11, $vgpr12, $vgpr13, $vgpr14, $vgpr15, $vgpr16, $vgpr17, $vgpr18, $vgpr19, $vgpr20, $vgpr21, $vgpr22, $vgpr23, $vgpr24
    ; CHECK-LABEL: name: sgpr_clause_dbg_value
    ; CHECK: liveins: $sgpr4_sgpr5, $vgpr0, $vgpr1, $vgpr2, $vgpr3, $vgpr4, $vgpr5, $vgpr6, $vgpr7, $vgpr8, $vgpr9, $vgpr10, $vgpr11, $vgpr12, $vgpr13, $vgpr14, $vgpr15, $vgpr16, $vgpr17, $vgpr18, $vgpr19, $vgpr20, $vgpr21, $vgpr22, $vgpr23, $vgpr24
    ; CHECK: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr4_sgpr5
    ; CHECK: [[S_LOAD_DWORD_IMM:%[0-9]+]]:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM [[COPY]], 0, 0, 0 :: (load 4, addrspace 4)
    ; CHECK: DBG_VALUE [[S_LOAD_DWORD_IMM]], 0, 0
    ; CHECK: [[S_LOAD_DWORD_IMM1:%[0-9]+]]:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM [[COPY]], 8, 0, 0 :: (load 4, addrspace 4)
    ; CHECK: DBG_VALUE [[S_LOAD_DWORD_IMM1]], 0, 0
    ; CHECK: S_NOP 0
    ; CHECK: S_NOP 0
    ; CHECK: S_NOP 0
    ; CHECK: [[S_LOAD_DWORD_IMM2:%[0-9]+]]:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM [[COPY]], 16, 0, 0 :: (load 4, addrspace 4)
    ; CHECK: [[S_LOAD_DWORD_IMM3:%[0-9]+]]:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM [[COPY]], 32, 0, 0 :: (load 4, addrspace 4)
    ; CHECK: DBG_VALUE [[S_LOAD_DWORD_IMM2]], 0, 0
    ; CHECK: DBG_VALUE [[S_LOAD_DWORD_IMM3]], 0, 0
    ; CHECK: [[S_LOAD_DWORD_IMM4:%[0-9]+]]:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM [[COPY]], 64, 0, 0 :: (load 4, addrspace 4)
    ; CHECK: KILL [[COPY]]
    ; CHECK: S_ENDPGM 0, implicit [[S_LOAD_DWORD_IMM]], implicit [[S_LOAD_DWORD_IMM1]], implicit [[S_LOAD_DWORD_IMM2]], implicit [[S_LOAD_DWORD_IMM3]], implicit [[S_LOAD_DWORD_IMM4]]
    %0:sreg_64 = COPY $sgpr4_sgpr5
    %1:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM %0, 0, 0, 0 :: (load 4, align 4, addrspace 4)
    DBG_VALUE %1, 0, 0
    %2:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM %0, 8, 0, 0 :: (load 4, align 4, addrspace 4)
    DBG_VALUE %2, 0, 0
    S_NOP 0
    S_NOP 0
    S_NOP 0
    %3:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM %0, 16, 0, 0 :: (load 4, align 4, addrspace 4)
    %4:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM %0, 32, 0, 0 :: (load 4, align 4, addrspace 4)
    DBG_VALUE %3, 0, 0
    DBG_VALUE %4, 0, 0
    %5:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM %0, 64, 0, 0 :: (load 4, align 4, addrspace 4)
    S_ENDPGM 0, implicit %1, implicit %2, implicit %3, implicit %4, implicit %5
...