mirror of
https://github.com/RPCS3/llvm-mirror.git
synced 2024-11-26 04:32:44 +01:00
20d76c425d
* Context *

During register coalescing, we use rematerialization when coalescing is not possible. That means we may rematerialize a super register when only a smaller register is actually used.

E.g.,
    0B v1 = ldimm 0xFF
    1B v2 = COPY v1.low8bits
    2B    = v2
=>
    0B v1 = ldimm 0xFF
    1B v2 = ldimm 0xFF
    2B    = v2.low8bits

Where xB are the slot indexes.
Here v2 grew from an 8-bit register to a 16-bit register.

When that happens and subregister liveness is enabled, we create subranges for the newly created value.

E.g., before remat, the live range of v2 looked like:
    main range: [1r, 2r)
(Read: v2 is defined at the register slot of index 1 and used before the register slot of index 2.)

After remat, it should look like:
    main range:  [1r, 2r)
    low 8 bits:  [1r, 2r)
    high 8 bits: [1r, 1d) <-- dead def

I.e., the unused lanes of v2 should be marked as dead definitions.

* The Problem *

Prior to this patch, the live ranges from the previous example would have the full live range for all subranges:
    main range:  [1r, 2r)
    low 8 bits:  [1r, 2r)
    high 8 bits: [1r, 2r) <-- too long

* The Fix *

Technically, the code that this patch changes is not wrong: when we create the subranges for the newly rematerialized value, we create only one subrange for the whole bit mask. In other words, at this point v2's live range looks like this:
    main range: [1r, 2r)
    low & high: [1r, 2r)

Then, it goes wrong when we call LiveInterval::refineSubRanges on the low 8 bits:
    main range:  [1r, 2r)
    low 8 bits:  [1r, 2r)
    high 8 bits: [1r, 2r) <-- too long

Ideally, we would like LiveInterval::refineSubRanges to be able to do the right thing and mark the dead lanes as such. However, this is not possible: by the time we update / refine the live ranges, the IR hasn't been updated yet, so we don't have enough information to do the right thing.

Another option to fix the problem would have been to call LiveIntervals::shrinkToUses after the IR is updated. This is not desirable as it may have a noticeable impact on compile time.

Instead, when this patch creates the subranges for the rematerialized value, it explicitly creates one subrange for the lanes that were used before rematerialization and one for the lanes that were not used. The used one inherits the live range of the main range and the unused one is created empty. The existing rematerialization code then detects that the unused ones are not live and correctly sets dead def intervals for them.

https://llvm.org/PR41372
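The difference between the two strategies described above can be sketched with a toy model. This is an illustration in Python, not LLVM code: `SubRange`, `refine_whole_mask`, `create_used_and_unused`, `add_dead_defs`, and the lane-mask constants are hypothetical names invented here, and slot indexes are modeled as plain strings such as `"1r"` and `"1d"`.

```python
from dataclasses import dataclass, field

# Hypothetical lane masks for the 16-bit register v2 from the example.
LOW8 = 0b01
HIGH8 = 0b10
ALL_LANES = LOW8 | HIGH8


@dataclass
class SubRange:
    """Toy stand-in for a live-interval subrange: a lane mask plus segments."""
    lane_mask: int
    segments: list = field(default_factory=list)  # list of (start, end) slots


def refine_whole_mask(main_segments, used_mask):
    """Model of the pre-patch behaviour: one subrange covers the whole mask
    and is then split, so both halves inherit the full segment list."""
    whole = SubRange(ALL_LANES, list(main_segments))
    used = SubRange(whole.lane_mask & used_mask, list(whole.segments))
    rest = SubRange(whole.lane_mask & ~used_mask, list(whole.segments))
    return [used, rest]


def create_used_and_unused(main_segments, used_mask):
    """Model of the patched behaviour: the used lanes inherit the main range,
    the unused lanes start with an empty subrange."""
    used = SubRange(ALL_LANES & used_mask, list(main_segments))
    unused = SubRange(ALL_LANES & ~used_mask, [])
    return [used, unused]


def add_dead_defs(subranges, def_slot):
    """Stand-in for the existing remat code: any subrange that is not live
    gets a dead def [Nr, Nd) at the definition slot."""
    for sr in subranges:
        if not sr.segments:
            sr.segments.append((f"{def_slot}r", f"{def_slot}d"))
    return subranges


if __name__ == "__main__":
    main_range = [("1r", "2r")]
    for sr in add_dead_defs(create_used_and_unused(main_range, LOW8), 1):
        print(f"lanes {sr.lane_mask:#04b}: {sr.segments}")
```

In this model the pre-patch path leaves the high lanes with the full `[1r, 2r)` segment, so the dead-def step never fires for them, while the patched path leaves them empty and they end up with `[1r, 1d)`.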
47 lines
2.1 KiB
YAML
# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
# RUN: llc -mcpu=z13 -O3 -misched=ilpmin -systemz-subreg-liveness -verify-machineinstrs -start-before simple-register-coalescing %s -mtriple s390x-ibm-linux -stop-after machine-scheduler -o - | FileCheck %s

# Check that when the register coalescer rematerializes a register to set
# only a sub register, it sets the subranges of the unused lanes as being dead
# at the definition point.
#
# The test exercises this in two steps:
# - First, we need the register coalescer to rematerialize something.
#   In this test, %0 is rematerializable and will be rematerialized in
#   %1 since %1 and %0 cannot be directly coalesced (they interfere).
# - Second, we indirectly check that the subranges are valid for %1
#   when, in the machine scheduler, we move the instructions that define %1
#   closer to the return instruction (i.e., we move MSFI and the rematerialized
#   definition of %0 (i.e., %1 = LGHI 25) down). When doing that displacement,
#   the scheduler updates the live-ranges of %1. When the subrange for the
#   unused lane (here the subrange for %1.subreg_h32) was not correct, the
#   scheduler would hit an assertion or access some invalid memory location,
#   making the compiler crash.
#
# Bottom line, this test checks what was intended if, at the end, both %0 and %1
# are defined with `LGHI 25` and the instructions defining %1 are right before
# the return instruction.
#
# PR41372
---
name: main
tracksRegLiveness: true
body: |
  bb.0:

    ; CHECK-LABEL
    ; CHECK-LABEL: name: main
    ; CHECK: [[LGHI:%[0-9]+]]:gr64bit = LGHI 25
    ; CHECK: CHIMux [[LGHI]].subreg_l32, 0, implicit-def $cc
    ; CHECK: [[LGHI1:%[0-9]+]]:gr64bit = LGHI 25
    ; CHECK: undef [[LGHI1]].subreg_l32:gr64bit = MSFI [[LGHI1]].subreg_l32, -117440512
    ; CHECK: Return implicit [[LGHI1]].subreg_l32
    %0:gr64bit = LGHI 25
    %1:gr32bit = COPY %0.subreg_l32
    %1:gr32bit = MSFI %1, -117440512
    %2:grx32bit = COPY %0.subreg_l32
    CHIMux killed %2, 0, implicit-def $cc
    %3:gr32bit = COPY killed %1
    Return implicit %3
...