This patch implements DBG_VALUE_LIST handling to the LiveDebugValues pass. This
is a substantial change, and makes a few fundamental changes to the existing
logic.
We still use the basic model of a VarLocMap that is indexed by a LocIndex, with
a VarLocSet (a CoalescingBitVector underneath) giving us efficient lookups of
existing variable locations for a given location type. The main change is that
the VarLocMap may contain a given VarLoc multiple times (once for each unique
location operand), so that a VarLoc can be looked up from any of the registers
that it uses. This means that each VarLoc has multiple corresponding LocIndexes;
to allow us to iterate through the set of VarLocs (previously we would iterate
through the VarLocSet), we now also maintain a single entry in the VarLocMap
that contains every VarLoc exactly once.
The VarLoc class itself is also changed; this change is much simpler,
refactoring out location-specific members into a MachineLocation class and
adding a vector of these locations.
Differential Revision: https://reviews.llvm.org/D83890
Separate the LoZ ELF calling convention in tablegen.
This will make it easier to add the z/OS ABI in future patches.
Reviewed By: uweigand
Differential Revision: https://reviews.llvm.org/D96867
This patch enables AsmPrinter support for complex expression with
entry values. It shouldn't AsmPrinter's call whether these are safe or
not but the pass who introduces the DW_OP_LLVM_entry_value. This patch
on its own has no effect on clang.
Differential Revision: https://reviews.llvm.org/D96559
This moves SinkIntoLoop from MachineLICM to MachineSink. The motivation for
this work is that hoisting is a canonicalisation transformation, but we do not
really have a good story to sink instructions back if that is better, e.g. to
reduce live-ranges, register pressure and spilling. This has been discussed a
few times on the list, the latest thread is:
https://lists.llvm.org/pipermail/llvm-dev/2020-December/147184.html
There it was pointed out that we have the LoopSink IR pass, but that works on
IR, lacks register pressure informatiom, and is focused on profile guided
optimisations, and then we have MachineLICM and MachineSink that both perform
sinking. MachineLICM is more about hoisting and CSE'ing of hoisted
instructions. It also contained a very incomplete and disabled-by-default
SinkIntoLoop feature, which we now move to MachineSink.
Getting loop-sinking to do something useful is going to be at least a 3-step
approach:
1) This is just moving the code and is almost a NFC, but contains a bug fix.
This uses helper function `isLoopInvariant` that was factored out in D94082 and
added to MachineLoop.
2) A first functional change to make loop-sink a little bit less restrictive,
which it really is at the moment, is the change in D94308. This lets it do
more (alias) analysis using functions in MachineSink, making it a bit more
powerful. Nothing changes much: still off by default. But it shows that
MachineSink is a better home for this, and it starts using its functionality
like `hasStoreBetween`, and in the next step we can use `isProfitableToSinkTo`.
3) This is the going to be he interesting step: decision making when and how
many instructions to sink. This will be driven by the register pressure, and
deciding if reducing live-ranges and loop sinking will help in better
performance.
4) Once we are happy with 3), this should be enabled by default, that should be
the end goal of this exercise.
Differential Revision: https://reviews.llvm.org/D93694
In MipsDelaySlotFiller, when replacing old call-branch with
the compact branch instruction, an assertion is caused by erasing
the old call with unhandled CSInfo.
The problem was reported in PR48695.
This patch fixes it, by moving call site info from the old call
instruction to its replace.
Patch by Nikola Tesic
Differential revision: https://reviews.llvm.org/D94685
Deciding where to place debugging instructions when normal instructions
sink between blocks is difficult -- see PR44117. Dealing with this with
instruction-referencing variable locations is simple: we just tolerate
DBG_INSTR_REFs referring to values that haven't been computed yet. This
patch adds support into InstrRefBasedLDV to record when a variable value
appears in the middle of a block, and should have a DBG_VALUE added when it
appears (a debug use before def).
While described simply, this relies heavily on the value-propagation
algorithm in InstrRefBasedLDV. The implementation doesn't attempt to verify
the location of a value unless something non-trivial occurs to merge
variable values in vlocJoin. This means that a variable with a value that
has no location can retain it across all control flow (including loops).
It's only when another debug instruction specifies a different variable
value that we have to check, and find there's no location.
This property means that if a machine value is defined in a block dominated
by a DBG_INSTR_REF that refers to it, all the successor blocks can
automatically find a location for that value (if it's not clobbered). Thus
in a sense, InstrRefBasedLDV is already supporting and implementing
use-before-defs. This patch allows us to specify a variable location in the
block where it's defined.
When loading live-in variable locations, TransferTracker currently discards
those where it can't find a location for the variable value. However, we
can tell from the machine value number whether the value is defined in this
block. If it is, add it to a set of use-before-def records. Then, once the
relevant instruction has been processed, emit a DBG_VALUE immediately after
it.
Differential Revision: https://reviews.llvm.org/D85775
Handle DBG_INSTR_REF instructions in LiveDebugValues, to determine and
propagate variable locations. The logic is fairly straight forwards:
Collect a map of debug-instruction-number to the machine value numbers
generated in the first walk through the function. When building the
variable value transfer function and we see a DBG_INSTR_REF, look up the
instruction it refers to, and pick the machine value number it generates,
That's it; the rest of LiveDebugValues continues as normal.
Awkwardly, there are two kinds of instruction numbering happening here: the
offset into the block (which is how machine value numbers are determined),
and the numbers that we label instructions with when generating
DBG_INSTR_REFs.
I've also restructured the TransferTracker redefVar code a little, to
separate some DBG_VALUE specific operations into its own method. The
changes around redefVar should be largely NFC, while allowing
DBG_INSTR_REFs to specify a value number rather than just a location.
Differential Revision: https://reviews.llvm.org/D85771
When switching the register debug operands to $noreg in
setupDebugValueUndef() also clear the sub-register indices for virtual
registers. This is done when marking DBG_VALUEs undef in other cases,
e.g. in LiveDebugVariables. I have not found any cases where leaving the
sub-register index causes any issues, and the indices would eventually
get dropped when LiveDebugVariables reinserted the undef DBG_VALUEs
after register scheduling, but if nothing else it looked a bit weird in
printouts to have sub-register indices on $noreg, and I don't think the
sub-register index holds any meaningful information at that point.
I have not been able to find any source-level reproducer for this with
an upstream target, so I have just added an instrumented machine-sink
test.
Reviewed By: djtodoro, jmorse
Differential Revision: https://reviews.llvm.org/D89941
Both FastRegAlloc and LiveDebugVariables/greedy need to cope with
DBG_INSTR_REFs. None of them actually need to take any action, other than
passing DBG_INSTR_REFs through: variable location information doesn't refer
to any registers at this stage.
LiveDebugVariables stashes the instruction information in a tuple, then
re-creates it later. This is only necessary as the register allocator
doesn't expect to see any debug instructions while it's working. No
equivalence classes or interval splitting is required at all!
No changes are needed for the fast register allocator, as it just ignores
debug instructions. The test added checks that both of them preserve
DBG_INSTR_REFs.
This also expands ScheduleDAGInstrs.cpp to treat DBG_INSTR_REFs the same as
DBG_VALUEs when rescheduling instructions around. The current movement of
DBG_VALUEs around is less than ideal, but it's not a regression to make
DBG_INSTR_REFs subject to the same movement.
Differential Revision: https://reviews.llvm.org/D85757
The instruction referencing work currently only works on X86, and all the
tests for it will be X86 based for the time being. Configure the whole
directory to be X86-only, seeing how I keep on landing tests that don't
have the correct REQUIRES lines.
This patch touches two optimizations, TwoAddressInstruction and X86's
FixupLEAs pass, both of which optimize by re-creating instructions. For
LEAs, various bits of arithmetic are better represented as LEAs on X86,
while TwoAddressInstruction sometimes converts instrs into three address
instructions if it's profitable.
For debug instruction referencing, both of these require substitutions to
be created -- the old instruction number must be pointed to the new
instruction number, as illustrated in the added test. If this isn't done,
any variable locations based on the optimized instruction are
conservatively dropped.
Differential Revision: https://reviews.llvm.org/D85756
Add a table recording "substitutions" between pairs of <instruction,
operand> numbers, from old pairs to new pairs. Post-isel optimizations are
able to record the outcome of an optimization in this way. For example, if
there were a divide instruction that generated the quotient and remainder,
and it were replaced by one that only generated the quotient:
$rax, $rcx = DIV-AND-REMAINDER $rdx, $rsi, debug-instr-num 1
DBG_INSTR_REF 1, 0
DBG_INSTR_REF 1, 1
Became:
$rax = DIV $rdx, $rsi, debug-instr-num 2
DBG_INSTR_REF 1, 0
DBG_INSTR_REF 1, 1
We could enter a substitution from <1, 0> to <2, 0>, and no substitution
for <1, 1> as it's no longer generated.
This approach means that if an instruction or value is deleted once we've
left SSA form, all variables that used the value implicitly become
"optimized out", something that isn't true of the current DBG_VALUE
approach.
Differential Revision: https://reviews.llvm.org/D85749
This patch defines the MIR format for debug instruction references: it's an
integer trailing an instruction, marked out by "debug-instr-number", much
like how "debug-location" identifies the DebugLoc metadata of an
instruction. The instruction number is stored directly in a MachineInstr.
Actually referring to an instruction comes in a later patch, but is done
using one of these instruction numbers.
I've added a round-trip test and two verifier checks: that we don't label
meta-instructions as generating values, and that there are no duplicates.
Differential Revision: https://reviews.llvm.org/D85746
This was landed but reverted in 5b9c2b1bea7 due to asan picking up a memory
leak. This is fixed in the change to InstrRefBasedImpl.cpp. Original
commit message follows:
[LiveDebugValues][NFC] Add instr-ref tests, adapt old tests
This patch adds a few tests in DebugInfo/MIR/InstrRef/ of interesting
behaviour that the instruction referencing implementation of
LiveDebugValues has. Mostly, these tests exist to ensure that if you
give the "-experimental-debug-variable-locations" command line switch,
the right implementation runs; and to ensure it behaves the same way as
the VarLoc LiveDebugValues implementation.
I've also touched roughly 30 other tests, purely to make the tests less
rigid about what output to accept. DBG_VALUE instructions are usually
printed with a trailing !debug-location indicating its scope:
!debug-location !1234
However InstrRefBasedLDV produces new DebugLoc instances on the fly,
meaning there sometimes isn't a numbered node when they're printed,
making the output:
!debug-location !DILocation(line: 0, blah blah)
Which causes a ton of these tests to fail. This patch removes checks for
that final part of each DBG_VALUE instruction. None of them appear to
be actually checking the scope is correct, just that it's present, so
I don't believe there's any loss in coverage here.
Differential Revision: https://reviews.llvm.org/D83054
This reverts commit b9d977b0ca60c54f11615ca9d144c9f08b29fd85.
This cutoff is no longer required. The commit 34ffa7fc501 (D86153) introduces a
performance improvement which was tested against the motivating case for this
patch.
Discussed in differential revision: https://reviews.llvm.org/D86153
With this patch we're now accounting for two more cases which should be
considered 'valid throughout': First, where RangeEnd is ScopeEnd. Second, where
RangeEnd comes before ScopeEnd when including meta instructions, but are both
preceded by the same non-meta instruction.
CTMark shows a geomean binary size reduction of 1.5% for RelWithDebInfo builds.
`llvm-locstats` (using D85636) shows a very small variable location coverage
change in 2 of 10 binaries, but it is in the order of 10s of bytes which lines
up with my expectations.
I've added a test which checks both of these new cases. The first check in the
test isn't strictly necessary for this patch. But I'm not sure that it is
explicitly tested anywhere else, and is useful for the final patch in the
series.
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D86151
Fix the ARM backend's analyzeBranch so it doesn't ignore predicated
return instructions, and make the MachineVerifier rule more strict.
Differential Revision: https://reviews.llvm.org/D40061
This patch adds a few tests in DebugInfo/MIR/InstrRef/ of interesting
behaviour that the instruction referencing implementation of
LiveDebugValues has. Mostly, these tests exist to ensure that if you
give the "-experimental-debug-variable-locations" command line switch,
the right implementation runs; and to ensure it behaves the same way as
the VarLoc LiveDebugValues implementation.
I've also touched roughly 30 other tests, purely to make the tests less
rigid about what output to accept. DBG_VALUE instructions are usually
printed with a trailing !debug-location indicating its scope:
!debug-location !1234
However InstrRefBasedLDV produces new DebugLoc instances on the fly,
meaning there sometimes isn't a numbered node when they're printed,
making the output:
!debug-location !DILocation(line: 0, blah blah)
Which causes a ton of these tests to fail. This patch removes checks for
that final part of each DBG_VALUE instruction. None of them appear to
be actually checking the scope is correct, just that it's present, so
I don't believe there's any loss in coverage here.
Differential Revision: https://reviews.llvm.org/D83054
Emit DWARF 5 call-site symbols even though DWARF 4 is set,
only in the case of LLDB tuning.
This patch addresses PR46643.
Differential Revision: https://reviews.llvm.org/D83463
SONY debugger does not prefer debug entry values feature, so
the plan is to avoid production of the entry values
by default when the tuning is SCE debugger.
The feature still can be enabled with the -debug-entry-values
option for the testing/development purposes.
This patch addresses PR46643.
Differential Revision: https://reviews.llvm.org/D83462
Occasionally we see absolutely massive basic blocks, typically in global
constructors that are vulnerable to heavy inlining. When these blocks are
dense with DBG_VALUE instructions, we can hit near quadratic complexity in
DwarfDebug's validThroughout function. The problem is caused by:
* validThroughout having to step through all instructions in the block to
examine their lexical scope,
* and a high proportion of instructions in that block being DBG_VALUEs
for a unique variable fragment,
Leading to us stepping through every instruction in the block, for (nearly)
each instruction in the block.
By adding this guard, we force variables in large blocks to use a location
list rather than a single-location expression, as shown in the added test.
This shouldn't change the meaning of the output DWARF at all: instead we
use a less efficient DWARF encoding to avoid a poor-performance code path.
Differential Revision: https://reviews.llvm.org/D83236
When describing parameter value loaded by a COPY instruction, consider
case where needed Reg value is a sub- or super- register of the COPY
instruction's destination register. Without this patch, compile process
will crash with the assertion "TargetInstrInfo::describeLoadedValue
can't describe super- or sub-regs for copy instructions".
Patch by Nikola Tesic
Differential revision: https://reviews.llvm.org/D82000
Describe parameter's value loaded by MIPS ADDiu instruction.
When parameter's value is loaded into a register by mips ADDiu/DADDiu
instruction, it could be described correctly and emitted as
DW_AT_GNU_call_site_value.
Patch by Nikola Tesic
Differential revision: https://reviews.llvm.org/D78108
This adds call site info support for call instructions with delay slot.
Search for instructions inside call delay slot, which load value
into parameter forwarding registers.
Return address of the call points to instruction after call delay slot,
which is not the one, immediately after the call instruction.
Patch by Nikola Tesic
Differential revision: https://reviews.llvm.org/D78107
Summary:
Instead of iterating over all VarLoc IDs in removeEntryValue(), just
iterate over the interval reserved for entry value VarLocs. This changes
the iteration order, hence the test update -- otherwise this is NFC.
This appears to give an ~8.5x wall time speed-up for LiveDebugValues when
compiling sqlite3.c 3.30.1 with a Release clang (on my machine):
```
---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name ---
Before: 2.5402 ( 18.8%) 0.0050 ( 0.4%) 2.5452 ( 17.3%) 2.5452 ( 17.3%) Live DEBUG_VALUE analysis
After: 0.2364 ( 2.1%) 0.0034 ( 0.3%) 0.2399 ( 2.0%) 0.2398 ( 2.0%) Live DEBUG_VALUE analysis
```
The change in removeEntryValue() is the only one that appears to affect
wall time, but for consistency (and to resolve a pending TODO), I made
the analogous changes for iterating over SpillLocKind VarLocs.
Reviewers: nikic, aprantl, jmorse, djtodoro
Subscribers: hiraditya, dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80684
Summary:
We received a report of LiveDebugValues consuming 25GB+ of RAM when
compiling code generated by Unity's IL2CPP scripting backend.
There's an initial 5GB spike due to repeatedly copying cached lists of
MachineBasicBlocks within the UserValueScopes members of VarLocs.
But the larger scaling issue arises due to the fact that prior to range
extension, there are 81K basic blocks and 156K DBG_VALUEs: given enough
memory, LiveDebugValues would insert 101 million MIs (I counted this by
incrementing a counter inside of VarLoc::BuildDbgValue).
It seems like LiveDebugValues would have to be rearchitected to support
this kind of input (we'd need some new represntation for DBG_VALUEs that
get inserted into ~every block via flushPendingLocs). OTOH, large globs
of auto-generated code are typically not debugged interactively.
So: add cutoffs to disable range extension when the input is too big. I
chose the cutoffs experimentally, erring on the conservative side. When
compiling a large collection of Apple software, range extension never
got disabled.
rdar://63418929
Reviewers: aprantl, friss, jmorse, Orlando
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80662
Summary:
A struct argument can be passed-by-value to a callee via a pointer to a
temporary stack copy. Add support for emitting an entry value DBG_VALUE
when an indirect parameter DBG_VALUE becomes unavailable. This is done
by omitting DW_OP_stack_value from the entry value expression, to make
the expression describe the location of an object.
rdar://63373691
Reviewers: djtodoro, aprantl, dstenb
Subscribers: hiraditya, lldb-commits, llvm-commits
Tags: #lldb, #llvm
Differential Revision: https://reviews.llvm.org/D80345
In D68209, LiveDebugValues::transferDebugValue had a call to
OpenRanges.erase shifted, and by accident this led to a code path where
DBG_VALUEs of $noreg would not have their open range terminated, allowing
variable locations to extend past blocks where they were terminated.
This patch correctly terminates the open range, if present, when such a
DBG_VAUE is encountered, and adds a test for this behaviour.
Differential Revision: https://reviews.llvm.org/D78218
Move ARM ConstantIsland and LowOverheadLopps passes later in the pipeline
such that they will be run after the upcoming Machine Outlining pass.
Differential Revision: https://reviews.llvm.org/D76065
Record the address of a tail-calling branch instruction within its call
site entry using DW_AT_call_pc. This allows a debugger to determine the
address to use when creating aritificial frames.
This creates an extra attribute + relocation at tail call sites, which
constitute 3-5% of all call sites in xnu/clang respectively.
rdar://60307600
Differential Revision: https://reviews.llvm.org/D76336
When compiling
```
struct S {
float w;
};
void f(long w, long b);
void g(struct S s) {
int w = s.w;
f(w, w*4);
}
```
I get Assertion failed: ((!CombinedExpr || CombinedExpr->isValid()) && "Combined debug expression is invalid").
That's because we combine two epxressions that both end in DW_OP_stack_value:
```
(lldb) p Expr->dump()
!DIExpression(DW_OP_LLVM_convert, 32, DW_ATE_signed, DW_OP_LLVM_convert, 64, DW_ATE_signed, DW_OP_stack_value)
(lldb) p Param.Expr->dump()
!DIExpression(DW_OP_constu, 4, DW_OP_mul, DW_OP_LLVM_convert, 32, DW_ATE_signed, DW_OP_LLVM_convert, 64, DW_ATE_signed, DW_OP_stack_value)
(lldb) p CombinedExpr->isValid()
(bool) $0 = false
(lldb) p CombinedExpr->dump()
!DIExpression(4097, 32, 5, 4097, 64, 5, 16, 4, 30, 4097, 32, 5, 4097, 64, 5, 159, 159)
```
I believe that in this particular case combining two stack values is
safe, but I didn't want to sink the special handling into
DIExpression::append() because I do want everyone to think about what
they are doing.
Patch by Adrian Prantl.
Fixes PR45181.
rdar://problem/60383095
Differential Revision: https://reviews.llvm.org/D76164
This fixes a miscompile that happened because a DBG_VALUE interfered
with the MachineOutliner's liveness analysis.
Inserting a DBG_VALUE after a terminator breaks predicates on MBB such
as isReturnBlock(). And the resulting DBG_VALUE cannot be "live".
I plan to introduce a MachineVerifier check for this situation in a
follow up.
rdar://59859175
Testing: check-llvm, LNT build with a stage2 compiler & entry values
enabled
Differential Revision: https://reviews.llvm.org/D75548
As a narrow stopgap for the assertion failure described in PR45025, add
a describeLoadedValue override to ARMBaseInstrInfo and use it to detect
copies in which the forwarding reg is a super/sub reg of the copy
destination. For the moment this is unsupported.
Several follow ups are possible:
1) Handle VORRq. At the moment, we do not, because isCopyInstrImpl
returns early when !MI.isMoveReg().
2) In the case where forwarding reg is a super-reg of the copy
destination, we should be able to describe the forwarding reg as a
subreg within the copy destination. I'm not 100% sure about this, but
it looks like that's what's done in AArch64InstrInfo.
3) In the case where the forwarding reg is a sub-reg of the copy
destination, maybe we could describe the forwarding reg using the
copy destinaion and a DW_OP_LLVM_fragment (I guess this should be
possible after D75036).
https://bugs.llvm.org/show_bug.cgi?id=45025
rdar://59772698
Differential Revision: https://reviews.llvm.org/D75273