DWARF v5 introduces DW_AT_call_all_calls, a subprogram attribute which
indicates that all calls (both regular and tail) within the subprogram
have call site entries. The information within these call site entries
can be used by a debugger to populate backtraces with synthetic tail
call frames.
Tail calling frames go missing in backtraces because the frame of the
caller is reused by the callee. Call site entries allow a debugger to
reconstruct a sequence of (tail) calls which led from one function to
another. This improves backtrace quality. There are limitations: tail
recursion isn't handled, variables within synthetic frames may not
survive to be inspected, etc. This approach is not novel, see:
https://gcc.gnu.org/wiki/summit2010?action=AttachFile&do=get&target=jelinek.pdf
This patch adds an IR-level flag (DIFlagAllCallsDescribed) which lowers
to DW_AT_call_all_calls. It adds the minimal amount of DWARF generation
support needed to emit standards-compliant call site entries. For easier
deployment, when the debugger tuning is LLDB, the DWARF requirement is
adjusted to v4.
Testing: Apart from check-{llvm, clang}, I built a stage2 RelWithDebInfo
clang binary. Its dSYM passed verification and grew by 1.4% compared to
the baseline. 151,879 call site entries were added.
rdar://42001377
Differential Revision: https://reviews.llvm.org/D49887
llvm-svn: 343883
Context: Compiler generated instructions do not have a debug location
assigned to them. However emitting 0-line records for all of them bloats
the line tables for very little benefit so we usually avoid doing that.
Not emitting anything will lead to the previous debug location getting
applied to the locationless instructions. This is not desirable for
block begin and after labels. Previously we would emit simply emit
line-0 records in this case, this patch changes the behavior to do a
forward search for a debug location in these cases before emitting a
line-0 record to further reduce line table bloat.
Inspired by the discussion in https://reviews.llvm.org/D52862
llvm-svn: 343874
The comments in this code say we were trying to avoid 16-bit immediates, but if the immediate fits in 8-bits this isn't an issue. This avoids creating a zero extend that probably won't go away.
The movmskb related changes are interesting. The movmskb instruction writes a 32-bit result, but fills the upper bits with 0. So the zero_extend we were previously emitting was free, but we turned a -1 immediate that would fit in 8-bits into a 32-bit immediate so it was still bad.
llvm-svn: 343871
Currently we hardcode instructions with ReadAfterLd if the register operands don't need to be available until the folded load has completed. This doesn't take into account the different load latencies of different memory operands (PR36957).
This patch adds a ReadAfterFold def into X86FoldableSchedWrite to replace ReadAfterLd, allowing us to specify the load latency at a scheduler class level.
I've added ReadAfterVec*Ld classes that match the XMM/Scl, XMM and YMM/ZMM WriteVecLoad classes that we currently use, we can tweak these values in future patches once this infrastructure is in place.
Differential Revision: https://reviews.llvm.org/D52886
llvm-svn: 343868
And use that to transform fsub with zero constant operands.
The integer part isn't used yet, but it is proposed for use in
D44548, so adding both enhancements here makes that
patch simpler.
llvm-svn: 343865
Decode subvector shuffles from INSERT_SUBVECTOR(SRC0, SHUFFLE(EXTRACT_SUBVECTOR(SRC1))
This was found necessary while investigating PR39161
llvm-svn: 343853
Call getOperandInfo() instead of using (near) duplicated code in
LoopVectorizationCostModel::getInstructionCost().
This gets the OperandValueKind and OperandValueProperties values for a Value
passed as operand to an arithmetic instruction.
getOperandInfo() used to be a static method in TargetTransformInfo.cpp, but
is now instead a public member.
Review: Florian Hahn
https://reviews.llvm.org/D52883
llvm-svn: 343852
Finally all targets are enabling multiple regalloc hints, so the hook to
disable this can now be removed.
NFC.
Review: Simon Pilgrim
https://reviews.llvm.org/D52316
llvm-svn: 343851
Some projects rely on using libraries from the Windows SDK with their
original casing, just with a lowercase extension. E.g. the WinSock2 lib
is named WS2_32.Lib in the Windows SDK, and we would previously only
create a ws2_32.lib symlink for it (i.e. all lowercase). Also create a
WS2_32.lib symlink (i.e. original casing with lowercase extension) to
cover users of this casing. As a drive-by fix, only create these
symlinks when they differ from the original name to reduce the amount of
noise in the library symlinks directory.
llvm-svn: 343832
Summary:
These are emitted by the wasm backend for e.g.
__stack_pointer@GLOBAL which previously wasn't accepted by the
assembler.
Reviewers: aheejin, dschuff
Subscribers: sbc100, jgravelle-google, llvm-commits, sunfish
Differential Revision: https://reviews.llvm.org/D52911
llvm-svn: 343830
Summary:
At some point in the past the recursion in DominatesMergePoint used to pass null for AggressiveInsts as part of the recursion. It no longer does this. So there is no way for AggressiveInsts to be null.
This passes it by reference and removes the null check to make this explicit.
Reviewers: efriedma, reames
Reviewed By: efriedma
Subscribers: xbolva00, llvm-commits
Differential Revision: https://reviews.llvm.org/D52575
llvm-svn: 343828
Summary:
Fixes https://bugs.llvm.org/show_bug.cgi?id=39158 and regression caused by
D49034. Though it is possible the problem was existed before and was exposed by
additional DBG_VALUEs.
Reviewers: sunfish, dschuff, aheejin
Reviewed By: aheejin
Subscribers: sbc100, aheejin, llvm-commits, alexcrichton, jgravelle-google
Differential Revision: https://reviews.llvm.org/D52837
llvm-svn: 343827
The simplest instance of this is an intrinsic with no results which will have the
intrinsic ID as operand 0.
Also fix some benign incorrectness when op0 is a reg but isn't a def that was
guarded against by checking for the extension opcodes.
llvm-svn: 343821
We established the (unfortunately complicated) rules for UB/poison
propagation with vector ops in:
D48893
D48987
D49047
It's clear from the affected tests that we are potentially creating
poison where none existed before the transforms. For add/sub/mul,
the answer is simple: just drop the flags because the extra undef
vector lanes are generally more valuable for analysis and codegen.
llvm-svn: 343819
Previously we replaced the chain use ourself and return the data result. LegalizeVectorOps then detected that we'd done this and assumed the chain had already been handled.
This commit instead returns a MERGE_VALUES node with two results joined from nodes. This allows LegalizeVectorOps to do all the replacements for us without any special casing. The MERGE_VALUES will be removed by DAG combine.
llvm-svn: 343817
Summary:
The llvm::SimplifyCFG function creates a SimplifyCFGOpt object and calls run on it. There were numerous places reached from this run function that called back out llvm::SimplifyCFG which would create another SimplifyCFGOpt object. This is an inefficient use of stack space at minimum. We are also not passing along the LoopHeaders pointer passed into the outer llvm::SimplifyCFG call. So if its not null we lose it on the first recursion and get nullptr from there on.
This patch adds an outer loop around the main BasicBlock simplifying code and adds a flag to the SimplifyCFGOpt class that can be set by to request another iteration. I don't think we can iterate based just on the change flag alone since some of the simplifications delete a basic block entirely leaving nothing to iterate on.
Reviewers: bogner, eli.friedman, reames
Reviewed By: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D52760
llvm-svn: 343816
The isAmdCodeObjectV2 is a misleading name which actually checks whether the os
is amdhsa or mesa.
Also add a test to make sure we do not generate old kernel header for code
object v3.
Differential Revision: https://reviews.llvm.org/D52897
llvm-svn: 343813
This can happen if assembling a reference to _GLOBAL_OFFSET_TABLE_.
While it doesn't make sense to try to assemble that for COFF,
the fact that we previously used llvm_unreachable meant that the code
had undefined behaviour if something tried to assemble that.
The configure script of libgmp would try to assemble such a snippet
(which should signal a failure). If llvm is built without assertions,
the undefined behaviour meant a (near) infinite loop.
Differential Revision: https://reviews.llvm.org/D52903
llvm-svn: 343811
This change ensures that the (membername,timestamp) tuple uniquely
identifies an entry in an archive for format=darwin, in deterministic
mode (which is the default).
That, then, enables lldb and dsymutil to locate the appropriate object
within the archive.
Differential Revision: https://reviews.llvm.org/D47659
llvm-svn: 343805
This brings the extending loads patch back to the original intent but minus the
PHI bug and with another small improvement to de-dupe truncates that are
inserted into the same block.
The truncates are sunk to their uses unless this would require inserting before a
phi in which case it sinks to the _beginning_ of the predecessor block for that
path (but no earlier than the def).
The reason for choosing the beginning of the predecessor is that it makes de-duping
multiple truncates in the same block simple, and optimized code is going to run a
scheduler at some point which will likely change the position anyway.
llvm-svn: 343804
- Fix spill/reloads of XSeqPairs failing with vregs (only physregs
worked correctly)
- Add missing spill/reload code for WSeqPairs class
Differential Revision: https://reviews.llvm.org/D52761
llvm-svn: 343799
This is a follow-up to rL343482 / D52439.
This was a pattern that initially caused the commit to be reverted because
the transform requires a bitcast as shown here.
llvm-svn: 343794
Flag 'AllowZeroMoveEliminationOnly' should have been a property of the PRF, and
not set at register granularity.
This change also restricts move elimination to writes that update a full
physical register. We assume that there is a strong correlation between
logical registers that allow move elimination, and how those same registers are
allocated to physical registers by the register renamer.
This is still a no functional change, because this experimental code path is
disabled for now. This is done in preparation for another patch that will add
the ability to describe how move elimination works in scheduling models.
llvm-svn: 343787
lowerGlobalAddress, lowerBlockAddress, and insertIndirectBranch contain
overzealous checks for is64Bit. These functions are all safe as-implemented
for RV64.
llvm-svn: 343781
Replacing Timer* with unique_ptr<Timer> in a pass-to-timer map.
That allows to get rid of unpretty raw deletes in PassTimingInfo destructor.
Strictly cleanup, not intended to change any visible behavior.
llvm-svn: 343772