If a derived pointer is relocated through a spill, its value is, in the general case, not exported.
When the gc.relocate is in a different block than its statepoint, we cannot
get an SDValue for the derived value, but in the spill case none is required at all.
However, the gc.relocate lowering unconditionally requested the SDValue,
triggering the assert.
This change fixes that by handling the spill case before the SDValue is actually required.
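Sketched out, the reordering looks like this (visitGCRelocate, getValue, and setValue are real SelectionDAGBuilder entry points, but the spill-record helpers here are hypothetical):
```cpp
// Illustrative sketch: resolve the spill case before ever asking for an
// SDValue, because getValue() on the derived pointer asserts when the
// gc.relocate lives in a different block than its statepoint.
void SelectionDAGBuilder::visitGCRelocate(const GCRelocateInst &Relocate) {
  const Value *DerivedPtr = Relocate.getDerivedPtr();

  // Hypothetical helper names for the statepoint-lowering bookkeeping.
  if (const auto *Record = lookupSpillRecord(Relocate, DerivedPtr)) {
    // Reload from the stack slot; no exported SDValue is required here.
    setValue(&Relocate, reloadFromSpillSlot(*Record));
    return;
  }

  // Only the non-spill paths actually need the SDValue.
  SDValue SD = getValue(DerivedPtr);
  setValue(&Relocate, SD);
}
```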
Reviewers: reames, dantrushin
Reviewed By: dantrushin
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D98324
This patch adds support for DBG_VALUE_LIST in the LiveDebugVariables pass. The
changes are mostly in computeIntervals, extendDef, and addDefsFromCopies: when
extending the def of a DBG_VALUE_LIST, the live ranges of every used register
must be considered, and when such a def is killed by more than one of its used
registers being killed at the same time, it is necessary to find valid copies of
all of those registers with which to create a new def.
The DebugVariableValue class has also been changed to reference multiple
location numbers instead of just one. This is accomplished with a
C-style array behind a unique_ptr and the array length packed into 6 bits, to
minimize the size of the class (which must be kept small for use with
IntervalMap). This may not be the most efficient solution possible and should
be revisited if performance issues arise.
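As a rough illustration of that packing (a standalone sketch; the member names are illustrative, not the actual class):
```cpp
#include <algorithm>
#include <cstdint>
#include <memory>

// Sketch: a unique_ptr to a C-style array of location numbers, with the
// length squeezed into 6 bits so the class stays small enough for
// IntervalMap's inline storage.
class DebugVariableValueSketch {
  std::unique_ptr<unsigned[]> LocNos; // owned C-style array
  uint8_t NumLocNos : 6;              // up to 63 location operands
  uint8_t WasIndirect : 1;            // other state packed alongside
  uint8_t WasList : 1;

public:
  DebugVariableValueSketch(const unsigned *Locs, unsigned N)
      : LocNos(new unsigned[N]), NumLocNos(N), WasIndirect(0), WasList(1) {
    std::copy(Locs, Locs + N, LocNos.get());
  }
  unsigned getLocNo(unsigned I) const { return LocNos[I]; }
  unsigned locNoCount() const { return NumLocNos; }
};
```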
Differential Revision: https://reviews.llvm.org/D83895
The current logic in TargetLibraryInfoImpl::getLibFunc() was only treating
strcpy, etc., with i8* arguments in address space zero as valid library
functions. However, on the CHERI and Morello targets we expect all libc
functions to use address-space-200 arguments.
This commit updates isValidProtoForLibFunc() to check only that the argument
is a pointer type. This also drops the check for i8*, since we should not
be checking the pointee type anymore.
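The shape of the new check, as a sketch rather than the exact diff:
```cpp
// Fragment of isValidProtoForLibFunc(), e.g. for a strcpy-like prototype.
case LibFunc_strcpy:
  // Old: required the parameters to be i8* in address space 0.
  // New: any pointer type in any address space is accepted, and the
  // pointee type is no longer inspected.
  return NumParams == 2 && FTy.getParamType(0)->isPointerTy() &&
         FTy.getParamType(0) == FTy.getParamType(1);
```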
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D95142
Currently DSE misses cases where the size is a non-constant IR value, even
when the two sizes are the same value. For example, this means that
llvm.memcpy/llvm.memset calls are not eliminated, even if they write the same
number of bytes. This patch extends isOverwrite to try to get IR values for
the number of bytes written by the analyzed instructions. If the values match,
alias checks are performed and the result is returned.
At the moment this only covers llvm.memcpy/llvm.memset. In the future,
we may enable MemoryLocation to also track variable sizes, but this
simple approach should allow us to cover the important cases in DSE.
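Conceptually, the new path looks something like this (simplified; the real logic lives in isOverwrite alongside the alias checks mentioned above):
```cpp
// If both writes are memory intrinsics whose length operands are the same
// IR Value, they write the same number of bytes even though that size is
// not a compile-time constant.
const auto *KillingMemI = dyn_cast<MemIntrinsic>(KillingI);
const auto *DeadMemI = dyn_cast<MemIntrinsic>(DeadI);
if (KillingMemI && DeadMemI &&
    KillingMemI->getLength() == DeadMemI->getLength()) {
  // Fall through to the alias checks; if the start addresses also match,
  // the earlier write is a complete overwrite and can be eliminated.
}
```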
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D98284
As we're going to replace this ambiguous option with more precise
instruction-level fast-math descriptions, some tests need to be updated;
the option doesn't play any role in some of them anyway.
This makes it consistent with the `size()` method's return type and with
the STL-like container API.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D97921
This allows sending requests through the CLI and opens up more debugging
opportunities. Example:
```bash
$ grpc_cli ls localhost:50051
clang.clangd.remote.v1.SymbolIndex
grpc.reflection.v1alpha.ServerReflection
grpc.health.v1.Health
```
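On the server side, enabling this amounts to registering the reflection plugin before building the server. A minimal generic sketch with the standard gRPC C++ API (the clangd-specific wiring is elided; the port matches the example above):
```cpp
#include <grpcpp/ext/proto_server_reflection_plugin.h>
#include <grpcpp/grpcpp.h>
#include <memory>

void runServer(grpc::Service &IndexService) {
  // Registers the reflection service that grpc_cli talks to.
  grpc::reflection::InitProtoReflectionServerBuilderPlugin();

  grpc::ServerBuilder Builder;
  Builder.AddListeningPort("0.0.0.0:50051",
                           grpc::InsecureServerCredentials());
  Builder.RegisterService(&IndexService);
  std::unique_ptr<grpc::Server> Server = Builder.BuildAndStart();
  Server->Wait();
}
```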
This patch simplifies the pattern (xxswap (vec-op (xxswap a) (xxswap b)))
into (vec-op a b) if vec-op is lane-insensitive. The motivating case
is the ScalarToVector-VecOp-ExtractElement sequence on little-endian
subtargets, but the peephole itself is not related to endianness, so
big-endian may also benefit.
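The correctness argument is simply that doubleword swaps cancel around any lane-insensitive operation; a standalone illustration, modeling xxswap as a doubleword swap of a 4 x i32 vector:
```cpp
#include <array>
#include <cassert>
#include <cstdint>

using V4i32 = std::array<uint32_t, 4>;

// Model of xxswap: swap the two 64-bit halves, i.e. lanes {0,1} <-> {2,3}.
static V4i32 xxswap(V4i32 A) { return {A[2], A[3], A[0], A[1]}; }

// A lane-insensitive vec-op: the same scalar op applied in every lane.
static V4i32 vadd(V4i32 A, V4i32 B) {
  return {A[0] + B[0], A[1] + B[1], A[2] + B[2], A[3] + B[3]};
}

int main() {
  V4i32 A = {1, 2, 3, 4}, B = {10, 20, 30, 40};
  // (xxswap (vec-op (xxswap a) (xxswap b))) == (vec-op a b)
  assert(xxswap(vadd(xxswap(A), xxswap(B))) == vadd(A, B));
}
```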
Reviewed By: nemanjai
Differential Revision: https://reviews.llvm.org/D97658
Now that the -funique-internal-linkage-name flag is available, we want to flip
it on by default, since it is beneficial to have separate sample profiles
for different internal symbols with the same name. As preparation, we
want to avoid the regression caused by the flip.
When we flip -funique-internal-linkage-name on, the profile is collected
from a binary built without -funique-internal-linkage-name, so it has no uniq
suffix, but the IR in the optimized build contains the suffix. This kind of
mismatch may introduce a transient regression.
To avoid such mismatches, we introduce a NameTable section flag indicating
whether any name in the profile contains a uniq suffix. The compiler
will decide whether to keep the uniq suffix during name canonicalization
depending on the NameTable section flag. The flag is only available for the
extbinary format. For other formats, the compiler keeps the uniq
suffix by default, so they will only experience a transient regression when
-funique-internal-linkage-name is first flipped.
Another type of regression is caused by places where we fail to call
getCanonicalFnName. Those places are now fixed.
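For reference, canonicalization here essentially means stripping the uniq suffix; a minimal sketch, assuming the `.__uniq.` marker that -funique-internal-linkage-name appends:
```cpp
#include <string>

// Sketch: drop everything from the uniq marker onward, so a profile
// collected without -funique-internal-linkage-name still matches.
std::string canonicalFnName(const std::string &Name, bool KeepUniqSuffix) {
  if (KeepUniqSuffix)
    return Name; // NameTable flag says the profile already has suffixes.
  size_t Pos = Name.find(".__uniq.");
  return Pos == std::string::npos ? Name : Name.substr(0, Pos);
}
```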
Differential Revision: https://reviews.llvm.org/D96932
Fix bug in MemoryDependence [and thus GVN] for invariant group.
Previously, MemDep didn't verify that the store was actually storing to the
pointer, rather than being a store that merely uses the pointer as its value
operand.
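In IR terms, the distinction is between a store that writes through the pointer and one that merely has it as the stored value; a simplified sketch of the check:
```cpp
// Fragment of the idea: inside the walk over Ptr's users, only stores
// that write *through* Ptr may define the invariant group.
//   store i32 %v, i32* %ptr    ; writes through %ptr -> a real def
//   store i32* %ptr, i32** %q  ; merely uses %ptr    -> must be skipped
if (auto *SI = dyn_cast<StoreInst>(U)) {
  if (SI->getPointerOperand() != Ptr)
    continue; // the store only uses Ptr as its stored value
  // ...otherwise treat SI as a candidate defining access.
}
```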
Differential Revision: https://reviews.llvm.org/D98267
We encountered an issue where LTO running on IR that used the DSOLocalEquivalent
constant would result in bad codegen. The underlying issue was that ValueMapper
wasn't properly handling DSOLocalEquivalent, so this adds the machinery for
handling it. This code path is triggered by a fix to
DSOLocalEquivalent::handleOperandChangeImpl, where a DSOLocalEquivalent could
end up with a different type than its underlying GV. This patch updates
DSOLocalEquivalent::handleOperandChangeImpl to change the type when the GV's
type changes, and handles the constant in ValueMapper.
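A simplified sketch of the new ValueMapper handling (mapValue and getVM are the mapper's real helpers, but the exact placement here is illustrative):
```cpp
// Remap the underlying global value, then rebuild the
// dso_local_equivalent constant against the mapped GV.
if (const auto *Equiv = dyn_cast<DSOLocalEquivalent>(C)) {
  auto *MappedGV = cast<GlobalValue>(mapValue(Equiv->getGlobalValue()));
  return getVM()[Equiv] = DSOLocalEquivalent::get(MappedGV);
}
```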
Differential Revision: https://reviews.llvm.org/D97978
This test shows a case where we can potentially scalarize the store in a
predicated loop, creating a lot of instructions that would be much
slower than the scalar equivalent.
Turn `-Wreturn-type` into an error.
This is currently used by libcxx, libcxxabi, and libunwind, and would be a good default
for all of LLVM. I'm not aware of any cases where this shouldn't be an error. This
ensures different build configs, merges, and downstream branches catch issues sooner.
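For illustration, the class of bug this promotes to an error is a non-void function that can fall off the end:
```cpp
// With -Werror=return-type this no longer compiles, instead of being
// silently accepted and invoking undefined behavior when f(false) is
// actually called.
int f(bool b) {
  if (b)
    return 1;
} // warning (now error): control reaches end of non-void function
```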
Differential Revision: https://reviews.llvm.org/D98224
This pull request implements patterns that exploit the load-rightmost-vector-element
instructions to load element 0 into v8i16 and v16i8 vector registers for the
i16 and i8 data types on little-endian PowerPC subtargets.
Differential Revision: https://reviews.llvm.org/D94816#inline-921403
This was suggested by lebedev.ri over on D96534. You'll note the lack of tests: during review, we weren't actually able to find a case which exercises it, but both lebedev.ri and I feel it's a reasonable change, straightforward, and near free.
Differential Revision: https://reviews.llvm.org/D97064
This removes hard-coded shadow width references and adds more RUN
lines to increase test coverage under different options (fast16-labels
mode).
It also shortens the test by unifying the lines common to the combine- and no-combine-ptr-label options.
Reviewed By: stephan.yichao.zhao
Differential Revision: https://reviews.llvm.org/D98227
LSR prefers to schedule IV increments just before the latch. The recent 80511565 broadened this to moving increments in the original IR. This pointed out a robustness problem with the CGP transform.
When we have a use of an induction increment outside of the loop (we canonicalize away from this form, but it happens, e.g., in unanalyzable loops), we'd avoid performing the uadd/usub transform. Interestingly, all of these cases involve moving the increment closer to its operands, so there's no concern about dominating all uses. We can handle that case cheaply, resulting in a more robust transform.
For <2 x s32>, we can use G_DUPLANE32, but with a <4 x s32> source. To make it
work, we can just widen the original source with a concat_vectors.
Doing this allows <2 x float> indexed fmul instruction selection patterns to
fire, which gives a nice 0.3% code size saving on Bullet with -Os.
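A rough sketch of the widening, using generic MachineIRBuilder calls (simplified; register and lane-index setup elided):
```cpp
// Widen the <2 x s32> source to <4 x s32> by concatenating it with an
// undef vector, then build the G_DUPLANE32 against the widened register.
const LLT V2S32 = LLT::fixed_vector(2, 32);
const LLT V4S32 = LLT::fixed_vector(4, 32);
Register Undef = MIB.buildUndef(V2S32).getReg(0);
Register Wide = MIB.buildConcatVectors(V4S32, {SrcReg, Undef}).getReg(0);
MIB.buildInstr(AArch64::G_DUPLANE32, {DstReg}, {Wide, LaneIdx});
```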
Differential Revision: https://reviews.llvm.org/D98059
If every element is extracted from a G_BUILD_VECTOR, pass through the source
registers. This is different from the extract(build_vector) combine because this
one tolerates multiple users as long as they're exhaustive.
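A simplified sketch of the exhaustiveness check (helper names approximate the GlobalISel utilities, not the exact implementation):
```cpp
// Every use of the G_BUILD_VECTOR result must be a G_EXTRACT_VECTOR_ELT
// with a constant, in-range index, and together the uses must cover all
// elements before we can forward the sources.
SmallBitVector ExtractedElts(NumElts);
for (MachineInstr &Use : MRI.use_nodbg_instructions(BuildVecReg)) {
  if (Use.getOpcode() != TargetOpcode::G_EXTRACT_VECTOR_ELT)
    return false;
  auto Idx = getIConstantVRegVal(Use.getOperand(2).getReg(), MRI);
  if (!Idx || Idx->uge(NumElts))
    return false;
  ExtractedElts.set(Idx->getZExtValue());
}
// Only if every element is extracted is the rewrite safe.
return ExtractedElts.all();
```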
Differential Revision: https://reviews.llvm.org/D97890
Refactor and add comments to explain where the magic numbers come from
in terms of the instruction cache line size. NFC.
Differential Revision: https://reviews.llvm.org/D98266
This patch implements DBG_VALUE_LIST handling in the LiveDebugValues pass. This
is a substantial change, and it makes a few fundamental changes to the existing
logic.
We still use the basic model of a VarLocMap that is indexed by a LocIndex, with
a VarLocSet (a CoalescingBitVector underneath) giving us efficient lookups of
existing variable locations for a given location type. The main change is that
the VarLocMap may contain a given VarLoc multiple times (once for each unique
location operand), so that a VarLoc can be looked up from any of the registers
that it uses. This means that each VarLoc has multiple corresponding LocIndexes;
to allow us to iterate through the set of VarLocs (previously we would iterate
through the VarLocSet), we now also maintain a single entry in the VarLocMap
that contains every VarLoc exactly once.
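The indexing scheme can be pictured with an ordinary multimap; this is a standalone model of the idea, not the actual CoalescingBitVector machinery:
```cpp
#include <map>

// Model: VarLoc 7 uses registers 3 and 5, so it is findable from either
// register, and it also appears once under a reserved "universal" key
// used to iterate every VarLoc exactly once.
constexpr unsigned UniversalKey = 0;

int main() {
  std::multimap<unsigned, unsigned> VarLocIndex; // key -> VarLoc id
  unsigned VarLocID = 7;
  for (unsigned Reg : {3u, 5u})
    VarLocIndex.emplace(Reg, VarLocID); // one entry per location operand
  VarLocIndex.emplace(UniversalKey, VarLocID); // single canonical entry

  // Clobbering register 5 finds VarLoc 7 via equal_range(5), while a
  // whole-set walk uses equal_range(UniversalKey) and sees it once.
}
```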
The VarLoc class itself is also changed; this change is much simpler,
refactoring out location-specific members into a MachineLocation class and
adding a vector of these locations.
Differential Revision: https://reviews.llvm.org/D83890
This test currently fails to compile when using a MinGW toolchain, as setenv is not defined; setenv is a POSIX function that Windows does not implement.
This patch enables the setenv macro used in the unit test for all of Windows, making the test compile and run successfully.
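The usual shape of such a shim, and presumably what the macro expands to, forwards to the Windows CRT (a sketch; the overwrite argument has no _putenv_s equivalent and is ignored):
```cpp
#include <stdlib.h>

// POSIX setenv does not exist on Windows; _putenv_s is the closest CRT
// equivalent. Note this version always overwrites.
#if defined(_WIN32) && !defined(setenv)
#define setenv(name, value, overwrite) _putenv_s(name, value)
#endif
```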
Differential Revision: https://reviews.llvm.org/D98271
llvm-jitlink and llvm-jitlink-executor make use of APIs that are
part of the socket and nsl libraries on SunOS systems (Solaris and
Illumos). Make sure they get linked.
Ran into this in Rust CI when cross-compiling LLVM 12 to these
targets.
Differential Revision: https://reviews.llvm.org/D97633
I've left mask registers to a future patch as we'll need
to convert them to full vectors, shuffle, and then truncate.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D97609