Update v4i64 -> v4f32/v4f64 uitofp costs based on the worst case costs from the script in D103695.
Fixes a few regressions before we start adding AVX costs for legalized types.
The compiler should not ignore UndefValue when gathering the scalars,
otherwise the resulting code may be less defined than the original one.
Also, grouped scalars to insert them at first to reduce the analysis in
further passes.
Differential Revision: https://reviews.llvm.org/D105275
Fixes bugs [[ https://bugs.llvm.org/show_bug.cgi?id=50580 | 50580 ]] and [[ https://bugs.llvm.org/show_bug.cgi?id=49446 | 49446 ]]
When compiling with -g "DBG_VALUE <reg>" instructions are added in the MIR, if such a instruction is inserted between instructions that use <reg> then MachineCopyPropagation invalidates <reg> , this causes some copies to not be propagated and causes differences in code generation (ex bugs 50580 and 49446 ). DBG_VALUE instructions should be ignored since they don't actually modify the register.
Reviewed By: lkail
Differential Revision: https://reviews.llvm.org/D104394
While this should not matter for most architectures (where the program
address space is 0), it is important for CHERI (and therefore Arm Morello).
We use address space 200 for all of our code pointers and without this
change we assert in the SelectionDAG handling of BlockAddress nodes.
It is also useful for AVR: previously programs targeting
AVR that attempt to read their own machine code
via a pointer to a label would instead read from RAM
using a pointer relative to the the start of program flash.
Reviewed By: dylanmckay, theraven
Differential Revision: https://reviews.llvm.org/D48803
If the store address does not dominate the matrix multiply, try to hoist
address computation instructions without side-effects and/or memory
reads before the multiply, to allow fusion.
Reviewed By: thegameg
Differential Revision: https://reviews.llvm.org/D105193
Reland of 31859f896.
This change implements new DAG notes GLOBAL_GET/GLOBAL_SET, and
lowering methods for load and stores of reference types from IR
globals. Once the lowering creates the new nodes, tablegen pattern
matches those and converts them to Wasm global.get/set.
Differential Revision: https://reviews.llvm.org/D104797
With 'for' loop there is is a single place where 'Current' is adjusted. It helps to avoid copy paste and makes a bit easy to understand overall loop controll flow.
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D101044
Previously we used the vector type, but we're loading/storing
invididual elements so I think only element alignment should matter.
Noticed while looking at the code for something else so I don't
have a test case.
Differential Revision: https://reviews.llvm.org/D105220
In `IRTranslator::translateGetElementPtr`, when we run into a vector gep with
some scalar operands, we try to normalize those operands using
`buildSplatVector`.
This is fine except for when the getelementptr has a <1 x N> type. In that case
it is treated as a scalar. If we run into one of these then every call to
```
// With VectorWidth = 1
LLT::fixed_vector(VectorWidth, PtrTy)
```
will assert.
Here's an example (equivalent to the added testcase):
https://godbolt.org/z/hGsTnMYdW
To get around this, this patch adds a variable, `WantSplatVector`, which
is true when our vector type ought to actually be represented using a vector.
When it's false, we'll translate as a scalar. This checks if `VectorWidth > 1`.
This fixes this bug:
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=35496
Differential Revision: https://reviews.llvm.org/D105316
Target-independent code only knows how to spill to the stack; instead,
use AArch64ISD::REINTERPRET_CAST.
Differential Revision: https://reviews.llvm.org/D104573
We need the compiler generated variable to override the weak symbol of
the same name inside the profile runtime, but using LinkOnceODRLinkage
results in weak symbol being emitted which leads to an issue where the
linker might choose either of the weak symbols potentially disabling the
runtime counter relocation.
This change replaces the use of weak definition inside the runtime with
an external weak reference to address the issue. We also place the
compiler generated symbol inside a COMDAT group so dead definition can
be garbage collected by the linker.
Differential Revision: https://reviews.llvm.org/D105176
This is the cause of the miscompile in:
https://llvm.org/PR50944
The problem has likely existed for some time, but it was made visible with:
5af8bacc94024 ( D104661 )
handleOtherCmpSelSimplifications() assumed it can convert select of
constants to bool logic ops, but that does not work with poison.
We had a very similar construct in InstCombine, so the fix here
mimics the fix there.
The bug is in instsimplify, but I'm not sure how to reproduce it outside of
instcombine. The reason this is visible in instcombine is because we have a
hack (FIXME) to bypass simplification of a select when it has an icmp user:
955f125899/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp (L2632)
So we get to an unusual case where we are trying to simplify an instruction
that has an operand that would have already simplified if we had processed
it in normal order.
Differential Revision: https://reviews.llvm.org/D105298
`llvm-strip` does not support `-l`. Apple's `strip` supports `-l`, but
it is not documented, and the latest code doesn't seem to do anything
meaningful. From the old source code drops it seems that `-l` was added
around version 795 of cctools and removed before 898. The code around
the flag usage in 795 talks about problems with kext and forcing the
execution of `ld -r`, which seems a behaviour that is not enforceable in
latest versions of cctools.
The `-l` flag was added in https://reviews.llvm.org/D15133 without a lot
of explanation.
Since the flag is not active, removing it should not modify the
behaviour for most people (except if someone is trying to compile LLVM
with a really old version of `strip`).
Additionally, break the invocation into two different flags, since
`llvm-strip` doesn't at the moment support grouped flags, and other
`strip` implementations should work the same no matter if grouped or
not.
Test Plan:
Using `strip` from Xcode 12.5 in Big Sur to strip the same binary (a
simple Hello World), using both `-Sxl` and `-Sx` produces exactly the
same binary.
Repeating the same process with `clang` results also in the same binary.
Reviewed By: smeenai
Differential Revision: https://reviews.llvm.org/D105243
GNU and Apple `strip` implementations seems to support grouped options.
Enable the support for grouped options introduced in
https://reviews.llvm.org/D83639 for `llvm-strip` invocations.
Includes test that checks that both the grouped and non grouped
invocations produces the same result.
Reviewed By: alexander-shaposhnikov, MaskRay
Differential Revision: https://reviews.llvm.org/D105249
D104868 removed an (incorrect) fold for distributing BFI instructions in
a chain, combining them into a single instruction. BFIs like that are
hard to test, as the patterns are often destroyed before they become
BFIs. But it can come up in places, with chains of BFIs that can be
combined.
This patch adds a replacement, which reassociates BFI instructions with
non-overlapping insertion masks so that low bits are inserted first.
This can end up sorting the nodes so that adjacent inserts are next to
one another, allowing the existing folds to combine into a single BFI.
Differential Revision: https://reviews.llvm.org/D105096
The new documentation entry gives an example use case from
libomptarget.
Reviewed By: yln, jhenderson, davezarzycki
Differential Revision: https://reviews.llvm.org/D105208
Synchronizing multiple custom targets requires not only target but also
file dependencies. Building Linalg involves running yaml-gen followed by
tablegen. Currently, these custom targets are only synchronized using a
target dependency resulting in issues in specific incremental build
setups (https://llvm.discourse.group/t/missing-build-cmake-tblgen-dependency/3727/10).
This patch introduces a novel LLVM_TARGET_DEPENDS variable to the
TableGen.cmake file to provide a way to specify file dependencies.
Additionally, it adapts the Linalg CMakeLists.txt to introduce the
necessary file dependency between yaml-gen and tablegen.
Differential Revision: https://reviews.llvm.org/D105272
Address mistakenly comparing the pointer values of two C-style strings
rather than comparing their contents in the unit tests for makeVisitor,
added in 6d6f35eb7b92c6dd4478834497752f4e963db16d
This follows up to D104665 (which added umulo handling alongside the existing uaddo case), and generalizes for the remaining overflow intrinsics.
I went to add analogous handling to LVI, and discovered that LVI already had a more general implementation. Instead, we can port was LVI does to instcombine. (For context, LVI uses makeExactNoWrapRegion to constrain the value 'x' in blocks reached after a branch on the condition `op.with.overflow(x, C).overflow`.)
Differential Revision: https://reviews.llvm.org/D104932
This adds support for opaque pointers in intrinsic type checks
of IIT kind Pointer and PtrToElt.
This is less straight-forward than it might initially seem, because
we should only accept opaque pointers here in --force-opaque-pointers
mode. Otherwise, there would be more than one valid type signature
for a given intrinsic name.
Differential Revision: https://reviews.llvm.org/D105155
Inserting into a smaller-than-legal scalable vector would result in an
internal compiler error. For example, inserting a <vscale x 4 x i8> into
a <vscale x 8 x i8> (both illegal vector types for SVE) would cause a
crash.
This crash was happening because there was no code to promote (legalise)
the result of an INSERT_SUBVECTOR node.
This patch implements PromoteIntRes_INSERT_SUBVECTOR, which legalises
the ISD node. This is currently done by going through memory. This is
necessary because of the requirement that the SubVec parameter of the
INSERT_SUBVECTOR node must be smaller than the Vec parameter, which
means that INSERT_SUBVECTOR cannot always have a legal result/operand
types.
Co-Authored-by: Joe Ellis <joe.ellis@arm.com>
Differential Revision: https://reviews.llvm.org/D102766
This reverts b56e5f8a10c1 (and follow-up f6db88535cb) and instead
restores the state we had before 0c96a92d8666b8: ClangdMain.cpp
includes Features.inc before including Transport.h.
This is a bit ugly, but it matches the former state and making Transport.h
include Features.h means that xpc/ needs to be able to find the generated
Features.inc, wich is also a bit ugly.
Building on rG2a1ef8784ad9a, adjust the SSE cost tables to use the legalized types based on the worst case costs from the script in D103695.
To account for different numbers of src/dst legalized type registers we must scale the cost by maximum of the src/dst, not just use src