Summary:
Separate introduction of IntrNoFree property as suggested in D70365
Reviewers: arsenm, nhaehnle
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D82587
Currently we have to override all macros that are declared. But in many
cases it is convenient to use default values and to override only
a particular one or two.
This provides a way to set a default value for any macro:
```
Symbols:
- Name: [[FOO=foo]]
```
Differential revision: https://reviews.llvm.org/D82455
Select into corresponding V_CMP instruction based on CmpInst predicate,
stored as immediate, in last operand.
Differential Revision: https://reviews.llvm.org/D82652
This reverts commit ed4328c607306a2aa6df4833a0dce4482edbc94c.
My change apparently caused a buildbot to fail with the error
CMake Error at /b/sanitizer-x86_64-linux-autoconf/build/tsan_release_build/lib/cmake/llvm/AddLLVM.cmake:869 (add_dependencies):
The dependency target "omp_gen" of target "ScudoBenchmarks.x86_64" does not
exist.
I don't at all understand why, because as far as I can see, the target
`omp_gen` is only added to `LLVM_COMMON_DEPENDS` after having been
created, so there //should// be no way it can end up on anything's
dependency list if it doesn't exist! But apparently it happened anyway.
Differential Revision: https://reviews.llvm.org/D82659
'InitialLength' is replaced with 'Format' (DWARF32 by default) and 'Length' in this patch.
Besides, test cases for DWARFv4 and DWARFv5, DWARF32 and DWARF64 is
added.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D82622
For TI's distribution of msp430-gcc
```
msp430-elf-gcc -S -o- -Os -x c - <<< "int f(float a, float b) { return a != b; }"
```
does not mention `R13` at all. `__libgcc_cmp_return__` machine mode is 2 byte on MSP430, as well.
Differential Revision: https://reviews.llvm.org/D82635
Summary:
`include/llvm/Frontend/OpenMP/CMakeLists.txt` creates a new target
called `omp_gen`, which is automatically added to
`LLVM_COMMON_DEPENDS` by the `add_public_tablegen_target` macro. But
it only gets added to the version of `LLVM_COMMON_DEPENDS` in the
scope of that subsidiary CMakeLists file, and it doesn't propagate all
the way back up to the permanent version of that variable which is
actually used to set dependencies.
The visible effect is that the output build scripts contain a missing
dependency. For example, if I run cmake in Ninja output mode, and then
run
ninja -t commands tools/clang/examples/PrintFunctionNames/CMakeFiles/PrintFunctionNames.dir/PrintFunctionNames.cpp.o
to list all the commands that are prerequisites of building that
object file, then the list does not include the llvm-tblgen command
that builds `include/llvm/Frontend/OpenMP/OMP.h.inc`, even though that
generated include file is needed (by a chain of includes starting from
`clang/AST/AST.h`), and that object file can't be compiled without it.
This missing dependency can cause intermittent build failures,
depending on the order that Ninja (or whatever) happens to schedule
its commands.
I've fixed it by adding a `set` command in two levels of
`CMakeLists.txt` to propagate the modified version of
`LLVM_COMMON_DEPENDS` back up through the parent scopes so that
`omp_gen` does end up on the version seen by the main
`llvm/CMakeLists.txt`.
Reviewers: clementval, thakis, chandlerc, jdoerfert
Reviewed By: clementval
Subscribers: jdenny, mgorny, sstefan1, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D82659
Fix a warning in getNode() when extracting a subvector from a
concat vector. We can simply replace the call to getVectorNumElements
with getVectorMinNumElements as this follows the defined behaviour
for EXTRACT_SUBVECTOR.
Differential Revision: https://reviews.llvm.org/D82746
Instead of doing multiple unpacks when zero extending vectors (e.g. v2i16 ->
v2i64), benchmarks have shown that it is better to do a VPERM (vector
permute) since that is only one sequential instruction on the critical path.
This patch achieves this by
1. Expand ZERO_EXTEND_VECTOR_INREG into a vector shuffle with a zero vector
instead of (multiple) unpacks.
2. Improve SystemZ::GeneralShuffle to perform a single unpack as the last
operation if Bytes matches it.
Review: Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D78486
This diff addresses the old TODO in MachOObjcopy.cpp and
correctly sets the page size used for alignment of segments.
In particular, now llvm-objcopy's output is consistent
with the input (the alignment of vmsize doesn't change).
Test plan:
1. make check-all
2. verify that a binary copied via llvm-objcopy now correctly works on iOS.
Differential revision: https://reviews.llvm.org/D82503
When trying to reduce a BUILD_VECTOR to a SHUFFLE_VECTOR it's
important that we carefully check the vector types that led to
that BUILD_VECTOR. In the test I have attached to this commit
there is a case where the results of two SVE faddv instructions
are being stored to consecutive memory locations. With my fix,
as part of merging those stores we discover that each BUILD_VECTOR
element came from an extract of a SVE vector element and
therefore bail out.
Differential Revision: https://reviews.llvm.org/D82564
These map to the similar accessors on ArrayRef and other random access containers.
This fixes a compilation error on MLIR ODS for variadic operands/results, which relied on the availability of front in certain situations.
Sometimes SimplifyCFG may decide to perform jump threading. In order
to do it, it follows the following algorithm:
1. Checks if the block is small enough for threading;
2. If yes, inserts a PR Phi relying that the next iteration will remove it
by performing jump threading;
3. The next iteration checks the block again and performs the threading.
This logic has a corner case: inserting the PR Phi increases block's size
by 1. If the block size at first check was max possible, one more Phi will
exceed this size, and we will neither perform threading nor remove the
created Phi node. As result, we will end up with worse IR than before.
This patch fixes this situation by excluding Phis from block size computation.
Excluding Phis from size computation for threading also makes sense by
itself because in case of threadign all those Phis will be removed.
Differential Revision: https://reviews.llvm.org/D81835
Reviewed By: asbirlea, nikic
If the shuffle is a blend and one input is a 0 vector, we should prefer AND over PSHUFB since its available on more execution ports.
Differential Revision: https://reviews.llvm.org/D82798
`FILECHECK_OPTS` was implemented so that a test runner, such as CI,
can specify FileCheck debugging options, such as `-v` and `-vv`.
However, if a test suite has a FileCheck call that already specifies
`-v` or `-vv`, then that call will fail if `FILECHECK_OPTS` also
specifies it.
For `-vv`, this problem already exists:
`clang/test/CodeGen/aarch64-v8.2a-fp16-intrinsics-constrained.c`
It's not yet clear if the `-vv` in that test was intentional, but this
usage shouldn't fail anyway. It's already true that FileCheck permits
`-vv` and `-v` together even though `-vv` implies `-v`.
Compare D70784, which fixed the same problem for `-dump-input`.
Reviewed By: jhenderson, thopre
Differential Revision: https://reviews.llvm.org/D82601
Summary:
add_unittest was checking that the result of get_target_property was not
"NOTFOUND", but despite what the documentation says, get_target_property
returns <the var>-NOTFOUND on failure.
Reviewers: efriedma, thakis, serge-sans-paille, chandlerc
Reviewed By: serge-sans-paille
Subscribers: mgorny, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D81762
This change ensures that the Darwin driver doesn't add unsupported libraries to the link
invocation when linking the Apple Silicon macOS slice.
rdar://61011136
Differential Revision: https://reviews.llvm.org/D82696
This was producing reg = xor undef reg, undef reg. This looks similar
to a use of a value to define itself, and I want to disallow undef
uses for SSA virtual registers. If this were to use implicit_def,
there's no guarantee the two operands end up using the same register
(I think no guarantee exists even if the two operands start out as the
same register, but this was violated when I switched this to use an
explicit implicit_def). The MOV32r0 pseudo evidently exists to handle
this case, so use it instead. This was more work than I expected for
the 64-bit case, but I didn't see any helper for materializing a
64-bit 0.
This reverts commit 01bf8cdf5fa9bc71869e15e5e351b2b68c39feb6.
Breaks the build:
llvm/include/llvm/ADT/FunctionExtras.h:223:7: error: explicit template argument list not allowed
223 | Callbacks<CallableT, CalledAs, EnableIfTrivial<CallableT>>;
Summary:
This technique should extend to rvalue-qualified etc, but I didn't add any.
I removed "volatile" from the future plans, which seems... speculative at best.
While here I moved the callbacks object out of the constructor into a
variable template, which I believe addresses the fixme there about unused
objects.
(I'm not a template guru, so it's always possible the old version was designed
for compile-time performance in a way I'm missing)
Reviewers: kadircet
Subscribers: dexonsmith, llvm-commits, chandlerc
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D82581
Commit 1fed131660b2 assumed that shuffle vector canonicalization will
always ensure that the shuffle mask will be ordered so that element
zero comes from the LHS vector. However there is code out there for
which this is not the case. This patch simply removes that unsafe
assumption and makes the code work regardless of the source of the
first element.
The "new way" of enabling recommonmark is only supported in recommonmark
0.5 and later. Use the deprecated approach with versions of Sphinx that
still support it.
If I understand correctly there's no way to use older versions of
recommonmark (<0.5) with newer versions of Sphinx (>3.0) because the old
approach got removed.
Differential revision: https://reviews.llvm.org/D75284
It's possible for the first loop trip(s) to set the `Changed` Status, and to a
later one to early exit, in which case `Changed` must be return.
Differential Revision: https://reviews.llvm.org/D82753
For now, moving it to SIPreEmitPeephole.
Should find a right place to have this code.
Reviewed By: nhaehnle
Differential revision: https://reviews.llvm.org/D77544
MVE has native reductions for integer add and min/max. The others need
to be expanded to a series of extract's and scalar operators to reduce
the vector into a single scalar. The default codegen for that expands
the reduction into a series of in-order operations.
This modifies that to something more suitable for MVE. The basic idea is
to use vector operations until there are 4 remaining items then switch
to pairwise operations. For example a v8f16 fadd reduction would become:
Y = VREV X
Z = ADD(X, Y)
z0 = Z[0] + Z[1]
z1 = Z[2] + Z[3]
return z0 + z1
The awkwardness (there is always some) comes in from something like a
v4f16, which is first legalized by adding identity values to the extra
lanes of the reduction, and which can then not be optimized away through
the vrev; fadd combo, the inserts remain. I've made sure they custom
lower so that we can produce the pairwise additions before the extra
values are added.
Differential Revision: https://reviews.llvm.org/D81397