`CheckAtomic.cmake` was skipping the test of whether atomics work in MSVC
without an atomics library (they do), but not setting the value of
`HAVE_CXX_ATOMICS_WITHOUT_LIB`. That caused build issues when trying to land
D69869. I fixed that issue in f128f442a3d, by adding an `elseif(MSVC)`, as
was being done below in the 64-bit atomics check. That minimal fix did work,
but it kept various inconsistencies between the original atomics check and
the 64-bit one. This patch now makes the checks follow the same structure,
cleaning them up.
Differential Revision: https://reviews.llvm.org/D74767
Cost-modeling decisions are tied to the compute interleave groups
(widening decisions, scalar and uniform values). When invalidating the
interleave groups, those decisions also need to be invalidated.
Otherwise there is a mis-match during VPlan construction.
VPWidenMemoryRecipes created initially are left around w/o converting them
into VPInterleave recipes. Such a conversion indeed should not take place,
and these gather/scatter recipes may in fact be right. The crux is leaving around
obsolete CM_Interleave (and dependent) markings of instructions along with
their costs, instead of recalculating decisions, costs, and recipes.
Alternatively to forcing a complete recompute later on, we could try
to selectively invalidate the decisions connected to the interleave
groups. But we would likely need to run the uniform/scalar value
detection parts again anyways and the extra complexity is probably not
worth it.
Fixes PR45572.
Reviewers: gilr, rengolin, Ayal, hsaito
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D78298
Cherrypick the upstream fix commit a77d5f7 onto llvm/utils/benchmark
and libcxx/utils/google-benchmark.
This fixes LLVM's 32-bit RISC-V compilation, and the issues
mentioned in https://github.com/google/benchmark/pull/955
An additional cherrypick of ecc1685 fixes some minor formatting
issues introduced by the preceding commit.
Differential Revision: https://reviews.llvm.org/D78084
Summary:
The instruction in non-text section can not be executed, so they will not affect performance.
In addition, their encoding values are treated as data, so we should not touch them.
Reviewers: MaskRay, reames, LuoYuanke, jyknight
Reviewed By: MaskRay
Subscribers: annita.zhang, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77971
[MachineOutliner] fix test for excluding CFI and add test to include CFI in outlining
New test to check that we only outline CFI instruction if all CFI
Instructions in the function would be captured by the outlining
adding x86 tests analagous to AARCH64 cfi tests
Revision: https://reviews.llvm.org/D77852
BitVectors and SmallBitVectors with equal contents but different
capacities were getting different hashes.
Reviewed By: aganea
Differential Revision: https://reviews.llvm.org/D77038
The immediate used for the regdef is the encoding for the register
class in the enum generated by tablegen. This encoding will change
any time a new register class is added. Since the number is part
of the input, this means it can become stale.
This change modifies some test to avoid this kind of immediate
all together. And updates one test to use the current encoding of
GR64.
SmallVector currently uses 32bit integers for size and capacity to reduce
sizeof(SmallVector). This limits the number of elements to UINT32_MAX.
For a SmallVector<char>, this limits the SmallVector size to only 4GB.
Buffering bitcode output uses SmallVector<char>, but needs >4GB output.
This changes SmallVector size and capacity to conditionally use word-size
integers if the element type is small (<4 bytes). For larger elements types,
the vector size can reach ~16GB with 32bit size.
Making this conditional on the element type provides both the smaller
sizeof(SmallVector) for larger types which are unlikely to grow so large,
and supports larger capacities for smaller element types.
This change also includes a fix for the bug where a SmallVector with 32bit
size has reached UINT32_MAX elements, and cannot provide guaranteed growth.
Context:
// Double the size of the allocated memory, guaranteeing space for at
// least one more element or MinSize if specified.
void grow(size_t MinSize = 0) { this->grow_pod(MinSize, sizeof(T)); }
void push_back(const T &Elt) {
if (LLVM_UNLIKELY(this->size() >= this->capacity()))
this->grow();
memcpy(reinterpret_cast<void *>(this->end()), &Elt, sizeof(T));
this->set_size(this->size() + 1);
}
When grow is called in push_back() without a MinSize specified, this is
relying on the guarantee of space for at least one more element.
There is an edge case bug where the SmallVector is already at its maximum size
and push_back() calls grow() with default MinSize of zero. Grow is unable to
provide space for one more element, but push_back() assumes the additional
element it will be available. This can result in silent memory corruption, as
this->end() will be an invalid pointer and the program may continue executing.
An alternative to this fix would be to remove the default argument from
grow(), which would mean several changing grow() to grow(this->size()+1)
in several places.
No test case added because it would require allocating a large ammount.
Differential Revision: https://reviews.llvm.org/D77621
Starting with hasRedZone adding MachineFunctionInfo to be put in the YAML for MIR files.
Split out of: D78062
Based on implementation for MachineFunctionInfo for WebAssembly
Differential Revision: https://reviews.llvm.org/D78173
Patch by Andrew Litteken! (AndrewLitteken)
We should only modify existing ones. Previously, we were creating
MachineFunctions for externally-available functions. AFAICT this was benign
in tree but ultimately led to asan bugs in our out of tree target.
Summary:
Remove asserting vector getters from Type in preparation for the
VectorType refactor. The existence of these functions complicates the
refactor while adding little value.
Reviewers: dexonsmith, sdesmalen, efriedma
Reviewed By: efriedma
Subscribers: cfe-commits, hiraditya, llvm-commits
Tags: #llvm, #clang
Differential Revision: https://reviews.llvm.org/D77278
InstCombine should ensure these don't exist.
I'm looking at making some changes to how we detect these
patterns and not having to worry about these phis will help.
Summary:
Use a SmallSetVector instead of a SmallPtrSet when collecting
and storing Roots.
The iteration order for a SmallPtrSet is not deterministic,
so in the past the order of items inserted in the WorkList
inside walkBackwards has been non-deterministic. This patch
intends to make the order of rewrites done in Float2Int
deterministic by changing the container for the Roots set.
The semantics result of the transformation should not be
any different afaict. But at least naming of IR variables
(when outputting the result as an ll file) should be more
stable now.
Reviewers: craig.topper, spatel, cameron.mcinally
Reviewed By: spatel
Subscribers: mgrang, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74534
This reverts commit 17b1869b72f30f2702cb1abd7222027082e49eb6.
It is an attempt to fix the failure reported at
The patch differs from the original one reviwed at
https://reviews.llvm.org/D77435 only for the use of the std::make_tuple
in building the return value of `findAddrModeSVELoadStore`:
- return {IsRegReg ? Opc_rr : Opc_ri, NewBase, NewOffset};
+ return std::make_tuple(IsRegReg ? Opc_rr : Opc_ri, NewBase,
the original patch submitted at
fc4e954ed5
was failing the following build:
http://lab.llvm.org:8011/builders/clang-armv7-linux-build-cache/builds/29420/
with error:
/home/buildslave/buildslave/clang-armv7-linux-build-cache/llvm/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
/home/buildslave/buildslave/clang-armv7-linux-build-cache/llvm/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp:1439:10:
error: chosen constructor is explicit in copy-initialization
return {IsRegReg ? Opc_rr : Opc_ri, NewBase, NewOffset};
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/usr/bin/../lib/gcc/arm-linux-gnueabihf/5.4.0/../../../../include/c++/5.4.0/tuple:479:19:
note: explicit constructor declared here
constexpr tuple(_UElements&&... __elements)
^
1 error generated.
Summary:
This change is fixing an issue where the dagcombine incorrectly used an addressing mode with scaled offsets (indices), instead of unscaled offsets.
Those addressing modes do not exist for `prfh` , `prfw` and `prfd`, hence we can reuse `prfb` because that has unscaled offsets, and because the pseudo-code in the XML spec suggests that the element size is not used for the amount of data that is prefetched by the instruction.
FWIW, GCC also emits a `prfb` for these cases.
Reviewers: sdesmalen, andwar, rengolin
Reviewed By: sdesmalen
Subscribers: tschuett, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78069
Summary:
Remove usages of asserting vector getters in Type in preparation for the
VectorType refactor. The existence of these functions complicates the
refactor while adding little value.
Reviewers: craig.topper, sdesmalen, efriedma, RKSimon
Reviewed By: efriedma
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77264
There are also some adjustments to use MaybeAlign in here due
to CallBase::getParamAlignment() being deprecated. It would
be a little cleaner if getOrEnforceKnownAlignment was migrated
to Align/MaybeAlign.
Differential Revision: https://reviews.llvm.org/D78345
There are also some adjustments to use MaybeAlign in here due
to CallBase::getParamAlignment() being deprecated. It would
be cleaner if getOrEnforceKnownAlignment was migrated
to Align/MaybeAlign.
Differential Revision: https://reviews.llvm.org/D78345
This patch combines the "has" and "get" parts of the cache access.
getCachedValueInfo() now both sets the BBLV return argument, and
returns whether the value was found.
Additionally, the management of the work stack is now integrated
into getBlockValue(). If the value is not cached yet, we try to
push to the stack (and return false, indicating that we need to
solve first), or return overdefined in case of a cycle.
These changes a) avoid a duplicate cache lookup for has & get and
b) ensure that the logic is uniform everywhere. For this reason
this change is also not quite NFC, because previously overdefined
values from the cache, and overdefined values from a cycle received
different treatment when it came to assumption intersection.
Differential Revision: https://reviews.llvm.org/D76788
PredicateInfo takes up a large amount of memory during IPSCCP
with many functions. And a large part of that space seems to
be going completely to waste here...
Summary:
Fixed wrong conditions for generating (S[LR]I X, Y, C2) from
(or (and X, BvecC1), (lsl Y, C2)) and added ISel nodes to lower to S[LR]I. The
optimisation is also enabled by default now.
Subscribers: kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77387