Currently needsStackRealignment returns false if canRealignStack returns false.
This means that the behavior of needsStackRealignment does not correspond to
it's name and description; a function might need stack realignment, but if it
is not possible then this function returns false. Furthermore,
needsStackRealignment is not virtual and therefore some backends have made use
of canRealignStack to indicate whether a function needs stack realignment.
This patch attempts to clarify the situation by separating them and introducing
new names:
- shouldRealignStack - true if there is any reason the stack should be
realigned
- canRealignStack - true if we are still able to realign the stack (e.g. we
can still reserve/have reserved a frame pointer)
- hasStackRealignment = shouldRealignStack && canRealignStack (not target
customisable)
Targets can now override shouldRealignStack to indicate that stack realignment
is required.
This change will make it easier in a future change to handle the case where we
need to realign the stack but can't do so (for example when the register
allocator creates an aligned spill after the frame pointer has been
eliminated).
Differential Revision: https://reviews.llvm.org/D98716
Change-Id: Ib9a4d21728bf9d08a545b4365418d3ffe1af4d87
This patch changes the interface to take a RegisterKind, to indicate
whether the register bitwidth of a scalar register, fixed-width vector
register, or scalable vector register must be returned.
Reviewed By: paulwalker-arm
Differential Revision: https://reviews.llvm.org/D98874
As far as I can tell, the node coming in has an i64 result so the
return should have the same type. The HexagonISD node used for
this has a type profile that says the result is i64.
Found while trying to add assserts to LegalizeDAG to catch
result type mismatches.
Reviewed By: kparzysz
Differential Revision: https://reviews.llvm.org/D98962
The offset in HVX loads/stores is only 4 bits long, so often an
extra register is needed to hold the address. Minimize the number
of such registers by "standardizing" the base addresses and reusing
preexisting base registers when replacing frame indices.
This adds an Mask ArrayRef to getShuffleCost, so that if an exact mask
can be provided a more accurate cost can be provided by the backend.
For example VREV costs could be returned by the ARM backend. This should
be an NFC until then, laying the groundwork for that to be added.
Differential Revision: https://reviews.llvm.org/D98206
As a followup to D95291, getOperandsScalarizationOverhead was still
using a VF as a vector factor if the arguments were scalar, and would
assert on certain matrix intrinsics with differently sized vector
arguments. This patch removes the VF arg, instead passing the Types
through directly. This should allow it to more accurately compute the
cost without having to guess at which operands will be vectorized,
something difficult with more complex intrinsics.
This adjusts one SVE test as it is now calling the wrong intrinsic vs
veccall. Without invalid InstructCosts the cost of the scalarized
intrinsic is too low. This should get fixed when the cost of
scalarization is accounted for with scalable types.
Differential Revision: https://reviews.llvm.org/D96287
This refactors shouldFavorPostInc() and shouldFavorBackedgeIndex() into
getPreferredAddressingMode() so that we have one interface to steer LSR in
generating the preferred addressing mode.
Differential Revision: https://reviews.llvm.org/D96600
This will be needed in the loop-vectorizer where the minimum VF
requested may be a scalable VF. getMinimumVF now takes an additional
operand 'IsScalableVF' that indicates whether a scalable VF is required.
Reviewed By: kparzysz, rampitec
Differential Revision: https://reviews.llvm.org/D96020
The Hexagon Vector Combine pass genertes stores for a complete
aligned vector. The start of each section is a multiple of the
vector size, so that value is passed to normalize to compute
the offset of the stores in the section. The first store may
not occur at offset 0 when there is a gap between sections.
The result cannot be widened, unfortunately, because widening vNi1
would depend on the context in which it appears (i.e. the type alone
is not sufficient to tell if it needs to be widened).
This patch fixes a bug that could result in miscompiles (at least
in an OOT target). The problem could be seen by adding checks that
the DominatorTree used in BasicAliasAnalysis and ValueTracking was
valid (e.g. by adding DT->verify() call before every DT dereference
and then running all tests in test/CodeGen).
Problem was that the LegacyPassManager calculated "last user"
incorrectly for passes such as the DominatorTree when not telling
the pass manager that there was a transitive dependency between
the different analyses. And then it could happen that an incorrect
dominator tree was used when doing alias analysis (which was a pretty
serious bug as the alias analysis result could be invalid).
Fixes: https://bugs.llvm.org/show_bug.cgi?id=48709
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D94138
In https://reviews.llvm.org/D88138 this was incorrectly added with
registerOptimizerLastEPCallback(), when it should be
registerLoopOptimizerEndEPCallback(), matching the legacy PM's
EP_LoopOptimizerEnd.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D93929
As mentioned in D93793, there are quite a few places where unary `IRBuilder::CreateShuffleVector(X, Mask)` can be used
instead of `IRBuilder::CreateShuffleVector(X, Undef, Mask)`.
Let's update them.
Actually, it would have been more natural if the patches were made in this order:
(1) let them use unary CreateShuffleVector first
(2) update IRBuilder::CreateShuffleVector to use poison as a placeholder value (D93793)
The order is swapped, but in terms of correctness it is still fine.
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D93923
My bot runs VS 2019, but it could not compile this code.
Message:
[55/2465] Building CXX object lib\Target\Hexagon\CMakeFiles\LLVMHexagonCodeGen.dir\HexagonVectorCombine.cpp.obj
FAILED: lib/Target/Hexagon/CMakeFiles/LLVMHexagonCodeGen.dir/HexagonVectorCombine.cpp.obj
...
C:\Program Files (x86)\Microsoft Visual Studio\2019\Professional\VC\Tools\MSVC\14.23.28105\include\map(71): error C2976: 'std::map': too few template arguments
C:\Program Files (x86)\Microsoft Visual Studio\2019\Professional\VC\Tools\MSVC\14.23.28105\include\map(71): note: see declaration of 'std::map'
The version in the path, 14.23, corresponds to _MSC_VER 1923, so raise
the version floor to 1924.
I have not tested with versions between 1924 and 1928 (latest), but the
latest works with the variadic version.