We were defaulting to the lower action for this, resulting in SHL+ASHR
sequences. On AArch64 we can do this in one instruction for an arbitrary
extension using SBFM as we do for G_SEXT.
Differential Revision: https://reviews.llvm.org/D81992
Summary:
UBSan buildbot caught an undefined behavior in itostr with INT64_MIN.
The negation cannot be represented in the promoted operand (long long).
Negation is well defined on unsigned value though so this commit does
the negation after the static cast.
Reviewers: jhenderson, chandlerc, lattner
Reviewed By: lattner
Subscribers: dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D82200
This relaxes an assertion that required symbols to start before the end
of a block. Instead, symbols are now required to end on or before the
end of a block. This fixes two important corner cases: Symbols at the
start of empty blocks/sections, and block/section end symbols.
Summary:
Add patterns to select s_cselect in the isel.
Handle more cases of implicit SCC accesses in si-fix-sgpr-copies
to allow new patterns to work.
Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, asbirlea, kerbowa, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D81925
Summary:
Factor out repetetitive code into helper function and split massive
ExpressionFormat method test into separate test for each method,
removing dead code in passing. Also add a MinInt64 and MaxInt64 checks
when testing getMatchingString.
Reviewers: jhenderson, jdenny, probinson, grimar, arichardson
Reviewed By: jhenderson, grimar
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D82132
This patch adds codegen for the following BFloat
operations to the ARM backend:
* concatenation of bf16 vectors
* bf16 vector element extraction
* bf16 vector element insertion
* duplication of a bf16 value into each lane of a vector
* duplication of a bf16 vector lane into each lane
Differential Revision: https://reviews.llvm.org/D81411
Similar to D81937, we might crash when printing a histogram for a GNU hash table
with a 'symndx' index that is larger than the number of dynamic symbols.
This patch adopts and reuses the `getGnuHashTableChains()` helper which performs
a validation of the table. As a side effect the warning reported for
the --gnu-hash-table was improved.
Also with this change we start to report a warning when the histogram is requested for
the GNU hash table, but the dynamic symbols table is empty (size == 0).
Differential revision: https://reviews.llvm.org/D82010
All cases in the test are supported now, it only still failed because an
over-eager regex match not accounting for `, align ` being added to each
load/store now.
Without SSE41 we don't have the PCMPEQQ instruction, making cmp-with-zero reductions more complicated than necessary. We can compare as vXi32 (PCMPEQD) and tweak the MOVMSK comparison to test upper/lower DWORD comparisons.
This pre-fixes something that occurs with null tests for vectors of (64-bit) pointers such as in PR35129.
At the moment we use Global ISel by default at -O0, however it is
currently not capable of dealing with scalable vectors for two
reasons:
1. The register banks know nothing about SVE registers.
2. The LLT (Low Level Type) class knows nothing about scalable
vectors.
For now, the easiest way to avoid users hitting issues when using
the SVE ACLE is to fall back on normal DAG ISel when encountering
instructions that operate on scalable vector types.
I've added a couple of RUN lines to existing SVE tests to ensure
we can compile at -O0. I've also added some new tests to
CodeGen/AArch64/GlobalISel/arm64-fallback.ll
that demonstrate we correctly fallback to DAG ISel at -O0 when
lowering formal arguments or translating instructions that involve
scalable vector types.
Differential Revision: https://reviews.llvm.org/D81557
Code does not track terminators and do not expose them through interface.
State there is just a state of the last instruction or entry.
So this information is just redundant and doesn't need to be tested.
This patch updates SCCP/IPSCCP to use the computed range info to turn
sexts into zexts, if the value is known to be non-negative. We already
to a similar transform in CorrelatedValuePropagation, but it seems like
we can catch a lot of additional cases by doing it in SCCP/IPSCCP as
well.
The transform is limited to ranges that are known to not include undef.
Currently constant ranges from conditions are treated as potentially
containing undef, due to PR46144. Once we flip this, the transform will
be more effective in practice.
Reviewers: efriedma, davide
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D81756
Without this fix, handleMoveUp can create an invalid live range like
this:
[98904e,98908r:0)[98908e,227504r:1)
where the two segments overlap, but only because we have lost the "e"
(early-clobber) on the end point of the first segment.
Differential Revision: https://reviews.llvm.org/D82110
Summary:
this reduces significantly the number of assumes generated without aftecting too much
the information that is preserved. this improves the compile-time cost
of enable-knowledge-retention significantly.
Reviewers: jdoerfert, sstefan1
Reviewed By: jdoerfert
Subscribers: hiraditya, asbirlea, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79650
For now I have changed SimplifyDemandedBits and it's various callers
to assume we know nothing for scalable vectors and to ignore the
demanded bits completely. I have also done something similar for
SimplifyDemandedVectorElts. These changes fix up lots of warnings
due to calls to EVT::getVectorNumElements() for types with scalable
vectors. These functions are all used for optimisations, rather than
functional requirements. In future we can revisit this code if
there is a need to improve code quality for SVE.
Differential Revision: https://reviews.llvm.org/D80537
When trying to calculate the number of sign bits for scalable vectors
we should just bail out for now and pretend we know nothing.
Differential Revision: https://reviews.llvm.org/D81093
These tests involve a JIT, and like other tests should have the
REQUIRE: default_triple present.
This allow to run `ninja check` without the host target configured
in.
A "BTI c" instruction only allows jumping/calling to using a BLR* instruction.
However, the SLSBLR mitigation changes a BLR to a BR to implement the
function call. Therefore, a "BTI c" check that passed before could
trigger after the BLR->BL change done by the SLSBLR mitigation.
However, if the register used in BR is X16 or X17, this trigger will not
fire (see ArmARM for further details).
Therefore, this patch simply changes the function stubs for the SLSBLR
mitigation from
__llvm_slsblr_thunk_x<N>:
br x<N>
SpeculationBarrier
to
__llvm_slsblr_thunk_x<N>:
mov x16, x<N>
br x16
SpeculationBarrier
Differential Revision: https://reviews.llvm.org/D81405
We currently miss a number of opportunities to emit single-instruction
VMRG[LH][BHW] instructions for shuffles on little endian subtargets. Although
this in itself is not a huge performance opportunity since loading the permute
vector for a VPERM can always be pulled out of loops, producing such merge
instructions is useful to downstream optimizations.
Since VPERM is essentially opaque to all subsequent optimizations, we want to
avoid it as much as possible. Other permute instructions have semantics that can
be reasoned about much more easily in later optimizations.
This patch does the following:
- Canonicalize shuffles so that the first element comes from the first vector
(since that's what most of the mask matching functions want)
- Switch the elements that come from splat vectors so that they match the
corresponding elements from the other vector (to allow for merges)
- Adds debugging messages for when a shuffle is matched to a VPERM so that
anyone interested in improving this further can get the info for their code
Differential revision: https://reviews.llvm.org/D77448
Summary: so it doesn't change the C ABI
Reviewers: deadalnix
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D82135
Summary:
Extend StackLifetime with option to calculate liveliness
where alloca is only considered alive on basic block entry
if all non-dead predecessors had it alive at terminators.
Depends on D82043.
Reviewers: eugenis
Reviewed By: eugenis
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D82124