In some cases, the RC may have 0 allocatable registers, e.g. VRSAVERC in
PowerPC, which has only one register, and that register is also reserved.
The current implementation keeps calling computePSetLimit because
getRegPressureSetLimit assumes computePSetLimit will return a non-zero
value. The fix is simply to return the value from TableGen early for such
special cases.
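A minimal sketch of the shape of the fix (the names below are illustrative, not LLVM's actual code):

```cpp
#include <unordered_map>

// Illustrative stand-ins for the real queries (assumptions, not LLVM's API).
static unsigned getTableGenLimit(unsigned Idx) { return Idx == 0 ? 0 : 32; }
static unsigned computePSetLimit(unsigned Idx) { return 32 - Idx; } // pretend-expensive

static std::unordered_map<unsigned, unsigned> PSetLimitCache;

// Sketch of the fix: if TableGen says the limit is 0 (no allocatable
// registers, e.g. PowerPC's VRSAVERC), return it immediately; otherwise a
// "0 means uncached" cache would re-run computePSetLimit on every call.
unsigned getRegPressureSetLimit(unsigned Idx) {
  unsigned Static = getTableGenLimit(Idx);
  if (Static == 0)
    return Static; // early return for the special case

  unsigned &Cached = PSetLimitCache[Idx];
  if (Cached == 0)
    Cached = computePSetLimit(Idx); // cache the non-zero result
  return Cached;
}
```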
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D92907
This is an enhancement to LLVM Source-Based Code Coverage in clang to track how
many times individual branch-generating conditions are taken (evaluate to TRUE)
and not taken (evaluate to FALSE). Individual conditions may be combined
into larger boolean expressions using boolean logical operators. This
functionality is very similar to what GCOV supports, except that it is
closely anchored to the ASTs.
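For illustration (an assumed example, not taken from the patch), each leaf condition in a compound expression gets its own true/false counts:

```cpp
// Built with: clang -fprofile-instr-generate -fcoverage-mapping
// Branch coverage reports, per condition, how often "age >= 18" and
// "hasTicket" each evaluated to true and to false, even though they sit
// inside a single boolean expression.
bool admit(int age, bool hasTicket) {
  return age >= 18 && hasTicket; // two tracked conditions, one expression
}
```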
Differential Revision: https://reviews.llvm.org/D84467
Allow loop nests that contain empty basic blocks (with no loops) at
different levels to be considered perfect.
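For example (an assumed illustration), a nest like the following should still qualify as perfect even if the IR carries an empty block between the two loop headers:

```cpp
// An empty basic block (e.g. left behind by an earlier pass) may sit
// between the outer and inner loop in the IR; as long as it contains no
// loop of its own, the nest below should still count as perfect.
void zero(int **a, int n, int m) {
  for (int i = 0; i < n; ++i)
    // <a possibly-empty IR block may appear here>
    for (int j = 0; j < m; ++j)
      a[i][j] = 0;
}
```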
Reviewers: Meinersbur
Differential Revision: https://reviews.llvm.org/D93665
Creating in-loop reductions relies on IR references to map
IR values to VPValues after interleave group creation.
Make sure we re-add the updated member to the plan, so the look-ups
still work as expected.
This fixes a crash reported after D90562.
AVX512 has fast truncation ops, but if the truncation source is a
concatenation of subvectors then it's likely that we can use PACK more
efficiently.
This is only guaranteed to work for truncations to 128/256-bit vectors,
as PACK works across 128-bit sub-lanes. For now I've just disabled the
512-bit truncation cases, but we need to get them working eventually for
D61129.
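Roughly, at the intrinsics level (an assumed illustration of the pattern, not the patch's code), truncating two concatenated 256-bit halves can be done with an in-lane pack plus a fixup shuffle:

```cpp
#include <immintrin.h>

// Illustrative only: PACKUSDW packs within each 128-bit lane, so a 64-bit
// element permute restores element order afterwards. Note that PACK
// saturates, so it only matches a plain truncation when the inputs are
// already known to fit in 16 bits.
__m256i trunc_concat_i32_to_i16(__m256i lo, __m256i hi) {
  __m256i packed = _mm256_packus_epi32(lo, hi);
  return _mm256_permute4x64_epi64(packed, _MM_SHUFFLE(3, 1, 2, 0));
}
```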
Use dyn_cast_or_null<StoreInst> instead of isa_and_nonnull<StoreInst>, and use the StoreInst::getPointerOperand wrapper instead of a hardcoded Instruction::getOperand.
Looks cleaner and avoids a spurious clang static analyzer null dereference warning.
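A sketch of the resulting pattern (illustrative helper, not the exact diff):

```cpp
#include "llvm/IR/Instructions.h"
using namespace llvm;

// One checked cast carries the null-and-type check, so the static analyzer
// can see the pointer is non-null before it is dereferenced, and the typed
// getPointerOperand() wrapper replaces a hardcoded getOperand(1).
static Value *storedPointer(Instruction *I) {
  if (auto *SI = dyn_cast_or_null<StoreInst>(I))
    return SI->getPointerOperand();
  return nullptr;
}
```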
Convert v_fmac_legacy_f32 to v_fma_legacy_f32 if it is profitable to do
so, just like other mac instructions that are converted to their mad
equivalents.
Differential Revision: https://reviews.llvm.org/D94010
Support EH_SJLJ_LONGJMP, EH_SJLJ_SETJMP, and EH_SJLJ_SETUP_DISPATCH
for SjLj exception handling. NC++ uses SjLj exception handling, so
implement it first. Also add regression tests.
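These ISD nodes back the GCC-style builtins; a minimal example of the code they enable (assuming the usual __builtin_setjmp/__builtin_longjmp lowering):

```cpp
#include <cstdio>

// __builtin_setjmp/__builtin_longjmp use a five-word buffer and lower to
// the EH_SJLJ_SETJMP / EH_SJLJ_LONGJMP nodes implemented by this patch.
static void *buf[5];

int main() {
  if (__builtin_setjmp(buf) == 0) { // EH_SJLJ_SETJMP
    __builtin_longjmp(buf, 1);      // EH_SJLJ_LONGJMP (value must be 1)
  }
  std::puts("resumed after longjmp");
  return 0;
}
```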
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D94071
CTLZ and CTPOP are lowered to CLZ and CNT instructions respectively.
CTTZ is not a native SVE operation but is instead lowered to:
CTTZ(V) => CTLZ(BITREVERSE(V))
In the case of fixed-length support using SVE, we also lower CTTZ
operating on NEON-sized vectors because of its reliance on
BITREVERSE, which is also lowered to SVE instructions at these lengths.
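A scalar illustration of the identity (plain C++, not SVE code): reversing the bits turns trailing zeros into leading zeros.

```cpp
#include <cassert>
#include <cstdint>

static uint32_t bitreverse32(uint32_t V) {
  uint32_t R = 0;
  for (int I = 0; I < 32; ++I)
    R |= ((V >> I) & 1u) << (31 - I);
  return R;
}

int main() {
  uint32_t V = 0x58; // 0b1011000: three trailing zeros
  // cttz(V) == ctlz(bitreverse(V))
  assert(__builtin_ctz(V) == __builtin_clz(bitreverse32(V)));
  return 0;
}
```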
Differential Revision: https://reviews.llvm.org/D93607
We're immediately dereferencing the casted pointer, so use cast<> which will assert instead of dyn_cast<> which can return null.
Fixes static analyzer warning.
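The pattern, sketched with an illustrative helper (not the actual file):

```cpp
#include "llvm/IR/Instructions.h"
using namespace llvm;

// The result is dereferenced unconditionally, so cast<> documents the type
// assumption and asserts in debug builds, whereas a failed dyn_cast<>
// would return null and crash at the dereference.
static Value *loadedPointer(Value *V) {
  auto *LI = cast<LoadInst>(V); // was dyn_cast<LoadInst> with no null check
  return LI->getPointerOperand();
}
```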
Loop strength reduction tries to recover debug variable values by looking
for simple offsets from PHI values. In really extreme conditions there may
be an offset used that won't fit in an int64_t, hitting an APInt assertion.
This patch adds a regression test and adjusts the equivalent value
collecting code to filter out any values where the offset can't be
represented by an int64_t. This means that for very large integers with
very large offsets, the variable location will become undef, which is the
same behaviour as before 2a6782bb9f1 / D87494.
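A sketch of the filtering idea (hypothetical helper name, not the exact code):

```cpp
#include "llvm/ADT/APInt.h"
using namespace llvm;

// Only keep candidates whose offset from the PHI is representable as an
// int64_t; otherwise the later signed-value access on the APInt would
// assert. Rejected values leave the variable location as undef.
static bool offsetFitsInt64(const APInt &Offset) {
  return Offset.getMinSignedBits() <= 64;
}
```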
Differential Revision: https://reviews.llvm.org/D94016
For wasm-ld table linking work to proceed, object files should indicate
if they use an indirect function table. In the future this will be done
by the usual symbols and relocations mechanism, but until that support
lands in the linker, the presence of an `__indirect_function_table` in
the object file's import section shows that the object file needs an
indirect function table.
Prior to https://reviews.llvm.org/D91637, this condition was met by all
object files residualizing an `__indirect_function_table` import.
Since https://reviews.llvm.org/D91637, the intention has been that only
those object files needing an indirect function table would have the
`__indirect_function_table` import. However, we missed the case of
object files which use the table via `call_indirect` but which
themselves do not declare any indirect functions.
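For example (assumed C source compiled for wasm32), a file like this performs a `call_indirect` without declaring any indirect functions of its own:

```cpp
// No function address is taken in this file, so it declares no indirect
// functions, yet the call through the pointer still lowers to a
// call_indirect that consumes __indirect_function_table.
int dispatch(int (*fn)(int), int arg) {
  return fn(arg);
}
```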
This changeset makes it so that when we lower a call to `call_indirect`,
we ensure that a `__indirect_function_table` symbol is present and that
it will be propagated to the linker.
A followup patch will revise this mechanism to make an explicit link
between `call_indirect` and its associated indirect function table; see
https://reviews.llvm.org/D90948.
Differential Revision: https://reviews.llvm.org/D92840
We're immediately dereferencing the casted pointer, so use cast<> which will assert instead of dyn_cast<> which can return null.
Fixes static analyzer warning.
We're immediately dereferencing the casted pointer, so use cast<> which will assert instead of dyn_cast<> which can return null.
Fixes static analyzer warning.
In order to support SjLj exception handling, implement llvm.eh.sjlj.lsda
first. Also add a regression test.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D93811
TableGen would pick the largest RC for constraining the operands, which
could potentially be an unallocatable RC. This patch removes selection
of unallocatable RCs.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D93945
This patch updates the llvm module map to reflect changes made in
`5efc71e119d4eba235209d262e7d171361a0b9be` and fixes the module builds
(`-DLLVM_ENABLE_MODULES=On`).
Differential Revision: https://reviews.llvm.org/D94057
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
This function no longer does anything useful. It probably did something
originally, but later changes removed that functionality and didn't clean
up this function.
The checks are already done in the callers as well.
Differential Revision: https://reviews.llvm.org/D94055
This piece of code tries to use splat+shift to lower a build_vector with
a repeating bit pattern. The immediate field of a vector splat is only 5
bits (-16 to 15), so it iterates over those values one by one to find
which one shifts/rotates to the number in the build_vector.
This patch removes the code that tries to match the constant with an
algebraic right-shift, because that's meaningless: an algebraic
right-shift of any negative number won't produce a result smaller than
itself. Besides, the code (int)((unsigned)i >> j) performs a logical
shift-right in C.
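A scalar demonstration of both points (illustrative only):

```cpp
#include <cstdio>

int main() {
  int i = -16, j = 1;
  // Algebraic (arithmetic) shift-right moves a negative value toward zero,
  // so it can never yield a result smaller than the original value.
  std::printf("%d\n", i >> j);                  // -8
  // The code in question, (int)((unsigned)i >> j), is a logical shift:
  // zeros are shifted in, so a negative input becomes a large positive one.
  std::printf("%d\n", (int)((unsigned)i >> j)); // 2147483640
  return 0;
}
```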
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D93937
The current implementation assumes that each MachineConstantPoolValue
takes up sizeof(MachineConstantPoolValue::Ty) bytes. For PowerPC, we want
to lump all the constants with the same type into one
MachineConstantPoolValue to save the cost of calculating the TOC entry
for each constant. So we need to extend MachineConstantPoolValue in a way
that breaks this assumption.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D89108
The current update_llc_test_checks.py cannot generate checks properly for
AIX (powerpc64-ibm-aix-xcoff). The assembly generated is a little
different from Linux's, so I use the begin-function comment here to
capture the function name.
Reviewed By: MaskRay, steven.zhang
Differential Revision: https://reviews.llvm.org/D93676
... which requires not deleting an edge that just got deleted,
because we could be dealing with a block that didn't go through
ConstantFoldTerminator() yet, and thus has a degenerate cond br
with matching true/false destinations.
Notably, this doesn't switch *every* case; the remaining cases
don't actually pass sanity checks in non-permissive mode,
and therefore require further analysis.
Note that SimplifyCFG still does not preserve DomTree by default,
so this is effectively an NFC change.
While here, rename the inaccurate getRecurrenceBinOp()
because that was also used to get CmpInst opcodes.
The recurrence/reduction kind should always refer to the
expected opcode for a reduction. SLP appears to be the
only direct caller of createSimpleTargetReduction(), and
that calling code ideally should not be carrying around
both an opcode and a reduction kind.
This should allow us to generalize reduction matching to
use intrinsics instead of only binops.