Currently we infer whether the flat-scratch-init kernel input should
be enabled based on calls. Move this handling, so we can decide if the
full set of ABI inputs is needed in kernels. Ideally we would have an
analysis of some sort, rather than the function attributes.
This patch enables exception handling in code added to LLJIT on Darwin by
adding an orc::EHFrameRegistrationPlugin instance to the ObjectLinkingLayer
(which is currently used on Darwin only).
These may be accessed from multiple threads if concurrent materialization is
enabled in ORC.
Testcase coming in a follow-up patch that enables eh-frame registration for
LLJIT.
The patch removes late endcf handling and only leaves the
related portion with redundant exec mask copy elimination.
Differential Revision: https://reviews.llvm.org/D76095
Summary:
Support ConstantInt::get() and Constant::getAllOnesValue() for scalable
vector type, this requires ConstantVector::getSplat() to take in 'ElementCount',
instead of 'unsigned' number of element count.
This change is needed for D73753.
Reviewers: sdesmalen, efriedma, apazos, spatel, huntergr, willlovett
Reviewed By: efriedma
Subscribers: tschuett, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74386
This patch allows ISD::FSHR(i32) patterns to lower to ALIGNBIT instructions.
This improves test coverage of ISD::FSHR matching - x86 has both FSHL/FSHR instructions and we prefer FSHL by default.
Differential Revision: https://reviews.llvm.org/D76070
Summary:
Using the default DAG.UnrollVectorOp on v16i8 and v8i16 vectors
results in i8 or i16 nodes being inserted into the SelectionDAG. Since
those are illegal types, this causes a legalization assertion failure
for some code patterns, as uncovered by PR45178. This change unrolls
shifts manually to avoid this issue by adding and using a new optional
EVT argument to DAG.ExtractVectorElements to control the type of the
extract_element nodes.
Reviewers: aheejin, dschuff
Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, zzheng, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D76043
This is to replace the optimization from the SIOptimizeExecMaskingPreRA.
We have less opportunities in the control flow lowering because many
VGPR copies are still in place and will be removed later, but we know
for sure an instruction is SI_END_CF and not just an arbitrary S_OR_B64
with EXEC.
The subsequent change needs to convert s_and_saveexec into s_and and
address new TODO lines in tests, then code block guarded by the
-amdgpu-remove-redundant-endcf option in the pre-RA exec mask optimizer
will be removed.
Differential Revision: https://reviews.llvm.org/D76033
Use GraphTraits in the implementation of the GraphDiff's own GraphTraits
so GraphDiff can be used across all graph types that provide
GraphTraits.
Also use partial template specializations to make the traits a bit more
compact.
Reviewers: asbirlea
Differential Revision: https://reviews.llvm.org/D76034
Summary:
On 32-bit PPC target[AIX and BE], when we convert an `i64` to `f32`, a `setcc` operand expansion is needed. The expansion will set the result type of expanded `setcc` operation based on if the subtarget use CRBits or not. If the subtarget does use the CRBits, like AIX and BE, then it will set the result type to `i1`, leading to an inconsistency with original `setcc` result type[i32].
And the reason why it crashed underneath is because we don't set result type of setcc consistent in those two places.
This patch fixes this problem by setting original setcc opnode result type also with `getSetCCResultType` interface.
Reviewers: sfertile, cebowleratibm, hubert.reinterpretcast, Xiangling_L
Reviewed By: sfertile
Subscribers: wuzish, nemanjai, hiraditya, kbarton, jsji, shchenz, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75702
Summary:
De-duplicate isel instruction classes by using RRIm for RRINDm. The latter
becomes obsolete.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D76063
Summary:
This patch adds the following intrinsics for non-temporal gather loads
and scatter stores:
* aarch64_sve_ldnt1_gather_index
* aarch64_sve_stnt1_scatter_index
These intrinsics implement the "scalar + vector of indices" addressing
mode.
As opposed to regular and first-faulting gathers/scatters, there's no
instruction that would take indices and then scale them. Instead, the
indices for non-temporal gathers/scatters are scaled before the
intrinsics are lowered to `ldnt1` instructions.
The new ISD nodes, GLDNT1_INDEX and SSTNT1_INDEX, are only used as
placeholders so that we can easily identify the cases implemented in
this patch in performGatherLoadCombine and performScatterStoreCombined.
Once encountered, they are replaced with:
* GLDNT1_INDEX -> SPLAT_VECTOR + SHL + GLDNT1
* SSTNT1_INDEX -> SPLAT_VECTOR + SHL + SSTNT1
The patterns for lowering ISD::SHL for scalable vectors (required by
this patch) were missing, so these are added too.
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D75601
This is part of the IR sibling for:
D75576
(I'm splitting part of the transform as a separate commit
to reduce risk. I don't know of any bugs that might be
exposed by this improved folding, but it's hard to see
those in advance...)
Lets us remove another SLM proc family flag usage.
This is NFC, but we should probably check whether atom/glm/knl? should be using this flag as well...
This patch switches SCCP to use ValueLatticeElement for lattice values,
instead of the local LatticeVal, as first step to enable integer range support.
This patch does not make use of constant ranges for additional operations
and the only difference for now is that integer constants are represented by
single element ranges. To preserve the existing behavior, the following helpers
are used
* isConstant(LV): returns true when LV is either a constant or a constant range with a single element. This should return true in the same cases where LV.isConstant() returned true previously.
* getConstant(LV): returns a constant if LV is either a constant or a constant range with a single element. This should return a constant in the same cases as LV.getConstant() previously.
* getConstantInt(LV): same as getConstant, but additionally casted to ConstantInt.
Reviewers: davide, efriedma, mssimpso
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D60582
The initialization order was not correct. These bugs were discovered by
valgrind. They appear to work fine in practice but this patch should
unblock switching the AVR backend on by default as now a standard AVR
llc invocation runs without memory errors.
The AVRISelLowering constructor would run before the subtarget boolean
fields were initialized to false. Now, the initialization order is
correct.
Summary:
This adds the ACLE intrinsic family for the VFMA and VFMS
instructions, which perform fused multiply-add on vectors of floats.
I've represented the unpredicated versions in IR using the cross-
platform `@llvm.fma` IR intrinsic. We already had isel rules to
convert one of those into a vector VFMA in the simplest possible way;
but we didn't have rules to detect a negated argument and turn it into
VFMS, or rules to detect a splat argument and turn it into one of the
two vector/scalar forms of the instruction. Now we have all of those.
The predicated form uses a target-specific intrinsic as usual, but
I've stuck to just one, for a predicated FMA. The subtraction and
splat versions are code-generated by passing an fneg or a splat as one
of its operands, the same way as the unpredicated version.
In arm_mve_defs.h, I've had to introduce a tiny extra piece of
infrastructure: a record `id` for use in codegen dags which implements
the identity function. (Just because you can't declare a Tablegen
value of type dag which is //only// a `$varname`: you have to wrap it
in something. Now I can write `(id $varname)` to get the same effect.)
Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard
Reviewed By: dmgreen
Subscribers: kristof.beyls, hiraditya, danielkiss, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D75998
Found by the LLVM MemorySanitizer tests when switching AVR to a default
backend.
ELFArch must be initialized before the call to
initializeSubtargetDependencies().
The uninitialized read would occur deep within TableGen'd code.
Summary:
This patch replaces incorrectt assert with a check. Previously it asserts that
if SCEV cannot prove `isKnownPredicate(A != B)`, then it should be able to prove
`isKnownPredicate(A == B)`.
Both these fact may be not provable. It is shown in the provided test:
Could not prove: `{-294,+,-2}<%bb1> != 0`
Asserting: `{-294,+,-2}<%bb1> == 0`
Obviously, this SCEV is not equal to zero, but 0 is in its range so we cannot
also prove that it is not zero.
Instead of assert, we should be checking the required conditions explicitly.
Reviewers: lebedev.ri, fhahn, sanjoy, fedor.sergeev
Reviewed By: lebedev.ri
Subscribers: hiraditya, zzheng, javed.absar, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D76050
Summary: This patch adds the basic utilities to deal with dropable uses. dropable uses are uses that we rather drop than prevent transformations, for now they are limited to uses in llvm.assume.
Reviewers: jdoerfert, sstefan1
Reviewed By: jdoerfert
Subscribers: uenoku, lebedev.ri, mgorny, hiraditya, dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73404
Summary:
Cost modelling strikes again.
In PR44668 <https://bugs.llvm.org/show_bug.cgi?id=44668> patch series,
i've made the same mistake of always using generic `getOperationCost()`
that i missed in reviewing D73480/D74495 which was later fixed
in 62dd44d76da9aa596fb199bda8b1e8768bb41033.
We should be using more specific hooks instead - `getCastInstrCost()`,
`getArithmeticInstrCost()`, `getCmpSelInstrCost()`.
Evidently, this does not have an effect on the existing testcases,
with unchanged default cost budget. But if it *does* have an effect
on some target, we'll have to segregate tests that use this function
per-target, much like we already do with other TTI-aware transform tests.
There's also an issue that @samparker has brought up in post-commit-review:
>>! In D73501#1905171, @samparker wrote:
> Hi,
> Did you get performance numbers for these patches? We track the performance
> of our (Arm) open source DSP library and the cost model fixes were generally
> a notable improvement, so many thanks for that! But the final patch
> for rewriting exit values has generally been bad, especially considering
> the gains from the modelling improvements. I need to look into it further,
> but on my current test case I'm seeing +30% increase in stack accesses
> with a similar decrease in performance.
> I'm just wondering if you observed any negative effects yourself?
I don't know if this addresses that, or we need D66450 for that.
Reviewers: samparker, spatel, mkazantsev, reames, wmi
Reviewed By: reames
Subscribers: kristof.beyls, hiraditya, danielkiss, llvm-commits, samparker
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75908
Summary: When narrowing a scalar G_EXTRACT where the destination lines up perfectly with a single result of the emitted G_UNMERGE_VALUES a COPY should be emitted instead of unconditionally trying to emit a G_MERGE_VALUES.
Reviewers: arsenm, dsanders
Reviewed By: arsenm
Subscribers: wdng, rovka, hiraditya, volkan, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75743
The note section type implies a specific format that this section does
not have thus tools like readelf fail here. Progbits has no format and
another pipeline compiler already sets the type to progbits.
Differential Revision: https://reviews.llvm.org/D75913
Summary:
Currently, a BoundaryAlign fragment may be inserted after the branch
that needs to be aligned to truncate the current fragment, this fragment is
unused at most of time. To avoid that, we can insert a new empty Data
fragment instead. Non-relaxable instruction is usually emitted into Data
fragment, so the inserted empty Data fragment will be reused at a high
possibility.
Reviewers: annita.zhang, reames, MaskRay, craig.topper, LuoYuanke, jyknight
Reviewed By: reames, LuoYuanke
Subscribers: llvm-commits, dexonsmith, hiraditya
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75438
Summary:
This patch makes the AVR backend an official target of LLVM, serving
as a request for comments for moving the AVR backend out of
experimental.
A future patch will move the LLVM AVR buildbot (llvm-avr-linux) from the
staging buildmaster to the production buildmaster, so error emails will
start to go out.
Summary of the backend
----------------------
- 16-bit little endian
- AsmParser based assembly parser
- uses the MC library for generating AVR ELFs
- most logic driven from standard TableGen-erated tables like other
backends
- passes all of the test suite under `check-all`, including generic
CodeGen and DebugInfo tests
- Used in two frontends
- Limited, but functional support for DebugInfo and LLVM DWARF dumping
- Binary compatible with AVR-GCC and avr-{libc,libgcc} for the most part
- Cannot lower 32-bit shifts due to a bug, can lower shifts larger or
smaller
- Supports assembly/MC for all the entire AVR ISA, generally generates poorly
optimized machine instructions, with most focus thus far on correctness
I've added reviewers and subscribers from previous patches where backends were made official,
and those who participated in the recent thread on llvm-dev, please add anybody I've missed.
The most recent discussion on this topic can be found in the llvm-dev thread [Moving the AVR backend out of experimental](https://lists.llvm.org/pipermail/llvm-dev/2020-February/139158.html)
Reviewers: chandlerc, lattner, rengolin, tstellar, arsenm, thakis, simoll, asb
Reviewed By: rengolin, thakis
Subscribers: CryZe, wdng, mgorny, aprantl, Jim, hans, aykevl, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75099