We use extract_subvector and insert_subvector to "cast" between
fixed-length and scalable vectors. This patch adds custom C++-based
ISel for the following cases:
fixed_vector = ISD::EXTRACT_SUBVECTOR scalable_vector, 0
scalable_vector = ISD::INSERT_SUBVECTOR undef(scalable_vector), fixed_vector, 0
These result in either an EXTRACT_SUBREG/INSERT_SUBREG for NEON-sized
vectors or a COPY_TO_REGCLASS otherwise.
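Roughly, the NEON-sized extract case looks like the sketch below (the helper
name, the parameterised subregister index and the surrounding structure are
assumptions for illustration, not the actual AArch64ISelDAGToDAG code):
```
#include "llvm/CodeGen/SelectionDAG.h"
#include "llvm/CodeGen/TargetOpcodes.h"

using namespace llvm;

// A fixed 128-bit EXTRACT_SUBVECTOR at index 0 of a scalable vector becomes
// a plain subregister extract of the Z register; other sizes would instead
// use a COPY_TO_REGCLASS.
static SDNode *selectFixedExtractAtZero(SelectionDAG &DAG, SDNode *N,
                                        unsigned SubRegIdx) {
  EVT VT = N->getValueType(0);
  SDValue In = N->getOperand(0);
  uint64_t Idx = N->getConstantOperandVal(1);
  if (!VT.isFixedLengthVector() || !In.getValueType().isScalableVector() ||
      Idx != 0 || VT.getFixedSizeInBits() != 128)
    return nullptr; // not the NEON-sized case
  SDLoc DL(N);
  SDValue SubReg = DAG.getTargetConstant(SubRegIdx, DL, MVT::i32);
  return DAG.getMachineNode(TargetOpcode::EXTRACT_SUBREG, DL, VT, In, SubReg);
}
```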
Differential Revision: https://reviews.llvm.org/D82871
For the GetElementPtr case in function
AddressingModeMatcher::matchOperationAddr
I've changed the code to use the TypeSize class instead of relying
upon the implicit conversion to a uint64_t. As part of this we now
check for scalable types and, if we encounter one, bail out for
now, as the subsequent optimisations don't currently support them.
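As a rough illustration of the TypeSize usage (the helper name is made up;
the real change lives inside matchOperationAddr):
```
#include "llvm/IR/DataLayout.h"
#include "llvm/IR/Type.h"
#include "llvm/Support/TypeSize.h"
#include <optional>

using namespace llvm;

// Query the allocation size as a TypeSize and bail out for scalable types
// instead of letting it implicitly convert to uint64_t.
static std::optional<uint64_t> getFixedAllocSize(const DataLayout &DL,
                                                 Type *Ty) {
  TypeSize TS = DL.getTypeAllocSize(Ty);
  if (TS.isScalable())
    return std::nullopt; // later addressing-mode folds assume a fixed offset
  return TS.getFixedValue();
}
```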
This change fixes up all warnings in the following tests:
llvm/test/CodeGen/AArch64/sve-ld1-addressing-mode-reg-imm.ll
llvm/test/CodeGen/AArch64/sve-st1-addressing-mode-reg-imm.ll
Differential Revision: https://reviews.llvm.org/D83124
Summary:
When splitting a store of a scalable type, the new address is
calculated in SplitVecOp_STORE using a vscale and an add instruction.
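A hedged sketch of that address computation (helper and variable names are
assumed; the in-tree logic sits in SplitVecOp_STORE and its pointer-increment
helper):
```
#include "llvm/CodeGen/SelectionDAG.h"

using namespace llvm;

// Advance the base pointer past the low half of a split scalable store using
// VSCALE plus ADD rather than a fixed byte offset.
static SDValue advancePtrByLoHalf(SelectionDAG &DAG, const SDLoc &DL,
                                  SDValue Ptr, EVT LoMemVT) {
  EVT PtrVT = Ptr.getValueType();
  TypeSize LoBytes = LoMemVT.getStoreSize(); // e.g. "vscale x 16" bytes
  SDValue Inc = DAG.getVScale(
      DL, PtrVT, APInt(PtrVT.getFixedSizeInBits(), LoBytes.getKnownMinValue()));
  return DAG.getNode(ISD::ADD, DL, PtrVT, Ptr, Inc);
}
```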
Reviewers: sdesmalen, efriedma, david-arm
Reviewed By: david-arm
Subscribers: tschuett, hiraditya, psnobl, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83041
Summary:
When splitting a load of a scalable type, the new address is
calculated in SplitVecRes_LOAD using a vscale and an add instruction.
This patch also adds a DAG combiner fold to visitADD for vscale:
- Fold (add (vscale(C0)), (vscale(C1))) to (add (vscale(C0 + C1)))
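Written as a standalone helper, the fold looks roughly like this (the in-tree
version lives in DAGCombiner::visitADD; the names here are assumed):
```
#include "llvm/CodeGen/SelectionDAG.h"

using namespace llvm;

// (add (vscale C0), (vscale C1)) --> (vscale (C0 + C1))
static SDValue foldAddOfVScales(SelectionDAG &DAG, const SDLoc &DL, EVT VT,
                                SDValue N0, SDValue N1) {
  if (N0.getOpcode() != ISD::VSCALE || N1.getOpcode() != ISD::VSCALE)
    return SDValue();
  const APInt &C0 = N0->getConstantOperandAPInt(0);
  const APInt &C1 = N1->getConstantOperandAPInt(0);
  return DAG.getVScale(DL, VT, C0 + C1);
}
```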
Reviewers: sdesmalen, efriedma, david-arm
Reviewed By: david-arm
Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D82792
There are now more SVE tests in LLVM and Clang that do not
emit warnings related to invalid use of EVT::getVectorNumElements()
and VectorType::getNumElements(). For these tests I have added
additional checks that there are no warnings in order to prevent
any future regressions.
Differential Revision: https://reviews.llvm.org/D82943
In an earlier commit 584d0d5c1749c13625a5d322178ccb4121eea610 I
added functionality to allow AArch64 CodeGen support for falling
back to DAG ISel when Global ISel encounters scalable vector
types. However, it seems that we were not falling back early
enough as llvm::getLLTForType was still being invoked for scalable
vector types.
I've added a new fallback function to the call lowering class in
order to catch this problem early enough, rather than wait for
lowerFormalArguments to reject scalable vector types.
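The idea is roughly the following (the exact hook name and signature are
assumptions):
```
#include "llvm/ADT/STLExtras.h"
#include "llvm/IR/DerivedTypes.h"
#include "llvm/IR/Function.h"

using namespace llvm;

// Bail out of GlobalISel before any LLTs are built if the function takes or
// returns scalable vectors.
static bool shouldFallBackToDAGISel(const Function &F) {
  if (isa<ScalableVectorType>(F.getReturnType()))
    return true;
  return any_of(F.args(), [](const Argument &A) {
    return isa<ScalableVectorType>(A.getType());
  });
}
```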
Differential Revision: https://reviews.llvm.org/D82524
This patch fixes all remaining warnings in:
llvm/test/CodeGen/AArch64/sve-trunc.ll
llvm/test/CodeGen/AArch64/sve-vector-splat.ll
I hit some warnings related to getCopyToPartsVector. I fixed two
issues:
1. In widenVectorToPartType() we assumed that we'd always be
using BUILD_VECTOR nodes to expand from one vector type to another,
which is incorrect for scalable vector types. I've fixed this for now
by simply bailing out immediately for scalable vectors.
2. In getCopyToPartsVector() I've changed the code to compare
the element counts of different types.
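A hedged sketch of both fixes combined (variable and helper names assumed;
the real code is inside getCopyToPartsVector/widenVectorToPartType):
```
#include "llvm/CodeGen/ValueTypes.h"

using namespace llvm;

// (1) BUILD_VECTOR-based widening is only valid for fixed-length vectors.
// (2) Compare full element counts so a scalable count never silently matches
//     a fixed one.
static bool canWidenByPaddingElements(EVT ValueVT, EVT PartVT) {
  if (ValueVT.isScalableVector() || PartVT.isScalableVector())
    return false;
  return PartVT.getVectorElementCount().getKnownMinValue() >=
         ValueVT.getVectorElementCount().getKnownMinValue();
}
```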
Differential Revision: https://reviews.llvm.org/D83028
AArch64ExpandPseudo::expand_DestructiveOp contains an assert to
ensure the destructive operand's register is unique. However,
this is only required when pseudo expansion emits a movprfx.
A simple example when a movprfx is not required is
Z0 = FADD_ZPZZ_UNDEF_S P0, Z0, Z0
which expands to an unprefixed FADD_ZPmZ_S instruction.
This patch moves the assert to the places where a movprfx is emitted.
Differential Revision: https://reviews.llvm.org/D83029
Summary:
Teach LLVM to recognize the above pattern, which is usually a
transformation of (a + b + 1) >> 1, where the operands are either
signed or unsigned types.
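For reference, this is the kind of scalar source that produces such a pattern
(illustrative only; the operands are widened here so the +1 cannot overflow):
```
#include <cstdint>

// Rounding halving add of two unsigned bytes: (a + b + 1) >> 1.
uint8_t rhadd_u8(uint8_t a, uint8_t b) {
  return static_cast<uint8_t>((uint16_t(a) + uint16_t(b) + 1u) >> 1);
}
```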
Subscribers: kristof.beyls, hiraditya, danielkiss, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D82669
This patch upstreams support for the Arm-v8 Cortex-A77
processor for AArch64 and ARM.
In detail:
- Adding cortex-a77 as a cpu option for aarch64 and arm targets in clang
- Cortex-A77 CPU name and ProcessorModel in llvm
Details of the CPU can be found here:
https://www.arm.com/products/silicon-ip-cpu/cortex-a/cortex-a77
and a similar submission to GCC can be found here:
e0664b7a63
The following people contributed to this patch:
- Luke Geeson
- Mikhail Maltsev
Reviewers: t.p.northover, dmgreen, ostannard, SjoerdMeijer
Reviewed By: dmgreen
Subscribers: dmgreen, kristof.beyls, hiraditya, danielkiss, cfe-commits,
llvm-commits, miyuki
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D82887
SelectionDAGBuilder converts logic-of-compares into multiple branches based
on a boolean TLI setting in isJumpExpensive(). But that probably never
considered the pattern of extracted bools from a vector compare - it seems
unlikely that we would want to turn vector logic into control-flow.
The motivating x86 reduction case is shown in PR44565:
https://bugs.llvm.org/show_bug.cgi?id=44565
...and that test shows the expected improvement from using pmovmsk codegen.
For AArch64, I modified the test to include an extra op because the simpler
test gets transformed by a codegen invocation of SimplifyCFG.
Differential Revision: https://reviews.llvm.org/D82602
This patch puts the _ZERO pseudos and corresponding patterns
under the predicate 'UseExperimentalZeroingPseudos', so that they
can be enabled/disabled through compile flags.
This is done because the zeroing pseudos use MOVPRFX to do merging of
the inactive lanes, but it depends on the uarch whether this operation
is actually merged with the destructive operation. If not, it may be
more profitable to use a SELECT and to give the compiler the freedom to
schedule these instructions as normal, rather than keeping them bundled
together. Additionally, this feature is not yet fully implemented and
there are still known bugs (see D80410) that need to be resolved before
the 'experimental' can be dropped from the name.
Reviewers: paulwalker-arm, cameron.mcinally, efriedma
Reviewed By: paulwalker-arm
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D82780
I have added CHECK lines to the following tests:
llvm/test/CodeGen/AArch64/sve-breakdown-scalable-vectortype.ll
llvm/test/CodeGen/AArch64/sve-calling-convention-tuple-types.ll
llvm/test/CodeGen/AArch64/sve-intrinsics-create-tuple.ll
llvm/test/CodeGen/AArch64/sve-intrinsics-loads.ll
since they are now free of warnings related to invalid use of
EVT::getVectorNumElements() and VectorType::getNumElements().
Differential Revision: https://reviews.llvm.org/D82957
There was a rogue 'assert' in AArch64ISelLowering for the tuple.get intrinsics
that shouldn't really have been there (I suspect this was a remnant from when
we expected the wider vector always to have come from a vector CONCAT).
When I tried to create a more minimal reproducer, I found a bug in
DAGCombiner where it drops the scalable flag when trying to fold:
extract_subv (bitcast X), Index --> bitcast (extract_subv X, Index')
This patch fixes both issues.
Reviewers: david-arm, efriedma, spatel
Reviewed By: efriedma
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D82910
In visitSCALAR_TO_VECTOR we try to optimise cases such as:
scalar_to_vector (extract_vector_elt %x)
into vector shuffles of %x. However, it led to numerous warnings
when %x is a scalable vector type, so for now I've changed the
code to only perform the combination on fixed length vectors.
Although we probably could change the code to work with scalable
vectors in certain cases, without a proper profit analysis it
doesn't seem worth it at the moment.
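The guard amounts to something like the following (the helper name is assumed;
the real check sits inside visitSCALAR_TO_VECTOR):
```
#include "llvm/CodeGen/SelectionDAG.h"

using namespace llvm;

// Only try to turn scalar_to_vector (extract_vector_elt %x) into a shuffle
// when %x is a fixed-length vector.
static bool canFoldExtractIntoShuffle(SDValue Scalar) {
  if (Scalar.getOpcode() != ISD::EXTRACT_VECTOR_ELT)
    return false;
  return !Scalar.getOperand(0).getValueType().isScalableVector();
}
```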
This change fixes up one of the warnings in:
llvm/test/CodeGen/AArch64/sve-merging-stores.ll
I've also added a simplified version of the same test to:
llvm/test/CodeGen/AArch64/sve-fp.ll
which already has checks for no warnings.
Differential Revision: https://reviews.llvm.org/D82872
Before this instruction supported output values, it fit fairly
naturally as a terminator. However, being a terminator while also
supporting outputs causes some trouble, as the physreg->vreg COPY
operations cannot be in the same block.
Modeling it as a non-terminator allows it to be handled the same way
as invoke is handled already.
Most of the changes here were created by auditing all the existing
users of MachineBasicBlock::isEHPad() and
MachineBasicBlock::hasEHPadSuccessor(), and adding calls to
isInlineAsmBrIndirectTarget or mayHaveInlineAsmBr, as appropriate.
Reviewed By: nickdesaulniers, void
Differential Revision: https://reviews.llvm.org/D79794
This prevents the outlined functions from pulling in a lot of unnecessary code
in our downstream libraries/linker, which stops outlining from making code size
worse in C++ code built without exceptions.
Differential Revision: https://reviews.llvm.org/D57254
We currently lower SDIV to SDIV_MERGE_OP1. This forces the value
for inactive lanes in a way that can hamper register allocation;
however, the lowering has no requirement for inactive lanes.
Instead, this patch replaces SDIV_MERGE_OP1 with SDIV_PRED, thus
freeing the register allocator. Once that is done, the only user of
SDIV_MERGE_OP1 is intrinsic lowering, so I've removed the node
and now perform ISel on the intrinsic directly. This also allows
us to implement MOVPRFX-based zeroing in the same manner as SUB.
This patch also renames UDIV_MERGE_OP1 and [F]ADD_MERGE_OP1 for
the same reason, but in the ADD cases the ISel code is already
as required.
Differential Revision: https://reviews.llvm.org/D82783
When trying to reduce a BUILD_VECTOR to a SHUFFLE_VECTOR it's
important that we carefully check the vector types that led to
that BUILD_VECTOR. In the test I have attached to this commit
there is a case where the results of two SVE faddv instructions
are being stored to consecutive memory locations. With my fix,
as part of merging those stores we discover that each BUILD_VECTOR
element came from an extract of an SVE vector element and
therefore bail out.
Differential Revision: https://reviews.llvm.org/D82564
Summary:
performPostLD1Combine will introduce either a LD1LANEpost
or LD1DUPpost node, which will cause selection failure if the
return type is a scalable vector.
Reviewers: sdesmalen, c-rhodes, efriedma
Reviewed By: efriedma
Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, danielkiss, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D82670
The translation of cmpxchg added by
9481399c0fd2c198c81b92636c0dcff7d4c41df2 specifically skipped weak
cmpxchg due to not understanding the meaning. Weak cmpxchg was added
in 420a216817def01816186910a2e35885c9201951. As explained in the
commit message, the weak mode is implicit in how
ATOMIC_CMP_SWAP_WITH_SUCCESS is lowered. If it's expanded to a regular
ATOMIC_CMP_SWAP, it's replaced with a strong cmpxchg.
This handling seems weird to me, but this was already following the
DAG behavior. I would expect the strong IR instruction to not have the
boolean output. Failing that, I might expect the IRTranslator to emit
ATOMIC_CMP_SWAP and a constant for the boolean.
The original patch was reverted in
ff5ccf258e
because the C tests had accidentally been omitted.
This patch is an NFC reland of https://reviews.llvm.org/D82501, together
with the SVE ACLE tests for the C intrinsics of svreinterpret for brain
float types.
This reverts commit a15722c5ce4759c12960fe434ee6bd8aac70bb16.
The commit has to be reverted because I accidentally submitted
https://reviews.llvm.org/D82501 without the C tests that were added in
an earlier version of the patch.
Summary:
Permutation and selection bfloat16 intrinsic patterns should be guarded
on the feature flag `+bf16`. Missed in D82182 and D80850.
Reviewers: sdesmalen, fpetrogalli, kmclaughlin, efriedma
Reviewed By: fpetrogalli
Differential Revision: https://reviews.llvm.org/D82492
The complex pattern for extended shift offsets only allows sxtw as the extend,
not sxth. Our equivalent function was not rejecting SXTH, so we
were miscompiling. This was exposed by D81992.
Given this:
```
%x:_(<n x sK>) = G_BUILD_VECTOR %lane, ...
...
%y:_(<n x sK>) = G_SHUFFLE_VECTOR %x(<n x sK>), %foo, shufflemask(0, 0, ...)
```
We can produce:
```
%y:_(<n x sK>) = G_DUP %lane(sK)
```
Doesn't seem to be too common, but AArch64ISelLowering attempts to do this
before trying to produce a DUPLANE. Might as well port it.
Also make it so that when the splat has an undef mask, we try setting it to
0. SDAG does this, and it makes sure that when we get the build vector operand,
we actually get a source operand.
Differential Revision: https://reviews.llvm.org/D81979
There's more smarts in AArch64ISelLowering that we don't have yet, but this
change incrementally improves some of the more common patterns. I think future
iterations will want to use some combination of PostLegalizerCombiner and the
selector to catch the other cases.
Differential Revision: https://reviews.llvm.org/D82340
This function is deceptive at best: it doesn't return what you'd expect.
If you have an arbitrary GlobalValue and you want to determine the
alignment of that pointer, Value::getPointerAlignment() returns the
correct value. If you want the actual declared alignment of a function
or variable, GlobalObject::getAlignment() returns that.
This patch switches all the users of GlobalValue::getAlignment to an
appropriate alternative.
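As a hedged illustration of the two alternatives (modern spellings;
GlobalObject::getAlign() is the MaybeAlign-returning form of the
declared-alignment query):
```
#include "llvm/IR/GlobalObject.h"
#include "llvm/IR/Module.h"
#include "llvm/Support/Alignment.h"

using namespace llvm;

// Alignment of the pointer value itself versus the explicitly declared
// alignment of the object, queried with the appropriate APIs.
static void queryAlignments(const GlobalObject &GO, const Module &M) {
  Align PtrAlign = GO.getPointerAlignment(M.getDataLayout());
  MaybeAlign DeclAlign = GO.getAlign();
  (void)PtrAlign;
  (void)DeclAlign;
}
```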
Differential Revision: https://reviews.llvm.org/D80368
Implement them on top of sdiv/udiv, similar to what we do for integer
types.
Potential future work: implementing i8/i16 srem/urem, optimizations for
constant divisors, optimizing the mul+sub to mls.
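The scalar form of the expansion, for illustration:
```
#include <cstdint>

// rem computed from the corresponding division: a - (a / b) * b.
// The mul + sub pair is what could later fold into an mls.
int64_t srem_via_sdiv(int64_t a, int64_t b) {
  int64_t q = a / b; // sdiv
  return a - q * b;  // mul + sub
}
```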
Differential Revision: https://reviews.llvm.org/D81511
This patch contains:
- Support in LLVM CodeGen for bfloat16 types for ld2/3/4 and st2/3/4.
- New bfloat16 ACLE builtins for svld(2|3|4)[_vnum] and svst(2|3|4)[_vnum]
Reviewers: stuij, efriedma, c-rhodes, fpetrogalli
Reviewed By: fpetrogalli
Tags: #clang, #lldb, #llvm
Differential Revision: https://reviews.llvm.org/D82187
Summary:
This patch adds base support for code generating fixed length
vector operations targeting a known SVE vector length. To achieve
this we lower fixed length vector operations to equivalent scalable
vector operations, whereby SVE predication is used to limit the
elements processed to those present within the fixed length vector.
Specifically this patch implements load and store operations, which
get lowered to their masked counterparts as follows:
V = load(Addr) =>
V = extract_fixed_vector(masked_load(make_pred(V.NumElts), Addr))
store(V, Addr) =>
masked_store(insert_fixed_vector(V), make_pred(V.NumElts), Addr)
Reviewers: rengolin, efriedma
Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80385