llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-01-31 20:51:52 +01:00

Author	SHA1	Message	Date
Stelios Ioannou	b1db1f3afb	[AArch64] Add abs intrinsic costs This patch adds cost-modelling for abs vector intrinsic. Change-Id: I89007971bfb15f5b4a02a2eadfd43018e9a73976	2021-02-25 09:31:52 +00:00
David Green	0f4c2ea58a	[TTI] Change getOperandsScalarizationOverhead to take Type args As a followup to D95291, getOperandsScalarizationOverhead was still using a VF as a vector factor if the arguments were scalar, and would assert on certain matrix intrinsics with differently sized vector arguments. This patch removes the VF arg, instead passing the Types through directly. This should allow it to more accurately compute the cost without having to guess at which operands will be vectorized, something difficult with more complex intrinsics. This adjusts one SVE test as it is now calling the wrong intrinsic vs veccall. Without invalid InstructCosts the cost of the scalarized intrinsic is too low. This should get fixed when the cost of scalarization is accounted for with scalable types. Differential Revision: https://reviews.llvm.org/D96287	2021-02-23 13:04:59 +00:00
David Green	8f3770b197	[ARM] Ensure types provided to getIntrinsicCost are valid It appears that pointer types were causing issues for the min/max cost code in getIntrinsicInstrCost. This makes sure that when matching icmp/select to a min/max, we only do that for normal int or float types.	2021-02-18 14:00:23 +00:00
David Green	20c6684726	[ARM] Add larger than legal ICmp costs A v8i32 compare will produce a v8i1 predicate, but during codegen the v8i32 will be split into two v4i32, potentially requiring two v4i1 predicates to be merged into a single v8i1. Because this merging of two v4i1's into a v8i1 is very expensive, we need to make the cost of the compare equally high. This patch adds the cost of that to ARMTTIImpl::getCmpSelInstrCost. Because we don't know whether the user of the predicate can be split, and the cost model is mostly pre-instruction, we may be pessimistic but that should only be for larger and legal types. This also adds min/max detection to the costmodel where it can be detected, to keep those in line with the cost of simple min/max instructions. Otherwise for the most part, costs that were already expensive have become more expensive. Differential Revision: https://reviews.llvm.org/D96692	2021-02-18 11:42:17 +00:00
David Green	2465ca1fd4	[ARM] MVE ICmp costing tests. NFC	2021-02-18 10:50:34 +00:00
Stanislav Mekhanoshin	f1c6dbc4d5	[AMDGPU] gfx90a support Differential Revision: https://reviews.llvm.org/D96906	2021-02-17 16:01:32 -08:00
David Green	e5e4857286	[ARM] Add MVE abs costs Similar to min/max, this increases the accuracy of abs intrinsics costs under MVE.	2021-02-17 14:21:09 +00:00
David Green	37557faad8	[ARM] MVE abs intrinsic costs. NFC	2021-02-17 13:54:17 +00:00
David Green	2d5347e7d3	[ARM] Add some basic Min/Max costs This adds basic MVE costs for SMIN/SMAX/UMIN/UMAX, as well as MINNUM and MAXNUM representing fmin and fmax. It tightens up the costs, not using a ICmp+Select cost. Differential Revision: https://reviews.llvm.org/D96603	2021-02-15 15:06:19 +00:00
Caroline Concatto	0b128753ba	[CostModel]Add cost model for experimental.vector.reverse This patch uses the function getShuffleCost with SK_Reverse to compute the cost for experimental.vector.reverse. For scalable vector type, it adds a table will the legal types on AArch64TTIImpl::getShuffleCost to not assert in BasicTTIImpl::getShuffleCost, and for fixed vector, it relies on the existing cost model in BasicTTIImpl. Depends on D94883 Differential Revision: https://reviews.llvm.org/D95603	2021-02-15 14:23:57 +00:00
David Green	0472de536c	[ARM] Fix duplicate fdiv tests, changing them to frem. NFC	2021-02-13 15:16:11 +00:00
David Green	2f0db07412	[ARM] Extra vector shuffle tests of various kinds. NFC	2021-02-13 15:03:10 +00:00
David Green	89d0773885	[ARM] MVE min/max cost tests. NFC	2021-02-13 11:12:12 +00:00
Sander de Smalen	4b43882cd0	[BasicTTIImpl] Fix getCastInstrCost for scalable vectors by querying for ElementCount. This fixes an overly restrictive assumption that the vector is a FixedVectorType, in code that tries to calculate the cost of a cast operation when splitting a too-wide vector. The algorithm works the same for scalable vectors, so this patch removes the cast<FixedVectorType>. Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D96253	2021-02-12 08:28:52 +00:00
Sander de Smalen	0d847dd905	[CostModel] An extending load to illegal type is not free. COST(zext (<4 x i32> load(...) to <4 x i64>)) != 0 when <4 x i64> is an illegal result type that requires splitting of the operation. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D96250	2021-02-12 07:59:21 +00:00
David Green	91764313a1	[ARM] Add CostKind to getMVEVectorCostFactor. This adds the CostKind to getMVEVectorCostFactor, so that it can automatically account for CodeSize costs, where it returns a cost of 1 not the MVEFactor used for Throughput/Latency. This helps simplify the caller code and allows us to get the codesize cost more correct in more cases.	2021-02-11 15:33:59 +00:00
Caroline Concatto	ec0cfd98d5	[AArch64][SVE]Add cost model for broadcast shuffle This patch adds a cost model for SK_Broadcast in AArch64TTIImpl::getShuffleCost with scalable vector. Without this patch, the scalable vector type relies on BasicTTIImpl cost implementation and assert. Differential Revision: https://reviews.llvm.org/D95598	2021-02-03 09:53:22 +00:00
David Green	5c8cfcc0df	[AArch64] Add vector saturating add intrinsic costs This adds sadd.sat, uadd.sat, ssub.sat and usub.sat costs for AArch64, similar to how they were recently added for ARM. Differential Revision: https://reviews.llvm.org/D95292	2021-01-27 10:38:32 +00:00
Sander de Smalen	466208abea	[CostModel] Handle CTLZ and CCTZ in getTypeBasedIntrinsicInstrCost Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D95355	2021-01-26 14:37:51 +00:00
David Green	2c039d2841	[AArch64] Saturating add cost tests. NFC	2021-01-24 13:49:17 +00:00
David Sherwood	eeef7c223e	Fix build failure caused by 2e080eb00ad76654313e0e119bb7fa0ffe2f9866	2021-01-22 09:56:53 +00:00
David Sherwood	b1a389ab23	[SVE] Add support for scalable vectorization of loops with selects and cmps I have removed an unnecessary assert in LoopVectorizationCostModel::getInstructionCost that prevented a cost being calculated for select instructions when using scalable vectors. In addition, I have changed AArch64TTIImpl::getCmpSelInstrCost to only do special cost calculations for fixed width vectors and fall back to the base version for scalable vectors. I have added a simple cost model test for cmps and selects: test/Analysis/CostModel/sve-cmpsel.ll and some simple tests that show we vectorize loops with cmp and select: test/Transforms/LoopVectorize/AArch64/sve-basic-vec.ll Differential Revision: https://reviews.llvm.org/D95039	2021-01-22 09:48:13 +00:00
Jeroen Dobbelaere	116cd71f2c	[noalias.decl] Look through llvm.experimental.noalias.scope.decl Just like llvm.assume, there are a lot of cases where we can just ignore llvm.experimental.noalias.scope.decl. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D93042	2021-01-19 20:09:42 +01:00
David Green	04f2bf7a46	[ARM] Expand vXi1 VSELECT's We have no lowering for VSELECT vXi1, vXi1, vXi1, so mark them as expanded to turn them into a series of logical operations. Differential Revision: https://reviews.llvm.org/D94946	2021-01-19 17:56:50 +00:00
David Green	c7d5648431	[ARM] Add MVE add.sat costs This adds some basic MVE sadd_sat/ssub_sat/uadd_sat/usub_sat costs, based on when the instruction is legal. With smaller than legal types that are promoted we generate shr(qadd(shl, shl)), so the cost is 4 appropriately. Differential Revision: https://reviews.llvm.org/D94958	2021-01-19 15:38:46 +00:00
David Green	218cbd9ab8	[ARM] Expand add.sat/sub.sat cost checks. NFC	2021-01-19 15:06:06 +00:00
Caroline Concatto	90b2e7be56	[AArch64][SVE]Add cost model for vector reduce for scalable vector This patch computes the cost for vector.reduce<operand> for scalable vectors. The cost is split into two parts: the legalization cost and the horizontal reduction. Differential Revision: https://reviews.llvm.org/D93639	2021-01-19 11:54:16 +00:00
David Green	a521cb4d34	[ARM] Update trunc costs We did not have specific costs for larger than legal truncates that were not otherwise cheap (where they were next to stores, for example). As MVE does not have a dedicated instruction for them (and we do not use loads/stores yet), they should be expensive as they get expanded to a series of lane moves. Differential Revision: https://reviews.llvm.org/D94260	2021-01-11 08:59:28 +00:00
David Green	e7f6381bd5	[ARM] Additional trunc cost tests. NFC	2021-01-11 08:35:16 +00:00
Caroline Concatto	0a20cec01d	[AArch64][CostModel]Fix gather scatter cost model This patch fixes a bug introduced in the patch: https://reviews.llvm.org/D93030 This patch pulls the test for scalable vector to be the first instruction to be checked. This avoids the Gather and Scatter cost model for AArch64 to compute the number of vector elements for something that is not a vector and therefore crashing.	2021-01-07 14:02:08 +00:00
Peter Waller	d954147555	[llvm][NFC] Disallow all warnings in TypeSize tests This is a follow-up to a request from a reviewer [0]. The text may change in the future and these tests should not produce any warning output. [0] https://reviews.llvm.org/D91806#inline-879243 Reviewed By: sdesmalen, david-arm Differential Revision: https://reviews.llvm.org/D94161	2021-01-06 17:17:07 +00:00
Caroline Concatto	3dded9e3bb	[AArch64][SVE]Add cost model for masked gather and scatter for scalable vector. A new TTI interface has been added 'Optional <unsigned>getMaxVScale' that returns the maximum vscale for a given target. When known getMaxVScale is used to compute the cost of masked gather scatter for scalable vector. Depends on D92094 Differential Revision: https://reviews.llvm.org/D93030	2021-01-04 13:59:58 +00:00
Juneyoung Lee	3936fc226a	Update inselt tests at llvm/test/Analysis to have poison as shufflevector's placeholder (NFC) File listed by: grep -R -E "^[^;]shufflevector <.> ., <.> undef" . \| grep inseltpoison Updated with: sed -i -E 's/shufflevector <(.)> (.), <(.*)> undef/shufflevector <\1> \2, <\3> poison/g' $1	2020-12-31 17:12:37 +09:00
Juneyoung Lee	b3956a9858	Precommit analysis/etc tests for inselt poison placeholder This adds tests in directories missing from https://reviews.llvm.org/rGdb7a2f347f132b3920415013d62d1adfb18d8d58	2020-12-24 12:14:24 +09:00
Bradley Smith	94d6c090f7	[CostModel] Add costs for llvm.experimental.vector.{extract,insert} intrinsics Adds cost model support for the new llvm.experimental.vector.{extract,insert} intrinsics, using the existing getExtractSubvectorOverhead and getInsertSubvectorOverhead functions for shuffles. Previously this case would throw an assertion. Differential Revision: https://reviews.llvm.org/D93043	2020-12-16 13:39:04 +00:00
Caroline Concatto	b575d6a35a	[CostModel]Replace FixedVectorType by VectorType in costgetIntrinsicInstrCost This patch replaces FixedVectorType by VectorType in getIntrinsicInstrCost in BasicTTIImpl.h. It re-arranges the scalable type test earlier return and add tests for scalable types. Depends on D91532 Differential Revision: https://reviews.llvm.org/D92094	2020-12-16 13:06:23 +00:00
David Green	5ed162a969	[ARM] Add basic masked load/store costs This adds some basic MVE masked load/store costs, notably changing the cost of legal loads/stores to the MVECostFactor and the cost of scalarized instructions to 8*NumElts. Differential Revision: https://reviews.llvm.org/D86538	2020-12-12 15:26:32 +00:00
Simon Pilgrim	f44bc6bb58	[CostModel][X86] getGatherScatterOpCost - use default implementation for alt costkinds Noticed while looking at D92701 - we only really handle TCK_RecipThroughput gather/scatter costs - for now drop back to the default implementation for non-legal gathers/scatters.	2020-12-06 14:08:26 +00:00
Sanjay Patel	47c90fc42e	[x86] adjust cost model values for minnum/maxnum with fast-math-flags Without FMF, we lower these intrinsics into something like this: vmaxsd %xmm0, %xmm1, %xmm2 vcmpunordsd %xmm0, %xmm0, %xmm0 vblendvpd %xmm0, %xmm1, %xmm2, %xmm0 But if we can ignore NANs, the single min/max instruction is enough because there is no need to fix up the x86 logic that corresponds to X > Y ? X : Y. We probably want to make other adjustments for FP intrinsics with FMF to account for specialized codegen (for example, FSQRT). Differential Revision: https://reviews.llvm.org/D92337	2020-12-01 10:45:53 -05:00
Sanjay Patel	621f444a0d	[x86] add tests for maxnum/minnum with nnan; NFC	2020-11-30 14:30:28 -05:00
Sjoerd Meijer	19add0d39d	[AArch64][CostModel] Fix cost for mul <2 x i64> This was modeled to have a cost of 1, but since we do not have a MUL.2d this is scalarized into vector inserts/extracts and scalar muls. Motivating precommitted test is test/Transforms/SLPVectorizer/AArch64/mul.ll, which we don't want to SLP vectorize. Test Transforms/LoopVectorize/AArch64/extractvalue-no-scalarization-required.ll unfortunately needed changing, but the reason is documented in LoopVectorize.cpp:6855: // The cost of executing VF copies of the scalar instruction. This opcode // is unknown. Assume that it is the same as 'mul'. which I will address next as a follow up of this. Differential Revision: https://reviews.llvm.org/D92208	2020-11-30 11:36:55 +00:00
Simon Pilgrim	1ebb153694	[DAG] Legalize umin(x,y) -> sub(x,usubsat(x,y)) and umax(x,y) -> add(x,usubsat(y,x)) iff usubsat is legal If usubsat() is legal, this is likely to result in smaller codegen expansion than the default cmp+select codegen expansion. Allows us to move the x86-specific lowering to the generic expansion code. Differential Revision: https://reviews.llvm.org/D92183	2020-11-27 11:18:58 +00:00
Sjoerd Meijer	26ad57d4a6	[AArch64][CostModel] Precommit some vector mul tests. NFC. The cost-model is not getting the cost right for a mul with <2 x i64> operands, i.e. we don't have a MUL.2d, and this is precommitting some tests before adjusting this.	2020-11-26 13:23:11 +00:00
Florian Hahn	5e5e86f97a	[CostModel] Add basic implementation of getGatherScatterOpCost. Add a basic implementation of getGatherScatterOpCost to BasicTTIImpl. The implementation estimates the cost of scalarizing the loads/stores, the cost of packing/extracting the individual lanes and the cost of only selecting enabled lanes. This more accurately reflects the current cost on targets like AArch64. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D91984	2020-11-26 12:02:25 +00:00
Simon Pilgrim	1c995b0a64	[CostModel][X86] Refresh ISD::ABS costs Update costs now that D92095 and D92102 have tweaked the SSE2 implementation The SSE42 BLENDVPD cost can actually be used on SSE41 as we don't attempt to generate PCMPGT anymore Add scalar i16/i32/i64 costs as we can do this cheaply with CMOV	2020-11-25 18:40:19 +00:00
Florian Hahn	82109b2991	[AArch64] Add scatter cost model tests.	2020-11-23 18:36:56 +00:00
Florian Hahn	1bf7e7d469	[AArch64] Add tests for masked.gather costs.	2020-11-23 17:33:27 +00:00
Sanjay Patel	9de62dc3bf	[CostModel] add basic handling for FP maximum/minimum intrinsics This might be a regression for some ARM targets, but that should be changed in the target-specific overrides. There is apparently still no default lowering for these nodes, so I am assuming these intrinsics are not in common use. X86, PowerPC, and RISC-V for example, just crash given the most basic IR.	2020-11-22 13:43:53 -05:00
Sanjay Patel	30af193cb7	[CostModel] add tests for FP maximum; NFC These min/max intrinsics are not handled in the basic implementation and probably not handled in target-specific overrides either.	2020-11-22 13:33:42 -05:00
Sanjay Patel	0d725bd74e	[CostModel] mostly remove cost-kind predicate for intrinsics in basic TTI implementation This is re-applying a combination of f7eac51b9b3f and 8ec7ea3ddce7 as one patch to avoid regressions now that we have better testing in place. Those were reverted with 32dd5870ee31 because of crashing in experimental intrinsics. That bug should be fixed with 7ae346434. Paraphrased original commit messages: This is the last step in removing cost-kind as a consideration in the basic class model for intrinsics. See D89461 for the start of that. Subsequent commits dealt with each of the special-case intrinsics that had customization here in the basic class. This should remove a barrier to retrying D87188 (canonicalization to the abs intrinsic). The ARM and x86 cost diffs seen here may be wrong because the target-specific overrides have their own bugs, but we hope this is less wrong - if something has a significant throughput cost, then it should have a significant size / blended cost too by default. The only behavioral diff in current regression tests is shown in the x86 scatter-gather test (which is misplaced or broken because it runs the entire -O3 pipeline) - we unrolled less, and we assume that is a improvement. Exception: in general, we want the size cost for a scalar call to be cheap even if the other costs are expensive - we expect it to just be a branch with some optional stack manipulation. It is likely that we will want to carve out some exceptions/overrides to this rule as follow-up patches for calls that have some general and/or target-specific difference to the expected lowering. This was noticed as a regression in unrolling, so we have a test for that now along with a couple of direct cost model tests. If the assumed scalarization costs for the oversized vector calls are not realistic, that would be another follow-up refinement of the cost models. Differential Revision: https://reviews.llvm.org/D90554	2020-11-20 11:21:10 -05:00

1 2 3 4 5 ...

734 Commits