llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-25 12:12:47 +01:00

Author	SHA1	Message	Date
Simon Pilgrim	1899e92869	[CostModel][X86] Improve extract/insert element costs (PR43605) This tries to improve the accuracy of extract/insert element costs by accounting for subvector extraction/insertion for >128-bit vectors and the shuffling of elements to/from the 0'th index. It also adds INSERTPS for f32 types and PINSR/PEXTR costs for integer types (at the moment we assume the same cost as MOVD/MOVQ - which isn't always true). Differential Revision: https://reviews.llvm.org/D74976	2020-02-27 15:54:13 +00:00
Bardia Mahjour	4af1e9e981	[DA] Delinearization of fixed-size multi-dimensional arrays Summary: Currently the dependence analysis in LLVM is unable to compute accurate dependence vectors for multi-dimensional fixed size arrays. This is mainly because the delinearization algorithm in scalar evolution relies on parametric terms to be present in the access functions. In the case of fixed size arrays such parametric terms are not present, but we can use the indexes from GEP instructions to recover the subscripts for each dimension of the arrays. This patch adds this ability under the existing option `-da-disable-delinearization-checks`. Authored By: bmahjour Reviewer: Meinersbur, sebpop, fhahn, dmgreen, grosser, etiotto, bollu Reviewed By: Meinersbur Subscribers: hiraditya, arphaman, Whitney, ppc-slack, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72178	2020-02-27 10:29:01 -05:00
Jay Foad	7d1b7cc031	[AMDGPU][ConstantFolding] Fold llvm.amdgcn.fract intrinsic Reviewers: nhaehnle, arsenm, rampitec Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75179	2020-02-27 14:37:53 +00:00
Simon Pilgrim	417ce65d08	[CostModel][X86] We don't need a scale factor for SLM extract costs D74976 will handle larger vector types, but since SLM doesn't support AVX+ then we will always be extracting from 128-bit vectors so don't need to scale the cost.	2020-02-24 14:23:04 +00:00
Simon Pilgrim	77b793f027	[CostModel][X86] Try to check against common prefixes before using target-specific cpu checks SLM/GLM is still a mess so not all of them have been updated yet.	2020-02-24 11:59:07 +00:00
Jonas Paulsson	035c4568cc	[SystemZ] Return scalarized costs for vector instructions on older archs. A cost query for a vector instruction should return a cost even without target vector support, and not trigger an assert. VectorCombine does this with an input containing source code vectors. Review: Ulrich Weigand	2020-02-21 09:17:37 -08:00
Evgeniy Brevnov	836e38e46a	[DependenceAnalysis] Memory dependence analysis internal caching mechanism is broken in presence of TBAA (PR42733). Summary: There is a flaw in memory dependence analysis caching mechanism when memory accesses with TBAA are involved. Assume we first analysed and cached results for access with TBAA. Later we request dependence for the same memory but without TBAA (or different TBAA). By design these two queries should share one entry in the internal cache which corresponds to a general access (without TBAA). Thus upon second request internal cache is cleared and we continue analysis for access as if there is no TBAA. The problem is that even though internal cache is cleared the set of visited nodes is not. That means we won't traverse visited nodes again and populate internal cache with the corresponding dependence results. So we end up with internal cache in an incomplete state. Current implementation tries to signal that situation by resetting CacheInfo->Pair at line 1104. But that doesn't actually help since later code ignores this invalidation and relies on 'Cache->empty()' property to decide on cache completeness. Reviewers: reames, hfinkel, chandlerc, fedor.sergeev, asbirlea, fhahn, john.brawn, Prazek, sunfish Reviewed By: john.brawn Subscribers: DaniilSuchkov, kosarev, jfb, dantrushin, hiraditya, bmahjour, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73032	2020-02-21 20:20:36 +07:00
Sanjay Patel	6fe2359aa7	[ConstantFold] fold fsub -0.0, undef to undef rather than NaN A question about this behavior came up on llvm-dev: http://lists.llvm.org/pipermail/llvm-dev/2020-February/139003.html ...and as part of backend improvements in D73978, but this is an IR change first because we already have fairly thorough tests in place here. We decided not to implement a more general change that would have folded any FP binop with nearly arbitrary constant + undef operand to undef because that is not theoretically correct (even if it is practically correct). Differential Revision: https://reviews.llvm.org/D74713	2020-02-21 08:03:19 -05:00
Sanjay Patel	1dc06667d0	[ConstantFold] add/move tests for FP with undef operand; NFC	2020-02-20 15:07:11 -05:00
Hideto Ueno	26e1302f3c	[MustExecute] Add backward exploration for must-be-executed-context Summary: As mentioned in D71974, it is useful for must-be-executed-context to explore CFG backwardly. This patch is ported from parts of D64975. We use a dominator tree to find the previous context if a dominator tree is available. Reviewers: jdoerfert, hfinkel, baziotis, sstefan1 Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74817	2020-02-20 14:49:30 +09:00
Bardia Mahjour	d2570e78cd	[DDG] Data Dependence Graph - Graph Simplification Summary: This is the last functional patch affecting the representation of DDG. Here we try to simplify the DDG to reduce the number of nodes and edges by iteratively merging pairs of nodes that satisfy the following conditions, until no such pair can be identified. A pair of nodes consisting of a and b can be merged if: 1. the only edge from a is a def-use edge to b and 2. the only edge to b is a def-use edge from a and 3. there is no cyclic edge from b to a and 4. all instructions in a and b belong to the same basic block and 5. both a and b are simple (single or multi instruction) nodes. These criteria allow us to fold many uninteresting def-use edges that commonly exist in the graph while avoiding the risk of introducing dependencies that didn't exist before. Authored By: bmahjour Reviewer: Meinersbur, fhahn, myhsu, xtian, dmgreen, kbarton, jdoerfert Reviewed By: Meinersbur Subscribers: ychen, arphaman, simoll, a.elovikov, mgorny, hiraditya, jfb, wuzish, llvm-commits, jsji, Whitney, etiotto, ppc-slack Tags: #llvm Differential Revision: https://reviews.llvm.org/D72350	2020-02-19 13:41:51 -05:00
Jay Foad	92cac194bd	[AMDGPU][ConstantFolding] Fold llvm.amdgcn.fmul.legacy intrinsic Reviewers: arsenm, rampitec, nhaehnle Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74835	2020-02-19 16:01:30 +00:00
Evgeniy Brevnov	840bb38307	Reverting D73027 [DependenceAnalysis] Dependencies for loads marked with "invariant.load" should not be shared with general accesses (PR42151).	2020-02-14 22:57:23 +07:00
Evgeniy Brevnov	c443874e26	[DependenceAnalysis] Dependencies for loads marked with "invariant.load" should not be shared with general accesses (PR42151). Summary: This is second attempt to fix the problem with incorrect dependencies reported in presence of invariant load. Initial fix (https://reviews.llvm.org/D64405) was reverted due to a regression reported in https://reviews.llvm.org/D70516. The original fix changed caching behavior for invariant loads. Namely such loads are not put into the second level cache (NonLocalDepInfo). The problem with that fix is the first level cache (CachedNonLocalPointerInfo) still works as if invariant loads were in the second level cache. The solution is in addition to not putting dependence results into the second level cache avoid putting info about invariant loads into the first level cache as well. Reviewers: jdoerfert, reames, hfinkel, efriedma Reviewed By: jdoerfert Subscribers: DaniilSuchkov, hiraditya, bmahjour, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73027	2020-02-14 12:18:31 +07:00
Huihui Zhang	8658e137b1	[ConstantFold][SVE] Fix constant fold for FoldReinterpretLoadFromConstPtr. Summary: Bail out early for scalable vectors. As global variables are not expected to be scalable. Use explicit call of getFixedSize() to assert on places where scalable size doesn't make sense. Reviewers: sdesmalen, efriedma, apazos, huntergr, willlovett Reviewed By: sdesmalen Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74424	2020-02-12 10:24:50 -08:00
Ehud Katz	f87b5040ba	[ConstantFolding] Fold calls to FP remainder function With the fixed implementation of the "remainder" operation in rG9d0956ebd471, we can now add support to folding calls to it. Differential Revision: https://reviews.llvm.org/D69777	2020-02-12 13:21:18 +02:00
Nicolai Hähnle	894807cedd	AMDGPU: llvm.amdgcn.writelane is a source of divergence Summary: Consider: %r = call i32 @llvm.amdgcn.writelane(i32 0, i32 1, i32 2) This produces a value that is 0 on lane 1, and 2 everywhere else; i.e., it is divergent. Reported-by: Marek Olsak <Marek.Olsak@amd.com> Reviewers: arsenm, foad, mareko Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74400	2020-02-12 09:12:56 +01:00
Huihui Zhang	10305a8da8	[ConstantFold][SVE] Fix constant fold for vector call. Summary: Do not iterate on scalable vectors. Reviewers: sdesmalen, efriedma, apazos, huntergr, willlovett Reviewed By: sdesmalen Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74419	2020-02-11 14:06:15 -08:00
Alina Sbirlea	1e37b17c69	[BasicAA] Make BasicAA a cfg pass. Summary: Part of the changes in D44564 made BasicAA not CFG only due to it using PhiAnalysisValues which may have values invalidated. Subsequent patches (rL340613) appear to have addressed this limitation. BasicAA should not be invalidated by non-CFG-altering passes. A concrete example is MemCpyOpt which preserves CFG, but we are testing it invalidates BasicAA. llvm-dev RFC: https://groups.google.com/forum/#!topic/llvm-dev/eSPXuWnNfzM Reviewers: john.brawn, sebpop, hfinkel, brzycki Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74353	2020-02-11 11:30:08 -08:00
Rachel Craik	e80d68f22a	[LoopCacheAnalysis]: Add support for negative stride LoopCacheAnalysis currently assumes the loop will be iterated over in a forward direction. This patch addresses the issue by using the absolute value of the stride when iterating backwards. Note: this patch will treat negative and positive array access the same, resulting in the same cost being calculated for single and bi-directional access patterns. This should be improved in a subsequent patch. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D73064	2020-02-10 13:22:35 -05:00
Huihui Zhang	97695b832a	[ConstantFold][NFC] Move scalable vector unit tests under vscale.ll	2020-02-05 16:03:51 -08:00
Huihui Zhang	3f01fba31d	[ConstantFold][SVE] Fix constant folding for bitcast. Do not iterate on scalable vector type in BitCastConstantVector. Continuation work of D70985, D71147. Support for folding bitcast into splat value is kept in D74095, as it depends on D71637. Differential Revision: https://reviews.llvm.org/D71389	2020-02-05 15:39:57 -08:00
Christopher Tetreault	c5dc9d3b15	Reapply: [SVE] Fix bug in simplification of scalable vector instructions This reverts commit a05441038a3a4a011b9421751367c5c797d57137, reapplying commit 31574d38ac5fa4646cf01dd252a23e682402134f	2020-02-05 10:00:09 -08:00
Matt Arsenault	0861faad9c	AMDGPU: Fix divergence analysis of control flow intrinsics The mask results of these should be uniform. The trickier part is the dummy booleans used as IR glue need to be treated as divergent. This should make the divergence analysis results correct for the IR the DAG is constructed from. This should allow us to eliminate requiresUniformRegister, which has an expensive, recursive scan over all users looking for control flow intrinsics. This should avoid recent compile time regressions.	2020-02-05 09:30:54 -08:00
Matt Arsenault	9cc24c4b43	AMDGPU: Fix isAlwaysUniform for simple asm SGPR results We were handling the case where the result was a struct with an extracted SGPR component, but not for the simple case.	2020-02-04 13:34:14 -08:00
Matt Arsenault	8c7054551b	AMDGPU: Analyze divergence of inline asm	2020-02-03 12:42:16 -08:00
Reid Kleckner	a1c473cd39	Revert "[SVE] Fix bug in simplification of scalable vector instructions" This reverts commit 31574d38ac5fa4646cf01dd252a23e682402134f. The newly added shufflevector test does not pass locally on either of my workstations.	2020-02-03 11:12:09 -08:00
Christopher Tetreault	56276c94bb	[SVE] Fix bug in simplification of scalable vector instructions Summary: * Most of the simplifications in SimplifyShuffleVectorInst depend on the concrete value of, or the length of the mask vector. For scalable vectors, this cannot be known at compile time. ** for these tests, detect if the vector is scalable before attempting the transformation * The functions ShuffleVectorInst::getMaskValue and ShuffleVectorInst::getShuffleMask access the value of the constant mask. However, since the length of the mask is unknown at compile time, these functions do not work for scalable vectors. Add asserts to ensure that the input mask is not scalable Reviewers: efriedma, sdesmalen, apazos, chrisj, huihuiz Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73555	2020-02-03 10:15:56 -08:00
Huihui Zhang	ac4a5d6e8a	[ConstantFold][SVE][NFC] Add test for select instruction in scalable vector. Side notes from D73669, no need to guard the iteration on vectors, as it is explicitly looking for a ConstantVector/ConstantDataVector, which is not expected to be scalable at the moment. So, add the test only.	2020-01-30 10:56:12 -08:00
Huihui Zhang	8898d47dbb	[ConstantFold][SVE] Fix constant folding for scalable vector unary operations. Summary: Similar to issue D71445. Scalable vector should not be evaluated element by element. Add support to handle scalable vector UndefValue. Reviewers: sdesmalen, efriedma, apazos, huntergr, willlovett Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73678	2020-01-30 10:45:15 -08:00
Craig Topper	9655902815	[X86] Fix the cost model for v16i16->v16i32 zero_extend/sign_extend with AVX2 We seem to be inheriting the cost from sse4.1. But if we have 256-bit registers we should be able to do this with just one extract to split the 16i16 and two v8i16->v8i32 operations so our cost should be 3 not 4. Differential Revision: https://reviews.llvm.org/D73646	2020-01-29 15:52:10 -08:00
Huihui Zhang	94e50c3dad	[ConstantFold][SVE] Fix constant folding for scalable vector binary operations. Summary: Scalable vector should not be evaluated element by element. Add support to handle scalable vector UndefValue. Reviewers: sdesmalen, huntergr, spatel, lebedev.ri, apazos, efriedma, willlovett Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71445	2020-01-29 10:49:08 -08:00
Evgenii Stepanov	d2f0ede221	Support zero size types in StackSafetyAnalysis. Reviewers: vitalybuka Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73395	2020-01-27 15:22:59 -08:00
Evgenii Stepanov	a55605a524	Fix StackSafetyAnalysis crash with scalable vector types. Summary: Treat scalable allocas as if they have storage size of 0, and scalable-typed memory accesses as if their range is unlimited. This is not a proper support of scalable vector types in the analysis - we can do better, but not today. Reviewers: vitalybuka Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73394	2020-01-27 15:22:59 -08:00
Roman Lebedev	9bf8fcc900	[NFC][IndVarSimplify] Autogenerate tests affected by isHighCostExpansionHelper() cost modelling (PR44668)	2020-01-27 23:34:29 +03:00
Craig Topper	e9d6696ce1	Revert a107f86 "[GlobalsAA] Add back a check to intrinsic_addresstaken.ll to see if the AVX and AVX512 bots still fail for it." It still fails some buildbots which is what I was trying to test.	2020-01-24 13:15:23 -08:00
Craig Topper	0a13c55c8c	[GlobalsAA] Add back a check to intrinsic_addresstaken.ll to see if the AVX and AVX512 bots still fail for it. These bots failed for this several months ago and as a result, this check was removed. If they still fail I'm going to try to see if I can figure out why.	2020-01-24 11:54:23 -08:00
Austin Kerbow	0fa8b03aac	Resubmit: [DA][TTI][AMDGPU] Add option to select GPUDA with TTI Summary: Enable the new divergence analysis by default for AMDGPU. Resubmit with test updates since GPUDA was causing failures on Windows. Reviewers: rampitec, nhaehnle, arsenm, thakis Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73315	2020-01-24 10:39:40 -08:00
Austin Kerbow	06d4a892b3	[DA] Don't propagate from unreachable blocks Summary: Fixes crash that could occur when a divergent terminator has an unreachable parent. Reviewers: rampitec, nhaehnle, arsenm Subscribers: jvesely, wdng, hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73323	2020-01-24 10:28:11 -08:00
David Green	d8e98cfe8f	[ARM] Basic gather scatter cost model This is a very basic MVE gather/scatter cost model, based roughly on the code that we will currently produce. It does not handle truncating scatters or extending gathers correctly yet, as it is difficult to tell that they are going to be correctly extended/truncated from the limited information in the cost function. This can be improved as we extend support for these in the future. Based on code originally written by David Sherwood. Differential Revision: https://reviews.llvm.org/D73021	2020-01-22 14:41:38 +00:00
David Green	380c878c38	[ARM] MVE Gather Scatter cost model tests. NFC	2020-01-22 14:41:38 +00:00
Florian Hahn	7b29a90656	[IR] Mark memset.* intrinsics as IntrWriteMem. llvm.memset intrinsics only write memory, but are missing IntrWriteMem, so doesNotReadMemory() returns false for them. The test change is due to the test checking the fn attribute ids at the call sites, which got bumped up due to a new combination with writeonly appearing in the test file. Reviewers: jdoerfert, reames, efriedma, nlopes, lebedev.ri Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D72789	2020-01-16 10:35:46 +00:00
Zheng Chen	d8d9e920a8	[SCEV] accurate range for addrecexpr with nuw flag If addrecexpr has nuw flag, the value should never be less than its start value, and the start value is not required to be a SCEVConstant. Reviewed By: nikic, sanjoy Differential Revision: https://reviews.llvm.org/D71690	2020-01-12 20:22:37 -05:00
Zheng Chen	1c93f5af35	[SCEV] more accurate range for addrecexpr with nsw flag. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D72436	2020-01-11 23:26:35 -05:00
Zheng Chen	edd7ca0f0e	[SCEV] [NFC] add more test cases for range of addrecexpr with nsw flag	2020-01-10 22:44:47 -05:00
Zheng Chen	4e5c39e80b	[SCEV] [NFC] add testcase for constant range for addrecexpr with nsw flag	2020-01-09 01:26:57 -05:00
Simon Pilgrim	df5719cf52	[CostModel][X86] Add missing scalar i64->f32 uitofp costs	2020-01-06 13:17:02 +00:00
Fangrui Song	2d0a36fd96	Migrate function attribute "no-frame-pointer-elim"="false" to "frame-pointer"="none" as cleanups after D56351	2019-12-24 16:27:51 -08:00
Fangrui Song	148dd94d20	Migrate function attribute "no-frame-pointer-elim-non-leaf" to "frame-pointer"="non-leaf" as cleanups after D56351	2019-12-24 16:05:15 -08:00
Fangrui Song	d9c5df08b1	Migrate function attribute "no-frame-pointer-elim" to "frame-pointer"="all" as cleanups after D56351	2019-12-24 15:57:33 -08:00
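
The five merge conditions listed in d2570e78cd ([DDG] Graph Simplification) can be illustrated with a toy model. This is only a sketch under stated assumptions: `Node`, `can_merge`, and `simplify` are hypothetical names invented here, the graph is a plain edge list rather than LLVM's DDG classes, and the code is Python rather than the C++ used in LLVM.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a DDG node (not an LLVM class)."""
    name: str
    block: str            # basic block holding all of the node's instructions
    simple: bool = True   # False would model non-simple nodes (e.g. pi-blocks)
    insts: list = field(default_factory=list)


def can_merge(a, b, nodes, edges):
    """Check the five conditions from the commit message for the pair (a, b)."""
    out_a = [e for e in edges if e[0] == a]
    in_b = [e for e in edges if e[1] == b]
    return (
        out_a == [(a, b, "def-use")]                            # 1. only edge from a
        and in_b == [(a, b, "def-use")]                         # 2. only edge to b
        and not any(s == b and d == a for s, d, _ in edges)     # 3. no edge b -> a
        and nodes[a].block == nodes[b].block                    # 4. same basic block
        and nodes[a].simple and nodes[b].simple                 # 5. both simple
    )


def simplify(nodes, edges):
    """Merge qualifying pairs repeatedly until no such pair remains."""
    nodes = {k: Node(v.name, v.block, v.simple, list(v.insts))
             for k, v in nodes.items()}
    edges = list(edges)
    changed = True
    while changed:
        changed = False
        for a, b, _ in edges:
            if a != b and can_merge(a, b, nodes, edges):
                # a absorbs b: append instructions, re-source b's outgoing edges.
                nodes[a].insts += nodes[b].insts
                edges = [(a if s == b else s, d, k)
                         for s, d, k in edges if (s, d) != (a, b)]
                del nodes[b]
                changed = True
                break
    return nodes, edges


# A def-use chain a -> b -> c inside bb0, feeding d in another block.
demo_nodes = {
    "a": Node("a", "bb0", insts=["i1"]),
    "b": Node("b", "bb0", insts=["i2"]),
    "c": Node("c", "bb0", insts=["i3"]),
    "d": Node("d", "bb1", insts=["i4"]),
}
demo_edges = [("a", "b", "def-use"), ("b", "c", "def-use"), ("c", "d", "def-use")]
merged_nodes, merged_edges = simplify(demo_nodes, demo_edges)
```

In this toy input the chain inside `bb0` folds into a single node, while the edge into `d` survives because condition 4 (same basic block) fails for it. An extra non-def-use edge between two nodes would likewise block the merge through conditions 1 and 2.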


1937 Commits