llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 19:23:23 +01:00

Author	SHA1	Message	Date
Roman Lebedev	4292a82bd2	[NFC][SimplifyCFG] FoldBranchToCommonDest(): pull out 'common successor' into a variable Makes it easier to use it elsewhere	2020-12-14 20:14:31 +03:00
Roman Lebedev	aef5ea6d96	[NFC][SimplifyCFG] Add another miscompiled test for PR48450	2020-12-14 20:14:31 +03:00
Stanislav Mekhanoshin	4e48d88543	[SLP] Control maximum vectorization factor from TTI D82227 has added a proper check to limit PHI vectorization to the maximum vector register size. That unfortunately resulted in at least a couple of regressions on SystemZ and x86. This change reverts PHI handling from D82227 and replaces it with a more general check in SLPVectorizerPass::tryToVectorizeList(). Moved to tryToVectorizeList() it allows to restart vectorization if initial chunk fails. However, this function is more general and handles not only PHI but everything which SLP handles. If vectorization factor would be limited to maximum vector register size it would limit much more vectorization than before leading to further regressions. Therefore a new TTI callback getMaximumVF() is added with the default 0 to preserve current behavior and limit nothing. Then targets can decide what is better for them. The callback gets ElementSize just like a similar getMinimumVF() function and the main opcode of the chain. The latter is to avoid regressions at least on the AMDGPU. We can have loads and stores up to 128 bit wide, and <2 x 16> bit vector math on some subtargets, where the rest shall not be vectorized. I.e. we need to differentiate based on the element size and operation itself. Differential Revision: https://reviews.llvm.org/D92059	2020-12-14 08:49:40 -08:00
Jay Foad	3f42bac9ba	[AMDGPU] Make use of HasSMemRealTime predicate. NFC. We have this subtarget feature so it makes sense to use it here. This is NFC because it's always defined by default on GFX8+. Differential Revision: https://reviews.llvm.org/D93202	2020-12-14 16:34:57 +00:00
Kazushi (Jam) Marukawa	e50338f5d0	[VE] Add logical mask intrinsic instructions Add andm, orm, xorm, eqvm, nndm, negm, pcvm, lzvm, and tovm intrinsic instructions, a few pseudo instructions to expand logical intrinsic using VM512, a mechanism to expand such pseudo instructions, and regression tests. Also, assign vector mask types and vector mask register classes correctly. This is required to use VM512 registers as function arguments. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D93093	2020-12-15 01:34:31 +09:00
Simon Pilgrim	ee5f2e29f2	[X86] LowerBUILD_VECTOR - track zero/nonzero elements with APInt masks. NFCI. Prep work for undef/zero 'upper elements' handling as proposed in D92645.	2020-12-14 16:28:45 +00:00
Kazushi (Jam) Marukawa	97b075a61c	[VE] Correct addRegisterClass calls Correct addRegisterClass calls for vector mask registers. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D93212	2020-12-15 01:16:56 +09:00
diggerlin	1cd7949f8b	[AIX] Fixed "comparison of unsigned expression >= 0 is always true" gcc warnings. Summary: Fixed "comparison of unsigned expression >= 0 is always true" gcc warnings. http://lab.llvm.org:8011/#/builders/5/builds/2407/steps/2/logs/stdio The error was caused by patch https://reviews.llvm.org/D92398	2020-12-14 11:08:40 -05:00
Markus Lavin	8b493e1580	Reland [DebugInfo] Improve dbg preservation in LSR. Use SCEV to salvage additional @llvm.dbg.value that have turned into referencing undef after transformation (and traditional salvageDebugInfo). Before rewrite (but after introduction of new induction variables) use SCEV to compute an equivalent set of values for each @llvm.dbg.value in the loop body (among the loop header PHI-nodes). After rewrite (and dead PHI elimination) update those @llvm.dbg.value now referencing undef by picking a remaining value from its equivalence set. Allow match with offset by inserting compensation code in the DIExpression. Fixes : PR38815 Differential Revision: https://reviews.llvm.org/D87494	2020-12-14 16:15:18 +01:00
Florian Hahn	8b3b1f6f2e	[VPlan] Make VPWidenMemoryInstructionRecipe a VPDef. This patch updates VPWidenMemoryInstructionRecipe to use VPDef to manage the value it produces instead of inheriting from VPValue. Reviewed By: gilr Differential Revision: https://reviews.llvm.org/D90563	2020-12-14 14:13:59 +00:00
David Spickett	18038eabcf	[llvm-objdump] Use "--" for long options in --help text Single dash for these options is not recognised. Changes found by running this on the --help output and the user guide: grep -e ' -[a-zA-Z]\{2,\}' The user guide was updated in https://reviews.llvm.org/D92305 so no change there. Reviewed By: jhenderson, MaskRay Differential Revision: https://reviews.llvm.org/D92310	2020-12-14 13:11:29 +00:00
Anton Afanasyev	255cd47a94	[SLP] Fix vector element size for the store chains Vector element size could be different for different store chains. This patch prevents wrong computation of maximum number of elements for that case. Differential Revision: https://reviews.llvm.org/D93192	2020-12-14 15:51:43 +03:00
Simon Pilgrim	f264008f89	[TableGen] Don't dereference from dyn_cast<> - use cast<> instead. NFCI. dyn_cast<> can return null if the cast fails, resulting in null dereferences and static analyzer warnings. We should use cast<> instead.	2020-12-14 12:12:08 +00:00
Simon Pilgrim	371053ed23	[IRCE] Add test case for PR48051	2020-12-14 12:01:19 +00:00
Kerry McLaughlin	8503126c6a	[SVE][CodeGen] Lower scalable floating-point vector reductions Changes in this patch: - Minor changes to the LowerVECREDUCE_SEQ_FADD function added by @cameron.mcinally to also work for scalable types - Added TableGen patterns for FP reductions with unpacked types (nxv2f16, nxv4f16 & nxv2f32) - Asserts added to expandFMINNUM_FMAXNUM & expandVecReduceSeq for scalable types Reviewed By: cameron.mcinally Differential Revision: https://reviews.llvm.org/D93050	2020-12-14 11:45:42 +00:00
David Green	169d672b6b	[ARM] Improve handling of empty VPT blocks in tail predicated loops A vpt block that just contains either VPST;VCTP or VPT;VCTP, once the VCTP is removed will become invalid. This fixed the first by removing the now empty block and bails out for the second, as we have no simple way of converting a VPT to a VCMP. Differential Revision: https://reviews.llvm.org/D92369	2020-12-14 11:17:01 +00:00
Carl Ritson	ce9c6c06b9	[AMDGPU][NFC] Rename opsel/opsel_hi/neg_lo/neg_hi with suffix 0 These parameters set a default value of 0, so I believe they should include a 0 suffix. This allows for versions which do not set a default value in future. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D93187	2020-12-14 20:01:56 +09:00
Carl Ritson	5ca103d09e	[AMDGPU][NFC] Remove unused VOP3Mods0Clamp This is unused and the selection function does not exist. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D93188	2020-12-14 20:00:58 +09:00
Sebastian Neubauer	de78d986b0	[AMDGPU] Mark amdgpu_gfx functions as module entry function - Allows lds allocations - Writes resource usage into COMPUTE_PGM_RSRC1 registers in PAL metadata Differential Revision: https://reviews.llvm.org/D92946	2020-12-14 10:43:39 +01:00
Georgii Rymar	c7112b7126	[llvm-readobj] - For SHT_REL relocations, don't display an addend. This is https://bugs.llvm.org/show_bug.cgi?id=44257. In LLVM style we always print `0` as addend when dumping SHT_REL relocations. It is confusing, this patch stops printing it as the first comment on the bug page suggests. Differential revision: https://reviews.llvm.org/D93033	2020-12-14 12:03:00 +03:00
Jan Svoboda	ff6316c907	[clang][cli] Better defaults for MarshallingInfoString Depends on D84018 Reviewed By: Bigcheese Original patch by Daniel Grumberg. Differential Revision: https://reviews.llvm.org/D84185	2020-12-14 09:59:56 +01:00
Georgii Rymar	dff27ca0da	[llvm-readelf] - Improve ELF type field dumping. This is related to https://bugs.llvm.org/show_bug.cgi?id=40868. Currently we don't print `OS Specific`/`Processor Specific`/`<unknown>` prefixes when dumping the ELF file type. This is not consistent with GNU readelf. The patch fixes it. Also, this patch removes the `types.test`, because we already have `file-types.test`, which tests more cases and this patch revealed that we have such a duplicate. Differential revision: https://reviews.llvm.org/D93096	2020-12-14 11:24:08 +03:00
sameeran joshi	53796ac006	[Flang][OpenMP-5.0] Semantic checks for flush construct. From OMP 5.0 [2.17.8] Restriction: If memory-order-clause is release,acquire, or acq_rel, list items must not be specified on the flush directive. Reviewed By: kiranchandramohan, clementval Differential Revision: https://reviews.llvm.org/D89879	2020-12-14 13:30:48 +05:30
QingShan Zhang	054f5f2547	[PowerPC][FP128] Fix the incorrect signature for math library call The runtime library has two families of library implementations, for ppc_fp128 and fp128. For IBM Long double (ppc_fp128), it is suffixed with 'l', i.e. (sqrtl). For IEEE Long double (fp128), it is suffixed with "ieee128" or "f128". We failed to map several libcalls for IEEE Long double. Reviewed By: qiucf Differential Revision: https://reviews.llvm.org/D91675	2020-12-14 07:52:56 +00:00
sameeran joshi	ab2dc4d459	[Flang][OpenMP] Semantic checks for Atomic construct. Patch implements restrictions from 2.17.7 of OpenMP 5.0 standard for atomic Construct. Tests for the same are added. One of the restriction `OpenMP constructs may not be encountered during execution of an atomic region.` Is mentioned in 5.0 standard to be a semantic restriction, but given the stricter nature of parser in F18 it's caught at parsing itself. This patch is a next patch in series from D88965. Reviewed By: clementval Differential Revision: https://reviews.llvm.org/D89583	2020-12-14 13:03:57 +05:30
Craig Topper	7b691db656	[LoopIdiom] Pre-commit tests for D92745. NFC	2020-12-13 23:25:00 -08:00
Anton Afanasyev	1f4e3377f8	[SLP][Test] Precommit test for D93192 This test shows failure of combined stores chains vectorization	2020-12-14 09:23:47 +03:00
Chen Zheng	28d16f3586	[MachineCombiner][NFC] Add MustReduceRegisterPressure goal add a new goal MustReduceRegisterPressure for machine combiner pass. PowerPC will use this new goal to do some register pressure related optimization. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D92068	2020-12-14 00:02:42 -05:00
Kazu Hirata	f57215c328	[CodeGen] Use llvm::erase_value (NFC)	2020-12-13 20:05:48 -08:00
Kazu Hirata	c204059cba	[Target] Use llvm::is_contained (NFC)	2020-12-13 19:35:10 -08:00
Arthur Eubanks	26e5c18037	[opt][NPM] Pin -lower-amx-type to legacy PM This is part of the codegen pipeline.	2020-12-13 19:16:20 -08:00
Lang Hames	74ef815f29	Re-apply 8904ee8ac7e with missing header included this time.	2020-12-14 13:39:33 +11:00
Nico Weber	2ecd062c06	Revert "[JITLink] Add JITLinkDylib type, thread through JITLinkMemoryManager APIs." This reverts commit 8904ee8ac7ebcc50a60de0914abc6862e28b6664. Didn't `git add` llvm/ExecutionEngine/JITLink/JITLinkDylib.h and hence doesn't build anywhere.	2020-12-13 21:30:38 -05:00
Lang Hames	c16030181f	[JITLink] Add JITLinkDylib type, thread through JITLinkMemoryManager APIs. JITLinkDylib represents a target dylib for a JITLink link. By representing this explicitly we can: - Enable JITLinkMemoryManagers to manage allocations on a per-dylib basis (e.g. by maintaining a separate allocation pool for each JITLinkDylib). - Enable new features and diagnostics that require information about the target dylib (not implemented in this patch).	2020-12-14 12:29:16 +11:00
Lang Hames	3a59a13351	[JITLink] Fix include guard end comment.	2020-12-14 12:00:21 +11:00
Lang Hames	6cff7af9fe	[ORC] Prefer preincrement on iterator.	2020-12-14 12:00:21 +11:00
Craig Topper	baf5e23683	[X86] Add ExeDomain = SSEPackedSingle to cvtss2sd and cvtsd2ss instructions. Prep for D92993	2020-12-13 12:35:33 -08:00
Craig Topper	c97e9a2df5	[X86] Add isel patterns to form VPDPWSSD from (add (vpmaddwd X, Y), Z) when AVXVNNI is enabled. We already have these patterns for AVX512VNNI.	2020-12-13 12:02:07 -08:00
Nikita Popov	b008f21f60	[AC] Handle (X+C1)<C2 assumes (PR48408) InstCombine canonicalizes X>C && X<C' style comparisons into (X+C1)<C2. This type of expression is recognized by some analyses like LVI, but currently not when used inside assumptions, because AssumptionCache does not track affected values for it.	2020-12-13 21:00:32 +01:00
Harald van Dijk	7fde76de88	[X86] Extend varargs test This extends the existing x86-64-varargs test by passing enough arguments that they need to be passed in memory, and by passing them in reverse order, using va_arg for each argument to retrieve them and restoring them to the correct order, and by using va_copy to have two va_lists to use with va_arg.	2020-12-13 18:33:10 +00:00
Kazu Hirata	739a1bfd1b	[Analysis] Remove unused declaration replaceEdgeKey (NFC) The declaration was introduced without a corresponding definition on Feb 9, 2017 in commit aaad9f84be2a6a3eb8202ed4eaa5e5e2021d055e.	2020-12-13 10:03:45 -08:00
Kazu Hirata	cad22ca413	[Transforms] Use llvm::erase_value (NFC)	2020-12-13 09:48:47 -08:00
Tony	0edbddc58e	[NFC][AMDGPU] Update AMDGPUUsage with AMD RDNA 2 reference Differential Revision: https://reviews.llvm.org/D93172	2020-12-13 17:21:02 +00:00
Simon Pilgrim	911e7e5df0	[X86][SSE] combineX86ShufflesRecursively - add basic handling for combining shuffles of different widths (PR45974) If a faux shuffle uses smaller shuffle inputs, try to recursively combine with those inputs directly instead of widening them immediately. Then widen all smaller inputs at the bottom of the recursion. This will still mean we're generating nodes on the fly (PR45974) even if we don't combine to a new shuffle but it does help AVX2+ targets combine across xmm/ymm/zmm types, mainly as variable shuffles.	2020-12-13 17:18:07 +00:00
Simon Pilgrim	cfea3c3324	[X86][AVX] Add additional X86ISD::SUBV_BROADCAST_LOAD test case for D92645 Suggested by @yubing - to check whether we can reuse a single subvector broadcast for 128/256/512-bit vectors.	2020-12-13 16:43:33 +00:00
Florian Hahn	89cd0fd6c4	[VPlan] Use interleaveComma in printOperands() (NFC).	2020-12-13 16:29:16 +00:00
Florian Hahn	ff3668406f	Recommit "[AArch64] Lower calls with rv_marker attribute." This recommits a87fccb3ff9c with a fix to mark the destination operand of the marker instruction as def, to fix a machine verifier failure. This reverts the revert commit c0f2cea7c0afc7c9688e1633f2a9b25c8ea4a9bd.	2020-12-13 16:20:39 +00:00
Simon Pilgrim	a7e965cbc8	[X86][SSE] combineReductionToHorizontal - add vXi8 ISD::MUL reduction handling (PR39709) Default expansion leads to repeated extensions/truncations to/from vXi16 which shuffle combining and demanded elts can't completely unravel. Better just to promote (any_extend) the input and perform a vXi16 reduction. We'll be able to remove a lot of this if we ever get decent legalization support for reduction intrinsics in SelectionDAG.	2020-12-13 15:22:54 +00:00
Simon Pilgrim	66c06c91e1	[X86] Regenerate vector-reduce-mul.ll with common check prefixes. NFC. Try to merge AVX1/AVX2/AVX512 codegen checks where possible	2020-12-13 14:25:42 +00:00
Nikita Popov	5f5687abd0	[BasicAA] Handle known non-zero variable index BasicAA currently handles cases like ScaleV0 + (-Scale)V1 where V0 != V1, but does not handle the simpler case of Scale*V with V != 0. Add it based on an isKnownNonZero() call. I'm not passing a context instruction for now, because the existing approach of always using GEP1 for context could result in symmetry issues. Differential Revision: https://reviews.llvm.org/D93162	2020-12-13 13:20:05 +01:00


208202 Commits