llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 19:23:23 +01:00

Author	SHA1	Message	Date
Lang Hames	6cff7af9fe	[ORC] Prefer preincrement on iterator.	2020-12-14 12:00:21 +11:00
Craig Topper	baf5e23683	[X86] Add ExeDomain = SSEPackedSingle to cvtss2sd and cvtsd2ss instructions. Prep for D92993	2020-12-13 12:35:33 -08:00
Craig Topper	c97e9a2df5	[X86] Add isel patterns to form VPDPWSSD from (add (vpmaddwd X, Y), Z) when AVXVNNI is enabled. We already have these patterns for AVX512VNNI.	2020-12-13 12:02:07 -08:00
Nikita Popov	b008f21f60	[AC] Handle (X+C1)<C2 assumes (PR48408) InstCombine canonicalizes X>C && X<C' style comparisons into (X+C1)<C2. This type of expression is recognized by some analyses like LVI, but currently not when used inside assumptions, because AssumptionCache does not track affected values for it.	2020-12-13 21:00:32 +01:00
Harald van Dijk	7fde76de88	[X86] Extend varargs test This extends the existing x86-64-varargs test by passing enough arguments that they need to be passed in memory, and by passing them in reverse order, using va_arg for each argument to retrieve them and restoring them to the correct order, and by using va_copy to have two va_lists to use with va_arg.	2020-12-13 18:33:10 +00:00
Kazu Hirata	739a1bfd1b	[Analysis] Remove unused declaration replaceEdgeKey (NFC) The declaration was introduced without a corresponding definition on Feb 9, 2017 in commit aaad9f84be2a6a3eb8202ed4eaa5e5e2021d055e.	2020-12-13 10:03:45 -08:00
Kazu Hirata	cad22ca413	[Transforms] Use llvm::erase_value (NFC)	2020-12-13 09:48:47 -08:00
Tony	0edbddc58e	[NFC][AMDGPU] Update AMDGPUUsage with AMD RDNA 2 reference Differential Revision: https://reviews.llvm.org/D93172	2020-12-13 17:21:02 +00:00
Simon Pilgrim	911e7e5df0	[X86][SSE] combineX86ShufflesRecursively - add basic handling for combining shuffles of different widths (PR45974) If a faux shuffle uses smaller shuffle inputs, try to recursively combine with those inputs directly instead of widening them immediately. Then widen all smaller inputs at the bottom of the recursion. This will still mean we're generating nodes on the fly (PR45974) even if we don't combine to a new shuffle but it does help AVX2+ targets combine across xmm/ymm/zmm types, mainly as variable shuffles.	2020-12-13 17:18:07 +00:00
Simon Pilgrim	cfea3c3324	[X86][AVX] Add additional X86ISD::SUBV_BROADCAST_LOAD test case for D92645 Suggested by @yubing - to check whether we can reuse a single subvector broadcast for 128/256/512-bit vectors.	2020-12-13 16:43:33 +00:00
Florian Hahn	89cd0fd6c4	[VPlan] Use interleaveComma in printOperands() (NFC).	2020-12-13 16:29:16 +00:00
Florian Hahn	ff3668406f	Recommit "[AArch64] Lower calls with rv_marker attribute." This recommits a87fccb3ff9c with a fix to mark the destination operand of the marker instruction as def, to fix a machine verifier failure. This reverts the revert commit c0f2cea7c0afc7c9688e1633f2a9b25c8ea4a9bd.	2020-12-13 16:20:39 +00:00
Simon Pilgrim	a7e965cbc8	[X86][SSE] combineReductionToHorizontal - add vXi8 ISD::MUL reduction handling (PR39709) Default expansion leads to repeated extensions/truncations to/from vXi16 which shuffle combining and demanded elts can't completely unravel. Better just to promote (any_extend) the input and perform a vXi16 reduction. We'll be able to remove a lot of this if we ever get decent legalization support for reduction intrinsics in SelectionDAG.	2020-12-13 15:22:54 +00:00
Simon Pilgrim	66c06c91e1	[X86] Regenerate vector-reduce-mul.ll with common check prefixes. NFC. Try to merge AVX1/AVX2/AVX512 codegen checks where possible	2020-12-13 14:25:42 +00:00
Nikita Popov	5f5687abd0	[BasicAA] Handle known non-zero variable index BasicAA currently handles cases like Scale*V0 + (-Scale)*V1 where V0 != V1, but does not handle the simpler case of Scale*V with V != 0. Add it based on an isKnownNonZero() call. I'm not passing a context instruction for now, because the existing approach of always using GEP1 for context could result in symmetry issues. Differential Revision: https://reviews.llvm.org/D93162	2020-12-13 13:20:05 +01:00
Nico Weber	a3fb223ce6	[mac/arm] Deflake 3 check-llvm tests On macOS/arm, signature verification has kill semantics by default. Signature verification is cached with a file's inode (actually, vnode), and if a new executable is copied over an existing file (which reuses the inode), the cache isn't invalidated. So when the new executable is executed, the kernel still has the old content's signature cached and kills the executable because the old signature doesn't match the new contents (https://openradar.appspot.com/FB8914243). As a workaround, rm the destination files first, to ensure they have a fresh vnode (and hence no stale cached signature) after the copy. Part of PR46647. See also e0e334a9c1ac for a similar change.	2020-12-12 21:14:45 -05:00
Chris Sears	9ebc697f7d	X86: Correcting X86OutgoingValueHandler typo (NFC) https://reviews.llvm.org/D92631	2020-12-12 20:28:37 -05:00
Nico Weber	eb4dd1efc0	fix typos to cycle bots	2020-12-12 20:19:33 -05:00
Nico Weber	fad391b5a4	mac/arm: XFAIL the last 2 failing check-llvm tests We should fix them, but let's XFAIL them for now so that we can start running check-llvm on bots and lock in the passing tests. Part of PR46647.	2020-12-12 20:12:02 -05:00
Nico Weber	4abfbbe941	[mac/arm] skip MappedMemoryTest that try to map w+x macOS/arm is w^x, so these tests don't work. Fixes these failures: LLVM-Unit :: Support/./SupportTests/AllocationTests/MappedMemoryTest.AllocAndRelease/5 LLVM-Unit :: Support/./SupportTests/AllocationTests/MappedMemoryTest.AllocAndReleaseHuge/5 LLVM-Unit :: Support/./SupportTests/AllocationTests/MappedMemoryTest.BasicWrite/5 LLVM-Unit :: Support/./SupportTests/AllocationTests/MappedMemoryTest.DuplicateNear/5 LLVM-Unit :: Support/./SupportTests/AllocationTests/MappedMemoryTest.EnabledWrite/3 LLVM-Unit :: Support/./SupportTests/AllocationTests/MappedMemoryTest.EnabledWrite/4 LLVM-Unit :: Support/./SupportTests/AllocationTests/MappedMemoryTest.EnabledWrite/5 LLVM-Unit :: Support/./SupportTests/AllocationTests/MappedMemoryTest.MultipleAllocAndRelease/5 LLVM-Unit :: Support/./SupportTests/AllocationTests/MappedMemoryTest.MultipleWrite/5 LLVM-Unit :: Support/./SupportTests/AllocationTests/MappedMemoryTest.SuccessiveNear/5 LLVM-Unit :: Support/./SupportTests/AllocationTests/MappedMemoryTest.UnalignedNear/5 LLVM-Unit :: Support/./SupportTests/AllocationTests/MappedMemoryTest.ZeroNear/5 LLVM-Unit :: Support/./SupportTests/AllocationTests/MappedMemoryTest.ZeroSizeNear/5 Part of PR46647.	2020-12-12 19:46:32 -05:00
Craig Topper	6db8e7e4ea	[X86] Autogenerate complete checks. NFC	2020-12-12 16:37:28 -08:00
Amara Emerson	7bc095bd88	[GlobalISel][IRTranslator] Fix a crash when the use of an extractvalue is a non-dominated metadata use. We don't expect uses to come before defs in the CFG, so allocateVRegs() asserted. Fixes PR48211	2020-12-12 14:58:54 -08:00
Roman Lebedev	cc68932a0f	[SimplifyCFG] FoldBranchToCommonDest(): bonus instrns must only be used by PHI nodes in successors (PR48450) In particular, if the successor block, which is about to get a new predecessor block, currently only has a single predecessor, then the bonus instructions will be directly used within said successor, which is fine, since the block with bonus instructions dominates that successor. But once there's a new predecessor, the IR is no longer valid, and we don't fix it, because we only update PHI nodes. This means the live-out bonus instructions must be exclusively used by the PHI nodes in successor blocks. So we have to form trivial PHI nodes, which will then be successfully updated to receive cloned bonus instns. This all works fine, except for the fact that we don't have access to the dominator tree, and we don't ignore unreachable code, so we sometimes do end up having to deal with some weird IR. Fixes https://bugs.llvm.org/show_bug.cgi?id=48450	2020-12-13 00:06:57 +03:00
Zarko Todorovski	1ca33f4559	[PPC] Check for PPC64 when emitting 64bit specific VSX nodes when pattern matching built vectors Some of the pattern matching in PPCInstrVSX.td and node lowering involving vectors assumes 64bit mode. This patch disables some of the unsafe pattern matching and lowering of BUILD_VECTOR in 32bit mode. Reviewed By: Xiangling_L Differential Revision: https://reviews.llvm.org/D92789	2020-12-12 15:28:28 -05:00
Nikita Popov	021d70e8fc	[CVP] Simplify and generalize switch handling CVP currently handles switches by checking an equality predicate on all edges from predecessor blocks. Of course, this can only work if the value being switched over is defined in a different block. Replace this implementation with a call to getPredicateAt(), which also does the predecessor edge predicate check (if not defined in the same block), but can also do quite a bit more: It can reason about phi-nodes by checking edge predicates for incoming values, it can reason about assumes, and it can reason about block values. As such, this makes the implementation both simpler and more powerful. The compile-time impact on CTMark is in the noise.	2020-12-12 21:12:27 +01:00
Nikita Popov	37db5c8083	[CVP] Add additional switch tests (NFC) These cover cases handled by getPredicateAt(), but not by the current implementation: * Assumes based on context instruction. * Value from phi node in same block (using per-pred reasoning). * Value from non-phi node in same block (using block-val reasoning).	2020-12-12 20:58:00 +01:00
Krzysztof Parzyszek	577e1b0232	[Hexagon] Reconsider getMask fix, return original mask, convert later The getPayload/getMask/getPassThrough functions should return values that could be composed into a masked load/store without any additional type casts. The previous fix violated that. Instead, convert scalar mask to a vector right before rescaling.	2020-12-12 13:27:22 -06:00
Tony	313d9ab376	[NFC][AMDGPU] AMDGPUUsage updates - Document which processors are supported by which runtimes. - Add missing mappings for code object V2 note records Differential Revision: https://reviews.llvm.org/D93016	2020-12-12 18:19:02 +00:00
Kazu Hirata	dc01966102	[Analysis/Interval] Remove isLoop (NFC) The last use of isLoop was removed on Apr 29, 2002 in commit 09bbb5c015c6e40b3d45da057f955ddb7c8f8485 as part of an effort to remove "old induction varaible cannonicalization pass built on top of interval analysis".	2020-12-12 10:09:35 -08:00
Kazu Hirata	aca797bfd1	[Transforms] Use is_contained (NFC)	2020-12-12 09:37:49 -08:00
Krzysztof Parzyszek	988ff0aa45	[Hexagon] Create vector masks for scalar loads/stores AlignVectors treats all loaded/stored values as vectors of bytes, and masks as corresponding vectors of booleans, so make getMask produce a 1-element vector for scalars from the start.	2020-12-12 11:12:17 -06:00
Harald van Dijk	90e4a4c68b	[UpdateTestChecks] Add --(no-)x86_scrub_sp option. This makes it possible to use update_llc_test_checks to manage tests that check for incorrect x86 stack offsets. It does not yet modify any test to make use of this new option.	2020-12-12 17:11:13 +00:00
Harald van Dijk	15a28c0a8f	[X86] Avoid data16 prefix for lea in x32 mode The ABI demands a data16 prefix for lea in 64-bit LP64 mode, but not in 64-bit ILP32 mode. In both modes this prefix would ordinarily be ignored, but the instructions may be changed by the linker to instructions that are affected by the prefix. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D93157	2020-12-12 17:05:24 +00:00
David Green	5ed162a969	[ARM] Add basic masked load/store costs This adds some basic MVE masked load/store costs, notably changing the cost of legal loads/stores to the MVECostFactor and the cost of scalarized instructions to 8*NumElts. Differential Revision: https://reviews.llvm.org/D86538	2020-12-12 15:26:32 +00:00
David Green	68a0f80009	[LV] Fix scalar cost for tail predicated loops When it comes to the scalar cost of any predicated block, the loop vectorizer by default regards this predication as a sign that it is looking at an if-conversion and divides the scalar cost of the block by 2, assuming it would only be executed half the time. This however makes no sense if the predication has been introduced to tail predicate the loop. Original patch by Anna Welker Differential Revision: https://reviews.llvm.org/D86452	2020-12-12 14:21:40 +00:00
Nikita Popov	cf4871bd6a	[BasicAA] Make non-equal index handling simpler to extend (NFC)	2020-12-12 15:00:47 +01:00
Nikita Popov	36c749f8f1	[BasicAA] Add tests for non-zero var index (NFC)	2020-12-12 15:00:46 +01:00
Luo, Yuanke	826a8b01a7	[X86] Add chain in ISel for x86_tdpbssd_internal intrinsic.	2020-12-12 21:14:38 +08:00
Nathan James	b4d64251fd	[YAML] Support extended spellings when parsing bools. Support all the spellings of boolean datatypes according to https://yaml.org/type/bool.html Reviewed By: silvas Differential Revision: https://reviews.llvm.org/D92755	2020-12-12 12:50:34 +00:00
David Green	109a6a32fa	[ARM] Test for showing scalar vector costs. NFC	2020-12-12 11:43:14 +00:00
Jan Svoboda	fa6c9b63ca	[clang][cli] Add flexible TableGen multiclass for boolean options This introduces more flexible multiclass for declaring two flags controlling the same boolean keypath. Compared to existing Opt{In,Out}FFlag multiclasses, the new syntax makes it easier to read option declarations and reason about the keypath. This also makes specifying common properties of both flags possible. I'm open to suggestions on the class names. Not 100% sure the benefits are worth the added complexity. Depends on D92774. Reviewed By: dexonsmith Differential Revision: https://reviews.llvm.org/D92775	2020-12-12 10:53:28 +01:00
Jan Svoboda	2bf0908f96	[clang][cli] Don't always emit -f[no-]legacy-pass-manager We don't need to always generate `-f[no-]experimental-new-pass-manager`. This patch does not change the behavior of any other command line flag. (For example `-triple` is still being always generated.) Reviewed By: dexonsmith, Bigcheese Differential Revision: https://reviews.llvm.org/D92857	2020-12-12 10:11:23 +01:00
Kazu Hirata	6f6801a6fa	[Analysis] Use is_contained (NFC)	2020-12-11 21:19:31 -08:00
Mircea Trofin	44ff9e7909	[MLGO] Fix build break as result of new InstructionCost (D91174)	2020-12-11 20:28:39 -08:00
Fangrui Song	506848f563	[llvm-cov gcov] Replace Donald B. Johnson's cycle enumeration with iterative cycle finding gcov computes the line execution count as the sum of (a) counts from predecessors on other lines and (b) the sum of loop execution counts of blocks on the same line (think of loops on one line). For (b), we use Donald B. Johnson's cycle enumeration algorithm and perform cycle cancelling for each cycle. The number of candidate cycles was exponential and D93036 made it polynomial by skipping zero count cycles. The time complexity is high (O(VE^2); it could be O(E^2), but the linear `Blocks` check made it higher) and the implementation is complex. We could just identify loops and sum all back edges. However, this requires a dominator tree construction which is more complex. The time complexity can be decreased to almost linear, though. This patch just performs cycle cancelling iteratively. Add two members `traversable` and `incoming` to GCOVArc. There are 3 states: * `!traversable`: blocks not on this line or explored blocks * `traversable && incoming == nullptr`: unexplored blocks * `traversable && incoming != nullptr`: blocks which are being explored (on the stack) If an arc points to a block being explored, a cycle has been found. Let E be the number of arcs. Every time a cycle is found, at least one arc is saturated (`edgeCount` reduced to 0), so there are at most E cycles. Finding one cycle takes O(E) time, so the overall time complexity is O(E^2). Note that we always augment through a back edge and never need to augment its reverse edge so reverse edges in traditional flow networks are not needed. Reviewed By: xinhaoyuan Differential Revision: https://reviews.llvm.org/D93073	2020-12-11 18:28:16 -08:00
Fangrui Song	3e1d1f4661	[Kaleidoscope] Migrate DebugInfo::get to DILocation::get	2020-12-11 18:01:04 -08:00
Jonas Paulsson	160287755d	Reapply "[SystemZFrameLowering] Don't overrwrite R1D (backchain) when probing." Fixed to properly compute the live-in lists of new blocks. Review: Ulrich Weigand Differential Revision: https://reviews.llvm.org/D92803	2020-12-11 18:25:47 -06:00
Jonas Paulsson	0962109bbf	[SystemZTTIImpl] Allow some non-prefetched accesses in getMinPrefetchStride(). The performance improvement on LBM previously achieved with improved software prefetching (36d4421) has recently been lost with e00f189. There now is one memory access in the loop that LoopDataPrefetch cannot handle (while before there was none) which the heuristic rejects. This patch adds a small margin by allowing 1 non-prefetched memory access for every 32 prefetched ones, so that the heuristic doesn't bail in this type of case. Review: Ulrich Weigand Differential Revision: https://reviews.llvm.org/D92985	2020-12-11 18:06:07 -06:00
diggerlin	071d4d3232	[AIX] Fixed a link error. Summary: "Speculative fix for link failure on bots" with a mention of "the clang-ppc64le-rhel bot fails on link: http://lab.llvm.org:8011/#/builders/57/builds/2307/steps/6/logs/stdio". PPCAsmPrinter.cpp:(.text._ZN12_GLOBAL__N_116PPCAIXAsmPrinter19emitFunctionBodyEndEv+0x2f8): undefined reference to `llvm::XCOFF::getNameForTracebackTableLanguageId(llvm::XCOFF::TracebackTable::LanguageID)' PPCAsmPrinter.cpp:(.text._ZN12_GLOBAL__N_116PPCAIXAsmPrinter19emitFunctionBodyEndEv+0x2170): undefined reference to `llvm::XCOFF::parseParmsType(unsigned int, unsigned int)'	2020-12-11 18:53:10 -05:00
Craig Topper	08d4def9b7	[LoopIdiomRecognize] Autogenerate complete checks for the X86 ctlz/cttz tests. NFC Preparation for D92745 which will add more tests to these files.	2020-12-11 15:35:37 -08:00
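
The iterative cycle cancelling described in commit 506848f563 above can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not the code from that patch: the names `Arc`, `augment_one_cycle`, and `count_line_cycles` are hypothetical, blocks are plain integers, and the `traversable` / `incoming` states are kept in per-block lists. It follows the three states from the commit message: explored blocks are marked non-traversable, unexplored blocks have no incoming arc, and blocks on the DFS stack have one.

```python
from dataclasses import dataclass

@dataclass
class Arc:
    src: int
    dst: int
    count: int  # remaining count on this arc; 0 means saturated

_START = object()  # sentinel "incoming" marker for the DFS root (assumption of this sketch)

def augment_one_cycle(start, succ, traversable, incoming):
    """Iterative DFS from `start`; cancel the first cycle found and return its bottleneck."""
    stack = [[start, 0]]          # pairs of (block, index of next outgoing arc)
    incoming[start] = _START
    while stack:
        u, i = stack[-1]
        if i == len(succ[u]):
            traversable[u] = False  # fully explored
            stack.pop()
            continue
        stack[-1][1] += 1
        arc = succ[u][i]
        v = arc.dst
        # Skip saturated arcs, explored blocks, and self arcs.
        if arc.count == 0 or not traversable[v] or v == u:
            continue
        if incoming[v] is None:     # unexplored: descend
            incoming[v] = arc
            stack.append([v, 0])
            continue
        # v is on the stack, so v -> ... -> u -> v is a cycle.
        cycle = [arc]
        w = u
        while w != v:
            cycle.append(incoming[w])
            w = incoming[w].src
        bottleneck = min(a.count for a in cycle)
        for a in cycle:
            a.count -= bottleneck   # saturates at least one arc
        return bottleneck
    return 0

def count_line_cycles(num_blocks, arc_triples):
    """Sum the bottleneck counts of all cycles cancelled among the given blocks."""
    arcs = [Arc(s, d, c) for s, d, c in arc_triples]
    succ = [[] for _ in range(num_blocks)]
    for a in arcs:
        succ[a.src].append(a)
    total = 0
    while True:
        traversable = [True] * num_blocks
        incoming = [None] * num_blocks
        found = 0
        for b in range(num_blocks):
            if traversable[b]:
                found = augment_one_cycle(b, succ, traversable, incoming)
                if found:
                    break
        if not found:
            return total
        total += found
```

Each successful call saturates at least one arc, so the outer loop runs at most E + 1 times, and each call takes O(E) time, which matches the O(E^2) bound stated in the commit message.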

208167 Commits