llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-26 04:32:44 +01:00

Author	SHA1	Message	Date
Ilya Leoshkevich	3a1b644428	[TSan] Add SystemZ longjmp support Implement the interceptor and stack pointer demangling. Reviewed By: dvyukov Differential Revision: https://reviews.llvm.org/D105629	2021-07-15 12:18:48 +02:00
Ilya Leoshkevich	b0ff945d9d	[TSan] Use zeroext for function parameters SystemZ ABI requires zero-extending function parameters to 64-bit. The compiler is free to optimize the code around this assumption, e.g. failing to zero-extend __tsan_atomic32_load()'s morder may cause crashes in to_mo() switch table lookup. Fix by adding zeroext attributes to TSan's FunctionCallees, similar to how it was done in commit 3bc439bdff8b ("[MSan] Add instrumentation for SystemZ"). This is a no-op on arches that don't need it. Reviewed By: dvyukov Differential Revision: https://reviews.llvm.org/D105629	2021-07-15 12:18:47 +02:00
Max Kazantsev	4863909bb0	[Test] We can benefit from pipelining of ymm load/stores This patch demonstrates a scenario when we need to load/store a single 64-byte value, which is done by 2 ymm loads and stores in AVX. The current codegen choses the following sequence: load ymm0 load ymm1 store ymm1 store ymm0 If we instead stored ymm0 before ymm1, we could execute 2nd load and 1st store in parallel.	2021-07-15 17:15:14 +07:00
Cullen Rhodes	c55c74c634	[AArch64][SME] Add outer product instructions This patch adds support for the following outer product instructions: * BFMOPA, BFMOPS, FMOPA, FMOPS, SMOPA, SMOPS, SUMOPA, SUMOPS, UMOPA, UMOPS, USMOPA, USMOPS. Depends on D105570. The reference can be found here: https://developer.arm.com/documentation/ddi0602/2021-06 Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D105571	2021-07-15 09:51:06 +00:00
Florian Mayer	f4bb75377e	[NFC] [hwasan] Split argument logic into functions. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D105971	2021-07-15 10:45:43 +01:00
Bogdan Graur	baec013412	Fixes memory sanitizer 'use-of-uninitialized-value' diagnostic. Differential Revision: https://reviews.llvm.org/D106047	2021-07-15 11:17:04 +02:00
Timm Bäder	0108938558	[llvm][tools] Hide unrelated llvm-bcanalyzer options They otherwise show up when we link against the dynamic libLLVM.so. Differential Revision: https://reviews.llvm.org/D105893	2021-07-15 10:43:15 +02:00
LLVM GN Syncbot	091d61e196	[gn build] Port b0d38ad0bc25	2021-07-15 07:50:35 +00:00
Djordje Todorovic	140c795b75	[2/2][RemoveRedundantDebugValues] Add a Pass that removes redundant DBG_VALUEs This patch adds the forward scan for finding redundant DBG_VALUEs. This analysis aims to remove redundant DBG_VALUEs by going forward in the basic block by considering the first DBG_VALUE as a valid until its first (location) operand is not clobbered/modified. For example: (1) DBG_VALUE $edi, !"var1", ... (2) <block of code that does affect $edi> (3) DBG_VALUE $edi, !"var1", ... ... in this case, we can remove (3). Differential Revision: https://reviews.llvm.org/D105280	2021-07-15 00:08:31 -07:00
Tony Tye	57b2fbab2e	[AMDGPU] Reserve AMDGPU ELF e_flags machine 0x44 Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D106034	2021-07-15 06:46:27 +00:00
Chuanqi Xu	ca13ea7edf	[Coroutines] Run coroutine passes by default This patch make coroutine passes run by default in LLVM pipeline. Now the clang and opt could handle IR inputs containing coroutine intrinsics without special options. It should be fine. On the one hand, the coroutine passes seems to be stable since there are already many projects using coroutine feature. On the other hand, the coroutine passes should do nothing for IR who doesn't contain coroutine intrinsic. Test Plan: check-llvm Reviewed by: lxfind, aeubanks Differential Revision: https://reviews.llvm.org/D105877	2021-07-15 14:33:40 +08:00
Kuter Dinel	b86b597e6c	[Attributor] AACallEdges, Add a way to ask nonasm unknown callees This patch adds a feature to AACallEdges AbstractAttribute that allows users to ask if there is a unknown callee that isn't a inline assembly. This feature is needed by some of it's users. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D105992	2021-07-15 06:10:42 +03:00
Chen Zheng	90f3c2e043	[PowerPC][NFC] add testcase for update-form preparation with non-const increment	2021-07-15 02:46:24 +00:00
LLVM GN Syncbot	9865f77f6d	[gn build] Port b9c3941cd61d	2021-07-15 01:12:36 +00:00
Kai Luo	bb52bc77a5	[PowerPC] Generate inlined quadword lock free atomic operations via AtomicExpand This patch uses AtomicExpandPass to implement quadword lock free atomic operations. It adopts the method introduced in https://reviews.llvm.org/D47882, which expand atomic operations post RA to avoid spilling that might prevent LL/SC progress. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D103614	2021-07-15 01:12:09 +00:00
Kuter Dinel	bfdc40f15b	[AMDGPU] Use update_test_checks.py script for annotate kernel features tests. This patch makes the annotate kernel features tests use the update_tests_checks.py script. Which makes it easy to update the tests. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D105864	2021-07-15 03:13:37 +03:00
Thomas Lively	3c50e4a7a7	[WebAssembly] Codegen for v128.storeX_lane instructions Replace the experimental clang builtins and LLVM intrinsics for these instructions with normal codegen patterns. Resolves PR50435. Differential Revision: https://reviews.llvm.org/D106019	2021-07-14 16:15:25 -07:00
Jon Roelofs	daf8a095a1	[GlobalOpt] Fix a miscompile when evaluating struct initializers. The bug was that evaluateBitcastFromPtr attempts a narrowing to a struct's 0th element of a store that covers other elements. While this is okay on the load side, applying it to stores causes us to miss the writes to the additionally covered elements. rdar://79503568 Differential revision: https://reviews.llvm.org/D105838	2021-07-14 15:37:01 -07:00
Steven Wu	d31e29ae39	[Support] Turn on SupportTest for Apple Silicon Follow up for D106012, turn on unittest for Host on Apple Silicon. Reviewed By: dexonsmith Differential Revision: https://reviews.llvm.org/D106020	2021-07-14 15:24:56 -07:00
Arthur Eubanks	f5788bb9c7	[docs][OpaquePtr] Remove finished task	2021-07-14 14:36:41 -07:00
Wolfgang Pieb	bf7a422513	[ARM] Fix RELA relocations for 32bit ARM. RELA relocations for 32 bit ARM ignored the addend. Some tools generate them instead of REL type relocations. This fixes PR50473. Reviewed By: MaskRay, peter.smith Differential Revision: https://reviews.llvm.org/D105214	2021-07-14 14:27:15 -07:00
Derek Schuff	f708d3928c	[llvm-strip][WebAssembly] Support strip flags Summary: Add support for the basic section stripping (and keeping) flags for wasm: strip with no flags, --strip-all, --strip-debug, --only-section, --keep-section, and --only-keep-debug. Factor section removal into a function and use a predicate chain like the ELF implementation. Reviewers: jhenderson, sbc100 Differential Revision: https://reviews.llvm.org/D73820	2021-07-14 14:17:02 -07:00
Arthur Eubanks	a8da5bdd63	Precommit test for D106017	2021-07-14 14:14:49 -07:00
Arthur Eubanks	9d99a13a85	[SimpleLoopUnswitch] Don't non-trivially unswitch loops with catchswitch exits SplitBlock() can't handle catchswitch. Fixes PR50973. Reviewed By: aheejin Differential Revision: https://reviews.llvm.org/D105672	2021-07-14 14:07:28 -07:00
Jon Roelofs	ffc7470172	[AArch64] Fix selection of G_UNMERGE <2 x s16> Differential revision: https://reviews.llvm.org/D106007	2021-07-14 13:40:56 -07:00
Philip Reames	99a7d1e6cf	[tests] Stablize tests for possible change in deref semantics This is conceptually part of e75a2dfe. This file contains both tests whose results don't change (with the right attributes added), and tests which fundementally regress with the current proposal. Doing the update took some care, thus the seperate change. Here's the e75a2dfe context repeated: There's a potential change in dereferenceability attribute semantics in the nearish future. See llvm-dev thread "RFC: Decomposing deref(N) into deref(N) + nofree" and D99100 for context. This change simply adds appropriate attributes to tests to keep transform logic exercised under both old and new/proposed semantics. Note that for many of these cases, O3 would infer exactly these attributes on the test IR. This change handles the idiomatic pattern of a dereferenceable object being passed to a call which can not free that memory. There's a couple other tests which need more one-off attention, they'll be handled in another change.	2021-07-14 13:37:50 -07:00
Steven Wu	9c64643022	[Support] Get correct number of physical cores on Apple Silicon Fix a bug that `computeHostNumPhysicalCores` is fallback to default unknown when building for Apple Silicon macs. rdar://80533675 Reviewed By: arphaman Differential Revision: https://reviews.llvm.org/D106012	2021-07-14 13:29:54 -07:00
Philip Reames	652d77cf2f	Global variables with strong definitions cannot be freed With the current deref semantics, this is redundant - since we assume that anything which is dereferenceable (ever) can't be freed - but it becomes neccessary for the deref-at-point semantics. Testing wise, this is covered by test/CodeGen/X86/hoist-invariant-load.ll when -use-dereferenceable-at-point-semantics is active. I didn't bother duplicating the command line since a) it's an in-development mode, and b) the change is pretty obvious.	2021-07-14 13:26:18 -07:00
Philip Reames	b12ecd2ebd	[tests] Stablize tests for possible change in deref semantics There's a potential change in dereferenceability attribute semantics in the nearish future. See llvm-dev thread "RFC: Decomposing deref(N) into deref(N) + nofree" and D99100 for context. This change simply adds appropriate attributes to tests to keep transform logic exercised under both old and new/proposed semantics. Note that for many of these cases, O3 would infer exactly these attributes on the test IR. This change handles the idiomatic pattern of a dereferenceable object being passed to a call which can not free that memory. There's a couple other tests which need more one-off attention, they'll be handled in another change.	2021-07-14 13:05:43 -07:00
Stanislav Mekhanoshin	1dae83cabb	[AMDGPU] Add TII::isIgnorableUse() to allow VOP rematerialization Any def of EXEC prevents rematerialization of any VOP instruction because of the physreg use. Create a callback to check if the physreg use can be ingored to allow rematerialization. Differential Revision: https://reviews.llvm.org/D105836	2021-07-14 13:03:58 -07:00
Alexey Bataev	79d83636b9	[SLP][NFC]Fix variables names, NFC.	2021-07-14 12:43:45 -07:00
Fangrui Song	f00933b0e4	[docs] Fix :option:`--file-header` reference in llvm-readelf.rst after D105532	2021-07-14 12:39:22 -07:00
Simon Pilgrim	dd97f9edd3	[SLP] Fix case of variable name. NFCI.	2021-07-14 20:20:04 +01:00
Roman Lebedev	60d35e606c	[NFC] Drop redundant check prefixes in newly added test file	2021-07-14 22:14:36 +03:00
Nikita Popov	0b8ca13018	[Attributes] Use single method to fetch type from AttributeSet (NFC) While it is nice to have separate methods in the public AttributeSet API, we can fetch the type from the internal AttributeSetNode using a generic API for all type attribute kinds.	2021-07-14 21:10:56 +02:00
Roman Lebedev	c64b97d8cf	[NFC][PhaseOrdering] Add test for the lack of CSE after SimplifyCFG (PR51092)	2021-07-14 22:07:38 +03:00
David Green	5da5b28644	[ARM] Move add(VMLALVA(A, X, Y), B) to VMLALVA(add(A, B), X, Y) For i64 reductions we currently try and convert add(VMLALV(X, Y), B) to VMLALVA(B, X, Y), incorporating the addition into the VMLALVA. If we have an add of an existing VMLALVA, this patch pushes the add up above the VMLALVA so that it may potentially be simplified further, for example being folded into another VMLALV. Differential Revision: https://reviews.llvm.org/D105686	2021-07-14 20:06:49 +01:00
Nikita Popov	b227a46d3d	[Verifier] Improve incompatible attribute type check A couple of attributes had explicit checks for incompatibility with pointer types. However, this is already handled generically by the typeIncompatible() check. We can drop these after adding SwiftError to typeIncompatible(). However, the previous implementation of the check prints out all attributes that are incompatible with a given type, even though those attributes aren't actually used. This has the annoying result that the error message changes every time a new attribute is added to the list. Improve this by explicitly finding which attribute isn't compatible and printing just that.	2021-07-14 21:02:10 +02:00
Saleem Abdulrasool	ba012f02f8	Demangle: correct swift_async demangling for Microsoft scheme The emission was corrected for the swift_async calling convention but the demangling support was not. This repairs the demangling support as well.	2021-07-14 11:43:44 -07:00
Eli Friedman	0af449d2a7	[SelectionDAG] Add an overload of getStepVector that assumes step 1. This is mostly a minor convenience, but the pattern seems frequent enough to be worthwhile (and we'll probably add more uses in the future). Differential Revision: https://reviews.llvm.org/D105850	2021-07-14 11:37:01 -07:00
Thomas Lively	cf44692539	[WebAssembly] Codegen for v128.loadX_lane instructions Replace the experimental clang builtin and LLVM intrinsics for these instructions with normal codegen patterns. Resolves PR50433. Differential Revision: https://reviews.llvm.org/D105950	2021-07-14 11:31:53 -07:00
Thomas Lively	81bb5f99ad	[WebAssembly] Remove datalayout strings from llc tests The data layout strings do not have any effect on llc tests and will become misleadingly out of date as we continue to update the canonical data layout, so remove them from the tests. Differential Revision: https://reviews.llvm.org/D105842	2021-07-14 11:17:08 -07:00
David Green	bb9a6d3734	[ARM] Lower v16i8 -> i64 VMLA reductions. MVE does not have a VMLALV instruction that can perform v16i8 -> i64 reductions, like it does for v8i16->i64 and v4i32->i64 reductions. That means that the pattern to create them will be spilt up by type legalization, creating a lot of instructions. This extends the patterns for matching i64 reductions a little to handle the v16i8->i64 case. We need to turn them into a pair of v8i16->i64 VMLALVs that each perform half of the reduction and are summed together (so the later is a VMLALVA). The order of the lanes does not matter for the reduction so we generate a MVEEXT for the extension, that will either be folded into a extending load or can be optimized to a VREV/VMOVL. Some of the resulting codegen isn't optimal, but will be improved in a later patch. Differential Revision: https://reviews.llvm.org/D105680	2021-07-14 18:11:32 +01:00
Sanjay Patel	cd35b5a262	[InstCombine] reorder icmp with offset folds for better results This set of folds was added recently with: c7b658aeb526 0c400e895306 40b752d28d95 ...and I noted that this wasn't likely to fire in code derived from C/C++ source because of nsw in particular. But I didn't notice that I had placed the code above the no-wrap block of transforms. This is likely the cause of regressions noted from the previous commit because -- as shown in the test diffs -- we may have transformed into a compare with an arbitrary constant rather than a simpler signbit test.	2021-07-14 12:12:05 -04:00
Sanjay Patel	9552511c41	[InstCombine] add tests for icmp with constant offset and no-wrap flags; NFC	2021-07-14 12:12:05 -04:00
Sander de Smalen	0fddd6dc0c	[LV] Print remark when loop cannot be vectorized due to invalid costs. This patch emits remarks for instructions that have invalid costs for a given set of vectorization factors. Some example output: t.c:4:19: remark: Instruction with invalid costs prevented vectorization at VF=(vscale x 1): load dst[i] = sinf(src[i]); ^ t.c:4:14: remark: Instruction with invalid costs prevented vectorization at VF=(vscale x 1, vscale x 2, vscale x 4): call to llvm.sin.f32 dst[i] = sinf(src[i]); ^ t.c:4:12: remark: Instruction with invalid costs prevented vectorization at VF=(vscale x 1): store dst[i] = sinf(src[i]); ^ Reviewed By: fhahn, kmclaughlin Differential Revision: https://reviews.llvm.org/D105806	2021-07-14 17:11:33 +01:00
Matt Arsenault	496d9d6a80	GlobalISel: Handle lowering non-power-of-2 extloads	2021-07-14 11:54:11 -04:00
Sander de Smalen	d616b58c4d	[CostModel][AArch64] Make loads/stores of <vscale x 1 x eltty> invalid. At the moment, <vscale x 1 x eltty> are not yet fully handled by the code-generator, so to avoid vectorizing loops with that VF, we mark the cost for these types as invalid. The reason for not adding a new "TTI::getMinimumScalableVF" is because the type is supposed to be a type that can be legalized. It partially is, although the support for these types need some more work. Reviewed By: paulwalker-arm, dmgreen Differential Revision: https://reviews.llvm.org/D103882	2021-07-14 16:44:22 +01:00
Jay Foad	954a1c2142	[AMDGPU] Check llc-pipeline.ll with -match-full-lines -strict-whitespace This prevents breaking the indentation that shows the structure of the pass managers. Differential Revision: https://reviews.llvm.org/D105891	2021-07-14 16:33:50 +01:00
Alexey Bataev	328393368a	[SLP]Workaround for InsertSubVector cost. The cost of the InsertSubvector shuffle kind cost is not complete and may end up with just extracts + inserts costs in many cases. Added a workaround to represent it as a generic PermuteSingleSrc, which is still pessimistic but better than InsertSubvector. Differential Revision: https://reviews.llvm.org/D105827	2021-07-14 07:54:24 -07:00

1 2 3 4 5 ...

218525 Commits