llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-22 18:54:02 +01:00

Author	SHA1	Message	Date
Josh Berdine	c64c5b211f	[OCaml] Add missing TypeKinds, Opcode, and AtomicRMWBinOps There are several enum values that have been added to LLVM-C that are missing from the OCaml bindings. The types defined in bindings/ocaml/llvm/llvm.ml should be in sync with the corresponding enum definitions in include/llvm-c/Core.h. The enum values are passed from C to OCaml unmodified, and clients of the OCaml bindings interpret them as tags of the corresponding OCaml types. So the only changes needed are to add the missing constructors to the type definitions, and to change the name of the maximum opcode in an assertion. Differential Revision: https://reviews.llvm.org/D98578	2021-03-16 15:32:38 +00:00
Joe Ellis	859445ea3f	[AArch64][SVE] Fold vector ZExt/SExt into gather loads where possible This commit folds sxtw'd or uxtw'd offsets into gather loads where possible with a DAGCombine optimization. As an example, the following code: 1 #include <arm_sve.h> 2 3 svuint64_t func(svbool_t pred, const int32_t *base, svint64_t offsets) { 4 return svld1sw_gather_s64offset_u64( 5 pred, base, svextw_s64_x(pred, offsets) 6 ); 7 } would previously lower to the following assembly: sxtw z0.d, p0/m, z0.d ld1sw { z0.d }, p0/z, [x0, z0.d] ret but now lowers to: ld1sw { z0.d }, p0/z, [x0, z0.d, sxtw] ret Differential Revision: https://reviews.llvm.org/D97858	2021-03-16 15:09:46 +00:00
Max Kazantsev	9f606251f1	[SCEV][NFC] Move check up the stack One of (and primary) callers of isBasicBlockEntryGuardedByCond is isKnownPredicateAt, which makes isKnownPredicate check before it. It already makes non-recursive check inside. So, on this execution path this check is made twice. The only other caller is isLoopEntryGuardedByCond. Moving the check there should save some compile time.	2021-03-16 22:09:17 +07:00
Craig Topper	eeed21c58e	[RISCV] Look through copies when trying to find an implicit def in addVSetVL. The InstrEmitter can sometimes insert a copy after an IMPLICIT_DEF before connecting it to the vector instruction. This occurs when constrainRegClass reduces to a class with less than 4 registers. I believe LMUL8 on masked instructions triggers this since the result can only use the v8, v16, or v24 register group as the mask is using v0. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D98567	2021-03-16 07:59:09 -07:00
David Zarzycki	f2b6f6c181	[lit testing] Mark reorder.py as unavailable on Windows The test file has embedded slashes. This is fine for normal users that are just recording and reordering paths, but not great when the trace data is committed back to a repository that should work on both Unix and Windows.	2021-03-16 10:54:06 -04:00
Joe Ellis	cab71b0250	[AArch64][SVEIntrinsicOpts] Factor out redundant SVE mul/fmul intrinsics This commit implements an IR-level optimization to eliminate idempotent SVE mul/fmul intrinsic calls. Currently, the following patterns are captured: fmul pg (dup_x 1.0) V => V mul pg (dup_x 1) V => V fmul pg V (dup_x 1.0) => V mul pg V (dup_x 1) => V fmul pg V (dup v pg 1.0) => V mul pg V (dup v pg 1) => V The result of this commit is that code such as: 1 #include <arm_sve.h> 2 3 svfloat64_t foo(svfloat64_t a) { 4 svbool_t t = svptrue_b64(); 5 svfloat64_t b = svdup_f64(1.0); 6 return svmul_m(t, a, b); 7 } will lower to a nop. This commit does not capture all possibilities; only the simple cases described above. There is still room for further optimisation. Differential Revision: https://reviews.llvm.org/D98033	2021-03-16 14:50:17 +00:00
Craig Topper	a9b14b0ccd	[RISCV] Improve i32 UADDSAT/USUBSAT on RV64. The default promotion uses zero extends that become shifts. We cam use sign extend instead which is better for RISCV. I've used two different implementations based on whether we have minu/maxu instructions. Differential Revision: https://reviews.llvm.org/D98683	2021-03-16 07:44:06 -07:00
Aaron Puchert	b376ef963f	Correct Doxygen syntax for inline code There is no syntax like {@code ...} in Doxygen, @code is a block command that ends with @endcode, and generally these are not enclosed in braces. The correct syntax for inline code snippets is @c <code>. Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D98665	2021-03-16 15:17:45 +01:00
LLVM GN Syncbot	24ae2470e3	[gn build] Port 9a5af541ee05	2021-03-16 14:03:53 +00:00
Simon Pilgrim	acf6a1b2db	[X86][SSE] canonicalizeShuffleWithBinOps - add PERMILPS/PERMILPD + PERMPD/PERMQ + INSERTPS handling. Bail if the INSERTPS would introduce zeros across the binop.	2021-03-16 13:52:08 +00:00
RamNalamothu	5a521663a1	[AMDGPU, NFC] Refactor FP/BP spill index code in emitPrologue/emitEpilogue Reviewed By: scott.linder Differential Revision: https://reviews.llvm.org/D98617	2021-03-16 19:19:45 +05:30
Simonas Kazlauskas	533dfa60d2	[InstSimplify] Match PtrToInt more directly in a GEP transform (NFC) In preparation for D98611, the upcoming change will need to apply additional checks to `P` and `V`, and so this refactor paves the way for adding additional checks in a less awkward way. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D98672	2021-03-16 15:45:19 +02:00
David Zarzycki	0a3d22f26f	[lit testing] Fix Windows reliability?	2021-03-16 09:11:41 -04:00
Max Kazantsev	4eef8d0048	[Test] Add test with loops guarded by trivial conditions	2021-03-16 19:46:36 +07:00
Max Kazantsev	97be3b8226	[Test] Update auto-generated checks	2021-03-16 19:39:45 +07:00
serge-sans-paille	ee3e917269	[NFC] Use SmallString instead of std::string for the AttrBuilder This avoids a few unnecessary conversion from StringRef to std::string, and a bunch of extra allocation thanks to the SmallString. Differential Revision: https://reviews.llvm.org/D98190	2021-03-16 13:34:14 +01:00
David Zarzycki	8f1e125e6b	[llvm-exegesis testing] Workaround unreliable test Picking an instruction at random is not perfectly reliable.	2021-03-16 08:00:14 -04:00
serge-sans-paille	9381bac6a8	[NFC] Replace loop by idiomatic llvm::find_if	2021-03-16 12:49:19 +01:00
Bjorn Pettersson	5e7e393b90	[TableGen/GlobalISel] Emit MI_predicate custom code for PatFrags (not only PatFrag) When GlobalISelEmitter::emitCxxPredicateFns emitted code for MI predicates it used "PatFrag" when searching for definitions. With this patch it will search for all "PatFrags" instead. Since PatFrag derives from PatFrags the difference is that we now include all definitions using PatFrags directly as well. Thus making it possible to use GISelPredicateCode together with a PatFrags definition. It might be noted that the matcher code was emitted also for PatFrags in the past. But then one ended up with errors since the custom code in testMIPredicate_MI was missing. Differential Revision: https://reviews.llvm.org/D98486	2021-03-16 12:44:09 +01:00
Sanjay Patel	0e17978276	[SLP] improve readability in reduction logic; NFC We had 2 different and ambiguously-named 'I' variables.	2021-03-16 07:35:13 -04:00
Markus Böck	1a004a81cb	[test] Make sure the test program in GetErrcMessages.cmake exits normally. If for some reason the test program does not exit normally it'd currently lead to a false positive and it's stdout output being assigned to the output variable. Instead, check the test program exited normally before assigning the process output to the out variable. Follow up on rGaf2796c76d2ff4b73165ed47959afd35a769beee Fixes an issue discovered post commit in https://reviews.llvm.org/D98278	2021-03-16 12:22:40 +01:00
Dmitry Preobrazhensky	92405c8fe8	[AMDGPU][MC] Disabled lds_direct for GFX90a Fixed bug 49382. Differential Revision: https://reviews.llvm.org/D98626	2021-03-16 13:52:36 +03:00
Markus Böck	848f365839	[test][NFC] Minor formatting and comment adjustments in GetErrcMessages.cmake These changes address post-commit review comments discussed in https://reviews.llvm.org/D98278	2021-03-16 11:08:57 +01:00
David Zarzycki	643090aa23	[lit] Sort test start times based on prior test timing data Lit as it exists today has three hacks that allow users to run tests earlier: 1) An entire test suite can set the `is_early` boolean. 2) A very recently introduced "early_tests" feature. 3) The `--incremental` flag forces failing tests to run first. All of these approaches have problems. 1) The `is_early` feature was until very recently undocumented. Nevertheless it still lacks testing and is a imprecise way of optimizing test starting times. 2) The `early_tests` feature requires manual updates and doesn't scale. 3) `--incremental` is undocumented, untested, and it requires modifying the source file system by "touching" the file. This "touch" based approach is arguably a hack because it confuses editors (because it looks like the test was modified behind the back of the editor) and "touching" the test source file doesn't work if the test suite is read only from the perspective of `lit` (via advanced filesystem/build tricks). This patch attempts to simplify and address all of the above problems. This patch formalizes, documents, tests, and defaults lit to recording the execution time of tests and then reordering all tests during the next execution. By reordering the tests, high core count machines run faster, sometimes significantly so. This patch also always runs failing tests first, which is a positive user experience win for those that didn't know about the hidden `--incremental` flag. Finally, if users want, they can _optionally_ commit the test timing data (or a subset thereof) back to the repository to accelerate bots and first-time runs of the test suite. Reviewed By: jhenderson, yln Differential Revision: https://reviews.llvm.org/D98179	2021-03-16 05:23:04 -04:00
serge-sans-paille	39f31f66e4	[NFC] Wisely nest dyn_cast in FunctionLoweringInfo Take advantage of the inheritance tree to avoid a few comparison.	2021-03-16 10:22:44 +01:00
Caroline Concatto	cb0c62b65c	[SVE][LoopVectorize] Add support for scalable vectorization of loops with vector reverse This patch adds support for reverse loop vectorization. It is possible to vectorize the following loop: ``` for (int i = n-1; i >= 0; --i) a[i] = b[i] + 1.0; ``` with fixed or scalable vector. The loop-vectorizer will use 'reverse' on the loads/stores to make sure the lanes themselves are also handled in the right order. This patch adds support for scalable vector on IRBuilder interface to create a reverse vector. The IR function CreateVectorReverse lowers to experimental.vector.reverse for scalable vector and keedp the original behavior for fixed vector using shuffle reverse. Differential Revision: https://reviews.llvm.org/D95363	2021-03-16 07:51:59 +00:00
Amara Emerson	7917eb608e	[AArch64][GlobalISel] Fix crash on lowering <1 x half> types.	2021-03-15 23:27:43 -07:00
wlei	6bd02e9a2a	[CSSPGO][llvm-profgen] Fix getCanonicalFnName usage in llvm-profgen Previously we didn't support to keep the unique linkage name(-funique-internal-linkage-name) in llvm-profgen. As discussed in https://reviews.llvm.org/D96932, we choose to do canonicalization for it. Now since "selected" is set as the default parameter of getCanonicalFnName in `D96932`, we don't need to add any attribute here for the previous usage and only fix the missing usage in the pseudo probe decoding. Differential Revision: https://reviews.llvm.org/D98226	2021-03-15 21:00:42 -07:00
Johannes Doerfert	60c629d360	[NVPTX] CUDA does provide malloc/free since compute capability 2.X https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#dynamic-global-memory-allocation-and-operations Reviewed By: tra Differential Revision: https://reviews.llvm.org/D98606	2021-03-15 22:45:56 -05:00
Josh Berdine	984ad28c40	[OCaml][test] Fix Bindings/OCaml/executionengine.ml test It seems that at some point it became necessary to pass `-thread` to the ocaml compiler for this test. Differential Revision: https://reviews.llvm.org/D98593	2021-03-16 02:48:36 +00:00
LLVM GN Syncbot	b19c601bda	[gn build] Port 4f198b0c27b0	2021-03-16 02:41:16 +00:00
Bing1 Yu	ab2b029d8f	[X86] Pass to transform amx intrinsics to scalar operation. This pass runs in any situations but we skip it when it is not O0 and the function doesn't have optnone attribute. With -O0, the def of shape to amx intrinsics is near the amx intrinsics code. We are not able to find a point which post-dominate all the shape and dominate all amx intrinsics. To decouple the dependency of the shape, we transform amx intrinsics to scalar operation, so that compiling doesn't fail. In long term, we should improve fast register allocation to allocate amx register. Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D93594	2021-03-16 10:40:22 +08:00
David Blaikie	11757a1309	Skip path separators to make the test portable across Win/Linux	2021-03-15 18:24:40 -07:00
Petr Hosek	fb12fb1321	[CMake] Clean up unnecessary dependency The LINK_COMPONENTS dependency between DebugInfoCodeView and DebugInfoMSF is unnecessary. Breaking them would allow a more fine-controlled distribution. Patch By: dangyi Differential Revision: https://reviews.llvm.org/D98465	2021-03-15 16:29:16 -07:00
LLVM GN Syncbot	4791deff1c	[gn build] Port ecf6466f01c5	2021-03-15 23:01:19 +00:00
Lang Hames	4b389e1c7b	[JITLink][MachO][x86-64] Introduce generic x86-64 support. This patch introduces generic x86-64 edge kinds, and refactors the MachO/x86-64 backend to use these edge kinds. This simplifies the implementation of the MachO/x86-64 backend and makes it possible to write generic x86-64 passes and utilities. The new edge kinds are different from the original set used in the MachO/x86-64 backend. Several edge kinds that were not meaningfully distinguished in that backend (e.g. the PCRelMinusN edges) have been merged into single edge kinds in the new scheme (these edge kinds can be reintroduced later if we find a use for them). At the same time, new edge kinds have been introduced to convey extra information about the state of the graph. E.g. The RequestAndTransformTo* edges represent GOT/TLVP relocations prior to synthesis of the GOT/TLVP entries, and the 'Relaxable' suffix distinguishes edges that are candidates for optimization from edges which should be left as-is (e.g. to enable runtime redirection). ELF/x86-64 will be refactored to use these generic edges at some point in the future, and I anticipate a similar refactor to create a generic arm64 support header too. Differential Revision: https://reviews.llvm.org/D98305	2021-03-15 15:43:07 -07:00
Nico Weber	b7c9aa8059	[gn build] merge af2796c76d2f a bit more The default is fine on non-Win, but on Win this needs an explicit setting now that lit no longer has the right default.	2021-03-15 18:20:54 -04:00
Alexander Yermolovich	0440613b8e	[DWARF] Check for AddrOffsetSectionBase to work with DWO Units. Context: https://lists.llvm.org/pipermail/llvm-dev/2021-February/148521.html A fix for llvm-symbolizer, and other tools like BOLT, that allows retrieving address when built with -gsplit-dwarf=single mode. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D96827	2021-03-15 14:46:09 -07:00
Artem Belevich	7a17da6eb6	[NVPTX] Avoid temp copy of byval kernel parameters. Avoid making a temporary copy of byval argument if all accesses are loads and therefore the pointer to the parameter can not escape. This avoids excessive global memory accesses when each kernel makes its own copy. Differential revision: https://reviews.llvm.org/D98469	2021-03-15 14:27:22 -07:00
Nick Lewycky	40de5cf96a	NFC: Formatting changes. Run clang-format over these files. Capitalize some variable names per clang-tidy's request. Pulled out to simplify review of D98302.	2021-03-15 14:26:39 -07:00
Stanislav Mekhanoshin	17050632cf	[AMDGPU] Fix copyPhysReg to not produce unalined vgpr access RA can insert something like a sub1_sub2 COPY of a wide VGPR tuple which results in the unaligned acces with v_pk_mov_b32 after the copy is expanded. This is regression after D97316. Differential Revision: https://reviews.llvm.org/D98549	2021-03-15 14:14:30 -07:00
Florian Hahn	1aed9ced3b	[AnnotationRemarks] Remove unneeded Function.h include (NFC).	2021-03-15 21:09:35 +00:00
Nico Weber	19f76964c4	[gn build] merge 9bcf0eff99	2021-03-15 17:05:05 -04:00
Nico Weber	f75a03ef22	[gn build] kind of merge af2796c76d2f Good enough for now. If we need more, we'll do the usual platform-dependent hardcoding that in practice works for everything else too.	2021-03-15 17:01:00 -04:00
Stanislav Mekhanoshin	fc6febe595	[AMDGPU] Fixed msan failure with uninitialized value	2021-03-15 13:58:19 -07:00
Kirill Bobyrev	da5206f100	[clangd] Optionally add reflection for clangd-index-server This was originally landed without the optional part and reverted later: `8080ea4c4b` Reviewed By: kadircet Differential Revision: https://reviews.llvm.org/D98404	2021-03-15 21:07:25 +01:00
Markus Böck	9badf35f5b	Revert line accidentally included in af2796c76d2ff4b73165ed47959afd35a769beee	2021-03-15 21:03:46 +01:00
Sanjay Patel	677d887642	[SLP] update stale test comments; NFC These bugs were fixed with 0a8e7ca402eb	2021-03-15 16:02:46 -04:00
Stanislav Mekhanoshin	196e7f3138	[AMDGPU] Use single cache policy operand Replace individual operands GLC, SLC, and DLC with a single cache_policy bitmask operand. This will reduce the number of operands in MIR and I hope the amount of code. These operands are mostly 0 anyway. Additional advantage that parser will accept these flags in any order unlike now. Differential Revision: https://reviews.llvm.org/D96469	2021-03-15 13:00:59 -07:00
Markus Böck	1be4884f17	[test] Add ability to get error messages from CMake for errc substitution Visual Studios implementation of the C++ Standard Library does not use strerror to produce a message for std::error_code unlike other standard libraries such as libstdc++ or libc++ that might be used. This patch adds a cmake script that through running a C++ program gets the error messages for the POSIX error codes and passes them onto lit through an optional config parameter. If the config parameter is not set, or getting the messages failed, due to say a cross compiling configuration without an emulator, it will fall back to using pythons strerror functions. Differential Revision: https://reviews.llvm.org/D98278	2021-03-15 20:56:08 +01:00

1 2 3 4 5 ...

212770 Commits