llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-25 04:02:41 +01:00

Author	SHA1	Message	Date
Alexandros Lamprineas	a50e569197	[AArch64] Add a Machine Value Type for 8 consecutive registers Adds MVT::i64x8, a Machine Value Type needed for lowering inline assembly operands which materialize a sequence of eight general purpose registers. Differential Revision: https://reviews.llvm.org/D94096	2021-08-02 15:45:58 +01:00
Jeremy Morse	cd0096f439	[DebugInfo][InstrRef] Don't break up ret-sequences on debug-info instrs When we have a terminator sequence (i.e. a tailcall or return), MIIsInTerminatorSequence is used to work out where the preceding ABI-setup instructions end, i.e. the parts that were glued to the terminator instruction. This allows LLVM to split blocks safely without having to worry about ABI stuff. The function only ignores DBG_VALUE instructions, meaning that the two debug instructions I recently added can end terminator sequences early, causing various MachineVerifier errors. This patch promotes the test for debug instructions from "isDebugValue" to "isDebugInstr", thus avoiding any debug-info interfering with this function. Differential Revision: https://reviews.llvm.org/D106660 (cherry picked from commit 8612417e5a54cfef941ab45de55e48b4a0c4e8b4)	2021-07-29 15:08:13 +01:00
Bradley Smith	183b0c7c98	[AArch64][SVE] Fix incorrect mask type when lowering fixed type SVE gather/scatter An incorrect mask type when lowering an SVE gather/scatter was causing a codegen fault which manifested as the incorrect predicate size being used for an SVE gather/scatter (e.g. p0.b rather than p0.d). Fixes PR51182. Differential Revision: https://reviews.llvm.org/D106943 (cherry picked from commit 191831e380f317cd2baa5d48abe02d1d11cd44cb)	2021-07-29 07:03:40 -07:00
Diana Picus	923213f844	test-release.sh: Kill python2 Don't prefer python2's virtualenv when setting up the test-suite. Always use python3 instead, since that's what we support everywhere else anyway. Differential Revision: https://reviews.llvm.org/D106941	2021-07-29 10:28:39 +02:00
Chris Jackson	9a10dd5b1c	Revert "[DebugInfo][LoopStrengthReduction] SCEV-based salvaging for LSR" This was reverted due to a reported crash. This reverts commit 796b84d26f4d461fb50e7b4e84e15a10eaca88fc.	2021-07-29 00:04:50 +01:00
Valentin Clement	c463fa6cad	[mlir][openacc] Initial translation for DataOp to LLVM IR Add basic translation of acc.data to LLVM IR with runtime calls. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D104301	2021-07-27 22:04:04 -04:00
Jose M Monsalve Diaz	5b7208da36	[OpenMP] Folding threadLimit and numThreads when single value in kernels The device runtime contains several calls to `__kmpc_get_hardware_num_threads_in_block` and `__kmpc_get_hardware_num_blocks`. If the thread_limit and the num_teams are constant, these calls can be folded to the constant value. In this patch we use the already introduced `AAFoldRuntimeCall` and the `NumTeams` and `NumThreads` kernel attributes (to be introduced in a different patch) to fold these functions. The code checks all the kernels, and if their attributes match, the functions are folded. In the future we will explore specializing for multiple values of NumThreads and NumTeams. Depends on D106390 Reviewed By: jdoerfert, JonChesterfield Differential Revision: https://reviews.llvm.org/D106033	2021-07-27 21:47:12 -04:00
George Burgess IV	66a78bc217	llvm/utils: guarantee revert_checker's revert ordering At the moment, the revert ordering from this tool is unspecified (though it happens to be in `git log` order, so newest reverts come first). From the standpoint of tooling and users, this seems to be the opposite of what we want by default: tools and users will generally try to apply these reverts as cherry-picks. If two reverts in the list are close enough to each other, if the reverts get applied out of order, we'll get a merge conflict. Rather than having `reverse`s for all tools (and mental reverses for manual users), just guarantee an oldest-first output ordering for this function. Differential Revision: https://reviews.llvm.org/D106838	2021-07-28 00:51:05 +00:00
Juneyoung Lee	bab6d38daf	[DAGCombiner] Fold SETCC(FREEZE(x),const) to FREEZE(SETCC(x,const)) if SETCC is used by BRCOND This patch adds a peephole optimization `SETCC(FREEZE(x),const)` => `FREEZE(SETCC(x,const))` if the SETCC is only used by BRCOND. Combined with `BRCOND(FREEZE(X)) => BRCOND(X)`, this leads to a nice improvement in the generated assembly when x is a masked loaded value. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D105344	2021-07-28 09:22:15 +09:00
Juneyoung Lee	2e57fe1d7d	Precommit test files for D105344 (NFC)	2021-07-28 09:21:55 +09:00
Xiang1 Zhang	a6d5003afd	[X86] Fix lowering to illegal type in LowerINSERT_VECTOR_ELT Differential Revision: https://reviews.llvm.org/D106780	2021-07-28 08:16:59 +08:00
Johannes Doerfert	9b53e594b4	Reapply "[Attributor] Disable simplification AAs if a callback is present"" This reapplies commit cbb709e25124dc38ee593882051fc88c987fe591 and includes the use of the lookup method instead of operator[] to avoid accidentally setting (empty) simplification callbacks. This reverts commit aa27430a625b2fd059707a87f8ba2df8f480ff11.	2021-07-27 19:14:50 -05:00
Xiang1 Zhang	5d447ad589	Revert "[X86] Fix lowering to illegal type in LowerINSERT_VECTOR_ELT" This reverts commit 6ff73efea94621e74642e4d7a15cc86a5fb6d411.	2021-07-28 08:12:29 +08:00
Mehdi Amini	115164b68b	Add llvm::equal convenient wrapper for ranges around std::equal Differential Revision: https://reviews.llvm.org/D106913	2021-07-28 00:10:22 +00:00
Xiang1 Zhang	409f0eedd6	[X86] Fix lowering to illegal type in LowerINSERT_VECTOR_ELT	2021-07-28 08:08:30 +08:00
Krzysztof Parzyszek	60850cdc6a	[Hexagon] Fix resetting dead registers in DBG_VALUE_LISTs This fixes https://llvm.org/PR51229.	2021-07-27 18:36:28 -05:00
LLVM GN Syncbot	803630dbd1	[gn build] Port 8a48e6dda9f7	2021-07-27 23:10:20 +00:00
Johannes Doerfert	96f821bd7b	Revert "[Attributor] Disable simplification AAs if a callback is present" This reverts commit cbb709e25124dc38ee593882051fc88c987fe591 as it breaks the tests, which was not supposed to happen. Investigating now.	2021-07-27 18:09:42 -05:00
Johannes Doerfert	0042081f58	[Attributor] Verify `checkForAllUses` return value properly Also do not emit more than one remark after Heap2Stack failed.	2021-07-27 17:50:27 -05:00
Johannes Doerfert	0b6fb7b1ef	[Attributor] Disable simplification AAs if a callback is present AAValueSimplify, AAValueConstantRange, and AAPotentialValues all look at the IR by default. If queried for a IR position which has a simplification callback we should either look at the callback return, or give up. We do the latter for now.	2021-07-27 17:50:26 -05:00
James Y Knight	9aacda281c	Fix test/Transforms/LoopVectorize/AArch64/strict-fadd-vf1.ll. It was writing to the source directory (which may not be writeable), rather than using %t. Fixes: a5dd6c6cf935 ("[LoopVectorize] Don't interleave scalar ordered reductions for inner loops")	2021-07-27 18:32:29 -04:00
Nico Weber	091e720912	[gn build] manually port 71909de37495	2021-07-27 18:23:28 -04:00
Mircea Trofin	506c974562	[MLGO] fix silly LLVM_DEBUG misuse	2021-07-27 15:10:28 -07:00
Mircea Trofin	f4b486c976	[NFC][MLGO] Debug messages for what inline advisor is selected We already have an indication (error) if the desired inline advisor cannot be enabled, but we don't have a positive indication. Added LLVM_DEBUG messages for the latter.	2021-07-27 15:05:39 -07:00
Nemanja Ivanovic	8b3f85a32c	[PowerPC] Turn deprecated altivec prefetch instrs to nops on AIX The dst/dstt/dstst/dststt instructions are nop's on all PowerPC cores that AIX supports. The AIX assembler also does not accept these mnemonics. Turn them into nop's on AIX (similar to dstall).	2021-07-27 15:50:02 -05:00
Sanjay Patel	11aa71a71d	[x86] update stale code comment; NFC The transform was generalized with: 1ce05ad619a5	2021-07-27 16:45:52 -04:00
Sanjay Patel	d571039392	[x86] add more tests for cmov and lea; NFC	2021-07-27 16:45:52 -04:00
Matt Arsenault	ece3299a71	AMDGPU/GlobalISel: Fix selecting G_SEXTLOAD/G_ZEXTLOAD pre-gfx9 The patterns for the m0 glue patterns were failing to import.	2021-07-27 15:56:42 -04:00
Matt Arsenault	17f139f330	AMDGPU/GlobalISel: Fix wrong addrspace in test MMOs	2021-07-27 15:56:41 -04:00
Matt Arsenault	cfbd77a7f1	AMDGPU/GlobalISel: Add a few tests for unaligned truncating stores	2021-07-27 15:56:41 -04:00
Benjamin Kramer	c0d40a1ca3	Remove unused include that's also a layering violation. NFC.	2021-07-27 21:21:55 +02:00
Amara Emerson	0a69d4f31a	Add test update for a11d9a1f480f which disables fallbacks.	2021-07-27 12:16:06 -07:00
Amara Emerson	6ce8f2f7c1	[AArch64][GlobalISel] Fix constraining LDXPX intrinsic selection. Causes a fallback because of lack of regclasses on vregs, unless it's without asserts, where we end up crashing later in codegen.	2021-07-27 12:13:56 -07:00
Enna1	2a88f80da4	[ASAN] NFC: Remove redundant variable `StackAlignment` has only one use: `StackAlignment = std::max(StackAlignment, AI.getAlignment());` So it is redundant. Reviewed By: vitalybuka, MTC Differential Revision: https://reviews.llvm.org/D106741	2021-07-27 12:02:37 -07:00
LLVM GN Syncbot	c563d2cbcd	[gn build] Port 02077da7e7a8	2021-07-27 18:41:55 +00:00
Adam Nemet	d9613eb43c	[Matrix] Fix shape for factored transpose The shape of the input is C x R. Differential Revision: https://reviews.llvm.org/D106722	2021-07-27 11:36:13 -07:00
Adam Nemet	add64be20a	[Matrix] RAUW should only replace an instruction in ShapeMap if supportsShapeInfo As an instruction is replaced in optimizeTransposes RAUW will replace it in the ShapeMap (ShapeMap is ValueMap so that uses are updated). In finalizeLowering however we skip updating uses if they are in the ShapeMap since they will be lowered separately at which point we pick up the lowered operands. In the testcase what happened was that since we replaced the doubled-transpose with the shuffle, it ended up in the ShapeMap. As we lowered the columnwise-load the use in the shuffle was not updated. Then as we removed the original columnwise-load we changed that to an undef. I.e. we ended up with: ``` %shuf = shufflevector <8 x double> undef, <8 x double> poison, <6 x i32> ^^^^^ <i32 0, i32 1, i32 2, i32 4, i32 5, i32 6> ``` Besides the fix itself, I have fortified this last bit. As we change uses to undef when removing instruction we track the undefed instruction to make sure we eventually remove those too. This would have caught the issue at compile time. Differential Revision: https://reviews.llvm.org/D106714	2021-07-27 11:36:13 -07:00
Alexey Zhikhartsev	6516543c4b	Add jump-threading optimization for deterministic finite automata The current JumpThreading pass does not jump thread loops since it can result in irreducible control flow that harms other optimizations. This prevents switch statements inside a loop from being optimized to use unconditional branches. This code pattern occurs in the core_state_transition function of Coremark. The state machine can be implemented manually with goto statements resulting in a large runtime improvement, and this transform makes the switch implementation match the goto version in performance. This patch specifically targets switch statements inside a loop that have the opportunity to be threaded. Once it identifies an opportunity, it creates new paths that branch directly to the correct code block. For example, the left CFG could be transformed to the right CFG: ``` sw.bb sw.bb / \| \ / \| \ case1 case2 case3 case1 case2 case3 \ \| / / \| \ latch.bb latch.2 latch.3 latch.1 br sw.bb / \| \ sw.bb.2 sw.bb.3 sw.bb.1 br case2 br case3 br case1 ``` Co-author: Justin Kreiner @jkreiner Co-author: Ehsan Amiri @amehsan Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D99205	2021-07-27 14:34:04 -04:00
Craig Topper	6bfc6b8665	[RISCV] Select vector shl by 1 to a vector add. A vector add may be faster than a vector shift. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D106689	2021-07-27 10:57:28 -07:00
David Green	0f86a54b34	[AArch64] Update and expand min-max cost model test. NFC This expands the cost model test for min/max to many more types, including floating point minnum/maxnum and minimum/maximum, and FP16 with and without fullfp16. The old llc run lines are removed, as those are better tested by CodeGen tests.	2021-07-27 18:48:58 +01:00
Andy Kaylor	fd18a762a0	Enabling the copy-constant-to-alloca optimization in more instances Patch by Mohammad Fawaz This patch allows lifetime calls to be ignored (and later erased) if we know that the copy-constant-to-alloca optimization is going to happen. The case that is missed is when the global variable is in a different address space than the alloca (as shown in the example added to the lit test.) This used to work before `6da31fa4a6` Differential Revision: https://reviews.llvm.org/D106573	2021-07-27 10:11:43 -07:00
David Sherwood	cdd50ed2ff	[LoopVectorize] Don't interleave scalar ordered reductions for inner loops Consider the following loop: void foo(float *dst, float *src, int N) { for (int i = 0; i < N; i++) { dst[i] = 0.0; for (int j = 0; j < N; j++) { dst[i] += src[(i * N) + j]; } } } When we are not building with -Ofast we may attempt to vectorise the inner loop using ordered reductions instead. In addition we also try to select an appropriate interleave count for the inner loop. However, when choosing a VF=1 the inner loop will be scalar and there is existing code in selectInterleaveCount that limits the interleave count to 2 for reductions due to concerns about increasing the critical path. For ordered reductions this problem is even worse due to the additional data dependency, and so I've added code to simply disable interleaving for scalar ordered reductions for now. Test added here: Transforms/LoopVectorize/AArch64/strict-fadd-vf1.ll Differential Revision: https://reviews.llvm.org/D106646	2021-07-27 17:41:01 +01:00
Anna Thomas	3ad8a0b712	Update reduction test. Remove standalone test file Based on post commit review comments at 68ffed12b.	2021-07-27 12:35:04 -04:00
Matt Arsenault	3b4453b968	AMDGPU: Update tests for lower i1 change I forgot to squash the test updates for b32d3d9e81cdd9275d19cd2a396c461edc9e7189	2021-07-27 12:14:58 -04:00
Matt Arsenault	8979bda8e8	AMDGPU: Treat IMPLICIT_DEF like a constant lanemask source This is partially a workaround. SILowerI1Copies does not understand unstructured loops. This would result in inserting instructions to merge a mask register in the same block where it was defined in an unstructured loop.	2021-07-27 11:44:38 -04:00
Thomas Lively	bb4e957ebf	[WebAssembly] Codegen for extmul SIMD instructions Replace the clang builtins and LLVM intrinsics for the SIMD extmul instructions with normal codegen patterns. Differential Revision: https://reviews.llvm.org/D106724	2021-07-27 08:41:30 -07:00
Anirudh Prasad	2b967460b4	[SystemZ][z/OS] Initial code to generate assembly files on z/OS - This patch consists of the bare basic code needed in order to generate some assembly for the z/OS target. - Only the .text and the .bss sections are added for now. - The relevant MCSectionGOFF/Symbol interfaces have been added. This enables us to print out the GOFF machine code sections. - This patch enables us to add simple lit tests wherever possible, and contribute to the testing coverage for the z/OS target - Further improvements and additions will be made in future patches. Reviewed By: tmatheson Differential Revision: https://reviews.llvm.org/D106380	2021-07-27 11:29:15 -04:00
Anna Thomas	23f3f90569	Strip undef implying attributes when moving calls When hoisting/moving calls to locations, we strip unknown metadata. Such calls are usually marked `speculatable`, i.e. they are guaranteed to not cause undefined behaviour when run anywhere. So, we should strip attributes that can cause immediate undefined behaviour if those attributes are not valid in the context where the call is moved to. This patch introduces such an API and uses it in relevant passes. See updated tests. Fix for PR50744. Reviewed By: nikic, jdoerfert, lebedev.ri Differential Revision: https://reviews.llvm.org/D104641	2021-07-27 10:57:05 -04:00
Tres Popp	05308b27ea	Revert "[X86][AVX] Add getBROADCAST_LOAD helper function. NFCI." This reverts commit 1cfecf4fc4278afb0005923f6dff595cd372da5c. This commit broke LLVM code generated through XLA by removing a conditional on Ld->getExtensionType() == ISD::NON_EXTLOAD This is not a perfect revert. The new function is left as other uses of it exist now.	2021-07-27 16:55:50 +02:00
Tres Popp	d809395fde	Revert "Revert "[X86][AVX] Add getBROADCAST_LOAD helper function. NFCI."" This reverts commit d7bbb1230a94cb239aa4a8cb896c45571444675d. There were follow up uses of a deleted method and I didn't run the tests. Undo the revert, so I can do it properly.	2021-07-27 16:48:31 +02:00


219321 Commits