llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-18 10:32:48 +02:00

Author	SHA1	Message	Date
Joseph Huber	4f1fd1c209	[Attributor] Change function internalization to not replace uses in internalized callers The current implementation of function internalization creates a copy of each function and replaces every use. This has the downside that the external versions of the functions will call into the internalized versions of the functions. This prevents them from being fully independent of each other. This patch replaces the current internalization scheme with a method that creates all the copies of the functions intended to be internalized first and then replaces the uses as long as their caller is not already internalized. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D106931 (cherry picked from commit adbaa39dfce7a8361d89b6a3b382fd8f50b94727)	2021-08-04 16:35:01 -07:00
Cullen Rhodes	710ae2bfd3	[ReleaseNotes] Add scalable matrix extension support to AArch64 changes Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D106853	2021-08-03 15:24:36 +00:00
Muhammad Omair Javaid	bafc7f359f	[llvm][Release notes] Add AArch64 SVE, PAC and LLDB prebuilt binary This patch updates LLVM release notes to add an announcement about AArch64 SVE, PAC and LLDB prebuilt binary.	2021-08-03 20:20:07 +05:00
David Spickett	aa2a6b072f	[llvm][Release notes] Add memory tagging support to lldb changes	2021-08-03 12:25:36 +00:00
Sanjay Patel	d4eedb2312	[Analysis] improve function signature checking for snprintf The check for size_t parameter 1 was already here for snprintf_chk, but it wasn't applied to regular snprintf. This could lead to mismatching and eventually crashing as shown in: https://llvm.org/PR50885 (cherry picked from commit 7f5555776513f174729a686ed01270e23462aaf7)	2021-08-02 22:58:39 -07:00
Jose M Monsalve Diaz	9770d34891	[OpenMP] Fixing llvm-omp-device-info compilation with runtimes When using `-DLLVM_ENABLE_RUNTIMES` instead of `-DLLVM_ENABLE_PROJECTS` the `llvm-omp-device-info` tool is not compiled or installed. In general, no llvm tool would be built on runtimes, because the -DLLVM_BUILD_TOOLS flag is removed by the way runtimes compilation calls cmake again. This patch is simple. Just forward the value of this flag to the runtime cmake command. I'm also removing an unnecessary comment in the compilation of the tool. Differential Revision: https://reviews.llvm.org/D107177 (cherry picked from commit 5424ceeda0534ab382e2a6cb192099f76ee8b12c)	2021-08-02 20:05:19 -07:00
Simon Pilgrim	70f5e23577	[X86][AVX] Add test case for PR51281 (cherry picked from commit 6569b7f90239b5932465a1c6936632b4a9527d66)	2021-08-02 20:05:12 -07:00
Sanjay Patel	df3286259b	[DAGCombiner] don't try to partially reduce add-with-overflow ops This transform was added with D58874, but there were no tests for overflow ops. We need to change this one way or another because it can crash as shown in: https://llvm.org/PR51238 Note that if there are no uses of an overflow op's bool overflow result, we reduce it to a regular math op, so we continue to fold that case either way. If we have uses of both the math and the overflow bool, then we are likely not saving anything by creating an independent sub instruction as seen in the test diffs here. This patch makes the behavior in SDAG consistent with what we do in instcombine AFAICT. Differential Revision: https://reviews.llvm.org/D106983 (cherry picked from commit fa6b2c9915ba27e1e97f8901ea4aa877f331fb9f)	2021-08-02 13:52:48 -07:00
Sanjay Patel	a0686462c3	[AArch64][x86] add tests for add-with-overflow folds; NFC There's a generic combine for these, but no test coverage. It's not clear if this is actually a good fold. The combine was added with D58874, but it has a bug that can cause crashing ( https://llvm.org/PR51238 ). (cherry picked from commit e427077ec10ea18ac21f5065342183481d87783a)	2021-08-02 13:52:42 -07:00
Sanjay Patel	b92c9f9565	[DivRemPairs] make sure we have a valid CFG for hoisting division This transform was added with e38b7e894808ec2 and as shown in: https://llvm.org/PR51241 ...it could crash without an extra check of the blocks. There might be a more compact way to write this constraint, but we can't just count the successors/predecessors without affecting a test that includes a switch instruction. (cherry picked from commit 5b83261c1518a39636abe094123f1704bbfd972f)	2021-08-02 13:52:37 -07:00
Craig Topper	7c9c296915	[RISCV] Restrict performANY_EXTENDCombine to prevent an infinite loop. The sign_extend we insert here can get turned into a zero_extend if the sign bit is known zero. This can enable a setcc combine that shrinks compares with zero_extend. This reduces the use count of the zero_extend allowing other combines to turn it back into an any_extend. This restricts the combine to only cases where the result is used by a CopyToReg. This works for my original motivating case. I hope the CopyToReg use will prevent any converted extends from turning back into an any_extend. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D106754 (cherry picked from commit 54588bcc052e5b08f90e672c33d0c1ad4eda2424)	2021-08-02 11:31:08 -07:00
Alexandros Lamprineas	276fcebbe0	[AArch64] Legalize MVT::i64x8 in DAG isel lowering This patch legalizes the Machine Value Type introduced in D94096 for loads and stores. A new target hook named getAsmOperandValueType() is added which maps i512 to MVT::i64x8. GlobalISel falls back to DAG for legalization. Differential Revision: https://reviews.llvm.org/D94097	2021-08-02 15:45:58 +01:00
Alexandros Lamprineas	a50e569197	[AArch64] Add a Machine Value Type for 8 consecutive registers Adds MVT::i64x8, a Machine Value Type needed for lowering inline assembly operands which materialize a sequence of eight general purpose registers. Differential Revision: https://reviews.llvm.org/D94096	2021-08-02 15:45:58 +01:00
Jeremy Morse	cd0096f439	[DebugInfo][InstrRef] Don't break up ret-sequences on debug-info instrs When we have a terminator sequence (i.e. a tailcall or return), MIIsInTerminatorSequence is used to work out where the preceding ABI-setup instructions end, i.e. the parts that were glued to the terminator instruction. This allows LLVM to split blocks safely without having to worry about ABI stuff. The function only ignores DBG_VALUE instructions, meaning that the two debug instructions I recently added can end terminator sequences early, causing various MachineVerifier errors. This patch promotes the test for debug instructions from "isDebugValue" to "isDebugInstr", thus avoiding any debug-info interfering with this function. Differential Revision: https://reviews.llvm.org/D106660 (cherry picked from commit 8612417e5a54cfef941ab45de55e48b4a0c4e8b4)	2021-07-29 15:08:13 +01:00
Bradley Smith	183b0c7c98	[AArch64][SVE] Fix incorrect mask type when lowering fixed type SVE gather/scatter An incorrect mask type when lowering an SVE gather/scatter was causing a codegen fault which manifested as the incorrect predicate size being used for an SVE gather/scatter (e.g. p0.b rather than p0.d). Fixes PR51182. Differential Revision: https://reviews.llvm.org/D106943 (cherry picked from commit 191831e380f317cd2baa5d48abe02d1d11cd44cb)	2021-07-29 07:03:40 -07:00
Diana Picus	923213f844	test-release.sh: Kill python2 Don't prefer python2's virtualenv when setting up the test-suite. Always use python3 instead, since that's what we support everywhere else anyway. Differential Revision: https://reviews.llvm.org/D106941	2021-07-29 10:28:39 +02:00
Chris Jackson	9a10dd5b1c	Revert "[DebugInfo][LoopStrengthReduction] SCEV-based salvaging for LSR" This was reverted due to a reported crash. This reverts commit 796b84d26f4d461fb50e7b4e84e15a10eaca88fc.	2021-07-29 00:04:50 +01:00
Valentin Clement	c463fa6cad	[mlir][openacc] Initial translation for DataOp to LLVM IR Add basic translation of acc.data to LLVM IR with runtime calls. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D104301	2021-07-27 22:04:04 -04:00
Jose M Monsalve Diaz	5b7208da36	[OpenMP] Folding threadLimit and numThreads when single value in kernels The device runtime contains several calls to `__kmpc_get_hardware_num_threads_in_block` and `__kmpc_get_hardware_num_blocks`. If the thread_limit and the num_teams are constant, these calls can be folded to the constant value. In this patch we use the already introduced `AAFoldRuntimeCall` and the `NumTeams` and `NumThreads` kernel attributes (to be introduced in a different patch) to fold these functions. The code checks all the kernels, and if their attributes match, the functions are folded. In the future we will explore specializing for multiple values of NumThreads and NumTeams. Depends on D106390 Reviewed By: jdoerfert, JonChesterfield Differential Revision: https://reviews.llvm.org/D106033	2021-07-27 21:47:12 -04:00
George Burgess IV	66a78bc217	llvm/utils: guarantee revert_checker's revert ordering At the moment, the revert ordering from this tool is unspecified (though it happens to be in `git log` order, so newest reverts come first). From the standpoint of tooling and users, this seems to be the opposite of what we want by default: tools and users will generally try to apply these reverts as cherry-picks. If two reverts in the list are close enough to each other, if the reverts get applied out of order, we'll get a merge conflict. Rather than having `reverse`s for all tools (and mental reverses for manual users), just guarantee an oldest-first output ordering for this function. Differential Revision: https://reviews.llvm.org/D106838	2021-07-28 00:51:05 +00:00
Juneyoung Lee	bab6d38daf	[DAGCombiner] Fold SETCC(FREEZE(x),const) to FREEZE(SETCC(x,const)) if SETCC is used by BRCOND This patch adds a peephole optimization `SETCC(FREEZE(x),const)` => `FREEZE(SETCC(x,const))` if the SETCC is only used by BRCOND. Combined with `BRCOND(FREEZE(X)) => BRCOND(X)`, this leads to a nice improvement in the generated assembly when x is a masked loaded value. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D105344	2021-07-28 09:22:15 +09:00
Juneyoung Lee	2e57fe1d7d	Precommit test files for D105344 (NFC)	2021-07-28 09:21:55 +09:00
Xiang1 Zhang	a6d5003afd	[X86] Fix lowering to illegal type in LowerINSERT_VECTOR_ELT Differential Revision: https://reviews.llvm.org/D106780	2021-07-28 08:16:59 +08:00
Johannes Doerfert	9b53e594b4	Reapply "[Attributor] Disable simplification AAs if a callback is present" This reapplies commit cbb709e25124dc38ee593882051fc88c987fe591 and includes the use of the lookup method instead of operator[] to avoid accidentally setting (empty) simplification callbacks. This reverts commit aa27430a625b2fd059707a87f8ba2df8f480ff11.	2021-07-27 19:14:50 -05:00
Xiang1 Zhang	5d447ad589	Revert "[X86] Fix lowering to illegal type in LowerINSERT_VECTOR_ELT" This reverts commit 6ff73efea94621e74642e4d7a15cc86a5fb6d411.	2021-07-28 08:12:29 +08:00
Mehdi Amini	115164b68b	Add llvm::equal convenience wrapper for ranges around std::equal Differential Revision: https://reviews.llvm.org/D106913	2021-07-28 00:10:22 +00:00
Xiang1 Zhang	409f0eedd6	[X86] Fix lowering to illegal type in LowerINSERT_VECTOR_ELT	2021-07-28 08:08:30 +08:00
Krzysztof Parzyszek	60850cdc6a	[Hexagon] Fix resetting dead registers in DBG_VALUE_LISTs This fixes https://llvm.org/PR51229.	2021-07-27 18:36:28 -05:00
LLVM GN Syncbot	803630dbd1	[gn build] Port 8a48e6dda9f7	2021-07-27 23:10:20 +00:00
Johannes Doerfert	96f821bd7b	Revert "[Attributor] Disable simplification AAs if a callback is present" This reverts commit cbb709e25124dc38ee593882051fc88c987fe591 as it breaks the tests, which was not supposed to happen. Investigating now.	2021-07-27 18:09:42 -05:00
Johannes Doerfert	0042081f58	[Attributor] Verify `checkForAllUses` return value properly Also do not emit more than one remark after Heap2Stack failed.	2021-07-27 17:50:27 -05:00
Johannes Doerfert	0b6fb7b1ef	[Attributor] Disable simplification AAs if a callback is present AAValueSimplify, AAValueConstantRange, and AAPotentialValues all look at the IR by default. If queried for an IR position which has a simplification callback we should either look at the callback return, or give up. We do the latter for now.	2021-07-27 17:50:26 -05:00
James Y Knight	9aacda281c	Fix test/Transforms/LoopVectorize/AArch64/strict-fadd-vf1.ll. It was writing to the source directory (which may not be writeable), rather than using %t. Fixes: a5dd6c6cf935 ("[LoopVectorize] Don't interleave scalar ordered reductions for inner loops")	2021-07-27 18:32:29 -04:00
Nico Weber	091e720912	[gn build] manually port 71909de37495	2021-07-27 18:23:28 -04:00
Mircea Trofin	506c974562	[MLGO] fix silly LLVM_DEBUG misuse	2021-07-27 15:10:28 -07:00
Mircea Trofin	f4b486c976	[NFC][MLGO] Debug messages for what inline advisor is selected We already have an indication (error) if the desired inline advisor cannot be enabled, but we don't have a positive indication. Added LLVM_DEBUG messages for the latter.	2021-07-27 15:05:39 -07:00
Nemanja Ivanovic	8b3f85a32c	[PowerPC] Turn deprecated altivec prefetch instrs to nops on AIX The dst/dstt/dstst/dststt instructions are nop's on all PowerPC cores that AIX supports. The AIX assembler also does not accept these mnemonics. Turn them into nop's on AIX (similar to dstall).	2021-07-27 15:50:02 -05:00
Sanjay Patel	11aa71a71d	[x86] update stale code comment; NFC The transform was generalized with: 1ce05ad619a5	2021-07-27 16:45:52 -04:00
Sanjay Patel	d571039392	[x86] add more tests for cmov and lea; NFC	2021-07-27 16:45:52 -04:00
Matt Arsenault	ece3299a71	AMDGPU/GlobalISel: Fix selecting G_SEXTLOAD/G_ZEXTLOAD pre-gfx9 The patterns for the m0 glue patterns were failing to import.	2021-07-27 15:56:42 -04:00
Matt Arsenault	17f139f330	AMDGPU/GlobalISel: Fix wrong addrspace in test MMOs	2021-07-27 15:56:41 -04:00
Matt Arsenault	cfbd77a7f1	AMDGPU/GlobalISel: Add a few tests for unaligned truncating stores	2021-07-27 15:56:41 -04:00
Benjamin Kramer	c0d40a1ca3	Remove unused include that's also a layering violation. NFC.	2021-07-27 21:21:55 +02:00
Amara Emerson	0a69d4f31a	Add test update for a11d9a1f480f which disables fallbacks.	2021-07-27 12:16:06 -07:00
Amara Emerson	6ce8f2f7c1	[AArch64][GlobalISel] Fix constraining LDXPX intrinsic selection. Causes a fallback because of lack of regclasses on vregs, unless it's without asserts, where we end up crashing later in codegen.	2021-07-27 12:13:56 -07:00
Enna1	2a88f80da4	[ASAN] NFC: Remove redundant variable `StackAlignment` has only one use: `StackAlignment = std::max(StackAlignment, AI.getAlignment());` So it is redundant. Reviewed By: vitalybuka, MTC Differential Revision: https://reviews.llvm.org/D106741	2021-07-27 12:02:37 -07:00
LLVM GN Syncbot	c563d2cbcd	[gn build] Port 02077da7e7a8	2021-07-27 18:41:55 +00:00
Adam Nemet	d9613eb43c	[Matrix] Fix shape for factored transpose The shape of the input is C x R. Differential Revision: https://reviews.llvm.org/D106722	2021-07-27 11:36:13 -07:00
Adam Nemet	add64be20a	[Matrix] RAUW should only replace an instruction in ShapeMap if supportsShapeInfo As an instruction is replaced in optimizeTransposes, RAUW will replace it in the ShapeMap (ShapeMap is a ValueMap so that uses are updated). In finalizeLowering, however, we skip updating uses if they are in the ShapeMap since they will be lowered separately, at which point we pick up the lowered operands. In the testcase, what happened was that since we replaced the doubled-transpose with the shuffle, it ended up in the ShapeMap. As we lowered the columnwise-load, the use in the shuffle was not updated. Then as we removed the original columnwise-load, we changed that to an undef. I.e. we ended up with the following, where the first operand is the unexpected undef: `%shuf = shufflevector <8 x double> undef, <8 x double> poison, <6 x i32> <i32 0, i32 1, i32 2, i32 4, i32 5, i32 6>` Besides the fix itself, I have fortified this last bit. As we change uses to undef when removing an instruction, we track the undefed instruction to make sure we eventually remove those too. This would have caught the issue at compile time. Differential Revision: https://reviews.llvm.org/D106714	2021-07-27 11:36:13 -07:00
Alexey Zhikhartsev	6516543c4b	Add jump-threading optimization for deterministic finite automata The current JumpThreading pass does not jump thread loops since it can result in irreducible control flow that harms other optimizations. This prevents switch statements inside a loop from being optimized to use unconditional branches. This code pattern occurs in the core_state_transition function of Coremark. The state machine can be implemented manually with goto statements resulting in a large runtime improvement, and this transform makes the switch implementation match the goto version in performance. This patch specifically targets switch statements inside a loop that have the opportunity to be threaded. Once it identifies an opportunity, it creates new paths that branch directly to the correct code block. For example, a CFG in which sw.bb branches to case1, case2, and case3, all of which rejoin at latch.bb (ending in `br sw.bb`), could be transformed into one where each case has its own latch and switch copy: case1 -> latch.2 -> sw.bb.2 (`br case2`), case2 -> latch.3 -> sw.bb.3 (`br case3`), case3 -> latch.1 -> sw.bb.1 (`br case1`). Co-author: Justin Kreiner @jkreiner Co-author: Ehsan Amiri @amehsan Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D99205	2021-07-27 14:34:04 -04:00
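
Commit 115164b68b above describes a range-based convenience wrapper around std::equal. The following is only a minimal sketch of that idea, not the code from the commit: the name `range_equal` and the helper `range_equal_demo` are hypothetical, and the sketch assumes the wrapper simply forwards both ranges' begin/end iterators to the four-iterator overload of std::equal (C++14), which also treats ranges of different lengths as unequal.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <list>
#include <vector>

// Hypothetical stand-in for a range-based equality helper; the name and
// signature are assumptions made for illustration only.
template <typename L, typename R>
bool range_equal(const L &lhs, const R &rhs) {
  // The four-iterator overload compares element by element and returns
  // false when the two ranges differ in length.
  return std::equal(std::begin(lhs), std::end(lhs),
                    std::begin(rhs), std::end(rhs));
}

// Small self-check: containers of different types but equal contents
// compare equal; a shorter range does not.
inline bool range_equal_demo() {
  std::vector<int> a{1, 2, 3};
  std::list<int> b{1, 2, 3};
  std::vector<int> c{1, 2};
  return range_equal(a, b) && !range_equal(a, c);
}
```

The point of such a wrapper is that callers pass the containers directly instead of spelling out four begin/end calls at every call site.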


219433 Commits