llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-25 04:02:41 +01:00

Author	SHA1	Message	Date
Sanjay Patel	de802b8e6e	[InstSimplify] add tests for min/max idioms; NFC (cherry picked from commit 9b942a545cb53d4bae2071a2dea513be74f68221)	2021-08-16 11:35:24 -07:00
David Sherwood	9740b5c5ef	[LoopVectorize] Improve vectorisation of some intrinsics by treating them as uniform This patch adds more instructions to the Uniforms list, for example certain intrinsics that are uniform by definition or whose operands are loop invariant. This list includes: 1. The intrinsics 'experimental.noalias.scope.decl' and 'sideeffect', which are always uniform by definition. 2. If intrinsics 'lifetime.start', 'lifetime.end' and 'assume' have loop invariant input operands then these are also uniform. Also, in VPRecipeBuilder::handleReplication we check if an instruction is uniform based purely on whether or not the instruction lives in the Uniforms list. However, there are certain cases where calls to some intrinsics can be effectively treated as uniform too. Therefore, we now also treat the following cases as uniform for scalable vectors: 1. If the 'assume' intrinsic's operand is not loop invariant, then we are free to treat this as uniform anyway since it's only a performance hint. We will get the benefit for the first lane. 2. When the input pointers for 'lifetime.start' and 'lifetime.end' are loop variant then for scalable vectors we assume these still ultimately come from the broadcast of an alloca. We do not support scalable vectorisation of loops containing alloca instructions, hence the alloca itself would be invariant. If the pointer does not come from an alloca then the intrinsic itself has no effect. I have updated the assume test for fixed width, since we now treat it as uniform: Transforms/LoopVectorize/assume.ll I've also added new scalable vectorisation tests for other intrinsics: Transforms/LoopVectorize/scalable-assume.ll Transforms/LoopVectorize/scalable-lifetime.ll Transforms/LoopVectorize/scalable-noalias-scope-decl.ll Differential Revision: https://reviews.llvm.org/D107284 (cherry picked from commit 3fd96e1b2e129b981f1bc1be2615486187e74687)	2021-08-16 11:32:41 -07:00
David Sherwood	00203829b4	[NFC] Clean up tests in test/Transforms/LoopVectorize/assume.ll The tests previously had lots of unnecessary CHECK lines, where all we really need to check is the presence (or absence) of the assume intrinsic and the correct input operands. Differential Revision: https://reviews.llvm.org/D107157 (cherry picked from commit 1172a8a7639399fe0b8a6c78a7123b1c3f9cf833)	2021-08-16 11:32:33 -07:00
Martin Storsjö	3a88dc8338	Add release notes for things relating to MinGW in the release	2021-08-16 12:26:49 +03:00
Rainer Orth	4a38ef8718	[ELF] Don't emit SHF_GNU_RETAIN on Solaris The introduction of `SHF_GNU_RETAIN` has caused massive problems on Solaris. Initially, as reported in Bug 49437, it caused dozens of testsuite failures on both sparc and x86. The objects were marked as `ELFOSABI_NONE`, but `SHF_GNU_RETAIN` is a GNU extension. In the native Solaris ABI, that flag (in the range for OS-specific values) is `SHF_SUNW_ABSENT` with a completely different semantics, which confuses Solaris `ld` very much. Later, the objects became (correctly) marked `ELFOSABI_GNU`, which Solaris `ld` doesn't support, causing it to SEGV and break the build. The linker is currently being hardened to not accept non-native OS ABIs to avoid this. The need for linker support is already documented in `clang/include/clang/Basic/AttrDocs.td`, but not currently checked. This patch avoids all this by not emitting `SHF_GNU_RETAIN` on Solaris at all. Tested on `amd64-pc-solaris2.11`, `sparcv9-sun-solaris2.11`, and `x86_64-pc-linux-gnu`. Differential Revision: https://reviews.llvm.org/D107747 (cherry picked from commit 7bbbf2956181f375ab193321b37ea71c5fc44054)	2021-08-12 22:51:57 -07:00
Petr Hosek	fd411b2b5d	[profile] Fix profile merging with binary IDs This fixes support for merging profiles which broke as a consequence of e50a38840dc3db5813f74b1cd2e10e6d984d0e67. The issue was missing adjustment in merge logic to account for the binary IDs which are now included in the raw profile just after header. In addition, this change also: * Includes the version in module signature that's used for merging to avoid accidental attempts to merge incompatible profiles. * Moves the binary IDs size field after version field in the header as was suggested in the review. Differential Revision: https://reviews.llvm.org/D107143 (cherry picked from commit 83302c84890e5e6cb74c7d6c9f8eaaa56db0077c)	2021-08-12 22:46:22 -07:00
Andrea Di Biagio	681b643c07	[X86][SchedModel] Add missing ReadAdvance for some arithmetic ops (PR51318 and PR51322). This fixes a bug where implicit uses of EFLAGS were not marked as ReadAdvance in the RM/MR variants of ADC/SBB (PR51318) This also fixes the absence of ReadAdvance for the register operand of RMW arithmetic instructions (PR51322). Differential Revision: https://reviews.llvm.org/D107367 (cherry picked from commit 7a1a35a1d1ae2e69769505c9f39910067c53d53b)	2021-08-11 21:40:03 -07:00
Andrea Di Biagio	fb132cb74b	[MCA][NFC] Add tests for PR51318 and PR51322. Also, regenerate existing X86 tests using update_mca_test.py. (cherry picked from commit f0658c7a429b9e356da1670b280ab943ad0b0b94)	2021-08-11 21:39:56 -07:00
Andrea Di Biagio	db94372a40	[MCA] Simplify the rounding logic used in TimelineView::printWaitTimeEntry. This is related to PR51392. Before this patch, the timeline view was rounding doubles to the first decimal, using a logic similar to this: ``` double AverageTime = (double)Input / CumulativeExecutions; double Result = floor((AverageTime * 10) + 0.5) / 10 ``` Here, Input and CumulativeExecutions are both unsigned integers. The last operation is what effectively performs the rounding of AverageTime. PR51392 has been raised because - under specific -m32 configurations of GCC - one of the timeline tests reports slightly different values (due to a different rounding choice). This patch tries to minimise the propagation of floating-point error by hoisting the multiply by 10, so that it is performed on the unsigned. ``` double AverageTime = (double)(Input * 10) / CumulativeExecutions; floor(AverageTime + 0.5) / 10 ``` So we are trading a floating point multiply for an integer multiply (which can be expanded using a simple MUL or using an `ADD + LEA` sequence). This decrease in floating point operations executed should also help with decreasing the error in the computation. Strictly speaking, that computation will always be potentially subject to error (depending on what values are passed in input). However, this patch should improve the situation and make bugs like PR51392 less frequent. (cherry picked from commit 45685a1fc4524579a25b03eb1a27e8fcb792afc7)	2021-08-11 13:42:58 -07:00
Johannes Doerfert	64a6596867	[Attributor][NFC] Try to make the windows build bots happy Failed for some reason, potentially because of the inner type declaration in combination with the `using`. This might help. Failure: https://lab.llvm.org/buildbot/#/builders/127/builds/15432 (cherry picked from commit fc32a5c87d9d5aef2c0b27715153fdd45cebd3f3)	2021-08-11 09:15:37 -07:00
Johannes Doerfert	d2cc939747	[Attributor][FIX] Handle recurrences (PHIs) in AAPointerInfo explicitly PHI nodes are not pass through but change their value, we have to account for that to avoid missing stores. Follow up for D107798 to fix PR51249 for good. Differential Revision: https://reviews.llvm.org/D107808 (cherry picked from commit e7e3585cde0b08152a8cbf54029794d07c15963d)	2021-08-11 09:15:33 -07:00
Johannes Doerfert	e9a9a807dd	[Attributor][FIX] Only avoid visiting PHI uses multiple times (PR51249) AAPointerInfoFloating needs to visit all uses and some multiple times if we go through PHI nodes. Attributor::checkForAllUses keeps a visited set so we don't recurse endlessly. We now allow recursion for non-phi uses so we track all pointer offsets via PHI nodes properly without endless recursion. This replaces the first attempt D107579. Differential Revision: https://reviews.llvm.org/D107798 (cherry picked from commit 96da6dd6ba53bce5dbe822fe968c2b67ba9bc221)	2021-08-11 09:15:31 -07:00
Johannes Doerfert	5cf4e36ce1	[Attributor][NFC] Precommit reproducer for PR51249 The bulk of the changes come from attributes but only the @phi_store function is effectively added. (cherry picked from commit f358727ce06cca3b1f541b70719ccd1ac62efbf5)	2021-08-11 09:15:24 -07:00
Evandro Menezes	0c8a79e78d	[RISCV] Add scheduling resources for V Add the scheduling resources for the V extension instructions. Differential Revision: https://reviews.llvm.org/D98002 (cherry picked from commit 63a5ac4e0d969f41bf71785cc3979349a45a2892)	2021-08-10 23:11:38 -07:00
Tom Stellard	11c22c929d	Drop LLVM_VERSION_SUFFIX	2021-08-10 20:50:20 -07:00
Michał Górny	13e216d251	[llvm] [cmake] Export LLVM_ENABLE_NEW_PASS_MANAGER into LLVMConfig.cmake Include the value of LLVM_ENABLE_NEW_PASS_MANAGER in generated LLVMConfig.cmake since it is needed by clang's build system. This fixes test failures when the new pass manager is enabled (i.e. by default) by having clang's CMake files correctly detect that and skip relevant tests. Differential Revision: https://reviews.llvm.org/D107628 (cherry picked from commit 889a1e69bd2d65c368712ec653450099446aed33)	2021-08-10 15:37:05 -07:00
Bradley Smith	ee15bdbb06	[AArch64][SVE] Fix assertion failure when lowering fixed length gather/scatter The patterns for fixed length gather/scatter with 32-bit offsets and 64-bit memory type are slightly different than the rest of the patterns, as such the lowering needs to be slightly different to ensure the correct types are used. Differential Revision: https://reviews.llvm.org/D107576 (cherry picked from commit 73ecb9987b00db274b7b2ac34b0602ffdb906a4b)	2021-08-10 15:34:36 -07:00
Michał Górny	7d89a50287	[llvm] [lit] Fix inconsistent test order in shtest-keyword-parse-errors Remove test times when running shtest-keyword-parse-errors test, in order to prevent the previous executions from impacting subtest order and therefore causing FileCheck to fail. Differential Revision: https://reviews.llvm.org/D107427 (cherry picked from commit 39fa96a4906934774ba20bcb0cd5f808f619f3a6)	2021-08-06 12:50:56 -07:00
Yonghong Song	f6a86e448a	BPF: avoid NE/EQ loop exit condition Kuniyuki Iwashima reported in [1] that llvm compiler may convert a loop exit condition with "i < bound" to "i != bound", where "i" is the loop index variable and "bound" is the upper bound. In case that "bound" is not a constant, verifier will always have "i != bound" true, which will cause verifier failure since to verifier this is an infinite loop. The fix is to avoid transforming "i < bound" to "i != bound". In llvm, the transformation is done by IndVarSimplify pass. The compiler checks loop condition cost (i = i + 1) and if the cost is lower, it may transform "i < bound" to "i != bound". This patch implemented getArithmeticInstrCost() in BPF TargetTransformInfo class to return a higher cost for such an operation, which will prevent the transformation for the test case added in this patch. [1] https://lore.kernel.org/netdev/1994df05-8f01-371f-3c3b-d33d7836878c@fb.com/ Differential Revision: https://reviews.llvm.org/D107483 (cherry picked from commit e52946b9ababcbf8e6f40b1b15900ae2e795a1c6)	2021-08-06 12:45:53 -07:00
Dylan Fleming	a558de374f	[InstCombine] Fixed select + masked load fold failure Fixed type assertion failure caused by trying to fold a masked load with a select where the select condition is a scalar value Reviewed By: sdesmalen, lebedev.ri Differential Revision: https://reviews.llvm.org/D107372 (cherry picked from commit 3943a74666cbe718b74e06092ce3b4c20e85fde1)	2021-08-06 12:41:06 -07:00
Martin Storsjö	7563d8dfe5	[llvm-rc] Allow specifying language with a leading 0x prefix This option is always interpreted strictly as a hexadecimal string, even if it has no prefix that indicates the number format, hence the existing call to StringRef::getAsInteger(16, ...). StringRef::getAsInteger(0, ...) consumes a leading "0x" prefix if present, but when the radix is specified, the radix shouldn't be included. Both MS rc.exe and GNU windres accept the language with that prefix. Also allow specifying the codepage to llvm-windres with a different radix, as GNU windres allows that (but MS rc.exe doesn't). This fixes https://llvm.org/PR51295. Differential Revision: https://reviews.llvm.org/D107263 (cherry picked from commit 46020f6f0c8aa134002208b2ecf0593b04c46d08)	2021-08-06 12:39:15 -07:00
Chris Jackson	910de616a4	[DebugInfo][LSR] Avoid crashes on large integer inputs SCEV-based salvaging in LSR translates SCEVs to DIExpressions. SCEVs may contain very large integers but the translation does not support integers greater than 64 bits. This patch adds checks to ensure conversions of these large integers are not attempted. A regression test is added to ensure no such translation is attempted. Reviewed by: StephenTozer PR: https://bugs.llvm.org/show_bug.cgi?id=51329 Differential Revision: https://reviews.llvm.org/D107438 (cherry picked from commit 21ee38e24f9801a567306b2a88defacf6e589a8b)	2021-08-05 10:38:19 +01:00
Jeremy Morse	46ad88f625	Follow-up to D105207, only salvage affine SCEVs to avoid a crash SCEVToIterCountExpr only expects to be fed affine expressions, but DbgRewriteSalvageableDVIs is feeding it non-affine induction variables. Following this up with an obvious fix, will add test coverage too if this avoids D105207 being reverted. (cherry picked from commit 2537120c870c04893636f171f553024f378c2de8)	2021-08-05 10:35:08 +01:00
Chris Jackson	ff86d9e5f0	[DebugInfo][LoopStrengthReduction] SCEV-based salvaging for LSR Reapply commit d675b594f4f1e1f6a195fb9a4fd02cf3de92292d that was reverted due to buildbot failures. A simple fix has been applied to remove an assertion. Differential Revision: https://reviews.llvm.org/D105207 (cherry picked from commit 0ba8595287ea2203ef2250e2b0b41f284a055518)	2021-08-05 10:34:33 +01:00
Eli Friedman	a75903d41d	[ConstantFold] Get rid of special cases for sizeof etc. Target-dependent constant folding will fold these down to simple constants (or at least, expressions that don't involve a GEP). We don't need heroics to try to optimize the form of the expression before that happens. Fixes https://bugs.llvm.org/show_bug.cgi?id=51232 . Differential Revision: https://reviews.llvm.org/D107116 (cherry picked from commit 2a2847823f0d13188c43ebdd0baf42a95df750c7)	2021-08-04 21:25:15 -07:00
Nathan Chancellor	f9a2d3e277	[test] Fix tools/gold/X86/comdat-nodeduplicate.ll on non-X86 hosts When running this test on an aarch64 machine, it fails: ``` /usr/bin/ld.gold: error: .../test/tools/gold/X86/Output/comdat-nodeduplicate.ll.tmp/ab.lto.o: incompatible target ``` Specify the elf_x86_64 emulation as all of the other gold plugin tests do. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D107020 (cherry picked from commit 5060224d9eed8b8359ed5090bb7c577b8575e9e7)	2021-08-04 19:43:41 -07:00
Andy Kaylor	8be74d232c	Fixing an infinite loop problem in InstCombine Patch by Mohammad Fawaz This issue started happening after `b373b5990d` Basically, if the memcpy is volatile, the collectUsers() function should return false, just like we do for volatile loads. Differential Revision: https://reviews.llvm.org/D106950 (cherry picked from commit b4d945bacdaf2c60dd5fdb119b90cced73c41beb)	2021-08-04 16:51:40 -07:00
Jeroen Dobbelaere	3851813b89	[PredicateInfo] Use Intrinsic::getDeclaration now that it handles unnamed types. This is a second attempt to fix the EXPENSIVE_CHECKS issue that was mentioned in D91661#2875179 by @jroelofs. (The first attempt was in D105983) D91661 more or less completely reverted D49126 and by doing so also removed the cleanup logic of the created declarations and calls. This patch is a replacement for D91661 (which must itself be reverted first). It replaces the custom declaration creation with the generic version and shows the test impact. It also tracks the number of NamedValues to detect if a new prototype was added instead of looking at the available users of a prototype. Reviewed By: jroelofs Differential Revision: https://reviews.llvm.org/D106147 (cherry picked from commit 03b8c69d06f810f13d0b74d06dabea37c43e5b78)	2021-08-04 16:51:33 -07:00
Jeroen Dobbelaere	e20914df74	Revert "Revert of D49126 [PredicateInfo] Use custom mangling to support ssa_copy with unnamed types." This reverts commit 77080a1eb6061df2dcfae8ac84b85ad4d1e02031. This change introduced issues detected with EXPENSIVE_CHECKS. Reverting to restore the needed function cleanup. A next patch will then just improve on the name mangling. (cherry picked from commit dc5570d149ca6a0931413bf1ad469eb8f9517f82)	2021-08-04 16:51:29 -07:00
Sanjay Patel	2b94ecbbe0	[SROA] prevent crash on large memset length (PR50910) I don't know much about this pass, but we need a stronger check on the memset length arg to avoid an assert. The current code was added with D59000. The test is reduced from: https://llvm.org/PR50910 Differential Revision: https://reviews.llvm.org/D106462 (cherry picked from commit f2a322bfcfbc62b5523f32c4eded6faf2cad2e24)	2021-08-04 16:51:23 -07:00
Joseph Huber	30f0ebbbe2	[Attributor] Don't test internalization in the CGSCC pass. Summary: Enabling internalization in the Attributor's CGSCC pass does something different that we don't expect. Ignore this for now to pass the tests. (cherry picked from commit 97851a08e2684388dec24fbe46818704052f9dbe)	2021-08-04 16:35:08 -07:00
Joseph Huber	4f1fd1c209	[Attributor] Change function internalization to not replace uses in internalized callers The current implementation of function internalization creates a copy of each function and replaces every use. This has the downside that the external versions of the functions will call into the internalized versions of the functions. This prevents them from being fully independent of each other. This patch replaces the current internalization scheme with a method that creates all the copies of the functions intended to be internalized first and then replaces the uses as long as their caller is not already internalized. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D106931 (cherry picked from commit adbaa39dfce7a8361d89b6a3b382fd8f50b94727)	2021-08-04 16:35:01 -07:00
Cullen Rhodes	710ae2bfd3	[ReleaseNotes] Add scalable matrix extension support to AArch64 changes Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D106853	2021-08-03 15:24:36 +00:00
Muhammad Omair Javaid	bafc7f359f	[llvm][Release notes] Add AArch64 SVE, PAC and LLDB prebuilt binary This patch updates LLVM release notes to add an announcement about AArch64 SVE, PAC and LLDB prebuilt binary.	2021-08-03 20:20:07 +05:00
David Spickett	aa2a6b072f	[llvm][Release notes] Add memory tagging support to lldb changes	2021-08-03 12:25:36 +00:00
Sanjay Patel	d4eedb2312	[Analysis] improve function signature checking for snprintf The check for size_t parameter 1 was already here for snprintf_chk, but it wasn't applied to regular snprintf. This could lead to mismatching and eventually crashing as shown in: https://llvm.org/PR50885 (cherry picked from commit 7f5555776513f174729a686ed01270e23462aaf7)	2021-08-02 22:58:39 -07:00
Jose M Monsalve Diaz	9770d34891	[OpenMP] Fixing llvm-omp-device-info compilation with runtimes When using `-DLLVM_ENABLE_RUNTIMES` instead of `-DLLVM_ENABLE_PROJECTS` the `llvm-omp-device-info` tool is not compiled or installed. In general, no llvm tool would be built on runtimes, because the -DLLVM_BUILD_TOOLS flag is removed by the way runtimes compilation calls cmake again. This patch is simple. Just forward the value of this flag to the runtime cmake command. I'm also removing an unnecessary comment in the compilation of the tool Differential Revision: https://reviews.llvm.org/D107177 (cherry picked from commit 5424ceeda0534ab382e2a6cb192099f76ee8b12c)	2021-08-02 20:05:19 -07:00
Simon Pilgrim	70f5e23577	[X86][AVX] Add test case for PR51281 (cherry picked from commit 6569b7f90239b5932465a1c6936632b4a9527d66)	2021-08-02 20:05:12 -07:00
Sanjay Patel	df3286259b	[DAGCombiner] don't try to partially reduce add-with-overflow ops This transform was added with D58874, but there were no tests for overflow ops. We need to change this one way or another because it can crash as shown in: https://llvm.org/PR51238 Note that if there are no uses of an overflow op's bool overflow result, we reduce it to a regular math op, so we continue to fold that case either way. If we have uses of both the math and the overflow bool, then we are likely not saving anything by creating an independent sub instruction as seen in the test diffs here. This patch makes the behavior in SDAG consistent with what we do in instcombine AFAICT. Differential Revision: https://reviews.llvm.org/D106983 (cherry picked from commit fa6b2c9915ba27e1e97f8901ea4aa877f331fb9f)	2021-08-02 13:52:48 -07:00
Sanjay Patel	a0686462c3	[AArch64][x86] add tests for add-with-overflow folds; NFC There's a generic combine for these, but no test coverage. It's not clear if this is actually a good fold. The combine was added with D58874, but it has a bug that can cause crashing ( https://llvm.org/PR51238 ). (cherry picked from commit e427077ec10ea18ac21f5065342183481d87783a)	2021-08-02 13:52:42 -07:00
Sanjay Patel	b92c9f9565	[DivRemPairs] make sure we have a valid CFG for hoisting division This transform was added with e38b7e894808ec2 and as shown in: https://llvm.org/PR51241 ...it could crash without an extra check of the blocks. There might be a more compact way to write this constraint, but we can't just count the successors/predecessors without affecting a test that includes a switch instruction. (cherry picked from commit 5b83261c1518a39636abe094123f1704bbfd972f)	2021-08-02 13:52:37 -07:00
Craig Topper	7c9c296915	[RISCV] Restrict performANY_EXTENDCombine to prevent an infinite loop. The sign_extend we insert here can get turned into a zero_extend if the sign bit is known zero. This can enable a setcc combine that shrinks compares with zero_extend. This reduces the use count of the zero_extend allowing other combines to turn it back into an any_extend. This restricts the combine to only cases where the result is used by a CopyToReg. This works for my original motivating case. I hope the CopyToReg use will prevent any converted extends from turning back into an any_extend. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D106754 (cherry picked from commit 54588bcc052e5b08f90e672c33d0c1ad4eda2424)	2021-08-02 11:31:08 -07:00
Alexandros Lamprineas	276fcebbe0	[AArch64] Legalize MVT::i64x8 in DAG isel lowering This patch legalizes the Machine Value Type introduced in D94096 for loads and stores. A new target hook named getAsmOperandValueType() is added which maps i512 to MVT::i64x8. GlobalISel falls back to DAG for legalization. Differential Revision: https://reviews.llvm.org/D94097	2021-08-02 15:45:58 +01:00
Alexandros Lamprineas	a50e569197	[AArch64] Add a Machine Value Type for 8 consecutive registers Adds MVT::i64x8, a Machine Value Type needed for lowering inline assembly operands which materialize a sequence of eight general purpose registers. Differential Revision: https://reviews.llvm.org/D94096	2021-08-02 15:45:58 +01:00
Jeremy Morse	cd0096f439	[DebugInfo][InstrRef] Don't break up ret-sequences on debug-info instrs When we have a terminator sequence (i.e. a tailcall or return), MIIsInTerminatorSequence is used to work out where the preceding ABI-setup instructions end, i.e. the parts that were glued to the terminator instruction. This allows LLVM to split blocks safely without having to worry about ABI stuff. The function only ignores DBG_VALUE instructions, meaning that the two debug instructions I recently added can end terminator sequences early, causing various MachineVerifier errors. This patch promotes the test for debug instructions from "isDebugValue" to "isDebugInstr", thus avoiding any debug-info interfering with this function. Differential Revision: https://reviews.llvm.org/D106660 (cherry picked from commit 8612417e5a54cfef941ab45de55e48b4a0c4e8b4)	2021-07-29 15:08:13 +01:00
Bradley Smith	183b0c7c98	[AArch64][SVE] Fix incorrect mask type when lowering fixed type SVE gather/scatter An incorrect mask type when lowering an SVE gather/scatter was causing a codegen fault which manifested as the incorrect predicate size being used for an SVE gather/scatter, (e.g. p0.b rather than p0.d). Fixes PR51182. Differential Revision: https://reviews.llvm.org/D106943 (cherry picked from commit 191831e380f317cd2baa5d48abe02d1d11cd44cb)	2021-07-29 07:03:40 -07:00
Diana Picus	923213f844	test-release.sh: Kill python2 Don't prefer python2's virtualenv when setting up the test-suite. Always use python3 instead, since that's what we support everywhere else anyway. Differential Revision: https://reviews.llvm.org/D106941	2021-07-29 10:28:39 +02:00
Chris Jackson	9a10dd5b1c	Revert "[DebugInfo][LoopStrengthReduction] SCEV-based salvaging for LSR" This was reverted due to a reported crash. This reverts commit 796b84d26f4d461fb50e7b4e84e15a10eaca88fc.	2021-07-29 00:04:50 +01:00
Valentin Clement	c463fa6cad	[mlir][openacc] Initial translation for DataOp to LLVM IR Add basic translation of acc.data to LLVM IR with runtime calls. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D104301	2021-07-27 22:04:04 -04:00
Jose M Monsalve Diaz	5b7208da36	[OpenMP] Folding threadLimit and numThreads when single value in kernels The device runtime contains several calls to `__kmpc_get_hardware_num_threads_in_block` and `__kmpc_get_hardware_num_blocks`. If the thread_limit and the num_teams are constant, these calls can be folded to the constant value. In this patch we use the already introduced `AAFoldRuntimeCall` and the `NumTeams` and `NumThreads` kernel attributes (to be introduced in a different patch) to fold these functions. The code checks all the kernels, and if their attributes match, the functions are folded. In the future we will explore specializing for multiple values of NumThreads and NumTeams. Depends on D106390 Reviewed By: jdoerfert, JonChesterfield Differential Revision: https://reviews.llvm.org/D106033	2021-07-27 21:47:12 -04:00
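
The rounding change described in commit db94372a40 ([MCA] TimelineView::printWaitTimeEntry) can be sketched as a standalone comparison. This is only an illustration built from the two snippets quoted in that commit message, not the actual llvm-mca source: the function names `roundBefore` and `roundAfter` are hypothetical, and the sketch assumes `Input * 10` does not overflow an `unsigned`.

```cpp
#include <cmath>

// Form quoted as the "before" logic: divide first, then scale by 10 and
// round in floating point. (Hypothetical helper name.)
double roundBefore(unsigned Input, unsigned CumulativeExecutions) {
  double AverageTime = (double)Input / CumulativeExecutions;
  return std::floor((AverageTime * 10) + 0.5) / 10;
}

// Form quoted as the "after" logic: the multiply by 10 is hoisted onto the
// unsigned integer, so one floating-point multiply is avoided.
// Assumption: Input * 10 fits in an unsigned. (Hypothetical helper name.)
double roundAfter(unsigned Input, unsigned CumulativeExecutions) {
  double AverageTime = (double)(Input * 10) / CumulativeExecutions;
  return std::floor(AverageTime + 0.5) / 10;
}
```

For ordinary inputs the two forms agree (for example, both round 7 / 2 to 3.5). The second form performs one fewer floating-point operation before the `floor`, which is the property the commit message relies on to reduce differences between build configurations.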


219414 Commits