llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-22 18:54:02 +01:00

Author	SHA1	Message	Date
Nico Weber	54dea701e7	[gn build] (manually) port 78bda894129 from 2012 because 924d62ca4a85 added it to check-llvm	2021-07-22 09:11:54 -04:00
Caroline Concatto	2a337676d3	[LoopVectorize] Fix crash for predicated instruction with scalable VF This patch avoids computing discounts for predicated instructions when the VF is scalable. There is no support for vectorization of loops with division because the vectorizer cannot guarantee that zero divisions will not happen. This loop now does not use VF scalable ``` for (long long i = 0; i < n; i++) if (cond[i]) a[i] /= b[i]; ``` Differential Revision: https://reviews.llvm.org/D101916	2021-07-22 12:48:27 +01:00
Paulo Matos	ceddd7eb41	Add support for zero-sized Scalars as a LowLevelType Opaque values (of zero size) can be stored in memory with the implementation of reference types in the WebAssembly backend. Since MachineMemOperand uses LLTs we need to be able to support zero-sized scalar types in LLTs. Differential Revision: https://reviews.llvm.org/D105423	2021-07-22 13:47:19 +02:00
Florian Mayer	152a339cb1	Revert "[hwasan] Use stack safety analysis." This reverts commit bde9415fef25e9ff6e10595a2f4f5004dd62f10a.	2021-07-22 12:16:16 +01:00
Dawid Jurczak	60d27bc367	[LoopIdiom] Transform memmove-like loop into memmove (PR46179) The purpose of this patch is to teach the loop idiom recognition pass to recognize simple memmove patterns in a similar way to GCC: https://godbolt.org/z/fh95e83od LoopIdiomRecognize already has machinery for memset and memcpy recognition; this patch tries to extend the existing capabilities with minimal effort. Differential Revision: https://reviews.llvm.org/D104464	2021-07-22 13:05:43 +02:00
Florian Mayer	fa5973a54d	[hwasan] Use stack safety analysis. This avoids unnecessary instrumentation. Reviewed By: eugenis, vitalybuka Differential Revision: https://reviews.llvm.org/D105703	2021-07-22 12:04:54 +01:00
Simon Pilgrim	8531d202cf	[InstCombine] Fold (gep (oneuse(gep Ptr, Idx0)), Idx1) -> (gep Ptr, (add Idx0, Idx1)) (PR51069) As noticed on D106352, after we've folded "(select C, (gep Ptr, Idx), Ptr) -> (gep Ptr, (select C, Idx, 0))" if the inner Ptr was also a (now one use) gep we could then merge the geps, using the sum of the indices instead. I've limited this to basic 2-op geps - a more general case further down InstCombinerImpl.visitGetElementPtrInst doesn't have the one-use limitation but only creates the add if it can be created via SimplifyAddInst. https://alive2.llvm.org/ce/z/f8pLfD (Thanks Roman!) Differential Revision: https://reviews.llvm.org/D106450	2021-07-22 10:58:51 +01:00
Simon Tatham	bc23ee33a0	[clang] Use i64 for the !srcloc metadata on asm IR nodes. This is part of a patch series working towards the ability to make SourceLocation into a 64-bit type to handle larger translation units. !srcloc is generated in clang codegen, and pulled back out by llvm functions like AsmPrinter::emitInlineAsm that need to report errors in the inline asm. From there it goes to LLVMContext::emitError, is stored in DiagnosticInfoInlineAsm, and ends up back in clang, at BackendConsumer::InlineAsmDiagHandler(), which reconstitutes a true clang::SourceLocation from the integer cookie. Throughout this code path, it's now 64-bit rather than 32, which means that if SourceLocation is expanded to a 64-bit type, this error report won't lose half of the data. The compiler will tolerate both of i32 and i64 !srcloc metadata in input IR without faulting. Test added in llvm/MC. (The semantic accuracy of the metadata is another matter, but I don't know of any situation where that matters: if you're reading an IR file written by a previous run of clang, you don't have the SourceManager that can relate those source locations back to the original source files.) Original version of the patch by Mikhail Maltsev. Reviewed By: dexonsmith Differential Revision: https://reviews.llvm.org/D105491	2021-07-22 10:24:52 +01:00
David Green	3712bffb3b	[AArch64] Add and update reduction and shuffle costs. NFC	2021-07-22 10:22:42 +01:00
Fraser Cormack	4c884f4ac3	[RISCV] Fix a crash when lowering split float arguments Lowering certain float vectors without legal vector types could cause a crash due to a bad interaction between passing floats via GPRs and argument splitting. Split vector floats appear just like scalar floats. Under certain situations we choose to pass these float arguments via GPRs and use an XLenVT location and set the 'BCvt' info to track how they must be converted back to floating-point values. However, later logic for handling split arguments may take over, in which case we lose the previous information and set the 'Indirect' info, thus incorrectly lowering to integer types. I don't believe that we would have come across the notion of split floating-point arguments before. This patch addresses the issue by updating the lowering so that split arguments are only passed indirectly when they are scalar integer types. This has some change to how we lower some larger illegal float vectors, as can be seen in 'fastcc-float.ll' where the vector is now passed partly in registers and partly on the stack. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D102852	2021-07-22 09:55:26 +01:00
Fraser Cormack	ecdacaf414	[RISCV] Lower more BUILD_VECTOR sequences to RVV's VID This relands a6ca88e908b5befcd9b0f8c8cb40f53095cc17bc which was originally reverted due to overflow bugs in e3fa2b1eab60342dc882b7b888658b03c472fa2b. This patch teaches the compiler to identify a wider variety of `BUILD_VECTOR`s which form integer arithmetic sequences, and to lower them to `vid.v` with modifications for non-unit steps and non-zero addends. The sequences handled by this optimization must either be monotonically increasing or decreasing. Consecutive elements holding the same value indicate a fractional step which, while simple mathematically, becomes more complex to handle both in the realm of lossy integer division and in the presence of `undef`s. For example, a common "interleaving" shuffle index will be lowered by LLVM to both `<0,u,1,u,2,...>` and `<u,0,u,1,u,...>` `BUILD_VECTOR` nodes. Either of these would ideally be lowered to `vid.v` shifted right by 1. Detection of this sequence in presence of general `undef` values is more complicated, however: `<0,u,u,1,>` could match either `<0,0,0,1,>` or `<0,0,1,1,>` depending on later values in the sequence. Both are possible, so backtracking or multiple passes is inevitable. Sticking to monotonic sequences keeps the logic simpler as it can be done in one pass. Fractional steps will likely be a separate optimization in a future patch. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D104921	2021-07-22 09:36:12 +01:00
Timm Bäder	da198e839e	[llvm][tools] Hide remaining unrelated llvm- tool options Differential Revision: https://reviews.llvm.org/D106430	2021-07-22 09:47:55 +02:00
Hsiangkai Wang	c6d5dd6f53	[llvm-mc-assemble-fuzzer] Initialize MCTargetOptions. When running the command from the llvm-mc-assemble-fuzzer documentation, ``` llvm-mc-fuzzer --triple=aarch64-linux-gnu --fuzzer-args -max_len=4 ``` it triggers the following assertion: ``` llvm-mc-assemble-fuzzer: llvm-project/llvm/lib/MC/MCTargetOptionsCommandFlags.cpp:38: bool llvm::mc::getRelaxAll(): Assertion `RelaxAllView && "RegisterMCTargetOptionsFlags not created."' failed. ``` It is caused by the lack of a global RegisterMCTargetOptionsFlags object to initialize the MC target options. Differential Revision: https://reviews.llvm.org/D106417	2021-07-22 14:36:37 +08:00
Johannes Doerfert	f303e248b9	[Attributor][FIX] Improve call graph updating If we remove a non-intrinsic instruction we need to tell the (old) call graph about it. This caused problems with some features down the line as they allowed calls to be removed more aggressively.	2021-07-22 00:07:56 -05:00
Johannes Doerfert	54c73c71f7	[Attributor][FIX] Do not introduce multiple instances of SSA values If we have a recursive function we could create multiple instantiations of an SSA value, one per recursive invocation of the function. This is a problem as we use SSA value equality in various places. The basic idea follows from this test: ``` static int r(int c, int *a) { int X; return c ? r(false, &X) : a == &X; } int test(int c) { return r(c, undef); } ``` If we look through the argument `a` we will end up with `X`. Using SSA value equality we will fold `a == &X` to true and return true even though it should have been false because `a` and `&X` are from different instantiations of the function. Various tests for this have been placed in value-simplify-instances.ll and this commit fixes them all by not producing simplified values that could be non-unique at runtime. Thus, the result of a simplify value call will always be unique at runtime or the original value, neither of which allows accidentally comparing two instances of a value with each other and concluding they are equal statically (pointer equivalence) while they are unequal at runtime.	2021-07-22 00:07:55 -05:00
Johannes Doerfert	b93c50b86a	[Attributor] Improve the Attributor::getAssumedConstant interface Similar to Attributor::getAssumedSimplified we need to allow IRPs directly to get the right simplification callback (and context).	2021-07-22 00:07:55 -05:00
ShihPo Hung	35ac1cb771	[RegisterCoalescer] Make resolveConflicts aware of earlyclobber Prior to this patch, it skipped the instruction defining VNI when checking if the tainted lanes are used. In the given example, VRGATHER is an illegal instruction because its DstReg overlaps with SrcReg. Therefore we need to check the defining instruction as well when there is an earlyclobber constraint. Reviewed By: qcolombet Differential Revision: https://reviews.llvm.org/D105684	2021-07-22 12:11:10 +08:00
Johannes Doerfert	b257ec0d19	[Attributor][NFC] Precommit tests exposing a conceptual simplification problem Value simplification works under the implicit assumption that two SSA values (`llvm::Value`) that are pointer equal are also equal at runtime. This is mostly true except for values that are instantiated multiple times. These test cases expose the problems we currently have when it comes to recursion and multiple instances of values.	2021-07-21 22:51:05 -05:00
Johannes Doerfert	76691e60d6	[OpenMP][FIX] Use name + type checks not only name checks for calls A call that is analyzed in an optimization needs to be verified against the name and type of the runtime function to avoid that we look at arguments that do not exist (anymore). This can happen if the signature was rewritten. Since we will not set RFI.Declaration if the type doesn't match we can use it (if it's not null) to determine if the signature is as expected. Differential Revision: https://reviews.llvm.org/D106341	2021-07-21 22:51:05 -05:00
Johannes Doerfert	59bc220605	[Attributor][NFC] Clang format	2021-07-21 22:51:05 -05:00
Ben Shi	7d4933eff2	[RISCV] Optimize multiplication in the zba extension with SHADD This patch makes the following optimizations. (mul x, 3 * power_of_2) -> (SLLI (SH1ADD x, x), bits) (mul x, 5 * power_of_2) -> (SLLI (SH2ADD x, x), bits) (mul x, 9 * power_of_2) -> (SLLI (SH3ADD x, x), bits) Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D105796	2021-07-22 10:28:41 +08:00
Carl Ritson	8b2d2affc5	[AMDGPU] Add VReg_192/VReg_224 support for MIMG instructions Allow MIMG instructions to be selected with 6/7 VGPRs for vaddr. Previously these were rounded up to VReg_256; this saves VGPRs. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D103800	2021-07-22 10:42:15 +09:00
Carl Ritson	41b211a722	[AMDGPU] Allow frontends to disable null export for pixel shaders Disable null export (for kills) when a frontend defines a pixel shader as not exporting using amdgpu-color-export and amdgpu-depth-export function attributes. This allows the generation of export free pixel shaders. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D105683	2021-07-22 10:20:46 +09:00
Joseph Huber	191a71d3e8	[OpenMP] Strip NoInline from known OpenMP runtime functions This patch strips the NoInline attribute from known OpenMP runtime functions. This is done so that we can denote certain runtime functions as NoInline to ensure their call sites are intact so they can be checked by OpenMPOpt. However, we don't want this noinline attribute to remain on any functions after OpenMPOpt has run. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D106482	2021-07-21 21:18:26 -04:00
Joseph Huber	4323f25227	[OpenMP] Fold `__kmpc_is_generic_main_thread_id` if possible This patch adds the ability to fold `__kmpc_is_generic_main_thread_id` if we know for a fact that it is executed by the initial thread using AAExecutionDomain. This combined with folding `__kmpc_is_spmd_exec_mode` will allow us to fully fold `__kmpc_is_generic_main_thread`. Depends on D106438 D106437 Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D106439	2021-07-21 21:18:22 -04:00
Joseph Huber	e6c2e59f71	[OpenMP] Add an option to disable function internalization Function internalization can sometimes occur in situations where we want to keep the call sites intact. This patch adds an option to disable function internalization and prevents the device runtime from being internalized while creating the bitcode library. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D106438	2021-07-21 21:18:18 -04:00
Joseph Huber	9d542314e4	[Libomptarget] Introduce new main thread ID runtime function This patch introduces `__kmpc_is_generic_main_thread_id` which splits the old comparison into its own runtime function. The purpose of this is so we can fold this part independently, so when both this and `is_spmd_mode` are folded the final function will be folded as well. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D106437	2021-07-21 21:18:14 -04:00
Joseph Huber	2c3ddf5d6f	[OpenMP] Add new execution mode for SPMD execution with Generic semantics Qualified kernels can be transformed from generic-mode to SPMD mode using an optimization in OpenMPOpt. This patch introduces a new execution mode to indicate kernels that have been transformed from generic-mode to SPMD-mode. These kernels have SPMD-mode execution, but need generic-mode semantics for scheduling the blocks and threads. Without this far too few blocks will be scheduled for a generic region as SPMD mode expects the trip count to be divided by the number of threads. Reviewed By: ggeorgakoudis Differential Revision: https://reviews.llvm.org/D106460	2021-07-21 20:57:28 -04:00
Joseph Huber	472a223072	[OpenMP] Change `__kmpc_free_shared` to include the paired allocation size This patch changes `__kmpc_free_shared` to take an additional argument corresponding to the associated allocation's size. This makes it easier to implement the allocator in the runtime. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D106496	2021-07-21 20:56:21 -04:00
Lang Hames	549c960a94	Re-re-revert "[ORC][ORC-RT] Add initial native-TLV support to MachOPlatform." This reverts commit 6b2a96285b9bbe92d2c5e21830f21458f8be976d. The ccache builders are still failing. Looks like they need to be updated to get the llvm-zorg config change in 490633945677656ba75d42ff1ca9d4a400b7b243. I'll re-apply this as soon as the builders are updated.	2021-07-22 10:45:24 +10:00
Jacob Hegna	2182633955	[NFC] Code cleanups in InlineCost.cpp. - annotate const functions with "const" - replace C-style casts with static_cast Differential Revision: https://reviews.llvm.org/D105362	2021-07-22 00:03:36 +00:00
Lang Hames	6f51759135	Re-re-apply "[ORC][ORC-RT] Add initial native-TLV support to MachOPlatform." This reapplies commit a7733e9556b5a6334c910f88bcd037e84e17e3fc ("Re-apply [ORC][ORC-RT] Add initial native-TLV support to MachOPlatform."), and d4abdefc998a1ee19d5edc79ec233774cbf64f6a ("[ORC-RT] Rename macho_tlv.x86-64.s to macho_tlv.x86-64.S (uppercase suffix)"). These patches were reverted in 48aa82cacbff10e1c5395a03f86488bf449ba4da while I investigated bot failures (e.g. https://lab.llvm.org/buildbot/#/builders/109/builds/18981). The fix was to disable building of the ORC runtime on builders using ccache (which is the same fix used for other compiler-rt projects containing assembly code). This fix was committed to llvm-zorg in 490633945677656ba75d42ff1ca9d4a400b7b243.	2021-07-22 09:46:52 +10:00
Thomas Lively	8403ff42b3	[WebAssembly] Replace @llvm.wasm.popcnt with @llvm.ctpop.v16i8 Use the standard target-independent intrinsic to take advantage of standard optimizations. Differential Revision: https://reviews.llvm.org/D106506	2021-07-21 16:45:54 -07:00
Stanislav Mekhanoshin	2ef5dd8386	Prevent dead uses in register coalescer after rematerialization The coalescer does not check if register uses are available at the point of rematerialization. If it attempts to rematerialize an instruction with such uses it can end up with use without a def. LiveRangeEdit does such check during rematerialization, so just call LiveRangeEdit::allUsesAvailableAt() to avoid the problem. Differential Revision: https://reviews.llvm.org/D106396	2021-07-21 15:19:55 -07:00
Jessica Paquette	f9cc27a4e9	[AArch64][GlobalISel] Change | -> || in an if I wrote the wrong type of OR by mistake.	2021-07-21 14:57:31 -07:00
LLVM GN Syncbot	9cab0ceb62	[gn build] Port 74fd3cb8cd3e	2021-07-21 21:45:33 +00:00
Stanislav Mekhanoshin	54565ba720	[AMDGPU] Mark relevant rematerializable VOP3 instructions Differential Revision: https://reviews.llvm.org/D106110	2021-07-21 14:44:13 -07:00
Stanislav Mekhanoshin	b044663832	[AMDGPU] Mark relevant rematerializable VOP2 instructions Differential Revision: https://reviews.llvm.org/D106023	2021-07-21 14:24:59 -07:00
Bill Wendling	fd6bc0cc95	[llvm-diff] Check for recursive initializers We need to check for recursive initializers in the "ConstantStruct" case. Differential Revision: https://reviews.llvm.org/D105616	2021-07-21 14:21:21 -07:00
David Green	f01cf44407	[ARM] Pass SelectionDAG to methods that don't require DCI. NFC In these methods DCI is never used, only the DAG from it. Pass the DAG directly, cleaning up the code a little.	2021-07-21 22:11:09 +01:00
Stanislav Mekhanoshin	c90b3afab5	[AMDGPU] Mark all relevant VOP1 instructions rematerializable Differential Revision: https://reviews.llvm.org/D105919	2021-07-21 14:05:32 -07:00
Fangrui Song	b451f3588d	[sanitizer] Place module_ctor/module_dtor in llvm.used This removes an abuse of ELF linker behaviors while keeping Mach-O/COFF linker behaviors unchanged. ELF: when module_ctor is in a comdat, this patch removes reliance on a linker abuse (an SHT_INIT_ARRAY in a section group retains the whole group) by using SHF_GNU_RETAIN. No linker behavior difference when module_ctor is not in a comdat. Mach-O: module_ctor gets `N_NO_DEAD_STRIP`. No linker behavior difference because module_ctor is already referenced by a `S_MOD_INIT_FUNC_POINTERS` section (GC root). PE/COFF: no-op. SanitizerCoverage already appends module_ctor to `llvm.used`. Other sanitizers: llvm.used for local linkage is not implemented in `TargetLoweringObjectFileCOFF::emitLinkerDirectives` (once implemented or switched to a non-local linkage, COFF can use module_ctor in comdat (i.e. generalize ELF-specific rL301586)). There is no object file size difference. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D106246	2021-07-21 14:03:26 -07:00
Nikita Popov	5d9c2de528	[SimplifyCFG] Fix if conversion with opaque pointers We need to make sure that the value types are the same. Otherwise we both may not have the necessary dereferenceability implication, nor can we directly form the desired select pattern. Without opaque pointers this is enforced implicitly through the pointer comparison.	2021-07-21 22:24:07 +02:00
Nikita Popov	f265bf7812	[SimplifyCFG] Regenerate test checks (NFC)	2021-07-21 22:24:07 +02:00
Stanislav Mekhanoshin	5ee240dc93	[AMDGPU] Move perfhint analysis This is an SCC pass; moving it to the end of the SCC PM saves one Function PM. This needs the analysis to take into account memory access width since it is now placed after the load/store optimizer (D105651). Differential Revision: https://reviews.llvm.org/D105652	2021-07-21 13:06:49 -07:00
Jessica Paquette	cb309a9442	[AArch64][GlobalISel] Widen s2 and s4 G_IMPLICIT_DEF + G_FREEZE These had ``` .clampScalar(0, s1, 64) .widenScalarToNextPow2(0, 8) ``` If you have s2 or s4, then `widenScalarToNextPow2` does nothing. This changes the `widenScalarToNextPow2` rule to use s8 as the minimum type instead, allowing us to correctly widen s2 and s4. This does not impact s1, since it's marked as legal already. Differential Revision: https://reviews.llvm.org/D106413	2021-07-21 12:59:20 -07:00
John McCall	9f30d2a5ae	Fix a bug in OptimizedStructLayout when filling gaps before fixed fields with highly-aligned flexible fields. The code was not considering the possibility that aligning the current offset to the alignment of a queue might push us past the end of the gap. Subtracting the offsets to figure out the maximum field size for the gap then overflowed, making us think that we had nearly unbounded space to fill. Fixes PR 51131.	2021-07-21 15:47:18 -04:00
Stanislav Mekhanoshin	5b3e6630e5	[AMDGPU] Tune perfhint analysis to account for access width A function with fewer memory instructions but wider accesses is the same as a function with more but narrower accesses in terms of memory boundness. In fact the pass would give different answers before and after vectorization without this change. Differential Revision: https://reviews.llvm.org/D105651	2021-07-21 12:46:10 -07:00
Craig Topper	6a9e481d78	[RISCV] Cleanup comment around vector tail policy handling. NFC vmv.x.s and reductions don't ignore tail policy anymore.	2021-07-21 12:45:08 -07:00
Sanjay Patel	ffb5e7ee28	[SROA] avoid crash on memset with constant expression length https://llvm.org/PR50888	2021-07-21 15:20:28 -04:00
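
Note on 7d4933eff2 ([RISCV] Optimize multiplication in the zba extension with SHADD): the rewrite listed in that message relies on the identity x * ((2^N + 1) * 2^bits) == (((x << N) + x) << bits) for N = 1, 2, 3, which gives the constant factors 3, 5 and 9. The sketch below only models that arithmetic on 64-bit values; it is not the code from the patch. The names `shNadd` and `mulViaShAdd` are hypothetical helpers introduced for illustration, and the shift-and-add semantics are assumed to be rd = (rs1 << N) + rs2.

```cpp
#include <cassert>
#include <cstdint>

// Model of the Zba shift-and-add instructions (sh1add, sh2add, sh3add):
// the result is (rs1 << n) + rs2. Unsigned 64-bit arithmetic wraps
// modulo 2^64, like a 64-bit register would.
uint64_t shNadd(unsigned n, uint64_t rs1, uint64_t rs2) {
  assert(n >= 1 && n <= 3 && "only sh1add, sh2add and sh3add are modeled");
  return (rs1 << n) + rs2;
}

// Hypothetical helper: multiply x by (2^n + 1) * 2^bits, that is by
// 3, 5 or 9 times a power of two, using one shift-and-add followed by
// one left shift. This mirrors the shape (SLLI (SHnADD x, x), bits).
uint64_t mulViaShAdd(uint64_t x, unsigned n, unsigned bits) {
  assert(bits < 64 && "shift amount must be smaller than the register width");
  return shNadd(n, x, x) << bits;
}
```

For example, `mulViaShAdd(x, 1, 2)` corresponds to multiplying by 12 (3 * 4), and `mulViaShAdd(x, 2, 3)` to multiplying by 40 (5 * 8).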


219223 Commits