llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-26 04:32:44 +01:00

Author	SHA1	Message	Date
Duncan P. N. Exon Smith	99e33d3043	Support: Remove F_{None,Text,Append} compatibility synonyms, NFC Remove the compatibility spellings of `OF_{None,Text,Append}` that were left behind by 1f67a3cba9b09636c56e2109d8a35ae96dc15782. No functionality change here, just an API cleanup. Differential Revision: https://reviews.llvm.org/D101506	2021-06-15 12:04:09 -07:00
Bob Haarman	99d0b11cd5	[X86] avoid assert with varargs, soft float, and no-implicit-float Fixes: - PR36507 Floating point varargs are not handled correctly with -mno-implicit-float - PR48528 __builtin_va_start assumes it can pass SSE registers when using -Xclang -msoft-float -Xclang -no-implicit-float On x86_64, floating-point parameters are normally passed in XMM registers. For va_start, we spill those to memory so va_arg can find them. There is an interaction here with -msoft-float and -no-implicit-float: When -msoft-float is in effect, instead of passing floating-point parameters in XMM registers, they are passed in general-purpose registers. When -no-implicit-float is in effect, it "disables implicit floating-point instructions" (per the LangRef). The intended effect is to not have the compiler generate floating-point code unless explicit floating-point operations are present in the source code, but what exactly counts as an explicit floating-point operation is not specified. The existing behavior of LLVM here has led to some surprises and PRs. This change modifies the behavior as follows:
| soft | no-implicit | old behavior | new behavior |
| no | no | spill XMM regs | spill XMM regs |
| yes | no | don't spill XMM | don't spill XMM |
| no | yes | don't spill XMM | spill XMM regs |
| yes | yes | assert | don't spill XMM |
In particular, this avoids the assert that happens when -msoft-float and -no-implicit-float are both in effect. This seems like a perfectly reasonable combination: If we don't want to rely on hardware floating-point support, we want to both avoid using float registers to pass parameters and avoid having the compiler generate floating-point code that wasn't in the original program. Instead of crashing the compiler, the new behavior is to not synthesize floating-point code in this case. This fixes PR48528. The other interesting case is when -no-implicit-float is in effect, but -msoft-float is not. In that case, any floating-point parameters that are present will be in XMM registers, and so we have to spill them to correctly handle those. This fixes PR36507. The spill is conditional on %al indicating that parameters are present in XMM registers, so no floating-point code will be executed unless the function is called with floating-point parameters. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D104001	2021-06-15 11:27:35 -07:00
Stanislav Mekhanoshin	2d316f51f6	[AMDGPU] Fix lds superalign test. NFC.	2021-06-15 11:02:34 -07:00
Roman Lebedev	10ca53ce65	[NewPM] Remove SpeculateAroundPHIs pass Addition of this pass has been botched. There is no particular reason why it had to be sold as an inseparable part of new-pm transition. It was added when old-pm was still the default, and very very few users were actually tracking new-pm, so its effects weren't measured. Which means, some of the turmoil of the new-pm transition is actually likely regressions due to this pass. Likewise, there has been a number of post-commit feedback (post new-pm switch), namely * https://reviews.llvm.org/D37467#2787157 (regresses HW-loops) * https://reviews.llvm.org/D37467#2787259 (should not be in middle-end, should run after LSR, not before) * https://reviews.llvm.org/D95789 (an attempt to fix bad loop backedge metadata) and in the half year past, the pass authors (google) still haven't found time to respond to any of that. Hereby it is proposed to backout the pass from the pipeline, until someone who cares about it can address the issues reported, and properly start the process of adding a new pass into the pipeline, with proper performance evaluation. Furthermore, neither google nor facebook reports any perf changes from this change, so i'm dropping the pass completely. It can always be re-reverted should anyone want to pick it up again. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D104099	2021-06-15 20:35:55 +03:00
David Green	89b17ef583	Revert "[ARM] Extend narrow values to allow using truncating scatters" This commit adds nodes that might not always be used, which the expensive checks builder does not like. Reverting for now to think up a better way of handling it.	2021-06-15 18:19:25 +01:00
Arthur Eubanks	d046164d2f	[NFC][OpaquePtr] Avoid calling getPointerElementType() Pointee types are going away soon. For this, we mostly just care about store/load types, which are already available without the pointee types. The other intrinsics always use i8*. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D103719	2021-06-15 09:53:12 -07:00
Arthur Eubanks	04c6a23a12	[NFC] Remove redundant variable Differential Revision: https://reviews.llvm.org/D103706	2021-06-15 09:53:11 -07:00
David Green	acdf5fd435	[ARM] Extend narrow values to allow using truncating scatters As a minor adjustment to the existing lowering of offset scatters, this extends any smaller-than-legal vectors into full vectors using a zext, so that the truncating scatters can be used. Due to the way MVE legalizes the vectors this should be cheap in most situations, and will prevent the vector from being scalarized. Differential Revision: https://reviews.llvm.org/D103704	2021-06-15 17:45:14 +01:00
David Green	f3d7b360ea	[ARM] Use rq gather/scatters for smaller v4 vectors A pointer will always fit into an i32, so a rq offset gather/scatter can be used with v4i8 and v4i16 gathers, using a base of 0 and the Ptr as the offsets. The rq gather can then correctly extend the type, allowing us to use the gathers without falling back to scalarizing. This patch rejigs tryCreateMaskedGatherOffset in the MVEGatherScatterLowering pass to decompose the Ptr into Base:0 + Offset:Ptr (with a scale of 1), if the Ptr could not be decomposed from a GEP. v4i32 gathers will already use qi gathers, this extends that to v4i8 and v4i16 gathers using the extending rq variants. Differential Revision: https://reviews.llvm.org/D103674	2021-06-15 17:06:15 +01:00
Andrew Litteken	48bef97e44	[IROutliner] Adding DebugInfo handling for IR Outlined Functions This adds support for functions outlined by the IR Outliner to be recognized by the debugger. The expected behavior is that it will skip over the instructions included in that section. This is due to the fact that we can not say which of the original locations the instructions originated from. These functions will show up in the call stack, but you cannot step through them. Reviewers: paquette, vsk, djtodoro Differential Revision: https://reviews.llvm.org/D87302	2021-06-15 10:57:08 -05:00
David Green	e55f90b072	[ARM] Rejig some of the MVE gather/scatter lowering pass. NFC This adjusts some of how the gather/scatter lowering pass passes around data and where certain gathers/scatters are created from. It should not affect code generation on its own, but allows other patches to more clearly reason about the code. A number of extra test cases were also added for smaller gathers/scatters that can be extended, and some of the test comments were updated.	2021-06-15 15:38:39 +01:00
Simon Pilgrim	30958bdc95	[llvm-exegesis] Fix X86LbrCounter destructor to correctly unmap memory and not double-close fd (PR50620) As was reported on PR50620, the X86LbrCounter destructor was double-closing the file descriptor and not unmapping the buffer. Differential Revision: https://reviews.llvm.org/D104201	2021-06-15 14:24:35 +01:00
LLVM GN Syncbot	5b8f203a86	[gn build] Port 4eb9fe2e1a07	2021-06-15 12:01:01 +00:00
Florian Hahn	54872e596c	[LoopDeletion] Check for irreducible cycles when deleting loops. Loops with irreducible cycles may loop infinitely. Those cannot be removed, unless the loop/function is marked as mustprogress. Also discussed in D103382. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D104238	2021-06-15 12:56:12 +01:00
Lang Hames	5dbc71a3b0	[ORC] Fix endianness in manual serialization to match WrapperFunctionUtils.	2021-06-15 21:51:52 +10:00
Lang Hames	792e37b227	[ORC] Fix missing std::move.	2021-06-15 21:42:58 +10:00
Lang Hames	0c8cd2ec6f	[ORC] Fix narrowing-in-initializer-list warnings.	2021-06-15 21:39:16 +10:00
Lang Hames	8b99b0a62e	[ORC] Fix missing function in unit test.	2021-06-15 21:39:00 +10:00
Lang Hames	8d05185ee9	[ORC] Make WrapperFunctionResult's ValuePtr member non-const. The const qualifier was a hangover from an earlier iteration that allowed wrapper functions to return pointers to const memory. This feature has been removed, so there's no reason for this to be const any more, and removing it eliminates const-cast warnings.	2021-06-15 21:24:12 +10:00
Lang Hames	e11b1aca83	[ORC] Port WrapperFunctionUtils and SimplePackedSerialization from ORC runtime. Replace the existing WrapperFunctionResult type in llvm/include/ExecutionEngine/Orc/Shared/TargetProcessControlTypes.h with a version adapted from the ORC runtime's implementation. Also introduce the SimplePackedSerialization scheme (also adapted from the ORC runtime's implementation) for wrapper functions to avoid manual serialization and deserialization for calls to runtime functions involving common types.	2021-06-15 21:13:57 +10:00
Neil Henning	8521aa2a65	ABI breaking changes fixes. This commit mostly just replaces bad uses of `NDEBUG` with uses of `LLVM_ENABLE_ABI_BREAKING_CHANGES` - the safe way to include ABI breaking changes (normally extra struct elements in headers). Differential Revision: https://reviews.llvm.org/D104216	2021-06-15 11:08:13 +01:00
Roman Lebedev	586aaeabf1	[X86] Schedule-model second (mask) output of GATHER instruction Much like `mulx`'s `WriteIMulH`, there are two outputs of AVX2 GATHER instructions. This was changed back in rL160110, but the sched model change wasn't present. So right now, for sched models that are marked as complete (`znver3` only now), codegen'ning `GATHER` results in a crash: ``` DefIdx 1 exceeds machine model writes for early-clobber renamable $ymm3, dead early-clobber renamable $ymm2 = VPGATHERDDYrm killed renamable $ymm3(tied-def 0), undef renamable $rax, 4, renamable $ymm0, 0, $noreg, killed renamable $ymm2(tied-def 1) :: (load 32, align 1) ``` https://godbolt.org/z/Ks7zW7WGh I'm guessing we need to deal with this like we deal with `WriteIMulH`. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D104205	2021-06-15 12:04:33 +03:00
Andrea Di Biagio	ad80113e58	[MCA][InstrBuilder] Check for the presence of flag VariadicOpsAreDefs. This patch fixes the logic that checks for variadic register definitions. Before llvm-svn 348114 (commit 4cf35b4ab0b), it was not possible to explicitly mark variadic operands as definitions. By default, variadic operands of an MCInst were always assumed to be uses. A number of ad-hoc checks were introduced in the InstrBuilder to fix the processing of variadic register operands of ARM ldm/stm variants. This patch simply replaces those old (and buggy) checks with a much simpler (and correct) check for MCID::Flag::VariadicOpsAreDefs.	2021-06-15 09:52:38 +01:00
Jay Foad	c3a38401e0	[IR] Remove forward declaration of GraphTraits from Type.h This has been unnecessary since r352353 removed GraphTraits specializations for Type, except that a couple of other headers were accidentally relying on this declaration. Differential Revision: https://reviews.llvm.org/D104119	2021-06-15 09:23:45 +01:00
LLVM GN Syncbot	d90014c622	[gn build] Port d0a5d8611935	2021-06-15 05:56:32 +00:00
CarlosAlbertoEnciso	8355a030c0	[Debug-Info][CodeView] Fix GUID string generation for MSVC generated objects. This patch is to address https://bugs.llvm.org/show_bug.cgi?id=50459. YAML:455:28: error: GUID strings are 38 characters long The valid format for a GUID is {XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX} where X is a hex digit (0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F). The length of the individual components must be: 8, 4, 4, 4, 12. For some cases, the converted string generated by obj2yaml, does not comply with those lengths. yaml2obj checks that the GUID string must be 38 characters including the dashes and braces. Reviewed By: amccarth Differential Revision: https://reviews.llvm.org/D103089	2021-06-15 06:53:21 +01:00
CarlosAlbertoEnciso	75ed4fb122	Revert "[NFC] This is a test commit to check commit access." This reverts commit b4d40e19def8c2e1a77ae30b5ac16751d1c461f7.	2021-06-15 06:25:22 +01:00
Carlos Alberto Enciso	fc72a61cfd	[NFC] This is a test commit to check commit access. Add full stop at the end of comment.	2021-06-15 06:20:31 +01:00
Craig Topper	e3c98d1dad	[X86] Use EVT::getVectorVT instead of changeVectorElementType in reduceVMULWidth. Changing vector element type doesn't work for v6i32->v6i16 now that v6i32 is an MVT and v6i16 is not. I would like to fix this in changeVectorElementType, but you need a LLVMContext to call getVectorVT which we can't get from an MVT. Fixes PR50709.	2021-06-14 22:07:04 -07:00
Kai Luo	adf785206e	[PowerPC] Export 16 byte load-store instructions Export `lq`, `stq`, `lqarx` and `stqcx.` in preparation for implementing 16-byte lock free atomic operations on AIX. Add a new register class `g8prc` for these instructions, since these instructions require even-odd register pair. Reviewed By: nemanjai, jsji, #powerpc Differential Revision: https://reviews.llvm.org/D103010	2021-06-15 01:56:10 +00:00
Vitaly Buka	a3de872dd6	[NFC][sanitizer] clang-format some code	2021-06-14 18:05:22 -07:00
Jacob Hegna	aeb9af0756	Remove redundant environment variable XLA_FLAGS. If the flag is not set, the script saved_model_aot_compile.py in tensorflow will default it to the correct value. However, in TF 2.5, the way the value is set in TensorFlowCompile.cmake file triggers a build error. Reviewed By: mtrofin Differential Revision: https://reviews.llvm.org/D103972	2021-06-14 23:58:22 +00:00
Adrian Prantl	dfb1691713	Allow signposts to take advantage of deferred string substitution One nice feature of the os_signpost API is that format string substitutions happen in the consumer, not the logging application. LLVM's current Signpost class doesn't take advantage of this though and instead always uses a static "Begin/End %s" format string. This patch uses variadic macros to allow the API to be used as intended. Unfortunately, the primary use-case I had in mind (the LLDB_SCOPED_TIMER() macro) does not get much better from this, because __PRETTY_FUNCTION__ is not a macro, but a static string, so signposts created by LLDB_SCOPED_TIMER() still use a static "%s" format string. At least LLDB_SCOPED_TIMERF() works as intended. This reapplies the previously reverted patch with additional include order fixes for non-modular builds of LLDB. Differential Revision: https://reviews.llvm.org/D103575	2021-06-14 16:53:41 -07:00
Huihui Zhang	6dc3e5ee9a	[SVE][LSR] Teach LSR to enable simple scaled-index addressing mode generation for SVE. Currently, Loop strength reduce is not handling loops with scalable stride very well. Take loop vectorized with scalable vector type <vscale x 8 x i16> for instance, (refer to test/CodeGen/AArch64/sve-lsr-scaled-index-addressing-mode.ll added). Memory accesses are incremented by "16*vscale", while induction variable is incremented by "8*vscale". The scaling factor "2" needs to be extracted to build candidate formula i.e., "reg(%in) + 2*reg({0,+,(8 * %vscale)}". So that addrec register reg({0,+,(8*vscale)}) can be reused among Address and ICmpZero LSRUses to enable optimal solution selection. This patch allows LSR getExactSDiv to recognize special cases like "C1*X*Y /s C2*X*Y", and pull out "C1 /s C2" as scaling factor whenever possible. Without this change, LSR is missing candidate formula with proper scaled factor to leverage target scaled-index addressing mode. Note: This patch doesn't fully fix AArch64 isLegalAddressingMode for scalable vector, but allows simple valid scale to pass through. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D103939	2021-06-14 16:42:34 -07:00
Adrian Prantl	299552b38b	Revert "Allow signposts to take advantage of deferred string substitution" This reverts commit 03841edde7eee21d1d450041ab9a113a7e1be869. Unfortunately this still breaks the LLDB standalone bot.	2021-06-14 16:09:04 -07:00
Matt Morehouse	3601822b05	[HWASan] Enable globals support for LAM. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D104265	2021-06-14 14:20:44 -07:00
Adrian Prantl	770268a3e9	Allow signposts to take advantage of deferred string substitution One nice feature of the os_signpost API is that format string substitutions happen in the consumer, not the logging application. LLVM's current Signpost class doesn't take advantage of this though and instead always uses a static "Begin/End %s" format string. This patch uses variadic macros to allow the API to be used as intended. Unfortunately, the primary use-case I had in mind (the LLDB_SCOPED_TIMER() macro) does not get much better from this, because __PRETTY_FUNCTION__ is not a macro, but a static string, so signposts created by LLDB_SCOPED_TIMER() still use a static "%s" format string. At least LLDB_SCOPED_TIMERF() works as intended. This reapplies the previously reverted patch with additional MachO.h macro #undefs. Differential Revision: https://reviews.llvm.org/D103575	2021-06-14 14:19:41 -07:00
Roman Lebedev	8401a57c05	[TLI] SimplifyDemandedVectorElts(): handle SCALAR_TO_VECTOR(EXTRACT_VECTOR_ELT(?, 0)) Iff we have `SCALAR_TO_VECTOR` (and we demand its only defined 0'th element), and said scalar was produced by `EXTRACT_VECTOR_ELT` from the 0'th element of some vector, then we can just continue traversal into said source vector. This comes up in X86 vector uniform shift lowering. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D104250	2021-06-14 23:52:53 +03:00
Piotr Sobczak	c63f0c139e	[AMDGPU] Limit runs of fixLdsBranchVmemWARHazard The code in fixLdsBranchVmemWARHazard looks for patterns of a vmem/lds access followed by a branch, followed by an lds/vmem access. The handling of the hazard requires an arbitrary number of instructions to process. In the worst case where a function has a vmem access, but no lds accesses, all instructions are examined only to conclude that the hazard cannot occur. Add the pre-processing stage which detects if there is both lds and vmem present in the function and only then does the more costly search. This patch significantly improves compilation time in the cases the hazard cannot happen. In one pathological case I looked at IsHazardInst is needlessly called 88.6 million times. The numbers could also be improved by introducing a map around the inner calls to ::getWaitStatesSince in fixLdsBranchVmemWARHazard, but nothing will beat not running fixLdsBranchVmemWARHazard at all in the cases detected by shouldRunLdsBranchVmemWARHazardFixup(). Differential Revision: https://reviews.llvm.org/D104219	2021-06-14 22:30:23 +02:00
Arthur Eubanks	675b987cd3	Move some code under NDEBUG from D103135	2021-06-14 11:39:12 -07:00
Arthur Eubanks	db00e9aaf8	Remove accidentally added debugging code from D103135	2021-06-14 11:11:40 -07:00
Saleem Abdulrasool	a4fccf0a10	X86: pass swift_async context in R14 on Win64 Pass swift_async context in a callee-saved register rather than as a regular parameter. This is similar to the Swift `self` and `error` parameters.	2021-06-14 11:02:21 -07:00
Arthur Eubanks	25efb3e5da	[docs][OpaquePtr] Shuffle around the transition plan section Emphasize that this is basically an attempt to remove ``PointerType::getElementType`` and ``Type::getPointerElementType()``. Add a couple more subtasks. Differential Revision: https://reviews.llvm.org/D104151	2021-06-14 10:59:41 -07:00
Arthur Eubanks	a32daab226	[OpaquePtr] Remove existing support for forward compatibility It assumes that PointerType will keep having an optional pointee type, but we'd like to remove the pointee type in PointerType at some point. I feel like the current implementation could be simplified anyway, although perhaps I'm underestimating the amount of work needed throughout BitcodeReader. We will still need a side table to keep track of pointee types. This will be reimplemented at some point. This is essentially a revert of a4771e9d (which doesn't look like it was reviewed anyway). Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D103135	2021-06-14 10:52:56 -07:00
wlei	c4ed78c10b	[CSSPGO] Aggregation by the last K context frames for cold profiles This change provides the option to merge and aggregate cold context by the last k frames instead of context-less name. By default K = 1 means the context-less one. This is for better perf tuning. The more selective merging and trimming will rely on llvm-profgen's preinliner. Reviewed By: wenlei, hoy Differential Revision: https://reviews.llvm.org/D104131	2021-06-14 10:33:43 -07:00
Fraser Cormack	94d12e2de0	[RISCV] Transform unaligned RVV vector loads/stores to aligned ones This patch adds support for loading and storing unaligned vectors via an equivalently-sized i8 vector type, which has support in the RVV specification for byte-aligned access. This offers a more optimal path for handling of unaligned fixed-length vector accesses, which are currently scalarized. It also prevents crashing when `LegalizeDAG` sees an unaligned scalable-vector load/store operation. Future work could be to investigate loading/storing via the largest vector element type for the given alignment, in case that would be more optimal on hardware. For instance, a 4-byte-aligned nxv2i64 vector load could be loaded as nxv4i32 instead of as nxv16i8. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D104032	2021-06-14 18:12:18 +01:00
Sanjay Patel	224787dbb2	[InstCombine] add DeMorgan folds for logical ops in select form We canonicalized to these select patterns (poison-safe logic) with D101191, so we need to reduce 'not' ops when possible as we would with 'and'/'or' instructions. This is shown in a secondary example in: https://llvm.org/PR50389 https://alive2.llvm.org/ce/z/BvsESh	2021-06-14 12:54:35 -04:00
Sanjay Patel	b4782a5249	[InstCombine] add tests for logical and/or with not ops; NFC	2021-06-14 12:54:35 -04:00
Florian Hahn	b893da34b6	[LoopDeletion] Add test with irreducible control flow in loop. Currently the irreducible cycles in the loops are ignored. The irreducible cycle may loop infinitely in irreducible_subloop_no_mustprogress, which is allowed and the loop should not be removed. Discussed in D103382.	2021-06-14 17:42:32 +01:00
Florian Hahn	5af9313d31	[VectorCombine] Limit scalarization to non-poison indices for now. As Eli mentioned post-commit in D103378, the result of the freeze may still be out-of-range according to Alive2. So for now, just limit the transform to indices that are non-poison.	2021-06-14 16:40:14 +01:00
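
The old/new behavior table in commit 99d0b11cd5 above can be read as a small truth table. The following is only an illustrative sketch of that table, not LLVM's actual lowering code: the names `VaStartAction`, `oldBehavior`, and `newBehavior` are hypothetical and do not exist in LLVM. It suggests that, under the new rules, the spill decision depends on `-msoft-float` alone.

```cpp
// Hypothetical model of the va_start XMM-spill decision from the commit
// message of 99d0b11cd5. None of these names are part of LLVM.
enum class VaStartAction { SpillXMM, DontSpillXMM, Assert };

// Old behavior, row by row from the table: both flags set hit an assert,
// and either flag on its own suppressed the spill.
inline VaStartAction oldBehavior(bool SoftFloat, bool NoImplicitFloat) {
  if (SoftFloat && NoImplicitFloat)
    return VaStartAction::Assert;
  if (SoftFloat || NoImplicitFloat)
    return VaStartAction::DontSpillXMM;
  return VaStartAction::SpillXMM;
}

// New behavior: the table's last column depends only on soft-float, so
// NoImplicitFloat is accepted but deliberately unused here.
inline VaStartAction newBehavior(bool SoftFloat, bool /*NoImplicitFloat*/) {
  return SoftFloat ? VaStartAction::DontSpillXMM : VaStartAction::SpillXMM;
}
```

Calling `newBehavior(false, true)` models the PR36507 case (spill, because parameters may be in XMM registers), and `newBehavior(true, true)` models the PR48528 case (no spill instead of an assert).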

217210 Commits