llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-22 18:54:02 +01:00

Author	SHA1	Message	Date
Huihui Zhang	6dc3e5ee9a	[SVE][LSR] Teach LSR to enable simple scaled-index addressing mode generation for SVE. Currently, Loop strengh reduce is not handling loops with scalable stride very well. Take loop vectorized with scalable vector type <vscale x 8 x i16> for instance, (refer to test/CodeGen/AArch64/sve-lsr-scaled-index-addressing-mode.ll added). Memory accesses are incremented by "16vscale", while induction variable is incremented by "8vscale". The scaling factor "2" needs to be extracted to build candidate formula i.e., "reg(%in) + 2reg({0,+,(8 %vscale)}". So that addrec register reg({0,+,(8vscale)}) can be reused among Address and ICmpZero LSRUses to enable optimal solution selection. This patch allow LSR getExactSDiv to recognize special cases like "C1XY /s C2X*Y", and pull out "C1 /s C2" as scaling factor whenever possible. Without this change, LSR is missing candidate formula with proper scaled factor to leverage target scaled-index addressing mode. Note: This patch doesn't fully fix AArch64 isLegalAddressingMode for scalable vector. But allow simple valid scale to pass through. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D103939	2021-06-14 16:42:34 -07:00
Adrian Prantl	299552b38b	Revert "Allow signposts to take advantage of deferred string substitution" This reverts commit 03841edde7eee21d1d450041ab9a113a7e1be869. Unfortunately this still breaks the LLDB standalone bot.	2021-06-14 16:09:04 -07:00
Matt Morehouse	3601822b05	[HWASan] Enable globals support for LAM. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D104265	2021-06-14 14:20:44 -07:00
Adrian Prantl	770268a3e9	Allow signposts to take advantage of deferred string substitution One nice feature of the os_signpost API is that format string substitutions happen in the consumer, not the logging application. LLVM's current Signpost class doesn't take advantage of this though and instead always uses a static "Begin/End %s" format string. This patch uses variadic macros to allow the API to be used as intended. Unfortunately, the primary use-case I had in mind (the LLDB_SCOPED_TIMER() macro) does not get much better from this, because __PRETTY_FUNCTION__ is not a macro, but a static string, so signposts created by LLDB_SCOPED_TIMER() still use a static "%s" format string. At least LLDB_SCOPED_TIMERF() works as intended. This reapplies the previsously reverted patch with additional MachO.h macro #undefs. Differential Revision: https://reviews.llvm.org/D103575	2021-06-14 14:19:41 -07:00
Roman Lebedev	8401a57c05	[TLI] SimplifyDemandedVectorElts(): handle SCALAR_TO_VECTOR(EXTRACT_VECTOR_ELT(?, 0)) Iff we have `SCALAR_TO_VECTOR` (and we demand it's only defined 0'th element), and said scalar was produced by `EXTRACT_VECTOR_ELT` from the 0'th element of some vector, then we can just continue traversal into said source vector. This comes up in X86 vector uniform shift lowering. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D104250	2021-06-14 23:52:53 +03:00
Piotr Sobczak	c63f0c139e	[AMDGPU] Limit runs of fixLdsBranchVmemWARHazard The code in fixLdsBranchVmemWARHazard looks for patterns of a vmem/lds access followed by a branch, followed by an lds/vmem access. The handling of the hazard requires an arbitrary number of instructions to process. In the worst case where a function has a vmem access, but no lds accesses, all instructions are examined only to conclude that the hazard cannot occur. Add the pre-processing stage which detects if there is both lds and vmem present in the function and only then does the more costly search. This patch significantly improves compilation time in the cases the hazard cannot happen. In one pathological case I looked at IsHazardInst is needlesly called 88.6 milions times. The numbers could also be improved by introducing a map around the inner calls to ::getWaitStatesSince in fixLdsBranchVmemWARHazard, but nothing will beat not running fixLdsBranchVmemWARHazard at all in the cases detected by shouldRunLdsBranchVmemWARHazardFixup(). Differential Revision: https://reviews.llvm.org/D104219	2021-06-14 22:30:23 +02:00
Arthur Eubanks	675b987cd3	Move some code under NDEBUG from D103135	2021-06-14 11:39:12 -07:00
Arthur Eubanks	db00e9aaf8	Remove accidentally added debugging code from D103135	2021-06-14 11:11:40 -07:00
Saleem Abdulrasool	a4fccf0a10	X86: pass swift_async context in R14 on Win64 Pass swift_async context in a callee-saved register rather than as a regular parameter. This is similar to the Swift `self` and `error` parameters.	2021-06-14 11:02:21 -07:00
Arthur Eubanks	25efb3e5da	[docs][OpaquePtr] Shuffle around the transition plan section Emphasize that this is basically an attempt to remove ``PointerType::getElementType`` and ``Type::getPointerElementType()``. Add a couple more subtasks. Differential Revision: https://reviews.llvm.org/D104151	2021-06-14 10:59:41 -07:00
Arthur Eubanks	a32daab226	[OpaquePtr] Remove existing support for forward compatibility It assumes that PointerType will keep having an optional pointee type, but we'd like to remove the pointee type in PointerType at some point. I feel like the current implementation could be simplified anyway, although perhaps I'm underestimating the amount of work needed throughout BitcodeReader. We will still need a side table to keep track of pointee types. This will be reimplemented at some point. This is essentially a revert of a4771e9d (which doesn't look like it was reviewed anyway). Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D103135	2021-06-14 10:52:56 -07:00
wlei	c4ed78c10b	[CSSPGO] Aggregation by the last K context frames for cold profiles This change provides the option to merge and aggregate cold context by the last k frames instead of context-less name. By default K = 1 means the context-less one. This is for better perf tuning. The more selective merging and trimming will rely on llvm-profgen's preinliner. Reviewed By: wenlei, hoy Differential Revision: https://reviews.llvm.org/D104131	2021-06-14 10:33:43 -07:00
Fraser Cormack	94d12e2de0	[RISCV] Transform unaligned RVV vector loads/stores to aligned ones This patch adds support for loading and storing unaligned vectors via an equivalently-sized i8 vector type, which has support in the RVV specification for byte-aligned access. This offers a more optimal path for handling of unaligned fixed-length vector accesses, which are currently scalarized. It also prevents crashing when `LegalizeDAG` sees an unaligned scalable-vector load/store operation. Future work could be to investigate loading/storing via the largest vector element type for the given alignment, in case that would be more optimal on hardware. For instance, a 4-byte-aligned nxv2i64 vector load could loaded as nxv4i32 instead of as nxv16i8. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D104032	2021-06-14 18:12:18 +01:00
Sanjay Patel	224787dbb2	[InstCombine] add DeMorgan folds for logical ops in select form We canonicalized to these select patterns (poison-safe logic) with D101191, so we need to reduce 'not' ops when possible as we would with 'and'/'or' instructions. This is shown in a secondary example in: https://llvm.org/PR50389 https://alive2.llvm.org/ce/z/BvsESh	2021-06-14 12:54:35 -04:00
Sanjay Patel	b4782a5249	[InstCombine] add tests for logical and/or with not ops; NFC	2021-06-14 12:54:35 -04:00
Florian Hahn	b893da34b6	[LoopDeletion] Add test with irreducible control flow in loop. Currently the irreducible cycles in the loops are ignored. The irreducible cycle may loop infinitely in irreducible_subloop_no_mustprogress, which is allowed and the loop should not be removed. Discussed in D103382.	2021-06-14 17:42:32 +01:00
Florian Hahn	5af9313d31	[VectorCombine] Limit scalarization to non-poison indices for now. As Eli mentioned post-commit in D103378, the result of the freeze may still be out-of-range according to Alive2. So for now, just limit the transform to indices that are non-poison.	2021-06-14 16:40:14 +01:00
Saleem Abdulrasool	c92efb7d33	SelectionDAG: repair the Windows build 6e5628354e22f3ca40b04295bac540843b8e6482 regressed the Windows build as the return type no longer matched in both branches for the return value type deduction. This uses a bit more compiler magic to deal with that.	2021-06-14 08:25:36 -07:00
zhijian	ad7e1ecf68	[AIX][XCOFF] emit vector info of traceback table. Summary: emit vector info of traceback table. Reviewers: Jason Liu,Hubert Tong Differential Revision: https://reviews.llvm.org/D93659	2021-06-14 11:15:22 -04:00
Florian Hahn	bc6a656349	[ADT] Use unnamed argument for unused arg in StringMapEntryStorage. This silences an 'unsused argument' warning. Similar to c2006f857d80f54b90ed7d911d3e7acf4f46001b.	2021-06-14 15:54:57 +01:00
Jingu Kang	d8d1189bdb	[AArch64] Improve SAD pattern Given a vecreduce_add node, detect the below pattern and convert it to the node sequence with UABDL, [S\|U]ADB and UADDLP. i32 vecreduce_add( v16i32 abs( v16i32 sub( v16i32 [sign\|zero]_extend(v16i8 a), v16i32 [sign\|zero]_extend(v16i8 b)))) =================> i32 vecreduce_add( v4i32 UADDLP( v8i16 add( v8i16 zext( v8i8 [S\|U]ABD low8:v16i8 a, low8:v16i8 b v8i16 zext( v8i8 [S\|U]ABD high8:v16i8 a, high8:v16i8 b Differential Revision: https://reviews.llvm.org/D104042	2021-06-14 15:48:51 +01:00
LLVM GN Syncbot	4f6c145c6a	[gn build] Port c820b494d6e1	2021-06-14 14:41:33 +00:00
Roman Lebedev	7a71822528	[NFC][DAGCombine] Extract getFirstIndexOf() lambda back into a function Not all supported compilers like such lambdas, at least one buildbot is unhappy.	2021-06-14 16:25:59 +03:00
Roman Lebedev	9f4eaf3945	[DAGCombine] reduceBuildVecToShuffle(): sort input vectors by decreasing size The sorting, obviously, must be stable, else we will have random assembly fluctuations. Apparently there was no test coverage that would benefit from that, so i've added one test. The sorting consists of two parts - just sort the input vectors, and recompute the shuffle mask -> input vector mapping. I don't believe we need to do anything else. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D104187	2021-06-14 16:18:37 +03:00
Jeroen Dobbelaere	c08eaddde6	Intrinsic::getName: require a Module argument Ensure that we provide a `Module` when checking if a rename of an intrinsic is necessary. This fixes the issue that was detected by https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=32288 (as mentioned by @fhahn), after committing D91250. Note that the `LLVMIntrinsicCopyOverloadedName` is being deprecated in favor of `LLVMIntrinsicCopyOverloadedName2`. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D99173	2021-06-14 14:52:29 +02:00
Florian Hahn	68369fae88	[VPlan] Add additional tests for region merging. Add additional tests suggested in D100260. Also drop the unneeded `indvars.` prefix from induction phi name.	2021-06-14 11:25:06 +01:00
Guillaume Chatelet	9aa4a5f77d	[llvm] remove Sequence::asSmallVector() There's no need for `toSmallVector()` as `SmallVector.h` already provides a `to_vector` free function that takes a range. Reviewed By: Quuxplusone Differential Revision: https://reviews.llvm.org/D104024	2021-06-14 08:28:05 +00:00
Simon Moll	91d4645488	[VP] Binary floating-point intrinsics. This patch implements vector-predicated intrinsics on IR level for fadd, fsub, fmul, fdiv and frem. There operate in the default floating-point environment. We will use constrained fp operand bundles for constrained vector-predicated fp math (D93455). Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D93470	2021-06-14 08:51:41 +02:00
Mindong Chen	65590a4d7a	[LoopVectorize] precommit pr50686.ll for D104148	2021-06-14 13:58:25 +08:00
Xuanda Yang	a66f237758	[LLParser] Remove outdated deplibs The comment mentions deplibs should be removed in 4.0. Removing it in this patch. Reviewed By: compnerd, dexonsmith, lattner Differential Revision: https://reviews.llvm.org/D102763	2021-06-14 12:46:12 +08:00
RamNalamothu	a2306da6e0	Implement DW_CFA_LLVM_* for Heterogeneous Debugging Add support in MC/MIR for writing/parsing, and DebugInfo. This is part of the Extensions for Heterogeneous Debugging defined at https://llvm.org/docs/AMDGPUDwarfExtensionsForHeterogeneousDebugging.html Specifically the CFI instructions implemented here are defined at https://llvm.org/docs/AMDGPUDwarfExtensionsForHeterogeneousDebugging.html#cfa-definition-instructions Reviewed By: clayborg Differential Revision: https://reviews.llvm.org/D76877	2021-06-14 08:51:50 +05:30
Aditya Kumar	d436515539	Calculate getTerminator only when necessary Differential Revision: https://reviews.llvm.org/D104202	2021-06-13 20:16:07 -07:00
Juneyoung Lee	2f184e9475	[Utils] Add missing freeze and poison keyword highlights This patch adds missing keyword highlights for freeze and poison Reviewed By: MaskRay, porglezomp Differential Revision: https://reviews.llvm.org/D104017	2021-06-14 09:21:26 +09:00
Eric Astor	d15098fdc8	[ms] [llvm-ml] When parsing MASM, "jmp short" instructions are case insensitive Handle "short" in a case-insensitive fashion in MASM. Required to correctly parse z_Windows_NT-586_asm.asm from the OpenMP runtime. Reviewed By: thakis Differential Revision: https://reviews.llvm.org/D104195	2021-06-13 18:36:00 -04:00
Eric Astor	679dc9bc3b	[ms] [llvm-ml] Fix capitalization of the ignored CPU directives These directives are matched in lowercase, so make sure to use lowercase for their P suffix. Differential Revision: https://reviews.llvm.org/D104206	2021-06-13 18:34:42 -04:00
Eric Astor	3e051c60b8	Fix misspelled instruction in X86 assembly parser Did not correctly handle "jecxz short <address>". Discovered while working on LLVM-ML; shows up in z_Windows_NT-586_asm.asm from the OpenMP runtime Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D104194	2021-06-13 18:34:15 -04:00
David Green	a615d4a362	[DSE] Extra multiblock loop tests, NFC. Some of these can be DSE'd, some of which cannot. Useful in D100464.	2021-06-13 22:30:42 +01:00
LemonBoy	71d85d4af9	[SPARC] Legalize truncation and extension between fp128 and half Lower truncations and expansions between fp128 and half values into libcalls. Expand truncating stores into two separate truncation and a store operations. Reviewed By: jrtc27 Differential Revision: https://reviews.llvm.org/D104185	2021-06-13 20:05:15 +02:00
Nikita Popov	478596b756	[LoopUnroll] Test multi-exit runtime unrolling with predictable exit (NFC) The (prior to prologue insertion) predictable exit shouldn't get folded here. Make sure it isn't...	2021-06-13 18:48:38 +02:00
Simon Pilgrim	0b64dd4442	RawError.h - remove unused <string> include. NFCI.	2021-06-13 17:32:57 +01:00
Simon Pilgrim	3f3834f7e5	BoundsChecking.cpp - tidy implicit header dependencies. NFCI. We don't use <vector> but we do use std::pair (<utility>)	2021-06-13 17:08:15 +01:00
Simon Pilgrim	836026294d	DIPrinter.h - tidy implicit header dependencies. NFCI. We don't use <string> but we do use std::unique_ptr (<memory>) and llvm::Optional<>	2021-06-13 17:00:15 +01:00
Simon Pilgrim	910cf30f57	DetailedRecordsBackend.cpp - printSectionHeading - avoid std::string creation/copies. Don't create std::string from constant c-strings or pass std::string by value - we can use StringRef instead.	2021-06-13 16:49:40 +01:00
Simon Pilgrim	9ec7689e46	DetailedRecordsBackend.cpp - tidy implicit header dependencies. NFCI. We don't use <algorithm>, <set> or <vector>, but we do use std::pair (<utility>).	2021-06-13 16:27:17 +01:00
Simon Pilgrim	fdadecc8f8	ProfiledCallGraph.h - remove unused <string> include. NFCI.	2021-06-13 15:19:25 +01:00
Simon Pilgrim	20d33f98a5	RegUsageInfoPropagate.cpp - remove unused <string> and <map> includes. NFCI.	2021-06-13 15:19:24 +01:00
Simon Pilgrim	f71e5f90f8	MachOObjectFile.cpp - remove unused <string> include. NFCI.	2021-06-13 15:19:24 +01:00
Simon Pilgrim	cf2264bfb7	DWARFDebugFrame.cpp - remove unused <string> include. NFCI.	2021-06-13 15:19:24 +01:00
Simon Pilgrim	052e3ea653	GVN.cpp - remove unused <vector> include. NFCI.	2021-06-13 14:06:32 +01:00
Simon Pilgrim	8c6f5b0343	LoopUnrollAndJamPass.cpp - remove unused <vector> include. NFCI.	2021-06-13 14:06:32 +01:00

1 2 3 4 5 ...

217177 Commits