llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 11:02:59 +02:00

Author	SHA1	Message	Date
Wenlei He	1b193b8bb3	[CSSPGO][llvm-profgen] Context-sensitive global pre-inliner This change sets up a framework in llvm-profgen to estimate inline decision and adjust context-sensitive profile based on that. We call it a global pre-inliner in llvm-profgen. It will serve two purposes: 1) Since context profile for not inlined context will be merged into base profile, if we estimate a context will not be inlined, we can merge the context profile in the output to save profile size. 2) For thinLTO, when a context involving functions from different modules is not inlined, we can't merge functions profiles across modules, leading to suboptimal post-inline count quality. By estimating some inline decisions, we would be able to adjust/merge context profiles beforehand as a mitigation. Compiler inline heuristic uses inline cost which is not available in llvm-profgen. But since inline cost is closely related to size, we could get an estimate through function size from debug info. Because the size we have in llvm-profgen is the final size, it could also be more accurate than the inline cost estimation in the compiler. This change only has the framework, with a few TODOs left for follow up patches for a complete implementation: 1) We need to retrieve size for function/inlinee from debug info for inlining estimation. Currently we use number of samples in a profile as placeholder for size estimation. 2) Currently the thresholds are using the values used by sample loader inliner. But they need to be tuned since the size here is fully optimized machine code size, instead of inline cost based on not yet fully optimized IR. Differential Revision: https://reviews.llvm.org/D99146	2021-03-29 09:46:14 -07:00
Wei Mi	d6d534e85a	[SampleFDO] Do not scale the magic number NOMORE_ICP_MAGICNUM in value profile during profile update. When we inline a function and update the profile, the value profiles of the indirect call in the inliner and inlinee will be scaled. In https://reviews.llvm.org/D96806 and https://reviews.llvm.org/D97350, we start using the magic number NOMORE_ICP_MAGICNUM (-1) to mark targets which have been promoted. The magic number shouldn't be scaled during the profile update. Although the problem has been suppressed by https://reviews.llvm.org/D98187 for SampleFDO, which stops profile update for inlining in sampleFDO, the patch is still wanted since it will be more consistent to handle the magic number properly in profile update. Differential Revision: https://reviews.llvm.org/D99394	2021-03-29 09:34:37 -07:00
Florian Hahn	c15bc589d4	Recommit "[LV] Move runtime pointer size check to LVP::plan()." Re-apply 25fbe803d4db, with a small update to emit the right remark class. Original message: [LV] Move runtime pointer size check to LVP::plan(). This removes the need for the remaining doesNotMeet check and instead directly checks if there are too many runtime checks for vectorization in the planner. A subsequent patch will adjust the logic used to decide whether to vectorize with runtime to consider their cost more accurately. Reviewed By: lebedev.ri	2021-03-29 16:14:27 +01:00
Bradley Smith	4cc2f2b476	[SelectionDAG][AArch64][SVE] Perform SETCC condition legalization in LegalizeVectorOps This is currently performed in SelectionDAGLegalize, here we make it also happen in LegalizeVectorOps, allowing a target to lower the SETCC condition codes first in LegalizeVectorOps and then lower to a custom node afterwards, without having to duplicate all of the SETCC condition legalization in the target specific lowering. As a result of this, fixed length floating point SETCC nodes can now be properly lowered for SVE. Differential Revision: https://reviews.llvm.org/D98939	2021-03-29 15:32:25 +01:00
Florian Hahn	c376195fed	Revert "[LV] Move runtime pointer size check to LVP::plan()." This reverts commit 25fbe803d4dbcf8ff3a3a9ca161f5b9a68353ed0. This breaks a clang test which filters for the wrong remark type.	2021-03-29 14:41:53 +01:00
Sanjay Patel	70f97370e8	[SLP] allow matching integer min/max intrinsics as reduction ops This is a 2nd try of: 3c8473ba534 which was reverted at: a26312f9d4f because of crashing. This version includes extra code and tests to avoid the known crashing examples as discussed in PR49730. Original commit message: As noted in D98152, we need to patch SLP to avoid regressions when we start canonicalizing to integer min/max intrinsics. Most of the real work to make this possible was in: 7202f47508 Differential Revision: https://reviews.llvm.org/D98981	2021-03-29 09:38:18 -04:00
Paul C. Anagnostopoulos	128e39dc70	[TableGen] Add support for the 'assert' statement in class definitions. Differential Revision: https://reviews.llvm.org/D99275	2021-03-29 09:20:29 -04:00
Florian Hahn	53ccebfadc	[LV] Move runtime pointer size check to LVP::plan(). This removes the need for the remaining doesNotMeet check and instead directly checks if there are too many runtime checks for vectorization in the planner. A subsequent patch will adjust the logic used to decide whether to vectorize with runtime to consider their cost more accurately. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D98634	2021-03-29 14:12:29 +01:00
Jingu Kang	a2f441345f	[SimpleLoopUnswitch] Fix wrong assertions in partial-unswitch.ll	2021-03-29 14:04:29 +01:00
Matt Arsenault	efe6c99b52	Reapply "OpaquePtr: Turn inalloca into a type attribute" This reverts commit 07e46367baeca96d84b03fa215b41775f69d5989.	2021-03-29 08:55:30 -04:00
Jingu Kang	8d90020fab	[LoopUnswitch] Use reference variables instead of pointer one Differential Revision: https://reviews.llvm.org/D99496	2021-03-29 13:08:46 +01:00
Jingu Kang	4483b3ebb3	[SimpleLoopUnswitch] Add tests to check partially invariant unswitch Differential Revision: https://reviews.llvm.org/D99493	2021-03-29 13:06:32 +01:00
Hans Wennborg	55eac64914	Don't use $ as suffix for symbol names in ThinLTOBitcodeWriter and other places Using $ breaks demangling of the symbols. For example, $ c++filt _Z3foov\$123 _Z3foov$123 This causes problems for developers who would like to see nice stack traces etc., but also for automatic crash tracking systems which try to organize crashes based on the stack traces. Instead, use the period as suffix separator, since Itanium demanglers normally ignore such suffixes: $ c++filt _Z3foov.123 foo() [clone .123] This is already done in some places; try to do it everywhere. Differential revision: https://reviews.llvm.org/D97484	2021-03-29 13:03:52 +02:00
Oliver Stannard	0914bea32c	Revert "Reapply "OpaquePtr: Turn inalloca into a type attribute"" Reverting because test 'Bindings/Go/go.test' is failing on most buildbots. This reverts commit fc9df309917e57de704f3ce4372138a8d4a23d7a.	2021-03-29 11:32:22 +01:00
Simon Pilgrim	f3d685c6a2	[X86][F16C] Add F16C -O0 test coverage Ensure the duplicate conversions noticed in D48614 have gone	2021-03-29 11:31:20 +01:00
Simon Pilgrim	da2d2f6455	[X86] Regenerate tests to add missing @PLT	2021-03-29 11:31:19 +01:00
Simon Pilgrim	b49e0cd2a5	[X86][SSE] combineHorizOpWithShuffle - consistently use getTargetShuffleInputs to decode shuffles Minor cleanup before I start trying to merge the unary/binary shuffle combining paths.	2021-03-29 11:31:19 +01:00
Nashe Mncube	762a55526d	[SVE][Analysis]Instruction costs for ops on scalable-vec The following operations have no associated cost for them when applied to scalable vectors, and as a consequence can trigger a crash when a call is made to AArch64TTIImpl::getCastInstrCost(): - fptrunc - trunc - fpext - fpto(u,s)i This patch adds costs for these operations and relevant regression tests. Differential Revision: https://reviews.llvm.org/D98934	2021-03-29 11:15:50 +01:00
Stefan Gränitz	cc45a8982f	[Orc][tests] Moving one MCJIT test over to Orc to make sure the PowerPC fix worked The PowerPC fix landed in d9069dd9b576. This is in preparation for D98931.	2021-03-29 11:53:03 +02:00
Jingu Kang	9d66510fb1	[NFC][LoopUnswitch] Move hasPartialIVCondition to LoopUtils Differential revision: https://reviews.llvm.org/D99490	2021-03-29 10:29:45 +01:00
Petar Avramovic	4315e26388	[AMDGPU] Extend gfx10 test coverage. NFC. Differential Revision: https://reviews.llvm.org/D99267	2021-03-29 11:13:55 +02:00
David Green	895ea4b8bd	[ARM] Extend MVE lane interleaving to handle other non-instruction leaves This extends the recent MVE lane interleaving pass to handle other non-instruction leaves, for which a new shuffle is added. This helps especially for constants and potentially for arguments. Differential Revision: https://reviews.llvm.org/D97289	2021-03-29 09:05:45 +01:00
Lang Hames	859b69b17f	[ORC][C-bindings] Fix some ORC C bindings function names and signatures. LLVMOrcDisposeObjectLayer and LLVMOrcExecutionSessionGetJITDylibByName did not have matching signatures between the C-API header and binding implementations. Fixes http://llvm.org/PR49745. Patch by Mats Larsen. Thanks Mats! Reviewed by: lhames Differential Revision: https://reviews.llvm.org/D99478	2021-03-28 16:30:47 -07:00
Craig Topper	7c7e47bb5a	[RISCV] Add a RV64 mulhsu test case. NFC	2021-03-28 15:54:44 -07:00
David Green	44947e83f7	[ARM] Fix the Changed value in the MVE lane interleaving pass.	2021-03-28 23:47:53 +01:00
Nikita Popov	ab6e561d30	[BasicAA] Make sure types match in constant offset heuristic This can only happen if offset types that are larger than the pointer size are involved. The previous implementation did not assert in this case because it initialized the APInts to the width of one of the variables -- though I strongly suspect it did not compute correct results in this case. Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=32621 reported by fhahn.	2021-03-28 21:38:09 +02:00
Craig Topper	99c3c72646	[X86] Add phase ordering test for the problem D99427 is trying to solve. NFC	2021-03-28 12:14:30 -07:00
Craig Topper	de0e08c72b	[X86] Optimize vXi8 MULHS on targets where we can't sign_extend to the next register size. For these cases we need to extract the upper or lower elements, multiply them using 16-bit multiplies and repack them. Previously we used punpcklbw/punpckhbw+psraw or pmovsxbw+pshufd to extract and sign extend so we could use pmullw to compute the 16-bit product and then shift down the high bits. We can avoid the need to sign extend if we unpack the bytes into the high byte of each word and fill the lower byte with 0 using pxor. This puts the sign bit of each byte into the sign bit of each word. Since the LHS and RHS have 8 trailing zeros, the full 32-bit product of those 16-bit values will have 16 trailing zeros. This means the 16-bit product of the original bytes is in the upper 16 bits which we can calculate using pmulhw. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D98587	2021-03-28 11:41:29 -07:00
Craig Topper	6e75027132	[X86][update_llc_test_checks] Use a less greedy regular expression for replacing constant pool labels in tests. While working on D97208 I noticed that these greedy regular expressions prevent tests from failing when (%rip) appears after a constant pool label when it didn't before. Reviewed By: RKSimon, pengfei Differential Revision: https://reviews.llvm.org/D99460	2021-03-28 11:39:46 -07:00
LLVM GN Syncbot	4a9e2678d2	[gn build] Port 7b6f760fcd19	2021-03-28 18:35:33 +00:00
David Green	d2d877079e	[ARM] MVE vector lane interleaving MVE does not have a single sext/zext or trunc instruction that takes the bottom half of a vector and extends to a full width, like NEON has with MOVL. Instead it is expected that this happens through top/bottom instructions. So the MVE equivalent VMOVLT/B instructions take either the even or odd elements of the input and extend them to the larger type, producing a vector with half the number of elements each of double the bitwidth. As there is no simple instruction for a normal extend, we often have to expand sext/zext/trunc into a series of lane moves (or stack loads/stores, which we do not do yet). This pass takes vector code that starts at truncs, looks for interconnected blobs of operations that end with sext/zext and transforms them by adding shuffles so that the lanes are interleaved and the MVE VMOVL/VMOVN instructions can be used. This is done pre-ISel so that it can work across basic blocks. This initial version of the pass just handles a limited set of instructions, not handling constants or splats or FP, which can all come as extensions to this base. Differential Revision: https://reviews.llvm.org/D95804	2021-03-28 19:34:58 +01:00
Craig Topper	39af919139	[RISCV] Add test case for mulhsu. We don't yet use mulhsu, but we should.	2021-03-28 11:03:39 -07:00
Matt Arsenault	403cadc380	Reapply "OpaquePtr: Turn inalloca into a type attribute" This reverts commit 20d5c42e0ef5d252b434bcb610b04f1cb79fe771.	2021-03-28 13:35:21 -04:00
Sanjay Patel	2bdb86c857	[InstCombine] sink min/max intrinsics with common op after select This is another step towards parity with cmp+select min/max idioms. See D98152.	2021-03-28 13:13:04 -04:00
Sanjay Patel	fbffae5729	[InstCombine] add tests for select of min/max intrinsics; NFC	2021-03-28 13:13:04 -04:00
Nico Weber	755e1b95c9	Revert "OpaquePtr: Turn inalloca into a type attribute" This reverts commit 4fefed65637ec46c8c2edad6b07b5569ac61e9e5. Broke check-clang everywhere.	2021-03-28 13:02:52 -04:00
Zakk Chen	56db174a0b	[RISCV][Clang] Update new overloading rules for RVV intrinsics. RVV intrinsics have a new overloading rule; please see `82aac7dad4` Changed: 1. Rename `generic` to `overloaded` because the new rule is not using C11 generic. 2. Change HasGeneric to HasNoMaskedOverloaded because all masked operations support the overloading API. 3. Add more overloaded tests because the overloading rule changed. Differential Revision: https://reviews.llvm.org/D99189	2021-03-28 09:04:35 -07:00
Stefan Gränitz	cd6dd68cfd	[Orc][examples] Add missing dependency to OrcShared in LLJITWithRemoteDebugging	2021-03-28 17:48:28 +02:00
Stefan Gränitz	8130b8a5a1	[Orc][examples] Add LLJITWithRemoteDebugging example	2021-03-28 17:25:09 +02:00
Matt Arsenault	600ba40bd7	AArch64/GlobalISel: Remove IR section from test	2021-03-28 11:12:59 -04:00
Matt Arsenault	9b63996812	OpaquePtr: Turn inalloca into a type attribute I think byval/sret and the others are close to being able to rip out the code to support the missing type case. A lot of this code is shared with inalloca, so catch this up to the others so that can happen.	2021-03-28 11:12:23 -04:00
Florian Hahn	f051f52528	[LV] Mark a few more cost-model members as const (NFC).	2021-03-28 14:59:48 +01:00
Nikita Popov	bdb84921a1	[BasicAA] Handle gep with unknown sizes earlier (NFCI) If the sizes of both memory locations are unknown, we can only perform a check on the underlying objects. There's no point in going through GEP decomposition in this case.	2021-03-28 15:48:49 +02:00
Florian Hahn	3a9af0ee66	[SelDag] Add isIntOrFPConstant helper function. This patch adds a new isIntOrFPConstant helper function to check if a SDValue is an integer or FP constant. This pattern is used in various places. There also are places that incorrectly just check for integer constants, e.g. D99384, so hopefully this helper will help people avoid that issue. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D99428	2021-03-28 12:48:58 +01:00
Hsiangkai Wang	3039a6f04c	[RISCV] Add vfabs.v pseudo instruction. Differential Revision: https://reviews.llvm.org/D99454	2021-03-28 10:24:05 +08:00
Vaivaswatha Nagaraj	60a545b91a	[OCaml][Test] Fix and enable debuginfo.ml test `get_or_create_type_array` was used on a non-type MDNode. Add interface for `get_or_create_array` and use that instead. Differential Revision: https://reviews.llvm.org/D99450	2021-03-28 06:25:39 +05:30
Craig Topper	519ec4e15e	[X86] Regenerate a bunch of tests to pick up @PLT I'm prepping another patch to the same tests and this just adds noise to my diff.	2021-03-27 16:41:35 -07:00
Craig Topper	9c604c891f	[RISCV] Add a pattern for (sext_inreg (mul (and X, 0xffffffff), (and Y, 0xffffffff)), i32) to suppress MULW formation We have a special pattern for (mul (and X, 0xffffffff), (and Y, 0xffffffff)), to optimize the ANDs to shift. But if a sext_inreg comes first, we'll form a MULW and limit the effectiveness of the special match. So this patch adds a larger pattern to suppress the MULW formation by emitting a sext.w and then the same output we use for the (mul (and X, 0xffffffff), (and Y, 0xffffffff)). This should all get CSEd. This is the issue I was trying to fix with D99029, but that affected many more tests.	2021-03-27 15:37:18 -07:00
Nikita Popov	f2b0645b2f	[BasicAA] Refactor linear expression decomposition The current linear expression decomposition handles zext/sext by decomposing the casted operand, and then checking NUW/NSW flags to determine whether the extension can be distributed. This has some disadvantages: First, it is not possible to perform a partial decomposition. If we have zext((x + C1) +<nuw> C2) then we will fail to decompose the expression entirely, even though it would be safe and profitable to decompose it to zext(x + C1) +<nuw> zext(C2) Second, we may end up performing unnecessary decompositions, which will later be discarded because they lack nowrap flags necessary for extensions. Third, correctness of the code is not entirely obvious: At a high level, we encounter zext(x -<nuw> C) in the form of a zext on the linear expression x + (-C) with nuw flag set. Notably, this case must be treated as zext(x) + -zext(C) rather than zext(x) + zext(-C). The code handles this correctly by speculatively zexting constants to the final bitwidth, and performing additional fixup if the actual extension turns out to be an sext. This was not immediately obvious to me. This patch inverts the approach: An ExtendedValue represents a zext(sext(V)), and linear expression decomposition will try to decompose V further, either by absorbing another sext/zext into the ExtendedValue, or by distributing zext(sext(x op C)) over a binary operator with appropriate nsw/nuw flags. At each step we can determine whether distribution is legal and abort with a partial decomposition if not. We also know which extensions we need to apply to constants, and don't need to speculate or fixup.	2021-03-27 23:31:58 +01:00
Florian Hahn	4ba4438b35	[LV] Fix formatting from 2f9d68c3f12a.	2021-03-27 21:29:56 +00:00
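
Note on commit de0e08c72b ([X86] Optimize vXi8 MULHS): the arithmetic claim in that message can be checked with a small scalar model. The sketch below is not the LLVM lowering; it is a Python illustration of a single lane, and the helper names (`to_signed`, `pmulhw_lane`, `mulhs_i8_reference`, `mulhs_i8_via_high_byte`) are hypothetical names chosen for this example. It assumes that pmulhw returns the high 16 bits of the signed 32-bit product of two 16-bit lanes, and that the final step is an arithmetic shift right by 8 to extract the high byte of the 16-bit product.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of `value` as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def pmulhw_lane(x, y):
    """Model one lane of pmulhw: high 16 bits of the signed 16x16 product."""
    product = to_signed(x, 16) * to_signed(y, 16)
    return to_signed(product >> 16, 16)


def mulhs_i8_reference(a, b):
    """Reference MULHS for i8: high byte of the signed 16-bit product a*b."""
    return to_signed((a * b) >> 8, 8)


def mulhs_i8_via_high_byte(a, b):
    """Place each byte in the high byte of a word (low byte zero), then pmulhw.

    No sign extension is needed: the byte's sign bit becomes the word's
    sign bit, and the 16 trailing zeros of the 32-bit product leave a*b
    in the upper 16 bits.
    """
    word_a = (a & 0xFF) << 8
    word_b = (b & 0xFF) << 8
    product16 = pmulhw_lane(word_a, word_b)  # equals a*b as a 16-bit value
    return to_signed(product16 >> 8, 8)      # shift down to the high byte


# Exhaustive check over all signed byte pairs.
all_match = all(
    mulhs_i8_via_high_byte(a, b) == mulhs_i8_reference(a, b)
    for a in range(-128, 128)
    for b in range(-128, 128)
)
```

Because the input space is only 65536 pairs, an exhaustive comparison is cheap and avoids relying on hand-picked corner cases such as -128 * -128.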

213357 Commits