This adds IntrWillReturn to the gfx90a mfma intrinsics, to match all the
other mfma intrinsics, and to llvm.amdgcn.live.mask, to match
llvm.amdgcn.ps.live.
Differential Revision: https://reviews.llvm.org/D97675
The new Darwin backend for LLD is now able to link reasonably large
real-world programs on x86_64. For instance, we have achieved
self-hosting for the X86_64 target, where all LLD tests pass when
building lld with itself on macOS. As such, we would like to make it the
default back-end.
The new port is now named `ld64.lld`, and the old port remains
accessible as `ld64.lld.darwinold`.
This [announcement email][1] has some context. (But note that, unlike
what the email says, we are no longer doing this as part of the LLVM 12
branch cut -- instead we will go into LLVM 13.)
Numerous mechanical test changes were required to make this change; in
the interest of creating something that's reviewable on Phabricator,
I've split out the boring changes into a separate diff (D95905). I plan to
merge its contents with those in this diff before landing.
(@gkm made the original draft of this diff, and he has agreed to let me
take over.)
[1]: https://lists.llvm.org/pipermail/llvm-dev/2021-January/147665.html
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D95204
This patch updates the cost of `select i1 a, b, false` to be equivalent to that of `and i1 a, b`,
as well as the cost of `select i1 a, true, b` to that of `or i1 a, b`.
Until now, these selects were folded into and/or i1 by InstCombine, but the transformation is poison-unsafe.
This is a step towards removing the unsafe transformation. D93065 has the relevant transformations linked.
These selects should lower to the same assembly as and/or i1, so their costs should be equivalent.
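As an illustration, a minimal IR sketch of the two forms (hypothetical
function names):

    define i1 @select_form(i1 %a, i1 %b) {
      ; poison-safe: when %a is false, poison in %b does not propagate
      %r = select i1 %a, i1 %b, i1 false
      ret i1 %r
    }

    define i1 @and_form(i1 %a, i1 %b) {
      ; propagates poison from %b even when %a is false
      %r = and i1 %a, %b
      ret i1 %r
    }

The `or` case is analogous, with `select i1 %a, i1 true, i1 %b`.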
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D97360
Update the deletion order when destroying VPBasicBlocks. This ensures
recipes that depend on earlier ones in the block are removed first.
Otherwise this may cause issues when recipes have remaining users later
in the block.
If the reference-types feature is enabled, call_indirect will explicitly
reference its corresponding function table via TABLE_NUMBER
relocations against a table symbol.
As before, address-taken functions can also cause the function
table to be created; with reference-types they additionally cause a
symbol table entry to be emitted.
Differential Revision: https://reviews.llvm.org/D90948
Under -O3 and -Ofast, the MASSV conversion prevents the sqrt call from being inlined.
Inline sqrt is faster than a MASSV call on leppc.
Differential Revision: https://reviews.llvm.org/D97487
The expected use case is for frontends to insert this into
shaders that are to be run under a debugger. The shader can
then be resumed or single stepped from the point of the call
under debugger control.
Differential Revision: https://reviews.llvm.org/D97670
I previously copied the nearly identical function from AArch64 into
AMDGPU; fix that duplication here.
Mips and X86 have their own more exotic versions which should be
removed. However, replacing those is better left for a separate patch
since it requires other changes to avoid regressions.
This was reusing the parent function calling convention instead of the
callee. I'm not sure if there's a case where there's an observable
difference.
I previously missed this in b72a23650f573299aec30846fb844c3558921fb8
Moved some of the `sve-getIntrinsicCost-<..>` tests into a single sve-intrinsics.ll
file, and simplified the tests a bit by bundling all the intrinsics in one
function (instead of testing one intrinsic per function). That makes it easier
to see the cost of the intrinsics.
Given a zero input for a udot, an add can be folded in to take the place
of the input, using the addition that the instruction naturally
performs.
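A minimal sketch of the fold, assuming the AArch64 NEON udot intrinsic
as the example:

    ; before: udot with a zero accumulator, then a separate add
    define <4 x i32> @udot_add(<4 x i32> %acc, <16 x i8> %a, <16 x i8> %b) {
      %d = call <4 x i32> @llvm.aarch64.neon.udot.v4i32.v16i8(<4 x i32> zeroinitializer, <16 x i8> %a, <16 x i8> %b)
      ; after the fold, %acc takes the place of the zero input:
      ;   %r = udot(%acc, %a, %b)
      %r = add <4 x i32> %d, %acc
      ret <4 x i32> %r
    }
    declare <4 x i32> @llvm.aarch64.neon.udot.v4i32.v16i8(<4 x i32>, <16 x i8>, <16 x i8>)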
Differential Revision: https://reviews.llvm.org/D97188
See original comment in 560ce2c70fb1fe8e4b9b5e39c54e494a50373ba8
Basically, the default seed value results in fewer collisions, but changes the
iteration order, which matters for a few test cases.
Differential Revision: https://reviews.llvm.org/D97396
As with EXTRACT_SUBVECTOR, INSERT_SUBVECTOR poses a problem
for vector masks, as RVV isn't able to slide mask types around. We choose
instead to bitcast to equivalently-sized i8 types where we can; otherwise we
zero-extend, perform the operation, and truncate back down.
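A minimal sketch of the zero-extend path at the IR level (using today's
generic vector.insert intrinsic name and hypothetical types; the actual
lowering happens during ISel):

    ; widen both masks to i8 elements, insert, then truncate back
    define <vscale x 4 x i1> @insert_mask(<vscale x 4 x i1> %v, <vscale x 1 x i1> %sv) {
      %v.i8  = zext <vscale x 4 x i1> %v to <vscale x 4 x i8>
      %sv.i8 = zext <vscale x 1 x i1> %sv to <vscale x 1 x i8>
      %ins = call <vscale x 4 x i8> @llvm.vector.insert.nxv4i8.nxv1i8(<vscale x 4 x i8> %v.i8, <vscale x 1 x i8> %sv.i8, i64 0)
      %res = trunc <vscale x 4 x i8> %ins to <vscale x 4 x i1>
      ret <vscale x 4 x i1> %res
    }
    declare <vscale x 4 x i8> @llvm.vector.insert.nxv4i8.nxv1i8(<vscale x 4 x i8>, <vscale x 1 x i8>, i64)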
One test was left disabled due to a crash in the legalizer.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D97559
This patch fixes a bug where the lowering for INSERT_SUBVECTOR and
EXTRACT_SUBVECTOR would insist on first extracting a register-aligned
LMUL1 vector type before performing the slide up/down. It did so even if
the vector was a fractional LMUL type, in which case the aligned
EXTRACT_SUBVECTOR was invalid.
This issue only occurred for scalable vector types, but a variety of
tests for both scalable and fixed-length vectors have been added to
ensure this does not regress in the future.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D97556
This patch unifies the two disparate paths for lowering INSERT_SUBVECTOR
operations under one roof. Consequently, with this patch it is possible to
support any fixed-length subvector insertion, not just "cast-like" ones.
As before, support for the insertion of mask vectors will come in a
separate patch.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D97543
This patch adds support for extracting subvectors from vector masks.
This can be either extracting a scalable vector from another, or a fixed-length
vector from a fixed-length or scalable vector.
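For example, both of these extracts are now supported (a sketch using
today's generic vector.extract intrinsic name):

    ; a scalable mask extracted from a scalable mask
    define <vscale x 2 x i1> @extract_scalable(<vscale x 8 x i1> %m) {
      %r = call <vscale x 2 x i1> @llvm.vector.extract.nxv2i1.nxv8i1(<vscale x 8 x i1> %m, i64 0)
      ret <vscale x 2 x i1> %r
    }
    ; a fixed-length mask extracted from a scalable mask
    define <4 x i1> @extract_fixed(<vscale x 8 x i1> %m) {
      %r = call <4 x i1> @llvm.vector.extract.v4i1.nxv8i1(<vscale x 8 x i1> %m, i64 0)
      ret <4 x i1> %r
    }
    declare <vscale x 2 x i1> @llvm.vector.extract.nxv2i1.nxv8i1(<vscale x 8 x i1>, i64)
    declare <4 x i1> @llvm.vector.extract.v4i1.nxv8i1(<vscale x 8 x i1>, i64)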
Since RVV lacks a way to slide vector masks down on an element-wise
basis and we don't know the true length of the vector registers, in many
cases we must resort to using equivalently-sized i8 vectors to perform
the operation. When this is not possible, we fall back to extending to a
suitable i8 vector.
Support was also added for fixed-length truncation to mask types.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D97475
This patch updates LV to generate the runtime checks just after cost
modeling, to allow a more precise estimate of the actual cost of the
checks. This information will be used in future patches to generate
larger runtime checks in cases where the checks only make up a small
fraction of the expected scalar loop execution time.
The runtime checks are created up-front in a temporary block, un-linked
from the existing IR, to allow better estimation of their cost. After deciding to
vectorize, the checks are moved back. If we decide not to vectorize, the
temporary block is removed completely.
This patch is similar in spirit to D71053, but explores a different
direction: instead of delaying the decision on whether to vectorize in
the presence of runtime checks, it optimistically creates the
runtime checks early and discards them later if we decide not to
vectorize. This has the advantage that the cost-modeling decisions
can be kept together and done up-front, thus preserving the
general code structure. I think delaying (part of) the decision to
vectorize would also make the VPlan migration a bit harder.
One potential drawback of this patch is that we speculatively
generate IR which we might have to clean up later. However, it seems
that the code required to do so is quite manageable.
Reviewed By: lebedev.ri, ebrevnov
Differential Revision: https://reviews.llvm.org/D75980
This patch addresses issues arising from the fact that the index type
used for subvector insertion/extraction is inconsistent between the
intrinsics and SDNodes. The intrinsic forms require i64 whereas the
SDNodes use the type returned by SelectionDAG::getVectorIdxTy.
Rather than update the intrinsic definitions to use an overloaded index
type, this patch fixes the issue by transforming the index to the
correct type as required. Any loss of index bits going from i64 to a
smaller type is unexpected, and will be caught by an assertion in
SelectionDAG::getVectorIdxConstant.
The patch also updates the documentation for INSERT_SUBVECTOR and adds
an assertion to its creation to bring it in line with EXTRACT_SUBVECTOR.
This necessitated changes to AArch64, which was using i64 for
EXTRACT_SUBVECTOR but i32 for INSERT_SUBVECTOR. Only one test changed
its codegen after updating the backend accordingly.
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D97459
Currently, dead gc values mentioned in the deopt section are not listed in the gc section
and so are processed separately.
With this CL, all deopt gc values are considered base pointers and processed in the
same way as other gc values.
The fact that a deopt gc pointer is a base pointer was relied on all along, but
it is now explicitly documented by putting the value in SI.Base.
The idea of the patch comes from Philip Reames.
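A minimal sketch, assuming the statepoint-example GC strategy and a
hypothetical callee: a gc pointer that appears only in the deopt bundle
is its own base and is relocated like any other gc value.

    define void @caller(ptr addrspace(1) %obj) gc "statepoint-example" {
      ; %obj is live only in the deopt state; it is recorded as a base
      ; pointer (SI.Base) and relocated across the call
      call void @callee() [ "deopt"(ptr addrspace(1) %obj) ]
      ret void
    }
    declare void @callee()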
Reviewers: reames, dantrushin
Reviewed By: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D97554
If the deopt operand has an illegal type and we want to use a
register for it, then it needs to be legalized.
This is not currently supported by the legalizer, and it is not actually clear how
to legalize this kind of value.
Instead we just spill such values and use the spill slot location in the statepoint.
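A minimal sketch (hypothetical i129 value, which is illegal on typical
targets, and a hypothetical callee):

    define void @caller(i129 %v) gc "statepoint-example" {
      ; %v has no legal register class, so it is spilled and the
      ; statepoint records the spill slot rather than a register
      call void @callee() [ "deopt"(i129 %v) ]
      ret void
    }
    declare void @callee()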
Originally tests were created by Philip Reames.
Reviewers: reames, dantrushin
Reviewed By: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D97541
Peeking through the AND is only valid if the input to both shifts is
the same. If the inputs are different, then the original pattern
ORs the two values when the masked shift amount is 0. This is OK
if the values are the same, since the OR would be a no-op, which is
why it's OK for rotate.
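A sketch of the rotate pattern in IR: with a single input, the
shift-amount-zero case yields x | x == x, but with two distinct inputs
it would yield x | y instead.

    define i32 @rot(i32 %x, i32 %amt) {
      %lo.amt = and i32 %amt, 31
      %neg    = sub i32 0, %amt
      %hi.amt = and i32 %neg, 31
      %lo = shl i32 %x, %lo.amt
      %hi = lshr i32 %x, %hi.amt
      ; when %amt == 0 this is %x | %x, i.e. just %x, so peeking
      ; through the ANDs is safe here
      %r = or i32 %lo, %hi
      ret i32 %r
    }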
Fixes PR49365 and reverts PR34641
Differential Revision: https://reviews.llvm.org/D97637
Even if the first computeKnownBits call doesn't find any zero
bits, it is possible the other operand has bitwidth-1 leading zeros.
In that case overflow is still impossible, so always call computeKnownBits
for both operands.
SelectionDAG forces us to have a weird ABI for 16-bit values without
legal 16-bit operations, but currently GlobalISel bypasses this and
sometimes ends up using the gfx8+ ABI in some contexts. Make sure
we're testing the normal ABI to avoid a test change in a future patch.
If we insert undef using a VMOVN, we can just use the original value in
three out of the four possible combinations. Using VMOVT into an undef
vector will still require the lanes to be moved, but otherwise the
non-undef value can be used.
Add a cutoff flag to control how many symbols are included
in the profile symbol list.
When a test is unrepresentative of production behavior, a sample profile
collected from production can cause unexpected performance behavior
in the test. To triage such issues, it is useful to have a cutoff flag
to control how many symbols will be included in the profile symbol list,
in order to do a binary search.
Differential Revision: https://reviews.llvm.org/D97623
Use emitDwarfUnitLength for debug line, so we can benefit from
overriding of emitDwarfUnitLength inside different streamers.
Reviewed By: ikudrin, dblaikie
Differential Revision: https://reviews.llvm.org/D95998
Simon modified the check prefixes in these tests while D97160
was pending review. When D97160 was committed, it was updated to
merge cleanly, but the update didn't account for the check prefix changes.
Document the default for the XNACK and SRAMECC target features for code object V2-V3 and V4.
Reviewed By: kzhuravl
Differential Revision: https://reviews.llvm.org/D97598