llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-25 04:02:41 +01:00

Author	SHA1	Message	Date
Matt Arsenault	3e86dcbeed	GlobalISel: Verify G_CONCAT_VECTORS has at least 2 sources	2021-03-01 09:10:36 -05:00
Matt Arsenault	fede57d2a9	GlobalISel: Move splitToValueTypes to generic code I copied the nearly identical function from AArch64 into AMDGPU, so fix this duplication. Mips and X86 have their own more exotic versions which should be removed. However replacing those is better left for a separate patch since it requires other changes to avoid regressions.	2021-03-01 08:58:18 -05:00
Matt Arsenault	f75441bd01	AArch64/GlobalISel: Fix using wrong calling convention for calls This was reusing the parent function calling convention instead of the callee. I'm not sure if there's a case where there's an observable difference. I previously missed this in b72a23650f573299aec30846fb844c3558921fb8	2021-03-01 08:46:33 -05:00
Sander de Smalen	c44f10ad3d	[AArch64] NFC: Cleanup some SVE cost-model tests. Moved some of the `sve-getIntrinsicCost-<..>` into a single sve-intrinsics.ll file, and simplified the tests a bit by bundling all the intrinsics in one function (instead of testing one intrinsic per function). That makes it easier to see the cost of the intrinsics.	2021-03-01 13:26:31 +00:00
serge-sans-paille	66ad213da3	Revert "Use the default seed value for djb hash for StringMap" This reverts commit d84440ec919019ac446241db72cfd905c6ac9dfa. It breaks (at least) lldb and lld validation https://lab.llvm.org/buildbot/#/builders/68/builds/7837 https://lab.llvm.org/buildbot/#/builders/36/builds/5495	2021-03-01 14:00:39 +01:00
David Green	042f6e8e77	[AArch64] Add combine for add(udot(0, x, y), z) -> udot(z, x, y). Given a zero input for a udot, an add can be folded in to take the place of the input, using the addition that the instruction naturally performs. Differential Revision: https://reviews.llvm.org/D97188	2021-03-01 12:53:34 +00:00
David Green	2e8c4023c8	[AArch64] Adjust dot product tests. NFC This regenerates and splits out the dot product tests, adding a few extra tests for upcoming changes.	2021-03-01 12:46:43 +00:00
serge-sans-paille	8e4de8e5a6	Use the default seed value for djb hash for StringMap See original comment in 560ce2c70fb1fe8e4b9b5e39c54e494a50373ba8 Basically the default seed value results in fewer collisions, but changes the iteration order, which matters for a few test cases. Differential Revision: https://reviews.llvm.org/D97396	2021-03-01 13:21:27 +01:00
Fraser Cormack	bc12858624	[RISCV] Support INSERT_SUBVECTOR on vector masks Like with EXTRACT_SUBVECTOR, INSERT_SUBVECTOR poses a problem for vector masks as RVV isn't able to slide mask types around. We choose instead to bitcast to equivalently-sized i8 types where we can, else we zero-extend, perform the operation, and truncate back down. One test was left disabled due to a crash in the legalizer. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97559	2021-03-01 12:04:11 +00:00
Fraser Cormack	21e7d609a7	[RISCV] Fix INSERT/EXTRACT_SUBVECTOR on fractional LMUL types This patch fixes a bug where the lowering for INSERT_SUBVECTOR and EXTRACT_SUBVECTOR would insist on first extracting a register-aligned LMUL1 vector type before performing the slide up/down. This was even if the vector was a fractional LMUL type, in which case the aligned EXTRACT_SUBVECTOR was invalid. This issue only occurred for scalable vector types, but a variety of tests for both scalable and fixed-length vectors have been added to ensure this does not regress in the future. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97556	2021-03-01 11:51:05 +00:00
Fraser Cormack	b214f42157	[RISCV] Unify scalable- and fixed-vector INSERT_SUBVECTOR lowering This patch unifies the two disparate paths for lowering INSERT_SUBVECTOR operations under one roof. Consequently, with this patch it is possible to support any fixed-length subvector insertion, not just "cast-like" ones. As before, support for the insertion of mask vectors will come in a separate patch. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97543	2021-03-01 11:38:47 +00:00
Fraser Cormack	5b136b7998	[RISCV] Support EXTRACT_SUBVECTOR on vector masks This patch adds support for extracting subvectors from vector masks. This can be either extracting a scalable vector from another, or a fixed-length vector from a fixed-length or scalable vector. Since RVV lacks a way to slide vector masks down on an element-wise basis and we don't know the true length of the vector registers, in many cases we must resort to using equivalently-sized i8 vectors to perform the operation. When this is not possible we fall back and extend to a suitable i8 vector. Support was also added for fixed-length truncation to mask types. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97475	2021-03-01 11:20:09 +00:00
Florian Hahn	5ace0d2963	[LV] Generate RT checks up-front and remove them if required. This patch updates LV to generate the runtime checks just after cost modeling, to allow a more precise estimate of the actual cost of the checks. This information will be used in future patches to generate larger runtime checks in cases where the checks only make up a small fraction of the expected scalar loop execution time. The runtime checks are created up-front in a temporary block to allow better estimating the cost and un-linked from the existing IR. After deciding to vectorize, the checks are moved back. If deciding not to vectorize, the temporary block is completely removed. This patch is similar in spirit to D71053, but explores a different direction: instead of delaying the decision on whether to vectorize in the presence of runtime checks it instead optimistically creates the runtime checks early and discards them later if decided to not vectorize. This has the advantage that the cost-modeling decisions can be kept together and can be done up-front and thus preserving the general code structure. I think delaying (part) of the decision to vectorize would also make the VPlan migration a bit harder. One potential drawback of this patch is that we speculatively generate IR which we might have to clean up later. However it seems like the code required to do so is quite manageable. Reviewed By: lebedev.ri, ebrevnov Differential Revision: https://reviews.llvm.org/D75980	2021-03-01 10:48:04 +00:00
Simon Pilgrim	fc7ed7f16c	[DAG] visitVECTOR_SHUFFLE - attempt to match commuted shuffles with MergeInnerShuffle. Try to match "shuffle(C, shuffle(A, B, M0), M1) -> shuffle(A, B, M2)" etc. by using MergeInnerShuffle's commuted inner shuffle mode.	2021-03-01 10:42:11 +00:00
Fraser Cormack	dd84fcbdc5	[CodeGen] Fix issues with subvector intrinsic index types This patch addresses issues arising from the fact that the index type used for subvector insertion/extraction is inconsistent between the intrinsics and SDNodes. The intrinsic forms require i64 whereas the SDNodes use the type returned by SelectionDAG::getVectorIdxTy. Rather than update the intrinsic definitions to use an overloaded index type, this patch fixes the issue by transforming the index to the correct type as required. Any loss of index bits going from i64 to a smaller type is unexpected, and will be caught by an assertion in SelectionDAG::getVectorIdxConstant. The patch also updates the documentation for INSERT_SUBVECTOR and adds an assertion to its creation to bring it in line with EXTRACT_SUBVECTOR. This necessitated changes to AArch64 which was using i64 for EXTRACT_SUBVECTOR but i32 for INSERT_SUBVECTOR. Only one test changed its codegen after updating the backend accordingly. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D97459	2021-03-01 10:28:21 +00:00
Serguei Katkov	1ff80409ac	[Statepoint Lowering] Consider dead deopt gc values together with other gc values Currently dead gc values mentioned in the deopt section are not listed in the gc section and so are processed separately. With this CL all deopt gc values are considered as base pointers and processed in the same way as other gc values. The fact that a deopt gc pointer is a base pointer was used all the time but it is explicitly documented here by putting the value in SI.Base. The idea of the patch comes from Philip Reames. Reviewers: reames, dantrushin Reviewed By: reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D97554	2021-03-01 17:23:02 +07:00
Simon Pilgrim	33efbc8acf	[DAG] visitVECTOR_SHUFFLE - move shuffle canonicalization/merges all under the same legality test. NFCI. Minor cleanup to move related combines closer together to make it more coherent, without changing the ordering.	2021-03-01 09:42:00 +00:00
Max Kazantsev	13859b219a	[NFC] Detect IV increment expressed as uadd_with_overflow and usub_with_overflow Current callers do not call it with such argument, so this is NFC. But for further changes, it can be very useful to detect such cases.	2021-03-01 13:24:01 +07:00
Max Kazantsev	e3a0d79f40	[NFC] Introduce function getIVStep for further reuse	2021-03-01 13:04:56 +07:00
Max Kazantsev	3d0fa66f52	[NFC] Whitespace fix	2021-03-01 12:14:03 +07:00
Max Kazantsev	adc4cff078	[NFC] Factor out IV detector function for further reuse	2021-03-01 12:11:54 +07:00
Juneyoung Lee	e428efde6c	[SimplifyCFG] Update FoldTwoEntryPHINode to handle and/or of select and binop equally This is a minor change that fixes FoldTwoEntryPHINode to handle phis with and/ors of select form and binop form equally.	2021-03-01 13:34:51 +09:00
Serguei Katkov	2c429002e4	[Statepoint lowering] Require spill of deopt value in case its type is not legal If the type of the deopt operand has an illegal type and we want to use register for it then it needs to be legalized. This is not supported currently by legalizer and it is not actually clear how to legalize this type of values. Instead we just spill such values and use spill slot location in statepoint. Originally tests were created by Philip Reames. Reviewers: reames, dantrushin Reviewed By: reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D97541	2021-03-01 10:23:53 +07:00
Craig Topper	ce5d619908	[DAGCombiner][X86] Don't peek through ANDs on the shift amount in matchRotateSub when called from MatchFunnelPosNeg. Peeking through AND is only valid if the input to both shifts is the same. If the inputs are different, then the original pattern ORs the two values when the masked shift amount is 0. This is ok if the values are the same since the OR would be a NOP which is why it's ok for rotate. Fixes PR49365 and reverts PR34641 Differential Revision: https://reviews.llvm.org/D97637	2021-02-28 12:58:00 -08:00
Kazu Hirata	a3564b4137	[IR] Use range-based for loops (NFC)	2021-02-28 10:59:23 -08:00
Kazu Hirata	2ca15ec5a6	[TableGen] Use ListSeparator (NFC)	2021-02-28 10:59:22 -08:00
Kazu Hirata	454751eae1	[llvm] Use set_is_subset (NFC)	2021-02-28 10:59:20 -08:00
Craig Topper	38291ba7f3	[DAGCombiner] Don't skip no overflow check on UMULO if the first computeKnownBits call doesn't return any 0 bits. Even if the first computeKnownBits call doesn't have any zero bits it is possible the other operand has bitwidth-1 leading zero. In that case overflow is still impossible. So always call computeKnownBits for both operands.	2021-02-28 08:26:22 -08:00
Matt Arsenault	6e55b5e1e6	AMDGPU/GlobalISel: Add subtarget to a test SelectionDAG forces us to have a weird ABI for 16-bit values without legal 16-bit operations, but currently GlobalISel bypasses this and sometimes ends up using the gfx8+ ABI in some contexts. Make sure we're testing the normal ABI to avoid a test change in a future patch.	2021-02-28 10:29:25 -05:00
Sanjay Patel	a277205632	[InstCombine] avoid infinite loop in demanded bits for select https://llvm.org/PR49205	2021-02-28 10:17:53 -05:00
David Green	941edbe847	[ARM] VMOVN undef folding If we insert undef using a VMOVN, we can just use the original value in three out of the four possible combinations. Using VMOVT into an undef vector will still require the lanes to be moved, but otherwise the non-undef value can be used.	2021-02-28 14:44:45 +00:00
Simon Pilgrim	cf1a34fda4	[X86][AVX] Reuse existing VBROADCAST(x) for SCALAR_TO_VECTOR(x) Similar to what we already do for BROADCASTs of different vector sizes - if we're going to broadcast it anyway might as well reuse it.	2021-02-28 11:37:27 +00:00
David Green	ae50d26182	[ARM] VECTOR_REG_CAST undef -> undef Propagate undef through VECTOR_REG_CAST nodes, allowing extra simplification in some patterns.	2021-02-28 11:13:49 +00:00
Wei Mi	510612328d	[SampleFDO] Add a cutoff flag to control how many symbols will be included into profile symbol list. When a test is unrepresentative of production behavior, a sample profile collected from production can cause unexpected performance behavior in the test. To triage such issues, it is useful to have a cutoff flag to control how many symbols will be included into the profile symbol list in order to do binary search. Differential Revision: https://reviews.llvm.org/D97623	2021-02-27 23:15:31 -08:00
Craig Topper	ac78a77509	[X86] Add avx512f command lines to vec_smulo and vec_umulo.	2021-02-27 21:16:42 -08:00
Chen Zheng	0abf3f2c78	[Debug-Info][NFC] use emitDwarfUnitLength for debug line section Use emitDwarfUnitLength for debug line, so we can benefit from overriding of emitDwarfUnitLength inside different streamers. Reviewed By: ikudrin, dblaikie Differential Revision: https://reviews.llvm.org/D95998	2021-02-27 22:33:49 -05:00
William S. Moses	fb8af1f3c6	[Attributor] Conditionally delete fns Allow the attributor to delete functions only if requested Differential Revision: https://reviews.llvm.org/D97238	2021-02-27 20:37:42 -05:00
Craig Topper	fffdcb057e	[X86] Fix a couple comments that said LHS where they meant RHS. NFC	2021-02-27 17:14:17 -08:00
Craig Topper	6edf5e9419	[X86] Add back SSE check prefix for vec-umulo.ll. Regenerate vec-smulo.ll. NFC Simon modified the check prefixes in these tests while D97160 was pending review. When D97160 was committed it had not been updated; it merged cleanly, but didn't comprehend the check prefix changes.	2021-02-27 15:18:09 -08:00
Tony Tye	e33a5d6364	[NFC][AMDGPU] Document the AMDGPU target feature defaults Document the default for the XNACK and SRAMECC target features for code object V2-V3 and V4. Reviewed By: kzhuravl Differential Revision: https://reviews.llvm.org/D97598	2021-02-27 18:28:15 +00:00
Kazu Hirata	cf6f3cec61	[IR] Use range-based for loops (NFC)	2021-02-27 10:09:25 -08:00
Kazu Hirata	9d2bb4e874	[llvm] Fix typos in documentation (NFC)	2021-02-27 10:09:23 -08:00
Kazu Hirata	ff7df6f0b8	[llvm-readobj] Use ListSeparator (NFC)	2021-02-27 10:09:22 -08:00
Sanjay Patel	c42cc1f879	[SimplifyCFG] avoid illegal phi with both poison and undef In the example based on: https://llvm.org/PR49218 ...we are crashing because poison is a subclass of undef, so we merge blocks and create: PHI node has multiple entries for the same basic block with different incoming values! %k3 = phi i64 [ poison, %entry ], [ %k3, %g ], [ undef, %entry ] If both poison and undef values are incoming, we soften the poison values to undef. Differential Revision: https://reviews.llvm.org/D97495	2021-02-27 09:10:32 -05:00
Wang, Pengfei	80c4b80982	[X86] Disable rematerialization for PTILELOADDV Per the discussion in D97453. We currently disable it because it is not a common scenario and has some implementation problems. Differential Revision: https://reviews.llvm.org/D97453	2021-02-27 21:08:58 +08:00
Ella Ma	1f7cac962d	[llvm] Add assertions for the smart pointers with the possibility to be null in DWARFLinker::loadClangModule Split from D91844. The local variable `Unit` in function `DWARFLinker::loadClangModule` in file `llvm/lib/DWARFLinker/DWARFLinker.cpp` may be left unset by the loop below its definition, in which case it will trigger a null pointer dereference after the loop. Patch By: OikawaKirie Reviewed By: avl Differential Revision: https://reviews.llvm.org/D97185	2021-02-27 10:14:39 +03:00
Kazu Hirata	ca656eb6e9	[Transforms/Utils] Use range-based for loops (NFC)	2021-02-26 22:36:40 -08:00
Kazu Hirata	ea4b300a92	[TableGen] Use ListSeparator (NFC)	2021-02-26 22:36:38 -08:00
Heejin Ahn	1efa2cfaba	[WebAssembly] Fix reverse mapping in WasmEHFuncInfo D97247 added the reverse mapping from unwind destination to their source, but it had a critical bug; sources can be multiple, because multiple BBs can have a single BB as their unwind destination. This changes `WasmEHFuncInfo::getUnwindSrc` to `getUnwindSrcs` and makes it return a vector rather than a single BB. It does not return the const reference to the existing vector but creates a new vector because `WasmEHFuncInfo` stores not `BasicBlock` or `MachineBasicBlock` but `PointerUnion` of them. Also I hoped to unify those methods for `BasicBlock` and `MachineBasicBlock` into one using templates to reduce duplication, but failed because various usages require `BasicBlock*` to be `const` but it's hard to make it `const` for `MachineBasicBlock` usages. Fixes https://github.com/emscripten-core/emscripten/issues/13514. (More precisely, fixes https://github.com/emscripten-core/emscripten/issues/13514#issuecomment-784708744) Reviewed By: dschuff, tlively Differential Revision: https://reviews.llvm.org/D97583	2021-02-26 17:12:10 -08:00
Fangrui Song	a2b91c0b40	ELF: Create unique SHF_GNU_RETAIN sections for llvm.used global objects If a global object is listed in `@llvm.used`, place it in a unique section with the `SHF_GNU_RETAIN` flag. The section is a GC root under `ld --gc-sections` with LLD>=13 or GNU ld>=2.36. For front ends which do not expect to see multiple sections of the same name, consider emitting `@llvm.compiler.used` instead of `@llvm.used`. SHF_GNU_RETAIN is restricted to ELFOSABI_GNU and ELFOSABI_FREEBSD in binutils. We don't do the restriction - see the rationale in D95749. The integrated assembler has supported SHF_GNU_RETAIN since D95730. GNU as>=2.36 supports section flag 'R'. We don't need to worry about GNU ld support because older GNU ld just ignores the unknown SHF_GNU_RETAIN. With this change, `__attribute__((retain))` functions/variables emitted by clang will get the SHF_GNU_RETAIN flag. Differential Revision: https://reviews.llvm.org/D97448	2021-02-26 16:38:44 -08:00
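
The combine in 042f6e8e77, add(udot(0, x, y), z) -> udot(z, x, y), can be checked on a scalar model of a single 32-bit lane. The following is only a sketch: `udotLane`, `beforeCombine` and `afterCombine` are hypothetical names, and the model assumes each lane accumulates four unsigned 8-bit products with wrapping 32-bit arithmetic. It is not the AArch64 backend code.

```cpp
#include <array>
#include <cstdint>

using Bytes4 = std::array<std::uint8_t, 4>;

// Hypothetical scalar model of one udot lane:
// Acc + X[0]*Y[0] + X[1]*Y[1] + X[2]*Y[2] + X[3]*Y[3], wrapping modulo 2^32.
std::uint32_t udotLane(std::uint32_t Acc, const Bytes4 &X, const Bytes4 &Y) {
  for (int I = 0; I < 4; ++I)
    Acc += std::uint32_t(X[I]) * std::uint32_t(Y[I]);
  return Acc;
}

// Pattern before the combine: a udot with a zero accumulator, then an add.
std::uint32_t beforeCombine(std::uint32_t Z, const Bytes4 &X, const Bytes4 &Y) {
  return udotLane(0, X, Y) + Z;
}

// Pattern after the combine: Z takes the place of the zero accumulator.
std::uint32_t afterCombine(std::uint32_t Z, const Bytes4 &X, const Bytes4 &Y) {
  return udotLane(Z, X, Y);
}
```

Because unsigned addition is associative and commutative modulo 2^32, both functions return the same value for every input, including when the sum wraps.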
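
For 8e4de8e5a6 (and its revert 66ad213da3), a minimal sketch of a DJB-style string hash with a seed parameter shows why changing the seed changes every hash value, and therefore bucket placement and iteration order. The default seed of 5381 is the conventional DJB starting value. The alternative seed of 0 in the usage note is an assumption chosen for illustration, since the commit message does not state the previous value.

```cpp
#include <cstdint>
#include <string>

// DJB (Bernstein) hash: H = H * 33 + C for every byte of the input.
// The second parameter is the seed, i.e. the starting value of H.
std::uint32_t djbHash(const std::string &Buffer, std::uint32_t H = 5381) {
  for (unsigned char C : Buffer)
    H = (H << 5) + H + C; // (H << 5) + H == H * 33, wrapping modulo 2^32
  return H;
}
```

With this sketch, `djbHash("a")` and `djbHash("a", 0)` differ, so a hash table keyed on them would generally place and visit the same keys in a different order.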
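
The reasoning in ce5d619908 about masked shift amounts can be reproduced on 32-bit integers. This is a sketch under stated assumptions: `maskedPattern`, `fshl` and `rotl` are illustrative helpers, not DAGCombiner functions, and `fshl` models a funnel shift left whose amount is taken modulo the bit width.

```cpp
#include <cstdint>

// The source pattern with ANDs on the shift amounts:
// (X << (N & 31)) | (Y >> (-N & 31)).
std::uint32_t maskedPattern(std::uint32_t X, std::uint32_t Y, std::uint32_t N) {
  return (X << (N & 31)) | (Y >> ((0u - N) & 31));
}

// Reference funnel shift left: the high 32 bits of (X:Y) << (N % 32).
std::uint32_t fshl(std::uint32_t X, std::uint32_t Y, std::uint32_t N) {
  std::uint64_t Concat = (std::uint64_t(X) << 32) | Y;
  return std::uint32_t((Concat << (N % 32)) >> 32);
}

// A rotate is a funnel shift with both inputs equal.
std::uint32_t rotl(std::uint32_t X, std::uint32_t N) { return fshl(X, X, N); }
```

When the masked amount is 0 the pattern yields `X | Y`, while the funnel shift yields `X`. The two agree only if `X == Y`, which is the rotate case.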
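
The observation in 38291ba7f3 is that an unsigned multiply can be proven not to overflow even when nothing is known about one operand, as long as the other operand is small enough. A rough model, assuming 32-bit operands described only by a mask of bits known to be zero, follows. `maxFromKnownZero` and `umulCannotOverflow` are hypothetical helpers, not the DAGCombiner implementation.

```cpp
#include <cstdint>

// Largest value an operand can take when the bits set in KnownZero are
// known to be zero and all other bits are unknown.
std::uint32_t maxFromKnownZero(std::uint32_t KnownZero) { return ~KnownZero; }

// Overflow is impossible if even the product of the two maximum values
// fits in 32 bits. Both operands are always inspected.
bool umulCannotOverflow(std::uint32_t KnownZero0, std::uint32_t KnownZero1) {
  std::uint64_t Product = std::uint64_t(maxFromKnownZero(KnownZero0)) *
                          std::uint64_t(maxFromKnownZero(KnownZero1));
  return Product <= UINT32_MAX;
}
```

For example, `umulCannotOverflow(0, 0xFFFFFFFE)` is true: the first operand has no known zero bits, but the second has 31 leading zeros, so it is at most 1 and the product always fits.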


212010 Commits