llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 19:12:56 +02:00

Author	SHA1	Message	Date
Yusra Syeda	4857db55d1	[SystemZ/z/OS] Add XPLINK 64-bit calling convention to tablegen. This commit adds the initial changes to the SystemZ target description for the XPLINK 64-bit calling convention on z/OS. Additions include: - a new predicate IsTargetXPLINK64 - different register allocation order - generaton of nopr after a call Reviewed-by: uweigand Differential Revision: https://reviews.llvm.org/D96887	2021-02-19 18:39:49 -05:00
Philip Reames	ddb3bf5440	[ValueTracking] Add a two argument form of safeCtxI [NFC] The existing implementation was relying on order of evaluation to achieve a particular result. This got really confusing when wanting to change the handling for arguments in a later patch.	2021-02-19 14:52:51 -08:00
Amara Emerson	d83de2e66b	[AArch64][GlobalISel] Make G_VECREDUCE_ADD of <2 x s32> legal.	2021-02-19 14:28:21 -08:00
Jianzhou Zhao	ff60cc1168	[dfsan] Add origin address calculation This is a part of https://reviews.llvm.org/D95835. Reviewed-by: morehouse Differential Revision: https://reviews.llvm.org/D97065	2021-02-19 21:30:07 +00:00
Craig Topper	3867db163b	[RISCV] Remove VPatILoad and VPatIStore multiclasses that are no longer used. NFC	2021-02-19 13:23:08 -08:00
Philip Reames	56af664c20	Add datalayout to test added in 7e3183d73 Realized after pushing this would probably fail on bots for other than x86-64.	2021-02-19 13:10:19 -08:00
Philip Reames	ae521e31d1	Add test triggered by review discussion on D97077	2021-02-19 13:03:58 -08:00
Tim Shen	e909c229f8	Patch by @wecing (Chenguang Wang). The current getFoldedSizeOf() implementation uses naive recursion, which could be really slow when the input structure type is too complex. This issue was first brought up in http://llvm.org/bugs/show_bug.cgi?id=8281; this change fixes it by adding memoization. Differential Revision: https://reviews.llvm.org/D6594	2021-02-19 12:44:17 -08:00
Jianzhou Zhao	fbcfd3cb70	[msan] Set cmpxchg shadow precisely In terms of https://llvm.org/docs/LangRef.html#cmpxchg-instruction, the return type of chmpxchg is a pair {ty, i1}, while I think we only wanted to set the shadow for the address 0th op, and it has type ty. Reviewed-by: eugenis Differential Revision: https://reviews.llvm.org/D97029	2021-02-19 20:23:23 +00:00
Philip Reames	4fe33c803d	precommit test cleanup for D97077	2021-02-19 12:19:39 -08:00
Sanjay Patel	eb2508cda0	[Verifier] remove dead code for saturating intrinsics; NFC Test coverage shows that we assert with the string from the tablegen defs file for these intrinsics, so these cases should never be live.	2021-02-19 14:58:25 -05:00
Sanjay Patel	b2a0193b6b	[Verifier] add tests for saturating intrinsics; NFC As noted in D96904, we don't have direct tests for these malformed ops.	2021-02-19 14:58:25 -05:00
Haowei Wu	7c097db22c	[elfabi] Fix a bug when .dynsym contains no non-local symbol This patch fixed a bug when elbabi was supplied with a tbe file contains no non-local symbol. Before this patch, it wrote 0 to sh_info of the .dynsym section, making the ELF stub file invalid. This patch fixed this issue. Differential Revision: https://reviews.llvm.org/D96930	2021-02-19 11:36:53 -08:00
Sanjay Patel	11bf01464d	[Analysis][LoopVectorize] do not form reductions of pointers This is a fix for https://llvm.org/PR49215 either before/after we make a verifier enhancement for vector reductions with D96904. I'm not sure what the current thinking is for pointer math/logic in IR. We allow icmp on pointer values. Therefore, we match min/max patterns, so without this patch, the vectorizer could form a vector reduction from that sequence. But the LangRef definitions for min/max and vector reduction intrinsics do not allow pointer types: https://llvm.org/docs/LangRef.html#llvm-smax-intrinsic https://llvm.org/docs/LangRef.html#llvm-vector-reduce-umax-intrinsic So we would crash/assert at some point - either in IR verification, in the cost model, or in codegen. If we do want to allow this kind of transform, we will need to update the LangRef and all of those parts of the compiler. Differential Revision: https://reviews.llvm.org/D97047	2021-02-19 14:01:57 -05:00
Craig Topper	5925943057	[RISCV] Use inheritance to reduce some repeated code in tablegen. NFC The VLX and VSX searchable tables, share the same format so we can have a common base class for them.	2021-02-19 10:42:18 -08:00
Simon Pilgrim	257b9a938c	[X86] Regenerate 2007-06-28-X86-64-isel.ll	2021-02-19 18:35:15 +00:00
Simon Pilgrim	383882990f	[X86] Remove unused intrinsic declaration	2021-02-19 18:35:14 +00:00
Simon Pilgrim	8fe41f7cf9	[X86] Regenerate 2011-12-06-AVXVectorExtractCombine.ll	2021-02-19 18:35:14 +00:00
Craig Topper	b4a550cc03	[RISCV] Remove unneeded indexed segment load/store vector pseudo instruction. We had more combinations of data and index lmuls than we needed. Also add some asserts to verify that the IndexVT and data VT have the same element count when we isel these pseudo instructions.	2021-02-19 10:28:48 -08:00
Craig Topper	eb3717e534	[RISCV] Use custom isel for vector indexed load/store intrinsics. There are many legal combinations of index and data VTs supported for these intrinsics. This results in a lot of isel patterns in RISCVGenDAGISel.inc. By adding a separate table similar to what we use for segment load/stores, we can more efficiently manually select these intrinsics. We should also be able to reuse this table scalable vector gather/scatter. This reduces the llc binary size by ~56K. Reviewed By: khchen Differential Revision: https://reviews.llvm.org/D97033	2021-02-19 10:10:06 -08:00
Craig Topper	12d4bce8f8	[RISCV] Prevent selecting a 0 VL to X0 for the segment load/store intrinsics. Just like we do for isel patterns, we need to call selectVLOp to prevent 0 from being selected to X0 by the default isel. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D97021	2021-02-19 10:07:12 -08:00
Craig Topper	0858057f74	[RISCV] Move SHFLI matching to DAG combine. Add 32-bit support for RV64 We previously used isel patterns for this, but that used quite a bit of space in the isel table due to OR being associative and commutative. It also wouldn't handle shifts/ands being in reversed order. This generalizes the shift/and matching from GREVI to take the expected mask table as input so we can reuse it for SHFLI. There is no SHFLIW instruction, but we can promote a 32-bit SHFLI to i64 on RV64. As long as bit 4 of the control bit isn't set, a 64-bit SHFLI will preserve 33 sign bits if the input had at least 33 sign bits. ComputeNumSignBits has been updated to account for that to avoid sext.w in the tests. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D96661	2021-02-19 10:07:12 -08:00
Wei Mi	9407268caa	[SampleFDO] Add PromotedInsns to prevent repeated ICP. In https://reviews.llvm.org/rG5fb65c02ca5e91e7e1a00e0efdb8edc899f3e4b9, We use 0 count value profile to memorize which target has been promoted and prevent repeated ICP for the same target, so we delete PromotedInsns. However, I found the implementation in the patch has some shortcomings to be fixed otherwise there will still be repeated ICP. So I add PromotedInsns back temorarily. Will remove it after I get a thorough fix.	2021-02-19 10:01:49 -08:00
Jessica Paquette	3471d7d959	[AArch64][GlobalISel] Run redundant_sext_inreg in the post-legalizer combiner This is to ensure that we can eliminate G_ASSERT_SEXT. In a follow-up patch, I'm going to make CallLowering emit G_ASSERT_SEXT for signext parameters. Differential Revision: https://reviews.llvm.org/D96913	2021-02-19 09:34:47 -08:00
Benjamin Kramer	ba5cce52d3	[LV] Fold single-use variable into assert. NFC.	2021-02-19 18:11:39 +01:00
Nikita Popov	92fca6922a	[MemCopyOpt] Enable MemorySSA by default This enables use of MemorySSA instead of MemDep in MemCpyOpt. To allow this without significant compile-time impact, the MemCpyOpt pass is moved directly before DSE (in the cases where this was not already the case), which allows us to reuse the existing MemorySSA analysis. Unlike the MemDep-based implementation, the MemorySSA-based MemCpyOpt can also perform simple optimizations across basic blocks. Differential Revision: https://reviews.llvm.org/D94376	2021-02-19 18:06:25 +01:00
Philip Reames	289a0fd30f	[SCEV] Use both known bits and sign bits when computing range of SCEV unknowns When computing a range for a SCEVUnknown, today we use computeKnownBits for unsigned ranges, and computeNumSignBots for signed ranges. This means we miss opportunities to improve range results. One common missed pattern is that we have a signed range of a value which CKB can determine is positive, but CNSB doesn't convey that information. The current range includes the negative part, and is thus double the size. Per the removed comment, the original concern which delayed using both (after some code merging years back) was a compile time concern. CTMark results (provided by Nikita, thanks!) showed a geomean impact of about 0.1%. This doesn't seem large enough to avoid higher quality results. Differential Revision: https://reviews.llvm.org/D96534	2021-02-19 08:29:12 -08:00
Mircea Trofin	dd4f4eac1d	[NFC][Regalloc] Share the VirtRegAuxInfo object with LiveRangeEdit VirtRegAuxInfo is an extensibility point, so the register allocator's decision on which implementation to use should be communicated to the other users - namely, LiveRangeEdit. Differential Revision: https://reviews.llvm.org/D96898	2021-02-19 07:44:28 -08:00
madhur13490	00aa32634a	Make fixed-abi default for AMD HSA OS fixed-abi uses pre-defined and predictable SGPR/VGPRs for passing arguments. This patch makes this scheme default when HSA OS is specified in triple. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D96340	2021-02-19 15:05:25 +00:00
David Green	7d67e26f6b	[ARM] Correct vector predicate type in MVE getCmpSelInstrCost	2021-02-19 14:43:51 +00:00
Jay Foad	cfbb508b8c	[AMDGPU] Add some GFX9 test coverage. NFC.	2021-02-19 14:38:52 +00:00
Simon Pilgrim	2c96887376	[DAG] visitTRUNCATE - attempt to truncate USUBSAT Fold trunc(usubsat(zext(x),y)) -> usubsat(x,trunc(umin(y,satlimit)))	2021-02-19 14:26:05 +00:00
Djordje Todorovic	872e6ef35a	[llvm-dwarfdump][locstats] Unify handling of inlined vars with no loc The presence or absence of an inline variable (as well as formal parameter) with only an abstract_origin ref (without DW_AT_location) should not change the location coverage. It means, for both: DW_TAG_inlined_subroutine DW_AT_abstract_origin (0x0000004e "f") DW_AT_low_pc (0x0000000000000010) DW_AT_high_pc (0x0000000000000013) DW_TAG_formal_parameter DW_AT_abstract_origin (0x0000005a "b") and, DW_TAG_inlined_subroutine DW_AT_abstract_origin (0x0000004e "f") DW_AT_low_pc (0x0000000000000010) DW_AT_high_pc (0x0000000000000013) we should report 0% location coverage. If we add DW_AT_location, for both cases the coverage should be improved. Differential Revision: https://reviews.llvm.org/D96045	2021-02-19 05:38:01 -08:00
David Green	f3a8ca86f5	Revert "[ARM] Expand the range of allowed post-incs in load/store optimizer" This reverts commit 3b34b06fc5908b4f7dc720c0655d5756bd8e2a28 as runtime errors were reported.	2021-02-19 13:15:10 +00:00
Florian Hahn	832b7f4044	[LV] Remove VPCallback. Now that all state for generated instructions is managed directly in VPTransformState, VPCallBack is no longer needed. This patch updates the last use of `getOrCreateScalarValue` to instead manage the value directly in VPTransformState and removes VPCallback. Reviewed By: gilr Differential Revision: https://reviews.llvm.org/D95383	2021-02-19 12:50:41 +00:00
Simon Pilgrim	2db41da76d	[X86][SSE] Add tests for trunc(usubsat()) patterns.	2021-02-19 12:26:48 +00:00
Nico Weber	7d651b066b	[gn build] Port 1a2b3536efef	2021-02-19 07:23:48 -05:00
Fraser Cormack	4b59fc7239	[RISCV] Address some clang-tidy warnings. NFCI.	2021-02-19 12:10:28 +00:00
Carl Ritson	d4de5e47d8	[AMDGPU] WQM/WWM: Fix marking of partial definitions Track lanes when processing definitions for marking WQM/WWM. If all lanes have been defined then marking can stop. This prevents marking unnecessary instructions as WQM/WWM. In particular this fixes a bug where values passing through V_SET_INACTIVE would me marked as requiring WWM. Reviewed By: piotr Differential Revision: https://reviews.llvm.org/D95503	2021-02-19 20:45:24 +09:00
Nikita Popov	93b123f786	[DCE] Don't remove non-willreturn calls In both ADCE and BDCE (via DemandedBits) we should not remove instructions that are not guaranteed to return. This issue was pointed out by fhahn in the recent llvm-dev thread. Differential Revision: https://reviews.llvm.org/D96993	2021-02-19 12:35:40 +01:00
Simon Pilgrim	c1001d823d	Remove unnecessary "using namespace llvm" inside "namespace llvm". NFCI.	2021-02-19 11:15:16 +00:00
Simon Pilgrim	fb8fd00830	[X86][AVX] getFauxShuffleMask - decode VBROADCAST(EXTRACT_VECTOR_ELT(V,0)) Handle the case where we're broadcasting a scalar extracted from another vector.	2021-02-19 11:06:53 +00:00
Nikita Popov	56e22ed4c5	[IR] Move willReturn() to Instruction This moves the willReturn() helper from CallBase to Instruction, so that it can be used in a more generic manner. This will make it easier to fix additional passes (ADCE and BDCE), and will give us one place to change if additional instructions should become non-willreturn (e.g. there has been talk about handling volatile operations this way). I have also included the IntrinsicInst workaround directly in here, so that it gets applied consistently. (As such this change is not entirely NFC -- FuncAttrs will now use this as well.) Differential Revision: https://reviews.llvm.org/D96992	2021-02-19 11:56:01 +01:00
Nikita Popov	287dbdb68d	[BasicAA] Add simple depth limit to avoid stack overflow (PR49151) This is a simpler variant of D96647. It just adds a straightforward depth limit with a high cutoff, without introducing complex logic for BatchAA consistency. It accepts that we may cache a sub-optimal result if the depth limit is hit. Eventually this should be more fully addressed by D96647 or similar, but in the meantime this avoids stack overflows in a cheap way. Differential Revision: https://reviews.llvm.org/D96996	2021-02-19 11:05:42 +01:00
Djordje Todorovic	ff9317c291	[docs] Fix the GlobalISel/GenericOpcode.rst This couses docs build to fail. Introduced with D96890.	2021-02-19 10:31:31 +01:00
Wang, Pengfei	2da858213e	[X86] Fix a codegen crash in getSetCCResultType This patch fixes some crashes coming from X86ISelLowering::getSetCCResultType, which would occasionally return an EVT constructed from an invalid MVT, which has a null Type pointer. This patch refers to D95434. Differential Revision: https://reviews.llvm.org/D97036	2021-02-19 17:30:10 +08:00
Sjoerd Meijer	23fb2ffcc6	[AArch64] Add some missing Neoverse features This enables AES fusion and the post RA scheduler for the Neoverse cores. And while we are it also for the A55 that we had missed earlier. Differential Revision: https://reviews.llvm.org/D96866	2021-02-19 09:18:35 +00:00
Qiu Chaofan	41dafb0875	[llvm-exegesis] Ignore instructions using custom inserter Some instructions defined in table-gen files sets usesCustomInserter bit, which means it has to be lowered by target code and isn't actually valid instruction at MC level. So we should treat them like pseudo instructions. Reviewed By: gchatelet Differential Revision: https://reviews.llvm.org/D94898	2021-02-19 17:04:27 +08:00
Qiu Chaofan	48961acee2	[llvm-exegesis] [PowerPC] Add basic LIT test Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D94897	2021-02-19 17:01:05 +08:00
David Green	20be3bc420	[NPM][LTO] Do not enable MemorySSA with LoopFullUnrollPass As with the standard opt pipeline, we disable the MemorySSA dependency in the LTO LPM pipeline as not all passes preserve MemorySSA.	2021-02-19 08:35:11 +00:00

1 2 3 4 5 ...

211506 Commits