llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-02-01 05:01:59 +01:00

Author	SHA1	Message	Date
Sander de Smalen	275d51c545	[AArch64][SVE] Don't support fixedStack for SVE objects. Fixed stack objects are preallocated and defined to be allocated before any of the regular stack objects. These are normally used to model stack arguments. The AAPCS does not support passing SVE registers on the stack by value (only by reference). The current layout also doesn't place them before all stack objects, but rather before all SVE objects. Removing this simplifies the code that emits the allocation/deallocation around callee-saved registers (D84042). This patch also removes all uses of fixedStack from from framelayout-sve.mir, where this was used purely for testing purposes. Reviewers: paulwalker-arm, efriedma, rengolin Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D84538	2020-07-28 15:45:53 +01:00
Francesco Petrogalli	79f86bfe26	[llvm][CodeGen] Addressing modes for SVE ldN. Reviewers: c-rhodes, efriedma, sdesmalen Subscribers: huihuiz, tschuett, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77251	2020-07-27 22:18:28 +00:00
Arthur Eubanks	fca7fa2021	Prefix some AArch64/ARM passes with "aarch64-"/"arm-" For consistency with other target specific passes. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D84560	2020-07-27 11:00:39 -07:00
Jon Roelofs	c0e6798650	[AArch64] fjcvtzs,rmif,cfinv,setf* all clobber nzcv Differential Revision: https://reviews.llvm.org/D83818	2020-07-27 09:17:53 -06:00
David Sherwood	7010946aee	[SVE] Don't use LocalStackAllocation for SVE objects I have introduced a new TargetFrameLowering query function: isStackIdSafeForLocalArea that queries whether or not it is safe for objects of a given stack id to be bundled into the local area. The default behaviour is to always bundle regardless of the stack id, however for AArch64 this is overriden so that it's only safe for fixed-size stack objects. There is future work here to extend this algorithm for multiple local areas so that SVE stack objects can be bundled together and accessed from their own virtual base-pointer. Differential Revision: https://reviews.llvm.org/D83859	2020-07-27 08:22:01 +01:00
QingShan Zhang	ab3ffdcb1e	[Scheduling] Improve group algorithm for store cluster Store Addr and Store Addr+8 are clusterable pair. They have memory(ctrl) dependency on different loads. Current implementation will put these two stores into different group and miss to cluster them. Reviewed By: evandro Differential Revision: https://reviews.llvm.org/D84139	2020-07-27 02:02:40 +00:00
Amara Emerson	8003a99b34	[AArch64][GlobalISel] Make <8 x s16> and <16 x s8> legal types for G_SHUFFLE_VECTOR and G_IMPLICIT_DEF. Trivial change, we're still missing support for rev matching for these types in the combiner.	2020-07-26 00:48:09 -07:00
Jessica Paquette	0266a8b395	[AArch64][GlobalISel] Look through constants when selection stores of 0 Very minor code size improvements (hits 8 times in Bullet at -O3), but still something. Also very minor NFC change to make sure we only search for a 0 constant when selecting a store. Before, we'd do this for loads as well. Differential Revision: https://reviews.llvm.org/D84573	2020-07-24 22:46:14 -07:00
Jessica Paquette	b6d2a38769	[AArch64][GlobalISel] Use wzr/xzr for 16 and 32 bit stores of zero We weren't performing this optimization on 16 and 32 bit stores. SDAG happily does this though. e.g. https://godbolt.org/z/cWocKr This saves about 0.2% in code size on CTMark at -O3. Differential Revision: https://reviews.llvm.org/D84568	2020-07-24 17:15:20 -07:00
Matt Arsenault	16ca0e0369	GlobalISel: Define mulfix/divfix opcodes The full expansion involves the funnel shifts, which depend on another patch to expand those.	2020-07-24 20:02:20 -04:00
Amara Emerson	8f9bed84e6	[AArch64][GlobalISel] Promote G_UITOFP vector operands to same elt size as result. Fixes legalization failures.	2020-07-24 17:00:50 -07:00
Eli Friedman	d2a7f965f0	[AArch64][SVE] Add "fast" fcmp operations. dacf8d3 added support for most fcmp operations, but there are some extra variations I hadn't considered: SelectionDAG supports float comparisons that are neither ordered nor unordered. Add support for the missing operations. Differential Revision: https://reviews.llvm.org/D84460	2020-07-24 13:22:41 -07:00
Francesco Petrogalli	5f0a55fcfc	[llvm][sve] Reg + Imm addressing mode for ld1ro. Reviewers: kmclaughlin, efriedma, sdesmalen Subscribers: tschuett, hiraditya, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D83357	2020-07-24 17:48:47 +00:00
Eli Friedman	f7eaa43249	[AArch64][SVE] Teach copyPhysReg to copy ZPR2/3/4. It's sort of tricky to hit this in practice, but not impossible. I have a synthetic C testcase if anyone is interested. The implementation is identical to the equivalent NEON register copies. Differential Revision: https://reviews.llvm.org/D84373	2020-07-23 16:41:37 -07:00
Amara Emerson	b33e55da04	[AArch64][GlobalISel] Add post-legalize combine for sext(trunc(sextload)) -> trunc/copy On AArch64 we generate redundant G_SEXTs or G_SEXT_INREGs because of this. Differential Revision: https://reviews.llvm.org/D81993	2020-07-23 12:06:35 -07:00
Evgeny Leviant	355534b549	[CodeGen][TargetPassConfig] Add unreachable-mbb-elimination pass explicitly Differential revision: https://reviews.llvm.org/D84228	2020-07-23 18:05:11 +03:00
Konstantin Schwarz	e3dfca8d38	[GlobalISel][InlineAsm] Add register class ID to the flags of register input operands Summary: We do this already for output operands, but missed it for (non-tied) input operands. Reviewers: arsenm, Petar.Avramovic Reviewed By: arsenm Subscribers: jvesely, wdng, nhaehnle, rovka, hiraditya, llvm-commits, kerbowa Tags: #llvm Differential Revision: https://reviews.llvm.org/D83763	2020-07-23 13:35:01 +02:00
Sander de Smalen	d7fe459f10	[AArch64][SVE] Correctly allocate scavenging slot in presence of SVE. This patch addresses two issues: * Forces the availability of the base-pointer (x19) when the frame has both scalable vectors and variable-length arrays. Otherwise it will be expensive to access non-SVE locals. * In presence of SVE stack objects, it will allocate the emergency scavenging slot close to the SP, so that they can be accessed from the SP or BP if available. If accessed from the frame-pointer, it will otherwise need an extra register to access the scavenging slot because of mixed scalable/non-scalable addressing modes. Reviewers: efriedma, ostannard, cameron.mcinally, rengolin, david-arm Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D70174	2020-07-22 10:50:36 +01:00
Amara Emerson	751c6ffec8	Revert "[AArch64][GlobalISel] Add post-legalize combine for sext_inreg(trunc(sextload)) -> copy" This reverts commit 64eb3a4915f00cca9af4c305a9ff36209003cd7b. It caused miscompiles with optimizations enabled. Reverting while I investigate.	2020-07-21 16:01:18 -07:00
Amara Emerson	fb2c0a0ef7	[AArch64][GlobalISel] Fix TLS accesses clobbering registers incorrectly. This was happening because the BLR didn't have a use of the X0 arg register, which would end up being re-used in high reg pressure situations. The change also avoids hard coding the use of X0 for the sequence except to copy the value for the call. ld64 should still be able to optimize it. rdar://65438258	2020-07-21 16:01:17 -07:00
Matt Arsenault	f9701c3fe0	GlobalISel: Translate llvm.powi intrinsic There are a few questionable things about this intrinsic and existing DAG implementation. For some reason the intrinsic hardcodes the second operand to be scalar-only i32, and SelectionDAG builder makes a legalization decision based on whether the operand is constant.	2020-07-21 18:13:04 -04:00
Sander de Smalen	cd95c17a17	[AArch64][SVE] Fix PCS for functions taking/returning scalable types. The default calling convention needs to save/restore the SVE callee saves according to the SVE PCS when the function takes or returns scalable types, even when the `aarch64_sve_vector_pcs` CC is not specified for the function. Reviewers: efriedma, paulwalker-arm, david-arm, rengolin Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D84041	2020-07-21 15:55:39 +01:00
Petre-Ionut Tudor	1210771898	[ARM] Generate [SU]HADD from ((a + b) >> 1) Summary: Teach LLVM to recognize the above pattern, where the operands are either signed or unsigned types. Subscribers: kristof.beyls, hiraditya, danielkiss, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D83777	2020-07-21 13:22:07 +01:00
Eli Friedman	941026e51d	[AArch64][SVE] Add support for trunc to <vscale x N x i1>. This isn't a natively supported operation, so convert it to a mask+compare. In addition to the operation itself, fix up some surrounding stuff to make the testcase work: we need concat_vectors on i1 vectors, we need legalization of i1 vector truncates, and we need to fix up all the relevant uses of getVectorNumElements(). Differential Revision: https://reviews.llvm.org/D83811	2020-07-20 13:11:02 -07:00
Yuanfang Chen	bf8086d1c1	[llc] (almost) remove `--print-machineinstrs` Its effect could be achieved by `-stop-after`,`-print-after`,`-print-after-all`. But a few tests need to print MIR after ISel which could not be done with `-print-after`/`-stop-after` since isel pass does not have commandline name. That's the reason `--print-machineinstrs` is downgraded to `--print-after-isel` in this patch. `--print-after-isel` could be removed after we switch to new pass manager since isel pass would have a commandline text name to use `print-after` or equivalent switches. The motivation of this patch is to reduce tests dependency on would-be-deprecated feature. Reviewed By: arsenm, dsanders Differential Revision: https://reviews.llvm.org/D83275	2020-07-20 10:43:28 -07:00
Matt Arsenault	b4f7c23c9a	AArch64/GlobalISel: Fix hardcoded registers in error message checks	2020-07-20 10:06:18 -04:00
Paul Walker	75f1125292	[SVE] Add lowering for fixed length vector fdiv, fma, fmul and fsub operations. Differential Revision: https://reviews.llvm.org/D84034	2020-07-20 11:57:34 +00:00
Elvina Yakubova	6cd76408bf	[llvm-readobj] Update tests because of changes in llvm-readobj behavior This patch updates tests using llvm-readobj and llvm-readelf, because soon reading from stdin will be achievable only via a '-' as described here: https://bugs.llvm.org/show_bug.cgi?id=46400. Patch with changes to llvm-readobj behavior is here: https://reviews.llvm.org/D83704 Differential Revision: https://reviews.llvm.org/D83912 Reviewed by: jhenderson, MaskRay, grimar	2020-07-20 10:39:04 +01:00
Tim Northover	24f7263412	AArch64: emit @llvm.debugtrap as `brk #0xf000` on all platforms It's useful for a debugger to be able to distinguish an @llvm.debugtrap from a (noreturn) @llvm.trap, so this extends the existing Windows behaviour to other platforms.	2020-07-20 10:31:26 +01:00
Evgeny Leviant	3d264ff874	[CodeGen][TargetPassConfig] Add TargetTransformInfo pass correctly Patch adds tti pass directly enforcing its execution with correctly set TargetTransformInfo. Differential revision: https://reviews.llvm.org/D84047	2020-07-18 14:11:40 +03:00
Jay Foad	3f23d4b8c3	[MachineScheduler] Fix the TopDepth/BotHeightReduce latency heuristics tryLatency compares two sched candidates. For the top zone it prefers the one with lesser depth, but only if that depth is greater than the total latency of the instructions we've already scheduled -- otherwise its latency would be hidden and there would be no stall. Unfortunately it only tests the depth of one of the candidates. This can lead to situations where the TopDepthReduce heuristic does not kick in, but a lower priority heuristic chooses the other candidate, whose depth is greater than the already scheduled latency, which causes a stall. The fix is to apply the heuristic if the depth of either candidate is greater than the already scheduled latency. All this also applies to the BotHeightReduce heuristic in the bottom zone. Differential Revision: https://reviews.llvm.org/D72392	2020-07-17 11:02:13 +01:00
Paul Walker	bb56ec9a61	[SVE] Add lowering for scalable vector fadd, fdiv, fmul and fsub operations. Lower the operations to predicated variants. This is prep work required for fixed length code generation but also fixes a bug whereby these operations fail selection when "unpacked" vector types (e.g. MVT::nxv2f32) are used. This patch also adds the missing "unpacked" patterns for FMA. Differential Revision: https://reviews.llvm.org/D83765	2020-07-16 11:31:35 +00:00
Kerry McLaughlin	30eb603e95	[SVE][CodeGen] Legalisation of masked loads and stores Summary: This patch modifies IncrementMemoryAddress to use a vscale when calculating the new address if the data type is scalable. Also adds tablegen patterns which match an extract_subvector of a legal predicate type with zip1/zip2 instructions Reviewers: sdesmalen, efriedma, david-arm Reviewed By: efriedma, david-arm Subscribers: tschuett, hiraditya, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D83137	2020-07-16 10:55:45 +01:00
Hiroshi Yamauchi	bd196de5cc	[PGO][PGSO] Add profile guided size optimization to LegalizeDAG. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D83333	2020-07-15 10:03:38 -07:00
Roger Ferrer Ibanez	97e701dc7a	[DAGCombiner] Rebuild (setcc x, y, ==) from (xor (xor x, y), 1) The existing code already considered this case. Unfortunately a typo in the condition prevents it from triggering. Also the existing code, had it run, forgot to do the folding. This fixes PR42876. Differential Revision: https://reviews.llvm.org/D65802	2020-07-15 07:34:22 +00:00
Roger Ferrer Ibanez	7ef6c6e4c3	[NFC] Add tests for boolean comparisons They currently show that the not equal case may be improved. See PR42876 Differential Revision: https://reviews.llvm.org/D65801	2020-07-15 07:33:43 +00:00
Paul Walker	0e29ab3acf	[SelectionDAG] Prevent warnings when extracting fixed length vector from scalable. ComputeNumSignBits and computeKnownBits both trigger "Scalable flag may be dropped" warnings when a fixed length vector is extracted from a scalable vector. This patch assumes nothing about the demanded elements thus matching the behaviour when extracting a scalable vector from a scalable vector. Differential Revision: https://reviews.llvm.org/D83642	2020-07-14 11:12:56 +00:00
Sander de Smalen	d549b86c4b	[AArch64][SVE] Remove erroneous assert in resolveFrameOffsetReference The code already supports addressing a fixed-size stack object from the frame-pointer, by first subtracting sizeof(SVE area) from FP. Reviewers: efriedma, cameron.mcinally, david-arm, rengolin Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D83125	2020-07-14 09:22:45 +01:00
David Sherwood	1652ed61e6	[SVE][CodeGen] Add README for SVE-related warnings in tests I have added a new file: llvm/test/CodeGen/AArch64/README that describes what to do in the event one of the SVE codegen tests fails the warnings check. In addition, I've added comments to all the relevant SVE tests pointing users at the README file. Differential Revision: https://reviews.llvm.org/D83467	2020-07-14 08:31:10 +01:00
David Sherwood	314ce05460	[SVE][CodeGen] Fix implicit TypeSize->uint64_t conversion in TransformFPLoadStorePair In DAGCombiner::TransformFPLoadStorePair we were dropping the scalable property of TypeSize when trying to create an integer type of equivalent size. In fact, this optimisation makes no sense for scalable types since we don't know the size at compile time. I have changed the code to bail out when encountering scalable type sizes. I've added a test to llvm/test/CodeGen/AArch64/sve-fp.ll that exercises this code path. The test already emits an error if it encounters warnings due to implicit TypeSize->uint64_t conversions. Differential Revision: https://reviews.llvm.org/D83572	2020-07-14 08:07:30 +01:00
Amara Emerson	0a9b8a10a9	[AArch64][GlobalISel] Add post-legalize combine for sext_inreg(trunc(sextload)) -> copy On AArch64 we generate redundant G_SEXTs or G_SEXT_INREGs because of this. Differential Revision: https://reviews.llvm.org/D81993	2020-07-13 20:27:45 -07:00
Sanjay Patel	e6d4089b0d	[DAGCombiner] allow load/store merging if pairs can be rotated into place This carves out an exception for a pair of consecutive loads that are reversed from the consecutive order of a pair of stores. All of the existing profitability/legality checks for the memops remain between the 2 altered hunks of code. This should give us the same x86 base-case asm that gcc gets in PR41098 and PR44895: http://bugs.llvm.org/PR41098 http://bugs.llvm.org/PR44895 I think we are missing a potential subsequent conversion to use "movbe" if the target supports that. That might be similar to what AArch64 would use to get "rev16". Differential Revision: https://reviews.llvm.org/D83567	2020-07-13 08:57:00 -04:00
Sanjay Patel	86d4edf021	Revert "[DAGCombiner] allow load/store merging if pairs can be rotated into place" This reverts commit 591a3af5c7acc05617c0eacf6ae4f76bd8a9a6ce. The commit message was cut off and failed to include the review citation.	2020-07-13 08:55:29 -04:00
Sanjay Patel	37cf92de0a	[DAGCombiner] allow load/store merging if pairs can be rotated into place This carves out an exception for a pair of consecutive loads that are reversed from the consecutive order of a pair of stores. All of the existing profitability/legality checks for the memops remain between the 2 altered hunks of code. This should give us the same x86 base-case asm that gcc gets in PR41098 and PR44895:i http://bugs.llvm.org/PR41098 http://bugs.llvm.org/PR44895 I think we are missing a potential subsequent conversion to use "movbe" if the target supports that. That might be similar to what AArch64 would use to get "rev16". Differential Revision:	2020-07-13 08:53:06 -04:00
Paul Walker	bb7b8a1764	[SVE] Ensure fixed length vector fptrunc operations bigger than NEON are not considered legal. Differential Revision: https://reviews.llvm.org/D83568	2020-07-13 11:16:30 +00:00
Petar Avramovic	d6083a3a5f	[GlobalISel][InlineAsm] Fix buildCopy for inputs Check that input size matches size of destination reg class. Attempt to extend input size when needed. Differential Revision: https://reviews.llvm.org/D83384	2020-07-13 10:52:33 +02:00
Sanjay Patel	9bc8c780ab	[DAGCombiner] tighten fast-math constraints for fma fold fadd (fma A, B, (fmul C, D)), E --> fma A, B, (fma C, D, E) This is only allowed when "reassoc" is present on the fadd. As discussed in D80801, this transform goes beyond what is allowed by "contract" FMF (-ffp-contract=fast). That is because we are fusing the trailing add of 'E' with a multiply, but without "reassoc", the code mandates that the products AB and CD are added together before adding in 'E'. I've added this example to the LangRef to try to clarify the meaning of "contract". If that seems reasonable, we should probably do something similar for the clang docs because there does not appear to be any formal spec for the behavior of -ffp-contract=fast. Differential Revision: https://reviews.llvm.org/D82499	2020-07-12 08:51:49 -04:00
Luke Geeson	d2668753df	[ARM] Add Cortex-A78 and Cortex-X1 Support for Clang and LLVM This patch upstreams support for the Arm-v8 Cortex-A78 and Cortex-X1 processors for AArch64 and ARM. In detail: - Adding cortex-a78 and cortex-x1 as cpu options for aarch64 and arm targets in clang - Adding Cortex-A78 and Cortex-X1 CPU names and ProcessorModels in llvm details of the CPU can be found here: https://www.arm.com/products/cortex-x https://www.arm.com/products/silicon-ip-cpu/cortex-a/cortex-a78 The following people contributed to this patch: - Luke Geeson - Mikhail Maltsev Reviewers: t.p.northover, dmgreen Reviewed By: dmgreen Subscribers: dmgreen, kristof.beyls, hiraditya, danielkiss, cfe-commits, llvm-commits, miyuki Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D83206	2020-07-10 18:24:11 +01:00
Sanjay Patel	8c7a86e447	[AArch64][x86] add tests for rotated store merge; NFC	2020-07-10 11:28:51 -04:00
Paul Walker	ad789e6576	[SVE] Code generation for fixed length vector truncates. Lower fixed length vector truncates to a sequence of SVE UZP1 instructions. Differential Revision: https://reviews.llvm.org/D83395	2020-07-10 10:37:19 +00:00

1 2 3 4 5 ...

3782 Commits