llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-02-01 05:01:59 +01:00

Author	SHA1	Message	Date
Thomas Symalla	ee664c032b	Added and used new target pseudo for v_cvt_pk_i16_i32, changes due to code review.	2021-02-02 09:14:53 +01:00
Thomas Symalla	43278a1cb3	Formatting changes	2021-02-02 09:14:53 +01:00
Thomas Symalla	46f1f49a56	Formatting changes.	2021-02-02 09:14:53 +01:00
Thomas Symalla	8aacc5adcd	Updating formatting changes.	2021-02-02 09:14:53 +01:00
Thomas Symalla	8955159f2d	Resolve formatting changes.	2021-02-02 09:14:53 +01:00
Thomas Symalla	38676d07e0	Code changes yielded from review.	2021-02-02 09:14:53 +01:00
Thomas Symalla	ea43201600	Move step to PreLegalizer	2021-02-02 09:14:53 +01:00
Thomas Symalla	122a71a9f1	Move Combiner to PreLegalize step	2021-02-02 09:14:53 +01:00
Thomas Symalla	9a5185f66f	Reverted unintended git-format change.	2021-02-02 09:14:52 +01:00
Thomas Symalla	5e6c38bc76	Fixed the lit tests and a bug in the implementation.	2021-02-02 09:14:52 +01:00
Thomas Symalla	8206639df8	Refactored the pattern matching.	2021-02-02 09:14:52 +01:00
Thomas Symalla	555bb61a39	Added early exit.	2021-02-02 09:14:52 +01:00
Thomas Symalla	1f11e485f5	Added comments.	2021-02-02 09:14:52 +01:00
Thomas Symalla	ae887237b4	clang-format	2021-02-02 09:14:52 +01:00
Thomas Symalla	1663fb4919	Added clamp i64 to i16 global isel pattern.	2021-02-02 09:14:52 +01:00
Craig Topper	6d008ca25f	[RISCV] Replace NoX0 SDNodeXForm with a ComplexPattern to do the selection of the VL operand. I think this is a more standard way of doing this. Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D95833	2021-02-02 00:08:58 -08:00
Puyan Lotfi	293c83b00b	Revert "[AArch64] Homogeneous Prolog and Epilog Size Optimization" This reverts commit 0426be3df6180747bd68706db87a70580f064f0f. Reverting due to some expensive-checks failures in tests.	2021-02-02 02:33:44 -05:00
Kyungwoo Lee	28c9f1933e	[AArch64] Homogeneous Prolog and Epilog Size Optimization Prologs and epilogs handle callee-save registers and tend to be irregular with different immediate offsets that are not often handled by the MachineOutliner. Commit D18619/a5335647d5e8 (combining stack operations) stretched irregularity further. This patch tries to emit homogeneous stores and loads with the same offset for prologs and epilogs respectively. We have observed that this canonicalizes (homogenizes) prologs and epilogs significantly and results in a greatly increased chance of outlining, resulting in a code size reduction. Despite the above results, there are still size wins to be had that the MachineOutliner does not provide due to the special handling X30/LR. To handle the LR case, his patch custom-outlines prologs and epilogs in place. It does this by doing the following: * Injects HOM_Prolog and HOM_Epilog pseudo instructions during a Prolog and Epilog Injection Pass. * Lowers and optimizes said pseudos in a AArchLowerHomogneousPrologEpilog Pass. * Outlined helpers are created on demand. Identical helpers are merged by the linker. * An opt-in flag is introduced to enable this feature. Another threshold flag is also introduced to control the aggressiveness of outlining for application's need. This reduced an average of 4% of code size on LLVM-TestSuite/CTMark targeting arm64/-Oz. Differential Revision: https://reviews.llvm.org/D76570	2021-02-02 00:26:51 -05:00
Matt Arsenault	e64514a4fe	AMDGPU: Fix dbg_value handling when forming soft clause bundles DBG_VALUES placed between memory instructions would change codegen. Skip over these and re-insert them after the bundle instead of giving up on bundling.	2021-02-01 22:16:35 -05:00
Philip Reames	29acfb48a9	[x86] introduce no_callee_saved_registers attribute This is directly analogous to the existing no_caller_saved_registers, but with the opposite intention. A function or call so marked shifts the responsibility of spilling the usual CSRs to it's caller. An indirect call site and callee which don't agree on the attribute is ill defined. The motivation for this change is that being able to prune callee saves (without modifying other details of the calling convention) is sometimes useful when generating stubs and adapters. There's no intention to expose this as a source language feature; this is expected to be used by frontends to implement adapters where warranted. Some specific examples of use cases: * GC compatible compiled code wants to call an externally defined library function without needing to track pointer values through CSRs. * debug enabled code wants to call precompiled library which doesn't provide enough information to track CSRs while preserving debug quality in caller. * adapter stub entering hand written assembler which doesn't follow normal calling conventions.	2021-02-01 16:19:14 -08:00
Philip Reames	dfa3b662c8	[NFC][X86] Use CallBase interface to simplify code	2021-02-01 15:24:41 -08:00
Philip Reames	1af57bc4bf	[NFC][X86] Avoid redundant work inspecting callee	2021-02-01 15:24:41 -08:00
David Green	b21519cd48	[ARM] Flatten identity shuffles through vqdmulh nodes Given a shuffle(vqdmulh(shuffle, shuffle), we can flatter the shuffles out if they become an identity mask. This can come up during lane interleaving, when we do that better. Differential Revision: https://reviews.llvm.org/D94034	2021-02-01 19:14:20 +00:00
Craig Topper	41a3080ffd	[X86] Accept 64-bit GPRs for vextractps when using a register that requires EVEX. This is consistent with the VEX version. It also fixes a sorting issue in the matching table that caused the EVEX version to be prioritized over VEX in intel syntax. Fixes issue [2] from PR48991.	2021-02-01 11:01:32 -08:00
Simon Pilgrim	446a01964c	[X86][SSE] LowerScalarImmediateShift - use APInt::getLowBitsSet for vXi8 ISD::SRL mask generation. NFCI. Match what we do for ISD::SHL	2021-02-01 18:17:40 +00:00
Jessica Paquette	08fd0a40ed	[AArch64][GlobalISel] Emit G_ASSERT_ZEXT in assignValueToReg When we have a zeroext parameter, emit G_ASSERT_ZEXT. Add a check that we actually emit it. This is a 0.1% code size win on CTMark/7zip and CTMark/consumer-typeset at -Os. Differential Revision: https://reviews.llvm.org/D95567	2021-02-01 10:01:52 -08:00
Craig Topper	6ab0fefa1b	[RISCV] Add scalable vector support for floating point FMA instructions A follow up patch will add support for commuting operands or changing opcode to vfmacc and friends. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D95662	2021-02-01 09:52:43 -08:00
Craig Topper	c8be45ab39	[RISCV] Update comment text from D95774. NFC	2021-02-01 09:52:43 -08:00
Craig Topper	136420859e	[RISCV] Optimize (srl (and X, 0xffff), C) -> (srli (slli X, 16), 16 + C). Rather than materializing the 0xffff immediate for the AND, use a shift left to remove the upper bits and then shift in zeros from the right. This pattern occurs when type legalizing an i16 right shift. I've implemented this with custom selection code for a number of reasons. I've limited this to the AND having a single use. We need to compensate for SimplifyDemandedBits altering the AND mask. I'm using *W opcodes on RV64. We may want to generlize this in the future. For all these reason it seemed easiest to do it this way. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D95774	2021-02-01 09:37:55 -08:00
Austin Kerbow	b099a1ce22	[AMDGPU] Fix release build after 0397dca0.	2021-02-01 08:55:14 -08:00
Austin Kerbow	411730e6c6	[AMDGPU] Fix crash with sgpr spills to vgpr disabled This would assert with amdgpu-spill-sgpr-to-vgpr disabled when trying to spill the FP. Fixes: SWDEV-262704 Reviewed By: RamNalamothu Differential Revision: https://reviews.llvm.org/D95768	2021-02-01 08:35:25 -08:00
David Green	dbdd240132	[ARM] Simplify VMOVRRD from extracts of buildvectors Under the softfp calling convention, we are often left with VMOVRRD(extract(bitcast(build_vector(a, b, c, d)))) for the return value of the function. These can be simplified to a,b or c,d directly, depending on the value of the extract. Big endian is a little different because the bitcast switches the lanes around, meaning we end up with b,a or d,c. Differential Revision: https://reviews.llvm.org/D94989	2021-02-01 16:09:25 +00:00
Kerry McLaughlin	267255edd9	[SVE][CodeGen] Remove performMaskedGatherScatterCombine The AArch64 DAG combine added by D90945 & D91433 extends the index of a scalable masked gather or scatter to i32 if necessary. This patch removes the combine and instead adds shouldExtendGSIndex, which is used by visitMaskedGather/Scatter in SelectionDAGBuilder to query whether the index should be extended before calling getMaskedGather/Scatter. Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D94525	2021-02-01 14:10:00 +00:00
Dmitry Preobrazhensky	d2dbf6a661	[AMDGPU][MC] Corrected error position for invalid operands Generic parser may report an incorrect error position when an offending operand is followed by a comma. See bug 48884 for details: https://bugs.llvm.org/show_bug.cgi?id=48884. Differential Revision: https://reviews.llvm.org/D95674	2021-02-01 14:31:08 +03:00
David Green	9248cc33ab	[ARM] Turn sext_inreg(VGetLaneu) into VGetLaneu This adds a DAG combine for converting sext_inreg of VGetLaneu into VGetLanes, providing the types match correctly. Differential Revision: https://reviews.llvm.org/D95073	2021-02-01 11:10:35 +00:00
Simon Pilgrim	4720508a81	[X86][AVX] combineExtractWithShuffle - combine extracts from 256/512-bit vector shuffles. We can only legally extract from the lowest 128-bit subvector, so extract the correct subvector to allow us to handle 256/512-bit vector element extracts.	2021-02-01 10:31:43 +00:00
David Green	96fb4c7ba1	[ARM] Simplify extract of VMOVDRR Under SoftFP calling conventions, we can be left with extract(bitcast(BUILD_VECTOR(VMOVDRR(a, b), ..))) patterns that can simplify to a or b, depending on the extract lane. Differential Revision: https://reviews.llvm.org/D94990	2021-02-01 10:24:57 +00:00
Kazushi (Jam) Marukawa	1b359f6379	[VE] Change inetger constants 32-bit friendly Correct integer constants like `1UL << 63` to `UINT64_C(1) << 63` in order to make them work on 32-bit machines. Tested on both an i386 and x86_64 machines. Reviewed By: mgorny Differential Revision: https://reviews.llvm.org/D95724	2021-02-01 19:00:47 +09:00
Craig Topper	a5363d614d	[Mips] Cleanup isel patterns to use 'vnot' instead of (xor X, immAllOnesV). NFCI A couple patterns used bitconvert on the immAllOnesV, but the isel matching uses ISD::isBuildVectorAllOnes which is able to look through bitcasts. So isel patterns don't need to do it explicitly.	2021-01-31 20:01:05 -08:00
Craig Topper	92b45d2660	[PowerPC] Remove vnot_ppc and replace with the standard vnot. immAllOnesV has special support for looking through bitcasts automatically so isel patterns don't need to explicitly look for the bitconvert.	2021-01-31 19:41:33 -08:00
Craig Topper	00ea1c43c6	[X86] Cleanup isel patterns to use 'vnot' instead of (xor X, immAllOnesV) to improve readability. NFC	2021-01-31 18:53:40 -08:00
Craig Topper	ebb87abce3	[RISCV] Custom lower fshl/fshr with Zbt extension. We need to add a mask to the shift amount for these operations to use the FSR/FSL instructions. We were previously doing this in isel patterns, but custom lowering will make the mask visible to optimizations earlier.	2021-01-31 17:49:15 -08:00
Kazu Hirata	e0a8f45e5f	[llvm] Drop unnecessary const from return types (NFC) Identified with const-return-type.	2021-01-31 10:23:43 -08:00
Kazu Hirata	8323f01e46	[VE] Fix compiler warnings (NFC)	2021-01-31 10:23:39 -08:00
Matt Arsenault	d78e6d4720	AMDGPU: Add missing consts	2021-01-31 10:47:57 -05:00
Craig Topper	c021b33a1e	[RISCV] Use MVT instead of EVT in RISCVISelDAGToDAG.cpp All this code runs post type legalization so we should have exclusively legal types. The methods on MVT should be more efficient than EVT.	2021-01-30 15:57:15 -08:00
Kazu Hirata	d4906cf2ad	[llvm] Add missing header guards (NFC) Identified with llvm-header-guard.	2021-01-30 09:53:42 -08:00
Kazu Hirata	58d7d68bee	[AMDGPU] Forward-declare AMDGPUTargetMachine (NFC) AMDGPUTargetTransformInfo.h needs AMDGPUTargetMachine but relies on a forward declaration of AMDGPUTargetMachine in AMDGPU.h. This patch adds a forward declaration right in AMDGPUTargetTransformInfo.h. While we are at it, this patch removes the one in AMDGPU.h, where it is unnecessary.	2021-01-30 09:53:40 -08:00
Kazu Hirata	ab8cd6c8fd	[llvm] Use isa instead of dyn_cast (NFC)	2021-01-29 23:23:37 -08:00
Kazu Hirata	95dfb94fbd	[llvm] Use llvm::lower_bound and llvm::upper_bound (NFC)	2021-01-29 23:23:36 -08:00

1 2 3 4 5 ...

61169 Commits