llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-21 18:22:53 +01:00

Author	SHA1	Message	Date
sguo35	5521155be5	Fix register clobbering on aarch64 GHC when mixing tail/non-tail calls By default LLVM doesn't save any regs for GHC on arm64. This means we'll clobber LR on arm64 if we make non-tail calls (e.g. L2 syscall), so we should save LR on non-tail calls, and not assume we won't make non-tail calls.	2022-05-28 23:38:31 +03:00
Malcolm Jestadt	c725f494c9	X86: Avoid converting EVEX to VEX when disp8 would be beneficial Saves around 2% code size	2022-05-27 15:29:35 +03:00
sguo35	eb7a5e5301	Fix tail call guarantee setting for GHC on arm64 backend	2022-05-05 07:35:52 +03:00
Nekotekina	318b8fe374	X86: fixup matchPMADDWD_3	2021-11-18 12:01:17 +03:00
Nekotekina	1cc7bdd501	X86: improve (V)PMADDWD detection (2) Implement "full" pattern.	2021-11-16 13:50:49 +03:00
Nekotekina	610c27aa1c	X86: disable AVX512 truncate with saturation instructions These are not very useful in RPCS3. However, pessimizations occur.	2021-11-16 13:45:26 +03:00
Nekotekina	c9fceef173	X86: fixup (V)PMADDWD detection Fix some bugs (missing checks). Add constant support.	2021-11-02 17:42:39 +03:00
Nekotekina	f7d625e31a	X86: improve (V)PMADDWD detection In function combineMulToPMADDWD, if 17 bits are sign bits, not just zero bits, the optimization can be applied sometimes. For now, detect and replace SRA pairs with SRL.	2021-11-02 17:42:39 +03:00
Nekotekina	c36b21c023	X86: modify PreserveAll CC to save full AVX-512 state	2021-11-02 17:42:39 +03:00
Nekotekina	548daf04b5	X86: avoid vector-scalar shifts if splat amount is directly a vector ADD/SUB/AND op. Prefer vector-vector shifts if available (AVX2+). Improves code generated for rotate and funnel shifts. Otherwise it would generate a shuffle + slower vector-scalar shift.	2021-11-02 17:42:39 +03:00
Nekotekina	d5bc359dfd	X86: add patterns for X86ISD::VSHLV and X86ISD::VSRLV Replace VSELECT instructions which zero their result on exceeding legal SHL/SRL shift amount.	2021-11-02 17:42:39 +03:00
Nekotekina	bed700114e	X86: add pattern for X86ISD::VSRAV Detect clamping ashr shift amount to max legal value	2021-11-02 17:42:39 +03:00
Nekotekina	2ffa82223f	X86: expand detectAVGPattern() Allow all integer widths in the pattern, allow ashr Handle signed and mixed cases, allowing to replace truncation	2021-11-02 17:42:39 +03:00
Nekotekina	5ff8f4151c	X86: optimize VSELECT for v16i8 with shl + sign bit test	2021-11-02 17:42:39 +03:00
Nekotekina	4743d020ce	X86: LowerShift: new algorithm for vector-vector shifts Emit pair of shifts of double size if possible	2021-11-02 17:42:39 +03:00
Nekotekina	d18817ded9	X86: Fix/workaround Small Code Model for JIT Force RIP-relative jump tables and global values These things were causing crashes due to use of absolute addressing	2021-11-02 17:42:39 +03:00
guopeilin	39c406a58f	[AArch64][GlobalISel] Use ZExtValue for zext(xor) when invert tb(n)z Currently, we use SExtValue to decide whether to invert tbz or tbnz. However, for the case zext (xor x, c), we should use ZExt rather than SExt otherwise we will generate totally opposite branches. Reviewed By: paquette Differential Revision: https://reviews.llvm.org/D108755 (cherry picked from commit 5f48c144c58f6d23e850a1978a6fe05887103b17)	2021-09-21 09:15:10 -07:00
Simon Pilgrim	aed4e7449f	[X86] combineX86ShuffleChain - ensure we only peek through bitcasts to vectors (PR51858) When searching for hidden identity shuffles (added at rG41146bfe82aecc79961c3de898cda02998172e4b), only peek through bitcasts to the source operand if it is a vector type as well. (cherry picked from commit dcba99418438ec1d624ad207674234bd2e9e3394)	2021-09-20 11:22:27 -07:00
Tom Stellard	7b93a88a1e	Revert "[AArch64][GlobalISel] Legalize bswap <2 x i16>" This reverts commit 5cd63e9ec2a385de2682949c0bbe928afaf35c91. https://bugs.llvm.org/show_bug.cgi?id=51707	2021-09-10 21:09:59 -07:00
Elliot Saba	a967752a95	[X86] Don't clobber EBX in stackprobes On X86, the stackprobe emission code chooses the `R11D` register, which is illegal on i686. This ends up wrapping around to `EBX`, which does not get properly callee-saved within the stack probing prologue, clobbering the register for the callers. We fix this by explicitly using `EAX` as the stack probe register. Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D109203 (cherry picked from commit ae8507b0df738205a6b9e3795ad34672b7499381)	2021-09-10 09:30:52 -07:00
Bradley Smith	28d769100d	Workaround incorrect types when lowering fixed length gather/scatter When lowering a fixed length gather/scatter the index type is assumed to be the same as the memory type, this is incorrect in cases where the extension of the index has been folded into the addressing mode. For now add a temporary workaround to fix the codegen faults caused by this by preventing the removal of this extension. At a later date the lowering for SVE gather/scatters will be redesigned to improve the way addressing modes are handled. As a short term side effect of this change, the addressing modes generated for fixed length gather/scatters will not be optimal. Differential Revision: https://reviews.llvm.org/D109145 (cherry picked from commit 14e1a4a6eef2fb95ec852c9ddfc597f80bba3226)	2021-09-09 09:05:58 -07:00
Cullen Rhodes	5f6ef6fbfd	[AArch64][SME] Fix imm bug in mov vector to tile aliases Also fixes a warning mentioned in D109359. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D109363 (cherry picked from commit 89786c2b992c3cb4c4a230542d2af34ec2915a08)	2021-09-08 20:47:08 -07:00
David Truby	e0d7c39869	[AArch64][sve] Prevent incorrect function call on fixed width vector The isEssentiallyExtractHighSubvector function currently calls getVectorNumElements on a type that in specific cases might be scalable. Since this function only has correct behaviour at the moment on scalable types anyway, the function can just return false when given a fixed type. Differential Revision: https://reviews.llvm.org/D109163 (cherry picked from commit b297531ece896fb9ec36f001a74aef144082602b)	2021-09-08 06:09:19 -07:00
Fraser Cormack	ba85498148	[RISCV] Fix reporting of incorrect commutable operand indices This patch fixes an issue where RISCV's `findCommutedOpIndices` would incorrectly return the pseudo `CommuteAnyOperandIndex` as a commutable operand index, rather than fixing a specific index. Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D108206 (cherry picked from commit 5b06cbac11e53ce55f483c1852a108012507a6bb)	2021-09-03 15:48:26 -07:00
Nikita Popov	4ddceef928	[WebAssembly] Fix FastISel of condition in different block (PR51651) If the icmp is in a different block, then the register for the icmp operand may not be initialized, as it nominally does not have cross-block uses. Add a check that the icmp is in the same block as the branch, which should be the common case. This matches what X86 FastISel does: `5b6b090cf2/llvm/lib/Target/X86/X86FastISel.cpp (L1648)` The "not" transform that could have a similar issue is dropped entirely, because it is currently dead: The incoming value is a branch or select condition of type i1, but this code requires an i32 to trigger. Fixes https://bugs.llvm.org/show_bug.cgi?id=51651. Differential Revision: https://reviews.llvm.org/D108840 (cherry picked from commit 16086d47c0d0cd08ffae8e69a69c88653e654d01)	2021-08-31 20:58:25 -07:00
Ricky Taylor	8fbe4ddc7c	[M68k] Update pointer data layout Fixes PR51626. The M68k requires that all instruction, word and long word reads are aligned to word boundaries. From the 68020 onwards, there is a performance benefit from aligning long words to long word boundaries. The M68k uses the same data layout for pointers and integers. In line with this, this commit updates the pointer data layout to match the layout already set for 32-bit integers: 32:16:32. Differential Revision: https://reviews.llvm.org/D108792 (cherry picked from commit 8d3f112f0cdbed2311aead86bcd72e763ad55255)	2021-08-31 20:56:41 -07:00
Ricky Taylor	de85b171b7	[M68k][NFC] Rename M68kOperand::Kind to KindTy Rename the M68kOperand::Type enumeration to KindTy to avoid ambiguity with the Kind field when referencing enumeration values e.g. `Kind::Value`. This works around a compilation error under GCC 5, where GCC won't lookup enum class values if you have a similarly named field (see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60994). The error in question is: `M68kAsmParser.cpp:857:8: error: 'Kind' is not a class, namespace, or enumeration` Differential Revision: https://reviews.llvm.org/D108723 (cherry picked from commit f659b6b1fa43ffb8c95dbbf767ef57f6e964e7f6)	2021-08-30 21:40:39 -07:00
Tom Stellard	5aea8f0472	Revert "[RISCV] Fix reporting of incorrect commutable operand indices" This reverts commit a7933290f72a08dc060d38fa52772a9cc33ed9ba. This commit caused some bot failures: clang-with-thin-lto-ubuntu-release lld-x86_64-win-release llvm-clang-x86_64-expensive-checks-debian-release	2021-08-24 21:59:54 -07:00
Fraser Cormack	2424302e99	[RISCV] Fix reporting of incorrect commutable operand indices This patch fixes an issue where RISCV's `findCommutedOpIndices` would incorrectly return the pseudo `CommuteAnyOperandIndex` as a commutable operand index, rather than fixing a specific index. Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D108206 (cherry picked from commit 5b06cbac11e53ce55f483c1852a108012507a6bb)	2021-08-24 10:20:28 -07:00
Nikita Popov	e4e6f3eeff	[AArch64] Fix comparison peephole opt with non-0/1 immediate (PR51476) This is a non-intrusive fix for https://bugs.llvm.org/show_bug.cgi?id=51476 intended for backport to the 13.x release branch. It expands on the current hack by distinguishing between CmpValue of 0, 1 and 2, where 0 and 1 have the obvious meaning and 2 means "anything else". The new optimization from D98564 should only be performed for CmpValue of 0 or 1. For main, I think we should switch the analyzeCompare() and optimizeCompare() APIs to use int64_t instead of int, which is in line with MachineOperand's notion of an immediate, and avoids this problem altogether. Differential Revision: https://reviews.llvm.org/D108076 (cherry picked from commit 81b106584f2baf33e09be2362c35c1bf2f6bfe94)	2021-08-18 20:07:23 -07:00
Simon Pilgrim	45d26b8826	[X86][AVX] Extract SUBV_BROADCAST constant bits from just the lower subvector range (PR51281) As reported on PR51281, an internal fuzz test encountered an issue when extracting constant bits from a SUBV_BROADCAST node from a constant pool source larger than the broadcasted subvector width. The getTargetConstantBitsFromNode was assuming that the Constant would be the same size as the subvector, resulting in the incorrect packing of the per-element bits data. This patch attempts to solve this by using the SUBV_BROADCAST node to determine the subvector width, and then ensuring we extract only the lowest bits from Constant of that subvector bitsize. Differential Revision: https://reviews.llvm.org/D107158 (cherry picked from commit 18e6a03b1a15b2661259af15ae604b4c4850cd61)	2021-08-18 12:15:46 -07:00
Tomas Matheson	4d78ad44fb	[ARM][atomicrmw] Fix CMP_SWAP_32 expand assert This assert is intended to ensure that the high registers are not selected when it is passed to one of the thumb UXT instructions. However it was triggering even for 32 bit where no UXT instruction is emitted. Fixes PR51313. Differential Revision: https://reviews.llvm.org/D107363 (cherry picked from commit 40650f27b5df95b2f96d25ea03976d8136804441)	2021-08-18 12:14:24 -07:00
Amy Kwan	8202608068	[PowerPC] Disable CTR Loop generate for fma with the PPC double double type. It is possible to generate the llvm.fmuladd.ppcf128 intrinsic, and there is no actual FMA instruction that corresponds to this intrinsic call for ppcf128. Thus, this intrinsic needs to remain as a call as it cannot be lowered to any instruction, which also means we need to disable CTR loop generation for fma involving the ppcf128 type. This patch accomplishes this behaviour. Differential Revision: https://reviews.llvm.org/D107914 (cherry picked from commit 581a80304c671b6cb2b1b1f87feb9fbe14875f2a)	2021-08-17 20:22:13 -07:00
Andrea Di Biagio	681b643c07	[X86][SchedModel] Add missing ReadAdvance for some arithmetic ops (PR51318 and PR51322). This fixes a bug where implicit uses of EFLAGS were not marked as ReadAdvance in the RM/MR variants of ADC/SBB (PR51318) This also fixes the absence of ReadAdvance for the register operand of RMW arithmetic instructions (PR51322). Differential Revision: https://reviews.llvm.org/D107367 (cherry picked from commit 7a1a35a1d1ae2e69769505c9f39910067c53d53b)	2021-08-11 21:40:03 -07:00
Evandro Menezes	0c8a79e78d	[RISCV] Add scheduling resources for V Add the scheduling resources for the V extension instructions. Differential Revision: https://reviews.llvm.org/D98002 (cherry picked from commit 63a5ac4e0d969f41bf71785cc3979349a45a2892)	2021-08-10 23:11:38 -07:00
Bradley Smith	ee15bdbb06	[AArch64][SVE] Fix assertion failure when lowering fixed length gather/scatter The patterns for fixed length gather/scatter with 32-bit offsets and 64-bit memory type are slightly different than the rest of the patterns, as such the lowering needs to be slightly different to ensure the correct types are used. Differential Revision: https://reviews.llvm.org/D107576 (cherry picked from commit 73ecb9987b00db274b7b2ac34b0602ffdb906a4b)	2021-08-10 15:34:36 -07:00
Yonghong Song	f6a86e448a	BPF: avoid NE/EQ loop exit condition Kuniyuki Iwashima reported in [1] that llvm compiler may convert a loop exit condition with "i < bound" to "i != bound", where "i" is the loop index variable and "bound" is the upper bound. In case that "bound" is not a constant, verifier will always have "i != bound" true, which will cause verifier failure since to verifier this is an infinite loop. The fix is to avoid transforming "i < bound" to "i != bound". In llvm, the transformation is done by IndVarSimplify pass. The compiler checks loop condition cost (i = i + 1) and if the cost is lower, it may transform "i < bound" to "i != bound". This patch implemented getArithmeticInstrCost() in BPF TargetTransformInfo class to return a higher cost for such an operation, which will prevent the transformation for the test case added in this patch. [1] https://lore.kernel.org/netdev/1994df05-8f01-371f-3c3b-d33d7836878c@fb.com/ Differential Revision: https://reviews.llvm.org/D107483 (cherry picked from commit e52946b9ababcbf8e6f40b1b15900ae2e795a1c6)	2021-08-06 12:45:53 -07:00
Craig Topper	7c9c296915	[RISCV] Restrict performANY_EXTENDCombine to prevent an infinite loop. The sign_extend we insert here can get turned into a zero_extend if the sign bit is known zero. This can enable a setcc combine that shrinks compares with zero_extend. This reduces the use count of the zero_extend allowing other combines to turn it back into an any_extend. This restricts the combine to only cases where the result is used by a CopyToReg. This works for my original motivating case. I hope the CopyToReg use will prevent any converted extends from turning back into an any_extend. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D106754 (cherry picked from commit 54588bcc052e5b08f90e672c33d0c1ad4eda2424)	2021-08-02 11:31:08 -07:00
Alexandros Lamprineas	276fcebbe0	[AArch64] Legalize MVT::i64x8 in DAG isel lowering This patch legalizes the Machine Value Type introduced in D94096 for loads and stores. A new target hook named getAsmOperandValueType() is added which maps i512 to MVT::i64x8. GlobalISel falls back to DAG for legalization. Differential Revision: https://reviews.llvm.org/D94097	2021-08-02 15:45:58 +01:00
Bradley Smith	183b0c7c98	[AArch64][SVE] Fix incorrect mask type when lowering fixed type SVE gather/scatter An incorrect mask type when lowering an SVE gather/scatter was causing a codegen fault which manifested as the incorrect predicate size being used for an SVE gather/scatter (e.g. p0.b rather than p0.d). Fixes PR51182. Differential Revision: https://reviews.llvm.org/D106943 (cherry picked from commit 191831e380f317cd2baa5d48abe02d1d11cd44cb)	2021-07-29 07:03:40 -07:00
Xiang1 Zhang	a6d5003afd	[X86] Fix lowering to illegal type in LowerINSERT_VECTOR_ELT Differential Revision: https://reviews.llvm.org/D106780	2021-07-28 08:16:59 +08:00
Xiang1 Zhang	5d447ad589	Revert "[X86] Fix lowering to illegal type in LowerINSERT_VECTOR_ELT" This reverts commit 6ff73efea94621e74642e4d7a15cc86a5fb6d411.	2021-07-28 08:12:29 +08:00
Xiang1 Zhang	409f0eedd6	[X86] Fix lowering to illegal type in LowerINSERT_VECTOR_ELT	2021-07-28 08:08:30 +08:00
Krzysztof Parzyszek	60850cdc6a	[Hexagon] Fix resetting dead registers in DBG_VALUE_LISTs This fixes https://llvm.org/PR51229.	2021-07-27 18:36:28 -05:00
Nemanja Ivanovic	8b3f85a32c	[PowerPC] Turn deprecated altivec prefetch instrs to nops on AIX The dst/dstt/dstst/dststt instructions are nop's on all PowerPC cores that AIX supports. The AIX assembler also does not accept these mnemonics. Turn them into nop's on AIX (similar to dstall).	2021-07-27 15:50:02 -05:00
Sanjay Patel	11aa71a71d	[x86] update stale code comment; NFC The transform was generalized with: 1ce05ad619a5	2021-07-27 16:45:52 -04:00
Matt Arsenault	ece3299a71	AMDGPU/GlobalISel: Fix selecting G_SEXTLOAD/G_ZEXTLOAD pre-gfx9 The patterns for the m0 glue patterns were failing to import.	2021-07-27 15:56:42 -04:00
Amara Emerson	6ce8f2f7c1	[AArch64][GlobalISel] Fix constraining LDXPX intrinsic selection. Causes a fallback because of lack of regclasses on vregs, unless it's without asserts, where we end up crashing later in codegen.	2021-07-27 12:13:56 -07:00
Craig Topper	6bfc6b8665	[RISCV] Select vector shl by 1 to a vector add. A vector add may be faster than a vector shift. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D106689	2021-07-27 10:57:28 -07:00
Matt Arsenault	8979bda8e8	AMDGPU: Treat IMPLICIT_DEF like a constant lanemask source This is partially a workaround. SILowerI1Copies does not understand unstructured loops. This would result in inserting instructions to merge a mask register in the same block where it was defined in an unstructured loop.	2021-07-27 11:44:38 -04:00
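
Note on commit a967752a95 in the log above ("Don't clobber EBX in stackprobes"): the message says that `R11D` "ends up wrapping around to `EBX`" on i686. One plausible reading is plain register-encoding arithmetic. x86 numbers its general-purpose registers 0 to 15, and the top bit of that number is carried by the REX prefix, which exists only in 64-bit mode. The sketch below is an illustration of that arithmetic, not LLVM code; `LEGACY_GPR32` and `encode_without_rex` are hypothetical names introduced here.

```python
# Illustrative only: these names are hypothetical and do not come from LLVM.
# The low three bits of an x86 register number select one of these eight
# 32-bit registers; the fourth bit needs a REX prefix (64-bit mode only).
LEGACY_GPR32 = ["EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"]


def encode_without_rex(reg_number: int) -> str:
    """Return the 32-bit register selected when only 3 encoding bits survive."""
    return LEGACY_GPR32[reg_number & 0b111]


R11_ENCODING = 11  # R11/R11D is register number 11 (0b1011)

# Dropping the REX bit leaves 0b011 == 3, which is EBX.
print(encode_without_rex(R11_ENCODING))  # prints "EBX"
```

This matches the commit's choice of `EAX` (register number 0), which needs no REX bit and so has the same encoding in 32-bit and 64-bit mode.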

63656 Commits