llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-02-01 13:11:39 +01:00

Author	SHA1	Message	Date
Jay Foad	3f23d4b8c3	[MachineScheduler] Fix the TopDepth/BotHeightReduce latency heuristics tryLatency compares two sched candidates. For the top zone it prefers the one with lesser depth, but only if that depth is greater than the total latency of the instructions we've already scheduled -- otherwise its latency would be hidden and there would be no stall. Unfortunately it only tests the depth of one of the candidates. This can lead to situations where the TopDepthReduce heuristic does not kick in, but a lower priority heuristic chooses the other candidate, whose depth is greater than the already scheduled latency, which causes a stall. The fix is to apply the heuristic if the depth of either candidate is greater than the already scheduled latency. All this also applies to the BotHeightReduce heuristic in the bottom zone. Differential Revision: https://reviews.llvm.org/D72392	2020-07-17 11:02:13 +01:00
Kai Luo	71e1d22d77	[PowerPC] Precommit test case for PR46759. NFC.	2020-07-17 08:41:15 +00:00
Jay Foad	47db2fc583	[PowerPC] Precommit 64-bit funnel shift test cases	2020-07-16 15:20:52 +01:00
Jay Foad	5903937557	[PowerPC] Use CHECK-LABEL for better diagnostics	2020-07-16 13:41:29 +01:00
Amy Kwan	1157465de4	[PowerPC][Power10] Fix VINS* (vector insert byte/half/word) instructions to have i32 arguments. Previously, the vins* intrinsic was incorrectly defined to have its second and third arguments as i64. This patch fixes the second and third arguments of the vins* instruction and intrinsic to be i32 instead. Differential Revision: https://reviews.llvm.org/D83497	2020-07-16 00:30:24 -05:00
Amy Kwan	9e6b3fe312	[PowerPC][Power10] Implement Test LSB by Byte Builtins in LLVM/Clang This patch implements builtins for the Test LSB by Byte instruction introduced in Power10. Differential Revision: https://reviews.llvm.org/D82431	2020-07-13 22:47:47 -05:00
Kai Luo	9f42c54bb4	[PowerPC] Generate CFI directives when probing in prologue Add missing CFI directives when probing in prologue if `stack-clash-protection` is enabled. Differential Revision: https://reviews.llvm.org/D83276	2020-07-14 02:56:12 +00:00
Fangrui Song	b8f06a9e5f	[PowerPC] Fix combineVectorShuffle regression after D77448 Commit 1fed131660b2 assumed that NewShuffle (shuffle vector canonicalization result) will always be ShuffleVectorSDNode, which may be false (it may be a BITCAST node): ``` ... t12: v4i32 = scalar_to_vector t2 t15: v16i8 = bitcast t12 # LHS t17: v16i8 = vector_shuffle<u,u,u,u,u,u,u,u,0,1,2,3,u,u,u,u> t15, undef:v16i8 # SVN ``` Reviewed By: #powerpc, nemanjai Differential Revision: https://reviews.llvm.org/D83617	2020-07-13 16:57:27 -07:00
Kai Luo	a7930d8f42	[PowerPC] Enhance tests for D83276. NFC.	2020-07-13 04:37:09 +00:00
Qiu Chaofan	e2a03586ce	[PowerPC] Support constrained conversion in SPE target This patch adds support for constrained int/fp conversion between signed/unsigned i32 and f32/f64. Reviewed By: jhibbits Differential Revision: https://reviews.llvm.org/D82747	2020-07-13 12:18:36 +08:00
Jinsong Ji	6ef62dfa4a	[PowerPC][MachinePipeliner] Enable pipeliner if hasInstrSchedModel P9 is the only one with InstrSchedModel, but we may have more in the future, so we should not hardcode it to P9; check hasInstrSchedModel instead. Reviewed By: hfinkel Differential Revision: https://reviews.llvm.org/D83590	2020-07-11 02:24:12 +00:00
Lei Huang	ae49303684	[PowerPC] Enable default support of quad precision operations Summary: Remove option guarding support of quad precision operations. Reviewers: nemanjai, #powerpc, steven.zhang Reviewed By: nemanjai, #powerpc, steven.zhang Subscribers: qiucf, wuzish, nemanjai, hiraditya, kbarton, shchenz, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D83437	2020-07-10 13:27:48 -05:00
Kang Zhang	f98ea09169	[NFC][PowerPC] Add a new MIR file to test mi-peephole pass	2020-07-10 16:08:07 +00:00
Kai Luo	8f0afb6658	[PowerPC] Only make copies of registers on stack in variadic function when va_start is called On PPC64, for a variadic function, if va_start is not called, it won't access any variadic argument on stack, thus we can save stores of registers used to pass arguments. Differential Revision: https://reviews.llvm.org/D82361	2020-07-09 07:18:17 +00:00
Biplob Mishra	20220d08fc	[PowerPC] Implement Vector Replace Builtins in LLVM Provide the LLVM intrinsics needed to implement vector replace element builtins in altivec.h which will be added in a subsequent patch. Differential Revision: https://reviews.llvm.org/D83308	2020-07-07 12:22:52 -05:00
Nemanja Ivanovic	2e394d9166	[PowerPC] Do not RAUW combined nodes in VECTOR_SHUFFLE legalization When legalizing shuffles, we make an attempt to combine it into a PPC specific canonical form that avoids a need for a swap. If the combine is successful, we RAUW the node and the custom legalization replaces the now dead node instead of the one it should replace. Remove that erroneous call to RAUW.	2020-07-06 22:09:28 -05:00
Biplob Mishra	ff0e91295b	[PowerPC] Implement Vector Splat Immediate Builtins in Clang Implements builtins for the following prototypes: vector signed int vec_splati (const signed int); vector float vec_splati (const float); vector double vec_splatid (const float); vector signed int vec_splati_ins (vector signed int, const unsigned int, const signed int); vector unsigned int vec_splati_ins (vector unsigned int, const unsigned int, const unsigned int); vector float vec_splati_ins (vector float, const unsigned int, const float); Differential Revision: https://reviews.llvm.org/D82520	2020-07-06 20:29:33 -05:00
Amy Kwan	77b45c8014	[PowerPC][Power10] Exploit the xxsplti32dx instruction when lowering VECTOR_SHUFFLE. This patch aims to exploit the xxsplti32dx XT, IX, IMM32 instruction when lowering VECTOR_SHUFFLEs. We implement lowerToXXSPLTI32DX when lowering vector shuffles to check if: - Element size is 4 bytes - The RHS is a constant vector (and constant splat of 4-bytes) - The shuffle mask is a suitable mask for the XXSPLTI32DX instruction where it is one of the 32 masks: <0, 4-7, 2, 4-7> <4-7, 1, 4-7, 3> Differential Revision: https://reviews.llvm.org/D83245	2020-07-06 20:28:38 -05:00
jasonliu	3b7308f12c	[XCOFF][AIX] Give symbol an internal name when desired symbol name contains invalid character(s) Summary: When a desired symbol name contains invalid character that the system assembler could not process, we need to emit .rename directive in assembly path in order for that desired symbol name to appear in the symbol table. Reviewed By: hubert.reinterpretcast, DiggerLin, daltenty, Xiangling_L Differential Revision: https://reviews.llvm.org/D82481	2020-07-06 15:49:15 +00:00
Esme-Yi	5f873faf6c	[PowerPC] Legalize SREM/UREM directly on P9. Summary: As Bugzilla-35090 reported, the rationale for custom lowering SREM/UREM should no longer hold. At the IR level, the div-rem-pairs pass performs the transformation where the remainder is computed from the result of the division when both are required. We should now be able to lower these directly on P9. The pass also fixes the problem of the divide being in a different block than the remainder. This patch removes redundant code and makes SREM/UREM legal directly on P9. Reviewed By: lkail Differential Revision: https://reviews.llvm.org/D82145	2020-07-06 11:47:31 +00:00
Kai Luo	f91a303288	[PowerPC] Implement probing for prologue This patch is part of supporting `-fstack-clash-protection`. Implemented probing when emitting prologue. Differential Revision: https://reviews.llvm.org/D81460	2020-07-04 03:07:08 +00:00
Biplob Mishra	65c4fdc701	[PowerPC] Implement Vector Insert Builtins in LLVM/Clang Implements vec_insertl() and vec_inserth(). Differential Revision: https://reviews.llvm.org/D82365	2020-07-03 15:30:41 -05:00
jasonliu	b6227b52f3	[XCOFF][AIX] Use 'L..' instead of '.L' for getPrivateGlobalPrefix in DataLayout Summary: D80831 changed part of the prefix usage for AIX. But there are other places getting prefix from DataLayout. This patch intends to make prefix usage consistent on AIX. Reviewed by: hubert.reinterpretcast, daltenty Differential Revision: https://reviews.llvm.org/D81270	2020-07-03 18:25:14 +00:00
Sean Fertile	91c9c2c9c7	Enable basepointer for AIX. Differential Revision: https://reviews.llvm.org/D82030	2020-07-03 11:55:49 -04:00
Kai Luo	8382e3ec50	[PowerPC] Implement probing for dynamic stack allocation This patch is part of supporting `-fstack-clash-protection`. Compared to the existing `lowerDynamicAlloc`, it mainly does the following: - Adds a new pseudo instruction PPC::PREPARE_PROBED_ALLOC to get the actual frame pointer and final stack pointer. - Synthesizes a loop to probe by blocks. - Uses DYNAREAOFFSET to get MaxCallFrameSize, which is calculated in prologepilog. Differential Revision: https://reviews.llvm.org/D81358	2020-07-03 05:36:40 +00:00
Biplob Mishra	d6caac3195	[PowerPC] Implement Vector Blend Builtins in LLVM/Clang Implements vec_blendv() Differential Revision: https://reviews.llvm.org/D82774	2020-07-02 16:52:52 -05:00
Biplob Mishra	283a0554d5	[PowerPC]Implement Vector Permute Extended Builtin Implements vector permute builtin: vec_permx() Differential Revision: https://reviews.llvm.org/D82869	2020-07-02 14:53:18 -05:00
Nemanja Ivanovic	84b83b6cc1	[PowerPC] Remove undefs from splat input when changing shuffle mask As of 1fed131660b2c5d3ea7007e273a7a5da80699445, we have code that changes shuffle masks so that we can put the shuffle in a canonical form that can be matched to a single instruction. However, it does not properly account for undef elements in the BUILD_VECTOR that is the RHS splat so we can end up with undefs where they shouldn't be. This patch converts the splat input with undefs to one without.	2020-07-02 12:26:56 -05:00
Qiu Chaofan	5777c725bf	[NFC] Fix typo in triples from unkown to unknown	2020-07-02 16:21:54 +08:00
Biplob Mishra	30db684f8c	[PowerPC]Implement Vector Shift Double Bit Immediate Builtins Implement Vector Shift Double Bit Immediate Builtins in LLVM/Clang. * vec_sldb (); * vec_srdb (); Differential Revision: https://reviews.llvm.org/D82440	2020-07-01 20:34:53 -05:00
Anil Mahmud	e9e3ccc44e	[PowerPC] Exploit xxspltiw and xxspltidp instructions Exploits the VSX Vector Splat Immediate Word and VSX Vector Splat Immediate Double Precision instructions: xxspltiw XT,IMM32 xxspltidp XT,IMM32 Differential Revision: https://reviews.llvm.org/D82911	2020-07-01 19:18:29 -05:00
Stefan Pintilie	91403d8b4a	[PowerPC] Fix for PC Relative call protocol The situation where the caller uses a TOC and the callee does not but is marked as clobbering the TOC (st_other=1) was not being compiled correctly if both functions were in the same object file. The call site where we had `callee` was missing a nop after the call. This is because it was assumed that since the two functions were in the same DSO they would be sharing a TOC. This is not the case if the callee uses PC Relative because in that case it may clobber the TOC. This patch makes sure that we add the nop correctly so that the linker has a place to restore the TOC. Reviewers: sfertile, NeHuang, saghir Differential Revision: https://reviews.llvm.org/D81126	2020-07-01 07:08:41 -05:00
Nemanja Ivanovic	608bc8479f	[PowerPC] Fix crash for shuffle canonicalization with elt 0 from RHS Commit 1fed131660b2 assumed that shuffle vector canonicalization will always ensure that the shuffle mask will be ordered so that element zero comes from the LHS vector. However there is code out there for which this is not the case. This patch simply removes that unsafe assumption and makes the code work regardless of the source of the first element.	2020-06-29 12:26:08 -05:00
Nemanja Ivanovic	5e8f6beb8e	[PowerPC] Don't combine SCALAR_TO_VECTOR without VSX Most of the patterns for PPCISD::SCALAR_TO_VECTOR_PERMUTED require VSX. So don't emit them if the subtarget doesn't have VSX. This resolves the issue reported on https://reviews.llvm.org/rG1fed131660b2c5d3ea7007e273a7a5da80699445	2020-06-29 09:48:57 -05:00
Esme-Yi	7c2f3bcb24	[NFC][PowerPC] Add run lines to test DivRemPairsPass.	2020-06-28 16:26:05 +00:00
Chen Zheng	47d79b5f12	[MachineLICM] testcase for hoisting rematerializable instruction, nfc	2020-06-28 03:16:57 -04:00
Amy Kwan	04ae1ab1c3	[PowerPC] Add support for llvm.ppc.dcbt, llvm.ppc.dcbtst, llvm.ppc.isync intrinsics This patch adds LLVM intrinsics for the dcbt (Data Cache Block Touch), dcbtst (Data Cache Block Touch for Store) and isync (Instruction Synchronize) instructions. The intrinsics for dcbt and dcbtst in this patch are named llvm.ppc.dcbt.with.hint and llvm.ppc.dcbtst.with.hint respectively, as intrinsics named llvm.ppc.dcbt and llvm.ppc.dcbtst already exist. However, the original variants of the intrinsics do not accept the TH immediate field, whereas these variants do. Differential Revision: https://reviews.llvm.org/D79633	2020-06-26 13:02:18 -05:00
Amy Kwan	8e4bd7c3f3	[PowerPC][Power10] Implement centrifuge, vector gather every nth bit, vector evaluate Builtins in LLVM/Clang This patch implements builtins for the following prototypes: unsigned long long __builtin_cfuged (unsigned long long, unsigned long long); vector unsigned long long vec_cfuge (vector unsigned long long, vector unsigned long long); unsigned long long vec_gnb (vector unsigned __int128, const unsigned int); vector unsigned char vec_ternarylogic (vector unsigned char, vector unsigned char, vector unsigned char, const unsigned int); vector unsigned short vec_ternarylogic (vector unsigned short, vector unsigned short, vector unsigned short, const unsigned int); vector unsigned int vec_ternarylogic (vector unsigned int, vector unsigned int, vector unsigned int, const unsigned int); vector unsigned long long vec_ternarylogic (vector unsigned long long, vector unsigned long long, vector unsigned long long, const unsigned int); vector unsigned __int128 vec_ternarylogic (vector unsigned __int128, vector unsigned __int128, vector unsigned __int128, const unsigned int); Differential Revision: https://reviews.llvm.org/D80970	2020-06-25 21:34:41 -05:00
Shawn Landden	df770f665c	[PowerPC] add popcount CodeGen test; NFC	2020-06-25 12:41:33 +04:00
Amy Kwan	25f513ca38	[PowerPC][Power10] Implement Count Leading/Trailing Zeroes Builtins under bit Mask in LLVM/Clang This patch implements builtins for the following prototypes: unsigned long long __builtin_cntlzdm (unsigned long long, unsigned long long) unsigned long long __builtin_cnttzdm (unsigned long long, unsigned long long) vector unsigned long long vec_cntlzm (vector unsigned long long, vector unsigned long long) vector unsigned long long vec_cnttzm (vector unsigned long long, vector unsigned long long) Differential Revision: https://reviews.llvm.org/D80941	2020-06-24 16:03:45 -05:00
Eli Friedman	9d315e1c2b	Remove GlobalValue::getAlignment(). This function is deceptive at best: it doesn't return what you'd expect. If you have an arbitrary GlobalValue and you want to determine the alignment of that pointer, Value::getPointerAlignment() returns the correct value. If you want the actual declared alignment of a function or variable, GlobalObject::getAlignment() returns that. This patch switches all the users of GlobalValue::getAlignment to an appropriate alternative. Differential Revision: https://reviews.llvm.org/D80368	2020-06-23 19:13:42 -07:00
Chen Zheng	4cc65c325c	[PowerPC] fold addi's imm operand to its imm form consumer's displacement This patch adds a function to do following transformation: %0:g8rc_and_g8rc_nox0 = ADDI8 %5:g8rc_and_g8rc_nox0, 144 STD killed %7:g8rc, 16, %0:g8rc_and_g8rc_nox0 :: (store 8 into %ir.8) ------> STD killed %7:g8rc, 160, %5:g8rc_and_g8rc_nox0 :: (store 8 into %ir.8) Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D81723	2020-06-23 06:28:18 -04:00
Kai Luo	236d96d31a	[PowerPC][NFC] Add tests for variadic functions on PPC64	2020-06-23 09:20:04 +00:00
Amy Kwan	999d7986df	[PowerPC][Power10] Implement VSX PCV Generate Operations in LLVM/Clang This patch implements builtins for the following prototypes for the VSX Permute Control Vector Generate with Mask Instructions: vector unsigned char vec_genpcvm (vector unsigned char, const int); vector unsigned short vec_genpcvm (vector unsigned short, const int); vector unsigned int vec_genpcvm (vector unsigned int, const int); vector unsigned long long vec_genpcvm (vector unsigned long long, const int); Differential Revision: https://reviews.llvm.org/D81774	2020-06-22 21:09:34 -05:00
Amy Kwan	048b03aa1b	[PowerPC][Power10] Implement Vector Clear Left/Rightmost Bytes Builtins in LLVM/Clang This patch implements builtins for the following prototypes: ``` vector signed char vec_clrl (vector signed char a, unsigned int n); vector unsigned char vec_clrl (vector unsigned char a, unsigned int n); vector signed char vec_clrr (vector signed char a, unsigned int n); vector unsigned char vec_clrr (vector unsigned char a, unsigned int n); ``` Differential Revision: https://reviews.llvm.org/D81707	2020-06-20 18:29:16 -05:00
Nemanja Ivanovic	f6f0a5745d	[PowerPC] Canonicalize shuffles to match more single-instruction masks on LE We currently miss a number of opportunities to emit single-instruction VMRG[LH][BHW] instructions for shuffles on little endian subtargets. Although this in itself is not a huge performance opportunity since loading the permute vector for a VPERM can always be pulled out of loops, producing such merge instructions is useful to downstream optimizations. Since VPERM is essentially opaque to all subsequent optimizations, we want to avoid it as much as possible. Other permute instructions have semantics that can be reasoned about much more easily in later optimizations. This patch does the following: - Canonicalize shuffles so that the first element comes from the first vector (since that's what most of the mask matching functions want) - Switch the elements that come from splat vectors so that they match the corresponding elements from the other vector (to allow for merges) - Adds debugging messages for when a shuffle is matched to a VPERM so that anyone interested in improving this further can get the info for their code Differential revision: https://reviews.llvm.org/D77448	2020-06-18 21:54:22 -05:00
Amy Kwan	31eedc0f69	[PowerPC][Power10] Implement Parallel Bits Deposit/Extract Builtins in LLVM/Clang This patch implements builtins for the following prototypes: vector unsigned long long vec_pdep(vector unsigned long long, vector unsigned long long); vector unsigned long long vec_pext(vector unsigned long long, vector unsigned long long __b); unsigned long long __builtin_pdepd (unsigned long long, unsigned long long); unsigned long long __builtin_pextd (unsigned long long, unsigned long long); Revision Depends on D80758 Differential Revision: https://reviews.llvm.org/D80935	2020-06-18 16:23:56 -05:00
Kang Zhang	aa0efac1ba	[PowerPC] Don't convert Loop to CTR Loop for fp128 BinaryOperator Summary: On PPC, a BinaryOperator of fp128 becomes a libcall, and we shouldn't convert a loop to a CTR loop if the loop contains a libcall. But currently, in the PPCTTIImpl::mightUseCTR() function, we only handle BinaryOperator for ppc_fp128 and do not handle fp128. Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D81353	2020-06-18 02:54:19 +00:00
Esme-Yi	3a5edae7bf	[PowerPC] Custom lower rotl v1i128 to vector_shuffle. Summary: A bug is reported in bugzilla-45628, where the swap_with_shift case can’t be matched to a single HW instruction xxswapd as expected. In fact the case matches the idiom of rotate. We have MatchRotate to handle an ‘or’ of two operands and generate a rot[lr] if the case matches the idiom of rotate. However, PPC doesn’t support ROTL v1i128, so we custom lower ROTL v1i128 to a vector_shuffle. The vector_shuffle will be matched to a single HW instruction during instruction selection. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D81076	2020-06-18 01:32:23 +00:00
Kang Zhang	8a2bb9ff6f	[NFC][PowerPC] Remove unused intrinsic for old CTR loop pass Summary: In the patch D62907 the PPC CTRLoops pass was replaced by the Generic Hardware Loop pass, which introduced some new intrinsics for Generic Hardware Loop. The old intrinsics used in PPC CTRLoops, int_ppc_mtctr and int_ppc_is_decremented_ctr_nonzero, have been replaced by int_set_loop_iterations and loop_decrement. This patch removes the two unused intrinsics. Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D81539	2020-06-17 07:06:46 +00:00
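
The [MachineScheduler] entry at the top of this log (3f23d4b8c3) describes a change in when the TopDepthReduce heuristic applies. The following is a simplified Python model of that description, not the LLVM source: `Candidate`, `pick_lesser_depth`, and both function names are hypothetical, and the choice of which candidate the old check tested is an assumption, since the commit message only says "one of the candidates".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Candidate:
    # Hypothetical stand-in for a scheduling candidate in the top zone.
    name: str
    depth: int  # cycles from the top of the region until this instruction is ready


def pick_lesser_depth(try_cand: Candidate, cand: Candidate) -> Optional[Candidate]:
    # Prefer the candidate with the smaller depth; None means "no preference".
    if try_cand.depth < cand.depth:
        return try_cand
    if cand.depth < try_cand.depth:
        return cand
    return None


def top_depth_reduce_before(try_cand: Candidate, cand: Candidate,
                            scheduled_latency: int) -> Optional[Candidate]:
    # Behaviour described as buggy: only ONE candidate's depth is compared
    # with the latency already scheduled (assumed here to be `cand`).
    if cand.depth > scheduled_latency:
        return pick_lesser_depth(try_cand, cand)
    return None  # heuristic skipped; a lower-priority heuristic decides


def top_depth_reduce_after(try_cand: Candidate, cand: Candidate,
                           scheduled_latency: int) -> Optional[Candidate]:
    # Behaviour described as the fix: apply the heuristic if EITHER
    # candidate's depth exceeds the latency already scheduled.
    if max(try_cand.depth, cand.depth) > scheduled_latency:
        return pick_lesser_depth(try_cand, cand)
    return None


if __name__ == "__main__":
    deep = Candidate("deep", depth=5)
    shallow = Candidate("shallow", depth=2)
    # With 3 cycles already scheduled, "deep" would stall but "shallow" would not.
    print(top_depth_reduce_before(deep, shallow, scheduled_latency=3))
    print(top_depth_reduce_after(deep, shallow, scheduled_latency=3))
```

In this model the old check returns no preference for the pair above, which leaves room for a lower-priority heuristic to choose the deeper candidate and stall. The fixed check prefers the shallower one. The commit message says the same reasoning applies to BotHeightReduce in the bottom zone, with height in place of depth.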


2585 Commits