llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 03:53:04 +02:00

Author	SHA1	Message	Date
Ahmed Bougacha	458b6b7f6e	[X86] Remove dead ISD opcodes. NFC. llvm-svn: 273716	2016-06-24 20:37:55 +00:00
Igor Breger	8384cb8d2e	[AVX512] [AVX512/AVX][Intrinsics] Fix Variable Bit Shift Right Arithmetic intrinsic lowering. Differential Revision: http://reviews.llvm.org/D20897 llvm-svn: 273138	2016-06-20 07:05:43 +00:00
Simon Pilgrim	8ac77ef18f	[X86][SSE4A] Autoupgrade and remove MOVNTSD/MOVNTSS intrinsics Required better annotation of the instruction defs upon removal of the builtin intrinsic pattern. llvm-svn: 273077	2016-06-18 02:38:26 +00:00
Simon Pilgrim	5d2cb8e501	[X86][SSE4A] Added patterns for nontemporal stores of scalar float/doubles using MOVNTSD/MOVNTSS llvm-svn: 272651	2016-06-14 09:43:38 +00:00
Sanjay Patel	f3d2a8e797	[x86, SSE] change patterns for CMPP to float types to allow matching with SSE1 (PR28044) This patch is intended to solve: https://llvm.org/bugs/show_bug.cgi?id=28044 By changing the definition of X86ISD::CMPP to use float types, we allow it to be created and pass legalization for an SSE1-only target where v4i32 is not legal. The motivational trail for this change includes: https://llvm.org/bugs/show_bug.cgi?id=28001 and eventually makes this trigger: http://reviews.llvm.org/D21190 Ie, after this step, we should be free to have Clang generate FP compare IR instead of x86 intrinsics for SSE C packed compare intrinsics. (We can auto-upgrade and remove the LLVM sse.cmp intrinsics as a follow-up step.) Once we're generating vector IR instead of x86 intrinsics, a big pile of generic optimizations can trigger. Differential Revision: http://reviews.llvm.org/D21235 llvm-svn: 272511	2016-06-12 15:03:25 +00:00
Simon Pilgrim	53b27def76	[X86][SSE] Use vXi8 return type for PSLLDQ/PSRLDQ instructions These are byte shift instructions and it will make shuffle combining a lot more straightforward if we can assume a vXi8 vector of bytes so decoded shuffle masks match the return type's number of elements llvm-svn: 272468	2016-06-11 12:54:37 +00:00
Craig Topper	a336600f12	[X86] Bring consistent naming to the SSE/AVX and AVX512 PALIGNR instructions. Then add shuffle decode printing for the EVEX forms which is made easier by having the naming structure more similar to other instructions. llvm-svn: 272249	2016-06-09 07:06:38 +00:00
Simon Pilgrim	67ca4cba96	[X86][SSE] Add general lowering of nontemporal vector loads Currently the only way to use the (V)MOVNTDQA nontemporal vector loads instructions is through the int_x86_sse41_movntdqa style builtins. This patch adds support for lowering nontemporal loads from general IR, allowing us to remove the movntdqa builtins in a future patch. We currently still fold nontemporal loads into suitable instructions, we should probably look at removing this (and nontemporal stores as well) or at least make the target's folding implementation aware that its dealing with a nontemporal memory transaction. There is also an issue that VMOVNTDQA only acts on 128-bit vectors on pre-AVX2 hardware - so currently a normal ymm load is still used on AVX1 targets. Differential Review: http://reviews.llvm.org/D20965 llvm-svn: 272010	2016-06-07 13:34:24 +00:00
Craig Topper	b20bc0d4cb	[AVX512] Allow avx2 and sse41 nontemporal load intrinsics to select EVEX encoded instructions when VLX is enabled. llvm-svn: 271988	2016-06-07 07:27:57 +00:00
Craig Topper	9126a71b21	[AVX512] Remove unnecessary mayLoad, mayStore, hasSidEffects flags from instructions that have patterns that imply them. Add the same set of flags to instructions that don't have patterns to imply them. llvm-svn: 271987	2016-06-07 07:27:54 +00:00
Craig Topper	c89131380e	[X86] Use X86ISD::ABS for lowering pabs SSSE3/AVX intrinsics to match AVX512. Should allow those intrinsics to use the EVEX encoded instructions and get the extra registers when available. llvm-svn: 271775	2016-06-04 04:32:15 +00:00
Craig Topper	1981be1e4f	[X86] Fix some isel patterns to remove an operand from some multiclasses. NFC llvm-svn: 271631	2016-06-03 05:58:52 +00:00
Craig Topper	7dae6773ea	[AVX512] Ensure EVEX vpshufd, vpshuflw, and vpshufhw have isel priority over the VEX encoded ones. llvm-svn: 271629	2016-06-03 05:31:04 +00:00
Craig Topper	d4d080a520	[X86] Simplify a multiclass to remove a parameter. NFC llvm-svn: 271627	2016-06-03 05:30:56 +00:00
Craig Topper	89720ac9ae	[X86] Remove unnecessary pattern predicates from the vector bit cast patterns. The types have to be legal and there are no alternative patterns. Saves almost 200 bytes in isel table. llvm-svn: 271625	2016-06-03 04:15:27 +00:00
Simon Pilgrim	2e72cbb66e	[X86][SSE] Replace (V)CVTTPS2DQ and VCVTTPD2DQ truncating (round to zero) f32/f64 to i32 with generic IR (llvm) This patch removes the llvm intrinsics (V)CVTTPS2DQ and VCVTTPD2DQ truncation (round to zero) conversions and auto-upgrades to FP_TO_SINT calls instead. Note: I looked at updating CVTTPD2DQ as well but this still requires a lot more work to correctly lower. Differential Revision: http://reviews.llvm.org/D20860 llvm-svn: 271510	2016-06-02 10:55:21 +00:00
Craig Topper	94770dc6ce	[X86] No need to use 256-bit VMOVNTPS for integer types when only AVX1 is supported. VMOVNTDQ is available with AVX1. We were getting this right for v4i64 but not the other integer types. llvm-svn: 271482	2016-06-02 04:19:48 +00:00
Simon Pilgrim	29a89e51e6	[X86][SSE] Add load-folding patterns for (V)CVTDQ2PD (PR27291) Added patterns for (V)CVTDQ2PD -> 2f64 loading from a 64-bit source. llvm-svn: 271269	2016-05-31 12:04:35 +00:00
Craig Topper	4f195e8edb	[X86] Remove SSE/AVX unaligned store intrinsics as clang no longer uses them. Auto upgrade to native unaligned store instructions. llvm-svn: 271236	2016-05-30 23:15:56 +00:00
Simon Pilgrim	1a1ddc32da	[X86][SSE] Replace (V)CVTDQ2PD(Y) and (V)CVTPS2PD(Y) lossless conversion intrinsics with generic IR Followup to D20528 clang patch, this removes the (V)CVTDQ2PD(Y) and (V)CVTPS2PD(Y) llvm intrinsics and auto-upgrades to sitofp/fpext instead. Differential Revision: http://reviews.llvm.org/D20568 llvm-svn: 270678	2016-05-25 08:59:18 +00:00
Craig Topper	4710ab1424	[X86] Remove the llvm.x86.sse2.storel.dq intrinsic. It hasn't been used in a long time. llvm-svn: 270677	2016-05-25 06:56:32 +00:00
Craig Topper	62ac946928	[X86] Use instruction aliases to replace custom asm parser code for optimizing moves to use 2 byte VEX prefix. llvm-svn: 270394	2016-05-23 04:02:27 +00:00
Craig Topper	c1c9bee262	[X86] Remove unnecessary alignment check on patterns that use VEXTRACTF128 for integer types when only AVX1 is supported. llvm-svn: 270335	2016-05-21 22:50:18 +00:00
Craig Topper	ff8ed4829f	[AVX512] Add patterns for extracting subvectors and storing to memory. llvm-svn: 270334	2016-05-21 22:50:14 +00:00
Craig Topper	79cd5a6b41	[AVX512] Add patterns for VEXTRACT v16i16->v8i16 and v32i8->v16i8. Disable AVX2 versions of vector extract when AVX512VL is enabled. llvm-svn: 270318	2016-05-21 07:08:56 +00:00
Craig Topper	f3e023e70e	[AVX512] Disable AVX2 VPERMD, VPERMQ, VPERMPS, and VPERMPD patterns when AVX512VL is enabled. Also add shuffle comment printing for AVX512VL VPERMPD/VPERMQ to keep some tests that now use these instructions instead of the AVX2 ones. llvm-svn: 270317	2016-05-21 06:07:18 +00:00
Craig Topper	30a8fe51db	[AVX512] Disable AVX/AVX2 VBROADCASTSS/VBROADCASTSD patterns when AVX512VL is enabled. llvm-svn: 270316	2016-05-21 05:47:25 +00:00
Craig Topper	2b54c30436	[AVX512] Disable AVX/AVX2 patterns for VPSADBW and VPMULUDQ when the AVX512VL/AVX512BWI equivalents are available. llvm-svn: 270311	2016-05-21 03:52:32 +00:00
Craig Topper	8ca6c23ba5	[X86] Convert some SSE2/AVX2 intrinsics to ISD opcodes during lowering instead of pattern matching the intrinsics. This unifies handling with AVX512 and allows these intrinsics to select EVEX encoded instructions to increase available registers. llvm-svn: 270310	2016-05-21 03:52:28 +00:00
Craig Topper	1780b40874	[X86] Fix another AVX pattern to only be disable if VLX and BWI are supported. llvm-svn: 270182	2016-05-20 05:10:27 +00:00
Craig Topper	195c9b10ae	[X86] Fix some AVX patterns to only be disabled if VLX and BWI are supported. Without this we get isel failures on the avx-intrinsics-x86.ll test in AVX512VL. llvm-svn: 270174	2016-05-20 02:00:08 +00:00
Ashutosh Nema	0cfbe42fbc	Add new flag and intrinsic support for MWAITX and MONITORX instructions Summary: MONITORX/MWAITX instructions provide similar capability to the MONITOR/MWAIT pair while adding a timer function, such that another termination of the MWAITX instruction occurs when the timer expires. The presence of the MONITORX and MWAITX instructions is indicated by CPUID 8000_0001, ECX, bit 29. The MONITORX and MWAITX instructions are intercepted by the same bits that intercept MONITOR and MWAIT. MONITORX instruction establishes a range to be monitored. MWAITX instruction causes the processor to stop instruction execution and enter an implementation-dependent optimized state until occurrence of a class of events. Opcode of MONITORX instruction is "0F 01 FA". Opcode of MWAITX instruction is "0F 01 FB". These opcode information is used in adding tests for the disassembler. These instructions are enabled for AMD's bdver4 architecture. Patch by Ganesh Gopalasubramanian! Reviewers: echristo, craig.topper, RKSimon Subscribers: RKSimon, joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D19795 llvm-svn: 269911	2016-05-18 11:59:12 +00:00
Craig Topper	b6baa777fe	[AVX512] Change predicates on some vXi16/vXi8 AVX store patterns so they stay enabled unless VLX and BWI instructions are supported." Without this we could fail instruction selection if VLX was enabled, but BWI wasn't. llvm-svn: 268885	2016-05-08 23:08:40 +00:00
Craig Topper	dd43e37f21	[AVX512] Add VLX 128/256-bit SET0 operations that encode to 128/256-bit EVEX encoded VPXORD so all 32 registers can be used. llvm-svn: 268884	2016-05-08 21:33:53 +00:00
Craig Topper	872733a7d7	[X86] Remove extra patterns that check for BUILD_VECTOR of all 0s. These are always canonicalized to v4i32/v8i32/v16i32 except for in SSE1 only when only v4f32 is supported. llvm-svn: 268880	2016-05-08 20:10:20 +00:00
Craig Topper	ce30d72448	[X86] Lower 256-bit vector all-zero constants to v8i32 even with AVX1 only. Either way a 256-bit VXORPS will be used. llvm-svn: 268873	2016-05-08 07:10:54 +00:00
Craig Topper	17e31eeac8	[X86] Add patterns for 256-bit non-temporal stores when only AVX1 is supported. While there, add a predicate to the SSE2 patterns to avoid an ordering dependency. llvm-svn: 268872	2016-05-08 07:10:50 +00:00
Craig Topper	85b1036796	[X86] No need to avoid selecting AVX_SET0 for 256-bit integer types when only AVX1 is supported. AVX_SET0 just expands to 256-bit VXORPS which is legal in AVX1. llvm-svn: 268871	2016-05-08 07:10:47 +00:00
Simon Pilgrim	d1da382cb8	[X86][SSE] Dropped X86ISD::FGETSIGNx86 and use MOVMSK instead for FGETSIGN lowering movmsk.ll tests are unchanged. llvm-svn: 268237	2016-05-02 14:58:22 +00:00
Igor Breger	a0208b4462	Change AVX512 braodcastsd/ss patterns interaction with spilling . New implementation take a scalar register and generate a vector without COPY_TO_REGCLASS (turn it into a VR128 register ) .The issue is that during register allocation we may spill a scalar value using 128-bit loads and stores, wasting cache bandwidth. Differential Revision: http://reviews.llvm.org/D19579 llvm-svn: 268190	2016-05-01 08:40:00 +00:00
Craig Topper	5387c11293	[AVX512] Prefer AVX512 VPACK instructions over AVX/AVX2 instructions when VLX and BWI are supported. llvm-svn: 268189	2016-05-01 06:52:19 +00:00
Craig Topper	ac42bfd1c7	[X86] Add an AddedComplexity to another pattern to put it near similar in the output file. llvm-svn: 268184	2016-05-01 05:22:15 +00:00
Craig Topper	30013f145d	[X86] Remove a seemlingly unused pattern. The same pattern appears elsewhere with an AddedComplexity that made this unreachable. llvm-svn: 268183	2016-05-01 05:22:13 +00:00
Craig Topper	6aa28ed840	[X86] Add AddedComplexity to keep some similar patterns near each other in the output file. llvm-svn: 268181	2016-05-01 04:59:49 +00:00
Craig Topper	fcae9c2218	[X86] Remove some redundant selection patterns. llvm-svn: 268180	2016-05-01 04:59:46 +00:00
Simon Pilgrim	9aa00086fb	[X86][SSE] Support for MOVMSK signbit extraction instructions Add support for lowering with the MOVMSK instruction to extract vector element signbits to a GPR. This is an early step towards more optimal handling of vector comparison results. Differential Revision: http://reviews.llvm.org/D18741 llvm-svn: 265266	2016-04-03 18:22:03 +00:00
Simon Pilgrim	a9b6ea15aa	[X86][AVX] Better support for the variable mask form of VPERMILPD/VPERMILPS The variable mask form of VPERMILPD/VPERMILPS were only partially implemented, with much of it still performed as an intrinsic. This patch properly defines the instructions in terms of X86ISD::VPERMILPV, permitting the opcode to be easily combined as a target shuffle. Differential Revision: http://reviews.llvm.org/D17681 llvm-svn: 262635	2016-03-03 18:13:53 +00:00
Simon Pilgrim	13e404ae07	Strip trailing whitespace. NFCI. llvm-svn: 262083	2016-02-26 22:28:50 +00:00
Igor Breger	c2763588ac	AVX512F: Add GATHER/SCATTER assembler Intel syntax tests for knl/skx/avx . Change memory operand parser handling. Differential Revision: http://reviews.llvm.org/D17564 llvm-svn: 261862	2016-02-25 13:30:17 +00:00
Igor Breger	e60efa9c40	AVX512: Fix predicate of AVX pcmpeqw/b , pcmpgtb/w/d instructions . AVX512 version of this instructions return result in kmask register, so AVX patterns should not be disabled. Differential Revision: http://reviews.llvm.org/D17517 llvm-svn: 261619	2016-02-23 08:55:33 +00:00

1 2 3 4 5 ...

1307 Commits