llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-22 04:22:57 +02:00

Author	SHA1	Message	Date
Bruno Cardoso Lopes	17ae896095	Move code around and add comments llvm-svn: 137518	2011-08-12 21:48:22 +00:00
Bruno Cardoso Lopes	4106caa9af	Cleanup: Remove Int_ CVTSS2SI* forms llvm-svn: 137297	2011-08-11 02:52:36 +00:00
Bruno Cardoso Lopes	565ab1542a	The following X86 pattern is incorrect: def : Pat<(X86Movss VR128:$src1, (bc_v4i32 (v2i64 (load addr:$src2)))), (MOVLPSrm VR128:$src1, addr:$src2)>; This matches a MOVSS dag with a MOVLPS instruction. However, MOVSS will replace only the low 32 bits of the register, while the MOVLPS instruction will replace the low 64 bits. A testcase is added and illustrates the bug and also modified the one that was already present. Patch by Tanya Lattner. llvm-svn: 137227	2011-08-10 17:45:17 +00:00
Bruno Cardoso Lopes	7461b930f3	Add v16i16 and v32i8 store patterns llvm-svn: 137166	2011-08-09 22:39:53 +00:00
Bruno Cardoso Lopes	028c6aa951	Use fp unpack instructions to unpack int types. Until we have AVX2, this is the best we can do for these patterns. This fixes PR10554. llvm-svn: 137161	2011-08-09 22:18:37 +00:00
Bruno Cardoso Lopes	633400ee00	Reapply a more appropriate solution than in r137114. AVX supports v4f64 = sitofp v4i32. This fixes PR10559. Also add support for v4i32 = fptosi v4f64. llvm-svn: 137128	2011-08-09 17:39:13 +00:00
Bruno Cardoso Lopes	d521431558	Add support for avx vector fextend llvm-svn: 137105	2011-08-09 03:04:29 +00:00
Bruno Cardoso Lopes	09a727298f	Add AVX versions of 128-bit sitofp and fptosi llvm-svn: 137104	2011-08-09 03:04:25 +00:00
Bruno Cardoso Lopes	1025d1eb3b	Add two patterns to match special vmovss and vmovsd cases. Also fix the patterns already there to be more strict regarding the predicate. This fixes PR10558 llvm-svn: 137100	2011-08-09 01:43:09 +00:00
Bruno Cardoso Lopes	d7eac41193	Make LowerVSETCC aware of AVX types and add patterns to match them. llvm-svn: 137090	2011-08-09 00:46:57 +00:00
Bruno Cardoso Lopes	771876cade	Add v4f64 -> v2f32 fp_round support. Also add a testcase to exercise the legalizer. This commit together with the two previous ones fixes PR10495. llvm-svn: 136654	2011-08-01 21:54:09 +00:00
Bruno Cardoso Lopes	473d982caf	Add v8i32 and v4i64 vpermil patterns llvm-svn: 136451	2011-07-29 01:31:07 +00:00
Bruno Cardoso Lopes	02bbf20b02	Cleanup PALIGNR handling and remove the old palign pattern fragment. Also make PALIGNR masks not match 256-bit vectors, which aren't supported. It's also a step towards solving PR10489 llvm-svn: 136448	2011-07-29 01:30:59 +00:00
Bruno Cardoso Lopes	e24a043703	Add patterns to generate copies for extract_subvector instead of using vextractf128. This will reduce the number of issued instructions for several avx codes. llvm-svn: 136323	2011-07-28 01:26:50 +00:00
Bruno Cardoso Lopes	73945bf79a	movd/movq write zeros in the high 128-bit part of the vector. Use them to match 256-bit scalar_to_vector+zext. llvm-svn: 136322	2011-07-28 01:26:46 +00:00
Bruno Cardoso Lopes	1f63a37172	Add a few patterns to match allzeros without having to use the fp unit. Take advantage that the 128-bit vpxor zeros the higher part and use it. This also fixes PR10491 llvm-svn: 136321	2011-07-28 01:26:43 +00:00
Bruno Cardoso Lopes	06d8be564f	Add SINT_TO_FP and FP_TO_SINT support for v8i32 types. Also move a convert pattern close to the instruction definition. llvm-svn: 136320	2011-07-28 01:26:39 +00:00
Kevin Enderby	9adbbfffd0	Fix llvm-mc handling of x86 instructions that take 8-bit unsigned immediates. llvm-mc gives an "invalid operand" error for instructions that take an unsigned immediate which have the high bit set such as: pblendw $0xc5, %xmm2, %xmm1 llvm-mc treats all x86 immediates as signed values and range checks them. A small number of x86 instructions use the imm8 field as a set of bits. This change only changes those instructions and where the high bit is not ignored. The others remain unchanged. llvm-svn: 136287	2011-07-27 23:01:50 +00:00
Bruno Cardoso Lopes	8830fde434	The vpermilps and vpermilpd have different behaviour regarding the usage of the shuffle bitmask. Both work in 128-bit lanes without crossing, but in the former the mask of the high part is the same used by the low part while in the latter both lanes have independent masks. Handle this properly and add support for vpermilpd. llvm-svn: 136200	2011-07-27 00:56:34 +00:00
Bruno Cardoso Lopes	e53bb853ea	Recognize unpckh* masks and match 256-bit versions. The new versions are different from the previous 128-bit because they work in lanes. Update a few comments and add testcases llvm-svn: 136157	2011-07-26 22:03:40 +00:00
Bruno Cardoso Lopes	a493ad3938	Remove now unused patterns. 0 insertions(+), 98 deletions(-) llvm-svn: 136109	2011-07-26 18:22:39 +00:00
Bruno Cardoso Lopes	b24e958ffb	Cleanup old matching for PUNPCK* variants llvm-svn: 136108	2011-07-26 18:22:27 +00:00
Bruno Cardoso Lopes	ab40a57cce	Add 256-bit isel for movsldup/movshdup llvm-svn: 136051	2011-07-26 02:39:32 +00:00
Bruno Cardoso Lopes	cde45ac9ca	Add 128-bit AVX versions of movshdup/movsldup llvm-svn: 136048	2011-07-26 02:39:23 +00:00
Bruno Cardoso Lopes	25698b90e9	Cleanup movsldup/movshdup matching. 27 insertions(+), 62 deletions(-) llvm-svn: 136047	2011-07-26 02:39:13 +00:00
Bruno Cardoso Lopes	c94d6a2d2c	Codegen allonesvector better while using AVX: vpcmpeqd + vinsertf128 This also fixes PR10452 llvm-svn: 136004	2011-07-25 23:05:32 +00:00
Bruno Cardoso Lopes	f457bc8120	Add remaining 256-bit vector bitcasts. This also fixes PR10451 llvm-svn: 136003	2011-07-25 23:05:28 +00:00
Bruno Cardoso Lopes	9380919dc5	- Handle special scalar_to_vector case: splats. Using a native 128-bit shuffle before inserting on a 256-bit vector. - Add AVX versions of movd/movq instructions - Introduce a few COPY patterns to match insert_subvector instructions. This turns a trivial insert_subvector instruction into a register copy, coalescing the xmm into a ymm and avoids emitting one more instruction. llvm-svn: 136002	2011-07-25 23:05:25 +00:00
Bruno Cardoso Lopes	5382a1e65e	Add v8f32->v8i32 bitcast. Fixes PR10440 llvm-svn: 135794	2011-07-22 19:51:02 +00:00
Bruno Cardoso Lopes	3691063149	- Register v16i16 as valid VR256 register class - Add more bitcasts for v16i16 - Since 135661 and 135662 already added the splat logic, just add one more splat test for v16i16 llvm-svn: 135663	2011-07-21 02:24:08 +00:00
Bruno Cardoso Lopes	ba1a2a9135	Add support for 256-bit versions of VPERMIL instruction. This is a new instruction introduced in AVX, which can operate on 128 and 256-bit vectors. It considers a 256-bit vector as two independent 128-bit lanes. It can permute any 32- or 64-bit elements inside a lane, and restricts the second lane to have the same permutation of the first one. With the improved splat support introduced earlier today, adding codegen for this instruction enables more efficient 256-bit code: Instead of: vextractf128 $0, %ymm0, %xmm0 punpcklbw %xmm0, %xmm0 punpckhbw %xmm0, %xmm0 vinsertf128 $0, %xmm0, %ymm0, %ymm1 vinsertf128 $1, %xmm0, %ymm1, %ymm0 vextractf128 $1, %ymm0, %xmm1 shufps $1, %xmm1, %xmm1 movss %xmm1, 28(%rsp) movss %xmm1, 24(%rsp) movss %xmm1, 20(%rsp) movss %xmm1, 16(%rsp) vextractf128 $0, %ymm0, %xmm0 shufps $1, %xmm0, %xmm0 movss %xmm0, 12(%rsp) movss %xmm0, 8(%rsp) movss %xmm0, 4(%rsp) movss %xmm0, (%rsp) vmovaps (%rsp), %ymm0 We get: vextractf128 $0, %ymm0, %xmm0 punpcklbw %xmm0, %xmm0 punpckhbw %xmm0, %xmm0 vinsertf128 $0, %xmm0, %ymm0, %ymm1 vinsertf128 $1, %xmm0, %ymm1, %ymm0 vpermilps $85, %ymm0, %ymm0 llvm-svn: 135662	2011-07-21 01:55:47 +00:00
Bruno Cardoso Lopes	194507cc77	Add additional patterns for vextractf128 instruction llvm-svn: 135660	2011-07-21 01:55:39 +00:00
Bruno Cardoso Lopes	14c800c1e3	Add additional patterns for vinsertf128 instruction llvm-svn: 135659	2011-07-21 01:55:36 +00:00
Bruno Cardoso Lopes	e0d5bd467f	Move code around. No functionality changes llvm-svn: 135657	2011-07-21 01:55:30 +00:00
Bruno Cardoso Lopes	bdf75dfa28	Be more smart with VCVTSS2SD. Also place the patterns close to the definitions. llvm-svn: 135407	2011-07-18 18:11:25 +00:00
Bruno Cardoso Lopes	da90f383ab	Add AVX 128-bit sqrt versions llvm-svn: 135404	2011-07-18 17:51:40 +00:00
Bruno Cardoso Lopes	d258749f73	Add AVX 128-bit patterns for sint_to_fp llvm-svn: 135332	2011-07-16 00:50:20 +00:00
Bruno Cardoso Lopes	2a23e486ad	Add a few patterns for 256-bit bitcasts. No testcases now, they are coming together with other tests. llvm-svn: 135312	2011-07-15 22:24:17 +00:00
Bruno Cardoso Lopes	d24f039847	Add 256-bit load/store recognition and matching in several places. llvm-svn: 135171	2011-07-14 18:50:58 +00:00
Bruno Cardoso Lopes	c0401dddf7	Make X86ISD::ANDNP more general and Codegen 256-bit VANDNP. A more general version of X86ISD::ANDNP also opened the room for a little bit of refactoring. llvm-svn: 135088	2011-07-13 21:36:51 +00:00
Bruno Cardoso Lopes	b98f50da03	The target specific node PANDN name is misleading. That happens because it's later selected to a ANDNPD/ANDNPS instruction instead of the PANDN instruction. Rename it. llvm-svn: 135087	2011-07-13 21:36:47 +00:00
Bruno Cardoso Lopes	cb49278ad6	AVX Codegen support for 256-bit versions of vandps, vandpd, vorps, vorpd, vxorps, vxorpd llvm-svn: 135023	2011-07-13 01:15:33 +00:00
Eli Friedman	9765ae0015	Add assembler/disassembler support for non-AVX pclmulqdq. While I'm here, use proper aliases for the pclmullqlqdq and friends. PR10269. llvm-svn: 134424	2011-07-05 18:21:20 +00:00
Eli Friedman	802029c494	Add support for movntil/movntiq mnemonics. Reported on llvmdev. llvm-svn: 133759	2011-06-23 21:07:47 +00:00
Nick Lewycky	8e5c09b7dc	Add support for assembling "movq" when it's correct to do so, while continuing to emit "movd" across the board to continue supporting a Darwin assembler bug. This is the reincarnation of r133452. llvm-svn: 133565	2011-06-21 22:45:41 +00:00
Bob Wilson	5b04895bb8	Revert r133452: "Emit movq for 64-bit register to XMM register moves..." This is breaking compiler-rt and llvm-gcc builds on MacOSX when not using the integrated assembler. llvm-svn: 133524	2011-06-21 17:35:13 +00:00
Nick Lewycky	831fb8200d	Emit movq for 64-bit register to XMM register moves, but continue to accept movd when assembling. llvm-svn: 133452	2011-06-20 18:33:26 +00:00
Bruno Cardoso Lopes	f52f4dd0b8	Add AVX support for fpextend. Original patch by Syoyo Fujita with more comments by me. llvm-svn: 133153	2011-06-16 07:03:21 +00:00
Bruno Cardoso Lopes	b6afc5168f	Add one more argument to the prefetch intrinsic to indicate whether it's a data or instruction cache access. Update the targets to match it and also teach autoupgrade. llvm-svn: 132976	2011-06-14 04:58:37 +00:00
Stuart Hastings	ea8b49dff3	Reapply 132424 with fixes. This fixes PR10068. rdar://problem/5993888 llvm-svn: 132606	2011-06-03 23:53:54 +00:00
Rafael Espindola	1299f014d4	Revert 132424 to fix PR10068. llvm-svn: 132479	2011-06-02 19:57:47 +00:00
Stuart Hastings	9a085fb9d8	Recommit 132404 with fixes. rdar://problem/5993888 llvm-svn: 132424	2011-06-01 21:33:14 +00:00
Stuart Hastings	4b33767382	Revert 132404 to appease a buildbot. rdar://problem/5993888 llvm-svn: 132419	2011-06-01 19:52:20 +00:00
Stuart Hastings	23f5ceda96	Add support for x86 CMPEQSS and friends. These instructions do a floating-point comparison, generate a mask of 0s or 1s, and generally DTRT with NaNs. Only profitable when the user wants a materialized 0 or 1 at runtime. rdar://problem/5993888 llvm-svn: 132404	2011-06-01 17:17:45 +00:00
Stuart Hastings	fdc9e4af68	FGETSIGN support for x86, using movmskps/pd. Will be enabled with a patch to TargetLowering.cpp. rdar://problem/5660695 llvm-svn: 132388	2011-06-01 04:39:42 +00:00
Chad Rosier	b87c4a6945	Renamed llvm.x86.sse42.crc32 intrinsics; crc64 doesn't exist. crc32.[8|16|32] have been renamed to .crc32.32.[8|16|32] and crc64.[8|16|32] have been renamed to .crc32.64.[8|64]. llvm-svn: 132163	2011-05-26 23:13:19 +00:00
Rafael Espindola	98372d430c	Don't produce a vmovntdq if we don't have AVX support. llvm-svn: 131330	2011-05-14 00:30:01 +00:00
Bill Wendling	67f5e8f0a7	Replace the "movnt" intrinsics with a native store + nontemporal metadata bit. <rdar://problem/8460511> llvm-svn: 130791	2011-05-03 21:11:17 +00:00
Eric Christopher	1de0dfaab0	xmm0 is an implicit parameter in this and so shouldn't be in the string template. Fixes rdar://8493866 llvm-svn: 130747	2011-05-03 01:28:32 +00:00
Chris Lattner	52e40d9b4c	clean up after Sean's r127646 patch. llvm-svn: 130475	2011-04-29 05:40:18 +00:00
Bill Wendling	0984f4927e	Reapply r129401 with patch for clang. llvm-svn: 129419	2011-04-13 00:36:11 +00:00
Bill Wendling	f6446a0961	Revert r129401 for now. Clang is using the old way of doing things. llvm-svn: 129403	2011-04-12 22:59:27 +00:00
Bill Wendling	f9c9d3e05b	Remove the unaligned load intrinsics in favor of using native unaligned loads. Now that we have a first-class way to represent unaligned loads, the unaligned load intrinsics are superfluous. First part of <rdar://problem/8460511>. llvm-svn: 129401	2011-04-12 22:46:31 +00:00
Sean Callanan	a38db2eeda	Enabled disassembler support for AVX instructions in the instruction tables and fixed a few bugs that were causing decode conflicts. Rudimentary tests are coming up in the next patch. llvm-svn: 127646	2011-03-15 01:28:15 +00:00
David Greene	2fd6d03bc9	[AVX] Fix mask predicates for 256-bit UNPCKLPS/D and implement missing patterns for them. Add a SIMD test subdirectory to hold tests for SIMD instruction selection correctness and quality. llvm-svn: 126845	2011-03-02 17:23:43 +00:00
Joerg Sonnenberger	efa8090e2a	Recognize monitor/mwait with explicit register arguments llvm-svn: 125805	2011-02-18 00:48:11 +00:00
David Greene	7de7347ee8	[AVX] Support VINSERTF128 with more patterns and appropriate infrastructure. This makes lowering 256-bit vectors to 128-bit vectors simple when 256-bit vector support is not available. llvm-svn: 124868	2011-02-04 16:08:29 +00:00
David Greene	2753be260c	[AVX] VEXTRACTF128 support. This commit includes patterns for matching EXTRACT_SUBVECTOR to VEXTRACTF128 along with support routines to examine and translate index values. VINSERTF128 comes next. With these two in place we can begin supporting more AVX operations as INSERT/EXTRACT can be used as a fallback when 256-bit support is not available. llvm-svn: 124797	2011-02-03 15:50:00 +00:00
Chris Lattner	9ba0a83f2b	fix a missing shuffle pattern, PR9009. Patch by Artiom Myaskouvskey! llvm-svn: 124102	2011-01-24 03:42:46 +00:00
Chris Lattner	586e7af07d	Fix PR8946, a missing reg/reg form of movdqu. llvm-svn: 123242	2011-01-11 17:04:55 +00:00
Chris Lattner	3ef9db5cd4	fix PR8900, a shuffle miscompilation. Patch by Nadav Rotem! llvm-svn: 122921	2011-01-05 22:28:46 +00:00
Nate Begeman	c7dfecb10e	Implement feedback from Bruno on making pblendvb an x86-specific ISD node in addition to being an intrinsic, and convert lowering to use it. Hopefully the pattern fragment is doing the right thing with XMM0, looks correct in testing. llvm-svn: 122277	2010-12-20 22:04:24 +00:00
Nate Begeman	ef5f3c0fa7	Add support for matching psign & pblendvb to the x86 target. Remove unnecessary pandn patterns, 'vnot' patfrag looks through bitcasts llvm-svn: 122098	2010-12-17 22:55:37 +00:00
Nate Begeman	8c00ecd290	Add some missing predicates. llvm-svn: 121445	2010-12-10 00:54:26 +00:00
Nate Begeman	cb6d1c8193	Formalize the notion that AVX and SSE are non-overlapping extensions from the compiler's point of view. Per email discussion, we either want to always use VEX-prefixed instructions or never use them, and are taking "HasAVX" to mean "Always use VEX". Passing -mattr=-avx,+sse42 should serve to restore legacy SSE support when desirable. llvm-svn: 121439	2010-12-10 00:26:57 +00:00
Nate Begeman	4a62a3e229	Add support for AVX to materialize +0.0 when doing scalar FP. llvm-svn: 121415	2010-12-09 21:43:51 +00:00
Benjamin Kramer	851691ddb2	Add patterns for the x86 popcnt instruction. - Also adds a new POPCNT subtarget feature that is currently enabled if the target supports SSE4.2 (nehalem) or SSE4A (barcelona). llvm-svn: 120917	2010-12-04 20:32:23 +00:00
Nate Begeman	deb26223bd	Scalar f32/f64 are also subregs of ymm regs llvm-svn: 120844	2010-12-03 21:54:39 +00:00
Eric Christopher	6a21ceab5c	Implement a PseudoI class and transfer the sse instructions over to use it. llvm-svn: 120412	2010-11-30 08:57:23 +00:00
Eric Christopher	f27f0b5234	Rewrite mwait and monitor support and custom lower arguments. Fixes PR8573. llvm-svn: 120404	2010-11-30 07:20:12 +00:00
Bruno Cardoso Lopes	9f9f796756	Fix PR8211 llvm-svn: 118445	2010-11-08 21:24:59 +00:00
Dale Johannesen	b78530f9b0	Fix pastos in handling of AVX cvttsd2si, PR8491. Bruno, please review, but I'm pretty sure this is right. Patch by Alex Mac! llvm-svn: 117514	2010-10-28 00:35:54 +00:00
Chris Lattner	72e7e84c3f	simplify some map operations. llvm-svn: 116014	2010-10-07 23:57:02 +00:00
Evan Cheng	7c89d70f27	Canonicalize X86ISD::MOVDDUP nodes to v2f64 to make sure all cases match. Also eliminate unneeded isel patterns. rdar://8520311 llvm-svn: 115977	2010-10-07 20:50:20 +00:00
Chris Lattner	84846b71af	remove the !nameconcat tblgen feature. It is "shorthand" and only used in 4 places where !cast is just as short. llvm-svn: 115722	2010-10-06 00:19:21 +00:00
Chris Lattner	12274b9845	allow !strconcat to take more than two operands to eliminate !strconcat(!strconcat(!strconcat(!strconcat Simplify some x86 td files to use it. llvm-svn: 115719	2010-10-05 23:58:18 +00:00
Chris Lattner	5d7d5a81eb	distribute the rest of the contents of X86Instr64bit.td out to the right places. X86Instr64bit.td now dies, long live x86-64! llvm-svn: 115669	2010-10-05 20:49:15 +00:00
Chris Lattner	9317bf2ed5	move CMOV_FR32 and friends to InstrCompiler, since they are pseudo instructions. Move POPCNT to InstrSSE since they are SSE4 instructions. llvm-svn: 115603	2010-10-05 06:41:40 +00:00
Chris Lattner	9c58de2dc4	fix rdar://8490728 - llvm-mc rejects gpr64 form of 'movmskpd' llvm-svn: 115029	2010-09-29 05:05:03 +00:00
Chris Lattner	890c21a20a	add assembler support for the cvtsd2sil/cvtsd2siq mnemonics, rdar://8456382 llvm-svn: 115027	2010-09-29 04:55:40 +00:00
Chris Lattner	c14d59589c	add basic avx support to the disassembler, also teach it about ssmem/sdmem operands. With this done, we can remove the _Int suffixes from the round instructions without the disassembler blowing up. This allows the assembler to support them, implementing rdar://8456376 - llvm-mc rejects 'roundss' llvm-svn: 115019	2010-09-29 02:57:56 +00:00
Chris Lattner	f90296b045	add asmparser support for cvttpd2dq by removing some Int_ prefixes. Clean up cvttps2dq by removing some redundant implementations of the same instruction. rdar://8456382 llvm-svn: 115018	2010-09-29 02:36:32 +00:00
Chris Lattner	e5c5c8dc1f	implement rdar://8456382 - cvtsd2si support, by removing some Int_ prefixes. llvm-svn: 115017	2010-09-29 02:24:57 +00:00
Dale Johannesen	eb807a15a3	Fix typos. 128-bit PSHUFB takes 128-bit memory op. v8i16 is not an MMX type; put it where it belongs. llvm-svn: 113785	2010-09-13 21:15:43 +00:00
Bruno Cardoso Lopes	49efee5c95	Add one more pattern to fallback movddup llvm-svn: 113522	2010-09-09 18:48:34 +00:00
Dale Johannesen	de53df20d6	Move remaining MMX instructions from SSE to MMX. llvm-svn: 113501	2010-09-09 17:13:07 +00:00
Dale Johannesen	7469923117	Move most MMX instructions (defined as anything that uses MMX, even if it also uses other things) from InstrSSE into InstrMMX. No (intended) functional change. llvm-svn: 113462	2010-09-09 01:02:39 +00:00
Bruno Cardoso Lopes	892c337123	x86 vector shuffle lowering now relies only on target specific nodes to emit shuffles and doesn't do isel mask matching anymore. - Add the selection of the remaining shuffle opcode (movddup) - Introduce two new functions to "recognize" where we may get potential folds and add several comments to them explaining why they are not yet in the desired shape. - Add more patterns to fallback the case where we select a specific shuffle opcode as if it could fold a load, but it can't, so remap to a valid instruction. - Add a couple of FIXMEs to address in the following days once there's a good solution to the current folding problem. llvm-svn: 113369	2010-09-08 17:43:25 +00:00
Dale Johannesen	8354cab2de	Add patterns for MMX that use the new intrinsics. Enable palignr intrinsic. These may need adjustment for a new VT in due course. llvm-svn: 113233	2010-09-07 18:10:56 +00:00
Bruno Cardoso Lopes	92bb02f722	Remove unused target specific node llvm-svn: 113224	2010-09-07 17:38:55 +00:00
Dale Johannesen	2f4f8f5705	Remove the rest of the nonexistent 64-bit AVX instructions. Bruno, please review. llvm-svn: 113014	2010-09-03 21:23:00 +00:00
Bruno Cardoso Lopes	b8ce8b7e9f	Reapply last harmless part of r112934, the pattern fragment to match X86Unpcklpd llvm-svn: 113009	2010-09-03 20:44:26 +00:00
Daniel Dunbar	26e0e964ab	Revert r112934, "- Use specific nodes to match unpckl masks.", which introduced some infinite loop and select failures. - Apologies for eager reverting, but it's branch day. llvm-svn: 113000	2010-09-03 19:38:11 +00:00
Bruno Cardoso Lopes	f91bd70e9a	AVX supports neither mm operations nor their intrinsics. The AVX versions of PALIGN and PABS* should only exist for 128-bit. Remove the unnecessary stuff. llvm-svn: 112944	2010-09-03 02:08:45 +00:00
Bruno Cardoso Lopes	e1ad6555a8	- Use specific nodes to match unpckl masks. - Teach getShuffleScalarElt how to handle more target specific nodes, so the DAGCombine can make use of it. - Add another hack to avoid the node update problem during legalization. More description on the comments llvm-svn: 112934	2010-09-03 01:24:00 +00:00
Bruno Cardoso Lopes	dcdab94661	become more strict about when it's safe to use X86ISD::MOVLPS llvm-svn: 112799	2010-09-02 02:35:51 +00:00
Bruno Cardoso Lopes	601bf4c6d3	Using target specific nodes for shuffle nodes makes the mask check more strict, breaking some cases not checked in the testsuite, but also exposes some foldings not done before, as this example: movaps (%rdi), %xmm0 movaps (%rax), %xmm1 movaps %xmm0, %xmm2 movss %xmm1, %xmm2 shufps $36, %xmm2, %xmm0 now is generated as: movaps (%rdi), %xmm0 movaps %xmm0, %xmm1 movlps (%rax), %xmm1 shufps $36, %xmm1, %xmm0 llvm-svn: 112753	2010-09-01 22:33:20 +00:00
Bruno Cardoso Lopes	9375b2f67d	Use movlps, movlpd, movss and movsd specific nodes instead of pattern matching with movlp pattern fragment llvm-svn: 112694	2010-09-01 05:08:25 +00:00
Bruno Cardoso Lopes	80613a070e	Use x86 specific MOVSLDUP node, add more patterns to match it and remove useless load nodes llvm-svn: 112661	2010-08-31 22:35:05 +00:00
Bruno Cardoso Lopes	8fc83b1960	Use x86 specific MOVSHDUP node and add more patterns to match it llvm-svn: 112657	2010-08-31 22:22:11 +00:00
Bruno Cardoso Lopes	6fbe7b9ddd	Use MOVLHPS and MOVHLPS x86 nodes whenever possible. Also remove some useless nodes llvm-svn: 112642	2010-08-31 21:15:21 +00:00
Bruno Cardoso Lopes	7939025262	Use pshufhw and pshuflw in more cases and fix getTargetShuffleNode number of arguments llvm-svn: 111890	2010-08-24 01:16:15 +00:00
Bruno Cardoso Lopes	28d9071635	This is the first step towards refactoring the x86 vector shuffle code. The general idea here is to have a group of x86 target specific nodes which are going to be selected during lowering and then directly matched in isel. The commit includes the addition of those specific nodes and a bunch of patterns, and incrementally we're going to switch between them and what we have right now. Both the patterns and target specific nodes can change as we move forward with this work. llvm-svn: 111691	2010-08-20 22:55:05 +00:00
Dale Johannesen	3f9c148d0e	Revert 110491. While not wrong, it was based on a misanalysis and is undesirable. llvm-svn: 111028	2010-08-13 18:43:45 +00:00
Bruno Cardoso Lopes	de5f3f5cb6	Improve comment to make explicit why not to touch this code before JIT goes MC llvm-svn: 111021	2010-08-13 17:44:10 +00:00
Eric Christopher	63c83f19a0	Revert last patch and r110954 as I meant to. llvm-svn: 111001	2010-08-13 02:37:50 +00:00
Bruno Cardoso Lopes	350d186d69	Some small clean-up: use of pseudo instructions llvm-svn: 110954	2010-08-12 20:55:18 +00:00
Bruno Cardoso Lopes	7cb26cb8be	- Teach SSEDomainFix to switch between different levels of AVX instructions. Here we guess that AVX will have domain issues, so just implement them for consistency and in the future we remove if it's unnecessary. - Make foldMemoryOperandImpl aware of 256-bit zero vectors folding and support the 128-bit counterparts of AVX too. - Make sure MOV[AU]PS instructions are only selected when SSE1 is enabled, and duplicate the patterns to match AVX. - Add a testcase for a simple 128-bit zero vector creation. llvm-svn: 110946	2010-08-12 20:20:53 +00:00
Bruno Cardoso Lopes	99b5298854	Define AVX 128-bit pattern versions of SET0PS/PD. llvm-svn: 110937	2010-08-12 18:20:59 +00:00
Bruno Cardoso Lopes	bb491bd56c	Begin to support some vector operations for AVX 256-bit instructions. The long term goal here is to be able to match enough of vector_shuffle and build_vector so all avx intrinsics which aren't mapped to their own built-ins but to shufflevector calls can be codegen'd. This is the first (baby) step, support building zeroed vectors. llvm-svn: 110897	2010-08-12 02:06:36 +00:00
Bruno Cardoso Lopes	6eb24fd744	Add AVX matching patterns to Packed Bit Test intrinsics. Apply the same approach of SSE4.1 ptest intrinsics but create a new x86 node "testp" since AVX introduces vtest{ps}{pd} instructions which set ZF and CF depending on sign bit AND and ANDN of packed floating-point sources. This is slightly different from what the "ptest" does. Tests coming with the other 256 intrinsics tests. llvm-svn: 110744	2010-08-10 23:25:42 +00:00
Bruno Cardoso Lopes	f1928b60c0	Add AVX movnt{pd,ps,dq} 256-bit intrinsics llvm-svn: 110650	2010-08-10 02:49:24 +00:00
Bruno Cardoso Lopes	f5884c6791	Add AVX movmsk 256-bit intrinsics llvm-svn: 110648	2010-08-10 02:34:56 +00:00
Bruno Cardoso Lopes	2a7ed4b5c9	Support AVX 256-bit load and store intrinsics llvm-svn: 110645	2010-08-10 01:43:16 +00:00
Bruno Cardoso Lopes	1ea37cfa7b	Patterns to match AVX cmp instructions llvm-svn: 110633	2010-08-10 00:13:20 +00:00
Bruno Cardoso Lopes	4e8d77892c	Add matching patterns for vblend AVX intrinsics llvm-svn: 110630	2010-08-10 00:02:05 +00:00
Bruno Cardoso Lopes	e58d077846	Add VCVTPD2PS, VCVTPS2DQ, VCVTPS2PDY, VCVTTPD2DQY, VCVTTPS2DQ and VCVTPD2DQ 256-bit conversion intrinsics llvm-svn: 110608	2010-08-09 21:51:56 +00:00
Bruno Cardoso Lopes	e7ceec4edf	Add patterns to AVX conversion instructions. Do that instead of declaring more instructions whenever possible; more coming llvm-svn: 110605	2010-08-09 21:24:59 +00:00
Bruno Cardoso Lopes	6a92e01d05	Memory version of vcvtdq2pd intrinsic llvm-svn: 110582	2010-08-09 18:20:14 +00:00
Bruno Cardoso Lopes	0794b8ab3f	Patterns to match vinsert, vbroadcast, vmovmask and vcvtdq2pd AVX intrinsics llvm-svn: 110580	2010-08-09 18:03:43 +00:00
Dale Johannesen	23f9086dd3	Use sdmem and sse_load_f64 (etc.) for the vector form of CMPSD (etc.) Matching a 128-bit memory operand is wrong, the instruction uses only 64 bits (same as ADDSD etc.) 8193553. llvm-svn: 110491	2010-08-07 00:33:42 +00:00
Bruno Cardoso Lopes	5b602f8822	Patterns to match AVX 256-bit vzero intrinsics llvm-svn: 110480	2010-08-06 22:10:01 +00:00
Bruno Cardoso Lopes	821eebf946	Patterns to match AVX 256-bit permutation intrinsics llvm-svn: 110468	2010-08-06 20:03:27 +00:00
Bruno Cardoso Lopes	d186fba555	Patterns to match AVX 256-bit horizontal arithmetic intrinsics llvm-svn: 110427	2010-08-06 02:10:30 +00:00
Bruno Cardoso Lopes	5e9f9c921e	Patterns to match AVX 256-bit arithmetic intrinsics llvm-svn: 110425	2010-08-06 01:52:29 +00:00
Bruno Cardoso Lopes	0c0dd2173c	Support all 128-bit AVX vector intrinsics. Most part of them I already declared during the addition of the assembler support, the additional changes are: - Add missing intrinsics - Move all SSE conversion instructions in X86InstInfo64.td to the SSE.td file. - Duplicate some patterns to AVX mode. - Step into PCMPEST/PCMPIST custom inserter and add AVX versions. llvm-svn: 109878	2010-07-30 19:54:33 +00:00
Bruno Cardoso Lopes	a80b57e5fc	Add AVX version of CLMUL instructions llvm-svn: 109248	2010-07-23 18:41:12 +00:00
Bruno Cardoso Lopes	93fd8bdf6a	Fix some AVX instructions which didn't have HasAVX prefix. And also a problem with PINSRW, which was totally wrong because of a typo I introduced previously llvm-svn: 109198	2010-07-23 00:14:54 +00:00
Bruno Cardoso Lopes	7722724eee	Add remaining AVX instructions (most of them dealing with GR64 destinations). This completes the assembler support for the general AVX ISA. But we still miss instructions from FMA3 and CLMUL specific feature flags, which are now the next step llvm-svn: 109168	2010-07-22 21:18:49 +00:00
Eric Christopher	4924d5fb93	Custom lower the memory barrier instructions and add support for lowering without sse2. Add a couple of new testcases. Fixes a few libgomp tests and latent bugs. Remove a few todos. llvm-svn: 109078	2010-07-22 02:48:34 +00:00
Bruno Cardoso Lopes	5920e38cd2	Add more 256-bit forms for a bunch of regular AVX instructions Add 64-bit (GR64) versions of some instructions (which are not described in their SSE forms, but are described in AVX) llvm-svn: 109063	2010-07-21 23:53:50 +00:00
Bruno Cardoso Lopes	eea3b7ed83	Add missing AVX convert instructions. Those instructions are not described in their SSE forms (although they exist), but add the AVX forms anyway, so the assembler can benefit from it llvm-svn: 109039	2010-07-21 21:37:59 +00:00
Bruno Cardoso Lopes	1284fdc932	Prevent AVX instructions from being selected instead of their SSE forms llvm-svn: 109032	2010-07-21 20:38:42 +00:00
Bruno Cardoso Lopes	d13d8c2562	Add AVX only vzeroall and vzeroupper instructions llvm-svn: 109002	2010-07-21 08:56:24 +00:00
Bruno Cardoso Lopes	c4d93a5a34	Add new AVX vpermilps, vpermilpd and vperm2f128 instructions llvm-svn: 108984	2010-07-21 03:07:42 +00:00
Bruno Cardoso Lopes	a7efb29695	Add new AVX vmaskmov instructions, and also fix the VEX encoding bits to support it llvm-svn: 108983	2010-07-21 02:46:58 +00:00
Bruno Cardoso Lopes	e0dce1c741	Add new AVX vextractf128 instructions llvm-svn: 108964	2010-07-20 23:19:02 +00:00
Bruno Cardoso Lopes	b677cbc9b2	Add new AVX instruction vinsertf128 llvm-svn: 108892	2010-07-20 19:44:51 +00:00
Bruno Cardoso Lopes	88869cb4db	Add AVX vbroadcast new instruction llvm-svn: 108788	2010-07-20 00:11:13 +00:00
Bruno Cardoso Lopes	4ca44dda21	Add 256-bit vaddsub, vhadd, vhsub, vblend and vdpp instructions! llvm-svn: 108769	2010-07-19 23:32:44 +00:00
Bruno Cardoso Lopes	0616a418b6	Add AVX 256-bit compare instructions and a bunch of testcases llvm-svn: 108286	2010-07-13 22:06:38 +00:00
Bruno Cardoso Lopes	7bc71d2d0a	AVX 256-bit conversion instructions Add the x86 VEX_L form to handle special cases where VEX_L must be set. llvm-svn: 108274	2010-07-13 21:07:28 +00:00
Bruno Cardoso Lopes	ae37153b05	Add AVX 256-bit packed logical forms llvm-svn: 108224	2010-07-13 02:38:35 +00:00
Bruno Cardoso Lopes	495ae629bb	Add AVX 256-bit unop arithmetic instructions llvm-svn: 108223	2010-07-13 01:53:31 +00:00
Bruno Cardoso Lopes	185483638b	Since AVX is a superset of all SSE versions, only use HasAVX for AVX instructions llvm-svn: 108222	2010-07-13 00:38:47 +00:00
David Greene	d81591ee09	Move some SIMD fragment code into X86InstrFragmentsSIMD so that the utility classes can be used from multiple files. This will aid transitioning to a new refactored x86 SIMD specification. llvm-svn: 108213	2010-07-12 23:41:28 +00:00
Bruno Cardoso Lopes	852e3bf472	Add AVX 256 binary arithmetic instructions llvm-svn: 108207	2010-07-12 23:04:15 +00:00
Bruno Cardoso Lopes	b021506033	More refactoring of basic SSE arith instructions. Open room for 256-bit instructions llvm-svn: 108204	2010-07-12 22:41:32 +00:00
Dan Gohman	e9c4426bb0	Apply the SSE dependence idiom for SSE unary operations to SD instructions too, in addition to SS instructions. And add a comment about it. llvm-svn: 108191	2010-07-12 20:46:04 +00:00
Bruno Cardoso Lopes	a4889e6f93	Add AVX 256-bit MOVMSK forms llvm-svn: 108184	2010-07-12 20:06:32 +00:00
Bruno Cardoso Lopes	f4180a9a7b	Add AVX 256-bit packed MOVNT variants llvm-svn: 108021	2010-07-09 21:42:42 +00:00
Bruno Cardoso Lopes	6ca8dc935c	Add AVX 256-bit unpack and interleave llvm-svn: 108017	2010-07-09 21:20:35 +00:00
Bruno Cardoso Lopes	3676e24b67	Start the support for AVX instructions with 256-bit %ymm registers. A couple of notes: - The instructions are being added with dummy placeholder patterns using some 256 specifiers, this is not meant to work now, but since there are some multiclasses generic enough to accept them, when we go for codegen, the stuff will be already there. - Add VEX encoding bits to support YMM - Add MOVUPS and MOVAPS in the first round - Use "Y" as suffix for those Instructions: MOVUPSYrr, ... - All AVX instructions in X86InstrSSE.td will move soon to a new X86InstrAVX file. llvm-svn: 107996	2010-07-09 18:27:43 +00:00
Bruno Cardoso Lopes	8d350872d4	Add AVX AES instructions llvm-svn: 107798	2010-07-07 18:24:20 +00:00
Bruno Cardoso Lopes	6222076cd1	Add AVX SSE4.2 instructions llvm-svn: 107752	2010-07-07 03:39:29 +00:00
Bruno Cardoso Lopes	931471d7e8	Use only one multiclass to pinsrq instructions llvm-svn: 107750	2010-07-07 01:43:01 +00:00
Bruno Cardoso Lopes	65fbd0530f	Now that almost all SSE4.1 AVX instructions are added, move code around to more appropriate sections. No functionality changes llvm-svn: 107749	2010-07-07 01:33:38 +00:00
Bruno Cardoso Lopes	675ebe2dc0	Add AVX SSE4.1 insertps, ptest and movntdqa instructions llvm-svn: 107747	2010-07-07 01:14:56 +00:00
Bruno Cardoso Lopes	fa10461265	Add AVX SSE4.1 extractps and pinsr instructions llvm-svn: 107746	2010-07-07 01:01:13 +00:00
Bruno Cardoso Lopes	54c2f858b3	Add AVX SSE4.1 Extract Integer instructions llvm-svn: 107740	2010-07-07 00:07:24 +00:00
Bruno Cardoso Lopes	b9e1c33054	Add the rest of AVX SSE4.1 packed move with sign/zero extend instructions llvm-svn: 107723	2010-07-06 23:15:17 +00:00
Bruno Cardoso Lopes	0c6ec0b068	Add part of AVX SSE4.1 packed move with sign/zero extend instructions llvm-svn: 107720	2010-07-06 23:01:41 +00:00
Bruno Cardoso Lopes	a0b37e839c	Add AVX vblendvpd, vblendvps and vpblendvb instructions. Update VEX encoding to support those new instructions. llvm-svn: 107715	2010-07-06 22:36:24 +00:00
Chris Lattner	e7c95bcd9e	rip out even more sporadic v2f32 support. llvm-svn: 107610	2010-07-05 04:38:33 +00:00
Bill Wendling	689155c673	Revert r107583. I no longer think that this is the way to solve the problem. llvm-svn: 107585	2010-07-04 09:16:57 +00:00
Bill Wendling	8a3ecba7a4	Mark sse_load_f32 and sse_load_f64 as having memory operands (SDNPMemOperand). This way when they're morphed the memory operands will be copied as well. llvm-svn: 107583	2010-07-04 08:59:55 +00:00
Bruno Cardoso Lopes	dc16024895	Add AVX SSE4.1 blend, mpsadbw and vdp llvm-svn: 107560	2010-07-03 01:37:03 +00:00
Bruno Cardoso Lopes	9cbb625579	Add AVX SSE4.1 binop (some forms of packed max,min,mul,pack,cmp) instructions llvm-svn: 107558	2010-07-03 01:15:47 +00:00
Bruno Cardoso Lopes	df02d037e4	Add AVX SSE4.1 Horizontal Minimum and Position instruction llvm-svn: 107552	2010-07-03 00:49:21 +00:00
Bruno Cardoso Lopes	e6b70efcb0	Add AVX SSE4.1 round instructions llvm-svn: 107549	2010-07-03 00:37:44 +00:00
Bruno Cardoso Lopes	473863e456	Simple refactoring of SSE4.1 instructions, making room for the AVX forms llvm-svn: 107540	2010-07-02 23:27:59 +00:00
Bruno Cardoso Lopes	4931e183b5	- Add support for the rest of AVX SSE3 instructions - Fix VEX prefix to be emitted with 3 bytes whenever VEX_5M represents a REX equivalent two byte leading opcode llvm-svn: 107523	2010-07-02 22:06:54 +00:00
Bruno Cardoso Lopes	c5670fcb23	Shrink down SSE3 code by more multiclass refactoring llvm-svn: 107448	2010-07-01 23:10:49 +00:00
Bruno Cardoso Lopes	c215186088	Shrink down SSE3 code by some multiclass refactoring - 1st part llvm-svn: 107438	2010-07-01 22:33:18 +00:00
Bruno Cardoso Lopes	511e5f47de	Move SSE3 Move patterns to a more appropriate section. Add AVX SSE3 packed horizontal add & sub instructions. llvm-svn: 107405	2010-07-01 17:35:02 +00:00
Bruno Cardoso Lopes	0a3048e8b9	Add AVX SSE3 packed addsub instructions llvm-svn: 107404	2010-07-01 17:08:18 +00:00
Bruno Cardoso Lopes	c1abe91367	Add AVX SSE3 replicate and convert instructions llvm-svn: 107375	2010-07-01 02:33:39 +00:00
Bruno Cardoso Lopes	956316a3d7	- Add AVX SSE2 Move doubleword and quadword instructions. - Add encode bits for VEX_W - All 128-bit SSE 1 & SSE2 instructions that are described in the .td file now have a AVX encoded form already working. llvm-svn: 107365	2010-07-01 01:20:06 +00:00
Bruno Cardoso Lopes	7ae1ebd3b4	Move MOVD/MOVQ code around, creating sections for each of them llvm-svn: 107308	2010-06-30 18:49:10 +00:00
Bruno Cardoso Lopes	f8855c22be	Add AVX SSE2 mask creation and conditional store instructions llvm-svn: 107306	2010-06-30 18:38:10 +00:00
Bruno Cardoso Lopes	6c468039a2	Fix a bug introduced in r107211 where instructions with memory operands are declared as commutable llvm-svn: 107300	2010-06-30 18:06:01 +00:00
Bruno Cardoso Lopes	3c02702830	Add AVX SSE2 packed integer extract/insert instructions llvm-svn: 107293	2010-06-30 17:03:03 +00:00
Bruno Cardoso Lopes	39594cc5d0	Add AVX SSE2 integer unpack instructions llvm-svn: 107246	2010-06-30 04:06:39 +00:00
Bruno Cardoso Lopes	419f8f29c3	Add AVX SSE2 packed integer shuffle instructions llvm-svn: 107245	2010-06-30 03:47:56 +00:00
Bruno Cardoso Lopes	c2f5cd2389	Small refactoring of SSE2 packed integer shuffle instructions llvm-svn: 107243	2010-06-30 03:29:36 +00:00
Bruno Cardoso Lopes	d9acb34aa2	Add AVX SSE2 pack with saturation integer instructions llvm-svn: 107241	2010-06-30 02:30:25 +00:00
Bruno Cardoso Lopes	c470ba9937	Add AVX SSE2 integer packed compare instructions llvm-svn: 107240	2010-06-30 02:21:09 +00:00
Bruno Cardoso Lopes	cfbebb3921	- Add AVX form of all SSE2 logical instructions - Add VEX encoding bits to x86 MRM0r-MRM7r llvm-svn: 107238	2010-06-30 01:58:37 +00:00
Bruno Cardoso Lopes	2439877e05	Add several AVX integer packed binop instructions llvm-svn: 107225	2010-06-29 23:47:49 +00:00
Bruno Cardoso Lopes	b80121d316	Move SSE2 Packed Integer instructions around, and create specific sections for each of them llvm-svn: 107211	2010-06-29 22:12:16 +00:00

806 Commits