mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-26 06:22:56 +02:00
Commit Graph

1189 Commits

Author SHA1 Message Date
Juergen Ributzka
f7fcca505b Add the "vbroadcasti128" instruction back.
This is a follow-up to r231182. This adds the "vbroadcasti128" instruction
back, but without the intrinsic mapping. Also add a test to check the
instruction encoding.

This is related to rdar://problem/18742778.

llvm-svn: 231945
2015-03-11 17:29:03 +00:00
Elena Demikhovsky
320252ae4d AVX-512: Added SKX forms of shift instructions.
Added rotation instructions, encoding only.
Added encoding tests for all these forms.

llvm-svn: 231916
2015-03-11 10:25:42 +00:00
Ahmed Bougacha
a07940a0f3 [X86] Remove stale comment. NFC.
It turns out 256-bit V[SZ]EXT nodes are still
generated by the new shuffle lowering, so this
is here to stay!

llvm-svn: 231422
2015-03-05 23:18:41 +00:00
Juergen Ributzka
a0c556be5c Remove 'llvm.x86.avx2.vbroadcasti128' intrinsic.
The intrinsic is no longer generated by the front-end. Remove the intrinsic and
auto-upgrade it to a vector shuffle.

Reviewed by Nadav

This is related to rdar://problem/18742778.

llvm-svn: 231182
2015-03-04 00:13:25 +00:00
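
The shuffle form the auto-upgrade produces is roughly the following sketch (modern IR syntax, illustrative names): a 128-bit load whose value is repeated in both halves of a 256-bit vector, which the AVX2 backend can still select as vbroadcasti128.

define <4 x i64> @bcast128(ptr %p) {
  ; 128-bit load broadcast to both 256-bit lanes
  %v = load <2 x i64>, ptr %p, align 1
  %b = shufflevector <2 x i64> %v, <2 x i64> undef, <4 x i32> <i32 0, i32 1, i32 0, i32 1>
  ret <4 x i64> %b
}
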
Elena Demikhovsky
e032aa37b6 AVX-512: Added mask and rounding mode for scalar arithmetics
Added more tests for scalar instructions to distinguish between AVX and AVX-512 forms.

llvm-svn: 230891
2015-03-01 07:44:04 +00:00
Craig Topper
c3d656cc84 [X86] Remove the blendpd/blendps/pblendw/pblendd intrinsics. They can be represented by shuffle_vector instructions.
llvm-svn: 230860
2015-02-28 19:33:17 +00:00
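
For example, blendpd with immediate 2 (element 0 from the first source, element 1 from the second) corresponds to a shufflevector like this sketch (illustrative names):

define <2 x double> @blendpd2(<2 x double> %a, <2 x double> %b) {
  ; element 0 from %a, element 1 from %b -- selectable as blendpd $2
  %r = shufflevector <2 x double> %a, <2 x double> %b, <2 x i32> <i32 0, i32 3>
  ret <2 x double> %r
}
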
Elena Demikhovsky
b180c77ca9 restructured X86 scalar unary operation templates
I made the templates general, so there is no need to define a pattern separately for each instruction/intrinsic.
Now we only need to add the r_Int pattern for AVX.

llvm-svn: 230221
2015-02-23 14:14:02 +00:00
Craig Topper
f4cb7a4e70 [X86] Add some missing redundant MMX and SSE encodings for disassembler.
llvm-svn: 230165
2015-02-22 07:50:41 +00:00
Sanjay Patel
e02545175a canonicalize a v2f64 blendi of 2 registers
This canonicalization step saves us 3 pattern matching possibilities * 4 math ops
for scalar FP math that uses xmm regs. The backend can re-commute the operands
post-instruction-selection if that makes register allocation better.

The tests in llvm/test/CodeGen/X86/sse-scalar-fp-arith.ll cover this scenario already,
so there are no new tests with this patch.

Differential Revision: http://reviews.llvm.org/D7777

llvm-svn: 230024
2015-02-20 16:55:27 +00:00
Craig Topper
398dc737fa [X86] Remove AVX2 and SSE2 pslldq and psrldq intrinsics. We can represent them in IR with vector shuffles now. All their uses have been removed from clang in favor of shuffles.
llvm-svn: 229640
2015-02-18 06:24:44 +00:00
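
The shuffle form pairs the source with a zero vector; a 4-byte psrldq, for instance, looks roughly like this sketch (illustrative names):

define <16 x i8> @psrldq4(<16 x i8> %a) {
  ; byte shift right by 4: the low 12 bytes come from %a, the top 4 are zero
  %r = shufflevector <16 x i8> %a, <16 x i8> zeroinitializer,
       <16 x i32> <i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11,
                   i32 12, i32 13, i32 14, i32 15, i32 16, i32 17, i32 18, i32 19>
  ret <16 x i8> %r
}
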
Sanjay Patel
5abe0e38c5 prevent folding a scalar FP load into a packed logical FP instruction (PR22371)
Change the memory operands in sse12_fp_packed_scalar_logical_alias from scalars to vectors. 
That's what the hardware packed logical FP instructions define: 128-bit memory operands.
There are no scalar versions of these instructions...because this is x86.

Generating the wrong code (folding a scalar load into a 128-bit load) is still possible
using the peephole optimization pass and the load folding tables. We won't completely
solve this bug until we either fix the lowering in fabs/fneg/fcopysign and any other
places where scalar FP logic is created or fix the load folding in foldMemoryOperandImpl()
to make sure it isn't changing the size of the load.

Differential Revision: http://reviews.llvm.org/D7474

llvm-svn: 229531
2015-02-17 20:08:21 +00:00
Simon Pilgrim
e9778d75a1 [X86][SSE] Add SSE MOVQ instructions to SSEPackedInt domain
Patch to explicitly add the SSE MOVQ (rr,mr,rm) instructions to SSEPackedInt domain - prevents a number of costly domain switches.

Differential Revision: http://reviews.llvm.org/D7600

llvm-svn: 229439
2015-02-16 21:50:56 +00:00
Craig Topper
b6e168f770 [X86] Remove the multiply by 8 that goes into the shift constant for X86ISD::VSHLDQ and X86ISD::VSRLDQ. This simplifies the pattern matching in isel and allows these nodes to become the patterns embedded in the instruction.
llvm-svn: 229431
2015-02-16 20:52:07 +00:00
Craig Topper
3f6d0f923d [X86] Remove x86.avx2.psll.dq.bs and x86.avx2.psrl.dq.bs intrinsics.
llvm-svn: 229430
2015-02-16 20:51:59 +00:00
Simon Pilgrim
ddbf019542 [X86][AVX2] vpslldq/vpsrldq byte shifts for AVX2
This patch refactors the existing lowerVectorShuffleAsByteShift function to add support for 256-bit vectors on AVX2 targets.

It also fixes a tablegen issue that prevented the lowering of vpslldq/vpsrldq vec256 instructions.

Differential Revision: http://reviews.llvm.org/D7596

llvm-svn: 229311
2015-02-15 13:19:52 +00:00
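
Note that the AVX2 byte shifts operate within each 128-bit lane, so the equivalent 256-bit shuffle mask repeats per lane; an 8-byte vpslldq corresponds roughly to this sketch (illustrative names):

define <32 x i8> @vpslldq8(<32 x i8> %a) {
  ; shift each 128-bit lane left by 8 bytes: zeros fill the low half of each lane
  %r = shufflevector <32 x i8> %a, <32 x i8> zeroinitializer,
       <32 x i32> <i32 32, i32 33, i32 34, i32 35, i32 36, i32 37, i32 38, i32 39,
                   i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7,
                   i32 40, i32 41, i32 42, i32 43, i32 44, i32 45, i32 46, i32 47,
                   i32 16, i32 17, i32 18, i32 19, i32 20, i32 21, i32 22, i32 23>
  ret <32 x i8> %r
}
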
Sanjay Patel
6fdd9c06bb [SSE/AVX] Use multiclasses to reduce the mass of scalar math patterns; NFCI
This takes the preposterous number of patterns in this section
that were last added to in r219033 down to just plain obnoxious.

With a little more work, we might get this down to just comical.

I've added more test cases to the existing file that checks these
patterns, but it seems that some of these patterns simply don't
exist with today's shuffle lowering.

llvm-svn: 229158
2015-02-13 21:52:42 +00:00
Craig Topper
eaf6d626b1 [X86] Remove int_x86_sse2_psll_dq_bs and int_x86_sse2_psrl_dq_bs intrinsics. The builtins aren't used by clang.
llvm-svn: 229069
2015-02-13 06:07:24 +00:00
Craig Topper
52b42f0b75 [X86] Remove 256-bit and 512-bit memop pattern fragments. They are no longer used.
llvm-svn: 228563
2015-02-09 04:04:53 +00:00
Craig Topper
39e5a46fee [X86] Remove the remaining uses of memop from AVX and AVX2 instruction patterns. AVX and AVX2 can handle unaligned loads being folded so we can just use 'load'
llvm-svn: 228551
2015-02-08 22:38:25 +00:00
Craig Topper
e2e6d72938 [X86] Add xrstors/xsavec/xsaves/clflushopt/clwb/pcommit instructions
llvm-svn: 228283
2015-02-05 08:51:06 +00:00
Chandler Carruth
99f7e3a3dd [x86] Give movss and movsd execution domains in the x86 backend.
This associates movss and movsd with the packed single and packed double
execution domains (resp.). While this is largely cosmetic, as we now
don't have weird ping-pong-ing between single and double precision, it
is also useful because it keeps the domain fixing algorithm from seeing
domain breaks that don't actually exist. It will also be much more
important if we have an execution domain default other than packed
single, as that would cause us to mix movss and movsd with integer
vector code on a regular basis, a very bad mixture.

llvm-svn: 228135
2015-02-04 10:58:53 +00:00
Chandler Carruth
62dda41067 [x86] Add missing patterns for andps, orps, xorps, and andnps.
Specifically, the existing patterns were scalar-only. These cover the
packed vector bitwise operations when specifically requested with pseudo
instructions. This is particularly important in SSE1 where we can't
actually emit a logical operation on a v2i64 as that isn't a legal type.

This will be tested in subsequent patches which form the floating point
and patterns in more places.

llvm-svn: 228123
2015-02-04 09:06:01 +00:00
Sanjay Patel
dd58512572 Merge consecutive 16-byte loads into one 32-byte load (PR22329)
This patch detects consecutive vector loads using the existing 
EltsFromConsecutiveLoads() logic. This fixes:
http://llvm.org/bugs/show_bug.cgi?id=22329

This patch effectively reverts the tablegen additions of D6492 / 
http://reviews.llvm.org/rL224344 ...which in hindsight were a horrible hack.

The test cases that were added with that patch are simply modified to load
from varying offsets of a base pointer. These loads did not match the existing
tablegen patterns.

A happy side effect of doing this optimization earlier is that we can now fold
the load into a math op where possible; this is shown in some of the updated
checks in the test file.

Differential Revision: http://reviews.llvm.org/D7303

llvm-svn: 228006
2015-02-03 18:54:00 +00:00
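
The shape being optimized is roughly two adjacent 16-byte loads whose concatenation builds a 256-bit value, as in this sketch (illustrative names); with this change the pair can become a single 32-byte load:

define <8 x float> @concat_loads(ptr %p) {
  ; two adjacent 16-byte loads concatenated into one 256-bit vector
  %lo = load <4 x float>, ptr %p, align 16
  %q  = getelementptr inbounds <4 x float>, ptr %p, i64 1
  %hi = load <4 x float>, ptr %q, align 16
  %r  = shufflevector <4 x float> %lo, <4 x float> %hi,
        <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>
  ret <8 x float> %r
}
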
Simon Pilgrim
8cbdc5422d [X86][SSE] Float comparisons can sometimes be safely commuted
For ordered, unordered, equal and not-equal tests, packed float and double comparison instructions can be safely commuted without affecting the results. This patch checks the comparison mode of the (v)cmpps + (v)cmppd instructions and commutes the result if it can.

Differential Revision: http://reviews.llvm.org/D7178

llvm-svn: 227145
2015-01-26 22:29:24 +00:00
Simon Pilgrim
00b317ad36 [X86][PCLMUL] Enable commutation for PCLMUL instructions
Patch to allow (v)pclmulqdq to be commuted - swaps the src registers and inverts the immediate (low/high) src mask.

Differential Revision: http://reviews.llvm.org/D7180

llvm-svn: 227141
2015-01-26 22:00:18 +00:00
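
Concretely, bit 0 of the immediate selects the quadword of the first source and bit 4 selects the quadword of the second, so swapping the sources while exchanging those two bits preserves the result. A sketch using the existing intrinsic (assuming the usual operand order, first argument = first source):

declare <2 x i64> @llvm.x86.pclmulqdq(<2 x i64>, <2 x i64>, i8)

; low quadword of %a times high quadword of %b
define <2 x i64> @clmul(<2 x i64> %a, <2 x i64> %b) {
  %r = call <2 x i64> @llvm.x86.pclmulqdq(<2 x i64> %a, <2 x i64> %b, i8 16)
  ret <2 x i64> %r
}

; the commuted form computes the same carry-less product
define <2 x i64> @clmul_commuted(<2 x i64> %a, <2 x i64> %b) {
  %r = call <2 x i64> @llvm.x86.pclmulqdq(<2 x i64> %b, <2 x i64> %a, i8 1)
  ret <2 x i64> %r
}
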
Sanjay Patel
75e4c9502b Model sqrtsd as a binary operation with one source operand tied to the destination (PR14221)
This patch fixes the following miscompile:

define void @sqrtsd(<2 x double> %a) nounwind uwtable ssp {
  %0 = tail call <2 x double> @llvm.x86.sse2.sqrt.sd(<2 x double> %a) nounwind 
  %a0 = extractelement <2 x double> %0, i32 0
  %conv = fptrunc double %a0 to float
  %a1 = extractelement <2 x double> %0, i32 1
  %conv3 = fptrunc double %a1 to float
  tail call void @callee2(float %conv, float %conv3) nounwind
  ret void
}

Current codegen:

sqrtsd	%xmm0, %xmm1        ## high element of %xmm1 is undef here
xorps	%xmm0, %xmm0
cvtsd2ss	%xmm1, %xmm0
shufpd	$1, %xmm1, %xmm1
cvtsd2ss	%xmm1, %xmm1 ## operating on undef value
jmp	_callee2

This is a continuation of http://llvm.org/viewvc/llvm-project?view=revision&revision=224624 ( http://reviews.llvm.org/D6330 ) 
which was itself a continuation of r167064 ( http://llvm.org/viewvc/llvm-project?view=revision&revision=167064 ).

All of these patches are partial fixes for PR14221 ( http://llvm.org/bugs/show_bug.cgi?id=14221 ); 
this should be the final patch needed to resolve that bug.

Differential Revision: http://reviews.llvm.org/D6885

llvm-svn: 227111
2015-01-26 18:42:16 +00:00
Craig Topper
568161290d [X86] Give scalar VRNDSCALE instructions priority in AVX512 mode.
llvm-svn: 227039
2015-01-25 08:49:22 +00:00
Craig Topper
a4c295adc4 Remove tab characters. NFC
llvm-svn: 227036
2015-01-25 08:45:32 +00:00
Craig Topper
011934eb9c [X86] Replace i32i8imm on SSE/AVX instructions with i32u8imm which will make the assembler bounds check them. It will also make them print as unsigned.
llvm-svn: 227032
2015-01-25 02:21:16 +00:00
Craig Topper
88aba4703b [X86] Use u8imm in several places that used i32i8imm that don't require an i32 type.
llvm-svn: 227031
2015-01-25 02:21:13 +00:00
Craig Topper
e2205638ed Remove tab characters. NFC.
llvm-svn: 227030
2015-01-25 02:21:11 +00:00
Simon Pilgrim
b377a5e42b [X86][SSE] Added support for SSE3 lane duplication shuffle instructions
This patch adds shuffle matching for the SSE3 MOVDDUP, MOVSLDUP and MOVSHDUP instructions. The big benefit is that they let many single-source shuffles avoid (pre-AVX) dual-source instructions such as SHUFPD/SHUFPS, which cause extra moves and prevent load folds.

Adding these instructions uncovered an issue in XFormVExtractWithShuffleIntoLoad which crashed on single operand shuffle instructions (now fixed). It also involved fixing getTargetShuffleMask to correctly identify these instructions as unary shuffles.

Also adds a missing tablegen pattern for MOVDDUP.

Differential Revision: http://reviews.llvm.org/D7042

llvm-svn: 226716
2015-01-21 22:44:35 +00:00
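
For reference, the lane-duplication shuffle masks these instructions match are roughly the following (illustrative names):

define <2 x double> @movddup(<2 x double> %a) {
  ; duplicate the low double into both elements -> movddup
  %r = shufflevector <2 x double> %a, <2 x double> undef, <2 x i32> zeroinitializer
  ret <2 x double> %r
}

define <4 x float> @movsldup(<4 x float> %a) {
  ; duplicate the even elements -> movsldup
  %r = shufflevector <4 x float> %a, <4 x float> undef, <4 x i32> <i32 0, i32 0, i32 2, i32 2>
  ret <4 x float> %r
}

define <4 x float> @movshdup(<4 x float> %a) {
  ; duplicate the odd elements -> movshdup
  %r = shufflevector <4 x float> %a, <4 x float> undef, <4 x i32> <i32 1, i32 1, i32 3, i32 3>
  ret <4 x float> %r
}
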
Ahmed Bougacha
aea688cba4 [X86] Declare SSE4.1/AVX2 vector extloads covered by PMOV[SZ]X legal.
Now that we can fully specify extload legality, we can declare them
legal for the PMOVSX/PMOVZX instructions.  This for instance enables
a DAGCombine to fire on code such as
  (and (<zextload-equivalent> ...), <redundant mask>)
to turn it into:
  (zextload ...)
as seen in the testcase changes.

There is one regression, in widen_load-2.ll: we're no longer able
to do store-to-load forwarding with illegal extload memory types.
This will be addressed separately.

Differential Revision: http://reviews.llvm.org/D6533

llvm-svn: 226676
2015-01-21 17:07:06 +00:00
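
The kind of IR this covers is a narrow vector load widened by zext or sext, for example (sketch; selecting pmovzxbd assumes an SSE4.1/AVX target):

define <4 x i32> @zext_load(ptr %p) {
  ; 32-bit load of <4 x i8> widened to <4 x i32>; with the extload declared
  ; legal this can be selected as a single pmovzxbd
  %v = load <4 x i8>, ptr %p, align 4
  %z = zext <4 x i8> %v to <4 x i32>
  ret <4 x i32> %z
}
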
Craig Topper
39f463653a [X86] Convert all the i8imm used by SSE and AVX instructions to u8imm.
This makes the assembler check their size and removes a hack from the disassembler to avoid sign extending the immediate.

llvm-svn: 226645
2015-01-21 08:15:54 +00:00
Craig Topper
8282659ccb [x86] Add assembly parser bounds checking to the immediate value for cmpss/cmpsd/cmpps/cmppd.
llvm-svn: 226642
2015-01-21 06:07:53 +00:00
Craig Topper
4312f7bdd7 [x86] Add some mayLoad/hasSideEffects flags. Remove one that was already covered by a pattern.
llvm-svn: 226562
2015-01-20 12:15:30 +00:00
Simon Pilgrim
65bd4292b0 [X86][SSE] Minor fix to VPBLENDW AVX2 commutation.
D6015 / rL221313 enabled commutation for SSE immediate blend instructions, but due to a typo the AVX2 VPBLENDW ymm instructions weren't flagged as commutative in the tables along with the others, even though they were still being commuted in code and covered by tests.

llvm-svn: 225612
2015-01-11 22:08:01 +00:00
Craig Topper
1f395a3c6c [x86] Prevent llvm.x86.cmp.ps/pd/ss/sd from being selected with bad immediates. The frontend now checks this when the builtin is used. This will allow the instruction printer to not have to deal with invalid immediates on these instructions.
llvm-svn: 224885
2014-12-27 18:10:56 +00:00
Elena Demikhovsky
744da8554e Masked load and store codegen - fixed 128-bit vectors
The codegen failed on 128-bit types on AVX2.
I added patterns in the td files and tests.

llvm-svn: 224647
2014-12-19 23:27:57 +00:00
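
A 128-bit case looks roughly like the sketch below (written against the current overloaded intrinsic signature, which differs slightly from the form used at the time); on AVX2 it can lower to vpmaskmovd:

declare <4 x i32> @llvm.masked.load.v4i32.p0(ptr, i32, <4 x i1>, <4 x i32>)

define <4 x i32> @mload128(ptr %p, <4 x i1> %m, <4 x i32> %passthru) {
  ; 128-bit masked load: lanes with a false mask bit take the passthru value
  %v = call <4 x i32> @llvm.masked.load.v4i32.p0(ptr %p, i32 4, <4 x i1> %m, <4 x i32> %passthru)
  ret <4 x i32> %v
}
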
Sanjay Patel
50aab0bca2 Model sqrtss as a binary operation with one source operand tied to the destination (PR14221)
This is a continuation of r167064 ( http://llvm.org/viewvc/llvm-project?view=revision&revision=167064 ).
That patch started to fix PR14221 ( http://llvm.org/bugs/show_bug.cgi?id=14221 ), but it was not completed. 

Differential Revision: http://reviews.llvm.org/D6330

llvm-svn: 224624
2014-12-19 22:16:28 +00:00
Robert Khasanov
0bf2db97cb [AVX512] Enable FP arithmetic lowering for AVX512VL subsets.
Added RegOp2MemOpTable4 to transform 4th operand from register to memory in merge-masked versions of instructions. 
Added lowering tests.

llvm-svn: 224516
2014-12-18 12:28:22 +00:00
Craig Topper
ed8004c615 [X86] Don't use PS prefix on LDMXCSR/STMXCSR.
As near as I can tell, prefixes are ignored on these instructions, except for a comment in the Intel docs about 0xf3. The binutils disassembler seems to ignore prefixes on these instructions. Our disassembler still doesn't distinguish PS and "no prefix" well enough for this to make a functional change, but it helps with experiments I'm doing on a potential new disassembler table builder.

llvm-svn: 224496
2014-12-18 05:02:10 +00:00
Robert Khasanov
104b98b388 [AVX512] Enable integer arithmetic lowering for AVX512BW/VL subsets.
Added lowering tests.

llvm-svn: 224349
2014-12-16 18:24:07 +00:00
Sanjay Patel
8363dd3b42 combine consecutive subvector 16-byte loads into one 32-byte load
This is a fix for PR21709 ( http://llvm.org/bugs/show_bug.cgi?id=21709 ).
When we have 2 consecutive 16-byte loads that are merged into one 32-byte vector,
we can use a single 32-byte load instead. 
But we don't do this for SandyBridge / IvyBridge because they have slower 32-byte memops.
We also don't bother using 32-byte *integer* loads on a machine that only has AVX1 (btver2)
because those operands would have to be split in half anyway since there is no support for
32-byte integer math ops.

Differential Revision: http://reviews.llvm.org/D6492

llvm-svn: 224344
2014-12-16 16:30:01 +00:00
Robert Khasanov
64bf0f6845 [AVX512] Enabling bit logic lowering
Added lowering tests.

llvm-svn: 224132
2014-12-12 17:02:18 +00:00
Robert Khasanov
efae7453cb [AVX512] Enabling MIN/MAX lowering.
Added lowering tests.

llvm-svn: 224127
2014-12-12 15:10:43 +00:00
Ahmed Bougacha
4b8a22ae51 [X86] Add a temporary testcase for PR21876/r223996.
llvm-svn: 224074
2014-12-11 23:07:52 +00:00
Ahmed Bougacha
9304854896 [X86] Add back AVX2 VR256 PMOVX patterns.
We can't reach those from zext, but other parts of the backend (the shuffle
lowering) generate 256-bit VZEXT nodes.

Fixes PR21876.

llvm-svn: 223996
2014-12-11 04:32:17 +00:00
Sanjay Patel
ecf92813fa Match new shuffle codegen for MOVHPD patterns
Add patterns to match SSE (shufpd) and AVX (vpermilpd) shuffle codegen
when storing the high element of a v2f64. The existing patterns were
only checking for an unpckh type of shuffle. 

http://llvm.org/bugs/show_bug.cgi?id=21791

Differential Revision: http://reviews.llvm.org/D6586

llvm-svn: 223929
2014-12-10 16:58:54 +00:00
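
The source pattern is simply storing the high element of a v2f64; a sketch (illustrative names):

define void @store_high(<2 x double> %v, ptr %p) {
  ; extract and store the high double; the added patterns let this select movhpd
  ; even when shuffle lowering produces shufpd/vpermilpd instead of unpckhpd
  %hi = extractelement <2 x double> %v, i32 1
  store double %hi, ptr %p, align 8
  ret void
}
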
Ahmed Bougacha
b46577058a [X86] Refactor PMOV[SZ]Xrm to add missing AVX2 patterns.
Most patterns will go away once the extload legalization changes land.

Differential Revision: http://reviews.llvm.org/D6125

llvm-svn: 223567
2014-12-06 01:31:07 +00:00