Bruno Cardoso Lopes
17ae896095
Move code around and add comments
...
llvm-svn: 137518
2011-08-12 21:48:22 +00:00
Bruno Cardoso Lopes
4106caa9af
Cleanup: Remove Int_ CVTSS2SI* forms
...
llvm-svn: 137297
2011-08-11 02:52:36 +00:00
Bruno Cardoso Lopes
565ab1542a
The following X86 pattern is incorrect:
...
def : Pat<(X86Movss VR128:$src1,
(bc_v4i32 (v2i64 (load addr:$src2)))),
(MOVLPSrm VR128:$src1, addr:$src2)>;
This matches a MOVSS dag with a MOVLPS instruction. However, MOVSS will replace only the low 32 bits of the register, while the MOVLPS instruction will replace the low 64 bits. A testcase that illustrates the bug is added, and the one that was already present is modified. Patch by Tanya Lattner.
llvm-svn: 137227
2011-08-10 17:45:17 +00:00
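The mismatch described in r137227 can be sketched with a toy model. This is an illustration only, not LLVM code: a 128-bit register is modelled as a list of four 32-bit lanes, and the helper names `movss_node` and `movlps_load` are hypothetical.

```python
def movss_node(dst, src):
    """Model of the MOVSS dag: only lane 0 (the low 32 bits) is replaced."""
    return [src[0]] + dst[1:]


def movlps_load(dst, mem):
    """Model of MOVLPS from memory: lanes 0 and 1 (the low 64 bits) are replaced."""
    return mem[:2] + dst[2:]


dst = [10, 11, 12, 13]   # $src1, lanes 0..3
mem = [20, 21, 22, 23]   # the 128-bit value loaded from $src2

wanted = movss_node(dst, mem)     # what the dag asks for
selected = movlps_load(dst, mem)  # what the incorrect pattern would emit
print(wanted, selected)
```

In this model the two results differ in lane 1, which is the extra 32 bits that MOVLPS overwrites.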
Bruno Cardoso Lopes
7461b930f3
Add v16i16 and v32i8 store patterns
...
llvm-svn: 137166
2011-08-09 22:39:53 +00:00
Bruno Cardoso Lopes
028c6aa951
Use fp unpack instructions to unpack int types. Until we have AVX2, this
...
is the best we can do for these patterns. This fixes PR10554.
llvm-svn: 137161
2011-08-09 22:18:37 +00:00
Bruno Cardoso Lopes
633400ee00
Reapply a more appropriate solution than in r137114. AVX supports
...
v4f64 = sitofp v4i32. This fixes PR10559.
Also add support for v4i32 = fptosi v4f64.
llvm-svn: 137128
2011-08-09 17:39:13 +00:00
Bruno Cardoso Lopes
d521431558
Add support for avx vector fextend
...
llvm-svn: 137105
2011-08-09 03:04:29 +00:00
Bruno Cardoso Lopes
09a727298f
Add AVX versions of 128-bit sitofp and fptosi
...
llvm-svn: 137104
2011-08-09 03:04:25 +00:00
Bruno Cardoso Lopes
1025d1eb3b
Add two patterns to match special vmovss and vmovsd cases. Also fix
...
the patterns already there to be more strict regarding the predicate.
This fixes PR10558
llvm-svn: 137100
2011-08-09 01:43:09 +00:00
Bruno Cardoso Lopes
d7eac41193
Make LowerVSETCC aware of AVX types and add patterns to match them.
...
llvm-svn: 137090
2011-08-09 00:46:57 +00:00
Bruno Cardoso Lopes
771876cade
Add v4f64 -> v2f32 fp_round support. Also add a testcase to exercise
...
the legalizer. This commit together with the two previous ones fixes
PR10495.
llvm-svn: 136654
2011-08-01 21:54:09 +00:00
Bruno Cardoso Lopes
473d982caf
Add v8i32 and v4i64 vpermil patterns
...
llvm-svn: 136451
2011-07-29 01:31:07 +00:00
Bruno Cardoso Lopes
02bbf20b02
Cleanup PALIGNR handling and remove the old palign pattern fragment.
...
Also make PALIGNR masks not match 256-bit vectors, which aren't supported.
It's also a step towards solving PR10489
llvm-svn: 136448
2011-07-29 01:30:59 +00:00
Bruno Cardoso Lopes
e24a043703
Add patterns to generate copies for extract_subvector instead of
...
using vextractf128. This will reduce the number of issued instructions
for several avx codes.
llvm-svn: 136323
2011-07-28 01:26:50 +00:00
Bruno Cardoso Lopes
73945bf79a
movd/movq write zeros in the high 128-bit part of the vector. Use
...
them to match 256-bit scalar_to_vector+zext.
llvm-svn: 136322
2011-07-28 01:26:46 +00:00
Bruno Cardoso Lopes
1f63a37172
Add a few patterns to match allzeros without having to use the fp unit.
...
Take advantage that the 128-bit vpxor zeros the higher part and use it.
This also fixes PR10491
llvm-svn: 136321
2011-07-28 01:26:43 +00:00
Bruno Cardoso Lopes
06d8be564f
Add SINT_TO_FP and FP_TO_SINT support for v8i32 types. Also move
...
a convert pattern close to the instruction definition.
llvm-svn: 136320
2011-07-28 01:26:39 +00:00
Kevin Enderby
9adbbfffd0
Fix llvm-mc handling of x86 instructions that take 8-bit unsigned immediates.
...
llvm-mc gives an "invalid operand" error for instructions that take an unsigned
immediate which have the high bit set such as:
pblendw $0xc5, %xmm2, %xmm1
llvm-mc treats all x86 immediates as signed values and range checks them.
A small number of x86 instructions use the imm8 field as a set of bits.
This change affects only those instructions, and only where the high bit is not
ignored. The others remain unchanged.
llvm-svn: 136287
2011-07-27 23:01:50 +00:00
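The range-check problem in r136287 comes down to simple arithmetic. The sketch below is not llvm-mc's actual code; `fits_signed_imm8` and `fits_unsigned_imm8` are hypothetical helpers that only show why an immediate with the high bit set fails a signed check but passes an unsigned one.

```python
def fits_signed_imm8(value):
    """True if value is representable as a signed 8-bit integer."""
    return -128 <= value <= 127


def fits_unsigned_imm8(value):
    """True if value is representable as an unsigned 8-bit integer."""
    return 0 <= value <= 255


imm = 0xC5  # the pblendw immediate from the commit message (197 decimal)
print(fits_signed_imm8(imm), fits_unsigned_imm8(imm))
```

For instructions that treat imm8 as a set of bits, the unsigned check is the one that matches the encoding.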
Bruno Cardoso Lopes
8830fde434
The vpermilps and vpermilpd have different behaviour regarding the
...
usage of the shuffle bitmask. Both work in 128-bit lanes without
crossing, but in the former the mask of the high part is the same
used by the low part while in the latter both lanes have independent
masks. Handle this properly and add support for vpermilpd.
llvm-svn: 136200
2011-07-27 00:56:34 +00:00
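The difference described in r136200 can be sketched for the 256-bit immediate forms. This is a toy model under stated assumptions, not LLVM code: vectors are plain Python lists, and `vpermilps_256` and `vpermilpd_256` are hypothetical names. The single-precision form reuses the same four 2-bit selectors in both lanes, while the double-precision form has one selector bit per element, so each lane gets its own pair.

```python
def vpermilps_256(src, imm8):
    """src has 8 elements; each 128-bit lane holds 4 of them.
    The same four 2-bit selectors are applied to both lanes."""
    out = []
    for lane_base in (0, 4):
        for i in range(4):
            sel = (imm8 >> (2 * i)) & 0b11
            out.append(src[lane_base + sel])
    return out


def vpermilpd_256(src, imm8):
    """src has 4 elements; each 128-bit lane holds 2 of them.
    Bit i of imm8 selects the source for element i within its own lane."""
    out = []
    for i in range(4):
        lane_base = (i // 2) * 2
        sel = (imm8 >> i) & 1
        out.append(src[lane_base + sel])
    return out


print(vpermilps_256(list(range(8)), 0b00011011))
print(vpermilpd_256([0, 1, 2, 3], 0b0110))
```

Neither function ever reads across the 128-bit lane boundary, which matches the "without crossing" remark in the commit message.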
Bruno Cardoso Lopes
e53bb853ea
Recognize unpckh* masks and match 256-bit versions. The new versions are
...
different from the previous 128-bit because they work in lanes.
Update a few comments and add testcases
llvm-svn: 136157
2011-07-26 22:03:40 +00:00
Bruno Cardoso Lopes
a493ad3938
Remove now unused patterns. 0 insertions(+), 98 deletions(-)
...
llvm-svn: 136109
2011-07-26 18:22:39 +00:00
Bruno Cardoso Lopes
b24e958ffb
Cleanup old matching for PUNPCK* variants
...
llvm-svn: 136108
2011-07-26 18:22:27 +00:00
Bruno Cardoso Lopes
ab40a57cce
Add 256-bit isel for movsldup/movshdup
...
llvm-svn: 136051
2011-07-26 02:39:32 +00:00
Bruno Cardoso Lopes
cde45ac9ca
Add 128-bit AVX versions of movshdup/movsldup
...
llvm-svn: 136048
2011-07-26 02:39:23 +00:00
Bruno Cardoso Lopes
25698b90e9
Cleanup movsldup/movshdup matching.
...
27 insertions(+), 62 deletions(-)
llvm-svn: 136047
2011-07-26 02:39:13 +00:00
Bruno Cardoso Lopes
c94d6a2d2c
Codegen allonesvector better while using AVX: vpcmpeqd + vinsertf128
...
This also fixes PR10452
llvm-svn: 136004
2011-07-25 23:05:32 +00:00
Bruno Cardoso Lopes
f457bc8120
Add remaining 256-bit vector bitcasts. This also fixes PR10451
...
llvm-svn: 136003
2011-07-25 23:05:28 +00:00
Bruno Cardoso Lopes
9380919dc5
- Handle special scalar_to_vector case: splats. Using a native 128-bit
...
shuffle before inserting on a 256-bit vector.
- Add AVX versions of movd/movq instructions
- Introduce a few COPY patterns to match insert_subvector instructions.
This turns a trivial insert_subvector instruction into a register copy,
coalescing the xmm into a ymm and avoiding emitting one more instruction.
llvm-svn: 136002
2011-07-25 23:05:25 +00:00
Bruno Cardoso Lopes
5382a1e65e
Add v8f32->v8i32 bitcast. Fixes PR10440
...
llvm-svn: 135794
2011-07-22 19:51:02 +00:00
Bruno Cardoso Lopes
3691063149
- Register v16i16 as valid VR256 register class
...
- Add more bitcasts for v16i16
- Since 135661 and 135662 already added the splat logic,
just add one more splat test for v16i16
llvm-svn: 135663
2011-07-21 02:24:08 +00:00
Bruno Cardoso Lopes
ba1a2a9135
Add support for 256-bit versions of VPERMIL instruction. This is a new
...
instruction introduced in AVX, which can operate on 128 and 256-bit vectors.
It considers a 256-bit vector as two independent 128-bit lanes. It can permute
any 32- or 64-bit elements inside a lane, and restricts the second lane to
have the same permutation as the first one. With the improved splat support
introduced earlier today, adding codegen for this instruction enables more
efficient 256-bit code:
Instead of:
vextractf128 $0, %ymm0, %xmm0
punpcklbw %xmm0, %xmm0
punpckhbw %xmm0, %xmm0
vinsertf128 $0, %xmm0, %ymm0, %ymm1
vinsertf128 $1, %xmm0, %ymm1, %ymm0
vextractf128 $1, %ymm0, %xmm1
shufps $1, %xmm1, %xmm1
movss %xmm1, 28(%rsp)
movss %xmm1, 24(%rsp)
movss %xmm1, 20(%rsp)
movss %xmm1, 16(%rsp)
vextractf128 $0, %ymm0, %xmm0
shufps $1, %xmm0, %xmm0
movss %xmm0, 12(%rsp)
movss %xmm0, 8(%rsp)
movss %xmm0, 4(%rsp)
movss %xmm0, (%rsp)
vmovaps (%rsp), %ymm0
We get:
vextractf128 $0, %ymm0, %xmm0
punpcklbw %xmm0, %xmm0
punpckhbw %xmm0, %xmm0
vinsertf128 $0, %xmm0, %ymm0, %ymm1
vinsertf128 $1, %xmm0, %ymm1, %ymm0
vpermilps $85, %ymm0, %ymm0
llvm-svn: 135662
2011-07-21 01:55:47 +00:00
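The immediate in `vpermilps $85` above can be decoded by hand. The snippet is a small sketch, with `decode_vpermilps_imm` as a hypothetical helper name: it splits the 8-bit immediate into the four 2-bit element selectors that are applied within each 128-bit lane.

```python
def decode_vpermilps_imm(imm8):
    """Return the four 2-bit selectors, lowest destination element first."""
    return [(imm8 >> (2 * i)) & 0b11 for i in range(4)]


selectors = decode_vpermilps_imm(85)  # 85 == 0b01010101
print(selectors)
```

Every selector is 1, so each destination element in a lane is a copy of that lane's element 1, which is the splat that the longer shufps/movss sequence was building through the stack.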
Bruno Cardoso Lopes
194507cc77
Add additional patterns for vextractf128 instruction
...
llvm-svn: 135660
2011-07-21 01:55:39 +00:00
Bruno Cardoso Lopes
14c800c1e3
Add additional patterns for vinsertf128 instruction
...
llvm-svn: 135659
2011-07-21 01:55:36 +00:00
Bruno Cardoso Lopes
e0d5bd467f
Move code around. No functionality changes
...
llvm-svn: 135657
2011-07-21 01:55:30 +00:00
Bruno Cardoso Lopes
bdf75dfa28
Be smarter with VCVTSS2SD. Also place the patterns close to the
...
definitions.
llvm-svn: 135407
2011-07-18 18:11:25 +00:00
Bruno Cardoso Lopes
da90f383ab
Add AVX 128-bit sqrt versions
...
llvm-svn: 135404
2011-07-18 17:51:40 +00:00
Bruno Cardoso Lopes
d258749f73
Add AVX 128-bit patterns for sint_to_fp
...
llvm-svn: 135332
2011-07-16 00:50:20 +00:00
Bruno Cardoso Lopes
2a23e486ad
Add a few patterns for 256-bit bitcasts. No testcases now, they are
...
coming together with other tests.
llvm-svn: 135312
2011-07-15 22:24:17 +00:00
Bruno Cardoso Lopes
d24f039847
Add 256-bit load/store recognition and matching in several places.
...
llvm-svn: 135171
2011-07-14 18:50:58 +00:00
Bruno Cardoso Lopes
c0401dddf7
Make X86ISD::ANDNP more general and Codegen 256-bit VANDNP. A more
...
general version of X86ISD::ANDNP also opened the room for a little bit
of refactoring.
llvm-svn: 135088
2011-07-13 21:36:51 +00:00
Bruno Cardoso Lopes
b98f50da03
The target specific node PANDN name is misleading. That happens because
...
it's later selected to an ANDNPD/ANDNPS instruction instead of the PANDN
instruction. Rename it.
llvm-svn: 135087
2011-07-13 21:36:47 +00:00
Bruno Cardoso Lopes
cb49278ad6
AVX Codegen support for 256-bit versions of vandps, vandpd, vorps, vorpd, vxorps, vxorpd
...
llvm-svn: 135023
2011-07-13 01:15:33 +00:00
Eli Friedman
9765ae0015
Add assembler/disassembler support for non-AVX pclmulqdq. While I'm here, use proper aliases for the pclmullqlqdq and friends. PR10269.
...
llvm-svn: 134424
2011-07-05 18:21:20 +00:00
Eli Friedman
802029c494
Add support for movntil/movntiq mnemonics. Reported on llvmdev.
...
llvm-svn: 133759
2011-06-23 21:07:47 +00:00
Nick Lewycky
8e5c09b7dc
Add support for assembling "movq" when it's correct to do so, while continuing
...
to emit "movd" across the board to continue supporting a Darwin assembler bug.
This is the reincarnation of r133452.
llvm-svn: 133565
2011-06-21 22:45:41 +00:00
Bob Wilson
5b04895bb8
Revert r133452: "Emit movq for 64-bit register to XMM register moves..."
...
This is breaking compiler-rt and llvm-gcc builds on MacOSX when not using
the integrated assembler.
llvm-svn: 133524
2011-06-21 17:35:13 +00:00
Nick Lewycky
831fb8200d
Emit movq for 64-bit register to XMM register moves, but continue to accept
...
movd when assembling.
llvm-svn: 133452
2011-06-20 18:33:26 +00:00
Bruno Cardoso Lopes
f52f4dd0b8
Add AVX support for fpextend.
...
Original patch by Syoyo Fujita with more comments by me.
llvm-svn: 133153
2011-06-16 07:03:21 +00:00
Bruno Cardoso Lopes
b6afc5168f
Add one more argument to the prefetch intrinsic to indicate whether it's a data
...
or instruction cache access. Update the targets to match it and also teach
autoupgrade.
llvm-svn: 132976
2011-06-14 04:58:37 +00:00
Stuart Hastings
ea8b49dff3
Reapply 132424 with fixes. This fixes PR10068.
...
rdar://problem/5993888
llvm-svn: 132606
2011-06-03 23:53:54 +00:00
Rafael Espindola
1299f014d4
Revert 132424 to fix PR10068.
...
llvm-svn: 132479
2011-06-02 19:57:47 +00:00
Stuart Hastings
9a085fb9d8
Recommit 132404 with fixes. rdar://problem/5993888
...
llvm-svn: 132424
2011-06-01 21:33:14 +00:00
Stuart Hastings
4b33767382
Revert 132404 to appease a buildbot. rdar://problem/5993888
...
llvm-svn: 132419
2011-06-01 19:52:20 +00:00
Stuart Hastings
23f5ceda96
Add support for x86 CMPEQSS and friends. These instructions do a
...
floating-point comparison, generate a mask of 0s or 1s, and generally
DTRT with NaNs. Only profitable when the user wants a materialized 0
or 1 at runtime. rdar://problem/5993888
llvm-svn: 132404
2011-06-01 17:17:45 +00:00
Stuart Hastings
fdc9e4af68
FGETSIGN support for x86, using movmskps/pd. Will be enabled with a
...
patch to TargetLowering.cpp. rdar://problem/5660695
llvm-svn: 132388
2011-06-01 04:39:42 +00:00
Chad Rosier
b87c4a6945
Renamed llvm.x86.sse42.crc32 intrinsics; crc64 doesn't exist.
...
crc32.[8|16|32] have been renamed to .crc32.32.[8|16|32] and
crc64.[8|64] have been renamed to .crc32.64.[8|64].
llvm-svn: 132163
2011-05-26 23:13:19 +00:00
Rafael Espindola
98372d430c
Don't produce a vmovntdq if we don't have AVX support.
...
llvm-svn: 131330
2011-05-14 00:30:01 +00:00
Bill Wendling
67f5e8f0a7
Replace the "movnt" intrinsics with a native store + nontemporal metadata bit.
...
<rdar://problem/8460511>
llvm-svn: 130791
2011-05-03 21:11:17 +00:00
Eric Christopher
1de0dfaab0
xmm0 is an implicit parameter in this and so shouldn't be in the
...
string template.
Fixes rdar://8493866
llvm-svn: 130747
2011-05-03 01:28:32 +00:00
Chris Lattner
52e40d9b4c
clean up after Sean's r127646 patch.
...
llvm-svn: 130475
2011-04-29 05:40:18 +00:00
Bill Wendling
0984f4927e
Reapply r129401 with patch for clang.
...
llvm-svn: 129419
2011-04-13 00:36:11 +00:00
Bill Wendling
f6446a0961
Revert r129401 for now. Clang is using the old way of doing things.
...
llvm-svn: 129403
2011-04-12 22:59:27 +00:00
Bill Wendling
f9c9d3e05b
Remove the unaligned load intrinsics in favor of using native unaligned loads.
...
Now that we have a first-class way to represent unaligned loads, the unaligned
load intrinsics are superfluous.
First part of <rdar://problem/8460511>.
llvm-svn: 129401
2011-04-12 22:46:31 +00:00
Sean Callanan
a38db2eeda
Enabled disassembler support for AVX instructions
...
in the instruction tables and fixed a few bugs that
were causing decode conflicts. Rudimentary tests
are coming up in the next patch.
llvm-svn: 127646
2011-03-15 01:28:15 +00:00
David Greene
2fd6d03bc9
[AVX] Fix mask predicates for 256-bit UNPCKLPS/D and implement
...
missing patterns for them.
Add a SIMD test subdirectory to hold tests for SIMD instruction
selection correctness and quality.
llvm-svn: 126845
2011-03-02 17:23:43 +00:00
Joerg Sonnenberger
efa8090e2a
Recognize monitor/mwait with explicit register arguments
...
llvm-svn: 125805
2011-02-18 00:48:11 +00:00
David Greene
7de7347ee8
[AVX] Support VINSERTF128 with more patterns and appropriate
...
infrastructure. This makes lowering 256-bit vectors to 128-bit
vectors simple when 256-bit vector support is not available.
llvm-svn: 124868
2011-02-04 16:08:29 +00:00
David Greene
2753be260c
[AVX] VEXTRACTF128 support. This commit includes patterns for
...
matching EXTRACT_SUBVECTOR to VEXTRACTF128 along with support routines
to examine and translate index values. VINSERTF128 comes next. With
these two in place we can begin supporting more AVX operations as
INSERT/EXTRACT can be used as a fallback when 256-bit support is not
available.
llvm-svn: 124797
2011-02-03 15:50:00 +00:00
Chris Lattner
9ba0a83f2b
fix a missing shuffle pattern, PR9009. Patch by Artiom Myaskouvskey!
...
llvm-svn: 124102
2011-01-24 03:42:46 +00:00
Chris Lattner
586e7af07d
Fix PR8946, a missing reg/reg form of movdqu.
...
llvm-svn: 123242
2011-01-11 17:04:55 +00:00
Chris Lattner
3ef9db5cd4
fix PR8900, a shuffle miscompilation. Patch by Nadav Rotem!
...
llvm-svn: 122921
2011-01-05 22:28:46 +00:00
Nate Begeman
c7dfecb10e
Implement feedback from Bruno on making pblendvb an x86-specific ISD node in addition to being an intrinsic, and convert
...
lowering to use it. Hopefully the pattern fragment is doing the right thing with XMM0, looks correct in testing.
llvm-svn: 122277
2010-12-20 22:04:24 +00:00
Nate Begeman
ef5f3c0fa7
Add support for matching psign & pblendvb to the x86 target
...
Remove unnecessary pandn patterns, 'vnot' patfrag looks through bitcasts
llvm-svn: 122098
2010-12-17 22:55:37 +00:00
Nate Begeman
8c00ecd290
Add some missing predicates.
...
llvm-svn: 121445
2010-12-10 00:54:26 +00:00
Nate Begeman
cb6d1c8193
Formalize the notion that AVX and SSE are non-overlapping extensions from the compiler's point of view. Per email discussion, we either want to always use VEX-prefixed instructions or never use them, and are taking "HasAVX" to mean "Always use VEX". Passing -mattr=-avx,+sse42 should serve to restore legacy SSE support when desirable.
...
llvm-svn: 121439
2010-12-10 00:26:57 +00:00
Nate Begeman
4a62a3e229
Add support for AVX to materialize +0.0 when doing scalar FP.
...
llvm-svn: 121415
2010-12-09 21:43:51 +00:00
Benjamin Kramer
851691ddb2
Add patterns for the x86 popcnt instruction.
...
- Also adds a new POPCNT subtarget feature that is currently enabled if the target
supports SSE4.2 (nehalem) or SSE4A (barcelona).
llvm-svn: 120917
2010-12-04 20:32:23 +00:00
Nate Begeman
deb26223bd
Scalar f32/f64 are also subregs of ymm regs
...
llvm-svn: 120844
2010-12-03 21:54:39 +00:00
Eric Christopher
6a21ceab5c
Implement a PseudoI class and transfer the sse instructions over to use
...
it.
llvm-svn: 120412
2010-11-30 08:57:23 +00:00
Eric Christopher
f27f0b5234
Rewrite mwait and monitor support and custom lower arguments.
...
Fixes PR8573.
llvm-svn: 120404
2010-11-30 07:20:12 +00:00
Bruno Cardoso Lopes
9f9f796756
Fix PR8211
...
llvm-svn: 118445
2010-11-08 21:24:59 +00:00
Dale Johannesen
b78530f9b0
Fix pastos in handling of AVX cvttsd2si, PR8491.
...
Bruno, please review, but I'm pretty sure this is right.
Patch by Alex Mac!
llvm-svn: 117514
2010-10-28 00:35:54 +00:00
Chris Lattner
72e7e84c3f
simplify some map operations.
...
llvm-svn: 116014
2010-10-07 23:57:02 +00:00
Evan Cheng
7c89d70f27
Canonicalize X86ISD::MOVDDUP nodes to v2f64 to make sure all cases match. Also eliminate unneeded isel patterns. rdar://8520311
...
llvm-svn: 115977
2010-10-07 20:50:20 +00:00
Chris Lattner
84846b71af
remove the !nameconcat tblgen feature. It is "shorthand" and only used in 4 places
...
where !cast is just as short.
llvm-svn: 115722
2010-10-06 00:19:21 +00:00
Chris Lattner
12274b9845
allow !strconcat to take more than two operands to eliminate
...
!strconcat(!strconcat(!strconcat(!strconcat
Simplify some x86 td files to use it.
llvm-svn: 115719
2010-10-05 23:58:18 +00:00
Chris Lattner
5d7d5a81eb
distribute the rest of the contents of X86Instr64bit.td out to
...
the right places. X86Instr64bit.td now dies, long live x86-64!
llvm-svn: 115669
2010-10-05 20:49:15 +00:00
Chris Lattner
9317bf2ed5
move CMOV_FR32 and friends to InstrCompiler, since they are
...
pseudo instructions.
Move POPCNT to InstrSSE since they are SSE4 instructions.
llvm-svn: 115603
2010-10-05 06:41:40 +00:00
Chris Lattner
9c58de2dc4
fix rdar://8490728 - llvm-mc rejects gpr64 form of 'movmskpd'
...
llvm-svn: 115029
2010-09-29 05:05:03 +00:00
Chris Lattner
890c21a20a
add assembler support for the cvtsd2sil/cvtsd2siq mnemonics, rdar://8456382
...
llvm-svn: 115027
2010-09-29 04:55:40 +00:00
Chris Lattner
c14d59589c
add basic avx support to the disassembler, also teach it about ssmem/sdmem
...
operands.
With this done, we can remove the _Int suffixes from the round instructions
without the disassembler blowing up. This allows the assembler to support
them, implementing rdar://8456376 - llvm-mc rejects 'roundss'
llvm-svn: 115019
2010-09-29 02:57:56 +00:00
Chris Lattner
f90296b045
add asmparser support for cvttpd2dq by removing some Int_ prefixes.
...
Clean up cvttps2dq by removing some redundant implementations of the
same instruction. rdar://8456382
llvm-svn: 115018
2010-09-29 02:36:32 +00:00
Chris Lattner
e5c5c8dc1f
implement rdar://8456382 - cvtsd2si support, by removing some Int_ prefixes.
...
llvm-svn: 115017
2010-09-29 02:24:57 +00:00
Dale Johannesen
eb807a15a3
Fix typos. 128-bit PSHUFB takes 128-bit memory op.
...
v8i16 is not an MMX type; put it where it belongs.
llvm-svn: 113785
2010-09-13 21:15:43 +00:00
Bruno Cardoso Lopes
49efee5c95
Add one more pattern to fallback movddup
...
llvm-svn: 113522
2010-09-09 18:48:34 +00:00
Dale Johannesen
de53df20d6
Move remaining MMX instructions from SSE to MMX.
...
llvm-svn: 113501
2010-09-09 17:13:07 +00:00
Dale Johannesen
7469923117
Move most MMX instructions (defined as anything that
...
uses MMX, even if it also uses other things) from InstrSSE
into InstrMMX. No (intended) functional change.
llvm-svn: 113462
2010-09-09 01:02:39 +00:00
Bruno Cardoso Lopes
892c337123
x86 vector shuffle lowering now relies only on target specific
...
nodes to emit shuffles and doesn't do isel mask matching anymore.
- Add the selection of the remaining shuffle opcode (movddup)
- Introduce two new functions to "recognize" where we may get
potential folds and add several comments to them explaining why
they are not yet in the desired shape.
- Add more patterns to fallback the case where we select
a specific shuffle opcode as if it could fold a load, but it
can't, so remap to a valid instruction.
- Add a couple of FIXMEs to address in the following days once
there's a good solution to the current folding problem.
llvm-svn: 113369
2010-09-08 17:43:25 +00:00
Dale Johannesen
8354cab2de
Add patterns for MMX that use the new intrinsics.
...
Enable palignr intrinsic.
These may need adjustment for a new VT in due course.
llvm-svn: 113233
2010-09-07 18:10:56 +00:00
Bruno Cardoso Lopes
92bb02f722
Remove unused target specific node
...
llvm-svn: 113224
2010-09-07 17:38:55 +00:00
Dale Johannesen
2f4f8f5705
Remove the rest of the nonexistent 64-bit AVX instructions.
...
Bruno, please review.
llvm-svn: 113014
2010-09-03 21:23:00 +00:00
Bruno Cardoso Lopes
b8ce8b7e9f
Reapply last harmless part of r112934, the pattern fragment to match X86Unpcklpd
...
llvm-svn: 113009
2010-09-03 20:44:26 +00:00
Daniel Dunbar
26e0e964ab
Revert r112934, "- Use specific nodes to match unpckl masks.", which introduced
...
some infinite loop and select failures.
- Apologies for eager reverting, but it's branch day.
llvm-svn: 113000
2010-09-03 19:38:11 +00:00
Bruno Cardoso Lopes
f91bd70e9a
AVX supports neither mm operations nor their intrinsics.
...
The AVX versions of PALIGN and PABS* should only exist for
128-bit. Remove the unnecessary stuff.
llvm-svn: 112944
2010-09-03 02:08:45 +00:00
Bruno Cardoso Lopes
e1ad6555a8
- Use specific nodes to match unpckl masks.
...
- Teach getShuffleScalarElt how to handle more target
specific nodes, so the DAGCombine can make use of it.
- Add another hack to avoid the node update problem
during legalization. More description on the comments
llvm-svn: 112934
2010-09-03 01:24:00 +00:00
Bruno Cardoso Lopes
dcdab94661
become more strict about when it's safe to use X86ISD::MOVLPS
...
llvm-svn: 112799
2010-09-02 02:35:51 +00:00
Bruno Cardoso Lopes
601bf4c6d3
Using target specific nodes for shuffle nodes makes the mask
...
check more strict, breaking some cases not checked in the
testsuite, but also exposes some foldings not done before,
as this example:
movaps (%rdi), %xmm0
movaps (%rax), %xmm1
movaps %xmm0, %xmm2
movss %xmm1, %xmm2
shufps $36, %xmm2, %xmm0
now is generated as:
movaps (%rdi), %xmm0
movaps %xmm0, %xmm1
movlps (%rax), %xmm1
shufps $36, %xmm1, %xmm0
llvm-svn: 112753
2010-09-01 22:33:20 +00:00
Bruno Cardoso Lopes
9375b2f67d
Use movlps, movlpd, movss and movsd specific nodes instead of pattern matching with movlp pattern fragment
...
llvm-svn: 112694
2010-09-01 05:08:25 +00:00
Bruno Cardoso Lopes
80613a070e
Use x86 specific MOVSLDUP node, add more patterns to match it and remove useless load nodes
...
llvm-svn: 112661
2010-08-31 22:35:05 +00:00
Bruno Cardoso Lopes
8fc83b1960
Use x86 specific MOVSHDUP node and add more patterns to match it
...
llvm-svn: 112657
2010-08-31 22:22:11 +00:00
Bruno Cardoso Lopes
6fbe7b9ddd
Use MOVLHPS and MOVHLPS x86 nodes whenever possible. Also remove some useless nodes
...
llvm-svn: 112642
2010-08-31 21:15:21 +00:00
Bruno Cardoso Lopes
7939025262
Use pshufhw and pshuflw in more cases and fix getTargetShuffleNode number of arguments
...
llvm-svn: 111890
2010-08-24 01:16:15 +00:00
Bruno Cardoso Lopes
28d9071635
This is the first step towards refactoring the x86 vector shuffle code. The
...
general idea here is to have a group of x86 target specific nodes which are
going to be selected during lowering and then directly matched in isel.
The commit includes the addition of those specific nodes and a *bunch* of
patterns, and incrementally we're going to switch between them and what we
have right now. Both the patterns and target specific nodes can change as
we move forward with this work.
llvm-svn: 111691
2010-08-20 22:55:05 +00:00
Dale Johannesen
3f9c148d0e
Revert 110491. While not wrong, it was based on a
...
misanalysis and is undesirable.
llvm-svn: 111028
2010-08-13 18:43:45 +00:00
Bruno Cardoso Lopes
de5f3f5cb6
Improve comment to make explicit why not to touch this code before JIT goes MC
...
llvm-svn: 111021
2010-08-13 17:44:10 +00:00
Eric Christopher
63c83f19a0
Revert last patch and r110954 as I meant to.
...
llvm-svn: 111001
2010-08-13 02:37:50 +00:00
Bruno Cardoso Lopes
350d186d69
Some small clean-up: use of pseudo instructions
...
llvm-svn: 110954
2010-08-12 20:55:18 +00:00
Bruno Cardoso Lopes
7cb26cb8be
- Teach SSEDomainFix to switch between different levels of AVX instructions. Here we guess that AVX will have domain issues, so just implement them for consistency and in the future we remove if it's unnecessary.
...
- Make foldMemoryOperandImpl aware of 256-bit zero vectors folding and support the 128-bit counterparts of AVX too.
- Make sure MOV[AU]PS instructions are only selected when SSE1 is enabled, and duplicate the patterns to match AVX.
- Add a testcase for a simple 128-bit zero vector creation.
llvm-svn: 110946
2010-08-12 20:20:53 +00:00
Bruno Cardoso Lopes
99b5298854
Define AVX 128-bit pattern versions of SET0PS/PD.
...
llvm-svn: 110937
2010-08-12 18:20:59 +00:00
Bruno Cardoso Lopes
bb491bd56c
Begin to support some vector operations for AVX 256-bit instructions. The long
...
term goal here is to be able to match enough of vector_shuffle and build_vector
so all avx intrinsics which aren't mapped to their own built-ins but to
shufflevector calls can be codegen'd. This is the first (baby) step, support
building zeroed vectors.
llvm-svn: 110897
2010-08-12 02:06:36 +00:00
Bruno Cardoso Lopes
6eb24fd744
Add AVX matching patterns to Packed Bit Test intrinsics.
...
Apply the same approach of SSE4.1 ptest intrinsics but
create a new x86 node "testp" since AVX introduces
vtest{ps}{pd} instructions which set ZF and CF depending
on sign bit AND and ANDN of packed floating-point sources.
This is slightly different from what the "ptest" does.
Tests coming with the other 256 intrinsics tests.
llvm-svn: 110744
2010-08-10 23:25:42 +00:00
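The distinction drawn in r110744 between "ptest" and the new "testp" node can be sketched as follows. This is a toy model, not LLVM code: operands are lists of 32-bit integers, and `ptest_flags` and `vtestps_flags` are hypothetical names. Both compute ZF from an AND and CF from an ANDN of the operands, but the packed floating-point form looks only at the sign bit of each element.

```python
MASK32 = 0xFFFFFFFF
SIGN32 = 0x80000000


def ptest_flags(dest, src):
    """ZF is set if (src AND dest) is all zeros;
    CF is set if (src AND NOT dest) is all zeros. All bits are considered."""
    zf = all((s & d) == 0 for s, d in zip(src, dest))
    cf = all((s & ~d & MASK32) == 0 for s, d in zip(src, dest))
    return zf, cf


def vtestps_flags(dest, src):
    """Same structure as ptest_flags, but only the sign bit of each
    32-bit element takes part in the test."""
    zf = all((s & d & SIGN32) == 0 for s, d in zip(src, dest))
    cf = all((s & ~d & SIGN32) == 0 for s, d in zip(src, dest))
    return zf, cf


a = [0x00000001, 0, 0, 0]  # a low bit set, all sign bits clear
print(ptest_flags(a, a), vtestps_flags(a, a))
```

With the same inputs the two tests can disagree on ZF, which is why a separate node is a reasonable choice here.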
Bruno Cardoso Lopes
f1928b60c0
Add AVX movnt{pd,ps,dq} 256-bit intrinsics
...
llvm-svn: 110650
2010-08-10 02:49:24 +00:00
Bruno Cardoso Lopes
f5884c6791
Add AVX movmsk 256-bit intrinsics
...
llvm-svn: 110648
2010-08-10 02:34:56 +00:00
Bruno Cardoso Lopes
2a7ed4b5c9
Support AVX 256-bit load and store intrinsics
...
llvm-svn: 110645
2010-08-10 01:43:16 +00:00
Bruno Cardoso Lopes
1ea37cfa7b
Patterns to match AVX cmp instructions
...
llvm-svn: 110633
2010-08-10 00:13:20 +00:00
Bruno Cardoso Lopes
4e8d77892c
Add matching patterns for vblend AVX intrinsics
...
llvm-svn: 110630
2010-08-10 00:02:05 +00:00
Bruno Cardoso Lopes
e58d077846
Add VCVTPD2PS, VCVTPS2DQ, VCVTPS2PDY, VCVTTPD2DQY, VCVTTPS2DQ and VCVTPD2DQ 256-bit conversion intrinsics
...
llvm-svn: 110608
2010-08-09 21:51:56 +00:00
Bruno Cardoso Lopes
e7ceec4edf
Add patterns to AVX conversion instructions. Do that instead of declaring more instructions whenever possible, more coming
...
llvm-svn: 110605
2010-08-09 21:24:59 +00:00
Bruno Cardoso Lopes
6a92e01d05
Memory version of vcvtdq2pd intrinsic
...
llvm-svn: 110582
2010-08-09 18:20:14 +00:00
Bruno Cardoso Lopes
0794b8ab3f
Patterns to match vinsert, vbroadcast, vmovmask and vcvtdq2pd AVX intrinsics
...
llvm-svn: 110580
2010-08-09 18:03:43 +00:00
Dale Johannesen
23f9086dd3
Use sdmem and sse_load_f64 (etc.) for the vector
...
form of CMPSD (etc.) Matching a 128-bit memory
operand is wrong, the instruction uses only 64 bits
(same as ADDSD etc.) 8193553.
llvm-svn: 110491
2010-08-07 00:33:42 +00:00
Bruno Cardoso Lopes
5b602f8822
Patterns to match AVX 256-bit vzero intrinsics
...
llvm-svn: 110480
2010-08-06 22:10:01 +00:00
Bruno Cardoso Lopes
821eebf946
Patterns to match AVX 256-bit permutation intrinsics
...
llvm-svn: 110468
2010-08-06 20:03:27 +00:00
Bruno Cardoso Lopes
d186fba555
Patterns to match AVX 256-bit horizontal arithmetic intrinsics
...
llvm-svn: 110427
2010-08-06 02:10:30 +00:00
Bruno Cardoso Lopes
5e9f9c921e
Patterns to match AVX 256-bit arithmetic intrinsics
...
llvm-svn: 110425
2010-08-06 01:52:29 +00:00
Bruno Cardoso Lopes
0c0dd2173c
Support all 128-bit AVX vector intrinsics. Most of them I already
...
declared during the addition of the assembler support, the additional
changes are:
- Add missing intrinsics
- Move all SSE conversion instructions in X86InstInfo64.td to the SSE.td file.
- Duplicate some patterns to AVX mode.
- Step into PCMPEST/PCMPIST custom inserter and add AVX versions.
llvm-svn: 109878
2010-07-30 19:54:33 +00:00
Bruno Cardoso Lopes
a80b57e5fc
Add AVX version of CLMUL instructions
...
llvm-svn: 109248
2010-07-23 18:41:12 +00:00
Bruno Cardoso Lopes
93fd8bdf6a
Fix some AVX instructions which didn't have the HasAVX prefix. Also fix a problem with PINSRW, which was totally wrong because of a typo I introduced previously
...
llvm-svn: 109198
2010-07-23 00:14:54 +00:00
Bruno Cardoso Lopes
7722724eee
Add remaining AVX instructions (most of them dealing with GR64 destinations). This completes the assembler support for the general AVX ISA. But we still miss instructions from FMA3 and CLMUL specific feature flags, which are now the next step
...
llvm-svn: 109168
2010-07-22 21:18:49 +00:00
Eric Christopher
4924d5fb93
Custom lower the memory barrier instructions and add support
...
for lowering without sse2. Add a couple of new testcases.
Fixes a few libgomp tests and latent bugs. Remove a few todos.
llvm-svn: 109078
2010-07-22 02:48:34 +00:00
Bruno Cardoso Lopes
5920e38cd2
Add more 256-bit forms for a bunch of regular AVX instructions
...
Add 64-bit (GR64) versions of some instructions (which are not
described in their SSE forms, but are described in AVX)
llvm-svn: 109063
2010-07-21 23:53:50 +00:00
Bruno Cardoso Lopes
eea3b7ed83
Add missing AVX convert instructions. Those instructions are not described in their SSE forms (although they exist), but add the AVX forms anyway, so the assembler can benefit from it
...
llvm-svn: 109039
2010-07-21 21:37:59 +00:00
Bruno Cardoso Lopes
1284fdc932
Prevent AVX instructions from being selected instead of their SSE forms
...
llvm-svn: 109032
2010-07-21 20:38:42 +00:00
Bruno Cardoso Lopes
d13d8c2562
Add AVX only vzeroall and vzeroupper instructions
...
llvm-svn: 109002
2010-07-21 08:56:24 +00:00
Bruno Cardoso Lopes
c4d93a5a34
Add new AVX vpermilps, vpermilpd and vperm2f128 instructions
...
llvm-svn: 108984
2010-07-21 03:07:42 +00:00
Bruno Cardoso Lopes
a7efb29695
Add new AVX vmaskmov instructions, and also fix the VEX encoding bits to support it
...
llvm-svn: 108983
2010-07-21 02:46:58 +00:00
Bruno Cardoso Lopes
e0dce1c741
Add new AVX vextractf128 instructions
...
llvm-svn: 108964
2010-07-20 23:19:02 +00:00
Bruno Cardoso Lopes
b677cbc9b2
Add new AVX instruction vinsertf128
...
llvm-svn: 108892
2010-07-20 19:44:51 +00:00
Bruno Cardoso Lopes
88869cb4db
Add AVX vbroadcast new instruction
...
llvm-svn: 108788
2010-07-20 00:11:13 +00:00
Bruno Cardoso Lopes
4ca44dda21
Add 256-bit vaddsub, vhadd, vhsub, vblend and vdpp instructions!
...
llvm-svn: 108769
2010-07-19 23:32:44 +00:00
Bruno Cardoso Lopes
0616a418b6
Add AVX 256-bit compare instructions and a bunch of testcases
...
llvm-svn: 108286
2010-07-13 22:06:38 +00:00
Bruno Cardoso Lopes
7bc71d2d0a
AVX 256-bit conversion instructions
...
Add the x86 VEX_L form to handle special cases where VEX_L must be set.
llvm-svn: 108274
2010-07-13 21:07:28 +00:00
Bruno Cardoso Lopes
ae37153b05
Add AVX 256-bit packed logical forms
...
llvm-svn: 108224
2010-07-13 02:38:35 +00:00
Bruno Cardoso Lopes
495ae629bb
Add AVX 256-bit unop arithmetic instructions
...
llvm-svn: 108223
2010-07-13 01:53:31 +00:00
Bruno Cardoso Lopes
185483638b
Since AVX is a superset of all SSE versions, only use HasAVX for AVX instructions
...
llvm-svn: 108222
2010-07-13 00:38:47 +00:00
David Greene
d81591ee09
Move some SIMD fragment code into X86InstrFragmentsSIMD so that the
...
utility classes can be used from multiple files. This will aid
transitioning to a new refactored x86 SIMD specification.
llvm-svn: 108213
2010-07-12 23:41:28 +00:00
Bruno Cardoso Lopes
852e3bf472
Add AVX 256 binary arithmetic instructions
...
llvm-svn: 108207
2010-07-12 23:04:15 +00:00
Bruno Cardoso Lopes
b021506033
More refactoring of basic SSE arith instructions. Open room for 256-bit instructions
...
llvm-svn: 108204
2010-07-12 22:41:32 +00:00
Dan Gohman
e9c4426bb0
Apply the SSE dependence idiom for SSE unary operations to
...
SD instructions too, in addition to SS instructions. And
add a comment about it.
llvm-svn: 108191
2010-07-12 20:46:04 +00:00
Bruno Cardoso Lopes
a4889e6f93
Add AVX 256-bit MOVMSK forms
...
llvm-svn: 108184
2010-07-12 20:06:32 +00:00
Bruno Cardoso Lopes
f4180a9a7b
Add AVX 256-bit packed MOVNT variants
...
llvm-svn: 108021
2010-07-09 21:42:42 +00:00
Bruno Cardoso Lopes
6ca8dc935c
Add AVX 256-bit unpack and interleave
...
llvm-svn: 108017
2010-07-09 21:20:35 +00:00
Bruno Cardoso Lopes
3676e24b67
Start the support for AVX instructions with 256-bit %ymm registers. A couple of
...
notes:
- The instructions are being added with dummy placeholder patterns using some 256
specifiers. This is not meant to work yet, but since some multiclasses are
generic enough to accept them, the pieces will already be in place when we go
for codegen.
- Add VEX encoding bits to support YMM
- Add MOVUPS and MOVAPS in the first round
- Use "Y" as suffix for those Instructions: MOVUPSYrr, ...
- All AVX instructions in X86InstrSSE.td will move soon to a new X86InstrAVX
file.
llvm-svn: 107996
2010-07-09 18:27:43 +00:00
Bruno Cardoso Lopes
8d350872d4
Add AVX AES instructions
...
llvm-svn: 107798
2010-07-07 18:24:20 +00:00
Bruno Cardoso Lopes
6222076cd1
Add AVX SSE4.2 instructions
...
llvm-svn: 107752
2010-07-07 03:39:29 +00:00
Bruno Cardoso Lopes
931471d7e8
Use only one multiclass for pinsrq instructions
...
llvm-svn: 107750
2010-07-07 01:43:01 +00:00
Bruno Cardoso Lopes
65fbd0530f
Now that almost all SSE4.1 AVX instructions are added, move code around to more appropriate sections. No functionality changes
...
llvm-svn: 107749
2010-07-07 01:33:38 +00:00
Bruno Cardoso Lopes
675ebe2dc0
Add AVX SSE4.1 insertps, ptest and movntdqa instructions
...
llvm-svn: 107747
2010-07-07 01:14:56 +00:00
Bruno Cardoso Lopes
fa10461265
Add AVX SSE4.1 extractps and pinsr instructions
...
llvm-svn: 107746
2010-07-07 01:01:13 +00:00
Bruno Cardoso Lopes
54c2f858b3
Add AVX SSE4.1 Extract Integer instructions
...
llvm-svn: 107740
2010-07-07 00:07:24 +00:00
Bruno Cardoso Lopes
b9e1c33054
Add the rest of AVX SSE4.1 packed move with sign/zero extend instructions
...
llvm-svn: 107723
2010-07-06 23:15:17 +00:00
Bruno Cardoso Lopes
0c6ec0b068
Add part of AVX SSE4.1 packed move with sign/zero extend instructions
...
llvm-svn: 107720
2010-07-06 23:01:41 +00:00
Bruno Cardoso Lopes
a0b37e839c
Add AVX vblendvpd, vblendvps and vpblendvb instructions
...
Update VEX encoding to support those new instructions
llvm-svn: 107715
2010-07-06 22:36:24 +00:00
Chris Lattner
e7c95bcd9e
rip out even more sporadic v2f32 support.
...
llvm-svn: 107610
2010-07-05 04:38:33 +00:00
Bill Wendling
689155c673
Revert r107583. I no longer think that this is the way to solve the problem.
...
llvm-svn: 107585
2010-07-04 09:16:57 +00:00
Bill Wendling
8a3ecba7a4
Mark sse_load_f32 and sse_load_f64 as having memory operands
...
(SDNPMemOperand). This way when they're morphed the memory operands will be
copied as well.
llvm-svn: 107583
2010-07-04 08:59:55 +00:00
Bruno Cardoso Lopes
dc16024895
Add AVX SSE4.1 blend, mpsadbw and vdp
...
llvm-svn: 107560
2010-07-03 01:37:03 +00:00
Bruno Cardoso Lopes
9cbb625579
Add AVX SSE4.1 binop (some forms of packed max,min,mul,pack,cmp) instructions
...
llvm-svn: 107558
2010-07-03 01:15:47 +00:00
Bruno Cardoso Lopes
df02d037e4
Add AVX SSE4.1 Horizontal Minimum and Position instruction
...
llvm-svn: 107552
2010-07-03 00:49:21 +00:00
Bruno Cardoso Lopes
e6b70efcb0
Add AVX SSE4.1 round instructions
...
llvm-svn: 107549
2010-07-03 00:37:44 +00:00
Bruno Cardoso Lopes
473863e456
Simple refactoring of SSE4.1 instructions, making room for the AVX forms
...
llvm-svn: 107540
2010-07-02 23:27:59 +00:00
Bruno Cardoso Lopes
4931e183b5
- Add support for the rest of AVX SSE3 instructions
...
- Fix VEX prefix to be emitted with 3 bytes whenever VEX_5M
represents a REX equivalent two byte leading opcode
llvm-svn: 107523
2010-07-02 22:06:54 +00:00
Bruno Cardoso Lopes
c5670fcb23
Shrink down SSE3 code by more multiclass refactoring
...
llvm-svn: 107448
2010-07-01 23:10:49 +00:00
Bruno Cardoso Lopes
c215186088
Shrink down SSE3 code by some multiclass refactoring - 1st part
...
llvm-svn: 107438
2010-07-01 22:33:18 +00:00
Bruno Cardoso Lopes
511e5f47de
Move SSE3 Move patterns to a more appropriate section
...
Add AVX SSE3 packed horizontal add & sub instructions
llvm-svn: 107405
2010-07-01 17:35:02 +00:00
Bruno Cardoso Lopes
0a3048e8b9
Add AVX SSE3 packed addsub instructions
...
llvm-svn: 107404
2010-07-01 17:08:18 +00:00
Bruno Cardoso Lopes
c1abe91367
Add AVX SSE3 replicate and convert instructions
...
llvm-svn: 107375
2010-07-01 02:33:39 +00:00
Bruno Cardoso Lopes
956316a3d7
- Add AVX SSE2 Move doubleword and quadword instructions.
...
- Add encode bits for VEX_W
- All 128-bit SSE 1 & SSE2 instructions that are described
in the .td file now have an AVX encoded form already working.
llvm-svn: 107365
2010-07-01 01:20:06 +00:00
Bruno Cardoso Lopes
7ae1ebd3b4
Move MOVD/MOVQ code around, creating sections for each of them
...
llvm-svn: 107308
2010-06-30 18:49:10 +00:00
Bruno Cardoso Lopes
f8855c22be
Add AVX SSE2 mask creation and conditional store instructions
...
llvm-svn: 107306
2010-06-30 18:38:10 +00:00
Bruno Cardoso Lopes
6c468039a2
Fix a bug introduced in r107211 where instructions with memory operands are declared as commutable
...
llvm-svn: 107300
2010-06-30 18:06:01 +00:00
Bruno Cardoso Lopes
3c02702830
Add AVX SSE2 packed integer extract/insert instructions
...
llvm-svn: 107293
2010-06-30 17:03:03 +00:00
Bruno Cardoso Lopes
39594cc5d0
Add AVX SSE2 integer unpack instructions
...
llvm-svn: 107246
2010-06-30 04:06:39 +00:00
Bruno Cardoso Lopes
419f8f29c3
Add AVX SSE2 packed integer shuffle instructions
...
llvm-svn: 107245
2010-06-30 03:47:56 +00:00
Bruno Cardoso Lopes
c2f5cd2389
Small refactoring of SSE2 packed integer shuffle instructions
...
llvm-svn: 107243
2010-06-30 03:29:36 +00:00
Bruno Cardoso Lopes
d9acb34aa2
Add AVX SSE2 pack with saturation integer instructions
...
llvm-svn: 107241
2010-06-30 02:30:25 +00:00
Bruno Cardoso Lopes
c470ba9937
Add AVX SSE2 integer packed compare instructions
...
llvm-svn: 107240
2010-06-30 02:21:09 +00:00
Bruno Cardoso Lopes
cfbebb3921
- Add AVX form of all SSE2 logical instructions
...
- Add VEX encoding bits to x86 MRM0r-MRM7r
llvm-svn: 107238
2010-06-30 01:58:37 +00:00
Bruno Cardoso Lopes
2439877e05
Add *several* AVX integer packed binop instructions
...
llvm-svn: 107225
2010-06-29 23:47:49 +00:00
Bruno Cardoso Lopes
b80121d316
Move SSE2 Packed Integer instructions around, and create specific sections for each of them
...
llvm-svn: 107211
2010-06-29 22:12:16 +00:00