1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-22 04:22:57 +02:00
Commit Graph

1015 Commits

Author SHA1 Message Date
Craig Topper
0c4ab86d2c Fix the memop type on a couple 256-bit AVX instructions that were using f128mem instead of f256mem.
llvm-svn: 148196
2012-01-14 18:29:57 +00:00
Chad Rosier
4a705ae81a Fix pasto from r146196.
llvm-svn: 148167
2012-01-14 01:50:21 +00:00
Craig Topper
c1e3d46e07 Convert SHUFPD with the same register for both sources to PSHUFD if it would prevent a register copy. Similar to SHUFPS, but requires the mask to be converted.
llvm-svn: 148112
2012-01-13 09:21:41 +00:00
Craig Topper
e52c0484de Make X86 instruction selection use 256-bit VPXOR for build_vector of all ones if AVX2 is enabled. This gives the ExeDepsFix pass a chance to choose FP vs int as appropriate. Also use v8i32 as the type for getZeroVector if AVX2 is enabled. This is consistent with SSE2 using prefering v4i32.
llvm-svn: 148108
2012-01-13 08:12:35 +00:00
Craig Topper
71ea42cc29 Add patterns for v16i16 and v32i8 immAllZerosV to select VPXOR to match v4i64 and v8i32.
llvm-svn: 148106
2012-01-13 06:59:47 +00:00
Chad Rosier
7bab07a5f1 Add missing VEX predicates to VMOVSDto64rr/VMOVSDto64mr. This fixes a few
failing test cases on our internal AVX nightly tester.
rdar://10663637

llvm-svn: 147881
2012-01-10 22:14:06 +00:00
Craig Topper
c9756440ea Instruction selection priority fixes to remove the XMM/XMMInt/orAVX predicates. Another commit will remove orAVX functions from X86SubTarget.
llvm-svn: 147841
2012-01-10 06:30:56 +00:00
Craig Topper
915bc4edaa Add HasAVX predicate to some of the AVX patterns.
llvm-svn: 147769
2012-01-09 08:34:00 +00:00
Craig Topper
2710d2dadb Reorder a bunch of patterns to put the AVX version first thus giving it priority over the SSE version. Another step towards trying to remove the AVX hack that disables SSE from X86Subtarget.
llvm-svn: 147768
2012-01-09 08:10:38 +00:00
Craig Topper
ee2dabebe3 Clean up patterns for MOVNT*. Not sure why there were floating point types on MOVNTPS and MOVNTDQ. And v4i64 was completely missing.
llvm-svn: 147767
2012-01-09 06:52:46 +00:00
Craig Topper
8ce58f4687 Mark MOVNTI as being supported in SSE2 OR AVX mode. This instruction has no AVX equivalent so we should use the SSE version.
llvm-svn: 147766
2012-01-09 06:38:55 +00:00
Craig Topper
bd6f3004ba Move SSE2 logical operations PAND/POR/PXOR/PANDN above SSE1 logical operations ANDPS/ORPS/XORPS/ANDNPS. This fixes a pattern ordering issue that meant that the SSE2 instructions could never be directly selected since the SSE1 patterns would always match first. This is largely moot with the ExeDepsFix pass, but I'm trying to audit for all such ordering issues.
llvm-svn: 147765
2012-01-09 05:07:01 +00:00
Chad Rosier
afcaa8f38a Enhance DAGCombine for transforming 128->256 casts into a vmovaps, rather
then a vxorps + vinsertf128 pair if the original vector came from a load.
rdar://10594409

llvm-svn: 147481
2012-01-03 21:05:52 +00:00
Craig Topper
69e0a09d8f Make CanXFormVExtractWithShuffleIntoLoad reject loads with multiple uses. Also make it return false if there's not even a load at all. This makes the code better match the code in DAGCombiner that it tries to match. These two changes prevent some cases where vector_shuffles were making it to instruction selection and causing the older shuffle selection code to be triggered. Also needed to fix a bad pattern that this change exposed. This is the first step towards getting rid of the old shuffle selection support. No test cases yet because there's no way to tell whether a shuffle was handled in the legalize stage or at instruction selection.
llvm-svn: 147428
2012-01-02 08:46:48 +00:00
Craig Topper
d8ae2d9f27 Fix sfence, lfence, mfence, and clflush to be able to be selected when AVX is enabled. Fix monitor and mwait to require SSE3 or AVX, previously they worked even if SSE3 was disabled. Make prefetch instructions not set the execution domain since they don't use XMM registers.
llvm-svn: 147409
2012-01-01 19:40:22 +00:00
Craig Topper
ef59fe1ad4 Merge X86 SHUFPS and SHUFPD node types.
llvm-svn: 147394
2011-12-31 23:50:21 +00:00
Craig Topper
0311c45aed Add patterns for integer forms of SHUFPD/VSHUFPD with a memory load.
llvm-svn: 147393
2011-12-31 23:24:49 +00:00
Craig Topper
c01ce759d7 Fix typo in a SHUFPD and VSHUFPD pattern that prevented SHUFPD/VSHUFPD with a load from being selected.
llvm-svn: 147392
2011-12-31 23:15:11 +00:00
Craig Topper
04b3b369de Remove the separate explicit AES instruction patterns. They are equivalent to the patterns specified by the instructions. Also remove unnecessary bitconverts from the AES patterns.
llvm-svn: 147342
2011-12-29 17:41:56 +00:00
Chad Rosier
98251404f7 Fix 80-column violations.
llvm-svn: 147095
2011-12-21 20:59:09 +00:00
Elena Demikhovsky
b37883fe87 This is the second fix related to VZEXT_MOVL node.
The failure that I see in the current version is:

LLVM ERROR: Cannot select: 0x18b8f70: v4i64 = X86ISD::VZEXT_MOVL 0x18beee0 [ID=14]
  0x18beee0: v4i64 = insert_subvector 0x18b8c70, 0x18b9170, 0x18b9570 [ID=13]
    0x18b8c70: v4i64 = insert_subvector 0x18b9870, 0x18bf4e0, 0x18b9970 [ID=12]
      0x18b9870: v4i64 = undef [ID=4]
      0x18bf4e0: v2i64 = bitcast 0x18bf3e0 [ID=10]
        0x18bf3e0: v4i32 = BUILD_VECTOR 0x18b9770, 0x18b9770, 0x18b9770, 0x18b9770 [ID=8]
          0x18b9770: i32 = TargetConstant<0> [ID=6]
          0x18b9770: i32 = TargetConstant<0> [ID=6]
          0x18b9770: i32 = TargetConstant<0> [ID=6]
          0x18b9770: i32 = TargetConstant<0> [ID=6]
      0x18b9970: i32 = Constant<0> [ID=3]
    0x18b9170: v2i64 = undef [ORD=1] [ID=1]
    0x18b9570: i32 = Constant<2> [ID=5]

llvm-svn: 146975
2011-12-20 13:34:28 +00:00
Eli Friedman
f626b19bda Make sure we correctly note the existence of an i8 immediate for vblendvps and friends, so we compute fixups correctly. PR11586.
llvm-svn: 146709
2011-12-15 23:46:18 +00:00
Chad Rosier
62ebee9859 Add missing zmovl AVX patterns which were causing crashes.
Patch by Elena Demikhovsky <elena.demikhovsky@intel.com>!

llvm-svn: 146689
2011-12-15 22:11:31 +00:00
Benjamin Kramer
06cd66b1d7 X86: Add patterns for the various rounding ops for SSE4.1 and AVX.
llvm-svn: 146257
2011-12-09 15:44:03 +00:00
Benjamin Kramer
66bfc0739d X86: Split (v)rounds[sd] into a normal and an intrinsic version.
llvm-svn: 146256
2011-12-09 15:43:55 +00:00
Evan Cheng
ad8debd736 Add 256-bit variant vmovss and vmovsd patterns. rdar://10538417
llvm-svn: 146196
2011-12-08 22:30:45 +00:00
Evan Cheng
d8a73b8918 Add various missing AVX patterns which was causing crashes. Sadly, the generated
code looks pretty bad compared to SSE.

rdar://10538793

llvm-svn: 146191
2011-12-08 22:05:28 +00:00
Evan Cheng
93e29adc2f Many of the SSE patterns should not be selected when AVX is available. This led to the following code in X86Subtarget.cpp
if (HasAVX)
    X86SSELevel = NoMMXSSE;

This is so patterns that are predicated on hasSSE3, etc. would not be selected when avx is available. Instead, the AVX variant is selected.
However, this breaks instructions which do not have AVX variants.

The right way to fix this is for the SSE but not-AVX patterns to predicate on something like hasSSE3() && !hasAVX().
Then we can take out the hack in X86Subtarget.cpp. Patterns which do not have AVX variants do not need to change.

However, we need to audit all the patterns before we make the change. This patch is workaround that fixes one specific case,
the prefetch instructions. rdar://10538297

llvm-svn: 146163
2011-12-08 19:00:42 +00:00
Craig Topper
6b3cc1405f Fix a bunch of SSE/AVX patterns to use proper memop types. In particular, not using integer loads other than v2i64/v4i64 since the others are all promoted.
llvm-svn: 146031
2011-12-07 08:30:53 +00:00
Craig Topper
8b05e7d035 Fix a bunch of SSE/AVX patterns to use v2i64/v4i64 loads since all other integer vector loads are promoted to those.
llvm-svn: 145927
2011-12-06 09:04:59 +00:00
Craig Topper
846d53deed Merge floating point and integer UNPCK X86ISD node types.
llvm-svn: 145926
2011-12-06 08:21:25 +00:00
Craig Topper
0ac9bb8aa1 Merge VPERM2F128/VPERM2I128 ISD node types.
llvm-svn: 145485
2011-11-30 07:47:51 +00:00
Craig Topper
43b885cff4 Merge decoding of VPERMILPD and VPERMILPS shuffle masks. Merge X86ISD node type for VPERMILPD/PS. Add instruction selection support for VINSERTI128/VEXTRACTI128.
llvm-svn: 145483
2011-11-30 06:25:25 +00:00
Evan Cheng
5c1efd630b Add another missing pattern. llvm-gcc likes f64 but clang likes i64 so it was generating poor code for some SSE builtins.
llvm-svn: 145448
2011-11-29 22:48:34 +00:00
Jakob Stoklund Olesen
5d6a4584d9 Make X86::FsFLD0SS / FsFLD0SD real pseudo-instructions.
Like V_SET0, these instructions are expanded by ExpandPostRA to xorps /
vxorps so they can participate in execution domain swizzling.

This also makes the AVX variants redundant.

llvm-svn: 145440
2011-11-29 22:27:25 +00:00
Elena Demikhovsky
735cff1fa8 Fixed vsqrt.ss intrinsic usage - order of input operands was wrong.
Added a test.
Thanks Bruno for reviewing the patch.

llvm-svn: 145403
2011-11-29 15:00:45 +00:00
Craig Topper
4550fc2649 Fix issues in shuffle decoding around VPERM* instructions. Fix shuffle decoding for VSHUFPS/D for 256-bit types. Add pattern matching for memory forms of VPERMILPS/VPERMILPD.
llvm-svn: 145390
2011-11-29 07:49:05 +00:00
Craig Topper
aca91b9f14 Fix VINSERTF128/VEXTRACTF128 to be marked as FP instructions. Allow execution dependency fix pass to convert them to their integer equivalents when AVX2 is enabled.
llvm-svn: 145376
2011-11-29 05:37:58 +00:00
Craig Topper
a6c1d25798 Correctly mark VPERM2F128 as being an FP instruction and add execution domain fixing support to convert it to VPERM2I128 for AVX2.
llvm-svn: 145370
2011-11-29 03:57:34 +00:00
Evan Cheng
1ed975b097 Add missing avx pattern.
llvm-svn: 145272
2011-11-28 20:27:23 +00:00
Craig Topper
6f5a0bc4e3 Add X86 instruction selection for VPERM2I128 when AVX2 is enabled. Merge VPERMILPS/VPERMILPD detection since they are pretty similar.
llvm-svn: 145238
2011-11-28 10:14:51 +00:00
Craig Topper
563854a230 Merge 128-bit and 256-bit X86ISD node types for VPERMILPS and VPERMILPD. Simplify some shuffle lowering code since V1 can never be UNDEF due to canonalizing that occurs when shuffle nodes are created.
llvm-svn: 145153
2011-11-26 22:55:48 +00:00
Craig Topper
65f8dcdb7d Collapse X86ISD node types for PUNPCKH*, PUNPCKL*, UNPCKLP*, and UNPCKHP* to not be type specific. Now we just have integer high and low and floating point high and low. Pattern matching will choose the correct instruction based on the vector type.
llvm-svn: 145148
2011-11-26 20:47:44 +00:00
Craig Topper
e761f42368 Remove 256-bit specific node types for UNPCKHPS/D and instead use the 128-bit versions and let the operand type disinquish. Also fix the load form of the v8i32 patterns for these to realize that the load would be promoted to v4i64.
llvm-svn: 145126
2011-11-24 22:57:10 +00:00
Craig Topper
7cf04d32e9 Remove AVX2 specific X86ISD node types for PUNPCKH/L and instead just reuse the 128-bit versions and let the vector type distinguish.
llvm-svn: 145125
2011-11-24 22:20:08 +00:00
Craig Topper
866214a486 Lowering for v32i8 to VPUNPCKLBW/VPUNPCKHBW when AVX2 is enabled.
llvm-svn: 145028
2011-11-21 08:26:50 +00:00
Craig Topper
14cedf481a Add support for lowering 256-bit shuffles to VPUNPCKL/H for i16, i32, i64 if AVX2 is enabled.
llvm-svn: 145026
2011-11-21 06:57:39 +00:00
Craig Topper
e878c775cf Add code for lowering v32i8 shifts by a splat to AVX2 immediate shift instructions. Remove 256-bit splat handling from LowerShift as it was already handled by PerformShiftCombine.
llvm-svn: 145005
2011-11-20 00:12:05 +00:00
Craig Topper
6ed413c495 Use 256-bit vcmpeqd for creating an all ones vector when AVX2 is enabled.
llvm-svn: 145004
2011-11-19 22:34:59 +00:00
Craig Topper
3e24dc25b2 Remove some of the special classes that worked around an old tablegen limitation of not being able to remove redundant bitconverts from patterns.
llvm-svn: 145003
2011-11-19 21:01:54 +00:00
Craig Topper
c6a4cbdc04 Custom lower AVX2 variable shift intrinsics to shl/srl/sra nodes and remove the intrinsic patterns.
llvm-svn: 144999
2011-11-19 17:46:46 +00:00
Craig Topper
a64e2604a2 Synthesize SSSE3/AVX 128-bit horizontal integer add/sub instructions from add/sub of appropriate shuffle vectors.
llvm-svn: 144989
2011-11-19 09:02:40 +00:00
Craig Topper
117ffc9a0c Collapse X86 PSIGNB/PSIGNW/PSIGND node types.
llvm-svn: 144988
2011-11-19 07:33:10 +00:00
Craig Topper
536f9d9434 Extend VPBLENDVB and VPSIGN lowering to work for AVX2.
llvm-svn: 144987
2011-11-19 07:07:26 +00:00
Craig Topper
0deee76383 Remove unused parameters from the AVX maskmov classes.
llvm-svn: 144985
2011-11-19 04:49:22 +00:00
Nadav Rotem
08f8a75c2c Add AVX2 vpbroadcast support
llvm-svn: 144967
2011-11-18 02:49:55 +00:00
Craig Topper
7297509c73 Fix SSE/AVX integer comparison patterns to understand that all integer vector loads are promoted to i64 vector loads so patterns need a bitconvert. Also slightly simplify the AVX2 variable shift patterns by using the predefined bitconvert pattern fragments.
llvm-svn: 144896
2011-11-17 07:49:38 +00:00
Craig Topper
4d39196041 Remove seemingly unnecessary duplicate VROUND definitions.
llvm-svn: 144885
2011-11-17 07:04:00 +00:00
Evan Cheng
5bae2333cb Another missing X86ISD::MOVLPD pattern. rdar://10450317
llvm-svn: 144839
2011-11-16 22:24:44 +00:00
Craig Topper
7a4d482aaa Fix the execution domain on a bunch of SSE/AVX instructions.
llvm-svn: 144784
2011-11-16 07:30:46 +00:00
Evan Cheng
2034ff3b0b Add a missing pattern for X86ISD::MOVLPD. rdar://10436044
llvm-svn: 144566
2011-11-14 20:35:52 +00:00
Craig Topper
e0b34012db Add neverHasSideEffects, mayLoad, and mayStore to many patternless SSE/AVX instructions. Remove MMX check from LowerVECTOR_SHUFFLE since MMX vector types won't go through it anyway.
llvm-svn: 144522
2011-11-14 06:46:21 +00:00
Craig Topper
0458cdf64a Add more AVX2 shift lowering support. Move AVX2 variable shift to use patterns instead of custom lowering code.
llvm-svn: 144457
2011-11-12 09:58:49 +00:00
Craig Topper
50df7c3842 Add lowering for AVX2 shift instructions.
llvm-svn: 144380
2011-11-11 07:39:23 +00:00
Nadav Rotem
e3d8f1a069 AVX2: Add variable shift from memory.
Note: These patterns only works in some cases because
many times the load sd node is bitcasted from a load
node of a different type.

llvm-svn: 144266
2011-11-10 06:54:20 +00:00
Nadav Rotem
ddc6bfa543 AVX2: Add patterns for variable shift operations
llvm-svn: 144212
2011-11-09 21:22:13 +00:00
Nadav Rotem
e66a72a2c4 Add AVX2 support for vselect of v32i8
llvm-svn: 144187
2011-11-09 13:21:28 +00:00
Craig Topper
7ff77dc2b1 Add instruction selection for AVX2 integer comparisons.
llvm-svn: 144176
2011-11-09 08:06:13 +00:00
Evan Cheng
4a63100fe3 Add x86 isel logic and patterns to match movlps from clang generated IR for _mm_loadl_pi(). rdar://10134392, rdar://10050222
llvm-svn: 144052
2011-11-08 00:31:58 +00:00
Craig Topper
7eab73f510 Add AVX2 variable shift instructions and intrinsics.
llvm-svn: 143915
2011-11-07 08:26:24 +00:00
Craig Topper
b1ef950217 Add AVX2 VPMOVMASK instructions and intrinsics.
llvm-svn: 143904
2011-11-07 03:20:35 +00:00
Craig Topper
d422190c0f Add AVX2 VEXTRACTI128 and VINSERTI128 instructions. Fix VPERM2I128 to be qualified with HasAVX2 instead of HasAVX. Mark VINSERTF128 and VEXTRACTF128 as never having side effects.
llvm-svn: 143902
2011-11-07 02:00:04 +00:00
Craig Topper
01b852b95a More AVX2 instructions and their intrinsics.
llvm-svn: 143895
2011-11-06 23:04:08 +00:00
Craig Topper
31b1d79474 Add more AVX2 instructions and intrinsics.
llvm-svn: 143861
2011-11-06 06:12:20 +00:00
Craig Topper
80cdc1ee12 Add intrinsics for X86 vcvtps2ph and vcvtph2ps instructions
llvm-svn: 143683
2011-11-04 06:59:49 +00:00
Craig Topper
124b2fd08c Add new X86 AVX2 VBROADCAST instructions.
llvm-svn: 143612
2011-11-03 07:35:53 +00:00
Craig Topper
a2a55bd0b4 More AVX2 instructions and intrinsics.
llvm-svn: 143536
2011-11-02 06:54:17 +00:00
Craig Topper
c5482eb697 Add a bunch more X86 AVX2 instructions and their corresponding intrinsics.
llvm-svn: 143529
2011-11-02 04:42:13 +00:00
Craig Topper
6eaf58df7c Begin adding AVX2 instructions. No selection support yet other than intrinsics.
llvm-svn: 143331
2011-10-31 02:15:10 +00:00
Jakob Stoklund Olesen
d7827f928d V_SET0 has no side effects.
TableGen will mark any pattern-less instruction as having unmodeled side
effects. This is extra bad for V_SET0 which gets rematerialized a lot.

This was part of the cause for PR11125, but the real bug was fixed
in r141923.

llvm-svn: 141924
2011-10-14 00:39:50 +00:00
Craig Topper
0d25fa802f Add 'implicit EFLAGS' to patterns for popcnt and lzcnt
llvm-svn: 141853
2011-10-13 06:18:52 +00:00
Craig Topper
881d972428 Add HasPOPCNT predicate to the POPCNT instructions. Also mark POPCNT as modifying EFLAGS.
llvm-svn: 141656
2011-10-11 07:13:09 +00:00
Craig Topper
db2d702bff Make Ivy Bridge 16-bit floating point conversion instructions require AVX.
llvm-svn: 141654
2011-10-11 07:01:37 +00:00
Craig Topper
9b7ab95570 Add Ivy Bridge 16-bit floating point conversion instructions for the X86 disassembler.
llvm-svn: 141505
2011-10-09 07:31:39 +00:00
Craig Topper
9d32602cfd Add support in the disassembler for ignoring the L-bit on certain VEX instructions. Mark instructions that have this behavior. Fixes PR10676.
llvm-svn: 141065
2011-10-04 06:30:42 +00:00
Craig Topper
df04bee9b2 Add support for MOVBE and RDRAND instructions for the assembler and disassembler. Includes feature flag checking, but no instrinsic support. Fixes PR10832, PR11026 and PR11027.
llvm-svn: 141007
2011-10-03 17:28:23 +00:00
Jakob Stoklund Olesen
76da38e8e8 Expand the x86 V_SET0* pseudos right after register allocation.
This also makes it possible to reduce the number of pseudo instructions
and get rid of the encoding information.

llvm-svn: 140776
2011-09-29 05:10:54 +00:00
Duncan Sands
6d3fe8d11a Implement Chris's suggestion of legalizing the various SSE and AVX
hadd/hsub intrinsics into the new fhadd/fhsub X86 node.

llvm-svn: 140383
2011-09-23 16:10:22 +00:00
Duncan Sands
1da590b589 Synthesize SSE3/AVX 128 bit horizontal add/sub instructions from
floating point add/sub of appropriate shuffle vectors.  Does not
synthesize the 256 bit AVX versions because they work differently.

llvm-svn: 140332
2011-09-22 20:15:48 +00:00
Bruno Cardoso Lopes
629e7c2410 Revert r140097, working on a better approach
llvm-svn: 140203
2011-09-20 23:19:29 +00:00
Bruno Cardoso Lopes
906f64c461 The wrong relocation was being emitted for several SSSE3 instructions.
This fixes PR10963. Thanks to Benjamin for finding the wrong tablegen
declaration.

llvm-svn: 140184
2011-09-20 21:39:21 +00:00
Bruno Cardoso Lopes
de0dc10d6d Fix PR10949. Fix the encoding of VMOVPQIto64rr.
llvm-svn: 140098
2011-09-19 23:36:59 +00:00
Bruno Cardoso Lopes
7cf7f02c3d Based on the small opt Zvi's patch was trying to achieve, eliminate
128-bit undef subvector insertion into a 256-bit vector

llvm-svn: 140097
2011-09-19 23:36:50 +00:00
Bruno Cardoso Lopes
9e5ef44daf Match X86ISD::FSETCCsd and X86ISD::FSETCCss while in AVX mode. This fix
PR10955 and PR10948.

llvm-svn: 140069
2011-09-19 21:29:24 +00:00
Bruno Cardoso Lopes
f611f6c371 Describe more AVX 128-bit convert instructions without patterns to have
mayLoad = 1

llvm-svn: 139973
2011-09-16 23:41:29 +00:00
Bruno Cardoso Lopes
396b8136bf Add mayLoad attribute to AVX convert instructions, since non of them
are declared with load patterns. This fix the crash in PR10941. No testcases,
since a fold is triggered and then converted back to the register form
afterwards.

llvm-svn: 139953
2011-09-16 22:02:14 +00:00
Craig Topper
60719c7bfb Fix mem type for VEX.128 form of VROUNDP*. Remove filter preventing VROUND from being recognized by disassembler.
llvm-svn: 139691
2011-09-14 06:41:26 +00:00
Bruno Cardoso Lopes
27a7ace4b4 Teach the foldable tables about 128-bit AVX instructions and make the
alignment check for 256-bit classes more strict. There're no testcases
but we catch more folding cases for AVX while running single and multi
sources in the llvm testsuite.

Since some 128-bit AVX instructions have different number of operands
than their SSE counterparts, they are placed in different tables.

256-bit AVX instructions should also be added in the table soon. And
there a few more 128-bit versions to handled, which should come in
the following commits.

llvm-svn: 139687
2011-09-14 02:36:58 +00:00
Nadav Rotem
f1730712f7 swap vselect operand order - pr10907
llvm-svn: 139630
2011-09-13 19:56:38 +00:00
Bruno Cardoso Lopes
f02589db47 Add versions 256-bit versions of alignedstore and alignedload, to be
more strict about the alignment checking. This was found by inspection
and I don't have any testcases so far, although the llvm testsuite runs
without any problem.

llvm-svn: 139625
2011-09-13 19:33:03 +00:00
Craig Topper
03c833ff84 Remove filter that was preventing MOVDQU/MOVDQA and their VEX forms from being disassembled. Also added encodings for the other register/register form of these instructions. Fixes PR10848.
llvm-svn: 139588
2011-09-13 06:54:58 +00:00
Craig Topper
6eeb5396f8 Fix encoding of VMOVDQU to not simultaneously be 'TB OpSize' and 'XS'. 'XS' is correct and seems to have been taking priority.
llvm-svn: 139587
2011-09-13 06:39:34 +00:00
Bruno Cardoso Lopes
a4d2bdfa40 Fix PR10845. SUBREG_TO_REG shouldn't be used when the input and
destination types are equal!

llvm-svn: 139553
2011-09-12 22:59:23 +00:00
Bruno Cardoso Lopes
e2fc394ed2 Organize a bit the operand names for CMPPS and CMPPD
llvm-svn: 139527
2011-09-12 19:30:36 +00:00
Bruno Cardoso Lopes
fc1c90ac48 Realign BLEND patterns to match the general style for patterns in .td file.
llvm-svn: 139526
2011-09-12 19:30:33 +00:00
Bruno Cardoso Lopes
f0e65e0f13 Fix 80-columns
llvm-svn: 139525
2011-09-12 19:30:29 +00:00
Nadav Rotem
06ce2ac074 Format patterns, remove unused X86blend patterns
llvm-svn: 139491
2011-09-12 08:41:50 +00:00
Craig Topper
5ffd0cb080 Fix disassembling of one of the register/register forms of MOVUPS/MOVUPD/MOVAPS/MOVAPD/MOVSS/MOVSD and their VEX equivalents. Fixes PR10877.
llvm-svn: 139486
2011-09-11 23:19:54 +00:00
Nadav Rotem
abb5bb41d4 CR fixes per Bruno's request.
Undo the changes from r139285 which added custom lowering to vselect.
Add tablegen lowering for vselect.

llvm-svn: 139479
2011-09-11 15:02:23 +00:00
Nadav Rotem
ccb46031e6 Implement vector-select support for avx256. Refactor the vblend implementation to have tablegen match the instruction by the node type
llvm-svn: 139400
2011-09-09 20:29:17 +00:00
Bruno Cardoso Lopes
54962ac233 Add a AVX version of a simple i64 -> f64 bitcast. This could be
triggered using llc with -O0, which wouldn't let it be folded and
expose the lack of this pattern.

llvm-svn: 139320
2011-09-08 21:52:33 +00:00
Bruno Cardoso Lopes
74a67e22b0 Add AVX versions of blend vector operations and fix some issues noticed
in Nadav's r139285 and r139287 commits.

1) Rename vsel.ll to a more descriptive name
2) Change the order of BLEND operands to "Op1, Op2, Cond", this is
necessary because PBLENDVB is already used in different places with
this order, and it was being emitted in the wrong way for vselect
3) Add AVX patterns and tests for the same SSE41 instructions

llvm-svn: 139305
2011-09-08 18:05:08 +00:00
Bruno Cardoso Lopes
84c53e3965 Fix PR10844: Add patterns to cover non foldable versions of X86vzmovl.
Triggered using llc -O0. Also fix some SET0PS patterns to their AVX
forms and test it on the testcase.

llvm-svn: 139304
2011-09-08 18:05:02 +00:00
Nadav Rotem
b461f2190e Add X86-SSE4 codegen support for vector-select.
llvm-svn: 139285
2011-09-08 08:11:19 +00:00
Bruno Cardoso Lopes
02157d584a Add AVX versions to match AESENC/AESDEC intrinsics. This hopefully ends
the cycle of missing AVX counterparts of already present SSE* patterns

llvm-svn: 139073
2011-09-03 00:47:08 +00:00
Bruno Cardoso Lopes
c72ce24240 Add AVX version of a SSE4.1 VPBLENDVB pattern
llvm-svn: 139072
2011-09-03 00:47:05 +00:00
Bruno Cardoso Lopes
a25fc6f941 Add AVX versions of SSE4.1 EXTRACTPS patterns
llvm-svn: 139071
2011-09-03 00:47:03 +00:00
Bruno Cardoso Lopes
45d02d5eca Add AVX versions for SSE4.1 MOVZX* patterns
llvm-svn: 139070
2011-09-03 00:47:01 +00:00
Bruno Cardoso Lopes
cadec3711c Add one more AVX pattern for MOVZPQILo2PQI
llvm-svn: 139069
2011-09-03 00:46:58 +00:00
Bruno Cardoso Lopes
48eeb79003 Move PUNPCKLQDQ splat pattern close to the instruction definition and
duplicate it for AVX mode.

llvm-svn: 139068
2011-09-03 00:46:56 +00:00
Bruno Cardoso Lopes
ca90af60bd Add AVX pattern versions for PSHUFB,PSIGN{B,W,D}
llvm-svn: 139067
2011-09-03 00:46:54 +00:00
Bruno Cardoso Lopes
7fae5ca308 Add AVX versions of MOVZDI2PDI patterns. Use SUBREG_TO_REG to indicate
that the AVX versions (even the 128-bit ones) all clear the upper part
of the destination register.

llvm-svn: 139066
2011-09-03 00:46:51 +00:00
Bruno Cardoso Lopes
e749426ece Enforce subtarget checks in a few places to be explicit when the
pattern should be matched

llvm-svn: 139065
2011-09-03 00:46:49 +00:00
Bruno Cardoso Lopes
323a5b334e Tidy up code moving patterns to their appropriate place!
llvm-svn: 139064
2011-09-03 00:46:47 +00:00
Bruno Cardoso Lopes
ea1931b9d0 Add AVX versions of FsMOVAPS and FsMOVAPS. Teach X86InstrInfo how to use
it!

llvm-svn: 139063
2011-09-03 00:46:45 +00:00
Bruno Cardoso Lopes
86c67e11c9 Fix 80-column and style
llvm-svn: 139061
2011-09-03 00:46:40 +00:00
Bruno Cardoso Lopes
beb7a448e7 Tidy up some SSE/AVX convert intrinsics. Also add an AVX version of
OptForSize pattern

llvm-svn: 139060
2011-09-03 00:46:38 +00:00
Bruno Cardoso Lopes
8771512b75 Move more code around and duplicate AVX patterns: MOVHPS and MOVLPS
llvm-svn: 138897
2011-08-31 21:15:32 +00:00
Bruno Cardoso Lopes
22aceefbf7 Move MOVAPS,MOVUPS patterns close to the instructions definition
llvm-svn: 138896
2011-08-31 21:15:29 +00:00
Bruno Cardoso Lopes
4823fe07e6 Remove "_Int" forms of MOVUPSmr and MOVAPSmr
llvm-svn: 138895
2011-08-31 21:15:22 +00:00
Bruno Cardoso Lopes
5bd6e92f99 - Move all MOVSS and MOVSD patterns close to their definitions
- Duplicate some store patterns to their AVX forms!
- Catched a bug while restricting the patterns subtarget, fix it
  and update a testcase to check it properly

llvm-svn: 138851
2011-08-31 03:04:20 +00:00
Bruno Cardoso Lopes
a9c2c56e13 Remove unnecessary AVX checks
llvm-svn: 138850
2011-08-31 03:04:14 +00:00
Evan Cheng
bbabe9ff60 Fix (movhps load) lowering / pattern to match more cases. rdar://10050549
llvm-svn: 138848
2011-08-31 02:05:24 +00:00
Bruno Cardoso Lopes
3a09888a72 Move non-intruction patterns to a more appropriate place!
llvm-svn: 138744
2011-08-29 17:51:24 +00:00
Craig Topper
b20cee1e19 Fix disassembling of VCVTSD2SI
llvm-svn: 138623
2011-08-26 04:49:29 +00:00
Bruno Cardoso Lopes
e6119d18de Do the same as r138461. Mark VZEROALL as clobbering all YMM registers
llvm-svn: 138592
2011-08-25 22:23:58 +00:00
Bruno Cardoso Lopes
5b3d2c9e17 Add support for AVX 256-bit version of MOVDDUP!
llvm-svn: 138588
2011-08-25 21:40:37 +00:00
Craig Topper
a6085b9757 Add more missing TB encodings to VEX instructions to allow them to be disassembled. Fixes remainder of PR10678.
llvm-svn: 138553
2011-08-25 08:11:01 +00:00
Craig Topper
06ed6cb856 Add TB encoding to VEROALL, VZEROUPPER, and VCVTPS2PD to allow them to be disassembled. Fixes PR10723.
llvm-svn: 138551
2011-08-25 06:57:46 +00:00
Bruno Cardoso Lopes
5d34219953 Add support for 256-bit versions of VSHUFPD and VSHUFPS.
llvm-svn: 138546
2011-08-25 02:58:26 +00:00
Bruno Cardoso Lopes
dfa5cf4620 Create a section for non-instructions patterns in the beginning of the
file, and move more code around!

llvm-svn: 138521
2011-08-24 23:18:11 +00:00
Bruno Cardoso Lopes
719b357628 Move code around!
llvm-svn: 138520
2011-08-24 23:18:09 +00:00
Bruno Cardoso Lopes
3824b766ac Organize UNPCK* patterns, also add remaining for AVX.
llvm-svn: 138519
2011-08-24 23:18:06 +00:00
Bruno Cardoso Lopes
82c8bc7efd Move remaining MOVDDUP patterns close to MOVDDUP defintion and duplicate
the missing ones for AVX.

llvm-svn: 138518
2011-08-24 23:18:04 +00:00
Bruno Cardoso Lopes
d315a6b6e6 Organize and tidy up MOVDDUP section. Also update comments!
llvm-svn: 138517
2011-08-24 23:18:02 +00:00
Bruno Cardoso Lopes
762fb13cc9 Move MOVHLPS patterns close to MOVHLPS definition, and duplicate the
pattern for 128-bit AVX mode.

llvm-svn: 138516
2011-08-24 23:17:59 +00:00
Bruno Cardoso Lopes
d62766849f Move all PSHUF* patterns close to the PSHUF* definitions. Also be
explicit about which subtarget they refer to, and add AVX versions of
the ones we currently don't. Remove old and now wrong comments!

llvm-svn: 138515
2011-08-24 23:17:57 +00:00
Bruno Cardoso Lopes
122f7cfc92 Move all SHUFP* patterns close to the SHUFP* definitions. Also be
explicit about which subtarget they refer to, and add AVX versions of
the ones we currently don't. Make the mask check more strict, to be
clear it won't be used to match to 256-bit versions!

llvm-svn: 138514
2011-08-24 23:17:55 +00:00
Bruno Cardoso Lopes
734febce18 Mark VZEROALL as clobbering all YMM registers
llvm-svn: 138461
2011-08-24 18:48:33 +00:00
Bruno Cardoso Lopes
8959b54713 Fix a nasty bug where a v4i64 was being wrong emitted with 32-bit
permutations. Also tidy up some patterns and make them close to their
instruction definition!

llvm-svn: 138392
2011-08-23 22:06:37 +00:00
Craig Topper
67b22aedb4 Add support for breaking 256-bit v16i16 and v32i8 VSETCC into two 128-bit ones, avoiding sclarization. Add vex form of pcmpeqq and pcmpgtq. Fixes more cases for PR10712.
llvm-svn: 138321
2011-08-23 04:36:33 +00:00
Bruno Cardoso Lopes
23ff325f5b Add 128-bit AVX codegen for PCMP* family of integer instructions
llvm-svn: 138270
2011-08-22 20:31:00 +00:00
Craig Topper
f68d77215d Add TB encoding to VEX versions of SSE fp logical operations to fix disassembler
llvm-svn: 138034
2011-08-19 05:28:50 +00:00
Bruno Cardoso Lopes
0d458d4bb3 Re-encoded 128-bit AVX versions of SQRT, RSQRT, RCP have 3 operands
instead of 2. They were already defined this way in their regular
version, but not for the intrinsics versions (*_Int), and that would work
for assembly emission but not for object code, since a MachineOperand
would be missing. This commit fix PR10697.

Also removed the {VSQRT,VRSQRT,VRCP}r_Int forms and match the intrinsic
via INSERT_SUBREG+EXTRACT_SUBREG patterns. The same couldn't be done for
memory versions because sse_load_f32/sse_load_f64 operand need special
handling and don't work like regular "addr" operands.

There are right now 114 "*_Int" and 98 "Int_*" forms! I'm slowly
removing them as I step through, but hope we can get rid of these
someday, they are really annoying :)

llvm-svn: 138012
2011-08-18 23:59:21 +00:00
Bruno Cardoso Lopes
c174d8ac48 Cleanup vector logical ops in AVX and add use int versions for simple
v2i64

llvm-svn: 137919
2011-08-18 02:11:34 +00:00
Bruno Cardoso Lopes
98531dfd08 Introduce matching patterns for vbroadcast AVX instruction. The idea is to
match splats in the form (splat (scalar_to_vector (load ...))) whenever
the load can be folded. All the logic and instruction emission is
working but because of PR8156, there are no ways to match loads, cause
they can never be folded for splats. Thus, the tests are XFAILed, but
I've tested and exercised all the logic using a relaxed version for
checking the foldable loads, as if the bug was already fixed. This
should work out of the box once PR8156 gets fixed since MayFoldLoad will
work as expected.

llvm-svn: 137810
2011-08-17 02:29:19 +00:00
Bruno Cardoso Lopes
f026c60f3d While I'm here, remove the "_alt" hacks to a series of INSERT_SUBREG and
also add the AVX versions of the 128-bit patterns

llvm-svn: 137685
2011-08-15 23:36:51 +00:00
Bruno Cardoso Lopes
1e817d1451 Reorder declarations of vmovmskp* and also put the necessary AVX
predicate and TB encoding fields. This fix the encoding for the
attached testcase. This fixes PR10625.

llvm-svn: 137684
2011-08-15 23:36:45 +00:00
Bruno Cardoso Lopes
2d100ca13c The VPERM2F128 is a AVX instruction which permutes between two 256-bit
vectors. It operates on 128-bit elements instead of regular scalar
types. Recognize shuffles that are suitable for VPERM2F128 and teach
the x86 legalizer how to handle them.

llvm-svn: 137519
2011-08-12 21:48:26 +00:00
Bruno Cardoso Lopes
17ae896095 Move code around and add comments
llvm-svn: 137518
2011-08-12 21:48:22 +00:00
Bruno Cardoso Lopes
4106caa9af Cleanup: Remove Int_ CVTSS2SI* forms
llvm-svn: 137297
2011-08-11 02:52:36 +00:00
Bruno Cardoso Lopes
565ab1542a The following X86 pattern is incorrect:
def : Pat<(X86Movss VR128:$src1,
                   (bc_v4i32 (v2i64 (load addr:$src2)))),
          (MOVLPSrm VR128:$src1, addr:$src2)>;
This matches a MOVSS dag with a MOVLPS instruction. However, MOVSS will replace only the low 32 bits of the register, while the MOVLPS instruction will replace the low 64 bits. A testcase is added and illustrates the bug and also modified the one that was already present. Patch by Tanya Lattner.

llvm-svn: 137227
2011-08-10 17:45:17 +00:00
Bruno Cardoso Lopes
7461b930f3 Add v16i16 and v32i8 store patterns
llvm-svn: 137166
2011-08-09 22:39:53 +00:00
Bruno Cardoso Lopes
028c6aa951 Use fp unpack instructions to unpack int types. Until we have AVX2, this
is the best we can do for these patterns. This fix PR10554.

llvm-svn: 137161
2011-08-09 22:18:37 +00:00
Bruno Cardoso Lopes
633400ee00 Reapply a more appropriate solution than in r137114. AVX supports
v4f64 = sitofp v4i32. This fix PR10559.
Also add support for v4i32 = fptosi v4f64.

llvm-svn: 137128
2011-08-09 17:39:13 +00:00
Bruno Cardoso Lopes
d521431558 Add support for avx vector fextend
llvm-svn: 137105
2011-08-09 03:04:29 +00:00
Bruno Cardoso Lopes
09a727298f Add AVX versions of 128-bit sitofp and fptosi
llvm-svn: 137104
2011-08-09 03:04:25 +00:00
Bruno Cardoso Lopes
1025d1eb3b Add two patterns to match special vmovss and vmovsd cases. Also fix
the patterns already there to be more strict regarding the predicate.
This fixes PR10558

llvm-svn: 137100
2011-08-09 01:43:09 +00:00
Bruno Cardoso Lopes
d7eac41193 Make LowerVSETCC aware of AVX types and add patterns to match them.
llvm-svn: 137090
2011-08-09 00:46:57 +00:00
Bruno Cardoso Lopes
771876cade Add v4f64 -> v2f32 fp_round support. Also add a testcase to exercise
the legalizer. This commit together with the two previous ones fixes
PR10495.

llvm-svn: 136654
2011-08-01 21:54:09 +00:00
Bruno Cardoso Lopes
473d982caf Add v8i32 and v4i64 vpermil patterns
llvm-svn: 136451
2011-07-29 01:31:07 +00:00
Bruno Cardoso Lopes
02bbf20b02 Cleanup PALIGNR handling and remove the old palign pattern fragment.
Also make PALIGNR masks to don't match 256-bits, which isn't supported
It's also a step to solve PR10489

llvm-svn: 136448
2011-07-29 01:30:59 +00:00
Bruno Cardoso Lopes
e24a043703 Add patterns to generate copies for extract_subvector instead of
using vextractf128. This will reduce the number of issued instruction
for several avx codes.

llvm-svn: 136323
2011-07-28 01:26:50 +00:00
Bruno Cardoso Lopes
73945bf79a movd/movq write zeros in the high 128-bit part of the vector. Use
them to match 256-bit scalar_to_vector+zext.

llvm-svn: 136322
2011-07-28 01:26:46 +00:00
Bruno Cardoso Lopes
1f63a37172 Add a few patterns to match allzeros without having to use the fp unit.
Take advantage that the 128-bit vpxor zeros the higher part and use it.
This also fixes PR10491

llvm-svn: 136321
2011-07-28 01:26:43 +00:00
Bruno Cardoso Lopes
06d8be564f Add SINT_TO_FP and FP_TO_SINT support for v8i32 types. Also move
a convert pattern close to the instruction definition.

llvm-svn: 136320
2011-07-28 01:26:39 +00:00
Kevin Enderby
9adbbfffd0 Fix llvm-mc handing of x86 instructions that take 8-bit unsigned immediates.
llvm-mc gives an "invalid operand" error for instructions that take an unsigned
immediate which have the high bit set such as:
    pblendw $0xc5, %xmm2, %xmm1
llvm-mc treats all x86 immediates as signed values and range checks them.
A small number of x86 instructions use the imm8 field as a set of bits.
This change only changes those instructions and where the high bit is not
ignored.  The others remain unchanged.

llvm-svn: 136287
2011-07-27 23:01:50 +00:00
Bruno Cardoso Lopes
8830fde434 The vpermilps and vpermilpd have different behaviour regarding the
usage of the shuffle bitmask. Both work in 128-bit lanes without
crossing, but in the former the mask of the high part is the same
used by the low part while in the later both lanes have independent
masks. Handle this properly and and add support for vpermilpd.

llvm-svn: 136200
2011-07-27 00:56:34 +00:00
Bruno Cardoso Lopes
e53bb853ea Recognize unpckh* masks and match 256-bit versions. The new versions are
different from the previous 128-bit because they work in lanes.
Update a few comments and add testcases

llvm-svn: 136157
2011-07-26 22:03:40 +00:00
Bruno Cardoso Lopes
a493ad3938 Remove now unused patterns. 0 insertions(+), 98 deletions(-)
llvm-svn: 136109
2011-07-26 18:22:39 +00:00
Bruno Cardoso Lopes
b24e958ffb Cleanup old matching for PUNPCK* variants
llvm-svn: 136108
2011-07-26 18:22:27 +00:00
Bruno Cardoso Lopes
ab40a57cce Add 256-bit isel for movsldup/movshdup
llvm-svn: 136051
2011-07-26 02:39:32 +00:00
Bruno Cardoso Lopes
cde45ac9ca Add 128-bit AVX versions of movshdup/mosldup
llvm-svn: 136048
2011-07-26 02:39:23 +00:00
Bruno Cardoso Lopes
25698b90e9 Cleanup movsldup/movshdup matching.
27 insertions(+), 62 deletions(-)

llvm-svn: 136047
2011-07-26 02:39:13 +00:00
Bruno Cardoso Lopes
c94d6a2d2c Codegen allonesvector better while using AVX: vpcmpeqd + vinsertf128
This also fixes PR10452

llvm-svn: 136004
2011-07-25 23:05:32 +00:00
Bruno Cardoso Lopes
f457bc8120 Add remaining 256-bit vector bitcasts. This also fixes PR10451
llvm-svn: 136003
2011-07-25 23:05:28 +00:00
Bruno Cardoso Lopes
9380919dc5 - Handle special scalar_to_vector case: splats. Using a native 128-bit
shuffle before inserting on a 256-bit vector.
- Add AVX versions of movd/movq instructions
- Introduce a few COPY patterns to match insert_subvector instructions.
This turns a trivial insert_subvector instruction into a register copy,
coalescing the xmm into a ymm and avoid emiting on more instruction.

llvm-svn: 136002
2011-07-25 23:05:25 +00:00
Bruno Cardoso Lopes
5382a1e65e Add v8f32->v8i32 bitcast. Fixes PR10440
llvm-svn: 135794
2011-07-22 19:51:02 +00:00
Bruno Cardoso Lopes
3691063149 - Register v16i16 as valid VR256 register class
- Add more bitcasts for v16i16
- Since 135661 and 135662 already added the splat logic,
just add one more splat test for v16i16

llvm-svn: 135663
2011-07-21 02:24:08 +00:00
Bruno Cardoso Lopes
ba1a2a9135 Add support for 256-bit versions of VPERMIL instruction. This is a new
instruction introduced in AVX, which can operate on 128 and 256-bit vectors.
It considers a 256-bit vector as two independent 128-bit lanes. It can permute
any 32 or 64 elements inside a lane, and restricts the second lane to
have the same permutation of the first one. With the improved splat support
introduced early today, adding codegen for this instruction enable more
efficient 256-bit code:

Instead of:
  vextractf128  $0, %ymm0, %xmm0
  punpcklbw %xmm0, %xmm0
  punpckhbw %xmm0, %xmm0
  vinsertf128 $0, %xmm0, %ymm0, %ymm1
  vinsertf128 $1, %xmm0, %ymm1, %ymm0
  vextractf128  $1, %ymm0, %xmm1
  shufps  $1, %xmm1, %xmm1
  movss %xmm1, 28(%rsp)
  movss %xmm1, 24(%rsp)
  movss %xmm1, 20(%rsp)
  movss %xmm1, 16(%rsp)
  vextractf128  $0, %ymm0, %xmm0
  shufps  $1, %xmm0, %xmm0
  movss %xmm0, 12(%rsp)
  movss %xmm0, 8(%rsp)
  movss %xmm0, 4(%rsp)
  movss %xmm0, (%rsp)
  vmovaps (%rsp), %ymm0
We get:
  vextractf128  $0, %ymm0, %xmm0
  punpcklbw %xmm0, %xmm0
  punpckhbw %xmm0, %xmm0
  vinsertf128 $0, %xmm0, %ymm0, %ymm1
  vinsertf128 $1, %xmm0, %ymm1, %ymm0
  vpermilps $85, %ymm0, %ymm0

llvm-svn: 135662
2011-07-21 01:55:47 +00:00
Bruno Cardoso Lopes
194507cc77 Add aditional patterns for vextractf128 instruction
llvm-svn: 135660
2011-07-21 01:55:39 +00:00
Bruno Cardoso Lopes
14c800c1e3 Add aditional patterns for vinsertf128 instruction
llvm-svn: 135659
2011-07-21 01:55:36 +00:00
Bruno Cardoso Lopes
e0d5bd467f Move code around. No functionality changes
llvm-svn: 135657
2011-07-21 01:55:30 +00:00
Bruno Cardoso Lopes
bdf75dfa28 Be more smart with VCVTSS2SD. Also place the patterns close to the
definitions.

llvm-svn: 135407
2011-07-18 18:11:25 +00:00
Bruno Cardoso Lopes
da90f383ab Add AVX 128-bit sqrt versions
llvm-svn: 135404
2011-07-18 17:51:40 +00:00
Bruno Cardoso Lopes
d258749f73 Add AVX 128-bit patterns for sint_to_fp
llvm-svn: 135332
2011-07-16 00:50:20 +00:00
Bruno Cardoso Lopes
2a23e486ad Add a few patterns for 256-bit bitcasts. No testcases now, they are
comming together with other tests.

llvm-svn: 135312
2011-07-15 22:24:17 +00:00
Bruno Cardoso Lopes
d24f039847 Add 256-bit load/store recognition and matching in several places.
llvm-svn: 135171
2011-07-14 18:50:58 +00:00
Bruno Cardoso Lopes
c0401dddf7 Make X86ISD::ANDNP more general and Codegen 256-bit VANDNP. A more
general version of X86ISD::ANDNP also opened the room for a little bit
of refactoring.

llvm-svn: 135088
2011-07-13 21:36:51 +00:00
Bruno Cardoso Lopes
b98f50da03 The target specific node PANDN name is misleading. That happens because
it's later selected to a ANDNPD/ANDNPS instruction instead of the PANDN
instruction. Rename it.

llvm-svn: 135087
2011-07-13 21:36:47 +00:00
Bruno Cardoso Lopes
cb49278ad6 AVX Codegen support for 256-bit versions of vandps, vandpd, vorps, vorpd, vxorps, vxorpd
llvm-svn: 135023
2011-07-13 01:15:33 +00:00
Eli Friedman
9765ae0015 Add assembler/disassembler support for non-AVX pclmulqdq. While I'm here, use proper aliases for the pclmullqlqdq and friends. PR10269.
llvm-svn: 134424
2011-07-05 18:21:20 +00:00
Eli Friedman
802029c494 Add support for movntil/movntiq mnemonics. Reported on llvmdev.
llvm-svn: 133759
2011-06-23 21:07:47 +00:00
Nick Lewycky
8e5c09b7dc Add support for assembling "movq" when it's correct to do so, while continuing
to emit "movd" across the board to continue supporting a Darwin assembler bug.
This is the reincarnation of r133452.

llvm-svn: 133565
2011-06-21 22:45:41 +00:00
Bob Wilson
5b04895bb8 Revert r133452: "Emit movq for 64-bit register to XMM register moves..."
This is breaking compiler-rt and llvm-gcc builds on MacOSX when not using
the integrated assembler.

llvm-svn: 133524
2011-06-21 17:35:13 +00:00
Nick Lewycky
831fb8200d Emit movq for 64-bit register to XMM register moves, but continue to accept
movd when assembling.

llvm-svn: 133452
2011-06-20 18:33:26 +00:00
Bruno Cardoso Lopes
f52f4dd0b8 Add AVX suport for fpextend.
Original patch by Syoyo Fujita with more comments by me.

llvm-svn: 133153
2011-06-16 07:03:21 +00:00
Bruno Cardoso Lopes
b6afc5168f Add one more argument to the prefetch intrinsic to indicate whether it's a data
or instruction cache access. Update the targets to match it and also teach
autoupgrade.

llvm-svn: 132976
2011-06-14 04:58:37 +00:00
Stuart Hastings
ea8b49dff3 Reapply 132424 with fixes. This fixes PR10068.
rdar://problem/5993888

llvm-svn: 132606
2011-06-03 23:53:54 +00:00
Rafael Espindola
1299f014d4 Revert 132424 to fix PR10068.
llvm-svn: 132479
2011-06-02 19:57:47 +00:00
Stuart Hastings
9a085fb9d8 Recommit 132404 with fixes. rdar://problem/5993888
llvm-svn: 132424
2011-06-01 21:33:14 +00:00
Stuart Hastings
4b33767382 Revert 132404 to appease a buildbot. rdar://problem/5993888
llvm-svn: 132419
2011-06-01 19:52:20 +00:00
Stuart Hastings
23f5ceda96 Add support for x86 CMPEQSS and friends. These instructions do a
floating-point comparison, generate a mask of 0s or 1s, and generally
DTRT with NaNs.  Only profitable when the user wants a materialized 0
or 1 at runtime.  rdar://problem/5993888

llvm-svn: 132404
2011-06-01 17:17:45 +00:00
Stuart Hastings
fdc9e4af68 FGETSIGN support for x86, using movmskps/pd. Will be enabled with a
patch to TargetLowering.cpp.  rdar://problem/5660695

llvm-svn: 132388
2011-06-01 04:39:42 +00:00
Chad Rosier
b87c4a6945 Renamed llvm.x86.sse42.crc32 intrinsics; crc64 doesn't exist.
crc32.[8|16|32] have been renamed to .crc32.32.[8|16|32] and
crc64.[8|16|32] have been renamed to .crc32.64.[8|64].

llvm-svn: 132163
2011-05-26 23:13:19 +00:00
Rafael Espindola
98372d430c Don't produce a vmovntdq if we don't have AVX support.
llvm-svn: 131330
2011-05-14 00:30:01 +00:00
Bill Wendling
67f5e8f0a7 Replace the "movnt" intrinsics with a native store + nontemporal metadata bit.
<rdar://problem/8460511>

llvm-svn: 130791
2011-05-03 21:11:17 +00:00
Eric Christopher
1de0dfaab0 xmm0 is an implicit parameter in this and so shouldn't be in the
string template.

Fixes rdar://8493866

llvm-svn: 130747
2011-05-03 01:28:32 +00:00
Chris Lattner
52e40d9b4c clean up after Sean's r127646 patch.
llvm-svn: 130475
2011-04-29 05:40:18 +00:00
Bill Wendling
0984f4927e Reapply r129401 with patch for clang.
llvm-svn: 129419
2011-04-13 00:36:11 +00:00
Bill Wendling
f6446a0961 Revert r129401 for now. Clang is using the old way of doing things.
llvm-svn: 129403
2011-04-12 22:59:27 +00:00
Bill Wendling
f9c9d3e05b Remove the unaligned load intrinsics in favor of using native unaligned loads.
Now that we have a first-class way to represent unaligned loads, the unaligned
load intrinsics are superfluous.

First part of <rdar://problem/8460511>.

llvm-svn: 129401
2011-04-12 22:46:31 +00:00
Sean Callanan
a38db2eeda Enabled disassembler support for AVX instructions
in the instruction tables and fixed a few bugs that
were causing decode conflicts.  Rudimentary tests
are coming up in the next patch.

llvm-svn: 127646
2011-03-15 01:28:15 +00:00
David Greene
2fd6d03bc9 [AVX] Fix mask predicates for 256-bit UNPCKLPS/D and implement
missing patterns for them.

      Add a SIMD test subdirectory to hold tests for SIMD instruction
      selection correctness and quality.
'

llvm-svn: 126845
2011-03-02 17:23:43 +00:00
Joerg Sonnenberger
efa8090e2a Recognize monitor/mwait with explicit register arguments
llvm-svn: 125805
2011-02-18 00:48:11 +00:00
David Greene
7de7347ee8 [AVX] Support VSINSERTF128 with more patterns and appropriate
infrastructure.  This makes lowering 256-bit vectors to 128-bit
vectors simple when 256-bit vector support is not available.

llvm-svn: 124868
2011-02-04 16:08:29 +00:00
David Greene
2753be260c [AVX] VEXTRACTF128 support. This commit includes patterns for
matching EXTRACT_SUBVECTOR to VEXTRACTF128 along with support routines
to examine and translate index values.  VINSERTF128 comes next.  With
these two in place we can begin supporting more AVX operations as
INSERT/EXTRACT can be used as a fallback when 256-bit support is not
available.

llvm-svn: 124797
2011-02-03 15:50:00 +00:00
Chris Lattner
9ba0a83f2b fix a missing shuffle pattern, PR9009. Patch by Artiom Myaskouvskey!
llvm-svn: 124102
2011-01-24 03:42:46 +00:00
Chris Lattner
586e7af07d Fix PR8946, a missing reg/reg form of movdqu.
llvm-svn: 123242
2011-01-11 17:04:55 +00:00
Chris Lattner
3ef9db5cd4 fix PR8900, a shuffle miscompilation. Patch by Nadav Rotem!
llvm-svn: 122921
2011-01-05 22:28:46 +00:00
Nate Begeman
c7dfecb10e Implement feedback from Bruno on making pblendvb an x86-specific ISD node in addition to being an intrinsic, and convert
lowering to use it.  Hopefully the pattern fragment is doing the right thing with XMM0, looks correct in testing.

llvm-svn: 122277
2010-12-20 22:04:24 +00:00
Nate Begeman
ef5f3c0fa7 Add support for matching psign & plendvb to the x86 target
Remove unnecessary pandn patterns, 'vnot' patfrag looks through bitcasts

llvm-svn: 122098
2010-12-17 22:55:37 +00:00
Nate Begeman
8c00ecd290 Add some missing predicates.
llvm-svn: 121445
2010-12-10 00:54:26 +00:00
Nate Begeman
cb6d1c8193 Formalize the notion that AVX and SSE are non-overlapping extensions from the compiler's point of view. Per email discussion, we either want to always use VEX-prefixed instructions or never use them, and are taking "HasAVX" to mean "Always use VEX". Passing -mattr=-avx,+sse42 should serve to restore legacy SSE support when desirable.
llvm-svn: 121439
2010-12-10 00:26:57 +00:00
Nate Begeman
4a62a3e229 Add support for AVX to materialize +0.0 when doing scalar FP.
llvm-svn: 121415
2010-12-09 21:43:51 +00:00
Benjamin Kramer
851691ddb2 Add patterns for the x86 popcnt instruction.
- Also adds a new POPCNT subtarget feature that is currently enabled if the target
  supports SSE4.2 (nehalem) or SSE4A (barcelona).

llvm-svn: 120917
2010-12-04 20:32:23 +00:00
Nate Begeman
deb26223bd Scalar f32/f64 are also subregs of ymm regs
llvm-svn: 120844
2010-12-03 21:54:39 +00:00
Eric Christopher
6a21ceab5c Implement a PseudoI class and transfer the sse instructions over to use
it.

llvm-svn: 120412
2010-11-30 08:57:23 +00:00
Eric Christopher
f27f0b5234 Rewrite mwait and monitor support and custom lower arguments.
Fixes PR8573.

llvm-svn: 120404
2010-11-30 07:20:12 +00:00
Bruno Cardoso Lopes
9f9f796756 Fix PR8211
llvm-svn: 118445
2010-11-08 21:24:59 +00:00
Dale Johannesen
b78530f9b0 Fix pastos in handling of AVX cvttsd2si, PR8491.
Bruno, please review, but I'm pretty sure this is right.
Patch by Alex Mac!

llvm-svn: 117514
2010-10-28 00:35:54 +00:00
Chris Lattner
72e7e84c3f simplify some map operations.
llvm-svn: 116014
2010-10-07 23:57:02 +00:00
Evan Cheng
7c89d70f27 Canonicalize X86ISD::MOVDDUP nodes to v2f64 to make sure all cases match. Also eliminate unneeded isel patterns. rdar://8520311
llvm-svn: 115977
2010-10-07 20:50:20 +00:00
Chris Lattner
84846b71af remove the !nameconcat tblgen feature. It "shorthand" and only used in 4 places
where !cast is just as short.

llvm-svn: 115722
2010-10-06 00:19:21 +00:00
Chris Lattner
12274b9845 allow !strconcat to take more than two operands to eliminate
!strconcat(!strconcat(!strconcat(!strconcat

Simplify some x86 td files to use it.

llvm-svn: 115719
2010-10-05 23:58:18 +00:00
Chris Lattner
5d7d5a81eb distribute the rest of the contents of X86Instr64bit.td out to
the right places.  X86Instr64bit.td now dies, long live x86-64!

llvm-svn: 115669
2010-10-05 20:49:15 +00:00
Chris Lattner
9317bf2ed5 move CMOV_FR32 and friends to InstrCompiler, since they are
pseudo instructions.

Move POPCNT to InstrSSE since they are SSE4 instructions.

llvm-svn: 115603
2010-10-05 06:41:40 +00:00
Chris Lattner
9c58de2dc4 fix rdar://8490728 - llvm-mc rejects gpr64 form of 'movmskpd'
llvm-svn: 115029
2010-09-29 05:05:03 +00:00
Chris Lattner
890c21a20a add assembler support for the cvtsd2sil/cvtsd2siq mnemonics, rdar://8456382
llvm-svn: 115027
2010-09-29 04:55:40 +00:00
Chris Lattner
c14d59589c add basic avx support to the disassembler, also teach it about ssmem/sdmem
operands.

With this done, we can remove the _Int suffixes from the round instructions
without the disassembler blowing up.  This allows the assembler to support
them, implementing rdar://8456376 - llvm-mc rejects 'roundss'

llvm-svn: 115019
2010-09-29 02:57:56 +00:00