1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-22 04:22:57 +02:00
Commit Graph

1011 Commits

Author SHA1 Message Date
Manman Ren
f26bd7d8f9 X86 SSE: update rsqrtss and rcpss to use two source operands and
the first source operand is tied to the destination operand.

This is to accurately model the corresponding instructions where the upper
bits are unmodified.

rdar://12558838
PR14221

llvm-svn: 167064
2012-10-30 23:53:59 +00:00
Michael Liao
d26c27ad35 Fix PR14204
- Add missing pattern on X86ISD::VZEXT from VR256 to VR256 when AVX2 is enabled.

llvm-svn: 166947
2012-10-29 17:57:12 +00:00
Michael Liao
70bb8004bd Add custom conversion from v2u32 to v2f32 in 32-bit mode
- As there's no 64-bit GPRs in 32-bit mode, a custom conversion from v2u32 to
  v2f32 is added to improve the efficiency of the code generated.

llvm-svn: 166545
2012-10-24 04:09:32 +00:00
Michael Liao
23c890e0a8 Enable lowering ZERO_EXTEND/ANY_EXTEND to PMOVZX from SSE4.1
llvm-svn: 166486
2012-10-23 17:34:00 +00:00
Michael Liao
6a09ff62ba Add support for FP_ROUND from v2f64 to v2f32
- Due to the current matching vector elements constraints in
  ISD::FP_ROUND, rounding from v2f64 to v4f32 (after legalization from
  v2f32) is scalarized. Add a customized v2f32 widening to convert it
  into a target-specific X86ISD::VFPROUND to work around this
  constraints.

llvm-svn: 165631
2012-10-10 16:53:28 +00:00
Craig Topper
abbf768c15 Remove code for setting the VEX L-bit as a function of operand size from the code emitters and the disassembler table builder. Fix a couple instructions that were still missing VEX_L.
llvm-svn: 164204
2012-09-19 06:37:45 +00:00
Craig Topper
7c37abcace Add explicit VEX_L tags to all 256-bit instructions. This will allow us to remove code from the code emitters that examined operands to set the L-bit.
llvm-svn: 164202
2012-09-19 06:06:34 +00:00
Nadav Rotem
c790bc0984 The PMOVZXWD family of functions had patterns extends narrow vector types to wide vector types.
It had patterns for zext-loading and extending. This commit adds patterns for loading a wide type, performing a bitcast,
and extending. This is an odd pattern, but it is commonly used when writing code with intrinsics.

rdar://11897677

llvm-svn: 163995
2012-09-16 07:39:07 +00:00
Michael Liao
7dfa5e2092 Enhance PR11334 fix to support extload from v2f32/v4f32
- Fix an remaining issue of PR11674 as well

llvm-svn: 163528
2012-09-10 18:33:51 +00:00
Craig Topper
a91d731898 Add instruction selection for ffloor of vectors when SSE4.1 or AVX is enabled.
llvm-svn: 163473
2012-09-08 17:42:27 +00:00
Craig Topper
0b9e2dd7a7 Use iPTR instead of i32 for extract_subvector/insert_subvector index in lowering and patterns. This makes it consistent with the incoming DAG nodes from the DAG builder.
llvm-svn: 163293
2012-09-06 06:09:01 +00:00
Craig Topper
b2bad42f00 Add patterns for converting stores of subvector_extracts of lower 128-bits of a 256-bit vector to VMOVAPSmr/VMOVUPSmr.
llvm-svn: 163292
2012-09-06 05:15:01 +00:00
Craig Topper
864ef1eec5 Remove some of the patterns added in r163196. Increasing the complexity on insert_subvector into undef accomplishes the same thing.
llvm-svn: 163198
2012-09-05 07:26:35 +00:00
Craig Topper
f029cfe913 Add patterns for integer forms of VINSERTF128/VINSERTI128 folded with loads. Also add patterns to turn subvector inserts with loads to index 0 of an undef into VMOVAPS.
llvm-svn: 163196
2012-09-05 06:58:39 +00:00
Craig Topper
6274d26545 Convert vextracti128/vextractf128 intrinsics to extract_subvector at DAG build time. Similar was previously done for vinserti128/vinsertf128. Add patterns for folding these extract_subvectors with stores.
llvm-svn: 163192
2012-09-05 05:48:09 +00:00
Craig Topper
0791e3f380 Typos
llvm-svn: 163053
2012-09-01 06:33:50 +00:00
Michael Liao
43c7369b24 Clean up AddedComplexity further after adding UseSSEx
llvm-svn: 162973
2012-08-31 03:01:35 +00:00
Jim Grosbach
6d3cb70105 X86: Fix encoding of 'movd %xmm0, %rax'
The assembly string for the VMOVPQIto64rr instruction incorrectly lacked the 'v'
prefix, resulting in mis-assembly of the vanilla movd instruction.

llvm-svn: 162963
2012-08-31 00:30:30 +00:00
Michael Liao
b6735b87b0 Introduce 'UseSSEx' to force SSE legacy encoding
- Add 'UseSSEx' to force SSE legacy insn not being selected when AVX is
  enabled.

  As the penalty of inter-mixing SSE and AVX instructions, we need
  prevent SSE legacy insn from being generated except explicitly
  specified through some intrinsics. For patterns supported by both
  SSE and AVX, so far, we force AVX insn will be tried first relying on
  AddedComplexity or position in td file. It's error-prone and
  introduces bugs accidentally.

  'UseSSEx' is disabled when AVX is turned on. For SSE insns inherited
  by AVX, we need this predicate to force VEX encoding or SSE legacy
  encoding only.

  For insns not inherited by AVX, we still use the previous predicates,
  i.e. 'HasSSEx'. So far, these insns fall into the following
  categories:
  * SSE insns with MMX operands
  * SSE insns with GPR/MEM operands only (xFENCE, PREFETCH, CLFLUSH,
    CRC, and etc.)
  * SSE4A insns.
  * MMX insns.
  * x87 insns added by SSE.

2 test cases are modified:

 - test/CodeGen/X86/fast-isel-x86-64.ll
   AVX code generation is different from SSE one. 'vcvtsi2sdq' cannot be
   selected by fast-isel due to complicated pattern and fast-isel
   fallback to materialize it from constant pool.

 - test/CodeGen/X86/widen_load-1.ll
   AVX code generation is different from SSE one after fixing SSE/AVX
   inter-mixing. Exec-domain fixing prefers 'vmovapd' instead of
   'vmovaps'.

llvm-svn: 162919
2012-08-30 16:54:46 +00:00
Bill Wendling
6488dc22bb The commutative flag is already correctly set within the multiclass. If we set
it here, then a 'register-memory' version would wrongly get the commutative
flag.
<rdar://problem/12180135>

llvm-svn: 162741
2012-08-28 07:36:46 +00:00
Craig Topper
803047a9bb Convert V_SETALLONES/AVX_SETALLONES/AVX2_SETALLONES to Post-RA pseudos.
llvm-svn: 162740
2012-08-28 07:30:47 +00:00
Craig Topper
02bb8ce5e0 Merge AVX_SET0PSY/AVX_SET0PDY/AVX2_SET0 into a single post-RA pseudo.
llvm-svn: 162738
2012-08-28 07:05:28 +00:00
Jakob Stoklund Olesen
882cb360be More missing mayLoad flags on AVX multiclasses.
llvm-svn: 162714
2012-08-28 00:02:01 +00:00
Craig Topper
bbee14ad9d Don't allow vextractf128 to be folded with unaligned stores. We don't fold unaligned loads so shouldn't fold unaligned stores as it can cause an alignment fault to occur.
llvm-svn: 162658
2012-08-27 07:19:59 +00:00
Craig Topper
57dd6db42e Fold some patterns into instruction definitons so tablegen can infer flags removing the need for an explicit 'neverHasSideEffects = 1'
llvm-svn: 162656
2012-08-27 07:04:50 +00:00
Craig Topper
b524d2e36d Add HasAVX1Only predicate and use it for patterns that have an AVX1 instruction and an AVX2 instruction rather than relying on AddedComplexity.
llvm-svn: 162654
2012-08-27 06:08:57 +00:00
Jakob Stoklund Olesen
d1820cea0b Add missing mayLoad flags to a large class of AVX *_Int instructions.
llvm-svn: 162622
2012-08-24 23:29:07 +00:00
Jakob Stoklund Olesen
3739d6ca99 Remove some spurious mayLoad = 0 flags.
They were inserted to silence TableGen's warning about
redundant properties. That warning is now gone.

llvm-svn: 162517
2012-08-24 00:31:20 +00:00
Nadav Rotem
589dc766e0 When unsafe math is used, we can use commutative FMAX and FMIN. In some cases
this allows for better code generation.

Added a new DAGCombine transformation to convert FMAX and FMIN to FMANC and
FMINC, which are commutative.

For example:

  movaps  %xmm0, %xmm1
  movsd LC(%rip), %xmm0
  minsd %xmm1, %xmm0

becomes:

  minsd LC(%rip), %xmm0

llvm-svn: 162187
2012-08-19 13:06:16 +00:00
Michael Liao
daebe04c2f fix PR11334
- FP_EXTEND only support extending from vectors with matching elements.
  This results in the scalarization of extending to v2f64 from v2f32,
  which will be legalized to v4f32 not matching with v2f64.
- add X86-specific VFPEXT supproting extending from v4f32 to v2f64.
- add BUILD_VECTOR lowering helper to recover back the original
  extending from v4f32 to v2f64.
- test case is enhanced to include different vector width.

llvm-svn: 161894
2012-08-14 21:24:47 +00:00
Craig Topper
dc1b95e7de Implement proper handling for pcmpistri/pcmpestri intrinsics. Requires custom handling in DAGISelToDAG due to limitations in TableGen's implicit def handling. Fixes PR11305.
llvm-svn: 161318
2012-08-06 06:22:36 +00:00
Craig Topper
e26b30c830 Remove custom inserter for MWAIT. It doesn't do anything that couldn't be represented in a pattern.
llvm-svn: 161306
2012-08-05 00:36:57 +00:00
Manman Ren
1c73b43c93 X86: mark GATHER instructios as mayLoad
llvm-svn: 161143
2012-08-01 23:28:59 +00:00
Craig Topper
80fdfb7f56 Give VCVTTPD2DQ priority over CVTTPD2DQ.
llvm-svn: 160942
2012-07-30 02:20:32 +00:00
Craig Topper
492a7af190 Fix patterns for CVTTPS2DQ to specify SSE2 instead of SSE1.
llvm-svn: 160941
2012-07-30 02:14:02 +00:00
Craig Topper
9050c15c71 Fix up patterns for VCVTSS2SD. Specifically give it priority over SSE form. Add an OptForSpeed to explicitly pair up with an OptForSize that was already on another pattern.
llvm-svn: 160939
2012-07-30 01:38:57 +00:00
Craig Topper
147248a6a0 Fix load types on intrinsic forms of SS2SD and SD2SS AVX/SSE convert instruction patterns.
llvm-svn: 160938
2012-07-29 23:26:34 +00:00
Craig Topper
293e781ba6 Move more SSE/AVX convert instruction patterns into their definitions.
llvm-svn: 160937
2012-07-29 22:30:06 +00:00
Craig Topper
e75418242a Fold patterns for some of the SSE/AVX convert instructions into their instruction definitions.
llvm-svn: 160922
2012-07-28 18:59:19 +00:00
Craig Topper
189349dab2 Mark some of the SSE/AVX convert instructions as mayLoad/neverHasSideEffects.
llvm-svn: 160921
2012-07-28 18:36:39 +00:00
Craig Topper
3c15b4afd4 Make CVTSS2SI instruction definition consistent with CVTSD2SI.
llvm-svn: 160914
2012-07-28 08:28:23 +00:00
Craig Topper
8121932592 Fix up memory load types for SSE scalar convert intrinsic patterns.
llvm-svn: 160913
2012-07-28 07:59:59 +00:00
Jakob Stoklund Olesen
008b36037d Remove the last mentions of sub_ss and sub_sd from patterns.
I'll remove these two sub-register indexes shortly.

llvm-svn: 160831
2012-07-26 23:03:08 +00:00
Jakob Stoklund Olesen
c1dd7a213d Eliminate sub_ss, sub_sd from broadcast patterns.
The (COPY_TO_REGCLASS GR32:$src, VR128) pattern looks odd, but
copyPhysReg does the right thing with it. (The old pattern would
eventually produce the same cross-class copy).

llvm-svn: 160830
2012-07-26 22:59:06 +00:00
Jakob Stoklund Olesen
419ccae442 Eliminate more sub_ss / sub_sd patterns.
This gets rid of some more INSERT_SUBREG - IMPLICIT_DEF patterns,
simplifying the emitted code a bit.

llvm-svn: 160820
2012-07-26 22:30:18 +00:00
Jakob Stoklund Olesen
0a41a45c36 Eliminate some SUBREG_TO_REG patterns with sub_ss and sub_sd.
The SUBREG_TO_REG instruction has magic semantics asserting that the
source value was defined by an instruction that cleared the high half of
the register. Those semantics are never actually exploited for xmm
registers.

llvm-svn: 160818
2012-07-26 22:03:21 +00:00
Jakob Stoklund Olesen
44868927b9 Eliminate a batch of uses of sub_ss and sub_sd in the X86 target.
These idempotent sub-register indices don't do anything --- They simply
map XMM registers to themselves.  They no longer affect register classes
either since the SubRegClasses field has been removed from Target.td.

This patch replaces XMM->XMM EXTRACT_SUBREG and INSERT_SUBREG patterns
with COPY_TO_REGCLASS patterns which simply become COPY instructions.

The number of IMPLICIT_DEF instructions before register allocation is
reduced, and that is the cause of the test case changes.

llvm-svn: 160816
2012-07-26 21:40:42 +00:00
Craig Topper
9667d599eb Make l/q suffixes on AVX forms of scalar convert instructions consistent with their non-AVX forms.
llvm-svn: 160775
2012-07-26 07:48:28 +00:00
Nadav Rotem
03d2729392 The vbroadcast family of instructions has 'fallback patterns' in case where the
load source operand is used by multiple nodes. The v2i64 broadcast was emulated
by shuffling the two lower i32 elements to the upper two.
We had a bug in the immediate used for the broadcast.
Replacing 0 to 0x44.
0x44 means [01|00|01|00] which corresponds to the correct lane.

Patch by Michael Kuperstein.

llvm-svn: 160430
2012-07-18 08:14:48 +00:00
Craig Topper
b144f3b6db Make x86 asm parser to check for xmm vs ymm for index register in gather instructions. Also fix Intel syntax for gather instructions to use 'DWORD PTR' or 'QWORD PTR' to match gas.
llvm-svn: 160420
2012-07-18 04:11:12 +00:00
Nadav Rotem
0377e0d234 Rename VBROADCASTSDrm into VBROADCASTSDYrm to match the naming convention.
Allow the folding of vbroadcastRR to vbroadcastRM, where the memory operand is a spill slot.

PR12782.

Together with Michael Kuperstein <michael.m.kuperstein@intel.com>

llvm-svn: 160230
2012-07-15 12:26:30 +00:00
Craig Topper
a75da664ff Mark VINSERTI128rm as MayLoad=1. Fixes PR13348.
llvm-svn: 160162
2012-07-13 05:46:28 +00:00
Craig Topper
6eef81b65b Update GATHER instructions to support 2 read-write operands. Patch from myself and Manman Ren.
llvm-svn: 160110
2012-07-12 06:52:41 +00:00
Craig Topper
b346ce8240 Reverse assembler/disassembler operand order for gather instructions.
llvm-svn: 159983
2012-07-10 06:38:33 +00:00
Craig Topper
4644d3577c Remove extra space.
llvm-svn: 159647
2012-07-03 06:48:58 +00:00
Craig Topper
60bbc2fde8 Change i128mem/i256mem to f128mem/f256mem on some floating point vector instructions.
llvm-svn: 159646
2012-07-03 06:11:06 +00:00
Craig Topper
6fcb4454a0 Add aliases for pblendvb, blendvpd, and blendvps instructions with the implicit xmm0 operand specified. Fixes PR13252.
llvm-svn: 159644
2012-07-03 05:49:45 +00:00
Elena Demikhovsky
0617b5a56c Optimization of shuffle node that can fit to the register form of VBROADCAST instruction on AVX2.
llvm-svn: 159504
2012-07-01 06:12:26 +00:00
Manman Ren
63bf58865a X86: add more GATHER intrinsics in LLVM
Corrected type for index of llvm.x86.avx2.gather.d.pd.256
  from 256-bit to 128-bit.
Corrected types for src|dst|mask of llvm.x86.avx2.gather.q.ps.256
  from 256-bit to 128-bit.

Support the following intrinsics:
  llvm.x86.avx2.gather.d.q, llvm.x86.avx2.gather.q.q
  llvm.x86.avx2.gather.d.q.256, llvm.x86.avx2.gather.q.q.256
  llvm.x86.avx2.gather.d.d, llvm.x86.avx2.gather.q.d
  llvm.x86.avx2.gather.d.d.256, llvm.x86.avx2.gather.q.d.256

llvm-svn: 159402
2012-06-29 00:54:20 +00:00
Manman Ren
6be46b7b4c X86: add GATHER intrinsics (AVX2) in LLVM
Support the following intrinsics:
llvm.x86.avx2.gather.d.pd, llvm.x86.avx2.gather.q.pd
llvm.x86.avx2.gather.d.pd.256, llvm.x86.avx2.gather.q.pd.256
llvm.x86.avx2.gather.d.ps, llvm.x86.avx2.gather.q.ps
llvm.x86.avx2.gather.d.ps.256, llvm.x86.avx2.gather.q.ps.256

Modified Disassembler to handle VSIB addressing mode.

llvm-svn: 159221
2012-06-26 19:47:59 +00:00
Craig Topper
5c8bdeb3f3 Remove some duplicate instructions that exist only to given different mnemonics for the assembler. Use InstAlias instead.
llvm-svn: 159184
2012-06-26 04:12:49 +00:00
Craig Topper
df4e56ebc1 Add SSE2 predicate to CVTPS2PD instructions. Doesn't matter much because there are no patterns in the instruction.
llvm-svn: 159127
2012-06-25 06:51:42 +00:00
Craig Topper
2959047de1 Remove codegen only instruction in favor of one that has the same definition. Make some pattern operands more explicit about types.
llvm-svn: 159126
2012-06-25 06:16:00 +00:00
Craig Topper
3f09003ac6 Remove intrinsic specific instructions for (V)CVTPS2DQ and replace with patterns.
llvm-svn: 159109
2012-06-24 07:07:16 +00:00
Craig Topper
bbce66a591 Remove intrinsic specific instructions for (V)CVTPS2DQ and replace with patterns.
llvm-svn: 159108
2012-06-24 06:55:37 +00:00
Craig Topper
677692bc32 Fix build failures from r159106.
llvm-svn: 159107
2012-06-24 06:08:31 +00:00
Craig Topper
58ede28106 Remove intrinsic specific instructions for CVTPD2PS and replace with just patterns.
llvm-svn: 159106
2012-06-24 05:44:31 +00:00
Craig Topper
7485c05a49 Remove intrinsic specific instructions for CVTPD2DQ. Replace with patterns.
llvm-svn: 159105
2012-06-24 05:33:24 +00:00
Craig Topper
e824497e82 Remove intrinsic specific instructions for (V)CVTDQ2PS. Use a Pat instead instead.
llvm-svn: 159090
2012-06-23 22:33:14 +00:00
Craig Topper
59fcd68657 Make CVTDQ2PS instruction use SSE2 predicate instead of SSE1. No functional change because there are no patterns in the instructions. Also fix a typo in a comment.
llvm-svn: 159087
2012-06-23 20:52:45 +00:00
Craig Topper
caf5a8e7aa Move CVTPD2DQ to use SSE2 predicate instead of SSE3. Move DQ2PD and PD2DQ to the SSE2 section of the file.
llvm-svn: 159086
2012-06-23 20:15:42 +00:00
Craig Topper
7067c92fbb Use correct memory types for (V)CVTDQ2PD instructions.
llvm-svn: 159075
2012-06-23 08:30:27 +00:00
Craig Topper
b05bd41aed Remove intrinsic specific instructions for 128-bit (V)CVTDQ2PD. Replace with intrinsic patterns. Mem forms omitted because the load size is only 64-bits.
llvm-svn: 159070
2012-06-23 04:23:36 +00:00
Craig Topper
f19d6cef51 Add predicate check around some patterns.
llvm-svn: 158797
2012-06-20 07:30:23 +00:00
Craig Topper
54d8fe551b Add predicate check around some patterns.
llvm-svn: 158795
2012-06-20 07:01:11 +00:00
Kay Tiong Khoo
7247ab8114 *no need to pollute Intel syntax with bonus mnemonics; operand size is explicitly specified
llvm-svn: 158603
2012-06-16 17:19:49 +00:00
Craig Topper
8d23b98818 Mark several instructions SSE2 instead of SSE3 as they should be.
llvm-svn: 158049
2012-06-06 06:45:27 +00:00
Benjamin Kramer
cb686400fb X86: Rename the CLMUL target feature to PCLMUL.
It was renamed in gcc/gas a while ago and causes all kinds of
confusion because it was named differently in llvm and clang.

llvm-svn: 157745
2012-05-31 14:34:17 +00:00
Craig Topper
c6bd90e646 Add intrinsic for pclmulqdq instruction.
llvm-svn: 157731
2012-05-31 04:37:40 +00:00
Benjamin Kramer
0c823ae0ed Add intrinsics, code gen, assembler and disassembler support for the SSE4a extrq and insertq instructions.
This required light surgery on the assembler and disassembler
because the instructions use an uncommon encoding. They are
the only two instructions in x86 that use register operands
and two immediates.

llvm-svn: 157634
2012-05-29 19:05:25 +00:00
Craig Topper
77b1a4cee5 Remove 256-bit AVX non-temporal store intrinsics. Similar was previously done for 128-bit.
llvm-svn: 156375
2012-05-08 06:58:15 +00:00
Craig Topper
02644ca6b7 Fix some issues in the f16c instructions.
llvm-svn: 156287
2012-05-07 06:00:15 +00:00
Craig Topper
c6d0bc2afc Add SSE4A MOVNTSS/MOVNTSD instructions.
llvm-svn: 156281
2012-05-07 05:36:19 +00:00
Nadav Rotem
021d75713c AVX: Add additional vbroadcast replacement sequences for integers.
Remove the v2f64 patterns because it does not match any vbroadcast
instruction.

llvm-svn: 155461
2012-04-24 18:09:59 +00:00
Nadav Rotem
d060c25823 AVX: We lower VECTOR_SHUFFLE and BUILD_VECTOR nodes into vbroadcast instructions
using the pattern (vbroadcast (i32load src)). In some cases, after we generate
this pattern new users are added to the load node, which prevent the selection
of the blend pattern. This commit provides fallback patterns which perform
in-vector broadcast (using in-vector vbroadcast in AVX2 and pshufd on AVX1).

llvm-svn: 155437
2012-04-24 11:07:03 +00:00
Elena Demikhovsky
35721fc4f8 ZERO_EXTEND/SIGN_EXTEND/TRUNCATE optimization for AVX2
llvm-svn: 155309
2012-04-22 09:39:03 +00:00
Craig Topper
db4fcf7088 Replace vpermd/vpermps intrinic patterns with custom lowering to target specific nodes.
llvm-svn: 154801
2012-04-16 07:13:00 +00:00
Craig Topper
129dccdc84 Flip the arguments when converting vpermd/vpermps intrinsics into instructions. The intrinsic has the mask as the last operand, but the instruction has it as the second.
llvm-svn: 154797
2012-04-16 06:26:15 +00:00
Craig Topper
1b15347812 Merge vpermps/vpermd and vpermpd/vpermq SD nodes.
llvm-svn: 154782
2012-04-16 00:41:45 +00:00
Craig Topper
788250eec1 Remove AVX2 vpermq and vpermpd intrinsics. These can now be handled with normal shuffle vectors.
llvm-svn: 154778
2012-04-15 22:43:31 +00:00
Nadav Rotem
2a4e2ef10c Fix PR12529. The Vxx family of instructions are only supported by AVX.
Use non-vex instructions for SSE4.

llvm-svn: 154770
2012-04-15 19:36:44 +00:00
Elena Demikhovsky
92fb3e613e Added VPERM optimization for AVX2 shuffles
llvm-svn: 154761
2012-04-15 11:18:59 +00:00
Craig Topper
448790d566 Fix 128-bit ptest intrinsics to take v2i64 instead of v4f32 since these are integer instructions.
llvm-svn: 154580
2012-04-12 07:23:00 +00:00
Nadav Rotem
c922b4f2a3 Reapply 154396 after fixing a test.
Original message:
Modify the code that lowers shuffles to blends from using blendvXX to vblendXX.
blendV uses a register for the selection while Vblend uses an immediate.
On sandybridge they still have the same latency and execute on the same execution ports.

llvm-svn: 154483
2012-04-11 06:40:27 +00:00
Eric Christopher
f8886e8f48 Temporarily revert this patch to see if it brings the buildbots back.
llvm-svn: 154425
2012-04-10 19:33:16 +00:00
Nadav Rotem
74f87a6bd8 Modify the code that lowers shuffles to blends from using blendvXX to vblendXX.
blendv uses a register for the selection while vblend uses an immediate.
On sandybridge they still have the same latency and execute on the same execution ports.

llvm-svn: 154396
2012-04-10 14:33:13 +00:00
Craig Topper
a6412fb8c0 Turn avx2 vinserti128 intrinsic calls into INSERT_SUBVECTOR DAG nodes and remove patterns for selecting the intrinsic. Similar was already done for avx1.
llvm-svn: 154272
2012-04-07 22:32:29 +00:00
Craig Topper
1ddf62dc2c Move vinsertf128 patterns near the instruction definitions. Add AddedComplexity to AVX2 vextracti128 patterns to give them priority over the integer versions of vextractf128 patterns.
llvm-svn: 154268
2012-04-07 21:57:43 +00:00
Craig Topper
ce6c05e0df Add support for AVX enhanced comparison predicates. Patch from Kay Tiong Khoo.
llvm-svn: 153935
2012-04-03 05:20:24 +00:00
Chad Rosier
17f25ea47b [avx] Add patterns for combining vextractf128 + vmovaps/vmovups/vmobdqu to
vextractf128 with 128-bit mem dest.

Combines

	vextractf128 $0, %ymm0, %xmm0
	vmovaps %xmm0, (%rdi)

to

    vextractf128 $0, %ymm0, (%rdi)

rdar://11082570

llvm-svn: 153139
2012-03-20 21:43:40 +00:00