1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-20 03:23:01 +02:00
Commit Graph

16267 Commits

Author SHA1 Message Date
Simon Pilgrim
6320b01ed3 [X86][SSE] Check for out of bounds PEXTR/PINSR indices during faux shuffle combining.
llvm-svn: 323045
2018-01-20 17:16:01 +00:00
Craig Topper
eb91b245e3 [X86] Teach X86 codegen to use vector width preference to avoid promoting to 512-bit types when VLX is enabled and the preference is for a smaller size.
This change applies to places where we would turn 128/256-bit code into 512-bit in order to get a wider element type through sext/zext. Any 512-bit types that already existed in the IR/DAG will be left that way.

The width preference has no effect on codegen behavior when the target does not have AVX512 enabled. So AVX/AVX2 codegen cannot be limited via this mechanism yet.

If the preference is lower than 256 we may still use a 256 bit type to do the operation. Constraining to 128 bits makes it much more difficult to support some operations. For many of these cases we need to change element width while keeping element count constant which is easiest done by switching between 256 and 128 bit.

The preference is only obeyed when AVX512 and VLX are available. This means the preference is not obeyed for KNL, but is obeyed for SKX, Cannonlake, and Icelake. For KNL, the only way to do masked operation is on 512-bit registers so we would have to completely disable masking to obey the preference. We would also lose support for gather, scatter, ctlz, vXi64 multiplies, etc. This may change in the future, but this simplifies the initial implementation.

Differential Revision: https://reviews.llvm.org/D41895

llvm-svn: 323016
2018-01-20 00:26:12 +00:00
Craig Topper
bca85d8144 [X86] Add support for passing 'prefer-vector-width' function attribute into X86Subtarget and exposing via X86's getRegisterWidth TTI interface.
This will cause the vectorizers to do some limiting of the vector widths they create. This is not a strict limit. There are reasons I know of that the loop vectorizer will generate larger vectors for.

I've written this in such a way that the interface will only return a properly supported width(0/128/256/512) even if the attribute says something funny like 384 or 10.

This has been split from D41895 with the remainder in a follow up commit.

llvm-svn: 323015
2018-01-20 00:26:08 +00:00
Daniel Neilson
f59acc15ad Remove alignment argument from memcpy/memmove/memset in favour of alignment attributes (Step 1)
Summary:
 This is a resurrection of work first proposed and discussed in Aug 2015:
   http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html
and initially landed (but then backed out) in Nov 2015:
   http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html

 The @llvm.memcpy/memmove/memset intrinsics currently have an explicit argument
which is required to be a constant integer. It represents the alignment of the
dest (and source), and so must be the minimum of the actual alignment of the
two.

 This change is the first in a series that allows source and dest to each
have their own alignments by using the alignment attribute on their arguments.

 In this change we:
1) Remove the alignment argument.
2) Add alignment attributes to the source & dest arguments. We, temporarily,
   require that the alignments for source & dest be equal.

 For example, code which used to read:
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 100, i32 4, i1 false)
will now read
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 4 %dest, i8* align 4 %src, i32 100, i1 false)

 Downstream users may have to update their lit tests that check for
@llvm.memcpy/memmove/memset call/declaration patterns. The following extended sed script
may help with updating the majority of your tests, but it does not catch all possible
patterns so some manual checking and updating will be required.

s~declare void @llvm\.mem(set|cpy|move)\.p([^(]*)\((.*), i32, i1\)~declare void @llvm.mem\1.p\2(\3, i1)~g
s~call void @llvm\.memset\.p([^(]*)i8\(i8([^*]*)\* (.*), i8 (.*), i8 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i8(i8\2* \3, i8 \4, i8 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i16\(i8([^*]*)\* (.*), i8 (.*), i16 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i16(i8\2* \3, i8 \4, i16 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i32\(i8([^*]*)\* (.*), i8 (.*), i32 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i32(i8\2* \3, i8 \4, i32 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i64\(i8([^*]*)\* (.*), i8 (.*), i64 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i64(i8\2* \3, i8 \4, i64 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i128\(i8([^*]*)\* (.*), i8 (.*), i128 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i128(i8\2* \3, i8 \4, i128 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i8\(i8([^*]*)\* (.*), i8 (.*), i8 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i8(i8\2* align \6 \3, i8 \4, i8 \5, i1 \7)~g
s~call void @llvm\.memset\.p([^(]*)i16\(i8([^*]*)\* (.*), i8 (.*), i16 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i16(i8\2* align \6 \3, i8 \4, i16 \5, i1 \7)~g
s~call void @llvm\.memset\.p([^(]*)i32\(i8([^*]*)\* (.*), i8 (.*), i32 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i32(i8\2* align \6 \3, i8 \4, i32 \5, i1 \7)~g
s~call void @llvm\.memset\.p([^(]*)i64\(i8([^*]*)\* (.*), i8 (.*), i64 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i64(i8\2* align \6 \3, i8 \4, i64 \5, i1 \7)~g
s~call void @llvm\.memset\.p([^(]*)i128\(i8([^*]*)\* (.*), i8 (.*), i128 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i128(i8\2* align \6 \3, i8 \4, i128 \5, i1 \7)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i8\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i8 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i8(i8\3* \4, i8\5* \6, i8 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i16\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i16 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i16(i8\3* \4, i8\5* \6, i16 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i32\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i32 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i32(i8\3* \4, i8\5* \6, i32 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i64\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i64 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i64(i8\3* \4, i8\5* \6, i64 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i128\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i128 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i128(i8\3* \4, i8\5* \6, i128 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i8\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i8 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i8(i8\3* align \8 \4, i8\5* align \8 \6, i8 \7, i1 \9)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i16\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i16 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i16(i8\3* align \8 \4, i8\5* align \8 \6, i16 \7, i1 \9)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i32\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i32 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i32(i8\3* align \8 \4, i8\5* align \8 \6, i32 \7, i1 \9)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i64\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i64 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i64(i8\3* align \8 \4, i8\5* align \8 \6, i64 \7, i1 \9)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i128\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i128 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i128(i8\3* align \8 \4, i8\5* align \8 \6, i128 \7, i1 \9)~g

 The remaining changes in the series will:
Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing
   source and dest alignments.
Step 3) Update Clang to use the new IRBuilder API.
Step 4) Update Polly to use the new IRBuilder API.
Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API,
        and those that use use MemIntrinsicInst::[get|set]Alignment() to use
        getDestAlignment() and getSourceAlignment() instead.
Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the
        MemIntrinsicInst::[get|set]Alignment() methods.

Reviewers: pete, hfinkel, lhames, reames, bollu

Reviewed By: reames

Subscribers: niosHD, reames, jholewinski, qcolombet, jfb, sanjoy, arsenm, dschuff, dylanmckay, mehdi_amini, sdardis, nemanjai, david2050, nhaehnle, javed.absar, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, llvm-commits

Differential Revision: https://reviews.llvm.org/D41675

llvm-svn: 322965
2018-01-19 17:13:12 +00:00
Sanjay Patel
e68c91d039 [x86] shrink 'and' immediate values by setting the high bits (PR35907)
Try to reverse the constant-shrinking that happens in SimplifyDemandedBits()
for 'and' masks when it results in a smaller sign-extended immediate.

We are also able to detect dead 'and' ops here (the mask is all ones). In
that case, we replace and return without selecting the 'and'.

Other targets might want to share some of this logic by enabling this under a
target hook, but I didn't see diffs for simple cases with PowerPC or AArch64,
so they may already have some specialized logic for this kind of thing or have
different needs.

This should solve PR35907:
https://bugs.llvm.org/show_bug.cgi?id=35907

Differential Revision: https://reviews.llvm.org/D42088

llvm-svn: 322957
2018-01-19 16:37:25 +00:00
Nirav Dave
1bceaa6e21 [X86] Extend load-op-store fusion merge to ADC/SBB.
Summary: Add handling of EFLAG input to X86 Load-op-store fusion checking.

Reviewers: craig.topper, RKSimon

Subscribers: llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D42128

llvm-svn: 322952
2018-01-19 15:37:57 +00:00
Craig Topper
1b2403c0cf [X86] Make better use of instregex for cmovcc/setcc/jcc instructions in the Intel scheduler models.
Combine all the separate condition codes into a singular expression when possible.

llvm-svn: 322924
2018-01-19 05:47:32 +00:00
Craig Topper
1c2f80aef6 [X86] Add intrinsic support for the RDPID instruction
This adds a new instrinsic to support the rdpid instruction. The implementation is a bit weird because the intrinsic is defined as always returning 32-bits, but the assembler support thinks the instruction produces a 64-bit register in 64-bit mode. But really it zeros the upper 32 bits. So I had to add separate patterns where 64-bit mode uses an extract_subreg.

Differential Revision: https://reviews.llvm.org/D42205

llvm-svn: 322910
2018-01-18 23:52:31 +00:00
Craig Topper
672594bd71 [X86] Use vmovdqu64/vmovdqa64 for unmasked integer vector stores for consistency with loads.
Previously we used 64 for vXi64 stores and 32 for everything else. This change uses 64 for everything just like do for loads.

llvm-svn: 322820
2018-01-18 07:44:09 +00:00
Craig Topper
5585e2235d [X86] Remove isel patterns for using unmasked vmovdqa32/vmovdqu32 for integer vector loads.
These patterns were just looking for a vXi64 bitcasted to vXi32, but there is no advantage to using vmovdqa32 over vmovdqa64.

llvm-svn: 322819
2018-01-18 07:44:06 +00:00
Reid Kleckner
8873e1db2a [CodeGen] Hoist common AsmPrinter code out of X86, ARM, and AArch64
Every known PE COFF target emits /EXPORT: linker flags into a .drective
section. The AsmPrinter should handle this.

While we're at it, use global_values() and emit each export flag with
its own .ascii directive. This should make the .s file output more
readable.

llvm-svn: 322788
2018-01-17 23:55:23 +00:00
Simon Pilgrim
af727ae93a [X86][BTVER2] Reduce instregex usage (PR35955)
Most are just replaced with instrs lists, but a few regexps have been further generalized to match more instructions with a single pattern.

llvm-svn: 322734
2018-01-17 19:12:48 +00:00
Craig Topper
1c79d356f8 [X86] Teach LowerBUILD_VECTOR to recognize pair-wise splats of 32-bit elements and use a 64-bit broadcast
If we are splatting pairs of 32-bit elements, we can use a 64-bit broadcast to get the job done.

We could probably could probably do this with other sizes too, for example four 16-bit elements. Or we could broadcast pairs of 16-bit elements using a 32-bit element broadcast. But I've left that as a future improvement.

I've also restricted this to AVX2 only because we can only broadcast loads under AVX.

Differential Revision: https://reviews.llvm.org/D42086

llvm-svn: 322730
2018-01-17 18:58:22 +00:00
Craig Topper
85cb5c17f6 [X86] When legalizing (v64i1 select i8, v64i1, v64i1) make sure not to introduce bitcasts to i64 in 32-bit mode
We legalize selects of masks with scalar conditions using a bitcast to an integer type. But if we are in 32-bit mode we can't convert v64i1 to i64. So instead split the v64i1 to v32i1 and concat it back together. Each half will then be legalized by bitcasting to i32 which is fine.

The test case is a little indirect. If we have the v64i1 select in IR it will get legalized by legalize vector ops which has a run of type legalization after it. That type legalization run is able to fix this i64 bitcast. So in order to avoid that we need a build_vector of a splat which legalize vector ops will ignore. Legalize DAG will then turn that into a select via LowerBUILD_VECTORvXi1. And the select will get legalized. In this case there is no type legalizer run to cleanup the bitcast.

This fixes pr35972.

llvm-svn: 322724
2018-01-17 18:46:01 +00:00
Benjamin Kramer
5dc3847399 [X86] Don't mutate shuffle arguments after early-out for AVX512
The match* functions have the annoying behavior of modifying its inputs.
Save and restore the inputs, just in case the early out for AVX512 is
hit. This is still not great and its only a matter of time this kind of
bug happens again, but I couldn't come up with a better pattern without
rewriting significant chunks of this code. Fixes PR35977.

llvm-svn: 322644
2018-01-17 13:01:06 +00:00
Benjamin Kramer
00cc7e3e32 [X86] Constify DebugLoc parameters. No functionality change.
llvm-svn: 322643
2018-01-17 13:00:58 +00:00
Andrew V. Tischenko
1fb380cedb Allow usage of X86-prefixes as separate instrs.
Differential Revision: https://reviews.llvm.org/D42102

llvm-svn: 322623
2018-01-17 10:12:06 +00:00
Craig Topper
94bc93e190 [X86] In LowerBUILD_VECTOR, rename ExtVT to EltVT so it makes sense.
llvm-svn: 322616
2018-01-17 03:58:21 +00:00
Craig Topper
aa7c3ff71e [X86] Remove duplicate lines from scheduler models. NFC
llvm-svn: 322615
2018-01-17 03:50:21 +00:00
Simon Pilgrim
15dfa1bfc1 [X86][BTVER2] Fix scheduling of VCMPSD/VCMPSS instructions
For some reason they don't have a trailing i like the packed equivalents.

llvm-svn: 322600
2018-01-16 22:15:41 +00:00
Simon Pilgrim
bbd306f411 [X86][BTVER2] Use instrs instead of instregex for low match counts (PR35955)
llvm-svn: 322598
2018-01-16 22:08:43 +00:00
Simon Pilgrim
6ddd65608e [X86][BTVER2] Use instrs instead of instregex for single use matches (PR35955)
llvm-svn: 322597
2018-01-16 21:44:48 +00:00
Simon Pilgrim
3c09cf2a76 [X86][MMX] Accept UNDEF upper bits for MOVD GR32->MMX
llvm-svn: 322574
2018-01-16 17:01:31 +00:00
Simon Pilgrim
d94666a84c [X86][MMX] Improve MMX constant generation
Extend the MMX zero code to take any constant with zero'd upper 32-bits

llvm-svn: 322553
2018-01-16 14:21:28 +00:00
Craig Topper
73a154cef6 [X86] Revisit the fix I made years ago to make 'xchgl %eax, %eax' not encode using the 0x90 encoding in 64-bit mode.
Prior to this we had a separate instruction and register class that excluded eax to prevent matching the instruction that would encode with 0x90.

This patch changes this to just use an InstAlias to force xchgl %eax, %eax to use XCHG32rr instruction in 64-bit mode. This gets rid of the separate instruction and register class.

llvm-svn: 322532
2018-01-16 06:07:16 +00:00
Craig Topper
c065faaccf [X86] Make 'xchgq %rax, %rax' an alias for the 0x90 nop encoding to match gas.
Previously we encoded it as 0x48 0x90.

llvm-svn: 322531
2018-01-16 06:07:14 +00:00
Simon Pilgrim
8a7237fd4f Avoid Wparentheses warning.
llvm-svn: 322526
2018-01-15 22:40:06 +00:00
Simon Pilgrim
a8d108687b [X86][MMX] Add support for MMX zero vector creation
As mentioned on PR35869, (and came up recently on D41517) we don't create a MMX zero register via the PXOR but instead perform a spill to stack from a XMM zero register.

This patch adds support for direct MMX zero vector creation and should make it easier to add better constant vector creation in the future as well.

Differential Revision: https://reviews.llvm.org/D41908

llvm-svn: 322525
2018-01-15 22:32:40 +00:00
Simon Pilgrim
9a1d5eeb3e [X86][SSE] Add custom execution domain fixing for BLENDPD/BLENDPS/PBLENDD/PBLENDW (PR34873)
Add support for custom execution domain fixing and implement support for BLENDPD/BLENDPS/PBLENDD/PBLENDW.

Differential Revision: https://reviews.llvm.org/D42042

llvm-svn: 322524
2018-01-15 22:18:45 +00:00
Craig Topper
4a0f80b0a4 [X86] Use MVT::getVectorVT instead of EVT::getVectorVT when splitting 256/512 bit build_vectors. NFC
We must be creating a legal type here which means it can be an MVT.

llvm-svn: 322512
2018-01-15 20:33:53 +00:00
Craig Topper
a5860c1159 [X86] Generalize some code in LowerBUILD_VECTOR. NFC
llvm-svn: 322511
2018-01-15 20:33:52 +00:00
Craig Topper
a103183a87 [X86] Remove unnecessary if statement from LowerBUILD_VECTOR. NFCI
We were checking for 128, 256, or 512 bit vectors, but those are the only types that can get here.

llvm-svn: 322510
2018-01-15 20:33:50 +00:00
Simon Pilgrim
91f050f700 [X86] Fix typos in WriteVMOVNTDQSt and WriteVMOVNTPYSt pattern names. NFCI.
llvm-svn: 322495
2018-01-15 17:55:21 +00:00
Clement Courbet
e370a26012 [X86] Add missing predicates for VRNDSCALES{D,S}{m,r}
Summary: This is similar to https://reviews.llvm.org/D41983.

Reviewers: gchatelet

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D42069

llvm-svn: 322486
2018-01-15 14:24:07 +00:00
Andrew V. Tischenko
99ef3709ba Update BTVER2 sched numbers for some AVX instructions (xmm version).
Differential Revision: https://reviews.llvm.org/D40067

llvm-svn: 322485
2018-01-15 14:21:11 +00:00
Clement Courbet
ededcd269c [X86]Add missing predicates for VMOVDQUYrm,VMOVDQUYmr.
Summary:
Due to missing parentheses.

This is similar to https://reviews.llvm.org/D41983.

Reviewers: gchatelet

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D42062

llvm-svn: 322483
2018-01-15 13:37:05 +00:00
Clement Courbet
53fea2956b [X86] Fix missing predicates HasAVX512 Predicates in avx512_sqrt_scalar.
Summary:
For example, VSQRTSDZr and VSQRTSSZr were missing the predicate.
Also fix braces indentation and braces for consistency.

Reviewers: craig.topper, RKSimon

Suscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D41983

llvm-svn: 322478
2018-01-15 12:05:33 +00:00
Simon Pilgrim
6556cecd6b [X86][SSE] Support combining MOVLHPS undef inputs
llvm-svn: 322459
2018-01-14 18:50:34 +00:00
Craig Topper
7095915282 [X86] Use ISD::TRUNCATE instead of X86ISD::VTRUNC when input and output types have the same number of elements.
llvm-svn: 322455
2018-01-14 08:11:36 +00:00
Craig Topper
f1f9a8c7f7 [X86] Add X86ISD::VTRUNC to computeKnownBitsForTargetNode.
We have to take special care to avoid the cases where the result of the truncate would be padded with zero elements.

Ideally we'd just use ISD::TRUNCATE for these cases instead.

llvm-svn: 322454
2018-01-14 08:11:33 +00:00
Craig Topper
767d0f3bfe [X86] Improve legalization of vXi16/vXi8 selects.
Extend vXi1 conditions of vXi8/vXi16 selects even before type legalization gets a chance to split wide vectors. Previously we would only extend 128 and 256 bit vectors. But if we start with a 512 bit vector or wider that needs to be split we wouldn't extend until after the split had taken place. By extending early we improve the results of type legalization.

Don't widen condition of 128/256 bit vXi16/vXi8 selects when we have BWI but not VLX. We can still use a mask register by widening the select to 512-bits instead. This is similar to what we do for compares already.

llvm-svn: 322450
2018-01-14 02:05:51 +00:00
Zvi Rackover
441282dd8c X86: Add pattern matching for PMADDWD
In addition to the existing match as part of a loop-reduction, add a
straightforward pattern match for DAG-contained patterns.

Reviewers: RKSimon, craig.topper

Subscribers: llvm-commits

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D41811

llvm-svn: 322446
2018-01-13 17:42:19 +00:00
Craig Topper
297e87e001 [X86] Add DAG combine to promote vXi1 result of a vXi8/vXi16 setcc when we have AVX512 but not BWI.
This avoids having the result type stick around until lowering where we have to extend the setcc and insert a truncate. If we get the types converted early we can do more to optimize it.

llvm-svn: 322432
2018-01-13 06:24:46 +00:00
Craig Topper
aa0ccaf5cd [X86] Remove unused isel pattern for zero extend from v16i1/v8i1 to v16i32/v8i64.
We have custom lowering on vzext that produces a vselect and a build vector. So zext never gets to isel.

llvm-svn: 322381
2018-01-12 17:34:09 +00:00
Craig Topper
46f6d0692f [X86] Don't allow lods/stos/scas/cmps/movs to be parsed without a suffix and only memory operand in at&t syntax.
Without a register with a size being mentioned the instruction is ambiguous in at&t syntax. With Intel syntax the memory operation caries a size that can be used to disambiguate.

llvm-svn: 322356
2018-01-12 06:48:26 +00:00
Craig Topper
67c7573d58 [X86] Don't require suffix on 'clr' mnemonic in intel syntax
llvm-svn: 322355
2018-01-12 06:48:24 +00:00
Craig Topper
2e17caf317 [X86] Add 'l' and 'q' suffixes to the tbm instruction mnemonics.
While the suffix isn't required to disambiguate the instructions, it is required in order to parse the instructions when the suffix is specified in order to match the GNU assembler.

llvm-svn: 322354
2018-01-12 06:21:36 +00:00
Craig Topper
6509d82de1 [X86] Disable sldtq parsing in 64-bit mode.
llvm-svn: 322353
2018-01-12 05:38:15 +00:00
Craig Topper
2a983cee68 [X86] Disable movsq/stosq/scasqcmpsq/lodsq parsing in 64-bit mode.
llvm-svn: 322352
2018-01-12 05:38:14 +00:00
David L. Jones
3c7677b5c6 Revert r322279 due to Skylake miscompile.
Summary:
This revision causes Skylake (and apparently, only Skylake) codegen to fail in
certain cases. Details: https://bugs.llvm.org/show_bug.cgi?id=35918

Subscribers: sanjoy, llvm-commits

Differential Revision: https://reviews.llvm.org/D41972

llvm-svn: 322335
2018-01-12 00:17:38 +00:00