llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-26 06:22:56 +02:00

Author	SHA1	Message	Date
David Majnemer	93803262f4	[X86] Add intrinsics for reading and writing to the flags register LLVM's targets need to know if stack pointer adjustments occur after the prologue. This is needed to correctly determine if the red-zone is appropriate to use or if a frame pointer is required. Normally, LLVM can figure this out very precisely by reasoning about the contents of the MachineFunction. There is an interesting corner case: inline assembly. The vast majority of inline assembly which will perform a push or pop is done so to pair up with pushf or popf as appropriate. Unfortunately, this inline assembly doesn't mark the stack pointer as clobbered because, well, it isn't. The stack pointer is decremented and then immediately incremented. Because of this, LLVM was changed in r256456 to conservatively assume that inline assembly contains a sequence of stack operations. This is unfortunate because the vast majority of inline assembly will not end up manipulating the stack pointer in any way at all. Instead, let's provide a more principled solution: an intrinsic. FWIW, other compilers (MSVC and GCC among them) also provide this functionality as an intrinsic. llvm-svn: 256685	2016-01-01 06:50:01 +00:00
Craig Topper	4ffb442ed6	[X86] Remove a return after llvm_unreachable. llvm-svn: 256681	2015-12-31 22:40:48 +00:00
Craig Topper	cf29b2e15a	[X86] Move shuffle decoding for constant pool into the X86CodeGen library to remove a layering violation in the Util library. llvm-svn: 256680	2015-12-31 22:40:45 +00:00
Michael Zuckerman	861e8172f1	[AVX512] add PSRLQ and PSRLD Intrinsic Differential Revision: http://reviews.llvm.org/D15770 llvm-svn: 256673	2015-12-31 15:22:04 +00:00
Michael Kuperstein	ebbd053e6a	[X86] Avoid folding scalar loads into unary sse intrinsics Not folding these cases tends to avoid partial register updates: sqrtss (%eax), %xmm0 has a partial update of %xmm0, while movss (%eax), %xmm0 followed by sqrtss %xmm0, %xmm0 has a clobber of the high lanes immediately before the partial update, avoiding a potential stall. Given this, we only want to fold when optimizing for size. This is consistent with the patterns we already have for some of the fp/int converts, and in X86InstrInfo::foldMemoryOperandImpl() Differential Revision: http://reviews.llvm.org/D15741 llvm-svn: 256671	2015-12-31 09:45:16 +00:00
Asaf Badouh	f9720f53b4	[X86][PKU] Add {RD,WR}PKRU intrinsics Differential Revision: http://reviews.llvm.org/D15808 llvm-svn: 256670	2015-12-31 08:31:13 +00:00
Sanjay Patel	32b3efed19	use range-based for-loops; NFCI llvm-svn: 256573	2015-12-29 19:14:23 +00:00
Michael Zuckerman	d97aa00156	[AVX512] add PSRLW Intrinsic Differential Revision: http://reviews.llvm.org/D15751 llvm-svn: 256558	2015-12-29 13:04:35 +00:00
Craig Topper	26319780b7	[X86] Remove declaration of ATTAsmParser. It's equivalent to the DefaultAsmParser. NFC llvm-svn: 256541	2015-12-29 07:03:27 +00:00
Sanjay Patel	61b36ff574	[x86] lower calls to fmin and llvm.minnum.* using minss/minsd/minps/minpd (PR24475) This is a follow-on to: http://reviews.llvm.org/rL255700 http://reviews.llvm.org/rL256454 http://reviews.llvm.org/rL256510 llvm-svn: 256522	2015-12-28 21:16:55 +00:00
Elena Demikhovsky	3ed0b3c7f1	Implemented cost model for masked gather and scatter operations The cost is calculated for all X86 targets. When gather/scatter instruction is not supported we calculate the cost of scalar sequence. Differential revision: http://reviews.llvm.org/D15677 llvm-svn: 256519	2015-12-28 20:10:59 +00:00
Sanjay Patel	7785be794b	[x86] lower calls to fmax and llvm.maxnum.* using maxps/maxpd (PR24475) This is a follow-on to: http://reviews.llvm.org/rL255700 http://reviews.llvm.org/rL256454 llvm-svn: 256510	2015-12-28 19:20:19 +00:00
Sanjay Patel	9ffd44cf90	tidy up; NFC llvm-svn: 256506	2015-12-28 18:18:22 +00:00
Michael Kuperstein	ba0a393451	[X86] Better support for the MCU psABI (LLVM part) This adds support for the MCU psABI in a way different from r251223 and r251224, basically reverting most of these two patches. The problem with the approach taken in r251223/4 is that it only handled libcalls that originated from the backend. However, the mid-end also inserts quite a few libcalls and assumes these use the platform's default calling convention. The previous patch tried to insert inregs when necessary both in the FE and, somewhat hackily, in the CG. Instead, we now define a new default calling convention for the MCU, which doesn't use inreg marking at all, similarly to what x86-64 does. Differential Revision: http://reviews.llvm.org/D15054 llvm-svn: 256494	2015-12-28 14:39:21 +00:00
Asaf Badouh	3e8d6828a0	[X86][AVX512] Lower broadcast sub vector to vector intrinsics Lower broadcast<type>x<vector> to shuffles. There are two cases: 1. src is 128 bits and dest is 512 bits: in this case we will lower it to a shuffle with imm = 0. 2. src is 256 bits and dest is 512 bits: in this case we will lower it to a shuffle with imm = 01000100b (0x44); that way we will broadcast the 256-bit source: ymm[0,1,2,3] => zmm[0,1,2,3,0,1,2,3]. Then it will mask it with the passthru value (in case it's a masked op). Differential Revision: http://reviews.llvm.org/D15790 llvm-svn: 256490	2015-12-28 08:26:26 +00:00
Asaf Badouh	6fcb80c7ac	[X86][AVX512] add fp scalar broadcast intrinsics Differential Revision: http://reviews.llvm.org/D15790 llvm-svn: 256489	2015-12-28 08:09:25 +00:00
Craig Topper	228bc66bfc	[AVX512] Remove VEX_LIG from vmovd/vmovq instructions. From what I can tell from the Intel docs these instructions require the L-bit to be 0. llvm-svn: 256486	2015-12-28 06:32:47 +00:00
Craig Topper	a315cd0b1d	[AVX512] Fix some places that used FR64 instead of FR64X. llvm-svn: 256484	2015-12-28 06:11:45 +00:00
Craig Topper	cf3121d888	[AVX512] Bring vmovq instructions names into alignment with the AVX and SSE names. Add a missing encoding to disassembler and assembler. I believe this also fixes a case where a 64-bit memory form that is documented as being unsupported in 32-bit mode was able to be selected there. llvm-svn: 256483	2015-12-28 06:11:42 +00:00
Craig Topper	78232095c9	[X86] Move address for store target from outs to ins on a couple instructions. llvm-svn: 256482	2015-12-28 06:11:39 +00:00
Craig Topper	bfdc4a3764	[X86] Add proper Uses/Defs/mayLoad flags for AAA/AAD/AAM/AAS/DAA/DAS/XLAT instructions. llvm-svn: 256481	2015-12-28 06:11:37 +00:00
Craig Topper	e4e0592ca3	[AVX512] Remove separate instruction and patterns for lowering ctlz_zero_undef. Change the operation for CTLZ_ZERO_UNDEF to Expand so SelectionDAG will convert them to CTLZ before lowering. llvm-svn: 256477	2015-12-27 21:33:50 +00:00
Craig Topper	ce5014e9fe	[AVX512] Remove alternate data type versions of VALIGND, VALIGNQ, VMOVSHDUP and VMOVSLDUP. They don't have any tests and I don't think they can be selected. If they are truly needed they should be implemented with patterns against the normal instructions and not separate instructions. llvm-svn: 256475	2015-12-27 19:45:21 +00:00
Igor Breger	a848a96908	AVX512: Change VPMOVB2M DAG lowering, use CVT2MASK node instead of TRUNCATE. Fix TRUNCATE lowering vector to vector i1, use LSB and not MSB. Implement VPMOVB/W/D/Q2M intrinsic. Differential Revision: http://reviews.llvm.org/D15675 llvm-svn: 256470	2015-12-27 13:56:16 +00:00
Asaf Badouh	f94cbd0492	[X86][AVX512] change broadcast to use maskable pattern Differential Revision: http://reviews.llvm.org/D15786 llvm-svn: 256469	2015-12-27 12:14:34 +00:00
Craig Topper	c79efd26f5	[AVX-512] Remove alternate integer forms for VPERMILPS and VPERMILPD. There are no tests for them and I don't see any way to select them anyway. If they are really needed they should be implemented as patterns and not full-fledged instructions. llvm-svn: 256462	2015-12-27 06:55:08 +00:00
David Majnemer	38d1ffe261	[X86, Win64] Use a frame pointer if pushf is emitted A frame pointer must be used if stack pointer is modified after the prologue. LLVM will emit pushf/popf if we need to save/restore the FLAGS register, requiring us to have a frame pointer for the function. There is a small twist: this sequence might exist in user code via inline-assembly. For now, conservatively assume that such functions require a frame pointer. For real world justification, please see clang's implementation of __readeflags. This fixes PR25945. llvm-svn: 256456	2015-12-27 06:07:26 +00:00
Sanjay Patel	5cb4cfb9d6	[x86] lower calls to llvm.maxnum.v4f32 using maxps This is a follow-on to: http://reviews.llvm.org/rL255700 llvm-svn: 256454	2015-12-26 21:44:55 +00:00
Craig Topper	e482a43cb8	[X86] Fix an unused variable warning in release builds. llvm-svn: 256453	2015-12-26 20:13:33 +00:00
Craig Topper	dcab6c8bcf	[X86] Add support for printing shuffle comments for AVX512 PSHUFB instructions. llvm-svn: 256452	2015-12-26 19:48:43 +00:00
Craig Topper	89319531b9	[X86] Fold some variable declarations and initializations into if statements. NFC llvm-svn: 256451	2015-12-26 19:48:37 +00:00
Craig Topper	de07308c81	[X86] Fix shuffle decoding for variable VPERMIL to be tolerant of the Constant type not matching due to folding in the constant pool and to get VPERMILPD correct. llvm-svn: 256433	2015-12-26 04:50:07 +00:00
Craig Topper	38dac27e3a	[X86] Fix copy and paste typo from pasting from another Makefile to restore code. llvm-svn: 256431	2015-12-25 23:27:57 +00:00
Craig Topper	f89e5ceed0	[X86] Put back the include path to the main X86 sources in the AsmParser library to fix the bots. llvm-svn: 256430	2015-12-25 22:22:16 +00:00
Craig Topper	7e1d2e52cd	[X86] Remove X86CodeGen dependency from the AsmParser library. llvm-svn: 256429	2015-12-25 22:10:11 +00:00
Craig Topper	8d6e33b512	[X86] Move getX86SubSuperRegisterOrZero to X86MCTargetDesc.cpp so it can be used by AsmParser library without depending on X86CodeGen library. llvm-svn: 256428	2015-12-25 22:10:08 +00:00
Craig Topper	8ac345d153	Remove extra forward declarations and scrub includes for all in tree InstPrinters. NFC llvm-svn: 256427	2015-12-25 22:10:01 +00:00
Craig Topper	08678ccbc2	[X86] Move AVX512 STATIC_ROUNDING enum to X86BaseInfo.h to fix a layering violation in AsmParser. llvm-svn: 256426	2015-12-25 22:09:49 +00:00
Craig Topper	fe5f33f108	[X86] Replace MVT::SimpleValueType in the AsmParser library and getX86SubSuperRegister with just an unsigned representing size. This is a step towards fixing a layering violation so the X86 AsmParser won't depend on CodeGen types. llvm-svn: 256425	2015-12-25 22:09:45 +00:00
Craig Topper	f023f89e96	[X86] Don't pass the default value to the High argument of getX86SubSuperRegister. Most places don't care about this argument. NFC llvm-svn: 256424	2015-12-25 19:44:16 +00:00
Craig Topper	f3eef0e344	[X86] getX86SubSuperRegisterOrZero shouldn't call getX86SubSuperRegister recursively. It should call itself instead. Otherwise it might fire an assertion when it was designed not to. llvm-svn: 256422	2015-12-25 17:07:32 +00:00
Craig Topper	be47256c20	[X86] Add missing X86II::MRM_C4, MRM_C5, etc. encodings to getMemoryOperandNo. These aren't used by any instructions, but could be someday. NFC llvm-svn: 256421	2015-12-25 17:07:30 +00:00
Craig Topper	474cc56790	[X86] Use assert instead of if and llvm_unreachable. NFC llvm-svn: 256420	2015-12-25 17:07:27 +00:00
Craig Topper	e1f71ad36d	[X86] Minor indentation fixes. NFC llvm-svn: 256419	2015-12-25 17:07:24 +00:00
Marina Yatsina	38411a839c	[X86][ms-inline asm] Add support for memory operands that include structs Add ability to reference struct symbols in memory operands. Test case will be added on the clang side (review http://reviews.llvm.org/D15749) Differential Revision: http://reviews.llvm.org/D15748 llvm-svn: 256381	2015-12-24 12:09:51 +00:00
Asaf Badouh	593a27ca5c	[X86][PKU] Add {RD,WR}PKRU encoding Differential Revision: http://reviews.llvm.org/D15711 llvm-svn: 256366	2015-12-24 08:25:00 +00:00
Elena Demikhovsky	423b83228e	AVX-512: Kreg set 0/1 optimization The patterns that set a mask register to 0/1 KXOR %kn, %kn, %kn / KXNOR %kn, %kn, %kn are replaced with KXOR %k0, %k0, %kn / KXNOR %k0, %k0, %kn - AVX-512 targets optimization. KNL does not recognize dependency-breaking idioms for mask registers, so kxnor %k1, %k1, %k2 has a RAW dependence on %k1. Using %k0 as the undef input register is a performance heuristic based on the assumption that %k0 is used less frequently than the other mask registers, since it is not usable as a write mask. Differential Revision: http://reviews.llvm.org/D15739 llvm-svn: 256365	2015-12-24 08:12:22 +00:00
Igor Breger	855ac148cd	AVX512: VPMOVM2B/W/D/Q intrinsic implementation. Differential Revision: http://reviews.llvm.org//D15747 llvm-svn: 256364	2015-12-24 07:11:53 +00:00
Simon Pilgrim	1750355ddf	[X86][AVX] Only shuffle the lower half of vectors if the upper half is undefined First step towards making better use of AVX's implicit zeroing of the upper half of a 256-bit vector by instructions that only act on the lower 128-bit vector - discussed on D14151. As well as the fact that 128-bit shuffle instructions are generally more capable, this can be performant for older CPUs with 128-bit ALUs (e.g. Jaguar, Sandy Bridge) that must treat 256-bit vectors as multiple micro-ops. Moved the similar subvector extraction shuffle combines from PerformShuffleCombine256 to lowerVectorShuffle as well. Note: I've avoided combining shuffles that reference elements from the upper halves of the input vectors - this may be reviewed in future work as well (AVX1 would probably always gain, but AVX2 does have some cross-lane shuffle instructions). Differential Revision: http://reviews.llvm.org/D15477 llvm-svn: 256332	2015-12-23 13:10:07 +00:00
Igor Breger	305115c35b	AVX512BW: Enable packed word shift for 512bit vector. Enable lowering scalar immediate shift v64i8. Fix predicate for AVX1/2 shifts. Differential Revision: http://reviews.llvm.org/D15713 llvm-svn: 256324	2015-12-23 08:06:50 +00:00
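
Note on commit 3e8d6828a0 (broadcast sub vector lowering): the two shuffle immediates quoted in that message can be checked with a small model. The sketch below is not LLVM code. It assumes a 128-bit-lane shuffle in which each 2-bit field of the immediate selects a source lane, with the two low destination lanes read from the first source and the two high lanes from the second. The function name `shuffle_128bit_lanes` and the list-of-elements representation are hypothetical and used only for illustration.

```python
def shuffle_128bit_lanes(src1, src2, imm):
    """Hypothetical model of a 512-bit shuffle over 128-bit lanes.

    Vectors are lists of eight 64-bit elements (two elements per lane).
    Each 2-bit field of imm picks a source lane; destination lanes 0-1
    read from src1 and destination lanes 2-3 read from src2 (an assumption
    of this sketch).
    """
    assert len(src1) == 8 and len(src2) == 8
    result = []
    for dest_lane in range(4):
        selector = (imm >> (2 * dest_lane)) & 0b11
        source = src1 if dest_lane < 2 else src2
        result.extend(source[2 * selector:2 * selector + 2])
    return result


# Case 2 from the commit message: a 256-bit source, imm = 0x44.
# Only the low half is meaningful; the upper half is a placeholder.
widened_256 = [0, 1, 2, 3] + [None] * 4
broadcast_256 = shuffle_128bit_lanes(widened_256, widened_256, 0x44)

# Case 1 from the commit message: a 128-bit source, imm = 0.
widened_128 = [0, 1] + [None] * 6
broadcast_128 = shuffle_128bit_lanes(widened_128, widened_128, 0)
```

Under these assumptions, 0x44 splits into the fields 00, 01, 00, 01 (low to high), so the low 256 bits are repeated in both halves, matching the ymm[0,1,2,3] => zmm[0,1,2,3,0,1,2,3] mapping in the message. An immediate of 0 repeats lane 0 four times.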


12531 Commits