llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 03:53:04 +02:00

Author	SHA1	Message	Date
Lang Hames	7a23518af7	During folding for patchpoint/stackmap instructions, defer creation of new MIs until we know that folding will be successful. No functional change. llvm-svn: 194880	2013-11-15 23:13:21 +00:00
Juergen Ributzka	ee3af15269	[weak vtables] Remove a bunch of weak vtables This patch removes most of the trivial cases of weak vtables by pinning them to a single object file. Differential Revision: http://llvm-reviews.chandlerc.com/D2068 Reviewed by Andy llvm-svn: 194865	2013-11-15 22:34:48 +00:00
Bob Wilson	d433cf7463	Avoid illegal integer promotion in fastisel Stop folding constant adds into GEP when the type size doesn't match. Otherwise, the adds' operands are effectively being promoted, changing the conditions of an overflow. Results are different when: sext(a) + sext(b) != sext(a + b) Problem originally found on x86-64, but also fixed issues with ARM and PPC, which used similar code. <rdar://problem/15292280> Patch by Duncan Exon Smith! llvm-svn: 194840	2013-11-15 19:09:27 +00:00
Cameron McInally	cae8bdeb82	Add AVX512 unmasked FMA intrinsics and support. llvm-svn: 194824	2013-11-15 17:01:14 +00:00
Matt Arsenault	9921608896	Add addrspacecast instruction. Patch by Michele Scandale! llvm-svn: 194760	2013-11-15 01:34:59 +00:00
Elena Demikhovsky	bac904c06d	AVX-512: Handled extractelement from mask vector; Added VMOVSHDUP/VMOVSLDUP shuffle instructions. llvm-svn: 194691	2013-11-14 11:29:27 +00:00
Andrew Trick	1eb87f0d42	Minor extension to llvm.experimental.patchpoint: don't require a call. If a null call target is provided, don't emit a dummy call. This allows the runtime to reserve as little nop space as it needs without the requirement of emitting a call. llvm-svn: 194676	2013-11-14 06:54:10 +00:00
Juergen Ributzka	b47be624ea	SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs splitting too. This patch reapplies r193676 with an additional fix for the Hexagon backend. The SystemZ backend has already been fixed by r194148. The Type Legalizer recognizes that VSELECT needs to be split, because the type is too wide for the given target. The same does not always apply to SETCC, because less space is required to encode the result of a comparison. As a result VSELECT is split and SETCC is unrolled into scalar comparisons. This commit fixes the issue by checking for VSELECT-SETCC patterns in the DAG Combiner. If a matching pattern is found, then the result mask of SETCC is promoted to the expected vector mask type for the given target. Now the type legalizer will split both VSELECT and SETCC. This allows the following X86 DAG Combine code to successfully detect the MIN/MAX pattern. This fixes PR16695, PR17002, and <rdar://problem/14594431>. Reviewed by Nadav llvm-svn: 194542	2013-11-13 01:57:54 +00:00
Andrew Trick	12470267da	Cleanup the stackmap operand folding code and fix a corner case. I still don't know how to refer to the fixed operands symbolically. I plan to look into it. llvm-svn: 194529	2013-11-12 22:58:39 +00:00
Eric Christopher	b7b2cc176c	Add a FIXME for 32-bit q modifiers. llvm-svn: 194515	2013-11-12 21:47:44 +00:00
Andrew Trick	56e6608cf0	Simplify operand folding when rematerializing a load. We already know how to fold a reload from a frameindex without analyzing the load instruction. Generalize this to handle any frameindex load. This streamlines the logic for rematerializing loads from stack arguments. As a side effect, it allows stackmaps to record a stack argument location without spilling it. Verified no effect on codegen for llvm test-suite. llvm-svn: 194497	2013-11-12 18:06:12 +00:00
Lang Hames	dcd012fa30	Lower X86::MORESTACK_RET and X86::MORESTACK_RET_RESTORE_R10 in X86AsmPrinter::EmitInstruction, rather than X86MCInstLower::Lower. The aim is to improve the reusability of the X86MCInstLower class by making it more function-like. The X86::MORESTACK_RET_RESTORE_R10 pseudo broke the function model by emitting an extra instruction to the MCStreamer attached to the AsmPrinter. The patch should have no impact on generated code. llvm-svn: 194431	2013-11-11 23:00:41 +00:00
Andrew Trick	9a4f1fc067	Fix the recently added anyregcc convention to handle spilled operands. Fixes <rdar://15432754> [JS] Assertion: "Folded a def to a non-store!" The primary purpose of anyregcc is to prevent a patchpoint's call arguments and return value from being spilled. They must be available in a register, although the calling convention does not pin the register. It's up to the front end to avoid using this convention for calls with more arguments than allocatable registers. llvm-svn: 194428	2013-11-11 22:40:25 +00:00
Juergen Ributzka	a748d55906	[Stackmap] Materialize the jump address within the patchpoint noop slide. This patch moves the jump address materialization inside the noop slide. This enables patching of the materialization itself or its complete removal. This patch also adds the ability to define scratch registers that can be used safely by the code called from the patchpoint intrinsic. At least one scratch register is required, because that one is used for the materialization of the jump address. This patch depends on D2009. Differential Revision: http://llvm-reviews.chandlerc.com/D2074 Reviewed by Andy llvm-svn: 194306	2013-11-09 01:51:33 +00:00
Juergen Ributzka	f27436b708	[Stackmap] Add AnyReg calling convention support for patchpoint intrinsic. The idea of the AnyReg Calling Convention is to provide the call arguments in registers, but not to force them to be placed in a particular order into a specified set of registers. Instead it is up to the register allocator to assign any register as it sees fit. The same applies to the return value (if applicable). Differential Revision: http://llvm-reviews.chandlerc.com/D2009 Reviewed by Andy llvm-svn: 194293	2013-11-08 23:28:16 +00:00
Jim Grosbach	b8435149f5	X86: Assembly files with .cfi_cfa_def shouldn't hit llvm_unreachable() On darwin, when trying to create compact unwind info, a .cfi_cfa_def directive would cause an llvm_unreachable() to be hit. Back off when we see this directive and generate the regular DWARF style eh_frame. rdar://15406518 llvm-svn: 194285	2013-11-08 22:33:06 +00:00
David Majnemer	ac56140f8a	X86 Disassembler: remove unused bool typedef-name llvm-svn: 194062	2013-11-05 10:34:42 +00:00
Craig Topper	8a08a00b6c	Lift alignment restrictions on load folding for a significant portion of AVX instructions. llvm-svn: 194048	2013-11-05 06:31:43 +00:00
Eric Christopher	a42eaab3a9	Check for both styles of clobbers, those produced by dragonegg and those produced by clang for the inline asm bswap conversion. Modified from a patch by Chris Smowton. llvm-svn: 194016	2013-11-04 21:41:21 +00:00
Cameron McInally	02e4f56c18	Add support for AVX512 masked vector blend intrinsics. llvm-svn: 194006	2013-11-04 19:14:56 +00:00
Benjamin Kramer	2d870f327a	X86: Add a description for AMD bdver3 aka Steamroller. This is just bdver2 + FSGSBase. llvm-svn: 193984	2013-11-04 10:29:20 +00:00
Elena Demikhovsky	841cd7d09e	AVX-512: added VPCONFLICT instruction and intrinsics, added EVEX_KZ to tablegen llvm-svn: 193959	2013-11-03 13:46:31 +00:00
Michael Liao	ae9a5c1116	Fix PR17764 - When selecting BLEND from vselect, the operands need swapping as due to the difference between vselect and SSE/AVX's BLEND insn llvm-svn: 193900	2013-11-02 00:10:02 +00:00
Dan Gohman	a55412164e	Fix unused variable warnings. llvm-svn: 193823	2013-10-31 22:58:11 +00:00
Andrew Trick	bb45eecd46	Add new calling convention for WebKit JavaScript. llvm-svn: 193812	2013-10-31 22:12:01 +00:00
Andrew Trick	75681a41c0	Add support for stack map generation in the X86 backend. Originally implemented by Lang Hames. llvm-svn: 193811	2013-10-31 22:11:56 +00:00
Andrew Trick	48c4e0c740	whitespace llvm-svn: 193765	2013-10-31 17:18:07 +00:00
Cameron McInally	c38779faad	Add AVX512 unmasked integer broadcast intrinsics and support. llvm-svn: 193748	2013-10-31 13:56:31 +00:00
Elena Demikhovsky	1c867680b8	AVX-512: Implemented CMOV for 512-bit vectors llvm-svn: 193747	2013-10-31 13:15:32 +00:00
Tom Roeder	9290de3a99	This commit adds some (but not all) of the x86-64 relocations that are not currently supported in the ELF object writer, along with a simple test case. llvm-svn: 193709	2013-10-30 18:47:25 +00:00
Juergen Ributzka	6c6240a024	Revert "SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs splitting too." Now Hexagon and SystemZ are not happy with it :-( llvm-svn: 193677	2013-10-30 06:36:19 +00:00
Juergen Ributzka	746eeed753	SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs splitting too. The Type Legalizer recognizes that VSELECT needs to be split, because the type is too wide for the given target. The same does not always apply to SETCC, because less space is required to encode the result of a comparison. As a result VSELECT is split and SETCC is unrolled into scalar comparisons. This commit fixes the issue by checking for VSELECT-SETCC patterns in the DAG Combiner. If a matching pattern is found, then the result mask of SETCC is promoted to the expected vector mask type for the given target. This mask usually has the same size as the VSELECT return type (except for Intel KNL). Now the type legalizer will split both VSELECT and SETCC. This allows the following X86 DAG Combine code to successfully detect the MIN/MAX pattern. This fixes PR16695, PR17002, and <rdar://problem/14594431>. Reviewed by Nadav llvm-svn: 193676	2013-10-30 05:48:18 +00:00
Rafael Espindola	3e081640d5	Move getSymbol to TargetLoweringObjectFile. This allows constructing a Mangler with just a TargetMachine. llvm-svn: 193630	2013-10-29 17:28:26 +00:00
Rafael Espindola	68ddc56344	Add a helper getSymbol to AsmPrinter. llvm-svn: 193627	2013-10-29 17:07:16 +00:00
Rafael Espindola	29e5575ba6	The asm printer has a mangler. Don't keep a second pointer to it. llvm-svn: 193616	2013-10-29 16:11:22 +00:00
Elena Demikhovsky	0e6849495e	AVX-512: PMIN/PMAX intrinsics and patterns Patch by Cameron McInally <cameron.mcinally@nyu.edu> llvm-svn: 193497	2013-10-27 08:18:37 +00:00
Quentin Colombet	5d88e45af6	[X86][AVX512] Add patterns that match the AVX512 floating point register vbroadcast intrinsics. Patch by Cameron McInally <cameron.mcinally@nyu.edu> llvm-svn: 193422	2013-10-25 18:04:12 +00:00
Quentin Colombet	5650890143	[X86][AVX512] Add patterns that match the AVX512 floating point vbroadcast intrinsics. Patch by Cameron McInally <cameron.mcinally@nyu.edu> llvm-svn: 193421	2013-10-25 17:47:18 +00:00
Nadav Rotem	d99242eb75	Optimize concat_vectors(X, undef) -> scalar_to_vector(X). This optimization is not SSE specific so I am moving it to DAGco. The new scalar_to_vector dag node exposed a missing pattern in the AArch64 target that I needed to add. llvm-svn: 193393	2013-10-25 06:41:18 +00:00
Elena Demikhovsky	da06b9b278	AVX-512: added VCVTPH2PS, VCVTPS2PH with intrinsics llvm-svn: 193312	2013-10-24 07:16:35 +00:00
Yaron Keren	3fb42fb8b5	(this is a corrected patch) Calling _chkstk is required on ELF as well as COFF on Windows. Without _chkstk, functions requiring large stack crash in initialization code. Previous code tested for COFF format but not Mach-O and this patch modifies the code to test for Windows OS (both Windows target and MingW target) but not Mach-O object format: Looks like macho environment was used to build some EFI code. Credits to Andrew MacPherson. llvm-svn: 193289	2013-10-23 23:37:01 +00:00
Rafael Espindola	b6d34eea66	Revert "Calling _chkstk is required on ELF as well as COFF on Windows. Without _chkstk functions requiring large stack crash in initialization code. Previous code tested for COFF format but not Mach-O and this patch modifies the code to test for Windows." This reverts commit r193263. It is causing CodeGen/X86/mingw-alloca.ll to fail. llvm-svn: 193275	2013-10-23 21:45:09 +00:00
Benjamin Kramer	701e41bb58	X86: Custom lower sext v16i8 to v16i16, and the corresponding truncate. Also update the cost model. llvm-svn: 193270	2013-10-23 21:06:07 +00:00
Yaron Keren	56f5c84f6c	Calling _chkstk is required on ELF as well as COFF on Windows. Without _chkstk functions requiring large stack crash in initialization code. Previous code tested for COFF format but not Mach-O and this patch modifies the code to test for Windows. Credits to Andrew MacPherson. llvm-svn: 193263	2013-10-23 19:40:07 +00:00
Benjamin Kramer	8ed652c269	X86: Custom lower zext v16i8 to v16i16. On sandy bridge (PR17654) we now get vpxor %xmm1, %xmm1, %xmm1 vpunpckhbw %xmm1, %xmm0, %xmm2 vpunpcklbw %xmm1, %xmm0, %xmm0 vinsertf128 $1, %xmm2, %ymm0, %ymm0 On haswell it's a simple vpmovzxbw %xmm0, %ymm0 There is a maze of duplicated and dead transforms and patterns in this area. Remove the dead custom lowering of zext v8i16 to v8i32, that's already handled by LowerAVXExtend. llvm-svn: 193262	2013-10-23 19:19:04 +00:00
Michael Liao	3b38b22386	Fix PR17631 - Skip instructions added in prolog. For specific targets, prolog may insert helper function calls (e.g. _chkstk will be called when there're more than 4K bytes allocated on stack). However, these helpers don't use/def YMM/XMM registers. llvm-svn: 193261	2013-10-23 18:32:43 +00:00
Jim Grosbach	03a64fa7b7	X86: Make concat_vectors combine a bit more conservative. Per Nadav's review comments for r192866. llvm-svn: 193252	2013-10-23 17:37:40 +00:00
Quentin Colombet	c5a6c85c4f	[X86][FastISel] Add a comment to help understanding changes made in r192636. <rdar://problem/15192473> llvm-svn: 193199	2013-10-22 21:29:08 +00:00
Elena Demikhovsky	3136868b1d	AVX-512: aligned / unaligned load and store for 512-bit integer vectors. llvm-svn: 193156	2013-10-22 09:19:28 +00:00
Craig Topper	8b2b2a7210	Replace (V)MOVZDI2PDIrr/rm instructions with patterns that select (V)MOVDI2PDIrr/rm. llvm-svn: 193146	2013-10-22 04:35:20 +00:00
Lang Hames	df2443e32e	X86 vector element shift-by-immediate instructions take i8 immediates. Make the instruction definitions and ISEL reflect this. Prior to this patch these instructions took an i32i8imm, and the high bits were dropped during encoding. This led to incorrect behavior for shifts by immediates higher than 255. This patch fixes that issue by detecting large immediate shifts and returning constant zero (for logical shifts) or capping the shift amount at an encodable value (for arithmetic shifts). Fixes <rdar://problem/14968098> llvm-svn: 193096	2013-10-21 17:51:24 +00:00
Elena Demikhovsky	dceb9534bf	AVX-512: MUL operation lowering for v8i64 llvm-svn: 193083	2013-10-21 13:27:34 +00:00
Nadav Rotem	fd357159bc	Mark some command line flags as hidden llvm-svn: 193013	2013-10-18 23:38:13 +00:00
Hans Wennborg	c1a311233c	MC asm parser: allow ?'s in symbol names, and handle @'s in names in MS asm This is another (final?) stab at making us able to parse our own asm output on Windows. Symbols on Windows often contain @'s and ?'s in their names. Our asm parser didn't like this. ?'s were not allowed, and @'s were interpreted as trying to reference PLT/GOT/etc. We can't just add quotes around the bad names, since e.g. for MinGW, we use gas to assemble, and it doesn't like quotes in some places (notably in .def directives). This commit makes us allow ?'s in symbol names, and @'s in symbol names for MS assembly. Differential Revision: http://llvm-reviews.chandlerc.com/D1978 llvm-svn: 193000	2013-10-18 20:46:28 +00:00
Hans Wennborg	33576424a9	Revert "Re-commit r192758 - MC: quote tricky symbol names in asm output" This caused the clang-native-mingw32-win7 buildbot to break. The assembler was complaining about the following lines that were showing up in the asm for CrashRecoveryContext.cpp: movl $"__ZL16ExceptionHandlerP19_EXCEPTION_POINTERS@4", 4(%eax) calll "_AddVectoredExceptionHandler@8" .def "__ZL16ExceptionHandlerP19_EXCEPTION_POINTERS@4"; "__ZL16ExceptionHandlerP19_EXCEPTION_POINTERS@4": calll "_RemoveVectoredExceptionHandler@4" Reverting for now. llvm-svn: 192940	2013-10-18 02:14:40 +00:00
Jim Grosbach	504d93cae7	x86: Move bitcasts outside concat_vector. Consider the following: typedef unsigned short ushort4U __attribute__((ext_vector_type(4), aligned(2))); typedef unsigned short ushort4 __attribute__((ext_vector_type(4))); typedef unsigned short ushort8 __attribute__((ext_vector_type(8))); typedef int int4 __attribute__((ext_vector_type(4))); int4 __bbase_cvt_int(ushort4 v) { ushort8 a; a.lo = v; return _mm_cvtepu16_epi32(a); } This generates the, not unreasonable, IR: define <4 x i32> @foo0(double %v.coerce) nounwind ssp { %tmp = bitcast double %v.coerce to <4 x i16> %tmp1 = shufflevector <4 x i16> %tmp, <4 x i16> undef, <8 x i32> <i32 %0, i32 1, i32 2, i32 3, i32 undef, i32 undef, i32 undef, i32 undef> %tmp2 = tail call <4 x i32> @llvm.x86.sse41.pmovzxwd(<8 x i16> %tmp1) ret <4 x i32> %tmp2 } The problem is when type legalization gets hold of the v4i16. It legalizes that by spilling to the stack, then doing a zero-extending load. Things go even more silly from there, ending up with something like: _foo0: movsd %xmm0, -8(%rsp) <== Spill to the stack. movq -8(%rsp), %xmm0 <== Reload it right back out. pmovzxwd %xmm0, %xmm1 <== Here's what we actually asked for. pblendw $1, %xmm1, %xmm0 <== We don't need this at all pmovzxwd %xmm0, %xmm0 <== We already did this ret The v8i8 to v8i16 zext intrinsic gives even worse results, with two table lookups via pshufb instructions(!!). To avoid all that, we can move the bitcasting until after we've formed the wider (legal) vector type. Then our normal codegen flows along nicely and we get the expected: _foo0: pmovzxwd %xmm0, %xmm0 ret rdar://15245794 llvm-svn: 192866	2013-10-17 02:58:06 +00:00
Hans Wennborg	822f0080df	Re-commit r192758 - MC: quote tricky symbol names in asm output The reason this got reverted was that the @feat.00 symbol which was emitted for every TU became quoted, and on cygwin/mingw we use the gas assembler which couldn't handle the quotes. This commit fixes the problem by only emitting @feat.00 for win32, where we use clang -cc1as to assemble. gas would just drop this symbol anyway, so there is no loss there. With @feat.00 gone, there shouldn't be quoted symbols showing up on cygwin since it uses the Itanium ABI, which doesn't put these funny characters in symbols. > Because of win32 mangling, we produce symbol and section names with > funny characters in them, most notably @ characters. > > MC would choke on trying to parse its own assembly output. This patch addresses > that by: > > - Making @ trigger quoting of symbol names > - Also quote section names in the same way > - Just parse section names like other identifiers (to allow for quotes) > - Don't assume @ signifies a symbol variant if it is in a string. llvm-svn: 192859	2013-10-17 01:13:02 +00:00
Yunzhong Gao	23e948dd2f	Enabling 3DNow! prefetch instruction for a few AMD processors: bobcat, jaguar, bulldozer and piledriver. Support for the instruction itself seems to have already been added in r178040. Differential Revision: http://llvm-reviews.chandlerc.com/D1933 llvm-svn: 192828	2013-10-16 19:04:11 +00:00
Rafael Espindola	90d8b36e1e	Add a MCAsmInfoELF class and factor some code into it. We had a MCAsmInfoCOFF, but no common class for all the ELF MCAsmInfos before. llvm-svn: 192760	2013-10-16 01:34:32 +00:00
Rafael Espindola	6f6b3d032c	Move .ident handling to MCStreamer. No functionality change, but exposes the API so that codegen can use it too. Patch by Katya Romanova. llvm-svn: 192757	2013-10-16 01:05:45 +00:00
Andrew Trick	e3e67d4a0a	Enable MI Sched for x86. This changes the SelectionDAG scheduling preference to source order. Soon, the SelectionDAG scheduler can be bypassed saving a nice chunk of compile time. Performance differences that result from this change are often a consequence of register coalescing. The register coalescer is far from perfect. Bugs can be filed for deficiencies. On x86 SandyBridge/Haswell, the source order schedule is often preserved, particularly for small blocks. Register pressure is generally improved over the SD scheduler's ILP mode. However, we are still able to handle large blocks that require latency hiding, unlike the SD scheduler's BURR mode. MI scheduler also attempts to discover the critical path in single-block loops and adjust heuristics accordingly. The MI scheduler relies on the new machine model. This is currently unimplemented for AVX, so we may not be generating the best code yet. Unit tests are updated so they don't depend on SD scheduling heuristics. llvm-svn: 192750	2013-10-15 23:33:07 +00:00
Michael Liao	1081bbac6c	Fix PR17546 - Type of index used in extract_vector_elt or insert_vector_elt is supposed to be TLI.getVectorIdxTy(), which is pointer type on most targets. It's better to truncate (or zero-extend in case it's changed later) it to mask element type to guarantee they are matching instead of asserting that. llvm-svn: 192722	2013-10-15 17:51:58 +00:00
Michael Liao	a94d0a900a	Fix PR16807 - Lower signed division by constant powers-of-2 to target-independent DAG operators instead of target-dependent ones to support them better on targets where vector types are legal but shift operators on that types are illegal. E.g., on AVX, PSRAW is only available on <8 x i16> though <16 x i16> is a legal type. llvm-svn: 192721	2013-10-15 17:51:02 +00:00
Craig Topper	037594e792	Remove x86_sse42_crc32_64_8 intrinsic. It has no functional difference from x86_sse42_crc32_32_8 and was not mapped to a clang builtin. I'm not even sure why this form of the instruction is even called out explicitly in the docs. Also add AutoUpgrade support to convert it into the other intrinsic with appropriate trunc and zext. llvm-svn: 192672	2013-10-15 05:20:47 +00:00
Quentin Colombet	cb4b84532c	[X86][FastISel] During X86 fastisel, the address of indirect call was resolved through bitcast, ptrtoint, and inttoptr instructions. This is valid only if the related instructions are in that same basic block, otherwise we may reference variables that were not live across basic blocks resulting in undefined virtual registers. The bug was exposed when both SDISel and FastISel were used within the same function, i.e., one basic block is issued with FastISel and another with SDISel, as demonstrated with the testcase. <rdar://problem/15192473> llvm-svn: 192636	2013-10-14 22:32:09 +00:00
Andrew Trick	b65138d3af	Fix the ExecutionDepsFix pass to handle AVX instructions. This pass is needed to break false dependencies. Without it, unlucky register assignment can result in wild (5x) swings in performance. This pass was trying to handle AVX but not getting it right. AVX doesn't have partial register defs, it has unused register reads in which the high bits of a source operand are copied into the unused bits of the dest. Fixing this requires conservative liveness analysis. This is awkward because the pass already has its own pseudo-liveness. However, proper liveness is expensive, and we would like to use a generic utility to compute it. The fix only invokes liveness on-demand. It is rare to detect a case that needs undef-read dependence breaking, but when it happens, it can be needed many times within a very large block. I think the existing heuristic which uses a register window of 16 is too conservative for loop-carried false dependencies. If the loop is a reduction, the out-of-order engine may be able to execute several loop iterations in parallel. However, I'll leave this tuning exercise for next time. llvm-svn: 192635	2013-10-14 22:19:03 +00:00
Andrew Trick	196a42f694	whitespace llvm-svn: 192633	2013-10-14 22:18:56 +00:00
Eric Christopher	1a04817b81	Revert part of a fix from 2010, changes since then: a) x86-64 TLS has been documented b) the code path should use movq for the correct relocation to be generated. I've also added a fixme for the test case that we should improve the code generated, it should look something like is documented in the tls abi document. llvm-svn: 192631	2013-10-14 21:52:26 +00:00
Eric Christopher	1b5964bd4e	Reformat this routine slightly. llvm-svn: 192630	2013-10-14 21:52:23 +00:00
Eric Christopher	d6f19023b0	Remove some extraneous whitespace. llvm-svn: 192629	2013-10-14 21:52:18 +00:00
Elena Demikhovsky	c460e7e50a	Fixed a bug in dynamic memory allocation on the stack. The alignment of allocated space was wrong, see Bugzilla 17345. Done by Zvi Rackover <zvi.rackover@intel.com>. llvm-svn: 192573	2013-10-14 07:26:51 +00:00
Craig Topper	fe4fce729c	Create classes to reduce the size of the tablegen entries for the CRC32 instructions. llvm-svn: 192568	2013-10-14 05:19:58 +00:00
Craig Topper	1548551887	Allow pinsrw/pinsrb/pextrb/pextrw/movmskps/movmskpd/pmovmskb/extractps instructions to parse either GR32 or GR64 without resorting to duplicating instructions. llvm-svn: 192567	2013-10-14 04:55:01 +00:00
Craig Topper	ba1540e28a	Add disassembler support for SSE4.1 register/register form of PEXTRW. There is a shorter encoding that was part of SSE2, but a memory form was added in SSE4.1. This is the register form of that encoding. llvm-svn: 192566	2013-10-14 01:42:32 +00:00
Craig Topper	c9050b2d46	Mark MOVMSKPS/MOVMSKPD/VPINSRWrr64i as AsmParserOnly to remove them from the disassembler tables. Add PINSRWrr64i to complement the AVX version. llvm-svn: 192565	2013-10-14 01:21:22 +00:00
Craig Topper	bb1360277e	Don't use 64-bit versions of MOVMSKPD in CodeGen. The instructions only produce a 1-bit result so we can just use SUBREG_TO_REG to extend the 32-bit versions. llvm-svn: 192562	2013-10-14 00:24:33 +00:00
Craig Topper	47d75426b9	Remove more filters from the disassembler. Mark some AVX512 instructions as CodeGenOnly. llvm-svn: 192525	2013-10-12 05:41:08 +00:00
Craig Topper	1e89b25474	Mark some more instructions as CodeGenOnly. Remove filters from the disassembler. llvm-svn: 192522	2013-10-12 04:46:18 +00:00
Craig Topper	60ef08db39	Allow non-AVX form of pmovmskb to take a GR64 operand. llvm-svn: 192341	2013-10-10 05:33:31 +00:00
Craig Topper	45f6a833a2	Remove duplicate instructions. llvm-svn: 192340	2013-10-10 05:01:22 +00:00
Elena Demikhovsky	f24ecf7862	AVX-512: Added VRCP28 and VRSQRT28 instructions and intrinsics. llvm-svn: 192283	2013-10-09 08:16:14 +00:00
Andrew Trick	6456fd444d	Add missing HasAVX512 predicate. This was only working because AVX had cheaper rules in all cases. I'm sure there are other places in this file where predicates are missing. llvm-svn: 192276	2013-10-09 05:11:10 +00:00
Craig Topper	bd2eef914f	Replace a couple instructions with patterns referring to other instructions with same encoding and operands. Mark a couple other instructions as CodeGenOnly since we have FR and VR instructions and only one of them is needed by the assembler/disassembler. llvm-svn: 192274	2013-10-09 04:54:21 +00:00
Craig Topper	ae27a7b281	Use AVX512PIi8 for the alt forms of vcmp instructions. This adds the TB prefix and keeps the mnemonic from starting with an extra 'v' llvm-svn: 192272	2013-10-09 04:24:38 +00:00
Craig Topper	718df5110c	Mark some instructions as CodeGenOnly since they aren't needed by the assembler or disassembler. Disassembler already filtered them, but asm parser still had them in its tables. llvm-svn: 192271	2013-10-09 03:56:16 +00:00
Craig Topper	d5082631e1	Add in64BitMode/in32BitMode to the MMX/SSE2/AVX maskmovq/dq instructions. This way the asm parser will pick the right one based on the mode. Instruction selection already did the right thing based on the pointer size. llvm-svn: 192266	2013-10-09 02:18:34 +00:00
Rafael Espindola	6267c79fdb	Add a MCTargetStreamer interface. This patch fixes an old FIXME by creating a MCTargetStreamer interface and moving the target specific functions for ARM, Mips and PPC to it. The ARM streamer is still declared in a common place because it is used from lib/CodeGen/ARMException.cpp, but the Mips and PPC are completely hidden in the corresponding Target directories. I will send an email to llvmdev with instructions on how to use this. llvm-svn: 192181	2013-10-08 13:08:17 +00:00
Craig Topper	4c3cbfbbbc	Remove unneeded MMX instruction definition by moving pattern to an equivalent instruction definition and removing the filtering from the disassembler table building. llvm-svn: 192175	2013-10-08 06:30:39 +00:00
Craig Topper	aa1a4d51f0	Remove some instructions that existed to provide aliases to the assembler. Can be done with InstAlias instead. Unfortunately, this was causing printer to use 'vmovq' or 'vmovd' based on what was parsed. To cleanup the inconsistencies convert all 'vmovd' with 64-bit registers to 'vmovq', but provide an alias so that 'vmovd' will still parse. llvm-svn: 192171	2013-10-08 05:53:50 +00:00
Benjamin Kramer	feace9b737	X86: Fix type check. Just because an integer type is illegal doesn't mean it's i64. Fixes PR17495, where an i24 triggered this code. It's intended to optimize i64 loads on 32 bit x86. llvm-svn: 192123	2013-10-07 19:11:35 +00:00
Rafael Espindola	e60c3625e3	Remove getEHExceptionRegister and getEHHandlerRegister. They haven't been used for a long time. Patch by MathOnNapkins. llvm-svn: 192099	2013-10-07 13:39:22 +00:00
Craig Topper	6e389a510f	Remove some instructions that seem to only exist to trick the filtering checks in the disassembler table creation. Just fix up the filter to let the real instruction through instead. llvm-svn: 192090	2013-10-07 07:19:47 +00:00
Craig Topper	0c3bbe0644	Remove FsMOVAPSrr and friends. They have no patterns and are no longer selected anywhere. llvm-svn: 192089	2013-10-07 06:10:45 +00:00
Craig Topper	4a7ff81d5f	Teach X86 asm parser that VMOVAPSrr and other VEX-encoded register to register moves should be switched from using the MRMSrcReg form to the MRMDestReg form if the source register is a 64-bit extended register and the destination register is not. This allows the instruction to be encoded using the 2-byte VEX form instead of the 3-byte VEX form. The GNU assembler has similar behavior and instruction selection already does this. llvm-svn: 192088	2013-10-07 05:42:48 +00:00
Craig Topper	b5918acf04	Add disassembler support for long encodings for INC/DEC in 32-bit mode. llvm-svn: 192086	2013-10-07 04:28:06 +00:00
Benjamin Kramer	a7e734d765	X86: Don't fold spills into SSE operations if the stack is unaligned. Regalloc can emit unaligned spills nowadays, but we can't fold the spills into SSE ops if we can't guarantee alignment. PR12250. llvm-svn: 192064	2013-10-06 13:48:22 +00:00
Elena Demikhovsky	cb8eaca2e4	AVX-512: added scalar convert instructions and intrinsics. Fixed load folding in VPERM2I instruction. llvm-svn: 192063	2013-10-06 13:11:09 +00:00
Elena Demikhovsky	0ff833ab99	AVX-512: fixed shuffle lowering in case of BLEND and added VSHUFPS patterns. llvm-svn: 192055	2013-10-06 06:11:18 +00:00
Craig Topper	9a365fa296	Add TBM instructions to load folding tables. llvm-svn: 192046	2013-10-05 20:20:51 +00:00
Nick Lewycky	e9c94635b3	Rename this feature to "cx16" to match gcc's flag name. Apparently these strings are directly tied to the flag names in clang with no remapping in between? llvm-svn: 192044	2013-10-05 20:11:44 +00:00
Craig Topper	94a706d015	Remove underscores from TBM instruction names for consistency with other instruction naming. llvm-svn: 192040	2013-10-05 19:27:26 +00:00
Craig Topper	0a8f3fc996	Remove unneeded TBM intrinsics. The arithmetic/logical operation patterns are sufficient. llvm-svn: 192039	2013-10-05 19:22:59 +00:00
Craig Topper	d0a63f6722	Add an additional pattern for BLCI since opt can turn (not (add x, 1)) into (sub -2, x). llvm-svn: 192037	2013-10-05 17:17:53 +00:00
Elena Demikhovsky	05028f4106	AVX-512: Fixed encoding of VMOVQ instruction. llvm-svn: 191889	2013-10-03 12:03:26 +00:00
Craig Topper	e0cb6198ed	Replace C++ style comment with a C style comment to satisfy some of the build bots. llvm-svn: 191880	2013-10-03 06:29:59 +00:00
Craig Topper	541a27d9e4	Remove comma from the end of an enum. llvm-svn: 191877	2013-10-03 06:18:26 +00:00
Craig Topper	6fb0648c41	Add XOP disassembler support. Fixes PR13933. llvm-svn: 191874	2013-10-03 05:17:48 +00:00
Craig Topper	5ac188d0f2	Add patterns for selecting TBM instructions from logical operations. Patch from Yunzhong Gao. llvm-svn: 191871	2013-10-03 04:16:45 +00:00
Elena Demikhovsky	ee11e148e9	AVX-512: fixed a bug in getLoadStoreRegOpcode() for AVX-512 target llvm-svn: 191818	2013-10-02 12:20:42 +00:00
Elena Demikhovsky	d336ecd5ad	AVX-512: Added TB prefix to all instructions without prefixes, otherwise encoding fails after the last change in X86MCCodeEmitter.cpp. llvm-svn: 191812	2013-10-02 06:39:07 +00:00
Rafael Espindola	a279462828	Remove several unused variables. Patch by Alp Toker. llvm-svn: 191757	2013-10-01 13:32:03 +00:00
Elena Demikhovsky	84c6cd222d	AVX-512: Added X86vzmovl patterns llvm-svn: 191733	2013-10-01 08:38:02 +00:00
Craig Topper	e1e883da01	Remove 0 as a valid encoding for the m-mmmm field. llvm-svn: 191732	2013-10-01 07:10:28 +00:00
Craig Topper	419f67b3cc	Remove unneeded fields from disassembler internal instruction format. llvm-svn: 191731	2013-10-01 06:56:57 +00:00
Craig Topper	401688a9b1	BEXTR should be defined to take same type for both operands. llvm-svn: 191728	2013-10-01 03:48:26 +00:00
Preston Gurd	182f111d29	Forgot to add a break statement. llvm-svn: 191715	2013-09-30 23:51:22 +00:00
Preston Gurd	a352cdea56	The X86FixupLEAs pass for Intel Atom must not call convertToThreeAddress on ADD16rr opcodes, if src1 != src, since that would cause convertToThreeAddress to try to create a virtual register. This is not permitted after register allocation, which is when the X86FixupLEAs pass runs. This patch fixes PR16785. llvm-svn: 191711	2013-09-30 23:18:42 +00:00
Craig Topper	a4bd7d9c3c	Various x86 disassembler fixes. Add VEX_LIG to scalar FMA4 instructions. Use VEX_LIG in some of the inheriting checks in disassembler table generator. Make use of VEX_L_W, VEX_L_W_XS, VEX_L_W_XD contexts. Don't let VEX_L_W, VEX_L_W_XS, VEX_L_W_XD, VEX_L_W_OPSIZE inherit from their non-L forms unless VEX_LIG is set. Let VEX_L_W, VEX_L_W_XS, VEX_L_W_XD, VEX_L_W_OPSIZE inherit from all of their non-L or non-W cases. Increase ranking on VEX_L_W, VEX_L_W_XS, VEX_L_W_XD, VEX_L_W_OPSIZE so they get chosen over non-L/non-W forms. llvm-svn: 191649	2013-09-30 02:46:36 +00:00
Craig Topper	3844406801	Change type of XOP flag in code emitters to a bool. Remove some unneeded cases from switch. llvm-svn: 191632	2013-09-29 08:33:34 +00:00
Craig Topper	34f9b95fc6	Add comments for XOPA map introduced with TBM instructions. llvm-svn: 191630	2013-09-29 06:31:18 +00:00
Robert Wilhelm	6b36431ffa	Fix spelling intruction -> instruction. llvm-svn: 191610	2013-09-28 11:46:15 +00:00
Yunzhong Gao	e51da27a74	Adding intrinsics to the llvm backend for TBM instruction set. Phabricator code review is located here: http://llvm-reviews.chandlerc.com/D1750 llvm-svn: 191539	2013-09-27 18:38:42 +00:00
Craig Topper	60de5044cf	Put HasAVX512 predicate on some patterns to properly disable them when AVX512 isn't enabled. Currently it works simply because the SSE and AVX version of the same patterns are checked first in the DAG isel table. llvm-svn: 191490	2013-09-27 07:20:47 +00:00
Craig Topper	13e2db06ea	Switch HasAVX to UseAVX in one spot to ensure that AVX512 form of VINSERTPS is used in AVX512 mode. llvm-svn: 191489	2013-09-27 07:16:24 +00:00
Craig Topper	da1590c69a	Remove some duplicate patterns. llvm-svn: 191488	2013-09-27 07:11:17 +00:00
Yunzhong Gao	54d338bb6a	Fixing Intel format of the vshufpd instruction. Phabricator code review is located at: http://llvm-reviews.chandlerc.com/D1759 llvm-svn: 191481	2013-09-27 01:44:23 +00:00
Andrew Trick	65c09c6381	Mark the x86 machine model as incomplete. PR17367. Ideally, the machine model is added at the time the instructions are defined. But many instructions in X86InstrSSE.td still need a model. Without this workaround the scheduler asserts because x86 already has itinerary classes for these instructions, indicating they should be modeled by the scheduler. Since we use the new machine model for other instructions, it expects a new machine model for these too. llvm-svn: 191391	2013-09-25 18:14:12 +00:00
David Majnemer	30b6b79b54	MC: Remove vestigial PCSymbol field from AsmInfo llvm-svn: 191362	2013-09-25 09:36:11 +00:00
Yunzhong Gao	14609c71b2	Adding a feature flag to the llvm backend for x86 TBM instruction set. Adding TBM feature to bdver2 processor; piledriver supports this instruction set according to the following document: http://developer.amd.com/wordpress/media/2012/10/New-Bulldozer-and-Piledriver-Instructions.pdf Phabricator code review is located here: http://llvm-reviews.chandlerc.com/D1692 llvm-svn: 191324	2013-09-24 18:21:52 +00:00
Bill Wendling	a02f17aad8	Followup to r191252. Make sure that the code that handles the constant addresses is run for the GEPs. This just refactors that code and then calls it for the GEPs that are collected during the iteration. <rdar://problem/12445434> llvm-svn: 191281	2013-09-24 07:19:30 +00:00
Bill Wendling	339b0f39aa	Selecting the address from a very long chain of GEPs can blow the stack. The recursive nature of the address selection code can cause the stack to explode if there is a long chain of GEPs. Convert the recursive bit into an iterative method to avoid this. <rdar://problem/12445434> llvm-svn: 191252	2013-09-24 00:13:08 +00:00
Tim Northover	c9a7e47164	ISelDAG: spot chain cycles involving MachineNodes Previously, the DAGISel function WalkChainUsers was spotting that it had entered already-selected territory by whether a node was a MachineNode (amongst other things). Since it's fairly common practice to insert MachineNodes during ISelLowering, this was not the correct check. Looking around, it seems that other nodes get their NodeId set to -1 upon selection, so this makes sure the same thing happens to all MachineNodes and uses that characteristic to determine whether we should stop looking for a loop during selection. This should fix PR15840. llvm-svn: 191165	2013-09-22 08:21:56 +00:00
David Majnemer	b80f5369fe	X86: Use R_X86_64_TPOFF64 for FK_Data_8 Summary: LLVM would crash when trying to come up with a relocation type for assembly like: movabsq $V@TPOFF, %rax Instead, we say the relocation type is R_X86_64_TPOFF64. Fixes PR17274. Reviewers: dblaikie, nrieck, rafael CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D1717 llvm-svn: 191163	2013-09-22 05:30:16 +00:00
Juergen Ributzka	b55735e2d8	Revert "SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs splitting too." This reverts commit r191130. llvm-svn: 191138	2013-09-21 15:09:46 +00:00
Craig Topper	ef2cf025cd	Remove alignment restrictions from FMA load folding. llvm-svn: 191136	2013-09-21 05:58:59 +00:00
Juergen Ributzka	d4aac28519	Fix the buildbot llvm-svn: 191133	2013-09-21 05:15:01 +00:00
Juergen Ributzka	32cca125e1	[X86] Emulate AVX 256bit MIN/MAX support by splitting the vector. In AVX 256bit vectors are valid vectors and therefore the Type Legalizer doesn't split the VSELECT and SETCC nodes. AVX only supports MIN/MAX on 128bit vectors and this fix enables vector splitting for this special case in the X86 DAG Combiner. This fix is related to PR16695, PR17002, and <rdar://problem/14594431>. llvm-svn: 191131	2013-09-21 04:55:22 +00:00
Juergen Ributzka	67e5289ff2	SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs splitting too. The Type Legalizer recognizes that VSELECT needs to be split, because the type is too wide for the given target. The same does not always apply to SETCC, because less space is required to encode the result of a comparison. As a result VSELECT is split and SETCC is unrolled into scalar comparisons. This commit fixes the issue by checking for VSELECT-SETCC patterns in the DAG Combiner. If a matching pattern is found, then the result mask of SETCC is promoted to the expected vector mask for the given target. This mask usually has the same size as the VSELECT return type (except for Intel KNL). Now the type legalizer will split both VSELECT and SETCC. This allows the following X86 DAG Combine code to successfully detect the MIN/MAX pattern. This fixes PR16695, PR17002, and <rdar://problem/14594431>. llvm-svn: 191130	2013-09-21 04:55:18 +00:00
Craig Topper	6670549a2c	Lift alignment restrictions on load/store folding of VEXTRACTI128/VINSERTI128. llvm-svn: 191073	2013-09-20 05:37:49 +00:00
Yi Jiang	ed81b17719	X86 horizontal vector reduction cost model llvm-svn: 191021	2013-09-19 17:48:48 +00:00
Tim Northover	89d57eb12b	X86: FrameIndex addressing modes do have a base register. When selecting the DAG (add (WrapperRIP ...), (FrameIndex ...)), X86 code had spotted the FrameIndex possibility and was working out whether it could fold the WrapperRIP into this. The test for forming a %rip version is notionally whether we already have a base or index register (%rip precludes both), but we were forgetting to account for the register that would be inserted later to access the frame. rdar://problem/15024520 llvm-svn: 190995	2013-09-19 11:33:53 +00:00
Craig Topper	7ef60f689f	Prevent extra calls to ToggleFeature for Feature64Bit and FeatureCMOV if they've already been enabled. The extra call ends up clearing the bit in FeatureBits since it's a 'toggle'. Can't prove that anything was broken because of this since I don't think the FeatureBits for these are used. llvm-svn: 190920	2013-09-18 06:01:53 +00:00
Craig Topper	01e805a531	Fix X86 subtarget to not overwrite the autodetected features by calling InitMCProcessorInfo right after detecting them. Instead add a new function that only updates the scheduling model and call that. llvm-svn: 190919	2013-09-18 05:54:09 +00:00
Craig Topper	5d022196de	Lift alignment restrictions for load/store folding on VINSERTF128/VEXTRACTF128. Fixes PR17268. llvm-svn: 190916	2013-09-18 03:55:53 +00:00
Reid Kleckner	130539949d	COFF: Ensure that objects produced by LLVM link with /safeseh Summary: We indicate that the object files are safe by emitting a @feat.00 absolute address symbol. The address is presumably interpreted as a bitfield of features that the compiler would like to enable. Bit 0 is documented in the PE COFF spec to opt in to "registered SEH", which is what /safeseh enables. LLVM's object files are safe by default because LLVM doesn't know how to produce SEH handlers. Reviewers: Bigcheese CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D1691 llvm-svn: 190898	2013-09-17 23:18:05 +00:00
Preston Gurd	3d32b70ccf	Remove unused code, which had been commented out. llvm-svn: 190869	2013-09-17 16:53:36 +00:00
Ben Langmuir	c0ab36fe2e	Add llvm.x86.* intrinsics for Intel SHA Extensions Add llvm.x86.* intrinsics for all of the Intel SHA Extensions instructions, as well as tests. Also remove mayLoad and hasSideEffects, which can be inferred from the instruction patterns. llvm-svn: 190864	2013-09-17 13:44:39 +00:00
Elena Demikhovsky	28417de9de	AVX-512: Converted to Unix style llvm-svn: 190851	2013-09-17 07:34:34 +00:00
Craig Topper	ece6095ce4	Add AES and SHA instructions to the load folding tables. llvm-svn: 190850	2013-09-17 06:50:11 +00:00
Craig Topper	4b8534b86a	Fix column alignment. No functional change. llvm-svn: 190849	2013-09-17 06:05:17 +00:00

9723 Commits