llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-25 22:12:57 +02:00

Author	SHA1	Message	Date
Roman Divacky	85348270cd	Stop casting away const qualifier needlessly. llvm-svn: 163258	2012-09-05 22:26:57 +00:00
Roman Divacky	4be967f49b	Use const properly so that we don't remove const qualifier from region and MII by casting. Found with gcc48. llvm-svn: 163247	2012-09-05 21:17:34 +00:00
Craig Topper	864ef1eec5	Remove some of the patterns added in r163196. Increasing the complexity on insert_subvector into undef accomplishes the same thing. llvm-svn: 163198	2012-09-05 07:26:35 +00:00
Craig Topper	f029cfe913	Add patterns for integer forms of VINSERTF128/VINSERTI128 folded with loads. Also add patterns to turn subvector inserts with loads to index 0 of an undef into VMOVAPS. llvm-svn: 163196	2012-09-05 06:58:39 +00:00
Craig Topper	6274d26545	Convert vextracti128/vextractf128 intrinsics to extract_subvector at DAG build time. Similar was previously done for vinserti128/vinsertf128. Add patterns for folding these extract_subvectors with stores. llvm-svn: 163192	2012-09-05 05:48:09 +00:00
Chad Rosier	b75afa43e4	Fix function name per coding standard. llvm-svn: 163187	2012-09-05 01:15:43 +00:00
Preston Gurd	c80dc7d214	Generic Bypass Slow Div - CodeGenPrepare pass for identifying div/rem ops - Backend specifies the type mapping using addBypassSlowDivType - Enabled only for Intel Atom with O2 32-bit -> 8-bit - Replace IDIV with instructions which test its value and use DIVB if the value is positive and less than 256. - In the case when the quotient and remainder of a divide are used a DIV and a REM instruction will be present in the IR. In the non-Atom case they are both lowered to IDIVs and CSE removes the redundant IDIV instruction, using the quotient and remainder from the first IDIV. However, due to this optimization CSE is not able to eliminate redundant IDIV instructions because they are located in different basic blocks. This is overcome by calculating both the quotient (DIV) and remainder (REM) in each basic block that is inserted by the optimization and reusing the result values when a subsequent DIV or REM instruction uses the same operands. - Test cases check for the presence of the optimization when calculating either the quotient, remainder, or both. Patch by Tyler Nowicki! llvm-svn: 163150	2012-09-04 18:22:17 +00:00
Elena Demikhovsky	61924c155d	This patch optimizes shuffle instruction - generates 2 instructions instead of 4. Since this specific shuffle is widely used in many workloads we have ~10% performance on them. shufflevector <8 x float> %A, <8 x float> %B, <8 x i32> <i32 0, i32 8, i32 2, i32 10, i32 4, i32 12, i32 6, i32 14> vmovaps (%rdx), %ymm0 vshufps $8, %ymm0, %ymm0, %ymm0 vmovaps (%rcx), %ymm1 vshufps $8, %ymm0, %ymm1, %ymm1 vunpcklps %ymm0, %ymm1, %ymm0 vmovaps (%rcx), %ymm0 vmovsldup (%rdx), %ymm1 vblendps $85, %ymm0, %ymm1, %ymm0 llvm-svn: 163134	2012-09-04 12:49:02 +00:00
Chad Rosier	294688cf56	[ms-inline asm] Asm operands can map to one or more MCOperands. Therefore, add the NumMCOperands argument to the GetMCInstOperandNum() function that is set to the number of MCOperands this asm operand mapped to. llvm-svn: 163124	2012-09-03 20:31:23 +00:00
Chad Rosier	bd31fcd8a9	[ms-inline asm] Add an interface to the GetMCInstOperandNum() function in the MCTargetAsmParser class. llvm-svn: 163122	2012-09-03 18:47:45 +00:00
Chad Rosier	fac2e7b419	Removed unused argument. llvm-svn: 163104	2012-09-03 03:16:09 +00:00
Chris Lattner	4a8f2bcb32	some peepholes that should match horizontal add/sub operations. llvm-svn: 163103	2012-09-03 02:58:21 +00:00
Chad Rosier	6fbf85d859	[ms-inline asm] Expose the Kind and Opcode variables from the MatchInstructionImpl() function. These values are used by the ConvertToMCInst() function to index into the ConversionTable. The values are also needed to call the GetMCInstOperandNum() function. llvm-svn: 163101	2012-09-03 02:06:46 +00:00
Craig Topper	0791e3f380	Typos llvm-svn: 163053	2012-09-01 06:33:50 +00:00
Manman Ren	9afdad8207	SelectionDAG: when constructing VZEXT_LOAD from other loads, make sure its output chain is correctly setup. As an example, if the original load must happen before later stores, we need to make sure the constructed VZEXT_LOAD is constrained to be before the stores. rdar://11457792 llvm-svn: 163036	2012-08-31 23:16:57 +00:00
Craig Topper	2e53378ff6	Mark FMA4 instructions as commutable and add them to the folding tables. llvm-svn: 163035	2012-08-31 23:10:34 +00:00
Craig Topper	4a81c1cbe0	Add selection of RegOp2MemOpTable3 to canFoldMemoryOperand llvm-svn: 163029	2012-08-31 22:12:16 +00:00
Michael Liao	6f4b3f358d	Fix PR12359 - In addition to undefined, if V2 is zero vector, skip 2nd PSHUFB and POR as well as PSHUFB will zero elements with negative indices. Patch by Sriram Murali <sriram.murali@intel.com> llvm-svn: 163018	2012-08-31 20:12:31 +00:00
Chad Rosier	5e5a7c4932	The ConvertToMCInst() function can't fail, so remove the now dead Match_ConversionFail enum. llvm-svn: 163002	2012-08-31 16:41:07 +00:00
Craig Topper	917333c8c7	Mark FMA3 instructions as commutable so that the operands to the multiply part can be commuted. llvm-svn: 163001	2012-08-31 16:31:13 +00:00
Craig Topper	6bb3145d0d	Add support for converting llvm.fma to fma4 instructions. llvm-svn: 162999	2012-08-31 15:40:30 +00:00
Michael Liao	43c7369b24	Clean up AddedComplexity further after adding UseSSEx llvm-svn: 162973	2012-08-31 03:01:35 +00:00
Jim Grosbach	6d3cb70105	X86: Fix encoding of 'movd %xmm0, %rax' The assembly string for the VMOVPQIto64rr instruction incorrectly lacked the 'v' prefix, resulting in mis-assembly of the vanilla movd instruction. llvm-svn: 162963	2012-08-31 00:30:30 +00:00
Michael Liao	b6735b87b0	Introduce 'UseSSEx' to force SSE legacy encoding - Add 'UseSSEx' to force SSE legacy insn not being selected when AVX is enabled. As the penalty of inter-mixing SSE and AVX instructions, we need prevent SSE legacy insn from being generated except explicitly specified through some intrinsics. For patterns supported by both SSE and AVX, so far, we force AVX insn will be tried first relying on AddedComplexity or position in td file. It's error-prone and introduces bugs accidentally. 'UseSSEx' is disabled when AVX is turned on. For SSE insns inherited by AVX, we need this predicate to force VEX encoding or SSE legacy encoding only. For insns not inherited by AVX, we still use the previous predicates, i.e. 'HasSSEx'. So far, these insns fall into the following categories: * SSE insns with MMX operands * SSE insns with GPR/MEM operands only (xFENCE, PREFETCH, CLFLUSH, CRC, and etc.) * SSE4A insns. * MMX insns. * x87 insns added by SSE. 2 test cases are modified: - test/CodeGen/X86/fast-isel-x86-64.ll AVX code generation is different from SSE one. 'vcvtsi2sdq' cannot be selected by fast-isel due to complicated pattern and fast-isel fallback to materialize it from constant pool. - test/CodeGen/X86/widen_load-1.ll AVX code generation is different from SSE one after fixing SSE/AVX inter-mixing. Exec-domain fixing prefers 'vmovapd' instead of 'vmovaps'. llvm-svn: 162919	2012-08-30 16:54:46 +00:00
Craig Topper	3bc01e8fa4	Only perform DAG combine on FMAs of legal types. llvm-svn: 162892	2012-08-30 06:56:15 +00:00
Michael Liao	0e40defe86	Fix PR13727 - The root cause is that target constant materialization in X86 fast-isel creates a PC-rel addressing which may overflow 32-bit range in non-Small code model if .rodata section is allocated too far away from code segment in MCJIT, which uses Large code model so far. - Follow the similar logic to fix non-Small code model in fast-isel by skipping non-Small code model. llvm-svn: 162881	2012-08-30 00:30:16 +00:00
Benjamin Kramer	49d736fb29	Make helper function static. llvm-svn: 162843	2012-08-29 16:17:01 +00:00
Craig Topper	aa2444a397	Convert FMA4 patterns to use target specific nodes instead of intrinsics to align with FMA3. llvm-svn: 162829	2012-08-29 07:18:25 +00:00
Chad Rosier	eed9ef7a03	Typo. llvm-svn: 162807	2012-08-28 23:57:47 +00:00
Michael Liao	2136b1b1ed	Add comments on the literal value used. llvm-svn: 162805	2012-08-28 23:42:17 +00:00
Michael Liao	32ad80c81f	Explicitly update the number of nodes to be traversed llvm-svn: 162780	2012-08-28 19:20:29 +00:00
Bill Wendling	6488dc22bb	The commutative flag is already correctly set within the multiclass. If we set it here, then a 'register-memory' version would wrongly get the commutative flag. <rdar://problem/12180135> llvm-svn: 162741	2012-08-28 07:36:46 +00:00
Craig Topper	803047a9bb	Convert V_SETALLONES/AVX_SETALLONES/AVX2_SETALLONES to Post-RA pseudos. llvm-svn: 162740	2012-08-28 07:30:47 +00:00
Craig Topper	02bb8ce5e0	Merge AVX_SET0PSY/AVX_SET0PDY/AVX2_SET0 into a single post-RA pseudo. llvm-svn: 162738	2012-08-28 07:05:28 +00:00
Michael Liao	1f793b9c47	Fix PR12312 - Add a target-specific DAG optimization to recognize a pattern PTEST-able. Such a pattern is a OR'd tree with X86ISD::OR as the root node. When X86ISD::OR node has only its flag result being used as a boolean value and all its leaves are extracted from the same vector, it could be folded into an X86ISD::PTEST node. llvm-svn: 162735	2012-08-28 03:34:40 +00:00
Jakob Stoklund Olesen	882cb360be	More missing mayLoad flags on AVX multiclasses. llvm-svn: 162714	2012-08-28 00:02:01 +00:00
Craig Topper	3e5376d85a	Remove MMX shift intrinsic handling code that also exists in SelectionDAGBuilder. llvm-svn: 162661	2012-08-27 08:08:30 +00:00
Craig Topper	bbee14ad9d	Don't allow vextractf128 to be folded with unaligned stores. We don't fold unaligned loads so shouldn't fold unaligned stores as it can cause an alignment fault to occur. llvm-svn: 162658	2012-08-27 07:19:59 +00:00
Craig Topper	57dd6db42e	Fold some patterns into instruction definitions so tablegen can infer flags removing the need for an explicit 'neverHasSideEffects = 1' llvm-svn: 162656	2012-08-27 07:04:50 +00:00
Craig Topper	b524d2e36d	Add HasAVX1Only predicate and use it for patterns that have an AVX1 instruction and an AVX2 instruction rather than relying on AddedComplexity. llvm-svn: 162654	2012-08-27 06:08:57 +00:00
Richard Smith	865f47cbb6	Fix integer undefined behavior due to signed left shift overflow in LLVM. Reviewed offline by chandlerc. llvm-svn: 162623	2012-08-24 23:29:28 +00:00
Jakob Stoklund Olesen	d1820cea0b	Add missing mayLoad flags to a large class of AVX *_Int instructions. llvm-svn: 162622	2012-08-24 23:29:07 +00:00
Jakob Stoklund Olesen	9ebe947bb0	Mark X86::RET and RETI instructions as variadic. There is special magic happening when returning floating point values on the x87 stack. The RET instructions get extra f80 operands. llvm-svn: 162592	2012-08-24 20:52:44 +00:00
Jakob Stoklund Olesen	48bb81b28a	Remove more mayLoad workarounds. llvm-svn: 162556	2012-08-24 14:43:22 +00:00
Craig Topper	aa57ba3944	Custom lower FMA intrinsics to target specific nodes and remove the patterns. llvm-svn: 162534	2012-08-24 04:03:22 +00:00
Jakob Stoklund Olesen	3739d6ca99	Remove some spurious mayLoad = 0 flags. They were inserted to silence TableGen's warning about redundant properties. That warning is now gone. llvm-svn: 162517	2012-08-24 00:31:20 +00:00
Jakob Stoklund Olesen	2f512d8eba	X86MemBarrier has unmodeled side effects. llvm-svn: 162514	2012-08-24 00:31:10 +00:00
Jakob Stoklund Olesen	16126ffe0d	Preserve operand flags in convertToThreeAddress() by copying operands. No test case, this is a generalization of r160260. llvm-svn: 162485	2012-08-23 22:36:31 +00:00
Craig Topper	3d4254e5b4	Favor FMA3 over FMA4 if both are enabled. llvm-svn: 162454	2012-08-23 18:14:30 +00:00
Craig Topper	528004fc78	Use a switch statement instead of a bunch of if-else checks and pull out the common function call. llvm-svn: 162428	2012-08-23 04:57:36 +00:00
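
The "Generic Bypass Slow Div" commit (c80dc7d214) describes replacing a 32-bit IDIV with a runtime test that falls through to an 8-bit DIVB when the value is positive and less than 256. A minimal scalar sketch of that idea follows. It is not the LLVM pass: `bypassSlowDiv` is a hypothetical name, and checking both operands with a single OR-and-mask test is an assumption about how the test is formed. It computes quotient and remainder together on each path, mirroring the commit's note that both are calculated in every inserted basic block.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical helper, not LLVM code: returns {quotient, remainder}.
std::pair<int32_t, int32_t> bypassSlowDiv(int32_t a, int32_t b) {
  assert(b != 0 && "division by zero is undefined");

  // Assumed form of the test: OR the operands and check that no bit
  // above bit 7 is set. That holds exactly when both are in 0..255.
  const uint32_t bits = static_cast<uint32_t>(a) | static_cast<uint32_t>(b);
  if ((bits & ~0xFFu) == 0) {
    // Fast path: 8-bit unsigned divide (stands in for DIVB).
    const uint8_t na = static_cast<uint8_t>(a);
    const uint8_t nb = static_cast<uint8_t>(b);
    return {na / nb, na % nb};
  }

  // Slow path: full-width signed divide (stands in for IDIV).
  return {a / b, a % b};
}
```

Both paths return the same values for in-range inputs, so the transformation only changes which divide instruction would execute, not the result.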


8563 Commits
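
The shuffle commit (61924c155d) replaces a longer sequence with `vmovsldup` plus `vblendps $85` for the mask `<0, 8, 2, 10, 4, 12, 6, 14>`. The scalar emulation below is a sketch for checking that the two forms agree, not code from the patch. It assumes `(%rcx)` holds `%A` and `(%rdx)` holds `%B`, and that a set bit in the `vblendps` immediate selects the lane from the memory/second source operand (here `%A`). The helper names are hypothetical.

```cpp
#include <array>
#include <cstddef>

using V8 = std::array<float, 8>;

// Reference semantics of the shufflevector: indices 0..7 select from a,
// indices 8..15 select from b.
V8 shuffleReference(const V8& a, const V8& b) {
  static constexpr int mask[8] = {0, 8, 2, 10, 4, 12, 6, 14};
  V8 r{};
  for (std::size_t i = 0; i < 8; ++i)
    r[i] = mask[i] < 8 ? a[mask[i]] : b[mask[i] - 8];
  return r;
}

// Emulates vmovsldup: each even lane is duplicated into the next odd lane.
V8 movsldup(const V8& v) {
  V8 r{};
  for (std::size_t i = 0; i < 8; ++i)
    r[i] = v[i & ~std::size_t{1}];
  return r;
}

// Emulates vblendps: bit i of imm set picks whenSet[i], else whenClear[i].
V8 blendps(const V8& whenClear, const V8& whenSet, unsigned imm) {
  V8 r{};
  for (std::size_t i = 0; i < 8; ++i)
    r[i] = ((imm >> i) & 1u) ? whenSet[i] : whenClear[i];
  return r;
}

// The shorter sequence: duplicate the even lanes of b, then blend a into
// the even lanes (85 == 0b01010101).
V8 shuffleTwoInsn(const V8& a, const V8& b) {
  return blendps(movsldup(b), a, 85);
}
```

Under these assumptions the result is `A0, B0, A2, B2, A4, B4, A6, B6`, which is what the mask requests.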