llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-23 21:13:02 +02:00

Author	SHA1	Message	Date
Craig Topper	63828a2473	[AVX-512] Give priority to EVEX encoded scalar FMA instructions when we have FMA, AVX512 and no VLX. We were giving priority if VLX was enabled. llvm-svn: 298046	2017-03-17 06:10:37 +00:00
Craig Topper	624fc12e44	[X86][FMA4] Remove isCommutable from FMA4 scalar intrinsics. They aren't commutable as operand 0 should pass its upper bits through to the output. llvm-svn: 288011	2016-11-27 21:37:04 +00:00
Craig Topper	bf372031c2	[X86][FMA] Add missing Predicates qualifier around scalar FMA intrinsic patterns. llvm-svn: 288010	2016-11-27 21:37:02 +00:00
Craig Topper	5943c6520e	[X86] Create a new instruction format to handle MemOp4 encoding. This saves one bit in TSFlags and simplifies MRMSrcMem/MRMSrcReg format handling. llvm-svn: 279423	2016-08-22 07:38:45 +00:00
Craig Topper	9332f50e72	[X86] Remove CustomInserter for FMA3 instructions. Looks like since we got full commuting support for FMAs after this was added, the coalescer can now get this right on its own. Differential Revision: https://reviews.llvm.org/D22799 llvm-svn: 276987	2016-07-28 15:28:56 +00:00
Craig Topper	39f037759a	[X86] Make the FMA3 instruction names consistent between VEX and EVEX encoded versions. This places the 132/213/231 form number in front of the SS/SD/PS/PD. Move the Y for 256-bit versions to be after the PS/PD. Change the AVX512 scalar forms to include a Z in their names. This new format should be consistent with the general naming of instructions. llvm-svn: 276559	2016-07-24 08:26:38 +00:00
Vyacheslav Klochkov	d08c394197	X86-FMA3: Defined the ExeDomain property for Scalar FMA3 opcodes. Reviewer: Simon Pilgrim. Differential Revision: http://reviews.llvm.org/D15317 llvm-svn: 255080	2015-12-09 00:12:13 +00:00
Simon Pilgrim	52cfcde3fb	[X86][FMA4] Explicitly set the domain of FMA4 float/double scalar instructions. Both were defaulting to the float domain - now matches the packed instructions. llvm-svn: 254841	2015-12-05 07:07:42 +00:00
Vyacheslav Klochkov	fdc2e9e5ae	X86-FMA3: Improved/enabled the memory folding optimization for scalar loads generated for _mm_load_s{s,d}() intrinsics and used in scalar FMAs generated for FMA intrinsics _mm_f{madd,msub,nmadd,nmsub}_s{s,d}(). Reviewer: David Kreitzer. Differential Revision: http://reviews.llvm.org/D14762 llvm-svn: 254140	2015-11-26 07:45:30 +00:00
Sanjay Patel	836b13b706	fix typo; NFC llvm-svn: 254069	2015-11-25 15:33:36 +00:00
Vyacheslav Klochkov	262f8eaf50	X86-FMA3: Implemented commute transformations for FMA_Int instructions. It made it possible to apply the memory folding optimization for the 2nd operand of FMA_Int instructions. Reviewer: Quentin Colombet. Differential Revision: http://reviews.llvm.org/D14550 llvm-svn: 252973	2015-11-13 00:07:35 +00:00
Andrew Kaylor	3cc19fd26f	Improved the operands commute transformation for X86-FMA3 instructions. All 3 operands of FMA3 instructions are commutable now. Patch by Slava Klochkov. Reviewers: Quentin Colombet (qcolombet), Ahmed Bougacha (ab). Differential Revision: http://reviews.llvm.org/D13269 llvm-svn: 252335	2015-11-06 19:47:25 +00:00
Andrew Kaylor	e42c061561	Created new X86 FMA3 opcodes (FMA_Int) that are used now for lowering of scalar FMA intrinsics. Patch by Slava Klochkov. The key difference between FMA and FMA_Int opcodes is that FMA_Int opcodes are handled more conservatively. It is illegal to commute the 1st operand of FMA*_Int instructions as the upper bits of scalar FMA intrinsic result must be taken from the 1st operand, but such commute transformation would change those upper bits and invalidate the intrinsic's result. Reviewers: Quentin Colombet, Elena Demikhovsky. Differential Revision: http://reviews.llvm.org/D13710 llvm-svn: 252060	2015-11-04 18:10:41 +00:00
Michael Kuperstein	4559e5e720	[X86] When pattern-matching scalar FMA3 intrinsics, don't re-arrange the first and second operands. The semantics of the scalar FMA intrinsics are that the high vector elements are copied from the first source. The existing pattern switches src1 and src2 around, to match the "213" order, which ends up tying the original src2 to the dest. Since the actual scalar fma3 instructions copy the high elements from the dest register, the wrong values are copied. This modifies the pattern to leave src1 and src2 in their original order. Differential Revision: http://reviews.llvm.org/D9908 llvm-svn: 238131	2015-05-25 12:35:25 +00:00
Craig Topper	0734168db8	Replace neverHasSideEffects=1 with hasSideEffects=0 in all .td files. llvm-svn: 222801	2014-11-26 00:46:26 +00:00
Quentin Colombet	9b13d839be	[X86] Selectively mark the FMA variants inside a family as isCommutable. Given a FMA family (e.g., 213, 231), not all the variants (i.e., register or memory) are commutable. E.g., for the 213 family (with the syntax src1, src2, src3): fmaXXX213 A, B, reg3/mem3 == fmaXXX213 B, A, reg3/mem3. Now consider the 231 family: fmaXXX231 A, B, reg3 == fmaXXX231 A, reg3, B, but fmaXXX231 A, B, mem3 != fmaXXX231 A, mem3, B. Indeed, mem3 cannot be the second argument of the memory variant of fmaXXX231. Working on a reduced test case! <rdar://problem/16800495> llvm-svn: 208252	2014-05-07 21:43:35 +00:00
Lang Hames	e2f8671084	[X86] Make the VFMA*231 variants commutable and relax the alignment restrictions on FMA3 memory operands. FMA3 instructions are VEX encoded, so they can load from unaligned memory. Testcase to follow, along with related patch. <rdar://problem/16478629> llvm-svn: 205472	2014-04-02 22:06:16 +00:00
Lang Hames	4397ed9ac7	[X86] Only 213 FMA3 variants should be marked commutable. Commuting the 231 and 132 variants would swap addends and multiplicands/multipliers, which isn't valid. I'm still trying to reduce a decent test case for this. llvm-svn: 200792	2014-02-04 19:42:47 +00:00
Lang Hames	884a7dc676	Replace X86 FMA intrinsic pseudo-instructions with def pats. It looks like these pseudos were only used for pattern matching. Def pats are the appropriate way to do that. As a bonus, these intrinsics will now have memory operands folded properly, and better FMA3 variants selected where appropriate (see r199933). <rdar://problem/15611947> llvm-svn: 200577	2014-01-31 21:29:19 +00:00
Lang Hames	8b08ff3852	Replace vfmaddxx213 instructions with their 231-type equivalents in accumulator loops. Writing back to the accumulator (231-type) allows the coalescer to eliminate an extra copy. llvm-svn: 199933	2014-01-23 20:23:36 +00:00
Craig Topper	4a48c26e38	Add a new x86 specific instruction flag to force some isCodeGenOnly instructions to go through to the disassembler tables without resorting to string matches. Apply flag to all _REV instructions. llvm-svn: 198543	2014-01-05 04:17:28 +00:00
Craig Topper	57b949fa83	Mark all x86 Int_ and _Int patterns as isCodeGenOnly so the disassembler table builder doesn't need to string match them to exclude them. llvm-svn: 198323	2014-01-02 17:28:14 +00:00
Craig Topper	a4bd7d9c3c	Various x86 disassembler fixes. Add VEX_LIG to scalar FMA4 instructions. Use VEX_LIG in some of the inheriting checks in disassembler table generator. Make use of VEX_L_W, VEX_L_W_XS, VEX_L_W_XD contexts. Don't let VEX_L_W, VEX_L_W_XS, VEX_L_W_XD, VEX_L_W_OPSIZE inherit from their non-L forms unless VEX_LIG is set. Let VEX_L_W, VEX_L_W_XS, VEX_L_W_XD, VEX_L_W_OPSIZE inherit from all of their non-L or non-W cases. Increase ranking on VEX_L_W, VEX_L_W_XS, VEX_L_W_XD, VEX_L_W_OPSIZE so they get chosen over non-L/non-W forms. llvm-svn: 191649	2013-09-30 02:46:36 +00:00
Craig Topper	ef2cf025cd	Remove alignment restrictions from FMA load folding. llvm-svn: 191136	2013-09-21 05:58:59 +00:00
Craig Topper	58b9662000	Simplify nested strconcats in X86 td files since strconcat can take more than 2 arguments. llvm-svn: 172379	2013-01-14 07:46:34 +00:00
Craig Topper	152bee45fa	Mark all the _REV instructions as not having side effects. They aren't really emitted by the backend, but it reduces the number of instructions in the output files with unmodelled side effects to make auditing easier. llvm-svn: 171118	2012-12-26 21:30:22 +00:00
Craig Topper	f0d2332d86	Fix execution domain for packed FMA4 instructions. llvm-svn: 168417	2012-11-21 08:08:21 +00:00
Craig Topper	7c37abcace	Add explicit VEX_L tags to all 256-bit instructions. This will allow us to remove code from the code emitters that examined operands to set the L-bit. llvm-svn: 164202	2012-09-19 06:06:34 +00:00
Craig Topper	2e53378ff6	Mark FMA4 instructions as commutable and add them to the folding tables. llvm-svn: 163035	2012-08-31 23:10:34 +00:00
Craig Topper	917333c8c7	Mark FMA3 instructions as commutable so that the operands to the multiply part can be commuted. llvm-svn: 163001	2012-08-31 16:31:13 +00:00
Craig Topper	6bb3145d0d	Add support for converting llvm.fma to fma4 instructions. llvm-svn: 162999	2012-08-31 15:40:30 +00:00
Craig Topper	aa2444a397	Convert FMA4 patterns to use target specific nodes instead of intrinsics to align with FMA3. llvm-svn: 162829	2012-08-29 07:18:25 +00:00
Jakob Stoklund Olesen	48bb81b28a	Remove more mayLoad workarounds. llvm-svn: 162556	2012-08-24 14:43:22 +00:00
Craig Topper	aa57ba3944	Custom lower FMA intrinsics to target specific nodes and remove the patterns. llvm-svn: 162534	2012-08-24 04:03:22 +00:00
Craig Topper	e432edabf1	Cleanup the scalar FMA3 definitions. Add patterns to fold loads with scalar forms. llvm-svn: 162260	2012-08-21 07:11:11 +00:00
Craig Topper	2e63b3ea18	Merge FMA3 instructions with and without patterns into single classes using null_frag. llvm-svn: 162257	2012-08-21 05:56:45 +00:00
Craig Topper	77406bef3b	Remove FMA3 intrinsic instructions in favor of patterns. llvm-svn: 162194	2012-08-20 06:21:25 +00:00
Craig Topper	64c93f9d07	Use correct intrinsic for 256-bit VFMSUBADDPS. llvm-svn: 162193	2012-08-20 06:03:04 +00:00
Craig Topper	832951e7da	Remove trailing white space and tab characters. No functional change. llvm-svn: 162192	2012-08-19 23:37:46 +00:00
Elena Demikhovsky	0fec7026d9	Added FMA functionality to X86 target. llvm-svn: 161110	2012-08-01 12:06:00 +00:00
Craig Topper	52bf0cfb27	Add intrinsic forms for FMA instructions to opcode folding tables. llvm-svn: 157917	2012-06-04 07:46:16 +00:00
Craig Topper	8d3031fa46	Rename fma4 intrinsics to just fma since they are now used for both FMA4 and FMA3. Autoupgrade support coming in a separate commit. llvm-svn: 157898	2012-06-03 07:26:46 +00:00
Craig Topper	685b86b007	Use sse_load_f32/64 for scalar FMA3 intrinsic patterns instead of 128-bit loads to match instruction behavior. llvm-svn: 157895	2012-06-03 01:40:43 +00:00
Craig Topper	e783584ea7	Add neverHasSideEffects and mayLoad to FMA3 instructions. llvm-svn: 157894	2012-06-03 00:30:49 +00:00
Craig Topper	02e5a00a70	Remove fadd(fmul) patterns for FMA3. This needs to be implemented by paying attention to FP_CONTRACT and matching @llvm.fma which is not available yet. This will allow us to enable intrinsic use at least, though. llvm-svn: 157804	2012-06-01 06:07:48 +00:00
Craig Topper	6343cc814e	Tidy up. Remove trailing spaces and fix the worst of the 80 column violations. llvm-svn: 157799	2012-06-01 05:24:29 +00:00
Elena Demikhovsky	194da7364d	Added FMA3 Intel instructions. I disabled FMA3 autodetection, since the result may differ from expected for some benchmarks. I added tests for CodeGen and intrinsics. I did not change llvm.fma.f32/64 - it may be done later. llvm-svn: 157737	2012-05-31 09:20:20 +00:00
Jia Liu	6bb2f0f0e4	some comment fix for X86 and ARM llvm-svn: 150902	2012-02-19 02:03:36 +00:00
Jia Liu	b077b6085d	Emacs-tag and some comment fix for all ARM, CellSPU, Hexagon, MBlaze, MSP430, PPC, PTX, Sparc, X86, XCore. llvm-svn: 150878	2012-02-18 12:03:15 +00:00
Craig Topper	4f47613598	Mark scalar FMA4 instructions as ignoring the VEX.L bit. llvm-svn: 147602	2012-01-05 08:56:10 +00:00

62 Commits
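
Several of the messages above turn on which FMA3 operands may be commuted (r200792, r208252, r252060). The following is a minimal Python sketch of that reasoning, not code from the repository. It assumes the Intel operand numbering (dst is operand 1 and is tied to the result), models a 128-bit register as a 4-tuple whose element 0 is the scalar lane, and uses hypothetical helper names (`fma`, `scalar_fma`). It also uses a separate multiply and add rather than a single-rounding fused operation, which is exact for the small values shown.

```python
# Sketch only: models scalar FMA3 forms on 4-element "registers".
# Element 0 is the scalar lane; elements 1..3 are the upper bits.

def fma(a, b, c):
    # Not truly fused (two roundings); exact for the small values used here.
    return a * b + c

def scalar_fma(form, dst, src2, src3):
    """Compute the low lane per form; upper lanes pass through from dst."""
    d, s2, s3 = dst[0], src2[0], src3[0]
    if form == "132":
        low = fma(d, s3, s2)   # dst = dst * src3 + src2
    elif form == "213":
        low = fma(s2, d, s3)   # dst = src2 * dst + src3
    elif form == "231":
        low = fma(s2, s3, d)   # dst = src2 * src3 + dst
    else:
        raise ValueError(f"unknown FMA3 form: {form}")
    return (low,) + tuple(dst[1:])

a = (2.0, 10.0, 10.0, 10.0)
b = (3.0, 20.0, 20.0, 20.0)
c = (5.0, 30.0, 30.0, 30.0)

# 213: swapping operands 1 and 2 keeps the low lane (both are multiplicands)
# but changes the upper lanes, which is why the _Int forms must not do it.
r213 = scalar_fma("213", a, b, c)
r213_swapped = scalar_fma("213", b, a, c)

# 231: operands 2 and 3 are both multiplicands, so swapping them is safe.
r231 = scalar_fma("231", a, b, c)
r231_swapped = scalar_fma("231", a, c, b)

# 132: operands 2 and 3 are an addend and a multiplicand; swapping changes the result.
r132 = scalar_fma("132", a, b, c)
r132_swapped = scalar_fma("132", a, c, b)
```

In this model `r213` and `r213_swapped` agree in lane 0 but differ in lanes 1 to 3, `r231` equals `r231_swapped`, and `r132` differs from `r132_swapped` in lane 0. The memory-operand restriction described in r208252 is an encoding constraint and is not modeled here.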