llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-20 03:23:01 +02:00

Author	SHA1	Message	Date
Hao Liu	636db9c0e6	[AArch64]Implement the copy of two FPR8 registers by using FMOVss of two FPR32 registers in copyPhysReg. llvm-svn: 201061	2014-02-10 03:16:22 +00:00
Tim Northover	cf3928b4f7	ARM & AArch64: merge NEON absolute compare intrinsics There was an extremely confusing proliferation of LLVM intrinsics to implement the vacge & vacgt instructions. This combines them all into two polymorphic intrinsics, shared across both backends. llvm-svn: 200768	2014-02-04 14:55:42 +00:00
Tim Northover	0b6ea5de72	AArch64 & ARM: refactor crypto intrinsics to take scalars Some of the SHA instructions take a scalar i32 as one argument (largely because they work on 160-bit hash fragments). This wasn't reflected in the IR previously, with ARM and AArch64 choosing different types (<4 x i32> and <1 x i32> respectively) which was ugly. This makes all the affected intrinsics take a uniform "i32", allowing them to become non-polymorphic at the same time. llvm-svn: 200706	2014-02-03 17:27:49 +00:00
Chad Rosier	156f3a2a96	[AArch64] Custom lower concat_vector patterns with v4i16, v4i32, v8i8, v8i16, v16i8 types. llvm-svn: 200491	2014-01-30 21:46:54 +00:00
Kevin Qin	379441a4e6	[AArch64 NEON] Lower SELECT_CC with vector operand. When the scalar compare is between floating point and operands are vector, we custom lower SELECT_CC to use NEON SIMD compare for generating less instructions. llvm-svn: 200365	2014-01-29 01:57:30 +00:00
Kevin Qin	436aae7633	[AArch64 NEON] Try to generate CONCAT_VECTOR when lowering BUILD_VECTOR or SHUFFLE_VECTOR. Replace r199791. llvm-svn: 200180	2014-01-27 02:53:54 +00:00
Kevin Qin	d83dee8270	Revert r199791. It's old version which has some bugs. I'll commit lattest patch soon. llvm-svn: 200179	2014-01-27 02:53:41 +00:00
Jiangning Liu	5ac0a5db29	Improve pattern match from v1i8 to v1i32 for AArch64 Neon. llvm-svn: 200119	2014-01-26 04:55:53 +00:00
Jiangning Liu	8a0b567fb9	Implement pattern match from v1xx to v1xx for AArch64 Neon. llvm-svn: 200113	2014-01-26 03:27:40 +00:00
Kevin Qin	ef4cd4a730	[AArch64 NEON] Add patterns for concat_vector on v2i32. llvm-svn: 200111	2014-01-26 02:46:15 +00:00
Kevin Qin	db813b2f82	[AArch64 NEON] Add test case for vector FP_ROUND. llvm-svn: 200110	2014-01-26 02:23:33 +00:00
Ana Pazos	0a0875b43a	[AArch64] Removed unused i8 type from FPR8 register class. The i8 type is not registered with any register class. This causes a segmentation fault in MachineLICM::getRegisterClassIDAndCost. The code selects the first type associated with register class FPR8, which happens to be i8. It uses this type (i8) to get the representative class pointer, which is 0. It then uses this pointer to access a field, resulting in segmentation fault. Since i8 type is not being used for printing any neon instruction we can safely remove it. llvm-svn: 200046	2014-01-24 22:36:53 +00:00
Kevin Qin	3282007e08	[AArch64 NEON] Fix a bug in implementing register copy bwtween FPR16. llvm-svn: 199978	2014-01-24 07:53:04 +00:00
Ana Pazos	5fdec23c84	[AArch64] Added vselect patterns with float and double types llvm-svn: 199925	2014-01-23 19:18:57 +00:00
Hao Liu	80b39e0b02	[AArch64]Add CHECK for two test cases testing scalar_to_vector committed in r199461. llvm-svn: 199861	2014-01-23 02:09:30 +00:00
Kevin Qin	9a631f3af4	[AArch64 NEON] Try to generate CONCAT_VECTOR when lowering BUILD_VECTOR or SHUFFLE_VECTOR. llvm-svn: 199791	2014-01-22 06:11:03 +00:00
Kevin Qin	d925e0a953	[AArch64 NEON] Fix a bug caused by undef lane when generating VEXT. It was commited as r199628 but reverted in r199628 as causing regression test failed. It's because of old vervsion of patch I used to commit. Sorry for mistake. llvm-svn: 199704	2014-01-21 01:48:52 +00:00
Chandler Carruth	f3546bc541	Revert r199628: "[AArch64 NEON] Fix a bug caused by undef lane when generating VEXT." This test fails the newly added regression tests. llvm-svn: 199631	2014-01-20 08:18:01 +00:00
Kevin Qin	a2c8e30bce	[AArch64 NEON] Fix a bug caused by undef lane when generating VEXT. llvm-svn: 199628	2014-01-20 07:32:26 +00:00
Kevin Qin	e739fc1b8e	[AArch64 NEON] Expand vector for UDIV/SDIV/UREM/SREM/FREM as neon doesn't support these operations. llvm-svn: 199485	2014-01-17 09:54:30 +00:00
Hao Liu	f1036ed220	[AArch64]Fix the problem can't select f16_to_f32 and f32_to_f16. Also add copy support for FPR16. Also add a missing test case file belongs to commit r197361. llvm-svn: 199463	2014-01-17 06:23:30 +00:00
Kevin Qin	71a9ad96db	[AArch64 NEON] Custom lower conversion between vector integer and vector floating point if element bit-width doesn't match. llvm-svn: 199462	2014-01-17 05:52:35 +00:00
Hao Liu	96315c1088	[AArch64]Fix the problem can't select concat_vectors of two v1i32 types. Also fix the problem can't select scalar_to_vector from f32 to v2f32/v4f32. llvm-svn: 199461	2014-01-17 05:44:46 +00:00
Jiangning Liu	ff1e0e1ce3	For AArch64, lowering sext_inreg and generate optimized code by using SXTL. llvm-svn: 199296	2014-01-15 05:08:01 +00:00
Tim Northover	3f497bbb76	AArch64: don't try to handle [SU]MUL_LOHI nodes We should set them to expand for now since there are no patterns dealing with them. Actually, there are no instructions either so I doubt they'll ever be acceptable. llvm-svn: 199265	2014-01-14 22:53:22 +00:00
Rafael Espindola	9ae2f1aa3d	Revert "[AArch64] Added vselect patterns with float and double types" This reverts commit r199242. It is causing CodeGen/AArch64/neon-bsl.ll to fail. llvm-svn: 199248	2014-01-14 19:24:08 +00:00
Ana Pazos	51fd756e4b	[AArch64] Added vselect patterns with float and double types llvm-svn: 199242	2014-01-14 18:45:48 +00:00
Andrea Di Biagio	c159ef589c	[AArch64] Fix assertion failure caused by an invalid comparison between APInt values. APInt only knows how to compare values with the same BitWidth and asserts in all other cases. With this fix, function PerformORCombine does not use the APInt equality operator if the APInt values returned by 'isConstantSplat' differ in BitWidth. In that case they are different and no comparison is needed. llvm-svn: 199119	2014-01-13 16:51:00 +00:00
Kevin Qin	5aa184711d	[AArch64 NEON] Add missing patterns for bitcast from or to v1f64 llvm-svn: 199070	2014-01-13 01:58:38 +00:00
Kevin Qin	9b14d101ea	[AArch64 NEON] Add more scenarios to use perm instructions when lowering shuffle_vector This patch covered 2 more scenarios: 1. Two operands of shuffle_vector are the same, like %shuffle.i = shufflevector <8 x i8> %a, <8 x i8> %a, <8 x i32> <i32 0, i32 2, i32 4, i32 6, i32 8, i32 10, i32 12, i32 14> 2. One of operands is undef, like %shuffle.i = shufflevector <8 x i8> %a, <8 x i8> undef, <8 x i32> <i32 0, i32 2, i32 4, i32 6, i32 8, i32 10, i32 12, i32 14> After this patch, perm instructions will have chance to be emitted instead of lots of INS. llvm-svn: 199069	2014-01-13 01:56:29 +00:00
Nico Rieck	884ee061f2	Fix non-deterministic SDNodeOrder-dependent codegen Reset SelectionDAGBuilder's SDNodeOrder to ensure deterministic code generation. llvm-svn: 199050	2014-01-12 14:09:17 +00:00
Kristof Beyls	082ab7548c	Make sure -use-init-array has intended effect on all AArch64 ELF targets, not just linux. llvm-svn: 198937	2014-01-10 13:41:49 +00:00
Andrea Di Biagio	946f61e227	Teach the DAGCombiner how to fold 'vselect' dag nodes according to the following two rules: 1) fold (vselect (build_vector AllOnes), A, B) -> A 2) fold (vselect (build_vector AllZeros), A, B) -> B llvm-svn: 198777	2014-01-08 18:33:04 +00:00
Kevin Qin	fd4df4bd7a	[AArch64 NEON] Fix generating incorrect value type of NEON_VDUPLANE when lower build_vector if result value type mismatch with operand value type. llvm-svn: 198743	2014-01-08 08:06:14 +00:00
Hao Liu	2324cab69b	[AArch64]Add support to spill/fill D tuples such as DPair/DTriple/DQuad. There is no test cases for D tuple as the original test cases are too large. As the spill/fill of the D tuple is similar to the Q tuple, the correctness can be guaranteed. llvm-svn: 198684	2014-01-07 10:50:43 +00:00
Hao Liu	bbb265cfee	[AArch64]Add support to copy D tuples such as DPair/DTriple/DQuad and Q tuples such as QPair/QTriple/QQuad. There is no test case for D tuple as the original test cases are too large. As the copy of the D tuple is similar to the Q tuple, the correctness can be guaranteed. llvm-svn: 198682	2014-01-07 10:00:03 +00:00
Kevin Qin	cb6af368ab	[AArch64 NEON] Fixed incorrect immediate used in BIC instruction. llvm-svn: 198675	2014-01-07 05:10:47 +00:00
Kevin Qin	162c44e325	[AArch64 NEON] Fix invalid constant used in vselect condition. There is a wrong assumption that the vector element type and the type of each ConstantSDNode in the build_vector were the same. However, when promoting the integer operand of a legally typed build_vector, the operand type and the vector element type do not need to be the same (See method 'DAGTypeLegalizer::PromoteIntOp_BUILD_VECTOR' in LegalizeIntegerTypes.cpp). in AArch64 backend, the following dag sequence: C0: i1 = Constant<0> C1: i1 = Constant<-1> V: v8i1 = BUILD_VECTOR C1, C1, C0, C0, C0, C0, C0, C0 is type-legalized into: NewC0: i32 = Constant<0> NewC1: i32 = Constant<1> V: v8i8 = BUILD_VECTOR NewC1, NewC1, NewC0, NewC0, NewC0, NewC0, NewC0, NewC0 Forcing a getZeroExtend to VTBits to ensure that the new constant is correctly. llvm-svn: 198582	2014-01-06 02:26:10 +00:00
Jiangning Liu	583b8a7116	For AArch64 Neon, simplify scalar dup by lane0 for fp. llvm-svn: 198194	2013-12-30 02:44:35 +00:00
Hao Liu	ab32d54fad	[AArch64]Add code to spill/fill Q register tuples such as QPair/QTriple/QQuad. llvm-svn: 198193	2013-12-30 02:38:12 +00:00
Hao Liu	8bef865160	[AArch64]Can't select shift left 0 of type v1i64 llvm-svn: 198192	2013-12-30 02:12:46 +00:00
Kevin Qin	cbb0be4bee	Fix a bug in DAGcombiner about zero-extend after setcc. For AArch64 backend, if DAGCombiner see "sext(setcc)", it will combine them together to a single setcc with extended value type. Then if it see "zext(setcc)", it assumes setcc is Vxi1, and try to create "(and (vsetcc), (1, 1, ...)". While setcc isn't Vxi1, DAGcombiner will create wrong node and get wrong code emitted. llvm-svn: 198190	2013-12-30 02:05:13 +00:00
Hao Liu	e8d49c2088	[AArch64]Fix the problem that can't select mul of v1i64/v2i64 types. E.g. Can't select such IR: %tmp = mul <2 x i64> %a, %b llvm-svn: 198188	2013-12-30 01:38:41 +00:00
Hao Liu	8ed49e0c42	[AArch64]Fix a problem that the register order of fmls/fmla by element is incorrect. E.g. the codegen result is fmls v1.2s, v0.2s, v2.s[3] which is expected to be fmls v0.2s, v1.2s, v2.s[3] llvm-svn: 198001	2013-12-25 07:12:34 +00:00
Jiangning Liu	ca1d69d4c2	Add missing pattern matches to support ACLE intrinsics of AArch64 NEON. llvm-svn: 197993	2013-12-25 01:22:51 +00:00
Hao Liu	8ef969c4a0	[AArch64]Add patterns to match normal shift nodes: shl, sra and srl. llvm-svn: 197969	2013-12-24 09:00:21 +00:00
Kevin Qin	3993f1cd71	[AArch64 NEON] Fix a bug when lowering BUILD_VECTOR. DAG.getVectorShuffle() doesn't always return a vector_shuffle node. If mask is the exact sequence of it's operand(For example, operand_0 is v8i8, and the mask is 0, 1, 2, 3, 4, 5, 6, 7), it will directly return that operand. So a check is added here. llvm-svn: 197967	2013-12-24 08:16:06 +00:00
Kevin Qin	8f86911897	[AArch64 NEON] Fix a pattern match failure with NEON_VDUP. This failure caused by improper condition when lowering shuffle_vector to scalar_to_vector. After this patch NEON_VDUP with v1i64 will not be generated. llvm-svn: 197966	2013-12-24 08:11:47 +00:00
Ana Pazos	85f191fc73	[AArch64] Check fmul node single use in fused multiply patterns Check for single use of fmul node in fused multiply patterns to allow generation of fused multiply add/sub instructions. Otherwise fmul operation ends up being repeated more than once which does not help peformance on targets with only one MAC unit, as for example cortex-a53. llvm-svn: 197929	2013-12-24 00:47:29 +00:00
Ana Pazos	8821a9ef6b	[AArch64 NEON] Fixed fused multiply negate add/sub patterns The correct pattern matching should be: - fnmadd is (-Ra) + (-Rn)Rm which should be matched as: fma (fneg node:$Rn), node:$Rm, (fneg node:$Ra) and as (f32 (fsub (f32 (fneg FPR32:$Ra)), (f32 (fmul FPR32:$Rn, FPR32:$Rm)))) - fnmsub is (-Ra) + RnRm which should be matched as fma node:$Rn, node:$Rm, (fneg node:$Ra) and as (f32 (fsub (f32 (fmul FPR32:$Rn, FPR32:$Rm)), FPR32:$Ra)))) llvm-svn: 197928	2013-12-24 00:40:10 +00:00

1 2 3 4

185 Commits