Chad Rosier
75d1acea0b
[AArch64] Simplify the Neon Scalar3Same patterns for floating-point reciprocal
...
step, floating-point reciprocal square root step, floating-point absolute
difference, and integer/floating-point compare instructions. Also, move the
scalar general arithmetic operation patterns closer to similar code. No
functional change intended.
llvm-svn: 197250
2013-12-13 17:56:44 +00:00
Chad Rosier
7ff770d07b
[AArch64] Removed unnecessary copy patterns with v1fx types.
...
- Copy patterns with float/double types are enough.
- Fix typos in test case names that were using v1fx.
- There is no ACLE intrinsic that uses the v1f32 type, and there is no conflict
between NEON and non-NEON overlapped operations with this type, so there is no
need to support operations with this type.
- Remove v1f32 from FPR32 register and disallow v1f32 as a legal type for
operations.
Patch by Ana Pazos!
llvm-svn: 197159
2013-12-12 15:46:29 +00:00
Hao Liu
f67a904939
[AArch64] Fix the AArch64 backend's failure to select scalar_to_vector for vector types with more than one element.
...
llvm-svn: 197135
2013-12-12 07:36:26 +00:00
Chad Rosier
551789d294
[AArch64] Refactor NEON floating-point Max/Min/Maxnm/Minnm across vector AArch64
...
intrinsics to use f32 types, rather than their vector equivalents.
llvm-svn: 197090
2013-12-11 23:21:25 +00:00
Chad Rosier
c251a82254
[AArch64] Add NEON scalar floating-point compare LLVM AArch64 intrinsics that
...
use f32/f64 types, rather than their vector equivalents.
llvm-svn: 197068
2013-12-11 21:03:46 +00:00
Chad Rosier
0b1fef12e8
[AArch64] Refactor the NEON scalar floating-point reciprocal step and
...
floating-point reciprocal square root step LLVM AArch64 intrinsics to
use f32/f64 types, rather than their vector equivalents.
llvm-svn: 197067
2013-12-11 21:03:43 +00:00
Chad Rosier
43daaa765b
[AArch64] Refactor the NEON scalar floating-point reciprocal estimate, floating-
...
point reciprocal exponent, and floating-point reciprocal square root estimate
LLVM AArch64 intrinsics to use f32/f64 types, rather than their vector
equivalents.
llvm-svn: 197066
2013-12-11 21:03:40 +00:00
Kevin Qin
1453b4e3d7
[AArch64 NEON] Get instruction BSL matched to VSELECT.
...
llvm-svn: 196998
2013-12-11 02:33:50 +00:00
Chad Rosier
29ed5c4552
[AArch64] Refactor the NEON floating-point absolute difference LLVM AArch64
...
intrinsic to use f32/f64 types, rather than their vector equivalents.
llvm-svn: 196965
2013-12-10 21:33:59 +00:00
Chad Rosier
5394f9c916
[AArch64] Refactor the NEON signed/unsigned floating-point convert to fixed-point
...
LLVM AArch64 intrinsics to use f32/f64, rather than their vector equivalents.
llvm-svn: 196964
2013-12-10 21:33:56 +00:00
Chad Rosier
b2112dc6c3
[AArch64] Overload NEON signed/unsigned floating-point convert to fixed-point
...
and fixed-point convert to floating-point LLVM AArch64 intrinsics.
llvm-svn: 196963
2013-12-10 21:33:53 +00:00
Chad Rosier
3d7979609e
[AArch64] Overload NEON signed/unsigned integer convert to floating-point
...
LLVM AArch64 intrinsics.
llvm-svn: 196962
2013-12-10 21:33:50 +00:00
Chad Rosier
7e9f19f92d
[AArch64] Refactor the Neon vector/scalar floating-point convert intrinsics so
...
that they use float/double rather than the vector equivalents when appropriate.
llvm-svn: 196930
2013-12-10 16:11:39 +00:00
Chad Rosier
0b6c7be6f7
[AArch64] Refactor the Neon vector/scalar floating-point convert implementation.
...
Specifically, reuse the ARM intrinsics when possible.
llvm-svn: 196926
2013-12-10 15:35:33 +00:00
Kevin Qin
e00af17fe4
[AArch64 NEON] Replace fpimm with fpz32 for floating compare with zero.
...
This is a small change for strictness; it makes the pattern safer.
llvm-svn: 196889
2013-12-10 06:51:07 +00:00
Kevin Qin
746aa8a55e
[AArch64 NEON] Support poly128_t and implement relevant intrinsic.
...
llvm-svn: 196887
2013-12-10 06:48:35 +00:00
Chad Rosier
8ba851adda
[AArch64] Refactor the NEON scalar reduce pairwise intrinsics, so that they use
...
float/double rather than the vector equivalents when appropriate.
llvm-svn: 196833
2013-12-09 22:47:38 +00:00
Chad Rosier
a7872e4b5d
[AArch64] Refactor NEON scalar reduce pairwise front-end codegen to remove
...
unnecessary patterns in tablegen.
llvm-svn: 196832
2013-12-09 22:47:34 +00:00
Chad Rosier
850366132e
[AArch64] Remove q and non-q intrinsic definitions in the NEON scalar reduce
...
pairwise implementation, using an overloaded definition instead.
llvm-svn: 196831
2013-12-09 22:47:31 +00:00
Ana Pazos
171fb9a9de
Fix pattern match for movi with 0D result
...
Patch by Jiangning Liu.
With some test case changes:
- intrinsic test added to the existing /test/CodeGen/AArch64/neon-aba-abd.ll.
- New test cases to cover movi 1D scenario without using the intrinsic in
test/CodeGen/AArch64/neon-mov.ll.
llvm-svn: 196806
2013-12-09 19:29:14 +00:00
Hao Liu
050a186fd6
[AArch64]Add missing pair intrinsics such as:
...
int32_t vminv_s32(int32x2_t a)
which should be compiled into SMINP Vd.2S,Vn.2S,Vm.2S
llvm-svn: 196749
2013-12-09 03:51:42 +00:00
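The mapping described in this entry can be sketched with a small model of the pairwise-minimum semantics. This is an illustrative Python model only, not code from the commit; the function names are hypothetical, and it assumes both source operands of SMINP name the same register, so that lane 0 of the result holds the minimum of the two input lanes.

```python
from typing import List


def sminp(n: List[int], m: List[int]) -> List[int]:
    """Model of vector SMINP: signed minimum of adjacent pairs of n followed by m."""
    assert len(n) == len(m) and len(n) % 2 == 0
    joined = n + m  # the instruction reduces the concatenation of both sources
    return [min(joined[i], joined[i + 1]) for i in range(0, len(joined), 2)]


def vminv_s32(a: List[int]) -> int:
    """Hypothetical model of vminv_s32 on a 2-lane vector.

    Assumption: lowered as SMINP Vd.2S, Vn.2S, Vn.2S, then lane 0 is read.
    """
    assert len(a) == 2
    return sminp(a, a)[0]
```

For example, `vminv_s32([7, -3])` reduces the pair (7, -3) and returns -3.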
Ana Pazos
a6ef726750
Implemented vget/vset_lane_f16 intrinsics
...
llvm-svn: 196533
2013-12-05 21:07:49 +00:00
Kevin Qin
f5b717aa75
[AArch64 Neon] Add ACLE intrinsic vceqz_f64.
...
llvm-svn: 196362
2013-12-04 08:02:34 +00:00
Kevin Qin
f93a2e8673
[AArch64 NEON] Add missing compare intrinsics.
...
llvm-svn: 196360
2013-12-04 07:53:28 +00:00
Hao Liu
547dc86218
[AArch64]Add missing floating point convert, round and misc intrinsics.
...
E.g. int64x1_t vcvt_s64_f64(float64x1_t a) -> FCVTZS Dd, Dn
llvm-svn: 196210
2013-12-03 06:06:55 +00:00
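As a rough illustration of what the FCVTZS mapping in this entry computes, the sketch below models a double-to-signed-64-bit conversion that rounds toward zero, saturates out-of-range inputs, and maps NaN to 0. The function name `fcvtzs_d` is hypothetical, and the model ignores floating-point exception flags.

```python
import math

INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1


def fcvtzs_d(x: float) -> int:
    """Model of FCVTZS Dd, Dn: round toward zero, saturate, NaN becomes 0."""
    if math.isnan(x):
        return 0
    if x >= 2.0**63:  # includes +infinity
        return INT64_MAX
    if x <= -(2.0**63):  # includes -infinity
        return INT64_MIN
    return math.trunc(x)  # rounds toward zero
```

Under this model, `fcvtzs_d(-2.9)` yields -2, and an input such as `1e30` saturates to `INT64_MAX`.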
Hao Liu
f922fde3de
AArch64: add missing ACLE intrinsics mapping to general arithmetic operation from VFP instructions.
...
E.g. float64x1_t vadd_f64(float64x1_t a, float64x1_t b) -> FADD Dd, Dn, Dm.
llvm-svn: 196208
2013-12-03 05:58:30 +00:00
NAKAMURA Takumi
62ddd8c9e0
Whitespace.
...
llvm-svn: 196203
2013-12-03 05:28:27 +00:00
Hao Liu
fea9943555
AArch64: Add missing scalar pair intrinsics.
...
E.g. "float32_t vaddv_f32(float32x2_t a)" to be matched into "faddp s0, v1.2s".
llvm-svn: 196198
2013-12-03 03:39:47 +00:00
Jiangning Liu
24b3414579
Add some missing pattern matches for AArch64 Neon intrinsics like vuqadd_s64 and friends.
...
llvm-svn: 196192
2013-12-03 01:33:52 +00:00
Jiangning Liu
3f5f9eefd0
Add some missing pattern matches for AArch64 Neon intrinsics like vmull_high_n_s16 and friends.
...
llvm-svn: 196190
2013-12-03 01:29:32 +00:00
Chad Rosier
bcca7559f8
[AArch64] Implemented vcopy_lane patterns using scalar DUP instruction.
...
Patch by Ana Pazos!
llvm-svn: 196151
2013-12-02 21:05:16 +00:00
Hao Liu
b9fa1067c7
AArch64: The pattern match should check the range of the immediate value.
...
Otherwise we can generate illegal instructions.
E.g. shrn2 v0.4s, v1.2d, #35. The legal range should be in [1, 16].
llvm-svn: 195941
2013-11-29 02:11:22 +00:00
Jiangning Liu
afc7f71eb3
Add missing pattern for supporting intrinsic function vbsl_f64 with
...
a double-precision floating-point argument.
llvm-svn: 195938
2013-11-29 01:37:15 +00:00
Kevin Qin
b95721d200
[AArch64 NEON] Fix an assertion failure when disassembling the SHLL instruction.
...
llvm-svn: 195936
2013-11-29 01:29:16 +00:00
Jiangning Liu
d9270b7a51
Fix the AArch64 NEON bug exposed by checking constant integer argument range of ACLE intrinsics.
...
llvm-svn: 195843
2013-11-27 14:02:25 +00:00
Chad Rosier
ca062e81db
[AArch64] Add support for NEON scalar floating-point absolute difference.
...
llvm-svn: 195803
2013-11-27 01:45:58 +00:00
Chad Rosier
1337fcc721
[AArch64] Add support for NEON scalar floating-point to integer convert
...
instructions.
llvm-svn: 195788
2013-11-26 22:17:37 +00:00
Kevin Qin
1370a1e1ee
Refactored the implementation of AArch64 NEON instruction ZIP, UZP
...
and TRN.
Fix a bug triggered by mixed use of vget_high_u8() and vuzp_u8().
llvm-svn: 195716
2013-11-26 03:26:47 +00:00
Hao Liu
4c6cc894d2
Fix bugs in AArch64 load/store of vector types and in bitcasts between i64 and vector types.
...
e.g. "%tmp = load <2 x i64>* %ptr" can't be selected.
"%tmp = bitcast i64 %in to <2 x i32>" can't be selected.
llvm-svn: 195424
2013-11-22 08:47:22 +00:00
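The bitcast in the second example reinterprets the same 64 bits as two 32-bit lanes. A minimal sketch of that reinterpretation follows, assuming a little-endian target (so lane 0 holds the low 32 bits); the helper name is hypothetical and the code only models the value semantics, not instruction selection.

```python
import struct
from typing import Tuple


def bitcast_i64_to_v2i32(value: int) -> Tuple[int, int]:
    """Reinterpret a signed 64-bit integer as two signed 32-bit lanes.

    Assumption: little-endian layout, so lane 0 is the low half.
    """
    lane0, lane1 = struct.unpack("<2i", struct.pack("<q", value))
    return (lane0, lane1)
```

With this layout, `bitcast_i64_to_v2i32(0x0000000200000001)` gives lanes `(1, 2)`.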
Jiangning Liu
a50f9e81f3
For AArch64 back-end instruction selection, lower Neon_Lowxxx with EXTRACT_SUBREG.
...
llvm-svn: 195408
2013-11-22 02:45:13 +00:00
Ana Pazos
86d72bbede
Implemented Neon scalar vdup_lane intrinsics.
...
Fixed scalar dup alias and added test case.
llvm-svn: 195330
2013-11-21 08:16:15 +00:00
Ana Pazos
5ddc31e426
Implemented Neon scalar by element intrinsics.
...
Intrinsics implemented: vqdmull_lane, vqdmulh_lane, vqrdmulh_lane,
vqdmlal_lane, vqdmlsl_lane scalar Neon intrinsics.
llvm-svn: 195327
2013-11-21 07:37:04 +00:00
Hao Liu
b26dfe0306
Implement AArch64 neon instructions class SIMD lsone and SIMD lone-post.
...
llvm-svn: 195078
2013-11-19 02:17:05 +00:00
Jiangning Liu
42b7a215f4
Implement AArch64 SISD intrinsics for vget_high and vget_low.
...
llvm-svn: 195074
2013-11-19 01:46:48 +00:00
Kevin Qin
7b74269765
Implement MC layer of AArch64 NEON instructions PMULL and PMULL2 with 128-bit integer.
...
llvm-svn: 195072
2013-11-19 01:40:25 +00:00
Jiangning Liu
7c858f236d
Add predicate for AArch64 crypto instructions.
...
llvm-svn: 195071
2013-11-19 01:38:31 +00:00
Kevin Qin
eb2e892703
[AArch64 NEON]Add mov alias for simd copy instructions.
...
Set some unspecified bits of INS/DUP to zero, as the ARM ARM requires.
llvm-svn: 194996
2013-11-18 09:20:32 +00:00
Hao Liu
fcc294f3dd
Implement the newly added ACLE functions for ld1/st1 with 2/3/4 vectors.
...
The functions are like: vst1_s8_x2 ...
llvm-svn: 194990
2013-11-18 06:31:53 +00:00
Ana Pazos
b1568fd504
Implemented aarch64 Neon scalar vmulx_lane intrinsics
...
Implemented aarch64 Neon scalar vfma_lane intrinsics
Implemented aarch64 Neon scalar vfms_lane intrinsics
Implemented legacy vmul_n_f64, vmul_lane_f64, vmul_laneq_f64
intrinsics (v1f64 parameter type) using Neon scalar instructions.
Implemented legacy vfma_lane_f64, vfms_lane_f64,
vfma_laneq_f64, vfms_laneq_f64 intrinsics (v1f64 parameter type)
using Neon scalar instructions.
llvm-svn: 194888
2013-11-15 23:32:10 +00:00
Chad Rosier
6b1d577e71
[AArch64] Fix the scalar NEON ACLE functions so that they return float/double
...
rather than the vector equivalent.
llvm-svn: 194853
2013-11-15 21:28:10 +00:00