1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-24 03:33:20 +01:00
Commit Graph

298 Commits

Author SHA1 Message Date
Ana Pazos
9cdade7a3e [AArch64] Expanded sin, cos, pow with FP vector types inputs
llvm-svn: 201601
2014-02-18 20:31:05 +00:00
Jiangning Liu
9508c695c8 Fix a typo about lowering AArch64 va_copy.
llvm-svn: 201541
2014-02-18 02:37:42 +00:00
Nico Rieck
02b2fbee0d Fix broken CHECK lines
llvm-svn: 201479
2014-02-16 07:31:05 +00:00
Kevin Qin
fa58a631ae [AArch64 NEON] Fix a bug to avoid using floating type as condition type in lowering SELECT_CC.
llvm-svn: 201395
2014-02-14 09:41:15 +00:00
Hao Liu
022a50cb21 [AArch64]Fix the assertion failure caused by "v1i1 SETCC" DAG node.
As v1i1 is illegal, the type legalizer tries to scalarize such node. But if the type operands of SETCC is legal, the scalarization algorithm will cause an assertion failure.

llvm-svn: 201381
2014-02-14 02:21:56 +00:00
Rafael Espindola
71737927b7 Specify a triple. MachO AArch64 support is missing.
llvm-svn: 201335
2014-02-13 15:30:06 +00:00
Daniel Sanders
7a3a160940 Re-commit: Demote EmitRawText call in AsmPrinter::EmitInlineAsm() and remove hasRawTextSupport() call
Summary:
AsmPrinter::EmitInlineAsm() will no longer use the EmitRawText() call for
targets with mature MC support. Such targets will always parse the inline
assembly (even when emitting assembly). Targets without mature MC support
continue to use EmitRawText() for assembly output.

The hasRawTextSupport() check in AsmPrinter::EmitInlineAsm() has been replaced
with MCAsmInfo::UseIntegratedAs which when true, causes the integrated assembler
to parse inline assembly (even when emitting assembly output). UseIntegratedAs
is set to true for targets that consider any failure to parse valid assembly
to be a bug. Target specific subclasses generally enable the integrated
assembler in their constructor. The default value can be overridden with
-no-integrated-as.

All tests that rely on inline assembly supporting invalid assembly (for example,
those that use mnemonics such as 'foo' or 'hello world') have been updated to
disable the integrated assembler.

Changes since review (and last commit attempt):
- Fixed test failures that were missed due to configuration of local build.
  (fixes crash.ll and a couple others).
- Fixed tests that happened to pass because the local build was on X86
  (should fix 2007-12-17-InvokeAsm.ll)
- mature-mc-support.ll's should no longer require all targets to be compiled.
  (should fix ARM and PPC buildbots)
- Object output (-filetype=obj and similar) now forces the integrated assembler
  to be enabled regardless of default setting or -no-integrated-as.
  (should fix SystemZ buildbots)

Reviewers: rafael

Reviewed By: rafael

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D2686

llvm-svn: 201333
2014-02-13 14:44:26 +00:00
NAKAMURA Takumi
74e8d4a7db llvm/test/CodeGen/AArch64/cpus.ll: Tweak to use -mtriple=aarch64-unknown-unknown, or this would crash for targeting pecoff like *-mingw32.
llvm-svn: 201315
2014-02-13 11:06:23 +00:00
Oliver Stannard
f7cc40a705 Add Cortex-A53 and Cortex-A57 cores to the AArch64 backend
llvm-svn: 201305
2014-02-13 09:46:11 +00:00
Hao Liu
386fc0d8ae [AArch64]Fix the problems that can't select mul/add/sub of v1i8/v1i16/v1i32 types.
As this problems are similar to shl/sra/srl, also add patterns for shift nodes.

llvm-svn: 201298
2014-02-13 05:42:33 +00:00
Hao Liu
ee04163cfe [AArch64]Add support for spilling FPR8/FPR16.
llvm-svn: 201287
2014-02-13 02:36:58 +00:00
Daniel Sanders
656c4d360b Revert r201237+r201238: Demote EmitRawText call in AsmPrinter::EmitInlineAsm() and remove hasRawTextSupport() call
It introduced multiple test failures in the buildbots.

llvm-svn: 201241
2014-02-12 15:39:20 +00:00
Daniel Sanders
e647d6441b Demote EmitRawText call in AsmPrinter::EmitInlineAsm() and remove hasRawTextSupport() call
Summary:
AsmPrinter::EmitInlineAsm() will no longer use the EmitRawText() call for targets with mature MC support. Such targets will always parse the inline assembly (even when emitting assembly). Targets without mature MC support continue to use EmitRawText() for assembly output.

The hasRawTextSupport() check in AsmPrinter::EmitInlineAsm() has been replaced with MCAsmInfo::UseIntegratedAs which when true, causes the integrated assembler to parse inline assembly (even when emitting assembly output). UseIntegratedAs is set to true for targets that consider any failure to parse valid assembly to be a bug. Target specific subclasses generally enable the integrated assembler in their constructor. The default value can be overridden with -no-integrated-as.

All tests that rely on inline assembly supporting invalid assembly (for example, those that use mnemonics such as 'foo' or 'hello world') have been updated to disable the integrated assembler.

Reviewers: rafael

Reviewed By: rafael

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D2686

llvm-svn: 201237
2014-02-12 14:44:54 +00:00
Hao Liu
636db9c0e6 [AArch64]Implement the copy of two FPR8 registers by using FMOVss of two FPR32 registers in copyPhysReg.
llvm-svn: 201061
2014-02-10 03:16:22 +00:00
Tim Northover
cf3928b4f7 ARM & AArch64: merge NEON absolute compare intrinsics
There was an extremely confusing proliferation of LLVM intrinsics to implement
the vacge & vacgt instructions. This combines them all into two polymorphic
intrinsics, shared across both backends.

llvm-svn: 200768
2014-02-04 14:55:42 +00:00
Tim Northover
0b6ea5de72 AArch64 & ARM: refactor crypto intrinsics to take scalars
Some of the SHA instructions take a scalar i32 as one argument (largely because
they work on 160-bit hash fragments). This wasn't reflected in the IR
previously, with ARM and AArch64 choosing different types (<4 x i32> and <1 x
i32> respectively) which was ugly.

This makes all the affected intrinsics take a uniform "i32", allowing them to
become non-polymorphic at the same time.

llvm-svn: 200706
2014-02-03 17:27:49 +00:00
Chad Rosier
156f3a2a96 [AArch64] Custom lower concat_vector patterns with v4i16, v4i32, v8i8, v8i16, v16i8 types.
llvm-svn: 200491
2014-01-30 21:46:54 +00:00
Kevin Qin
379441a4e6 [AArch64 NEON] Lower SELECT_CC with vector operand.
When the scalar compare is between floating point and operands are
vector, we custom lower SELECT_CC to use NEON SIMD compare for
generating less instructions.

llvm-svn: 200365
2014-01-29 01:57:30 +00:00
Kevin Qin
436aae7633 [AArch64 NEON] Try to generate CONCAT_VECTOR when lowering BUILD_VECTOR or SHUFFLE_VECTOR.
Replace r199791.

llvm-svn: 200180
2014-01-27 02:53:54 +00:00
Kevin Qin
d83dee8270 Revert r199791.
It's old version which has some bugs. I'll commit lattest patch soon.

llvm-svn: 200179
2014-01-27 02:53:41 +00:00
Jiangning Liu
5ac0a5db29 Improve pattern match from v1i8 to v1i32 for AArch64 Neon.
llvm-svn: 200119
2014-01-26 04:55:53 +00:00
Jiangning Liu
8a0b567fb9 Implement pattern match from v1xx to v1xx for AArch64 Neon.
llvm-svn: 200113
2014-01-26 03:27:40 +00:00
Kevin Qin
ef4cd4a730 [AArch64 NEON] Add patterns for concat_vector on v2i32.
llvm-svn: 200111
2014-01-26 02:46:15 +00:00
Kevin Qin
db813b2f82 [AArch64 NEON] Add test case for vector FP_ROUND.
llvm-svn: 200110
2014-01-26 02:23:33 +00:00
Ana Pazos
0a0875b43a [AArch64] Removed unused i8 type from FPR8 register class.
The i8 type is not registered with any register class.
This causes a segmentation fault in MachineLICM::getRegisterClassIDAndCost.

The code selects the first type associated with register class FPR8,
which happens to be i8.
It uses this type (i8) to get the representative class pointer, which is 0.
It then uses this pointer to access a field, resulting in segmentation fault.

Since i8 type is not being used for printing any neon instruction
we can safely remove it.

llvm-svn: 200046
2014-01-24 22:36:53 +00:00
Kevin Qin
3282007e08 [AArch64 NEON] Fix a bug in implementing register copy bwtween FPR16.
llvm-svn: 199978
2014-01-24 07:53:04 +00:00
Ana Pazos
5fdec23c84 [AArch64] Added vselect patterns with float and double types
llvm-svn: 199925
2014-01-23 19:18:57 +00:00
Hao Liu
80b39e0b02 [AArch64]Add CHECK for two test cases testing scalar_to_vector committed in r199461.
llvm-svn: 199861
2014-01-23 02:09:30 +00:00
Kevin Qin
9a631f3af4 [AArch64 NEON] Try to generate CONCAT_VECTOR when lowering BUILD_VECTOR or SHUFFLE_VECTOR.
llvm-svn: 199791
2014-01-22 06:11:03 +00:00
Kevin Qin
d925e0a953 [AArch64 NEON] Fix a bug caused by undef lane when generating VEXT.
It was commited as r199628 but reverted in r199628 as causing
regression test failed. It's because of old vervsion of patch
I used to commit. Sorry for mistake.

llvm-svn: 199704
2014-01-21 01:48:52 +00:00
Chandler Carruth
f3546bc541 Revert r199628: "[AArch64 NEON] Fix a bug caused by undef lane when generating VEXT."
This test fails the newly added regression tests.

llvm-svn: 199631
2014-01-20 08:18:01 +00:00
Kevin Qin
a2c8e30bce [AArch64 NEON] Fix a bug caused by undef lane when generating VEXT.
llvm-svn: 199628
2014-01-20 07:32:26 +00:00
Kevin Qin
e739fc1b8e [AArch64 NEON] Expand vector for UDIV/SDIV/UREM/SREM/FREM as neon doesn't support these operations.
llvm-svn: 199485
2014-01-17 09:54:30 +00:00
Hao Liu
f1036ed220 [AArch64]Fix the problem can't select f16_to_f32 and f32_to_f16.
Also add copy support for FPR16.
Also add a missing test case file belongs to commit r197361.

llvm-svn: 199463
2014-01-17 06:23:30 +00:00
Kevin Qin
71a9ad96db [AArch64 NEON] Custom lower conversion between vector integer and vector floating point if element bit-width doesn't match.
llvm-svn: 199462
2014-01-17 05:52:35 +00:00
Hao Liu
96315c1088 [AArch64]Fix the problem can't select concat_vectors of two v1i32 types.
Also fix the problem can't select scalar_to_vector from f32 to v2f32/v4f32.

llvm-svn: 199461
2014-01-17 05:44:46 +00:00
Jiangning Liu
ff1e0e1ce3 For AArch64, lowering sext_inreg and generate optimized code by using SXTL.
llvm-svn: 199296
2014-01-15 05:08:01 +00:00
Tim Northover
3f497bbb76 AArch64: don't try to handle [SU]MUL_LOHI nodes
We should set them to expand for now since there are no patterns
dealing with them. Actually, there are no instructions either so I
doubt they'll ever be acceptable.

llvm-svn: 199265
2014-01-14 22:53:22 +00:00
Rafael Espindola
9ae2f1aa3d Revert "[AArch64] Added vselect patterns with float and double types"
This reverts commit r199242.

It is causing CodeGen/AArch64/neon-bsl.ll to fail.

llvm-svn: 199248
2014-01-14 19:24:08 +00:00
Ana Pazos
51fd756e4b [AArch64] Added vselect patterns with float and double types
llvm-svn: 199242
2014-01-14 18:45:48 +00:00
Andrea Di Biagio
c159ef589c [AArch64] Fix assertion failure caused by an invalid comparison between APInt values.
APInt only knows how to compare values with the same BitWidth and asserts
in all other cases.

With this fix, function PerformORCombine does not use the APInt equality
operator if the APInt values returned by 'isConstantSplat' differ in BitWidth.
In that case they are different and no comparison is needed.

llvm-svn: 199119
2014-01-13 16:51:00 +00:00
Kevin Qin
5aa184711d [AArch64 NEON] Add missing patterns for bitcast from or to v1f64
llvm-svn: 199070
2014-01-13 01:58:38 +00:00
Kevin Qin
9b14d101ea [AArch64 NEON] Add more scenarios to use perm instructions when lowering shuffle_vector
This patch covered 2 more scenarios:

1.  Two operands of shuffle_vector are the same, like
%shuffle.i = shufflevector <8 x i8> %a, <8 x i8> %a, <8 x i32> <i32 0, i32 2, i32 4, i32 6, i32 8, i32 10, i32 12, i32 14>

2. One of operands is undef, like
%shuffle.i = shufflevector <8 x i8> %a, <8 x i8> undef, <8 x i32> <i32 0, i32 2, i32 4, i32 6, i32 8, i32 10, i32 12, i32 14>

After this patch, perm instructions will have chance to be emitted instead of lots of INS.

llvm-svn: 199069
2014-01-13 01:56:29 +00:00
Nico Rieck
884ee061f2 Fix non-deterministic SDNodeOrder-dependent codegen
Reset SelectionDAGBuilder's SDNodeOrder to ensure deterministic code
generation.

llvm-svn: 199050
2014-01-12 14:09:17 +00:00
Kristof Beyls
082ab7548c Make sure -use-init-array has intended effect on all AArch64 ELF targets, not just linux.
llvm-svn: 198937
2014-01-10 13:41:49 +00:00
Andrea Di Biagio
946f61e227 Teach the DAGCombiner how to fold 'vselect' dag nodes according
to the following two rules:
  1) fold (vselect (build_vector AllOnes), A, B) -> A
  2) fold (vselect (build_vector AllZeros), A, B) -> B

llvm-svn: 198777
2014-01-08 18:33:04 +00:00
Kevin Qin
fd4df4bd7a [AArch64 NEON] Fix generating incorrect value type of NEON_VDUPLANE
when lower build_vector if result value type mismatch with operand
value type.

llvm-svn: 198743
2014-01-08 08:06:14 +00:00
Hao Liu
2324cab69b [AArch64]Add support to spill/fill D tuples such as DPair/DTriple/DQuad. There is no test cases for D tuple as the original test cases are too large. As the spill/fill of the D tuple is similar to the Q tuple, the correctness can be guaranteed.
llvm-svn: 198684
2014-01-07 10:50:43 +00:00
Hao Liu
bbb265cfee [AArch64]Add support to copy D tuples such as DPair/DTriple/DQuad and Q tuples such as QPair/QTriple/QQuad. There is no test case for D tuple as the original test cases are too large. As the copy of the D tuple is similar to the Q tuple, the correctness can be guaranteed.
llvm-svn: 198682
2014-01-07 10:00:03 +00:00
Kevin Qin
cb6af368ab [AArch64 NEON] Fixed incorrect immediate used in BIC instruction.
llvm-svn: 198675
2014-01-07 05:10:47 +00:00
Kevin Qin
162c44e325 [AArch64 NEON] Fix invalid constant used in vselect condition.
There is a wrong assumption that the vector element type and the
type of each ConstantSDNode in the build_vector were the same.
However, when promoting the integer operand of a legally typed
build_vector, the operand type and the vector element type do not
need to be the same
(See method 'DAGTypeLegalizer::PromoteIntOp_BUILD_VECTOR' in
LegalizeIntegerTypes.cpp).

  in AArch64 backend, the following dag sequence:

  C0: i1 = Constant<0>
  C1: i1 = Constant<-1>
  V: v8i1 = BUILD_VECTOR C1, C1, C0, C0, C0, C0, C0, C0

  is type-legalized into:

  NewC0: i32 = Constant<0>
  NewC1: i32 = Constant<1>
  V: v8i8 = BUILD_VECTOR NewC1, NewC1, NewC0, NewC0, NewC0, NewC0, NewC0, NewC0

Forcing a getZeroExtend to VTBits to ensure that the new constant
is correctly.

llvm-svn: 198582
2014-01-06 02:26:10 +00:00
Jiangning Liu
583b8a7116 For AArch64 Neon, simplify scalar dup by lane0 for fp.
llvm-svn: 198194
2013-12-30 02:44:35 +00:00
Hao Liu
ab32d54fad [AArch64]Add code to spill/fill Q register tuples such as QPair/QTriple/QQuad.
llvm-svn: 198193
2013-12-30 02:38:12 +00:00
Hao Liu
8bef865160 [AArch64]Can't select shift left 0 of type v1i64
llvm-svn: 198192
2013-12-30 02:12:46 +00:00
Kevin Qin
cbb0be4bee Fix a bug in DAGcombiner about zero-extend after setcc.
For AArch64 backend, if DAGCombiner see "sext(setcc)", it will
combine them together to a single setcc with extended value type.
Then if it see "zext(setcc)", it assumes setcc is Vxi1, and try to
create "(and (vsetcc), (1, 1, ...)". While setcc isn't Vxi1,
DAGcombiner will create wrong node and get wrong code emitted.

llvm-svn: 198190
2013-12-30 02:05:13 +00:00
Hao Liu
e8d49c2088 [AArch64]Fix the problem that can't select mul of v1i64/v2i64 types.
E.g. Can't select such IR:
     %tmp = mul <2 x i64> %a, %b

llvm-svn: 198188
2013-12-30 01:38:41 +00:00
Hao Liu
8ed49e0c42 [AArch64]Fix a problem that the register order of fmls/fmla by element is incorrect.
E.g. the codegen result is 
     fmls v1.2s, v0.2s, v2.s[3]
which is expected to be
     fmls v0.2s, v1.2s, v2.s[3]

llvm-svn: 198001
2013-12-25 07:12:34 +00:00
Jiangning Liu
ca1d69d4c2 Add missing pattern matches to support ACLE intrinsics of AArch64 NEON.
llvm-svn: 197993
2013-12-25 01:22:51 +00:00
Hao Liu
8ef969c4a0 [AArch64]Add patterns to match normal shift nodes: shl, sra and srl.
llvm-svn: 197969
2013-12-24 09:00:21 +00:00
Kevin Qin
3993f1cd71 [AArch64 NEON] Fix a bug when lowering BUILD_VECTOR.
DAG.getVectorShuffle() doesn't always return a vector_shuffle node.
If mask is the exact sequence of it's operand(For example, operand_0
is v8i8, and  the mask is 0, 1, 2, 3, 4, 5, 6, 7), it will directly
return that operand. So a check is added here.

llvm-svn: 197967
2013-12-24 08:16:06 +00:00
Kevin Qin
8f86911897 [AArch64 NEON] Fix a pattern match failure with NEON_VDUP.
This failure caused by improper condition when lowering shuffle_vector
to scalar_to_vector. After this patch NEON_VDUP with v1i64 will not
be generated.

llvm-svn: 197966
2013-12-24 08:11:47 +00:00
Ana Pazos
85f191fc73 [AArch64] Check fmul node single use in fused multiply patterns
Check for single use of fmul node in fused multiply patterns
to allow generation of fused multiply add/sub instructions.
Otherwise fmul operation ends up being repeated more than
once which does not help peformance on targets with
only one MAC unit, as for example cortex-a53.

llvm-svn: 197929
2013-12-24 00:47:29 +00:00
Ana Pazos
8821a9ef6b [AArch64 NEON] Fixed fused multiply negate add/sub patterns
The correct pattern matching should be:

- fnmadd is (-Ra) + (-Rn)*Rm  which should be matched as:

  fma (fneg node:$Rn),  node:$Rm, (fneg node:$Ra) and as

  (f32 (fsub (f32 (fneg FPR32:$Ra)), (f32 (fmul FPR32:$Rn, FPR32:$Rm))))

- fnmsub is (-Ra) + Rn*Rm which should be matched as

  fma node:$Rn,  node:$Rm, (fneg node:$Ra) and as

  (f32 (fsub (f32 (fmul FPR32:$Rn, FPR32:$Rm)), FPR32:$Ra))))

llvm-svn: 197928
2013-12-24 00:40:10 +00:00
Hao Liu
3ae1e13884 [AArch64]The compare to zero intrinsics should be implemented by 'icmp/fcmp' and 'sext' not 'zext'. Modify the test cases.
llvm-svn: 197897
2013-12-23 02:42:10 +00:00
Kevin Qin
99ae282f19 [AArch64 NEON]Implment loading vector constant form constant pool.
llvm-svn: 197551
2013-12-18 06:26:04 +00:00
Hao Liu
81b69b5ce1 [AArch64]Fix the pattern match failure for v1i8/v1i16/v1i32 types.
Currently we have such types as legal vector types. The DAG combiner may generate some DAG nodes having such types but we don't have patterns to match them.
E.g. a load i32 and a bitcast i32 to v1i32 will be combined into a load v1i32:
     bitcast (load i32) to v1i32 -> load v1i32.
So this patch fixes such problems for load/dup instructions.
If v1i8/v1i16/v1i32 are not legal any more, the code in this patch can be deleted. So I also add some FIXME.

llvm-svn: 197361
2013-12-16 02:51:28 +00:00
Chad Rosier
7ff770d07b [AArch64] Removed unnecessary copy patterns with v1fx types.
- Copy patterns with float/double types are enough.
- Fix typos in test case names that were using v1fx.
- There is no ACLE intrinsic that uses v1f32 type.  And there is no conflict of
  neon and non-neon ovelapped operations with this type, so there is no need to
  support operations with this type.
- Remove v1f32 from FPR32 register and disallow v1f32 as a legal type for
  operations.

Patch by Ana Pazos!

llvm-svn: 197159
2013-12-12 15:46:29 +00:00
Hao Liu
f67a904939 [AArch64]Fix the problem that AArch64 backend fails to select scalar_to_vector of vector types having more than one element.
llvm-svn: 197135
2013-12-12 07:36:26 +00:00
Kevin Qin
18b9563375 Fix Incorrect CHECK message [0-31]+ in test case.
In regular expression, [0-31]+ equals to [0-3]+, not the number from
0 to 31. So change it to [0-9]+.

llvm-svn: 197113
2013-12-12 02:19:13 +00:00
Quentin Colombet
a4f824b0d6 Fix an over-constrained assertion in MachineFunction::addLiveIn.
The assertion was checking that the virtual register VReg used to represent the
physical register PReg uses the same register class as the one passed to
MachineFunction::addLiveIn.
This is over-constraining because it is sufficient to check that the register
class of VReg (VRegRC) is a subclass of the register class of PReg (PRegRC) and
that VRegRC contains PReg.
Indeed, if VReg gets constrained because of some operation constraints
between two calls of MachineFunction::addLiveIn, the original assertion
cannot match.

This fixes <rdar://problem/15633429>. 

llvm-svn: 197097
2013-12-12 00:15:47 +00:00
Chad Rosier
551789d294 [AArch64] Refactor NEON floating-point Max/Min/Maxnm/Minnm across vector AArch64
intrinsics to use f32 types, rather than their vector equivalents.

llvm-svn: 197090
2013-12-11 23:21:25 +00:00
Chad Rosier
c251a82254 [AArch64] Add NEON scalar floating-point compare LLVM AArch64 intrinsics that
use f32/f64 types, rather than their vector equivalents.

llvm-svn: 197068
2013-12-11 21:03:46 +00:00
Chad Rosier
0b1fef12e8 [AArch64] Refactor the NEON scalar floating-point reciprocal step and
floating-point reciprocal square root step LLVM AArch64 intrinsics to
use f32/f64 types, rather than their vector equivalents.

llvm-svn: 197067
2013-12-11 21:03:43 +00:00
Chad Rosier
43daaa765b [AArch64] Refactor the NEON scalar floating-point reciprocal estimate, floating-
point reciprocal exponent, and floating-point reciprocal square root estimate
LLVM AArch64 intrinsics to use f32/f64 types, rather than their vector
equivalents.

llvm-svn: 197066
2013-12-11 21:03:40 +00:00
Chad Rosier
29ed5c4552 [AArch64] Refactor the NEON floating-point absolute difference LLVM AArch64
intrinsic to use f32/f64 types, rather than their vector equivalents.

llvm-svn: 196965
2013-12-10 21:33:59 +00:00
Chad Rosier
5394f9c916 [AArch64] Refactor the NEON signed/unsigned floating-point convert to fixed-point
LLVM AArch64 intrinsics to use f32/f64, rather than their vector equivalents.

llvm-svn: 196964
2013-12-10 21:33:56 +00:00
Chad Rosier
b2112dc6c3 [AArch64] Overload NEON signed/unsigned floating-point convert to fixed-point
and fixed-point convert to floating-point LLVM AArch64 intrinsics.

llvm-svn: 196963
2013-12-10 21:33:53 +00:00
Chad Rosier
3d7979609e [AArch64] Overload NEON signed/unsigned integer convert to floating-point
LLVM AArch64 intrinsics.

llvm-svn: 196962
2013-12-10 21:33:50 +00:00
Chad Rosier
7e9f19f92d [AArch64] Refactor the Neon vector/scalar floating-point convert intrinsics so
that they use float/double rather than the vector equivalents when appropriate.

llvm-svn: 196930
2013-12-10 16:11:39 +00:00
Chad Rosier
0b6c7be6f7 [AArch64] Refactor the Neon vector/scalar floating-point convert implementation.
Specifically, reuse the ARM intrinsics when possible.

llvm-svn: 196926
2013-12-10 15:35:33 +00:00
Kevin Qin
746aa8a55e [AArch64 NEON] Support poly128_t and implement relevant intrinsic.
llvm-svn: 196887
2013-12-10 06:48:35 +00:00
Chad Rosier
8ba851adda [AArch64] Refactor the NEON scalar reduce pairwise intrinsics, so that they use
float/double rather than the vector equivalents when appropriate.

llvm-svn: 196833
2013-12-09 22:47:38 +00:00
Chad Rosier
a7872e4b5d [AArch64] Refactor NEON scalar reduce pairwise front-end codegen to remove
unnecessary patterns in tablegen.

llvm-svn: 196832
2013-12-09 22:47:34 +00:00
Chad Rosier
850366132e [AArch64] Remove q and non-q intrinsic definitions in the NEON scalar reduce
pairwise implementation, using an overloaded definition instead.

llvm-svn: 196831
2013-12-09 22:47:31 +00:00
Ana Pazos
171fb9a9de Fix pattern match for movi with 0D result
Patch by Jiangning Liu.

With some test case changes:
- intrinsic test added to the existing /test/CodeGen/AArch64/neon-aba-abd.ll.
- New test cases to cover movi 1D scenario without using the intrinsic in
test/CodeGen/AArch64/neon-mov.ll.

llvm-svn: 196806
2013-12-09 19:29:14 +00:00
Hao Liu
050a186fd6 [AArch64]Add missing pair intrinsics such as:
int32_t vminv_s32(int32x2_t a)
which should be compiled into SMINP Vd.2S,Vn.2S,Vm.2S

llvm-svn: 196749
2013-12-09 03:51:42 +00:00
Hao Liu
31c452a955 [AArch64]Pattern match failures for truncate store and extend load
llvm-svn: 196748
2013-12-09 03:34:08 +00:00
Jiangning Liu
7825595e77 For AArch64, add missing register cost calculation for big value types like v4i64 and v8i64.
llvm-svn: 196456
2013-12-05 02:12:01 +00:00
Kevin Qin
f5b717aa75 [AArch64 Neon] Add ACLE intrinsic vceqz_f64.
llvm-svn: 196362
2013-12-04 08:02:34 +00:00
Kevin Qin
f93a2e8673 [AArch64 NEON] Add missing compare intrinsics.
llvm-svn: 196360
2013-12-04 07:53:28 +00:00
Hao Liu
547dc86218 [AArch64]Add missing floating point convert, round and misc intrinsics.
E.g. int64x1_t vcvt_s64_f64(float64x1_t a) -> FCVTZS Dd, Dn

llvm-svn: 196210
2013-12-03 06:06:55 +00:00
Hao Liu
f922fde3de AArch64: add missing ACLE intrinsics mapping to general arithmetic operation from VFP instructions.
E.g. float64x1_t vadd_f64(float64x1_t a, float64x1_t b) -> FADD Dd, Dn, Dm.

llvm-svn: 196208
2013-12-03 05:58:30 +00:00
Hao Liu
fea9943555 AArch64: Add missing scalar pair intrinsics.
E.g. "float32_t vaddv_f32(float32x2_t a)" to be matched into "faddp s0, v1.2s".

llvm-svn: 196198
2013-12-03 03:39:47 +00:00
Jiangning Liu
24b3414579 Add some missing pattern matches for AArch64 Neon intrinsics like vuqadd_s64 and friends.
llvm-svn: 196192
2013-12-03 01:33:52 +00:00
Jiangning Liu
3f5f9eefd0 Add some missing pattern matches for AArch64 Neon intrinsics like vmull_high_n_s16 and friends.
llvm-svn: 196190
2013-12-03 01:29:32 +00:00
Chad Rosier
bcca7559f8 [AArch64] Implemented vcopy_lane patterns using scalar DUP instruction.
Patch by Ana Pazos!

llvm-svn: 196151
2013-12-02 21:05:16 +00:00
Hao Liu
b9fa1067c7 AArch64: The pattern match should check the range of the immediate value.
Or we can generate some illegal instructions.
E.g. shrn2 v0.4s, v1.2d, #35. The legal range should be in [1, 16].

llvm-svn: 195941
2013-11-29 02:11:22 +00:00
Jiangning Liu
844201423a Add missing test case for bsl_f64 support of AArch64 NEON.
llvm-svn: 195939
2013-11-29 01:38:08 +00:00
Jiangning Liu
d9270b7a51 Fix the AArch64 NEON bug exposed by checking constant integer argument range of ACLE intrinsics.
llvm-svn: 195843
2013-11-27 14:02:25 +00:00
Chad Rosier
ca062e81db [AArch64] Add support for NEON scalar floating-point absolute difference.
llvm-svn: 195803
2013-11-27 01:45:58 +00:00
Chad Rosier
1337fcc721 [AArch64] Add support for NEON scalar floating-point to integer convert
instructions.

llvm-svn: 195788
2013-11-26 22:17:37 +00:00
Kevin Qin
1370a1e1ee Refactored the implementation of AArch64 NEON instruction ZIP, UZP
and TRN.
Fix a bug when mixed use of vget_high_u8() and vuzp_u8().

llvm-svn: 195716
2013-11-26 03:26:47 +00:00
Kevin Qin
95c8b28223 [AArch64]Implement 128 bit register copy with NEON.
llvm-svn: 195713
2013-11-26 02:33:42 +00:00
Hao Liu
4c6cc894d2 Fix the bugs about AArch64 Load/Store vector types and bitcast between i64 and vector types.
e.g. "%tmp = load <2 x i64>* %ptr" can't be selected. 
     "%tmp = bitcast i64 %in to <2 x i32>" can't be selected.

llvm-svn: 195424
2013-11-22 08:47:22 +00:00
Jiangning Liu
a50f9e81f3 For AArch64 back-end instruction selection, lower Neon_Lowxxx with EXTRCT_SUBREG.
llvm-svn: 195408
2013-11-22 02:45:13 +00:00
Ana Pazos
86d72bbede Implemented Neon scalar vdup_lane intrinsics.
Fixed scalar dup alias and added test case.

llvm-svn: 195330
2013-11-21 08:16:15 +00:00
Ana Pazos
5ddc31e426 Implemented Neon scalar by element intrinsics.
Intrinsics implemented: vqdmull_lane, vqdmulh_lane, vqrdmulh_lane,
vqdmlal_lane, vqdmlsl_lane scalar Neon intrinsics.

llvm-svn: 195327
2013-11-21 07:37:04 +00:00
Hao Liu
b26dfe0306 Implement AArch64 neon instructions class SIMD lsone and SIMD lone-post.
llvm-svn: 195078
2013-11-19 02:17:05 +00:00
Jiangning Liu
42b7a215f4 Implement AArch64 SISD intrinsics for vget_high and vget_low.
llvm-svn: 195074
2013-11-19 01:46:48 +00:00
Jiangning Liu
7c858f236d Add predicate for AArch64 crypto instructions.
llvm-svn: 195071
2013-11-19 01:38:31 +00:00
Hao Liu
fcc294f3dd Implement the newly added ACLE functions for ld1/st1 with 2/3/4 vectors.
The functions are like: vst1_s8_x2 ...

llvm-svn: 194990
2013-11-18 06:31:53 +00:00
Ana Pazos
b1568fd504 Implemented aarch64 Neon scalar vmulx_lane intrinsics
Implemented aarch64 Neon scalar vfma_lane intrinsics
Implemented aarch64 Neon scalar vfms_lane intrinsics

Implemented legacy vmul_n_f64, vmul_lane_f64, vmul_laneq_f64
intrinsics (v1f64 parameter type) using Neon scalar instructions.

Implemented legacy vfma_lane_f64, vfms_lane_f64,
vfma_laneq_f64, vfms_laneq_f64 intrinsics (v1f64 parameter type)
using Neon scalar instructions.

llvm-svn: 194888
2013-11-15 23:32:10 +00:00
Chad Rosier
6b1d577e71 [AArch64] Fix the scalar NEON ACLE functions so that they return float/double
rather than the vector equivalent.

llvm-svn: 194853
2013-11-15 21:28:10 +00:00
Kevin Qin
0b4fc92580 Add test case for AArch64 NEON instruction set misc.
llvm-svn: 194673
2013-11-14 06:45:17 +00:00
Kevin Qin
47a3b639e3 Implement aarch64 neon instruction class SIMD misc.
llvm-svn: 194656
2013-11-14 02:44:13 +00:00
Jiangning Liu
5a9b5605ba Implement AArch64 NEON instruction set AdvSIMD (table).
llvm-svn: 194648
2013-11-14 01:57:32 +00:00
Chad Rosier
fae5b22550 [AArch64] Add support for legacy AArch32 NEON scalar shift by immediate
instructions.  This patch does not include the shift right and accumulate
instructions.  A number of non-overloaded intrinsics have been remove in favor
of their overloaded counterparts.

llvm-svn: 194598
2013-11-13 20:05:37 +00:00
Chad Rosier
8d7ebe36dd [AArch64] The shift right/left and insert immediate builtins expect 3
source operands, a vector, an element to insert, and a shift amount.

llvm-svn: 194406
2013-11-11 19:11:11 +00:00
Chad Rosier
4848250116 [AArch64] Add support for NEON scalar floating-point convert to fixed-point instructions.
llvm-svn: 194394
2013-11-11 18:04:07 +00:00
Jiangning Liu
9c0eb8e7ba Implement AArch64 Neon instruction set Perm.
llvm-svn: 194123
2013-11-06 03:35:27 +00:00
Jiangning Liu
1cdd311f06 Implement AArch64 Neon instruction set Bitwise Extract.
llvm-svn: 194118
2013-11-06 02:25:49 +00:00
Jiangning Liu
59b8117b0b Implement AArch64 Neon Crypto instruction classes AES, SHA, and 3 SHA.
llvm-svn: 194085
2013-11-05 17:42:05 +00:00
Hao Liu
386d8dd5a6 Implement AArch64 post-index vector load/store multiple N-element structure class SIMD(lselem-post).
Including following 14 instructions:
4 ld1 insts: post-index load multiple 1-element structure to sequential 1/2/3/4 registers.
ld2/ld3/ld4: post-index load multiple N-element structure to sequential N registers (N=2,3,4).
4 st1 insts: post-index store multiple 1-element structure from sequential 1/2/3/4 registers.
st2/st3/st4: post-index store multiple N-element structure from sequential N registers (N = 2,3,4).

llvm-svn: 194043
2013-11-05 03:39:32 +00:00
Kevin Qin
63fa5c1ef6 Implemented aarch64 neon intrinsic vcopy_lane with float type.
llvm-svn: 194041
2013-11-05 02:03:59 +00:00
Tim Northover
6110ffc9ca AArch64: use default asm operand printing when modifier inapplicable
If an inline assembly operand has multiple constraints (e.g. "Ir" for immediate
or register) and an operand modifier (E.g. "w" for "print register as wN") then
we need to decide behaviour when the modifier doesn't apply to the constraint.

Previousely produced some combination of an assertion failure and a fatal
error. GCC's behaviour appears to be to ignore the modifier and print the
operand in the default way. This patch should implement that.

llvm-svn: 194024
2013-11-04 23:04:07 +00:00
Chad Rosier
fd7dc7524c [AArch64] Add support for NEON scalar fixed-point convert to floating-point instructions.
llvm-svn: 193816
2013-10-31 22:36:59 +00:00
Chad Rosier
aea5ba449f [AArch64] Add support for NEON scalar shift immediate instructions.
llvm-svn: 193790
2013-10-31 19:28:44 +00:00
Amara Emerson
ce9bb052e5 [AArch64] Make the use of FP instructions optional, but enabled by default.
This adds a new subtarget feature called FPARMv8 (implied by NEON), and
predicates the support of the FP instructions and registers on this feature.

llvm-svn: 193739
2013-10-31 09:32:11 +00:00
Chad Rosier
02e430c891 [AArch64] Add support for NEON scalar floating-point compare instructions.
llvm-svn: 193691
2013-10-30 15:19:37 +00:00
Weiming Zhao
6ef9f618fc add test cases for frameaddr and returnaddr for aarch64
llvm-svn: 193626
2013-10-29 17:01:29 +00:00
Tim Northover
30527a23a1 AArch64: add 'a' inline asm operand modifier
This is used in the Linux kernel, and effectively just means "print an
address".

llvm-svn: 193593
2013-10-29 08:22:33 +00:00
Rafael Espindola
5f82bc3329 Convert another llc -filetype=obj test.
llvm-svn: 193538
2013-10-28 21:06:12 +00:00
Rafael Espindola
4272a090a0 Convert another llc -filetype=obj test.
llvm-svn: 193537
2013-10-28 20:59:41 +00:00
Rafael Espindola
00f2d95c0c Convert another llc -filetype=obj test.
llvm-svn: 193536
2013-10-28 20:54:33 +00:00
Rafael Espindola
c4c4f23851 Convert a llc -filetype=obj test into a llvm-mc test.
llvm-svn: 193534
2013-10-28 20:40:20 +00:00
Amara Emerson
de52a239bd [AArch64] Fix NZCV reg live-in bug in F128CSEL codegen.
When generating the IfTrue basic block during the F128CSEL pseudo-instruction
handling, the NZCV live-in for the newly created BB wasn't being added. This
caused a fault during MI-sched/live range calculation when the predecessor
for the fall-through BB didn't have a live-in for phys-reg as expected.

llvm-svn: 193316
2013-10-24 08:28:24 +00:00
Chad Rosier
838b6065b8 [AArch64] Add the constraint to NEON scalar mla/mls instructions.
llvm-svn: 193117
2013-10-21 20:11:47 +00:00
Chad Rosier
163fdd3e73 [AArch64] Add support for NEON scalar extract narrow instructions.
llvm-svn: 192970
2013-10-18 14:03:24 +00:00
Chad Rosier
9a6d485c7f [AArch64] Add support for NEON scalar three register different instruction
class.  The instruction class includes the signed saturating doubling
multiply-add long, signed saturating doubling multiply-subtract long, and
the signed saturating doubling multiply long instructions.

llvm-svn: 192908
2013-10-17 18:12:29 +00:00
Chad Rosier
3ed3565e0f [AArch64] Add support for NEON scalar negate instruction.
llvm-svn: 192843
2013-10-16 21:04:39 +00:00
Chad Rosier
aaa3bb367a [AArch64] Add support for NEON scalar absolute value instruction.
llvm-svn: 192842
2013-10-16 21:04:34 +00:00
Chad Rosier
a195d145b8 [AArch64] Add support for NEON scalar signed saturating accumulated of unsigned
value and unsigned saturating accumulate of signed value instructions.

llvm-svn: 192800
2013-10-16 16:09:02 +00:00
Chad Rosier
3e791b2408 [AArch64] Add support for NEON scalar signed saturating absolute value and
scalar signed saturating negate instructions.

llvm-svn: 192733
2013-10-15 21:18:44 +00:00
Chad Rosier
40761dc629 [AArch64] Add support for NEON scalar integer compare instructions.
llvm-svn: 192596
2013-10-14 14:37:20 +00:00
Kevin Qin
e90902acc5 Implement aarch64 neon instruction set AdvSIMD (copy).
llvm-svn: 192410
2013-10-11 02:33:55 +00:00
Hao Liu
d0ab407a23 Implement AArch64 vector load/store multiple N-element structure class SIMD(lselem).
Including following 14 instructions:
4 ld1 insts: load multiple 1-element structure to sequential 1/2/3/4 registers.
ld2/ld3/ld4: load multiple N-element structure to sequential N registers (N=2,3,4).
4 st1 insts: store multiple 1-element structure from sequential 1/2/3/4 registers.
st2/st3/st4: store multiple N-element structure from sequential N registers (N = 2,3,4).

llvm-svn: 192361
2013-10-10 17:00:52 +00:00
Rafael Espindola
bb93e39fe2 Revert "Implement AArch64 vector load/store multiple N-element structure class SIMD(lselem). Including following 14 instructions: 4 ld1 insts: load multiple 1-element structure to sequential 1/2/3/4 registers. ld2/ld3/ld4: load multiple N-element structure to sequential N registers (N=2,3,4). 4 st1 insts: store multiple 1-element structure from sequential 1/2/3/4 registers. st2/st3/st4: store multiple N-element structure from sequential N registers (N = 2,3,4)."
This reverts commit r192352. It broke the build.

llvm-svn: 192354
2013-10-10 15:15:17 +00:00
Hao Liu
0ff11c9c71 Implement AArch64 vector load/store multiple N-element structure class SIMD(lselem).
Including following 14 instructions:
4 ld1 insts: load multiple 1-element structure to sequential 1/2/3/4 registers.
ld2/ld3/ld4: load multiple N-element structure to sequential N registers (N=2,3,4).
4 st1 insts: store multiple 1-element structure from sequential 1/2/3/4 registers.
st2/st3/st4: store multiple N-element structure from sequential N registers (N = 2,3,4).

llvm-svn: 192352
2013-10-10 15:01:24 +00:00
Tim Northover
87db53ff7a AArch64: enable MISched by default.
Substantial SelectionDAG scheduling is going away soon, and is
interfering with Hao's attempts to implement LDn/STn instructions, so
I say we make the leap first.

There were a few reorderings (inevitably) which broke some tests. I
tried to replace them with CHECK-DAG variants mostly, but some too
complex for that to be useful and I just reordered them.

llvm-svn: 192282
2013-10-09 07:53:57 +00:00
Tim Northover
a9df6657ee AArch64: migrate ADRP relaxation test to be llvm-mc only.
llvm-svn: 192281
2013-10-09 07:53:49 +00:00