1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-22 20:43:44 +02:00
Commit Graph

53 Commits

Author SHA1 Message Date
Chad Rosier
75e17097bb [AArch64] Generate vector signed/unsigned mul and mla/mls long.
Phabricator Revision: http://reviews.llvm.org/D5589
Patch by Balaram Makam <bmakam@codeaurora.org>!!

llvm-svn: 219276
2014-10-08 02:31:24 +00:00
Asiri Rathnayake
0e12aa2ad9 Add missing natual vector cast.
Summary: The natual vector cast node (similar to bitcast) AArch64ISD::NVCAST
was introduced in r217159 and r217138. This patch adds a missing cast from
v2f32 to v1i64 which is causing some compilation failures. Also added test
cases to cover various modimm types and BUILD_VECTORs with i64 elements.

llvm-svn: 218751
2014-10-01 09:59:45 +00:00
Tim Northover
2ca258e6f9 AArch64: fix vector-immediate BIC/ORR on big-endian devices.
Follow up to r217138, extending the logic to other NEON-immediate instructions.
As before, the instruction already performs the correct operation and we're
just using a different type for convenience, so we want a true nop-cast.

Patch by Asiri Rathnayake.

llvm-svn: 217159
2014-09-04 15:05:24 +00:00
Tim Northover
54d4e0e00b AArch64: fix big-endian immediate materialisation
We were materialising big-endian constants using DAG nodes with types different
from what was requested, followed by a bitcast. This is fine on little-endian
machines where bitcasting is a nop, but we need a slightly different
representation for big-endian. This adds a new set of NVCAST (natural-vector
cast) operations which are always nops.

Patch by Asiri Rathnayake.

llvm-svn: 217138
2014-09-04 09:46:14 +00:00
Jiangning Liu
bffae55891 [AArch64] Fix some failures exposed by value type v4f16 and v8f16.
1) Add some missing bitcast patterns for v8f16.
2) Add type promotion for operand of ld/st operations.

llvm-svn: 216706
2014-08-29 01:31:42 +00:00
Oliver Stannard
8901713bd2 Teach the AArch64 backend about v4f16 and v8f16
This teaches the AArch64 backend to deal with the operations required
to deal with the operations on v4f16 and v8f16 which are exposed by
NEON intrinsics, plus the add, sub, mul and div operations.

llvm-svn: 216555
2014-08-27 16:16:04 +00:00
Jiangning Liu
a94c806f8b [AArch64] Disable some optimization cases for type conversion from sint to fp, because those optimization cases are micro-architecture dependent and only make sense for Cyclone. A new predicate Cyclone is introduced in .td file.
llvm-svn: 213827
2014-07-24 01:29:59 +00:00
Tim Northover
86e216695b CodeGen: emit IR-level f16 conversion intrinsics as fptrunc/fpext
This makes the first stage DAG for @llvm.convert.to.fp16 an fptrunc,
and correspondingly @llvm.convert.from.fp16 an fpext. The legalisation
path is now uniform, regardless of the input IR:

  fptrunc -> FP_TO_FP16 (if f16 illegal) -> libcall
  fpext -> FP16_TO_FP (if f16 illegal) -> libcall

Each target should be able to select the version that best matches its
operations and not be required to duplicate patterns for both fptrunc
and FP_TO_FP16 (for example).

As a result we can remove some redundant AArch64 patterns.

llvm-svn: 213507
2014-07-21 09:13:56 +00:00
Tim Northover
eae1f1c8cc CodeGen: extend f16 conversions to permit types > float.
This makes the two intrinsics @llvm.convert.from.f16 and
@llvm.convert.to.f16 accept types other than simple "float". This is
only strictly needed for the truncate operation, since otherwise
double rounding occurs and there's no way to represent the strict IEEE
conversion. However, for symmetry we allow larger types in the extend
too.

During legalization, we can expand an "fp16_to_double" operation into
two extends for convenience, but abort when the truncate isn't legal. A new
libcall is probably needed here.

Even after this commit, various target tweaks are needed to actually use the
extended intrinsics. I've put these into separate commits for clarity, so there
are no actual tests of f64 conversion here.

llvm-svn: 213248
2014-07-17 10:51:23 +00:00
Yi Kong
9b1652c5d0 Port memory barriers intrinsics to AArch64
Memory barrier __builtin_arm_[dmb, dsb, isb] intrinsics are required to
implement their corresponding ACLE and MSVC intrinsics.

This patch ports ARM dmb, dsb, isb intrinsic to AArch64.

Differential Revision: http://reviews.llvm.org/D4520

llvm-svn: 213247
2014-07-17 10:50:20 +00:00
Tim Northover
4ac35c9d7b AArch64: remove unnecessary pseudo-instruction.
Sufficiently twisted use of TableGen lets us write patterns directly for f16
(as an i16 promoted to i32) -> f32 conversion.

llvm-svn: 212933
2014-07-14 11:16:02 +00:00
Arnaud A. de Grandmaison
e705812564 [AArch64] Add logical alias instructions to MC AsmParser
This patch teaches the AsmParser to accept some logical+immediate
instructions and convert them as shown:

  bic  Rd, Rn, #imm  ->  and Rd, Rn, #~imm
  bics Rd, Rn, #imm  ->  ands Rd, Rn, #~imm
  orn  Rd, Rn, #imm  ->  orr Rd, Rn, #~imm
  eon  Rd, Rn, #imm  ->  eor Rd, Rn, #~imm

Those instructions are an alternate syntax available to assembly coders,
and are needed in order to support code already compiling with some other
assemblers. For example, the bic construct is used by the linux kernel.

llvm-svn: 212722
2014-07-10 15:12:26 +00:00
Jim Grosbach
10098b8e10 AArch64: Better codegen for storing to __fp16.
Storing will generally be immediately preceded by rounding from an f32
or f64, so make sure to match those patterns directly to convert into the
FPR16 register class directly rather than going through the integer GPRs.

This also eliminates an extra step in the convert-from-f64 path
which was first converting to f32 and then to f16 from there.

rdar://17594379

llvm-svn: 212638
2014-07-09 18:55:52 +00:00
Jim Grosbach
4c963f72c5 AArch64: Better codegen for loading from __fp16.
Loading will generally extend to an f32 or an 64, so make sure
to match those patterns directly to load into the FPR16 register
class directly rather than going through the integer GPRs.

This also eliminates an extra step in the convert-to-f64 path
which was first converting to f32 and then to f64 from there.

rdar://17594379

llvm-svn: 212573
2014-07-08 23:28:48 +00:00
Saleem Abdulrasool
9cdbf65f90 AArch64: whitespace cleanup
llvm-svn: 212420
2014-07-06 22:13:26 +00:00
Jim Grosbach
6677259350 AArch64: Add backend intrinsic for rbit.
Define an intrinsic for the frontend to use and pattern match it to
the RBIT instruction.

rdar://9283021

llvm-svn: 211058
2014-06-16 21:55:35 +00:00
Tim Northover
ca0f4dc4f0 AArch64/ARM64: move ARM64 into AArch64's place
This commit starts with a "git mv ARM64 AArch64" and continues out
from there, renaming the C++ classes, intrinsics, and other
target-local objects for consistency.

"ARM64" test directories are also moved, and tests that began their
life in ARM64 use an arm64 triple, those from AArch64 use an aarch64
triple. Both should be equivalent though.

This finishes the AArch64 merge, and everyone should feel free to
continue committing as normal now.

llvm-svn: 209577
2014-05-24 12:50:23 +00:00
Tim Northover
d7f173214f AArch64/ARM64: remove AArch64 from tree prior to renaming ARM64.
I'm doing this in two phases for a better "git blame" record. This
commit removes the previous AArch64 backend and redirects all
functionality to ARM64. It also deduplicates test-lines and removes
orphaned AArch64 tests.

The next step will be "git mv ARM64 AArch64" and rewire most of the
tests.

Hopefully LLVM is still functional, though it would be even better if
no-one ever had to care because the rename happens straight
afterwards.

llvm-svn: 209576
2014-05-24 12:42:26 +00:00
Tim Northover
7dfb58559c AArch64: disable printing of add/sub alias
This alias appears not to have an appropriate PrintMethod. Normally, I'd look
into it, but since AArch64 is disappearing soon it's probably not worth it.

This will be tested when the TableGen "should I print this Alias" heuristic is
fixed (very soon).

llvm-svn: 208967
2014-05-16 09:41:43 +00:00
Tim Northover
4736478963 AArch64: disable printing of MOV -> MOVZ aliases
Actually, MOV sometimes is canonical, but for now this is a better
approximation than what's there.

This will be tested when the TableGen "should I print this Alias" heuristic is
fixed (very soon).

llvm-svn: 208962
2014-05-16 09:41:21 +00:00
Tim Northover
3c2cc7a397 TableGen: use PrintMethods to print more aliases
llvm-svn: 208607
2014-05-12 18:04:06 +00:00
Chad Rosier
9733d5cead [AArch64] Add SchedRW lists to NEON instructions.
Previously, only regular AArch64 instructions were annotated with SchedRW lists.
This patch does the same for NEON enabling these instructions to be scheduled by
the MIScheduler. Additionally, store operations are now modeled and a few
SchedRW lists were updated for bug fixes (e.g. multiple def operands).

Reviewers: apazos, mcrosier, atrick
Patch by Dave Estes <cestes@codeaurora.org>!

llvm-svn: 204505
2014-03-21 19:34:41 +00:00
Tim Northover
e19a5dc534 AArch64: error when both positional & named operands are used.
Only one instruction pair needed changing: SMULH & UMULH. The previous
code worked, but MC was doing extra work treating Ra as a valid
operand (which then got completely overwritten in MCCodeEmitter).

No behaviour change, so no tests.

llvm-svn: 203772
2014-03-13 09:00:13 +00:00
Chad Rosier
6c9595d931 [AArch64] This is a work in progress to provide a machine description
for the Cortex-A53 subtarget in the AArch64 backend.

This patch lays the ground work to annotate each AArch64 instruction
(no NEON yet) with a list of SchedReadWrite types. The patch also
provides the Cortex-A53 processor resources, maps those the the default
SchedReadWrites, and provides basic latency. NEON support will be added
in a subsequent patch with proper forwarding logic.

Verification was done by setting the pre-RA scheduler to linearize to
better gauge the effect of the MIScheduler. Even without modeling the
forward logic, the results show a modest improvement for Cortex-A53.

Reviewers: apazos, mcrosier, atrick
Patch by Dave Estes <cestes@codeaurora.org>!

llvm-svn: 203125
2014-03-06 16:04:00 +00:00
Chad Rosier
e60c767814 Revert "[AArch64] This is a work in progress to provide a machine description"
This reverts commit ff717c8fc786a0cfa1602982b91895fa09e514fc.

llvm-svn: 202773
2014-03-04 00:32:07 +00:00
Chad Rosier
ad64e09862 [AArch64] This is a work in progress to provide a machine description
for the Cortex-A53 subtarget in the AArch64 backend.

This patch lays the ground work to annotate each AArch64 instruction
(no NEON yet) with a list of SchedReadWrite types. The patch also
provides the Cortex-A53 processor resources, maps those the the default
SchedReadWrites, and provides basic latency. NEON support will be added
in a subsequent patch with proper forwarding logic.

Verification was done by setting the pre-RA scheduler to linearize to
better gauge the effect of the MIScheduler. Even without modeling the
forward logic, the results show a modest improvement for Cortex-A53.

Reviewers: apazos, mcrosier, atrick
Patch by Dave Estes <cestes@codeaurora.org>!

llvm-svn: 202767
2014-03-03 23:32:47 +00:00
Albrecht Kadlec
7a0ac75c6a trivial test commit
llvm-svn: 202084
2014-02-24 22:18:38 +00:00
Kevin Qin
e05e6b31e1 [AArch64] Add register constraints to avoid generating STLXR and STXR with unpredictable behavior.
llvm-svn: 201841
2014-02-21 07:45:48 +00:00
Ana Pazos
85f191fc73 [AArch64] Check fmul node single use in fused multiply patterns
Check for single use of fmul node in fused multiply patterns
to allow generation of fused multiply add/sub instructions.
Otherwise fmul operation ends up being repeated more than
once which does not help peformance on targets with
only one MAC unit, as for example cortex-a53.

llvm-svn: 197929
2013-12-24 00:47:29 +00:00
Ana Pazos
8821a9ef6b [AArch64 NEON] Fixed fused multiply negate add/sub patterns
The correct pattern matching should be:

- fnmadd is (-Ra) + (-Rn)*Rm  which should be matched as:

  fma (fneg node:$Rn),  node:$Rm, (fneg node:$Ra) and as

  (f32 (fsub (f32 (fneg FPR32:$Ra)), (f32 (fmul FPR32:$Rn, FPR32:$Rm))))

- fnmsub is (-Ra) + Rn*Rm which should be matched as

  fma node:$Rn,  node:$Rm, (fneg node:$Ra) and as

  (f32 (fsub (f32 (fmul FPR32:$Rn, FPR32:$Rm)), FPR32:$Ra))))

llvm-svn: 197928
2013-12-24 00:40:10 +00:00
Kevin Qin
99ae282f19 [AArch64 NEON]Implment loading vector constant form constant pool.
llvm-svn: 197551
2013-12-18 06:26:04 +00:00
Chad Rosier
ba5d2d1be6 [AArch64] Implemented AdvSIMD scalar x indexed element format and AdvSIMD scalar
copy in MC layer. Added the MC layer tests.  Fixed triple setting in test cases.

Patch by Ana Pazos <apazos@codeaurora.org>.

llvm-svn: 194501
2013-11-12 19:13:08 +00:00
Amara Emerson
ce9bb052e5 [AArch64] Make the use of FP instructions optional, but enabled by default.
This adds a new subtarget feature called FPARMv8 (implied by NEON), and
predicates the support of the FP instructions and registers on this feature.

llvm-svn: 193739
2013-10-31 09:32:11 +00:00
Chad Rosier
02e430c891 [AArch64] Add support for NEON scalar floating-point compare instructions.
llvm-svn: 193691
2013-10-30 15:19:37 +00:00
Jiangning Liu
5867567c41 Initial support for Neon scalar instructions.
Patch by Ana Pazos.

1.Added support for v1ix and v1fx types.
2.Added Scalar Pairwise Reduce instructions.
3.Added initial implementation of Scalar Arithmetic instructions.

llvm-svn: 191263
2013-09-24 02:47:27 +00:00
Tim Northover
5eefec9e6d AArch64: use RegisterOperand for NEON registers.
Previously we modelled VPR128 and VPR64 as essentially identical
register-classes containing V0-V31 (which had Q0-Q31 as "sub_alias"
sub-registers). This model is starting to cause significant problems
for code generation, particularly writing EXTRACT/INSERT_SUBREG
patterns for converting between the two.

The change here switches to classifying VPR64 & VPR128 as
RegisterOperands, which are essentially aliases for RegisterClasses
with different parsing and printing behaviour. This fits almost
exactly with their real status (VPR128 == FPR128 printed strangely,
VPR64 == FPR64 printed strangely).

llvm-svn: 190665
2013-09-13 07:26:52 +00:00
Tim Northover
dbac87d1fc AArch64: add initial NEON support
Patch by Ana Pazos.

- Completed implementation of instruction formats:
AdvSIMD three same
AdvSIMD modified immediate
AdvSIMD scalar pairwise

- Completed implementation of instruction classes
(some of the instructions in these classes
belong to yet unfinished instruction formats):
Vector Arithmetic
Vector Immediate
Vector Pairwise Arithmetic

- Initial implementation of instruction formats:
AdvSIMD scalar two-reg misc
AdvSIMD scalar three same

- Intial implementation of instruction class:
Scalar Arithmetic

- Initial clang changes to support arm v8 intrinsics.
Note: no clang changes for scalar intrinsics function name mangling yet.

- Comprehensive test cases for added instructions
To verify auto codegen, encoding, decoding, diagnosis, intrinsics.

llvm-svn: 187567
2013-08-01 09:20:35 +00:00
Tim Northover
c1348880dc AArch64: correct CodeGen of MOVZ/MOVK combinations.
According to the AArch64 ELF specification (4.6.8), it's the
assembler's responsibility to make sure the shift amount is correct in
relocated MOVZ/MOVK instructions.

This wasn't being obeyed by either the MCJIT CodeGen or RuntimeDyldELF
(which happened to work out well for JIT tests). This commit should
make us compliant in this area.

llvm-svn: 185360
2013-07-01 19:23:10 +00:00
Tim Northover
87645e02c0 AArch64: implement large code model access to global variables.
The MOVZ/MOVK instruction sequence may not be the most efficient (a
literal-pool load could be better) but adding that would require
reinstating the ConstantIslands pass.

For now the sequence is correct, and that's enough. Beware, as of
commit GNU ld does not appear to support the relocations needed for
this. Its primary purpose (for now) will be to support JITed code,
since in that case there is no guarantee of where your code will end
up in memory relative to external symbols it references.

llvm-svn: 181117
2013-05-04 16:53:46 +00:00
Tim Northover
6932cfb14f AArch64: remove useless comment
llvm-svn: 179952
2013-04-20 15:57:41 +00:00
Tim Northover
8eb5637d73 AArch64: remove barriers from AArch64 atomic operations.
I've managed to convince myself that AArch64's acquire/release
instructions are sufficient to guarantee C++11's required semantics,
even in the sequentially-consistent case.

llvm-svn: 179005
2013-04-08 08:40:41 +00:00
Tim Northover
acffe8e7ca AArch64: switch patterns to be type-based rather than RegClass-based
It's a bit of churn in the blame log, but I think there are real benefits to
the newer system so I'm making the change in one go.

llvm-svn: 178633
2013-04-03 11:19:16 +00:00
Tim Northover
e7cedcf871 AArch64: remove post-encoder method from FCMP (immediate) instructions.
The work done by the post-encoder (setting architecturally unused bits to 0 as
required) can be done by the existing operand that covers the "#0.0". This
removes at least one use of the discouraged PostEncoderMethod uses.

llvm-svn: 176261
2013-02-28 14:46:14 +00:00
Tim Northover
e2cf283c3e AArch64: Use cbnz instead of cmp/b.ne pair for atomic operations.
llvm-svn: 176253
2013-02-28 13:52:07 +00:00
Tim Northover
04e9446751 AArch64: remove ConstantIsland pass & put literals in separate section.
This implements the review suggestion to simplify the AArch64 backend. If we
later discover that we *really* need the extra complexity of the
ConstantIslands pass for performance reasons it can be resurrected.

llvm-svn: 175258
2013-02-15 09:33:43 +00:00
Tim Northover
025494b1c4 AArch64: switch from neverHasSideEffects to hasSideEffects.
llvm-svn: 175176
2013-02-14 16:31:12 +00:00
Tim Northover
21f54fd5c2 AArch64: add block comments where missing
Only comments affected. No code change at all.

llvm-svn: 175169
2013-02-14 16:17:01 +00:00
Tim Northover
349160133e Make use of DiagnosticType to provide better AArch64 diagnostics.
This gives a DiagnosticType to all AsmOperands in sight. This replaces all
"invalid operand" diagnostics with something more specific. The messages given
should still be sufficiently vague that they're not usually actively misleading
when LLVM guesses your instruction incorrectly.

llvm-svn: 174871
2013-02-11 09:29:37 +00:00
Tim Northover
a6ee94525f Implement external weak (ELF) symbols on AArch64
Weakly defined symbols should evaluate to 0 if they're undefined at
link-time. This is impossible to do with the usual address generation
patterns, so we should use a literal pool entry to materlialise the
address.

llvm-svn: 174518
2013-02-06 16:43:33 +00:00
Tim Northover
4daffb618d Add AArch64 CRC32 instructions
These instructions are a late addition to the architecture, and may
yet end up behind an optional attribute, but for now they're available
at all times.

llvm-svn: 174496
2013-02-06 09:13:13 +00:00