1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-26 14:33:02 +02:00
Commit Graph

520 Commits

Author SHA1 Message Date
Jim Grosbach
13af5276a1 ARM assembly parsing for VST1 two-register encoding.
llvm-svn: 144430
2011-11-11 23:45:47 +00:00
Jim Grosbach
76dd8a9702 ARM VST1 w/ writeback assembly parsing and encoding.
llvm-svn: 143369
2011-10-31 21:50:31 +00:00
Owen Anderson
5fdb303642 Specify that the high bit of the alignment field is fixed to 0 on these instructions.
llvm-svn: 143220
2011-10-28 20:43:24 +00:00
Jim Grosbach
fabe0f2f0b ARM assembly parsing and encoding for VLD1 with writeback.
Four entry register lists.

llvm-svn: 142882
2011-10-25 00:14:01 +00:00
Jim Grosbach
e8a2edd71c Nuke dead code. Nothing generates the VLD1d64QPseudo_UPD instruction.
llvm-svn: 142877
2011-10-24 23:40:46 +00:00
Jim Grosbach
688186941f ARM assembly parsing and encoding for VLD1 w/ writeback.
Three entry register list variation.

llvm-svn: 142876
2011-10-24 23:26:05 +00:00
Jim Grosbach
cf4fba1dd0 ARM assembly parsing and encoding for VLD1 w/ writeback.
One and two length register list variants.

llvm-svn: 142861
2011-10-24 22:16:58 +00:00
Jim Grosbach
4a6508dd4e ARM refactor am6offset usage for VLD1.
Split am6offset into fixed and register offset variants so the instruction
encodings are explicit rather than relying an a magic reg0 marker.
Needed to being able to parse these.

llvm-svn: 142853
2011-10-24 21:45:13 +00:00
Jim Grosbach
d964cf8939 Assembly parsing for 4-register sequential variant of VLD2.
llvm-svn: 142704
2011-10-21 23:58:57 +00:00
Jim Grosbach
a6e536367e Assembly parsing for 2-register sequential variant of VLD2.
llvm-svn: 142691
2011-10-21 22:21:10 +00:00
Jim Grosbach
68dfc88f95 Assembly parsing for 4-register variant of VLD1.
llvm-svn: 142682
2011-10-21 20:35:01 +00:00
Jim Grosbach
2c1ca90ac9 Assembly parsing for 3-register variant of VLD1.
llvm-svn: 142675
2011-10-21 20:02:19 +00:00
Jim Grosbach
6bb38d0e97 ARM VLD parsing and encoding.
Next step in the ongoing saga of NEON load/store assmebly parsing. Handle
VLD1 instructions that take a two-register register list.

Adjust the instruction definitions to only have the single encoded register
as an operand. The super-register from the pseudo is kept as an implicit def,
so passes which come after pseudo-expansion still know that the instruction
defines the other subregs.

llvm-svn: 142670
2011-10-21 18:54:25 +00:00
Jim Grosbach
501c72cdc5 Remove some outdated comments.
llvm-svn: 142653
2011-10-21 16:14:12 +00:00
Jim Grosbach
e9d1df8266 ARM VLD1/VST1 (one register, no writeback) assembly parsing and encoding.
llvm-svn: 142583
2011-10-20 15:04:25 +00:00
Jim Grosbach
972f26d936 ARM VTBX (one register) assembly parsing and encoding.
llvm-svn: 142581
2011-10-20 14:48:50 +00:00
Jim Grosbach
6a932d6ad1 ARM VTBL (one register) assembly parsing and encoding.
llvm-svn: 142441
2011-10-18 23:02:30 +00:00
Jim Grosbach
d748cf251f Yet more ARM NEON assembly parsing for the lane index operand.
llvm-svn: 142416
2011-10-18 20:21:17 +00:00
Jim Grosbach
ff8c26a53f ARM vmla/vmls assembly parsing for the lane index operand.
llvm-svn: 142413
2011-10-18 20:14:56 +00:00
Jim Grosbach
ed5cb526e2 ARM vmov assembly parsing for the lane index operand.
llvm-svn: 142412
2011-10-18 20:10:47 +00:00
Jim Grosbach
988b8dd4ce ARM vmla/vmls assembly parsing for the lane index operand.
llvm-svn: 142389
2011-10-18 18:27:07 +00:00
Jim Grosbach
2752e0b869 ARM vqdmulh assembly parsing for the lane index operand.
llvm-svn: 142386
2011-10-18 18:12:09 +00:00
Jim Grosbach
b56577b650 ARM vmul assembly parsing for the lane index operand.
llvm-svn: 142381
2011-10-18 18:01:52 +00:00
Jim Grosbach
4a138cb8d9 ARM vqdmlal assembly parsing for the lane index operand.
llvm-svn: 142365
2011-10-18 17:16:30 +00:00
Jim Grosbach
031bb99231 ARM assembly parsing and encoding for VMOV.i64.
llvm-svn: 142356
2011-10-18 16:18:11 +00:00
Jim Grosbach
bcfb4ed53c ARM assembly parsing and encoding for VMOV/VMVN/VORR/VBIC.i32.
llvm-svn: 142321
2011-10-18 00:22:00 +00:00
Jim Grosbach
1e994e76a7 ARM assembly parsing and encoding for VMOV/VMVN/VORR/VBIC.i16.
llvm-svn: 142303
2011-10-17 23:09:09 +00:00
Jim Grosbach
f3d495fbbd ARM NEON "vmov.i8" immediate assembly parsing and encoding.
NEON immediates are "interesting". Start of the work to handle parsing them
in an 'as' compatible manner. Getting the matcher to play nicely with
these and the floating point immediates from VFP is an extra fun wrinkle.

llvm-svn: 142293
2011-10-17 22:26:03 +00:00
Jim Grosbach
eeb05f7532 Tidy up organization.
llvm-svn: 142248
2011-10-17 21:00:11 +00:00
Jim Grosbach
94980a23e6 ARM NEON assembly parsing and encoding for VDUP(scalar).
llvm-svn: 141446
2011-10-07 23:56:00 +00:00
Chad Rosier
3c596dbe51 Remove the VMOVQQ pseudo instruction.
llvm-svn: 138177
2011-08-20 00:52:40 +00:00
Chad Rosier
0d49bb37fb Remove VMOVQQQQ pseudo instruction.
llvm-svn: 138174
2011-08-20 00:40:14 +00:00
Owen Anderson
2e722e7cd4 Specify a necessary fixed bit for VLD3DUP, and otherwise rearrange the Thumb2 NEON decoding hooks to bring us closer to correctness.
llvm-svn: 137686
2011-08-15 23:38:54 +00:00
Owen Anderson
894585de33 Fix problems decoding the to/from-lane NEON memory instructions, and add a comprehensive NEON decoding testcase.
llvm-svn: 137635
2011-08-15 18:44:44 +00:00
Owen Anderson
ffe1c55752 Replace the existing ARM disassembler with a new one based on the FixedLenDecoderEmitter.
This new disassembler can correctly decode all the testcases that the old one did, though
some "expected failure" testcases are XFAIL'd for now because it is not (yet) as strict in
operand checking as the old one was.

llvm-svn: 137144
2011-08-09 20:55:18 +00:00
Bob Wilson
e241d2cd04 Add missing register constraint for some VLD3/VLD4 pseudo instructions.
<rdar://problem/9878189>

llvm-svn: 136962
2011-08-05 07:24:09 +00:00
Owen Anderson
7a380bac06 Remove VMOVDneon and VMOVQ, which are just aliases for VORR. This continues to simplify the path towards an auto-generated disassembler.
llvm-svn: 135290
2011-07-15 18:46:47 +00:00
Owen Anderson
4cf53f7ec4 Remove unnecessary duplicate instruction definitions that simply overloaded the type of VEXT. This can be achieved with a Pat definition, and is much more disassembler friendly.
llvm-svn: 135283
2011-07-15 17:48:05 +00:00
Jim Grosbach
eff8e5d153 Clean up a few 80 column violations.
llvm-svn: 132946
2011-06-13 22:54:22 +00:00
Tanya Lattner
aa1f6df650 Fix encoding for VEXTdf.
llvm-svn: 132486
2011-06-02 21:25:24 +00:00
Mon P Wang
08d3b69861 Fixed MC encoding for index_align for VLD1/VST1 (single element from one lane) for size 32
llvm-svn: 131085
2011-05-09 17:47:27 +00:00
Mon P Wang
9aa67ff50a Fixed encoding for VEXTqf
llvm-svn: 129101
2011-04-07 19:56:12 +00:00
Owen Anderson
d4e1a2f2b6 Somehow we managed to forget to encode the lane index for a large swathe of NEON instructions. With this fix, the entire test-suite passes with the Thumb integrated assembler.
llvm-svn: 128587
2011-03-30 23:45:29 +00:00
Cameron Zwarich
1b8f91d2c8 Add a ARM-specific SD node for VBSL so that forms with a constant first operand
can be recognized. This fixes <rdar://problem/9183078>.

llvm-svn: 128584
2011-03-30 23:01:21 +00:00
Owen Anderson
d73041e884 Get rid of the non-writeback versions VLDMDB and VSTMDB, which don't actually exist.
llvm-svn: 128461
2011-03-29 16:45:53 +00:00
Jim Grosbach
ee6075cda5 ARM VDUPfd and VDUPfq can just be patterns. The instruction is the same
as for VDUP32d and VDUP32q, respectively.

llvm-svn: 127489
2011-03-11 20:44:08 +00:00
Jim Grosbach
3329263352 ARM VDUPLNfq and VDUPLNfd definitions can just be Pat<>s for VDUPLN32q
and VDUPLN32d, respectively.

llvm-svn: 127486
2011-03-11 20:31:17 +00:00
Jim Grosbach
431682981d ARM VREV64df and VREV64qf can just be patterns. The instruction is the same
as for VREV64d32 and VREV64q32, respectively.

llvm-svn: 127485
2011-03-11 20:18:05 +00:00
Bill Wendling
68934338ab * Correct encoding for VSRI.
* Add tests for VSRI and VSLI.

llvm-svn: 127297
2011-03-09 00:33:17 +00:00
Bill Wendling
b790c462c0 Correct the encoding for VRSRA and VSRA instructions.
llvm-svn: 127294
2011-03-09 00:00:35 +00:00
Bill Wendling
ab9f04b6d8 * Fix VRSHR and VSHR to have the correct encoding for the immediate.
* Update the NEON shift instruction test to expect what 'as' produces.

llvm-svn: 127293
2011-03-08 23:48:09 +00:00
Bill Wendling
958e854f40 Rename the narrow shift right immediate operands to "shr_imm*" operands. Also
expand the testing of the narrowing shift right instructions.

No functionality change.

llvm-svn: 127193
2011-03-07 23:38:41 +00:00
Bill Wendling
304dda7810 Narrow right shifts need to encode their immediates differently from a normal
shift.

   16-bit: imm6<5:3> = '001', 8 - <imm> is encded in imm6<2:0>
   32-bit: imm6<5:4> = '01',16 - <imm> is encded in imm6<3:0>
   64-bit: imm6<5> = '1', 32 - <imm> is encded in imm6<4:0>

llvm-svn: 126723
2011-03-01 01:00:59 +00:00
Bob Wilson
6bbffe19e9 Add patterns to use post-increment addressing for Neon VST1-lane instructions.
llvm-svn: 126477
2011-02-25 06:42:42 +00:00
Bob Wilson
46b105c6a2 Change VLD3/4 and VST3/4 for quad registers to not update the address register.
These operations are expanded to pairs of loads or stores, and the first one
uses the address register update to produce the address for the second one.
So far, the second load/store has also updated the address register, just
for convenience, since that output has never been used.  In anticipation of
actually supporting post-increment updates for these operations, this changes
the non-updating operations to use a non-updating load/store for the second
instruction.

llvm-svn: 125013
2011-02-07 17:43:15 +00:00
Bob Wilson
cdda05b3cc Fix some NEON instruction itineraries.
llvm-svn: 125012
2011-02-07 17:43:12 +00:00
Bob Wilson
22f18a7e94 Add ARM patterns to match EXTRACT_SUBVECTOR nodes.
Also fix an off-by-one in SelectionDAGBuilder that was preventing shuffle
vectors from being translated to EXTRACT_SUBVECTOR.
Patch by Tim Northover.

The test changes are needed to keep those spill-q tests from testing aligned
spills and restores.  If the only aligned stack objects are spill slots, we
no longer realign the stack frame.  Prior to this patch, an EXTRACT_SUBVECTOR
was legalized by loading from the stack, which created an aligned frame index.
Now, however, there is nothing except the spill slot in the stack frame, so
I added an aligned alloca.

llvm-svn: 122995
2011-01-07 04:59:04 +00:00
Bob Wilson
5f9e78fe20 Rearrange some Neon multiclasses. No functional changes.
llvm-svn: 122119
2010-12-18 00:42:58 +00:00
Bob Wilson
776d3f73eb Fix result type of Neon floating-point comparisons against zero.
The result vector elements are always integers.  Radar 8782191.

llvm-svn: 122112
2010-12-18 00:04:33 +00:00
Bob Wilson
438a9a1367 Add Neon VCVT instructions for f32 <-> f16 conversions.
Clang is now providing intrinsics for these and so we need to support them
in the backend.  Radar 8068427.

llvm-svn: 121902
2010-12-15 22:14:12 +00:00
Bob Wilson
33e5e902b0 Remove the rest of the *_sfp Neon instruction patterns.
Use the same COPY_TO_REGCLASS approach as for the 2-register *_sfp instructions.
This change made a big difference in the code generated for the
CodeGen/Thumb2/cross-rc-coalescing-2.ll test: The coalescer is still doing
a fine job, but some instructions that were previously moved outside the loop
are not moved now.  It's using fewer VFP registers now, which is generally
a good thing, so I think the estimates for register pressure changed and that
affected the LICM behavior.  Since that isn't obviously wrong, I've just
changed the test file.  This completes the work for Radar 8711675.

llvm-svn: 121730
2010-12-13 23:02:37 +00:00
Bob Wilson
b189b77d9b Simplify N2VSPat, removing some unnecessary type arguments.
llvm-svn: 121729
2010-12-13 23:02:31 +00:00
Bob Wilson
203303291f Delete a line that I forgot to revert previously.
llvm-svn: 121719
2010-12-13 22:05:55 +00:00
Bob Wilson
074095ddf2 Use COPY_TO_REGCLASS instead of pseudo instructions for Neon FP patterns.
Jakob Olesen suggested that we can avoid the need for separate pseudo
instructions here by using COPY_TO_REGCLASS in the patterns.  The pattern
gets pretty ugly but it seems to work well.  Partial fix for Radar 8711675.

llvm-svn: 121718
2010-12-13 21:58:05 +00:00
Bob Wilson
56b41f8b81 Use pseudo instructions for 2-register Neon instructions for scalar FP.
Partial fix for Radar 8711675.

llvm-svn: 121716
2010-12-13 21:05:52 +00:00
Bob Wilson
9a6d75a499 Remove unused instruction class arguments.
llvm-svn: 121715
2010-12-13 21:05:44 +00:00
Bob Wilson
d30768fe3e Add float patterns for Neon vld1-lane/dup and vst1-lane operations.
llvm-svn: 121583
2010-12-10 22:13:32 +00:00
Bob Wilson
ae683e722f Remove unused arguments.
llvm-svn: 121582
2010-12-10 22:13:24 +00:00
Evan Cheng
fc78767730 Making use of VFP / NEON floating point multiply-accumulate / subtraction is
difficult on current ARM implementations for a few reasons.
1. Even though a single vmla has latency that is one cycle shorter than a pair
   of vmul + vadd, a RAW hazard during the first (4? on Cortex-a8) can cause
   additional pipeline stall. So it's frequently better to single codegen
   vmul + vadd.
2. A vmla folowed by a vmul, vmadd, or vsub causes the second fp instruction to
   stall for 4 cycles. We need to schedule them apart.
3. A vmla followed vmla is a special case. Obvious issuing back to back RAW
   vmla + vmla is very bad. But this isn't ideal either:
     vmul
     vadd
     vmla
   Instead, we want to expand the second vmla:
     vmla
     vmul
     vadd
   Even with the 4 cycle vmul stall, the second sequence is still 2 cycles
   faster.

Up to now, isel simply avoid codegen'ing fp vmla / vmls. This works well enough
but it isn't the optimial solution. This patch attempts to make it possible to
use vmla / vmls in cases where it is profitable.

A. Add missing isel predicates which cause vmla to be codegen'ed.
B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to
   compute a fmul and a fmla.
C. Add additional isel checks for vmla, avoid cases where vmla is feeding into
   fp instructions (except for the #3 exceptional case).
D. Add ARM hazard recognizer to model the vmla / vmls hazards.
E. Add a special pre-regalloc case to expand vmla / vmls when it's likely the
   vmla / vmls will trigger one of the special hazards.

Work in progress, only A+B are enabled.

llvm-svn: 120960
2010-12-05 22:04:16 +00:00
Jim Grosbach
78ef3199c8 Fix copy/pasto in vmin.f32 encoding.
llvm-svn: 120709
2010-12-02 16:30:58 +00:00
Owen Anderson
2299afbb49 Use by-name rather than by-order matching for NEON operands.
llvm-svn: 120507
2010-12-01 00:28:25 +00:00
Bob Wilson
f5eece615c Fix the encoding of VLD4-dup alignment.
The only reasonable way I could find to do this is to provide an alternate
version of the addrmode6 operand with a different encoding function.  Use it
for all the VLD-dup instructions for the sake of consistency.

llvm-svn: 120358
2010-11-30 00:00:42 +00:00
Bob Wilson
1be989686c Rename VLDnDUP instructions with double-spaced registers
in an attempt to make things a little more consistent.

llvm-svn: 120357
2010-11-30 00:00:38 +00:00
Bob Wilson
bd3d3d2937 Add support for NEON VLD3-dup instructions.
The encoding for alignment in VLD4-dup instructions is still a work in progress.

llvm-svn: 120356
2010-11-30 00:00:35 +00:00
Bob Wilson
aa197b07e6 Add support for NEON VLD3-dup instructions.
llvm-svn: 120312
2010-11-29 19:35:29 +00:00
Bob Wilson
3bb61d1932 Add support for NEON VLD2-dup instructions.
llvm-svn: 120236
2010-11-28 06:51:26 +00:00
Bob Wilson
f4df482b0d Another minor refactoring for VLD1DUP instructions.
The op11_8 field is the same for all of them so put it in the instruction
classes instead of specifying it separately for each instruction.

llvm-svn: 120234
2010-11-28 06:51:15 +00:00
Bob Wilson
3e64f6b309 Refactor. Set alignment bit in VLD1-dup instruction classes.
llvm-svn: 120197
2010-11-27 07:12:02 +00:00
Bob Wilson
cbd6281807 Add NEON VLD1-dup instructions (load 1 element to all lanes).
llvm-svn: 120194
2010-11-27 06:35:16 +00:00
Owen Anderson
0ec4da72fc Use by-name rather than by-order operand matching for some NEON encodings.
llvm-svn: 119923
2010-11-21 06:47:06 +00:00
Owen Anderson
023f096736 The Vm and Vn register fields must be the same for a register-register vmov.
llvm-svn: 119867
2010-11-19 23:12:43 +00:00
Jim Grosbach
7445ae1145 Operand names
llvm-svn: 119864
2010-11-19 22:43:08 +00:00
Jim Grosbach
69cad2c8b0 Clarify operand names.
llvm-svn: 119858
2010-11-19 22:36:02 +00:00
Jim Grosbach
082e9f2f2c Remove trailing whitespace.
llvm-svn: 119608
2010-11-18 01:39:50 +00:00
Jim Grosbach
2f9a2efb3c ARM PseudoInst instructions don't need or use an assembler string. Get rid of
the operand to the pattern.

llvm-svn: 119607
2010-11-18 01:38:26 +00:00
Bill Wendling
b450d320ec Encode the multi-load/store instructions with their respective modes ('ia',
'db', 'ib', 'da') instead of having that mode as a separate field in the
instruction. It's more convenient for the asm parser and much more readable for
humans.
<rdar://problem/8654088>

llvm-svn: 119310
2010-11-16 01:16:36 +00:00
Owen Anderson
52e3873edc Add support for ARM's specialized vector-compare-against-zero instructions.
llvm-svn: 118453
2010-11-08 23:21:22 +00:00
Owen Anderson
add19dd6dd Add codegen and encoding support for the immediate form of vbic.
llvm-svn: 118291
2010-11-05 19:27:46 +00:00
Owen Anderson
1a89511e5d Add support for code generation of the one register with immediate form of vorr.
We could be more aggressive about making this work for a larger range of constants,
but this seems like a good start.

llvm-svn: 118201
2010-11-03 22:44:51 +00:00
Owen Anderson
98f9965c89 Unlike a lot of NEON instructions, vext isn't _actually_ parameterized by element size. Instead,
all of the different element sizes are pseudo instructions that map down to vext.8 underneath, with
the immediate shifted left to reflect the increased element size.

llvm-svn: 118183
2010-11-03 18:16:27 +00:00
Bob Wilson
f44e708279 Add codegen patterns for VST1-lane instructions. Radar 8599955.
llvm-svn: 118176
2010-11-03 16:24:53 +00:00
Jim Grosbach
c10d3f3d4b Break ARM addrmode4 (load/store multiple base address) into its constituent
parts. Represent the operation mode as an optional operand instead.
rdar://8614429

llvm-svn: 118137
2010-11-03 01:01:43 +00:00
Owen Anderson
f754738bbb Revert r118097 to fix buildbots.
llvm-svn: 118121
2010-11-02 23:47:29 +00:00
Owen Anderson
ea89766d0c Since these fields are not exactly equivalent to the encoded field, rename them to something with semantic meaning.
llvm-svn: 118097
2010-11-02 22:41:42 +00:00
Owen Anderson
cdd587157f Provide correct encodings for the remaining vst variants that we currently generate.
llvm-svn: 118087
2010-11-02 22:18:18 +00:00
Owen Anderson
1f88ac90a1 Tentative encodings for the "single element from one lane" variant of vst1.
llvm-svn: 118084
2010-11-02 21:54:45 +00:00
Owen Anderson
46d4ab1a87 Add correct encodings for basic variants for vst3 and vst4.
llvm-svn: 118082
2010-11-02 21:47:03 +00:00
Bob Wilson
248c691f9a Add NEON VST1-lane instructions. Partial fix for Radar 8599955.
llvm-svn: 118069
2010-11-02 21:18:25 +00:00
Owen Anderson
36d5c04fbd Add correct encodings for the basic variants for vst2.
llvm-svn: 118068
2010-11-02 21:16:58 +00:00
Owen Anderson
c9f6909c96 Add correct encodings for the basic form of vst1.
llvm-svn: 118067
2010-11-02 21:06:06 +00:00
Owen Anderson
b34a5f1d02 Factor out a common encoding class for loads and stores with a lane parameter.
llvm-svn: 118055
2010-11-02 20:47:39 +00:00
Owen Anderson
ee1337c01f Add correct encodings for the rest of the vld instructions that we generate.
llvm-svn: 118053
2010-11-02 20:40:59 +00:00
Owen Anderson
9d85c89ade Add correct NEON encodings for vld2, vld3, and vld4 basic variants.
llvm-svn: 117997
2010-11-02 01:24:55 +00:00
Owen Anderson
f4ab06d0b6 Attempt to provide correct encodings for a number of other vld1 variants, which we can't test
since we can neither generate nor parse them at the moment.

llvm-svn: 117988
2010-11-02 00:24:52 +00:00
Owen Anderson
6647eb222b Add correct NEON encodings for the "multiple single elements" form of vld.
llvm-svn: 117984
2010-11-02 00:05:05 +00:00
Bob Wilson
b6bc135df8 Add NEON VLD1-lane instructions. Partial fix for Radar 8599955.
llvm-svn: 117964
2010-11-01 22:04:05 +00:00
Owen Anderson
e75f7c5419 Add correct NEON encodings for vtbl and vtbx.
llvm-svn: 117513
2010-10-28 00:18:46 +00:00
Owen Anderson
008116cb71 Add correct NEON encodings for vext, vtrn, vuzp, and vzip.
llvm-svn: 117512
2010-10-27 23:56:39 +00:00
Owen Anderson
9437a20a72 Provide correct encodings for NEON vcvt, which has its own special immediate encoding
for specifying fractional bits for fixed point conversions.

llvm-svn: 117501
2010-10-27 22:49:00 +00:00
Owen Anderson
d28d229ded Provide correct encodings for the get_lane and set_lane variants of vmov.
llvm-svn: 117495
2010-10-27 21:28:09 +00:00
Owen Anderson
7c46fcfee4 Provide correct NEON encodings for vdup.
llvm-svn: 117475
2010-10-27 19:25:54 +00:00
Owen Anderson
c8757eb137 Add correct NEON encodings for vsli and vsri.
llvm-svn: 117459
2010-10-27 17:40:08 +00:00
Owen Anderson
e64b7187a9 Add correct NEON encodings for vsra and vrsra.
llvm-svn: 117458
2010-10-27 17:29:29 +00:00
Owen Anderson
1dc05f20e2 Add correct NEON encodings for vqshl, vqshrn, vqshrun, vqrshl, vqshrn, and vqrshrun.
llvm-svn: 117411
2010-10-26 22:50:46 +00:00
Owen Anderson
55c0bad37d Correct NEON encodings for vshrn, vrshl, vrshr, vrshrn.
llvm-svn: 117402
2010-10-26 21:58:41 +00:00
Owen Anderson
570a4cdc45 Simplify classes for shift instructions, which are never commutable.
llvm-svn: 117398
2010-10-26 21:13:59 +00:00
Owen Anderson
0cecbd810e Provide correct NEON encodings for vshl, register and immediate forms.
llvm-svn: 117394
2010-10-26 20:56:57 +00:00
Owen Anderson
d8e5d26a56 Add correct NEON encoding for vpadal.
llvm-svn: 117380
2010-10-26 18:18:03 +00:00
Owen Anderson
b7618a821f Add NEON encodings for vmov and vmvn of immediates.
llvm-svn: 117374
2010-10-26 17:40:54 +00:00
Owen Anderson
e5e0dcd665 Add correct encodings for NEON vabal.
llvm-svn: 117315
2010-10-25 21:29:04 +00:00
Owen Anderson
3eff0b86a5 Add correct NEON encodings for vaba.
llvm-svn: 117309
2010-10-25 20:52:57 +00:00
Owen Anderson
61f5b3f2dc Attempt to provide correct encodings for NEON vbit and vbif, even though we can't test them at the moment.
llvm-svn: 117294
2010-10-25 20:17:22 +00:00
Owen Anderson
072692331e Provide correct NEON encodings for vbsl.
llvm-svn: 117293
2010-10-25 20:13:13 +00:00
Owen Anderson
59e85cbd66 Add correct instruction encodings for vbic, vorn, and vmvn.
llvm-svn: 117282
2010-10-25 18:43:52 +00:00
Owen Anderson
ba261b092c Add NEON encoding tests for vcgt and vacgt.
llvm-svn: 117276
2010-10-25 18:03:59 +00:00
Owen Anderson
3a5f798790 Add tests for NEON encodings of vcge and vacge.
llvm-svn: 117274
2010-10-25 17:49:32 +00:00
Owen Anderson
757022131f Add a warning about our inability to test the encoding of vceq with immediate zero.
llvm-svn: 117273
2010-10-25 17:33:02 +00:00
Owen Anderson
424434414e Add correct NEON encodings for vqdmlal.
llvm-svn: 117134
2010-10-22 19:35:48 +00:00
Owen Anderson
2bbdc62e17 Provide correct encodings for NEON vmlal.
llvm-svn: 117131
2010-10-22 19:05:25 +00:00
Owen Anderson
ada2b33321 Provide correct NEON encodings for vmla.
llvm-svn: 117126
2010-10-22 18:54:37 +00:00
Owen Anderson
ba2ac80921 ARM encodes Q registers as 2xregno (i.e. the number of the D register that corresponds to the lower
half of the Q register), rather than with just regno.  This allows us to unify the encodings for
a lot of different NEON instrucitons that differ only in whether they have Q or D register operands.

llvm-svn: 117056
2010-10-21 20:21:49 +00:00
Owen Anderson
51a6bc3b27 Add correct NEON encodings for vhadd and vrhadd.
llvm-svn: 117047
2010-10-21 18:55:04 +00:00
Owen Anderson
dce283c7db Add correct encodings for NEON vaddw.s* and vaddw.u*.
llvm-svn: 117040
2010-10-21 18:20:25 +00:00
Owen Anderson
7d90c72edf Provide correct NEON encodings for vaddl.u* and vaddl.s*.
llvm-svn: 117039
2010-10-21 18:09:17 +00:00
Owen Anderson
a685f8e90a Implement correct encodings for NEON vadd, both integer and floating point.
llvm-svn: 116981
2010-10-21 00:48:00 +00:00
Jim Grosbach
506b966b9d A few 80 column fixes.
llvm-svn: 116451
2010-10-13 23:34:31 +00:00
Evan Cheng
6aac1548ab More ARM scheduling itinerary fixes.
llvm-svn: 116266
2010-10-11 23:41:41 +00:00
Evan Cheng
77ba7b098a Proper VST scheduling itineraries.
llvm-svn: 116251
2010-10-11 22:03:18 +00:00
Evan Cheng
8c17a06411 Add VLD4 scheduling itineraries.
llvm-svn: 116143
2010-10-09 04:07:58 +00:00
Evan Cheng
df7f5672ee Finish vld3 and vld4.
llvm-svn: 116140
2010-10-09 01:45:34 +00:00
Evan Cheng
15fc769cf2 Correct some load / store instruction itinerary mistakes:
1. Cortex-A8 load / store multiplies can only issue on ALU0.
2. Eliminate A8_Issue, A8_LSPipe will correctly limit the load / store issues.
3. Correctly model all vld1 and vld2 variants.

llvm-svn: 116134
2010-10-09 01:03:04 +00:00
Evan Cheng
1ce29574c2 Model operand cycles of vldm / vstm; also fixes scheduling itineraries of vldr / vstr, etc.
llvm-svn: 115898
2010-10-07 01:50:48 +00:00
Jim Grosbach
e1e07b6bf1 Change the NEON VDUPfdf and VDUPfqf pseudo-instructions to actually be
pseudo instructions.

llvm-svn: 115840
2010-10-06 21:16:16 +00:00
Jim Grosbach
54490de165 Add a 'pattern' arg to the ARM PseudoNeonI class.
llvm-svn: 115831
2010-10-06 20:36:55 +00:00
Jim Grosbach
619f1c1cc5 Nuke the rest of the :comment references
llvm-svn: 115373
2010-10-01 23:21:38 +00:00
Evan Cheng
0da8dff3c7 Fix scheduling infor for vmovn and vshrn which I broke accidentially.
llvm-svn: 115354
2010-10-01 21:48:06 +00:00
Evan Cheng
fc1aee5b3c NEON scheduling info fix. vmov reg, reg are single cycle instructions.
llvm-svn: 115344
2010-10-01 20:50:58 +00:00
Bob Wilson
c63e8b4d2d Change VLDMQ and VSTMQ to be pseudo instructions. They are expanded after
register allocation to VLDMD and VSTMD respectively.  This avoids using the
dregpair operand modifier.

llvm-svn: 114047
2010-09-16 00:31:02 +00:00
Bob Wilson
9d68270b2e Use VLD1/VST1 pseudo instructions for loadRegFromStackSlot and
storeRegToStackSlot.

llvm-svn: 113918
2010-09-15 01:48:05 +00:00
Jim Grosbach
901a646188 Reapply r113875 with additional cleanups.
"The register specified for a dregpair is the corresponding Q register, so to
get the pair, we need to look up the sub-regs based on the qreg. Create a
lookup function since we don't have access to TargetRegisterInfo here to
be able to use getSubReg(ARM::dsub_[01])."

Additionaly, fix the NEON VLD1* and VST1* instruction patterns not to use
the dregpair modifier for the 2xdreg versions. Explicitly specifying the two
registers as operands is more correct and more consistent with the other
instruction patterns. This enables further cleanup of special case code in the
disassembler as a nice side-effect.

llvm-svn: 113903
2010-09-14 23:54:06 +00:00
Bob Wilson
1a69820d6d Make NEON ld/st pseudo instruction classes take the instruction itinerary as
an argument, so that we can distinguish instructions with the same register
classes but different numbers of registers (e.g., vld3 and vld4).  Fix some
of the non-pseudo NEON ld/st instruction itineraries to reflect the number
of registers loaded or stored, not just the opcode name.

llvm-svn: 113854
2010-09-14 20:59:49 +00:00
Bob Wilson
ba02d5b620 Convert some VTBL and VTBX instructions to use pseudo instructions prior to
register allocation.  Remove the NEONPreAllocPass, which is no longer needed.
Yeah!!

llvm-svn: 113818
2010-09-13 23:55:10 +00:00
Bob Wilson
6f35180bec Switch all the NEON vld-lane and vst-lane instructions over to the new
pseudo-instruction approach.  Change ARMExpandPseudoInsts to use a table
to record all the NEON load/store information.

llvm-svn: 113812
2010-09-13 23:01:35 +00:00
Bob Wilson
524123343c Fix NEON VLD pseudo instruction itineraries that were incorrectly copied from
the VST pseudos.  The VLD/VST scheduling still needs work (see pr6722), but
at least we shouldn't confuse the loads with the stores.

llvm-svn: 113473
2010-09-09 05:40:26 +00:00
Jim Grosbach
27a5b1fd3b VFP/NEON load/store multiple instructions are addrmode4, not 5.
llvm-svn: 113322
2010-09-08 00:25:50 +00:00
Bob Wilson
8ef469b2a5 Finish converting the rest of the NEON VLD instructions to use pseudo-
instructions prior to regalloc.  Since it's getting a little close to
the 2.8 branch deadline, I'll have to leave the rest of the instructions
handled by the NEONPreAllocPass for now, but I didn't want to leave half
of the VLD instructions converted and the other half not.

llvm-svn: 112983
2010-09-03 18:16:02 +00:00
Bob Wilson
24fa0b33b1 Replace NEON vabdl, vaba, and vabal intrinsics with combinations of the
vabd intrinsic and add and/or zext operations.  In the case of vaba, this
also avoids the need for a DAG combine pattern to combine vabd with add.
Update tests.  Auto-upgrade the old intrinsics.

llvm-svn: 112941
2010-09-03 01:35:08 +00:00
Bob Wilson
8951c7592c Convert VLD1 and VLD2 instructions to use pseudo-instructions until
after regalloc.

llvm-svn: 112825
2010-09-02 16:00:54 +00:00
Bob Wilson
3348d2eb50 Remove NEON vmull, vmlal, and vmlsl intrinsics, replacing them with multiply,
add, and subtract operations with zero-extended or sign-extended vectors.
Update tests.  Add auto-upgrade support for the old intrinsics.

llvm-svn: 112773
2010-09-01 23:50:19 +00:00
Bob Wilson
826a677f94 Remove NEON vmovn intrinsic, replacing it with vector truncate operations.
Auto-upgrade the old intrinsic and update tests.

llvm-svn: 112507
2010-08-30 20:02:30 +00:00
Bob Wilson
807d004452 Remove NEON vaddl, vaddw, vsubl, and vsubw intrinsics. Instead, use llvm
IR add/sub operations with one or both operands sign- or zero-extended.
Auto-upgrade the old intrinsics.

llvm-svn: 112416
2010-08-29 05:57:34 +00:00
Bob Wilson
956e07b985 Use pseudo instructions for VST1 and VST2.
llvm-svn: 112357
2010-08-28 05:12:57 +00:00
Bob Wilson
abdcae7f20 We don't need to custom-select VLDMQ and VSTMQ anymore.
llvm-svn: 112336
2010-08-28 00:20:11 +00:00
Bob Wilson
31d487d235 Change ARM VFP VLDM/VSTM instructions to use addressing mode #4, just like
all the other LDM/STM instructions.  This fixes asm printer crashes when
compiling with -O0.  I've changed one of the NEON tests (vst3.ll) to run
with -O0 to check this in the future.

Prior to this change VLDM/VSTM used addressing mode #5, but not really.
The offset field was used to hold a count of the number of registers being
loaded or stored, and the AM5 opcode field was expanded to specify the IA
or DB mode, instead of the standard ADD/SUB specifier.  Much of the backend
was not aware of these special cases.  The crashes occured when rewriting
a frameindex caused the AM5 offset field to be changed so that it did not
have a valid submode.  I don't know exactly what changed to expose this now.
Maybe we've never done much with -O0 and NEON.  Regardless, there's no longer
any reason to keep a count of the VLDM/VSTM registers, so we can use
addressing mode #4 and clean things up in a lot of places.

llvm-svn: 112322
2010-08-27 23:18:17 +00:00
Bob Wilson
efc503afd2 Use pseudo instructions for VST3.
llvm-svn: 112208
2010-08-26 18:51:29 +00:00
Bob Wilson
e74da18e57 Use pseudo instructions for VST1d64Q.
llvm-svn: 112170
2010-08-26 05:33:30 +00:00
Bob Wilson
b85b3cf91f Start converting NEON load/stores to use pseudo instructions, beginning here
with the VST4 instructions.  Until after register allocation, we want to
represent sets of adjacent registers by a single super-register.  These
VST4 pseudo instructions have a single QQ or QQQQ source register operand.
They get expanded to the real VST4 instructions with 4 separate D register
operands.  Once this conversion is complete, we'll be able to remove the
NEONPreAllocPass and avoid some fragile and hacky code elsewhere.

llvm-svn: 112108
2010-08-25 23:27:42 +00:00
Bob Wilson
0039bc228b Replace the arm.neon.vmovls and vmovlu intrinsics with vector sign-extend and
zero-extend operations.

llvm-svn: 111614
2010-08-20 04:54:02 +00:00
Daniel Dunbar
e0737ebae3 Silence some -Asserts uninitialized variable warnings.
llvm-svn: 109956
2010-07-31 21:08:54 +00:00
Bob Wilson
34f481e895 Add support for NEON VMVN immediate instructions.
llvm-svn: 108324
2010-07-14 06:31:50 +00:00
Bob Wilson
298c5c46c1 The bits in the cmode field of 32-bit VMOV immediate instructions all depend
of the value of the immediate.

llvm-svn: 108323
2010-07-14 06:30:44 +00:00
Bob Wilson
7feb850d36 Use a target-specific VMOVIMM DAG node instead of BUILD_VECTOR to represent
NEON VMOV-immediate instructions.  This simplifies some things.

llvm-svn: 108275
2010-07-13 21:16:48 +00:00
Bob Wilson
822b21f0de Also use REG_SEQUENCE for VTBX instructions.
llvm-svn: 107743
2010-07-07 00:08:54 +00:00
Bob Wilson
ce80768ebf Use REG_SEQUENCE nodes to make the table registers for VTBL instructions be
allocated to consecutive registers.

llvm-svn: 107730
2010-07-06 23:36:25 +00:00
Bob Wilson
4d54e03068 Fix indentation.
llvm-svn: 106881
2010-06-25 20:54:44 +00:00
Bob Wilson
059880161b Remove a fixme comment that is no longer relevant.
llvm-svn: 106382
2010-06-19 05:32:41 +00:00
Bob Wilson
56db632295 Add basic support for NEON modified immediates besides VMOV.
llvm-svn: 106030
2010-06-15 19:05:35 +00:00
Bob Wilson
32016c38ee Rename functions referring to VMOV immediates to refer to NEON "modified
immediate" operands.  These functions have so far only been used for VMOV
but they also apply to other NEON instructions with modified immediate
operands.  No functional changes.

llvm-svn: 105969
2010-06-14 22:19:57 +00:00
Bob Wilson
5e3c60fb63 Add instruction encoding for the Neon VMOV immediate instruction. This changes
the machine instruction representation of the immediate value to be encoded
into an integer with similar fields as the actual VMOV instruction.  This makes
things easier for the disassembler, since it can just stuff the bits into the
immediate operand, but harder for the asm printer since it has to decode the
value to be printed.  Testcase for the encoding will follow later when MC has
more support for ARM.

llvm-svn: 105836
2010-06-11 21:34:50 +00:00
Bob Wilson
9cf6656d4b Further changes for Neon vector shuffles:
- change isShuffleMaskLegal to show that all shuffles with 32-bit and 64-bit
  elements are legal
- the Neon shuffle instructions do not support 64-bit elements, but we were
  not checking for that before lowering shuffles to use them
- remove some 64-bit element vduplane patterns that are no longer needed

llvm-svn: 105586
2010-06-07 23:53:38 +00:00
Jakob Stoklund Olesen
d48a2f5afd Fix a few places that depended on the numeric value of subreg indices.
Add assertions in places that depend on consecutive indices.

llvm-svn: 104510
2010-05-24 17:13:28 +00:00
Jakob Stoklund Olesen
ac6f519e79 Switch ARMRegisterInfo.td to use SubRegIndex and eliminate the parallel enums
from ARMRegisterInfo.h

llvm-svn: 104508
2010-05-24 16:54:32 +00:00
Evan Cheng
0aa58d5b69 Mark pattern-less mayLoad / mayStore instructions neverHasSideEffects. These do not have other un-modeled side effects.
llvm-svn: 104111
2010-05-19 06:07:03 +00:00
Evan Cheng
330b993ede vmov of immediates are trivially re-materializable.
llvm-svn: 103982
2010-05-17 21:54:50 +00:00
Anton Korobeynikov
a63555c10d Chris said that the comment char should be escaped. Fix all the occurences of "@" in *.td
llvm-svn: 103903
2010-05-16 09:15:36 +00:00
Evan Cheng
2af2c9fa14 Added a QQQQ register file to model 4-consecutive Q registers.
llvm-svn: 103760
2010-05-14 02:13:41 +00:00
Evan Cheng
8d516e4c3e Bring back VLD1q and VST1q and use them for reloading / spilling Q registers. This allows folding loads and stores into VMOVQ.
llvm-svn: 103692
2010-05-13 01:12:06 +00:00
Evan Cheng
533ffa237e Mark some pattern-less instructions as neverHasSideEffects.
llvm-svn: 103683
2010-05-13 00:16:46 +00:00
Evan Cheng
2d9c4df15f Use VLD2q32 / VST2q32 to reload / spill QQ (pair of Q) registers when stack slot is sufficiently aligned. Use VLDMD / VSTMD otherwise.
llvm-svn: 103235
2010-05-07 02:04:02 +00:00
Evan Cheng
c82be8216f Remove VLD1q and VST1q for reloading and spilling Q registers. Just use VLD1q64 / VST1q64 and reference sub-registers.
llvm-svn: 103218
2010-05-07 00:24:52 +00:00
Evan Cheng
4c3022f869 Re-apply 103156 and 103157. 103156 didn't break anything. 10315 exposed a coalescer bug that's fixed by 103170.
llvm-svn: 103172
2010-05-06 06:36:08 +00:00
Eric Christopher
72ca6fc94e Revert r103156 since it was breaking the build bots.
Reverse-merging r103156 into '.':
U    lib/Target/ARM/ARMInstrNEON.td
U    lib/Target/ARM/ARMRegisterInfo.h
U    lib/Target/ARM/ARMBaseRegisterInfo.cpp
U    lib/Target/ARM/ARMBaseInstrInfo.cpp
U    lib/Target/ARM/ARMRegisterInfo.td

llvm-svn: 103159
2010-05-06 02:29:06 +00:00
Evan Cheng
f25111f27f Adding pseudo 256-bit registers QQ0 . . . QQ7 to represent pairs of Q registers. These will be used to model VLD2 / VST2 instructions in order to get substantially better codegen for them.
llvm-svn: 103156
2010-05-06 01:52:03 +00:00
Anton Korobeynikov
f9463f5b98 More fixes for itins
llvm-svn: 100662
2010-04-07 18:21:10 +00:00
Anton Korobeynikov
0c3bc7a9ce Fix invalid itins for 32-bit varians of VMLAL and friends
llvm-svn: 100661
2010-04-07 18:21:04 +00:00
Anton Korobeynikov
be68ff5b2c Fix itins for VABA
llvm-svn: 100657
2010-04-07 18:20:42 +00:00
Anton Korobeynikov
fbc58bba2f Correct VMVN itinerary: operand is read in the second cycle, not in the first.
llvm-svn: 100656
2010-04-07 18:20:36 +00:00
Anton Korobeynikov
2abd52b692 More A9 itineraries
llvm-svn: 100655
2010-04-07 18:20:29 +00:00
Anton Korobeynikov
13df6eaf1c Correct itinerary class for VPADD
llvm-svn: 100654
2010-04-07 18:20:24 +00:00
Anton Korobeynikov
982f9a042b VP{MAX, MIN} are of IIC_VSUBi4D itin class as well.
llvm-svn: 100653
2010-04-07 18:20:18 +00:00