1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 03:53:04 +02:00
Commit Graph

7306 Commits

Author SHA1 Message Date
Tim Northover
420889afe2 ARM: fix assert on unpredictable POP instruction.
POP instructions are aliased to the ARM LDM variants but have different syntax.
This caused two problems: we tried to access a non-existent operand to annotate
the '!', and the error message didn't make much sense.

With some vigorous hand-waving in the error message both problems can be
fixed.

llvm-svn: 193322
2013-10-24 09:37:18 +00:00
Artyom Skrobov
2ef42e5c31 Make ARM hint ranges consistent, and add tests for these ranges
llvm-svn: 193238
2013-10-23 10:14:40 +00:00
Tim Northover
e4acb9a5a2 ARM: provide diagnostics on more writeback LDM/STM instructions
The set of circumstances where the writeback register is allowed to be in the
list of registers is rather baroque, but I think this implements them all on
the assembly parsing side.

For disassembly, we still warn about an ARM-mode LDM even if the architecture
revision is < v7 (the required architecture information isn't available). It's
a silly instruction anyway, so hopefully no-one will mind.

rdar://problem/15223374

llvm-svn: 193185
2013-10-22 19:00:39 +00:00
Jim Grosbach
4ccaf97952 ARM: Thumb2 copy for GPRPair needs to use thumb instructions.
Use tMOVr instead of plain MOVr.

rdar://15193017

llvm-svn: 193139
2013-10-22 02:29:37 +00:00
Jim Grosbach
fb572e0475 ARM: Clean up copyPhysReg() a bit.
No functional change, just cleaning things up for readability.

llvm-svn: 193138
2013-10-22 02:29:35 +00:00
Richard Barton
4f1967c83f Pure refactoring change.
Patch by Artyom Skrobov.

llvm-svn: 192977
2013-10-18 14:41:50 +00:00
Richard Barton
cb6c32ac32 Add hint disassembly syntax for 16-bit Thumb hint instructions.
Patch by Artyom Skrobov

llvm-svn: 192972
2013-10-18 14:09:49 +00:00
Silviu Baranga
0e6bbf1a98 Add hardware division as a default feature on Cortex-A15. Also add test cases to check this, and change diagnostics for the hwdiv-arm feature to something useful.
llvm-svn: 192963
2013-10-18 10:18:40 +00:00
David Peixotto
839f0f98a5 17309 ARM backend incorrectly lowers COPY_STRUCT_BYVAL_I32 for thumb1 targets
This commit implements the correct lowering of the
COPY_STRUCT_BYVAL_I32 pseudo-instruction for thumb1 targets.
Previously, the lowering of COPY_STRUCT_BYVAL_I32 generated the
post-increment forms of ldr/ldrh/ldrb instructions. Thumb1 does not
have the post-increment form of these instructions so the generated
assembly contained invalid instructions.

Passing the generated assembly to gcc caused it to complain with an
error like this:

  Error: cannot honor width suffix -- `ldrb r3,[r0],#1'

and the integrated assembler would generate an object file with an
invalid instruction encoding.

This commit contains a small test case that demonstrates the problem
with thumb1 targets as well as an expanded test case that more
throughly tests the lowering of byval struct passing for arm,
thumb1, and thumb2 targets.

llvm-svn: 192916
2013-10-17 19:52:05 +00:00
David Peixotto
22f270719b Refactor lowering for COPY_STRUCT_BYVAL_I32
This commit refactors the lowering of the COPY_STRUCT_BYVAL_I32
pseudo-instruction in the ARM backend. We introduce a new helper
class that encapsulates all of the operations needed during the
lowering. The operations are implemented for each subtarget in
different subclasses. Currently only arm and thumb2 subtargets are
supported.

This refactoring was done to easily implement support for thumb1
subtargets. This initial patch does not add support for thumb1, but
is only a refactoring. A follow on patch will implement the support
for thumb1 subtargets.

No intended functionality change.

llvm-svn: 192915
2013-10-17 19:49:22 +00:00
Rafael Espindola
90d8b36e1e Add a MCAsmInfoELF class and factor some code into it.
We had a MCAsmInfoCOFF, but no common class for all the ELF MCAsmInfos before.

llvm-svn: 192760
2013-10-16 01:34:32 +00:00
Manman Ren
39d1a84681 Struct byval: fix a copy-paste error for thumb2.
PR17309

llvm-svn: 192730
2013-10-15 19:42:32 +00:00
Bernard Ogden
f482ee15d7 Add Cortex-A57 support
llvm-svn: 192591
2013-10-14 13:17:07 +00:00
Bernard Ogden
ec0167a2ce Add subtarget feature support for Cortex-A53
Some previous implicit defaults have changed, for example FP and NEON
are now on by default.

llvm-svn: 192590
2013-10-14 13:16:57 +00:00
Amara Emerson
bf6dcda63c [ARM] Fix FP ABI attributes with no VFP enabled.
llvm-svn: 192458
2013-10-11 16:03:43 +00:00
Benjamin Kramer
4cb23ab23b ARM: Put isV8EligibleForIT into the llvm namespace. While there make it inline.
llvm-svn: 192350
2013-10-10 14:35:45 +00:00
Tim Northover
50b95fa75d ARM: correct liveness flags during ARMLoadStoreOpt
When we had a sequence like:

    s1 = VLDRS [r0, 1], Q0<imp-def>
    s3 = VLDRS [r0, 2], Q0<imp-use,kill>, Q0<imp-def>
    s0 = VLDRS [r0, 0], Q0<imp-use,kill>, Q0<imp-def>
    s2 = VLDRS [r0, 4], Q0<imp-use,kill>, Q0<imp-def>

we were gathering the {s0, s1} loads below the s3 load. This is fine,
but confused the verifier since now the s3 load had Q0<imp-use> with
no definition above it.

This should mark such uses <undef> as well. The liveness structure at
the beginning and end of the block is unaffected, and the true sN
definitions should prevent any dodgy reorderings being introduced
elsewhere.

rdar://problem/15124449

llvm-svn: 192344
2013-10-10 09:28:20 +00:00
Benjamin Kramer
2f83eef8d9 Flip the ownership of MCStreamer and MCTargetStreamer.
MCStreamer now owns the target streamer. This prevents leaking the target
streamer.

llvm-svn: 192303
2013-10-09 17:23:41 +00:00
Rafael Espindola
6267c79fdb Add a MCTargetStreamer interface.
This patch fixes an old FIXME by creating a MCTargetStreamer interface
and moving the target specific functions for ARM, Mips and PPC to it.

The ARM streamer is still declared in a common place because it is
used from lib/CodeGen/ARMException.cpp, but the Mips and PPC are
completely hidden in the corresponding Target directories.

I will send an email to llvmdev with instructions on how to use this.

llvm-svn: 192181
2013-10-08 13:08:17 +00:00
Manman Ren
b284db0070 Struct byval: use the correct alignment for loads generated to load
from struct byval to registers.

We used to pass 0 which means the alignment of PtrVT. Even when the alignment
of the struct is smaller than 4, the LOADs would have alignment of 4, and
further optimizations could combine the LOADs into a ldm, which would
cause crash.

The fix is to pass the alignment of the struct byval.

rdar://problem/15144402

llvm-svn: 192126
2013-10-07 19:47:53 +00:00
Amara Emerson
688cdc2151 [ARM] Improve build attributes emission.
llvm-svn: 192111
2013-10-07 16:55:23 +00:00
Rafael Espindola
e60c3625e3 Remove getEHExceptionRegister and getEHHandlerRegister.
They haven't been used for a long time. Patch by MathOnNapkins.

llvm-svn: 192099
2013-10-07 13:39:22 +00:00
Tim Northover
1979375a30 ARM: allow cortex-m0 to use hint instructions
The hint instructions ("nop", "yield", etc) are mostly Thumb2-only, but have
been ported across to the v6M architecture. Fortunately, v6M seems to sit
nicely between v6 (thumb-1 only) and v6T2, so we can add a feature for it
fairly easily.

rdar://problem/15144406

llvm-svn: 192097
2013-10-07 11:10:47 +00:00
Rafael Espindola
a1a1d34e51 Remove some really nasty uses of hasRawTextSupport.
When MC was first added, targets could use hasRawTextSupport to keep features
working before they were added to the MC interface.

The design goal of MC is to provide an uniform api for printing assembly and
object files. Short of relaxations and other corner cases, a object file is
just another representation of the assembly.

It was never the intention that targets would keep doing things like

if (hasRawTextSupport())
  Set flags in one way.
else
  Set flags in another way.

When they do that they create two code paths and the object file is no longer
just another representation of the assembly. This also then requires testing
with llc -filetype=obj, which is extremelly brittle.

This patch removes some of these hacks by replacing them with smaller ones.
The ARM flag setting is trivial, so I just moved it to the constructor. For
Mips, the patch adds two temporary hack directives that allow the assembly
to represent the same things as the object file was already able to.

The hope is that the mips developers will replace the hack directives with
the same ones that gas uses and drop the -print-hack-directives flag.

I will also try to implement a target streamer interface, so that we can
move this out of the common code.

In summary, for any new work, two rules of the thumb are
  * Don't use "llc -filetype=obj" in tests.
  * Don't add calls to hasRawTextSupport.

llvm-svn: 192035
2013-10-05 16:42:21 +00:00
Matthias Braun
f7ddf86363 ARM: optimizeSelect has to consider the previous register class
optimizeSelect folds (predicated) copy instructions, it must not ignore
the original register class of the operand when replacing the register
with the copies dest register.

llvm-svn: 191963
2013-10-04 16:52:56 +00:00
Matthias Braun
fbba53e45c ARM: do not add a regmask for TAILJUMPs
The jump doesn't really kill the registers, the following call does but
we never get back anyway.
This avoids some verify-machineinstrs problems when TAILJUMPs are
if-converted.

llvm-svn: 191962
2013-10-04 16:52:54 +00:00
Matthias Braun
ae6465eb28 ARM: preserve undef flag in pseudo instruction expanders
Copy over the whole register machine operand instead of creating a new one
with an incomplete set of flags.

llvm-svn: 191961
2013-10-04 16:52:51 +00:00
Amara Emerson
ece8c5f612 [ARM] Warn on deprecated IT blocks in v8 AArch32 assembly.
Patch by Artyom Skrobov.

llvm-svn: 191885
2013-10-03 09:31:51 +00:00
Tim Northover
684a0e633d ARM: support interrupt attribute
This function-attribute modifies the callee-saved register list and function
epilogue (specifically the return instruction) so that a routine is suitable
for use as an interrupt-handler of the specified type without disrupting
user-mode applications.

rdar://problem/14207019

llvm-svn: 191766
2013-10-01 14:33:28 +00:00
Joey Gouly
7f15046246 [ARM] Remove an unused function from the disassembler.
Pointed out by Joerg.

llvm-svn: 191749
2013-10-01 13:01:10 +00:00
Joey Gouly
12afb60cf2 [ARM] Introduce the 'sevl' instruction in ARMv8.
This also removes the restriction on the immediate field of the 'hint'
instruction.

llvm-svn: 191744
2013-10-01 12:39:11 +00:00
Tilmann Scheller
8255c22a0e [ARM] Clean up ARMAsmParser::validateInstruction().
Fix some LLVM Coding Standards violations.

No changes in functionality.

llvm-svn: 191686
2013-09-30 17:57:30 +00:00
Tilmann Scheller
e69a5118c3 [ARM] Assembler: ARM LDRD with writeback requires the base register to be different from the destination registers.
See ARM ARM A8.8.72.

Violating this constraint results in unpredictable behavior.

llvm-svn: 191678
2013-09-30 16:11:48 +00:00
Arnold Schwaighofer
a8d7eeffb9 Swift model: Fix uop description on some writes
Those writes really need two/three uops.

llvm-svn: 191677
2013-09-30 15:56:34 +00:00
Arnold Schwaighofer
47322176be IfConverter: Use TargetSchedule for instruction latencies
For targets that have instruction itineraries this means no change. Targets
that move over to the new schedule model will use be able the new schedule
module for instruction latencies in the if-converter (the logic is such that if
there is no itineary we will use the new sched model for the latencies).

Before, we queried "TTI->getInstructionLatency()" for the instruction latency
and the extra prediction cost. Now, we query the TargetSchedule abstraction for
the instruction latency and TargetInstrInfo for the extra predictation cost. The
TargetSchedule abstraction will internally call "TTI->getInstructionLatency" if
an itinerary exists, otherwise it will use the new schedule model.

ATTENTION: Out of tree targets!

(I will also send out an email later to LLVMDev)

This means, if your target implements

 unsigned getInstrLatency(const InstrItineraryData *ItinData,
                          const MachineInstr *MI,
                          unsigned *PredCost);

and returns a value for "PredCost", you now also need to implement

 unsigned getPredictationCost(const MachineInstr *MI);

(if your target uses the IfConversion.cpp pass)

radar://15077010

llvm-svn: 191671
2013-09-30 15:28:56 +00:00
Robert Wilhelm
198f21deb3 Even more spelling fixes for "instruction".
llvm-svn: 191611
2013-09-28 13:42:22 +00:00
Tilmann Scheller
758b35d6b8 ARM: Teach assembler to enforce constraints for ARM LDRD destination register operands.
As specified in A8.8.72/A8.8.73/A8.8.74 in the ARM ARM, all variants of the ARM LDRD instruction have the following two constraints:

LDRD<c> <Rt>, <Rt2>, ...

(a) Rt must be even-numbered and not r14
(b) Rt2 must be R(t+1)

If those two constraints are not met the result of executing the instruction will be unpredictable.

Constraint (b) was already enforced, this commit adds support for constraint (a).

Fixes rdar://14479793.

llvm-svn: 191520
2013-09-27 13:28:17 +00:00
Tilmann Scheller
3f3aef8ded Fix comment.
llvm-svn: 191505
2013-09-27 10:38:11 +00:00
Tilmann Scheller
f2c27b743a ARM: Teach assembler to enforce constraint for Thumb2 LDRD (literal/immediate) destination register operands.
LDRD<c> <Rt>, <Rt2>, <label>
LDRD<c> <Rt>, <Rt2>, [<Rn>{, #+/-<imm>}]
LDRD<c> <Rt>, <Rt2>, [<Rn>], #+/-<imm>
LDRD<c> <Rt>, <Rt2>, [<Rn>, #+/-<imm>]!

As specified in A8.8.72/A8.8.73 in the ARM ARM, the T1 encoding has a constraint which enforces that Rt != Rt2.

If this constraint is not met the result of executing the instruction will be unpredictable.

Fixes rdar://14479780.

llvm-svn: 191504
2013-09-27 10:30:18 +00:00
Weiming Zhao
c16af8ee70 Fix PR 17372: Emitting PLD for stack address for ARM Thumb2
t2PLDi12, t2PLDi8, t2PLDs was omitted in Thumb2InstrInfo.
This patch fixes it.

llvm-svn: 191441
2013-09-26 17:25:10 +00:00
Amara Emerson
80d8b3db1e [ARM] Use the load-acquire/store-release instructions optimally in AArch32.
Patch by Artyom Skrobov.

llvm-svn: 191428
2013-09-26 12:22:36 +00:00
Weiming Zhao
14a079be0c Fix PR 17368: disable vector mul distribution for square of add/sub for ARM
Generally, it is desirable to distribute (a + b) * c to a*c + b*c for
ARM with VMLx forwarding, where a, b and c are vectors.
However, for (a + b)*(a + b), distribution will result in one extra
instruction.
With distribution:
  x = a + b (add)
  y = a * x (mul)
  z = y + b * y (mla)

Without distribution:
  x = a + b (add)
  z = x * x (mul)

This patch checks if a mul is a square of add/sub. If yes, skip
distribution.

llvm-svn: 191410
2013-09-25 23:12:06 +00:00
Andrew Trick
3b462e7046 CriticalAntiDepBreaker is no longer needed for armv7 scheduling.
This is being disabled because it is no longer needed for
performance. It is only used by postRAscheduler which is also planned
for removal, and it is implemented with an out-dated view of register
liveness. It consideres aliases instead of register units, assumes
valid kill flags, and assumes implicit uses on partial register
defs. Kill flags and implicit operands are error prone and impossible
to verify. We should gradually eliminate dependence on them in the
postRA phases.

Targets that still benefit from this should move to the MI
scheduler. If that doesn't solve the problem, then we should add a
hook to regalloc to optimize reload placement.

llvm-svn: 191348
2013-09-25 00:26:16 +00:00
Amara Emerson
4331719810 [ARM] Split A/R class into separate subtarget features.
Patch by Bradley Smith.

llvm-svn: 191202
2013-09-23 14:26:15 +00:00
Tim Northover
c9a7e47164 ISelDAG: spot chain cycles involving MachineNodes
Previously, the DAGISel function WalkChainUsers was spotting that it
had entered already-selected territory by whether a node was a
MachineNode (amongst other things). Since it's fairly common practice
to insert MachineNodes during ISelLowering, this was not the correct
check.

Looking around, it seems that other nodes get their NodeId set to -1
upon selection, so this makes sure the same thing happens to all
MachineNodes and uses that characteristic to determine whether we
should stop looking for a loop during selection.

This should fix PR15840.

llvm-svn: 191165
2013-09-22 08:21:56 +00:00
Richard Mitton
fe7ffd01f6 Added support for generate DWARF .debug_aranges sections automatically.
llvm-svn: 191052
2013-09-19 23:21:01 +00:00
Amara Emerson
7ad0409c56 [ARMv8] Add support for the v8 cryptography extensions.
llvm-svn: 190996
2013-09-19 11:59:01 +00:00
Joey Gouly
e14ac63b96 [ARMv8] Add CRC instructions.
Patch by Bradley Smith!

llvm-svn: 190928
2013-09-18 09:45:55 +00:00
Joey Gouly
10dd14fb9e [ARM] Fix the deprecation of MCR encodings that map to CP15{ISB,DSB,DMB}.
llvm-svn: 190862
2013-09-17 09:54:57 +00:00
Benjamin Kramer
a65d61ae4b ARM: Deduplicate ConstantPoolValues.
llvm-svn: 190779
2013-09-16 10:17:31 +00:00
Benjamin Kramer
b6950f09cc Replace some unnecessary vector copies with references.
llvm-svn: 190770
2013-09-15 22:04:42 +00:00
Robert Wilhelm
78f2958d70 Fix spelling.
llvm-svn: 190749
2013-09-14 09:34:24 +00:00
Joey Gouly
0af412fe63 [ARMv8] Change hasV8Fp to hasFPARMv8, and other command line options
to be more consistent.

llvm-svn: 190692
2013-09-13 13:46:57 +00:00
Joey Gouly
2b0127dd73 [ARMv8] Emit the proper .fpu directive.
Patch by Bradley Smith!

llvm-svn: 190683
2013-09-13 11:51:52 +00:00
Joey Gouly
fccb3bcae3 Add an instruction deprecation feature to TableGen.
The 'Deprecated' class allows you to specify a SubtargetFeature that the
instruction is deprecated on.

The 'ComplexDeprecationPredicate' class allows you to define a custom
predicate that is called to check for deprecation.
For example:
  ComplexDeprecationPredicate<"MCR">

would mean you would have to define the following function:
  bool getMCRDeprecationInfo(MCInst &MI, MCSubtargetInfo &STI,
                             std::string &Info)

Which returns 'false' for not deprecated, and 'true' for deprecated
and store the warning message in 'Info'.

The MCTargetAsmParser constructor was chaned to take an extra argument of
the MCInstrInfo class, so out-of-tree targets will need to be changed.

llvm-svn: 190598
2013-09-12 10:28:05 +00:00
Jim Grosbach
73124a1d77 ARM: Use the PICADD opcode calculated.
We were figuring out whether to use tPICADD or PICADD, then just using
tPICADD unconditionally anyway. Oops.

A testcase from someone familiar enough with ELF to produce one would
be appreciated. The existing PIC testcase correctly verifies the .s
generated, but that doesn't catch this bug, which only showed up in
direct-to-object mode.

http://llvm.org/bugs/show_bug.cgi?id=17180

llvm-svn: 190417
2013-09-10 17:21:39 +00:00
Logan Chien
dca7f7adf8 Remove unused private member in ARMAsmPrinter.cpp.
This commit removes the unused "AttributeItem" from
ObjectAttributeEmitter.

llvm-svn: 190412
2013-09-10 15:10:02 +00:00
Joey Gouly
03af45ccfe [ARMv8] Prevent generation of deprecated IT blocks on ARMv8 in Thumb mode.
IT blocks can only be one instruction lonf, and can only contain a subset of
the 16 instructions.

Patch by Artyom Skrobov!

llvm-svn: 190309
2013-09-09 14:21:49 +00:00
Bill Wendling
2c532e9c9b Generate compact unwind encoding from CFI directives.
We used to generate the compact unwind encoding from the machine
instructions. However, this had the problem that if the user used `-save-temps'
or compiled their hand-written `.s' file (with CFI directives), we wouldn't
generate the compact unwind encoding.

Move the algorithm that generates the compact unwind encoding into the
MCAsmBackend. This way we can generate the encoding whether the code is from a
`.ll' or `.s' file.

<rdar://problem/13623355>

llvm-svn: 190290
2013-09-09 02:37:14 +00:00
Joey Gouly
071ca2ff6d [ARMv8] Implement the new DMB/DSB operands.
This removes the custom ISD Node: MEMBARRIER and replaces it
with an intrinsic.

llvm-svn: 190055
2013-09-05 15:35:24 +00:00
Richard Barton
48e8048205 Add AArch32 DCPS{1,2,3} and HLT instructions.
These were pretty straightforward instructions, with some assembly support
required for HLT.

The ARM assembler is keen to split the instruction mnemonic into a
(non-existent) 'H' instruction with the LT condition code. An exception for
HLT is needed.

HLT follows the same rules as BKPT when in IT blocks, so the special BKPT
hadling code has been adapted to handle HLT also.

Regression tests added including diagnostic tests for out of range immediates
and illegal condition codes, as well as negative tests for pre-ARMv8.

llvm-svn: 190053
2013-09-05 14:14:19 +00:00
Tilmann Scheller
31cc184566 Reverting 190043 for now.
Solution is not sufficient to prevent 'mov pc, lr' being emitted for jump table code.
Test case doesn't trigger the added functionality.

llvm-svn: 190047
2013-09-05 11:59:43 +00:00
Tilmann Scheller
14c2ce0a1e ARM: Add GPR register class excluding LR for use with the ADR instruction.
This improves code generation for jump tables by avoiding the emission of "mov pc, lr" which could fool the processor into believing this is a return from a function causing mispredicts. The code generation logic for jump tables uses ADR to materialize the address of the jump target.

Patch by Daniel Stewart!
   

llvm-svn: 190043
2013-09-05 11:10:31 +00:00
Jim Grosbach
067a40bdd0 ARM: Teach A15 SDOptimizer to properly handle D-reg by-lane.
These instructions, such as vmul.f32, require the second source operand to
be in D0-D15 rather than the full D0-D31. When optimizing, make sure to
account for that by constraining the register class of a replacement virtual
register to be compatible with the virtual register(s) it's replacing.

I've been unsuccessful in creating a non-fragile regression test. This issue
was detected by the LLVM nightly test suite running on an A15 (Bullet).

PR17093: http://llvm.org/bugs/show_bug.cgi?id=17093
llvm-svn: 189972
2013-09-04 19:08:44 +00:00
Arnold Schwaighofer
ac9a5042d8 Swift: Only build vldm/vstm with q register aligned register lists
Unaligned vldm/vstm need more uops and therefore are slower in general on swift.

radar://14522102

llvm-svn: 189961
2013-09-04 17:41:16 +00:00
Silviu Baranga
f1ba2ead74 Fix scheduling for vldm/vstm instructions that load/store more than 32 bytes on Cortex-A9. This also makes the existing code more compact.
llvm-svn: 189958
2013-09-04 17:05:18 +00:00
Jim Grosbach
4f219b5c7e Revert "Revert "ARM: Improve pattern for isel mul of vector by scalar.""
This reverts commit r189648.

Fixes for the previously failing clang-side arm_neon_intrinsics test
cases will be checked in separately.

llvm-svn: 189841
2013-09-03 20:08:17 +00:00
Tilmann Scheller
e8598ae406 ARM: Default to the Swift CPU when targeting armv7s/thumbv7s.
Test cases adjusted accordingly.

This fixes rdar://14871821.

llvm-svn: 189766
2013-09-02 17:09:01 +00:00
Tilmann Scheller
9205705478 Revert 189756 for now, it doesn't match what rdar://14871821 really wants.
What we really want is to enable Swift by default for *v7s triples (and there already seems to be some logic which attempts to do that). In that case the iOS version doesn't matter. 

llvm-svn: 189763
2013-09-02 15:48:17 +00:00
Tilmann Scheller
9e0d3ff678 ARM: Default to Swift when compiling for iOS 6 or later.
Test cases adjusted accordingly.

This fixes rdar://14871821.

llvm-svn: 189756
2013-09-02 12:01:58 +00:00
Charles Davis
5191e0b0d0 Move everything depending on Object/MachOFormat.h over to Support/MachO.h.
llvm-svn: 189728
2013-09-01 04:28:48 +00:00
Michael Gottesman
113e9285a1 Revert "ARM: Improve pattern for isel mul of vector by scalar."
This reverts commit r189619.

The commit was breaking the arm_neon_intrinsic test.

llvm-svn: 189648
2013-08-30 05:36:14 +00:00
Jim Grosbach
7089633cb9 ARM: Improve pattern for isel mul of vector by scalar.
In addition to recognizing when the multiply's second argument is
coming from an explicit VDUPLANE, also look for a plain scalar
f32 reference and reference it via the corresponding vector
lane.

rdar://14870054

llvm-svn: 189619
2013-08-29 22:41:46 +00:00
Cameron Esfahani
ee2e44fcc0 Clean up some usage of Triple. The base class has methods for determining if the target is iOS and Linux.
llvm-svn: 189604
2013-08-29 20:23:14 +00:00
Joey Gouly
fdf2105da5 [ARMv8]
Fix a few things in one swoop.

# Add some negative tests.
# Fix some formatting issues.
# Add some missing IsThumb / ARMv8
# Fix some outs / ins mistakes.

llvm-svn: 189490
2013-08-28 16:39:20 +00:00
Tim Northover
02c638e450 ARM: Use "dmb sy" for barriers on M-class CPUs
The usual default of "dmb ish" (inner-shareable) isn't even a valid instruction
on v6M or v7M (well, it does the same thing but software is strongly
discouraged from using it) so we should emit a full-system barrier there.

llvm-svn: 189483
2013-08-28 14:39:19 +00:00
Joey Gouly
555c84341e [ARMv8] Add a missing IsThumb to t2LDAEXD.
llvm-svn: 189482
2013-08-28 14:33:35 +00:00
Tim Northover
490c4c1bda ARM: remove unused v(add|sub)hn and vqdml[as]l intrinsics.
Clang is now generating cleaner IR, so this removes the old variants which
should be completely unused.

llvm-svn: 189481
2013-08-28 14:33:33 +00:00
Tim Northover
e4e6bb8e0e ARM: add patterns for vqdmlal with separate vqdmull and vqadds
The vqdmlal and vqdmlls instructions are really just a fused pair consisting of
a vqdmull.sN and a vqadd.sN. This adds patterns to LLVM so that we can switch
Clang's CodeGen over to generating these instead of the special vqdmlal
intrinsics.

llvm-svn: 189480
2013-08-28 12:15:16 +00:00
Joey Gouly
c935e7defb [ARMv8] Add MC support for the new load/store acquire/release instructions.
llvm-svn: 189388
2013-08-27 17:38:16 +00:00
Joey Gouly
740189864d [ARMv8] Add some negative tests for the recent VFP/NEON instructions.
Fix two issues I found while writing these tests.

llvm-svn: 189341
2013-08-27 11:24:16 +00:00
Tim Northover
24c6842d69 ARM: add natural patterns for vaddhl and vsubhl.
These instructions aren't particularly complicated and it's well worth having
patterns for some reasonably useful LLVM IR that will match them. Soon we
should be able to switch Clang over to producing this natural version.

llvm-svn: 189335
2013-08-27 10:31:36 +00:00
Charles Davis
6e439dabdb Revert "Fix the build broken by r189315." and "Move everything depending on Object/MachOFormat.h over to Support/MachO.h."
This reverts commits r189319 and r189315. r189315 broke some tests on what I
believe are big-endian platforms.

llvm-svn: 189321
2013-08-27 05:38:30 +00:00
Charles Davis
cecfbfaf57 Move everything depending on Object/MachOFormat.h over to Support/MachO.h.
llvm-svn: 189315
2013-08-27 05:00:43 +00:00
Jim Grosbach
f88c4f8dbc ARM: Constrain regclass for TSTri instruction.
Get the register class right for the TST instruction. This keeps the
machine verifier happy, enabling us to turn it on for another test.

rdar://12594152

llvm-svn: 189274
2013-08-26 20:22:05 +00:00
Jim Grosbach
0cc0fac41e ARM: FastISel verifier error cleanup.
Constant pool and global value reference instructions need more
restricted register classes than plain GPR.

rdar://12594152

llvm-svn: 189270
2013-08-26 20:07:29 +00:00
Jim Grosbach
2b30758f49 ARM: Fix ELF global base reg intialization.
The create machine code wasn't properly in SSA, which the machine verifier
properly complains about. Now that fast-isel is closer to verifier clean,
errors like this show up more clearly.

Additionally, the Thumb pseudo tPICADD was used for both ARM and Thumb
mode functions, which is obviously wrong. Fix that along the way.

Test case is part of the following commit which will finish making an
additional fast-isel test verifier clean an enable it for the
regression test suite. This commit is separate since its not just
a verifier cleanup, but an actual correctness issue.

rdar://12594152 (for the fast-isel verifier aspects)

llvm-svn: 189269
2013-08-26 20:07:25 +00:00
Joey Gouly
25375f9ffb [ARM] Fix another ARM FastISel -verify-machineinstrs issue.
llvm-svn: 189109
2013-08-23 15:20:56 +00:00
Joey Gouly
9ebd1c7d68 [ARMv8] Add CodeGen for VMAXNM/VMINNM.
llvm-svn: 189103
2013-08-23 12:01:13 +00:00
Tim Northover
7c24b95efe ARM: make sure ARM-mode pseudo-inst requires IsARM
I'd forgotten that "Requires" blocks override rather than add to the
constraints, so my pseudo-instruction was being selected in Thumb mode leading
to nonsense instructions.

rdar://problem/14817358

llvm-svn: 189096
2013-08-23 10:16:39 +00:00
Joey Gouly
355a09f268 [ARMv8] Add CodeGen support for VSEL.
This uses the ARMcmov pattern that Tim cleaned up in r188995.

Thanks to Simon Tatham for his floating point help!

llvm-svn: 189024
2013-08-22 15:29:11 +00:00
Mihai Popa
dfdccf5f00 Fix ARM vcvt encoding when the number of fractional bits is zero.
The instruction to convert between floating point and fixed point representations
takes an immediate operand for the number of fractional bits of the fixed point
value. ARMARM specifies that when that number of bits is zero, the assembler
should encode floating point/integer conversion instructions. 

This patch adds the necessary instruction aliases to achieve this behaviour.

llvm-svn: 189009
2013-08-22 13:16:07 +00:00
Joey Gouly
67edac2b5e [ARM] Constrain some register classes in EmitAtomicBinary64 so that
we pass these tests with -verify-machineinstrs.

llvm-svn: 189006
2013-08-22 12:19:24 +00:00
Logan Chien
2891b0c61c Fix ARM FastISel PIC function call.
The function call to external function should come with PLT relocation
type if the PIC relocation model is used.

llvm-svn: 189002
2013-08-22 12:08:04 +00:00
Tim Northover
eb7a86ed88 ARM: use TableGen patterns to select CMOV operations.
Back in the mists of time (2008), it seems TableGen couldn't handle the
patterns necessary to match ARM's CMOV node that we convert select operations
to, so we wrote a lot of fairly hairy C++ to do it for us.

TableGen can deal with it now: there were a few minor differences to CodeGen
(see tests), but nothing obviously worse that I could see, so we should
probably address anything that *does* come up in a localised manner.

llvm-svn: 188995
2013-08-22 09:57:11 +00:00
Tim Northover
7e34f41ef8 ARM: respect tied 64-bit inlineasm operands when printing
The code for 'Q' and 'R' operand modifiers needs to look through tied
operands to discover the register class.

llvm-svn: 188990
2013-08-22 06:51:04 +00:00
Jim Grosbach
d6ff6507fa ARM: R9 is not safe to use for tcGPR.
Indirect tail-calls shouldn't use R9 for the branch destination, as
it's not reliably a call-clobbered register.

rdar://14793425

llvm-svn: 188967
2013-08-22 00:14:24 +00:00
Mihai Popa
721d6a74eb Make "mov" work for all Thumb2 MOV encodings
According to the ARM specification, "mov" is a valid mnemonic for all Thumb2 MOV encodings.
To achieve this, the patch adds one instruction alias with a special range condition to avoid collision with the Thumb1 MOV.

llvm-svn: 188901
2013-08-21 13:14:58 +00:00
Jim Grosbach
343f1fbc39 ARM: Fix fast-isel copy/paste-o.
Update testcase to be more careful about checking register
values. While regexes are general goodness for these sorts of
testcases, in this example, the registers are constrained by
the calling convention, so we can and should check their
explicit values.

rdar://14779513

llvm-svn: 188819
2013-08-20 19:12:42 +00:00
Tim Northover
cec1079024 ARM: implement some simple f64 materializations.
Previously we used a const-pool load for virtually all 64-bit floating values.
Actually, we can get quite a few common values (including 0.0, 1.0) via "vmov"
instructions of one stripe or another.

llvm-svn: 188773
2013-08-20 08:57:11 +00:00
Mihai Popa
db3c6274ef Thumb2 add immediate alias for SP
The Thumb2 add immediate is in fact defined for SP. The manual is misleading as it points to a different section for add immediate with SP, however the encoding is the same as for add immediate with register only with the SP operand hard coded. As such add immediate with SP and add immediate with register can safely be treated as the same instruction.

All the patch does is adjust a register constraint on an instruction alias.

llvm-svn: 188676
2013-08-19 15:02:25 +00:00
Tim Northover
057a4d7c26 ARM: make sure we keep inline asm operands tied.
When patching inlineasm nodes to use GPRPair for 64-bit values, we
were dropping the information that two operands were tied, which
effectively broke the live-interval of vregs affected.

llvm-svn: 188643
2013-08-18 18:06:03 +00:00
Jim Grosbach
58bafdb9f1 ARM: Properly constrain comparison fastisel register classes.
Ongoing 'make the verifier happy' improvements to ARM fast-isel.

rdar://12594152

llvm-svn: 188595
2013-08-16 23:37:40 +00:00
Jim Grosbach
4829862a24 ARM: Fast-isel register class constrain for extends.
Properly constrain the operand register class for instructions used
in [sz]ext expansion. Update more tests to use the verifier now that
we're getting the register classes correct.

rdar://12594152

llvm-svn: 188594
2013-08-16 23:37:36 +00:00
Jim Grosbach
de05043e78 ARM: Fix more fast-isel verifier failures.
Teach the generic instruction selection helper functions to constrain
the register classes of their input operands. For non-physical register
references, the generic code needs to be careful not to mess that up
when replacing references to result registers. As the comment indicates
for MachineRegisterInfo::replaceRegWith(), it's important to call
constrainRegClass() first.

rdar://12594152

llvm-svn: 188593
2013-08-16 23:37:31 +00:00
Jim Grosbach
7f992f45da ARM: Clean up fast-isel machine verifier errors.
Lots of machine verifier errors result from using a plain GPR regclass
for incoming argument copies. A more restrictive rGPR class is more
appropriate since it more accurately represents what's happening, plus
it lines up better with isel later on so the verifier is happier.
Reduces the number of ARM fast-isel tests not running with the verifier
enabled by over half.

rdar://12594152

llvm-svn: 188592
2013-08-16 23:37:23 +00:00
Benjamin Kramer
700f0ccf14 When initializing the PIC global base register on ARM/ELF add pc to fix the address.
This unbreaks PIC with fast isel on ELF targets (PR16717). The output matches
what GCC and SDag do for PIC but may not cover all of the many flavors of PIC
that exist.

llvm-svn: 188551
2013-08-16 12:52:08 +00:00
Mihai Popa
a9e072fd76 Add support for Thumb2 literal loads with negative zero offset
Thumb2 literal loads use an offset encoding which allows for 
negative zero. This fixes parsing and encoding so that #-0 
is correctly processed. The parser represents #-0 as INT32_MIN.

llvm-svn: 188549
2013-08-16 12:03:00 +00:00
Mihai Popa
cbf5f426e7 Fix Thumb2 aliasing complementary instructions taking modified immediates
There are many Thumb instructions which take 12-bit immediates encoded in a special
8-byte value + 4-byte rotator form. Not all numbers are represented, and it's legal
to transform an assembly instruction to be able to encode the immediate.

For example: AND and BIC are complementary instructions; one can switch the AND
to a BIC as long as the immediate is complemented. 

The intent is to switch one instruction into its complementary one when the immediate
cannot be encoded in the form requested in the original assembly and when the 
complementary immediate is encodable.

The patch addresses two issues:
1. definition of t2SOImmNot immediate - it has to check that the orignal value is
not encoded naturally
2. t2AND and t2BIC instruction aliases which should use the Thumb2 SOImm operand 
rather than the ARM one.

llvm-svn: 188548
2013-08-16 11:55:44 +00:00
Renato Golin
cacfb5333a make arm-use-movt available for all ARM
Before this patch this flag is IOS specific, but is also
useful for bare project like bootloaders / kernels etc,
since movw / movt prevents simple relocation. Therefore
make this flag more commonly available.

note: this patch depends on a similiar rename in clang

Patch by Jeroen Hofstee.

llvm-svn: 188487
2013-08-15 20:54:38 +00:00
Renato Golin
9877f793c2 make arm-reserve-r9 available for all ARM
r9 is defined as a platform-specific register in the ARM EABI.
It can be reserved for a special purpose or be used as a general
purpose register. Add support for reserving r9 for all ARM, while
leaving the IOS usage unchanged.

Patch by Jeroen Hofstee.

llvm-svn: 188485
2013-08-15 20:45:13 +00:00
Mihai Popa
95a5647431 This fixes three issues related to Thumb literal loads:
1. The offset range for Thumb1 PC relative loads is [0..1020] and not [-1024..1020]
2. Thumb2 PC relative loads may define the PC, so the restriction placed on target register is removed
3. Removes unneeded alias between "ldr.n" and t1LDRpci. ".n" is actually stripped by both tablegen
and the ASM parser, so this alias rule really does nothing

llvm-svn: 188466
2013-08-15 15:43:06 +00:00
Craig Topper
2653227b8f Replace getValueType().getSimpleVT() with getSimpleValueType(). Also remove one weird cast from MVT->EVT just to call getSimpleVT().
llvm-svn: 188441
2013-08-15 02:33:50 +00:00
Renato Golin
5068d4579d Let t2LDRBi8 and t2LDRBi12 have same Base Pointer
When determining if two different loads are from the same base address,
this patch allows one load to use a t2LDRi8 address mode and another to
use a t2LDRi12 address mode. The current implementation is very
conservative and this allows the case of differing Thumb2 byte loads to
be considered. Allowing these differing modes instead of forcing the exact
same opcode is useful for situations where one opcodes loads from a base
address+1 and a second opcode loads for a base address-1.

Patch by Daniel Stewart.

llvm-svn: 188385
2013-08-14 16:35:29 +00:00
Joey Gouly
c5221430a1 ARMv8: SWP and SWPB are obsoleted on ARMv8.
llvm-svn: 188288
2013-08-13 16:40:47 +00:00
Mihai Popa
f30d3c418d Fix signed overflow in when computing encodings for ADR instructions
llvm-svn: 188268
2013-08-13 14:02:13 +00:00
Benjamin Kramer
169eb89854 Add a overload to CostTable which allows it to infer the size of the table.
Use it to avoid repeating ourselves too often. Also store MVT::SimpleValueType
in the TTI tables so they can be statically initialized, MVT's constructors
create bloated initialization code otherwise.

llvm-svn: 188095
2013-08-09 19:33:32 +00:00
Mihai Popa
5aabbdc839 This fixes the Thumb2 CPS assembly syntax.
In Thumb1, only one variant is supported: CPS{effect} {flags}

Thumb2 supports three:
CPS{effect}.W {flags}
CPS{effect} {flags} {mode}
CPS {mode}

Canonically, .W should be used only when ambiguity is present between encodings of different width.
The wide suffix is still accepted for the latter two forms via aliases.

llvm-svn: 188071
2013-08-09 13:52:32 +00:00
Mihai Popa
b9642a712a Fix assembling of Thumb2 branch instructions.
The long encoding for Thumb2 unconditional branches is broken.
Additionally, there is no range checking for target operands; as such 
for instructions originating in assembly code, only short Thumb encodings
are generated, regardless of the bitsize needed for the offset.

Adding range checking is non trivial due to the representation of Thumb
branch instructions. There is no true difference between conditional and
unconditional branches in terms of operands and syntax - even unconditional
branches have a predicate which is expected to match that of the IT block
they are in. Yet, the encodings and the permitted size of the offset differ.

Due to this, for any mnemonic there are really 4 encodings to choose for.

The problem cannot be handled in the parser alone or by manipulating td files.
Because the parser builds first a set of match candidates and then checks them
one by one, whatever tablegen-only solution might be found will ultimately be
dependent of the parser's evaluation order. What's worse is that due to the fact
that all branches have the same syntax and the same kinds of operands, that 
order is governed by the lexicographical ordering of the names of operand 
classes...

To circumvent all this, any necessary disambiguation is added to the instruction
validation pass.

llvm-svn: 188067
2013-08-09 10:38:32 +00:00
Silviu Baranga
0291b43e25 Remove the now redundant FeatureFP16 from the Cortex-A15 feature list. It was made redundant when FeatureVFP4 was added which implies FP16.
llvm-svn: 187985
2013-08-08 15:47:33 +00:00
Mihai Popa
cda7008bd4 The name "tCDP" isn't used anywhere else in the source code, so renaming it for consistency doesn't cause any problems.
This is the only Thumb2 instruction defined with "t" prefix; all other Thumb2 instructions have "t2" prefix (e.g. "t2CDP2" which is defined immediately afterwards).

Patch by Artyom Skrobov.

llvm-svn: 187973
2013-08-08 10:20:41 +00:00
Mihai Popa
89848e0624 This corrects creation of operands for t2PLDW. It also removes the definition of t2PLDWpci,
as pldw does not have a literal variant (i.e. pc relative version)

llvm-svn: 187804
2013-08-06 16:07:46 +00:00
Mihai Popa
154c25a9c4 Support APSR_nzcv as operand for Thumb2 mrc. Deprecate pre-UAL syntax (pc instead of apsr_nzcv)
llvm-svn: 187803
2013-08-06 15:52:36 +00:00
Tim Northover
d79219981f ARM: implement allowTruncateForTailCall
Now that it's in place, it seems silly not to let ARM make use of the extra
tail call opportunities.

llvm-svn: 187795
2013-08-06 13:58:03 +00:00
NAKAMURA Takumi
0eb9242c56 Target/*/CMakeLists.txt: Add the dependency to CommonTableGen explicitly for each corresponding CodeGen.
Without explicit dependencies, both per-file action and in-CommonTableGen action could run in parallel.
It races to emit *.inc files simultaneously.

llvm-svn: 187780
2013-08-06 06:38:37 +00:00
Benjamin Kramer
52b0884241 ARMAsmParser: Plug a leak.
Using an object to do the cleanup may look like overkill, but it's safer and nicer than putting deletes everywhere.

llvm-svn: 187696
2013-08-03 22:16:24 +00:00
Joey Gouly
061cc31d3b Add a missing 'return' statement.
llvm-svn: 187671
2013-08-02 20:50:01 +00:00
Joey Gouly
9c798eabce [ARMv8] Add an assembler warning for the deprecated 'setend' instruction.
llvm-svn: 187666
2013-08-02 19:18:12 +00:00
Renato Golin
dbd9622c0b Fixes ARM LNT bot from SLP change in O3
This patch fixes the multiple breakages on ARM test-suite after the SLP
vectorizer was introduced by default on O3. The problem was an illegal
vector type on ARMTTI::getCmpSelInstrCost() <3 x i1> which is not simple.

The guard protects this code from breaking (cause of the problems) but
doesn't fix the issue that is generating the odd vector in the first
place, which also needs to be investigated.

llvm-svn: 187658
2013-08-02 17:10:04 +00:00
Bill Wendling
e7b7059f1d Use function attributes to indicate that we don't want to realign the stack.
Function attributes are the future! So just query whether we want to realign the
stack directly from the function instead of through a random target options
structure.

llvm-svn: 187618
2013-08-01 21:42:05 +00:00
Kevin Enderby
47a2c30349 Added the B9.3.19 SUBS PC, LR, #imm (Thumb2) system instruction.
While the .td entry is nice and all, it takes a pretty gross hack in
ARMAsmParser::ParseInstruction() because of handling of other "subs"
instructions to get it to match.  Ran it by Jim Grosbach and he said it was
about what he expected to make this work given the existing code.

rdar://14214063

llvm-svn: 187530
2013-07-31 21:05:30 +00:00
Saleem Abdulrasool
f01cc77809 [ARM] check bitwidth in PerformORCombine
When simplifying a (or (and B A) (and C ~A)) to a (VBSL A B C) ensure that the
bitwidth of the second operands to both ands match before comparing the negation
of the values.

Split the check of the value of the second operands to the ands.  Move the cast
and variable declaration slightly higher to make it slightly easier to follow.

Bug-Id: 16700
Signed-off-by: Saleem Abdulrasool <compnerd@compnerd.org>
llvm-svn: 187404
2013-07-30 04:43:08 +00:00
Silviu Baranga
5aac9ffdd0 Allow generation of vmla.f32 instructions when targeting Cortex-A15. The patch also adds the VFP4 feature to Cortex-A15 and fixes the DontUseFusedMAC predicate so that we can still generate vmla.f32 instructions on non-darwin targets with VFP4.
llvm-svn: 187349
2013-07-29 09:25:50 +00:00
Chandler Carruth
4602399982 Create a constant pool symbol for the GOT in the ARMCGBR the same way we
do in the SDag when lowering references to the GOT: use
ARMConstantPoolSymbol rather than creating a dummy global variable. The
computation of the alignment still feels weird (it uses IR types and
datalayout) but it preserves the exact previous behavior. This change
fixes the memory leak of the global variable detected on the valgrind
leak checking bot.

Thanks to Benjamin Kramer for pointing me at ARMConstantPoolSymbol to
handle this use case.

llvm-svn: 187303
2013-07-27 11:58:26 +00:00
Chandler Carruth
36e7c40816 Fix yet another memory leak found by the vg-leak bot. Folks (including
me) should start watching this bot more as its catching lots of bugs.

The fix here is to not construct the global if we aren't going to need
it. That's cheaper anyways, and globals have highly predictable types in
practice. I've added an assert to catch skew between our manual testing
of the type and the actual type just for paranoia's sake.

Note that this pattern is actually fine in most globals because when you
build a global with a module it automatically is moved to be owned by
that module. But here, we're in isel and don't really want to do that.
The solution of not creating a global is simpler anyways.

llvm-svn: 187302
2013-07-27 11:23:08 +00:00
Quentin Colombet
995760ffc1 [ARM][ISel] Improve the lowering of vector loads.
When vectors are built from a single value, the ARM lowering issues a
scalar_to_vector node.
This node is then always morphed into a move from the general purpose unit to
the vector unit.
When the value comes from a load, this can be simplified into a vector load to
the right lane.

This patch changes the lowering of insert_vector_elt to expose a vector
friendly pattern in this situation.

This is a step toward fixing <rdar://problem/14170854>.

llvm-svn: 186999
2013-07-23 22:34:47 +00:00
Mihai Popa
cb970f1789 This adds range checking for "ldr Rn, [pc, #imm]" Thumb
instructions. With this patch:

1. ldr.n is recognized as mnemonic for the short encoding
2. ldr.w is recognized as menmonic for the long encoding
3. ldr will map to either short or long encodings depending on the size of the offset

llvm-svn: 186831
2013-07-22 15:49:36 +00:00
Tim Northover
a6d63d6cc9 ARM: remove now unneeded custom Asm converters
After Ulrich's r180677 (thanks!) TableGen is intelligent enough to
handle tied constraints involving complex operands properly, so
virtually all of the ARM custom converters are now unnecessary.

llvm-svn: 186810
2013-07-22 09:06:12 +00:00
Lang Hames
2d3b19969c Refactor AnalyzeBranch on ARM. The previous version did not always analyze
indirect branches correctly. Under some circumstances, this led to the deletion
of basic blocks that were the destination of indirect branches. In that case it
left indirect branches to nowhere in the code.

This patch replaces, and is more general than either of the previous fixes for
indirect-branch-analysis issues, r181161 and r186461.

For other branches (not indirect) this refactor should have *almost* identical
behavior to the previous version. There are some corner cases where this
refactor is able to analyze blocks that the previous version could not (e.g.
this necessitated the update to thumb2-ifcvt2.ll). 

<rdar://problem/14464830>

llvm-svn: 186735
2013-07-19 23:52:47 +00:00
Joey Gouly
c0c0031fbe Add a line that got missed off somehow. Sorry about that!
llvm-svn: 186692
2013-07-19 16:45:16 +00:00
Joey Gouly
cfa16b3bc1 [ARMv8] Implement the NEON instructions VRINT{N, X, A, Z, M, P}.
llvm-svn: 186688
2013-07-19 16:34:16 +00:00
Tilmann Scheller
ae212d6c91 ARM: Add instruction aliases for the Thumb2 PLD/PLDW (literal) alternate form.
See A8.8.127 in ARM DDI 0406C.b.

Related to <rdar://problem/14403733>.

llvm-svn: 186682
2013-07-19 16:18:56 +00:00
Tilmann Scheller
8ea80fdb93 ARM: Make sure the instruction alias for PLI uses the right subtarget features.
PLI requires both the Thumb2 and the ARMv7 feature.

Related to <rdar://problem/14403733>.

llvm-svn: 186620
2013-07-18 22:19:59 +00:00
Joey Gouly
73424fc519 Change 'n' to 'N' to keep consistent with other instructions.
llvm-svn: 186576
2013-07-18 12:00:25 +00:00
Joey Gouly
933fb028d7 [ARMv8] Add NEON instructions VCVT{A, N, P, M}.
llvm-svn: 186574
2013-07-18 11:53:22 +00:00
Joey Gouly
59df7acdf4 Add Thumb tests for the ARMv8 FP instructions that I recently added.
Also, fix the namespace for two instructions that I missed previously.

llvm-svn: 186572
2013-07-18 10:20:25 +00:00
Joey Gouly
1ced091dc6 Remove the extra leading 0 from VMAXNMND.
The N3VDIntnp pattern takes bits<5> and I gave it 6 bits.

Thanks to Jiangning Liu for spotting it!

llvm-svn: 186568
2013-07-18 09:34:35 +00:00
Joey Gouly
bc02a480d0 [ARMv8] Add support for the NEON instructions vmaxnm/vminnm.
This adds a new class for non-predicable NEON instructions and a
new DecoderNamespace for v8 NEON instructions.

llvm-svn: 186504
2013-07-17 13:59:38 +00:00
JF Bastien
05ee680a75 Fix ARMFastISel::ARMEmitIntExt shift emission
My patch 'r183551 - ARM FastISel integer sext/zext improvements' was incorrect when emitting ARM register-immediate ASR, LSL, LSR instructions: they are pseudo-instructions in ARMInstrInfo.td and I should have used MOVsi instead.

This is not an issue when code is generated through a .s file, but is an issue when generated straight to a .o (-filetype=obj).

llvm-svn: 186489
2013-07-17 05:46:46 +00:00
Lang Hames
42e80f638a Related to r181161 - Indirect branches may not be the last branch in a basic
block. Blocks that have an indirect branch terminator, even if it's not the
last terminator, should still be treated as unanalyzable.

<rdar://problem/14437274>

Reducing a useful regression test case is proving difficult - I hope to have
one soon.

llvm-svn: 186461
2013-07-16 22:01:40 +00:00
Tilmann Scheller
5c5d0d2141 ARM: Add support for the Thumb2 PLI alternate literal form.
This adds an instruction alias to make the assembler recognize the alternate literal form: pli [PC, #+/-<imm>]

See A8.8.129 in the ARM ARM (DDI 0406C.b).

Fixes <rdar://problem/14403733>.

llvm-svn: 186459
2013-07-16 21:52:34 +00:00
Tim Northover
be10a35a43 ARM: allow printing of ARM atomic DAG nodes.
We'd forgotten to provide string representations for the special ARMISD atomic
nodes; this adds them in. No effect on CodeGen, just makes the output of
"-view-whatever-dags" slightly more readable.

llvm-svn: 186406
2013-07-16 12:15:36 +00:00
Tim Northover
69d676cd12 ARM: implement ldrex, strex and clrex intrinsics
Intrinsics already existed for the 64-bit variants, so these support operations
of size at most 32-bits.

llvm-svn: 186392
2013-07-16 09:46:55 +00:00
Renato Golin
5b7294a39c ARM EABI divmod support
This patch enables calls to __aeabi_idivmod when in EABI mode,
by using the remainder value returned on registers (R1),
enabled by the ARM triple "none-eabi". Note that Darwin and
GNUEABI triples will continue lowering on GNU style, that is,
using the stack for the remainder.

Still need to add SREM/UREM support fix for 64-bit lowering.

llvm-svn: 186390
2013-07-16 09:32:17 +00:00
Craig Topper
4e9457fd7d Use llvm::array_lengthof to replace sizeof(array)/sizeof(array[0]).
llvm-svn: 186301
2013-07-15 04:27:47 +00:00
Craig Topper
58fa7a9b4a Use SmallVectorImpl& instead of SmallVector to avoid repeating small vector size.
llvm-svn: 186274
2013-07-14 04:42:23 +00:00
JF Bastien
5885dc293c Fix ARM paired GPR COPY lowering
ARM paired GPR COPY was being lowered to two MOVr without CC. This
patch puts the CC back.

My test is a reduction of the case where I encountered the issue,
64-bit atomics use paired GPRs.

The issue only occurs with selectionDAG, FastISel doesn't encounter it
so I didn't bother calling it.

llvm-svn: 186226
2013-07-12 23:33:03 +00:00
Eric Christopher
f271491787 Remove extraneous braces.
llvm-svn: 186212
2013-07-12 22:08:24 +00:00
Arnold Schwaighofer
17fdc6e770 ARM cost model: Add cost for gather/scather
Fixes a 35% degradation compared to unvectorized code in
MiBench/automotive-susan and an equally serious regression on a private
image processing benchmark.

radar://14351991

llvm-svn: 186188
2013-07-12 19:16:04 +00:00
Arnold Schwaighofer
b9c37551bc TargetTransformInfo: address calculation parameter for gather/scather
Address calculation for gather/scather in vectorized code can incur a
significant cost making vectorization unbeneficial. Add infrastructure to add
cost.
Tests and cost model for targets will be in follow-up commits.

radar://14351991

llvm-svn: 186187
2013-07-12 19:16:02 +00:00
Craig Topper
d1f0bf4b5f Simplify code.
llvm-svn: 186013
2013-07-10 16:38:35 +00:00
Stephen Lin
bba095310d Fix typo
llvm-svn: 185995
2013-07-10 01:57:39 +00:00
Stephen Lin
af86770cd4 Explicitly define ARMISelLowering::isFMAFasterThanFMulAndFAdd. No functionality change.
Currently ARM is the only backend that supports FMA instructions (for at least some subtargets) but does not implement this virtual, so FMAs are never generated except from explicit fma intrinsic calls. Apparently this is due to the fact that it supports both fused (one rounding step) and unfused (two rounding step) multiply + add instructions. This patch clarifies that this the case without changing behavior by implementing the virtual function to simply return false, as the default TargetLoweringBase version does.

It is possible that some cpus perform the fused version faster than the unfused version and vice-versa, so the function implementation should be revisited if hard data is found.

llvm-svn: 185994
2013-07-10 01:54:24 +00:00
Jim Grosbach
d6be90a2b8 ARM: Fix incorrect pack pattern for thumb2
Propagate the fix from r185712 to Thumb2 codegen as well. Original
commit message applies here as well:

A "pkhtb x, x, y asr #num" uses the lower 16 bits of "y asr #num" and
packs them in the bottom half of "x". An arithmetic and logic shift are
only equivalent in this context if the shift amount is 16. We would be
shifting in ones into the bottom 16bits instead of zeros if "y" is
negative.

rdar://14338767

llvm-svn: 185982
2013-07-09 22:59:22 +00:00
Joey Gouly
1bf5e0fbf1 Add MC assembly/disassembly support for VRINT{A, N, P, M} to V8FP.
llvm-svn: 185929
2013-07-09 11:26:18 +00:00
Joey Gouly
9995bf31f9 Add MC assembly/disassembly support for VRINT{Z, X, R} to V8FP.
llvm-svn: 185926
2013-07-09 11:03:21 +00:00
Joey Gouly
7f5f52a614 Add MC assembly/disassembly support for VCVT{A, N, P, M} to V8FP.
llvm-svn: 185922
2013-07-09 09:59:04 +00:00
Joey Gouly
b4f59412fd Add a comment to this change, requested by Eric Christopher.
llvm-svn: 185853
2013-07-08 19:52:51 +00:00
Jim Grosbach
b4234c1d88 ARM: Improve codegen for generic vselect.
Fall back to by-element insert rather than building it up on the stack.

rdar://14351991

llvm-svn: 185846
2013-07-08 18:18:52 +00:00
Joey Gouly
bc06bffc50 Add MC support for the v8fp instructions: vmaxnm and vminnm.
llvm-svn: 185767
2013-07-06 20:50:18 +00:00
Arnold Schwaighofer
97cea9b991 ARM: Add a pack pattern for matching arithmetic shift right
llvm-svn: 185714
2013-07-05 18:57:49 +00:00
Arnold Schwaighofer
d5fc888196 ARM: Fix incorrect pack pattern
A "pkhtb x, x, y asr #num" uses the lower 16 bits of "y asr #num" and packs them
in the bottom half of "x". An arithmetic and logic shift are only equivalent in
this context if the shift amount is 16. We would be shifting in ones into the
bottom 16bits instead of zeros if "y" is negative.

radar://14338767

llvm-svn: 185712
2013-07-05 18:28:39 +00:00
Joey Gouly
76f34b0ffb PR16490: fix a crash in ARMDAGToDAGISel::SelectInlineAsm.
In the SelectionDAG immediate operands to inline asm are constructed as
two separate operands. The first is a constant of value InlineAsm::Kind_Imm
and the second is a constant with the value of the immediate.

In ARMDAGToDAGISel::SelectInlineAsm, if we reach an operand of Kind_Imm we
should skip over the next operand too.

llvm-svn: 185688
2013-07-05 10:19:40 +00:00
Joey Gouly
3366658175 Remove an unneeded call to 'UpdateThumbVFPPredicate', spotted by Amaury.
llvm-svn: 185651
2013-07-04 15:58:38 +00:00
Joey Gouly
f5a82dca1f Add support for MC assembling and disassembling of vsel{ge, gt, eq, vs} instructions.
This adds a new decoder table/namespace 'VFPV8', as these instructions have their
top 4 bits as 0b1111, while other Thumb instructions have 0b1110.

llvm-svn: 185642
2013-07-04 14:57:20 +00:00
Jakob Stoklund Olesen
d428205e4a Remove the EXCEPTIONADDR, EHSELECTION, and LSDAADDR ISD opcodes.
These exception-related opcodes are not used any longer.

llvm-svn: 185625
2013-07-04 13:54:20 +00:00
Joey Gouly
7a6dcc8db9 Add a V8FP instruction 'vcvt{b,t}' to convert between half and double precision.
llvm-svn: 185620
2013-07-04 10:04:08 +00:00
Craig Topper
783617eba7 Use SmallVectorImpl::iterator/const_iterator instead of SmallVector to avoid specifying the vector size.
llvm-svn: 185606
2013-07-04 01:31:24 +00:00
Jakob Stoklund Olesen
8099b21497 Revert r185595-185596 which broke buildbots.
Revert "Simplify landing pad lowering."
Revert "Remove the EXCEPTIONADDR, EHSELECTION, and LSDAADDR ISD opcodes."

llvm-svn: 185600
2013-07-04 00:26:30 +00:00
Jakob Stoklund Olesen
8bc33424b2 Remove the EXCEPTIONADDR, EHSELECTION, and LSDAADDR ISD opcodes.
These exception-related opcodes are not used any longer.

llvm-svn: 185596
2013-07-03 23:56:31 +00:00
Stephen Lin
6aa0eb5922 Have ARMBaseRegisterInfo::getCallPreservedMask return the 'correct' mask for the GHC calling convention.
This is purely academic because GHC calls are always tail calls so the register mask will never be used; however, this change makes the code clearer and brings the ARM implementation of the GHC calling convention in line with the X86 implementation. Also, it might save someone else some time trying to figuring out what is happening...

llvm-svn: 185592
2013-07-03 23:39:13 +00:00
Quentin Colombet
49190aa8d1 [ARM] Improve the instruction selection of vector loads.
In the ARM back-end, build_vector nodes are lowered to a target specific
build_vector that uses floating point type. 
This works well, unless the inserted bitcasts survive until instruction
selection. In that case, they incur moves between integer unit and floating
point unit that may result in inefficient code.

In other words, this conversion may introduce artificial dependencies when the
code leading to the build vector cannot be completed with a floating point type.

In particular, this happens when loads are not aligned.

Before this patch, in that case, the compiler generates general purpose loads
and creates the floating point vector from them, instead of directly using the
vector unit.

The patch uses a vector friendly sequence of code when the inserted bitcasts to
floating point survived DAGCombine.

This is done by a target specific DAGCombine that changes the target specific
build_vector into a sequence of insert_vector_elt that get rid of the bitcasts.

<rdar://problem/14170854>

llvm-svn: 185587
2013-07-03 21:42:57 +00:00
Tilmann Scheller
07e970a22d ARM: Prevent ARMAsmParser::shouldOmitCCOutOperand() from misidentifying certain Thumb2 add immediate T3 encodings.
Before the fix Thumb2 instructions of type "add rD, rN, #imm" (T3 encoding, see ARM ARM A8.8.4) with rD and rN both being low registers (r0-r7) were classified as having the T4 encoding.

The T4 encoding doesn't have a cc_out operand so for above instructions the operand gets erroneously removed, corrupting the token stream and leading to parse errors later in the process.

This bug prevented "add r1, r7, #0xcbcbcbcb" from being assembled correctly.

Fixes <rdar://problem/14224440>.

llvm-svn: 185575
2013-07-03 20:38:01 +00:00
Craig Topper
9729e843cb Use SmallVectorImpl::iterator/const_iterator instead of SmallVector to avoid specifying the vector size.
llvm-svn: 185540
2013-07-03 15:07:05 +00:00
Mihai Popa
2403cfc9a6 This corrects the implementation of Thumb ADR instruction. There are three issues:
1. it should accept only 4-byte aligned addresses
2. the maximum offset should be 1020
3. it should be encoded with the offset scaled by two bits

llvm-svn: 185528
2013-07-03 09:21:44 +00:00
Tim Northover
f4b07c69d1 ARM: relax the atomic release barrier to "dmb ishst" on Swift
Swift cores implement store barriers that are stronger than the ARM
specification but weaker than general barriers. They are, in fact, just about
enough to provide the ordering needed for atomic operations with release
semantics.

This patch makes use of that quirk.

llvm-svn: 185527
2013-07-03 09:20:36 +00:00
Rafael Espindola
304ef43e7d Remove address spaces from MC.
This is dead code since PIC16 was removed in 2010. The result was an odd mix,
where some parts would carefully pass it along and others would assert it was
zero (most of the object streamer for example).

llvm-svn: 185436
2013-07-02 15:49:13 +00:00
Logan Chien
83886dc182 Fix ARM EHABI compact model 1 and 2 without handlerdata.
According to ARM EHABI section 9.2, if the
__aeabi_unwind_cpp_pr1() or __aeabi_unwind_cpp_pr2() is
used, then the handler data must be emitted after the unwind
opcodes.  The handler data consists of several words, and
should be terminated by zero.

In case that the .handlerdata directive is not specified by
the programmer, we should emit zero to terminate the handler
data.

llvm-svn: 185422
2013-07-02 12:43:27 +00:00
Chad Rosier
54c8df1202 [ARMAsmParser] Sort the ARM register lists based on the encoding value, not the
tablegen enum values.  This should be the last fix due to fallout from r185094.

llvm-svn: 185379
2013-07-01 20:49:23 +00:00
Tim Northover
51fd747de9 Revert r185339 (ARM: relax the atomic release barrier to "dmb ishst")
Turns out I'd misread the architecture reference manual and thought
that was a load/store-store barrier, when it's not.

Thanks for pointing it out Eli!

llvm-svn: 185356
2013-07-01 18:37:33 +00:00
Tim Northover
25286e5b71 ARM: relax the atomic release barrier to "dmb ishst"
I believe the full "dmb ish" barrier is not required to guarantee release
semantics for atomic operations. The weaker "dmb ishst" prevents previous
operations being reordered with a store executed afterwards, which is enough.

A key point to note (fortunately already correct) is that this barrier alone is
*insufficient* for sequential consistency, no matter how liberally placed.

llvm-svn: 185339
2013-07-01 14:48:48 +00:00
David Blaikie
417680c2b9 Remove unused member
llvm-svn: 185219
2013-06-28 21:28:01 +00:00
Eric Christopher
c7a7c7215c Remove unused variables.
llvm-svn: 185180
2013-06-28 18:03:54 +00:00
Weiming Zhao
b97c1a69a2 Bug 13662: Enable GPRPair for all i64 operands of inline asm on ARM
This patch assigns paired GPRs  for inline asm with
64-bit data on ARM. It's enabled for both ARM and Thumb to support modifiers
like %H, %Q, %R.

llvm-svn: 185169
2013-06-28 17:26:02 +00:00
Tim Northover
e13264c995 ARM: ensure fixed-point conversions have sane types
We were generating intrinsics for NEON fixed-point conversions that didn't
exist (e.g. float -> i16). There are two cases to consider:
  + iN is smaller than float. In this case we can do the conversion but need an
    extend or truncate as well.
  + iN is larger than float. In this case using the NEON conversion would be
    incorrect so we don't perform any combining.

llvm-svn: 185158
2013-06-28 15:29:25 +00:00
Tilmann Scheller
9392d20bbe ARM: Fix pseudo-instructions for SRS (Store Return State).
The mapping between SRS pseudo-instructions and SRS native instructions was incorrect, the correct mapping is:

srsfa -> srsib
srsea -> srsia
srsfd -> srsdb
srsed -> srsda

This fixes <rdar://problem/14214734>.

llvm-svn: 185155
2013-06-28 15:09:46 +00:00
Joey Gouly
42f1898415 Add a Subtarget feature 'v8fp' to the ARM backend.
llvm-svn: 185073
2013-06-27 11:49:26 +00:00
Stephen Lin
d1d52203d6 Clarify and doxygen-ify comments
llvm-svn: 185030
2013-06-26 22:27:50 +00:00
Stephen Lin
25c0cb5ba9 ARM: Proactively ensure that the LowerCallResult hack for 'this'-returns is not used for incompatible calling conventions.
(Currently, ARM 'this'-returns are handled in the standard calling convention case by treating R0 as preserved and doing some extra magic in LowerCallResult; this may not apply to calling conventions added in the future so this patch provides and documents an interface for indicating such)

llvm-svn: 185024
2013-06-26 21:42:14 +00:00
Stephen Lin
98ccb9d1ce Minor formatting fix to ARMBaseRegisterInfo::getCalleeSavedRegs
llvm-svn: 185016
2013-06-26 20:19:06 +00:00
Joey Gouly
f4bea6681b Add a subtarget feature 'v8' to the ARM backend.
This allows for targeting the ARMv8 AArch32 variant.

llvm-svn: 184967
2013-06-26 16:58:26 +00:00
Tim Northover
2bf8ffa196 ARM: fix more cases where predication may or may not be allowed
Unfortunately this addresses two issues (by the time I'd disentangled the logic
it wasn't worth putting it back to half-broken):

+ Coprocessor instructions should all be predicable in Thumb mode.
+ BKPT should never be predicable.

llvm-svn: 184965
2013-06-26 16:52:40 +00:00
Tim Northover
817190b1e4 ARM: allow predicated barriers in Thumb mode
The barrier instructions are only "always-execute" in ARM mode, they can quite
happily sit inside an IT block in Thumb.

llvm-svn: 184964
2013-06-26 16:52:32 +00:00
Joey Gouly
a82562fd5b Remove the 'generic' CPU from the ARM eabi attributes printer.
Make v4 the default ARM architecture attribute, to match CodeGen.

llvm-svn: 184962
2013-06-26 16:39:06 +00:00
Amaury de la Vieuville
8a7e3e2195 ARM: operands should be explicit when disassembled
llvm-svn: 184943
2013-06-26 13:39:07 +00:00
Amaury de la Vieuville
550e6ef18f ARM: check predicate bits for thumb instructions
When encoded to thumb, VFP instruction and VMOV/VDUP between scalar and
core registers, must have their predicate bit to 0b1110.

llvm-svn: 184707
2013-06-24 09:15:01 +00:00
Amaury de la Vieuville
37b2270352 ARM: rGPR is meant to be unpredictable, not undefined
llvm-svn: 184706
2013-06-24 09:14:54 +00:00
Amaury de la Vieuville
5a373a526e ARM: fix thumb1 nop decoding
In thumb1, NOP is a pseudo-instruction equivalent to mov r8, r8.
However the disassembler should not use this alias.

llvm-svn: 184703
2013-06-24 09:11:53 +00:00
Amaury de la Vieuville
6eecd3f2cb ARM: fix IT decoding
mask == 0 -> UNPRED

llvm-svn: 184702
2013-06-24 09:11:45 +00:00
Amaury de la Vieuville
0d7ac788f2 ARM: enable decoding of pc-relative PLD/PLI
llvm-svn: 184701
2013-06-24 09:11:38 +00:00
Chad Rosier
d00211e479 The getRegForInlineAsmConstraint function should only accept MVT value types.
llvm-svn: 184642
2013-06-22 18:37:38 +00:00
David Blaikie
b1e1db3db6 DebugInfo: Don't lose unreferenced non-trivial by-value parameters
A FastISel optimization was causing us to emit no information for such
parameters & when they go missing we end up emitting a different
function type. By avoiding that shortcut we not only get types correct
(very important) but also location information (handy) - even if it's
only live at the start of a function & may be clobbered later.

Reviewed/discussion by Evan Cheng & Dan Gohman.

llvm-svn: 184604
2013-06-21 22:56:30 +00:00
Quentin Colombet
bdb6bfd975 ARM: Remove a (false) dependency on the memoryoperand's value as we do not use
it at the moment.
This allows to form more paired loads even when stack coloring pass destroys the
memoryoperand's value.

<rdar://problem/13978317>

llvm-svn: 184492
2013-06-20 22:51:44 +00:00
Joey Gouly
a600985f50 This reverts r155000.
The cdp2 instruction should have the same restrictions as cdp on the
co-processor registers.

VFP instructions on v8/AArch32 share the same encoding space as cdp2.

llvm-svn: 184445
2013-06-20 17:42:36 +00:00
David Blaikie
c3cec14be6 DebugInfo: PR14763/r183329 correct the location of indirect parameters
We had been papering over a problem with location info for non-trivial
types passed by value by emitting their type as references (this caused
the debugger to interpret the location information correctly, but broke
the type of the function). r183329 corrected the type information but
lead to the debugger interpreting the pointer parameter as the value -
the debug info describing the location needed an extra dereference.

Use a new flag in DIVariable to add the extra indirection (either by
promoting an existing DW_OP_reg (parameter passed in a register) to
DW_OP_breg + 0 or by adding DW_OP_deref to an existing DW_OP_breg + n
(parameter passed on the stack).

llvm-svn: 184368
2013-06-19 21:55:13 +00:00
Bill Wendling
a9576dc938 Access the TargetLoweringInfo from the TargetMachine object instead of caching it. The TLI may change between functions. No functionality change.
llvm-svn: 184360
2013-06-19 21:36:55 +00:00
Bill Wendling
4d82ecded8 Access the TargetLoweringInfo from the TargetMachine object instead of caching it. The TLI may change between functions. No functionality change.
llvm-svn: 184352
2013-06-19 21:07:11 +00:00
Bill Wendling
1919cdf3c7 Access the TargetLoweringInfo from the TargetMachine object instead of caching it. The TLI may change between functions. No functionality change.
llvm-svn: 184349
2013-06-19 20:51:24 +00:00
Jim Grosbach
0f0c0ac8be ARM: Add optional datatype suffix to NEON mvn asm syntax.
rdar://14194152

llvm-svn: 184244
2013-06-18 21:49:21 +00:00
Michael Gottesman
5a5656f17d [ARMTargetLowering] ARMISD::{SUB,ADD}{C,E} second result is a boolean implying that upper bits are always 0.
llvm-svn: 184231
2013-06-18 20:49:45 +00:00
Michael Gottesman
dde9797dc2 Converted an overly aggressive assert to a conditional check in AddCombineTo64bitMLAL.
Said assert assumes that ADDC will always have a glue node as its second
argument and is checked before we even know that we are actually performing the
relevant MLAL optimization. This is incorrect since on ARM we *CAN* codegen ADDC
with a use list based second argument. Thus to have both effects, I converted
the assert to a conditional check which if it fails we do not perform the
optimization.

In terms of tests I can not produce an ADDC from the IR level until I get in my
multiprecision optimization patch which is forthcoming. The tests for said patch
would cause this assert to fail implying that said tests will provide the
relevant tests.

llvm-svn: 184230
2013-06-18 20:49:40 +00:00
Kevin Enderby
cb41cc56c6 Change the arm assembler to support this from the v7c spec:
"When assembling to the ARM instruction set, the .N qualifier produces
an assembler error and the .W qualifier has no effect."

In the pre-matcher handler in the asm parser the ".w" (wide) qualifier 
when in ARM mode is now discarded. And an error message is now
produced when the ".n" (narrow) qualifier is used in ARM mode.

Test cases for these were added.

rdar://14064574

llvm-svn: 184224
2013-06-18 20:19:24 +00:00
David Blaikie
9a98240ce6 Reduce indentation.
llvm-svn: 184213
2013-06-18 18:03:17 +00:00
Amaury de la Vieuville
0c0f005a15 ARM: fix literal load with positive offset encoding
When using a positive offset, literal loads where encoded
as if it was negative, because:
- The sign bit was not assigned to an operand
- The addrmode_imm12 operand was not encoding the sign bit correctly

This patch also makes the assembler look at the .w/.n specifier for
loads.

llvm-svn: 184182
2013-06-18 08:13:05 +00:00
Amaury de la Vieuville
f28bf33894 ARM: add operands pre-writeback variants when needed
llvm-svn: 184181
2013-06-18 08:12:51 +00:00
Amaury de la Vieuville
8d8456b196 ARM: fix thumb literal loads decoding
This fixes two previous issues:
- Negative offsets were not correctly disassembled
- The decoded opcodes were not the right one

llvm-svn: 184180
2013-06-18 08:03:06 +00:00
Amaury de la Vieuville
e6ed8ad5df ARM: thumb stores cannot use PC as dest register
llvm-svn: 184179
2013-06-18 08:02:56 +00:00
Bill Wendling
49ef14ef73 Use pointers to the MCAsmInfo and MCRegInfo.
Someone may want to do something crazy, like replace these objects if they
change or something.

No functionality change intended.

llvm-svn: 184175
2013-06-18 07:20:20 +00:00
David Blaikie
813e6b3974 DebugInfo: remove target-specific Frame Index handling for DBG_VALUE MachineInstrs
Frame index handling is now target-agnostic, so delete the target hooks
for creation & asm printing of target-specific addressing in DBG_VALUEs
and any related functions.

llvm-svn: 184067
2013-06-16 20:34:27 +00:00
David Blaikie
f3d2951503 Debug Info: Simplify Frame Index handling in DBG_VALUE Machine Instructions
Rather than using the full power of target-specific addressing modes in
DBG_VALUEs with Frame Indicies, simply use Frame Index + Offset. This
reduces the complexity of debug info handling down to two
representations of values (reg+offset and frame index+offset) rather
than three or four.

Ideally we could ensure that frame indicies had been eliminated by the
time we reached an assembly or dwarf generation, but I haven't spent the
time to figure out where the FIs are leaking through into that & whether
there's a good place to convert them. Some FI+offset=>reg+offset
conversion is done (see PrologEpilogInserter, for example) which is
necessary for some SelectionDAG assumptions about registers, I believe,
but it might be possible to make this a more thorough conversion &
ensure there are no remaining FIs no matter how instruction selection
is performed.

llvm-svn: 184066
2013-06-16 20:34:15 +00:00
Andrew Trick
31eeff56c7 Update machine models. Specify buffer sizes for OOO processors.
llvm-svn: 184033
2013-06-15 04:50:02 +00:00
Andrew Trick
5d13fe97ed Machine Model: Add MicroOpBufferSize and resource BufferSize.
Replace the ill-defined MinLatency and ILPWindow properties with
with straightforward buffer sizes:
MCSchedMode::MicroOpBufferSize
MCProcResourceDesc::BufferSize

These can be used to more precisely model instruction execution if desired.

Disabled some misched tests temporarily. They'll be reenabled in a few commits.

llvm-svn: 184032
2013-06-15 04:49:57 +00:00
Amaury de la Vieuville
13c4894a31 ARM: fix thumb coprocessor instruction with pre-writeback disassembly
was        stc2 p0, c0, [r0]!
instead of stc2 p0, c0, [r0,#0]!

llvm-svn: 183975
2013-06-14 11:21:35 +00:00
JF Bastien
cb60eaba94 Enable FastISel on ARM for Linux and NaCl, not MCJIT
This is a resubmit of r182877, which was reverted because it broken
MCJIT tests on ARM. The patch leaves MCJIT on ARM as it was before: only
enabled for iOS. I've CC'ed people from the original review and revert.

FastISel was only enabled for iOS ARM and Thumb2, this patch enables it
for ARM (not Thumb2) on Linux and NaCl, but not MCJIT.

Thumb2 support needs a bit more work, mainly around register class
restrictions.

The patch punts to SelectionDAG when doing TLS relocation on non-Darwin
targets. I will fix this and other FastISel-to-SelectionDAG failures in
a separate patch.

The patch also forces FastISel to retain frame pointers: iOS always
keeps them for backtracking (so emitted code won't change because of
this), but Linux was getting much worse code that was incorrect when
using big frames (such as test-suite's lencod). I'll also fix this in a
later patch, it will probably require a peephole so that FastISel
doesn't rematerialize frame pointers back-to-back.

The test changes are straightforward, similar to:
  http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130513/174279.html
They also add a vararg test that got dropped in that change.

I ran all of lnt test-suite on A15 hardware with --optimize-option=-O0
and all the tests pass. All the tests also pass on x86 make check-all. I
also re-ran the check-all tests that failed on ARM, and they all seem to
pass.

llvm-svn: 183966
2013-06-14 02:49:43 +00:00
Amaury de la Vieuville
6aadbcd8b4 ARM: fix B decoding
llvm-svn: 183914
2013-06-13 16:41:55 +00:00
Amaury de la Vieuville
6e5b103269 ARM: fix t2am_imm8_offset operand printing for imm=#-0
llvm-svn: 183913
2013-06-13 16:40:51 +00:00
JF Bastien
bc7c69f2f7 ARM FastISel fix sext/zext fold
Sign- and zero-extension folding was slightly incorrect because it wasn't checking that the shift on extensions was zero. Further, I recently added AND rd, rn, #255 as a form of 8-bit zero extension, and failed to add the folding code for it.

This patch fixes both issues.

This patch fixes both, and the test should remain the same:
  test/CodeGen/ARM/fast-isel-fold.ll

llvm-svn: 183794
2013-06-11 22:13:46 +00:00
NAKAMURA Takumi
bc128ca9be Rework r183728, suppress assert(0) for now. Its behavior depends on assertions on win32 hosts.
FIXME: Introduce yet another checker but assert(0).
llvm-svn: 183736
2013-06-11 10:01:42 +00:00
Mihai Popa
2b26ef36f1 It adds support for negative zero offsets for loads and stores.
Negative zero is returned by the primary expression parser as INT32_MIN, so all that the method needs to do is to accept this value.
Behavior already present for Thumb2.

llvm-svn: 183734
2013-06-11 09:48:35 +00:00
Mihai Popa
30b6bcc387 This patch adds support for FPINST/FPINST2 as operands to vmsr/vmrs. These are optional registers that may be supported some ARM implementations to aid with resolution of floating point exceptions. The manual pages for vmsr and vmrs do not detail their use. Encodings and other information can be found in ARM Architecture Reference Manual section F, chapter 6, paragraph 3.
llvm-svn: 183733
2013-06-11 09:39:51 +00:00
Amaury de la Vieuville
334567de5c ARM: Enforce decoding rules for VLDn instructions
llvm-svn: 183731
2013-06-11 08:14:14 +00:00
Amaury de la Vieuville
896fff0d7e ARM: Fix STREX/LDREX reecoding
The decoded MCInst wasn't reencoded as the same instruction

llvm-svn: 183729
2013-06-11 08:03:20 +00:00
NAKAMURA Takumi
5bdf4d5248 Tweak a couple of tests on win32 hosts with +Asserts.
- Don't use assert(0), or tests may pass or fail according to assertions.
  - For now, The tests are marked as XFAIL for win32 hosts.

FIXME: Could we avoid XFAIL to specify triple in the RUN lines?
llvm-svn: 183728
2013-06-11 06:52:58 +00:00
NAKAMURA Takumi
ae2882e710 ARMAsmBackend.cpp: Use Triple::isOSBinFormatCOFF() instead of isOSWindows().
FYI, isOSBinFormatCOFF() is as same as isOSWindows(), on trunk.

llvm-svn: 183727
2013-06-11 06:52:43 +00:00
NAKAMURA Takumi
c1080e2655 Whitespace.
llvm-svn: 183726
2013-06-11 06:52:36 +00:00
Tim Northover
2ca847f56b ARM: diagnose ARM/Thumb assembly switches on CPUs only supporting one.
Some ARM CPUs only support ARM mode (ancient v4 ones, for example) and some
only support Thumb mode (M-class ones currently). This makes sure such CPUs
default to the correct mode and makes the AsmParser diagnose an attempt to
switch modes incorrectly.

rdar://14024354

llvm-svn: 183710
2013-06-10 23:20:58 +00:00
Aaron Ballman
4720beae58 Silencing an MSVC warning about comparing signed and unsigned values.
llvm-svn: 183682
2013-06-10 16:45:40 +00:00
Amaury de la Vieuville
b82468de40 Fix misleading comments in ARMAsmParser
llvm-svn: 183657
2013-06-10 14:17:15 +00:00
Amaury de la Vieuville
477311794b ARM: ISB cannot be passed the same options as DMB
ISB should only accepts full system sync, other options are reserved

llvm-svn: 183656
2013-06-10 14:17:08 +00:00
Logan Chien
fa2266ded0 Fix ARM unwind opcode assembler in several cases.
Changes to ARM unwind opcode assembler:

* Fix multiple .save or .vsave directives.  Besides, the
  order is preserved now.

* For the directives which will generate multiple opcodes,
  such as ".save {r0-r11}", the order of the unwind opcode
  is fixed now, i.e. the registers with less encoding value
  are popped first.

* Fix the $sp offset calculation.  Now, we can use the
  .setfp, .pad, .save, and .vsave directives at any order.

Changes to test cases:

* Add test cases to check the order of multiple opcodes
  for the .save directive.

* Fix the incorrect $sp offset in the test case.  The
  stack pointer offset specified in the test case was
  incorrect.  (Changed test cases: ehabi-mc-section.ll and
  ehabi-mc.ll)

* The opcode to restore $sp are slightly reordered.  The
  behavior are not changed, and the new output is same
  as the output of GNU as.  (Changed test cases:
  eh-directive-pad.s and eh-directive-setfp.s)

llvm-svn: 183627
2013-06-09 12:22:30 +00:00
JF Bastien
87986384a7 ARM FastISel fix load register classes
The register classes when emitting loads weren't quite restricting enough, leading to MI verification failure on the result register.

These are new failures that weren't there the first time I tried enabling ARM FastISel for new targets.

llvm-svn: 183624
2013-06-09 00:20:24 +00:00
Amaury de la Vieuville
534a068d9a ARM: fix VMOVvnf32 decoding when ambiguous with VCVT
Enforce Table A7-15 (op=1, cmode=0b111) -> UNDEF

llvm-svn: 183612
2013-06-08 13:54:05 +00:00
Amaury de la Vieuville
2e930b0bbe ARM: enforce SRS decoding constraints
llvm-svn: 183611
2013-06-08 13:43:59 +00:00
Amaury de la Vieuville
594f70fb57 ARM: fix CPS decoding when ambiguous with QADD
Handle the case when the disassembler table can't tell
the difference between some encodings of QADD and CPS.

Add some necessary safe guards in CPS decoding as well.

llvm-svn: 183610
2013-06-08 13:38:52 +00:00
Amaury de la Vieuville
31c75d7ce5 ARM: fix VCVT decoding
UNPRED was reported instead of UNDEF

llvm-svn: 183608
2013-06-08 13:29:11 +00:00
JF Bastien
1374d6f849 Fix unused variable warning from my previous patch.
llvm-svn: 183601
2013-06-08 00:51:51 +00:00
JF Bastien
c950ffbe08 ARM FastISel integer sext/zext improvements
My recent ARM FastISel patch exposed this bug:
  http://llvm.org/bugs/show_bug.cgi?id=16178
The root cause is that it can't select integer sext/zext pre-ARMv6 and
asserts out.

The current integer sext/zext code doesn't handle other cases gracefully
either, so this patch makes it handle all sext and zext from i1/i8/i16
to i8/i16/i32, with and without ARMv6, both in Thumb and ARM mode. This
should fix the bug as well as make FastISel faster because it bails to
SelectionDAG less often. See fastisel-ext.patch for this.

fastisel-ext-tests.patch changes current tests to always use reg-imm AND
for 8-bit zext instead of UXTB. This simplifies code since it is
supported on ARMv4t and later, and at least on A15 both should perform
exactly the same (both have exec 1 uop 1, type I).

2013-05-31-char-shift-crash.ll is a bitcode version of the above bug
16178 repro.

fast-isel-ext.ll tests all sext/zext combinations that ARM FastISel
should now handle.

Note that my ARM FastISel enabling patch was reverted due to a separate
failure when dealing with MCJIT, I'll fix this second failure and then
turn FastISel on again for non-iOS ARM targets.

I've tested "make check-all" on my x86 box, and "lnt test-suite" on A15
hardware.

llvm-svn: 183551
2013-06-07 20:10:37 +00:00
Bill Wendling
8bc6d84739 Don't cache the instruction and register info from the TargetMachine, because
the internals of TargetMachine could change.

llvm-svn: 183488
2013-06-07 05:54:19 +00:00
Arnold Schwaighofer
ae78fdcfbc ARM sched model: Use the right resources for DIV
llvm-svn: 183477
2013-06-07 01:16:15 +00:00
Arnold Schwaighofer
7ab0e93c53 ARM sched model: Add VFP div instruction on Swift
Reapply 183271.

llvm-svn: 183472
2013-06-07 01:10:36 +00:00
Arnold Schwaighofer
e735cf9e6d ARM sched model: Add SIMD/VFP load/store instructions on Swift
Reapply 183270 again (because three is a magic number).

This should now no longer seg fault after r183459.

llvm-svn: 183464
2013-06-07 00:04:28 +00:00
Arnold Schwaighofer
b6f6c3165a Revert "ARM sched model: Add SIMD/VFP load/store instructions on Swift"
Breaks linux build bots (I thought the problem was something else).

llvm-svn: 183447
2013-06-06 21:08:18 +00:00
Arnold Schwaighofer
c363f5de96 ARM sched model: Add SIMD/VFP load/store instructions on Swift
Reapply 183270.

llvm-svn: 183445
2013-06-06 21:02:18 +00:00
Arnold Schwaighofer
2aabc3838f ARM sched model: Add integer VFP/SIMD instructions on Swift
Reapply 183269.

llvm-svn: 183441
2013-06-06 20:26:18 +00:00
Arnold Schwaighofer
060a499d4c ARM sched model: Add integer load/store instructions on Swift
Reapply 183268.

llvm-svn: 183438
2013-06-06 20:11:56 +00:00
Arnold Schwaighofer
0c32813eef ARM sched model: Add integer arithmetic instructions on Swift
Reapply 183267.

llvm-svn: 183436
2013-06-06 19:49:46 +00:00
Arnold Schwaighofer
4787ddf24e ARM sched model: Cortex A9 - More InstRW sched resources
Add more InstRW mappings.

Reapply 183266.

llvm-svn: 183435
2013-06-06 19:30:21 +00:00
Arnold Schwaighofer
ee3ee53f08 ARM sched model: Add branch thumb instructions
Reapply 183265.

llvm-svn: 183432
2013-06-06 18:51:01 +00:00
Arnold Schwaighofer
ebfafb684f ARM sched model: Add branch thumb2 instructions
Reapply 183264.

llvm-svn: 183430
2013-06-06 18:42:09 +00:00
Arnold Schwaighofer
a9e6444fd2 ARM sched model: Add branch instructions
Reapply 183263.

llvm-svn: 183428
2013-06-06 18:21:13 +00:00
Arnold Schwaighofer
d059d8bd7b ARM sched model: Add preload thumb2 instructions
Reapply 183262.

llvm-svn: 183427
2013-06-06 18:06:30 +00:00
Arnold Schwaighofer
8bca96c063 ARM sched model: Add preload instructions
Reapply 183261.

llvm-svn: 183425
2013-06-06 17:26:12 +00:00
Arnold Schwaighofer
c7c3041d71 ARM sched model: Add more ALU and CMP thumb instructions
Reapply of 183260.

llvm-svn: 183423
2013-06-06 17:03:13 +00:00
Arnold Schwaighofer
fea024594d ARM sched model: Add more ALU and CMP thumb2 instructions
Reapply of 183259.

llvm-svn: 183421
2013-06-06 16:35:25 +00:00
Bill Wendling
2cca7e5acd Cache the TargetLowering info object as a pointer.
Caching it as a pointer allows us to reset it if the TargetMachine object
changes.

llvm-svn: 183361
2013-06-06 00:43:09 +00:00
Arnold Schwaighofer
3d06dc7238 ARM sched model: Add more ALU and CMP instructions
Reapply of 183258.

llvm-svn: 183321
2013-06-05 16:36:51 +00:00
Arnold Schwaighofer
0bfbfaf7e6 ARM sched model: Add divsion, loads, branches, vfp cvt
Add some generic SchedWrites and assign resources for Swift and Cortex A9.

Reapply of r183257. (Removed empty InstRW for division on swift)

llvm-svn: 183319
2013-06-05 16:06:11 +00:00
Arnold Schwaighofer
82241287d7 ARMInstrInfo: Improve isSwiftFastImmShift
An instruction with less than 3 inputs is trivially a fast immediate shift.

Reapply of 183256, should not have caused the tablegen segfault on linux either.

llvm-svn: 183314
2013-06-05 14:59:36 +00:00
Mihai Popa
3e61924089 This is a simple patch that changes RRX and RRXS to accept all registers as operands.
According to the ARM reference manual, RRX(S) have defined encodings for lr, pc and sp.

llvm-svn: 183307
2013-06-05 13:23:51 +00:00
Evan Cheng
1c010771f4 Cortex-R5 can issue Thumb2 integer division instructions.
llvm-svn: 183275
2013-06-04 22:52:09 +00:00
Arnold Schwaighofer
59cf81c2e1 Revert series of sched model patches until I figure out what is going on.
llvm-svn: 183273
2013-06-04 22:35:17 +00:00
Arnold Schwaighofer
d787aba409 ARM sched model: Add VFP div instruction on Swift
llvm-svn: 183271
2013-06-04 22:16:08 +00:00
Arnold Schwaighofer
1078009cb6 ARM sched model: Add SIMD/VFP load/store instructions on Swift
llvm-svn: 183270
2013-06-04 22:16:07 +00:00
Arnold Schwaighofer
bb0c6ae8d0 ARM sched model: Add integer VFP/SIMD instructions on Swift
llvm-svn: 183269
2013-06-04 22:16:05 +00:00
Arnold Schwaighofer
0f39267e23 ARM sched model: Add integer load/store instructions on Swift
llvm-svn: 183268
2013-06-04 22:16:04 +00:00
Arnold Schwaighofer
19192c53f2 ARM sched model: Add integer arithmetic instructions on Swift
llvm-svn: 183267
2013-06-04 22:16:02 +00:00
Arnold Schwaighofer
0266140e21 ARM sched model: Cortex A9 - More InstRW sched resources
Add more InstRW mappings.

llvm-svn: 183266
2013-06-04 22:16:00 +00:00
Arnold Schwaighofer
c51f106173 ARM sched model: Add branch thumb instructions
llvm-svn: 183265
2013-06-04 22:15:59 +00:00
Arnold Schwaighofer
11f0806b48 ARM sched model: Add branch thumb2 instructions
llvm-svn: 183264
2013-06-04 22:15:57 +00:00
Arnold Schwaighofer
f1db24e509 ARM sched model: Add branch instructions
llvm-svn: 183263
2013-06-04 22:15:56 +00:00
Arnold Schwaighofer
c552774669 ARM sched model: Add preload thumb2 instructions
llvm-svn: 183262
2013-06-04 22:15:54 +00:00
Arnold Schwaighofer
38168a47d6 ARM sched model: Add preload instructions
llvm-svn: 183261
2013-06-04 22:15:52 +00:00
Arnold Schwaighofer
555055dd98 ARM sched model: Add more ALU and CMP thumb instructions
llvm-svn: 183260
2013-06-04 22:15:51 +00:00
Arnold Schwaighofer
02a38a8742 ARM sched model: Add more ALU and CMP thumb2 instructions
llvm-svn: 183259
2013-06-04 22:15:49 +00:00
Arnold Schwaighofer
009aa16396 ARM sched model: Add more ALU and CMP instructions
llvm-svn: 183258
2013-06-04 22:15:47 +00:00
Arnold Schwaighofer
fe141a11f4 ARM sched model: Add divsion, loads, branches, vfp cvt
Add some generic SchedWrites and assign resources for Swift and Cortex A9.

llvm-svn: 183257
2013-06-04 22:15:46 +00:00
Arnold Schwaighofer
81440608f7 ARMInstrInfo: Improve isSwiftFastImmShift
An instruction with less than 3 inputs is trivially a fast immediate shift.

llvm-svn: 183256
2013-06-04 22:15:43 +00:00
David Majnemer
d0dc0d58f6 ARM: Fix crash in ARM backend inside of ARMConstantIslandPass
The ARM backend did not expect LDRBi12 to hold a constant pool operand.
Allow for LLVM to deal with the instruction similar to how it deals with
LDRi12.

This fixes PR16215.

llvm-svn: 183238
2013-06-04 17:46:15 +00:00
Ahmed Bougacha
5df932894e Add a way to define the bit range covered by a SubRegIndex.
NOTE: If this broke your out-of-tree backend, in *RegisterInfo.td, change
the instances of SubRegIndex that have a comps template arg to use the
ComposedSubRegIndex class instead.

In TableGen land, this adds Size and Offset attributes to SubRegIndex,
and the ComposedSubRegIndex class, for which the Size and Offset are
computed by TableGen. This also adds an accessor in MCRegisterInfo, and
Size/Offsets for the X86 and ARM subreg indices.

llvm-svn: 183020
2013-05-31 17:08:36 +00:00
Tim Northover
ba543b13e1 ARM: permit upper-case BE/LE on setend instruction
Patch by Amaury de la Vieuville.

llvm-svn: 183012
2013-05-31 15:58:45 +00:00
Tim Northover
2a437cba0e ARM: add fstmx and fldmx instructions for assembly
These instructions are deprecated oddities, but we still need to be able to
disassemble (and reassemble) them if and when they're encountered.

Patch by Amaury de la Vieuville.

llvm-svn: 183011
2013-05-31 15:55:51 +00:00
Tim Northover
6b0f4fd85b ARM: fix VEXT encoding corner case
The disassembly of VEXT instructions was too lax in the bits checked. This
fixes the case where the instruction affects Q-registers but a misaligned lane
was specified (should be UNDEFINED).

Patch by Amaury de la Vieuville

llvm-svn: 183003
2013-05-31 13:47:25 +00:00
Rafael Espindola
d23482285b Revert r182937 and r182877.
r182877 broke MCJIT tests on ARM and r182937 was working around another failure
by r182877.

This should make the ARM bots green.

llvm-svn: 182960
2013-05-30 20:37:52 +00:00
Andrew Trick
aec414c298 Order CALLSEQ_START and CALLSEQ_END nodes.
Fixes PR16146: gdb.base__call-ar-st.exp fails after
pre-RA-sched=source fixes.

Patch by Xiaoyi Guo!

This also fixes an unsupported dbg.value test case. Codegen was
previously incorrect but the test was passing by luck.

llvm-svn: 182885
2013-05-29 22:03:55 +00:00
JF Bastien
235d8c9117 Enable FastISel on ARM for Linux and NaCl
FastISel was only enabled for iOS ARM and Thumb2, this patch enables it
for ARM (not Thumb2) on Linux and NaCl.

Thumb2 support needs a bit more work, mainly around register class
restrictions.

The patch punts to SelectionDAG when doing TLS relocation on non-Darwin
targets. I will fix this and other FastISel-to-SelectionDAG failures in
a separate patch.

The patch also forces FastISel to retain frame pointers: iOS always
keeps them for backtracking (so emitted code won't change because of
this), but Linux was getting much worse code that was incorrect when
using big frames (such as test-suite's lencod). I'll also fix this in a
later patch, it will probably require a peephole so that FastISel
doesn't rematerialize frame pointers back-to-back.

The test changes are straightforward, similar to:
  http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130513/174279.html
They also add a vararg test that got dropped in that change.

I ran all of test-suite on A15 hardware with --optimize-option=-O0 and
all the tests pass.

llvm-svn: 182877
2013-05-29 20:38:10 +00:00
JF Bastien
1b5e07e947 Tidy some register classes for ARM and Thumb
Tidy up three places where the register class for ARM and Thumb wasn't
restrictive enough:
 - No PC dest for reg-reg add/orr/sub.
 - No PC dest for shifts.
 - No PC or SP for Thumb2 reg-imm add.

I encountered this while combining FastISel with
-verify-machineinstrs. These instructions defined registers whose
classes weren't restrictive enough, and the uses failed
verification. They're also undefined in the ISA, or would produce code
that FastISel wouldn't want. This doesn't fix the register class
narrowing issue (where uses should restrict definitions), and isn't
thorough, but it's a small step in the right direction.

llvm-svn: 182863
2013-05-29 15:45:47 +00:00
Andrew Trick
2790ee3a8e Track IR ordering of SelectionDAG nodes 2/4.
Change SelectionDAG::getXXXNode() interfaces as well as call sites of
these functions to pass in SDLoc instead of DebugLoc.

llvm-svn: 182703
2013-05-25 02:42:55 +00:00
Quentin Colombet
a360776566 Follow up of the introduction of MCSymbolizer.
- Ressurect old MCDisassemble API to soften transition.
- Extend MCTargetDesc to set target specific symbolizer.

llvm-svn: 182688
2013-05-24 22:51:52 +00:00
Michael J. Spencer
c195b8a813 Replace Count{Leading,Trailing}Zeros_{32,64} with count{Leading,Trailing}Zeros.
llvm-svn: 182680
2013-05-24 22:23:49 +00:00
Benjamin Kramer
fcb0899e18 Remove the Copied parameter from MemoryObject::readBytes.
There was exactly one caller using this API right, the others were relying on
specific behavior of the default implementation. Since it's too hard to use it
right just remove it and standardize on the default behavior.

Defines away PR16132.

llvm-svn: 182636
2013-05-24 10:54:58 +00:00
Ahmed Bougacha
eedbfb8aab MC: Disassembled CFG reconstruction.
This patch builds on some existing code to do CFG reconstruction from
a disassembled binary:
- MCModule represents the binary, and has a list of MCAtoms.
- MCAtom represents either disassembled instructions (MCTextAtom), or
  contiguous data (MCDataAtom), and covers a specific range of addresses.
- MCBasicBlock and MCFunction form the reconstructed CFG. An MCBB is
  backed by an MCTextAtom, and has the usual successors/predecessors.
- MCObjectDisassembler creates a module from an ObjectFile using a
  disassembler. It first builds an atom for each section. It can also
  construct the CFG, and this splits the text atoms into basic blocks.

MCModule and MCAtom were only sketched out; MCFunction and MCBB were
implemented under the experimental "-cfg" llvm-objdump -macho option.
This cleans them up for further use; llvm-objdump -d -cfg now generates
graphviz files for each function found in the binary.

In the future, MCObjectDisassembler may be the right place to do
"intelligent" disassembly: for example, handling constant islands is just
a matter of splitting the atom, using information that may be available
in the ObjectFile. Also, better initial atom formation than just using
sections is possible using symbols (and things like Mach-O's
function_starts load command).

This brings two minor regressions in llvm-objdump -macho -cfg:
- The printing of a relocation's referenced symbol.
- An annotation on loop BBs, i.e., which are their own successor.

Relocation printing is replaced by the MCSymbolizer; the basic CFG
annotation will be superseded by more related functionality.

llvm-svn: 182628
2013-05-24 01:07:04 +00:00
Ahmed Bougacha
6979d48fa4 Add MCSymbolizer for symbolic/annotated disassembly.
This is a basic first step towards symbolization of disassembled
instructions. This used to be done using externally provided (C API)
callbacks. This patch introduces:
- the MCSymbolizer class, that mimics the same functions that were used
  in the X86 and ARM disassemblers to symbolize immediate operands and
  to annotate loads based off PC (for things like c string literals).
- the MCExternalSymbolizer class, which implements the old C API.
- the MCRelocationInfo class, which provides a way for targets to
  translate relocations (either object::RelocationRef, or disassembler
  C API VariantKinds) to MCExprs.
- the MCObjectSymbolizer class, which does symbolization using what it
  finds in an object::ObjectFile. This makes simple symbolization (with
  no fancy relocation stuff) work for all object formats!
- x86-64 Mach-O and ELF MCRelocationInfos.
- A basic ARM Mach-O MCRelocationInfo, that provides just enough to
  support the C API VariantKinds.

Most of what works in otool (the only user of the old symbolization API
that I know of) for x86-64 symbolic disassembly (-tvV) works, namely:
- symbol references: call _foo; jmp 15 <_foo+50>
- relocations:       call _foo-_bar; call _foo-4
- __cf?string:       leaq 193(%rip), %rax ## literal pool for "hello"
Stub support is the main missing part (because libObject doesn't know,
among other things, about mach-o indirect symbols).

As for the MCSymbolizer API, instead of relying on the disassemblers
to call the tryAdding* methods, maybe this could be done automagically
using InstrInfo? For instance, even though PC-relative LEAs are used
to get the address of string literals in a typical Mach-O file, a MOV
would be used in an ELF file. And right now, the explicit symbolization
only recognizes PC-relative LEAs. InstrInfo should have already have
most of what is needed to know what to symbolize, so this can
definitely be improved.

I'd also like to remove object::RelocationRef::getValueString (it seems
only used by relocation printing in objdump), as simply printing the
created MCExpr is definitely enough (and cleaner than string concats).

llvm-svn: 182625
2013-05-24 00:39:57 +00:00
Tim Northover
9b6ef4d68e ARM: implement @llvm.readcyclecounter intrinsic
This implements the @llvm.readcyclecounter intrinsic as the specific
MRC instruction specified in the ARM manuals for CPUs with the Power
Management extensions.

Older CPUs had slightly different methods which may also have to be
implemented eventually, but this should cover all v7 cases.

rdar://problem/13939186

llvm-svn: 182603
2013-05-23 19:11:20 +00:00
Tim Northover
36a908d2f7 ARM: Add Performance Monitor Extensions feature
Performance monitors, including a basic cycle counter, are an official
extension in the ARMv7 specification. This adds support for enabling and
disabling them, orthogonally from CPU selection.

rdar://problem/13939186

llvm-svn: 182602
2013-05-23 19:11:14 +00:00
Chad Rosier
93d3b4de04 Simplify logic now that r182490 is in place. No functional change intended.
llvm-svn: 182531
2013-05-22 23:17:36 +00:00
Mihai Popa
d42b1e0685 VSTn instructions have a number of encoding constraints which are not implemented. I have added these using wrapper methods around the original custom decoder (incidentally - this is a huge poorly written method that should be cleaned up. I have left it as is since the changes would be much to hard to review).
llvm-svn: 182281
2013-05-20 14:57:05 +00:00
Mihai Popa
dbf8d46d3c Q registers are encoded in fields of the same length as D registers. As Q registers are half as many, the ARM reference manual mandates the least significant bit to be zeroed out. Failure to do so should result in an undefined instruction. With this change test/MC/Disassembler/ARM/invalid-VQADD-arm.txt is passing (removed XFAIL).
llvm-svn: 182279
2013-05-20 14:42:43 +00:00
Stepan Dyatkovskiy
adb91e7ed9 PR15868 fix.
Introduction:
In case when stack alignment is 8 and GPRs parameter part size is not N*8:
we add padding to GPRs part, so part's last byte must be recovered at
address K*8-1.
We need to do it, since remained (stack) part of parameter starts from
address K*8, and we need to "attach" "GPRs head" without gaps to it:

Stack:
|---- 8 bytes block ----| |---- 8 bytes block ----| |---- 8 bytes...
[ [padding] [GPRs head] ] [ ------ Tail passed via stack  ------ ...

FIX:
Note, once we added padding we need to correct *all* Arg offsets that are going
after padded one. That's why we need this fix: Arg offsets were never corrected
before this patch. See new test-cases included in patch.

We also don't need to insert padding for byval parameters that are stored in GPRs
only. We need pad only last byval parameter and only in case it outsides GPRs
and stack alignment = 8.
Though, stack area, allocated for recovered byval params, must satisfy
"Size mod 8 = 0" restriction.

This patch reduces stack usage for some cases:
We can reduce ArgRegsSaveArea since inner N*4 bytes sized byval params my be
"packed" with alignment 4 in some cases.

llvm-svn: 182237
2013-05-20 08:01:34 +00:00
Benjamin Kramer
1e23dfa473 Replace some bit operations with simpler ones. No functionality change.
llvm-svn: 182226
2013-05-19 22:01:57 +00:00
Matt Arsenault
118196f0ca Add LLVMContext argument to getSetCCResultType
llvm-svn: 182180
2013-05-18 00:21:46 +00:00
JF Bastien
cbcaf8db77 Support unaligned load/store on more ARM targets
This patch matches GCC behavior: the code used to only allow unaligned
load/store on ARM for v6+ Darwin, it will now allow unaligned load/store
for v6+ Darwin as well as for v7+ on Linux and NaCl.

The distinction is made because v6 doesn't guarantee support (but LLVM
assumes that Apple controls hardware+kernel and therefore have
conformant v6 CPUs), whereas v7 does provide this guarantee (and
Linux/NaCl behave sanely).

The patch keeps the -arm-strict-align command line option, and adds
-arm-no-strict-align. They behave similarly to GCC's -mstrict-align and
-mnostrict-align.

I originally encountered this discrepancy in FastIsel tests which expect
unaligned load/store generation. Overall this should slightly improve
performance in most cases because of reduced I$ pressure.

llvm-svn: 182175
2013-05-17 23:49:01 +00:00
Derek Schuff
8af06f2ba6 Revert "Support unaligned load/store on more ARM targets"
This reverts r181898.

llvm-svn: 181944
2013-05-15 23:07:43 +00:00
Derek Schuff
d07b32ae79 Support unaligned load/store on more ARM targets
This patch matches GCC behavior: the code used to only allow unaligned
load/store on ARM for v6+ Darwin, it will now allow unaligned load/store for
v6+ Darwin as well as for v7+ on other targets.

The distinction is made because v6 doesn't guarantee support (but LLVM assumes
that Apple controls hardware+kernel and therefore have conformant v6 CPUs),
whereas v7 does provide this guarantee (and Linux behaves sanely).

Overall this should slightly improve performance in most cases because of
reduced I$ pressure.

Patch by JF Bastien

llvm-svn: 181897
2013-05-15 16:08:30 +00:00
Arnold Schwaighofer
6df028f783 ARM ISel: Don't create illegal types during LowerMUL
The transformation happening here is that we want to turn a
"mul(ext(X), ext(X))" into a "vmull(X, X)", stripping off the extension. We have
to make sure that X still has a valid vector type - possibly recreate an
extension to a smaller type. In case of a extload of a memory type smaller than
64 bit we used create a ext(load()). The problem with doing this - instead of
recreating an extload - is that an illegal type is exposed.

This patch fixes this by creating extloads instead of ext(load()) sequences.

Fixes PR15970.

radar://13871383

llvm-svn: 181842
2013-05-14 22:33:24 +00:00
Mihai Popa
7a2df35483 The purpose of the patch is to fix the syntax of ARM mrc and mrc2 instructions when they are used to write to the APSR. In this case, the destination operand should be APSR_nzcv, and the encoding of the target should be 0b1111 (same as for PC). In pre-UAL syntax, this form used the PC register as a textual target. This is still allowed for backward compatibility.
llvm-svn: 181705
2013-05-13 14:10:04 +00:00
Lang Hames
9019f5804f Correctly preserve the input chain for potential tailcall nodes whose
return values are bitcasts.

The chain had previously been being clobbered with the entry node to
the dag, which sometimes caused other code in the function to be
erroneously deleted when tailcall optimization kicked in.

<rdar://problem/13827621>

llvm-svn: 181696
2013-05-13 10:21:19 +00:00
Rafael Espindola
237980d752 Remove the MachineMove class.
It was just a less powerful and more confusing version of
MCCFIInstruction. A side effect is that, since MCCFIInstruction uses
dwarf register numbers, calls to getDwarfRegNum are pushed out, which
should allow further simplifications.

I left the MachineModuleInfo::addFrameMove interface unchanged since
this patch was already fairly big.

llvm-svn: 181680
2013-05-13 01:16:13 +00:00
Rafael Espindola
d05c5e1727 Remove unused argument.
llvm-svn: 181618
2013-05-10 18:16:59 +00:00
Logan Chien
d5b8ea6c58 Implement AsmParser for ARM unwind directives.
This commit implements the AsmParser for fnstart, fnend,
cantunwind, personality, handlerdata, pad, setfp, save, and
vsave directives.

This commit fixes some minor issue in the ARMELFStreamer:

* The switch back to corresponding section after the .fnend
  directive.

* Emit the unwind opcode while processing .fnend directive
  if there is no .handlerdata directive.

* Emit the unwind opcode to .ARM.extab while processing
  .handlerdata even if .personality directive does not exist.

llvm-svn: 181603
2013-05-10 16:17:24 +00:00
Stepan Dyatkovskiy
13432c8af4 For r181148: fixed warning 'enumeral and non-enumeral type in conditional expression'.
llvm-svn: 181437
2013-05-08 14:51:27 +00:00
Evan Cheng
4d42f5b1d5 ARM AnalyzeBranch should conservatively return true when it sees a predicated
indirect branch at the end of the BB. Otherwise if-converter, branch folding
pass may incorrectly update its successor info if it consider BB as fallthrough
to the next BB.

rdar://13782395

llvm-svn: 181161
2013-05-05 18:06:32 +00:00
Stepan Dyatkovskiy
c06cd03f6e For ARM backend, fixed "byval" attribute support.
Now even the small structures could be passed within byval (small enough
to be stored in GPRs).
In regression tests next function prototypes are checked:

PR15293:
  %artz = type { i32 }
  define void @foo(%artz* byval %s)
  define void @foo2(%artz* byval %s, i32 %p, %artz* byval %s2)
foo: "s" stored in R0
foo2: "s" stored in R0, "s2" stored in R2.

Next AAPCS rules are checked:
5.5 Parameters Passing, C.4 and C.5,
"ParamSize" is parameter size in 32bit words:
-- NSAA != 0, NCRN < R4 and NCRN+ParamSize > R4.
   Parameter should be sent to the stack; NCRN := R4.
-- NSAA != 0, and NCRN < R4, NCRN+ParamSize < R4.
   Parameter stored in GPRs; NCRN += ParamSize.

llvm-svn: 181148
2013-05-05 07:48:36 +00:00
Dmitri Gribenko
82c92dc3dd Add ArrayRef constructor from None, and do the cleanups that this constructor enables
Patch by Robert Wilhelm.

llvm-svn: 181138
2013-05-05 00:40:33 +00:00
Amara Emerson
036eb4649d Revert r181009.
llvm-svn: 181079
2013-05-03 23:57:17 +00:00
Amara Emerson
863672f436 Add support for reading ARM ELF build attributes.
Build attribute sections can now be read if they exist via ELFObjectFile, and
the llvm-readobj tool has been extended with an option to dump this information
if requested. Regression tests are also included which exercise these features.

Also update the docs with a fixed ARM ABI link and a new link to the Addenda
which provides the build attributes specification.

llvm-svn: 181009
2013-05-03 11:36:35 +00:00
Rafael Espindola
50d06e76ab Text files should not be marked executable.
Patch by Oliver Pinter.

llvm-svn: 180797
2013-04-30 19:06:15 +00:00
Mihai Popa
e489a3a040 s tightens up the encoding description for ARM post-indexed ldr instructions. All instructions in this class have bit 4 cleared. It turns out that there is a test case for this, but it was marked XFAIL.
llvm-svn: 180778
2013-04-30 09:00:12 +00:00
Stepan Dyatkovskiy
fd08e5fdd4 Refactoring patch.
1. VarArgStyleRegisters: functionality that emits "store" instructions for byval regs moved out into separated method "StoreByValRegs". Before this patch VarArgStyleRegisters had confused use-cases. It was used for both variadic functions and for regular functions with byval parameters. In last case it created new stack-frame and registered it as VarArg frame, that is wrong.

This patch replaces VarArgsStyleRegisters usage for byval parameters with StoreByValRegs method.

2. In ARMMachineFunctionInfo, "get/setVarArgsRegSaveSize" was renamed to "get/setArgRegsSaveSize". By the same reason. Sometimes it was used for variadic functions, and sometimes for byval parameters in regular functions. Actually, this property means the size of registers, that keeps arguments, and thats why it was renamed.

3. In ARMISelLowering.cpp, ARMTargetLowering class, in methods computeRegArea and StoreByValRegs, VARegXXXXXX was renamed to ArgRegsXXXXXX still by the same reasons.

llvm-svn: 180774
2013-04-30 07:19:58 +00:00
Quentin Colombet
7a0d14f876 ARM: Fix encoding of hint instruction for Thumb.
"hint" space for Thumb actually overlaps the encoding space of the CPS
instruction. In actuality, hints can be defined as CPS instructions where imod
and M bits are all nil.

Handle decoding of permitted nop-compatible hints (i.e. nop, yield, wfi, wfe,
sev) in DecodeT2CPSInstruction.

This commit adds a proper diagnostic message for Imm0_4 and updates all tests.

Patch by Mihail Popa <Mihail.Popa@arm.com>.

llvm-svn: 180617
2013-04-26 17:54:54 +00:00
Benjamin Kramer
40a2d53c85 ARM/NEON: Pattern match vector integer abs to vabs.
llvm-svn: 180604
2013-04-26 15:00:57 +00:00
Arnold Schwaighofer
b1fc314b5f ARM cost model: Integer div and rem is lowered to a function call
Reflect this in the cost model. I observed this in MiBench/consumer-lame.

radar://13354716

llvm-svn: 180576
2013-04-25 21:16:18 +00:00
Stephen Lin
44a24c9593 Add more tests for r179925 to verify correct handling of signext/zeroext; strengthen condition check to require actual MVT::i32 virtual register types, just in case (no actual functionality change)
llvm-svn: 180138
2013-04-23 19:42:25 +00:00
Stephen Lin
93ced316ad Lowercase "is" boolean variable prefix for consistency within function, no functionality change.
llvm-svn: 180136
2013-04-23 19:30:12 +00:00
Eric Christopher
8fa9d9a193 No really, don't store anything to this since it's unconditionally
set below.

llvm-svn: 180015
2013-04-22 14:11:25 +00:00
Eric Christopher
831b87c175 Remove variable store that is never read.
llvm-svn: 180014
2013-04-22 13:51:44 +00:00
Stepan Dyatkovskiy
8adaf54376 Fix for 5.5 Parameter Passing --> Stage C:
-- C.4 and C.5 statements, when NSAA is not equal to SP.
 -- C.1.cp statement for VA functions. Note: There are no VFP CPRCs in a
    variadic procedure.

Before this patch "NSAA != 0" means "don't use GPRs anymore ". But there are
some exceptions in AAPCS.
1. For non VA function: allocate all VFP regs for CPRC. When all VFPs are allocated
   CPRCs would be sent to stack, while non CPRCs may be still allocated in GRPs.
2. Check that for VA functions all params uses GPRs and then stack.
   No exceptions, no CPRCs here.

llvm-svn: 180011
2013-04-22 13:06:52 +00:00
Jim Grosbach
3104dcf2ca Legalize vector truncates by parts rather than just splitting.
Rather than just splitting the input type and hoping for the best, apply
a bit more cleverness. Just splitting the types until the source is
legal often leads to an illegal result time, which is then widened and a
scalarization step is introduced which leads to truly horrible code
generation. With the loop vectorizer, these sorts of operations are much
more common, and so it's worth extra effort to do them well.

Add a legalization hook for the operands of a TRUNCATE node, which will
be encountered after the result type has been legalized, but if the
operand type is still illegal. If simple splitting of both types
ends up with the result type of each half still being legal, just
do that (v16i16 -> v16i8 on ARM, for example). If, however, that would
result in an illegal result type (v8i32 -> v8i8 on ARM, for example),
we can get more clever with power-two vectors. Specifically,
split the input type, but also widen the result element size, then
concatenate the halves and truncate again.  For example on ARM,
To perform a "%res = v8i8 trunc v8i32 %in" we transform to:
  %inlo = v4i32 extract_subvector %in, 0
  %inhi = v4i32 extract_subvector %in, 4
  %lo16 = v4i16 trunc v4i32 %inlo
  %hi16 = v4i16 trunc v4i32 %inhi
  %in16 = v8i16 concat_vectors v4i16 %lo16, v4i16 %hi16
  %res = v8i8 trunc v8i16 %in16

This allows instruction selection to generate three VMOVN instructions
instead of a sequences of moves, stores and loads.

Update the ARMTargetTransformInfo to take this improved legalization
into account.

Consider the simplified IR:

define <16 x i8> @test1(<16 x i32>* %ap) {
  %a = load <16 x i32>* %ap
  %tmp = trunc <16 x i32> %a to <16 x i8>
  ret <16 x i8> %tmp
}

define <8 x i8> @test2(<8 x i32>* %ap) {
  %a = load <8 x i32>* %ap
  %tmp = trunc <8 x i32> %a to <8 x i8>
  ret <8 x i8> %tmp
}

Previously, we would generate the truly hideous:
	.syntax unified
	.section	__TEXT,__text,regular,pure_instructions
	.globl	_test1
	.align	2
_test1:                                 @ @test1
@ BB#0:
	push	{r7}
	mov	r7, sp
	sub	sp, sp, #20
	bic	sp, sp, #7
	add	r1, r0, #48
	add	r2, r0, #32
	vld1.64	{d24, d25}, [r0:128]
	vld1.64	{d16, d17}, [r1:128]
	vld1.64	{d18, d19}, [r2:128]
	add	r1, r0, #16
	vmovn.i32	d22, q8
	vld1.64	{d16, d17}, [r1:128]
	vmovn.i32	d20, q9
	vmovn.i32	d18, q12
	vmov.u16	r0, d22[3]
	strb	r0, [sp, #15]
	vmov.u16	r0, d22[2]
	strb	r0, [sp, #14]
	vmov.u16	r0, d22[1]
	strb	r0, [sp, #13]
	vmov.u16	r0, d22[0]
	vmovn.i32	d16, q8
	strb	r0, [sp, #12]
	vmov.u16	r0, d20[3]
	strb	r0, [sp, #11]
	vmov.u16	r0, d20[2]
	strb	r0, [sp, #10]
	vmov.u16	r0, d20[1]
	strb	r0, [sp, #9]
	vmov.u16	r0, d20[0]
	strb	r0, [sp, #8]
	vmov.u16	r0, d18[3]
	strb	r0, [sp, #3]
	vmov.u16	r0, d18[2]
	strb	r0, [sp, #2]
	vmov.u16	r0, d18[1]
	strb	r0, [sp, #1]
	vmov.u16	r0, d18[0]
	strb	r0, [sp]
	vmov.u16	r0, d16[3]
	strb	r0, [sp, #7]
	vmov.u16	r0, d16[2]
	strb	r0, [sp, #6]
	vmov.u16	r0, d16[1]
	strb	r0, [sp, #5]
	vmov.u16	r0, d16[0]
	strb	r0, [sp, #4]
	vldmia	sp, {d16, d17}
	vmov	r0, r1, d16
	vmov	r2, r3, d17
	mov	sp, r7
	pop	{r7}
	bx	lr

	.globl	_test2
	.align	2
_test2:                                 @ @test2
@ BB#0:
	push	{r7}
	mov	r7, sp
	sub	sp, sp, #12
	bic	sp, sp, #7
	vld1.64	{d16, d17}, [r0:128]
	add	r0, r0, #16
	vld1.64	{d20, d21}, [r0:128]
	vmovn.i32	d18, q8
	vmov.u16	r0, d18[3]
	vmovn.i32	d16, q10
	strb	r0, [sp, #3]
	vmov.u16	r0, d18[2]
	strb	r0, [sp, #2]
	vmov.u16	r0, d18[1]
	strb	r0, [sp, #1]
	vmov.u16	r0, d18[0]
	strb	r0, [sp]
	vmov.u16	r0, d16[3]
	strb	r0, [sp, #7]
	vmov.u16	r0, d16[2]
	strb	r0, [sp, #6]
	vmov.u16	r0, d16[1]
	strb	r0, [sp, #5]
	vmov.u16	r0, d16[0]
	strb	r0, [sp, #4]
	ldm	sp, {r0, r1}
	mov	sp, r7
	pop	{r7}
	bx	lr

Now, however, we generate the much more straightforward:
	.syntax unified
	.section	__TEXT,__text,regular,pure_instructions
	.globl	_test1
	.align	2
_test1:                                 @ @test1
@ BB#0:
	add	r1, r0, #48
	add	r2, r0, #32
	vld1.64	{d20, d21}, [r0:128]
	vld1.64	{d16, d17}, [r1:128]
	add	r1, r0, #16
	vld1.64	{d18, d19}, [r2:128]
	vld1.64	{d22, d23}, [r1:128]
	vmovn.i32	d17, q8
	vmovn.i32	d16, q9
	vmovn.i32	d18, q10
	vmovn.i32	d19, q11
	vmovn.i16	d17, q8
	vmovn.i16	d16, q9
	vmov	r0, r1, d16
	vmov	r2, r3, d17
	bx	lr

	.globl	_test2
	.align	2
_test2:                                 @ @test2
@ BB#0:
	vld1.64	{d16, d17}, [r0:128]
	add	r0, r0, #16
	vld1.64	{d18, d19}, [r0:128]
	vmovn.i32	d16, q8
	vmovn.i32	d17, q9
	vmovn.i16	d16, q8
	vmov	r0, r1, d16
	bx	lr

llvm-svn: 179989
2013-04-21 23:47:41 +00:00
Tim Northover
943f2a9234 ARM: Use ldrd/strd to spill 64-bit pairs when available.
This allows common sp-offsets to be part of the instruction and is
probably faster on modern CPUs too.

llvm-svn: 179977
2013-04-21 11:57:07 +00:00
Tim Northover
de5285eb6f ARM: don't add FrameIndex offset for LDMIA (has no immediate)
Previously, when spilling 64-bit paired registers, an LDMIA with both
a FrameIndex and an offset was produced. This kind of instruction
shouldn't exist, and the extra operand was being confused with the
predicate, causing aborts later on.

This removes the invalid 0-offset from the instruction being
produced.

llvm-svn: 179956
2013-04-20 19:31:00 +00:00
Tim Northover
6a53674c70 Remove unused ShouldFoldAtomicFences flag.
I think it's almost impossible to fold atomic fences profitably under
LLVM/C++11 semantics. As a result, this is now unused and just
cluttering up the target interface.

llvm-svn: 179940
2013-04-20 12:32:43 +00:00