1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-22 04:22:57 +02:00
Commit Graph

15309 Commits

Author SHA1 Message Date
Jim Grosbach
8f30718112 Now that register allocation properly considers reserved regs, simplify the
ARM register class allocation order functions to take advantage of that.

llvm-svn: 112841
2010-09-02 18:14:29 +00:00
Jim Grosbach
4965868724 Mask out reserved registers when constructing the set of allocatable regs.
llvm-svn: 112828
2010-09-02 16:31:21 +00:00
Bob Wilson
c2c106f73e Fill in a missing comment.
llvm-svn: 112826
2010-09-02 16:17:29 +00:00
Bob Wilson
8951c7592c Convert VLD1 and VLD2 instructions to use pseudo-instructions until
after regalloc.

llvm-svn: 112825
2010-09-02 16:00:54 +00:00
Bruno Cardoso Lopes
659f549638 Replace unpckl_undef and unpckh_undef matching with target specific opcodes
llvm-svn: 112806
2010-09-02 05:23:12 +00:00
Bruno Cardoso Lopes
9d4a11d4c6 Move condition out to prepare for more matching
llvm-svn: 112805
2010-09-02 04:20:26 +00:00
Bruno Cardoso Lopes
1b9095fff1 Remove checking for isUNPCKL_v_undef_Mask, the specific node is already emitted for it
llvm-svn: 112804
2010-09-02 03:57:58 +00:00
Bruno Cardoso Lopes
dcdab94661 become more strict about when it's safe to use X86ISD::MOVLPS
llvm-svn: 112799
2010-09-02 02:35:51 +00:00
Eric Christopher
6c3fa8c78f Clang's -ccc-host-triple was ignoring the arch specifier on my triple,
I don't need to implement this quite yet - and not for ConstantInt anyhow.

llvm-svn: 112798
2010-09-02 02:30:46 +00:00
Eric Christopher
cf0b12d117 This should be TargetMaterializeConstant instead.
llvm-svn: 112795
2010-09-02 01:48:11 +00:00
Eric Christopher
9dc3582a82 One definition of isThumb is plenty, thanks.
llvm-svn: 112793
2010-09-02 01:39:14 +00:00
Jim Grosbach
bb3bfa20e1 Remove trailing whitespace
llvm-svn: 112790
2010-09-02 01:02:06 +00:00
Eric Christopher
3aa2bb55d8 Rework arm fast-isel load and store handling. Move offset computation
into the "address selection" routine and handle constant materialization
for stores.

llvm-svn: 112788
2010-09-02 00:53:56 +00:00
Jim Grosbach
94a445d9d1 trivial cleanup
llvm-svn: 112779
2010-09-02 00:02:26 +00:00
Jim Grosbach
2a3afa421b Simplify the tGPR register class now that the register allocators know not
to try to allocate reserved registers.

llvm-svn: 112774
2010-09-01 23:50:23 +00:00
Bob Wilson
3348d2eb50 Remove NEON vmull, vmlal, and vmlsl intrinsics, replacing them with multiply,
add, and subtract operations with zero-extended or sign-extended vectors.
Update tests.  Add auto-upgrade support for the old intrinsics.

llvm-svn: 112773
2010-09-01 23:50:19 +00:00
Bruno Cardoso Lopes
b73f0cbc7a Revert r112689, avoid those kind of checks cause they mess up with mmx
llvm-svn: 112760
2010-09-01 22:59:03 +00:00
Bruno Cardoso Lopes
601bf4c6d3 Using target specific nodes for shuffle nodes makes the mask
check more strict, breaking some cases not checked in the
testsuite, but also exposes some foldings not done before,
as this example:

  movaps  (%rdi), %xmm0
  movaps  (%rax), %xmm1
  movaps  %xmm0, %xmm2
  movss %xmm1, %xmm2
  shufps  $36, %xmm2, %xmm0

now is generated as:

  movaps  (%rdi), %xmm0
  movaps  %xmm0, %xmm1
  movlps  (%rax), %xmm1
  shufps  $36, %xmm1, %xmm0

llvm-svn: 112753
2010-09-01 22:33:20 +00:00
Eric Christopher
abf61f76c9 Some basic store support.
llvm-svn: 112752
2010-09-01 22:16:27 +00:00
Eric Christopher
92238b2b5a Add some more load types in.
llvm-svn: 112721
2010-09-01 18:01:32 +00:00
Chris Lattner
b911e51b26 zap dead code.
llvm-svn: 112712
2010-09-01 16:04:34 +00:00
Chris Lattner
b74759a9fa temporarily revert r112664, it is causing a decoding conflict, and
the testcases should be merged.

llvm-svn: 112711
2010-09-01 16:00:50 +00:00
Bruno Cardoso Lopes
9375b2f67d Use movlps, movlpd, movss and movsd specific nodes instead of pattern matching with movlp pattern fragment
llvm-svn: 112694
2010-09-01 05:08:25 +00:00
Bruno Cardoso Lopes
b69568ab33 minor change, simplify some logic
llvm-svn: 112689
2010-09-01 00:57:08 +00:00
Bruno Cardoso Lopes
c31697f68c Move some functions around so they can be used for some other to come function
llvm-svn: 112687
2010-09-01 00:51:36 +00:00
Bill Wendling
bb6052cfd6 We have a chance for an optimization. Consider this code:
int x(int t) {
  if (t & 256)
    return -26;
  return 0;
}

We generate this:

     tst.w   r0, #256
     mvn     r0, #25
     it      eq
     moveq   r0, #0

while gcc generates this:

     ands    r0, r0, #256
     it      ne
     mvnne   r0, #25
     bx      lr

Scandalous really!

During ISel time, we can look for this particular pattern. One where we have a
"MOVCC" that uses the flag off of a CMPZ that itself is comparing an AND
instruction to 0. Something like this (greatly simplified):

  %r0 = ISD::AND ...
  ARMISD::CMPZ %r0, 0         @ sets [CPSR]
  %r0 = ARMISD::MOVCC 0, -26  @ reads [CPSR]

All we have to do is convert the "ISD::AND" into an "ARM::ANDS" that sets [CPSR]
when it's zero. The zero value will all ready be in the %r0 register and we only
need to change it if the AND wasn't zero. Easy!

llvm-svn: 112664
2010-08-31 22:41:22 +00:00
Bruno Cardoso Lopes
80613a070e Use x86 specific MOVSLDUP node, add more patterns to match it and remove useless load nodes
llvm-svn: 112661
2010-08-31 22:35:05 +00:00
Bruno Cardoso Lopes
8fc83b1960 Use x86 specific MOVSHDUP node and add more patterns to match it
llvm-svn: 112657
2010-08-31 22:22:11 +00:00
Bill Wendling
4a52e8fec0 And ANDS pattern to match the t2ANDS pattern.
llvm-svn: 112654
2010-08-31 22:05:37 +00:00
Jakob Stoklund Olesen
7ffcddc113 Make %EFLAGS unallocatable.
No CCR virtual registers should exist, and %EFLAGS is used in ways that can
surprise RegAllocFast.

llvm-svn: 112650
2010-08-31 21:51:07 +00:00
Bruno Cardoso Lopes
dfa177cf81 Use MOVHLPS node instead of matching using movhlps and movhlps_undef pattern fragments
llvm-svn: 112644
2010-08-31 21:38:49 +00:00
Bruno Cardoso Lopes
6fbe7b9ddd Use MOVLHPS and MOVHLPS x86 nodes whenever possible. Also remove some useless nodes
llvm-svn: 112642
2010-08-31 21:15:21 +00:00
Jim Grosbach
9cc0a6397a SP relative offsets need to be adjusted by the local allocation size when
determining if they're likely to be in range of the SP when resolving
frame references.

llvm-svn: 112624
2010-08-31 18:52:31 +00:00
Jim Grosbach
d0ebe535e9 this assert should just be a condition, since this function is just asking if
the offset is legally encodable, not actually trying to do the encoding.

llvm-svn: 112622
2010-08-31 18:49:31 +00:00
Bill Wendling
0409e77e99 - Cleanup some whitespaces.
- Convert {0,1} and friends into 0b01, which is identical and more consistent.

llvm-svn: 112593
2010-08-31 07:50:46 +00:00
Bruno Cardoso Lopes
08d5d62dcb Use X86ISD::MOVSS and MOVSD to represent the movl mask pattern, also fix the handling of those nodes when seeking for scalars inside vector shuffles
llvm-svn: 112570
2010-08-31 02:26:40 +00:00
Eric Christopher
b2756a8b99 Rewrite slightly so we can expand for floating point types easier.
llvm-svn: 112568
2010-08-31 01:28:42 +00:00
Eric Christopher
21b355b522 If we have an unhandled type then assert, we shouldn't get here for
things we can't handle.

llvm-svn: 112559
2010-08-30 23:48:26 +00:00
Anton Korobeynikov
851437063a Expand MOVi32imm in ARM mode after regalloc. This provides
scheduling opportunities (extra instruction can go in between
MOVT / MOVW pair removing the stall).

llvm-svn: 112546
2010-08-30 22:50:36 +00:00
Bill Wendling
7532e3418e Use the existing T2I_bin_s_irs pattern instead of creating T2I_bin_sw_irs, which
is meant to do exactly the same thing. Thanks to Jim Grosbach for pointing this
out! :-)

llvm-svn: 112538
2010-08-30 22:05:23 +00:00
Jakob Stoklund Olesen
ce3cfe3e8b Remember to clear the shadow kill flag at the same time as clearing the real
kill flag.

This could cause duplicate kill flags when the same register was used twice in a
continuous sequence of STRs.

There is no small test case. <rdar://problem/8218046>

llvm-svn: 112534
2010-08-30 21:52:40 +00:00
Bob Wilson
826a677f94 Remove NEON vmovn intrinsic, replacing it with vector truncate operations.
Auto-upgrade the old intrinsic and update tests.

llvm-svn: 112507
2010-08-30 20:02:30 +00:00
Jim Grosbach
674b25ce31 Make ARM add rN, sp, #imm instructions rematerializable. That's how the address of locals is calculated, so this should
help relieve register pressure a bit. Recalculating the local address is
almost always going to be better than spilling.

llvm-svn: 112503
2010-08-30 19:49:58 +00:00
Bob Wilson
2b83684be8 When expanding NEON VST pseudo instructions, if the original super-register
operand is killed, add it to the expanded instruction as an implicit kill
operand instead of marking the individual subregs with kill flags.  This
should work better in general and also handles the case for VST3 where one
of the subregs was not referenced in the expanded instruction and so was
not marked killed.

llvm-svn: 112494
2010-08-30 18:10:48 +00:00
Bill Wendling
c325a15569 Create Thumb2sI_cpsr and T2sI_cpsr. These new classes indicate that CPSR is the
optional modified register (instead of reg0). Along with r112461 it will make
sure that the optional define of CPSR is marked as "def" and will thus mark the
instructions using these classes (t2ANDS*) as setting the 's' flag.

llvm-svn: 112462
2010-08-30 01:47:35 +00:00
Kalle Raiskila
daba4ffc75 Fix lowering of INSERT_VECTOR_ELT in SPU.
The IDX was treated as byte index, not element index.

llvm-svn: 112422
2010-08-29 12:41:50 +00:00
Bill Wendling
8a7258d771 Fix whitespaces. No functionality changes.
llvm-svn: 112421
2010-08-29 11:31:07 +00:00
Bob Wilson
807d004452 Remove NEON vaddl, vaddw, vsubl, and vsubw intrinsics. Instead, use llvm
IR add/sub operations with one or both operands sign- or zero-extended.
Auto-upgrade the old intrinsics.

llvm-svn: 112416
2010-08-29 05:57:34 +00:00
Eli Friedman
6ccafafe61 A couple of small missed optimizations.
llvm-svn: 112411
2010-08-29 05:07:40 +00:00
Bill Wendling
6d105ce757 - Add a parameter to T2I_bin_irs for those patterns which set the S bit.
- Create T2I_bin_sw_irs to be like T2I_bin_w_irs, but that it sets the S bit.

llvm-svn: 112399
2010-08-29 03:55:31 +00:00
Chris Lattner
646fee99c3 add a bunch more common shuffles to the instprinter.
llvm-svn: 112397
2010-08-29 03:08:08 +00:00
Bill Wendling
8ad57ff92e Name ANDflag to ANDS, which is less stupid.
llvm-svn: 112395
2010-08-29 03:06:09 +00:00
Bill Wendling
6e586677a7 File missing from last commit.
llvm-svn: 112394
2010-08-29 03:02:28 +00:00
Bill Wendling
385ad1516f Create an ARMISD::AND node. This node is exactly like the "ARM::AND" node, but
it sets the CPSR register.

llvm-svn: 112393
2010-08-29 03:02:11 +00:00
Chris Lattner
56bc8ba493 I have manually decoded the imm field of an insertps one too many
times.  This patch causes llc and llvm-mc (which both default to
verbose-asm) to print out comments after a few common shuffle 
instructions which indicates the shuffle mask, e.g.:

	insertps	$113, %xmm3, %xmm0     ## xmm0 = zero,xmm0[1,2],xmm3[1]
	unpcklps	%xmm1, %xmm0    ## xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1]
	pshufd	$1, %xmm1, %xmm1        ## xmm1 = xmm1[1,0,0,0]

This is carefully factored to keep the information extraction (of the
shuffle mask) separate from the printing logic.  I plan to move the
extraction part out somewhere else at some point for other parts of
the x86 backend that want to introspect on the behavior of shuffles.

llvm-svn: 112387
2010-08-28 20:42:31 +00:00
Chris Lattner
8cb4abbc0e fix the buildvector->insertp[sd] logic to not always create a redundant
insertp[sd] $0, which is a noop.  Before:

_f32:                                   ## @f32
	pshufd	$1, %xmm1, %xmm2
	pshufd	$1, %xmm0, %xmm3
	addss	%xmm2, %xmm3
	addss	%xmm1, %xmm0
                                        ## kill: XMM0<def> XMM0<kill> XMM0<def>
	insertps	$0, %xmm0, %xmm0
	insertps	$16, %xmm3, %xmm0
	ret

after:

_f32:                                   ## @f32
	movdqa	%xmm0, %xmm2
	addss	%xmm1, %xmm2
	pshufd	$1, %xmm1, %xmm1
	pshufd	$1, %xmm0, %xmm3
	addss	%xmm1, %xmm3
	movdqa	%xmm2, %xmm0
	insertps	$16, %xmm3, %xmm0
	ret

The extra movs are due to a random (poor) scheduling decision.

llvm-svn: 112379
2010-08-28 17:59:08 +00:00
Chris Lattner
c3b630d64b fix the BuildVector -> unpcklps logic to not do pointless shuffles
when the top elements of a vector are undefined.  This happens all
the time for X86-64 ABI stuff because only the low 2 elements of
a 4 element vector are defined.  For example, on:

_Complex float f32(_Complex float A, _Complex float B) {
  return A+B;
}

We used to produce (with SSE2, SSE4.1+ uses insertps):

_f32:                                   ## @f32
	movdqa	%xmm0, %xmm2
	addss	%xmm1, %xmm2
	pshufd	$16, %xmm2, %xmm2
	pshufd	$1, %xmm1, %xmm1
	pshufd	$1, %xmm0, %xmm0
	addss	%xmm1, %xmm0
	pshufd	$16, %xmm0, %xmm1
	movdqa	%xmm2, %xmm0
	unpcklps	%xmm1, %xmm0
	ret

We now produce:

_f32:                                   ## @f32
	movdqa	%xmm0, %xmm2
	addss	%xmm1, %xmm2
	pshufd	$1, %xmm1, %xmm1
	pshufd	$1, %xmm0, %xmm3
	addss	%xmm1, %xmm3
	movaps	%xmm2, %xmm0
	unpcklps	%xmm3, %xmm0
	ret

This implements rdar://8368414

llvm-svn: 112378
2010-08-28 17:28:30 +00:00
Chris Lattner
7fa5fa1207 improve comments in the unpcklps generating logic, introduce
a new EltStride variable instead of reusing NumElems variable
for a non-obvious purpose.  No functionality change.

llvm-svn: 112377
2010-08-28 17:15:43 +00:00
Chris Lattner
d16c80e27f remove the MSIL backend. It isn't maintained, is buggy, has no testcases
and hasn't kept up with ToT.  Approved by Anton.

llvm-svn: 112375
2010-08-28 16:33:36 +00:00
Bob Wilson
956e07b985 Use pseudo instructions for VST1 and VST2.
llvm-svn: 112357
2010-08-28 05:12:57 +00:00
Chris Lattner
ecf276b787 remove unions from LLVM IR. They are severely buggy and not
being actively maintained, improved, or extended.

llvm-svn: 112356
2010-08-28 04:09:24 +00:00
Bruno Cardoso Lopes
1052e6d5d9 Clean up the logic of vector shuffles -> vector shifts.
Also teach this logic how to handle target specific shuffles if
needed, this is necessary while searching recursively for zeroed
scalar elements in vector shuffle operands.

llvm-svn: 112348
2010-08-28 02:46:39 +00:00
Bob Wilson
abdcae7f20 We don't need to custom-select VLDMQ and VSTMQ anymore.
llvm-svn: 112336
2010-08-28 00:20:11 +00:00
Bob Wilson
412a170b04 When merging Thumb2 loads/stores, do not give up when the offset is one of
the special values that for ARM would be used with IB or DA modes.  Fall
through and consider materializing a new base address is it would be
profitable.

llvm-svn: 112329
2010-08-27 23:57:52 +00:00
Bob Wilson
31d487d235 Change ARM VFP VLDM/VSTM instructions to use addressing mode #4, just like
all the other LDM/STM instructions.  This fixes asm printer crashes when
compiling with -O0.  I've changed one of the NEON tests (vst3.ll) to run
with -O0 to check this in the future.

Prior to this change VLDM/VSTM used addressing mode #5, but not really.
The offset field was used to hold a count of the number of registers being
loaded or stored, and the AM5 opcode field was expanded to specify the IA
or DB mode, instead of the standard ADD/SUB specifier.  Much of the backend
was not aware of these special cases.  The crashes occured when rewriting
a frameindex caused the AM5 offset field to be changed so that it did not
have a valid submode.  I don't know exactly what changed to expose this now.
Maybe we've never done much with -O0 and NEON.  Regardless, there's no longer
any reason to keep a count of the VLDM/VSTM registers, so we can use
addressing mode #4 and clean things up in a lot of places.

llvm-svn: 112322
2010-08-27 23:18:17 +00:00
Bob Wilson
09b040a386 Unsigned value cannot be < 0.
llvm-svn: 112300
2010-08-27 21:44:35 +00:00
Anton Korobeynikov
62a9879ef4 Properly handle passing of FP stuff to varargs function on Win64:
value should be copied to the corresponding shadow reg as well.
Patch by Cameron Esfahani!

llvm-svn: 112262
2010-08-27 14:43:06 +00:00
Daniel Dunbar
f642d43594 X86: Fix an encoding issue with LOCK_ADD64mr, which could lead to very hard to find miscompiles with the integrated assembler.
llvm-svn: 112250
2010-08-27 01:30:14 +00:00
Jim Grosbach
2b81a07dc7 Simplify eliminateFrameIndex() interface back down now that PEI doesn't need
to try to re-use scavenged frame index reference registers. rdar://8277890

llvm-svn: 112241
2010-08-26 23:32:16 +00:00
Jim Grosbach
d21756ab1e tidy up a bit. no functional change.
llvm-svn: 112228
2010-08-26 21:56:30 +00:00
Jim Grosbach
5b8e21eaa6 Turn off the scavenging based frame reg reuse briefly to measure whether it's
still having a significant effect. It shouldn't be now that the pre-RA
virtual base reg stuff is in. Assuming that's valididated by the nightly
testers, we can simplify a lot of the PEI frame index code.

llvm-svn: 112220
2010-08-26 21:29:54 +00:00
Bruno Cardoso Lopes
6150648a64 zap the now unused MVT::getIntVectorWithNumElements
llvm-svn: 112218
2010-08-26 20:53:12 +00:00
Bob Wilson
efc503afd2 Use pseudo instructions for VST3.
llvm-svn: 112208
2010-08-26 18:51:29 +00:00
Bill Wendling
c76a3e317c Reapply r112176 without removing the other CMN patterns (that was unintentional).
llvm-svn: 112206
2010-08-26 18:33:51 +00:00
Bob Wilson
640cc8ce83 Fix comment typos.
llvm-svn: 112202
2010-08-26 18:08:11 +00:00
Jim Grosbach
5b1ce460ec Restrict the register to tGPR to make sure the str instruction will be
encodable as a 16-bit wide instruction.

llvm-svn: 112195
2010-08-26 17:02:47 +00:00
Dan Gohman
b1020bb551 Revert r112176; it broke test/CodeGen/Thumb2/thumb2-cmn.ll.
llvm-svn: 112191
2010-08-26 15:50:25 +00:00
Dan Gohman
8088d5e31d Reapply r112091 and r111922, support for metadata linking, with a
fix: add a flag to MapValue and friends which indicates whether
any module-level mappings are being made. In the common case of
inlining, no module-level mappings are needed, so MapValue doesn't
need to examine non-function-local metadata, which can be very
expensive in the case of a large module with really deep metadata
(e.g. a large C++ program compiled with -g).

This flag is a little awkward; perhaps eventually it can be moved
into the ClonedCodeInfo class.

llvm-svn: 112190
2010-08-26 15:41:53 +00:00
Bill Wendling
a125fb1689 There seems to be a (potential) hardware bug with the CMN instruction and
comparison with 0. These two pieces of code should give identical results:

  rsbs r1, r1, 0
  cmp  r0, r1
  mov  r0, #0
  it   ls
  mov  r0, #1

and:

  cmn  r0, r1
  mov  r0, #0
  it   ls
  mov  r0, #1

However, the CMN gives the *opposite* result when r1 is 0. This is because the
carry flag is set in the CMP case but not in the CMN case. In short, the CMP
instruction doesn't perform a truncate of the (logical) NOT of 0 plus the value
of r0 and the carry bit (because the "carry bit" parameter to AddWithCarry is
defined as 1 in this case, the carry flag will always be set when r0 >= 0). The
CMN instruction doesn't perform a NOT of 0 so there is never a "carry" when this
AddWithCarry is performed (because the "carry bit" parameter to AddWithCarry is
defined as 0).

The AddWithCarry in the CMP case seems to be relying upon the identity:

  ~x + 1 = -x

However when x is 0 and unsigned, this doesn't hold:

   x = 0
  ~x = 0xFFFF FFFF
  ~x + 1 = 0x1 0000 0000
  (-x = 0) != (0x1 0000 0000 = ~x + 1)

Therefore, we should disable *all* versions of CMN, especially when comparing
against zero, until we can limit when the CMN instruction is used (when we know
that the RHS is not 0) or when we have a hardware fix for this.

(See the ARM docs for the "AddWithCarry" pseudo-code.)

This is related to <rdar://problem/7569620>.

llvm-svn: 112176
2010-08-26 09:07:33 +00:00
Chris Lattner
148485f707 implement SplitVecOp_CONCAT_VECTORS, fixing the included testcase with SSE1.
llvm-svn: 112171
2010-08-26 05:51:22 +00:00
Bob Wilson
e74da18e57 Use pseudo instructions for VST1d64Q.
llvm-svn: 112170
2010-08-26 05:33:30 +00:00
Chris Lattner
5256226fc8 fix sse1 only codegen in x86-64 mode, which is something we
apparently try to support.

llvm-svn: 112168
2010-08-26 05:24:29 +00:00
Chris Lattner
467be04ce6 remove dead proto
llvm-svn: 112131
2010-08-26 01:14:37 +00:00
Bruno Cardoso Lopes
8bb7c79c1a Fix PR7748 without using microsoft extensions
llvm-svn: 112128
2010-08-26 01:02:53 +00:00
Jim Grosbach
6500a1a2f9 Enable pre-RA virtual frame base register allocation. rdar://8277890
llvm-svn: 112127
2010-08-26 00:58:06 +00:00
Bob Wilson
1df383d9cb Revert svn 107892 (with changes to work with trunk). It caused a crash if
a VLD result was not used (Radar 8355607).  It should also fix pr7988, but
I haven't verified that yet.

llvm-svn: 112118
2010-08-26 00:13:36 +00:00
Chris Lattner
eb4c7e43cc we should pattern match the SSE complex arithmetic ops.
llvm-svn: 112109
2010-08-25 23:31:42 +00:00
Bob Wilson
b85b3cf91f Start converting NEON load/stores to use pseudo instructions, beginning here
with the VST4 instructions.  Until after register allocation, we want to
represent sets of adjacent registers by a single super-register.  These
VST4 pseudo instructions have a single QQ or QQQQ source register operand.
They get expanded to the real VST4 instructions with 4 separate D register
operands.  Once this conversion is complete, we'll be able to remove the
NEONPreAllocPass and avoid some fragile and hacky code elsewhere.

llvm-svn: 112108
2010-08-25 23:27:42 +00:00
Bruno Cardoso Lopes
28f3261dbd Revert this for now, PUNPCKLDQ dont operate on v4f32
llvm-svn: 112090
2010-08-25 21:26:37 +00:00
Daniel Dunbar
1a881a3eca X86: Fix misencode of RI64mi8. This fixes OpenSSL / x86_64-apple-darwin10 / clang -O3.
llvm-svn: 112089
2010-08-25 21:11:02 +00:00
Jim Grosbach
50dbbda454 Don't override the var from the enclosing scope.
When doing copy/paste/modify, it's apparently rather important to remember
the 'modify' bit...

llvm-svn: 112075
2010-08-25 19:11:34 +00:00
Chris Lattner
84423f212d zap dead code
llvm-svn: 112073
2010-08-25 19:00:00 +00:00
Benjamin Kramer
4eb0e8bb2c Remove dead recursive function. Yay for clang -Wunused-function.
llvm-svn: 112060
2010-08-25 17:27:58 +00:00
Daniel Dunbar
9b7c2ce591 ARM/Thumb2: Fix a misselect in getARMCmp, when attempting to adjust a signed
comparison that would overflow.
 - The other under/overflow cases can't actually happen because the immediates
   which would trigger them are legal (so we don't enter this code), but
   adjusted the style to make it clear the transform is always valid.

llvm-svn: 112053
2010-08-25 16:58:05 +00:00
Eric Christopher
1bf07e75ac Do type checks before we bother to do everything else.
llvm-svn: 112039
2010-08-25 08:43:57 +00:00
Anton Korobeynikov
1544f79e36 Fix nasty mingw32 bug, which e.g. prevented llvm-gcc bootstrap there.
Mark _alloca call as clobberring EFLAGS, otherwise some DCE might remove
other flags-clobberring stuff (e.g. cmp instructions) occuring after
_alloca call.

llvm-svn: 112034
2010-08-25 07:50:11 +00:00
Eric Christopher
9e3831d7a9 Reorganize load mechanisms. Handle types in a little less fixed way.
Fix some todos.  No functional change.

llvm-svn: 112031
2010-08-25 07:23:49 +00:00
Bruno Cardoso Lopes
af72dd7362 PUNPCKLDQ should also be used for v4f32
llvm-svn: 112020
2010-08-25 02:55:40 +00:00
Bruno Cardoso Lopes
33aa4f7d1c teach lowering to get target specific nodes for pshufd, emulating the same isel behavior for now, so we can pass all vector shuffle tests
llvm-svn: 112017
2010-08-25 02:35:37 +00:00
Eric Christopher
a2d3859ee7 Fix predicate and add a comment.
llvm-svn: 111981
2010-08-24 22:34:11 +00:00
Eric Christopher
5477fa47fd Rework braindead conditionals I put in yesterday.
llvm-svn: 111974
2010-08-24 22:07:27 +00:00
Eric Christopher
10422f70dc Fix thumb2 mode loads to have the correct operand ordering. Add a todo
to fix this in the port.

llvm-svn: 111973
2010-08-24 22:03:02 +00:00
Jim Grosbach
1b102f0b63 Add ARM heuristic for when to allocate a virtual base register for stack
access. rdar://8277890&7352504

llvm-svn: 111968
2010-08-24 21:19:33 +00:00
Daniel Dunbar
b96b0c40d3 MC/X86: Tweak imul recognition, previous hack only applies for the imul form
taking immediates.

llvm-svn: 111950
2010-08-24 19:37:56 +00:00
Daniel Dunbar
3b74f75d13 MC/X86: Add custom hack for recognizing "imul $12, %eax" and friends.
llvm-svn: 111947
2010-08-24 19:24:18 +00:00
Daniel Dunbar
75e77b0063 MC/X86: Warn on scale factors > 1 without index register, instead of erroring,
for 'as' compatibility.

llvm-svn: 111945
2010-08-24 19:13:38 +00:00
Jim Grosbach
0c3eb7ca50 Move enabling the local stack allocation pass into the target where it belongs.
For now it's still a command line option, but the interface to the generic
code doesn't need to know that.

llvm-svn: 111942
2010-08-24 19:05:43 +00:00
Jim Grosbach
a110ecf96a add ARM cmd line option to force always using virtual base regs when possible.
Intended to help ease reproducing problems by increasing base register usage
after heuristics for only using the when needed are in place.

llvm-svn: 111930
2010-08-24 18:04:52 +00:00
Dan Gohman
e400c660e4 Fix X86's isLegalAddressingMode to recognize that static addresses
need not be RIP-relative in small mode.

llvm-svn: 111917
2010-08-24 15:55:12 +00:00
Kalle Raiskila
1be8a5f947 Fix SPU BE to use all the available return registers.
llc used to assert on the added testcase.

llvm-svn: 111911
2010-08-24 11:50:48 +00:00
Kalle Raiskila
ef9e592448 Remove some dead code from SPU BE that remained
from 64bit vector support.

llvm-svn: 111910
2010-08-24 11:05:51 +00:00
Bruno Cardoso Lopes
7939025262 Use pshufhw and pshuflw in more cases and fix getTargetShuffleNode number of arguments
llvm-svn: 111890
2010-08-24 01:16:15 +00:00
Bill Wendling
c92b4d86ad Add comments for what the condition code symbols mean.
llvm-svn: 111889
2010-08-24 01:11:30 +00:00
Eric Christopher
c2ed70d52b Update comment.
llvm-svn: 111887
2010-08-24 01:10:52 +00:00
Eric Christopher
5f3382bacc Fix the opcode and the operands for the load instruction.
llvm-svn: 111885
2010-08-24 01:10:04 +00:00
Eric Christopher
5d1289db95 Add register class hack that needs to go away, but makes it more obvious
that it needs to go away.  Use loadRegFromStackSlot where possible.

Also, remember to update the value map.

llvm-svn: 111883
2010-08-24 00:50:47 +00:00
Eric Christopher
696d6ee9d7 Add some more debugging code, make it more obvious that RegOffset is
getting an address for an object and select some default values.

llvm-svn: 111871
2010-08-24 00:07:24 +00:00
Eric Christopher
a1652c6ea6 Don't need the extra register here.
llvm-svn: 111864
2010-08-23 23:28:04 +00:00
Eric Christopher
2f01adebca Add some more "get address into register" code and a more TODOs/FIXMEs.
llvm-svn: 111860
2010-08-23 23:14:31 +00:00
Eric Christopher
7ec47db6b2 Add an ARMFunctionInfo member and use it.
llvm-svn: 111854
2010-08-23 22:32:45 +00:00
Eric Christopher
e0d09e27f8 Start getting ARM loads/address computation going.
llvm-svn: 111850
2010-08-23 21:44:12 +00:00
Bruno Cardoso Lopes
ed9ff8d8d0 Start using target speficic nodes for shuffles: pshufhw and pshuflw
llvm-svn: 111837
2010-08-23 20:41:02 +00:00
Gabor Greif
6bd4b1cc6c tyops
llvm-svn: 111835
2010-08-23 20:30:51 +00:00
Chris Lattner
f0f35c4aea Add a new llvm.x86.int intrinsic, allowing access to the
x86 int and int3 instructions.  Patch by Peter Housel!

llvm-svn: 111831
2010-08-23 19:39:25 +00:00
Chris Lattner
f4dfc7aaab random improvement for variable shift codegen.
llvm-svn: 111813
2010-08-23 17:30:29 +00:00
Anton Korobeynikov
a68e2a53a1 Revert invalid r111792. Jump tables are not broken on x86-64 / coff,
it's COFF emitter which does not support differences of two symbols
(and needs to be fixed). GAS is pretty fine with code produced.

llvm-svn: 111801
2010-08-23 07:38:51 +00:00
Michael J. Spencer
c52ac23659 Workaround broken jump tables on x86-64 COFF.
llvm-svn: 111792
2010-08-23 04:45:37 +00:00
Anton Korobeynikov
c3294e6abe Use rip-rel addressing on win64 by default. For this we just
defaults to small pic code model.

llvm-svn: 111741
2010-08-21 17:21:11 +00:00
Michael J. Spencer
18689045ce MC: Add partial x86-64 support to COFF.
llvm-svn: 111728
2010-08-21 05:58:13 +00:00
Dan Gohman
30b8e6cfd2 Fix x86 fast-isel's cmp+branch folding to avoid folding when the
comparison is in a different basic block from the branch. In such
cases, the comparison's operands may not have initialized virtual
registers available.

llvm-svn: 111709
2010-08-21 02:32:36 +00:00
Bruno Cardoso Lopes
1998fbbf1a Prepare LowerVECTOR_SHUFFLEv8i16 to use x86 target specific nodes directly
llvm-svn: 111704
2010-08-21 01:32:18 +00:00
Bruno Cardoso Lopes
28d9071635 This is the first step towards refactoring the x86 vector shuffle code. The
general idea here is to have a group of x86 target specific nodes which are
going to be selected during lowering and then directly matched in isel.

The commit includes the addition of those specific nodes and a *bunch* of
patterns, and incrementally we're going to switch between them and what we
have right now. Both the patterns and target specific nodes can change as
we move forward with this work.

llvm-svn: 111691
2010-08-20 22:55:05 +00:00
Bill Wendling
163660135e Create the new linker type "linker_private_weak_def_auto".
It's similar to "linker_private_weak", but it's known that the address of the
object is not taken. For instance, functions that had an inline definition, but
the compiler decided not to inline it. Note, unlike linker_private and
linker_private_weak, linker_private_weak_def_auto may have only default
visibility.  The symbols are removed by the linker from the final linked image
(executable or dynamic library).

llvm-svn: 111684
2010-08-20 22:05:50 +00:00
Bob Wilson
0039bc228b Replace the arm.neon.vmovls and vmovlu intrinsics with vector sign-extend and
zero-extend operations.

llvm-svn: 111614
2010-08-20 04:54:02 +00:00
Eric Christopher
e082792357 Fix loop conditionals (MO.isDef() asserts that it's a reg) and
move some constraints around.

llvm-svn: 111594
2010-08-20 00:36:24 +00:00
Eric Christopher
df3a3f5e3e Add a couple of random comments.
llvm-svn: 111592
2010-08-20 00:20:31 +00:00
Jim Grosbach
4e6f40561f Better handling of offsets on frame index references. rdar://8277890
llvm-svn: 111585
2010-08-19 23:52:25 +00:00
Jim Grosbach
d009b9d0a8 Add Thumb1 support for virtual frame indices.
rdar://8277890

llvm-svn: 111533
2010-08-19 17:52:13 +00:00
Eric Christopher
8f9362166c Silence warning.
llvm-svn: 111518
2010-08-19 15:35:27 +00:00
Chris Lattner
355d472093 fix PR7465, mishandling of lcall and ljmp: intersegment long
call and jumps.

llvm-svn: 111496
2010-08-19 01:18:43 +00:00
Chris Lattner
b3abfa861f minor progress towards fixing PR7465
llvm-svn: 111494
2010-08-19 01:00:34 +00:00
Eric Christopher
b80df4f04e Add an AddOptionalDefs method and use it.
llvm-svn: 111489
2010-08-19 00:37:05 +00:00
Bill Wendling
fa85185486 Add the "isCompare" attribute to the defm instead of each individual instr.
llvm-svn: 111481
2010-08-19 00:05:48 +00:00
Jakob Stoklund Olesen
f2b0bcb397 Don't call Predicate_* in Mips.
llvm-svn: 111468
2010-08-18 23:56:46 +00:00
Eric Christopher
0749ca13a8 Remove extra header.
llvm-svn: 111456
2010-08-18 23:38:16 +00:00
Jim Grosbach
6f036da8dc Enable ARM base register reuse to local stack slot allocation. Whenever a new
frame index reference to an object in the local block is seen, check if
it's near enough to any previously allocaated base register to re-use.

rdar://8277890

llvm-svn: 111443
2010-08-18 22:44:49 +00:00
Bill Wendling
d4fd98ebda Minor simplification. Gets rid of a needless temporary.
llvm-svn: 111430
2010-08-18 21:32:07 +00:00
Bill Wendling
fa83b9853e Marked with ATTRIBUTE_USED so that clang doesn't complain.
llvm-svn: 111383
2010-08-18 18:40:57 +00:00
Jim Grosbach
b517fe948f Add hook for re-using virtual base registers for local stack slot access.
Nothing fancy, just ask the target if any currently available base reg
is in range for the instruction under consideration and use the first one
that is. Placeholder ARM implementation simply returns false for now.

ongoing saga of rdar://8277890

llvm-svn: 111374
2010-08-18 17:57:37 +00:00
Kalle Raiskila
05d3cc2ef8 Fix a bug with insertelement on SPU.
The previous algorithm in LowerVECTOR_SHUFFLE 
didn't check all requirements for "monotonic" shuffles.

llvm-svn: 111361
2010-08-18 10:20:29 +00:00
Kalle Raiskila
8b6f5df4ae Remove all traces of v2[i,f]32 on SPU.
The "half vectors" are now widened to full size by the legalizer.
The only exception is in parameter passing, where half vectors are 
expanded. This causes changes to some dejagnu tests.

llvm-svn: 111360
2010-08-18 10:04:39 +00:00
Kalle Raiskila
0ee13a45c8 Change SPU C calling convention to match that described in
"SPU Application Binary Interface Specification, v1.9" by
IBM. 
Specifically: use r3-r74 to pass parameters and the return value.

llvm-svn: 111358
2010-08-18 09:50:30 +00:00
Chris Lattner
0a9bda3bde remove some dead code.
llvm-svn: 111345
2010-08-18 02:42:11 +00:00
Chris Lattner
f94830f175 remove some code that is dead now that lea's are modeled with segment registers.
llvm-svn: 111343
2010-08-18 02:40:44 +00:00
Bob Wilson
412be3eea6 Expand ZERO_EXTEND operations for NEON vector types.
Testcase from Nick Lewycky.

llvm-svn: 111341
2010-08-18 01:45:52 +00:00
Jim Grosbach
ff8f931bbf Add materialization of virtual base registers for frame indices allocated into
the local block. Resolve references to those indices to a new base register.
For simplification and testing purposes, a new virtual base register is
allocated for each frame index being resolved. The result is truly horrible,
but correct, code that's good for exercising the new code paths.

Next up is adding thumb1 support, which should be very simple. Following that
will be adding base register re-use and implementing a reasonable ARM
heuristic for when a virtual base register should be generated at all.

llvm-svn: 111315
2010-08-17 22:41:55 +00:00
Anton Korobeynikov
52c8ecf231 Revert part of one of the prev. patches - tailjmp will follow later.
llvm-svn: 111291
2010-08-17 21:08:28 +00:00
Anton Korobeynikov
f0600e9e8a More fixes for win64:
- Do not clobber al during variadic calls, this is AMD64 ABI-only feature
  - Emit wincall64, where necessary
Patch by Cameron Esfahani!

llvm-svn: 111289
2010-08-17 21:06:07 +00:00
Anton Korobeynikov
f1f88db4fd Enable more win64 calls folding opportunities.
Patch by Cameron Esfahani!

llvm-svn: 111288
2010-08-17 21:06:01 +00:00
Jakob Stoklund Olesen
20dbe1681b Don't call tablegen'ed Predicate_* functions in the ARM target.
llvm-svn: 111277
2010-08-17 20:39:04 +00:00
Jim Grosbach
1d9631950f 80 column cleanup.
llvm-svn: 111266
2010-08-17 18:39:16 +00:00
Jakob Stoklund Olesen
74db02758b Don't call Predicate_* methods directly from Sparc target.
Modernize predicates a bit.

The Predicate_* methods are not used by TableGen any longer. They are only
emitted for the sake of legacy code.

llvm-svn: 111263
2010-08-17 18:17:12 +00:00
Jim Grosbach
4597437c58 Add hook to examine an instruction referencing a frame index to determine
whether to allocate a virtual frame base register to resolve the frame
index reference in it. Implement a simple version for ARM to aid debugging.

In LocalStackSlotAllocation, scan the function for frame index references
to local frame indices and ask the target whether to allocate virtual
frame base registers for any it encounters. Purely infrastructural for
debug output. Next step is to actually allocate base registers, then add
intelligent re-use of them.

rdar://8277890

llvm-svn: 111262
2010-08-17 18:13:53 +00:00
Jim Grosbach
b321adf3af explicitly handle no-op cases for clarity. Fixes clang warning.
llvm-svn: 111260
2010-08-17 18:00:41 +00:00
Bob Wilson
e382fce916 Change ARM PKHTB and PKHBT instructions to use a shift_imm operand to avoid
printing "lsl #0".  This fixes the remaining parts of pr7792.  Make
corresponding changes for encoding/decoding these instructions.

llvm-svn: 111251
2010-08-17 17:23:19 +00:00
Chris Lattner
a556264e06 fix emacs language spec's, patch by Edmund Grimley-Evans!
llvm-svn: 111241
2010-08-17 16:20:04 +00:00
Bob Wilson
6239dc42c6 Allow more cases of undef shuffle indices and add tests for them.
llvm-svn: 111226
2010-08-17 05:54:34 +00:00
Eric Christopher
9a8050c4e1 Copy over some overridden MI wrappers for ARM fast-isel. This is where
we're adding predicates and optional defs to the MachineInstrs.

llvm-svn: 111222
2010-08-17 01:25:29 +00:00
Eric Christopher
8a68f2fc40 Make arm fast-isel possible to enable via command line.
llvm-svn: 111219
2010-08-17 00:46:57 +00:00
Bob Wilson
1e40f2351c Ignore undef shuffle indices when checking for a VTRN shuffle. Radar 8290937.
llvm-svn: 111208
2010-08-16 23:37:17 +00:00
Bob Wilson
d662e8cd02 Generalize a pattern for PKHTB: an SRL of 16-31 bits will guarantee
that the high halfword is zero.  The shift need not be exactly 16 bits.

llvm-svn: 111196
2010-08-16 22:26:55 +00:00
Eli Friedman
515b81a494 Comment out some broken/unused/useless instructions which mess up disassembly.
llvm-svn: 111185
2010-08-16 21:18:51 +00:00
Eli Friedman
b9707bb261 Don't attempt to SimplifyShortMoveForm in 64-bit mode.
llvm-svn: 111182
2010-08-16 21:03:32 +00:00
Matt Fleming
0078681411 Hookup ELF support for X86.
llvm-svn: 111173
2010-08-16 18:36:14 +00:00
Bob Wilson
985dab611d Rename sat_shift operand to shift_imm, in preparation for using it for other
instructions besides saturate instructions.  No functional changes.

llvm-svn: 111168
2010-08-16 18:27:34 +00:00
Jakob Stoklund Olesen
437fea641b Partially revert r111155. It looks like MSVC is calling an operator<() that
clang says is unused.

llvm-svn: 111167
2010-08-16 18:24:54 +00:00
Jakob Stoklund Olesen
a3eb6a36c2 Remove unused functions.
llvm-svn: 111155
2010-08-16 17:18:18 +00:00
Bob Wilson
98641e5a51 Remove unused code.
llvm-svn: 111154
2010-08-16 17:06:03 +00:00
Argyrios Kyrtzidis
75b69c1de3 Revert r111082. No warnings for this common pattern.
llvm-svn: 111102
2010-08-15 10:27:23 +00:00
Eric Christopher
1470fe415c Rework how the non-sse2 memory barrier is lowered so that the
encoding is correct for the built-in assembler.

Based on a patch from Chris.

llvm-svn: 111083
2010-08-14 21:51:50 +00:00
Argyrios Kyrtzidis
70b248e3ac Add ATTRIBUTE_UNUSED to methods that are not supposed to be used.
llvm-svn: 111082
2010-08-14 21:35:10 +00:00
Chris Lattner
8426971169 improve indentation
llvm-svn: 111073
2010-08-14 17:26:09 +00:00
Bob Wilson
b1eb015fc8 T2I_rbin_irs rr variant is for disassembly only, so don't provide a pattern.
llvm-svn: 111068
2010-08-14 03:18:29 +00:00
Bob Wilson
92bf5a7425 Add a Thumb2 t2RSBrr instruction for disassembly only.
This fixes another part of PR7792.

llvm-svn: 111057
2010-08-13 23:24:25 +00:00
Bob Wilson
ca672ee828 Temporarily disable tail calls on ARM to work around some linker problems.
llvm-svn: 111050
2010-08-13 22:43:33 +00:00
Bob Wilson
0883c6aae3 Move the Thumb2 SSAT and USAT optional shift operator out of the
instruction opcode.  This fixes part of PR7792.

llvm-svn: 111047
2010-08-13 21:48:10 +00:00
Bruno Cardoso Lopes
1eaa601d84 Add comments to some pattern fragments in x86
llvm-svn: 111041
2010-08-13 20:39:01 +00:00
Bob Wilson
c044a43293 Refactor the code for disassembling Thumb2 saturate instructions along the
same lines as the change I made for ARM saturate instructions.

llvm-svn: 111029
2010-08-13 19:04:21 +00:00
Dale Johannesen
3f9c148d0e Revert 110491. While not wrong, it was based on a
misanalysis and is undesirable.

llvm-svn: 111028
2010-08-13 18:43:45 +00:00
Bruno Cardoso Lopes
8b07859f3a Fix comment to reflect code, and remove an unused argument
llvm-svn: 111022
2010-08-13 17:50:47 +00:00
Bruno Cardoso Lopes
de5f3f5cb6 Improve comment to make explicit why not to touch this could before JIT goes MC
llvm-svn: 111021
2010-08-13 17:44:10 +00:00
Eric Christopher
63c83f19a0 Revert last patch and r110954 as I meant to.
llvm-svn: 111001
2010-08-13 02:37:50 +00:00
Eric Christopher
e9a4223bc8 Revert r110954 for now, pseudo instructions can't make it through to the JIT.
llvm-svn: 111000
2010-08-13 02:30:00 +00:00
Bruno Cardoso Lopes
350d186d69 Some small clean-up: use of pseudo instructions
llvm-svn: 110954
2010-08-12 20:55:18 +00:00
Johnny Chen
78345b1dfe Cleaned up the for-disassembly-only entries in the arm instruction table so that
the memory barrier variants (other than 'SY' full system domain read and write)
are treated as one instruction with option operand.

llvm-svn: 110951
2010-08-12 20:46:17 +00:00
Evan Cheng
362df591b6 Make sure ARM constant island pass does not break up an IT block. If the split point is in the middle of an IT block, it should move it up to just above the IT instruction. rdar://8302637
llvm-svn: 110947
2010-08-12 20:30:05 +00:00
Bruno Cardoso Lopes
7cb26cb8be - Teach SSEDomainFix to switch between different levels of AVX instructions. Here we guess that AVX will have domain issues, so just implement them for consistency and in the future we remove if it's unnecessary.
- Make foldMemoryOperandImpl aware of 256-bit zero vectors folding and support the 128-bit counterparts of AVX too.
- Make sure MOV[AU]PS instructions are only selected when SSE1 is enabled, and duplicate the patterns to match AVX.
- Add a testcase for a simple 128-bit zero vector creation.

llvm-svn: 110946
2010-08-12 20:20:53 +00:00
Bruno Cardoso Lopes
99b5298854 Define AVX 128-bit pattern versions of SET0PS/PD.
llvm-svn: 110937
2010-08-12 18:20:59 +00:00
Bruno Cardoso Lopes
43a7ba2bbc Fix comment order
llvm-svn: 110898
2010-08-12 02:08:52 +00:00
Bruno Cardoso Lopes
bb491bd56c Begin to support some vector operations for AVX 256-bit intructions. The long
term goal here is to be able to match enough of vector_shuffle and build_vector
so all avx intrinsics which aren't mapped to their own built-ins but to
shufflevector calls can be codegen'd. This is the first (baby) step, support
building zeroed vectors.

llvm-svn: 110897
2010-08-12 02:06:36 +00:00
Johnny Chen
fef1367b50 The autogened decoder was confusing the ARM STRBT for ARM USAT, because the .td
entry for ARM STRBT is actually a super-instruction for A8.6.199 STRBT A1 & A2.
Recover by looking for ARM:USAT encoding pattern before delegating to the auto-
gened decoder.

Added a "usat" test case to arm-tests.txt.

llvm-svn: 110894
2010-08-12 01:40:54 +00:00
Daniel Dunbar
4f45de1b1e MC/X86/AsmParser: Give an explicit error message when we reject an instruction
because it could have an ambiguous suffix.

llvm-svn: 110890
2010-08-12 00:55:42 +00:00
Daniel Dunbar
f2b4982344 MC/AsmParser: Push the burdon of emitting diagnostics about unmatched
instructions onto the target specific parser, which can do a better job.

llvm-svn: 110889
2010-08-12 00:55:38 +00:00
Daniel Dunbar
0a98bc5619 tblgen/AsmMatcher: Always emit the match function as 'MatchInstructionImpl',
target specific parsers can adapt the TargetAsmParser to this.

llvm-svn: 110888
2010-08-12 00:55:32 +00:00
Johnny Chen
9a37d16281 Changed the format of DMBsy, DSBsy, and friends from Pseudo to MiscFrm.
Added two test cases to arm-tests.txt.

llvm-svn: 110880
2010-08-11 23:35:12 +00:00
Bob Wilson
3582107cf8 Move the ARM SSAT and USAT optional shift amount operand out of the
instruction opcode.  This also fixes part of PR7792.

llvm-svn: 110875
2010-08-11 23:10:46 +00:00
Jakob Stoklund Olesen
5a62f10abc Fix <rdar://problem/8282498> even if it doesn't reproduce on trunk.
When a register is defined by a partial load:

  %reg1234:sub_32 = MOV32mr <fi#-1>; GR64:%reg1234

That load cannot be folded into an instruction using the full 64-bit register.
It would become a 64-bit load.

This is related to the recent change to have isLoadFromStackSlot return false on
a sub-register load.

llvm-svn: 110874
2010-08-11 23:08:22 +00:00
Dan Gohman
54027cf446 Don't use unsigned char for alignments in TargetData. There aren't
that many of these things, so the memory savings isn't significant,
and there are now situations where there can be alignments greater
than 128.

llvm-svn: 110836
2010-08-11 18:15:01 +00:00
Dan Gohman
d91d51116b Use ISD::ADD instead of ISD::SUB with a negated constant. This
avoids trouble if the return type of TD->getPointerSize() is
changed to something which doesn't promote to a signed type,
and is simpler anyway.

Also, use getCopyFromReg instead of getRegister to read a
physical register's value.

llvm-svn: 110835
2010-08-11 18:14:00 +00:00
Jim Grosbach
1128a47289 cortex m4 has floating point support, but only single precision.
llvm-svn: 110810
2010-08-11 15:44:15 +00:00
Bill Wendling
f10d5c00fc Consider this code snippet:
float t1(int argc) {
  return (argc == 1123) ? 1.234f : 2.38213f;
}

We would generate truly awful code on ARM (those with a weak stomach should look
away):

_t1:
  movw   r1, #1123
  movs   r2, #1
  movs   r3, #0
  cmp    r0, r1
  mov.w  r0, #0
  it     eq
  moveq  r0, r2
  movs   r1, #4
  cmp    r0, #0
  it     ne
  movne  r3, r1
  adr    r0, #LCPI1_0
  ldr    r0, [r0, r3]
  bx     lr

The problem was that legalization was creating a cascade of SELECT_CC nodes, for
for the comparison of "argc == 1123" which was fed into a SELECT node for the ?:
statement which was itself converted to a SELECT_CC node. This is because the
ARM back-end doesn't have custom lowering for SELECT nodes, so it used the
default "Expand".

I added a fairly simple "LowerSELECT" to the ARM back-end. It takes care of this
testcase, but can obviously be expanded to include more cases.

Now we generate this, which looks optimal to me:

_t1:
  movw   r1, #1123
  movs   r2, #0
  cmp    r0, r1
  adr    r0, #LCPI0_0
  it     eq
  moveq  r2, #4
  ldr    r0, [r0, r2]
  bx     lr
  .align  2
LCPI0_0:
  .long   1075344593  @ float 2.382130e+00
  .long   1067316150  @ float 1.234000e+00

llvm-svn: 110799
2010-08-11 08:43:16 +00:00
Evan Cheng
f8604b772e Report error if codegen tries to instantiate a ARM target when the cpu does support it. e.g. cortex-m* processors.
llvm-svn: 110798
2010-08-11 07:17:46 +00:00
Evan Cheng
4929ba9d20 ArchV7M implies HW division instructions.
llvm-svn: 110797
2010-08-11 07:00:16 +00:00
Evan Cheng
31e15214c6 ArchV6T2, V7A, and V7M implies Thumb2; Archv7A implies NEON.
llvm-svn: 110796
2010-08-11 06:57:53 +00:00
Evan Cheng
273160895e Add ARM Archv6M and let it implies FeatureDB (having dmb, etc.)
llvm-svn: 110795
2010-08-11 06:51:54 +00:00
Daniel Dunbar
bc7c0a60da MC/ARM: Add basic support for handling predication by parsing it out of the mnemonic into a separate operand form.
llvm-svn: 110794
2010-08-11 06:37:20 +00:00
Daniel Dunbar
63628f1443 MC/ARM: Split mnemonic on '.' characters.
llvm-svn: 110793
2010-08-11 06:37:16 +00:00
Daniel Dunbar
bbaa88a848 MC/ARM: Fill in ARMOperand::dump a bit.
llvm-svn: 110792
2010-08-11 06:37:12 +00:00
Daniel Dunbar
ee80a239ed MCAsmParser: Add dump() hook to MCParsedAsmOperand.
llvm-svn: 110790
2010-08-11 06:37:04 +00:00
Daniel Dunbar
74ed9321a3 MC/ARM: Add an ARMOperand class for condition codes.
llvm-svn: 110788
2010-08-11 06:36:53 +00:00
Evan Cheng
e67c4c3723 Really control isel of barrier instructions with cpu feature.
llvm-svn: 110787
2010-08-11 06:36:31 +00:00
Evan Cheng
e5bab36c75 Add Cortex-M0 support. It's a ARMv6m device (no ARM mode) with some 32-bit
instructions: dmb, dsb, isb, msr, and mrs.

llvm-svn: 110786
2010-08-11 06:30:38 +00:00
Evan Cheng
5fca4ca5f9 - Add subtarget feature -mattr=+db which determine whether an ARM cpu has the
memory and synchronization barrier dmb and dsb instructions.
- Change instruction names to something more sensible (matching name of actual
  instructions).
- Added tests for memory barrier codegen.

llvm-svn: 110785
2010-08-11 06:22:01 +00:00
Daniel Dunbar
89a64ee590 MC/ARM: Switch to using the generated match functions instead of stub implementations.
llvm-svn: 110783
2010-08-11 05:24:50 +00:00
Daniel Dunbar
0d725e0080 MC/ARM: Enable generation of the ARM asm matcher, not that it can do much.
llvm-svn: 110782
2010-08-11 05:09:20 +00:00
Daniel Dunbar
8311cf950b ARM: Mark some disassembler only instructions as not available for matching --
for some reason they have a very odd MCInst form where the operands overlap, but
I haven't dug in to find out why yet.

llvm-svn: 110781
2010-08-11 04:46:13 +00:00
Daniel Dunbar
a77e3fc8d8 ARM: Quote $p in an asm string.
llvm-svn: 110780
2010-08-11 04:46:10 +00:00
Bill Wendling
615aad17f7 Handle ARM compares as well as converting for ARM adds, subs, and thumb2's adds.
llvm-svn: 110762
2010-08-11 00:23:00 +00:00
Bill Wendling
735305d4d8 Mark ARM compare instructions as isCompare.
llvm-svn: 110761
2010-08-11 00:22:27 +00:00
Bob Wilson
0650cceb38 Add a separate ARM instruction format for Saturate instructions.
(I discovered 2 more copies of the ARM instruction format list, bringing the
total to 4!!  Two of them were already out of sync.  I haven't yet gotten into
the disassembler enough to know the best way to fix this, but something needs
to be done.)  Add support for encoding these instructions.

llvm-svn: 110754
2010-08-11 00:01:18 +00:00
Evan Cheng
966ed540a6 CBZ and CBNZ are implemented.
llvm-svn: 110745
2010-08-10 23:27:11 +00:00
Bruno Cardoso Lopes
6eb24fd744 Add AVX matching patterns to Packed Bit Test intrinsics.
Apply the same approach of SSE4.1 ptest intrinsics but
create a new x86 node "testp" since AVX introduces
vtest{ps}{pd} instructions which set ZF and CF depending
on sign bit AND and ANDN of packed floating-point sources.

This is slightly different from what the "ptest" does.
Tests comming with the other 256 intrinsics tests.

llvm-svn: 110744
2010-08-10 23:25:42 +00:00
Bill Wendling
c8117e507d Turn optimize compares back on with fix. We needed to test that a machine op was
a register before checking if it was defined.

llvm-svn: 110733
2010-08-10 21:38:11 +00:00
Evan Cheng
784a286b92 Delete some unused instructions.
llvm-svn: 110710
2010-08-10 19:36:22 +00:00
Evan Cheng
d9a1b0d046 Re-apply r110655 with fixes. Epilogue must restore sp from fp if the function stack frame has a var-sized object.
Also added a test case to check for the added benefit of this patch: it's optimizing away the unnecessary restore of sp from fp for some non-leaf functions.

llvm-svn: 110707
2010-08-10 19:30:19 +00:00
Daniel Dunbar
872e84afb5 Revert r110655, "Fix ARM hasFP() semantics. It should return true whenever FP
register is", it breaks a couple test-suite tests.

llvm-svn: 110701
2010-08-10 18:32:02 +00:00
Evan Cheng
3d47dbe761 Fix ARM hasFP() semantics. It should return true whenever FP register is
reserved, not available for general allocation. This eliminates all the
extra checks for Darwin.

This change also fixes the use of FP to access frame indices in leaf
functions and cleaned up some confusing code in epilogue emission.

llvm-svn: 110655
2010-08-10 06:26:49 +00:00
Bruno Cardoso Lopes
f1928b60c0 Add AVX movnt{pd,ps,dq} 256-bit intrinsics
llvm-svn: 110650
2010-08-10 02:49:24 +00:00
Bruno Cardoso Lopes
f5884c6791 Add AVX movmsk 256-bit intrinsics
llvm-svn: 110648
2010-08-10 02:34:56 +00:00
Bruno Cardoso Lopes
2a7ed4b5c9 Support AVX 256-bit load and store intrinsics
llvm-svn: 110645
2010-08-10 01:43:16 +00:00
Bruno Cardoso Lopes
1ea37cfa7b Patterns to match AVX cmp instructions
llvm-svn: 110633
2010-08-10 00:13:20 +00:00
Bruno Cardoso Lopes
4e8d77892c Add matching patterns for vblend AVX intrinsics
llvm-svn: 110630
2010-08-10 00:02:05 +00:00
Eric Christopher
a79ff725ab Wording.
llvm-svn: 110618
2010-08-09 22:52:47 +00:00
Evan Cheng
fa0406ae10 ARMBaseRegisterInfo::hasFP() has been broken for a while now. :-(
This will always be false before PEI:
(DisableFramePointerElim(MF) && MFI->adjustsStack())
Which means it's going to make r11 available as a general purpose register even
if -disable-fp-elim is specified. It's working on Darwin only because r7 is
always reserved. But it's obviously broken for other targets.

llvm-svn: 110614
2010-08-09 22:32:45 +00:00
Bruno Cardoso Lopes
e58d077846 Add VCVTPD2PS, VCVTPS2DQ, VCVTPS2PDY, VCVTTPD2DQY, VCVTTPS2DQ and VCVTPD2DQ 256-bit conversion intrinsics
llvm-svn: 110608
2010-08-09 21:51:56 +00:00
Bruno Cardoso Lopes
e7ceec4edf Add patterns to AVX conversions instructions. Do that instead of declaring more intructions whenever is possible, more coming
llvm-svn: 110605
2010-08-09 21:24:59 +00:00
Oscar Fuentes
633432d46c CMake: eliminated unnecessary target_link_libraries.
Next time the build is broken due to wrong library dependencies, just
try building again (if you are on some Unix and are building all LLVM
targets) or ask someone to commit the regenerated LLVMLibDeps.cmake.

llvm-svn: 110593
2010-08-09 20:33:08 +00:00
Evan Cheng
b6b08dfca1 Explicitly initialize SlowFPBrcc and Pref32BitThumb to false.
llvm-svn: 110587
2010-08-09 19:19:36 +00:00
Evan Cheng
15d23d4966 Change -prefer-32bit-thumb to attribute -mattr=+32bit instead to disable more 32-bit to 16-bit optimizations.
llvm-svn: 110584
2010-08-09 18:35:19 +00:00
Bruno Cardoso Lopes
6a92e01d05 Memory version of vcvtdq2pd intrinsic
llvm-svn: 110582
2010-08-09 18:20:14 +00:00