1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-27 22:12:47 +01:00
Commit Graph

4025 Commits

Author SHA1 Message Date
Kalle Raiskila
71dec6ff42 Handle lshr for i128 correctly on SPU also when
shiftamount > 7.

llvm-svn: 120288
2010-11-29 14:44:28 +00:00
Kalle Raiskila
45865e9165 Enable PostRA scheduling for SPU.
This speeds up selected test cases with up to
5% - no slowdowns observed.

llvm-svn: 120286
2010-11-29 10:30:25 +00:00
Bob Wilson
3bb61d1932 Add support for NEON VLD2-dup instructions.
llvm-svn: 120236
2010-11-28 06:51:26 +00:00
Rafael Espindola
45cd9713f2 Lower TLS_addr32 and TLS_addr64.
llvm-svn: 120225
2010-11-27 20:43:02 +00:00
Bob Wilson
cbd6281807 Add NEON VLD1-dup instructions (load 1 element to all lanes).
llvm-svn: 120194
2010-11-27 06:35:16 +00:00
Kalle Raiskila
b017eaea7b Allow for 'fcmp ogt' in SPU.
Fix by Visa Putkinen!

llvm-svn: 120090
2010-11-24 11:42:17 +00:00
Bob Wilson
ff0e6b1252 Recognize sign/zero-extended constant BUILD_VECTORs for VMULL operations.
We need to check if the individual vector elements are sign/zero-extended
values.  For now this only handles constants values.  Radar 8687140.

llvm-svn: 120034
2010-11-23 19:38:38 +00:00
Kalle Raiskila
f71cc94c91 Division by pow-of-2 is not cheap on SPU, do it with
shifts.

llvm-svn: 120022
2010-11-23 13:27:59 +00:00
Chris Lattner
c0fcc364c0 filecheckize
llvm-svn: 119987
2010-11-23 02:26:52 +00:00
Evan Cheng
e6d55cd247 Fix epilogue codegen to avoid leaving the stack pointer in an invalid
state. Previously Thumb2 would restore sp from fp like this:
mov sp, r7
sub, sp, #4
If an interrupt is taken after the 'mov' but before the 'sub', callee-saved
registers might be clobbered by the interrupt handler. Instead, try
restoring directly from sp:
add sp, #4
Or, if necessary (with VLA, etc.) use a scratch register to compute sp and
then restore it:
sub.w r4, r7, #8
mov sp, r7
rdar://8465407

llvm-svn: 119977
2010-11-22 18:12:04 +00:00
Kalle Raiskila
8f1131e569 Fix a bug with extractelement on SPU.
In the attached testcase, the element was
never extracted (missing rotate).

llvm-svn: 119973
2010-11-22 16:28:26 +00:00
Benjamin Kramer
632a91cba5 Implement the "if (X == 6 || X == 4)" -> "if ((X|2) == 6)" optimization.
This currently only catches the most basic case, a two-case switch, but can be
extended later.

llvm-svn: 119964
2010-11-22 09:45:38 +00:00
Wesley Peck
911abf2bc0 Implement branch analysis in the MBlaze backend.
llvm-svn: 119951
2010-11-21 21:53:36 +00:00
Andrew Trick
3166f72d7a Removing the useless test that I added recently. It was meant as an example, but not complicated enough to merit another test.
llvm-svn: 119898
2010-11-20 07:26:51 +00:00
Dale Johannesen
6399550f2f Prefetch has a MemOperand now. FileCheckize a test.
This finishes up 8460971.

llvm-svn: 119848
2010-11-19 21:49:38 +00:00
Mon P Wang
4965983b22 Make isScalarToVector to return false if the node is a scalar. This will prevent
DAGCombine from making an illegal transformation of bitcast of a scalar to a
vector into a scalar_to_vector.

llvm-svn: 119819
2010-11-19 19:08:12 +00:00
Tanya Lattner
9cac0edef3 Fix bug in DAGCombiner for ARM that was trying to do a ShiftCombine on illegal types (vector should be split first).
Added test case.

llvm-svn: 119749
2010-11-18 22:06:46 +00:00
Duncan Sands
a61bc1a41a The DAGCombiner was threading select over pairs of extending loads even
if the extension types were not the same.  The result was that if you
fed a select with sext and zext loads, as in the testcase, then it
would get turned into a zext (or sext) of the select, which is wrong
in the cases when it should have been an sext (resp. zext).  Reported
and diagnosed by Sebastien Deldon.

llvm-svn: 119728
2010-11-18 20:05:18 +00:00
Eric Christopher
bc6a51d63f Rewrite stack callee saved spills and restores to use push/pop instructions.
Remove movePastCSLoadStoreOps and associated code for simple pointer
increments. Update routines that depended upon other opcodes for save/restore.

Adjust all testcases accordingly.

llvm-svn: 119725
2010-11-18 19:40:05 +00:00
Rafael Espindola
93a07b464e Change CodeGen to use .loc directives. This produces a lot more readable output
and testing is easier.  A good example is the unknown-location.ll test that
now can just look for ".loc 1 0 0".  We also don't use a DW_LNE_set_address for
every address change anymore.

llvm-svn: 119613
2010-11-18 02:04:25 +00:00
Dale Johannesen
06f479d543 Do not throw away alignment when generating the DAG for
memset; we may need it to decide between MOVAPS and MOVUPS
later.  Adjust a test that was looking for wrong code.
PR 3866 / 8675131.

llvm-svn: 119605
2010-11-18 01:35:23 +00:00
John Thompson
8c52bd2004 Fixed to use input redirection for source - to eliminate .s output.
llvm-svn: 119599
2010-11-18 00:50:20 +00:00
John Thompson
b33f935bc3 Bug 8621 fix - pointer cast stripped from inline asm constraint argument.
llvm-svn: 119590
2010-11-17 23:58:47 +00:00
Dale Johannesen
a87210e350 These tests are looking for library function names that
appear to differ on Linux.  Try to make them pass on Linux.
Would be good for a Linux person to review this.

llvm-svn: 119572
2010-11-17 21:57:32 +00:00
Bob Wilson
a217a6e40b Change ARMGlobalMerge to keep BSS globals in separate pools.
This completes the fixes for Radar 8673120.

llvm-svn: 119566
2010-11-17 21:25:39 +00:00
Bob Wilson
819660b716 Fix ARMGlobalMerge pass to check if globals are entirely within range.
It is generally not sufficient to check if the starting offset is in range
of the maximum offset that can be efficiently used for the target.

llvm-svn: 119565
2010-11-17 21:25:36 +00:00
Bob Wilson
49bf8702f0 Change the symbol for merged globals from "merged" to "_MergedGlobals".
This makes it more clear that the symbol is an internal, compiler-generated
name and gives a little more description about its contents.

llvm-svn: 119564
2010-11-17 21:25:33 +00:00
Bob Wilson
fd7399b6a6 Fix the ARMGlobalMerge pass to look at variable sizes instead of pointer sizes.
It was mistakenly looking at the pointer type when checking for the size of
global variables.  This is a partial fix for Radar 8673120.

llvm-svn: 119563
2010-11-17 21:25:27 +00:00
Evan Cheng
ce610bd6b3 Remove ARM isel hacks that fold large immediates into a pair of add, sub, and,
and xor. The 32-bit move immediates can be hoisted out of loops by machine
LICM but the isel hacks were preventing them.

Instead, let peephole optimization pass recognize registers that are defined by
immediates and the ARM target hook will fold the immediates in.

Other changes include 1) do not fold and / xor into cmp to isel TST / TEQ
instructions if there are multiple uses. This happens when the 'and' is live
out, machine sink would have sinked the computation and that ends up pessimizing
code. The peephole pass would recognize situations where the 'and' can be
toggled to define CPSR and eliminate the comparison anyway.

2) Move peephole pass to after machine LICM, sink, and CSE to avoid blocking
important optimizations.

rdar://8663787, rdar://8241368

llvm-svn: 119548
2010-11-17 20:13:28 +00:00
Che-Liang Chiou
f8ffd59ccb Add simple arithmetics and %type directive for PTX
llvm-svn: 119485
2010-11-17 08:08:49 +00:00
Jakob Stoklund Olesen
ea79975a68 Fix PR8612 in the standard spiller, take two.
The live range of a register defined by an early clobber starts at the use slot,
not the def slot.

Except when it is an early clobber tied to a use operand. Then it starts at the
def slot like a standard def.

llvm-svn: 119305
2010-11-16 00:40:59 +00:00
Jakob Stoklund Olesen
465c07cc66 Revert "Fix PR8612 in the standard spiller as well."
This reverts r119183 which borke the buildbots.

llvm-svn: 119270
2010-11-15 21:51:51 +00:00
Eric Christopher
22b01f0d76 Recommit this change and remove the failing part of the test - it didn't
pass in the first place and was masked by earlier failures not warning
and aborting the block.

llvm-svn: 119184
2010-11-15 21:11:06 +00:00
Jakob Stoklund Olesen
920856ab38 Fix PR8612 in the standard spiller as well.
The live range of a register defined by an early clobber starts at the use slot,
not the def slot.

llvm-svn: 119183
2010-11-15 20:55:53 +00:00
Jakob Stoklund Olesen
86fba16eb1 When spilling a register defined by an early clobber, make sure that the new
live ranges for the spill register are also defined at the use slot instead of
the normal def slot.

This fixes PR8612 for the inline spiller. A use was being allocated to the same
register as a spilled early clobber def.

This problem exists in all the spillers. A fix for the standard spiller is
forthcoming.

llvm-svn: 119182
2010-11-15 20:55:49 +00:00
Chris Lattner
3e135493e9 remove a pointless testcase.
llvm-svn: 119119
2010-11-15 05:07:03 +00:00
Chris Lattner
7743db8f20 remove some extraneous quotes to make the new instprinter match.
llvm-svn: 119104
2010-11-15 02:43:46 +00:00
Chris Lattner
8cb5c07514 add some nounwind's.
llvm-svn: 119086
2010-11-14 22:22:14 +00:00
Peter Collingbourne
4ec6b2dbe7 Recognise 32-bit ror-based bswap implementation used by uclibc
llvm-svn: 119007
2010-11-13 19:54:30 +00:00
Evan Cheng
a7d3c3d387 Add conditional move of large immediate.
llvm-svn: 118968
2010-11-13 02:25:14 +00:00
Evan Cheng
a08372d4b4 Fix an obvious typo which inverted an immediate.
llvm-svn: 118951
2010-11-13 00:27:47 +00:00
Eric Christopher
ba758d0df4 This should be still failing, but is. Disable it with the
forget-me-stick for now.

llvm-svn: 118950
2010-11-13 00:25:06 +00:00
Evan Cheng
19f018a1be Add conditional mvn instructions.
llvm-svn: 118935
2010-11-12 22:42:47 +00:00
Evan Cheng
b565d1acf9 Add some missing isel predicates on def : pat patterns to avoid generating VFP vmla / vmls (they cause stalls). Disabling them in isel is properly not a right solution, I'll look into a proper solution next.
llvm-svn: 118922
2010-11-12 20:32:20 +00:00
Andrew Trick
a2f5bd63ff Emacs auto-fill bug.
llvm-svn: 118908
2010-11-12 18:17:46 +00:00
Andrew Trick
e5e40ec069 Test case for PR8287: SD scheduling time. Fixed in r118904.
llvm-svn: 118906
2010-11-12 17:57:22 +00:00
Kalle Raiskila
f89d0d0389 Fix memory access lowering on SPU, adding
support for the case where alignment<value size.

These cases were silently miscompiled before this patch.
Now they are overly verbose -especially storing is- and
any front-end should still avoid misaligned memory 
accesses as much as possible. The bit juggling algorithm
added here probably has some room for improvement still.

llvm-svn: 118889
2010-11-12 10:14:03 +00:00
Bruno Cardoso Lopes
37f2439cbd Enable mips32 mul instruction. Patch by Akira Hatanaka <ahatanaka@mips.com>
llvm-svn: 118864
2010-11-12 00:38:32 +00:00
Dan Gohman
8e986f9c2f Remove the memmove->memcpy optimization from CodeGen. MemCpyOpt does this.
llvm-svn: 118789
2010-11-11 16:24:49 +00:00
Bruno Cardoso Lopes
c911863e01 Add a test to the previous added clo instruction. Patch by Akira again
llvm-svn: 118668
2010-11-10 02:22:44 +00:00
Bob Wilson
a933ce4caa Do not use MEMBARRIER_MCR for any Thumb code.
It is only supported for ARM code.  Normally Thumb2 code would use DMB instead,
but depending on how the compiler is invoked (e.g., -mattr=-db) that might be
disabled.  This prevents a "cannot select MEMBARRIER_MCR" error in that
situation.  Radar 8644195

llvm-svn: 118642
2010-11-09 22:50:44 +00:00
Duncan Sands
a6d539ccbc Testcase for PR8211 (llc crash at -O0).
llvm-svn: 118509
2010-11-09 16:22:27 +00:00
Dan Gohman
903935bf3e Fix DAGCombiner to avoid folding a sext-in-reg or similar through a shl
in order to fold it into a load.

llvm-svn: 118471
2010-11-09 01:54:35 +00:00
Dan Gohman
f78cd9006b Delete an extraneous svn:executable property.
llvm-svn: 118470
2010-11-09 01:51:06 +00:00
Dale Johannesen
88f85df7f7 Fix an inline asm pasto from 117667; was preventing
{i64, i64} from matching i128.

llvm-svn: 118465
2010-11-09 01:15:07 +00:00
Owen Anderson
52e3873edc Add support for ARM's specialized vector-compare-against-zero instructions.
llvm-svn: 118453
2010-11-08 23:21:22 +00:00
Dale Johannesen
f16acd1325 Revert 118422 in search of bot verdancy.
llvm-svn: 118429
2010-11-08 19:17:22 +00:00
Jason W Kim
a253ed4e26 Support -mcpu=cortex-a8 in ARM attributes - Has Fixme. 1 Test modified.
llvm-svn: 118422
2010-11-08 17:58:07 +00:00
Chris Lattner
0d6d870628 go to great lengths to work around a GAS bug my previous patch
exposed:

GAS doesn't accept "fcomip %st(1)", it requires "fcomip %st(1), %st(0)"
even though st(0) is implicit in all other fp stack instructions.

Fortunately, there is an alias for fcomip named "fcompi" and gas does
accept the default argument for the alias (boggle!).

As such, switch the canonical form of this instruction to "pi" instead
of "ip".  This makes the code generator and disassembler generate pi,
avoiding the gas bug.

llvm-svn: 118356
2010-11-06 21:37:06 +00:00
Owen Anderson
add19dd6dd Add codegen and encoding support for the immediate form of vbic.
llvm-svn: 118291
2010-11-05 19:27:46 +00:00
Duncan Sands
1af5c00d3c When passing a huge parameter using the byval mechanism, a long
sequence of loads and stores was being generated to perform the
copy on the x86 targets if the parameter was less than 4 byte
aligned, causing llc to use up vast amounts of memory and time.
Use a "rep movs" form instead.  PR7170.

llvm-svn: 118260
2010-11-04 21:16:46 +00:00
Evan Cheng
165e65f53a Fix @llvm.prefetch isel. Selecting between pld / pldw using the first immediate rw. There is currently no intrinsic that matches to pli.
llvm-svn: 118237
2010-11-04 05:19:35 +00:00
Owen Anderson
d144bb1ddc Covert VORRIMM to be produced via early target-specific DAG combining, rather than legalization.
This is both the conceptually correct place for it, as well as allowing it to be more aggressive.

llvm-svn: 118204
2010-11-03 23:15:26 +00:00
Owen Anderson
1a89511e5d Add support for code generation of the one register with immediate form of vorr.
We could be more aggressive about making this work for a larger range of constants,
but this seems like a good start.

llvm-svn: 118201
2010-11-03 22:44:51 +00:00
Evan Cheng
72718cf345 Fix test.
llvm-svn: 118187
2010-11-03 18:21:33 +00:00
Dale Johannesen
74ebf1b011 This test assumes SSE is present; that is not the default
on non-X86 hosts.  Hopefully fixes ppc-host buildbot.

llvm-svn: 118182
2010-11-03 18:08:41 +00:00
Bob Wilson
f44e708279 Add codegen patterns for VST1-lane instructions. Radar 8599955.
llvm-svn: 118176
2010-11-03 16:24:53 +00:00
Bob Wilson
0ec96428ba Check for extractelement with a variable operand for the element number.
For NEON we had been assuming this was always an immediate constant.

llvm-svn: 118175
2010-11-03 16:24:50 +00:00
Evan Cheng
eab7251695 Fix preload instruction isel. Only v7 supports pli, and only v7 with mp extension supports pldw. Add subtarget attribute to denote mp extension support and legalize illegal ones to nothing.
llvm-svn: 118160
2010-11-03 06:34:55 +00:00
Evan Cheng
b41703bc2f Add support to match @llvm.prefetch to pld / pldw / pli. rdar://8601536.
llvm-svn: 118152
2010-11-03 05:14:24 +00:00
Dan Gohman
8071a75d31 Fix DAGCombiner to avoid going into an infinite loop when it
encounters (and:i64 (shl:i64 (load:i64), 1), 0xffffffff).
This fixes rdar://8606584.

llvm-svn: 118143
2010-11-03 01:47:46 +00:00
Evan Cheng
67db408634 Two sets of changes. Sorry they are intermingled.
1. Fix pre-ra scheduler so it doesn't try to push instructions above calls to
   "optimize for latency". Call instructions don't have the right latency and
   this is more likely to use introduce spills.
2. Fix if-converter cost function. For ARM, it should use instruction latencies,
   not # of micro-ops since multi-latency instructions is completely executed
   even when the predicate is false. Also, some instruction will be "slower"
   when they are predicated due to the register def becoming implicit input.
   rdar://8598427

llvm-svn: 118135
2010-11-03 00:45:17 +00:00
John Thompson
4255425219 Inline asm mult-alt constraint tests.
llvm-svn: 118107
2010-11-02 23:01:44 +00:00
Jim Grosbach
d6df785c6d Revert r114340 (improvements in Darwin function prologue/epilogue), as it broke
assumptions about stack layout. Specifically, LR must be saved next to FP.

llvm-svn: 118026
2010-11-02 17:35:25 +00:00
Devang Patel
efd9ac540a Use frameindex, if available, as a last resort to emit debug info for a parameter.
llvm-svn: 118020
2010-11-02 17:01:30 +00:00
Bob Wilson
411b511ac0 Add support for alignment operands on VLD1-lane instructions.
This is another part of the fix for Radar 8599955.

llvm-svn: 117976
2010-11-01 23:40:51 +00:00
Bob Wilson
8f0cf81f09 Add VLD1-lane testcases for quad-register types.
llvm-svn: 117975
2010-11-01 23:40:46 +00:00
Bob Wilson
b6bc135df8 Add NEON VLD1-lane instructions. Partial fix for Radar 8599955.
llvm-svn: 117964
2010-11-01 22:04:05 +00:00
Bill Wendling
4340c9449a When we look at instructions to convert to setting the 's' flag, we need to look
at more than those which define CPSR. You can have this situation:

(1)  subs  ...
(2)  sub   r6, r5, r4
(3)  movge ...
(4)  cmp   r6, 0
(5)  movge ...

We cannot convert (2) to "subs" because (3) is using the CPSR set by
(1). There's an analogous situation here:

(1)  sub   r1, r2, r3
(2)  sub   r4, r5, r6
(3)  cmp   r4, ...
(5)  movge ...
(6)  cmp   r1, ...
(7)  movge ...

We cannot convert (1) to "subs" because of the intervening use of CPSR.

llvm-svn: 117950
2010-11-01 20:41:43 +00:00
Bob Wilson
a9c593e696 NEON does not support truncating vector stores. Radar 8598391.
llvm-svn: 117940
2010-11-01 18:31:39 +00:00
Bill Wendling
a59bb03780 More tests to XFAIL. The arm-and-txt-peephole.ll test passes even when the
peephole optimizer is disabled. That's not good at all.

llvm-svn: 117905
2010-11-01 05:59:43 +00:00
Bill Wendling
bfdbe6b369 Disable because peephole is disabled.
llvm-svn: 117903
2010-11-01 05:48:44 +00:00
Bob Wilson
183c466006 Overhaul memory barriers in the ARM backend. Radar 8601999.
There were a number of issues to fix up here:
* The "device" argument of the llvm.memory.barrier intrinsic should be
used to distinguish the "Full System" domain from the "Inner Shareable"
domain.  It has nothing to do with using DMB vs. DSB instructions.
* The compiler should never need to emit DSB instructions.  Remove the
ARMISD::SYNCBARRIER node and also remove the instruction patterns for DSB.
* Merge the separate DMB/DSB instructions for options only used for the
disassembler with the default DMB/DSB instructions.  Add the default
"full system" option ARM_MB::SY to the ARM_MB::MemBOpt enum.
* Add a separate ARMISD::MEMBARRIER_MCR node for subtargets that implement
a data memory barrier using the MCR instruction.
* Fix up encodings for these instructions (except MCR).
I also updated the tests and added a few new ones to check for DMB options
that were not currently being exercised.

llvm-svn: 117756
2010-10-30 00:54:37 +00:00
Evan Cheng
d81c33d91e Teach machine cse to eliminate instructions with multiple physreg uses and defs. rdar://8610857.
llvm-svn: 117745
2010-10-29 23:36:03 +00:00
Bob Wilson
d67dddb134 Remove DAG combiner patch to fold vector splats. Instcombiner does it now.
llvm-svn: 117720
2010-10-29 22:03:02 +00:00
Evan Cheng
392d2cbdcc Avoiding overly aggressive latency scheduling. If the two nodes share an
operand and one of them has a single use that is a live out copy, favor the
one that is live out. Otherwise it will be difficult to eliminate the copy
if the instruction is a loop induction variable update. e.g.

BB:
sub r1, r3, #1
str r0, [r2, r3]
mov r3, r1
cmp
bne BB

=>

BB:
str r0, [r2, r3]
sub r3, r3, #1
cmp
bne BB

This fixed the recent 256.bzip2 regression.

llvm-svn: 117675
2010-10-29 18:09:28 +00:00
Bob Wilson
65124cd7c7 Teach the DAG combiner to fold a splat of a splat. Radar 8597790.
Also do some minor refactoring to reduce indentation.

llvm-svn: 117558
2010-10-28 17:06:14 +00:00
Evan Cheng
bc4588c439 Re-commit 117518 and 117519 now that ARM MC test failures are out of the way.
llvm-svn: 117531
2010-10-28 06:47:08 +00:00
Evan Cheng
fdc80a0316 Revert 117518 and 117519 for now. They changed scheduling and cause MC tests to fail. Ugh.
llvm-svn: 117520
2010-10-28 02:00:25 +00:00
Evan Cheng
5c358e02ea - Assign load / store with shifter op address modes the right itinerary classes.
- For now, loads of [r, r] addressing mode is the same as the
  [r, r lsl/lsr/asr #] variants. ARMBaseInstrInfo::getOperandLatency() should
  identify the former case and reduce the output latency by 1.
- Also identify [r, r << 2] case. This special form of shifter addressing mode
  is "free".

llvm-svn: 117519
2010-10-28 01:49:06 +00:00
Dale Johannesen
b78530f9b0 Fix pastos in handling of AVX cvttsd2si, PR8491.
Bruno, please review, but I'm pretty sure this is right.
Patch by Alex Mac!

llvm-svn: 117514
2010-10-28 00:35:54 +00:00
Evan Cheng
44d2802e1d Shifter ops are not always free. Do not fold them (especially to form
complex load / store addressing mode) when they have higher cost and
when they have more than one use.

llvm-svn: 117509
2010-10-27 23:41:30 +00:00
Bob Wilson
cdc8dff3ac SelectionDAG shuffle nodes do not allow operands with different numbers of
elements than the result vector type.  So, when an instruction like:

%8 = shufflevector <2 x float> %4, <2 x float> %7, <4 x i32> <i32 1, i32 0, i32 3, i32 2>

is translated to a DAG, each operand is changed to a concat_vectors node that appends 2 undef elements.  That is:

shuffle [a,b], [c,d] is changed to:
shuffle [a,b,u,u], [c,d,u,u]

That's probably the right thing for x86 but for NEON, we'd much rather have:

shuffle [a,b,c,d], undef

Teach the DAG combiner how to do that transformation for ARM.  Radar 8597007.

llvm-svn: 117482
2010-10-27 20:38:28 +00:00
Jim Grosbach
afc1e913c0 FileCheck'ize
llvm-svn: 117401
2010-10-26 21:26:47 +00:00
Kalle Raiskila
64680cd5b8 Change v64 datalayout in SPU.
The SPU ABI does not mention v64, and all examples
in C suggest v128 are treated similarily to arrays, 
we use array alignment for v64 too. This makes the 
alignment of e.g. [2 x <2 x i32>] behave "intuitively"
and similar to as if the elements were e.g. i32s.

This also makes an "unaligned store" test to be 
aligned, with different (but functionally equivalent)
code generated.

llvm-svn: 117360
2010-10-26 10:45:47 +00:00
Bob Wilson
309484bb46 When the "true" and "false" blocks of a diamond if-conversion are the same,
do not double-count the duplicate instructions by counting once from the
beginning and again from the end.  Keep track of where the duplicates from
the beginning ended and don't go past that point when counting duplicates
at the end.  Radar 8589805.

This change causes one of the MC/ARM/simple-fp-encoding tests to produce
different (better!) code without the vmovne instruction being tested.
I changed the test to produce vmovne and vmoveq instructions but moving
between register files in the opposite direction.  That's not quite the same
but predicated versions of those instructions weren't being tested before,
so at least the test coverage is not any worse, just different.

llvm-svn: 117333
2010-10-26 00:02:24 +00:00
Dale Johannesen
2566d39d07 An stdcall function calling a non-stdcall function
cannot use tailcall.  PR 8461.

llvm-svn: 117322
2010-10-25 22:17:05 +00:00
Rafael Espindola
5748458e7d Add support for emitting ARM file attributes.
llvm-svn: 117275
2010-10-25 17:50:35 +00:00
Michael J. Spencer
87c8212d41 X86: Emit _fltused instead of __fltused on Windows x64.
llvm-svn: 117205
2010-10-23 09:06:59 +00:00
Jim Grosbach
1df6f11838 tidy up
llvm-svn: 117185
2010-10-22 23:46:04 +00:00
Jim Grosbach
172f52fb3d Remove duplicate test.
llvm-svn: 117158
2010-10-22 22:04:28 +00:00
Jim Grosbach
3cb3f83374 tidy up.
llvm-svn: 117157
2010-10-22 22:01:56 +00:00
Jim Grosbach
4366bb4852 FileCheck-ize a few tests.
llvm-svn: 117156
2010-10-22 21:55:03 +00:00
Wesley Peck
d646a12f25 Recommit 116986 with capitalization typo fixed.
llvm-svn: 116993
2010-10-21 03:57:26 +00:00
Andrew Trick
4a3b819c1f putback r116983 and fix simple-fp-encoding.ll tests
llvm-svn: 116992
2010-10-21 03:40:16 +00:00
Wesley Peck
3478e641b9 Reverting the commit 116986. It was breaking the build on llvm-x86_64-linux though it
compiles on OS X. I'll ensure that it builds on a linux machine before committing
again.

llvm-svn: 116991
2010-10-21 03:34:22 +00:00
Owen Anderson
7da515c665 Revert r116983, which is breaking all the buildbots.
llvm-svn: 116987
2010-10-21 03:11:16 +00:00
Wesley Peck
c50372298d Major update of the MicroBlaze backend. The new features are:
1. A delay slot filler that searches for valid instructions
       to fill the delay slot with. Previously NOPs would always
       be inserted into delay slots.
    2. Support for MC based instruction printer added.
    3. Support for MC based machine code generation and ELF
       file generation. ELF file generation does not yet
       completely work as much of the ELF support infrastructure
       is still x86/x86-64 specific.
    4. General clean up of the MBlaze backend code. Much of the
       tablegen code has been cleanup and simplified.

Bug Fixes:
    1. Removed duplicate periods from subtarget feature descriptions.
    2. Many of the instructions had bad machine code information
       in the tablegen files. Much of this has been fixed.

llvm-svn: 116986
2010-10-21 03:09:55 +00:00
Evan Cheng
0b9eaaf45d Add missing scheduling itineraries for transfers between core registers and VFP registers.
llvm-svn: 116983
2010-10-21 01:12:00 +00:00
Evan Cheng
9a06e4c7c8 More accurate estimate / tracking of register pressure.
- Initial register pressure in the loop should be all the live defs into the
  loop. Not just those from loop preheader which is often empty.
- When an instruction is hoisted, update register pressure from loop preheader
  to the original BB.
- Treat only use of a virtual register as kill since the code is still SSA.

llvm-svn: 116956
2010-10-20 22:03:58 +00:00
Dale Johannesen
a324c8c6bd Fix crash introduced in 116852. 8573915.
llvm-svn: 116955
2010-10-20 22:03:37 +00:00
Dale Johannesen
ee87cbe4e9 Enable using vdup for vector constants which are splat of
integers by default, and remove the controlling flag, now
that LICM will hoist such vdup's.  8003375.

llvm-svn: 116852
2010-10-19 20:00:17 +00:00
Evan Cheng
1c8dafd12a Re-enable register pressure aware machine licm with fixes. Hoist() may have
erased the instruction during LICM so UpdateRegPressureAfter() should not
reference it afterwards.

llvm-svn: 116845
2010-10-19 18:58:51 +00:00
Daniel Dunbar
6ff550c84d Revert r116781 "- Add a hook for target to determine whether an instruction def
is", which breaks some nightly tests.

llvm-svn: 116816
2010-10-19 17:14:24 +00:00
Che-Liang Chiou
1733b45be9 Add test case mov.ll for PTX device function
llvm-svn: 116806
2010-10-19 13:21:51 +00:00
Evan Cheng
9c3f6f486e - Add a hook for target to determine whether an instruction def is
"long latency" enough to hoist even if it may increase spilling. Reloading
  a value from spill slot is often cheaper than performing an expensive
  computation in the loop. For X86, that means machine LICM will hoist
  SQRT, DIV, etc. ARM will be somewhat aggressive with VFP and NEON
  instructions.
- Enable register pressure aware machine LICM by default.

llvm-svn: 116781
2010-10-19 00:55:07 +00:00
Bob Wilson
c3fb4427f4 Support alignment for NEON vld-lane and vst-lane instructions.
llvm-svn: 116776
2010-10-19 00:16:32 +00:00
Eric Christopher
5850afe5f2 Revert r116220 - thus turning arm fast isel back on by default.
llvm-svn: 116762
2010-10-18 22:53:53 +00:00
Kalle Raiskila
3cdfdd9383 Improve lowering of sext to i128 on SPU.
The old algorithm inserted a 'rotqmbyi' instruction which was
both redundant and wrong - it made shufb select bytes from the
wrong end of the input quad.

llvm-svn: 116701
2010-10-18 09:34:19 +00:00
Michael J. Spencer
e57b670425 X86-Windows: Emit an undefined global __fltused symbol when targeting Windows
if any floating point arguments are passed to an external function.

llvm-svn: 116665
2010-10-16 08:25:41 +00:00
Bob Wilson
fcc42f2f3a ARM instructions that are both predicated and set the condition codes
have been printed with the "S" modifier after the predicate.  With ARM's
unified syntax, they are supposed to go in the other order.  We fixed this
for Thumb when we switched to unified syntax but missed changing it for
ARM.  Apparently we don't generate these instructions often because no one
noticed until now.  Thanks to Bill Wendling for the testcase!

llvm-svn: 116563
2010-10-15 03:23:44 +00:00
Jim Grosbach
804505c7d4 Refactor the MOVsr[al]_flag and RRX pseudo-instructions to really be pseudos
and let the ARMExpandPseudoInsts pass fix them up into the real (MOVs)
instruction form.

llvm-svn: 116534
2010-10-14 22:57:13 +00:00
Jim Grosbach
29dc23398f Tweak the ARM backend to use the RRX mnemonic instead of the 'mov a, b, rrx'
pseudonym.

llvm-svn: 116512
2010-10-14 20:43:44 +00:00
Rafael Espindola
b1ae74bd73 Fix another case where we were preferring instructions with large
immediates instead of 8 bits ones.

llvm-svn: 116410
2010-10-13 17:14:25 +00:00
Rafael Espindola
ff7f11c151 Fix PR8365 by adding a more specialized Pat that checks if an 'and' with
8 bit constants can be used.

llvm-svn: 116403
2010-10-13 13:31:20 +00:00
Eric Christopher
efbd4146d8 FileCheckize this in a hope to quiet a valgrind warning on grep.
llvm-svn: 116376
2010-10-12 23:47:58 +00:00
Andrew Trick
5361f35978 PR8297
llvm-svn: 116223
2010-10-11 21:08:42 +00:00
Jakob Stoklund Olesen
a0a5015a35 PowerPC varargs functions store live-in registers on the stack. Make sure we use
virtual registers for those stores since RegAllocFast requires that each live
physreg only be used once.

This fixes PR8357.

llvm-svn: 116222
2010-10-11 20:43:09 +00:00
Eric Christopher
fa961e31b1 Found a bug turning this on by default. Disable again for now.
llvm-svn: 116220
2010-10-11 20:26:21 +00:00
Eric Christopher
b477aa7c6f Remove now non-existent option.
llvm-svn: 116219
2010-10-11 20:21:21 +00:00
Andrew Trick
5704a15e36 Fixes bug 8297: i386 cmpxchg8b, missing MachineMemOperand
llvm-svn: 116214
2010-10-11 19:02:04 +00:00
Chris Lattner
c9709f154d Per discussion with Sanjiv, remove the PIC16 target from mainline. When/if
it comes back, it will be largely a rewrite, so keeping the old codebase
in tree isn't helping anyone.

llvm-svn: 116190
2010-10-11 05:44:40 +00:00
Michael J. Spencer
8f7251a0e9 X86: MinGW should always use libgcc on Windows.
llvm-svn: 116177
2010-10-10 23:11:06 +00:00
Michael J. Spencer
b13cb4dd72 X86: Call _alldiv instead of __divdi3 on Windows (excluding cygwin).
llvm-svn: 116174
2010-10-10 22:04:34 +00:00
Chris Lattner
307552613d force a triple, varargs isn't supported with the SVR4 ABI the buildbot tells me.
llvm-svn: 116170
2010-10-10 18:59:01 +00:00
Chris Lattner
2b428a0ab8 fix the expansion of va_arg instruction on PPC to know the arg
alignment for PPC32/64, avoiding some masking operations.

llvm-gcc expands vaarg inline instead of using the instruction
so it has never hit this.

llvm-svn: 116168
2010-10-10 18:34:00 +00:00
Evan Cheng
15fc769cf2 Correct some load / store instruction itinerary mistakes:
1. Cortex-A8 load / store multiplies can only issue on ALU0.
2. Eliminate A8_Issue, A8_LSPipe will correctly limit the load / store issues.
3. Correctly model all vld1 and vld2 variants.

llvm-svn: 116134
2010-10-09 01:03:04 +00:00
Bill Wendling
59895ff813 Simplify test and move into a generic "crash" ll file.
llvm-svn: 116130
2010-10-09 00:29:04 +00:00
Bill Wendling
ba58fb0f42 Check to make sure that the iterator isn't at the beginning of the basic block
before decrementing. <rdar://problem/8529919>

llvm-svn: 116126
2010-10-09 00:03:48 +00:00
Cameron Esfahani
664317d6cd Recommit 116056, now with the missing file...
llvm-svn: 116083
2010-10-08 19:24:18 +00:00
Andrew Trick
0d8a3d67e0 reverting 116056: win64_params.ll may need to be conditionalized?
llvm-svn: 116063
2010-10-08 17:22:42 +00:00
Cameron Esfahani
a9f8bb1356 Small patch to restore home register stack space allocation for the Win64 case. Add test case. This code eventually needs to be tighter, since it's always allocating it, even in leaf routines.
llvm-svn: 116056
2010-10-08 10:31:30 +00:00
Bob Wilson
8689a52c10 Change register allocation order for ARM VFP and NEON registers to put the
callee-saved registers at the end of the lists.  Also prefer to avoid using
the low registers that are in register subclasses required by certain
instructions, so that those registers will more likely be available when needed.
This change makes a huge improvement in spilling in some cases.  Thanks to
Jakob for helping me realize the problem.

Most of this patch is fixing the testsuite.  There are quite a few places
where we're checking for specific registers.  I changed those to wildcards
in places where that doesn't weaken the tests.  The spill-q.ll and
thumb2-spill-q.ll tests stopped spilling with this change, so I added a bunch
of live values to force spills on those tests.

llvm-svn: 116055
2010-10-08 06:15:13 +00:00
Chris Lattner
761f40f7ee testcase that goes with r116053
llvm-svn: 116054
2010-10-08 05:12:30 +00:00
Chris Lattner
c77216343c rename test
llvm-svn: 116052
2010-10-08 05:05:06 +00:00
Chris Lattner
c694d80729 merge tests
llvm-svn: 116051
2010-10-08 05:04:58 +00:00
Chris Lattner
79fbe67839 filecheckize.
llvm-svn: 116050
2010-10-08 05:02:29 +00:00
Chris Lattner
82ce325f16 reapply: Use the new TB_NOT_REVERSABLE flag instead of special
reapply: reimplement the second half of the or/add optimization.  We should now

with no changes.  Turns out that one missing "Defs = [EFLAGS]" can upset things
a bit.

llvm-svn: 116040
2010-10-08 03:57:25 +00:00
Daniel Dunbar
983fae5a86 Revert "reimplement the second half of the or/add optimization. We should now",
which depends on r116007, which I am about to revert.

llvm-svn: 116031
2010-10-08 02:07:26 +00:00
Chris Lattner
7577cb7b49 reimplement the second half of the or/add optimization. We should now
only end up emitting LEA instead of OR.  If we aren't able to promote
something into an LEA, we should never be emitting it as an ADD.

Add some testcases that we emit "or" in cases where we used to produce
an "add".

llvm-svn: 116026
2010-10-08 01:05:10 +00:00
Chris Lattner
bd89233607 convert cmp to use a multipattern
llvm-svn: 115978
2010-10-07 20:56:25 +00:00
Evan Cheng
7c89d70f27 Canonicalize X86ISD::MOVDDUP nodes to v2f64 to make sure all cases match. Also eliminate unneeded isel patterns. rdar://8520311
llvm-svn: 115977
2010-10-07 20:50:20 +00:00
Jim Grosbach
1e2566c20d Allow use of the 16-bit literal move instruction in CMOVs for ARM mode.
llvm-svn: 115884
2010-10-07 00:42:42 +00:00
Evan Cheng
6fbb6dea7c - Add TargetInstrInfo::getOperandLatency() to compute operand latencies. This
allow target to correctly compute latency for cases where static scheduling
  itineraries isn't sufficient. e.g. variable_ops instructions such as
  ARM::ldm.
  This also allows target without scheduling itineraries to compute operand
  latencies. e.g. X86 can return (approximated) latencies for high latency
  instructions such as division.
- Compute operand latencies for those defined by load multiple instructions,
  e.g. ldm and those used by store multiple instructions, e.g. stm.

llvm-svn: 115755
2010-10-06 06:27:31 +00:00
Bill Wendling
4938e4d00a PSHUFW is in SSE, not SSSE3.
llvm-svn: 115691
2010-10-05 21:58:12 +00:00
Owen Anderson
20b48697cd Use a more efficient lowering of uint64_t --> float that can take advantage of hardware signed integer conversion without
having to do a double cast (uint64_t --> double --> float).  This is based on the algorithm from compiler_rt's __floatundisf
for X86-64.

llvm-svn: 115634
2010-10-05 17:24:05 +00:00
NAKAMURA Takumi
33e956ce63 test/CodeGen/X86/atomic_op.ll: Rename @main to @func. Extra sequences will be inserted to @main as prologue on cygming, to fail.
llvm-svn: 115611
2010-10-05 11:16:24 +00:00
Anton Korobeynikov
f1acea8615 va_args support for Win64.
Patch by Cameron!

llvm-svn: 115480
2010-10-03 22:52:07 +00:00
Anton Korobeynikov
31b3b2ca41 Properly emit stack probe on win64 (for non-mingw targets).
Based on the patch by Cameron Esfahani!

llvm-svn: 115479
2010-10-03 22:02:38 +00:00
Chris Lattner
f58652610d unbreak buildbot
llvm-svn: 115476
2010-10-03 20:02:48 +00:00
Bill Wendling
f1552256a1 Add test to make sure that the MMX intrinsic calls make it out the other end in
tact.

llvm-svn: 115458
2010-10-03 03:30:30 +00:00
Bill Wendling
1f47b01f08 Need to specify SSE4 for machines which don't have SSE4. The code checked for is generated by SSE4. Otherwise, we get something else.
llvm-svn: 115352
2010-10-01 21:39:35 +00:00
Bill Wendling
89da1661a3 We must check for something.
llvm-svn: 115309
2010-10-01 10:20:10 +00:00
Bill Wendling
a5e9e55778 Disable tests until I can figure out why they're failing on just two machines but not others.
llvm-svn: 115308
2010-10-01 10:01:10 +00:00
Bill Wendling
18960d3eaa Try adding an mtriple.
llvm-svn: 115307
2010-10-01 09:40:50 +00:00
Kalle Raiskila
c6bdc97934 Zap some redundant 'ori $?, $?, 0' from SPU.
Also remove some code that died in the process.
One now non-existant ori is checked for.

llvm-svn: 115306
2010-10-01 09:20:01 +00:00
Bill Wendling
7cb3d0c43e FileCheck-ize this test.
llvm-svn: 115304
2010-10-01 08:55:48 +00:00
Bill Wendling
7b19f58e1c FileCheck-ize this test.
llvm-svn: 115303
2010-10-01 08:50:12 +00:00
Chris Lattner
01c6e93ea4 fix rdar://8494845 + PR8244 - a miscompile exposed by my patch in r101350
llvm-svn: 115294
2010-10-01 05:36:09 +00:00
Dale Johannesen
d2ab5dccdc One more +sse2.
llvm-svn: 115293
2010-10-01 05:08:18 +00:00
Dale Johannesen
4d1ca2ab29 Mark all these as needing SSE2. Should fix PPC and
maybe even Linux.

llvm-svn: 115291
2010-10-01 04:17:55 +00:00
Dale Johannesen
c9a4261060 Disable these tests for now; it's not obvious why they fail on Linux.
llvm-svn: 115257
2010-10-01 00:59:21 +00:00
Dale Johannesen
51bf690108 Make test not sensitive to register choice.
llvm-svn: 115250
2010-10-01 00:16:17 +00:00
Dale Johannesen
c14a1eda84 Massive rewrite of MMX:
The x86_mmx type is used for MMX intrinsics, parameters and
return values where these use MMX registers, and is also
supported in load, store, and bitcast.

Only the above operations generate MMX instructions, and optimizations
do not operate on or produce MMX intrinsics. 

MMX-sized vectors <2 x i32> etc. are lowered to XMM or split into
smaller pieces.  Optimizations may occur on these forms and the
result casted back to x86_mmx, provided the result feeds into a
previous existing x86_mmx operation.

The point of all this is prevent optimizations from introducing
MMX operations, which is unsafe due to the EMMS problem.

llvm-svn: 115243
2010-09-30 23:57:10 +00:00
NAKAMURA Takumi
66978c202a test/CodeGen/X86/sibcall.ll: Add explicit triplets and remove XFAIL: apple-darwin8.
llvm-svn: 115215
2010-09-30 22:02:06 +00:00
Jakob Stoklund Olesen
603950c4af Try again to disable critical edge splitting in CodeGenPrepare.
The bug that broke i386 linux has been fixed in r115191.

llvm-svn: 115204
2010-09-30 20:51:52 +00:00
Jakob Stoklund Olesen
8f4d623f9a When isel is emitting instructions for an x86 target without CMOV, the CFG is
edited during emission.

If the basic block ends in a switch that gets lowered to a jump table, any
phis at the default edge were getting updated wrong. The jump table data
structure keeps a pointer to the header blocks that wasn't getting updated
after the MBB is split.

This bug was exposed on 32-bit Linux when disabling critical edge splitting in
codegen prepare.

The fix is to uipdate stale MBB pointers whenever a block is split during
emission.

llvm-svn: 115191
2010-09-30 19:44:31 +00:00
Jason W Kim
7822e6aab5 Tiny patch for proof-of-concept cleanup of ARMAsmPrinter::EmitStartOfAsmFile()
Small test for sanity check of resulting ARM .s file.
Tested against -r115129.

llvm-svn: 115133
2010-09-30 02:45:56 +00:00
Bob Wilson
d16aaccb05 Increase ARM APCS preferred alignment for i64 and f64 from 32 bits to 64 bits.
LDM/STM instructions can run one cycle faster on some ARM processors if the
memory address is 64-bit aligned.  Radar 8489376.

llvm-svn: 115047
2010-09-29 17:54:10 +00:00
Gabor Greif
c6deccbe3a do not compare actual branch labels; this may fix llvm-gcc-x86_64-darwin10-cross-mingw32 buildbot too
llvm-svn: 115034
2010-09-29 10:45:43 +00:00
Gabor Greif
e1de402213 improve heuristics to find the 'and' corresponding to 'tst' to also catch opportunities on thumb2
added some doxygen on the way

llvm-svn: 115033
2010-09-29 10:12:08 +00:00
Bill Wendling
b01224f53c And remove r114997's test.
llvm-svn: 115003
2010-09-28 23:24:18 +00:00
Bill Wendling
fc29f0d706 Revert r114997. It was causing a failure on darwin10-selfhost.
llvm-svn: 115002
2010-09-28 23:11:55 +00:00
Bill Wendling
13775375ed Fix a FIXME. _foo.eh symbols are currently always exported so that the linker
knows about them. This is not necessary on 10.6 and later.

llvm-svn: 114997
2010-09-28 22:36:56 +00:00
Owen Anderson
c34e1296b8 Add a subtarget hook for reporting the misprediction penalty. Use this to provide more precise
cost modeling for if-conversion.  Now if only we had a way to estimate the misprediction probability.

Adjsut CodeGen/ARM/ifcvt10.ll.  The pipeline on Cortex-A8 is long enough that it is still profitable
to predicate an ldm, but the shorter pipeline on Cortex-A9 makes it unprofitable.

llvm-svn: 114995
2010-09-28 21:57:50 +00:00
Anton Korobeynikov
f1be021755 User proper libcall names & condcodes while compiling for ARM EABI.
Patch by Evzen Muller!

llvm-svn: 114991
2010-09-28 21:39:26 +00:00
Owen Anderson
c0e1200323 Part one of switching to using a more sane heuristic for determining if-conversion profitability.
Rather than having arbitrary cutoffs, actually try to cost model the conversion.

For now, the constants are tuned to more or less match our existing behavior, but these will be
changed to reflect realistic values as this work proceeds.

llvm-svn: 114973
2010-09-28 18:32:13 +00:00
Bob Wilson
dc396388cb Add a command line option "-arm-strict-align" to disallow unaligned memory
accesses for ARM targets that would otherwise allow it.  Radar 8465431.

llvm-svn: 114941
2010-09-28 04:09:35 +00:00
Jakob Stoklund Olesen
cfed90fe40 Revert "Disable codegen prepare critical edge splitting. Machine instruction passes now"
This reverts revision 114633. It was breaking llvm-gcc-i386-linux-selfhost.

It seems there is a downstream bug that is exposed by
-cgp-critical-edge-splitting=0. When that bug is fixed, this patch can go back
in.

Note that the changes to tailcallfp2.ll are not reverted. They were good are
required.

llvm-svn: 114859
2010-09-27 18:43:48 +00:00
Jakob Stoklund Olesen
51362ba70e Explicitly disable CGP critical edge splitting for this test so it won't break
by reenabling it temporarily.

llvm-svn: 114858
2010-09-27 18:43:43 +00:00
Jakob Stoklund Olesen
d1779cf19c Don't depend on basic block numbering.
llvm-svn: 114857
2010-09-27 18:43:40 +00:00
Chris Lattner
0ebcc18dec the latest assembler that runs on powerpc 10.4 machines doesn't
support aligned comm.  Detect when compiling for 10.4 and don't
emit an alignment for comm.  THis will hopefully fix PR8198.

llvm-svn: 114817
2010-09-27 06:44:54 +00:00
Che-Liang Chiou
aeccb0793b Add test case for PTX ret instruction
llvm-svn: 114789
2010-09-25 07:49:54 +00:00
Che-Liang Chiou
0eaf890a31 Add ret instruction to PTX backend
llvm-svn: 114788
2010-09-25 07:46:17 +00:00
Evan Cheng
1d50dccdc5 Enable code placement optimization pass for ARM.
llvm-svn: 114746
2010-09-24 19:07:23 +00:00
Bob Wilson
ff8139baa0 Set alignment operand for NEON VST instructions.
llvm-svn: 114709
2010-09-23 23:42:37 +00:00
Bob Wilson
026ef4b7f8 Set alignment operand for NEON VLD instructions.
llvm-svn: 114696
2010-09-23 21:43:54 +00:00
Evan Cheng
1493b1799e Disable codegen prepare critical edge splitting. Machine instruction passes now
break critical edges on demand.

llvm-svn: 114633
2010-09-23 06:55:34 +00:00
Owen Anderson
7d6373ea9d A select between a constant and zero, when fed by a bit test, can be efficiently
lowered using a series of shifts.
Fixes <rdar://problem/8285015>.

llvm-svn: 114599
2010-09-22 22:58:22 +00:00
Cameron Esfahani
662193bddc Fix PR8201: Update the code to call via X86::CALL64pcrel32 in the 64-bit case.
llvm-svn: 114597
2010-09-22 22:35:21 +00:00
Chris Lattner
1864d6728d Fix an inconsistency in the x86 backend that led it to reject "calll foo" on
x86-32: 32-bit calls were named "call" not "calll".  64-bit calls were correctly
named "callq", so this only impacted x86-32.

This fixes rdar://8456370 - llvm-mc rejects 'calll'

This also exposes that mingw/64 is generating a 32-bit call instead of a 64-bit call,
I will file a bugzilla.

llvm-svn: 114534
2010-09-22 05:49:14 +00:00
Chris Lattner
26d11d7501 reimplement elf TLS support in terms of addressing modes, eliminating SegmentBaseAddress.
llvm-svn: 114529
2010-09-22 04:39:11 +00:00
Chris Lattner
d42791ad4a linux has a different stack alignment than the mac, relax this a bit.
llvm-svn: 114519
2010-09-22 00:46:26 +00:00
Chris Lattner
e52da86fab give VZEXT_LOAD a memory operand, it now works with segment registers.
llvm-svn: 114515
2010-09-22 00:34:38 +00:00
Chris Lattner
706b9206da revert r114386 now that address modes work correctly, we get a nice
call through gs-relative memory now.

llvm-svn: 114510
2010-09-22 00:11:31 +00:00
Chris Lattner
f9861312cb give LCMPXCHG_DAG[8] a memory operand, allowing it to work with addrspace 256/257
llvm-svn: 114508
2010-09-21 23:59:42 +00:00
Chris Lattner
08b4ce2b31 filecheckize
llvm-svn: 114507
2010-09-21 23:57:27 +00:00
Evan Cheng
1d58965067 OptimizeCompareInstr should avoid iterating pass the beginning of the MBB when the 'and' instruction is after the comparison.
llvm-svn: 114506
2010-09-21 23:49:07 +00:00
Owen Anderson
d9fd152c3a Enable target-specific mul-lowering on ARM, even at -Os. Remove a test that this makes
irrelevant, but add a new test for the new, improved functionality.

llvm-svn: 114494
2010-09-21 22:51:46 +00:00
Devang Patel
53b709a85c Use FileCheck
llvm-svn: 114475
2010-09-21 20:50:32 +00:00
Owen Anderson
97a8fdc19c When adding the carry bit to another value on X86, exploit the fact that the carry-materialization
(sbbl x, x) sets the registers to 0 or ~0.  Combined with two's complement arithmetic, we can fold
the intermediate AND and the ADD into a single SUB.

This fixes <rdar://problem/8449754>.

llvm-svn: 114460
2010-09-21 18:41:19 +00:00
Chris Lattner
ecdba24738 fix rdar://8453210, a crash handling a call through a GS relative load.
For now, just disable folding the load into the call.

llvm-svn: 114386
2010-09-21 03:37:00 +00:00
Evan Cheng
1ce02d180e Enable machine sinking critical edge splitting. e.g.
define double @foo(double %x, double %y, i1 %c) nounwind {
  %a = fdiv double %x, 3.2
  %z = select i1 %c, double %a, double %y
  ret double %z
}

Was:
_foo:
        divsd   LCPI0_0(%rip), %xmm0
        testb   $1, %dil
        jne     LBB0_2
        movaps  %xmm1, %xmm0
LBB0_2:
        ret

Now:
_foo:
        testb   $1, %dil
        je      LBB0_2
        divsd   LCPI0_0(%rip), %xmm0
        ret
LBB0_2:
        movaps  %xmm1, %xmm0
        ret

This avoids the divsd when early exit is taken.
rdar://8454886

llvm-svn: 114372
2010-09-20 22:52:00 +00:00
Owen Anderson
b8811b9ed9 CombinerAA is now reordering these stores.
llvm-svn: 114354
2010-09-20 20:56:29 +00:00
Owen Anderson
fc94b337eb When TCO is turned on, it is possible to end up with aliasing FrameIndex's. Therefore,
CombinerAA cannot assume that different FrameIndex's never alias, but can instead use
MachineFrameInfo to get the actual offsets of these slots and check for actual aliasing.

This fixes CodeGen/X86/2010-02-19-TailCallRetAddrBug.ll and CodeGen/X86/tailcallstack64.ll
when CombinerAA is enabled, modulo a different register allocation sequence.

llvm-svn: 114348
2010-09-20 20:39:59 +00:00
Jim Grosbach
cf90f8beb1 Simplify ARM callee-saved register handling by removing the distinction
between the high and low registers for prologue/epilogue code. This was
a Darwin-only thing that wasn't providing a realistic benefit anymore.
Combining the save areas simplifies the compiler code and results in better
ARM/Thumb2 codegen.

For example, previously we would generate code like:
        push    {r4, r5, r6, r7, lr}
        add     r7, sp, #12
        stmdb   sp!, {r8, r10, r11}
With this change, we combine the register saves and generate:
        push    {r4, r5, r6, r7, r8, r10, r11, lr}
        add     r7, sp, #12

rdar://8445635

llvm-svn: 114340
2010-09-20 19:32:20 +00:00
NAKAMURA Takumi
a8d8b5f3c3 test/CodeGen/X86: Add explicit triplet -mtriple=i686-linux to 3 tests incompatible to Win32 codegen.
r114297 raises 3 failures. They might fail also on mingw.

llvm-svn: 114317
2010-09-19 21:58:55 +00:00
Eric Christopher
2901b19344 Add the exit instruction to the PTX target.
Patch by Che-Liang Chiou <clchiou@gmail.com>!

llvm-svn: 114294
2010-09-18 18:52:28 +00:00
Owen Anderson
015641f659 Invert the logic of reachesChainWithoutSideEffects(). What we want to check is that there is
NO path to the destination containing side effects, not that SOME path contains no side effects.
In  practice, this only manifests with CombinerAA enabled, because otherwise the chain has little
to no branching, so "any" is effectively equivalent to "all".

llvm-svn: 114268
2010-09-18 04:45:14 +00:00
Bob Wilson
670e1915c0 Add target-specific DAG combiner for BUILD_VECTOR and VMOVRRD. An i64
value should be in GPRs when it's going to be used as a scalar, and we use
VMOVRRD to make that happen, but if the value is converted back to a vector
we need to fold to a simple bit_convert.  Radar 8407927.

llvm-svn: 114233
2010-09-17 22:59:05 +00:00
Jim Grosbach
8ae5cfffdd Teach the (non-MC) instruction printer to use the cannonical names for push/pop,
and shift instructions on ARM. Update the tests to match.

llvm-svn: 114230
2010-09-17 22:36:38 +00:00
Evan Cheng
8c2bde65f0 Teach machine sink to
1) Do forward copy propagation. This makes it easier to estimate the cost of the
   instruction being sunk.
2) Break critical edges on demand, including cases where the value is used by
   PHI nodes.
Critical edge splitting is not yet enabled by default.

llvm-svn: 114227
2010-09-17 22:28:18 +00:00
Jim Grosbach
d2cf0742a4 Update tests to handle MC-inst instruction printing of shift operations. The
legacy asm printer uses instructions of the form, "mov r0, r0, lsl #3", while
the MC-instruction printer uses the form "lsl r0, r0, #3". The latter mnemonic
is correct and preferred according the ARM documentation (A8.6.98). The former
are pseudo-instructions for the latter.

llvm-svn: 114221
2010-09-17 21:58:46 +00:00
Jim Grosbach
b86aebe2b7 FileCheck-ize
llvm-svn: 114218
2010-09-17 21:46:16 +00:00
Jim Grosbach
b21c19d666 Move thumb2 tests to the thumb2 directory
llvm-svn: 114206
2010-09-17 20:34:09 +00:00
Jim Grosbach
026811c86f tweak test to check instructions rather than relying on the comment string
llvm-svn: 114204
2010-09-17 20:27:26 +00:00
Dan Gohman
aaed2c137f Avoid emitting a PIC base register if no PIC addresses are needed.
This fixes rdar://8396318.

llvm-svn: 114201
2010-09-17 20:24:24 +00:00
Jim Grosbach
fd21b4bb15 tweak test to check instructions rather than relying on the comment string
llvm-svn: 114200
2010-09-17 20:21:03 +00:00
Jim Grosbach
5ca55bd98b tweak test to check instructions rather than relying on the comment string
llvm-svn: 114199
2010-09-17 20:17:41 +00:00
Dale Johannesen
cf9dc14249 When substituting sunkaddrs into indirect arguments an asm, we were
walking the asm arguments once and stashing their Values.  This is
wrong because the same memory location can be in the list twice, and
if the first one has a sunkaddr substituted, the stashed value for the
second one will be wrong (use-after-free).  PR 8154.

llvm-svn: 114104
2010-09-16 18:30:55 +00:00
Kalle Raiskila
68e2c15954 Change SPU register re-interpretations from OR to COPY_TO_REGCLASS instruction.
This cleans up after the mess r108567 left in the CellSPU backend.
ORCvt-instruction were used to reinterpret registers, and the ORs were then
removed by isMoveInstr(). This patch now removes 350 instrucions of format:
	or $3, $3, $3
(from the 52 testcases in CodeGen/CellSPU). One case of a nonexistant or is
checked for.

Some moves of the form 'ori $., $., 0' and 'ai $., $., 0' still remain.

llvm-svn: 114074
2010-09-16 12:29:33 +00:00
Bob Wilson
e7e2f983e5 Reapply Gabor's 113839, 113840, and 113876 with a fix for a problem
encountered while building llvm-gcc for arm.  This is probably the same issue
that the ppc buildbot hit. llvm::prior works on a MachineBasicBlock::iterator,
not a plain MachineInstr.

llvm-svn: 113983
2010-09-15 17:12:08 +00:00
Gabor Greif
f7635897c8 the darwin9-powerpc buildbot keeps consistently crashing,
backing out following to get it back to green,
so I can investigate in peace:

svn merge -c -113840  llvm/test/CodeGen/ARM/arm-and-tst-peephole.ll
svn merge -c -113876 -c -113839 llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp

llvm-svn: 113980
2010-09-15 16:53:07 +00:00
Gabor Greif
a37b447ba6 forgot the testcase change for r113839
llvm-svn: 113840
2010-09-14 09:30:17 +00:00
Gabor Greif
40a8053a15 test for and-tst peephole optimization
documents the status-quo with its opportunities

llvm-svn: 113838
2010-09-14 08:50:43 +00:00
Owen Anderson
9c34a7831d Re-apply r113679, which was reverted in r113720, which added a paid of new instcombine transforms
to expose greater opportunities for store narrowing in codegen.  This patch fixes a potential
infinite loop in instcombine caused by one of the introduced transforms being overly aggressive.

llvm-svn: 113763
2010-09-13 17:59:27 +00:00
Eric Christopher
d4aaabfa74 Revert 113679, it was causing an infinite loop in a testcase that I've sent
on to Owen.

llvm-svn: 113720
2010-09-12 06:09:23 +00:00
Evan Cheng
e2425602c5 Fix test so it passes on non-Darwin hosts.
llvm-svn: 113577
2010-09-10 06:20:01 +00:00
Bob Wilson
01b7bf3510 Fix merging base-updates for VLDM/VSTM: Before I switched these instructions
to use AddrMode4, there was a count of the registers stored in one of the
operands.  I changed that to just count the operands but forgot to adjust for
the size of D registers.  This was noticed by Evan as a performance problem
but it is a potential correctness bug as well, since it is possible that this
could merge a base update with a non-matching immediate.

llvm-svn: 113576
2010-09-10 05:15:04 +00:00
Evan Cheng
c9cb37516d Teach if-converter to be more careful with predicating instructions that would
take multiple cycles to decode.
For the current if-converter clients (actually only ARM), the instructions that
are predicated on false are not nops. They would still take machine cycles to
decode. Micro-coded instructions such as LDM / STM can potentially take multiple
cycles to decode. If-converter should take treat them as non-micro-coded
simple instructions.

llvm-svn: 113570
2010-09-10 01:29:16 +00:00
Bruno Cardoso Lopes
49efee5c95 Add one more pattern to fallback movddup
llvm-svn: 113522
2010-09-09 18:48:34 +00:00
Bob Wilson
524123343c Fix NEON VLD pseudo instruction itineraries that were incorrectly copied from
the VST pseudos.  The VLD/VST scheduling still needs work (see pr6722), but
at least we shouldn't confuse the loads with the stores.

llvm-svn: 113473
2010-09-09 05:40:26 +00:00
Jim Grosbach
0f01a7319e Re-enable usage of the ARM base pointer. r113394 fixed the known failures.
Re-running some nightly testers w/ it enabled to verify.

llvm-svn: 113399
2010-09-08 20:12:02 +00:00
Eric Christopher
95fe7f6d38 Remove ssp from this test.
llvm-svn: 113392
2010-09-08 19:32:34 +00:00
Kalle Raiskila
be22226628 Fix CellSPU vector shuffles, again.
Some cases of lowering to rotate were miscompiled.

llvm-svn: 113355
2010-09-08 11:53:38 +00:00
Jim Grosbach
7a957dc761 disable for the moment while tracking down a few Thumb2-O0 failure that look
related. (attempt deux, complete w/ test update this time)

llvm-svn: 113333
2010-09-08 02:00:34 +00:00
Devang Patel
acddbab1a5 remove these tests for now.
llvm-svn: 113293
2010-09-07 22:03:44 +00:00
Devang Patel
751d5ec40a There is no need to force target if the test is going to run on other x86 platforms.
llvm-svn: 113285
2010-09-07 20:59:09 +00:00
Devang Patel
e9dcee0a69 Fix command line used to link these test cases.
llvm-svn: 113237
2010-09-07 18:17:56 +00:00
Devang Patel
ca974014e7 Reintroduce dbg-declare tests.
llvm-svn: 113232
2010-09-07 18:01:49 +00:00