1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-01 16:33:37 +01:00
Commit Graph

3210 Commits

Author SHA1 Message Date
Anton Korobeynikov
3997a71ef8 Feature test for half precision FP.
llvm-svn: 98504
2010-03-14 18:42:43 +00:00
Chris Lattner
2bdb0765f8 fix AsmPrinter::GetBlockAddressSymbol to always return a unique
label instead of trying to form one based on the BB name (which
causes collisions if the name is empty).  This fixes PR6608

llvm-svn: 98495
2010-03-14 17:53:23 +00:00
Chris Lattner
9331acc6d7 get MMI out of the label uniquing business, just go to MCContext
to get unique assembler temporary labels.

llvm-svn: 98489
2010-03-14 08:36:50 +00:00
Evan Cheng
7d8c39bb1c Do not force indirect tailcall through fixed registers: eax, r11. Add support to allow loads to be folded to tail call instructions.
llvm-svn: 98465
2010-03-14 03:48:46 +00:00
Chris Lattner
683801add5 simplify code to use OutContext.GetOrCreateTemporarySymbol with
no arguments instead of having to come up with a unique name.
This also makes the code less fragile.

llvm-svn: 98364
2010-03-12 18:47:50 +00:00
Chris Lattner
80ab250a1c fix PR6577, a bug in sdbuilder lowering select instructions
whose true value was not Val#0.

llvm-svn: 98336
2010-03-12 07:15:36 +00:00
Bill Wendling
368a68ac82 revert r98270.
llvm-svn: 98281
2010-03-11 19:50:31 +00:00
Evan Cheng
20dbd70316 Bad bad bug. x86 force indirect tail call address into eax when it's meant to force it into a call preserved register instead. Change it to ecx for now.
llvm-svn: 98270
2010-03-11 18:49:14 +00:00
Richard Osborne
0dcde97cbc Add dag combine to simplify lmul(x, 0, a, b)
llvm-svn: 98258
2010-03-11 16:26:35 +00:00
Evan Cheng
4ef6d8fa15 The check for coalescing a virtual register to a physical register, e.g.
cl = EXTRACT_SUBREG reg1024, 1, is overly conservative. It should check
for overlaps of vr's live interval with the super registers of the
physical register (ECX in this case) and let JoinIntervals() handle checking
the coalescing feasibility against the physical register (cl in this case).

llvm-svn: 98251
2010-03-11 08:20:21 +00:00
Eric Christopher
bbdbe41a97 Have fast-isel understand llvm.objectsize. Update testcase for slightly
different codegen.

llvm-svn: 98244
2010-03-11 06:20:22 +00:00
Chris Lattner
d6d11e53ab add support, testcases, and dox for the new GHC calling
convention.  Patch by David Terei!

llvm-svn: 98212
2010-03-11 00:22:57 +00:00
Chris Lattner
7cd70b8066 fix PR6533 by updating the br(xor) code to remember the case
when it looked past a trunc.

llvm-svn: 98203
2010-03-10 23:46:44 +00:00
Richard Osborne
41c5f84f1d Handle MVT::i64 type in DAG combine for ISD::ADD. Fold 64 bit
expression add(add(mul(x,y),a),b) -> lmul(x,y,a,b) if all
operands are zero extended.

llvm-svn: 98168
2010-03-10 18:12:27 +00:00
Richard Osborne
d400202a43 Fold add(add(mul(x,y),a),b) -> lmul(x,y,a,b) if the intermediate
results are unused elsewhere.

llvm-svn: 98157
2010-03-10 16:19:31 +00:00
Richard Osborne
c19c2bd177 Prefer LMUL to MACCU as LMUL has no tied operands.
llvm-svn: 98153
2010-03-10 13:27:10 +00:00
Richard Osborne
43210638f1 Custom lower (S|U)MUL_LOHI -> MACC(S|U)
llvm-svn: 98152
2010-03-10 13:20:07 +00:00
Richard Osborne
c88a8e8d66 Lower add (mul a, b), c into MACCU / MACCS nodes which translate
directly to the maccu / maccs instructions. We handle this in
ExpandADDSUB since after type legalisation it is messy to
recognise these operations.

llvm-svn: 98150
2010-03-10 11:41:08 +00:00
Richard Osborne
6c55bfe516 Convert test to FileCheck.
llvm-svn: 98148
2010-03-10 11:24:03 +00:00
Evan Cheng
1f7af27386 Fix typo.
llvm-svn: 98142
2010-03-10 07:07:55 +00:00
Evan Cheng
96e1e20fd5 Unbreak test on Linux.
llvm-svn: 98141
2010-03-10 07:07:45 +00:00
Evan Cheng
668ceddeec Enable machine cse pass.
llvm-svn: 98132
2010-03-10 03:07:41 +00:00
Dale Johannesen
4b8f3692f4 The address of an indirect call must be in R12 on Darwin.
Make it so.  (This patch is in LowerCall_Darwin, which seems
to be used by SVR4 code as well; since that doesn't belong here,
I haven't worried about this case.)

llvm-svn: 98077
2010-03-09 20:15:42 +00:00
Richard Osborne
4077517135 In cases where the carry / borrow unused converted ladd / lsub
to an add or a sub.

llvm-svn: 98059
2010-03-09 16:34:25 +00:00
Richard Osborne
173efed224 Add DAG combine for ladd / lsub.
llvm-svn: 98057
2010-03-09 16:07:47 +00:00
Chris Lattner
99ca33d324 move .set generation out of DwarfPrinter into AsmPrinter and
MCize it.

llvm-svn: 98010
2010-03-08 23:58:37 +00:00
Chris Lattner
10d571f349 simplify EmitSectionOffset to always use .set if it is
available, the only thing this affects is that we produce
.set in one case we didn't before, which shouldn't harm
anything.  Make EmitSectionOffset call EmitDifference
instead of duplicating it.

llvm-svn: 98005
2010-03-08 23:23:25 +00:00
Bob Wilson
116599fe52 Fix a crash compiling 254.gap for Thumb2. The Thumb2 add/sub with 12-bit
immediate instructions cannot set the condition codes, so they do not have
the extra cc_out operand.  We hit an assertion during tail duplication
because the instruction being duplicated had more operands that expected.

llvm-svn: 98001
2010-03-08 22:56:15 +00:00
Evan Cheng
cfe037000a Add documentation on sibling call optimization. Rename tailcall2.ll test to sibcall.ll.
llvm-svn: 97980
2010-03-08 21:05:02 +00:00
Wesley Peck
bc2c6d7b1b Re-committing the failed r97807 commit with changes to eliminate warnings.
llvm-svn: 97891
2010-03-06 23:23:12 +00:00
Anton Korobeynikov
e0e616a74d Initial bits of ARMv4-only support.
Patch by John Tytgat!

llvm-svn: 97886
2010-03-06 19:39:36 +00:00
Anton Korobeynikov
6c841e6a44 Do not use '&' prefix for globals when register base field is non-zero, otherwise msp430-as will silently miscompile the code (TI's assembler report an error though).
This fixes PR6349

llvm-svn: 97877
2010-03-06 11:41:12 +00:00
Chris Lattner
b2ed5cb501 revert r97807, it introduced build warnings.
llvm-svn: 97869
2010-03-06 04:32:46 +00:00
Charles Davis
faa2f44081 Don't emit global symbols into the (__TEXT,__ustring) section on Darwin. This
is a workaround for <rdar://problem/7672401/> (which I filed).

This let's us build Wine on Darwin, and it gets the Qt build there a little bit
further (so Doug says).

llvm-svn: 97845
2010-03-05 22:28:45 +00:00
Jakob Stoklund Olesen
4e033d2070 Better handling of dead super registers in LiveVariables. We used to do this:
CALL ... %RAX<imp-def>
   ... [not using %RAX]
   %EAX = ..., %RAX<imp-use, kill>
   RET %EAX<imp-use,kill>

Now we do this:

   CALL ... %RAX<imp-def, dead>
   ... [not using %RAX]
   %EAX = ...
   RET %EAX<imp-use,kill>

By not artificially keeping %RAX alive, we lower register pressure a bit.

The correct number of instructions for 2008-08-05-SpillerBug.ll is obviously
55, anybody can see that. Sheesh.

llvm-svn: 97838
2010-03-05 21:49:17 +00:00
Jakob Stoklund Olesen
1fce28720a We don't really care about correct register liveness information after the
post-ra scheduler has run. Disable the verifier checks that late in the game.

llvm-svn: 97837
2010-03-05 21:49:13 +00:00
Jakob Stoklund Olesen
67476519d7 Avoid creating bad PHI instructions when BR is being const-folded.
llvm-svn: 97836
2010-03-05 21:49:10 +00:00
Chris Lattner
e1452aba94 fix bss section printing for cell, patch by Kalle Raiskila!
llvm-svn: 97814
2010-03-05 18:55:36 +00:00
Wesley Peck
79c2f7afcc Reworking the stack layout that the MicroBlaze backend generates.
The MicroBlaze backend was generating stack layouts that did not
conform correctly to the ABI. This update generates stack layouts
which are closer to what GCC does.

Variable arguments support was added as well but the stack layout
for varargs has not been finalized.

llvm-svn: 97807
2010-03-05 15:26:02 +00:00
Evan Cheng
eb43cbfc75 Fix an oops in x86 sibcall optimization. If the ByVal callee argument is itself passed as a pointer, then it's obviously not safe to do a tail call.
llvm-svn: 97797
2010-03-05 08:38:04 +00:00
Chris Lattner
80aaccb987 Fix PR6497, a bug where we'd fold a load into an addc
node which has a flag.  That flag in turn was used by an
already-selected adde which turned into an ADC32ri8 which
used a selected load which was chained to the load we
folded.  This flag use caused us to form a cycle.  Fix
this by not ignoring chains in IsLegalToFold even in
cases where the isel thinks it can.

llvm-svn: 97791
2010-03-05 06:19:13 +00:00
Chris Lattner
5804690a45 cleanup
llvm-svn: 97790
2010-03-05 06:17:43 +00:00
Evan Cheng
04b9deff58 Rever 96389 and 96990. They are causing some miscompilation that I do not fully understand.
llvm-svn: 97782
2010-03-05 03:08:23 +00:00
Bill Wendling
a92a5b8a6c Revert r97766. It's deleting a tag.
llvm-svn: 97768
2010-03-05 00:33:59 +00:00
Bill Wendling
55b4dfcd9d Micro-optimization:
This code:

float floatingPointComparison(float x, float y) {
    double product = (double)x * y;
    if (product == 0.0)
        return product;
    return product - 1.0;
}

produces this:

_floatingPointComparison:
0000000000000000        cvtss2sd        %xmm1,%xmm1
0000000000000004        cvtss2sd        %xmm0,%xmm0
0000000000000008        mulsd           %xmm1,%xmm0
000000000000000c        pxor            %xmm1,%xmm1
0000000000000010        ucomisd         %xmm1,%xmm0
0000000000000014        jne             0x00000004
0000000000000016        jp              0x00000002
0000000000000018        jmp             0x00000008
000000000000001a        addsd           0x00000006(%rip),%xmm0
0000000000000022        cvtsd2ss        %xmm0,%xmm0
0000000000000026        ret

The "jne/jp/jmp" sequence can be reduced to this instead:

_floatingPointComparison:
0000000000000000        cvtss2sd        %xmm1,%xmm1
0000000000000004        cvtss2sd        %xmm0,%xmm0
0000000000000008        mulsd           %xmm1,%xmm0
000000000000000c        pxor            %xmm1,%xmm1
0000000000000010        ucomisd         %xmm1,%xmm0
0000000000000014        jp              0x00000002
0000000000000016        je              0x00000008
0000000000000018        addsd           0x00000006(%rip),%xmm0
0000000000000020        cvtsd2ss        %xmm0,%xmm0
0000000000000024        ret

for a savings of 2 bytes.

This xform can happen when we recognize that jne and jp jump to the same "true"
MBB, the unconditional jump would jump to the "false" MBB, and the "true" branch
is the fall-through MBB.

llvm-svn: 97766
2010-03-05 00:24:26 +00:00
Johnny Chen
b6d35fd803 Drop the ".w" qualifier for t2UXTB16* instructions as there is no 16-bit version
of either sxtb16 or uxtb16, and the unified syntax does not specify ".w".

llvm-svn: 97760
2010-03-04 22:24:41 +00:00
Bob Wilson
188e15d7a5 pr6478: The frame pointer spill frame index is only defined when there is a
frame pointer.

llvm-svn: 97755
2010-03-04 21:42:36 +00:00
Bob Wilson
73b96c00d2 pr6480: Don't try producing ld/st-multiple instructions when the address is
an undef value.  This is only going to come up for bugpoint-reduced tests --
correct programs will not access memory at undefined addresses -- so it's not
worth the effort of doing anything more aggressive.

llvm-svn: 97745
2010-03-04 21:04:38 +00:00
Jakob Stoklund Olesen
3408cd6de1 Fix the remaining MUL8 and DIV8 to define AX instead of AL,AH.
These instructions technically define AL,AH, but a trick in X86ISelDAGToDAG
reads AX in order to avoid reading AH with a REX instruction.

Fix PR6489.

llvm-svn: 97742
2010-03-04 20:42:07 +00:00
Dan Gohman
265f85f6d8 Fix recognition of 16-bit bswap for C front-ends which emit the
clobber registers in a different order.

llvm-svn: 97741
2010-03-04 19:58:08 +00:00
Dan Gohman
da13ee1220 Revert r97580; that's not the right way to fix this.
llvm-svn: 97639
2010-03-03 04:36:42 +00:00
Bill Wendling
d1f658563d This test case:
long test(long x) { return (x & 123124) | 3; }

Currently compiles to:

_test:
        orl     $3, %edi
        movq    %rdi, %rax
        andq    $123127, %rax
        ret

This is because instruction and DAG combiners canonicalize

  (or (and x, C), D) -> (and (or, D), (C | D))

However, this is only profitable if (C & D) != 0. It gets in the way of the
3-addressification because the input bits are known to be zero.

llvm-svn: 97616
2010-03-03 00:35:56 +00:00
Chris Lattner
9c9c1158cb Fix some issues in WalkChainUsers dealing with
CopyToReg/CopyFromReg/INLINEASM.  These are annoying because
they have the same opcode before an after isel.  Fix this by
setting their NodeID to -1 to indicate that they are selected,
just like what automatically happens when selecting things that
end up being machine nodes.

With that done, give IsLegalToFold a new flag that causes it to
ignore chains.  This lets the HandleMergeInputChains routine be
the one place that validates chains after a match is successful,
enabling the new hotness in chain processing.  This smarter
chain processing eliminates the need for "PreprocessRMW" in the
X86 and MSP430 backends and enables MSP to start matching it's
multiple mem operand instructions more aggressively.

I currently #if out the dead code in the X86 backend and MSP 
backend, I'll remove it for real in a follow-on patch.

The testcase changes are:
  test/CodeGen/X86/sse3.ll: we generate better code
  test/CodeGen/X86/store_op_load_fold2.ll: PreprocessRMW was 
      miscompiling this before, we now generate correct code
      Convert it to filecheck while I'm at it.
  test/CodeGen/MSP430/Inst16mm.ll: Add a testcase for mem/mem
      folding to make anton happy. :)

llvm-svn: 97596
2010-03-02 22:20:06 +00:00
Chris Lattner
d25f212f9f this testcase is failing because pic16 doesn't define a reg/reg
xor pattern.  I have no plans to fix this XFAIL.

llvm-svn: 97587
2010-03-02 20:48:24 +00:00
Chris Lattner
f84b94d738 xfail this for now.
llvm-svn: 97584
2010-03-02 19:53:25 +00:00
Dan Gohman
f06941597a When expanding an expression such as (A + B + C + D), sort the operands
by loop depth and emit loop-invariant subexpressions outside of loops.
This speeds up MultiSource/Applications/viterbi and others.

llvm-svn: 97580
2010-03-02 19:32:21 +00:00
Chris Lattner
845db3b26d clean up some testcases.
llvm-svn: 97576
2010-03-02 18:56:03 +00:00
Chris Lattner
2019e2922f Fix the xfail I added a couple of patches back. The issue
was that we weren't properly handling the case when interior
nodes of a matched pattern become dead after updating chain
and flag uses.  Now we handle this explicitly in 
UpdateChainsAndFlags.

llvm-svn: 97561
2010-03-02 07:50:03 +00:00
Chris Lattner
0b41a42411 Rewrite chain handling validation and input TokenFactor handling
stuff now that we don't care about emulating the old broken 
behavior of the old isel.  This eliminates the 
'CheckChainCompatible' check (along with IsChainCompatible) which
did an incorrect and inefficient scan *up* the chain nodes which
happened as the pattern was being formed and does the validation
at the end in HandleMergeInputChains when it forms a structural 
pattern.  This scans "down" the graph, which means that it is
quickly bounded by nodes already selected.  This also handles
token factors that get "trapped" in the dag.

Removing the CheckChainCompatible nodes also shrinks the 
generated tables by about 6K for X86 (down to 83K).

There are two pieces remaining before I can nuke PreprocessRMW:
1. I xfailed a test because we're now producing worse code in a 
   case that has nothing to do with the change: it turns out that
   our use of MorphNodeTo will leave dead nodes in the graph
   which (depending on how the graph is walked) end up causing
   bogus uses of chains and blocking matches.  This is really 
   bad for other reasons, so I'll fix this in a follow-up patch.

2. CheckFoldableChainNode needs to be improved to handle the TF.

llvm-svn: 97539
2010-03-02 02:22:10 +00:00
Dan Gohman
56a20fc5eb Fix several places to handle vector operands properly.
Based on a patch by Micah Villmow for PR6438.

llvm-svn: 97538
2010-03-02 02:14:38 +00:00
Chris Lattner
c0839055a9 Fix PR2590 by making PatternSortingPredicate actually be
ordered correctly.  Previously it would get in trouble when
two patterns were too similar and give them nondet ordering.
We force this by using the record ID order as a fallback.

The testsuite diff is due to alpha patterns being ordered
slightly differently, the change is a semantic noop afaict:

< 	lda $0,-100($16)
---
> 	subq $16,100,$0

llvm-svn: 97509
2010-03-01 22:09:11 +00:00
Chris Lattner
ac2f5c24a0 stop using anders-aa
llvm-svn: 97491
2010-03-01 20:24:05 +00:00
Devang Patel
6853e2432e Rewrite test to test VLA using new debug info encoding scheme.
llvm-svn: 97465
2010-03-01 18:30:58 +00:00
Devang Patel
c56aee014c Remove this generic debug info intrinsic test. LLVM does not use this llvm.dbg.stoppoint intrinsic anymore. There are tests to check new implementation, which attaches location information directly with an instruction using metadata.
llvm-svn: 97464
2010-03-01 18:30:08 +00:00
Chris Lattner
74db1864da add some random nounwinds.
llvm-svn: 97411
2010-02-28 20:36:49 +00:00
Dan Gohman
0799a41c48 Don't try to replace physical registers when doing CSE.
llvm-svn: 97360
2010-02-28 01:33:43 +00:00
Dan Gohman
3d9622fa97 Add nounwinds.
llvm-svn: 97349
2010-02-27 23:53:53 +00:00
Evan Cheng
94051bc37e Re-apply 97040 with fix. This survives a ppc self-host llvm-gcc bootstrap.
llvm-svn: 97310
2010-02-27 07:36:59 +00:00
Jakob Stoklund Olesen
755ba2ee84 Use the right floating point load/store instructions in PPCInstrInfo::foldMemoryOperandImpl().
The PowerPC floating point registers can represent both f32 and f64 via the
two register classes F4RC and F8RC. F8RC is considered a subclass of F4RC to
allow cross-class coalescing. This coalescing only affects whether registers
are spilled as f32 or f64.

Spill slots must be accessed with load/store instructions corresponding to the
class of the spilled register. PPCInstrInfo::foldMemoryOperandImpl was looking
at the instruction opcode which is wrong.

X86 has similar floating point register classes, but doesn't try to fold
memory operands, so there is no problem there.

llvm-svn: 97262
2010-02-26 21:09:24 +00:00
Sanjiv Gupta
8a971a1dc5 Reapply things reverted back in 97220, with the fixed test case.
llvm-svn: 97228
2010-02-26 17:59:28 +00:00
Richard Osborne
fe30a8a2c1 Fix XCoreTargetLowering::isLegalAddressingMode() to handle VoidTy.
Previously LoopStrengthReduce would sometimes be unable to find
a legal formula, causing an assertion failure.

llvm-svn: 97226
2010-02-26 16:44:51 +00:00
Chris Lattner
e4b5559cf8 change the scope node to include a list of children to be checked
instead of to have a chained series of scope nodes.  This makes
the generated table smaller, improves the efficiency of the
interpreter, and make the factoring optimization much more 
reasonable to implement.

llvm-svn: 97160
2010-02-25 19:00:39 +00:00
Dan Gohman
084112437d Revert r97064. Duncan pointed out that bitcasts are defined in
terms of store and load, which means bitcasting between scalar
integer and vector has endian-specific results, which undermines
this whole approach.

llvm-svn: 97137
2010-02-25 15:20:39 +00:00
Dan Gohman
52ed61204b Make LoopSimplify change conditional branches in loop exiting blocks
which branch on undef to branch on a boolean constant for the edge
exiting the loop. This helps ScalarEvolution compute trip counts for
loops.

Teach ScalarEvolution to recognize single-value PHIs, when safe, and
ForgetSymbolicName to forget such single-value PHI nodes as apprpriate
in ForgetSymbolicName.

llvm-svn: 97126
2010-02-25 06:57:05 +00:00
Jakob Stoklund Olesen
2b93d17560 Create a stack frame on ARM when
- Function uses all scratch registers AND
- Function does not use any callee saved registers AND
- Stack size is too big to address with immediate offsets.

In this case a register must be scavenged to calculate the address of a stack
object, and the scavenger needs a spare register or emergency spill slot.

llvm-svn: 97071
2010-02-24 22:43:17 +00:00
Bob Wilson
4ffb88d388 Check for comparisons of +/- zero when optimizing less-than-or-equal and
greater-than-or-equal SELECT_CCs to NEON vmin/vmax instructions.  This is
only allowed when UnsafeFPMath is set or when at least one of the operands
is known to be nonzero.

llvm-svn: 97065
2010-02-24 22:15:53 +00:00
Dan Gohman
424e8f22d0 Make getTypeSizeInBits work correctly for array types; it should return
the number of value bits, not the number of bits of allocation for in-memory
storage.

Make getTypeStoreSize and getTypeAllocSize work consistently for arrays and
vectors.

Fix several places in CodeGen which compute offsets into in-memory vectors
to use TargetData information.

This fixes PR1784.

llvm-svn: 97064
2010-02-24 22:05:23 +00:00
Daniel Dunbar
24c99e027e Speculatively revert r97011, "Re-apply 96540 and 96556 with fixes.", again in
the hopes of fixing PPC bootstrap.

llvm-svn: 97040
2010-02-24 17:05:47 +00:00
Dan Gohman
c0c6077fed When forming SSE min and max nodes for UGE and ULE comparisons, it's
necessary to swap the operands to handle NaN and negative zero properly.

Also, reintroduce logic for checking for NaN conditions when forming
SSE min and max instructions, fixed to take into consideration NaNs and
negative zeros. This allows forming min and max instructions in more
cases.

llvm-svn: 97025
2010-02-24 06:52:40 +00:00
Chris Lattner
52a02205d8 Change the scheduler from adding nodes in allnodes order
to adding them in a determinstic order (bottom up from 
the root) based on the structure of the graph itself.

This updates tests for some random changes, interesting
bits: CodeGen/Blackfin/promote-logic.ll no longer crashes.
I have no idea why, but that's good right?

CodeGen/X86/2009-07-16-LoadFoldingBug.ll also fails, but
now compiles to have one fewer constant pool entry, making
the expected load that was being folded disappear.  Since it
is an unreduced mass of gnast, I just removed it.

This fixes PR6370

llvm-svn: 97023
2010-02-24 06:11:37 +00:00
Jim Grosbach
6f72657d6e LowerCall() should always do getCopyFromReg() to reference the stack pointer.
Machine instruction selection is much happier when operands are in virtual
registers.

llvm-svn: 97012
2010-02-24 01:43:03 +00:00
Evan Cheng
5787cd9349 Re-apply 96540 and 96556 with fixes.
llvm-svn: 97011
2010-02-24 01:42:31 +00:00
Jakob Stoklund Olesen
a946f9eb7d DIV8r must define %AX since X86DAGToDAGISel::Select() sometimes uses it
instead of %AL/%AH.

llvm-svn: 97006
2010-02-24 00:39:35 +00:00
Jakob Stoklund Olesen
3406ec2f57 Remember to handle sub-registers when moving imp-defs to a rematted instruction.
llvm-svn: 96995
2010-02-23 22:44:02 +00:00
Jakob Stoklund Olesen
cf29251712 Keep track of phi join registers explicitly in LiveVariables.
Previously, LiveIntervalAnalysis would infer phi joins by looking for multiply
defined registers. That doesn't work if the phi join is implicitly defined in
all but one of the predecessors.

llvm-svn: 96994
2010-02-23 22:43:58 +00:00
Wesley Peck
94cdac52e5 Adding the MicroBlaze backend.
The MicroBlaze is a highly configurable 32-bit soft-microprocessor for
use on Xilinx FPGAs. For more information see:
http://www.xilinx.com/tools/microblaze.htm
http://en.wikipedia.org/wiki/MicroBlaze

The current LLVM MicroBlaze backend generates assembly which can be
compiled using the an appropriate binutils assembler.

llvm-svn: 96969
2010-02-23 19:15:24 +00:00
Richard Osborne
7387249531 Lower BR_JT on the XCore to a jump into a series of jump instructions.
llvm-svn: 96942
2010-02-23 13:25:07 +00:00
Evan Cheng
8096221984 These should not have been committed.
llvm-svn: 96827
2010-02-22 23:37:48 +00:00
Chris Lattner
c4fea4c8a1 no need to run llvm-as here.
llvm-svn: 96826
2010-02-22 23:34:12 +00:00
Evan Cheng
d9816ef946 Instcombine constant folding can normalize gep with negative index to index with large offset. When instcombine objsize checking transformation sees these geps where the offset seemingly point out of bound, it should just return "i don't know" rather than asserting.
llvm-svn: 96825
2010-02-22 23:34:00 +00:00
Dan Gohman
97481fb5e0 Canonicalize ConstantInts to the right operand of commutative
operators.

The test difference is just due to the multiplication operands
being commuted (and thus requiring a more elaborate match). In
optimized code, that expression would be folded.

llvm-svn: 96816
2010-02-22 22:43:23 +00:00
Dan Gohman
8f672b95f2 Actually enable the -enable-unsafe-fp-math tests.
llvm-svn: 96796
2010-02-22 18:53:26 +00:00
Arnold Schwaighofer
8427969f9c Mark the return address stack slot as mutable when moving the return address
during a tail call. A parameter might overwrite this stack slot during the tail
call. 

The sequence during a tail call is:
1.) load return address to temp reg
2.) move parameters (might involve storing to return address stack slot)
3.) store return address to new location from temp reg

If the stack location is marked immutable CodeGen can colocate load (1) with the
store (3).

This fixes bug 6225.

llvm-svn: 96783
2010-02-22 16:18:09 +00:00
Dan Gohman
c281a5da15 Remove the logic for reasoning about NaNs from the code that forms
SSE min and max instructions. The real thing this code needs to be
concerned about is negative zero.

Update the sse-minmax.ll test accordingly, and add tests for
-enable-unsafe-fp-math mode as well.

llvm-svn: 96775
2010-02-22 04:03:39 +00:00
Dan Gohman
c44dee5fbd When emitting an instruction which depends on both a post-incremented
induction variable value and a loop-variant value, don't force the
insert position to be at the post-increment position, because it may
not be dominated by the loop-variant value. This fixes a
use-before-def problem noticed on PPC.

llvm-svn: 96774
2010-02-22 03:59:54 +00:00
Chris Lattner
37f20c29c8 add some no-unwinds, other minor cleanups.
llvm-svn: 96756
2010-02-21 20:33:20 +00:00
Chris Lattner
fa1fdcf146 add a triple so that this doesn't fail due to linux/ppc register printing
syntax.

llvm-svn: 96748
2010-02-21 19:27:38 +00:00
Chris Lattner
654f38165b filecheckize and add nouwinds.
llvm-svn: 96745
2010-02-21 18:53:28 +00:00
Anton Korobeynikov
fe0d6453ec IT turns out that during jumpless setcc lowering eq and ne were swapped.
This fixes PR6348

llvm-svn: 96734
2010-02-21 12:28:58 +00:00
Chris Lattner
b328c57aa4 fix and un-xfail X86/vec_ss_load_fold.ll
llvm-svn: 96720
2010-02-21 04:53:34 +00:00
Chris Lattner
b28cf9f8d9 temporarily disable this.
llvm-svn: 96717
2010-02-21 03:24:41 +00:00
Dan Gohman
9db0689627 Check for overflow when scaling up an add or an addrec for
scaled reuse.

llvm-svn: 96692
2010-02-19 19:32:49 +00:00
Charles Davis
a64fc8c41b Add support for the 'alignstack' attribute to the x86 backend. Fixes PR5254.
Also, FileCheck'ize a test.

llvm-svn: 96686
2010-02-19 18:17:13 +00:00
Duncan Sands
5d5cce2e19 Revert commits 96556 and 96640, because commit 96556 breaks the
dragonegg self-host build.  I reverted 96640 in order to revert
96556 (96640 goes on top of 96556), but it also looks like with
both of them applied the breakage happens even earlier.  The
symptom of the 96556 miscompile is the following crash:

  llvm[3]: Compiling AlphaISelLowering.cpp for Release build
  cc1plus: /home/duncan/tmp/tmp/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:4982: void llvm::SelectionDAG::ReplaceAllUsesWith(llvm::SDNode*, llvm::SDNode*, llvm::SelectionDAG::DAGUpdateListener*): Assertion `(!From->hasAnyUseOfValue(i) || From->getValueType(i) == To->getValueType(i)) && "Cannot use this version of ReplaceAllUsesWith!"' failed.
  Stack dump:
  0.	Running pass 'X86 DAG->DAG Instruction Selection' on function '@_ZN4llvm19AlphaTargetLowering14LowerOperationENS_7SDValueERNS_12SelectionDAGE'
  g++: Internal error: Aborted (program cc1plus)

This occurs when building LLVM using LLVM built by LLVM (via
dragonegg).  Probably LLVM has miscompiled itself, though it
may have miscompiled GCC and/or dragonegg itself: at this point
of the self-host build, all of GCC, LLVM and dragonegg were built
using LLVM.  Unfortunately this kind of thing is extremely hard
to debug, and while I did rummage around a bit I didn't find any
smoking guns, aka obviously miscompiled code.

Found by bisection.

r96556 | evancheng | 2010-02-18 03:13:50 +0100 (Thu, 18 Feb 2010) | 5 lines

Some dag combiner goodness:
Transform br (xor (x, y)) -> br (x != y)
Transform br (xor (xor (x,y), 1)) -> br (x == y)
Also normalize (and (X, 1) == / != 1 -> (and (X, 1)) != / == 0 to match to "test on x86" and "tst on arm"

r96640 | evancheng | 2010-02-19 01:34:39 +0100 (Fri, 19 Feb 2010) | 16 lines

Transform (xor (setcc), (setcc)) == / != 1 to
(xor (setcc), (setcc)) != / == 1.

e.g. On x86_64
  %0 = icmp eq i32 %x, 0
  %1 = icmp eq i32 %y, 0
  %2 = xor i1 %1, %0
  br i1 %2, label %bb, label %return
=>
	testl   %edi, %edi
	sete    %al
	testl   %esi, %esi
	sete    %cl
	cmpb    %al, %cl
	je      LBB1_2

llvm-svn: 96672
2010-02-19 11:30:41 +00:00
Evan Cheng
32031f7404 Transform (xor (setcc), (setcc)) == / != 1 to
(xor (setcc), (setcc)) != / == 1.

e.g. On x86_64
  %0 = icmp eq i32 %x, 0
  %1 = icmp eq i32 %y, 0
  %2 = xor i1 %1, %0
  br i1 %2, label %bb, label %return
=>
	testl   %edi, %edi
	sete    %al
	testl   %esi, %esi
	sete    %cl
	cmpb    %al, %cl
	je      LBB1_2

llvm-svn: 96640
2010-02-19 00:34:39 +00:00
Dan Gohman
58199e30dc When determining the set of interesting reuse factors, consider
strides in foreign loops. This helps locate reuse opportunities
with existing induction variables in foreign loops and reduces
the need for inserting new ones. This fixes rdar://7657764.

llvm-svn: 96629
2010-02-19 00:05:23 +00:00
Mon P Wang
64cd1a8d7f getSplatIndex assumes that the first element of the mask contains the splat index
which is not always true if the mask contains undefs. Modified it to return
the first non undef value.

llvm-svn: 96621
2010-02-18 22:33:18 +00:00
Jakob Stoklund Olesen
5b9d14b55e Always normalize spill weights, also for intervals created by spilling.
Moderate the weight given to very small intervals.

The spill weight given to new intervals created when spilling was not
normalized in the same way as the original spill weights calculated by
CalcSpillWeights. That meant that restored registers would tend to hang around
because they had a much higher spill weight that unspilled registers.

This improves the runtime of a few tests by up to 10%, and there are no
significant regressions.

llvm-svn: 96613
2010-02-18 21:33:05 +00:00
Dan Gohman
34b5cb7deb Make CodePlacementOpt detect special EH control flow by
checking whether AnalyzeBranch disagrees with the CFG
directly, rather than looking for EH_LABEL instructions.
EH_LABEL instructions aren't always at the end of the
block, due to FP_REG_KILL and other things. This fixes
an infinite loop compiling MultiSource/Benchmarks/Bullet.

llvm-svn: 96611
2010-02-18 21:25:53 +00:00
Chris Lattner
a2e094064f remove empty file
llvm-svn: 96573
2010-02-18 06:29:06 +00:00
Bob Wilson
84fc0200bd Use NEON vmin/vmax instructions for floating-point selects.
Radar 7461718.

llvm-svn: 96572
2010-02-18 06:05:53 +00:00
Evan Cheng
9af06dfc83 Some dag combiner goodness:
Transform br (xor (x, y)) -> br (x != y)
Transform br (xor (xor (x,y), 1)) -> br (x == y)
Also normalize (and (X, 1) == / != 1 -> (and (X, 1)) != / == 0 to match to "test on x86" and "tst on arm"

llvm-svn: 96556
2010-02-18 02:13:50 +00:00
Dan Gohman
3cb7dc5912 Don't check for comments, which vary between subtargets.
llvm-svn: 96434
2010-02-17 01:08:57 +00:00
Dan Gohman
493a1fcbe0 Don't attempt to divide INT_MIN by -1; consider such cases to
have overflowed.

llvm-svn: 96428
2010-02-17 00:41:53 +00:00
Chris Lattner
c87f9d6d1a roundss is an sse 4 thing, fix the test on non-sse41 builders
like llvm-gcc-x86_64-darwin10-selfhost

llvm-svn: 96417
2010-02-17 00:29:06 +00:00
Dale Johannesen
d147b9a4d4 Make g5 target explicit; scheduling affects register choice.
llvm-svn: 96413
2010-02-16 23:25:23 +00:00
Chris Lattner
0d35c68d5c fix rdar://7653908, a crash on a case where we would fold a load
into a roundss intrinsic, producing a cyclic dag.  The root cause
of this is badness handling ComplexPattern nodes in the old dagisel
that I noticed through inspection.  Eliminate a copy of the of the
code that handled ComplexPatterns by making EmitChildMatchCode call
into EmitMatchCode.

llvm-svn: 96408
2010-02-16 22:35:06 +00:00
Dale Johannesen
60d48aef7b Adjust register numbers in tests to compensate for the
new lack of R2.

llvm-svn: 96407
2010-02-16 22:31:31 +00:00
Chris Lattner
008f62bfa2 filecheckize
llvm-svn: 96404
2010-02-16 22:13:43 +00:00
Evan Cheng
ee44d6a752 Look for SSE and instructions of this form: (and x, (build_vector c1,c2,c3,c4)).
If there exists a use of a build_vector that's the bitwise complement of the mask,
then transform the node to
(and (xor x, (build_vector -1,-1,-1,-1)), (build_vector ~c1,~c2,~c3,~c4)).

Since this transformation is only useful when 1) the given build_vector will
become a load from constpool, and 2) (and (xor x -1), y) matches to a single
instruction, I decided this is appropriate as a x86 specific transformation.
rdar://7323335

llvm-svn: 96389
2010-02-16 21:09:44 +00:00
David Greene
c10133139e Add support for emitting non-temporal stores for DAGs marked
non-temporal.  Fix from r96241 for botched encoding of MOVNTDQ.

Add documentation for !nontemporal metadata.

Add a simpler movnt testcase.

llvm-svn: 96386
2010-02-16 20:50:18 +00:00
Bob Wilson
94eef3fc13 Fix pr6111: Avoid using the LR register for the target address of an indirect
branch in ARM v4 code, since it gets clobbered by the return address before
it is used.  Instead of adding a new register class containing all the GPRs
except LR, just use the existing tGPR class.

llvm-svn: 96360
2010-02-16 17:24:15 +00:00
Dan Gohman
d19ecedc40 Split the main for-each-use loop again, this time for GenerateTruncates,
as it also peeks at which registers are being used by other uses. This
makes LSR less sensitive to use-list order.

llvm-svn: 96308
2010-02-16 01:42:53 +00:00
Anton Korobeynikov
dccd240998 Preliminary patch to improve dwarf EH generation - Hooks to return Personality / FDE / LSDA / TType encoding depending on target / options (e.g. code model / relocation model) - MCIzation of Dwarf EH printer to use encoding information - Stub generation for ELF target (needed for indirect references) - Some other small changes here and there
llvm-svn: 96285
2010-02-15 22:35:59 +00:00
Jakob Stoklund Olesen
143339a43a Fix PR6300.
A virtual register can be used before it is defined in the same MBB if the MBB
is part of a loop. Teach the implicit-def pass about this case.

llvm-svn: 96279
2010-02-15 22:03:29 +00:00
Bob Wilson
01e8d35855 Last week we were generating code with duplicate induction variables in this
test, but the problem seems to have gone away today.  Add a check to make sure
it doesn't come back.

llvm-svn: 96277
2010-02-15 21:56:40 +00:00
Chris Lattner
2ce5f89c01 remove empty file.
llvm-svn: 96271
2010-02-15 21:14:50 +00:00
Chris Lattner
d7470aa340 revert r96241. It breaks two regression tests, isn't documented,
and the testcase needs improvement.

llvm-svn: 96265
2010-02-15 20:53:01 +00:00
Chris Lattner
a8505609fe fix PR6305 by handling BlockAddress in a helper function
called by jump threading.

llvm-svn: 96263
2010-02-15 20:47:49 +00:00
David Greene
ba8bac644b Add support for emitting non-temporal stores for DAGs marked
non-temporal.

llvm-svn: 96241
2010-02-15 17:02:56 +00:00
Jakob Stoklund Olesen
0a65533a38 Fix PR6283.
When coalescing with a physreg, remember to add imp-def and imp-kill when
dealing with sub-registers.

Also fix a related bug in VirtRegRewriter where substitutePhysReg may
reallocate the operand list on an instruction and invalidate the reg_iterator.
This can happen when a register is mentioned twice on the same instruction.

llvm-svn: 96072
2010-02-13 02:06:10 +00:00
Bob Wilson
5d66f81412 Besides removing phi cycles that reduce to a single value, also remove dead
phi cycles.  Adjust a few tests to keep dead instructions from being optimized
away.  This (together with my previous change for phi cycles) fixes Apple
radar 7627077.

llvm-svn: 96057
2010-02-13 00:31:44 +00:00
Dale Johannesen
ea96b2974f When save/restoring CR at prolog/epilog, in a large
stack frame, the prolog/epilog code was using the same
register for the copy of CR and the address of the save slot.  Oops.
This is fixed here for Darwin, sort of, by reserving R2 for this case.
A better way would be to do the store before the decrement of SP,
which is safe on Darwin due to the red zone.

SVR4 probably has the same problem, but I don't know how to fix it;
there is no red zone and R2 is already used for something else.
I'm going to leave it to someone interested in that target.

Better still would be to rewrite the CR-saving code completely;
spilling each CR subregister individually is horrible code.

llvm-svn: 96015
2010-02-12 21:35:34 +00:00
Anton Korobeynikov
c66de6687b Testcases for recent stdcall / fastcall mangling improvements
llvm-svn: 95982
2010-02-12 15:29:13 +00:00
Anton Korobeynikov
7073515c86 Cleanup stdcall / fastcall name mangling.
This should fix alot of problems we saw so far, e.g. PRs 5851 & 2936

llvm-svn: 95980
2010-02-12 15:28:40 +00:00
Dan Gohman
c40eb525ad Reapply the new LoopStrengthReduction code, with compile time and
bug fixes, and with improved heuristics for analyzing foreign-loop
addrecs.

This change also flattens IVUsers, eliminating the stride-oriented
groupings, which makes it easier to work with.

llvm-svn: 95975
2010-02-12 10:34:29 +00:00
Bob Wilson
2fd80c3d94 Add a new pass on machine instructions to optimize away PHI cycles that
reduce down to a single value.  InstCombine already does this transformation
but DAG legalization may introduce new opportunities.  This has turned out to
be important for ARM where 64-bit values are split up during type legalization:
InstCombine is not able to remove the PHI cycles on the 64-bit values but
the separate 32-bit values can be optimized.  I measured the compile time 
impact of this (running llc on 176.gcc) and it was not significant.

llvm-svn: 95951
2010-02-12 01:30:21 +00:00
Jakob Stoklund Olesen
b800ff8ca9 Reapply coalescer fix for better cross-class coalescing.
This time with fixed test cases.

llvm-svn: 95938
2010-02-11 23:55:29 +00:00
Mon P Wang
c17e781f35 The previous fix of widening divides that trap was too fragile as it depends on custom
lowering and requires that certain types exist in ValueTypes.h.  Modified widening to
check if an op can trap and if so, the widening algorithm will apply only the op on
the defined elements.  It is safer to do this in widening because the optimizer can't
guarantee removing unused ops in some cases.

llvm-svn: 95823
2010-02-10 23:37:45 +00:00
Bob Wilson
82d5534acc Delete dead PHI machine instructions. These can be created due to type
legalization even when the IR-level optimizer has removed dead phis, such
as when the high half of an i64 value is unused on a 32-bit target.
I had to adjust a few test cases that had dead phis.
This is a partial fix for Radar 7627077.

llvm-svn: 95816
2010-02-10 22:58:57 +00:00
Evan Cheng
8bee7fb61d Now that ShrinkDemandedOps() is separated out from DAG combine. It sometimes leave some obvious nops which dag combine used to clean up afterwards e.g. (trunk (ext n)) -> n. Look for them and squash them.
llvm-svn: 95757
2010-02-10 02:17:34 +00:00
Chris Lattner
340fe1f187 move tests that depend on the x86 backend out of codegen/generic,
and remove a few old and unreduced ones.  Fixes PR5624. 

llvm-svn: 95656
2010-02-09 06:41:03 +00:00
Chris Lattner
fdb9fda4af make target independent.
llvm-svn: 95655
2010-02-09 06:36:30 +00:00
Chris Lattner
28b79c686d merge a target-specific add test into x86 directory.
llvm-svn: 95654
2010-02-09 06:35:50 +00:00
Chris Lattner
34c420d11b merge another test in, drop the trivially constant folded cases.
llvm-svn: 95653
2010-02-09 06:33:27 +00:00
Chris Lattner
ddb2e5a05c consolidate and filecheckize two tests.
llvm-svn: 95652
2010-02-09 06:24:00 +00:00
Chris Lattner
e669789912 merge two tests, make target independent.
llvm-svn: 95651
2010-02-09 06:19:20 +00:00
Chris Lattner
20be5fb012 convert to filecheck.
llvm-svn: 95608
2010-02-08 23:47:34 +00:00
Chris Lattner
6162c89bfe add an x86 implementation of MCTargetExpr for
representing @GOT and friends.  Use it for
personality references as a first use.

llvm-svn: 95588
2010-02-08 22:09:08 +00:00
Dan Gohman
56b9ea088b When CodeGen'ing unoptimized code, there may be unfolded constant expressions
in global initializers. Instead of aborting, attempt to fold them on the
spot. If folding succeeds, emit the folded expression instead.

This fixes PR6255.

llvm-svn: 95583
2010-02-08 22:02:38 +00:00
Dan Gohman
f113e5466c In guaranteed tailcall mode, don't decline the tailcall optimization
for blocks ending in "unreachable".

llvm-svn: 95565
2010-02-08 20:34:14 +00:00
Evan Cheng
5541068ad3 Run codegen dce pass for all targets at all optimization levels. Previously it's
only run for x86 with fastisel. I've found it being very effective in
eliminating some obvious dead code as result of formal parameter lowering
especially when tail call optimization eliminated the need for some of the loads
from fixed frame objects. It also shrinks a number of the tests. A couple of
tests no longer make sense and are now eliminated.

llvm-svn: 95493
2010-02-06 09:07:11 +00:00
Evan Cheng
c3cfda4e7e Remove a large test case that (soon will) no longer make sense.
llvm-svn: 95492
2010-02-06 09:00:30 +00:00
Rafael Espindola
b0bb1ddfe3 Fix alignment on ppc linux. This fixes the build of crtend.o
llvm-svn: 95477
2010-02-06 03:32:21 +00:00
Evan Cheng
de1a4726e6 Do not emit callseq instructions around sibcalls. This eliminated some unnecessary stack adjustments.
llvm-svn: 95475
2010-02-06 03:28:46 +00:00
Bob Wilson
1a324958d6 Handle AddrMode6 (for NEON load/stores) in Thumb2's rewriteT2FrameIndex.
Radar 7614112.

llvm-svn: 95456
2010-02-06 00:24:38 +00:00
Jakob Stoklund Olesen
7b4c60adae Don't unroll loops containing function calls.
llvm-svn: 95454
2010-02-05 23:21:31 +00:00
Bill Wendling
c3f4101cc6 Make test more fucused eliminating extraneous bits.
llvm-svn: 95384
2010-02-05 11:21:05 +00:00
Evan Cheng
4b03f55de1 Fix test.
llvm-svn: 95373
2010-02-05 06:37:00 +00:00
Evan Cheng
81dde4c7f7 Handle tail call with byval arguments.
llvm-svn: 95351
2010-02-05 02:21:12 +00:00
Evan Cheng
94fe5501b7 When the scheduler unfold a load folding instruction it move some of the predecessors to the unfolded load. It decides what gets moved to the load by checking whether the new load is using the predecessor as an operand. The check neglects the cases whether the predecessor is a flagged scheduling unit.
rdar://7604000

llvm-svn: 95339
2010-02-05 01:27:11 +00:00
Bill Wendling
9761f067f8 An empty global constant (one of size 0) may have a section immediately
following it. However, the EmitGlobalConstant method wasn't emitting a body for
the constant. The assembler doesn't like that. Before, we were generating this:

  .zerofill __DATA, __common, __cmd, 1, 3

This fix puts us back to that semantic.

llvm-svn: 95336
2010-02-05 00:17:02 +00:00
Jakob Stoklund Olesen
d72e82107d Fix small bug in handling instructions with more than one implicitly defined operand.
ProcessImplicitDefs would only mark one operand per instruction with <undef>.
This fixed PR6086.

llvm-svn: 95319
2010-02-04 18:46:28 +00:00
Evan Cheng
f5ee7fb571 Re-enable x86 tail call optimization.
llvm-svn: 95295
2010-02-04 06:47:24 +00:00
Chris Lattner
e43007d443 add support for the sparcv9-*-* target triple to turn on
64-bit sparc codegen.  Patch by Nathan Keynes!

llvm-svn: 95293
2010-02-04 06:34:01 +00:00
Evan Cheng
5c8b1b9164 Speculatively disable x86 automatic tail call optimization while we track down a self-hosting issue.
llvm-svn: 95259
2010-02-03 21:40:40 +00:00
Evan Cheng
ccbbdfa8c4 Make test less fragile
llvm-svn: 95258
2010-02-03 21:39:04 +00:00
Evan Cheng
e273e42195 Revert 94937 and move the noreturn check to codegen.
llvm-svn: 95198
2010-02-03 03:55:59 +00:00
Evan Cheng
d9cf09b0d6 Allow all types of callee's to be tail called. But avoid automatic tailcall if the callee is a result of bitcast to avoid losing necessary zext / sext etc.
llvm-svn: 95195
2010-02-03 03:28:02 +00:00
Dale Johannesen
1e9d147461 Reapply 95050 with a tweak to check the register class.
llvm-svn: 95183
2010-02-03 01:40:33 +00:00
Chris Lattner
2b798aafd0 make these less sensitive to asm verbose changes by disabling it for them.
llvm-svn: 95175
2010-02-03 00:48:53 +00:00
Dale Johannesen
08ab638bdc Test revert 95050; there's a good chance it's causing
buildbot failure.

llvm-svn: 95103
2010-02-02 18:52:56 +00:00
Evan Cheng
fac0fdc6a0 Perform sibcall in some cases when arguments are passes memory. Look for cases
where callee's arguments are already in the caller's own caller's stack and
they line up perfectly. e.g.

extern int foo(int a, int b, int c);

int bar(int a, int b, int c) {
  return foo(a, b, c);
}

llvm-svn: 95053
2010-02-02 02:22:50 +00:00
Dale Johannesen
a20fc3d1a9 Make local RA smarter about reusing input register of a copy
as output.  Needed for (functional) correctness in inline asm,
and should be generally beneficial.  7361612.

llvm-svn: 95050
2010-02-02 02:08:02 +00:00
Evan Cheng
efa391da81 Fix PR6196. GV callee may not be a function.
llvm-svn: 95017
2010-02-01 22:40:09 +00:00
Dan Gohman
7b3c210a47 Update this test for a trivial register allocation difference.
llvm-svn: 94989
2010-02-01 19:00:32 +00:00
Evan Cheng
dcc1816642 Undo r94946 now all the tests are passing again.
llvm-svn: 94970
2010-02-01 02:13:39 +00:00
Evan Cheng
b5f97d871c Avoid recursive sibcall's.
llvm-svn: 94946
2010-01-31 06:44:49 +00:00
Anton Korobeynikov
f7651ec593 Fix a gross typo: ARMv6+ may or may not support unaligned memory operations.
Even if they are suported by the core, they can be disabled
(this is just a configuration bit inside some register).

Allow unaligned memops on darwin and conservatively disallow them otherwise.

llvm-svn: 94889
2010-01-30 14:08:12 +00:00
Evan Cheng
40ae22e14d Allow more tailcall optimization: calls with inputs that are all passed in registers.
llvm-svn: 94873
2010-01-30 01:22:00 +00:00
Evan Cheng
2cbd1b19db Catch more trivial tail call opportunities: no inputs and output types match.
llvm-svn: 94804
2010-01-29 06:45:59 +00:00
Chris Lattner
d7a8482810 convert the last 3 targets to use EmitFunctionBody() now that
it has before/end body hooks.

 lib/Target/Alpha/AsmPrinter/AlphaAsmPrinter.cpp |   49 ++-----------
 lib/Target/Mips/AsmPrinter/MipsAsmPrinter.cpp   |   87 ++++++------------------
 lib/Target/XCore/AsmPrinter/XCoreAsmPrinter.cpp |   56 +++------------
 test/CodeGen/XCore/ashr.ll                      |    2 
 4 files changed, 48 insertions(+), 146 deletions(-)

llvm-svn: 94741
2010-01-28 06:22:43 +00:00
Evan Cheng
7e26fdaa78 Fix a bug introduced by r94490 where it created a X86ISD::CMP whose output type is different from its inputs.
This fixes PR6146.

llvm-svn: 94731
2010-01-28 01:57:22 +00:00
Chris Lattner
95118672e3 Give AsmPrinter the most common expected implementation of
runOnMachineFunction, and switch PPC to use EmitFunctionBody.
The two ppc asmprinters now don't heave to define 
runOnMachineFunction.

llvm-svn: 94722
2010-01-28 01:28:58 +00:00
Chris Lattner
df1662f0e6 emit a 0 byte instead of a noop if a function is empty on darwin.
"0" is nice and target independent.

llvm-svn: 94718
2010-01-28 01:06:32 +00:00
Chandler Carruth
4b62a01a0c Quick fix to a test that is currently failing on every Linux build bot. No idea
if this is the "correct" fix, but it seems a strict improvement.

llvm-svn: 94675
2010-01-27 10:36:15 +00:00
Evan Cheng
381bc804d6 Perform trivial tail call optimization for callees with "C" ABI. These are done
even when -tailcallopt is not specified and it does not require changing ABI.
First case is the most trivial one. Perform tail call optimization when both
the caller and callee do not return values and when the callee does not take
any input arguments.

llvm-svn: 94664
2010-01-27 06:25:16 +00:00
Chris Lattner
ee2b6b1cc5 emit jump table an alias ".set" directives through MCStreamer as
assignments.

.set x, a-b

is the same as:

x = a-b

llvm-svn: 94596
2010-01-26 21:53:08 +00:00
Rafael Espindola
f46baf3304 Emit .comm alignment in bytes but .align in powers of 2 for ARM ELF.
Original patch by Sandeep Patel and updated by me.

llvm-svn: 94582
2010-01-26 20:21:43 +00:00
Chris Lattner
044439c9bc eliminate MCAsmInfo::NeedsSet: we now just use .set on any platform
that has it.

llvm-svn: 94581
2010-01-26 20:20:43 +00:00
Evan Cheng
548d00d77c Implement cond ? -1 : 0 with sbb.
llvm-svn: 94490
2010-01-26 02:00:44 +00:00
Rafael Espindola
575697fd65 Update test for darwin.
llvm-svn: 94421
2010-01-25 15:32:10 +00:00
Chris Lattner
2020423588 we removed support for darwin8 tools.
llvm-svn: 94414
2010-01-25 07:43:40 +00:00
Rafael Espindola
82a8b3efd4 Fix PR6134.
We are not emitting alignments on Darwin for "bar". Not sure what is the
correct way to do it.

llvm-svn: 94400
2010-01-25 02:27:39 +00:00
Daniel Dunbar
c1df55e99c Attempt to unbreak test on Linux. Chris, please check.
llvm-svn: 94399
2010-01-25 00:54:13 +00:00
Chris Lattner
6fdaf12267 just remove this test, it is not reduced, is not clear what its testing for and
it is dying due to fragility in the asmprinter .s comments.

llvm-svn: 94372
2010-01-24 19:23:09 +00:00
Mon P Wang
d4d1cbb72b It seems better to scalarize vectors of size 1 instead of widening them.
Add support to widen SETCC.

llvm-svn: 94342
2010-01-24 00:24:43 +00:00
Mon P Wang
871ea08e40 Improved widening loads by adding support for wider loads if
the alignment allows.  Fixed a bug where we didn't use a
vector load/store for PR5626.

llvm-svn: 94338
2010-01-24 00:05:03 +00:00
Chris Lattner
1b7c00a4f2 Change constantexpr global variable initializers to convert the constants
to MCExpr then emit them through MCStreamer with EmitValue.  I think all
global variable initializers are now going through mcstreamer.

llvm-svn: 94293
2010-01-23 06:17:14 +00:00
Eric Christopher
e6d6bfcc32 Don't lower splat vector load to relative to the esp if the
stack may be misaligned.

Update test accordingly.

Patch by Evan Cheng!

llvm-svn: 94291
2010-01-23 06:02:43 +00:00
Chris Lattner
e5e7b41090 stop testing for invalid output.
llvm-svn: 94288
2010-01-23 05:45:28 +00:00
Chris Lattner
20a336f1df emit .ascii and .asciz through MCStreamer.
llvm-svn: 94282
2010-01-23 04:54:10 +00:00
Chris Lattner
4515320f9f remove this test.
llvm-svn: 94276
2010-01-23 03:11:10 +00:00
Evan Cheng
9523142c01 Fix test.
llvm-svn: 94272
2010-01-23 01:21:27 +00:00
Evan Cheng
3bd9efa510 Fix tests.
llvm-svn: 94271
2010-01-23 01:19:28 +00:00
Chris Lattner
5f6aca81b6 make this less constrained, we want blank lines between globals.
llvm-svn: 94201
2010-01-22 19:51:08 +00:00
Dan Gohman
525f7d7833 Revert LoopStrengthReduce.cpp to pre-r94061 for now.
llvm-svn: 94123
2010-01-22 00:46:49 +00:00
Chris Lattner
75db03497a testcase for r94095
llvm-svn: 94096
2010-01-21 20:01:04 +00:00
Dan Gohman
be34c35f32 Re-implement the main strength-reduction portion of LoopStrengthReduction.
This new version is much more aggressive about doing "full" reduction in
cases where it reduces register pressure, and also more aggressive about
rewriting induction variables to count down (or up) to zero when doing so
reduces register pressure.

It currently uses fairly simplistic algorithms for finding reuse
opportunities, but it introduces a new framework allows it to combine
multiple strategies at once to form hybrid solutions, instead of doing
all full-reduction or all base+index.

llvm-svn: 94061
2010-01-21 02:09:26 +00:00
Chris Lattner
e4ac42e11e emit basic block labels with mcstreamer.
llvm-svn: 93993
2010-01-20 07:24:05 +00:00
Chris Lattner
d69a9cc334 emit integer and fp zeros as (e.g.) .byte 0 instead of .space 1,
for tidiness.

llvm-svn: 93992
2010-01-20 07:19:19 +00:00
Chris Lattner
3104fa4a71 signficant cleanups to EmitGlobalConstant (including streamerization
of int initializers), change some methods to be static functions,
use raw_ostream::write_hex instead of a smallstring dance with 
APValue::toStringUnsigned(S, 16).

llvm-svn: 93991
2010-01-20 07:11:32 +00:00
Dan Gohman
34b548b94a Fold (add x, shl(0 - y, n)) -> sub(x, shl(y, n)), to simplify some code
that SCEVExpander can produce when running on behalf of LSR.

llvm-svn: 93949
2010-01-19 23:30:49 +00:00
Dan Gohman
94c0e08951 Make SCEVAddRecExpr's getType return a pointer type when the add
has a pointer member. This helps reduce unnecessary bitcasting
and uglygeps.

llvm-svn: 93939
2010-01-19 22:53:50 +00:00
Dan Gohman
190fee462e Add nounwinds.
llvm-svn: 93919
2010-01-19 21:51:51 +00:00
Jakob Stoklund Olesen
e7d4286d73 Remove predicates when changing an add into an unpredicable mov.
Since the mov is executed unconditionally, make sure that the add didn't have
any predicate.

llvm-svn: 93909
2010-01-19 21:08:28 +00:00
Evan Cheng
4b916556a5 Do not extend extension results beyond the use of a PHI instruction at the start of a use block. A PHI use is expected to kill its source values.
llvm-svn: 93895
2010-01-19 19:45:51 +00:00
Chris Lattner
ff16ee18d9 don't let asm-verbose break the check-next lines in these tests.
llvm-svn: 93869
2010-01-19 06:39:54 +00:00
Chris Lattner
377bd87849 Now that we have everything nicely factored (e.g. asmprinter is not
doing global variable classification anymore) and hookized, sink almost
all target targets global variable emission code into AsmPrinter and out
of each target.

Some notes:

1. PIC16 does completely custom and crazy stuff, so it is not changed.
2. XCore has some custom handling for extra directives.  I'll look at it next.
3. This switches linux/ppc to use .globl instead of .global.  If .globl is
   actually wrong, let me know and I'll fix it.
4. This makes linux/ppc get a lot of random cases right which were obviously
   wrong before, it is probably now a bit healthier.
5. Blackfin will probably start getting .comm and other things that it didn't
   before.  If this is undesirable, it should explicitly opt out of these
   things by clearing the relevant fields of MCAsmInfo.

This leads to a nice diffstat:
 14 files changed, 127 insertions(+), 830 deletions(-)

llvm-svn: 93858
2010-01-19 05:38:33 +00:00
Chris Lattner
c42d723862 fix a significant difference between llvm and gcc on ELF systems:
GCC would put weak zero initialized mutable data in the .bss section,
we would put it into a crasy '.gnu.linkonce.b.test,"aw",@nobits' 
section.  Fixing this will allow simplifications next up.

llvm-svn: 93844
2010-01-19 03:06:01 +00:00
Chris Lattner
394370b299 there is no need to emit a .section above .comm on linux.
llvm-svn: 93842
2010-01-19 02:46:56 +00:00
Evan Cheng
572390be3b Test case for r93758.
llvm-svn: 93824
2010-01-19 00:35:20 +00:00
Evan Cheng
5cf9d23e4e Canonicalize -1 - x to ~x.
Instcombine does this but apparently there are situations where this pattern will escape the optimizer and / or created by isel. Here is a case that's seen in JavaScriptCore:
  %t1 = sub i32 0, %a
  %t2 = add i32 %t1, -1
The dag combiner pattern: ((c1-A)+c2) -> (c1+c2)-A
will fold it to -1 - %a.

llvm-svn: 93773
2010-01-18 21:38:44 +00:00
Chris Lattner
fb103355dd reduce this test and convert to filecheck, hopefully the linux buildbot
will tell me something more useful.

llvm-svn: 93688
2010-01-17 19:09:12 +00:00
Bob Wilson
72cf548263 The Neon "vtst" instruction takes a suffix that is the element size alone --
adding an "i" to the suffix, indicating that the elements are integers, is
accepted but not part of the standard syntax.  This helps us pass a few more
of the Neon tests from gcc.

llvm-svn: 93677
2010-01-17 06:35:17 +00:00
Kenneth Uildriks
d6b30baf78 When checking for sret-demotion, it needs to use legal types. When using the return value of an sret-demoted call, it needs to use possibly illegal types that match the declared Type of the callee.
llvm-svn: 93667
2010-01-16 23:37:33 +00:00
Chris Lattner
4d92f15423 this teestcase takes a long time to crash, remove it. If someone cares about this, they should file a bug, it's not doing any good as an xfail.
llvm-svn: 93604
2010-01-16 00:53:22 +00:00
Bob Wilson
3386047bdb Run the pre-register allocation tail duplication pass by default. Remove
the -pre-regalloc-taildup command-line option, and add a new
-disable-early-taildup option.

llvm-svn: 93597
2010-01-16 00:29:50 +00:00
David Greene
e52529d7cb Fix PR6019. A load has more than one use if it feeds a bitconvert that
has more than one use.

llvm-svn: 93576
2010-01-15 23:23:41 +00:00
Jim Grosbach
b09e69dd22 add testcase for r93564
llvm-svn: 93567
2010-01-15 22:27:37 +00:00
Anton Korobeynikov
7a7f5c50d8 Reenable tests
llvm-svn: 93555
2010-01-15 21:19:26 +00:00
Anton Korobeynikov
b4484a2bab Temporary disable tests
llvm-svn: 93501
2010-01-15 02:09:27 +00:00
Anton Korobeynikov
953a94cb69 Add variable-width shifts for MSP430
llvm-svn: 93468
2010-01-14 22:09:38 +00:00
Dan Gohman
7c596d2b00 Fix a codegen abort seen in 483.xalancbmk.
llvm-svn: 93417
2010-01-14 03:08:49 +00:00
Chris Lattner
b5605b72b7 this test requires SSE, thanks to jyasskin for pointing this out.
llvm-svn: 93360
2010-01-13 21:51:41 +00:00
Evan Cheng
0fa1e2d063 Commit some changes I had managed to lose last night while refactoring the code. Avoid change use of PHI instructions because it's not legal to insert any instructions before them.
This fixes PR6027.

llvm-svn: 93335
2010-01-13 19:16:39 +00:00
Evan Cheng
2afc417122 Re-enable extension optimization pass.
llvm-svn: 93313
2010-01-13 08:45:40 +00:00
Chris Lattner
1b6c061cd0 remove uses of deprecated functions, this generates slightly
different BlockAddress labels, but nothing semantically important.

Add a FIXME that BlockAddress codegen is broken if the LLVM BB has 
an empty name (e.g. strip was run).

llvm-svn: 93303
2010-01-13 07:30:49 +00:00
Evan Cheng
973fceab0c Disable opt-ext pass to unbreak the build for now.
llvm-svn: 93286
2010-01-13 01:51:43 +00:00
Jeffrey Yasskin
59ca529e20 Try to fix the ARM and PPC buildbots. The -mattr=vector-unaligned-mem
flag doesn't exist there, and this is an x86 test.

llvm-svn: 93279
2010-01-13 00:31:43 +00:00
Evan Cheng
76db3bb18e Add a quick pass to optimize sign / zero extension instructions. For targets where the pre-extension values are available in the subreg of the result of the extension, replace the uses of the pre-extension value with the result + extract_subreg.
For now, this pass is fairly conservative. It only perform the replacement when both the pre- and post- extension values are used in the block. It will miss cases where the post-extension values are live, but not used.

llvm-svn: 93278
2010-01-13 00:30:23 +00:00
Evan Cheng
0dddace5f1 Add nounwind.
llvm-svn: 93244
2010-01-12 18:29:23 +00:00
Duncan Sands
395053f13a Revert commit 93204, since it causes the assembler to barf
on x86-64 linux with messages like this:
Error: Incorrect register `%r14' used with `l' suffix

llvm-svn: 93242
2010-01-12 17:46:16 +00:00
Dan Gohman
a48d524fbc Make several tests less fragile.
llvm-svn: 93230
2010-01-12 04:52:47 +00:00
Dan Gohman
51b3e804dc Reapply the MOV64r0 patch, with a fix: MOV64r0 clobbers EFLAGS.
llvm-svn: 93229
2010-01-12 04:42:54 +00:00
Evan Cheng
a93b476689 Add manual ISD::OR fastisel selection routines. TableGen is no longer autogen them after 93152 and 93191.
llvm-svn: 93204
2010-01-11 22:59:27 +00:00
Evan Cheng
bd938ebc90 Extend r93152 to work on OR r, r. If the source set bits are known not to overlap, then select as an ADD instead.
llvm-svn: 93191
2010-01-11 22:03:29 +00:00
Chris Lattner
644f29ddf5 reduce this to a sensible testcase.
llvm-svn: 93189
2010-01-11 21:58:19 +00:00
David Greene
5d479fa341 Shorten up this testcase.
llvm-svn: 93187
2010-01-11 21:50:35 +00:00
Evan Cheng
bc84a42d7b Revert 93158. It's breaking quite a few x86_64 tests.
llvm-svn: 93185
2010-01-11 21:13:41 +00:00