mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 12:02:58 +02:00
Commit Graph

640 Commits

Author SHA1 Message Date
Evan Cheng
8d09e52569 GEP subscript is interpreted as a signed value.
llvm-svn: 32888
2007-01-05 01:46:20 +00:00
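
A minimal C sketch of why the signedness matters (hypothetical, not taken from the commit or its tests): a negative subscript must turn into a negative byte offset once the GEP index is sign extended.

#include <stdio.h>

/* p points one past arr[0], so p[-1] is well defined; the index -1 has to
 * be sign extended when the address is computed, giving an offset of
 * -sizeof(int) rather than a huge positive value. */
static int prev(int *p) { return p[-1]; }

int main(void) {
    int arr[2] = { 7, 9 };
    printf("%d\n", prev(&arr[1]));   /* prints 7 */
    return 0;
}
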
Chris Lattner
45c7c69c77 fix PowerPC/2007-01-04-ArgExtension.ll, a bug handling K&R prototypes with
the recent signless changes.

llvm-svn: 32884
2007-01-04 22:22:37 +00:00
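
For reference, a K&R-style definition of the general shape the test name suggests (a sketch only, not the actual PowerPC/2007-01-04-ArgExtension.ll source): with no prototype, narrow arguments arrive default-promoted to int, and the backend must extend them consistently.

#include <stdio.h>

/* Old-style (K&R) definition: callers promote 'c' and 's' to int, so the
 * code generator has to agree on how the narrow values are re-extended.
 * Compiles under pre-C23 dialects. */
static int sum(c, s)
    char c;
    short s;
{
    return c + s;
}

int main(void) {
    printf("%d\n", sum(-1, 300));   /* prints 299 on signed-char targets */
    return 0;
}
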
Reid Spencer
20d9040d17 Legalizer doesn't do an ANY_EXTEND if we don't ask for one, so make sure
that we default to an ANY_EXTEND if no parameter attribute is set on the
result value of a function.

llvm-svn: 32836
2007-01-03 16:49:33 +00:00
Reid Spencer
87c68ceb32 Restore previous behavior of defaulting to ZEXT. This works around two
things: (1) preventing PR1071 and (2) working around missing parameter
attributes for bool type. (2) will be fixed shortly. When PR1071 is fixed,
this patch should be undone.

llvm-svn: 32831
2007-01-03 05:03:05 +00:00
Reid Spencer
92c52fe025 Two changes:
1. Switch expression and cases are compared signed and are sign extended.
2. For function results needing extension, do SIGN_EXTEND if the SExtAttribute
   is set and ZERO_EXTEND if the ZExtAttribute is set, otherwise just let
   the Legalizer do ANY_EXTEND.
This fixes the recent regression in kimwitu++ and probably the llvm-gcc
bootstrap issue we had today.

llvm-svn: 32830
2007-01-03 04:25:33 +00:00
Reid Spencer
dda168599d For PR950:
Three changes:
1. Convert signed integer types to signless versions.
2. Implement the @sext and @zext parameter attributes. Previously the
   type of an function parameter was used to determine whether it should
   be sign extended or zero extended before the call. This information is
   now communicated via the function type's parameter attributes.
3. The interface to LowerCallTo had to be changed in order to accommodate
   the parameter attribute information. Although it would have been
   convenient to pass in the FunctionType itself, there isn't always one
   present in the caller. Consequently, a signedness indication for the
   result type and for each parameter was provided for in the interface
   to this method. All implementations were changed to make the adjustment
   necessary.

llvm-svn: 32788
2006-12-31 05:55:36 +00:00
Reid Spencer
4428c3483b For PR950:
This patch removes the SetCC instructions and replaces them with the ICmp
and FCmp instructions. The SetCondInst instruction has been removed and
been replaced with ICmpInst and FCmpInst.

llvm-svn: 32751
2006-12-23 06:05:41 +00:00
Evan Cheng
564ef2e7f4 getLoad() and getStore() calls missed SVOffset operand. Thanks to Dan Gohman
for pointing it out!

llvm-svn: 32712
2006-12-20 01:27:29 +00:00
Chris Lattner
8b1e4d1edf Fix PR1049 and CodeGen/Generic/2006-12-16-InlineAsmCrash.ll
by producing target constants instead of constants.  Constants can get
selected to li/movri instructions, which causes the scheduler to explode.

llvm-svn: 32633
2006-12-16 21:14:48 +00:00
Evan Cheng
10389e802b More soft-fp work.
llvm-svn: 32559
2006-12-13 20:57:08 +00:00
Reid Spencer
50702907eb Replace CastInst::createInferredCast calls with more accurate cast
creation calls.

llvm-svn: 32521
2006-12-13 00:50:17 +00:00
Evan Cheng
061d636b7a Expand i32/i64 CopyToReg f32/f64 to BIT_CONVERT + CopyToReg.
llvm-svn: 32493
2006-12-12 21:21:32 +00:00
Evan Cheng
df11bd2186 Expand formal arguments and call arguments recursively: e.g. f64 -> i64 -> 2 x i32.
llvm-svn: 32476
2006-12-12 07:27:38 +00:00
Anton Korobeynikov
e76b69846d Cleaned setjmp/longjmp lowering interfaces. Now we're producing correct
code (both asm & cbe) for the Mingw32 target.
Removed autoconf checks for underscored versions of setjmp/longjmp.

llvm-svn: 32415
2006-12-10 23:12:42 +00:00
Evan Cheng
a366d082b5 Preliminary soft float support.
llvm-svn: 32394
2006-12-09 02:42:38 +00:00
Bill Wendling
23b8b13c9d Removing even more <iostream> includes.
llvm-svn: 32320
2006-12-07 20:04:42 +00:00
Evan Cheng
0be56fab68 Fix for PR1023 by Dan Gohman.
llvm-svn: 32003
2006-11-29 01:58:12 +00:00
Evan Cheng
f38588a1cd Fix for PR1022 (folding loads of static initializers) by Dan Gohman.
llvm-svn: 32000
2006-11-29 01:38:07 +00:00
Chris Lattner
d00734a230 add a hook to allow targets to hack on inline asms to lower them to llvm
when they want to.

llvm-svn: 31997
2006-11-29 01:12:32 +00:00
Evan Cheng
98fa7ab4d7 Change MachineInstr ctor's to take a TargetInstrDescriptor reference instead
of opcode and number of operands.

llvm-svn: 31947
2006-11-27 23:37:22 +00:00
Reid Spencer
992d9788b3 For PR950:
The long awaited CAST patch. This introduces 12 new instructions into LLVM
to replace the cast instruction. Corresponding changes throughout LLVM are
provided. This passes llvm-test, llvm/test, and SPEC CPUINT2000 with the
exception of 175.vpr which fails only on a slight floating point output
difference.

llvm-svn: 31931
2006-11-27 01:05:10 +00:00
Reid Spencer
6e34ef887b For PR950:
First in a series of patches to convert SetCondInst into ICmpInst and
FCmpInst using only two opcodes and having the instructions contain their
predicate value. Nothing uses these classes yet. More patches to follow.

llvm-svn: 31867
2006-11-20 01:22:35 +00:00
Chris Lattner
286bb6e482 remove dead #include
llvm-svn: 31753
2006-11-15 17:51:15 +00:00
Chris Lattner
efb62464b9 commentate
llvm-svn: 31627
2006-11-10 04:41:34 +00:00
Reid Spencer
da1f5b882a For PR950:
This patch converts the old SHR instruction into two instructions,
AShr (Arithmetic) and LShr (Logical). The Shr instructions now are not
dependent on the sign of their operands.

llvm-svn: 31542
2006-11-08 06:47:33 +00:00
Reid Spencer
4bafa71dc1 For PR786:
Turn on -Wunused and -Wno-unused-parameter. Clean up most of the resulting
fall out by removing unused variables. Remaining warnings have to do with
unused functions (I didn't want to delete code without review) and unused
variables in generated code. Maintainers should clean up the remaining
issues when they see them. All changes pass DejaGnu tests and Olden.

llvm-svn: 31380
2006-11-02 20:25:50 +00:00
Reid Spencer
1abf69e923 For PR950:
Replace the REM instruction with UREM, SREM and FREM.

llvm-svn: 31369
2006-11-02 01:53:59 +00:00
Chris Lattner
35cd439221 Allow the getRegForInlineAsmConstraint method to return a register class with
no fixed physreg.  Treat this as permission to use any register in the register
class.  When this happens and it is safe, allow the llvm register allocator to
allocate the register instead of doing it at isel time.  This eliminates a ton
of copies around common inline asms.  For example:

int test2(int Y, int X) {
  asm("foo %0, %1" : "=r"(X): "r"(X));
  return X;
}

now compiles to:

_test2:
        foo r3, r4
        blr

instead of:

_test2:
        mr r2, r4
        foo r2, r2
        mr r3, r2
        blr

GCC produces:

_test2:
        foo r4, r4
        mr r3,r4
        blr

llvm-svn: 31366
2006-11-02 01:41:49 +00:00
Chris Lattner
850ef9955d Compile CodeGen/PowerPC/fp-branch.ll to:
_intcoord_cond_next55:
LBB1_3: ;cond_next55
        lis r2, ha16(LCPI1_0)
        lfs f0, lo16(LCPI1_0)(r2)
        fcmpu cr0, f1, f0
        blt cr0, LBB1_2 ;cond_next62.exitStub
LBB1_1: ;bb72.exitStub
        li r3, 1
        blr
LBB1_2: ;cond_next62.exitStub
        li r3, 0
        blr

instead of:

_intcoord_cond_next55:
LBB1_3: ;cond_next55
        lis r2, ha16(LCPI1_0)
        lfs f0, lo16(LCPI1_0)(r2)
        fcmpu cr0, f1, f0
        bge cr0, LBB1_1 ;bb72.exitStub
LBB1_4: ;cond_next55
        lis r2, ha16(LCPI1_0)
        lfs f0, lo16(LCPI1_0)(r2)
        fcmpu cr0, f1, f0
        bnu cr0, LBB1_2 ;cond_next62.exitStub
LBB1_1: ;bb72.exitStub
        li r3, 1
        blr
LBB1_2: ;cond_next62.exitStub
        li r3, 0
        blr

llvm-svn: 31330
2006-10-31 23:06:00 +00:00
Chris Lattner
3131b7d6d1 look through isunordered to inline it into branch blocks.
llvm-svn: 31328
2006-10-31 22:37:42 +00:00
Chris Lattner
a44a27dc47 TargetLowering::isOperandValidForConstraint
llvm-svn: 31319
2006-10-31 19:41:18 +00:00
Chris Lattner
a9f10b25cc Turn an assert into an error message. This is commonly triggered when
we don't support a specific constraint yet.  When this happens, print the
unsupported constraint.

llvm-svn: 31310
2006-10-31 07:33:13 +00:00
Evan Cheng
972f469722 Lower jumptable to BR_JT. The legalizer can lower it to a BRIND or let the target custom lower it.
llvm-svn: 31293
2006-10-30 08:00:44 +00:00
Chris Lattner
46a439780e fix Generic/2006-10-29-Crash.ll
llvm-svn: 31281
2006-10-29 21:01:20 +00:00
Chris Lattner
0a5d859ae3 Fix a load folding issue that Evan noticed: there is no need to export values
used by comparisons in the main block.

llvm-svn: 31279
2006-10-29 18:23:37 +00:00
Chris Lattner
776740f897 split critical edges more carefully and intelligently. In particular, critical
edges whose destinations are not phi nodes don't bother us.  Also, share
split edges, since the split edge can't have a phi.  This significantly
reduces the complexity of generated code in some cases.

llvm-svn: 31274
2006-10-28 19:22:10 +00:00
Chris Lattner
ee8a70f370 Split *all* critical edges before isel. This resolves issues with spill code
being inserted on unsplit critical edges, which introduces (sometimes large
amounts of) partially dead spill code.

This also fixes PR925 + CodeGen/Generic/switch-crit-edge-constant.ll

llvm-svn: 31260
2006-10-28 17:04:37 +00:00
Chris Lattner
e009249ae1 Fix a bug in merged condition handling (CodeGen/Generic/2006-10-27-CondFolding.ll).
Add many fewer CFG edges and PHI node entries.  If there is a switch which has
the same block as multiple destinations, only add that block once as a successor/phi
node (in the jumptable case).

llvm-svn: 31242
2006-10-27 23:50:33 +00:00
Chris Lattner
e2297cd5a6 remove debug code
llvm-svn: 31233
2006-10-27 21:58:03 +00:00
Chris Lattner
0fab97080c Codegen cond&cond with two branches. This compiles (f.e.) PowerPC/and-branch.ll to:
cmpwi cr0, r4, 4
        bgt cr0, LBB1_2 ;UnifiedReturnBlock
LBB1_3: ;entry
        cmplwi cr0, r3, 0
        bne cr0, LBB1_2 ;UnifiedReturnBlock

instead of:

        cmpwi cr7, r4, 4
        mfcr r2
        addic r4, r3, -1
        subfe r3, r4, r3
        rlwinm r2, r2, 30, 31, 31
        or r2, r2, r3
        cmplwi cr0, r2, 0
        bne cr0, LBB1_2 ;UnifiedReturnBlock
LBB1_1: ;cond_true

llvm-svn: 31232
2006-10-27 21:54:23 +00:00
Chris Lattner
05159debc4 Turn conditions like x<Y|z==q into multiple blocks.
This compiles Regression/CodeGen/X86/or-branch.ll into:

_foo:
        subl $12, %esp
        call L_bar$stub
        movl 20(%esp), %eax
        movl 16(%esp), %ecx
        cmpl $5, %eax
        jl LBB1_1       #cond_true
LBB1_3: #entry
        testl %ecx, %ecx
        jne LBB1_2      #UnifiedReturnBlock
LBB1_1: #cond_true
        call L_bar$stub
        addl $12, %esp
        ret
LBB1_2: #UnifiedReturnBlock
        addl $12, %esp
        ret

instead of:

_foo:
        subl $12, %esp
        call L_bar$stub
        movl 20(%esp), %eax
        movl 16(%esp), %ecx
        cmpl $4, %eax
        setg %al
        testl %ecx, %ecx
        setne %cl
        testb %cl, %al
        jne LBB1_2      #UnifiedReturnBlock
LBB1_1: #cond_true
        call L_bar$stub
        addl $12, %esp
        ret
LBB1_2: #UnifiedReturnBlock
        addl $12, %esp
        ret

And on ppc to:

        cmpwi cr0, r29, 5
        blt cr0, LBB1_1 ;cond_true
LBB1_3: ;entry
        cmplwi cr0, r30, 0
        bne cr0, LBB1_2 ;UnifiedReturnBlock

instead of:

        cmpwi cr7, r4, 4
        mfcr r2
        addic r4, r3, -1
        subfe r30, r4, r3
        rlwinm r29, r2, 30, 31, 31
        and r2, r29, r30
        cmplwi cr0, r2, 0
        bne cr0, LBB1_2 ;UnifiedReturnBlock

llvm-svn: 31230
2006-10-27 21:36:01 +00:00
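
The C shape being compiled is roughly the following (a guess at the test's intent, not the actual or-branch.ll source); with this change each half of the || gets its own compare-and-branch instead of two setcc results being or'd and tested.

#include <stdio.h>

static void bar(void) { puts("bar"); }

/* Each side of the short-circuit || becomes its own conditional branch,
 * matching the shape of the assembly shown above. */
static void foo(int x, int y) {
    bar();
    if (x < 5 || y == 0)
        bar();
}

int main(void) {
    foo(3, 1);   /* branch taken via x < 5 */
    foo(9, 0);   /* branch taken via y == 0 */
    return 0;
}
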
Reid Spencer
6833ffe8b8 For PR950:
Make necessary changes to support DIV -> [SUF]Div. This changes llvm to
have three division instructions: signed, unsigned, floating point. The
bytecode and assembler are backwards compatible, however.

llvm-svn: 31195
2006-10-26 06:15:43 +00:00
Chris Lattner
a797dd393f visitSwitchCase knows how to insert conditional branches well. Change
visitBr to just call visitSwitchCase, eliminating duplicate logic.

llvm-svn: 31167
2006-10-24 18:07:37 +00:00
Chris Lattner
79a1ca0a61 Generalize CaseBlock a bit more:
Rename LHSBB/RHSBB to TrueBB/FalseBB.  Allow the RHS value to be null,
in which case the LHS is treated as a bool.

llvm-svn: 31166
2006-10-24 17:57:59 +00:00
Chris Lattner
fc8e70297e generalize 'CaseBlock'. It really allows any comparison to be inserted.
llvm-svn: 31161
2006-10-24 17:03:35 +00:00
Chris Lattner
06236f7928 Minor tweak. Instead of generating:
movl 32(%esp), %eax
        cmpl $1, %eax
        je LBB1_1       #bb
LBB1_4: #entry
        cmpl $2, %eax
        je LBB1_2       #bb2
        jmp LBB1_3      #UnifiedReturnBlock
LBB1_1: #bb

notice that we would miss the fall through and emit this instead:

        movl 32(%esp), %eax
        cmpl $2, %eax
        je LBB1_2       #bb2
LBB1_4: #entry
        cmpl $1, %eax
        jne LBB1_3      #UnifiedReturnBlock
LBB1_1: #bb

llvm-svn: 31130
2006-10-23 18:38:22 +00:00
Chris Lattner
d91d082100 Fix phi node updating for switches lowered to linear sequences of branches.
llvm-svn: 31125
2006-10-22 23:00:53 +00:00
Chris Lattner
f592f04e3f disable this code for now, it's not yet safely updating phi nodes
llvm-svn: 31124
2006-10-22 22:47:10 +00:00
Chris Lattner
a07b38f113 Implement PR964 and Regression/CodeGen/Generic/SwitchLowering.ll
llvm-svn: 31119
2006-10-22 21:36:53 +00:00
Reid Spencer
d414793dbc For PR950:
This patch implements the first increment for the Signless Types feature.
All changes pertain to removing the ConstantSInt and ConstantUInt classes
in favor of just using ConstantInt.

llvm-svn: 31063
2006-10-20 07:07:24 +00:00
Bill Wendling
edce5ede57 Partially in response to PR926: insert the newly created machine basic
blocks into the basic block list when lowering the switch inst. into a
binary tree of if-then statements. This allows the "visitSwitchCase" func
to allow for fall-through behavior.

llvm-svn: 31057
2006-10-19 21:46:38 +00:00
Jim Laskey
06f4428abc Pass AliasAnalysis thru to DAGCombiner.
llvm-svn: 30984
2006-10-16 20:52:31 +00:00
Evan Cheng
fe5bb5dbe6 Merge ISD::TRUNCSTORE to ISD::STORE. Switch to using StoreSDNode.
llvm-svn: 30945
2006-10-13 21:14:26 +00:00
Andrew Lenharth
4b783303e5 Jump tables working again on alpha.
As a bonus, use the GOT node instead of the AlphaISD::GOT for internal stuff.

llvm-svn: 30873
2006-10-11 04:29:42 +00:00
Chris Lattner
aa1741fc87 add two helper methods.
llvm-svn: 30869
2006-10-11 03:58:02 +00:00
Evan Cheng
d22f3dd3ed Reflects ISD::LOAD / ISD::LOADX / LoadSDNode changes.
llvm-svn: 30844
2006-10-09 20:57:25 +00:00
Chris Lattner
b5b96302f2 jump tables handle pic
llvm-svn: 30776
2006-10-06 22:32:29 +00:00
Evan Cheng
275825195a Make use of getStore().
llvm-svn: 30759
2006-10-05 23:01:46 +00:00
Evan Cheng
5974db9813 Fix some typos that can cause a flag value to have more than one use.
llvm-svn: 30727
2006-10-04 22:23:53 +00:00
Chris Lattner
b512048344 refactor critical edge breaking out into the SplitCritEdgesForPHIConstants method.
This is a baby step towards fixing PR925.

llvm-svn: 30643
2006-09-28 06:17:10 +00:00
Andrew Lenharth
96c41b6c3c Comments on JumpTableness
llvm-svn: 30615
2006-09-26 20:02:30 +00:00
Andrew Lenharth
58f5a24f0c Add support for other relocation bases to jump tables, as well as custom asm directives
llvm-svn: 30593
2006-09-24 19:45:58 +00:00
Evan Cheng
2086ffb27b PIC jump table entries are always 32-bit. This fixes PIC jump table support on X86-64.
llvm-svn: 30590
2006-09-24 05:22:38 +00:00
Andrew Lenharth
00bbd5641b absolute addresses must match pointer size
llvm-svn: 30461
2006-09-18 17:59:35 +00:00
Chris Lattner
fdf4c06dac If LSR went through a lot of trouble to put constants (e.g. the addr of a global)
in a specific BB, don't undo this!  This allows us to compile
CodeGen/X86/loop-hoist.ll into:

_foo:
        xorl %eax, %eax
***     movl L_Arr$non_lazy_ptr, %ecx
        movl 4(%esp), %edx
LBB1_1: #cond_true
        movl %eax, (%ecx,%eax,4)
        incl %eax
        cmpl %edx, %eax
        jne LBB1_1      #cond_true
LBB1_2: #return
        ret

instead of:

_foo:
        xorl %eax, %eax
        movl 4(%esp), %ecx
LBB1_1: #cond_true
***     movl L_Arr$non_lazy_ptr, %edx
        movl %eax, (%edx,%eax,4)
        incl %eax
        cmpl %ecx, %eax
        jne LBB1_1      #cond_true
LBB1_2: #return
        ret

This was noticed in 464.h264ref.  This doesn't usually affect PPC,
but strikes X86 all the time.

llvm-svn: 30290
2006-09-13 06:02:42 +00:00
Chris Lattner
480465a171 This code was trying too hard. By eliminating redundant edges in the CFG
due to switch cases going to the same place, it makes #pred != #phi entries,
breaking live interval analysis.

This fixes 458.sjeng on x86 with llc.

llvm-svn: 30236
2006-09-10 06:36:57 +00:00
Chris Lattner
b935214653 Implement the fpowi now by lowering to a libcall
llvm-svn: 30225
2006-09-09 06:03:30 +00:00
Chris Lattner
8b75d6e068 Fix CodeGen/Generic/2006-09-06-SwitchLowering.ll, a bug where SDIsel inserted
too many phi operands when lowering a switch to branches in some cases.

llvm-svn: 30142
2006-09-07 01:59:34 +00:00
Chris Lattner
9cd4e3429e Completely eliminate def&use operands. Now a register operand is EITHER a
def operand or a use operand.

llvm-svn: 30109
2006-09-05 02:31:13 +00:00
Chris Lattner
33bd5dcfb7 s|llvm/Support/Visibility.h|llvm/Support/Compiler.h|
llvm-svn: 29911
2006-08-27 12:54:02 +00:00
Chris Lattner
62b0dcb385 minor changes.
llvm-svn: 29740
2006-08-16 22:57:46 +00:00
Chris Lattner
44d58ded54 eliminate use of getNode that takes vector of valuetypes.
llvm-svn: 29687
2006-08-14 23:53:35 +00:00
Chris Lattner
7b1362fa52 Start eliminating temporary vectors used to create DAG nodes. Instead, pass
in the start of an array and a count of operands where applicable.  In many
cases, the number of operands is known, so this static array can be allocated
on the stack, avoiding the heap.  In many other cases, a SmallVector can be
used, which has the same benefit in the common cases.

I updated a lot of code calling getNode that takes a vector, but ran out of
time.  The rest of the code should be updated, and these methods should be
removed.

We should also do the same thing to eliminate the methods that take a
vector of MVT::ValueTypes.

It would be extra nice to convert the dagiselemitter to avoid creating vectors
for operands when calling getTargetNode.

llvm-svn: 29566
2006-08-08 02:23:42 +00:00
Chris Lattner
8cc175963d Work around a GCC 3.3.5 bug noticed by a user.
llvm-svn: 29490
2006-08-03 00:18:59 +00:00
Jim Laskey
6d121090d3 Final polish on machine pass registries.
llvm-svn: 29471
2006-08-02 12:30:23 +00:00
Jim Laskey
f9f462bc5e Now that the ISel is available, it's possible to create a default instruction
scheduler creator.

llvm-svn: 29452
2006-08-01 19:14:14 +00:00
Jim Laskey
f5e160063e 1. Change use of "Cache" to "Default".
2. Added argument to instruction scheduler creators so the creators can do
special things.
3. Repaired target hazard code.
4. Misc.

More to follow.

llvm-svn: 29450
2006-08-01 18:29:48 +00:00
Jim Laskey
b92b14f422 Introducing plugable register allocators and instruction schedulers.
llvm-svn: 29434
2006-08-01 14:21:23 +00:00
Evan Cheng
c1483b5e72 PIC jump table entries are always 32-bit even in 64-bit mode.
llvm-svn: 29422
2006-08-01 01:03:13 +00:00
Nate Begeman
952d922bf1 Code cleanups, per review
llvm-svn: 29347
2006-07-27 16:46:58 +00:00
Nate Begeman
3d5f5b4e8b Support jump tables when in PIC relocation model
llvm-svn: 29318
2006-07-27 01:13:04 +00:00
Chris Lattner
77e6cda5d0 Mems can be in the output list also. This is the second half of a fix for
PR833

llvm-svn: 29224
2006-07-20 19:02:21 +00:00
Chris Lattner
496bd3fbf6 Use hidden visibility to make symbols in an anonymous namespace get
dropped.  This shrinks libllvmgcc.dylib another 67K

llvm-svn: 28975
2006-06-28 23:17:24 +00:00
Evan Cheng
c4d878dc56 Consistency. EXTRACT_ELEMENT index operand should have ptr type.
llvm-svn: 28795
2006-06-15 08:11:54 +00:00
Chris Lattner
2d4ba3f9ca Make sure to update the CFG correctly if a switch only has a default dest.
This fixes CodeGen/Generic/2006-06-12-LowerSwitchCrash.ll

llvm-svn: 28755
2006-06-12 18:25:29 +00:00
Chris Lattner
a05c244289 Fix X86/inline-asm.ll:test2, a case where an input value was implicitly
truncated.

llvm-svn: 28733
2006-06-08 18:27:11 +00:00
Chris Lattner
963fd11359 Fix Regression/CodeGen/X86/inline-asm.ll, a case where inline asm causes
implicit extension of a register.

llvm-svn: 28731
2006-06-08 18:22:48 +00:00
Evan Cheng
58bfb4e600 Make CALL node consistent with RET node. Signness of value has type MVT::i32
instead of MVT::i1. Either is fine except MVT::i32 is probably a legal type
for most (if not all) platforms while MVT::i1 is not.

llvm-svn: 28511
2006-05-26 23:13:20 +00:00
Evan Cheng
f53c4cc192 Change RET node to include signness information of the return values. e.g.
RET chain, value1, sign1, value2, sign2

llvm-svn: 28509
2006-05-26 23:09:09 +00:00
Evan Cheng
920d9c86f1 CALL node change: now including signness of every argument.
llvm-svn: 28461
2006-05-25 00:55:32 +00:00
Evan Cheng
6909147763 -enable-unsafe-fp-math implies -enable-finite-only-fp-math
llvm-svn: 28437
2006-05-23 18:18:46 +00:00
Vladimir Prus
125917432a Fix missing include
llvm-svn: 28435
2006-05-23 13:43:15 +00:00
Evan Cheng
19e9e63dcd Incorrect SETCC CondCode used for FP comparisons.
llvm-svn: 28433
2006-05-23 06:40:47 +00:00
Chris Lattner
b2ecc1b1e7 Fix the result of the call to use a correct vbitconvert. There is no need to
use getPackedTypeBreakdown at all here.

llvm-svn: 28365
2006-05-17 20:49:36 +00:00
Chris Lattner
924f3fed13 Correct a previous patch which broke CodeGen/PowerPC/vec_call.ll
llvm-svn: 28364
2006-05-17 20:43:21 +00:00
Evan Cheng
00f391f87b Fixed a LowerCallTo and LowerArguments bug. They were introducing illegal
VBIT_VECTOR nodes. There was some confusion about the semantics of
getPackedTypeBreakdown(). e.g. for <4 x f32> it returns 1 and v4f32, not 4,
and f32.

llvm-svn: 28352
2006-05-17 18:16:39 +00:00
Chris Lattner
30f61c8bb1 Add support for calls that pass and return legal vectors.
llvm-svn: 28340
2006-05-16 23:39:44 +00:00
Chris Lattner
01f4f28837 Add a new ISD::CALL node, make the default impl of TargetLowering::LowerCallTo
produce it.

llvm-svn: 28338
2006-05-16 22:53:20 +00:00
Chris Lattner
ba1dfc1da7 Add a chain to FORMAL_ARGUMENTS. This is a minimal port of the X86 backend,
it doesn't currently use/maintain the chain properly.  Also, make the
X86ISelLowering.cpp file 80-col clean.

llvm-svn: 28320
2006-05-16 06:45:34 +00:00
Chris Lattner
fdd62b9073 Move function-live-in-handling code from the sdisel code to the scheduler.
This code should be emitted after legalize, so it can't be in sdisel.

Note that the EmitFunctionEntryCode hook should be updated to operate on the
DAG.  The X86 backend is the only one currently using this hook.

llvm-svn: 28315
2006-05-16 06:10:58 +00:00
Evan Cheng
7bb257e178 Revert an un-intended change
llvm-svn: 28278
2006-05-13 05:53:47 +00:00
Chris Lattner
9e29384a4b Remove dead vars
llvm-svn: 28255
2006-05-12 18:06:45 +00:00
Evan Cheng
cb2a0f392c Refactor scheduler code. Move register-reduction list scheduler to a
separate file. Added an initial implementation of top-down register pressure
reduction list scheduler.

llvm-svn: 28226
2006-05-11 23:55:42 +00:00
Nate Begeman
1f359b07de Make emission of jump tables a bit less conservative; they are now required
to be only 31.25% dense, rather than 75% dense.

llvm-svn: 28165
2006-05-08 16:51:36 +00:00
Chris Lattner
02d3258820 When inserting casts, be careful of where we put them. We cannot insert
a cast immediately before a PHI node.

This fixes Regression/CodeGen/Generic/2006-05-06-GEP-Cast-Sink-Crash.ll

llvm-svn: 28143
2006-05-06 09:10:37 +00:00
Chris Lattner
229d3e3c2d More aggressively sink GEP offsets into loops. For example, before we
generated:

        movl 8(%esp), %eax
        movl %eax, %edx
        addl $4316, %edx
        cmpb $1, %cl
        ja LBB1_2       #cond_false
LBB1_1: #cond_true
        movl L_QuantizationTables720$non_lazy_ptr, %ecx
        movl %ecx, (%edx)
        movl L_QNOtoQuantTableShift720$non_lazy_ptr, %edx
        movl %edx, 4460(%eax)
        ret
...

Now we generate:

        movl 8(%esp), %eax
        cmpb $1, %cl
        ja LBB1_2       #cond_false
LBB1_1: #cond_true
        movl L_QuantizationTables720$non_lazy_ptr, %ecx
        movl %ecx, 4316(%eax)
        movl L_QNOtoQuantTableShift720$non_lazy_ptr, %ecx
        movl %ecx, 4460(%eax)
        ret

... which uses one fewer register.

llvm-svn: 28129
2006-05-05 21:17:49 +00:00
Chris Lattner
56694d0bb0 Sink noop copies into the basic block that uses them. This reduces the number
of cross-block live ranges, and allows the bb-at-a-time selector to always
coalesce these away, at isel time.

This reduces the load on the coalescer and register allocator.  For example
on a codec on X86, we went from:

   1643 asm-printer           - Number of machine instrs printed
    419 liveintervals         - Number of loads/stores folded into instructions
   1144 liveintervals         - Number of identity moves eliminated after coalescing
   1022 liveintervals         - Number of interval joins performed
    282 liveintervals         - Number of intervals after coalescing
   1304 liveintervals         - Number of original intervals
     86 regalloc              - Number of times we had to backtrack
1.90232 regalloc              - Ratio of intervals processed over total intervals
     40 spiller               - Number of values reused
    182 spiller               - Number of loads added
    121 spiller               - Number of stores added
    132 spiller               - Number of register spills
      6 twoaddressinstruction - Number of instructions commuted to coalesce
    360 twoaddressinstruction - Number of two-address instructions

to:

   1636 asm-printer           - Number of machine instrs printed
    403 liveintervals         - Number of loads/stores folded into instructions
   1155 liveintervals         - Number of identity moves eliminated after coalescing
   1033 liveintervals         - Number of interval joins performed
    279 liveintervals         - Number of intervals after coalescing
   1312 liveintervals         - Number of original intervals
     76 regalloc              - Number of times we had to backtrack
1.88998 regalloc              - Ratio of intervals processed over total intervals
      1 spiller               - Number of copies elided
     41 spiller               - Number of values reused
    191 spiller               - Number of loads added
    114 spiller               - Number of stores added
    128 spiller               - Number of register spills
      4 twoaddressinstruction - Number of instructions commuted to coalesce
    356 twoaddressinstruction - Number of two-address instructions

On this testcase, this change provides a modest reduction in spill code,
regalloc iterations, and total instructions emitted.  It increases the number
of register coalesces.

llvm-svn: 28115
2006-05-05 01:04:50 +00:00
Nate Begeman
95a4f191e0 Finish up the initial jump table implementation by allowing jump tables to
not be 100% dense.  Increase the minimum threshold for the number of cases
in a switch statement from 4 to 6 in order to create a jump table.

llvm-svn: 28079
2006-05-03 03:48:02 +00:00
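
As a rough illustration (an assumed shape, not from the commit), a switch now needs at least six cases and only has to be reasonably dense to become a jump table:

#include <stdio.h>

/* Six cases over the range 0..6 with one hole at 4: this meets the new
 * minimum case count and no longer has to be 100% dense to get a table. */
static int classify(int v) {
    switch (v) {
    case 0: return 10;
    case 1: return 11;
    case 2: return 12;
    case 3: return 13;
    case 5: return 15;
    case 6: return 16;
    default: return -1;
    }
}

int main(void) {
    printf("%d %d\n", classify(2), classify(4));   /* prints 12 -1 */
    return 0;
}
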
Owen Anderson
71bc529dfa Refactor TargetMachine, pushing handling of TargetData into the target-specific subclasses. This has one caller-visible change: getTargetData() now returns a pointer instead of a reference.
This fixes PR 759.

llvm-svn: 28074
2006-05-03 01:29:57 +00:00
Evan Cheng
9f21c2daf4 Remove the temporary option: -no-isel-fold-inflight
llvm-svn: 28012
2006-04-28 18:54:11 +00:00
Evan Cheng
f843942504 TargetLowering::LowerArguments should return a VBIT_CONVERT of
FORMAL_ARGUMENTS SDOperand in the return result vector.

llvm-svn: 28009
2006-04-28 05:25:15 +00:00
Evan Cheng
ff5a88b62e Added a temporary option -no-isel-fold-inflight to control whether an "inflight"
node can be folded.

llvm-svn: 28003
2006-04-28 02:09:19 +00:00
Evan Cheng
a6081bc674 Insert a VBIT_CONVERT between a FORMAL_ARGUMENT node and its vector uses
(VAND, VADD, etc.). Legalizer will assert otherwise.

llvm-svn: 27991
2006-04-27 08:29:42 +00:00
Evan Cheng
b54140b0d6 Don't forget return void.
llvm-svn: 27974
2006-04-25 23:03:35 +00:00
Nate Begeman
7bb910dbe7 Fix the updating of the machine CFG when a PHI node was in a successor of
the jump table's range check block.  This re-enables 100% dense jump tables
by default on PPC & x86

llvm-svn: 27952
2006-04-23 06:26:20 +00:00
Nate Begeman
6cfb33a42c Turn off jump tables for a bit, there are still some issues to work out with
updating the machine CFG.

llvm-svn: 27949
2006-04-22 23:51:56 +00:00
Nate Begeman
7ed816f900 JumpTable support! What this represents is working asm and jit support for
x86 and ppc for 100% dense switch statements when relocations are non-PIC.
This support will be extended and enhanced in the coming days to support
PIC, and less dense forms of jump tables.

llvm-svn: 27947
2006-04-22 18:53:45 +00:00
Chris Lattner
8f7c394f3a The BFS scheduler is apparently nondeterministic (causes many llvmgcc bootstrap
miscompares).  Switch RISC targets to use the list-td scheduler, which isn't.

llvm-svn: 27933
2006-04-21 17:16:16 +00:00
Chris Lattner
70d68fcfcb Implement support for the formal_arguments node. To get this, targets should custom legalize it and remove their XXXTargetLowering::LowerArguments overload
llvm-svn: 27604
2006-04-12 16:20:43 +00:00
Chris Lattner
2c57f7285e Add code generator support for VSELECT
llvm-svn: 27542
2006-04-08 22:22:57 +00:00
Chris Lattner
fc546e1780 Codegen shufflevector as VVECTOR_SHUFFLE
llvm-svn: 27529
2006-04-08 04:15:24 +00:00
Chris Lattner
234481e1ba Stub out shufflevector
llvm-svn: 27514
2006-04-08 01:19:25 +00:00
Chris Lattner
fe2926cf46 Make a vector live across blocks have the correct Vec type. This fixes
CodeGen/X86/2006-04-04-CrossBlockCrash.ll

llvm-svn: 27436
2006-04-05 06:54:42 +00:00
Chris Lattner
389e309bfb Intrinsics that just load from memory can be treated like loads: they don't
have to serialize against each other.  This allows us to schedule lvx's
across each other, for example.

llvm-svn: 27346
2006-04-02 03:41:14 +00:00
Chris Lattner
badebf1c9b Add a new -view-legalize-dags command line option
llvm-svn: 27342
2006-04-02 03:07:27 +00:00
Chris Lattner
13e8d5973c Prefer larger register classes over smaller ones when a register occurs in
multiple register classes.  This fixes PowerPC/2006-04-01-FloatDoubleExtend.ll

llvm-svn: 27334
2006-04-02 00:24:45 +00:00
Chris Lattner
f30369b9b1 Make sure to pass enough values to phi nodes when we are dealing with
decimated vectors.  This fixes UnitTests/Vector/sumarray-dbl.c

llvm-svn: 27280
2006-03-31 02:12:18 +00:00
Chris Lattner
82a0e17dd7 Significantly improve handling of vectors that are live across basic blocks,
handling cases where the vector elements need promotion, expansion, and when
the vector type itself needs to be decimated.

llvm-svn: 27278
2006-03-31 02:06:56 +00:00
Chris Lattner
9a46d1605c Bug fixes: handle constantexpr insert/extract element operations
Handle constantpacked vectors with constantexpr elements.

This fixes CodeGen/Generic/vector-constantexpr.ll

llvm-svn: 27241
2006-03-29 00:11:43 +00:00
Jim Laskey
a9e74309d9 More bulletproofing of llvm.dbg.declare.
llvm-svn: 27224
2006-03-28 13:45:20 +00:00
Chris Lattner
d5da541d42 Tblgen doesn't like multiple SDNode<> definitions that map to the same enum value. Split them into separate enums.
llvm-svn: 27201
2006-03-28 00:40:33 +00:00
Jim Laskey
d2619bb575 Reactivate llvm.dbg.declare.
llvm-svn: 27192
2006-03-27 23:31:10 +00:00
Chris Lattner
0b857e9ffc Disable dbg_declare, it currently breaks the CFE build
llvm-svn: 27182
2006-03-27 21:36:03 +00:00
Nate Begeman
3d518334b9 SelectionDAGISel can now natively handle Switch instructions, in the same
manner that the LowerSwitch LLVM to LLVM pass does: emitting a binary
search tree of basic blocks.  The new approach has several advantages:
it is faster, it generates significantly smaller code in many cases, and
it paves the way for implementing dense switch tables as a jump table by
handling switches directly in the instruction selector.

This functionality is currently only enabled on x86, but should be safe for
every target.  In anticipation of making it the default, the cfg is now
properly updated in the x86, ppc, and sparc select lowering code.

llvm-svn: 27156
2006-03-27 01:32:24 +00:00
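
A sparse switch of the kind that benefits (an illustrative sketch, not one of the tests): values too spread out for a table can be handled as a balanced tree of compares rather than a linear chain.

#include <stdio.h>

/* The case values span 10..9999, far too sparse for a jump table; the
 * selector can split the range around a middle value and recurse,
 * giving a logarithmic number of compares per lookup. */
static const char *name(int code) {
    switch (code) {
    case 10:   return "ten";
    case 75:   return "seventy-five";
    case 200:  return "two hundred";
    case 404:  return "not found";
    case 1000: return "one thousand";
    case 9999: return "largest";
    default:   return "other";
    }
}

int main(void) {
    printf("%s\n", name(404));   /* prints: not found */
    return 0;
}
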
Jim Laskey
fb55c2c6a7 Bullet proof against undefined args produced by upgrading old-style debug info.
llvm-svn: 27155
2006-03-26 22:46:27 +00:00
Chris Lattner
1e4694ed84 fix inverted conditional
llvm-svn: 27089
2006-03-24 22:49:42 +00:00
Jim Laskey
0d63725a26 Rename for truth in advertising.
llvm-svn: 27063
2006-03-24 09:50:27 +00:00
Chris Lattner
041588d953 Lower target intrinsics into an INTRINSIC node
llvm-svn: 27035
2006-03-24 02:22:33 +00:00
Jim Laskey
b0ebfecdbf Handle new forms of llvm.dbg intrinsics.
llvm-svn: 26988
2006-03-23 18:06:46 +00:00
Chris Lattner
2ef03f3a43 Fix a typo
llvm-svn: 26965
2006-03-22 22:20:49 +00:00
Chris Lattner
7eaf8fa583 Implement simple support for vector casting. This can currently only handle
casts between legal vector types.

llvm-svn: 26961
2006-03-22 20:09:35 +00:00
Chris Lattner
b90d9241c1 add some trivial support for extractelement.
llvm-svn: 26928
2006-03-21 20:44:12 +00:00
Chris Lattner
128089ae91 Add a hacky workaround for crashes due to vectors live across blocks.
Note that this code won't work for vectors that aren't legal on the
target.  Improvements coming.

llvm-svn: 26925
2006-03-21 19:20:37 +00:00
Chris Lattner
bf4033f63a implement basic support for INSERT_VECTOR_ELT.
llvm-svn: 26849
2006-03-19 01:17:20 +00:00
Chris Lattner
db243940bd Rename ConstantVec -> BUILD_VECTOR and VConstant -> VBUILD_VECTOR. Allow *BUILD_VECTOR to take variable inputs.
llvm-svn: 26847
2006-03-19 00:52:58 +00:00
Chris Lattner
855e3c878b implement vector.ll:test_undef
llvm-svn: 26845
2006-03-19 00:20:20 +00:00
Chris Lattner
35b6933508 Change the structure of lowering vector stuff. Note: This breaks some
things.

llvm-svn: 26840
2006-03-18 01:44:44 +00:00
Nate Begeman
42736d46b2 Remove BRTWOWAY*
Make the PPC backend not dependent on BRTWOWAY_CC and make the branch
selector smarter about the code it generates, fixing a case in the
readme.

llvm-svn: 26814
2006-03-17 01:40:33 +00:00
Chris Lattner
7e80a5a16b Fix a problem fully scalarizing values.
llvm-svn: 26811
2006-03-16 23:05:19 +00:00
Chris Lattner
413bb13b27 Add support for CopyFromReg from vector values. Note: this doesn't support
illegal vector types yet!

llvm-svn: 26799
2006-03-16 19:57:50 +00:00
Chris Lattner
34f2e70f60 Teach CreateRegForValue how to handle vector types.
llvm-svn: 26798
2006-03-16 19:51:18 +00:00
Chris Lattner
95577cb3cd add support for vector->vector casts
llvm-svn: 26788
2006-03-15 22:19:46 +00:00
Jim Laskey
c741139c24 Handle the removal of the debug chain.
llvm-svn: 26729
2006-03-13 13:07:37 +00:00
Evan Cheng
57c206232d Added a parameter to control whether Constant::getStringValue() would chop
off the result string at the first null terminator.

llvm-svn: 26704
2006-03-10 23:52:03 +00:00
Chris Lattner
02e6c29ad2 scrape out bits of llvm-db
llvm-svn: 26701
2006-03-10 22:48:19 +00:00
Chris Lattner
2024573f47 Simplify the interface to the schedulers, to not pass the selected heuristic in.
llvm-svn: 26692
2006-03-10 07:49:12 +00:00
Chris Lattner
dc7268f6a3 remove dbg_declare, it's not used yet.
llvm-svn: 26659
2006-03-09 20:02:42 +00:00
Jim Laskey
a1ad999bba Get rid of the multiple copies of getStringValue. Now a Constant:: method.
llvm-svn: 26616
2006-03-08 18:11:07 +00:00
Chris Lattner
3f23d22d3f Change the interface for getting a target HazardRecognizer to be more clean.
llvm-svn: 26608
2006-03-08 04:25:59 +00:00
Chris Lattner
c6d1fbab70 Hoist the HazardRecognizer out of the ScheduleDAGList.cpp file to where
targets can implement them.  Make the top-down scheduler non-g5-specific.

Remove the old testing hazard recognizer.

llvm-svn: 26569
2006-03-06 00:22:00 +00:00
Chris Lattner
ef4bef419b Split the list scheduler into top-down and bottom-up pieces. The priority
function of the top-down scheduler are completely bogus currently, and
having (future) PPC specific in this file is also wrong, but this is a
small incremental step.

llvm-svn: 26552
2006-03-05 21:10:33 +00:00
Chris Lattner
4b4b3e6cbb Codegen copysign[f] into a FCOPYSIGN node
llvm-svn: 26542
2006-03-05 05:09:38 +00:00
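
The C-level pattern involved (a small sketch, nothing more): plain calls to copysign/copysignf are what get recognized and turned into the FCOPYSIGN node instead of remaining libcalls.

#include <math.h>
#include <stdio.h>

int main(void) {
    /* copysign(x, y) returns x's magnitude with y's sign. */
    double a = copysign(3.0, -1.0);    /* -3.0 */
    float  b = copysignf(2.5f, 4.0f);  /*  2.5 */
    printf("%g %g\n", a, (double)b);
    return 0;
}
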
Evan Cheng
4afe769361 Add more vector NodeTypes: VSDIV, VUDIV, VAND, VOR, and VXOR.
llvm-svn: 26504
2006-03-03 07:01:07 +00:00
Chris Lattner
999aa36a04 remove the read/write port/io intrinsics.
llvm-svn: 26479
2006-03-03 00:19:58 +00:00
Chris Lattner
ab22220755 Split memcpy/memset/memmove intrinsics into i32/i64 versions, resolving
PR709, and paving the way for future progress.

llvm-svn: 26476
2006-03-03 00:00:25 +00:00
Evan Cheng
d8b92e04a2 Vector ops lowering.
llvm-svn: 26436
2006-03-01 01:09:54 +00:00
Chris Lattner
a451b5ffd4 Add support for output memory constraints.
llvm-svn: 26410
2006-02-27 23:45:39 +00:00
Jeff Cohen
9773f9c447 Get VC++ building again.
llvm-svn: 26351
2006-02-24 02:52:40 +00:00
Chris Lattner
6248339a24 Implement (most of) selection of inline asm memory operands.
llvm-svn: 26350
2006-02-24 02:13:54 +00:00
Chris Lattner
3f267c1683 Lower C_Memory operands.
llvm-svn: 26346
2006-02-24 01:11:24 +00:00
Chris Lattner
5a70e894d1 Fix an endianness problem on big-endian targets with expanded operands
to inline asms.  Mark some methods const.

llvm-svn: 26334
2006-02-23 20:06:57 +00:00
Chris Lattner
b4951fbe82 Record all of the expanded registers in the DAG and machine instr, fixing
several bugs in inline asm expanded operands.

llvm-svn: 26332
2006-02-23 19:21:04 +00:00
Chris Lattner
2fc133f091 This fixes a couple of problems with expansion
llvm-svn: 26318
2006-02-22 23:09:03 +00:00
Chris Lattner
d1886b8a4e Change a whole bunch of code to be built around RegsForValue instead of
a single register number.  This fully implements promotion for inline asms;
expand is close but not quite right yet.

llvm-svn: 26316
2006-02-22 22:37:12 +00:00
Chris Lattner
6bb2c3e9cd split register class handling from explicit physreg handling.
llvm-svn: 26308
2006-02-22 00:56:39 +00:00
Chris Lattner
b5b7cc99c4 Adjust to changes in getRegForInlineAsmConstraint prototype
llvm-svn: 26306
2006-02-21 23:12:12 +00:00
Evan Cheng
a2433f32b4 Dumb bug. Code sees a memcpy from X+c so it increments src offset. But it
turns out not to point to a constant string but it forgot change the offset
back.

llvm-svn: 26242
2006-02-16 23:11:42 +00:00
Evan Cheng
131901cbb8 If the false case is the current basic block, then this is a self loop.
We do not want to emit "Loop: ... brcond Out; br Loop", as it adds an extra
instruction in the loop.  Instead, invert the condition and emit
"Loop: ... br!cond Loop; br Out.

Generalize the fix by moving it from PPCDAGToDAGISel to SelectionDAGLowering.

llvm-svn: 26231
2006-02-16 08:27:56 +00:00
Evan Cheng
07063456aa Remove an unused function parameter.
llvm-svn: 26221
2006-02-15 22:12:35 +00:00
Evan Cheng
5c2ecfd29b Turn a memcpy from string constant into a series of stores of constant values.
llvm-svn: 26219
2006-02-15 21:59:04 +00:00
Evan Cheng
3f9201ab48 Lower memcpy with small constant size operand into a series of load / store
ops.

llvm-svn: 26195
2006-02-15 01:54:51 +00:00
Evan Cheng
6789a748ac Doh again!
llvm-svn: 26188
2006-02-14 23:05:54 +00:00
Evan Cheng
5a2a9d3896 Keep to < 80 cols
llvm-svn: 26177
2006-02-14 20:12:38 +00:00
Evan Cheng
8937c047b3 Missed a break so memcpy cases fell through to memset. Doh.
llvm-svn: 26176
2006-02-14 19:45:56 +00:00
Evan Cheng
57eebe8ab4 Fixed a build breakage.
llvm-svn: 26175
2006-02-14 09:11:59 +00:00
Evan Cheng
f6c74c0096 Rename maxStoresPerMemSet to maxStoresPerMemset, etc.
llvm-svn: 26174
2006-02-14 08:38:30 +00:00
Evan Cheng
0bfe83eb5b Expand memset dst, c, size to a series of stores if size falls below the
target-specific threshold, e.g. 16 for x86.

llvm-svn: 26171
2006-02-14 08:22:34 +00:00
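
A hedged example of the pattern affected (assuming a fixed size below the x86 threshold of 16 mentioned above): the call can be replaced by a couple of wide stores.

#include <stdio.h>
#include <string.h>

int main(void) {
    char buf[8];

    /* 8 bytes is under the stated x86 threshold of 16, so this memset is
     * a candidate for expansion into stores rather than a library call. */
    memset(buf, 0, sizeof buf);

    printf("%d\n", buf[3]);   /* prints 0 */
    return 0;
}
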
Chris Lattner
47f4f2c148 now that libcalls don't suck, we can remove this hack
llvm-svn: 26164
2006-02-14 05:39:35 +00:00
Jim Laskey
fac853338f Rename to better reflect usage (current and planned).
llvm-svn: 26145
2006-02-13 12:50:39 +00:00
Jim Laskey
55467f611d Reorg for integration with gcc4. Old-style debug info will not be passed through
to SelIDAG.

llvm-svn: 26115
2006-02-11 01:01:30 +00:00
Evan Cheng
062ac6e46b Get rid of some memory leaks identified by Valgrind
llvm-svn: 25960
2006-02-04 06:49:00 +00:00
Chris Lattner
013f5fc2fa Add initial support for immediates. This allows us to compile this:
int %rlwnm(int %A, int %B) {
  %C = call int asm "rlwnm $0, $1, $2, $3, $4", "=r,r,r,n,n"(int %A, int %B, int 4, int 17)
  ret int %C
}

into:

_rlwnm:
        or r2, r3, r3
        or r3, r4, r4
        rlwnm r2, r2, r3, 4, 17    ;; note the immediates :)
        or r3, r2, r2
        blr

llvm-svn: 25955
2006-02-04 02:26:14 +00:00
Chris Lattner
13f609dfe2 Initial early support for non-register operands, like immediates
llvm-svn: 25952
2006-02-04 02:16:44 +00:00
Chris Lattner
47b11a250c remove some #ifdef'd out code, which should properly be in the dag combiner anyway.
llvm-svn: 25941
2006-02-03 20:13:59 +00:00
Chris Lattner
e35694c0bf Implement matching constraints. We can now say things like this:
%C = call int asm "xyz $0, $1, $2, $3", "=r,r,r,0"(int %A, int %B, int 4)

and get:

xyz r2, r3, r4, r2

note that the r2's are pinned together.  Yaay for 2-address instructions.

llvm-svn: 25893
2006-02-02 00:25:23 +00:00
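
A C-level analogue using GCC extended asm (x86-only, purely illustrative and not from the commit): the "0" constraint ties an input to output operand 0, the same pinning the "=r,r,r,0" string above expresses in LLVM form.

#include <stdio.h>

int main(void) {
    int a = 40, b = 2, r;

    /* Input 'a' is constrained with "0", so it must live in the same
     * register as output 0; the add then updates that register in place. */
    __asm__("addl %2, %0" : "=r"(r) : "0"(a), "r"(b));

    printf("%d\n", r);   /* prints 42 */
    return 0;
}
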
Chris Lattner
9665c2d76f Implement simple register assignment for inline asms. This allows us to compile:
int %test(int %A, int %B) {
  %C = call int asm "xyz $0, $1, $2", "=r,r,r"(int %A, int %B)
  ret int %C
}

into:

 (0x8906130, LLVM BB @0x8902220):
        %r2 = OR4 %r3, %r3
        %r3 = OR4 %r4, %r4
        INLINEASM <es:xyz $0, $1, $2>, %r2<def>, %r2, %r3
        %r3 = OR4 %r2, %r2
        BLR

which asmprints as:

_test:
        or r2, r3, r3
        or r3, r4, r4
        xyz $0, $1, $2      ;; need to print the operands now :)
        or r3, r2, r2
        blr

llvm-svn: 25878
2006-02-01 18:59:47 +00:00
Chris Lattner
5549a10792 adjust to changes in InlineAsm interface. Fix a few minor bugs.
llvm-svn: 25865
2006-02-01 01:28:23 +00:00
Chris Lattner
e113238f5c Handle physreg input/outputs. We now compile this:
int %test_cpuid(int %op) {
        %B = alloca int
        %C = alloca int
        %D = alloca int
        %A = call int asm "cpuid", "=eax,==ebx,==ecx,==edx,eax"(int* %B, int* %C, int* %D, int %op)
        %Bv = load int* %B
        %Cv = load int* %C
        %Dv = load int* %D
        %x = add int %A, %Bv
        %y = add int %x, %Cv
        %z = add int %y, %Dv
        ret int %z
}

to this:

_test_cpuid:
        sub %ESP, 16
        mov DWORD PTR [%ESP], %EBX
        mov %EAX, DWORD PTR [%ESP + 20]
        cpuid
        mov DWORD PTR [%ESP + 8], %ECX
        mov DWORD PTR [%ESP + 12], %EBX
        mov DWORD PTR [%ESP + 4], %EDX
        mov %ECX, DWORD PTR [%ESP + 12]
        add %EAX, %ECX
        mov %ECX, DWORD PTR [%ESP + 8]
        add %EAX, %ECX
        mov %ECX, DWORD PTR [%ESP + 4]
        add %EAX, %ECX
        mov %EBX, DWORD PTR [%ESP]
        add %ESP, 16
        ret

... note the proper register allocation.  :)

it is unclear to me why the loads aren't folded into the adds.

llvm-svn: 25827
2006-01-31 02:03:41 +00:00
Chris Lattner
d6d4bcc419 remove method I just added
llvm-svn: 25728
2006-01-28 03:43:09 +00:00
Chris Lattner
063c13029b add a new callback
llvm-svn: 25727
2006-01-28 03:37:03 +00:00
Nate Begeman
87c2c0e66b Implement Promote for VAARG, and allow it to be custom promoted for people
who don't want the default behavior (Alpha).

llvm-svn: 25726
2006-01-28 03:14:31 +00:00
Nate Begeman
d2c6fbef4a Remove TLI.LowerReturnTo, and just let targets custom lower ISD::RET for
the same functionality.  This addresses another piece of bug 680.  Next,
on to fixing Alpha VAARG, which I broke last time.

llvm-svn: 25696
2006-01-27 21:09:22 +00:00
Chris Lattner
b2771c7fcb initial selectiondag support for new INLINEASM node. Note that inline asms
with outputs or inputs are not supported yet. :)

llvm-svn: 25664
2006-01-26 22:24:51 +00:00
Nate Begeman
c29fac7fce First part of bug 680:
Remove TLI.LowerVA* and replace it with SDNodes that are lowered the same
way as everything else.

llvm-svn: 25606
2006-01-25 18:21:52 +00:00
Evan Cheng
69d30b1c55 If scheduler choice is the default (-sched=default), use target scheduling
preference to determine which scheduler to use. SchedulingForLatency ==
Breadth first; SchedulingForRegPressure == bottom up register reduction list
scheduler.

llvm-svn: 25599
2006-01-25 09:12:57 +00:00
Jim Laskey
e4ff0868a1 Typo.
llvm-svn: 25545
2006-01-23 13:34:04 +00:00
Evan Cheng
f622869383 Skeleton of the list schedule.
llvm-svn: 25544
2006-01-23 08:26:10 +00:00
Evan Cheng
37c62244a6 Factor out more instruction scheduler code to the base class.
llvm-svn: 25532
2006-01-23 07:01:07 +00:00
Chris Lattner
adb63115bc Fix bugs lowering stackrestore, fixing 2004-08-12-InlinerAndAllocas.c on
PPC.

llvm-svn: 25522
2006-01-23 05:22:07 +00:00
Chris Lattner
a7ff02319f Fix a bug in a recent refactor that caused a bunch of programs to miscompile
or the compiler to crash.

llvm-svn: 25503
2006-01-21 19:12:11 +00:00
Evan Cheng
4a57a7551f Do some code refactoring on Jim's scheduler in preparation for the new list
scheduler.

llvm-svn: 25493
2006-01-21 02:32:06 +00:00
Chris Lattner
737a8dab41 If the target doesn't support f32 natively, insert the FP_EXTEND in target-indep
code, so that the LowerReturn code doesn't have to handle it.

llvm-svn: 25482
2006-01-20 18:38:32 +00:00
Chris Lattner
9bc0f6cd90 Temporary work around for a libcall insertion bug: If a target doesn't
support FSIN/FCOS nodes, do not lower sin/cos to them.

llvm-svn: 25425
2006-01-18 21:50:14 +00:00
Robert Bocchino
dc31d8561b Support for the insertelement operation.
llvm-svn: 25405
2006-01-17 20:06:42 +00:00
Reid Spencer
3cecd3c4cf For PR411:
This patch is an incremental step towards supporting a flat symbol table.
It de-overloads the intrinsic functions by providing type-specific intrinsics
and arranging for automatically upgrading from the old overloaded name to
the new non-overloaded name. Specifically:
  llvm.isunordered -> llvm.isunordered.f32, llvm.isunordered.f64
  llvm.sqrt -> llvm.sqrt.f32, llvm.sqrt.f64
  llvm.ctpop -> llvm.ctpop.i8, llvm.ctpop.i16, llvm.ctpop.i32, llvm.ctpop.i64
  llvm.ctlz -> llvm.ctlz.i8, llvm.ctlz.i16, llvm.ctlz.i32, llvm.ctlz.i64
  llvm.cttz -> llvm.cttz.i8, llvm.cttz.i16, llvm.cttz.i32, llvm.cttz.i64
New code should not use the overloaded intrinsic names. Warnings will be
emitted if they are used.

llvm-svn: 25366
2006-01-16 21:12:35 +00:00
Nate Begeman
956b57ce43 Remove some duplicated code
llvm-svn: 25313
2006-01-14 03:18:27 +00:00
Nate Begeman
85b2dc0c4e bswap implementation
llvm-svn: 25312
2006-01-14 03:14:10 +00:00
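
For orientation only (the builtin postdates this commit): byte-swap builtins in modern front ends map onto the llvm.bswap intrinsic being implemented here.

#include <stdint.h>
#include <stdio.h>

int main(void) {
    uint32_t x = 0x11223344u;

    /* GCC/Clang lower __builtin_bswap32 to a bswap intrinsic of the kind
     * this commit implements codegen for. */
    printf("0x%08x\n", (unsigned)__builtin_bswap32(x));   /* 0x44332211 */
    return 0;
}
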
Chris Lattner
4107b4d7ee Compile llvm.stacksave/restore into STACKSAVE/STACKRESTORE nodes, and allow
targets to custom expand them as they desire.

llvm-svn: 25273
2006-01-13 02:50:02 +00:00
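
A typical source pattern that produces these intrinsics (a sketch, under the assumption that the front end brackets block-scoped VLAs with stacksave/stackrestore as modern Clang does):

#include <stdio.h>

/* The VLA's storage is scoped to one loop iteration, so each trip is
 * bracketed with llvm.stacksave / llvm.stackrestore to reclaim the stack
 * space; those are the intrinsics given STACKSAVE/STACKRESTORE nodes here. */
static int sum_firsts(int n) {
    int total = 0;
    for (int i = 1; i <= n; i++) {
        int vla[i];
        vla[0] = i;
        total += vla[0];
    }
    return total;
}

int main(void) {
    printf("%d\n", sum_firsts(4));   /* prints 10 */
    return 0;
}
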
Chris Lattner
5ab0813f3a Add "support" for stacksave/stackrestore to the dag isel
llvm-svn: 25268
2006-01-13 02:24:42 +00:00
Robert Bocchino
38060df8d1 Added selection DAG support for the extractelement operation.
llvm-svn: 25179
2006-01-10 19:04:57 +00:00
Jim Laskey
61138e28ff Applied some recommended changes from sabre. The dominant one beginning "let the
pass manager do its thing."  Fixes a crash when compiling -g files and suppresses
dwarf statements if no debug info is present.

llvm-svn: 25100
2006-01-04 22:28:25 +00:00
Chris Lattner
079443691c enable the gep isel opt
llvm-svn: 24910
2005-12-21 19:36:36 +00:00
Chris Lattner
a7d3498167 Lower ConstantAggregateZero into zeros
llvm-svn: 24890
2005-12-21 02:43:26 +00:00
Jim Laskey
37957b1ad3 Added source file/line correspondence for dwarf (PowerPC only at this point.)
llvm-svn: 24748
2005-12-16 22:45:29 +00:00
Chris Lattner
eff6e46178 Don't lump the filename and working dir together
llvm-svn: 24697
2005-12-13 17:40:33 +00:00
Chris Lattner
b0b4e53b55 Accept and ignore prefetches for now
llvm-svn: 24678
2005-12-12 22:51:16 +00:00
Chris Lattner
a54452fd4f Minor tweak to get isel opt
llvm-svn: 24663
2005-12-11 09:05:13 +00:00
Chris Lattner
e27671119a improve code insertion in two ways:
1. Only forward subst offsets into loads and stores, not into arbitrary
   things, where it will likely become a load.
2. If the source is a cast from pointer, forward subst the cast as well,
   allowing us to fold the cast away (improving cases when the cast is
   from an alloca or global).

This hasn't been fully tested, but does appear to further reduce register
pressure and improve code.  Lets let the testers grind on it a bit. :)

llvm-svn: 24640
2005-12-08 08:00:12 +00:00
Nate Begeman
589dff9a20 Fix a crash where ConstantVec nodes were being generated with the wrong
type when the target did not support them.  Also teach Legalize how to
expand ConstantVecs.

This allows us to generate

_test:
        lwz r2, 12(r3)
        lwz r4, 8(r3)
        lwz r5, 4(r3)
        lwz r6, 0(r3)
        addi r2, r2, 4
        addi r4, r4, 3
        addi r5, r5, 2
        addi r6, r6, 1
        stw r2, 12(r3)
        stw r4, 8(r3)
        stw r5, 4(r3)
        stw r6, 0(r3)
        blr

For:

void %test(%v4i *%P) {
        %T = load %v4i* %P
        %S = add %v4i %T, <int 1, int 2, int 3, int 4>
        store %v4i %S, %v4i * %P
        ret void
}

On PowerPC.

llvm-svn: 24633
2005-12-07 19:48:11 +00:00
Nate Begeman
6c1b8712c5 Teach the SelectionDAG ISel how to turn ConstantPacked values into
constant nodes with vector types.  Also teach the asm printer how to print
ConstantPacked constant pool entries.  This allows us to generate altivec
code such as the following, which adds a vector constant to a packed float.

LCPI1_0:  <4 x float> < float 0.0e+0, float 0.0e+0, float 0.0e+0, float 1.0e+0 >
        .space  4
        .space  4
        .space  4
        .long   1065353216      ; float 1
        .text
        .align  4
        .globl  _foo
_foo:
        lis r2, ha16(LCPI1_0)
        la r2, lo16(LCPI1_0)(r2)
        li r4, 0
        lvx v0, r4, r2
        lvx v1, r4, r3
        vaddfp v0, v1, v0
        stvx v0, r4, r3
        blr

For the llvm code:

void %foo(<4 x float> * %a) {
entry:
  %tmp1 = load <4 x float> * %a;
  %tmp2 = add <4 x float> %tmp1, < float 0.0, float 0.0, float 0.0, float 1.0 >
  store <4 x float> %tmp2, <4 x float> *%a
  ret void
}

llvm-svn: 24616
2005-12-06 06:18:55 +00:00
Chris Lattner
46ac4d0810 Fix the #1 code quality problem that I have seen on X86 (and it also affects
PPC and other targets).  In particular, consider code like this:

struct Vector3 { double x, y, z; };
struct Matrix3 { Vector3 a, b, c; };
double dot(Vector3 &a, Vector3 &b) {
   return a.x * b.x  +  a.y * b.y  +  a.z * b.z;
}
Vector3 mul(Vector3 &a, Matrix3 &b) {
   Vector3 r;
   r.x = dot( a, b.a );
   r.y = dot( a, b.b );
   r.z = dot( a, b.c );
   return r;
}
void transform(Matrix3 &m, Vector3 *x, int n) {
   for (int i = 0; i < n; i++)
      x[i] = mul( x[i], m );
}

we compile transform to a loop with all of the GEP instructions for indexing
into 'm' pulled out of the loop (9 of them).  Because isel occurs a bb at a time
we are unable to fold the constant index into the loads in the loop, leading to
PPC code that looks like this:

LBB3_1: ; no_exit.preheader
        li r2, 0
        addi r6, r3, 64        ;; 9 values live across the loop body!
        addi r7, r3, 56
        addi r8, r3, 48
        addi r9, r3, 40
        addi r10, r3, 32
        addi r11, r3, 24
        addi r12, r3, 16
        addi r30, r3, 8
LBB3_2: ; no_exit
        lfd f0, 0(r30)
        lfd f1, 8(r4)
        fmul f0, f1, f0
        lfd f2, 0(r3)        ;; no constant indices folded into the loads!
        lfd f3, 0(r4)
        lfd f4, 0(r10)
        lfd f5, 0(r6)
        lfd f6, 0(r7)
        lfd f7, 0(r8)
        lfd f8, 0(r9)
        lfd f9, 0(r11)
        lfd f10, 0(r12)
        lfd f11, 16(r4)
        fmadd f0, f3, f2, f0
        fmul f2, f1, f4
        fmadd f0, f11, f10, f0
        fmadd f2, f3, f9, f2
        fmul f1, f1, f6
        stfd f0, 0(r4)
        fmadd f0, f11, f8, f2
        fmadd f1, f3, f7, f1
        stfd f0, 8(r4)
        fmadd f0, f11, f5, f1
        addi r29, r4, 24
        stfd f0, 16(r4)
        addi r2, r2, 1
        cmpw cr0, r2, r5
        or r4, r29, r29
        bne cr0, LBB3_2 ; no_exit

uh, yuck.  With this patch, we now sink the constant offsets into the loop, producing
this code:

LBB3_1: ; no_exit.preheader
        li r2, 0
LBB3_2: ; no_exit
        lfd f0, 8(r3)
        lfd f1, 8(r4)
        fmul f0, f1, f0
        lfd f2, 0(r3)
        lfd f3, 0(r4)
        lfd f4, 32(r3)       ;; much nicer.
        lfd f5, 64(r3)
        lfd f6, 56(r3)
        lfd f7, 48(r3)
        lfd f8, 40(r3)
        lfd f9, 24(r3)
        lfd f10, 16(r3)
        lfd f11, 16(r4)
        fmadd f0, f3, f2, f0
        fmul f2, f1, f4
        fmadd f0, f11, f10, f0
        fmadd f2, f3, f9, f2
        fmul f1, f1, f6
        stfd f0, 0(r4)
        fmadd f0, f11, f8, f2
        fmadd f1, f3, f7, f1
        stfd f0, 8(r4)
        fmadd f0, f11, f5, f1
        addi r6, r4, 24
        stfd f0, 16(r4)
        addi r2, r2, 1
        cmpw cr0, r2, r5
        or r4, r6, r6
        bne cr0, LBB3_2 ; no_exit

This is much nicer as it reduces register pressure in the loop a lot.  On X86,
this takes the function from having 9 spilled registers to 2.  This should help
some spec programs on X86 (gzip?)

This is currently only enabled with -enable-gep-isel-opt to allow perf testing
tonight.

llvm-svn: 24606
2005-12-05 07:10:48 +00:00
Chris Lattner
07f4a0cb99 dbg.stoppoint returns a value, don't forget to init it
llvm-svn: 24583
2005-12-03 18:50:48 +00:00
Nate Begeman
31121419c8 First chunk of actually generating vector code for packed types. These
changes allow us to generate the following code:

_foo:
        li r2, 0
        lvx v0, r2, r3
        vaddfp v0, v0, v0
        stvx v0, r2, r3
        blr

for this llvm:

void %foo(<4 x float>* %a) {
entry:
        %tmp1 = load <4 x float>* %a
        %tmp2 = add <4 x float> %tmp1, %tmp1
        store <4 x float> %tmp2, <4 x float>* %a
        ret void
}

llvm-svn: 24534
2005-11-30 08:22:07 +00:00
Reid Spencer
3bac59d2f0 Fix a problem with llvm-ranlib that (on some platforms) caused the archive
file to become corrupted due to interactions between mmap'd memory segments
and file descriptors closing. The problem is completely avoiding by using
a third temporary file.

Patch provided by Evan Jones

llvm-svn: 24527
2005-11-30 05:21:10 +00:00
Chris Lattner
22327b9d12 Add support for a new STRING and LOCATION node for line number support, patch
contributed by Daniel Berlin, with a few cleanups here and there by me.

llvm-svn: 24515
2005-11-29 06:21:05 +00:00
Nate Begeman
a90bb6d9b1 Check in code to scalarize arbitrarily wide packed types for some simple
vector operations (load, add, sub, mul).

This allows us to codegen:
void %foo(<4 x float> * %a) {
entry:
  %tmp1 = load <4 x float> * %a;
  %tmp2 = add <4 x float> %tmp1, %tmp1
  store <4 x float> %tmp2, <4 x float> *%a
  ret void
}

on ppc as:
_foo:
        lfs f0, 12(r3)
        lfs f1, 8(r3)
        lfs f2, 4(r3)
        lfs f3, 0(r3)
        fadds f0, f0, f0
        fadds f1, f1, f1
        fadds f2, f2, f2
        fadds f3, f3, f3
        stfs f0, 12(r3)
        stfs f1, 8(r3)
        stfs f2, 4(r3)
        stfs f3, 0(r3)
        blr

llvm-svn: 24484
2005-11-22 18:16:00 +00:00
Nate Begeman
d2f6fcf327 Rather than attempting to legalize 1 x float, make sure the SD ISel never
generates it.  Make MVT::Vector expand-only, and remove the code in
Legalize that attempts to legalize it.

The plan for supporting N x Type is to continually expand it in ExpandOp
until it gets down to 2 x Type, where it will be scalarized into a pair of
scalars.

llvm-svn: 24482
2005-11-22 01:29:36 +00:00
Chris Lattner
517942843d Unbreak codegen of bools. This should fix the llc/jit/llc-beta failures
from last night.

llvm-svn: 24427
2005-11-19 18:40:42 +00:00
Nate Begeman
7d513f65ae Teach LLVM how to scalarize packed types. Currently, this only works on
packed types with an element count of 1, although more generic support is
coming.  This allows LLVM to turn the following code:

void %foo(<1 x float> * %a) {
entry:
  %tmp1 = load <1 x float> * %a;
  %tmp2 = add <1 x float> %tmp1, %tmp1
  store <1 x float> %tmp2, <1 x float> *%a
  ret void
}

Into:

_foo:
        lfs f0, 0(r3)
        fadds f0, f0, f0
        stfs f0, 0(r3)
        blr

llvm-svn: 24416
2005-11-19 00:36:38 +00:00
Nate Begeman
78ac456d32 Split out the shift code from visitBinary.
llvm-svn: 24412
2005-11-18 07:42:56 +00:00
Chris Lattner
2095b19912 When debugging, lower dbg intrinsics to calls
llvm-svn: 24377
2005-11-16 07:22:30 +00:00
Andrew Lenharth
9b036b1bdb added a chain output
llvm-svn: 24306
2005-11-11 22:48:54 +00:00
Andrew Lenharth
dca2f13e76 continued readcyclecounter support
llvm-svn: 24300
2005-11-11 16:47:30 +00:00
Chris Lattner
11d12a572e Refactor intrinsic lowering stuff out of visitCall
llvm-svn: 24261
2005-11-09 19:44:01 +00:00
Chris Lattner
306c386a79 Fix CodeGen/X86/shift-folding.ll:test3 on X86
llvm-svn: 24256
2005-11-09 16:50:40 +00:00
Chris Lattner
798441d725 Avoid creating a token factor node in trivially redundant cases. This
eliminates almost one node per block in common cases.

llvm-svn: 24254
2005-11-09 05:03:03 +00:00
Chris Lattner
948932a624 Handle GEP's a bit more intelligently. Fold constant indices early and
turn power-of-two multiplies into shifts early to improve compile time.

llvm-svn: 24253
2005-11-09 04:45:33 +00:00
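
For illustration, a hypothetical GEP of the kind this change targets, in the
era's signed-type IR (the function is invented): the constant index contributes
a fixed byte offset that can be folded immediately, and the variable index
scales by a power-of-two element size, so the multiply becomes a shift.

int %gep_fold(int* %P, int %i) {
        ; &P[100] is a constant 400-byte offset, folded directly into the address
        %p1 = getelementptr int* %P, int 100
        ; scaling %i by sizeof(int) == 4 is a shift left by 2, not a multiply
        %p2 = getelementptr int* %p1, int %i
        %v = load int* %p2
        ret int %v
}
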
Nate Begeman
aecebc076b Add the necessary support to the ISel to allow targets to codegen the new
alignment information appropriately.  Includes code for PowerPC to support
fixed-size allocas with alignment larger than the stack alignment.  Support for
arbitrarily aligned dynamic allocas coming soon.

llvm-svn: 24224
2005-11-06 09:00:38 +00:00
Chris Lattner
d7ef6d6774 Significantly simplify this code and make it more aggressive. Instead of having
a special case hack for X86, make the hack more general: if an incoming argument
register is not used in any block other than the entry block, don't copy it to
a vreg.  This helps us compile code like this:

%struct.foo = type { int, int, [0 x ubyte] }
int %test(%struct.foo* %X) {
        %tmp1 = getelementptr %struct.foo* %X, int 0, uint 2, int 100
        %tmp = load ubyte* %tmp1                ; <ubyte> [#uses=1]
        %tmp2 = cast ubyte %tmp to int          ; <int> [#uses=1]
        ret int %tmp2
}

to:

_test:
        lbz r3, 108(r3)
        blr

instead of:

_test:
        lbz r2, 108(r3)
        or r3, r2, r2
        blr

The (dead) copy emitted to copy r3 into a vreg for extra-block uses was
increasing the live range of r3 past the load, preventing the coalescing.

This implements CodeGen/PowerPC/reg-coallesce-simple.ll

llvm-svn: 24115
2005-10-30 19:42:35 +00:00
Nate Begeman
ee581735d9 Add the ability to lower return instructions to TargetLowering. This
allows us to lower legal return types to something else, to meet ABI
requirements (such as that i64 be returned in two i32 regs on Darwin/ppc).

llvm-svn: 23802
2005-10-18 23:23:37 +00:00
Chris Lattner
016497a971 Fix Generic/2005-10-18-ZeroSizeStackObject.ll by not requesting a zero
sized stack object if either the array size or the type size is zero.

llvm-svn: 23801
2005-10-18 22:14:06 +00:00
Chris Lattner
82258b6abb remove hack
llvm-svn: 23797
2005-10-18 22:11:42 +00:00
Chris Lattner
29613bce04 Enable Nate's excellent DAG combiner work by default. This allows the
removal of a bunch of ad-hoc and crufty code from SelectionDAG.cpp.

llvm-svn: 23682
2005-10-10 16:47:10 +00:00
Chris Lattner
cf12d7b556 make sure that -view-isel-dags is the input to the isel, not the input to
the second phase of dag combining

llvm-svn: 23631
2005-10-05 06:09:10 +00:00
Jeff Cohen
412582bcec Fix VC++ warnings.
llvm-svn: 23579
2005-10-01 03:57:14 +00:00
Chris Lattner
61f3785147 Add FP versions of the binary operators, keeping the int and fp worlds separate.
Though I have done extensive testing, it is possible that this will break
things in configs I can't test.  Please let me know if this causes a problem
and I'll fix it ASAP.

llvm-svn: 23504
2005-09-28 22:28:18 +00:00
Chris Lattner
4655e9de38 If the target prefers it, _setjmp/_longjmp should be used instead of setjmp/longjmp for llvm.setjmp/llvm.longjmp.
llvm-svn: 23481
2005-09-27 22:15:53 +00:00
Chris Lattner
49669dd169 If a function has liveins, and if the target requested that they be plopped
into particular vregs, emit copies into the entry MBB.

llvm-svn: 23331
2005-09-13 19:30:54 +00:00
Nate Begeman
143dc2039d Add an option to the DAG Combiner to enable it for beta runs, and turn on
that option for PowerPC's beta.

llvm-svn: 23253
2005-09-07 00:15:36 +00:00
Chris Lattner
365774f457 Don't create zero sized stack objects even for array allocas with a zero
number of elements.

llvm-svn: 23219
2005-09-02 18:41:28 +00:00
Chris Lattner
7d89863a77 Fix the release build, noticed by Eric van Riet Paap
llvm-svn: 23215
2005-09-02 07:09:28 +00:00
Chris Lattner
8a6c15f4f4 For values that are live across basic blocks and need promotion, use ANY_EXTEND
instead of ZERO_EXTEND to eliminate extraneous extensions.  This eliminates
dead zero extensions on formal arguments and other cases on PPC, implementing
the newly tightened up test/Regression/CodeGen/PowerPC/small-arguments.ll test.

llvm-svn: 23205
2005-09-02 00:19:37 +00:00
Chris Lattner
f2b775d686 It is NDEBUG not _NDEBUG
llvm-svn: 23186
2005-09-01 18:44:10 +00:00
Chris Lattner
e0eae3d244 Disable this code, which broke many tests last night
llvm-svn: 23114
2005-08-27 16:16:51 +00:00
Chris Lattner
e9cc12f5c4 Don't copy regs that are only used in the entry block into a vreg. This
changes the code generated for:

short %test(short %A) {
  %B = xor short %A, -32768
  ret short %B
}

to:

_test:
        xori r2, r3, 32768
        xoris r2, r2, 65535
        extsh r3, r2
        blr

instead of:

_test:
        rlwinm r2, r3, 0, 16, 31
        xori r2, r3, 32768
        xoris r2, r2, 65535
        extsh r3, r2
        blr

llvm-svn: 23109
2005-08-26 22:49:59 +00:00
Chris Lattner
faa96209d8 Call the InsertAtEndOfBasicBlock hook if the usesCustomDAGSchedInserter
flag is set on an instruction.

llvm-svn: 23098
2005-08-26 20:54:47 +00:00
Chris Lattner
64f7f0beac Make -view-isel-dags show the dag before instruction selecting, in case
the target isel crashes due to unimplemented features like calls :)

llvm-svn: 22997
2005-08-24 00:34:29 +00:00
Chris Lattner
d73a5042d9 Fix a problem where constant expr shifts would not have their shift amount
promoted to the right type.  This fixes: IA64/2005-08-22-LegalizerCrash.ll

llvm-svn: 22969
2005-08-22 17:28:31 +00:00
Chris Lattner
5cbeaed711 Enable critical edge splitting by default
llvm-svn: 22863
2005-08-18 17:35:14 +00:00
Chris Lattner
dbfcba7565 Add a new beta option for critical edge splitting, to avoid a problem that
Nate noticed in yacr2 (and I know occurs in other places as well).

This is still rough, as the critical edge blocks are not intelligently placed,
but it is added to get some idea of whether this improves performance.

llvm-svn: 22825
2005-08-17 06:37:43 +00:00
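
A sketch of what a critical edge looks like (made-up function, old-style IR):
the edge from %entry to %merge leaves a block with two successors and enters a
block with two predecessors, so splitting it inserts a new block on that edge
where code can be placed without disturbing the %then path.

void %crit_edge(bool %c, int* %p) {
entry:
        br bool %c, label %then, label %merge   ; entry -> merge is the critical edge
then:
        store int 1, int* %p
        br label %merge
merge:
        ret void
}
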
Chris Lattner
a103a2e9c6 Fix a regression on X86, where FP values can be promoted too.
llvm-svn: 22822
2005-08-17 06:06:25 +00:00
Chris Lattner
3b7e157005 Eliminate the RegSDNode class, which 3 nodes (CopyFromReg/CopyToReg/ImplicitDef)
used to tack a register number onto the node.

Instead of doing this, make a new node, RegisterSDNode, which is a leaf
containing a register number.  These three operations just become normal
DAG nodes now, instead of requiring special handling.

Note that with this change, it is no longer correct to make illegal
CopyFromReg/CopyToReg nodes.  The legalizer will not touch them, and this
is bad, so don't do it. :)

llvm-svn: 22806
2005-08-16 21:55:35 +00:00
Chris Lattner
0fa4402b59 Eliminate the SetCCSDNode in favor of a CondCodeSDNode class. This pulls the
CC out of the SetCC operation, making SETCC a standard ternary operation and
CC's a standard DAG leaf.  This will make it possible for other node to use
CC's as operands in the future...

llvm-svn: 22728
2005-08-09 20:20:18 +00:00
Jeff Cohen
bd51ec7461 Eliminate all remaining tabs and trailing spaces.
llvm-svn: 22523
2005-07-27 06:12:32 +00:00
Nate Begeman
a25a2010e3 Remove an unnecessary FP_EXTEND that causes worse codegen for SSE.
llvm-svn: 22469
2005-07-19 16:50:03 +00:00
Chris Lattner
bf100c8bdb Make several cleanups to Andrews varargs change:
1. Pass Value*'s into lowering methods so that the proper pointers can be
   added to load/stores from the valist
2. Intrinsics that return void should only return a token chain, not a token
   chain/retval pair.
3. Rename LowerVAArgNext -> LowerVAArg, because VANext is long gone.

llvm-svn: 22338
2005-07-05 19:57:53 +00:00
Andrew Lenharth
898efb338a restore old srcValueNode behavior and try to work around it
llvm-svn: 22315
2005-06-29 18:54:02 +00:00
Andrew Lenharth
edccb834bb tracking the instructions causing loads and stores provides more information than just the pointer being loaded or stored
llvm-svn: 22311
2005-06-29 15:57:19 +00:00
Andrew Lenharth
4fd2bde906 If we support structs as va_list, we must pass pointers to them to va_copy
See the last commit for the LangRef change; this implements it on all targets.

llvm-svn: 22273
2005-06-22 21:04:42 +00:00
Andrew Lenharth
a9214fec08 core changes for varargs
llvm-svn: 22254
2005-06-18 18:34:52 +00:00
Chris Lattner
46de5c99bd Fix construction of ioport intrinsics, fixing X86/io.llx and io-port.llx
llvm-svn: 22026
2005-05-14 13:56:55 +00:00
Chris Lattner
a035798c4b Eliminate special purpose hacks for dynamic_stack_alloc.
llvm-svn: 22015
2005-05-14 07:29:57 +00:00
Chris Lattner
6e81a4090f LowerOperation takes a dag
llvm-svn: 22004
2005-05-14 05:50:48 +00:00
Chris Lattner
2163eeaa67 Align doubles on 8-byte boundaries if possible.
llvm-svn: 21993
2005-05-13 23:14:17 +00:00
Chris Lattner
9d788e93a6 Add an isTailCall flag to LowerCallTo
llvm-svn: 21958
2005-05-13 18:50:42 +00:00
Chris Lattner
01eba53a10 Emit function entry code after lowering the arguments.
llvm-svn: 21931
2005-05-13 07:33:32 +00:00
Chris Lattner
fdc4816996 Allow targets to emit code into the entry block of each function
llvm-svn: 21930
2005-05-13 07:23:21 +00:00
Chris Lattner
dd2700de99 Pass calling convention to use into lower call to
llvm-svn: 21900
2005-05-12 19:56:57 +00:00
Chris Lattner
74763db128 wrap long line
llvm-svn: 21870
2005-05-11 18:57:06 +00:00
Chris Lattner
5edb4c4af6 The semantics of cast X to bool are a comparison against zero, not a truncation!
llvm-svn: 21833
2005-05-09 22:17:13 +00:00
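
In other words (hypothetical example, old-style IR), the cast below must be
lowered like the setne that follows, not as a truncation to the low bit:

bool %to_bool(int %X) {
        %b = cast int %X to bool        ; true for any nonzero %X, e.g. 2
        ret bool %b
}

bool %to_bool_expanded(int %X) {
        %b = setne int %X, 0            ; the equivalent comparison against zero
        ret bool %b
}
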
Chris Lattner
af6bde0db6 Add support for matching the READPORT, WRITEPORT, READIO, WRITEIO intrinsics
llvm-svn: 21825
2005-05-09 20:22:36 +00:00
Chris Lattner
a1e633ef7a Don't use the load/store instruction as the source pointer, use the pointer
being stored/loaded through!

llvm-svn: 21806
2005-05-09 04:28:51 +00:00
Chris Lattner
b85030373d wrap long lines
llvm-svn: 21804
2005-05-09 04:08:33 +00:00
Chris Lattner
6e8167d1c2 When hitting an unsupported intrinsic, actually print it
Lower debug info to noops.

llvm-svn: 21698
2005-05-05 17:55:17 +00:00
Andrew Lenharth
8b64bd0fd5 Implement count leading zeros (ctlz), count trailing zeros (cttz), and count
population (ctpop).  Generic lowering is implemented; however, only promotion
is implemented for SelectionDAG at the moment.

More coming soon.

llvm-svn: 21676
2005-05-03 17:19:30 +00:00
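
For reference, one common bit-twiddling expansion of a 32-bit population count,
written as a stand-alone sketch in the era's IR.  This only illustrates the
kind of code generic lowering can produce; it is not the literal expansion from
the commit.

uint %popcount32(uint %x) {
        %t1 = shr uint %x, ubyte 1
        %t2 = and uint %t1, 1431655765          ; 0x55555555
        %x1 = sub uint %x, %t2
        %t3 = and uint %x1, 858993459           ; 0x33333333
        %t4 = shr uint %x1, ubyte 2
        %t5 = and uint %t4, 858993459
        %x2 = add uint %t3, %t5
        %t6 = shr uint %x2, ubyte 4
        %x3 = add uint %x2, %t6
        %x4 = and uint %x3, 252645135           ; 0x0f0f0f0f
        %x5 = mul uint %x4, 16843009            ; 0x01010101
        %r = shr uint %x5, ubyte 24             ; per-byte sums collect in the top byte
        ret uint %r
}
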
Chris Lattner
fe72cdf838 Codegen and legalize sin/cos/llvm.sqrt as FSIN/FCOS/FSQRT calls. This patch
was contributed by Morten Ofstad, with some minor tweaks and bug fixes added
by me.

llvm-svn: 21636
2005-04-30 04:43:14 +00:00
Andrew Lenharth
2a00530fa7 Implement Value* tracking for loads and stores in the selection DAG. This enables one to use alias analysis in the backends.
(TRUNC)Stores and (EXT|ZEXT|SEXT)Loads have an extra SDOperand which is a SrcValueSDNode containing the Value*.  Note that if the operation is introduced by the backend, it will still have the operand, but the value* will be null.

llvm-svn: 21599
2005-04-27 20:10:01 +00:00
Misha Brukman
a9a1982a44 Convert tabs to spaces
llvm-svn: 21439
2005-04-22 04:01:18 +00:00
Misha Brukman
774e55c446 Remove trailing whitespace
llvm-svn: 21420
2005-04-21 22:36:52 +00:00
Nate Begeman
a56527ea5f Fold shift by size larger than type size to undef
Make llvm undef values generate ISD::UNDEF nodes

llvm-svn: 21261
2005-04-12 23:12:17 +00:00
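
A minimal illustration of the first fold (invented function, old-style IR):
shifting a 32-bit value by an amount larger than the type size has no defined
result, so the operation can become an ISD::UNDEF node instead of a real shift.

int %oversized_shift(int %X) {
        %Y = shl int %X, ubyte 40       ; shift amount exceeds the 32-bit type size
        ret int %Y                      ; folded to undef
}
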
Chris Lattner
8e6eafa8e1 Emit BRCONDTWOWAY when possible.
llvm-svn: 21167
2005-04-09 03:30:29 +00:00
Chris Lattner
1a15f58a92 transform fabs/fabsf calls into FABS nodes.
llvm-svn: 21014
2005-04-02 05:26:53 +00:00
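
The pattern being matched is an ordinary libm call (hypothetical example; the
declaration is the standard C fabs):

declare double %fabs(double)

double %magnitude(double %X) {
        %Y = call double %fabs(double %X)       ; selected as an FABS node
        ret double %Y
}
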
Chris Lattner
fcf6ee0a8b Turn -0.0 - X -> fneg
llvm-svn: 21011
2005-04-02 05:04:50 +00:00
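
The fold relies on -0.0 - X agreeing with a negation even for signed zeros
(0.0 - X would differ when X is 0.0).  A hypothetical example in old-style IR:

double %negate(double %X) {
        %Y = sub double -0.000000e+00, %X       ; becomes an FNEG node in the DAG
        ret double %Y
}
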
Andrew Lenharth
7db3834ecf PCMarker support for DAG and Alpha
llvm-svn: 20965
2005-03-31 21:24:06 +00:00
Chris Lattner
abb59a3c21 Instead of setting up the CFG edges at selectiondag construction time, set
them up after the code has been emitted.  This allows targets to select one
mbb as multiple mbb's as needed.

llvm-svn: 20937
2005-03-30 01:10:47 +00:00
Chris Lattner
02a4d3bd9b Fix a bug that andrew noticed where we do not correctly sign/zero extend
returned integer values all of the way to 64-bits (we only did it to 32-bits
leaving the top bits undefined).  This causes problems for targets like alpha
whose ABI's define the top bits too.

llvm-svn: 20926
2005-03-29 19:09:56 +00:00
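
A tiny example of where this bites (invented function): on a target such as
Alpha, whose ABI defines the full 64-bit return register, the short result
below has to be sign-extended through bit 63, not just through bit 31.

short %answer() {
        ret short -42   ; the caller may read all 64 bits of the return register
}
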
Nate Begeman
f821401825 Change interface to LowerCallTo to take a boolean isVarArg argument.
llvm-svn: 20842
2005-03-26 01:29:23 +00:00
Chris Lattner
4b688a1c70 This mega patch converts us from using Function::a{iterator|begin|end} to
using Function::arg_{iterator|begin|end}.  Likewise Module::g* -> Module::global_*.

This patch is contributed by Gabor Greif, thanks!

llvm-svn: 20597
2005-03-15 04:54:21 +00:00
Misha Brukman
381d248dc6 Fix compilation errors with VS 2005, patch by Aaron Gray.
llvm-svn: 20231
2005-02-17 21:39:27 +00:00
Chris Lattner
0de03b45ab Don't sink argument loads into loops or other bad places. This disables folding of argument loads with instructions that are not in the entry block.
llvm-svn: 20228
2005-02-17 19:40:32 +00:00
Chris Lattner
4c997d281c Adjust to changes in SelectionDAG interface.
llvm-svn: 19779
2005-01-23 04:36:26 +00:00
Chris Lattner
63ec3c402b Get this to work for 64-bit systems.
llvm-svn: 19763
2005-01-22 23:04:37 +00:00
Chris Lattner
e5212a16a2 Support targets that do not use i8 shift amounts.
llvm-svn: 19707
2005-01-19 22:31:21 +00:00
Chris Lattner
ff086f3016 Teach legalize to promote copy(from|to)reg, instead of making the isel pass
do it.  This results in better code on X86 for floats (because if strict
precision is not required, we can elide some more expensive double -> float
conversions like the old isel did), and allows other targets to emit
CopyFromRegs that are not legal for arguments.

llvm-svn: 19668
2005-01-18 17:54:55 +00:00
Chris Lattner
95307053ec Allow setcc operations to have nonbool types.
llvm-svn: 19656
2005-01-18 02:52:03 +00:00
Chris Lattner
c0aca0d13c Non-volatile loads can be freely reordered against each other. This fixes
X86/reg-pressure.ll again, and allows us to do nice things in other cases.
For example, we now codegen this sort of thing:

int %loadload(int *%X, int* %Y) {
  %Z = load int* %Y
  %Y = load int* %X      ;; load between %Z and store
  %Q = add int %Z, 1
  store int %Q, int* %Y
  ret int %Y
}

Into this:

loadload:
        mov %EAX, DWORD PTR [%ESP + 4]
        mov %EAX, DWORD PTR [%EAX]
        mov %ECX, DWORD PTR [%ESP + 8]
        inc DWORD PTR [%ECX]
        ret

where we weren't able to form the 'inc [mem]' before.  This also lets the
instruction selector emit loads in any order it wants to, which can be good
for register pressure as well.

llvm-svn: 19644
2005-01-17 22:19:26 +00:00
Chris Lattner
49291c4d96 Don't call SelectionDAG.getRoot() directly, go through a forwarding method.
llvm-svn: 19642
2005-01-17 19:43:36 +00:00
Chris Lattner
88bbcfc893 Implement a target independent optimization to codegen arguments only into
the basic block that uses them if possible.  This is a big win on X86, as it
lets us fold the argument loads into instructions and reduce register pressure
(by not loading all of the arguments in the entry block).

For this (contrived to show the optimization) testcase:

int %argtest(int %A, int %B) {
        %X = sub int 12345, %A
        br label %L
L:
        %Y = add int %X, %B
        ret int %Y
}

we used to produce:

argtest:
        mov %ECX, DWORD PTR [%ESP + 4]
        mov %EAX, 12345
        sub %EAX, %ECX
        mov %EDX, DWORD PTR [%ESP + 8]
.LBBargtest_1:  # L
        add %EAX, %EDX
        ret


now we produce:

argtest:
        mov %EAX, 12345
        sub %EAX, DWORD PTR [%ESP + 4]
.LBBargtest_1:  # L
        add %EAX, DWORD PTR [%ESP + 8]
        ret

This also fixes the FIXME in the code.

BTW, this occurs in real code.  164.gzip shrinks from 8623 to 8608 lines of
.s file.  The stack frame in huft_build shrinks from 1644->1628 bytes,
inflate_codes shrinks from 116->108 bytes, and inflate_block from 2620->2612,
due to fewer spills.

Take that alkis. :-)

llvm-svn: 19639
2005-01-17 17:55:19 +00:00
Chris Lattner
49a1f3a109 Refactor code into a new method.
llvm-svn: 19635
2005-01-17 17:15:02 +00:00
Chris Lattner
835a5efef3 add method stub
llvm-svn: 19612
2005-01-16 07:28:41 +00:00
Chris Lattner
9f8589f4b3 Add support for promoted registers being live across blocks.
llvm-svn: 19595
2005-01-16 02:23:07 +00:00
Chris Lattner
9762070e50 Use the new TLI method to get this.
llvm-svn: 19582
2005-01-16 01:11:19 +00:00
Chris Lattner
1de18d422e Add support for targets that require promotions.
llvm-svn: 19579
2005-01-16 00:37:38 +00:00
Chris Lattner
2f65e8798f Add new SIGN_EXTEND_INREG, ZERO_EXTEND_INREG, and FP_ROUND_INREG operators.
llvm-svn: 19568
2005-01-15 06:17:04 +00:00
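
A rough picture of what SIGN_EXTEND_INREG means, as a hypothetical IR
equivalent (shr on a signed type is an arithmetic shift in this era's IR):
sign-extend the low 8 bits of a value already held in a 32-bit register.

int %sext_in_reg_8(int %X) {
        %t = shl int %X, ubyte 24       ; move the low byte to the top
        %r = shr int %t, ubyte 24       ; arithmetic shift back replicates the sign bit
        ret int %r
}
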
Chris Lattner
2dfbc4fddd Adjust to CopyFromReg changes, implement deletion of truncating/extending
stores/loads.

llvm-svn: 19562
2005-01-14 22:38:01 +00:00
Chris Lattner
7a8788c9ac Add new ImplicitDef node, rename CopyRegSDNode class to RegSDNode.
llvm-svn: 19535
2005-01-13 20:50:02 +00:00
Chris Lattner
9cc534f2dc Don't forget the existing root.
llvm-svn: 19531
2005-01-13 19:53:14 +00:00
Chris Lattner
160fdb384b Codegen independent ops as being independent.
llvm-svn: 19528
2005-01-13 17:59:43 +00:00
Chris Lattner
e7945a2e2e Add an option to view the selection dags as they are generated.
llvm-svn: 19498
2005-01-12 03:41:21 +00:00
Chris Lattner
f588cdd51e add an assertion, avoid creating copyfromreg/copytoreg pairs that are the
same for PHI nodes.

llvm-svn: 19484
2005-01-11 22:03:46 +00:00
Chris Lattner
7cde8a2658 Turn memset/memcpy/memmove into the corresponding operations.
llvm-svn: 19463
2005-01-11 05:56:49 +00:00
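
The calls in question are the llvm.mem* intrinsics.  A sketch, assuming the
era's four-operand (dest, src, length, alignment) form of llvm.memcpy; the
exact declaration is an assumption and may not match the contemporary LangRef:

declare void %llvm.memcpy(sbyte*, sbyte*, uint, uint)

void %copy16(sbyte* %dst, sbyte* %src) {
        ; lowered to the corresponding DAG memcpy operation rather than a libc call
        call void %llvm.memcpy(sbyte* %dst, sbyte* %src, uint 16, uint 4)
        ret void
}
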
Chris Lattner
cc18c057cf Handle static alloca arguments to PHI nodes.
llvm-svn: 19409
2005-01-09 01:16:24 +00:00
Chris Lattner
3454e31bba Use new interfaces to correctly lower varargs and return/frame address intrinsics.
llvm-svn: 19407
2005-01-09 00:00:49 +00:00
Chris Lattner
aad3ca491d Add support for llvm.setjmp and longjmp. Only 3 SingleSource/UnitTests fail now.
llvm-svn: 19404
2005-01-08 22:48:57 +00:00
Chris Lattner
a58b3f48ef Silence VS warnings.
llvm-svn: 19384
2005-01-08 19:52:31 +00:00
Chris Lattner
60ef22ce82 Adjust to changes in LowerCallTo interfaces
llvm-svn: 19374
2005-01-08 19:26:18 +00:00
Chris Lattner
fd84495692 Add support for FP->INT conversions and back.
llvm-svn: 19369
2005-01-08 08:08:56 +00:00
Chris Lattner
3f2ce91a99 Implement support for long GEP indices on 32-bit archs and support for
int GEP indices on 64-bit archs.

llvm-svn: 19354
2005-01-07 21:56:57 +00:00
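
A small example of the mismatch being handled (invented function): the long
index below is wider than a 32-bit target's pointer, so it must be truncated
(and an int index on a 64-bit target correspondingly sign-extended) before
being scaled and added to the base address.

int %index64(int* %P, long %i) {
        %a = getelementptr int* %P, long %i     ; index wider than a 32-bit pointer
        %v = load int* %a
        ret int %v
}
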
Chris Lattner
86601673d6 Fix handling of dead PHI nodes.
llvm-svn: 19349
2005-01-07 21:34:19 +00:00
Chris Lattner
74f8f6f657 Initial implementation of the SelectionDAGISel class. This contains most
of the code for lowering from LLVM code to a SelectionDAG.

llvm-svn: 19331
2005-01-07 07:47:53 +00:00