1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-30 15:32:52 +01:00
Commit Graph

2048 Commits

Author SHA1 Message Date
Chris Lattner
824a8efa08 Fold (select C, load A, load B) -> load (select C, A, B). This happens quite
a lot throughout many programs.  In particular, specfp triggers it a bunch for
constant FP nodes when you have code like  cond ? 1.0 : -1.0.

If the PPC ISel exposed the loads implicit in pic references to external globals,
we would be able to eliminate a load in cases like this as well:

%X = external global int
%Y = external global int
int* %test4(bool %C) {
        %G = select bool %C, int* %X, int* %Y
        ret int* %G
}

Note that this breaks things that use SrcValue's (see the fixme), but since nothing
uses them yet, this is ok.

Also, simplify some code to use hasOneUse() on an SDOperand instead of hasNUsesOfValue directly.

llvm-svn: 23781
2005-10-18 06:04:22 +00:00
Nate Begeman
b9627ab955 Implement some feedback from Chris re: constant canonicalization
llvm-svn: 23777
2005-10-18 00:28:13 +00:00
Nate Begeman
f495ec2497 Legalize BUILD_PAIR appropriately for upcoming 64 bit PowerPC work.
llvm-svn: 23776
2005-10-18 00:27:41 +00:00
Nate Begeman
b2f472ec56 fold fmul X, +2.0 -> fadd X, X;
llvm-svn: 23774
2005-10-17 20:40:11 +00:00
Chris Lattner
27d166c9bf add a trivial fold
llvm-svn: 23764
2005-10-17 01:07:11 +00:00
Chris Lattner
d22af0377d Fix this logic.
llvm-svn: 23756
2005-10-15 22:35:40 +00:00
Chris Lattner
f2d781b780 Add a case we were missing that was causing us to fail CodeGen/PowerPC/rlwinm.ll:test3
llvm-svn: 23755
2005-10-15 22:18:08 +00:00
Chris Lattner
346dc6fed1 Use getExtLoad here instead of getNode, as extloads produce two values. This
fixes a legalize failure on SPASS for itanium.

llvm-svn: 23747
2005-10-15 20:24:07 +00:00
Nate Begeman
9c6b86dcb4 fold sext_in_reg, sext_in_reg where both have the same VT. This was
popping up in Fourinarow.

llvm-svn: 23722
2005-10-14 01:29:07 +00:00
Nate Begeman
c206a2a32a Relax the checking on zextload generation a bit, since as sabre pointed out
you could be AND'ing with the result of a shift that shifts out all the
bits you care about, in addition to a constant.

Also, move over an add/sub_parts fold from legalize to the dag combiner,
where it works for things other than constants.  Woot!

llvm-svn: 23720
2005-10-14 01:12:21 +00:00
Chris Lattner
b2db3f50f2 Fix the trunc(load) case, finally allowing crafty and povray to pass
llvm-svn: 23718
2005-10-13 22:10:05 +00:00
Chris Lattner
2563394c27 Fix some bugs in (sext (load x))
llvm-svn: 23717
2005-10-13 21:52:31 +00:00
Chris Lattner
3fd6d9358e When ExpandOp'ing a [SZ]EXTLOAD, make sure to remember that the chain
is also legal.  Add support for ExpandOp'ing raw EXTLOADs too.

llvm-svn: 23716
2005-10-13 21:44:47 +00:00
Chris Lattner
dc6e47231b Implement PromoteOp for *EXTLOAD, allowing MallocBench/gs to Legalize
llvm-svn: 23715
2005-10-13 20:07:41 +00:00
Nate Begeman
d155f7f1c2 Fix the remaining DAGCombiner issues pointed out by sabre. This should fix
the remainder of the failures introduced by my patch last night.

llvm-svn: 23714
2005-10-13 18:34:58 +00:00
Chris Lattner
75447d2f8e Fix a minor bug in the dag combiner that broke pcompress2 and some other
tests.

llvm-svn: 23713
2005-10-13 18:16:34 +00:00
Nate Begeman
9a08d6fb43 Add support to Legalize for expanding i64 sextload/zextload into hi and lo
parts. This should fix the crafty and signed long long unit test failure
on x86 last night.

llvm-svn: 23711
2005-10-13 17:15:37 +00:00
Jim Laskey
2d23e75ac5 Inhibit instructions from being pushed before function calls. This will
minimize unnecessary spilling.

llvm-svn: 23710
2005-10-13 16:44:00 +00:00
Nate Begeman
c7e7c94db5 Move some Legalize functionality over to the DAGCombiner where it belongs.
Kill some dead code.

llvm-svn: 23706
2005-10-13 03:11:28 +00:00
Nate Begeman
b1d64c386b Fix a potential bug with two combine-to's back to back that chris pointed
out, where after the first CombineTo() call, the node the second CombineTo
wishes to replace may no longer exist.

Fix a very real bug with the truncated load optimization on little endian
targets, which do not need a byte offset added to the load.

llvm-svn: 23704
2005-10-12 23:18:53 +00:00
Nate Begeman
d97bb9d084 More cool stuff for the dag combiner. We can now finally handle things
like turning:

_foo:
        fctiwz f0, f1
        stfd f0, -8(r1)
        lwz r2, -4(r1)
        rlwinm r3, r2, 0, 16, 31
        blr

into
_foo:
        fctiwz f0,f1
        stfd f0,-8(r1)
        lhz r3,-2(r1)
        blr

Also removed an unncessary constraint from sra -> srl conversion, which
should take care of hte only reason we would ever need to handle sra in
MaskedValueIsZero, AFAIK.

llvm-svn: 23703
2005-10-12 20:40:40 +00:00
Jim Laskey
2fe279f783 Finally committing to the new scheduler. Still -sched=none by default.
llvm-svn: 23702
2005-10-12 18:29:35 +00:00
Jim Laskey
84b96c93e4 Added graphviz/gv support for MF.
llvm-svn: 23700
2005-10-12 12:09:05 +00:00
Chris Lattner
21c3cf756f Fix a powerpc crash on CodeGen/Generic/llvm-ct-intrinsics.ll
llvm-svn: 23694
2005-10-11 17:56:34 +00:00
Chris Lattner
e6fcb88ad1 Add a canonicalization that got lost, fixing PowerPC/fold-li.ll:SUB
llvm-svn: 23693
2005-10-11 06:07:15 +00:00
Chris Lattner
1441e42e56 clean up some corner cases
llvm-svn: 23692
2005-10-10 23:00:08 +00:00
Chris Lattner
5308a566ba Implement trivial DSE. If two stores are neighbors and store to the same
location, replace them with a new store of the last value.  This occurs
in the same neighborhood in 197.parser, speeding it up about 1.5%

llvm-svn: 23691
2005-10-10 22:31:19 +00:00
Chris Lattner
1c60abe065 Add support for CombineTo, allowing the dag combiner to replace nodes with
multiple results.

Use this support to implement trivial store->load forwarding, implementing
CodeGen/PowerPC/store-load-fwd.ll.  Though this is the most simple case and
can be extended in the future, it is still useful.  For example, it speeds
up 197.parser by 6.2% by avoiding an LSU reject in xalloc:

        stw r6, lo16(l5_end_of_array)(r2)
        addi r2, r5, -4
        stwx r5, r4, r2
-       lwzx r5, r4, r2
-       rlwinm r5, r5, 0, 0, 30
        stwx r5, r4, r2
        lwz r2, -4(r4)
        ori r2, r2, 1

llvm-svn: 23690
2005-10-10 22:04:48 +00:00
Nate Begeman
703c8f57d6 Teach the DAGCombiner several new tricks, teaching it how to turn
sext_inreg into zext_inreg based on the signbit (fires a lot), srem into
urem, etc.

llvm-svn: 23688
2005-10-10 21:26:48 +00:00
Chris Lattner
caed5bc79b Fix comment
llvm-svn: 23686
2005-10-10 16:52:03 +00:00
Chris Lattner
58f0f21ed1 Add ISD::ADD to MaskedValueIsZero
llvm-svn: 23685
2005-10-10 16:51:40 +00:00
Chris Lattner
5b6e18d6fd This function is now dead
llvm-svn: 23684
2005-10-10 16:49:22 +00:00
Chris Lattner
29613bce04 Enable Nate's excellent DAG combiner work by default. This allows the
removal of a bunch of ad-hoc and crufty code from SelectionDAG.cpp.

llvm-svn: 23682
2005-10-10 16:47:10 +00:00
Chris Lattner
097b306215 add a todo for something I noticed
llvm-svn: 23679
2005-10-09 22:59:08 +00:00
Chris Lattner
d0eecf4e64 (X & Y) & C == 0 if either X&C or Y&C are zero
llvm-svn: 23678
2005-10-09 22:12:36 +00:00
Chris Lattner
d5ac294abd When emiting a CopyFromReg and the source is already a vreg, do not bother
creating a new vreg and inserting a copy: just use the input vreg directly.

This speeds up the compile (e.g. about 5% on mesa with a debug build of llc)
by not adding a bunch of copies and vregs to be coallesced away.  On mesa,
for example, this reduces the number of intervals from 168601 to 129040
going into the coallescer.

llvm-svn: 23671
2005-10-09 05:58:56 +00:00
Nate Begeman
8feae2fcc9 Lo and behold, the last bits of SelectionDAG.cpp have been moved over.
llvm-svn: 23665
2005-10-08 00:29:44 +00:00
Chris Lattner
f8b0332dfc remove debugging code
llvm-svn: 23663
2005-10-07 15:31:26 +00:00
Chris Lattner
dff6183cd7 implement CodeGen/PowerPC/div-2.ll:test2-4 by propagating zero bits through
C-X's

llvm-svn: 23662
2005-10-07 15:30:32 +00:00
Chris Lattner
5e0581c32b fix indentation
llvm-svn: 23660
2005-10-07 06:37:02 +00:00
Chris Lattner
36b58a015b Turn sdivs into udivs when we can prove the sign bits are clear. This
implements CodeGen/PowerPC/div-2.ll

llvm-svn: 23659
2005-10-07 06:10:46 +00:00
Chris Lattner
7709ee0085 silence a bogus GCC warning
llvm-svn: 23646
2005-10-06 17:39:10 +00:00
Chris Lattner
3b848038ab Fix the LLC regressions on X86 last night. In particular, when undoing
previous copy elisions and we discover we need to reload a register, make
sure to use the regclass of the original register for the reload, not the
class of the current register.  This avoid using 16-bit loads to reload 32-bit
values.

llvm-svn: 23645
2005-10-06 17:19:06 +00:00
Chris Lattner
0f04d333d5 Make the legalizer completely non-recursive
llvm-svn: 23642
2005-10-06 01:20:27 +00:00
Nate Begeman
85d4334da0 Let the combiner handle more cases
llvm-svn: 23641
2005-10-05 21:44:43 +00:00
Nate Begeman
cf23a9a328 Remove some bad code from Legalize
llvm-svn: 23640
2005-10-05 21:44:10 +00:00
Nate Begeman
a34475adfc Check in some more DAGCombiner pieces
llvm-svn: 23639
2005-10-05 21:43:42 +00:00
Chris Lattner
5afc88fe07 Fix a bug in the local spiller, where we could take code like this:
store r12 -> [ss#2]
  R3 = load [ss#1]
  use R3
  R3 = load [ss#2]
  R4 = load [ss#1]

and turn it into this code:

  store R12 -> [ss#2]
  R3 = load [ss#1]
  use R3
  R3 = R12
  R4 = R3    <- oops!

The problem was that promoting R3 = load[ss#2] to a copy missed the fact that
the instruction invalidated R3 at that point.

llvm-svn: 23638
2005-10-05 18:30:19 +00:00
Chris Lattner
7f1bde4996 implement visitBR_CC so that PowerPC/inverted-bool-compares.ll passes
with the dag combiner.  This speeds up espresso by 8%, reaching performance
parity with the dag-combiner-disabled llc.

llvm-svn: 23636
2005-10-05 06:47:48 +00:00
Chris Lattner
27adcf1b0f fix some pastos
llvm-svn: 23635
2005-10-05 06:37:22 +00:00
Chris Lattner
697fdaba58 Add a new HandleNode class, which is used to handle (haha) cases in the
dead node elim and dag combiner passes where the root is potentially updated.
This fixes a fixme in the dag combiner.

llvm-svn: 23634
2005-10-05 06:35:28 +00:00
Chris Lattner
75ce53eefd Implement the code for PowerPC/inverted-bool-compares.ll, even though it
that testcase still does not pass with the dag combiner.  This is because
not all forms of br* are folded yet.

Also, when we combine a node into another one, delete the node immediately
instead of waiting for the node to potentially come up in the future.

llvm-svn: 23632
2005-10-05 06:11:08 +00:00
Chris Lattner
cf12d7b556 make sure that -view-isel-dags is the input to the isel, not the input to
the second phase of dag combining

llvm-svn: 23631
2005-10-05 06:09:10 +00:00
Chris Lattner
86ccb0efb4 Fix a crash compiling Olden/tsp
llvm-svn: 23630
2005-10-05 04:45:43 +00:00
Jim Laskey
9a2a3d4aab Reverting to version - until problem isolated.
llvm-svn: 23622
2005-10-04 16:41:51 +00:00
Nate Begeman
9740be11f7 Fix some faulty logic in the libcall inserter.
Since calls return more than one value, don't bail if one of their uses
happens to be a node that's not an MVT::Other when following the chain
from CALLSEQ_START to CALLSEQ_END.

Once we've found a CALLSEQ_START, we can just return; there's no need to
tail-recurse further up the graph.

Most importantly, just because something only has one use doesn't mean we
should use it's one use to follow from start to end.  This faulty logic
caused us to follow a chain of one-use FP operations back to a much earlier
call, putting a cycle in the graph from a later start to an earlier end.

This is a better fix that reverting to the workaround committed earlier
today.

llvm-svn: 23620
2005-10-04 02:10:55 +00:00
Nate Begeman
2dcd06e46c Add back a workaround that fixes some breakages from chris's last change.
Neither of us have yet figured out why this code is necessary, but stuff
breaks if its not there.  Still tracking this down...

llvm-svn: 23617
2005-10-04 00:37:37 +00:00
Jim Laskey
22633f7a41 Refactor gathering node info and emission.
llvm-svn: 23610
2005-10-03 12:30:32 +00:00
Chris Lattner
7754dd8231 clean up this code a bit, no functionality change
llvm-svn: 23609
2005-10-03 07:22:07 +00:00
Chris Lattner
018dc6d807 Break the body of the loop out into a new method
llvm-svn: 23606
2005-10-03 04:47:08 +00:00
Chris Lattner
70b5f4e3fd Fix a problem where the legalizer would run out of stack space on extremely
large basic blocks because it was purely recursive.  This switches it to an
iterative/recursive hybrid.

llvm-svn: 23596
2005-10-02 17:49:46 +00:00
Chris Lattner
52952a665d silence a bogus warning
llvm-svn: 23595
2005-10-02 16:30:51 +00:00
Chris Lattner
aa1a841fc7 Add assertions to the trivial scheduler to check that the value types match
up between defs and uses.

llvm-svn: 23590
2005-10-02 07:10:55 +00:00
Chris Lattner
2b189d4f9e Codegen CopyFromReg using the regclass that matches the valuetype of the
destination vreg.

llvm-svn: 23586
2005-10-02 06:34:16 +00:00
Chris Lattner
37fdc6dbf9 Add some very paranoid checking for operand/result reg class matchup
For instructions that define multiple results, use the right regclass
to define the result, not always the rc of result #0

llvm-svn: 23580
2005-10-01 07:45:09 +00:00
Jeff Cohen
412582bcec Fix VC++ warnings.
llvm-svn: 23579
2005-10-01 03:57:14 +00:00
Chris Lattner
2a439615b7 add a method
llvm-svn: 23575
2005-10-01 00:17:07 +00:00
Jim Laskey
532fc48d3d typo
llvm-svn: 23574
2005-10-01 00:08:23 +00:00
Jim Laskey
809ab88d91 1. Simplify the gathering of node groups.
2. Printing node groups when displaying nodes.

llvm-svn: 23573
2005-10-01 00:03:07 +00:00
Jim Laskey
5e51979f90 1. Made things node-centric (from operand).
2. Added node groups to handle flagged nodes.

3. Started weaning simple scheduling off existing emitter.

llvm-svn: 23566
2005-09-30 19:15:27 +00:00
Chris Lattner
3fcb5aa250 now that we have a reg class to spill with, get this info from the regclass
llvm-svn: 23559
2005-09-30 17:19:22 +00:00
Chris Lattner
738631f389 Now that we have getCalleeSaveRegClasses() info, use it to pass the register
class into the spill/reload methods.  Targets can now rely on that argument.

llvm-svn: 23556
2005-09-30 16:59:07 +00:00
Chris Lattner
a9cd99bbc1 Change this code ot pass register classes into the stack slot spiller/reloader
code.  PrologEpilogInserter hasn't been updated yet though, so targets cannot
use this info.

llvm-svn: 23536
2005-09-30 01:29:00 +00:00
Chris Lattner
9fbe5b6a51 Fix two bugs in my patch earlier today that broke int->fp conversion on X86.
llvm-svn: 23522
2005-09-29 06:44:39 +00:00
Jeff Cohen
e070c04df0 Silence VC++ redeclaration warnings.
llvm-svn: 23516
2005-09-29 01:59:49 +00:00
Chris Lattner
61f3785147 Add FP versions of the binary operators, keeping the int and fp worlds seperate.
Though I have done extensive testing, it is possible that this will break
things in configs I can't test.  Please let me know if this causes a problem
and I'll fix it ASAP.

llvm-svn: 23504
2005-09-28 22:28:18 +00:00
Chris Lattner
4655e9de38 If the target prefers it, use _setjmp/_longjmp should be used instead of setjmp/longjmp for llvm.setjmp/llvm.longjmp.
llvm-svn: 23481
2005-09-27 22:15:53 +00:00
Jim Laskey
5a82322c66 Remove some redundancies.
llvm-svn: 23469
2005-09-27 17:32:45 +00:00
Jim Laskey
22a1f0f44b Addition of a simple two pass scheduler. This version is currently hacked up
for testing and will require target machine info to do a proper scheduling.
The simple scheduler can be turned on using -sched=simple (defaults
to -sched=none)

llvm-svn: 23455
2005-09-26 21:57:04 +00:00
Chris Lattner
be817baed9 Turn (X^C1) == C2 into X == C1^C2 iff X&~C1 = 0 (and move a function)
This happens all the time on PPC for bool values, e.g. eliminating a xori
in inverted-bool-compares.ll.

This should be added to the dag combiner as well.

llvm-svn: 23403
2005-09-23 00:55:52 +00:00
Chris Lattner
288e5b0a7d Expose the LiveInterval interfaces as public headers.
llvm-svn: 23400
2005-09-21 04:19:09 +00:00
Nate Begeman
236df45f1b Stub out the rest of the DAG Combiner. Just need to fill in the
select_cc bits and then wrap it in a convenience function for  use with
regular select.

llvm-svn: 23389
2005-09-19 22:34:01 +00:00
Chris Lattner
59dd979162 Teach the local spiller to turn stack slot loads into register-register copies
when possible, avoiding the load (and avoiding the copy if the value is already
in the right register).

This patch came about when I noticed code like the following being generated:

  store R17 -> [SS1]
  ...blah...
  R4 = load [SS1]

This was causing an LSU reject on the G5.  This problem was due to the register
allocator folding spill code into a reg-reg copy (producing the load), which
prevented the spiller from being able to rewrite the load into a copy, despite
the fact that the value was already available in a register.  In the case
above, we now rip out the R4 load and replace it with a R4 = R17 copy.

This speeds up several programs on X86 (which spills a lot :) ), e.g.
smg2k from 22.39->20.60s, povray from 12.93->12.66s, 168.wupwise from
68.54->53.83s (!), 197.parser from 7.33->6.62s (!), etc.  This may have a larger
impact in some cases on the G5 (by avoiding LSU rejects), though it probably
won't trigger as often (less spilling in general).

Targets that implement folding of loads/stores into copies should implement
the isLoadFromStackSlot hook to get this.

llvm-svn: 23388
2005-09-19 06:56:21 +00:00
Nate Begeman
d733cf7e8b More DAG combining. Still need the branch instructions, and select_cc
llvm-svn: 23371
2005-09-16 00:54:12 +00:00
Chris Lattner
49669dd169 If a function has liveins, and if the target requested that they be plopped
into particular vregs, emit copies into the entry MBB.

llvm-svn: 23331
2005-09-13 19:30:54 +00:00
Chris Lattner
38fb15db44 Allow targets to say they don't support truncstore i1 (which includes a mask
when storing to an 8-bit memory location), as most don't.

llvm-svn: 23303
2005-09-10 00:20:18 +00:00
Chris Lattner
52a8cb35e6 Add a missing #include, patch courtesy of Baptiste Lepilleur.
llvm-svn: 23302
2005-09-09 23:53:39 +00:00
Chris Lattner
cae9229d6e Fix a problem duraid encountered on itanium where this folding:
select (x < y), 1, 0 -> (x < y) incorrectly: the setcc returns i1 but the
select returned i32.  Add the zero extend as needed.

llvm-svn: 23301
2005-09-09 23:00:07 +00:00
Chris Lattner
85884e9b8a Fix a crash viewing dags that have target nodes in them
llvm-svn: 23300
2005-09-09 22:35:03 +00:00
Chris Lattner
e7610bc599 Use continue in the use-processing loop to make it clear what the early exits
are, simplify logic, and cause things to not be nested as deeply.  This also
uses MRI->areAliases instead of an explicit loop.

No functionality change, just code cleanup.

llvm-svn: 23296
2005-09-09 20:29:51 +00:00
Nate Begeman
8422b3637e Last round of 2-node folds from SD.cpp. Will move on to 3 node ops such
as setcc and select next.

llvm-svn: 23295
2005-09-09 19:49:52 +00:00
Chris Lattner
fc17fe0e6d remove debugging code *slaps head*
llvm-svn: 23294
2005-09-09 19:19:20 +00:00
Chris Lattner
8d8506f8e2 When spilling a live range that is used multiple times by one instruction,
only add a reload live range once for the instruction.  This is one step
towards fixing a regalloc pessimization that Nate notice, but is later undone
by the spiller (so no code is changed).

llvm-svn: 23293
2005-09-09 19:17:47 +00:00
Nate Begeman
1675c67c62 Move yet more folds over to the dag combiner from sd.cpp
llvm-svn: 23278
2005-09-08 20:18:10 +00:00
Nate Begeman
c0f764ada4 Another round of dag combiner changes. This fixes some missing XOR folds
as well as fixing how we replace old values with new values.

llvm-svn: 23260
2005-09-07 23:25:52 +00:00
Chris Lattner
b3516c123f Fix a bug that Tzu-Chien Chiu noticed: live interval analysis does NOT
preserve livevar

llvm-svn: 23259
2005-09-07 17:34:39 +00:00
Nate Begeman
e8db0c961a Implement a common missing fold, (add (add x, c1), c2) -> (add x, c1+c2).
This restores all of stanford to being identical with and without the dag
combiner with the add folding turned off in sd.cpp.

llvm-svn: 23258
2005-09-07 16:09:19 +00:00
Chris Lattner
482f71733a Fix a bug nate ran into with replacealluseswith. In the recursive cse case,
we were losing a node, causing an assertion to fail.  Now we eagerly delete
discovered CSE's, and provide an optional vector to keep track of these
discovered equivalences.

llvm-svn: 23255
2005-09-07 05:37:01 +00:00
Nate Begeman
143dc2039d Add an option to the DAG Combiner to enable it for beta runs, and turn on
that option for PowerPC's beta.

llvm-svn: 23253
2005-09-07 00:15:36 +00:00
Nate Begeman
e1a34193fa Next round of DAGCombiner changes. This version now passes all the tests
I have run so far when run before Legalize.  It still needs to pick up the
SetCC folds, and nodes that use SetCC.

llvm-svn: 23243
2005-09-06 04:43:02 +00:00
Chris Lattner
29929a3745 Fix a checking failure in gs
llvm-svn: 23235
2005-09-03 01:04:40 +00:00
Nate Begeman
613f777bbc Next round of DAG Combiner changes. Just need to support multiple return
values, and then we should be able to hook it up.

llvm-svn: 23231
2005-09-02 21:18:40 +00:00
Chris Lattner
da97aa059c Clean up some code from the last checkin
llvm-svn: 23229
2005-09-02 20:32:45 +00:00
Chris Lattner
4c2b614aa6 Fix a bug in legalize where it would emit two calls to libcalls that return
i64 values on targets that need that expanded to 32-bit registers.  This fixes
PowerPC/2005-09-02-LegalizeDuplicatesCalls.ll and speeds up 189.lucas from
taking 122.72s to 81.96s on my desktop.

llvm-svn: 23228
2005-09-02 20:26:58 +00:00
Chris Lattner
17b67e5137 Make sure to auto-cse nullary ops
llvm-svn: 23224
2005-09-02 19:36:17 +00:00
Chris Lattner
7995b70148 Fix some buggy logic where we would try to remove nodes with two operands
from the binary ops map, even if they had multiple results.  This latent bug
caused a few failures with the dag isel last night.

To prevent stuff like this from happening in the future, add some really
strict checking to make sure that the CSE maps always match up with reality!

llvm-svn: 23221
2005-09-02 19:15:44 +00:00
Chris Lattner
365774f457 Don't create zero sized stack objects even for array allocas with a zero
number of elements.

llvm-svn: 23219
2005-09-02 18:41:28 +00:00
Chris Lattner
7d89863a77 Fix the release build, noticed by Eric van Riet Paap
llvm-svn: 23215
2005-09-02 07:09:28 +00:00
Chris Lattner
86bed2f90b Make sure to legalize assert[zs]ext's operand correctly
llvm-svn: 23208
2005-09-02 01:15:01 +00:00
Chris Lattner
4919477f39 Teach live intervals to not crash on dead livein regs
llvm-svn: 23206
2005-09-02 00:20:32 +00:00
Chris Lattner
8a6c15f4f4 For values that are live across basic blocks and need promotion, use ANY_EXTEND
instead of ZERO_EXTEND to eliminate extraneous extensions.  This eliminates
dead zero extensions on formal arguments and other cases on PPC, implementing
the newly tightened up test/Regression/CodeGen/PowerPC/small-arguments.ll test.

llvm-svn: 23205
2005-09-02 00:19:37 +00:00
Chris Lattner
aae61e684c legalize ANY_EXTEND appropriately
llvm-svn: 23204
2005-09-02 00:18:10 +00:00
Chris Lattner
3f7fbe14a8 Add support for ANY_EXTEND and add a few minor folds for it
llvm-svn: 23203
2005-09-02 00:17:32 +00:00
Nate Begeman
626c46f8d9 Fix some code in the current node combining code, spotted when it was moved
over to DAGCombiner.cpp

1. Don't assume that SetCC returns i1 when folding (xor (setcc) constant)
2. Don't duplicate code in folding AND with AssertZext that is handled by
   MaskedValueIsZero

llvm-svn: 23196
2005-09-01 23:25:49 +00:00
Nate Begeman
18f456b8e3 Implement first round of feedback from chris (there's still a couple things
left to do).

llvm-svn: 23195
2005-09-01 23:24:04 +00:00
Chris Lattner
f2b775d686 It is NDEBUG not _NDEBUG
llvm-svn: 23186
2005-09-01 18:44:10 +00:00
Nate Begeman
517e40a5bb Add the rest of the currently implemented visit routines to the switch
statement in visit().

llvm-svn: 23185
2005-09-01 00:33:32 +00:00
Nate Begeman
be2fa8f86f First pass at the DAG Combiner. It isn't used anywhere yet, but it should
be mostly functional.  It currently has all folds from SelectionDAG.cpp
that do not involve a condition code.

llvm-svn: 23184
2005-09-01 00:19:25 +00:00
Chris Lattner
b8dcea186c If a function has live ins/outs, print them
llvm-svn: 23181
2005-08-31 22:34:59 +00:00
Chris Lattner
fc612f96ec Allow targets to custom expand shifts that are too large for their registers
llvm-svn: 23173
2005-08-31 19:01:53 +00:00
Jeff Cohen
8c454a3024 Fix VC++ precedence warnings
llvm-svn: 23169
2005-08-31 02:47:06 +00:00
Nate Begeman
64ea782435 Sigh, not my day. Fix typo.
llvm-svn: 23166
2005-08-31 00:43:49 +00:00
Nate Begeman
d286f16856 Fix a mistake in my previous patch pointed out by sabre; the AssertZext
case in MaskedValueIsZero was wrong.

llvm-svn: 23165
2005-08-31 00:43:08 +00:00
Nate Begeman
d754412b26 Remove some unnecessary casts, and add the AssertZext case to
MaskedValueIsZero.

llvm-svn: 23164
2005-08-31 00:27:53 +00:00
Chris Lattner
87d45af685 Allow physregs to occur in the dag with multiple types. Though I don't likethis, it is a requirement on PPC, which can have an f32 value in r3 at onepoint in a function and a f64 value in r3 at another point. :(
This fixes compilation of mesa

llvm-svn: 23161
2005-08-30 22:38:38 +00:00
Chris Lattner
36461b2e37 When checking the fixed intervals, don't forget to check for register aliases.
This fixes PR621 and Regression/CodeGen/X86/2005-08-30-RegAllocAliasProblem.ll

llvm-svn: 23158
2005-08-30 21:03:36 +00:00
Chris Lattner
6a990c392c Fix FreeBench/fourinarow with the dag isel, by not adding a bogus result
to SHIFT_PARTS nodes

llvm-svn: 23151
2005-08-30 17:21:17 +00:00
Chris Lattner
772c8814b6 Fix a miscompile of PtrDist/bc. Sign extending bools is not the right thing,
at least tends to expose problems elsewhere.

llvm-svn: 23149
2005-08-30 16:56:19 +00:00
Nate Begeman
25755f7f00 Remove a bogus piece of my AssertSext/AssertZext patch. oops.
llvm-svn: 23148
2005-08-30 02:54:28 +00:00
Nate Begeman
dc36f47d99 Add support for AssertSext and AssertZext, folding other extensions with
them.  This allows for elminination of redundant extends in the entry
blocks of functions on PowerPC.

Add support for i32 x i32 -> i64 multiplies, by recognizing when the inputs
to ISD::MUL in ExpandOp are actually just extended i32 values and not real
i64 values.  this allows us to codegen

int mulhs(int a, int b) { return ((long long)a * b) >> 32; }
as:
_mulhs:
        mulhw r3, r4, r3
        blr

instead of:
_mulhs:
        mulhwu r2, r4, r3
        srawi r5, r3, 31
        mullw r5, r4, r5
        add r2, r2, r5
        srawi r4, r4, 31
        mullw r3, r4, r3
        add r3, r2, r3
        blr

with a similar improvement on x86.

llvm-svn: 23147
2005-08-30 02:44:00 +00:00
Chris Lattner
a611caeec8 Name this variable to be what it really is!
llvm-svn: 23145
2005-08-30 01:58:51 +00:00
Chris Lattner
56051a0f92 Handle CopyToReg nodes with flag operands correctly
llvm-svn: 23144
2005-08-30 01:57:23 +00:00
Chris Lattner
774b9718dc Add a hack to avoid some horrible code in some cases by always emitting
token chains first.  For this C function:

int test() {
  int i;
  for (i = 0; i < 100000; ++i)
    foo();
}

Instead of emitting this (condition before call)

.LBB_test_1:    ; no_exit
        addi r30, r30, 1
        lis r2, 1
        ori r2, r2, 34464
        cmpw cr2, r30, r2
        bl L_foo$stub
        bne cr2, .LBB_test_1    ; no_exit

Emit this:

.LBB_test_1:    ; no_exit
        bl L_foo$stub
        addi r30, r30, 1
        lis r2, 1
        ori r2, r2, 34464
        cmpw cr0, r30, r2
        bne cr0, .LBB_test_1    ; no_exit

Which makes it so we don't have to save/restore cr2 in the prolog/epilog of
the function.

This also makes the code much more similar to what the pattern isel produces.

llvm-svn: 23135
2005-08-29 23:21:29 +00:00
Chris Lattner
32609690c3 Add a new API for Nate
llvm-svn: 23131
2005-08-29 21:59:31 +00:00
Andrew Lenharth
f580b078b6 Some of us cared about the the promote path
llvm-svn: 23130
2005-08-29 20:46:51 +00:00
Chris Lattner
b0e46fa671 Fix an infinite loop on x86
llvm-svn: 23129
2005-08-29 17:30:00 +00:00
Chris Lattner
21400573a7 Fix a bug in my previous patch that was using the wrong iterator. This fixes
Olden/bisort among others.

llvm-svn: 23124
2005-08-29 00:10:46 +00:00
Chris Lattner
44dcf508a1 Fix a bug in ReplaceAllUsesWith
llvm-svn: 23122
2005-08-28 23:59:36 +00:00
Chris Lattner
e0eae3d244 Disable this code, which broke many tests last night
llvm-svn: 23114
2005-08-27 16:16:51 +00:00
Chris Lattner
6bf97cff13 fix PHI node emission for basic blocks that have select_cc's in them on ppc32
llvm-svn: 23113
2005-08-27 00:58:02 +00:00
Chris Lattner
35a82f5f79 Nate noticed that Andrew never did this. This fixes PR600
llvm-svn: 23110
2005-08-26 22:50:40 +00:00
Chris Lattner
e9cc12f5c4 Don't copy regs that are only used in the entry block into a vreg. This
changes the code generated for:

short %test(short %A) {
  %B = xor short %A, -32768
  ret short %B
}

to:

_test:
        xori r2, r3, 32768
        xoris r2, r2, 65535
        extsh r3, r2
        blr

instead of:

_test:
        rlwinm r2, r3, 0, 16, 31
        xori r2, r3, 32768
        xoris r2, r2, 65535
        extsh r3, r2
        blr

llvm-svn: 23109
2005-08-26 22:49:59 +00:00
Chris Lattner
5f55dd72af Make this code safe for when loadRegFromStackSlot inserts multiple instructions.
llvm-svn: 23108
2005-08-26 22:18:32 +00:00
Chris Lattner
7efca0c312 Checking types here is not safe, because multiple types can map to the same
register class.

llvm-svn: 23103
2005-08-26 21:39:15 +00:00
Chris Lattner
faa96209d8 Call the InsertAtEndOfBasicBlock hook if the usesCustomDAGSchedInserter
flag is set on an instruction.

llvm-svn: 23098
2005-08-26 20:54:47 +00:00
Chris Lattner
3e0bfc0cc1 Revampt ReplaceAllUsesWith to be more efficient and easier to use.
llvm-svn: 23087
2005-08-26 18:36:28 +00:00
Chris Lattner
a31708e6b3 Change ConstantPoolSDNode to actually hold the Constant itself instead of
putting it into the constant pool.  This allows the isel machinery to
create constants that it will end up deciding are not needed, without them
ending up in the resultant function constant pool.

llvm-svn: 23081
2005-08-26 17:15:30 +00:00
Chris Lattner
929c00e9e9 Fix a huge annoyance: SelectNodeTo took types before the opcode unlike
every other SD API.  Fix it to take the opcode before the types.

llvm-svn: 23079
2005-08-26 16:36:26 +00:00
Chris Lattner
6d22117d76 the 5th operand is the 4th number
llvm-svn: 23074
2005-08-26 00:43:46 +00:00
Chris Lattner
f25ec1b7b9 Add support for targets that want to custom expand select_cc in some cases.
llvm-svn: 23071
2005-08-26 00:23:59 +00:00
Chris Lattner
7e68d39877 Allow LowerOperation to return a null SDOperand in case it wants to lower
some things given to it, but not all.

llvm-svn: 23070
2005-08-26 00:14:16 +00:00
Chris Lattner
2c3fbbab05 Fix a nasty bug from a previous patch of mine
llvm-svn: 23069
2005-08-26 00:13:12 +00:00
Nate Begeman
54f44ad750 New fold for SELECT_CC
llvm-svn: 23058
2005-08-25 20:04:38 +00:00
Chris Lattner
18572f3c67 Don't auto-cse nodes that return flags
llvm-svn: 23055
2005-08-25 19:12:10 +00:00
Chris Lattner
cb3910fa74 add printer support for flag operands
llvm-svn: 23054
2005-08-25 17:59:23 +00:00
Chris Lattner
7c7a447220 simplify the code a bit using isOperationLegal
llvm-svn: 23053
2005-08-25 17:54:58 +00:00
Chris Lattner
7598a14e0c Add support for flag operands
llvm-svn: 23050
2005-08-25 17:48:54 +00:00
Chris Lattner
0168c8df11 ADd support for TargetConstantPool nodes
llvm-svn: 23041
2005-08-25 05:03:06 +00:00
Chris Lattner
813f6ddaf8 add a new TargetFrameIndex node
llvm-svn: 23035
2005-08-25 00:43:01 +00:00
Chris Lattner
22c6c99e5d add a method
llvm-svn: 23027
2005-08-24 23:00:29 +00:00
Chris Lattner
786ec10dfb Add ReplaceAllUsesWith that can take a vector of replacement values.
Add some foldings to hopefully help the illegal setcc issue, and move some code around.

llvm-svn: 23025
2005-08-24 22:44:39 +00:00
Chris Lattner
92b560cfee Add support for external symbols, and support for variable arity instructions
llvm-svn: 23022
2005-08-24 22:02:41 +00:00
Chris Lattner
55fb48f5ee Fix pasto that prevented VT ndoes from showing up in -view-isel-dags correctly
llvm-svn: 23021
2005-08-24 18:30:00 +00:00
Chris Lattner
6d4cd33447 teach selection dag mask tracking about the fact that select_cc operates like
select.  Also teach it that the bit count instructions can only set the low bits
of the result, depending on the size of the input.

This allows us to compile this:

int %eq0(int %a) {
        %tmp.1 = seteq int %a, 0                ; <bool> [#uses=1]
        %tmp.2 = cast bool %tmp.1 to int                ; <int> [#uses=1]
        ret int %tmp.2
}

To this:

_eq0:
        cntlzw r2, r3
        srwi r3, r2, 5
        blr

instead of this:

_eq0:
        cntlzw r2, r3
        rlwinm r3, r2, 27, 31, 31
        blr

when setcc is marked illegal on ppc (which restores parity to non-illegal
setcc).  Thanks to Nate for pointing this out.

llvm-svn: 23013
2005-08-24 16:46:55 +00:00
Chris Lattner
014e001f23 Start using isOperationLegal and isTypeLegal to simplify the code
llvm-svn: 23012
2005-08-24 16:35:28 +00:00
Nate Begeman
2fc750ca45 Teach SelectionDAG how to simplify a few more setcc-equivalent select_cc
nodes so that backends don't have to.

llvm-svn: 22999
2005-08-24 04:57:57 +00:00
Chris Lattner
64f7f0beac Make -view-isel-dags show the dag before instruction selecting, in case
the target isel crashes due to unimplemented features like calls :)

llvm-svn: 22997
2005-08-24 00:34:29 +00:00
Nate Begeman
d4fcf86262 Fix optimization of select_cc seteq X, 0, 1, 0 -> srl (ctlz X), log2 X size
llvm-svn: 22995
2005-08-24 00:21:28 +00:00
Chris Lattner
e7c3b71a28 Implement LiveVariables.h change
llvm-svn: 22994
2005-08-24 00:09:33 +00:00
Chris Lattner
7e3441972b adjust to new live variables interface
llvm-svn: 22992
2005-08-23 23:42:17 +00:00
Chris Lattner
53b91b741f Simplify this code by using higher-level LiveVariables methods
llvm-svn: 22989
2005-08-23 22:51:41 +00:00
Chris Lattner
610eeca969 Keep track of which registers are related to which other registers.
Use this information to avoid doing expensive interval intersections for
registers that could not possible be interesting.  This speeds up linscan
on ia64 compiling kc++ in release mode from taking 7.82s to 4.8s(!), total
itanium llc time on this program is 27.3s now.  This marginally speeds up
PPC and X86, but they appear to be limited by other parts of linscan, not
this code.

On this program, on itanium, live intervals now takes 41% of llc time.

llvm-svn: 22986
2005-08-23 22:27:31 +00:00
Nate Begeman
f1581c11e9 Teach the SelectionDAG how to transform select_cc eq, X, 0, 1, 0 into
either seteq X, 0 or srl (ctlz X), size(X-1), depending on what's legal
for the target.

llvm-svn: 22978
2005-08-23 05:41:12 +00:00
Nate Begeman
885680bafb Teach Legalize how to turn setcc into select_cc
llvm-svn: 22977
2005-08-23 04:29:48 +00:00
Chris Lattner
2c5f36d938 Try to avoid scanning the fixed list. On architectures with a non-stupid
number of regs (e.g. most riscs), many functions won't need to use callee
clobbered registers.  Do a speculative check to see if we can get a free
register without processing the fixed list (which has all of these).  This
saves a lot of time on machines with lots of callee clobbered regs (e.g.
ppc and itanium, also x86).

This reduces ppc llc compile time from 184s -> 172s on kc++.  This is probably
worth FAR FAR more on itanium though.

llvm-svn: 22972
2005-08-22 20:59:30 +00:00
Chris Lattner
9b0058b424 Move some code in the register assignment case that only needs to happen if
we spill out of the fast path.  The scan of active_ and the calls to
updateSpillWeights don't need to happen unless a spill occurs.  This reduces
debug llc time of kc++ with ppc from 187.3s to 183.2s.

llvm-svn: 22971
2005-08-22 20:20:42 +00:00
Chris Lattner
d73a5042d9 Fix a problem where constant expr shifts would not have their shift amount
promoted to the right type.  This fixes: IA64/2005-08-22-LegalizerCrash.ll

llvm-svn: 22969
2005-08-22 17:28:31 +00:00
Chris Lattner
a9710ba54f Speed up this loop a bit, based on some observations that Nate made, and
add some comments.  This loop really needs to be reevaluated!

llvm-svn: 22966
2005-08-22 16:55:22 +00:00
Chris Lattner
7ce81741ff Add a fast-path for register values. Add support for constant pool entries,
allowing us to compile this:

float %test2(float* %P) {
        %Q = load float* %P
        %R = add float %Q, 10.1
        ret float %R
}

to this:

_test2:
        lfs r2, 0(r3)
        lis r3, ha16(.CPI_test2_0)
        lfs r3, lo16(.CPI_test2_0)(r3)
        fadds f1, r2, r3
        blr

llvm-svn: 22962
2005-08-22 01:04:32 +00:00
Chris Lattner
8927bf468d add anew method
llvm-svn: 22957
2005-08-21 22:30:30 +00:00
Chris Lattner
7a04eff613 Add support for frame index nodes
llvm-svn: 22956
2005-08-21 19:56:04 +00:00
Chris Lattner
cbbd212622 add a method
llvm-svn: 22955
2005-08-21 19:48:59 +00:00
Chris Lattner
481b47fc75 add a method
llvm-svn: 22949
2005-08-21 18:49:33 +00:00
Chris Lattner
3f6df51c19 Add support for basic blocks, fix a bug in result # computation
llvm-svn: 22948
2005-08-21 18:49:29 +00:00
Chris Lattner
9bb0d10479 When legalizing brcond ->brcc or select -> selectcc, make sure to truncate
the old condition to a one bit value.  The incoming value must have been
promoted, and the top bits are undefined.  This causes us to generate:

_test:
        rlwinm r2, r3, 0, 31, 31
        li r3, 17
        cmpwi cr0, r2, 0
        bne .LBB_test_2 ;
.LBB_test_1:    ;
        li r3, 1
.LBB_test_2:    ;
        blr

instead of:

_test:
        rlwinm r2, r3, 0, 31, 31
        li r2, 17
        cmpwi cr0, r3, 0
        bne .LBB_test_2 ;
.LBB_test_1:    ;
        li r2, 1
.LBB_test_2:    ;
        or r3, r2, r2
        blr

for:

int %test(bool %c) {
        %retval = select bool %c, int 17, int 1
        ret int %retval
}

llvm-svn: 22947
2005-08-21 18:03:09 +00:00
Chris Lattner
7c3e52ef92 fix bogus warning
llvm-svn: 22943
2005-08-20 18:07:27 +00:00
Chris Lattner
5b7488224d Add support for global address nodes
llvm-svn: 22940
2005-08-19 22:38:24 +00:00
Chris Lattner
5210fd0e51 Add support for TargetGlobalAddress nodes
llvm-svn: 22938
2005-08-19 22:31:04 +00:00
Chris Lattner
bedf8e757a Implement CopyFromReg, TokenFactor, and fix a bug in CopyToReg. This allows
us to compile stuff like this:

double %test(double %A, double %B, double %C, double %E) {
        %F = mul double %A, %A
        %G = add double %F, %B
        %H = sub double -0.0, %G
        %I = mul double %H, %C
        %J = add double %I, %E
        ret double %J
}

to:

_test:
        fnmadd f0, f1, f1, f2
        fmadd f1, f0, f3, f4
        blr

woot!

llvm-svn: 22937
2005-08-19 21:43:53 +00:00
Chris Lattner
b36807b0d0 Fix a bug in previous commit
llvm-svn: 22936
2005-08-19 21:34:13 +00:00
Chris Lattner
ac699c4db9 Print physreg register nodes with target names (e.g. F1) instead of numbers
llvm-svn: 22934
2005-08-19 21:21:16 +00:00
Chris Lattner
011a721d08 Before implementing copyfromreg, we'll implement copytoreg correctly.
This gets us this for the previous testcase:

_test:
        lis r2, 0
        ori r3, r2, 65535
        blr

Note that we actually write to r3 (the return reg) correctly now :)

llvm-svn: 22933
2005-08-19 20:50:53 +00:00
Chris Lattner
9af3aaf541 Now that we have operand info for machine instructions, use it to create
temporary registers for things that define a register.  This allows dag->dag
isel to compile this:

int %test() { ret int 65535 }

into:

_test:
        lis r2, 0
        ori r2, r2, 65535
        blr

Next up, getting CopyFromReg to work, allowing arguments and cross-bb values.

llvm-svn: 22932
2005-08-19 20:45:43 +00:00
Jeff Cohen
12674110d5 Fix VC++ constant truncation warning.
llvm-svn: 22907
2005-08-19 16:19:21 +00:00
Jeff Cohen
f99748bc0f Fix VC++ precedence warning.
llvm-svn: 22902
2005-08-19 04:39:48 +00:00
Chris Lattner
1207209677 Fix computation of # operands, add a temporary hack for CopyToReg
llvm-svn: 22896
2005-08-19 01:01:34 +00:00
Chris Lattner
7b9f02525e add a new -view-sched-dags option to view dags as they are sent to the scheduler.
llvm-svn: 22878
2005-08-18 20:11:49 +00:00
Chris Lattner
62bc771af7 Implement the first chunk of a code emitter. This is sophisticated enough to
codegen:

_empty:
.LBB_empty_0:   ;
        blr

but can't do anything more (yet). :)

llvm-svn: 22876
2005-08-18 20:07:59 +00:00
Chris Lattner
ebb48e5877 new file, obviously just a stub
llvm-svn: 22868
2005-08-18 18:45:24 +00:00
Chris Lattner
5cbeaed711 Enable critical edge splitting by default
llvm-svn: 22863
2005-08-18 17:35:14 +00:00
Nate Begeman
474ec3c02d Add support for target DAG nodes that take 4 operands, such as PowerPC's
rlwinm.

llvm-svn: 22856
2005-08-18 07:30:15 +00:00
Chris Lattner
d6b9b36616 Fix printing of VTSDNodes
llvm-svn: 22853
2005-08-18 03:31:02 +00:00
Jim Laskey
d761e8859d Move the code dependency for MathExtras.h from SelectionDAGNodes.h.
Added some class dividers in SelectionDAG.cpp.

llvm-svn: 22841
2005-08-17 20:08:02 +00:00
Jim Laskey
61e3d7bca5 Culling out use of unions for converting FP to bits and vice versa.
llvm-svn: 22838
2005-08-17 19:34:49 +00:00
Chris Lattner
a11bdf3abe Fix a bug in RemoveDeadNodes where it would crash when its "optional"
argument is not specified.

Implement ReplaceAllUsesWith.

llvm-svn: 22834
2005-08-17 19:00:20 +00:00
Jim Laskey
7cdadb13d5 Switched to using BitsToDouble for int_to_float to avoid aliasing problem.
llvm-svn: 22831
2005-08-17 17:42:52 +00:00
Jim Laskey
2370cb4e85 Change hex float constants for the sake of VC++.
llvm-svn: 22828
2005-08-17 09:44:59 +00:00
Chris Lattner
dbfcba7565 Add a new beta option for critical edge splitting, to avoid a problem that
Nate noticed in yacr2 (and I know occurs in other places as well).

This is still rough, as the critical edge blocks are not intelligently placed
but is added to get some idea to see if this improves performance.

llvm-svn: 22825
2005-08-17 06:37:43 +00:00
Chris Lattner
a103a2e9c6 Fix a regression on X86, where FP values can be promoted too.
llvm-svn: 22822
2005-08-17 06:06:25 +00:00
Jim Laskey
59b9ee0529 Added generic code expansion for [signed|unsigned] i32 to [f32|f64] casts in the
legalizer.  PowerPC now uses this expansion instead of ISel version.

Example:

// signed integer to double conversion
double f1(signed x) {
  return (double)x;
}

// unsigned integer to double conversion
double f2(unsigned x) {
  return (double)x;
}

// signed integer to float conversion
float f3(signed x) {
  return (float)x;
}

// unsigned integer to float conversion
float f4(unsigned x) {
  return (float)x;
}


Byte Code:

internal fastcc double %_Z2f1i(int %x) {
entry:
        %tmp.1 = cast int %x to double          ; <double> [#uses=1]
        ret double %tmp.1
}

internal fastcc double %_Z2f2j(uint %x) {
entry:
        %tmp.1 = cast uint %x to double         ; <double> [#uses=1]
        ret double %tmp.1
}

internal fastcc float %_Z2f3i(int %x) {
entry:
        %tmp.1 = cast int %x to float           ; <float> [#uses=1]
        ret float %tmp.1
}

internal fastcc float %_Z2f4j(uint %x) {
entry:
        %tmp.1 = cast uint %x to float          ; <float> [#uses=1]
        ret float %tmp.1
}

internal fastcc double %_Z2g1i(int %x) {
entry:
        %buffer = alloca [2 x uint]             ; <[2 x uint]*> [#uses=3]
        %tmp.0 = getelementptr [2 x uint]* %buffer, int 0, int 0                ; <uint*> [#uses=1]
        store uint 1127219200, uint* %tmp.0
        %tmp.2 = cast int %x to uint            ; <uint> [#uses=1]
        %tmp.3 = xor uint %tmp.2, 2147483648            ; <uint> [#uses=1]
        %tmp.5 = getelementptr [2 x uint]* %buffer, int 0, int 1                ; <uint*> [#uses=1]
        store uint %tmp.3, uint* %tmp.5
        %tmp.9 = cast [2 x uint]* %buffer to double*            ; <double*> [#uses=1]
        %tmp.10 = load double* %tmp.9           ; <double> [#uses=1]
        %tmp.13 = load double* cast (long* %signed_bias to double*)             ; <double> [#uses=1]
        %tmp.14 = sub double %tmp.10, %tmp.13           ; <double> [#uses=1]
        ret double %tmp.14
}

internal fastcc double %_Z2g2j(uint %x) {
entry:
        %buffer = alloca [2 x uint]             ; <[2 x uint]*> [#uses=3]
        %tmp.0 = getelementptr [2 x uint]* %buffer, int 0, int 0                ; <uint*> [#uses=1]
        store uint 1127219200, uint* %tmp.0
        %tmp.1 = getelementptr [2 x uint]* %buffer, int 0, int 1                ; <uint*> [#uses=1]
        store uint %x, uint* %tmp.1
        %tmp.4 = cast [2 x uint]* %buffer to double*            ; <double*> [#uses=1]
        %tmp.5 = load double* %tmp.4            ; <double> [#uses=1]
        %tmp.8 = load double* cast (long* %unsigned_bias to double*)            ; <double> [#uses=1]
        %tmp.9 = sub double %tmp.5, %tmp.8              ; <double> [#uses=1]
        ret double %tmp.9
}

internal fastcc float %_Z2g3i(int %x) {
entry:
        %buffer = alloca [2 x uint]             ; <[2 x uint]*> [#uses=3]
        %tmp.0 = getelementptr [2 x uint]* %buffer, int 0, int 0                ; <uint*> [#uses=1]
        store uint 1127219200, uint* %tmp.0
        %tmp.2 = cast int %x to uint            ; <uint> [#uses=1]
        %tmp.3 = xor uint %tmp.2, 2147483648            ; <uint> [#uses=1]
        %tmp.5 = getelementptr [2 x uint]* %buffer, int 0, int 1                ; <uint*> [#uses=1]
        store uint %tmp.3, uint* %tmp.5
        %tmp.9 = cast [2 x uint]* %buffer to double*            ; <double*> [#uses=1]
        %tmp.10 = load double* %tmp.9           ; <double> [#uses=1]
        %tmp.13 = load double* cast (long* %signed_bias to double*)             ; <double> [#uses=1]
        %tmp.14 = sub double %tmp.10, %tmp.13           ; <double> [#uses=1]
        %tmp.16 = cast double %tmp.14 to float          ; <float> [#uses=1]
        ret float %tmp.16
}

internal fastcc float %_Z2g4j(uint %x) {
entry:
        %buffer = alloca [2 x uint]             ; <[2 x uint]*> [#uses=3]
        %tmp.0 = getelementptr [2 x uint]* %buffer, int 0, int 0                ; <uint*> [#uses=1]
        store uint 1127219200, uint* %tmp.0
        %tmp.1 = getelementptr [2 x uint]* %buffer, int 0, int 1                ; <uint*> [#uses=1]
        store uint %x, uint* %tmp.1
        %tmp.4 = cast [2 x uint]* %buffer to double*            ; <double*> [#uses=1]
        %tmp.5 = load double* %tmp.4            ; <double> [#uses=1]
        %tmp.8 = load double* cast (long* %unsigned_bias to double*)            ; <double> [#uses=1]
        %tmp.9 = sub double %tmp.5, %tmp.8              ; <double> [#uses=1]
        %tmp.11 = cast double %tmp.9 to float           ; <float> [#uses=1]
        ret float %tmp.11
}


PowerPC Code:

        .machine ppc970


        .const
        .align  2
.CPIl1__Z2f1i_0:                                        ; float 0x4330000080000000
        .long   1501560836      ; float 4.5036e+15
        .text
        .align  2
        .globl  l1__Z2f1i
l1__Z2f1i:
.LBBl1__Z2f1i_0:        ; entry
        xoris r2, r3, 32768
        stw r2, -4(r1)
        lis r2, 17200
        stw r2, -8(r1)
        lfd f0, -8(r1)
        lis r2, ha16(.CPIl1__Z2f1i_0)
        lfs f1, lo16(.CPIl1__Z2f1i_0)(r2)
        fsub f1, f0, f1
        blr


        .const
        .align  2
.CPIl2__Z2f2j_0:                                        ; float 0x4330000000000000
        .long   1501560832      ; float 4.5036e+15
        .text
        .align  2
        .globl  l2__Z2f2j
l2__Z2f2j:
.LBBl2__Z2f2j_0:        ; entry
        stw r3, -4(r1)
        lis r2, 17200
        stw r2, -8(r1)
        lfd f0, -8(r1)
        lis r2, ha16(.CPIl2__Z2f2j_0)
        lfs f1, lo16(.CPIl2__Z2f2j_0)(r2)
        fsub f1, f0, f1
        blr


        .const
        .align  2
.CPIl3__Z2f3i_0:                                        ; float 0x4330000080000000
        .long   1501560836      ; float 4.5036e+15
        .text
        .align  2
        .globl  l3__Z2f3i
l3__Z2f3i:
.LBBl3__Z2f3i_0:        ; entry
        xoris r2, r3, 32768
        stw r2, -4(r1)
        lis r2, 17200
        stw r2, -8(r1)
        lfd f0, -8(r1)
        lis r2, ha16(.CPIl3__Z2f3i_0)
        lfs f1, lo16(.CPIl3__Z2f3i_0)(r2)
        fsub f0, f0, f1
        frsp f1, f0
        blr


        .const
        .align  2
.CPIl4__Z2f4j_0:                                        ; float 0x4330000000000000
        .long   1501560832      ; float 4.5036e+15
        .text
        .align  2
        .globl  l4__Z2f4j
l4__Z2f4j:
.LBBl4__Z2f4j_0:        ; entry
        stw r3, -4(r1)
        lis r2, 17200
        stw r2, -8(r1)
        lfd f0, -8(r1)
        lis r2, ha16(.CPIl4__Z2f4j_0)
        lfs f1, lo16(.CPIl4__Z2f4j_0)(r2)
        fsub f0, f0, f1
        frsp f1, f0
        blr

llvm-svn: 22814
2005-08-17 00:39:29 +00:00
Chris Lattner
bd8cbd4951 add a new TargetConstant node
llvm-svn: 22813
2005-08-17 00:34:06 +00:00
Chris Lattner
3b7e157005 Eliminate the RegSDNode class, which 3 nodes (CopyFromReg/CopyToReg/ImplicitDef)
used to tack a register number onto the node.

Instead of doing this, make a new node, RegisterSDNode, which is a leaf
containing a register number.  These three operations just become normal
DAG nodes now, instead of requiring special handling.

Note that with this change, it is no longer correct to make illegal
CopyFromReg/CopyToReg nodes.  The legalizer will not touch them, and this
is bad, so don't do it. :)

llvm-svn: 22806
2005-08-16 21:55:35 +00:00
Nate Begeman
f6b6378f23 Implement BR_CC and BRTWOWAY_CC. This allows the removal of a rather nasty
fixme from the PowerPC backend.  Emit slightly better code for legalizing
select_cc.

llvm-svn: 22805
2005-08-16 19:49:35 +00:00
Chris Lattner
65b9983515 Allow passing a dag into dump and getOperationName. If one is available
when printing a node, use it to render target operations with their
target instruction name instead of "<<unknown>>".

llvm-svn: 22804
2005-08-16 18:33:07 +00:00
Chris Lattner
1b07a165e0 Use a extant helper to do this.
llvm-svn: 22802
2005-08-16 18:31:23 +00:00
Chris Lattner
73348d1e89 Add some methods for dag->dag isel.
Split RemoveNodeFromCSEMaps out of DeleteNodesIfDead to do it.

llvm-svn: 22801
2005-08-16 18:17:10 +00:00
Nate Begeman
54423e60c6 Fix last night's PPC32 regressions by
1. Not selecting the false value of a select_cc in the false arm, which
   isn't legal for nested selects.
2. Actually returning the node we created and Legalized in the FP_TO_UINT
   Expander.

llvm-svn: 22789
2005-08-14 18:38:32 +00:00
Nate Begeman
9be6a214ff Teach the legalizer how to legalize FP_TO_UINT.
Teach the legalizer to promote FP_TO_UINT to FP_TO_SINT if the wider
  FP_TO_UINT is also illegal.  This allows us on PPC to codegen
  unsigned short foo(float a) { return a; }

as:
_foo:
.LBB_foo_0:     ; entry
        fctiwz f0, f1
        stfd f0, -8(r1)
        lwz r2, -4(r1)
        rlwinm r3, r2, 0, 16, 31
        blr

instead of:
_foo:
.LBB_foo_0:     ; entry
        fctiwz f0, f1
        stfd f0, -8(r1)
        lwz r2, -4(r1)
        lis r3, ha16(.CPI_foo_0)
        lfs f0, lo16(.CPI_foo_0)(r3)
        fcmpu cr0, f1, f0
        blt .LBB_foo_2  ; entry
.LBB_foo_1:     ; entry
        fsubs f0, f1, f0
        fctiwz f0, f0
        stfd f0, -16(r1)
        lwz r2, -12(r1)
        xoris r2, r2, 32768
.LBB_foo_2:     ; entry
        rlwinm r3, r2, 0, 16, 31
        blr

llvm-svn: 22785
2005-08-14 01:20:53 +00:00
Nate Begeman
021a5b3fe1 Remove an unncessary argument to SimplifySelectCC and add an additional
assert when creating a select_cc node.

llvm-svn: 22780
2005-08-13 06:14:17 +00:00
Nate Begeman
4e8f777256 Fix the fabs regression on x86 by abstracting the select_cc optimization
out into SimplifySelectCC.  This allows both ISD::SELECT and ISD::SELECT_CC
to use the same set of simplifying folds.

llvm-svn: 22779
2005-08-13 06:00:21 +00:00
Chris Lattner
e06d2c3760 implement a couple of simple shift foldings.
e.g.  (X & 7) >> 3   -> 0

llvm-svn: 22774
2005-08-12 23:54:58 +00:00
Nate Begeman
09c56e0432 Add a select_cc optimization for recognizing abs(int). This speeds up an
integer MPEG encoding loop by a factor of two.

llvm-svn: 22758
2005-08-11 02:18:13 +00:00
Nate Begeman
206e850add Some SELECT_CC cleanups:
1. move assertions for node creation to getNode()
2. legalize the values returned in ExpandOp immediately
3. Move select_cc optimizations from SELECT's getNode() to SELECT_CC's,
   allowing them to be cleaned up significantly.

This paves the way to pick up additional optimizations on SELECT_CC, such
as sum-of-absolute-differences.

llvm-svn: 22757
2005-08-11 01:12:20 +00:00
Nate Begeman
eddc9d4856 Add new node, SELECT_CC. This node is for targets that don't natively
implement SELECT.

llvm-svn: 22755
2005-08-10 20:51:12 +00:00
Chris Lattner
51cf9fd316 Fix an oversight that may be causing PR617.
llvm-svn: 22753
2005-08-10 17:37:53 +00:00
Chris Lattner
3179a74493 Fix spelling, fix some broken canonicalizations by my last patch
llvm-svn: 22734
2005-08-09 23:09:05 +00:00
Chris Lattner
3290ca9983 add cc nodes to the AllNodes list so they show up in Graphviz output
llvm-svn: 22731
2005-08-09 20:40:02 +00:00
Chris Lattner
0fa4402b59 Eliminate the SetCCSDNode in favor of a CondCodeSDNode class. This pulls the
CC out of the SetCC operation, making SETCC a standard ternary operation and
CC's a standard DAG leaf.  This will make it possible for other node to use
CC's as operands in the future...

llvm-svn: 22728
2005-08-09 20:20:18 +00:00
Chris Lattner
e7f14fb39d Handle 64-bit constant exprs on 64-bit targets.
llvm-svn: 22696
2005-08-08 04:26:32 +00:00
Chris Lattner
fdb467b18d add a small simplification that can be exposed after promotion/expansion
llvm-svn: 22691
2005-08-07 05:00:44 +00:00
Chris Lattner
d3a8084e5b Change FindEarliestCallSeqEnd (used by libcall insertion) to use a set to
avoid revisiting nodes more than once.  This eliminates a source of
potentially exponential behavior.  For a small function in 191.fma3d
(hexah_stress_divergence_), this speeds up isel from taking > 20mins to
taking 0.07s.

llvm-svn: 22680
2005-08-05 18:10:27 +00:00
Chris Lattner
c7a67abac2 Fix a use-of-dangling-pointer bug, from the introduction of SrcValue's.
llvm-svn: 22679
2005-08-05 16:55:31 +00:00
Chris Lattner
644edfb51e Fix a latent bug in the libcall inserter that was exposed by Nate's patch
yesterday.  This fixes whetstone and a bunch of programs in the External tests.

llvm-svn: 22678
2005-08-05 16:23:57 +00:00
Nate Begeman
348caa49b3 Fix a fixme in LegalizeDAG
llvm-svn: 22661
2005-08-04 21:43:28 +00:00
Misha Brukman
8b8272b648 * Unbreak release build
* Add comments to #endif pragmas for readability

llvm-svn: 22647
2005-08-04 14:22:41 +00:00
Chris Lattner
d124203207 Fix PR611, codegen'ing SREM of FP operands to fmod or fmodf instead of
the sequence used for integer ops

llvm-svn: 22629
2005-08-03 20:31:37 +00:00
Chris Lattner
cc8ae687e1 Update to use the new MathExtras.h support for log2 computation.
Patch contributed by Jim Laskey!

llvm-svn: 22594
2005-08-02 19:26:06 +00:00
Chris Lattner
83f0262a2c Fix casts from long to sbyte on ppc
llvm-svn: 22570
2005-08-01 18:16:37 +00:00
Jeff Cohen
019104459d Keep tabs and trailing spaces out.
llvm-svn: 22565
2005-07-30 18:33:25 +00:00
Chris Lattner
d742a80e9e fix float->long conversions on x86
llvm-svn: 22563
2005-07-30 01:40:57 +00:00
Chris Lattner
e0b705ba00 Allow targets to have custom expanders for FP_TO_*INT conversions where
both the src and dest values are legal

llvm-svn: 22555
2005-07-30 00:04:12 +00:00
Chris Lattner
8d48aef4e3 Allow targets to define custom expanders for FP_TO_*INT
llvm-svn: 22548
2005-07-29 00:33:32 +00:00
Chris Lattner
f355c0f6ea allow a target to request that unknown FP_TO_*INT conversion be promoted to
a larger integer destination.

llvm-svn: 22547
2005-07-29 00:11:56 +00:00
Chris Lattner
6b4f386826 instead of having all conversions be handled by one case value, and then have
subcases inside, break things out earlier.

llvm-svn: 22546
2005-07-28 23:31:12 +00:00
Andrew Lenharth
f623af9b64 new is not a valid default anywhere, so make this pure virtual
llvm-svn: 22542
2005-07-28 18:13:59 +00:00
Chris Lattner
b0658628c1 Fix debug info to not print out recently freed memory.
llvm-svn: 22529
2005-07-27 23:11:25 +00:00
Chris Lattner
1a3a4c7791 Print symbolic register names in debug dumps
llvm-svn: 22528
2005-07-27 23:03:38 +00:00
Jeff Cohen
bd51ec7461 Eliminate all remaining tabs and trailing spaces.
llvm-svn: 22523
2005-07-27 06:12:32 +00:00
Nate Begeman
a25a2010e3 Remove unnecessary FP_EXTEND. This causes worse codegen for SSE.
llvm-svn: 22469
2005-07-19 16:50:03 +00:00
Chris Lattner
d4f9ab3809 The assertion was wrong: the code only worked for i64. While we're at it,
expand the code to work for all integer datatypes.  This should unbreak
alpha.

llvm-svn: 22464
2005-07-18 04:31:14 +00:00