1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-27 05:53:07 +01:00
Commit Graph

523 Commits

Author SHA1 Message Date
Chris Lattner
8927bf468d add anew method
llvm-svn: 22957
2005-08-21 22:30:30 +00:00
Chris Lattner
7a04eff613 Add support for frame index nodes
llvm-svn: 22956
2005-08-21 19:56:04 +00:00
Chris Lattner
cbbd212622 add a method
llvm-svn: 22955
2005-08-21 19:48:59 +00:00
Chris Lattner
481b47fc75 add a method
llvm-svn: 22949
2005-08-21 18:49:33 +00:00
Chris Lattner
3f6df51c19 Add support for basic blocks, fix a bug in result # computation
llvm-svn: 22948
2005-08-21 18:49:29 +00:00
Chris Lattner
9bb0d10479 When legalizing brcond ->brcc or select -> selectcc, make sure to truncate
the old condition to a one bit value.  The incoming value must have been
promoted, and the top bits are undefined.  This causes us to generate:

_test:
        rlwinm r2, r3, 0, 31, 31
        li r3, 17
        cmpwi cr0, r2, 0
        bne .LBB_test_2 ;
.LBB_test_1:    ;
        li r3, 1
.LBB_test_2:    ;
        blr

instead of:

_test:
        rlwinm r2, r3, 0, 31, 31
        li r2, 17
        cmpwi cr0, r3, 0
        bne .LBB_test_2 ;
.LBB_test_1:    ;
        li r2, 1
.LBB_test_2:    ;
        or r3, r2, r2
        blr

for:

int %test(bool %c) {
        %retval = select bool %c, int 17, int 1
        ret int %retval
}

llvm-svn: 22947
2005-08-21 18:03:09 +00:00
Chris Lattner
7c3e52ef92 fix bogus warning
llvm-svn: 22943
2005-08-20 18:07:27 +00:00
Chris Lattner
5b7488224d Add support for global address nodes
llvm-svn: 22940
2005-08-19 22:38:24 +00:00
Chris Lattner
5210fd0e51 Add support for TargetGlobalAddress nodes
llvm-svn: 22938
2005-08-19 22:31:04 +00:00
Chris Lattner
bedf8e757a Implement CopyFromReg, TokenFactor, and fix a bug in CopyToReg. This allows
us to compile stuff like this:

double %test(double %A, double %B, double %C, double %E) {
        %F = mul double %A, %A
        %G = add double %F, %B
        %H = sub double -0.0, %G
        %I = mul double %H, %C
        %J = add double %I, %E
        ret double %J
}

to:

_test:
        fnmadd f0, f1, f1, f2
        fmadd f1, f0, f3, f4
        blr

woot!

llvm-svn: 22937
2005-08-19 21:43:53 +00:00
Chris Lattner
b36807b0d0 Fix a bug in previous commit
llvm-svn: 22936
2005-08-19 21:34:13 +00:00
Chris Lattner
ac699c4db9 Print physreg register nodes with target names (e.g. F1) instead of numbers
llvm-svn: 22934
2005-08-19 21:21:16 +00:00
Chris Lattner
011a721d08 Before implementing copyfromreg, we'll implement copytoreg correctly.
This gets us this for the previous testcase:

_test:
        lis r2, 0
        ori r3, r2, 65535
        blr

Note that we actually write to r3 (the return reg) correctly now :)

llvm-svn: 22933
2005-08-19 20:50:53 +00:00
Chris Lattner
9af3aaf541 Now that we have operand info for machine instructions, use it to create
temporary registers for things that define a register.  This allows dag->dag
isel to compile this:

int %test() { ret int 65535 }

into:

_test:
        lis r2, 0
        ori r2, r2, 65535
        blr

Next up, getting CopyFromReg to work, allowing arguments and cross-bb values.

llvm-svn: 22932
2005-08-19 20:45:43 +00:00
Jeff Cohen
f99748bc0f Fix VC++ precedence warning.
llvm-svn: 22902
2005-08-19 04:39:48 +00:00
Chris Lattner
1207209677 Fix computation of # operands, add a temporary hack for CopyToReg
llvm-svn: 22896
2005-08-19 01:01:34 +00:00
Chris Lattner
7b9f02525e add a new -view-sched-dags option to view dags as they are sent to the scheduler.
llvm-svn: 22878
2005-08-18 20:11:49 +00:00
Chris Lattner
62bc771af7 Implement the first chunk of a code emitter. This is sophisticated enough to
codegen:

_empty:
.LBB_empty_0:   ;
        blr

but can't do anything more (yet). :)

llvm-svn: 22876
2005-08-18 20:07:59 +00:00
Chris Lattner
ebb48e5877 new file, obviously just a stub
llvm-svn: 22868
2005-08-18 18:45:24 +00:00
Chris Lattner
5cbeaed711 Enable critical edge splitting by default
llvm-svn: 22863
2005-08-18 17:35:14 +00:00
Nate Begeman
474ec3c02d Add support for target DAG nodes that take 4 operands, such as PowerPC's
rlwinm.

llvm-svn: 22856
2005-08-18 07:30:15 +00:00
Chris Lattner
d6b9b36616 Fix printing of VTSDNodes
llvm-svn: 22853
2005-08-18 03:31:02 +00:00
Jim Laskey
d761e8859d Move the code dependency for MathExtras.h from SelectionDAGNodes.h.
Added some class dividers in SelectionDAG.cpp.

llvm-svn: 22841
2005-08-17 20:08:02 +00:00
Jim Laskey
61e3d7bca5 Culling out use of unions for converting FP to bits and vice versa.
llvm-svn: 22838
2005-08-17 19:34:49 +00:00
Chris Lattner
a11bdf3abe Fix a bug in RemoveDeadNodes where it would crash when its "optional"
argument is not specified.

Implement ReplaceAllUsesWith.

llvm-svn: 22834
2005-08-17 19:00:20 +00:00
Jim Laskey
7cdadb13d5 Switched to using BitsToDouble for int_to_float to avoid aliasing problem.
llvm-svn: 22831
2005-08-17 17:42:52 +00:00
Jim Laskey
2370cb4e85 Change hex float constants for the sake of VC++.
llvm-svn: 22828
2005-08-17 09:44:59 +00:00
Chris Lattner
dbfcba7565 Add a new beta option for critical edge splitting, to avoid a problem that
Nate noticed in yacr2 (and I know occurs in other places as well).

This is still rough, as the critical edge blocks are not intelligently placed
but is added to get some idea to see if this improves performance.

llvm-svn: 22825
2005-08-17 06:37:43 +00:00
Chris Lattner
a103a2e9c6 Fix a regression on X86, where FP values can be promoted too.
llvm-svn: 22822
2005-08-17 06:06:25 +00:00
Jim Laskey
59b9ee0529 Added generic code expansion for [signed|unsigned] i32 to [f32|f64] casts in the
legalizer.  PowerPC now uses this expansion instead of ISel version.

Example:

// signed integer to double conversion
double f1(signed x) {
  return (double)x;
}

// unsigned integer to double conversion
double f2(unsigned x) {
  return (double)x;
}

// signed integer to float conversion
float f3(signed x) {
  return (float)x;
}

// unsigned integer to float conversion
float f4(unsigned x) {
  return (float)x;
}


Byte Code:

internal fastcc double %_Z2f1i(int %x) {
entry:
        %tmp.1 = cast int %x to double          ; <double> [#uses=1]
        ret double %tmp.1
}

internal fastcc double %_Z2f2j(uint %x) {
entry:
        %tmp.1 = cast uint %x to double         ; <double> [#uses=1]
        ret double %tmp.1
}

internal fastcc float %_Z2f3i(int %x) {
entry:
        %tmp.1 = cast int %x to float           ; <float> [#uses=1]
        ret float %tmp.1
}

internal fastcc float %_Z2f4j(uint %x) {
entry:
        %tmp.1 = cast uint %x to float          ; <float> [#uses=1]
        ret float %tmp.1
}

internal fastcc double %_Z2g1i(int %x) {
entry:
        %buffer = alloca [2 x uint]             ; <[2 x uint]*> [#uses=3]
        %tmp.0 = getelementptr [2 x uint]* %buffer, int 0, int 0                ; <uint*> [#uses=1]
        store uint 1127219200, uint* %tmp.0
        %tmp.2 = cast int %x to uint            ; <uint> [#uses=1]
        %tmp.3 = xor uint %tmp.2, 2147483648            ; <uint> [#uses=1]
        %tmp.5 = getelementptr [2 x uint]* %buffer, int 0, int 1                ; <uint*> [#uses=1]
        store uint %tmp.3, uint* %tmp.5
        %tmp.9 = cast [2 x uint]* %buffer to double*            ; <double*> [#uses=1]
        %tmp.10 = load double* %tmp.9           ; <double> [#uses=1]
        %tmp.13 = load double* cast (long* %signed_bias to double*)             ; <double> [#uses=1]
        %tmp.14 = sub double %tmp.10, %tmp.13           ; <double> [#uses=1]
        ret double %tmp.14
}

internal fastcc double %_Z2g2j(uint %x) {
entry:
        %buffer = alloca [2 x uint]             ; <[2 x uint]*> [#uses=3]
        %tmp.0 = getelementptr [2 x uint]* %buffer, int 0, int 0                ; <uint*> [#uses=1]
        store uint 1127219200, uint* %tmp.0
        %tmp.1 = getelementptr [2 x uint]* %buffer, int 0, int 1                ; <uint*> [#uses=1]
        store uint %x, uint* %tmp.1
        %tmp.4 = cast [2 x uint]* %buffer to double*            ; <double*> [#uses=1]
        %tmp.5 = load double* %tmp.4            ; <double> [#uses=1]
        %tmp.8 = load double* cast (long* %unsigned_bias to double*)            ; <double> [#uses=1]
        %tmp.9 = sub double %tmp.5, %tmp.8              ; <double> [#uses=1]
        ret double %tmp.9
}

internal fastcc float %_Z2g3i(int %x) {
entry:
        %buffer = alloca [2 x uint]             ; <[2 x uint]*> [#uses=3]
        %tmp.0 = getelementptr [2 x uint]* %buffer, int 0, int 0                ; <uint*> [#uses=1]
        store uint 1127219200, uint* %tmp.0
        %tmp.2 = cast int %x to uint            ; <uint> [#uses=1]
        %tmp.3 = xor uint %tmp.2, 2147483648            ; <uint> [#uses=1]
        %tmp.5 = getelementptr [2 x uint]* %buffer, int 0, int 1                ; <uint*> [#uses=1]
        store uint %tmp.3, uint* %tmp.5
        %tmp.9 = cast [2 x uint]* %buffer to double*            ; <double*> [#uses=1]
        %tmp.10 = load double* %tmp.9           ; <double> [#uses=1]
        %tmp.13 = load double* cast (long* %signed_bias to double*)             ; <double> [#uses=1]
        %tmp.14 = sub double %tmp.10, %tmp.13           ; <double> [#uses=1]
        %tmp.16 = cast double %tmp.14 to float          ; <float> [#uses=1]
        ret float %tmp.16
}

internal fastcc float %_Z2g4j(uint %x) {
entry:
        %buffer = alloca [2 x uint]             ; <[2 x uint]*> [#uses=3]
        %tmp.0 = getelementptr [2 x uint]* %buffer, int 0, int 0                ; <uint*> [#uses=1]
        store uint 1127219200, uint* %tmp.0
        %tmp.1 = getelementptr [2 x uint]* %buffer, int 0, int 1                ; <uint*> [#uses=1]
        store uint %x, uint* %tmp.1
        %tmp.4 = cast [2 x uint]* %buffer to double*            ; <double*> [#uses=1]
        %tmp.5 = load double* %tmp.4            ; <double> [#uses=1]
        %tmp.8 = load double* cast (long* %unsigned_bias to double*)            ; <double> [#uses=1]
        %tmp.9 = sub double %tmp.5, %tmp.8              ; <double> [#uses=1]
        %tmp.11 = cast double %tmp.9 to float           ; <float> [#uses=1]
        ret float %tmp.11
}


PowerPC Code:

        .machine ppc970


        .const
        .align  2
.CPIl1__Z2f1i_0:                                        ; float 0x4330000080000000
        .long   1501560836      ; float 4.5036e+15
        .text
        .align  2
        .globl  l1__Z2f1i
l1__Z2f1i:
.LBBl1__Z2f1i_0:        ; entry
        xoris r2, r3, 32768
        stw r2, -4(r1)
        lis r2, 17200
        stw r2, -8(r1)
        lfd f0, -8(r1)
        lis r2, ha16(.CPIl1__Z2f1i_0)
        lfs f1, lo16(.CPIl1__Z2f1i_0)(r2)
        fsub f1, f0, f1
        blr


        .const
        .align  2
.CPIl2__Z2f2j_0:                                        ; float 0x4330000000000000
        .long   1501560832      ; float 4.5036e+15
        .text
        .align  2
        .globl  l2__Z2f2j
l2__Z2f2j:
.LBBl2__Z2f2j_0:        ; entry
        stw r3, -4(r1)
        lis r2, 17200
        stw r2, -8(r1)
        lfd f0, -8(r1)
        lis r2, ha16(.CPIl2__Z2f2j_0)
        lfs f1, lo16(.CPIl2__Z2f2j_0)(r2)
        fsub f1, f0, f1
        blr


        .const
        .align  2
.CPIl3__Z2f3i_0:                                        ; float 0x4330000080000000
        .long   1501560836      ; float 4.5036e+15
        .text
        .align  2
        .globl  l3__Z2f3i
l3__Z2f3i:
.LBBl3__Z2f3i_0:        ; entry
        xoris r2, r3, 32768
        stw r2, -4(r1)
        lis r2, 17200
        stw r2, -8(r1)
        lfd f0, -8(r1)
        lis r2, ha16(.CPIl3__Z2f3i_0)
        lfs f1, lo16(.CPIl3__Z2f3i_0)(r2)
        fsub f0, f0, f1
        frsp f1, f0
        blr


        .const
        .align  2
.CPIl4__Z2f4j_0:                                        ; float 0x4330000000000000
        .long   1501560832      ; float 4.5036e+15
        .text
        .align  2
        .globl  l4__Z2f4j
l4__Z2f4j:
.LBBl4__Z2f4j_0:        ; entry
        stw r3, -4(r1)
        lis r2, 17200
        stw r2, -8(r1)
        lfd f0, -8(r1)
        lis r2, ha16(.CPIl4__Z2f4j_0)
        lfs f1, lo16(.CPIl4__Z2f4j_0)(r2)
        fsub f0, f0, f1
        frsp f1, f0
        blr

llvm-svn: 22814
2005-08-17 00:39:29 +00:00
Chris Lattner
bd8cbd4951 add a new TargetConstant node
llvm-svn: 22813
2005-08-17 00:34:06 +00:00
Chris Lattner
3b7e157005 Eliminate the RegSDNode class, which 3 nodes (CopyFromReg/CopyToReg/ImplicitDef)
used to tack a register number onto the node.

Instead of doing this, make a new node, RegisterSDNode, which is a leaf
containing a register number.  These three operations just become normal
DAG nodes now, instead of requiring special handling.

Note that with this change, it is no longer correct to make illegal
CopyFromReg/CopyToReg nodes.  The legalizer will not touch them, and this
is bad, so don't do it. :)

llvm-svn: 22806
2005-08-16 21:55:35 +00:00
Nate Begeman
f6b6378f23 Implement BR_CC and BRTWOWAY_CC. This allows the removal of a rather nasty
fixme from the PowerPC backend.  Emit slightly better code for legalizing
select_cc.

llvm-svn: 22805
2005-08-16 19:49:35 +00:00
Chris Lattner
65b9983515 Allow passing a dag into dump and getOperationName. If one is available
when printing a node, use it to render target operations with their
target instruction name instead of "<<unknown>>".

llvm-svn: 22804
2005-08-16 18:33:07 +00:00
Chris Lattner
1b07a165e0 Use a extant helper to do this.
llvm-svn: 22802
2005-08-16 18:31:23 +00:00
Chris Lattner
73348d1e89 Add some methods for dag->dag isel.
Split RemoveNodeFromCSEMaps out of DeleteNodesIfDead to do it.

llvm-svn: 22801
2005-08-16 18:17:10 +00:00
Nate Begeman
54423e60c6 Fix last night's PPC32 regressions by
1. Not selecting the false value of a select_cc in the false arm, which
   isn't legal for nested selects.
2. Actually returning the node we created and Legalized in the FP_TO_UINT
   Expander.

llvm-svn: 22789
2005-08-14 18:38:32 +00:00
Nate Begeman
9be6a214ff Teach the legalizer how to legalize FP_TO_UINT.
Teach the legalizer to promote FP_TO_UINT to FP_TO_SINT if the wider
  FP_TO_UINT is also illegal.  This allows us on PPC to codegen
  unsigned short foo(float a) { return a; }

as:
_foo:
.LBB_foo_0:     ; entry
        fctiwz f0, f1
        stfd f0, -8(r1)
        lwz r2, -4(r1)
        rlwinm r3, r2, 0, 16, 31
        blr

instead of:
_foo:
.LBB_foo_0:     ; entry
        fctiwz f0, f1
        stfd f0, -8(r1)
        lwz r2, -4(r1)
        lis r3, ha16(.CPI_foo_0)
        lfs f0, lo16(.CPI_foo_0)(r3)
        fcmpu cr0, f1, f0
        blt .LBB_foo_2  ; entry
.LBB_foo_1:     ; entry
        fsubs f0, f1, f0
        fctiwz f0, f0
        stfd f0, -16(r1)
        lwz r2, -12(r1)
        xoris r2, r2, 32768
.LBB_foo_2:     ; entry
        rlwinm r3, r2, 0, 16, 31
        blr

llvm-svn: 22785
2005-08-14 01:20:53 +00:00
Nate Begeman
021a5b3fe1 Remove an unncessary argument to SimplifySelectCC and add an additional
assert when creating a select_cc node.

llvm-svn: 22780
2005-08-13 06:14:17 +00:00
Nate Begeman
4e8f777256 Fix the fabs regression on x86 by abstracting the select_cc optimization
out into SimplifySelectCC.  This allows both ISD::SELECT and ISD::SELECT_CC
to use the same set of simplifying folds.

llvm-svn: 22779
2005-08-13 06:00:21 +00:00
Chris Lattner
e06d2c3760 implement a couple of simple shift foldings.
e.g.  (X & 7) >> 3   -> 0

llvm-svn: 22774
2005-08-12 23:54:58 +00:00
Nate Begeman
09c56e0432 Add a select_cc optimization for recognizing abs(int). This speeds up an
integer MPEG encoding loop by a factor of two.

llvm-svn: 22758
2005-08-11 02:18:13 +00:00
Nate Begeman
206e850add Some SELECT_CC cleanups:
1. move assertions for node creation to getNode()
2. legalize the values returned in ExpandOp immediately
3. Move select_cc optimizations from SELECT's getNode() to SELECT_CC's,
   allowing them to be cleaned up significantly.

This paves the way to pick up additional optimizations on SELECT_CC, such
as sum-of-absolute-differences.

llvm-svn: 22757
2005-08-11 01:12:20 +00:00
Nate Begeman
eddc9d4856 Add new node, SELECT_CC. This node is for targets that don't natively
implement SELECT.

llvm-svn: 22755
2005-08-10 20:51:12 +00:00
Chris Lattner
51cf9fd316 Fix an oversight that may be causing PR617.
llvm-svn: 22753
2005-08-10 17:37:53 +00:00
Chris Lattner
3179a74493 Fix spelling, fix some broken canonicalizations by my last patch
llvm-svn: 22734
2005-08-09 23:09:05 +00:00
Chris Lattner
3290ca9983 add cc nodes to the AllNodes list so they show up in Graphviz output
llvm-svn: 22731
2005-08-09 20:40:02 +00:00
Chris Lattner
0fa4402b59 Eliminate the SetCCSDNode in favor of a CondCodeSDNode class. This pulls the
CC out of the SetCC operation, making SETCC a standard ternary operation and
CC's a standard DAG leaf.  This will make it possible for other node to use
CC's as operands in the future...

llvm-svn: 22728
2005-08-09 20:20:18 +00:00
Chris Lattner
fdb467b18d add a small simplification that can be exposed after promotion/expansion
llvm-svn: 22691
2005-08-07 05:00:44 +00:00
Chris Lattner
d3a8084e5b Change FindEarliestCallSeqEnd (used by libcall insertion) to use a set to
avoid revisiting nodes more than once.  This eliminates a source of
potentially exponential behavior.  For a small function in 191.fma3d
(hexah_stress_divergence_), this speeds up isel from taking > 20mins to
taking 0.07s.

llvm-svn: 22680
2005-08-05 18:10:27 +00:00
Chris Lattner
c7a67abac2 Fix a use-of-dangling-pointer bug, from the introduction of SrcValue's.
llvm-svn: 22679
2005-08-05 16:55:31 +00:00
Chris Lattner
644edfb51e Fix a latent bug in the libcall inserter that was exposed by Nate's patch
yesterday.  This fixes whetstone and a bunch of programs in the External tests.

llvm-svn: 22678
2005-08-05 16:23:57 +00:00
Nate Begeman
348caa49b3 Fix a fixme in LegalizeDAG
llvm-svn: 22661
2005-08-04 21:43:28 +00:00
Misha Brukman
8b8272b648 * Unbreak release build
* Add comments to #endif pragmas for readability

llvm-svn: 22647
2005-08-04 14:22:41 +00:00
Chris Lattner
d124203207 Fix PR611, codegen'ing SREM of FP operands to fmod or fmodf instead of
the sequence used for integer ops

llvm-svn: 22629
2005-08-03 20:31:37 +00:00
Chris Lattner
cc8ae687e1 Update to use the new MathExtras.h support for log2 computation.
Patch contributed by Jim Laskey!

llvm-svn: 22594
2005-08-02 19:26:06 +00:00
Chris Lattner
83f0262a2c Fix casts from long to sbyte on ppc
llvm-svn: 22570
2005-08-01 18:16:37 +00:00
Jeff Cohen
019104459d Keep tabs and trailing spaces out.
llvm-svn: 22565
2005-07-30 18:33:25 +00:00
Chris Lattner
d742a80e9e fix float->long conversions on x86
llvm-svn: 22563
2005-07-30 01:40:57 +00:00
Chris Lattner
e0b705ba00 Allow targets to have custom expanders for FP_TO_*INT conversions where
both the src and dest values are legal

llvm-svn: 22555
2005-07-30 00:04:12 +00:00
Chris Lattner
8d48aef4e3 Allow targets to define custom expanders for FP_TO_*INT
llvm-svn: 22548
2005-07-29 00:33:32 +00:00
Chris Lattner
f355c0f6ea allow a target to request that unknown FP_TO_*INT conversion be promoted to
a larger integer destination.

llvm-svn: 22547
2005-07-29 00:11:56 +00:00
Chris Lattner
6b4f386826 instead of having all conversions be handled by one case value, and then have
subcases inside, break things out earlier.

llvm-svn: 22546
2005-07-28 23:31:12 +00:00
Jeff Cohen
bd51ec7461 Eliminate all remaining tabs and trailing spaces.
llvm-svn: 22523
2005-07-27 06:12:32 +00:00
Nate Begeman
a25a2010e3 Remove unnecessary FP_EXTEND. This causes worse codegen for SSE.
llvm-svn: 22469
2005-07-19 16:50:03 +00:00
Chris Lattner
d4f9ab3809 The assertion was wrong: the code only worked for i64. While we're at it,
expand the code to work for all integer datatypes.  This should unbreak
alpha.

llvm-svn: 22464
2005-07-18 04:31:14 +00:00
Nate Begeman
160c12d896 Teach the legalizer how to promote SINT_TO_FP to a wider SINT_TO_FP that
the target natively supports.  This eliminates some special-case code from
the x86 backend and generates better code as well.

For an i8 to f64 conversion, before & after:

_x87 before:
        subl $2, %esp
        movb 6(%esp), %al
        movsbw %al, %ax
        movw %ax, (%esp)
        filds (%esp)
        addl $2, %esp
        ret

_x87 after:
        subl $2, %esp
        movsbw 6(%esp), %ax
        movw %ax, (%esp)
        filds (%esp)
        addl $2, %esp
        ret

_sse before:
        subl $12, %esp
        movb 16(%esp), %al
        movsbl %al, %eax
        cvtsi2sd %eax, %xmm0
        addl $12, %esp
        ret

_sse after:
        subl $12, %esp
        movsbl 16(%esp), %eax
        cvtsi2sd %eax, %xmm0
        addl $12, %esp
        ret

llvm-svn: 22452
2005-07-16 02:02:34 +00:00
Chris Lattner
10da57bfed Break the code for expanding UINT_TO_FP operations out into its own
SelectionDAGLegalize::ExpandLegalUINT_TO_FP method.

Add a new method, PromoteLegalUINT_TO_FP, which allows targets to request
that UINT_TO_FP operations be promoted to a larger input type.  This is
useful for targets that have some UINT_TO_FP or SINT_TO_FP operations but
not all of them (like X86).

The same should be done with SINT_TO_FP, but this patch does not do that
yet.

llvm-svn: 22447
2005-07-16 00:19:57 +00:00
Chris Lattner
94e486c56e You can't use config options without config.h
llvm-svn: 22446
2005-07-15 22:48:31 +00:00
Chris Lattner
d8eb6ea6da Make this use the new autoconf support for finding the executables for
gv and Graphviz.

llvm-svn: 22434
2005-07-14 05:33:13 +00:00
Chris Lattner
d9f1a60c61 As discussed on IRC, this stuff is just for debugging.
llvm-svn: 22432
2005-07-14 05:17:43 +00:00
Chris Lattner
61b33e0bc4 If the Graphviz program is available, use it to visualize dot graphs.
llvm-svn: 22429
2005-07-14 01:10:55 +00:00
Chris Lattner
aeae45b371 Fix Alpha/2005-07-12-TwoMallocCalls.ll and PR593.
It is not safe to call LegalizeOp on something that has already been legalized.
Instead, just force another iteration of legalization.

This could affect all platforms but X86, as this codepath is dynamically
dead on X86 (ISD::MEMSET and friends are legal).

llvm-svn: 22419
2005-07-13 02:00:04 +00:00
Chris Lattner
628a248ff9 Fix test/Regression/CodeGen/Generic/2005-07-12-memcpy-i64-length.ll
llvm-svn: 22417
2005-07-13 01:42:45 +00:00
Chris Lattner
6e49696ba6 Change *EXTLOAD to use an VTSDNode operand instead of being an MVTSDNode.
This is the last MVTSDNode.

This allows us to eliminate a bunch of special case code for handling
MVTSDNodes.

llvm-svn: 22367
2005-07-10 01:55:33 +00:00
Chris Lattner
273b81e0c0 Change TRUNCSTORE to use a VTSDNode operand instead of being an MVTSTDNode
llvm-svn: 22366
2005-07-10 00:29:18 +00:00
Chris Lattner
c355896290 Introduce a new VTSDNode class with the ultimate goal of eliminating the
MVTSDNode class.  This class is used to provide an operand to operators
that require an extra type.  We start by converting FP_ROUND_INREG and
SIGN_EXTEND_INREG over to using it.

llvm-svn: 22364
2005-07-10 00:07:11 +00:00
Chris Lattner
bf100c8bdb Make several cleanups to Andrews varargs change:
1. Pass Value*'s into lowering methods so that the proper pointers can be
   added to load/stores from the valist
2. Intrinsics that return void should only return a token chain, not a token
   chain/retval pair.
3. Rename LowerVAArgNext -> LowerVAArg, because VANext is long gone.

llvm-svn: 22338
2005-07-05 19:57:53 +00:00
Andrew Lenharth
3543e3b3a9 2 fixes:
1: Legalize operand in UINT_TO_FP expanision

2: SRA x, const i8 was not promoting the constant to shift amount type.
llvm-svn: 22337
2005-07-05 19:52:39 +00:00
Andrew Lenharth
c9903eb2cc I really didn't think this was necessary. But, Legalize wasn't running again
and legalizing the extload.  Strange.  Should fix most alpha regressions.

llvm-svn: 22329
2005-07-02 20:58:53 +00:00
Andrew Lenharth
b8c48ce74e oops
llvm-svn: 22320
2005-06-30 19:32:57 +00:00
Andrew Lenharth
04aa18bd2a FP EXTLOAD is not support on all archs, expand to LOAD and FP_EXTEND
llvm-svn: 22319
2005-06-30 19:22:37 +00:00
Andrew Lenharth
898efb338a restore old srcValueNode behavior and try to to work around it
llvm-svn: 22315
2005-06-29 18:54:02 +00:00
Andrew Lenharth
edccb834bb tracking the instructions causing loads and stores provides more information than just the pointer being loaded or stored
llvm-svn: 22311
2005-06-29 15:57:19 +00:00
Andrew Lenharth
d534c5cb2a Adapt the code for handling uint -> fp conversion for the 32 bit case to
handling it in the 64 bit case.  The two code paths should probably be merged.

llvm-svn: 22302
2005-06-27 23:28:32 +00:00
Andrew Lenharth
4fd2bde906 If we support structs as va_list, we must pass pointers to them to va_copy
See last commit for LangRef, this implements it on all targets.

llvm-svn: 22273
2005-06-22 21:04:42 +00:00
Andrew Lenharth
a9214fec08 core changes for varargs
llvm-svn: 22254
2005-06-18 18:34:52 +00:00
Nate Begeman
ed49a51836 Fix bug 537 test 2, which checks to make sure that we fold A+(B-A) -> B for
integer types.  Add a couple checks to not perform these kinds of transform
on floating point values.

llvm-svn: 22228
2005-06-16 07:06:03 +00:00
Chris Lattner
811dc49f55 Add some simplifications for MULH[SU]. This allows us to compile this:
long %bar(long %X) {
  %Y = mul long %X, 4294967297
  ret long %Y
}

to this:

l1_bar:
        mov %EAX, DWORD PTR [%ESP + 4]
        mov %EDX, %EAX
        add %EDX, DWORD PTR [%ESP + 8]
        ret

instead of:

l1_bar:
        mov %ECX, DWORD PTR [%ESP + 4]
        mov %EDX, 1
        mov %EAX, %ECX
        mul %EDX
        add %EDX, %ECX
        add %EDX, DWORD PTR [%ESP + 8]
        mov %EAX, %ECX
        ret

llvm-svn: 22044
2005-05-15 05:39:08 +00:00
Chris Lattner
46de5c99bd Fix construction of ioport intrinsics, fixing X86/io.llx and io-port.llx
llvm-svn: 22026
2005-05-14 13:56:55 +00:00
Chris Lattner
052759b78c allow token chain at start or end of node
llvm-svn: 22020
2005-05-14 08:34:53 +00:00
Chris Lattner
d9e36f94bb remove special case hacks for readport/readio from the binary operator
codepath

llvm-svn: 22019
2005-05-14 07:45:46 +00:00
Chris Lattner
d1d8fbee2d Implement fixme's by memoizing nodes.
llvm-svn: 22018
2005-05-14 07:42:29 +00:00
Chris Lattner
ac7d55f114 Turn this into a wrapper for a simpler version of getNode.
llvm-svn: 22016
2005-05-14 07:32:14 +00:00
Chris Lattner
a035798c4b Eliminate special purpose hacks for dynamic_stack_alloc.
llvm-svn: 22015
2005-05-14 07:29:57 +00:00
Chris Lattner
b94e243d14 Use the general mechanism for creating multi-value nodes instead of using
special case hacks.

llvm-svn: 22014
2005-05-14 07:25:05 +00:00
Chris Lattner
ad411081fb Wrap long line, actually add node to the graph.
llvm-svn: 22011
2005-05-14 06:42:57 +00:00
Chris Lattner
6f7b63c7d7 legalize target-specific operations
llvm-svn: 22010
2005-05-14 06:34:48 +00:00
Chris Lattner
1afb5ae575 add a getNode() version that allows construction of any node type.
llvm-svn: 22009
2005-05-14 06:20:26 +00:00
Chris Lattner
6e81a4090f LowerOperation takes a dag
llvm-svn: 22004
2005-05-14 05:50:48 +00:00
Chris Lattner
1202c26d6e Allow targets to have a custom int64->fp expander if desired
llvm-svn: 22001
2005-05-14 05:33:54 +00:00
Chris Lattner
2163eeaa67 Align doubles on 8-byte boundaries if possible.
llvm-svn: 21993
2005-05-13 23:14:17 +00:00
Chris Lattner
9d788e93a6 Add an isTailCall flag to LowerCallTo
llvm-svn: 21958
2005-05-13 18:50:42 +00:00
Chris Lattner
3a76f85d43 Handle TAILCALL node
llvm-svn: 21957
2005-05-13 18:43:43 +00:00
Chris Lattner
01eba53a10 Emit function entry code after lowering hte arguments.
llvm-svn: 21931
2005-05-13 07:33:32 +00:00
Chris Lattner
fdc4816996 Allow targets to emit code into the entry block of each function
llvm-svn: 21930
2005-05-13 07:23:21 +00:00
Chris Lattner
670c7f516c Fix a problem that nate reduced for me.
llvm-svn: 21923
2005-05-13 05:17:00 +00:00
Chris Lattner
59bb0edb45 rename variables and functions to match renamed DAG nodes. Bonus feature:
I can actually remember which one is which now!

llvm-svn: 21922
2005-05-13 05:09:11 +00:00
Chris Lattner
c7013ec3a9 do not call expandop on the same value more than once. This fixes
X86/2004-02-22-Casts.llx

llvm-svn: 21919
2005-05-13 04:45:13 +00:00
Chris Lattner
51de10e0c6 fix a bad typeo
llvm-svn: 21917
2005-05-12 23:51:40 +00:00
Chris Lattner
00d2fb482f update comment
llvm-svn: 21916
2005-05-12 23:24:44 +00:00
Chris Lattner
094bbfcebb rename the ADJCALLSTACKDOWN/ADJCALLSTACKUP nodes to be CALLSEQ_START/BEGIN.
llvm-svn: 21915
2005-05-12 23:24:06 +00:00
Chris Lattner
dd2700de99 Pass calling convention to use into lower call to
llvm-svn: 21900
2005-05-12 19:56:57 +00:00
Chris Lattner
ad48ef0a7d fix expansion of ct[lt]z nodes
llvm-svn: 21896
2005-05-12 19:27:51 +00:00
Chris Lattner
6b5bacbc0b Expand 64-bit ctlz/cttz nodes for 32-bit targets
llvm-svn: 21895
2005-05-12 19:05:01 +00:00
Chris Lattner
3677432d39 Fix uint->fp casts on PPC, allowing UnitTests/2005-05-12-Int64ToFP to
work on it.

llvm-svn: 21894
2005-05-12 18:52:34 +00:00
Chris Lattner
dbcdac1ebf Allow something to be legalized multiple times. This can be used to reduce
legalization iteration

llvm-svn: 21892
2005-05-12 16:53:42 +00:00
Chris Lattner
a9a41e8856 Oops, don't do this after we figure out where to insert the call chains.
llvm-svn: 21890
2005-05-12 07:00:44 +00:00
Chris Lattner
b58308e6d4 Make sure to expand all nodes, avoiding unintentional node duplication.
llvm-svn: 21889
2005-05-12 06:54:21 +00:00
Chris Lattner
9f40cfa0a1 handle a common case generated by the uint64 -> FP code path better
llvm-svn: 21888
2005-05-12 06:27:02 +00:00
Chris Lattner
1c248e7462 add fixme
llvm-svn: 21887
2005-05-12 06:04:14 +00:00
Chris Lattner
1196356365 Fix a problem where early legalization can cause token chain problems.
llvm-svn: 21885
2005-05-12 04:49:08 +00:00
Chris Lattner
b38ffd7fbf Make legalize a bit more efficient, and canonicalize sub X, C -> add X, -C
llvm-svn: 21882
2005-05-12 00:17:04 +00:00
Nate Begeman
e84f776b5d Necessary changes to codegen cttz efficiently on PowerPC
1. Teach LegalizeDAG how to better legalize CTTZ if the target doesn't have
   CTPOP, but does have CTLZ
2. Teach PPC32 how to do sub x, const -> add x, -const for valid consts
3. Teach PPC32 how to do and (xor a, -1) b -> andc b, a
4. Teach PPC32 that ISD::CTLZ -> PPC::CNTLZW

llvm-svn: 21880
2005-05-11 23:43:56 +00:00
Chris Lattner
eeeaf45bba Fix the last remaining bug preventing us from switching the X86 BE over
from the simple isel to the pattern isel.  This forces inserted libcalls
to serialize against other function calls, which was breaking
UnitTests/2005-05-12-Int64ToFP.  Hopefully this will fix issues on other
targets as well.

llvm-svn: 21872
2005-05-11 19:02:11 +00:00
Chris Lattner
296754995e Do not memoize ADJCALLSTACKDOWN nodes, provide a method to hack on them.
llvm-svn: 21871
2005-05-11 18:57:39 +00:00
Chris Lattner
74763db128 wrap long line
llvm-svn: 21870
2005-05-11 18:57:06 +00:00
Chris Lattner
d76582b540 Make sure to legalize generated ctpop nodes, convert tabs to spaces
llvm-svn: 21868
2005-05-11 18:35:21 +00:00
Duraid Madina
8ad9786fcd expand count-leading/trailing-zeros; the test 2005-05-11-Popcount-ffs-fls.c
should now pass (the "LLVM" and "REF" results should be identical)

llvm-svn: 21866
2005-05-11 08:45:08 +00:00
Chris Lattner
b452b5aa42 Add some notes for expanding clz/ctz
llvm-svn: 21862
2005-05-11 05:27:09 +00:00
Chris Lattner
4f05136f61 Simplify this code, use the proper shift amount
llvm-svn: 21861
2005-05-11 05:21:31 +00:00
Chris Lattner
3edc8ecb53 Legalize this correctly
llvm-svn: 21859
2005-05-11 05:09:47 +00:00
Chris Lattner
457996c4a6 implement expansion of ctpop nodes, implementing CodeGen/Generic/llvm-ct-intrinsics.ll
llvm-svn: 21856
2005-05-11 04:51:16 +00:00
Chris Lattner
ce84b90a3d Print bit count nodes correctly
llvm-svn: 21855
2005-05-11 04:50:30 +00:00
Jeff Cohen
afc58006b7 Silence some VC++ warnings
llvm-svn: 21838
2005-05-10 02:22:38 +00:00
Chris Lattner
5edb4c4af6 The semantics of cast X to bool are a comparison against zero, not a truncation!
llvm-svn: 21833
2005-05-09 22:17:13 +00:00
Chris Lattner
95c836384b legalize readio/writeio into a load/store if requested
llvm-svn: 21827
2005-05-09 20:36:57 +00:00
Chris Lattner
7cc8edfc30 legalize READPORT, WRITEPORT, READIO, WRITEIO, at least in the basic cases
where they are directly supported by the architecture.  Wrap a bunch of
long lines :(

llvm-svn: 21826
2005-05-09 20:23:03 +00:00
Chris Lattner
af6bde0db6 Add support for matching the READPORT, WRITEPORT, READIO, WRITEIO intrinsics
llvm-svn: 21825
2005-05-09 20:22:36 +00:00
Chris Lattner
eee649df34 Add support for READPORT, WRITEPORT, READIO, WRITEIO
llvm-svn: 21824
2005-05-09 20:22:17 +00:00
Chris Lattner
c3fa88e7c8 Fold shifts into subsequent SHL's. These shifts often arise due to addrses
arithmetic lowering.

llvm-svn: 21818
2005-05-09 17:06:45 +00:00
Chris Lattner
a1e633ef7a Don't use the load/store instruction as the source pointer, use the pointer
being stored/loaded through!

llvm-svn: 21806
2005-05-09 04:28:51 +00:00
Chris Lattner
bfbefe0837 memoize all nodes, even null Value* nodes. Do not add two token chain outputs
llvm-svn: 21805
2005-05-09 04:14:13 +00:00
Chris Lattner
b85030373d wrap long lines
llvm-svn: 21804
2005-05-09 04:08:33 +00:00
Chris Lattner
6ffae1a3ec Print SrcValue nodes correctly
llvm-svn: 21803
2005-05-09 04:08:27 +00:00
Chris Lattner
6e8167d1c2 When hitting an unsupported intrinsic, actually print it
Lower debug info to noops.

llvm-svn: 21698
2005-05-05 17:55:17 +00:00
Andrew Lenharth
09c3c4add4 ctpop lowering in legalize
llvm-svn: 21697
2005-05-05 15:55:21 +00:00
Andrew Lenharth
9282d00d4f Make promoteOp work for CT*
Proof?

ubyte %bar(ubyte %x) {
entry:
        %tmp.1 = call ubyte %llvm.ctlz( ubyte %x )
        ret ubyte %tmp.1
}

==>

zapnot $16,1,$0
CTLZ $0,$0
subq $0,56,$0
zapnot $0,1,$0
ret $31,($26),1

llvm-svn: 21691
2005-05-04 19:11:05 +00:00
Andrew Lenharth
8b64bd0fd5 Implement count leading zeros (ctlz), count trailing zeros (cttz), and count
population (ctpop).  Generic lowering is implemented, however only promotion
is implemented for SelectionDAG at the moment.

More coming soon.

llvm-svn: 21676
2005-05-03 17:19:30 +00:00
Chris Lattner
fe72cdf838 Codegen and legalize sin/cos/llvm.sqrt as FSIN/FCOS/FSQRT calls. This patch
was contributed by Morten Ofstad, with some minor tweaks and bug fixes added
by me.

llvm-svn: 21636
2005-04-30 04:43:14 +00:00
Chris Lattner
6ec8bb9e8d Legalize FSQRT, FSIN, FCOS nodes, patch contributed by Morten Ofstad
llvm-svn: 21606
2005-04-28 21:44:33 +00:00
Chris Lattner
4678a790e6 Add FSQRT, FSIN, FCOS nodes, patch contributed by Morten Ofstad
llvm-svn: 21605
2005-04-28 21:44:03 +00:00
Andrew Lenharth
2a00530fa7 Implement Value* tracking for loads and stores in the selection DAG. This enables one to use alias analysis in the backends.
(TRUNK)Stores and (EXT|ZEXT|SEXT)Loads have an extra SDOperand which is a SrcValueSDNode which contains the Value*.  Note that if the operation is introduced by the backend, it will still have the operand, but the value* will be null.

llvm-svn: 21599
2005-04-27 20:10:01 +00:00
Chris Lattner
15bcc5273b Fold (X > -1) | (Y > -1) --> (X&Y > -1)
llvm-svn: 21552
2005-04-26 01:18:33 +00:00
Chris Lattner
d8ac4da793 implement some more logical compares with constants, so that:
int foo1(int x, int y) {
  int t1 = x >= 0;
  int t2 = y >= 0;
  return t1 & t2;
}
int foo2(int x, int y) {
  int t1 = x == -1;
  int t2 = y == -1;
  return t1 & t2;
}

produces:

_foo1:
        or r2, r4, r3
        srwi r2, r2, 31
        xori r3, r2, 1
        blr
_foo2:
        and r2, r4, r3
        addic r2, r2, 1
        li r2, 0
        addze r3, r2
        blr

instead of:

_foo1:
        srwi r2, r4, 31
        xori r2, r2, 1
        srwi r3, r3, 31
        xori r3, r3, 1
        and r3, r2, r3
        blr
_foo2:
        addic r2, r4, 1
        li r2, 0
        addze r2, r2
        addic r3, r3, 1
        li r3, 0
        addze r3, r3
        and r3, r2, r3
        blr

llvm-svn: 21547
2005-04-25 21:20:28 +00:00
Chris Lattner
7931b75a81 Codegen x < 0 | y < 0 as (x|y) < 0. This allows us to compile this to:
_foo:
        or r2, r4, r3
        srwi r3, r2, 31
        blr

instead of:

_foo:
        srwi r2, r4, 31
        srwi r3, r3, 31
        or r3, r2, r3
        blr

llvm-svn: 21544
2005-04-25 21:03:25 +00:00
Misha Brukman
a9a1982a44 Convert tabs to spaces
llvm-svn: 21439
2005-04-22 04:01:18 +00:00
Misha Brukman
774e55c446 Remove trailing whitespace
llvm-svn: 21420
2005-04-21 22:36:52 +00:00
Chris Lattner
87fbc1c554 Improve and elimination. On PPC, for:
bool %test(int %X) {
        %Y = and int %X, 8
        %Z = setne int %Y, 0
        ret bool %Z
}

we now generate this:

        rlwinm r2, r3, 0, 28, 28
        srwi r3, r2, 3

instead of this:

        rlwinm r2, r3, 0, 28, 28
        srwi r2, r2, 3
        rlwinm r3, r2, 0, 31, 31

I'll leave it to Nate to get it down to one instruction. :)

---------------------------------------------------------------------

llvm-svn: 21391
2005-04-21 06:28:15 +00:00
Chris Lattner
d0a2fda2c6 Fold (x & 8) != 0 and (x & 8) == 8 into (x & 8) >> 3.
This turns this PPC code:

        rlwinm r2, r3, 0, 28, 28
        cmpwi cr7, r2, 8
        mfcr r2
        rlwinm r3, r2, 31, 31, 31

into this:

        rlwinm r2, r3, 0, 28, 28
        srwi r2, r2, 3
        rlwinm r3, r2, 0, 31, 31

Next up, nuking the extra and.

llvm-svn: 21390
2005-04-21 06:12:41 +00:00
Chris Lattner
188ecaab1d Fold setcc of MVT::i1 operands into logical operations
llvm-svn: 21319
2005-04-18 04:48:12 +00:00
Chris Lattner
72aca1b758 Another minor simplification: handle setcc (zero_extend x), c -> setcc(x, c')
llvm-svn: 21318
2005-04-18 04:30:45 +00:00
Chris Lattner
e6117e5d4f Another simple xform
llvm-svn: 21317
2005-04-18 04:11:19 +00:00
Chris Lattner
f6f5b23a00 Fold:
// (X != 0) | (Y != 0) -> (X|Y != 0)
        // (X == 0) & (Y == 0) -> (X|Y == 0)

Compiling this:

int %bar(int %a, int %b) {
        entry:
        %tmp.1 = setne int %a, 0
        %tmp.2 = setne int %b, 0
        %tmp.3 = or bool %tmp.1, %tmp.2
        %retval = cast bool %tmp.3 to int
        ret int %retval
        }

to this:

_bar:
        or r2, r3, r4
        addic r3, r2, -1
        subfe r3, r3, r2
        blr

instead of:

_bar:
        addic r2, r3, -1
        subfe r2, r2, r3
        addic r3, r4, -1
        subfe r3, r3, r4
        or r3, r2, r3
        blr

llvm-svn: 21316
2005-04-18 03:59:53 +00:00
Chris Lattner
a32c50520c Make the AND elimination operation recursive and significantly more powerful,
eliminating an and for Nate's testcase:

int %bar(int %a, int %b) {
        entry:
        %tmp.1 = setne int %a, 0
        %tmp.2 = setne int %b, 0
        %tmp.3 = or bool %tmp.1, %tmp.2
        %retval = cast bool %tmp.3 to int
        ret int %retval
        }

generating:

_bar:
        addic r2, r3, -1
        subfe r2, r2, r3
        addic r3, r4, -1
        subfe r3, r3, r4
        or r3, r2, r3
        blr

instead of:

_bar:
        addic r2, r3, -1
        subfe r2, r2, r3
        addic r3, r4, -1
        subfe r3, r3, r4
        or r2, r2, r3
        rlwinm r3, r2, 0, 31, 31
        blr

llvm-svn: 21315
2005-04-18 03:48:41 +00:00
Nate Begeman
ce63e383b8 Add a couple missing transforms in getSetCC that were triggering assertions
in the PPC Pattern ISel

llvm-svn: 21297
2005-04-14 08:56:52 +00:00
Nate Begeman
20b3399465 Disbale the broken fold of shift + sz[ext] for now
Move the transform for select (a < 0) ? b : 0 into the dag from ppc isel
Enable the dag to fold and (setcc, 1) -> setcc for targets where setcc
  always produces zero or one.

llvm-svn: 21291
2005-04-13 21:23:31 +00:00
Chris Lattner
89f7e115a4 fix an infinite loop
llvm-svn: 21289
2005-04-13 20:06:29 +00:00
Chris Lattner
475fe85ddf fix some serious miscompiles on ia64, alpha, and ppc
llvm-svn: 21288
2005-04-13 19:53:40 +00:00
Chris Lattner
03d675414e avoid work when possible, perhaps fix the problem nate and andrew are seeing
with != 0 comparisons vanishing.

llvm-svn: 21287
2005-04-13 19:41:05 +00:00
Chris Lattner
9540cf8c7e Implement expansion of unsigned i64 -> FP.
Note that this probably only works for little endian targets, but is enough
to get siod working :)

llvm-svn: 21280
2005-04-13 05:09:42 +00:00
Chris Lattner
1a6247ff51 Make expansion of uint->fp cast assert out instead of infinitely recurse.
llvm-svn: 21275
2005-04-13 03:42:14 +00:00
Chris Lattner
63450e87d9 add back the optimization that Nate added for shl X, (zext_inreg y)
llvm-svn: 21273
2005-04-13 02:58:13 +00:00
Chris Lattner
759afe07d7 Oops, remove these too.
llvm-svn: 21272
2005-04-13 02:47:57 +00:00
Chris Lattner
4f188f949c Instead of making ZERO_EXTEND_INREG nodes, use the helper method in
SelectionDAG to do the job with AND.  Don't legalize Z_E_I anymore as
it is gone

llvm-svn: 21266
2005-04-13 02:38:47 +00:00
Chris Lattner
bce0030a88 Remove all foldings of ZERO_EXTEND_INREG, moving them to work for AND nodes
instead.  OVerall, this increases the amount of folding we can do.

llvm-svn: 21265
2005-04-13 02:38:18 +00:00
Nate Begeman
38d8248a9e Fold shift x, [sz]ext(y) -> shift x, y
llvm-svn: 21262
2005-04-12 23:32:28 +00:00
Nate Begeman
a56527ea5f Fold shift by size larger than type size to undef
Make llvm undef values generate ISD::UNDEF nodes

llvm-svn: 21261
2005-04-12 23:12:17 +00:00
Chris Lattner
58f72ab722 promote extload i1 -> extload i8
llvm-svn: 21258
2005-04-12 20:30:10 +00:00
Chris Lattner
cfc7093ca6 Remove some redundant checks, add a couple of new ones. This allows us to
compile this:

int foo (unsigned long a, unsigned long long g) {
  return a >= g;
}

To:

foo:
        movl 8(%esp), %eax
        cmpl %eax, 4(%esp)
        setae %al
        cmpl $0, 12(%esp)
        sete %cl
        andb %al, %cl
        movzbl %cl, %eax
        ret

instead of:

foo:
        movl 8(%esp), %eax
        cmpl %eax, 4(%esp)
        setae %al
        movzbw %al, %cx
        movl 12(%esp), %edx
        cmpl $0, %edx
        sete %al
        movzbw %al, %ax
        cmpl $0, %edx
        cmove %cx, %ax
        movzbl %al, %eax
        ret

llvm-svn: 21244
2005-04-12 02:54:39 +00:00
Chris Lattner
61f353dbdc Emit comparisons against the sign bit better. Codegen this:
bool %test1(long %X) {
        %A = setlt long %X, 0
        ret bool %A
}

like this:

test1:
        cmpl $0, 8(%esp)
        setl %al
        movzbl %al, %eax
        ret

instead of:

test1:
        movl 8(%esp), %ecx
        cmpl $0, %ecx
        setl %al
        movzbw %al, %ax
        cmpl $0, 4(%esp)
        setb %dl
        movzbw %dl, %dx
        cmpl $0, %ecx
        cmove %dx, %ax
        movzbl %al, %eax
        ret

llvm-svn: 21243
2005-04-12 02:19:10 +00:00
Chris Lattner
6cbbb55967 Emit long comparison against -1 better. Instead of this (x86):
test2:
        movl 8(%esp), %eax
        notl %eax
        movl 4(%esp), %ecx
        notl %ecx
        orl %eax, %ecx
        cmpl $0, %ecx
        sete %al
        movzbl %al, %eax
        ret

or this (PPC):

_test2:
        nor r2, r4, r4
        nor r3, r3, r3
        or r2, r2, r3
        cntlzw r2, r2
        srwi r3, r2, 5
        blr

Emit this:

test2:
        movl 8(%esp), %eax
        andl 4(%esp), %eax
        cmpl $-1, %eax
        sete %al
        movzbl %al, %eax
        ret

or this:

_test2:
.LBB_test2_0:   ;
        and r2, r4, r3
        cmpwi cr0, r2, -1
        li r3, 1
        li r2, 0
        beq .LBB_test2_2        ;
.LBB_test2_1:   ;
        or r3, r2, r2
.LBB_test2_2:   ;
        blr

it seems like the PPC isel could do better for R32 == -1 case.

llvm-svn: 21242
2005-04-12 01:46:05 +00:00
Chris Lattner
37534d43d0 canonicalize x <u 1 -> x == 0. On this testcase:
unsigned long long g;
unsigned long foo (unsigned long a) {
  return (a >= g) ? 1 : 0;
}

It changes the ppc code from:

_foo:
.LBB_foo_0:     ; entry
        mflr r11
        stw r11, 8(r1)
        bl "L00000$pb"
"L00000$pb":
        mflr r2
        addis r2, r2, ha16(L_g$non_lazy_ptr-"L00000$pb")
        lwz r2, lo16(L_g$non_lazy_ptr-"L00000$pb")(r2)
        lwz r4, 0(r2)
        lwz r2, 4(r2)
        cmplw cr0, r3, r2
        li r2, 1
        li r3, 0
        bge .LBB_foo_2  ; entry
.LBB_foo_1:     ; entry
        or r2, r3, r3
.LBB_foo_2:     ; entry
        cmplwi cr0, r4, 1
        li r3, 1
        li r5, 0
        blt .LBB_foo_4  ; entry
.LBB_foo_3:     ; entry
        or r3, r5, r5
.LBB_foo_4:     ; entry
        cmpwi cr0, r4, 0
        beq .LBB_foo_6  ; entry
.LBB_foo_5:     ; entry
        or r2, r3, r3
.LBB_foo_6:     ; entry
        rlwinm r3, r2, 0, 31, 31
        lwz r11, 8(r1)
        mtlr r11
        blr


to:

_foo:
.LBB_foo_0:     ; entry
        mflr r11
        stw r11, 8(r1)
        bl "L00000$pb"
"L00000$pb":
        mflr r2
        addis r2, r2, ha16(L_g$non_lazy_ptr-"L00000$pb")
        lwz r2, lo16(L_g$non_lazy_ptr-"L00000$pb")(r2)
        lwz r4, 0(r2)
        lwz r2, 4(r2)
        cmplw cr0, r3, r2
        li r2, 1
        li r3, 0
        bge .LBB_foo_2  ; entry
.LBB_foo_1:     ; entry
        or r2, r3, r3
.LBB_foo_2:     ; entry
        cntlzw r3, r4
        srwi r3, r3, 5
        cmpwi cr0, r4, 0
        beq .LBB_foo_4  ; entry
.LBB_foo_3:     ; entry
        or r2, r3, r3
.LBB_foo_4:     ; entry
        rlwinm r3, r2, 0, 31, 31
        lwz r11, 8(r1)
        mtlr r11
        blr

llvm-svn: 21241
2005-04-12 00:28:49 +00:00
Chris Lattner
7f0f0854fa Teach the dag mechanism that this:
long long test2(unsigned A, unsigned B) {
        return ((unsigned long long)A << 32) + B;
}

is equivalent to this:

long long test1(unsigned A, unsigned B) {
        return ((unsigned long long)A << 32) | B;
}

Now they are both codegen'd to this on ppc:

_test2:
        blr

or this on x86:

test2:
        movl 4(%esp), %edx
        movl 8(%esp), %eax
        ret

llvm-svn: 21231
2005-04-11 20:29:59 +00:00
Chris Lattner
71f3d4ce57 Fix expansion of shifts by exactly NVT bits on arch's (like X86) that have
masking shifts.

This fixes the miscompilation of this:

long long test1(unsigned A, unsigned B) {
        return ((unsigned long long)A << 32) | B;
}

into this:

test1:
        movl 4(%esp), %edx
        movl %edx, %eax
        orl 8(%esp), %eax
        ret

allowing us to generate this instead:

test1:
        movl 4(%esp), %edx
        movl 8(%esp), %eax
        ret

llvm-svn: 21230
2005-04-11 20:08:52 +00:00
Nate Begeman
32163963cb Fix libcall code to not pass a NULL Chain to LowerCallTo
Fix libcall code to not crash or assert looking for an ADJCALLSTACKUP node
  when it is known that there is no ADJCALLSTACKDOWN to match.
Expand i64 multiply when ISD::MULHU is legal for the target.

llvm-svn: 21214
2005-04-11 03:01:51 +00:00
Chris Lattner
4f26677dc9 Don't bother sign/zext_inreg'ing the result of an and operation if we know
the result does change as a result of the extend.

This improves codegen for Alpha on this testcase:

int %a(ushort* %i) {
        %tmp.1 = load ushort* %i
        %tmp.2 = cast ushort %tmp.1 to int
        %tmp.4 = and int %tmp.2, 1
        ret int %tmp.4
}

Generating:

a:
        ldgp $29, 0($27)
        ldwu $0,0($16)
        and $0,1,$0
        ret $31,($26),1

instead of:

a:
        ldgp $29, 0($27)
        ldwu $0,0($16)
        and $0,1,$0
        addl $0,0,$0
        ret $31,($26),1

btw, alpha really should switch to livein/outs for args :)

llvm-svn: 21213
2005-04-10 23:37:16 +00:00
Chris Lattner
c730ea00e2 Teach legalize to deal with targets that don't support some SEXTLOAD/ZEXTLOADs
llvm-svn: 21212
2005-04-10 22:54:25 +00:00
Chris Lattner
1b9e1e26cb don't zextload fp values!
llvm-svn: 21209
2005-04-10 17:40:35 +00:00
Chris Lattner
0c089eae41 Until we have a dag combiner, promote using zextload's instead of extloads.
This gives the optimizer a bit of information about the top-part of the
value.

llvm-svn: 21205
2005-04-10 04:33:47 +00:00
Chris Lattner
9d13d0b958 Fold zext_inreg(zextload), likewise for sext's
llvm-svn: 21204
2005-04-10 04:33:08 +00:00
Chris Lattner
9c8fe594e5 add a simple xform
llvm-svn: 21203
2005-04-10 04:04:49 +00:00
Chris Lattner
b3518a838c Fix a thinko. If the operand is promoted, pass the promoted value into
the new zero extend, not the original operand.  This fixes cast bool -> long
on ppc.

Add an unrelated fixme

llvm-svn: 21196
2005-04-10 01:13:15 +00:00
Chris Lattner
034716de24 add a little peephole optimization. This allows us to codegen:
int a(short i) {
        return i & 1;
}

as

_a:
        andi. r3, r3, 1
        blr

instead of:

_a:
        rlwinm r2, r3, 0, 16, 31
        andi. r3, r2, 1
        blr

on ppc.  It should also help the other risc targets.

llvm-svn: 21189
2005-04-09 21:43:54 +00:00
Chris Lattner
afa0001d54 recognize some patterns as fabs operations, so that fabs at the source level
is deconstructed then reconstructed here.  This catches 19 fabs's in 177.mesa
9 in 168.wupwise, 5 in 171.swim, 3 in 172.mgrid, and 14 in 173.applu out of
specfp2000.

This allows the X86 code generator to make MUCH better code than before for
each of these and saves one instr on ppc.

This depends on the previous CFE patch to expose these correctly.

llvm-svn: 21171
2005-04-09 05:15:53 +00:00
Chris Lattner
8e6eafa8e1 Emit BRCONDTWOWAY when possible.
llvm-svn: 21167
2005-04-09 03:30:29 +00:00
Chris Lattner
55b73bda6c Legalize BRCONDTWOWAY into a BRCOND/BR pair if a target doesn't support it.
llvm-svn: 21166
2005-04-09 03:30:19 +00:00
Chris Lattner
da902bdf1b print and fold BRCONDTWOWAY correctly
llvm-svn: 21165
2005-04-09 03:27:28 +00:00
Chris Lattner
31170cd2ec canonicalize a bunch of operations involving fneg
llvm-svn: 21160
2005-04-09 03:02:46 +00:00
Chris Lattner
9a56ef5693 If a target zero or sign extends the result of its setcc, allow folding of
this into sign/zero extension instructions later.

On PPC, for example, this testcase:

%G = external global sbyte
implementation
void %test(int %X, int %Y) {
  %C = setlt int %X, %Y
  %D = cast bool %C to sbyte
  store sbyte %D, sbyte* %G
  ret void
}

Now codegens to:

        cmpw cr0, r3, r4
        li r3, 1
        li r4, 0
        blt .LBB_test_2 ;
.LBB_test_1:    ;
        or r3, r4, r4
.LBB_test_2:    ;
        addis r2, r2, ha16(L_G$non_lazy_ptr-"L00000$pb")
        lwz r2, lo16(L_G$non_lazy_ptr-"L00000$pb")(r2)
        stb r3, 0(r2)

instead of:

        cmpw cr0, r3, r4
        li r3, 1
        li r4, 0
        blt .LBB_test_2 ;
.LBB_test_1:    ;
        or r3, r4, r4
.LBB_test_2:    ;
***     rlwinm r3, r3, 0, 31, 31
        addis r2, r2, ha16(L_G$non_lazy_ptr-"L00000$pb")
        lwz r2, lo16(L_G$non_lazy_ptr-"L00000$pb")(r2)
        stb r3, 0(r2)

llvm-svn: 21148
2005-04-07 19:43:53 +00:00