1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-25 12:12:47 +01:00
Commit Graph

10639 Commits

Author SHA1 Message Date
Chris Lattner
4ac016991c Allow indvar simplify to canonicalize ANY affine IV, not just affine IVs with
constant stride.  This implements Transforms/IndVarsSimplify/variable-stride-ivs.ll

llvm-svn: 22744
2005-08-10 01:12:06 +00:00
Chris Lattner
0730ac081a Fix an obvious oops
llvm-svn: 22742
2005-08-10 00:59:40 +00:00
Chris Lattner
8c7e769325 Teach LSR to strength reduce IVs that have a loop-invariant but non-constant stride.
For code like this:

void foo(float *a, float *b, int n, int stride_a, int stride_b) {
  int i;
  for (i=0; i<n; i++)
      a[i*stride_a] = b[i*stride_b];
}

we now emit:

.LBB_foo2_2:    ; no_exit
        lfs f0, 0(r4)
        stfs f0, 0(r3)
        addi r7, r7, 1
        add r4, r2, r4
        add r3, r6, r3
        cmpw cr0, r7, r5
        blt .LBB_foo2_2 ; no_exit

instead of:

.LBB_foo_2:     ; no_exit
        mullw r8, r2, r7     ;; multiply!
        slwi r8, r8, 2
        lfsx f0, r4, r8
        mullw r8, r2, r6     ;; multiply!
        slwi r8, r8, 2
        stfsx f0, r3, r8
        addi r2, r2, 1
        cmpw cr0, r2, r5
        blt .LBB_foo_2  ; no_exit

loops with variable strides occur pretty often.  For example, in SPECFP2K
there are 317 variable strides in 177.mesa, 3 in 179.art, 14 in 188.ammp,
56 in 168.wupwise, 36 in 172.mgrid.

Now we can allow indvars to turn functions written like this:

void foo2(float *a, float *b, int n, int stride_a, int stride_b) {
  int i, ai = 0, bi = 0;
  for (i=0; i<n; i++)
    {
      a[ai] = b[bi];
      ai += stride_a;
      bi += stride_b;
    }
}

into code like the above for better analysis.  With this patch, they generate
identical code.

llvm-svn: 22740
2005-08-10 00:45:21 +00:00
Chris Lattner
3d251b90f3 Fix Regression/Transforms/LoopStrengthReduce/phi_node_update_multiple_preds.ll
by being more careful about updating PHI nodes

llvm-svn: 22739
2005-08-10 00:35:32 +00:00
Chris Lattner
24f927cfe9 Fix some 80 column violations.
Once we compute the evolution for a GEP, tell SE about it.  This allows users
of the GEP to know it, if the users are not direct.  This allows us to compile
this testcase:

void fbSolidFillmmx(int w, unsigned char *d) {
    while (w >= 64) {
        *(unsigned long long *) (d +  0) = 0;
        *(unsigned long long *) (d +  8) = 0;
        *(unsigned long long *) (d + 16) = 0;
        *(unsigned long long *) (d + 24) = 0;
        *(unsigned long long *) (d + 32) = 0;
        *(unsigned long long *) (d + 40) = 0;
        *(unsigned long long *) (d + 48) = 0;
        *(unsigned long long *) (d + 56) = 0;
        w -= 64;
        d += 64;
    }
}

into:

.LBB_fbSolidFillmmx_2:  ; no_exit
        li r2, 0
        stw r2, 0(r4)
        stw r2, 4(r4)
        stw r2, 8(r4)
        stw r2, 12(r4)
        stw r2, 16(r4)
        stw r2, 20(r4)
        stw r2, 24(r4)
        stw r2, 28(r4)
        stw r2, 32(r4)
        stw r2, 36(r4)
        stw r2, 40(r4)
        stw r2, 44(r4)
        stw r2, 48(r4)
        stw r2, 52(r4)
        stw r2, 56(r4)
        stw r2, 60(r4)
        addi r4, r4, 64
        addi r3, r3, -64
        cmpwi cr0, r3, 63
        bgt .LBB_fbSolidFillmmx_2       ; no_exit

instead of:

.LBB_fbSolidFillmmx_2:  ; no_exit
        li r11, 0
        stw r11, 0(r4)
        stw r11, 4(r4)
        stwx r11, r10, r4
        add r12, r10, r4
        stw r11, 4(r12)
        stwx r11, r9, r4
        add r12, r9, r4
        stw r11, 4(r12)
        stwx r11, r8, r4
        add r12, r8, r4
        stw r11, 4(r12)
        stwx r11, r7, r4
        add r12, r7, r4
        stw r11, 4(r12)
        stwx r11, r6, r4
        add r12, r6, r4
        stw r11, 4(r12)
        stwx r11, r5, r4
        add r12, r5, r4
        stw r11, 4(r12)
        stwx r11, r2, r4
        add r12, r2, r4
        stw r11, 4(r12)
        addi r4, r4, 64
        addi r3, r3, -64
        cmpwi cr0, r3, 63
        bgt .LBB_fbSolidFillmmx_2       ; no_exit

llvm-svn: 22737
2005-08-09 23:39:36 +00:00
Chris Lattner
6ca08d5739 implement two helper methods
llvm-svn: 22736
2005-08-09 23:36:33 +00:00
Chris Lattner
3179a74493 Fix spelling, fix some broken canonicalizations by my last patch
llvm-svn: 22734
2005-08-09 23:09:05 +00:00
Chris Lattner
5ad0216bd6 add a optimization note
llvm-svn: 22732
2005-08-09 22:30:57 +00:00
Chris Lattner
3290ca9983 add cc nodes to the AllNodes list so they show up in Graphviz output
llvm-svn: 22731
2005-08-09 20:40:02 +00:00
Chris Lattner
1277703a48 Update the targets to the new SETCC/CondCodeSDNode interfaces.
llvm-svn: 22729
2005-08-09 20:21:10 +00:00
Chris Lattner
0fa4402b59 Eliminate the SetCCSDNode in favor of a CondCodeSDNode class. This pulls the
CC out of the SetCC operation, making SETCC a standard ternary operation and
CC's a standard DAG leaf.  This will make it possible for other node to use
CC's as operands in the future...

llvm-svn: 22728
2005-08-09 20:20:18 +00:00
Chris Lattner
1552a40112 Minor cleanup patch, no functionality changes. Written by Jim Laskey.
llvm-svn: 22727
2005-08-09 18:29:55 +00:00
Chris Lattner
b3baf30fdd Fix CodeGen/Generic/div-neg-power-2.ll, a regression from last night.
llvm-svn: 22726
2005-08-09 18:08:41 +00:00
Chris Lattner
2872f369f0 SCEVAddExpr::get() of an empty list is invalid.
llvm-svn: 22724
2005-08-09 01:13:47 +00:00
Chris Lattner
11dd32a826 Implement: LoopStrengthReduce/share_ivs.ll
Two changes:
  * Only insert one PHI node for each stride.  Other values are live in
    values.  This cannot introduce higher register pressure than the
    previous approach, and can take advantage of reg+reg addressing modes.
  * Factor common base values out of uses before moving values from the
    base to the immediate fields.  This improves codegen by starting the
    stride-specific PHI node out at a common place for each IV use.

As an example, we used to generate this for a loop in swim:

.LBB_main_no_exit_2E_6_2E_i_no_exit_2E_7_2E_i_2:        ; no_exit.7.i
        lfd f0, 0(r8)
        stfd f0, 0(r3)
        lfd f0, 0(r6)
        stfd f0, 0(r7)
        lfd f0, 0(r2)
        stfd f0, 0(r5)
        addi r9, r9, 1
        addi r2, r2, 8
        addi r5, r5, 8
        addi r6, r6, 8
        addi r7, r7, 8
        addi r8, r8, 8
        addi r3, r3, 8
        cmpw cr0, r9, r4
        bgt .LBB_main_no_exit_2E_6_2E_i_no_exit_2E_7_2E_i_1

now we emit:

.LBB_main_no_exit_2E_6_2E_i_no_exit_2E_7_2E_i_2:        ; no_exit.7.i
        lfdx f0, r8, r2
        stfdx f0, r9, r2
        lfdx f0, r5, r2
        stfdx f0, r7, r2
        lfdx f0, r3, r2
        stfdx f0, r6, r2
        addi r10, r10, 1
        addi r2, r2, 8
        cmpw cr0, r10, r4
        bgt .LBB_main_no_exit_2E_6_2E_i_no_exit_2E_7_2E_i_1

As another more dramatic example, we used to emit this:

.LBB_main_L_90_no_exit_2E_0_2E_i16_no_exit_2E_1_2E_i19_2:       ; no_exit.1.i19
        lfd f0, 8(r21)
        lfd f4, 8(r3)
        lfd f5, 8(r27)
        lfd f6, 8(r22)
        lfd f7, 8(r5)
        lfd f8, 8(r6)
        lfd f9, 8(r30)
        lfd f10, 8(r11)
        lfd f11, 8(r12)
        fsub f10, f10, f11
        fadd f5, f4, f5
        fmul f5, f5, f1
        fadd f6, f6, f7
        fadd f6, f6, f8
        fadd f6, f6, f9
        fmadd f0, f5, f6, f0
        fnmsub f0, f10, f2, f0
        stfd f0, 8(r4)
        lfd f0, 8(r25)
        lfd f5, 8(r26)
        lfd f6, 8(r23)
        lfd f9, 8(r28)
        lfd f10, 8(r10)
        lfd f12, 8(r9)
        lfd f13, 8(r29)
        fsub f11, f13, f11
        fadd f4, f4, f5
        fmul f4, f4, f1
        fadd f5, f6, f9
        fadd f5, f5, f10
        fadd f5, f5, f12
        fnmsub f0, f4, f5, f0
        fnmsub f0, f11, f3, f0
        stfd f0, 8(r24)
        lfd f0, 8(r8)
        fsub f4, f7, f8
        fsub f5, f12, f10
        fnmsub f0, f5, f2, f0
        fnmsub f0, f4, f3, f0
        stfd f0, 8(r2)
        addi r20, r20, 1
        addi r2, r2, 8
        addi r8, r8, 8
        addi r10, r10, 8
        addi r12, r12, 8
        addi r6, r6, 8
        addi r29, r29, 8
        addi r28, r28, 8
        addi r26, r26, 8
        addi r25, r25, 8
        addi r24, r24, 8
        addi r5, r5, 8
        addi r23, r23, 8
        addi r22, r22, 8
        addi r3, r3, 8
        addi r9, r9, 8
        addi r11, r11, 8
        addi r30, r30, 8
        addi r27, r27, 8
        addi r21, r21, 8
        addi r4, r4, 8
        cmpw cr0, r20, r7
        bgt .LBB_main_L_90_no_exit_2E_0_2E_i16_no_exit_2E_1_2E_i19_1

we now emit:

.LBB_main_L_90_no_exit_2E_0_2E_i16_no_exit_2E_1_2E_i19_2:       ; no_exit.1.i19
        lfdx f0, r21, r20
        lfdx f4, r3, r20
        lfdx f5, r27, r20
        lfdx f6, r22, r20
        lfdx f7, r5, r20
        lfdx f8, r6, r20
        lfdx f9, r30, r20
        lfdx f10, r11, r20
        lfdx f11, r12, r20
        fsub f10, f10, f11
        fadd f5, f4, f5
        fmul f5, f5, f1
        fadd f6, f6, f7
        fadd f6, f6, f8
        fadd f6, f6, f9
        fmadd f0, f5, f6, f0
        fnmsub f0, f10, f2, f0
        stfdx f0, r4, r20
        lfdx f0, r25, r20
        lfdx f5, r26, r20
        lfdx f6, r23, r20
        lfdx f9, r28, r20
        lfdx f10, r10, r20
        lfdx f12, r9, r20
        lfdx f13, r29, r20
        fsub f11, f13, f11
        fadd f4, f4, f5
        fmul f4, f4, f1
        fadd f5, f6, f9
        fadd f5, f5, f10
        fadd f5, f5, f12
        fnmsub f0, f4, f5, f0
        fnmsub f0, f11, f3, f0
        stfdx f0, r24, r20
        lfdx f0, r8, r20
        fsub f4, f7, f8
        fsub f5, f12, f10
        fnmsub f0, f5, f2, f0
        fnmsub f0, f4, f3, f0
        stfdx f0, r2, r20
        addi r19, r19, 1
        addi r20, r20, 8
        cmpw cr0, r19, r7
        bgt .LBB_main_L_90_no_exit_2E_0_2E_i16_no_exit_2E_1_2E_i19_1

llvm-svn: 22722
2005-08-09 00:18:09 +00:00
Chris Lattner
ea50bf5aca Suck the base value out of the UsersToProcess vector into the BasedUser
class to simplify the code.  Fuse two loops.

llvm-svn: 22721
2005-08-08 22:56:21 +00:00
Chris Lattner
b9d13099a7 Split MoveLoopVariantsToImediateField out from MoveImmediateValues. The
first is a correctness thing, and the later is an optzn thing.  This also
is needed to support a future change.

llvm-svn: 22720
2005-08-08 22:32:34 +00:00
Nate Begeman
6b842b4883 Factor out some common code, and be smarter about when to emit load hi/lo
code sequences.

llvm-svn: 22719
2005-08-08 22:22:56 +00:00
Chris Lattner
f1d821b665 Allow tools with "consume after" options (like lli) to take more positional
opts than they take directly.  Thanks to John C for pointing this problem
out to me!

llvm-svn: 22717
2005-08-08 21:57:27 +00:00
Chris Lattner
afd68f8f76 Remove getImmediateForOpcode, which is now dead.
Patch by Jim Laskey.

llvm-svn: 22716
2005-08-08 21:34:13 +00:00
Chris Lattner
f0eb0b2af5 Add new immediate handling support for mul/div.
Patch by Jim Laskey!

llvm-svn: 22715
2005-08-08 21:33:23 +00:00
Chris Lattner
8efdc3c8d4 Add support for OR/XOR/SUB immediates that are handled with the new immediate
way.  This allows ORI/ORIS pairs, for example.

llvm-svn: 22714
2005-08-08 21:30:29 +00:00
Chris Lattner
051d45ce3c Modify the ISD::AND opcode case to use new immediate constant predicates.
Includes wider support for rotate and mask cases.

Patch by Jim Laskey.

I've requested that Jim add new regression tests the newly handled cases.

llvm-svn: 22712
2005-08-08 21:24:57 +00:00
Chris Lattner
3b23144fc0 Modify the ISD::ADD opcode case to use new immediate constant predicates.
Includes support for 32-bit constants using addi/addis.

Patch by Jim Laskey.

llvm-svn: 22711
2005-08-08 21:21:03 +00:00
Chris Lattner
69eed9f8a7 Modify existing support functions to use new immediate constant predicates.
Patch by Jim Laskey

llvm-svn: 22710
2005-08-08 21:12:35 +00:00
Chris Lattner
fab821d774 Add support predicates for future immediate constant changes.
Patch by Jim Laskey

llvm-svn: 22709
2005-08-08 21:10:27 +00:00
Chris Lattner
f6320ae69a Move IsRunOfOnes to a more logical place and rename to a proper predicate form
(lowercase isXXX).

Patch by Jim Laskey.

llvm-svn: 22708
2005-08-08 21:08:09 +00:00
Nate Begeman
f2d22dbd9b Fix JIT encoding of ppc mfocrf instruction; the operands were reversed
llvm-svn: 22707
2005-08-08 20:04:52 +00:00
Chris Lattner
c6571e5c64 Use the new 'moveBefore' method to simplify some code. Really, which is
easier to understand?  :)

llvm-svn: 22706
2005-08-08 19:11:57 +00:00
Chris Lattner
e30b898fec Reject command lines that have too many positional arguments passed (e.g.,
'opt x y').  This fixes PR493.

Patch contributed by Owen Anderson!

llvm-svn: 22705
2005-08-08 17:25:38 +00:00
Chris Lattner
f6e6e25039 Not all constants are legal immediates in load/store instructions.
llvm-svn: 22704
2005-08-08 06:25:50 +00:00
Chris Lattner
ab45a77fed Implement LoopStrengthReduce/share_code_in_preheader.ll by having one
rewriter for all code inserted into the preheader, which is never flushed.

llvm-svn: 22702
2005-08-08 05:47:49 +00:00
Chris Lattner
dd97325bc0 Implement a simple optimization for the termination condition of the loop.
The termination condition actually wants to use the post-incremented value
of the loop, not a new indvar with an unusual base.

On PPC, for example, this allows us to compile
LoopStrengthReduce/exit_compare_live_range.ll to:

_foo:
        li r2, 0
.LBB_foo_1:     ; no_exit
        li r5, 0
        stw r5, 0(r3)
        addi r2, r2, 1
        cmpw cr0, r2, r4
        bne .LBB_foo_1  ; no_exit
        blr

instead of:

_foo:
        li r2, 1                ;; IV starts at 1, not 0
.LBB_foo_1:     ; no_exit
        li r5, 0
        stw r5, 0(r3)
        addi r5, r2, 1
        cmpw cr0, r2, r4
        or r2, r5, r5           ;; Reg-reg copy, extra live range
        bne .LBB_foo_1  ; no_exit
        blr

This implements LoopStrengthReduce/exit_compare_live_range.ll

llvm-svn: 22699
2005-08-08 05:28:22 +00:00
Chris Lattner
e698d06904 add new helper function
llvm-svn: 22698
2005-08-08 05:21:50 +00:00
Chris Lattner
e7f14fb39d Handle 64-bit constant exprs on 64-bit targets.
llvm-svn: 22696
2005-08-08 04:26:32 +00:00
Chris Lattner
a539b03210 All stats are "Number of ..."
llvm-svn: 22694
2005-08-07 20:02:04 +00:00
Chris Lattner
bbab417e32 Add some simple folds that occur in bitfield cases. Fix a minor bug in
isHighOnes, where it would consider 0 to have high ones.

llvm-svn: 22693
2005-08-07 07:03:10 +00:00
Chris Lattner
427319ff4b Fix typoCVS: ----------------------------------------------------------------------
llvm-svn: 22692
2005-08-07 07:00:52 +00:00
Chris Lattner
fdb467b18d add a small simplification that can be exposed after promotion/expansion
llvm-svn: 22691
2005-08-07 05:00:44 +00:00
Chris Lattner
5b499da0a7 * Use the new PHINode::hasConstantValue method to simplify some code
* Teach this code to move allocas out of the loop when tail call eliminating
  a call marked 'tail'.  This implements TailCallElim/move_alloca_for_tail_call.ll
* Do not perform this transformation if a call is marked 'tail' and if there
  are allocas that we cannot move out of the loop in #2.  Doing so would increase
  the stack usage of the function.  This implements fixes
  PR615 and TailCallElim/dont-tce-tail-marked-call.ll.

llvm-svn: 22690
2005-08-07 04:27:41 +00:00
Chris Lattner
93820f4ad7 Consolidate the GPOpt stuff to all use the Subtarget, instead of still
depending on the command line option.  Now the command line option just
sets the subtarget as appropriate.  G5 opts will now default to on on
G5-enabled nightly testers among other machines.

llvm-svn: 22688
2005-08-05 22:05:03 +00:00
Chris Lattner
07af090121 adjust to change in getSubtarget() api
llvm-svn: 22687
2005-08-05 21:54:27 +00:00
Chris Lattner
79095d3416 Enable gp optimizations by default when available, even when a target triple
is available, since the target triple doesn't specify whether to use gpopts
or not.

llvm-svn: 22685
2005-08-05 21:25:13 +00:00
Chris Lattner
d82395fc04 add a note
llvm-svn: 22681
2005-08-05 19:18:32 +00:00
Chris Lattner
d3a8084e5b Change FindEarliestCallSeqEnd (used by libcall insertion) to use a set to
avoid revisiting nodes more than once.  This eliminates a source of
potentially exponential behavior.  For a small function in 191.fma3d
(hexah_stress_divergence_), this speeds up isel from taking > 20mins to
taking 0.07s.

llvm-svn: 22680
2005-08-05 18:10:27 +00:00
Chris Lattner
c7a67abac2 Fix a use-of-dangling-pointer bug, from the introduction of SrcValue's.
llvm-svn: 22679
2005-08-05 16:55:31 +00:00
Chris Lattner
644edfb51e Fix a latent bug in the libcall inserter that was exposed by Nate's patch
yesterday.  This fixes whetstone and a bunch of programs in the External tests.

llvm-svn: 22678
2005-08-05 16:23:57 +00:00
Chris Lattner
852038daec don't crash when running the PPC backend on non-ppc hosts without specifying
a subtarget.

llvm-svn: 22677
2005-08-05 16:17:22 +00:00
Chris Lattner
fbcacf822a PHINode::hasConstantValue should never return the PHI itself, even if the
PHI is its only operand.

llvm-svn: 22676
2005-08-05 15:37:31 +00:00
Chris Lattner
b80ff334d8 Fix an iterator invalidation problem when we decide a phi has a constant value
llvm-svn: 22675
2005-08-05 15:34:10 +00:00
Chris Lattner
547cf56441 Make sure to clean CastedPointers after casts are potentially deleted.
This fixes LSR crashes on 301.apsi, 191.fma3d, and 189.lucas

llvm-svn: 22673
2005-08-05 01:30:11 +00:00
Chris Lattner
6fa790692f now that hasConstantValue defaults to only returning values that dominate
the PHI node, this ugly code can vanish.

llvm-svn: 22672
2005-08-05 01:04:30 +00:00
Chris Lattner
f112b97227 Invoke instructions do not dominate all successors
llvm-svn: 22671
2005-08-05 01:03:27 +00:00
Chris Lattner
4a3dc26192 Now that hasConstantValue is more careful w.r.t. returning values that only
dominate the PHI node, this code can go away.  This also makes passes more
aggressive, e.g. implementing Transforms/CondProp/phisimplify2.ll

llvm-svn: 22670
2005-08-05 01:02:04 +00:00
Chris Lattner
f9aa29ef5c Use the bool argument to hasConstantValue to decide whether the client is
prepared to deal with return values that do not dominate the PHI.  If we
cannot prove that the result dominates the PHI node, do not return it if
the client can't cope.

llvm-svn: 22669
2005-08-05 01:00:58 +00:00
Chris Lattner
915053ba79 This code can handle non-dominating instructions
llvm-svn: 22667
2005-08-05 00:57:45 +00:00
Chris Lattner
752c3e7b35 Mark hasConstantValue as a const method
llvm-svn: 22666
2005-08-05 00:49:06 +00:00
Nate Begeman
7547f4085b Add an extra parameter that Chris requested
llvm-svn: 22665
2005-08-04 23:50:43 +00:00
Nate Begeman
ef41400067 Fix a fixme in CondPropagate.cpp by moving a PhiNode optimization into
BasicBlock's removePredecessor routine.  This requires shuffling around
the definition and implementation of hasContantValue from Utils.h,cpp into
Instructions.h,cpp

llvm-svn: 22664
2005-08-04 23:24:19 +00:00
Chris Lattner
855cfc4e90 Modify how immediates are removed from base expressions to deal with the fact
that the symbolic evaluator is not always able to use subtraction to remove
expressions.  This makes the code faster, and fixes the last crash on 178.galgel.
Finally, add a statistic to see how many phi nodes are inserted.

On 178.galgel, we get the follow stats:

2562 loop-reduce  - Number of PHIs inserted
3927 loop-reduce  - Number of GEPs strength reduced

llvm-svn: 22662
2005-08-04 22:34:05 +00:00
Nate Begeman
348caa49b3 Fix a fixme in LegalizeDAG
llvm-svn: 22661
2005-08-04 21:43:28 +00:00
Nate Begeman
c76ffa6717 Hack to naturally align doubles in the constant pool. Remove this once we
know what The Right Thing To Do is.

llvm-svn: 22660
2005-08-04 21:04:09 +00:00
Nate Begeman
84d0a2806a Use the new subtarget support to automatically choose the correct ABI
and asm printer for PowerPC if one is not specified.

llvm-svn: 22659
2005-08-04 20:49:48 +00:00
Chris Lattner
07466ea612 * Refactor some code into a new BasedUser::RewriteInstructionToUseNewBase
method.
* Fix a crash on 178.galgel, where we would insert expressions before PHI
  nodes instead of into the PHI node predecessor blocks.

llvm-svn: 22657
2005-08-04 20:03:32 +00:00
Chris Lattner
c4beaec288 Fix a case that caused this to crash on 178.galgel
llvm-svn: 22653
2005-08-04 19:26:19 +00:00
Chris Lattner
5a0587224a Teach LSR about loop-variant expressions, such as loops like this:
for (i = 0; i < N; ++i)
    A[i][foo()] = 0;

here we still want to strength reduce the A[i] part, even though foo() is
l-v.

This also simplifies some of the 'CanReduce' logic.

This implements Transforms/LoopStrengthReduce/ops_after_indvar.ll

llvm-svn: 22652
2005-08-04 19:08:16 +00:00
Nate Begeman
0ab0a0f72d Remove some more dead code.
llvm-svn: 22650
2005-08-04 18:13:56 +00:00
Chris Lattner
f63b85d9c2 Refactor this code substantially with the following improvements:
1. We only analyze instructions once, guaranteed
  2. AnalyzeGetElementPtrUsers has been ripped apart and replaced with
     something much simpler.

The next step is to handle expressions that are not all indvar+loop-invariant
values (e.g. handling indvar+loopvariant).

llvm-svn: 22649
2005-08-04 17:40:30 +00:00
Andrew Lenharth
bf363b1165 No, IDEFs shouldn't be JITed
llvm-svn: 22648
2005-08-04 15:32:36 +00:00
Misha Brukman
8b8272b648 * Unbreak release build
* Add comments to #endif pragmas for readability

llvm-svn: 22647
2005-08-04 14:22:41 +00:00
Misha Brukman
e094f6c611 * Unbreak optimized build (noticed by Eric van Riet Paap)
* Comment #endif clauses for readability

llvm-svn: 22646
2005-08-04 14:16:48 +00:00
Nate Begeman
09997f1012 Add Subtarget support to PowerPC. Next up, using it.
llvm-svn: 22644
2005-08-04 07:12:09 +00:00
Chris Lattner
ea82ef5db1 refactor some code
llvm-svn: 22643
2005-08-04 01:19:13 +00:00
Chris Lattner
09241be4c7 invert to if's to make the logic simpler
llvm-svn: 22641
2005-08-04 00:40:47 +00:00
Chris Lattner
df7961ec73 When processing outer loops and we find uses of an IV in inner loops, make
sure to handle the use, just don't recurse into it.

This permits us to generate this code for a simple nested loop case:

.LBB_foo_0:     ; entry
        stwu r1, -48(r1)
        stw r29, 44(r1)
        stw r30, 40(r1)
        mflr r11
        stw r11, 56(r1)
        lis r2, ha16(L_A$non_lazy_ptr)
        lwz r30, lo16(L_A$non_lazy_ptr)(r2)
        li r29, 1
.LBB_foo_1:     ; no_exit.0
        bl L_bar$stub
        li r2, 1
        or r3, r30, r30
.LBB_foo_2:     ; no_exit.1
        lfd f0, 8(r3)
        stfd f0, 0(r3)
        addi r4, r2, 1
        addi r3, r3, 8
        cmpwi cr0, r2, 100
        or r2, r4, r4
        bne .LBB_foo_2  ; no_exit.1
.LBB_foo_3:     ; loopexit.1
        addi r30, r30, 800
        addi r2, r29, 1
        cmpwi cr0, r29, 100
        or r29, r2, r2
        bne .LBB_foo_1  ; no_exit.0
.LBB_foo_4:     ; return
        lwz r11, 56(r1)
        mtlr r11
        lwz r30, 40(r1)
        lwz r29, 44(r1)
        lwz r1, 0(r1)
        blr

instead of this:

_foo:
.LBB_foo_0:     ; entry
        stwu r1, -48(r1)
        stw r28, 44(r1)                   ;; uses an extra register.
        stw r29, 40(r1)
        stw r30, 36(r1)
        mflr r11
        stw r11, 56(r1)
        li r30, 1
        li r29, 0
        or r28, r29, r29
.LBB_foo_1:     ; no_exit.0
        bl L_bar$stub
        mulli r2, r28, 800           ;; unstrength-reduced multiply
        lis r3, ha16(L_A$non_lazy_ptr)   ;; loop invariant address computation
        lwz r3, lo16(L_A$non_lazy_ptr)(r3)
        add r2, r2, r3
        mulli r4, r29, 800           ;; unstrength-reduced multiply
        addi r3, r3, 8
        add r3, r4, r3
        li r4, 1
.LBB_foo_2:     ; no_exit.1
        lfd f0, 0(r3)
        stfd f0, 0(r2)
        addi r5, r4, 1
        addi r2, r2, 8                 ;; multiple stride 8 IV's
        addi r3, r3, 8
        cmpwi cr0, r4, 100
        or r4, r5, r5
        bne .LBB_foo_2  ; no_exit.1
.LBB_foo_3:     ; loopexit.1
        addi r28, r28, 1               ;;; Many IV's with stride 1
        addi r29, r29, 1
        addi r2, r30, 1
        cmpwi cr0, r30, 100
        or r30, r2, r2
        bne .LBB_foo_1  ; no_exit.0
.LBB_foo_4:     ; return
        lwz r11, 56(r1)
        mtlr r11
        lwz r30, 36(r1)
        lwz r29, 40(r1)
        lwz r28, 44(r1)
        lwz r1, 0(r1)
        blr

llvm-svn: 22640
2005-08-04 00:14:11 +00:00
Chris Lattner
8b1b7c9e7d Teach loop-reduce to see into nested loops, to pull out immediate values
pushed down by SCEV.

In a nested loop case, this allows us to emit this:

        lis r3, ha16(L_A$non_lazy_ptr)
        lwz r3, lo16(L_A$non_lazy_ptr)(r3)
        add r2, r2, r3
        li r3, 1
.LBB_foo_2:     ; no_exit.1
        lfd f0, 8(r2)        ;; Uses offset of 8 instead of 0
        stfd f0, 0(r2)
        addi r4, r3, 1
        addi r2, r2, 8
        cmpwi cr0, r3, 100
        or r3, r4, r4
        bne .LBB_foo_2  ; no_exit.1

instead of this:

        lis r3, ha16(L_A$non_lazy_ptr)
        lwz r3, lo16(L_A$non_lazy_ptr)(r3)
        add r2, r2, r3
        addi r3, r3, 8
        li r4, 1
.LBB_foo_2:     ; no_exit.1
        lfd f0, 0(r3)
        stfd f0, 0(r2)
        addi r5, r4, 1
        addi r2, r2, 8
        addi r3, r3, 8
        cmpwi cr0, r4, 100
        or r4, r5, r5
        bne .LBB_foo_2  ; no_exit.1

llvm-svn: 22639
2005-08-03 23:44:42 +00:00
Chris Lattner
512f74d445 improve debug output
llvm-svn: 22638
2005-08-03 23:30:08 +00:00
Nate Begeman
6cd034da8e Scalar SSE: load +0.0 -> xorps/xorpd
Scalar SSE: a < b ? c : 0.0 -> cmpss, andps
Scalar SSE: float -> i16 needs to be promoted

llvm-svn: 22637
2005-08-03 23:26:28 +00:00
Chris Lattner
1dcd811d36 Move from Stage 0 to Stage 1.
Only emit one PHI node for IV uses with identical bases and strides (after
moving foldable immediates to the load/store instruction).

This implements LoopStrengthReduce/dont_insert_redundant_ops.ll, allowing
us to generate this PPC code for test1:

        or r30, r3, r3
.LBB_test1_1:   ; Loop
        li r2, 0
        stw r2, 0(r30)
        stw r2, 4(r30)
        bl L_pred$stub
        addi r30, r30, 8
        cmplwi cr0, r3, 0
        bne .LBB_test1_1        ; Loop

instead of this code:

        or r30, r3, r3
        or r29, r3, r3
.LBB_test1_1:   ; Loop
        li r2, 0
        stw r2, 0(r29)
        stw r2, 4(r30)
        bl L_pred$stub
        addi r30, r30, 8        ;; Two iv's with step of 8
        addi r29, r29, 8
        cmplwi cr0, r3, 0
        bne .LBB_test1_1        ; Loop

llvm-svn: 22635
2005-08-03 22:51:21 +00:00
Andrew Lenharth
2865f0fe01 Alpha ABI specifies stack is always 16 byte alligned, and gcc does it, so I will too
llvm-svn: 22634
2005-08-03 22:33:21 +00:00
Chris Lattner
96367799a2 Rename IVUse to IVUsersOfOneStride, use a struct instead of a pair to
unify some parallel vectors and get field names more descriptive than
"first" and "second".  This isn't lisp afterall :)

llvm-svn: 22633
2005-08-03 22:21:05 +00:00
Chris Lattner
230700ef26 Fix a nasty dangling pointer issue. The ScalarEvolution pass would keep a
map from instruction* to SCEVHandles.  When we delete instructions, we have
to tell it about it.  We would run into nasty cases where new instructions
were reallocated at old instruction addresses and get the old map values.
Bad bad bad :(

llvm-svn: 22632
2005-08-03 21:36:09 +00:00
Chris Lattner
d124203207 Fix PR611, codegen'ing SREM of FP operands to fmod or fmodf instead of
the sequence used for integer ops

llvm-svn: 22629
2005-08-03 20:31:37 +00:00
Chris Lattner
eee2daf85d The correct fix for PR612, which also fixes
Transforms/LowerInvoke/2005-08-03-InvokeWithPHIUse.ll

llvm-svn: 22628
2005-08-03 18:51:44 +00:00
Chris Lattner
edac412122 When inserting code, make sure not to insert it before PHI nodes. This
fixes PR612 and Transforms/LowerInvoke/2005-08-03-InvokeWithPHI.ll

llvm-svn: 22626
2005-08-03 18:34:29 +00:00
Chris Lattner
3672ceb70b Fix Transforms/SimplifyCFG/2005-08-03-PHIFactorCrash.ll, a problem that
occurred while bugpointing another testcase

llvm-svn: 22621
2005-08-03 17:59:45 +00:00
Chris Lattner
62b0ecdfb8 add support for Graphviz when viewing CFGs
llvm-svn: 22620
2005-08-03 17:55:05 +00:00
Misha Brukman
41a00bbfa6 Fix grammar: apostrophe-s ('s) is possessive, not plural; also iff vs. if.
llvm-svn: 22619
2005-08-03 17:29:52 +00:00
Chris Lattner
adfe9f12ef minor capitalization thing, patch by Jim Laskey
llvm-svn: 22617
2005-08-03 16:52:22 +00:00
Chris Lattner
6e1d5a8b28 Finally, add the required constraint checks to fix Transforms/SimplifyCFG/2005-08-01-PHIUpdateFail.ll
the right way

llvm-svn: 22615
2005-08-03 00:59:12 +00:00
Chris Lattner
cbf4b650ba Simplify some code, add the correct pred checks
llvm-svn: 22613
2005-08-03 00:38:27 +00:00
Chris Lattner
c59014baef Refactor code out of PropagatePredecessorsForPHIs, turning it into a pure function with no side-effects
llvm-svn: 22612
2005-08-03 00:29:26 +00:00
Chris Lattner
0f4e0d19cc use splice instead of remove/insert to avoid some symtab operations
llvm-svn: 22611
2005-08-03 00:23:42 +00:00
Chris Lattner
adbd086f50 move two functions up in the file, use SafeToMergeTerminators to eliminate
some duplicated code

llvm-svn: 22610
2005-08-03 00:19:45 +00:00
Chris Lattner
b9efb90e1a Rip some code out of the main SimplifyCFG function into a subfunction and
call it from the only place it is live.  No functionality changes.

llvm-svn: 22609
2005-08-03 00:11:16 +00:00
Chris Lattner
df31d75597 Disable this patch:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20050801/027345.html

This breaks real programs and only fixes an obscure regression testcase.  A
real fix is in development.

llvm-svn: 22606
2005-08-02 23:31:38 +00:00
Chris Lattner
b5906d5783 Change a place to use an arbitrary value instead of null, when possible
llvm-svn: 22605
2005-08-02 23:29:23 +00:00
Chris Lattner
31a972dc23 one more hunk that got dropped
llvm-svn: 22596
2005-08-02 19:35:29 +00:00
Chris Lattner
75a8a64de4 This hunk accidentally got dropped. Patch by Jim Laskey
llvm-svn: 22595
2005-08-02 19:30:55 +00:00
Chris Lattner
cc8ae687e1 Update to use the new MathExtras.h support for log2 computation.
Patch contributed by Jim Laskey!

llvm-svn: 22594
2005-08-02 19:26:06 +00:00
Chris Lattner
d59fba1bce Update to use the new MathExtras.h support for log2 computation.
Patch contributed by Jim Laskey!

llvm-svn: 22592
2005-08-02 19:16:58 +00:00
Chris Lattner
3a81e0795b add a pass name to make debugging dumps nicer
llvm-svn: 22588
2005-08-02 19:07:49 +00:00
Misha Brukman
eb3ceda067 Fix grammar: it's == "it is".
llvm-svn: 22587
2005-08-02 16:04:59 +00:00
Chris Lattner
05431d5ca5 Like the comment says, do not insert cast instructions before phi nodes
llvm-svn: 22586
2005-08-02 03:31:14 +00:00
Jeff Cohen
06094b5e91 It's dangerous coding on Mondays.
llvm-svn: 22585
2005-08-02 03:26:32 +00:00
Chris Lattner
b8fd6a098e This code was very close, but not quite right. It did not take into
consideration the case where a reference in an unreachable block could
occur.  This fixes Transforms/SimplifyCFG/2005-08-01-PHIUpdateFail.ll,
something I ran into while bugpoint'ing another pass.

llvm-svn: 22584
2005-08-02 03:24:05 +00:00
Jeff Cohen
82a6b596d0 Implement SetInterruptFunction for Windows.
llvm-svn: 22582
2005-08-02 03:04:47 +00:00
Chris Lattner
4187b8bebe add a comment, make a check more lenient
llvm-svn: 22581
2005-08-02 02:52:02 +00:00
Chris Lattner
38b0e93bd4 Simplify for loop, clear a per-loop map after processing each loop
llvm-svn: 22580
2005-08-02 02:44:31 +00:00
Chris Lattner
2ba2324c96 Implement sys::SetInterruptFunction on Unix, stub it on win32 so that the
build will not fail

llvm-svn: 22578
2005-08-02 02:14:22 +00:00
Chris Lattner
954842274e Add a comment
Make LSR ignore GEP's that have loop variant base values, as we currently
cannot codegen them

llvm-svn: 22576
2005-08-02 01:32:29 +00:00
Chris Lattner
ec624cdcb5 Fix an iterator invalidation problem
llvm-svn: 22575
2005-08-02 00:41:11 +00:00
Chris Lattner
d2455468b5 200.sixtrack prints FP numbers with a very strange notation that uses D
instead of E for exponentials (e.g. 1.234D-43).  Add support for this
notation.

llvm-svn: 22574
2005-08-02 00:11:53 +00:00
Andrew Lenharth
913742c65f update function codes to reflect /su flags that have been added since this was written
llvm-svn: 22571
2005-08-01 20:06:01 +00:00
Chris Lattner
83f0262a2c Fix casts from long to sbyte on ppc
llvm-svn: 22570
2005-08-01 18:16:37 +00:00
Andrew Lenharth
0d65956c61 use llabs not abs
llvm-svn: 22569
2005-08-01 17:47:28 +00:00
Andrew Lenharth
a48ff3bd21 one cannot allocate a global, until one is done initializing the global pointers
llvm-svn: 22568
2005-08-01 17:35:40 +00:00
Chris Lattner
8e9bc37bda ConstantInt::get only works for arguments < 128.
SimplifyLibCalls probably has to be audited to make sure it does not make
this mistake elsewhere.  Also, if this code knows that the type will be
unsigned, obviously one arm of this is dead.

Reid, can you take a look into this further?

llvm-svn: 22566
2005-08-01 16:52:50 +00:00
Jeff Cohen
019104459d Keep tabs and trailing spaces out.
llvm-svn: 22565
2005-07-30 18:33:25 +00:00
Jeff Cohen
4f69b0d5cd Fix VC++ build problems.
llvm-svn: 22564
2005-07-30 18:22:27 +00:00
Chris Lattner
d742a80e9e fix float->long conversions on x86
llvm-svn: 22563
2005-07-30 01:40:57 +00:00
Chris Lattner
11be5c11d5 fix a typeo
llvm-svn: 22561
2005-07-30 00:43:00 +00:00
Nate Begeman
13bd25dc1d Ack, typo
llvm-svn: 22560
2005-07-30 00:21:31 +00:00
Chris Lattner
a681fc64d6 Change the fp to integer code to not perform 2-byte stores followed by
1 byte loads and other operations.  This is bad for store-forwarding on
common CPUs.  We now do this:

fnstcw WORD PTR [%ESP]
mov %AX, WORD PTR [%ESP]

instead of:

fnstcw WORD PTR [%ESP]
mov %AL, BYTE PTR [%ESP + 1]

llvm-svn: 22559
2005-07-30 00:17:52 +00:00
Nate Begeman
454caae5bd Commit a new LoopStrengthReduce pass that can use scalar evolutions and
target data to decide which loop induction variables to strength reduce
and how to do so.  This work is mostly by Chris Lattner, with tweaks by
me to get it working on some of MultiSource.

llvm-svn: 22558
2005-07-30 00:15:07 +00:00
Nate Begeman
0d1a7b6737 Break SCEVExpander out of IndVarSimplify into its own .h/.cpp file so that
other passes may use it.

llvm-svn: 22557
2005-07-30 00:12:19 +00:00
Chris Lattner
cf208334d9 Use a custom expander for all FP to int conversions, as the X86 only has
FP-to-int-in-memory: this exposes the load from the stored slot to the
selection dag, allowing it to be folded into other operaions.

llvm-svn: 22556
2005-07-30 00:05:54 +00:00
Chris Lattner
e0b705ba00 Allow targets to have custom expanders for FP_TO_*INT conversions where
both the src and dest values are legal

llvm-svn: 22555
2005-07-30 00:04:12 +00:00
Andrew Lenharth
d8bfbd99e9 support near allocations for the JIT
llvm-svn: 22554
2005-07-29 23:40:16 +00:00
Andrew Lenharth
3a7dc9f0bd turn off GOT on archs that didn't use it (not that it appeard to harm them much with it on)
llvm-svn: 22553
2005-07-29 23:32:02 +00:00
Chris Lattner
2aa847898d Implement a FIXME: move a bunch of cruft for handling FP_TO_*INT operations
that the X86 does not support to the legalizer.  This allows it to be better
optimized, etc, and will help with SSE support.

llvm-svn: 22551
2005-07-29 01:00:29 +00:00
Chris Lattner
9db78c43c5 Don't forget to diddle with the control word when performing an FISTP64.
llvm-svn: 22550
2005-07-29 00:54:34 +00:00
Chris Lattner
a30d4be57d Use a custom expander to compile this:
long %test4(double %X) {
        %tmp.1 = cast double %X to long         ; <long> [#uses=1]
        ret long %tmp.1
}

to this:

_test4:
        sub %ESP, 12
        fld QWORD PTR [%ESP + 16]
        fistp QWORD PTR [%ESP]
        mov %EDX, DWORD PTR [%ESP + 4]
        mov %EAX, DWORD PTR [%ESP]
        add %ESP, 12
        ret

instead of this:

_test4:
        sub %ESP, 28
        fld QWORD PTR [%ESP + 32]
        fstp QWORD PTR [%ESP]
        call ___fixdfdi
        add %ESP, 28
        ret

llvm-svn: 22549
2005-07-29 00:40:01 +00:00
Chris Lattner
8d48aef4e3 Allow targets to define custom expanders for FP_TO_*INT
llvm-svn: 22548
2005-07-29 00:33:32 +00:00
Chris Lattner
f355c0f6ea allow a target to request that unknown FP_TO_*INT conversion be promoted to
a larger integer destination.

llvm-svn: 22547
2005-07-29 00:11:56 +00:00
Chris Lattner
6b4f386826 instead of having all conversions be handled by one case value, and then have
subcases inside, break things out earlier.

llvm-svn: 22546
2005-07-28 23:31:12 +00:00
Andrew Lenharth
0fcc129f80 support bsr, and more .td simplification
llvm-svn: 22543
2005-07-28 18:14:47 +00:00
Andrew Lenharth
f623af9b64 new is not a valid default anywhere, so make this pure virtual
llvm-svn: 22542
2005-07-28 18:13:59 +00:00
Reid Spencer
69d45ce28e Fix a problem in getDirectoryContents where sub-directory names were
appended to a path string that didn't end in a slash, yielding invalid
path names.

Path contribute by Nicholas Riley.

llvm-svn: 22539
2005-07-28 16:25:57 +00:00
Andrew Lenharth
5a08d95904 get lazy JITing working. Some of shootout runs now
llvm-svn: 22538
2005-07-28 12:45:20 +00:00
Andrew Lenharth
02e0c80ecb Like constants, globals on some platforms are GOT relative. This means they have to be allocated
near the GOT, which new doesn't do.  So break out the allocate into a new function.

Also move GOT index handling into JITResolver.  This lets it update the mapping when a Lazy
function is JITed.  It doesn't managed the table, just the mapping.  Note that this is
still non-ideal, as any function that takes a function address should also take a GOT
index, but that is a lot of changes.  The relocation resolve process updates any GOT entry
it sees is out of date.

llvm-svn: 22537
2005-07-28 12:44:13 +00:00
Chris Lattner
5d02e3a15e Eliminate an extra copy from R1 that Nate noticed on function calls that
have to write arguments to the stack

llvm-svn: 22536
2005-07-28 05:23:43 +00:00
Chris Lattner
a9dac1cd7a Specify the correct number of operands
llvm-svn: 22535
2005-07-28 04:42:11 +00:00
Nate Begeman
b125feb99f Fold constant adds into loads and stores to frame indices.
For the following code:
double %ext(int %A.0__, long %A.1__) {
        %A_addr = alloca %typedef.DComplex              ; <%typedef.DComplex*> [#uses=2]
        %tmp.1 = cast %typedef.DComplex* %A_addr to int*                ; <int*> [#uses=1]
        store int %A.0__, int* %tmp.1
        %tmp.2 = getelementptr %typedef.DComplex* %A_addr, int 0, uint 1                ; <double*> [#uses=2]
        %tmp.3 = cast double* %tmp.2 to long*           ; <long*> [#uses=1]
        store long %A.1__, long* %tmp.3
        %tmp.5 = load double* %tmp.2            ; <double> [#uses=1]
        ret double %tmp.5
}

We now generate:
_ext:
.LBB_ext_0:     ;
        stw r3, -12(r1)
        stw r4, -8(r1)
        stw r5, -4(r1)
        lfd f1, -8(r1)
        blr

Instead of:
_ext:
.LBB_ext_0:     ;
        stw r3, -12(r1)
        addi r2, r1, -12
        stw r4, 4(r2)
        stw r5, 8(r2)
        lfd f1, 4(r2)
        blr

This also fires hundreds of times on MultiSource.

llvm-svn: 22533
2005-07-28 03:02:05 +00:00
Nate Begeman
d230bf7242 Fix some comments
llvm-svn: 22530
2005-07-27 23:11:27 +00:00
Chris Lattner
b0658628c1 Fix debug info to not print out recently freed memory.
llvm-svn: 22529
2005-07-27 23:11:25 +00:00
Chris Lattner
1a3a4c7791 Print symbolic register names in debug dumps
llvm-svn: 22528
2005-07-27 23:03:38 +00:00
Jeff Cohen
bd51ec7461 Eliminate all remaining tabs and trailing spaces.
llvm-svn: 22523
2005-07-27 06:12:32 +00:00
Nate Begeman
54792213a7 Implement the optimization for the Red Zone on Darwin. This removes the
unnecessary SP manipulation in leaf routines that don't need it.

llvm-svn: 22522
2005-07-27 06:06:29 +00:00
Chris Lattner
d9239c4ae0 fix some warnings when compiled with 32-bit hosts
llvm-svn: 22521
2005-07-27 05:58:01 +00:00