1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-28 14:32:51 +01:00
Commit Graph

2351 Commits

Author SHA1 Message Date
Chris Lattner
a3f2d4c4f2 Fix Transforms/InstCombine/2006-02-07-SextZextCrash.ll
llvm-svn: 26040
2006-02-07 19:07:40 +00:00
Chris Lattner
9a35257363 Generalize MaskedValueIsZero into a ComputeMaskedNonZeroBits function, which
is just as efficient as MVIZ and is also more general.

Fix a few minor bugs introduced in recent patches

llvm-svn: 26036
2006-02-07 08:05:22 +00:00
Chris Lattner
c1a185741c Make MaskedValueIsZero take a uint64_t instead of a ConstantIntegral as a
mask.  This allows the code to be simpler and more efficient.

Also, generalize some of the cases in MVIZ a bit, making it slightly more aggressive.

llvm-svn: 26035
2006-02-07 07:27:52 +00:00
Chris Lattner
6318e7b7b7 Use Type::getIntegralTypeMask() to simplify some code
llvm-svn: 26034
2006-02-07 07:00:41 +00:00
Chris Lattner
a0d7868722 Implement the beginnings of a facility for simplifying expressions based on
'demanded bits', inspired by Nate's work in the dag combiner.  This isn't
complete, but needs to unrelated instcombiner changes to continue.

llvm-svn: 26033
2006-02-07 06:56:34 +00:00
Chris Lattner
807c68d572 Turn A % (C << N), where C is 2^k, into A & ((C << N)-1) [urem only].
Turn A / (C1 << N), where C1 is "1<<C2" into A >> (N+C2) [udiv only].

Tested with: rem.ll:test5, div.ll:test10

llvm-svn: 26003
2006-02-05 07:54:04 +00:00
Chris Lattner
e4dc660cbb Use SCEVExpander::InsertCastOfTo instead of our own code. This reduces
#LLVM LOC, and auto-cse's cast instructions.

llvm-svn: 25974
2006-02-04 09:52:43 +00:00
Chris Lattner
8741126ea7 Fix two significant bugs in LSR:
1. When rewriting code in outer loops, sometimes we would insert code into
   inner loops that is invariant in that loop.
2. Notice that 4*(2+x) is 8+4*x and use that to simplify expressions.

This is a performance neutral change.

llvm-svn: 25964
2006-02-04 07:36:50 +00:00
Jeff Cohen
f329a41a66 Improve compatibility with VC2005, patch by Morten Ofstad!
llvm-svn: 25661
2006-01-26 20:41:32 +00:00
Chris Lattner
66f10c8957 teach the cloner to handle inline asms
llvm-svn: 25633
2006-01-26 01:55:22 +00:00
Chris Lattner
84f2acfaa0 Fix Regression/Transforms/ScalarRepl/2006-01-24-IllegalUnionPromoteCrash.ll
llvm-svn: 25587
2006-01-24 19:36:27 +00:00
Chris Lattner
d09c95f83c rename method
llvm-svn: 25572
2006-01-24 04:16:34 +00:00
Chris Lattner
c06c8d8c06 When cloning a module, clone the inline asm.
llvm-svn: 25559
2006-01-23 23:06:28 +00:00
Chris Lattner
6e1f262158 add a bunch more optimizations for unary double math functions
llvm-svn: 25530
2006-01-23 06:24:46 +00:00
Chris Lattner
22fa9eeac1 Refactor/genericize this, no functionality change
llvm-svn: 25525
2006-01-23 05:57:36 +00:00
Chris Lattner
2588f0eb8f Make iostream #inclusion explicit
llvm-svn: 25514
2006-01-22 23:32:06 +00:00
Chris Lattner
0dbef6ec71 Make this more efficient in the following ways:
1. Do not statically construct a map when the program starts up, this
   is expensive and cannot be optimized.  Instead, create a list.
2. Do not insert entries for all function in the module into a hashmap
   that lives the full life of the compiler.

llvm-svn: 25512
2006-01-22 23:10:26 +00:00
Chris Lattner
adff158fbd Add explicit #includes of <iostream>
llvm-svn: 25509
2006-01-22 22:53:01 +00:00
Chris Lattner
305fee5118 Several non-functionality changing changes:
1. Use the varargs version of getOrInsertFunction to simplify code.
2. remove #include
3. Reduce the number of #ifdef's.
4. remove extraneous vertical whitespace.

llvm-svn: 25508
2006-01-22 22:35:08 +00:00
Robert Bocchino
40c6c91f56 ConstantFoldLoadThroughGEPConstantExpr wasn't handling pointers to
packed types correctly.

llvm-svn: 25470
2006-01-19 23:53:23 +00:00
Reid Spencer
585019f629 For PR696:
Don't do floor->floorf conversion if floorf is not available. This checks
the compiler's host, not its target, which is incorrect for cross-compilers
Not sure that's important as we don't build many cross-compilers.

llvm-svn: 25456
2006-01-19 08:36:56 +00:00
Chris Lattner
6e4d8741d5 Implement casts.ll:test26: a cast from float -> double -> integer, doesn't
need the float->double part.

llvm-svn: 25452
2006-01-19 07:40:22 +00:00
Chris Lattner
d5a7ceda96 If not internalizing, don't mark llvm.global[cd]tors const, as a fix for a
hypothetical future boog.

llvm-svn: 25430
2006-01-19 00:46:54 +00:00
Chris Lattner
197d33ce21 Don't internalize llvm.global[cd]tor unless there are uses of it. This
unbreaks front-ends that don't use __main (like the new CFE).

llvm-svn: 25429
2006-01-19 00:40:39 +00:00
Chris Lattner
f7623e2065 Make sure that cloning a module clones its target triple and dependent
library list as well.  This should help bugpoint.

llvm-svn: 25424
2006-01-18 21:32:45 +00:00
Robert Bocchino
443661ec6b Constant folding support for the insertelement operation.
llvm-svn: 25407
2006-01-17 20:07:07 +00:00
Robert Bocchino
d9fa267a49 Lowerpacked and SCCP support for the insertelement operation.
llvm-svn: 25406
2006-01-17 20:06:55 +00:00
Chris Lattner
ddbf4fba37 Clean up the FFS optimization code, and make it correctly create the appropriate
unsigned llvm.cttz.* intrinsic, fixing the 2005-05-11-Popcount-ffs-fls regression
last night.

llvm-svn: 25398
2006-01-17 18:27:17 +00:00
Reid Spencer
3cecd3c4cf For PR411:
This patch is an incremental step towards supporting a flat symbol table.
It de-overloads the intrinsic functions by providing type-specific intrinsics
and arranging for automatically upgrading from the old overloaded name to
the new non-overloaded name. Specifically:
  llvm.isunordered -> llvm.isunordered.f32, llvm.isunordered.f64
  llvm.sqrt -> llvm.sqrt.f32, llvm.sqrt.f64
  llvm.ctpop -> llvm.ctpop.i8, llvm.ctpop.i16, llvm.ctpop.i32, llvm.ctpop.i64
  llvm.ctlz -> llvm.ctlz.i8, llvm.ctlz.i16, llvm.ctlz.i32, llvm.ctlz.i64
  llvm.cttz -> llvm.cttz.i8, llvm.cttz.i16, llvm.cttz.i32, llvm.cttz.i64
New code should not use the overloaded intrinsic names. Warnings will be
emitted if they are used.

llvm-svn: 25366
2006-01-16 21:12:35 +00:00
Chris Lattner
12d9016774 fix a crash due to missing parens
llvm-svn: 25363
2006-01-16 19:47:21 +00:00
Chris Lattner
00c02ed5d6 This pass has never worked correctly. Remove.
llvm-svn: 25349
2006-01-16 01:06:00 +00:00
Chris Lattner
f718ef50ad Let the inliner update the callgraph to reflect the changes it makes, instead
of doing it ourselves.  This fixes Transforms/Inline/2006-01-14-CallGraphUpdate.ll

llvm-svn: 25321
2006-01-14 20:09:18 +00:00
Chris Lattner
fea434f37e Teach the inliner to update the CallGraph itself, and have it add edges to
llvm.stacksave/restore when it inserts calls to them.

llvm-svn: 25320
2006-01-14 20:07:50 +00:00
Chris Lattner
7cbd5dc1f0 FunctionPass's cannot do IPO things.
llvm-svn: 25315
2006-01-14 19:30:35 +00:00
Nate Begeman
4750001146 Add bswap intrinsics as documented in the Language Reference
llvm-svn: 25309
2006-01-14 01:25:24 +00:00
Robert Bocchino
4617a805da Added instcombine support for extractelement.
llvm-svn: 25299
2006-01-13 22:48:06 +00:00
Chris Lattner
818f217fad it is ok to dce stacksave.
llvm-svn: 25295
2006-01-13 21:31:54 +00:00
Chris Lattner
61a2fca725 Do a simple instcombine xforms to delete llvm.stackrestore cases.
llvm-svn: 25294
2006-01-13 21:28:09 +00:00
Chris Lattner
e79c8847f0 Simplify this a tiny bit by using the new IntrinsicInst functionality.
llvm-svn: 25292
2006-01-13 20:11:04 +00:00
Chris Lattner
67a0c03bb4 Permit inlining functions that contain dynamic allocations now that
InlineFunction handles this case safely.  This implements
Transforms/Inline/dynamic_alloca_test.ll.

llvm-svn: 25288
2006-01-13 19:35:43 +00:00
Chris Lattner
423aeb28d5 If inlining a call to a function that contains dynamic allocas, wrap the
resultant code with llvm.stacksave/llvm.stackrestore intrinsics.

llvm-svn: 25286
2006-01-13 19:34:14 +00:00
Chris Lattner
672f2df6d0 Use ClonedCodeInfo to avoid another walk over the inlined code, this this
time in common C cases.

llvm-svn: 25285
2006-01-13 19:18:11 +00:00
Chris Lattner
6f96396bae Use the ClonedCodeInfo object to avoid scans of the inlined code when
it doesn't contain any calls.  This is a fairly common case for C++ code,
so it will probably speed up the inliner marginally in these cases.

llvm-svn: 25284
2006-01-13 19:15:15 +00:00
Chris Lattner
ec00fcaba6 Refactor a bunch of invoke handling stuff out into a new function
"HandleInlinedInvoke".  No functionality change.

llvm-svn: 25283
2006-01-13 19:05:59 +00:00
Chris Lattner
67fb415248 Allow the code cloning interfaces to capture some important info about the
code being cloned if the client wants.

llvm-svn: 25281
2006-01-13 18:39:17 +00:00
Chris Lattner
32ff638ae5 Fix a bug I noticed by inspection: if the first instruction in the inlined
function was not an alloca, we wouldn't check the entry block for any allocas,
leading to increased stack space in some cases.  In practice, allocas are almost
always at the top of the block, so this was never noticed.

llvm-svn: 25280
2006-01-13 18:16:48 +00:00
Chris Lattner
32f54b291d Fix 80 column violations
llvm-svn: 25279
2006-01-13 18:06:56 +00:00
Chris Lattner
d726328ee9 Preserve and update ETForest. Patch by Daniel Berlin
llvm-svn: 25203
2006-01-11 05:11:13 +00:00
Chris Lattner
7b7fee2d92 Switch these to using ETForest instead of DominatorSet to compute itself.
Patch written by Daniel Berlin!

llvm-svn: 25202
2006-01-11 05:10:20 +00:00
Chris Lattner
229a42a573 Switch this to using ETForest instead of DominatorSet to compute itself.
Patch written by Daniel Berlin!

llvm-svn: 25201
2006-01-11 05:09:40 +00:00
Robert Bocchino
9a57550e4e Added support for the extractelement operation.
llvm-svn: 25181
2006-01-10 19:05:34 +00:00
Robert Bocchino
e00d93bc83 Added lower packed support for the extractelement operation.
llvm-svn: 25180
2006-01-10 19:05:05 +00:00
Chris Lattner
05816ecb7f Teach loopsimplify to update et-forest. Patch contributed by Daniel Berlin!
llvm-svn: 25153
2006-01-09 08:03:08 +00:00
Chris Lattner
9453593f40 fix some 176.gcc miscompilation from my previous patch.
llvm-svn: 25137
2006-01-07 01:32:28 +00:00
Chris Lattner
6c99d09404 silence some bogus gcc warnings on fenris
llvm-svn: 25130
2006-01-06 17:59:59 +00:00
Chris Lattner
6c01df15ac Enhance the shift-shift folding code to allow a no-op cast to occur in between
the shifts.

This allows us to fold this (which is the 'integer add a constant' sequence
from cozmic's scheme compmiler):

int %x(uint %anf-temporary776) {
        %anf-temporary777 = shr uint %anf-temporary776, ubyte 1
        %anf-temporary800 = cast uint %anf-temporary777 to int
        %anf-temporary804 = shl int %anf-temporary800, ubyte 1
        %anf-temporary805 = add int %anf-temporary804, -2
        %anf-temporary806 = or int %anf-temporary805, 1
        ret int %anf-temporary806
}

into this:

int %x(uint %anf-temporary776) {
        %anf-temporary776 = cast uint %anf-temporary776 to int
        %anf-temporary776.mask1 = add int %anf-temporary776, -2
        %anf-temporary805 = or int %anf-temporary776.mask1, 1
        ret int %anf-temporary805
}

note that instcombine already knew how to eliminate the AND that the two
shifts fold into.  This is tested by InstCombine/shift.ll:test26

-Chris

llvm-svn: 25128
2006-01-06 07:52:12 +00:00
Chris Lattner
410db54bbf Simplify the code a bit more
llvm-svn: 25126
2006-01-06 07:22:22 +00:00
Chris Lattner
83fc19a4a9 Extract a bunch of code out of visitShiftInst into FoldShiftByConstant. No
functionality changes.

llvm-svn: 25125
2006-01-06 07:12:35 +00:00
Chris Lattner
6a56973377 Pull inline methods out of the pass class definition to make it easier to
read the code.

Do not internalize debugger anchors.

llvm-svn: 25067
2006-01-03 19:13:17 +00:00
Duraid Madina
7cb522e3e8 getting there...
llvm-svn: 25021
2005-12-26 13:48:44 +00:00
Chris Lattner
76b2303521 Fix Transforms/ScalarRepl/2005-12-14-UnionPromoteCrash.ll, a crash on undefined
behavior in 126.gcc on big-endian systems.

llvm-svn: 24708
2005-12-14 17:23:59 +00:00
Reid Spencer
519dc24073 Improve ResolveFunctions to:
a) use better local variable names (OldMT -> OldFT) where "M" is used to
   mean "Function" (perhaps it was previously "Method"?)
b) print out the module identifier in a warning message so that it is
   possible to track down in which module the error occurred.

llvm-svn: 24698
2005-12-13 19:56:51 +00:00
Chris Lattner
d61c654e67 Implement a little hack for parity with GCC on crafty. This speeds up
186.crafty by about 16% (from 15.109s to 13.045s) on my system.

This turns allocas with unions/casts into scalars.  For example crafty has
something like this:

    union doub {
      unsigned short i[4];
      long long d;
    };
int f(long long a) {
  return ((union doub){.d=a}).i[1];
}

Instead of generating loads and stores to an alloca, we now promote the
whole thing to a scalar long value.

This implements: Transforms/ScalarRepl/AggregatePromote.ll

llvm-svn: 24667
2005-12-12 07:19:13 +00:00
Chris Lattner
3d993f7e4d getRawValue zero extens for unsigned values, use getsextvalue so that we
know that small negative values fit into the immediate field of addressing
modes.

llvm-svn: 24608
2005-12-05 18:23:57 +00:00
Chris Lattner
a4fe0bd75f Wrap a long line, never internalize llvm.used.
llvm-svn: 24602
2005-12-05 05:07:38 +00:00
Chris Lattner
f9a1c37c84 Fix SimplifyCFG/2005-12-03-IncorrectPHIFold.ll
llvm-svn: 24581
2005-12-03 18:25:58 +00:00
Chris Lattner
99e894a36f Fix a bug where we didn't realize that vaarg reads memory. This fixes
Transforms/DeadStoreElimination/2005-11-30-vaarg.ll

llvm-svn: 24545
2005-11-30 19:38:22 +00:00
Andrew Lenharth
1ffbe58972 a few more comments on the interfaces and functions
llvm-svn: 24500
2005-11-28 18:10:59 +00:00
Andrew Lenharth
cad1d52b64 Added documented rsprofiler interface. Also remove new profiler passes, the
old ones have been updated to implement the interface.

llvm-svn: 24499
2005-11-28 18:00:38 +00:00
Jeff Cohen
b171dee053 Fix VC++ warning.
llvm-svn: 24496
2005-11-28 06:45:57 +00:00
Andrew Lenharth
311ec68cf4 Random sampling (aka Arnold and Ryder) profiling. This is still preliminary, but it works on spec on x86 and alpha. The idea is to allow profiling passes to remember what profiling they inserted, then a random sampling framework is inserted which consists of duplicated basic blocks (without profiling), such that at each backedge in the program and entry into every function, the framework chooses whether to use the instrumented code or the instrumentation free code. The goal of such a framework is to make it reasonably cheap to do random sampling of very expensive profiling products (such as load-value profiling).
The code is organized into 3 parts (2 passes)
1) a linked set of profiling passes, which implement an analysis group (linked, like alias analysis are).  These insert profiling into the program, and remember what they inserted, so that at a later time they can be queried about any instruction.

2) a pass that handles inserting the random sampling framework.  This also has options to control how random samples are choosen.  Currently implemented are Global counters, register allocated global counters, and read cycle counter (see? there was a reason for it).

The profiling passes are almost identical to the existing ones (block, function, and null profiling is supported right now), and they are valid passes without the sampling framework (hence the existing passes can be unified with the new ones, not done yet).

Some things are a bit ugly still, but that should be fixed up soon enough.

Other todo? making the counter values not "magic 2^16 -1" values, but dynamically choosable.

llvm-svn: 24493
2005-11-28 00:58:09 +00:00
Andrew Lenharth
2700c92469 since reg2mem requires it, might as well mention that it preserves it
llvm-svn: 24491
2005-11-25 16:04:54 +00:00
Andrew Lenharth
79ee761b69 Reg2Mem is something a pass may depend on, so allow that
llvm-svn: 24488
2005-11-22 22:14:23 +00:00
Andrew Lenharth
939cd99914 turns out, demotion and invokes and critical edges don't mix
llvm-svn: 24487
2005-11-22 21:45:19 +00:00
Chris Lattner
e67f211b68 Fix a crash building 176.gcc due to my recent patch, which only fixed
half the problem.

llvm-svn: 24414
2005-11-18 18:30:47 +00:00
Chris Lattner
fc4928a31a Implement a refinement to the mem2reg algorithm for cases where an alloca
has a single def.  In this case, look for uses that are dominated by the def
and attempt to rewrite them to directly use the stored value.

This speeds up mem2reg on these values and reduces the number of phi nodes
inserted.  This should address PR665.

llvm-svn: 24411
2005-11-18 07:31:42 +00:00
Chris Lattner
86e6fa1ee7 This needs proper dominance
llvm-svn: 24410
2005-11-18 07:29:44 +00:00
Chris Lattner
bf3324e75d This was checking the wrong GEP expression. Fixing this fixes a gccas crash
compiling mysql reported by Ted Kremenek.

llvm-svn: 24402
2005-11-17 19:35:42 +00:00
Andrew Lenharth
0b424575e0 the pain isn't gone unless the phinodes are spilled too
llvm-svn: 24288
2005-11-10 19:39:09 +00:00
Andrew Lenharth
b4169fe539 this works with backedges to the existing entry block alot better
llvm-svn: 24270
2005-11-10 17:35:34 +00:00
Andrew Lenharth
03d60c3d09 The pass everyone has been waiting for!
Reg2Mem

for fun you can opt -reg2mem -mem2reg

llvm-svn: 24267
2005-11-10 01:58:38 +00:00
Nate Begeman
f299b9fb03 Add support alignment of allocation instructions.
Add support for specifying alignment and size of setjmp jmpbufs.

No targets currently do anything with this information, nor is it presrved
in the bytecode representation.  That's coming up next.

llvm-svn: 24196
2005-11-05 09:21:28 +00:00
Chris Lattner
0bd3e3230c Implement Transforms/TailCallElim/return-undef.ll, a trivial case
that has been sitting in my inbox since May 18. :)

llvm-svn: 24194
2005-11-05 08:21:11 +00:00
Chris Lattner
4352c050d1 Turn sdiv into udiv if both operands have a clear sign bit. This occurs
a few times in crafty:

OLD:    %tmp.36 = div int %tmp.35, 8            ; <int> [#uses=1]
NEW:    %tmp.36 = div uint %tmp.35, 8           ; <uint> [#uses=0]
OLD:    %tmp.19 = div int %tmp.18, 8            ; <int> [#uses=1]
NEW:    %tmp.19 = div uint %tmp.18, 8           ; <uint> [#uses=0]
OLD:    %tmp.117 = div int %tmp.116, 8          ; <int> [#uses=1]
NEW:    %tmp.117 = div uint %tmp.116, 8         ; <uint> [#uses=0]
OLD:    %tmp.92 = div int %tmp.91, 8            ; <int> [#uses=1]
NEW:    %tmp.92 = div uint %tmp.91, 8           ; <uint> [#uses=0]

Which all turn into shrs.

llvm-svn: 24190
2005-11-05 07:40:31 +00:00
Chris Lattner
297e545d4b Turn srem -> urem when neither input has their sign bit set. This triggers
8 times in vortex, allowing the srems to be turned into shrs:

OLD:    %tmp.104 = rem int %tmp.5.i37, 16               ; <int> [#uses=1]
NEW:    %tmp.104 = rem uint %tmp.5.i37, 16              ; <uint> [#uses=0]
OLD:    %tmp.98 = rem int %tmp.5.i24, 16                ; <int> [#uses=1]
NEW:    %tmp.98 = rem uint %tmp.5.i24, 16               ; <uint> [#uses=0]
OLD:    %tmp.91 = rem int %tmp.5.i19, 8         ; <int> [#uses=1]
NEW:    %tmp.91 = rem uint %tmp.5.i19, 8                ; <uint> [#uses=0]
OLD:    %tmp.88 = rem int %tmp.5.i14, 8         ; <int> [#uses=1]
NEW:    %tmp.88 = rem uint %tmp.5.i14, 8                ; <uint> [#uses=0]
OLD:    %tmp.85 = rem int %tmp.5.i9, 1024               ; <int> [#uses=2]
NEW:    %tmp.85 = rem uint %tmp.5.i9, 1024              ; <uint> [#uses=0]
OLD:    %tmp.82 = rem int %tmp.5.i, 512         ; <int> [#uses=2]
NEW:    %tmp.82 = rem uint %tmp.5.i1, 512               ; <uint> [#uses=0]
OLD:    %tmp.48.i = rem int %tmp.5.i.i161, 4            ; <int> [#uses=1]
NEW:    %tmp.48.i = rem uint %tmp.5.i.i161, 4           ; <uint> [#uses=0]
OLD:    %tmp.20.i2 = rem int %tmp.5.i.i, 4              ; <int> [#uses=1]
NEW:    %tmp.20.i2 = rem uint %tmp.5.i.i, 4             ; <uint> [#uses=0]

it also occurs 9 times in gcc, but with odd constant divisors (1009 and 61)
so the payoff isn't as great.

llvm-svn: 24189
2005-11-05 07:28:37 +00:00
Andrew Lenharth
9a32a77e33 make this 64 bit clean, fixed test30 of /Regression/Transforms/InstCombine/add.ll
llvm-svn: 24158
2005-11-02 18:35:40 +00:00
Chris Lattner
f8e14244c3 Limit the search depth of MaskedValueIsZero to 6 instructions, to avoid
bad cases.  This fixes Markus's second testcase in PR639, and should
seal it for good.

llvm-svn: 24123
2005-10-31 18:35:52 +00:00
Chris Lattner
bafe11a821 This pass is now obsolete since all targets have moved to the SelectionDAG
infrastructure and the simple isels have been removed.

llvm-svn: 24090
2005-10-29 05:33:46 +00:00
Chris Lattner
4772d742a0 Remove dead #include
llvm-svn: 24083
2005-10-29 04:41:30 +00:00
Chris Lattner
20e1564bf0 Now that instcombine does this xform, remove it from the -raise pass
llvm-svn: 24082
2005-10-29 04:40:23 +00:00
Chris Lattner
2c6430d1a7 Pull some code out into a function, give it the ability to see through +.
This allows us to turn code like malloc(4*x+4) -> malloc int, (x+1)

llvm-svn: 24081
2005-10-29 04:36:15 +00:00
Chris Lattner
b313b6d67f Remove a special case, allowing the general case to handle it. No functionality
change.

llvm-svn: 24076
2005-10-29 03:19:53 +00:00
Chris Lattner
d0b0dd1d62 Fix a bit of backwards logic that broke exptree and smg2000
llvm-svn: 24056
2005-10-28 16:27:35 +00:00
Chris Lattner
e545fcc680 Do not sink any instruction with side effects, including vaarg. This fixes
PR640

llvm-svn: 24046
2005-10-27 17:13:11 +00:00
Chris Lattner
64f57e9d14 Fix #include order
llvm-svn: 24044
2005-10-27 16:34:00 +00:00
John Criswell
d6538108e8 Move some constant folding code shared by Analysis and Transform passes
into the LLVMAnalysis library.
This allows LLVMTranform and LLVMTransformUtils to be archives and linked
with LLVMAnalysis.a, which provides any missing definitions.

llvm-svn: 24036
2005-10-27 15:54:34 +00:00
Chris Lattner
74ac5d0cc4 Fix typo
llvm-svn: 24033
2005-10-27 06:26:26 +00:00
Chris Lattner
adbc250213 Teach instcombine to promote stuff like (cast (malloc sbyte, 8*X) to int*)
into: malloc int, (2*X)

llvm-svn: 24032
2005-10-27 06:24:46 +00:00
Chris Lattner
4c9dae5fdb Promote cases like cast (malloc sbyte, 100) to int* into
(malloc [25 x int]) directly without having to convert to
(malloc [100 x sbyte]) first.

llvm-svn: 24031
2005-10-27 06:12:00 +00:00
Chris Lattner
2b0006cd60 Minor change to this file to support obscure cases with constant array amounts
llvm-svn: 24030
2005-10-27 05:53:56 +00:00
John Criswell
b0f5adf975 1. Remove libraries no longer created from the list of libraries linked into the
SparcV9 JIT.
2. Make LLVMTransformUtils a relinked object file and always link it before
   LLVMAnalysis.a.  These two libraries have circular dependencies on each
   other which creates problem when building the SparcV9 JIT.  This change
   fixes the dependency on all platforms problems with a minimum of fuss.

llvm-svn: 24023
2005-10-26 20:35:13 +00:00
Chris Lattner
e1fda00ea5 fold nested and's early to avoid inefficiencies in MaskedValueIsZero. This
fixes a very slow compile in PR639.

llvm-svn: 24011
2005-10-26 17:18:16 +00:00
Jeff Cohen
e561bf727e Update Visual Studio projects to reflect moved file.
llvm-svn: 23998
2005-10-26 05:36:51 +00:00
Alkis Evlogimenos
7fe091048c Stop using deprecated types
llvm-svn: 23973
2005-10-25 11:18:06 +00:00
Chris Lattner
77f228f586 Handle allocations that, even after removing dead uses, still have more than
one use (but one is a cast).  This handles the very common case of:

 X = alloc [n x byte]
 Y = cast X to somethingbetter
 seteq X, null

In order to avoid infinite looping when there are multiple casts, we only
allow this if the xform is strictly increasing the alignment of the
allocation.

llvm-svn: 23961
2005-10-24 06:35:18 +00:00
Chris Lattner
a295128ce5 Fix a bug where we would 'promote' an allocation from one type to another
where the second has less alignment required.  If we had explicit alignment
support in the IR, we could handle this case, but we can't until we do.

llvm-svn: 23960
2005-10-24 06:26:18 +00:00
Chris Lattner
3806b84861 Before promoting a malloc type, remove dead uses. This makes instcombine
more effective at promoting these allocations, catching them earlier in the
compile process.

llvm-svn: 23959
2005-10-24 06:22:12 +00:00
Chris Lattner
5ad6f085ab Pull some code out into a function, no functionality change
llvm-svn: 23958
2005-10-24 06:03:58 +00:00
Chris Lattner
949f205c4c Remove some beta code that no longer has an owner.
llvm-svn: 23944
2005-10-24 02:32:41 +00:00
Chris Lattner
79a1a2af13 Do not build the ProfilePaths directory anymore
llvm-svn: 23943
2005-10-24 02:31:49 +00:00
Chris Lattner
e6f7a38925 DONT_BUILD_RELINKED is gone and implied by BUILD_ARCHIVE now
llvm-svn: 23940
2005-10-24 02:26:13 +00:00
Chris Lattner
a4b13acd52 Only build .a file versions of these libraries, instead of .a and .o versions.
This should speed up build times.

llvm-svn: 23933
2005-10-24 01:59:48 +00:00
Chris Lattner
46cc930a67 Make sure that anything using the ADCE pass pulls in the UnifyFunctionExitNodes
code

llvm-svn: 23931
2005-10-24 01:40:23 +00:00
Jeff Cohen
a38c737e85 When a function takes a variable number of pointer arguments, with a zero
pointer marking the end of the list, the zero *must* be cast to the pointer
type.  An un-cast zero is a 32-bit int, and at least on x86_64, gcc will
not extend the zero to 64 bits, thus allowing the upper 32 bits to be
random junk.

The new END_WITH_NULL macro may be used to annotate a such a function
so that GCC (version 4 or newer) will detect the use of un-casted zero
at compile time.

llvm-svn: 23888
2005-10-23 04:37:20 +00:00
Chris Lattner
63aa74e804 My previous patch was too conservative. Reject FP and void types, but do
allow pointer types.

llvm-svn: 23859
2005-10-21 05:45:41 +00:00
Chris Lattner
3451ac7691 Do NOT touch FP ops with LSR. This fixes a testcase Nate sent me from an
inner loop like this:

LBB_RateConvertMono8AltiVec_2:  ; no_exit
        lis r2, ha16(.CPI_RateConvertMono8AltiVec_0)
        lfs f3, lo16(.CPI_RateConvertMono8AltiVec_0)(r2)
        fmr f3, f3
        fadd f0, f2, f0
        fadd f3, f0, f3
        fcmpu cr0, f3, f1
        bge cr0, LBB_RateConvertMono8AltiVec_2  ; no_exit

to an inner loop like this:

LBB_RateConvertMono8AltiVec_1:  ; no_exit
        fsub f2, f2, f1
        fcmpu cr0, f2, f1
        fmr f0, f2
        bge cr0, LBB_RateConvertMono8AltiVec_1  ; no_exit

Doh! good catch!

llvm-svn: 23838
2005-10-20 04:47:10 +00:00
Chris Lattner
d21885e6c9 Add an option to this pass. If it is set, we are allowed to internalize
all but main.  If it's not set, we can still internalize, but only if an
explicit symbol list is provided.

llvm-svn: 23783
2005-10-18 06:29:22 +00:00
Chris Lattner
515ce6687d Make this work for FP constantexprs
llvm-svn: 23773
2005-10-17 20:18:38 +00:00
Chris Lattner
09b3b3d658 Oops, X+0.0 isn't foldable, but X+-0.0 is.
llvm-svn: 23772
2005-10-17 17:56:38 +00:00
Chris Lattner
00f8306943 relax this a bit, as we only support the default rounding mode
llvm-svn: 23771
2005-10-17 17:49:32 +00:00
Chris Lattner
23ecce5c82 Fix (hopefully the last) issue where LSR is nondeterminstic. When pulling
out CSE's of base expressions it could build a result whose order was
nondet.

llvm-svn: 23698
2005-10-11 18:41:04 +00:00
Chris Lattner
7effc54496 Fix another problem where LSR was being nondeterminstic. Also remove elements
from the end of a vector instead of the beginning

llvm-svn: 23697
2005-10-11 18:30:57 +00:00
Chris Lattner
1ec477da8f Fix another lsr-is-nondeterministic case
llvm-svn: 23695
2005-10-11 18:17:57 +00:00
Chris Lattner
1809933a31 Make MaskedValueIsZero a bit more aggressive
llvm-svn: 23677
2005-10-09 22:08:50 +00:00
Chris Lattner
d4ff47daf6 Fix funky xcode indentation
llvm-svn: 23674
2005-10-09 06:36:35 +00:00
Chris Lattner
f6c2411c40 Hrm, you didn't see this.
llvm-svn: 23673
2005-10-09 06:24:02 +00:00
Chris Lattner
6fa98d5979 Fix a source of non-determinism in the backend: the order of processing
IV strides dependend on the pointer order of the strides in memory.
Non-determinism is bad.

llvm-svn: 23672
2005-10-09 06:20:55 +00:00
Jeff Cohen
3c6db93125 Remove useless variable.
llvm-svn: 23656
2005-10-07 05:28:29 +00:00
Chris Lattner
27464c8061 Fix DemoteRegToStack on an invoke. This fixes PR634.
llvm-svn: 23618
2005-10-04 00:44:01 +00:00
Chris Lattner
cd43592209 Clean up the code a bit. Use isInstructionTriviallyDead to be more aggressive
and more correct than use_empty().  This fixes PR635 and
SimplifyCFG/2005-10-02-InvokeSimplify.ll

llvm-svn: 23616
2005-10-03 23:43:43 +00:00
Chris Lattner
be6f88a2e8 Make IVUseShouldUsePostIncValue more aggressive when the use is a PHI. In
particular, it should realize that phi's use their values in the pred block
not the phi block itself.  This change turns our em3d loop from this:

_test:
        cmpwi cr0, r4, 0
        bgt cr0, LBB_test_2     ; entry.no_exit_crit_edge
LBB_test_1:     ; entry.loopexit_crit_edge
        li r2, 0
        b LBB_test_6    ; loopexit
LBB_test_2:     ; entry.no_exit_crit_edge
        li r6, 0
LBB_test_3:     ; no_exit
        or r2, r6, r6
        lwz r6, 0(r3)
        cmpw cr0, r6, r5
        beq cr0, LBB_test_6     ; loopexit
LBB_test_4:     ; endif
        addi r3, r3, 4
        addi r6, r2, 1
        cmpw cr0, r6, r4
        blt cr0, LBB_test_3     ; no_exit
LBB_test_5:     ; endif.loopexit.loopexit_crit_edge
        addi r3, r2, 1
        blr
LBB_test_6:     ; loopexit
        or r3, r2, r2
        blr

into:

_test:
        cmpwi cr0, r4, 0
        bgt cr0, LBB_test_2     ; entry.no_exit_crit_edge
LBB_test_1:     ; entry.loopexit_crit_edge
        li r2, 0
        b LBB_test_5    ; loopexit
LBB_test_2:     ; entry.no_exit_crit_edge
        li r6, 0
LBB_test_3:     ; no_exit
        lwz r2, 0(r3)
        cmpw cr0, r2, r5
        or r2, r6, r6
        beq cr0, LBB_test_5     ; loopexit
LBB_test_4:     ; endif
        addi r3, r3, 4
        addi r6, r6, 1
        cmpw cr0, r6, r4
        or r2, r6, r6
        blt cr0, LBB_test_3     ; no_exit
LBB_test_5:     ; loopexit
        or r3, r2, r2
        blr


Unfortunately, this is actually worse code, because the register coallescer
is getting confused somehow.  If it were doing its job right, it could turn the
code into this:

_test:
        cmpwi cr0, r4, 0
        bgt cr0, LBB_test_2     ; entry.no_exit_crit_edge
LBB_test_1:     ; entry.loopexit_crit_edge
        li r6, 0
        b LBB_test_5    ; loopexit
LBB_test_2:     ; entry.no_exit_crit_edge
        li r6, 0
LBB_test_3:     ; no_exit
        lwz r2, 0(r3)
        cmpw cr0, r2, r5
        beq cr0, LBB_test_5     ; loopexit
LBB_test_4:     ; endif
        addi r3, r3, 4
        addi r6, r6, 1
        cmpw cr0, r6, r4
        blt cr0, LBB_test_3     ; no_exit
LBB_test_5:     ; loopexit
        or r3, r6, r6
        blr

... which I'll work on next. :)

llvm-svn: 23604
2005-10-03 02:50:05 +00:00
Chris Lattner
6a5ace34da Refactor some code into a function
llvm-svn: 23603
2005-10-03 01:04:44 +00:00
Chris Lattner
5a865c8598 This break is bogus and I have no idea why it was there. Basically it prevents
memoizing code when IV's are used by phinodes outside of loops.  In a simple
example, we were getting this code before (note that r6 and r7 are isomorphic
IV's):

        li r6, 0
        or r7, r6, r6
LBB_test_3:     ; no_exit
        lwz r2, 0(r3)
        cmpw cr0, r2, r5
        or r2, r7, r7
        beq cr0, LBB_test_5     ; loopexit
LBB_test_4:     ; endif
        addi r2, r7, 1
        addi r7, r7, 1
        addi r3, r3, 4
        addi r6, r6, 1
        cmpw cr0, r6, r4
        blt cr0, LBB_test_3     ; no_exit

Now we get:

        li r6, 0
LBB_test_3:     ; no_exit
        or r2, r6, r6
        lwz r6, 0(r3)
        cmpw cr0, r6, r5
        beq cr0, LBB_test_6     ; loopexit
LBB_test_4:     ; endif
        addi r3, r3, 4
        addi r6, r2, 1
        cmpw cr0, r6, r4
        blt cr0, LBB_test_3     ; no_exit

this was noticed in em3d.

llvm-svn: 23602
2005-10-03 00:37:33 +00:00
Chris Lattner
af938ddbe5 when checking if we should move a split edge block outside of a loop,
check the presplit pred, not the post-split pred.  This was causing us
to make the wrong decision in some cases, leaving the critical edge block
in the loop.

llvm-svn: 23601
2005-10-03 00:31:52 +00:00
Jeff Cohen
412582bcec Fix VC++ warnings.
llvm-svn: 23579
2005-10-01 03:57:14 +00:00
Chris Lattner
633db4c298 Insert stores after phi nodes in the normal dest. This fixes
LowerInvoke/2005-08-03-InvokeWithPHI.ll

llvm-svn: 23525
2005-09-29 17:44:20 +00:00
Chris Lattner
c800214144 Fold isascii into a simple comparison. This speeds up 197.parser by 7.4%,
bringing the LLC time down to the CBE time.

llvm-svn: 23521
2005-09-29 06:17:27 +00:00
Chris Lattner
3aa9edd482 remove a bunch of unneeded stuff, or self evident comments
llvm-svn: 23519
2005-09-29 06:16:11 +00:00
Chris Lattner
0cf56701ef Implement a couple of memcmp folds from the todo list
llvm-svn: 23517
2005-09-29 04:54:20 +00:00
Chris Lattner
6ce1d7dabb Constant fold llvm.sqrt
llvm-svn: 23487
2005-09-28 01:34:32 +00:00
Chris Lattner
cd79ed877a add a note about a way to improve this code further, that I won't be getting
to right now.

llvm-svn: 23485
2005-09-27 22:44:59 +00:00
Chris Lattner
80e4480cfa Fix a regression in my previous patch, fixing GlobalOpt/2005-09-27-Crash.ll
and PR632.

llvm-svn: 23484
2005-09-27 22:28:11 +00:00
Chris Lattner
8ff6df40ba Avoid spilling stack slots... to stack slots.
llvm-svn: 23478
2005-09-27 21:33:12 +00:00
Chris Lattner
b2fa69f4b2 Completely rewrite 'correct' eh support. This changes how setjmp insertion
is performed so it is only at most once per function that contains an invoke
instead of once per invoke in the function.  This patch has the following perks:

1. It fixes PR631, which complains about slowness.
2. If fixes PR240, which complains about non-volatile vars being live across
   setjmp/longjmps.
3. It improves (but does not fix) the jmpbuf alignment issue on itanium by not
   forcing the jmpbufs to always be 8-bytes off the alignment of the structure.
4. It speeds up 253.perlbmk from 338s to 13.70s (a 25x improvement!), making us
   now about 4% faster than GCC.

Further improvements are also possible.

llvm-svn: 23477
2005-09-27 21:18:17 +00:00
Chris Lattner
29f2ba1384 Make the pass name simpler
llvm-svn: 23476
2005-09-27 21:10:32 +00:00
Chris Lattner
92b0563eb3 allow demotion to volatile values, add support for invoke
llvm-svn: 23473
2005-09-27 19:39:00 +00:00
Chris Lattner
e242673da3 Add support for external calls that we know how to constant fold. This implements
ctor-list-opt.ll:CTOR8

llvm-svn: 23465
2005-09-27 05:02:43 +00:00
Chris Lattner
b3b15ed0a3 Fix a bug where we would evaluate stores into linkonce objects which could be
potentially replaced at link-time.

llvm-svn: 23463
2005-09-27 04:50:03 +00:00
Chris Lattner
dc94923d86 Implement support for static constructors with calls in them. This is useful
because gccas runs globalopt before inlining.

This implements ctor-list-opt.ll:CTOR7

llvm-svn: 23462
2005-09-27 04:45:34 +00:00
Chris Lattner
5bc02b52dd Refactor this code a bit, no functionality changes.
llvm-svn: 23460
2005-09-27 04:27:01 +00:00