1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-01 16:33:37 +01:00
Commit Graph

636 Commits

Author SHA1 Message Date
Chris Lattner
f7db1f1a3a Enhance the previous fix for PR4895 to allow more values than just
simple constants for the true/false value of the select.  We now
do phi translation etc.  This really fixes PR4895 :)

llvm-svn: 82917
2009-09-27 20:18:49 +00:00
Chris Lattner
0af7f0ceaf implement PR4895, by making FoldOpIntoPhi handle select conditions
that are phi nodes.  Also tighten up FoldOpIntoPhi to treat constantexpr
operands to phis just like other variables, avoiding moving constantexpr
computations around.

Patch by Daniel Dunbar.

llvm-svn: 82913
2009-09-27 19:57:57 +00:00
Nick Lewycky
6ea218a96b Filecheckify this one test.
llvm-svn: 82888
2009-09-27 06:25:05 +00:00
Dale Johannesen
53c365d807 Handle sqrt in CannotBeNegativeZero. absf and absl
appear to be misspellings, removed in favor of fabs*.

llvm-svn: 82796
2009-09-25 20:54:50 +00:00
Victor Hernandez
f772c0b8b2 Revert 82694 "Auto-upgrade malloc instructions to malloc calls." because it causes regressions in the nightly tests.
llvm-svn: 82784
2009-09-25 18:11:52 +00:00
Victor Hernandez
ff8027ced6 Auto-upgrade malloc instructions to malloc calls.
Reviewed by Devang Patel.

llvm-svn: 82694
2009-09-24 17:47:49 +00:00
Dan Gohman
2296899a6b Fix the comment in this test.
llvm-svn: 82051
2009-09-16 16:33:59 +00:00
Dan Gohman
77638f25b1 Don't sink gep operators through phi nodes if the result would require
more than one phi, since that leads to higher register pressure on
entry to the phi. This is especially problematic when the phi is in
a loop header, as it increases register pressure throughout the loop.

llvm-svn: 81993
2009-09-16 02:01:52 +00:00
Dan Gohman
205b641954 Change tests from "opt %s" to "opt < %s" so that opt doesn't see the
input filename so that opt doesn't print the input filename in the
output so that grep lines in the tests don't unintentionally match
strings in the input filename.

llvm-svn: 81537
2009-09-11 18:01:28 +00:00
Dan Gohman
aa66e3d968 Teach lib/VMCore/ConstantFold.cpp how to set the inbounds keyword and
how to fold notionally-out-of-bounds array getelementptr indices instead
of just doing these in lib/Analysis/ConstantFolding.cpp, because it can
be done in a fairly general way without TargetData, and because not all
constants are visited by lib/Analysis/ConstantFolding.cpp. This enables
more constant folding.

Also, set the "inbounds" flag when the getelementptr indices are
one-past-the-end.

llvm-svn: 81483
2009-09-11 00:04:14 +00:00
Dan Gohman
58a0550024 Factor out the code for checking that all indices in a getelementptr are
within the notional bounds of the static type of the getelementptr (which
is not the same as "inbounds") from GlobalOpt into a utility routine,
and use it in ConstantFold.cpp to check whether there are any mis-behaved
indices.

llvm-svn: 81478
2009-09-10 23:37:55 +00:00
Dan Gohman
0df0f8323c Use "opt < %s" instead of "opt %s" to keep the testname away from the grep.
llvm-svn: 81299
2009-09-09 00:22:49 +00:00
Dan Gohman
142428ce64 Eliminate more uses of llvm-as and llvm-dis.
llvm-svn: 81293
2009-09-09 00:09:15 +00:00
Dan Gohman
cd0fb89725 Use "opt < %s" instead of "opt %s" so that opt doesn't print the test
filename in the output, which interferes with the tests' grep lines.

llvm-svn: 81263
2009-09-08 22:57:49 +00:00
Dan Gohman
a3ab9b3b9e Convert a few more opt | llvm-dis to opt -S.
llvm-svn: 81261
2009-09-08 22:41:33 +00:00
Dan Gohman
c95df8b6d8 Use opt -S instead of piping bitcode output through llvm-dis.
llvm-svn: 81257
2009-09-08 22:34:10 +00:00
Chris Lattner
5caba6ecdc remove an extremely dubious instcombine transformation of
extractelement(load).

llvm-svn: 81239
2009-09-08 18:48:01 +00:00
Dan Gohman
8d84372836 Change these tests to feed the assembly files to opt directly, instead
of using llvm-as, now that opt supports this.

llvm-svn: 81226
2009-09-08 16:50:01 +00:00
Chris Lattner
12d0bc749f instcombine transforms vector loads that are only used by
extractelement operations into a bitcast of the pointer,
then a gep, then a scalar load.  Disable this when the vector
only has one element, because it leads to infinite loops in
instcombine (PR4908).

This transformation seems like a really bad idea to me, as it
will likely disable CSE of vector load/stores etc and can be
better done in the code generator when profitable.  This
goes all the way back to the first days of packed types,
r25299 specifically.

I'll let those people who care about the performance of vector
code decide what to do with this.

llvm-svn: 81185
2009-09-08 03:44:51 +00:00
Chris Lattner
37dbbde91b fix ComputeMaskedBits handling of zext/sext/trunc to work with vectors.
This fixes PR4905

llvm-svn: 81174
2009-09-08 00:13:52 +00:00
Daniel Dunbar
b220fd9c58 Don't depend on Tcl behavior of redirecting stderr for all commands in a
pipeline.

llvm-svn: 81153
2009-09-07 19:26:02 +00:00
Daniel Dunbar
a953c39b9e Eliminate uses of %prcontext.
- I'd appreciate it if someone else eyeballs my changes to make sure I captured
   the intent of the test.

llvm-svn: 81083
2009-09-05 11:35:16 +00:00
Chris Lattner
8bb351e2c2 fix PR4837, some bugs folding vector compares. These
return a vector of i1, not i1 itself.

llvm-svn: 80761
2009-09-02 05:12:37 +00:00
Chris Lattner
73d1e17cec fix a bug I introduced with my 'instcombine builder' refactoring
changes: SimplifyDemandedBits can't use the builder yet because it
has the wrong insertion point.  This fixes a crash building
MultiSource/Benchmarks/PAQ8p

llvm-svn: 80537
2009-08-31 04:36:22 +00:00
Chris Lattner
1c81f120e2 suck a bunch more gep tests into getelementptr.ll and filecheckize them all.
llvm-svn: 80517
2009-08-30 21:31:34 +00:00
Chris Lattner
1c8bd32732 consolodate various GEP tests into getelementptr.ll using filecheck.
llvm-svn: 80514
2009-08-30 21:02:36 +00:00
Chris Lattner
ac17c19a06 another huge testcase, this time from 'gs' in llvm-test.
llvm-svn: 80513
2009-08-30 21:02:02 +00:00
Chris Lattner
6c6cd82568 remove another poorly-reduced testcase which came from ldecod in llvm-test.
llvm-svn: 80512
2009-08-30 21:01:14 +00:00
Chris Lattner
94ea2fb674 this testcase is 500 lines long and is distilled from bzip2, just
remove it.

llvm-svn: 80511
2009-08-30 21:00:11 +00:00
Chris Lattner
22c8be162d convert to filecheck
llvm-svn: 80510
2009-08-30 20:48:15 +00:00
Chris Lattner
d21a94f8d6 Fix PR4748: don't fold gep(bitcast(x)) into bitcast(gep) when x
is itself a bitcast.  Since we have gep(bitcast(bitcast(y))) in this
case, just wait for the two bitcasts to get zapped.  This prevents
instcombine from confusing some aliasing stuff, and allows it to
directly eliminate the load in the testcase.

llvm-svn: 80508
2009-08-30 20:38:21 +00:00
Dan Gohman
bf08e82d8e Remove obsolete -f flags.
llvm-svn: 79992
2009-08-25 15:38:29 +00:00
Dan Gohman
d240c19451 Change getelementptr folding to use APInt instead of uint64_t for
offset computations. This fixes a truncation bug on targets that
don't have 64-bit pointers.

llvm-svn: 79639
2009-08-21 16:52:54 +00:00
Dan Gohman
cc511acf87 Fix a bug in the over-index constant folding. When over-indexing an
array member of a struct, it's possible to land in an arbitrary position
inside that struct, such that attempting to find further getelementptr
indices will fail. In such cases, folding cannot be done.

llvm-svn: 79485
2009-08-19 22:46:59 +00:00
Dan Gohman
bc59c24278 Canonicalize indices in a constantexpr GEP. If Indices exceed the
static extents of the static array type, it causes GlobalOpt and
other passes to be more conservative. This canonicalization also
allows the constant folder to add "inbounds" to GEPs.

llvm-svn: 79440
2009-08-19 18:18:36 +00:00
Mon P Wang
d56e4482fc When InstCombine simplifies a load -> extract element to gep -> load, place
the new load by the old load instead of by the extract element because
a store could have occurred between the load and extract element.

llvm-svn: 78891
2009-08-13 05:12:13 +00:00
Dan Gohman
9f80d2be6b Make LLVM Assembly dramatically easier to read by aligning the comments,
using formatted_raw_ostream's PadToColumn.

Before:

bb1:            ; preds = %bb
  %2 = sext i32 %i.01 to i64            ; <i64> [#uses=1]
  %3 = getelementptr double* %p, i64 %2         ; <double*> [#uses=1]
  %4 = load double* %3, align 8         ; <double> [#uses=1]
  %5 = fmul double %4, 1.100000e+00             ; <double> [#uses=1]
  %6 = sext i32 %i.01 to i64            ; <i64> [#uses=1]
  %7 = getelementptr double* %p, i64 %6         ; <double*> [#uses=1]

After:

bb1:                                        ; preds = %bb
  %2 = sext i32 %i.01 to i64                ; <i64> [#uses=1]
  %3 = getelementptr double* %p, i64 %2     ; <double*> [#uses=1]
  %4 = load double* %3, align 8             ; <double> [#uses=1]
  %5 = fmul double %4, 1.100000e+00         ; <double> [#uses=1]
  %6 = sext i32 %i.01 to i64                ; <i64> [#uses=1]
  %7 = getelementptr double* %p, i64 %6     ; <double*> [#uses=1]

Several tests required whitespace adjustments.

llvm-svn: 78816
2009-08-12 17:23:50 +00:00
Dan Gohman
00ee3a9a1a Transform -X/C to X/-C, implementing a README.txt entry.
llvm-svn: 78812
2009-08-12 16:37:02 +00:00
Dan Gohman
d5b6e35080 Optimize (x/C)*C to x if the division is exact.
llvm-svn: 78811
2009-08-12 16:33:09 +00:00
Dan Gohman
8d69df5773 Optimize exact sdiv by a constant power of 2 to ashr.
llvm-svn: 78714
2009-08-11 20:47:47 +00:00
Dan Gohman
08e747855c Don't assume that external global variables are aligned at their preferred
alignment. Only the minimum alignment guaranteed by the ABI may be assumed.

llvm-svn: 78668
2009-08-11 15:50:03 +00:00
Dan Gohman
aba682a290 Add -disable-output. Thanks Bill!
llvm-svn: 78009
2009-08-03 22:24:22 +00:00
Dan Gohman
39f93f6443 Add a new Constant::getIntegerValue helper function, and convert a
few places in InstCombine to use it, to fix problems handling pointer
types. This fixes the recent llvm-gcc bootstrap error.

llvm-svn: 78005
2009-08-03 22:07:33 +00:00
Dan Gohman
0d0dd7b732 Teach instcombine to respect and preserve inbounds. Add inbounds
to a few tests where it is required for the expected transformation.

llvm-svn: 77290
2009-07-28 01:40:03 +00:00
Chris Lattner
0426853d67 merge vector-casts-0.ll into vector-casts.ll
llvm-svn: 76864
2009-07-23 05:33:39 +00:00
Chris Lattner
c687344f0c Make some existing optimizations that would only trigger on scalars
also apply to vectors.  This allows us to compile this:

#include <emmintrin.h>
__m128i a(__m128 a, __m128 b) { return a==a & b==b; }
__m128i b(__m128 a, __m128 b) { return a!=a | b!=b; }

to:

_a:
	cmpordps	%xmm1, %xmm0
	ret
_b:
	cmpunordps	%xmm1, %xmm0
	ret

with clang instead of to a ton of horrible code.

llvm-svn: 76863
2009-07-23 05:32:17 +00:00
Chris Lattner
f4474da353 convert a test to filecheck format. This fixes an endemic problem
with negative tests: this test wasn't checking what it thought it was
because it was grepping .bc, not .ll.

llvm-svn: 76861
2009-07-23 05:27:48 +00:00
Chris Lattner
7061a4300d rename test
llvm-svn: 76860
2009-07-23 05:25:12 +00:00
Dan Gohman
f2c6e6a1bd Add a testcase for PR2831.
llvm-svn: 76527
2009-07-21 01:02:18 +00:00
Dan Gohman
00b05492f1 Revert the addition of hasNoPointerOverflow to GEPOperator.
Getelementptrs that are defined to wrap are virtually useless to
optimization, and getelementptrs that are undefined on any kind
of overflow are too restrictive -- it's difficult to ensure that
all intermediate addresses are within bounds. I'm going to take
a different approach.

Remove a few optimizations that depended on this flag.

llvm-svn: 76437
2009-07-20 17:43:30 +00:00
Eli Friedman
e507c1afaa Canonicalize bitcasts between types like <1 x i64> and i64 to
insertelement/extractelement.

I'm not entirely sure this is precisely what we want to do: should we 
prefer bitcast(insertelement) or insertelement(bitcast)?  Similarly. should we 
prefer extractelement(bitcast) or bitcast(extractelement)?

llvm-svn: 76345
2009-07-18 23:06:53 +00:00
Eli Friedman
debc43cb11 Back out 76300; apparently the preference is to canonicalize the other
way (bitcast -> insert/extractelement).

llvm-svn: 76325
2009-07-18 19:04:16 +00:00
Eli Friedman
65a5fe312a Add combine: X sdiv (1 << Y) -> X udiv (1 << Y) when X doesn't have the
sign bit set.

llvm-svn: 76304
2009-07-18 09:53:21 +00:00
Eli Friedman
f1878fcda1 Canonicalize insert/extractelement from single-element vectors into
bitcasts.

It would also be possible to canonicalize the other way; does anyone 
have a preference?

llvm-svn: 76300
2009-07-18 09:07:47 +00:00
Eli Friedman
048d13f9bb Don't restrict the set of instructions where we try to constant-fold the
operands; it's possible to end up with a constant-foldable operand to 
most instructions, even those which can't trap.

llvm-svn: 75845
2009-07-15 22:13:34 +00:00
Eli Friedman
63028801b8 Fix trivial todo in instcombine.
llvm-svn: 75586
2009-07-14 02:01:53 +00:00
Eli Friedman
a6c7a3d44e PR4548: optimize zext+udiv+trunc to udiv.
llvm-svn: 75539
2009-07-13 22:46:01 +00:00
Eli Friedman
47839d3dec Fix bug in run-line.
llvm-svn: 75534
2009-07-13 22:31:30 +00:00
Eli Friedman
6b51ac6728 Canonicalize boolean +/- a constant to a select.
(I think it's reasonably clear that we want to have a canonical form for 
constructs like this; if anyone thinks that a select is not the best 
canonical form, please tell me.)

llvm-svn: 75531
2009-07-13 22:27:52 +00:00
Chris Lattner
54c0359890 do not try to analyze bitcasts from i64 to <2 x i32> in ComputedMaskedBits. While
we could do this, doing so requires adjusting the demanded mask and the code isn't 
doing that yet.  This fixes PR4495

llvm-svn: 74699
2009-07-02 16:04:08 +00:00
Dan Gohman
e3b1f9e14b Fix an instcombine abort on a scalar-to-vector bitcast. This fixes PR4487.
llvm-svn: 74646
2009-07-01 21:38:46 +00:00
Dan Gohman
dc884a7830 Generalize the zext(trunc(t) & C) instcombine to work even with
C is not a low-bits mask, and add a similar instcombine for
zext((trunc(t) & C) ^ C).

llvm-svn: 73705
2009-06-18 16:30:21 +00:00
Dan Gohman
1530824138 Instcombine zext(trunc(x) & mask) to x&mask, even if the trunc has
multiple users.

llvm-svn: 73656
2009-06-17 23:17:05 +00:00
Eli Friedman
36d7ca738e Correct an accidental duplication of the test (patch doesn't handle
creating new files very well).

llvm-svn: 73599
2009-06-17 03:05:00 +00:00
Eli Friedman
b3947071ff PR3439: Correct a silly mistake in the SimplifyDemandedUseBits code for
SRem.

llvm-svn: 73598
2009-06-17 02:57:36 +00:00
Dan Gohman
54bbef1525 Generalize a few more instcombines to be vector/scalar-independent.
llvm-svn: 73541
2009-06-16 19:55:29 +00:00
Chris Lattner
f54c97c579 Testcase for r73506
llvm-svn: 73508
2009-06-16 17:23:25 +00:00
Dan Gohman
2e737ac21f Support vector casts in more places, fixing a variety of assertion
failures.

To support this, add some utility functions to Type to help support
vector/scalar-independent code. Change ConstantInt::get and
ConstantFP::get to support vector types, and add an overload to
ConstantInt::get that uses a static IntegerType type, for
convenience.

Introduce a new getConstant method for ScalarEvolution, to simplify
common use cases.

llvm-svn: 73431
2009-06-15 22:12:54 +00:00
Chris Lattner
52510b0788 fix testcase to properly check for the patch in r73195.
llvm-svn: 73380
2009-06-15 05:46:02 +00:00
Dan Gohman
f9b0419cd8 Don't do (x - (y - z)) --> (x + (z - y)) on floating-point types, because
it may round differently. This fixes PR4374.

llvm-svn: 73243
2009-06-12 19:23:25 +00:00
Chris Lattner
e0360f8ae8 Fix 4366: store to null in non-default addr space should not be
turned into unreachable.

llvm-svn: 73195
2009-06-11 17:54:56 +00:00
Eli Friedman
770f633389 PR4340: Run SimplifyDemandedVectorElts on insertelement instructions;
sometimes it can find simplifications that won't be found otherwise.

llvm-svn: 73006
2009-06-06 20:08:03 +00:00
Dan Gohman
5f6f8101d5 Split the Add, Sub, and Mul instruction opcodes into separate
integer and floating-point opcodes, introducing
FAdd, FSub, and FMul.

For now, the AsmParser, BitcodeReader, and IRBuilder all preserve
backwards compatability, and the Core LLVM APIs preserve backwards
compatibility for IR producers. Most front-ends won't need to change
immediately.

This implements the first step of the plan outlined here:
http://nondot.org/sabre/LLVMNotes/IntegerOverflow.txt

llvm-svn: 72897
2009-06-04 22:49:04 +00:00
Dan Gohman
05fe1217c7 Check in test changes that I accidentally left out of r72872.
llvm-svn: 72875
2009-06-04 18:22:31 +00:00
Evan Cheng
77529302a6 Fix bug in FoldFCmp_IntToFP_Cst. If inttofp is a uintofp, use unsigned instead of signed integer constant.
llvm-svn: 72300
2009-05-22 23:10:53 +00:00
Dan Gohman
fc28858d91 Teach ValueTracking a new way to analyze PHI nodes, and and teach
Instcombine to be more aggressive about using SimplifyDemandedBits
on shift nodes. This allows a shift to be simplified to zero in the
included test case.

llvm-svn: 72204
2009-05-21 02:28:33 +00:00
Chris Lattner
eb2f327449 calls in nothrow functions can be marked nothrow even if the callee
is not known to be nothrow.  This allows readnone/readonly functions
to be deleted even if we don't know whether the callee can throw.

llvm-svn: 71676
2009-05-13 17:39:14 +00:00
Dan Gohman
ebacd61d7d Revert 71165. It did more than just revert 71158 and it introduced
several regressions. The problem due to 71158 is now fixed.

llvm-svn: 71176
2009-05-07 19:46:24 +00:00
Bill Wendling
9f97e4a3dc Temporarily revert r71158. It was causing a failure during a full bootstrap:
checking for bcopy... no
checking for getc_unlocked... Assertion failed: (0 && "Unknown SCEV kind!"), function operator(), file /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvmCore.roots/llvmCore~obj/src/lib/Analysis/ScalarEvolution.cpp, line 511.
/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvmgcc42.roots/llvmgcc42~obj/src/libdecnumber/decUtility.c:360: internal compiler error: Abort trap
Please submit a full bug report,
with preprocessed source if appropriate.
See <URL:http://developer.apple.com/bugreporter> for instructions.
make[4]: *** [decUtility.o] Error 1
make[4]: *** Waiting for unfinished jobs....
Assertion failed: (0 && "Unknown SCEV kind!"), function operator(), file /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvmCore.roots/llvmCore~obj/src/lib/Analysis/ScalarEvolution.cpp, line 511.
/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvmgcc42.roots/llvmgcc42~obj/src/libdecnumber/decNumber.c:5591: internal compiler error: Abort trap
Please submit a full bug report,
with preprocessed source if appropriate.
See <URL:http://developer.apple.com/bugreporter> for instructions.
make[4]: *** [decNumber.o] Error 1
make[3]: *** [all-stage2-libdecnumber] Error 2
make[3]: *** Waiting for unfinished jobs....

llvm-svn: 71165
2009-05-07 17:26:14 +00:00
Dan Gohman
9a6a882979 Constant-fold ptrtoint+add+inttoptr to gep when the pointer is an
array and the add is within range. This helps simplify expressions
expanded by ScalarEvolutionExpander.

llvm-svn: 71158
2009-05-07 14:24:56 +00:00
Dan Gohman
a7fae1f865 Add several more icmp simplifications. Transform signed comparisons
into unsigned ones when the operands are known to have the same
sign bit value.

llvm-svn: 70053
2009-04-25 17:12:48 +00:00
Chris Lattner
c1bfdc9bb2 Add a new "available_externally" linkage type. This is intended
to support C99 inline, GNU extern inline, etc.  Related bugzilla's
include PR3517, PR3100, & PR2933.  Nothing uses this yet, but it
appears to work.

llvm-svn: 68940
2009-04-13 05:44:34 +00:00
Chris Lattner
7d75f78b92 Instcombine should not promote whole computation trees to "strange"
integer types, unless they are already strange.  This prevents it from
turning the code produced by SROA into crazy libcalls and stuff that 
the code generator can't handle.  In the attached example, the result
was an i96 multiply that caused the x86 backend to assert.

Note that if TargetData had an idea of what the legal types are for
a target that this could be used to stop instcombine from introducing
i64 muls, as Scott wanted.

llvm-svn: 68598
2009-04-08 05:41:03 +00:00
Chris Lattner
2f520929d4 fix rdar://6762290, a crash compiling cxx filt with clang.
llvm-svn: 68500
2009-04-07 05:03:34 +00:00
Evan Cheng
c419350132 Throttle back "fold select into operand" transformation. InstCombine should not generate selects of two constants unless they are selects of 0 and 1.
e.g.
define i32 @t1(i32 %c, i32 %x) nounwind {
       %t1 = icmp eq i32 %c, 0
       %t2 = lshr i32 %x, 18
       %t3 = select i1 %t1, i32 %t2, i32 %x
       ret i32 %t3
}

was turned into

define i32 @t2(i32 %c, i32 %x) nounwind {
       %t1 = icmp eq i32 %c, 0
       %t2 = select i1 %t1, i32 18, i32 0
       %t3 = lshr i32 %x, %t2
       ret i32 %t3
}

For most targets, that means materializing two constants and then a select. e.g. On x86-64

movl    %esi, %eax
shrl    $18, %eax
testl   %edi, %edi
cmovne  %esi, %eax
ret

=>

xorl    %eax, %eax
testl   %edi, %edi
movl    $18, %ecx
cmovne  %eax, %ecx
movl    %esi, %eax
shrl    %cl, %eax
ret

Also, the optimizer and codegen can reason about shl / and / add, etc. by a constant. This optimization will hinder optimizations using ComputeMaskedBits.

llvm-svn: 68142
2009-03-31 20:42:45 +00:00
Chris Lattner
c055403764 Fix PR3874 by restoring a condition I removed, but making it more
precise than it used to be.

llvm-svn: 67662
2009-03-25 00:28:58 +00:00
Chris Lattner
aabd3eeeff canonicalize inttoptr and ptrtoint instructions which cast pointers
to/from integer types that are not intptr_t to convert to intptr_t
then do an integer conversion to the dest type.  This exposes the
cast to the optimizer.

llvm-svn: 67638
2009-03-24 18:35:40 +00:00
Chris Lattner
51a4134e1c two changes:
1. Make instcombine always canonicalize trunc x to i1 into an icmp(x&1).  This 
   exposes the AND to other instcombine xforms and is more of what the code
   generator expects.
2. Rewrite the remaining trunc pattern match to use 'match', which 
   simplifies it a lot.
   

llvm-svn: 67635
2009-03-24 18:15:30 +00:00
Chris Lattner
623662e8e1 Fix instcombine to not introduce undefined shifts when merging two
shifts together.  This fixes PR3851.

llvm-svn: 67411
2009-03-20 22:41:15 +00:00
Chris Lattner
0542f9f1ba Fix PR3826 - InstComb assert with vector shift, by not calling ComputeNumSignBits on a vector.
llvm-svn: 67211
2009-03-18 16:32:19 +00:00
Duncan Sands
51ce06c788 Fix PR3694: add an instcombine micro-optimization that helps
clean up when using variable length arrays in llvm-gcc.

llvm-svn: 65832
2009-03-02 09:18:21 +00:00
Chris Lattner
1443cb8f77 Fix PR3667
llvm-svn: 65464
2009-02-25 18:20:01 +00:00
Dan Gohman
486728ef53 Add a testcase for the problem fixed in r65289.
llvm-svn: 65365
2009-02-24 02:17:42 +00:00
Dan Gohman
1197d46ccf Fix a ValueTracking rule: RHS means operand 1, not 0. Add a simple
ashr instcombine to help expose this code. And apply the fix to
SelectionDAG's copy of this code too.

llvm-svn: 65364
2009-02-24 02:00:40 +00:00
Nick Lewycky
2c8f0fd57f Don't sign extend the char when expanding char -> int during
load(bitcast(char[4] to i32*)) evaluation.

llvm-svn: 65246
2009-02-21 20:50:42 +00:00
Chris Lattner
3adae91c70 rename a function to indicate that it checks for profitability as well
as legality.  Make load sinking and gep sinking more careful: we only
do it when it won't pessimize loads from the stack.  This has the added
benefit of not producing code that is unanalyzable to SROA.

llvm-svn: 65209
2009-02-21 00:46:50 +00:00
Dan Gohman
4ed0aa2409 Change the argument type in this test to something less convoluted,
since it isn't actually used. 

llvm-svn: 64883
2009-02-18 04:25:04 +00:00
Chris Lattner
0837686a2a commit a tweaked version of Daniel's patch for PR3599. We now
eliminate all the extensions and all but the one required truncate
from the testcase, but the or/and/shift stuff still isn't zapped.

llvm-svn: 64809
2009-02-17 20:47:23 +00:00
Dan Gohman
e06ea828a2 Fix EnforceKnownAlignment so that it doesn't ever reduce the alignment
of an alloca or global variable.

llvm-svn: 64693
2009-02-16 23:02:21 +00:00
Dan Gohman
3d93bc5654 Change these tests to use regular loads instead of llvm.x86.sse2.loadu.dq.
Enhance instcombine to use the preferred field of
GetOrEnforceKnownAlignment in more cases, so that regular IR operations are
optimized in the same way that the intrinsics currently are.

llvm-svn: 64623
2009-02-16 00:44:23 +00:00
Nate Begeman
8b548c0a9e Add suppport for ConstantExprs of shufflevectors whose result type is not equal to the
type of the vectors being shuffled.

llvm-svn: 64401
2009-02-12 21:28:33 +00:00
Mon P Wang
028d995112 Instrcombine should not change load(cast p) to cast(load p) if the cast
changes the address space of the pointer.

llvm-svn: 64035
2009-02-07 22:19:29 +00:00
Duncan Sands
6b95b76bca Allow the inverse transform x86_fp80 -> i80 (also
fires during the Ada build).

llvm-svn: 63731
2009-02-04 11:17:06 +00:00
Duncan Sands
528bb91ea8 Fix PR3468: a crash when constant folding a bitcast of
i80 to x86 long double (this was presumably generated
by sroa).

llvm-svn: 63730
2009-02-04 10:17:14 +00:00
Evan Cheng
b3da5fb3a4 APInt'fy SimplifyDemandedVectorElts so it can analyze vectors with more than 64 elements.
llvm-svn: 63631
2009-02-03 10:05:09 +00:00
Chris Lattner
1cb94d541b reduce testcase.
llvm-svn: 63499
2009-02-02 06:55:45 +00:00
Nick Lewycky
e25b96473e Reinstate this optimization to fold icmp of xor when possible. Don't try to
turn icmp eq a+x, b+x into icmp eq a, b if a+x or b+x has other uses. This
may have been increasing register pressure leading to the bzip2 slowdown.

llvm-svn: 63487
2009-01-31 21:30:05 +00:00
Chris Lattner
26698a600e Fix PR3452 (an infinite loop bootstrapping) by disabling the recent
improvements to the EvaluateInDifferentType code.  This code works 
by just inserted a bunch of new code and then seeing if it is 
useful.  Instcombine is not allowed to do this: it can only insert
new code if it is useful, and only when it is converging to a more
canonical fixed point.  Now that we iterate when DCE makes progress,
this causes an infinite loop when the code ends up not being used.

llvm-svn: 63483
2009-01-31 19:05:27 +00:00
Chris Lattner
c4729610fc now that all the pieces are in place, teach instcombine's
simplifydemandedbits to simplify instructions with *multiple
uses* in contexts where it can get away with it.  This allows
it to simplify the code in multi-use-or.ll into a single 'add 
double'.

This change is particularly interesting because it will cover
up for some common codegen bugs with large integers created due
to the recent SROA patch.  When working on fixing those bugs,
this should be disabled.

llvm-svn: 63481
2009-01-31 08:40:03 +00:00
Chris Lattner
abf34563ec make sure to set Changed=true when instcombine hacks on the code,
not doing so prevents it from properly iterating and prevents it
from deleting the entire body of dce-iterate.ll

llvm-svn: 63476
2009-01-31 07:04:22 +00:00
Mon P Wang
80efbf07bd Fixed optimization of combining two shuffles where the first shuffle inputs
has a different number of elements than the output.

llvm-svn: 62998
2009-01-26 04:39:00 +00:00
Torok Edwin
2a7e7066b3 testcase for PR3381.
Also it was an empty struct, not a void after all.

llvm-svn: 62920
2009-01-24 17:16:04 +00:00
Chris Lattner
d386e82ec9 Make InstCombineStoreToCast handle aggregates more aggressively,
handling the case in Transforms/InstCombine/cast-store-gep.ll, which
is a heavily reduced testcase from Clang on x86-64.

llvm-svn: 62904
2009-01-24 01:00:13 +00:00
Dale Johannesen
a5699a1e8b Do not use host floating point types when emitting
ASCII IR; loading and storing these can change the
bits of NaNs on some hosts.  Remove or add warnings
at a few other places using host floating point;
this is a bad thing to do in general.

llvm-svn: 62712
2009-01-21 20:32:55 +00:00
Dale Johannesen
ba0f5e174f Disable on x86_64 until I figure out what's wrong.
llvm-svn: 62660
2009-01-21 02:08:30 +00:00
Dale Johannesen
6854f86296 Make special cases (0 inf nan) work for frem.
Besides APFloat, this involved removing code
from two places that thought they knew the
result of frem(0., x) but were wrong.

llvm-svn: 62645
2009-01-21 00:35:19 +00:00
Dale Johannesen
1c12d1b665 Calls to fmod, it turns out, are constant-folded by
invoking the host fmod, not by lowering to frem and
constant-folding that.  Fix this so it tests what I
want to test.

llvm-svn: 62622
2009-01-20 21:58:13 +00:00
Dale Johannesen
5508ead868 Move & restructure test per review.
llvm-svn: 62538
2009-01-19 22:33:12 +00:00
Chris Lattner
5d1ed9ed1f Fix PR3335 by not turning a store to one address space into a store to another.
llvm-svn: 62351
2009-01-16 20:12:52 +00:00
Evan Cheng
e7c9310d1b Clean up previous cast optimization a bit. Also make zext elimination a bit more aggressive: if it's not necessary to emit an AND (i.e. high bits are already zero), it's profitable to evaluate the operand at a different type.
llvm-svn: 62297
2009-01-16 02:11:43 +00:00
Evan Cheng
d504f9fe27 - Teach CanEvaluateInDifferentType of this xform: sext (zext ty1), ty2 -> zext ty2
- Looking at the number of sign bits of the a sext instruction to determine  whether new trunc + sext pair should be added when its source is being evaluated in a different type.

llvm-svn: 62263
2009-01-15 17:01:23 +00:00
Dan Gohman
958861e65e Make instcombine ensure that all allocas are explicitly aligned at at
least their preferred alignment.

llvm-svn: 62176
2009-01-13 20:18:38 +00:00
Chris Lattner
660c094906 Implement rdar://6480391, extending of equality icmp's to avoid a truncation.
I noticed this in the code compiled for a routine using std::map, which produced
this code:
	%25 = tail call i32 @memcmp(i8* %24, i8* %23, i32 6) nounwind readonly
	%.lobit.i = lshr i32 %25, 31		; <i32> [#uses=1]
	%tmp.i = trunc i32 %.lobit.i to i8		; <i8> [#uses=1]
	%toBool = icmp eq i8 %tmp.i, 0		; <i1> [#uses=1]
	br i1 %toBool, label %bb3, label %bb4
which compiled to:

	call	L_memcmp$stub
	shrl	$31, %eax
	testb	%al, %al
	jne	LBB1_11	## 

with this change, we compile it to:

	call	L_memcmp$stub
	testl	%eax, %eax
	js	LBB1_11

This triggers all the time in common code, with patters like this:

	%169 = and i32 %ply, 1		; <i32> [#uses=1]
	%170 = trunc i32 %169 to i8		; <i8> [#uses=1]
	%toBool = icmp ne i8 %170, 0		; <i1> [#uses=1]

 	%7 = lshr i32 %6, 24		; <i32> [#uses=1]
	%9 = trunc i32 %7 to i8		; <i8> [#uses=1]
	%10 = icmp ne i8 %9, 0		; <i1> [#uses=1]

etc

llvm-svn: 61985
2009-01-09 07:47:06 +00:00
Chris Lattner
5ce930d116 Fix part 3/2 of PR3290, making instcombine zap (gep(bitcast)) when possible.
llvm-svn: 61980
2009-01-09 05:44:56 +00:00
Chris Lattner
5a8a2b046d ValueTracker can't assume that an alloca with no specified alignment
will get its preferred alignment.  It has to be careful and cautiously assume
it will just get the ABI alignment.  This prevents instcombine from rounding
up the alignment of a load/store without adjusting the alignment of the alloca.

llvm-svn: 61934
2009-01-08 19:28:38 +00:00
Chris Lattner
e10764369d make m_ConstantInt(int64_t) safely match ConstantInt's that are larger than i64.
This fixes an instcombine crash on PR3235.

llvm-svn: 61775
2009-01-05 23:45:50 +00:00
Bill Wendling
dd61282551 XFAIL this test. The xform was removed.
llvm-svn: 61624
2009-01-04 06:32:28 +00:00
Bill Wendling
efbe8b808c Add transformation:
xor (or (icmp, icmp), true) -> and(icmp, icmp)

This is possible because of De Morgan's law.

llvm-svn: 61537
2009-01-01 01:18:23 +00:00
Nick Lewycky
ab50d88e6a Make all the vector elements positive in an srem of constant vector.
llvm-svn: 61195
2008-12-18 06:31:11 +00:00
Bill Wendling
a6e7dd2299 Use m_Specific() instead of double matching.
llvm-svn: 60341
2008-12-01 08:09:47 +00:00
Chris Lattner
e6c7ed156f simplify these patterns using m_Specific. No need to grep for
xor in testcase (or is a substring).

llvm-svn: 60328
2008-12-01 05:16:26 +00:00
Chris Lattner
0e03e40a76 Teach inst combine to merge GEPs through PHIs. This is really
important because it is sinking the loads using the GEPs, but
not the GEPs themselves.  This triggers 647 times on 403.gcc
and makes the .s file much much nicer.  For example before:

        je      LBB1_87 ## bb78
LBB1_62:        ## bb77
        leal    84(%esi), %eax
LBB1_63:        ## bb79
        movl    (%eax), %eax
...
LBB1_87:        ## bb78
        movl    $0, 4(%esp)
        movl    %esi, (%esp)
        call    L_make_decl_rtl$stub
        jmp     LBB1_62 ## bb77


after:

        jne     LBB1_63 ## bb79
LBB1_62:        ## bb78
        movl    $0, 4(%esp)
        movl    %esi, (%esp)
        call    L_make_decl_rtl$stub
LBB1_63:        ## bb79
        movl    84(%esi), %eax

The input code was (and the GEPs are merged and
the PHI is now eliminated by instcombine):

        br i1 %tmp233, label %bb78, label %bb77
bb77:           
        %tmp234 = getelementptr %struct.tree_node* %t_addr.3, i32 0, i32 0, i32 22              
        br label %bb79
bb78:           
        call void @make_decl_rtl(%struct.tree_node* %t_addr.3, i8* null) nounwind
        %tmp235 = getelementptr %struct.tree_node* %t_addr.3, i32 0, i32 0, i32 22              
        br label %bb79
bb79:           
        %iftmp.12.0.in = phi %struct.rtx_def** [ %tmp235, %bb78 ], [ %tmp234, %bb77 ]           
        %iftmp.12.0 = load %struct.rtx_def** %iftmp.12.0.in             

llvm-svn: 60322
2008-12-01 02:34:36 +00:00
Bill Wendling
23684a026c Implement ((A|B)&1)|(B&-2) -> (A&1) | B transformation. This also takes care of
permutations of this pattern.

llvm-svn: 60312
2008-12-01 01:07:11 +00:00
Bill Wendling
66a7442059 Add instruction combining for ((A&~B)|(~A&B)) -> A^B and all permutations.
llvm-svn: 60291
2008-11-30 13:52:49 +00:00
Bill Wendling
3e27ac16a6 Implement (A&((~A)|B)) -> A&B transformation in the instruction combiner. This
takes care of all permutations of this pattern.

llvm-svn: 60290
2008-11-30 13:08:13 +00:00
Bill Wendling
97ad688c1b getSExtValue() doesn't work for ConstantInts with bitwidth > 64 bits. Use all
APInt calls instead.

This fixes PR3144.

llvm-svn: 60288
2008-11-30 12:38:24 +00:00
Bill Wendling
5020e916ef Strengthen check for div inst-combining.
llvm-svn: 60276
2008-11-30 04:33:53 +00:00
Bill Wendling
ac11f7d37e Instcombine was illegally transforming -X/C into X/-C when either X or C
overflowed on negation. This commit checks to make sure that neithe C nor X
overflows. This requires that the RHS of X (a subtract instruction) be a
constant integer.

llvm-svn: 60275
2008-11-30 03:42:12 +00:00
Nick Lewycky
40db216722 Chris prefers icmp/select over udiv!
llvm-svn: 60187
2008-11-27 22:41:10 +00:00
Nick Lewycky
882443585d Add a couple of missed optimizations on integer vectors. Multiply and divide
by 1, as well as multiply by -1.

llvm-svn: 60182
2008-11-27 20:21:08 +00:00
Nick Lewycky
2fbf26fe70 Optimize (x/y)*y into x-(x%y) in general. Div and rem are about the same, and
a subtract is cheaper than a multiply. This generalizes an existing transform.

llvm-svn: 59800
2008-11-21 07:33:58 +00:00
Chris Lattner
21f18c9760 Handle the case where there is no "not". It is possible it got
folded into the select.

llvm-svn: 59389
2008-11-16 04:25:26 +00:00
Chris Lattner
4f8153d48f make this actually test what it is trying to.
llvm-svn: 59386
2008-11-16 04:21:51 +00:00
Bill Wendling
436d4cce83 If the LHS of the FCMP is coming from a UIToFP instruction, then we don't want
to generate signed ICMP instructions to replace the FCMP. This would violate
the following:

define i1 @test1(i32 %val) {
  %1 = uitofp i32 %val to double
  %2 = fcmp ole double %1, 0.000000e+00
  ret i1 %2
}

would be transformed into:

define i1 @test1(i32 %val) {
  %1 = icmp slt i33 %val, 1
  ret i1 %1
}

which is obviously wrong. This patch modifes InstCombiner::FoldFCmp_IntToFP_Cst
to handle when the LHS comes from UIToFP.

llvm-svn: 58929
2008-11-09 04:26:50 +00:00
Nick Lewycky
bcadcbb1ec Fix demanded bits analysis with srem by negative number. Based on a patch
by Richard Osborne.

llvm-svn: 58555
2008-11-02 02:41:50 +00:00
Dan Gohman
1f1ebc5389 Fix this recently moved code to use the correct type. CI is now a
ConstantInt, and SI is the original cast instruction. This fixes
PR2996.

llvm-svn: 58549
2008-11-02 00:17:33 +00:00
Dan Gohman
50061675c5 Canonicalize sext(i1) to i1?-1:0, and update various instcombine
optimizations accordingly.

llvm-svn: 58457
2008-10-30 20:40:10 +00:00
Dan Gohman
3ceee36545 (A & sext(C)) | (B & ~sext(C) -> C ? A : B
llvm-svn: 58351
2008-10-28 22:38:57 +00:00
Nick Lewycky
44356e13da Don't try to create a mask when we don't need one. Fixes a crash.
llvm-svn: 58075
2008-10-24 06:14:27 +00:00
Dan Gohman
6f40163d83 Teach instcombine's visitLoad to scan back several instructions
to find opportunities for store-to-load forwarding or load CSE,
in the same way that visitStore scans back to do DSE. Also, define
a new helper function for testing whether the addresses of two
memory accesses are known to have the same value, and use it in
both visitStore and visitLoad.

These two changes allow instcombine to eliminate loads in code
produced by front-ends that frequently emit obviously redundant
addressing for memory references.

llvm-svn: 57608
2008-10-15 23:19:35 +00:00
Evan Cheng
591baeed7c Combine (fcmp cc0 x, y) | (fcmp cc1 x, y) into a single fcmp when possible.
llvm-svn: 57515
2008-10-14 18:44:08 +00:00
Evan Cheng
778b47e6c0 - Somehow I forgot about one / une.
- Renumber fcmp predicates to match their icmp counterparts.
- Try swapping operands to expose more optimization opportunities.

llvm-svn: 57513
2008-10-14 18:13:38 +00:00
Evan Cheng
91528965e7 Optimize anding of two fcmp into a single fcmp if the operands are the same. e.g. uno && ueq -> ueq
ord && olt -> olt
     ord && ueq -> oeq

llvm-svn: 57507
2008-10-14 17:15:11 +00:00
Chris Lattner
7a61ef92f5 Fix PR2697 by rewriting the '(X / pos) op neg' logic. This also changes
a couple other cases for clarity, but shouldn't affect correctness.

Patch by Eli Friedman!

llvm-svn: 57387
2008-10-11 22:55:00 +00:00
Chris Lattner
107e8f8b60 rewrite bswap matching to be more general, allowing arbitrary
shifting and masking inside a bswap expr.  This allows it to handle
the cases from PR2842, which involve the intermediate 'or' 
expressions being shifted, not just the input value.

llvm-svn: 57095
2008-10-05 02:13:19 +00:00
Nick Lewycky
9e918179c8 Fix misoptimization of: xor i1 (icmp eq (X, C1), icmp s[lg]t (X, C2))
llvm-svn: 56834
2008-09-30 06:08:34 +00:00
Dan Gohman
c598e29a1c Improve instcombine's handling of integer min and max in two ways:
- Recognize expressions like "x > -1 ? x : 0" as min/max and turn them
   into expressions like "x < 0 ? 0 : x", which is easily recognizable
   as a min/max operation.
 - Refrain from folding expression like "y/2 < 1" to "y < 2" when the
   comparison is being used as part of a min or max idiom, like
   "y/2 < 1 ? 1 : y/2". In that case, the division has another use, so
   folding doesn't eliminate it, and obfuscates the min/max, making it
   harder to recognize as a min/max operation.

These benefit ScalarEvolution, CodeGen, and anything else that wants to
recognize integer min and max.

llvm-svn: 56246
2008-09-16 18:46:06 +00:00
Dan Gohman
0b6d3a9a9b On 64-bit targets, change 32-bit getelementptr indices to be 64-bit
getelementptr indices, inserting an explicit cast if necessary.
This helps expose the sign-extension operation to other optimizations.

llvm-svn: 56133
2008-09-11 23:06:38 +00:00
Dan Gohman
5e154a591d Fix a vectorshuffle instcombine bug introduced by r55995.
Patch by Nicolas Capens!

llvm-svn: 56129
2008-09-11 22:47:57 +00:00
Dan Gohman
ebfb483309 Fix an icmp+sdiv optimization to check for and handle an overflow
condition. This fixes PR2740.

llvm-svn: 56076
2008-09-10 23:30:57 +00:00
Dan Gohman
28c911b79b Make SimplifyDemandedVectorElts simplify vectors with multiple
users, and teach it about shufflevector instructions.

Also, fix a subtle bug in SimplifyDemandedVectorElts'
insertelement code.

This is a patch that was originally written by Eli Friedman,
with some fixes and cleanup by me.

llvm-svn: 55995
2008-09-09 18:11:14 +00:00
Nick Lewycky
57cebeaeba Don't crash when trying to constant fold a vector with some elements that can't
be folded. Instead, fail to fold the entire vector.

We could also return a vector with some elements folded and some not. If anyone
thinks that's a better approach, please speak up!

llvm-svn: 55689
2008-09-03 05:54:33 +00:00
Nick Lewycky
7b87c4d8a4 Revert r54876 r54877 r54906 and r54907. Evan found that these caused a 20%
slowdown in bzip2.

llvm-svn: 55113
2008-08-21 05:56:10 +00:00
Nick Lewycky
30a0ad8900 Consider the case where xor by -1 and xor by 128 have been combined already to
produce an xor by 127.

llvm-svn: 54906
2008-08-17 19:58:24 +00:00
Nick Lewycky
205be593b8 Xor'ing both sides of icmp by sign-bit is equivalent to swapping signedness of
the predicate.

Also, make this optz'n apply in more cases where it's safe to do so.

llvm-svn: 54876
2008-08-17 07:34:14 +00:00
Owen Anderson
42c1d6c7b9 Remove GCSE and LoadVN from the testsuite.
llvm-svn: 54832
2008-08-16 00:00:54 +00:00
Dan Gohman
db5b503d60 Fix a bogus srem rule - a negative value srem'd by a power-of-2
can have a non-negative result; for example, -16%16 is 0. Also,
clarify the related comments. This fixes PR2670.

llvm-svn: 54767
2008-08-13 23:12:35 +00:00
Chris Lattner
ae09ade343 Implement support for simplifying vector comparisons by 0.0 and 1.0 like we
do for scalars.  Patch contributed by Nicolas Capens

This also generalizes the previous xforms to work on long double, now that 
isExactlyValue works for long double.

llvm-svn: 54653
2008-08-11 22:06:05 +00:00
Dan Gohman
4ad77e1ca2 Fix a shufflevector instcombine that was emitting invalid masks indices
when it meant to be emitting undef indices.

llvm-svn: 54417
2008-08-06 18:17:32 +00:00
Chris Lattner
55b99a6739 optimize a common idiom generated by clang for bitfield access, PR2638.
llvm-svn: 54408
2008-08-06 07:35:52 +00:00
Chris Lattner
cae04940bd Zap sitofp/fptoui pairs. In all cases when the sign difference
matters, the result is undefined anyway.

llvm-svn: 54396
2008-08-06 05:13:06 +00:00
Nick Lewycky
0bf3c812d2 Reinstate this optimization, but without the miscompile. Thanks to Bill for
tracking down that this was breaking llvm-gcc bootstrap on Linux.

llvm-svn: 54394
2008-08-06 04:54:03 +00:00
Bill Wendling
1854852a75 Just grep for through the LL code instead of the ASM code
llvm-svn: 54389
2008-08-06 00:10:32 +00:00
Bill Wendling
aea14c2dfe Add default architecture.
llvm-svn: 54384
2008-08-05 23:36:00 +00:00
Bill Wendling
f69c83e554 Testcase for PR2629.
llvm-svn: 54377
2008-08-05 22:23:59 +00:00
Bill Wendling
3882f060ef Revert r53282. This was causing a miscompile on Linux. Also, the transformation
looks bogus. Please see PR2629 for details on why this is breaking things.

llvm-svn: 54372
2008-08-05 21:23:45 +00:00
Chris Lattner
eccd57d118 Fix PR2553
llvm-svn: 53715
2008-07-17 06:07:20 +00:00
Matthijs Kooijman
c05651e3ce Add a few cases to instcombine's extractvalue testcase.
llvm-svn: 53675
2008-07-16 12:57:25 +00:00
Evan Cheng
7218339189 Fix PR2296. Do not transform x86_sse2_storel_dq into a full-width store.
llvm-svn: 53666
2008-07-16 07:28:14 +00:00
Chris Lattner
14faada3a3 Fix PR2506 by being a bit more careful about reverse fact propagation when
disproving a condition.  This actually compiles the existing testcase
(udiv_select_to_select_shift) to:

define i64 @test(i64 %X, i1 %Cond) {
entry:
	%divisor1.t = lshr i64 %X, 3		; <i64> [#uses=1]
	%quotient2 = lshr i64 %X, 3		; <i64> [#uses=1]
	%sum = add i64 %divisor1.t, %quotient2		; <i64> [#uses=1]
	ret i64 %sum
}

instead of:

define i64 @test(i64 %X, i1 %Cond) {
entry:
	%quotient1.v = select i1 %Cond, i64 3, i64 4		; <i64> [#uses=1]
	%quotient1 = lshr i64 %X, %quotient1.v		; <i64> [#uses=1]
	%quotient2 = lshr i64 %X, 3		; <i64> [#uses=1]
	%sum = add i64 %quotient1, %quotient2		; <i64> [#uses=1]
	ret i64 %sum
}

llvm-svn: 53534
2008-07-14 00:15:52 +00:00
Nick Lewycky
3fb5816774 Enhance analysis of srem.
Remove dead code analyzing urem. 'urem' of power-of-2 is canonicalized to an
'and' instruction.

llvm-svn: 53506
2008-07-12 05:04:38 +00:00
Nick Lewycky
8cd0f2058e Add another optimization from PR2330. Also catch some missing cases that are
similar.

llvm-svn: 53451
2008-07-11 07:20:53 +00:00
Chris Lattner
16b8ae98c1 Fix folding of icmp's of i1 where the comparison is signed. The code
was using the algorithm for folding unsigned comparisons which is
completely wrong.  This has been broken since the signless types change.

llvm-svn: 53444
2008-07-11 04:20:58 +00:00
Chris Lattner
f3f6b6d7af Fix a bogus optimization: folding (slt (zext i1 A to i32), 1) -> (slt i1 A, true)
This cause a regression in InstCombine/JavaCompare, which was doing the right
thing on accident.  To handle the missed case, generalize the comparisons based
on masked bits a little bit to handle comparisons against the max value. For 
example, we can now xform (slt i32 (and X, 4), 4) -> (setne i32 (and X, 4), 4)

llvm-svn: 53443
2008-07-11 04:09:09 +00:00
Chris Lattner
43a2b1b16d make this condition more precise.
llvm-svn: 53442
2008-07-11 03:54:57 +00:00
Nick Lewycky
26ccb8e9a8 Fix overzealous optimization. Thanks to Duncan Sands for pointing out my error!
llvm-svn: 53393
2008-07-10 05:51:40 +00:00
Nick Lewycky
6341c5a7ec Fold (a < 8) && (b < 8) into (a|b) < 8 for unsigned less or greater than.
llvm-svn: 53282
2008-07-09 07:29:11 +00:00
Nick Lewycky
38fa84fa12 Fold ((1 << a) & 1) to (a == 0).
llvm-svn: 53276
2008-07-09 05:20:13 +00:00
Chris Lattner
1a2c55201e Fix a broken test. Neither load is eliminable without changing the CFG.
llvm-svn: 53273
2008-07-09 05:01:02 +00:00
Nick Lewycky
2a6469c9a5 Reduce x - y to -y when we know the 'x' part will get masked off anyways.
llvm-svn: 53271
2008-07-09 04:32:37 +00:00
Chris Lattner
7f0adf0b34 new testcase for PR2496
llvm-svn: 53239
2008-07-08 17:18:05 +00:00
Nick Lewycky
94f9c5a42e Fix missed optimization opportunity when analyzing cast of mul and select.
llvm-svn: 53151
2008-07-05 21:19:34 +00:00
Chris Lattner
73b52018e9 Fix PR2488, a case where we deleted stack restores too aggressively.
llvm-svn: 52702
2008-06-25 05:59:28 +00:00
Eli Friedman
369401ef95 Fix for PR2479: correctly optimize expressions like (a > 13) & (a ==
15).

See also PR1800, which is about the signed case.

llvm-svn: 52608
2008-06-21 23:36:13 +00:00
Chris Lattner
e588f546c5 Fix PR2471, which is a bug involving an invalid promotion from a conditional load.
llvm-svn: 52525
2008-06-20 05:12:56 +00:00
Chris Lattner
93da79f7a1 implement some simple bswap optimizations, rdar://5992453
llvm-svn: 52442
2008-06-18 04:33:20 +00:00
Chris Lattner
7e403da191 make truncate/sext elimination capable of changing phi's. This
implements rdar://6013816 and the testcase in Transforms/InstCombine/sext-misc.ll.

llvm-svn: 52440
2008-06-18 04:00:49 +00:00
Matthijs Kooijman
bdd5cae51c Make testcase check for extractvalue instead of extractelement.
llvm-svn: 52317
2008-06-16 13:03:44 +00:00
Eli Friedman
88cdc65941 Remove unnecessary target lines.
llvm-svn: 52261
2008-06-13 22:12:16 +00:00
Eli Friedman
46782c75fe Remove unnecessary target lines.
llvm-svn: 52260
2008-06-13 22:10:32 +00:00
Eli Friedman
11d4c94933 Don't skip over instructions other than loads that might read memory
when trying to sink stores.

llvm-svn: 52259
2008-06-13 22:02:12 +00:00
Eli Friedman
d38a639deb Make sure SimplifyStoreAtEndOfBlock doesn't mess with loops; the
structure checks are incorrect if the blocks aren't distinct.
Fixes PR2435.

llvm-svn: 52257
2008-06-13 21:17:49 +00:00
Matthijs Kooijman
0f9df32e12 Teach instruction combining about the extractvalue. It can succesfully fold
useless insert-extract chains, similar to how it folds them for vectors.

Add a testcase for this.

llvm-svn: 52217
2008-06-11 14:05:05 +00:00
Matthijs Kooijman
3488e4542b Ignore stderr for some more tests that expect warnings there.
This fixes 2 testcases.

llvm-svn: 52184
2008-06-10 16:13:38 +00:00
Dan Gohman
9eace09bfa Fix two more not-grep tests that were missing llvm-dis.
llvm-svn: 52159
2008-06-09 22:36:45 +00:00
Chris Lattner
4a896996cb Limit the icmp+phi merging optimization to the cases where it is profitable:
don't make i1 phis when it won't be possible to eliminate them.

llvm-svn: 52097
2008-06-08 20:52:11 +00:00
Zhou Sheng
d7b035ee2b Add a test case for opt -instcombine bug fix in revision 52003.
llvm-svn: 52004
2008-06-05 14:25:11 +00:00
Duncan Sands
d14212a3e1 When simplifying a call to a bitcast function, tighten up
the conditions for performing the transform when only the
function declaration is available: no longer allow turning
i32 into i64 for example.  Only allow changing between
pointer types, and between pointer types and integers of
the same size.  For return values ptr -> intptr was already
allowed; I added ptr -> ptr and intptr -> ptr while there.
As shown by a recent objc testcase, changing the way
parameters/return values are passed can be fatal when calling
code written in assembler that directly manipulates call
arguments and return values unless the transform has no
impact on the way they are passed at the codegen level.
While it is possible to imagine an ABI that treats integers
of pointer size differently to pointers, I don't think LLVM
supports any so the transform should now be safe while still
being useful.

llvm-svn: 51834
2008-06-01 07:38:42 +00:00
Nick Lewycky
1bcd80adf7 Peer through sext/zext when looking for not(cmp).
llvm-svn: 51819
2008-05-31 19:01:33 +00:00
Nick Lewycky
b30afdb62b Add more i1 optimizations. add, sub, mul, s/udiv on i1 are now simplified away.
llvm-svn: 51817
2008-05-31 17:59:52 +00:00
Nick Lewycky
cdcdcddc85 Adding i1 is always Xor.
llvm-svn: 51816
2008-05-31 17:10:28 +00:00
Chris Lattner
7a7da4f9c3 Implement PR2370: memmove(x,x,size) -> noop.
llvm-svn: 51636
2008-05-28 05:30:41 +00:00
Nick Lewycky
744dad8004 "ret (constexpr)" can't be folded into a Constant. Add a method to
Analysis/ConstantFolding to fold ConstantExpr's, then make instcombine use it
to try to use targetdata to fold constant expressions on void instructions.

Also extend the icmp(inttoptr, inttoptr) folding to handle the case where
int size != ptr size.

llvm-svn: 51559
2008-05-25 20:56:15 +00:00
Chris Lattner
3def8b4e53 Fix a serious brain-o. Obviously no-one reviewed my patch :(
This fixes PR2359

llvm-svn: 51536
2008-05-24 04:06:28 +00:00
Nick Lewycky
6a16ace643 Constant integer vectors may also be negated.
llvm-svn: 51476
2008-05-23 04:54:45 +00:00
Nick Lewycky
bd2da8098d Revert X + X --> X * 2 optz'n which pessimizes heavily on x86.
llvm-svn: 51474
2008-05-23 04:34:58 +00:00
Nick Lewycky
427209006f Implement X + X for vectors.
llvm-svn: 51472
2008-05-23 04:14:51 +00:00
Nick Lewycky
e62259c369 Fix a recently added optimization to not crash on vectors.
llvm-svn: 51471
2008-05-23 03:26:47 +00:00
Dan Gohman
67e1a58e22 Generalize the new code in instcombine's ComputeNumSignBits for handling
and/or to handle more cases (such as this add-sitofp.ll testcase), and
port it to selectiondag's ComputeNumSignBits.

llvm-svn: 51469
2008-05-23 02:28:01 +00:00
Gabor Greif
b03785f0cd Eliminate questionable syntax for stdin redirection. This probably also speeds things up a bit.
llvm-svn: 51357
2008-05-20 22:07:21 +00:00
Dan Gohman
7d78d53d2a Oops, commit the version of this test that actually works.
llvm-svn: 51351
2008-05-20 21:19:36 +00:00
Dan Gohman
b48d4a75f6 Port SelectionDAG's ComputeNumSignBits-using code to instcombine,
now that instcombine also has ComputeNumSignBits.

llvm-svn: 51350
2008-05-20 21:01:12 +00:00
Gabor Greif
807c2df887 sabre brings to my attention that the 'tr' suffix is also obsolete
llvm-svn: 51349
2008-05-20 21:00:03 +00:00
Gabor Greif
d8a4dbb5da Rename the last test with .llx extension to .ll, resolve duplicate test by renaming to isnan2. Now that no test has llx ending there is no need to search for them from dg.exp too.
llvm-svn: 51328
2008-05-20 19:52:04 +00:00
Chris Lattner
b387fd90fc Teach instcombine 4 new xforms:
(add (sext x), cst) --> (sext (add x, cst'))
  (add (sext x), (sext y)) --> (sext (add int x, y))
  (add double (sitofp x), fpcst) --> (sitofp (add int x, intcst))
  (add double (sitofp x), (sitofp y)) --> (sitofp (add int x, y))

This generally reduces conversions.  For example MiBench/telecomm-gsm
gets these simplifications:

HACK2: 	%tmp67.i142.i.i = sext i16 %tmp6.i141.i.i to i32		; <i32> [#uses=1]
	%tmp23.i139.i.i = sext i16 %tmp2.i138.i.i to i32		; <i32> [#uses=1]
	%tmp8.i143.i.i = add i32 %tmp67.i142.i.i, %tmp23.i139.i.i		; <i32> [#uses=3]
HACK2: 	%tmp67.i121.i.i = sext i16 %tmp6.i120.i.i to i32		; <i32> [#uses=1]
	%tmp23.i118.i.i = sext i16 %tmp2.i117.i.i to i32		; <i32> [#uses=1]
	%tmp8.i122.i.i = add i32 %tmp67.i121.i.i, %tmp23.i118.i.i		; <i32> [#uses=3]
HACK2: 	%tmp67.i.i190.i = sext i16 %tmp6.i.i189.i to i32		; <i32> [#uses=1]
	%tmp23.i.i187.i = sext i16 %tmp2.i.i186.i to i32		; <i32> [#uses=1]
	%tmp8.i.i191.i = add i32 %tmp67.i.i190.i, %tmp23.i.i187.i		; <i32> [#uses=3]
HACK2: 	%tmp67.i173.i.i.i = sext i16 %tmp6.i172.i.i.i to i32		; <i32> [#uses=1]
	%tmp23.i170.i.i.i = sext i16 %tmp2.i169.i.i.i to i32		; <i32> [#uses=1]
	%tmp8.i174.i.i.i = add i32 %tmp67.i173.i.i.i, %tmp23.i170.i.i.i		; <i32> [#uses=3]
HACK2: 	%tmp67.i152.i.i.i = sext i16 %tmp6.i151.i.i.i to i32		; <i32> [#uses=1]
	%tmp23.i149.i.i.i = sext i16 %tmp2.i148.i.i.i to i32		; <i32> [#uses=1]
	%tmp8.i153.i.i.i = add i32 %tmp67.i152.i.i.i, %tmp23.i149.i.i.i		; <i32> [#uses=3]
HACK2: 	%tmp67.i.i.i.i = sext i16 %tmp6.i.i.i.i to i32		; <i32> [#uses=1]
	%tmp23.i.i5.i.i = sext i16 %tmp2.i.i.i.i to i32		; <i32> [#uses=1]
	%tmp8.i.i7.i.i = add i32 %tmp67.i.i.i.i, %tmp23.i.i5.i.i		; <i32> [#uses=3]


This also fixes a bug in ComputeNumSignBits handling select and
makes it more aggressive with and/or.

llvm-svn: 51302
2008-05-20 05:46:13 +00:00
Chris Lattner
63c384df1e convert fptosi(sitofp x) -> x if the fp value has enough bits in its mantissa
to accurately represent the integer.  This triggers 9 times in 471.omnetpp,
though 8 of those seem to be inlined from the same place.

llvm-svn: 51271
2008-05-19 20:25:04 +00:00
Chris Lattner
1435b94f62 Fold FP comparisons where one operand is converted from an integer
type and the other operand is a constant into integer comparisons.
This happens surprisingly frequently (e.g. 10 times in 471.omnetpp),
which are things like this:

	%tmp8283 = sitofp i32 %tmp82 to double	
	%tmp1013 = fcmp ult double %tmp8283, 0.0

Clearly comparing tmp82 against i32 0 is cheaper here.

this also triggers 8 times in gobmk, including this one:

	%tmp375376 = sitofp i32 %tmp375 to double
	%tmp377 = fcmp ogt double %tmp375376, 8.150000e+01

which is comparing an integer against 81.5 :).

llvm-svn: 51268
2008-05-19 20:18:56 +00:00
Chris Lattner
510a6b249c be more aggressive about transforming add -> or when the operands have no
intersecting bits.  This triggers all over the place, for example in lencode,
with adds of stuff like:

	%tmp580 = mul i32 %tmp579, 2	
	%tmp582 = and i32 %b8, 1
and

	%tmp28 = shl i32 %abs.i, 1		
	%sign.0 = select i1 %tmp23, i32 1, i32 0
and
	%tmp344 = shl i32 %tmp343, 2	
	%tmp346 = and i32 %tmp96, 3

etc.

llvm-svn: 51263
2008-05-19 20:01:56 +00:00
Chris Lattner
8c0f0a0e6c Fix PR2339
llvm-svn: 51226
2008-05-18 04:11:26 +00:00
Chris Lattner
8871489ae7 remove empty file?
llvm-svn: 51225
2008-05-18 04:10:18 +00:00
Nick Lewycky
46e3a168c0 Revert constant-folding change that will miscompile in some cases.
llvm-svn: 51223
2008-05-17 19:00:05 +00:00
Nick Lewycky
1df40102a9 Constant fold inttoptr and ptrtoint.
llvm-svn: 51216
2008-05-17 09:03:26 +00:00
Chris Lattner
00e8e1e258 implement PR2328.
llvm-svn: 51176
2008-05-16 02:59:42 +00:00
Bill Wendling
c1d9f9604b Situations can arise when you have a function called that returns a 'void', but
is bitcast to return a floating point value. The result of the instruction may
not be used by the program afterwards, and LLVM will happily remove all
instructions except the call. But, on some platforms, if a value is returned as
a floating point, it may need to be removed from the stack (like x87). Thus, we
can't get rid of the bitcast even if there isn't a use of the value.

llvm-svn: 51134
2008-05-14 22:45:20 +00:00
Duncan Sands
15622620d3 Testcase for PR2303.
llvm-svn: 50951
2008-05-10 16:43:10 +00:00
Chris Lattner
02ca137915 Implement PR2298. This transforms:
~x < ~y --> y < x
   -x == -y --> x == y

llvm-svn: 50882
2008-05-09 05:19:28 +00:00
Chris Lattner
4c1ef3628b More than just loads can read from memory: readonly calls like strlen
also need to be checked for memory modifying instructions before we
can sink them.  THis fixes the second half of PR2297.

llvm-svn: 50860
2008-05-08 17:37:37 +00:00
Chris Lattner
cba8b4c7e8 Make instcombine's DSE respect loads as well as stores. It is not safe to
delete the first store in:

store x -> p
load p
store y -> p

This is for PR2297.

llvm-svn: 50859
2008-05-08 17:20:30 +00:00
Dan Gohman
6ea87fa437 Fix a bug in the ComputeMaskedBits logic for multiply.
llvm-svn: 50793
2008-05-07 00:35:55 +00:00
Dan Gohman
faf9df7227 Correct the value of LowBits in srem and urem handling in
ComputeMaskedBits.

llvm-svn: 50692
2008-05-06 00:51:48 +00:00
Dan Gohman
27156711ef Fix a mistake in the computation of leading zeros for udiv.
llvm-svn: 50591
2008-05-02 21:30:02 +00:00
Dan Gohman
04e2b94842 Update old-style syntax in some "not grep" tests.
llvm-svn: 50560
2008-05-01 23:50:07 +00:00
Dan Gohman
793c9fed45 Fix an overaggressive SimplifyDemandedBits optimization on urem. This
fixes the 254.gap regression on x86 and the 403.gcc regression on x86-64.

llvm-svn: 50537
2008-05-01 19:13:24 +00:00
Chris Lattner
d4bf588b85 move some tests from libcall optimizer suite.
llvm-svn: 50516
2008-05-01 06:13:48 +00:00
Chris Lattner
15195e00ee move lowering of llvm.memset -> store from simplify libcalls
to instcombine.

llvm-svn: 50472
2008-04-30 06:39:11 +00:00
Chris Lattner
5bd55b0885 don't eliminate load from volatile value on paths where the load is dead.
This fixes the second half of PR2262

llvm-svn: 50430
2008-04-29 17:28:22 +00:00
Chris Lattner
4b5d48a3f0 make this test reduced and *valid*
llvm-svn: 50429
2008-04-29 17:25:32 +00:00
Chris Lattner
7099f3c400 fix a subtle volatile handling bug.
llvm-svn: 50428
2008-04-29 17:13:43 +00:00
Chris Lattner
51fe8415da don't delete the last store to an alloca if the store is volatile.
llvm-svn: 50390
2008-04-29 04:58:38 +00:00
Dan Gohman
1b7238e6e4 Teach InstCombine's ComputeMaskedBits what SelectionDAG's
ComputeMaskedBits knows about cttz, ctlz, and ctpop. Teach
SelectionDAG's ComputeMaskedBits what InstCombine's knows
about SRem. And teach them both some things about high bits
in Mul, UDiv, URem, and Sub. This allows instcombine and
dagcombine to eliminate sign-extension operations in
several new cases.

llvm-svn: 50358
2008-04-28 17:02:21 +00:00