1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-25 12:12:47 +01:00
Commit Graph

6626 Commits

Author SHA1 Message Date
Dan Gohman
3691dbc5cf Add some debug output to LoopSimplify.
llvm-svn: 97458
2010-03-01 17:55:27 +00:00
Dan Gohman
c08c48c9f3 Spelling fixes.
llvm-svn: 97453
2010-03-01 17:49:51 +00:00
Dan Gohman
25af6486ae Prune #includes.
llvm-svn: 97448
2010-03-01 17:42:17 +00:00
Bob Wilson
6d8360b3d8 Revert r97245 which seems to be causing performance problems.
llvm-svn: 97366
2010-02-28 05:34:05 +00:00
Chris Lattner
ca018ea20b fix grammaro's pointed out by daniel
llvm-svn: 97313
2010-02-27 07:50:40 +00:00
Chris Lattner
93fd3ddf24 fix PR6414, a nondeterminism issue in IPSCCP which was because
of a subtle interation in a loop operating in densemap order.

llvm-svn: 97288
2010-02-27 00:07:42 +00:00
Chris Lattner
8e0bfd5ce0 Fix rdar://7694996 a miscompile of 183.equake from my patch yesterday,
confusing the old MAT variable with the new GlobalType one.  This caused
us to promote the @disp global pointer into:

@disp.body = internal global double*** undef

instead of:

@disp.body = internal global [3 x double**] undef

llvm-svn: 97285
2010-02-26 23:42:13 +00:00
Chris Lattner
995bac5839 remove dead code, by this point all uses of CI are gone.
llvm-svn: 97283
2010-02-26 23:35:25 +00:00
Bob Wilson
139fdbd4d2 Move the EnableFullLoadPRE flag from a separate command-line option to an
argument of createGVNPass and set it automatically for -O3.

llvm-svn: 97245
2010-02-26 19:09:47 +00:00
Bob Wilson
cf20a07501 Remove unused "NoPRE" parameter in GVN and createGVNPass().
llvm-svn: 97235
2010-02-26 18:35:19 +00:00
Chris Lattner
3832b527d8 fix PR6435 another bug from the MallocInst elimination work.
llvm-svn: 97231
2010-02-26 18:23:13 +00:00
Chris Lattner
8cb1dc746d rewrite OptimizeGlobalAddressOfMalloc to fix PR6422, some bugs
introduced when mallocinst was eliminated. 

llvm-svn: 97178
2010-02-25 22:33:52 +00:00
Dan Gohman
52ed61204b Make LoopSimplify change conditional branches in loop exiting blocks
which branch on undef to branch on a boolean constant for the edge
exiting the loop. This helps ScalarEvolution compute trip counts for
loops.

Teach ScalarEvolution to recognize single-value PHIs, when safe, and
ForgetSymbolicName to forget such single-value PHI nodes as apprpriate
in ForgetSymbolicName.

llvm-svn: 97126
2010-02-25 06:57:05 +00:00
Nick Lewycky
e0601cb7f8 Modernize comment.
llvm-svn: 97121
2010-02-25 06:39:10 +00:00
Nick Lewycky
772e9deb48 Correct whitespace.
llvm-svn: 97120
2010-02-25 06:38:51 +00:00
Daniel Dunbar
8adb7774dc Reapply r97010, the speculative revert failed.
llvm-svn: 97036
2010-02-24 08:48:04 +00:00
Daniel Dunbar
8b33ce41d0 Speculatively revert r97010, "Add an argument to PHITranslateValue to specify
the DominatorTree. ...", in hopes of restoring poor old PPC bootstrap.

llvm-svn: 97027
2010-02-24 06:55:22 +00:00
Dan Gohman
c97d42a5d5 Fix indentation.
llvm-svn: 97024
2010-02-24 06:46:09 +00:00
Bob Wilson
437d85dd57 Add an argument to PHITranslateValue to specify the DominatorTree. If this
argument is non-null, pass it along to PHITranslateSubExpr so that it can
prefer using existing values that dominate the PredBB, instead of just
blindly picking the first equivalent value that it finds on a uselist.
Also when the DominatorTree is specified, have PHITranslateValue filter
out any result that does not dominate the PredBB.  This is basically just
refactoring the check that used to be in GetAvailablePHITranslatedSubExpr
and also in GVN.

Despite my initial expectations, this change does not affect the results
of GVN for any testcases that I could find, but it should help compile time.
Before this change, if PHITranslateSubExpr picked a value that does not
dominate, PHITranslateWithInsertion would then insert a new value, which GVN
would later determine to be redundant and would replace.  By picking a good
value to begin with, we save GVN the extra work of inserting and then
replacing a new value.

llvm-svn: 97010
2010-02-24 01:39:00 +00:00
Dan Gohman
b3bf4992c5 Don't do (X != Y) ? X : Y -> X for floating-point values; it doesn't
handle NaN properly.

Do (X une Y) ? X : Y  -> X if one of X and Y is not zero.

llvm-svn: 96955
2010-02-23 17:17:57 +00:00
Bob Wilson
b3640f8018 Update memdep when load PRE inserts a new load, and add some debug output.
I don't have a small testcase for this.

llvm-svn: 96890
2010-02-23 05:55:00 +00:00
Evan Cheng
d9816ef946 Instcombine constant folding can normalize gep with negative index to index with large offset. When instcombine objsize checking transformation sees these geps where the offset seemingly point out of bound, it should just return "i don't know" rather than asserting.
llvm-svn: 96825
2010-02-22 23:34:00 +00:00
Bob Wilson
e075edf725 Erase deleted instructions from GVN's ValueTable. This fixes assertion
failures from ValueTable::verifyRemoved() when using -debug.

llvm-svn: 96805
2010-02-22 21:39:41 +00:00
Dan Gohman
2575392bf1 Remove unused variables and parameters.
llvm-svn: 96780
2010-02-22 04:11:59 +00:00
Dan Gohman
c44dee5fbd When emitting an instruction which depends on both a post-incremented
induction variable value and a loop-variant value, don't force the
insert position to be at the post-increment position, because it may
not be dominated by the loop-variant value. This fixes a
use-before-def problem noticed on PPC.

llvm-svn: 96774
2010-02-22 03:59:54 +00:00
Dan Gohman
8c51905154 This cast<Instruction> is unnecessary.
llvm-svn: 96771
2010-02-22 02:07:36 +00:00
Dan Gohman
220c2212c1 Rename getSDiv to getExactSDiv to reflect its behavior in cases where
the division would have a remainder.

llvm-svn: 96693
2010-02-19 19:35:48 +00:00
Dan Gohman
9db0689627 Check for overflow when scaling up an add or an addrec for
scaled reuse.

llvm-svn: 96692
2010-02-19 19:32:49 +00:00
Dale Johannesen
0f58c73229 recommit 96626, evidence that it broke things appears
to be spurious

llvm-svn: 96662
2010-02-19 07:14:22 +00:00
Dale Johannesen
d73b81bf12 Revert 96626, which causes build failure on ppc Darwin.
llvm-svn: 96653
2010-02-19 01:54:37 +00:00
Dan Gohman
58199e30dc When determining the set of interesting reuse factors, consider
strides in foreign loops. This helps locate reuse opportunities
with existing induction variables in foreign loops and reduces
the need for inserting new ones. This fixes rdar://7657764.

llvm-svn: 96629
2010-02-19 00:05:23 +00:00
Dan Gohman
2f0615ee5e Indvars needs to explicitly notify ScalarEvolution when it is replacing
a loop exit value, so that if a loop gets deleted, ScalarEvolution
isn't stick holding on to dangling SCEVAddRecExprs for that loop. This
fixes PR6339.

llvm-svn: 96626
2010-02-18 23:26:33 +00:00
Dan Gohman
f0d7b5df4b Hoist this loop-invariant logic out of the loop.
llvm-svn: 96614
2010-02-18 21:34:02 +00:00
Dan Gohman
82f46afcdd Delete some unneeded casts.
llvm-svn: 96429
2010-02-17 00:42:19 +00:00
Dan Gohman
493a1fcbe0 Don't attempt to divide INT_MIN by -1; consider such cases to
have overflowed.

llvm-svn: 96428
2010-02-17 00:41:53 +00:00
Bob Wilson
86dded571f Rename SuccessorNumber to GetSuccessorNumber.
llvm-svn: 96387
2010-02-16 21:06:42 +00:00
Dan Gohman
e2da5181db Refactor rewriting for PHI nodes into a separate function.
llvm-svn: 96382
2010-02-16 20:25:07 +00:00
Bob Wilson
a866c660db Split critical edges as needed for load PRE.
llvm-svn: 96378
2010-02-16 19:51:59 +00:00
Bob Wilson
1115365fcf Refactor to share code to find the position of a basic block successor in the
terminator's list of successors.

llvm-svn: 96377
2010-02-16 19:49:17 +00:00
Dan Gohman
815d966a5a Fix whitespace.
llvm-svn: 96372
2010-02-16 19:42:34 +00:00
Duncan Sands
1b33dd3c83 There are two ways of checking for a given type, for example isa<PointerType>(T)
and T->isPointerTy().  Convert most instances of the first form to the second form.
Requested by Chris.

llvm-svn: 96344
2010-02-16 11:11:14 +00:00
Dan Gohman
d19ecedc40 Split the main for-each-use loop again, this time for GenerateTruncates,
as it also peeks at which registers are being used by other uses. This
makes LSR less sensitive to use-list order.

llvm-svn: 96308
2010-02-16 01:42:53 +00:00
Chris Lattner
a8505609fe fix PR6305 by handling BlockAddress in a helper function
called by jump threading.

llvm-svn: 96263
2010-02-15 20:47:49 +00:00
Duncan Sands
2acaf3609c Uniformize the names of type predicates: rather than having isFloatTy and
isInteger, we now have isFloatTy and isIntegerTy.  Requested by Chris!

llvm-svn: 96223
2010-02-15 16:12:20 +00:00
Dan Gohman
c954fc3333 Fix whitespace.
llvm-svn: 96179
2010-02-14 18:51:39 +00:00
Dan Gohman
900be69752 Fix a comment.
llvm-svn: 96178
2010-02-14 18:51:20 +00:00
Dan Gohman
de32b2fac0 When complicated expressions are broken down into subexpressions
with multiplication by constants distributed through, occasionally
those subexpressions can include both x and -x. For now, if this
condition is discovered within LSR, just prune such cases away,
as they won't be profitable. This fixes a "zero allocated in a
base register" assertion failure.

llvm-svn: 96177
2010-02-14 18:50:49 +00:00
Dan Gohman
dbffcb6c92 Actually, this code doesn't have to be quite so conservative in
the no-TLI case. But it should still default to declining the
transformation.

llvm-svn: 96152
2010-02-14 03:21:49 +00:00
Dan Gohman
3b065f43af Don't attempt aggressive post-inc uses if TargetLowering is not available,
because profitability can't be sufficiently approximated.

llvm-svn: 96148
2010-02-14 02:45:21 +00:00
John McCall
5089836939 Make LSR not crash if invoked without target lowering info, e.g. if invoked
from opt.

llvm-svn: 96135
2010-02-13 23:40:16 +00:00
Eric Christopher
96f3c4222f Fix a problem where we had bitcasted operands that gave us
odd offsets since the bitcasted pointer size and the offset pointer
size are going to be different types for the GEP vs base object.

llvm-svn: 96134
2010-02-13 23:38:01 +00:00
Chris Lattner
23368e55d7 remove dead code.
llvm-svn: 96109
2010-02-13 19:07:06 +00:00
Chris Lattner
0874dc6dab Split some code out to a helper function (FindReusablePredBB)
and add a doxygen comment.

Cache the phi entry to avoid doing tons of 
PHINode::getBasicBlockIndex calls in the common case.

On my insane testcase from re2c, this speeds up CGP from
617.4s to 7.9s (78x).

llvm-svn: 96083
2010-02-13 05:35:08 +00:00
Chris Lattner
9c797fdf1d Speed up codegen prepare from 3.58s to 0.488s.
llvm-svn: 96081
2010-02-13 05:01:14 +00:00
Chris Lattner
975da20868 PHINode::getBasicBlockIndex is O(n) in the number of inputs
to a PHI, avoid it in the common case where the BB occurs
in the same index for multiple phis.  This speeds up CGP on
an insane testcase from 8.35 to 3.58s.

llvm-svn: 96080
2010-02-13 04:24:19 +00:00
Chris Lattner
1a067d53b0 iterate over preds using PHI information when available instead of
using pred_begin/end.  It is much faster.

llvm-svn: 96079
2010-02-13 04:15:26 +00:00
Chris Lattner
9be0b06335 speed up CGP a bit by scanning predecessors through phi operands
instead of with pred_begin/end.

llvm-svn: 96078
2010-02-13 04:04:42 +00:00
Dan Gohman
25722fd3fc Fix a pruning heuristic which implicitly assumed that SmallPtrSet is
deterministically sorted.

llvm-svn: 96071
2010-02-13 02:06:02 +00:00
Jakob Stoklund Olesen
8ce1b3d280 Enable the inlinehint attribute in the Inliner.
Functions explicitly marked inline will get an inlining threshold slightly
more aggressive than the default for -O3. This means than -O3 builds are
mostly unaffected while -Os builds will be a bit bigger and faster.

The difference depends entirely on how many 'inline's are sprinkled on the
source.

In the CINT2006 suite, only these tests are significantly affected under -Os:

               Size   Time
471.omnetpp   +1.63% -1.85%
473.astar     +4.01% -6.02%
483.xalancbmk +4.60%  0.00%

Note that 483.xalancbmk runs too quickly to give useful timing results.

llvm-svn: 96066
2010-02-13 01:51:53 +00:00
Dan Gohman
cbf5771473 Reapply 95979, a compile-time speedup, now that the bug it exposed is fixed.
llvm-svn: 96005
2010-02-12 19:35:25 +00:00
Dan Gohman
006952d39d Fix this code to avoid dereferencing an end() iterator in
offset distributions it doesn't expect.

llvm-svn: 96002
2010-02-12 19:20:37 +00:00
Chris Lattner
ecb203898a 1. modernize the constantmerge pass, using densemap/smallvector.
2. don't bother trying to merge globals in non-default sections,
   doing so is quite dubious at best anyway.
3. fix a bug reported by Arnaud de Grandmaison where we'd try to
   merge two globals in different address spaces.

llvm-svn: 95995
2010-02-12 18:17:23 +00:00
Daniel Dunbar
43ad3dfe00 Revert "Reverse the order for collecting the parts of an addrec. The order", it
is breaking llvm-gcc bootstrap.

llvm-svn: 95988
2010-02-12 17:27:08 +00:00
Dan Gohman
2f46f79492 Reverse the order for collecting the parts of an addrec. The order
doesn't matter, except that ScalarEvolution tends to need less time
to fold the results this way.

llvm-svn: 95979
2010-02-12 11:08:26 +00:00
Dan Gohman
c40eb525ad Reapply the new LoopStrengthReduction code, with compile time and
bug fixes, and with improved heuristics for analyzing foreign-loop
addrecs.

This change also flattens IVUsers, eliminating the stride-oriented
groupings, which makes it easier to work with.

llvm-svn: 95975
2010-02-12 10:34:29 +00:00
Eric Christopher
2e0201ee18 Make sure that ConstantExpr offsets also aren't off of extern
symbols.

Thanks to Duncan Sands for the testcase!

llvm-svn: 95877
2010-02-11 17:44:04 +00:00
Chris Lattner
a59eb7c09c Rename ValueRequiresCast to ShouldOptimizeCast, to better reflect
what it does.  Enhance it to return false to optimizing vector
sign extensions from vector comparisions, which is the idiom used
to get a splatted vector for a vector comparison.

Doing this breaks vector-casts.ll, add some compensating 
transformations to handle the important case they cover without
depending on this canonicalization.

This fixes rdar://7434900 a serious pessimization of vector compares.

llvm-svn: 95855
2010-02-11 06:26:33 +00:00
Chris Lattner
a087e6e82f Make DSE only scan blocks that are reachable from the entry
block.  Other blocks may have pointer cycles that will crash
basicaa and other alias analyses.  In any case, there is no
point wasting cycles optimizing dead blocks.  This fixes 
rdar://7635088

llvm-svn: 95852
2010-02-11 05:11:54 +00:00
Chris Lattner
733ffcdb1f Make jump threading honor x|undef -> true and x&undef -> false,
instead of considering x|undef -> x, which may not be true.

llvm-svn: 95850
2010-02-11 04:40:44 +00:00
Eric Christopher
9516309f55 Add ConstantExpr handling to Intrinsic::objectsize lowering.
Update testcase accordingly now that we can optimize another
section.

llvm-svn: 95846
2010-02-11 01:48:54 +00:00
Devang Patel
e87a8a944d Ignore dbg info intrinsics.
llvm-svn: 95828
2010-02-11 00:20:49 +00:00
Devang Patel
43cc7530be Strip new llvm.dbg.value intrinsic.
llvm-svn: 95807
2010-02-10 21:19:56 +00:00
Dan Gohman
92b6122204 Fix "the the" and similar typos.
llvm-svn: 95781
2010-02-10 16:03:48 +00:00
Eric Christopher
6691a59247 Move Intrinsic::objectsize lowering back to InstCombineCalls and
enable constant 0 offset lowering.

llvm-svn: 95691
2010-02-09 21:24:27 +00:00
Eric Christopher
871cf7bce2 Pull these back out, they're a little too aggressive and time
consuming for a simple optimization.

llvm-svn: 95671
2010-02-09 17:29:18 +00:00
Chris Lattner
f6d9c1f90c simplify this code, duh.
llvm-svn: 95643
2010-02-09 01:14:06 +00:00
Chris Lattner
26b712379f fix PR6193, only considering sign extensions *from i1* for this
xform.

llvm-svn: 95642
2010-02-09 01:12:41 +00:00
Eric Christopher
1ff7f162e2 Add file in here too.
llvm-svn: 95641
2010-02-09 01:11:03 +00:00
Eric Christopher
428b385575 Add a new pass to do llvm.objsize lowering using SCEV.
Initial skeleton and SCEVUnknown lowering implemented,
the rest should come relatively quickly.  Move testcase
to new directory.

Move pass to right before SimplifyLibCalls - which is
moved down a bit so we can take advantage of a few opts.

llvm-svn: 95628
2010-02-09 00:35:38 +00:00
Chris Lattner
9635fe03ca fix some problems handling large vectors reported in PR6230
llvm-svn: 95616
2010-02-08 23:56:03 +00:00
Jakob Stoklund Olesen
83ebc265b3 Reintroduce the InlineHint function attribute.
This time it's for real! I am going to hook this up in the frontends as well.

The inliner has some experimental heuristics for dealing with the inline hint.
When given a -respect-inlinehint option, functions marked with the inline
keyword are given a threshold just above the default for -O3.

We need some experiments to determine if that is the right thing to do.

llvm-svn: 95466
2010-02-06 01:16:28 +00:00
Jakob Stoklund Olesen
7b4c60adae Don't unroll loops containing function calls.
llvm-svn: 95454
2010-02-05 23:21:31 +00:00
Jakob Stoklund Olesen
670458b3be Teach SimplifyCFG about magic pointer constants.
Weird code sometimes uses pointer constants other than null. This patch
teaches SimplifyCFG to build switch instructions in those cases.

Code like this:

void f(const char *x) {
  if (!x)
    puts("null");
  else if ((uintptr_t)x == 1)
    puts("one");
  else if (x == (char*)2 || x == (char*)3)
    puts("two");
  else if ((intptr_t)x == 4)
    puts("four");
  else
    puts(x);
}

Now becomes a switch:

define void @f(i8* %x) nounwind ssp {
entry:
  %magicptr23 = ptrtoint i8* %x to i64            ; <i64> [#uses=1]
  switch i64 %magicptr23, label %if.else16 [
    i64 0, label %if.then
    i64 1, label %if.then2
    i64 2, label %if.then9
    i64 3, label %if.then9
    i64 4, label %if.then14
  ]

Note that LLVM's own DenseMap uses magic pointers.

llvm-svn: 95439
2010-02-05 22:03:18 +00:00
Chris Lattner
44965f1107 fix logical-select to invoke filecheck right, and fix hte instcombine
xform it is checking to actually pass.  There is no need to match
m_SelectCst<0, -1> since instcombine canonicalizes that into not(sext).

Add matches for sext(not(x)) in addition to not(sext(x)).

llvm-svn: 95420
2010-02-05 19:53:02 +00:00
Dan Gohman
96cad72d2d Implement releaseMemory in CodeGenPrepare and free the BackEdges
container data. This prevents it from holding onto dangling
pointers and potentially behaving unpredictably.

llvm-svn: 95409
2010-02-05 19:24:11 +00:00
Dan Gohman
8142ba6bbb Use a SmallSetVector instead of a SetVector; this code showed up as a
malloc caller in a profile.

llvm-svn: 95407
2010-02-05 19:20:15 +00:00
Eric Christopher
f89979ce6a Remove this code for now. I have a better idea and will rewrite with
that in mind.

llvm-svn: 95402
2010-02-05 19:04:06 +00:00
Bob Wilson
c7e8107ff2 Do not reassociate expressions with i1 type. SimplifyCFG converts some
short-circuited conditions to AND/OR expressions, and those expressions
are often converted back to a short-circuited form in code gen.  The
original source order may have been optimized to take advantage of the
expected values, and if we reassociate them, we change the order and
subvert that optimization.  Radar 7497329.

llvm-svn: 95333
2010-02-04 23:32:37 +00:00
Jakob Stoklund Olesen
54b09bc819 Increase inliner thresholds by 25.
This makes the inliner about as agressive as it was before my changes to the
inliner cost calculations. These levels give the same performance and slightly
smaller code than before.

llvm-svn: 95320
2010-02-04 18:48:20 +00:00
Eric Christopher
ee4a176739 Temporarily revert this since it appears to have caused a build
failure.

llvm-svn: 95294
2010-02-04 06:41:27 +00:00
Eric Christopher
9b3e42f09e Rework constant expr and array handling for objectsize instcombining.
Fix bugs where we would compute out of bounds as in bounds, and where
we couldn't know that the linker could override the size of an array.

Add a few new testcases, change existing testcase to use a private
global array instead of extern.

llvm-svn: 95283
2010-02-04 02:55:34 +00:00
Eric Christopher
fe6ab1518e If we're dealing with a zero-length array, don't lower to any
particular size, we just don't know what the length is yet.

llvm-svn: 95266
2010-02-03 23:56:07 +00:00
Bob Wilson
8635e08b7f Adjust the heuristics used to decide when SROA is likely to be profitable.
The SRThreshold value makes perfect sense for checking if an entire aggregate
should be promoted to a scalar integer, but it is not so good for splitting
an aggregate into its separate elements.  A struct may contain a large embedded
array along with some scalar fields that would benefit from being split apart
by SROA.  Even if the total aggregate size is large, it may still be good to
perform SROA.  Thus, the most important piece of this patch is simply moving
the aggregate size comparison vs. SRThreshold so that it guards only the
aggregate promotion.

We have also been checking the number of elements to decide if an aggregate
should be split up.  The limit of "SRThreshold/4" seemed rather arbitrary,
and I don't think it's very useful to derive this limit from SRThreshold
anyway.  I've collected some data showing that the current default limit of
32 (since SRThreshold defaults to 128) is a reasonable cutoff for struct
types.  One thing suggested by the data is that distinguishing between structs
and arrays might be useful.  There are (obviously) a lot more large arrays
than large structs (as measured by the number of elements and not the total
size -- a large array inside a struct still counts as a single element given
the way we do SROA right now).  Out of 8377 arrays where we successfully
performed SROA while compiling a large set of benchmarks, only 16 of them had
more than 8 elements.  And, for those 16 arrays, it's not at all clear that
SROA was actually beneficial.  So, to offset the compile time cost of
investigating more large structs for SROA, the patch lowers the limit on array
elements to 8.

This fixes Apple Radar 7563690.

llvm-svn: 95224
2010-02-03 17:23:56 +00:00
Evan Cheng
e273e42195 Revert 94937 and move the noreturn check to codegen.
llvm-svn: 95198
2010-02-03 03:55:59 +00:00
Bob Wilson
cabb8bea4c Fix some comment typos.
llvm-svn: 95170
2010-02-03 00:33:21 +00:00
Eric Christopher
ac28e14b77 Recommit this, looks like it wasn't the cause.
llvm-svn: 95165
2010-02-03 00:21:58 +00:00
Eric Christopher
f070aae6f7 Hopefully temporarily revert this.
llvm-svn: 95154
2010-02-02 23:01:31 +00:00
Eric Christopher
048492d669 Reformat my last patch slightly.
llvm-svn: 95147
2010-02-02 22:29:26 +00:00
Eric Christopher
575fe8690d Re-add strcmp and known size object size checking optimization.
Passed bootstrap and nightly test run here.

llvm-svn: 95145
2010-02-02 22:10:43 +00:00
Chris Lattner
9f50341a96 don't turn (A & (C0?-1:0)) | (B & ~(C0?-1:0)) -> C0 ? A : B
for vectors.  Codegen is generating awful code or segfaulting
in various cases (e.g. PR6204).

llvm-svn: 95058
2010-02-02 02:43:51 +00:00
Chris Lattner
e471d94f91 fix a crash in loop unswitch on a loop invariant vector condition.
llvm-svn: 95055
2010-02-02 02:26:54 +00:00
Dan Gohman
95e0161d1b LangRef.html says that inttoptr and ptrtoint always use zero-extension
when the cast is extending.

llvm-svn: 95046
2010-02-02 01:44:02 +00:00
Eric Christopher
02559754dc Don't need to check the last argument since it'll always be bool. We also
don't use TargetData here.

llvm-svn: 95040
2010-02-02 00:51:45 +00:00
Eric Christopher
48f2aae32f More indentation/tabification fixes.
llvm-svn: 95036
2010-02-02 00:13:06 +00:00
Eric Christopher
1004836336 Untabify previous commit.
llvm-svn: 95035
2010-02-02 00:06:55 +00:00
Eric Christopher
9e9e599070 Formatting.
llvm-svn: 95027
2010-02-01 23:25:03 +00:00
Bob Wilson
1f966e8ca6 Add an option to GVN to remove all partially redundant loads. This is currently
disabled by default.  This divides the existing load PRE code into 2 phases:
first it checks that it is safe to move the load to each of the predecessors
where it is unavailable, and then if it is safe, the code is changed to move
the load.  Radar 7571861.

llvm-svn: 95007
2010-02-01 21:17:14 +00:00
Chris Lattner
333bf5abf5 cleanups.
llvm-svn: 94995
2010-02-01 19:54:45 +00:00
Chris Lattner
5f10919836 fix rdar://7590304, a miscompilation of objc apps on arm. The caller
of objc message send was getting marked arm_apcscc, but the prototype
isn't.  This is fine at runtime because objcmsgsend is implemented in
assembly.  Only turn a mismatched caller and callee into 'unreachable'
if the callee is a definition.

llvm-svn: 94986
2010-02-01 18:11:34 +00:00
Chris Lattner
a336497d3f fix rdar://7590304, an infinite loop in instcombine. In the invoke
case, instcombine can't zap the invoke for fear of changing the CFG.
However, we have to do something to prevent the next iteration of
instcombine from inserting another store -> undef before the invoke
thereby getting into infinite iteration between dead store elim and
store insertion.

Just zap the callee to null, which will prevent the next iteration
from doing anything.

llvm-svn: 94985
2010-02-01 18:04:58 +00:00
Bob Wilson
8207d33d94 Fix pr6198 by moving the isSized() check to an outer conditional.
The testcase from pr6198 does not crash for me -- I don't know what's up with
that -- so I'm not adding it to the tests.

llvm-svn: 94984
2010-02-01 17:41:44 +00:00
Eli Friedman
19c5c57885 Simplify/generalize the xor+add->sign-extend instcombine.
llvm-svn: 94943
2010-01-31 04:29:12 +00:00
Eli Friedman
58c7936637 Add a small transform: transform -(X<<Y) to (-X<<Y) when the shift has a single
use and X is free to negate.

llvm-svn: 94941
2010-01-31 02:30:23 +00:00
Evan Cheng
c2f3c20680 Do not mark no-return calls tail calls. It'll screw up special calls like longjmp and it doesn't make much sense for performance reason. If my logic is faulty, please let me know.
llvm-svn: 94937
2010-01-31 00:59:31 +00:00
Bob Wilson
0f04082970 Check alignment of loads when deciding whether it is safe to execute them
unconditionally.  Besides checking the offset, also check that the underlying
object is aligned as much as the load itself.

llvm-svn: 94875
2010-01-30 04:42:39 +00:00
Bob Wilson
e979c978e0 Use more specific types to avoid casts. No functionality change.
llvm-svn: 94863
2010-01-30 00:41:10 +00:00
Jakob Stoklund Olesen
e3fd8b5848 Keep iterating over all uses when meeting a phi node in AllUsesOfValueWillTrapIfNull().
This bug was exposed by my inliner cost changes in r94615, and caused failures
of lencod on most architectures when building with LTO.

This patch fixes lencod and 464.h264ref on x86-64 (and likely others).

llvm-svn: 94858
2010-01-29 23:54:14 +00:00
Bob Wilson
71bc5f0787 Preserve load alignment in instcombine transformations. I've been unable to
create a testcase where this matters.  The select+load transformation only
occurs when isSafeToLoadUnconditionally is true, and in those situations,
instcombine also changes the underlying objects to be aligned.  This seems
like a good idea regardless, and I've verified that it doesn't pessimize
the subsequent realignment.

llvm-svn: 94850
2010-01-29 22:39:21 +00:00
Eric Christopher
47d90f7adb Revert my last couple of patches. They appear to have broken bison.
llvm-svn: 94841
2010-01-29 21:16:24 +00:00
Bob Wilson
bb651db10d Use uint64_t instead of unsigned for offsets and sizes.
llvm-svn: 94835
2010-01-29 20:34:28 +00:00
Bob Wilson
f897b7b37e Improve isSafeToLoadUnconditionally to recognize that GEPs with constant
indices are safe if the result is known to be within the bounds of the
underlying object.

llvm-svn: 94829
2010-01-29 19:19:08 +00:00
Duncan Sands
d6baca9159 Having RHSKnownZero and RHSKnownOne be alternative names for KnownZero and KnownOne
(via APInt &RHSKnownZero = KnownZero, etc) seems dangerous and confusing to me: it
is easy not to notice this, and then wonder why KnownZero/RHSKnownZero changed
underneath you when you modified RHSKnownZero/KnownZero etc.  So get rid of this.
No intended functionality change (tested with "make check" + llvm-gcc bootstrap).

llvm-svn: 94802
2010-01-29 06:18:46 +00:00
Eric Christopher
f01379e6c2 Make strcpy_chk lower to strcpy if we have a safe size.
llvm-svn: 94783
2010-01-29 01:37:11 +00:00
Eric Christopher
7d74af1824 Add constant support to object size handling and remove default
lowering. We'll either figure it out, or not and be lowered by
SelectionDAGBuild.

Add test.

llvm-svn: 94775
2010-01-29 01:09:57 +00:00
Bill Wendling
f6736d2bae Generic reformatting and comment fixing. No functionality change.
llvm-svn: 94771
2010-01-29 00:52:43 +00:00
Bill Wendling
d115615285 Add newline to debugging output, and fix some grammar-os in comment.
llvm-svn: 94765
2010-01-29 00:27:39 +00:00
Victor Hernandez
60060d030c mem2reg erases the dbg.declare intrinsics that it converts to dbg.val intrinsics
llvm-svn: 94763
2010-01-29 00:01:35 +00:00
Duncan Sands
a3395c61b5 Fix PR6165. The bug was that LHSKnownZero was being and'd with DemandedMask
when it should have been and'd with LowBits.  Fix that and while there beef
up the logic in the case of a negative LHS.

llvm-svn: 94745
2010-01-28 17:22:42 +00:00
Bob Wilson
2e1a609654 Avoid creating redundant PHIs in SSAUpdater::GetValueInMiddleOfBlock.
This was already being done in SSAUpdater::GetValueAtEndOfBlock so I've
just changed SSAUpdater to check for existing PHIs in both places.

llvm-svn: 94690
2010-01-27 22:01:02 +00:00
Jeffrey Yasskin
fb10587e50 Kill ModuleProvider and ghost linkage by inverting the relationship between
Modules and ModuleProviders. Because the "ModuleProvider" simply materializes
GlobalValues now, and doesn't provide modules, it's renamed to
"GVMaterializer". Code that used to need a ModuleProvider to materialize
Functions can now materialize the Functions directly. Functions no longer use a
magic linkage to record that they're materializable; they simply ask the
GVMaterializer.

Because the C ABI must never change, we can't remove LLVMModuleProviderRef or
the functions that refer to it. Instead, because Module now exposes the same
functionality ModuleProvider used to, we store a Module* in any
LLVMModuleProviderRef and translate in the wrapper methods.  The bindings to
other languages still use the ModuleProvider concept.  It would probably be
worth some time to update them to follow the C++ more closely, but I don't
intend to do it.

Fixes http://llvm.org/PR5737 and http://llvm.org/PR5735.

llvm-svn: 94686
2010-01-27 20:34:15 +00:00
Benjamin Kramer
cfe07fadc7 Don't bother with sprintf, just pass the Twine through.
llvm-svn: 94684
2010-01-27 19:58:47 +00:00
Benjamin Kramer
d408373dbe Use the less expensive getName function instead of getNameStr.
llvm-svn: 94683
2010-01-27 19:46:52 +00:00
Chris Lattner
deabd1cfb6 some cleanups.
llvm-svn: 94649
2010-01-27 02:12:20 +00:00
Chris Lattner
354782f9b9 no need to check for null
llvm-svn: 94648
2010-01-27 02:04:20 +00:00
Victor Hernandez
e6321dc910 When converting dbg.declare to dbg.value, attach promoted store's debug metadata to dbg.value
llvm-svn: 94634
2010-01-27 00:44:36 +00:00
Victor Hernandez
4e7937f77b Avoid extra calls to MD->getNumOperands()
llvm-svn: 94618
2010-01-26 23:29:09 +00:00
Victor Hernandez
3a9845ecf7 Switch AllocaDbgDeclares to SmallVector and don't leak DIFactory
llvm-svn: 94567
2010-01-26 18:57:53 +00:00
Victor Hernandez
f12e8a120f In mem2reg, for all alloca/stores that get promoted where the alloca has an associated llvm.dbg.declare instrinsic, insert an llvm.dbg.var intrinsic before each store.
llvm-svn: 94493
2010-01-26 02:42:15 +00:00
Bob Wilson
e3d00b2e56 Remove check for an impossible condition: the condition of the while loop has
already checked that TmpBB->getSinglePredecessor() is non-null.

llvm-svn: 94451
2010-01-25 21:28:05 +00:00
Bob Wilson
1882fba746 Change Value::getUnderlyingObject to have the MaxLookup value specified as a
parameter with a default value, instead of just hardcoding it in the
implementation.  The limit of MaxLookup = 6 was introduced in r69151 to fix
a performance problem with O(n^2) behavior in instcombine, but the scalarrepl
pass is relying on getUnderlyingObject to go all the way back to an AllocaInst.
Making the limit part of the method signature makes it clear that by default
the result is limited and should help avoid similar problems in the future.
This fixes pr6126.

llvm-svn: 94433
2010-01-25 18:26:54 +00:00
Victor Hernandez
253c09eb8f Revert r94260 until findDbgDeclare() is made more efficient
llvm-svn: 94432
2010-01-25 17:52:13 +00:00
Chris Lattner
5a57121631 make -fno-rtti the default unless a directory builds with REQUIRES_RTTI.
llvm-svn: 94378
2010-01-24 20:43:08 +00:00
Chris Lattner
91fafbd4e8 change the canonical form of "cond ? -1 : 0" to be
"sext cond" instead of a select.  This simplifies some instcombine
code, matches the policy for zext (cond ? 1 : 0 -> zext), and allows
us to generate better code for a testcase on ppc.

llvm-svn: 94339
2010-01-24 00:09:49 +00:00
Chris Lattner
b9e618553f fix a potential overflow issue Eli pointed out.
llvm-svn: 94336
2010-01-23 23:31:46 +00:00
Nick Lewycky
f9a681fb61 Speculatively revert r94322 to see if it fixes darwin selfhost buildbot.
llvm-svn: 94331
2010-01-23 20:32:12 +00:00
Chris Lattner
b444bc0234 third bug from PR6119: the xor dupe extension allows
for arbitrary terminators in predecessors, don't assume
it is a conditional or uncond branch.  The testcase shows
an example where they can happen with switches.

llvm-svn: 94323
2010-01-23 19:21:31 +00:00
Nick Lewycky
8bbb754c7c Teach DAE that even though it can't modify the function signature of an
externally visible function, it can still find all callers of it and replace
the parameters to a dead argument with undef.

llvm-svn: 94322
2010-01-23 19:19:34 +00:00
Chris Lattner
7130788ea2 add an early out to ProcessBranchOnXOR to speed it up,
handle the case when we can infer an input to the xor
from all inputs that agree, instead of going into an
infinite loop.  Another part of PR6199

llvm-svn: 94321
2010-01-23 19:16:25 +00:00
Chris Lattner
e4391a1adb fix a crash in jump threading, PR6119
llvm-svn: 94319
2010-01-23 18:56:07 +00:00
Chris Lattner
8909d5aca5 implement a simple instcombine xform that has been in the
readme forever.

llvm-svn: 94318
2010-01-23 18:49:30 +00:00
Eric Christopher
bc8f2b1a56 Reapply 94059 while fixing the calling convention setup
for strcpy.

llvm-svn: 94287
2010-01-23 05:29:06 +00:00
Victor Hernandez
7ce9006ba0 In mem2reg, for all alloca/stores that get promoted where the alloca has an associated llvm.dbg.declare instrinsic, insert an llvm.dbg.var intrinsic before each store
llvm-svn: 94260
2010-01-23 00:17:34 +00:00
Benjamin Kramer
a56f170cc0 Another strncmp -> StringRef.startswith simplification.
llvm-svn: 94203
2010-01-22 20:00:21 +00:00
Bob Wilson
8d14b72877 Revert 94059. It is breaking the MultiSource/Benchmarks/Prolangs-C/bison
test on ARM.

llvm-svn: 94198
2010-01-22 19:16:40 +00:00
Victor Hernandez
e4d8ef8fc4 Keep ignoring pointer-to-pointer bitcasts
llvm-svn: 94194
2010-01-22 19:05:05 +00:00
Chris Lattner
276811b58a Stop building RTTI information for *most* llvm libraries. Notable
missing ones are libsupport, libsystem and libvmcore.  libvmcore is
currently blocked on bugpoint, which uses EH.  Once it stops using
EH, we can switch it off.

This #if 0's out 3 unit tests, because gtest requires RTTI information.
Suggestions welcome on how to fix this.

llvm-svn: 94164
2010-01-22 06:49:46 +00:00
Dan Gohman
525f7d7833 Revert LoopStrengthReduce.cpp to pre-r94061 for now.
llvm-svn: 94123
2010-01-22 00:46:49 +00:00
Victor Hernandez
728ab49966 No need to look through bitcasts for DbgInfoIntrinsic
llvm-svn: 94114
2010-01-21 23:09:12 +00:00
Victor Hernandez
ef9e66c909 DbgInfoIntrinsic no longer appear in an instruction's use list
llvm-svn: 94113
2010-01-21 23:08:36 +00:00
Victor Hernandez
1cc721ee56 No need to look through bitcasts for DbgInfoIntrinsic
llvm-svn: 94112
2010-01-21 23:07:15 +00:00
Victor Hernandez
d44671b931 DbgInfoIntrinsics no longer appear in an instruction's use list; so clean up looking for them in use iterations and remove OnlyUsedByDbgInfoIntrinsics()
llvm-svn: 94111
2010-01-21 23:05:53 +00:00
Dan Gohman
23ac1f6157 When inserting expressions for post-increment users which contain
loop-variant components, adds must be inserted after the increment.
Keep track of the increment position for this case, and insert
these adds in the correct location.

llvm-svn: 94110
2010-01-21 23:01:22 +00:00
Dan Gohman
2c93a3fa96 Include IVUsers information in LSR's debug output.
llvm-svn: 94108
2010-01-21 22:46:32 +00:00
Dan Gohman
3cce294b13 Prune the search for candidate formulae if the number of register
operands exceeds the number of registers used in the initial
solution, as that wouldn't lead to a profitable solution anyway.

llvm-svn: 94107
2010-01-21 22:42:49 +00:00
Dan Gohman
a4954135a7 Add a comment.
llvm-svn: 94104
2010-01-21 21:31:09 +00:00
Chris Lattner
642049e127 It turns out that this #include is needed because otherwise
ValueMapper.cpp ends up calling an out of line 
__ZNK4llvm12PATypeHolder3getEv, which is a template and llvm-config
determines arbitrarily to use the one in libipo.  This sucks, but
keeping the #include is a reasonable workaround.

llvm-svn: 94103
2010-01-21 21:29:25 +00:00
Chris Lattner
3547371b0b unbreak the build, apparently without this transformutils starts depending on libipa?
llvm-svn: 94102
2010-01-21 21:20:51 +00:00
Chris Lattner
3f91babac8 tidy up
llvm-svn: 94101
2010-01-21 21:05:54 +00:00
Victor Hernandez
507d86fcbb Don't need to include IntrinsicInst.h any more
llvm-svn: 94092
2010-01-21 19:33:59 +00:00
Victor Hernandez
3a31114cf9 No need to map NULL operands of metadata
llvm-svn: 94091
2010-01-21 19:26:20 +00:00
Dan Gohman
be34c35f32 Re-implement the main strength-reduction portion of LoopStrengthReduction.
This new version is much more aggressive about doing "full" reduction in
cases where it reduces register pressure, and also more aggressive about
rewriting induction variables to count down (or up) to zero when doing so
reduces register pressure.

It currently uses fairly simplistic algorithms for finding reuse
opportunities, but it introduces a new framework allows it to combine
multiple strategies at once to form hybrid solutions, instead of doing
all full-reduction or all base+index.

llvm-svn: 94061
2010-01-21 02:09:26 +00:00
Eric Christopher
939ad8be86 Add strcpy_chk -> strcpy support for "don't know" object size
answers.  This will update as object size checking gets better information.

llvm-svn: 94059
2010-01-21 01:04:38 +00:00
Chris Lattner
5c16b9815b simplify this code.
llvm-svn: 94048
2010-01-20 23:30:28 +00:00
Jakob Stoklund Olesen
ac14f3bf31 Move per-function inline threshold calculation to a method.
No functional change except the forgotten test for
InlineLimit.getNumOccurrences() == 0 in the CurrentThreshold2 calculation.

llvm-svn: 94007
2010-01-20 17:51:28 +00:00
Victor Hernandez
50e7dcdcc2 Switch Elts from vector to SmallVector
llvm-svn: 93989
2010-01-20 06:56:16 +00:00
Victor Hernandez
d79e7e0e12 Map operands of all function-local metadata, not just metadata passed to llvm.dbg.declare intrinsics
llvm-svn: 93979
2010-01-20 05:49:59 +00:00
Dan Gohman
4ca37472ec When doing address-mode sinking, expand the base register first, rather
than the scaled register. This makes it more likely that subsequent
AddrModeMatcher queries will match the new address the same way as the
old, instead of accidentally matching what had been the base register
as the new scaled register, and then failing to match the scaled register.
This fixes some problems with address-mode sinking multiple muls into a
block, which will be a lot more common with some upcoming
LoopStrengthReduction changes.

llvm-svn: 93935
2010-01-19 22:45:06 +00:00
Chris Lattner
e0124f19f9 optimize ~(~X >>s Y) --> (X >>s Y), patch by Edmund Grimley
Evans!

llvm-svn: 93884
2010-01-19 18:16:19 +00:00
Bob Wilson
e94d77854f Fix a crash in scalarrepl for memcpy/memmove where the source and destination
are the same.  I had already fixed a similar problem where the source and
destination were different bitcasts derived from the same alloca, but the
previous fix still did not handle the case where both operands are exactly
the same value.  Radar 7552893.

llvm-svn: 93848
2010-01-19 04:32:48 +00:00
Eric Christopher
5b602f9bf7 Fix comment.
llvm-svn: 93831
2010-01-19 01:20:15 +00:00
Chris Lattner
6cd7a81f86 my instcombine transformations to make extension elimination more
aggressive changed the canonical form from sext(trunc(x)) to ashr(lshr(x)),
make sure to transform a couple more things into that canonical form,
and catch a case where we missed turning zext/shl/ashr into a single sext.

llvm-svn: 93787
2010-01-18 22:19:16 +00:00
Devang Patel
2cae5754c6 While mapping llvm.dbg.declare intrinsic manually map its operand, if possible,
because it points to an alloca instruction through metadata.

llvm-svn: 93757
2010-01-18 19:52:14 +00:00
Owen Anderson
d73ce407a5 Convert some of the dynamic opcode lookups into static ones.
llvm-svn: 93693
2010-01-17 19:33:27 +00:00
Owen Anderson
50cacaff8f Fix comment.
llvm-svn: 93679
2010-01-17 06:49:03 +00:00
Bob Wilson
bdd9890c7d Fix a comment typo.
llvm-svn: 93560
2010-01-15 21:55:02 +00:00
Bill Wendling
488a7187b4 When the visitSub method was split into visitSub and visitFSub, this xform was
added to the FSub version. However, the original version of this xform guarded
against doing this for floating point (!Op0->getType()->isFPOrFPVector()).

This is causing LLVM to perform incorrect xforms for code like:

void func(double *rhi, double *rlo, double xh, double xl, double yh, double yl){
  double mh, ml;
  double c = 134217729.0;
  double up, u1, u2, vp, v1, v2;
        
  up = xh*c;
  u1 = (xh - up) + up;
  u2 = xh - u1;
        
  vp = yh*c;
  v1 = (yh - vp) + vp;
  v2 = yh - v1;
        
  mh = xh*yh;
  ml = (((u1*v1 - mh) + (u1*v2)) + (u2*v1)) + (u2*v2);
  ml += xh*yl + xl*yh;
        
  *rhi = mh + ml;
  *rlo = (mh - (*rhi)) + ml;
}

The last line was optimized away, but rl is intended to be the difference
between the infinitely precise result of mh + ml and after it has been rounded
to double precision.

llvm-svn: 93369
2010-01-13 23:23:17 +00:00
Chris Lattner
21bbaf49d5 1) Use the new SimplifyInstructionsInBlock routine instead of the copy
in JT.

2) When cloning blocks for PHI or xor conditions, use
instsimplify to simplify the code as we go.  This allows us to 
squish common cases early in JT which opens up opportunities for
subsequent iterations, and allows it to completely simplify the
testcase.

llvm-svn: 93253
2010-01-12 20:41:47 +00:00
Chris Lattner
87f86498c3 add a helper function.
llvm-svn: 93251
2010-01-12 19:40:54 +00:00
Chris Lattner
bc0016437d tidy up
llvm-svn: 93222
2010-01-12 02:07:50 +00:00
Chris Lattner
774e3967ad Teach jump threading to duplicate small blocks when the branch
condition is a xor with a phi node.  This eliminates nonsense
like this from 176.gcc in several places:

 LBB166_84:
        testl   %eax, %eax
-       setne   %al
-       xorb    %cl, %al
-       notb    %al
-       testb   $1, %al
-       je      LBB166_85
+       je      LBB166_69
+       jmp     LBB166_85

This is rdar://7391699

llvm-svn: 93221
2010-01-12 02:07:17 +00:00
Chris Lattner
d6d8cc7b37 some cleanup, and make it obvious that ProcessJumpOnPHI only works
on branches by renaming it and checking for a branch at the call site.

llvm-svn: 93208
2010-01-11 23:41:09 +00:00
Chris Lattner
a8cabeeecb reenable the piece that turns trunc(zext(x)) -> x even if zext has multiple uses,
codegen has no apparent problem with the trunc version of this, because it turns
into a simple subreg idiom

llvm-svn: 93202
2010-01-11 22:49:40 +00:00
Chris Lattner
2749cc2036 Disable folding sext(trunc(x)) -> x (and other similar cast/cast cases) when the
trunc has multiple uses.  Codegen is not able to coalesce the subreg case 
correctly and so this leads to higher register pressure and spilling (see PR5997).

This speeds up 256.bzip2 from 8.60 -> 8.04s on my machine, ~7%.

llvm-svn: 93200
2010-01-11 22:45:25 +00:00
Chris Lattner
85a6f02b94 add one more bitfield optimization, allowing clang to generate
good code on PR4216:

_test_bitfield:                                             ## @test_bitfield
	orl	$32962, %edi
	movl	$4294941946, %eax
	andq	%rdi, %rax
	ret

instead of:

_test_bitfield:
        movl    $4294941696, %ecx
        movl    %edi, %eax
        orl     $194, %edi
        orl     $32768, %eax
        andq    $250, %rdi
        andq    %rax, %rcx
        movq    %rdi, %rax
        orq     %rcx, %rax
        ret

Evan is looking into the remaining andq+imm -> andl optimization.

llvm-svn: 93147
2010-01-11 06:55:24 +00:00
Chris Lattner
16e36659f5 Extend CanEvaluateZExtd to handle and/or/xor more aggressively in the
BitsToClear case.  This allows it to promote expressions which have an
and/or/xor after the lshr, promoting cases like test2 (from PR4216) 
and test3 (random extample extracted from a spec benchmark).

clang now compiles the code in PR4216 into:

_test_bitfield:                                             ## @test_bitfield
	movl	%edi, %eax
	orl	$194, %eax
	movl	$4294902010, %ecx
	andq	%rax, %rcx
	orl	$32768, %edi
	andq	$39936, %rdi
	movq	%rdi, %rax
	orq	%rcx, %rax
	ret

instead of:

_test_bitfield:                                             ## @test_bitfield
	movl	%edi, %eax
	orl	$194, %eax
	movl	$4294902010, %ecx
	andq	%rax, %rcx
	shrl	$8, %edi
	orl	$128, %edi
	shlq	$8, %rdi
	andq	$39936, %rdi
	movq	%rdi, %rax
	orq	%rcx, %rax
	ret

which is still not great, but is progress.

llvm-svn: 93145
2010-01-11 04:05:13 +00:00
Chris Lattner
f2ba85eedc Remove the dead TD argument to CanEvaluateZExtd, and add a
new BitsToClear result which allows us to start promoting
expressions that end with a lshr-by-constant.  This is
conservatively correct and better than what we had before
(see testcases) but still needs to be extended further.

llvm-svn: 93144
2010-01-11 03:32:00 +00:00
Chris Lattner
d7f1b97147 improve comments, remove dead TD argument to CanEvaluateSExtd.
llvm-svn: 93143
2010-01-11 02:43:35 +00:00
Chris Lattner
18d753e05f teach sext optimization to handle truncs from types that are not
the dest of the sext.

llvm-svn: 93128
2010-01-10 20:30:41 +00:00
Chris Lattner
ca53de1ab7 teach zext optimization how to deal with truncs that don't come from
the zext dest type.  This allows us to handle test52/53 in cast.ll,
and allows llvm-gcc to generate much better code for PR4216 in -m64
mode:

_test_bitfield:                                             ## @test_bitfield
	orl	$32962, %edi
	movl	%edi, %eax
	andl	$-25350, %eax
	ret

This also fixes a bug handling vector extends, ensuring that the
mask produced is a vector constant, not an integer constant.

llvm-svn: 93127
2010-01-10 20:25:54 +00:00
Chris Lattner
67e0ef4d5d simplify CanEvaluateSExtd to return a bool now that we have a
simpler profitability predicate.

llvm-svn: 93111
2010-01-10 07:57:20 +00:00