input the the mul is a zext from bool, just that it is all zeros
other than the low bit. This fixes some phase ordering issues
that would cause us to miss some xforms in mul.ll when the worklist
is visited differently.
llvm-svn: 83794
it to visit instructions from the start of the function to the
end of the function in the first path. This greatly speeds up
some pathological cases (e.g. PR5150).
llvm-svn: 83790
For now the metadata of sinked/hoisted instructions is still wrong, but that'll
be fixed when instructions will have debug metadata directly attached.
llvm-svn: 83786
done by condprop, but do it in a much more general form. The
basic idea is that we can do a limited form of tail duplication
in the case when we have a branch on a phi. Moving the branch
up in to the predecessor block makes instruction selection
much easier and encourages chained jump threadings.
llvm-svn: 83759
from GVN, this also speeds it up, inserts fewer PHI nodes (see the
testcase) and allows it to remove more loads (due to fewer PHI nodes
standing in the way).
llvm-svn: 83746
DemoteRegToStack. This makes it more efficient (because it isn't
creating a ton of load/stores that are eventually removed by a later
mem2reg), and more slightly more effective (because those load/stores
don't get in the way of threading).
llvm-svn: 83706
to declare that they preserve other passes without needing to pull in
additional header file or library dependencies. Convert MachineFunctionPass
and CodeGenLICM to make use of this.
llvm-svn: 83555
already on the worklist, and print Visited when an instruction is about to be
visited. Net, on one input, this reduced the output size by at least 9x.
llvm-svn: 83510
the new predicates I added) instead of going through a context and doing a
pointer comparison. Besides being cheaper, this allows a smart compiler
to turn the if sequence into a switch.
llvm-svn: 83297
that are phi nodes. Also tighten up FoldOpIntoPhi to treat constantexpr
operands to phis just like other variables, avoiding moving constantexpr
computations around.
Patch by Daniel Dunbar.
llvm-svn: 82913
from a piece of a large store when both are in the same block.
This allows clang to compile the testcase in PR4216 to this code:
_test_bitfield:
movl 4(%esp), %eax
movl %eax, %ecx
andl $-65536, %ecx
orl $32962, %eax
andl $40186, %eax
orl %ecx, %eax
ret
This is not ideal, but is a whole lot better than the code produced
by llvm-gcc:
_test_bitfield:
movw $-32574, %ax
orw 4(%esp), %ax
andw $-25350, %ax
movw %ax, 4(%esp)
movw 7(%esp), %cx
shlw $8, %cx
movzbl 6(%esp), %edx
orw %cx, %dx
movzwl %dx, %ecx
shll $16, %ecx
movzwl %ax, %eax
orl %ecx, %eax
ret
and dramatically better than that produced by gcc 4.2:
_test_bitfield:
pushl %ebx
call L3
"L00000000001$pb":
L3:
popl %ebx
movl 8(%esp), %eax
leal 0(,%eax,4), %edx
sarb $7, %dl
movl %eax, %ecx
andl $7168, %ecx
andl $-7201, %ebx
movzbl %dl, %edx
andl $1, %edx
sall $5, %edx
orl %ecx, %ebx
orl %edx, %ebx
andl $24, %eax
andl $-58336, %ebx
orl %eax, %ebx
orl $32962, %ebx
movl %ebx, %eax
popl %ebx
ret
llvm-svn: 82439
so that nonlocal and partially redundant loads can use it as well.
The testcase shows examples of craziness this can handle. This triggers
*many* times in 176.gcc.
llvm-svn: 82403
(and load -> load) when the base pointers must alias but when
they are different types. This occurs very very frequently in
176.gcc and other code that uses bitfields a lot.
llvm-svn: 82399
constants out of loops. These aren't covered by the regular LICM
pass, because in LLVM IR constants don't require separate
instructions. They're not always covered by the MachineLICM pass
either, because it doesn't know how to unfold folded constant-pool
loads. This is somewhat experimental at this point, and off by
default.
llvm-svn: 82076
more than one phi, since that leads to higher register pressure on
entry to the phi. This is especially problematic when the phi is in
a loop header, as it increases register pressure throughout the loop.
llvm-svn: 81993
loop exit edge -- new PHIs may be needed not only for the additional
splits that are made to preserve LoopSimplify form, but also for the
original split. Factor out the code that inserts new PHIs so that it
can be used for both. Remove LoopRotation.cpp's code for manually
updating LCSSA form, as it is now redundant. This fixes PR4934.
llvm-svn: 81363
that get created during loop unswitching, and fix SplitBlockPredecessors'
LCSSA updating code to create new PHIs instead of trying to just move
existing ones.
Also, optimize Loop::verifyLoop, since it gets called a lot. Use
searches on a sorted list of blocks instead of calling the "contains"
function, as is done in other places in the Loop class, since "contains"
does a linear search. Also, don't call verifyLoop from LoopSimplify or
LCSSA, as the PassManager is already calling verifyLoop as part of
LoopInfo's verifyAnalysis.
llvm-svn: 81221
extractelement operations into a bitcast of the pointer,
then a gep, then a scalar load. Disable this when the vector
only has one element, because it leads to infinite loops in
instcombine (PR4908).
This transformation seems like a really bad idea to me, as it
will likely disable CSE of vector load/stores etc and can be
better done in the code generator when profitable. This
goes all the way back to the first days of packed types,
r25299 specifically.
I'll let those people who care about the performance of vector
code decide what to do with this.
llvm-svn: 81185
- I think there are more instances of this, but I think they are fixed in Dan's
incoming patch. This one was preventing me from doing a bugpoint reduction
though.
llvm-svn: 81103
Constant uniquing tables. This allows distinct ConstantExpr objects
with the same operation and different flags.
Even though a ConstantExpr "a + b" is either always overflowing or
never overflowing (due to being a ConstantExpr), it's still necessary
to be able to represent it both with and without overflow flags at
the same time within the IR, because the safety of the flag may
depend on the context of the use. If the constant really does overflow,
it wouldn't ever be safe to use with the flag set, however the use
may be in code that is never actually executed.
This also makes it possible to merge all the flags tests into a single test.
llvm-svn: 80998
instead of a bool argument, and to do the dominator check itself.
This makes it eaiser to use when DominatorTree information is
available.
llvm-svn: 80920
simplifylibcalls optimization is thus valid for C++ but not C.
It's not important enough to worry about for C++ apps, so just
remove it.
rdar://7191924
llvm-svn: 80887
changes: SimplifyDemandedBits can't use the builder yet because it
has the wrong insertion point. This fixes a crash building
MultiSource/Benchmarks/PAQ8p
llvm-svn: 80537
is itself a bitcast. Since we have gep(bitcast(bitcast(y))) in this
case, just wait for the two bitcasts to get zapped. This prevents
instcombine from confusing some aliasing stuff, and allows it to
directly eliminate the load in the testcase.
llvm-svn: 80508
workslist and is set to insert new instructions before the current one.
Convert a bunch of stuff that used to call InsertNewInstBefore over to
use it, greatly simplifying code and making it more natural.
There is still a lot more to go, but this is a good start.
llvm-svn: 80492
if the operand is not an instruction.
Simplify most uses of AddOperandsToWorkList to use AddValue and
inline it into the one remaining callsite.
llvm-svn: 80488
former looks too much like AddUsersToWorkList and keeps
confusing me.
Remove AddSoonDeadInstToWorklist and change its two callers
to do the same thing in a simpler way.
llvm-svn: 80486
into their callers. simplify ReplaceInstUsesWith. Make
EraseInstFromFunction only add operands to the worklist if
there aren't too many of them (this was a scalability win
for crazy programs that was only infrequently enforced).
Switch more code to using EraseInstFromFunction instead of
duplicating it inline. Change some fcmp/icmp optimizations
to modify fcmp/icmp in place instead of creating a new one
and deleting the old one just to change the predicate.
llvm-svn: 80483
and introduce a new Instruction::isIdenticalTo which tests for full
identity, including the SubclassOptionalData flags. Also, fix the
Instruction::clone implementations to preserve the SubclassOptionalData
flags. Finally, teach several optimizations how to handle
SubclassOptionalData correctly, given these changes.
This fixes the counterintuitive behavior of isIdenticalTo not comparing
the full value, and clone not returning an identical clone, as well as
some subtle bugs that could be caused by these.
Thanks to Nick Lewycky for reporting this, and for an initial patch!
llvm-svn: 80038
sinking code, since they are special. If the loop preheader happens
to be the entry block of a function, don't sink static allocas
out of it. This fixes PR4775.
llvm-svn: 80010
the new load by the old load instead of by the extract element because
a store could have occurred between the load and extract element.
llvm-svn: 78891
- Part of optimal static profiling patch sequence by Andreas Neustifter.
- Store edge, block, and function information separately for each functions
(instead of in one giant map).
- Return frequencies as double instead of int, and use a sentinel value for
missing information.
llvm-svn: 78477
appropriate. Patch per report on llvmdev. No testcase because the
original report didn't come with a testcase, and I can't come up with a case
that actually fails.
llvm-svn: 77986
a Twine, e.g., for names).
- I am a little ambivalent about this; we don't want the string conversion of
utostr, but using overload '+' mixed with string and integer arguments is
sketchy. On the other hand, this particular usage is something of an idiom.
llvm-svn: 77579
- Some clients which used DOUT have moved to DEBUG. We are deprecating the
"magic" DOUT behavior which avoided calling printing functions when the
statement was disabled. In addition to being unnecessary magic, it had the
downside of leaving code in -Asserts builds, and of hiding potentially
unnecessary computations.
llvm-svn: 77019
- Yay for '-'s and simplifications!
- I kept StringMap::GetOrCreateValue for compatibility purposes, this can
eventually go away. Likewise the StringMapEntry Create functions still follow
the old style.
- NIFC.
llvm-svn: 76888
also apply to vectors. This allows us to compile this:
#include <emmintrin.h>
__m128i a(__m128 a, __m128 b) { return a==a & b==b; }
__m128i b(__m128 a, __m128 b) { return a!=a | b!=b; }
to:
_a:
cmpordps %xmm1, %xmm0
ret
_b:
cmpunordps %xmm1, %xmm0
ret
with clang instead of to a ton of horrible code.
llvm-svn: 76863
Getelementptrs that are defined to wrap are virtually useless to
optimization, and getelementptrs that are undefined on any kind
of overflow are too restrictive -- it's difficult to ensure that
all intermediate addresses are within bounds. I'm going to take
a different approach.
Remove a few optimizations that depended on this flag.
llvm-svn: 76437
insertelement/extractelement.
I'm not entirely sure this is precisely what we want to do: should we
prefer bitcast(insertelement) or insertelement(bitcast)? Similarly. should we
prefer extractelement(bitcast) or bitcast(extractelement)?
llvm-svn: 76345
all values belonging to the intersection will belong to the resulting range.
The former was inconsistent about that point (either way is fine, just pick
one.) This is part of PR4545.
llvm-svn: 76289
GEPOperator's hasNoPointer0verflow(), and make a few places in instcombine
that create GEPs that may overflow clear the NoOverflow value. Among
other things, this partially addresses PR2831.
llvm-svn: 76252
in a convenient manner, factoring out some common code from
InstructionCombining and ValueTracking. Move the contents of
BinaryOperators.h into Operator.h and use Operator to generalize them
to support ConstantExprs as well as Instructions.
llvm-svn: 76232
isSafeToSpeculativelyExecute. The new method is a bit closer to what
the callers actually care about in that it rejects more things callers
don't want. It also adds more precise handling for integer
division, and unifies code for analyzing the legality of a speculative
load.
llvm-svn: 76150
This adds location info for all llvm_unreachable calls (which is a macro now) in
!NDEBUG builds.
In NDEBUG builds location info and the message is off (it only prints
"UREACHABLE executed").
llvm-svn: 75640
(I think it's reasonably clear that we want to have a canonical form for
constructs like this; if anyone thinks that a select is not the best
canonical form, please tell me.)
llvm-svn: 75531
using the Curiously Recurring Template Pattern with LoopBase.
This will help further refactoring, and future functionality for
Loop. Also, Headers can now foward-declare Loop, instead of pulling
in LoopInfo.h or doing tricks.
llvm-svn: 75519
the changes are allowed by not calling this function for bitcasts.
The Instruction::AShr case is dead because
SimplifyDemandedInstructionBits handles that case.
llvm-svn: 75514
This involves temporarily hard wiring some parts to use the global context. This isn't ideal, but it's
the only way I could figure out to make this process vaguely incremental.
llvm-svn: 75445
Make llvm_unreachable take an optional string, thus moving the cerr<< out of
line.
LLVM_UNREACHABLE is now a simple wrapper that makes the message go away for
NDEBUG builds.
llvm-svn: 75379
per icmp predicate out of predsimplify and into ConstantRange.
Add another utility method that determines whether one range is a subset of
another. Combine with the former to determine whether icmp pred range, range
is known to be true or not.
llvm-svn: 75357
inserted to replace that value must dominate all of of the basic
blocks associated with the uses of the value in the PHI, not just
one of them.
llvm-svn: 74376
This helps it avoid reusing an instruction that doesn't dominate all
of the users, in cases where the original instruction was inserted
before all of the users were known. This may result in redundant
expansions of sub-expressions that depend on loop-unpredictable values
in some cases, however this isn't very common, and it primarily impacts
IndVarSimplify, so GVN can be expected to clean these up.
This eliminates the need for IndVarSimplify's FixUsesBeforeDefs,
which fixes several bugs.
llvm-svn: 74352
terminator, instead of after the last phi. This fixes a bug
exposed by ScalarEvolution analyzing more kinds of loops.
This fixes PR4436.
llvm-svn: 74072
trip counts in more cases.
Generalize ScalarEvolution's isLoopGuardedByCond code to recognize
And and Or conditions, splitting the code out into an
isNecessaryCond helper function so that it can evaluate Ands and Ors
recursively, and make SCEVExpander be much more aggressive about
hoisting instructions out of loops.
test/CodeGen/X86/pr3495.ll has an additional instruction now, but
it appears to be due to an arbitrary register allocation difference.
llvm-svn: 74048
now, this hasn't mattered, because ScalarEvolution hasn't been able
to compute trip counts for loops with multiple exits. But it will
soon.
llvm-svn: 73864
as if they were multiple uses of the same instruction. This interacts
well with the existing loadpre that j-t does to open up many new jump
threads earlier.
llvm-svn: 73768
casted induction variables in cases where the cast
isn't foldable. It ended up being a pessimization in
many cases. This could be fixed, but it would require
a bunch of complicated code in IVUsers' clients. The
advantages of this approach aren't visible enough to
justify it at this time.
llvm-svn: 73706
move loads back past a check that the load address
is valid, see new testcase. The test that went
in with 72661 has exactly this case, except that
the conditional it's moving past is checking
something else; I've settled for changing that
test to reference a global, not a pointer. It
may be possible to scan all the tests you pass and
make sure none of them are checking any component
of the address, but it's not trivial and I'm not
trying to do that here.
llvm-svn: 73632
failures.
To support this, add some utility functions to Type to help support
vector/scalar-independent code. Change ConstantInt::get and
ConstantFP::get to support vector types, and add an overload to
ConstantInt::get that uses a static IntegerType type, for
convenience.
Introduce a new getConstant method for ScalarEvolution, to simplify
common use cases.
llvm-svn: 73431