turn icmp eq a+x, b+x into icmp eq a, b if a+x or b+x has other uses. This
may have been increasing register pressure leading to the bzip2 slowdown.
llvm-svn: 63487
improvements to the EvaluateInDifferentType code. This code works
by just inserted a bunch of new code and then seeing if it is
useful. Instcombine is not allowed to do this: it can only insert
new code if it is useful, and only when it is converging to a more
canonical fixed point. Now that we iterate when DCE makes progress,
this causes an infinite loop when the code ends up not being used.
llvm-svn: 63483
simplifydemandedbits to simplify instructions with *multiple
uses* in contexts where it can get away with it. This allows
it to simplify the code in multi-use-or.ll into a single 'add
double'.
This change is particularly interesting because it will cover
up for some common codegen bugs with large integers created due
to the recent SROA patch. When working on fixing those bugs,
this should be disabled.
llvm-svn: 63481
Now, if it detects that "V" is the same as some other value,
SimplifyDemandedBits returns the new value instead of RAUW'ing it immediately.
This has two benefits:
1) simpler code in the recursive SimplifyDemandedBits routine.
2) it allows future fun stuff in instcombine where an operation has multiple
uses and can be simplified in one context, but not all.
#2 isn't implemented yet, this patch should have no functionality change.
llvm-svn: 63479
be able to handle *ANY* alloca that is poked by loads and stores of
bitcasts and GEPs with constant offsets. Before the code had a number
of annoying limitations and caused it to miss cases such as storing into
holes in structs and complex casts (as in bitfield-sroa) where we had
unions of bitfields etc. This also handles a number of important cases
that are exposed due to the ABI lowering stuff we do to pass stuff by
value.
One case that is pretty great is that we compile
2006-11-07-InvalidArrayPromote.ll into:
define i32 @func(<4 x float> %v0, <4 x float> %v1) nounwind {
%tmp10 = call <4 x i32> @llvm.x86.sse2.cvttps2dq(<4 x float> %v1)
%tmp105 = bitcast <4 x i32> %tmp10 to i128
%tmp1056 = zext i128 %tmp105 to i256
%tmp.upgrd.43 = lshr i256 %tmp1056, 96
%tmp.upgrd.44 = trunc i256 %tmp.upgrd.43 to i32
ret i32 %tmp.upgrd.44
}
which turns into:
_func:
subl $28, %esp
cvttps2dq %xmm1, %xmm0
movaps %xmm0, (%esp)
movl 12(%esp), %eax
addl $28, %esp
ret
Which is pretty good code all things considering :).
One effect of this is that SROA will start generating arbitrary bitwidth
integers that are a multiple of 8 bits. In the case above, we got a
256 bit integer, but the codegen guys assure me that it can handle the
simple and/or/shift/zext stuff that we're doing on these operations.
This addresses rdar://6532315
llvm-svn: 63469
There is now a direct way from value-use-iterator to incoming block in PHINode's API.
This way we avoid the iterator->index->iterator trip, and especially the costly
getOperandNo() invocation. Additionally there is now an assertion that the iterator
really refers to one of the PHI's Uses.
llvm-svn: 62869
we assumed a CFG structure that would be valid when all code in
the function is reachable, but not all code is necessarily
reachable. Do a simple, but horrible, CFG walk to check for this
case.
llvm-svn: 62487
because of dead code, a phi could use the speculated instruction
that was not in "BB2". Make this check explicit and tighten up
some other corners. This fixes PR3292. No testcase becauase this
depends entirely on visitation order of blocks and requires a
sequence of 8 passes to repro.
llvm-svn: 62476
doing very similar pointer capture analysis.
Factor out the common logic. The new version
is from FunctionAttrs since it does a better
job than the version in BasicAliasAnalysis
llvm-svn: 62461
putc, puts, perror, vscanf and vsscanf from getting annotations.
Add annotations for eight printf functions, memalign, pread and pwrite.
On Linux, llvm-gcc sometimes renames strdup, getc, putc, strtok_r, scanf and
sscanf. Match the alternate function names.
Fix a crash annotating opendir.
Don't mark fsetpos's second parameter as nocapture. It's supposed to be
captured.
Do mark fopen's path and mode strings as nocapture. Mark ferror as readonly,
but not fileno which may set errno.
llvm-svn: 62456
- Looking at the number of sign bits of the a sext instruction to determine whether new trunc + sext pair should be added when its source is being evaluated in a different type.
llvm-svn: 62263