version uses a new algorithm for evaluating the binomial coefficients
which is significantly more efficient for AddRecs of more than 2 terms
(see the comments in the code for details on how the algorithm works).
It also fixes some bugs: it removes the arbitrary length restriction for
AddRecs, it fixes the silent generation of incorrect code for AddRecs
which require a wide calculation width, and it fixes an issue where we
were incorrectly truncating the iteration count too far when evaluating
an AddRec expression narrower than the induction variable.
There are still a few related issues I know of: I think there's
still an issue with the SCEVExpander expansion of AddRec in terms of
the width of the induction variable used. The hack to avoid generating
too-wide integers shouldn't be necessary; instead, the callers should be
considering the cost of the expansion before expanding it (in addition
to not expanding too-wide integers, we might not want to expand
expressions that are really expensive, especially when optimizing for
size; calculating an length-17 32-bit AddRec currently generates about 250
instructions of straight-line code on X86). Also, for long 32-bit
AddRecs on X86, CodeGen really sucks at scheduling the code. I'm planning on
filing follow-up PRs for these issues.
llvm-svn: 54332
subreg form on x86-64, to avoid the problem with x86-32
having GPRs that don't have 8-bit subregs.
Also, change several 16-bit instructions to use
equivalent 32-bit instructions. These have a smaller
encoding and avoid partial-register updates.
llvm-svn: 54223
to different address spaces. This alters the naming scheme for those
intrinsics, e.g., atomic.load.add.i32 => atomic.load.add.i32.p0i32
llvm-svn: 54195
time applying to the implicit comparison in smin expressions. The
correct way to transform an inequality into the opposite
inequality, either signed or unsigned, is with a not expression.
I looked through the SCEV code, and I don't think there are any more
occurrences of this issue.
llvm-svn: 54194
SGT exit condition. Essentially, the correct way to flip an inequality
in 2's complement is the not operator, not the negation operator.
That said, the difference only affects cases involving INT_MIN.
Also, enhance the pre-test search logic to be a bit smarter about
inequalities flipped with a not operator, so it can eliminate the smax
from the iteration count for simple loops.
llvm-svn: 54184
that says "unconditional loads from this argument are safe", we now keep track
of the safety per set of indices from which loads happen. This prevents
ArgPromotion from promoting loads that aren't really valid. As an added effect,
this will now disregard the the type of the indices passed to a GEP, so
"load GEP %A, i32 1" and "load GEP %A, i64 1" will result in a single argument,
not two.
This fixes PR2598, for which a testcase has been added as well.
llvm-svn: 54159
which is represented in codegen as an 'and' operation. This matches them
with movz instructions, instead of leaving them to be matched by and
instructions with an immediate field.
llvm-svn: 54147
because opt exited while llvm-as was still
writing to the pipe, causing it to get a
SIGPIPE. It seems best to change things to
avoid the race altogether.
llvm-svn: 54138
and knowledge of PseudoSourceValues. This unfortunately isn't sufficient to allow
constants to be rematerialized in PIC mode -- the extra indirection is a
complication.
llvm-svn: 54000
command-line option, and disable it by default. It introduced performance
regressions because CodeGen is currently not able to remat such loads.
llvm-svn: 53997
case for this.
This allows instructions like loads from global variables declared to
be constant to be moved out of loops."
Patch by Stefanus Du Toit!
llvm-svn: 53945
Remove the GetResultInst instruction. It is still accepted in LLVM assembly
and bitcode, where it is now auto-upgraded to ExtractValueInst. Also, remove
support for return instructions with multiple values. These are auto-upgraded
to use InsertValueInst instructions.
The IRBuilder still accepts multiple-value returns, and auto-upgrades them
to InsertValueInst instructions.
llvm-svn: 53941
leads into a cycle involving a different PHI, LSR got stuck running
around that cycle looking for the original PHI. To avoid this, keep
track of visited PHIs and stop searching if we see one more than once.
This fixes PR2570.
llvm-svn: 53879
force evaluation (ComputeIterationCountExhaustively) should be turned off.
It doesn't apply to trip-count2.ll because this file tests the brute force
evaluation.
The test for PR2364 (2008-05-25-NegativeStepToZero.ll) currently fails
showing that the patch for this bug doesn't work. I'll fix it in a few hours
with a patch for PR2088.
llvm-svn: 53792