1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-28 14:32:51 +01:00
Commit Graph

5548 Commits

Author SHA1 Message Date
Eli Friedman
ef366a16a2 PR2621: Improvements to the SCEV AddRec binomial expansion. This
version uses a new algorithm for evaluating the binomial coefficients 
which is significantly more efficient for AddRecs of more than 2 terms 
(see the comments in the code for details on how the algorithm works).  
It also fixes some bugs: it removes the arbitrary length restriction for 
AddRecs, it fixes the silent generation of incorrect code for AddRecs 
which require a wide calculation width, and it fixes an issue where we 
were incorrectly truncating the iteration count too far when evaluating 
an AddRec expression narrower than the induction variable.

There are still a few related issues I know of: I think there's 
still an issue with the SCEVExpander expansion of AddRec in terms of
the width of the induction variable used.  The hack to avoid generating 
too-wide integers shouldn't be necessary; instead, the callers should be 
considering the cost of the expansion before expanding it (in addition 
to not expanding too-wide integers, we might not want to expand 
expressions that are really expensive, especially when optimizing for 
size; calculating an length-17 32-bit AddRec currently generates about 250 
instructions of straight-line code on X86).  Also, for long 32-bit 
AddRecs on X86, CodeGen really sucks at scheduling the code.  I'm planning on 
filing follow-up PRs for these issues.

llvm-svn: 54332
2008-08-04 23:49:06 +00:00
Dan Gohman
60ea311ec8 Fix SDISel lowering of PHI nodes to use ComputeValueVTs.
This allows it to work correctly on aggregate values.
This fixes PR2623.

llvm-svn: 54331
2008-08-04 23:42:46 +00:00
Dan Gohman
af429b3e52 Fix SDISel lowering of zeroinitializer and undef to use ComputeValueVTs.
This allows it to work correctly on nested aggregate values.
This fixes PR2625.

llvm-svn: 54330
2008-08-04 23:30:41 +00:00
Dale Johannesen
c1ae4b8c08 Make sse2 explicit, for non-x86 hosts.
llvm-svn: 54251
2008-07-31 20:16:33 +00:00
Dan Gohman
f691fc703d Improve dagcombining for sext-loads and sext-in-reg nodes.
llvm-svn: 54239
2008-07-31 00:50:31 +00:00
Dan Gohman
4ca56a8993 Don't look for leaf values to store when lowering stores of
empty structs. This fixes PR2612.

llvm-svn: 54226
2008-07-30 18:36:51 +00:00
Dan Gohman
6f3fa16fd9 I missed this file in r54223. movzbl is now used instead
of movzbw here.

llvm-svn: 54224
2008-07-30 18:23:34 +00:00
Dan Gohman
efb5d2ce6e Reapply r54147 with a constraint to only use the 8-bit
subreg form on x86-64, to avoid the problem with x86-32
having GPRs that don't have 8-bit subregs.

Also, change several 16-bit instructions to use 
equivalent 32-bit instructions. These have a smaller
encoding and avoid partial-register updates.

llvm-svn: 54223
2008-07-30 18:09:17 +00:00
Mon P Wang
fb483982f5 Added support for overloading intrinsics (atomics) based on pointers
to different address spaces.  This alters the naming scheme for those
intrinsics, e.g., atomic.load.add.i32 => atomic.load.add.i32.p0i32

llvm-svn: 54195
2008-07-30 04:36:53 +00:00
Eli Friedman
a750724150 Another SCEV issue from PR2607; essentially the same issue, but this
time applying to the implicit comparison in smin expressions. The 
correct way to transform an inequality into the opposite 
inequality, either signed or unsigned, is with a not expression.

I looked through the SCEV code, and I don't think there are any more 
occurrences of this issue.

llvm-svn: 54194
2008-07-30 04:36:32 +00:00
Eli Friedman
81169f2e1b Fix for PR2607: SCEV miscomputing the loop count for loops with an
SGT exit condition.  Essentially, the correct way to flip an inequality 
in 2's complement is the not operator, not the negation operator.  
That said, the difference only affects cases involving INT_MIN.

Also, enhance the pre-test search logic to be a bit smarter about 
inequalities flipped with a not operator, so it can eliminate the smax 
from the iteration count for simple loops.

llvm-svn: 54184
2008-07-30 00:04:08 +00:00
Duncan Sands
c3d73fbfc0 Fix PR2609. If a label is deleted, then it needs
to be marked invalid regardless of whether it is
a debug, an exception handling or (hopefully) a
GC label.

llvm-svn: 54172
2008-07-29 20:56:02 +00:00
Nate Begeman
9a71580e21 Add vector shifts to the IR, patch by Eli Friedman.
CodeGen & Clang work coming next.

llvm-svn: 54161
2008-07-29 15:49:41 +00:00
Matthijs Kooijman
7199907f50 Add -unroll-allow-partial command line option that enabled the loop unroller to
partially unroll a loop when fully unrolling would not fit under the threshold.

Patch by Mikael Lepistö.

llvm-svn: 54160
2008-07-29 13:21:23 +00:00
Matthijs Kooijman
77948dbbc2 Restructure ArgumentPromotion a bit. Instead of just having a single boolean
that says "unconditional loads from this argument are safe", we now keep track
of the safety per set of indices from which loads happen. This prevents
ArgPromotion from promoting loads that aren't really valid. As an added effect,
this will now disregard the the type of the indices passed to a GEP, so
"load GEP %A, i32 1" and "load GEP %A, i64 1" will result in a single argument,
not two.

This fixes PR2598, for which a testcase has been added as well.

llvm-svn: 54159
2008-07-29 10:00:13 +00:00
Dan Gohman
ebe629a4b2 Revert 54147.
llvm-svn: 54148
2008-07-29 01:02:18 +00:00
Dan Gohman
1816900fd1 Add x86 isel patterns to match what would be a ZERO_EXTEND_INREG operation,
which is represented in codegen as an 'and' operation. This matches them
with movz instructions, instead of leaving them to be matched by and
instructions with an immediate field.

llvm-svn: 54147
2008-07-28 22:18:25 +00:00
Duncan Sands
ac8ddfc48d Test this differently: I saw this test fail
because opt exited while llvm-as was still
writing to the pipe, causing it to get a
SIGPIPE.  It seems best to change things to
avoid the race altogether.

llvm-svn: 54138
2008-07-28 19:09:01 +00:00
Dan Gohman
415d5069f4 Fix a bashism in TestRunner.sh.
llvm-svn: 54134
2008-07-28 18:41:03 +00:00
Owen Anderson
4d84a90fa9 Add support for eliminating stores that store the same value that was just loaded.
This fixes PR2599.

llvm-svn: 54133
2008-07-28 16:14:26 +00:00
Dan Gohman
a5a50a8853 Fix embedded CRLF characters.
llvm-svn: 54125
2008-07-27 18:37:58 +00:00
Nate Begeman
1396e3d206 Fix test RUN line
llvm-svn: 54040
2008-07-25 19:08:59 +00:00
Nate Begeman
5523d40e4b Disable mov{L, LP, HP, HLP, *DUP} shuffles for mmx
mmx needs its own fancy shuffle logic based on unpack; for now we get correct but awful code.

Also commit Mon Ping's VSETCC patch

llvm-svn: 54039
2008-07-25 19:05:58 +00:00
Dan Gohman
6d394147f2 This test needs -aggressive-remat enabled.
llvm-svn: 54015
2008-07-25 15:25:32 +00:00
Evan Cheng
d4eb684258 Teach ARM isLegalAddressingMode to handle unknown type without crashing. This fixes pr2589.
llvm-svn: 54004
2008-07-25 00:55:17 +00:00
Dan Gohman
680e1bd958 Enable rematerialization of constants using AliasAnalysis::pointsToConstantMemory,
and knowledge of PseudoSourceValues. This unfortunately isn't sufficient to allow
constants to be rematerialized in PIC mode -- the extra indirection is a
complication.

llvm-svn: 54000
2008-07-25 00:02:30 +00:00
Dan Gohman
1ecbcecdf3 Put the LICM of constant GlobalVariables, introduced in r53945, under a
command-line option, and disable it by default. It introduced performance
regressions because CodeGen is currently not able to remat such loads.

llvm-svn: 53997
2008-07-24 23:57:25 +00:00
Dan Gohman
da5c2b50b8 Add target triples so these tests behave as expected on non-darwin hosts.
llvm-svn: 53991
2008-07-24 18:08:01 +00:00
Evan Cheng
9c8cac5fd7 Fix a catastrophic PPC64 ABI bug: i32 operands which are passed in memory (all of the parameter registers are used) are loaded from sp offsets that were off by 4.
llvm-svn: 53979
2008-07-24 08:17:07 +00:00
Evan Cheng
055f5e6ed0 New test case.
llvm-svn: 53971
2008-07-24 00:22:05 +00:00
Chris Lattner
8eb899ecbc "Allow LICM to sink or lift loads from constant memory. Also add a test
case for this.

This allows instructions like loads from global variables declared to
be constant to be moved out of loops."

Patch by Stefanus Du Toit!

llvm-svn: 53945
2008-07-23 05:06:28 +00:00
Dan Gohman
6564581be0 Enable first-class aggregates support.
Remove the GetResultInst instruction. It is still accepted in LLVM assembly
and bitcode, where it is now auto-upgraded to ExtractValueInst. Also, remove
support for return instructions with multiple values. These are auto-upgraded
to use InsertValueInst instructions.

The IRBuilder still accepts multiple-value returns, and auto-upgrades them
to InsertValueInst instructions.

llvm-svn: 53941
2008-07-23 00:34:11 +00:00
Evan Cheng
20c9cdbe69 Fix PR2485: do all 4-element SSE shuffles in max. of 2 shuffle instructions.
Based on patch by Nicolas Capens.

llvm-svn: 53939
2008-07-23 00:22:17 +00:00
Duncan Sands
550e0de239 LegalizeTypes support for VSETCC. Fixes PR2575.
llvm-svn: 53938
2008-07-22 23:54:03 +00:00
Evan Cheng
1aa928a8e6 Fix pr2566: incorrect assumption about bit_convert. It doesn't not have to output a vector value. Patch by Nicolas Capens!
llvm-svn: 53932
2008-07-22 20:42:56 +00:00
Evan Cheng
901d469e05 Fix PR2574: implement v2f32 scalar_to_vector.
llvm-svn: 53927
2008-07-22 18:39:19 +00:00
Dan Gohman
693339b859 Add the PR number to the test.
llvm-svn: 53880
2008-07-21 21:50:25 +00:00
Dan Gohman
8f7b6c8113 Fix a bug in LSR's dead-PHI cleanup. If a PHI has a def-use chain that
leads into a cycle involving a different PHI, LSR got stuck running
around that cycle looking for the original PHI. To avoid this, keep
track of visited PHIs and stop searching if we see one more than once.
This fixes PR2570.

llvm-svn: 53879
2008-07-21 21:45:02 +00:00
Wojciech Matyjewicz
eea926ec20 Fix PR2088. Use modulo linear equation solver to compute loop iteration
count.

llvm-svn: 53810
2008-07-20 15:55:14 +00:00
Bill Wendling
98b6e63176 Fix for first part of PR2562. Generate the "pinsrw" instruction for inserts
into v4i16 vectors.

llvm-svn: 53807
2008-07-20 02:32:23 +00:00
Nick Lewycky
13166526c5 XFAIL this test.
llvm-svn: 53793
2008-07-19 15:52:06 +00:00
Wojciech Matyjewicz
852a8f47f1 While testing particular algorithms to compute loop iteration count the brute
force evaluation (ComputeIterationCountExhaustively) should be turned off.

It doesn't apply to trip-count2.ll because this file tests the brute force
evaluation.

The test for PR2364 (2008-05-25-NegativeStepToZero.ll) currently fails
showing that the patch for this bug doesn't work. I'll fix it in a few hours
with a patch for PR2088.

llvm-svn: 53792
2008-07-19 13:26:15 +00:00
Anton Korobeynikov
6f354293fe Testcase for PR2549
llvm-svn: 53785
2008-07-19 06:31:12 +00:00
Duncan Sands
ef45c602b6 Softfloat support for FDIV. Patch by
Richard Pennington.

llvm-svn: 53773
2008-07-18 21:18:48 +00:00
Dan Gohman
b97c076af4 In the CBackend, use casts to force integer add, subtract, and
multiply to be done as unsigned, so that they have well defined
behavior on overflow. This fixes PR2408.

llvm-svn: 53767
2008-07-18 18:43:12 +00:00
Evan Cheng
d26080487b Subreg live interval valno may not have a corresponding def machineinstr since it's less precise.
llvm-svn: 53734
2008-07-17 19:48:53 +00:00
Evan Cheng
48b2f3dfe9 Add nounwind.
llvm-svn: 53733
2008-07-17 19:48:04 +00:00
Dan Gohman
8981962672 Add a new function, ReplaceAllUsesOfValuesWith, which handles bulk
replacement of multiple values. This is slightly more efficient
than doing multiple ReplaceAllUsesOfValueWith calls, and theoretically
could be optimized even further. However, an important property of this
new function is that it handles the case where the source value set and
destination value set overlap. This makes it feasible for isel to use
SelectNodeTo in many very common cases, which is advantageous because
SelectNodeTo avoids a temporary node and it doesn't require CSEMap
updates for users of values that don't change position.

Revamp MorphNodeTo, which is what does all the work of SelectNodeTo, to
handle operand lists more efficiently, and to correctly handle a number
of corner cases to which its new wider use exposes it.

This commit also includes a change to the encoding of post-isel opcodes
in SDNodes; now instead of being sandwiched between the target-independent
pre-isel opcodes and the target-dependent pre-isel opcodes, post-isel
opcodes are now represented as negative values. This makes it possible
to test if an opcode is pre-isel or post-isel without having to know
the size of the current target's post-isel instruction set.

These changes speed up llc overall by 3% and reduce memory usage by 10%
on the InstructionCombining.cpp testcase with -fast and -regalloc=local.

llvm-svn: 53728
2008-07-17 19:10:17 +00:00
Duncan Sands
c3331602f9 LegalizeTypes support for what seems to be the
only missing ppc long double operations: FNEG
and FP_EXTEND.

llvm-svn: 53723
2008-07-17 17:35:14 +00:00
Duncan Sands
778e45e748 Turn LegalizeTypes back off again for the moment:
it is breaking Darwin bootstrap due to missing
functionality.

llvm-svn: 53721
2008-07-17 17:06:03 +00:00