1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-24 03:33:20 +01:00
Commit Graph

1709 Commits

Author SHA1 Message Date
Dan Gohman
d13f674130 Factor the code for collecting IV users out of LSR into an IVUsers class,
and generalize it so that it can be used by IndVarSimplify. Implement the
base IndVarSimplify transformation code using IVUsers. This removes
TestOrigIVForWrap and associated code, as ScalarEvolution now has enough
builtin overflow detection and folding logic to handle all the same cases,
and more. Run "opt -iv-users -analyze -disable-output" on your favorite
loop for an example of what IVUsers does.

This lets IndVarSimplify eliminate IV casts and compute trip counts in
more cases. Also, this happens to finally fix the remaining testcases
in PR1301.

Now that IndVarSimplify is being more aggressive, it occasionally runs
into the problem where ScalarEvolutionExpander's code for avoiding
duplicate expansions makes it difficult to ensure that all expanded
instructions dominate all the instructions that will use them. As a
temporary measure, IndVarSimplify now uses a FixUsesBeforeDefs function
to fix up instructions inserted by SCEVExpander. Fortunately, this code
is contained, and can be easily removed once a more comprehensive
solution is available.

llvm-svn: 71535
2009-05-12 02:17:14 +00:00
Evan Cheng
9b27f3ec42 Teach LSR to optimize more loop exit compares, i.e. change them to use postinc iv value. Previously LSR would only optimize those which are in the loop latch block. However, if LSR can prove it is safe (and profitable), it's now possible to change those not in the latch blocks to use postinc values.
Also, if the compare is the only use, LSR would place the iv increment instruction before the compare instead in the latch.

llvm-svn: 71485
2009-05-11 22:33:01 +00:00
Dale Johannesen
dd32623987 Fix PR4188. TailMerging can't tolerate inexact
sucessor info.

llvm-svn: 71478
2009-05-11 21:54:13 +00:00
Dan Gohman
25ab4c185c Make this grep line a little more specific so that it doesn't
accidentally match something unrelated.

llvm-svn: 71458
2009-05-11 18:49:56 +00:00
Dan Gohman
dfa39efe6d When scalarizing a vector BITCAST, check whether the operand has vector
type, rather than assume that it does. If the operand is not vector, it
shouldn't be run through ScalarizeVectorOp. This fixes one of the
testcases in PR3886.

llvm-svn: 71453
2009-05-11 18:30:42 +00:00
Dan Gohman
0edabc8a6f Convert a subtract into a negate and an add when it helps x86
address folding.

llvm-svn: 71446
2009-05-11 18:02:53 +00:00
Dale Johannesen
f86e34065b Reverse a loop that is counting up to a maximum to
count down to 0 instead, under very restricted
circumstances.  Adjust 4 testcases in which this
optimization fires.

llvm-svn: 71439
2009-05-11 17:15:42 +00:00
Anton Korobeynikov
fe1c6d85b8 Add MSP430 test for PR4136
llvm-svn: 71392
2009-05-10 14:48:36 +00:00
Evan Cheng
06b0d3879e Enable loop bb placement optimization.
llvm-svn: 71291
2009-05-08 23:35:49 +00:00
Chris Lattner
7b2dabcac9 Fix PR4152: asm constraint validation happens before dag combine, so we
need to work a bit to combine things like (x+c1+c2) into x+c3.

llvm-svn: 71232
2009-05-08 18:23:14 +00:00
Evan Cheng
2a1d20b0fb Optimize code placement in loop to eliminate unconditional branches or move unconditional branch to the outside of the loop. e.g.
///       A:                                                                                                                                                                 
///       ...                                                                                                                                                                
///       <fallthrough to B>                                                                                                                                                 
///                                                                                                                                                                          
///       B:  --> loop header                                                                                                                                                
///       ...                                                                                                                                                                
///       jcc <cond> C, [exit]                                                                                                                                               
///                                                                                                                                                                          
///       C:                                                                                                                                                                 
///       ...                                                                                                                                                                
///       jmp B                                                                                                                                                              
///                                                                                                                                                                          
/// ==>                                                                                                                                                                      
///                                                                                                                                                                          
///       A:                                                                                                                                                                 
///       ...                                                                                                                                                                
///       jmp B                                                                                                                                                              
///                                                                                                                                                                          
///       C:  --> new loop header                                                                                                                                            
///       ...                                                                                                                                                                
///       <fallthough to B>                                                                                                                                                  
///                                                                                                                                                                          
///       B:                                                                                                                                                                 
///       ...                                                                                                                                                                
///       jcc <cond> C, [exit] 

llvm-svn: 71209
2009-05-08 06:34:09 +00:00
Bob Wilson
d61f4e70d8 Fix pr4100. Do not remove no-op copies when they are dead. The register
scavenger gets confused about register liveness if it doesn't see them.
I'm not thrilled with this solution, but it only comes up when there are dead
copies in the code, which is something that hopefully doesn't happen much.

Here is what happens in pr4100: As shown in the following excerpt from the
debug output of llc, the source of a move gets reloaded from the stack,
inserting a new load instruction before the move.  Since that source operand
is a kill, the physical register is free to be reused for the destination
of the move.  The move ends up being a no-op, copying R3 to R3, so it is
deleted.  But, it leaves behind the load to reload %reg1028 into R3, and
that load is not updated to show that it's destination operand (R3) is dead.
The scavenger gets confused by that load because it thinks that R3 is live.

Starting RegAlloc of: %reg1025<def,dead> = MOVr %reg1028<kill>, 14, %reg0, %reg0
  Regs have values: 
  Reloading %reg1028 into R3
  Last use of R3[%reg1028], removing it from live set
  Assigning R3 to %reg1025
  Register R3 [%reg1025] is never used, removing it from live set

Alternative solutions might be either marking the load as dead, or zapping
the load along with the no-op copy.  I couldn't see an easy way to do
either of those, though.

llvm-svn: 71196
2009-05-07 23:47:03 +00:00
Bill Wendling
864cbcfc46 THis doesn't fail.
llvm-svn: 71142
2009-05-07 01:41:42 +00:00
Bill Wendling
7c50dcd02e Temporarily revert r71010. It was causing massive failures during self-hosting.
llvm-svn: 71138
2009-05-07 01:27:25 +00:00
Evan Cheng
0ee6696fd8 Do not use register as base ptr of pre- and post- inc/dec load / store nodes.
llvm-svn: 71098
2009-05-06 18:25:01 +00:00
Lang Hames
fcc5ebb1d4 Renamed Spiller classes (plus uses and related files) to VirtRegRewriter.
llvm-svn: 71057
2009-05-06 02:36:21 +00:00
Evan Cheng
0d781df8dc Quotes should be printed before private prefix; some code clean up.
llvm-svn: 71032
2009-05-05 22:50:29 +00:00
Dan Gohman
5e839321f2 If a MachineBasicBlock has multiple ways of reaching another block,
allow it to have multiple CFG edges to that block. This is needed
to allow MachineBasicBlock::isOnlyReachableByFallthrough to work
correctly. This fixes PR4126.

llvm-svn: 71018
2009-05-05 21:10:19 +00:00
Evan Cheng
984da04cd0 Enable stack coloring with regs at -O3.
llvm-svn: 71010
2009-05-05 20:30:36 +00:00
Chris Lattner
5cc9a36d1c Add basic support for code generation of
addrspace(257) -> FS relative on x86.  Patch by Zoltan Varga!

llvm-svn: 70992
2009-05-05 18:52:19 +00:00
Dan Gohman
2973567a95 X86FastISel doesn't support the -tailcallopt ABI.
llvm-svn: 70902
2009-05-04 19:50:33 +00:00
Anton Korobeynikov
262a397978 Fix code emission for conditional branches.
Patch by Collin Winter!

llvm-svn: 70898
2009-05-04 19:10:38 +00:00
Dan Gohman
a79cce4aef Previously, RecursivelyDeleteDeadInstructions provided an option
of returning a list of pointers to Values that are deleted. This was
unsafe, because the pointers in the list are, by nature of what
RecursivelyDeleteDeadInstructions does, always dangling. Replace this
with a simple callback mechanism. This may eventually be removed if
all clients can reasonably be expected to use CallbackVH.

Use this to factor out the dead-phi-cycle-elimination code from LSR
utility function, and generalize it to use the
RecursivelyDeleteTriviallyDeadInstructions utility function.

This makes LSR more aggressive about eliminating dead PHI cycles;
adjust tests to either be less trivial or to simply expect fewer
instructions.

llvm-svn: 70636
2009-05-02 18:29:22 +00:00
Chris Lattner
c6d561ed27 'The attached patch fixes an issue where llc -march=cpp fails with
"Invalid primitive type" on input containing the x86_fp80 type.'
Patch by Collin Winter!

llvm-svn: 70610
2009-05-01 23:54:26 +00:00
Evan Cheng
b7d41a6680 Mark MOV8mr_NOREX and MOV8rm_NOREX as mayStore / mayLoad respectively.
llvm-svn: 70461
2009-04-30 00:58:57 +00:00
Chris Lattner
794fb5b4b3 fix a regression handling indirect results: these need to be considered
memory operands otherwise the writebacks get lost when the inline asm 
doesn't otherwise have side effects.  This fixes rdar://6839427, though
clang really shouldn't generate these anymore.

llvm-svn: 70455
2009-04-30 00:48:50 +00:00
Nate Begeman
b407809122 Fix infinite recursion in the C++ code which handles movddup by making it unnecessary.
llvm-svn: 70425
2009-04-29 22:47:44 +00:00
Evan Cheng
62fdc300dd spillPhysRegAroundRegDefsUses() may have invalidated iterators stored in fixed_ IntervalPtrs. Reset them.
llvm-svn: 70378
2009-04-29 07:16:34 +00:00
Chris Lattner
e1eefefdc3 Disable the load-shrinking optimization from looking at
anything larger than 64-bits, avoiding a crash.  This should
really be fixed to use APInts, though type legalization happens
to help us out and we get good code on the attached testcase at
least.

This fixes rdar://6836460

llvm-svn: 70360
2009-04-29 03:45:07 +00:00
Bill Wendling
7546bed590 Second attempt:
Massive check in. This changes the "-fast" flag to "-O#" in llc. If you want to
use the old behavior, the flag is -O0. This change allows for finer-grained
control over which optimizations are run at different -O levels.

Most of this work was pretty mechanical. The majority of the fixes came from
verifying that a "fast" variable wasn't used anymore. The JIT still uses a
"Fast" flag. I'll change the JIT with a follow-up patch.

llvm-svn: 70343
2009-04-29 00:15:41 +00:00
Anton Korobeynikov
1799ac4b55 Properly print 'P' modifier on inline asm memory operands.
This should fix PR3379 and PR4064.
Patch inspired by Edwin Török!

llvm-svn: 70328
2009-04-28 21:49:33 +00:00
Evan Cheng
754a0d2f9e Fix PR4034. Bug in LiveInterval::join when it's compacting new valno's.
llvm-svn: 70291
2009-04-28 06:24:09 +00:00
Evan Cheng
8a9736a26c Fix for PR4051. When 2address pass delete an instruction, update kill info when necessary.
llvm-svn: 70279
2009-04-28 02:12:36 +00:00
Bill Wendling
ef47ace92f r70270 isn't ready yet. Back this out. Sorry for the noise.
llvm-svn: 70275
2009-04-28 01:04:53 +00:00
Bill Wendling
2799e916c3 Massive check in. This changes the "-fast" flag to "-O#" in llc. If you want to
use the old behavior, the flag is -O0. This change allows for finer-grained
control over which optimizations are run at different -O levels.

Most of this work was pretty mechanical. The majority of the fixes came from
verifying that a "fast" variable wasn't used anymore. The JIT still uses a
"Fast" flag. I'm not 100% sure if it's necessary to change it there...

llvm-svn: 70270
2009-04-28 00:21:31 +00:00
Evan Cheng
c315cf24e3 Fix PR4076. Correctly create live interval of physical register with two-address update.
llvm-svn: 70245
2009-04-27 20:42:46 +00:00
Dan Gohman
e1a532cb4f Permit ChangeCompareStride to rewrite a comparison when the factor
between the comparison's iv stride and the candidate stride is
exactly -1.

llvm-svn: 70244
2009-04-27 20:35:32 +00:00
Dan Gohman
ff30ebd710 Teach getZeroExtendExpr and getSignExtendExpr to use trip-count
information to simplify [sz]ext({a,+,b}) to {zext(a),+,[zs]ext(b)},
as appropriate.

These functions and the trip count code each call into the other, so
this requires careful handling to avoid infinite recursion. During
the initial trip count computation, conservative SCEVs are used,
which are subsequently discarded once the trip count is actually
known.

Among other benefits, this change lets LSR automatically eliminate
some unnecessary zext-inreg and sext-inreg operation where the
operand is an induction variable.

llvm-svn: 70241
2009-04-27 20:16:15 +00:00
Nate Begeman
7902a2344d Revert accidental testcase reduction
llvm-svn: 70226
2009-04-27 18:42:40 +00:00
Nate Begeman
9d121924fd 2nd attempt, fixing SSE4.1 issues and implementing feedback from duncan.
PR2957

ISD::VECTOR_SHUFFLE now stores an array of integers representing the shuffle
mask internal to the node, rather than taking a BUILD_VECTOR of ConstantSDNodes
as the shuffle mask.  A value of -1 represents UNDEF.

In addition to eliminating the creation of illegal BUILD_VECTORS just to 
represent shuffle masks, we are better about canonicalizing the shuffle mask,
resulting in substantially better code for some classes of shuffles.

llvm-svn: 70225
2009-04-27 18:41:29 +00:00
Evan Cheng
43fc90ae59 Fix PR4056. It's possible a physical register def is dead if its implicit use is deleted by two-address pass.
llvm-svn: 70213
2009-04-27 17:36:47 +00:00
Dan Gohman
4aeebb184b Fix the syntax for a PR number in a test.
llvm-svn: 70208
2009-04-27 15:08:34 +00:00
Dan Gohman
744f455d55 When transforming sext(trunc(load(x))) into sext(smaller load(x)),
the trunc is directly replaced with the smaller load, so don't
try to create a new sext node. This fixes PR4050.

llvm-svn: 70179
2009-04-27 02:00:55 +00:00
Evan Cheng
696a04eba2 Do not share a single unknown val# for all the live ranges merged into a physical sub-register live interval. When coalescer is merging in clobbered virtaul register live interval into a physical register live interval, give each virtual register val# a separate val# in the physical register live interval. Otherwise, the coalescer would have lost track of the definitions information it needs to make correct coalescing decisions.
llvm-svn: 70026
2009-04-25 09:25:19 +00:00
Dale Johannesen
493c3bcdc0 Fix PR 4057, a crash doing float->char const folding.
This particular one is undefined behavior (although this
isn't related to the crash), so it will no longer do it
at compile time, which seems better.

llvm-svn: 69990
2009-04-24 21:34:13 +00:00
Rafael Espindola
4e7a0bf1f1 Fix PR 4004 by including the call to __tls_get_addr in X86tlsaddr. This is not
very elegant, but neither is the tls specification :-(

llvm-svn: 69968
2009-04-24 12:59:40 +00:00
Rafael Espindola
0b1037ad26 Revert 69952. Causes testsuite failures on linux x86-64.
llvm-svn: 69967
2009-04-24 12:40:33 +00:00
Nate Begeman
c1a09c7dfa PR2957
ISD::VECTOR_SHUFFLE now stores an array of integers representing the shuffle
mask internal to the node, rather than taking a BUILD_VECTOR of ConstantSDNodes
as the shuffle mask.  A value of -1 represents UNDEF.

In addition to eliminating the creation of illegal BUILD_VECTORS just to 
represent shuffle masks, we are better about canonicalizing the shuffle mask,
resulting in substantially better code for some classes of shuffles.

A clean up of x86 shuffle code, and some canonicalizing in DAGCombiner is next.

llvm-svn: 69952
2009-04-24 03:42:54 +00:00
Dan Gohman
3499a53e1d Explicitly pass -tailcallopt=false to these tests so that they
work as intended no matter what the default setting of that
option is.

llvm-svn: 69911
2009-04-23 19:39:41 +00:00
Evan Cheng
a36c6c6819 It has finally happened. Spiller is now using live interval info.
This fixes a very subtle bug. vr defined by an implicit_def is allowed overlap with any register since it doesn't actually modify anything. However, if it's used as a two-address use, its live range can be extended and it can be spilled. The spiller must take care not to emit a reload for the vn number that's defined by the implicit_def. This is both a correctness and performance issue.

llvm-svn: 69743
2009-04-21 22:46:52 +00:00