1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 12:02:58 +02:00
Commit Graph

68978 Commits

Author SHA1 Message Date
Chris Lattner
b6a67a9068 Step #2 to improve trip count analysis for loops like this:
void f(int* begin, int* end) { std::fill(begin, end, 0); }

which turns into a != exit expression where one pointer is
strided and (thanks to step #1) known to not overflow, and 
the other is loop invariant.

The observation here is that, though the IV is strided by
4 in this case, that the IV *has* to become equal to the
end value.  It cannot "miss" the end value by stepping over
it, because if it did, the strided IV expression would
eventually wrap around.

Handle this by turning A != B into "A-B != 0" where the A-B
part is known to be NUW.

llvm-svn: 123131
2011-01-09 22:26:35 +00:00
Jakob Stoklund Olesen
785d31a2d2 Remove MachineRegisterInfo::getLastVirtReg(), it was giving wrong results
when no virtual registers have been allocated.

It was only used to resize IndexedMaps, so provide an IndexedMap::resize()
method such that

 Map.grow(MRI.getLastVirtReg());

can be replaced with the simpler

 Map.resize(MRI.getNumVirtRegs());

This works correctly when no virtuals are allocated, and it bypasses the to/from
index conversions.

llvm-svn: 123130
2011-01-09 21:58:20 +00:00
Chris Lattner
f26e71fa4c sort this.
llvm-svn: 123129
2011-01-09 21:31:39 +00:00
Jakob Stoklund Olesen
957748e7ac Teach TargetRegisterInfo how to cram stack slot indexes in with the virtual and
physical register numbers.

This makes the hack used in LiveInterval official, and lets LiveInterval be
oblivious of stack slots.

The isPhysicalRegister() and isVirtualRegister() predicates don't know about
this, so when a variable may contain a stack slot, isStackSlot() should always
be tested first.

llvm-svn: 123128
2011-01-09 21:17:37 +00:00
Chandler Carruth
2a30077fed Add a note about a missed FP optimization.
llvm-svn: 123126
2011-01-09 21:00:19 +00:00
Jakob Stoklund Olesen
ec41e691f0 Fix comment.
llvm-svn: 123125
2011-01-09 19:45:45 +00:00
Chris Lattner
82de29fb76 fix a few old bugs (found by inspection) where we would zap instructions
without informing memdep.  This could cause nondeterminstic weirdness 
based on where instructions happen to get allocated, and will hopefully
breath some life into some broken testers.

llvm-svn: 123124
2011-01-09 19:26:10 +00:00
Jakob Stoklund Olesen
0088b6ffb6 Add a forgotten VireReg2IndexFunctor.
llvm-svn: 123123
2011-01-09 18:58:33 +00:00
Oscar Fuentes
07e2fbd31e Apply -fPIC to C sources too.
llvm-svn: 123122
2011-01-09 17:38:31 +00:00
Tobias Grosser
9899845dd3 Instcombine: Fix pattern where the sext did not dominate the icmp using it
llvm-svn: 123121
2011-01-09 16:00:11 +00:00
Tobias Grosser
8dadf82d11 DominatorTree->print() now prints the status of the DFSNumbers correctly
llvm-svn: 123120
2011-01-09 16:00:09 +00:00
Oscar Fuentes
184f5d0911 Rewrite handling of LLVM_ENABLE_PIC. It was being processed after
config.h was generated, so it had no effect on it.

Thanks to arrowdodger for pointing out this and a tentative patch.

llvm-svn: 123119
2011-01-09 14:34:39 +00:00
Cameron Zwarich
afbf7a9fe3 LoopInstSimplify preserves LoopSimplify.
llvm-svn: 123117
2011-01-09 12:35:16 +00:00
Chandler Carruth
17c1672ea9 Another missed memset in std::vector initialization.
llvm-svn: 123116
2011-01-09 11:29:57 +00:00
Cameron Zwarich
3e060bd398 Eliminate some extra hash table lookups.
llvm-svn: 123115
2011-01-09 10:54:21 +00:00
Cameron Zwarich
4625675112 Add an informative comment.
llvm-svn: 123114
2011-01-09 10:32:30 +00:00
Chandler Carruth
dcbd7b6eaa Fix a cut-paste-o so that the sample code is correct for my last note.
Also, switch to a more clear 'sink' function with its declaration to
avoid any confusion about 'g'. Thanks for the suggestion Frits.

llvm-svn: 123113
2011-01-09 10:10:59 +00:00
Chandler Carruth
3de0da8801 Another missed optimization of trivial vector code.
llvm-svn: 123112
2011-01-09 09:58:36 +00:00
Chandler Carruth
9220d9fa48 Add a note about vector's size-constructor producing dead stores.
llvm-svn: 123111
2011-01-09 09:58:33 +00:00
Jakob Stoklund Olesen
d4dcf22b65 Simplify LiveDebugVariables by storing MachineOperand copies locations instead
of using a Location class with the same information.

When making a copy of a MachineOperand that was already stored in a
MachineInstr, it is necessary to clear the parent pointer on the copy. Otherwise
the register use-def lists become inconsistent.

Add MachineOperand::clearParent() to do that. An alternative would be a custom
MachineOperand copy constructor that cleared ParentMI. I didn't want to do that
because of the performance impact.

llvm-svn: 123109
2011-01-09 05:33:21 +00:00
Jakob Stoklund Olesen
c20baa8f1d Shrink a BitVector that didn't mean to store bits for all physical registers.
llvm-svn: 123108
2011-01-09 03:45:44 +00:00
Jakob Stoklund Olesen
ed53ab1635 Replace TargetRegisterInfo::printReg with a PrintReg class that also works without a TRI instance.
Print virtual registers numbered from 0 instead of the arbitrary
FirstVirtualRegister. The first virtual register is printed as %vreg0.
TRI::NoRegister is printed as %noreg.

llvm-svn: 123107
2011-01-09 03:05:53 +00:00
Jakob Stoklund Olesen
9a7e67d141 Use IndexedMap for MachineRegisterInfo as well. No functional change.
llvm-svn: 123106
2011-01-09 03:05:46 +00:00
Chris Lattner
57e9b35653 teach SCEV analysis of PHI nodes that PHI recurences formed
with GEP instructions are always NUW, because PHIs cannot wrap
the end of the address space.

llvm-svn: 123105
2011-01-09 02:28:48 +00:00
Chris Lattner
fa37cac39c reduce indentation. Print <nuw> and <nsw> when dumping SCEV AddRec's
that have the bit set.

llvm-svn: 123104
2011-01-09 02:16:18 +00:00
Chandler Carruth
815cbfb43c Add a note about a missed memset optimization from std::fill.
llvm-svn: 123103
2011-01-09 01:32:55 +00:00
Jakob Stoklund Olesen
e2e0850651 Fix the last virtual register enumerations.
llvm-svn: 123102
2011-01-08 23:11:11 +00:00
Jakob Stoklund Olesen
f43442c9f7 Fix VirtRegMap to use TRI::index2VirtReg and TRI::virtReg2Index instead of
depending on TRI::FirstVirtualRegister.

Also use TRI::printReg instead of printing virtual registers directly.

llvm-svn: 123101
2011-01-08 23:11:07 +00:00
Jakob Stoklund Olesen
b04c78d5ea Fix a MachineVerifier loop that probably didn't mean to skip the last two
virtual registers.

llvm-svn: 123100
2011-01-08 23:11:02 +00:00
Jakob Stoklund Olesen
aaec1aa6b6 Don't document exactly how virtual registers are represented as integers. Code
shouldn't depend directly on that.

Give an example of how to iterate over all virtual registers in a function
without depending on the representation.

llvm-svn: 123099
2011-01-08 23:10:59 +00:00
Jakob Stoklund Olesen
fb2b53c0de Use an IndexedMap for LiveVariables::VirtRegInfo.
Provide MRI::getNumVirtRegs() and TRI::index2VirtReg() functions to allow
iteration over virtual registers without depending on the representation of
virtual register numbers.

llvm-svn: 123098
2011-01-08 23:10:57 +00:00
Jakob Stoklund Olesen
4bc0b62215 Do not talk about TargetRegisterInfo::FirstVirtualRegister.
llvm-svn: 123097
2011-01-08 23:10:53 +00:00
Jakob Stoklund Olesen
b3820cdc22 Use an IndexedMap for LiveOutRegInfo to hide its dependence on TargetRegisterInfo::FirstVirtualRegister.
llvm-svn: 123096
2011-01-08 23:10:50 +00:00
Cameron Zwarich
33c137a88b Fix coding style.
llvm-svn: 123093
2011-01-08 22:36:53 +00:00
Chris Lattner
7df7b47828 fix a latent bug in memcpyoptimizer that my recent patches exposed: it wasn't
updating memdep when fusing stores together.  This fixes the crash optimizing
the bullet benchmark.

llvm-svn: 123091
2011-01-08 22:19:21 +00:00
Chris Lattner
563e57abbd tryMergingIntoMemset can only handle constant length memsets.
llvm-svn: 123090
2011-01-08 22:11:56 +00:00
Chris Lattner
98136397bd Merge memsets followed by neighboring memsets and other stores into
larger memsets.  Among other things, this fixes rdar://8760394 and
allows us to handle "Example 2" from http://blog.regehr.org/archives/320,
compiling it into a single 4096-byte memset:

_mad_synth_mute:                        ## @mad_synth_mute
## BB#0:                                ## %entry
	pushq	%rax
	movl	$4096, %esi             ## imm = 0x1000
	callq	___bzero
	popq	%rax
	ret

llvm-svn: 123089
2011-01-08 21:19:19 +00:00
Chris Lattner
e09439ed9d fix an issue in IsPointerOffset that prevented us from recognizing that
P and P+1 are relative to the same base pointer.

llvm-svn: 123087
2011-01-08 21:07:56 +00:00
Chris Lattner
20bf2d50b8 enhance memcpyopt to merge a store and a subsequent
memset into a single larger memset.

llvm-svn: 123086
2011-01-08 20:54:51 +00:00
Chris Lattner
71c356a8b0 fit in 80 cols
llvm-svn: 123085
2011-01-08 20:53:41 +00:00
Chris Lattner
756416d4c0 merge two tests and filecheckify
llvm-svn: 123082
2011-01-08 20:27:22 +00:00
Chris Lattner
19aedc848c constify TargetData references.
Split memset formation logic out into its own
"tryMergingIntoMemset" helper function.

llvm-svn: 123081
2011-01-08 20:24:01 +00:00
Chris Lattner
7d3c4712e9 When loop rotation happens, it is *very* common for the duplicated condbr
to be foldable into an uncond branch.  When this happens, we can make a
much simpler CFG for the loop, which is important for nested loop cases
where we want the outer loop to be aggressively optimized.

Handle this case more aggressively.  For example, previously on
phi-duplicate.ll we would get this:


define void @test(i32 %N, double* %G) nounwind ssp {
entry:
  %cmp1 = icmp slt i64 1, 1000
  br i1 %cmp1, label %bb.nph, label %for.end

bb.nph:                                           ; preds = %entry
  br label %for.body

for.body:                                         ; preds = %bb.nph, %for.cond
  %j.02 = phi i64 [ 1, %bb.nph ], [ %inc, %for.cond ]
  %arrayidx = getelementptr inbounds double* %G, i64 %j.02
  %tmp3 = load double* %arrayidx
  %sub = sub i64 %j.02, 1
  %arrayidx6 = getelementptr inbounds double* %G, i64 %sub
  %tmp7 = load double* %arrayidx6
  %add = fadd double %tmp3, %tmp7
  %arrayidx10 = getelementptr inbounds double* %G, i64 %j.02
  store double %add, double* %arrayidx10
  %inc = add nsw i64 %j.02, 1
  br label %for.cond

for.cond:                                         ; preds = %for.body
  %cmp = icmp slt i64 %inc, 1000
  br i1 %cmp, label %for.body, label %for.cond.for.end_crit_edge

for.cond.for.end_crit_edge:                       ; preds = %for.cond
  br label %for.end

for.end:                                          ; preds = %for.cond.for.end_crit_edge, %entry
  ret void
}

Now we get the much nicer:

define void @test(i32 %N, double* %G) nounwind ssp {
entry:
  br label %for.body

for.body:                                         ; preds = %entry, %for.body
  %j.01 = phi i64 [ 1, %entry ], [ %inc, %for.body ]
  %arrayidx = getelementptr inbounds double* %G, i64 %j.01
  %tmp3 = load double* %arrayidx
  %sub = sub i64 %j.01, 1
  %arrayidx6 = getelementptr inbounds double* %G, i64 %sub
  %tmp7 = load double* %arrayidx6
  %add = fadd double %tmp3, %tmp7
  %arrayidx10 = getelementptr inbounds double* %G, i64 %j.01
  store double %add, double* %arrayidx10
  %inc = add nsw i64 %j.01, 1
  %cmp = icmp slt i64 %inc, 1000
  br i1 %cmp, label %for.body, label %for.end

for.end:                                          ; preds = %for.body
  ret void
}

With all of these recent changes, we are now able to compile:

void foo(char *X) {
 for (int i = 0; i != 100; ++i) 
   for (int j = 0; j != 100; ++j)
     X[j+i*100] = 0;
}

into a single memset of 10000 bytes.  This series of changes
should also be helpful for other nested loop scenarios as well.

llvm-svn: 123079
2011-01-08 19:59:06 +00:00
Chris Lattner
e07601d493 make domtree verification print something useful on failure.
llvm-svn: 123078
2011-01-08 19:55:55 +00:00
Chris Lattner
a01093c990 split ssa updating code out to its own helper function. Don't bother
moving the OrigHeader block anymore: we just merge it away anyway so
its code layout doesn't matter.

llvm-svn: 123077
2011-01-08 19:26:33 +00:00
Chris Lattner
e5f41b44c5 Implement a TODO: Enhance loopinfo to merge away the unconditional branch
that it was leaving in loops after rotation (between the original latch
block and the original header.

With this change, it is possible for rotated loops to have just a single
basic block, which is useful.

llvm-svn: 123075
2011-01-08 19:10:28 +00:00
Chris Lattner
145e4ee94e various code cleanups, enhance MergeBlockIntoPredecessor to preserve
loop info.

llvm-svn: 123074
2011-01-08 19:08:40 +00:00
Chris Lattner
b601620bcd inline preserveCanonicalLoopForm now that it is simple.
llvm-svn: 123073
2011-01-08 18:55:50 +00:00
Chris Lattner
db05334c7f Three major changes:
1. Rip out LoopRotate's domfrontier updating code.  It isn't
   needed now that LICM doesn't use DF and it is super complex
   and gross.
2. Make DomTree updating code a lot simpler and faster.  The 
   old loop over all the blocks was just to find a block??
3. Change the code that inserts the new preheader to just use
   SplitCriticalEdge instead of doing an overcomplex 
   reimplementation of it.

No behavior change, except for the name of the inserted preheader.

llvm-svn: 123072
2011-01-08 18:52:51 +00:00
Chris Lattner
f1a01c01d9 reduce nesting.
llvm-svn: 123071
2011-01-08 18:47:43 +00:00