mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-20 03:23:01 +02:00
7545 Commits

Author SHA1 Message Date
Devang Patel
52db9b4821 Use SmallVector instead of SmallPtrSet and avoid non-deterministic behavior.
llvm-svn: 123318
2011-01-12 19:12:45 +00:00
Chris Lattner
fdef60bef7 revert 123144, reenabling the rest of memset formation.
llvm-svn: 123302
2011-01-12 03:25:15 +00:00
Chris Lattner
e288204194 revert r123146 which disabled code that wasn't the root cause
of the bootstrap miscompare issue.

llvm-svn: 123299
2011-01-12 01:52:23 +00:00
Chris Lattner
c7a5a12af5 revert r123149, reenabling an improvement to memcpyopt that wasn't
the source of the bootstrap problem.

llvm-svn: 123298
2011-01-12 01:43:46 +00:00
Jakob Stoklund Olesen
b935aa1678 Remove the PR8954 workaround.
llvm-svn: 123288
2011-01-11 22:56:41 +00:00
Jakob Stoklund Olesen
37fe53c1a9 Fix a non-deterministic loop in llvm::MergeBlockIntoPredecessor.
DT->changeImmediateDominator() trivially ignores identity updates, so there is
really no need for the uniqueing provided by SmallPtrSet.

I expect this to fix PR8954.

llvm-svn: 123286
2011-01-11 22:54:38 +00:00
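
The reasoning in the message above (a pointer-keyed set iterates in an order that depends on addresses, while a plain vector iterates in insertion order, and duplicates are harmless when the update ignores identity changes) can be illustrated with a small sketch. Everything here is a hypothetical stand-in built on standard containers: `ToyDomTree`, its `Log` member, and `reparent` are invented for illustration and are not LLVM's DominatorTree API.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a dominator tree: maps a block name to the name
// of its immediate dominator. This is NOT LLVM's DominatorTree.
struct ToyDomTree {
  std::map<std::string, std::string> IDom;
  std::vector<std::string> Log; // blocks whose idom really changed, in order

  void changeImmediateDominator(const std::string &Block,
                                const std::string &NewIDom) {
    auto It = IDom.find(Block);
    if (It != IDom.end() && It->second == NewIDom)
      return; // identity update: nothing to do
    IDom[Block] = NewIDom;
    Log.push_back(Block);
  }
};

// Walk the children in insertion order. A repeated entry is harmless because
// the second update is an identity update, so no uniquing set is needed.
std::vector<std::string> reparent(ToyDomTree &DT,
                                  const std::vector<std::string> &Children,
                                  const std::string &NewIDom) {
  for (const std::string &C : Children)
    DT.changeImmediateDominator(C, NewIDom);
  return DT.Log;
}
```

With the children listed as c, a, c, b, the updates are applied in exactly that order on every run, and the second c is skipped.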
Cameron Zwarich
b0e688cc7c Dial back the speculative fix for PR8954 a bit, so that we only recompute dominators
once at the beginning of GVN instead of once per iteration.

llvm-svn: 123278
2011-01-11 22:14:42 +00:00
Cameron Zwarich
e8c1c44e01 Attempt to fix the bootstrap buildbot. Rafael says this works for him on x86-64 Linux.
llvm-svn: 123270
2011-01-11 20:23:34 +00:00
Owen Anderson
a82627567b Remove dead variable, const-ref-ize an APInt.
llvm-svn: 123248
2011-01-11 18:26:37 +00:00
Chris Lattner
1674862564 this pass claims to preserve scev, make sure to tell it about deletions.
llvm-svn: 123247
2011-01-11 18:14:50 +00:00
Frits van Bommel
f5bd48972a Factor the actual simplification out of SimplifyIndirectBrOnSelect and into a new helper function so it can be reused in e.g. an upcoming SimplifySwitchOnSelect.
No functional change.

llvm-svn: 123234
2011-01-11 12:52:11 +00:00
Chris Lattner
e5688368c6 update memdep when an instruction is deleted. This code isn't
actually reached in the testcase in PR8954, but it's safe and good
practice.

llvm-svn: 123224
2011-01-11 08:19:16 +00:00
Chris Lattner
a82a6cfe6d when MergeBlockIntoPredecessor merges two blocks, update MemDep if it
is floating around in the ether.

llvm-svn: 123223
2011-01-11 08:16:49 +00:00
Chris Lattner
dc7b2160ba Fix FoldSingleEntryPHINodes to update memdep and AA when it deletes
phi nodes.  It is called from MergeBlockIntoPredecessor which is 
called from GVN, which claims to preserve these.

I'm skeptical that this is the actual problem behind PR8954, but
this is a stab in the right direction.

llvm-svn: 123222
2011-01-11 08:13:40 +00:00
Chris Lattner
b1a9c9ed36 random cleanups
llvm-svn: 123221
2011-01-11 08:00:40 +00:00
Chris Lattner
5731a92f5b remove a bogus assertion: the latch block of a loop is not
necessarily an uncond branch to the header.  This fixes 
PR8955 (the assertion tripping).

llvm-svn: 123219
2011-01-11 07:47:59 +00:00
Owen Anderson
4479341626 Fix a random missed optimization by making InstCombine more aggressive when determining which bits are demanded by
a comparison against a constant.

llvm-svn: 123203
2011-01-11 00:36:45 +00:00
Chandler Carruth
772e26df36 Teach instcombine about the rest of the SSE and SSE2 conversion
intrinsics element dependencies. Reviewed by Nick.

llvm-svn: 123161
2011-01-10 07:19:37 +00:00
Chris Lattner
1404348022 another random stab in the dark trying to fix llvm-gcc-i386-linux-selfhost
llvm-svn: 123149
2011-01-10 02:34:11 +00:00
Chris Lattner
b5562212e2 another (more) aggressive attempt to bring llvm-gcc-i386-linux-selfhost
back to life.

llvm-svn: 123146
2011-01-10 00:47:34 +00:00
Chris Lattner
e8e9ec58bf temporarily disable memset formation from memsets in an effort to restore buildbot stability.
llvm-svn: 123144
2011-01-09 23:52:48 +00:00
Chris Lattner
82de29fb76 fix a few old bugs (found by inspection) where we would zap instructions
without informing memdep.  This could cause nondeterministic weirdness 
based on where instructions happen to get allocated, and will hopefully
breathe some life into some broken testers.

llvm-svn: 123124
2011-01-09 19:26:10 +00:00
Tobias Grosser
9899845dd3 Instcombine: Fix pattern where the sext did not dominate the icmp using it
llvm-svn: 123121
2011-01-09 16:00:11 +00:00
Cameron Zwarich
afbf7a9fe3 LoopInstSimplify preserves LoopSimplify.
llvm-svn: 123117
2011-01-09 12:35:16 +00:00
Chris Lattner
fa37cac39c reduce indentation. Print <nuw> and <nsw> when dumping SCEV AddRec's
that have the bit set.

llvm-svn: 123104
2011-01-09 02:16:18 +00:00
Chris Lattner
7df7b47828 fix a latent bug in memcpyoptimizer that my recent patches exposed: it wasn't
updating memdep when fusing stores together.  This fixes the crash optimizing
the bullet benchmark.

llvm-svn: 123091
2011-01-08 22:19:21 +00:00
Chris Lattner
563e57abbd tryMergingIntoMemset can only handle constant length memsets.
llvm-svn: 123090
2011-01-08 22:11:56 +00:00
Chris Lattner
98136397bd Merge memsets followed by neighboring memsets and other stores into
larger memsets.  Among other things, this fixes rdar://8760394 and
allows us to handle "Example 2" from http://blog.regehr.org/archives/320,
compiling it into a single 4096-byte memset:

_mad_synth_mute:                        ## @mad_synth_mute
## BB#0:                                ## %entry
	pushq	%rax
	movl	$4096, %esi             ## imm = 0x1000
	callq	___bzero
	popq	%rax
	ret

llvm-svn: 123089
2011-01-08 21:19:19 +00:00
Chris Lattner
e09439ed9d fix an issue in IsPointerOffset that prevented us from recognizing that
P and P+1 are relative to the same base pointer.

llvm-svn: 123087
2011-01-08 21:07:56 +00:00
Chris Lattner
20bf2d50b8 enhance memcpyopt to merge a store and a subsequent
memset into a single larger memset.

llvm-svn: 123086
2011-01-08 20:54:51 +00:00
Chris Lattner
19aedc848c constify TargetData references.
Split memset formation logic out into its own
"tryMergingIntoMemset" helper function.

llvm-svn: 123081
2011-01-08 20:24:01 +00:00
Chris Lattner
7d3c4712e9 When loop rotation happens, it is *very* common for the duplicated condbr
to be foldable into an uncond branch.  When this happens, we can make a
much simpler CFG for the loop, which is important for nested loop cases
where we want the outer loop to be aggressively optimized.

Handle this case more aggressively.  For example, previously on
phi-duplicate.ll we would get this:


define void @test(i32 %N, double* %G) nounwind ssp {
entry:
  %cmp1 = icmp slt i64 1, 1000
  br i1 %cmp1, label %bb.nph, label %for.end

bb.nph:                                           ; preds = %entry
  br label %for.body

for.body:                                         ; preds = %bb.nph, %for.cond
  %j.02 = phi i64 [ 1, %bb.nph ], [ %inc, %for.cond ]
  %arrayidx = getelementptr inbounds double* %G, i64 %j.02
  %tmp3 = load double* %arrayidx
  %sub = sub i64 %j.02, 1
  %arrayidx6 = getelementptr inbounds double* %G, i64 %sub
  %tmp7 = load double* %arrayidx6
  %add = fadd double %tmp3, %tmp7
  %arrayidx10 = getelementptr inbounds double* %G, i64 %j.02
  store double %add, double* %arrayidx10
  %inc = add nsw i64 %j.02, 1
  br label %for.cond

for.cond:                                         ; preds = %for.body
  %cmp = icmp slt i64 %inc, 1000
  br i1 %cmp, label %for.body, label %for.cond.for.end_crit_edge

for.cond.for.end_crit_edge:                       ; preds = %for.cond
  br label %for.end

for.end:                                          ; preds = %for.cond.for.end_crit_edge, %entry
  ret void
}

Now we get the much nicer:

define void @test(i32 %N, double* %G) nounwind ssp {
entry:
  br label %for.body

for.body:                                         ; preds = %entry, %for.body
  %j.01 = phi i64 [ 1, %entry ], [ %inc, %for.body ]
  %arrayidx = getelementptr inbounds double* %G, i64 %j.01
  %tmp3 = load double* %arrayidx
  %sub = sub i64 %j.01, 1
  %arrayidx6 = getelementptr inbounds double* %G, i64 %sub
  %tmp7 = load double* %arrayidx6
  %add = fadd double %tmp3, %tmp7
  %arrayidx10 = getelementptr inbounds double* %G, i64 %j.01
  store double %add, double* %arrayidx10
  %inc = add nsw i64 %j.01, 1
  %cmp = icmp slt i64 %inc, 1000
  br i1 %cmp, label %for.body, label %for.end

for.end:                                          ; preds = %for.body
  ret void
}

With all of these recent changes, we are now able to compile:

void foo(char *X) {
 for (int i = 0; i != 100; ++i) 
   for (int j = 0; j != 100; ++j)
     X[j+i*100] = 0;
}

into a single memset of 10000 bytes.  This series of changes
should also be helpful for other nested loop scenarios as well.

llvm-svn: 123079
2011-01-08 19:59:06 +00:00
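
As a sanity check on the claim in the message above, the nested loop writes every byte of a 100 x 100 region exactly once, so its effect matches a single 10000-byte memset. The sketch below only compares the two results on a buffer; `fooAsMemset` and `sameResult` are hypothetical helper names, and nothing here shows how the optimizer actually arrives at the transformation.

```cpp
#include <cstring>
#include <vector>

// The loop nest from the commit message.
void foo(char *X) {
  for (int i = 0; i != 100; ++i)
    for (int j = 0; j != 100; ++j)
      X[j + i * 100] = 0;
}

// Hypothetical hand-written equivalent: one memset covering 100 * 100 bytes.
void fooAsMemset(char *X) { std::memset(X, 0, 10000); }

// Fill two buffers with a non-zero byte, run both versions, compare.
bool sameResult() {
  std::vector<char> A(10000, 'x'), B(10000, 'x');
  foo(A.data());
  fooAsMemset(B.data());
  return A == B;
}
```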
Chris Lattner
a01093c990 split ssa updating code out to its own helper function. Don't bother
moving the OrigHeader block anymore: we just merge it away anyway so
its code layout doesn't matter.

llvm-svn: 123077
2011-01-08 19:26:33 +00:00
Chris Lattner
e5f41b44c5 Implement a TODO: Enhance loopinfo to merge away the unconditional branch
that it was leaving in loops after rotation (between the original latch
block and the original header).

With this change, it is possible for rotated loops to have just a single
basic block, which is useful.

llvm-svn: 123075
2011-01-08 19:10:28 +00:00
Chris Lattner
145e4ee94e various code cleanups, enhance MergeBlockIntoPredecessor to preserve
loop info.

llvm-svn: 123074
2011-01-08 19:08:40 +00:00
Chris Lattner
b601620bcd inline preserveCanonicalLoopForm now that it is simple.
llvm-svn: 123073
2011-01-08 18:55:50 +00:00
Chris Lattner
db05334c7f Three major changes:
1. Rip out LoopRotate's domfrontier updating code.  It isn't
   needed now that LICM doesn't use DF and it is super complex
   and gross.
2. Make DomTree updating code a lot simpler and faster.  The 
   old loop over all the blocks was just to find a block??
3. Change the code that inserts the new preheader to just use
   SplitCriticalEdge instead of doing an overcomplex 
   reimplementation of it.

No behavior change, except for the name of the inserted preheader.

llvm-svn: 123072
2011-01-08 18:52:51 +00:00
Chris Lattner
f1a01c01d9 reduce nesting.
llvm-svn: 123071
2011-01-08 18:47:43 +00:00
Chris Lattner
2fb7c9c272 LoopRotate requires canonical loop form, so it always has preheaders
and latch blocks.  Reorder entry conditions to make the pass faster
and more logical.

llvm-svn: 123069
2011-01-08 18:06:22 +00:00
Chris Lattner
c6d905407b use the LI ivar.
llvm-svn: 123068
2011-01-08 17:49:51 +00:00
Chris Lattner
0893592a39 some cleanups: remove dead arguments and eliminate ivars
that are just passed to one function.

llvm-svn: 123067
2011-01-08 17:48:33 +00:00
Chris Lattner
ab9a79eda3 fix an issue duncan pointed out, which could cause loop rotate
to violate LCSSA form

llvm-svn: 123066
2011-01-08 17:38:45 +00:00
Cameron Zwarich
9a35e69c7d Fix coding style issues.
llvm-svn: 123065
2011-01-08 17:07:11 +00:00
Cameron Zwarich
a40df277f1 Make more passes preserve dominators (or state that they preserve dominators if
they already do). This removes two dominator recomputations prior to isel,
which is a 1% improvement in total llc time for 403.gcc.

The only potentially suspect thing is making GCStrategy recompute dominators if
it used a custom lowering strategy.

llvm-svn: 123064
2011-01-08 17:01:52 +00:00
Cameron Zwarich
e97e0555d5 Contract subloop bodies. However, it is still important to visit the phis at the
top of subloop headers, as the phi uses logically occur outside of the subloop.

llvm-svn: 123062
2011-01-08 15:52:22 +00:00
Frits van Bommel
966cc00809 Fix a bug in r123034 (trying to sext/zext non-integers) and clean up a little.
llvm-svn: 123061
2011-01-08 10:51:36 +00:00
Chris Lattner
6729ce1c33 Have loop-rotate simplify instructions (yay instsimplify!) as it clones
them into the loop preheader, eliminating silly instructions like
"icmp i32 0, 100" in fixed tripcount loops.  This also better exposes the 
bigger problem with loop rotate that I'd like to fix: once this has been
folded, the duplicated conditional branch *often* turns into an uncond branch.

Not aggressively handling this is pessimizing later loop optimizations 
somethin' fierce by making "dominates all exit blocks" checks fail.

llvm-svn: 123060
2011-01-08 08:24:46 +00:00
Chris Lattner
397937fa0d Revamp the ValueMapper interfaces in a couple ways:
1. Take a flags argument instead of a bool.  This makes
   it more clear to the reader what it is used for.
2. Add a flag that says that "remapping a value not in the
   map is ok".
3. Reimplement MapValue to share a bunch of code and be a lot
   more efficient.  For lookup failures, don't drop null values
   into the map.
4. Using the new flag a bunch of code can vaporize in LinkModules
   and LoopUnswitch, kill it.

No functionality change.

llvm-svn: 123058
2011-01-08 08:15:20 +00:00
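
Points 1 to 3 of the message above (a flags argument instead of a bool, a flag that tolerates values missing from the map, and lookup failures that do not insert into the map) can be sketched roughly as follows. The names `RemapFlagsSketch`, `RFS_None`, `RFS_IgnoreMissingEntries`, and `mapValueSketch` are assumptions made for this illustration, and strings stand in for IR values; this is not the ValueMapper.h interface.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical flag set; a named flag at the call site reads better than a
// bare true/false argument.
enum RemapFlagsSketch : unsigned {
  RFS_None = 0,
  RFS_IgnoreMissingEntries = 1u << 0 // a value absent from the map is OK
};

using ToyValueMap = std::map<std::string, std::string>;

// Returns {found, result}. Uses find() rather than operator[] so a failed
// lookup never inserts an empty entry into the map.
std::pair<bool, std::string> mapValueSketch(const ToyValueMap &VM,
                                            const std::string &V,
                                            unsigned Flags) {
  auto It = VM.find(V);
  if (It != VM.end())
    return {true, It->second};
  if (Flags & RFS_IgnoreMissingEntries)
    return {true, V}; // pass the unmapped value through unchanged
  return {false, std::string()};
}
```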
Chris Lattner
9700b4d5ce two minor changes: switch to the standard ValueToValueMapTy
map from ValueMapper.h (giving us access to its utilities)
and add a fastpath in the loop rotation code, avoiding expensive
ssa updator manipulation for values with nothing to update.

llvm-svn: 123057
2011-01-08 07:21:31 +00:00
Tobias Grosser
48469b566a InstCombine: Match min/max hidden by sext/zext
X = sext x; x >s c ? X : C+1 --> X = sext x; X <s C+1 ? C+1 : X
X = sext x; x <s c ? X : C-1 --> X = sext x; X >s C-1 ? C-1 : X
X = zext x; x >u c ? X : C+1 --> X = zext x; X <u C+1 ? C+1 : X
X = zext x; x <u c ? X : C-1 --> X = zext x; X >u C-1 ? C-1 : X
X = sext x; x >u c ? X : C+1 --> X = sext x; X <u C+1 ? C+1 : X
X = sext x; x <u c ? X : C-1 --> X = sext x; X >u C-1 ? C-1 : X

Instead of calculating this with mixed types promote all to the
larger type. This enables scalar evolution to analyze this
expression. PR8866

llvm-svn: 123034
2011-01-07 21:33:14 +00:00
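
The first two rewrites listed above (the signed compare paired with sext) can be checked exhaustively for a narrow type. The sketch below assumes an 8-bit source type widened to 32 bits, with `C` taken to be the sign-extended form of the narrow constant `c`; the function name is hypothetical. It does not cover the zext rows or the mixed sext/unsigned rows, which may need extra conditions on the constant.

```cpp
#include <cstdint>

// Exhaustively compare the "before" and "after" forms of the two signed
// patterns for every 8-bit x and every 8-bit constant c.
bool signedSextPatternsHold() {
  for (int xi = INT8_MIN; xi <= INT8_MAX; ++xi) {
    for (int ci = INT8_MIN; ci <= INT8_MAX; ++ci) {
      int8_t x = static_cast<int8_t>(xi);
      int8_t c = static_cast<int8_t>(ci);
      int32_t X = x; // X = sext x
      int32_t C = c; // widened constant

      // x >s c ? X : C+1   -->   X <s C+1 ? C+1 : X   (a max in the wide type)
      int32_t beforeMax = (x > c) ? X : C + 1;
      int32_t afterMax = (X < C + 1) ? C + 1 : X;

      // x <s c ? X : C-1   -->   X >s C-1 ? C-1 : X   (a min in the wide type)
      int32_t beforeMin = (x < c) ? X : C - 1;
      int32_t afterMin = (X > C - 1) ? C - 1 : X;

      if (beforeMax != afterMax || beforeMin != afterMin)
        return false;
    }
  }
  return true;
}
```

Because the comparison and both select arms end up in the wider type, the rewritten form is a plain max or min of two same-typed values.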