1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-28 14:32:51 +01:00
Commit Graph

68951 Commits

Author SHA1 Message Date
Jakob Stoklund Olesen
f43442c9f7 Fix VirtRegMap to use TRI::index2VirtReg and TRI::virtReg2Index instead of
depending on TRI::FirstVirtualRegister.

Also use TRI::printReg instead of printing virtual registers directly.

llvm-svn: 123101
2011-01-08 23:11:07 +00:00
Jakob Stoklund Olesen
b04c78d5ea Fix a MachineVerifier loop that probably didn't mean to skip the last two
virtual registers.

llvm-svn: 123100
2011-01-08 23:11:02 +00:00
Jakob Stoklund Olesen
aaec1aa6b6 Don't document exactly how virtual registers are represented as integers. Code
shouldn't depend directly on that.

Give an example of how to iterate over all virtual registers in a function
without depending on the representation.

llvm-svn: 123099
2011-01-08 23:10:59 +00:00
Jakob Stoklund Olesen
fb2b53c0de Use an IndexedMap for LiveVariables::VirtRegInfo.
Provide MRI::getNumVirtRegs() and TRI::index2VirtReg() functions to allow
iteration over virtual registers without depending on the representation of
virtual register numbers.

llvm-svn: 123098
2011-01-08 23:10:57 +00:00
Jakob Stoklund Olesen
4bc0b62215 Do not talk about TargetRegisterInfo::FirstVirtualRegister.
llvm-svn: 123097
2011-01-08 23:10:53 +00:00
Jakob Stoklund Olesen
b3820cdc22 Use an IndexedMap for LiveOutRegInfo to hide its dependence on TargetRegisterInfo::FirstVirtualRegister.
llvm-svn: 123096
2011-01-08 23:10:50 +00:00
Cameron Zwarich
33c137a88b Fix coding style.
llvm-svn: 123093
2011-01-08 22:36:53 +00:00
Chris Lattner
7df7b47828 fix a latent bug in memcpyoptimizer that my recent patches exposed: it wasn't
updating memdep when fusing stores together.  This fixes the crash optimizing
the bullet benchmark.

llvm-svn: 123091
2011-01-08 22:19:21 +00:00
Chris Lattner
563e57abbd tryMergingIntoMemset can only handle constant length memsets.
llvm-svn: 123090
2011-01-08 22:11:56 +00:00
Chris Lattner
98136397bd Merge memsets followed by neighboring memsets and other stores into
larger memsets.  Among other things, this fixes rdar://8760394 and
allows us to handle "Example 2" from http://blog.regehr.org/archives/320,
compiling it into a single 4096-byte memset:

_mad_synth_mute:                        ## @mad_synth_mute
## BB#0:                                ## %entry
	pushq	%rax
	movl	$4096, %esi             ## imm = 0x1000
	callq	___bzero
	popq	%rax
	ret

llvm-svn: 123089
2011-01-08 21:19:19 +00:00
Chris Lattner
e09439ed9d fix an issue in IsPointerOffset that prevented us from recognizing that
P and P+1 are relative to the same base pointer.

llvm-svn: 123087
2011-01-08 21:07:56 +00:00
Chris Lattner
20bf2d50b8 enhance memcpyopt to merge a store and a subsequent
memset into a single larger memset.

llvm-svn: 123086
2011-01-08 20:54:51 +00:00
Chris Lattner
71c356a8b0 fit in 80 cols
llvm-svn: 123085
2011-01-08 20:53:41 +00:00
Chris Lattner
756416d4c0 merge two tests and filecheckify
llvm-svn: 123082
2011-01-08 20:27:22 +00:00
Chris Lattner
19aedc848c constify TargetData references.
Split memset formation logic out into its own
"tryMergingIntoMemset" helper function.

llvm-svn: 123081
2011-01-08 20:24:01 +00:00
Chris Lattner
7d3c4712e9 When loop rotation happens, it is *very* common for the duplicated condbr
to be foldable into an uncond branch.  When this happens, we can make a
much simpler CFG for the loop, which is important for nested loop cases
where we want the outer loop to be aggressively optimized.

Handle this case more aggressively.  For example, previously on
phi-duplicate.ll we would get this:


define void @test(i32 %N, double* %G) nounwind ssp {
entry:
  %cmp1 = icmp slt i64 1, 1000
  br i1 %cmp1, label %bb.nph, label %for.end

bb.nph:                                           ; preds = %entry
  br label %for.body

for.body:                                         ; preds = %bb.nph, %for.cond
  %j.02 = phi i64 [ 1, %bb.nph ], [ %inc, %for.cond ]
  %arrayidx = getelementptr inbounds double* %G, i64 %j.02
  %tmp3 = load double* %arrayidx
  %sub = sub i64 %j.02, 1
  %arrayidx6 = getelementptr inbounds double* %G, i64 %sub
  %tmp7 = load double* %arrayidx6
  %add = fadd double %tmp3, %tmp7
  %arrayidx10 = getelementptr inbounds double* %G, i64 %j.02
  store double %add, double* %arrayidx10
  %inc = add nsw i64 %j.02, 1
  br label %for.cond

for.cond:                                         ; preds = %for.body
  %cmp = icmp slt i64 %inc, 1000
  br i1 %cmp, label %for.body, label %for.cond.for.end_crit_edge

for.cond.for.end_crit_edge:                       ; preds = %for.cond
  br label %for.end

for.end:                                          ; preds = %for.cond.for.end_crit_edge, %entry
  ret void
}

Now we get the much nicer:

define void @test(i32 %N, double* %G) nounwind ssp {
entry:
  br label %for.body

for.body:                                         ; preds = %entry, %for.body
  %j.01 = phi i64 [ 1, %entry ], [ %inc, %for.body ]
  %arrayidx = getelementptr inbounds double* %G, i64 %j.01
  %tmp3 = load double* %arrayidx
  %sub = sub i64 %j.01, 1
  %arrayidx6 = getelementptr inbounds double* %G, i64 %sub
  %tmp7 = load double* %arrayidx6
  %add = fadd double %tmp3, %tmp7
  %arrayidx10 = getelementptr inbounds double* %G, i64 %j.01
  store double %add, double* %arrayidx10
  %inc = add nsw i64 %j.01, 1
  %cmp = icmp slt i64 %inc, 1000
  br i1 %cmp, label %for.body, label %for.end

for.end:                                          ; preds = %for.body
  ret void
}

With all of these recent changes, we are now able to compile:

void foo(char *X) {
 for (int i = 0; i != 100; ++i) 
   for (int j = 0; j != 100; ++j)
     X[j+i*100] = 0;
}

into a single memset of 10000 bytes.  This series of changes
should also be helpful for other nested loop scenarios as well.

llvm-svn: 123079
2011-01-08 19:59:06 +00:00
Chris Lattner
e07601d493 make domtree verification print something useful on failure.
llvm-svn: 123078
2011-01-08 19:55:55 +00:00
Chris Lattner
a01093c990 split ssa updating code out to its own helper function. Don't bother
moving the OrigHeader block anymore: we just merge it away anyway so
its code layout doesn't matter.

llvm-svn: 123077
2011-01-08 19:26:33 +00:00
Chris Lattner
e5f41b44c5 Implement a TODO: Enhance loopinfo to merge away the unconditional branch
that it was leaving in loops after rotation (between the original latch
block and the original header.

With this change, it is possible for rotated loops to have just a single
basic block, which is useful.

llvm-svn: 123075
2011-01-08 19:10:28 +00:00
Chris Lattner
145e4ee94e various code cleanups, enhance MergeBlockIntoPredecessor to preserve
loop info.

llvm-svn: 123074
2011-01-08 19:08:40 +00:00
Chris Lattner
b601620bcd inline preserveCanonicalLoopForm now that it is simple.
llvm-svn: 123073
2011-01-08 18:55:50 +00:00
Chris Lattner
db05334c7f Three major changes:
1. Rip out LoopRotate's domfrontier updating code.  It isn't
   needed now that LICM doesn't use DF and it is super complex
   and gross.
2. Make DomTree updating code a lot simpler and faster.  The 
   old loop over all the blocks was just to find a block??
3. Change the code that inserts the new preheader to just use
   SplitCriticalEdge instead of doing an overcomplex 
   reimplementation of it.

No behavior change, except for the name of the inserted preheader.

llvm-svn: 123072
2011-01-08 18:52:51 +00:00
Chris Lattner
f1a01c01d9 reduce nesting.
llvm-svn: 123071
2011-01-08 18:47:43 +00:00
Francois Pichet
2094592a6f On Windows, replace each occurrence of '\' by '\\' on the replacement string. This is necessary to prevent re.sub from replacing escape sequences occurring in path.
For example:

llvm\tools\clang\test
was replaced by
llvm <tab> ools\clang <tab> est

llvm-svn: 123070
2011-01-08 18:09:48 +00:00
Chris Lattner
2fb7c9c272 LoopRotate requires canonical loop form, so it always has preheaders
and latch blocks.  Reorder entry conditions to make hte pass faster
and more logical.

llvm-svn: 123069
2011-01-08 18:06:22 +00:00
Chris Lattner
c6d905407b use the LI ivar.
llvm-svn: 123068
2011-01-08 17:49:51 +00:00
Chris Lattner
0893592a39 some cleanups: remove dead arguments and eliminate ivars
that are just passed to one function.

llvm-svn: 123067
2011-01-08 17:48:33 +00:00
Chris Lattner
ab9a79eda3 fix an issue duncan pointed out, which could cause loop rotate
to violate LCSSA form

llvm-svn: 123066
2011-01-08 17:38:45 +00:00
Cameron Zwarich
9a35e69c7d Fix coding style issues.
llvm-svn: 123065
2011-01-08 17:07:11 +00:00
Cameron Zwarich
a40df277f1 Make more passes preserve dominators (or state that they preserve dominators if
they all ready do). This removes two dominator recomputations prior to isel,
which is a 1% improvement in total llc time for 403.gcc.

The only potentially suspect thing is making GCStrategy recompute dominators if
it used a custom lowering strategy.

llvm-svn: 123064
2011-01-08 17:01:52 +00:00
Rafael Espindola
9f526bcf4d First step in fixing PR8927:
Add a unnamed_addr bit to global variables and functions. This will be used
to indicate that the address is not significant and therefore the constant
or function can be merged with others.

If an optimization pass can show that an address is not used, it can set this.

Examples of things that can have this set by the FE are globals created to
hold string literals and C++ constructors.

Adding unnamed_addr to a non-const global should have no effect unless
an optimization can transform that global into a constant.

Aliases are not allowed to have unnamed_addr since I couldn't figure
out any use for it.

llvm-svn: 123063
2011-01-08 16:42:36 +00:00
Cameron Zwarich
e97e0555d5 Contract subloop bodies. However, it is still important to visit the phis at the
top of subloop headers, as the phi uses logically occur outside of the subloop.

llvm-svn: 123062
2011-01-08 15:52:22 +00:00
Frits van Bommel
966cc00809 Fix a bug in r123034 (trying to sext/zext non-integers) and clean up a little.
llvm-svn: 123061
2011-01-08 10:51:36 +00:00
Chris Lattner
6729ce1c33 Have loop-rotate simplify instructions (yay instsimplify!) as it clones
them into the loop preheader, eliminating silly instructions like
"icmp i32 0, 100" in fixed tripcount loops.  This also better exposes the 
bigger problem with loop rotate that I'd like to fix: once this has been
folded, the duplicated conditional branch *often* turns into an uncond branch.

Not aggressively handling this is pessimizing later loop optimizations 
somethin' fierce by making "dominates all exit blocks" checks fail.

llvm-svn: 123060
2011-01-08 08:24:46 +00:00
Chris Lattner
50aaa34f29 make this file properly self contained.
llvm-svn: 123059
2011-01-08 08:19:49 +00:00
Chris Lattner
397937fa0d Revamp the ValueMapper interfaces in a couple ways:
1. Take a flags argument instead of a bool.  This makes
   it more clear to the reader what it is used for.
2. Add a flag that says that "remapping a value not in the
   map is ok".
3. Reimplement MapValue to share a bunch of code and be a lot
   more efficient.  For lookup failures, don't drop null values
   into the map.
4. Using the new flag a bunch of code can vaporize in LinkModules
   and LoopUnswitch, kill it.

No functionality change.

llvm-svn: 123058
2011-01-08 08:15:20 +00:00
Chris Lattner
9700b4d5ce two minor changes: switch to the standard ValueToValueMapTy
map from ValueMapper.h (giving us access to its utilities)
and add a fastpath in the loop rotation code, avoiding expensive
ssa updator manipulation for values with nothing to update.

llvm-svn: 123057
2011-01-08 07:21:31 +00:00
Eric Christopher
4580fc9a0d I don't think I could find a 10.2.x box if I tried.
llvm-svn: 123051
2011-01-08 01:52:20 +00:00
Evan Cheng
1afd04fc59 Recognize inline asm 'rev /bin/bash, ' as a bswap intrinsic call.
llvm-svn: 123048
2011-01-08 01:24:27 +00:00
Evan Cheng
aa16fd02ad Do not model all INLINEASM instructions as having unmodelled side effects.
Instead encode llvm IR level property "HasSideEffects" in an operand (shared
with IsAlignStack). Added MachineInstrs::hasUnmodeledSideEffects() to check
the operand when the instruction is an INLINEASM.

This allows memory instructions to be moved around INLINEASM instructions.

llvm-svn: 123044
2011-01-07 23:50:32 +00:00
Bob Wilson
4e879f0f00 Use __builtin_shufflevector to implement vget_low and vget_high intrinsics.
This was suggested by Edmund Grimley Evans in pr8411.

llvm-svn: 123043
2011-01-07 23:40:49 +00:00
Bob Wilson
eb4521fbcb Add an explanatory message for an assertion.
llvm-svn: 123042
2011-01-07 23:40:46 +00:00
Matt Beaumont-Gay
68a61cf6dd Eliminate variable only used in debug builds.
llvm-svn: 123040
2011-01-07 22:34:58 +00:00
Devang Patel
d3ba97949a Speculatively revert r123032.
llvm-svn: 123039
2011-01-07 22:33:41 +00:00
Devang Patel
1c9a5199b5 Do not include DataTypes.h in llvm-c/lto.h.
This means avoid using uint32_t. This patch reverts r112200 and fixes original  problem by fixing argument type in lto.cpp.

llvm-svn: 123038
2011-01-07 22:26:25 +00:00
Evan Cheng
b960c4c2ae Fix comment. INLINEASM node operand #3 is IsAlignStack bit.
llvm-svn: 123036
2011-01-07 21:38:59 +00:00
Bob Wilson
c485ff3ced Lower some BUILD_VECTORS using VEXT+shuffle.
Patch by Tim Northover.

llvm-svn: 123035
2011-01-07 21:37:30 +00:00
Tobias Grosser
48469b566a InstCombine: Match min/max hidden by sext/zext
X = sext x; x >s c ? X : C+1 --> X = sext x; X <s C+1 ? C+1 : X
X = sext x; x <s c ? X : C-1 --> X = sext x; X >s C-1 ? C-1 : X
X = zext x; x >u c ? X : C+1 --> X = zext x; X <u C+1 ? C+1 : X
X = zext x; x <u c ? X : C-1 --> X = zext x; X >u C-1 ? C-1 : X
X = sext x; x >u c ? X : C+1 --> X = sext x; X <u C+1 ? C+1 : X
X = sext x; x <u c ? X : C-1 --> X = sext x; X >u C-1 ? C-1 : X

Instead of calculating this with mixed types promote all to the
larger type. This enables scalar evolution to analyze this
expression. PR8866

llvm-svn: 123034
2011-01-07 21:33:14 +00:00
Tobias Grosser
492e97f0e5 Some whitespace fixes
llvm-svn: 123033
2011-01-07 21:33:13 +00:00
Devang Patel
a52d6c216d Appropriately truncate debug info range in dwarf output.
Enable live debug variables pass.

llvm-svn: 123032
2011-01-07 21:30:41 +00:00