mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-27 14:02:50 +01:00
Commit Graph

430 Commits

Author SHA1 Message Date
Cameron Zwarich
72cd0d1b5b Fix PR10104 by adding a bounds check to the vector element access check. It was
assuming that all offsets are legal vector accesses, and thus trying to access
the float member of { <2 x float>, float } as the 3rd element of the first
member.
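
A minimal sketch of the layout in question (hypothetical value names): the scalar float sits at byte offset 8, past the end of the two-element vector, so it cannot be treated as element 2 of that vector.

  %p = alloca { <2 x float>, float }
  %f = getelementptr { <2 x float>, float }* %p, i32 0, i32 1   ; the scalar member at offset 8
  %v = load float* %f               ; not element 2 of the <2 x float> member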

llvm-svn: 132766
2011-06-09 01:45:33 +00:00
Cameron Zwarich
68f8e98b8e Fix an asymmetry between ConvertScalar_ExtractValue and ConvertScalar_InsertValue. The
former was using the size of the entire alloca, whereas the latter was correctly using
the allocated size of the immediate type being converted (which may differ from the size
of the alloca). This fixes PR10082.

llvm-svn: 132759
2011-06-08 22:08:31 +00:00
Devang Patel
ec747bb76b Use IRBuilder, preserve line numbers.
llvm-svn: 132578
2011-06-03 19:46:19 +00:00
Cameron Zwarich
e425894eaf Clean up the lazy initialization of DIBuilder a bit.
llvm-svn: 131956
2011-05-24 06:00:08 +00:00
Cameron Zwarich
462b5db500 Make LoadAndStorePromoter preserve debug info and create llvm.dbg.values when
promoting allocas to SSA variables. Fixes <rdar://problem/9479036>.

llvm-svn: 131953
2011-05-24 03:10:43 +00:00
Duncan Sands
c4d27a63a4 Fix PR9820: a read-only call differs from a load in that a load doesn't
return the pointer being dereferenced (it returns the pointee), but a call
might return the pointer itself.
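
A rough illustration of the difference (hypothetical readonly function): the load yields the pointee, while the call may hand the pointer straight back.

declare i8* @find(i8*) readonly

define i8* @example(i8* %p) {
  %pointee = load i8* %p            ; returns the value stored at %p, never %p itself
  %q = call i8* @find(i8* %p)       ; read-only, yet may return %p itself
  ret i8* %q
}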

llvm-svn: 130979
2011-05-06 10:30:37 +00:00
Cameron Zwarich
fee060175e Fix another case of <rdar://problem/9184212> that only occurs with code
generated by llvm-gcc, since llvm-gcc uses 2 i64s for passing a 4 x float
vector on ARM rather than an i64 array like Clang.

llvm-svn: 129878
2011-04-20 21:48:38 +00:00
Cameron Zwarich
139f811a9b The bitcast case here is actually handled uniformly earlier in the function, so
delete it.

llvm-svn: 129877
2011-04-20 21:48:34 +00:00
Cameron Zwarich
13cd49fd54 Cleanup some code to better use an early return style in preparation for adding
more cases.

llvm-svn: 129876
2011-04-20 21:48:16 +00:00
Mon P Wang
f9b26f115c Cleanup r129509 based on comments by Chris
llvm-svn: 129532
2011-04-14 19:20:42 +00:00
Mon P Wang
8b09087f46 Cleanup r129472 by using a utility routine as suggested by Eli.
llvm-svn: 129509
2011-04-14 08:04:01 +00:00
Mon P Wang
4b667cd995 Vectors with different numbers of elements but the same element type can have
the same allocation size but different primitive sizes (e.g., <3 x i32> and
<4 x i32>).  When ScalarRepl promotes them, it can't use a bit cast but
should use a shuffle vector instead.
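
A sketch with hypothetical values: <3 x i32> and <4 x i32> share an allocation size of 16 bytes but have primitive sizes of 96 and 128 bits, so the widening and narrowing are done with shuffles rather than bitcasts.

  %wide = shufflevector <3 x i32> %v3, <3 x i32> undef, <4 x i32> <i32 0, i32 1, i32 2, i32 undef>
  %narrow = shufflevector <4 x i32> %v4, <4 x i32> undef, <3 x i32> <i32 0, i32 1, i32 2>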

llvm-svn: 129472
2011-04-13 21:40:02 +00:00
Jay Foad
53632b7c03 Remove PHINode::reserveOperandSpace(). Instead, add a parameter to
PHINode::Create() giving the (known or expected) number of operands.

llvm-svn: 128537
2011-03-30 11:28:46 +00:00
Jay Foad
dc5a008237 (Almost) always call reserveOperandSpace() on newly created PHINodes.
llvm-svn: 128535
2011-03-30 11:19:20 +00:00
Cameron Zwarich
d49e32233c Do some simple copy propagation through integer loads and stores when promoting
vector types. This helps a lot with inlined functions when using the ARM soft
float ABI. Fixes <rdar://problem/9184212>.

llvm-svn: 128453
2011-03-29 05:19:52 +00:00
Cameron Zwarich
09bd1deda3 Fix a typo and add a test.
llvm-svn: 128331
2011-03-26 04:58:50 +00:00
Cameron Zwarich
9f72ea0a80 Fix PR9464 by correcting some math that just happened to be right in most cases
that were hit in practice.

llvm-svn: 128146
2011-03-23 05:25:55 +00:00
Cameron Zwarich
c60b47a7e2 Fix a comment.
llvm-svn: 127728
2011-03-16 08:13:42 +00:00
Cameron Zwarich
7fd94ea393 Only convert allocas to scalars if it is profitable. The profitability metric I
chose is having a non-memcpy/memset use and being larger than any native integer
type. Originally I chose having an access of a size smaller than the total size
of the alloca, but this caused some minor issues on the spirit benchmark where
SRoA runs again after some inlining.

This fixes <rdar://problem/8613163>.

llvm-svn: 127718
2011-03-16 00:13:44 +00:00
Cameron Zwarich
88790f3d4d Make better use of initializer lists.
llvm-svn: 127716
2011-03-16 00:13:37 +00:00
Cameron Zwarich
f09bb5f2f5 Add a clarifying comment.
llvm-svn: 127715
2011-03-16 00:13:35 +00:00
Cameron Zwarich
90efa03e8b Fix a crasher introduced by r127317 that is seen on the bots when using an
alloca as both integer and floating-point vectors of the same size. Bugpoint is
not cooperating with me, but I'll try to find a manual testcase tomorrow.

llvm-svn: 127320
2011-03-09 07:34:11 +00:00
Cameron Zwarich
e0b4705b03 Add support to scalar replacement for partial vector accesses of an alloca, e.g.
a union of a float, <2 x float>, and <4 x float>. This mostly comes up with the
use of vector intrinsics, especially in NEON when programmers know the layout of
the register file. This enables codegen to eliminate a lot of the subregister
traffic it would otherwise generate.

This commit enables this only for a small number of floating-point cases, but for a
lot more integer cases.  I assume this is okay for all ports, but I did not do
extensive testing of the quality of code involving i512 vectors and the like. If
there is a use case where this generates worse code than before, let me know and
we can scale it back.

This fixes <rdar://problem/9036264>.
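
For illustration only (hypothetical values), the kind of mixed-width access pattern this lets SROA keep in a register:

  %a = alloca <4 x float>
  %lo = bitcast <4 x float>* %a to <2 x float>*
  store <2 x float> %v2, <2 x float>* %lo     ; writes elements 0-1
  %sp = bitcast <4 x float>* %a to float*
  %e0 = load float* %sp                       ; reads element 0
  %all = load <4 x float>* %a                 ; full-width access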

llvm-svn: 127317
2011-03-09 05:43:05 +00:00
Cameron Zwarich
e80c47f295 Move vector type merging to a separate function in preparation for it getting
more complicated.

llvm-svn: 127316
2011-03-09 05:43:01 +00:00
Chris Lattner
db204cbe42 convert ConstantVector::get to use ArrayRef.
llvm-svn: 125537
2011-02-15 00:14:00 +00:00
Chris Lattner
ee7f7c2494 revert my ConstantVector patch, it seems to have made the llvm-gcc
builders unhappy.

llvm-svn: 125504
2011-02-14 18:15:46 +00:00
Chris Lattner
34f32cb4c2 Switch ConstantVector::get to use ArrayRef instead of a pointer+size
idiom.  Change various clients to simplify their code.

llvm-svn: 125487
2011-02-14 07:55:32 +00:00
Dan Gohman
db0dc19c04 Give GetUnderlyingObject a TargetData, to keep it in sync
with BasicAA's DecomposeGEPExpression, which recently began
using a TargetData. This fixes PR8968, though the testcase
is awkward to reduce.

Also, update several of GetUnderlyingObject's users
which happen to have a TargetData handy to pass it in.

llvm-svn: 124134
2011-01-24 18:53:32 +00:00
Chris Lattner
3609e5afb4 enhance SRoA to promote allocas that are used by PHI nodes. This often
occurs because instcombine sinks loads and inserts phis.  This kicks in 
on such apps as 175.vpr, eon, 403.gcc, xalancbmk and a bunch of times in
spec2006 in some app that uses std::deque.

This resolves the last of rdar://7339113.
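
A sketch of the shape instcombine leaves behind (hypothetical allocas %A and %B): the two sunk loads become one load through a phi of the element pointers.

merge:
  %ptr = phi i32* [ %A, %then ], [ %B, %else ]   ; %A and %B are alloca-derived pointers
  %val = load i32* %ptr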

llvm-svn: 124090
2011-01-24 01:07:11 +00:00
Chris Lattner
50288f8e54 Enhance SRoA to promote allocas that are used by selects in some
common cases.  This triggers a surprising number of times in SPEC2K6
because min/max idioms end up doing this.  For example, code from the
STL ends up looking like this to SRoA:

  %202 = load i64* %__old_size, align 8, !tbaa !3
  %203 = load i64* %__old_size, align 8, !tbaa !3
  %204 = load i64* %__n, align 8, !tbaa !3
  %205 = icmp ult i64 %203, %204
  %storemerge.i = select i1 %205, i64* %__n, i64* %__old_size
  %206 = load i64* %storemerge.i, align 8, !tbaa !3

We can now promote both the __n and the __old_size allocas.

This addresses another chunk of rdar://7339113, poor codegen on
stringswitch.

llvm-svn: 124088
2011-01-23 22:04:55 +00:00
Chris Lattner
2ef605b499 Enhance SRoA to be more aggressive about scalarization of aggregate allocas
that have PHI or select uses of their element pointers.  This can often happen
when instcombine sinks two loads into a successor, inserting a phi or select.

With this patch, we can scalarize the alloca, but the pinned elements are not
yet promoted.  This is still a win for large aggregates where only one element
is used.  This fixes rdar://8904039 and part of rdar://7339113 (poor codegen
on stringswitch).

llvm-svn: 124070
2011-01-23 08:27:54 +00:00
Chris Lattner
2e57722d9d have AllocaInfo store the alloca being inspected, simplifying callers.
No functionality change.

llvm-svn: 124067
2011-01-23 07:29:29 +00:00
Chris Lattner
2064a3e213 Rearrange some code a bit. Change MarkUnsafe to
handle the "Transformation preventing inst" printing, 
so that -scalarrepl -debug will always print the rejected
instruction.  No functionality change.

llvm-svn: 124066
2011-01-23 07:05:44 +00:00
Chris Lattner
ba66871643 remove an old hack that avoided creating MMX datatypes. The
X86 backend has been fixed.

llvm-svn: 124064
2011-01-23 06:40:33 +00:00
Cameron Zwarich
e39e476305 Remove outdated references to dominance frontiers.
llvm-svn: 123724
2011-01-18 03:53:26 +00:00
Cameron Zwarich
0a1975802e Roll r123609 back in with two changes that fix test failures with expensive
checks enabled:

1) Use '<' to compare integers in a comparison function rather than '<='.

2) Use the uniqued set DefBlocks rather than Info.DefiningBlocks to initialize
the priority queue.

The speedup of scalarrepl on test-suite + SPEC2000 + SPEC2006 is a bit less, at
just under 16% rather than 17%.

llvm-svn: 123662
2011-01-17 17:38:41 +00:00
Cameron Zwarich
77e381e302 Roll out r123609 due to failures on the llvm-x86_64-linux-checks bot.
llvm-svn: 123618
2011-01-17 07:26:51 +00:00
Cameron Zwarich
e35642342b Eliminate the use of dominance frontiers in PromoteMemToReg. In addition to
eliminating a potentially quadratic data structure, this also gives a 17%
speedup when running -scalarrepl on test-suite + SPEC2000 + SPEC2006. My initial
experiment gave a greater speedup of around 25%, but I moved the dominator tree
level computation from dominator tree construction to PromoteMemToReg.

Since this approach to computing IDFs has a much lower overhead than the old
code using precomputed DFs, it is worth looking at using this new code for the
second scalarrepl pass as well.

llvm-svn: 123609
2011-01-17 01:08:59 +00:00
Chris Lattner
2f902bc3b1 tidy up a comment, as suggested by duncan
llvm-svn: 123590
2011-01-16 17:46:19 +00:00
Chris Lattner
29f339f87c if an alloca is only ever accessed as a unit, and is accessed with load/store instructions,
then don't try to decimate it into its individual pieces.  This will just make a mess of the
IR and is pointless if none of the elements are individually accessed.  This was generating
really terrible code for std::bitset (PR8980) because it happens to be lowered by clang
as an {[8 x i8]} structure instead of {i64}.

The testcase now is optimized to:

define i64 @test2(i64 %X) {
  br label %L2

L2:                                               ; preds = %0
  ret i64 %X
}

before we generated:

define i64 @test2(i64 %X) {
  %sroa.store.elt = lshr i64 %X, 56
  %1 = trunc i64 %sroa.store.elt to i8
  %sroa.store.elt8 = lshr i64 %X, 48
  %2 = trunc i64 %sroa.store.elt8 to i8
  %sroa.store.elt9 = lshr i64 %X, 40
  %3 = trunc i64 %sroa.store.elt9 to i8
  %sroa.store.elt10 = lshr i64 %X, 32
  %4 = trunc i64 %sroa.store.elt10 to i8
  %sroa.store.elt11 = lshr i64 %X, 24
  %5 = trunc i64 %sroa.store.elt11 to i8
  %sroa.store.elt12 = lshr i64 %X, 16
  %6 = trunc i64 %sroa.store.elt12 to i8
  %sroa.store.elt13 = lshr i64 %X, 8
  %7 = trunc i64 %sroa.store.elt13 to i8
  %8 = trunc i64 %X to i8
  br label %L2

L2:                                               ; preds = %0
  %9 = zext i8 %1 to i64
  %10 = shl i64 %9, 56
  %11 = zext i8 %2 to i64
  %12 = shl i64 %11, 48
  %13 = or i64 %12, %10
  %14 = zext i8 %3 to i64
  %15 = shl i64 %14, 40
  %16 = or i64 %15, %13
  %17 = zext i8 %4 to i64
  %18 = shl i64 %17, 32
  %19 = or i64 %18, %16
  %20 = zext i8 %5 to i64
  %21 = shl i64 %20, 24
  %22 = or i64 %21, %19
  %23 = zext i8 %6 to i64
  %24 = shl i64 %23, 16
  %25 = or i64 %24, %22
  %26 = zext i8 %7 to i64
  %27 = shl i64 %26, 8
  %28 = or i64 %27, %25
  %29 = zext i8 %8 to i64
  %30 = or i64 %29, %28
  ret i64 %30
}

In this case, instcombine was able to eliminate the nonsense, but in PR8980 enough
PHIs are in play that instcombine backs off.  It's better to not generate this stuff
in the first place.

llvm-svn: 123571
2011-01-16 06:18:28 +00:00
Chris Lattner
b43fce09a9 Use an irbuilder to get some trivial constant folding when doing a store
of a constant.

llvm-svn: 123570
2011-01-16 05:58:24 +00:00
Chris Lattner
2067fb2a93 enhance FoldOpIntoPhi in instcombine to try harder when a phi has
multiple uses.  In some cases, all the uses are the same operation,
so instcombine can go ahead and promote the phi.  In the testcase
this pushes an add out of the loop.

llvm-svn: 123568
2011-01-16 05:28:59 +00:00
Chris Lattner
e6d5b3c4ce Generalize LoadAndStorePromoter a bit and switch LICM
to use it.

llvm-svn: 123501
2011-01-15 00:12:35 +00:00
Chris Lattner
1ce35a0362 switch SRoA to use LoadAndStorePromoter instead of its own copy of the code.
llvm-svn: 123457
2011-01-14 19:50:47 +00:00
Chris Lattner
8e171470d3 split SROA into two passes: one that uses DomFrontiers (-scalarrepl)
and one that uses SSAUpdater (-scalarrepl-ssa)

llvm-svn: 123436
2011-01-14 08:13:00 +00:00
Chris Lattner
b5c39352d8 Implement full support for promoting allocas to registers using SSAUpdater
instead of DomTree/DomFrontier.  This may be interesting for reducing compile 
time.  This is currently disabled, but seems to work just fine.

When this is enabled, we eliminate two runs of dominator frontier, one in the
"early per-function" optimizations and one in the "interlaced with inliner"
function passes.

llvm-svn: 123434
2011-01-14 07:50:47 +00:00
Bob Wilson
1238f872da Fix whitespace.
llvm-svn: 123396
2011-01-13 20:59:44 +00:00
Bob Wilson
fbab825516 Check for empty structs, and for consistency, zero-element arrays.
llvm-svn: 123383
2011-01-13 18:26:59 +00:00
Bob Wilson
3b0197489e Extend SROA to handle arrays accessed as homogeneous structs and vice versa.
This is a minor extension of SROA to handle a special case that is
important for some ARM NEON operations.  Some of the NEON intrinsics
return multiple values, which are handled as struct types containing
multiple elements of the same vector type.  The corresponding return
types declared in the arm_neon.h header have equivalent arrays.  We
need SROA to recognize that it can split up those arrays and structs
into separate vectors, even though they are not always accessed with
the same type.  SROA already handles loads and stores of an entire
alloca by using insertvalue/extractvalue to access the individual
pieces, and that code works the same regardless of whether the type
is a struct or an array.  So, all that needs to be done is to check
for compatible arrays and homogeneous structs.
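
A sketch of the mismatch (hypothetical function and types): the intrinsic-style result is a struct of identical vectors stored into an alloca declared as the equivalent array, and SROA can now split either form into individual <4 x float> values.

define <4 x float> @example({ <4 x float>, <4 x float> } %pair) {
  %tmp = alloca [2 x <4 x float>]
  %cast = bitcast [2 x <4 x float>]* %tmp to { <4 x float>, <4 x float> }*
  store { <4 x float>, <4 x float> } %pair, { <4 x float>, <4 x float> }* %cast
  %eltp = getelementptr [2 x <4 x float>]* %tmp, i32 0, i32 1
  %elt = load <4 x float>* %eltp
  ret <4 x float> %elt
}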

llvm-svn: 123381
2011-01-13 17:45:11 +00:00
Bob Wilson
9f8d730f9b Make SROA more aggressive with allocas containing padding.
SROA only splits up structs and arrays one level at a time, so padding can
only cause trouble if it is located in between the struct or array elements.

llvm-svn: 123380
2011-01-13 17:45:08 +00:00
Chris Lattner
e396e846b4 split dom frontier handling stuff out to its own DominanceFrontier header,
so that Dominators.h is *just* domtree.  Also prune #includes a bit.

llvm-svn: 122714
2011-01-02 22:09:33 +00:00
Chris Lattner
9007b56712 start using irbuilder to make mem intrinsics in a few passes.
llvm-svn: 122572
2010-12-26 22:57:41 +00:00
Mon P Wang
eb2ae28352 Preserve the address space when generating bitcasts for MemTransferInst in ConvertToScalarInfo
llvm-svn: 122462
2010-12-23 01:41:32 +00:00
Dan Gohman
295ba3ab26 Move Value::getUnderlyingObject to be a standalone
function so that it can live in Analysis instead of
VMCore.

llvm-svn: 121885
2010-12-15 20:02:24 +00:00
Nick Lewycky
f6fa6b29f4 Treat a call through a function pointer like a load of the pointer when considering
whether the pointer can be replaced with the global variable it is a copy of.
Fixes PR8680.

llvm-svn: 120126
2010-11-24 22:04:20 +00:00
Benjamin Kramer
9141603779 Simplify code. No change in functionality.
llvm-svn: 119908
2010-11-20 18:43:35 +00:00
Chris Lattner
f8540ee386 finish a thought.
llvm-svn: 119690
2010-11-18 07:32:33 +00:00
Chris Lattner
6048697a30 allow eliminating an alloca that is just copied from a constant global
if it is passed as a byval argument.  The byval argument will just be a
read, so it is safe to read from the original global instead.  This allows
us to promote away the %agg.tmp alloca in PR8582
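
A minimal sketch of the pattern (hypothetical types and callee): the alloca is filled from a constant global and then only read through a byval argument, so the byval copy can be made from @G directly and %agg.tmp goes away.

%T = type { i32, [28 x i8] }
@G = constant %T zeroinitializer

declare void @use(%T* byval)
declare void @llvm.memcpy.p0i8.p0i8.i64(i8*, i8*, i64, i32, i1)

define void @caller() {
  %agg.tmp = alloca %T
  %dst = bitcast %T* %agg.tmp to i8*
  %src = bitcast %T* @G to i8*
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %dst, i8* %src, i64 32, i32 4, i1 false)
  call void @use(%T* byval %agg.tmp)   ; read-only use; can become @use(%T* byval @G)
  ret void
}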

llvm-svn: 119686
2010-11-18 06:41:51 +00:00
Chris Lattner
791e914b1b enhance the "alloca is just a memcpy from a constant global" optimization
to ignore calls that obviously can't modify the alloca
because they are readonly/readnone.

llvm-svn: 119683
2010-11-18 06:26:49 +00:00
Chris Lattner
44ccd4643d fix a small oversight in the "eliminate memcpy from constant global"
optimization.  If the alloca that is "memcpy'd from constant" also has
a memcpy from *it*, ignore it: it is a load.  We now optimize the testcase to:

define void @test2() {
  %B = alloca %T
  %a = bitcast %T* @G to i8*
  %b = bitcast %T* %B to i8*
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %b, i8* %a, i64 124, i32 4, i1 false)
  call void @bar(i8* %b)
  ret void
}

previously we would generate:

define void @test() {
  %B = alloca %T
  %b = bitcast %T* %B to i8*
  %G.0 = getelementptr inbounds %T* @G, i32 0, i32 0
  %tmp3 = load i8* %G.0, align 4
  %G.1 = getelementptr inbounds %T* @G, i32 0, i32 1
  %G.15 = bitcast [123 x i8]* %G.1 to i8*
  %1 = bitcast [123 x i8]* %G.1 to i984*
  %srcval = load i984* %1, align 1
  %B.0 = getelementptr inbounds %T* %B, i32 0, i32 0
  store i8 %tmp3, i8* %B.0, align 4
  %B.1 = getelementptr inbounds %T* %B, i32 0, i32 1
  %B.12 = bitcast [123 x i8]* %B.1 to i8*
  %2 = bitcast [123 x i8]* %B.1 to i984*
  store i984 %srcval, i984* %2, align 1
  call void @bar(i8* %b)
  ret void
}

llvm-svn: 119682
2010-11-18 06:20:47 +00:00
Owen Anderson
46990c17f7 Get rid of static constructors for pass registration. Instead, every pass exposes an initializeMyPassFunction(), which
must be called in the pass's constructor.  This function uses static dependency declarations to recursively initialize
the pass's dependencies.

Clients that only create passes through the createFooPass() APIs will require no changes.  Clients that want to use the
CommandLine options for passes will need to manually call the appropriate initialization functions in PassInitialization.h
before parsing commandline arguments.

I have tested this with all standard configurations of clang and llvm-gcc on Darwin.  It is possible that there are problems
with the static dependencies that will only be visible with non-standard options.  If you encounter any crash in pass
registration/creation, please send the testcase to me directly.

llvm-svn: 116820
2010-10-19 17:21:58 +00:00
Benjamin Kramer
a0e156fe6d Eliminate some calls to Value::getNameStr.
llvm-svn: 116670
2010-10-16 11:28:23 +00:00
Owen Anderson
63f757463c Begin adding static dependence information to passes, which will allow us to
perform initialization without static constructors AND without explicit initialization
by the client.  For the moment, passes are required to initialize both their
(potential) dependencies and any passes they preserve.  I hope to be able to relax
the latter requirement in the future.

llvm-svn: 116334
2010-10-12 19:48:12 +00:00
Owen Anderson
69cbf2e8b7 Now with fewer extraneous semicolons!
llvm-svn: 115996
2010-10-07 22:25:06 +00:00
Dale Johannesen
c14a1eda84 Massive rewrite of MMX:
The x86_mmx type is used for MMX intrinsics, parameters and
return values where these use MMX registers, and is also
supported in load, store, and bitcast.

Only the above operations generate MMX instructions, and optimizations
do not operate on or produce MMX intrinsics. 

MMX-sized vectors <2 x i32> etc. are lowered to XMM or split into
smaller pieces.  Optimizations may occur on these forms and the
result cast back to x86_mmx, provided the result feeds into a
previously existing x86_mmx operation.

The point of all this is to prevent optimizations from introducing
MMX operations, which is unsafe due to the EMMS problem.
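
Roughly, in IR terms (a sketch; the intrinsic shown is assumed to take x86_mmx operands): values enter the MMX world only through explicit bitcasts and MMX intrinsics, so the optimizers have nothing generic to transform.

define x86_mmx @example(<2 x i32> %v, x86_mmx %b) {
  %a = bitcast <2 x i32> %v to x86_mmx                              ; explicit entry into MMX
  %r = call x86_mmx @llvm.x86.mmx.padd.d(x86_mmx %a, x86_mmx %b)
  ret x86_mmx %r
}
declare x86_mmx @llvm.x86.mmx.padd.d(x86_mmx, x86_mmx)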

llvm-svn: 115243
2010-09-30 23:57:10 +00:00
Chris Lattner
b1c861e28c deepen my MMX/SRoA hack to avoid hurting non-x86 codegen.
llvm-svn: 112763
2010-09-01 23:09:27 +00:00
Chris Lattner
9759e898f0 add a gross hack to work around a problem that Argiris reported
on llvmdev: SRoA is introducing MMX datatypes like <1 x i64>,
which then cause random problems because the X86 backend is
producing mmx stuff without inserting proper emms calls.

In the short term, force off MMX datatypes.  In the long term,
the X86 backend should not select generic vector types to MMX
registers.  This is being worked on, but won't be done in time
for 2.8.  rdar://8380055

llvm-svn: 112696
2010-09-01 05:14:33 +00:00
Chris Lattner
686cddf177 remove dead prototype.
llvm-svn: 111342
2010-08-18 02:37:06 +00:00
Owen Anderson
f2fea95f2f Reapply r110396, with fixes to appease the Linux buildbot gods.
llvm-svn: 110460
2010-08-06 18:33:48 +00:00
Owen Anderson
aadd8a89ca Revert r110396 to fix buildbots.
llvm-svn: 110410
2010-08-06 00:23:35 +00:00
Owen Anderson
b9762c07cb Don't use PassInfo* as a type identifier for passes. Instead, use the address of the static
ID member as the sole unique type identifier.  Clean up APIs related to this change.

llvm-svn: 110396
2010-08-05 23:42:04 +00:00
Owen Anderson
f8addbb0a1 Fix a batch of conversions from RegisterPass<> to INITIALIZE_PASS().
llvm-svn: 109045
2010-07-21 22:09:45 +00:00
Gabor Greif
17c48ecd68 eliminate CallInst::ArgOffset
llvm-svn: 108522
2010-07-16 09:38:02 +00:00
Chris Lattner
bf009b527a Fix the second half of PR7437: scalarrepl wasn't preserving
address spaces when SRoA'ing memcpy's.

llvm-svn: 107846
2010-07-08 00:27:05 +00:00
Gabor Greif
b6e59f4a82 use getArgOperand instead of getOperand
llvm-svn: 107271
2010-06-30 09:16:16 +00:00
Gabor Greif
6bf330181d employ CallInst::ArgOffset (for now)
llvm-svn: 107015
2010-06-28 16:43:57 +00:00
Gabor Greif
e57b886f3d use cached value
llvm-svn: 107000
2010-06-28 11:20:42 +00:00
Chris Lattner
d3a1ef7fea minor cleanup to SROA: when lowering type-unsafe accesses to
large integers, the first inserted value would always create
an 'or X, 0'.  Even though this is trivially zapped by
instcombine, don't bother creating this pointless instruction.

llvm-svn: 106979
2010-06-27 07:58:26 +00:00
Dan Gohman
74d5144414 Use pre-increment instead of post-increment when the result is not used.
llvm-svn: 106542
2010-06-22 15:08:57 +00:00
Gabor Greif
2eaf8afcff use abstract accessors to CallInst
llvm-svn: 101899
2010-04-20 13:13:04 +00:00
Eric Christopher
e78496e5f1 Revert 101465, it broke internal OpenGL testing.
Probably the best way to know that all getOperand() calls have been handled
is to replace that API instead of updating.

llvm-svn: 101579
2010-04-16 23:37:20 +00:00
Gabor Greif
e7d6812008 reapply r101434
with a fix for self-hosting

rotate CallInst operands, i.e. move callee to the back
of the operand array

the motivation for this patch are laid out in my mail to llvm-commits:
more efficient access to operands and callee, faster callgraph-construction,
smaller compiler binary

llvm-svn: 101465
2010-04-16 15:33:14 +00:00
Chris Lattner
74934dcb1d fix comment noticed by Bob
llvm-svn: 101437
2010-04-16 02:32:17 +00:00
Gabor Greif
cd116e8c6a back out r101423 and r101397, they break llvm-gcc self-host on darwin10
llvm-svn: 101434
2010-04-16 01:16:20 +00:00
Chris Lattner
f90092185c fix PR6832: we were using the alignment of a pointer when we
wanted the alignment of the pointee.

llvm-svn: 101432
2010-04-16 01:05:38 +00:00
Chris Lattner
5baf2c68e5 improve comments.
llvm-svn: 101429
2010-04-16 00:38:19 +00:00
Chris Lattner
9c8dd19de8 pull all the ConvertToScalarInfo code together into one
place.

llvm-svn: 101427
2010-04-16 00:24:57 +00:00
Chris Lattner
082c018658 more refactoring: suck some stuff out of SRoA into
ConvertToScalarInfo.

llvm-svn: 101425
2010-04-16 00:20:00 +00:00
Chris Lattner
977c0bfeb8 introduce a new ConvertToScalarInfo struct to simplify
CanConvertToScalar/MergeInType.  Eliminate a pointless
LLVMContext argument to MergeInType.

llvm-svn: 101422
2010-04-15 23:50:26 +00:00
Chris Lattner
c38eea2b85 tidy interface to isOnlyCopiedFromConstantGlobal
llvm-svn: 101405
2010-04-15 21:59:20 +00:00
Gabor Greif
2e18d34d80 reapply r101364, which has been backed out in r101368
with a fix

rotate CallInst operands, i.e. move callee to the back
of the operand array

the motivation for this patch are laid out in my mail to llvm-commits:
more efficient access to operands and callee, faster callgraph-construction,
smaller compiler binary

llvm-svn: 101397
2010-04-15 20:51:13 +00:00
Gabor Greif
6022150477 back out r101364, as it trips the linux nightlybot on some clang C++ tests
llvm-svn: 101368
2010-04-15 12:46:56 +00:00
Gabor Greif
428ca23bbd rotate CallInst operands, i.e. move callee to the back
of the operand array

the motivation for this patch are laid out in my mail to llvm-commits:
more efficient access to operands and callee, faster callgraph-construction,
smaller compiler binary

llvm-svn: 101364
2010-04-15 10:49:53 +00:00
Gabor Greif
faa9bf800f performance: get rid of repeated dereferencing of use_iterator by caching its result
llvm-svn: 100550
2010-04-06 19:32:30 +00:00
Mon P Wang
484bbe6aa9 Reapply address space patch after fixing an issue in MemCopyOptimizer.
Added support for address spaces and added an isVolatile field to memcpy, memmove, and memset,
e.g., llvm.memcpy.i32(i8*, i8*, i32, i32) -> llvm.memcpy.p0i8.p0i8.i32(i8*, i8*, i32, i32, i1)

llvm-svn: 100304
2010-04-04 03:10:48 +00:00
Mon P Wang
0ccf050ca3 Revert r100191 since it breaks objc in clang
llvm-svn: 100199
2010-04-02 18:43:02 +00:00
Mon P Wang
a01350755e Reapply address space patch after fixing an issue in MemCopyOptimizer.
Added support for address spaces and added an isVolatile field to memcpy, memmove, and memset,
e.g., llvm.memcpy.i32(i8*, i8*, i32, i32) -> llvm.memcpy.p0i8.p0i8.i32(i8*, i8*, i32, i32, i1)

llvm-svn: 100191
2010-04-02 18:04:15 +00:00
Bob Wilson
aae933cc81 Revert Mon Ping's change 99928, since it broke all the llvm-gcc buildbots.
llvm-svn: 99948
2010-03-30 22:27:04 +00:00
Mon P Wang
9351ea594a Added support for address spaces and added an isVolatile field to memcpy, memmove, and memset,
e.g., llvm.memcpy.i32(i8*, i8*, i32, i32) -> llvm.memcpy.p0i8.p0i8.i32(i8*, i8*, i32, i32, i1)
An update of LangRef will occur in a subsequent checkin.

llvm-svn: 99928
2010-03-30 20:55:56 +00:00
Duncan Sands
1b33dd3c83 There are two ways of checking for a given type, for example isa<PointerType>(T)
and T->isPointerTy().  Convert most instances of the first form to the second form.
Requested by Chris.

llvm-svn: 96344
2010-02-16 11:11:14 +00:00
Duncan Sands
2acaf3609c Uniformize the names of type predicates: rather than having isFloatTy and
isInteger, we now have isFloatTy and isIntegerTy.  Requested by Chris!

llvm-svn: 96223
2010-02-15 16:12:20 +00:00
Bob Wilson
8635e08b7f Adjust the heuristics used to decide when SROA is likely to be profitable.
The SRThreshold value makes perfect sense for checking if an entire aggregate
should be promoted to a scalar integer, but it is not so good for splitting
an aggregate into its separate elements.  A struct may contain a large embedded
array along with some scalar fields that would benefit from being split apart
by SROA.  Even if the total aggregate size is large, it may still be good to
perform SROA.  Thus, the most important piece of this patch is simply moving
the aggregate size comparison vs. SRThreshold so that it guards only the
aggregate promotion.

We have also been checking the number of elements to decide if an aggregate
should be split up.  The limit of "SRThreshold/4" seemed rather arbitrary,
and I don't think it's very useful to derive this limit from SRThreshold
anyway.  I've collected some data showing that the current default limit of
32 (since SRThreshold defaults to 128) is a reasonable cutoff for struct
types.  One thing suggested by the data is that distinguishing between structs
and arrays might be useful.  There are (obviously) a lot more large arrays
than large structs (as measured by the number of elements and not the total
size -- a large array inside a struct still counts as a single element given
the way we do SROA right now).  Out of 8377 arrays where we successfully
performed SROA while compiling a large set of benchmarks, only 16 of them had
more than 8 elements.  And, for those 16 arrays, it's not at all clear that
SROA was actually beneficial.  So, to offset the compile time cost of
investigating more large structs for SROA, the patch lowers the limit on array
elements to 8.

This fixes Apple Radar 7563690.

llvm-svn: 95224
2010-02-03 17:23:56 +00:00
Benjamin Kramer
d408373dbe Use the less expensive getName function instead of getNameStr.
llvm-svn: 94683
2010-01-27 19:46:52 +00:00
Bob Wilson
1882fba746 Change Value::getUnderlyingObject to have the MaxLookup value specified as a
parameter with a default value, instead of just hardcoding it in the
implementation.  The limit of MaxLookup = 6 was introduced in r69151 to fix
a performance problem with O(n^2) behavior in instcombine, but the scalarrepl
pass is relying on getUnderlyingObject to go all the way back to an AllocaInst.
Making the limit part of the method signature makes it clear that by default
the result is limited and should help avoid similar problems in the future.
This fixes pr6126.

llvm-svn: 94433
2010-01-25 18:26:54 +00:00
Victor Hernandez
d44671b931 DbgInfoIntrinsics no longer appear in an instruction's use list, so clean up the code that looks for them when iterating over uses and remove OnlyUsedByDbgInfoIntrinsics()
llvm-svn: 94111
2010-01-21 23:05:53 +00:00
Bob Wilson
e94d77854f Fix a crash in scalarrepl for memcpy/memmove where the source and destination
are the same.  I had already fixed a similar problem where the source and
destination were different bitcasts derived from the same alloca, but the
previous fix still did not handle the case where both operands are exactly
the same value.  Radar 7552893.

llvm-svn: 93848
2010-01-19 04:32:48 +00:00
Dan Gohman
e4a22bdd4a Use do+while instead of while for loops which obviously have a
non-zero trip count. Use SmallVector's pop_back_val().

llvm-svn: 92734
2010-01-05 16:27:25 +00:00
David Greene
578f63538f Change errs() to dbgs().
llvm-svn: 92610
2010-01-05 01:27:09 +00:00
Chris Lattner
4a7bce50b8 Fix the convert-to-scalar code to not insert dead loads in the store case. The
load is needed when we have a small store into a large alloca (at which 
point we get a load/insert/store sequence), but when you do a full-sized
store, this load ends up being dead.

This dead load is bad in really large nasty testcases where the load ends
up causing mem2reg to insert large chains of dependent phi nodes which only
ADCE can delete.  Instead of doing this, just don't insert the dead load.

This fixes rdar://6864035
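
A sketch of the two store cases (hypothetical values), with the alloca being converted to a single <4 x float>:

  ; small store: the old value is needed to build the merged result
  %old = load <4 x float>* %a
  %new = insertelement <4 x float> %old, float %f, i32 1
  store <4 x float> %new, <4 x float>* %a

  ; full-sized store: the merged value is just %v, so a load of the
  ; old value would be dead and is no longer emitted
  store <4 x float> %v, <4 x float>* %a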

llvm-svn: 91917
2009-12-22 19:33:28 +00:00
Chris Lattner
0fbb62fcc5 fix some fixme's by using twines
llvm-svn: 91916
2009-12-22 19:23:33 +00:00
Bob Wilson
0dc93264b1 Generalize SROA to allow the first index of a GEP to be non-zero. Add a
missing check that an array reference doesn't go past the end of the array,
and remove some redundant checks for in-bound array and vector references
that are no longer needed.
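
A sketch of the newly allowed form (hypothetical alloca): after a bitcast to an element pointer, the single GEP index is non-zero, and the new check ensures it stays within the alloca.

  %a = alloca [4 x float]
  %base = bitcast [4 x float]* %a to float*
  %p = getelementptr float* %base, i32 2   ; non-zero first index, still in bounds
  %v = load float* %p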

llvm-svn: 91897
2009-12-22 06:57:14 +00:00
Bob Wilson
eb77079db5 Remove special-case SROA optimization of variable indexes to one-element and
two-element arrays.  After restructuring the SROA code, it was not safe to
do this without adding more checking.  It is not clear that this special-case
has really been useful, and removing this simplifies the code quite a bit.

llvm-svn: 91828
2009-12-21 18:39:47 +00:00
Bob Wilson
067a1be3e9 Update my SROA changes in response to review.
* change FindElementAndOffset to return a uint64_t instead of unsigned, and
  to identify the type to be used for that result in a GEP instruction.
* move "isa<ConstantInt>" to be first in conditional.
* replace some dyn_casts with casts.
* add a comment about handling mem intrinsics.

llvm-svn: 91762
2009-12-19 06:53:17 +00:00
Bob Wilson
03b6955c7f Reapply 91459 with a simple fix for the problem that broke the x86_64-darwin
bootstrap.  This also replaces the WeakVH references that Chris objected to
with normal Value references.

llvm-svn: 91711
2009-12-18 20:14:40 +00:00
Bob Wilson
9d4b46c0e6 Re-revert 91459. It's breaking the x86_64 darwin bootstrap.
llvm-svn: 91607
2009-12-17 18:34:24 +00:00
Daniel Dunbar
c4abbc0ab6 Reapply r91459; it was only unmasking the bug, and since TOT is still broken, having it reverted does no good.
llvm-svn: 91559
2009-12-16 20:09:53 +00:00
Daniel Dunbar
929f303477 Revert "Reapply 91184 with fixes and an addition to the testcase to cover the
problem", this broke llvm-gcc bootstrap for release builds on
x86_64-apple-darwin10.

This reverts commit db22309800b224a9f5f51baf76071d7a93ce59c9.

llvm-svn: 91534
2009-12-16 10:56:17 +00:00
Bob Wilson
8505aa648d Reapply 91184 with fixes and an addition to the testcase to cover the problem
found last time.  Instead of trying to modify the IR while iterating over it,
I've changed it to keep a list of WeakVH references to dead instructions, and
then delete those instructions later.  I also added some special case code to
detect and handle the situation when both operands of a memcpy intrinsic are
referencing the same alloca.

llvm-svn: 91459
2009-12-15 22:00:51 +00:00
Chris Lattner
6603d21f13 revert r91184, because it causes a crash on a .bc file I just
sent to Bob.

llvm-svn: 91268
2009-12-14 05:11:02 +00:00
Bob Wilson
8486cae4ce Revise scalar replacement to be more flexible about handling bitcasts and GEPs.
While scanning through the uses of an alloca, keep track of the current offset
relative to the start of the alloca, and check memory references to see if
the offset & size correspond to a component within the alloca.  This has the
nice benefit of unifying much of the code from isSafeUseOfAllocation,
isSafeElementUse, and isSafeUseOfBitCastedAllocation.  The code to rewrite
the uses of a promoted alloca, after it is determined to be safe, is
reorganized in the same way.

Also, when rewriting GEP instructions, mark them as "in-bounds" since all the
indices are known to be safe.

llvm-svn: 91184
2009-12-11 23:47:40 +00:00
Bob Wilson
9e68616e49 Fix a comment.
llvm-svn: 90975
2009-12-09 18:05:27 +00:00
Bob Wilson
04cc375a1a Some superficial cleanups.
llvm-svn: 90866
2009-12-08 18:27:03 +00:00
Bob Wilson
d673b32280 Clean up dead operands left around after SROA replaces a mem intrinsic.
I'm not aware that this does anything significant on its own, but it's
needed for another patch that I'm working on.

llvm-svn: 90864
2009-12-08 18:22:03 +00:00
Bob Wilson
8c1617ed73 Fix up some comments.
llvm-svn: 90603
2009-12-04 21:57:37 +00:00
Bob Wilson
514e0d319a Fix 80-column violations.
llvm-svn: 90601
2009-12-04 21:51:35 +00:00
Benjamin Kramer
d9780ec7c5 Revert r90089 for now, it's breaking selfhost.
llvm-svn: 90097
2009-11-29 21:17:48 +00:00
Benjamin Kramer
a1d24b5a8d Fix two FIXMEs.
llvm-svn: 90089
2009-11-29 20:29:30 +00:00
Chris Lattner
cdfa9dadf1 fix PR5436 by making the 'simple' case of SRoA not promote out-of-range
array indexes.  The "complex" case of SRoA still handles them, and does so correctly.

This fixes a weirdness where we'd correctly avoid transforming A[0][42] if
the 42 was too large, but we'd only do it if it was one gep, not two separate
ones.
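
The weirdness, sketched against a hypothetical alloca: both forms compute the same out-of-range address for A[0][42], but only the single-GEP form used to be rejected.

  %A = alloca [1 x [8 x i32]]
  ; one gep: index 42 is out of range for [8 x i32]
  %p1 = getelementptr [1 x [8 x i32]]* %A, i32 0, i32 0, i32 42
  ; two separate geps computing the same address
  %row = getelementptr [1 x [8 x i32]]* %A, i32 0, i32 0
  %p2 = getelementptr [8 x i32]* %row, i32 0, i32 42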

llvm-svn: 90007
2009-11-27 16:37:41 +00:00
Nick Lewycky
b3bedf4b2d Pull LLVMContext out of PromoteMemToReg.
llvm-svn: 89645
2009-11-23 03:50:44 +00:00
Victor Hernandez
8428eb5720 Remove AllocationInst. Since MallocInst went away, AllocaInst is the only subclass of AllocationInst, so it is no longer necessary.
llvm-svn: 84969
2009-10-23 21:09:37 +00:00
Chris Lattner
a3cf123e86 strength reduce a ton of type equality tests to check the typeid (through
the new predicates I added) instead of going through a context and doing a
pointer comparison.  Besides being cheaper, this allows a smart compiler
to turn the if sequence into a switch.

llvm-svn: 83297
2009-10-05 05:54:46 +00:00
Nick Lewycky
85f7bd1942 Add more newlines to make up for the ones removed from the end of instructions.
llvm-svn: 81851
2009-09-15 07:08:25 +00:00
Chris Lattner
4dd9afd1be add newline to debug dump
llvm-svn: 81840
2009-09-15 05:14:57 +00:00
Chris Lattner
5f54c5bd1c eliminate VISIBILITY_HIDDEN from Transforms/Scalar. PR4861
llvm-svn: 80766
2009-09-02 06:11:42 +00:00
Chris Lattner
01dae858b6 eliminate the "Value" printing methods that print to a std::ostream.
This required converting a bunch of stuff off DOUT and other cleanups.

llvm-svn: 79819
2009-08-23 04:37:46 +00:00
Dan Gohman
e982de2e30 Make SROA and PredicateSimplifier cope if TargetData is not
available. This is very conservative for now.

llvm-svn: 79442
2009-08-19 18:22:18 +00:00
Nick Lewycky
e791519ea3 Don't crash trying to promote VLAs.
llvm-svn: 79226
2009-08-17 05:37:31 +00:00
Owen Anderson
9df206d02d Push LLVMContexts through the IntegerType APIs.
llvm-svn: 78948
2009-08-13 21:58:54 +00:00
Owen Anderson
1dc40e205b Move a few more APIs back to 2.5 forms. The only remaining ones left to change back are
metadata related, which I'm waiting on to avoid conflicting with Devang.

llvm-svn: 77721
2009-07-31 20:28:14 +00:00
Owen Anderson
93ccaf5c60 Move more code back to 2.5 APIs.
llvm-svn: 77635
2009-07-30 23:03:37 +00:00
Daniel Dunbar
81f704c26a Twines: Don't allow implicit conversion from integers; this is too tricky.
llvm-svn: 77605
2009-07-30 17:37:43 +00:00
Daniel Dunbar
4d07efb3c4 Switch obvious clients to Twine instead of utostr (when they were already using
a Twine, e.g., for names).
 - I am a little ambivalent about this; we don't want the string conversion of
   utostr, but using the overloaded '+' mixed with string and integer arguments is
   sketchy. On the other hand, this particular usage is something of an idiom.

llvm-svn: 77579
2009-07-30 04:20:37 +00:00
Owen Anderson
881d928f9b Move types back to the 2.5 API.
llvm-svn: 77516
2009-07-29 22:17:13 +00:00
Owen Anderson
0ce2151b36 Move ConstantExpr to 2.5 API.
llvm-svn: 77494
2009-07-29 18:55:55 +00:00
Owen Anderson
390e9778d4 Return ConstantVector to 2.5 API.
llvm-svn: 77366
2009-07-28 21:19:26 +00:00
Daniel Dunbar
251177c96e Initial update to VMCore to use Twines for string arguments.
- The only meat here is in Value.{h,cpp}; the rest is essentially 'const
   std::string &' -> 'const Twine &'.

llvm-svn: 77048
2009-07-25 04:41:11 +00:00
Owen Anderson
cc33e89571 Revert the ConstantInt constructors back to their 2.5 forms where possible, thanks to contexts-on-types. More to come.
llvm-svn: 77011
2009-07-24 23:12:02 +00:00
Owen Anderson
cc287b28c9 Get rid of the Pass+Context magic.
llvm-svn: 76702
2009-07-22 00:24:57 +00:00
Owen Anderson
4483fbda5e Revert yesterday's change by removing the LLVMContext parameter to AllocaInst and MallocInst.
llvm-svn: 75863
2009-07-15 23:53:25 +00:00
Owen Anderson
8c85061ee6 Move EVER MORE stuff over to LLVMContext.
llvm-svn: 75703
2009-07-14 23:09:55 +00:00