llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-27 22:12:47 +01:00

Author	SHA1	Message	Date
Chris Lattner	2e57722d9d	have AllocaInfo store the alloca being inspected, simplifying callers. No functionality change. llvm-svn: 124067	2011-01-23 07:29:29 +00:00
Chris Lattner	2064a3e213	Rearrange some code a bit. Change MarkUnsafe to handle the "Transformation preventing inst" printing, so that -scalarrepl -debug will always print the rejected instruction. No functionality change. llvm-svn: 124066	2011-01-23 07:05:44 +00:00
Chris Lattner	ba66871643	remove an old hack that avoided creating MMX datatypes. The X86 backend has been fixed. llvm-svn: 124064	2011-01-23 06:40:33 +00:00
Cameron Zwarich	e39e476305	Remove outdated references to dominance frontiers. llvm-svn: 123724	2011-01-18 03:53:26 +00:00
Cameron Zwarich	0a1975802e	Roll r123609 back in with two changes that fix test failures with expensive checks enabled: 1) Use '<' to compare integers in a comparison function rather than '<='. 2) Use the uniqued set DefBlocks rather than Info.DefiningBlocks to initialize the priority queue. The speedup of scalarrepl on test-suite + SPEC2000 + SPEC2006 is a bit less, at just under 16% rather than 17%. llvm-svn: 123662	2011-01-17 17:38:41 +00:00
Cameron Zwarich	77e381e302	Roll out r123609 due to failures on the llvm-x86_64-linux-checks bot. llvm-svn: 123618	2011-01-17 07:26:51 +00:00
Cameron Zwarich	e35642342b	Eliminate the use of dominance frontiers in PromoteMemToReg. In addition to eliminating a potentially quadratic data structure, this also gives a 17% speedup when running -scalarrepl on test-suite + SPEC2000 + SPEC2006. My initial experiment gave a greater speedup around 25%, but I moved the dominator tree level computation from dominator tree construction to PromoteMemToReg. Since this approach to computing IDFs has a much lower overhead than the old code using precomputed DFs, it is worth looking at using this new code for the second scalarrepl pass as well. llvm-svn: 123609	2011-01-17 01:08:59 +00:00
Chris Lattner	2f902bc3b1	tidy up a comment, as suggested by duncan llvm-svn: 123590	2011-01-16 17:46:19 +00:00
Chris Lattner	29f339f87c	if an alloca is only ever accessed as a unit, and is accessed with load/store instructions, then don't try to decimate it into its individual pieces. This will just make a mess of the IR and is pointless if none of the elements are individually accessed. This was generating really terrible code for std::bitset (PR8980) because it happens to be lowered by clang as an {[8 x i8]} structure instead of {i64}. The testcase now is optimized to: define i64 @test2(i64 %X) { br label %L2 L2: ; preds = %0 ret i64 %X } before we generated: define i64 @test2(i64 %X) { %sroa.store.elt = lshr i64 %X, 56 %1 = trunc i64 %sroa.store.elt to i8 %sroa.store.elt8 = lshr i64 %X, 48 %2 = trunc i64 %sroa.store.elt8 to i8 %sroa.store.elt9 = lshr i64 %X, 40 %3 = trunc i64 %sroa.store.elt9 to i8 %sroa.store.elt10 = lshr i64 %X, 32 %4 = trunc i64 %sroa.store.elt10 to i8 %sroa.store.elt11 = lshr i64 %X, 24 %5 = trunc i64 %sroa.store.elt11 to i8 %sroa.store.elt12 = lshr i64 %X, 16 %6 = trunc i64 %sroa.store.elt12 to i8 %sroa.store.elt13 = lshr i64 %X, 8 %7 = trunc i64 %sroa.store.elt13 to i8 %8 = trunc i64 %X to i8 br label %L2 L2: ; preds = %0 %9 = zext i8 %1 to i64 %10 = shl i64 %9, 56 %11 = zext i8 %2 to i64 %12 = shl i64 %11, 48 %13 = or i64 %12, %10 %14 = zext i8 %3 to i64 %15 = shl i64 %14, 40 %16 = or i64 %15, %13 %17 = zext i8 %4 to i64 %18 = shl i64 %17, 32 %19 = or i64 %18, %16 %20 = zext i8 %5 to i64 %21 = shl i64 %20, 24 %22 = or i64 %21, %19 %23 = zext i8 %6 to i64 %24 = shl i64 %23, 16 %25 = or i64 %24, %22 %26 = zext i8 %7 to i64 %27 = shl i64 %26, 8 %28 = or i64 %27, %25 %29 = zext i8 %8 to i64 %30 = or i64 %29, %28 ret i64 %30 } In this case, instcombine was able to eliminate the nonsense, but in PR8980 enough PHIs are in play that instcombine backs off. It's better to not generate this stuff in the first place. llvm-svn: 123571	2011-01-16 06:18:28 +00:00
Chris Lattner	b43fce09a9	Use an irbuilder to get some trivial constant folding when doing a store of a constant. llvm-svn: 123570	2011-01-16 05:58:24 +00:00
Chris Lattner	2067fb2a93	enhance FoldOpIntoPhi in instcombine to try harder when a phi has multiple uses. In some cases, all the uses are the same operation, so instcombine can go ahead and promote the phi. In the testcase this pushes an add out of the loop. llvm-svn: 123568	2011-01-16 05:28:59 +00:00
Chris Lattner	e6d5b3c4ce	Generalize LoadAndStorePromoter a bit and switch LICM to use it. llvm-svn: 123501	2011-01-15 00:12:35 +00:00
Chris Lattner	1ce35a0362	switch SRoA to use LoadAndStorePromoter instead of its own copy of the code. llvm-svn: 123457	2011-01-14 19:50:47 +00:00
Chris Lattner	8e171470d3	split SROA into two passes: one that uses DomFrontiers (-scalarrepl) and one that uses SSAUpdater (-scalarrepl-ssa) llvm-svn: 123436	2011-01-14 08:13:00 +00:00
Chris Lattner	b5c39352d8	Implement full support for promoting allocas to registers using SSAUpdater instead of DomTree/DomFrontier. This may be interesting for reducing compile time. This is currently disabled, but seems to work just fine. When this is enabled, we eliminate two runs of dominator frontier, one in the "early per-function" optimizations and one in the "interlaced with inliner" function passes. llvm-svn: 123434	2011-01-14 07:50:47 +00:00
Bob Wilson	1238f872da	Fix whitespace. llvm-svn: 123396	2011-01-13 20:59:44 +00:00
Bob Wilson	fbab825516	Check for empty structs, and for consistency, zero-element arrays. llvm-svn: 123383	2011-01-13 18:26:59 +00:00
Bob Wilson	3b0197489e	Extend SROA to handle arrays accessed as homogeneous structs and vice versa. This is a minor extension of SROA to handle a special case that is important for some ARM NEON operations. Some of the NEON intrinsics return multiple values, which are handled as struct types containing multiple elements of the same vector type. The corresponding return types declared in the arm_neon.h header have equivalent arrays. We need SROA to recognize that it can split up those arrays and structs into separate vectors, even though they are not always accessed with the same type. SROA already handles loads and stores of an entire alloca by using insertvalue/extractvalue to access the individual pieces, and that code works the same regardless of whether the type is a struct or an array. So, all that needs to be done is to check for compatible arrays and homogeneous structs. llvm-svn: 123381	2011-01-13 17:45:11 +00:00
Bob Wilson	9f8d730f9b	Make SROA more aggressive with allocas containing padding. SROA only split up structs and arrays one level at a time, so padding can only cause trouble if it is located in between the struct or array elements. llvm-svn: 123380	2011-01-13 17:45:08 +00:00
Chris Lattner	e396e846b4	split dom frontier handling stuff out to its own DominanceFrontier header, so that Dominators.h is just domtree. Also prune #includes a bit. llvm-svn: 122714	2011-01-02 22:09:33 +00:00
Chris Lattner	9007b56712	start using irbuilder to make mem intrinsics in a few passes. llvm-svn: 122572	2010-12-26 22:57:41 +00:00
Mon P Wang	eb2ae28352	Preserve the address space when generating bitcasts for MemTransferInst in ConvertToScalarInfo llvm-svn: 122462	2010-12-23 01:41:32 +00:00
Dan Gohman	295ba3ab26	Move Value::getUnderlyingObject to be a standalone function so that it can live in Analysis instead of VMCore. llvm-svn: 121885	2010-12-15 20:02:24 +00:00
Nick Lewycky	f6fa6b29f4	Treat a call of function pointer like a load of the pointer when considering whether the pointer can be replaced with the global variable it is a copy of. Fixes PR8680. llvm-svn: 120126	2010-11-24 22:04:20 +00:00
Benjamin Kramer	9141603779	Simplify code. No change in functionality. llvm-svn: 119908	2010-11-20 18:43:35 +00:00
Chris Lattner	f8540ee386	finish a thought. llvm-svn: 119690	2010-11-18 07:32:33 +00:00
Chris Lattner	6048697a30	allow eliminating an alloca that is just copied from an constant global if it is passed as a byval argument. The byval argument will just be a read, so it is safe to read from the original global instead. This allows us to promote away the %agg.tmp alloca in PR8582 llvm-svn: 119686	2010-11-18 06:41:51 +00:00
Chris Lattner	791e914b1b	enhance the "alloca is just a memcpy from constant global" to ignore calls that obviously can't modify the alloca because they are readonly/readnone. llvm-svn: 119683	2010-11-18 06:26:49 +00:00
Chris Lattner	44ccd4643d	fix a small oversight in the "eliminate memcpy from constant global" optimization. If the alloca that is "memcpy'd from constant" also has a memcpy from it, ignore it: it is a load. We now optimize the testcase to: define void @test2() { %B = alloca %T %a = bitcast %T* @G to i8* %b = bitcast %T* %B to i8* call void @llvm.memcpy.p0i8.p0i8.i64(i8* %b, i8* %a, i64 124, i32 4, i1 false) call void @bar(i8* %b) ret void } previously we would generate: define void @test() { %B = alloca %T %b = bitcast %T* %B to i8* %G.0 = getelementptr inbounds %T* @G, i32 0, i32 0 %tmp3 = load i8* %G.0, align 4 %G.1 = getelementptr inbounds %T* @G, i32 0, i32 1 %G.15 = bitcast [123 x i8]* %G.1 to i8* %1 = bitcast [123 x i8]* %G.1 to i984* %srcval = load i984* %1, align 1 %B.0 = getelementptr inbounds %T* %B, i32 0, i32 0 store i8 %tmp3, i8* %B.0, align 4 %B.1 = getelementptr inbounds %T* %B, i32 0, i32 1 %B.12 = bitcast [123 x i8]* %B.1 to i8* %2 = bitcast [123 x i8]* %B.1 to i984* store i984 %srcval, i984* %2, align 1 call void @bar(i8* %b) ret void } llvm-svn: 119682	2010-11-18 06:20:47 +00:00
Owen Anderson	46990c17f7	Get rid of static constructors for pass registration. Instead, every pass exposes an initializeMyPassFunction(), which must be called in the pass's constructor. This function uses static dependency declarations to recursively initialize the pass's dependencies. Clients that only create passes through the createFooPass() APIs will require no changes. Clients that want to use the CommandLine options for passes will need to manually call the appropriate initialization functions in PassInitialization.h before parsing commandline arguments. I have tested this with all standard configurations of clang and llvm-gcc on Darwin. It is possible that there are problems with the static dependencies that will only be visible with non-standard options. If you encounter any crash in pass registration/creation, please send the testcase to me directly. llvm-svn: 116820	2010-10-19 17:21:58 +00:00
Benjamin Kramer	a0e156fe6d	Eliminate some calls to Value::getNameStr. llvm-svn: 116670	2010-10-16 11:28:23 +00:00
Owen Anderson	63f757463c	Begin adding static dependence information to passes, which will allow us to perform initialization without static constructors AND without explicit initialization by the client. For the moment, passes are required to initialize both their (potential) dependencies and any passes they preserve. I hope to be able to relax the latter requirement in the future. llvm-svn: 116334	2010-10-12 19:48:12 +00:00
Owen Anderson	69cbf2e8b7	Now with fewer extraneous semicolons! llvm-svn: 115996	2010-10-07 22:25:06 +00:00
Dale Johannesen	c14a1eda84	Massive rewrite of MMX: The x86_mmx type is used for MMX intrinsics, parameters and return values where these use MMX registers, and is also supported in load, store, and bitcast. Only the above operations generate MMX instructions, and optimizations do not operate on or produce MMX intrinsics. MMX-sized vectors <2 x i32> etc. are lowered to XMM or split into smaller pieces. Optimizations may occur on these forms and the result casted back to x86_mmx, provided the result feeds into a previous existing x86_mmx operation. The point of all this is prevent optimizations from introducing MMX operations, which is unsafe due to the EMMS problem. llvm-svn: 115243	2010-09-30 23:57:10 +00:00
Chris Lattner	b1c861e28c	deepen my MMX/SRoA hack to avoid hurting non-x86 codegen. llvm-svn: 112763	2010-09-01 23:09:27 +00:00
Chris Lattner	9759e898f0	add a gross hack to work around a problem that Argiris reported on llvmdev: SRoA is introducing MMX datatypes like <1 x i64>, which then cause random problems because the X86 backend is producing mmx stuff without inserting proper emms calls. In the short term, force off MMX datatypes. In the long term, the X86 backend should not select generic vector types to MMX registers. This is being worked on, but won't be done in time for 2.8. rdar://8380055 llvm-svn: 112696	2010-09-01 05:14:33 +00:00
Chris Lattner	686cddf177	remove dead prototype. llvm-svn: 111342	2010-08-18 02:37:06 +00:00
Owen Anderson	f2fea95f2f	Reapply r110396, with fixes to appease the Linux buildbot gods. llvm-svn: 110460	2010-08-06 18:33:48 +00:00
Owen Anderson	aadd8a89ca	Revert r110396 to fix buildbots. llvm-svn: 110410	2010-08-06 00:23:35 +00:00
Owen Anderson	b9762c07cb	Don't use PassInfo* as a type identifier for passes. Instead, use the address of the static ID member as the sole unique type identifier. Clean up APIs related to this change. llvm-svn: 110396	2010-08-05 23:42:04 +00:00
Owen Anderson	f8addbb0a1	Fix batch of converting RegisterPass<> to INTIALIZE_PASS(). llvm-svn: 109045	2010-07-21 22:09:45 +00:00
Gabor Greif	17c48ecd68	eliminate CallInst::ArgOffset llvm-svn: 108522	2010-07-16 09:38:02 +00:00
Chris Lattner	bf009b527a	Fix the second half of PR7437: scalarrepl wasn't preserving address spaces when SRoA'ing memcpy's. llvm-svn: 107846	2010-07-08 00:27:05 +00:00
Gabor Greif	b6e59f4a82	use getArgOperand instead of getOperand llvm-svn: 107271	2010-06-30 09:16:16 +00:00
Gabor Greif	6bf330181d	employ CallInst::ArgOffset (for now) llvm-svn: 107015	2010-06-28 16:43:57 +00:00
Gabor Greif	e57b886f3d	use cached value llvm-svn: 107000	2010-06-28 11:20:42 +00:00
Chris Lattner	d3a1ef7fea	minor cleanup to SROA: when lowering type unsafe accesses to large integers, the first inserted value would always create an 'or X, 0'. Even though this is trivially zapped by instcombine, don't bother creating this pointless instruction. llvm-svn: 106979	2010-06-27 07:58:26 +00:00
Dan Gohman	74d5144414	Use pre-increment instead of post-increment when the result is not used. llvm-svn: 106542	2010-06-22 15:08:57 +00:00
Gabor Greif	2eaf8afcff	use abstract accessors to CallInst llvm-svn: 101899	2010-04-20 13:13:04 +00:00
Eric Christopher	e78496e5f1	Revert 101465, it broke internal OpenGL testing. Probably the best way to know that all getOperand() calls have been handled is to replace that API instead of updating. llvm-svn: 101579	2010-04-16 23:37:20 +00:00

1 2 3 4 5 ...

299 Commits