llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-20 19:42:54 +02:00

Author	SHA1	Message	Date
Nadav Rotem	fd924ec3c6	Vectorizer: Add support for loops with an unknown count. For example: for (i=0; i<n; i++){ a[i] = b[i+1] + c[i+3]; } llvm-svn: 166165	2012-10-18 05:29:12 +00:00
Nadav Rotem	8303c909c7	Add a loop vectorizer. llvm-svn: 166112	2012-10-17 18:25:06 +00:00
Chandler Carruth	7d68167eb0	This just in, it is a bad idea to use 'udiv' on an offset of a pointer. A very bad idea. Let's not do that. Fixes PR14105. Note that this wasn't that glaring of an oversight. Originally, these routines were only called on offsets within an alloca, which are intrinsically positive. But over the evolution of the pass, they ended up being called for arbitrary offsets, and things went downhill... llvm-svn: 166095	2012-10-17 09:23:48 +00:00
Michael Gottesman	b9205da3c6	[InstCombine] Teach InstCombine how to handle an obfuscated splat. An obfuscated splat is where the frontend poorly generates code for a splat using several different shuffles to create the splat, i.e., %A = load <4 x float>* %in_ptr, align 16 %B = shufflevector <4 x float> %A, <4 x float> undef, <4 x i32> <i32 0, i32 0, i32 undef, i32 undef> %C = shufflevector <4 x float> %B, <4 x float> %A, <4 x i32> <i32 0, i32 1, i32 4, i32 undef> %D = shufflevector <4 x float> %C, <4 x float> %A, <4 x i32> <i32 0, i32 1, i32 2, i32 4> llvm-svn: 166061	2012-10-16 21:29:38 +00:00
Chandler Carruth	0659e10e8a	Update the memcpy rewriting to fully support widened int rewriting. This includes extracting ints for copying elsewhere and inserting ints when copying into the alloca. This should fix the CanSROA assertion coming out of Clang's regression test suite. llvm-svn: 165931	2012-10-15 10:24:43 +00:00
Chandler Carruth	7755041393	Follow-up fix to r165928: handle memset rewriting for widened integers, and generally clean up the memset handling. It had rotted a bit as the other rewriting logic got polished more. llvm-svn: 165930	2012-10-15 10:24:40 +00:00
Chandler Carruth	65613836e9	First major step toward addressing PR14059. This teaches SROA to handle cases where we have partial integer loads and stores to an otherwise promotable alloca to widen[1] those loads and stores to cover the entire alloca and bitcast them into the appropriate type such that promotion can proceed. These partial loads and stores stem from an annoying confluence of ARM's calling convention and ABI lowering and the FCA pre-splitting which takes place in SROA. Clang lowers a { double, double } in-register function argument as a [4 x i32] function argument to ensure it is placed into integer 32-bit registers (a really unnerving implicit contract between Clang and the ARM backend I would add). This results in a FCA load of [4 x i32]* from the { double, double } alloca, and SROA decomposes this into a sequence of i32 loads and stores. Inlining proceeds, code gets folded, but at the end of the day, we still have i32 stores to the low and high halves of a double alloca. Widening these to be i64 operations, and bitcasting them to double prior to loading or storing allows promotion to proceed for these allocas. I looked quite a bit at changing the IR which Clang produces for this case to be more friendly, but small changes seem unlikely to help. I think the best representation we could use currently would be to pass 4 i32 arguments thereby avoiding any FCAs, but that would still require this fix. It seems like it might eventually be nice to somehow encode the ABI register selection choices outside of the parameter type system so that the parameter can be a { double, double }, but the CC register annotations indicate that this should be passed via 4 integer registers. This patch does not address the second problem in PR14059, which is the reverse: when a struct alloca is loaded as a larger single integer. This patch also does not address some of the code quality issues with the FCA-splitting. Those don't actually impede any optimizations really, but they're on my list to clean up. [1]: Pedantic footnote: for those concerned about memory model issues here, this is safe. For the alloca to be promotable, it cannot escape or have any use of its address that could allow these loads or stores to be racing. Thus, widening is always safe. llvm-svn: 165928	2012-10-15 08:40:30 +00:00
Meador Inge	a24293f9d8	instcombine: Migrate strcmp and strncmp optimizations This patch migrates the strcmp and strncmp optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 165915	2012-10-15 03:47:37 +00:00
Meador Inge	7fafebf1b4	instcombine: Migrate strchr and strrchr optimizations This patch migrates the strchr and strrchr optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 165875	2012-10-13 16:45:37 +00:00
Meador Inge	29f1f4f7b1	instcombine: Migrate strcat and strncat optimizations This patch migrates the strcat and strncat optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 165874	2012-10-13 16:45:32 +00:00
Chandler Carruth	5aece20d98	Teach SROA to cope with wrapper aggregates. These show up a lot in ABI type coercion code, especially when targeting ARM. Things like [1 x i32] instead of i32 are very common there. The goal of this logic is to ensure that when we are picking an alloca type, we look through such wrapper aggregates and across any zero-length aggregate elements to find the simplest type possible to form a type partition. This logic should (generally speaking) rarely fire. It only ends up kicking in when an alloca is accessed using two different types (for instance, i32 and float), and the underlying alloca type has wrapper aggregates around it. I noticed a significant amount of this occurring looking at stepanov_abstraction generated code for arm, and suspect it happens elsewhere as well. Note that this doesn't yet address truly heinous IR productions such as PR14059 is concerning. Those result in mismatched sizes of types in addition to mismatched access and alloca types. llvm-svn: 165870	2012-10-13 10:49:33 +00:00
Nick Lewycky	211cb06fc3	Don't crash when !tbaa.struct contents is invalid. llvm-svn: 165693	2012-10-11 02:05:23 +00:00
Duncan Sands	9db0be6c19	Add the testcase from pr13254 (the old scalarrepl pass handles this wrong; the new sroa pass handles it right). llvm-svn: 165644	2012-10-10 18:41:19 +00:00
Michael Ilseman	072e9fdbb9	New EarlyCSE tests for CSE-ing across commutativity. llvm-svn: 165510	2012-10-09 16:58:13 +00:00
Alexey Samsonov	561aa02d50	Fix PR14016. DeadArgumentElimination pass can replace one LLVM function with another, invalidating a pointer stored in debug info metadata entry for this function. To fix this, we collect debug info descriptors for functions before running a DeadArgumentElimination pass and "patch" pointers in metadata nodes if we replace a function. llvm-svn: 165490	2012-10-09 08:13:15 +00:00
Chandler Carruth	e55d2920b1	Fix PR14034, an infloop / heap corruption / crash bug in the new SROA. Thanks to Benjamin for the raw test case. This one took about 50 times longer to reduce than to fix. =/ llvm-svn: 165476	2012-10-09 01:58:35 +00:00
Micah Villmow	fe3338a7eb	Move TargetData to DataLayout. llvm-svn: 165403	2012-10-08 16:39:34 +00:00
Chandler Carruth	353476d536	Teach the new SROA a new trick. Now we zap any memcpy or memmoves which are in fact identity operations. We detect these and kill their partitions so that even splitting is unaffected by them. This is particularly important because Clang relies on emitting identity memcpy operations for struct copies, and these fold away to constants very often after inlining. Fixes the last big performance FIXME I have on my plate. llvm-svn: 165285	2012-10-05 01:29:09 +00:00
Benjamin Kramer	bbb006ad7d	SimplifyCFG: Enhance the "remove CFG edge that leads to null pointer dereference" optimization to also handle instructions with multiple uses. We conservatively only check the first use to avoid walking long use chains. This catches the common case of having both a load and a store to a pointer supplied by a PHI node. llvm-svn: 165232	2012-10-04 16:11:49 +00:00
Duncan Sands	0ebee9338d	In my recent change to avoid use of underaligned memory I didn't notice that cpyDest can be mutated in some cases, which would then cause a crash later if indeed the memory was underaligned. This brought down several buildbots, so I guess the underaligned case is much more common than I thought! llvm-svn: 165228	2012-10-04 13:53:21 +00:00
Duncan Sands	448245e370	The alignment of an sret parameter is known: it must be at least the alignment of the return type. Teach the optimizers this. llvm-svn: 165226	2012-10-04 13:36:31 +00:00
Chandler Carruth	9d83695de1	Fix PR13969, a mini-phase-ordering issue with the new SROA pass. Currently, we re-visit allocas when something changes about the way they might be split to allow better scalarization to take place. However, we weren't handling the case when the promotion is what would change the behavior of SROA. When an address derived from an alloca is stored into another alloca, we consider the first to have escaped. If the second is ever promoted to an SSA value, we will suddenly be able to run the SROA pass on the first alloca. This patch adds explicit support for this form of iteration. When we detect a store of a pointer derived from an alloca, we flag the underlying alloca for reprocessing after promotion. The logic works hard to only do this when there is definitely going to be promotion and it might remove impediments to the analysis of the alloca. Thanks to Nick for the great test case and Benjamin for some sanity check review. llvm-svn: 165223	2012-10-04 12:33:50 +00:00
Duncan Sands	86f8827745	The memcpy optimizer was happily doing call slot forwarding when the new memory was less aligned than the old. In the testcase this results in an overaligned memset: the memset alignment was correct for the original memory but is too much for the new memory. Fix this by either increasing the alignment of the new memory or bailing out if that isn't possible. Should fix the gcc-4.7 self-host buildbot failure. llvm-svn: 165220	2012-10-04 10:54:40 +00:00
Chandler Carruth	6e02238cee	Teach the integer-promotion rewrite strategy to be endianness aware. Sorry for this being broken so long. =/ As part of this, switch all of the existing tests to be Little Endian, which is the behavior I was asserting in them anyways! Add in a new big-endian test that checks the interesting behavior there. Another part of this is to tighten the rules about when we perform the full-integer promotion. This logic now rejects cases where the fully promoted integer is a non-multiple-of-8 bitwidth or cases where the loads or stores touch bits which are in the allocated space of the alloca but are not loaded or stored when accessing the integer. Sadly, these aren't really observable today as the rest of the pass will already ensure the invariants hold. However, the latter situation is likely to become a potential concern in the future. Thanks to Benjamin and Duncan for early review of this patch. I'm still looking into whether there are further endianness issues, please let me know if anyone sees BE failures persisting past this. llvm-svn: 165219	2012-10-04 10:39:28 +00:00
Jakub Staszak	73d9bdcca5	Fix PR13967. llvm-svn: 165187	2012-10-03 23:59:47 +00:00
Chandler Carruth	57e63536e6	Fix an issue where we failed to adjust the alignment constraint on a memcpy to reflect that '0' has a different meaning when applied to a load or store. Now we correctly use underaligned loads and stores for the test case added. llvm-svn: 165101	2012-10-03 08:26:28 +00:00
Chandler Carruth	c0353523f6	Try to use a better set of abstractions for computing the alignment necessary during rewriting. As part of this, fix a real think-o here where we might have left off an alignment specification when the address is in fact underaligned. I haven't come up with any way to trigger this, as there is always some other factor that reduces the alignment, but it certainly might have been an observable bug in some way I can't think of. This also slightly changes the strategy for placing explicit alignments on loads and stores to only do so when the alignment does not match that required by the ABI. This causes a few redundant alignments to go away from test cases. I've also added a couple of tests that really push on the alignment that we end up with on loads and stores. More to come here as I try to fix an underlying bug I have conjectured and produced test cases for, although it's not clear if this bug is the one currently hitting dragonegg's gcc47 bootstrap. llvm-svn: 165100	2012-10-03 08:14:02 +00:00
Chandler Carruth	72359007f5	Teach the new SROA to handle cases where an alloca that has already been scheduled for processing on the worklist eventually gets deleted while we are processing another alloca, fixing the original test case in PR13990. To facilitate this, add a remove_if helper to the SetVector abstraction. It's not easy to use the standard abstractions for this because of the specifics of SetVectors types and implementation. Finally, a nice small test case is included. Thanks to Benjamin for the fantastic reduced test case here! All I had to do was delete some empty basic blocks! llvm-svn: 165065	2012-10-02 22:46:45 +00:00
Benjamin Kramer	aa07e96212	Fix broken tests. llvm-svn: 165019	2012-10-02 15:49:34 +00:00
Chandler Carruth	6be8e1c6b5	Fix more misspellings found by Duncan during review. llvm-svn: 164940	2012-10-01 12:30:45 +00:00
Chandler Carruth	93f31cefdc	Fix several issues with alignment. We weren't always accounting for type alignment requirements of the new alloca. As one consequence which was reported as a bug by Duncan, we overaligned memcpy calls to ranges of allocas after they were rewritten to types with lower alignment requirements. Other consequences are possible, but I don't have any test cases for them. llvm-svn: 164937	2012-10-01 12:16:54 +00:00
Benjamin Kramer	54a33840fc	SimplifyCFG: Don't crash when forming a switch bitmap with an undef default value. Fixes PR13985. llvm-svn: 164934	2012-10-01 11:31:48 +00:00
Chandler Carruth	0ba7581e2f	Refactor the PartitionUse structure to actually use the Use* instead of a pair of instructions, one for the used pointer and the second for the user. This simplifies the representation and also makes it more dense. This was noticed because of the miscompile in PR13926. In that case, we were running up against a fundamental "bad idea" in the speculation of PHI and select instructions: the speculation and rewriting are interleaved, which requires phi speculation to also perform load rewriting! This is bad, and causes us to miss opportunities to do (for example) vector rewriting only exposed after PHI speculation, etc etc. It also, in the old system, required us to insert new load uses into the current partition's use list, which would then be ignored during rewriting because we had already extracted an end iterator for the use list. The appending behavior (and much of the other oddities) stem from the strange de-duplication strategy in the PartitionUse builder. Amusingly, all this went without notice for so long because it could only be triggered by having different GEPs into the same partition of the same alloca, where both different GEPs were operands of a single PHI, and where the GEP which was not encountered first also had multiple uses within that same PHI node... Hence the insane steps required to reproduce. So, step one in fixing this fundamental bad idea is to make the PartitionUse actually contain a Use*, and to make the builder do proper deduplication instead of funky de-duplication. This is enough to remove the appending behavior, and fix the miscompile in PR13926, but there is more work to be done here. Subsequent commits will lift the speculation into its own visitor. It'll be a useful step toward potentially extracting all of the speculation logic into a generic utility transform. The existing PHI test case for repeated operands has been made more extreme to catch even these issues. This test case, run through the old pass, will exactly reproduce the miscompile from PR13926. ;] We were so close here! llvm-svn: 164925	2012-10-01 01:49:22 +00:00
Chandler Carruth	477c891332	Fix a somewhat surprising miscompile where code relying on an ABI alignment could lose it due to the alloca type moving down to a much smaller alignment guarantee. Now SROA will actively compute a proper alignment, factoring the target data, any explicit alignment, and the offset within the struct. This will in some cases lower the alignment requirements, but when we lower them below those of the type, we drop the alignment entirely to give freedom to the code generator to align it however is convenient. Thanks to Duncan for the lovely test case that pinned this down. =] llvm-svn: 164891	2012-09-29 10:41:21 +00:00
Evan Cheng	f4c080b01e	Add test case for r164850. llvm-svn: 164867	2012-09-29 00:12:08 +00:00
Benjamin Kramer	4193023537	CorrelatedPropagation: BasicBlock::removePredecessor can simplify PHI nodes. If it's the condition of a SwitchInst, reload it. Fixes PR13972. llvm-svn: 164818	2012-09-28 10:42:50 +00:00
Benjamin Kramer	b4a61e5a00	GlobalOpt: non-constexpr bitcasts or GEPs can occur even if the global value is only stored once. Fixes PR13968. llvm-svn: 164815	2012-09-28 10:01:27 +00:00
Nick Lewycky	689f61680b	Surprisingly, we missed a trivial case here. Fix that! llvm-svn: 164814	2012-09-28 09:33:53 +00:00
Meador Inge	55db33f26d	instcombine: Add more test cases for __strncpy_chk simplification llvm-svn: 164800	2012-09-27 21:21:31 +00:00
Meador Inge	5e916a5b6f	instcombine: Add more test cases for __strcpy_chk simplification llvm-svn: 164799	2012-09-27 21:21:28 +00:00
Meador Inge	a7ed72c476	instcombine: Add more test cases for __memmove_chk simplification llvm-svn: 164798	2012-09-27 21:21:25 +00:00
Meador Inge	89cd434f53	instcombine: Add more test cases for __memcpy_chk simplification llvm-svn: 164797	2012-09-27 21:21:21 +00:00
Meador Inge	8fb8751a66	instcombine: Add more test cases for __memset_chk simplification llvm-svn: 164796	2012-09-27 21:21:18 +00:00
Benjamin Kramer	cb89947f87	Fix an integer overflow in SimplifyCFG's look up table formation logic. If the width is very large it gets truncated from uint64_t to uint32_t when passed to TD->fitsInLegalInteger. The truncated value can fit in a register. This manifested in massive memory usage or crashes (PR13946). llvm-svn: 164784	2012-09-27 18:29:58 +00:00
Sylvestre Ledru	b77340e506	Revert 'Fix a typo 'iff' => 'if''. iff is an abbreviation of if and only if. See: http://en.wikipedia.org/wiki/If_and_only_if Commit 164767 llvm-svn: 164768	2012-09-27 10:14:43 +00:00
Sylvestre Ledru	1c5e7904de	Fix a typo 'iff' => 'if' llvm-svn: 164767	2012-09-27 09:59:43 +00:00
Nick Lewycky	9ae46f5c91	Prefer shuffles to selects. Backends love shuffles! llvm-svn: 164763	2012-09-27 08:33:56 +00:00
Hans Wennborg	e1a73f6ca3	Address Duncan's comments on r164684: - Put statistics in alphabetical order - Don't use getZextValue when building TableInt, just use APInts - Introduce Create{Z,S}ExtOrTrunc in IRBuilder. llvm-svn: 164696	2012-09-26 14:01:53 +00:00
Chandler Carruth	8638d35784	When rewriting the pointer operand to a load or store which has alignment guarantees attached, re-compute the alignment so that we consider offsets which impact alignment. llvm-svn: 164690	2012-09-26 10:45:28 +00:00
Chandler Carruth	0254cf6d85	Teach all of the loads, stores, memsets and memcpys created by the rewriter in SROA to carry a proper alignment. This involves interrogating various sources of alignment, etc. This is a more complete and principled fix to PR13920 as well as related bugs pointed out by Eli in review and by inspection in the area. Also by inspection fix the integer and vector promotion paths to create aligned loads and stores. I still need to work up test cases for these... Sorry for the delay, they were found purely by inspection. llvm-svn: 164689	2012-09-26 10:27:46 +00:00
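
Commit 7d68167eb0 above warns against using 'udiv' on a pointer offset once offsets can be negative. The sketch below illustrates that pitfall in plain C++ rather than LLVM IR. The helper names `indexViaUnsignedDiv` and `indexViaSignedDiv` are hypothetical and do not come from the SROA sources; they only contrast unsigned and signed division of the same byte offset.

```cpp
#include <cstdint>

// Hypothetical helper (not from LLVM): divides a byte offset by an element
// size using unsigned arithmetic, as a 'udiv' would. A negative offset is
// reinterpreted as a very large positive value before the division.
constexpr std::uint64_t indexViaUnsignedDiv(std::int64_t offset,
                                            std::uint64_t elemSize) {
    return static_cast<std::uint64_t>(offset) / elemSize;
}

// Hypothetical helper (not from LLVM): the same computation with signed
// division, which keeps the sign of the offset.
constexpr std::int64_t indexViaSignedDiv(std::int64_t offset,
                                         std::int64_t elemSize) {
    return offset / elemSize;
}

// An offset of -8 bytes with 4-byte elements is index -2 when signed...
static_assert(indexViaSignedDiv(-8, 4) == -2, "signed division keeps the sign");

// ...but unsigned division yields (2^64 - 8) / 4 = 2^62 - 2.
static_assert(indexViaUnsignedDiv(-8, 4) == (std::uint64_t{1} << 62) - 2,
              "unsigned division turns a small negative offset into a huge index");
```

For non-negative offsets the two helpers agree, which is consistent with the commit's remark that the problem stayed hidden while the routines only saw offsets within an alloca.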


3140 Commits
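
Commit cb89947f87 in the list above describes a width being truncated from uint64_t to uint32_t before a legality check, so that a huge width appears to fit in a register. The following is a minimal sketch of that failure mode, not the actual SimplifyCFG code. The local `fitsInLegalInteger` stand-in, its 64-bit threshold, and the `fitsUnchecked` and `fitsChecked` wrappers are all assumptions made for illustration.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical stand-in for a target query taking a 32-bit width.
// The 64-bit limit is an assumption chosen for this example only.
constexpr bool fitsInLegalInteger(std::uint32_t widthInBits) {
    return widthInBits <= 64;
}

// Narrowing without a range check: the high 32 bits of the width are lost.
constexpr bool fitsUnchecked(std::uint64_t widthInBits) {
    return fitsInLegalInteger(static_cast<std::uint32_t>(widthInBits));
}

// One possible guard: reject widths that do not survive the narrowing.
constexpr bool fitsChecked(std::uint64_t widthInBits) {
    return widthInBits <= std::numeric_limits<std::uint32_t>::max() &&
           fitsInLegalInteger(static_cast<std::uint32_t>(widthInBits));
}

// 2^32 + 32 truncates to 32, so the unchecked path wrongly accepts it.
constexpr std::uint64_t hugeWidth = (std::uint64_t{1} << 32) + 32;
static_assert(fitsUnchecked(hugeWidth), "truncated width looks legal");
static_assert(!fitsChecked(hugeWidth), "range check rejects the huge width");
static_assert(fitsChecked(32), "ordinary widths are unaffected");
```

The range check shown here is only one way to avoid the truncation; the commit message does not say which approach the real fix took.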