llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 12:02:58 +02:00

Author	SHA1	Message	Date
Matt Arsenault	0d6c559675	Constify functions llvm-svn: 189234	2013-08-26 17:56:38 +00:00
Matt Arsenault	fe57252c78	Vectorize starting from insertelements building a vector llvm-svn: 189233	2013-08-26 17:56:35 +00:00
Matt Arsenault	c1b8722791	Check if in set on insertion instead of separately llvm-svn: 189179	2013-08-24 19:55:38 +00:00
Benjamin Kramer	02d328a0f9	Add a function object to compare the first or second component of a std::pair. Replace instances of this scattered around the code base. llvm-svn: 189169	2013-08-24 12:54:27 +00:00
Peter Collingbourne	a2ec50d21b	DataFlowSanitizer: correctly combine labels in the case where they are equal. llvm-svn: 189133	2013-08-23 18:45:06 +00:00
Evgeniy Stepanov	47f9a57504	[msan] Fix handling of va_arg overflow area on x86_64. The code was erroneously reading overflow area shadow from the TLS slot, bypassing the local copy. Reading shadow directly from TLS is wrong, because it can be overwritten by a nested vararg call, if that happens before va_start. llvm-svn: 189104	2013-08-23 12:11:00 +00:00
Richard Sandiford	b195d89bde	Turn MipsOptimizeMathLibCalls into a target-independent scalar transform ...so that it can be used for z too. Most of the code is the same. The only real change is to use TargetTransformInfo to test when a sqrt instruction is available. The pass is opt-in because at the moment it only handles sqrt. llvm-svn: 189097	2013-08-23 10:27:02 +00:00
Alexey Samsonov	e81fe60561	80 cols llvm-svn: 189091	2013-08-23 07:42:51 +00:00
Michael Gottesman	0f9b142f60	Update StripDeadDebugInfo to use DebugInfoFinder so that it is no longer stale to the point of not working and more resilient to debug info changes. The current version of StripDeadDebugInfo became stale and no longer actually worked since it was expecting an older version of debug info. This patch updates it to use DebugInfoFinder and the modern DebugInfo classes as much as possible to make it more redundent to such changes. Additionally, the only place where that was avoided (the code where we replace the old sets with the new), I call verify on the DIContextUnit implying that if the format changes and my live set changes no longer make sense an assert will be hit. In order to ensure that that occurs I have included a test case. The actual stripping of the dead debug info follows the same strategy as was used before in this class: find the live set and replace the old set in the given compile unit (which may contain dead global variables/functions) with the new live one. llvm-svn: 189078	2013-08-23 00:23:24 +00:00
Peter Collingbourne	1e7de1b7af	DataFlowSanitizer: Replace non-instrumented aliases of instrumented functions, and vice versa, with wrappers. Differential Revision: http://llvm-reviews.chandlerc.com/D1442 llvm-svn: 189054	2013-08-22 20:08:15 +00:00
Peter Collingbourne	411f259ed9	DataFlowSanitizer: Factor the wrapper builder out to buildWrapperFunction. Differential Revision: http://llvm-reviews.chandlerc.com/D1441 llvm-svn: 189053	2013-08-22 20:08:11 +00:00
Peter Collingbourne	ac1c1c4377	DataFlowSanitizer: Prefix the name of each instrumented function with "dfs$". DFSan changes the ABI of each function in the module. This makes it possible for a function with the native ABI to be called with the instrumented ABI, or vice versa, thus possibly invoking undefined behavior. A simple way of statically detecting instances of this problem is to prepend the prefix "dfs$" to the name of each instrumented-ABI function. This will not catch every such problem; in particular function pointers passed across the instrumented-native barrier cannot be used on the other side. These problems could potentially be caught dynamically. Differential Revision: http://llvm-reviews.chandlerc.com/D1373 llvm-svn: 189052	2013-08-22 20:08:08 +00:00
Chandler Carruth	e6b6740e73	Teach the SLP vectorizer the correct way to check for consecutive access using GEPs. Previously, it used a number of different heuristics for analyzing the GEPs. Several of these were conservatively correct, but failed to fall back to SCEV even when SCEV might have given a reasonable answer. One was simply incorrect in how it was formulated. There was good code already to recursively evaluate the constant offsets in GEPs, look through pointer casts, etc. I gathered this into a form code like the SLP code can use in a previous commit, which allows all of this code to become quite simple. There is some performance (compile time) concern here at first glance as we're directly attempting to walk both pointers constant GEP chains. However, a couple of thoughts: 1) The very common cases where there is a dynamic pointer, and a second pointer at a constant offset (usually a stride) from it, this code will actually not do any unnecessary work. 2) InstCombine and other passes work very hard to collapse constant GEPs, so it will be rare that we iterate here for a long time. That said, if there remain performance problems here, there are some obvious things that can improve the situation immensely. Doing a vectorizer-pass-wide memoizer for each individual layer of pointer values, their base values, and the constant offset is likely to be able to completely remove redundant work and strictly limit the scaling of the work to scrape these GEPs. Since this optimization was not done on the prior version (which would still benefit from it), I've not done it here. But if folks have benchmarks that slow down it should be straight forward for them to add. I've added a test case, but I'm not really confident of the amount of testing done for different access patterns, strides, and pointer manipulation. llvm-svn: 189007	2013-08-22 12:45:17 +00:00
Matt Arsenault	1b410a4dec	Teach LoopVectorize about address space sizes llvm-svn: 188980	2013-08-22 02:42:55 +00:00
Michael Gottesman	50d821cd72	Fixed typo. llvm-svn: 188957	2013-08-21 22:53:54 +00:00
Michael Gottesman	704d037910	Removed trailing whitespace. llvm-svn: 188956	2013-08-21 22:53:29 +00:00
Yunzhong Gao	9a8c1892ea	No functionality change. Replace "(255 & value)" with "(0xFF & value)" to improve clarity. llvm-svn: 188941	2013-08-21 22:11:15 +00:00
Matt Arsenault	95d00423a7	Teach InstCombine about address spaces llvm-svn: 188926	2013-08-21 19:53:10 +00:00
Matt Arsenault	3e8997425a	Use attribute helper function llvm-svn: 188916	2013-08-21 18:54:50 +00:00
Matt Arsenault	68d9c05b51	Fix typo llvm-svn: 188915	2013-08-21 18:54:47 +00:00
Bill Wendling	68c17b24b6	Move registering the execution of a basic block to the beginning rather than the end. There are situations which can affect the correctness (or at least expectation) of the gcov output. For instance, if a call to __gcov_flush() occurs within a block before the execution count is registered and then the program aborts in some way, then that block will not be marked as executed. This is not normally what the user expects. If we move the code that's registering when a block is executed to the beginning, we can catch these types of situations. PR16893 llvm-svn: 188849	2013-08-20 23:52:00 +00:00
Arnold Schwaighofer	276cfe784a	SLPVectorizer: Fix invalid iterator errors Update iterator when the SLP vectorizer changes the instructions in the basic block by restarting the traversal of the basic block. Patch by Yi Jiang! Fixes PR 16899. llvm-svn: 188832	2013-08-20 21:21:45 +00:00
Hal Finkel	8f395a803a	Add a llvm.copysign intrinsic This adds a llvm.copysign intrinsic; We already have Libfunc recognition for copysign (which is turned into the FCOPYSIGN SDAG node). In order to autovectorize calls to copysign in the loop vectorizer, we need a corresponding intrinsic as well. In addition to the expected changes to the language reference, the loop vectorizer, BasicTTI, and the SDAG builder (the intrinsic is transformed into an FCOPYSIGN node, just like the function call), this also adds FCOPYSIGN to a few lists in LegalizeVector{Ops,Types} so that vector copysigns can be expanded. In TargetLoweringBase::initActions, I've made the default action for FCOPYSIGN be Expand for vector types. This seems correct for all in-tree targets, and I think is the right thing to do because, previously, there was no way to generate vector-values FCOPYSIGN nodes (and most targets don't specify an action for vector-typed FCOPYSIGN). llvm-svn: 188728	2013-08-19 23:35:46 +00:00
Jakub Staszak	b4a62fef02	Use pop_back_val() instead of both back() and pop_back(). llvm-svn: 188723	2013-08-19 22:47:55 +00:00
Matt Arsenault	77bbadbfcb	Teach InstCombine visitGetElementPtr about address spaces llvm-svn: 188721	2013-08-19 22:17:40 +00:00
Matt Arsenault	2c55fe773b	Cleanup visitGetElementPtr to make address space change easier llvm-svn: 188720	2013-08-19 22:17:34 +00:00
Matt Arsenault	14a3f7be8d	commonPointerCast cleanups to make address space change easier llvm-svn: 188719	2013-08-19 22:17:18 +00:00
Matt Arsenault	22098dc46b	Revert non-test parts of r188507 Re-add the inboundsless tests I didn't add originally llvm-svn: 188710	2013-08-19 21:40:31 +00:00
Peter Collingbourne	aa48e92de9	Introduce SpecialCaseList::isIn overload for GlobalAliases. Differential Revision: http://llvm-reviews.chandlerc.com/D1437 llvm-svn: 188688	2013-08-19 19:00:35 +00:00
Michael Kuperstein	9bedf7fc8c	Adds missing TLI check for library simplification of * pow(x, 0.5) -> fabs(sqrt(x)) * pow(2.0, x) -> exp2(x) llvm-svn: 188656	2013-08-19 06:55:47 +00:00
Peter Collingbourne	ea402298e2	Remove SpecialCaseList::findCategory. It turned out that I didn't need this for DFSan. llvm-svn: 188646	2013-08-19 00:24:20 +00:00
Joerg Sonnenberger	72a32889f4	PR 16899: Do not modify the basic block using the iterator, but keep the next value. This avoids crashes due to invalidation. Patch by Joey Gouly. llvm-svn: 188605	2013-08-17 11:04:47 +00:00
Jim Grosbach	933ecf8022	InstCombine: Use isAllOnesValue() instead of explicit -1. llvm-svn: 188563	2013-08-16 17:03:36 +00:00
Jim Grosbach	72387340f5	InstCombine: Simplify if(x!=0 && x!=-1). When both constants are positive or both constants are negative, InstCombine already simplifies comparisons like this, but when it's exactly zero and -1, the operand sorting ends up reversed and the pattern fails to match. Handle that special case. Follow up for rdar://14689217 llvm-svn: 188512	2013-08-16 00:15:20 +00:00
Matt Arsenault	9594ef019c	Don't do FoldCmpLoadFromIndexedGlobal for non inbounds GEPs This path wasn't tested before without a datalayout, so add some more tests and re-run with and without one. llvm-svn: 188507	2013-08-15 23:11:07 +00:00
Matt Arsenault	66eeeddb1d	Fix spelling llvm-svn: 188506	2013-08-15 23:11:03 +00:00
Yunzhong Gao	37d3ce60e8	Fixing a corner-case bug in strchr and strrchr lib call optimizations where the input character is not converted to char before comparing with zero. The patch was discussed in this thread: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130812/184069.html llvm-svn: 188489	2013-08-15 20:58:59 +00:00
Peter Collingbourne	25f0a1d209	DataFlowSanitizer: Add a debugging feature to help us track nonzero labels. Summary: When the -dfsan-debug-nonzero-labels parameter is supplied, the code is instrumented such that when a call parameter, return value or load produces a nonzero label, the function __dfsan_nonzero_label is called. The idea is that a debugger breakpoint can be set on this function in a nominally label-free program to help identify any bugs in the instrumentation pass causing labels to be introduced. Reviewers: eugenis CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D1405 llvm-svn: 188472	2013-08-15 18:51:12 +00:00
Mark Lacey	c3ab3bb4bc	Fix small typo: s/succ/Succ/ llvm-svn: 188415	2013-08-14 22:11:42 +00:00
Peter Collingbourne	8968732c46	DataFlowSanitizer: Instrumentation for memset. Differential Revision: http://llvm-reviews.chandlerc.com/D1395 llvm-svn: 188412	2013-08-14 20:51:38 +00:00
Peter Collingbourne	905e1efbe5	DataFlowSanitizer: greylist is now ABI list. This replaces the old incomplete greylist functionality with an ABI list, which can provide more detailed information about the ABI and semantics of specific functions. The pass treats every function in the "uninstrumented" category in the ABI list file as conforming to the "native" (i.e. unsanitized) ABI. Unless the ABI list contains additional categories for those functions, a call to one of those functions will produce a warning message, as the labelling behaviour of the function is unknown. The other supported categories are "functional", "discard" and "custom". - "discard" -- This function does not write to (user-accessible) memory, and its return value is unlabelled. - "functional" -- This function does not write to (user-accessible) memory, and the label of its return value is the union of the label of its arguments. - "custom" -- Instead of calling the function, a custom wrapper __dfsw_F is called, where F is the name of the function. This function may wrap the original function or provide its own implementation. Differential Revision: http://llvm-reviews.chandlerc.com/D1345 llvm-svn: 188402	2013-08-14 18:54:12 +00:00
Chandler Carruth	219c3d81d0	Fix a really terrifying but improbable bug in mem2reg. If you have seen extremely subtle miscompilations (such as a load getting replaced with the value stored below the load within a basic block) related to promoting an alloca to an SSA value, there is the dim possibility that you hit this. Please let me know if you won this unfortunate lottery. The first half of mem2reg's core logic (as it is used both in the standalone mem2reg pass and in SROA) builds up a mapping from 'Instruction ' to the index of that instruction within its basic block. This allows quickly establishing which store dominate a particular load even for large basic blocks. We cache this information throughout the run of mem2reg over a function in order to amortize the cost of computing it. This is not in and of itself a strange pattern in LLVM. However, it introduces a very important constraint: absolutely no instruction can be deleted from the program without updating the mapping. Otherwise a newly allocated instruction might get the same pointer address, and then end up with a wrong index. Yes, LLVM routinely suffers from a single threaded* variant of the ABA problem. Most places in LLVM don't find avoiding this an imposition because they don't both delete and create new instructions iteratively, but mem2reg loves to do this... All the time. Fortunately, the mem2reg code was really careful about updating this cache to handle this eventuallity... except when it comes to the debug declare intrinsic. Oops. The fix is to invalidate that pointer in the cache when we delete it, the same as we do when deleting alloca instructions and other instructions. I've also caused the same bug in new code while working on a fix to PR16867, so this seems to be a really unfortunate pattern. Hopefully in subsequent patches the deletion of dead instructions can be consolidated sufficiently to make it less likely that we'll see future occurences of this bug. Sorry for not having a test case, but I have literally no idea how to reliably trigger this kind of thing. It may be single-threaded, but it remains an ABA problem. It would require a really amazing number of stars to align. llvm-svn: 188367	2013-08-14 08:56:41 +00:00
Matt Arsenault	cb3b478d91	Fix always creating GEP with i32 indices Use the pointer size if datalayout is available. Use i64 if it's not, which is consistent with what other places do when the pointer size is unknown. The test doesn't really test this in a useful way since it will be transformed to that later anyway, but this now tests it for non-zero arrays and when datalayout isn't available. The cases in visitGetElementPtrInst should save an extra re-visit to the newly created GEP since it won't need to cleanup after itself. llvm-svn: 188339	2013-08-14 00:24:38 +00:00
Matt Arsenault	6d1daebfff	Use type helper functions instead of cast llvm-svn: 188338	2013-08-14 00:24:34 +00:00
Matt Arsenault	734103e561	Use array initializer, space around operator llvm-svn: 188337	2013-08-14 00:24:05 +00:00
Hal Finkel	fd36621506	BBVectorize: Add initial stores to the write set when tracking uses When computing the use set of a store, we need to add the store to the write set prior to iterating over later instructions. Otherwise, if there is a later aliasing load of that store, that load will not be tagged as a use, and bad things will happen. trackUsesOfI still adds later dependent stores of an instruction to that instruction's write set, but it never sees the original instruction, and so when tracking uses of a store, the store must be added to the write set by the caller. Fixes PR16834. llvm-svn: 188329	2013-08-13 23:34:32 +00:00
Nick Lewycky	eab287a60a	Revert r187191, which broke opt -mem2reg on the testcases included in PR16867. However, opt -O2 doesn't run mem2reg directly so nobody noticed until r188146 when SROA started sending more things directly down the PromoteMemToReg path. In order to revert r187191, I also revert dependent revisions r187296, r187322 and r188146. Fixes PR16867. Does not add the testcases from that PR, but both of them should get added for both mem2reg and sroa when this revert gets unreverted. llvm-svn: 188327	2013-08-13 22:51:58 +00:00
Dmitry Vyukov	fe91b1efa2	dfsan: fix lint warnings llvm-svn: 188293	2013-08-13 16:52:41 +00:00
Arnold Schwaighofer	dfaac373ee	Also remove logic in LateVectorize llvm-svn: 188285	2013-08-13 16:12:04 +00:00
Arnold Schwaighofer	406976609b	Remove logic that decides whether to vectorize or not depending on O-levels I have moved this logic into clang and opt. llvm-svn: 188281	2013-08-13 15:51:25 +00:00
Peter Collingbourne	723e9b89ec	Reapply r188119 now that the bug it exposed is fixed. llvm-svn: 188217	2013-08-12 22:38:43 +00:00
Peter Collingbourne	10cbe4a9bb	DataFlowSanitizer: fix a use-after-free. Spotted by libgmalloc. llvm-svn: 188216	2013-08-12 22:38:39 +00:00
Bill Wendling	f37b5b1f31	Move stack protector names to the same place. llvm-svn: 188198	2013-08-12 20:09:37 +00:00
Nadav Rotem	bc08e7ce84	Fix PR16797 - Support PHINodes with multiple inputs from the same basic block. Do not generate new vector values for the same entries because we know that the incoming values from the same block must be identical. llvm-svn: 188185	2013-08-12 17:46:44 +00:00
Alexey Samsonov	d27599b42b	Remove unused SpecialCaseList constructors llvm-svn: 188171	2013-08-12 11:50:44 +00:00
Alexey Samsonov	cdc0f339aa	Add SpecialCaseList::createOrDie() factory and use it in sanitizer passes llvm-svn: 188169	2013-08-12 11:46:09 +00:00
Alexey Samsonov	d27061d7e7	Introduce factory methods for SpecialCaseList Summary: Doing work in constructors is bad: this change suggests to call SpecialCaseList::create(Path, Error) instead of "new SpecialCaseList(Path)". Currently the latter may crash with report_fatal_error, which is undesirable - sometimes we want to report the error to user gracefully - for example, if he provides an incorrect file as an argument of Clang's -fsanitize-blacklist flag. Reviewers: pcc Reviewed By: pcc CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D1327 llvm-svn: 188156	2013-08-12 07:49:36 +00:00
Richard Sandiford	c01b885f5d	Fix big-endian handling of integer-to-vector bitcasts in InstCombine These functions used to assume that the lsb of an integer corresponds to vector element 0, whereas for big-endian it's the other way around: the msb is in the first element and the lsb is in the last element. Fixes MultiSource/Benchmarks/mediabench/gsm/toast for z. llvm-svn: 188155	2013-08-12 07:26:09 +00:00
Chandler Carruth	24ce5e9526	Re-instate r187323 which fast-tracks promotable allocas as soon as the SROA-based analysis has enough information. This should work now that both mem2reg and the SSAUpdater-based AllocaPromoter have been updated to be able to promote the types of allocas that the SROA analysis detects. I've included tests for the AllocaPromoter that were only possible to write once we fast-tracked promotable allocas without rewriting them. This includes a test both for r187347 and r188145. Original commit log for r187323: """ Now that mem2reg understands how to cope with a slightly wider set of uses of an alloca, we can pre-compute promotability while analyzing an alloca for splitting in SROA. That lets us short-circuit the common case of a bunch of trivially promotable allocas. This cuts 20% to 30% off the run time of SROA for typical frontend-generated IR sequneces I'm seeing. It gets the new SROA to within 20% of ScalarRepl for such code. My current benchmark for these numbers is PR15412, but it fits the general pattern of IR emitted by Clang so it should be widely applicable. """ llvm-svn: 188146	2013-08-11 02:17:11 +00:00
Chandler Carruth	bc4fdfb024	Finish fixing the SSAUpdater-based AllocaPromoter strategy in SROA to cope with the more general set of patterns that are now handled by mem2reg and that we can detect quickly while doing SROA's initial analysis. Notably, this allows it to promote through no-op bitcast and GEP sequences. A core part of the SSAUpdater approach is the ability to test whether a particular instruction is part of the set being promoted. Testing this becomes significantly more complex in the world where the operand to every load and store isn't the alloca itself. I ended up using the approach of walking up the def-chain until we find the alloca. I benchmarked this against keeping a set of pointer operands and keeping a set of the loads and stores we care about, and this one seemed faster although the difference was very small. No test case yet because currently the rewriting always "fixes" the inputs to not require this. The next patch which re-enables early promotion of easy cases in SROA will include a test case that specifically exercises this aspect of the alloca promoter. llvm-svn: 188145	2013-08-11 01:56:15 +00:00
Chandler Carruth	5b16f00b28	Reformat some bits of AllocaPromoter and simplify the name and type of our visiting datastructures in the AllocaPromoter/SSAUpdater path of SROA. Also shift the order if clears around to be more consistent. No functionality changed here, this is just a cleanup. llvm-svn: 188144	2013-08-11 01:03:18 +00:00
Arnold Schwaighofer	3525aff8f6	Revert r188119 "Kill some duplicated code for removing unreachable BBs." It is breaking builbots with libgmalloc enabled on Mac OS X. $ cd llvm ; mkdir release ; cd release $ ../configure --enable-optimized —prefix=$PWD/install $ make $ make check $ Release+Asserts/bin/llvm-lit -v --param use_gmalloc=1 --param \ gmalloc_path=/usr/lib/libgmalloc.dylib \ ../test/Instrumentation/DataFlowSanitizer/args-unreachable-bb.ll llvm-svn: 188142	2013-08-10 20:16:06 +00:00
Michael Gottesman	f005430fa0	[objc-arc] Track if we encountered an additive overflow while computing {TopDown,BottomUp}PathCounts and do nothing if it occurred. I fixed the aforementioned problems that came up on some of the linux boxes. Major thanks to Nick Lewycky for his help debugging! rdar://14590914 llvm-svn: 188122	2013-08-09 23:22:27 +00:00
Peter Collingbourne	0b56f9dd44	Kill some duplicated code for removing unreachable BBs. This moves removeUnreachableBlocksFromFn from SimplifyCFGPass.cpp to Utils/Local.cpp and uses it to replace the implementation of llvm::removeUnreachableBlocks, which appears to do a strict subset of what removeUnreachableBlocksFromFn does. Differential Revision: http://llvm-reviews.chandlerc.com/D1334 llvm-svn: 188119	2013-08-09 22:47:24 +00:00
Peter Collingbourne	df8eabc1f8	DataFlowSanitizer: Remove unreachable BBs so IR continues to verify under the args ABI. Differential Revision: http://llvm-reviews.chandlerc.com/D1316 llvm-svn: 188113	2013-08-09 21:42:53 +00:00
Jakub Staszak	d7c1eac478	Mark obviously const methods. Also use reference for parameters when possible. llvm-svn: 188103	2013-08-09 20:53:48 +00:00
Michael Gottesman	d0acc77647	Revert "[objc-arc] Track if we encountered an additive overflow while computing {TopDown,BottomUp}PathCounts and do nothing if it occured." This reverts commit r187941. The commit was passing on my os x box, but it is failing on some non-osx platforms. I do not have time to look into it now, so I am reverting and will recommit after I figure this out. llvm-svn: 187946	2013-08-08 00:41:18 +00:00
Peter Collingbourne	34bf946403	Fix ARM build. llvm-svn: 187944	2013-08-08 00:15:27 +00:00
Michael Gottesman	a8c38e944e	[objc-arc] Track if we encountered an additive overflow while computing {TopDown,BottomUp}PathCounts and do nothing if it occured. rdar://14590914 llvm-svn: 187941	2013-08-07 23:56:41 +00:00
Michael Gottesman	630805fc43	[objc-arc] Change 4 iterator methods which return const_iterators to be const methods. llvm-svn: 187940	2013-08-07 23:56:34 +00:00
Hal Finkel	bdc7aa32c1	Add ISD::FROUND for libm round() All libm floating-point rounding functions, except for round(), had their own ISD nodes. Recent PowerPC cores have an instruction for round(), and so here I'm adding ISD::FROUND so that round() can be custom lowered as well. For the most part, this is straightforward. I've added an intrinsic and a matching ISD node just like those for nearbyint() and friends. The SelectionDAG pattern I've named frnd (because ISD::FP_ROUND has already claimed fround). This will be used by the PowerPC backend in a follow-up commit. llvm-svn: 187926	2013-08-07 22:49:12 +00:00
Peter Collingbourne	efa9025062	DataFlowSanitizer; LLVM changes. DataFlowSanitizer is a generalised dynamic data flow analysis. Unlike other Sanitizer tools, this tool is not designed to detect a specific class of bugs on its own. Instead, it provides a generic dynamic data flow analysis framework to be used by clients to help detect application-specific issues within their own code. Differential Revision: http://llvm-reviews.chandlerc.com/D965 llvm-svn: 187923	2013-08-07 22:47:18 +00:00
Benjamin Kramer	fbbec483df	JumpThreading: Turn a select instruction into branching if it allows to thread one half of the select. This is a common pattern coming out of simplifycfg generating gross code. a: ; preds = %entry %sel = select i1 %cmp1, double %add, double 0.000000e+00 br label %b b: %cond5 = phi double [ %sel, %a ], [ %sub, %entry ] %cmp6 = fcmp oeq double %cond5, 0.000000e+00 br i1 %cmp6, label %if.then, label %if.end becomes a: br i1 %cmp1, label %b, label %if.then b: %cond5 = phi double [ %sub, %entry ], [ %add, %a ] %cmp6 = fcmp oeq double %cond5, 0.000000e+00 br i1 %cmp6, label %if.then, label %if.end Skipping block b completely if possible. llvm-svn: 187880	2013-08-07 10:29:38 +00:00
Bill Wendling	1998a223f6	Change the linkage of these global values to 'internal'. The globals being generated here were given the 'private' linkage type. However, this caused them to end up in different sections with the wrong prefix. E.g., they would be in the __TEXT,__const section with an 'L' prefix instead of an 'l' (lowercase ell) prefix. The problem is that the linker will eat a literal label with 'L'. If a weak symbol is then placed into the __TEXT,__const section near that literal, then it cannot distinguish between the literal and the weak symbol. Part of the problems here was introduced because the address sanitizer converted some C strings into constant initializers with trailing nuls. (Thus putting them in the __const section with the wrong prefix.) The others were variables that the address sanitizer created but simply had the wrong linkage type. llvm-svn: 187827	2013-08-06 22:52:42 +00:00
Arnold Schwaighofer	af6776a17b	LoopVectorize: Allow vectorization of loops with lifetime markers Patch by Marc Jessome! llvm-svn: 187825	2013-08-06 22:37:52 +00:00
Jakub Staszak	06543ea089	Adjust file to the coding standard. llvm-svn: 187808	2013-08-06 17:03:42 +00:00
Serge Pavlov	a57ba3eab8	Unbreak Debug build on Windows llvm-svn: 187786	2013-08-06 08:44:18 +00:00
Tom Stellard	e4e3be6f50	Factor FlattenCFG out from SimplifyCFG Patch by: Mei Ye llvm-svn: 187764	2013-08-06 02:43:45 +00:00
Matt Arsenault	de2f38a2db	Fix missing -- C++ --s llvm-svn: 187758	2013-08-06 00:16:21 +00:00
Peter Collingbourne	42b450c977	Introduce an optimisation for special case lists with large numbers of literal entries. Our internal regex implementation does not cope with large numbers of anchors very efficiently. Given a ~3600-entry special case list, regex compilation can take on the order of seconds. This patch solves the problem for the special case of patterns matching literal global names (i.e. patterns with no regex metacharacters). Rather than forming regexes from literal global name patterns, add them to a StringSet which is checked before matching against the regex. This reduces regex compilation time by an order of roughly thousands when reading the aforementioned special case list, according to a completely unscientific study. No test cases. I figure that any new tests for this code should check that regex metacharacters are properly recognised. However, I could not find any documentation which documents the fact that the syntax of global names in special case lists is based on regexes. The extent to which regex syntax is supported in special case lists should probably be decided on/documented before writing tests. Differential Revision: http://llvm-reviews.chandlerc.com/D1150 llvm-svn: 187732	2013-08-05 17:48:04 +00:00
Alexey Samsonov	186358278d	80-cols llvm-svn: 187725	2013-08-05 13:19:49 +00:00
Nadav Rotem	eef986f7a3	SLPVectorizer: Fix PR16777. PHInodes may use multiple extracted values that come from different blocks. Thanks Alexey Samsonov. llvm-svn: 187663	2013-08-02 18:40:24 +00:00
Alexey Samsonov	4124641a22	Fix dereferencing end iterator in SimplifyCFG. Patch by Ye Mei. llvm-svn: 187646	2013-08-02 08:06:43 +00:00
Matt Arsenault	f9b8fbb891	Teach getOrEnforceKnownAlignment about address spaces llvm-svn: 187629	2013-08-01 22:42:18 +00:00
Nadav Rotem	90a6933b7c	Move the optlevel check to the frontend. llvm-svn: 187628	2013-08-01 22:41:58 +00:00
Nadav Rotem	c61cb93837	Only enable SLP-vectorization on O3 builds. llvm-svn: 187595	2013-08-01 18:28:15 +00:00
Nadav Rotem	403f78810d	80-col llvm-svn: 187535	2013-07-31 22:17:45 +00:00
Owen Anderson	43c2c4c5e4	Preserve fast-math flags when folding (fsub x, (fneg y)) to (fadd x, y). llvm-svn: 187462	2013-07-30 23:53:17 +00:00
Matt Arsenault	c825ddd4ca	Change behavior of calling bitcasted alias functions. It will now only convert the arguments / return value and call the underlying function if the types are able to be bitcasted. This avoids using fp<->int conversions that would occur before. llvm-svn: 187444	2013-07-30 20:45:05 +00:00
Nadav Rotem	fc24bbce9c	SLPVectorier: update the debug location for the new instructions. llvm-svn: 187363	2013-07-29 18:18:46 +00:00
Chandler Carruth	4093b4bbce	Teach the AllocaPromoter which is wrapped around the SSAUpdater infrastructure to do promotion without a domtree the same smarts about looking through GEPs, bitcasts, etc., that I just taught mem2reg about. This way, if SROA chooses to promote an alloca which still has some noisy instructions this code can cope with them. I've not used as principled of an approach here for two reasons: 1) This code doesn't really need it as we were already set up to zip through the instructions used by the alloca. 2) I view the code here as more of a hack, and hopefully a temporary one. The SSAUpdater path in SROA is a real sore point for me. It doesn't make a lot of architectural sense for many reasons: - We're likely to end up needing the domtree anyways in a subsequent pass, so why not compute it earlier and use it. - In the future we'll likely end up needing the domtree for parts of the inliner itself. - If we need to we could teach the inliner to preserve the domtree. Part of the re-work of the pass manager will allow this to be very powerful even in large SCCs with many functions. - Ultimately, computing a domtree has gotten significantly faster since the original SSAUpdater-using code went into ScalarRepl. We no longer use domfrontiers, and much of domtree is lazily done based on queries rather than eagerly. - At this point keeping the SSAUpdater-based promotion saves a total of 0.7% on a build of the 'opt' tool for me. That's not a lot of performance given the complexity! So I'm leaving this a bit ugly in the hope that eventually we just remove all of this nonsense. I can't even readily test this because this code isn't reachable except through SROA. When I re-instate the patch that fast-tracks allocas already suitable for promotion, I'll add a testcase there that failed before this change. Before that, SROA will fix any test case I give it. llvm-svn: 187347	2013-07-29 09:06:53 +00:00
Nadav Rotem	931f83b2ef	Don't vectorize when the attribute NoImplicitFloat is used. llvm-svn: 187340	2013-07-29 05:13:00 +00:00
Rafael Espindola	ca86cefe79	Fix -Wdocumentation warnings. llvm-svn: 187336	2013-07-28 23:43:28 +00:00
Chandler Carruth	379281ffcc	Update comments for SSAUpdater to use the modern doxygen comment standards for LLVM. Remove duplicated comments on the interface from the implementation file (implementation comments are left there of course). Also clean up, re-word, and fix a few typos and errors in the commenst spotted along the way. This is in preparation for changes to these files and to keep the uninteresting tidying in a separate commit. llvm-svn: 187335	2013-07-28 22:00:33 +00:00
Chandler Carruth	cfd2dee91f	Temporarily revert r187323 until I update SSAUpdater to match mem2reg. I forgot that we had two totally independent things here. :: sigh :: llvm-svn: 187327	2013-07-28 09:05:49 +00:00
Chandler Carruth	d3834e5bef	Now that mem2reg understands how to cope with a slightly wider set of uses of an alloca, we can pre-compute promotability while analyzing an alloca for splitting in SROA. That lets us short-circuit the common case of a bunch of trivially promotable allocas. This cuts 20% to 30% off the run time of SROA for typical frontend-generated IR sequneces I'm seeing. It gets the new SROA to within 20% of ScalarRepl for such code. My current benchmark for these numbers is PR15412, but it fits the general pattern of IR emitted by Clang so it should be widely applicable. llvm-svn: 187323	2013-07-28 08:27:12 +00:00
Chandler Carruth	ca3736c604	Thread DataLayout through the callers and into mem2reg. This will be useful in a subsequent patch, but causes an unfortunate amount of noise, so I pulled it out into a separate patch. llvm-svn: 187322	2013-07-28 06:43:11 +00:00
Nadav Rotem	b6c365bd70	Update the comment llvm-svn: 187316	2013-07-27 23:28:47 +00:00
Chandler Carruth	b3e0150179	Don't use all the #ifdefs to hide the stats counters and instead rely on their being optimized out in debug mode. Realistically, this just isn't going to be the slow part anyways. This also fixes unused variable warnings that are breaking LLD build bots. =/ I didn't see these at first, and kept losing track of the fact that they were broken. llvm-svn: 187297	2013-07-27 10:17:49 +00:00
Chandler Carruth	b4731e3c73	Merge the removal of dead instructions and lifetime markers with the analysis of the alloca. We don't need to visit all the users twice for this. We build up a kill list during the analysis and then just process it afterward. This recovers the tiny bit of performance lost by moving to the visitor based analysis system as it removes one entire use-list walk from mem2reg. In some cases, this is now faster than mem2reg was previously. llvm-svn: 187296	2013-07-27 09:43:30 +00:00
Nick Lewycky	3d5df03884	Reimplement isPotentiallyReachable to make nocapture deduction much stronger. Adds unit tests for it too. Split BasicBlockUtils into an analysis-half and a transforms-half, and put the analysis bits into a new Analysis/CFG.{h,cpp}. Promote isPotentiallyReachable into llvm::isPotentiallyReachable and move it into Analysis/CFG. llvm-svn: 187283	2013-07-27 01:24:00 +00:00
Tom Stellard	8e98bf332b	SimplifyCFG: Use parallel-and and parallel-or mode to consolidate branch conditions Merge consecutive if-regions if they contain identical statements. Both transformations reduce number of branches. The transformation is guarded by a target-hook, and is currently enabled only for +R600, but the correctness has been tested on X86 target using a variety of CPU benchmarks. Patch by: Mei Ye llvm-svn: 187278	2013-07-27 00:01:07 +00:00
Nadav Rotem	9bebda1517	SLP Vectorier: Don't vectorize really short chains because they are already handled by the SelectionDAG store-vectorizer, which does a better job in deciding when to vectorize. llvm-svn: 187267	2013-07-26 23:07:55 +00:00
Nadav Rotem	965db2159b	SLP Vectorizer: Disable the vectorization of non power of two chains, such as <3 x float>, because we dont have a good cost model for these types. llvm-svn: 187265	2013-07-26 22:53:11 +00:00
Owen Anderson	e7a1434407	Fix variable name. llvm-svn: 187253	2013-07-26 22:06:21 +00:00
Owen Anderson	841856eb0b	When InstCombine tries to fold away (fsub x, (fneg y)) into (fadd x, y), it is also worthwhile for it to look through FP extensions and truncations, whose application commutes with fneg. llvm-svn: 187249	2013-07-26 21:40:29 +00:00
Stephen Lin	c7791e3e9b	Correct case of m_UIToFp to m_UIToFP to match instruction name, add m_SIToFP for consistency. llvm-svn: 187225	2013-07-26 17:55:00 +00:00
Chandler Carruth	4a66ec34bd	Re-implement the analysis of uses in mem2reg to be significantly more robust. It now uses an InstVisitor and worklist to actually walk the uses of the Alloca transitively and detect the pattern which we can directly promote: loads & stores of the whole alloca and instructions we can completely ignore. Also, with this new implementation teach both the predicate for testing whether we can promote and the promotion engine itself to use the same code so we no longer have strange divergence between the two code paths. I've added some silly test cases to demonstrate that we can handle slightly more degenerate code patterns now. See the below for why this is even interesting. Performance impact: roughly 1% regression in the performance of SROA or ScalarRepl on a large C++-ish test case where most of the allocas are basically ready for promotion. The reason is because of silly redundant work that I've left FIXMEs for and which I'll address in the next commit. I wanted to separate this commit as it changes the behavior. Once the redundant work in removing the dead uses of the alloca is fixed, this code appears to be faster than the old version. =] So why is this useful? Because the previous requirement for promotion required a specific visit pattern of the uses of the alloca to verify: we had to look for no more than 1 intervening use. The end goal is to have SROA automatically detect when an alloca is already promotable and directly hand it to the mem2reg machinery rather than trying to partition and rewrite it. This is a 25% or more performance improvement for SROA, and a significant chunk of the delta between it and ScalarRepl. To get there, we need to make mem2reg actually capable of promoting allocas which look promotable to SROA without have SROA do tons of work to massage the code into just the right form. This is actually the tip of the iceberg. There are tremendous potential savings we can realize here by de-duplicating work between mem2reg and SROA. llvm-svn: 187191	2013-07-26 08:20:39 +00:00
Bill Schmidt	88e45cc177	[PowerPC] Support powerpc64le as a syntax-checking target. This patch provides basic support for powerpc64le as an LLVM target. However, use of this target will not actually generate little-endian code. Instead, use of the target will cause the correct little-endian built-in defines to be generated, so that code that tests for __LITTLE_ENDIAN__, for example, will be correctly parsed for syntax-only testing. Code generation will otherwise be the same as powerpc64 (big-endian), for now. The patch leaves open the possibility of creating a little-endian PowerPC64 back end, but there is no immediate intent to create such a thing. The LLVM portions of this patch simply add ppc64le coverage everywhere that ppc64 coverage currently exists. There is nothing of any import worth testing until such time as little-endian code generation is implemented. In the corresponding Clang patch, there is a new test case variant to ensure that correct built-in defines for little-endian code are generated. llvm-svn: 187179	2013-07-26 01:35:43 +00:00
Rafael Espindola	ad8b23acd3	Respect llvm.used in Internalize. The language reference says that: "If a symbol appears in the @llvm.used list, then the compiler, assembler, and linker are required to treat the symbol as if there is a reference to the symbol that it cannot see" Since even the linker cannot see the reference, we must assume that the reference can be using the symbol table. For example, a user can add __attribute__((used)) to a debug helper function like dump and use it from a debugger. llvm-svn: 187103	2013-07-25 03:23:25 +00:00
Nick Lewycky	a0237867ad	Check that TD isn't NULL before dereferencing it down this path. llvm-svn: 187099	2013-07-25 02:55:14 +00:00
Rafael Espindola	d4a363d0de	Make these methods const correct. Thanks to Nick Lewycky for noticing it. llvm-svn: 187098	2013-07-25 02:50:08 +00:00
Benjamin Kramer	936fe15ecf	TRE: Move class into anonymous namespace. While there shrink a dangerously large SmallPtrSet. llvm-svn: 187050	2013-07-24 16:12:08 +00:00
Chandler Carruth	97ac149a21	Fix a problem I introduced in r187029 where we would over-eagerly schedule an alloca for another iteration in SROA. This only showed up with a mixture of promotable and unpromotable selects and phis. Added a test case for this. llvm-svn: 187031	2013-07-24 12:12:17 +00:00
Chandler Carruth	b086f93d75	Fix PR16687 where we were incorrectly promoting an alloca that had pending speculation for a phi node. The problem here is that we were using growth of the specluation set as an indicator of whether speculation would occur, and if the phi node is already in the set we don't see it grow. This is a symptom of the fact that this signal is a total hack. Unfortunately, I couldn't really come up with a non-hacky way of signaling that promotion remains valid after speculation occurs, such that we only speculate when all else looks good for promotion. In the end, I went with at least a much more explicit approach of doing the work of queuing inside the phi and select processing and setting a preposterously named flag to convey that we're in the special state of requiring speculating before promotion. Thanks to Richard Trieu and Nick Lewycky for the excellent work reducing a testcase for this from a pretty giant, nasty assert in a big application. =] The testcase was excellent. llvm-svn: 187029	2013-07-24 09:47:28 +00:00
Matt Arsenault	87d07d2f4c	Fix spelling llvm-svn: 186997	2013-07-23 22:20:57 +00:00
Nick Lewycky	80c6b2a977	Remove extraneous null statement. No functionality change! llvm-svn: 186893	2013-07-22 23:38:27 +00:00
Jakub Staszak	ad1f6af5a0	Use switch instead of if. No functionality change. llvm-svn: 186892	2013-07-22 23:38:16 +00:00
Jakub Staszak	cb88203090	Remove trailing spaces. llvm-svn: 186890	2013-07-22 23:16:36 +00:00
Nadav Rotem	383f980dca	When we vectorize across multiple basic blocks we may vectorize PHINodes that create a cycle. We already break the cycle on phi-nodes, but arithmetic operations are still uplicated. This patch adds code that checks if the operation that we are vectorizing was vectorized during the visit of the operands and uses this value if it can. llvm-svn: 186883	2013-07-22 22:18:07 +00:00
Jakub Staszak	9e648c8d0e	OldPtr is llvm::Instruction. Remove unneeded cast<>. llvm-svn: 186880	2013-07-22 22:10:43 +00:00
Jakub Staszak	11159a5a91	Change tabs to spaces. llvm-svn: 186877	2013-07-22 21:11:30 +00:00
Matt Arsenault	f4ec7f5975	Fix spelling and grammar llvm-svn: 186858	2013-07-22 18:59:58 +00:00
Nadav Rotem	53033f1204	Fix an obvious typo in the loop vectorizer where the cost model uses the wrong variable. The variable BlockCost is ignored. We don't have tests for the effect of if-conversion loops because it requires a big test (that includes if-converted loops) and it is difficult to find and balance a loop to do the right thing. llvm-svn: 186845	2013-07-22 17:10:48 +00:00
Nadav Rotem	c15d196f83	Delete unused helper functions. llvm-svn: 186808	2013-07-22 05:19:22 +00:00
Benjamin Kramer	02e86e1f3f	mem2reg: Minor STL usage cleanup. No functionality change. llvm-svn: 186790	2013-07-21 11:03:40 +00:00
Chandler Carruth	a549ad0c21	Make the mem2reg interface use an ArrayRef as it keeps a copy of these to iterate over. llvm-svn: 186788	2013-07-21 08:37:58 +00:00
Nadav Rotem	cb987ae696	Revert a part of r186420. Don't forbid multiple store chains that merge. llvm-svn: 186786	2013-07-21 06:12:57 +00:00
Chandler Carruth	d4299dfcf7	Hoist the rest of the logic for promoting single-store allocas into the helper function. This leaves both trivial cases handled entirely in helper functions and merely manages the list of allocas to process in the run method. The next step will be to handle all of the trivial promotion work prior to even creating the core class and the subsequent simplifications that enables. llvm-svn: 186784	2013-07-21 01:52:33 +00:00
Chandler Carruth	ee3da824df	Hoist the rest of the logic for fully promoting allocas with all uses in a single block into the helper routine. This takes advantage of the fact that we can directly replace uses prior to any store with undef to simplify matters and unconditionally promote allocas only used within one block. I've removed the special handling for the case of no stores existing. This has no semantic effect but might slow things down. I'll fix that in a later patch when I refactor this entire thing to be easier to manage the different cases. llvm-svn: 186783	2013-07-21 01:44:07 +00:00
Chandler Carruth	649189782e	Remove a method made dead by the prior refactoring. llvm-svn: 186782	2013-07-21 00:01:34 +00:00
Chandler Carruth	1522f7782e	Hoist the two trivial promotion routines out of the big class that handles the general cases. The hope is to refactor this so that we don't end up building the entire class for the trivial cases. I also want to lift a lot of the early pre-processing in the initial segment of run() into a separate routine, and really none of it needs to happen inside the primary promotion class. These routines in particular used none of the actual state in the promotion class, so they don't really make sense as members. llvm-svn: 186781	2013-07-20 23:59:51 +00:00
Chandler Carruth	4f122f4bf7	Hoist the AllocaInfo struct to the top of the file. This struct is nicely independent of everything else, and we already needed a foward declaration here. It's simpler to just define it immediately. llvm-svn: 186780	2013-07-20 23:39:26 +00:00
Chandler Carruth	9056374adc	Sink a typedef and comparator down to the function that actually uses them. llvm-svn: 186779	2013-07-20 23:36:19 +00:00
Rafael Espindola	b64a8ded32	Don't crash when llvm.compiler.used becomes empty. GlobalOpt simplifies llvm.compiler.used by removing any members that are also in the more strict llvm.used. Handle the special case where llvm.compiler.used becomes empty. llvm-svn: 186778	2013-07-20 23:33:15 +00:00
Chandler Carruth	625842576e	Don't allocate the DIBuilder on the heap and remove all the complexity that ensued from that. llvm-svn: 186777	2013-07-20 23:33:06 +00:00
Chandler Carruth	8a5b9d1ecd	Rename constructor parameters to follow the common member-shadowing pattern and conform to the naming conventions. llvm-svn: 186776	2013-07-20 23:23:47 +00:00
Chandler Carruth	b93a8119e3	Reformat the implementation of mem2reg with clang-format so that my subsequent changes don't introduce inconsistencies. llvm-svn: 186775	2013-07-20 23:20:08 +00:00
Chandler Carruth	82ebea7f15	Remove a DenseMapInfo specialization for std::pair -- we have one of those baked into DenseMap now. llvm-svn: 186773	2013-07-20 23:09:05 +00:00
Chandler Carruth	cf50f83dfd	Update mem2reg's comments to conform to the new doxygen standards. No functionality changed. llvm-svn: 186772	2013-07-20 22:20:05 +00:00
Benjamin Kramer	78617fc38a	SROA: Microoptimization: Remove dead entries first, then sort. While there replace an explicit struct with std::mem_fun. llvm-svn: 186761	2013-07-20 08:38:34 +00:00
Stephen Lin	5a583ae2f7	InstCombine: call FoldOpIntoSelect for all floating binops, not just fmul llvm-svn: 186759	2013-07-20 07:13:13 +00:00
Nadav Rotem	85fcd9dde0	fix an 80-col line. llvm-svn: 186733	2013-07-19 23:14:01 +00:00
Nadav Rotem	43e3cf61bf	Use LLVMs ADTs that improve the compile time of this pass. llvm-svn: 186732	2013-07-19 23:12:19 +00:00
Nadav Rotem	aead8bb4d6	SLPVectorizer: Improve the compile time of isConsecutive by reordering the conditions that check GEPs and eliminate two of the calls to accumulateConstantOffset. llvm-svn: 186731	2013-07-19 23:11:15 +00:00
Rafael Espindola	6178a47aa5	s/compiler_used/compiler.used/. We were incorrectly using compiler_used instead of compiler.used. Unfortunately the passes using the broken name had tests also using the broken name. llvm-svn: 186705	2013-07-19 18:44:51 +00:00
Chandler Carruth	684f6afef6	Cleanup the stats counters for the new implementation. These actually count the right things and have the right names. llvm-svn: 186667	2013-07-19 10:57:36 +00:00
Chandler Carruth	5d65964bbf	Fix another assert failure very similar to PR16651's test case. This test case came from Benjamin and found the parallel bug in the vector promotion code. llvm-svn: 186666	2013-07-19 10:57:32 +00:00
Chandler Carruth	47c6759da6	Try to move to a more reasonable set of naming conventions given the new implementation of the SROA algorithm. We were using the term 'partition' in many places that no longer ever represented an actual partition, but rather just an arbitrary slice of an alloca. No functionality change intended here. Mostly just renaming of types, functions, variables, and rewording of comments. Several comments were rewritten to make a lot more sense in the new structure of things. The stats are still weird and not reflective of how this really works. I'll fix those up in a separate patch as it is a touch more semantic of a change... llvm-svn: 186659	2013-07-19 09:13:58 +00:00
Chandler Carruth	51ced11831	A long overdue cleanup in SROA to use 'DL' instead of 'TD' for the DataLayout variables. llvm-svn: 186656	2013-07-19 07:21:28 +00:00
Chandler Carruth	dc65e5d0aa	Fix PR16651, an assert introduced in my recent re-work of the innards of SROA. The crux of the issue is that now we track uses of a partition of the alloca in two places: the iterators over the partitioning uses and the previously collected split uses vector. We weren't accounting for the fact that the split uses might invalidate integer widening in ways other than due to their width (in this case due to being volatile). Further reduced testcase added to the tests. llvm-svn: 186655	2013-07-19 07:12:23 +00:00
Eric Christopher	172dc14902	Remove DIBuilder cache of variable TheCU and change the few uses that wanted it. Also change the interface for createCompileUnit to compensate. Fix comments that refer to TheCU as well. llvm-svn: 186637	2013-07-19 00:51:47 +00:00
Nick Lewycky	b9c6641ca8	Clean up some of this code a tiny bit, no functionality change. llvm-svn: 186622	2013-07-18 22:32:32 +00:00
Eric Christopher	1a3914a23a	Revert "Remove DIBuilder cache of variable TheCU and change the few" This reverts commit r186599 as I didn't want to commit this yet. llvm-svn: 186601	2013-07-18 19:13:06 +00:00
Eric Christopher	9b80723f96	Remove DIBuilder cache of variable TheCU and change the few uses that wanted it. Also change the interface for createCompileUnit to compensate. Fix comments that refer to TheCU as well. llvm-svn: 186599	2013-07-18 19:11:29 +00:00
Nadav Rotem	d9b6f69b94	Handle constants without going through SCEV. llvm-svn: 186593	2013-07-18 18:34:21 +00:00
Nadav Rotem	c47d289b77	SLPVectorizer: Speedup isConsecutive by manually checking GEPs with multiple indices. This brings the compile time of the SLP-Vectorizer to about 2.5% of OPT for my testcase. llvm-svn: 186592	2013-07-18 18:20:45 +00:00
Chandler Carruth	b59978aea3	Reapply r186316 with a fix for one bug where the code could walk off the end of a vector. This was found with ASan. I've had one other report of a crasher, but thus far been unable to reproduce the crash. It may well be fixed with this version, and if not I'd like to get more information from the build bots about what is happening. See r186316 for the full commit log for the new implementation of the SROA algorithm. llvm-svn: 186565	2013-07-18 07:15:00 +00:00
Nadav Rotem	440c849dd9	SLPVectorizer: Speedup isConsecutive (that checks if two addresses are consecutive in memory) by checking for additional patterns that don't need to go through SCEV. llvm-svn: 186563	2013-07-18 04:33:20 +00:00
Eric Christopher	8b4b95131b	Add comparison operators for DIDescriptors to fix c++98 fallout of operator bool change. Also convert a variable in DebugIR. llvm-svn: 186544	2013-07-17 23:25:22 +00:00
Nadav Rotem	22f3c7e5fb	Fix a comment. llvm-svn: 186541	2013-07-17 22:41:16 +00:00
Stephen Lin	a7fd9682cd	Restore r181216, which was partially reverted in r182499. llvm-svn: 186533	2013-07-17 20:06:03 +00:00
Nadav Rotem	990f24a7d2	Add a micro optimization to catch cases where the PtrA equals PtrB. llvm-svn: 186531	2013-07-17 19:52:25 +00:00
Hal Finkel	cae8df776e	Fix comparisons of alloca alignment in inliner merging Duncan pointed out a mistake in my fix in r186425 when only one of the allocas being compared had the target-default alignment. This is essentially his suggested solution. Thanks! llvm-svn: 186510	2013-07-17 14:32:41 +00:00
Craig Topper	f263d3bfb8	Mark a method 'const' and another 'static'. llvm-svn: 186485	2013-07-17 03:54:53 +00:00
Craig Topper	2bf5d5a2f0	Make a few more static string pointers constant. llvm-svn: 186484	2013-07-17 03:43:10 +00:00
Nadav Rotem	ae8b6de415	SLPVectorizer: Accelerate the isConsecutive check by replacing the subtraction of the two values with a simple SCEV expression that adds the offset to one of the pointers that we compare. llvm-svn: 186479	2013-07-17 00:48:31 +00:00
Nadav Rotem	adfe58a7ad	flip the scev minus direction to simplify the code. llvm-svn: 186466	2013-07-16 22:57:06 +00:00
Nadav Rotem	633bd23118	SLPVectorizer: Improve the compile time of isConsecutive by adding a simple constant-gep check before using SCEV. This check does not always work because not all of the GEPs use a constant offset, but it happens often enough to reduce the number of times we use SCEV. llvm-svn: 186465	2013-07-16 22:51:07 +00:00
Rafael Espindola	2a9326a78f	Add a wrapper for open. This centralizes the handling of O_BINARY and opens the way for hiding more differences (like how open behaves with directories). llvm-svn: 186447	2013-07-16 19:44:17 +00:00
Peter Collingbourne	82d932d6c2	Make SpecialCaseList match full strings, as documented, using anchors. Differential Revision: http://llvm-reviews.chandlerc.com/D1149 llvm-svn: 186431	2013-07-16 17:56:07 +00:00
Hal Finkel	35292d605d	When the inliner merges allocas, it must keep the larger alignment For safety, the inliner cannot decrease the allignment on an alloca when merging it with another. I've included two variants of the test case for this: one with DataLayout available, and one without. When DataLayout is not available, if only one of the allocas uses the default alignment (getAlignment() == 0), then they cannot be safely merged. llvm-svn: 186425	2013-07-16 17:10:55 +00:00
Nadav Rotem	ebe95f88ed	SLPVectorizer: Reduce the compile time of the consecutive store lookup. Process groups of stores in chunks of 16. llvm-svn: 186420	2013-07-16 15:25:17 +00:00
Craig Topper	b8260534f6	Add 'const' qualifiers to static const char* variables. llvm-svn: 186371	2013-07-16 01:17:10 +00:00
Nadav Rotem	f1e0aebca1	PR16628: Fix a bug in the code that merges compares. Compares return i1 but they compare different types. llvm-svn: 186359	2013-07-15 22:52:48 +00:00
Stephen Lin	b764d0591b	Remove trailing whitespace llvm-svn: 186333	2013-07-15 17:55:02 +00:00
Chandler Carruth	255d06c032	Revert r186316 while I track down an ASan failure and an assert from a bot. This reverts the commit which introduced a new implementation of the fancy SROA pass designed to reduce its overhead. I'll skip the huge commit log here, refer to r186316 if you're looking for how this all works and why it works that way. llvm-svn: 186332	2013-07-15 17:36:21 +00:00
Chandler Carruth	05be8f3230	Reimplement SROA yet again. Same fundamental principle, but a totally different core implementation strategy. Previously, SROA would build a relatively elaborate partitioning of an alloca, associate uses with each partition, and then rewrite the uses of each partition in an attempt to break apart the alloca into chunks that could be promoted. This was very wasteful in terms of memory and compile time because regardless of how complex the alloca or how much we're able to do in breaking it up, all of the datastructure work to analyze the partitioning was done up front. The new implementation attempts to form partitions of the alloca lazily and on the fly, rewriting the uses that make up that partition as it goes. This has a few significant effects: 1) Much simpler data structures are used throughout. 2) No more double walk of the recursive use graph of the alloca, only walk it once. 3) No more complex algorithms for associating a particular use with a particular partition. 4) PHI and Select speculation is simplified and happens lazily. 5) More precise information is available about a specific use of the alloca, removing the need for some side datastructures. Ultimately, I think this is a much better implementation. It removes about 300 lines of code, but arguably removes more like 500 considering that some code grew in the process of being factored apart and cleaned up for this all to work. I've re-used as much of the old implementation as possible, which includes the lion's share of code in the form of the rewriting logic. The interesting new logic centers around how the uses of a partition are sorted, and split into actual partitions. Each instruction using a pointer derived from the alloca gets a 'Partition' entry. This name is totally wrong, but I'll do a rename in a follow-up commit as there is already enough churn here. The entry describes the offset range accessed and the nature of the access. Once we have all of these entries we sort them in a very specific way: increasing order of begin offset, followed by whether they are splittable uses (memcpy, etc), followed by the end offset or whatever. Sorting by splittability is important as it simplifies the collection of uses into a partition. Once we have these uses sorted, we walk from the beginning to the end building up a range of uses that form a partition of the alloca. Overlapping unsplittable uses are merged into a single partition while splittable uses are broken apart and carried from one partition to the next. A partition is also introduced to bridge splittable uses between the unsplittable regions when necessary. I've looked at the performance PRs fairly closely. PR15471 no longer will even load (the module is invalid). Not sure what is up there. PR15412 improves by between 5% and 10%, however it is nearly impossible to know what is holding it up as SROA (the entire pass) takes less time than reading the IR for that test case. The analysis takes the same time as running mem2reg on the final allocas. I suspect (without much evidence) that the new implementation will scale much better however, and it is just the small nature of the test cases that makes the changes small and noisy. Either way, it is still simpler and cleaner I think. llvm-svn: 186316	2013-07-15 10:30:19 +00:00
Craig Topper	642bfedd2e	Add 'const' qualifier to some arrays. llvm-svn: 186312	2013-07-15 08:02:13 +00:00
Craig Topper	4e9457fd7d	Use llvm::array_lengthof to replace sizeof(array)/sizeof(array[0]). llvm-svn: 186301	2013-07-15 04:27:47 +00:00
Nadav Rotem	37c18f5cda	SLPVectorizer: change the order in which we search for vectorization candidates. Do stores first and PHIs second. llvm-svn: 186277	2013-07-14 06:15:46 +00:00
Craig Topper	58fa7a9b4a	Use SmallVectorImpl& instead of SmallVector to avoid repeating small vector size. llvm-svn: 186274	2013-07-14 04:42:23 +00:00
Arnold Schwaighofer	970f54281c	LoopVectorizer: Disallow reductions whose header phi is used outside the loop If an outside loop user of the reduction value uses the header phi node we cannot just reduce the vectorized phi value in the vector code epilog because we would loose VF-1 reductions. lp: p = phi (0, lv) lv = lv + 1 ... brcond , lp, outside outside: usr = add 0, p (Say the loop iterates two times, the value of p coming out of the loop is one). We cannot just transform this to: vlp: p = phi (<0,0>, lv) lv = lv + <1,1> .. brcond , lp, outside outside: p_reduced = p[0] + [1]; usr = add 0, p_reduced (Because the original loop iterated two times the vectorized loop would iterate one time, but p_reduced ends up being zero instead of one). We would have to execute VF-1 iterations in the scalar remainder loop in such cases. For now, just disable vectorization. PR16522 llvm-svn: 186256	2013-07-13 19:09:29 +00:00
Andrew Trick	651c624842	LoopVectorize fix: LoopInfo must be valid when invoking utils like SCEVExpander. In general, one should always complete CFG modifications first, update CFG-based analyses, like Dominatores and LoopInfo, then generate instruction sequences. LoopVectorizer was creating a new loop, calling SCEVExpander to generate checks, then updating LoopInfo. I just changed the order. llvm-svn: 186241	2013-07-13 06:20:06 +00:00
Nick Lewycky	1897fc3733	Add a microoptimization for urem. llvm-svn: 186235	2013-07-13 01:16:47 +00:00
Joey Gouly	04b7300444	Fix a crash in EvaluateInDifferentElementOrder where it would generate an undef vector of the wrong type. LGTM'd by Nick Lewycky on IRC. llvm-svn: 186224	2013-07-12 23:08:06 +00:00
Andrew Trick	b79ae09045	LFTR improvement to avoid truncation. This is a reimplemntation of the patch originally in r186107. llvm-svn: 186215	2013-07-12 22:08:48 +00:00
Andrew Trick	064fb30cab	Cleanup LFTR logic. llvm-svn: 186214	2013-07-12 22:08:44 +00:00
Andrew Trick	aacd582de2	Cleanup: rename a variable to make the logic easier to follow. llvm-svn: 186213	2013-07-12 22:08:41 +00:00
Arnold Schwaighofer	b9c37551bc	TargetTransformInfo: address calculation parameter for gather/scather Address calculation for gather/scather in vectorized code can incur a significant cost making vectorization unbeneficial. Add infrastructure to add cost. Tests and cost model for targets will be in follow-up commits. radar://14351991 llvm-svn: 186187	2013-07-12 19:16:02 +00:00
Chandler Carruth	4163f3fe85	Revert "indvars: Improve LFTR by eliminating truncation when comparing against a constant." This reverts commit r186107. It didn't handle wrapping arithmetic in the loop correctly and thus caused the following C program to count from 0 to UINT64_MAX instead of from 0 to 255 as intended: #include <stdio.h> int main() { unsigned char first = 0, last = 255; do { printf("%d\n", first); } while (first++ != last); } Full test case and instructions to reproduce with just the -indvars pass sent to the original review thread rather than to r186107's commit. llvm-svn: 186152	2013-07-12 11:18:55 +00:00
Nadav Rotem	ee62470368	SLPVectorizer: Sink and enable CSE for ExtractElements. llvm-svn: 186145	2013-07-12 06:09:24 +00:00
Nadav Rotem	1e6246b38c	SLPVectorize: Replace the code that checks for vectorization candidates in successor blocks with code that scans PHINodes. Before we could vectorize PHINodes scanning successors was a good way of finding candidates. Now we can vectorize the phinodes which is simpler. llvm-svn: 186139	2013-07-12 00:04:18 +00:00
Nadav Rotem	cd9c4e430f	Remove an argument that we dont use anymore. llvm-svn: 186116	2013-07-11 20:56:13 +00:00
Andrew Trick	fe577c9f45	indvars: Improve LFTR by eliminating truncation when comparing against a constant. Patch by Michele Scandale! Adds a special handling of the case where, during the loop exit condition rewriting, the exit value is a constant of bitwidth lower than the type of the induction variable: instead of introducing a trunc operation in order to match correctly the operand types, it allows to convert the constant value to an equivalent constant, depending on the initial value of the induction variable and the trip count, in order have an equivalent comparison between the induction variable and the new constant. llvm-svn: 186107	2013-07-11 17:08:59 +00:00
Benjamin Kramer	2880b5c20b	Don't use a potentially expensive shift if all we want is one set bit. No functionality change. llvm-svn: 186095	2013-07-11 16:05:50 +00:00
Arnold Schwaighofer	a8667081e1	LoopVectorize: Vectorize all accesses in address space zero with unit stride We can vectorize them because in the case where we wrap in the address space the unvectorized code would have had to access a pointer value of zero which is undefined behavior in address space zero according to the LLVM IR semantics. (Thank you Duncan, for pointing this out to me). Fixes PR16592. llvm-svn: 186088	2013-07-11 15:21:55 +00:00
Duncan Sands	447d97b223	TryToSimplifyUncondBranchFromEmptyBlock was checking that any common predecessors of the two blocks it is attempting to merge supply the same incoming values to any phi in the successor block. This change allows merging in the case where there is one or more incoming values that are undef. The undef values are rewritten to match the non-undef value that flows from the other edge. Patch by Mark Lacey. llvm-svn: 186069	2013-07-11 08:28:20 +00:00
Nadav Rotem	965af9cb85	Fix a warning. llvm-svn: 186064	2013-07-11 05:39:02 +00:00
Nadav Rotem	8e1d89128f	SLPVectorizer: refactor the code that places extracts. Place the code that decides where to put extracts in the build-tree phase. This allows us to take the cost of the extracts into account. llvm-svn: 186058	2013-07-11 04:54:05 +00:00

... 2 3 4 5 6 ...

10839 Commits