llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 19:12:56 +02:00

Author	SHA1	Message	Date
Hal Finkel	6122fb79cb	Add an Assumption-Tracking Pass This adds an immutable pass, AssumptionTracker, which keeps a cache of @llvm.assume call instructions within a module. It uses callback value handles to keep stale functions and intrinsics out of the map, and it relies on any code that creates new @llvm.assume calls to notify it of the new instructions. The benefit is that code needing to find @llvm.assume intrinsics can do so directly, without scanning the function, thus allowing the cost of @llvm.assume handling to be negligible when none are present. The current design is intended to be lightweight. We don't keep track of anything until we need a list of assumptions in some function. The first time this happens, we scan the function. After that, we add/remove @llvm.assume calls from the cache in response to registration calls and ValueHandle callbacks. There are no new direct test cases for this pass, but because it calls its validation function upon module finalization, we'll pick up detectable inconsistencies from the other tests that touch @llvm.assume calls. This pass will be used by follow-up commits that make use of @llvm.assume. llvm-svn: 217334	2014-09-07 12:44:26 +00:00
Benjamin Kramer	e991977346	Add override to overridden virtual methods, remove virtual keywords. No functionality change. Changes made by clang-tidy + some manual cleanup. llvm-svn: 217028	2014-09-03 11:41:21 +00:00
Hal Finkel	960b7410dc	[CFLAA] Remove one final initializer list Maybe MSVC will be happy now... llvm-svn: 217000	2014-09-03 00:06:47 +00:00
Hal Finkel	5c25f70da3	[CFLAA] And even more MSVC fixes Remove a couple more initializer lists and constexpr dependencies. llvm-svn: 216998	2014-09-02 23:50:01 +00:00
Hal Finkel	7480b4d2e0	[CFLAA] More cleanup for MSVC Remove more initializer lists, etc. llvm-svn: 216994	2014-09-02 23:29:48 +00:00
Hal Finkel	5e34c59d10	[CFLAA] No initializer lists for MSVC MSVC 2012 does not understand initializer lists; remove them. llvm-svn: 216991	2014-09-02 22:52:30 +00:00
Hal Finkel	c195a929be	[CFLAA] Remove tautological comparison Fixes this (the warning is right, the unsigned value is not negative): lib/Analysis/StratifiedSets.h:689:53: warning: comparison of unsigned expression >= 0 is always true [-Wtautological-compare] bool inbounds(StratifiedIndex N) const { return N >= 0 && N < Links.size(); } llvm-svn: 216987	2014-09-02 22:36:58 +00:00
Hal Finkel	27b7d42cf2	[CFLAA] LLVM_CONSTEXPR -> const The number is just a constant, and this should make MSVC happy (or at least happier). llvm-svn: 216981	2014-09-02 22:26:06 +00:00
Hal Finkel	38b9f2c79f	[CFLAA] constexpr -> LLVM_CONSTEXPR Attempt to fix the MSVC build by not using constexpr. llvm-svn: 216979	2014-09-02 22:13:00 +00:00
Hal Finkel	0ad5c26d4b	Add a CFL Alias Analysis implementation This provides an implementation of CFL alias analysis (including some supporting data structures). Currently, we don't have any extremely fancy features, sans some interprocedural analysis (i.e. no field sensitivity, etc.), and we do best sitting behind BasicAA + TBAA. In such a configuration, we take ~0.6-0.8% of total compile time, and give ~7-8% NoAlias responses to queries TBAA and BasicAA couldn't answer when bootstrapping LLVM. In testing this on other projects, we've seen up to 10.5% of queries dropped by BasicAA+TBAA answered with NoAlias by this algorithm. Patch by George Burgess IV (with minor modifications by me -- mostly adapting some BasicAA tests), thanks! llvm-svn: 216970	2014-09-02 21:43:13 +00:00
Robin Morisset	958eb8baeb	Fix MemoryDependenceAnalysis in cases where QueryInstr is a CmpXchg or a AtomicRMW Summary: MemoryDependenceAnalysis is currently cautious when the QueryInstr is an atomic load or store, but I forgot to check for atomic cmpxchg/atomicrmw. This patch is a way of fixing that, and making it less brittle (i.e. no risk that I forget another possible kind of atomic, even if the IR ends up changing in the future), by adding a fallback checking mayReadOrWriteFromMemory. Thanks to Philip Reames for finding this bug and suggesting this solution in http://reviews.llvm.org/D4845 Sadly, I don't see how to add a test for this, since the passes depending on MemoryDependenceAnalysis won't trigger for an atomic rmw anyway. Does anyone see a way for testing it? Test Plan: none possible at first sight Reviewers: jfb, reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5019 llvm-svn: 216940	2014-09-02 20:17:52 +00:00
Nick Lewycky	ea9374b1f2	Remove an errant outer loop that contains nothing but an inner loop over exactly the same elements. While no functionality change is intended (and hence there are no changes to tests), you don't want to skip this revision if bisecting for errors. llvm-svn: 216864	2014-09-01 05:17:15 +00:00
Craig Topper	57c93cf3ef	Remove 'virtual' keyword from methods marked with 'override' keyword. llvm-svn: 216823	2014-08-30 16:48:34 +00:00
Robin Morisset	e583310c3b	Fix typos in comments, NFC Summary: Just fixing comments, no functional change. Test Plan: N/A Reviewers: jfb Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D5130 llvm-svn: 216784	2014-08-29 21:53:01 +00:00
Robin Morisset	66408b8a5f	Relax the constraint more in MemoryDependencyAnalysis.cpp Even loads/stores that have a stronger ordering than monotonic can be safe. The rule is no release-acquire pair on the path from the QueryInst, assuming that the QueryInst is not atomic itself. llvm-svn: 216771	2014-08-29 20:32:58 +00:00
Matt Arsenault	f9216b5a3a	Make fabs safe to speculatively execute llvm-svn: 216736	2014-08-29 16:01:17 +00:00
David Majnemer	e48fe8e34c	InstSimplify: Move a transform from InstCombine to InstSimplify Several combines involving icmp (shl C2, %X) C1 can be simplified without introducing any new instructions. Move them to InstSimplify; while we are at it, make them more powerful. llvm-svn: 216642	2014-08-28 03:34:28 +00:00
David Majnemer	26488cdc00	InstSimplify: Don't simplify gep X, (Y-X) to Y if types differ It's incorrect to perform this simplification if the types differ. A bitcast would need to be inserted for this to work. This fixes PR20771. llvm-svn: 216597	2014-08-27 20:08:34 +00:00
Nico Weber	6595699903	Reland r216439 and r216441; majnemer has a real fix for PR20771. llvm-svn: 216586	2014-08-27 20:06:19 +00:00
Nico Weber	10e86f2a15	Revert r216439 (and r216441, else the former doesn't revert cleanly). It caused PR 20771. I'll land a test on the clang side. llvm-svn: 216582	2014-08-27 20:00:13 +00:00
David Majnemer	6ca502a667	InstSimplify: Compute comparison ranges for left shift instructions 'shl nuw CI, x' produces [CI, CI << CLZ(CI)] 'shl nsw CI, x' produces [CI << CLO(CI)-1, CI] if CI is negative 'shl nsw CI, x' produces [CI, CI << CLZ(CI)-1] if CI is non-negative llvm-svn: 216570	2014-08-27 18:03:46 +00:00
David Majnemer	635dd8cd82	InstSimplify: Fold gep X, (sub 0, ptrtoint(X)) to null Save InstCombine some work if we can perform this fold during InstSimplify. llvm-svn: 216441	2014-08-26 07:08:03 +00:00
David Majnemer	c439a36c15	InstSimplify: Simplify trivial pointer expressions like b + (e - b) consider: long long *f(long long *b, long long *e) { return b + (e - b); } we would lower this to something like: define i64* @f(i64* %b, i64* %e) { %1 = ptrtoint i64* %e to i64 %2 = ptrtoint i64* %b to i64 %3 = sub i64 %1, %2 %4 = ashr exact i64 %3, 3 %5 = getelementptr inbounds i64* %b, i64 %4 ret i64* %5 } This should fold away to just 'e'. N.B. This adds m_SpecificInt as a convenient way to match against a particular 64-bit integer when using LLVM's match interface. llvm-svn: 216439	2014-08-26 05:55:16 +00:00
Dylan Noblesmith	e0403e6eab	Analysis: cleanup Address review comments. llvm-svn: 216432	2014-08-26 02:03:40 +00:00
Dylan Noblesmith	24bc8991ce	Revert "Analysis: unique_ptr-ify DependenceAnalysis::collectCoeffInfo" This reverts commit r216358. llvm-svn: 216431	2014-08-26 02:03:38 +00:00
Rafael Espindola	1d5713d9bf	Modernize raw_fd_ostream's constructor a bit. Take a StringRef instead of a "const char *". Take a "std::error_code &" instead of a "std::string &" for error. A create static method would be even better, but this patch is already a bit too big. llvm-svn: 216393	2014-08-25 18:16:47 +00:00
Karthik Bhat	d94045aa5a	Allow vectorization of division by uniform power of 2. This patch adds support to recognize division by uniform power of 2 and modifies the cost table to vectorize division by uniform power of 2 whenever possible. Updates Cost model for Loop and SLP Vectorizer. The cost table is currently only updated for X86 backend. Thanks to Hal, Andrea, Sanjay for the review. (http://reviews.llvm.org/D4971) llvm-svn: 216371	2014-08-25 04:56:54 +00:00
Dylan Noblesmith	6af08b54ee	Analysis: unique_ptr-ify DependenceAnalysis::collectCoeffInfo llvm-svn: 216358	2014-08-25 00:28:43 +00:00
Dylan Noblesmith	d59c8b7d67	Analysis: unique_ptr-ify DependenceAnalysis::depends llvm-svn: 216357	2014-08-25 00:28:39 +00:00
Dylan Noblesmith	a124352238	Analysis: take a reference instead of pointer This parameter is never null. llvm-svn: 216356	2014-08-25 00:28:35 +00:00
Craig Topper	c2e0ae6754	Use range based for loops to avoid needing to re-mention SmallPtrSet size. llvm-svn: 216351	2014-08-24 23:23:06 +00:00
David Majnemer	5a45d76cfb	ValueTracking: Figure out more bits when looking at add/sub Given something like X01XX + X01XX, we know that the result must look like X1XXX. Adapted from a patch by Richard Smith, test-case written by me. llvm-svn: 216250	2014-08-22 00:40:43 +00:00
Craig Topper	65775cc03d	Replace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size. llvm-svn: 216158	2014-08-21 05:55:13 +00:00
Robin Morisset	6fd6f5e912	Answer to Philip Reames comments - add check for volatile (probably unneeded, but I agree that we should be conservative about it). - strengthen condition from isUnordered() to isSimple(), as I don't understand well enough Unordered semantics (and it also matches the comment better this way) to be confident in the previous behaviour (thanks for catching that one, I had missed the case Monotonic/Unordered). - separate a condition in two. - lengthen comment about aliasing and loads - add tests in GVN/atomic.ll llvm-svn: 215943	2014-08-18 22:18:14 +00:00
Robin Morisset	f6230dcf49	Weak relaxing of the constraints on atomics in MemoryDependencyAnalysis Monotonic accesses do not have to kill the analysis, as long as the QueryInstr is not itself atomic. llvm-svn: 215942	2014-08-18 22:18:11 +00:00
Craig Topper	aa7422b5a6	Revert "Replace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size." Getting a weird buildbot failure that I need to investigate. llvm-svn: 215870	2014-08-18 00:24:38 +00:00
Craig Topper	227456e133	Replace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size. llvm-svn: 215868	2014-08-17 23:47:00 +00:00
Jiangning Liu	b8e38ef6d3	In LVI (Lazy Value Info), the value on a BB could originally be calculated only once, after which the lattice would be updated to a state other than "undefined". This limitation could miss some opportunities to lower "overdefined" to a more accurate value. So this patch asks the algorithm to try to lower the lattice value again even if the value has already been lowered to "overdefined". llvm-svn: 215343	2014-08-11 05:02:04 +00:00
Richard Smith	930ab2538a	Remove Support/IncludeFile.h and its only user. This is actively harmful, since it breaks the modules builds (where CallGraph.h can be quite reasonably transitively included by an unimported portion of a module, and CallGraph.cpp not linked in), and appears to have been entirely redundant since PR780 was fixed back in 2008. If this breaks anything, please revert; I have only tested this with a single configuration, and it's possible that this is still somehow fixing something (though I doubt it, since no other similar file uses this mechanism any more). llvm-svn: 215142	2014-08-07 20:41:17 +00:00
James Molloy	ea323a2876	Teach the SLP Vectorizer that keeping some values live over a callsite can have a cost. Some types, such as 128-bit vector types on AArch64, don't have any callee-saved registers. So if a value needs to stay live over a callsite, it must be spilled and refilled. This cost is now taken into account. llvm-svn: 214859	2014-08-05 12:30:34 +00:00
Hal Finkel	89deb1a79e	Fix ScalarEvolutionExpander when creating a PHI in a block with duplicate predecessors It seems that when I fixed this, almost exactly a year ago, I did not quite do it correctly. When we have duplicate block predecessors, we can indeed not have different incoming values for the same block, but we must have duplicate entries. So, instead of skipping the duplicates, we explicitly add the duplicate incoming values. Fixes PR20442. llvm-svn: 214423	2014-07-31 19:13:38 +00:00
David Majnemer	ad214c8e9e	InstSimplify: Simplify (X - (0 - Y)) if the second sub is NUW If the NUW bit is set for 0 - Y, we know that all values for Y other than 0 would produce a poison value. This allows us to replace (0 - Y) with 0 in the expression (X - (0 - Y)) which will ultimately leave us with X. This partially fixes PR20189. llvm-svn: 214384	2014-07-31 04:49:18 +00:00
Hal Finkel	c1f65c8564	Add @llvm.assume, lowering, and some basic properties This is the first commit in a series that add an @llvm.assume intrinsic which can be used to provide the optimizer with a condition it may assume to be true (when the control flow would hit the intrinsic call). Some basic properties are added here: - llvm.invariant(true) is dead. - llvm.invariant(false) is unreachable (this directly corresponds to the documented behavior of MSVC's __assume(0)), so is llvm.invariant(undef). The intrinsic is tagged as writing arbitrarily, in order to maintain control dependencies. BasicAA has been updated, however, to return NoModRef for any particular location-based query so that we don't unnecessarily block code motion. llvm-svn: 213973	2014-07-25 21:13:35 +00:00
Hal Finkel	9c1513447c	Simplify and improve scoped-noalias metadata semantics In the process of fixing the noalias parameter -> metadata conversion process that will take place during inlining (which will be committed soon, but not turned on by default), I have come to realize that the semantics provided by yesterday's commit are not really what we want. Here's why: void foo(noalias a, noalias b, noalias c, bool x) { q = x ? a : b; c = q; } Generically, we know that c does not alias with a and with b (so there is an 'and' in what we know we're not), and we know that q might be derived from a or from *b (so there is an 'or' in what we know that we are). So we do not want the semantics currently, where any noalias scope matching any alias.scope causes a NoAlias return. What we want to know is that the noalias scopes form a superset of the alias.scope list (meaning that all the things we know we're not is a superset of all of the things the other instruction might be). Making that change, however, introduces a composability problem. If we inline once, adding the noalias metadata, and then inline again adding more, we append new scopes onto the noalias and alias.scope lists each time. But, this means that we could change what was a NoAlias result previously into a MayAlias result because we appended an additional scope onto one of the alias.scope lists. So, instead of giving scopes the ability to have parents (which I had borrowed from the TBAA implementation, but seems increasingly unlikely to be useful in practice), I've given them domains. The subset/superset condition now applies within each domain independently, and we only need it to hold in one domain. Each time we inline, we add the new scopes in a new scope domain, and everything now composes nicely. In addition, this simplifies the implementation. llvm-svn: 213948	2014-07-25 15:50:02 +00:00
Hal Finkel	7463a12ef9	Add scoped-noalias metadata This commit adds scoped noalias metadata. The primary motivations for this feature are: 1. To preserve noalias function attribute information when inlining 2. To provide the ability to model block-scope C99 restrict pointers Neither of these two abilities are added here, only the necessary infrastructure. In fact, there should be no change to existing functionality, only the addition of new features. The logic that converts noalias function parameters into this metadata during inlining will come in a follow-up commit. What is added here is the ability to generally specify noalias memory-access sets. Regarding the metadata, alias-analysis scopes are defined similar to TBAA nodes: !scope0 = metadata !{ metadata !"scope of foo()" } !scope1 = metadata !{ metadata !"scope 1", metadata !scope0 } !scope2 = metadata !{ metadata !"scope 2", metadata !scope0 } !scope3 = metadata !{ metadata !"scope 2.1", metadata !scope2 } !scope4 = metadata !{ metadata !"scope 2.2", metadata !scope2 } Loads and stores can be tagged with an alias-analysis scope, and also, with a noalias tag for a specific scope: ... = load %ptr1, !alias.scope !{ !scope1 } ... = load %ptr2, !alias.scope !{ !scope1, !scope2 }, !noalias !{ !scope1 } When evaluating an aliasing query, if one of the instructions is associated with an alias.scope id that is identical to the noalias scope associated with the other instruction, or is a descendant (in the scope hierarchy) of the noalias scope associated with the other instruction, then the two memory accesses are assumed not to alias. Note that if the first element of the scope metadata is a string, then it can be combined across functions and translation units. The string can be replaced by a self-reference to create globally unique scope identifiers.
[Note: This overview is slightly stylized, since the metadata nodes really need to just be numbers (!0 instead of !scope0), and the scope lists are also global unnamed metadata.] Existing noalias metadata in a callee is "cloned" for use by the inlined code. This is necessary because the aliasing scopes are unique to each call site (because of possible control dependencies on the aliasing properties). For example, consider a function: foo(noalias a, noalias b) { a = b; } that gets inlined into bar() { ... if (...) foo(a1, b1); ... if (...) foo(a2, b2); } -- now just because we know that a1 does not alias with b1 at the first call site, and a2 does not alias with b2 at the second call site, we cannot let inlining these functions have the metadata imply that a1 does not alias with b2. llvm-svn: 213864	2014-07-24 14:25:39 +00:00
Hal Finkel	9be4aefa57	AA metadata refactoring (introduce AAMDNodes) In order to enable the preservation of noalias function parameter information after inlining, and the representation of block-level __restrict__ pointer information (etc.), additional kinds of aliasing metadata will be introduced. This metadata needs to be carried around in AliasAnalysis::Location objects (and MMOs at the SDAG level), and so we need to generalize the current scheme (which is hard-coded to just one TBAA MDNode). This commit introduces only the necessary refactoring to allow for the introduction of other aliasing metadata types, but does not actually introduce any (that will come in a follow-up commit). What it does introduce is a new AAMDNodes structure to hold all of the aliasing metadata nodes associated with a particular memory-accessing instruction, and uses that structure instead of the raw MDNode in AliasAnalysis::Location, etc. No functionality change intended. llvm-svn: 213859	2014-07-24 12:16:19 +00:00
Hal Finkel	3c4b506191	Make use of the align parameter attribute for all pointer arguments We previously supported the align attribute on all (pointer) parameters, but we only used it for byval parameters. However, it is completely consistent at the IR level to treat 'align n' on all pointer parameters as an alignment assumption on the pointer, and now we will. Specifically, this causes computeKnownBits to use the align attribute on all pointer parameters, not just byval parameters. I've also added an explicit parameter attribute test for this to test/Bitcode/attributes.ll. And I've updated the LangRef to document the align parameter attribute (as it turns out, it was not documented at all previously, although the byval documentation mentioned that it could be used). There are (at least) two benefits to doing this: - It allows enhancing alignment based on the pointer alignment after inlining callees. - It allows simplification of pointer arithmetic. llvm-svn: 213670	2014-07-22 16:58:55 +00:00
Hal Finkel	796c65151d	Match semantics of PointerMayBeCapturedBefore to its name by default As it turns out, the capture tracker named CaptureBefore used by AA, and now available via the PointerMayBeCapturedBefore function, would have been more-aptly named CapturedBeforeOrAt, because it considers captures at the instruction provided. This is not always what one wants, and it is difficult to get the strictly-before behavior given only the current interface. This adds an additional parameter which controls whether or not you want to include captures at the provided instruction. The default is not to include the instruction provided, so that 'Before' matches its name. No functionality change intended. llvm-svn: 213582	2014-07-21 21:30:22 +00:00
Duncan P. N. Exon Smith	2ae51d315c	Revert "[C++11] Add predecessors(BasicBlock *) / successors(BasicBlock *) iterator ranges." This reverts commit r213474 (and r213475), which causes a miscompile on a stage2 LTO build. I'll reply on the list in a moment. llvm-svn: 213562	2014-07-21 17:06:51 +00:00
Hal Finkel	b4b78bf273	Move the CapturesBefore tracker from AA into CaptureTracking There were two generally-useful CaptureTracker classes defined in LLVM: the simple tracker defined in CaptureTracking (and made available via the PointerMayBeCaptured utility function), and the CapturesBefore tracker available only inside of AA. This change moves the CapturesBefore tracker into CaptureTracking, generalizes it slightly (by adding a ReturnCaptures parameter), and makes it generally available via a PointerMayBeCapturedBefore utility function. This logic will be needed, for example, to perform noalias function parameter attribute inference. No functionality change intended. llvm-svn: 213519	2014-07-21 13:15:48 +00:00
Hal Finkel	60f586ad81	Move isIdentifiedFunctionLocal from BasicAA to AA The ability to identify function locals will exist outside of BasicAA (for example, logic for inferring noalias function arguments will need this), so make this concept generally accessible without code duplication. No functionality change. llvm-svn: 213514	2014-07-21 12:27:23 +00:00
Manuel Jacob	e98e4c3031	Remove braces around single-statement block and rangify outer loop. This is a follow-up to r213474. llvm-svn: 213475	2014-07-20 09:20:47 +00:00
Manuel Jacob	8e924ddc40	[C++11] Add predecessors(BasicBlock *) / successors(BasicBlock *) iterator ranges. Summary: This patch introduces two new iterator ranges and updates existing code to use them. No functional change intended. Test Plan: All tests (make check-all) still pass. Reviewers: dblaikie Reviewed By: dblaikie Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D4481 llvm-svn: 213474	2014-07-20 09:10:11 +00:00
NAKAMURA Takumi	92df839afe	Fix msc17 build. RegionInfo::RegionInfo::recalculate() doesn't make sense. llvm-svn: 213466	2014-07-20 03:57:51 +00:00
NAKAMURA Takumi	bdca3fcb3b	Fix -Asserts build introduced since r213456. llvm-svn: 213465	2014-07-20 00:00:42 +00:00
Matt Arsenault	56af912b43	Templatify RegionInfo so it works on MachineBasicBlocks llvm-svn: 213456	2014-07-19 18:29:29 +00:00
David Blaikie	939901ec68	Remove uses of the redundant ".reset(nullptr)" of unique_ptr, in favor of ".reset()" It's also possible to just write "= nullptr", but there's some question of whether that's as readable, so I leave it up to authors to pick which they prefer for now. If we want to discuss standardizing on one or the other, we can do that at some point in the future. llvm-svn: 213438	2014-07-19 01:05:11 +00:00
Hal Finkel	000be1bc2f	Add a dereferenceable attribute This attribute indicates that the parameter or return pointer is dereferenceable. Practically speaking, loads from such a pointer within the associated byte range are safe to speculatively execute. Such pointer parameters are common in source languages (C++ references, for example). llvm-svn: 213385	2014-07-18 15:51:28 +00:00
Suyog Sarda	7302ccd41e	Rectify r213231. Use proper version of 'ComputeNumSignBits'. Earlier when the code was in InstCombine, we were calling the version of ComputeNumSignBits in InstCombine.h that automatically added the DataLayout* before calling into ValueTracking. When the code moved to InstSimplify, we are calling into ValueTracking directly without passing in the DataLayout*. This patch rectifies the same by passing DataLayout in ComputeNumSignBits. llvm-svn: 213295	2014-07-17 19:07:00 +00:00
Suyog Sarda	e62b39fcd0	Move ashr optimization from InstCombineShift to InstSimplify. Refactor code, no functionality change, test case moved from instcombine to instsimplify. Differential Revision: http://reviews.llvm.org/D4102 llvm-svn: 213231	2014-07-17 06:28:15 +00:00
Hal Finkel	9779454568	Improve BasicAA CS-CS queries (redux) This reverts, "r213024 - Revert r212572 "improve BasicAA CS-CS queries", it causes PR20303." with a fix for the bug in pr20303. As it turned out, the relevant code was both wrong and over-conservative (because, as with the code it replaced, it would return the overall ModRef mask even if just Ref had been implied by the argument aliasing results). Hopefully, this correctly fixes both problems. Thanks to Nick Lewycky for reducing the test case for pr20303 (which I've cleaned up a little and added in DSE's test directory). The BasicAA test has also been updated to check for this error. Original commit message: BasicAA contains knowledge of certain intrinsics, such as memcpy and memset, and uses that information to form more-accurate answers to CallSite vs. Loc ModRef queries. Unfortunately, it did not use this information when answering CallSite vs. CallSite queries. Generically, when an intrinsic takes one or more pointers and the intrinsic is marked only to read/write from its arguments, the offset/size is unknown. As a result, the generic code that answers CallSite vs. CallSite (and CallSite vs. Loc) queries in AA uses UnknownSize when forming Locs from an intrinsic's arguments. While BasicAA's CallSite vs. Loc override could use more-accurate size information for some intrinsics, it did not do the same for CallSite vs. CallSite queries. This change refactors the intrinsic-specific logic in BasicAA into a generic AA query function: getArgLocation, which is overridden by BasicAA to supply the intrinsic-specific knowledge, and used by AA's generic implementation. This allows the intrinsic-specific knowledge to be used by both CallSite vs. Loc and CallSite vs. CallSite queries, and simplifies the BasicAA implementation. Currently, only one function, Mac's memset_pattern16, is handled by BasicAA (all the rest are intrinsics). 
As a side-effect of this refactoring, BasicAA's getModRefBehavior override now also returns OnlyAccessesArgumentPointees for this function (which is an improvement). llvm-svn: 213219	2014-07-17 01:28:25 +00:00
Matt Arsenault	d769765bf9	Teach computeKnownBits to look through addrspacecast. This fixes inferring alignment through an addrspacecast. llvm-svn: 213030	2014-07-15 01:55:03 +00:00
Matt Arsenault	95ee145d10	Teach GetUnderlyingObject / BasicAA about addrspacecast llvm-svn: 213025	2014-07-15 00:56:40 +00:00
Nick Lewycky	91e41155de	Revert r212572 "improve BasicAA CS-CS queries", it causes PR20303. llvm-svn: 213024	2014-07-15 00:53:38 +00:00
Matt Arsenault	da8f2f7d36	Look through addrspacecast in IsConstantOffsetFromGlobal llvm-svn: 213000	2014-07-14 22:39:26 +00:00
Matt Arsenault	65202e7cee	Look through addrspacecast in GetPointerBaseWithConstantOffset llvm-svn: 212999	2014-07-14 22:39:22 +00:00
David Majnemer	6e615bab35	InstSimplify: Correct sdiv x / -1 Determining the bounds of x / -1 would start off with us dividing it by INT_MIN. Suffice to say, this would not work very well. Instead, handle it upfront by checking for -1 and mapping it to the range: [INT_MIN + 1, INT_MAX]. This means that the result of our division can be any value other than INT_MIN. llvm-svn: 212981	2014-07-14 20:38:45 +00:00
David Majnemer	a39248360a	InstSimplify: The upper bound of X / C was missing a rounding step Summary: When calculating the upper bound of X / -8589934592, we would perform the following calculation: Floor[INT_MAX / 8589934592] However, flooring the result would make us wrongly come to the conclusion that 1073741824 was not in the set of possible values. Instead, use the ceiling of the result. Reviewers: nicholas Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D4502 llvm-svn: 212976	2014-07-14 19:49:57 +00:00
Matt Arsenault	3d717281d0	Templatify DominanceFrontier. Theoretically this should now work for MachineBasicBlocks. llvm-svn: 212885	2014-07-12 21:59:52 +00:00
Duncan P. N. Exon Smith	72e8fb652a	BFI: Add constructor for Weight llvm-svn: 212868	2014-07-12 00:26:00 +00:00
Duncan P. N. Exon Smith	fdb4eaa3bf	BFI: Clean up BlockMass Implementation is small now -- the interesting logic was moved to `BranchProbability` a while ago. Move it into `bfi_detail` and get rid of the related TODOs. I was originally planning to define it within `BlockFrequencyInfoImpl` (or `BFIIBase`), but it seems cleaner in a namespace. Besides, `isPodLike` needs to be specialized before `BlockMass` can be used in some of the other data structures, and there isn't a clear way to do that. llvm-svn: 212866	2014-07-12 00:21:30 +00:00
Duncan P. N. Exon Smith	8c157be757	BFI: Mark the end of namespaces llvm-svn: 212861	2014-07-11 23:56:50 +00:00
Hal Finkel	54ccd892b9	Allow isDereferenceablePointer to look through some bitcasts isDereferenceablePointer should not give up upon encountering any bitcast. If we're casting from a pointer to a larger type to a pointer to a smaller type, we can continue by examining the bitcast's operand. This missing capability was noted in a comment in the function. In order for this to work, isDereferenceablePointer now takes an optional DataLayout pointer (essentially all callers already had such a pointer available). Most code uses isDereferenceablePointer through isSafeToSpeculativelyExecute (which already took an optional DataLayout pointer), and to enable the LICM test case, LICM needs to actually provide its DL pointer to isSafeToSpeculativelyExecute (which it was not doing previously). llvm-svn: 212686	2014-07-10 05:27:53 +00:00
Hal Finkel	ed72d81b90	Improve BasicAA CS-CS queries BasicAA contains knowledge of certain intrinsics, such as memcpy and memset, and uses that information to form more-accurate answers to CallSite vs. Loc ModRef queries. Unfortunately, it did not use this information when answering CallSite vs. CallSite queries. Generically, when an intrinsic takes one or more pointers and the intrinsic is marked only to read/write from its arguments, the offset/size is unknown. As a result, the generic code that answers CallSite vs. CallSite (and CallSite vs. Loc) queries in AA uses UnknownSize when forming Locs from an intrinsic's arguments. While BasicAA's CallSite vs. Loc override could use more-accurate size information for some intrinsics, it did not do the same for CallSite vs. CallSite queries. This change refactors the intrinsic-specific logic in BasicAA into a generic AA query function: getArgLocation, which is overridden by BasicAA to supply the intrinsic-specific knowledge, and used by AA's generic implementation. This allows the intrinsic-specific knowledge to be used by both CallSite vs. Loc and CallSite vs. CallSite queries, and simplifies the BasicAA implementation. Currently, only one function, Mac's memset_pattern16, is handled by BasicAA (all the rest are intrinsics). As a side-effect of this refactoring, BasicAA's getModRefBehavior override now also returns OnlyAccessesArgumentPointees for this function (which is an improvement). llvm-svn: 212572	2014-07-08 23:16:49 +00:00
Sanjay Patel	f4bfc3ef60	fixed typos in comments llvm-svn: 212424	2014-07-06 23:24:53 +00:00
David Majnemer	44123517c9	InstSimplify: Fix a bug when INT_MIN is in a sdiv When INT_MIN is the numerator in a sdiv, we would not properly handle overflow when calculating the bounds of possible values; abs(INT_MIN) is not a meaningful number. Instead, check and handle INT_MIN by reasoning that the largest value is INT_MIN/-2 and the smallest value is INT_MIN. This fixes PR20199. llvm-svn: 212307	2014-07-04 00:23:39 +00:00
Andrea Di Biagio	5c282923d9	[CostModel][x86] Improved cost model for alternate shuffles. This patch: 1) Improves the cost model for x86 alternate shuffles (originally added at revision 211339); 2) Teaches the Cost Model Analysis pass how to analyze alternate shuffles. Alternate shuffles are a special kind of blend; on x86, we can often easily lower alternate shuffles into a single blend instruction (depending on the subtarget features). The existing cost model didn't take into account subtarget features. Also, it had a couple of "dead" entries for vector types that are never legal (example: on x86 types v2i32 and v2f32 are not legal; those are always either promoted or widened to 128-bit vector types). The new x86 cost model takes into account what target features we have before returning the shuffle cost (i.e. the number of instructions after the blend is lowered/expanded). This patch also teaches the Cost Model Analysis how to identify and analyze alternate shuffles (i.e. 'SK_Alternate' shufflevector instructions): - added function 'isAlternateVectorMask'; - added some logic to check if an instruction is an alternate shuffle and, if so, call the target specific TTI to get the corresponding shuffle cost; - added a test to verify the cost model analysis on alternate shuffles. llvm-svn: 212296	2014-07-03 22:24:18 +00:00
Richard Trieu	51122628e6	Add new lines to debugging information. Differential Revision: http://reviews.llvm.org/D4262 llvm-svn: 212250	2014-07-03 02:11:49 +00:00
Gerolf Hoflehner	dc712d8ab8	Suppress inlining when the block address is taken Inlining functions with block addresses can cause many problems and requires a rich infrastructure to support, including escape analysis. At this point the safest approach to address these problems is by blocking inlining from happening. Background: There have been reports on Ruby segmentation faults triggered by inlining functions with block addresses like //Ruby code snippet vm_exec_core() { finish_insn_seq_0 = &&INSN_LABEL_finish; INSN_LABEL_finish: ; } This kind of scenario can also happen when LLVM picks a subset of blocks for inlining, which is the case with the actual code in the Ruby environment. LLVM suppresses inlining for such functions when there is an indirect branch. The attached patch does so even when there is no indirect branch. Note that user code like above would not make much sense: using the global for jumping across function boundaries would be illegal. Why was there a segfault: In the snippet above the block with the label is recognized as dead, so it is eliminated. Instead of a block address the cloner stores a constant (sic!) into the global resulting in the segfault (when the global is used in a goto). Why had it worked in the past then: By luck. In older versions vm_exec_core was also inlined but the label address used was the block label address in vm_exec_core. So the global jump ended up in the original function rather than in the caller which accidentally happened to work. Test case ./tools/clang/test/CodeGen/indirect-goto.c will fail as a result of this commit. rdar://17245966 llvm-svn: 212077	2014-07-01 00:19:34 +00:00
Alp Toker	97022b0c1f	Revert "Introduce a string_ostream string builder facilty" Temporarily back out commits r211749, r211752 and r211754. llvm-svn: 211814	2014-06-26 22:52:05 +00:00
Dinesh Dwivedi	9d122cf780	This patch removed duplicate code for matching patterns which are now handled in SimplifyUsingDistributiveLaws() (after r211261) Differential Revision: http://reviews.llvm.org/D4253 llvm-svn: 211768	2014-06-26 08:57:33 +00:00
Alp Toker	5ad6808be1	MSVC build fix following r211749 Avoid strndup() llvm-svn: 211752	2014-06-26 00:25:41 +00:00
Alp Toker	fd9ead3b6f	Introduce a string_ostream string builder facilty string_ostream is a safe and efficient string builder that combines opaque stack storage with a built-in ostream interface. small_string_ostream<bytes> additionally permits an explicit stack storage size other than the default 128 bytes to be provided. Beyond that, storage is transferred to the heap. This convenient class can be used in most places an std::string+raw_string_ostream pair or SmallString<>+raw_svector_ostream pair would previously have been used, in order to guarantee consistent access without byte truncation. The patch also converts much of LLVM to use the new facility. These changes include several probable bug fixes for truncated output, a programming error that's no longer possible with the new interface. llvm-svn: 211749	2014-06-26 00:00:48 +00:00
Duncan P. N. Exon Smith	2b829a7a79	Support: Move class ScaledNumber ScaledNumber has been cleaned up enough to pull out of BFI now. Still work to do there (tests for shifting, bloated printing code, etc.), but it seems clean enough for its new home. llvm-svn: 211562	2014-06-24 00:38:09 +00:00
Duncan P. N. Exon Smith	7a62dcd0cd	BFI: Un-floatify more language llvm-svn: 211561	2014-06-24 00:26:13 +00:00
Duncan P. N. Exon Smith	6ccaf0fa23	Support: Extract ScaledNumbers::MinScale and MaxScale llvm-svn: 211558	2014-06-24 00:15:19 +00:00
Duncan P. N. Exon Smith	a21f5c3569	BFI: Change language from "exponent" to "scale" llvm-svn: 211557	2014-06-23 23:57:12 +00:00
Duncan P. N. Exon Smith	1c9633c62e	BFI: Rename UnsignedFloat => ScaledNumber A lot of the docs and API are out of date, but I'll leave that for a separate commit. llvm-svn: 211555	2014-06-23 23:36:17 +00:00
Benjamin Kramer	8d54e9ca1f	SCEVExpander: Fold constant PHIs harder. The logic below only understands proper IVs. PR20093. llvm-svn: 211433	2014-06-21 11:47:18 +00:00
Richard Trieu	b7d5af56cb	Add back functionality removed in r210497. Instead of asserting, output a message stating that a null pointer was found. llvm-svn: 211430	2014-06-21 02:43:02 +00:00
Duncan P. N. Exon Smith	db0cbc8b8a	Support: Write ScaledNumber::getQuotient() and getProduct() llvm-svn: 211409	2014-06-20 21:47:47 +00:00
Jingyue Wu	52b8eafe4c	[ValueTracking] Extend range metadata to call/invoke Summary: With this patch, range metadata can be added to call/invoke including IntrinsicInst. Previously, it could only be added to load. Rename computeKnownBitsLoad to computeKnownBitsFromRangeMetadata because range metadata is not only used by load. Update the language reference to reflect this change. Test Plan: Add several tests in range-2.ll to confirm the verifier is happy with having range metadata on call/invoke. Add two tests in AddOverFlow.ll to confirm annotating range metadata to call/invoke can benefit InstCombine. Reviewers: meheff, nlewycky, reames, hfinkel, eliben Reviewed By: eliben Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D4187 llvm-svn: 211281	2014-06-19 16:50:16 +00:00
Nick Lewycky	051f63ab97	Move optimization of some cases of (A & C1)|(B & C2) from instcombine to instsimplify. Patch by Rahul Jain, plus some last minute changes by me -- you can blame me for any bugs. llvm-svn: 211252	2014-06-19 03:51:46 +00:00
Nick Lewycky	4eb68b1ca7	Make instsimplify's analysis of icmp eq/ne use computeKnownBits to determine whether the icmp is always true or false. Patch by Suyog Sarda! llvm-svn: 211251	2014-06-19 03:35:49 +00:00
Richard Trieu	8c7b353cd7	Removing an "if (!this)" check from two print methods. The condition will never be true in a well-defined context. The checking for null pointers has been moved into the caller logic so it does not rely on undefined behavior. llvm-svn: 210497	2014-06-09 22:53:16 +00:00
Alp Toker	a9e2748af6	Remove old fenv.h workaround for a historic clang driver bug Tested and works fine with clang using libstdc++. All indications are that this was fixed some time ago and isn't a problem with any clang version we support. I've added a note in PR6907 which is still open for some reason. llvm-svn: 210485	2014-06-09 19:00:52 +00:00
Alp Toker	a026ddb3ba	Fold FEnv.h into the implementation Support headers shouldn't use config.h definitions, and they should never be undefined like this. ConstantFolding.cpp was the only user of this facility and already includes config.h for other math features, so it makes sense to move the checks there at point of use. (The implicit config.h was also quite dangerous -- removing the FEnv.h include would have silently disabled math constant folding without causing any tests to fail. Need to investigate -Wundef once the cleanup is done.) This eliminates the last config.h include from LLVM headers, paving the way for more consistent configuration checks. llvm-svn: 210483	2014-06-09 18:28:53 +00:00
Tobias Grosser	e914a50dc9	ScalarEvolution: Derive element size from the type of the loaded element Before, we were looking at the size of the pointer type that specifies the location from which to load the element. This did not make any sense at all. This change fixes a bug in the delinearization where we failed to delinearize certain load instructions. llvm-svn: 210435	2014-06-08 19:21:20 +00:00
Tom Roeder	740d86dc79	Add a new attribute called 'jumptable' that creates jump-instruction tables for functions marked with this attribute. It includes a pass that rewrites all indirect calls to jumptable functions to pass through these tables. This also adds backend support for generating the jump-instruction tables on ARM and X86. Note that since the jumptable attribute creates a second function pointer for a function, any function marked with jumptable must also be marked with unnamed_addr. llvm-svn: 210280	2014-06-05 19:29:43 +00:00
Rafael Espindola	0746266d63	Add a Constant version of stripPointerCasts. Thanks to rnk for the suggestion. llvm-svn: 210205	2014-06-04 19:01:48 +00:00
Sebastian Pop	e038cb3e5a	implement missing SCEVDivision case without this case we would end on an infinite recursion: the remainder is zero, so Numerator - Remainder is equal to Numerator and so we would recursively ask for the division of Numerator by Denominator. llvm-svn: 209838	2014-05-29 19:44:09 +00:00
Sebastian Pop	a5d17facf7	fail to find dimensions when ElementSize is nullptr when ScalarEvolution::getElementSize returns nullptr it is safe to early return in ScalarEvolution::findArrayDimensions such that we avoid later problems when we try to divide the terms by ElementSize. llvm-svn: 209837	2014-05-29 19:44:05 +00:00
Sanjay Patel	3591bead10	test check-in: added missing parenthesis in comment llvm-svn: 209763	2014-05-28 19:03:33 +00:00
Sebastian Pop	6efdf0e296	avoid type mismatch when building SCEVs This is a corner case I have stumbled upon when dealing with ARM64 type conversions. I was not able to extract a testcase for the community codebase to fail on. The patch conservatively discards a division that would have ended up in an ICE due to a type mismatch when building a multiply expression. I have also added code to a place that builds add expressions and in which we should be careful not to pass in operands of different types. llvm-svn: 209694	2014-05-27 22:42:00 +00:00
Sebastian Pop	fa763d3c07	do not use the GCD to compute the delinearization strides We do not need to compute the GCD anymore after we removed the constant coefficients from the terms: the terms are now all parametric expressions and there is no need to recognize constant terms that divide only a subset of the terms. We only rely on the size of the terms, i.e., the number of operands in the multiply expressions, to sort the terms and recognize the parametric dimensions. llvm-svn: 209693	2014-05-27 22:41:56 +00:00
Sebastian Pop	721b704445	remove BasePointer before delinearizing No functional change is intended: instead of relying on the delinearization to come up with the base pointer as a remainder of the divisions in the delinearization, we just compute it from the array access and use that value. We subtract the base pointer from the SCEV to be delinearized and that simplifies the work of the delinearizer. llvm-svn: 209692	2014-05-27 22:41:51 +00:00
Sebastian Pop	1664c3c2ec	remove constant terms The delinearization is needed only to remove the non linearity induced by expressions involving multiplications of parameters and induction variables. There is no problem in dealing with constant times parameters, or constant times an induction variable. For this reason, the current patch discards all constant terms and multipliers before running the delinearization algorithm on the terms. The only thing remaining in the term expressions are parameters and multiply expressions of parameters: these simplified term expressions are passed to the array shape recognizer that will not recognize constant dimensions anymore: these will be recognized as different strides in parametric subscripts. The only important special case of a constant dimension is the size of elements. Instead of relying on the delinearization to infer the size of an element, compute the element size from the base address type. This is a much more precise way of computing the element size than before, as we would have mixed together the size of an element with the strides of the innermost dimension. llvm-svn: 209691	2014-05-27 22:41:45 +00:00
Michael Zolotukhin	406287c5b7	Some cleanup for r209568. llvm-svn: 209634	2014-05-26 14:49:46 +00:00
Michael Zolotukhin	df83a19a09	Implement sext(C1 + C2X) --> sext(C1) + sext(C2X) and sext{C1,+,C2} --> sext(C1) + sext{0,+,C2} transformation in Scalar Evolution. That helps SLP-vectorizer to recognize consecutive loads/stores. <rdar://problem/14860614> llvm-svn: 209568	2014-05-24 08:09:57 +00:00
Andrew Trick	3b4463f718	Fix and improve SCEV ComputeBackedgeTakenCount. This is a follow-up to r209358: PR19799: Indvars miscompile due to an incorrect max backedge taken count from SCEV. That fix was incomplete as pointed out by Arnold and Michael Z. The code was also too confusing. It needed a careful rewrite with more unit tests. This version will also happen to optimize more cases. <rdar://17005101> PR19799: Indvars miscompile... llvm-svn: 209545	2014-05-23 19:47:13 +00:00
Justin Bogner	5e6887dc27	ScalarEvolution: Fix handling of AddRecs in isKnownPredicate ScalarEvolution::isKnownPredicate() can wrongly reduce a comparison when both the LHS and RHS are SCEVAddRecExprs. This checks that both LHS and RHS are guarded in the case when both are SCEVAddRecExprs. The test case is against indvars because I could not find a way to directly test SCEV. Patch by Sanjay Patel! llvm-svn: 209487	2014-05-23 00:06:56 +00:00
Andrew Trick	102d4404fb	Fix a bug in SCEV's backedge taken count computation from my prior fix in Jan. This has to do with the trip count computation for loops with multiple exits, which is quite subtle. Most passes just ask for a single trip count number, so we must be conservative assuming any exit could be taken. Normally, we rely on the "exact" trip count, which was correctly given as "unknown". However, SCEV also gives a "max" back-edge taken count. The loop's max BE taken count is conservatively a maximum over the max of each exit's non-exiting iterations count. Note that some exit tests can be skipped so the max loop back-edge taken count can actually exceed the max non-exiting iterations for some exits. However, when we know the loop latch cannot be skipped, we can directly use its max taken count disregarding other exits. I previously took the minimum here without checking whether the other exit could be skipped. The correct, and simpler thing to do here is just to directly use the loop latch's max non-exiting iterations as the loop's max back-edge count. In the problematic test case, the first loop exit had a max of zero non-exiting iterations, but could be skipped. The loop latch was known not to be skipped but had max of one non-exiting iteration. We incorrectly claimed the loop back-edge could be taken zero times, when it is actually taken one time. Fixes Loop %for.body.i: <multiple exits> Unpredictable backedge-taken count. Loop %for.body.i: max backedge-taken count is 1. llvm-svn: 209358	2014-05-22 00:37:03 +00:00
Eric Christopher	262770bdee	Clean up language and grammar. Based on a patch by jfcaron3@gmail.com! PR19806 llvm-svn: 209216	2014-05-20 17:11:11 +00:00
Nick Lewycky	ea4c3a9a9c	Teach isKnownNonNull that a nonnull return is not null. Add a test for this case as well as the case of a nonnull attribute (already handled but not tested). llvm-svn: 209193	2014-05-20 05:13:21 +00:00
Nick Lewycky	de84a8bb51	Add 'nonnull', a new parameter and return attribute which indicates that the pointer is not null. Instcombine will elide comparisons between these and null. Patch by Luqman Aden! llvm-svn: 209185	2014-05-20 01:23:40 +00:00
Peter Collingbourne	6b9c51d275	Check the alwaysinline attribute on the call as well as on the caller. Differential Revision: http://reviews.llvm.org/D3815 llvm-svn: 209150	2014-05-19 18:25:54 +00:00
David Majnemer	ef2cb1fc63	InstSimplify: Improve handling of ashr/lshr Summary: Analyze the range of values produced by ashr/lshr cst, %V when it is being used in an icmp. Reviewers: nicholas Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D3774 llvm-svn: 209000	2014-05-16 17:14:03 +00:00
David Majnemer	186633e0f8	InstSimplify: Optimize using dividend in sdiv Summary: The dividend in an sdiv tells us the largest and smallest possible results. Use this fact to optimize comparisons against an sdiv with a constant dividend. Reviewers: nicholas Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D3795 llvm-svn: 208999	2014-05-16 16:57:04 +00:00
Juergen Ributzka	271cad0970	Add C API for thread yielding callback. Sometimes an LLVM compilation may take more time than a client would like to wait for. The problem is that it is not possible to safely suspend the LLVM thread from the outside. When the timing is bad it might be possible that the LLVM thread holds a global mutex and this would block any progress in any other thread. This commit adds a new yield callback function that can be registered with a context. LLVM will try to yield by calling this callback function, but there is no guaranteed frequency. LLVM will only do so if it can guarantee that suspending the thread won't block any forward progress in other LLVM contexts in the same process. Once the client receives the callback it can suspend the thread safely and resume it at another time. Related to <rdar://problem/16728690> llvm-svn: 208945	2014-05-16 02:33:15 +00:00
Jay Foad	2827803889	Instead of littering asserts throughout the code after every call to computeKnownBits, consolidate them into one assert at the end of computeKnownBits itself. llvm-svn: 208876	2014-05-15 12:12:55 +00:00
Chandler Carruth	0d35e1f8fc	Teach the constant folder to look through bitcast constant expressions much more effectively when trying to constant fold a load of a constant. Previously, we only handled bitcasts by trying to find a totally generic byte representation of the constant and use that. Now, we look through the bitcast to see what constant we might fold the load into, and then try to form a constant expression cast of the found value that would be equivalent to loading the value. You might wonder why on earth this actually matters. Well, turns out that the Itanium ABI causes us to create a single array for a vtable where the first elements are virtual base offsets, followed by the virtual function pointers. Because the array is homogenous the element type is consistently i8* and we inttoptr the virtual base offsets into the initial elements. Then constructors bitcast these pointers to i64 pointers prior to loading them. Boom, no more constant folding of virtual base offsets. This is the first fix to LLVM to address the insane performance Eric Niebler discovered with Clang on his range comprehensions[1]. There is more to come though, this doesn't really fix the problem fully. [1]: http://ericniebler.com/2014/04/27/range-comprehensions/ llvm-svn: 208856	2014-05-15 09:56:28 +00:00
Alp Toker	18115693f7	Fix typos llvm-svn: 208839	2014-05-15 01:52:21 +00:00
Jay Foad	e0eac700cb	Rename ComputeMaskedBits to computeKnownBits. "Masked" has been inappropriate since it lost its Mask parameter in r154011. llvm-svn: 208811	2014-05-14 21:14:37 +00:00
David Majnemer	6098432810	InstSimplify: Optimize signed icmp of -(zext V) Summary: We know that -(zext V) will always be <= zero, simplify signed icmps that have these. Uncovered using http://www.cs.utah.edu/~regehr/souper/ Reviewers: nicholas Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D3754 llvm-svn: 208809	2014-05-14 20:16:28 +00:00
Jay Foad	df682f6c8b	Update the comments for ComputeMaskedBits, which lost its Mask parameter in r154011. llvm-svn: 208757	2014-05-14 08:00:07 +00:00
Sebastian Pop	25e94ba142	use nullptr instead of NULL llvm-svn: 208622	2014-05-12 20:11:01 +00:00
Sebastian Pop	de2f65cfdd	do not assert when delinearization fails llvm-svn: 208615	2014-05-12 19:01:53 +00:00
Sebastian Pop	28499bfbb2	use isZero() llvm-svn: 208614	2014-05-12 19:01:49 +00:00
Benjamin Kramer	28edee20f3	SCEV: Use range-based for loop and fold variable into assert. llvm-svn: 208476	2014-05-10 17:47:18 +00:00
Sebastian Pop	f6b4cc99cd	move findArrayDimensions to ScalarEvolution we do not use the information from SCEVAddRecExpr to compute the shape of the array, so a better place for this function is in ScalarEvolution. llvm-svn: 208456	2014-05-09 22:45:07 +00:00
Sebastian Pop	76196fee4f	fix typo in debug message llvm-svn: 208455	2014-05-09 22:45:02 +00:00
Tobias Grosser	f264562cb9	Correct formatting. Sorry for the commit spam. My clang-format crashed on me and the vim plugin did not print an error, but instead just left the formatting untouched. llvm-svn: 208358	2014-05-08 21:43:19 +00:00
Tobias Grosser	7888d0c465	Use std::remove_if to remove elements from a vector Suggested-by: Benjamin Kramer <benny.kra@gmail.com> llvm-svn: 208357	2014-05-08 21:32:59 +00:00
Rafael Espindola	c6c3ed654b	Use a range loop. llvm-svn: 208343	2014-05-08 17:57:50 +00:00
Tobias Grosser	358e9a97e7	Revert "SCEV: Use I = vector<>.erase(I) to iterate and delete at the same time" as committed in r208282. The original commit was incorrect. llvm-svn: 208286	2014-05-08 07:55:34 +00:00
Tobias Grosser	4c447db9fb	SCEV: Use I = vector<>.erase(I) to iterate and delete at the same time llvm-svn: 208282	2014-05-08 07:12:44 +00:00
Sebastian Pop	866eb1eecf	avoid segfaulting Quotient and Remainder don't have to be initialized. llvm-svn: 208238	2014-05-07 19:00:37 +00:00
Sebastian Pop	8f355b84ae	do not collect undef terms llvm-svn: 208237	2014-05-07 19:00:32 +00:00
Sebastian Pop	d5cb815565	split delinearization pass in 3 steps To compute the dimensions of the array in a unique way, we split the delinearization analysis in three steps: - find parametric terms in all memory access functions - compute the array dimensions from the set of terms - compute the delinearized access functions for each dimension The first step is executed on all the memory access functions such that we gather all the patterns in which an array is accessed. The second step reduces all this information in a unique description of the sizes of the array. The third step is delinearizing each memory access function following the common description of the shape of the array computed in step 2. This rewrite of the delinearization pass also solves a problem we had with the previous implementation: because the previous algorithm was by induction on the structure of the SCEV, it would not correctly recognize the shape of the array when the memory access was not following the nesting of the loops: for example, see polly/test/ScopInfo/multidim_only_ivs_3d_reverse.ll ; void foo(long n, long m, long o, double A[n][m][o]) { ; ; for (long i = 0; i < n; i++) ; for (long j = 0; j < m; j++) ; for (long k = 0; k < o; k++) ; A[i][k][j] = 1.0; Starting with this patch we no longer delinearize access functions that do not contain parameters, for example in test/Analysis/DependenceAnalysis/GCD.ll ;; for (long int i = 0; i < 100; i++) ;; for (long int j = 0; j < 100; j++) { ;; A[2*i - 4*j] = i; ;; *B++ = A[6*i + 8*j]; these accesses will not be delinearized as the upper bounds of the loops are constants, and their access functions do not contain SCEVUnknown parameters. llvm-svn: 208232	2014-05-07 18:01:20 +00:00
Tobias Grosser	8128149f4a	[C++11] Add NArySCEV->Operands iterator range llvm-svn: 208158	2014-05-07 06:07:47 +00:00
Duncan P. N. Exon Smith	cbf47b5244	blockfreq: Move include to .cpp llvm-svn: 208035	2014-05-06 01:57:42 +00:00
Chandler Carruth	2ccafe8399	[LCG] Add the last (and most complex) of the edge insertion mutation operations on the call graph. This one forms a cycle, and while not as complex as removing an internal edge from an SCC, it involves a reasonable amount of work to find all of the nodes newly connected in a cycle. Also somewhat alarming is the worst case complexity here: it might have to walk roughly the entire SCC inverse DAG to insert a single edge. This is carefully documented in the API (I hope). llvm-svn: 207935	2014-05-04 09:38:32 +00:00
Juergen Ributzka	b855191ef0	[TBAA] Fix handling of mixed TBAA (path-aware and non-path-aware TBAA). This fix simply ensures that both metadata nodes are path-aware before performing path-aware alias analysis. This issue isn't normally triggered in LLVM, because we perform an autoupgrade of the TBAA metadata to the new format when reading in LL or BC files. This issue only appears when a client creates the IR manually and mixes old and new TBAA metadata format. This fixes <rdar://problem/16760860>. llvm-svn: 207923	2014-05-03 22:32:52 +00:00
Chandler Carruth	143c70588a	[LCG] Add the other simple edge insertion API to the call graph. This just connects an SCC to one of its descendants directly. Not much of an impact. The last one is the hard one -- connecting an SCC to one of its ancestors, and thereby forming a cycle such that we have to merge all the SCCs participating in the cycle. llvm-svn: 207751	2014-05-01 12:18:20 +00:00
Chandler Carruth	6f4d8c2889	[LCG] Don't lookup the child SCC twice. Spotted this by inspection, and no functionality changed. llvm-svn: 207750	2014-05-01 12:16:31 +00:00
Chandler Carruth	91cf62ad50	[LCG] Add some basic methods for querying the parent/child relationships of SCCs in the SCC DAG. Exercise them in the big graph test case. These will be especially useful for establishing invariants in insertion logic. llvm-svn: 207749	2014-05-01 12:12:42 +00:00
Chandler Carruth	bd97884116	[LCG] Add the really, really boring edge insertion case: adding an edge entirely within an existing SCC. Shockingly, making the connected component more connected is ... a total snooze fest. =] Anyways, it's wired up, and I even added a test case to make sure it pretty much sorta works. =D llvm-svn: 207631	2014-04-30 10:48:36 +00:00
Chandler Carruth	aa6122effe	[LCG] Actually test the basic edge removal bits (IE, the non-SCC bits), and discover that it's totally broken. Yay tests. Boo bug. Fix the basic edge removal so that it works by nulling out the removed edges rather than actually removing them. This leaves the indices valid in the map from callee to index, and preserves some of the locality for iterating over edges. The iterator is made bidirectional to reflect that it now has to skip over null entries, and the skipping logic is layered onto it. As future work, I would like to track essentially the "load factor" of the edge list, and when it falls below a threshold do a compaction. An alternative I considered (and continue to consider) is storing the callees in a doubly linked list where each element of the list is in a set (which is essentially the classical linked-hash-table datastructure). The problem with that approach is that either you need to heap allocate the linked list nodes and use pointers to them, or use a bucket hash table (with even more linked list pointer overhead!), etc. It's pretty easy to get 5x overhead for values that are just pointers. So far, I think punching holes in the vector, and periodic compaction is likely to be much more efficient overall in the space/time tradeoff. llvm-svn: 207619	2014-04-30 07:45:27 +00:00
Benjamin Kramer	4f8fb8ff6c	raw_ostream: Forward declare OpenFlags and include FileSystem.h only where necessary. llvm-svn: 207593	2014-04-29 23:26:49 +00:00
Duncan P. N. Exon Smith	705fc7169e	blockfreq: Defer to BranchProbability::scale() `BlockMass` can now defer to `BranchProbability::scale()`. llvm-svn: 207547	2014-04-29 16:20:05 +00:00
Duncan P. N. Exon Smith	583ed8f3b0	blockfreq: Remove more extra typenames from r207438 llvm-svn: 207440	2014-04-28 20:22:29 +00:00
Duncan P. N. Exon Smith	2eaef1aa01	Reapply "blockfreq: Approximate irreducible control flow" This reverts commit r207287, reapplying r207286. I'm hoping that declaring an explicit struct and instantiating `addBlockEdges()` directly works around the GCC crash from r207286. This is a lot more boilerplate, though. llvm-svn: 207438	2014-04-28 20:02:29 +00:00
Chandler Carruth	08eb8582cd	[LCG] Add the most basic of edge insertion to the lazy call graph. This just handles the pre-DFS case. Also add some test cases for this case to make sure it works. llvm-svn: 207411	2014-04-28 11:10:23 +00:00
Chandler Carruth	4098580cb2	[LCG] Make the return of the IntraSCC removal method actually match its contract (and be much more useful). It now provides exactly the post-order traversal a caller might need to perform on newly formed SCCs. llvm-svn: 207410	2014-04-28 10:49:06 +00:00
Chandler Carruth	02b3960e8a	[inliner] Significantly improve the compile time in cases like PR19499 by avoiding inlining massive switches merely because they have no instructions in them. These switches still show up where we fail to form lookup tables, and in those cases they are actually going to cause a very significant code size hit anyways, so inlining them is not the right call. The right way to fix any performance regressions stemming from this is to enhance the switch-to-lookup-table logic to fire in more places. This makes PR19499 about 5x less bad. It uncovers a second compile time problem in that test case that is unrelated (surprisingly!). llvm-svn: 207403	2014-04-28 08:52:44 +00:00
Craig Topper	b663bffa27	[C++] Use 'nullptr'. llvm-svn: 207394	2014-04-28 04:05:08 +00:00
Chandler Carruth	1b5573df25	[LCG] Re-organize the methods for mutating a call graph to make their API requirements much more obvious. The key here is that there are two totally different use cases for mutating the graph. Prior to doing any SCC formation, it is very easy to mutate the graph. There may be users that want to do small tweaks here, and then use the already-built graph for their SCC-based operations. This method remains on the graph itself and is documented carefully as being cheap but unavailable once SCCs are formed. Once SCCs are formed, and there is some in-flight DFS building them, we have to be much more careful in how we mutate the graph. These mutation operations are sunk onto the SCCs themselves, which both simplifies things (the code was already there!) and helps make it obvious that these interfaces are only applicable within that context. The other primary constraint is that the edge being mutated is actually related to the SCC on which we call the method. This helps make it obvious that you cannot arbitrarily mutate some other SCC. I've tried to write much more complete documentation for the interesting mutation API -- intra-SCC edge removal. Currently one aspect of this documentation is a lie (the result list of SCCs) but we also don't even have tests for that API. =[ I'm going to add tests and fix it to match the documentation next. llvm-svn: 207339	2014-04-27 01:59:50 +00:00
Chandler Carruth	864b47743f	[LCG] Rather than removing nodes from the SCC entry set when we process them, just skip over any DFS-numbered nodes when finding the next root of a DFS. This allows the entry set to just be a vector as we populate it from a uniqued source. It also removes the possibility for a linear scan of the entry set to actually do the removal which can make things go quadratic if we get unlucky. llvm-svn: 207312	2014-04-26 09:45:55 +00:00
Chandler Carruth	4dbb64e3cd	[LCG] Rotate the full SCC finding algorithm to avoid round-trips through the DFS stack for leaves in the call graph. As mentioned in my previous commit, this is particularly interesting for graphs which have high fan out but low connectivity resulting in many leaves. For such graphs, this can remove a large % of the DFS stack traffic even though it doesn't make the stack much smaller. It's a bit easier to formulate this for the full algorithm because that one stops completely for each SCC. For example, I was able to directly eliminate the "Recurse" boolean used to continue an outer loop from the inner loop. llvm-svn: 207311	2014-04-26 09:28:00 +00:00
Chandler Carruth	3a16e3f5fa	[LCG] Hoist the main DFS loop out of the edge removal function. This makes working through the worklist much cleaner, and makes it possible to avoid the 'bool-to-continue-the-outer-loop' hack. Not a huge difference, but I think this is approaching as polished as I can make it. llvm-svn: 207310	2014-04-26 09:06:53 +00:00
Chandler Carruth	87d8609624	[LCG] In the incremental SCC re-formation, lift the node currently being processed in the DFS out of the stack completely. Keep it exclusively in a variable. Re-shuffle some code structure to make this easier. This can have a very dramatic effect in some cases because call graphs tend to look like a high fan-out spanning tree. As a consequence, there are a large number of leaf nodes in the graph, and this technique causes leaf nodes to never even go into the stack. While this only reduces the max depth by 1, it may cause the total number of round trips through the stack to drop by a lot. Now, most of this isn't really relevant for the incremental version. =] But I wanted to prototype it first here as this variant is in ways more complex. As long as I can get the code factored well here, I'll next make the primary walk look the same. There are several refactorings this exposes I think. llvm-svn: 207306	2014-04-26 03:36:42 +00:00
Chandler Carruth	04ce1b92d9	[LCG] Special case the removal of self edges. These don't impact the SCC graph in any way because we don't track edges in the SCC graph, just nodes. This also lets us add a nice assert about the invariant that we're working on at least a certain number of nodes within the SCC. llvm-svn: 207305	2014-04-26 03:36:37 +00:00
Chandler Carruth	0e388582e5	[LCG] Refactor the duplicated code I added in my last commit here into a helper function. Also factor the other two places where we did the same thing into the helper function. =] Much cleaner this way. NFC. llvm-svn: 207300	2014-04-26 01:03:46 +00:00
Duncan P. N. Exon Smith	c54b3a7e23	Revert "blockfreq: Approximate irreducible control flow" This reverts commit r207286. It causes an ICE on the cmake-llvm-x86_64-linux buildbot [1]: llvm/lib/Analysis/BlockFrequencyInfo.cpp: In lambda function: llvm/lib/Analysis/BlockFrequencyInfo.cpp:182:1: internal compiler error: in get_expr_operands, at tree-ssa-operands.c:1035 [1]: http://bb.pgr.jp/builders/cmake-llvm-x86_64-linux/builds/12093/steps/build_llvm/logs/stdio llvm-svn: 207287	2014-04-25 23:16:58 +00:00
Duncan P. N. Exon Smith	3189616c35	blockfreq: Approximate irreducible control flow Previously, irreducible backedges were ignored. With this commit, irreducible SCCs are discovered on the fly, and modelled as loops with multiple headers. This approximation specifies the headers of irreducible sub-SCCs as its entry blocks and all nodes that are targets of a backedge within it (excluding backedges within true sub-loops). Block frequency calculations act as if we insert a new block that intercepts all the edges to the headers. All backedges and entries to the irreducible SCC point to this imaginary block. This imaginary block has an edge (with even probability) to each header block. The result is now reasonable enough that I've added a number of testcases for irreducible control flow. I've outlined in `BlockFrequencyInfoImpl.h` ways to improve the approximation. <rdar://problem/14292693> llvm-svn: 207286	2014-04-25 23:08:57 +00:00
Duncan P. N. Exon Smith	0f4795de58	blockfreq: Further shift logic to LoopData Move a lot of the loop-related logic that was sprinkled around the code into `LoopData`. <rdar://problem/14292693> llvm-svn: 207258	2014-04-25 18:47:04 +00:00
Duncan P. N. Exon Smith	ea68e6a3d5	SCC: Change clients to use const, NFC It's fishy to be changing the `std::vector<>` owned by the iterator, and no one actually does it, so I'm going to remove the ability in a subsequent commit. First, update the users. <rdar://problem/14292693> llvm-svn: 207252	2014-04-25 18:24:50 +00:00
Chandler Carruth	f9a129a8ff	[LCG] During the incremental update of an SCC, switch to using the SCCMap to test for nodes that have been re-added to the root SCC rather than a set vector. We already have done the SCCMap lookup, we just need to test it in two different ways. In turn, do most of the processing of these nodes as they go into the root SCC rather than lazily. This simplifies the final loop to just stitch the root SCC into its children's parent sets. No functionality changed. However, this makes a few things painfully obvious, which was my intent. =] There is tons of repeated code introduced here and elsewhere. I'm splitting the refactoring of that code into helpers from this change so it's clear that this is the change which switches the datastructures used around, and the other is a pure factoring & deduplication of code change. llvm-svn: 207217	2014-04-25 09:52:44 +00:00
Chandler Carruth	099d43c4dc	[LCG] During the incremental re-build of an SCC after removing an edge, remove the nodes in the SCC from the SCC map entirely prior to the DFS walk. This allows the SCC map to represent both the state of not-yet-re-added-to-an-SCC and added-back-to-this-SCC independently. The first is being missing from the SCC map, the second is mapping back to 'this'. In a subsequent commit, I'm going to use this property to simplify the new node list for this SCC. In theory, I think this also makes the contract for orphaning a node from the graph slightly less confusing. Now it is also orphaned from the SCC graph. Still, this isn't quite right either, and so I'm not adding test cases here. I'll add test cases for the behavior of orphaning nodes when the code actually supports it. The change here is mostly incidental, my goal is simplifying the algorithm. llvm-svn: 207213	2014-04-25 09:08:10 +00:00
Chandler Carruth	92048ceb62	[LCG] Rather than doing a linear time SmallSetVector removal of each child from the worklist, wait until we actually need to pop another element off of the worklist and skip over any that were already visited by the DFS. This also enables swapping the nodes of the SCC into the worklist. No functionality changed. llvm-svn: 207212	2014-04-25 09:08:05 +00:00
Chandler Carruth	a937a4a8a9	[LCG] Remove a completely unnecessary loop. It wasn't even doing anything, just mucking up the code. I feel bad that I even wrote this loop. Very sorry. The diff is huge because of the indent change, but I promise all this is doing is realizing that the outer two loops were actually the exact same loops, and we didn't need two of them. llvm-svn: 207202	2014-04-25 06:45:06 +00:00
Chandler Carruth	6a4aae8f97	[LCG] Now that the loop structure of the core SCC finding routine is factored into a more reasonable form, replace the tail call with a simple outer-loop continuation. It's sad that C++ makes this so awkward to write, but it seems more direct and clear than the tail call at this point. llvm-svn: 207201	2014-04-25 06:38:58 +00:00
Duncan P. N. Exon Smith	6117699d7e	blockfreq: Only one mass distribution per node Remove the concepts of "forward" and "general" mass distributions, which was wrong. The split might have made sense in an early version of the algorithm, but it's definitely wrong now. <rdar://problem/14292693> llvm-svn: 207195	2014-04-25 04:38:43 +00:00
Duncan P. N. Exon Smith	eebda28f4c	blockfreq: Document assertion <rdar://problem/14292693> llvm-svn: 207194	2014-04-25 04:38:40 +00:00
Duncan P. N. Exon Smith	20240d1036	blockfreq: Document high-level functions <rdar://problem/14292693> llvm-svn: 207191	2014-04-25 04:38:32 +00:00
Duncan P. N. Exon Smith	23f427f288	blockfreq: Scale LoopData::Scale on the way down Rather than scaling loop headers and then scaling all the loop members by the header frequency, scale `LoopData::Scale` itself, and scale the loop members by it. It's much more obvious what's going on this way, and doesn't cost any extra multiplies. <rdar://problem/14292693> llvm-svn: 207189	2014-04-25 04:38:27 +00:00
Duncan P. N. Exon Smith	103f12e173	blockfreq: unwrapLoopPackage() => unwrapLoop() <rdar://problem/14292693> llvm-svn: 207188	2014-04-25 04:38:25 +00:00
Duncan P. N. Exon Smith	4f4a0c2fa2	blockfreq: Pass the Loop directly into unwrapLoopPackage() <rdar://problem/14292693> llvm-svn: 207187	2014-04-25 04:38:23 +00:00
Duncan P. N. Exon Smith	3fb457a1d2	blockfreq: Unwrap from Loops When unwrapping loops, just visit the loops rather than all nodes. <rdar://problem/14292693> llvm-svn: 207186	2014-04-25 04:38:20 +00:00
Duncan P. N. Exon Smith	20bb8bb185	blockfreq: Separate unwrapLoops() from finalizeMetrics() <rdar://problem/14292693> llvm-svn: 207185	2014-04-25 04:38:17 +00:00
Duncan P. N. Exon Smith	bed52c2a81	blockfreq: Expose getPackagedNode() Make `getPackagedNode()` a member function of `BlockFrequencyInfoImplBase` so that it's available for templated code. <rdar://problem/14292693> llvm-svn: 207183	2014-04-25 04:38:12 +00:00
Duncan P. N. Exon Smith	b9744991af	blockfreq: Store the header with the members <rdar://problem/14292693> llvm-svn: 207182	2014-04-25 04:38:09 +00:00
Duncan P. N. Exon Smith	2a2a2175c7	blockfreq: Encapsulate LoopData::Header <rdar://problem/14292693> llvm-svn: 207181	2014-04-25 04:38:06 +00:00
Duncan P. N. Exon Smith	e670959c2a	blockfreq: Use LoopData directly Instead of passing around loop headers, pass around `LoopData` directly. <rdar://problem/14292693> llvm-svn: 207179	2014-04-25 04:38:01 +00:00
Duncan P. N. Exon Smith	0b338fa955	blockfreq: Use a std::list for Loops As pointed out by David Blaikie in code review, a `std::list<T>` is simpler than a `std::vector<std::unique_ptr<T>>`. Another option is a `std::deque<T>` (which allocates in chunks), but I'd like to leave open the option of inserting in the middle of the sequence for handling irreducible control flow on the fly. <rdar://problem/14292693> llvm-svn: 207177	2014-04-25 04:30:06 +00:00
Chandler Carruth	59e03f926f	[LCG] Switch a weird do/while loop that actually couldn't fail its condition into an obviously infinite loop with an assert about the degenerate condition. No functionality changed. llvm-svn: 207147	2014-04-24 21:19:30 +00:00
Chandler Carruth	9e4513f082	[LCG] Incorporate the core trick of improvements on the naive Tarjan's algorithm here: http://dl.acm.org/citation.cfm?id=177301. The idea of isolating the roots has even more relevance when using the stack not just to implement the DFS but also to implement the recursive step. Because we use it for the recursive step, to isolate the roots we need to maintain two stacks: one for our recursive DFS walk, and another of the nodes that have been walked. The nice thing is that the latter will be half the size. It also fixes a complete hack where we scanned backwards over the stack to find the next potential-root to continue processing. Now that is always the top of the DFS stack. While this is a really nice improvement already (IMO) it further opens the door for two important simplifications: 1) De-duplicating some of the code across the two different walks. I've actually made the duplication a bit worse in some senses with this patch because the two are starting to converge. 2) Dramatically simplifying the loop structures of both walks. I wanted to do those separately as they'll be essentially just CFG restructuring. This patch on the other hand actually uses different datastructures to implement the algorithm itself. llvm-svn: 207098	2014-04-24 11:05:20 +00:00
Chandler Carruth	9e16f14789	[LCG] Rotate logic applied to the top of the DFSStack to instead be applied prior to pushing a node onto the DFSStack. This is the first step toward avoiding the stack entirely for leaf nodes. It also simplifies things a bit and I think is pointing the way toward factoring some more of the shared logic out of the two implementations. It is also making it more obvious how to restructure the loops themselves to be a bit easier to read (although no different in terms of functionality). llvm-svn: 207095	2014-04-24 09:59:59 +00:00
Chandler Carruth	cd39f4c2e6	[LCG] Switch the parent SCC tracking from a SmallSetVector to a SmallPtrSet. Currently, there is no need for stable iteration in this dimension, and I now think there won't need to be going forward. If this is ever re-introduced in any form, it needs to not be a SetVector based solution because removal cannot be linear. There will be many SCCs with large numbers of parents. When encountering these, the incremental SCC update for intra-SCC edge removal was quadratic due to linear removal (kind of). I'm really hoping we can avoid having an ordering property here at all though... llvm-svn: 207091	2014-04-24 09:22:31 +00:00
Chandler Carruth	ccccef94ac	[LCG] We don't actually need a set in each SCC to track the nodes. We can use the node -> SCC mapping in the top-level graph to test this on the rare occasions we need it. llvm-svn: 207090	2014-04-24 08:55:36 +00:00
Craig Topper	c7c3a99ec2	[C++] Use 'nullptr'. llvm-svn: 207083	2014-04-24 06:44:33 +00:00
Chandler Carruth	a18f590cc4	[LCG] Normalize the post-order SCC iterator to just iterate over the SCC values rather than having pointers in weird places. llvm-svn: 207053	2014-04-23 23:51:07 +00:00
Chandler Carruth	1d124691ed	[LCG] Switch the primary node iterator to be a much more normal C++ iterator, returning a Node by reference on dereference. llvm-svn: 207048	2014-04-23 23:34:48 +00:00
Chandler Carruth	18f0202abb	[LCG] Make the insertion and query paths into the LCG which cannot fail return references to better model this property. No functionality changed. llvm-svn: 207047	2014-04-23 23:20:36 +00:00
Chandler Carruth	ddc1da4ac6	[LCG] Switch the SCC lookup to be in terms of call graph nodes rather than functions. So far, this access pattern is much more common. It seems likely that any user of this interface is going to have nodes at the point that they are querying the SCCs. No functionality changed. llvm-svn: 207045	2014-04-23 23:12:06 +00:00
Chandler Carruth	e064af9075	[LCG] Switch the primary SCC building code to use the negative low-link values rather than an expensive dense map query to test whether children have already been popped into an SCC. This matches the incremental SCC building code. I've also included the assert that I put there but updated both of their text. No functionality changed here. I still don't have any great ideas for sharing the code between the two implementations, but I may try a brute-force approach to factoring it at some point. llvm-svn: 207042	2014-04-23 22:28:13 +00:00
Chandler Carruth	72105b1195	[LCG] Add the first round of mutation support to the lazy call graph. This implements the core functionality necessary to remove an edge from the call graph and correctly update both the basic graph and the SCC structure. As part of that it has to run a tiny (in number of nodes) Tarjan-style DFS walk of an SCC being mutated to compute newly formed SCCs, etc. This is very rough and a WIP. I have a bunch of FIXMEs for code cleanup that will reduce the boilerplate in this change substantially. I also have a bunch of simplifications to various parts of both algorithms that I want to make, but first I'd like to have a more holistic picture. Ideally, I'd also like more testing. I'll probably add quite a few more unit tests as I go here to cover the various different aspects and corner cases of removing edges from the graph. Still, this is, so far, successfully updating the SCC graph in-place without disrupting the identity established for the existing SCCs even when we do challenging things like delete the critical edge that made an SCC cycle at all and have to reform things as a tree of smaller SCCs. Getting this to work is really critical for the new pass manager as it is going to associate significant state with the SCC instance and needs it to be stable. That is also the motivation behind the return of the newly formed SCCs. Eventually, I'll wire this all the way up to the public API so that the pass manager can use it to correctly re-enqueue newly formed SCCs into a fresh postorder traversal. llvm-svn: 206968	2014-04-23 11:03:03 +00:00
Chandler Carruth	ba4ce79281	[LCG] Implement Tarjan's algorithm correctly this time. We have to walk up the stack finishing the exploration of each entry's children before we're finished in addition to accounting for their low-links. Added a unittest that really hammers home the need for this with interlocking cycles that would each appear distinct otherwise and crash or compute the wrong result. As part of this, nuke a stale fixme and bring the rest of the implementation still more closely in line with the original algorithm. llvm-svn: 206966	2014-04-23 10:31:17 +00:00
Chandler Carruth	4d480e7c41	[LCG] Add a unittest for the LazyCallGraph. I had a weak moment and resisted this for too long. Just with the basic testing here I was able to exercise the analysis in more detail and sift out both type signature bugs in the API and a bug in the DFS numbering. All of these are fixed here as well. The unittests will be much more important for the mutation support where it is necessary to craft minimal mutations and then inspect the state of the graph. There is just no way to do that with a standard FileCheck test. However, unittesting these kinds of analyses is really quite easy, especially as they're designed with the new pass manager where there is essentially no infrastructure required to rig up the core logic and exercise it at an API level. As a minor aside about the DFS numbering bug, the DFS numbering used in LCG is a bit unusual. Rather than numbering from 0, we number from 1, and use 0 as the sentinel "unvisited" state. Other implementations often use '-1' for this, but I find it easier to deal with 0 and it shouldn't make any real difference provided someone doesn't write silly bugs like forgetting to actually initialize the DFS numbering. Oops. ;] llvm-svn: 206954	2014-04-23 08:08:49 +00:00
Chandler Carruth	6ddb99ce88	[LCG] Hoist the logic for forming a new SCC from the top of the DFSStack into a helper function. I plan to re-use it for doing incremental DFS-based updates to the SCCs when we mutate the call graph. llvm-svn: 206948	2014-04-23 06:09:03 +00:00

5179 Commits