llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-22 20:43:44 +02:00

Author	SHA1	Message	Date
Aaron Ballman	ad452d2332	These functions are not actually defined for NDEBUG or !LLVM_DUMP_ENABLED, so guarding the declarations as well. NFC, silences MSVC warnings in release builds. llvm-svn: 220565	2014-10-24 15:16:39 +00:00
Benjamin Kramer	eb5ea88a48	AssumptionTracker: Don't create temporary CallbackVHs. Those are expensive to create in cold cache scenarios. NFC. llvm-svn: 219575	2014-10-11 19:13:01 +00:00
Chandler Carruth	9e809b8dad	[SCEV] Add some asserts to the recently improved trip count computation routines and fix all of the bugs they expose. I hit a test case that crashed even without these asserts due to passing a non-exiting latch to the ExitingBlock parameter of the trip count computation machinery. However, when I add the nice asserts, it turns out we have plenty of coverage of these bugs, they just didn't manifest in crashers. The core problem seems to stem from an assumption that the latch is the exiting block. While this is often true, and somewhat the "normal" way to think about loops, it isn't necessarily true. The correct way to call the trip count routines in a generic fashion (that is, without a particular exit in mind) is to just use the loop's single exiting block if it has one. The trip count can't be computed generically unless it does. This works great for the loop vectorizer. The loop unroller actually wants to select the latch when it has to chose between multiple exits because for unrolling it is the latch trips that matter. But if this is the desire, it needs to explicitly guard for non-exiting latches and check for the generic trip count in that case. I've added the asserts, and added convenience APIs for querying the trip count generically that check for a single exit block. I've kept the APIs consistent between computing trip count and trip multiples. Thansk to Mark for the help debugging and tracking down the right fix here! llvm-svn: 219550	2014-10-11 00:12:11 +00:00
Mark Heffernan	fdfe8b7008	This patch de-pessimizes the calculation of loop trip counts in ScalarEvolution in the presence of multiple exits. Previously all loops exits had to have identical counts for a loop trip count to be considered computable. This pessimization was implemented by calling getBackedgeTakenCount(L) rather than getExitCount(L, ExitingBlock) inside of ScalarEvolution::getSmallConstantTripCount() (see the FIXME in the comments of that function). The pessimization was added to fix a corner case involving undefined behavior (pr/16130). This patch more precisely handles the undefined behavior case allowing the pessimization to be removed. ControlsExit replaces IsSubExpr to more precisely track the case where undefined behavior is expected to occur. Because undefined behavior is tracked more precisely we can remove MustExit from ExitLimit. MustExit was used to track the case where the limit was computed potentially assuming undefined behavior even if undefined behavior didn't necessarily occur. llvm-svn: 219517	2014-10-10 17:39:11 +00:00
Benjamin Kramer	f35a067b43	Reduce double set lookups. NFC. llvm-svn: 219505	2014-10-10 15:32:50 +00:00
Robin Morisset	60d53043dc	Fix typo in comment llvm-svn: 219365	2014-10-08 23:30:45 +00:00
Benjamin Kramer	860521c88b	Make AAMDNodes ctor and operator bool (!!!) explicit, mop up bugs and weirdness exposed by it. llvm-svn: 219068	2014-10-04 22:44:29 +00:00
David Peixotto	1c44b9f4ef	Fix assertion in LICM doFinalization() The doFinalization method checks that the LoopToAliasSetMap is empty. LICM populates that map as it runs through the loop nest, deleting the entries for child loops as it goes. However, if a child loop is deleted by another pass (e.g. unrolling) then the loop will never be deleted from the map because LICM walks the loop nest to find entries it can delete. The fix is to delete the loop from the map and free the alias set when the loop is deleted from the loop nest. Differential Revision: http://reviews.llvm.org/D5305 llvm-svn: 218387	2014-09-24 16:48:31 +00:00
Eric Christopher	c75fbbac7c	Add a new pass FunctionTargetTransformInfo. This pass serves as a shim between the TargetTransformInfo immutable pass and the Subtarget via the TargetMachine and Function. Migrate a single call from BasicTargetTransformInfo as an example and provide shims where TargetMachine begins taking a Function to determine the subtarget. No functional change. llvm-svn: 218004	2014-09-18 00:34:14 +00:00
Sanjay Patel	8030ed3639	Rename getMaximumUnrollFactor -> getMaxInterleaveFactor; also rename option names controlling this variable. "Unroll" is not the appropriate name for this variable. Clang already uses the term "interleave" in pragmas and metadata for this. Differential Revision: http://reviews.llvm.org/D5066 llvm-svn: 217528	2014-09-10 17:58:16 +00:00
Hal Finkel	082ca75fd6	Don't static_cast invalid pointers UBSan complained about using static_cast on the invalid (tombstone, etc.) pointers used by DenseMap. Use a reinterpret_cast instead. llvm-svn: 217397	2014-09-08 19:31:25 +00:00
Hal Finkel	9ede5a39dd	Make use of @llvm.assume from LazyValueInfo This change teaches LazyValueInfo to use the @llvm.assume intrinsic. Like with the known-bits change (r217342), this requires feeding a "context" instruction pointer through many functions. Aside from a little refactoring to reuse the logic that turns predicates into constant ranges in LVI, the only new code is that which can 'merge' the range from an assumption into that otherwise computed. There is also a small addition to JumpThreading so that it can have LVI use assumptions in the same block as the comparison feeding a conditional branch. With this patch, we can now simplify this as expected: int foo(int a) { __builtin_assume(a > 5); if (a > 3) { bar(); return 1; } return 0; } llvm-svn: 217345	2014-09-07 20:29:59 +00:00
Hal Finkel	f8bb9b78cf	Make use of @llvm.assume in ValueTracking (computeKnownBits, etc.) This change, which allows @llvm.assume to be used from within computeKnownBits (and other associated functions in ValueTracking), adds some (optional) parameters to computeKnownBits and friends. These functions now (optionally) take a "context" instruction pointer, an AssumptionTracker pointer, and also a DomTree pointer, and most of the changes are just to pass this new information when it is easily available from InstSimplify, InstCombine, etc. As explained below, the significant conceptual change is that known properties of a value might depend on the control-flow location of the use (because we care that the @llvm.assume dominates the use because assumptions have control-flow dependencies). This means that, when we ask if bits are known in a value, we might get different answers for different uses. The significant changes are all in ValueTracking. Two main changes: First, as with the rest of the code, new parameters need to be passed around. To make this easier, I grouped them into a structure, and I made internal static versions of the relevant functions that take this structure as a parameter. The new code does as you might expect, it looks for @llvm.assume calls that make use of the value we're trying to learn something about (often indirectly), attempts to pattern match that expression, and uses the result if successful. By making use of the AssumptionTracker, the process of finding @llvm.assume calls is not expensive. Part of the structure being passed around inside ValueTracking is a set of already-considered @llvm.assume calls. This is to prevent a query using, for example, the assume(a == b), to recurse on itself. The context and DT params are used to find applicable assumptions. An assumption needs to dominate the context instruction, or come after it deterministically. In this latter case we only handle the specific case where both the assumption and the context instruction are in the same block, and we need to exclude assumptions from being used to simplify their own ephemeral values (those which contribute only to the assumption) because otherwise the assumption would prove its feeding comparison trivial and would be removed. This commit adds the plumbing and the logic for a simple masked-bit propagation (just enough to write a regression test). Future commits add more patterns (and, correspondingly, more regression tests). llvm-svn: 217342	2014-09-07 18:57:58 +00:00
Hal Finkel	575ec5e04c	Add functions for finding ephemeral values This adds a set of utility functions for collecting 'ephemeral' values. These are LLVM IR values that are used only by @llvm.assume intrinsics (directly or indirectly), and thus will be removed prior to code generation, implying that they should be considered free for certain purposes (like inlining). The inliner's cost analysis, and a few other passes, have been updated to account for ephemeral values using the provided functionality. This functionality is important for the usability of @llvm.assume, because it limits the "non-local" side-effects of adding llvm.assume on inlining, loop unrolling, etc. (these are hints, and do not generate code, so they should not directly contribute to estimates of execution cost). llvm-svn: 217335	2014-09-07 13:49:57 +00:00
Hal Finkel	6122fb79cb	Add an Assumption-Tracking Pass This adds an immutable pass, AssumptionTracker, which keeps a cache of @llvm.assume call instructions within a module. It uses callback value handles to keep stale functions and intrinsics out of the map, and it relies on any code that creates new @llvm.assume calls to notify it of the new instructions. The benefit is that code needing to find @llvm.assume intrinsics can do so directly, without scanning the function, thus allowing the cost of @llvm.assume handling to be negligible when none are present. The current design is intended to be lightweight. We don't keep track of anything until we need a list of assumptions in some function. The first time this happens, we scan the function. After that, we add/remove @llvm.assume calls from the cache in response to registration calls and ValueHandle callbacks. There are no new direct test cases for this pass, but because it calls it validation function upon module finalization, we'll pick up detectable inconsistencies from the other tests that touch @llvm.assume calls. This pass will be used by follow-up commits that make use of @llvm.assume. llvm-svn: 217334	2014-09-07 12:44:26 +00:00
Hal Finkel	0ad5c26d4b	Add a CFL Alias Analysis implementation This provides an implementation of CFL alias analysis (including some supporting data structures). Currently, we don't have any extremely fancy features, sans some interprocedural analysis (i.e. no field sensitivity, etc.), and we do best sitting behind BasicAA + TBAA. In such a configuration, we take ~0.6-0.8% of total compile time, and give ~7-8% NoAlias responses to queries TBAA and BasicAA couldn't answer when bootstrapping LLVM. In testing this on other projects, we've seen up to 10.5% of queries dropped by BasicAA+TBAA answered with NoAlias by this algorithm. Patch by George Burgess IV (with minor modifications by me -- mostly adapting some BasicAA tests), thanks! llvm-svn: 216970	2014-09-02 21:43:13 +00:00
Robin Morisset	e583310c3b	Fix typos in comments, NFC Summary: Just fixing comments, no functional change. Test Plan: N/A Reviewers: jfb Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D5130 llvm-svn: 216784	2014-08-29 21:53:01 +00:00
Dylan Noblesmith	24bc8991ce	Revert "Analysis: unique_ptr-ify DependenceAnalysis::collectCoeffInfo" This reverts commit r216358. llvm-svn: 216431	2014-08-26 02:03:38 +00:00
Rafael Espindola	1d5713d9bf	Modernize raw_fd_ostream's constructor a bit. Take a StringRef instead of a "const char *". Take a "std::error_code &" instead of a "std::string &" for error. A create static method would be even better, but this patch is already a bit too big. llvm-svn: 216393	2014-08-25 18:16:47 +00:00
Karthik Bhat	d94045aa5a	Allow vectorization of division by uniform power of 2. This patch adds support to recognize division by uniform power of 2 and modifies the cost table to vectorize division by uniform power of 2 whenever possible. Updates Cost model for Loop and SLP Vectorizer.The cost table is currently only updated for X86 backend. Thanks to Hal, Andrea, Sanjay for the review. (http://reviews.llvm.org/D4971) llvm-svn: 216371	2014-08-25 04:56:54 +00:00
Dylan Noblesmith	6af08b54ee	Analysis: unique_ptr-ify DependenceAnalysis::collectCoeffInfo llvm-svn: 216358	2014-08-25 00:28:43 +00:00
Dylan Noblesmith	d59c8b7d67	Analysis: unique_ptr-ify DependenceAnalysis::depends llvm-svn: 216357	2014-08-25 00:28:39 +00:00
Dylan Noblesmith	a124352238	Analysis: take a reference instead of pointer This parameter is never null. llvm-svn: 216356	2014-08-25 00:28:35 +00:00
Craig Topper	65775cc03d	Repace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size. llvm-svn: 216158	2014-08-21 05:55:13 +00:00
Andrew Trick	fadfd8041e	Tweak CFGPrinter to wrap very long names. I added wrapping to the CFGPrinter a while back so the -view-cfg output is actually viewable. I've since enountered very long mangled names with the same problem, so I'm slightly tweaking this code to work in that case. llvm-svn: 216087	2014-08-20 17:38:12 +00:00
Craig Topper	aa7422b5a6	Revert "Repace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size." Getting a weird buildbot failure that I need to investigate. llvm-svn: 215870	2014-08-18 00:24:38 +00:00
Craig Topper	227456e133	Repace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size. llvm-svn: 215868	2014-08-17 23:47:00 +00:00
Benjamin Kramer	da144ed5a2	Canonicalize header guards into a common format. Add header guards to files that were missing guards. Remove #endif comments as they don't seem common in LLVM (we can easily add them back if we decide they're useful) Changes made by clang-tidy with minor tweaks. llvm-svn: 215558	2014-08-13 16:26:38 +00:00
Richard Smith	930ab2538a	Remove Support/IncludeFile.h and its only user. This is actively harmful, since it breaks the modules builds (where CallGraph.h can be quite reasonably transitively included by an unimported portion of a module, and CallGraph.cpp not linked in), and appears to have been entirely redundant since PR780 was fixed back in 2008. If this breaks anything, please revert; I have only tested this with a single configuration, and it's possible that this is still somehow fixing something (though I doubt it, since no other similar file uses this mechanism any more). llvm-svn: 215142	2014-08-07 20:41:17 +00:00
James Molloy	ea323a2876	Teach the SLP Vectorizer that keeping some values live over a callsite can have a cost. Some types, such as 128-bit vector types on AArch64, don't have any callee-saved registers. So if a value needs to stay live over a callsite, it must be spilled and refilled. This cost is now taken into account. llvm-svn: 214859	2014-08-05 12:30:34 +00:00
Richard Smith	6c8f410a31	Header hygiene: remove using directive and #undef DEBUG_TYPE once we're done. llvm-svn: 214263	2014-07-30 00:25:24 +00:00
Hal Finkel	7463a12ef9	Add scoped-noalias metadata This commit adds scoped noalias metadata. The primary motivations for this feature are: 1. To preserve noalias function attribute information when inlining 2. To provide the ability to model block-scope C99 restrict pointers Neither of these two abilities are added here, only the necessary infrastructure. In fact, there should be no change to existing functionality, only the addition of new features. The logic that converts noalias function parameters into this metadata during inlining will come in a follow-up commit. What is added here is the ability to generally specify noalias memory-access sets. Regarding the metadata, alias-analysis scopes are defined similar to TBAA nodes: !scope0 = metadata !{ metadata !"scope of foo()" } !scope1 = metadata !{ metadata !"scope 1", metadata !scope0 } !scope2 = metadata !{ metadata !"scope 2", metadata !scope0 } !scope3 = metadata !{ metadata !"scope 2.1", metadata !scope2 } !scope4 = metadata !{ metadata !"scope 2.2", metadata !scope2 } Loads and stores can be tagged with an alias-analysis scope, and also, with a noalias tag for a specific scope: ... = load %ptr1, !alias.scope !{ !scope1 } ... = load %ptr2, !alias.scope !{ !scope1, !scope2 }, !noalias !{ !scope1 } When evaluating an aliasing query, if one of the instructions is associated with an alias.scope id that is identical to the noalias scope associated with the other instruction, or is a descendant (in the scope hierarchy) of the noalias scope associated with the other instruction, then the two memory accesses are assumed not to alias. Note that is the first element of the scope metadata is a string, then it can be combined accross functions and translation units. The string can be replaced by a self-reference to create globally unqiue scope identifiers. [Note: This overview is slightly stylized, since the metadata nodes really need to just be numbers (!0 instead of !scope0), and the scope lists are also global unnamed metadata.] Existing noalias metadata in a callee is "cloned" for use by the inlined code. This is necessary because the aliasing scopes are unique to each call site (because of possible control dependencies on the aliasing properties). For example, consider a function: foo(noalias a, noalias b) { a = b; } that gets inlined into bar() { ... if (...) foo(a1, b1); ... if (...) foo(a2, b2); } -- now just because we know that a1 does not alias with b1 at the first call site, and a2 does not alias with b2 at the second call site, we cannot let inlining these functons have the metadata imply that a1 does not alias with b2. llvm-svn: 213864	2014-07-24 14:25:39 +00:00
Hal Finkel	9be4aefa57	AA metadata refactoring (introduce AAMDNodes) In order to enable the preservation of noalias function parameter information after inlining, and the representation of block-level __restrict__ pointer information (etc.), additional kinds of aliasing metadata will be introduced. This metadata needs to be carried around in AliasAnalysis::Location objects (and MMOs at the SDAG level), and so we need to generalize the current scheme (which is hard-coded to just one TBAA MDNode). This commit introduces only the necessary refactoring to allow for the introduction of other aliasing metadata types, but does not actually introduce any (that will come in a follow-up commit). What it does introduce is a new AAMDNodes structure to hold all of the aliasing metadata nodes associated with a particular memory-accessing instruction, and uses that structure instead of the raw MDNode in AliasAnalysis::Location, etc. No functionality change intended. llvm-svn: 213859	2014-07-24 12:16:19 +00:00
Hal Finkel	796c65151d	Match semantics of PointerMayBeCapturedBefore to its name by default As it turns out, the capture tracker named CaptureBefore used by AA, and now available via the PointerMayBeCapturedBefore function, would have been more-aptly named CapturedBeforeOrAt, because it considers captures at the instruction provided. This is not always what one wants, and it is difficult to get the strictly-before behavior given only the current interface. This adds an additional parameter which controls whether or not you want to include captures at the provided instruction. The default is not to include the instruction provided, so that 'Before' matches its name. No functionality change intended. llvm-svn: 213582	2014-07-21 21:30:22 +00:00
Hal Finkel	b4b78bf273	Move the CapturesBefore tracker from AA into CaptureTracking There were two generally-useful CaptureTracker classes defined in LLVM: the simple tracker defined in CaptureTracking (and made available via the PointerMayBeCaptured utility function), and the CapturesBefore tracker available only inside of AA. This change moves the CapturesBefore tracker into CaptureTracking, generalizes it slightly (by adding a ReturnCaptures parameter), and makes it generally available via a PointerMayBeCapturedBefore utility function. This logic will be needed, for example, to perform noalias function parameter attribute inference. No functionality change intended. llvm-svn: 213519	2014-07-21 13:15:48 +00:00
Aaron Ballman	c169fe52d2	This declaration has no definition, which is causing MSVC to emit several "no suitable definition provided for explicit template instantiation request" C4661 warnings. llvm-svn: 213517	2014-07-21 13:08:08 +00:00
Hal Finkel	60f586ad81	Move isIdentifiedFunctionLocal from BasicAA to AA The ability to identify function locals will exist outside of BasicAA (for example, logic for inferring noalias function arguments will need this), so make this concept generally accessible without code duplication. No functionality change. llvm-svn: 213514	2014-07-21 12:27:23 +00:00
Matt Arsenault	043d06593c	Fix build with GCC. Seems like a bug in either GCC or clang, but I'm not sure which is right. llvm-svn: 213460	2014-07-19 19:16:36 +00:00
Matt Arsenault	56af912b43	Templatify RegionInfo so it works on MachineBasicBlocks llvm-svn: 213456	2014-07-19 18:29:29 +00:00
Hal Finkel	9779454568	Improve BasicAA CS-CS queries (redux) This reverts, "r213024 - Revert r212572 "improve BasicAA CS-CS queries", it causes PR20303." with a fix for the bug in pr20303. As it turned out, the relevant code was both wrong and over-conservative (because, as with the code it replaced, it would return the overall ModRef mask even if just Ref had been implied by the argument aliasing results). Hopefully, this correctly fixes both problems. Thanks to Nick Lewycky for reducing the test case for pr20303 (which I've cleaned up a little and added in DSE's test directory). The BasicAA test has also been updated to check for this error. Original commit message: BasicAA contains knowledge of certain intrinsics, such as memcpy and memset, and uses that information to form more-accurate answers to CallSite vs. Loc ModRef queries. Unfortunately, it did not use this information when answering CallSite vs. CallSite queries. Generically, when an intrinsic takes one or more pointers and the intrinsic is marked only to read/write from its arguments, the offset/size is unknown. As a result, the generic code that answers CallSite vs. CallSite (and CallSite vs. Loc) queries in AA uses UnknownSize when forming Locs from an intrinsic's arguments. While BasicAA's CallSite vs. Loc override could use more-accurate size information for some intrinsics, it did not do the same for CallSite vs. CallSite queries. This change refactors the intrinsic-specific logic in BasicAA into a generic AA query function: getArgLocation, which is overridden by BasicAA to supply the intrinsic-specific knowledge, and used by AA's generic implementation. This allows the intrinsic-specific knowledge to be used by both CallSite vs. Loc and CallSite vs. CallSite queries, and simplifies the BasicAA implementation. Currently, only one function, Mac's memset_pattern16, is handled by BasicAA (all the rest are intrinsics). As a side-effect of this refactoring, BasicAA's getModRefBehavior override now also returns OnlyAccessesArgumentPointees for this function (which is an improvement). llvm-svn: 213219	2014-07-17 01:28:25 +00:00
Nick Lewycky	91e41155de	Revert r212572 "improve BasicAA CS-CS queries", it causes PR20303. llvm-svn: 213024	2014-07-15 00:53:38 +00:00
Sanjay Patel	ac9d9a2ca2	removed circular definitions in comments llvm-svn: 212990	2014-07-14 21:51:59 +00:00
Matt Arsenault	85ca1395b9	Add CreatePointerBitCastOrAddrSpaceCast to IRBuilder and co. llvm-svn: 212962	2014-07-14 17:24:35 +00:00
Matt Arsenault	ef03e8a51b	Try to fix MSVC warning. llvm-svn: 212889	2014-07-12 23:16:26 +00:00
Matt Arsenault	c5722e9461	Try to fix MSVC build llvm-svn: 212886	2014-07-12 22:19:49 +00:00
Matt Arsenault	3d717281d0	Templatify DominanceFrontier. Theoretically this should now work for MachineBasicBlocks. llvm-svn: 212885	2014-07-12 21:59:52 +00:00
Duncan P. N. Exon Smith	72e8fb652a	BFI: Add constructor for Weight llvm-svn: 212868	2014-07-12 00:26:00 +00:00
Duncan P. N. Exon Smith	fdb4eaa3bf	BFI: Clean up BlockMass Implementation is small now -- the interesting logic was moved to `BranchProbability` a while ago. Move it into `bfi_detail` and get rid of the related TODOs. I was originally planning to define it within `BlockFrequencyInfoImpl` (or `BFIIBase`), but it seems cleaner in a namespace. Besides, `isPodLike` needs to be specialized before `BlockMass` can be used in some of the other data structures, and there isn't a clear way to do that. llvm-svn: 212866	2014-07-12 00:21:30 +00:00
Duncan P. N. Exon Smith	8c157be757	BFI: Mark the end of namespaces llvm-svn: 212861	2014-07-11 23:56:50 +00:00
Mark Heffernan	937bab3688	Partially fix PR20058: reduce compile time for loop unrolling with very high count by reducing calls to SE->forgetLoop llvm-svn: 212782	2014-07-10 23:30:06 +00:00

1 2 3 4 5 ...

2435 Commits