mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 02:52:53 +02:00
Commit Graph

67 Commits

Author SHA1 Message Date
Yuanfang Chen
e1803bebb8 [NewPM][PassInstrument] Add PrintPass callback to StandardInstrumentations
Problem:
Right now, our "Running pass" printing is not accurate when passes are wrapped in an adaptor, because the adaptor is never skipped even though the pass it wraps could be. The other problem is that the "Running pass" line for an adaptor is printed before any "Running pass" lines for the passes/analyses it depends on (for example, FunctionToLoopPassAdaptor). So the order of printing is not the actual order of execution.

Solution:
Doing things like PassManager::Debuglogging is very intrusive because we need to specify Debuglogging whenever an adaptor is created. (Actually, right now we're not specifying Debuglogging for some sub-PassManagers; check PassBuilder.)

This patch moves the per-pass debug logging into a PassInstrumentation callback. That way we can be sure that all running passes are logged, and in the correct order.

This could also be used to implement hierarchical pass logging in the legacy PM. We could also move the pass managers' own logging to this mechanism if we want.
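
As a rough sketch of the direction (not the exact code in this patch), a printing callback can be registered on the PassInstrumentationCallbacks object that the pass managers consult; the generic before-pass hook is used here, while StandardInstrumentations may use more specialized hooks:

  #include "llvm/IR/PassInstrumentation.h"
  #include "llvm/Support/raw_ostream.h"
  using namespace llvm;

  // Print every pass that is actually about to run, in execution order.
  void registerPrintPass(PassInstrumentationCallbacks &PIC) {
    PIC.registerBeforePassCallback([](StringRef PassID, Any /*IR*/) {
      errs() << "Running pass: " << PassID << "\n";
      return true; // never request a skip; this callback only prints
    });
  }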

The test fixes look messy. They include the following changes:
- Remove PassInstrumentationAnalysis
- Remove PassAdaptor
- If a PassAdaptor is for a real pass, the pass is added
- Passes reordered (into the correct order), related to PassAdaptor
- Add missing passes (due to Debuglogging not being passed down)

Reviewed By: asbirlea, aeubanks

Differential Revision: https://reviews.llvm.org/D84774
2020-07-30 10:07:57 -07:00
Yuanfang Chen
3c47ffec1f [NewPM] Move debugging log printing after PassInstrumentation before-pass-callbacks
For passes that get skipped, this is confusing because the log says `running pass`
but the pass is then skipped.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D82511
2020-06-25 10:03:25 -07:00
Alina Sbirlea
c25b0c2bff [NewPassManager] Add assertions when getting stateful cached analysis.
Summary:
Analyses that are stateful should not be retrieved through a proxy from
an outer IR unit, as these analyses are only invalidated at the end of
the inner IR unit manager.
This patch disallows getting the outer manager and provides an API to
get a cached analysis through the proxy. If the analysis is not
stateless, the call to getCachedResult will assert.
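
A minimal sketch of the intended usage from a function pass (the pass name here is hypothetical; ProfileSummaryAnalysis is just a convenient, stateless module analysis to query). A stateful analysis would instead trip the new assertion:

  #include "llvm/Analysis/ProfileSummaryInfo.h"
  #include "llvm/IR/PassManager.h"
  using namespace llvm;

  struct QueryCachedSummary : PassInfoMixin<QueryCachedSummary> {
    PreservedAnalyses run(Function &F, FunctionAnalysisManager &FAM) {
      auto &MAMProxy = FAM.getResult<ModuleAnalysisManagerFunctionProxy>(F);
      // Only a *cached* result may be fetched through the proxy; handing out
      // the outer analysis manager itself is now disallowed.
      if (ProfileSummaryInfo *PSI =
              MAMProxy.getCachedResult<ProfileSummaryAnalysis>(*F.getParent()))
        (void)PSI; // use the cached summary here
      return PreservedAnalyses::all();
    }
  };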

Reviewers: chandlerc

Subscribers: mehdi_amini, eraman, hiraditya, zzheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72893
2020-05-13 12:38:38 -07:00
Mircea Trofin
578cdb641b [llvm][NFC][CallSite] Remove Implementation uses of CallSite
Reviewers: dblaikie, davidxl, craig.topper

Subscribers: arsenm, dschuff, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78142
2020-04-14 14:49:47 -07:00
Andrew Monshizadeh
cc7164597b Extend TimeTrace to LLVM's new pass manager
With the addition of time tracing to LLD, it made sense to include coverage
for LLVM's various passes as well. Doing so ensures that ThinLTO is also
covered by a time trace.
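
Under the hood this uses LLVM's TimeProfiler support; roughly, each pass run is wrapped in a scope like the sketch below (illustrative only, the patch does this inside the pass-manager machinery rather than in user code):

  #include "llvm/Support/TimeProfiler.h"
  using namespace llvm;

  void runOnePassTimed(StringRef PassName) {
    // Emits a time-trace event covering this scope when time tracing has been
    // enabled via timeTraceProfilerInitialize() (e.g. -ftime-trace).
    TimeTraceScope Scope("RunPass", PassName);
    // ... run the pass here ...
  }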

Before:
{F11333974}

After:
{F11333928}

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D74516
2020-03-06 14:45:19 -08:00
Reid Kleckner
9369556310 Add PassManagerImpl.h to hide implementation details
ClangBuildAnalyzer results show that a lot of time is spent
instantiating AnalysisManager::getResultImpl across the code base:

**** Templates that took longest to instantiate:
 50445 ms: llvm::AnalysisManager<llvm::Function>::getResultImpl (412 times, avg 122 ms)
 47797 ms: llvm::AnalysisManager<llvm::Function>::getResult<llvm::TargetLibraryAnalysis> (389 times, avg 122 ms)
 46894 ms: std::tie<const unsigned long long, const bool> (2452 times, avg 19 ms)
 43851 ms: llvm::BumpPtrAllocatorImpl<llvm::MallocAllocator, 4096, 4096>::Allocate (3228 times, avg 13 ms)
 33911 ms: std::tie<const unsigned int, const unsigned int, const unsigned int, const unsigned int> (897 times, avg 37 ms)
 33854 ms: std::tie<const unsigned long long, const unsigned long long> (1897 times, avg 17 ms)
 27886 ms: std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string (11156 times, avg 2 ms)

I mentioned this result to @chandlerc, and he suggested this direction.

AnalysisManager is already explicitly instantiated, and getResultImpl
doesn't need to be inlined. Move the definition to an Impl header, and
include that header in files that explicitly instantiate
AnalysisManager. There are only four (real) IR units:
- function
- module
- loop
- cgscc
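
The pattern, sketched generically rather than with the actual AnalysisManager code: keep the hot template definition out of the widely included header and pull it in only where the explicit instantiations live.

  // Thing.h: widely included; declares but does not define the hot member.
  template <typename T> class Thing {
  public:
    T *getResultImpl(); // defined only in ThingImpl.h
  };
  extern template class Thing<int>; // suppress implicit instantiation

  // ThingImpl.h: included only by files that explicitly instantiate Thing.
  template <typename T> T *Thing<T>::getResultImpl() { return nullptr; }

  // Thing.cpp: the single explicit instantiation per "IR unit".
  template class Thing<int>;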

Looking at a specific transform (ArgumentPromotion.cpp), here are three
compilations before & after this change:

BEFORE:
$ for i in $(seq 3) ; do ./ccit.bat ; done
peak memory: 258.15MB
real: 0m6.297s
peak memory: 257.54MB
real: 0m5.906s
peak memory: 257.47MB
real: 0m6.219s

AFTER:
$ for i in $(seq 3) ; do ./ccit.bat ; done
peak memory: 235.35MB
real: 0m5.454s
peak memory: 234.72MB
real: 0m5.235s
peak memory: 234.39MB
real: 0m5.469s

The ~20 MB of memory saved seems real, and the time improvement appears to
be real as well.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D73817
2020-02-03 11:15:55 -08:00
Martin Storsjö
61497b26e1 [PM][CGSCC] Add parentheses to avoid a GCC warning. NFC.
This avoids a warning about "suggest parentheses around && within ||".
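
Illustrative only (not the actual expression in the CGSCC code); this is the shape of change that silences the warning:

  bool shouldProcess(bool IsCall, bool TargetDefined, bool ForceVisit) {
    // GCC: "suggest parentheses around '&&' within '||'"
    // return IsCall && TargetDefined || ForceVisit;
    return (IsCall && TargetDefined) || ForceVisit; // same logic, explicit intent
  }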
2020-02-03 09:55:02 +02:00
Johannes Doerfert
f80ba61e01 [PM][CGSCC] Add a helper to update the call graph from SCC passes
With this patch new trivial edges can be added to an SCC in a CGSCC
pass via the updateCGAndAnalysisManagerForCGSCCPass method. It shares
almost all the code with the existing
updateCGAndAnalysisManagerForFunctionPass method but it implements the
first step towards the TODOs.

This was initially part of D70927.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D72025
2020-02-02 23:32:18 -06:00
Benjamin Kramer
8b30fad250 Revert "[CallGraph] Refine call graph for indirect calls with !callees metadata"
This reverts commit r369025. Crashes clang, test case is on the mailing
list.

llvm-svn: 369096
2019-08-16 10:59:18 +00:00
Mark Lacey
5db11d7594 [CallGraph] Refine call graph for indirect calls with !callees metadata
For indirect call sites having a small set of possible callees,
!callees metadata can be used to indicate what those callees are.
This patch updates the call graph and lazy call graph analyses so
that they consider this metadata when encountering call sites. For
the call graph, it adds a new external call graph node to the graph
for each unique !callees metadata node. A call graph edge connects
an indirect call site with the external node associated with the
!callees metadata that is attached to it. And there is an edge from
this external node to each of the callees indicated by the metadata.
Similarly, for the lazy call graph, the patch adds Ref edges from a
caller to the possible callees indicated by the metadata.
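
For reference, a consumer can read that metadata roughly as follows (a sketch, not necessarily the exact code in this patch); MD_callees is the fixed metadata kind for !callees:

  #include "llvm/ADT/SmallVector.h"
  #include "llvm/IR/Function.h"
  #include "llvm/IR/InstrTypes.h"
  #include "llvm/IR/LLVMContext.h"
  #include "llvm/IR/Metadata.h"
  using namespace llvm;

  // Collect the possible callees named by !callees metadata on an indirect call.
  static void collectPossibleCallees(const CallBase &CB,
                                     SmallVectorImpl<Function *> &Callees) {
    if (MDNode *MD = CB.getMetadata(LLVMContext::MD_callees))
      for (const MDOperand &Op : MD->operands())
        if (Function *F = mdconst::extract_or_null<Function>(Op))
          Callees.push_back(F);
  }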

The primary purpose of the patch is to facilitate iterating over the
functions in a module such that all of the callees indicated by a
given !callees metadata node will be visited prior to the functions
containing call sites annotated by that node. This property is
required by optimizations performing a bottom-up traversal of the
SCC DAG. For example, the inliner can be made to inline through an
indirect call. If the call site is annotated with !callees metadata,
this patch ensures that the inliner will have visited all of the
callees prior to the caller, allowing it to reliably compute the
cost of inlining one or more of the potential callees.

Original patch by @mssimpso. I've made some small changes to get it
to apply, build, and pass tests on the top of tree, as well as
some minor tweaks to formatting and functionality.

Subscribers: mehdi_amini, hiraditya, llvm-commits, mssimpso

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D39339

llvm-svn: 369025
2019-08-15 17:47:53 +00:00
Chandler Carruth
7cfefe1cbe [NewPM] Fix a nasty bug with analysis invalidation in the new PM.
The issue here is that we actually allow CGSCC passes to mutate IR (and
therefore invalidate analyses) outside of the current SCC. At a minimum,
we need to support mutating parent and ancestor SCCs to support the
ArgumentPromotion pass which rewrites all calls to a function.

However, the analysis invalidation infrastructure is heavily based
around not needing to invalidate the same IR-unit at multiple levels.
With Loop passes for example, they don't invalidate other Loops. So we
need to customize how we handle CGSCC invalidation. Doing this without
gratuitously re-running analyses is even harder. I've avoided most of
these by using an out-of-band preserved set to accumulate the cross-SCC
invalidation, but it still isn't perfect in the case where we re-visit the
same SCC repeatedly as it comes back off the worklist. It is unclear how
important this use case really is, but I wanted to call it out.

Another wrinkle is that in order for this to successfully propagate to
function analyses, we have to make sure we have a proxy from the SCC to
the Function level. That requires pre-creating the necessary proxy.

The motivating test case now works cleanly and is added for
ArgumentPromotion.

Thanks for the review from Philip and Wei!

Differential Revision: https://reviews.llvm.org/D59869

llvm-svn: 357137
2019-03-28 00:51:36 +00:00
Chandler Carruth
ae65e281f3 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636
2019-01-19 08:50:56 +00:00
Fedor Sergeev
5a4198704c [NewPM] fixing asserts on deleted loop in -print-after-all
IR-printing AfterPass instrumentation might be called on a loop
that has just been invalidated. We should skip printing it to
avoid spurious asserts.

Reviewed By: chandlerc, philip.pfaffe
Differential Revision: https://reviews.llvm.org/D54740

llvm-svn: 348887
2018-12-11 19:05:35 +00:00
Fedor Sergeev
406a5f5b60 [New PM] Introducing PassInstrumentation framework
Pass Execution Instrumentation interface enables customizable instrumentation
of pass execution, as per "RFC: Pass Execution Instrumentation interface"
posted 06/07/2018 on llvm-dev@

The intent is to provide a common machinery to implement all
the pass-execution-debugging features like print-before/after,
opt-bisect, time-passes etc.

Here we get a basic implementation consisting of:
* PassInstrumentationCallbacks class that handles registration of callbacks
  and access to them.

* PassInstrumentation class that handles instrumentation-point interfaces
  that call into PassInstrumentationCallbacks.

* Callbacks accept StringRef which is just a name of the Pass right now.
  There were some ideas to pass an opaque wrapper for the pointer to pass instance,
  however it appears that pointer does not actually identify the instance
  (adaptors and managers might have the same address with the pass they govern).
  Hence it was decided to go simple for now and then later decide on what the proper
  mental model of identifying a "pass in a phase of pipeline" is.

* Callbacks accept llvm::Any serving as a wrapper for const IRUnit*, to remove direct dependencies
  on different IRUnits (e.g. Analyses).

* PassInstrumentationAnalysis is explicitly requested from the PassManager through
  the usual AnalysisManager::getResult. All pass managers were updated to run it
  to get a PassInstrumentation object for instrumentation calls (see the sketch
  after this list).

* Using tuples/index_sequence getAnalysisResult helper to extract generic AnalysisManager's extra
  args out of a generic PassManager's extra args. This is the only way I was able to explicitly
  run getResult for PassInstrumentationAnalysis out of a generic code like PassManager::run or
  RepeatedPass::run.
  TODO: Upon lengthy discussions we agreed to accept this as an initial implementation
  and then get rid of getAnalysisResult by improving RepeatedPass implementation.

* PassBuilder takes PassInstrumentationCallbacks object to pass it further into
  PassInstrumentationAnalysis. Callbacks registration should be performed directly
  through PassInstrumentationCallbacks.

* new-pm tests updated to account for PassInstrumentationAnalysis being run

* Added PassInstrumentation tests to PassBuilderCallbacks unit tests.
  Other unit tests updated with registration of the now-required PassInstrumentationAnalysis.

  Made the getName helper return std::string (instead of StringRef initially) to fix
  asan buildbot failures on CGSCC tests.
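
A schematic sketch of how a pass manager consults the instrumentation around a single function pass, assuming the interfaces as introduced here (the after-pass hook has since grown additional parameters in newer LLVM):

  #include "llvm/IR/PassInstrumentation.h"
  #include "llvm/IR/PassManager.h"
  using namespace llvm;

  template <typename PassT>
  PreservedAnalyses runOneFunctionPass(PassT &P, Function &F,
                                       FunctionAnalysisManager &FAM) {
    PassInstrumentation PI = FAM.getResult<PassInstrumentationAnalysis>(F);
    if (!PI.runBeforePass(P, F))
      return PreservedAnalyses::all(); // instrumentation asked to skip the pass
    PreservedAnalyses PA = P.run(F, FAM);
    PI.runAfterPass(P, F); // newer LLVM also passes the PreservedAnalyses here
    return PA;
  }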

Reviewers: chandlerc, philip.pfaffe
Differential Revision: https://reviews.llvm.org/D47858

llvm-svn: 342664
2018-09-20 17:08:45 +00:00
Eric Christopher
7b8d4f4b37 Temporarily Revert "[New PM] Introducing PassInstrumentation framework"
as it was causing failures in the asan buildbot.

This reverts commit r342597.

llvm-svn: 342616
2018-09-20 05:16:29 +00:00
Fedor Sergeev
817b214094 [New PM] Introducing PassInstrumentation framework
Pass Execution Instrumentation interface enables customizable instrumentation
of pass execution, as per "RFC: Pass Execution Instrumentation interface"
posted 06/07/2018 on llvm-dev@

The intent is to provide a common machinery to implement all
the pass-execution-debugging features like print-before/after,
opt-bisect, time-passes etc.

Here we get a basic implementation consisting of:
* PassInstrumentationCallbacks class that handles registration of callbacks
  and access to them.

* PassInstrumentation class that handles instrumentation-point interfaces
  that call into PassInstrumentationCallbacks.

* Callbacks accept StringRef which is just a name of the Pass right now.
  There were some ideas to pass an opaque wrapper for the pointer to pass instance,
  however it appears that pointer does not actually identify the instance
  (adaptors and managers might have the same address with the pass they govern).
  Hence it was decided to go simple for now and then later decide on what the proper
  mental model of identifying a "pass in a phase of pipeline" is.

* Callbacks accept llvm::Any serving as a wrapper for const IRUnit*, to remove direct dependencies
  on different IRUnits (e.g. Analyses).

* PassInstrumentationAnalysis analysis is explicitly requested from PassManager through
  usual AnalysisManager::getResult. All pass managers were updated to run that
  to get PassInstrumentation object for instrumentation calls.

* Using tuples/index_sequence getAnalysisResult helper to extract generic AnalysisManager's extra
  args out of a generic PassManager's extra args. This is the only way I was able to explicitly
  run getResult for PassInstrumentationAnalysis out of a generic code like PassManager::run or
  RepeatedPass::run.
  TODO: Upon lengthy discussions we agreed to accept this as an initial implementation
  and then get rid of getAnalysisResult by improving RepeatedPass implementation.

* PassBuilder takes PassInstrumentationCallbacks object to pass it further into
  PassInstrumentationAnalysis. Callbacks registration should be performed directly
  through PassInstrumentationCallbacks.

* new-pm tests updated to account for PassInstrumentationAnalysis being run

* Added PassInstrumentation tests to PassBuilderCallbacks unit tests.
  Other unit tests updated with registration of the now-required PassInstrumentationAnalysis.

Reviewers: chandlerc, philip.pfaffe
Differential Revision: https://reviews.llvm.org/D47858

llvm-svn: 342597
2018-09-19 22:42:57 +00:00
Fedor Sergeev
a39d896f9f Revert rL342544: [New PM] Introducing PassInstrumentation framework
A bunch of bots fail to compile unittests. Reverting.

llvm-svn: 342552
2018-09-19 14:54:48 +00:00
Fedor Sergeev
ee5fd89f4a [New PM] Introducing PassInstrumentation framework
Summary:
Pass Execution Instrumentation interface enables customizable instrumentation
of pass execution, as per "RFC: Pass Execution Instrumentation interface"
posted 06/07/2018 on llvm-dev@

The intent is to provide a common machinery to implement all
the pass-execution-debugging features like print-before/after,
opt-bisect, time-passes etc.

Here we get a basic implementation consisting of:
* PassInstrumentationCallbacks class that handles registration of callbacks
  and access to them.

* PassInstrumentation class that handles instrumentation-point interfaces
  that call into PassInstrumentationCallbacks.

* Callbacks accept StringRef which is just a name of the Pass right now.
  There were some ideas to pass an opaque wrapper for the pointer to pass instance,
  however it appears that pointer does not actually identify the instance
  (adaptors and managers might have the same address with the pass they govern).
  Hence it was decided to go simple for now and then later decide on what the proper
  mental model of identifying a "pass in a phase of pipeline" is.

* Callbacks accept llvm::Any serving as a wrapper for const IRUnit*, to remove direct dependencies
  on different IRUnits (e.g. Analyses).

* PassInstrumentationAnalysis analysis is explicitly requested from PassManager through
  usual AnalysisManager::getResult. All pass managers were updated to run that
  to get PassInstrumentation object for instrumentation calls.

* Using tuples/index_sequence getAnalysisResult helper to extract generic AnalysisManager's extra
  args out of a generic PassManager's extra args. This is the only way I was able to explicitly
  run getResult for PassInstrumentationAnalysis out of a generic code like PassManager::run or
  RepeatedPass::run.
  TODO: Upon lengthy discussions we agreed to accept this as an initial implementation
  and then get rid of getAnalysisResult by improving RepeatedPass implementation.

* PassBuilder takes PassInstrumentationCallbacks object to pass it further into
  PassInstrumentationAnalysis. Callbacks registration should be performed directly
  through PassInstrumentationCallbacks.

* new-pm tests updated to account for PassInstrumentationAnalysis being run

* Added PassInstrumentation tests to PassBuilderCallbacks unit tests.
  Other unit tests updated with registration of the now-required PassInstrumentationAnalysis.

Reviewers: chandlerc, philip.pfaffe
Differential Revision: https://reviews.llvm.org/D47858

llvm-svn: 342544
2018-09-19 12:25:52 +00:00
Nicola Zaghen
9667127c14 Rename DEBUG macro to LLVM_DEBUG.
The DEBUG() macro is very generic so it might clash with other projects.
The renaming was done as follows:
- git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g'
- git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM
- Manual change to APInt
- Manually change DOCS as the regex doesn't match it.

In the transition period the DEBUG() macro is still present and aliased
to the LLVM_DEBUG() one.
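
For reference, post-rename usage looks like this (minimal sketch):

  #include "llvm/Support/Debug.h"
  #include "llvm/Support/raw_ostream.h"

  #define DEBUG_TYPE "my-pass"

  void frobnicate(int N) {
    // Printed only in builds with assertions, and only when -debug or
    // -debug-only=my-pass is passed to the tool.
    LLVM_DEBUG(llvm::dbgs() << "frobnicating " << N << " widgets\n");
  }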

Differential Revision: https://reviews.llvm.org/D43624

llvm-svn: 332240
2018-05-14 12:53:11 +00:00
Vedant Kumar
6758a0cd13 Fixed spelling mistake in comments of LLVM Analysis passes
Patch by Reshabh Sharma!

Differential Revision: https://reviews.llvm.org/D43861

llvm-svn: 326352
2018-02-28 19:08:52 +00:00
Sanjoy Das
3115e502f7 Use a BumpPtrAllocator for Loop objects
Summary:
And now that we no longer have to explicitly free() the Loop instances, we can
(with more ease) use the destructor of LoopBase to do what LoopBase::clear() was
doing.
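
A generic sketch of the allocation pattern (not the LoopInfo code itself): objects are placement-new'ed into the allocator and the memory is reclaimed in bulk.

  #include "llvm/Support/Allocator.h"

  struct Widget {
    int Id;
    explicit Widget(int Id) : Id(Id) {}
  };

  void demo() {
    llvm::BumpPtrAllocator Alloc;
    // No per-object free(); destruction (if needed) is explicit, and the raw
    // memory is released when Alloc is destroyed or Reset().
    Widget *W = new (Alloc.Allocate<Widget>()) Widget(42);
    W->~Widget();
  }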

Reviewers: chandlerc

Subscribers: mehdi_amini, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D38201

llvm-svn: 314375
2017-09-28 02:45:42 +00:00
Chandler Carruth
9586106cf5 [PM/CGSCC] Teach the CGSCC pass manager components to gracefully handle
invalidated SCCs even when we do not have an updated SCC to redirect
towards.

This comes up in a fairly subtle and surprising circumstance: we need to
have a connected but internal node in the call graph which later becomes
a disconnected island, and then gets deleted. All of this needs to
happen mid-CGSCC walk. Because it is disconnected, we have no way of
computing a new "current" SCC when it gets deleted. Instead, we need to
explicitly check for a deleted "current" SCC and bail out of the current
CGSCC step. This will bubble all the way up to the post-order walk and
then resume correctly.

I've included minimal tests for this bug. The specific behavior
matches something we've seen in the wild with the new PM combined with
ThinLTO and sample PGO, but I've not yet confirmed whether this is the
only issue there.

llvm-svn: 313242
2017-09-14 08:33:57 +00:00
Eugene Zelenko
f25fa567b0 [Analysis] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes. Also affected in files (NFC).
llvm-svn: 312289
2017-08-31 21:56:16 +00:00
Chandler Carruth
10f8380767 [PM] Switch the CGSCC debug messages to use the standard LLVM debug
printing techniques with a DEBUG_TYPE controlling them.

It was a mistake to start re-purposing the pass manager `DebugLogging`
variable for generic debug printing -- those logs are intended to be
very minimal and primarily used for testing. More detailed and
comprehensive logging doesn't make sense there (it would only make for
brittle tests).

Moreover, we kept forgetting to propagate the `DebugLogging` variable to
various places making it also ineffective and/or unavailable. Switching
to `DEBUG_TYPE` makes this a non-issue.

llvm-svn: 310695
2017-08-11 05:47:13 +00:00
Chandler Carruth
d7fd660b9a [LCG] Switch one of the update methods for the LazyCallGraph to support
limited batch updates.

Specifically, allow removing multiple reference edges starting from
a common source node. There are a few constraints that play into
supporting this form of batching:

1) The way updates occur during the CGSCC walk, about the most we can
   functionally batch together are those with a common source node. This
   also makes the batching simpler to implement, so it seems
   a worthwhile restriction.
2) The far and away hottest function for large C++ files I measured
   (generated code for protocol buffers) showed a huge amount of time
   was spent removing ref edges specifically, so it seems worth focusing
   there.
3) The algorithm for removing ref edges is very amenable to this
   restricted batching. There are just both API and implementation
   special casing for the non-batch case that gets in the way. Once
   removed, supporting batches is nearly trivial.

This does modify the API in an interesting way -- now, we only preserve
the target RefSCC when the RefSCC structure is unchanged. In the face of
any splits, we create brand new RefSCC objects. However, all of the
users I could find were OK with this. Only the unittest needed
interesting updates here.

How much does batching these updates help? I instrumented the compiler
when run over a very large generated source file for a protocol buffer
and found that the majority of updates are intrinsically updating one
function at a time. However, nearly 40% of the total ref edges removed
are removed as part of a batch of removals greater than one, so these
are the cases batching can help with.

When compiling the IR for this file with 'opt' and 'O3', this patch
reduces the total time by 8-9%.

Differential Revision: https://reviews.llvm.org/D36352

llvm-svn: 310450
2017-08-09 09:05:27 +00:00
Chandler Carruth
ef4f76c80e [PM] Fix a likely more critical infloop bug in the CGSCC pass manager.
This was just a bad oversight on my part. The code in question should
never have worked without this fix. But it turns out, there are
relatively few places that involve libfunctions that participate in
a single SCC, and unless they do, this happens to not matter.

The effect of not having this correct is that each time through this
routine, the edge from write_wrapper to write was toggled between a call
edge and a ref edge. First time through, it becomes a demoted call edge
and is turned into a ref edge. Next time it is a promoted call edge from
a ref edge. On, and on it goes forever.

I've added the asserts which should have always been here to catch silly
mistakes like this in the future as well as a test case that will
actually infloop without the fix.

The other (much scarier) infinite-inlining issue I think didn't actually
occur in practice, and I simply misdiagnosed this minor issue as that
much more scary issue. The other issue *is* still a real issue, but I'm
somewhat relieved that so far it hasn't happened in real-world code
yet...

llvm-svn: 310342
2017-08-08 10:13:23 +00:00
Chandler Carruth
7c07404bc3 [PM] Add a comment clarifying what a particular predicate is doing.
This came up as a point of confusion while working on a fundamental
problem with the combination of CGSCC iteration and the inliner.

llvm-svn: 309662
2017-08-01 06:40:11 +00:00
Chandler Carruth
099fbc1e8e [PM/LCG] Teach the LazyCallGraph to maintain reference edges from every
function to every defined function known to LLVM as a library function.

LLVM can introduce calls to these functions either by replacing other
library calls or by recognizing patterns (such as memset_pattern or
vector math patterns) and replacing those with calls. When these library
functions are actually defined in the module, we need to have reference
edges to them initially so that we visit them during the CGSCC walk in
the right order and can effectively rebuild the call graph afterward.

This was discovered when building code with Fortify enabled as that is
a common case of both inline definitions of library calls and
simplifications of code into calling them.

This can in extreme cases of LTO-ing with libc introduce *many* more
reference edges. I discussed a bunch of different options with folks but
all of them are unsatisfying. They either make the graph operations
substantially more complex even when there are *no* defined libfuncs, or
they introduce some other complexity into the callgraph. So this patch
goes with the simplest possible solution of actual synthetic reference
edges. If this proves to be a memory problem, I'm happy to implement one
of the clever techniques to save memory here.

llvm-svn: 308088
2017-07-15 08:08:19 +00:00
Chandler Carruth
4a89b23644 [PM] Fix a silly bug in my recent update to the CG update logic.
I used the wrong variable to update. This was even covered by a unittest
I wrote, and the comments for the unittest were correct (if confusing)
but the test itself just matched the buggy behavior. =[

llvm-svn: 307764
2017-07-12 09:08:11 +00:00
Chandler Carruth
fda9b703c9 [PM] Fix a nasty bug in the new PM where we failed to properly
invalidation of analyses when merging SCCs.

While I've added a bunch of testing of this, it takes something much
more like the inliner to really trigger this as you need to have
partially-analyzed SCCs with updates at just the right time. So I've
added a direct test for this using the inliner and verifying the
domtree. Without the changes here, this test ends up finding a stale
dominator tree.

However, to handle this properly, we need to invalidate analyses
*before* merging the SCCs. After talking to Philip and Sanjoy about this
they convinced me this was the right approach. To do this, we need
a callback mechanism when merging SCCs so we can observe the cycle that
will be merged before the merge happens. This API update ended up being
surprisingly easy.

With this commit, the new PM passes the test-suite again. It hadn't
since MemorySSA was enabled for EarlyCSE as that also will find this bug
very quickly.

llvm-svn: 307498
2017-07-09 13:45:11 +00:00
Chandler Carruth
267b806fd9 [PM] Add unittesting of the call graph update logic with complex
dependencies between analyses.

This uncovers even more issues with the proxies and the splitting apart
of SCCs which are fixed in this patch. I discovered this while trying to
add more rigorous testing for a change I'm making to the call graph
update invalidation logic.

llvm-svn: 307497
2017-07-09 13:16:55 +00:00
Chandler Carruth
e28f591d48 [PM] Finish implementing and fix a chain of bugs uncovered by testing
the invalidation propagation logic from an SCC to a Function.

I wrote the infrastructure to test this but didn't actually use it in
the unit test where it was designed to be used. =[ My bad. Once
I actually added it to the test case I discovered that it also hadn't
been properly implemented, so I've implemented it. The logic in the FAM
proxy for an SCC pass to propagate invalidation follows the same ideas
as the FAM proxy for a Module pass, but the implementation is a bit
different to reflect the fact that it is forwarding just for an SCC.

However, implementing this correctly uncovered a surprising "bug" (it
was conservatively correct but relatively very expensive) in how we
handle invalidation when splitting one SCC into multiple SCCs. We did an
eager invalidation when in reality we should be deferring invalidation
for the *current* SCC to the CGSCC pass manager and just invalidating the
newly constructed SCCs. Otherwise we end up invalidating too much too
soon. This was exposed by the inliner test case that I've updated. Now,
we invalidate *just* the split off '(test1_f)' SCC when doing the CG
update, and then the inliner finishes and invalidates the '(test1_g,
test1_h)' SCC's analyses. The first few attempts at fixing this hit
still more bugs, but all of those are covered by existing tests. For
example, the inliner should also preserve the FAM proxy to avoid
unnecessary invalidation, and this is safe because the CG update
routines it uses handle any necessary adjustments to the FAM proxy.

Finally, the unittests for the CGSCC pass manager needed a bunch of
updates where we weren't correctly preserving the FAM proxy because it
hadn't been fully implemented and failing to preserve it didn't matter.

Note that this doesn't yet fix the current crasher due to MemSSA finding
a stale dominator tree, but without this the fix to that crasher doesn't
really make any sense when testing because it relies on the proxy
behavior.

llvm-svn: 307487
2017-07-09 03:59:31 +00:00
Chandler Carruth
042041bdf3 [PM/LCG] Teach the LazyCallGraph how to replace a function without
disturbing the graph or having to update edges.

This is motivated by porting argument promotion to the new pass manager.
Because of how LLVM IR Function objects work, in order to change their
signature a new object needs to be created. This is efficient and
straightforward in the IR but previously was very hard to implement in
LCG. We could easily replace the function a node in the graph
represents. The challenging part is how to handle updating the edges in
the graph.

LCG previously used an edge to a raw function to represent a node that
had not yet been scanned for calls and references. This was the core
of its laziness. However, that model causes this kind of update to be
very hard:
1) The keys to lookup an edge need to be `Function*`s that would all
   need to be updated when we update the node.
2) There will be some unknown number of edges that haven't transitioned
   from `Function*` edges to `Node*` edges.

All of this complexity isn't necessary. Instead, we can always build
a node around any function, always pointing edges at it and always using
it as the key to lookup an edge. To maintain the laziness, we need to
sink the *edges* of a node into a secondary object and explicitly model
transitioning a node from empty to populated by scanning the function.
This design seems much cleaner in a number of ways, but importantly
there is now exactly *one* place where the `Function*` has to be
updated!

Some other cleanups that fall out of this include having something to
model the *entry* edges more accurately. Rather than hand rolling parts
of the node in the graph itself, we have an explicit `EdgeSequence`
object that gives us exactly the functionality needed. We also have
a consistent place to define the edge iterators and can use them for
both the entry edges and the internal edges of the graph.
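
Client code against the resulting API ends up looking roughly like this (a sketch; exact names may have shifted since):

  #include "llvm/Analysis/LazyCallGraph.h"
  using namespace llvm;

  // Visit the direct call targets of F, populating its node's edges on demand.
  void visitCallTargets(LazyCallGraph &CG, Function &F) {
    LazyCallGraph::Node &N = CG.get(F);                 // a node always exists
    LazyCallGraph::EdgeSequence &Edges = N.populate();  // scan F on first use
    for (LazyCallGraph::Edge &E : Edges)
      if (E.isCall())
        (void)E.getFunction(); // the callee Function; do something with it
  }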

The API used to model the separation between a node and its edges is
intentionally very thin as most clients are expected to deal with nodes
that have populated edges. We model this exactly as an optional does
with an additional method to populate the edges when that is
a reasonable thing for a client to do. This is based on API design
suggestions from Richard Smith and David Blaikie, credit goes to them
for helping pick how to model this without it being either too explicit
or too implicit.

The patch is somewhat noisy due to shifting around iterator types and
new syntax for walking the edges of a node, but most of the
functionality change is in the `Edge`, `EdgeSequence`, and `Node` types.

Differential Revision: https://reviews.llvm.org/D29577

llvm-svn: 294653
2017-02-09 23:24:13 +00:00
Chandler Carruth
b07ec0321e Revert r293017 and fix the actual underlying issue.
The patch committed in r293017, as discussed on the list, doesn't really
make sense but was causing an actual issue to go away.

The issue turns out to be that in one place the extra template arguments
were dropped from the OuterAnalysisManagerProxy. This in turn caused the
types used in one set of places to access the key to be completely
different from the types used in another set of places for both Loop and
CGSCC cases where there are extra arguments.

I have literally no idea how anything seemed to work with this bug in
place. It blows my mind. But it did except for mingw64 in a DLL build.

I've added a really handy static assert that helps ensure we don't break
this in the future. It immediately diagnoses the issue with a compile
failure and a very clear error message. Much better that staring at
backtraces on a build bot. =]

llvm-svn: 294267
2017-02-07 01:50:48 +00:00
Chandler Carruth
93759a2957 [PM/LCG] Remove the lazy RefSCC formation from the LazyCallGraph during
iteration.

The lazy formation of RefSCCs isn't really the most important part of
the laziness here -- that has to do with walking the functions
themselves -- and isn't essential to maintain. Originally, there were
incremental update algorithms that relied on updates happening
predominantly near the most recent RefSCC formed, but those have been
replaced with ones that have much tighter general case bounds at this
point. We do still perform asserts that only scale well due to this
incrementality, but those are easy to place behind EXPENSIVE_CHECKS.

Removing this simplifies the entire analysis by having a single up-front
step that builds all of the RefSCCs in a direct Tarjan walk. We can even
easily replace this with other or better algorithms at will and with
much less confusion now that there is no iterator-based incremental
logic involved. This removes a lot of complexity from LCG.
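
The resulting usage is a direct bottom-up walk, roughly as sketched below (names as they appear in today's tree):

  #include "llvm/Analysis/LazyCallGraph.h"
  using namespace llvm;

  // Post-order walk over RefSCCs, their SCCs, and the functions within.
  void walkBottomUp(LazyCallGraph &CG) {
    CG.buildRefSCCs(); // form all RefSCCs up front with a single Tarjan walk
    for (LazyCallGraph::RefSCC &RC : CG.postorder_ref_sccs())
      for (LazyCallGraph::SCC &C : RC)
        for (LazyCallGraph::Node &N : C)
          (void)N.getFunction(); // process each function bottom-up
  }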

Another advantage of moving in this direction is that it simplifies
testing the system substantially as we no longer have to worry about
observing and mutating the graph half-way through the RefSCC formation.

We still need a somewhat special iterator for RefSCCs because we want
the iterator to remain stable in the face of graph updates. However,
this now merely involves relative indexing to the current RefSCC's
position in the sequence which isn't too hard.

Differential Revision: https://reviews.llvm.org/D29381

llvm-svn: 294227
2017-02-06 19:38:06 +00:00
NAKAMURA Takumi
b8624b1643 Rewind instantiations of OuterAnalysisManagerProxy in r289317, r291651, and r291662.
I found that the root class should be instantiated explicitly for a variadic template to instantiate its static member.

This will fix failures in mingw DLL build.

llvm-svn: 293017
2017-01-25 04:26:29 +00:00
Chandler Carruth
90f7376682 [PM] Teach the CGSCC's CG update utility to more carefully invalidate
analyses when we're about to break apart an SCC.

We can't wait until after breaking apart the SCC to invalidate things:
1) Which SCC do we then invalidate? All of them?
2) Even if we invalidate all of them, a newly created SCC may not have
   a proxy that will convey the invalidation to functions!

Previously we only invalidated one of the SCCs and too late. This led to
stale analyses remaining in the cache. And because the caching strategy
actually works, they would get used and chaos would ensue.

Doing invalidation early is somewhat pessimizing though if we *know*
that the SCC structure won't change. So it turns out that the design to
make the mutation API force the caller to know the *kind* of mutation in
advance was indeed 100% correct and we didn't do enough of it. So this
change also splits two cases of switching a call edge to a ref edge into
two separate APIs so that callers can clearly test for this and take the
easy path without invalidating when appropriate. This is particularly
important in this case as we expect most inlines to be between functions
in separate SCCs and so the common case is that we don't have to so
aggressively invalidate analyses.

The LCG API change in turn needed some basic cleanups and better testing
in its unittest. No interesting functionality changed there other than
more coverage of the returned sequence of SCCs.

While this seems like an obvious improvement over the current state, I'd
like to revisit the core concept of invalidating within the CG-update
layer at all. I'm wondering if we would be better served forcing the
callers to handle the invalidation beforehand in the cases that they
can handle it. An interesting example is when we want to teach the
inliner to *update and preserve* analyses. But we can cross that bridge
when we get there.

With this patch, the new pass manager can build all of the LLVM test
suite at -O3 and everything passes. =D I haven't bootstrapped yet and
I'm sure there are still plenty of bugs, but this gives a nice baseline
so I'm going to increasingly focus on fleshing out the missing
functionality, especially the bits that are just turned off right now in
order to let us establish this baseline.

llvm-svn: 290664
2016-12-28 10:34:50 +00:00
Chandler Carruth
99ebb6e7dc [PM] Introduce the facilities for registering cross-IR-unit dependencies
that require deferred invalidation.

This handles the other real-world invalidation scenario that we have
cases of: a function analysis which caches references to a module
analysis. We currently do this in the AA aggregation layer and might
well do this in other places as well.

Since this is relatively rare, the technique is somewhat more cumbersome.
Analyses need to register themselves when accessing the outer analysis
manager's proxy. This proxy is already necessarily present to allow
access to the outer IR unit's analyses. By registering here we can track
and trigger invalidation when that outer analysis goes away.
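
In today's tree the registration looks roughly like this, from inside a function analysis that caches a pointer to a module-level result (the function analysis here is illustrative; ProfileSummaryAnalysis is just a convenient real module analysis):

  #include "llvm/Analysis/ProfileSummaryInfo.h"
  #include "llvm/IR/PassManager.h"
  using namespace llvm;

  struct UsesProfileSummary : AnalysisInfoMixin<UsesProfileSummary> {
    struct Result { ProfileSummaryInfo *PSI = nullptr; };
    Result run(Function &F, FunctionAnalysisManager &FAM) {
      auto &OuterProxy = FAM.getResult<ModuleAnalysisManagerFunctionProxy>(F);
      // Deferred invalidation: if ProfileSummaryAnalysis is invalidated in the
      // module manager, this analysis's cached results must be dropped too.
      OuterProxy.registerOuterAnalysisInvalidation<ProfileSummaryAnalysis,
                                                   UsesProfileSummary>();
      return {OuterProxy.getCachedResult<ProfileSummaryAnalysis>(*F.getParent())};
    }
    static AnalysisKey Key;
  };
  AnalysisKey UsesProfileSummary::Key;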

To make this work we need to enhance the PreservedAnalyses
infrastructure to support a (slightly) more explicit model for "sets" of
analyses, and allow abandoning a single specific analyses even when
a set covering that analysis is preserved. That allows us to describe
the scenario of preserving all Function analyses *except* for the one
where deferred invalidation has triggered.

We also need to teach the invalidator API to support direct ID calls
instead of always going through a template to dispatch so that we can
just record the ID mapping.

I've introduced testing of all of this both for simple module<->function
cases as well as for more complex cases involving a CGSCC layer.

Much like the previous patch I've not tried to fully update the loop
pass management layer because that layer is due to be heavily reworked
to use similar techniques to the CGSCC to handle updates. As that
happens, we'll have a better testing basis for adding support like this.

Many thanks to both Justin and Sean for the extensive reviews on this to
help bring the API design and documentation into a better state.

Differential Revision: https://reviews.llvm.org/D27198

llvm-svn: 290594
2016-12-27 08:40:39 +00:00
Chandler Carruth
816d7841f8 [PM] Remove now-dead extern template and explicit instantiation
declarations.

We're using a custom class here instead of the helper template, these
bits just didn't get deleted when the other bits did get deleted. This
was found by a really nice MSVC warning about explicitly instantiating
a template where some member functions aren't defined and thus can't be
instantiatied.

llvm-svn: 290327
2016-12-22 07:14:33 +00:00
Chandler Carruth
c44a1a0c4a [PM] Rework a loop in the CGSCC update logic to be more conservative and
clear. The current RefSCC can occur in exactly one position so we should
just enforce that and leverage the property rather than checking for it
anywhere.

This addresses review comments made on another patch.

llvm-svn: 290162
2016-12-20 03:32:17 +00:00
Chandler Carruth
4fc787e70d [PM] Support invalidation of inner analysis managers from a pass over the outer IR unit.
Summary:
This never really got implemented, and was very hard to test before
a lot of the refactoring changes to make things more robust. But now we
can test it thoroughly and cleanly, especially at the CGSCC level.

The core idea is that when an inner analysis manager proxy receives the
invalidation event for the outer IR unit, it needs to walk the inner IR
units and propagate it to the inner analysis manager for each of those
units. For example, each function in the SCC needs to get an
invalidation event when the SCC gets one.

The function / module interaction is somewhat boring here. This really
becomes interesting in the face of analysis-backed IR units. This patch
effectively handles all of the CGSCC layer's needs -- both invalidating
SCC analysis and invalidating function analysis when an SCC gets
invalidated.

However, this second aspect doesn't really handle the
LoopAnalysisManager well at this point. That one will need some change
of design in order to fully integrate, because unlike the call graph,
the entire function behind a LoopAnalysis's results can vanish out from
under us, and we won't even have a cached API to access. I'd like to try
to separate solving the loop problems into a subsequent patch though in
order to keep this more focused so I've adapted them to the API and
updated the tests that immediately fail, but I've not added the level of
testing and validation at that layer that I have at the CGSCC layer.

An important aspect of this change is that the proxy for the
FunctionAnalysisManager at the SCC pass layer doesn't work like the
other proxies for an inner IR unit as it doesn't directly manage the
FunctionAnalysisManager and invalidation or clearing of it. This would
create an ever worsening problem of dual ownership of this
responsibility, split between the module-level FAM proxy and this
SCC-level FAM proxy. Instead, this patch changes the SCC-level FAM proxy
to work in terms of the module-level proxy and defer to it to handle
much of the updates. It only does SCC-specific invalidation. This will
become more important in subsequent patches that support more complex
invalidation scenarios.

Reviewers: jlebar

Subscribers: mehdi_amini, mcrosier, mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D27197

llvm-svn: 289317
2016-12-10 06:34:44 +00:00
Chandler Carruth
126f8ff595 [PM] Basic cleanups to CGSCC update code, NFC.
Just using InstIterator, simpler loop structures, and making better use
of the visit callback infrastructure.

llvm-svn: 288790
2016-12-06 10:06:06 +00:00
Chandler Carruth
84780666b4 [PM] Extend the explicit 'invalidate' method API on analysis results to
accept an Invalidator that allows them to invalidate themselves if their
dependencies are in turn invalidated.

Rather than recording the dependency graph ahead of time when analysis
get results from other analyses, this simply lets each result trigger
the immediate invalidation of any analyses they actually depend on. They
do this in a way that has three nice properties:

1) They don't have to handle transitive dependencies because the
   infrastructure will recurse for them.
2) The invalidate methods are still called only once. We just
   dynamically discover the necessary topological ordering, everything
   is memoized nicely.
3) The infrastructure still provides a default implementation and can
   access it so that only analyses which have dependencies need to do
   anything custom.

To make this work at all, the invalidation logic also has to defer the
deletion of the result objects themselves so that they can remain alive
until we have collected the complete set of results to invalidate.
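
The resulting pattern for a result with a dependency looks roughly like this in today's tree (MyAnalysis is illustrative, and the preserved-set check shown was refined in follow-up commits):

  #include "llvm/IR/Dominators.h"
  #include "llvm/IR/PassManager.h"
  using namespace llvm;

  // Illustrative analysis whose result depends on the dominator tree.
  struct MyAnalysis : AnalysisInfoMixin<MyAnalysis> {
    struct Result {
      bool invalidate(Function &F, const PreservedAnalyses &PA,
                      FunctionAnalysisManager::Invalidator &Inv) {
        // Invalid if we are not preserved, or if our dependency is
        // invalidated; Inv recurses, so transitive deps are handled for us.
        auto PAC = PA.getChecker<MyAnalysis>();
        if (!PAC.preserved() && !PAC.preservedSet<AllAnalysesOn<Function>>())
          return true;
        return Inv.invalidate<DominatorTreeAnalysis>(F, PA);
      }
    };
    Result run(Function &F, FunctionAnalysisManager &FAM) {
      (void)FAM.getResult<DominatorTreeAnalysis>(F); // establish the dependency
      return Result();
    }
    static AnalysisKey Key;
  };
  AnalysisKey MyAnalysis::Key;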

A unittest is added here that has exactly the dependency pattern we are
concerned with. It hit the use-after-free described by Sean in much
detail in the long thread about analysis invalidation before this
change, and even in an intermediate form of this change where we failed
to defer the deletion of the result objects.

There is an important problem with doing dependency invalidation that
*isn't* solved here: we don't *enforce* that results correctly
invalidate all the analyses whose results they depend on.

I actually looked at what it would take to do that, and it isn't as hard
as I had thought but the complexity it introduces seems very likely to
outweigh the benefit. The technique would be to provide a base class for
an analysis result that would be populated with other results, and
automatically provide the invalidate method which immediately does the
correct thing. This approach has some nice pros IMO:
- Handles the case we care about and nothing else: only *results*
  that depend on other analyses trigger extra invalidation.
- Localized to the result rather than centralized in the analysis
  manager.
- Ties the storage of the reference to another result to the triggering
  of the invalidation of that analysis.
- Still supports extending invalidation in customized ways.

But the down sides here are:
- Very heavy-weight meta-programming is needed to provide this base
  class.
- Requires a pretty awful API for accessing the dependencies.

Ultimately, I fear it will not pull its weight. But we can re-evaluate
this at any point if we start discovering consistent problems where the
invalidation and dependencies get out of sync. It will fit as a clean
layer on top of the facilities in this patch that we can add if and when
we need it.

Note that I'm not really thrilled with the names for these APIs... The
name "Invalidator" seems ok but not great. The method name "invalidate"
also. In review some improvements were suggested, but they really need
*other* uses of these terms to be updated as well so I'm going to do
that in a follow-up commit.

I'm working on the actual fixes to various analyses that need to use
these, but I want to try to get tests for each of them so we don't
regress. And those changes are separable and obvious so once this goes
in I should be able to roll them out throughout LLVM.

Many thanks to Sean, Justin, and others for help reviewing here.

Differential Revision: https://reviews.llvm.org/D23738

llvm-svn: 288077
2016-11-28 22:04:31 +00:00
Chandler Carruth
9d50667842 [PM] Remove weird marking of invalidated analyses as "preserved".
This never made a lot of sense. They've been invalidated for one IR unit
but they aren't really preserved in any normal sense. It seemed like it
would be an elegant way of communicating to outer IR units that pass
managers and adaptors had already handled invalidation, but we've since
ended up adding sets that model this more clearly: we're now using
the 'AllAnalysesOn<IRUnitT>' set to handle cases where the trick of
"preserving" invalidated analyses didn't work.

This patch moves to rely on that technique exclusively and removes the
cumbersome API aspect of updating the preserved set when doing
invalidation. This in turn will simplify a *number* of upcoming patches.
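
For reference, the set-based mechanism referred to here is used like this (minimal sketch):

  #include "llvm/IR/PassManager.h"
  using namespace llvm;

  // A pass (or adaptor) that has already handled function-level invalidation
  // can communicate that by preserving the whole set explicitly.
  PreservedAnalyses preserveAllFunctionAnalyses() {
    PreservedAnalyses PA;
    PA.preserveSet<AllAnalysesOn<Function>>();
    return PA;
  }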

This has a side benefit of exposing a number of places where we were
failing to mark the 'AllAnalysesOn<IRUnitT>' set as preserved. This
patch fixes those, and with those fixes shouldn't change any observable
behavior.

llvm-svn: 288023
2016-11-28 10:42:21 +00:00
NAKAMURA Takumi
35c1fea391 Fixup r279618, instantiate *AnalysisManagerProxy<*AnalysisManager,LazyCallGraph::SCC>, instead of *AnalysisManagerProxy<*AnalysisManager,LazyCallGraph::SCC,LazyCallGraph&>, for PassID.
Or they were not instantiated as expected;

  llvm::InnerAnalysisManagerProxy<llvm::AnalysisManager<llvm::Function>, llvm::LazyCallGraph::SCC>::PassID
  llvm::InnerAnalysisManagerProxy<llvm::AnalysisManager<llvm::Function>, llvm::LazyCallGraph::SCC>::PassID

llvm-svn: 280105
2016-08-30 15:47:13 +00:00
Chandler Carruth
94942f5c00 [PM] Introduce basic update capabilities to the new PM's CGSCC pass
manager, including both plumbing and logic to handle function pass
updates.

There are three fundamentally tied changes here:
1) Plumbing *some* mechanism for updating the CGSCC pass manager as the
   CG changes while passes are running.
2) Changing the CGSCC pass manager infrastructure to have support for
   the underlying graph to mutate mid-pass run.
3) Actually updating the CG after function passes run.

I can separate them if necessary, but I think its really useful to have
them together as the needs of #3 drove #2, and that in turn drove #1.

The plumbing technique is to extend the "run" method signature with
extra arguments. We provide the call graph that intrinsically is
available as it is the basis of the pass manager's IR units, and an
output parameter that records the results of updating the call graph
during an SCC passes's run. Note that "...UpdateResult" isn't a *great*
name here... suggestions very welcome.
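
For reference, the extended signature as seen by a CGSCC pass looks like this (skeleton only; the body is illustrative):

  #include "llvm/Analysis/CGSCCPassManager.h"
  #include "llvm/Analysis/LazyCallGraph.h"
  using namespace llvm;

  struct MyCGSCCPass : PassInfoMixin<MyCGSCCPass> {
    PreservedAnalyses run(LazyCallGraph::SCC &C, CGSCCAnalysisManager &AM,
                          LazyCallGraph &CG, CGSCCUpdateResult &UR) {
      for (LazyCallGraph::Node &N : C)
        (void)N.getFunction(); // inspect or transform the SCC's functions
      // Any changes to the call graph must be reported back via CG and UR so
      // the pass manager can keep iterating on the updated (bottom) SCC.
      return PreservedAnalyses::all();
    }
  };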

I tried a pretty frustrating number of different data structures and such
for the innards of the update result. Every other one failed for one
reason or another. Sometimes I just couldn't keep the layers of
complexity right in my head. The thing that really worked was to just
directly provide access to the underlying structures used to walk the
call graph so that their updates could be informed by the *particular*
nature of the change to the graph.

The technique for how to make the pass management infrastructure cope
with mutating graphs was also something that took a really, really large
number of iterations to get to a place where I was happy. Here are some
of the considerations that drove the design:

- We operate at three levels within the infrastructure: RefSCC, SCC, and
  Node. In each case, we are working bottom up and so we want to
  continue to iterate on the "lowest" node as the graph changes. Look at
  how we iterate over nodes in an SCC running function passes as those
  function passes mutate the CG. We continue to iterate on the "lowest"
  SCC, which is the one that continues to contain the function just
  processed.

- The call graph structure re-uses SCCs (and RefSCCs) during mutation
  events for the *highest* entry in the resulting new subgraph, not the
  lowest. This means that it is necessary to continually update the
  current SCC or RefSCC as it shifts. This is really surprising and
  subtle, and took a long time for me to work out. I actually tried
  changing the call graph to provide the opposite behavior, and it
  breaks *EVERYTHING*. The graph update algorithms are really deeply
  tied to this particular pattern.

- When SCCs or RefSCCs are split apart and refined and we continually
  re-pin our processing to the bottom one in the subgraph, we need to
  enqueue the newly formed SCCs and RefSCCs for subsequent processing.
  Queuing them presents a few challenges:
  1) SCCs and RefSCCs use wildly different iteration strategies at
     a high level. We end up needing to converge them on worklist
     approaches that can be extended in order to be able to handle the
     mutations.
  2) The order of the enqueuing need to remain bottom-up post-order so
     that we don't get surprising order of visitation for things like
     the inliner.
  3) We need the worklists to have set semantics so we don't duplicate
     things endlessly. We don't need a *persistent* set though because
     we always keep processing the bottom node!!!! This is super, super
     surprising to me and took a long time to convince myself this is
     correct, but I'm pretty sure it is... Once we sink down to the
     bottom node, we can't re-split out the same node in any way, and
     the postorder of the current queue is fixed and unchanging.
  4) We need to make sure that the "current" SCC or RefSCC actually gets
     enqueued here such that we re-visit it because we continue
     processing a *new*, *bottom* SCC/RefSCC.

- We also need the ability to *skip* SCCs and RefSCCs that get merged
  into a larger component. We even need the ability to skip *nodes* from
  an SCC that are no longer part of that SCC.

This led to the design you see in the patch which uses SetVector-based
worklists. The RefSCC worklist is always empty until an update occurs
and is just used to handle those RefSCCs created by updates as the
others don't even exist yet and are formed on-demand during the
bottom-up walk. The SCC worklist is pre-populated from the RefSCC, and
we push new SCCs onto it and blacklist existing SCCs on it to get the
desired processing.

We then *directly* update these when updating the call graph as I was
never able to find a satisfactory abstraction around the update
strategy.

Finally, we need to compute the updates for function passes. This is
mostly used as an initial customer of all the update mechanisms to drive
their design to at least cover some real set of use cases. There are
a bunch of interesting things that came out of doing this:

- It is really nice to do this a function at a time because that
  function is likely hot in the cache. This means we want even the
  function pass adaptor to support online updates to the call graph!

- To update the call graph after arbitrary function pass mutations is
  quite hard. We have to build a fairly comprehensive set of
  data structures and then process them. Fortunately, some of this code
  is related to the code for building the call graph in the first place.
  Unfortunately, very little of it makes any sense to share because the
  nature of what we're doing is so very different. I've factored out the
  one part that made sense at least.

- We need to transfer these updates into the various structures for the
  CGSCC pass manager. Once those were more sanely worked out, this
  became relatively easier. But some of those needs necessitated changes
  to the LazyCallGraph interface to make it significantly easier to
  extract the changed SCCs from an update operation.

- We also need to update the CGSCC analysis manager as the shape of the
  graph changes. When an SCC is merged away we need to clear analyses
  associated with it from the analysis manager which we didn't have
  support for in the analysis manager infrastructure. New SCCs are easy!
  But then we have the case that the original SCC has its shape changed
  but remains in the call graph. There we need to *invalidate* the
  analyses associated with it.

- We also need to invalidate analyses after we *finish* processing an
  SCC. But the analyses we need to invalidate here are *only those for
  the newly updated SCC*!!! Because we only continue processing the
  bottom SCC, if we split SCCs apart the original one gets invalidated
  once when its shape changes and is not processed farther so its
  analyses will be correct. It is the bottom SCC which continues being
  processed and needs to have the "normal" invalidation done based on
  the preserved analyses set.

All of this is mostly background and context for the changes here.

Many thanks to all the reviewers who helped here. Especially Sanjoy who
caught several interesting bugs in the graph algorithms, David, Sean,
and others who all helped with feedback.

Differential Revision: http://reviews.llvm.org/D21464

llvm-svn: 279618
2016-08-24 09:37:14 +00:00
Mehdi Amini
9ff867f98c [NFC] Header cleanup
Removed some unused headers, replaced some headers with forward class declarations.

Found using simple scripts like this one:
clear && ack --cpp -l '#include "llvm/ADT/IndexedMap.h"' | xargs grep -L 'IndexedMap[<]' | xargs grep -n --color=auto 'IndexedMap'

Patch by Eugene Kosov <claprix@yandex.ru>

Differential Revision: http://reviews.llvm.org/D19219

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266595
2016-04-18 09:17:29 +00:00
Chandler Carruth
0bb4ed7ba7 [PM] Implement the final conclusion as to how the analysis IDs should
work in the face of the limitations of DLLs and templated static
variables.

This requires passes that use the AnalysisBase mixin to provide a static
variable themselves. So as to keep their APIs clean, I've made these
private and befriended the CRTP base class (which is the common
practice).
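
In today's tree the mixin is spelled AnalysisInfoMixin and the static member is an AnalysisKey (at the time of this commit it was a static ID), but the shape is the same; a sketch:

  #include "llvm/IR/PassManager.h"
  using namespace llvm;

  class MyAnalysis : public AnalysisInfoMixin<MyAnalysis> {
    friend AnalysisInfoMixin<MyAnalysis>; // the mixin reads the private key
    static AnalysisKey Key;               // each analysis supplies its own storage

  public:
    struct Result { bool Interesting = false; };
    Result run(Function &F, FunctionAnalysisManager &) {
      return {F.hasFnAttribute(Attribute::NoInline)};
    }
  };
  // Must be defined in exactly one translation unit (the DLL-safe part).
  AnalysisKey MyAnalysis::Key;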

I've added documentation to AnalysisBase for why this is necessary and
at what point we can go back to the much simpler system.

This is clearly a better pattern than the extern template as it caught
*numerous* places where the template magic hadn't been applied and
things were "just working" but would eventually have broken
mysteriously.

llvm-svn: 263216
2016-03-11 10:22:49 +00:00
NAKAMURA Takumi
798b80e69c [PM] Appease mingw32's auto-import DLL build with minimal tweaks, with fix for clang.
char AnalysisBase::ID should be declared as extern and defined in one module.

llvm-svn: 262188
2016-02-28 17:17:00 +00:00
NAKAMURA Takumi
e7de739142 Revert r262185, "[PM] Appease mingw32's auto-import DLL build with minimal tweaks."
I'll rework soon.

llvm-svn: 262186
2016-02-28 16:54:06 +00:00