1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-20 19:42:54 +02:00
llvm-mirror/lib/Passes/PassRegistry.def

319 lines
15 KiB
Modula-2
Raw Normal View History

//===- PassRegistry.def - Registry of passes --------------------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
//
// This file is used as the registry of passes that are part of the core LLVM
// libraries. This file describes both transformation passes and analyses
// Analyses are registered while transformation passes have names registered
// that can be used when providing a textual pass pipeline.
//
//===----------------------------------------------------------------------===//
// NOTE: NO INCLUDE GUARD DESIRED!
#ifndef MODULE_ANALYSIS
#define MODULE_ANALYSIS(NAME, CREATE_PASS)
#endif
MODULE_ANALYSIS("callgraph", CallGraphAnalysis())
MODULE_ANALYSIS("lcg", LazyCallGraphAnalysis())
MODULE_ANALYSIS("module-summary", ModuleSummaryIndexAnalysis())
MODULE_ANALYSIS("no-op-module", NoOpModuleAnalysis())
MODULE_ANALYSIS("profile-summary", ProfileSummaryAnalysis())
MODULE_ANALYSIS("stack-safety", StackSafetyGlobalAnalysis())
MODULE_ANALYSIS("targetlibinfo", TargetLibraryAnalysis())
MODULE_ANALYSIS("verify", VerifierAnalysis())
[New PM] Introducing PassInstrumentation framework Pass Execution Instrumentation interface enables customizable instrumentation of pass execution, as per "RFC: Pass Execution Instrumentation interface" posted 06/07/2018 on llvm-dev@ The intent is to provide a common machinery to implement all the pass-execution-debugging features like print-before/after, opt-bisect, time-passes etc. Here we get a basic implementation consisting of: * PassInstrumentationCallbacks class that handles registration of callbacks and access to them. * PassInstrumentation class that handles instrumentation-point interfaces that call into PassInstrumentationCallbacks. * Callbacks accept StringRef which is just a name of the Pass right now. There were some ideas to pass an opaque wrapper for the pointer to pass instance, however it appears that pointer does not actually identify the instance (adaptors and managers might have the same address with the pass they govern). Hence it was decided to go simple for now and then later decide on what the proper mental model of identifying a "pass in a phase of pipeline" is. * Callbacks accept llvm::Any serving as a wrapper for const IRUnit*, to remove direct dependencies on different IRUnits (e.g. Analyses). * PassInstrumentationAnalysis analysis is explicitly requested from PassManager through usual AnalysisManager::getResult. All pass managers were updated to run that to get PassInstrumentation object for instrumentation calls. * Using tuples/index_sequence getAnalysisResult helper to extract generic AnalysisManager's extra args out of a generic PassManager's extra args. This is the only way I was able to explicitly run getResult for PassInstrumentationAnalysis out of a generic code like PassManager::run or RepeatedPass::run. TODO: Upon lengthy discussions we agreed to accept this as an initial implementation and then get rid of getAnalysisResult by improving RepeatedPass implementation. * PassBuilder takes PassInstrumentationCallbacks object to pass it further into PassInstrumentationAnalysis. Callbacks registration should be performed directly through PassInstrumentationCallbacks. * new-pm tests updated to account for PassInstrumentationAnalysis being run * Added PassInstrumentation tests to PassBuilderCallbacks unit tests. Other unit tests updated with registration of the now-required PassInstrumentationAnalysis. Made getName helper to return std::string (instead of StringRef initially) to fix asan builtbot failures on CGSCC tests. Reviewers: chandlerc, philip.pfaffe Differential Revision: https://reviews.llvm.org/D47858 llvm-svn: 342664
2018-09-20 19:08:45 +02:00
MODULE_ANALYSIS("pass-instrumentation", PassInstrumentationAnalysis(PIC))
MODULE_ANALYSIS("asan-globals-md", ASanGlobalsMetadataAnalysis())
#ifndef MODULE_ALIAS_ANALYSIS
#define MODULE_ALIAS_ANALYSIS(NAME, CREATE_PASS) \
MODULE_ANALYSIS(NAME, CREATE_PASS)
#endif
MODULE_ALIAS_ANALYSIS("globals-aa", GlobalsAA())
#undef MODULE_ALIAS_ANALYSIS
#undef MODULE_ANALYSIS
#ifndef MODULE_PASS
#define MODULE_PASS(NAME, CREATE_PASS)
#endif
[PM] Port the always inliner to the new pass manager in a much more minimal and boring form than the old pass manager's version. This pass does the very minimal amount of work necessary to inline functions declared as always-inline. It doesn't support a wide array of things that the legacy pass manager did support, but is alse ... about 20 lines of code. So it has that going for it. Notably things this doesn't support: - Array alloca merging - To support the above, bottom-up inlining with careful history tracking and call graph updates - DCE of the functions that become dead after this inlining. - Inlining through call instructions with the always_inline attribute. Instead, it focuses on inlining functions with that attribute. The first I've omitted because I'm hoping to just turn it off for the primary pass manager. If that doesn't pan out, I can add it here but it will be reasonably expensive to do so. The second should really be handled by running global-dce after the inliner. I don't want to re-implement the non-trivial logic necessary to do comdat-correct DCE of functions. This means the -O0 pipeline will have to be at least 'always-inline,global-dce', but that seems reasonable to me. If others are seriously worried about this I'd like to hear about it and understand why. Again, this is all solveable by factoring that logic into a utility and calling it here, but I'd like to wait to do that until there is a clear reason why the existing pass-based factoring won't work. The final point is a serious one. I can fairly easily add support for this, but it seems both costly and a confusing construct for the use case of the always inliner running at -O0. This attribute can of course still impact the normal inliner easily (although I find that a questionable re-use of the same attribute). I've started a discussion to sort out what semantics we want here and based on that can figure out if it makes sense ta have this complexity at O0 or not. One other advantage of this design is that it should be quite a bit faster due to checking for whether the function is a viable candidate for inlining exactly once per function instead of doing it for each call site. Anyways, hopefully a reasonable starting point for this pass. Differential Revision: https://reviews.llvm.org/D23299 llvm-svn: 278896
2016-08-17 04:56:20 +02:00
MODULE_PASS("always-inline", AlwaysInlinerPass())
MODULE_PASS("attributor", AttributorPass())
MODULE_PASS("called-value-propagation", CalledValuePropagationPass())
MODULE_PASS("canonicalize-aliases", CanonicalizeAliasesPass())
MODULE_PASS("cg-profile", CGProfilePass())
MODULE_PASS("constmerge", ConstantMergePass())
MODULE_PASS("cross-dso-cfi", CrossDSOCFIPass())
MODULE_PASS("deadargelim", DeadArgumentEliminationPass())
MODULE_PASS("elim-avail-extern", EliminateAvailableExternallyPass())
MODULE_PASS("forceattrs", ForceFunctionAttrsPass())
MODULE_PASS("function-import", FunctionImportPass())
MODULE_PASS("globaldce", GlobalDCEPass())
MODULE_PASS("globalopt", GlobalOptPass())
MODULE_PASS("globalsplit", GlobalSplitPass())
MODULE_PASS("hotcoldsplit", HotColdSplittingPass())
MODULE_PASS("hwasan", HWAddressSanitizerPass(false, false))
MODULE_PASS("khwasan", HWAddressSanitizerPass(true, true))
MODULE_PASS("inferattrs", InferFunctionAttrsPass())
MODULE_PASS("insert-gcov-profiling", GCOVProfilerPass())
MODULE_PASS("instrorderfile", InstrOrderFilePass())
MODULE_PASS("instrprof", InstrProfiling())
MODULE_PASS("internalize", InternalizePass())
MODULE_PASS("invalidate<all>", InvalidateAllAnalysesPass())
MODULE_PASS("ipsccp", IPSCCPPass())
MODULE_PASS("lowertypetests", LowerTypeTestsPass(nullptr, nullptr))
MODULE_PASS("name-anon-globals", NameAnonGlobalPass())
MODULE_PASS("no-op-module", NoOpModulePass())
MODULE_PASS("partial-inliner", PartialInlinerPass())
MODULE_PASS("pgo-icall-prom", PGOIndirectCallPromotion())
MODULE_PASS("pgo-instr-gen", PGOInstrumentationGen())
MODULE_PASS("pgo-instr-use", PGOInstrumentationUse())
MODULE_PASS("pre-isel-intrinsic-lowering", PreISelIntrinsicLoweringPass())
MODULE_PASS("print-profile-summary", ProfileSummaryPrinterPass(dbgs()))
MODULE_PASS("print-callgraph", CallGraphPrinterPass(dbgs()))
MODULE_PASS("print", PrintModulePass(dbgs()))
MODULE_PASS("print-lcg", LazyCallGraphPrinterPass(dbgs()))
MODULE_PASS("print-lcg-dot", LazyCallGraphDOTPrinterPass(dbgs()))
MODULE_PASS("print-stack-safety", StackSafetyGlobalPrinterPass(dbgs()))
MODULE_PASS("rewrite-statepoints-for-gc", RewriteStatepointsForGC())
MODULE_PASS("rewrite-symbols", RewriteSymbolPass())
[PM] Port ReversePostOrderFunctionAttrs to the new PM Below are my super rough notes when porting. They can probably serve as a basic guide for porting other passes to the new PM. As I port more passes I'll expand and generalize this and make a proper docs/HowToPortToNewPassManager.rst document. There is also missing documentation for general concepts and API's in the new PM which will require some documentation. Once there is proper documentation in place we can put up a list of passes that have to be ported and game-ify/crowdsource the rest of the porting (at least of the middle end; the backend is still unclear). I will however be taking personal responsibility for ensuring that the LLD/ELF LTO pipeline is ported in a timely fashion. The remaining passes to be ported are (do something like `git grep "<the string in the bullet point below>"` to find the pass): General Scalar: [ ] Simplify the CFG [ ] Jump Threading [ ] MemCpy Optimization [ ] Promote Memory to Register [ ] MergedLoadStoreMotion [ ] Lazy Value Information Analysis General IPO: [ ] Dead Argument Elimination [ ] Deduce function attributes in RPO Loop stuff / vectorization stuff: [ ] Alignment from assumptions [ ] Canonicalize natural loops [ ] Delete dead loops [ ] Loop Access Analysis [ ] Loop Invariant Code Motion [ ] Loop Vectorization [ ] SLP Vectorizer [ ] Unroll loops Devirtualization / CFI: [ ] Cross-DSO CFI [ ] Whole program devirtualization [ ] Lower bitset metadata CGSCC passes: [ ] Function Integration/Inlining [ ] Remove unused exception handling info [ ] Promote 'by reference' arguments to scalars Please let me know if you are interested in working on any of the passes in the above list (e.g. reply to the post-commit thread for this patch). I'll probably be tackling "General Scalar" and "General IPO" first FWIW. Steps as I port "Deduce function attributes in RPO" --------------------------------------------------- (note: if you are doing any work based on these notes, please leave a note in the post-commit review thread for this commit with any improvements / suggestions / incompleteness you ran into!) Note: "Deduce function attributes in RPO" is a module pass. 1. Do preparatory refactoring. Do preparatory factoring. In this case all I had to do was to pull out a static helper (r272503). (TODO: give more advice here e.g. if pass holds state or something) 2. Rename the old pass class. llvm/lib/Transforms/IPO/FunctionAttrs.cpp Rename class ReversePostOrderFunctionAttrs -> ReversePostOrderFunctionAttrsLegacyPass in preparation for adding a class ReversePostOrderFunctionAttrs as the pass in the new PM. (edit: actually wait what? The new class name will be ReversePostOrderFunctionAttrsPass, so it doesn't conflict. So this step is sort of useless churn). llvm/include/llvm/InitializePasses.h llvm/lib/LTO/LTOCodeGenerator.cpp llvm/lib/Transforms/IPO/IPO.cpp llvm/lib/Transforms/IPO/FunctionAttrs.cpp Rename initializeReversePostOrderFunctionAttrsPass -> initializeReversePostOrderFunctionAttrsLegacyPassPass (note that the "PassPass" thing falls out of `s/ReversePostOrderFunctionAttrs/ReversePostOrderFunctionAttrsLegacyPass/`) Note that the INITIALIZE_PASS macro is what creates this identifier name, so renaming the class requires this renaming too. Note that createReversePostOrderFunctionAttrsPass does not need to be renamed since its name is not generated from the class name. 3. Add the new PM pass class. In the new PM all passes need to have their declaration in a header somewhere, so you will often need to add a header. In this case llvm/include/llvm/Transforms/IPO/FunctionAttrs.h is already there because PostOrderFunctionAttrsPass was already ported. The file-level comment from the .cpp file can be used as the file-level comment for the new header. You may want to tweak the wording slightly from "this file implements" to "this file provides" or similar. Add declaration for the new PM pass in this header: class ReversePostOrderFunctionAttrsPass : public PassInfoMixin<ReversePostOrderFunctionAttrsPass> { public: PreservedAnalyses run(Module &M, AnalysisManager<Module> &AM); }; Its name should end with `Pass` for consistency (note that this doesn't collide with the names of most old PM passes). E.g. call it `<name of the old PM pass>Pass`. Also, move the doxygen comment from the old PM pass to the declaration of this class in the header. Also, include the declaration for the new PM class `llvm/Transforms/IPO/FunctionAttrs.h` at the top of the file (in this case, it was already done when the other pass in this file was ported). Now define the `run` method for the new class. The main things here are: a) Use AM.getResult<...>(M) to get results instead of `getAnalysis<...>()` b) If the old PM pass would have returned "false" (i.e. `Changed == false`), then you should return PreservedAnalyses::all(); c) In the old PM getAnalysisUsage method, observe the calls `AU.addPreserved<...>();`. In the case `Changed == true`, for each preserved analysis you should do call `PA.preserve<...>()` on a PreservedAnalyses object and return it. E.g.: PreservedAnalyses PA; PA.preserve<CallGraphAnalysis>(); return PA; Note that calls to skipModule/skipFunction are not supported in the new PM currently, so optnone and optimization bisect support do not work. You can just drop those calls for now. 4. Add the pass to the new PM pass registry to make it available in opt. In llvm/lib/Passes/PassBuilder.cpp add a #include for your header. `#include "llvm/Transforms/IPO/FunctionAttrs.h"` In this case there is already an include (from when PostOrderFunctionAttrsPass was ported). Add your pass to llvm/lib/Passes/PassRegistry.def In this case, I added `MODULE_PASS("rpo-functionattrs", ReversePostOrderFunctionAttrsPass())` The string is from the `INITIALIZE_PASS*` macros used in the old pass manager. Then choose a test that uses the pass and use the new PM `-passes=...` to run it. E.g. in this case there is a test that does: ; RUN: opt < %s -basicaa -functionattrs -rpo-functionattrs -S | FileCheck %s I have added the line: ; RUN: opt < %s -aa-pipeline=basic-aa -passes='require<targetlibinfo>,cgscc(function-attrs),rpo-functionattrs' -S | FileCheck %s The `-aa-pipeline=basic-aa` and `require<targetlibinfo>,cgscc(function-attrs)` are what is needed to run functionattrs in the new PM (note that in the new PM "functionattrs" becomes "function-attrs" for some reason). This is just pulled from `readattrs.ll` which contains the change from when functionattrs was ported to the new PM. Adding rpo-functionattrs causes the pass that was just ported to run. llvm-svn: 272505
2016-06-12 09:48:51 +02:00
MODULE_PASS("rpo-functionattrs", ReversePostOrderFunctionAttrsPass())
MODULE_PASS("sample-profile", SampleProfileLoaderPass())
MODULE_PASS("strip-dead-prototypes", StripDeadPrototypesPass())
MODULE_PASS("synthetic-counts-propagation", SyntheticCountsPropagation())
MODULE_PASS("wholeprogramdevirt", WholeProgramDevirtPass(nullptr, nullptr))
MODULE_PASS("verify", VerifierPass())
MODULE_PASS("asan-module", ModuleAddressSanitizerPass(/*CompileKernel=*/false, false, true, false))
MODULE_PASS("kasan-module", ModuleAddressSanitizerPass(/*CompileKernel=*/true, false, true, false))
MODULE_PASS("sancov-module", ModuleSanitizerCoveragePass())
MODULE_PASS("poison-checking", PoisonCheckingPass())
#undef MODULE_PASS
#ifndef CGSCC_ANALYSIS
#define CGSCC_ANALYSIS(NAME, CREATE_PASS)
#endif
CGSCC_ANALYSIS("no-op-cgscc", NoOpCGSCCAnalysis())
[PM] Support invalidation of inner analysis managers from a pass over the outer IR unit. Summary: This never really got implemented, and was very hard to test before a lot of the refactoring changes to make things more robust. But now we can test it thoroughly and cleanly, especially at the CGSCC level. The core idea is that when an inner analysis manager proxy receives the invalidation event for the outer IR unit, it needs to walk the inner IR units and propagate it to the inner analysis manager for each of those units. For example, each function in the SCC needs to get an invalidation event when the SCC gets one. The function / module interaction is somewhat boring here. This really becomes interesting in the face of analysis-backed IR units. This patch effectively handles all of the CGSCC layer's needs -- both invalidating SCC analysis and invalidating function analysis when an SCC gets invalidated. However, this second aspect doesn't really handle the LoopAnalysisManager well at this point. That one will need some change of design in order to fully integrate, because unlike the call graph, the entire function behind a LoopAnalysis's results can vanish out from under us, and we won't even have a cached API to access. I'd like to try to separate solving the loop problems into a subsequent patch though in order to keep this more focused so I've adapted them to the API and updated the tests that immediately fail, but I've not added the level of testing and validation at that layer that I have at the CGSCC layer. An important aspect of this change is that the proxy for the FunctionAnalysisManager at the SCC pass layer doesn't work like the other proxies for an inner IR unit as it doesn't directly manage the FunctionAnalysisManager and invalidation or clearing of it. This would create an ever worsening problem of dual ownership of this responsibility, split between the module-level FAM proxy and this SCC-level FAM proxy. Instead, this patch changes the SCC-level FAM proxy to work in terms of the module-level proxy and defer to it to handle much of the updates. It only does SCC-specific invalidation. This will become more important in subsequent patches that support more complex invalidaiton scenarios. Reviewers: jlebar Subscribers: mehdi_amini, mcrosier, mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D27197 llvm-svn: 289317
2016-12-10 07:34:44 +01:00
CGSCC_ANALYSIS("fam-proxy", FunctionAnalysisManagerCGSCCProxy())
[New PM] Introducing PassInstrumentation framework Pass Execution Instrumentation interface enables customizable instrumentation of pass execution, as per "RFC: Pass Execution Instrumentation interface" posted 06/07/2018 on llvm-dev@ The intent is to provide a common machinery to implement all the pass-execution-debugging features like print-before/after, opt-bisect, time-passes etc. Here we get a basic implementation consisting of: * PassInstrumentationCallbacks class that handles registration of callbacks and access to them. * PassInstrumentation class that handles instrumentation-point interfaces that call into PassInstrumentationCallbacks. * Callbacks accept StringRef which is just a name of the Pass right now. There were some ideas to pass an opaque wrapper for the pointer to pass instance, however it appears that pointer does not actually identify the instance (adaptors and managers might have the same address with the pass they govern). Hence it was decided to go simple for now and then later decide on what the proper mental model of identifying a "pass in a phase of pipeline" is. * Callbacks accept llvm::Any serving as a wrapper for const IRUnit*, to remove direct dependencies on different IRUnits (e.g. Analyses). * PassInstrumentationAnalysis analysis is explicitly requested from PassManager through usual AnalysisManager::getResult. All pass managers were updated to run that to get PassInstrumentation object for instrumentation calls. * Using tuples/index_sequence getAnalysisResult helper to extract generic AnalysisManager's extra args out of a generic PassManager's extra args. This is the only way I was able to explicitly run getResult for PassInstrumentationAnalysis out of a generic code like PassManager::run or RepeatedPass::run. TODO: Upon lengthy discussions we agreed to accept this as an initial implementation and then get rid of getAnalysisResult by improving RepeatedPass implementation. * PassBuilder takes PassInstrumentationCallbacks object to pass it further into PassInstrumentationAnalysis. Callbacks registration should be performed directly through PassInstrumentationCallbacks. * new-pm tests updated to account for PassInstrumentationAnalysis being run * Added PassInstrumentation tests to PassBuilderCallbacks unit tests. Other unit tests updated with registration of the now-required PassInstrumentationAnalysis. Made getName helper to return std::string (instead of StringRef initially) to fix asan builtbot failures on CGSCC tests. Reviewers: chandlerc, philip.pfaffe Differential Revision: https://reviews.llvm.org/D47858 llvm-svn: 342664
2018-09-20 19:08:45 +02:00
CGSCC_ANALYSIS("pass-instrumentation", PassInstrumentationAnalysis(PIC))
#undef CGSCC_ANALYSIS
#ifndef CGSCC_PASS
#define CGSCC_PASS(NAME, CREATE_PASS)
#endif
CGSCC_PASS("argpromotion", ArgumentPromotionPass())
CGSCC_PASS("invalidate<all>", InvalidateAllAnalysesPass())
CGSCC_PASS("function-attrs", PostOrderFunctionAttrsPass())
[PM] Provide an initial, minimal port of the inliner to the new pass manager. This doesn't implement *every* feature of the existing inliner, but tries to implement the most important ones for building a functional optimization pipeline and beginning to sort out bugs, regressions, and other problems. Notable, but intentional omissions: - No alloca merging support. Why? Because it isn't clear we want to do this at all. Active discussion and investigation is going on to remove it, so for simplicity I omitted it. - No support for trying to iterate on "internally" devirtualized calls. Why? Because it adds what I suspect is inappropriate coupling for little or no benefit. We will have an outer iteration system that tracks devirtualization including that from function passes and iterates already. We should improve that rather than approximate it here. - Optimization remarks. Why? Purely to make the patch smaller, no other reason at all. The last one I'll probably work on almost immediately. But I wanted to skip it in the initial patch to try to focus the change as much as possible as there is already a lot of code moving around and both of these *could* be skipped without really disrupting the core logic. A summary of the different things happening here: 1) Adding the usual new PM class and rigging. 2) Fixing minor underlying assumptions in the inline cost analysis or inline logic that don't generally hold in the new PM world. 3) Adding the core pass logic which is in essence a loop over the calls in the nodes in the call graph. This is a bit duplicated from the old inliner, but only a handful of lines could realistically be shared. (I tried at first, and it really didn't help anything.) All told, this is only about 100 lines of code, and most of that is the mechanics of wiring up analyses from the new PM world. 4) Updating the LazyCallGraph (in the new PM) based on the *newly inlined* calls and references. This is very minimal because we cannot form cycles. 5) When inlining removes the last use of a function, eagerly nuking the body of the function so that any "one use remaining" inline cost heuristics are immediately refined, and queuing these functions to be completely deleted once inlining is complete and the call graph updated to reflect that they have become dead. 6) After all the inlining for a particular function, updating the LazyCallGraph and the CGSCC pass manager to reflect the function-local simplifications that are done immediately and internally by the inline utilties. These are the exact same fundamental set of CG updates done by arbitrary function passes. 7) Adding a bunch of test cases to specifically target CGSCC and other subtle aspects in the new PM world. Many thanks to the careful review from Easwaran and Sanjoy and others! Differential Revision: https://reviews.llvm.org/D24226 llvm-svn: 290161
2016-12-20 04:15:32 +01:00
CGSCC_PASS("inline", InlinerPass())
CGSCC_PASS("no-op-cgscc", NoOpCGSCCPass())
#undef CGSCC_PASS
#ifndef FUNCTION_ANALYSIS
#define FUNCTION_ANALYSIS(NAME, CREATE_PASS)
#endif
FUNCTION_ANALYSIS("aa", AAManager())
FUNCTION_ANALYSIS("assumptions", AssumptionAnalysis())
FUNCTION_ANALYSIS("block-freq", BlockFrequencyAnalysis())
FUNCTION_ANALYSIS("branch-prob", BranchProbabilityAnalysis())
FUNCTION_ANALYSIS("domtree", DominatorTreeAnalysis())
FUNCTION_ANALYSIS("postdomtree", PostDominatorTreeAnalysis())
FUNCTION_ANALYSIS("demanded-bits", DemandedBitsAnalysis())
FUNCTION_ANALYSIS("domfrontier", DominanceFrontierAnalysis())
FUNCTION_ANALYSIS("loops", LoopAnalysis())
FUNCTION_ANALYSIS("lazy-value-info", LazyValueAnalysis())
FUNCTION_ANALYSIS("da", DependenceAnalysis())
FUNCTION_ANALYSIS("memdep", MemoryDependenceAnalysis())
FUNCTION_ANALYSIS("memoryssa", MemorySSAAnalysis())
FUNCTION_ANALYSIS("phi-values", PhiValuesAnalysis())
FUNCTION_ANALYSIS("regions", RegionInfoAnalysis())
FUNCTION_ANALYSIS("no-op-function", NoOpFunctionAnalysis())
FUNCTION_ANALYSIS("opt-remark-emit", OptimizationRemarkEmitterAnalysis())
[PM] Port ScalarEvolution to the new pass manager. This change makes ScalarEvolution a stand-alone object and just produces one from a pass as needed. Making this work well requires making the object movable, using references instead of overwritten pointers in a number of places, and other refactorings. I've also wired it up to the new pass manager and added a RUN line to a test to exercise it under the new pass manager. This includes basic printing support much like with other analyses. But there is a big and somewhat scary change here. Prior to this patch ScalarEvolution was never *actually* invalidated!!! Re-running the pass just re-wired up the various other analyses and didn't remove any of the existing entries in the SCEV caches or clear out anything at all. This might seem OK as everything in SCEV that can uses ValueHandles to track updates to the values that serve as SCEV keys. However, this still means that as we ran SCEV over each function in the module, we kept accumulating more and more SCEVs into the cache. At the end, we would have a SCEV cache with every value that we ever needed a SCEV for in the entire module!!! Yowzers. The releaseMemory routine would dump all of this, but that isn't realy called during normal runs of the pipeline as far as I can see. To make matters worse, there *is* actually a key that we don't update with value handles -- there is a map keyed off of Loop*s. Because LoopInfo *does* release its memory from run to run, it is entirely possible to run SCEV over one function, then over another function, and then lookup a Loop* from the second function but find an entry inserted for the first function! Ouch. To make matters still worse, there are plenty of updates that *don't* trip a value handle. It seems incredibly unlikely that today GVN or another pass that invalidates SCEV can update values in *just* such a way that a subsequent run of SCEV will incorrectly find lookups in a cache, but it is theoretically possible and would be a nightmare to debug. With this refactoring, I've fixed all this by actually destroying and recreating the ScalarEvolution object from run to run. Technically, this could increase the amount of malloc traffic we see, but then again it is also technically correct. ;] I don't actually think we're suffering from tons of malloc traffic from SCEV because if we were, the fact that we never clear the memory would seem more likely to have come up as an actual problem before now. So, I've made the simple fix here. If in fact there are serious issues with too much allocation and deallocation, I can work on a clever fix that preserves the allocations (while clearing the data) between each run, but I'd prefer to do that kind of optimization with a test case / benchmark that shows why we need such cleverness (and that can test that we actually make it faster). It's possible that this will make some things faster by making the SCEV caches have higher locality (due to being significantly smaller) so until there is a clear benchmark, I think the simple change is best. Differential Revision: http://reviews.llvm.org/D12063 llvm-svn: 245193
2015-08-17 04:08:17 +02:00
FUNCTION_ANALYSIS("scalar-evolution", ScalarEvolutionAnalysis())
FUNCTION_ANALYSIS("stack-safety-local", StackSafetyAnalysis())
[PM] Rework how the TargetLibraryInfo pass integrates with the new pass manager to support the actual uses of it. =] When I ported instcombine to the new pass manager I discover that it didn't work because TLI wasn't available in the right places. This is a somewhat surprising and/or subtle aspect of the new pass manager design that came up before but I think is useful to be reminded of: While the new pass manager *allows* a function pass to query a module analysis, it requires that the module analysis is already run and cached prior to the function pass manager starting up, possibly with a 'require<foo>' style utility in the pass pipeline. This is an intentional hurdle because using a module analysis from a function pass *requires* that the module analysis is run prior to entering the function pass manager. Otherwise the other functions in the module could be in who-knows-what state, etc. A somewhat surprising consequence of this design decision (at least to me) is that you have to design a function pass that leverages a module analysis to do so as an optional feature. Even if that means your function pass does no work in the absence of the module analysis, you have to handle that possibility and remain conservatively correct. This is a natural consequence of things being able to invalidate the module analysis and us being unable to re-run it. And it's a generally good thing because it lets us reorder passes arbitrarily without breaking correctness, etc. This ends up causing problems in one case. What if we have a module analysis that is *definitionally* impossible to invalidate. In the places this might come up, the analysis is usually also definitionally trivial to run even while other transformation passes run on the module, regardless of the state of anything. And so, it follows that it is natural to have a hard requirement on such analyses from a function pass. It turns out, that TargetLibraryInfo is just such an analysis, and InstCombine has a hard requirement on it. The approach I've taken here is to produce an analysis that models this flexibility by making it both a module and a function analysis. This exposes the fact that it is in fact safe to compute at any point. We can even make it a valid CGSCC analysis at some point if that is useful. However, we don't want to have a copy of the actual target library info state for each function! This state is specific to the triple. The somewhat direct and blunt approach here is to turn TLI into a pimpl, with the state and mutators in the implementation class and the query routines primarily in the wrapper. Then the analysis can lazily construct and cache the implementations, keyed on the triple, and on-demand produce wrappers of them for each function. One minor annoyance is that we will end up with a wrapper for each function in the module. While this is a bit wasteful (one pointer per function) it seems tolerable. And it has the advantage of ensuring that we pay the absolute minimum synchronization cost to access this information should we end up with a nice parallel function pass manager in the future. We could look into trying to mark when analysis results are especially cheap to recompute and more eagerly GC-ing the cached results, or we could look at supporting a variant of analyses whose results are specifically *not* cached and expected to just be used and discarded by the consumer. Either way, these seem like incremental enhancements that should happen when we start profiling the memory and CPU usage of the new pass manager and not before. The other minor annoyance is that if we end up using the TLI in both a module pass and a function pass, those will be produced by two separate analyses, and thus will point to separate copies of the implementation state. While a minor issue, I dislike this and would like to find a way to cleanly allow a single analysis instance to be used across multiple IR unit managers. But I don't have a good solution to this today, and I don't want to hold up all of the work waiting to come up with one. This too seems like a reasonable thing to incrementally improve later. llvm-svn: 226981
2015-01-24 03:06:09 +01:00
FUNCTION_ANALYSIS("targetlibinfo", TargetLibraryAnalysis())
FUNCTION_ANALYSIS("targetir",
TM ? TM->getTargetIRAnalysis() : TargetIRAnalysis())
FUNCTION_ANALYSIS("verify", VerifierAnalysis())
[New PM] Introducing PassInstrumentation framework Pass Execution Instrumentation interface enables customizable instrumentation of pass execution, as per "RFC: Pass Execution Instrumentation interface" posted 06/07/2018 on llvm-dev@ The intent is to provide a common machinery to implement all the pass-execution-debugging features like print-before/after, opt-bisect, time-passes etc. Here we get a basic implementation consisting of: * PassInstrumentationCallbacks class that handles registration of callbacks and access to them. * PassInstrumentation class that handles instrumentation-point interfaces that call into PassInstrumentationCallbacks. * Callbacks accept StringRef which is just a name of the Pass right now. There were some ideas to pass an opaque wrapper for the pointer to pass instance, however it appears that pointer does not actually identify the instance (adaptors and managers might have the same address with the pass they govern). Hence it was decided to go simple for now and then later decide on what the proper mental model of identifying a "pass in a phase of pipeline" is. * Callbacks accept llvm::Any serving as a wrapper for const IRUnit*, to remove direct dependencies on different IRUnits (e.g. Analyses). * PassInstrumentationAnalysis analysis is explicitly requested from PassManager through usual AnalysisManager::getResult. All pass managers were updated to run that to get PassInstrumentation object for instrumentation calls. * Using tuples/index_sequence getAnalysisResult helper to extract generic AnalysisManager's extra args out of a generic PassManager's extra args. This is the only way I was able to explicitly run getResult for PassInstrumentationAnalysis out of a generic code like PassManager::run or RepeatedPass::run. TODO: Upon lengthy discussions we agreed to accept this as an initial implementation and then get rid of getAnalysisResult by improving RepeatedPass implementation. * PassBuilder takes PassInstrumentationCallbacks object to pass it further into PassInstrumentationAnalysis. Callbacks registration should be performed directly through PassInstrumentationCallbacks. * new-pm tests updated to account for PassInstrumentationAnalysis being run * Added PassInstrumentation tests to PassBuilderCallbacks unit tests. Other unit tests updated with registration of the now-required PassInstrumentationAnalysis. Made getName helper to return std::string (instead of StringRef initially) to fix asan builtbot failures on CGSCC tests. Reviewers: chandlerc, philip.pfaffe Differential Revision: https://reviews.llvm.org/D47858 llvm-svn: 342664
2018-09-20 19:08:45 +02:00
FUNCTION_ANALYSIS("pass-instrumentation", PassInstrumentationAnalysis(PIC))
#ifndef FUNCTION_ALIAS_ANALYSIS
#define FUNCTION_ALIAS_ANALYSIS(NAME, CREATE_PASS) \
FUNCTION_ANALYSIS(NAME, CREATE_PASS)
#endif
FUNCTION_ALIAS_ANALYSIS("basic-aa", BasicAA())
FUNCTION_ALIAS_ANALYSIS("cfl-anders-aa", CFLAndersAA())
FUNCTION_ALIAS_ANALYSIS("cfl-steens-aa", CFLSteensAA())
FUNCTION_ALIAS_ANALYSIS("scev-aa", SCEVAA())
FUNCTION_ALIAS_ANALYSIS("scoped-noalias-aa", ScopedNoAliasAA())
FUNCTION_ALIAS_ANALYSIS("type-based-aa", TypeBasedAA())
#undef FUNCTION_ALIAS_ANALYSIS
#undef FUNCTION_ANALYSIS
#ifndef FUNCTION_PASS
#define FUNCTION_PASS(NAME, CREATE_PASS)
#endif
FUNCTION_PASS("aa-eval", AAEvaluator())
FUNCTION_PASS("adce", ADCEPass())
FUNCTION_PASS("add-discriminators", AddDiscriminatorsPass())
FUNCTION_PASS("aggressive-instcombine", AggressiveInstCombinePass())
FUNCTION_PASS("alignment-from-assumptions", AlignmentFromAssumptionsPass())
FUNCTION_PASS("bdce", BDCEPass())
FUNCTION_PASS("bounds-checking", BoundsCheckingPass())
FUNCTION_PASS("break-crit-edges", BreakCriticalEdgesPass())
FUNCTION_PASS("callsite-splitting", CallSiteSplittingPass())
FUNCTION_PASS("consthoist", ConstantHoistingPass())
FUNCTION_PASS("chr", ControlHeightReductionPass())
FUNCTION_PASS("correlated-propagation", CorrelatedValuePropagationPass())
FUNCTION_PASS("dce", DCEPass())
FUNCTION_PASS("div-rem-pairs", DivRemPairsPass())
FUNCTION_PASS("dse", DSEPass())
FUNCTION_PASS("dot-cfg", CFGPrinterPass())
FUNCTION_PASS("dot-cfg-only", CFGOnlyPrinterPass())
FUNCTION_PASS("early-cse", EarlyCSEPass(/*UseMemorySSA=*/false))
FUNCTION_PASS("early-cse-memssa", EarlyCSEPass(/*UseMemorySSA=*/true))
FUNCTION_PASS("ee-instrument", EntryExitInstrumenterPass(/*PostInlining=*/false))
Introduce llvm.experimental.widenable_condition intrinsic This patch introduces a new instinsic `@llvm.experimental.widenable_condition` that allows explicit representation for guards. It is an alternative to using `@llvm.experimental.guard` intrinsic that does not contain implicit control flow. We keep finding places where `@llvm.experimental.guard` is not supported or treated too conservatively, and there are 2 reasons to that: - `@llvm.experimental.guard` has memory write side effect to model implicit control flow, and this sometimes confuses passes and analyzes that work with memory; - Not all passes and analysis are aware of the semantics of guards. These passes treat them as regular throwing call and have no idea that the condition of guard may be used to prove something. One well-known place which had caused us troubles in the past is explicit loop iteration count calculation in SCEV. Another example is new loop unswitching which is not aware of guards. Whenever a new pass appears, we potentially have this problem there. Rather than go and fix all these places (and commit to keep track of them and add support in future), it seems more reasonable to leverage the existing optimizer's logic as much as possible. The only significant difference between guards and regular explicit branches is that guard's condition can be widened. It means that a guard contains (explicitly or implicitly) a `deopt` block successor, and it is always legal to go there no matter what the guard condition is. The other successor is a guarded block, and it is only legal to go there if the condition is true. This patch introduces a new explicit form of guards alternative to `@llvm.experimental.guard` intrinsic. Now a widenable guard can be represented in the CFG explicitly like this: %widenable_condition = call i1 @llvm.experimental.widenable.condition() %new_condition = and i1 %cond, %widenable_condition br i1 %new_condition, label %guarded, label %deopt guarded: ; Guarded instructions deopt: call type @llvm.experimental.deoptimize(<args...>) [ "deopt"(<deopt_args...>) ] The new intrinsic `@llvm.experimental.widenable.condition` has semantics of an `undef`, but the intrinsic prevents the optimizer from folding it early. This form should exploit all optimization boons provided to `br` instuction, and it still can be widened by replacing the result of `@llvm.experimental.widenable.condition()` with `and` with any arbitrary boolean value (as long as the branch that is taken when it is `false` has a deopt and has no side-effects). For more motivation, please check llvm-dev discussion "[llvm-dev] Giving up using implicit control flow in guards". This patch introduces this new intrinsic with respective LangRef changes and a pass that converts old-style guards (expressed as intrinsics) into the new form. The naming discussion is still ungoing. Merging this to unblock further items. We can later change the name of this intrinsic. Reviewed By: reames, fedor.sergeev, sanjoy Differential Revision: https://reviews.llvm.org/D51207 llvm-svn: 348593
2018-12-07 15:39:46 +01:00
FUNCTION_PASS("make-guards-explicit", MakeGuardsExplicitPass())
FUNCTION_PASS("post-inline-ee-instrument", EntryExitInstrumenterPass(/*PostInlining=*/true))
FUNCTION_PASS("gvn-hoist", GVNHoistPass())
FUNCTION_PASS("instcombine", InstCombinePass())
FUNCTION_PASS("instsimplify", InstSimplifyPass())
FUNCTION_PASS("invalidate<all>", InvalidateAllAnalysesPass())
FUNCTION_PASS("float2int", Float2IntPass())
FUNCTION_PASS("no-op-function", NoOpFunctionPass())
FUNCTION_PASS("libcalls-shrinkwrap", LibCallsShrinkWrapPass())
FUNCTION_PASS("loweratomic", LowerAtomicPass())
FUNCTION_PASS("lower-expect", LowerExpectIntrinsicPass())
FUNCTION_PASS("lower-guard-intrinsic", LowerGuardIntrinsicPass())
FUNCTION_PASS("lower-widenable-condition", LowerWidenableConditionPass())
FUNCTION_PASS("guard-widening", GuardWideningPass())
FUNCTION_PASS("gvn", GVN())
FUNCTION_PASS("load-store-vectorizer", LoadStoreVectorizerPass())
FUNCTION_PASS("loop-simplify", LoopSimplifyPass())
FUNCTION_PASS("loop-sink", LoopSinkPass())
FUNCTION_PASS("lowerinvoke", LowerInvokePass())
FUNCTION_PASS("mem2reg", PromotePass())
FUNCTION_PASS("memcpyopt", MemCpyOptPass())
FUNCTION_PASS("mergeicmps", MergeICmpsPass())
FUNCTION_PASS("mldst-motion", MergedLoadStoreMotionPass())
FUNCTION_PASS("nary-reassociate", NaryReassociatePass())
FUNCTION_PASS("newgvn", NewGVNPass())
FUNCTION_PASS("jump-threading", JumpThreadingPass())
FUNCTION_PASS("partially-inline-libcalls", PartiallyInlineLibCallsPass())
FUNCTION_PASS("lcssa", LCSSAPass())
FUNCTION_PASS("loop-data-prefetch", LoopDataPrefetchPass())
FUNCTION_PASS("loop-load-elim", LoopLoadEliminationPass())
FUNCTION_PASS("loop-fuse", LoopFusePass())
FUNCTION_PASS("loop-distribute", LoopDistributePass())
FUNCTION_PASS("pgo-memop-opt", PGOMemOPSizeOpt())
FUNCTION_PASS("print", PrintFunctionPass(dbgs()))
FUNCTION_PASS("print<assumptions>", AssumptionPrinterPass(dbgs()))
FUNCTION_PASS("print<block-freq>", BlockFrequencyPrinterPass(dbgs()))
FUNCTION_PASS("print<branch-prob>", BranchProbabilityPrinterPass(dbgs()))
FUNCTION_PASS("print<da>", DependenceAnalysisPrinterPass(dbgs()))
FUNCTION_PASS("print<domtree>", DominatorTreePrinterPass(dbgs()))
FUNCTION_PASS("print<postdomtree>", PostDominatorTreePrinterPass(dbgs()))
FUNCTION_PASS("print<demanded-bits>", DemandedBitsPrinterPass(dbgs()))
FUNCTION_PASS("print<domfrontier>", DominanceFrontierPrinterPass(dbgs()))
FUNCTION_PASS("print<loops>", LoopPrinterPass(dbgs()))
FUNCTION_PASS("print<memoryssa>", MemorySSAPrinterPass(dbgs()))
FUNCTION_PASS("print<phi-values>", PhiValuesPrinterPass(dbgs()))
FUNCTION_PASS("print<regions>", RegionInfoPrinterPass(dbgs()))
[PM] Port ScalarEvolution to the new pass manager. This change makes ScalarEvolution a stand-alone object and just produces one from a pass as needed. Making this work well requires making the object movable, using references instead of overwritten pointers in a number of places, and other refactorings. I've also wired it up to the new pass manager and added a RUN line to a test to exercise it under the new pass manager. This includes basic printing support much like with other analyses. But there is a big and somewhat scary change here. Prior to this patch ScalarEvolution was never *actually* invalidated!!! Re-running the pass just re-wired up the various other analyses and didn't remove any of the existing entries in the SCEV caches or clear out anything at all. This might seem OK as everything in SCEV that can uses ValueHandles to track updates to the values that serve as SCEV keys. However, this still means that as we ran SCEV over each function in the module, we kept accumulating more and more SCEVs into the cache. At the end, we would have a SCEV cache with every value that we ever needed a SCEV for in the entire module!!! Yowzers. The releaseMemory routine would dump all of this, but that isn't realy called during normal runs of the pipeline as far as I can see. To make matters worse, there *is* actually a key that we don't update with value handles -- there is a map keyed off of Loop*s. Because LoopInfo *does* release its memory from run to run, it is entirely possible to run SCEV over one function, then over another function, and then lookup a Loop* from the second function but find an entry inserted for the first function! Ouch. To make matters still worse, there are plenty of updates that *don't* trip a value handle. It seems incredibly unlikely that today GVN or another pass that invalidates SCEV can update values in *just* such a way that a subsequent run of SCEV will incorrectly find lookups in a cache, but it is theoretically possible and would be a nightmare to debug. With this refactoring, I've fixed all this by actually destroying and recreating the ScalarEvolution object from run to run. Technically, this could increase the amount of malloc traffic we see, but then again it is also technically correct. ;] I don't actually think we're suffering from tons of malloc traffic from SCEV because if we were, the fact that we never clear the memory would seem more likely to have come up as an actual problem before now. So, I've made the simple fix here. If in fact there are serious issues with too much allocation and deallocation, I can work on a clever fix that preserves the allocations (while clearing the data) between each run, but I'd prefer to do that kind of optimization with a test case / benchmark that shows why we need such cleverness (and that can test that we actually make it faster). It's possible that this will make some things faster by making the SCEV caches have higher locality (due to being significantly smaller) so until there is a clear benchmark, I think the simple change is best. Differential Revision: http://reviews.llvm.org/D12063 llvm-svn: 245193
2015-08-17 04:08:17 +02:00
FUNCTION_PASS("print<scalar-evolution>", ScalarEvolutionPrinterPass(dbgs()))
FUNCTION_PASS("print<stack-safety-local>", StackSafetyPrinterPass(dbgs()))
FUNCTION_PASS("reassociate", ReassociatePass())
FUNCTION_PASS("scalarizer", ScalarizerPass())
FUNCTION_PASS("sccp", SCCPPass())
FUNCTION_PASS("sink", SinkingPass())
FUNCTION_PASS("slp-vectorizer", SLPVectorizerPass())
FUNCTION_PASS("speculative-execution", SpeculativeExecutionPass())
Add a new pass to speculate around PHI nodes with constant (integer) operands when profitable. The core idea is to (re-)introduce some redundancies where their cost is hidden by the cost of materializing immediates for constant operands of PHI nodes. When the cost of the redundancies is covered by this, avoiding materializing the immediate has numerous benefits: 1) Less register pressure 2) Potential for further folding / combining 3) Potential for more efficient instructions due to immediate operand As a motivating example, consider the remarkably different cost on x86 of a SHL instruction with an immediate operand versus a register operand. This pattern turns up surprisingly frequently, but is somewhat rarely obvious as a significant performance problem. The pass is entirely target independent, but it does rely on the target cost model in TTI to decide when to speculate things around the PHI node. I've included x86-focused tests, but any target that sets up its immediate cost model should benefit from this pass. There is probably more that can be done in this space, but the pass as-is is enough to get some important performance on our internal benchmarks, and should be generally performance neutral, but help with more extensive benchmarking is always welcome. One awkward part is that this pass has to be scheduled after *everything* that can eliminate these kinds of redundancies. This includes SimplifyCFG, GVN, etc. I'm open to suggestions about better places to put this. We could in theory make it part of the codegen pass pipeline, but there doesn't really seem to be a good reason for that -- it isn't "lowering" in any sense and only relies on pretty standard cost model based TTI queries, so it seems to fit well with the "optimization" pipeline model. Still, further thoughts on the pipeline position are welcome. I've also only implemented this in the new pass manager. If folks are very interested, I can try to add it to the old PM as well, but I didn't really see much point (my use case is already switched over to the new PM). I've tested this pretty heavily without issue. A wide range of benchmarks internally show no change outside the noise, and I don't see any significant changes in SPEC either. However, the size class computation in tcmalloc is substantially improved by this, which turns into a 2% to 4% win on the hottest path through tcmalloc for us, so there are definitely important cases where this is going to make a substantial difference. Differential revision: https://reviews.llvm.org/D37467 llvm-svn: 319164
2017-11-28 12:32:31 +01:00
FUNCTION_PASS("spec-phis", SpeculateAroundPHIsPass())
[PM] Port SROA to the new pass manager. In some ways this is a very boring port to the new pass manager as there are no interesting analyses or dependencies or other oddities. However, this does introduce the first good example of a transformation pass with non-trivial state porting to the new pass manager. I've tried to carve out patterns here to replicate elsewhere, and would appreciate comments on whether folks like these patterns: - A common need in the new pass manager is to effectively lift the pass class and some of its state into a public header file. Prior to this, LLVM used anonymous namespaces to provide "module private" types and utilities, but that doesn't scale to cases where a public header file is needed and the new pass manager will exacerbate that. The pattern I've adopted here is to use the namespace-cased-name of the core pass (what would be a module if we had them) as a module-private namespace. Then utility and other code can be declared and defined in this namespace. At some point in the future, we could even have (conditionally compiled) code that used modules features when available to do the same basic thing. - I've split the actual pass run method in two in order to expose a private method usable by the old pass manager to wrap the new class with a minimum of duplicated code. I actually looked at a bunch of ways to automate or generate these, but they are all quite terrible IMO. The fundamental need is to extract the set of analyses which need to cross this interface boundary, and that will end up being too unpredictable to effectively encapsulate IMO. This is also a relatively small amount of boiler plate that will live a relatively short time, so I'm not too worried about the fact that it is boiler plate. The rest of the patch is totally boring but results in a massive diff (sorry). It just moves code around and removes or adds qualifiers to reflect the new name and nesting structure. Differential Revision: http://reviews.llvm.org/D12773 llvm-svn: 247501
2015-09-12 11:09:14 +02:00
FUNCTION_PASS("sroa", SROA())
FUNCTION_PASS("tailcallelim", TailCallElimPass())
FUNCTION_PASS("unreachableblockelim", UnreachableBlockElimPass())
FUNCTION_PASS("verify", VerifierPass())
FUNCTION_PASS("verify<domtree>", DominatorTreeVerifierPass())
FUNCTION_PASS("verify<loops>", LoopVerifierPass())
FUNCTION_PASS("verify<memoryssa>", MemorySSAVerifierPass())
FUNCTION_PASS("verify<regions>", RegionInfoVerifierPass())
FUNCTION_PASS("verify<safepoint-ir>", SafepointIRVerifierPass())
FUNCTION_PASS("view-cfg", CFGViewerPass())
FUNCTION_PASS("view-cfg-only", CFGOnlyViewerPass())
[Unroll/UnrollAndJam/Vectorizer/Distribute] Add followup loop attributes. When multiple loop transformation are defined in a loop's metadata, their order of execution is defined by the order of their respective passes in the pass pipeline. For instance, e.g. #pragma clang loop unroll_and_jam(enable) #pragma clang loop distribute(enable) is the same as #pragma clang loop distribute(enable) #pragma clang loop unroll_and_jam(enable) and will try to loop-distribute before Unroll-And-Jam because the LoopDistribute pass is scheduled after UnrollAndJam pass. UnrollAndJamPass only supports one inner loop, i.e. it will necessarily fail after loop distribution. It is not possible to specify another execution order. Also,t the order of passes in the pipeline is subject to change between versions of LLVM, optimization options and which pass manager is used. This patch adds 'followup' attributes to various loop transformation passes. These attributes define which attributes the resulting loop of a transformation should have. For instance, !0 = !{!0, !1, !2} !1 = !{!"llvm.loop.unroll_and_jam.enable"} !2 = !{!"llvm.loop.unroll_and_jam.followup_inner", !3} !3 = !{!"llvm.loop.distribute.enable"} defines a loop ID (!0) to be unrolled-and-jammed (!1) and then the attribute !3 to be added to the jammed inner loop, which contains the instruction to distribute the inner loop. Currently, in both pass managers, pass execution is in a fixed order and UnrollAndJamPass will not execute again after LoopDistribute. We hope to fix this in the future by allowing pass managers to run passes until a fixpoint is reached, use Polly to perform these transformations, or add a loop transformation pass which takes the order issue into account. For mandatory/forced transformations (e.g. by having been declared by #pragma omp simd), the user must be notified when a transformation could not be performed. It is not possible that the responsible pass emits such a warning because the transformation might be 'hidden' in a followup attribute when it is executed, or it is not present in the pipeline at all. For this reason, this patche introduces a WarnMissedTransformations pass, to warn about orphaned transformations. Since this changes the user-visible diagnostic message when a transformation is applied, two test cases in the clang repository need to be updated. To ensure that no other transformation is executed before the intended one, the attribute `llvm.loop.disable_nonforced` can be added which should disable transformation heuristics before the intended transformation is applied. E.g. it would be surprising if a loop is distributed before a #pragma unroll_and_jam is applied. With more supported code transformations (loop fusion, interchange, stripmining, offloading, etc.), transformations can be used as building blocks for more complex transformations (e.g. stripmining+stripmining+interchange -> tiling). Reviewed By: hfinkel, dmgreen Differential Revision: https://reviews.llvm.org/D49281 Differential Revision: https://reviews.llvm.org/D55288 llvm-svn: 348944
2018-12-12 18:32:52 +01:00
FUNCTION_PASS("transform-warning", WarnMissedTransformationsPass())
FUNCTION_PASS("asan", AddressSanitizerPass(false, false, false))
FUNCTION_PASS("kasan", AddressSanitizerPass(true, false, false))
FUNCTION_PASS("msan", MemorySanitizerPass({}))
FUNCTION_PASS("kmsan", MemorySanitizerPass({0, false, /*Kernel=*/true}))
FUNCTION_PASS("tsan", ThreadSanitizerPass())
FUNCTION_PASS("sancov-func", SanitizerCoveragePass())
#undef FUNCTION_PASS
#ifndef FUNCTION_PASS_WITH_PARAMS
#define FUNCTION_PASS_WITH_PARAMS(NAME, CREATE_PASS, PARSER)
#endif
FUNCTION_PASS_WITH_PARAMS("unroll",
[](LoopUnrollOptions Opts) {
return LoopUnrollPass(Opts);
},
parseLoopUnrollOptions)
FUNCTION_PASS_WITH_PARAMS("msan",
[](MemorySanitizerOptions Opts) {
return MemorySanitizerPass(Opts);
},
parseMSanPassOptions)
FUNCTION_PASS_WITH_PARAMS("simplify-cfg",
[](SimplifyCFGOptions Opts) {
return SimplifyCFGPass(Opts);
},
parseSimplifyCFGOptions)
FUNCTION_PASS_WITH_PARAMS("loop-vectorize",
[](LoopVectorizeOptions Opts) {
return LoopVectorizePass(Opts);
},
parseLoopVectorizeOptions)
#undef FUNCTION_PASS_WITH_PARAMS
#ifndef LOOP_ANALYSIS
#define LOOP_ANALYSIS(NAME, CREATE_PASS)
#endif
LOOP_ANALYSIS("no-op-loop", NoOpLoopAnalysis())
LOOP_ANALYSIS("access-info", LoopAccessAnalysis())
LOOP_ANALYSIS("ivusers", IVUsersAnalysis())
[New PM] Introducing PassInstrumentation framework Pass Execution Instrumentation interface enables customizable instrumentation of pass execution, as per "RFC: Pass Execution Instrumentation interface" posted 06/07/2018 on llvm-dev@ The intent is to provide a common machinery to implement all the pass-execution-debugging features like print-before/after, opt-bisect, time-passes etc. Here we get a basic implementation consisting of: * PassInstrumentationCallbacks class that handles registration of callbacks and access to them. * PassInstrumentation class that handles instrumentation-point interfaces that call into PassInstrumentationCallbacks. * Callbacks accept StringRef which is just a name of the Pass right now. There were some ideas to pass an opaque wrapper for the pointer to pass instance, however it appears that pointer does not actually identify the instance (adaptors and managers might have the same address with the pass they govern). Hence it was decided to go simple for now and then later decide on what the proper mental model of identifying a "pass in a phase of pipeline" is. * Callbacks accept llvm::Any serving as a wrapper for const IRUnit*, to remove direct dependencies on different IRUnits (e.g. Analyses). * PassInstrumentationAnalysis analysis is explicitly requested from PassManager through usual AnalysisManager::getResult. All pass managers were updated to run that to get PassInstrumentation object for instrumentation calls. * Using tuples/index_sequence getAnalysisResult helper to extract generic AnalysisManager's extra args out of a generic PassManager's extra args. This is the only way I was able to explicitly run getResult for PassInstrumentationAnalysis out of a generic code like PassManager::run or RepeatedPass::run. TODO: Upon lengthy discussions we agreed to accept this as an initial implementation and then get rid of getAnalysisResult by improving RepeatedPass implementation. * PassBuilder takes PassInstrumentationCallbacks object to pass it further into PassInstrumentationAnalysis. Callbacks registration should be performed directly through PassInstrumentationCallbacks. * new-pm tests updated to account for PassInstrumentationAnalysis being run * Added PassInstrumentation tests to PassBuilderCallbacks unit tests. Other unit tests updated with registration of the now-required PassInstrumentationAnalysis. Made getName helper to return std::string (instead of StringRef initially) to fix asan builtbot failures on CGSCC tests. Reviewers: chandlerc, philip.pfaffe Differential Revision: https://reviews.llvm.org/D47858 llvm-svn: 342664
2018-09-20 19:08:45 +02:00
LOOP_ANALYSIS("pass-instrumentation", PassInstrumentationAnalysis(PIC))
#undef LOOP_ANALYSIS
#ifndef LOOP_PASS
#define LOOP_PASS(NAME, CREATE_PASS)
#endif
LOOP_PASS("invalidate<all>", InvalidateAllAnalysesPass())
LOOP_PASS("licm", LICMPass())
LOOP_PASS("loop-idiom", LoopIdiomRecognizePass())
LOOP_PASS("loop-instsimplify", LoopInstSimplifyPass())
LOOP_PASS("rotate", LoopRotatePass())
LOOP_PASS("no-op-loop", NoOpLoopPass())
LOOP_PASS("print", PrintLoopPass(dbgs()))
LOOP_PASS("loop-deletion", LoopDeletionPass())
LOOP_PASS("simplify-cfg", LoopSimplifyCFGPass())
LOOP_PASS("strength-reduce", LoopStrengthReducePass())
LOOP_PASS("indvars", IndVarSimplifyPass())
LOOP_PASS("irce", IRCEPass())
LOOP_PASS("unroll-and-jam", LoopUnrollAndJamPass())
LOOP_PASS("unroll-full", LoopFullUnrollPass())
LOOP_PASS("print-access-info", LoopAccessInfoPrinterPass(dbgs()))
LOOP_PASS("print<ivusers>", IVUsersPrinterPass(dbgs()))
Title: Loop Cache Analysis Summary: Implement a new analysis to estimate the number of cache lines required by a loop nest. The analysis is largely based on the following paper: Compiler Optimizations for Improving Data Locality By: Steve Carr, Katherine S. McKinley, Chau-Wen Tseng http://www.cs.utexas.edu/users/mckinley/papers/asplos-1994.pdf The analysis considers temporal reuse (accesses to the same memory location) and spatial reuse (accesses to memory locations within a cache line). For simplicity the analysis considers memory accesses in the innermost loop in a loop nest, and thus determines the number of cache lines used when the loop L in loop nest LN is placed in the innermost position. The result of the analysis can be used to drive several transformations. As an example, loop interchange could use it determine which loops in a perfect loop nest should be interchanged to maximize cache reuse. Similarly, loop distribution could be enhanced to take into consideration cache reuse between arrays when distributing a loop to eliminate vectorization inhibiting dependencies. The general approach taken to estimate the number of cache lines used by the memory references in the inner loop of a loop nest is: Partition memory references that exhibit temporal or spatial reuse into reference groups. For each loop L in the a loop nest LN: a. Compute the cost of the reference group b. Compute the 'cache cost' of the loop nest by summing up the reference groups costs For further details of the algorithm please refer to the paper. Authored By: etiotto Reviewers: hfinkel, Meinersbur, jdoerfert, kbarton, bmahjour, anemet, fhahn Reviewed By: Meinersbur Subscribers: reames, nemanjai, MaskRay, wuzish, Hahnfeld, xusx595, venkataramanan.kumar.llvm, greened, dmgreen, steleman, fhahn, xblvaOO, Whitney, mgorny, hiraditya, mgrang, jsji, llvm-commits Tag: LLVM Differential Revision: https://reviews.llvm.org/D63459 llvm-svn: 368439
2019-08-09 15:56:29 +02:00
LOOP_PASS("print<loop-cache-cost>", LoopCachePrinterPass(dbgs()))
LOOP_PASS("loop-predication", LoopPredicationPass())
LOOP_PASS("guard-widening", GuardWideningPass())
#undef LOOP_PASS
#ifndef LOOP_PASS_WITH_PARAMS
#define LOOP_PASS_WITH_PARAMS(NAME, CREATE_PASS, PARSER)
#endif
LOOP_PASS_WITH_PARAMS("unswitch",
[](bool NonTrivial) {
return SimpleLoopUnswitchPass(NonTrivial);
},
parseLoopUnswitchOptions)
#undef LOOP_PASS_WITH_PARAMS