mirror of
https://github.com/RPCS3/llvm-mirror.git
synced 2025-01-31 12:41:49 +01:00
[Kaleidoscope][BuildingAJIT] Add tutorial text for Chapter 2.
This chapter discusses IR optimizations, the ORC IRTransformLayer, and the ORC layer concept itself. The text is still pretty rough, but I think the main ideas are there. Feedback is very welcome, as always. llvm-svn: 271865
parent
3ff4515717
commit
6c7de226cb
@ -12,22 +12,296 @@ we welcome any feedback.
|
||||
Chapter 2 Introduction
======================
|
||||
|
||||
Welcome to Chapter 2 of the "Building an ORC-based JIT in LLVM" tutorial. This
|
||||
chapter shows you how to add IR optimization support to the KaleidoscopeJIT
|
||||
class that was introduced in `Chapter 1 <BuildingAJIT1.html>`_ by adding a
|
||||
new *ORC Layer* -- IRTransformLayer.
|
||||
Welcome to Chapter 2 of the "Building an ORC-based JIT in LLVM" tutorial. In
|
||||
`Chapter 1 <BuildingAJIT1.html>`_ of this series we examined a basic JIT
|
||||
class, KaleidoscopeJIT, that could take LLVM IR modules as input and produce
|
||||
executable code in memory. KaleidoscopeJIT was able to do this with relatively
|
||||
little code by composing two off-the-shelf *ORC layers*: IRCompileLayer and
|
||||
ObjectLinkingLayer, to do much of the heavy lifting.
|
||||
|
||||
**To be done:**
|
||||
In this chapter we'll learn more about the ORC layer concept by using a new layer,
|
||||
IRTransformLayer, to add IR optimization support to KaleidoscopeJIT.
|
||||
|
||||
**(1) Briefly describe FunctionPassManager and the optimizeModule
|
||||
method (reference the Kaleidoscope language tutorial chapter 4 for more detail
|
||||
about IR optimization - it's covered in detail there, here it just provides a
|
||||
motivation for learning about layers).**
|
||||
Optimizing Modules using the IRTransformLayer
=============================================
|
||||
|
||||
**(2) Describe IRTransformLayer, show how it is used to call our optimizeModule
|
||||
method.**
|
||||
In `Chapter 4 <LangImpl4.html>`_ of the "Implementing a language with LLVM"
|
||||
tutorial series the LLVM *FunctionPassManager* is introduced as a means for
|
||||
optimizing LLVM IR. Interested readers may read that chapter for details, but
|
||||
in short, to optimize a Module we create an llvm::FunctionPassManager
|
||||
instance, configure it with a set of optimizations, then run the PassManager on
|
||||
a Module to mutate it into a (hopefully) more optimized but semantically
|
||||
equivalent form. In the original tutorial series the FunctionPassManager was
|
||||
created outside the KaleidoscopeJIT, and modules were optimized before being
|
||||
added to it. In this chapter we will make optimization a phase of our JIT
|
||||
instead. For now, this will provide us with a motivation to learn more about ORC
|
||||
layers, but in the long term making optimization part of our JIT will yield an
|
||||
important benefit: When we begin lazily compiling code (i.e. deferring
|
||||
compilation of each function until the first time it's run), having
|
||||
optimization managed by our JIT will allow us to optimize lazily too, rather
|
||||
than having to do all our optimization up-front.
|
||||
|
||||
**(3) Describe the ORC Layer concept using IRTransformLayer as an example.**
|
||||
To add optimization support to our JIT we will take the KaleidoscopeJIT from
|
||||
Chapter 1 and compose an ORC *IRTransformLayer* on top. We will look at how the
|
||||
IRTransformLayer works in more detail below, but the interface is simple: the
|
||||
constructor for this layer takes a reference to the layer below (as all layers
|
||||
do) plus an *IR optimization function* that it will apply to each Module that
|
||||
is added via addModuleSet:
|
||||
|
||||
.. code-block:: c++

  class KaleidoscopeJIT {
  private:
    std::unique_ptr<TargetMachine> TM;
    const DataLayout DL;
    ObjectLinkingLayer<> ObjectLayer;
    IRCompileLayer<decltype(ObjectLayer)> CompileLayer;

    typedef std::function<std::unique_ptr<Module>(std::unique_ptr<Module>)>
      OptimizeFunction;

    IRTransformLayer<decltype(CompileLayer), OptimizeFunction> OptimizeLayer;

  public:
    typedef decltype(OptimizeLayer)::ModuleSetHandleT ModuleHandle;

    KaleidoscopeJIT()
        : TM(EngineBuilder().selectTarget()), DL(TM->createDataLayout()),
          CompileLayer(ObjectLayer, SimpleCompiler(*TM)),
          OptimizeLayer(CompileLayer,
                        [this](std::unique_ptr<Module> M) {
                          return optimizeModule(std::move(M));
                        }) {
      llvm::sys::DynamicLibrary::LoadLibraryPermanently(nullptr);
    }
|
||||
|
||||
Our extended KaleidoscopeJIT class starts out the same as it did in Chapter 1,
|
||||
but after the CompileLayer we introduce a typedef for our optimization function.
|
||||
In this case we use a std::function (a handy wrapper for "function-like" things)
|
||||
from a single std::unique_ptr<Module> input to a std::unique_ptr<Module> output. With
|
||||
our optimization function typedef in place we can declare our OptimizeLayer,
|
||||
which sits on top of our CompileLayer.
|
||||
|
||||
To initialize our OptimizeLayer we pass it a reference to the CompileLayer
|
||||
below (standard practice for layers), and we initialize the OptimizeFunction
|
||||
using a lambda. In the lambda, we just call out to the "optimizeModule" function
|
||||
that we will define below.
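The pattern of storing a lambda that captures ``this`` in a std::function member can be tried out without LLVM. The following is only a minimal sketch: ``ToyModule`` and ``ToyJIT`` are hypothetical stand-ins invented for illustration, not LLVM types, and the "optimization" just sets a flag.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for llvm::Module so the sketch builds without LLVM.
struct ToyModule {
  std::string Name;
  bool Optimized = false;
};

// Hypothetical stand-in for KaleidoscopeJIT, reduced to the optimize hook.
class ToyJIT {
public:
  // Same shape as the OptimizeFunction typedef above: module in, module out.
  using OptimizeFunction =
      std::function<std::unique_ptr<ToyModule>(std::unique_ptr<ToyModule>)>;

  ToyJIT()
      : Optimize([this](std::unique_ptr<ToyModule> M) {
          // Forward to the private member, as the tutorial's lambda does.
          return optimizeModule(std::move(M));
        }) {}

  // The stored lambda captures 'this', so a copy would point at the
  // original object. Forbid copying to keep the capture valid.
  ToyJIT(const ToyJIT &) = delete;
  ToyJIT &operator=(const ToyJIT &) = delete;

  // Runs the stored function on the module and hands the result back.
  std::unique_ptr<ToyModule> add(std::unique_ptr<ToyModule> M) {
    assert(M && "expected a non-null module");
    return Optimize(std::move(M));
  }

  unsigned numOptimized() const { return NumOptimized; }

private:
  std::unique_ptr<ToyModule> optimizeModule(std::unique_ptr<ToyModule> M) {
    M->Optimized = true; // placeholder for real IR optimization
    ++NumOptimized;
    return M;
  }

  unsigned NumOptimized = 0;
  OptimizeFunction Optimize;
};
```

Ownership of the module moves into the function and back out again, which is why the typedef takes and returns a std::unique_ptr rather than a reference.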
|
||||
|
||||
.. code-block:: c++

  // ...
  auto Resolver = createLambdaResolver(
      [&](const std::string &Name) {
        if (auto Sym = OptimizeLayer.findSymbol(Name, false))
          return Sym.toRuntimeDyldSymbol();
        return RuntimeDyld::SymbolInfo(nullptr);
      },
  // ...
  // Add the set to the JIT with the resolver we created above and a newly
  // created SectionMemoryManager.
  return OptimizeLayer.addModuleSet(std::move(Ms),
                                    make_unique<SectionMemoryManager>(),
                                    std::move(Resolver));
  // ...

  // ...
  return OptimizeLayer.findSymbol(MangledNameStream.str(), true);
  // ...

  // ...
  OptimizeLayer.removeModuleSet(H);
  // ...
|
||||
|
||||
Next we need to replace references to 'CompileLayer' with references to
|
||||
OptimizeLayer in our key methods: addModule, findSymbol, and removeModule. In
|
||||
addModule we need to be careful to replace both references: the findSymbol call
|
||||
inside our resolver, and the call through to addModuleSet.
|
||||
|
||||
.. code-block:: c++

  std::unique_ptr<Module> optimizeModule(std::unique_ptr<Module> M) {
    // Create a function pass manager.
    auto FPM = llvm::make_unique<legacy::FunctionPassManager>(M.get());

    // Add some optimizations.
    FPM->add(createInstructionCombiningPass());
    FPM->add(createReassociatePass());
    FPM->add(createGVNPass());
    FPM->add(createCFGSimplificationPass());
    FPM->doInitialization();

    // Run the optimizations over all functions in the module being added to
    // the JIT.
    for (auto &F : *M)
      FPM->run(F);

    return M;
  }
|
||||
|
||||
At the bottom of our JIT we add a private method to do the actual optimization:
|
||||
*optimizeModule*. This function sets up a FunctionPassManager, adds some passes
|
||||
to it, runs it over every function in the module, and then returns the mutated
|
||||
module. The specific optimizations used are the same ones used in
|
||||
`Chapter 4 <LangImpl4.html>`_ of the "Implementing a language with LLVM"
|
||||
tutorial series -- readers may visit that chapter for a more in-depth
|
||||
discussion of them, and of IR optimization in general.
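To make the "set up a pass manager, add passes, run it over every function" shape concrete without pulling in LLVM, here is a toy model. Everything in it (``ToyFunction``, ``ToyProgram``, ``ToyFunctionPassManager``, ``removeNoOps``, ``optimizeProgram``) is hypothetical and invented for this sketch. It only mirrors the control flow of optimizeModule, not the behavior of the real legacy::FunctionPassManager.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical function body: a list of opcodes where 0 is a no-op.
struct ToyFunction {
  std::vector<int> Insts;
};

// Hypothetical module: just a list of functions.
struct ToyProgram {
  std::vector<ToyFunction> Functions;
};

// Toy pass manager: holds an ordered list of passes and applies each one
// to a function. A pass returns true if it changed the function.
class ToyFunctionPassManager {
public:
  using Pass = std::function<bool(ToyFunction &)>;

  void add(Pass P) {
    assert(P && "pass must be callable");
    Passes.push_back(std::move(P));
  }

  bool run(ToyFunction &F) {
    bool Changed = false;
    for (auto &P : Passes)
      Changed = P(F) || Changed;
    return Changed;
  }

private:
  std::vector<Pass> Passes;
};

// Example pass: delete every no-op instruction.
bool removeNoOps(ToyFunction &F) {
  auto OldSize = F.Insts.size();
  F.Insts.erase(std::remove(F.Insts.begin(), F.Insts.end(), 0),
                F.Insts.end());
  return F.Insts.size() != OldSize;
}

// Same outline as optimizeModule: configure, run per function, return.
ToyProgram optimizeProgram(ToyProgram P) {
  ToyFunctionPassManager FPM;
  FPM.add(removeNoOps);
  for (auto &F : P.Functions)
    FPM.run(F);
  return P;
}
```

As in optimizeModule, the program is taken by value and returned, so the caller sees optimization as a pure "module in, module out" step.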
|
||||
|
||||
And that's it: When a module is added to our JIT the OptimizeLayer will now
|
||||
pass it to our optimizeModule function before passing the transformed module
|
||||
on to the CompileLayer below. Of course, we could have called optimizeModule
|
||||
directly in our addModule function and not gone to the bother of using the
|
||||
IRTransformLayer, but it gives us an opportunity to see how layers compose, and
|
||||
how one can be implemented, because IRTransformLayer turns out to be one of
|
||||
the simplest implementations of the *layer* concept that can be devised:
|
||||
|
||||
.. code-block:: c++

  template <typename BaseLayerT, typename TransformFtor>
  class IRTransformLayer {
  public:
    typedef typename BaseLayerT::ModuleSetHandleT ModuleSetHandleT;

    IRTransformLayer(BaseLayerT &BaseLayer,
                     TransformFtor Transform = TransformFtor())
      : BaseLayer(BaseLayer), Transform(std::move(Transform)) {}

    template <typename ModuleSetT, typename MemoryManagerPtrT,
              typename SymbolResolverPtrT>
    ModuleSetHandleT addModuleSet(ModuleSetT Ms,
                                  MemoryManagerPtrT MemMgr,
                                  SymbolResolverPtrT Resolver) {

      for (auto I = Ms.begin(), E = Ms.end(); I != E; ++I)
        *I = Transform(std::move(*I));

      return BaseLayer.addModuleSet(std::move(Ms), std::move(MemMgr),
                                    std::move(Resolver));
    }

    void removeModuleSet(ModuleSetHandleT H) { BaseLayer.removeModuleSet(H); }

    JITSymbol findSymbol(const std::string &Name, bool ExportedSymbolsOnly) {
      return BaseLayer.findSymbol(Name, ExportedSymbolsOnly);
    }

    JITSymbol findSymbolIn(ModuleSetHandleT H, const std::string &Name,
                           bool ExportedSymbolsOnly) {
      return BaseLayer.findSymbolIn(H, Name, ExportedSymbolsOnly);
    }

    void emitAndFinalize(ModuleSetHandleT H) {
      BaseLayer.emitAndFinalize(H);
    }

    TransformFtor& getTransform() { return Transform; }

    const TransformFtor& getTransform() const { return Transform; }

  private:
    BaseLayerT &BaseLayer;
    TransformFtor Transform;
  };
|
||||
|
||||
This is the whole definition of IRTransformLayer, from
|
||||
``llvm/include/llvm/ExecutionEngine/Orc/IRTransformLayer.h``, stripped of its
|
||||
comments. It is a template class with two template arguments: ``BaseLayerT`` and
|
||||
``TransformFtor`` that provide the type of the base layer, and the type of the
|
||||
"transform functor" (in our case a std::function) respectively. The body of the
|
||||
class is concerned with two very simple jobs: (1) Running every IR Module that
|
||||
is added with addModuleSet through the transform functor, and (2) conforming to
|
||||
the ORC layer interface, which is:
|
||||
|
||||
+------------------+-----------------------------------------------------------+
| Interface        | Description                                               |
+==================+===========================================================+
| ModuleSetHandleT | Provides a handle that can be used to identify a module   |
|                  | set when calling findSymbolIn, removeModuleSet, or        |
|                  | emitAndFinalize.                                          |
+------------------+-----------------------------------------------------------+
| addModuleSet     | Takes a given set of Modules and makes them "available    |
|                  | for execution". This means that symbols in those modules  |
|                  | should be searchable via findSymbol and findSymbolIn, and |
|                  | the address of the symbols should be read/writable (for   |
|                  | data symbols), or executable (for function symbols) after |
|                  | JITSymbol::getAddress() is called. Note: This means that  |
|                  | addModuleSet doesn't have to compile (or do any other     |
|                  | work) up-front. It *can*, like IRCompileLayer, act        |
|                  | eagerly, but it can also simply record the module and     |
|                  | take no further action until somebody calls               |
|                  | JITSymbol::getAddress(). In IRTransformLayer's case       |
|                  | addModuleSet eagerly applies the transform functor to     |
|                  | each module in the set, then passes the resulting set     |
|                  | of mutated modules down to the layer below.               |
+------------------+-----------------------------------------------------------+
| removeModuleSet  | Removes a set of modules from the JIT. Code or data       |
|                  | defined in these modules will no longer be available, and |
|                  | the memory holding the JIT'd definitions will be freed.   |
+------------------+-----------------------------------------------------------+
| findSymbol       | Searches for the named symbol in all modules that have    |
|                  | previously been added via addModuleSet (and not yet       |
|                  | removed by a call to removeModuleSet). In                 |
|                  | IRTransformLayer we just pass the query on to the layer   |
|                  | below. In our REPL this is our default way to search for  |
|                  | function definitions.                                     |
+------------------+-----------------------------------------------------------+
| findSymbolIn     | Searches for the named symbol in the module set indicated |
|                  | by the given ModuleSetHandleT. This is just an optimized  |
|                  | search, better for lookup-speed when you know exactly     |
|                  | where a symbol definition should be found. In             |
|                  | IRTransformLayer we just pass this query on to the layer  |
|                  | below. In our REPL we use this method to search for       |
|                  | functions representing top-level expressions, since we    |
|                  | know exactly where we'll find them: in the top-level      |
|                  | expression module we just added.                          |
+------------------+-----------------------------------------------------------+
| emitAndFinalize  | Forces all of the actions required to make the code and   |
|                  | data in a module set (represented by a ModuleSetHandleT)  |
|                  | accessible. Behaves as if some symbol in the set had been |
|                  | searched for and JITSymbol::getAddress() called. This     |
|                  | is rarely needed, but can be useful when dealing with     |
|                  | layers that usually behave lazily if the user wants to    |
|                  | trigger early compilation (for example, to use idle CPU   |
|                  | time to eagerly compile code in the background).          |
+------------------+-----------------------------------------------------------+
|
||||
|
||||
This interface attempts to capture the natural operations of a JIT (with some
|
||||
wrinkles like emitAndFinalize for performance), similar to the basic JIT API
|
||||
operations we identified in Chapter 1. Conforming to the layer concept allows
|
||||
classes to compose neatly by implementing their behaviors in terms of these
|
||||
same operations, carried out on the layer below. For example, an eager layer
|
||||
(like IRTransformLayer) can implement addModuleSet by running each module in the
|
||||
set through its transform up-front and immediately passing the result to the
|
||||
layer below. A lazy layer, by contrast, could implement addModuleSet by
|
||||
squirreling away the modules and doing no other up-front work, but applying the
|
||||
transform (and calling addModuleSet on the layer below) when the client calls
|
||||
findSymbol instead. The JIT'd program behavior will be the same either way, but
|
||||
these choices will have different performance characteristics: Doing work
|
||||
eagerly means the JIT takes longer up-front, but proceeds smoothly once this is
|
||||
done. Deferring work allows the JIT to get up-and-running quickly, but will
|
||||
force the JIT to pause and wait whenever some code or data is needed that hasn't
|
||||
already been processed.
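The lazy alternative described above can be sketched in a few lines. This is a deliberately simplified model and not ORC code: ``SymbolList``, ``ToyBaseLayer`` and ``LazyTransformLayer`` are hypothetical names, a "module" is just the list of symbol names it defines, and the first lookup flushes *every* pending module rather than only the one that defines the requested symbol.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an IR module: the names it defines.
using SymbolList = std::vector<std::string>;
using ToyTransform = std::function<SymbolList(SymbolList)>;

// Hypothetical bottom layer: remembers which symbols it has been given.
class ToyBaseLayer {
public:
  void addModule(SymbolList M) {
    for (auto &Name : M)
      Symbols.insert(std::move(Name));
  }

  bool findSymbol(const std::string &Name) const {
    return Symbols.count(Name) != 0;
  }

private:
  std::set<std::string> Symbols;
};

// Lazy layer: addModule only records the module. The transform runs, and
// the layer below is populated, the first time anybody looks up a symbol.
class LazyTransformLayer {
public:
  LazyTransformLayer(ToyBaseLayer &Base, ToyTransform Transform)
      : Base(Base), Transform(std::move(Transform)) {
    assert(this->Transform && "transform must be callable");
  }

  void addModule(SymbolList M) { Pending.push_back(std::move(M)); }

  bool findSymbol(const std::string &Name) {
    // Simplification: transform and hand down everything still pending.
    for (auto &M : Pending)
      Base.addModule(Transform(std::move(M)));
    Pending.clear();
    return Base.findSymbol(Name);
  }

  std::size_t numPending() const { return Pending.size(); }

private:
  ToyBaseLayer &Base;
  ToyTransform Transform;
  std::vector<SymbolList> Pending;
};
```

An eager variant would differ only in addModule, which would call the transform and the base layer immediately. Clients see the same answers from findSymbol either way, and only the moment at which the work happens changes.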
|
||||
|
||||
Our current REPL is eager: Each function definition is optimized and compiled as
|
||||
soon as it's typed in. If we were to make the transform layer lazy (but not
|
||||
change things otherwise) we could defer optimization until the first time we
|
||||
reference a function in a top-level expression (see if you can figure out why,
|
||||
then check out the answer below [1]_). In the next chapter, however, we'll
|
||||
introduce fully lazy compilation, in which functions aren't compiled until
|
||||
they're first called at run-time. At this point the trade-offs get much more
|
||||
interesting: the lazier we are, the quicker we can start executing the first
|
||||
function, but the more often we'll have to pause to compile newly encountered
|
||||
functions. If we only code-gen lazily, but optimize eagerly, we'll have a slow
|
||||
startup (while everything is optimized) but relatively short pauses as each
|
||||
function just passes through code-gen. If we both optimize and code-gen lazily
|
||||
we can start executing the first function more quickly, but we'll have longer
|
||||
pauses as each function has to be both optimized and code-gen'd when it's first
|
||||
executed. Things become even more interesting if we consider interprocedural
|
||||
optimizations like inlining, which must be performed eagerly. These are
|
||||
complex trade-offs, and there is no one-size-fits-all solution to them, but by
|
||||
providing composable layers we leave the decisions to the person implementing
|
||||
the JIT, and make it easy for them to experiment with different configurations.
|
||||
|
||||
`Next: Adding Per-function Lazy Compilation <BuildingAJIT3.html>`_
|
||||
|
||||
Full Code Listing
=================
|
||||
@ -47,4 +321,9 @@ Here is the code:
|
||||
.. literalinclude:: ../../examples/Kaleidoscope/BuildingAJIT/Chapter2/KaleidoscopeJIT.h
   :language: c++
|
||||
|
||||
`Next: Adding Per-function Lazy Compilation <BuildingAJIT3.html>`_
|
||||
.. [1] When we add our top-level expression to the JIT, any calls to functions
       that we defined earlier will appear to the ObjectLinkingLayer as
       external symbols. The ObjectLinkingLayer will call the SymbolResolver
       that we defined in addModuleSet, which in turn calls findSymbol on the
       OptimizeLayer, at which point even a lazy transform layer will have to
       do its work.
|
||||
|