mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-01-31 12:41:49 +01:00

[BuildingAJIT] Update chapter 1 to use the ORCv2 APIs.

llvm-svn: 344667
Lang Hames 2018-10-17 03:34:09 +00:00
parent 5edf1afbaa
commit 6abcd4dd50
3 changed files with 273 additions and 365 deletions


@ -8,18 +8,19 @@ Building a JIT: Starting out with KaleidoscopeJIT
Chapter 1 Introduction
======================
**Warning: This text is currently out of date due to ORC API updates.**
**Warning: This tutorial is currently being updated to account for ORC API
changes. Only Chapter 1 is up-to-date.**
**The example code has been updated and can be used. The text will be updated
once the API churn dies down.**
**Example code from Chapters 2 to 4 will compile and run, but has not been
updated**
Welcome to Chapter 1 of the "Building an ORC-based JIT in LLVM" tutorial. This
tutorial runs through the implementation of a JIT compiler using LLVM's
On-Request-Compilation (ORC) APIs. It begins with a simplified version of the
KaleidoscopeJIT class used in the
`Implementing a language with LLVM <LangImpl01.html>`_ tutorials and then
introduces new features like optimization, lazy compilation and remote
execution.
introduces new features like concurrent compilation, optimization, lazy
compilation and remote execution.
The goal of this tutorial is to introduce you to LLVM's ORC JIT APIs, show how
these APIs interact with other parts of LLVM, and to teach you how to recombine
@ -45,11 +46,9 @@ The structure of the tutorial is:
- `Chapter #5 <BuildingAJIT5.html>`_: Add process isolation by JITing code into
a remote process with reduced privileges using the JIT Remote APIs.
To provide input for our JIT we will use the Kaleidoscope REPL from
`Chapter 7 <LangImpl07.html>`_ of the "Implementing a language in LLVM tutorial",
with one minor modification: We will remove the FunctionPassManager from the
code for that chapter and replace it with optimization support in our JIT class
in Chapter #2.
To provide input for our JIT we will use a lightly modified version of the
Kaleidoscope REPL from `Chapter 7 <LangImpl07.html>`_ of the "Implementing a
language with LLVM" tutorial.
Finally, a word on API generations: ORC is the 3rd generation of LLVM JIT API.
It was preceded by MCJIT, and before that by the (now deleted) legacy JIT.
@ -63,14 +62,13 @@ JIT API Basics
The purpose of a JIT compiler is to compile code "on-the-fly" as it is needed,
rather than compiling whole programs to disk ahead of time as a traditional
compiler does. To support that aim our initial, bare-bones JIT API will be:
compiler does. To support that aim our initial, bare-bones JIT API will have
just two functions:
1. Handle addModule(Module &M) -- Make the given IR module available for
execution.
2. JITSymbol findSymbol(const std::string &Name) -- Search for pointers to
2. Expected<JITEvaluatedSymbol> lookup(StringRef Name) -- Search for pointers to
symbols (functions or variables) that have been added to the JIT.
3. void removeModule(Handle H) -- Remove a module from the JIT, releasing any
memory that had been used for the compiled code.
A basic use-case for this API, executing the 'main' function from a module,
will look like:
@ -79,16 +77,15 @@ will look like:
std::unique_ptr<Module> M = buildModule();
JIT J;
Handle H = J.addModule(*M);
int (*Main)(int, char*[]) = (int(*)(int, char*[]))J.getSymbolAddress("main");
J.addModule(*M);
auto *Main = (int(*)(int, char*[]))J.lookup("main").getAddress();
int Result = Main();
J.removeModule(H);
The APIs that we build in these tutorials will all be variations on this simple
theme. Behind the API we will refine the implementation of the JIT to add
support for optimization and lazy compilation. Eventually we will extend the
API itself to allow higher-level program representations (e.g. ASTs) to be
added to the JIT.
theme. Behind this API we will refine the implementation of the JIT to add
support for concurrent compilation, optimization and lazy compilation.
Eventually we will extend the API itself to allow higher-level program
representations (e.g. ASTs) to be added to the JIT.
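The shape of this two-function API can be illustrated without any LLVM types. The following is only a toy sketch: ``ToyJIT``, ``ToyModule``, ``ToyFunction`` and ``runMain`` are hypothetical names invented for illustration, a "module" is modelled as a map of callables, and ``std::optional`` stands in for the ``Expected`` result so that a failed lookup has to be checked before use.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <optional>
#include <string>

// Hypothetical stand-in for an IR module: a named set of callable definitions.
using ToyFunction = std::function<int()>;
using ToyModule = std::map<std::string, ToyFunction>;

// Toy model of the two-function API. This is not the real KaleidoscopeJIT.
class ToyJIT {
  std::map<std::string, ToyFunction> Symbols;

public:
  // Make every definition in the module available for lookup.
  void addModule(const ToyModule &M) {
    for (const auto &Def : M)
      Symbols[Def.first] = Def.second;
  }

  // Return the definition for Name, or std::nullopt if nothing defines it.
  std::optional<ToyFunction> lookup(const std::string &Name) const {
    auto It = Symbols.find(Name);
    if (It == Symbols.end())
      return std::nullopt;
    return It->second;
  }
};

// Usage: look up "main" and run it only if the lookup succeeded.
inline int runMain(const ToyJIT &J) {
  if (auto Main = J.lookup("main"))
    return (*Main)();
  return -1; // Lookup failed.
}
```

The point of the sketch is the calling pattern: add first, look up second, and check the lookup result before calling through it.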
KaleidoscopeJIT
===============
@ -100,12 +97,10 @@ the REPL code from `Chapter 7 <LangImpl07.html>`_ of that tutorial to supply the
input for our JIT: Each time the user enters an expression the REPL will add a
new IR module containing the code for that expression to the JIT. If the
expression is a top-level expression like '1+1' or 'sin(x)', the REPL will also
use the findSymbol method of our JIT class to find and execute the code for the
expression, and then use the removeModule method to remove the code again
(since there's no way to re-invoke an anonymous expression). In later chapters
of this tutorial we'll modify the REPL to enable new interactions with our JIT
class, but for now we will take this setup for granted and focus our attention on
the implementation of our JIT itself.
use the lookup method of our JIT class to find and execute the code for the
expression. In later chapters of this tutorial we will modify the REPL to enable
new interactions with our JIT class, but for now we will take this setup for
granted and focus our attention on the implementation of our JIT itself.
Our KaleidoscopeJIT class is defined in the KaleidoscopeJIT.h header. After the
usual include guards and #includes [2]_, we get to the definition of our class:
@ -115,216 +110,154 @@ usual include guards and #includes [2]_, we get to the definition of our class:
#ifndef LLVM_EXECUTIONENGINE_ORC_KALEIDOSCOPEJIT_H
#define LLVM_EXECUTIONENGINE_ORC_KALEIDOSCOPEJIT_H
#include "llvm/ADT/STLExtras.h"
#include "llvm/ExecutionEngine/ExecutionEngine.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/ExecutionEngine/JITSymbol.h"
#include "llvm/ExecutionEngine/RTDyldMemoryManager.h"
#include "llvm/ExecutionEngine/SectionMemoryManager.h"
#include "llvm/ExecutionEngine/Orc/CompileUtils.h"
#include "llvm/ExecutionEngine/Orc/Core.h"
#include "llvm/ExecutionEngine/Orc/ExecutionUtils.h"
#include "llvm/ExecutionEngine/Orc/IRCompileLayer.h"
#include "llvm/ExecutionEngine/Orc/LambdaResolver.h"
#include "llvm/ExecutionEngine/Orc/JITTargetMachineBuilder.h"
#include "llvm/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.h"
#include "llvm/ExecutionEngine/SectionMemoryManager.h"
#include "llvm/IR/DataLayout.h"
#include "llvm/IR/Mangler.h"
#include "llvm/Support/DynamicLibrary.h"
#include "llvm/Support/raw_ostream.h"
#include "llvm/Target/TargetMachine.h"
#include <algorithm>
#include "llvm/IR/LLVMContext.h"
#include <memory>
#include <string>
#include <vector>
namespace llvm {
namespace orc {
class KaleidoscopeJIT {
private:
std::unique_ptr<TargetMachine> TM;
const DataLayout DL;
RTDyldObjectLinkingLayer ObjectLayer;
IRCompileLayer<decltype(ObjectLayer), SimpleCompiler> CompileLayer;
ExecutionSession ES;
RTDyldObjectLinkingLayer ObjectLayer{ES, getMemoryMgr};
IRCompileLayer CompileLayer{ES, ObjectLayer,
ConcurrentIRCompiler(getJTMB())};
DataLayout DL{cantFail(getJTMB().getDefaultDataLayoutForTarget())};
MangleAndInterner Mangle{ES, DL};
ThreadSafeContext Ctx{llvm::make_unique<LLVMContext>()};
static JITTargetMachineBuilder getJTMB() {
return cantFail(JITTargetMachineBuilder::detectHost());
}
static std::unique_ptr<SectionMemoryManager> getMemoryMgr(VModuleKey) {
return llvm::make_unique<SectionMemoryManager>();
}
We begin with the ExecutionSession member, ``ES``, which provides context for
our running JIT'd code. It holds the string pool for symbol names, the global
mutex that guards the critical sections of JIT operations, error logging
facilities, and other utilities. For basic use cases such as this, a default
constructed ExecutionSession is all we will need. We will investigate more
advanced uses of ExecutionSession in later chapters. Following our
ExecutionSession we have two ORC *layers*: an RTDyldObjectLinkingLayer and an
IRCompileLayer. We will be talking more about layers in the next chapter, but
for now you can think of them as analogous to LLVM Passes: they wrap up useful
JIT utilities behind an easy to compose interface. The first layer, ObjectLayer,
is the foundation of our JIT: it takes in-memory object files produced by a
compiler and links them on the fly to make them executable. This
JIT-on-top-of-a-linker design was introduced in MCJIT, however the linker was
hidden inside the MCJIT class. In ORC we expose the linker so that clients can
access and configure it directly if they need to. In this tutorial our
ObjectLayer will just be used to support the next layer in our stack: the
CompileLayer, which will be responsible for taking LLVM IR, compiling it, and
passing the resulting in-memory object files down to the object linking layer
below. Our ObjectLayer is constructed with a reference to the ExecutionSession
and the getMemoryMgr utility function, which it uses to generate a new memory
manager for each object file as it is added. Next up is our CompileLayer, which
is initialized with a reference to the ExecutionSession, a reference to the
ObjectLayer (where it will send the objects produced by the compiler), and an IR
compiler instance. In this case we are using the ConcurrentIRCompiler class
which is constructed with a JITTargetMachineBuilder and can be called to compile
IR concurrently from several threads (though in this chapter we will only use
one).
Following the ExecutionSession and layers we have three supporting member
variables. The DataLayout, ``DL``; and MangleAndInterner, ``Mangle`` members are
used to support portable lookups based on IR symbol names (more on that when we
get to our ``lookup`` function below), and the ThreadSafeContext member,
``Ctx``, manages an LLVMContext that can be used while building IR Modules for
the JIT.
After that, we have two static utility functions. The ``getJTMB()`` function
returns a JITTargetMachineBuilder, which is a factory for building LLVM
TargetMachine instances that are used by the compiler. In this first tutorial we
will only need one (implicitly created) TargetMachine, but in future tutorials
that enable concurrent compilation we will need one per thread. This is why we
use a target machine builder, rather than a single TargetMachine. (Note: older
LLVM JIT APIs that did not support concurrent compilation were constructed with
a single TargetMachine.) The ``getMemoryMgr()`` function constructs instances
of RuntimeDyld::MemoryManager, and is used by the linking layer to generate a
new memory manager for each object file.
.. code-block:: c++
public:
using ModuleHandle = decltype(CompileLayer)::ModuleHandleT;
Our class begins with four members: A TargetMachine, TM, which will be used to
build our LLVM compiler instance; A DataLayout, DL, which will be used for
symbol mangling (more on that later), and two ORC *layers*: an
RTDyldObjectLinkingLayer and a CompileLayer. We'll be talking more about layers
in the next chapter, but for now you can think of them as analogous to LLVM
Passes: they wrap up useful JIT utilities behind an easy to compose interface.
The first layer, ObjectLayer, is the foundation of our JIT: it takes in-memory
object files produced by a compiler and links them on the fly to make them
executable. This JIT-on-top-of-a-linker design was introduced in MCJIT, however
the linker was hidden inside the MCJIT class. In ORC we expose the linker so
that clients can access and configure it directly if they need to. In this
tutorial our ObjectLayer will just be used to support the next layer in our
stack: the CompileLayer, which will be responsible for taking LLVM IR, compiling
it, and passing the resulting in-memory object files down to the object linking
layer below.
KaleidoscopeJIT() {
ES.getMainJITDylib().setGenerator(
cantFail(DynamicLibrarySearchGenerator::GetForCurrentProcess(DL)));
}
That's it for member variables, after that we have a single typedef:
ModuleHandle. This is the handle type that will be returned from our JIT's
addModule method, and can be passed to the removeModule method to remove a
module. The IRCompileLayer class already provides a convenient handle type
(IRCompileLayer::ModuleHandleT), so we just alias our ModuleHandle to this.
const DataLayout &getDataLayout() const { return DL; }
LLVMContext &getContext() { return *Ctx.getContext(); }
Next up we have our class constructor. Our members have already been
initialized, so the one thing that remains to do is to tweak the configuration
of the *JITDylib* that we will store our code in. We want to modify this dylib
to contain not only the symbols that we add to it, but also the symbols from
our REPL process as well. We do this by attaching a
``DynamicLibrarySearchGenerator`` instance using the
``DynamicLibrarySearchGenerator::GetForCurrentProcess`` method.
Following the constructor we have the ``getDataLayout()`` and ``getContext()``
methods. These are used to make data structures created and managed by the JIT
(especially the LLVMContext) available to the REPL code that will build our
IR modules.
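The role of the generator can be pictured as a fallback that is consulted only when the dylib has no definition of its own for a name. The sketch below is a toy model under that assumption: ``ToyDylib``, ``Address`` and the callback signature are hypothetical and are not ORC classes. Caching the generated definition is a choice made for this toy, not a statement about ORC internals.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <optional>
#include <string>
#include <utility>

using Address = unsigned long long;
using ToyGenerator = std::function<std::optional<Address>(const std::string &)>;

// Toy dylib: a symbol table plus an optional fallback generator.
class ToyDylib {
  std::map<std::string, Address> Defined;
  ToyGenerator Generator;

public:
  // Add a definition that the JIT itself provides.
  void define(const std::string &Name, Address Addr) { Defined[Name] = Addr; }

  // Install the fallback used for names the dylib does not define.
  void setGenerator(ToyGenerator G) { Generator = std::move(G); }

  // Search our own definitions first, then ask the generator.
  std::optional<Address> lookup(const std::string &Name) {
    auto It = Defined.find(Name);
    if (It != Defined.end())
      return It->second;
    if (Generator) {
      if (auto Addr = Generator(Name)) {
        Defined[Name] = *Addr; // Remember the generated definition.
        return Addr;
      }
    }
    return std::nullopt; // Neither the dylib nor the generator knows Name.
  }
};
```

In the tutorial's JIT the fallback searches the symbols of the REPL process, which is how JIT'd code can call functions such as ``sin`` from the host.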
.. code-block:: c++
KaleidoscopeJIT()
: TM(EngineBuilder().selectTarget()), DL(TM->createDataLayout()),
ObjectLayer([]() { return std::make_shared<SectionMemoryManager>(); }),
CompileLayer(ObjectLayer, SimpleCompiler(*TM)) {
llvm::sys::DynamicLibrary::LoadLibraryPermanently(nullptr);
void addModule(std::unique_ptr<Module> M) {
cantFail(CompileLayer.add(ES.getMainJITDylib(),
ThreadSafeModule(std::move(M), Ctx)));
}
TargetMachine &getTargetMachine() { return *TM; }
Next up we have our class constructor. We begin by initializing TM using the
EngineBuilder::selectTarget helper method which constructs a TargetMachine for
the current process. Then we use our newly created TargetMachine to initialize
DL, our DataLayout. After that we need to initialize our ObjectLayer. The
ObjectLayer requires a function object that will build a JIT memory manager for
each module that is added (a JIT memory manager manages memory allocations,
memory permissions, and registration of exception handlers for JIT'd code). For
this we use a lambda that returns a SectionMemoryManager, an off-the-shelf
utility that provides all the basic memory management functionality required for
this chapter. Next we initialize our CompileLayer. The CompileLayer needs two
things: (1) A reference to our object layer, and (2) a compiler instance to use
to perform the actual compilation from IR to object files. We use the
off-the-shelf SimpleCompiler instance for now. Finally, in the body of the
constructor, we call the DynamicLibrary::LoadLibraryPermanently method with a
nullptr argument. Normally the LoadLibraryPermanently method is called with the
path of a dynamic library to load, but when passed a null pointer it will 'load'
the host process itself, making its exported symbols available for execution.
.. code-block:: c++
ModuleHandle addModule(std::unique_ptr<Module> M) {
// Build our symbol resolver:
// Lambda 1: Look back into the JIT itself to find symbols that are part of
// the same "logical dylib".
// Lambda 2: Search for external symbols in the host process.
auto Resolver = createLambdaResolver(
[&](const std::string &Name) {
if (auto Sym = CompileLayer.findSymbol(Name, false))
return Sym;
return JITSymbol(nullptr);
},
[](const std::string &Name) {
if (auto SymAddr =
RTDyldMemoryManager::getSymbolAddressInProcess(Name))
return JITSymbol(SymAddr, JITSymbolFlags::Exported);
return JITSymbol(nullptr);
});
// Add the set to the JIT with the resolver we created above and a newly
// created SectionMemoryManager.
return cantFail(CompileLayer.addModule(std::move(M),
std::move(Resolver)));
Expected<JITEvaluatedSymbol> lookup(StringRef Name) {
return ES.lookup({&ES.getMainJITDylib()}, Mangle(Name.str()));
}
Now we come to the first of our JIT API methods: addModule. This method is
responsible for adding IR to the JIT and making it available for execution. In
this initial implementation of our JIT we will make our modules "available for
execution" by adding them straight to the CompileLayer, which will immediately
compile them. In later chapters we will teach our JIT to defer compilation
of individual functions until they're actually called.
execution" by adding them to the CompileLayer, which will in turn store the
Module in the main JITDylib. This process will create new symbol table entries
in the JITDylib for each definition in the module, and will defer compilation of
the module until any of its definitions is looked up. Note that this is not lazy
compilation: just referencing a definition, even if it is never used, will be
enough to trigger compilation. In later chapters we will teach our JIT to defer
compilation of functions until they're actually called. To add our Module we
must first wrap it in a ThreadSafeModule instance, which manages the lifetime of
the Module's LLVMContext (our Ctx member) in a thread-friendly way. In our
example, all modules will share the Ctx member, which will exist for the
duration of the JIT. Once we switch to concurrent compilation in later chapters
we will use a new context per module.
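The difference between "added" and "compiled" can be sketched with a toy symbol table. In this hypothetical model (``DeferredJIT`` and ``DeferredModule`` are invented names, and "compilation" is just a flag and a counter), adding a module only records which names it defines, and the first lookup of any of those names compiles the whole module once.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Toy module: the names it defines plus a flag recording compilation.
struct DeferredModule {
  std::vector<std::string> Definitions;
  bool Compiled = false;
};

class DeferredJIT {
  // Symbol table: each name points at the module that defines it.
  std::map<std::string, std::shared_ptr<DeferredModule>> SymbolTable;

public:
  int CompileCount = 0; // Number of modules compiled so far.

  // Adding a module only creates symbol table entries; nothing is compiled.
  void addModule(const std::shared_ptr<DeferredModule> &M) {
    for (const auto &Name : M->Definitions)
      SymbolTable[Name] = M;
  }

  // Looking up any definition compiles its whole module, at most once.
  bool lookup(const std::string &Name) {
    auto It = SymbolTable.find(Name);
    if (It == SymbolTable.end())
      return false;
    if (!It->second->Compiled) {
      It->second->Compiled = true;
      ++CompileCount;
    }
    return true;
  }
};
```

Note that the granularity here is the module, not the function: looking up one definition pays for compiling everything in the same module, which is why this is not yet lazy compilation.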
To add our module to the CompileLayer we need to supply both the module and a
symbol resolver. The symbol resolver is responsible for supplying the JIT with
an address for each *external symbol* in the module we are adding. External
symbols are any symbol not defined within the module itself, including calls to
functions outside the JIT and calls to functions defined in other modules that
have already been added to the JIT. (It may seem as though modules added to the
JIT should know about one another by default, but since we would still have to
supply a symbol resolver for references to code outside the JIT it turns out to
be easier to re-use this one mechanism for all symbol resolution.) This has the
added benefit that the user has full control over the symbol resolution
process. Should we search for definitions within the JIT first, then fall back
on external definitions? Or should we prefer external definitions where
available and only JIT code if we don't already have an available
implementation? By using a single symbol resolution scheme we are free to choose
whatever makes the most sense for any given use case.
Building a symbol resolver is made especially easy by the *createLambdaResolver*
function. This function takes two lambdas [3]_ and returns a JITSymbolResolver
instance. The first lambda is used as the implementation of the resolver's
findSymbolInLogicalDylib method, which searches for symbol definitions that
should be thought of as being part of the same "logical" dynamic library as this
Module. If you are familiar with static linking: this means that
findSymbolInLogicalDylib should expose symbols with common linkage and hidden
visibility. If all this sounds foreign you can ignore the details and just
remember that this is the first method that the linker will use to try to find a
symbol definition. If the findSymbolInLogicalDylib method returns a null result
then the linker will call the second symbol resolver method, called findSymbol,
which searches for symbols that should be thought of as external to (but
visible from) the module and its logical dylib. In this tutorial we will adopt
the following simple scheme: All modules added to the JIT will behave as if they
were linked into a single, ever-growing logical dylib. To implement this our
first lambda (the one defining findSymbolInLogicalDylib) will just search for
JIT'd code by calling the CompileLayer's findSymbol method. If we don't find a
symbol in the JIT itself we'll fall back to our second lambda, which implements
findSymbol. This will use the RTDyldMemoryManager::getSymbolAddressInProcess
method to search for the symbol within the program itself. If we can't find a
symbol definition via either of these paths, the JIT will refuse to accept our
module, returning a "symbol not found" error.
Now that we've built our symbol resolver, we're ready to add our module to the
JIT. We do this by calling the CompileLayer's addModule method. The addModule
method returns an ``Expected<CompileLayer::ModuleHandle>``, since in more
advanced JIT configurations it could fail. In our basic configuration we know
that it will always succeed so we use the cantFail utility to assert that no
error occurred, and extract the handle value. Since we have already typedef'd
our ModuleHandle type to be the same as the CompileLayer's handle type, we can
return the unwrapped handle directly.
.. code-block:: c++
JITSymbol findSymbol(const std::string Name) {
std::string MangledName;
raw_string_ostream MangledNameStream(MangledName);
Mangler::getNameWithPrefix(MangledNameStream, Name, DL);
return CompileLayer.findSymbol(MangledNameStream.str(), true);
}
JITTargetAddress getSymbolAddress(const std::string Name) {
return cantFail(findSymbol(Name).getAddress());
}
void removeModule(ModuleHandle H) {
cantFail(CompileLayer.removeModule(H));
}
Now that we can add code to our JIT, we need a way to find the symbols we've
added to it. To do that we call the findSymbol method on our CompileLayer, but
with a twist: We have to *mangle* the name of the symbol we're searching for
first. The ORC JIT components use mangled symbols internally the same way a
static compiler and linker would, rather than using plain IR symbol names. This
allows JIT'd code to interoperate easily with precompiled code in the
application or shared libraries. The kind of mangling will depend on the
DataLayout, which in turn depends on the target platform. To allow us to remain
portable and search based on the un-mangled name, we just re-produce this
mangling ourselves.
Next we have a convenience function, getSymbolAddress, which returns the address
of a given symbol. Like CompileLayer's addModule function, JITSymbol's getAddress
function is allowed to fail [4]_, however we know that it will not in our simple
example, so we wrap it in a call to cantFail.
We now come to the last method in our JIT API: removeModule. This method is
responsible for destructing the MemoryManager and SymbolResolver that were
added with a given module, freeing any resources they were using in the
process. In our Kaleidoscope demo we rely on this method to remove the module
representing the most recent top-level expression, preventing it from being
treated as a duplicate definition when the next top-level expression is
entered. It is generally good to free any module that you know you won't need
to call further, just to free up the resources dedicated to it. However, you
don't strictly need to do this: All resources will be cleaned up when your
JIT class is destructed, if they haven't been freed before then. Like
``CompileLayer::addModule`` and ``JITSymbol::getAddress``, removeModule may
fail in general but will never fail in our example, so we wrap it in a call to
cantFail.
Our last method is ``lookup``, which allows us to look up addresses for
function and variable definitions added to the JIT based on their symbol names.
As noted above, lookup will implicitly trigger compilation for any symbol
that has not already been compiled. Our lookup method calls through to
`ExecutionSession::lookup`, passing in a list of dylibs to search (in our case
just the main dylib), and the symbol name to search for, with a twist: We have
to *mangle* the name of the symbol we're searching for first. The ORC JIT
components use mangled symbols internally the same way a static compiler and
linker would, rather than using plain IR symbol names. This allows JIT'd code
to interoperate easily with precompiled code in the application or shared
libraries. The kind of mangling will depend on the DataLayout, which in turn
depends on the target platform. To allow us to remain portable and search based
on the un-mangled name, we just re-produce this mangling ourselves using our
``Mangle`` member function object.
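As a rough illustration of why the mangling step depends on the platform: Mach-O targets prefix global symbols with an underscore, while ELF targets typically use no prefix. The helper below is a simplified, hypothetical ``mangle`` function that models only that global prefix. Real mangling has more cases, and in the JIT itself the work is done by the ``Mangle`` member rather than by code like this.

```cpp
#include <cassert>
#include <string>

// Toy mangler: prepend the platform's global symbol prefix, if it has one.
// A prefix of '\0' means "this platform uses no prefix".
inline std::string mangle(const std::string &Name, char GlobalPrefix) {
  if (GlobalPrefix == '\0')
    return Name;
  return std::string(1, GlobalPrefix) + Name;
}
```

With this model the IR name ``main`` is searched for as ``_main`` when the prefix is ``'_'`` and as ``main`` when there is no prefix, so the caller can always pass the un-mangled name.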
This brings us to the end of Chapter 1 of Building a JIT. You now have a basic
but fully functioning JIT stack that you can use to take LLVM IR and make it
@ -362,42 +295,26 @@ Here is the code:
.. [2] +-----------------------------+-----------------------------------------------+
| File | Reason for inclusion |
+=============================+===============================================+
| STLExtras.h | LLVM utilities that are useful when working |
| | with the STL. |
| JITSymbol.h | Defines the lookup result type |
| | JITEvaluatedSymbol |
+-----------------------------+-----------------------------------------------+
| ExecutionEngine.h | Access to the EngineBuilder::selectTarget |
| | method. |
| CompileUtils.h | Provides the SimpleCompiler class. |
+-----------------------------+-----------------------------------------------+
| | Access to the |
| RTDyldMemoryManager.h | RTDyldMemoryManager::getSymbolAddressInProcess|
| | method. |
| Core.h | Core utilities such as ExecutionSession and |
| | JITDylib. |
+-----------------------------+-----------------------------------------------+
| CompileUtils.h | Provides the SimpleCompiler class. |
| ExecutionUtils.h | Provides the DynamicLibrarySearchGenerator |
| | class. |
+-----------------------------+-----------------------------------------------+
| IRCompileLayer.h | Provides the IRCompileLayer class. |
| IRCompileLayer.h | Provides the IRCompileLayer class. |
+-----------------------------+-----------------------------------------------+
| | Access the createLambdaResolver function, |
| LambdaResolver.h | which provides easy construction of symbol |
| | resolvers. |
| JITTargetMachineBuilder.h | Provides the JITTargetMachineBuilder class. |
+-----------------------------+-----------------------------------------------+
| RTDyldObjectLinkingLayer.h | Provides the RTDyldObjectLinkingLayer class. |
| RTDyldObjectLinkingLayer.h | Provides the RTDyldObjectLinkingLayer class. |
+-----------------------------+-----------------------------------------------+
| Mangler.h | Provides the Mangler class for platform |
| | specific name-mangling. |
| SectionMemoryManager.h | Provides the SectionMemoryManager class. |
+-----------------------------+-----------------------------------------------+
| DynamicLibrary.h | Provides the DynamicLibrary class, which |
| | makes symbols in the host process searchable. |
| DataLayout.h | Provides the DataLayout class. |
+-----------------------------+-----------------------------------------------+
| | A fast output stream class. We use the |
| raw_ostream.h | raw_string_ostream subclass for symbol |
| | mangling |
| LLVMContext.h | Provides the LLVMContext class. |
+-----------------------------+-----------------------------------------------+
| TargetMachine.h | LLVM target machine description class. |
+-----------------------------+-----------------------------------------------+
.. [3] Actually they don't have to be lambdas, any object with a call operator
will do, including plain old functions or std::functions.
.. [4] ``JITSymbol::getAddress`` will force the JIT to compile the definition of
the symbol if it hasn't already been compiled, and since the compilation
process could fail getAddress must be able to return this failure.


@ -14,84 +14,59 @@
#ifndef LLVM_EXECUTIONENGINE_ORC_KALEIDOSCOPEJIT_H
#define LLVM_EXECUTIONENGINE_ORC_KALEIDOSCOPEJIT_H
#include "llvm/ADT/STLExtras.h"
#include "llvm/ExecutionEngine/ExecutionEngine.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/ExecutionEngine/JITSymbol.h"
#include "llvm/ExecutionEngine/RTDyldMemoryManager.h"
#include "llvm/ExecutionEngine/SectionMemoryManager.h"
#include "llvm/ExecutionEngine/Orc/CompileUtils.h"
#include "llvm/ExecutionEngine/Orc/Core.h"
#include "llvm/ExecutionEngine/Orc/ExecutionUtils.h"
#include "llvm/ExecutionEngine/Orc/IRCompileLayer.h"
#include "llvm/ExecutionEngine/Orc/LambdaResolver.h"
#include "llvm/ExecutionEngine/Orc/JITTargetMachineBuilder.h"
#include "llvm/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.h"
#include "llvm/ExecutionEngine/SectionMemoryManager.h"
#include "llvm/IR/DataLayout.h"
#include "llvm/IR/Mangler.h"
#include "llvm/Support/DynamicLibrary.h"
#include "llvm/Support/raw_ostream.h"
#include "llvm/Target/TargetMachine.h"
#include <algorithm>
#include "llvm/IR/LLVMContext.h"
#include <memory>
#include <string>
#include <vector>
namespace llvm {
namespace orc {
class KaleidoscopeJIT {
private:
ExecutionSession ES;
std::shared_ptr<SymbolResolver> Resolver;
std::unique_ptr<TargetMachine> TM;
const DataLayout DL;
LegacyRTDyldObjectLinkingLayer ObjectLayer;
LegacyIRCompileLayer<decltype(ObjectLayer), SimpleCompiler> CompileLayer;
RTDyldObjectLinkingLayer ObjectLayer{ES, getMemoryMgr};
IRCompileLayer CompileLayer{ES, ObjectLayer,
ConcurrentIRCompiler(getJTMB())};
DataLayout DL{cantFail(getJTMB().getDefaultDataLayoutForTarget())};
MangleAndInterner Mangle{ES, DL};
ThreadSafeContext Ctx{llvm::make_unique<LLVMContext>()};
static JITTargetMachineBuilder getJTMB() {
return cantFail(JITTargetMachineBuilder::detectHost());
}
static std::unique_ptr<SectionMemoryManager> getMemoryMgr() {
return llvm::make_unique<SectionMemoryManager>();
}
public:
KaleidoscopeJIT()
: Resolver(createLegacyLookupResolver(
ES,
[this](const std::string &Name) -> JITSymbol {
if (auto Sym = CompileLayer.findSymbol(Name, false))
return Sym;
else if (auto Err = Sym.takeError())
return std::move(Err);
if (auto SymAddr =
RTDyldMemoryManager::getSymbolAddressInProcess(Name))
return JITSymbol(SymAddr, JITSymbolFlags::Exported);
return nullptr;
},
[](Error Err) { cantFail(std::move(Err), "lookupFlags failed"); })),
TM(EngineBuilder().selectTarget()), DL(TM->createDataLayout()),
ObjectLayer(ES,
[this](VModuleKey) {
return LegacyRTDyldObjectLinkingLayer::Resources{
std::make_shared<SectionMemoryManager>(), Resolver};
}),
CompileLayer(ObjectLayer, SimpleCompiler(*TM)) {
llvm::sys::DynamicLibrary::LoadLibraryPermanently(nullptr);
KaleidoscopeJIT() {
ES.getMainJITDylib().setGenerator(
cantFail(DynamicLibrarySearchGenerator::GetForCurrentProcess(DL)));
}
TargetMachine &getTargetMachine() { return *TM; }
const DataLayout &getDataLayout() const { return DL; }
VModuleKey addModule(std::unique_ptr<Module> M) {
// Add the module to the JIT with a new VModuleKey.
auto K = ES.allocateVModule();
cantFail(CompileLayer.addModule(K, std::move(M)));
return K;
LLVMContext &getContext() { return *Ctx.getContext(); }
void addModule(std::unique_ptr<Module> M) {
cantFail(CompileLayer.add(ES.getMainJITDylib(),
ThreadSafeModule(std::move(M), Ctx)));
}
JITSymbol findSymbol(const std::string Name) {
std::string MangledName;
raw_string_ostream MangledNameStream(MangledName);
Mangler::getNameWithPrefix(MangledNameStream, Name, DL);
return CompileLayer.findSymbol(MangledNameStream.str(), true);
}
JITTargetAddress getSymbolAddress(const std::string Name) {
return cantFail(findSymbol(Name).getAddress());
}
void removeModule(VModuleKey K) {
cantFail(CompileLayer.removeModule(K));
Expected<JITEvaluatedSymbol> lookup(StringRef Name) {
return ES.lookup({&ES.getMainJITDylib()}, Mangle(Name.str()));
}
};


@ -676,10 +676,11 @@ static std::unique_ptr<FunctionAST> ParseDefinition() {
}
/// toplevelexpr ::= expression
static std::unique_ptr<FunctionAST> ParseTopLevelExpr() {
static std::unique_ptr<FunctionAST> ParseTopLevelExpr(unsigned ExprCount) {
if (auto E = ParseExpression()) {
// Make an anonymous proto.
auto Proto = llvm::make_unique<PrototypeAST>("__anon_expr",
auto Proto = llvm::make_unique<PrototypeAST>(("__anon_expr" +
Twine(ExprCount)).str(),
std::vector<std::string>());
return llvm::make_unique<FunctionAST>(std::move(Proto), std::move(E));
}
@ -696,11 +697,11 @@ static std::unique_ptr<PrototypeAST> ParseExtern() {
// Code Generation
//===----------------------------------------------------------------------===//
static LLVMContext TheContext;
static IRBuilder<> Builder(TheContext);
static std::unique_ptr<KaleidoscopeJIT> TheJIT;
static LLVMContext *TheContext;
static std::unique_ptr<IRBuilder<>> Builder;
static std::unique_ptr<Module> TheModule;
static std::map<std::string, AllocaInst *> NamedValues;
static std::unique_ptr<KaleidoscopeJIT> TheJIT;
static std::map<std::string, std::unique_ptr<PrototypeAST>> FunctionProtos;
Value *LogErrorV(const char *Str) {
@ -729,11 +730,11 @@ static AllocaInst *CreateEntryBlockAlloca(Function *TheFunction,
const std::string &VarName) {
IRBuilder<> TmpB(&TheFunction->getEntryBlock(),
TheFunction->getEntryBlock().begin());
return TmpB.CreateAlloca(Type::getDoubleTy(TheContext), nullptr, VarName);
return TmpB.CreateAlloca(Type::getDoubleTy(*TheContext), nullptr, VarName);
}
Value *NumberExprAST::codegen() {
return ConstantFP::get(TheContext, APFloat(Val));
return ConstantFP::get(*TheContext, APFloat(Val));
}
Value *VariableExprAST::codegen() {
@ -743,7 +744,7 @@ Value *VariableExprAST::codegen() {
return LogErrorV("Unknown variable name");
// Load the value.
return Builder.CreateLoad(V, Name.c_str());
return Builder->CreateLoad(V, Name.c_str());
}
Value *UnaryExprAST::codegen() {
@ -755,7 +756,7 @@ Value *UnaryExprAST::codegen() {
if (!F)
return LogErrorV("Unknown unary operator");
return Builder.CreateCall(F, OperandV, "unop");
return Builder->CreateCall(F, OperandV, "unop");
}
Value *BinaryExprAST::codegen() {
@ -778,7 +779,7 @@ Value *BinaryExprAST::codegen() {
if (!Variable)
return LogErrorV("Unknown variable name");
Builder.CreateStore(Val, Variable);
Builder->CreateStore(Val, Variable);
return Val;
}
@ -789,15 +790,15 @@ Value *BinaryExprAST::codegen() {
switch (Op) {
case '+':
return Builder.CreateFAdd(L, R, "addtmp");
return Builder->CreateFAdd(L, R, "addtmp");
case '-':
return Builder.CreateFSub(L, R, "subtmp");
return Builder->CreateFSub(L, R, "subtmp");
case '*':
return Builder.CreateFMul(L, R, "multmp");
return Builder->CreateFMul(L, R, "multmp");
case '<':
L = Builder.CreateFCmpULT(L, R, "cmptmp");
L = Builder->CreateFCmpULT(L, R, "cmptmp");
// Convert bool 0/1 to double 0.0 or 1.0
return Builder.CreateUIToFP(L, Type::getDoubleTy(TheContext), "booltmp");
return Builder->CreateUIToFP(L, Type::getDoubleTy(*TheContext), "booltmp");
default:
break;
}
@ -808,7 +809,7 @@ Value *BinaryExprAST::codegen() {
assert(F && "binary operator not found!");
Value *Ops[] = {L, R};
return Builder.CreateCall(F, Ops, "binop");
return Builder->CreateCall(F, Ops, "binop");
}
Value *CallExprAST::codegen() {
@ -828,7 +829,7 @@ Value *CallExprAST::codegen() {
return nullptr;
}
return Builder.CreateCall(CalleeF, ArgsV, "calltmp");
return Builder->CreateCall(CalleeF, ArgsV, "calltmp");
}
Value *IfExprAST::codegen() {
@ -837,46 +838,46 @@ Value *IfExprAST::codegen() {
return nullptr;
// Convert condition to a bool by comparing equal to 0.0.
CondV = Builder.CreateFCmpONE(
CondV, ConstantFP::get(TheContext, APFloat(0.0)), "ifcond");
CondV = Builder->CreateFCmpONE(
CondV, ConstantFP::get(*TheContext, APFloat(0.0)), "ifcond");
Function *TheFunction = Builder.GetInsertBlock()->getParent();
Function *TheFunction = Builder->GetInsertBlock()->getParent();
// Create blocks for the then and else cases. Insert the 'then' block at the
// end of the function.
BasicBlock *ThenBB = BasicBlock::Create(TheContext, "then", TheFunction);
BasicBlock *ElseBB = BasicBlock::Create(TheContext, "else");
BasicBlock *MergeBB = BasicBlock::Create(TheContext, "ifcont");
BasicBlock *ThenBB = BasicBlock::Create(*TheContext, "then", TheFunction);
BasicBlock *ElseBB = BasicBlock::Create(*TheContext, "else");
BasicBlock *MergeBB = BasicBlock::Create(*TheContext, "ifcont");
Builder.CreateCondBr(CondV, ThenBB, ElseBB);
Builder->CreateCondBr(CondV, ThenBB, ElseBB);
// Emit then value.
Builder.SetInsertPoint(ThenBB);
Builder->SetInsertPoint(ThenBB);
Value *ThenV = Then->codegen();
if (!ThenV)
return nullptr;
Builder.CreateBr(MergeBB);
Builder->CreateBr(MergeBB);
// Codegen of 'Then' can change the current block, update ThenBB for the PHI.
ThenBB = Builder.GetInsertBlock();
ThenBB = Builder->GetInsertBlock();
// Emit else block.
TheFunction->getBasicBlockList().push_back(ElseBB);
Builder.SetInsertPoint(ElseBB);
Builder->SetInsertPoint(ElseBB);
Value *ElseV = Else->codegen();
if (!ElseV)
return nullptr;
Builder.CreateBr(MergeBB);
Builder->CreateBr(MergeBB);
// Codegen of 'Else' can change the current block, update ElseBB for the PHI.
ElseBB = Builder.GetInsertBlock();
ElseBB = Builder->GetInsertBlock();
// Emit merge block.
TheFunction->getBasicBlockList().push_back(MergeBB);
Builder.SetInsertPoint(MergeBB);
PHINode *PN = Builder.CreatePHI(Type::getDoubleTy(TheContext), 2, "iftmp");
Builder->SetInsertPoint(MergeBB);
PHINode *PN = Builder->CreatePHI(Type::getDoubleTy(*TheContext), 2, "iftmp");
PN->addIncoming(ThenV, ThenBB);
PN->addIncoming(ElseV, ElseBB);
@ -903,7 +904,7 @@ Value *IfExprAST::codegen() {
// br endcond, loop, endloop
// outloop:
Value *ForExprAST::codegen() {
Function *TheFunction = Builder.GetInsertBlock()->getParent();
Function *TheFunction = Builder->GetInsertBlock()->getParent();
// Create an alloca for the variable in the entry block.
AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);
@ -914,17 +915,17 @@ Value *ForExprAST::codegen() {
return nullptr;
// Store the value into the alloca.
Builder.CreateStore(StartVal, Alloca);
Builder->CreateStore(StartVal, Alloca);
// Make the new basic block for the loop header, inserting after current
// block.
BasicBlock *LoopBB = BasicBlock::Create(TheContext, "loop", TheFunction);
BasicBlock *LoopBB = BasicBlock::Create(*TheContext, "loop", TheFunction);
// Insert an explicit fall through from the current block to the LoopBB.
Builder.CreateBr(LoopBB);
Builder->CreateBr(LoopBB);
// Start insertion in LoopBB.
Builder.SetInsertPoint(LoopBB);
Builder->SetInsertPoint(LoopBB);
// Within the loop, the variable is defined equal to the PHI node. If it
// shadows an existing variable, we have to restore it, so save it now.
@ -945,7 +946,7 @@ Value *ForExprAST::codegen() {
return nullptr;
} else {
// If not specified, use 1.0.
StepVal = ConstantFP::get(TheContext, APFloat(1.0));
StepVal = ConstantFP::get(*TheContext, APFloat(1.0));
}
// Compute the end condition.
@ -955,23 +956,23 @@ Value *ForExprAST::codegen() {
// Reload, increment, and restore the alloca. This handles the case where
// the body of the loop mutates the variable.
Value *CurVar = Builder.CreateLoad(Alloca, VarName.c_str());
Value *NextVar = Builder.CreateFAdd(CurVar, StepVal, "nextvar");
Builder.CreateStore(NextVar, Alloca);
Value *CurVar = Builder->CreateLoad(Alloca, VarName.c_str());
Value *NextVar = Builder->CreateFAdd(CurVar, StepVal, "nextvar");
Builder->CreateStore(NextVar, Alloca);
// Convert condition to a bool by comparing equal to 0.0.
EndCond = Builder.CreateFCmpONE(
EndCond, ConstantFP::get(TheContext, APFloat(0.0)), "loopcond");
EndCond = Builder->CreateFCmpONE(
EndCond, ConstantFP::get(*TheContext, APFloat(0.0)), "loopcond");
// Create the "after loop" block and insert it.
BasicBlock *AfterBB =
BasicBlock::Create(TheContext, "afterloop", TheFunction);
BasicBlock::Create(*TheContext, "afterloop", TheFunction);
// Insert the conditional branch into the end of LoopEndBB.
Builder.CreateCondBr(EndCond, LoopBB, AfterBB);
Builder->CreateCondBr(EndCond, LoopBB, AfterBB);
// Any new code will be inserted in AfterBB.
Builder.SetInsertPoint(AfterBB);
Builder->SetInsertPoint(AfterBB);
// Restore the unshadowed variable.
if (OldVal)
@ -980,13 +981,13 @@ Value *ForExprAST::codegen() {
NamedValues.erase(VarName);
// for expr always returns 0.0.
return Constant::getNullValue(Type::getDoubleTy(TheContext));
return Constant::getNullValue(Type::getDoubleTy(*TheContext));
}
Value *VarExprAST::codegen() {
std::vector<AllocaInst *> OldBindings;
Function *TheFunction = Builder.GetInsertBlock()->getParent();
Function *TheFunction = Builder->GetInsertBlock()->getParent();
// Register all variables and emit their initializer.
for (unsigned i = 0, e = VarNames.size(); i != e; ++i) {
@ -1004,11 +1005,11 @@ Value *VarExprAST::codegen() {
if (!InitVal)
return nullptr;
} else { // If not specified, use 0.0.
InitVal = ConstantFP::get(TheContext, APFloat(0.0));
InitVal = ConstantFP::get(*TheContext, APFloat(0.0));
}
AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);
Builder.CreateStore(InitVal, Alloca);
Builder->CreateStore(InitVal, Alloca);
// Remember the old variable binding so that we can restore the binding when
// we unrecurse.
@ -1033,9 +1034,9 @@ Value *VarExprAST::codegen() {
Function *PrototypeAST::codegen() {
// Make the function type: double(double,double) etc.
std::vector<Type *> Doubles(Args.size(), Type::getDoubleTy(TheContext));
std::vector<Type *> Doubles(Args.size(), Type::getDoubleTy(*TheContext));
FunctionType *FT =
FunctionType::get(Type::getDoubleTy(TheContext), Doubles, false);
FunctionType::get(Type::getDoubleTy(*TheContext), Doubles, false);
Function *F =
Function::Create(FT, Function::ExternalLinkage, Name, TheModule.get());
@ -1062,8 +1063,8 @@ Function *FunctionAST::codegen() {
BinopPrecedence[P.getOperatorName()] = P.getBinaryPrecedence();
// Create a new basic block to start insertion into.
BasicBlock *BB = BasicBlock::Create(TheContext, "entry", TheFunction);
Builder.SetInsertPoint(BB);
BasicBlock *BB = BasicBlock::Create(*TheContext, "entry", TheFunction);
Builder->SetInsertPoint(BB);
// Record the function arguments in the NamedValues map.
NamedValues.clear();
@ -1072,7 +1073,7 @@ Function *FunctionAST::codegen() {
AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, Arg.getName());
// Store the initial value into the alloca.
Builder.CreateStore(&Arg, Alloca);
Builder->CreateStore(&Arg, Alloca);
// Add arguments to variable symbol table.
NamedValues[Arg.getName()] = Alloca;
@ -1080,7 +1081,7 @@ Function *FunctionAST::codegen() {
if (Value *RetVal = Body->codegen()) {
// Finish off the function.
Builder.CreateRet(RetVal);
Builder->CreateRet(RetVal);
// Validate the generated code, checking for consistency.
verifyFunction(*TheFunction);
@ -1102,8 +1103,11 @@ Function *FunctionAST::codegen() {
static void InitializeModule() {
// Open a new module.
TheModule = llvm::make_unique<Module>("my cool jit", TheContext);
TheModule->setDataLayout(TheJIT->getTargetMachine().createDataLayout());
TheModule = llvm::make_unique<Module>("my cool jit", *TheContext);
TheModule->setDataLayout(TheJIT->getDataLayout());
// Create a new builder for the module.
Builder = llvm::make_unique<IRBuilder<>>(*TheContext);
}
static void HandleDefinition() {
@ -1136,23 +1140,34 @@ static void HandleExtern() {
}
static void HandleTopLevelExpression() {
static unsigned ExprCount = 0;
// Update ExprCount. This number will be added to anonymous expressions to
// prevent them from clashing.
++ExprCount;
// Evaluate a top-level expression into an anonymous function.
if (auto FnAST = ParseTopLevelExpr()) {
if (auto FnAST = ParseTopLevelExpr(ExprCount)) {
if (FnAST->codegen()) {
// JIT the module containing the anonymous expression, keeping a handle so
// we can free it later.
auto H = TheJIT->addModule(std::move(TheModule));
TheJIT->addModule(std::move(TheModule));
InitializeModule();
// Get the anonymous expression's address and cast it to the right type,
// double(*)(), so we can call it as a native function.
double (*FP)() =
(double (*)())(intptr_t)TheJIT->getSymbolAddress("__anon_expr");
assert(FP && "Failed to codegen function");
fprintf(stderr, "Evaluated to %f\n", FP());
// Get the anonymous expression's JITSymbol.
auto Sym = TheJIT->lookup(("__anon_expr" + Twine(ExprCount)).str());
// Delete the anonymous expression module from the JIT.
TheJIT->removeModule(H);
if (Sym) {
// If the lookup succeeded, cast the symbol's address to a function
// pointer then call it.
auto *FP = (double (*)())(intptr_t)Sym->getAddress();
assert(FP && "Failed to codegen function");
fprintf(stderr, "Evaluated to %f\n", FP());
} else {
// Otherwise log the reason the symbol lookup failed.
logAllUnhandledErrors(Sym.takeError(), errs(),
"Could not evaluate: ");
}
}
} else {
// Skip token for error recovery.
@ -1221,6 +1236,7 @@ int main() {
getNextToken();
TheJIT = llvm::make_unique<KaleidoscopeJIT>();
TheContext = &TheJIT->getContext();
InitializeModule();