SLIDE 1

ORC

LLVM’s Next Generation of JIT API

SLIDE 2

Contents

  • LLVM JIT APIs Past, Present and Future
  • I will praise MCJIT
  • Then I will critique MCJIT
  • Then I’ll introduce ORC
  • Code examples available in the Building A JIT tutorial on llvm.org
SLIDE 3

Use Cases

Users:

  • Kaleidoscope
  • LLDB
  • High-performance JITs
  • Interpreters and REPLs

What they need:

  • Simple and safe
  • Cross-target compilation
  • Ability to configure optimizations and codegen
  • Lazy compilation
  • Equivalence with static compile

SLIDE 4

Requirements

  • Simple for beginners, configurable for advanced users
  • Cross-target for LLDB, in-process for application scripting
  • Lazy for interpreters, non-lazy for high-performance cases

We can support all of these requirements, but not behind a single interface…

SLIDE 5

ExecutionEngine

void addModule(Module*);
 void *getPointerToFunction(Function*);
 void addGlobalMapping(Function*, void*);
 // Many terrible things that, trust me, you
 // really don’t want to know about.

SLIDE 6
JIT Implementations

  • Legacy JIT (LLVM 1.0 — 3.5)
  • Introduced ExecutionEngine
  • Lazy compilation, in-process only
  • MCJIT (LLVM 2.9 — present)
  • Implements ExecutionEngine
  • Cross-target, no lazy compilation
  • ORC (LLVM 3.7 — present)
  • Forward-looking API
  • Does NOT implement ExecutionEngine
SLIDE 7

MCJIT Design

  • Static Pipeline with JIT Linker
  • Efficient code and tool re-use
  • Supports cross-target JITing
  • Does not support lazy compilation

MCJIT

[Pipeline diagram: LLVM IR → CodeGen (MC) → Object → RuntimeDyld → Raw Bits → Address]

SLIDE 8

MCJIT Implementation

  • Only accessible via ExecutionEngine
  • Caused ExecutionEngine to bloat
  • Can not support all of ExecutionEngine
SLIDE 9

ExecutionEngine

void *getPointerToFunction(Function*)
 uint64_t getFunctionAddress(const std::string&)
 void *getPointerToNamedFunction(StringRef)
 void *getPointerToFunctionOrStub(Function*)
 uint64_t getAddressToGlobalIfAvailable(StringRef)
 void *getPointerToGlobalIfAvailable(StringRef)
 void *getPointerToGlobal(const GlobalValue*)
 uint64_t getGlobalValueAddress(const std::string&)
 void *getOrEmitGlobalVariable(const GlobalVariable*)

Symbol query horrors…

SLIDE 10

MCJIT Implementation

  • Only accessible via ExecutionEngine
  • Caused ExecutionEngine to bloat
  • Can not support all of ExecutionEngine
  • Limited visibility into internal actions
  • No automatic memory management
SLIDE 11

ORC — On Request Compilation

A Modular MCJIT

SLIDE 12

Modularizing MCJIT

MCJIT

[Pipeline diagram: CodeGen (MC) → Object → RuntimeDyld → Raw Bits]

SLIDE 13

Modularizing MCJIT

[Diagram: Compile Layer (CodeGen, MC) stacked on Link Layer (RuntimeDyld); Object → Raw Bits]

The Compile Layer forwards symbol queries to the Link Layer.

SLIDE 14

Initial Benefits

  • Layers can be tested in isolation

[Diagram: Compile Layer (CodeGen, MC) stacked on Link Layer (RuntimeDyld)]

SLIDE 15

Initial Benefits

  • Layers can be tested in isolation

[Diagram: Link Layer (RuntimeDyld); Object → Raw Bits]

  • E.g. Unit test the link layer
SLIDE 16

Initial Benefits

  • Layers can be tested in isolation
  • E.g. Unit test the link layer
  • Observe events without callbacks
  • E.g. Add a notification layer

[Diagram: Compile Layer (CodeGen, MC) stacked on Link Layer (RuntimeDyld)]
SLIDE 17

The Layer Interface

  • Handle addModule(Module*, MemMgr*, Resolver*)
  • Memory manager owns executable bits
  • Resolver provides symbol resolution
  • JITSymbol findSymbol(StringRef, bool)
  • void removeModule(Handle)
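The layer interface above can be illustrated with a toy analogy in plain C++. This is not the real ORC API: `ToyLinkLayer`, `ToyCompileLayer`, and the string-based "modules" below are hypothetical stand-ins, sketching only the shape described on the slide, in which a compile layer adds modules and forwards symbol queries to the link layer beneath it.

```cpp
#include <functional>
#include <map>
#include <optional>
#include <string>
#include <utility>

// Toy analogy of the ORC layer concept (NOT the real LLVM API).
using Addr = unsigned long long;

struct ToyLinkLayer {
  // Handle -> symbol table of the "object" added under that handle.
  std::map<int, std::map<std::string, Addr>> Objects;
  int NextHandle = 0;

  int addObject(std::map<std::string, Addr> Syms) {
    Objects[NextHandle] = std::move(Syms);
    return NextHandle++;
  }
  std::optional<Addr> findSymbol(const std::string &Name) {
    for (auto &KV : Objects) {
      auto It = KV.second.find(Name);
      if (It != KV.second.end())
        return It->second;
    }
    return std::nullopt;
  }
  void removeObject(int Handle) { Objects.erase(Handle); }
};

struct ToyCompileLayer {
  ToyLinkLayer &Base;
  // "Compiler": maps a module to a symbol table (stands in for CodeGen+MC).
  std::function<std::map<std::string, Addr>(const std::string &)> Compile;

  int addModule(const std::string &Mod) { return Base.addObject(Compile(Mod)); }
  // Symbol queries are forwarded to the link layer, as on the slide.
  std::optional<Addr> findSymbol(const std::string &Name) {
    return Base.findSymbol(Name);
  }
  void removeModule(int Handle) { Base.removeObject(Handle); }
};
```

Because each layer only talks to the one beneath it, either half can be unit-tested, replaced, or wrapped without touching the other, which is the modularity point the next slides make.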
SLIDE 18

Example: Basic Composition

…
 ObjectLinkingLayer LinkLayer;
 SimpleCompiler Compiler(TargetMachine());
 IRCompileLayer<…> CompileLayer(LinkLayer, Compiler);
 …

SLIDE 19

Example: Basic Composition

class MyJIT {
   ObjectLinkingLayer LinkLayer;
   SimpleCompiler Compiler(TargetMachine());
   IRCompileLayer<…> CompileLayer(LinkLayer, Compiler);
   …
 };

SLIDE 20

Example: Basic Composition

…
 ObjectLinkingLayer LinkLayer;
 SimpleCompiler Compiler(TargetMachine());
 IRCompileLayer<…> CompileLayer(LinkLayer, Compiler);
 …

SLIDE 21

Example: Basic Composition

…
 ObjectLinkingLayer LinkLayer;
 SimpleCompiler Compiler(TargetMachine());
 IRCompileLayer<…> CompileLayer(LinkLayer, Compiler);

 CompileLayer.addModule(Mod, MemMgr, SymResolver);
 auto FooSym = CompileLayer.findSymbol("foo", true);
 auto Foo = reinterpret_cast<int(*)()>(FooSym.getAddress());
 int Result = Foo(); // <—— Call into JIT’d code.
 …

SLIDE 22

Memory Managers

  • Own executable code, free it on destruction
  • Inherit from RuntimeDyld::MemoryManager
  • Custom memory managers supported
  • SectionMemoryManager provides a good default
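The ownership rule in the first bullet can be sketched with RAII in plain C++. `ToyMemoryManager` below is a hypothetical analogy, not the real `RuntimeDyld::MemoryManager` interface: the point is only that memory holding "executable code" is freed exactly when its manager is destroyed, so JIT'd code lives as long as its manager and no longer.

```cpp
#include <cstdlib>

// RAII sketch of memory-manager ownership (toy analogy, not the real API).
struct ToyMemoryManager {
  void *CodeSection = nullptr;
  bool *FreedFlag = nullptr; // observable side effect, for illustration only

  // Hand out a block that will hold "compiled code".
  void *allocateCodeSection(std::size_t Size) {
    CodeSection = std::malloc(Size);
    return CodeSection;
  }

  // The manager owns the code: destroying it frees the memory.
  ~ToyMemoryManager() {
    std::free(CodeSection);
    if (FreedFlag)
      *FreedFlag = true;
  }
};
```

A custom manager can follow the same pattern while allocating from, say, a sandboxed or remote region; that is what makes the memory-management story pluggable.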
SLIDE 23

Symbol Resolvers

auto Resolver = createLambdaResolver(
     [&](StringRef Name) {
       return CompileLayer.findSymbol(Name, false);
     },
     [&](StringRef Name) {
       return getSymbolAddressInProcess(Name);
     });

  • First lambda implements in-image lookup
  • Second implements external lookup
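The two-lambda pattern can be mimicked in a few lines of plain C++. `makeLambdaResolver` below is a hypothetical stand-in for ORC's `createLambdaResolver`, showing only the lookup ordering the bullets describe: try the in-image lookup first, and fall back to the external lookup on a miss.

```cpp
#include <functional>
#include <optional>
#include <string>

// Sketch of a two-level symbol resolver (hypothetical names, not real ORC).
using Addr = unsigned long long;
using Lookup = std::function<std::optional<Addr>(const std::string &)>;

Lookup makeLambdaResolver(Lookup InImage, Lookup External) {
  return [=](const std::string &Name) -> std::optional<Addr> {
    if (auto A = InImage(Name)) // first: symbols JIT'd into "the image"
      return A;
    return External(Name);      // second: e.g. symbols in the host process
  };
}
```

Keeping resolution behind a callable like this is what lets clients script arbitrary policies (process symbols, static archives, other JITs) without the layers knowing or caring.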
SLIDE 24

The Story So Far

  • Layers wrap up JIT functionality to make it composable
  • Build custom JITs by composing layers
  • Memory managers handle memory ownership
  • Symbol resolvers handle symbol resolution
SLIDE 25

Adding New Features

  • New layers provide new features
  • Compile On Demand Layer
  • addModule builds function stubs that trigger lazy compilation
  • Symbol queries resolve to stubs

[Diagram: Compile On Demand → Compile → Link]

SLIDE 26

Without Laziness

ObjectLinkingLayer LinkLayer;
 SimpleCompiler Compiler(TargetMachine());
 IRCompileLayer<…> CompileLayer(LinkLayer, Compiler);

 CompileLayer.addModule(Mod, MemMgr, SymResolver);
 auto FooSym = CompileLayer.findSymbol("foo", true);
 auto Foo = reinterpret_cast<int(*)()>(FooSym.getAddress());
 int Result = Foo(); // <—— Call foo.

SLIDE 27

With Laziness

ObjectLinkingLayer LinkLayer;
 SimpleCompiler Compiler(TargetMachine());
 IRCompileLayer<…> CompileLayer(LinkLayer, Compiler);
 CompileOnDemandLayer<…> CODLayer(CompileLayer, …);

 CODLayer.addModule(Mod, MemMgr, SymResolver);
 auto FooSym = CODLayer.findSymbol("foo", true);
 auto Foo = reinterpret_cast<int(*)()>(FooSym.getAddress());
 int Result = Foo(); // <—— Call foo’s stub.

SLIDE 28

COD Layer Requirements

  • Indirect Stubs Manager
  • Create named indirect stubs (indirect jumps via pointers)
  • Modify stub pointers
  • Compile Callback Manager
  • Create compile callbacks (re-entry points in the compiler)
SLIDE 29

Compile Callbacks

[Diagram: the stub foo does "jmp *foo$ptr"; foo$ptr initially points at foo$cc, which does "push foo_key; jmp Resolver"; the ORC/LLVM Resolver compiles foo$impl and updates foo$ptr; callers like bar simply do "call foo"]
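The mechanism in the diagram can be simulated with an ordinary function pointer. This is an illustrative analogy, not ORC's actual stub machinery: `foo_ptr` plays the role of `foo$ptr`, and `foo_compile_callback` stands in for the resolver re-entering the compiler, producing the real implementation and patching the pointer so later calls skip it.

```cpp
#include <cstdio>

// Plain-C++ simulation of a stub + compile callback (toy, not real ORC).
int foo_impl() { return 42; }       // what the "compiler" eventually emits

int foo_compile_callback();         // forward declaration
int (*foo_ptr)() = &foo_compile_callback; // stub pointer, starts at callback

int foo_compile_callback() {
  std::puts("compiling foo");       // runs only on the first call
  foo_ptr = &foo_impl;              // patch the stub pointer
  return foo_ptr();                 // complete the original call
}

// The "stub": jmp *foo$ptr, in C++ clothes.
int foo() { return foo_ptr(); }
```

The first call to `foo()` pays for "compilation"; every later call is one indirect jump to `foo_impl`, which is exactly the cost model the real stubs aim for.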

SLIDE 30

Callbacks and Stubs

auto StubsMgr = … ;
 auto CCMgr = … ;
 auto CC = CCMgr.getCompileCallback();
 StubsMgr.createStub("foo", CC.getAddress(), Exported);
 CC.setCompileAction([&]() -> TargetAddress {
   printf("Hello world");
   return 0;
 });
 auto FooSym = StubsMgr.findStub("foo", true);
 auto Foo = reinterpret_cast<int(*)()>(FooSym.getAddress());
 int Result = Foo();

Prints "Hello world", then jumps to 0

SLIDE 31

Callbacks and Stubs

auto StubsMgr = … ;
 auto CCMgr = … ;
 auto CC = CCMgr.getCompileCallback();
 StubsMgr.createStub("foo", CC.getAddress(), Exported);
 CC.setCompileAction([&]() -> TargetAddress {
   CompileLayer.addModule(FooModule, MemMgr, Resolver);
   return CompileLayer.findSymbol("foo", true).getAddress();
 });
 auto FooSym = StubsMgr.findStub("foo", true);
 auto Foo = reinterpret_cast<int(*)()>(FooSym.getAddress());
 int Result = Foo();

Lazily compiles "foo" from existing IR

SLIDE 32

Callbacks and Stubs

auto StubsMgr = … ;
 auto CCMgr = … ;
 auto CC = CCMgr.getCompileCallback();
 StubsMgr.createStub("foo", CC.getAddress(), Exported);
 CC.setCompileAction([&]() -> TargetAddress {
   CompileLayer.addModule(IRGen(FooAST), MemMgr, Resolver);
   return CompileLayer.findSymbol("foo", true).getAddress();
 });
 auto FooSym = StubsMgr.findStub("foo", true);
 auto Foo = reinterpret_cast<int(*)()>(FooSym.getAddress());
 int Result = Foo();

Lazily compiles "foo" from AST

SLIDE 33

Laziness Recap

  • Callbacks and Stubs
  • Provide direct access to lazy compilation
  • Push laziness earlier in the compiler pipeline
  • CompileOnDemand provides off-the-shelf laziness for IR
  • ORC supports arbitrary laziness with a clean API
SLIDE 34

Adding New Layers

  • Transform Layer
  • addModule runs a user-supplied transform function on the module
  • Symbol queries are forwarded
  • Above C.O.D.: Eager optimizations
  • Below C.O.D.: Lazy optimizations

[Diagram: Transform → Compile On Demand → Transform → Compile → Link]
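A transform layer reduces to a thin wrapper over the layer beneath it. The sketch below is a toy analogy (`ToyBaseLayer` and the string "module" are invented stand-ins, not the real ORC transform-layer API): addModule applies the user's transform, then hands the result down unchanged.

```cpp
#include <functional>
#include <string>
#include <utility>

// Toy sketch of a transform layer (not the real ORC API).
using Module = std::string; // stand-in for an IR module

struct ToyBaseLayer {
  Module LastAdded; // records what reached the lower layer, for illustration
  void addModule(Module M) { LastAdded = std::move(M); }
};

struct ToyTransformLayer {
  ToyBaseLayer &Base;
  std::function<Module(Module)> Transform; // e.g. an optimization pipeline

  // Run the transform, then forward to the layer below.
  void addModule(Module M) { Base.addModule(Transform(std::move(M))); }
};
```

Stacking such a wrapper above the compile-on-demand layer runs the transform eagerly on whole modules; stacking it below runs it lazily, per extracted function, which is the placement distinction the bullets draw.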

SLIDE 35

Layers and Modularity

  • Pick features "off the shelf"
  • Mix and match components: experiment with new designs
  • Create, modify and share new features without breaking existing clients

SLIDE 36

Remote JIT Support

SLIDE 37

Remote JIT Support

  • Execute code on a different process / machine / architecture
  • Enables JIT code to be sandboxed
  • MCJIT supported remote compilation, but required a lot of manual work
  • OrcRemoteTarget client/server provides high level API
  • Remote mapped memory, stub and callback managers
  • Symbol queries
  • Execute remote functions
SLIDE 38

Local Laziness

auto StubsMgr = … ;
 auto CCMgr = … ;
 auto CC = CCMgr.getCompileCallback();
 StubsMgr.createStub("foo", CC.getAddress(), Exported);
 CC.setCompileAction([&]() -> TargetAddress {
   CompileLayer.addModule(IRGen(FooAST), MemMgr, Resolver);
   return CompileLayer.findSymbol("foo", true).getAddress();
 });
 auto FooSym = StubsMgr.findStub("foo", true);
 auto Foo = reinterpret_cast<int(*)()>(FooSym.getAddress());
 int Result = Foo();

SLIDE 39

Remote Laziness

auto RT = … ;
 auto StubsMgr = RT.createStubsMgr();
 auto CCMgr = RT.createCallbackMgr();
 auto CC = CCMgr.getCompileCallback();
 StubsMgr.createStub("foo", CC.getAddress(), Exported);
 CC.setCompileAction([&]() -> TargetAddress {
   CompileLayer.addModule(IRGen(FooAST), RT.createMemMgr(), Resolver);
   return CompileLayer.findSymbol("foo", true).getAddress();
 });
 auto FooSym = StubsMgr.findStub("foo", true);
 int Result = RT.callIntVoid(FooSym.getAddress());

SLIDE 40

Demo

SLIDE 41

Remote JIT Support

  • Remote JITing with ORC is easy
  • Remoteness is orthogonal to other features, including laziness
  • Security implications are serious
  • Sandbox the server, authenticate the client, encrypt the channel
  • Treat like mains electricity: very useful, but safety first!
SLIDE 42

Future Opportunities

  • New development modes: edit/test vs edit/compile/test
  • Remote interpreters for development on embedded devices
  • Distributing work for clusters
  • Compute
  • Database queries
SLIDE 43

ORC vs MCJIT

  • Same underlying architecture: static compiler + JIT linker
  • ORC
  • Offers a strict superset of features
  • A more flexible API
  • Supports remoteness and laziness
  • Has better memory management
  • OrcMCJITReplacement provides a transition path
SLIDE 44

Future Goals

  • Kill off ExecutionEngine, design a new in-tree JIT (for LLI and C-API)
  • New layers and components (e.g. hot function recompilation)
  • API cleanup: Core abstractions are in place but need polish
  • More architectural and relocation support (Fix RuntimeDyldELF!)
  • Check out the Building A JIT tutorial
  • Get involved: http://llvm.org/bugs, OrcJIT component