

SLIDE 1

A Tale of Two Projects

It is the best of jitting, it is the worst of jitting…

SLIDE 2

Collaborators

  • Jan Vitek
  • Oli Fluckiger
  • Jan Jecmen
  • Paley Li
  • Roman Tsegelskyi
  • Alena Sochurkova
  • Petr Maj
SLIDE 3

Design Goals

  • Performance
      • The JIT should outperform both the AST and BC interpreters
  • Compatibility
      • The full R language must be supported
      • At least in theory; in practice we are happy with BC interpreter compatibility
  • Easy Maintenance
      • Source code should be easy to understand and simple to maintain
      • Counterexample: LuaJIT
SLIDE 4

The Importance of having a JIT

  • Costs of a BC Interpreter
      • Hard-to-predict indirect jump for each instruction in the program
      • Operand stack vs. registers
  • The JIT mitigates these
      • Zero cost of moving to the next instruction
      • Uses platform registers directly
      • Better optimization of low-level parts
SLIDE 5

Low Level Virtual Machine (LLVM)

  • Backend for the clang compiler
  • Used by many other languages
  • State-of-the-art compiler suite
      • Hundreds of optimizations (including some vectorization)
      • Dozens of targets
  • Designed as an AOT compiler
      • Slow compilation time
      • Fast & optimized output
  • But provides a JIT layer
SLIDE 6

McJIT – LLVM JIT Layer

  • Developed by Laurie Hendren at McGill
      • Used for Matlab
  • Program must be translated to LLVM IR
      • McJIT then turns LLVM functions into pointers to native functions
  • Handles the dynamic loading and native code generation
  • Newer LLVM versions use ORC JIT instead
      • Layered approach, true JIT
SLIDE 7

LLVM IR

  • Everything is typed
      • Values, functions, registers, instructions
  • Very low-level
      • Assembly-like nature
  • Register-based VM
      • Unlimited number of registers
      • Static Single Assignment (SSA)
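A tiny hand-written function illustrates these properties (a sketch, not the output of any particular compiler): every value and instruction carries a type, and each virtual register is assigned exactly once.

```llvm
; add1(x) = x + 1
define i32 @add1(i32 %x) {
entry:
  %sum = add i32 %x, 1   ; %sum is typed (i32) and assigned once (SSA)
  ret i32 %sum
}
```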
SLIDE 8

RJIT

The pros & cons of using LLVM as a backend for R

SLIDE 9

Getting a JIT Quickly

  • Translating R semantics directly to LLVM IR is too complicated
  • Main idea:
      • Convert R bytecode instructions into functions and call them from within the JIT

SLIDE 10

> x = 2 + 3

A simple expression in R’s REPL

SLIDE 11

> x = 2 + 3

LDCONST.OP 2
LDCONST.OP 3
ADD.OP
SETVAR.OP x

R bytecode handlers in GNU-R's interpreter:

OP(LDCONST, 1):
    R_Visible = TRUE;
    value = VECTOR_ELT(constants, GETOP());
    MARK_NOT_MUTABLE(value);
    BCNPUSH(value);
    NEXT();

OP(ADD, 1):
    FastBinary(R_ADD, PLUSOP, R_AddSym);
    NEXT();

OP(SETVAR, 1):
    int sidx = GETOP();
    SEXP loc;
    SEXP symbol = VECTOR_ELT(constants, sidx);
    loc = GET_BINDING_CELL_CACHE(symbol, rho, vcache, sidx);
    ...
    value = GETSTACK(-1);
    INCREMENT_NAMED(value);
    SET_BINDING_VALUE(loc, value);
    ...
    NEXT();

SLIDE 12

> x = 2 + 3

LDCONST.OP 2
LDCONST.OP 3
ADD.OP
SETVAR.OP x

The bytecode handlers from the previous slide, with LDCONST refactored into a standalone function:

void instruction_LDCONST_OP(InterpreterContext * c, int arg1) {
    R_Visible = TRUE;
    c->value = VECTOR_ELT(c->constants, arg1);
    MARK_NOT_MUTABLE(c->value);
    BCNPUSH(c->value);
    NEXT();
}

SLIDE 13

> x = 2 + 3

LDCONST.OP 2
LDCONST.OP 3
ADD.OP
SETVAR.OP x

As before, now with ADD refactored as well:

void ADD_OP(InterpreterContext * c, int arg1) {
    FastBinary2(R_ADD, PLUSOP, R_AddSym, arg1);
    NEXT();
}

SLIDE 14

> x = 2 + 3

LDCONST.OP 2
LDCONST.OP 3
ADD.OP
SETVAR.OP x

As before, now with SETVAR refactored as well:

void SETVAR_OP(InterpreterContext * c, int arg1) {
    SEXP loc;
    SEXP symbol = VECTOR_ELT(c->constants, arg1);
    loc = GET_BINDING_CELL_CACHE(symbol, c->rho, vcache, arg1);
    ...
    SEXP value = GETSTACK(-1);
    INCREMENT_NAMED(value);
    SET_BINDING_VALUE(loc, value);
    ...
    NEXT();
}

SLIDE 15

> x = 2 + 3

LDCONST.OP 2
LDCONST.OP 3
ADD.OP
SETVAR.OP x

The interpreter's local variables move into a context struct passed to every instruction function:

typedef struct {
    SEXP rho;
    Rboolean useCache;
    SEXP value;
    SEXP constants;
    R_bcstack_t * oldntop;
    R_binding_cache_t vcache;
    Rboolean smallcache;
} InterpreterContext;

SLIDE 16

> x = 2 + 3

LDCONST.OP 2
LDCONST.OP 3
ADD.OP
SETVAR.OP x

With the instruction functions in place, the compiled body is just a sequence of calls in LLVM IR:

call void LDCONST_OP(2)
call void LDCONST_OP(3)
call void ADD_OP()
call void SETVAR_OP()

SLIDE 17
  • So far the effort was minimal
      • Refactor BC insns into functions
      • Interpreter’s local variables go to the context
      • LLVM IR is just a sequence of calls
      • Constant pool is roughly the same
      • Control flow is a bit more involved
SLIDE 18

if (a) { b; } else { c; }

call void GETVAR_OP a
%1 = call i1 ConvertToLogicalNoNA()
br %1 true false
true:
    call void GETVAR_OP b
    br next
false:
    call void GETVAR_OP c
    br next
next:
    %3 = call SEXP bcPop()
    ret SEXP %3

SLIDE 19

Removing the Stack

  • We can do better
      • Use LLVM registers instead of the stack
      • Rewrite functions to take & return SEXPs
SLIDE 20

> x = 2 + 3

LDCONST.OP 2
LDCONST.OP 3
ADD.OP
SETVAR.OP x

call void LDCONST_OP(2)
call void LDCONST_OP(3)
call void ADD_OP()
call void SETVAR_OP()

The instruction functions (instruction_LDCONST_OP, ADD_OP, SETVAR_OP) are as on the earlier slides.

SLIDE 21

> x = 2 + 3

LDCONST.OP 2
LDCONST.OP 3
ADD.OP
SETVAR.OP x

void instruction_LDCONST_OP(InterpreterContext * c, int arg1);
void ADD_OP(InterpreterContext * c, int arg1);
void SETVAR_OP(InterpreterContext * c, int arg1);

call void LDCONST_OP(2)
call void LDCONST_OP(3)
call void ADD_OP()
call void SETVAR_OP()

SLIDE 22

> x = 2 + 3

LDCONST.OP 2
LDCONST.OP 3
ADD.OP
SETVAR.OP x

call void LDCONST_OP(2)
call void LDCONST_OP(3)
call void ADD_OP()
call void SETVAR_OP()

LDCONST is replaced by a helper that returns its value instead of pushing it:

SEXP constant(SEXP consts, int index) {
    return VECTOR_ELT(consts, index);
}

SLIDE 23

> x = 2 + 3

LDCONST.OP 2
LDCONST.OP 3
ADD.OP
SETVAR.OP x

SEXP constant(SEXP consts, int index) {
    return VECTOR_ELT(consts, index);
}

SEXP genericAdd(SEXP lhs, SEXP rhs, SEXP rho, SEXP consts, int call) {
    return cmp_arith2(VECTOR_ELT(consts, call), PLUSOP, R_AddSym, lhs, rhs, rho);
}

SLIDE 24

> x = 2 + 3

LDCONST.OP 2
LDCONST.OP 3
ADD.OP
SETVAR.OP x

SEXP constant(SEXP consts, int index) {
    return VECTOR_ELT(consts, index);
}

SEXP genericAdd(SEXP lhs, SEXP rhs, SEXP rho, SEXP consts, int call) {
    return cmp_arith2(VECTOR_ELT(consts, call), PLUSOP, R_AddSym, lhs, rhs, rho);
}

void genericSetVar(SEXP value, SEXP rho, SEXP consts, int symbol) {
    SEXP sym = VECTOR_ELT(consts, symbol);
    assert(sym != R_DotsSymbol && sym != R_UnboundValue);
    SEXP loc = GET_BINDING_CELL(sym, rho);
    INCREMENT_NAMED(value);
    if (! SET_BINDING_VALUE(loc, value)) { … }
}

SLIDE 25

> x = 2 + 3

LDCONST.OP 2
LDCONST.OP 3
ADD.OP
SETVAR.OP x

With the helpers above, the stack-based sequence

call void LDCONST_OP(2)
call void LDCONST_OP(3)
call void ADD_OP()
call void SETVAR_OP()

becomes register-based:

%1 = call SEXP constant(2)
%2 = call SEXP constant(3)
%3 = call SEXP genericAdd(%1,%2)
call void genericSetVar(x, %3)

SLIDE 26

GC is a Headache

  • Unprotected SEXP in an LLVM register
      • Never found by the GC
  • Statepoints to the rescue
      • Precise locations of values stored in registers and on the stack
      • Solves the issue
      • But is a pain
SLIDE 27
SLIDE 28

Pushing Further

  • Getting rid of the stack helps
      • No interpreter loop
      • But every BC instruction is a call
  • Simple bytecodes can be translated to LLVM directly
  • Specialized faster versions can be added
      • Inline caching & native calls for builtins
      • Speculative?
SLIDE 29
SLIDE 30
  • Local transformations only get you so far…
SLIDE 31

In the end we need to optimize

  • Local transformations only get you so far…
  • LLVM has great optimizers
      • Turns out these are good for C/C++
  • R is way too high-level for LLVM to do much
      • Everything is a SEXP
      • Arguments passed in environments
      • Most functionality done by runtime functions, opaque to LLVM

SLIDE 32

High level optimizations in LLVM

  • We started breaking GNU-R instructions into smaller reusable components
  • But the more involved the compilation became, the more we realized LLVM IR is not good at representing high-level concepts
  • Do high-level optimizations before translating to LLVM IR

SLIDE 33

RIR

Yet Another R Bytecode

SLIDE 34

Why Another Bytecode?

  • R bytecode is optimized for fast execution
      • Having few instructions mitigates the interpreter switch overhead
      • Having generic instructions mitigates the lack of a static optimizer
  • The JIT does not care how many instructions you have
  • The optimizer works better if instructions are predictable

SLIDE 35

> f(a, b, c, d)

SLIDE 36

> f(a, b, c, d)

GETFUN.OP 1     // f
MAKEPROM.OP 4   // a
MAKEPROM.OP 5   // b
MAKEPROM.OP 6   // c
MAKEPROM.OP 7   // d
CALL.OP 2
RETURN.OP

GETFUN loads the function, pushes it on the stack, and pushes empty args on the stack. Does MAKEPROM evaluate? Depending on what function is loaded at runtime, it makes a promise (default), evaluates (builtins), or does nothing (specials). Which arguments does the function take? The promise code is non-local.

SLIDE 37

> f(a, b, c, d)

ldfun_ 3          # f
call_ [ 0 1 2 3 ]
ret_

@0  ldvar_ 4      # a
    ret_
@1  ldvar_ 5      # b
    ret_
@2  ldvar_ 6      # c
    ret_
@3  ldvar_ 7      # d
    ret_

ldfun_ loads the function; call_ calls the function, makes promises, or evaluates. Promises are kept locally with the code. There are different calls for different needs (call_, static_call_stack_, …) (*)

SLIDE 38

Speculative

  • Most optimizations are unsound in R
  • But most of the time, they are OK
      • Speculate they are OK
      • Revert to unoptimized code if they are not (*)

> sum(a)

guard_fun_ sum == 0x154c410
ldvar_ 4            # a
static_call_stack_ 1 0x154c410
ret_

SLIDE 39

Optimization Framework

  • Abstract interpretation
      • Easily extendable classes for different analyses & optimizations
  • Worst case is a big issue
      • In the worst case, every variable read may trigger a promise which may invalidate all local state
      • Speculation to the rescue
SLIDE 40

> a = 1; b = 2; a + b

guard_fun_ = == 0x153add0
push_ 16            # [1] 1
set_shared_
stvar_ 4            # a
push_ 17            # [1] 2
set_shared_
stvar_ 5            # b
guard_fun_ + == 0x1540800
~~ local  ldvar_ 4  # a
~~ local  ldvar_ 5  # b
~~ TOS : const, pop_
~~ TOS : const, pop_
push_ 18            # [1] 3
ret_

~~ local: the load is guaranteed to succeed in the local environment. ~~ TOS: the top of stack is a constant before the pop.

SLIDE 41

> a = 1; b = 2; a + b

guard_fun_ = == 0x153add0
push_ 16            # [1] 1
set_shared_
stvar_ 4            # a
push_ 17            # [1] 2
set_shared_
stvar_ 5            # b
guard_fun_ + == 0x1540800
push_ 18            # [1] 3
ret_

SLIDE 42

Performance Matters

  • RIR currently does not have a JIT
      • The plan is to use LLVM once a sufficient amount of high-level optimization is done
  • Improvements on the baseline
      • Adding more specialized instructions
      • A more performant interpreter loop
  • Optimizations
SLIDE 43
SLIDE 44

Future

  • Improvements to the baseline
  • More optimizations
      • Control Flow Analysis
      • Removing promises
      • Inferring types
      • Tracking functions
      • Escape Analysis
      • Better speculation
  • Adding a JIT
SLIDE 45

Thank You

https://github.com/reactorlabs/rjit
https://github.com/reactorlabs/rir