Effective Compilation of Higher-Order Programs


slide-1
SLIDE 1

Effective Compilation of Higher-Order Programs

Roland Leißa Klaas Boesche Sebastian Hack Richard Membarth Arsène Pérard-Gayot Philipp Slusallek

http://compilers.cs.uni-saarland.de
https://github.com/AnyDSL/thorin

Compiler Design Lab
Saarland University

slide-2
SLIDE 2

Introduction

slide-3
SLIDE 3

Intermediate Representations (IRs)

imperative languages (C, Fortran, …)

  • instruction lists + CFGs
      • LLVM
      • GIMPLE (gcc)
  • graph-based IRs – “sea of nodes” [Click95]
      • Java HotSpot
      • libFirm
      • TurboFan (Google’s JavaScript compiler)

functional languages (Haskell, ML, …)

  • λ-calculus
      • Core (GHC)
      • Lambda IR (OCaml)
  • Continuation Passing Style (CPS) [Appel06]

slide-6
SLIDE 6

Motivation: Post-Order Visit

#include <iostream>
using namespace std;

struct Node {
    int data;
    Node* left;
    Node* right;
};

// Recursively visit both subtrees, then print the node itself.
void post_order_visit(Node* n) {
    if (n->left)  post_order_visit(n->left);
    if (n->right) post_order_visit(n->right);
    cout << n->data << endl;
}

slide-16
SLIDE 16

How to Factor Out the Visiting Algorithm?

Two choices:

  1. Iterators
      • standard-conforming iterators: expert C++ knowledge
      • additional pointers in Node, or
      • explicit, heap-managed stack
  2. Higher-order functions
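As a rough illustration of the first choice, the sketch below walks the tree in post-order with an explicit, heap-managed stack instead of recursion. The class `PostOrderCursor`, its `next` method, and the helper `post_order` are hypothetical names introduced here, not part of the talk. A standard-conforming iterator would need considerably more boilerplate on top of this.

```cpp
#include <cassert>
#include <utility>
#include <vector>

struct Node { int data; Node* left; Node* right; };

// Hypothetical cursor (not from the talk): yields nodes in post-order
// using an explicit stack that lives on the heap inside std::vector.
class PostOrderCursor {
    // Each entry pairs a node with a flag that records whether its
    // children have already been pushed.
    std::vector<std::pair<Node*, bool>> stack_;

public:
    explicit PostOrderCursor(Node* root) {
        if (root) stack_.push_back({root, false});
    }

    // Returns the next node in post-order, or nullptr when exhausted.
    Node* next() {
        while (!stack_.empty()) {
            Node* n = stack_.back().first;
            bool expanded = stack_.back().second;
            if (expanded) {
                stack_.pop_back();
                return n;
            }
            stack_.back().second = true;
            // Push right before left so that left is visited first.
            if (n->right) stack_.push_back({n->right, false});
            if (n->left)  stack_.push_back({n->left, false});
        }
        return nullptr;
    }
};

// Collects the payloads in visiting order.
std::vector<int> post_order(Node* root) {
    std::vector<int> out;
    PostOrderCursor cur(root);
    while (Node* n = cur.next()) out.push_back(n->data);
    return out;
}
```

The traversal state that recursion keeps implicitly on the call stack has to be managed by hand here, which is the cost the slide alludes to.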

slide-19
SLIDE 19

Factoring Out the Visiting Algorithm

#include <functional>   // std::function; Node and <iostream> as before

// The traversal takes the per-node action as a higher-order argument.
void post_order_visit(Node* n, function<void(int)> f) {
    if (n->left)  post_order_visit(n->left, f);
    if (n->right) post_order_visit(n->right, f);
    f(n->data);
}

void print(Node* n) {
    post_order_visit(n, [](int d) { cout << d << endl; });
}

void sum(Node* n) {
    int result = 0;
    post_order_visit(n, [&](int d) { result += d; });
    cout << result << endl;
}
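To try the factored-out traversal on a concrete tree, here is a small self-contained sketch. It repeats `Node` and `post_order_visit` from the slides; `sum_of` and `visit_order` are hypothetical helpers added here that return their results instead of printing them, so the behaviour can be checked.

```cpp
#include <cassert>
#include <functional>
#include <vector>

struct Node { int data; Node* left; Node* right; };

// Same traversal as on the slide, with the std:: prefix spelled out.
void post_order_visit(Node* n, std::function<void(int)> f) {
    if (n->left)  post_order_visit(n->left, f);
    if (n->right) post_order_visit(n->right, f);
    f(n->data);
}

// Hypothetical variant of sum() that returns the total.
int sum_of(Node* n) {
    int result = 0;
    // The lambda captures result by reference, so every copy of the
    // std::function made during recursion updates the same variable.
    post_order_visit(n, [&](int d) { result += d; });
    return result;
}

// Hypothetical helper that records the order in which nodes are visited.
std::vector<int> visit_order(Node* n) {
    std::vector<int> order;
    post_order_visit(n, [&](int d) { order.push_back(d); });
    return order;
}
```

For a root with payload 3 whose left and right children carry 1 and 2, `visit_order` yields 1, 2, 3 and `sum_of` yields 6.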

slide-27
SLIDE 27

Compiling

void post_order_visit(Node* n, function<void(int)> f) {
    if (n->left)  post_order_visit(n->left, f);
    if (n->right) post_order_visit(n->right, f);
    f(n->data);
}

void print(Node* n) {
    post_order_visit(n, [](int d) { cout << d << endl; });
}

Compiled with: clang -O3 -fno-exceptions

slide-28
SLIDE 28

post_order_visit with clang -O3 -fno-exceptions

define internal fastcc void @_ZL16post_order_visitP4NodeSt8functionIFviEE(%struct.Node* nocapture readonly %n, %"class.std::function"* %f) unnamed_addr #3 {
entry:
  %__args.addr.i = alloca i32, align 4
  %agg.tmp = alloca %"class.std::function", align 8
  %agg.tmp5 = alloca %"class.std::function", align 8
  %left = getelementptr inbounds %struct.Node, %struct.Node* %n, i64 0, i32 1
  %0 = load %struct.Node*, %struct.Node** %left, align 8, !tbaa !8
  %tobool = icmp eq %struct.Node* %0, null
  br i1 %tobool, label %if.end, label %if.then

if.then:                                          ; preds = %entry
  %_M_manager.i.i = getelementptr inbounds %"class.std::function", %"class.std::function"* %agg.tmp, i64 0, i32 0, i32 1
  store i1 (%"union.std::_Any_data"*, %"union.std::_Any_data"*, i32)* null, i1 (%"union.std::_Any_data"*, %"union.std::_Any_data"*, i32)** %_M_manager.i.i, align 8, !tbaa !6
  %_M_manager.i.i.i = getelementptr inbounds %"class.std::function", %"class.std::function"* %f, i64 0, i32 0, i32 1
  %1 = load i1 (%"union.std::_Any_data"*, %"union.std::_Any_data"*, i32)*, i1 (%"union.std::_Any_data"*, %"union.std::_Any_data"*, i32)** %_M_manager.i.i.i, align 8, !tbaa !6
  %lnot.i.i = icmp eq i1 (%"union.std::_Any_data"*, %"union.std::_Any_data"*, i32)* %1, null
  br i1 %lnot.i.i, label %_ZNSt8functionIFviEEC2ERKS1_.exit, label %if.then.i

if.then.i:                                        ; preds = %if.then
  %_M_functor.i = getelementptr inbounds %"class.std::function", %"class.std::function"* %agg.tmp, i64 0, i32 0, i32 0
  %_M_functor2.i = getelementptr inbounds %"class.std::function", %"class.std::function"* %f, i64 0, i32 0, i32 0
  %call3.i = call zeroext i1 %1(%"union.std::_Any_data"* dereferenceable(16) %_M_functor.i, %"union.std::_Any_data"* dereferenceable(16) %_M_functor2.i, i32 2) #2
  %2 = bitcast i1 (%"union.std::_Any_data"*, %"union.std::_Any_data"*, i32)** %_M_manager.i.i.i to <2 x i64>*
  %3 = load <2 x i64>, <2 x i64>* %2, align 8, !tbaa !11
  %4 = bitcast i1 (%"union.std::_Any_data"*, %"union.std::_Any_data"*, i32)** %_M_manager.i.i to <2 x i64>*
  store <2 x i64> %3, <2 x i64>* %4, align 8, !tbaa !11
  br label %_ZNSt8functionIFviEEC2ERKS1_.exit

_ZNSt8functionIFviEEC2ERKS1_.exit:                ; preds = %if.then, %if.then.i
  call fastcc void @_ZL16post_order_visitP4NodeSt8functionIFviEE(%struct.Node* nonnull %0, %"class.std::function"* nonnull %agg.tmp)
  %5 = load i1 (%"union.std::_Any_data"*, %"union.std::_Any_data"*, i32)*, i1 (%"union.std::_Any_data"*, %"union.std::_Any_data"*, i32)** %_M_manager.i.i, align 8, !tbaa !6
  %tobool.i = icmp eq i1 (%"union.std::_Any_data"*, %"union.std::_Any_data"*, i32)* %5, null
  br i1 %tobool.i, label %if.end, label %if.then.i17

if.then.i17:                                      ; preds = %_ZNSt8functionIFviEEC2ERKS1_.exit
  %_M_functor.i16 = getelementptr inbounds %"class.std::function", %"class.std::function"* %agg.tmp, i64 0, i32 0, i32 0
  %call.i = call zeroext i1 %5(%"union.std::_Any_data"* dereferenceable(16) %_M_functor.i16, %"union.std::_Any_data"* dereferenceable(16) %_M_functor.i16, i32 3) #2
  br label %if.end

if.end:                                           ; preds = %if.then.i17, %_ZNSt8functionIFviEEC2ERKS1_.exit, %entry
  %right = getelementptr inbounds %struct.Node, %struct.Node* %n, i64 0, i32 2
  %6 = load %struct.Node*, %struct.Node** %right, align 8, !tbaa !12
  %tobool2 = icmp eq %struct.Node* %6, null
  br i1 %tobool2, label %if.end.if.end6_crit_edge, label %if.then3

if.end.if.end6_crit_edge:                         ; preds = %if.end
  %.pre = getelementptr inbounds %"class.std::function", %"class.std::function"* %f, i64 0, i32 0, i32 1
  br label %if.end6

if.then3:                                         ; preds = %if.end
  %_M_manager.i.i18 = getelementptr inbounds %"class.std::function", %"class.std::function"* %agg.tmp5, i64 0, i32 0, i32 1
  store i1 (%"union.std::_Any_data"*, %"union.std::_Any_data"*, i32)* null, i1 (%"union.std::_Any_data"*, %"union.std::_Any_data"*, i32)** %_M_manager.i.i18, align 8, !tbaa !6
  %_M_manager.i.i.i19 = getelementptr inbounds %"class.std::function", %"class.std::function"* %f, i64 0, i32 0, i32 1
  %7 = load i1 (%"union.std::_Any_data"*, %"union.std::_Any_data"*, i32)*, i1 (%"union.std::_Any_data"*, %"union.std::_Any_data"*, i32)** %_M_manager.i.i.i19, align 8, !tbaa !6
  %lnot.i.i20 = icmp eq i1 (%"union.std::_Any_data"*, %"union.std::_Any_data"*, i32)* %7, null
  br i1 %lnot.i.i20, label %_ZNSt8functionIFviEEC2ERKS1_.exit27, label %if.then.i26

if.then.i26:                                      ; preds = %if.then3
  %_M_functor.i21 = getelementptr inbounds %"class.std::function", %"class.std::function"* %agg.tmp5, i64 0, i32 0, i32 0
  %_M_functor2.i22 = getelementptr inbounds %"class.std::function", %"class.std::function"* %f, i64 0, i32 0, i32 0
  %call3.i23 = call zeroext i1 %7(%"union.std::_Any_data"* dereferenceable(16) %_M_functor.i21, %"union.std::_Any_data"* dereferenceable(16) %_M_functor2.i22, i32 2) #2
  %8 = bitcast i1 (%"union.std::_Any_data"*, %"union.std::_Any_data"*, i32)** %_M_manager.i.i.i19 to <2 x i64>*
  %9 = load <2 x i64>, <2 x i64>* %8, align 8, !tbaa !11
  %10 = bitcast i1 (%"union.std::_Any_data"*, %"union.std::_Any_data"*, i32)** %_M_manager.i.i18 to <2 x i64>*
  store <2 x i64> %9, <2 x i64>* %10, align 8, !tbaa !11
  br label %_ZNSt8functionIFviEEC2ERKS1_.exit27

_ZNSt8functionIFviEEC2ERKS1_.exit27:              ; preds = %if.then3, %if.then.i26
  call fastcc void @_ZL16post_order_visitP4NodeSt8functionIFviEE(%struct.Node* nonnull %6, %"class.std::function"* nonnull %agg.tmp5)
  %11 = load i1 (%"union.std::_Any_data"*, %"union.std::_Any_data"*, i32)*, i1 (%"union.std::_Any_data"*, %"union.std::_Any_data"*, i32)** %_M_manager.i.i18, align 8, !tbaa !6
  %tobool.i29 = icmp eq i1 (%"union.std::_Any_data"*, %"union.std::_Any_data"*, i32)* %11, null
  br i1 %tobool.i29, label %if.end6, label %if.then.i32

if.then.i32:                                      ; preds = %_ZNSt8functionIFviEEC2ERKS1_.exit27
  %_M_functor.i30 = getelementptr inbounds %"class.std::function", %"class.std::function"* %agg.tmp5, i64 0, i32 0, i32 0
  %call.i31 = call zeroext i1 %11(%"union.std::_Any_data"* dereferenceable(16) %_M_functor.i30, %"union.std::_Any_data"* dereferenceable(16) %_M_functor.i30, i32 3) #2
  br label %if.end6

if.end6:                                          ; preds = %if.end.if.end6_crit_edge, %if.then.i32, %_ZNSt8functionIFviEEC2ERKS1_.exit27
  %_M_manager.i.i11.pre_phi = phi i1 (%"union.std::_Any_data"*, %"union.std::_Any_data"*, i32)** [ %.pre, %if.end.if.end6_crit_edge ], [ %_M_manager.i.i.i19, %if.then.i32 ], [ %_M_manager.i.i.i19, %_ZNSt8functionIFviEEC2ERKS1_.exit27 ]
  %data = getelementptr inbounds %struct.Node, %struct.Node* %n, i64 0, i32 0
  %12 = load i32, i32* %data, align 8, !tbaa !13
  %13 = bitcast i32* %__args.addr.i to i8*
  call void @llvm.lifetime.start(i64 4, i8* %13)
  store i32 %12, i32* %__args.addr.i, align 4, !tbaa !14
  %14 = load i1 (%"union.std::_Any_data"*, %"union.std::_Any_data"*, i32)*, i1 (%"union.std::_Any_data"*, %"union.std::_Any_data"*, i32)** %_M_manager.i.i11.pre_phi, align 8, !tbaa !6
  %lnot.i.i12 = icmp eq i1 (%"union.std::_Any_data"*, %"union.std::_Any_data"*, i32)* %14, null
  br i1 %lnot.i.i12, label %if.then.i13, label %_ZNKSt8functionIFviEEclEi.exit

if.then.i13:                                      ; preds = %if.end6
  call void @_ZSt25__throw_bad_function_callv() #7
  unreachable

_ZNKSt8functionIFviEEclEi.exit:                   ; preds = %if.end6
  %_M_invoker.i14 = getelementptr inbounds %"class.std::function", %"class.std::function"* %f, i64 0, i32 1
  %15 = load void (%"union.std::_Any_data"*, i32*)*, void (%"union.std::_Any_data"*, i32*)** %_M_invoker.i14, align 8, !tbaa !1
  %_M_functor.i15 = getelementptr inbounds %"class.std::function", %"class.std::function"* %f, i64 0, i32 0, i32 0
  call void %15(%"union.std::_Any_data"* dereferenceable(16) %_M_functor.i15, i32* nonnull dereferenceable(4) %__args.addr.i) #2
  call void @llvm.lifetime.end(i64 4, i8* %13)
  ret void
}

slide-29
SLIDE 29

print with clang -O3 -fno-exceptions

define void @_Z5printP4Node(%struct.Node* nocapture readonly %n) #3 {
entry:
  %agg.tmp = alloca %"class.std::function", align 8
  %_M_manager.i.i = getelementptr inbounds %"class.std::function", %"class.std::function"* %agg.tmp, i64 0, i32 0, i32 1
  %_M_invoker.i = getelementptr inbounds %"class.std::function", %"class.std::function"* %agg.tmp, i64 0, i32 1
  store void (%"union.std::_Any_data"*, i32*)* @"_ZNSt17_Function_handlerIFviEZ5printP4NodeE3$_0E9_M_invokeERKSt9_Any_dataOi", void (%"union.std::_Any_data"*, i32*)** %_M_invoker.i, align 8, !tbaa !1
  store i1 (%"union.std::_Any_data"*, %"union.std::_Any_data"*, i32)* @"_ZNSt14_Function_base13_Base_managerIZ5printP4NodeE3$_0E10_M_managerERSt9_Any_dataRKS5_St18_Manager_operation", i1 (%"union.std::_Any_data"*, %"union.std::_Any_data"*, i32)** %_M_manager.i.i, align 8, !tbaa !6
  call fastcc void @_ZL16post_order_visitP4NodeSt8functionIFviEE(%struct.Node* %n, %"class.std::function"* nonnull %agg.tmp)
  %0 = load i1 (%"union.std::_Any_data"*, %"union.std::_Any_data"*, i32)*, i1 (%"union.std::_Any_data"*, %"union.std::_Any_data"*, i32)** %_M_manager.i.i, align 8, !tbaa !6
  %tobool.i = icmp eq i1 (%"union.std::_Any_data"*, %"union.std::_Any_data"*, i32)* %0, null
  br i1 %tobool.i, label %_ZNSt14_Function_baseD2Ev.exit, label %if.then.i

if.then.i:                                        ; preds = %entry
  %_M_functor.i = getelementptr inbounds %"class.std::function", %"class.std::function"* %agg.tmp, i64 0, i32 0, i32 0
  %call.i = call zeroext i1 %0(%"union.std::_Any_data"* dereferenceable(16) %_M_functor.i, %"union.std::_Any_data"* dereferenceable(16) %_M_functor.i, i32 3) #2
  br label %_ZNSt14_Function_baseD2Ev.exit

_ZNSt14_Function_baseD2Ev.exit:                   ; preds = %entry, %if.then.i
  ret void
}

define internal void @"_ZNSt17_Function_handlerIFviEZ5printP4NodeE3$_0E9_M_invokeERKSt9_Any_dataOi"(%"union.std::_Any_data"* nocapture readnone dereferenceable(16) %__functor, i32* nocapture readonly dereferenceable(4) %__args) #3 align 2 {
entry:
  %0 = load i32, i32* %__args, align 4, !tbaa !14
  %call.i = tail call dereferenceable(272) %"class.std::basic_ostream"* @_ZNSolsEi(%"class.std::basic_ostream"* nonnull @_ZSt4cout, i32 %0) #2
  %1 = bitcast %"class.std::basic_ostream"* %call.i to i8**
  %vtable.i.i = load i8*, i8** %1, align 8, !tbaa !15
  %vbase.offset.ptr.i.i = getelementptr i8, i8* %vtable.i.i, i64 -24
  %2 = bitcast i8* %vbase.offset.ptr.i.i to i64*
  %vbase.offset.i.i = load i64, i64* %2, align 8
  %3 = bitcast %"class.std::basic_ostream"* %call.i to i8*
  %add.ptr.i.i = getelementptr inbounds i8, i8* %3, i64 %vbase.offset.i.i
  %_M_ctype.i.i = getelementptr inbounds i8, i8* %add.ptr.i.i, i64 240
  %4 = bitcast i8* %_M_ctype.i.i to %"class.std::ctype"**
  %5 = load %"class.std::ctype"*, %"class.std::ctype"** %4, align 8, !tbaa !17
  %tobool.i5.i = icmp eq %"class.std::ctype"* %5, null
  br i1 %tobool.i5.i, label %if.then.i6.i, label %_ZSt13__check_facetISt5ctypeIcEERKT_PS3_.exit.i

if.then.i6.i:                                     ; preds = %entry
  tail call void @_ZSt16__throw_bad_castv() #7
  unreachable

_ZSt13__check_facetISt5ctypeIcEERKT_PS3_.exit.i:  ; preds = %entry
  %_M_widen_ok.i.i = getelementptr inbounds %"class.std::ctype", %"class.std::ctype"* %5, i64 0, i32 8
  %6 = load i8, i8* %_M_widen_ok.i.i, align 8, !tbaa !20
  %tobool.i.i = icmp eq i8 %6, 0
  br i1 %tobool.i.i, label %if.end.i.i, label %if.then.i.i

if.then.i.i:                                      ; preds = %_ZSt13__check_facetISt5ctypeIcEERKT_PS3_.exit.i
  %arrayidx.i.i = getelementptr inbounds %"class.std::ctype", %"class.std::ctype"* %5, i64 0, i32 9, i64 10
  %7 = load i8, i8* %arrayidx.i.i, align 1, !tbaa !22
  br label %"_ZZ5printP4NodeENK3$_0clEi.exit"

if.end.i.i:                                       ; preds = %_ZSt13__check_facetISt5ctypeIcEERKT_PS3_.exit.i
  tail call void @_ZNKSt5ctypeIcE13_M_widen_initEv(%"class.std::ctype"* nonnull %5) #2
  %8 = bitcast %"class.std::ctype"* %5 to i8 (%"class.std::ctype"*, i8)***
  %vtable.i3.i = load i8 (%"class.std::ctype"*, i8)**, i8 (%"class.std::ctype"*, i8)*** %8, align 8, !tbaa !15
  %vfn.i.i = getelementptr inbounds i8 (%"class.std::ctype"*, i8)*, i8 (%"class.std::ctype"*, i8)** %vtable.i3.i, i64 6
  %9 = load i8 (%"class.std::ctype"*, i8)*, i8 (%"class.std::ctype"*, i8)** %vfn.i.i, align 8
  %call.i4.i = tail call signext i8 %9(%"class.std::ctype"* nonnull %5, i8 signext 10) #2
  br label %"_ZZ5printP4NodeENK3$_0clEi.exit"

"_ZZ5printP4NodeENK3$_0clEi.exit":                ; preds = %if.then.i.i, %if.end.i.i
  %retval.0.i.i = phi i8 [ %7, %if.then.i.i ], [ %call.i4.i, %if.end.i.i ]
  %call1.i.i = tail call dereferenceable(272) %"class.std::basic_ostream"* @_ZNSo3putEc(%"class.std::basic_ostream"* nonnull %call.i, i8 signext %retval.0.i.i) #2
  %call.i.i = tail call dereferenceable(272) %"class.std::basic_ostream"* @_ZNSo5flushEv(%"class.std::basic_ostream"* nonnull %call1.i.i) #2
  ret void
}

define internal zeroext i1 @"_ZNSt14_Function_base13_Base_managerIZ5printP4NodeE3$_0E10_M_managerERSt9_Any_dataRKS5_St18_Manager_operation"(%"union.std::_Any_data"* nocapture dereferenceable(16) %__dest, %"union.std::_Any_data"* dereferenceable(16) %__source, i32 %__op) #5 align 2 {
entry:
  switch i32 %__op, label %sw.epilog [
    i32 0, label %sw.bb
    i32 1, label %sw.bb1
  ]

sw.bb:                                            ; preds = %entry
  %0 = bitcast %"union.std::_Any_data"* %__dest to %"class.std::type_info"**
  store %"class.std::type_info"* bitcast ({ i8*, i8* }* @"_ZTIZ5printP4NodeE3$_0" to %"class.std::type_info"*), %"class.std::type_info"** %0, align 8, !tbaa !11
  br label %sw.epilog

sw.bb1:                                           ; preds = %entry
  %1 = bitcast %"union.std::_Any_data"* %__dest to %"union.std::_Any_data"**
  store %"union.std::_Any_data"* %__source, %"union.std::_Any_data"** %1, align 8, !tbaa !11
  br label %sw.epilog

sw.epilog:                                        ; preds = %entry, %sw.bb1, %sw.bb
  ret i1 false
}

slide-30
SLIDE 30

Working with Higher-Order Functions

  • A Graph-Based Higher-Order Intermediate Representation. Leißa, Köster, and Hack. CGO 2015.
  • Shallow Embedding of DSLs via Online Partial Evaluation. Leißa, Boesche, Hack, Membarth, and Slusallek. GPCE 2015.

slide-32
SLIDE 32

Closure Conversion

slide-33
SLIDE 33

Closure Conversion

Source program:

void range(int a, int b, function<void(int)> f) {
    if (a < b) {
        f(a);
        range(a+1, b, f);
    }
}

void foo(int n) {
    range(0, n, [=] (int i) { use(i, n); });
}

After closure conversion:

struct closurebase { void (*f)(void* c, int i); };
struct closure { closurebase base; int n; };

// The lambda body becomes a top-level function; the captured n is read from the closure.
void lambda(void* c, int i) {
    use(i, ((closure*) c)->n);
}

void range(int a, int b, void* c) {
    if (a < b) {
        ((closurebase*) c)->f(c, a);
        range(a+1, b, c);
    }
}

void foo(int n) {
    closure c = {{&lambda}, n};
    range(0, n, &c);
}
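The sketch below makes the closure-converted program self-contained so that it can be compiled and checked. The talk does not define `use`; the version here is a hypothetical stand-in that accumulates `i * n` into a global. As a further assumption for testability, `foo` returns the accumulated value instead of `void`.

```cpp
#include <cassert>

// Hypothetical stand-in for use(), which the talk leaves undefined.
static long g_acc = 0;
void use(int i, int n) { g_acc += static_cast<long>(i) * n; }

// Common prefix of every closure: the code pointer.
struct closurebase { void (*f)(void* c, int i); };

// Closure for the lambda in foo: code pointer plus the captured n.
struct closure { closurebase base; int n; };

// The former lambda body, now a plain function taking its environment.
void lambda(void* c, int i) {
    use(i, static_cast<closure*>(c)->n);
}

void range(int a, int b, void* c) {
    if (a < b) {
        // base is the first member of the standard-layout struct closure,
        // so a pointer to the closure can be viewed as a closurebase*.
        static_cast<closurebase*>(c)->f(c, a);
        range(a + 1, b, c);
    }
}

// Assumption: returns the accumulated value so the result is observable.
long foo(int n) {
    g_acc = 0;
    closure c = {{&lambda}, n};
    range(0, n, &c);
    return g_acc;
}
```

With this stand-in, `foo(4)` calls `use` for i = 0, 1, 2, 3 with n = 4 and therefore accumulates (0 + 1 + 2 + 3) * 4 = 24.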

slide-51
SLIDE 51

What does LLVM do?

  • inline the call to the closure’s function pointer
  • SSA-construct the closure struct
  • dissolve the struct to scalar values (Scalar Replacement of Aggregates)
  • usually works well for typical STL algorithms
  • fails for recursive higher-order functions like
      • range
      • post_order_visit
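
The three steps above can be illustrated with a minimal C++ sketch. It is a hand-written illustration, not compiler output: `range_loop`, `foo_closure`, `foo_scalarized` and the `out` field are hypothetical names introduced here, and the lambda body records `i * n` as a stand-in for the slides' `use(i, n)`. The higher-order function is deliberately a non-recursive loop, the case where inlining followed by scalar replacement can be expected to succeed.

```cpp
#include <vector>

// Hypothetical closure layout, mirroring the closure-converted code shown earlier.
struct closurebase { void (*f)(void* c, int i); };
struct closure { closurebase base; int n; std::vector<int>* out; };

// Stand-in for use(i, n): records i * n in the vector referenced by the closure.
static void lambda(void* c, int i) {
    closure* self = static_cast<closure*>(c);
    self->out->push_back(i * self->n);
}

// Non-recursive higher-order function: a plain loop over [a, b).
static void range_loop(int a, int b, void* c) {
    for (int i = a; i < b; ++i)
        static_cast<closurebase*>(c)->f(c, i);
}

// Before optimization: builds a closure struct and calls through its function pointer.
std::vector<int> foo_closure(int n) {
    std::vector<int> out;
    closure c = {{&lambda}, n, &out};
    range_loop(0, n, &c);
    return out;
}

// The shape one would hope for after inlining and scalar replacement:
// no closure struct and no indirect call remain.
std::vector<int> foo_scalarized(int n) {
    std::vector<int> out;
    for (int i = 0; i < n; ++i)
        out.push_back(i * n);
    return out;
}
```

Both functions produce the same sequence. For a recursive `range`, the indirect call sits inside a function that calls itself, so the same inlining step is not available in this simple form.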

slide-56
SLIDE 56

Closure Conversion

clang: AST → (closure conversion) → LLVM → BE

  • reimplement for every front-end
  • taints the IR with implementation of higher-order functions
  • bloats the IR
  • set of finely tuned analyses & transformations needed for optimization

slide-57
SLIDE 57

Closure Conversion

impala: AST → Thorin → LLVM

  • Thorin = higher-order + CPS + “sea of nodes”
  • directly translate higher-order functions and calls to Thorin
  • keep higher-order functions until late in compilation
  • powerful closure-elimination phase

slide-58
SLIDE 58

Thorin

slide-60
SLIDE 60

SSA-Form

int foo(int n) {
    int a;
    if (n==0) {
        a = 23;
    } else {
        a = 42;
    }
    return a;
}

int foo(int n) {
    branch(n==0, then, else)
then:
    goto next;
else:
    goto next;
next:
    int a = φ(23 [then], 42 [else]);
    return a;
}

slide-62
SLIDE 62

CPS

int foo(int n) {
    branch(n==0, then, else)
then:
    goto next;
else:
    goto next;
next:
    int a = φ(23 [then], 42 [else]);
    return a;
}

foo(n: int, ret: int → ⊥) → ⊥:
    let
        then() → ⊥: next(23)
        else() → ⊥: next(42)
        next(a: int) → ⊥: ret(a)
    in branch(n==0, then, else)
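
The CPS form can be mimicked in C++ with lambdas, as a rough sketch. The names `IntCont`, `foo_cps`, `then_` and `else_` are introduced here for illustration only. One caveat: a C++ lambda does return to its caller, whereas a real continuation (type `int → ⊥`) never returns, so this only models how results are passed forward rather than back.

```cpp
#include <functional>

// A continuation taking an int; "void" stands in for the slide's "→ ⊥".
using IntCont = std::function<void(int)>;

// CPS version of foo: the result is handed to ret instead of being returned.
void foo_cps(int n, const IntCont& ret) {
    auto next  = [&](int a) { ret(a); };    // next(a: int): ret(a)
    auto then_ = [&]() { next(23); };       // then(): next(23)
    auto else_ = [&]() { next(42); };       // else(): next(42)
    if (n == 0) then_(); else else_();      // branch(n==0, then, else)
}

// Direct-style wrapper, used only to observe the value passed to ret.
int foo(int n) {
    int result = 0;
    foo_cps(n, [&](int a) { result = a; });
    return result;
}
```

The φ-function has disappeared: its two operands became the arguments of the two calls to `next`, and its result became the parameter `a`.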

slide-64
SLIDE 64

Thorin

foo(n: int, ret: int → ⊥) → ⊥:
    let
        then() → ⊥: next(23)
        else() → ⊥: next(42)
        next(a: int) → ⊥: ret(a)
    in branch(n==0, then, else)

foo(n: int, ret: cn(int)):
    branch(•, then, else)    (• = graph edge to the node n==0)
then():
    next(23)
else():
    next(42)
next(a: int):
    ret(a)

slide-68
SLIDE 68

Classic CPS vs Thorin

Classic CPS        Thorin
let                graph edge (acyclic graph)
letrec             graph edge (cyclic graph)
block nesting      implicit
name resolution    graph edge
name capture       none

slide-80
SLIDE 80

SSA vs Thorin

int foo(int n) {
    branch(n==0, then, else)
then:
    goto next;
else:
    goto next;
next:
    int a = φ(23 [then], 42 [else]);
    return a;
}

foo(n: int, ret: cn(int)):
    branch(n==0, then, else)
then():
    next(23)
else():
    next(42)
next(a: int):
    ret(a)

Thorin          SSA
continuation    function, basic block
parameter       parameter, φ
call            call, terminator, φ-arg
primop          instruction
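
To make the correspondence concrete, here is a small C++ sketch of a graph with these four node kinds. The classes `Def`, `Param`, `PrimOp`, `Continuation` and `FooGraph` are hypothetical and written for this illustration; they are not Thorin's actual classes or API. Each continuation holds its parameters plus exactly one call (a callee and its arguments), and all references are plain pointers, that is, graph edges rather than names.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical node kinds; NOT Thorin's real class hierarchy.
struct Def {
    std::string name;
    explicit Def(std::string n) : name(std::move(n)) {}
    virtual ~Def() = default;
};

struct Param : Def { using Def::Def; };            // SSA: parameter or φ

struct PrimOp : Def {                              // SSA: instruction
    std::vector<const Def*> ops;
    PrimOp(std::string n, std::vector<const Def*> o)
        : Def(std::move(n)), ops(std::move(o)) {}
};

struct Continuation : Def {                        // SSA: function or basic block
    using Def::Def;
    std::vector<std::unique_ptr<Param>> params;
    const Def* callee = nullptr;                   // SSA: call or terminator
    std::vector<const Def*> args;                  // SSA: φ-args
    Param* add_param(std::string n) {
        params.push_back(std::make_unique<Param>(std::move(n)));
        return params.back().get();
    }
    void jump(const Def* c, std::vector<const Def*> a) {
        callee = c;
        args = std::move(a);
    }
};

// Builds the graph for foo from the slide; "branch" is modeled as a plain node.
struct FooGraph {
    Continuation foo{"foo"}, then_{"then"}, else_{"else"}, next{"next"}, branch{"branch"};
    PrimOp c0{"0", {}}, c23{"23", {}}, c42{"42", {}};
    std::unique_ptr<PrimOp> cmp;
    FooGraph() {
        Param* n   = foo.add_param("n");
        Param* ret = foo.add_param("ret");
        Param* a   = next.add_param("a");
        cmp = std::make_unique<PrimOp>("eq", std::vector<const Def*>{n, &c0});
        foo.jump(&branch, {cmp.get(), &then_, &else_});  // branch(n==0, then, else)
        then_.jump(&next, {&c23});                       // next(23)
        else_.jump(&next, {&c42});                       // next(42)
        next.jump(ret, {a});                             // ret(a)
    }
};
```

In this sketch the parameter `a` of `next` plays the role of the φ-function, and the arguments `23` and `42` of the two jumps to `next` are its φ-args.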

slide-82
SLIDE 82

Lambda Mangling

slide-85
SLIDE 85

Control-Flow Form

int foo(int n) {
    branch(n==0, then, else)
then:
    goto next;
else:
    goto next;
next:
    int a = φ(23 [then], 42 [else]);
    return a;
}

foo(n: int, ret: cn(int)):
    branch(n==0, then, else)
then():
    next(23)
else():
    next(42)
next(a: int):
    ret(a)

  • A Thorin program is in CFF if every continuation is either
      • a first-order continuation ⇒ basic block, or
      • a top-level continuation with “return” ⇒ function
  • straightforward to translate to SSA form [Kelsey95]
  • no closures needed

slide-87
SLIDE 87

Not in CFF

void range(int a, int b, function<void(int)> f) {
    //...
    range(a+1, b, f);
}

range(a: int, b: int, f: cn(int, cn()), ret: cn()):
    /* ... */
    range(a+1, b, f, ret)

CFF-convertible if

  • recursion-free or
  • tail-recursive

slide-94
SLIDE 94

Classes of Thorin Programs

[Figure: classes of Thorin programs. Labels: CFF, CFF-convertible, explicit closures; lambda mangling]

slide-98
SLIDE 98

Lambda Mangling = partial inlining/outlining

  • (partial) inlining
  • (partial) outlining
  • clone basic blocks/functions
  • loop peeling
  • loop unrolling
  • tail-recursion elimination
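
As a rough illustration of two items from this list, the following C++ sketch applies them by hand to the `range` example: first specializing `range` for one known argument `f`, then eliminating the tail recursion. This is a manual source-level rewrite, not Thorin's transformation on its graph; `range_spec`, `range_loop`, `trace` and the `run_*` helpers are hypothetical names, and `use` is a stand-in that records its arguments.

```cpp
#include <functional>
#include <vector>

// Stand-in for the slides' use(i, n): records the pair as i * 100 + n.
static std::vector<int> trace;
static void use(int i, int n) { trace.push_back(i * 100 + n); }

// Generic higher-order version: f is a higher-order parameter.
void range(int a, int b, const std::function<void(int)>& f) {
    if (a < b) { f(a); range(a + 1, b, f); }
}

// Step 1: specialize for the one known f. The parameter f is dropped, its body
// is inlined, and the captured variable n becomes an ordinary parameter.
void range_spec(int a, int b, int n) {
    if (a < b) { use(a, n); range_spec(a + 1, b, n); }
}

// Step 2: tail-recursion elimination turns the specialized version into a loop.
void range_loop(int a, int b, int n) {
    while (a < b) { use(a, n); a = a + 1; }
}

// Helpers that run each variant and return what was recorded.
std::vector<int> run_generic(int n) {
    trace.clear();
    range(0, n, [=](int i) { use(i, n); });
    return trace;
}
std::vector<int> run_spec(int n) { trace.clear(); range_spec(0, n, n); return trace; }
std::vector<int> run_loop(int n) { trace.clear(); range_loop(0, n, n); return trace; }
```

All three variants record the same calls, but only the last two are free of function values, so neither needs a closure.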

slide-102
SLIDE 102

Impala

slide-103
SLIDE 103

Impala

fn post_order_visit(n: &Node, f: fn(int) -> ()) -> () {
    if n.left != nil {
        post_order_visit(n.left, f);
    }
    if n.right != nil {
        post_order_visit(n.right, f);
    }
    f(n.data)
}

fn print(n: &Node) -> () {
    post_order_visit(n, |d| { println(d); });
}

slide-111
SLIDE 111

Impala - for Syntax

fn post_order_visit(n: &Node, f: fn(int) -> ()) -> () {
    if n.left != nil {
        post_order_visit(n.left, f);
    }
    if n.right != nil {
        post_order_visit(n.right, f);
    }
    f(n.data)
}

fn print(n: &Node) -> () {
    for d in post_order_visit(n) {
        println(d);
    }
}

slide-112
SLIDE 112

Impala - sum

fn sum(n: &Node) -> () {
    let mut result = 0;
    for d in post_order_visit(n) {
        result += d
    }
    println(result);
}
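
For readers more at home in C++, a rough equivalent of this example is sketched below. It assumes, based on the two `print` variants shown earlier, that the body of the `for` construct is passed to `post_order_visit` as its function argument. The `Node` layout is an assumption, and `sum` returns the result instead of printing it so that it can be checked.

```cpp
#include <functional>

// Assumed node layout: a payload plus two possibly-null children.
struct Node { int data; Node* left; Node* right; };

// Visits the left subtree, then the right subtree, then the node itself.
void post_order_visit(const Node* n, const std::function<void(int)>& f) {
    if (n->left != nullptr) post_order_visit(n->left, f);
    if (n->right != nullptr) post_order_visit(n->right, f);
    f(n->data);
}

// The loop body becomes a lambda that captures result by reference,
// which corresponds to the mutable variable in the Impala version.
int sum(const Node* n) {
    int result = 0;
    post_order_visit(n, [&](int d) { result += d; });
    return result;
}
```

The lambda captures `result`, so a closure-converting compiler would have to materialize a closure here unless the higher-order call is specialized away.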

slide-115
SLIDE 115

Impala - return is the new continue

fn sum(n: &Node) -> () {
    let mut result = 0;
    post_order_visit(n, |d| {
        if d == 23 { return() }
        result += d
    });
    println(result);
}

slide-119
SLIDE 119

Impala - continue is the new return

fn sum(n: &Node) -> () {
    let mut result = 0;
    for d in post_order_visit(n) {
        if d == 23 { continue() }
        result += d
    }
    println(result);
}

slide-120
SLIDE 120

Impala - Give me a break, please!

fn sum(n: &Node) -> () {
    let mut result = 0;
    for d in post_order_visit(n) {
        if d == 23 { break() }
        result += d
    }
    println(result);
}

slide-121
SLIDE 121

Impala

fn post_order_visit(n: &Node, f: fn(int) -> ()) -> () {
    if n.left != nil {
        post_order_visit(n.left, f);
    }
    if n.right != nil {
        post_order_visit(n.right, f);
    }
    f(n.data)
}

fn print(n: &Node) -> () {
    for d in post_order_visit(n) {
        println(d);
    }
}

slide-124
SLIDE 124

Generated LLVM (1)

define internal void @post_order_visit_392(%Node* %n_394) {
post_order_visit_392_start:
  br label %post_order_visit
post_order_visit:
  %0 = getelementptr inbounds %0, %Node* %n_394, i32 0, i32 1
  %1 = load %Node*, %Node** %0
  %2 = icmp ne %Node* %1, null
  br i1 %2, label %if_then, label %if_else
if_then:
  call void @post_order_visit_392(%Node* %1)
  br label %next
if_else:
  br label %next
; ...

slide-128
SLIDE 128

Generated LLVM (2)

; ...
next:
  %3 = getelementptr inbounds %0, %Node* %n_394, i32 0, i32 2
  %4 = load %Node*, %Node** %3
  %5 = icmp ne %Node* %4, null
  br i1 %5, label %if_then2, label %if_else1
if_then2:
  call void @post_order_visit_392(%Node* %4)
  br label %next3
if_else1:
  br label %next3
next3:
  %6 = getelementptr inbounds %0, %Node* %n_394, i32 0, i32 0
  %7 = load i32, i32* %6
  call void @println(i32 %7)
  ret void
}

slide-133
SLIDE 133

Evaluation

slide-134
SLIDE 134

Benchmarks – The Computer Language Benchmarks Game [1]

runtime in ms         C         Impala
aobench               1.220     1.357
fannkuch-redux        27.137    28.070
fasta                 2.313     1.517
mandelbrot            2.143     2.113
meteor-contest        0.047     0.043
n-body                5.497     6.130
pidigits              0.710     0.763
regex                 6.477     6.470
reverse-complement    1.090     1.220
spectral-norm         4.423     4.480

  • higher-order IR does not “hurt” performance
  • all closures removed

[1] https://benchmarksgame.alioth.debian.org/

slide-136
SLIDE 136

Summary

slide-137
SLIDE 137

Conclusions

Thank you! Questions?
