Effect Handlers in Multicore OCaml

slide-1
SLIDE 1

Effect Handlers in Multicore OCaml

Daniel Hillerström, Daan Leijen, Sam Lindley, Matija Pretnar, Andreas Rossberg, KC Sivaramakrishnan

slide-9
SLIDE 9

Effect Handlers

  • Multicore OCaml is an OCaml extension with native support for concurrency and shared-memory parallelism

✦ Concurrency expressed through effect handlers
✦ Will land upstream in Q2 2021

    effect E : string

    let comp () =
      print_string "0 ";
      print_string (perform E);
      print_string "3 "

    let main () =
      try comp ()
      with effect E k ->
        print_string "1 ";
        continue k "2 ";
        print_string "4 "

Callouts on the code:

✦ effect declaration: effect E : string
✦ computation: comp ()
✦ handler: with effect E k -> ...
✦ suspends current computation: perform E
✦ delimited continuation: k
✦ resume suspended computation: continue k "2 "
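The slide's program uses the Multicore OCaml surface syntax. As a rough sketch only, the same program can be written against the `Effect` module of the OCaml 5 standard library, which has no dedicated syntax for declaring or handling effects. The buffer `out` and the helper `emit` are assumptions added here so that the order of events can be inspected; they are not part of the slide.

```ocaml
open Effect
open Effect.Deep

(* Effect declaration: performing E yields a string. *)
type _ Effect.t += E : string Effect.t

(* Hypothetical helpers: collect output in a buffer instead of printing. *)
let out = Buffer.create 16
let emit s = Buffer.add_string out s

(* The computation. *)
let comp () =
  emit "0 ";
  emit (perform E);      (* suspends comp and transfers control to the handler *)
  emit "3 "

(* The handler. *)
let main () =
  try_with comp ()
    { effc = (fun (type a) (eff : a Effect.t) ->
        match eff with
        | E -> Some (fun (k : (a, unit) continuation) ->
            emit "1 ";
            continue k "2 ";   (* resumes comp; returns once comp has finished *)
            emit "4 ")
        | _ -> None) }

let () =
  main ();
  print_endline (Buffer.contents out)   (* prints: 0 1 2 3 4 *)
```

The numbers come out in order because `continue k "2 "` only returns to the handler after the resumed computation has run to completion.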

slide-10
SLIDE 10

Compilation

[Animated stack diagram, slides 10 to 22, for the example program from the Effect Handlers slide. Recoverable labels: pc, sp, main, comp, parent, k, and the printed output.]

✦ Slides 10-11: main, with pc and sp
✦ Slides 12-14: comp appears, linked to main by a parent pointer
✦ Slides 15-18: the continuation k appears and the parent pointer is no longer shown; the output shows "1" from slide 17
✦ Slides 19-20: the parent pointer is shown again; the output reaches "1 2"
✦ Slides 21-22: comp is no longer shown; the output reaches "1 2 3", then "1 2 3 4"

slide-26
SLIDE 26

    effect A : unit
    effect B : unit

    let baz () =
      perform A

    let bar () =
      try baz ()
      with effect B k -> continue k ()

    let foo () =
      try bar ()
      with effect A k -> continue k ()

Handlers can be nested

[Stack diagram: foo, bar, and baz, with sp, pc, and parent pointers; the final frames add the continuation k.]

  • Linear search through handlers
  • Handler stacks shallow in practice
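To make the search through nested handlers concrete, here is a minimal sketch of the same program using the OCaml 5 `Effect` module instead of the slide's syntax. The `events` list and the `note` helper are assumptions added for illustration. The handler in `bar` answers `None` for `A`, so the search moves on to the enclosing handler in `foo`.

```ocaml
open Effect
open Effect.Deep

type _ Effect.t += A : unit Effect.t | B : unit Effect.t

(* Hypothetical helper: record which handler ran. *)
let events = ref []
let note s = events := s :: !events

let baz () = perform A

let bar () =
  try_with baz ()
    { effc = (fun (type a) (eff : a Effect.t) ->
        match eff with
        | B -> Some (fun (k : (a, unit) continuation) ->
            note "bar handled B";
            continue k ())
        | _ -> None) }   (* not B: the enclosing handler is tried next *)

let foo () =
  try_with bar ()
    { effc = (fun (type a) (eff : a Effect.t) ->
        match eff with
        | A -> Some (fun (k : (a, unit) continuation) ->
            note "foo handled A";
            continue k ())
        | _ -> None) }

let () =
  foo ();
  List.iter print_endline (List.rev !events)   (* prints: foo handled A *)
```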

slide-29
SLIDE 29

Deep-dive into perform

  • Full power of pattern matching for matching effects

✦ Tag test + branching is compiled to a function
✦ https://github.com/ocaml-multicore/ocaml-multicore/blob/parallel_minor_gc/runtime/amd64.S#L865
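As an illustration of pattern matching on effects, the sketch below uses the OCaml 5 `Effect` module, where the handler is written explicitly as a function (`effc`) that inspects the effect and either returns a clause or `None`. This loosely mirrors the slide's point that the tag test and branching live in a function, but it is not the code the compiler generates. The effect `Get` and its payload are hypothetical.

```ocaml
open Effect
open Effect.Deep

(* Hypothetical effect carrying a payload. *)
type _ Effect.t += Get : string -> int Effect.t

let lookup () =
  perform (Get "x") + perform (Get "y") + perform (Get "other")

let result =
  try_with lookup ()
    { effc = (fun (type a) (eff : a Effect.t) ->
        match eff with
        (* Constant pattern on the payload. *)
        | Get "x" ->
            Some (fun (k : (a, int) continuation) -> continue k 1)
        (* Guarded pattern. *)
        | Get name when String.length name = 1 ->
            Some (fun (k : (a, int) continuation) -> continue k 10)
        (* Catch-all for the remaining Get requests. *)
        | Get _ ->
            Some (fun (k : (a, int) continuation) -> continue k 100)
        (* Any other effect is left to an outer handler. *)
        | _ -> None) }

let () = Printf.printf "%d\n" result   (* prints: 111 *)
```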

slide-33
SLIDE 33

Performance

  • Intel(R) Xeon(R) Gold 5120 CPU @ 2.20GHz

✦ For reference, memory read latency is 90 ns (local NUMA node) and 145 ns (remote NUMA node)

    let foo () =
      (* a *)
      try
        (* b *)
        perform E
        (* d *)
      with effect E k ->
        (* c *)
        continue k ()
        (* e *)

Instruction sequence   Significance                                    Time (ns)
a to b                 Create a new stack & run the computation        2479
b to c                 Performing & handling an effect                 122
c to d                 Resuming a continuation                         189
d to e                 Returning from a computation & free the stack   155

  • Each of the instruction sequences involves a stack switch
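The order in which control reaches the points a to e can be checked with a small trace. This is a sketch using the OCaml 5 `Effect` module rather than the slide's syntax, and it records the order only; it does not measure time. The `trace` list and the `mark` helper are assumptions added for illustration.

```ocaml
open Effect
open Effect.Deep

type _ Effect.t += E : unit Effect.t

(* Hypothetical helper: record each program point as it is reached. *)
let trace = ref []
let mark p = trace := p :: !trace

let foo () =
  mark "a";
  try_with
    (fun () ->
      mark "b";          (* running on the newly created stack *)
      perform E;
      mark "d")          (* after the continuation has been resumed *)
    ()
    { effc = (fun (type x) (eff : x Effect.t) ->
        match eff with
        | E -> Some (fun (k : (x, unit) continuation) ->
            mark "c";    (* in the handler, back on the original stack *)
            continue k ())
        | _ -> None) };
  mark "e"               (* the computation has returned *)

let () =
  foo ();
  print_endline (String.concat " " (List.rev !trace))   (* prints: a b c d e *)
```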

slide-39
SLIDE 39

Performance: Generators

  • Traverse a complete binary-tree of depth 25
  • Iterator: idiomatic recursive traversal
  • Generator: next() function to consume elements on-demand

✦ Hand-written generator (hw-generator): CPS translation + defunctionalization to remove intermediate closure allocation
✦ Generator using effect handlers (eh-generator): 2 * (2^25 - 1) + 2 = 2^26 stack switches

Multicore OCaml

Variant               Time (milliseconds)
Iterator (baseline)   202
hw-generator          761 (3.76x)
eh-generator          1879 (9.30x)

nodejs 14.07

Variant               Time (milliseconds)
Iterator (baseline)   492
generator             43842 (89.1x)
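The slides do not include the benchmark source. The following is a minimal sketch of how a generator could be derived from the recursive iterator with an effect handler, written against the OCaml 5 `Effect` module. The names `tree`, `make`, `iter`, `Yield`, `generator`, and `to_list` are assumptions, and the demo uses depth 3 rather than 25. Each element costs one switch into the handler (`perform`) and one switch back (`continue`), which is consistent with the 2 * (2^25 - 1) term in the count above.

```ocaml
open Effect
open Effect.Deep

type tree = Leaf | Node of tree * int * tree

(* Complete binary tree of the given depth, labelled in in-order position. *)
let make depth =
  let counter = ref 0 in
  let rec go d =
    if d = 0 then Leaf
    else begin
      let l = go (d - 1) in
      incr counter;
      let v = !counter in
      let r = go (d - 1) in
      Node (l, v, r)
    end
  in
  go depth

(* Iterator: idiomatic recursive traversal. *)
let rec iter f = function
  | Leaf -> ()
  | Node (l, v, r) -> iter f l; f v; iter f r

type _ Effect.t += Yield : int -> unit Effect.t

(* Generator: returns a next () function that yields elements on demand. *)
let generator t =
  let next = ref (fun () -> None) in
  next := (fun () ->
    match_with (iter (fun v -> perform (Yield v))) t
      { retc = (fun () -> next := (fun () -> None); None);  (* traversal done *)
        exnc = raise;
        effc = (fun (type a) (eff : a Effect.t) ->
          match eff with
          | Yield v -> Some (fun (k : (a, int option) continuation) ->
              (* Save the suspended traversal; the next call resumes it. *)
              next := (fun () -> continue k ());
              Some v)
          | _ -> None) });
  fun () -> !next ()

(* Drain a generator into a list. *)
let to_list next =
  let rec loop acc =
    match next () with
    | None -> List.rev acc
    | Some v -> loop (v :: acc)
  in
  loop []

let () =
  List.iter (Printf.printf "%d ") (to_list (generator (make 3)));
  print_newline ()   (* prints: 1 2 3 4 5 6 7 *)
```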

slide-40
SLIDE 40

Performance: WebServer

  • Effect handlers for asynchronous I/O
  • Variants

✦ Go + net/http
✦ OCaml + http/af + Async (explicit callbacks)
✦ OCaml + http/af + Effect handlers

  • Latency measured using wrk2
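The slides do not show the server code. As a rough illustration of how effect handlers let concurrent code be written in direct style rather than with explicit callbacks, here is a sketch of a cooperative round-robin scheduler using the OCaml 5 `Effect` module. The effects `Fork` and `Yield`, the `run` function, and the `events` trace are all assumptions; a real asynchronous I/O layer would suspend tasks on I/O readiness rather than on an explicit yield.

```ocaml
open Effect
open Effect.Deep

(* Hypothetical effects for a cooperative scheduler. *)
type _ Effect.t +=
  | Yield : unit Effect.t
  | Fork : (unit -> unit) -> unit Effect.t

let run main =
  let q : (unit -> unit) Queue.t = Queue.create () in
  (* Run the next ready task, if there is one. *)
  let dequeue () = if Queue.is_empty q then () else (Queue.pop q) () in
  let rec spawn f =
    match_with f ()
      { retc = (fun () -> dequeue ());   (* task finished: run another one *)
        exnc = raise;
        effc = (fun (type a) (eff : a Effect.t) ->
          match eff with
          | Yield -> Some (fun (k : (a, unit) continuation) ->
              (* Park the current task and switch to the next one. *)
              Queue.push (fun () -> continue k ()) q;
              dequeue ())
          | Fork g -> Some (fun (k : (a, unit) continuation) ->
              (* Park the parent and start the child immediately. *)
              Queue.push (fun () -> continue k ()) q;
              spawn g)
          | _ -> None) }
  in
  spawn main

(* Hypothetical trace of what each task did. *)
let events = ref []
let say s = events := s :: !events

let () =
  run (fun () ->
    perform (Fork (fun () -> say "a1"; perform Yield; say "a2"));
    perform (Fork (fun () -> say "b1"; perform Yield; say "b2"));
    say "main done");
  List.iter print_endline (List.rev !events)
```

Each task reads as straight-line code; the interleaving is decided entirely by the handler.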
slide-42
SLIDE 42

Thank you!

  • Multicore OCaml

✦ https://github.com/ocaml-multicore/ocaml-multicore

  • A collection of effect handlers examples

✦ https://github.com/ocaml-multicore/effects-examples

  • JS generator example

✦ https://github.com/kayceesrk/wasmfx/tree/master/cg_4_aug_20