Reagents: lock-free programming for the masses (KC Sivaramakrishnan)


slide-1
SLIDE 1

Reagents: lock-free programming for the masses

“KC” Sivaramakrishnan

OCaml Labs University of Cambridge

slide-2
SLIDE 2

Multicore OCaml

2

Concurrency Parallelism

Compiler Language + Stdlib Libraries

slide-4
SLIDE 4

Multicore OCaml

2

Concurrency Parallelism

Compiler

Fibers

Language + Stdlib Libraries

slide-5
SLIDE 5

Multicore OCaml

2

Concurrency Parallelism

Compiler

Fibers

Language + Stdlib

  • 12M fibers/s on 1 core
  • 30M fibers/s on 4 cores

Libraries

slide-6
SLIDE 6

Multicore OCaml

2

Domains

Concurrency Parallelism

Compiler

Fibers

Language + Stdlib

  • 12M fibers/s on 1 core
  • 30M fibers/s on 4 cores

Libraries

slide-7
SLIDE 7

Multicore OCaml

2

Effects Domains

Concurrency Parallelism

Compiler

Fibers

Language + Stdlib

Domain API

  • 12M fibers/s on 1 core
  • 30M fibers/s on 4 cores

Libraries

slide-8
SLIDE 8

Multicore OCaml

2

Effects Cooperative threading libraries Domains

Concurrency Parallelism

Compiler

Fibers

Language + Stdlib

Domain API

  • 12M fibers/s on 1 core
  • 30M fibers/s on 4 cores

Libraries

slide-9
SLIDE 9

Multicore OCaml

2

Effects Cooperative threading libraries Reagents: lock-free programming Domains

Concurrency Parallelism

Compiler

Fibers

Language + Stdlib

Domain API

  • 12M fibers/s on 1 core
  • 30M fibers/s on 4 cores

Libraries

slide-10
SLIDE 10

JVM: java.util.concurrent

3

.NET: System.Collections.Concurrent

slide-11
SLIDE 11

JVM: java.util.concurrent
.NET: System.Collections.Concurrent

Synchronization
  • Reentrant locks
  • Semaphores
  • R/W locks
  • Reentrant R/W locks
  • Condition variables
  • Countdown latches
  • Cyclic barriers
  • Phasers
  • Exchangers

Data structures
  • Queues: nonblocking; blocking (array & list); synchronous; priority, nonblocking; priority, blocking
  • Deques
  • Sets
  • Maps (hash & skiplist)

slide-12
SLIDE 12

JVM: java.util.concurrent
.NET: System.Collections.Concurrent

Synchronization
  • Reentrant locks
  • Semaphores
  • R/W locks
  • Reentrant R/W locks
  • Condition variables
  • Countdown latches
  • Cyclic barriers
  • Phasers
  • Exchangers

Data structures
  • Queues: nonblocking; blocking (array & list); synchronous; priority, nonblocking; priority, blocking
  • Deques
  • Sets
  • Maps (hash & skiplist)

Not Composable

slide-13
SLIDE 13

How to build composable lock-free programs?

4

slide-14
SLIDE 14

lock-free

5

slide-15
SLIDE 15

lock-free

5

Under contention, at least 1 thread makes progress

slide-16
SLIDE 16

lock-free: Under contention, at least 1 thread makes progress

obstruction-free: Single thread in isolation makes progress
slide-17
SLIDE 17

wait-free: Under contention, each thread makes progress

lock-free: Under contention, at least 1 thread makes progress

obstruction-free: Single thread in isolation makes progress
slide-18
SLIDE 18

Compare-and-swap (CAS)

module CAS : sig
  val cas : 'a ref -> expect:'a -> update:'a -> bool
end = struct
  (* atomically... *)
  let cas r ~expect ~update =
    if !r = expect then (r := update; true)
    else false
end

6

slide-19
SLIDE 19

Compare-and-swap (CAS)

module CAS : sig
  val cas : 'a ref -> expect:'a -> update:'a -> bool
end = struct
  (* atomically... *)
  let cas r ~expect ~update =
    if !r = expect then (r := update; true)
    else false
end

  • Implemented atomically by processors
  • x86: CMPXCHG and friends
  • arm: LDREX, STREX, etc.
  • ppc: lwarx, stwcx, etc.
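
The retry pattern that CAS enables can be sketched with the OCaml standard library's Atomic module (available from OCaml 4.12), whose compare_and_set plays the role of cas above. This is a minimal illustration, not code from the talk; the names counter and incr_cas are made up for the example.

```ocaml
(* A lock-free counter: read the current value, then try to install
   value + 1. If another thread changed the counter in between, the
   compare-and-set fails and we retry with the fresh value. *)
let counter : int Atomic.t = Atomic.make 0

let rec incr_cas (c : int Atomic.t) : unit =
  let cur = Atomic.get c in
  if Atomic.compare_and_set c cur (cur + 1) then ()
  else incr_cas c

let () =
  for _ = 1 to 1000 do
    incr_cas counter
  done;
  (* A CAS whose expected value is stale reports failure and changes nothing. *)
  assert (not (Atomic.compare_and_set counter 0 1));
  assert (Atomic.get counter = 1000)
```

Note that Atomic.compare_and_set compares with physical equality, which is what a hardware CAS on a pointer or immediate value does.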

6

slide-20
SLIDE 20

[Figure, slides 20 to 26: a linked stack whose Head points to 3 -> 2. A new node 7 is prepared and a CAS on Head is attempted; a second node 5 is pushed concurrently, and one of the CAS attempts fails.]

slide-27
SLIDE 27

module type TREIBER_STACK = sig
  type 'a t
  val push : 'a t -> 'a -> unit
  ...
end

module Treiber_stack : TREIBER_STACK = struct
  type 'a t = 'a list ref

  let rec push s t =
    let cur = !s in
    if CAS.cas s ~expect:cur ~update:(t :: cur) then ()
    else (backoff (); push s t)
end

9

slide-28
SLIDE 28

module type TREIBER_STACK = sig
  type 'a t
  val push : 'a t -> 'a -> unit
  val try_pop : 'a t -> 'a option
end

module Treiber_stack : TREIBER_STACK = struct
  type 'a t = 'a list ref

  let rec push s t = ...

  let rec try_pop s =
    match !s with
    | [] -> None
    | (x :: xs) as cur ->
      if CAS.cas s ~expect:cur ~update:xs then Some x
      else (backoff (); try_pop s)
end
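
For readers who want to run the stack, here is a self-contained sketch that swaps the slide's CAS module for the standard library's Atomic module (OCaml 4.12 or later). The create function and the no-op backoff are assumptions added for the example; a real backoff would spin or yield.

```ocaml
module Treiber_stack : sig
  type 'a t
  val create : unit -> 'a t
  val push : 'a t -> 'a -> unit
  val try_pop : 'a t -> 'a option
end = struct
  (* The whole stack is one atomic cell holding an immutable list. *)
  type 'a t = 'a list Atomic.t

  let create () = Atomic.make []

  (* Placeholder (assumption): does nothing in this sketch. *)
  let backoff () = ()

  let rec push s v =
    let cur = Atomic.get s in
    (* Succeeds only if the head is still the list we read. *)
    if Atomic.compare_and_set s cur (v :: cur) then ()
    else (backoff (); push s v)

  let rec try_pop s =
    match Atomic.get s with
    | [] -> None
    | x :: xs as cur ->
      if Atomic.compare_and_set s cur xs then Some x
      else (backoff (); try_pop s)
end

let () =
  let s = Treiber_stack.create () in
  Treiber_stack.push s 1;
  Treiber_stack.push s 2;
  assert (Treiber_stack.try_pop s = Some 2);
  assert (Treiber_stack.try_pop s = Some 1);
  assert (Treiber_stack.try_pop s = None)
```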

10

slide-29
SLIDE 29

let v = Treiber_stack.pop s1 in Treiber_stack.push s2 v

is not atomic

11

slide-30
SLIDE 30

The Problem: Concurrency libraries are indispensable, but hard to build and extend

let v = Treiber_stack.pop s1 in
Treiber_stack.push s2 v

is not atomic

11

slide-31
SLIDE 31

Scalable concurrent algorithms can be built and extended using abstraction and composition

Reagents

Treiber_stack.pop s1 >>> Treiber_stack.push s2

is atomic

12

slide-32
SLIDE 32

13

PLDI 2012

slide-33
SLIDE 33

13

  • Sequential >>> : Software transactional memory
  • Parallel <*> : Join Calculus
  • Selective <+> : Concurrent ML

PLDI 2012

slide-34
SLIDE 34

13

  • Sequential >>> : Software transactional memory
  • Parallel <*> : Join Calculus
  • Selective <+> : Concurrent ML

PLDI 2012

still lock-free!

slide-35
SLIDE 35

Design

14

slide-36
SLIDE 36

Lambda: the ultimate abstraction

[Diagram: box f with input 'a and output 'b; box g with input 'b and output 'c]

val f : 'a -> 'b
val g : 'b -> 'c

15

slide-37
SLIDE 37

Lambda: the ultimate abstraction

[Diagram: boxes f and g chained, with input 'a into f, 'b passed from f to g, and output 'c from g]

(compose g f) : 'a -> 'c
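
As a concrete instance of the composition above, a small sketch follows. The slide names compose but does not define it, so the definition and the choice of f and g here are assumptions made for illustration.

```ocaml
(* compose g f feeds the output of f into g. *)
let compose (g : 'b -> 'c) (f : 'a -> 'b) : 'a -> 'c = fun x -> g (f x)

(* Here 'a = int, 'b = string, 'c = int. *)
let f : int -> string = string_of_int
let g : string -> int = String.length

(* Number of characters in the decimal rendering of an integer. *)
let digits : int -> int = compose g f

let () = assert (digits 12345 = 5)
```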

16

slide-38
SLIDE 38

Lambda abstraction: [Diagram: box f with input 'a and output 'b]

17

slide-39
SLIDE 39

Lambda abstraction: [Diagram: box f with input 'a and output 'b]

Reagent abstraction: [Diagram: box R with input 'a and output 'b]

('a,'b) Reagent.t

17

slide-40
SLIDE 40

Lambda abstraction: [Diagram: box f with input 'a and output 'b]

Reagent abstraction: [Diagram: box R with input 'a and output 'b]

('a,'b) Reagent.t

val run : ('a,'b) Reagent.t -> 'a -> 'b

slide-41
SLIDE 41

Thread Interaction

18

module type Reagents = sig
  type ('a,'b) t

  (* shared memory *)
  module Ref : Ref.S with type ('a,'b) reagent = ('a,'b) t

  (* communication channels *)
  module Channel : Channel.S with type ('a,'b) reagent = ('a,'b) t

  ...
end

slide-42
SLIDE 42

module type Channel = sig
  type ('a,'b) endpoint
  type ('a,'b) reagent
  val mk_chan : unit -> ('a,'b) endpoint * ('b,'a) endpoint
  val swap : ('a,'b) endpoint -> ('a,'b) reagent
end

slide-43
SLIDE 43

c: ('a,'b) endpoint

[Diagram: swap on endpoint c, taking 'a and returning 'b]

module type Channel = sig
  type ('a,'b) endpoint
  type ('a,'b) reagent
  val mk_chan : unit -> ('a,'b) endpoint * ('b,'a) endpoint
  val swap : ('a,'b) endpoint -> ('a,'b) reagent
end

slide-44
SLIDE 44

c: ('a,'b) endpoint

[Diagram: two swap boxes on channel c; one takes 'a and returns 'b, the other takes 'b and returns 'a]

module type Channel = sig
  type ('a,'b) endpoint
  type ('a,'b) reagent
  val mk_chan : unit -> ('a,'b) endpoint * ('b,'a) endpoint
  val swap : ('a,'b) endpoint -> ('a,'b) reagent
end

slide-45
SLIDE 45

c: ('a,'b) endpoint

[Diagram: swap on endpoint c, taking 'a and returning 'b]

slide-46
SLIDE 46

[Diagram: swap]

Message passing

type 'a ref
val upd : 'a ref -> f:('a -> 'b -> ('a * 'c) option) -> ('b, 'c) Reagent.t

21

slide-47
SLIDE 47

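[Diagram: swap, and an upd box that applies f to the 'a held in reference r and writes an 'a back, with input 'b and output 'c]

Message passing

type 'a ref
val upd : 'a ref -> f:('a -> 'b -> ('a * 'c) option) -> ('b, 'c) Reagent.t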

21

slide-48
SLIDE 48

[Diagram, built up over slides 48 to 54: the reagent building blocks]

  • Message passing: swap
  • Shared state: upd f
  • Disjunction: R <+> S, with R : ('a,'b) and S : ('a,'b), giving ('a,'b)
  • Conjunction: R <*> S, with R : ('a,'b) and S : ('a,'c), giving ('a, 'b * 'c)

24

slide-55
SLIDE 55

module type TREIBER_STACK = sig
  type 'a t
  val create : unit -> 'a t
  val push : 'a t -> ('a, unit) Reagent.t
  val pop : 'a t -> (unit, 'a) Reagent.t
  ...
end

module Treiber_stack : TREIBER_STACK = struct
  type 'a t = 'a list Ref.ref

  let create () = Ref.ref []

  (* The pushed element is the reagent's input, not an extra argument. *)
  let push r = Ref.upd r (fun xs x -> Some (x :: xs, ()))

  let pop r =
    Ref.upd r (fun l () ->
      match l with
      | [] -> None (* block *)
      | x :: xs -> Some (xs, x))
  ...
end

25

slide-56
SLIDE 56

Composability

Treiber_stack.pop s1 >>> Treiber_stack.push s2

Transfer elements atomically

26

slide-57
SLIDE 57

Composability

Transfer elements atomically:
Treiber_stack.pop s1 >>> Treiber_stack.push s2

Consume elements atomically:
Treiber_stack.pop s1 <*> Treiber_stack.pop s2

26

slide-58
SLIDE 58

Composability

Transfer elements atomically:
Treiber_stack.pop s1 >>> Treiber_stack.push s2

Consume elements atomically:
Treiber_stack.pop s1 <*> Treiber_stack.pop s2

Consume elements from either:
Treiber_stack.pop s1 <+> Treiber_stack.pop s2
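
To make the three compositions concrete, here is a toy, single-threaded model of the combinators. It is an assumption-laden sketch and not the library's implementation: a reagent is modelled as a function that either returns None ("would block") or a result plus a deferred commit, and try_run is a hypothetical stand-in for run that reports blocking instead of waiting.

```ocaml
(* Toy model: input -> None (would block) | Some (result, deferred writes). *)
type ('a, 'b) t = 'a -> ('b * (unit -> unit)) option

(* Shared state: propose a new value for r; the write happens at commit. *)
let upd (r : 's ref) (f : 's -> 'a -> ('s * 'b) option) : ('a, 'b) t =
 fun a ->
  match f !r a with
  | None -> None
  | Some (s', b) -> Some (b, (fun () -> r := s'))

(* Sequential composition: both steps must succeed before anything commits. *)
let ( >>> ) (p : ('a, 'b) t) (q : ('b, 'c) t) : ('a, 'c) t =
 fun a ->
  match p a with
  | None -> None
  | Some (b, commit_p) -> (
      match q b with
      | None -> None
      | Some (c, commit_q) -> Some (c, (fun () -> commit_p (); commit_q ())))

(* Conjunction: run both on the same input, all or nothing. *)
let ( <*> ) (p : ('a, 'b) t) (q : ('a, 'c) t) : ('a, 'b * 'c) t =
 fun a ->
  match (p a, q a) with
  | Some (b, commit_p), Some (c, commit_q) ->
      Some ((b, c), (fun () -> commit_p (); commit_q ()))
  | _ -> None

(* Disjunction: left-biased choice. *)
let ( <+> ) (p : ('a, 'b) t) (q : ('a, 'b) t) : ('a, 'b) t =
 fun a -> match p a with Some _ as res -> res | None -> q a

(* Hypothetical runner: commit on success, report None instead of blocking. *)
let try_run (r : ('a, 'b) t) (a : 'a) : 'b option =
  match r a with
  | None -> None
  | Some (b, commit) -> commit (); Some b

let push (r : 'a list ref) : ('a, unit) t =
  upd r (fun xs x -> Some (x :: xs, ()))

let pop (r : 'a list ref) : (unit, 'a) t =
  upd r (fun l () -> match l with [] -> None | x :: xs -> Some (xs, x))

let () =
  let s1 = ref [1; 2] and s2 = ref [] in
  (* Transfer: 1 moves from s1 to s2 in one step. *)
  assert (try_run (pop s1 >>> push s2) () = Some ());
  assert (!s1 = [2] && !s2 = [1]);
  (* Consume from both. *)
  assert (try_run (pop s1 <*> pop s2) () = Some (2, 1));
  (* Both stacks are now empty, so either-pop would block. *)
  assert (try_run (pop s1 <+> pop s2) () = None)
```

Because writes are deferred to the commit step, a conjunction that blocks on one side leaves the other side untouched, which is the all-or-nothing behaviour the slides describe.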

26

slide-59
SLIDE 59

Composability

27

Transform arbitrary blocking reagent to a non-blocking reagent

slide-60
SLIDE 60

Composability

27

val lift : ('a -> 'b option) -> ('a,'b) t
val constant : 'a -> ('b,'a) t

Transform arbitrary blocking reagent to a non-blocking reagent

slide-61
SLIDE 61

Composability

27

val lift : ('a -> 'b option) -> ('a,'b) t
val constant : 'a -> ('b,'a) t

let attempt (r : ('a,'b) t) : ('a,'b option) t =
  (r >>> lift (fun x -> Some (Some x))) <+> (constant None)

Transform arbitrary blocking reagent to a non-blocking reagent

slide-62
SLIDE 62

Composability

27

Transform arbitrary blocking reagent to a non-blocking reagent

val lift : ('a -> 'b option) -> ('a,'b) t
val constant : 'a -> ('b,'a) t

let attempt (r : ('a,'b) t) : ('a,'b option) t =
  (r >>> lift (fun x -> Some (Some x))) <+> (constant None)

let try_pop stack = attempt (pop stack)
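
The behaviour of attempt can be checked in a deliberately simplified model where a reagent is just a function that returns None when it would block. Everything below other than the body of attempt is an assumption made for the sketch (in particular the definitions of lift, constant, >>>, <+> and the ref-based pop); the real library runs reagents atomically and blocks instead of returning None.

```ocaml
(* Simplified model: None means "would block". *)
type ('a, 'b) t = 'a -> 'b option

let lift (f : 'a -> 'b option) : ('a, 'b) t = f
let constant (v : 'a) : ('b, 'a) t = fun _ -> Some v

let ( >>> ) (p : ('a, 'b) t) (q : ('b, 'c) t) : ('a, 'c) t =
 fun a -> match p a with None -> None | Some b -> q b

let ( <+> ) (p : ('a, 'b) t) (q : ('a, 'b) t) : ('a, 'b) t =
 fun a -> match p a with Some _ as res -> res | None -> q a

(* Same body as on the slide: wrap success in Some, fall back to None. *)
let attempt (r : ('a, 'b) t) : ('a, 'b option) t =
  (r >>> lift (fun x -> Some (Some x))) <+> constant None

(* A blocking pop over a plain list ref (not thread-safe; illustration only). *)
let pop (s : 'a list ref) : (unit, 'a) t =
 fun () -> match !s with [] -> None | x :: xs -> s := xs; Some x

let try_pop stack = attempt (pop stack)

let () =
  let s = ref [1] in
  (* The stack has an element: the attempt succeeds with Some 1. *)
  assert (try_pop s () = Some (Some 1));
  (* The stack is empty: pop would block, so attempt yields None at once. *)
  assert (try_pop s () = Some None)
```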

slide-63
SLIDE 63
  • Philosophers alternate between thinking and eating
  • A philosopher can only eat after obtaining both forks
  • No philosopher starves
slide-64
SLIDE 64

type fork =
  { drop : (unit,unit) endpoint;
    take : (unit,unit) endpoint }

let mk_fork () =
  let drop, take = mk_chan () in
  {drop; take}

let drop f = swap f.drop
let take f = swap f.take

  • Philosophers alternate between thinking and eating
  • A philosopher can only eat after obtaining both forks
  • No philosopher starves
slide-65
SLIDE 65

type fork =
  { drop : (unit,unit) endpoint;
    take : (unit,unit) endpoint }

let mk_fork () =
  let drop, take = mk_chan () in
  {drop; take}

let drop f = swap f.drop
let take f = swap f.take

let eat l_fork r_fork =
  (* take both forks atomically; the (unit * unit) result is discarded *)
  ignore (run (take l_fork <*> take r_fork) ());
  (* ...
   * eat
   * ... *)
  spawn @@ run (drop l_fork);
  spawn @@ run (drop r_fork)

  • Philosophers alternate between thinking and eating
  • A philosopher can only eat after obtaining both forks
  • No philosopher starves
slide-66
SLIDE 66

Implementation

29

slide-67
SLIDE 67

Phase 1 Phase 2

30

slide-68
SLIDE 68

Phase 1: Accumulate CASes
Phase 2

30

slide-69
SLIDE 69

Phase 1: Accumulate CASes
Phase 2: Attempt k-CAS

30

slide-70
SLIDE 70

Accumulate CASes Attempt k-CAS

31

slide-71
SLIDE 71

Accumulate CASes Attempt k-CAS

Permanent failure

31

slide-72
SLIDE 72

Accumulate CASes Attempt k-CAS

Permanent failure Transient failure

31

slide-73
SLIDE 73

Accumulate CASes Attempt k-CAS

Permanent failure Transient failure

31

HTM Ready
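
The two phases can be pictured with a single-threaded sketch: first collect a log of intended CASes, then commit all of them or none. The names cas, CAS and kcas are hypothetical, and this is not how the library does it; a real k-CAS must make the check-and-commit step atomic across threads with a multi-word protocol, which this sketch does not attempt.

```ocaml
(* One pending CAS on a reference of any type: (location, expected, update). *)
type cas = CAS : 'a ref * 'a * 'a -> cas

(* Phase 2 (sketch): commit every logged CAS, or none of them.
   Single-threaded stand-in; nothing here is atomic under real concurrency. *)
let kcas (log : cas list) : bool =
  let ok =
    List.for_all (function CAS (r, expect, _) -> !r == expect) log
  in
  if ok then List.iter (function CAS (r, _, update) -> r := update) log;
  ok

let () =
  let x = ref 1 in
  let a = "a" in
  let y = ref a in
  (* Phase 1: accumulate the intended updates without touching memory. *)
  let log = [ CAS (x, 1, 2); CAS (y, a, "b") ] in
  (* Phase 2: both expectations hold, so both writes happen. *)
  assert (kcas log);
  assert (!x = 2 && !y = "b");
  (* y no longer holds a, so the whole batch is rejected and x is untouched. *)
  assert (not (kcas [ CAS (x, 2, 3); CAS (y, a, "c") ]));
  assert (!x = 2 && !y = "b")
```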

slide-74
SLIDE 74

Status

https://github.com/ocamllabs/reagents

Synchronization
  • Locks
  • Reentrant locks
  • Semaphores
  • R/W locks
  • Reentrant R/W locks
  • Condition variables
  • Countdown latches
  • Cyclic barriers
  • Phasers
  • Exchangers

Data structures
  • Queues: nonblocking; blocking (array & list); synchronous; priority, nonblocking; priority, blocking
  • Stacks: Treiber; elimination backoff
  • Counters
  • Deques
  • Sets
  • Maps (hash & skiplist)

slide-75
SLIDE 75

STM vs Reagents

  • STM is more ambitious: atomic { … }. Reagents are conservative.
  • Reagents don’t allow multiple writes to the same memory location.
  • Reagents are lock-free. STMs are typically obstruction-free.

33