SLIDE 1

Reagents:

Functional programming meets scalable concurrency

Aaron Turon

Northeastern University

Saturday, November 10, 12

SLIDE 2

Concurrency ≠ Parallelism

Concurrency is overlapped execution of processes. Parallelism is simultaneous execution of computations.

SLIDE 3

The trouble is that essentially all the interesting applications of concurrency involve the deliberate and controlled mutation of shared state, such as screen real estate, the file system, or the internal data structures of the program. The right solution, therefore, is to provide mechanisms which allow (though alas they cannot enforce) the safe mutation of shared state.

- Peyton Jones, Gordon, and Finne, in Concurrent Haskell

SLIDE 4

Concurrency ⋂ Parallelism = Scalable Concurrency

Use cases:

  • Concurrent programs on parallel hardware (e.g. OS kernels)
  • Implementing parallel abstractions (e.g. work stealing for data parallelism)
  • “Last mile” of parallel programming (where we must resort to concurrency)

SLIDE 5

class LockCounter {
  private var c: Int = 0
  private var l = new Lock
  def inc: Int = {
    l.lock()
    val old = c
    c = old + 1
    l.unlock()
    old
  }
}

SLIDE 6

class CASCounter {
  private var c = new AtomicRef[Int](0)
  def inc: Int = {
    while (true) {
      // read the current value, then try to install old + 1
      val old = c.get()
      if (c.cas(old, old+1)) return old
    }
  }
}

SLIDE 7

A simple test

  • Increment counter
  • Busywait for t cycles (no cache interaction)
  • Repeat
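
The loop above can be sketched as a small harness. This is a minimal sketch, not the benchmark used in the talk: `CounterBench`, `busyWait`, and `run` are hypothetical names, `java.util.concurrent.atomic.AtomicInteger` stands in for the counter under test, and there is no JIT warm-up or cycle-accurate timing.

```scala
import java.util.concurrent.atomic.AtomicInteger

object CounterBench {
  // Spin for roughly `t` loop iterations without touching shared memory.
  // The accumulator is returned so the loop is not trivially dead code.
  def busyWait(t: Int): Long = {
    var acc = 0L
    var i = 0
    while (i < t) { acc += i; i += 1 }
    acc
  }

  // Run `threads` workers, each doing `iters` rounds of (increment, busy-wait).
  // Returns (final counter value, elapsed nanoseconds).
  def run(threads: Int, iters: Int, t: Int): (Int, Long) = {
    val counter = new AtomicInteger(0) // stand-in for the counter under test
    val workers = (1 to threads).map { _ =>
      new Thread(new Runnable {
        def run(): Unit = {
          var i = 0
          while (i < iters) {
            counter.getAndIncrement()
            busyWait(t)
            i += 1
          }
        }
      })
    }
    val start = System.nanoTime()
    workers.foreach(_.start())
    workers.foreach(_.join())
    (counter.get(), System.nanoTime() - start)
  }
}
```

Raising `t` increases the share of each iteration spent outside the counter, which is the knob the later "parallelism" percentages appear to vary.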

SLIDE 8

[Figure: throughput against number of threads (1 to 8) for three series: Predicted, CAS, and Locking. Results for 98% parallelism.]
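
The slide does not say how the "Predicted" series was computed. If it follows Amdahl's law (an assumption, not stated in the deck), then with parallel fraction p and n threads the ideal speedup is:

```latex
S(n) = \frac{1}{(1 - p) + \frac{p}{n}},
\qquad
S(8)\Big|_{p = 0.98} = \frac{1}{0.02 + 0.1225} \approx 7.02
```

Under that assumption, 98% parallelism caps the speedup on 8 threads at about 7x even before any synchronization cost is counted.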

SLIDE 9

[Figure: two panels, "Lock-based" and "CAS-based". Axes: "Threads" (2, 4, 6, 8) and "Parallelism (log-scale)" (63%, 88%, 98%, 99.7%, 99.9%). Scale: "Throughput / Optimal", from 1.0 down to 0.35.]

SLIDE 10

What’s going on here?

SLIDE 11

What’s going on here?

Cost Coarse-grained Fine-grained

Communication

SLIDE 12

[Diagram: two Nehalem quad-core sockets (cores 0 to 3 and cores 4 to 7). Each core has its own L1 and L2 cache; each socket has a shared Level 3 cache, an integrated memory controller (IMC, 3 channels) attached to DDR3 banks (A to F across the two sockets), and QPI links, including one to the I/O hub.]

SLIDE 13

SLIDE 14

java.util.concurrent

Synchronization:

  • Reentrant locks
  • Semaphores
  • R/W locks
  • Reentrant R/W locks
  • Condition variables
  • Countdown latches
  • Cyclic barriers
  • Phasers
  • Exchangers

Data structures:

  • Queues: nonblocking; blocking (array & list); synchronous; priority, nonblocking; priority, blocking
  • Deques
  • Sets
  • Maps (hash & skiplist)

SLIDE 15

class TreiberStack[A] {
  private val head = new AtomicRef[List[A]](Nil)
  def push(a: A) {
    val backoff = new Backoff
    while (true) {
      val cur = head.get()
      if (head.cas(cur, a :: cur)) return
      backoff.once()
    }
  }
  ...

SLIDE 16

3 2

Head

SLIDE 17

3 2

Head

7

SLIDE 18

3 2

Head

7 5

SLIDE 19

3 2

Head

7 5

CAS fail

SLIDE 20

3 2

Head

7 5

SLIDE 21

3 2

Head

7 5

SLIDE 22

def tryPop(): Option[A] = {
  val backoff = new Backoff
  while (true) {
    val cur = head.get()
    cur match {
      case Nil => return None
      case a::tail => if (head.cas(cur, tail)) return Some(a)
    }
    backoff.once()
  }
}
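
The slides' `AtomicRef` and `Backoff` are not standard classes. As a minimal runnable sketch of the same push and tryPop retry loops, the version here substitutes `java.util.concurrent.atomic.AtomicReference` and omits backoff entirely; the class name `CasTreiberStack` is made up for this illustration.

```scala
import java.util.concurrent.atomic.AtomicReference
import scala.annotation.tailrec

class CasTreiberStack[A] {
  // The whole stack is one immutable list behind a single atomic reference.
  private val head = new AtomicReference[List[A]](Nil)

  // Retry until the CAS from the observed list to the extended list succeeds.
  @tailrec
  final def push(a: A): Unit = {
    val cur = head.get()
    if (!head.compareAndSet(cur, a :: cur)) push(a)
  }

  // Returns None on an empty stack; otherwise retries until the CAS succeeds.
  @tailrec
  final def tryPop(): Option[A] = head.get() match {
    case Nil => None
    case cur @ (a :: tail) =>
      if (head.compareAndSet(cur, tail)) Some(a) else tryPop()
  }
}
```

`compareAndSet` compares references, which is what the algorithm needs: the CAS succeeds only if `head` still points at the exact list cell that was read.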

SLIDE 23

The Problem:

Concurrency libraries are indispensable, but hard to build and extend

SLIDE 24

The Proposal:

Build and extend scalable concurrent algorithms using a monad with shared-state and message-passing operations

SLIDE 25

Design

SLIDE 26

f

A B

Lambda abstraction:

Reagents are (first) arrows

SLIDE 27

f

A B

Lambda abstraction: Reagent abstraction:

A B

R

Reagents are (first) arrows

SLIDE 28

c: Chan[A,B]

c

swap

A B

SLIDE 29

c: Chan[A,B]

c

swap

A B

c

swap

B A

SLIDE 30

c: Chan[A,B]

c

swap

A B

SLIDE 31

swap

Message passing

SLIDE 32

swap

r: Ref[A] f: (A,B)→(A,C)

upd

f

r

A A B C

Message passing

SLIDE 33

swap upd

f

Message passing Shared state

SLIDE 34

swap upd

f A B

R

A B

S

Message passing Shared state

SLIDE 35

swap upd

f

R S

+

A B

Message passing Shared state

SLIDE 36

swap upd

f

R S

+

Message passing Shared state Disjunction

SLIDE 37

swap upd

f

R S

+

A B

R

A C

S

Message passing Shared state Disjunction

SLIDE 38

swap upd

f

R S

+

R S

*

A (B,C)

Message passing Shared state Disjunction

SLIDE 39

swap upd

f

R S

+

R S

*

Message passing Shared state Disjunction Conjunction
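
To make the shapes of the combinators concrete, here is a toy, single-threaded model. It is only a sketch of the types, not the reagents library: `Toy` is a hypothetical name, a reagent is modelled as a function that may fail, and there is no blocking, retry, or atomicity. Trying the left alternative first in `+` is a choice made for this sketch.

```scala
// Toy model: a reagent from A to B is a function that may fail.
// None stands for "cannot react now".
final case class Toy[A, B](run: A => Option[B]) {
  // Sequencing: feed this reagent's output into `next`.
  def >>>[C](next: Toy[B, C]): Toy[A, C] =
    Toy(a => run(a).flatMap(next.run))

  // Disjunction: try this reagent, fall back to `that` if it fails.
  def +(that: Toy[A, B]): Toy[A, B] =
    Toy(a => run(a).orElse(that.run(a)))

  // Conjunction: both must succeed on the same input; results are paired.
  def *[C](that: Toy[A, C]): Toy[A, (B, C)] =
    Toy(a => for (b <- run(a); c <- that.run(a)) yield (b, c))
}
```

The result types mirror the diagrams: `R + S` keeps the type `A` to `B`, while `R * S` turns `A` to `B` and `A` to `C` into `A` to `(B, C)`.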

SLIDE 40

f

A B

Lambda abstraction: Reagent abstraction:

A B

R

SLIDE 41

f

A B

Lambda abstraction: Reagent abstraction:

A B

R

SLIDE 42

f

A B

Lambda abstraction: Reagent abstraction:

A B

R application:

f(a) = b

SLIDE 43

f

A B

Lambda abstraction: Reagent abstraction:

A B

R application:

f(a) = b

apply as reactant:

R ! a = b

SLIDE 44

f

A B

Lambda abstraction: Reagent abstraction:

A B

R application:

f(a) = b

apply as reactant:

R ! a = b

apply as catalyst:

dissolve(R)

SLIDE 45

c: Chan[A,B]

c

swap

A B

c

swap

B A

SLIDE 46

c: Chan[Unit,Int]

c

swap Unit

c

swap Unit Int Int

SLIDE 47

c: Chan[Unit,Int]

c

swap Unit

c

swap Unit Int Int

!() !3

SLIDE 48

c: Chan[Unit,Int]

c

swap Unit

c

swap Unit Int Int

() 3

SLIDE 49

c: Chan[Unit,Int]

c

swap Unit

c

swap Unit Int Int

dissolve !3

SLIDE 50

c: Chan[Unit,Int]

c

swap Unit

c

swap Unit Int Int

() !3 ...()()()

SLIDE 51

c: Chan[Unit,Int]

c

swap Unit

c

swap Unit Int Int

() 3 ...()()()

SLIDE 52

d

swap

A

c

swap

Unit Unit

SLIDE 53

d

swap

A

c

swap

Unit Unit

“Receive” “Send”

SLIDE 54

d

swap

A

c

swap

“Receive” “Send”

SLIDE 55

d

swap

A

c

swap

Pipeline catalyst

SLIDE 56

d

swap

A

c

swap

Pipeline catalyst

NB: transfer is atomic

SLIDE 57

d

swap *

c

swap

A B (A,B)

SLIDE 58

d

swap *

2-way join

c

swap

A B (A,B)

SLIDE 59

d

swap *

2-way join

c

swap

A B (A,B)

+

( )

e

swap

Exn

SLIDE 60

d

swap *

c

swap

A B (A,B)

+

( )

e

swap

Exn

Abortable 2-way join

SLIDE 61

Join Calculus

c1(x1) & · · · & cn(xn) ⇒ e

SLIDE 62

Join Calculus

c1(x1) & · · · & cn(xn) ⇒ e

becomes

(swap c1 * · · · * swap cn) >>> postCommit e

SLIDE 63

Join Calculus

c1(x1) & · · · & cn(xn) ⇒ e

becomes

dissolve( (swap c1 * · · · * swap cn) >>> postCommit e )

SLIDE 64

class TreiberStack [A] {
  private val head = new Ref[List[A]](Nil)
  val push : A ↣ () = upd(head)(cons)
  val tryPop : () ↣ A? = upd(head) {
    case (x :: xs) => (xs, Some(x))
    case Nil => (Nil, None)
  }
}

SLIDE 65

class TreiberStack [A] {
  private val head = new Ref[List[A]](Nil)
  val push : A ↣ () = upd(head)(cons)
  val tryPop : () ↣ A? = upd(head) {
    case (x :: xs) => (xs, Some(x))
    case Nil => (Nil, None)
  }
  val pop : () ↣ A = upd(head) {
    case (x :: xs) => (xs, x)
  }
}

SLIDE 66

class TreiberStack [A] {
  private val head = new Ref[List[A]](Nil)
  val push = upd(head)(cons)
  val tryPop = upd(head)(trySplit)
  val pop = upd(head)(split)
}
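
One way to read `upd` is as a packaged CAS retry loop. The sketch below is an illustration under that reading, not the library's implementation: it models a reagent as a plain function, uses `java.util.concurrent.atomic.AtomicReference` in place of the slides' `Ref`, and `UpdSketch`, `UpdTreiberStack`, and the bodies of `cons` and `trySplit` are assumptions. The blocking `pop`, which needs a partial update function, is not modelled.

```scala
import java.util.concurrent.atomic.AtomicReference
import scala.annotation.tailrec

object UpdSketch {
  // Hypothetical stand-in for `upd`: atomically replace the state of `r` with
  // the first component of f(state, input) and return the second component.
  def upd[S, B, C](r: AtomicReference[S])(f: (S, B) => (S, C)): B => C = {
    @tailrec
    def loop(b: B): C = {
      val cur = r.get()
      val (next, out) = f(cur, b)
      if (r.compareAndSet(cur, next)) out else loop(b)
    }
    (b: B) => loop(b)
  }

  // Assumed definitions for the helpers the slide names `cons` and `trySplit`.
  def cons[A](xs: List[A], x: A): (List[A], Unit) = (x :: xs, ())
  def trySplit[A](xs: List[A], u: Unit): (List[A], Option[A]) = xs match {
    case y :: ys => (ys, Some(y))
    case Nil     => (Nil, None)
  }

  class UpdTreiberStack[A] {
    private val head = new AtomicReference[List[A]](Nil)
    val push: A => Unit = upd[List[A], A, Unit](head)(cons[A])
    val tryPop: Unit => Option[A] = upd[List[A], Unit, Option[A]](head)(trySplit[A])
  }
}
```

With a single `upd`, the loop is exactly the hand-written Treiber stack loop, which is the sense in which nothing extra is paid until reagents are composed.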

SLIDE 67

class TreiberStack [A] {
  private val head = new Ref[List[A]](Nil)
  val push = upd(head)(cons)
  val tryPop = upd(head)(trySplit)
  val pop = upd(head)(split)
}

class EliminationStack [A] {
  private val stack = new TreiberStack[A]
  private val (send, recv) = new Chan[A]
  val push = stack.push + swap(send)
  val pop = stack.pop + swap(recv)
}

SLIDE 68

stack1.pop >>> stack2.push

SLIDE 69

Going Monadic

3 2

Head

5

Tail

X

SLIDE 70

Going Monadic

3 2

Head

5

Tail

X

computed: (A → (() ↣ B)) → (A ↣ B)

SLIDE 71

Use invisible side-effects to traverse the queue while computing the upd operation to perform

SLIDE 72

Implementation

SLIDE 73

Phase 1 Phase 2

SLIDE 74

Phase 1 Phase 2 Accumulate CASes

SLIDE 75

Phase 1 Phase 2 Accumulate CASes Attempt k-CAS
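
The two phases can be sketched for an atomic transfer between two stacks, in the spirit of `stack1.pop >>> stack2.push`. This is a deliberately simplified sketch: a real k-CAS is lock-free, whereas the stand-in here takes one global lock and is only correct if every writer goes through `kCas`. All names (`TwoPhaseSketch`, `Cas`, `cas`, `kCas`, `transfer`) are hypothetical.

```scala
import java.util.concurrent.atomic.AtomicReference

object TwoPhaseSketch {
  // One logged update: "if the reference still holds `expected`, write `update`".
  sealed trait Cas {
    def stillValid: Boolean
    def write(): Unit
  }

  def cas[T <: AnyRef](ref: AtomicReference[T], expected: T, update: T): Cas =
    new Cas {
      def stillValid: Boolean = ref.get() eq expected
      def write(): Unit = ref.set(update)
    }

  // Stand-in for k-CAS: a global lock makes check-then-write atomic with
  // respect to other kCas calls. Not lock-free; for illustration only.
  private val commitLock = new Object

  def kCas(log: List[Cas]): Boolean = commitLock.synchronized {
    if (log.forall(_.stillValid)) { log.foreach(_.write()); true }
    else false
  }

  // Phase 1: read both stacks and accumulate the intended CASes.
  // Phase 2: attempt them together. Returns false if the source is empty or
  // another update interfered; a caller would retry in the latter case.
  def transfer[A](from: AtomicReference[List[A]], to: AtomicReference[List[A]]): Boolean = {
    val src = from.get()
    val dst = to.get()
    src match {
      case x :: rest => kCas(List(cas(from, src, rest), cas(to, dst, x :: dst)))
      case Nil       => false
    }
  }
}
```

Nothing is written during phase 1, so an attempt that fails leaves both stacks untouched.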

SLIDE 76

Accumulate CASes Attempt k-CAS

SLIDE 77

Accumulate CASes Attempt k-CAS

Permanent failure

SLIDE 78

Accumulate CASes Attempt k-CAS

Permanent failure Transient failure

SLIDE 79

SLIDE 80

Permanent failure

SLIDE 81

Permanent failure Transient failure

SLIDE 82

Permanent failure Transient failure Transient failure

SLIDE 83

Permanent failure Transient failure ? failure Transient failure

SLIDE 84

Permanent failure Transient failure ? failure Transient failure

P & P = P
T & T = T
P & T = T
T & P = T
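
The table can be transcribed directly as a function, reading P as permanent and T as transient failure. The names `Failure`, `Permanent`, `Transient`, and `FailureRules.combine` are hypothetical; the sketch encodes only the four equations on the slide.

```scala
sealed trait Failure
case object Permanent extends Failure
case object Transient extends Failure

object FailureRules {
  // The combined failure is Permanent only when both sides are Permanent;
  // a Transient failure on either side makes the whole Transient.
  def combine(a: Failure, b: Failure): Failure = (a, b) match {
    case (Permanent, Permanent) => Permanent
    case _                      => Transient
  }
}
```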

SLIDE 85

Is this just STM?

SLIDE 86

Is this just STM?

No:

  • Single CAS collapses to single phase
  • Multiple CASes to single location forbidden

So the “redo log” is write-only for phase 1! Therefore: pay-as-you-go

  • Treiber stack is really a Treiber stack
  • Pay for kCAS only for compositions

SLIDE 87

Is this just STM?

Isolation: shared state
Interaction: message passing

SLIDE 88

Is this just STM?

Isolation: shared state
Interaction: message passing

Using lock-free bags, based on earlier work with Russo [OOPSLA’11]

SLIDE 89

[Figure: Treiber stack. Throughput (iters/μs) against number of threads.]

SLIDE 90

[Figure: Stack transfer. Throughput (iters/μs) against number of threads.]

SLIDE 91

Open Questions

  • Composition and invisible read/writes
  • Find a better rule?
  • Statically detect bad cases?
  • Composition with lock-based algorithms?
  • Conflicts between interaction and isolation?

SLIDE 92

Open Questions 2

  • Guaranteed inlining
  • Read/CAS windows must be short
  • “CAPER” with Sam Tobin-Hochstadt
  • Formal semantics
  • Integrate Haskell’s STM semantics with message-passing?

SLIDE 93

Related work

Joins CML STM

SLIDE 94

Related work

Joins CML STM

Transactional events Communicating transactions
