slide-1
SLIDE 1

Scalable Termination Detection for Distributed Actor Systems

Dan Plyukhin and Gul Agha UIUC

slide-2
SLIDE 2

Actor Model

Actors are lightweight, stateful, async processes. Used to build low-latency distributed systems (e.g. Riak, Discord, CouchDB). Most popular frameworks (Erlang, Akka, Orleans) do not garbage collect actors. Ordinary tracing GC techniques don’t work. Other solutions don’t scale well.

slide-3
SLIDE 3

Part I: Actors

slide-4
SLIDE 4

Idle actor Message

m

slide-5
SLIDE 5

m

Idle actor Message

m

slide-6
SLIDE 6

Once an idle actor receives a message, it becomes busy.

m

Idle actor Message

m

slide-7
SLIDE 7

Once an idle actor receives a message, it becomes busy.

Idle Message Busy

m

slide-8
SLIDE 8

Once an idle actor receives a message, it becomes busy. A busy actor can…

Idle Message Busy

m

slide-9
SLIDE 9

Once an idle actor receives a message, it becomes busy. A busy actor can…

  • spawn actors...

A

Idle Message Busy Reference

m

slide-10
SLIDE 10

Once an idle actor receives a message, it becomes busy. A busy actor can…

  • spawn actors...

A B

Idle Message Busy Reference

m

slide-11
SLIDE 11

Once an idle actor receives a message, it becomes busy. A busy actor can…

  • spawn actors...

A B

Idle Message Busy Reference

m

C

slide-12
SLIDE 12

Once an idle actor receives a message, it becomes busy. A busy actor can…

  • spawn actors...
  • send async messages...

A B

Idle Message Busy Reference

m

C

slide-13
SLIDE 13

Once an idle actor receives a message, it becomes busy. A busy actor can…

  • spawn actors...
  • send async messages...

A B

“hello”

Idle Message Busy Reference

m

C

slide-14
SLIDE 14

Once an idle actor receives a message, it becomes busy. A busy actor can…

  • spawn actors...
  • send async messages...

A B

“hello” sayHello(A)

Idle Message Busy Reference

m

C

slide-15
SLIDE 15

A

Once an idle actor receives a message, it becomes busy. A busy actor can…

  • spawn actors...
  • send async messages...

B

Idle Message Busy Reference

m

“hello”

C

slide-16
SLIDE 16

A

Once an idle actor receives a message, it becomes busy. A busy actor can…

  • spawn actors...
  • send async messages...
  • update its local state...

Idle Message Busy Reference

m

“hello”

B C

slide-17
SLIDE 17

A

Once an idle actor receives a message, it becomes busy. A busy actor can…

  • spawn actors...
  • send async messages...
  • update its local state...

Idle Message Busy Reference

m

“hello”

B C

slide-18
SLIDE 18

A

Once an idle actor receives a message, it becomes busy. A busy actor can…

  • spawn actors...
  • send async messages...
  • update its local state...

Idle Message Busy Reference

m

“hello”

B C

slide-19
SLIDE 19

“hello”

A

Once an idle actor receives a message, it becomes busy. A busy actor can…

  • spawn actors...
  • send async messages...
  • update its local state...
  • perform effects...

Idle Message Busy Reference

m

B C

slide-20
SLIDE 20

“hello”

A

Once an idle actor receives a message, it becomes busy. A busy actor can…

  • spawn actors...
  • send async messages...
  • update its local state...
  • perform effects...

Idle Message Busy Reference

m

B C

slide-21
SLIDE 21

“hello”

A

Once an idle actor receives a message, it becomes busy. A busy actor can…

  • spawn actors...
  • send async messages...
  • update its local state...
  • perform effects...

Idle Message Busy Reference

m

B C

slide-22
SLIDE 22

“hello”

A

Once an idle actor receives a message, it becomes busy. A busy actor can…

  • spawn actors...
  • send async messages...
  • update its local state...
  • perform effects...

...before becoming idle again.

Idle Message Busy Reference

m

B C

slide-23
SLIDE 23

“hello”

A

Once an idle actor receives a message, it becomes busy. A busy actor can…

  • spawn actors...
  • send async messages...
  • update its local state...
  • perform effects...

...before becoming idle again.

Idle Message Busy Reference

m

B C

slide-24
SLIDE 24

“hello”

A

Idle Message Busy Reference

m

B

An actor is unblocked if: a. it is busy, or b. it has undelivered messages.

C

slide-25
SLIDE 25

“hello”

A C

Idle Message Busy Reference

m

B

An actor is unblocked if: a. it is busy, or b. it has undelivered messages.
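
The idle/busy life cycle and the "unblocked" test from Part I can be sketched in a few lines of Python. This is a toy model, not the API of any actor framework: the `Actor` class, its `mailbox` field, and the `deliver_one` helper are names invented here for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Actor:
    """Toy actor: idle until a message is delivered, busy while handling it."""
    name: str
    busy: bool = False
    mailbox: deque = field(default_factory=deque)  # undelivered messages

    def send(self, message):
        # Asynchronous send: the message waits in the mailbox until delivered.
        self.mailbox.append(message)

    def is_unblocked(self):
        # (a) the actor is busy, or (b) it has undelivered messages.
        return self.busy or len(self.mailbox) > 0

    def deliver_one(self, handler):
        # Delivery makes an idle actor busy; it becomes idle again afterwards.
        message = self.mailbox.popleft()
        self.busy = True
        try:
            handler(self, message)
        finally:
            self.busy = False


a = Actor("A")
states = [a.is_unblocked()]        # idle with an empty mailbox: blocked
a.send("hello")
states.append(a.is_unblocked())    # undelivered message: unblocked
a.deliver_one(lambda actor, msg: states.append(actor.is_unblocked()))  # busy
states.append(a.is_unblocked())    # idle again with nothing pending: blocked
```

An actor for which `is_unblocked()` is false is blocked, but as Part II shows, it is only garbage if it can never become unblocked again.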

slide-26
SLIDE 26

Part II: Garbage

slide-27
SLIDE 27

An actor is garbage if it can be destroyed without affecting system behavior.

slide-28
SLIDE 28

An actor is garbage if it can be destroyed without affecting system behavior. If an actor could become unblocked, we shouldn’t collect it.

slide-29
SLIDE 29

An actor is garbage if it can be destroyed without affecting system behavior. If an actor could become unblocked, we shouldn’t collect it. We want actors that are permanently blocked, aka terminated actors.

slide-30
SLIDE 30

Is X terminated?

Idle Message Busy Reference

m

C B X D A

Yes.

slide-31
SLIDE 31

Is X terminated?

Idle Message Busy Reference

m

C B X D A

Yes.

slide-32
SLIDE 32

Is X terminated?

Idle Message Busy Reference

m

C B X D

Possibly not!

A

slide-33
SLIDE 33

So X is terminated if all actors that can “reach” it are blocked?

slide-34
SLIDE 34

So X is terminated if all actors that can “reach” it are blocked? Not so fast!

slide-35
SLIDE 35

Idle Busy Reference Message

m

A C B X D E

sayHello(X)

Is X terminated?

Possibly not.

slide-36
SLIDE 36

A is potentially acquainted with B if either:

  • a. A has a reference to B
  • b. there is an undelivered message to A that contains a reference to B

A B A B

Idle Busy Reference Message

m sayHello(B)

slide-37
SLIDE 37

A is potentially acquainted with B if either:

  • a. A has a reference to B
  • b. there is an undelivered message to A that contains a reference to B

A is called a potential inverse acquaintance of B

A B A B

Idle Busy Reference Message

m sayHello(B)

slide-38
SLIDE 38

Potential reachability is the reflexive transitive closure of the potential acquaintance relation

A B C D

sayHello(B) sayHello(C) sayHello(D)

Idle Busy Reference Message

m

slide-39
SLIDE 39

Potential reachability is the reflexive transitive closure of the potential acquaintance relation

B C D

sayHello(C) sayHello(D)

A

Idle Busy Reference Message

m

slide-40
SLIDE 40

Potential reachability is the reflexive transitive closure of the potential acquaintance relation

C D

sayHello(D)

A B

Idle Busy Reference Message

m

slide-41
SLIDE 41

Potential reachability is the reflexive transitive closure of the potential acquaintance relation

A B C D

Idle Busy Reference Message

m

slide-42
SLIDE 42

Idle Busy Reference Message

m

A C B X D E

sayHello(X)

X may not be terminated if it is potentially reachable by an unblocked actor

slide-43
SLIDE 43

C B X D A C B X D A

X is terminated if it is potentially reachable only by blocked actors

Idle Busy Reference Message

m
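
With a global view of the system, the definition on this slide reduces to a backward graph search. The sketch below assumes such an omniscient view (which a real distributed system does not have, hence the rest of the talk); the function names and the `refs` / `in_flight` / `busy` encodings are hypothetical.

```python
from collections import deque


def potential_inverse_acquaintances(refs, in_flight):
    """Map each actor B to the actors that are potentially acquainted with B.

    refs:      {A: set of actors that A holds a reference to}
    in_flight: list of (recipient, referenced actors), one per undelivered message
    """
    inverse = {}
    for owner, targets in refs.items():
        for target in targets:
            inverse.setdefault(target, set()).add(owner)
    for recipient, carried in in_flight:
        for target in carried:
            inverse.setdefault(target, set()).add(recipient)
    return inverse


def is_terminated(x, refs, in_flight, busy):
    """True if every actor that can potentially reach x (x included) is blocked."""
    has_mail = {recipient for recipient, _ in in_flight}
    inverse = potential_inverse_acquaintances(refs, in_flight)
    seen, queue = {x}, deque([x])  # reflexive closure: x reaches itself
    while queue:
        actor = queue.popleft()
        if actor in busy or actor in has_mail:
            return False  # an unblocked actor can potentially reach x
        for pred in inverse.get(actor, ()):
            if pred not in seen:
                seen.add(pred)
                queue.append(pred)
    return True


# Hypothetical topology: C references B, and B references X.
refs = {"B": {"X"}, "C": {"B"}}
all_idle = is_terminated("X", refs, in_flight=[], busy=set())
# An undelivered message to E carrying a reference to X, like sayHello(X).
with_message = is_terminated("X", refs, in_flight=[("E", {"X"})], busy=set())
```

Here `all_idle` is true and `with_message` is false: the message makes E both unblocked and potentially acquainted with X.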

slide-44
SLIDE 44

Related Work

○ Global snapshots

  • Not incremental

○ SALSA: based on approximate snapshots

  • High overhead

○ Pony: inspiration for this work

  • Causal message delivery is expensive
slide-45
SLIDE 45

Demo

slide-46
SLIDE 46

Actor Reference Message m

slide-47
SLIDE 47

Actor Reference Message Knowledge m

... ... ... ... ... ... ... ... ... ...

slide-48
SLIDE 48

... ... ... ... ... ... ... ... ...

Actor Reference Message Knowledge Aggregator m

...

slide-49
SLIDE 49

... ... ... ... ... ... ... ... ...

snapshot(...)

Actor Reference Message Knowledge Aggregator m

...

slide-50
SLIDE 50

... ... ... ... ... ... ... ... ...

snapshot(...) snapshot(...)

Actor Reference Message Knowledge Aggregator m

...

slide-51
SLIDE 51

... ... ... ... ... ... ... ... ... ... ... ...

Actor Reference Message Knowledge Aggregator m

...

slide-52
SLIDE 52

... ... ... ... ... ... ... ... ... ... ... ...

Actor Reference Message Knowledge Aggregator m

...

slide-53
SLIDE 53

... ... ... ... ... ... ... ... ... ...

self-destruct()

Actor Reference Message Knowledge Aggregator m

...

slide-54
SLIDE 54

... ... ... ... ... ... ... ...

Actor Reference Message Knowledge Aggregator m

...

slide-55
SLIDE 55

... ... ... ... ... ... ...

Actor Reference Message Knowledge Aggregator m

...

slide-56
SLIDE 56

... ... ... ... ... ... ...

Actor Reference Message Knowledge Aggregator m

...

slide-57
SLIDE 57

... ... ... ... ... ... ... ... ... ...

Actor Reference Message Knowledge Aggregator m

...

slide-58
SLIDE 58

... ... ... ... ... ... ... ... ... ...

Actor Reference Message Knowledge Aggregator m

...

slide-59
SLIDE 59

... ... ... ... ... ... ... ...

Actor Reference Message Knowledge Aggregator m

...

slide-60
SLIDE 60

... ... ... ... ... ...

Actor Reference Message Knowledge Aggregator m

...

slide-61
SLIDE 61

Detecting Terminated Actors

slide-62
SLIDE 62

Detecting Terminated Actors

Q: How do you find an actor’s potential inverse acquaintances?
Q: How do you know if an actor has undelivered messages?
Q: How do you know the snapshots are consistent?

slide-63
SLIDE 63

Detecting Terminated Actors

Q: How do you find an actor’s potential inverse acquaintances? With contact tracing!
Q: How do you know if an actor has undelivered messages?
Q: How do you know the snapshots are consistent?

slide-64
SLIDE 64

Detecting Terminated Actors

Q: How do you find an actor’s potential inverse acquaintances? With contact tracing!
Q: How do you know if an actor has undelivered messages? With message counts!
Q: How do you know the snapshots are consistent?

slide-65
SLIDE 65

Detecting Terminated Actors

Q: How do you find an actor’s potential inverse acquaintances? With contact tracing!
Q: How do you know if an actor has undelivered messages? With message counts!
Q: How do you know the snapshots are consistent? Magic!

slide-66
SLIDE 66

Part III: Contact Tracing

slide-67
SLIDE 67

Contact Tracing

  • Actors must use reference objects (refobs) instead of ordinary references
  • Refobs are denoted (x : A ⊸ B), where x is a globally unique token
  • Can only be used by the owner A to send messages to the target B
  • Must be deactivated when no longer needed
  • Actor gets a refob when it spawns a child
  • If A has (x : A ⊸ B) and (y : A ⊸ C) then A can create (z : C ⊸ B)
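
The refob rules above can be modelled as a small immutable record plus two constructors. This is a sketch under stated assumptions: the `Refob` class, `spawn`, and `create_refob` are illustrative names, and a local counter stands in for globally unique tokens.

```python
import itertools
from dataclasses import dataclass

_tokens = itertools.count(1)  # stand-in for globally unique tokens


@dataclass(frozen=True)
class Refob:
    token: int
    owner: str   # only the owner may send messages through this refob
    target: str

    def __str__(self):
        return f"({self.token} : {self.owner} ⊸ {self.target})"


def spawn(parent, child):
    """Spawning gives the parent a fresh refob to the child."""
    return Refob(next(_tokens), owner=parent, target=child)


def create_refob(x, y):
    """Given (x : A ⊸ B) and (y : A ⊸ C), actor A creates (z : C ⊸ B)."""
    if x.owner != y.owner:
        raise ValueError("both refobs must be owned by the creating actor")
    return Refob(next(_tokens), owner=y.target, target=x.target)


x = spawn("A", "B")
y = spawn("A", "C")
z = create_refob(x, y)  # a refob for C to use when messaging B
```

Making the record frozen mirrors the idea that a refob is a fixed (token, owner, target) triple; deactivation would be tracked separately in the owner's knowledge set.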
slide-68
SLIDE 68

Contact Tracing

  • Actors must use reference objects (refobs) instead of ordinary references
  • Refobs are denoted (x : A ⊸ B), where x is a globally unique token
  • Can only be used by the owner A to send messages to the target B
  • Must be deactivated when no longer needed
  • Actor gets a refob when it spawns a child
  • If A has (x : A ⊸ B) and (y : A ⊸ C) then A can create (z : C ⊸ B)

A B C x y

slide-69
SLIDE 69

A

Idle Busy Reference Message Knowledge m

...

...

slide-70
SLIDE 70

A B

Idle Busy Reference Message Knowledge m

...

Active(x : A ⊸ B) ... Created(x : A ⊸ B)

x

slide-71
SLIDE 71

A B

Idle Busy Reference Message Knowledge m

...

Active(x) ... Created(x)

x

slide-72
SLIDE 72

A B C y

Idle Busy Reference Message Knowledge m

...

Created(y)

x

Created(x) Active(x) Active(y) ...

slide-73
SLIDE 73

A B C y

Idle Busy Reference Message Knowledge m

...

Created(y)

x

Created(x) Active(x) Active(y) ...

slide-74
SLIDE 74

A B C y

Idle Busy Reference Message Knowledge m

...

Created(y)

x

Created(x) Active(x) Active(y) CreatedUsing(x, z) ...

msg(z : C ⊸ B)

slide-75
SLIDE 75

A B C y

Idle Busy Reference Message Knowledge m

...

Active(x) Active(y) CreatedUsing(x, z) ...

x z

Created(x) Created(y) Active(z)

slide-76
SLIDE 76

A B C y

Idle Busy Reference Message Knowledge m

...

Active(x) Active(y) CreatedUsing(x, z) ...

x z

Created(x) Created(y) Active(z)

info(z : C ⊸ B)

slide-77
SLIDE 77

A B C y

Idle Busy Reference Message Knowledge m

...

Active(x) Active(y) ...

x

Created(x) Created(y) Active(z)

info(z : C ⊸ B) release(z : C ⊸ B)

slide-78
SLIDE 78

Case 1: info message arrives first

slide-79
SLIDE 79

A B C y

Idle Busy Reference Message Knowledge m

...

Active(x) Active(y) ...

x

Created(x) Created(y)

info(z : C ⊸ B) release(z : C ⊸ B)

slide-80
SLIDE 80

A B C y

Idle Busy Reference Message Knowledge m

...

Active(x) Active(y) ...

x

Created(x) Created(z) Created(y)

release(z : C ⊸ B)

slide-81
SLIDE 81

A B C y

Idle Busy Reference Message Knowledge m

...

Active(x) Active(y) ...

x

Created(x) Created(z) Created(y)

slide-82
SLIDE 82

Case 2: release message arrives first

slide-83
SLIDE 83

A B C y

Idle Busy Reference Message Knowledge m

...

Active(x) Active(y) ...

x

Created(x) Created(y)

info(z : C ⊸ B) release(z : C ⊸ B)

slide-84
SLIDE 84

A B C y

Idle Busy Reference Message Knowledge m

...

Active(x) Active(y) ...

x

Created(y)

info(z : C ⊸ B)

Created(x) Released(z)

slide-85
SLIDE 85

A B C y

Idle Busy Reference Message Knowledge m

...

Active(x) Active(y) ...

x

Created(y) Created(x) Released(z)

slide-86
SLIDE 86

Part IV: Message Counts

slide-87
SLIDE 87

A B

Idle Busy Reference Message Knowledge m

...

msg(x : A ⊸ B)

slide-88
SLIDE 88

A B

Idle Busy Reference Message Knowledge m

...

x

Active(x) ...

slide-89
SLIDE 89

B

Idle Busy Reference Message Knowledge m

...

x

Active(x) Sent(x, 1) ...

msg(x, “hi”)

A

slide-90
SLIDE 90

B

Idle Busy Reference Message Knowledge m

...

x

Active(x) Sent(x, 1) ... ... Received(x, 1) ...

A

slide-91
SLIDE 91

B

Idle Busy Reference Message Knowledge m

...

x

Active(x) Sent(x, 2) ... ... Received(x, 1) ...

A

info(x, y)

slide-92
SLIDE 92

B

Idle Busy Reference Message Knowledge m

...

x

Active(x) Sent(x, 2) ... ... Received(x, 2) ...

A

slide-93
SLIDE 93

B

Idle Busy Reference Message Knowledge m

...

x

Active(x) Sent(x, 2) ... ... Received(x, 2) ...

A

slide-94
SLIDE 94

A B

Idle Busy Reference Message Knowledge m

...

Active(x) Sent(x, 2) ... ... Received(x, 2) ...

release(x)

slide-95
SLIDE 95

A B

Idle Busy Reference Message Knowledge m

...

... Received(x, 2) ...

release(x)

...

slide-96
SLIDE 96

A B

Idle Busy Reference Message Knowledge m

...

... ... Received(x, 2) ...

slide-97
SLIDE 97

A B

Idle Busy Reference Message Knowledge m

...

... ...
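
The Sent/Received bookkeeping walked through in Part IV can be sketched with two counters per actor. The `CountingEndpoint` class and `undelivered` helper are hypothetical; in the talk the counts are compared through snapshots, not by reading both actors' live state as done here.

```python
from collections import Counter


class CountingEndpoint:
    """Per-actor send/receive tallies, keyed by refob token (hypothetical helper)."""

    def __init__(self):
        self.sent = Counter()      # Sent(x, n) facts, kept by the refob's owner
        self.received = Counter()  # Received(x, n) facts, kept by the target

    def record_send(self, token):
        self.sent[token] += 1

    def record_receive(self, token):
        self.received[token] += 1


def undelivered(owner, target, token):
    """Messages sent along `token` that the target has not yet received."""
    return owner.sent[token] - target.received[token]


a, b = CountingEndpoint(), CountingEndpoint()
a.record_send("x")                       # msg(x, "hi"): Sent(x, 1)
b.record_receive("x")                    # Received(x, 1)
a.record_send("x")                       # info(x, y) also counts: Sent(x, 2)
pending_before = undelivered(a, b, "x")  # the info message is still in flight
b.record_receive("x")                    # Received(x, 2)
pending_after = undelivered(a, b, "x")   # counts agree again
```

When the two counts for a refob agree, no message sent along it is still in flight.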

slide-98
SLIDE 98

Part V: Termination Detection

slide-99
SLIDE 99

“Appear to be terminated”?

slide-100
SLIDE 100

“Appear to be terminated”?

An actor is terminated if it is blocked… and its potential inverse acquaintances are blocked… and their potential inverse acquaintances are blocked… and so on.

slide-101
SLIDE 101

“Appear to be terminated”?

Let S be a set of snapshots. Assume B is the first actor in S to take a snapshot.

slide-102
SLIDE 102

“Appear to be terminated”?

B

...

Idle Busy Reference Message Knowledge m

...

slide-103
SLIDE 103

“Appear to be terminated”?

B

...

Idle Busy Reference Message Knowledge m

...

If B has no facts Created(x : A ⊸ B) then it has no potential inverse acquaintances (see paper).

slide-104
SLIDE 104

“Appear to be terminated”?

B

Created(x : A ⊸ B) Received(x, 5) ...

Idle Busy Reference Message Knowledge m

...

slide-105
SLIDE 105

“Appear to be terminated”?

B

Created(x : A ⊸ B) Received(x, 5) ...

Idle Busy Reference Message Knowledge m

...

To find out if B is terminated, we need a snapshot from A.

slide-106
SLIDE 106

“Appear to be terminated”?

A B

x

Idle Busy Reference Message Knowledge m

...

Created(x : A ⊸ B) Received(x, 5) ...

To find out if B is terminated, we need a snapshot from A.

slide-107
SLIDE 107

“Appear to be terminated”?

A B

x

Idle Busy Reference Message Knowledge m

...

Created(x : A ⊸ B) Received(x, 5) ...

What if A’s snapshot doesn’t contain Active(x)?

...

slide-108
SLIDE 108

“Appear to be terminated”?

A B

msg(x)

Idle Busy Reference Message Knowledge m

...

Created(x : A ⊸ B) Received(x, 5) ...

What if A’s snapshot doesn’t contain Active(x)? Case 1: It hasn’t received x yet.

...

slide-109
SLIDE 109

“Appear to be terminated”?

A B

release(x)

Idle Busy Reference Message Knowledge m

...

Created(x : A ⊸ B) Received(x, 5) ...

What if A’s snapshot doesn’t contain Active(x)? Case 1: It hasn’t received x yet. Case 2: It has released x already.

...

slide-110
SLIDE 110

“Appear to be terminated”?

A B

release(x)

Idle Busy Reference Message Knowledge m

...

Created(x : A ⊸ B) Received(x, 5) ...

What if A’s snapshot doesn’t contain Active(x)? Case 1: It hasn’t received x yet. Case 2: It has released x already. Then B is not terminated!

...

slide-111
SLIDE 111

“Appear to be terminated”?

A B

x

Idle Busy Reference Message Knowledge m

...

Created(x : A ⊸ B) Received(x, 5) ...

B is only terminated if…

  • A’s snapshot contains Active(x).

Active(x) ...

slide-112
SLIDE 112

“Appear to be terminated”?

A B

x

Idle Busy Reference Message Knowledge m

...

Created(x : A ⊸ B) Received(x, 5) ...

What if the message counts differ?

Active(x) Sent(x, 6) ...

slide-113
SLIDE 113

“Appear to be terminated”?

A B

x

Idle Busy Reference Message Knowledge m

...

Created(x : A ⊸ B) Received(x, 5) ...

What if the message counts differ? Recall B’s snapshot came first. So at some point B became unblocked.

Active(x) Sent(x, 6) ...

msg(x, ...)

slide-114
SLIDE 114

“Appear to be terminated”?

A B

x

Idle Busy Reference Message Knowledge m

...

Created(x : A ⊸ B) Received(x, 5) ...

B is only terminated if…

  • A’s snapshot contains Active(x);
  • Their send and receive counts agree.

Active(x) Sent(x, 5) ...

slide-115
SLIDE 115

“Appear to be terminated”?

A B

x

Idle Busy Reference Message Knowledge m

...

Created(x : A ⊸ B) Received(x, 5) ...

What if A created a new refob y : C ⊸ B before snapshot?

sayHello(_, y : C ⊸ B)

Active(x) Sent(x, 5) ...

slide-116
SLIDE 116

“Appear to be terminated”?

A B

x

Idle Busy Reference Message Knowledge m

...

Created(x : A ⊸ B) Received(x, 5) ...

What if A created a new refob y : C ⊸ B before snapshot? Then A created a fact CreatedUsing(x, y).

sayHello(_, y : C ⊸ B)

Active(x) Sent(x, 5) CreatedUsing(x, y) ...

slide-117
SLIDE 117

“Appear to be terminated”?

A B

x

Idle Busy Reference Message Knowledge m

...

Created(x : A ⊸ B) Received(x, 5) ...

What if A created a new refob y : C ⊸ B before snapshot? Then A created a fact CreatedUsing(x, y). Could A forget it?

Active(x) Sent(x, 5) CreatedUsing(x, y) ...

info(x, y)

slide-118
SLIDE 118

“Appear to be terminated”?

A B

x

Idle Busy Reference Message Knowledge m

...

Created(x : A ⊸ B) Received(x, 5) ...

What if A created a new refob y : C ⊸ B before snapshot? Then A created a fact CreatedUsing(x, y). Could A forget it? Not without changing the send count!

Active(x) Sent(x, 6) CreatedUsing(x, y) ...

info(x, y)

slide-119
SLIDE 119

“Appear to be terminated”?

A B

x

Idle Busy Reference Message Knowledge m

...

Created(x : A ⊸ B) Received(x, 5) ...

B is only terminated if…

  • A’s snapshot contains Active(x);
  • Their send and receive counts agree.

Now we need a snapshot from C...

Active(x) Sent(x, 5) CreatedUsing(x, y) ...

slide-120
SLIDE 120

“Appear to be terminated”?

Assuming B is blocked, show that the next actor is also blocked.

slide-121
SLIDE 121

“Appear to be terminated”?

Assuming B is blocked, show that the next actor is also blocked. Hence every actor that can potentially reach B is blocked!
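
The conditions built up in Part V can be combined into one check over a set of snapshots. The sketch below is a reading of the slides only, not the paper's full algorithm. It assumes each snapshot is taken while its actor is idle, and the `Snapshot` field layout and the `appears_terminated` name are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Snapshot:
    created: dict = field(default_factory=dict)        # token -> owner, for refobs targeting this actor
    active: set = field(default_factory=set)           # tokens of refobs this actor owns
    sent: dict = field(default_factory=dict)           # token -> send count
    received: dict = field(default_factory=dict)       # token -> receive count
    created_using: dict = field(default_factory=dict)  # token x -> {token y: owner of y}


def appears_terminated(start, snapshots):
    """Apply the slide conditions to `start` and to every actor that may reach it."""
    pending, checked = [start], set()
    while pending:
        target = pending.pop()
        if target in checked:
            continue
        checked.add(target)
        snap = snapshots.get(target)
        if snap is None:
            return False  # no snapshot from this actor, so nothing can be concluded
        inverse = dict(snap.created)  # refobs pointing at `target`: token -> owner
        frontier = list(inverse)
        while frontier:
            token = frontier.pop()
            owner_snap = snapshots.get(inverse[token])
            if owner_snap is None or token not in owner_snap.active:
                return False  # refob not yet received, or already released
            if owner_snap.sent.get(token, 0) != snap.received.get(token, 0):
                return False  # some message along this refob is undelivered
            # CreatedUsing(x, y) reveals refobs the target has not heard of yet.
            for new_token, new_owner in owner_snap.created_using.get(token, {}).items():
                if new_token not in inverse:
                    inverse[new_token] = new_owner
                    frontier.append(new_token)
        pending.extend(inverse.values())
    return True


b = Snapshot(created={"x": "A"}, received={"x": 5})
agree = appears_terminated("B", {"B": b, "A": Snapshot(active={"x"}, sent={"x": 5})})
differ = appears_terminated("B", {"B": b, "A": Snapshot(active={"x"}, sent={"x": 6})})
a_with_y = Snapshot(active={"x"}, sent={"x": 5}, created_using={"x": {"y": "C"}})
need_c = appears_terminated("B", {"B": b, "A": a_with_y})
with_c = appears_terminated("B", {"B": b, "A": a_with_y, "C": Snapshot(active={"y"})})
```

In this example `agree` and `with_c` are true, while `differ` and `need_c` are false, matching the slides: mismatched counts or a missing snapshot from C block the conclusion.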

slide-122
SLIDE 122

Acknowledgments

This work was supported in part by the National Science Foundation under Grant No. SHF 1617401, and in part by the Laboratory Directed Research and Development program at Sandia National Laboratories, a multi-mission laboratory managed and operated by National Technology and Engineering Solutions of Sandia, LLC, a wholly owned subsidiary of Honeywell International, Inc., for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-NA0003525. Thanks also to Dipayan Mukherjee, Atul Sandur, Charles Kuch, Jerry Wu, Emily Hutchinson, and the anonymous referees for their valuable feedback.