CALM: Programming the Cloud. Joe Hellerstein, Peter Alvaro



slide-1
SLIDE 1

BLOOM CALM PROGRAMMING THE CLOUD

Joe Hellerstein Peter Alvaro

slide-2
SLIDE 2

AGENDA

  • Brief research background from the BOOM project
  • http://boom.cs.berkeley.edu
  • A taste of CS194-17, “Programming the Cloud”, and the Bloom language
  • Some related work

slide-3
SLIDE 3

BOOM

In an era of cheap compute and ubiquitous data… productivity is a key grand challenge in computing.

Berkeley Orders Of Magnitude project

OOM bigger systems, OOM less code.

Significantly improve productivity for developers of distributed systems.

slide-4
SLIDE 4

THE von NEUMANN MACHINE

  • ORDER
  • LIST of Instructions
  • ARRAY of Memory
  • THE STATE
  • Mutation in time
slide-5
SLIDE 5

DISTRIBUTED COMPUTING IS THE NEW NORMAL

  • ORDER IS TOO COSTLY

– Coordination

  • THE STATE IS HEARSAY

– Delay
– Failure

http://www.flickr.com/photos/scobleizer/4870003098/sizes/l/in/photostream/

slide-6
SLIDE 6

DISORDERLY PROGRAMMING

STATE

  • Order-insensitive objects

LOGIC

  • Order-insensitive merge rules

IMPLICATION: KEEP CALM

  • Asynchrony is irrelevant
  • Replication is easy
  • Coordination is unnecessary

Not always possible! But often.

  • Disorder by default
  • Order is the exception.

The CALM Theorem says when.

http://www.flickr.com/photos/scobleizer/4870003098/sizes/l/in/photostream/
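
The idea of order-insensitive state with an order-insensitive merge rule can be sketched in a few lines. This is a minimal illustration in Python, not Bloom code: the names `merge` and `replay` and the sample updates are assumptions made up for the example. It uses set union as the merge rule and checks that every delivery order, and even duplicated delivery, leaves a replica in the same state.

```python
import itertools

def merge(state: frozenset, update: frozenset) -> frozenset:
    """Hypothetical order-insensitive merge rule: set union."""
    return state | update

def replay(order):
    """Apply updates one at a time, in the given delivery order."""
    state = frozenset()
    for update in order:
        state = merge(state, update)
    return state

# Three updates that may reach a replica in any order.
updates = [frozenset({"a"}), frozenset({"b", "c"}), frozenset({"a", "c"})]

# Collect the final state for every possible delivery order.
outcomes = {replay(order) for order in itertools.permutations(updates)}

# Delivering every update twice (as a retrying network might) changes nothing.
duplicated = replay(updates + updates)
```

Because union is commutative, associative and idempotent, `outcomes` holds a single state, which is why replicas built this way can agree without coordinating on order.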

slide-7
SLIDE 7

<~ bloom

  • A disorderly distributed language as above
  • [Hellerstein, et al. CIDR11]
  • http://bloom-lang.org
  • Ruby prototype: Bud

% gem install bud

  • Theoretical grounding: Dedalus
  • A logic for data, space and time
  • Model-theoretic (fully declarative) semantics
  • [Alvaro, et al. Datalog2.0-11, Datalog2.0-12]
slide-11
SLIDE 11

CS194-17 at Berkeley: Programming the Cloud

  • Joe Hellerstein & Peter Alvaro
  • Now in its second offering.
  • Tuesdays: Big Picture
  • lectures on distributed systems fundamentals
  • Thursdays: Hands On
  • live-coding in Bloom
  • We’ll do a bit of a blend today…
slide-12
SLIDE 12

Lessons for Today

  • 1. Communication as Rendezvous in Space & Time
  • 2. The Duality of Communication and Storage
  • 3. Assessing the need for Coordination protocols
  • CALM program analysis
slide-14
SLIDE 14

The Land of Two Mountains

slide-15
SLIDE 15

Rendezvous by Luck (Smoke Signals)

slide-16
SLIDE 16

Sender Persists

slide-17
SLIDE 17

Receiver Persists

slide-18
SLIDE 18

Both Persist
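
One possible reading of the four rendezvous slides above is that a message is received only when the signal and the observer coincide in time, and that persistence on either side widens the window. The Python sketch below models that reading with discrete time steps. The function `rendezvous`, the horizon, and the chosen times are all assumptions for illustration; none of them come from the slides.

```python
def rendezvous(signal_times, watch_times):
    """Instants at which the signal is visible AND the receiver is watching."""
    return sorted(set(signal_times) & set(watch_times))

HORIZON = 10      # hypothetical number of time steps in the simulation
SIGNAL_AT = 3     # the sender lights the smoke signal at time 3
WATCH_AT = 6      # a non-persistent receiver glances over at time 6
WATCH_FROM = 1    # a persistent receiver starts watching at time 1

# Rendezvous by luck: one puff of smoke, one glance. They miss each other.
by_luck = rendezvous({SIGNAL_AT}, {WATCH_AT})

# Sender persists: the smoke keeps rising, so a later glance still sees it.
sender_persists = rendezvous(range(SIGNAL_AT, HORIZON), {WATCH_AT})

# Receiver persists: the watcher is already looking when the puff goes up.
receiver_persists = rendezvous({SIGNAL_AT}, range(WATCH_FROM, HORIZON))

# Both persist: they overlap at every instant from the first signal onward.
both_persist = rendezvous(range(SIGNAL_AT, HORIZON), range(WATCH_FROM, HORIZON))
```

In this model a persisting sender behaves like storage (the message waits for a reader) and a persisting receiver behaves like a standing listener, which is one way to think about communication and storage as two sides of the same rendezvous.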

slide-19
SLIDE 19

Lessons for Today

  • 1. Communication as Rendezvous in Space & Time
  • 2. The Duality of Communication and Storage
  • 3. Assessing the need for Coordination protocols
  • CALM program analysis
slide-21
SLIDE 21

Directions for Thought

  • Thm (CALM): Consistency As Logical Monotonicity
  • <= : Distributed code that’s monotonic will be “eventually consistent” without coordination.
  • Corollary: It is sufficient to use coordination only to “guard” the non-monotonic statements in a program.
  • => : Any eventually consistent program is in some fundamental way monotonic.
  • Said differently:
  • “Thank you for all the Paxos, Dr. Lamport. Do I need it?”
  • Or perhaps better: “What is time for? Must I spend it?”
  • [Hellerstein, SIGMOD Record 3/10; Ameloot PODS11, ICDT12; Marczak Datalog 2.0-12]
  • Realized in practice via Bloom/Budplot.
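
The distinction the theorem draws can be illustrated with a small Python sketch. It is not Bloom and not the analysis Bloom performs; the graph, the function names and the `retracts` check are assumptions invented for this example. A monotonic query (graph reachability) never has to take back an answer it produced from partial input, whereas a non-monotonic query (which nodes are unreachable, a negation) can emit answers that later input invalidates.

```python
from itertools import permutations

def reachable(edges, src):
    """Monotonic query: seeing more edges can only add reachable nodes."""
    seen, frontier = {src}, [src]
    while frontier:
        node = frontier.pop()
        for a, b in edges:
            if a == node and b not in seen:
                seen.add(b)
                frontier.append(b)
    return frozenset(seen)

def unreachable(edges, src, nodes):
    """Non-monotonic query: negation, so a later edge can invalidate an answer."""
    return frozenset(nodes) - reachable(edges, src)

def retracts(query, arrival_order):
    """True if an answer computed on some prefix of the input is not in the final answer."""
    final = query(list(arrival_order))
    return any(
        not query(list(arrival_order[:i])) <= final
        for i in range(len(arrival_order) + 1)
    )

NODES = {"a", "b", "c", "d"}                      # hypothetical graph
EDGES = [("a", "b"), ("b", "c"), ("c", "d")]      # arriving in arbitrary order

mono_retracts = any(
    retracts(lambda e: reachable(e, "a"), order) for order in permutations(EDGES)
)
nonmono_retracts = any(
    retracts(lambda e: unreachable(e, "a", NODES), order) for order in permutations(EDGES)
)
```

Here `mono_retracts` is `False` for every arrival order, while `nonmono_retracts` is `True`: before any edge arrives, every other node looks unreachable. Guarding the non-monotonic query, in the sense of the corollary, would mean delaying it until the input is known to be complete, which is exactly where coordination enters.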
slide-22
SLIDE 22

More Results

  • http://boom.cs.berkeley.edu
  • http://bloom-lang.org

Materials for this talk:

  • https://github.com/programthecloud/ptcrepo/tree/gh-pages/demo
slide-23
SLIDE 23

BOOM TEAM

joe hellerstein, ras bodik, david maier, alan fekete, peter alvaro, neil conway, bill marczak, haryadi gunawi, peter bailis, sriram srinivasan, emily andrews, andy hutchinson, joshua rosen

slide-24
SLIDE 24

Key Results 1

  • BOOM Analytics [Alvaro, et al. Eurosys ‘10]
  • HDFS rebuilt in Overlog, the predecessor to Bloom, with HA and scale-out
  • Hadoop scheduler as well
  • BloomL: Beyond sets/tables [Conway, et al. SoCC ‘12]
  • Extensions for natural monotone data types like counters, vector clocks, KVS with commutative merges
  • Safe mappings between these types
  • Blazes: Coordination analysis of streaming services [Alvaro, et al. In process]
  • Grey-box: bring CALM analysis to popular streaming systems like Storm
  • White-box: more fully automated stream analysis in the Bloom context
  • Correct, Composable Concurrent Editing [Conway, et al. In process]
  • Google-Doc style concurrent editing remains a black art
  • Operational Transforms
  • Lattices underlie a lot of the intuition
  • BloomL provides a rich language for composing lattices and traditional data
  • Automated analysis of correctness
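
To make "monotone data types with commutative merges" concrete, here is a small Python sketch of a vector clock treated as a lattice. It does not use BloomL or its API; `vc_merge`, `at_least` and the sample clocks are hypothetical names chosen for the example. The merge is a pointwise maximum, and `at_least` stands in for the kind of mapping from one lattice to another that stays safe because it is monotone: once true, it stays true as the clock grows.

```python
def vc_merge(a: dict, b: dict) -> dict:
    """Pointwise max of two vector clocks; a missing entry counts as 0."""
    return {k: max(a.get(k, 0), b.get(k, 0)) for k in a.keys() | b.keys()}

def at_least(clock: dict, threshold: int) -> bool:
    """Monotone map from the clock lattice to booleans (False < True)."""
    return sum(clock.values()) >= threshold

# Three hypothetical clocks observed at different replicas.
x = {"n1": 2, "n2": 0}
y = {"n1": 1, "n2": 3}
z = {"n3": 5}

# The lattice laws that make merge order and duplication irrelevant.
commutative = vc_merge(x, y) == vc_merge(y, x)
associative = vc_merge(vc_merge(x, y), z) == vc_merge(x, vc_merge(y, z))
idempotent = vc_merge(x, x) == x

merged = vc_merge(vc_merge(x, y), z)

# The threshold test can flip from False to True as state merges in, never back.
before = at_least(x, 10)
after = at_least(merged, 10)
```

The same three laws hold for grow-only counters and for key-value stores whose values are themselves merged this way, which is what lets such types compose.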
slide-25
SLIDE 25

Key Results 2

  • Consistency and Causality in the Wild (Bailis et al.)
  • Probabilistically Bounded Staleness [VLDB ‘12]
  • Dangers of Causal Consistency and a Solution [SoCC ‘12]
  • HAT, not CAP: Towards Highly Available Transactions [HotOS ‘13]
  • Bolt-On Causal Consistency [SIGMOD ‘13]
slide-26
SLIDE 26

Summing Up

  • Distributed? Disorderly by default.
  • Logic and Lattices in Space and Time
  • The Duality of Communication and Storage
  • Unifying the two linguistically makes for nice code
  • Assessing the need for Coordination protocols
  • CALM leads to straightforward program checks in Bloom
  • Points to games we can play in other languages/systems
  • Many interesting questions remain