Causal Consistency
for Distributed Data Stores and Applications
as They are
Kazuyuki Shudo, Takashi Yaguchi
Tokyo Tech
COMPSAC 2016, June 2016
Background: Distributed data store
– A database management system (DBMS) that consists of multiple servers
– For performance, capacity, and fault tolerance
– Cf. NoSQL
[Figure: NoSQL, a cluster of servers (1 - 1,000) holding replicas of each data item (1 - 5)]
1/11
[Figure: a client posts "Now I’m in Atlanta!" and then "It’s warmer than I expected." The second post causally depends on the first. In a causally consistent store the posts appear in that order; in a store that is not causally consistent, "It’s warmer than I expected." may appear without "Now I’m in Atlanta!".]
Causal dependency is created by:
– Write after read by the same process (client)
– Write after write by the same process (illustrated above)
– Read after write of the same variable (data item), regardless of which process reads or writes
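A minimal sketch of the bookkeeping these rules imply, assuming hypothetical Version and ClientSession classes (an illustration only, not the authors' library): a client remembers every version it has read or written, and its next write causally depends on all of them.

import java.util.HashSet;
import java.util.Set;

// One written or read version of a data item (hypothetical helper class).
final class Version {
    final String key;
    final long number;                       // version number of the data item
    Version(String key, long number) { this.key = key; this.number = number; }
    @Override public String toString() { return key + ":" + number; }
}

// Per-client state. Reads and writes by the same client create dependencies
// (write after read, write after write); reading a version also makes the
// reader depend on it, regardless of who wrote it (read after write).
final class ClientSession {
    private final Set<Version> observed = new HashSet<>();

    Set<Version> dependenciesForNextWrite() {
        return new HashSet<>(observed);      // snapshot attached to the next write
    }
    void onWrite(Version written) { observed.add(written); }
    void onRead(Version read)     { observed.add(read); }
}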
3/11
Approaches to causal consistency on an eventually consistent data store
[Figure: three software stacks of applications on top of an eventually consistent data store, with the modified part of the software highlighted: the data store approach, the middleware approach with an existing protocol, and the middleware approach with our Letting-It-Be protocol.]
– Existing middleware protocols: accesses are modified so that applications explicitly specify the data dependencies to be managed.
– Our Letting-It-Be protocol: it does not require any modifications to either data stores or applications.
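A sketch of the middleware idea under the same assumed classes as above (CausalShim and KeyValueStore are hypothetical names): the shim exposes the get/put interface the application already uses and stores dependency metadata through ordinary puts, so neither the application nor the data store is modified.

import java.util.Set;

interface KeyValueStore {                    // client API of the unmodified store
    byte[] get(String key);
    void put(String key, byte[] value);
}

final class CausalShim implements KeyValueStore {
    private final KeyValueStore store;       // e.g. a Cassandra client
    private final ClientSession session = new ClientSession();

    CausalShim(KeyValueStore store) { this.store = store; }

    @Override public void put(String key, byte[] value) {
        Set<Version> deps = session.dependenciesForNextWrite();
        // Keep the dependency list next to the value, using only ordinary
        // operations on the unmodified store ("#deps" is a made-up convention).
        store.put(key + "#deps", deps.toString().getBytes());
        store.put(key, value);
        // Placeholder version number; a real shim would track the version
        // actually assigned to this write.
        session.onWrite(new Version(key, System.currentTimeMillis()));
    }

    @Override public byte[] get(String key) {
        byte[] value = store.get(key);
        // Read-time resolution (described on a later slide) would check here
        // that every version the value depends on is visible in this cluster.
        // Placeholder version number; a real shim would use the version read.
        session.onRead(new Version(key, System.currentTimeMillis()));
        return value;
    }
}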
4/11
Dependency graph
[Figure: operations by Client 1, Client 2 and Client 3 along a time axis. Causal dependency between operations induces causal dependency between variables (versions). The resulting graph, organized into level 0, level 1 and level 2, is the dependency graph for version 3 of v (v3).]
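A sketch of the graph shape in this figure, with an assumed DepVertex class: level 0 is the written version itself (v3), level 1 its direct dependencies (x1, y2, z1), level 2 the dependencies of those, and so on.

import java.util.ArrayList;
import java.util.List;

final class DepVertex {
    final String key;
    final long version;
    final List<DepVertex> dependsOn = new ArrayList<>();   // vertexes one level below

    DepVertex(String key, long version) { this.key = key; this.version = version; }

    // Collect every vertex at the given depth below this one.
    List<DepVertex> verticesAtLevel(int level) {
        List<DepVertex> result = new ArrayList<>();
        collect(this, level, result);
        return result;
    }
    private static void collect(DepVertex v, int remaining, List<DepVertex> out) {
        if (remaining == 0) { out.add(v); return; }
        for (DepVertex d : v.dependsOn) collect(d, remaining - 1, out);
    }
}

In the example above, v3.verticesAtLevel(1) would return x1, y2 and z1.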
5/11
Write-time resolution (ChainReaction and Orbe):
– When a server receives a replica update of v3, before writing v3 the server confirms that the cluster already has the level 1 vertexes x1, y2 and z1.
– Letting-It-Be cannot implement write-time resolution because it leaves the data store unmodified.
Read-time resolution (Letting-It-Be, our proposal):
– When a server receives a read request for v, the server confirms that the cluster has all the vertexes of the dependency graph, including x1, y2, z1 and u4.
[Figure: dependency graph for v3, organized into level 0, level 1 and level 2.]
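A sketch of read-time resolution under the assumed classes above: before returning v to the client, the middleware walks v's dependency graph and waits until every vertex is visible in the local cluster. The "#version" cell and the polling loop are illustrative simplifications, not the authors' mechanism.

final class ReadTimeResolver {
    private final KeyValueStore store;

    ReadTimeResolver(KeyValueStore store) { this.store = store; }

    byte[] readResolved(String key, DepVertex depGraph) throws InterruptedException {
        waitUntilVisible(depGraph);
        return store.get(key);
    }

    private void waitUntilVisible(DepVertex v) throws InterruptedException {
        for (DepVertex dep : v.dependsOn) {
            // A vertex is visible once the local replica stores a version at
            // least as new as the one recorded in the dependency graph.
            while (localVersionOf(dep.key) < dep.version) {
                Thread.sleep(10);            // crude retry for the sketch
            }
            waitUntilVisible(dep);           // continue to level 2, level 3, ...
        }
    }

    private long localVersionOf(String key) {
        byte[] v = store.get(key + "#version");    // hypothetical version cell
        return v == null ? -1L : Long.parseLong(new String(v));
    }
}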
6/11
It requires no modification of a data store, but there are problems.
[Figure: dependency graph for v and dependency graph for t]
– A dependency graph (e.g. for v3) is to be overwritten by a newer version (v4), so dependency information can be lost.
7/11
Bolt-on
– It reduces the amount of data by forcing an app to specify dependencies explicitly.
– It requires modification of apps.
Letting-It-Be (our proposal)
– It reduces the amount of data by attaching only level 1 vertexes.
– It requires no modification of apps.
– It traverses a graph across servers, but the marking technique reduces the traversal (see the sketch after this list).
– It requires garbage collection of old dependency graphs that are no longer needed.
[Figure: dependency graphs for v and for t. Bolt-on attaches the entire graph; Letting-It-Be keeps multiple versions of graphs up to level 1.]
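A sketch of the marking idea with an assumed ResolutionMarks class: once a read has confirmed that every dependency of a version is visible, that version is marked as resolved, so later reads can skip the cross-server traversal of the same subgraph.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class ResolutionMarks {
    // key -> highest version already known to be causally resolved
    private final Map<String, Long> resolved = new ConcurrentHashMap<>();

    boolean isResolved(String key, long version) {
        return resolved.getOrDefault(key, -1L) >= version;
    }
    void markResolved(String key, long version) {
        resolved.merge(key, version, Math::max);
    }
}

In the read-time resolution sketch above, waitUntilVisible could consult isResolved before recursing and call markResolved on the way back, so an already-resolved subgraph is never traversed twice.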
8/11
Experimental setup: no modification of either apps or the data store
– 2 clusters, each with 9 servers running Linux 3.2.0, and 50 ms of latency between the clusters
– Apache Cassandra 2.1.0, configured so that each cluster holds one replica
– Letting-It-Be protocol implemented as a library in 3,000 lines of code
– Yahoo! Cloud Serving Benchmark (YCSB) [ACM SoCC 2010] with a Zipfian distribution
[Figure: supposed system model]
9/11
[Figures: read latencies with a read-heavy workload, write latencies with a write-heavy workload, and maximum throughput; the latency charts are annotated "21% lower" and "78% lower".]
– Letting-It-Be does read-time resolution.
– Marking already-resolved data items works well.
10/11
Conclusion
– We demonstrated that Letting-It-Be works with a production-level data store, Apache Cassandra.
11/11