
Issues and Techniques for Weak Replication



Asynchronous Replication and Bayou
Jeff Chase
CPS 212, Fall 2000

Grapevine and Clearinghouse (Xerox)
Weakly consistent replication was used in earlier work at Xerox PARC:
• Grapevine and Clearinghouse name services
• Updates were propagated by unreliable multicast ("direct mail").
• Periodic anti-entropy exchanges among replicas ensure that they eventually converge, even if updates are lost. Arbitrary pairs of replicas periodically establish contact and resolve all differences between their databases.
• Various mechanisms (e.g., MD5 digests and update logs) reduce the volume of data exchanged in the common case.
• Deletions are handled as a special case via "death certificates" recording the delete operation as an update.

Asynchronous Replication
Idea: build available/scalable information services with read-any-write-any replication and a weak consistency model.
- no denial of service during transient network partitions
- supports massive replication without massive overhead
- "ideal for the Internet and mobile computing" [Golding92]
Problems: replicas may be out of date, may accept conflicting writes, and may receive updates in different orders.
[Figure: clients A, B, and C each read and write at a nearby replica (A, B, or C); the replicas propagate state to one another asynchronously.]

Epidemic Algorithms
PARC developed a family of weak update protocols based on a disease metaphor (epidemic algorithms [Demers et al., OSR 1/88]):
• Each replica periodically "touches" a selected "susceptible" peer site and "infects" it with updates: transfer every update known to the carrier but not the victim.
• Partner selection is randomized using a variety of heuristics.
• Theory shows that the epidemic will eventually infect the entire population (assuming it is connected). The probability that replicas have not yet converged decreases exponentially with time.
• Heuristics (e.g., push vs. pull) affect traffic load and the expected propagation time-to-convergence.

Synchronous Replication
Basic scheme: connect each client (or front-end) with every replica: writes go to all replicas, but the client can read from any replica (read-one-write-all replication).
• How to ensure that each replica sees updates in the "right" order?
[Figure: clients A and B each send their writes to all replicas.]
Problem: low concurrency, low availability, and high response times.
Partial solution: allow writes to any N replicas (a quorum of size N). To be safe, reads must also request data from a quorum of replicas.

How to Ensure That Replicas Converge
1. Using any form of epidemic (randomized) anti-entropy, all updates will (eventually) be known to all replicas.
2. Imposing a global order on updates guarantees that all sites (eventually) apply the same updates in the same order.
3. Assuming conflict detection is deterministic, all sites will detect the same conflicts.
   Write conflicts cannot (generally) be detected when a site accepts a write; they appear when updates are applied.
4. Assuming conflict resolution is deterministic, all sites will resolve all conflicts in exactly the same way.
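The epidemic anti-entropy described above (randomized partner selection, push-style "infection", and digests to short-circuit exchanges when two replicas already agree) is small enough to sketch in code. The sketch below only illustrates the idea; it is not Grapevine's or Bayou's implementation, and the Replica class, its last-writer-wins apply rule, and the epidemic_round driver are assumptions made for this example.

```python
import hashlib
import random

class Replica:
    """A toy replica: a key/value database plus an update log.

    Illustrative sketch of randomized push anti-entropy; not the
    Grapevine/Bayou code. Accept stamps are opaque comparable values
    supplied by the caller.
    """

    def __init__(self, name):
        self.name = name
        self.db = {}      # key -> (value, accept_stamp)
        self.log = []     # updates seen, in arrival order

    def digest(self):
        """MD5 digest of the database, used to short-circuit anti-entropy
        when two replicas are already identical."""
        h = hashlib.md5()
        for key in sorted(self.db):
            h.update(repr((key, self.db[key])).encode())
        return h.hexdigest()

    def accept_write(self, key, value, stamp):
        """Accept a write locally (read-any-write-any)."""
        self.apply((key, value, stamp))

    def apply(self, update):
        key, value, stamp = update
        # Last-writer-wins by accept stamp; a real system would run its
        # conflict detection/resolution rules here instead.
        if key not in self.db or self.db[key][1] < stamp:
            self.db[key] = (value, stamp)
        if update not in self.log:
            self.log.append(update)

    def anti_entropy_push(self, victim):
        """'Infect' a susceptible peer with every update known to the
        carrier but not the victim, unless digests show they agree."""
        if self.digest() == victim.digest():
            return
        for update in self.log:
            if update not in victim.log:
                victim.apply(update)

def epidemic_round(replicas):
    """One round: each replica pushes to a randomly chosen partner."""
    for r in replicas:
        partner = random.choice([p for p in replicas if p is not r])
        r.anti_entropy_push(partner)
```

Running epidemic_round repeatedly over a connected set of replicas drives them to the same state, mirroring the exponential-decay convergence claim above.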

Issues and Techniques for Weak Replication
1. How should replicas choose partners for anti-entropy exchanges?
   Topology-aware choices minimize bandwidth demand by "flooding", but randomized choices survive transient link failures.
2. How to impose a global ordering on updates?
   logical clocks and delayed delivery (or delayed commitment) of updates
3. How to integrate new updates with existing database state?
   Propagate updates rather than state, but how to detect and reconcile conflicting updates? Bayou: user-defined checks and merge rules.
4. How to determine which updates to propagate to a peer on each anti-entropy exchange?
   vector clocks or vector timestamps
5. When can a site safely commit or stabilize received updates?
   receiver acknowledgement by vector clocks (TSAE protocol)

Update Ordering
Problem: how to ensure that all sites recognize a fixed order on updates, even if updates are delivered out of order?
Solution: assign timestamps to updates at their accepting site, and order them by source timestamp at the receiver. Assign nodes unique IDs; break ties with the origin node ID.
• What (if any) ordering exists between updates accepted by different sites?
  Comparing physical timestamps is arbitrary: physical clocks drift. Even a protocol to maintain loosely synchronized physical clocks cannot assign a meaningful ordering to events that occurred at "almost exactly the same time".
• In Bayou, received updates may affect generation of future updates, since they are immediately visible to the user.

Bayou Basics
1. Highly available, weak replication for mobile clients.
   Beware: every device is a "server"... let's call 'em sites.
2. Update conflicts are detected/resolved by rules specified by the application and transmitted with the update.
   interpreted dependency checks and merge procedures
3. Stale or tentative data may be observed by the client, but may mutate later.
   The client is aware that some updates have not yet been confirmed.
   "An inconsistent database is marginally less useful than a consistent one."

Causality
Constraint: the update ordering must respect potential causality.
• Communication patterns establish a happened-before order on events, which tells us when ordering might matter.
• Event e1 happened-before e2 iff e1 could possibly have affected the generation of e2: we say that e1 < e2.
  e1 < e2 iff e1 was "known" when e2 occurred. Events e1 and e2 are potentially causally related.
• In Bayou, users or applications may perceive inconsistencies if the causal ordering of updates is not respected at all replicas.
  An update u should be ordered after all updates w known to the accepting site at the time u was accepted; e.g., the newsgroup example in the text.

Clocks
1. physical clocks
   Protocols to control drift exist, but physical clock timestamps cannot assign an ordering to "nearly concurrent" events.
2. logical clocks
   Simple timestamps guaranteed to respect causality: "A's current time is later than the timestamp of any event A knows about, no matter where it happened or who told A about it."
3. vector clocks
   O(N) timestamps that say exactly what A knows about events on B, even if A heard it from C.
4. matrix clocks
   O(N^2) timestamps that say what A knows about what B knows about events on C.
   Acknowledgement vectors: an O(N) approximation to matrix clocks.

Causality: Example
[Figure: message timelines for sites A (events A1-A4), B (B1-B4), and C (C1-C3); the message pattern establishes A1 < B2 < C2, B3 < A3, and C2 < A4.]
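The vector clocks listed above are what let a site test the happened-before relation from the causality example directly: entry j of a site's vector says how much of site j's history it has observed. The following is a minimal, generic sketch under that definition; the VectorClock class and helper functions are invented for illustration and are not Bayou's actual data structures.

```python
class VectorClock:
    """Minimal vector clock for N sites (illustrative sketch)."""

    def __init__(self, n_sites, site_id):
        self.site_id = site_id
        self.v = [0] * n_sites

    def local_event(self):
        """A locally generated event (e.g., accepting an update)."""
        self.v[self.site_id] += 1
        return tuple(self.v)          # timestamp assigned to the event

    def on_receive(self, sender_v):
        """Merge a timestamp piggybacked on an incoming message."""
        self.v = [max(a, b) for a, b in zip(self.v, sender_v)]
        self.v[self.site_id] += 1

def happened_before(v1, v2):
    """True iff the event stamped v1 happened-before the event stamped v2."""
    return all(a <= b for a, b in zip(v1, v2)) and v1 != v2

def concurrent(v1, v2):
    """Neither happened-before the other: no causal ordering exists."""
    return not happened_before(v1, v2) and not happened_before(v2, v1)

# Two sites, A (id 0) and B (id 1): A's event, then a message to B.
a, b = VectorClock(2, 0), VectorClock(2, 1)
ts_a1 = a.local_event()               # event A1
b.on_receive(a.v)                     # B receives A's message
ts_b2 = b.local_event()               # event B2
assert happened_before(ts_a1, ts_b2)  # A1 < B2, as in the example above
```

Events at different sites with no connecting chain of messages compare as concurrent, which is exactly the case where a total order must be imposed by other means (accept stamps with node-ID tie-breaking).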

Logical Clocks
Solution: timestamp updates with logical clocks [Lamport].
Timestamping updates with the originating node's logical clock LC induces a partial order that respects potential causality.
Clock condition: e1 < e2 implies that LC(e1) < LC(e2).
1. Each site maintains a monotonically increasing clock value LC.
2. Globally visible events (e.g., updates) are timestamped with the current LC value at the generating site. Increment the local LC on each new event: LC = LC + 1.
3. Piggyback the current clock value on all messages. The receiver resets its local LC: if LC_s > LC_r then LC_r = LC_s + 1.

Flooding and the Prefix Property
In Bayou, each replica's knowledge of updates is determined by its pattern of communication with other nodes. Loosely, a site knows everything that it could know from its contacts with other nodes.
• Anti-entropy floods updates.
  Tag each update originating from site i with accept stamp (i, LC_i). Updates from each site are bulk-transmitted cumulatively in an order consistent with their source accept stamps.
• Flooding guarantees the prefix property of received updates.
  If a site knows an update u originating at site i with accept stamp LC_u, then it also knows all preceding updates w originating at site i: those with accept stamps LC_w < LC_u.

Logical Clocks: Example
[Figure: logical clock timelines for sites A, B, and C exchanging messages; each event is labeled with its LC value.]
• A6-A10: the receiver's clock is unaffected because it is "running fast" relative to the sender.
• C5: an LC update advances the receiver's clock if it is "running slow" relative to the sender.

Causality and Reconciliation
In general, a transfer from A must send B all updates that did not happen-before any update known to B.
• "Who have you talked to, and when?" "This is who I talked to." "Here's everything I know that they did not know when they talked to you."
• Can we determine which updates to propagate by comparing the logical clocks LC(A) and LC(B)? NO.

Which Updates to Propagate?
In an anti-entropy exchange, A must send B all updates known to A that are not yet known to B.
Problem: which updates are those?
One-way "push" anti-entropy exchange (Bayou reconciliation): A asks "What do you know?", B answers "Here's what I know.", and A replies "Here's what I know that you don't know."

Causality and Updates: Example
[Figure: message timelines for sites A (A1, A2, A4, A5), B (B1-B4), and C (C1, C3, C4); the message pattern establishes A1 < B2 < C3, B3 < A4, and C3 < A5.]
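Because flooding preserves the prefix property, a site's knowledge of the updates originating at each peer collapses to one counter per origin (a version vector), which is what makes the "What do you know?" / "Here's what I know that you don't know" exchange cheap. The sketch below shows that one-way push reconciliation under these assumptions, combining accept stamps, the logical-clock receive rule, and version vectors; the Site class and its method names are illustrative, not Bayou's actual interface.

```python
from collections import defaultdict

class Site:
    """Illustrative sketch of one-way 'push' reconciliation."""

    def __init__(self, site_id):
        self.site_id = site_id
        self.lc = 0                    # Lamport logical clock
        self.log = []                  # updates as (origin, lc, op)
        self.vv = defaultdict(int)     # version vector: origin -> highest LC known

    def accept_write(self, op):
        """Tag a local write with accept stamp (site_id, LC) and log it."""
        self.lc += 1
        update = (self.site_id, self.lc, op)
        self.log.append(update)
        self.vv[self.site_id] = self.lc
        return update

    def updates_unknown_to(self, peer_vv):
        """'Here's what I know that you don't know': by the prefix property,
        the peer's version vector summarizes exactly which updates it has."""
        missing = [u for u in self.log if u[1] > peer_vv.get(u[0], 0)]
        # Send in accept-stamp order so the peer's prefix property is preserved.
        return sorted(missing, key=lambda u: (u[1], u[0]))

    def receive(self, updates):
        """Apply updates pushed by a peer (assumed in per-origin order),
        advancing the logical clock per the receiver rule above."""
        for origin, lc, op in updates:
            if lc > self.vv[origin]:          # not yet known here
                self.log.append((origin, lc, op))
                self.vv[origin] = lc
                if lc > self.lc:              # if LC_s > LC_r then LC_r = LC_s + 1
                    self.lc = lc + 1

def reconcile(sender, receiver):
    """One-way anti-entropy: the receiver reports what it knows (its version
    vector), and the sender pushes exactly the missing updates."""
    receiver.receive(sender.updates_unknown_to(dict(receiver.vv)))
```

For instance, if site A has accepted writes stamped (A,1) through (A,3) and B's version vector records A only up through 1, reconcile(A, B) ships exactly the updates stamped (A,2) and (A,3).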
