1 Issues and Techniques for Weak Replication Bayou Basics Issues - PDF document

Asynchronous Replication

Idea: build available/scalable information services with read-any-write-any replication and a weak consistency model.
- no denial of service during transient network partitions
- supports massive replication without massive overhead
- “ideal for the Internet and mobile computing” [Golding92]

Asynchronous Replication and Bayou

[Figure: clients A, B, and C; replicas A, B, and C; “optimistic” asynchronous state propagation]

Problems: replicas may be out of date, may accept conflicting writes, and may receive updates in different orders.

Synchronous Replication

Basic scheme: connect each client (or front-end) with every replica: writes go to all replicas, but client can read from any replica (read-one-write-all replication).

How to ensure that each replica sees updates in the “right” order?

[Figure: client A, client B]

Problem: low concurrency, low availability, and high response times.

Partial Solution: Allow writes to any N replicas (a quorum of size N). To be safe, reads must also request data from a quorum of replicas.

Grapevine and Clearinghouse (Xerox)

Weakly consistent replication was used in earlier work at Xerox PARC:
• Grapevine and Clearinghouse name services
  Updates were propagated by unreliable multicast (“direct mail”).
• Periodic anti-entropy exchanges among replicas ensure that they eventually converge, even if updates are lost.
  Arbitrary pairs of replicas periodically establish contact and resolve all differences between their databases.
  Various mechanisms (e.g., MD5 digests and update logs) reduce the volume of data exchanged in the common case.
  Deletions handled as a special case via “death certificates” recording the delete operation as an update.
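The role of death certificates in pairwise anti-entropy can be illustrated with a small sketch. It assumes each replica is a dictionary mapping a key to a (timestamp, value) pair and that the entry with the newer timestamp wins; the `TOMBSTONE` marker and the `anti_entropy` and `visible` helpers are hypothetical names for illustration, not the Grapevine or Clearinghouse implementation.

```python
# Hypothetical sketch: pairwise anti-entropy with death certificates.
TOMBSTONE = object()  # stands in for a "death certificate"


def anti_entropy(a, b):
    """Resolve all differences between replicas a and b (newest entry wins)."""
    for key in set(a) | set(b):
        newest = max((r[key] for r in (a, b) if key in r), key=lambda e: e[0])
        a[key] = b[key] = newest


def visible(replica):
    """Entries a reader would see: keys with a death certificate are hidden."""
    return {k: v for k, (ts, v) in replica.items() if v is not TOMBSTONE}


# Delete recorded as an update: the delete propagates to the peer.
a = {"alice": (1, "host1")}
b = {"alice": (1, "host1")}
a["alice"] = (2, TOMBSTONE)
anti_entropy(a, b)

# Delete done by simply dropping the entry: the peer resurrects it.
a2 = {}
b2 = {"alice": (1, "host1")}
anti_entropy(a2, b2)
```

After the first exchange neither replica shows `alice`; after the second, the stale entry reappears at `a2`, which is the problem the death certificate avoids.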
Epidemic Algorithms

PARC developed a family of weak update protocols based on a disease metaphor (epidemic algorithms [Demers et al. OSR 1/88]):
• Each replica periodically “touches” a selected “susceptible” peer site and “infects” it with updates.
  Transfer every update known to the carrier but not the victim.
  Partner selection is randomized using a variety of heuristics.
• Theory shows that the epidemic eventually infects the entire population with high probability (assuming it is connected).
  The probability that replicas have not yet converged decreases exponentially with time.
  Heuristics (e.g., push vs. pull) affect traffic load and the expected time-to-convergence.

How to Ensure That Replicas Converge

1. Using any form of epidemic (randomized) anti-entropy, all updates will (eventually) be known to all replicas.
2. Imposing a global order on updates guarantees that all sites (eventually) apply the same updates in the same order.
3. Assuming conflict detection is deterministic, all sites will detect the same conflicts.
   Write conflicts cannot (generally) be detected when a site accepts a write; they appear when updates are applied.
4. Assuming conflict resolution is deterministic, all sites will resolve all conflicts in exactly the same way.
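The epidemic spread described above can be sketched as a small simulation. This is a minimal illustration, assuming a fully connected population, a single update, and a "push" heuristic in which every infected site contacts one uniformly random peer per round; the function name and parameters are hypothetical.

```python
import random


def push_anti_entropy(n_sites, seed=0):
    """Return the number of rounds until every site knows the update."""
    rng = random.Random(seed)   # seeded so the run is repeatable
    infected = {0}              # site 0 accepted the update
    rounds = 0
    while len(infected) < n_sites:
        rounds += 1
        # Each site that knew the update at the start of the round
        # "touches" one random peer and infects it.
        for _carrier in list(infected):
            infected.add(rng.randrange(n_sites))
    return rounds


rounds = push_anti_entropy(64)
```

Because the infected set can at most double per round, 64 sites need at least 6 rounds; the exact count depends on the random choices.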

Issues and Techniques for Weak Replication

1. How should replicas choose partners for anti-entropy exchanges?
   Topology-aware choices minimize bandwidth demand by “flooding”, but randomized choices survive transient link failures.
2. How to impose a global ordering on updates?
   logical clocks and delayed delivery (or delayed commitment) of updates
3. How to integrate new updates with existing database state?
   Propagate updates rather than state, but how to detect and reconcile conflicting updates? Bayou: user-defined checks and merge rules.
4. How to determine which updates to propagate to a peer on each anti-entropy exchange?
   vector clocks or vector timestamps
5. When can a site safely commit or stabilize received updates?
   receiver acknowledgement by vector clocks (TSAE protocol)

Bayou Basics

1. Highly available, weak replication for mobile clients.
   Beware: every device is a “server”... let’s call ‘em sites.
2. Update conflicts are detected/resolved by rules specified by the application and transmitted with the update.
   interpreted dependency checks and merge procedures
3. Stale or tentative data may be observed by the client, but may mutate later.
   The client is aware that some updates have not yet been confirmed.
   “An inconsistent database is marginally less useful than a consistent one.”

Clocks

1. physical clocks
   Protocols to control drift exist, but physical clock timestamps cannot assign an ordering to “nearly concurrent” events.
2. logical clocks
   Simple timestamps guaranteed to respect causality: “A’s current time is later than the timestamp of any event A knows about, no matter where it happened or who told A about it.”
3. vector clocks
   Order(N) timestamps that say exactly what A knows about events on B, even if A heard it from C.
4. matrix clocks
   Order(N^2) timestamps that say what A knows about what B knows about events on C.
   Acknowledgement vectors: an O(N) approximation to matrix clocks.

Update Ordering

Problem: how to ensure that all sites recognize a fixed order on updates, even if updates are delivered out of order?
Solution: Assign timestamps to updates at their accepting site, and order them by source timestamp at the receiver.
Assign nodes unique IDs: break ties with the origin node ID.
• What (if any) ordering exists between updates accepted by different sites?
  Comparing physical timestamps is arbitrary: physical clocks drift.
  Even a protocol to maintain loosely synchronized physical clocks cannot assign a meaningful ordering to events that occurred at “almost exactly the same time”.
• In Bayou, received updates may affect generation of future updates, since they are immediately visible to the user.

Causality and Logical Time

Constraint: The update ordering must respect potential causality.
• Communication patterns establish a happened-before order on events, which tells us when ordering might matter.
• Event e1 happened-before e2 iff e1 could possibly have affected the generation of e2: we say that e1 < e2.
  e1 < e2 iff e1 was “known” when e2 occurred.
  Events e1 and e2 are potentially causally related.
• In Bayou, users or applications may perceive inconsistencies if causal ordering of updates is not respected at all replicas.
  An update u should be ordered after all updates w known to the accepting site at the time u was accepted.
  e.g., the newsgroup example in the text.

Causality: Example

[Figure: event timelines for site A (A1, A2, A3, A4), site B (B1, B2, B3, B4), and site C (C1, C2, C3)]
A1 < B2 < C2
B3 < A3
C2 < A4
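The Update Ordering rule (order by source timestamp, break ties with the origin node ID) can be sketched as a sort key. This is a minimal illustration; the `Update` record and its field names are assumptions made for the example, not Bayou's data format.

```python
from typing import NamedTuple


class Update(NamedTuple):
    timestamp: int  # assigned by the accepting site
    site_id: str    # unique ID of the accepting site, used to break ties
    op: str         # the update itself (opaque here)


def ordered(log):
    """Return the updates in the fixed global order."""
    return sorted(log, key=lambda u: (u.timestamp, u.site_id))


u1 = Update(3, "A", "x = 1")
u2 = Update(3, "B", "x = 2")   # same timestamp as u1: site ID decides
u3 = Update(2, "C", "y = 7")

# Two sites that received the same updates in different delivery orders.
site1 = [u1, u2, u3]
site2 = [u2, u3, u1]
order1 = ordered(site1)
order2 = ordered(site2)
```

Both sites end up with the same sequence, so applying updates in this order gives the same result regardless of delivery order.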

Logical Clocks

Solution: timestamp updates with logical clocks [Lamport]
Timestamping updates with the originating node’s logical clock LC induces a partial order that respects potential causality.
Clock condition: e1 < e2 implies that LC(e1) < LC(e2)
1. Each site maintains a monotonically increasing clock value LC.
2. Globally visible events (e.g., updates) are timestamped with the current LC value at the generating site.
   Increment local LC on each new event: LC = LC + 1
3. Piggyback current clock value on all messages.
   Receiver resets local LC: if LC_s > LC_r then LC_r = LC_s + 1

Logical Clocks: Example

[Figure: clock values along the timelines of site A (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10), site B (0, 2, 3, 4, 5, 6, 7), and site C (0, 1, 5, 6, 7, 8)]
A6-A10: receiver’s clock is unaffected because it is “running fast” relative to sender.
C5: LC update advances receiver’s clock if it is “running slow” relative to sender.

Flooding and the Prefix Property

In Bayou, each replica’s knowledge of updates is determined by its pattern of communication with other nodes.
Loosely, a site knows everything that it could know from its contacts with other nodes.
• Anti-entropy floods updates.
  Tag each update originating from site i with accept stamp (i, LC_i).
  Updates from each site are bulk-transmitted cumulatively in an order consistent with their source accept stamps.
• Flooding guarantees the prefix property of received updates.
  If a site knows an update u originating at site i with accept stamp LC_u, then it also knows all preceding updates w originating at site i: those with accept stamps LC_w < LC_u.

Which Updates to Propagate?

In an anti-entropy exchange, A must send B all updates known to A that are not yet known to B.
Problem: which updates are those?

[Figure: one-way “push” anti-entropy exchange (Bayou reconciliation) between A and B]
“What do you know?”
“Here’s what I know.”
“Here’s what I know that you don’t know.”
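The three logical clock rules listed on the Logical Clocks slide can be sketched as a small class. This is a minimal illustration that follows the receive rule exactly as the slide states it (reset only when the sender's clock is ahead); the class and method names are hypothetical.

```python
class LogicalClock:
    """Per-site logical clock following the three rules on the slide."""

    def __init__(self):
        self.lc = 0

    def tick(self):
        """Rule 2: increment on each new event and return its timestamp."""
        self.lc += 1
        return self.lc

    def send(self):
        """Rule 3: piggyback the current clock value on a message."""
        return self.lc

    def receive(self, lc_sender):
        """Rule 3: if LC_s > LC_r then LC_r = LC_s + 1, else unaffected."""
        if lc_sender > self.lc:
            self.lc = lc_sender + 1
        return self.lc


a, c = LogicalClock(), LogicalClock()
for _ in range(4):
    a.tick()                           # a.lc == 4
c.tick()                               # c.lc == 1
slow_result = c.receive(a.send())      # receiver was "running slow"
a.tick()
a.tick()                               # a.lc == 6
fast_result = a.receive(c.send())      # receiver was "running fast"
```

In this run the slow receiver jumps from 1 to 5, while the fast receiver stays at 6.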
Causality and Reconciliation

In general, a transfer from A must send B all updates that did not happen-before any update known to B.

[Figure: exchange between A and B]
“Who have you talked to, and when?”
“This is who I talked to.”
“Here’s everything I know that they did not know when they talked to you.”

Can we determine which updates to propagate by comparing logical clocks LC(A) and LC(B)? NO.

Causality and Updates: Example

[Figure: event timelines for site A (A1, A2, A4, A5), site B (B1, B2, B3, B4), and site C (C1, C3, C4)]
A1 < B2 < C3
B3 < A4
C3 < A5
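Since a single pair of scalar clocks is not enough, one way to answer the "who have you talked to, and when?" question is with a vector of per-site accept stamps, as the issues list suggests. The sketch below assumes the prefix property holds, so the highest accept stamp known per originating site summarizes everything a site knows; updates are reduced to bare (site, LC) accept stamps, and both helper names are hypothetical.

```python
def version_vector(log):
    """Highest accept stamp known for each originating site."""
    vv = {}
    for site, lc in log:
        vv[site] = max(vv.get(site, 0), lc)
    return vv


def updates_to_send(sender_log, receiver_vv):
    """Updates the sender knows that the receiver's vector does not cover."""
    missing = [(site, lc) for site, lc in sender_log
               if lc > receiver_vv.get(site, 0)]
    # Sorting by stamp keeps each site's updates in accept-stamp order.
    return sorted(missing, key=lambda u: (u[1], u[0]))


log_a = [("A", 1), ("A", 2), ("B", 2), ("A", 4)]
log_b = [("B", 1), ("B", 2), ("A", 1), ("C", 3)]

vv_b = version_vector(log_b)            # B's answer to "what do you know?"
to_b = updates_to_send(log_a, vv_b)     # what A pushes to B
```

Here B already covers site B up to stamp 2 and site A up to stamp 1, so A only needs to push its own updates with stamps 2 and 4.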
