Group Communication - Shan-Hung Wu and DataLab CS, NTHU

  1. Group Communication Shan-Hung Wu and DataLab CS, NTHU

  2. Outline
     • Group Communication
     • Basic Abstraction
       – Perfect Point to Point Link
       – Perfect Failure Detection
     • Reliable Broadcast
       – Best Effort Broadcast
       – Reliable Broadcast
       – Uniform Reliable Broadcast
     • Consensus
       – Regular Consensus
       – Total Order Broadcast
     • Paxos
       – Basic Paxos
       – Zab
       – Other Variants: Multi-Paxos, Fast Paxos, and Generalized Paxos

  4. Group Communication
     • Group communication provides multipoint-to-multipoint communication
       – It guarantees certain properties

  5. Difficulties in Group Communication
     • Challenges
       – Message delay or loss
       – Out-of-order delivery
       – Node failure
       – Link failure
     • In practice, it is difficult to tell whether a node or a link has failed

  7. Perfect Point to Point Link
     • How to cope with message loss?
       – Retransmit messages and eliminate duplicates

  8. [Figure: a message to be sent between p1 and p2; label: "Message loss"]

  9. Perfect Point to Point Link
     • Properties
       – Reliable delivery: if neither the sender nor the receiver crashes, then the receiver eventually delivers every message sent by the sender
         • Keep retransmitting the message until an ACK is received
       – No duplication: a receiver may receive a message many times, but delivers it at most once
         • Sequence number
       – No creation: if a message is delivered, it must have been sent by some process
         • Checksum

  10. Perfect Point to Point Link
      • A simplified implementation without ACKs
      [Figure annotation: "Retransmit all messages periodically"]
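
The retransmit-and-deduplicate idea can be sketched in a few lines of Python. This is a single-process simulation under stated assumptions, not the implementation from the slides: `LossyChannel`, `PerfectLinkSender`, and `PerfectLinkReceiver` are hypothetical names, and the channel is assumed to drop the first two copies of every message.

```python
class LossyChannel:
    """Hypothetical lossy channel: drops the first `drops` copies of each message."""

    def __init__(self, drops):
        self.drops = drops
        self.attempts = {}      # (sender, seq) -> number of transmissions so far
        self.receiver = None

    def transmit(self, sender, seq, payload):
        key = (sender, seq)
        self.attempts[key] = self.attempts.get(key, 0) + 1
        if self.attempts[key] > self.drops:
            self.receiver.on_receive(sender, seq, payload)


class PerfectLinkSender:
    def __init__(self, name, channel):
        self.name = name
        self.channel = channel
        self.next_seq = 0
        self.sent = []          # every message ever sent, kept for retransmission

    def send(self, payload):
        self.sent.append((self.next_seq, payload))
        self.next_seq += 1

    def retransmit_all(self):
        # Meant to be called periodically by a timer: resend everything, no ACKs.
        for seq, payload in self.sent:
            self.channel.transmit(self.name, seq, payload)


class PerfectLinkReceiver:
    def __init__(self):
        self.seen = set()       # (sender, seq) pairs already delivered
        self.delivered = []

    def on_receive(self, sender, seq, payload):
        if (sender, seq) in self.seen:   # no duplication: deliver at most once
            return
        self.seen.add((sender, seq))
        self.delivered.append(payload)


channel = LossyChannel(drops=2)
receiver = PerfectLinkReceiver()
channel.receiver = receiver
sender = PerfectLinkSender("p1", channel)
sender.send("a")
sender.send("b")
for _ in range(5):              # five timer ticks
    sender.retransmit_all()
print(receiver.delivered)
```

The sequence number is what lets the receiver tell a retransmission from a new message. Without ACKs the sender never learns when it may stop, which is the price of the simplification.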

  11. Perfect Failure Detection
      • How to detect a node failure?
        – Use timeouts on heartbeats
        – If no heartbeat has been received from a process p for a long time, deem p to have crashed

  12. Perfect Failure Detection
      • Uses:
        – PerfectPointToPointLink
      • Properties
        – Strong completeness: eventually, every crashed process is detected by every correct process, so every correct process knows which processes are still alive
          • Achieved by broadcasting which nodes have failed, or by letting every process detect failures itself
        – Strong accuracy: if a process p is detected by any process, then p has crashed
          • A process is detected as failed if and only if it has crashed

  13. Perfect Failure Detection
      [Figure annotation: "Send heartbeat messages to all processes"]
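
Heartbeat-based detection can be sketched as follows. The class name `HeartbeatFailureDetector` and the explicit `now` timestamps are assumptions made for illustration. The sketch also assumes that `timeout` is a true upper bound on the gap between heartbeats; without such a bound, a timeout only justifies a suspicion, and strong accuracy would not hold.

```python
class HeartbeatFailureDetector:
    """Timeout-based detector; `timeout` is an assumed bound on heartbeat gaps."""

    def __init__(self, processes, timeout):
        self.timeout = timeout
        self.last_heartbeat = {p: 0.0 for p in processes}
        self.crashed = set()

    def on_heartbeat(self, process, now):
        self.last_heartbeat[process] = now

    def check(self, now):
        """Return the processes newly detected as crashed at time `now`."""
        newly = set()
        for process, last in self.last_heartbeat.items():
            if process not in self.crashed and now - last > self.timeout:
                newly.add(process)
        self.crashed |= newly
        return newly


fd = HeartbeatFailureDetector(["p1", "p2", "p3"], timeout=2.5)
for t in (1.0, 2.0, 3.0, 4.0, 5.0):
    fd.on_heartbeat("p1", t)
    fd.on_heartbeat("p3", t)
    if t <= 2.0:
        fd.on_heartbeat("p2", t)    # p2 stops sending heartbeats after t = 2.0
first = fd.check(5.0)               # p2 has been silent for 3.0 > 2.5
second = fd.check(5.5)              # nothing new to report
print(first, second)
```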

  15. Broadcast
      • A broadcast abstraction enables a process to send a message to all processes in a system, including itself
      • A naïve approach
        – Try to broadcast the message to as many nodes as possible

  16. Best Effort Broadcast
      [Figure: processes p1, p2, p3, p4]

  17. Best Effort Broadcast
      • Uses:
        – PerfectPointToPointLink
        – PerfectFailureDetection
      • Properties
        – Best-effort validity
          • For any two processes p_i and p_j: if p_i and p_j are both correct, then every message broadcast by p_i is eventually delivered by p_j
        – No duplication
        – No creation

  18. Best Effort Broadcast
      • How to achieve best effort broadcast?
        – For the first property, the sender uses PerfectPointToPointLink to send the message to all receivers that have not been detected as failed by PerfectFailureDetection
        – The other two properties are covered by PerfectPointToPointLink

  19. Best Effort Broadcast
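
A minimal sketch of best effort broadcast, assuming an in-memory stand-in for the link layer: the `Process` class and the `detected_failed` set are hypothetical, and a direct call to `deliver` plays the role of a PerfectPointToPointLink send.

```python
class Process:
    def __init__(self, name):
        self.name = name
        self.delivered = []

    def deliver(self, sender, message):
        self.delivered.append((sender, message))


def best_effort_broadcast(sender, message, processes, detected_failed):
    """Send `message` once to every process not detected as failed.

    The sender is in `processes` too, so it delivers its own message.
    """
    for process in processes:
        if process.name in detected_failed:
            continue                # skip processes reported by the failure detector
        process.deliver(sender.name, message)   # stands in for a perfect link send


p1, p2, p3, p4 = (Process(n) for n in ("p1", "p2", "p3", "p4"))
best_effort_broadcast(p1, "m", [p1, p2, p3, p4], detected_failed={"p4"})
print([p.delivered for p in (p1, p2, p3, p4)])
```

Nothing here protects against the sender crashing halfway through the loop; in that case only a prefix of the processes receives the message.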

  20. Is This Reliable?
      • Is best effort broadcast enough to have every correct process receive the message?
        – No. If the sender fails, the remaining correct processes may not deliver the message

  21. Reliable Broadcast
      • Reliable broadcast ensures that all correct processes deliver the same messages even if the sender fails
      • How?
        – If the sender is detected to have crashed, the other processes relay its messages to all

  22. Reliable Broadcast
      [Figure: processes p1, p2, p3, p4; labels: "Crash", "Detected", "Relay"]

  23. Reliable Broadcast
      • Uses:
        – BestEffortBroadcast
        – PerfectFailureDetection
      • Properties
        – Validity
          • If a correct process p_i broadcasts a message m, then p_i eventually delivers m
        – No duplication
        – No creation
        – Agreement
          • If a message m is delivered by some correct process p_i, then m is eventually delivered by every correct process p_j

  24. Reliable Broadcast
      [Figure annotations: "Log the broadcast message"; "Relay all broadcast messages coming from the failed process"]
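
The log-and-relay scheme can be sketched as below. This is an in-memory simulation under stated assumptions: `Node`, `Network`, and the `only` parameter (which models a sender that crashes after reaching just some receivers) are hypothetical, and crash detection is triggered by hand rather than by a real failure detector.

```python
class Node:
    def __init__(self, name, network):
        self.name = name
        self.network = network
        self.delivered = set()      # (origin, message) pairs, for no duplication
        self.log = {}               # origin -> messages delivered from that origin
        self.detected = set()       # processes this node has detected as crashed

    def on_receive(self, origin, message):
        if (origin, message) in self.delivered:
            return
        self.delivered.add((origin, message))
        self.log.setdefault(origin, set()).add(message)   # log the broadcast message
        if origin in self.detected:                       # origin already known dead
            self.network.best_effort_broadcast(origin, message)

    def on_crash_detected(self, crashed):
        self.detected.add(crashed)
        for message in sorted(self.log.get(crashed, ())):  # relay everything from it
            self.network.best_effort_broadcast(crashed, message)


class Network:
    def __init__(self, names):
        self.nodes = {n: Node(n, self) for n in names}
        self.crashed = set()

    def best_effort_broadcast(self, origin, message, only=None):
        # `only` models a sender that crashes after reaching just some receivers.
        for name in (only if only is not None else self.nodes):
            if name not in self.crashed:
                self.nodes[name].on_receive(origin, message)


net = Network(["p1", "p2", "p3", "p4"])
net.best_effort_broadcast("p1", "m", only=["p1", "p2"])   # p1 crashes mid-broadcast
net.crashed.add("p1")
for name in ("p2", "p3", "p4"):
    net.nodes[name].on_crash_detected("p1")
print({name: sorted(net.nodes[name].delivered) for name in ("p2", "p3", "p4")})
```

In this run only p2 received the message directly; p3 and p4 obtain it through p2's relay once the crash of p1 is detected.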

  25. Reliable Broadcast Meets Database
      • Can it be used for GC-based eager replication?
        – To broadcast the effects of committed txs
      • Problems:
        – A process may deliver the messages too early
        – If this process crashes, other processes may not see the messages
      • Fails to ensure durability in the DB world
        – Some committed txs are not propagated

  26. Uniform Reliable Broadcast
      • Ensures that failed nodes do not deliver messages that the other nodes never learn of
      • A process can deliver a message only when it knows that all the other correct processes have received the message and returned an ACK

  27. Uniform Reliable Broadcast
      [Figure: processes p1, p2, p3, p4]

  28. Uniform Reliable Broadcast
      • Uses:
        – BestEffortBroadcast
        – PerfectFailureDetection
      • Properties
        – Validity
        – No duplication
        – No creation
        – Uniform agreement
          • If a message m is delivered by some process p_i (whether correct or faulty), then m is also eventually delivered by every correct process p_j

  29. Uniform Reliable Broadcast
      [Figure annotation: "Deliver the message only if it received ACKs from all correct processes"]
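
The "ACKs from all correct processes" rule can be sketched as follows. This is a simplified simulation, not the slide's listing: `UrbNode` and `Network` are hypothetical names, a relayed copy of a message is treated as the relayer's ACK, and messages addressed to a crashed process are simply dropped.

```python
from collections import deque


class UrbNode:
    def __init__(self, name, all_names, network):
        self.name = name
        self.all_names = set(all_names)
        self.network = network
        self.detected = set()    # processes detected as crashed
        self.pending = set()     # (origin, message) pairs seen so far
        self.acks = {}           # (origin, message) -> processes that relayed it
        self.delivered = []

    def broadcast(self, message):
        self.pending.add((self.name, message))
        self.network.send_to_all(self.name, self.name, message)

    def on_receive(self, relayer, origin, message):
        key = (origin, message)
        self.acks.setdefault(key, set()).add(relayer)   # a relay counts as an ACK
        if key not in self.pending:
            self.pending.add(key)
            self.network.send_to_all(self.name, origin, message)
        self.try_deliver()

    def on_crash_detected(self, crashed):
        self.detected.add(crashed)
        self.try_deliver()

    def try_deliver(self):
        correct = self.all_names - self.detected
        for key in sorted(self.pending):
            # Deliver only once every process not detected as crashed has ACKed.
            if key not in self.delivered and correct <= self.acks.get(key, set()):
                self.delivered.append(key)


class Network:
    def __init__(self, names):
        self.queue = deque()
        self.crashed = set()
        self.nodes = {n: UrbNode(n, names, self) for n in names}

    def send_to_all(self, relayer, origin, message):
        for dest in self.nodes:
            self.queue.append((dest, relayer, origin, message))

    def run(self):
        while self.queue:
            dest, relayer, origin, message = self.queue.popleft()
            if dest not in self.crashed:
                self.nodes[dest].on_receive(relayer, origin, message)


net = Network(["p1", "p2", "p3"])
net.crashed.add("p3")                       # p3 is down, but nobody knows yet
net.nodes["p1"].broadcast("m")
net.run()
before = [list(net.nodes[n].delivered) for n in ("p1", "p2")]
for name in ("p1", "p2"):
    net.nodes[name].on_crash_detected("p3")
after = [list(net.nodes[n].delivered) for n in ("p1", "p2")]
print(before, after)
```

Until the crash of p3 is detected, neither p1 nor p2 delivers, because p3's ACK is still missing. Delivery happens only once p3 is removed from the set of processes considered correct.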

  31. Consensus
      • Consensus: all participants want to decide on a single value
      • Specified in terms of two primitives: propose and decide
        – Each process has an initial value that it proposes for the agreement through the primitive propose

  32. Consensus
      • Uses:
        – BestEffortBroadcast
        – PerfectFailureDetection
      • Properties
        – Termination
          • Every correct process eventually decides some value
        – Validity
          • If a process decides v, then v was proposed by some process
        – Integrity
          • No process decides twice
        – Agreement
          • No two correct processes decide differently

  33. How?

  34. Flooding Consensus
      • A consensus instance requires two rounds:
        – Round 1
          • Every process proposes a value and broadcasts it to the others
          • A process can decide once it knows it has seen all proposed values that will be considered by correct processes for possible decision
          • The decision is made by a deterministic function
          • It is fine for many processes to make the decision, since their decisions are all the same
        – Round 2
          • The process that made the decision broadcasts the decision to all

  35. Flooding Consensus
      [Figure: four processes p1 to p4 with proposals Propose(2), Propose(3), Propose(5), Propose(7). Labels: "Can decide upon arrival of all proposals of processes in current view"; "Decide(2 = min(2, 3, 5, 7))"; "Decide(2)"; "(3, 5, 7)"; "Cannot decide, starts another round"; "Crash detected"]

  36. Flooding Consensus
      [Figure annotations: "Arrival of all proposals of processes in current view"; "Relay the decision"]
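
The flooding idea can be sketched as a lock-step simulation. This is a simplified sketch, not the slide's algorithm verbatim: it assumes synchronous rounds, uses `min` as the deterministic decision function, and lets a process decide when it heard proposals from the same set of processes as in the previous round (meaning no new crash was observed). The `crash_plan` parameter is a hypothetical knob for injecting crashes.

```python
def flooding_consensus(proposals, crash_plan=None):
    """Simulate flooding consensus in synchronous rounds.

    proposals:  {process: proposed value}
    crash_plan: {process: (round, receivers)} - the process crashes in that
                round after its broadcast reached only `receivers`.
    Returns {process: decided value} for the processes that did not crash.
    """
    crash_plan = crash_plan or {}
    alive = set(proposals)
    known = {p: {v} for p, v in proposals.items()}        # values seen so far
    heard_prev = {p: set(proposals) for p in proposals}   # "round 0": everyone
    decided = {}
    rnd = 1
    while any(p not in decided for p in alive):
        inbox = {p: [] for p in alive}
        crashing = set()
        for sender in sorted(alive):
            if sender in decided:
                message = ("decision", decided[sender])    # relay the decision
            else:
                message = ("proposals", frozenset(known[sender]))
            targets = set(alive)
            if crash_plan.get(sender, (None, ()))[0] == rnd:
                targets &= set(crash_plan[sender][1])
                crashing.add(sender)
            for target in targets:
                inbox[target].append((sender, message))
        alive -= crashing
        for p in sorted(alive):
            if p in decided:
                continue
            heard = set()
            for sender, (kind, body) in inbox[p]:
                if kind == "decision":
                    decided[p] = body                      # adopt a relayed decision
                else:
                    known[p] |= body
                    heard.add(sender)
            if p not in decided and heard == heard_prev[p]:
                decided[p] = min(known[p])                 # deterministic function
            heard_prev[p] = heard
        rnd += 1
    return {p: decided[p] for p in sorted(alive)}


values = {"p1": 2, "p2": 3, "p3": 5, "p4": 7}
no_failure = flooding_consensus(values)
partial = flooding_consensus(values, {"p1": (1, ["p2"])})   # p1 reaches only p2
silent = flooding_consensus(values, {"p1": (1, [])})        # p1 reaches nobody
print(no_failure, partial, silent)
```

In the `partial` run, p2 sees all four proposals and decides in round 1, while p3 and p4 notice a missing proposal, cannot decide, and adopt p2's relayed decision in the next round. In the `silent` run the value 2 is never seen by a surviving process, so the survivors agree on the smallest value they do know.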

  37. Any Alternative?
      • Processes could fail during Rounds 1 and 2
      • Why not use reliable broadcast?
        – All correct processes should receive all the proposals!
        – Every process decides (deterministically) the same value
        – No need for Round 2 any more!
      • However, if any process fails, the rest need to relay the proposals
      • Why not just relay the decision?
        – This is exactly the purpose of Round 2!

  38. Performance of Flooding Consensus
      • Regular: 2 steps
      • Each failure causes the start of a new round
      • Best case (no failures)
        – Single communication step in round 1
      • Worst case (failure in every step)
        – At most N steps, where N is the number of processes
      • Each step requires O(N^2) messages to be exchanged
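
The message counts above can be turned into a small upper-bound calculation. The function names are hypothetical, and the sketch assumes that every process broadcasts to all N processes (itself included) in every step.

```python
def messages_per_step(n):
    """Each of the n processes sends one message to each of the n processes."""
    return n * n


def worst_case_messages(n):
    """Upper bound: at most n steps, each costing messages_per_step(n)."""
    return n * messages_per_step(n)


for n in (4, 10):
    print(n, messages_per_step(n), worst_case_messages(n))
```

Combining the two bullets gives a worst case of O(N^3) messages per consensus instance under these assumptions.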

  39. Is This Enough for a Deterministic Database System?

  40. Total Order Broadcast
      • Total order broadcast is a reliable broadcast abstraction that ensures all processes deliver messages in the same order
