P2P Systems: Gossip Protocols. CS 6410. By Alane Suhr & Danny Adams.



SLIDE 1

P2P Systems: Gossip Protocols

CS 6410 By Alane Suhr & Danny Adams

SLIDE 2

Outline


❖ Timeline
❖ CAP Theorem
❖ Epidemic algorithms for replicated database maintenance
❖ Managing update conflicts in Bayou, a weakly connected replicated storage system
❖ Conclusion

SLIDE 3

Timeline


1978: Lamport, "Time, Clocks, and the Ordering of Events in a Distributed System"
1982: Lamport, "The Byzantine Generals Problem"
1985: FLP, "Impossibility of Distributed Consensus with One Faulty Process"
1987: Demers, "Epidemic Algorithms for Replicated Database Maintenance"
1990: Schneider, "Implementing Fault-Tolerant Services Using the State Machine Approach: A Tutorial"
1995: Terry, "Managing Update Conflicts in Bayou, a Weakly Connected Replicated Storage System"
1998: Lamport, "The Part-Time Parliament"

SLIDE 4

CAP

  • Consistency: all nodes contain the same state
  • Availability: requests are responded to promptly
  • Partition: a part of the system that is completely cut off from the rest of the system; ideally it should maintain itself autonomously
  • Partition tolerance: the system stays online and functional even when message passing fails

SLIDE 5

CAP Theorem: Paxos & Gossip

  • Paxos: prioritizes consistency given a network partition
  • Gossip: prioritizes availability given a network partition

SLIDE 6

Gossip

SLIDE 7

Gossip Overview


❏ Authors
❏ Motivations
❏ Epidemic Models
❏ Direct Mail
❏ Anti-Entropy
❏ Rumor Mongering
❏ Evaluation
❏ Death Certificates
❏ Spatial Distribution

SLIDE 8

Authors

  • Alan Demers, Cornell University
  • Dan Greene, PARC Research
  • Scott Shenker, EECS Berkeley
  • Doug Terry, Amazon Web Services
  • Carl Hauser, PhD Cornell; Washington State University

SLIDE 9

Motivations


  • Unreliable network
  • Unreliable nodes
  • CAP: choose AP
    ○ always be able to respond to a (read/write) request
    ○ eventual consistency

SLIDE 10
Epidemic Models

SLIDE 11

Proposers and Acceptors

  • Proposer
    ○ In Paxos: clients propose an update to the database
    ○ Epidemic model: a node infects its neighbors
  • Acceptor
    ○ In Paxos: an acceptor accepts an update based on one or more proposals
    ○ Epidemic model: a node is infected by a neighbor

SLIDE 12

Types of Epidemics

❖ Direct Mail
❖ Anti-Entropy
❖ Rumor Mongering

SLIDE 13

Advantages


➢ Simple algorithms
➢ High availability
➢ Fault tolerant
➢ Tunable
➢ Scalable
➢ Works under network partitions

SLIDE 14

Direct Mail


  • Notify all neighbors of an update
  • Timely and reasonably efficient
  • n messages per update

SLIDES 15-16: [Animation: the originating site mails the update to each of its neighbors.]

SLIDE 17

Direct Mail

Messages sent: O(n), where n is the number of neighbors
Not fault tolerant: doesn't guarantee eventual consistency
High volume of traffic, with the originating site at the epicenter
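The mechanism above can be sketched in a few lines. This is a toy model, not the paper's implementation; the names (`Site`, `update`, `receive`) and the last-writer-wins timestamp rule are illustrative:

```python
# Minimal sketch of Direct Mail: on a write, the originating site
# immediately mails the new value to every neighbor (n messages).
class Site:
    def __init__(self, name):
        self.name = name
        self.db = {}          # key -> (value, timestamp)
        self.neighbors = []   # other Site objects

    def update(self, key, value, ts):
        """Local write, then direct-mail it to all n neighbors."""
        self.db[key] = (value, ts)
        for nb in self.neighbors:       # if a message is lost here, that
            nb.receive(key, value, ts)  # neighbor may never converge

    def receive(self, key, value, ts):
        # keep the newer write (last-writer-wins on timestamps)
        if key not in self.db or self.db[key][1] < ts:
            self.db[key] = (value, ts)

a, b, c = Site("A"), Site("B"), Site("C")
a.neighbors = [b, c]
a.update("x", 42, ts=1)
print(b.db["x"])  # (42, 1)
```

The fragility is visible in `update`: delivery is one-shot, so a single dropped message silently breaks eventual consistency.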

17

D

slide-18
SLIDE 18

Anti-Entropy


❏ A site chooses a random partner to share data with
❏ Number of rounds until consistency: O(log n)
❏ Sites use custom protocols to resolve conflicts
❏ Fault tolerant
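The steps above can be sketched as a simulation. This is a hedged toy model (function name, 16-site network, and timestamp rule are illustrative, not from the paper), using full-database push-pull reconciliation:

```python
import random

random.seed(7)  # deterministic run for the demo

def anti_entropy_round(dbs):
    """One push-pull anti-entropy round: every site reconciles with a random
    partner, and per key the write with the newer timestamp wins.
    `dbs` is a list of dicts mapping key -> (value, timestamp)."""
    for i in range(len(dbs)):
        j = random.choice([k for k in range(len(dbs)) if k != i])
        merged = dict(dbs[i])
        for key, (val, ts) in dbs[j].items():
            if key not in merged or merged[key][1] < ts:
                merged[key] = (val, ts)
        dbs[i].clear(); dbs[i].update(merged)
        dbs[j].clear(); dbs[j].update(merged)

# One site starts with an update; a handful of rounds (~log n)
# spreads it to all 16 sites.
dbs = [{} for _ in range(16)]
dbs[0]["x"] = ("hello", 1)
rounds = 0
while any("x" not in db for db in dbs):
    anti_entropy_round(dbs)
    rounds += 1
print(all("x" in db for db in dbs))  # True
```

Real systems avoid shipping the whole database each round (e.g. by comparing checksums first); the full-copy exchange here is only for clarity.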

SLIDE 19

Anti-Entropy

[Animation, slides 19-27: the update spreads through repeated random pairwise exchanges until all sites are consistent.]

What happens next?

SLIDE 28

Mechanism: Push & Pull

SLIDE 29

Push vs. Pull

Push: {A, B} and {A, C} become {A, B} and {A, B, C} (the initiator sends its data to its partner)
Pull: {A, B} and {A, C} become {A, B, C} and {A, C} (the initiator fetches its partner's data)

SLIDE 30

What is Push-Pull?

Push-pull: {A, B} and {A, C} exchange data in both directions, so both end with {A, B, C}

SLIDE 31

Propagation times of Push vs. Pull

Let p_i = probability that a node has not yet received the update after round i.

Push: p_{i+1} = p_i * e^{-1}
Pull: p_{i+1} = p_i^2

Pull is faster!
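A quick numeric check of these recurrences (demo code, not from the paper): starting from half the sites susceptible, pull's squaring shrinks p doubly exponentially while push only shrinks it by a constant factor per round.

```python
import math

# Iterate the slide's recurrences for the probability p that a site
# is still susceptible after round i.
p_push = p_pull = 0.5
for _ in range(5):
    p_push *= math.exp(-1)  # push: p_{i+1} = p_i * e^{-1}
    p_pull **= 2            # pull: p_{i+1} = p_i^2

print(p_push)  # ~3.4e-3 after 5 rounds
print(p_pull)  # ~2.3e-10: pull closes the gap far faster
```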

SLIDE 32

Rumor Mongering

1. Sites choose a random neighbor to share information with
2. Transmission rate is tunable
3. How long new updates remain "interesting" is also tunable
4. Can use push or pull mechanisms
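The mechanics above can be sketched as a push rumor-mongering simulation with a "lose interest with probability 1/k" rule, one of the paper's variants; the function name, parameters, and uniform target choice (a sender may even pick itself) are illustrative simplifications:

```python
import random

def rumor_mongering(n, k, seed=None):
    """Push rumor mongering: each round, every infective site forwards the
    rumor to a uniformly random site; on hitting a site that already knows
    it, the sender loses interest with probability 1/k.  Returns the
    residue: the fraction of sites that never hear the rumor."""
    rng = random.Random(seed)
    knows = [False] * n
    knows[0] = True
    infective = {0}
    while infective:
        for s in list(infective):
            t = rng.randrange(n)
            if knows[t]:
                if rng.random() < 1.0 / k:
                    infective.discard(s)  # sender loses interest
            else:
                knows[t] = True
                infective.add(t)          # recipient starts spreading too
    return 1 - sum(knows) / n

# Larger k keeps rumors "interesting" longer, shrinking the residue.
print(rumor_mongering(1000, k=3, seed=1))  # small: a few percent at most
```

Tuning `k` is the transmission-rate knob from point 2: it trades extra traffic for a smaller chance that a site misses the update.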

SLIDE 33

Rumor Mongering Complexity

  • O(ln n) rounds lead to consistency with high probability
  • Push requires O(n ln n) transmissions until consistency
  • Karp et al. further proved a lower bound of Ω(n ln ln n) transmissions for push-pull protocols

Karp et al., 2000. "Randomized Rumor Spreading." In FOCS.

SLIDE 34

Analogy to epidemiology

  • Susceptible: site does not know an update yet
  • Infective: actively sharing an update
  • Removed: updated and no longer sharing

Rumor mongering: nodes go from susceptible to infective and eventually (probabilistically) to removed

SLIDE 35

Rumor Mongering

[Animation, slides 35-41: infective sites spread the rumor to random neighbors; senders gradually lose interest and become removed.]

SLIDE 42

Rumor mongering

Pros:

  • Fast
  • Low call on resources
  • Fault-tolerant
  • Less traffic

Cons:

  • A site can potentially miss an update

SLIDE 43

Backups

  • Anti-entropy can be used to "update" the network regularly after direct mail or rumor mongering
  • If an inconsistency is found during anti-entropy, run the original algorithm again

SLIDE 44

Death Certificates

❖ How are items deleted using epidemic models?


SLIDE 45

[Cartoon: one site announces "I like Bread", another replies "I DON'T like Bread!", and a third says "I like orange juice": conflicting replicated updates.]

SLIDE 46

Death Certificates

❖ How are items removed under the epidemic model?

❖ Drawbacks
  ➢ Space
  ➢ Increased traffic
  ➢ Death certificates can be lost
❖ Dormant death certificates & retention
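The idea can be sketched as tombstones in a last-writer-wins store. This is a hedged toy model (the names `TOMBSTONE`, `delete`, `merge`, `expire` are illustrative, not the paper's API):

```python
# A delete is just another replicated item: a tombstone carrying the
# deletion timestamp, so it wins over older writes during gossip.
TOMBSTONE = object()

def delete(db, key, ts):
    """Replace the item with a death certificate instead of removing the key."""
    db[key] = (TOMBSTONE, ts)

def merge(db_a, db_b):
    """Per-key last-writer-wins merge, as in anti-entropy; death
    certificates propagate exactly like ordinary updates."""
    for key in set(db_a) | set(db_b):
        entries = [e for e in (db_a.get(key), db_b.get(key)) if e is not None]
        winner = max(entries, key=lambda e: e[1])
        db_a[key] = db_b[key] = winner

def expire(db, now, retention):
    """Reclaim space by dropping certificates past the retention window;
    the risk is that a long-partitioned site may resurrect the item."""
    for key in [k for k, (v, ts) in db.items()
                if v is TOMBSTONE and now - ts > retention]:
        del db[key]

a = {"bread": ("I like Bread", 1)}
b = {"bread": ("I like Bread", 1)}
delete(a, "bread", ts=2)           # the newer write is a deletion
merge(a, b)                        # the certificate gossips to b
print(b["bread"][0] is TOMBSTONE)  # True
```

The `expire` step is exactly the space/safety trade-off the drawbacks list names: keeping certificates forever costs space, but dropping them too early risks resurrection, which is what dormant death certificates mitigate.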

SLIDE 47

Evaluating Epidemic Models

➢ Residue: the susceptible sites remaining when the epidemic finishes
➢ Traffic: average number of messages sent per site
➢ Delay:
  ○ T_avg: average time between the start of an outbreak and the arrival of the update at a given site
  ○ T_last: delay until the last site receives the update

SLIDE 48

Spatial Distribution: Helping or Hurting?

SLIDE 49

Convergence Times and Traffic

  • Linear network, anti-entropy:
    ○ Nearest-neighbor connections: O(n) convergence, O(1) traffic
    ○ Random connections: O(log n) convergence, O(n) traffic

SLIDE 50

Optimizations for realistic network distributions

  • Select connections from a list of neighbors sorted by distance
  • Treat the network as linear
  • Compute probabilities based on position in the list

SLIDE 51

Rumor Mongering with a Non-Standard Distribution

  • Increase k, the number of rounds a rumor stays "interesting"
  • Use push-pull

SLIDE 52

Takeaways

  • Availability >> consistency
  • Updates can be expensive
  • Distribution protocols should be robust
  • Network design can hurt overall performance
  • Byzantine behavior is not addressed

Questions?

SLIDE 53

Managing update conflicts in Bayou, a weakly connected replicated storage system (1995)

Additional Reading

SLIDE 54
  • Weak consistency makes applications over unstable networks possible
  • Developing good interfaces allows complex functions like merging to be interchangeable via the application

SLIDE 55

Timeline

1978: Lamport, "Time, Clocks, and the Ordering of Events in a Distributed System"
1982: Lamport, "The Byzantine Generals Problem"
1985: FLP, "Impossibility of Distributed Consensus with One Faulty Process"
1987: Demers, "Epidemic Algorithms for Replicated Database Maintenance"
1990: Schneider, "Implementing Fault-Tolerant Services Using the State Machine Approach: A Tutorial"
1995: Terry, "Managing Update Conflicts in Bayou, a Weakly Connected Replicated Storage System"
1998: Lamport, "The Part-Time Parliament"

SLIDE 56

What is Bayou?

  • Storage system designed for mobile computing
    ○ Network is not stable
    ○ Parts of the network may not be connected all the time
    ○ Goal: high availability
    ○ Guarantees weak consistency

SLIDE 57

Bayou System Diagram

[Diagram: clients send Writes (each with a unique ID) and Read requests to servers; servers return data and exchange updates with each other via anti-entropy.]

SLIDE 58

Consistent Replicas

  • Writes are first tentative
  • Eventually they are committed, ordered by time
  • Clients can tell whether writes are stable (committed)
  • Primary servers deal with committing updates

SLIDE 59

Detecting and Resolving Conflicts

  • Dependency checks
  • Merge procedures
  • Both are described by the clients and are application-dependent

SLIDE 60

Conclusions

  • Distributed systems need a form of consensus
  • Choosing the correct consensus model for a system must be weighed carefully against the attributes of the system

SLIDE 61

Acknowledgements

Content inspired by:
• Ki Suh Lee, "Epidemic Techniques" (2009)
• Eugene Bagdasaryan, "P2P Gossip Protocols" (2016)

Photos: www.pixabay.com, www.unsplash.com, www.1001freedownloads.com/free-cliparts