P2P Systems: Gossip Protocols. CS 6410. By Alane Suhr & Danny Adams.



SLIDE 1

P2P Systems: Gossip Protocols

CS 6410 By Alane Suhr & Danny Adams

SLIDE 2

Outline


❖ Timeline
❖ CAP Theorem
❖ Epidemic algorithms for replicated database maintenance
❖ Managing update conflicts in Bayou, a weakly connected replicated storage system
❖ Conclusion

SLIDE 3

Timeline


1978: Lamport, "Time, Clocks, and the Ordering of Events in a Distributed System"
1982: Lamport, "The Byzantine Generals Problem"
1985: FLP, "Impossibility of Distributed Consensus with One Faulty Process"
1987: Demers, "Epidemic Algorithms for Replicated Database Maintenance"
1990: Schneider, "Implementing Fault-Tolerant Services Using the State Machine Approach: A Tutorial"
1995: Terry, "Managing Update Conflicts in Bayou, a Weakly Connected Replicated Storage System"
1998: Lamport, "The Part-Time Parliament"

SLIDE 4

CAP

  • Consistency: all nodes contain the same state
  • Availability: requests are responded to promptly
  • Partition: a part of the system that is completely cut off from the rest of the system; ideally it should maintain itself autonomously
  • Partition tolerance: the system stays online and functional even when message passing fails

SLIDE 5

CAP Theorem: Paxos & Gossip

  • Paxos: prioritizes consistency given a network partition
  • Gossip: prioritizes availability given a network partition

SLIDE 6

Gossip

SLIDE 7

Gossip Overview


❏ Authors
❏ Motivations
❏ Epidemic Models
❏ Direct Mail
❏ Anti-Entropy
❏ Rumor Mongering
❏ Evaluation
❏ Death Certificates
❏ Spatial Distribution

SLIDE 8

Authors

  • Alan Demers, Cornell University
  • Dan Greene, PARC Research
  • Scott Shenker, EECS Berkeley
  • Doug Terry, Amazon Web Services
  • Carl Hauser, PhD Cornell; Washington State University

SLIDE 9

Motivations


  • Unreliable network
  • Unreliable nodes
  • CAP: choose AP
    ○ always be able to respond to a (read/write) request
    ○ eventual consistency

SLIDE 10
Epidemic Models

SLIDE 11

Proposers and Acceptors

  • Proposer
    ○ In Paxos: clients propose an update to the database
    ○ Epidemic model: a node infects its neighbors
  • Acceptor
    ○ In Paxos: an acceptor accepts an update based on one or more proposals
    ○ Epidemic model: a node is infected by a neighbor

SLIDE 12

Types of Epidemics

❖ Direct Mail
❖ Anti-Entropy
❖ Rumor Mongering

SLIDE 13

Advantages


➢ Simple algorithms
➢ High availability
➢ Fault tolerant
➢ Tunable
➢ Scalable
➢ Works under network partitions

SLIDE 14

Direct Mail


  • Notify all neighbors of an update
  • Timely and reasonably efficient
  • n messages per update

SLIDES 15-16: [Animation: the originating site mails the update to each of its neighbors.]

SLIDE 17

Direct Mail

Messages sent: O(n), where n is the number of neighbors
Not fault tolerant: doesn't guarantee eventual consistency
High volume of traffic, with the originating site at the epicenter
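The mechanism above can be sketched in a few lines. This is a toy model, not the paper's implementation; the names (`Site`, `update`, `receive`) and the last-writer-wins timestamp rule are illustrative:

```python
# Minimal sketch of Direct Mail: on a write, the originating site
# immediately mails the new value to every neighbor (n messages).
class Site:
    def __init__(self, name):
        self.name = name
        self.db = {}          # key -> (value, timestamp)
        self.neighbors = []   # other Site objects

    def update(self, key, value, ts):
        """Local write, then direct-mail it to all n neighbors."""
        self.db[key] = (value, ts)
        for nb in self.neighbors:       # if a message is lost here, that
            nb.receive(key, value, ts)  # neighbor may never converge

    def receive(self, key, value, ts):
        # keep the newer write (last-writer-wins on timestamps)
        if key not in self.db or self.db[key][1] < ts:
            self.db[key] = (value, ts)

a, b, c = Site("A"), Site("B"), Site("C")
a.neighbors = [b, c]
a.update("x", 42, ts=1)
print(b.db["x"])  # (42, 1)
```

The fragility is visible in `update`: delivery is one-shot, so a single dropped message silently breaks eventual consistency.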

17

D

slide-18
SLIDE 18

Anti-Entropy


❏ A site chooses a random partner to share data with
❏ Number of rounds until consistency: O(log n)
❏ Sites use custom protocols to resolve conflicts
❏ Fault tolerant
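The steps above can be sketched as a simulation. This is a hedged toy model (function name, 16-site network, and timestamp rule are illustrative, not from the paper), using full-database push-pull reconciliation:

```python
import random

random.seed(7)  # deterministic run for the demo

def anti_entropy_round(dbs):
    """One push-pull anti-entropy round: every site reconciles with a random
    partner, and per key the write with the newer timestamp wins.
    `dbs` is a list of dicts mapping key -> (value, timestamp)."""
    for i in range(len(dbs)):
        j = random.choice([k for k in range(len(dbs)) if k != i])
        merged = dict(dbs[i])
        for key, (val, ts) in dbs[j].items():
            if key not in merged or merged[key][1] < ts:
                merged[key] = (val, ts)
        dbs[i].clear(); dbs[i].update(merged)
        dbs[j].clear(); dbs[j].update(merged)

# One site starts with an update; a handful of rounds (~log n)
# spreads it to all 16 sites.
dbs = [{} for _ in range(16)]
dbs[0]["x"] = ("hello", 1)
rounds = 0
while any("x" not in db for db in dbs):
    anti_entropy_round(dbs)
    rounds += 1
print(all("x" in db for db in dbs))  # True
```

Real systems avoid shipping the whole database each round (e.g. by comparing checksums first); the full-copy exchange here is only for clarity.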

SLIDE 19

Anti-Entropy

[Animation, slides 19-27: the update spreads through repeated random pairwise exchanges until all sites are consistent.]

What happens next?

SLIDE 28

Mechanism: Push & Pull

SLIDE 29

Push vs. Pull

Push: {A, B} and {A, C} become {A, B} and {A, B, C} (the initiator sends its data to its partner)
Pull: {A, B} and {A, C} become {A, B, C} and {A, C} (the initiator fetches its partner's data)

SLIDE 30

What is Push-Pull?

Push-pull: {A, B} and {A, C} exchange data in both directions, so both end with {A, B, C}

SLIDE 31

Propagation times of Push vs. Pull

Let p_i = probability that a node has not yet received the update after round i.

Push: p_{i+1} = p_i * e^{-1}
Pull: p_{i+1} = p_i^2

Pull is faster!
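A quick numeric check of these recurrences (demo code, not from the paper): starting from half the sites susceptible, pull's squaring shrinks p doubly exponentially while push only shrinks it by a constant factor per round.

```python
import math

# Iterate the slide's recurrences for the probability p that a site
# is still susceptible after round i.
p_push = p_pull = 0.5
for _ in range(5):
    p_push *= math.exp(-1)  # push: p_{i+1} = p_i * e^{-1}
    p_pull **= 2            # pull: p_{i+1} = p_i^2

print(p_push)  # ~3.4e-3 after 5 rounds
print(p_pull)  # ~2.3e-10: pull closes the gap far faster
```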

SLIDE 32

Rumor Mongering

1. Sites choose a random neighbor to share information with
2. Transmission rate is tunable
3. How long new updates remain "interesting" is also tunable
4. Can use push or pull mechanisms
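The mechanics above can be sketched as a push rumor-mongering simulation with a "lose interest with probability 1/k" rule, one of the paper's variants; the function name, parameters, and uniform target choice (a sender may even pick itself) are illustrative simplifications:

```python
import random

def rumor_mongering(n, k, seed=None):
    """Push rumor mongering: each round, every infective site forwards the
    rumor to a uniformly random site; on hitting a site that already knows
    it, the sender loses interest with probability 1/k.  Returns the
    residue: the fraction of sites that never hear the rumor."""
    rng = random.Random(seed)
    knows = [False] * n
    knows[0] = True
    infective = {0}
    while infective:
        for s in list(infective):
            t = rng.randrange(n)
            if knows[t]:
                if rng.random() < 1.0 / k:
                    infective.discard(s)  # sender loses interest
            else:
                knows[t] = True
                infective.add(t)          # recipient starts spreading too
    return 1 - sum(knows) / n

# Larger k keeps rumors "interesting" longer, shrinking the residue.
print(rumor_mongering(1000, k=3, seed=1))  # small: a few percent at most
```

Tuning `k` is the transmission-rate knob from point 2: it trades extra traffic for a smaller chance that a site misses the update.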

SLIDE 33

Rumor Mongering Complexity

  • O(ln n) rounds lead to consistency with high probability
  • Push requires O(n ln n) transmissions until consistency
  • Karp et al. further proved a lower bound of Ω(n ln ln n) transmissions for push-pull protocols

Karp et al., 2000. "Randomized Rumor Spreading." In FOCS.

SLIDE 34

Analogy to epidemiology

  • Susceptible: site does not know an update yet
  • Infective: actively sharing an update
  • Removed: updated and no longer sharing

Rumor mongering: nodes go from susceptible to infective and eventually (probabilistically) to removed

SLIDE 35

Rumor Mongering

[Animation, slides 35-41: infective sites spread the rumor to random neighbors; senders gradually lose interest and become removed.]

SLIDE 42

Rumor mongering

Pros:

  • Fast
  • Low call on resources
  • Fault-tolerant
  • Less traffic

Cons:

  • A site can potentially miss an update

SLIDE 43

Backups

  • Anti-entropy can be used to "update" the network regularly after direct mail or rumor mongering
  • If an inconsistency is found during anti-entropy, run the original algorithm again

SLIDE 44

Death Certificates

❖ How are items deleted using epidemic models?


SLIDE 45

[Cartoon: one site announces "I like Bread", another replies "I DON'T like Bread!", and a third says "I like orange juice": conflicting replicated updates.]

SLIDE 46

Death Certificates

❖ How are items removed under the epidemic model?

❖ Drawbacks
  ➢ Space
  ➢ Increased traffic
  ➢ Death certificates can be lost
❖ Dormant death certificates & retention
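The idea can be sketched as tombstones in a last-writer-wins store. This is a hedged toy model (the names `TOMBSTONE`, `delete`, `merge`, `expire` are illustrative, not the paper's API):

```python
# A delete is just another replicated item: a tombstone carrying the
# deletion timestamp, so it wins over older writes during gossip.
TOMBSTONE = object()

def delete(db, key, ts):
    """Replace the item with a death certificate instead of removing the key."""
    db[key] = (TOMBSTONE, ts)

def merge(db_a, db_b):
    """Per-key last-writer-wins merge, as in anti-entropy; death
    certificates propagate exactly like ordinary updates."""
    for key in set(db_a) | set(db_b):
        entries = [e for e in (db_a.get(key), db_b.get(key)) if e is not None]
        winner = max(entries, key=lambda e: e[1])
        db_a[key] = db_b[key] = winner

def expire(db, now, retention):
    """Reclaim space by dropping certificates past the retention window;
    the risk is that a long-partitioned site may resurrect the item."""
    for key in [k for k, (v, ts) in db.items()
                if v is TOMBSTONE and now - ts > retention]:
        del db[key]

a = {"bread": ("I like Bread", 1)}
b = {"bread": ("I like Bread", 1)}
delete(a, "bread", ts=2)           # the newer write is a deletion
merge(a, b)                        # the certificate gossips to b
print(b["bread"][0] is TOMBSTONE)  # True
```

The `expire` step is exactly the space/safety trade-off the drawbacks list names: keeping certificates forever costs space, but dropping them too early risks resurrection, which is what dormant death certificates mitigate.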

SLIDE 47

Evaluating Epidemic Models

➢ Residue: the susceptible sites remaining when the epidemic finishes
➢ Traffic: average number of messages sent per site
➢ Delay:
  ○ T_avg: average time between the start of an outbreak and the arrival of the update at a given site
  ○ T_last: delay until the last site receives the update

SLIDE 48

Spatial Distribution: Helping or Hurting?

SLIDE 49

Convergence Times and Traffic

  • Linear network, anti-entropy:
    ○ Nearest-neighbor connections: O(n) convergence, O(1) traffic
    ○ Random connections: O(log n) convergence, O(n) traffic

SLIDE 50

Optimizations for realistic network distributions

  • Select connections from a list of neighbors sorted by distance
  • Treat the network as linear
  • Compute probabilities based on position in the list

SLIDE 51

Rumor Mongering with a Non-Standard Distribution

  • Increase k, the number of rounds a rumor stays "interesting"
  • Use push-pull

SLIDE 52

Takeaways

  • Availability >> consistency
  • Updates can be expensive
  • Distribution protocols should be robust
  • Network design can hurt overall performance
  • Byzantine behavior is not addressed

Questions?

SLIDE 53

Managing update conflicts in Bayou, a weakly connected replicated storage system (1995)

Additional Reading

SLIDE 54
  • Weak consistency makes applications over unstable networks possible
  • Developing good interfaces allows complex functions like merging to be interchangeable via the application

SLIDE 55

Timeline

1978: Lamport, "Time, Clocks, and the Ordering of Events in a Distributed System"
1982: Lamport, "The Byzantine Generals Problem"
1985: FLP, "Impossibility of Distributed Consensus with One Faulty Process"
1987: Demers, "Epidemic Algorithms for Replicated Database Maintenance"
1990: Schneider, "Implementing Fault-Tolerant Services Using the State Machine Approach: A Tutorial"
1995: Terry, "Managing Update Conflicts in Bayou, a Weakly Connected Replicated Storage System"
1998: Lamport, "The Part-Time Parliament"

SLIDE 56

What is Bayou?

  • Storage system designed for mobile computing
    ○ Network is not stable
    ○ Parts of the network may not be connected all the time
    ○ Goal: high availability
    ○ Guarantees weak consistency

SLIDE 57

Bayou System Diagram

[Diagram: clients send Writes (each with a unique ID) and Read requests to servers; servers return data and exchange updates with each other via anti-entropy.]

SLIDE 58

Consistent Replicas

  • Writes are first tentative
  • Eventually they are committed, ordered by time
  • Clients can tell whether writes are stable (committed)
  • Primary servers deal with committing updates

SLIDE 59

Detecting and Resolving Conflicts

  • Dependency checks
  • Merge procedures
  • Both are described by the clients and are application-dependent

SLIDE 60

Conclusions

  • Distributed systems need a form of consensus
  • Choosing the correct consensus model for a system must be weighed carefully against the attributes of the system

SLIDE 61

Acknowledgements

Content inspired by:
• Ki Suh Lee, "Epidemic Techniques" (2009)
• Eugene Bagdasaryan, "P2P Gossip Protocols" (2016)

Photos: www.pixabay.com, www.unsplash.com, www.1001freedownloads.com/free-cliparts