EpiChord: Parallelizing the Chord Lookup Algorithm with Reactive Routing State Management – PowerPoint PPT Presentation



SLIDE 1

EpiChord: Parallelizing the Chord Lookup Algorithm with Reactive Routing State Management

Ben Leong, Barbara Liskov, and Eric D. Demaine MIT Computer Science and Artificial Intelligence Laboratory

{benleong, liskov, edemaine}@mit.edu

SLIDE 2

Structured Peer-to-Peer Systems

Large-scale dynamic network
Overlay infrastructure: scalable, self-configuring, fault-tolerant
Every node is responsible for some objects
Find the node having the desired object
Challenge: efficient routing at low cost

SLIDE 3

Address Space

[Figure: nodes N0–N62 placed on a circular identifier ring]

Most common — one-dimensional circular address space

SLIDE 4

Mapping Keys to Nodes

[Figure: keys K2, K13, K32, K47, K52, K54 mapped onto the ring of nodes N0–N62]

successor of key is its owner
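To make the successor rule concrete, here is a minimal Python sketch (ours, not from the slides) using the node IDs from the figure on a toy 64-ID ring; the helper name and the toy ID space are assumptions:

    # Successor rule on a small circular ID space (illustrative sketch).
    ID_SPACE = 64  # toy example; real deployments use e.g. 2**160 IDs

    def successor(node_ids, key):
        """Return the first node ID clockwise from `key`; that node owns the key."""
        for n in sorted(node_ids):
            if n >= key % ID_SPACE:
                return n
        return min(node_ids)  # wrap around the ring

    nodes = [0, 6, 10, 15, 17, 20, 25, 30, 35, 40, 47, 49, 51, 57, 62]
    assert successor(nodes, 2) == 6    # K2 is owned by N6
    assert successor(nodes, 54) == 57  # K54 is owned by N57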

SLIDE 5

Distributed Hash Tables (DHTs)

A Distributed Hash Table (DHT) is a distributed data structure that supports a put/get interface: store and retrieve {key, value} pairs efficiently over a network of (generally unreliable) nodes.

Keep the state stored per node small because of network churn ⇒ minimize book-keeping & maintenance traffic.

EpiChord explores the trade-offs in moving from sequential lookup to parallel lookup, and from O(log n) routing state to more than O(log n) routing state.
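As a sketch of the put/get interface described above (our illustration; the class and method names are hypothetical), a toy DHT can route each key to its successor's local store:

    from bisect import bisect_left

    class TinyDHT:
        """Toy put/get interface over a circular ID space (illustrative only)."""
        def __init__(self, node_ids):
            self.nodes = sorted(node_ids)
            self.store = {n: {} for n in self.nodes}  # each node's local key/value store

        def owner(self, key):
            i = bisect_left(self.nodes, key)          # the key's successor owns it
            return self.nodes[i % len(self.nodes)]    # wrap around the ring

        def put(self, key, value):
            self.store[self.owner(key)][key] = value

        def get(self, key):
            return self.store[self.owner(key)].get(key)

    dht = TinyDHT([0, 6, 10, 15, 20, 30, 40, 50, 60])
    dht.put(13, "value-for-K13")
    assert dht.get(13) == "value-for-K13"             # stored on N15, the successor of K13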

SLIDE 6

Chord

[Figure: Chord fingers from one node across the ring of nodes N0–N62]

Each node periodically probes its O(log n) fingers
Achieves O(log n)-hop lookup performance
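For later comparison with EpiChord, here is a sketch (ours) of how Chord's O(log n) fingers are chosen: node n keeps the successor of n + 2^i for each i, on a toy 2^6 ID space.

    # Chord-style finger table on a toy 2**6 ID space (illustrative sketch).
    M = 6
    ID_SPACE = 2 ** M

    def successor(sorted_nodes, ident):
        for x in sorted_nodes:
            if x >= ident:
                return x
        return sorted_nodes[0]  # wrap around the ring

    def finger_table(n, sorted_nodes):
        # finger i points at the successor of n + 2**i; only O(log n) distinct entries
        return [successor(sorted_nodes, (n + 2 ** i) % ID_SPACE) for i in range(M)]

    nodes = sorted([0, 6, 10, 15, 17, 20, 25, 30, 35, 40, 47, 49, 51, 57, 62])
    print(finger_table(0, nodes))  # these are the entries Chord probes periodically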

SLIDE 7

Our Goal

We want to do better than O(log n)-hop lookups without adding extra overhead. Use a combination of techniques:
Piggyback information on lookup messages
Allow the cache to store more than O(log n) routing state
Issue parallel queries during lookup

SLIDE 8

Outline

Parallel Lookup Algorithm
Reactive Cache Management
Simulation Results
Related Work
Conclusion

SLIDE 9

EpiChord Lookup Algorithm

[Figure: ring of nodes N0–N62; the querying node ("you are here") wants key K2; legend distinguishes known and unknown nodes]

SLIDE 10

EpiChord Lookup Algorithm

[Figure: the querying node issues a query for K2]

SLIDE 11

EpiChord Lookup Algorithm

[Figure: p − 1 further queries for K2 are issued in parallel to other known nodes]

SLIDE 12

EpiChord Lookup Algorithm

[Figure: a reply returns the routing entries N57, N62, N0, and N10, which become known to the querying node]

SLIDE 13

EpiChord Lookup Algorithm

[Figure: the lookup for K2 continues from the newly learned nodes]

SLIDE 14

EpiChord Lookup Algorithm

[Figure: a further reply returns the entries N0 and N6]

SLIDE 15

EpiChord Lookup Algorithm

[Figure: the querying node now knows N0 and N6, the nodes nearest K2]

SLIDE 16

EpiChord Lookup Algorithm

[Figure: the node responsible for K2 responds: found K2!]

SLIDE 17

EpiChord Lookup Algorithm

Intrinsically iterative
Learn about more nodes; avoid redundant queries (typically 2(p + h) messages per lookup)
Additional policies to learn new routing entries:
When a node first joins the network, it obtains a cache transfer from its successor
Nodes gather information by observing lookup traffic
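A much simplified sketch (ours, not the authors' code) of the iterative p-way lookup loop: query the p best known candidates, learn the routing entries piggybacked on each reply, and never re-query a node.

    # Simplified sketch of an iterative p-way parallel lookup (illustrative only).
    ID_SPACE = 2 ** 6

    def dist(a, b):
        """Clockwise distance from a to b on the ring."""
        return (b - a) % ID_SPACE

    def epichord_lookup(key, cache, p, query):
        """cache: set of known node IDs.  query(node, key) is a stand-in for the network;
        it returns ('owner', owner_id) when the queried node can answer the lookup, or
        ('closer', nodes) carrying the routing entries piggybacked on the reply."""
        asked = set()
        while True:
            # the p best unqueried candidates: closest known predecessors of the key
            candidates = sorted((n for n in cache if n not in asked),
                                key=lambda n: dist(n, key))[:p]
            if not candidates:
                return None                    # ran out of nodes: lookup failed
            for node in candidates:            # issued asynchronously in parallel in the real system
                asked.add(node)
                kind, payload = query(node, key)
                if kind == 'owner':
                    return payload             # the node responsible for the key
                cache.update(payload)          # learn new entries from the reply

    # tiny demo with a fake network where N6 owns K2 and N0 is its predecessor
    def fake_query(node, key):
        if node in (0, 6):
            return ('owner', 6)
        return ('closer', {0, 6})

    assert epichord_lookup(2, {20, 35, 47, 57}, p=2, query=fake_query) == 6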

SLIDE 18

Reactive Cache Management

Traditional (active) approach ⇒ ping fingers periodically
Our (reactive) approach:
Cache entries have a fixed expiration period
Divide the address space into exponentially smaller slices
Periodically check that each slice has sufficiently many (j) unexpired entries
If not, make a lookup to the midpoint of the offending slice
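A hedged sketch (ours) of that reactive check: expire old entries, walk exponentially smaller slices of the ID space, and issue a lookup toward the midpoint of any slice with fewer than j unexpired entries. The slides divide the space symmetrically around the node; this sketch covers only one direction, and the slice geometry and names are our simplifications.

    import time

    def maintain_cache(my_id, cache, j, num_slices, ttl, id_space, lookup):
        """cache: {node_id: time_last_heard}.  lookup(target_id) issues a normal lookup,
        whose replies repopulate the cache (a stand-in for the real network).
        num_slices is estimated from the node's k successors and predecessors (next slide)."""
        now = time.time()
        live = [n for n, t in cache.items() if now - t < ttl]   # ignore expired entries
        hi = id_space                                           # slice covers offsets [hi/2, hi) from my_id
        for _ in range(num_slices):
            lo = hi // 2
            in_slice = [n for n in live if lo <= (n - my_id) % id_space < hi]
            if len(in_slice) < j:
                midpoint = (my_id + (lo + hi) // 2) % id_space
                lookup(midpoint)                                # repair the offending slice
            hi = lo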

SLIDE 19

Division of Address Space

Estimate the number of slices from the k successors and k predecessors
j and k are system parameters ⇒ choose k ≥ 2j
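One plausible way (our assumption, not the paper's exact rule) to pick the number of slices used in the sketch above: estimate the network size from the density of the k nearest successors, and stop subdividing once the smallest slice would be expected to hold fewer than j nodes.

    import math

    def estimate_num_slices(my_id, successor_ids, j, id_space):
        k = len(successor_ids)                                     # slide: choose k >= 2j
        span = max((s - my_id) % id_space for s in successor_ids)  # arc covered by the k successors
        n_est = k * id_space / max(span, 1)                        # density-based estimate of n
        return max(1, int(math.log2(max(n_est / j, 2))))           # halve slices ~log2(n/j) times

    print(estimate_num_slices(0, [6, 10, 15, 17], j=2, id_space=64))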

SLIDE 20

Summary

Piggyback extra information on lookups
Allow the cache to contain more than O(log n) state
Flush out old state with TTLs
Use cache entries in parallel to avoid timeouts
Check that cache entries are well-distributed; fix if necessary
Now, let’s evaluate performance: (i) latency and (ii) cost

SLIDE 21

Simulation Setup

Compare EpiChord to the optimal sequential Chord lookup algorithm (base 2).
What’s optimal? We ignore Chord maintenance costs and assume that the finger tables of nodes are perfectly accurate regardless of node failures.
The competing sequential lookup algorithm is thus a reasonably strong adversary and not just a straw man.

SLIDE 22

Simulation Setup

The assumed workloads will affect comparisons (Li et al., 2004). Consider 2 types of workloads:
Lookup-intensive: 200 to 1,200 nodes, r ≈ 1/600 ⇒ rn ≈ 0.3 to 2; query rate Q ≈ 2 per second
Churn-intensive: 600 to 9,000 nodes, r ≈ 1/600 ⇒ rn ≈ 1.0 to 15; query rate Q ≈ 0.05 to 0.07 per second

SLIDE 23

Hop Count – Lookup-Intensive

[Figure: average number of hops per lookup vs. network size (logscale, 200 to 1,400 nodes) for Chord and 1- to 5-way EpiChord]

SLIDE 24

Latency – Lookup-Intensive

[Figure: average lookup latency (s) vs. network size (logscale, 200 to 1,400 nodes) for Chord and 1- to 5-way EpiChord]

SLIDE 25

Messages Sent Per Lookup

[Figure: average number of messages per lookup vs. network size (logscale, 200 to 1,400 nodes) for Chord and 1- to 5-way EpiChord]

SLIDE 26

Summary of Results

Increasing p improves hop count and latency and reduces the lookup failure rate
Since our approach is iterative ⇒ about 2(p + h) messages per lookup
Higher lookup rates yield better overall performance due to caching
The number of entries returned per query, l, does not affect performance much beyond l = 3, so we set l = 3

SLIDE 27

Related Work

Chord (Stoica et al., 2001)
DHash++ (Dabek et al., 2004)
Kademlia (Maymounkov and Mazieres, 2002)
Kelips (Gupta et al., 2003)
One-Hop (Gupta et al., 2004)

SLIDE 28

Conclusion

The parallel lookup and reactive routing state maintenance algorithm trades storage for better lookup performance without increasing bandwidth consumption
Reduces both lookup latencies and path lengths over Chord by a factor of 3, by issuing only 3 queries asynchronously in parallel per lookup, without using more messages
A parallel lookup strategy is inherently more resilient to timeouts than a sequential one

SLIDE 29

EpiChord: Parallelizing the Chord Lookup Algorithm with Reactive Routing State Management

Ben Leong, Barbara Liskov, and Eric D. Demaine MIT Computer Science and Artificial Intelligence Laboratory

{benleong, liskov, edemaine}@mit.edu

SLIDE 30

Proximity

We do not track latency information or explicitly use proximity information
But parallel asynchronous lookup exploits proximity indirectly
Key observation: the final sequence of lookups that returns the correct answer first is approximately equivalent to a proximity-optimized lookup sequence

SLIDE 31

Worst-Case Performance

If j (entries per slice) = 1, EpiChord is equivalent to Chord
Assuming a uniformly distributed workload, the worst-case lookup path length is at most (1/2) log_α n, where α = 3j + 6/(j + 3), for j > 1
If j = 2, α = 7.2 and expected worst-case lookup path lengths are at most
((1/2) log_α n) / ((1/2) log_2 n) = log_α 2 ≈ 1/3
of those for Chord
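A quick numeric check of the bound as reconstructed above (the expression for α is our reading of the garbled original): α = 3j + 6/(j + 3), and the worst-case path length relative to Chord is log_α 2.

    import math

    for j in (2, 3, 4, 5):
        alpha = 3 * j + 6 / (j + 3)
        ratio = math.log(2) / math.log(alpha)   # ((1/2) log_alpha n) / ((1/2) log_2 n)
        print(f"j={j}: alpha={alpha:.1f}, path length relative to Chord ~ {ratio:.2f}")
    # j=2 gives alpha=7.2 and a ratio of about 1/3, matching the slide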

SLIDE 32

Reduction in Background Probes

[Figure: proportion of the cache invariant satisfied vs. lookup traffic relative to minimal background maintenance traffic, for n = 2,000 to 1,000,000]

Probably at least 20 to 25% savings

SLIDE 33

System Parameters

Timeout = 0.5 s
Retransmits = 3
Node lifespan: exponentially distributed with mean 600 s (10 min)
Cache expiration interval = 120 s (2 min)
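The same parameters written out as constants for reference (values as stated on the slide; the names are ours):

    TIMEOUT_S = 0.5               # per-query timeout
    MAX_RETRANSMITS = 3           # retransmissions before a query is given up
    MEAN_NODE_LIFESPAN_S = 600    # node lifetimes are exponentially distributed (mean 10 min)
    CACHE_EXPIRATION_S = 120      # fixed TTL on cache entries (2 min)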

SLIDE 34

Background Maintenance Traffic

Need to ping every 60 s for 90% validity
j = 2 ⇒ minimum routing set is 4× Chord's
Only half the probes are needed because of symmetry
Since 120 s = 2 × 60 s ⇒ background maintenance bandwidth ≤ Chord's
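A back-of-the-envelope check of this claim, following our reading of the slide's arithmetic (Chord's table size is normalized to one unit of state):

    chord_entries = 1.0                    # normalize Chord's finger table to 1 unit of state
    chord_probe_interval = 60              # Chord pings each finger every 60 s for ~90% validity

    epichord_entries = 4 * chord_entries   # j = 2 => minimum routing set is 4x Chord's
    epichord_probe_interval = 120          # cache expiration interval (2 x 60 s)
    symmetry_factor = 0.5                  # only half the probes are needed, by symmetry

    chord_probe_rate = chord_entries / chord_probe_interval
    epichord_probe_rate = symmetry_factor * epichord_entries / epichord_probe_interval
    assert epichord_probe_rate <= chord_probe_rate   # background maintenance bandwidth <= Chord's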

SLIDE 35

Hop Count – Churn-Intensive

[Figure: average number of hops per lookup vs. network size (logscale, 500 to 10,000 nodes) for Chord and 1- to 5-way EpiChord]

SLIDE 36

Latency – Churn-Intensive

[Figure: average lookup latency (s) vs. network size (logscale, 500 to 10,000 nodes) for Chord and 1- to 5-way EpiChord]

SLIDE 37

Messages Sent Per Lookup

[Figure: average number of messages per lookup vs. network size (logscale, 500 to 10,000 nodes) for Chord and 1- to 5-way EpiChord]

SLIDE 38

Modelling Cache Composition

Consider a network of steady-state size n, where per unit time:
a fraction r of the nodes leave
a fraction f of the cache entries are flushed
Each node makes Q lookups, uniformly distributed over the address space
p queries are sent in parallel for each lookup

SLIDE 39

Modelling Cache Composition

Where x is the number of live nodes known to a node at time t, we obtain the following relation:

dx/dt = pQ(1 − x/n) − fx − (1 − f)rx

where pQ(1 − x/n) is new entries learned from incoming queries, fx is entries flushed, and (1 − f)rx is nodes that departed but were not flushed.

This assumes that new knowledge comes only from incoming queries.

SLIDE 40

Modelling Cache Composition

Where y is the number of outdated cache entries at time t, we have the following relation:

dy/dt = (1 − f)rx − fy − pQ · y/(x + y)

where (1 − f)rx is dead nodes not flushed, fy is dead nodes flushed, and pQ · y/(x + y) is outdated entries discovered through timeouts of outgoing queries.

If churn is low relative to the lookup rate, the cache maintenance protocol is unimportant.

SLIDE 41

Modelling Cache Composition

If churn is high, the proportion of outdated entries in the cache, γ, is

γ = lim_{t→∞} y/(x + y) ≈ (√(1 + (1 − f)r/f) − 1) / √(1 + (1 − f)r/f)

If cache entries are flushed at the node failure rate (f = r),

γ ≈ (√(2 − f) − 1) / √(2 − f) ≤ 1 − 1/√2 ≈ 0.292

⇒ at most 30% of cache entries will be outdated
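A minimal numerical check (ours) of the cache-composition model on the last three slides: integrate dx/dt and dy/dt with Euler steps and compare the limiting fraction of outdated entries against the closed-form γ. The parameter values below are illustrative assumptions, with f = r.

    import math

    n, r, f, p, Q = 10_000, 1/600, 1/600, 3, 0.05   # f = r: flush at the node failure rate
    x, y, dt = 1.0, 0.0, 0.1                        # start off knowing a single live node

    for _ in range(100_000):                        # ~10,000 simulated seconds
        dx = p * Q * (1 - x / n) - f * x - (1 - f) * r * x
        dy = (1 - f) * r * x - f * y - p * Q * y / (x + y)
        x += dx * dt
        y += dy * dt

    gamma_sim = y / (x + y)
    gamma_formula = (math.sqrt(2 - f) - 1) / math.sqrt(2 - f)   # closed form when f = r
    print(f"simulated {gamma_sim:.3f} vs closed form {gamma_formula:.3f} (bound ~0.292)")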

SLIDE 42

Cache – Lookup-Intensive

[Figure: average number of live and outdated cache entries vs. network size (logscale, 200 to 1,400 nodes) for 1-, 3-, and 5-way EpiChord]

SLIDE 43

Cache – Lookup-Intensive

[Figure: fraction of outdated cache entries vs. network size (logscale, 200 to 1,400 nodes) for 1- to 5-way EpiChord]

SLIDE 44

Cache – Churn-Intensive

[Figure: average number of live and outdated cache entries vs. network size (logscale, 500 to 10,000 nodes) for 1-, 3-, and 5-way EpiChord]

SLIDE 45

Cache – Churn-Intensive

[Figure: fraction of outdated cache entries vs. network size (logscale, 500 to 10,000 nodes) for 1- to 5-way EpiChord]

SLIDE 46

References

Dabek, F., Li, J., Sit, E., Robertson, J., Kaashoek, M. F., and Morris, R. (2004). Designing a DHT for low latency and high throughput. In Proceedings of the 1st Symposium on Networked Systems Design and Implementation (NSDI 2004), pages 85–98.

Gupta, A., Liskov, B., and Rodrigues, R. (2004). Efficient routing for peer-to-peer overlays. In Proceedings of the 1st Symposium on Networked Systems Design and Implementation (NSDI 2004), pages 113–126.

Gupta, I., Birman, K., Linga, P., Demers, A., and van Renesse, R. (2003). Kelips: Building an efficient and stable P2P DHT through increased memory and background overhead. In Proceedings of the 2nd International Workshop on Peer-to-Peer Systems (IPTPS ’03).

Li, J., Stribling, J., Morris, R., Kaashoek, M. F., and Gil, T. M. (2004). DHT routing tradeoffs in network with churn. In Proceedings of the 3rd International Workshop on Peer-to-Peer Systems (IPTPS ’04).

Maymounkov, P. and Mazieres, D. (2002). Kademlia: A peer-to-peer information system based on the XOR metric. In Proceedings of the 1st International Workshop on Peer-to-Peer Systems (IPTPS ’02).

Stoica, I., Morris, R., Karger, D., Kaashoek, F., and Balakrishnan, H. (2001). Chord: A scalable peer-to-peer lookup service for internet applications. In Proceedings of the 2001 ACM SIGCOMM Conference, pages 149–160.