CompSci 514: Computer Networks Lecture 13: Distributed Hash Table

CompSci 514: Computer Networks Lecture 13: Distributed Hash Table Xiaowei Yang

Overview • What problems do DHTs solve? • How are DHTs implemented?

Background • A hash table is a data structure that stores (key, object) pairs. • A key is mapped to a table index via a hash function for fast lookup. • Content distribution networks – Given a URL, return the object

Example of a hash table: a web cache
  0: http://www.cnn.com → page content
  1: http://www.nytimes.com → …
  2: http://www.slashdot.org → …
  …
• Client requests http://www.cnn.com • Web cache returns the page content located at the 1st entry of the table.

DHT: why? • If the number of objects is large, no single node can store them all. • Solution: distributed hash tables. – Split one large hash table into smaller tables and distribute them to multiple nodes

DHT [Figure: one large hash table split into several smaller (K, V) tables, one per node]

A content distribution network • A single provider that manages multiple replicas. • A client obtains content from a close replica.

Basic function of DHT • DHT is a "virtual" hash table – Input: a key – Output: a data item • Data items are stored by a network of nodes. • DHT abstraction – Input: a key – Output: the node that stores the key • Applications handle key and data item association.

DHT: a visual example [Figure: Insert(K1, V1) on a network of nodes, each holding a (K, V) table; one node stores (K1, V1)]

DHT: a visual example [Figure: Retrieve K1 on the same network of nodes, one of which stores (K1, V1)]

Desired properties of DHT • Scalability: each node does not keep much state • Performance: lookup latency is small • Load balancing: no node is overloaded with a large amount of state • Dynamic reconfiguration: when nodes join and leave, the amount of state moved between nodes is small. • Distributed: no node is more important than others.

A straw man design • Node 0 stores (0, V1), (3, V2) • Node 1 stores (1, V3), (4, V4) • Node 2 stores (2, V5), (5, V6) • Suppose all keys are integers • The number of nodes in the network is n. • id = key % n

When node 2 dies • Node 0 now stores (0, V1), (2, V5), (4, V4) • Node 1 now stores (1, V3), (3, V2), (5, V6) • A large number of data items need to be rehashed.
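
The cost of the straw man design can be checked with a few lines of Python. This is only a sketch of the slide's example (keys 0 to 5, three nodes, then two); the function and variable names are made up for illustration.

```python
def placement(keys, n):
    """Straw-man rule from the slide: node id = key % n."""
    return {key: key % n for key in keys}

keys = range(6)
before = placement(keys, 3)   # nodes 0, 1, 2 are alive
after = placement(keys, 2)    # node 2 has died, so n drops to 2

# Keys whose home node changed under the new modulus.
moved = [k for k in keys if before[k] != after[k]]
# Keys that lived on the dead node and had to move in any case.
orphaned = [k for k in keys if before[k] == 2]

print("moved:", moved)        # moved: [2, 3, 4, 5]
print("orphaned:", orphaned)  # orphaned: [2, 5]
```

Only two keys lost their node, yet four of the six end up on a different node. That gap is what consistent hashing is meant to close.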

Fix: consistent hashing • When a node joins or leaves, the expected fraction of objects that must be moved is the minimum needed to maintain a balanced load. • A node is responsible for a range of keys • All DHTs implement consistent hashing

Chord: basic idea • Hash both node ID and key into an m-bit, one-dimensional circular identifier space • Consistent hashing: a key is stored at a node whose identifier is closest to the key in the identifier space – "Key" refers to both the key and its hash value.

Basic components of DHTs • Overlapping key and node identifier space – Hash(www.cnn.com/image.jpg) → an n-bit binary string – Nodes that store the objects also have n-bit strings as their identifiers • Building routing tables – Next hops – Distance functions – These two determine the geometry of DHTs • Ring, tree, hypercubes, hybrid (tree + ring), etc. – Handle node join and leave • Lookup and store interface
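
A minimal sketch of the shared identifier space, assuming SHA-1 as the hash function (the choice made in the Chord paper) and truncation by modulo to get short identifiers. The helper name `to_id`, the 7-bit default, and the example node address are assumptions for illustration.

```python
import hashlib

def to_id(name: str, bits: int = 7) -> int:
    """Map a key or a node address to an identifier of the given bit width."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    # Keep only the low-order bits so keys and nodes share one small space.
    return int.from_bytes(digest, "big") % (2 ** bits)

key_id = to_id("www.cnn.com/image.jpg")   # an object key
node_id = to_id("10.0.0.7:4000")          # hypothetical node address
print(key_id, node_id)
```

Because keys and nodes are hashed by the same function into the same space, "which node is closest to this key" becomes a well-defined question.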

Chord: ring topology [Figure: circular 7-bit ID space with nodes N32, N90, N105 and keys K5, K20, K80; "K5" means key 5 and "N105" means node 105] • A key is stored at its successor: the node with the next higher ID
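
The successor rule on this ring can be sketched as follows, using the node and key IDs from the slide. The function name and the departure of N90 are assumptions added to show how little data moves.

```python
from bisect import bisect_left

RING_SIZE = 2 ** 7   # circular 7-bit ID space, as on the slide

def successor(node_ids, key):
    """Return the first node ID >= key, wrapping around the ring."""
    ring = sorted(node_ids)
    i = bisect_left(ring, key % RING_SIZE)
    return ring[i % len(ring)]   # past the highest ID, wrap to the lowest

nodes = [32, 90, 105]
keys = [5, 20, 80]
owners = {k: successor(nodes, k) for k in keys}
print(owners)   # {5: 32, 20: 32, 80: 90}

# Hypothetical departure of N90: only the key it held changes owner.
owners_after = {k: successor([32, 105], k) for k in keys}
moved = [k for k in keys if owners[k] != owners_after[k]]
print(moved)    # [80]
```

A key above the highest node ID, such as 110, wraps around to N32.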

Chord: how to find a node that stores a key? • Solution 1: every node keeps a routing table to all other nodes – Given a key, a node knows which node is the successor of the key – The node sends the query to the successor – What are the advantages and disadvantages of this solution?

Solution 2: every node keeps a routing entry to the node's successor (a linked list) [Figure: ring of nodes N10, N32, N60, N90, N105, N120; the query "Where is key 80?" travels along successor pointers until the answer "N90 has K80" is returned]

Simple lookup algorithm
Lookup(my-id, key-id)
  n = my successor
  if my-id < n < key-id
    call Lookup(key-id) on node n // next hop
  else
    return my successor // done
• Correctness depends only on successors • Q1: will this algorithm miss the real successor? • Q2: what's the average # of lookup hops?
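
A runnable sketch of the successor-only lookup, using the ring N10, N32, N60, N90, N105, N120 from the previous slide. Unlike the slide's pseudocode, the interval test here handles wrap-around past zero explicitly; the function names and the hop counter are assumptions for illustration.

```python
def in_half_open(x, a, b):
    """True if x lies in the circular interval (a, b]."""
    if a < b:
        return a < x <= b
    return x > a or x <= b   # the interval wraps past zero

def simple_lookup(successor_of, start, key_id):
    """Walk successor pointers until the key falls between a node and its successor."""
    node, hops = start, 0
    while not in_half_open(key_id, node, successor_of[node]):
        node = successor_of[node]   # next hop
        hops += 1
    return successor_of[node], hops

ring = [10, 32, 60, 90, 105, 120]
successor_of = {n: ring[(i + 1) % len(ring)] for i, n in enumerate(ring)}

print(simple_lookup(successor_of, 10, 80))   # (90, 2)
print(simple_lookup(successor_of, 105, 5))   # (10, 1)
```

The number of hops grows linearly with the number of nodes between the querier and the key, which is the motivation for finger tables.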

Solution 3: "Finger table" allows log(N)-time lookups [Figure: node N80 with fingers spanning ½, ¼, 1/8, 1/16, 1/32, 1/64, 1/128 of the ring] • Analogy: binary search

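Finger i points to the successor of $n + 2^{i-1}$ [Figure: node N80 with fingers spanning ½, ¼, 1/8, 1/16, 1/32, 1/64, 1/128 of the ring; the finger targeting ID 112 points to N120] • A finger table entry includes the Chord ID and IP address • Each node stores a small table of size O(log N)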

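Chord finger table example (ring positions 0 to 7; nodes 0, 1, and 3)
• Node 0 (Keys: 5, 6) – start 1, interval [1,2), successor 1 – start 2, interval [2,4), successor 3 – start 4, interval [4,0), successor 0
• Node 1 (Keys: 1) – start 2, interval [2,3), successor 3 – start 3, interval [3,5), successor 3 – start 5, interval [5,1), successor 0
• Node 3 (Keys: 2) – start 4, interval [4,5), successor 0 – start 5, interval [5,7), successor 0 – start 7, interval [7,3), successor 0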
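
The finger tables in this example can be recomputed from the rule that finger i points to the successor of n + 2^(i-1). The sketch below assumes a 3-bit ring with nodes 0, 1, and 3, as in the example; the function names are illustrative.

```python
M = 3            # identifier bits, so ring positions are 0..7
RING = 2 ** M

def successor(nodes, ident):
    """First node at or after ident on the ring, wrapping around."""
    ident %= RING
    later = [n for n in sorted(nodes) if n >= ident]
    return later[0] if later else min(nodes)

def finger_table(n, nodes):
    """List of (start, successor) pairs for fingers 1..M of node n."""
    table = []
    for i in range(1, M + 1):
        start = (n + 2 ** (i - 1)) % RING
        table.append((start, successor(nodes, start)))
    return table

nodes = [0, 1, 3]
for n in nodes:
    print(n, finger_table(n, nodes))
# 0 [(1, 1), (2, 3), (4, 0)]
# 1 [(2, 3), (3, 3), (5, 0)]
# 3 [(4, 0), (5, 0), (7, 0)]
```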

Lookup with fingers
Lookup(my-id, key-id)
  look in local finger table for highest node n s.t. my-id < n < key-id
  if n exists
    call Lookup(key-id) on node n // next hop
  else
    return my successor // done

// ask node n to find the successor of id
n.find_successor(id)
  if (id ∈ (n, successor])
    return successor;
  else
    n′ = closest_preceding_node(id);
    return n′.find_successor(id);

// search the local table for the highest predecessor of id
n.closest_preceding_node(id)
  for i = m downto 1
    if (finger[i] ∈ (n, id))
      return finger[i];
  return n;

Fig. 5. Scalable key lookup using the finger table.
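
The pseudocode above translates almost line by line into Python. This sketch runs it on a 3-bit ring with nodes 0, 1, and 3; it simulates all nodes in one process with a global `FINGERS` dictionary, which is an assumption made to keep the example self-contained (a real node would issue a remote call for the recursive step).

```python
M = 3
RING = 2 ** M
NODES = [0, 1, 3]   # sorted node IDs

def between(x, a, b, inclusive_right=False):
    """True if x is in the circular interval (a, b), or (a, b] if requested."""
    if inclusive_right and x == b:
        return True
    if a < b:
        return a < x < b
    return x > a or x < b   # the interval wraps past zero

def successor(ident):
    ident %= RING
    later = [n for n in NODES if n >= ident]
    return later[0] if later else NODES[0]

# FINGERS[n][i-1] is finger i of node n: the successor of n + 2^(i-1).
FINGERS = {n: [successor(n + 2 ** i) for i in range(M)] for n in NODES}

def closest_preceding_node(n, ident):
    """Highest finger of n that lies strictly between n and ident."""
    for f in reversed(FINGERS[n]):
        if between(f, n, ident):
            return f
    return n

def find_successor(n, ident, path=None):
    """Return (node responsible for ident, list of nodes visited)."""
    path = (path or []) + [n]
    succ = FINGERS[n][0]   # finger 1 is the immediate successor
    if between(ident, n, succ, inclusive_right=True):
        return succ, path
    return find_successor(closest_preceding_node(n, ident), ident, path)

print(find_successor(1, 6))   # (0, [1, 3])
print(find_successor(1, 2))   # (3, [1])
```

Each hop at least halves the remaining ID distance to the key, which is where the logarithmic hop count comes from.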

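Chord lookup example • Lookup(1,6) • Lookup(1,2)
[Figure: finger tables on the ring with positions 0 to 7, listed as start, interval, successor]
• Node 0 (Keys: 5, 6) – 1, [1,2), 1 – 2, [2,4), 3 – 4, [4,0), 0
• Node 1 (Keys: 1) – 2, [2,3), 3 – 3, [3,5), 3 – 5, [5,1), 0
• Node 3 (Keys: 2) – 4, [4,5), 0 – 5, [5,7), 0 – 7, [7,3), 0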

Node join • Maintain the invariant 1. Each node's successor is correctly maintained 2. For every key k, node successor(k) answers for k. It's desirable that finger table entries are correct • Each node maintains a predecessor pointer • Tasks: – Initialize predecessor and fingers of new node – Update existing nodes' state – Notify apps to transfer state to new node

Chord joining: linked list insert [Figure: N25 and N40 on the ring, N40 holding K30 and K38; new node N36 performs step 1, Lookup(36)] • Node n queries a known node n′ to initialize its state • for its successor: lookup(n)

Join (2) [Figure: N25, N36, and N40, with K30 and K38 still at N40] 2. N36 sets its own successor pointer

Join (3) [Figure: N25, N36 holding a copy of K30, and N40 holding K30 and K38] 3. Copy keys 26..36 from N40 to N36 • Note that join does not make the network aware of n

Join (4): stabilize [Figure: N25, N36 holding K30, and N40 holding K38] 4. Set N25's successor pointer • Stabilize 1) obtains a node n's successor's predecessor x, and determines whether x should be n's successor 2) notifies n's successor of n's existence – N25 asks its successor N40 for N40's predecessor – N25 sets its successor to N36 – N25 notifies N36 that N25 is its predecessor • Update finger pointers in the background periodically – Find the successor of each entry i • Correct successors produce correct lookups
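
The join and stabilize steps can be traced in a small in-process simulation. This is a sketch under simplifying assumptions: nodes are plain Python objects rather than networked hosts, key transfer (step 3) and finger maintenance are left out, and the class and method names follow the slide's wording rather than any particular implementation.

```python
def between(x, a, b):
    """True if x lies strictly inside the circular interval (a, b)."""
    if a < b:
        return a < x < b
    return x > a or x < b

class Node:
    def __init__(self, ident):
        self.id = ident
        self.successor = self
        self.predecessor = None

    def find_successor(self, ident):
        """Successor-only lookup, enough for this small ring."""
        node = self
        while not (ident == node.successor.id
                   or between(ident, node.id, node.successor.id)):
            node = node.successor
        return node.successor

    def join(self, known):
        """Steps 1 and 2: look up our own ID and set our successor pointer."""
        self.predecessor = None
        self.successor = known.find_successor(self.id)

    def stabilize(self):
        """Adopt successor's predecessor if it sits between us, then notify."""
        x = self.successor.predecessor
        if x is not None and between(x.id, self.id, self.successor.id):
            self.successor = x
        self.successor.notify(self)

    def notify(self, candidate):
        """candidate believes it is our predecessor."""
        if self.predecessor is None or between(
                candidate.id, self.predecessor.id, self.id):
            self.predecessor = candidate

# Existing two-node ring: N25 <-> N40.
n25, n40 = Node(25), Node(40)
n25.successor, n25.predecessor = n40, n40
n40.successor, n40.predecessor = n25, n25

n36 = Node(36)
n36.join(n25)      # N36 learns that its successor is N40
n36.stabilize()    # N40 learns that N36 is its predecessor
n25.stabilize()    # N25 sees N36 between itself and N40 and adopts it

print(n25.successor.id, n36.successor.id, n36.predecessor.id)   # 36 40 25
```

Until N25 runs stabilize, lookups that pass through N25 still skip N36, which is why the slide notes that join alone does not make the network aware of the new node.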

Failures might cause incorrect lookup [Figure: ring with nodes N10, N80, N85, N102, N113, N120; N80 handles Lookup(90)] • N80 doesn't know its correct successor, so the lookup is incorrect

Solution: successor lists • Each node knows r immediate successors • After failure, will know first live successor • Correct successors guarantee correct lookups • Guarantee is with some probability • Higher layer software can be notified to duplicate keys at failed nodes to live successors
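
A minimal sketch of how a successor list is used after a failure. The list contents echo node IDs from the previous slide, but the list length r = 3 and the set of failed nodes are hypothetical, as is the `is_alive` callback (a real node would detect failure through timeouts).

```python
def first_live_successor(successor_list, is_alive):
    """Return the first entry of the successor list that still responds."""
    for node in successor_list:
        if is_alive(node):
            return node
    return None   # every listed successor is dead: the ring is broken here

# Hypothetical state at N80 with r = 3.
successor_list = [85, 102, 113]
failed = {85, 102}

print(first_live_successor(successor_list, lambda n: n not in failed))   # 113
```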

Choosing the successor list length • Assume 1/2 of nodes fail • P(successor list all dead) = $(1/2)^r$ – I.e. P(this node breaks the Chord ring) – Depends on independent failure • P(no broken nodes) = $(1 - (1/2)^r)^N$ – $r = 2\log_2 N$ gives $(1/2)^r = 1/N^2$, so prob. $\approx 1 - 1/N$
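
The approximation can be checked numerically. The sketch below assumes base-2 logarithms and independent failures with probability 1/2, as on the slide; N = 1024 is an arbitrary example size.

```python
import math

def p_ring_intact(n_nodes, r, p_fail=0.5):
    """P(no node has all r listed successors dead), assuming independent failures."""
    p_list_dead = p_fail ** r
    return (1 - p_list_dead) ** n_nodes

N = 1024
r = 2 * int(math.log2(N))   # r = 20, so (1/2)^r = 1/N^2

p = p_ring_intact(N, r)
print(p, 1 - 1 / N)   # the two values agree to about six decimal places
```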

Lookup with fault tolerance
Lookup(my-id, key-id)
  look in local finger table and successor-list for highest node n s.t. my-id < n < key-id
  if n exists
    call Lookup(key-id) on node n // next hop
    if call failed,
      remove n from finger table
      return Lookup(my-id, key-id)
  else
    return my successor // done

Chord performance • Per-node storage (K keys, N nodes) – Ideally: K/N – Implementation: large variance due to uneven node ID distribution • Lookup latency – O(log N)

Comments on Chord • DHTs are used for p2p file lookup in the real world • ID distance ≠ network distance – Reducing lookup latency and improving locality are research challenges • Strict successor selection – Can't overshoot • Asymmetry – A node does not learn its routing table entries from queries it receives

Conclusion • Consistent hashing – What problem does it solve? • Design of DHTs – Chord: ring – Kademlia: tree (used in practice: eMule, BitTorrent) – CAN: hypercube – Many others: Pastry, Tapestry, Viceroy, …

Discussion • What tradeoff does Chord make? • How can we improve Chord's lookup latency? • What are the possible applications of DHTs? • Recursive lookup or iterative lookup?
