

SLIDE 1

Ken Birman

Cornell University. CS5410 Fall 2008.

SLIDE 2

What is a Distributed Hash Table (DHT)?

Exactly that ☺ A service, distributed over multiple machines, with hash table semantics

Put(key, value), Value = Get(key)

Designed to work in a peer‐to‐peer (P2P) environment

No central control; nodes under different administrative control

But of course can operate in an “infrastructure” sense

SLIDE 3

More specifically

Hash table semantics:

Put(key, value), Value = Get(key)

Key is a single flat string; limited semantics compared to keyword search

Put() causes value to be stored at one (or more) peer(s)

Get() retrieves value from a peer

Put() and Get() accomplished with unicast routed messages

In other words, it scales

Other API calls to support applications, like notification when neighbors come and go
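
To make the Put/Get interface concrete, here is a minimal sketch of hash‐table semantics at a single peer (the DHTNode class and its fields are illustrative, not part of any of the systems discussed); in a real DHT, put and get would be routed to whichever peer owns the key:

    import hashlib

    class DHTNode:
        """Illustrative peer exposing hash table semantics."""

        def __init__(self):
            self.store = {}                    # keys this peer is responsible for

        @staticmethod
        def hash_key(key: str) -> int:
            # Keys are single flat strings, mapped into a numeric key space
            return int(hashlib.sha1(key.encode()).hexdigest(), 16)

        def put(self, key: str, value) -> None:
            self.store[self.hash_key(key)] = value     # store at one (or more) peers

        def get(self, key: str):
            return self.store.get(self.hash_key(key))  # retrieve from a peer

In the systems that follow, the interesting part is precisely the routing that is elided here.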

SLIDE 4

P2P “environment”

Nodes come and go at will (possibly quite frequently: every few minutes)

Nodes have heterogeneous capacities

Bandwidth, processing, and storage

Nodes may behave badly

Promise to do something (store a file) and not do it (free‐loaders)

Attack the system

SLIDE 5

Several flavors, each with variants

Tapestry (Berkeley)

Based on Plaxton trees‐‐‐similar to hypercube routing

The first* DHT

Complex and hard to maintain (hard to understand too!)

CAN (ACIRI), Chord (MIT), and Pastry (Rice/MSR Cambridge)

Second wave of DHTs (contemporary with and independent of each other)

* Landmark Routing, 1988, used a form of DHT called Assured Destination Binding (ADB)

SLIDE 6

Basics of all DHTs

Goal is to build some “structured” overlay network with the following characteristics:

Node IDs can be mapped to the hash key space

Given a hash key as a “destination address”, you can route through the network to a given node

Always route to the same node no matter where you start from

[Figure: ring overlay with node IDs 13, 33, 58, 81, 97, 111, 127]

SLIDE 7

Simple example (doesn’t scale)

Circular number space 0 to 127

Routing rule is to move counter‐clockwise until current node ID ≥ key, and last hop node ID < key

Example: key = 42. Obviously you will route to node 58, no matter where you start from

[Figure: same ring of nodes 13, 33, 58, 81, 97, 111, 127]
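
A small sketch of that routing rule on the 0–127 ring, using the node IDs from the figure (the helper name is invented): it simply finds the first node whose ID is at least the key, wrapping around.

    NODES = sorted([13, 33, 58, 81, 97, 111, 127])   # node IDs from the figure

    def responsible_node(key: int) -> int:
        """First node with ID >= key, wrapping past the largest ID."""
        for node in NODES:
            if node >= key:
                return node
        return NODES[0]

    assert responsible_node(42) == 58    # the slide's example: key 42 routes to node 58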

SLIDE 8

Building any DHT

Newcomer always starts with at least one known member

[Figure: ring of nodes 13, 33, 58, 81, 97, 111, 127; newcomer 24 outside the ring]

SLIDE 9

Building any DHT

Newcomer always starts with at least one known member

Newcomer searches for “self” in the network

hash key = newcomer’s node ID

Search results in a node in the vicinity where newcomer needs to be

[Figure: ring of nodes 13, 33, 58, 81, 97, 111, 127; newcomer 24 searching for its place]

SLIDE 10

Building any DHT

Newcomer always starts with at least one known member

Newcomer searches for “self” in the network

hash key = newcomer’s node ID

Search results in a node in the vicinity where newcomer needs to be

Links are added/removed to satisfy properties of network

[Figure: ring now includes node 24]

SLIDE 11

Building any DHT

Newcomer always starts with at least one known member

Newcomer searches for “self” in the network

hash key = newcomer’s node ID

Search results in a node in the vicinity where newcomer needs to be

Links are added/removed to satisfy properties of network

Objects that now hash to new node are transferred to new node

[Figure: ring now includes node 24]

SLIDE 12

Insertion/lookup for any DHT

Hash name of object to produce key

Well‐known way to do this

Use key as destination address to route through network

Routes to the target node

Insert object, or retrieve object, at the target node

[Figure: ring of nodes 13, 24, 33, 58, 81, 97, 111, 127; foo.htm→93]
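
A hedged sketch of the insertion/lookup path: hash the object name into the ring’s key space and hand the key to the routing layer (for example, the responsible_node rule sketched after SLIDE 7). The slide’s figure maps foo.htm to 93; a real hash of the string would of course land somewhere else.

    import hashlib

    def object_key(name: str, space: int = 128) -> int:
        """Well-known way to produce a key: hash the name, reduce mod the key space."""
        digest = hashlib.sha1(name.encode()).digest()
        return int.from_bytes(digest, "big") % space

    key = object_key("foo.htm")   # use this key as the routing "destination address"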

SLIDE 13

Properties of most DHTs

Memory requirements grow (something like) logarithmically with N

Routing path length grows (something like) logarithmically with N

Cost of adding or removing a node grows (something like) logarithmically with N

Has caching, replication, etc…

SLIDE 14

DHT Issues

Resilience to failures

Load balance

Heterogeneity

Number of objects at each node

Routing hot spots

Lookup hot spots

Locality (performance issue)

Churn (performance and correctness issue)

Security

SLIDE 15

We’re going to look at four DHTs

At varying levels of detail…

CAN (Content Addressable Network)

ACIRI (now ICIR)

Chord

MIT

Kelips

Cornell

Pastry

Rice/Microsoft Cambridge

SLIDE 16

Things we’re going to look at

What is the structure?

How does routing work in the structure?

How does it deal with node departures?

How does it scale?

How does it deal with locality?

What are the security issues?

SLIDE 17

CAN structure is a Cartesian coordinate space in a D‐dimensional torus

[Figure: coordinate space occupied by node 1]

CAN graphics care of Santashil PalChaudhuri, Rice Univ

SLIDE 18

Simple example in two dimensions

[Figure: space split between nodes 1 and 2]

SLIDE 19

Note: torus wraps on “top” and “sides”

[Figure: space split among nodes 1, 2, and 3]

SLIDE 20

Each node in CAN network occupies a “square” in the space

[Figure: space split among nodes 1, 2, 3, and 4]

SLIDE 21

With relatively uniform square sizes

SLIDE 22

Neighbors in CAN network

Neighbor is a node that:

Overlaps d-1 dimensions

Abuts along one dimension

SLIDE 23

Route to neighbors closer to target

d‐dimensional space, n zones

Zone is space occupied by a “square” in one dimension

Avg. route path length: (d/4)·n^(1/d)

Number of neighbors = O(d)

Tunable (vary d or n)

Can factor proximity into route decision

[Figure: route from (a,b) to (x,y) through zones Z1, Z2, Z3, Z4 … Zn]
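
The path-length figure is easy to play with directly; this back-of-the-envelope sketch (assuming the usual ~2d neighbors in a uniform d-dimensional torus) shows the tunability the slide mentions: raising d shortens routes but grows the neighbor set.

    def can_route_stats(n_zones: int, d: int):
        """Average CAN route length (d/4) * n^(1/d) and neighbor count ~2d."""
        avg_path = (d / 4.0) * n_zones ** (1.0 / d)
        return avg_path, 2 * d

    for d in (2, 4, 8):
        hops, nbrs = can_route_stats(n_zones=1_000_000, d=d)
        print(f"d={d}: ~{hops:.0f} hops, {nbrs} neighbors")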

SLIDE 24

Chord uses a circular ID space

[Figure: circular ID space with nodes N10, N32, N60, N80, N100; each key lives at its successor — K5, K10 at N10; K11, K30 at N32; K33, K40, K52 at N60; K65, K70 at N80; K100 at N100]

  • Successor: node with next highest ID

Chord slides care of Robert Morris, MIT

SLIDE 25

Basic Lookup

[Figure: ring of nodes N5, N10, N20, N32, N40, N60, N80, N99, N110; “Where is key 50?” … “Key 50 is at N60”]

  • Lookups find the ID’s predecessor
  • Correct if successors are correct
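
A minimal sketch of the basic lookup on this ring, walking successor pointers until the key falls between a node and its successor (the dict-based ring and helper names are illustrative):

    def in_interval(x: int, a: int, b: int) -> bool:
        """True if x lies in the circular interval (a, b]."""
        return a < x <= b if a < b else (x > a or x <= b)

    def basic_lookup(start: int, key: int, successor: dict) -> int:
        node = start
        while not in_interval(key, node, successor[node]):
            node = successor[node]            # keep walking around the ring
        return successor[node]                # the key's successor stores the key

    RING = {5: 10, 10: 20, 20: 32, 32: 40, 40: 60,
            60: 80, 80: 99, 99: 110, 110: 5}
    assert basic_lookup(10, 50, RING) == 60   # "Where is key 50?"  ->  N60
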
SLIDE 26

Successor Lists Ensure Robust Lookup

[Figure: same ring; each node annotated with its 3-entry successor list, e.g. N5: 10, 20, 32; N60: 80, 99, 110; N80: 99, 110, 5]

  • Each node remembers r successors
  • Lookup can skip over dead nodes to find blocks
  • Periodic check of successor and predecessor links
SLIDE 27

Chord “Finger Table” Accelerates Lookups

To build finger tables, new node searches for the key values for each finger

To do it efficiently, new nodes obtain successor’s finger table, and use it as a hint to optimize the search

[Figure: node N80 with fingers spanning ½, ¼, 1/8, 1/16, 1/32, 1/64, 1/128 of the ring]
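
As a sketch of what the finger table holds, finger i of node n points at the successor of (n + 2^i) mod 2^m. The snippet below computes N80’s fingers from global knowledge purely for illustration (the 128-slot space is an assumption); as the slide notes, a joining node instead seeds this search with its successor’s table.

    import bisect

    M = 7                                        # assume a 2**7 = 128-slot ID space
    NODES = sorted([5, 10, 20, 32, 40, 60, 80, 99, 110])

    def successor(key: int) -> int:
        i = bisect.bisect_left(NODES, key % 2**M)
        return NODES[i % len(NODES)]             # wrap past the largest ID

    def fingers(n: int) -> list:
        # finger i targets a point 2**i away; the last finger reaches halfway around
        return [successor(n + 2**i) for i in range(M)]

    print(fingers(80))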

SLIDE 28

Chord lookups take O(log N) hops

[Figure: Lookup(K19) issued at N32 uses finger hops across the ring of nodes N5 … N110; K19 resolves at N20]

SLIDE 29

Drill down on Chord reliability

Interested in maintaining a correct routing table (successors, predecessors, and fingers)

Primary invariant: correctness of successor pointers

Fingers, while important for performance, do not have to be exactly correct for routing to work

Algorithm is to “get closer” to the target

Successor nodes always do this

SLIDE 30

Maintaining successor pointers

Periodically run “stabilize” algorithm

Finds successor’s predecessor; repair if this isn’t self

This algorithm is also run at join

Eventually routing will repair itself

Fix_finger also periodically run

For randomly selected finger
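
A sketch of the stabilize/notify steps as described here (the Node class and between() helper are illustrative, not the Chord paper’s pseudocode): each node periodically asks its successor for that node’s predecessor, adopts it if a newcomer has slipped in between, and notifies the successor of itself.

    def between(x, a, b):
        """True if x lies strictly inside the circular interval (a, b)."""
        return (a < x < b) if a < b else (x > a or x < b)

    class Node:
        def __init__(self, node_id):
            self.id = node_id
            self.successor = self          # a lone node points at itself
            self.predecessor = None

        def stabilize(self):
            x = self.successor.predecessor
            if x is not None and between(x.id, self.id, self.successor.id):
                self.successor = x         # a newcomer joined in between us
            self.successor.notify(self)

        def notify(self, candidate):
            # candidate believes it may be our predecessor
            if self.predecessor is None or between(candidate.id,
                                                   self.predecessor.id, self.id):
                self.predecessor = candidate

Run periodically at every node, this is what repairs the pointers in the 20/25/28/30 join examples on the next slides.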

SLIDE 31

Initial: 25 wants to join correct ring (between 20 and 30)

25 finds successor, and tells successor (30) of itself

20 runs “stabilize”: 20 asks 30 for 30’s predecessor; 30 returns 25; 20 tells 25 of itself

[Figure: successive ring snapshots of 20, 25, 30 as the pointers converge]

SLIDE 32

This time, 28 joins before 20 runs “stabilize”

28 finds successor, and tells successor (30) of itself

20 runs “stabilize”: 20 asks 30 for 30’s predecessor; 30 returns 28; 20 tells 28 of itself

[Figure: ring snapshots with both 25 and 28 joining between 20 and 30]

SLIDE 33

25 runs “stabilize”, then 20 runs “stabilize”

[Figure: successive ring snapshots as the pointers among 20, 25, 28, and 30 converge]

SLIDE 34

Pastry also uses a circular number space

Difference is in how the “fingers” are created

Pastry uses prefix match overlap rather than binary splitting

More flexibility in neighbor selection

[Figure: Route(d46a1c) starting at node 65a1fc; prefix-match hops through d13da3, d4213f, d462ba toward the target region near d467c4 and d471f1]
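
A sketch of the prefix-match idea, simplified: this version forwards to the best-matching node it happens to know, whereas real Pastry only needs some node sharing at least one more digit with the key. The IDs are the ones in the figure.

    def shared_prefix(a: str, b: str) -> int:
        n = 0
        for x, y in zip(a, b):
            if x != y:
                break
            n += 1
        return n

    def next_hop(current: str, key: str, known: list) -> str:
        """Forward to a known node whose ID shares a longer prefix with the key."""
        best = max(known, key=lambda n: shared_prefix(n, key))
        return best if shared_prefix(best, key) > shared_prefix(current, key) else current

    ring = ["65a1fc", "d13da3", "d4213f", "d462ba", "d467c4", "d471f1"]
    print(next_hop("65a1fc", "d46a1c", ring))    # hops toward the d46... region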

SLIDE 35

Pastry routing table (for node 65a1fc)

Pastry nodes also have a “leaf set” of immediate neighbors up and down the ring

Similar to Chord’s list of successors
slide-36
SLIDE 36

Pastry join

X = new node, A = bootstrap, Z = nearest node A finds Z for X In process A Z and all nodes in path send state tables to X In process, A, Z, and all nodes in path send state tables to X X settles on own table

Possibly after contacting other nodes

X tells everyone who needs to know about itself Pastry paper doesn’t give enough information to understand how

concurrent joins work concurrent joins work

18th IFIP/ACM, Nov 2001

SLIDE 37

Pastry leave

Noticed by leaf set neighbors when leaving node doesn’t respond

Neighbors ask highest and lowest nodes in leaf set for new leaf set

Noticed by routing neighbors when message forward fails

Immediately can route to another neighbor

Fix entry by asking another neighbor in the same “row” for its neighbor

If this fails, ask somebody a level up

SLIDE 38

For instance, this neighbor fails

SLIDE 39

Ask other neighbors

Try asking some neighbor in the same row for its 655x entry

If it doesn’t have one, try asking some neighbor in the row below, etc.

SLIDE 40

CAN, Chord, Pastry differences

CAN, Chord, and Pastry have deep similarities

Some (important???) differences exist

CAN nodes tend to know of multiple nodes that allow equal progress

Can therefore use additional criteria (RTT) to pick next hop

Pastry allows greater choice of neighbor

Can thus use additional criteria (RTT) to pick neighbor

In contrast, Chord has more determinism

Harder for an attacker to manipulate system?

SLIDE 41

Security issues

In many P2P systems, members may be malicious

If peers untrusted, all content must be signed to detect forged content

Requires certificate authority

Like we discussed in secure web services talk

This is not hard, so can assume at least this level of security

SLIDE 42

Security issues: Sybil attack

Attacker pretends to be multiple systems

If it surrounds a node on the circle, it can potentially arrange to capture all traffic

Or if not this, at least cause a lot of trouble by being many nodes

Chord requires node ID to be an SHA‐1 hash of its IP address

But to deal with load balance issues, a Chord variant allows nodes to replicate themselves

A central authority must hand out node IDs and certificates to go with them

Not P2P in the Gnutella sense

SLIDE 43

General security rules

Check things that can be checked

Invariants, such as successor list in Chord

Minimize invariants, maximize randomness

Hard for an attacker to exploit randomness

Avoid any single dependencies

Allow multiple paths through the network

Allow content to be placed at multiple nodes

But all this is expensive…

SLIDE 44

Load balancing

Query hotspots: given object is popular

Cache at neighbors of hotspot, neighbors of neighbors, etc.

Classic caching issues

Routing hotspot: node is on many paths

Of the three, Pastry seems most likely to have this problem, because neighbor selection is more flexible (and based on proximity)

This doesn’t seem adequately studied

SLIDE 45

Load balancing

Heterogeneity (variance in bandwidth or node capacity)

Poor distribution in entries due to hash function inaccuracies

One class of solution is to allow each node to be multiple virtual nodes

Higher capacity nodes virtualize more often

But security makes this harder to do

SLIDE 46

Chord node virtualization

10K nodes, 1M objects

20 virtual nodes per node has much better load balance, but each node requires ~400 neighbors!
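
The ~400-neighbor figure is easy to sanity-check with back-of-the-envelope arithmetic (assumed reasoning, not shown on the slide): each of the 20 virtual nodes keeps on the order of log2(total virtual nodes) fingers, and successor lists add a bit more.

    import math

    physical_nodes = 10_000
    v = 20                                               # virtual nodes per physical node
    virtual_nodes = physical_nodes * v
    fingers_each = math.ceil(math.log2(virtual_nodes))   # ~18 fingers per virtual node
    print(v * fingers_each)                              # ~360, plus successor lists -> ~400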

SLIDE 47

Primary concern: churn

Churn: nodes joining and leaving frequently

Join or leave requires a change in some number of links

Those changes depend on correct routing tables in other nodes

Cost of a change is higher if routing tables not correct

In Chord, ~6% of lookups fail if three failures per stabilization

But as more changes occur, probability of incorrect routing tables increases

SLIDE 48

Control traffic load generated by churn

Chord and Pastry appear to deal with churn differently

Chord join involves some immediate work, but repair is done periodically

Extra load only due to join messages

Pastry join and leave involve immediate repair of all affected nodes’ tables

Routing tables repaired more quickly, but cost of each join/leave goes up with frequency of joins/leaves

Scales quadratically with number of changes??? Can result in network meltdown???

SLIDE 49

Kelips takes a different approach

Network partitioned into √N “affinity groups”

Hash of node ID determines which affinity group a node is in

Each node knows:

One or more nodes in each group

All objects and nodes in own group

But this knowledge is soft‐state, spread through peer‐to‐peer “gossip” (epidemic multicast)!
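
A sketch of the affinity-group assignment just described (function and parameter names are invented): with N nodes there are about √N groups, and a hash of the node ID picks the group.

    import hashlib
    import math

    def affinity_group(node_id: str, n_nodes: int) -> int:
        groups = max(1, round(math.sqrt(n_nodes)))          # ~sqrt(N) affinity groups
        h = int(hashlib.sha1(node_id.encode()).hexdigest(), 16)
        return h % groups

    print(affinity_group("node-110", n_nodes=10_000))       # one of ~100 groups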

SLIDE 50

Kelips

Affinity Groups: peer membership thru consistent hash

110 knows about other members – 230, 30…

Affinity group view (at node 110):

  id    hbeat   rtt
  30    234     90ms
  230   322     30ms

[Figure: affinity groups 1, 2, … √N, with roughly √N members per affinity group; affinity group pointers from node 110]

SLIDE 51

Kelips

Affinity Groups: peer membership thru consistent hash

202 is a “contact” for 110 in group 2

Affinity group view (at node 110):

  id    hbeat   rtt
  30    234     90ms
  230   322     30ms

Contacts (at node 110):

  group   contactNode
  …       …
  2       202

[Figure: affinity groups 1, 2, … √N, with roughly √N members per affinity group; contact pointers]

SLIDE 52

Kelips

Affinity Groups: peer membership thru consistent hash

“cnn.com” maps to group 2. So 110 tells group 2 to “route” inquiries about cnn.com to it.

Gossip protocol replicates resource info cheaply

Affinity group view (at node 110):

  id    hbeat   rtt
  30    234     90ms
  230   322     30ms

Contacts (at node 110):

  group   contactNode
  …       …
  2       202

Resource Tuples (in group 2):

  resource    info
  …           …
  cnn.com     110

[Figure: affinity groups 1, 2, … √N, with roughly √N members per affinity group]

SLIDE 53

How it works

Kelips is entirely gossip based!

Gossip about membership

Gossip to replicate and repair data

Gossip about “last heard from” time used to discard failed nodes

Gossip “channel” uses fixed bandwidth

… fixed rate, packets of limited size

SLIDE 54

Gossip 101

Suppose that I know something

I’m sitting next to Fred, and I tell him

Now 2 of us “know”

Later, he tells Mimi and I tell Anne

Now 4

This is an example of a push epidemic

Push‐pull occurs if we exchange data
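
A toy simulation of the push epidemic just described (all parameters invented): every node that knows the rumor pushes it to one random peer per round, and the number of informed nodes roughly doubles until the whole system knows, in O(log N) rounds.

    import random

    def push_gossip_rounds(n: int = 10_000, seed: int = 1) -> int:
        random.seed(seed)
        knows = {0}                                # one node starts out knowing
        rounds = 0
        while len(knows) < n:
            pushes = {random.randrange(n) for _ in knows}
            knows |= pushes
            rounds += 1
        return rounds

    print(push_gossip_rounds())                    # typically a few dozen rounds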

SLIDE 55

Gossip scales very nicely

Participants’ loads independent of size

Network load linear in system size

Information spreads in log(system size) time

[Figure: logistic “% infected” curve rising from 0.0 to 1.0 over time]

SLIDE 56

Gossip in distributed systems

We can gossip about membership

Need a bootstrap mechanism, but then discuss failures, new members

Gossip to repair faults in replicated data

“I have 6 updates from Charlie”

If we aren’t in a hurry, gossip to replicate data too

SLIDE 57

Gossip about membership

Start with a bootstrap protocol

For example, processes go to some web site and it lists a dozen nodes where the system has been stable for a long time

Pick one at random

Then track “processes I’ve heard from recently” and “processes other people have heard from recently”

Use push gossip to spread the word

SLIDE 58

Gossip about membership

Until messages get full, everyone will know when everyone else last sent a message

With delay of log(N) gossip rounds…

But messages will have bounded size

Perhaps 8K bytes

Then use some form of “prioritization” to decide what to omit – but never send more, or larger messages

Thus: load has a fixed, constant upper bound except on the network itself, which usually has infinite capacity
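
A sketch of the bounded-size membership message described above. The sizes and the “most recently heard from first” priority are assumptions; the slide only requires that something be omitted rather than ever sending more, or larger, messages.

    ENTRY_BYTES = 32            # assumed wire size of one (node id, last-heard) entry
    BUDGET = 8 * 1024           # "perhaps 8K bytes"

    def build_gossip_message(membership: dict) -> list:
        """membership maps node id -> last-heard-from timestamp."""
        ranked = sorted(membership.items(), key=lambda kv: kv[1], reverse=True)
        return ranked[:BUDGET // ENTRY_BYTES]      # the rest is omitted this round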

SLIDE 59

Back to Kelips: Quick reminder

Affinity Groups: peer membership thru consistent hash

[Figure: same diagram as SLIDE 51 — affinity groups 1, 2, … √N; node 110’s affinity group view (id, hbeat, rtt) and contacts table (group, contactNode); contact pointers]

SLIDE 60

How Kelips works

Gossip about everything

Heuristic to pick contacts: periodically ping contacts to check liveness, RTT… swap so‐so ones for better ones.

[Figure: Node 175 is a contact for Node 102 in some affinity group; watching the gossip data stream, 102 decides “Hmm… Node 19 looks like a much better contact”]

SLIDE 61

Replication makes it robust

Kelips should work even during disruptive episodes

After all, tuples are replicated to √N nodes

Query k nodes concurrently to overcome isolated crashes; also reduces risk that very recent data could be missed

… we often overlook importance of showing that systems work while recovering from a disruption

SLIDE 62

Chord can malfunction if the network partitions…

[Figure: transient network partition between Europe and USA splits one Chord ring (nodes 30, 64, 108, 123, 177, 199, 202, 241, 248, 255) into two separate rings]

SLIDE 63

… so, who cares?

Chord lookups can fail… and it suffers from high overheads when nodes churn

Loads surge just when things are already disrupted… quite often, because of loads

And can’t predict how long Chord might remain disrupted once it gets that way

Worst case scenario: Chord can become inconsistent and stay that way

SLIDE 64

Control traffic load generated by churn

  Kelips:  None
  Chord:   O(changes)
  Pastry:  O(Changes × Nodes)?

SLIDE 65

Take‐Aways?

Surprisingly easy to superimpose a hash‐table lookup onto a potentially huge distributed system!

We’ve seen three O(log N) solutions and one O(1) solution (but Kelips needed more space)

Sample applications?

Peer‐to‐peer file sharing

Amazon uses DHT for the shopping cart

CoDNS: A better version of DNS