

SLIDE 1

Distributed Systems

Principles and Paradigms

Chapter 05

(version September 20, 2007)

Maarten van Steen

Vrije Universiteit Amsterdam, Faculty of Science
Dept. Mathematics and Computer Science
Room R4.20, Tel: (020) 598 7784
E-mail: steen@cs.vu.nl, URL: www.cs.vu.nl/~steen/

01 Introduction
02 Architectures
03 Processes
04 Communication
05 Naming
06 Synchronization
07 Consistency and Replication
08 Fault Tolerance
09 Security
10 Distributed Object-Based Systems
11 Distributed File Systems
12 Distributed Web-Based Systems
13 Distributed Coordination-Based Systems


SLIDE 2

Naming Entities

  • Names, identifiers, and addresses
  • Name resolution
  • Name space implementation

05 – 1 Naming/5.1 Naming Entities

SLIDE 3

Naming

Essence: Names are used to denote entities in a distributed system. To operate on an entity, we need to access it at an access point. Access points are entities that are named by means of an address.

Note: A location-independent name for an entity E is independent of the addresses of the access points offered by E.

05 – 2 Naming/5.1 Naming Entities

SLIDE 4

Identifiers

Pure name: A name that has no meaning at all; it is just a random string. Pure names can be used for comparison only.

Identifier: A name having the following properties:

  P1 Each identifier refers to at most one entity
  P2 Each entity is referred to by at most one identifier
  P3 An identifier always refers to the same entity (prohibits reusing an identifier)

Observation: An identifier need not necessarily be a pure name, i.e., it may have content.

Question: Can the content of an identifier ever change?

05 – 3 Naming/5.1 Naming Entities

SLIDE 5

Flat Naming

Problem: Given an essentially unstructured name (e.g., an identifier), how can we locate its associated access point?

  • Simple solutions (broadcasting)
  • Home-based approaches
  • Distributed Hash Tables (structured P2P)
  • Hierarchical location service

05 – 4 Naming/5.2 Flat Naming

SLIDE 6

Simple Solutions

Broadcasting: Simply broadcast the ID, requesting the entity to return its current address.

  • Can never scale beyond local-area networks (think of ARP/RARP)
  • Requires all processes to listen to incoming location requests

Forwarding pointers: Each time an entity moves, it leaves behind a pointer telling where it has gone to.

  • Dereferencing can be made entirely transparent to clients by simply following the chain of pointers
  • Update a client’s reference as soon as the present location has been found
  • Geographical scalability problems:
    – Long chains are not fault tolerant
    – Increased network latency at dereferencing
    – Essential to have separate chain reduction mechanisms
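The chain-following and chain-reduction idea can be sketched as follows (a minimal sketch; the site names and the stored object are invented for illustration):

```python
# Each site holds either the object itself or a forwarding pointer left
# behind when the object moved away.
LOCATIONS = {"A": ("object", 42)}

def move(src, dst):
    LOCATIONS[dst] = LOCATIONS[src]
    LOCATIONS[src] = ("forward", dst)   # leave a forwarding pointer behind

def dereference(site):
    """Follow the chain of pointers, then shortcut it (chain reduction)."""
    chain = [site]
    while LOCATIONS[site][0] == "forward":
        site = LOCATIONS[site][1]       # transparently follow the chain
        chain.append(site)
    for s in chain[:-1]:                # point every visited site at the end
        LOCATIONS[s] = ("forward", site)
    return site

move("A", "B")
move("B", "C")
print(dereference("A"))   # C  -- and A now points directly at C
```

After the first dereference, the chain A → B → C has collapsed, which is exactly the separate chain reduction the slide calls essential.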

05 – 5 Naming/5.2 Flat Naming

SLIDE 7

Home-Based Approaches (1/2)

Single-tiered scheme: Let a home keep track of where the entity is:

  • An entity’s home address is registered at a naming service
  • The home registers the foreign address of the entity
  • Clients always contact the home first, and then continue with the foreign location

Figure (the principle of Mobile IP, with the client’s location, the host’s home location, and the host’s present location):
  1. Send packet to host at its home
  2. Return address of current location
  3. Tunnel packet to current location
  4. Send successive packets to current location
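The four steps can be sketched as follows (a sketch; the class name, cache, and care-of address are invented for illustration, not Mobile IP’s actual messages):

```python
class HomeAgent:
    """Keeps track of the entity's current (foreign) address."""
    def __init__(self, foreign=None):
        self.foreign = foreign

    def receive(self, packet):
        # Steps 2/3: tunnel the packet onward and report the current location.
        return self.foreign, ("tunneled", packet)

def send(client_cache, home, packet):
    if "current" not in client_cache:
        addr, delivered = home.receive(packet)  # 1: first packet goes via the home
        client_cache["current"] = addr          # 2: home returns the current address
        return delivered                        # 3: the packet was tunneled onward
    return ("direct", packet)                   # 4: successive packets go directly

home = HomeAgent("care-of-addr-1")
cache = {}
print(send(cache, home, "p1"))   # ('tunneled', 'p1')
print(send(cache, home, "p2"))   # ('direct', 'p2')
```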

05 – 6 Naming/5.2 Flat Naming

SLIDE 8

Home-Based Approaches (2/2)

Two-tiered scheme: Keep track of visiting entities:

  • Check local visitor register first
  • Fall back to home location if local lookup fails

Problems with home-based approaches:

  • The home address has to be supported as long as the entity lives.
  • The home address is fixed, which means an unnecessary burden when the entity permanently moves to another location
  • Poor geographical scalability (the entity may be next to the client)

Question: How can we solve the “permanent move” problem?

05 – 7 Naming/5.2 Flat Naming

SLIDE 9

Distributed Hash Tables

Example: Consider the organization of many nodes into a logical ring (Chord)

  • Each node is assigned a random m-bit identifier.
  • Every entity is assigned a unique m-bit key.
  • The entity with key k falls under the jurisdiction of the node with the smallest id ≥ k (called its successor).

Nonsolution: Let each node id keep track of its successor succ(id) and start a linear search along the ring.

05 – 8 Naming/5.2 Flat Naming

SLIDE 10

DHTs: Finger Tables (1/2)

  • Each node p maintains a finger table FT_p[] with at most m entries: FT_p[i] = succ(p + 2^(i−1))
    Note: FT_p[i] points to the first node succeeding p by at least 2^(i−1).
  • To look up a key k, node p forwards the request to the node with index j satisfying q = FT_p[j] ≤ k < FT_p[j+1]
  • If p < k < FT_p[1], the request is also forwarded to FT_p[1]

05 – 9 Naming/5.2 Flat Naming

SLIDE 11

DHTs: Finger Tables (2/2)

Finger tables for the actual nodes 1, 4, 9, 11, 14, 18, 20, 21, 28 on a 5-bit ring (entry i holds succ(p + 2^(i−1))):

Node   FT[1]  FT[2]  FT[3]  FT[4]  FT[5]
 1       4      4      9      9     18
 4       9      9      9     14     20
 9      11     11     14     18     28
11      14     14     18     20     28
14      18     18     18     28      1
18      20     20     28     28      4
20      21     28     28     28      4
21      28     28     28      1      9
28       1      1      1      4     14

Examples: resolve k = 26 starting from node 1; resolve k = 12 starting from node 28.

05 – 10 Naming/5.2 Flat Naming
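The two example resolutions can be traced with a small simulation (a sketch; finger tables are computed on the fly rather than stored, and forwarding picks the furthest finger that does not overshoot the key):

```python
# Sketch of Chord lookups over the slide's node set on a 5-bit ring.
M = 5
RING = 2 ** M
NODES = [1, 4, 9, 11, 14, 18, 20, 21, 28]

def succ(k):
    """Smallest node id >= k, wrapping around the ring."""
    k %= RING
    return min((n for n in NODES if n >= k), default=NODES[0])

def finger_table(p):
    """FT_p[i] = succ(p + 2^(i-1)) for i = 1..M (index 0 unused)."""
    return [None] + [succ(p + 2 ** (i - 1)) for i in range(1, M + 1)]

def in_interval(x, lo, hi):
    """True if x lies in the ring interval (lo, hi]."""
    if lo < hi:
        return lo < x <= hi
    return x > lo or x <= hi            # the interval wraps past 0

def lookup(p, k):
    """Return (responsible node, forwarding path) for key k, starting at p."""
    path = [p]
    while True:
        ft = finger_table(p)
        if in_interval(k, p, ft[1]):    # k lies between p and its successor
            path.append(ft[1])
            return ft[1], path
        # Forward to the furthest finger that does not overshoot k.
        p = max((f for f in ft[1:] if in_interval(f, p, k)),
                key=lambda f: (f - p) % RING)
        path.append(p)

print(lookup(1, 26))    # (28, [1, 18, 20, 21, 28])
print(lookup(28, 12))   # (14, [28, 4, 9, 11, 14])
```

Both paths match the example resolutions above: key 26 is handled by node 28, key 12 by node 14.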

SLIDE 12

Exploiting Network Proximity

Problem: The logical organization of nodes in the overlay may lead to erratic message transfers in the underlying Internet: node k and node succ(k + 1) may be very far apart.

Topology-aware node assignment: When assigning an ID to a node, make sure that nodes close in the ID space are also close in the network. Can be very difficult.

Proximity routing: Maintain more than one possible successor, and forward to the closest. Example: in Chord, FT_p[i] points to the first node in INT = [p + 2^(i−1), p + 2^i − 1]. Node p can also store pointers to other nodes in INT.

Proximity neighbor selection: When there is a choice of selecting who your neighbor will be (not in Chord), pick the closest one.

05 – 11 Naming/5.2 Flat Naming

SLIDE 13

Hierarchical Location Services (HLS)

Basic idea: Build a large-scale search tree for which the underlying network is divided into hierarchical domains. Each domain is represented by a separate directory node.

Figure: a leaf domain contained in S; the directory node dir(S) of domain S; a subdomain S of top-level domain T (S is contained in T); the top-level domain T; the root directory node dir(T).

05 – 12 Naming/5.2 Flat Naming

SLIDE 14

HLS: Tree Organization

  • The address of an entity is stored in a leaf node, or in an intermediate node
  • Intermediate nodes contain a pointer to a child if and only if the subtree rooted at the child stores an address of the entity
  • The root knows about all entities

Figure: storing information on an entity E with addresses in two domains D1 and D2. Legend: field with no data; location record with only one field, containing an address; field for domain dom(N) with a pointer to N; location record for E at node M.

05 – 13 Naming/5.2 Flat Naming

SLIDE 15

HLS: Lookup Operation

Basic principles:

  • Start lookup at local leaf node
  • If a node knows about the entity, follow the downward pointer; otherwise go one level up
  • Upward lookups always stop at the root

Figure: a look-up request enters at the leaf of domain D; while a node has no record for E, the request is forwarded to its parent; once a node (M) knows about E, the request is forwarded to a child.

05 – 14 Naming/5.2 Flat Naming

SLIDE 16

HLS: Insert Operation

Figure: inserting an address for entity E. (a) The insert request is forwarded upward from the leaf in domain D until a node (M) already knows about E, where forwarding stops. (b) On the way back down, each node creates a record and stores a pointer to its child; the leaf node creates a record and stores the address.
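The lookup and insert operations can be sketched on a small directory tree (a minimal sketch; the tree shape, entity name, and address format are invented for illustration):

```python
class DirNode:
    """Directory node of the HLS search tree."""
    def __init__(self, parent=None):
        self.parent = parent
        self.records = {}   # entity -> set of fields (child pointers or an address)

def insert(leaf, entity, address):
    # Store the actual address in the leaf domain's location record.
    leaf.records.setdefault(entity, set()).add(address)
    node, child = leaf.parent, leaf
    # Forward the insert upward; each node stores a pointer to the child
    # subtree that holds E. Stop once a node already knew about E.
    while node is not None:
        known = entity in node.records
        node.records.setdefault(entity, set()).add(child)
        if known:
            break
        node, child = node.parent, node

def lookup(leaf, entity):
    node = leaf
    # Go up until a node has a record for E (the root at the latest).
    while node is not None and entity not in node.records:
        node = node.parent
    if node is None:
        return None
    # Follow downward pointers until an actual address is found.
    field = next(iter(node.records[entity]))
    while isinstance(field, DirNode):
        node = field
        field = next(iter(node.records[entity]))
    return field

root = DirNode()
d1, d2 = DirNode(root), DirNode(root)     # two subdomains
leaf1, leaf2 = DirNode(d1), DirNode(d2)   # leaf domains

insert(leaf1, "E", "addr:E@leaf1")
print(lookup(leaf2, "E"))   # addr:E@leaf1  (found by going up to the root)
```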

05 – 15 Naming/5.2 Flat Naming

SLIDE 17

Name Space (1/2)

Essence: a graph in which a leaf node represents a (named) entity. A directory node is an entity that refers to other nodes.

Figure: a naming graph with root directory node n0 and leaf nodes n2, n3, and n5. Data stored in n1: the directory table (elke, n2), (max, n3), (steen, n4). The path name /home/steen/mbox leads to a leaf holding the mailbox; /keys and /home/steen/keys both lead to leaf node n5.

Note: A directory node contains a (directory) table of (edge label, node identifier) pairs.
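The directory tables above translate directly into code (a sketch; the leaf ids n2a and n4a are made up, since the figure does not name those leaves):

```python
# Directory tables of the naming graph: node id -> {edge label: node id}.
GRAPH = {
    "n0": {"home": "n1", "keys": "n5"},
    "n1": {"elke": "n2", "max": "n3", "steen": "n4"},
    "n2": {".twmrc": "n2a"},              # n2a: made-up id for the leaf
    "n4": {"mbox": "n4a", "keys": "n5"},  # n4a: made-up id for the leaf
}

def resolve(path, root="n0"):
    node = root
    for label in path.strip("/").split("/"):
        node = GRAPH[node][label]   # follow one labeled edge per path component
    return node

print(resolve("/home/steen/mbox"))                     # n4a
print(resolve("/keys"), resolve("/home/steen/keys"))   # n5 n5
```

Note how /keys and /home/steen/keys resolve to the same node n5: one entity, two path names.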

05 – 16 Naming/5.3 Structured Naming

SLIDE 18

Name Space (2/2)

Observation: We can easily store all kinds of attributes in a node, describing aspects of the entity the node represents:

  • Type of the entity
  • An identifier for that entity
  • Address of the entity’s location
  • Nicknames
  • ...

Observation: Directory nodes can also have attributes, besides just storing a directory table with (edge label, node identifier) pairs.

05 – 17 Naming/5.3 Structured Naming

SLIDE 19

Name Resolution

Problem: To resolve a name we need a directory node. How do we actually find that (initial) node?

Closure mechanism: The mechanism to select the implicit context from which to start name resolution:

  • www.cs.vu.nl: start at a DNS name server
  • /home/steen/mbox: start at the local NFS file server (possibly a recursive search)
  • 0031204447784: dial a phone number
  • 130.37.24.8: route to the VU’s Web server

Question: Why are closure mechanisms always implicit?

Observation: A closure mechanism may also determine how name resolution should proceed.

05 – 18 Naming/5.3 Structured Naming

SLIDE 20

Name Linking (1/2)

Hard link: What we have described so far as a path name: a name that is resolved by following a specific path in a naming graph from one node to another.

Soft link: Allow a node O to contain a name of another node:

  • First resolve O’s name (leading to O)
  • Read the content of O, yielding name
  • Name resolution continues with name

Observations:

  • The name resolution process determines that we read the content of a node, in particular, the name in the other node that we need to go to.
  • One way or the other, we know where and how to start name resolution given name

05 – 19 Naming/5.3 Structured Naming

SLIDE 21

Name Linking (2/2)

Figure: the same naming graph, where node n4 now has an entry keys → n6; leaf node n6 stores the name "/keys" as its data, so resolving /home/steen/keys continues with /keys and ends in node n5.

Observation: Node n5 has only one name
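Soft-link resolution can be sketched by restarting resolution with the name read from the node (a sketch; the leaf id n4a is made up, since the figure does not name that leaf):

```python
GRAPH = {
    "n0": {"home": "n1", "keys": "n5"},
    "n1": {"elke": "n2", "max": "n3", "steen": "n4"},
    "n4": {"mbox": "n4a", "keys": "n6"},   # n4a: made-up id for the leaf
}
LINKS = {"n6": "/keys"}   # node n6 stores the name "/keys" as its data

def resolve(path, root="n0"):
    node, labels = root, path.strip("/").split("/")
    while labels:
        node = GRAPH[node][labels.pop(0)]
        if node in LINKS:
            # Soft link: continue resolution with the name read from the node.
            labels = LINKS[node].strip("/").split("/") + labels
            node = root
    return node

print(resolve("/home/steen/keys"))   # n5  (resolved via the soft link /keys)
print(resolve("/keys"))              # n5
```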

05 – 20 Naming/5.3 Structured Naming

SLIDE 22

Name Space Implementation (1/2)

Basic issue: Distribute the name resolution process as well as name space management across multiple machines, by distributing nodes of the naming graph. Consider a hierarchical naming graph and distinguish three levels:

Global level: Consists of the high-level directory nodes. Main aspect is that these directory nodes have to be jointly managed by different administrations.

Administrational level: Contains mid-level directory nodes that can be grouped in such a way that each group can be assigned to a separate administration.

Managerial level: Consists of low-level directory nodes within a single administration. Main issue is effectively mapping directory nodes to local name servers.

05 – 21 Naming/5.3 Structured Naming

SLIDE 23

Name Space Implementation (2/2)

Figure: partitioning of the DNS name space into zones (top-level nodes such as org, net, com, edu, gov, mil, jp, us, nl in the global layer; organizations such as sun, yale, acm, ieee, keio, nec, vu in the administrational layer; departmental nodes and hosts such as cs, ftp, www, pc24, index.txt in the managerial layer).

Item                  Global     Administrational  Managerial
Geographical scale    Worldwide  Organization      Department
# Nodes               Few        Many              Vast numbers
Responsiveness        Seconds    Milliseconds      Immediate
Update propagation    Lazy       Immediate         Immediate
# Replicas            Many       None or few       None
Client-side caching?  Yes        Yes               Sometimes

05 – 22 Naming/5.3 Structured Naming

SLIDE 24

Iterative Name Resolution

  • resolve(dir,[name1,...,nameK]) is sent to Server0 responsible for dir
  • Server0 resolves resolve(dir,name1) → dir1, returning the identification (address) of Server1, which stores dir1
  • Client sends resolve(dir1,[name2,...,nameK]) to Server1, etc.

Figure: the client’s name resolver contacts each name server in turn (the root name server, then the servers for the nl, vu, and cs nodes; the cs and ftp nodes are managed by the same server):
  1. <nl,vu,cs,ftp>   →   2. #<nl>, <vu,cs,ftp>
  3. <vu,cs,ftp>      →   4. #<vu>, <cs,ftp>
  5. <cs,ftp>         →   6. #<cs>, <ftp>
  7. <ftp>            →   8. #<ftp>
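The iterative scheme can be sketched as follows (a sketch; the server names and the final address are invented for illustration):

```python
# Name servers modeled as tables mapping one label to the next server
# (or, at the last step, to an address).
SERVERS = {
    "root": {"nl": "nl-server"},
    "nl-server": {"vu": "vu-server"},
    "vu-server": {"cs": "cs-server"},
    "cs-server": {"ftp": "addr-of-ftp-host"},
}

def resolve_iterative(labels, server="root"):
    """The client itself contacts each name server in turn."""
    steps = []
    while labels:
        result = SERVERS[server][labels[0]]   # server resolves a single label
        steps.append((server, labels[0], result))
        server, labels = result, labels[1:]   # client continues at the next server
    return steps

steps = resolve_iterative(["nl", "vu", "cs", "ftp"])
print(steps[-1][2])   # addr-of-ftp-host
```

The client performs one round trip per label: four requests, matching the eight messages in the figure.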

05 – 23 Naming/5.3 Structured Naming

SLIDE 25

Recursive Name Resolution

  • resolve(dir,[name1,...,nameK]) is sent to Server0 responsible for dir
  • Server0 resolves resolve(dir,name1) → dir1, and sends resolve(dir1,[name2,...,nameK]) to Server1, which stores dir1
  • Server0 waits for the result from Server1, and returns it to the client

Figure: the request travels down the chain of name servers, and the result travels back up:
  1. <nl,vu,cs,ftp> (client → root)     8. #<nl,vu,cs,ftp> (root → client)
  2. <vu,cs,ftp> (root → nl)            7. #<vu,cs,ftp> (nl → root)
  3. <cs,ftp> (nl → vu)                 6. #<cs,ftp> (vu → nl)
  4. <ftp> (vu → cs)                    5. #<ftp> (cs → vu)
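The recursive scheme can be sketched as follows (a sketch; server names and the final address are invented for illustration):

```python
SERVERS = {
    "root": {"nl": "nl-server"},
    "nl-server": {"vu": "vu-server"},
    "vu-server": {"cs": "cs-server"},
    "cs-server": {"ftp": "addr-of-ftp-host"},
}

def resolve_recursive(labels, server="root"):
    """Each server resolves one label and forwards the rest itself."""
    result = SERVERS[server][labels[0]]
    if len(labels) == 1:
        return result                          # fully resolved; pass it back up
    return resolve_recursive(labels[1:], result)

print(resolve_recursive(["nl", "vu", "cs", "ftp"]))   # addr-of-ftp-host
```

The client sees a single request/reply pair; all intermediate communication happens between the servers.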

05 – 24 Naming/5.3 Structured Naming

SLIDE 26

Caching in Recursive Name Resolution

Server    Should           Looks up  Passes to    Receives                       Returns
for node  resolve                    child        and caches                     to requester
cs        <ftp>            #<ftp>    —            —                              #<ftp>
vu        <cs,ftp>         #<cs>     <ftp>        #<ftp>                         #<cs>, #<cs,ftp>
nl        <vu,cs,ftp>      #<vu>     <cs,ftp>     #<cs>, #<cs,ftp>               #<vu>, #<vu,cs>, #<vu,cs,ftp>
root      <nl,vu,cs,ftp>   #<nl>     <vu,cs,ftp>  #<vu>, #<vu,cs>, #<vu,cs,ftp>  #<nl>, #<nl,vu>, #<nl,vu,cs>, #<nl,vu,cs,ftp>

05 – 25 Naming/5.3 Structured Naming

SLIDE 27

Scalability Issues (1/2)

Size scalability: We need to ensure that servers can handle a large number of requests per time unit ⇒ high-level servers are in big trouble.

Solution: Assume (at least at the global and administrational level) that the content of nodes hardly ever changes. In that case, we can apply extensive replication by mapping nodes to multiple servers, and start name resolution at the nearest server.

Observation: An important attribute of many nodes is the address where the represented entity can be contacted. Replicating nodes makes large-scale traditional name servers unsuitable for locating mobile entities.

05 – 26 Naming/5.3 Structured Naming

SLIDE 28

Scalability Issues (2/2)

Geographical scalability: We need to ensure that the name resolution process scales across large geo- graphical distances.

Figure: with the client far from the name servers for the nl, vu, and cs nodes, recursive name resolution (R1–R3) crosses the long-distance link only once, while iterative resolution (I1–I3) crosses it on every step.

Problem: By mapping nodes to servers that may, in principle, be located anywhere, we introduce an im- plicit location dependency in our naming scheme.

05 – 27 Naming/5.3 Structured Naming

SLIDE 29

Example: Decentralized DNS

Basic idea: Take a full DNS name, hash it into a key k, and use a DHT-based system to allow for key lookups.

Main drawback: You can’t ask for all nodes in a subdomain (but very few people were doing this anyway).

Information in a node: Typically what you find in a DNS record, of which there are different kinds:

Type    Refers to  Description
SOA     Zone       Holds info on the represented zone
A       Host       IP address of the host this node represents
MX      Domain     Mail server to handle mail for this node
SRV     Domain     Server handling a specific service
NS      Zone       Name server for the represented zone
CNAME   Node       Symbolic link
PTR     Host       Canonical name of a host
HINFO   Host       Info on this host
TXT     Any kind   Any info considered useful

05 – 28 Naming/5.3 Structured Naming

SLIDE 30

DNS on Pastry

Pastry: DHT-based system that works with prefixes of keys. Consider a system in which keys come from a 4-digit number space. A node with ID 3210 keeps track of the following nodes:

  n0    a node whose identifier has prefix 0
  n1    a node whose identifier has prefix 1
  n2    a node whose identifier has prefix 2
  n30   a node whose identifier has prefix 30
  n31   a node whose identifier has prefix 31
  n33   a node whose identifier has prefix 33
  n320  a node whose identifier has prefix 320
  n322  a node whose identifier has prefix 322
  n323  a node whose identifier has prefix 323

Note: Node 3210 is responsible for handling keys with prefix 321. If it receives a request for key 3012, it will forward the request to node n30.

DNS: A node responsible for key k stores the DNS records of names with hash value k.
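Node 3210’s forwarding decision amounts to a longest-prefix match over its table (a sketch; only keys covered by the partial table shown can be routed):

```python
NODE = "3210"
TABLE = {
    "0": "n0", "1": "n1", "2": "n2",
    "30": "n30", "31": "n31", "33": "n33",
    "320": "n320", "322": "n322", "323": "n323",
}

def route(key):
    """Forward the key to the entry sharing the longest prefix, or keep it."""
    if key.startswith("321"):   # node 3210 handles prefix 321 itself
        return NODE
    matches = [p for p in TABLE if key.startswith(p)]
    return TABLE[max(matches, key=len)]

print(route("3012"))   # n30  (matches the slide's example)
```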

05 – 29 Naming/5.3 Structured Naming

SLIDE 31

Replication of Records (1/2)

Definition: A record is replicated at level i when it is replicated to all nodes with i matching prefixes.

Note: The number of hops for looking up a record replicated at level i is generally i.

Observation: Let x_i denote the fraction of most popular DNS names of which the records should be replicated at level i; then

    x_i = [ d^i (log N − C) / (1 + d + ··· + d^(log N − 1)) ]^(1/(1−α))

where N is the total number of nodes, d = b^((1−α)/α), and α ≈ 1, assuming that popularity follows a Zipf distribution: the frequency of the n-th ranked item is proportional to 1/n^α.

05 – 30 Naming/5.3 Structured Naming

SLIDE 32

Replication of Records (2/2)

What does this mean? If you want to reach an average of C = 1 hops when looking up a DNS record, then with b = 4, α = 0.9, N = 10,000 nodes, and 1,000,000 records:

  the 61 most popular records should be replicated at level 0
  the 284 next most popular records at level 1
  the 1323 next most popular records at level 2
  the 6177 next most popular records at level 3
  the 28826 next most popular records at level 4
  the 134505 next most popular records at level 5
  the rest should not be replicated
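A quick consistency check on these counts: the closed form implies x_(i+1)/x_i = d^(1/(1−α)) = b^(1/α), so consecutive level counts should grow by a constant factor of about 4.67 for b = 4 and α = 0.9 (a sketch; it checks only the ratios, since the absolute counts also depend on N, C, and the total number of records):

```python
b, alpha = 4, 0.9
ratio = b ** (1 / alpha)   # x_(i+1)/x_i = d^(1/(1-alpha)) = b^(1/alpha)

counts = [61, 284, 1323, 6177, 28826, 134505]   # records per level, from above
for lower, higher in zip(counts, counts[1:]):
    # every consecutive pair of levels grows by ~b^(1/alpha)
    assert abs(higher / lower - ratio) / ratio < 0.02

print(round(ratio, 2))   # 4.67
```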

05 – 31 Naming/5.3 Structured Naming

SLIDE 33

Attribute-Based Naming

Observation: In many cases, it is much more convenient to name and look up entities by means of their attributes ⇒ traditional directory services (aka yellow pages).

Problem: Lookup operations can be extremely expensive, as they require matching the requested attribute values against the actual attribute values ⇒ inspect all entities (in principle).

Solution: Implement the basic directory service as a database, and combine it with a traditional structured naming system.

05 – 32 Naming/5.4 Attribute-Based Naming

SLIDE 34

Example: LDAP

Figure: part of the LDAP directory information tree: C = NL → O = Vrije Universiteit → OU = Comp. Sc. → CN = Main server, with Host_Name = star and Host_Name = zephyr beneath it.

Attribute           Value (star)         Value (zephyr)
Country             NL                   NL
Locality            Amsterdam            Amsterdam
Organization        Vrije Universiteit   Vrije Universiteit
OrganizationalUnit  Comp. Sc.            Comp. Sc.
CommonName          Main server          Main server
Host_Name           star                 zephyr
Host_Address        192.31.231.42        137.37.20.10

answer = search("&(C = NL) (O = Vrije Universiteit) (OU = *) (CN = Main server)")
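The effect of this search call can be sketched over the two entries from the table (a sketch; the Python function and entry encoding are invented for illustration, not an actual LDAP API):

```python
ENTRIES = [
    {"C": "NL", "O": "Vrije Universiteit", "OU": "Comp. Sc.",
     "CN": "Main server", "Host_Name": "star", "Host_Address": "192.31.231.42"},
    {"C": "NL", "O": "Vrije Universiteit", "OU": "Comp. Sc.",
     "CN": "Main server", "Host_Name": "zephyr", "Host_Address": "137.37.20.10"},
]

def search(**conditions):
    """Return all entries matching every attribute; '*' matches any value."""
    return [e for e in ENTRIES
            if all(v == "*" or e.get(attr) == v for attr, v in conditions.items())]

answer = search(C="NL", O="Vrije Universiteit", OU="*", CN="Main server")
print([e["Host_Name"] for e in answer])   # ['star', 'zephyr']
```

Because OU is a wildcard, both hosts match: the attribute-based query returns every entity whose attributes satisfy the filter, exactly the exhaustive matching the previous slide warns can be expensive.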

05 – 33 Naming/5.4 Attribute-Based Naming