Distributed Systems Principles and Paradigms Chapter 11 (version - - PDF document

Distributed Systems Principles and Paradigms Chapter 11 (version October 15, 2007 ) Maarten van Steen Vrije Universiteit Amsterdam, Faculty of Science Dept. Mathematics and Computer Science Room R4.20. Tel: (020) 598 7784


slide-1
SLIDE 1

Distributed Systems

Principles and Paradigms

Chapter 11

(version October 15, 2007)

Maarten van Steen

Vrije Universiteit Amsterdam, Faculty of Science

Dept. Mathematics and Computer Science

Room R4.20. Tel: (020) 598 7784
E-mail: steen@cs.vu.nl, URL: www.cs.vu.nl/~steen/

01 Introduction
02 Architectures
03 Processes
04 Communication
05 Naming
06 Synchronization
07 Consistency and Replication
08 Fault Tolerance
09 Security
10 Distributed Object-Based Systems
11 Distributed File Systems
12 Distributed Web-Based Systems
13 Distributed Coordination-Based Systems


slide-2
SLIDE 2

Distributed File Systems

General goal: Try to make a file system transparently available to remote clients.

[Figure: two access models between a client and a server.
Remote access model: the file stays on the server; requests from the client to access the remote file are sent to the server.
Upload/download model: 1. File moved to client. 2. Accesses are done on client. 3. When client is done, file is returned to server (the old file is replaced by the new file).]
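The difference between the two models can be illustrated with a toy in-memory server. This is only a sketch: the class `FileServer`, its method names, and the file name `notes.txt` are hypothetical and chosen for illustration, and no real network is involved.

```python
class FileServer:
    """Toy in-memory server holding named files (hypothetical, for illustration)."""

    def __init__(self):
        self.files = {}

    # Remote access model: every operation is a request sent to the server.
    def read(self, name, offset, length):
        return self.files[name][offset:offset + length]

    def append(self, name, data):
        self.files[name] = self.files.get(name, b"") + data

    # Upload/download model: whole files move between server and client.
    def download(self, name):
        return self.files[name]

    def upload(self, name, data):
        self.files[name] = data


server = FileServer()
server.upload("notes.txt", b"ab")

# Remote access model: the file stays on the server.
server.append("notes.txt", b"c")
remote_view = server.read("notes.txt", 0, 3)

# Upload/download model: 1. move file to client, 2. work locally, 3. return it.
local_copy = bytearray(server.download("notes.txt"))
local_copy += b"d"
before_upload = server.download("notes.txt")  # server still holds the old file
server.upload("notes.txt", bytes(local_copy))
after_upload = server.download("notes.txt")
```

In the upload/download part, the server keeps serving the old file until the client uploads its copy, which is where the consistency questions of the later slides come from.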

11 – 1 Distributed File Systems/11.1 Architecture

slide-3
SLIDE 3

Example: NFS Architecture

NFS is implemented using the Virtual File System abstraction, which is now used in many different operating systems:

[Figure: NFS architecture. Client side: system call layer, virtual file system (VFS) layer, local file system interface, NFS client, RPC client stub. Server side: system call layer, virtual file system (VFS) layer, NFS server, local file system interface, RPC server stub. The two stubs communicate over the network.]

Essence: VFS provides a standard file system interface and hides the difference between accessing a local and a remote file system.

Question: Is NFS actually a file system?
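The idea of one interface in front of local and remote file systems can be sketched as follows. All class names here (`FileSystem`, `LocalFS`, `RemoteFS`, `VFS`) are hypothetical, and the "remote" side is just a direct method call standing in for the RPC stubs and the network; this is not how any particular kernel implements VFS.

```python
from abc import ABC, abstractmethod


class FileSystem(ABC):
    """Stand-in for the uniform interface that the VFS layer offers."""

    @abstractmethod
    def read(self, path): ...


class LocalFS(FileSystem):
    def __init__(self, files):
        self.files = files

    def read(self, path):
        return self.files[path]


class RemoteFS(FileSystem):
    """Plays the NFS client: forwards each call to a server-side object.
    A real client would go through RPC stubs and the network."""

    def __init__(self, server_fs):
        self.server_fs = server_fs

    def read(self, path):
        return self.server_fs.read(path)  # stands in for an RPC


class VFS:
    def __init__(self):
        self.mounts = {}

    def mount(self, prefix, fs):
        self.mounts[prefix] = fs

    def read(self, path):
        # Pick the longest mount prefix that matches the path.
        prefix = max((p for p in self.mounts if path.startswith(p)), key=len)
        return self.mounts[prefix].read(path[len(prefix):])


vfs = VFS()
vfs.mount("/", LocalFS({"etc/motd": b"hello"}))
vfs.mount("/remote/", RemoteFS(LocalFS({"data.txt": b"shared"})))

local_data = vfs.read("/etc/motd")
remote_data = vfs.read("/remote/data.txt")
```

The caller uses the same `vfs.read` call in both cases and cannot tell which path was served remotely.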

11 – 2 Distributed File Systems/11.1 Architecture

slide-4
SLIDE 4

NFS File Operations

Oper.     v3   v4   Description
Create    Yes  No   Create a regular file
Create    No   Yes  Create a nonregular file
Link      Yes  Yes  Create a hard link to a file
Symlink   Yes  No   Create a symbolic link to a file
Mkdir     Yes  No   Create a subdirectory
Mknod     Yes  No   Create a special file
Rename    Yes  Yes  Change the name of a file
Remove    Yes  Yes  Remove a file from a file system
Rmdir     Yes  No   Remove an empty subdirectory
Open      No   Yes  Open a file
Close     No   Yes  Close a file
Lookup    Yes  Yes  Look up a file by means of a name
Readdir   Yes  Yes  Read the entries in a directory
Readlink  Yes  Yes  Read the path name in a symbolic link
Getattr   Yes  Yes  Get the attribute values for a file
Setattr   Yes  Yes  Set one or more file-attribute values
Read      Yes  Yes  Read the data contained in a file
Write     Yes  Yes  Write data to a file

Question: Anything unusual between v3 and v4?

11 – 3 Distributed File Systems/11.1 Architecture

slide-5
SLIDE 5

Cluster-Based File Systems

Observation: When dealing with very large data collections, following a simple client-server approach is not going to work.

Solution 1: For speeding up file accesses, apply striping techniques by which files can be fetched in parallel:

[Figure: blocks of files a through e placed on several servers, shown for two layouts: whole-file distribution (all blocks of a file on one server) and a file-striped system (the blocks of a file spread over the servers).]

11 – 4 Distributed File Systems/11.1 Architecture
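The two layouts for Solution 1 can be compared with a small placement sketch. The round-robin rule used for striping is an assumption made for illustration; the slides do not say how blocks are assigned to servers, and the function names are hypothetical.

```python
def whole_file_placement(files, num_servers):
    """Place every block of a file on one server (file i -> server i mod n)."""
    servers = [[] for _ in range(num_servers)]
    for i, (name, num_blocks) in enumerate(files.items()):
        servers[i % num_servers].extend((name, b) for b in range(num_blocks))
    return servers


def striped_placement(files, num_servers):
    """Spread the blocks of each file round-robin over the servers (assumed rule)."""
    servers = [[] for _ in range(num_servers)]
    for i, (name, num_blocks) in enumerate(files.items()):
        for b in range(num_blocks):
            servers[(i + b) % num_servers].append((name, b))
    return servers


def servers_holding(placement, name):
    """Return the set of server indices that store at least one block of a file."""
    return {s for s, blocks in enumerate(placement)
            if any(n == name for n, _ in blocks)}


files = {"a": 3, "b": 3, "c": 3}  # file name -> number of blocks
whole = whole_file_placement(files, 3)
striped = striped_placement(files, 3)
```

With striping, a read of file `a` can be served by three servers at once, whereas whole-file distribution sends all of that load to a single server.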

slide-6
SLIDE 6

Example: Google File System

Solution 2: Divide files into large 64 MB chunks, and distribute/replicate chunks across many servers.

[Figure: GFS organization. The GFS client sends (file name, chunk index) to the master and receives a contact address; it then sends (chunk ID, range) to a chunk server and receives chunk data. The master sends instructions to the chunk servers, each of which runs on top of a Linux file system, and receives chunk-server state in return.]

A couple of important details:

  • The master maintains only a (file name, chunk server) table in main memory ⇒ minimal I/O
  • Files are replicated using a primary-backup scheme; the master is kept out of the loop
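A client-side view of the chunk lookup might look like the sketch below. The 64 MB chunk size comes from the slide; the `Master` class, its `locate` method, the file name, and the server names are hypothetical, and the table is keyed by (file name, chunk index) as an assumption about how the lookup is organized.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB chunks, as stated on the slide


def chunk_requests(offset, length):
    """Split a byte range into (chunk index, offset within chunk, length) parts."""
    parts = []
    end = offset + length
    while offset < end:
        index = offset // CHUNK_SIZE
        within = offset % CHUNK_SIZE
        n = min(end - offset, CHUNK_SIZE - within)
        parts.append((index, within, n))
        offset += n
    return parts


class Master:
    """Hypothetical master: an in-memory table mapping
    (file name, chunk index) to the chunk servers holding that chunk."""

    def __init__(self, table):
        self.table = table

    def locate(self, name, index):
        return self.table[(name, index)]


master = Master({("big.log", 0): ["cs1", "cs2"], ("big.log", 1): ["cs2", "cs3"]})

# A 30-byte read that straddles the boundary between chunk 0 and chunk 1.
parts = chunk_requests(CHUNK_SIZE - 10, 30)
locations = [master.locate("big.log", index) for index, _, _ in parts]
```

The master only answers the small lookup; the data itself would then be fetched directly from the chunk servers it names.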

11 – 5 Distributed File Systems/11.1 Architecture

slide-7
SLIDE 7

RPCs in File Systems

Observation: Many (traditional) distributed file systems deploy remote procedure calls to access files. When wide-area networks need to be crossed, alternatives need to be exploited:

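[Figure: client/server timelines. (a) The client sends LOOKUP (server looks up the name) and then READ (server reads the file data) as two separate requests, waiting for each reply. (b) The client sends a single request combining LOOKUP, OPEN, and READ; the server looks up the name, opens the file, and reads the file data before sending one reply.]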
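The saving from grouping operations into one request can be made concrete by counting request messages. This is a toy sketch: `CountingServer`, its methods, and the 100 ms round-trip time are assumptions for illustration, not an NFS API.

```python
class CountingServer:
    """Toy file server that counts how many request messages it receives."""

    def __init__(self, files):
        self.files = files
        self.requests = 0

    def _lookup(self, name):
        if name not in self.files:
            raise FileNotFoundError(name)
        return name  # the name doubles as a file handle in this toy

    def _read(self, handle):
        return self.files[handle]

    # (a) One request per operation.
    def lookup(self, name):
        self.requests += 1
        return self._lookup(name)

    def read(self, handle):
        self.requests += 1
        return self._read(handle)

    # (b) Several operations grouped into one request.
    def compound_lookup_open_read(self, name):
        self.requests += 1
        handle = self._lookup(name)   # LOOKUP
        # OPEN would record state on the server; omitted in this sketch.
        return self._read(handle)     # READ


def total_latency_ms(round_trips, rtt_ms):
    """Network time spent waiting, ignoring server processing time."""
    return round_trips * rtt_ms


separate = CountingServer({"f": b"data"})
data_a = separate.read(separate.lookup("f"))

compound = CountingServer({"f": b"data"})
data_b = compound.compound_lookup_open_read("f")
```

With an assumed wide-area round-trip time of 100 ms, `total_latency_ms(separate.requests, 100)` is twice `total_latency_ms(compound.requests, 100)`, which is why the number of round trips matters more than the number of operations.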

11 – 6 Distributed File Systems/11.3 Communication

slide-8
SLIDE 8

Example: RPCs in Coda

Observation: When dealing with replicated files, sequentially sending information is not the way to go:

[Figure: (a) The server sends an invalidate message to one client at a time and waits for each reply before contacting the next. (b) The server sends the invalidate messages to all clients in parallel and then collects the replies.]

Note: In Coda, clients can cache files, but will be informed when an update has been performed.
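The contrast between sequential and parallel invalidation can be sketched with threads. This is not Coda's actual RPC mechanism: `invalidate` is a hypothetical stand-in that simply sleeps for an assumed round-trip time, and the client names are invented.

```python
import time
from concurrent.futures import ThreadPoolExecutor

DELAY = 0.05  # pretend round-trip time per client, in seconds (assumed)


def invalidate(client):
    """Stand-in for one invalidation RPC: wait, then return the client's reply."""
    time.sleep(DELAY)
    return (client, "reply")


def invalidate_sequentially(clients):
    return [invalidate(c) for c in clients]


def invalidate_in_parallel(clients):
    # One worker per client, so all invalidations are outstanding at once.
    with ThreadPoolExecutor(max_workers=len(clients)) as pool:
        return list(pool.map(invalidate, clients))


clients = ["c1", "c2", "c3", "c4"]

start = time.perf_counter()
seq_replies = invalidate_sequentially(clients)
seq_elapsed = time.perf_counter() - start

start = time.perf_counter()
par_replies = invalidate_in_parallel(clients)
par_elapsed = time.perf_counter() - start
```

Both variants collect the same replies; the sequential one waits roughly one `DELAY` per client, while the parallel one should take closer to a single `DELAY`, subject to thread scheduling.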

11 – 7 Distributed File Systems/11.3 Communication

slide-9
SLIDE 9

File Sharing Semantics (1/2)

Problem: When dealing with distributed file systems, we need to take into account the ordering of concurrent read/write operations, and expected semantics (= consistency).

[Figure: (a) Single machine: the original file contains "ab". 1. Process A writes "c". 2. A read by process B gets "abc". (b) Client machine #1, file server, and client machine #2: 1. Process A on client machine #1 reads "ab". 2. It writes "c" to its own copy. 3. A read by process B on client machine #2 gets "ab".]

11 – 8 Distributed File Systems/11.5 Synchronization

slide-10
SLIDE 10

File Sharing Semantics (2/2)

UNIX semantics: a read operation returns the effect of the last write operation ⇒ can only be implemented for remote access models in which there is only a single copy of the file

Transaction semantics: the file system supports transactions on a single file ⇒ issue is how to allow concurrent access to a physically distributed file

Session semantics: the effects of read and write operations are seen only by the client that has opened (a local copy of) the file ⇒ what happens when a file is closed (only one client may actually win)
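The "only one client may actually win" remark on session semantics can be seen in a small sketch. The classes `SessionServer` and `Session` are hypothetical, and writing the whole file back on close is an assumption about the close policy made for illustration.

```python
class SessionServer:
    def __init__(self, files):
        self.files = dict(files)

    def open(self, name):
        # Opening hands the client its own copy of the current contents.
        return Session(self, name, self.files[name])


class Session:
    """A client's private copy of a file between open and close."""

    def __init__(self, server, name, data):
        self.server, self.name, self.data = server, name, data

    def read(self):
        return self.data

    def write(self, data):
        self.data += data  # visible only inside this session

    def close(self):
        # Assumed policy: the whole file is written back on close.
        self.server.files[self.name] = self.data


server = SessionServer({"f": "ab"})
s1 = server.open("f")
s2 = server.open("f")

s1.write("c")
seen_by_s2 = s2.read()      # s2 does not see the write made in s1
s2.write("d")

s1.close()
after_first_close = server.files["f"]
s2.close()
after_second_close = server.files["f"]  # the last close overwrites the first
```

Here the update made in the first session is lost once the second session closes, which is exactly the open question the slide raises.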

11 – 9 Distributed File Systems/11.5 Synchronization

slide-11
SLIDE 11

Example: File Sharing in Coda

Essence: Coda assumes transactional semantics, but without the full-fledged capabilities of real transactions.

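[Figure: timeline for a server and two clients. Client A opens file f with Open(RD) and works in session S_A; client B opens file f with Open(WR) and works in session S_B. File f is transferred to each client on open, each session ends with Close, and the server sends an Invalidate message.]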

Note: Transactional issues reappear in the form of “this ordering could have taken place.”

11 – 10 Distributed File Systems/11.5 Synchronization

slide-12
SLIDE 12

Consistency and Replication

Observation: In modern distributed file systems, client-side caching is the preferred technique for attaining performance; server-side replication is done for fault tolerance.

Observation: Clients are allowed to keep (large parts of) a file, and will be notified when control is withdrawn ⇒ servers are now generally stateful

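[Figure: file delegation between a client and a server; the server holds the old file, the client a local copy, and the server ends up with the updated file.]

  • 1. Client asks for file
  • 2. Server delegates file
  • 3. Server recalls delegation
  • 4. Client returns file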
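The four delegation steps can be sketched as below. The classes `DelegatingServer` and `Client` and their methods are hypothetical; the point is only that the server has to remember who holds the delegation, which is what makes it stateful.

```python
class Client:
    def __init__(self):
        self.local_copy = None

    def return_file(self):
        # 4. Client returns the (possibly updated) file and drops its copy.
        data, self.local_copy = self.local_copy, None
        return data


class DelegatingServer:
    def __init__(self, data):
        self.data = data
        self.holder = None  # state: which client currently holds the delegation

    def request_file(self, client):
        # 1. Client asks for file.
        if self.holder is not None:
            self.recall()
        self.holder = client            # 2. Server delegates file.
        client.local_copy = self.data

    def recall(self):
        # 3. Server recalls delegation and stores what the client returns.
        self.data = self.holder.return_file()
        self.holder = None


server = DelegatingServer("old file")
a, b = Client(), Client()

server.request_file(a)
a.local_copy = "updated file"   # a modifies its local copy
server.request_file(b)          # forces a recall from a before delegating to b
```

After the second request, the server holds the updated contents, client `a` no longer has a copy, and `b` is the new holder.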

11 – 11 Distributed File Systems/11.6 Consistency and Replication

slide-13
SLIDE 13

Example: Client-side Caching in Coda

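[Figure: timeline for a server, client A, and client B. Client A opens file f twice with Open(RD) (two sessions S_A) and client B opens it twice with Open(WR) (two sessions S_B); every session ends with Close. File f is transferred three times, the server sends client A an Invalidate (callback break), and one open is answered with OK (no file transfer).]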

Note: By making use of transactional semantics, it becomes possible to further improve performance.

11 – 12 Distributed File Systems/11.6 Consistency and Replication

slide-14
SLIDE 14

Fault Tolerance

Observation: Fault tolerance (FT) is handled by simply replicating file servers, generally using a standard primary-backup protocol:

[Figure: primary-backup protocol over a data store, with a primary server for item x, backup servers, and two clients. One client's write follows W1 to W5; the other client's read follows R1 and R2.]

  • W1. Write request
  • W2. Forward request to primary
  • W3. Tell backups to update
  • W4. Acknowledge update
  • W5. Acknowledge write completed

  • R1. Read request
  • R2. Response to read
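The write steps W1 to W5 and read steps R1 and R2 can be traced in a minimal sketch. The `Server` class is hypothetical, all calls are local method calls rather than messages, and failures (the reason for having backups at all) are not modeled.

```python
class Server:
    def __init__(self, name):
        self.name = name
        self.store = {}
        self.primary = self   # a backup overrides this with the real primary
        self.backups = []

    def handle_write(self, key, value):
        # W1. Write request arrives at whichever server the client contacts.
        if self.primary is not self:
            # W2. Forward request to primary.
            return self.primary.handle_write(key, value)
        self.store[key] = value
        # W3. Tell backups to update; W4. each backup acknowledges.
        acks = [b.apply_update(key, value) for b in self.backups]
        # W5. Acknowledge write completed once every backup has answered.
        return all(acks)

    def apply_update(self, key, value):
        self.store[key] = value
        return True

    def handle_read(self, key):
        # R1. Read request; R2. response from the local copy.
        return self.store.get(key)


primary = Server("primary")
b1, b2 = Server("backup-1"), Server("backup-2")
for backup in (b1, b2):
    backup.primary = primary
primary.backups = [b1, b2]

write_ok = b1.handle_write("x", 1)   # client talks to a backup
value_at_b2 = b2.handle_read("x")    # another client reads elsewhere
```

Because the write is acknowledged only after all backups have applied it, a later read at any server returns the new value in this sketch.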

11 – 13 Distributed File Systems/11.7 Fault Tolerance

slide-15
SLIDE 15

High Availability in P2P Systems

Problem: There are many fully decentralized file-sharing systems, but because churn is high (i.e., nodes come and go all the time), we may face an availability problem.

Solution: Replicate files all over the place (replication factor: r_rep).

Alternative: Apply erasure coding:

  • Partition a file F into m fragments, and recode into a collection F* of n > m fragments
  • Property: any m fragments from F* are sufficient to reconstruct F.
  • Replication factor: r_ec = n/m
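The "any m fragments suffice" property can be demonstrated with the simplest possible code: m = 2 data fragments plus one XOR parity fragment, so n = 3 and r_ec = 1.5. XOR parity is chosen here only because it is short; it is an illustrative stand-in, and the slides do not say which code a real system uses. The function names are hypothetical.

```python
from itertools import combinations


def xor(p, q):
    return bytes(x ^ y for x, y in zip(p, q))


def encode(data):
    """Split data into m = 2 halves and add one XOR parity fragment (n = 3)."""
    if len(data) % 2:
        raise ValueError("this sketch needs an even number of bytes")
    half = len(data) // 2
    first, second = data[:half], data[half:]
    return {0: first, 1: second, 2: xor(first, second)}


def decode(fragments):
    """Rebuild the data from any two of the three fragments."""
    if len(fragments) < 2:
        raise ValueError("need at least m = 2 fragments")
    if 0 in fragments and 1 in fragments:
        return fragments[0] + fragments[1]
    if 0 in fragments:
        # second half = first half XOR parity
        return fragments[0] + xor(fragments[0], fragments[2])
    # first half = second half XOR parity
    return xor(fragments[1], fragments[2]) + fragments[1]


data = b"distributed!"
fragments = encode(data)

# Try every way of losing one fragment.
recovered = [decode({i: fragments[i] for i in pair})
             for pair in combinations(fragments, 2)]
```

Each fragment is half the size of the file, so the three fragments together take 1.5 times the original space, compared with 2 times for keeping two full replicas.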

11 – 14 Distributed File Systems/11.7 Fault Tolerance

slide-16
SLIDE 16

Replication vs. Erasure Coding

With an average node availability a, and required file unavailability ε, we have for erasure coding:

$$1 - \epsilon = \sum_{i=m}^{r_{ec} \cdot m} \binom{r_{ec} \cdot m}{i} a^{i} (1 - a)^{r_{ec} \cdot m - i}$$

and for file replication:

$$1 - \epsilon = 1 - (1 - a)^{r_{rep}}$$

[Figure: the ratio r_rep/r_ec (vertical axis, 1.4 to 2.2) as a function of node availability (horizontal axis, 0.2 to 1).]

11 – 15 Distributed File Systems/11.7 Fault Tolerance
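The two availability expressions for erasure coding and file replication can be evaluated numerically. The sketch below follows the formulas term by term, writing n for r_ec · m; the function names and the search for the smallest sufficient r_rep or n are additions for illustration, and the searches assume 0 < a ≤ 1 so that they terminate.

```python
from math import comb


def availability_replication(a, r_rep):
    """1 - (1 - a)^r_rep: at least one of r_rep replicas is available."""
    return 1 - (1 - a) ** r_rep


def availability_erasure(a, m, n):
    """Probability that at least m of the n = r_ec * m fragments are available."""
    return sum(comb(n, i) * a**i * (1 - a) ** (n - i) for i in range(m, n + 1))


def min_replicas(a, epsilon):
    """Smallest r_rep with file unavailability at most epsilon."""
    r = 1
    while availability_replication(a, r) < 1 - epsilon:
        r += 1
    return r


def min_fragments(a, m, epsilon):
    """Smallest n >= m with file unavailability at most epsilon."""
    n = m
    while availability_erasure(a, m, n) < 1 - epsilon:
        n += 1
    return n


# Hypothetical numbers: half of the nodes are up, at most 1% unavailability.
a, epsilon, m = 0.5, 0.01, 8
r_rep = min_replicas(a, epsilon)
r_ec = min_fragments(a, m, epsilon) / m
```

For m = 1 the erasure-coding sum reduces to the replication formula, which is a quick sanity check on both functions; comparing `r_rep` with `r_ec` for larger m gives the ratio that the figure plots.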