
Google File System

CSE 454

From paper by Ghemawat, Gobioff & Leung

The Need

  • Component failures normal

– Due to clustered computing

  • Files are huge

– By traditional standards (many TB)

  • Most mutations are appends

– Not random access overwrite

  • Co-Designing apps & file system
  • Typical: 1000 nodes & 300 TB

Desiderata

  • Must monitor & recover from component failures
  • Modest number of large files
  • Workload

– Large streaming reads + small random reads
– Many large sequential writes

  • Random access overwrites don’t need to be efficient
  • Need semantics for concurrent appends
  • High sustained bandwidth

– More important than low latency

Interface

  • Familiar

– Create, delete, open, close, read, write

  • Novel

– Snapshot: low cost
– Record append: atomicity with multiple concurrent writes
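
The slides list the operations but not a concrete client API; a minimal Python sketch of what such an interface might look like (the signatures and FileHandle type are hypothetical, not from the paper) is:

```python
from typing import Protocol


class FileHandle:
    """Opaque handle returned by open(); details are implementation-specific."""


class GFSClient(Protocol):
    # Familiar operations
    def create(self, path: str) -> None: ...
    def delete(self, path: str) -> None: ...
    def open(self, path: str) -> FileHandle: ...
    def close(self, handle: FileHandle) -> None: ...
    def read(self, handle: FileHandle, offset: int, length: int) -> bytes: ...
    def write(self, handle: FileHandle, offset: int, data: bytes) -> None: ...

    # Novel operations
    def snapshot(self, src_path: str, dst_path: str) -> None:
        """Low-cost copy of a file or directory tree."""

    def record_append(self, handle: FileHandle, data: bytes) -> int:
        """Append data atomically at an offset GFS chooses; returns that offset."""
```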

Architecture

[Diagram: many clients, one master, and many chunkservers; clients exchange metadata only with the master and data only with the chunkservers]

Architecture

  • Store all files

– In fixed-size chunks

  • 64 MB
  • 64 bit unique handle
  • Triple redundancy

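As a rough illustration (the type and field names here are invented, not from the paper), the fixed chunk geometry could be captured as:

```python
from dataclasses import dataclass, field
from typing import List

CHUNK_SIZE = 64 * 1024 * 1024   # every chunk is a fixed 64 MB
REPLICAS = 3                    # triple redundancy: each chunk lives on 3 chunkservers


@dataclass
class Chunk:
    handle: int                             # globally unique 64-bit id, assigned by the master
    version: int = 1                        # bumped on each new lease; used to spot stale replicas
    locations: List[str] = field(default_factory=list)  # chunkservers currently holding a replica

    def __post_init__(self) -> None:
        assert 0 <= self.handle < 2 ** 64, "chunk handles are 64-bit"
```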


Architecture

Master

  • Stores all metadata

– Namespace
– Access-control information
– Chunk locations
– ‘Lease’ management

  • Heartbeats
  • Having one master gives global knowledge

– Allows better placement / replication
– Simplifies design

Architecture


  • GFS code implements API
  • Cache only metadata

– Using fixed chunk size, translate filename & byte offset to chunk index
– Send request to master
– Master replies with chunk handle & location of chunkserver replicas (including which is ‘primary’)
– Cache info using filename & chunk index as key
– Request data from nearest chunkserver: “chunk handle & index into chunk”


No need to talk to the master again about this 64 MB chunk until the cached info expires or the file is reopened. Often the initial request asks about a sequence of chunks.
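
Putting these steps together, a simplified sketch of the client-side read path (single-chunk reads only; the master and chunkserver RPC helpers are hypothetical):

```python
CHUNK_SIZE = 64 * 1024 * 1024


class GFSReadClient:
    def __init__(self, master, chunkservers):
        self.master = master              # metadata requests only
        self.chunkservers = chunkservers  # address -> data connection
        self.cache = {}                   # (filename, chunk_index) -> (handle, replica addresses)

    def read(self, filename, offset, length):
        # 1. The fixed chunk size lets the client compute the chunk index locally.
        chunk_index = offset // CHUNK_SIZE
        key = (filename, chunk_index)

        # 2. Ask the master only on a cache miss (or after the cached entry expires).
        if key not in self.cache:
            self.cache[key] = self.master.lookup(filename, chunk_index)
        handle, replicas = self.cache[key]

        # 3. Ask the nearest replica for the bytes, identified by chunk handle
        #    plus the offset *within* the chunk; the master is not involved.
        nearest = self.chunkservers[replicas[0]]   # "nearest" chosen naively here
        return nearest.read_chunk(handle, offset % CHUNK_SIZE, length)
```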

Metadata

  • Master stores three types

– File & chunk namespaces
– Mapping from files to chunks
– Location of chunk replicas

  • Stored in memory
  • Kept persistent thru logging
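
A toy sketch of "in memory, kept persistent through logging" (the log format and method names are invented): namespace and file-to-chunk changes are logged before they are applied in memory, while chunk locations are not logged at all, since the master re-learns them from chunkserver heartbeats.

```python
import json


class MasterMetadata:
    def __init__(self, log_path):
        self.file_chunks = {}   # filename -> ordered list of chunk handles (logged)
        self.locations = {}     # chunk handle -> replica addresses (not logged; rebuilt
                                # from chunkserver heartbeats after a master restart)
        self.log = open(log_path, "a")

    def add_chunk(self, filename, handle):
        # Write-ahead: persist the mutation to the operation log, then apply it in memory.
        self.log.write(json.dumps({"op": "add_chunk", "file": filename, "handle": handle}) + "\n")
        self.log.flush()
        self.file_chunks.setdefault(filename, []).append(handle)
```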

Consistency Model

Consistent = all clients see same data

Consistency Model

Defined = consistent + clients see full effect of mutation

Key: all replicas must process chunk-mutation requests in same order

Consistency Model

Different clients may see different data


Implications

  • Apps must rely on appends, not overwrites
  • Must write records that

– Self-validate
– Self-identify

  • Typical uses

– Single writer writes file from beginning to end, then renames file (or checkpoints along way)
– Many writers concurrently append

  • At-least-once semantics ok
  • Readers deal with padding & duplicates
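
For example, a record layout that is self-identifying and self-validating might look like the following sketch (the framing format is an assumption, not the paper's); the reader skips padding bytes and discards duplicate record ids:

```python
import struct
import zlib

MAGIC = 0xC0DE1234          # marks a real record start; chunk padding will not match


def encode_record(record_id: int, payload: bytes) -> bytes:
    """Self-identifying (record_id) and self-validating (CRC) record."""
    header = struct.pack("<IQI", MAGIC, record_id, len(payload))
    crc = struct.pack("<I", zlib.crc32(header + payload))
    return header + payload + crc


def scan_records(data: bytes):
    """Yield (record_id, payload), skipping padding and duplicate records."""
    seen, pos = set(), 0
    while pos + 16 <= len(data):
        magic, record_id, length = struct.unpack_from("<IQI", data, pos)
        end = pos + 16 + length + 4
        if magic != MAGIC or end > len(data):
            pos += 1                                  # padding or garbage: keep scanning
            continue
        body = data[pos:pos + 16 + length]
        (crc,) = struct.unpack_from("<I", data, pos + 16 + length)
        if crc != zlib.crc32(body):
            pos += 1                                  # corrupted or partial record
            continue
        if record_id not in seen:                     # duplicates happen with at-least-once appends
            seen.add(record_id)
            yield record_id, body[16:]
        pos = end
```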

Leases & Mutation Order

  • Objective

– Ensure data consistent & defined
– Minimize load on master

  • Master grants ‘lease’ to one replica

– Called ‘primary’ chunkserver

  • Primary serializes all mutation requests

– Communicates order to replicas
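
A rough sketch of that serialization step (the class and call shapes are invented for illustration): the lease holder stamps each mutation with a serial number, and every replica applies mutations in that same order.

```python
import itertools


class PrimaryReplica:
    """Chunkserver currently holding the lease for one chunk."""

    def __init__(self, secondaries, lease_expiry):
        self.secondaries = secondaries       # the other replicas of this chunk
        self.lease_expiry = lease_expiry     # granted (and renewed) by the master
        self.serial = itertools.count(1)     # one counter defines the total mutation order

    def mutate(self, now, mutation):
        if now >= self.lease_expiry:
            raise RuntimeError("lease expired; client must re-ask the master for the primary")
        seq = next(self.serial)
        self.apply(seq, mutation)
        for replica in self.secondaries:
            replica.apply(seq, mutation)     # same order everywhere, so replicas stay consistent
        return seq

    def apply(self, seq, mutation):
        ...                                  # apply the mutation to the local chunk data
```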

Write Control & Dataflow

Atomic Appends

  • As in last slide, but…
  • Primary also checks to see if append spills over into new chunk

– If so, pads old chunk to full extent
– Tells secondary chunk-servers to do the same
– Tells client to try append again on next chunk

  • Usually works because

– max(append-size) < ¼ chunk-size [API rule]
– (meanwhile other clients may be appending)
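
A sketch of the spill-over check at the primary (the pad_to / write_at helpers and return convention are hypothetical):

```python
CHUNK_SIZE = 64 * 1024 * 1024
MAX_APPEND = CHUNK_SIZE // 4       # API rule: a single record append is at most 1/4 chunk


def record_append(primary, secondaries, data):
    assert len(data) <= MAX_APPEND
    if primary.used + len(data) > CHUNK_SIZE:
        # Record would spill over the chunk boundary: pad this chunk on every
        # replica and make the client retry on the next chunk. The 1/4 rule
        # keeps the wasted padding bounded.
        primary.pad_to(CHUNK_SIZE)
        for s in secondaries:
            s.pad_to(CHUNK_SIZE)
        return None                            # signal: retry on next chunk
    offset = primary.used
    primary.write_at(offset, data)
    for s in secondaries:
        s.write_at(offset, data)               # every replica uses the same offset
    primary.used = offset + len(data)
    return offset                              # the offset GFS chose for this record
```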

Other Issues

  • Fast snapshot
  • Master operation

– Namespace management & locking
– Replica placement & rebalancing
– Garbage collection (deleted / stale files)
– Detecting stale replicas
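
Of these, stale-replica detection hinges on the chunk version numbers the master keeps: chunkservers report which chunk versions they hold (e.g. in heartbeats), and anything older than the master's version is flagged as stale and garbage-collected. A minimal sketch:

```python
def find_stale_replicas(master_version: int, reported: dict) -> list:
    """reported maps chunkserver address -> version of this chunk it holds.

    A replica that missed mutations (e.g. its server was down during a lease)
    still carries an old version number, so the master can flag it for
    garbage collection instead of serving it to clients.
    """
    return [addr for addr, version in reported.items() if version < master_version]


# Example: the replica on cs3 missed the last mutation round.
stale = find_stale_replicas(4, {"cs1": 4, "cs2": 4, "cs3": 3})   # -> ["cs3"]
```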

Master Replication

  • Master log & checkpoints replicated
  • Outside monitor watches master liveness

– Starts new master process as needed

  • Shadow masters

– Provide read-access when primary is down
– Lag state of true master


Read Performance

Write Performance

Record-Append Performance