

  1. Google File System
CSE 454
From paper by Ghemawat, Gobioff & Leung

The Need
• Component failures normal
  – Due to clustered computing
• Files are huge
  – By traditional standards (many TB)
• Most mutations are appends
  – Not random access overwrite
• Co-Designing apps & file system
• Typical: 1000 nodes & 300 TB

Desiderata
• Must monitor & recover from component failures
• Modest number of large files
• Workload
  – Large streaming reads + small random reads
  – Many large sequential writes
• Random access overwrites don’t need to be efficient
• Need semantics for concurrent appends
• High sustained bandwidth
  – More important than low latency

Interface
• Familiar
  – Create, delete, open, close, read, write
• Novel
  – Snapshot
    • Low cost
  – Record append
    • Atomicity with multiple concurrent writes

Architecture
[Diagram: one Master (metadata only), many Clients, many Chunk Servers (data only)]

Architecture
• Store all files
  – In fixed-size chunks
    • 64 MB
    • 64 bit unique handle
• Triple redundancy

  2. Architecture
Master
• Stores all metadata
  – Namespace
  – Access-control information
  – Chunk locations
  – ‘Lease’ management
• Heartbeats
• Having one master → global knowledge
  – Allows better placement / replication
  – Simplifies design

Architecture
Client
• GFS code implements API
• Cache only metadata

Architecture (read path, diagram annotations)
• Client: using fixed chunk size, translate filename & byte offset to chunk index. Send request to master.
• Master: replies with chunk handle & location of chunkserver replicas (including which is ‘primary’).
• Client: cache info using filename & chunk index as key.
• Client: request data from nearest chunkserver (“chunkhandle & index into chunk”).
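The read path above can be sketched in a few lines of Python. This is only an illustration, not the GFS client library: `ToyMaster`, `ToyClient`, `ChunkInfo`, and the server names are hypothetical, and the master is just an in-memory dictionary. The sketch shows the byte offset being translated to a chunk index and the master's reply being cached under the (filename, chunk index) key.

```python
from dataclasses import dataclass

CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB fixed chunk size, as on the slides


@dataclass(frozen=True)
class ChunkInfo:
    handle: int      # 64-bit unique chunk handle
    replicas: tuple  # chunkserver names holding a copy (hypothetical names)
    primary: str     # replica currently acting as primary


class ToyMaster:
    """Hypothetical stand-in for the master: metadata only, no file data."""

    def __init__(self, table):
        self.table = table  # {(filename, chunk_index): ChunkInfo}
        self.lookups = 0    # counts how often clients contact the master

    def lookup(self, filename, chunk_index):
        self.lookups += 1
        return self.table[(filename, chunk_index)]


class ToyClient:
    """Hypothetical client: caches metadata only, never file data."""

    def __init__(self, master):
        self.master = master
        self.cache = {}

    def locate(self, filename, byte_offset):
        # Translate (filename, byte offset) into (chunk index, offset in chunk).
        chunk_index = byte_offset // CHUNK_SIZE
        offset_in_chunk = byte_offset % CHUNK_SIZE
        key = (filename, chunk_index)
        # Ask the master only on a cache miss.
        if key not in self.cache:
            self.cache[key] = self.master.lookup(filename, chunk_index)
        return self.cache[key], offset_in_chunk
```

With this layout, repeated reads inside the same 64 MB chunk never reach the master after the first lookup; the actual data request would then go to one of `replicas` using the chunk handle and `offset_in_chunk`.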

  3. Architecture (read path, continued)
• No need to talk more about this 64 MB chunk
  – Until cached info expires or file reopened
• Often initial request asks about sequence of chunks

Metadata
• Master stores three types
  – File & chunk namespaces
  – Mapping from files → chunks
  – Location of chunk replicas
• Stored in memory
• Kept persistent through logging

Consistency Model
• Consistent = all clients see same data
• Defined = consistent + clients see full effect of mutation
• Inconsistent: different clients may see different data

Consistency Model
• Key: all replicas must process chunk-mutation requests in same order
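Why the order of mutations matters can be seen with a toy example. The sketch below is an assumption-laden illustration (the helper `apply_mutations` and the 8-byte "chunk" are invented for the example): three replicas that apply the same overlapping writes in the same order end up byte-identical, while a replica that applies them in a different order diverges.

```python
def apply_mutations(chunk_size, mutations):
    """Apply (offset, data) writes in the given order to a zero-filled chunk."""
    chunk = bytearray(chunk_size)
    for offset, data in mutations:
        chunk[offset:offset + len(data)] = data
    return bytes(chunk)


# Two overlapping writes to the same region of a tiny 8-byte "chunk".
writes = [(0, b"AAAA"), (2, b"BB")]

# Three replicas applying the writes in the same order stay identical.
same_order = [apply_mutations(8, writes) for _ in range(3)]

# A replica that applies them in the opposite order ends up different.
reordered = apply_mutations(8, list(reversed(writes)))
```

Here every entry of `same_order` is `b"AABB\x00\x00\x00\x00"`, whereas `reordered` is `b"AAAA\x00\x00\x00\x00"`, so clients reading from different replicas would see different data.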

  4. Implications
• Apps must rely on appends, not overwrites
• Must write records that
  – Self-validate
  – Self-identify
• Typical uses
  – Single writer writes file from beginning to end, then renames file (or checkpoints along the way)
  – Many writers concurrently append
    • At-least-once semantics OK
    • Readers deal with padding & duplicates

Leases & Mutation Order
• Objective
  – Ensure data consistent & defined
  – Minimize load on master
• Master grants ‘lease’ to one replica
  – Called ‘primary’ chunkserver
• Primary serializes all mutation requests
  – Communicates order to replicas

Write Control & Dataflow
[Diagram not recovered]

Atomic Appends
• As in last slide, but…
• Primary also checks to see if append spills over into new chunk
  – If so, pads old chunk to full extent
  – Tells secondary chunkservers to do the same
  – Tells client to try append again on next chunk
• Usually works because
  – max(append-size) < ¼ chunk-size [API rule]
  – (meanwhile other clients may be appending)

Other Issues
• Fast snapshot
• Master operation
  – Namespace management & locking
  – Replica placement & rebalancing
  – Garbage collection (deleted / stale files)
  – Detecting stale replicas

Master Replication
• Master log & checkpoints replicated
• Outside monitor watches master liveness
  – Starts new master process as needed
• Shadow masters
  – Provide read access when primary is down
  – Lag state of true master
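The atomic-append rule and the reader's side of at-least-once semantics can be sketched as follows. Everything here is hypothetical (`ToyChunk`, `record_append`, `read_unique`, and the record-id scheme are invented for illustration); the sketch only tracks how many bytes of a chunk are used, pads the chunk when an append would spill over, and shows a reader skipping padding and duplicate records.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB, as on the slides


class ToyChunk:
    """Hypothetical chunk that only tracks how many bytes are in use."""

    def __init__(self, size=CHUNK_SIZE):
        self.size = size
        self.used = 0


def record_append(chunk, record_len):
    """Primary-side check: append in place, or pad and ask the client to retry."""
    # API rule from the slide: an append must be smaller than 1/4 of a chunk.
    if record_len >= chunk.size // 4:
        raise ValueError("append too large for record append")
    if chunk.used + record_len > chunk.size:
        # Append would spill over: pad the old chunk to its full extent.
        chunk.used = chunk.size
        return ("retry_next_chunk", None)
    offset = chunk.used  # the offset is chosen by the primary, not the client
    chunk.used += record_len
    return ("ok", offset)


def read_unique(records):
    """Reader side: skip padding (None) and duplicates (repeated record ids)."""
    seen = set()
    payloads = []
    for rec in records:
        if rec is None:  # padding region
            continue
        rec_id, payload = rec  # records self-identify via an id (assumed)
        if rec_id in seen:
            continue
        seen.add(rec_id)
        payloads.append(payload)
    return payloads
```

Because every append is under a quarter of the chunk size, padding wastes at most that much space per chunk, which is one way to read the "usually works" remark on the slide.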

  5. Read Performance, Write Performance, Record-Append Performance [figures not recovered]
