SLIDE 1

The Google File System

Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung (SOSP 2003). Presented by Kun Suo.

SLIDE 2

Outline

  • GFS Background, Concepts and Key words
  • Example of GFS Operations
  • Some optimizations in GFS
  • Evaluation
  • Conclusion
SLIDE 3

Motivation

SLIDE 4

What is GFS?

  • The Google File System (GFS) is a scalable distributed file system for large distributed data-intensive applications. It runs on inexpensive commodity hardware and provides fault tolerance and high performance to a large number of clients.
  • GFS shares many of the same goals as previous distributed file systems, such as performance, scalability, reliability, and availability.

SLIDE 5

GFS Assumptions

  • Hardware: the system is built from many inexpensive commodity components that often fail
  • Files: the system stores a modest number of large files
  • Workload characteristics:
  • Large streaming reads
  • Small random reads
  • Many large, sequential writes that append data to files
  • Clients: the system must efficiently implement well-defined semantics for multiple clients that concurrently append to the same file
  • Target: high sustained bandwidth is more important than low latency
SLIDE 6

Interface of GFS

  • GFS provides a familiar file system interface: it supports the usual operations to create, delete, open, close, read, and write files
  • GFS additionally supports snapshot and record append operations, useful for:
  • Producer-consumer queues
  • Many-way merging
SLIDE 7

Architecture of GFS

  • GFS components:
  • A single master
  • Multiple clients
  • Multiple GFS chunkservers
SLIDE 8

Chunk Size

  • Chunk size is set to 64 MB
  • Pros:
  • Fewer interactions between clients and the master
  • Clients can keep long-lived TCP connections to chunkservers, reducing network overhead
  • Less metadata on the master
  • Cons:
  • A small file occupies few chunks, perhaps just one
  • If too many clients visit the same small file, its chunkservers become hot spots
SLIDE 9

Metadata

  • Three types of metadata:
  • (1) File and chunk namespaces
  • (2) Mapping from files to chunks
  • (3) Locations of each chunk’s replicas
  • All metadata is kept in the master’s memory (performance)
  • Fast
  • Easily accessible
  • (1) & (2) are kept persistent by logging (reliability); (3) is not persisted, but refreshed periodically from the chunkservers (a sketch of these structures follows)
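
A minimal sketch, with hypothetical names (not the real GFS code), of how the three in-memory tables could be laid out; (1) and (2) are recoverable from the operation log, while (3) is rebuilt by polling chunkservers:

```python
from dataclasses import dataclass, field

@dataclass
class ChunkInfo:
    version: int = 0
    locations: list[str] = field(default_factory=list)  # chunkserver addresses; NOT persisted

class MasterMetadata:
    def __init__(self):
        self.namespace: set[str] = set()                # (1) file and chunk namespaces -> logged
        self.file_to_chunks: dict[str, list[int]] = {}  # (2) file -> chunk handles -> logged
        self.chunks: dict[int, ChunkInfo] = {}          # (3) chunk -> replicas -> from heartbeats
```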

SLIDE 10

Master Node

  • Metadata storage
  • Namespace management
  • Periodically communicate with chunkservers
  • Chunk operations: create, re-replicate, delete, garbage collection, load balancing, etc.
SLIDE 11

System Interaction

  • (1) Mutation
  • (2) Lease: minimizes management overhead at the master
SLIDE 12

Mutation

  • Mutation = a write or append to the contents or metadata of a chunk
  • Must be done on all replicas (consistency)
  • Lease
  • The master picks one replica as primary and gives it a “lease” for mutations; the primary sets the mutation order for all replicas
  • Purpose
  • Data flow decoupled from control flow
  • Minimize master involvement
SLIDE 13

Outline

  • GFS Background, Concepts and Key words (Question)
  • Example of GFS Operations
  • Some optimizations in GFS
  • Evaluation
  • Conclusion
SLIDE 14

Question [1]

  • “…its design has been driven by key observations of our application workloads and technological environment,…” What are the workload and technology characteristics GFS assumed in its design, and what are their corresponding design choices?

—> GFS design assumptions and target workload

SLIDE 15

GFS Assumptions

  • Hardware: the system is built from many inexpensive commodity components that often fail
  • Files: the system stores a modest number of large files
  • Workload characteristics:
  • Large streaming reads
  • Small random reads
  • Many large, sequential writes that append data to files
  • Clients: the system must efficiently implement well-defined semantics for multiple clients that concurrently append to the same file
  • Target: high sustained bandwidth is more important than low latency
SLIDE 16

Question [2]

  • “…while caching data blocks in the client loses its appeal.” GFS does not cache file data. Why does this design choice not lead to performance loss? What benefit does this choice have?

No performance loss: (1) applications stream through huge files; (2) working sets are too large to cache, so client caches offer little benefit. (Clients do still cache metadata for future accesses.) Benefits: (a) it simplifies the design of GFS; (b) it eliminates cache coherence issues, which are challenging in a distributed system.

SLIDE 17

Question [3]

  • “Small files must be supported, but we need not optimize for them.” Why? Large and small files exist in almost every system.

(a) GFS is designed to store millions of large files, each typically 100 MB or larger in size. (b) The chunkservers storing chunks that belong to small files may become hot spots if many clients access the same file. In practice, hot spots have not been a major issue because Google’s applications mostly read large multi-chunk files sequentially. (c) This is one of the disadvantages of GFS.

SLIDE 18

Outline

  • GFS Background, Concepts and Key words
  • Example of GFS Operations
  • Some optimizations in GFS
  • Evaluation
  • Conclusion
SLIDE 19

Read in GFS

  • 1. Application originates the read request
  • 2. GFS client translates the request and sends it to the master
  • 3. Master responds with chunk handle and replica locations

[Figure: Application ↔ Client ↔ Master/chunkservers. Messages: ① file name, byte range; ② file name, chunk index; ③ chunk handle, replica locations; ④ chunk handle, byte range; ⑤ data from file; ⑥ data]

SLIDE 20

Read in GFS

[Figure: same read diagram as the previous slide]

  • 4. Client picks a location and sends the request
  • 5. Chunkserver sends the requested data to the client
  • 6. Client forwards the data to the application
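
The six steps condense into a short sketch (hypothetical RPC stubs such as master.lookup and chunkserver.read, not the real GFS client library); note that data never flows through the master:

```python
CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB chunks

def gfs_read(master, filename, offset, length):
    chunk_index = offset // CHUNK_SIZE                       # (2) file name, chunk index
    handle, replicas = master.lookup(filename, chunk_index)  # (3) handle + locations
    chunkserver = replicas[0]                                # (4) pick a location
    # (5)/(6) data comes straight from the chunkserver to the client
    return chunkserver.read(handle, offset % CHUNK_SIZE, length)
```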

SLIDE 21

Write on GFS

[Figure: Application ↔ Client ↔ Master; Client pushes to one primary and two replica chunkservers. Messages ①–⑨; ① carries file name and byte range]

  • 1. Application originates the request
  • 2. GFS client translates the request and sends it to the master
  • 3. Master responds with chunk handle and replica locations

SLIDE 22

Write on GFS

[Figure: same write diagram as the previous slide]

  • 4. Client pushes write data to all locations; the data is stored in the chunkservers’ internal buffers
  • 5. Client sends the write command to the primary

SLIDE 23

Write on GFS

[Figure: same write diagram as the previous slide]

  • 6. Primary determines the serial order for the data instances in its buffer and writes the instances in that order to the chunk
  • Primary sends the serial order to the secondaries and tells them to perform the write

SLIDE 24

Write on GFS

[Figure: same write diagram as the previous slide]

  • 7. Secondaries respond back to the primary
  • 8. Primary responds back to the client
  • 9. Client responds to the application
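
The nine steps condense into a sketch (hypothetical RPC stubs such as lookup_for_write, push_data, commit, and apply, not the real GFS API); data is pushed to every replica first, then a small control message to the primary commits it in a fixed serial order:

```python
def gfs_write(master, filename, offset, data):
    # (2)/(3) metadata from the master: chunk handle, primary, secondaries
    handle, primary, secondaries = master.lookup_for_write(filename, offset)
    # (4) push data to every replica; chunkservers buffer it internally
    for server in [primary, *secondaries]:
        server.push_data(handle, data)
    # (5)/(6) the primary assigns a serial order and applies the write
    serial = primary.commit(handle, offset)
    # (6)/(7) secondaries apply the mutation in that same order and acknowledge
    acks = [s.apply(handle, serial) for s in secondaries]
    # (8)/(9) report success to the client only if every replica applied it
    return all(acks)
```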

SLIDE 25

Append on GFS

  • In a traditional write, the client specifies the offset at which data is to be written.
  • Record append is the same as a write, except that there is no offset: GFS picks the offset and returns it to the client, which makes appends safe for concurrent writers. This is the key difference (see the sketch below).
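
A two-line sketch of the interface difference, with hypothetical file objects:

```python
def traditional_write(f, offset, data):
    f.write_at(offset, data)  # client chooses the offset; concurrent writers
                              # to the same region can overwrite each other

def record_append(f, data):
    return f.append(data)     # GFS (the primary) chooses the offset atomically
                              # and returns it, so concurrent appends are safe
```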

SLIDE 26

Outline

  • GFS Background, Concepts and Key words
  • Example of GFS Operations (Question)
  • Some optimizations in GFS
  • Evaluation
  • Conclusion
SLIDE 27

Question [4]

  • “Clients interact with the master for metadata operations, but all data-bearing communication goes directly to the chunkservers.” How does this design help improve the system’s performance?

The single master is a potential bottleneck; sending data directly to chunkservers minimizes the clients’ involvement with the master during reads and writes.

SLIDE 28

Question [5]

  • “A GFS cluster consists of a single master…” What’s the benefit of having only a single master? What’s its potential performance risk? How does GFS minimize such a risk?

1. Benefit: it simplifies the design. 2. Risk: the master is a potential bottleneck. 3. Mitigation: minimize the clients’ involvement with the master in reads and writes.

SLIDE 29

Question [6]

  • “Each chunk replica is stored as a plain Linux file on a chunkserver and is extended only as needed.” How does GFS collaborate with the chunkserver’s local file system to store file chunks? What is lazy space allocation and what is its benefit?

GFS is composed of many servers. Each server is typically a commodity Linux machine running a user-level server process. A file in GFS is ultimately stored on local servers as regular Linux files.

SLIDE 30

Question [6]

  • “Each chunk replica is stored as a plain Linux file on a chunkserver and is extended only as needed.” How does GFS collaborate with the chunkserver’s local file system to store file chunks? What is lazy space allocation and what is its benefit?

[Figure: chunks are stored with the help of the local file system]

SLIDE 31

Question [6]

  • “Each chunk replica is stored as a plain Linux file on a chunkserver and is extended only as needed.” How does GFS collaborate with the chunkserver’s local file system to store file chunks? What is lazy space allocation and what is its benefit? Lazy allocation simply means not allocating a resource until it is actually needed. Benefit: lazy space allocation avoids wasting space due to internal fragmentation, perhaps the greatest objection against such a large chunk size.
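
A minimal sketch of the idea: the Linux file backing a chunk grows only as data is actually written, rather than being preallocated to the full 64 MB.

```python
import os

CHUNK_SIZE = 64 * 1024 * 1024

def write_to_chunk(chunk_path, offset, data):
    assert offset + len(data) <= CHUNK_SIZE
    # O_CREAT: the plain Linux file backing the chunk appears on first write
    fd = os.open(chunk_path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        os.pwrite(fd, data, offset)  # the file is extended only as needed
    finally:
        os.close(fd)
    # A 1 KB record in a fresh chunk consumes ~1 KB on disk, not 64 MB:
    # no internal fragmentation despite the large chunk size.
```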

SLIDE 32

Question [7]

  • “On the other hand, a large chunk size, even with lazy space allocation, has its disadvantages.” Give an example disadvantage.

A small file consists of a small number of chunks, perhaps just one. The chunkservers storing those chunks may become hot spots if many clients are accessing the same file. In practice, hot spots did develop when GFS was first used by a batch-queue system: the few chunkservers storing an executable file were overloaded by hundreds of simultaneous requests. This was fixed by storing such executables with a higher replication factor and by making the batch-queue system stagger application start times.

SLIDE 33

Question [7]

  • “On the other hand, a large chunk size, even with lazy space allocation, has its disadvantages.” Give an example disadvantage.

[Figure example: many clients accessing the same single-chunk small file make its chunkserver a hot spot]

SLIDE 34

Question [8]

  • “One potential concern for this memory-only approach is that the number of chunks and hence the capacity of the whole system is limited by how much memory the master has.” Why is GFS’s master able to keep the metadata in memory?

With a 64 MB chunk size, the master keeps less than 64 bytes of metadata per chunk, so the metadata is small enough to fit in memory.
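
A back-of-the-envelope check (illustrative numbers, not from the slides): even a petabyte of stored data needs only about a gigabyte of chunk metadata.

```python
CHUNK_SIZE      = 64 * 1024**2   # 64 MB per chunk
BYTES_PER_CHUNK = 64             # < 64 B of metadata per chunk

data_stored = 1024**5                         # say the cluster stores 1 PB
num_chunks  = data_stored // CHUNK_SIZE       # = 16,777,216 chunks
metadata    = num_chunks * BYTES_PER_CHUNK    # = 1 GB
print(f"{metadata / 1024**3:.0f} GB of chunk metadata for 1 PB of data")
```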

SLIDE 35

Question [9]

  • “We use leases to maintain a consistent mutation order across replicas.” Could you show a scenario where an unexpected result may appear if the lease mechanism is not implemented? Also explain how leases help address the problem.

Without a lease, replicas may apply concurrent mutations in different orders:
  • Primary order: A, B, C
  • Non-primary order: A, C, B
  • Non-primary order: B, A, C

SLIDE 36

Question [9]

  • “We use leases to maintain a consistent mutation order across replicas.” Could you show a scenario where an unexpected result may appear if the lease mechanism is not implemented? Also explain how leases help address the problem.

With a lease, the non-primary replicas follow the primary’s order:
  • Primary order: A, B, C
  • Non-primary order: A, B, C
  • Non-primary order: A, B, C

SLIDE 37

Question [9]

  • “We use leases to maintain a consistent mutation order across replicas.” Could you show a scenario where an unexpected result may appear if the lease mechanism is not implemented? Also explain how leases help address the problem.

Lease: keeps a consistent mutation order; secondary replicas follow the primary replica (a sketch follows).
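
A minimal sketch, with hypothetical names, of how a lease fixes the divergence above: only the lease-holding primary assigns serial numbers, and every replica applies mutations in that serial order.

```python
import heapq

class Replica:
    def __init__(self):
        self.pending, self.applied, self.next_serial = [], [], 0

    def receive(self, serial, mutation):
        heapq.heappush(self.pending, (serial, mutation))
        # apply strictly in serial order, even if messages arrive out of order
        while self.pending and self.pending[0][0] == self.next_serial:
            self.applied.append(heapq.heappop(self.pending)[1])
            self.next_serial += 1

class Primary(Replica):
    def mutate(self, mutation, secondaries):
        serial = self.next_serial  # the primary alone assigns the global order
        for replica in [self, *secondaries]:
            replica.receive(serial, mutation)
```

Even if the network reorders delivery to a secondary, `receive` buffers mutations until the gap fills, so `applied` ends up identical on every replica.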

SLIDE 38

Outline

  • GFS Background, Concepts and Key words
  • Example of GFS Operations
  • Some optimizations in GFS
  • Evaluation
  • Conclusion
SLIDE 39

Some Optimizations on GFS

  • Snapshot
  • Fault tolerance
  • Relaxed Consistency Model
SLIDE 40

Snapshot

  • A snapshot is a copy of a file or directory tree made at a moment in time, at low cost
  • Snapshot is implemented based on standard copy-on-write
  • Why use snapshots?
  • To quickly create branch copies of huge data sets (performance)
  • Quick data access for end users (performance)
  • Changes can be committed or rolled back easily (reliability)
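
A minimal sketch of copy-on-write (illustrative, not the actual GFS implementation): a snapshot copies only metadata, and a chunk is physically duplicated on the first write after the snapshot.

```python
refcount = {}                                 # chunk handle -> reference count

def snapshot(chunks: list) -> list:
    for h in chunks:
        refcount[h] = refcount.get(h, 1) + 1  # chunks become shared, not copied
    return list(chunks)                       # metadata-only copy: cheap

def write(chunks: list, i: int, data: bytes, storage: dict):
    h = chunks[i]
    if refcount.get(h, 1) > 1:                # chunk shared with a snapshot?
        new_h = h + "'"                       # copy-on-write: duplicate it now
        storage[new_h] = bytes(storage[h])
        refcount[h] -= 1
        chunks[i] = h = new_h
        refcount[h] = 1
    storage[h] = data                         # the snapshot's copy is untouched
```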
SLIDE 41

Fault Tolerance

  • High availability
  • Fast recovery
  • Master and chunkservers restore their state and restart in a few seconds after a failure
  • Chunk replication
  • Each chunk is replicated on multiple chunkservers on different racks; users can specify different replication levels for different parts of the file namespace
  • Default: 3 replicas
  • Shadow masters
  • Data integrity: a checksum for every 64 KB block in each chunk
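
A minimal sketch of per-block checksumming (illustrative; GFS uses 32-bit checksums, approximated here with zlib.crc32):

```python
import zlib

BLOCK = 64 * 1024  # 64 KB blocks within each chunk

def checksums(chunk: bytes) -> list[int]:
    return [zlib.crc32(chunk[i:i + BLOCK]) for i in range(0, len(chunk), BLOCK)]

def verified_read(chunk: bytes, sums: list[int], block_index: int) -> bytes:
    block = chunk[block_index * BLOCK:(block_index + 1) * BLOCK]
    if zlib.crc32(block) != sums[block_index]:
        # on mismatch: report to the master and read from another replica
        raise IOError("corrupt block detected")
    return block
```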
SLIDE 42

Relaxed Consistency Model

  • Applications rely on appends rather than overwrites, checkpointing, and writing self-validating, self-identifying records
  • far more efficient and resilient for applications
  • Many writers concurrently append to a file for merged results or as a producer-consumer queue
  • simple and efficient
  • Google’s apps live with it
SLIDE 43

Outline

  • GFS Background, Concepts and Key words
  • Example of GFS Operations
  • Some optimizations in GFS (Question)
  • Evaluation
  • Conclusion
SLIDE 44

Question [10]

  • “When the master creates a chunk, it chooses where to place the initially empty replicas.“ What are the criteria for choosing where to place the initially empty replicas?

1. Place new replicas on chunkservers with below-average disk-space utilization (balance)
2. Limit the number of “recent” creations on each chunkserver (a new chunk implies imminent heavy writes)
3. Spread replicas of a chunk across racks (reliability)
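
A minimal sketch of the three criteria, with hypothetical chunkserver fields (disk_utilization, recent_creations, rack):

```python
def place_replicas(chunkservers, num_replicas=3):
    avg = sum(cs.disk_utilization for cs in chunkservers) / len(chunkservers)
    # criteria 1 and 2: prefer under-utilized servers with few recent creations
    candidates = sorted(
        (cs for cs in chunkservers if cs.disk_utilization <= avg),
        key=lambda cs: cs.recent_creations,
    )
    chosen, racks = [], set()
    for cs in candidates:
        if cs.rack not in racks:       # criterion 3: spread across racks
            chosen.append(cs)
            racks.add(cs.rack)
        if len(chosen) == num_replicas:
            break
    return chosen
```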

SLIDE 45

Question [11]

  • “The master re-replicates a chunk as soon as the number of available replicas falls below a user-specified goal.” When a new chunkserver is added to the system, the master mostly uses chunk rebalancing rather than filling it up with new chunks. Why?

2. Limit the number of “recent” creations on each chunkserver (a new chunk implies imminent heavy writes): filling the new server with new chunks would swamp it with heavy I/O flow, bad :(
3. Spread replicas of a chunk across racks (reliability): putting all eggs in one basket is not safe

SLIDE 46

Question [12]

  • “After a file is deleted, GFS does not immediately reclaim the available physical storage. It does so only lazily during regular garbage collection at both the file and chunk levels.” How are files and chunks deleted? What are the advantages of the delayed space reclamation (garbage collection) over eager deletion?

Files: when a file is deleted by the application, the master logs the deletion immediately, but the file is just renamed to a hidden name that includes the deletion timestamp. During the master’s regular scan of the file system namespace, it removes any such hidden files that have existed for more than three days, and only then removes the namespace entry, metadata, etc. Chunks: in the regular HeartBeat exchanges, the master identifies chunks that are no longer reachable from any file and erases their metadata; the chunkservers then delete those chunks.
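
A minimal sketch, with hypothetical names, of the file-level half: delete = rename to a hidden, timestamped name; reclamation happens later during the regular namespace scan.

```python
import time

GRACE_PERIOD = 3 * 24 * 3600                         # three days

def delete_file(namespace: dict, name: str):
    hidden = f".deleted.{name}.{int(time.time())}"   # hidden name + timestamp
    namespace[hidden] = namespace.pop(name)          # rename only; data intact
    # until the scan below, the file can be undeleted by renaming it back

def namespace_scan(namespace: dict, now=None):
    now = now or time.time()
    for name in list(namespace):
        if name.startswith(".deleted."):
            deleted_at = int(name.rsplit(".", 1)[1])
            if now - deleted_at > GRACE_PERIOD:
                del namespace[name]  # only now are metadata and chunks freed
```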

SLIDE 47

Question [12]

  • “After a file is deleted, GFS does not immediately reclaim the available physical storage. It does so only lazily during regular garbage collection at both the file and chunk levels.” How are files and chunks deleted? What are the advantages of the delayed space reclamation (garbage collection) over eager deletion?

Advantages: 1. It is simple and reliable in a large distributed system. 2. It merges storage reclamation into the regular background activities of the master, adding little overhead or burden to the master node. 3. It guards against accidental, irreversible deletion.

SLIDE 48

Outline

  • GFS Background, Concepts and Key words
  • Example of GFS Operations
  • Some optimizations in GFS
  • Evaluation
  • Conclusion
SLIDE 49

Evaluation Environment

  • Cluster
  • 1 master
  • 16 chunkservers (1.4 GHz PIII CPU, 2 GB RAM, two 80 GB disks, 100 Mbps Ethernet)
  • 16 clients
  • Server machines connected to a central switch by 100 Mbps Ethernet
  • The two switches (HP 2524) are connected with a 1 Gbps link
SLIDE 50

Aggregate Throughputs

  • N clients read a 4 MB region from a 320 GB file set simultaneously
  • Read rate drops slightly as the number of clients goes up, due to the growing probability of multiple clients reading from the same chunkserver
  • 1 client: 10 MB/s, 80% of the per-client limit
  • 16 clients: 6 MB/s per client, 75% of the limit
SLIDE 51

Aggregate Throughputs

  • N clients write to N distinct files simultaneously
  • The low write rate is due to the delay in propagating data among replicas
  • Slow individual writes are not a major problem, since aggregate write bandwidth grows with a large number of clients
  • 1 client: 6.3 MB/s, 50% of the limit
  • 16 clients: 2.2 MB/s per client
SLIDE 52

Aggregate Throughputs

  • N clients append to a single file simultaneously
  • Append rate drops slightly as the number of clients goes up, due to network congestion caused by the different clients
  • Chunkserver network congestion is not a major issue in practice, with many clients appending to multiple large shared files
  • 1 client: 6 MB/s
  • 16 clients: 4.8 MB/s per client
SLIDE 53

Real World Clusters

  • Cluster A: research and development
  • Cluster B: production data processing

SLIDE 54

GFS Deployment in Google

  • Many GFS clusters
  • Hundreds/thousands of storage nodes each
  • Managing petabytes of data
  • GFS underlies BigTable and other systems
SLIDE 55

Outline

  • GFS Background, Concepts and Key words
  • Example of GFS Operations
  • Some optimizations in GFS
  • Evaluation
  • Conclusion
SLIDE 56

Conclusion

  • The Google File System is a scalable distributed file system for large distributed data-intensive applications. It runs on inexpensive commodity hardware and provides fault tolerance and high performance to a large number of clients.
  • GFS shares many of the same goals as previous distributed file systems, but has its own innovations and limitations (single-master bottleneck, designed for large files, hot spots, etc.)
  • GFS meets Google’s storage needs and serves Google’s apps and services

SLIDE 57

One Comparison

  • Taobao File System (TFS) from Alibaba
  • Hundreds of millions of products
  • Product images, descriptions, comments, transactions, etc. are all small files

SLIDE 58

Taobao File System

  • One chunk contains many small files, located through a hierarchy of indexes (1st-level index … Nth-level index)
  • Open sourced
  • An optimization for small files (a sketch of the idea follows)
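
A purely illustrative sketch (assumed structure, not TFS’s real format) of the packing idea: many small files share one large chunk, so each small file costs only an index entry rather than a whole chunk.

```python
class PackedChunk:
    def __init__(self):
        self.data = bytearray()
        self.index: dict[str, tuple[int, int]] = {}  # file id -> (offset, length)

    def add(self, file_id: str, blob: bytes):
        self.index[file_id] = (len(self.data), len(blob))
        self.data.extend(blob)                       # small files packed back-to-back

    def get(self, file_id: str) -> bytes:
        offset, length = self.index[file_id]
        return bytes(self.data[offset:offset + length])
```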

SLIDE 59

Reference

  • cs.brown.edu/~debrabant/cis570-website/slides/gfs.ppt
  • https://www.cs.umd.edu/class/spring2011/cmsc818k/Lectures/gfs-hdfs.pdf
  • http://www.slideshare.net/omerfarukinceedutr/google-file-system-gfs-presentation
  • http://www.slideshare.net/romain_jacotin/the-google-file-system-gfs

SLIDE 60

Q & A