The Google File System • Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung • Presented by Farnaz Farahanipad (1001134035)
Overview • Introduction • Design overview • GFS structure • System interaction • Master operation • Questions and answers • Conclusion
Introduction: Distributed file systems • What is a distributed file system? • A DFS is any file system that allows files to be accessed from multiple hosts sharing them over a computer network.
Introduction • Why not use an existing file system? • Bottleneck problems • Load-balancing issues • Different workload and design properties • GFS is designed for Google's apps and workloads
Google File System • GFS is a scalable distributed file system for large, data-intensive applications. • GFS has a master-slave architecture. • It shares many goals with previous distributed file systems: performance, scalability, reliability, and availability.
GFS design assumptions • High component failure rates • “Modest” number of huge files – just a few million large files • Files are write-once and mostly appended to • Large streaming reads • High sustained throughput favored over low latency
GFS design decisions • Files stored as chunks • Fixed size of 64 MB • Single master • Simple centralized management • Reliability through replication • Each chunk is replicated across 3 or more chunk servers • No data caching • Little benefit due to the large size of data sets • Familiar interface, but a customized API • Suited to Google apps • Adds snapshot and record-append operations
GFS architecture
Client • Interacts with the master for metadata operations: • Translates a file offset into a chunk index within the file • Sends the master a request with the file name and chunk index • Caches the reply, using the file name and chunk index as the key • Interacts with chunk servers for read/write operations
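The offset-to-chunk-index translation on the slide above can be sketched in a few lines. This is an illustrative sketch, not the real GFS client code; `offset_to_chunk_index` is a hypothetical name.

```python
# Hypothetical sketch of the client-side translation described above:
# a byte offset in a file maps to a chunk index, which the client then
# sends to the master along with the file name.

CHUNK_SIZE = 64 * 1024 * 1024  # fixed 64 MB chunk size

def offset_to_chunk_index(offset: int) -> int:
    """Translate a byte offset within a file into a chunk index."""
    return offset // CHUNK_SIZE

# A client reading at offset 200 MB asks the master about chunk 3
# (chunks 0..2 cover the first 192 MB of the file).
print(offset_to_chunk_index(200 * 1024 * 1024))  # -> 3
```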
Chunk servers • Chunk servers are the workers of GFS. • Responsible for storing 64-MB file chunks. • Each chunk replica is stored on a chunk server and is extended only as needed.
Why a large chunk size? • Reduces the size of metadata • Reduces involvement of the master • Reduces network overhead • Lazy space allocation avoids internal fragmentation
Reliability issue • What if a chunk server goes down? • GFS copies every chunk multiple times and stores the replicas on different chunk servers.
Single master weakness: Single point of failure • What if the master goes down? • GFS solution: • Shadow masters
Single master weakness: Scalability bottleneck • How to solve the bottleneck problem? • GFS solution: • Minimize master involvement • Never move data through the master; use it only for metadata • Large chunk size means less metadata • Data mutations are carried out by the chunk servers
Master • The master maintains all file system metadata. • Periodically communicates with chunk servers • Gives instructions, collects state • Chunk creation, re-replication, rebalancing • Garbage collection • Simpler and more reliable • Lazily garbage-collects hidden files
Master: Metadata • Global metadata is stored on the master: • File and chunk namespaces • Mapping from files to chunks • Location of each chunk's replicas • All in memory (about 64 bytes per chunk) • Fast • Easy access
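A back-of-envelope calculation shows why keeping all metadata in memory is feasible. The numbers below (1 PB of stored data) are illustrative, not from the slides:

```python
# Back-of-envelope check (illustrative numbers): at about 64 bytes of
# metadata per 64 MB chunk, even a petabyte of file data needs only a
# modest amount of master memory.

CHUNK_SIZE = 64 * 1024 * 1024        # 64 MB per chunk
META_PER_CHUNK = 64                  # bytes of metadata per chunk

data_bytes = 1024 ** 5               # 1 PB of stored file data
num_chunks = data_bytes // CHUNK_SIZE
meta_bytes = num_chunks * META_PER_CHUNK

print(num_chunks)                    # -> 16777216 chunks
print(meta_bytes // (1024 ** 2))     # -> 1024 MB of metadata, i.e. 1 GB
```

So the in-memory approach costs roughly 1 GB of RAM per petabyte of data, which is why the slide can claim both "fast" and "easy access".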
Master: Operation log • The operation log contains a historical record of critical metadata changes. • Defines the order of concurrent operations • Critical, so: • Replicated to multiple remote machines • Respond to the client only after the change is logged both locally and remotely
Master: Operation log • The master checkpoints its state whenever the log grows beyond a certain size. • Fast recovery using checkpoints • Recovery needs only the latest checkpoint and subsequent log files, so older ones can be deleted freely.
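The checkpoint-plus-log recovery described above can be sketched as: restore the last checkpoint, then replay only the log records written after it. This is a minimal sketch with assumed data shapes (`recover`, the `(op, path, value)` record format), not the real implementation.

```python
# Minimal sketch of checkpoint + log-replay recovery (assumed structure):
# the master restores the latest checkpoint, then re-applies only the
# operation-log records written after that checkpoint.

def recover(checkpoint_state: dict, log_records: list) -> dict:
    state = dict(checkpoint_state)       # start from the checkpoint
    for op, path, value in log_records:  # replay post-checkpoint records
        if op == "set":
            state[path] = value
        elif op == "delete":
            state.pop(path, None)
    return state

state = recover({"/a": 1}, [("set", "/b", 2), ("delete", "/a", None)])
print(state)  # -> {'/b': 2}
```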
Why it is important to log on information of master? • Using a log allows us to update the master state simply, reliably, and without risking inconsistencies in the event of a master crash.
Master: Keeping chunk servers and the master synchronized • By exchanging heartbeat messages
System interaction: lease and mutation order • A lease is a grant of ownership or control for a limited time. • The owner/holder can renew or extend the lease. • If the owner fails, the lease expires and is free again.
System interaction: lease and mutation order • A mutation is an operation that changes the contents or metadata of a chunk, such as a write or an append. • Each mutation is performed at all of the chunk's replicas.
System interaction: lease and mutation order • Leases are used to maintain a consistent mutation order across replicas: the master grants a chunk lease to one replica, the primary, which picks a serial order for all mutations to the chunk.
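The lease idea can be sketched as follows. Class and method names (`Lease`, `Primary`, `order`) are illustrative, not the real GFS interfaces; the point is that only the current lease holder assigns serial numbers, so every replica applies mutations in the same order.

```python
# Sketch of lease-based mutation ordering (illustrative names): the
# primary holds a time-limited lease and assigns a serial number to
# each mutation, giving all replicas the same application order.

import time

class Lease:
    def __init__(self, holder: str, duration_s: float = 60.0):
        self.holder = holder
        self.expires = time.monotonic() + duration_s

    def valid(self) -> bool:
        return time.monotonic() < self.expires

    def renew(self, duration_s: float = 60.0) -> None:
        self.expires = time.monotonic() + duration_s

class Primary:
    """Replica holding the lease; serializes mutations for all replicas."""
    def __init__(self, lease: Lease):
        self.lease = lease
        self.next_serial = 0

    def order(self, mutation: str) -> tuple:
        assert self.lease.valid(), "lease expired; master may re-grant it"
        serial = self.next_serial
        self.next_serial += 1
        return (serial, mutation)  # same order applied at every replica

primary = Primary(Lease(holder="chunkserver-1"))
print(primary.order("write A"))   # -> (0, 'write A')
print(primary.order("append B"))  # -> (1, 'append B')
```

If the primary fails, its lease simply expires (no renewal), and the master can safely grant a fresh lease to another replica.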
System interaction: data flow • Data flow is decoupled from control flow • To avoid network bottlenecks and high-latency links: • Each machine forwards data to the closest machine • Latency is minimized by pipelining the data transfer over TCP connections
System interaction: data flow • Ideal time for transferring B bytes to R replicas without network congestion: t = B/T + R·L, where T is the network throughput and L is the latency between two machines • With B = 1 MB, T = 100 Mbps, L = 1 ms: t ≈ 80 ms
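Plugging the slide's numbers into the formula t = B/T + R·L confirms the ~80 ms figure (R = 3 is assumed here, matching the typical GFS replication factor):

```python
# The slide's formula with its numbers: ideal time to push B bytes
# through a pipeline of R replicas is B/T + R*L.

B = 1 * 10**6 * 8      # 1 MB in bits
T = 100 * 10**6        # 100 Mbps network throughput
L = 1e-3               # 1 ms latency per hop
R = 3                  # replicas (typical GFS replication factor)

t = B / T + R * L      # seconds
print(round(t * 1000)) # -> 83 ms, i.e. roughly 80 ms as on the slide
```

The B/T term dominates, which is why pipelining over the full outbound bandwidth of each machine matters more than the per-hop latency.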
Master operation • Namespace management and locking • Replica placement • Creation, Re-replication, Rebalancing
Master operation: Namespace management and locking • GFS allows multiple operations to be active in the master, using locks to ensure proper serialization. • Recall that GFS has no per-directory data structure. • It only stores the file-to-chunk mapping. • So GFS logically represents its namespace as a lookup table mapping full pathnames to metadata. • A read/write lock on each node in the namespace tree ensures serialization. • Each master operation acquires a set of locks before it runs.
Master operation: Namespace management and locking • If an operation involves /d1/d2/…/dn/leaf, it acquires: • Read locks on the directory names /d1, /d1/d2, …, /d1/d2/…/dn • Either a read lock or a write lock on the full pathname /d1/d2/…/dn/leaf
Master operation: Namespace management and locking • How does this locking mechanism prevent a file /home/user/foo from being created while /home/user is being snapshotted to /save/user? • Snapshot operation: read locks on /home and /save; write locks on /home/user and /save/user • Creation operation: read locks on /home and /home/user; write lock on /home/user/foo • The two operations conflict on /home/user, so they are serialized
Master operation: Namespace management and locking • Creating new files under a directory, e.g. /dir/file3, /dir/file4, … • Concurrent mutations in the same directory are allowed • Key: each creation takes only a read lock on /dir • By write-locking the full pathname, it locks the new file name before the file is created • This prevents two files with the same name from being created simultaneously
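The lock sets described on the last few slides can be sketched as follows. The helper names (`lock_set`, `conflicts`) are illustrative, not the real master code; the sketch just computes, for an operation on a pathname, the read locks on every ancestor and a read or write lock on the full path, then checks two operations for conflicts.

```python
# Sketch of the namespace lock sets (illustrative, not the real code):
# read locks on every ancestor directory, plus a read or write lock on
# the full pathname itself.

def lock_set(path: str, write_leaf: bool):
    """Return (read_locks, write_locks) for an operation on `path`."""
    parts = path.strip("/").split("/")
    reads = {"/" + "/".join(parts[:i]) for i in range(1, len(parts))}
    writes = set()
    (writes if write_leaf else reads).add(path)
    return reads, writes

def conflicts(a, b) -> bool:
    """Two operations conflict if one write-locks a name the other touches."""
    (ra, wa), (rb, wb) = a, b
    return bool(wa & (rb | wb) or wb & (ra | wa))

# Snapshot vs. create: snapshotting /home/user write-locks /home/user,
# while creating /home/user/foo read-locks /home/user -- a conflict,
# so the two operations are serialized. (Only the snapshot's source
# tree is shown here; it also locks the target tree.)
snap = lock_set("/home/user", write_leaf=True)
create = lock_set("/home/user/foo", write_leaf=True)
print(conflicts(snap, create))  # -> True

# Two creates in the same directory only read-lock /dir, so they can
# proceed concurrently:
f3 = lock_set("/dir/file3", write_leaf=True)
f4 = lock_set("/dir/file4", write_leaf=True)
print(conflicts(f3, f4))  # -> False
```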
Master operation: Replica placement • Serves two purposes: • Maximize data reliability and availability • Maximize network bandwidth utilization • Spread chunk replicas across racks: • To ensure chunk survivability • To exploit the aggregate read bandwidth of multiple racks • Trade-off: write traffic has to flow through multiple racks
Master operation: Creation, Re-replication, Rebalancing • Chunk replicas are created for three reasons: • Chunk creation • Re-replication • Rebalancing
Master operation: Creation, Re-replication, Rebalancing • New chunks are created on chunk servers • The master decides which chunk servers to use for chunk creation: • Put new replicas on chunk servers with below-average disk-space utilization • Limit the number of recent creations on each chunk server (creation is cheap but predicts heavy write traffic) • Spread replicas of a chunk across different racks
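Two of the placement heuristics above (prefer low disk utilization, spread across racks) can be sketched together. This is an illustrative sketch; `place_replicas` and the tuple format are assumptions, and the real master combines more factors.

```python
# Illustrative sketch of two placement heuristics above: prefer chunk
# servers with low disk utilization, and spread replicas across racks.

def place_replicas(servers, num_replicas=3):
    """servers: list of (name, rack, disk_utilization) tuples."""
    # Lowest utilization first approximates the "below-average
    # utilization" preference on the slide.
    candidates = sorted(servers, key=lambda s: s[2])
    chosen, racks = [], set()
    for name, rack, util in candidates:
        if rack not in racks:          # one replica per rack
            chosen.append(name)
            racks.add(rack)
        if len(chosen) == num_replicas:
            break
    return chosen

servers = [("cs1", "rackA", 0.30), ("cs2", "rackA", 0.20),
           ("cs3", "rackB", 0.50), ("cs4", "rackC", 0.40)]
print(place_replicas(servers))  # -> ['cs2', 'cs4', 'cs3']
```

Note that cs1 is skipped even though it is less utilized than cs4: rack diversity wins, because losing a whole rack must not destroy all replicas of a chunk.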
Master operation: Creation, Re-replication, Rebalancing • The master re-replicates a chunk as soon as the number of available replicas falls below a user-specified goal. • Each chunk that needs re-replication is prioritized based on several factors. • The master picks the highest-priority chunk and “clones” it by instructing some chunk server to copy the chunk data directly from an existing replica. • Additionally, each chunk server limits the amount of bandwidth it spends on replication by throttling its read requests to the source chunk server.
Master operation: Creation, Re-replication, Rebalancing • The master rebalances replicas periodically: • It examines the current replica distribution and moves replicas for better disk-space and load balancing. • It gradually fills up a new chunk server rather than instantly swamping it with new chunks and the heavy traffic that comes with them. • The master must also choose which existing replica to remove. • It prefers to remove replicas on chunk servers with below-average free space, so as to equalize disk-space usage.
Master operation: Garbage collection • Any replica not known to the master is garbage. • The master logs a deletion immediately, like other changes, but does not reclaim the resources right away. • The file is renamed to a hidden name that includes the deletion timestamp. • During the master's regular scan, it removes hidden files older than 3 days. • Once a hidden file is removed from the namespace, its in-memory metadata is erased.
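The lazy deletion scheme above can be sketched as: a delete renames the file to a hidden name carrying the deletion timestamp, and a later scan erases hidden names past the grace period. The names (`hide`, `scan`, the `.deleted.` prefix) are illustrative assumptions, not the real GFS naming convention.

```python
# Sketch of lazy garbage collection (illustrative naming scheme): a
# deleted file becomes a hidden name with a deletion timestamp; the
# master's regular scan reclaims hidden names older than 3 days.

GRACE_SECONDS = 3 * 24 * 3600   # hidden files kept for 3 days

def hide(namespace: dict, path: str, now: float) -> None:
    """Log-style delete: rename to a hidden name with a timestamp."""
    meta = namespace.pop(path)
    namespace[f".deleted.{path}.{int(now)}"] = meta

def scan(namespace: dict, now: float) -> None:
    """Regular scan: erase hidden files older than the grace period."""
    for name in list(namespace):
        if name.startswith(".deleted."):
            ts = int(name.rsplit(".", 1)[1])
            if now - ts > GRACE_SECONDS:
                del namespace[name]   # in-memory metadata erased

ns = {"/home/foo": "meta"}
hide(ns, "/home/foo", now=0)
scan(ns, now=4 * 24 * 3600)     # 4 days later: past the grace period
print(ns)                        # -> {}
```

Until the grace period expires, the hidden entry still exists, so an accidental delete can be undone simply by renaming the file back.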