
Google File System and RPC

CS555: Distributed Systems [Fall 2019], Dept. of Computer Science, Colorado State University
Shrideep Pallickara, Computer Science, Colorado State University


  1. L26, November 21, 2019. Professor: Shrideep Pallickara. Slides created by Shrideep Pallickara.

     Frequently asked questions from the previous class survey
     - After a snapshot, if a client seeks to change a chunk, how is that handled?
     - Why is caching of files (at the application level) not done?

  2. Topics covered in this lecture
     - Google File System
       - Replication
       - Consistency in GFS
       - Deletion of files and garbage collection
     - RPC
       - Persistence/transience
       - Synchronous/asynchronous communications
       - Parameters in RPC settings

     REPLICATION

  3. Reasons why chunk replicas are created
     - Chunk creation
     - Re-replication
     - Rebalancing

     Chunk replica creation
     - Place replicas on chunk servers with below-average disk space utilization
     - Limit the number of recent creations on each chunk server
       - Recent creations are a predictor of imminent heavy traffic
     - Spread replicas across racks
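The three placement rules for new replicas can be sketched as a small selection function. This is an illustrative sketch only, not GFS code: the `ChunkServer` fields, the `max_recent` threshold, and the two-pass rack strategy are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class ChunkServer:
    name: str
    rack: str
    disk_utilization: float  # fraction of disk in use, 0.0 to 1.0
    recent_creations: int    # replicas created on this server recently


def place_replicas(servers, count, max_recent=3):
    """Pick up to `count` servers for a new chunk (hypothetical policy)."""
    average = sum(s.disk_utilization for s in servers) / len(servers)
    # Rules 1 and 2: below-average utilization, few recent creations.
    candidates = [
        s for s in servers
        if s.disk_utilization < average and s.recent_creations < max_recent
    ]
    candidates.sort(key=lambda s: s.disk_utilization)  # emptiest first

    chosen, used_racks = [], set()
    # Rule 3, first pass: at most one replica per rack.
    for s in candidates:
        if len(chosen) < count and s.rack not in used_racks:
            chosen.append(s)
            used_racks.add(s.rack)
    # Second pass: fill any remaining slots, allowing a rack to repeat.
    for s in candidates:
        if len(chosen) < count and s not in chosen:
            chosen.append(s)
    return chosen
```

A server that is nearly empty but has just received many new chunks is skipped here, which mirrors the idea that recent creations predict imminent write traffic.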

  4. Re-replicate chunks when the replication level drops. Priority depends on:
     - How far the chunk is from its replication goal
     - Preference for chunks of live files
     - Boosted priority for chunks that are blocking client progress

     Rebalancing replicas
     - Examine the current replica distribution and move replicas for
       - Better disk space utilization
       - Load balancing
     - Removal of existing replicas
       - Prefer chunk servers with below-average free disk space
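One way to combine the three re-replication factors is a sort key. The sketch below is an assumption about how they might be ranked (blocking chunks first, then replica deficit, then live files); the slides list the factors but do not fix this order, and the `ChunkStatus` type is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ChunkStatus:
    chunk_id: str
    live_replicas: int
    replication_goal: int
    file_is_live: bool    # False if the owning file was recently deleted
    blocks_client: bool   # True if a client is waiting on this chunk


def rereplication_order(chunks):
    """Return under-replicated chunks, most urgent first (assumed ranking)."""
    needing = [c for c in chunks if c.live_replicas < c.replication_goal]
    return sorted(
        needing,
        key=lambda c: (
            not c.blocks_client,                        # blocking chunks first
            -(c.replication_goal - c.live_replicas),    # larger deficit first
            not c.file_is_live,                         # live files first
        ),
    )
```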

  5. Incorporating a new chunk server
     - Do not swamp the new server with lots of chunks
       - The concomitant traffic will bog down the machine
     - Gradually fill up the new server with chunks

     CONSISTENCY IN GFS

  6. In GFS the state of a file region after a mutation depends on:
     - TYPE of the mutation
     - SUCCESS/FAILURE of the mutation
     - Whether there were CONCURRENT mutations

     GFS has a relaxed consistency model
     - Consistent: all clients see the same data
       - On all replicas
     - Defined: the region is consistent, and
       - Clients see what the mutation wrote in its entirety

  7. File region state after a mutation

                             Write                      Record Append
     Serial success          defined                    defined interspersed with inconsistent
     Concurrent successes    consistent but undefined   defined interspersed with inconsistent
     Failure                 inconsistent               inconsistent

     Implications for applications
     - Rely on appends instead of overwrites
     - Checkpoint
     - Write records that are
       - Self-validating
       - Self-identifying
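Self-validating and self-identifying records can be illustrated with a small framing scheme. This is a sketch under assumptions, not the format GFS applications actually use: the 16-byte header layout, the CRC32 checksum, and the byte-by-byte resynchronization are choices made for the example, and empty payloads are not supported.

```python
import struct
import zlib

# Hypothetical header: payload length, record id, CRC32 of the payload.
HEADER = struct.Struct(">IQI")


def encode_record(record_id, payload):
    """Frame a payload so a reader can validate and identify it."""
    return HEADER.pack(len(payload), record_id, zlib.crc32(payload)) + payload


def read_records(region):
    """Return valid, de-duplicated payloads from a possibly messy region."""
    seen, out, pos = set(), [], 0
    while pos + HEADER.size <= len(region):
        length, record_id, crc = HEADER.unpack_from(region, pos)
        start = pos + HEADER.size
        payload = region[start:start + length]
        valid = (
            length > 0
            and len(payload) == length
            and zlib.crc32(payload) == crc
        )
        if not valid:
            pos += 1  # padding or garbage: slide forward one byte
            continue
        if record_id not in seen:  # the id lets the reader drop duplicates
            seen.add(record_id)
            out.append(payload)
        pos = start + length
    return out
```

The checksum handles padding and partial writes (self-validating), while the record id handles duplicates left behind by retried appends (self-identifying).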

  8. DELETION OF FILES & GARBAGE COLLECTION

     Garbage collection in GFS
     - After a file is deleted, GFS does not reclaim space immediately
     - Reclamation is done lazily during garbage collection, at the
       - File level
       - Chunk level

  9. Master logs a file's deletion immediately
     - The file is renamed to a hidden name
       - The hidden name includes the deletion timestamp
     - Master scans the file system namespace
       - Deletes the hidden file if it has existed for more than 3 days
     - When the file is removed from the namespace
       - Its in-memory metadata is also removed
       - This severs the links to all its chunks!

     Garbage collection: when the Master scans its chunk namespace
     - It identifies orphaned chunks
       - Those not reachable from any file
     - It erases the metadata for these chunks
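The lazy deletion steps above can be sketched as a toy master. Everything here is illustrative: the `Master` class, the `.deleted.` prefix, and the in-memory dictionaries are assumptions, and timestamps are passed in explicitly so the behaviour is easy to follow.

```python
GRACE_SECONDS = 3 * 24 * 60 * 60   # the 3-day window from the slides
HIDDEN_PREFIX = ".deleted."        # hypothetical hidden-name convention


class Master:
    def __init__(self):
        self.namespace = {}    # file name -> list of chunk ids
        self.chunks = set()    # chunk ids the master holds metadata for
        self.deleted_at = {}   # hidden name -> deletion timestamp

    def create(self, name, chunk_ids):
        self.namespace[name] = list(chunk_ids)
        self.chunks.update(chunk_ids)

    def delete(self, name, now):
        """Rename to a hidden name carrying the deletion timestamp."""
        hidden = f"{HIDDEN_PREFIX}{name}.{int(now)}"
        self.namespace[hidden] = self.namespace.pop(name)
        self.deleted_at[hidden] = now
        return hidden

    def scan_namespace(self, now):
        """Drop hidden files older than the grace period."""
        for hidden, ts in list(self.deleted_at.items()):
            if now - ts > GRACE_SECONDS:
                del self.namespace[hidden]   # severs links to its chunks
                del self.deleted_at[hidden]

    def scan_chunks(self):
        """Erase metadata for chunks no file can reach; return them."""
        reachable = {c for ids in self.namespace.values() for c in ids}
        orphans = self.chunks - reachable
        self.chunks -= orphans
        return orphans
```

Until the namespace scan removes the hidden file, its chunks stay reachable, so a deletion within the grace period could still be undone by renaming the file back.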

  10. The role of heartbeats in garbage collection
      - Chunk server reports a subset of the chunks it currently has
      - Master replies with the identity of the reported chunks that are no longer present in its metadata
        - The chunk server is free to delete its replicas of such chunks

      Stale chunks and issues
      - If a chunk server fails
        - AND misses mutations to the chunk
        - The chunk replica becomes stale
      - Working with a stale replica causes problems with:
        - Correctness
        - Consistency
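The heartbeat exchange for garbage collection can be sketched as follows. The names `master_reply` and `ChunkServerStore`, the batch size, and the rotating cursor are assumptions for illustration; the slides only state that a subset is reported and that the master names the chunks it no longer knows.

```python
def master_reply(known_chunks, reported):
    """Return the reported chunk ids that are absent from master metadata."""
    return [c for c in reported if c not in known_chunks]


class ChunkServerStore:
    def __init__(self, chunk_ids):
        self.replicas = sorted(chunk_ids)
        self.cursor = 0  # where the next heartbeat's subset starts

    def next_report(self, batch_size):
        """Report the next subset of held chunks, wrapping at the end."""
        batch = self.replicas[self.cursor:self.cursor + batch_size]
        self.cursor += batch_size
        if self.cursor >= len(self.replicas):
            self.cursor = 0
        return batch

    def drop(self, chunk_ids):
        """Delete local replicas the master no longer knows about."""
        doomed = set(chunk_ids)
        # Keep the cursor pointing at the same next unreported replica.
        kept_before = sum(
            1 for c in self.replicas[:self.cursor] if c not in doomed
        )
        self.replicas = [c for c in self.replicas if c not in doomed]
        self.cursor = kept_before
```

Over successive heartbeats every replica is eventually reported, so orphaned replicas are reclaimed without the master ever sending explicit delete commands.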

  11. Aiding the detection of stale chunks
      - Master maintains a chunk version number for each chunk
        - Used to distinguish between stale and up-to-date chunks
      - When the master grants a new lease on a chunk, it will
        - Increase the version number
        - Inform the replicas
        - Record the new version
      - All of this occurs BEFORE any client can write to the chunk

      If a replica is unavailable, its version number will not be advanced
      - When a chunk server restarts, it reports to the Master with the following:
        - Its set of chunks
        - The corresponding version numbers
      - The Master uses this report to detect stale replicas
      - Stale replicas are removed in regular garbage collection
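Version-based stale detection can be sketched with plain dictionaries. The `VersionTracker` class is hypothetical, and each replica is modelled as a dict mapping chunk id to version; a real system would persist these numbers.

```python
class VersionTracker:
    """Toy master-side bookkeeping for chunk version numbers."""

    def __init__(self):
        self.master_version = {}  # chunk id -> latest version granted

    def grant_lease(self, chunk_id, reachable_replicas):
        """Bump the version and inform reachable replicas before any write."""
        new_version = self.master_version.get(chunk_id, 0) + 1
        self.master_version[chunk_id] = new_version
        for replica in reachable_replicas:
            replica[chunk_id] = new_version  # replica records the new version
        return new_version

    def stale_chunks(self, report):
        """Given a restart report (chunk id -> version), find stale replicas."""
        return {
            chunk_id
            for chunk_id, version in report.items()
            if version < self.master_version.get(chunk_id, 0)
        }
```

A replica that was down during a lease grant keeps its old number, so its restart report exposes it as stale.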

  12. Additional safeguards against stale replicas
      - Include the chunk version number
        - When a client requests chunk information
          - Client/chunk server verify the version to make sure things are up-to-date
        - During cloning operations
          - Clone the most up-to-date chunk
      - Clients and chunk servers are expected to verify versioning information

      DATA INTEGRITY
