 
CS535 Big Data, Week 5-B, 2/19/2020
Sangmi Lee Pallickara
Computer Science, Colorado State University
http://www.cs.colostate.edu/~cs535
Spring 2020

CS535 BIG DATA
PART B. GEAR SESSIONS
SESSION 1: PETA-SCALE STORAGE SYSTEMS
Google had 2.5 million servers in 2016

FAQs
• Quiz #2
  • 2/21 ~ 2/23
  • Spark and Storm
  • 10 questions
  • 30 minutes
  • Answers will be available at 9PM 2/24

Topics of Today's Class
• GEAR Session I. Peta-scale Storage Systems
  • Lecture 2.
    • GFS I and II
    • Cassandra

GEAR Session 1. Peta-scale Storage Systems

GEAR Session 1. Peta-scale storage systems
Lecture 2. Google File System and Hadoop Distributed File System
3. Relaxed Consistency

Two breaks in the communication lines
[Figure: network of sites (London, Rome, Boston, Chicago, LA, Paris, Miami, Sydney) with two broken communication lines]
• A single machine can't partition
  • So it does not have to worry about partition tolerance
• There is only one node
  • If it's up, it's available

Eventually consistent
• At any time, nodes may have replication inconsistencies
• If there are no more updates (or updates can be ordered), eventually all nodes will be updated to the same value

GFS has a relaxed consistency model
• Consistent: all clients see the same data
  • On all replicas
• Defined: the region is consistent AND
  • Clients see what the mutation wrote in its entirety

Inconsistent and undefined
[Figure: Operation A, Operation B]

Consistent but undefined
[Figure: Operation A, Operation B]

Defined
[Figure: Operation A, Operation B]

File state region after a mutation

                      Write                       Record Append
Serial success        Defined                     Defined interspersed with inconsistent
Concurrent success    Consistent but undefined    Defined interspersed with inconsistent
Failure               Inconsistent                Inconsistent

GEAR Session 1. Peta-scale storage systems
Lecture 2. Google File System and Hadoop Distributed File System
4. Handling write and append to a file

GFS uses leases to maintain consistent mutation order across replicas
• Master grants lease to one of the replicas
  • Primary
• Primary picks serial-order
  • For all mutations to the chunk
• Other replicas follow this order
  • When applying mutations
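
The ordering rule above can be illustrated with a small sketch. This is not GFS code: the `Primary` and `Secondary` classes and their methods are hypothetical names, and buffering out-of-order arrivals on the secondary is an assumption made here to show why a single serial order keeps replicas in agreement.

```python
class Primary:
    """Hypothetical lease holder: picks one serial order for all mutations."""

    def __init__(self):
        self._next_serial = 0

    def order(self, mutation):
        # Assign the next consecutive serial number to this mutation.
        serial = self._next_serial
        self._next_serial += 1
        return serial, mutation


class Secondary:
    """Hypothetical replica: applies mutations strictly in serial order."""

    def __init__(self):
        self.applied = []     # mutations applied so far, in order
        self._pending = {}    # serial -> mutation that arrived early
        self._expected = 0    # next serial number this replica may apply

    def receive(self, serial, mutation):
        # Assumption: hold early arrivals until every lower serial is applied.
        self._pending[serial] = mutation
        while self._expected in self._pending:
            self.applied.append(self._pending.pop(self._expected))
            self._expected += 1


primary = Primary()
ordered = [primary.order(m) for m in ("write-a", "write-b", "write-c")]

secondary = Secondary()
for serial, mutation in reversed(ordered):   # deliver in the wrong order
    secondary.receive(serial, mutation)
```

Even though the messages arrive reversed, `secondary.applied` ends up in the order the primary chose, which is the property the lease is meant to provide.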

Lease mechanism designed to minimize communications with the master
• Lease has initial timeout of 60 seconds
• As long as chunk is being mutated
  • Primary can request and receive extensions
• Extension requests/grants piggybacked over heart-beat messages

Revocation and transfer of leases
• Master may revoke a lease before it expires
• If communications lost with primary
  • Master can safely give lease to another replica
  • Only after the lease period for old primary elapses
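
The timeout, extension, and transfer rules can be sketched as master-side bookkeeping. This is a minimal sketch under stated assumptions, not the GFS implementation: `LeaseTable` and its methods are hypothetical, time is passed in explicitly as `now` (in seconds) rather than read from a clock, and explicit revocation is not modeled. Only the 60-second timeout, extensions, and the wait-until-expiry rule come from the slides.

```python
LEASE_SECONDS = 60  # initial lease timeout stated on the slide


class LeaseTable:
    """Hypothetical master-side record of which replica is primary."""

    def __init__(self):
        # chunk id -> (primary replica name, lease expiry time)
        self._leases = {}

    def grant(self, chunk, replica, now):
        """Grant the lease unless another replica still holds an unexpired one."""
        current = self._leases.get(chunk)
        if current is not None and now < current[1] and current[0] != replica:
            # The old primary may be unreachable but still acting as primary,
            # so the master waits for its lease period to elapse.
            return False
        self._leases[chunk] = (replica, now + LEASE_SECONDS)
        return True

    def extend(self, chunk, replica, now):
        """Extend a live lease (in GFS the request rides on a heartbeat)."""
        current = self._leases.get(chunk)
        if current is None or current[0] != replica or now >= current[1]:
            return False
        self._leases[chunk] = (replica, now + LEASE_SECONDS)
        return True

    def primary(self, chunk, now):
        """Return the current primary, or None if the lease has expired."""
        current = self._leases.get(chunk)
        if current is None or now >= current[1]:
            return None
        return current[0]


table = LeaseTable()
table.grant("chunk-1", "A", now=0)      # A becomes primary until t=60
table.extend("chunk-1", "A", now=50)    # still mutating: lease now ends at t=110
table.grant("chunk-1", "B", now=100)    # refused: A's lease has not elapsed
table.grant("chunk-1", "B", now=110)    # allowed: old lease period is over
```

Refusing the grant at `now=100` is what prevents two replicas from both believing they are primary for the same chunk.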

How a write is actually performed
[Figure: client, master, primary replica, secondary replicas A and B]
1. Client asks the master which chunkserver holds the current lease for the chunk and where the other replicas are
2. Master replies with the identity of the primary and the locations of the other replicas
3. Client pushes the data to all the replicas
4. Client sends the write request to the primary
5. Primary forwards the write request to the secondary replicas
6. Secondaries send an acknowledgement to the primary
7. Primary sends the final reply to the client

Client pushes data to all the replicas [1/2]
• Each chunk server stores data in an LRU buffer until
  • Data is used
  • Aged out

Client pushes data to all the replicas [2/2]
• When chunk servers acknowledge receipt of data
  • Client sends a write request to primary
• Primary assigns consecutive serial numbers to mutations
  • Forwards to replicas

Data flow is decoupled from the control flow to utilize network efficiently
• Utilize each machine's network bandwidth
• Avoid network bottlenecks
• Avoid high-latency links
• Leverage network topology
  • Estimate distances from IP addresses
• Pipeline the data transfer
  • Once a chunkserver receives some data, it starts forwarding immediately
  • For transferring B bytes to R replicas
    • Ideal elapsed time will be ≈ B/T + RL where:
    • T is the network throughput
    • L is latency to transfer bytes between two machines
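
As a worked example of the B/T + RL estimate, the sketch below plugs in illustrative numbers. The values (1 MB of data, a 100 Mbps link, 1 ms latency, 3 replicas) and the function names are assumptions chosen for the example, and the "sequential" alternative, where the sender pushes a full copy to each replica in turn, is included only for contrast.

```python
def pipelined_transfer_time(num_bytes, replicas, throughput, latency):
    """Ideal time for a pipelined push: B/T + R*L (seconds)."""
    return num_bytes / throughput + replicas * latency


def sequential_transfer_time(num_bytes, replicas, throughput, latency):
    """Contrast case: one full copy per replica, one after another."""
    return replicas * (num_bytes / throughput + latency)


B = 1_000_000          # bytes to transfer (assumed)
R = 3                  # number of replicas (assumed)
T = 100e6 / 8          # 100 Mbps link expressed in bytes per second (assumed)
L = 0.001              # 1 ms latency between two machines (assumed)

pipelined = pipelined_transfer_time(B, R, T, L)     # 0.08 s + 3 * 0.001 s
sequential = sequential_transfer_time(B, R, T, L)   # 3 * (0.08 s + 0.001 s)
```

With these numbers the pipelined push takes about 83 ms while the sequential push takes about 243 ms, because pipelining pays the B/T term once instead of once per replica.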

Append: Record sizes and fragmentation
• Size is restricted to ¼ the chunk size
  • Maximum size
• Minimizes worst-case fragmentation
  • Internal fragmentation in each chunk

Inconsistent Regions
[Figure: three replicas of a chunk. Data 1 and Data 2 are stored on all three replicas; the append of Data 3 succeeds on two replicas and fails on the third. The user will re-try to store Data 3. After the retry, the replicas hold Data 1, Data 2, an empty region where the append failed, and repeated copies of Data 3.]
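
The link between the ¼ record-size limit and worst-case fragmentation can be shown with a short sketch. It relies on behavior described in the GFS paper rather than on the slide: when a record does not fit in the current chunk, the chunk is padded to its end and the append is retried on the next chunk. The 64 MB chunk size, the `place_record` function, and its return convention are assumptions for illustration.

```python
CHUNK_SIZE = 64 * 1024 * 1024    # 64 MB chunk (assumed, as in original GFS)
MAX_RECORD = CHUNK_SIZE // 4     # record append limit: 1/4 of the chunk size


def place_record(used, size):
    """Return (offset, padding) for appending `size` bytes to a chunk.

    `used` is the number of bytes already in the chunk. `offset` is None
    when the chunk must be padded and the append retried on the next chunk;
    `padding` is the number of bytes wasted in that case.
    """
    if size > MAX_RECORD:
        raise ValueError("record exceeds 1/4 of the chunk size")
    if used + size <= CHUNK_SIZE:
        return used, 0
    return None, CHUNK_SIZE - used


# Worst case: the largest record misses fitting by a single byte.
worst_used = CHUNK_SIZE - MAX_RECORD + 1
offset, padding = place_record(worst_used, MAX_RECORD)
```

Padding only happens when the record does not fit, so the wasted space is always smaller than the record itself; capping records at ¼ of the chunk therefore caps the waste at just under ¼ of each chunk.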

What if record append fails at one of the replicas
• Client must retry the operation
• Replicas of same chunk may contain
  • Different data
  • Duplicates of the same record
    • In whole or in part
• Replicas of chunks are not bit-wise identical!
  • In most systems, replicas are identical

GFS only guarantees that the data will be written at least once as an atomic unit
• For an operation to return success
  • Data must be written at the same offset on all the replicas
• After the write, all replicas are at least as long as the end of the record
  • Any future record will be assigned a higher offset or a different chunk
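
The at-least-once behavior can be simulated with a toy model. This is a sketch under simplifying assumptions, not GFS code: each replica is a Python list with one slot per record, a failed write leaves an empty slot (`None`), and the function names are hypothetical. Filtering duplicates in the reader is an application-level convention assumed here, with the record value standing in for a unique record identifier.

```python
def record_append(replicas, record, failing=()):
    """Attempt to append `record` at the same offset on every replica.

    `failing` names the replicas where this attempt fails. Returns True
    only if every replica stored the record.
    """
    offset = max(len(data) for data in replicas.values())
    for name, data in replicas.items():
        # Bring every replica up to the chosen offset before writing.
        data.extend([None] * (offset - len(data)))
        data.append(None if name in failing else record)
    return all(name not in failing for name in replicas)


def read_unique(data):
    """Reader-side cleanup: skip empty slots and duplicate records."""
    seen, out = set(), []
    for rec in data:
        if rec is None or rec in seen:
            continue
        seen.add(rec)
        out.append(rec)
    return out


replicas = {"A": [], "B": [], "C": []}
record_append(replicas, "Data 1")
record_append(replicas, "Data 2")
ok = record_append(replicas, "Data 3", failing={"C"})   # fails on replica C
if not ok:
    record_append(replicas, "Data 3")                   # client retries
```

After the retry, replicas A and B hold Data 3 twice while C holds an empty slot followed by Data 3, so the replicas are not identical, yet the successful attempt sits at the same offset on all three and `read_unique` returns the same three records from any of them.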

GEAR Session 1. Peta-scale storage systems
Lecture 2. Google File System and Hadoop Distributed File System
Google File System II: Colossus

Storage Software: Colossus (GFS2)
• Next-generation cluster-level file system
• Automatically sharded metadata layer
• Distributed Masters (64MB block size → 1MB)
• Data typically written using Reed-Solomon (1.5x)
• Client-driven replication, encoding and replication
• Metadata space has enabled availability
• Why Reed-Solomon?
  • Cost
  • Especially with cross cluster replication
  • More flexible cost vs. availability choices
• Google File System II: Dawn of the Multiplying Master Nodes, http://www.theregister.co.uk/2009/08/12/google_file_system_part_deux/?page=1
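
The cost argument for Reed-Solomon comes down to simple arithmetic, sketched below. The slide only states the 1.5x figure; the specific code parameters (6 data blocks plus 3 parity blocks) are an assumption chosen because they produce that overhead, and the function names are hypothetical.

```python
from fractions import Fraction


def rs_overhead(data_blocks, parity_blocks):
    """Raw bytes stored per byte of user data under Reed-Solomon coding."""
    return Fraction(data_blocks + parity_blocks, data_blocks)


def replication_overhead(copies):
    """Raw bytes stored per byte of user data under plain replication."""
    return Fraction(copies, 1)


rs = rs_overhead(6, 3)              # assumed RS(6, 3): 9 blocks hold 6 blocks of data
triple = replication_overhead(3)    # three full copies, as in classic GFS

# Any 6 of the 9 blocks can rebuild the data, so RS(6, 3) survives 3 lost
# blocks, while 3-way replication survives 2 lost copies.
savings = 1 - rs / triple           # fraction of raw storage saved
```

Under these assumptions the encoded data costs 1.5x instead of 3x, half the raw storage, and the gap widens when every cluster in a cross-cluster setup would otherwise keep its own full set of replicas.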