CSE 6350 File and Storage System Infrastructure in Data Centers




  1. CSE 6350 File and Storage System Infrastructure in Data Centers, Supporting Internet-wide Services. FAWN: A Fast Array of Wimpy Nodes. Presenter: Abhishek Sharma

  2. Topics Introduction. FAWN Architecture. FAWN-DS Basic Functions: Store, Lookup, Delete. Questions.

  3. Introduction A Fast Array of Wimpy Nodes (FAWN) is a cluster architecture for low-power, data-intensive computing. FAWN couples low-power embedded CPUs with small amounts of local flash storage, and balances computation and I/O capabilities to enable efficient, massively parallel access to data. A FAWN cluster can handle roughly 350 key-value queries per joule of energy.

  4. Introduction Key-value storage systems are growing in both size and importance; they are now a critical part of major Internet services such as Amazon (Dynamo), LinkedIn (Voldemort), and Facebook (memcached). The workloads these systems support share several characteristics: they are I/O intensive, not computation intensive; they require random access over large datasets; and they are massively parallel, with thousands of concurrent operations.

  5. Question 1: “The workloads these systems support share several characteristics: they are I/O, not computation, intensive, requiring random access over large datasets,......, and the size of objects stored is typically small.” Why do workloads with these characteristics represent a challenge to the system design? ➔ The objects can be as small as 1 KB (for images) or hundreds of bytes (for wall posts, Twitter messages, etc.). Maintaining metadata for so many small objects in memory is a major challenge because RAM is limited, while keeping that metadata on disk would require multiple disk reads per access, which is not an efficient design choice for such systems.

  6. Design Overview FAWN-KV Architecture

  7. Question 2: “The key design choice in FAWN-KV is the use of a log-structured per-node datastore called FAWN-DS that provides high performance reads and writes using flash memory.” “These performance problems motivate log-structured techniques for flash filesystems and data structures.” What key benefit does a log-structured data organization bring to the KV store? ➔ A log structure makes writes highly efficient because they are all sequential. It also speeds up recovery: metadata can be appended to the log periodically, and these entries act as checkpoints from which recovery can resume, rather than scanning the entire log.

  8. FAWN Data Store FAWN-DS is a log-structured key-value store. It is designed to work on flash storage with limited DRAM. All writes to the data store are sequential, and random reads require only a single random access.

  9. FAWN Data Store FAWN-DS maintains an in-DRAM hash table that maps keys to offsets in the append-only Data Log on flash. FAWN-DS uses 160-bit keys but stores only a fragment of each key in memory to find its location in the log. It extracts two fields from the 160-bit key: the i low-order bits of the key (the index bits) select the bucket from the table, and the next 15 low-order bits form the key fragment. Each hash table bucket is only 6 bytes: a 15-bit key fragment, a valid bit, and a 4-byte pointer to the location in the Data Log.
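The index-bit and key-fragment extraction described above can be sketched as follows. This is not the authors' code; the number of index bits (i = 16, giving 2^16 buckets) is an assumed example value, while the 15-bit fragment size comes from the paper.

```python
I_BITS = 16     # hypothetical number of index bits (2**16 buckets)
FRAG_BITS = 15  # key-fragment size, per the paper

def bucket_and_fragment(key_160: int):
    """Split a 160-bit key into (bucket index, 15-bit key fragment)."""
    index = key_160 & ((1 << I_BITS) - 1)                     # i low-order bits
    fragment = (key_160 >> I_BITS) & ((1 << FRAG_BITS) - 1)   # next 15 bits
    return index, fragment
```

With this split, the in-memory table never stores the full 160-bit key; the full key lives only in the log entry on flash.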

  10. Question 5: Store, Lookup and Delete

  11. ➔ Store: When inserting a new key-value pair, FAWN-DS appends an entry to the Data Log, updates the corresponding hash table entry to point to this offset within the log, and sets the valid bit to true. If the key already existed, the old value is now orphaned (no hash entry points to it) and will be removed later by garbage collection. ➔ Lookup: FAWN-DS starts by locating the bucket using the index bits and comparing the key against the key fragment. If the fragment does not match, it uses hash chaining to continue searching the hash table. Once it finds a matching key fragment, it reads the record off the flash. If the full key stored in the record matches, the operation is complete; otherwise, FAWN-DS resumes its hash-chaining search of the in-memory hash table and examines additional records. ➔ Delete: FAWN-DS invalidates the hash entry corresponding to the key by clearing the valid flag, and writes a delete entry to the end of the data file.
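The three operations above can be sketched with a minimal in-memory model. This is an assumed simplification, not the FAWN-DS implementation: the "flash log" is a Python list, the index maps full keys to offsets (no key fragments or hash chaining), and garbage collection is elided.

```python
class FawnDS:
    def __init__(self):
        self.log = []     # append-only data log (flash in real FAWN-DS)
        self.index = {}   # key -> offset of the latest log entry

    def store(self, key, value):
        # Append to the log; any old entry for this key becomes orphaned
        # and would be reclaimed later by garbage collection.
        self.log.append((key, value))
        self.index[key] = len(self.log) - 1

    def lookup(self, key):
        off = self.index.get(key)
        if off is None:
            return None
        _, value = self.log[off]    # a single random read off the log
        return value

    def delete(self, key):
        # Write a delete entry to the log, then invalidate the index entry.
        self.log.append((key, None))
        self.index.pop(key, None)
```

Note that both store and delete only ever append to the log: all writes stay sequential, which is the property that makes the log structure flash-friendly.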

  12. Question 3: “To provide this property, FAWN-DS maintains an in-DRAM hash table (Hash Index) that maps keys to an offset in the append-only Data Log on flash.” What are potential issues of the design? [consider metadata size and volatility of DRAM] ➔ For every key-value pair we maintain metadata in RAM. With many small objects, say 4 KB each on a 1 TB flash disk, we can store 250 million keys, which requires about 1.5 GB of RAM just for metadata. ➔ In case of a system failure we could lose the hash table, and rebuilding it from scratch could take a long time. To overcome this, FAWN saves the hash table into the log periodically, so that after a failure the recovery process can start from the last saved copy of the hash table, which makes recovery faster.
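A back-of-the-envelope check of the metadata figure above, using the slide's round decimal numbers (1 TB flash, 4 KB objects, 6-byte hash-table buckets):

```python
flash_bytes = 10**12        # 1 TB of flash
object_bytes = 4_000        # 4 KB per object (decimal, matching the slide)
bucket_bytes = 6            # per in-DRAM hash-table entry

n_objects = flash_bytes // object_bytes   # 250 million objects
ram_bytes = n_objects * bucket_bytes      # 1.5 GB of DRAM just for the index
```

The smaller the objects, the worse this ratio gets, which is why small-object workloads stress the in-memory index.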

  13. Question 4: “It stores only a fragment of the actual key in memory to find a location in the log.” Is there a concern about correctness in this design? ➔ No. It is possible that a key fragment matches while the actual key does not, requiring an extra read from flash, but with a 15-bit key fragment in memory the authors note this happens on average only once in 32,768 accesses.

  14. Question 6: “As an optimization, FAWN-DS periodically checkpoints the index by writing the hash index and a pointer to the last log entry to flash.” Why does this checkpointing help with recovery efficiency? Why is a delete entry needed in the log for correct recovery? ➔ Because data is stored sequentially in the log, checkpoints make recovery faster after a failure: they act as a starting point for reconstructing the in-memory hash table, so only log entries written after the checkpoint need to be replayed. ➔ A delete entry is required in the log because, if the in-memory hash table is lost in a failure and the last checkpoint (the hash information saved in the log) predates the delete, recovery would otherwise be inconsistent: the recovered hash table would show the deleted data as still available (assuming garbage collection has not run in between). For this reason, every delete operation writes a delete entry for the key to the log.
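The recovery path described above can be sketched as a replay of the log tail past the last checkpoint. The record format here (a list of ("put"/"delete", key, value) tuples, with the checkpoint as a saved index copy plus the offset it covers) is an assumed simplification, not the FAWN-DS on-flash layout.

```python
def recover(checkpoint_index, checkpoint_offset, log):
    """Rebuild the in-memory index from a checkpoint plus the log tail."""
    index = dict(checkpoint_index)
    # Replay only the entries written after the checkpoint was taken.
    for off in range(checkpoint_offset, len(log)):
        op, key, _value = log[off]
        if op == "put":
            index[key] = off
        else:
            # Delete entry: without it in the log, replay from a stale
            # checkpoint would resurrect the deleted key.
            index.pop(key, None)
    return index
```

Replaying only the tail is what makes checkpointing pay off: recovery cost is proportional to the log written since the last checkpoint, not to the whole log.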

  15. References https://www.cs.cmu.edu/~fawnproj/papers/fawn-sosp2009.pdf
