1. FAWN - Fast Array of Wimpy Nodes
David G. Andersen et al.
Presented by: Ravi Kiran Boggavarapu (1001541261)

2. ● A cluster architecture for low-power, data-intensive computing.
● Wimpy nodes = a combination of low-power CPUs and small amounts of flash storage.
  ○ The design centers around log-structured datastores that provide high performance on flash.
● Goal of the architecture?
  ○ Increase performance while minimizing power consumption -- save on the data centers' electricity bills!
● How is performance measured?
  ○ The paper uses queries per Joule as its metric; FAWN handles roughly 350 key-value queries per Joule.

3. [Photo of a FAWN cluster, taken from http://www.cs.cmu.edu/~fawnproj/]

4. Trade-offs of using flash:
● Flash provides a non-volatile memory store with several significant benefits over traditional magnetic disks:
  ○ Fast random reads.
  ○ Efficient power consumption for I/O.
● But it also introduces challenges:
  ○ Small writes on flash are very expensive.
  ○ Updating a single page requires first erasing the entire block of pages and then writing the entire modified block.

5. Log-structured datastore
● An append-only file system.
● Writes are appended to a sequential Data Log.
● Reads require a single random access (see the sketch below).
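To make the idea concrete, here is a minimal Python sketch of an append-only data log. The DataLog class and its interface are illustrative assumptions for this presentation, not FAWN-DS's actual code:

    import os

    class DataLog:
        """Illustrative append-only log: writes go to the tail, reads seek once."""
        def __init__(self, path):
            self.f = open(path, "ab+")             # append-mode binary file

        def append(self, record: bytes) -> int:
            offset = self.f.seek(0, os.SEEK_END)   # sequential write at the tail
            self.f.write(record)
            self.f.flush()
            return offset                          # callers index by this offset

        def read(self, offset: int, length: int) -> bytes:
            self.f.seek(offset)                    # a single random access
            return self.f.read(length)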

6. Q1) "The workloads these systems support share several characteristics, they are:
  - I/O, not computation, intensive,
  - requiring random access over large datasets,
  - and the size of objects stored is typically small."
Why do workloads with these characteristics present a challenge to system design?

7. Ans - Q1)
● The increasing gap between CPU performance and I/O bandwidth.
● "For data-intensive workloads, storage, network, and memory bandwidth bottlenecks often cause low CPU utilization."
● The "small-write problem": multiple random disk writes (very slow).

8. Q2) "The key design choice in FAWN-KV is the use of a log-structured per-node datastore called FAWN-DS that provides high performance reads and writes using flash memory."
"These performance problems motivate log-structured techniques for flash filesystems and data structures."
What key benefit does a log-structured data organization bring to the KV store design?

9. Ans - Q2)
● get() = a random read.
● put() and delete() = appends.
● A log-structured design is an append-only filesystem.
● Hence, a log-structured datastore avoids small random writes on flash.

10. Q3) "To provide this property, FAWN-DS maintains an in-DRAM hash table (Hash Index) that maps keys to an offset in the append-only Data Log on flash."
What are the drawbacks of keeping this Hash Index in DRAM?

11. Ans - Q3)
● Large metadata: storing full keys would require long buckets (nodes) with multiple pointers per node (a linked list).
● RAM is volatile: in case of a failure, the whole Hash Table would be lost!

12. Q4) "It stores only a fragment of the actual key in memory to find a location in the log;"
Is there a concern about the correctness of this design?

13. Ans - Q4)
● What if multiple keys share the same key fragment?
  ○ FAWN-DS reads the full key from the log and verifies it against the requested key.
  ○ Therefore, correctness is preserved (see the sketch below).
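A rough Python sketch of fragment-based lookup with full-key verification; the fragment width, the index layout (fragment -> candidate offsets), and the log.read_entry() interface are all illustrative assumptions:

    import hashlib

    def fragment(key: bytes, bits: int = 15) -> int:
        # Keep only a few low-order bits of the key's hash, so distinct keys
        # can collide on the same fragment.
        h = int.from_bytes(hashlib.sha1(key).digest()[-4:], "big")
        return h & ((1 << bits) - 1)

    def lookup(index: dict, log, key: bytes):
        # index maps a key fragment to candidate offsets in the data log.
        for offset in index.get(fragment(key), []):
            stored_key, value = log.read_entry(offset)  # one read from flash
            if stored_key == key:                       # verify the full key
                return value
        return None                                     # no candidate matched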

14. Q5) Explain the basic functions: Store, Lookup, Delete.

15. Ans - Q5)
● Store:
  ○ Appends an entry to the log and updates the corresponding hash table entry.
● Lookup:
  ○ Gets the offset from the hash entry, indexes into the Data Log, and returns the data blob.
● Delete:
  ○ Invalidates the hash entry by clearing its valid flag.
  ○ Appends a Delete entry to the log.
● Why append a Delete entry? Discussed in the answer to the next question.
(A sketch of the three operations follows below.)
Figure copied from http://vijay.vasu.org/static/talks/fawn-sosp2009-slides.pdf
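A simplified Python sketch of the three operations over an append-only log. A plain dict stands in for the in-DRAM Hash Index, and the log's append_entry()/read_entry() methods are assumed interfaces, not the authors' API:

    class WimpyStore:
        """Illustrative FAWN-DS-like store, not the paper's implementation."""
        def __init__(self, log):
            self.log = log
            self.index = {}                             # key -> offset in the log

        def store(self, key, value):
            offset = self.log.append_entry(key, value)  # append to the Data Log
            self.index[key] = offset                    # update the hash entry

        def lookup(self, key):
            offset = self.index.get(key)
            if offset is None:                          # absent or invalidated
                return None
            _, value = self.log.read_entry(offset)     # index into the Data Log
            return value

        def delete(self, key):
            self.log.append_entry(key, None)            # Delete entry kept on flash
            self.index.pop(key, None)                   # invalidate in-DRAM entry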

16. Q6) "As an optimization, FAWN-DS periodically checkpoints the index by writing the Hash Index and a pointer to the last log entry to flash."
Why does this checkpointing help with recovery efficiency? Why is a Delete entry needed in the log for a correct recovery?

17. Ans - Q6)
● How does checkpointing help recovery efficiency?
  ○ After a failure, only the log contents written after the checkpoint must be replayed to rebuild the Hash Index.
● Why the Delete entry?
  ○ Fault tolerance: during replay, the Delete entry tells recovery that the key was removed, so it is not resurrected.
  ○ Appending the Delete entry (instead of updating the log in place) avoids random writes to flash.
(See the recovery sketch below.)
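A hedged Python sketch of checkpoint-based recovery; checkpoint_index, checkpoint_offset, and the log.scan() iterator are illustrative assumptions about the interfaces involved:

    def recover(checkpoint_index: dict, checkpoint_offset: int, log) -> dict:
        # Start from the Hash Index saved at the checkpoint, then replay only
        # the log suffix written after that point.
        index = dict(checkpoint_index)
        for offset, key, value in log.scan(checkpoint_offset):
            if value is None:                           # Delete entry: drop the key
                index.pop(key, None)
            else:                                       # Store entry: latest wins
                index[key] = offset
        return index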

18. Thank you
