Block-level RAID is dead

Raja Appuswamy, David C. van Moolenbroek, Andrew S. Tanenbaum
Vrije Universiteit, Amsterdam
June 22, 2010

Traditional storage stack
- Originally one file system per disk
- Later, a RAID layer was introduced
  - Block-level RAID and volume managers
- The storage stack has remained the same for decades
- Compatibility-driven integration has fatal flaws
[Diagram: File system / SW RAID / Disk driver]

Problem 1: Silent data corruption
- Disks exhibit fail-partial failure modes
  - Lost, torn, misdirected writes
- Such failures result in silent data corruption
- Checksumming algorithms fail to detect corruption
  - Most algorithms detect only a subset of all failure modes
  - Parental checksumming detects all classes of failures
- Parental checksumming fails with block-level RAID
  - RAID-initiated reads are unverified
  - RAID-initiated reads propagate corruption
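The difference between an in-block checksum and a parental checksum can be sketched with a toy lost write. This is only an illustration under stated assumptions: the `Disk` class, the block layout, and the `parent_ptr` dictionary are hypothetical stand-ins, not structures from the talk.

```python
import zlib


class Disk:
    """Toy disk (hypothetical): a list of blocks that can silently drop a write."""

    def __init__(self, nblocks):
        self.blocks = [b""] * nblocks
        self.drop_next_write = False  # set to True to simulate a lost write

    def write(self, n, data):
        if self.drop_next_write:
            self.drop_next_write = False
            return  # the caller sees success, but nothing reaches the platter
        self.blocks[n] = data

    def read(self, n):
        return self.blocks[n]


def self_checksummed(payload):
    # In-block scheme: the CRC is stored inside the same block as the data.
    return zlib.crc32(payload).to_bytes(4, "big") + payload


def self_check_ok(block):
    return zlib.crc32(block[4:]) == int.from_bytes(block[:4], "big")


disk = Disk(8)
disk.write(3, self_checksummed(b"version 1"))

# Parental scheme: the parent (e.g. an inode) keeps the child's checksum.
parent_ptr = {"block": 3, "crc": zlib.crc32(disk.read(3))}

# Update the block, but the disk loses the write.
new_block = self_checksummed(b"version 2")
disk.drop_next_write = True
disk.write(3, new_block)
parent_ptr["crc"] = zlib.crc32(new_block)  # parent is updated as if the write succeeded

stale = disk.read(3)
in_block_verdict = self_check_ok(stale)                      # stale data is self-consistent
parental_verdict = zlib.crc32(stale) == parent_ptr["crc"]    # parent expects the new data

print("in-block checksum accepts stale block:", in_block_verdict)
print("parental checksum accepts stale block:", parental_verdict)
```

The stale block still carries a matching in-block CRC, so only the checksum held by the parent exposes the lost write. A RAID layer that reads the block on its own, without going through the parent, never performs that second check.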

Problem 2: Heterogeneity issues
- Integration of new devices is an interesting problem
- Building a device-specific FS
  - Not compatible with block-based RAID
- Building a translation layer
  - Widens the “information gap”
  - Duplication of functionality

Problem 3: Device failure
- Traditional RAID fails ungracefully
- Graceful degradation has two requirements
  - Selective metadata replication
  - Fault-isolated file placement
- Semantically unaware traditional RAID cannot fail gracefully

Problem 4: Administration nightmare
- Too many volume management abstractions
  - PVs, VGs, LVs, FSes, etc.
  - Simple tasks need several error-prone steps
- Too many tunable parameters
  - Chunk size, stripe width, LV size, etc.
  - Improper configuration leads to bad performance
- Coarse-grained policy specification
  - Need more flexibility (per file, directory or volume)

Problem 5: System failure
- Crashes/power failures result in “write holes”
  - HW RAID uses NVRAM to sidestep this issue
  - Software RAID cannot rely on NVRAM
- Whole-disk resynchronization is impractical
- Journaling duplicates functionality and affects performance
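A write hole can be shown with a few lines of XOR arithmetic. The sketch below assumes a single RAID 5 style stripe with two data chunks and one parity chunk; the chunk contents and variable names are made up for illustration.

```python
def xor(*chunks):
    """Byte-wise XOR of equally sized chunks."""
    out = bytearray(len(chunks[0]))
    for chunk in chunks:
        for i, byte in enumerate(chunk):
            out[i] ^= byte
    return bytes(out)


# One stripe across three disks: two data chunks and their parity.
d0, d1 = b"AAAA", b"BBBB"
parity = xor(d0, d1)
assert xor(d0, parity) == d1  # before the crash, d1 is recoverable

# Update d0. The data write reaches the disk, then the system crashes
# before the matching parity write is issued.
d0 = b"CCCC"
# (crash here: parity still reflects the old d0)

stripe_consistent = xor(d0, d1) == parity

# Later the disk holding d1 fails and RAID rebuilds it from d0 and parity.
d1_rebuilt = xor(d0, parity)

print("stripe consistent after crash:", stripe_consistent)
print("rebuilt d1:", d1_rebuilt, "expected:", b"BBBB")
```

The rebuilt chunk differs from what was stored, even though `d1` itself was never written during the crash. Without NVRAM or a journal, the array cannot tell which stripes are in this state, which is why the alternative is a whole-disk resynchronization.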

Loris - the new storage stack
- File-based interface between layers
  - Each file has a unique file identifier
  - Each file has a set of attributes
- File-oriented requests: create, delete, read, write, truncate, getattr, setattr, sync
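The eight file-oriented requests can be pictured as a small interface that every layer exposes to the layer above it. The following is a minimal in-memory sketch; the slide only names the operations, so the class name, argument lists, and attribute representation here are assumptions.

```python
class InMemoryFileLayer:
    """Hypothetical layer exposing the eight file-oriented requests."""

    def __init__(self):
        self._data = {}   # file identifier -> bytearray with the contents
        self._attrs = {}  # file identifier -> dict of attributes

    def create(self, file_id, attrs=None):
        if file_id in self._data:
            raise FileExistsError(file_id)
        self._data[file_id] = bytearray()
        self._attrs[file_id] = dict(attrs or {})

    def delete(self, file_id):
        del self._data[file_id]
        del self._attrs[file_id]

    def read(self, file_id, offset, length):
        return bytes(self._data[file_id][offset:offset + length])

    def write(self, file_id, offset, data):
        buf = self._data[file_id]
        if len(buf) < offset:  # zero-fill a gap before the write position
            buf.extend(b"\0" * (offset - len(buf)))
        buf[offset:offset + len(data)] = data

    def truncate(self, file_id, size):
        buf = self._data[file_id]
        if size < len(buf):
            del buf[size:]
        else:
            buf.extend(b"\0" * (size - len(buf)))

    def getattr(self, file_id):
        return dict(self._attrs[file_id])

    def setattr(self, file_id, **attrs):
        self._attrs[file_id].update(attrs)

    def sync(self):
        pass  # nothing to flush in this in-memory toy


layer = InMemoryFileLayer()
layer.create(1, {"policy": "raid1"})  # the attribute name is illustrative only
layer.write(1, 0, b"hello world")
layer.truncate(1, 5)
print(layer.read(1, 0, 100), layer.getattr(1))
```

Because every layer speaks in terms of file identifiers and attributes rather than block numbers, a layer can be stacked on another without losing track of which file a request belongs to.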

Modular split and reliable flip (1)
[Diagram: File system / SW RAID / Disk driver]

Modular split and reliable flip (2)
[Diagram: File system / SW RAID / Disk driver]

Loris - the new storage stack
- POSIX call processing
- Directory handling
- Data caching
- RAID-like file multiplexing
- Logical policy storage
- Metadata caching
- Parental checksums
- On-disk layout

Solution to problem 1: End-to-end data integrity
- Physical layer converts fail-partial to fail-stop failures
  - Physical layer verifies all requests alike
- RAID algorithms provide recovery from fail-stop failures
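The division of labour on this slide can be sketched as two cooperating layers: a physical layer that turns any checksum mismatch into an explicit error, and a mirroring layer above it that treats the error like a failed device. Class and method names are hypothetical, and the checksum is kept next to the data purely to keep the example short (the talk places parental checksums in the physical layer's metadata).

```python
import zlib


class ChecksumError(IOError):
    """Raised instead of returning data that fails verification."""


class PhysicalLayer:
    """Hypothetical per-device layer that verifies every read."""

    def __init__(self):
        self._files = {}  # file identifier -> (data, expected CRC)

    def write(self, file_id, data):
        self._files[file_id] = (bytes(data), zlib.crc32(data))

    def read(self, file_id):
        data, crc = self._files[file_id]
        if zlib.crc32(data) != crc:
            # Fail-partial becomes fail-stop: no corrupt data leaves this layer.
            raise ChecksumError(f"file {file_id} failed verification")
        return data

    def corrupt(self, file_id):
        """Test hook: flip the first byte (of non-empty data), keep the CRC."""
        data, crc = self._files[file_id]
        self._files[file_id] = (bytes([data[0] ^ 0xFF]) + data[1:], crc)


class MirrorLogicalLayer:
    """Hypothetical RAID 1 style layer that mirrors files across devices."""

    def __init__(self, replicas):
        self.replicas = replicas

    def write(self, file_id, data):
        for replica in self.replicas:
            replica.write(file_id, data)

    def read(self, file_id):
        failed = []
        for replica in self.replicas:
            try:
                data = replica.read(file_id)
            except ChecksumError:
                failed.append(replica)
                continue
            for bad in failed:  # repair the replicas that reported an error
                bad.write(file_id, data)
            return data
        raise ChecksumError(f"no valid replica of file {file_id}")


dev_a, dev_b = PhysicalLayer(), PhysicalLayer()
mirror = MirrorLogicalLayer([dev_a, dev_b])
mirror.write(7, b"important data")
dev_a.corrupt(7)
print(mirror.read(7))
```

Since the mirroring layer reads through the physical layer like any other client, its own recovery reads are verified too, which is the property block-level RAID lacks.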

Solution to problem 2: Embracing heterogeneity
- Device-specific physical layers
  - Can exploit device access characteristics
  - Eliminate multiple translation steps
- RAID and volume management across device families
  - File abstraction hides device-specific vagaries
  - No need to reimplement RAID algorithms per device family

Solution to problem 3: Graceful failure
- Directories replicated on all devices
  - Naming layer chooses RAID 1 policy
- Zero-effort fault-isolated placement
[Diagram: three devices]
  Device 1: DIRECTORY FILE, FILE 1, FILE 2
  Device 2: DIRECTORY FILE, FILE 1, FILE 3
  Device 3: DIRECTORY FILE, FILE 2, FILE 3
- 66% availability under two failures!
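The 66% figure can be checked by enumerating every two-device failure. The sketch assumes the placement read off the slide's diagram labels (each of the three files stored on two of the three devices, with the directory file on all of them); the dictionary layout and function name are illustrative.

```python
from itertools import combinations

# Assumed placement: file name -> set of devices holding a copy.
placement = {
    "FILE 1": {1, 2},
    "FILE 2": {1, 3},
    "FILE 3": {2, 3},
}
devices = {1, 2, 3}


def available_fraction(placement, failed):
    """Fraction of files that still have a copy on a surviving device."""
    surviving = [name for name, devs in placement.items() if devs - failed]
    return len(surviving) / len(placement)


results = {
    failed: available_fraction(placement, set(failed))
    for failed in combinations(sorted(devices), 2)
}

for failed, fraction in results.items():
    print(f"devices {failed} failed: {fraction:.0%} of files available")
```

Whichever two devices fail, the surviving device still holds whole copies of two of the three files, plus a directory replica to find them by. A block-level array striped across the same three devices would lose everything in that situation.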

Solution to problem 4: Simplified administration
- File pools similar to storage pools
  - New device ⇒ new source of files
  - Completely automate error-prone tasks
  - “File systems/Volumes” share the file pool
- Flexible policy assignment
  - Logical layer provides mechanism
  - Any layer can assign policies
  - Policies per file, directory, or volume

Solution to problem 5: Crash recovery
- Traditional FS recovery techniques can be used
  - Journaling in physical layer (ext3)
  - Transactional COW (ZFS)
- Goal is to protect important user data
  - Metadata journaling does not help
  - Full data journaling is very expensive
  - Can we do selective data journaling?

Conclusion
- We examined block-level RAID along several dimensions
- We highlighted several fatal flaws
- We suggested a simple, yet fundamental change to the stack
- We showed how the new stack solves all issues by design
