SLIDE 1

Scale-out Edge Storage Systems with Embedded Storage Nodes to Get Better Availability and Cost-Efficiency At the Same Time

(aka “Embedded Storage at the Edge” Paper)

Jianshen Liu*, Matthew Leon Curry‡, Carlos Maltzahn*, Philip Kufeldt§ *UC Santa Cruz, ‡Sandia National Laboratories, §Seagate Technology

SLIDE 2

Challenges of Data Availability at the Edge

Edge Deployments

“Truck rolls” are expensive! Edge deployments face failures and environmental limitations.

SLIDE 3

Embedded Storage

Ethernet-attached storage devices integrated with computing resources

(Figure: computational storage devices, general-purpose (GP) servers, and embedded storage devices.)

An Ethernet SSD with NVMe-oF Interface*

* https://www.servethehome.com/marvell-88ss5000-nvmeof-ssd-controller-shown-with-toshiba-bics/

SLIDE 4

Failure Domains and Data Availability

The more independent failure domains a failover mechanism spans, the more available the data becomes.

(Figure: each GP server contains multiple storage devices vs. embedded storage devices.)

Embedded storage devices are simpler and enable more nodes under the same cost/space/power restrictions.

SLIDE 5

The Analytical Model

(Diagrams: Server-based Storage System vs. Embedded Storage System.)

Goal: determine the availability of embedded storage relative to traditional servers.

Relative Benefit = Pdata-loss(server-based storage system) / Pdata-loss(embedded storage system)

Relative Benefit > 1 means embedded storage is better.

SLIDE 6

Our Analytical Model — Assumptions of System Configurations

◎ The units of deployment are homogeneous.
◎ Both systems have the same level of network redundancy and power redundancy for all nodes.
◎ Both systems use 3-way replication for data protection.
◎ Both systems use the copyset replication§ scheme instead of the random replication scheme.
◎ Servers and storage devices fail independently; therefore, we can use the Poisson distribution* to model the probabilities of hardware failures.

§ Cidon, Asaf, et al. "Copysets: Reducing the frequency of data loss in cloud storage." USENIX Annual Technical Conference (ATC '13), 2013.
* Wikipedia contributors. "Poisson distribution." Wikipedia, The Free Encyclopedia, 10 Mar. 2020.

It's not our work, but we apply this scheme to our model
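The independence assumption can be made concrete with a short sketch. This is not from the paper: the helper name `poisson_pmf` and the failure rate used below are illustrative only.

```python
from math import exp, factorial

def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k failure events when the expected
    number of failures over the observation window is lam."""
    return exp(-lam) * lam**k / factorial(k)

# Illustrative number (not from the paper): a component with an
# expected 0.02 failures per year, observed over one year.
lam = 0.02
p0 = poisson_pmf(0, lam)  # probability of no failure
p1 = poisson_pmf(1, lam)  # probability of exactly one failure
print(f"P(0 failures) = {p0:.4f}, P(1 failure) = {p1:.4f}")
```

Because independent Poisson processes add, the same helper models a group of components by summing their rates.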

SLIDE 7

Copyset Replication vs. Random Replication

(Diagrams: Relationships of Nodes with Random Replication; Relationships of Nodes with Copyset Replication. An edge means a node can store copies of the other node's data. Replication Factor r = 3.)

With random replication and a sufficient number of data chunks stored, data loss is nearly guaranteed if any combination of r nodes fails simultaneously.

With random replication, a node has replica set relationships with 5 nodes; with copyset replication, a node has replica set relationships with ≤ 2 nodes.

Reducing the number of replica sets can reduce the likelihood of data loss under a correlated failure.

SLIDE 8

Our Analytical Model — Assumptions of Model Parameters

◎ f: the failure rate of a storage device over the failure rate of the non-storage components (we call f the ratio of failure rates). For hard drives, f could be greater than 2, while for SSDs, f could be less than 1.
◎ c: the ratio of computing performance.
◎ n: the ratio of storage performance.
◎ r = 3 (3-way replication).

SLIDE 9

Our Analytical Model — Assumptions of Model Parameters

◎ The failure rate of the non-storage components is a parameter in the server-based system and in the embedded storage system.

SLIDE 10

Our Analytical Model — Assumptions of Model Parameters

◎ The failure rate of the storage component (in the server-based system) and the failure rate of a storage device (in the embedded storage system) are also parameters.

SLIDE 11

Our Analytical Model — Assumptions of Model Parameters

◎ f: the failure rate of a storage device over the failure rate of the non-storage components (we call f the ratio of failure rates). For hard drives, f could be greater than 2, while for SSDs, f could be less than 1.

SLIDE 12

Our Analytical Model — Assumptions of Model Parameters

◎ c: the ratio of computing performance. We need c units of an embedded storage device to get the same performance as a single server.

SLIDE 13

Our Analytical Model — Assumptions of Model Parameters

◎ n: the ratio of storage performance. n is the number of storage devices (≥ 2) in a server.

SLIDE 14

Our Analytical Model — Assumptions of Model Parameters

◎ r = 3 (3-way replication). We need at least 3 servers for 3-way replication.

SLIDE 15

Our Analytical Model — Assumptions of Model Parameters

◎ f: the ratio of failure rates. For hard drives, f could be greater than 2, while for SSDs, f could be less than 1.
◎ c: the ratio of computing performance.
◎ n: the ratio of storage performance.
◎ r = 3 (3-way replication).

How sensitive is the Relative Benefit to these parameters?

SLIDE 16

Evaluation

As an example, we evaluate the Relative Benefit of embedded storage with respect to the data unavailability caused by failures of exactly three components. A component can be:

  • A server
  • An embedded storage device
  • A storage component in a failure domain

Parameters:

  • f (the failure rate of the storage component over the failure rate of the non-storage components)
  • w (the number of nodes that have a replica set relationship with a node)
  • m (# of GP servers)
  • n (# of storage devices in a server)
  • the ratio of the # of embedded storage devices to the # of servers

Relative Benefit = Pdata-loss(server-based storage system) / Pdata-loss(embedded storage system)
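A toy calculation can illustrate the shape of this ratio. This is not the paper's model: it only compares the chance that three simultaneously failed nodes form a copyset, ignoring failure rates and server/device coupling, so it does not reproduce the paper's numbers. The function name and parameter values are invented for this sketch.

```python
from math import comb

def p_loss_given_3_failures(num_nodes: int, w: int, r: int = 3) -> float:
    """Toy estimate: probability that 3 simultaneously failed nodes
    form a copyset, under copyset replication with scatter width w.
    Copyset count follows the permutation construction: w/(r-1)
    permutations, each contributing num_nodes/r copysets."""
    num_copysets = (w // (r - 1)) * (num_nodes // r)
    return num_copysets / comb(num_nodes, r)

m, n, w = 10, 4, 4  # servers, devices per server, scatter width
p_server = p_loss_given_3_failures(m, w)        # 10 failure domains
p_embedded = p_loss_given_3_failures(m * n, w)  # 40 failure domains
print(f"toy relative benefit = {p_server / p_embedded:.1f}")
```

More failure domains dilute the fraction of 3-node combinations that hit a copyset, which is why the ratio exceeds 1 even in this stripped-down setting.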

SLIDE 17

Evaluation — Spinning Media as Storage

◎ The failure rate of a storage device is 2x that of the non-storage components of a server (f = 2)
◎ The number of nodes that have a replica set relationship with a node is 4 (w = 4)

[Vishwanath, et al. "Characterizing cloud computing hardware reliability." 2010]

  • The server-based system has m = 10 servers with n = 4 storage devices each; with c = n = 4, the embedded storage system has 10x4 = 40 devices, and the relative benefit is 7.1.
  • When each server instead has 12 storage devices and the embedded storage system has 17x10 = 170 devices, the relative benefit is 114.3.

(Charts: The Impact of Compute Aggregation on the Relative Benefit; The Impact of Storage Aggregation on the Relative Benefit.)

(Chart arrows indicate higher storage aggregation and higher compute aggregation.)

SLIDE 18

Evaluation — Solid-state Drives as Storage

◎ The failure rate of a storage device is 0.06x that of the non-storage components of a server (f = 0.06)
◎ The number of nodes that have a replica set relationship with a node is 4 (w = 4)

[Xu, Erci, et al. "Lessons and actions: What we learned from 10k SSD-related storage system failures." 2019]

  • The server-based system has m = 10 servers with n = 4 storage devices each; the relative benefit is 20.7.

(Charts: The Impact of Storage Aggregation on the Relative Benefit; The Impact of Compute Aggregation on the Relative Benefit.)

SLIDE 19

Insights (part 1/5)

1. The higher the storage aggregation of a server, the higher the relative benefit of embedded storage.

Server-based Storage System: 10 servers with n storage devices each, resulting in 10 failure domains.
Embedded Storage System: 10 x n devices, resulting in 10 x n failure domains.

SLIDE 20

Insights (part 2/5)

2. The smaller the storage system, the greater the relative benefit of embedded storage.

Server-based Storage System: m servers with 4 storage devices each, resulting in m failure domains.
Embedded Storage System: 4 x m devices, resulting in 4 x m failure domains.
(The total # of storage devices in the two systems is the same.)

SLIDE 21

Insights (part 3/5)

3. The lower the failure rate of a storage device, the higher the relative benefit of embedded storage.

Server-based Storage System: 10 servers with n storage devices each, resulting in 10 failure domains.
Embedded Storage System: 10 x n devices, resulting in 10 x n failure domains.

SLIDE 22

Insights (part 4/5)

4. The higher the compute aggregation of a server, the higher the relative benefit of embedded storage.

Server-based Storage System: 10 servers with 12 storage devices each.
Embedded Storage System: 10 x c devices.

c units of an embedded storage device can provide the same storage performance as a single server.

SLIDE 23

Insights (part 5/5)

5. The relationship between resource aggregation and the relative benefit is nonlinear:
  1) Doubling the storage aggregation of a server could triple the relative benefit.
  2) Doubling the compute aggregation of a server could quadruple the relative benefit.

SLIDE 24

Conclusions

◎ Embedded storage devices are simpler, making it possible to have more independent failure domains.
◎ Storage systems with more independent failure domains can improve data availability.
◎ A great design point, but many unsolved challenges remain! (e.g., exploring the balance between availability and storage performance)

SLIDE 25

Thank you!

Questions?

Jianshen Liu jliu120@ucsc.edu https://cross.ucsc.edu (Eusocial Storage Devices)

This work was supported in part by NSF grants OAC-1836650, CNS-1764102, and CNS-1705021, and by the Center for Research in Open Source Software (cross.ucsc.edu). Sandia National Laboratories is a multimission laboratory managed and operated by National Technology and Engineering Solutions of Sandia, LLC, a wholly owned subsidiary of Honeywell International, Inc., for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-NA0003525.

SLIDE 26

An Example of Copyset Replication

◎ A copyset is a set of nodes that stores all of the copies of a data chunk.
◎ Scatter width is the number of nodes the data of a node can be replicated to.
◎ Example: # of nodes m = 9, replication factor r = 3, scatter width w = 4.
◎ Each permutation increases the scatter width of a node by r - 1 = 2.
◎ The number of copysets is (w / (r - 1)) x (m / r) = 2 x 3 = 6:

{1,2,3}, {4,5,6}, {7,8,9}
{1,4,7}, {2,5,8}, {3,6,9}
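The permutation construction in this example can be sketched in a few lines. The two permutations below are hand-picked to reproduce the copysets listed above; the actual scheme of Cidon et al. draws permutations at random.

```python
def copysets_from_permutation(perm, r=3):
    """Chop a node permutation into consecutive groups of r nodes;
    each group becomes one copyset."""
    return [frozenset(perm[i:i + r]) for i in range(0, len(perm), r)]

# Hand-picked permutations matching the slide's example (m = 9, r = 3).
perms = [
    [1, 2, 3, 4, 5, 6, 7, 8, 9],
    [1, 4, 7, 2, 5, 8, 3, 6, 9],
]
copysets = [cs for p in perms for cs in copysets_from_permutation(p)]
# Each permutation adds r - 1 = 2 to a node's scatter width,
# so 2 permutations give w = 4 and 2 x (9 // 3) = 6 copysets.
print(sorted(tuple(sorted(c)) for c in copysets))
```

Checking node 1 against the result: it shares a copyset with nodes {2, 3, 4, 7}, i.e., a scatter width of exactly 4.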

SLIDE 27

Copyset Replication vs. Random Replication

◎ Number of copysets (3-way replication): Copyset Replication (CR) creates (w / 2) x (m / 3) copysets, while Random Replication (RR) can create up to C(m, 3).
◎ With a sufficient number of data chunks stored, random replication creates a failure domain for any combination of r nodes (r is the replication factor).
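The gap between the two counts can be made concrete with a small sketch. The helper names are mine, and the CR count assumes w is even and uses the permutation construction from the Copysets paper.

```python
from math import comb

def copyset_count_cr(m: int, w: int, r: int = 3) -> int:
    """Copyset replication: w // (r - 1) permutations,
    each contributing m // r copysets."""
    return (w // (r - 1)) * (m // r)

def copyset_count_rr(m: int, r: int = 3) -> int:
    """Random replication (worst case): with enough chunks stored,
    every r-subset of nodes eventually becomes a copyset."""
    return comb(m, r)

m, w = 9, 4
print(copyset_count_cr(m, w), copyset_count_rr(m))  # 6 vs 84
```

Even at m = 9 the difference is an order of magnitude, and it widens as m grows, since C(m, 3) is cubic in m while the CR count is linear.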

SLIDE 28

Our Analytical Model — Modeling the Two Systems

[Equations: Pdata-loss(server-based storage system) and Pdata-loss(embedded storage system), the probabilities of data loss of the two systems, with their parameter definitions.]