SelectiveEC: Selective Reconstruction in Erasure-coded Storage Systems

Liangliang Xu, Min Lyu, Qiliang Li, Lingjiang Xie, and Yinlong Xu
University of Science and Technology of China
HotStorage 2020

Distributed Storage Systems (DSSes)
• Data is important
  - Large scale
  - Exponential growth
• DSSes are the core infrastructure
  - Thousands of nodes
  - "Fat nodes": up to 72 TB of storage (about 1.5M chunks) per node in Pangu [1]
  - Frequent failures: disk faults, cluster crashes, network failures, artificial errors
[1] ATC 2019: Dayu: Fast and Low-interference Data Recovery in Very-large Storage Systems

Erasure Coding (EC)
• EC is widely adopted in DSSes
  - Provides high reliability at low storage cost
• (k, m)-Reed-Solomon (RS) codes
  - k data chunks
  - m parity chunks
  - Tolerate any m node failures
Figure: writing a (3,2)-RS stripe. The client writes D0, D1, D2, P0, P1 to Node0 through Node4, one chunk per node.
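
To make the "high reliability with low storage cost" point concrete, the sketch below compares a (k, m)-RS stripe with plain replication. The function names `rs_profile` and `replication_profile` are illustrative and not from the talk; the arithmetic follows directly from the definitions of k data chunks and m parity chunks.

```python
def rs_profile(k: int, m: int) -> dict:
    """Summarise a (k, m)-RS stripe (illustrative helper, not from the talk)."""
    return {
        "chunks_per_stripe": k + m,        # k data chunks + m parity chunks
        "tolerated_failures": m,           # any m node failures are survivable
        "storage_overhead": (k + m) / k,   # bytes stored per byte of user data
    }


def replication_profile(copies: int) -> dict:
    """Same summary for n-way replication, for comparison."""
    return {
        "chunks_per_stripe": copies,
        "tolerated_failures": copies - 1,
        "storage_overhead": float(copies),
    }


if __name__ == "__main__":
    print("(3,2)-RS     :", rs_profile(3, 2))
    print("3 replicas   :", replication_profile(3))
```

Both schemes in this example tolerate two node failures, but the (3,2)-RS stripe stores 5/3 of the user data while three replicas store 3 times the user data.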

Reconstruction
① Reading chunks from source nodes
② Transferring data in network
③ Decoding
④ Writing decoded data
Figure: reconstructing a chunk of a (3,2)-RS stripe stored on Node0 through Node4. Three surviving chunks are read from their source nodes (①) and transferred over the network (②) to Node5, which decodes them (③) and writes the rebuilt D0 (④).

Breakdown of EC Reconstruction Time
• Settings
  - 28 nodes: 1 NN + 27 DNs
  - Quad-core 3.4 GHz Intel Core i5-7500 CPU
  - 8 GB RAM
  - 1 TB HDD
  - 1 Gbps switch (30 MB/s, 90 MB/s or 150 MB/s in Pangu [1])
  - 128 MB chunk size
Table: reconstructing a (3,2)-RS chunk in a 1 Gbps network
  Stage                                Time ratio
  Reading chunks from source nodes     0.68%
  Transferring data in network         85.23%
  Decoding                             7.82%
  Writing decoded data                 6.27%
• Network transfer contributes most to the reconstruction time
[1] ATC 2019: Dayu: Fast and Low-interference Data Recovery in Very-large Storage Systems
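
A rough back-of-the-envelope check shows why the transfer stage dominates. The sketch below computes only an idealised lower bound on the time to move k chunks into one replacement node; it assumes decimal megabytes, a fully available 1 Gbps downlink, and no protocol overhead, none of which is stated in the talk. The helper name `min_transfer_seconds` is hypothetical.

```python
def min_transfer_seconds(k: int, chunk_mb: float, link_gbps: float) -> float:
    """Idealised lower bound on the time to pull k chunks into one node.

    Assumptions (not from the talk): 1 Gbps = 125 MB/s of usable bandwidth,
    decimal megabytes, and the replacement node's downlink is the bottleneck.
    """
    link_mb_per_s = link_gbps * 1000 / 8
    return k * chunk_mb / link_mb_per_s


if __name__ == "__main__":
    # (3,2)-RS, 128 MB chunks, 1 Gbps link: 3 * 128 MB / 125 MB/s
    print(f"{min_transfer_seconds(3, 128, 1.0):.3f} s")
```

Under these assumptions a single task needs at least about three seconds of pure network transfer, and the figure grows when several tasks share the same uplink or downlink.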

Random Data Layout
• Random distribution
  - Load is balanced across a large number of stripes
• Reconstruction batch by batch
  - Limited network, disk I/O, CPU and memory resources
  - Optimal batch size: the number of live nodes (detailed analysis in the paper)

Random Data Layout
• Nonuniform data layout in a batch
  - Unbalanced upstream bandwidth occupation
Figure: random data layout of (3,2)-RS stripes across Node0 through Node7.

Random Data Layout
• Nonuniform choices of replacement nodes
  - Unbalanced downstream bandwidth occupation
Figure: random data layout of (3,2)-RS stripes across Node0 through Node7.
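
The imbalance described on the two slides above can be reproduced with a small simulation. This is a simplified sketch, not the authors' simulator: it assumes every task's surviving chunks sit on k+m-1 uniformly random live nodes, that k sources and one replacement node are chosen uniformly at random, and that the batch size equals the number of live nodes. The function name `random_batch_load` is illustrative.

```python
import random
from collections import Counter


def random_batch_load(num_nodes, k, m, num_tasks, seed=0):
    """Count per-node upstream/downstream chunk transfers for one random batch."""
    rng = random.Random(seed)
    live = list(range(num_nodes))          # live nodes only; the failed node is not modelled
    upstream, downstream = Counter(), Counter()
    for _ in range(num_tasks):
        # Nodes holding the k+m-1 surviving chunks of this stripe.
        survivors = rng.sample(live, k + m - 1)
        # Random reconstruction reads k of them ...
        for node in rng.sample(survivors, k):
            upstream[node] += 1
        # ... and writes to a random node that holds no chunk of the stripe.
        candidates = [n for n in live if n not in survivors]
        downstream[rng.choice(candidates)] += 1
    return upstream, downstream


if __name__ == "__main__":
    up, down = random_batch_load(num_nodes=8, k=3, m=2, num_tasks=8, seed=1)
    for node in range(8):
        print(f"Node{node}: upstream={up[node]} downstream={down[node]}")
```

A perfectly balanced batch of 8 tasks on 8 nodes would give every node 3 upstream reads and 1 downstream write; a random batch typically leaves some nodes overloaded and others idle.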

Goals
• Balanced distribution of source nodes
• Balanced distribution of replacement nodes
Figure: random data layout of (3,2)-RS stripes across Node0 through Node7.

SelectiveEC
• Schedule reconstruction tasks out of order
• Select source nodes dynamically
• Select replacement nodes dynamically

Graph Model
• Bipartite graph G_s = (T ∪ N, E) for the selection of source nodes
  - T: tasks, each having k+m-1 candidate source nodes
  - N: source nodes, i.e. all live nodes
  - (T_i, N_j) ∈ E iff a chunk of stripe T_i is stored on source node N_j
• Connections of tasks and live nodes
  - Nonuniform distribution of chunks
Figure: G_s = (T ∪ N, E) for (3, 2)-RS, with tasks in the top row and source nodes in the bottom row; labels at the source nodes: 4 5 7 5 5 1 1.
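
The graph model above maps naturally onto an adjacency dictionary. The sketch below is one possible representation, assuming the chunk placement is available as a mapping from stripe id to the nodes holding its chunks; the names `build_source_graph` and `node_degrees` are hypothetical.

```python
from collections import Counter


def build_source_graph(stripes, failed_node):
    """Build G_s: one task per stripe that lost a chunk on failed_node.

    stripes: dict mapping stripe id -> iterable of nodes holding its k+m chunks.
    Returns a dict mapping task (stripe id) -> set of live nodes holding a chunk.
    """
    graph = {}
    for stripe_id, nodes in stripes.items():
        if failed_node in nodes:
            graph[stripe_id] = set(nodes) - {failed_node}
    return graph


def node_degrees(graph):
    """Number of tasks connected to each source node (shows the nonuniformity)."""
    return Counter(node for nodes in graph.values() for node in nodes)


if __name__ == "__main__":
    # Toy (3,2)-RS placement; node 8 is the failed node.
    stripes = {
        "T1": [8, 0, 1, 2, 3],
        "T2": [8, 1, 2, 3, 4],
        "T3": [0, 1, 2, 3, 4],   # unaffected by the failure, so no task
    }
    g = build_source_graph(stripes, failed_node=8)
    print(g)
    print(node_degrees(g))
```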

Select k Source Nodes Dynamically
• Goal: balance upstream bandwidth occupation
• Using maximum flow to select k source nodes
  - Construct a flow graph FG_s
  - Find a maximum flow
Figure: maximum flow value = 17; no conflict in the chosen source connections.
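
The maximum-flow step can be sketched as follows. The slide does not spell out the edge capacities of FG_s, so this example assumes a common construction: source to task with capacity k, task to node with capacity 1 for every edge of G_s, and node to sink with a per-node limit `node_cap`, which is an assumed parameter. The Edmonds-Karp routine is a textbook algorithm swapped in for illustration, not necessarily what the authors use.

```python
from collections import defaultdict, deque


def select_sources(graph, k, node_cap=1):
    """graph: task -> set of candidate source nodes. Returns (flow, task -> chosen nodes)."""
    SRC, SNK = "source", "sink"
    cap = defaultdict(lambda: defaultdict(int))     # residual capacities
    for task, nodes in graph.items():
        cap[SRC][("T", task)] = k                   # each task needs k chunks
        for node in nodes:
            cap[("T", task)][("N", node)] = 1       # one chunk per (task, node) edge
            cap[("N", node)][SNK] = node_cap        # assumed per-node upstream limit
    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {SRC: None}
        queue = deque([SRC])
        while queue and SNK not in parent:
            u = queue.popleft()
            for v, c in list(cap[u].items()):
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if SNK not in parent:
            break
        path, v = [], SNK
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
        flow += bottleneck
    # A task -> node edge carries flow iff its reverse residual edge is positive.
    chosen = {
        task: sorted(n for n in nodes if cap[("N", n)][("T", task)] > 0)
        for task, nodes in graph.items()
    }
    return flow, chosen


if __name__ == "__main__":
    g = {"T1": {0, 1, 2, 3}, "T2": {0, 1, 2, 3}}
    print(select_sources(g, k=3, node_cap=2))
```

A task is saturated when it receives k units of flow; a maximum flow below k times the number of tasks means some tasks cannot get k conflict-free sources in this batch, which is what motivates swapping tasks.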

Schedule Reconstruction Tasks Out of Order
• Preparation work
  - Find the most unsaturated task
  - Compute an unsaturated list of source nodes
Figure: task to be replaced: T_7; unsaturated list: N_5, N_6, N_7.

Schedule Reconstruction Tasks Out of Order
• Schedule reconstruction tasks
  - Scan the reconstruction queue
  - Find a new task with more connections to the unsaturated list
  - Update FG_s
  - Find a maximum flow
• Achieve more balanced upstream bandwidth occupation
Figure: a new task replaces T_7; maximum flow value = 19.
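
The queue scan can be sketched as a simple overlap search. This is only one reading of the slide: it assumes "more connections" means strictly more edges into the unsaturated list than the task being replaced has, and it returns the first best candidate. The name `pick_replacement_task` and the parameter `current_overlap` are hypothetical.

```python
def pick_replacement_task(queue, unsaturated, current_overlap):
    """Scan queued tasks and pick one better connected to the unsaturated nodes.

    queue: list of (task_id, nodes holding the task's surviving chunks).
    unsaturated: nodes that still have spare upstream capacity.
    current_overlap: connections the task being replaced has to `unsaturated`.
    Returns the chosen task id, or None if no queued task is an improvement.
    """
    unsaturated = set(unsaturated)
    best_id, best_overlap = None, current_overlap
    for task_id, nodes in queue:
        overlap = len(set(nodes) & unsaturated)
        if overlap > best_overlap:          # strictly better than the best so far
            best_id, best_overlap = task_id, overlap
    return best_id


if __name__ == "__main__":
    queue = [
        ("T8", {1, 2, 5, 6}),
        ("T9", {0, 5, 6, 7}),
        ("T10", {0, 1, 2, 3}),
    ]
    # Unsaturated list N5, N6, N7 as in the slide's example.
    print(pick_replacement_task(queue, {5, 6, 7}, current_overlap=1))
```

After a swap, the flow graph is updated and the maximum flow is recomputed; in the slide's example the flow value rises from 17 to 19.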

Select Replacement Nodes Dynamically
• Construct bipartite graph G_r for the selection of replacement nodes
  - Complement of G_s
  - Find a perfect matching
  - Easy to find in large-scale DSSes
• Achieve load balance of replacement nodes
  - Balanced downstream bandwidth occupation
  - Balanced disk I/O, CPU and memory usage
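
The replacement-node step can be sketched with a standard augmenting-path bipartite matching. The example assumes G_r connects each task to every live node that holds no chunk of its stripe (the complement of G_s) and that each node may serve as the replacement for at most one task per batch. The algorithm choice (Kuhn's augmenting paths) and the name `match_replacements` are illustrative, not taken from the talk.

```python
def match_replacements(source_graph, live_nodes):
    """Match each task to a distinct replacement node outside its stripe.

    source_graph: task -> set of live nodes holding a chunk of the stripe (G_s).
    live_nodes: iterable of all live nodes.
    Returns task -> replacement node for every task that could be matched.
    """
    live_nodes = list(live_nodes)
    # G_r is the complement of G_s: nodes with no chunk of the task's stripe.
    candidates = {
        task: [n for n in live_nodes if n not in nodes]
        for task, nodes in source_graph.items()
    }
    owner = {}  # node -> task currently matched to it

    def try_assign(task, seen):
        for node in candidates[task]:
            if node in seen:
                continue
            seen.add(node)
            # Take a free node, or evict its owner if the owner can move elsewhere.
            if node not in owner or try_assign(owner[node], seen):
                owner[node] = task
                return True
        return False

    for task in candidates:
        try_assign(task, set())
    return {task: node for node, task in owner.items()}


if __name__ == "__main__":
    g_s = {"T1": {0, 1, 2, 3}, "T2": {2, 3, 4, 5}, "T3": {0, 1, 4, 5}}
    print(match_replacements(g_s, range(6)))
```

The matching is perfect when every task in the batch appears in the result, so each node receives at most one rebuilt chunk.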

Evaluation
• Implemented a simulation prototype of SelectiveEC
• The simulations run on a server with
  - Two 12-core Intel Xeon E5-2650 processors
  - 64 GB DDR4 memory
  - Linux 3.10.0
• (3,2)-RS stripes
• # of chunks in a "fat node": 100 times the number of live nodes
• DRP: the degree of recovery parallelism

The First Batch
• Small scale: the DRP of SelectiveEC is always greater than 0.975
• Large scale: SelectiveEC improves the DRP by up to 97.6%
Figure panels: large scale, small scale.

Full Batches
• Around 0.97 for SelectiveEC
• Around 0.50 for random reconstruction

Summary
• SelectiveEC, a balanced scheduling module
  - Schedules reconstruction tasks out of order
  - Selects source nodes dynamically
  - Selects replacement nodes dynamically
  - Improves load balance for single-failure recovery effectively
• Simulation results
  - Improves the degree of recovery parallelism significantly
• Future work
  - Deploy in practical systems
  - Optimize the algorithms to support multiple failures

Thanks for your attention!
Q&A
Liangliang Xu @ USTC
llxu@mail.ustc.edu.cn
