Storage Fabric
CS6453
Last week: NVRAM is going to change the way we think about storage.
Today: challenges of storage layers (SSDs, HDs) built to handle massive amounts of data; slowdowns in HDs and SSDs; enforcing policies for IO operations in Cloud architectures.
One disk is not enough to handle massive amounts of data.
Last week: efficient datacenter networks using a large number of cheap commodity switches.
Solution here: efficient IO performance using a large number of commodity storage devices.
RAID 0 achieves Nx performance, where N is the number of disks.
Is this for free?
When N becomes large, the probability of a disk failure becomes large as well.
RAID 0 does not tolerate failures.
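As a rough illustration of why this is not free, here is a minimal sketch (not from the slides) that assumes an independent failure probability per disk and shows how quickly the chance of losing a RAID 0 array grows with N:

```python
# Minimal sketch (assumption: independent per-disk failure probability p).
# A RAID 0 stripe of N disks survives only if every disk survives, so the
# array failure probability grows quickly with N.
def raid0_failure_prob(n_disks: int, p_disk: float = 0.02) -> float:
    """Probability that at least one of n_disks fails (array data loss)."""
    return 1.0 - (1.0 - p_disk) ** n_disks

for n in (1, 10, 100, 1000):
    print(f"N={n:4d}: P(array failure) = {raid0_failure_prob(n):.3f}")
```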
RAID 1 (K-way mirroring) achieves (K-1)-fault tolerance with Kx disks.
Is this for free?
It requires Kx more disks (e.g., to tolerate 1 failure you need 2x more disks than RAID 0).
RAID 1 does not utilize resources efficiently.
Parity/erasure-coded RAID (e.g., RAID 6) achieves K-fault tolerance with N+K disks.
Efficient utilization of disks (not as great as RAID 0).
Fault tolerance (not as great as RAID 1).
Is this for free?
Reconstruction cost: the number of disks that must be read in case of failure(s).
RAID 6 has a reconstruction cost of N: rebuilding a failed drive requires reading from essentially all the other drives in the group.
Erasure Coding in Windows Azure Storage [Huang, 2012]
Exploit point:
Prob[1 failure] ≫ Prob[2 or more failures]
Solution: construct an erasure-coding technique with a low reconstruction cost for 1 failure.
1.33x storage overhead (relatively low). Tolerates up to 3 failures among 16 storage devices. Reconstruction cost of 6 for 1 failure and 12 for 2+ failures.
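The arithmetic behind these numbers can be sketched as follows; the layout parameters (12 data fragments in 2 local groups of 6, one local parity per group, plus 2 global parities) are the LRC(12,2,2) configuration from [Huang, 2012], and the code itself is only an illustration:

```python
# Minimal sketch of the arithmetic, assuming the LRC(12,2,2) layout from
# [Huang, 2012]: 12 data fragments in 2 local groups of 6, one local parity
# per group, plus 2 global parities.
data_fragments = 12
local_parities = 2            # one per local group
global_parities = 2
total = data_fragments + local_parities + global_parities   # 16 devices

storage_overhead = total / data_fragments                   # 16/12 = 1.33x

# Rebuilding a single lost data fragment only needs its local group:
# the 5 surviving data fragments plus the local parity = 6 reads.
single_failure_cost = data_fragments // 2                   # 6

print(f"overhead = {storage_overhead:.2f}x, "
      f"single-failure reconstruction cost = {single_failure_cost}")
```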
We have seen how failures are handled with reconstruction. What about slowdowns in HDs (or SSDs)?
A slowdown of a disk (with no failure) can have a significant impact on overall performance.
Questions:
Do HDs or SSDs exhibit transient slowdowns?
Are disk slowdowns frequent enough to affect overall performance?
What causes slowdowns?
How do we deal with slowdowns?
[Figure: a RAID group with data drives D … D and parity drives P, Q]

Dataset studied:
                         Disk          SSD
#RAID groups             38,029        572
#Data drives per group   3-26          3-22
#Data drives             458,482       4,069
Total drive hours        857,183,442   7,481,055
Total RAID hours         72,046,373    1,072,690
[Figure: CDF of Slowdown (Disk), x-axis: slowdown from 1x to 8x]
Hourly average I/O latency per drive: L_i
Slowdown: S_i = L_i / L_median
Tail: T = S_max (the largest slowdown within a RAID group in that hour)
Slow drive hours: S_i ≥ 2

S ≥ 2 at the 99.8th percentile
S ≥ 1.5 at the 99.3th percentile
T ≥ 2 at the 97.8th percentile
T ≥ 1.5 at the 95.2th percentile
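As a minimal sketch (not the paper's code), these metrics can be computed per RAID group and hour from the hourly average latencies; the example latencies below are made up:

```python
# Minimal sketch: slowdown and tail metrics for one RAID group in one hour.
# `hourly_latency` holds the hourly average I/O latency L_i of each drive.
import statistics

def slowdowns(hourly_latency):
    """Per-drive slowdown S_i = L_i / L_median within the RAID group."""
    l_median = statistics.median(hourly_latency)
    return [l / l_median for l in hourly_latency]

def tail(hourly_latency):
    """Tail T = the largest slowdown in the group for this hour."""
    return max(slowdowns(hourly_latency))

latencies_ms = [4.1, 4.3, 4.0, 4.2, 12.9]   # hypothetical: one drive ~3x slower
s = slowdowns(latencies_ms)
print([round(x, 2) for x in s], "tail =", round(tail(latencies_ms), 2))
```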
SSDs exhibit even more slowdowns
[Figure: CDF of slowdown interval length in hours, Disk vs. SSD]
Slowdowns are transient.
40% of HD slowdowns last ≥ 2 hours.
12% of HD slowdowns last ≥ 10 hours.
Many slowdowns happen in consecutive hours (i.e., they persist for a while).
[Figure: CDF of inter-arrival period between slowdowns in hours, Disk vs. SSD]
90% of Disk slowdowns are within 24 hours of another slowdown of the same Disk.
More than 80% of SSD slowdowns are within 24 hours of another slowdown of the same SSD.
Slowdowns recur on the same drives relatively close together in time.
[Figure: CDF of rate imbalance (RI) within slow drive hours (S_i ≥ 2), Disk vs. SSD]
Rate imbalance: RI_i = IORate_i / IORate_median
Rate imbalance does not seem to be the main cause of slow disks.
[Figure: CDF of size imbalance (SI) within slow drive hours (S_i ≥ 2), Disk vs. SSD]
Size imbalance: SI_i = IOSize_i / IOSize_median
Size imbalance does not seem to be the main cause of slow disks.
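A minimal sketch of this check (not the paper's code): for each slow drive hour, compare the drive's I/O rate and I/O size against the group medians; the per-drive tuples below are hypothetical:

```python
# Minimal sketch: do slow drive hours coincide with rate or size imbalance?
# `drives` is a hypothetical list of (latency, io_rate, io_size) per drive
# for one hour in one RAID group.
import statistics

def imbalance(values):
    """Per-drive imbalance relative to the group median (RI or SI)."""
    med = statistics.median(values)
    return [v / med for v in values]

def slow_hour_imbalances(drives, threshold=2.0):
    lat, rate, size = zip(*drives)
    s = imbalance(lat)     # slowdown S_i
    ri = imbalance(rate)   # rate imbalance RI_i
    si = imbalance(size)   # size imbalance SI_i
    # report RI and SI only for drives that are slow (S_i >= threshold)
    return [(round(ri[i], 2), round(si[i], 2))
            for i in range(len(drives)) if s[i] >= threshold]

drives = [(4.0, 100, 32), (4.2, 98, 32), (12.6, 101, 30), (4.1, 99, 33)]
print(slow_hour_imbalances(drives))   # the slow drive has RI and SI near 1x
```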
[Figure: CDF of slowdown vs. drive age in years (1-10), Disk]
Disk age shows some correlation with slowdowns, but the correlation is not strong.
No correlation of slowdowns with the time of day (00:00-24:00).
No explicit drive events around slow hours
Unplugging disks and plugging them back in does not particularly help.
There are significant differences between SSD vendors.
Create tail-tolerant RAIDs: treat slow disks as failed disks.
Reactive
Detect slow disks: those that take much longer than the others to answer (> 2x). If a disk is slow, reconstruct its answer from the other disks using RAID redundancy. In the best case, latency is around 3x that of a read from an average disk.
Proactive
Always issue the additional reads enabled by RAID redundancy and take the fastest answer. Uses much more I/O bandwidth.
Adaptive
A combination of both approaches that takes the findings into account. Use the reactive approach until a slowdown is detected; after that, switch to the proactive approach, since slowdowns are repetitive and last many hours.
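A minimal sketch of the adaptive policy under some assumed parameters (the one-hour proactive window and the 2x slow-read threshold are illustrative choices, not values from the paper):

```python
# Minimal sketch (not from the paper): adaptive tail-tolerant reads.
# Stay reactive until a drive is observed to be slow, then issue proactive
# (redundant) reads for that drive for a fixed window, since slowdowns tend
# to recur and persist for hours.
import time

PROACTIVE_WINDOW_S = 3600   # assumed: stay proactive for one hour
SLOW_FACTOR = 2.0           # assumed: a read > 2x typical latency is "slow"

class AdaptiveReader:
    def __init__(self):
        self.proactive_until = {}   # drive id -> timestamp

    def is_proactive(self, drive):
        return time.time() < self.proactive_until.get(drive, 0)

    def record_latency(self, drive, latency, typical_latency):
        # reactive path: mark the drive once a slowdown is detected
        if latency > SLOW_FACTOR * typical_latency:
            self.proactive_until[drive] = time.time() + PROACTIVE_WINDOW_S

    def plan_read(self, drive, redundancy_drives):
        # proactive path: also read the redundant copies / parity and
        # take whichever answer arrives first
        if self.is_proactive(drive):
            return [drive, *redundancy_drives]
        return [drive]
```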
More research is required on the possible causes of Disk and SSD slowdowns. Tail-tolerant RAIDs are needed to reduce the overhead caused by slowdowns.
Since reconstruction of data is the way to deal with slowdowns, and since Prob[1 slowdown] ≫ Prob[2 or more slowdowns], the Azure paper [Huang, 2012] becomes even more relevant.
General-purpose applications. Separate VM-VM connections from VM-Storage connections.
Storage is virtualized
Many layers from application to actual storage
Resources are shared across multiple tenants
Cannot support end-to-end policies (e.g., minimum IO bandwidth from application to storage).
Applications do not have any way of expressing their storage policies.
Shared infrastructure where aggressive applications tend to get more IO bandwidth.
No existing enforcement mechanism for controlling IO rates.
Aggregate performance policies
Non-performance policies
Admission control
Dynamic enforcement
Support for unmodified applications and VMs
<VM, Destination> -> Bandwidth (static, compute side)
<VM, Destination> -> Min Bandwidth (dynamic, compute side)
<VM, Destination> -> Sanitize (static, compute or storage side)
<VM, Destination> -> Priority Level (static, compute and storage side)
<Set of VMs, Set of Destinations> -> Bandwidth (dynamic, compute side)
Policies:
<VM1, Server X> -> B1
<VM2, Server X> -> B2

The controller sends the following to the SMBc (the SMB client driver on the compute side) of the physical server containing VM1 and VM2:

createQueueRule(<VM1, Server X>, Q1)
createQueueRule(<VM2, Server X>, Q2)
createQueueRule(<*, *>, Q0)
configureQueueService(Q1, <B1, low, S>), where S is the size of the queue
configureQueueService(Q2, <B2, low, S>)
configureQueueService(Q0, <C - B1 - B2, low, S>), where C is the capacity of Server X
Policies:
<VM1-VM3, Server X> -> 900 Mbps (aggregate)

Demand:
VM1 -> 600 Mbps, VM2 -> 400 Mbps, VM3 -> 200 Mbps

Result (max-min fair allocation of the 900 Mbps):
VM1 -> 350 Mbps, VM2 -> 350 Mbps, VM3 -> 200 Mbps
VM3's demand of 200 Mbps is fully satisfied; the remaining 700 Mbps is split evenly between VM1 and VM2, since each of them demands more than that share.
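The result above can be reproduced with a small max-min fair allocation routine; this is only an illustrative sketch, not IOFlow's actual controller logic:

```python
# Minimal sketch (not IOFlow's implementation): max-min fair division of an
# aggregate bandwidth guarantee among VMs with different demands.
def max_min_fair(total, demands):
    """Return {vm: allocation} for a max-min fair split of `total` Mbps."""
    alloc = {}
    remaining = dict(demands)
    budget = total
    while remaining:
        share = budget / len(remaining)
        # VMs whose demand fits within the fair share are fully satisfied
        satisfied = {vm: d for vm, d in remaining.items() if d <= share}
        if not satisfied:
            for vm in remaining:
                alloc[vm] = share
            break
        for vm, d in satisfied.items():
            alloc[vm] = d
            budget -= d
            del remaining[vm]
    return alloc

print(max_min_fair(900, {"VM1": 600, "VM2": 400, "VM3": 200}))
# -> {'VM3': 200, 'VM1': 350.0, 'VM2': 350.0}
```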
Windows-based IO stack.
10 hypervisors with 12 VMs each (120 VMs total).
4 tenants using 30 VMs each (3 VMs per hypervisor for each tenant).
1 Storage Server:
6.4 Gbps IO Bandwidth
1 Controller
1s interval between dynamic enforcements of policies
Tenant     Policy
Index      {VM 1-30, X}    -> Min 800 Mbps
Data       {VM 31-60, X}   -> Min 800 Mbps
Message    {VM 61-90, X}   -> Min 2500 Mbps
Log        {VM 91-120, X}  -> Min 1500 Mbps
Contributions
First software-defined storage approach. Fine-grained control over IO operations in the Cloud.
Limitations
The network or other resources might be the bottleneck.
Need to take care of placing VMs close to their data (spatial locality); Flat Datacenter Storage [Nightingale, 2012] provides solutions for this problem.
Guaranteed latencies cannot be expressed by the current policies; only a best-effort approach by setting priorities.
HDFS [Shvachko, 2009] and GFS [Ghemawat, 2003] work well for Hadoop MapReduce applications.
Facebook's Photo Storage [Beaver, 2010] exploits workload characteristics to design and implement a better storage system.