Hurricane – Master semester project, IC School Operating Systems Laboratory – PowerPoint PPT Presentation



SLIDE 1

Hurricane

Master semester project – IC School Operating Systems Laboratory
Author: Diego Antognini
Supervisors:

  • Prof. Willy Zwaenepoel
  • Laurent Bindschaedler

SLIDE 2

Outline

  • Motivation
  • Hurricane
  • Experiments
  • Future work
  • Conclusion


SLIDE 3

Motivation

SLIDE 4

Original goal of the project

  • Implement Chaos on top of HDFS!
  • How? Replace the Chaos storage engine with HDFS
  • Why?
    • Industry is interested in systems running on Hadoop
    • Easy cluster handling
    • Distributed file system
    • Fault tolerance (but at what price?)


SLIDE 5

Chaos

  • Scale-out graph processing from secondary storage
  • Maximizes sequential access
  • Stripes data across secondary storage devices in a cluster
  • Limited only by:
    • the aggregate bandwidth
    • the capacity of all storage devices in the entire cluster


SLIDE 6

Hadoop Distributed File System

[Diagram: HDFS architecture – Client, Namenode, Datanodes]

SLIDE 7

Experiment: DFSIO

  • Measure aggregate bandwidth on a cluster when writing & reading 100 GB of data split into X files:
  • Use the DFSIO benchmark
  • Each task operates on a distinct block
  • Measure disk I/O

# Files    Size per file
1          100 GB
2          50 GB
…          …
4096       25 MB

SLIDE 8

Clusters

           DCO
OS         Ubuntu 14.04.1
# Cores    16
Memory     128 GB
Storage    HDD: 140 MB/s, SSD: 243 MB/s
Network    10 Gbit/s

SLIDE 9

Results: DFSIO – DCO cluster

[Plot: disk I/O while writing 100 GB of data, 8 nodes, no replication, DCO cluster. X-axis: number of files (1–4096); Y-axis: aggregate bandwidth (MB/s). Series: Read, Write, and Baseline (dd, hdparm) read/write.]

SLIDE 10

Observations: DFSIO

  • Somewhat lackluster performance
  • Hard to tune!

HDFS doesn’t fit the requirements

SLIDE 11

Our solution

  • Create a standalone distributed storage system based on the Chaos storage engine
  • Give it an HDFS-like RPC interface

Actual project!

SLIDE 12

Hurricane

SLIDE 13

Hurricane

  • Scalable decentralized storage system based on Chaos
  • Balance I/O load randomly across available disks (see the sketch below)
  • Saturate available storage bandwidth
  • Target rack-scale deployment
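
The random balancing bullet above is easiest to see with a small sketch. This is not Hurricane's actual code: the `ServerStats` struct, the `queue_depth` field, and `pick_server()` are illustrative assumptions, loosely following the power-of-two-choices idea cited in reference 6.

```cpp
// Illustrative sketch only (not Hurricane's code): choose a storage server for
// the next block by sampling two servers at random and picking the one with
// the shorter request queue ("power of two choices"). Randomized placement
// like this balances I/O load without central coordination.
#include <cstddef>
#include <random>
#include <vector>

struct ServerStats {
    int id;                    // server identifier
    std::size_t queue_depth;   // outstanding requests on this server
};

int pick_server(const std::vector<ServerStats>& servers, std::mt19937& rng) {
    std::uniform_int_distribution<std::size_t> dist(0, servers.size() - 1);
    const ServerStats& a = servers[dist(rng)];
    const ServerStats& b = servers[dist(rng)];
    // Send the write to the less loaded of the two random candidates.
    return (a.queue_depth <= b.queue_depth) ? a.id : b.id;
}
```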


SLIDE 14

Real-life scenario

  • Chaos using Hurricane


SLIDE 15

Real-life scenario

  • Measuring emotions of countries during Euro 2016
  • And much more!

[Diagram: data from Switzerland, Belgium, and Romania flows in; per-country emotions (Emotions Switzerland, Emotions Belgium, Emotions Romania) come out.]

SLIDE 16

Locality does not matter!

  • Remote storage bandwidth = local storage bandwidth
  • Clients can read/write to any storage device
  • Storage is slower than the network
  • The network is not a bottleneck!
  • Realistic for most clusters at rack scale or even beyond

SLIDE 17

Maximizing I/O bandwidth

  • Clients pull data records from servers
  • Clients batch requests to prevent idle servers (prefetching); see the sketch below

[Diagram: clients pulling batched records from servers]
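
To make the two bullets above concrete, here is a minimal sketch of a client-side pull loop with batched prefetching. It is not Hurricane's real client: `request_batch()`, `wait_for_batch()`, `process()`, and the window size are placeholder assumptions standing in for the RPC layer.

```cpp
// Illustrative sketch only: the client keeps a window of batched requests in
// flight so that servers always have queued work (prefetching) and never sit
// idle between records.
#include <cstddef>
#include <deque>
#include <vector>

struct Batch { int server; std::size_t id; };              // one outstanding request

Batch request_batch(int server, std::size_t id);           // stand-in: fire off a batched read
std::vector<char> wait_for_batch(const Batch& b);          // stand-in: block until it completes
void process(const std::vector<char>& records);            // stand-in: consume the records

void pull_loop(const std::vector<int>& servers, std::size_t total_batches) {
    constexpr std::size_t kWindow = 8;     // requests kept in flight per client
    std::deque<Batch> inflight;
    std::size_t next = 0;

    while (next < total_batches || !inflight.empty()) {
        // Top up the window: issue requests round-robin across servers so
        // every server has work queued before we block on any result.
        while (inflight.size() < kWindow && next < total_batches) {
            int server = servers[next % servers.size()];
            inflight.push_back(request_batch(server, next));
            ++next;
        }
        // Consume the oldest batch while the newer requests keep the
        // servers busy in the background.
        process(wait_for_batch(inflight.front()));
        inflight.pop_front();
    }
}
```

The window size trades memory for overlap: a larger window hides more server latency but keeps more data in flight.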

SLIDE 18

Features

  • Global file handling (global_*)
    • create, exists, delete, fill, drain, rewind, etc.
  • Local file handling (local_*)
    • create, exists, delete, fill, drain, rewind, etc.
  • Add storage nodes dynamically
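
As a rough illustration of the feature list above, the HDFS-like RPC surface could be imagined as an interface along these lines. Everything beyond the operation names on the slide (argument lists, return types, the `HurricaneClient` name itself) is an assumption for illustration, not the project's actual API.

```cpp
// Hypothetical sketch of an HDFS-like RPC surface with the global_*/local_*
// split from the slide; signatures are guesses, not Hurricane's real interface.
#include <cstddef>
#include <string>

class HurricaneClient {
public:
    virtual ~HurricaneClient() = default;

    // Global files: one logical file striped across all storage nodes.
    virtual bool global_create(const std::string& name) = 0;
    virtual bool global_exists(const std::string& name) = 0;
    virtual bool global_delete(const std::string& name) = 0;
    virtual std::size_t global_fill(const std::string& name,
                                    const void* data, std::size_t len) = 0;  // append records
    virtual std::size_t global_drain(const std::string& name,
                                     void* buf, std::size_t len) = 0;        // pull records
    virtual void global_rewind(const std::string& name) = 0;

    // Local files: the same operations, bound to a single storage node.
    virtual bool local_create(const std::string& name) = 0;
    virtual bool local_exists(const std::string& name) = 0;
    virtual bool local_delete(const std::string& name) = 0;
    virtual std::size_t local_fill(const std::string& name,
                                   const void* data, std::size_t len) = 0;
    virtual std::size_t local_drain(const std::string& name,
                                    void* buf, std::size_t len) = 0;
    virtual void local_rewind(const std::string& name) = 0;
};
```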

SLIDE 19


How does it work? – Writing files

[Diagram: clients C1–C3 write file f to servers S1 and S2]

SLIDE 20

How does it work? – Reading files

[Diagram: clients C1–C3 read file f from servers S1 and S2]

SLIDE 21

How does it work? – Join

[Diagram: server S3 joins the cluster; files f and g end up spread across servers S1–S3]

SLIDE 22

Experiments

SLIDE 23

Clusters

           LABOS             DCO                            TREX
OS         Ubuntu 14.04.1    Ubuntu 14.04.1                 Ubuntu 14.04.1
# Cores    32                16                             32
Memory     32 GB             128 GB                         128 GB
Storage    HDD: 474 MB/s     HDD: 140 MB/s, SSD: 243 MB/s   HDD: 414 MB/s, SSD: 464 MB/s
Network    1 Gbit/s          10 Gbit/s                      40 Gbit/s

SLIDE 24

List of experiments

  • Weak scaling
  • Scalability with 1 client
  • Strong scaling
  • Case studies:
    • Unbounded buffer
    • Compression


SLIDE 25

Weak scaling

  • Each node writes/reads 16 GB of data
  • Increasing number of nodes
  • N servers, N clients
  • Measure average bandwidth
  • Compare Chaos storage engine, Hurricane, DFSIO


SLIDE 26

16 GB per node – 40 Gbit/s network

[Plots: TREX SSD write and read, average bandwidth (MB/s) vs. number of machines (1–16). Series: Chaos storage, Hurricane, DFSIO, and baseline (dd, hdparm).]

SLIDE 27

16 GB per node – 10 Gbit/s network

[Plots: DCO SSD write and read, average bandwidth (MB/s) vs. number of machines (1–8). Series: Chaos storage, Hurricane, DFSIO, and baseline (dd, hdparm).]

SLIDE 28

16 GB per node – 1 Gbit/s network

[Plots: LABOS read and write, average bandwidth (MB/s) vs. number of machines (1–8). Series: Chaos storage, Hurricane, DFSIO, and baseline (dd, hdparm).]

SLIDE 29

Weak scaling - Summary

  • Hurricane performs similarly to the Chaos storage engine
  • Scalable
  • Outperforms HDFS by roughly 1.5×
  • Maximizes I/O bandwidth


SLIDE 30

16 GB per node - 64 nodes

[Plots: DCO SSD read and write, average bandwidth (MB/s) vs. number of machines (1–64). Series: Chaos storage, Hurricane, and baseline (dd, hdparm).]

STILL SCALABLE & GOOD I/O BANDWIDTH

SLIDE 31

Scalability with 1 Client

  • Client writes/reads 16 GB of data per server node
  • Increasing number of server nodes
  • N servers, 1 client
  • Measure aggregate bandwidth
  • Only Hurricane is used


SLIDE 32

40 Gbit/s network

[Plots: TREX SSD read and write, aggregate bandwidth (MB/s) vs. number of machines (1–16). Annotations: unknown network problem, baseline, actual bandwidth of the network.]

SLIDE 33

10 Gbit/s network

[Plots: DCO SSD read and write, aggregate bandwidth (MB/s) vs. number of machines (1–8), with the baseline marked.]

Also scales with only 1 client

Uses the I/O bandwidth of all the server nodes

SLIDE 34

Strong scaling

  • Read/write 128 GB of data in total
  • Increasing number of nodes
  • N servers, N clients
  • Measure aggregate bandwidth
  • Compare Chaos storage engine, Hurricane, DFSIO


SLIDE 35

40 Gbit/s network

[Plots: TREX SSD read and write, aggregate bandwidth (MB/s) vs. number of machines (1–16). Series: Chaos storage, Hurricane, DFSIO, and baseline.]

SLIDE 36

1 Gbit/s network

[Plots: LABOS read and write, aggregate bandwidth (MB/s) vs. number of machines (1–8). Series: Chaos storage, Hurricane, DFSIO, and baseline.]

SLIDE 37

Strong scaling - Summary

  • Hurricane performs similarly to the Chaos storage engine
  • Scalable
  • Outperforms HDFS by roughly 1.5×
  • Maximizes I/O bandwidth


SLIDE 38

Case study - Unbounded buffer

  • Each node writes/reads a certain amount of data
  • Use Hurricane to amortize the mismatch between producers and consumers (see the sketch below)
  • Show that it can accommodate temporary spikes seamlessly
  • 16 machines on T-REX -> 16 servers & clients
  • Measure average file size
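
A hedged sketch of the producer/consumer pattern this case study measures, reusing the illustrative `HurricaneClient` interface from SLIDE 18 (so, again, not the real API): the producer keeps filling a global file even when it runs ahead of the consumer, and the backlog spills to the cluster's disks instead of blocking.

```cpp
// Illustrative sketch only: a producer that outruns its consumer keeps filling
// a global file, and the excess spreads over the cluster's disks instead of
// blocking. "hurricane_client.h" and HurricaneClient refer to the hypothetical
// interface sketched under SLIDE 18, not the project's real API.
#include <cstddef>
#include <vector>

#include "hurricane_client.h"   // hypothetical interface from the SLIDE 18 sketch

void produce(HurricaneClient& hc, std::size_t total_records) {
    std::vector<char> record(4096, 'x');   // dummy 4 KB record
    hc.global_create("spike_buffer");
    for (std::size_t i = 0; i < total_records; ++i) {
        // May run far ahead of the consumer; the backlog spills to disk.
        hc.global_fill("spike_buffer", record.data(), record.size());
    }
}

void consume(HurricaneClient& hc) {
    std::vector<char> buf(1 << 20);        // drain in 1 MB chunks
    hc.global_rewind("spike_buffer");
    while (hc.global_drain("spike_buffer", buf.data(), buf.size()) > 0) {
        // Process records at the consumer's own (slower) pace.
    }
}
```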


SLIDE 39

1 TB per node - ~2.5x SSD capacity

[Plot: average file size (GB) over time, TREX SSD, Hurricane, with the maximum SSD capacity marked.]
SLIDE 40

8 TB per node – ~20x SSD capacity

[Plot: average file size (GB) over time, TREX SSD, Hurricane, with the maximum SSD capacity marked.]
SLIDE 41

Case study - Summary

  • We can write much more than the cluster can handle
  • Still full I/O bandwidth!
  • Effectively amortize write-read imbalance
  • No degradation of I/O bandwidth
  • Hurricane can buy you time to react to a write deluge


SLIDE 42

Case study - Compression

  • Each node writes/reads 16 GB of data
  • Compress (LZ4) data at disk rate (see the sketch below)
  • 16 machines on T-REX -> 16 servers & clients
  • Compare three cases:
    • No compression
    • Zeroed data
    • Data amenable to delta encoding
  • Measure average bandwidth
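
A minimal sketch of the compression path used in this case study, assuming data is compressed with the standard LZ4 C API before it reaches the storage layer; `write_to_storage()` is a placeholder for Hurricane's actual write path, not a real function of the project.

```cpp
// Illustrative sketch: compress a buffer with LZ4 before handing it to the
// storage layer, falling back to the raw bytes if the data does not shrink.
#include <lz4.h>

#include <cstddef>
#include <vector>

// Placeholder stand-in for the actual write path.
void write_to_storage(const char* data, int len);

void compress_and_write(const std::vector<char>& input) {
    const int src_size = static_cast<int>(input.size());
    std::vector<char> compressed(LZ4_compressBound(src_size));

    int out_size = LZ4_compress_default(input.data(), compressed.data(),
                                        src_size, static_cast<int>(compressed.size()));
    if (out_size > 0 && out_size < src_size) {
        // Compressible data (e.g. zeroed or delta-encodable buffers):
        // fewer bytes hit the disk, so effective bandwidth goes up.
        write_to_storage(compressed.data(), out_size);
    } else {
        // Incompressible data: store it as-is to avoid paying for expansion.
        write_to_storage(input.data(), src_size);
    }
}
```

With highly compressible input (the zeroed-buffer case in the next slide), far fewer bytes reach the SSD, which is why the measured effective bandwidth can exceed the raw device speed.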


SLIDE 43

16 GB of input

[Bar charts: average bandwidth (MB/s) on 16 TREX machines (SSD), read and write, for no compression, zeroed buffer, and delta-encodable data, with the baseline marked.]

If the data is amenable to compression, there are both speed and storage gains!

Type of data      Input    Output   Read speed   Write speed
No compression    16 GB    16 GB    443 MB/s     455 MB/s
Zeroed buffer     16 GB    65 MB    1260 MB/s    565 MB/s
Delta-encodable   16 GB    7.2 GB   964 MB/s     455 MB/s

SLIDE 44

Future work

SLIDE 45

Future work

  • Fault tolerance
  • Implement Chaos on Hurricane
  • Integrate Hurricane into Hadoop or Spark
  • Further experiments


SLIDE 46

Conclusion

SLIDE 47

Conclusion

  • Hurricane is a scalable, decentralized storage system
  • HDFS-like RPC interface (flexible)
  • Outperforms HDFS
  • Maximizes I/O bandwidth


SLIDE 48

THANK YOU

QUESTIONS?

SLIDE 49

References

1. Amitabha Roy, Laurent Bindschaedler, Jasmina Malicevic, and Willy Zwaenepoel: Chaos: Scale-out Graph Processing from Secondary Storage. SOSP 2015.
2. Amitabha Roy, Ivo Mihailovic, and Willy Zwaenepoel: X-Stream: Edge-centric Graph Processing using Streaming Partitions. SOSP 2013.
3. Konstantin Shvachko, Hairong Kuang, Sanjay Radia, and Robert Chansler: The Hadoop Distributed File System. MSST 2010.
4. Mark Slee, Aditya Agarwal, and Marc Kwiatkowski: Thrift: Scalable Cross-Language Services Implementation. Facebook white paper, 2007.
5. Edmund B. Nightingale, Jeremy Elson, Jinliang Fan, Owen Hofmann, Jon Howell, and Yutaka Suzue: Flat Datacenter Storage. OSDI 2012.
6. Michael Mitzenmacher: The Power of Two Choices in Randomized Load Balancing. IEEE Transactions on Parallel and Distributed Systems, 2001.
