Hurricane
Master semester project – IC School, Operating Systems Laboratory
Author: Diego Antognini
Supervisors: Prof. Willy Zwaenepoel, Laurent Bindschaedler
Outline: Motivation, Hurricane, Experiments, Future work, Conclusion
4 Introduction – Hurricane – Experiments – Future work - Conclusion
[Figure: HDFS architecture with a client, a Namenode, and Datanodes]
Writing and reading 100 GB of data in X files:

# Files    Size
1          100 GB
2          50 GB
…          …
4096       25 MB
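The file sizes in this table follow from splitting a fixed 100 GB across the number of files. Below is a minimal sketch of that arithmetic, assuming 1 GB = 1024 MB (which gives the 25 MB shown for 4096 files) and assuming the file counts are the powers of two shown on the plot axis. The helper names `file_size_mb` and `format_size` are illustrative and not part of the project.

```python
# Sketch only: reproduces the "# Files / Size" table under stated assumptions.
TOTAL_MB = 100 * 1024  # 100 GB expressed in MB, assuming 1 GB = 1024 MB


def file_size_mb(num_files: int) -> float:
    """Size of each file in MB when 100 GB is split evenly across num_files."""
    return TOTAL_MB / num_files


def format_size(size_mb: float) -> str:
    """Render a size in GB when it is at least 1 GB, otherwise in MB."""
    if size_mb >= 1024:
        return f"{size_mb / 1024:g} GB"
    return f"{size_mb:g} MB"


# Assumed sweep: powers of two from 1 to 4096 files.
file_counts = [2 ** k for k in range(13)]

for n in file_counts:
    print(f"{n:>5} files of {format_size(file_size_mb(n))} each")
```

The total amount of data stays constant, so only the per-file size changes as the file count grows.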
DCO
OS         Ubuntu 14.04.01
# Cores    16
Memory     128 GB
Storage    HDD: 140 MB/s, SSD: 243 MB/s
Network    10 Gbit/s
[Figure: aggregate bandwidth [MB/s] vs. number of files (1 to 4096). I/O to disk, writing 100 GB of data, 8 nodes, no replication, DCO cluster. Series: Read, Write, Baseline (dd, hdparm) read/write.]
HDFS doesn’t fit the requirements
based on Chaos storage engine
Actual project !
[Figure: data for Switzerland, Belgium, and Romania mapped to emotions per country]
[Figure: three clients and three servers]
[Figure sequence: servers S1, S2 (later also S3) and clients C1, C2, C3, with labels f and g]
           LABOS             DCO                             TREX
OS         Ubuntu 14.04.1    Ubuntu 14.04.01                 Ubuntu 14.04.01
# Cores    32                16                              32
Memory     32 GB             128 GB                          128 GB
Storage    HDD: 474 MB/s     HDD: 140 MB/s, SSD: 243 MB/s    HDD: 414 MB/s, SSD: 464 MB/s
Network    1 Gbit/s          10 Gbit/s                       40 Gbit/s
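To read this table, it can help to compare each machine's storage bandwidth with its network link in the same unit. The sketch below does that conversion and reports which of the two is lower per machine. It assumes 1 Gbit/s = 125 MB/s (decimal units, no protocol overhead), so it is only a rough upper bound and not a measured result. The names `CLUSTERS`, `network_mb_s`, and `bottleneck` are illustrative.

```python
# Sketch only: per-machine comparison of storage vs. network bandwidth,
# using the figures from the cluster table.
CLUSTERS = {
    "LABOS": {"storage_mb_s": {"HDD": 474}, "network_gbit_s": 1},
    "DCO": {"storage_mb_s": {"HDD": 140, "SSD": 243}, "network_gbit_s": 10},
    "TREX": {"storage_mb_s": {"HDD": 414, "SSD": 464}, "network_gbit_s": 40},
}


def network_mb_s(gbit_s: float) -> float:
    """Convert a link speed in Gbit/s to MB/s (assumes 1 Gbit = 1000 Mbit)."""
    return gbit_s * 1000 / 8


def bottleneck(storage_mb_s: float, gbit_s: float) -> tuple:
    """Return which resource is slower for one machine, and its bandwidth."""
    net = network_mb_s(gbit_s)
    if net < storage_mb_s:
        return ("network", net)
    return ("storage", storage_mb_s)


for name, spec in CLUSTERS.items():
    for device, bandwidth in spec["storage_mb_s"].items():
        kind, limit = bottleneck(bandwidth, spec["network_gbit_s"])
        print(f"{name} {device}: limited by {kind} at {limit:g} MB/s")
```

Under these assumptions, only LABOS has a network link slower than its local disk; on DCO and TREX the storage device is the lower of the two figures.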
[Figure: average bandwidth [MB/s] vs. number of machines (1 to 16), TREX SSD. Panels: Write, Read. Series: Chaos storage, Hurricane, DFSIO, Baseline (dd, hdparm).]
[Figure: average bandwidth [MB/s] vs. number of machines (1 to 8), DCO SSD. Panels: Write, Read. Series: Chaos storage, Hurricane, DFSIO, Baseline (dd, hdparm).]
[Figure: average bandwidth [MB/s] vs. number of machines (1 to 8), LABOS. Panels: Read, Write. Series: Chaos storage, Hurricane, DFSIO, Baseline (dd, hdparm).]
[Figure: average bandwidth [MB/s] vs. number of machines (1 to 64), DCO SSD. Panels: Read, Write. Series: Chaos storage, Hurricane, Baseline (dd, hdparm).]
STILL SCALABLE & GOOD I/O BANDWIDTH
[Figure: aggregate bandwidth [MB/s] vs. number of machines (1 to 16), TREX SSD. Panels: Read, Write. Annotations: Baseline; Actual bandwidth of the network; Unknown network problem.]
[Figure: aggregate bandwidth [MB/s] vs. number of machines (1 to 8), DCO SSD. Panels: Read, Write. Annotation: Baseline.]
Also scales with only 1 client
Uses the I/O bandwidth of all the server nodes
[Figure: aggregate bandwidth [MB/s] vs. number of machines (1 to 16), TREX SSD. Panels: Read, Write. Series: Chaos storage, Hurricane, DFSIO, Baseline.]
[Figure: aggregate bandwidth [MB/s] vs. number of machines (1 to 8), LABOS. Panels: Read, Write. Series: Chaos storage, Hurricane, DFSIO, Baseline.]
producers and consumers
seamlessly
[Figure: average file size [GB] over time, TREX SSD, Hurricane (two panels)]
[Figure: average bandwidth [MB/s] on 16 machines, TREX SSD. Panels: Read, Write. Bars: No compression, Zeroed buffer, Delta-encodable, with Baseline.]
If the data is amenable to compression, both speed and storage improve!
Type of data      Input    Output    Read speed    Write speed
No compression    16 GB    16 GB     443 MB/s      455 MB/s
Zeroed buffer     16 GB    65 MB     1260 MB/s     565 MB/s
Delta-encodable   16 GB    7.2 GB    964 MB/s      455 MB/s
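The storage and speed gains in this table can be expressed as ratios against the uncompressed run. The sketch below derives them from the reported numbers only, assuming 1 GB = 1024 MB for the unit conversion. The names `RESULTS`, `compression_ratio`, and `speedup` are illustrative and not part of Hurricane.

```python
# Sketch only: ratios derived from the compression results table.
GB = 1024  # MB per GB (assumption for unit conversion)

RESULTS = {
    "No compression": {"input_mb": 16 * GB, "output_mb": 16 * GB,
                       "read": 443, "write": 455},
    "Zeroed buffer": {"input_mb": 16 * GB, "output_mb": 65,
                      "read": 1260, "write": 565},
    "Delta-encodable": {"input_mb": 16 * GB, "output_mb": 7.2 * GB,
                        "read": 964, "write": 455},
}


def compression_ratio(row: dict) -> float:
    """Input size divided by stored size (higher means better compression)."""
    return row["input_mb"] / row["output_mb"]


def speedup(row: dict, baseline: dict, key: str) -> float:
    """Bandwidth relative to the uncompressed run; key is 'read' or 'write'."""
    return row[key] / baseline[key]


baseline = RESULTS["No compression"]
for name, row in RESULTS.items():
    print(f"{name}: ratio {compression_ratio(row):.1f}x, "
          f"read {speedup(row, baseline, 'read'):.2f}x, "
          f"write {speedup(row, baseline, 'write'):.2f}x")
```

With these numbers, the read path benefits in both compressible cases, while the write bandwidth for delta-encodable data matches the uncompressed run.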
QUESTIONS ?
1. Amitabha Roy, Laurent Bindschaedler, Jasmina Malicevic, and Willy Zwaenepoel: Chaos: Scale-out Graph Processing from Secondary Storage. SOSP 2015.
2. Amitabha Roy, Ivo Mihailovic, and Willy Zwaenepoel: X-Stream: Edge-centric Graph Processing using Streaming Partitions. SOSP 2013.
3. Konstantin Shvachko, Hairong Kuang, Sanjay Radia, and Robert Chansler: The Hadoop Distributed File System. MSST 2010.
4. Mark Slee, Aditya Agarwal, and Marc Kwiatkowski: Thrift: Scalable cross-language services implementation. Facebook white paper 2007.
5. Edmund B. Nightingale, Jeremy Elson, Jinliang Fan, Owen Hofmann, Jon Howell, and Yutaka Suzue: Flat Datacenter
6. Michael Mitzenmacher: The Power of Two Choices in Randomized Load Balancing. IEEE Transactions on Parallel and Distributed Systems 2001.