Pufferbench: Evaluating and Optimizing Malleability of Distributed Storage
Nathanaël Cheriere, Matthieu Dorier, Gabriel Antoniu PDSW-DISCS 2018, Dallas
Data is everywhere
High variety of applications
High variety of needs
Resource
How fast can one scale down a distributed file system? N. Cheriere, G. Antoniu, BigData 2017
A Lower Bound for the Commission Times in Replication-Based Distributed Storage Systems. N. Cheriere,
MetadataGenerator: Generate information about files on the storage (number, size)
DataDistributionGenerator: Assign files to storage nodes
DataTransferScheduler: Compute data transfers needed for rescaling
IODispatcher: Assign transfer instructions to storage and network
Storage: Interface with the storage devices
Network: Exchange data between nodes
DataDistributionValidator: Compute statistics about data placement (load, replication)
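The roles listed above can be illustrated with a minimal Python sketch of a decommission. This is not Pufferbench's actual code or API: every function name, signature, and policy here (round-robin placement, least-loaded destination) is a hypothetical stand-in chosen for illustration, and the IODispatcher, Storage, and Network roles are left out because no real I/O is performed.

```python
def generate_metadata(num_files, size):
    """MetadataGenerator role: describe the files as {file_id: size}."""
    return {f: size for f in range(num_files)}

def distribute_data(files, num_nodes, replication):
    """DataDistributionGenerator role: place each file's replicas on
    distinct nodes, round-robin (an assumed policy)."""
    return {f: [(f + r) % num_nodes for r in range(replication)] for f in files}

def schedule_decommission(files, placement, num_nodes, leaving):
    """DataTransferScheduler role: move every replica stored on a leaving
    node to the least-loaded remaining node that lacks the file.
    Assumes enough nodes remain to hold all replicas of each file."""
    staying = [n for n in range(num_nodes) if n not in leaving]
    load = {n: 0 for n in staying}
    for f, nodes in placement.items():
        for n in nodes:
            if n in load:
                load[n] += files[f]
    transfers, new_placement = [], {}
    for f, nodes in placement.items():
        kept = [n for n in nodes if n not in leaving]
        for src in [n for n in nodes if n in leaving]:
            candidates = [n for n in staying if n not in kept]
            dst = min(candidates, key=lambda n: (load[n], n))
            transfers.append((src, dst, f))  # (source, destination, file)
            load[dst] += files[f]
            kept.append(dst)
        new_placement[f] = kept
    return transfers, new_placement

def validate(files, placement, replication):
    """DataDistributionValidator role: check the replication factor and
    return the load stored on each node."""
    load = {}
    for f, nodes in placement.items():
        assert len(set(nodes)) == replication, f"file {f} lost a replica"
        for n in nodes:
            load[n] = load.get(n, 0) + files[f]
    return load

files = generate_metadata(num_files=12, size=64)
placement = distribute_data(files, num_nodes=4, replication=2)
transfers, after = schedule_decommission(files, placement, num_nodes=4, leaving={3})
loads = validate(files, after, replication=2)
print(len(transfers), loads)
```

In this toy setup the scheduler only plans transfers. A real benchmark would hand the resulting list to components that read from storage and send data over the network, then time the whole operation.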
Matching hypotheses:
Differences:
Figure: Decommission and commission times, Pufferbench vs. theoretical minimum, for in-memory storage and on-drive storage.
Decommission panels: x-axis, number of decommissioned nodes (out of a cluster of 20); y-axis, time to decommission (s).
Commission panels: x-axis, number of commissioned nodes (to a cluster of 10); y-axis, time to commission (s).
Within 16% of lower bounds
Lower bounds are realistic
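The comparison against the lower bounds can be made concrete with a short Python sketch. It assumes that "within 16%" refers to the relative overhead of a measured time over the theoretical minimum, which the slides do not define explicitly. The function names and the timings in the usage lines are hypothetical and are not measurements from the talk.

```python
def relative_overhead(measured_s, lower_bound_s):
    """Fraction by which a measured time exceeds the lower bound."""
    if lower_bound_s <= 0:
        raise ValueError("lower bound must be positive")
    return (measured_s - lower_bound_s) / lower_bound_s

def within_bound(measured_s, lower_bound_s, tolerance=0.16):
    """True if the measured time is at most `tolerance` above the bound."""
    return relative_overhead(measured_s, lower_bound_s) <= tolerance

# Made-up timings, for illustration only.
example_overhead = relative_overhead(measured_s=11.0, lower_bound_s=10.0)
example_ok = within_bound(measured_s=11.0, lower_bound_s=10.0)
example_too_slow = within_bound(measured_s=13.0, lower_bound_s=10.0)
```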
Figure: Decommission times measured on HDFS vs. Pufferbench vs. theoretical minimum, for in-memory storage and on-drive storage.
x-axis: number of decommissioned nodes (out of a cluster of 20); y-axis: time to decommission (s).
Plot annotation: 3x
Figure: Commission times measured on HDFS vs. Pufferbench vs. theoretical minimum, for in-memory storage and on-drive storage.
x-axis: number of commissioned nodes (to a cluster of 10); y-axis: time to commission (s).
Plot annotation: 14x
Setup overhead for the commission in memory:
Good for prototyping:
nathanael.cheriere@irisa.fr