Pufferbench: Evaluating and Optimizing Malleability of Distributed Storage

Nathanaël Cheriere, Matthieu Dorier, Gabriel Antoniu. PDSW-DISCS 2018, Dallas.


  1. Pufferbench: Evaluating and Optimizing Malleability of Distributed Storage Nathanaël Cheriere, Matthieu Dorier, Gabriel Antoniu PDSW-DISCS 2018, Dallas

  2. Data is everywhere • High variety of applications • High variety of needs

  3. Resource requirements vary in time • Day/night cycles • Weekly cycles • Workflows

  4. Dynamically adjust the amount of resources? Why? • Satisfy resource requirements (peaks, lows) • Avoid idle nodes ✓ Save money ✓ Save energy. Problem: What about task/data colocation? • Local data access • Easy scalability. ✓ Computing resources malleability ? Storage system malleability

  5. Two operations: • Decommission • Commission. Constraints: • No data losses • Maintain fault tolerance. Problems: • Long data transfers • Balance data

  6. What is the duration of storage rescaling on a given platform? • Previous works: lower bounds • Useful but unrealistic • Many simplifications • Need a tool to measure it on real hardware. References: N. Cheriere, G. Antoniu, "How fast can one scale down a distributed file system?", BigData 2017. N. Cheriere, M. Dorier, G. Antoniu, "A Lower Bound for the Commission Times in Replication-Based Distributed Storage Systems", Research Report, submitted to JPDC, 2018.

  7. A benchmark: Pufferbench. Goals: • Measure the duration of rescaling on a platform • Serve as a quick prototyping testbed for rescaling mechanisms. How: • Perform all the I/Os that a rescaling operation requires

  8. Main steps 1. Migration Planning 2. Data Generation 3. Execution 4. Statistics Aggregation

  9. Software Architecture

  10. Software Architecture MetadataGenerator: Generate information about the files on the storage (number, size)
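
As a rough illustration of what such a component might produce, the sketch below describes a set of files by identifier and size. The function name `generate_metadata` and its parameters are hypothetical and are not taken from Pufferbench; the fixed file size is an assumption made to keep the example short.

```python
def generate_metadata(total_bytes, file_size):
    """Describe files as (file_id, size_in_bytes) pairs covering total_bytes.

    Hypothetical helper: every file has size file_size, except possibly
    the last one, which holds the remainder.
    """
    if total_bytes < 0 or file_size <= 0:
        raise ValueError("total_bytes must be >= 0 and file_size must be > 0")
    full, rest = divmod(total_bytes, file_size)
    sizes = [file_size] * full + ([rest] if rest else [])
    return list(enumerate(sizes))


MIB = 1024 ** 2
files = generate_metadata(total_bytes=1000 * MIB, file_size=128 * MIB)
print(len(files), sum(size for _, size in files) // MIB)  # prints "8 1000"
```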

  11. Software Architecture DataDistributionGenerator: Assign files to storage nodes
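
One possible way to assign files to storage nodes is uniform random placement with replication, sketched below. The name `distribute` and its parameters are hypothetical, and this is only one policy among many; note that purely random placement is balanced on average but not exactly.

```python
import random


def distribute(file_ids, nodes, replication=3, seed=0):
    """Map each file to `replication` distinct nodes chosen uniformly at random.

    Hypothetical sketch, not Pufferbench's implementation.
    """
    nodes = list(nodes)
    if replication > len(nodes):
        raise ValueError("replication level exceeds the number of nodes")
    rng = random.Random(seed)  # seeded so that runs are reproducible
    return {f: rng.sample(nodes, replication) for f in file_ids}


placement = distribute(range(100), nodes=range(20), replication=3)
```

Each file ends up on three different nodes, so losing a single node never removes every copy of a file.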

  12. Software Architecture DataTransferScheduler: Compute data transfers needed for rescaling
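
To make the idea concrete, here is a minimal greedy sketch for a decommission: every replica stored on a leaving node is copied to the least-loaded remaining node that does not already hold the file. The function name, the greedy policy, and the assumption that all files have the same size are illustrative choices, not the scheduler used in Pufferbench.

```python
def schedule_decommission(placement, nodes, leaving):
    """Return (transfers, new_placement) for removing `leaving` nodes.

    transfers is a list of (file_id, source_node, destination_node).
    Load is counted in replicas, which assumes equal-size files.
    """
    leaving = set(leaving)
    remaining = [n for n in nodes if n not in leaving]
    load = {n: 0 for n in remaining}
    for holders in placement.values():
        for n in holders:
            if n in load:
                load[n] += 1

    transfers, new_placement = [], {}
    for file_id, holders in placement.items():
        kept = [n for n in holders if n not in leaving]
        for src in (n for n in holders if n in leaving):
            candidates = [n for n in remaining if n not in kept]
            if not candidates:
                raise ValueError("too few remaining nodes to keep replication")
            dst = min(candidates, key=lambda n: load[n])  # least-loaded target
            transfers.append((file_id, src, dst))
            load[dst] += 1
            kept.append(dst)
        new_placement[file_id] = kept
    return transfers, new_placement


placement = {0: [0, 1], 1: [1, 2], 2: [2, 3], 3: [3, 0]}
transfers, after = schedule_decommission(placement, nodes=[0, 1, 2, 3], leaving=[3])
print(transfers)  # prints "[(2, 3, 0), (3, 3, 1)]"
```

The replication level of each file is preserved, which corresponds to the "maintain fault tolerance" constraint stated earlier in the deck.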

  13. Software Architecture IODispatcher: Assign transfer instructions to storage and network

  14. Software Architecture Storage: Interface with the storage devices

  15. Software Architecture Network: Exchange data between nodes

  16. Software Architecture DataDistributionValidator: Compute statistics about data placement (load, replication)
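
The kind of statistics mentioned here (load and replication) could be computed along the following lines. The function name, the returned dictionary, and the imbalance metric (maximum load divided by mean load) are assumptions made for this sketch.

```python
def validate_distribution(placement, sizes, nodes, replication):
    """Compute per-node load, a load-imbalance ratio, and under-replicated files.

    Hypothetical sketch: placement maps file_id -> list of nodes,
    sizes maps file_id -> size in bytes.
    """
    load = {node: 0 for node in nodes}
    under_replicated = []
    for file_id, holders in placement.items():
        distinct = set(holders)  # two copies on one node count once
        if len(distinct) < replication:
            under_replicated.append(file_id)
        for node in distinct:
            load[node] += sizes[file_id]
    mean = sum(load.values()) / len(load)
    imbalance = max(load.values()) / mean if mean else 0.0
    return {"load": load, "imbalance": imbalance,
            "under_replicated": under_replicated}


stats = validate_distribution(
    placement={"a": [0, 1], "b": [1, 2], "c": [2, 2]},
    sizes={"a": 10, "b": 10, "c": 20},
    nodes=[0, 1, 2],
    replication=2,
)
print(stats["load"], stats["imbalance"], stats["under_replicated"])
```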

  17. Validation: comparison to lower bounds. Hardware: • Up to 40 nodes • 16 cores, 2.4 GHz • 128 GB RAM • 558 GB disk • 10 Gbps Ethernet. Matching hypotheses: • Load balancing (50 GB per node) • Uniform data distribution • Data replication. Differences: • Hardware is not identical • Storage has latency • Network has latency and interference
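
For a sense of scale, the hardware figures above allow a back-of-envelope estimate. The sketch below is a deliberate simplification and is not the lower bound from the cited papers: it assumes that network reception on the remaining nodes is the only bottleneck, that received data is spread evenly, and that 1 GB is 10^9 bytes. The function name is hypothetical.

```python
def naive_decommission_time(cluster_size, leaving, data_per_node_gb, link_gbps):
    """Seconds needed for the remaining nodes to receive the leaving nodes' data.

    Simplified estimate: ignores disks, latency, and replica-aware sourcing.
    """
    receivers = cluster_size - leaving
    if receivers <= 0:
        raise ValueError("at least one node must remain")
    gb_per_second = link_gbps / 8  # gigabits per second to gigabytes per second
    return leaving * data_per_node_gb / (receivers * gb_per_second)


# 7 nodes leaving a 20-node cluster, 50 GB per node, 10 Gbps links
estimate = naive_decommission_time(20, 7, data_per_node_gb=50, link_gbps=10)
print(round(estimate, 1))
```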

  18. Pufferbench is close to lower bounds! • Within 16% of lower bounds • Lower bounds are realistic [Figure: four panels comparing Pufferbench with the theoretical minimum. Decommission: time to decommission (s) vs. number of decommissioned nodes (out of a cluster of 20). Commission: time to commission (s) vs. number of commissioned nodes (added to a cluster of 10). Each is shown for in-memory storage and on-drive storage.]

  19. Use case: HDFS. Question: How fast can rescaling in HDFS be? • No modifications of HDFS. With Pufferbench: • Reproduce initial conditions • Aim for same final data placement

  20. Pufferbench matching HDFS’s rescaling • Load balanced • Mostly random • Random placement • Replicated 3 times • Chunks of 128 MiB

  21. HDFS needs better disk I/Os [Figure: decommission times, time to decommission (s) vs. number of decommissioned nodes (out of a cluster of 20), for in-memory storage and on-drive storage. Series: measured on HDFS, Pufferbench, theoretical minimum. Annotation: 3 x.] Improvement possible on disk access patterns

  22. HDFS is far from optimal performance! [Figure: commission times, time to commission (s) vs. number of commissioned nodes (added to a cluster of 10), for in-memory storage and on-drive storage. Series: measured on HDFS, Pufferbench, theoretical minimum. Annotation: 14 x.] Improvement possible on algorithms, disk access patterns, pipelining

  23. Setup duration Setup overhead for the commission in memory: • HDFS: 26 h • Pufferbench: 53 min Good for prototyping: • Fast evaluation • Light setup

  24. To conclude Pufferbench: • Evaluate the viability of storage malleability on platforms • Quickly prototype and evaluate rescaling mechanisms Available at https://gitlab.inria.fr/Puffertools/Pufferbench Can be installed with Spack

  25. Thank you! Questions? nathanael.cheriere@irisa.fr
