


  1. Extracting Flexible, Replayable Models from Large Block Traces (T2M)
     Vasily Tarasov¹, Santhosh Kumar¹, Jack Ma², Dean Hildebrand³, Anna Povzner², Geoff Kuenning², Erez Zadok¹
     ¹ Stony Brook University, ² Harvey Mudd College, ³ IBM Research – Almaden

  2. Outline
     1. Traces and their problems
     2. Workload models suitability
     3. Design of the model extractor
     4. Evaluation
     5. Conclusions
     Extracting Workload Models from Block Traces – FAST 2012, 2/11/2012

  3. Traces
     ● In the general case, any event can be traced (process forking, file accesses, user logins)
     ● Timestamp is a common field
     ● Other fields depend on the specific events traced
     ● We used block traces
     ● Our approach is valid for any trace
     Each event produces one trace record:
     Timestamp  Operation  I/O size  Offset
     0          read       4096      0
     0.5        read       4096      4096
     0.7        read       4096      8192
     1.3        write      8192      28762
     1.5        write      8192      32768
     1.6        read       4096      12288
     2.0        read       4096      14384

  4. Trace Use Cases
     Traces are a highly valuable source for:
     ● Workload analysis and characterization
       ♦ Tune existing systems
       ♦ Design new systems
     ● Trace replay
       ♦ Evaluate, compare, and validate system behavior
     But there are problems.

  5. Problems with Trace Replay
     ● Large in size
       ♦ Disturb results (replayer bottlenecks on I/O, cache pollution)
       ♦ Hard to distribute
     ● Static objects
       ♦ Hard to intelligently and systematically modify the workload
       ♦ Not easy to compare

  6. Outline
     1. Traces and their problems
     2. Workload models suitability
     3. Design of the model extractor
     4. Evaluation
     5. Conclusions

  7. Statistics Matter
     ● Monday's trace is not exactly the same as Tuesday's trace
     ● Yet the system's responses to both are the same
     ● Statistics of the workload in the traces impact the system:
       ♦ read/write ratio
       ♦ I/O size
     ● The set of statistics depends on the specific system
     [Figure: Monday trace and Tuesday trace, with the same read/write ratio and the same I/O size. Observe the system's response: latency, throughput, power, disk utilization.]

  8. Outline
     1. Traces and their problems
     2. Workload models suitability
     3. Design of the model extractor
     4. Evaluation
     5. Conclusions

  9. Design Goals
     ● Accuracy
       ♦ System responses match
     ● Conciseness
       ♦ Small model size
     ● Flexibility
       ♦ Trade model size for accuracy
       ♦ Existing benchmarks for workload generation
     ● Extensibility
       ♦ Statistics and benchmarks

  10. Trace Chunking
      ● Workload changes in the trace over time
      [Figure: I/O size over trace time; the per-interval I/O sizes shown are 8KB, 6KB, 2KB, 2KB, 2KB, 1KB, 0.5KB, 0.5KB.]
      ● Chunk the trace:
        ♦ Fixed chunking first
        ♦ Then deduplicate chunks
        ♦ This often results in variable chunking
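The chunking steps on this slide can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `Record` type, the `signature` summary (the set of I/O sizes in a chunk), and the choice to merge only adjacent chunks are all assumptions made for the example.

```python
from collections import namedtuple

# Hypothetical record layout, mirroring the trace fields shown on the "Traces" slide.
Record = namedtuple("Record", "timestamp op size offset")

def fixed_chunks(records, interval):
    """Split records into fixed-length time intervals (empty intervals are skipped)."""
    chunks = {}
    for r in records:
        chunks.setdefault(int(r.timestamp // interval), []).append(r)
    return [chunks[k] for k in sorted(chunks)]

def signature(chunk):
    """Assumed per-chunk summary: the set of I/O sizes seen in the chunk."""
    return frozenset(r.size for r in chunk)

def merge_adjacent(chunks):
    """Collapse neighbouring chunks with equal signatures into (signature, length) runs."""
    runs = []
    for c in chunks:
        sig = signature(c)
        if runs and runs[-1][0] == sig:
            runs[-1][1] += 1          # same workload as the previous chunk: extend the run
        else:
            runs.append([sig, 1])     # workload changed: start a new run
    return [(sig, n) for sig, n in runs]

trace = [
    Record(0.1, "read", 2048, 0),
    Record(1.2, "read", 2048, 2048),
    Record(2.3, "read", 2048, 4096),
    Record(3.4, "write", 8192, 32768),
]
runs = merge_adjacent(fixed_chunks(trace, interval=1.0))
print(runs)
```

With a 1-second interval, the first three fixed chunks collapse into one run of length 3, which is how fixed chunking followed by deduplication ends up looking like variable chunking.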

  11. Within a Chunk
      ● Assume stationary workload
      ● Feature functions
      Trace field vector: p = (p_1, p_2, …, p_n)
      Feature function: f_1 = f_1(p, s_1), where s_1 is the state
      Feature function vector: f = (f_1(p, s_1), f_2(p, s_2), …, f_n(p, s_n))
      The feature function values are put into a multi-dimensional histogram.

  12. Multi-Dimensional Histogram
      Trace fields p:
      ● p_1 – operation: read – 0, write – 1
      ● p_2 – I/O size, in KB
      ● p_3 – offset, in KB
      Feature functions f:
      ● f_1 = p_1 (operation)
      ● f_2 = p_2 (I/O size)
      ● f_3 = log(offset - s_3.prev_offset) (inter-arrival distance)
      [Figure: 3-D histogram with axes f_1 (operation: read – 0, write – 1), f_2 (I/O size, KB), and f_3 (inter-arrival distance, logarithmic, KB). Bin counts on the visible face, one row per I/O size:]
      I/O size (KB)   Bin counts
      1               100  791   38   12
      2                60   95  412   32
      4                99   27   10  198
      8                 0    0    0    0
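The three feature functions on this slide can be collected into a histogram along the following lines. This is only a sketch under stated assumptions: the tuple layout of the records, the use of a base-2 logarithm, taking the absolute value of the distance, and sending distances below 1 KB to bucket 0 are choices made for the example and are not given on the slide.

```python
import math
from collections import Counter

def build_histogram(records):
    """records: iterable of (operation, size_kb, offset_kb) tuples.

    Returns a Counter keyed by the feature vector (f1, f2, f3).
    """
    hist = Counter()
    prev_offset = None  # state s_3 carried between records for feature f_3
    for op, size_kb, offset_kb in records:
        f1 = 0 if op == "read" else 1          # f_1: operation (read = 0, write = 1)
        f2 = size_kb                           # f_2: I/O size in KB
        if prev_offset is None:
            distance = 0                       # first record has no predecessor
        else:
            distance = abs(offset_kb - prev_offset)
        # Assumption: f_3 is floor(log2(distance)); distances below 1 KB go to bucket 0.
        f3 = int(math.log2(distance)) if distance >= 1 else 0
        prev_offset = offset_kb
        hist[(f1, f2, f3)] += 1
    return hist

sample = [
    ("read", 4, 0),
    ("read", 4, 4),
    ("read", 4, 8),
    ("write", 8, 32),
    ("write", 8, 40),
]
hist = build_histogram(sample)
print(hist)
```

Storing the histogram sparsely, keyed by the feature vector, keeps empty bins out of the model; a dense array would work as well when the number of bins per axis is fixed in advance.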

  13. Benchmark Plugins
      ● Yet another workload generator?
      ● Use existing benchmarks instead
      ● Benchmark plugin: translates chunk histograms into a workload description in the benchmark's language
        ♦ command line arguments for IOzone
        ♦ config files for Filebench or FIO
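As a rough illustration of what a benchmark plugin might do, the sketch below renders one chunk's statistics as the text of an fio job section. The function name `fio_job` and its inputs are hypothetical, random access (`rw=randrw`) is an assumption, and the output is incomplete as a job: a real one would also need a target such as `filename=`. The option names used (`rw`, `rwmixread`, `bssplit`, `runtime`, `time_based`) are fio job-file options as commonly documented, so check them against the fio manual before relying on them.

```python
def fio_job(name, read_count, write_count, size_counts_kb, runtime_s):
    """Render one chunk as an fio job section (hypothetical plugin, not the authors' code).

    size_counts_kb maps I/O size in KB to the number of requests of that size.
    """
    total_ops = read_count + write_count
    read_pct = round(100 * read_count / total_ops)

    # Convert size counts to integer percentages that sum to exactly 100.
    total = sum(size_counts_kb.values())
    sizes = sorted(size_counts_kb)
    pcts = [100 * size_counts_kb[s] // total for s in sizes]
    pcts[-1] += 100 - sum(pcts)  # give the rounding remainder to the last bucket

    bssplit = ":".join(f"{s}k/{p}" for s, p in zip(sizes, pcts))
    lines = [
        f"[{name}]",
        "rw=randrw",               # assumption: random mixed reads and writes
        f"rwmixread={read_pct}",   # read percentage taken from the operation counts
        f"bssplit={bssplit}",      # block-size mix taken from the I/O size counts
        f"runtime={runtime_s}",
        "time_based",
    ]
    return "\n".join(lines) + "\n"

job = fio_job("chunk0", read_count=3, write_count=1,
              size_counts_kb={4: 2, 8: 2}, runtime_s=10)
print(job)
```

A plugin for IOzone would follow the same pattern but emit command-line arguments instead of a job section.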

  14. Overall Design
      [Figure: processing pipeline]
      Trace → Fixed Chunking → Histogram Collection → Deduplication → Benchmark Plugin → Workload description in benchmark's language → Benchmark
      Parameters of the stages:
      ● Fixed Chunking: initial time interval
      ● Histogram Collection: features and histogram granularity
      ● Deduplication: similarity metrics and threshold
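The deduplication stage takes a similarity metric and a threshold as parameters. A minimal sketch of that idea follows; the specific metric (normalized L1 distance between histograms) and the greedy first-match strategy are assumptions for illustration, since the slide does not name the metrics used.

```python
def distance(h1, h2):
    """Normalized L1 distance between two non-empty histograms (dict: bin -> count).

    Returns 0.0 for identical distributions and 1.0 for disjoint ones.
    """
    n1, n2 = sum(h1.values()), sum(h2.values())
    bins = set(h1) | set(h2)
    return 0.5 * sum(abs(h1.get(b, 0) / n1 - h2.get(b, 0) / n2) for b in bins)

def deduplicate(histograms, threshold):
    """Greedy deduplication: map each chunk to the first representative within threshold.

    Returns (representatives, mapping), where mapping[i] is the index of the
    representative chosen for chunk i.
    """
    reps, mapping = [], []
    for h in histograms:
        for j, r in enumerate(reps):
            if distance(h, r) <= threshold:
                mapping.append(j)       # close enough: reuse an existing representative
                break
        else:
            reps.append(h)              # nothing similar yet: keep as a new representative
            mapping.append(len(reps) - 1)
    return reps, mapping

h_a = {("read", 4): 90, ("write", 8): 10}
h_b = {("read", 4): 88, ("write", 8): 12}
h_c = {("write", 8): 100}
reps, mapping = deduplicate([h_a, h_b, h_c], threshold=0.05)
print(mapping)
```

Raising the threshold merges more chunks and shrinks the model; lowering it keeps more distinct chunks, which is the size-for-accuracy trade-off named in the design goals.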

  15. Outline
      1. Traces and their problems
      2. Workload models suitability
      3. Design of the model extractor
      4. Evaluation
      5. Conclusions

  16. Evaluation
      1. Replayed the trace
      2. Emulated workload
      3. Compared response (accuracy) parameters:
         ● Reads/sec
         ● Writes/sec
         ● Latency
         ● I/O utilization
         ● I/O queue length
         ● Request size
         ● CPU utilization
         ● Memory consumption
         ● Interrupts
         ● Context switches
         ● Wait processes
         ● Power

  17. Evaluation setup
      ● Physical setup
        ♦ Single node with physical disk drives
      ● Virtual setup
        ♦ VM with disk image on remote GPFS server
      ● Finance1
        ♦ OLTP applications
      ● MS-WBS
        ♦ Microsoft build server

  18. Finance1 on Physical System
      [Figure: throughput (ops/second, 0 to 300) over time (seconds, 0 to 18000) for four series: Reads/sec (replay), Reads/sec (emulation), Writes/sec (replay), Writes/sec (emulation).]
      ● Average relative error <10% across all parameters and systems
      ● 17−25× size reduction
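One way an average relative error between replay and emulation could be computed is sketched below. The slide does not define the metric, so this definition (mean of |emulation - replay| / |replay| over paired samples, skipping samples where the replay value is zero) is an assumption, and the numbers are made up for illustration.

```python
def average_relative_error(replay, emulation):
    """Mean of |e - r| / |r| over paired samples; pairs with r == 0 are skipped.

    Assumes at least one pair has a non-zero replay value.
    """
    pairs = [(r, e) for r, e in zip(replay, emulation) if r != 0]
    return sum(abs(e - r) / abs(r) for r, e in pairs) / len(pairs)

# Illustrative values only, not measurements from the talk.
replay_reads = [100, 200, 50]
emulated_reads = [110, 190, 50]
err = average_relative_error(replay_reads, emulated_reads)
print(f"{err:.1%}")
```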

  19. Outline
      1. Traces and their problems
      2. Workload models suitability
      3. Design of the model extractor
      4. Evaluation
      5. Conclusions

  20. Conclusions
      ● Extractor of workload models from traces
        ♦ Multi-dimensional histograms of feature functions
        ♦ Trace chunking
        ♦ Trade off accuracy for size reduction
        ♦ Standard benchmarks

  21. Future work
      ● More of everything
        ♦ Accuracy parameters, systems, traces
      ● File system traces
      ● Automatic selection of parameters
        ♦ Chunking interval, matrix granularity
      ● Operations on models

  22. Q & A
      Extracting Flexible, Replayable Models from Large Block Traces
      http://goo.gl/yFdrG
      Vasily Tarasov¹, Santhosh Kumar¹, Jack Ma², Dean Hildebrand³, Anna Povzner², Geoff Kuenning², Erez Zadok¹
      ¹ Stony Brook University, ² Harvey Mudd College, ³ IBM Research – Almaden
