T2M: Extracting Flexible, Replayable Models from Large Block Traces

  1. Extracting Flexible, Replayable Models from Large Block Traces (T2M)
     Vasily Tarasov (1), Santhosh Kumar (1), Jack Ma (2), Dean Hildebrand (3), Anna Povzner (2), Geoff Kuenning (2), Erez Zadok (1)
     (1) Stony Brook University, (2) Harvey Mudd College, (3) IBM Research – Almaden

  2. Outline
     1. Traces and their problems
     2. Workload models suitability
     3. Design of the model extractor
     4. Evaluation
     5. Conclusions
     Extracting Workload Models from Block Traces – FAST 2012, 2/11/2012

  3. Traces
     ● In the general case, any event can be traced (process forking, file accesses, user logins)
     ● Timestamp is a common field
     ● Other fields depend on the specific events traced
     ● We used block traces
     ● Our approach is valid for any trace
     Trace record (one row per event):
       Timestamp  Operation  I/O size  Offset
       0          read       4096      0
       0.5        read       4096      4096
       0.7        read       4096      8192
       1.3        write      8192      28762
       1.5        write      8192      32768
       1.6        read       4096      12288
       2.0        read       4096      14384
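
The trace record above can be held in a small typed structure. This is a minimal sketch in Python, assuming whitespace-separated fields in the order shown on the slide; the names `TraceRecord` and `parse_trace` are hypothetical and not part of T2M.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceRecord:
    timestamp: float  # first column of the slide's table
    op: str           # "read" or "write"
    size: int         # I/O size, as given in the trace
    offset: int       # offset of the request


def parse_trace(text: str) -> list[TraceRecord]:
    """Parse one record per non-empty line: timestamp op size offset."""
    records = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        ts, op, size, offset = fields
        records.append(TraceRecord(float(ts), op, int(size), int(offset)))
    return records


# The seven records from the slide's example table.
SLIDE_TRACE = """\
0   read  4096 0
0.5 read  4096 4096
0.7 read  4096 8192
1.3 write 8192 28762
1.5 write 8192 32768
1.6 read  4096 12288
2.0 read  4096 14384
"""

records = parse_trace(SLIDE_TRACE)
```

Traces of other event types would carry different fields after the timestamp, so only `timestamp` is expected to be common across record types.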

  4. Trace Use Cases
     ● Workload analysis and characterization
       ♦ Tune existing systems
       ♦ Design new systems
     ● Trace replay
       ♦ Evaluate, compare, and validate system behavior
     Highly valuable source
     There are problems

  5. Problems with Trace Replay
     ● Large in size
       ♦ Disturb results
       ♦ Replayer bottlenecks on I/O
       ♦ Cache pollution
       ♦ Hard to distribute
     ● Static objects
       ♦ Hard to intelligently and systematically modify the workload
       ♦ Not easy to compare

  7. Statistics Matter
     ● Monday's trace is not exactly the same as Tuesday's trace
     ● Responses are the same
     ● Statistics of the workload in the traces impact the system:
       ♦ read/write ratio
       ♦ I/O size
     ● Set of statistics depends on the specific system
     [Figure: Monday trace and Tuesday trace, marked "Same" for read/write ratio and I/O size; observe the system's response: latency, throughput, power, disk utilization]
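
The two statistics named on this slide can be computed directly from trace records. The sketch below is illustrative only: it uses the (operation, I/O size) pairs from the example trace on slide 3, reports the read share as a fraction of all requests, and uses the hypothetical helper names `read_fraction` and `size_distribution`.

```python
from collections import Counter

# (operation, I/O size) pairs taken from the example trace on slide 3.
trace = [
    ("read", 4096), ("read", 4096), ("read", 4096),
    ("write", 8192), ("write", 8192),
    ("read", 4096), ("read", 4096),
]


def read_fraction(trace):
    """Fraction of requests that are reads."""
    reads = sum(1 for op, _ in trace if op == "read")
    return reads / len(trace)


def size_distribution(trace):
    """Map each I/O size to the fraction of requests that use it."""
    counts = Counter(size for _, size in trace)
    return {size: n / len(trace) for size, n in counts.items()}
```

Two traces that differ record by record can still agree on both values, which is the sense in which the slide expects their system responses to match.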

  9. Design Goals
     ● Accuracy
       ♦ System responses match
     ● Conciseness
       ♦ Small model size
     ● Flexibility
       ♦ Trade model size for accuracy
       ♦ Existing benchmarks for workload generation
     ● Extensibility
       ♦ Statistics and benchmarks

  10. Trace Chunking
      ● Workload changes in the trace over time
      [Figure: I/O size over trace time; values shown: 8KB, 6KB, 2KB, 2KB, 2KB, 1KB, 0.5KB, 0.5KB]
      ● Chunk the trace:
        ♦ Fixed chunking first
        ♦ Then deduplicate chunks
        ♦ This often results in variable chunking
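
The step from fixed chunks to variable chunks can be sketched as follows. This is not T2M's code: it assumes each fixed chunk is summarized by a single I/O size, treats chunks as duplicates only when the summaries are exactly equal, and uses a made-up 10-second interval. The per-chunk values are the ones visible in the slide's figure, in an assumed order.

```python
from itertools import groupby


def deduplicate(chunk_summaries):
    """Give equal chunk summaries the same ID, keeping one copy of each."""
    unique = []  # one stored summary per distinct chunk
    ids = []     # for every fixed chunk, its index into `unique`
    for summary in chunk_summaries:
        if summary not in unique:
            unique.append(summary)
        ids.append(unique.index(summary))
    return unique, ids


def variable_chunks(ids, interval):
    """Merge runs of consecutive identical chunk IDs into (id, duration)."""
    return [(cid, len(list(run)) * interval) for cid, run in groupby(ids)]


# Per-chunk I/O sizes in KB from the figure (order assumed for illustration).
sizes_kb = [8, 6, 2, 2, 2, 1, 0.5, 0.5]
unique, ids = deduplicate(sizes_kb)
segments = variable_chunks(ids, interval=10)
```

Here eight fixed chunks collapse to five stored summaries, and the runs of equal chunks become longer segments of 30 and 20 seconds, which is how fixed chunking ends up behaving like variable chunking.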

  11. Within a Chunk
      ● Assume stationary workload
      ● Feature functions
        ♦ Trace field vector: $p = (p_1, p_2, \ldots, p_n)$
        ♦ Feature function: $f_1 = f_1(p, s_1)$, where $s_1$ is the state
        ♦ Feature function vector: $f = (f_1(p, s_1), f_2(p, s_2), \ldots, f_n(p, s_n))$
        ♦ Put into a multi-dimensional histogram

  12. Multi-Dimensional Histogram
      ● $p$:
        ♦ $p_1$ – operation: read – 0, write – 1
        ♦ $p_2$ – I/O size: in KB
        ♦ $p_3$ – offset: in KB
      ● $f$:
        ♦ $f_1 = p_1$ (operation)
        ♦ $f_2 = p_2$ (I/O size)
        ♦ $f_3 = \log(\mathrm{offset} - s_3.\mathrm{prev\_offset})$ (inter-arrival distance)
      [Figure: histogram with axes $f_1$ (operation: read – 0, write – 1), $f_2$ (I/O size, KB), and $f_3$ (inter-arrival distance, logarithmic, KB). Counts shown per I/O size:]
        I/O size (KB) | counts across $f_3$ buckets
        1             | 100  791   38   12
        2             |  60   95  412   32
        4             |  99   27   10  198
        8             |   0    0    0    0
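
A histogram over the three feature functions on this slide might be collected as sketched below. Several details are assumptions rather than T2M's definitions: the logarithm is taken base 2, the distance is taken as an absolute value, distances below 1 KB go to bucket 0, and the sample records are invented for illustration.

```python
import math
from collections import Counter


def build_histogram(records):
    """Count records per (operation, I/O size in KB, distance bucket)."""
    hist = Counter()
    prev_offset_kb = 0  # state s3: offset of the previous request
    for op, size_kb, offset_kb in records:
        f1 = 0 if op == "read" else 1  # f1 = p1 (operation)
        f2 = size_kb                   # f2 = p2 (I/O size)
        distance = abs(offset_kb - prev_offset_kb)
        # Assumption: bucket 0 holds distances below 1 KB; bucket k >= 1
        # holds distances in [2**(k-1), 2**k) KB.
        f3 = int(math.log2(distance)) + 1 if distance >= 1 else 0
        hist[(f1, f2, f3)] += 1
        prev_offset_kb = offset_kb
    return hist


# Hypothetical records: (operation, I/O size in KB, offset in KB).
sample = [
    ("read", 4, 0), ("read", 4, 4), ("read", 4, 8),
    ("write", 8, 28), ("write", 8, 32),
]
hist = build_histogram(sample)
```

Each key of `hist` is one cell of the multi-dimensional histogram; coarser or finer buckets for `f2` and `f3` would change the model size, which is the granularity knob mentioned elsewhere in the slides.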

  13. Benchmark Plugins
      ● Yet another workload generator?
      ● Use existing benchmarks instead
      ● Benchmark plugin: chunk histograms → benchmark plugins → workload description in the benchmark's language
        ♦ command line arguments for IOzone
        ♦ config files for Filebench or FIO
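
To make the plugin idea concrete, the sketch below renders one chunk's statistics as an fio job-file section. It is only an illustration: the function `fio_job`, the choice of `rw=randrw`, and the file path are assumptions, and the slides do not show what T2M's actual plugins emit. The fio options used (`filename`, `rw`, `rwmixread`, `bssplit`, `time_based`, `runtime`) are standard job-file options.

```python
def fio_job(name, read_pct, size_shares, runtime_s, filename):
    """Render one chunk's statistics as an fio job-file section.

    size_shares maps an I/O size in KB to its share of requests in
    percent; the shares are expected to sum to 100.
    """
    bssplit = ":".join(
        f"{kb}k/{pct}" for kb, pct in sorted(size_shares.items())
    )
    lines = [
        f"[{name}]",
        f"filename={filename}",
        "rw=randrw",                # assumption: mixed random I/O
        f"rwmixread={read_pct}",    # percentage of reads in the mix
        f"bssplit={bssplit}",       # block-size distribution
        "time_based",
        f"runtime={runtime_s}",     # chunk duration in seconds
    ]
    return "\n".join(lines) + "\n"


# Hypothetical chunk: 70% reads, 60% 4 KB and 40% 8 KB requests, 30 s long.
job = fio_job("chunk0", 70, {4: 60, 8: 40}, 30, "/tmp/t2m-example.img")
```

A plugin for a different benchmark would take the same chunk statistics and emit that tool's own format, for example command-line arguments instead of a job file.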

  14. Overall Design
      ● Pipeline: trace → fixed chunking → histogram collection → deduplication → benchmark plugin → workload description in the benchmark's language → benchmark
      ● Parameters:
        ♦ Initial time interval (fixed chunking)
        ♦ Features and histogram granularity (histogram collection)
        ♦ Similarity metrics and threshold (deduplication)
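
The deduplication stage depends on a similarity metric and a threshold. The slides do not name the metric, so the sketch below swaps in one plausible choice, half the L1 distance between normalized histograms, purely as an assumption; `distance` and `is_duplicate` are hypothetical names.

```python
def normalize(hist):
    """Scale bucket counts so that they sum to 1."""
    total = sum(hist.values())
    return {bucket: n / total for bucket, n in hist.items()}


def distance(hist_a, hist_b):
    """Half the L1 distance between normalized histograms, in [0, 1]."""
    a, b = normalize(hist_a), normalize(hist_b)
    buckets = a.keys() | b.keys()
    return 0.5 * sum(abs(a.get(k, 0.0) - b.get(k, 0.0)) for k in buckets)


def is_duplicate(hist_a, hist_b, threshold):
    """Treat two chunks as the same workload if they are close enough."""
    return distance(hist_a, hist_b) <= threshold
```

With this metric, a threshold of 0 merges only chunks with identical normalized histograms, while a larger threshold merges more chunks and shrinks the model at the cost of accuracy.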

  16. Evaluation
      1. Replayed the trace
      2. Emulated workload
      3. Compared response (accuracy) parameters
         ● CPU utilization
         ● Memory consumption
         ● Interrupts
         ● Context switches
         ● Wait processes
         ● Power
         ● Reads/sec
         ● Writes/sec
         ● Latency
         ● I/O utilization
         ● I/O queue length
         ● Request size

  17. Evaluation Setup
      ● Physical setup
        ♦ Single node with physical disk drives
      ● Virtual setup
        ♦ VM with disk image on remote GPFS server
      ● Finance1
        ♦ OLTP applications
      ● MS-WBS
        ♦ Microsoft build server

  18. Finance1 on Physical System
      [Figure: throughput (ops/second) over time (seconds, 0 to 18000); series: Reads/Sec - Replay, Reads/Sec - Emulation, Writes/Sec - Replay, Writes/Sec - Emulation]
      ● Average relative error <10% across all parameters and systems
      ● 17−25× size reduction
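
An average relative error between replay and emulation can be computed as sketched below. The slides do not spell out the exact definition, so this assumes the error of each parameter is measured relative to the replay value and then averaged; the numbers are made up for illustration and are not results from the talk.

```python
def relative_error(replay, emulation):
    """Error of the emulated value relative to the replayed value."""
    return abs(emulation - replay) / abs(replay)


def average_relative_error(replay_metrics, emulation_metrics):
    """Mean relative error over all parameters present in the replay."""
    errors = [
        relative_error(replay_metrics[name], emulation_metrics[name])
        for name in replay_metrics
    ]
    return sum(errors) / len(errors)


# Made-up measurements, not results from the talk.
replay = {"reads_per_sec": 200.0, "writes_per_sec": 100.0}
emulation = {"reads_per_sec": 190.0, "writes_per_sec": 105.0}
avg_error = average_relative_error(replay, emulation)
```

For these invented values both parameters are off by 5%, so the average is 5%.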

  20. Conclusions
      ● Extractor of workload models from traces
        ♦ Multi-dimensional histograms of feature functions
        ♦ Trace chunking
        ♦ Trade off accuracy for size reduction
        ♦ Standard benchmarks

  21. Future Work
      ● More of everything
        ♦ Accuracy parameters, systems, traces
      ● File system traces
      ● Automatic selection of parameters
        ♦ Chunking interval, matrix granularity
      ● Operations on models

  22. Extracting Flexible, Replayable Models from Large Block Traces
      http://goo.gl/yFdrG
      Q & A
      Vasily Tarasov (1), Santhosh Kumar (1), Jack Ma (2), Dean Hildebrand (3), Anna Povzner (2), Geoff Kuenning (2), Erez Zadok (1)
      (1) Stony Brook University, (2) Harvey Mudd College, (3) IBM Research – Almaden
