Warming up Storage-level Caches with Bonfire

Yiying Zhang, Gokul Soundararajan, Mark W. Storer, Lakshmi N. Bairavasundaram, Sethuraman Subbiah, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau

Does on-demand cache warmup still work?
[Chart: Memory (Cache) Size (GB) vs. Year, 1990-2015 — cache sizes have grown from under 10 GB to tens of TBs]
Warming up a 1 TB cache on demand (plus idle time): Random: 6 days; Sequential: 2.5 hours
[Chart: Read Hit Rate (%) vs. Time (hour) for an always-warm cache vs. a cold cache]
* Simulation results from a project server trace
Cache warmup:
▫ Key component to meet application SLAs
▫ Reduce storage server I/O load

Warmup scenarios:
▫ Storage server restart
▫ Storage server take-over
▫ Dynamic caching [Narayanan'08, Bairavasundaram'12]
▫ Monitors and logs I/Os
▫ Loads warmup data in bulk

▫ What to monitor & log? Effective
▫ How to monitor & log? Efficient
▫ How to load warmup data? Fast
▫ General solution
[Diagram: the Bonfire Monitor inside the storage system observes volume I/O, logs it to a logging volume, and produces warmup information]
▫ Temporal and spatial patterns of reaccesses
▫ Up to 100% warmup time improvement over on-demand
▫ Up to 200% more server I/O load reduction
▫ Up to 5 times lower read latency
▫ Low overhead
▫ 36 one-week block-level traces from MSR-Cambridge data center servers
▫ Filtered out write-intensive, small-working-set, and low-reaccess-rate traces
Server  Function               #Volumes
mds     Media server           1
prn     Print server           1
proj    Project directories    3
src1    Source control         1
usr     User home directories  2
web     Web/SQL server         1
Reaccesses: Read-After-Reads (RAR) and Read-After-Writes (RAW)
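The reaccess definition above can be sketched as a single pass over a block trace. The `(time, op, lba)` record format here is an assumption for illustration, not the MSR trace format.

```python
# Sketch: detect reaccesses (Read-After-Read and Read-After-Write)
# in a block trace. Records are (time, op, lba) tuples; this format
# is an illustrative assumption.

def find_reaccesses(trace):
    """Return (kind, lba, temporal_distance) for each read whose
    block was accessed before: 'RAR' if the previous access was a
    read, 'RAW' if it was a write."""
    last_access = {}  # lba -> (time, op) of most recent access
    reaccesses = []
    for time, op, lba in trace:
        if op == 'R' and lba in last_access:
            prev_time, prev_op = last_access[lba]
            kind = 'RAR' if prev_op == 'R' else 'RAW'
            reaccesses.append((kind, lba, time - prev_time))
        last_access[lba] = (time, op)
    return reaccesses

trace = [(0, 'R', 100), (5, 'W', 200), (60, 'R', 100), (90, 'R', 200)]
print(find_reaccesses(trace))  # [('RAR', 100, 60), ('RAW', 200, 85)]
```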
          Time                                              Space
Relative  Q1: What is the temporal distance?                Q3: What is the spatial distance? Any clustering of reaccesses?
Absolute  Q2: When do reaccesses happen (wall-clock time)?  Q4: Where do reaccesses happen (LBA)?
[Chart: reaccesses plotted as LBA vs. wall-clock time (12 AM-6 AM), with example temporal distances of 1 hour and 5 hours marked]
[Chart: Amount of Reaccesses (%) vs. Temporal Distance (hour), over a full week (168 hours)]
Time between Reaccesses for All Traces
Within an hour: Hourly; 23-25 hours: Daily
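The hourly/daily buckets can be sketched as a simple classifier over temporal distances in hours, using the thresholds stated above (within 1 hour; 23-25 hours); the function name is mine.

```python
# Sketch: bucket reaccess temporal distances (in hours) into the
# Hourly / Daily / Other categories used in the trace study.

def classify(distance_hours):
    if distance_hours <= 1.0:
        return 'Hourly'          # reaccess within an hour
    if 23.0 <= distance_hours <= 25.0:
        return 'Daily'           # reaccess around one day later
    return 'Other'

distances = [0.2, 0.9, 24.1, 23.5, 6.0]
counts = {}
for d in distances:
    k = classify(d)
    counts[k] = counts.get(k, 0) + 1
print(counts)  # {'Hourly': 2, 'Daily': 2, 'Other': 1}
```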
[Chart: breakdown of reaccesses (%) into Hourly, Daily, and Other for each trace: mds_1, proj_1, proj_4, usr_1, src1_1, web_2, prn_1, proj_2, usr_2]
Traces group into Hourly-dominated, Daily-dominated, and Other
[Chart: Daily Reaccesses (%) vs. Time (day) over 7 days for src1_1 and web_2]
[Chart: breakdown of reaccesses (%) by spatial distance (<10MB, 10MB-1GB, 1-10GB, 10-100GB, >100GB) for each trace: mds1, proj1, proj4, usr1, src11, web2, prn1, proj2, usr2; traces grouped as Hourly-dominated, Daily-dominated, and Other]
[Chart: Percentage of Reaccesses within a 1MB Region (%) for each trace, split into Hourly, Daily, and Other]
          Time                                               Space
Relative  A1: Reaccesses show two main temporal patterns:    A3: Hourly reaccesses are close in spatial distance;
          within 1 hour and around 1 day                         daily reaccesses exhibit spatial clustering
Absolute  A2: Daily reaccesses correlate with wall-clock     A4: No hot spot of reaccesses in LBA space
          time
[Chart: Read Hit Rate (%) vs. Time for an always-warm cache and a new cache; strict and loose convergence times are marked where the new cache's hit rate approaches the always-warm cache's]
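One way to sketch the two convergence times, given sampled hit-rate curves: strict convergence when the new cache first stays within a small margin of always-warm, loose convergence with a larger margin. The margins here are illustrative assumptions, not the paper's exact definitions.

```python
# Sketch: convergence time of a new cache against an always-warm
# cache, from sampled hit-rate curves. The margins are illustrative
# assumptions, not Bonfire's exact definitions.

def converge_time(new_cache, always_warm, margin):
    """Index of the first sample where the new cache's hit rate is
    within `margin` points of the always-warm cache's and stays
    there for all later samples."""
    for t in range(len(new_cache)):
        if all(w - n <= margin
               for n, w in zip(new_cache[t:], always_warm[t:])):
            return t
    return None

warm = [90, 90, 90, 90, 90]
new  = [10, 50, 80, 88, 90]
print(converge_time(new, warm, margin=10))  # loose  -> 2
print(converge_time(new, warm, margin=2))   # strict -> 3
```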
Metrics:
▫ Cache hit fraction = (amount of I/Os going to cache) / (total I/Os), measured during convergence time
▫ Relative load reduction = (server I/O load reduction of Bonfire) / (server I/O load reduction of on-demand)
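The two metrics work out as simple ratios; a quick sketch (function and variable names are mine):

```python
# Sketch of the two warmup metrics; names are illustrative.

def cache_hit_fraction(ios_to_cache, total_ios):
    """Fraction of I/Os served from the cache during convergence."""
    return ios_to_cache / total_ios

def relative_load_reduction(bonfire_reduction, on_demand_reduction):
    """Bonfire's server I/O load reduction relative to on-demand's."""
    return bonfire_reduction / on_demand_reduction

print(cache_hit_fraction(800, 1000))      # 0.8
print(relative_load_reduction(0.6, 0.3))  # 2.0 -> 100% more reduction
```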
[Diagram: Bonfire's monitoring window — the last 24 hours of I/Os before the cache starts]
Region: the granularity of monitoring and logging (e.g., 1MB)
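Region-granularity monitoring amounts to mapping each block address to its region number; 512-byte sectors and the 1MB region size from the slide are the assumptions here.

```python
# Sketch: map a block address (LBA) to its monitoring region.
# Assumes 512-byte sectors and the 1MB region size from the slide.

SECTOR_SIZE = 512
REGION_SIZE = 1 * 1024 * 1024                    # 1MB regions
SECTORS_PER_REGION = REGION_SIZE // SECTOR_SIZE  # 2048

def region_of(lba):
    return lba // SECTORS_PER_REGION

print(region_of(0))     # 0
print(region_of(2047))  # 0  (still inside the first 1MB)
print(region_of(2048))  # 1
```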
▫ Warmup time: improves 14% to 100%
▫ Server I/O load reduction: improves 44% to 228%
▫ Low-overhead monitoring and logging (efficient)
▫ Bulk loading of useful warmup data (effective and fast)
▫ General design applicable to a range of scenarios

▫ Last-K
▫ Monitors I/O below the server buffer cache
▫ Performance snapshot
[Diagram: Bonfire architecture — the Bonfire Monitor sits in the storage system below the buffer cache, observing I/O to the data volumes; warmup metadata and warmup data pass through an in-memory staging buffer and are written to a dedicated logging volume; a performance snapshot is taken periodically]
▫ Metadata-only: store warmup metadata only
▫ Metadata+data: store warmup metadata and data
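The two logging schemes might be sketched as log records; the field names and layouts are illustrative assumptions, not Bonfire's actual on-disk format. Metadata-only logs just enough to re-read blocks from the data volumes later; metadata+data also copies block contents to the logging volume.

```python
# Sketch of the two Bonfire logging schemes; record layouts are
# illustrative assumptions, not the actual on-disk format.
from dataclasses import dataclass

@dataclass
class MetadataRecord:          # metadata-only scheme
    region: int                # which region was accessed
    timestamp: float           # when (for recency ordering)

@dataclass
class MetadataDataRecord:      # metadata+data scheme
    region: int
    timestamp: float
    data: bytes                # block contents, so warmup can be
                               # served from the logging volume

def log_access(region, timestamp, data=None):
    """Emit a record for one monitored access."""
    if data is None:
        return MetadataRecord(region, timestamp)
    return MetadataDataRecord(region, timestamp, data)

print(type(log_access(42, 100.0)).__name__)        # MetadataRecord
print(type(log_access(42, 100.0, b'x')).__name__)  # MetadataDataRecord
```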
[Diagram: warmup loading — Bonfire scans the logged warmup metadata from the newest entry (n) backward, sorts the selected entries by LBA, and bulk-loads the warmup data from the data volumes (metadata-only) or the logging volume (metadata+data) into the new cache]
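The loading step — newest metadata entries first, then bulk reads sorted by LBA so they are sequential — can be sketched as below; `read_region` stands in for a bulk device read and is hypothetical.

```python
# Sketch: warmup loading. Scan warmup metadata newest-first, keep
# the most recent entry per region, then sort the chosen regions by
# LBA so the bulk reads are sequential. `read_region` stands in for
# a bulk device read and is hypothetical.

def plan_warmup(metadata):
    """metadata: list of (timestamp, region), oldest first."""
    newest = {}
    for timestamp, region in reversed(metadata):  # newest first
        if region not in newest:
            newest[region] = timestamp
    return sorted(newest)  # load order: ascending region/LBA

def warm_cache(metadata, read_region, cache):
    for region in plan_warmup(metadata):
        cache[region] = read_region(region)

meta = [(1, 7), (2, 3), (3, 7), (4, 9)]
print(plan_warmup(meta))  # [3, 7, 9]
```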
Evaluation (1 week):
▫ Always-warm, on-demand, and Bonfire
▫ Metadata-only and metadata+data
▫ Replay traces using sync I/Os
▫ Synthetic workloads
▫ MSR-Cambridge traces
▫ Benefits and overheads
[Timeline: trace days 1-7, comparing On-demand, Bonfire, and Always-warm]
[Chart: Read Hit Rate (%) vs. number of I/Os (x1,000,000) for Always-warm, Cold, Bonfire, and On-demand; cache starts on Day 5, shown through Day 6; the points where On-demand and Bonfire converge are marked]
* Results of a project server trace from the MSR-Cambridge trace set
[Chart: Read Latency (ms) vs. number of I/Os (x1,000,000) for Always-warm, Cold, Bonfire, and On-demand; cache starts on Day 5, shown through Day 6]
* Results of a project server trace from the MSR-Cambridge trace set
Monitoring and logging overheads:
▫ In-memory staging buffer: 256KB & 128MB
▫ Warmup metadata: 19MB to 71MB, logged at 9KB/s to 476KB/s
▫ Warmup data: 9.5GB to 36GB, logged at 4.6MB/s to 238MB/s
Warmup loading (with the proper Bonfire scheme and configuration):
▫ Staging buffer: 256KB & 128MB; warmup metadata: 19MB to 71MB
▫ Loading the new cache takes 2 to 20 minutes
▫ Avg read latency 1/5 to 2/3 of on-demand
▫ Warming up terabytes of cache takes days
▫ Client-side cache warmup
▫ Application-aware warmup
▫ …