

SLIDE 1

Warming up Storage-level Caches with Bonfire

Yiying Zhang, Gokul Soundararajan, Mark W. Storer, Lakshmi N. Bairavasundaram, Sethuraman Subbiah, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau

SLIDE 2

Does on-demand cache warmup still work?

SLIDE 3

[Figure: Memory (cache) size (GB) vs. year, 1990-2015: cache sizes have grown from under 10 GB to tens of TBs]

Filling a 1 TB cache takes ~6 days with random reads or ~2.5 hours with sequential reads (plus idle time).

SLIDE 4

How Long Does On-demand Warmup Take?

[Figure: Read hit rate (%) over 24 hours for an always-warm cache vs. a cold cache warmed on demand*]

* Simulation results from a project server trace

• The figure shows the read hit rate difference between the always-warm cache and on-demand warmup.

On-demand warmup takes hours to days.

SLIDE 5

To Make Things Worse

• Caches are critical
  ▫ Key component to meet application SLAs
  ▫ Reduce storage server I/O load
• Cache warmup happens often
  ▫ Storage server restart
  ▫ Storage server take-over
  ▫ Dynamic caching [Narayanan'08, Bairavasundaram'12]

SLIDE 6

On-demand Warmup Doesn't Work Anymore. What Can We Do?

• Bonfire
  ▫ Monitors and logs I/Os
  ▫ Loads warmup data in bulk
• Challenges
  ▫ What to monitor & log? (effective)
  ▫ How to monitor & log? (efficient)
  ▫ How to load warmup data? (fast)
  ▫ A general solution

[Diagram: the Bonfire monitor watches I/O in the storage system and writes warmup information to a logging volume]

SLIDE 7

Summary of Contributions

• Trace analysis for storage-level cache warmup
  ▫ Temporal and spatial patterns of reaccesses
• Cache warmup algorithm design and simulation
• Implementation and evaluation of Bonfire
  ▫ Up to 100% warmup time improvement over on-demand
  ▫ Up to 200% more server I/O load reduction
  ▫ Up to 5 times lower read latency
  ▫ Low overhead

SLIDE 8
  • Introduction
  • Trace analysis for cache warmup
  • Cache warmup algorithm study with simulation
  • Bonfire architecture
  • Evaluation results
  • Conclusion

Outline


SLIDE 9

Workload Study – Trace Selection

• MSR-Cambridge [Narayanan'08]
  ▫ 36 one-week block-level traces from MSR-Cambridge data center servers
  ▫ Filter out traces that are write-intensive, have small working sets, or have low reaccess rates

  Server  Function               #Volumes
  mds     Media server           1
  prn     Print server           1
  proj    Project directories    3
  src1    Source control         1
  usr     User home directories  2
  web     Web/SQL server         1

Reaccesses: reads after reads and reads after writes

SLIDE 10

Questions for Trace Study

            Time                                          Space
  Relative  Q1: What is the temporal distance?            Q3: What is the spatial distance? Any clustering of reaccesses?
  Absolute  Q2: When do reaccesses happen (wall clock)?   Q4: Where do reaccesses happen (LBA)?

[Figure: Example trace plotted as LBA vs. wall-clock time (12 AM to 6 AM), with reaccess distances of 1 hour and 5 hours marked]
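The four questions above boil down to measuring the time and LBA gaps between accesses to the same block. Below is a minimal sketch of that measurement, assuming a trace of (timestamp, LBA, op) tuples; the exact definitions are illustrative, not the paper's precise methodology.

```python
# Measure the temporal distance of reaccesses (Q1) from a block-level trace.
# Assumes `trace` is an iterable of (timestamp_sec, lba, op) tuples.
def reaccess_temporal_distances(trace):
    """Yield the time gap between each read and the previous access
    (read or write) to the same LBA, i.e., a reaccess distance."""
    last_seen = {}  # lba -> timestamp of its most recent access
    for ts, lba, op in trace:
        if op == "read" and lba in last_seen:
            yield ts - last_seen[lba]
        last_seen[lba] = ts  # reads and writes both count as prior accesses

def classify(distance_sec):
    """Bucket a reaccess as hourly, daily, or other (cf. Slide 11)."""
    hours = distance_sec / 3600.0
    if hours <= 1:
        return "hourly"
    if 23 <= hours <= 25:
        return "daily"
    return "other"
```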

SLIDE 11

Q1: What is the Temporal Distance?

[Figure: Amount of reaccesses (%) vs. temporal distance (hours, up to 168): time between reaccesses for all traces]

Reaccesses within an hour are classified as hourly; those 23-25 hours apart as daily.

SLIDE 12

Q1: What is the Temporal Distance?

A1: Two main reaccess patterns: hourly & daily

[Figure: Amount of reaccesses (%) split into hourly, daily, and other for each trace (mds_1, proj_1, proj_4, usr_1, src1_1, web_2, prn_1, proj_2, usr_2), with the traces grouped as hourly-dominated, daily-dominated, and other]

Within an hour, recently accessed blocks are more likely to be reaccessed.

SLIDE 13

Q2: When Do Reaccesses Happen (Wall Clock Time)?

[Figure: Daily reaccesses (%) over seven days for src1_1 and web_2]

A2: Daily reaccesses happen at the same time every day.

SLIDE 14

Q3: What is the Spatial Distance?

A3: Spatial distance is usually small for hourly reaccesses and sometimes small for other reaccesses.

[Figure: Amount of reaccesses (%) per trace, bucketed by spatial distance (<10MB, 10MB-1GB, 1-10GB, 10-100GB, >100GB), with the traces grouped as hourly-dominated, daily-dominated, and other]

SLIDE 15

Q3: Any Spatial Clustering Among Reaccesses?

• Metric: percentage of 1MB regions that have reaccesses

A3: Daily reaccesses are more spatially clustered.

[Figure: Percentage of reaccesses in 1MB regions (%) for each trace, split into hourly, daily, and other]

SLIDE 16

Trace Analysis Summary and Implications

• A1 (relative, time): Reaccesses have two main temporal patterns: within 1 hour and around 1 day
• A2 (absolute, time): Daily reaccesses correlate with wall-clock time
• A3 (relative, space): Hourly reaccesses are close in spatial distance; daily reaccesses exhibit spatial clustering
• A4 (absolute, space): No hot spots of reaccesses in LBA space

Implications:
• A1, hourly: use recently accessed blocks
• A1 and A2, daily: use the same period from the previous day
• A3, small spatial distance: the monitoring buffer can be small

SLIDE 17
  • Introduction
  • Trace analysis for cache warmup
  • Cache warmup algorithm study with simulation
  • Bonfire architecture
  • Evaluation results
  • Conclusion

Outline


SLIDE 18

Metrics: Warmup Time

• Warmup period: hit-rate convergence time

[Figure: Read hit rate (%) over time for an always-warm cache vs. a new cache, annotated with strict and loose convergence times]
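As a concrete reading of this metric, here is a sketch that finds the convergence point from two hit-rate series; the strict and loose tolerance values are assumptions for illustration, not the paper's thresholds.

```python
# Find when a new cache's hit rate converges to the always-warm cache's.
def convergence_time(new_hits, warm_hits, tolerance):
    """Return the first index after which new_hits stays within
    `tolerance` percentage points of warm_hits, or None if it never does."""
    for t in range(len(new_hits)):
        if all(warm_hits[i] - new_hits[i] <= tolerance
               for i in range(t, len(new_hits))):
            return t
    return None

warm = [90, 90, 91, 90, 90, 91]  # always-warm hit rate (%) per interval
new  = [10, 45, 70, 85, 89, 90]  # new cache hit rate (%) per interval
print(convergence_time(new, warm, tolerance=2.0))   # strict -> 4
print(convergence_time(new, warm, tolerance=10.0))  # loose -> 3
```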

SLIDE 19

Metrics: Server I/O Reduction

• Storage server I/O load reduction (measured during the convergence time):

  I/O load reduction = (amount of I/Os going to the cache) / (total I/Os)

• Improvement in server I/O load reduction:

  Improvement = (server I/O load reduction of Bonfire) / (server I/O load reduction of on-demand)
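A worked example of the two metrics, with made-up numbers purely for illustration:

```python
# Server I/O load reduction: the fraction of I/Os absorbed by the cache
# during the convergence time (higher is better).
def io_load_reduction(ios_to_cache, total_ios):
    return ios_to_cache / total_ios

bonfire   = io_load_reduction(ios_to_cache=800_000, total_ios=1_000_000)  # 0.8
on_demand = io_load_reduction(ios_to_cache=400_000, total_ios=1_000_000)  # 0.4

# Improvement in server I/O load reduction: Bonfire relative to on-demand.
improvement = bonfire / on_demand  # 2.0, i.e., Bonfire absorbs twice the load
```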

SLIDE 20

Cache Warmup Algorithms

• Last-K: the last K regions accessed in the trace
• First-K: the first K regions accessed in the past 24 hours
• Top-K: the K most frequently accessed regions
• Random-K: K regions chosen at random

[Diagram: timeline of I/Os over the 24 hours before the cache starts, marking where Last-K (L), First-K (F), Top-K (T), and Random-K (R) draw their regions]

Region: the granularity of monitoring and logging (e.g., 1MB)
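The four policies are simple to state precisely. A sketch follows, assuming `accesses` is the list of region IDs touched in the 24 hours before the cache starts, oldest first; this is an illustration, not the paper's implementation.

```python
# The four warmup-region selection policies over a window of accesses.
import random
from collections import Counter

def last_k(accesses, k):
    """Last-K: the K distinct regions accessed most recently."""
    picked, seen = [], set()
    for region in reversed(accesses):
        if region not in seen:
            seen.add(region)
            picked.append(region)
            if len(picked) == k:
                break
    return picked

def first_k(accesses, k):
    """First-K: the first K distinct regions accessed in the window."""
    picked, seen = [], set()
    for region in accesses:
        if region not in seen:
            seen.add(region)
            picked.append(region)
            if len(picked) == k:
                break
    return picked

def top_k(accesses, k):
    """Top-K: the K most frequently accessed regions."""
    return [r for r, _ in Counter(accesses).most_common(k)]

def random_k(accesses, k):
    """Random-K: K regions sampled uniformly from those accessed."""
    return random.sample(list(set(accesses)), k)
```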

SLIDE 21

Simulation Results - Overall

• LRU cache simulator with the four warmup algorithms
• Convergence time
  ▫ Improves by 14% to 100%
• Server I/O load reduction
  ▫ Improves by 44% to 228%
• In general, Last-K is the best
• First-K works for special cases (known patterns)

SLIDE 22
  • Introduction
  • Trace analysis for cache warmup
  • Cache warmup algorithm study with simulation
  • Bonfire architecture
  • Evaluation results
  • Conclusion

Outline


SLIDE 23

Bonfire Design

• Design principles
  ▫ Low-overhead monitoring and logging (efficient)
  ▫ Bulk loading of useful warmup data (effective and fast)
  ▫ General design applicable to a range of scenarios
• Techniques
  ▫ Last-K
  ▫ Monitor I/O below the server buffer cache
  ▫ Performance snapshots

SLIDE 24

Bonfire Architecture: Monitoring

[Diagram: the Bonfire monitor sits below the buffer cache in the storage system. It records warmup metadata for I/Os to the data volumes in an in-memory staging buffer, which is flushed to a logging volume as numbered performance snapshots (1, 2, ..., n), optionally along with the warmup data itself]

Two schemes: metadata-only stores only warmup metadata; metadata+data stores warmup metadata and data.
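To make the monitoring path concrete, here is a minimal sketch of the metadata-only scheme. The class, field names, and snapshot format are illustrative assumptions, not Bonfire's actual implementation.

```python
# Sketch of Bonfire-style monitoring below the buffer cache: each I/O's
# region is appended to an in-memory staging buffer, which is flushed to
# the logging volume as a snapshot when full (metadata-only scheme).
class BonfireMonitor:
    def __init__(self, log_file, buffer_entries=64 * 1024, region_mb=1):
        self.log = log_file                  # append-only logging volume
        self.capacity = buffer_entries
        self.region_bytes = region_mb * 1024 * 1024
        self.staging = []                    # in-memory staging buffer

    def observe(self, byte_offset, is_write):
        """Called for every I/O that misses the server buffer cache."""
        region = byte_offset // self.region_bytes
        self.staging.append((region, is_write))
        if len(self.staging) >= self.capacity:
            self.flush()

    def flush(self):
        """Write one performance snapshot of warmup metadata."""
        for region, is_write in self.staging:
            self.log.write(f"{region},{int(is_write)}\n")
        self.log.flush()
        self.staging.clear()
```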

SLIDE 25

Bonfire Architecture: Bulk Cache Warmup

[Diagram: to warm a new cache, Bonfire reads back the most recent performance snapshots (n, n-1, ..., k) plus the in-memory staging buffer, sorts the entries by LBA, and bulk-loads the warmup data into the new cache from the data volumes (metadata-only) or the logging volume (metadata+data)]
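The key point of the bulk warmup step is sorting by LBA so warmup data is fetched with large sequential reads rather than random ones. A sketch under that assumption, where `data_volume` (a seekable file-like object) and `cache.insert` are hypothetical stand-ins for the storage system's own interfaces:

```python
# Bulk-load the selected regions into the new cache, sorted by LBA.
def bulk_warmup(regions, data_volume, cache, region_bytes=1024 * 1024):
    for region in sorted(regions):             # sort by LBA: sequential reads
        data_volume.seek(region * region_bytes)
        data = data_volume.read(region_bytes)  # one large sequential read
        cache.insert(region, data)             # pre-populate the new cache
```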

SLIDE 26
  • Introduction
  • Trace analysis for cache warmup
  • Cache warmup algorithm study with simulation
  • Bonfire architecture
  • Evaluation results
  • Conclusion

Outline


SLIDE 27

Evaluation Setup

• Implemented Bonfire as a trace replayer
  ▫ Always-warm, on-demand, and Bonfire
  ▫ Metadata-only and metadata+data
  ▫ Replays traces using synchronous I/Os
• Workloads
  ▫ Synthetic workloads
  ▫ MSR-Cambridge traces
• Metrics
  ▫ Benefits and overheads

[Diagram: one-week replay timeline (days 1-7); the always-warm cache runs throughout, while the Bonfire and on-demand caches start late in the week]

SLIDE 28

Benefit Results - Read Hit Rate of MSR Trace*

[Figure: Read hit rate (%) vs. number of I/Os (millions) over days 5-6 after the cache starts, for always-warm, Bonfire, and cold (on-demand) caches; Bonfire converges to the always-warm hit rate well before on-demand does]

* Results of a project server trace from the MSR-Cambridge trace set

• Higher read hit rate => less server I/O load

SLIDE 29

Benefit Results - Read Latency of MSR Trace*

[Figure: Read latency (ms) vs. number of I/Os (millions) over days 5-6 after the cache starts, for always-warm, Bonfire, and cold (on-demand) caches]

* Results of a project server trace from the MSR-Cambridge trace set

• Lower read latency => better application-perceived performance

SLIDE 30

Overhead Results

[Diagram: monitoring overheads annotated on the architecture: in-memory staging buffer of 256KB & 128MB; performance snapshots of 19MB to 71MB; logging rates of 9KB/s to 476KB/s and 4.6MB/s to 238MB/s; logging volume space of 9.5GB to 36GB]

SLIDE 31

Overhead Results

[Diagram: the same monitoring overheads (staging buffer of 256KB & 128MB; snapshots of 19MB to 71MB; logging rates of 9KB/s to 476KB/s and 4.6MB/s to 238MB/s; logging volume space of 9.5GB to 36GB), plus warmup overhead: bulk-loading the new cache takes 2 to 20 minutes]

Overheads stay small with the proper Bonfire scheme and configuration.

SLIDE 32

Summary of Results

• Faster cache warmup
  ▫ 59% to 100% improvement over on-demand
• Less storage server I/O load
  ▫ 38% to 200% more reduction than on-demand
• Better application-perceived latency
  ▫ Average read latency is 1/5 to 2/3 that of on-demand
• Small, controllable overhead

SLIDE 33

Conclusion

On-demand warmup doesn't work anymore
  ▫ Warming up terabytes of cache takes days

Bonfire and beyond
  ▫ Client-side cache warmup
  ▫ Application-aware warmup
  ▫ ...

We need more long, large public traces!

SLIDE 34

Thank You! Questions?

http://wisdom.cs.wisc.edu/home http://research.cs.wisc.edu/adsl