
FRD: A Filtering based Buffer Cache Algorithm that Considers both Frequency and Reuse Distance

33rd International Conference on Massive Storage Systems and Technology (MSST 2017), Santa Clara, CA, May 15-19, 2017. FRD: A Filtering based Buffer Cache Algorithm that Considers both Frequency and Reuse Distance. Sejin Park and Chanik Park.


  1. FRD: A Filtering based Buffer Cache Algorithm that Considers both Frequency and Reuse Distance
     Sejin Park* and Chanik Park**
     33rd International Conference on Massive Storage Systems and Technology (MSST 2017), Santa Clara, CA, May 15-19, 2017
     2017. 05. 18
     *SK telecom Corporate R&D Center | Network IT Convergence R&D Center, New Computing Lab
     **POSTECH Department of Computer Science and Engineering, System Software Lab

  2. Table of Contents
     • Motivation
     • Workload Analysis and Observations
     • Design
     • Evaluation
     • Summary

  3. Motivation
     • Buffer cache management is one of the oldest topics in computer science
     • Existing buffer cache algorithms concentrate on how to retain meaningful blocks
       – LRU, LFU, OPT, …
       – LIRS (ACM SIGMETRICS 2002, S. Jiang et al.): two LRU stacks (LIR stack, HIR stack), reuse distance ordering
       – ARC (USENIX FAST 03, Megiddo et al.): two LRU stacks (Recency-T1, Frequency-T2), adaptive resizing
     • In this study, we concentrate on how to exclude cache-unfriendly blocks
       – We analyzed real-world workloads and identified the characteristics of cache-unfriendly blocks

  4. Example: LRU
     • Depending on the eviction policy, blocks that pollute the cache can be kept in cache space
     • LRU assumes that recently used blocks will produce more cache hits
       – If the recently used blocks are infrequently accessed and rarely used, they cause cache pollution!
     [Figure: an LRU stack holding infrequently accessed (I) and frequently accessed (F) blocks, with new block insertion at the MRU end and eviction at the LRU end.]
     Legend: I = infrequently accessed block (cache-unfriendly block); F = frequently accessed block
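The pollution effect described on this slide can be reproduced with a few lines of simulation. The following is a minimal sketch, not taken from the presentation: the function name `lru_hits`, the trace, and the cache size are all illustrative assumptions. It shows a short run of one-time blocks pushing a small set of reused blocks out of an LRU cache.

```python
from collections import OrderedDict

def lru_hits(trace, cache_size):
    """Count cache hits for a plain LRU cache (hypothetical helper)."""
    cache = OrderedDict()              # first key = LRU end, last key = MRU end
    hits = 0
    for block in trace:
        if block in cache:
            hits += 1
            cache.move_to_end(block)   # refresh recency on a hit
        else:
            if len(cache) >= cache_size:
                cache.popitem(last=False)  # evict the LRU block
            cache[block] = True
    return hits

hot = [0, 1, 2]                  # blocks that are reused
scan = list(range(100, 104))     # blocks that are touched once (illustrative)

print(lru_hits(hot + hot, cache_size=4))          # reuse without interference
print(lru_hits(hot + scan + hot, cache_size=4))   # reuse after a one-time scan
```

In the second trace the four one-time blocks fill the whole cache, so every later access to the reused blocks misses.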

  5. Example: ARC
     • The recency buffer T1 and the frequency buffer T2 in ARC each work as an LRU cache
     • If a block is reused, it moves into T2 even if it is an infrequently accessed block
       – This can cause cache pollution in T2
     [Figure: ARC's T1 and T2 lists, each drawn from MRU to LRU with its own history buffer. New blocks are inserted into T1; reused blocks (moved from T1 or T2) are inserted into T2. The lists hold blocks marked I and F.]
     Legend: I = infrequently accessed block (cache-unfriendly block); F = frequently accessed block

  6. Workload Description
     • Real-world workloads downloaded from SNIA.

     Name     Type          Description
     OLTP     Application   Online transaction processing
     Web12    Web server    A typical retail shop
     Web07    Web server    A typical retail shop
     prxy_0   Data center   Firewall/web proxy
     wdev_0   Data center   Test web server
     hm_0     Data center   Hardware monitoring
     proj_0   Data center   Project directories
     proj_3   Data center   Project directories
     src1_2   Data center   Source control

  7. Workload Analysis
     • Reuse distance distribution
       – Reuse distance: the number of unique blocks accessed between two requests for the same block
     [Figure: panels (a) OLTP, (b) Web12, (c) Web07, (d) prxy_0, (e) wdev_0, (f) hm_0, (g) proj_0, (h) proj_3, (i) src1_2; x axis: reuse distance; y axes: CDF (percentage) and # of accesses.]
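The reuse distance definition on this slide can be made concrete with a small example. This is a naive sketch for illustration only: the function name `reuse_distances` and the sample trace are assumptions, and the quadratic scan would be too slow for the full SNIA traces.

```python
def reuse_distances(trace):
    """Return one reuse distance per access; None marks a first access."""
    last_pos = {}      # block -> index of its previous access
    distances = []
    for i, block in enumerate(trace):
        if block in last_pos:
            # Distinct blocks strictly between the two accesses to `block`.
            between = set(trace[last_pos[block] + 1:i])
            distances.append(len(between))
        else:
            distances.append(None)
        last_pos[block] = i
    return distances

trace = ["a", "b", "c", "b", "a"]
print(reuse_distances(trace))  # [None, None, None, 1, 2]
```

The second access to `b` has only `c` in between (distance 1). The second access to `a` has `b` and `c` in between (distance 2), because the repeated `b` is counted once.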

  8. Workload Analysis
     • CDF of the access count of each block
     [Figure: panels (a) OLTP, (b) Web12, (c) Web07, (d) prxy_0, (e) wdev_0, (f) hm_0, (g) proj_0, (h) proj_3, (i) src1_2; x axis: access count of each block; y axis: CDF (percentage).]

  9. Workload Analysis
     • Observation #1: Most blocks (about 50-90%) are infrequently accessed in real-world workloads.
     [Figure: the per-block access count CDFs with the value at x = 3 marked on each panel; x axis: access count of each block; y axis: CDF (percentage).]
     Marked values, in panel order: (a) OLTP 88%, (b) Web12 80%, (c) Web07 80%, (d) prxy_0 74%, (e) wdev_0 55%, (f) hm_0 74%, (g) proj_0 70%, (h) proj_3 54%, (i) src1_2 34%
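A point on such a CDF can be computed directly from a trace. The sketch below assumes that the mark at 3 on each panel means "accessed at most 3 times"; that reading, the function name `fraction_accessed_at_most`, and the sample trace are assumptions rather than details from the slides.

```python
from collections import Counter

def fraction_accessed_at_most(trace, k):
    """Fraction of distinct blocks whose total access count is <= k."""
    counts = Counter(trace)            # block -> number of accesses
    if not counts:
        return 0.0
    infrequent = sum(1 for c in counts.values() if c <= k)
    return infrequent / len(counts)

# Illustrative trace: "a" is accessed 4 times, the other blocks 1 or 2 times.
trace = ["a", "a", "a", "a", "b", "c", "c", "d"]
print(fraction_accessed_at_most(trace, k=3))  # 0.75
```

Three of the four distinct blocks fall at or below the threshold, so the CDF value at 3 is 75% for this toy trace.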

  10. Workload Analysis
      • CDF of the reuse distance distribution for the infrequently accessed blocks (expressed as a percentage of cache size)
      [Figure: panels (a) OLTP, (b) Web12, (c) Web07, (d) prxy_0, (e) wdev_0, (f) hm_0, (g) proj_0, (h) proj_3, (i) src1_2; x axis: reuse distance (as a percentage of the given cache size); y axis: CDF (percentage).]

  11. Workload Analysis
      • Observation #2: The reuse distance of the infrequently accessed blocks is either extremely long or extremely short
        – In terms of cache size: distances under 10% and over 100% of the cache size are dominant
      [Figure: the reuse distance CDFs for infrequently accessed blocks with one percentage annotated on each panel; x axis: reuse distance (as a percentage of the given cache size); y axis: CDF (percentage).]
      Annotated percentages, in panel order: (a) OLTP 88%, (b) Web12 96%, (c) Web07 90%, (d) prxy_0 94%, (e) wdev_0 90%, (f) hm_0 98%, (g) proj_0 98%, (h) proj_3 98%, (i) src1_2 98%

  12. Observations
      • Observation #1: Most blocks are infrequently accessed in real-world workloads
        – These blocks are cache-unfriendly blocks that cause cache pollution
      • Observation #2: The reuse distance of the infrequently accessed blocks is either extremely long or extremely short
        – The cache-unfriendly blocks have distinct characteristics
      • Therefore,
        – “Frequency” and “Reuse distance” are the key metrics for filtering out the cache-unfriendly blocks

  13. Design
      • Block Classification

      Class          Access frequency   Reuse distance   Cache-hit target   Cache pollution (filtering target)
      Class 1 (FS)   Frequent           Short            V                  -
      Class 2 (FL)   Frequent           Long             V                  -
      Class 3 (IS)   Infrequent         Short            V                  V
      Class 4 (IL)   Infrequent         Long             -                  V

      • Design Goal
        – Keep Class 1 and 2 blocks in the cache
        – Keep Class 3 blocks while preventing them from polluting the cache
        – Filter Class 4 blocks out of the cache
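The four classes can be expressed as a small lookup over two thresholds. This sketch only illustrates the classification table. The slides do not state numeric cut-offs for "frequent" or "short", so `freq_threshold` and `dist_threshold` are hypothetical parameters, as are the function name `classify` and the `GOALS` mapping. FRD itself separates blocks through its two stacks, not through an explicit classifier like this one.

```python
def classify(access_count, reuse_distance, freq_threshold, dist_threshold):
    """Map a block's statistics to FS, FL, IS or IL (thresholds are assumed)."""
    frequent = access_count > freq_threshold
    short = reuse_distance <= dist_threshold
    if frequent:
        return "FS" if short else "FL"
    return "IS" if short else "IL"

# Design goals from the slide, keyed by class label.
GOALS = {
    "FS": "keep in cache",
    "FL": "keep in cache",
    "IS": "keep, but prevent it from polluting the cache",
    "IL": "filter out",
}

# Illustrative values only: thresholds of 3 accesses and 100 blocks.
label = classify(access_count=1, reuse_distance=500,
                 freq_threshold=3, dist_threshold=100)
print(label, "->", GOALS[label])
```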

  14. FRD Algorithm - A Filtering based Buffer Cache Algorithm that Considers both Frequency and Reuse Distance
      • Parameter = FilterStack (%) (Default = 10%)
      [Figure: two stacks, the Filter Stack and the Reuse Distance (RD) Stack, each drawn from MRU to LRU. Labeled events: 1. New entry insertion; 2. Resident block eviction; 3. History block insertion; 4. Cache hit; 5. Cache miss; 6. Cache hit; plus a history eviction.]
      Legend: Resident Block; History Block
      * If the RD stack is not full, a new entry is inserted into the RD stack.
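The flow on this slide can be approximated with a simplified two-stack simulation. The sketch below is one reading of the slide labels, not the authors' implementation. These details are assumptions: the class name `FRDSketch`, history kept in a separate ordered map, victims of both stacks remembered as history, history bounded by the cache size, and the integer rounding of the filter size.

```python
from collections import OrderedDict

class FRDSketch:
    """Simplified filter-stack / RD-stack cache (illustrative assumptions)."""

    def __init__(self, cache_size, filter_pct=10):
        if cache_size < 2:
            raise ValueError("cache_size must be at least 2")
        self.filter_cap = max(1, cache_size * filter_pct // 100)
        self.rd_cap = cache_size - self.filter_cap
        self.history_cap = cache_size     # assumed bound on history entries
        self.filter = OrderedDict()       # resident blocks, LRU first
        self.rd = OrderedDict()           # resident blocks, LRU first
        self.history = OrderedDict()      # non-resident metadata only
        self.hits = self.misses = 0

    def _remember(self, block):
        self.history[block] = True
        self.history.move_to_end(block)
        if len(self.history) > self.history_cap:
            self.history.popitem(last=False)   # drop the oldest history entry

    def access(self, block):
        if block in self.filter:               # cache hit in the filter stack
            self.hits += 1
            self.filter.move_to_end(block)
            return True
        if block in self.rd:                   # cache hit in the RD stack
            self.hits += 1
            self.rd.move_to_end(block)
            return True
        self.misses += 1
        if block in self.history or len(self.rd) < self.rd_cap:
            # Miss on a history block (or RD stack not yet full): go to RD.
            self.history.pop(block, None)
            if len(self.rd) >= self.rd_cap:
                victim, _ = self.rd.popitem(last=False)
                self._remember(victim)
            self.rd[block] = True
        else:
            # Brand-new block: it must pass through the filter stack first.
            if len(self.filter) >= self.filter_cap:
                victim, _ = self.filter.popitem(last=False)
                self._remember(victim)
            self.filter[block] = True
        return False

cache = FRDSketch(cache_size=10)
for b in list(range(9)) + list(range(100, 200)) + list(range(9)):
    cache.access(b)
print(cache.hits, cache.misses)
```

In this run the 100 one-time blocks only cycle through the small filter stack, so the nine blocks already in the RD stack are still resident when they are accessed again.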

  15. Analysis of FRD Algorithm
      • Parameter = FilterStack (%) (Default = 10%)
      [Figure: the FRD diagram annotated with the block classes (Class 1 FS, Class 2 FL, Class 3 IS, Class 4 IL) that take each path through the Filter Stack and the Reuse Distance Stack.]
      Legend: Resident Block; History Block
      * If the RD stack is not full, a new entry is inserted into the RD stack.

  16. Evaluation
      • Environment
        – Simulation-based evaluation
        – Compared with OPT, LRU, ARC, LIRS

  17. Hitratio Result
      • Case of LIRS’ unstable hitratio result
      [Figure: hitratio versus cache size (MB), three panels. Legend: FRD is highest; LIRS is highest; LIRS is unstable; ARC is highest; ARC is unstable.]

  18. Hitratio Result
      [Figure: hitratio versus cache size (MB), three panels. Legend: FRD is highest; LIRS is highest; LIRS is unstable; ARC is highest; ARC is unstable.]

  19. Hitratio Result
      • Case of ARC’s unstable hitratio result
      [Figure: hitratio versus cache size (MB), three panels. Legend: FRD is highest; LIRS is highest; LIRS is unstable; ARC is highest; ARC is unstable.]

  20. Evaluation
      • Overall Average Result (1.0 is OPT’s hitratio)

      Workload   LRU     ARC     LIRS    FRD
      OLTP       0.674   0.746   0.691   0.753
      Web12      0.829   0.852   0.827   0.857
      Web07      0.800   0.839   0.812   0.847
      prxy_0     0.844   0.870   0.870   0.898
      wdev_0     0.647   0.723   0.728   0.745
      hm_0       0.598   0.700   0.723   0.724
      proj_0     0.612   0.722   0.740   0.780
      proj_3     0.172   0.241   0.516   0.478
      src1_2     0.620   0.697   0.799   0.813
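The table can be read more easily by picking the best algorithm per workload. The snippet below does only that, using the numbers exactly as they appear on the slide. The variable and function names are illustrative.

```python
# Normalized hit ratios copied from the slide (1.0 is OPT's hitratio).
RESULTS = {
    "OLTP":   {"LRU": 0.674, "ARC": 0.746, "LIRS": 0.691, "FRD": 0.753},
    "Web12":  {"LRU": 0.829, "ARC": 0.852, "LIRS": 0.827, "FRD": 0.857},
    "Web07":  {"LRU": 0.800, "ARC": 0.839, "LIRS": 0.812, "FRD": 0.847},
    "prxy_0": {"LRU": 0.844, "ARC": 0.870, "LIRS": 0.870, "FRD": 0.898},
    "wdev_0": {"LRU": 0.647, "ARC": 0.723, "LIRS": 0.728, "FRD": 0.745},
    "hm_0":   {"LRU": 0.598, "ARC": 0.700, "LIRS": 0.723, "FRD": 0.724},
    "proj_0": {"LRU": 0.612, "ARC": 0.722, "LIRS": 0.740, "FRD": 0.780},
    "proj_3": {"LRU": 0.172, "ARC": 0.241, "LIRS": 0.516, "FRD": 0.478},
    "src1_2": {"LRU": 0.620, "ARC": 0.697, "LIRS": 0.799, "FRD": 0.813},
}

def best_algorithm(row):
    """Return the algorithm with the highest normalized hit ratio."""
    return max(row, key=row.get)

winners = {workload: best_algorithm(row) for workload, row in RESULTS.items()}
frd_wins = [w for w, algo in winners.items() if algo == "FRD"]

print(f"FRD is best on {len(frd_wins)} of {len(RESULTS)} workloads")
print("Exceptions:", {w: a for w, a in winners.items() if a != "FRD"})
```

On these figures FRD has the highest average on eight of the nine workloads. The exception is proj_3, where LIRS (0.516) is ahead of FRD (0.478).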

  21. Parameter Sensitivity (Size of the Filter Stack)
      • The filter stack size was varied from 1% to 25% of the cache size.
      • 10% shows the best performance on average, but the difference is negligible.

  22. Summary
      • FRD: A Filtering based Buffer Cache Algorithm that Considers both Frequency and Reuse Distance
        – A new buffer cache algorithm that filters out cache-unfriendly blocks
        – Careful analysis of real-world workloads reveals the characteristics of cache-unfriendly blocks
        – The experimental results show that it outperforms state-of-the-art cache algorithms such as ARC and LIRS.

  23. Backup slides

  24. Hitratio Analysis
      • Filter stack performance

  25. Revisiting LIRS and ARC
      • ARC (Initial: T1 = T2 = B1 = B2 = 0, p = 0)
        – T1 + T2 + B1 + B2 <= 2c
        – T1 and T2 hold resident blocks; B1 and B2 hold history metadata (non-resident)
        – p = min{c, p + max{|B2|/|B1|, 1}}, followed by Replace(p)
        – p = max{0, p - max{|B1|/|B2|, 1}}, followed by Replace(p)
        – Subroutine Replace(p): if (|T1| ≥ 1) and ((x ∈ B2 and |T1| = p) or (|T1| > p)) then move the LRU page of T1 to the top of B1 and remove it from the cache; else move the LRU page in T2 to the top of B2 and remove it from the cache.
      • LIRS (HIRstack + LIRstack = c, 1:99)
        – HIRstack: resident blocks
        – LIRstack: resident and non-resident blocks; non-resident entries are kept until RMAX
      [Figure legend: eviction flow; hit; miss; new entry; metadata (history).]
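The two update rules for p can be written as one small function. This sketch follows the published ARC algorithm, where a hit in the history list B1 grows the target size p of T1 and a hit in B2 shrinks it. The slide itself does not label which rule belongs to which list. The function name `adapt_p` and its argument names are illustrative.

```python
def adapt_p(p, c, b1_len, b2_len, hit_in_b1):
    """Return the updated target size p of T1 after a history-list hit.

    A hit in B1 means b1_len >= 1 and a hit in B2 means b2_len >= 1,
    so the divisions below never divide by zero.
    """
    if hit_in_b1:
        # Recency is paying off: grow T1's share, capped at cache size c.
        return min(c, p + max(b2_len / b1_len, 1))
    # Frequency is paying off: shrink T1's share, floored at 0.
    return max(0, p - max(b1_len / b2_len, 1))

p = 0
p = adapt_p(p, c=8, b1_len=2, b2_len=4, hit_in_b1=True)    # grows by 4/2 = 2
print(p)
p = adapt_p(p, c=8, b1_len=4, b2_len=2, hit_in_b1=False)   # shrinks by 4/2 = 2
print(p)
```

The step size is the ratio of the two history lists, with a minimum of 1, so p moves faster when the list that was just hit is the smaller of the two.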

  26. Design comparison with ARC and LIRS

                          ARC                                  LIRS                                     FRD
      # LRU stack         Two                                  Two                                      Two
      Adaptive Resizing   O                                    X                                        X
      Eviction Point      Two (two LRU stacks are isolated)    One (two LRU stacks are not isolated)    Two (two LRU stacks are isolated)
      History size        Cache size x 2                       Max resident block                       Max resident block
