OrderMergeDedup: Efficient, Failure-Consistent Deduplication on Flash



  1. OrderMergeDedup: Efficient, Failure-Consistent Deduplication on Flash. Zhuan Chen and Kai Shen, University of Rochester

  2. Background
     - I/O deduplication
       - Eliminate I/O writes with redundant content
       - Reduce the storage space usage
       - Write reduction: reduce the Flash wear, improve performance
     - Broad usage in data centers, personal computers, data-driven sensing

  3. Motivation
     - I/O deduplication is not free: metadata maintenance
       - (1) Logical-physical block mapping
       - (2) Physical block fingerprints
       - (3) Physical block reference counts
     - Need to maintain failure-consistency for data and metadata
     [Figure: logical blocks L1 and L2 are both written with content ABC; the L2P mapping points both to a single physical block P on Flash, which holds ABC with ref. ctr. = 2.]
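The three metadata structures listed on this slide can be illustrated with a minimal in-memory sketch. The class name `DedupStore`, its fields, and the use of SHA-256 as the fingerprint are assumptions made for illustration, not the authors' implementation; the sketch only handles writes to unmapped logical blocks and counts references plainly (one per mapped logical block).

```python
import hashlib


class DedupStore:
    """Hypothetical in-memory model of the three deduplication metadata structures."""

    def __init__(self):
        self.l2p = {}           # (1) logical block number -> physical block number
        self.fingerprints = {}  # (2) content fingerprint -> physical block number
        self.refcount = {}      # (3) physical block number -> reference count
        self.physical = {}      # physical block number -> data (stands in for Flash)
        self.next_free = 0

    def write(self, logical, data):
        """Write data to an unmapped logical block.

        Returns True if a physical block write was needed, False if deduplicated.
        """
        if logical in self.l2p:
            raise ValueError("overwrites are outside the scope of this sketch")
        fp = hashlib.sha256(data).hexdigest()
        if fp in self.fingerprints:
            # Duplicate content: no data write, only metadata updates.
            p = self.fingerprints[fp]
            self.refcount[p] += 1
            self.l2p[logical] = p
            return False
        # New content: one data write plus all three metadata updates.
        p = self.next_free
        self.next_free += 1
        self.physical[p] = data
        self.refcount[p] = 1
        self.fingerprints[fp] = p
        self.l2p[logical] = p
        return True


store = DedupStore()
first = store.write(1, b"ABC")   # physical write
second = store.write(2, b"ABC")  # deduplicated
```

After the two writes, both logical blocks map to the same physical block and its reference count is 2, matching the figure on the slide.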

  4. Challenge
     - Existing approaches for failure-consistency
       - Rely on non-volatile RAM or supercapacitors/batteries [Srinivasan et al. 2012; Chen et al. 2011; Gupta et al. 2011]
       - Checking/repair tools [Quinlan et al. 2002]
       - Redo logging [Meister et al. 2010] (additional I/O for logging writes)
       - Shadowing [Tarasov et al. 2014] (additional I/O for index block writes)
     - Challenge: the I/O cost induced by metadata maintenance and failure-consistency shouldn’t significantly diminish the I/O savings of deduplication
     - We look into soft updates-style I/O ordering

  5. I/O Ordering for Failure-Consistency
     - Define an order for data/metadata writes
     - Ordered writes are committed one by one
     - A failure still keeps a deduplication system consistent
     - A failure can only leave garbage (which can be reclaimed asynchronously)
     - Example: new write (duplicated content): incr. P1 ref., then L2P mapping. A failure (✗) before the L2P mapping is committed leaves a higher-than-actual ref. count, which only leaves garbage.
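The example on this slide can be checked with a small crash simulation. This is a sketch under stated assumptions: the function names, the step labels `"inc_ref"` and `"map"`, and the safety predicate (stored reference count is at least the number of logical blocks mapped to the physical block) are hypothetical and chosen to mirror the slide's "higher-than-actual ref. only leaves garbage" argument.

```python
def run_duplicate_write(steps, crash_after):
    """Commit only the first `crash_after` metadata steps, then 'crash'.

    Returns the persistent (refcount, l2p) state left behind.
    """
    refcount = {"P1": 1}   # P1 is already referenced by L1
    l2p = {"L1": "P1"}
    for step in steps[:crash_after]:
        if step == "inc_ref":
            refcount["P1"] += 1
        elif step == "map":
            l2p["L2"] = "P1"
    return refcount, l2p


def is_safe(refcount, l2p):
    """Safe: the stored count is never lower than the actual number of references.

    A higher-than-actual count only leaves garbage; a lower one risks
    freeing a block that is still in use.
    """
    actual = sum(1 for p in l2p.values() if p == "P1")
    return refcount["P1"] >= actual


ORDERED = ["inc_ref", "map"]    # order from the slide
REVERSED = ["map", "inc_ref"]   # counter-example order

# Crash after 0, 1, or 2 committed steps.
ordered_results = [is_safe(*run_duplicate_write(ORDERED, n)) for n in range(3)]
reversed_results = [is_safe(*run_duplicate_write(REVERSED, n)) for n in range(3)]
# ordered_results  -> [True, True, True]
# reversed_results -> [True, False, True]
```

With the slide's order every crash point is safe; with the reversed order, a crash between the two steps leaves a mapping that the reference count does not cover.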

  6. I/O Ordering for Failure-Consistency
     - I/O efficiency
       - No consistency-induced additional I/O
       - We can merge metadata writes residing on the same metadata block as long as they are not subject to any ordering constraint
     [Figure: example with new writes of duplicated content, each consisting of incr. ref. (P1, P2) followed by an L2P mapping update, laid out over metadata block 1 and metadata block 2.]
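The merging idea on this slide can be sketched as grouping pending metadata updates by the metadata block they reside on. The block geometry (`ENTRIES_PER_BLOCK`), the update tuples, and the function names are assumptions for illustration only; the sketch also assumes the updates being grouped have no ordering constraint among themselves.

```python
from collections import defaultdict

ENTRIES_PER_BLOCK = 4  # hypothetical number of metadata entries per metadata block


def metadata_block(kind, index):
    """Identify the metadata block that holds entry `index` of table `kind`."""
    return (kind, index // ENTRIES_PER_BLOCK)


def merge_updates(updates):
    """Group updates by metadata block so each block is written once."""
    blocks = defaultdict(list)
    for kind, index, value in updates:
        blocks[metadata_block(kind, index)].append((index, value))
    return dict(blocks)


# Two duplicate writes: increment the ref. counts of P1 and P2,
# then map logical blocks 10 and 11 to them.
updates = [
    ("ref", 1, +1),
    ("ref", 2, +1),
    ("l2p", 10, 1),
    ("l2p", 11, 2),
]
merged = merge_updates(updates)
# Four metadata updates collapse into two metadata block writes.
```

The ordering constraint still applies between the merged blocks: in this example the reference-count block would be written before the mapping block.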

  7. I/O Ordering for Failure-Consistency
     - Cyclic dependencies
       - Prevent metadata I/O merging and complicate the implementation
       - Make soft updates costly for file systems [Seltzer et al. 2000]
     - Example: (1) overwrite (duplicated content): incr. P1 ref., L2P mapping, decr. P2 ref.; (2) new write (duplicated content): incr. P3 ref., L2P mapping
     [Figure: the operations above laid out over metadata block 1 and metadata block 2.]

  8. I/O Ordering for Failure-Consistency
     - Resolve cyclic dependencies
       - We carefully design all deduplication I/O paths
       - Delay non-critical metadata I/O (I/O that the completion signal doesn’t depend on)
     - Example: (1) overwrite (duplicated content): incr. P1 ref., L2P mapping, completion to client; decr. P2 ref. is delayed. (2) new write (duplicated content): incr. P3 ref., L2P mapping
     [Figure: the operations above laid out over metadata block 1 and metadata block 2.]
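A small dependency-graph check illustrates why delaying the decrement removes the cycle. This is a sketch, not the authors' code: it assumes the reference counters of P1 and P2 share metadata block 1 and the L2P entry sits on metadata block 2 (one reading of the slide's example), and all function and label names are hypothetical.

```python
def block_dependencies(op_block, op_order):
    """Lift 'op A before op B' constraints to 'block X before block Y' edges."""
    return {
        (op_block[a], op_block[b])
        for a, b in op_order
        if op_block[a] != op_block[b]
    }


def has_cycle(edges):
    """Depth-first search for a cycle in a directed graph given as edge pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True
        visiting.add(node)
        found = any(visit(nxt) for nxt in graph[node])
        visiting.discard(node)
        done.add(node)
        return found

    return any(visit(node) for node in graph)


# Overwrite with duplicated content: inc. P1 ref -> map L -> dec. P2 ref.
order = [("inc_P1", "map_L"), ("map_L", "dec_P2")]

# Everything merged eagerly: both ref. counter updates go into one write of block 1.
eager = {"inc_P1": "block1", "map_L": "block2", "dec_P2": "block1"}

# Decrement delayed: it goes into a later, separate write of block 1.
delayed = {"inc_P1": "block1", "map_L": "block2", "dec_P2": "block1-later"}

eager_cycle = has_cycle(block_dependencies(eager, order))      # True
delayed_cycle = has_cycle(block_dependencies(delayed, order))  # False
```

Because the client's completion signal does not depend on the decrement, pushing it to a later write costs no client-visible latency while breaking the block-level cycle.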

  9. I/O Ordering for Failure-Consistency
     Operations on each deduplication I/O path (labels from the slide's diagram, not necessarily in commit order):
     1. Write new block L; duplicating existing physical block P: inc. P’s ref. ctr.; map L to P; completion to client
     2. Write new block L; no duplicate: write to new physical block P; map L to P; completion to client; set P’s ref. ctr. to 2; add P’s fingerprint
     3. Overwrite block L mapped to physical block Pold; duplicating physical block Pdup: dec. Pold’s ref. ctr.; inc. Pdup’s ref. ctr.; map L to Pdup; completion to client
     4. Overwrite block L mapped to physical block Pold; no duplicate: dec. Pold’s ref. ctr.; write to physical block Pnew; map L to Pnew; completion to client; set Pnew’s ref. ctr. to 2; add Pnew’s fingerprint

  10. Metadata I/O Merging for Efficiency
      - Anticipatory I/O delay and merging
        - Delay a metadata write in anticipation of near-future merging opportunities
        - Limited delay duration (e.g., 1 millisecond), slight performance impact
      - We name our approach OrderMergeDedup
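The effect of a bounded anticipatory delay can be sketched with a deterministic simulation over timestamped updates. The function name, the event format, and the windowing rule (an update opens a delay window on its metadata block, and later updates to the same block within the window join the same write) are assumptions for illustration; only the 1 millisecond figure comes from the slide. Ordering constraints between updates are ignored here.

```python
DELAY_MS = 1.0  # example delay duration from the slide


def count_block_writes(events, delay_ms=DELAY_MS):
    """Count metadata block writes for time-sorted (time_ms, block) updates.

    The first update to a block schedules a write `delay_ms` later; any update
    to the same block arriving before that deadline is merged into it.
    """
    deadline = {}  # block -> time at which its pending write is issued
    writes = 0
    for time_ms, block in events:
        if block in deadline and time_ms <= deadline[block]:
            continue  # merged into the pending write of this block
        deadline[block] = time_ms + delay_ms
        writes += 1
    return writes


events = [(0.0, 1), (0.3, 1), (0.8, 1), (2.5, 1), (2.6, 2)]
with_delay = count_block_writes(events)                    # 3 block writes
without_delay = count_block_writes(events, delay_ms=0.0)   # 5 block writes
```

In this trace the three updates that hit block 1 within the first millisecond are issued as one write, so five updates cost three block writes instead of five.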

  11. Evaluation Setup
      - Prototype of OrderMergeDedup
        - A custom device mapper target of Linux 3.14.29
      - Mobile system workloads (Atom-based tablet)
        - Ubuntu package update & installation
        - BBench web browsing
        - Vehicle counting for intelligent traffic sensing
      - Server system workloads (Xeon-based server machine)
        - Hadoop
        - YCSB/MongoDB

  12. Evaluation
      [Figure: "Deduplicated physical block writes". Bar chart of normalized I/O volume (axis 0 to 1) per workload for four configurations: Original, Dmdedup, Failure-consistent write ordering, and + Anticipatory I/O delay/merging.]
      - We save 18-63% I/O writes (on workloads with 23-73% write duplication)

  13. Evaluation (Strong Persistence Model)
      [Figure: "Deduplicated physical block writes". Bar chart of normalized I/O volume (axis 0 to 2) per workload for four configurations: Original, Dmdedup, Failure-consistent write ordering, and + Anticipatory I/O delay/merging. Six off-scale bars are labeled 12×, 12×, 5×, 10×, 11×, 11×.]
      - We save 15-51% I/O writes (on workloads with 23-73% write duplication)
      - Anticipatory I/O delay/merging is particularly effective

  14. Conclusion
      - OrderMergeDedup
        - Efficient, failure-consistent I/O deduplication on Flash
        - A soft updates-style data/metadata write ordering for failure-consistency (in particular, we resolve all possible cyclic dependencies with carefully designed I/O ordering and by delaying non-critical metadata writes)
        - Anticipatory I/O delay and merging to further reduce metadata I/O writes
      - We save 18-63% I/O writes (on workloads with 23-73% write duplication)
      - Anticipatory I/O delay/merging is particularly effective under the strong persistence model
