SLIDE 1

OrderMergeDedup: Efficient, Failure-Consistent Deduplication on Flash

Zhuan Chen and Kai Shen University of Rochester

SLIDE 2

Background

• I/O deduplication
  • Eliminate I/O writes with redundant content
  • Reduce the storage space usage
  • Write reduction: reduce the Flash wear, improve performance
  • Broad usage in data centers, personal computers, data-driven sensing

SLIDE 3

Motivation

• I/O deduplication is not free: it requires metadata maintenance
  • (1) Logical-to-physical block mapping
  • (2) Physical block fingerprints
  • (3) Physical block reference counts

• Need to maintain failure consistency for data and metadata

[Diagram: the L2P mapping points logical blocks (L1, L2, …) at physical blocks on Flash; repeated writes of the same content "ABC" share one physical block P with ref. ctr. = 2]
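The three metadata structures can be modeled in a few lines. A minimal in-memory sketch (all names are illustrative, not from the paper's prototype):

```python
# Minimal in-memory sketch of the three deduplication metadata structures:
# (1) logical-to-physical mapping, (2) fingerprints, (3) reference counts.
# Names are illustrative; this is not the OrderMergeDedup prototype code.
import hashlib

class DedupMetadata:
    def __init__(self):
        self.l2p = {}           # (1) logical block -> physical block
        self.fingerprints = {}  # (2) content hash -> physical block
        self.refcount = {}      # (3) physical block -> reference count

    def write(self, lba, data):
        fp = hashlib.sha256(data).hexdigest()
        if fp in self.fingerprints:
            # Duplicate content: reuse the existing physical block and
            # bump its reference count instead of writing new data.
            pba = self.fingerprints[fp]
            self.refcount[pba] += 1
        else:
            # New content: allocate a fresh physical block (the counter
            # starts at 1 in this simplified model).
            pba = len(self.refcount)
            self.fingerprints[fp] = pba
            self.refcount[pba] = 1
        self.l2p[lba] = pba
        return pba

md = DedupMetadata()
p1 = md.write(1, b"ABC")
p2 = md.write(2, b"ABC")  # duplicate: maps to the same physical block
```

Two writes of the same content end up sharing one physical block with a reference count of 2, as in the diagram.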

SLIDE 4

Challenge

• Existing approaches to failure consistency
  • Rely on non-volatile RAM or supercapacitors/batteries [Srinivasan et al. 2012; Chen et al. 2011; Gupta et al. 2011]
  • Checking/repair tools [Quinlan et al. 2002]
  • Redo logging [Meister et al. 2010] (additional I/O for logging writes)
  • Shadowing [Tarasov et al. 2014] (additional I/O for index block writes)
• Challenge: the metadata and failure-consistency-induced I/O cost shouldn't significantly diminish the deduplication I/O saving
• We look into soft updates-style I/O ordering

SLIDE 5

I/O Ordering for Failure-Consistency

• Define an order for data/metadata writes
  • Ordered writes are committed one by one
  • A failure still keeps the deduplication system consistent
  • A failure can only leave garbage (which can be reclaimed asynchronously)

Example: a new write with duplicated content
[Diagram: "incr. P1 ref." is committed before the L2P mapping update; a failure between the two (✗) leaves only a higher-than-actual ref. count, i.e., reclaimable garbage]
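The ordering rule can be illustrated with a small simulation. A sketch under illustrative names (not the system's code): the reference-count increment is made durable before the L2P mapping, so a crash in between leaves only garbage.

```python
# Sketch of the ordering rule: commit the reference-count increment
# before the L2P mapping update, so a crash between the two leaves only
# a too-high refcount (reclaimable garbage), never a logical block that
# maps to an under-counted physical block. Illustrative names only.

class CrashError(Exception):
    """Simulated power/crash failure between two ordered writes."""

def dedup_write_ordered(storage, lba, pba, crash_between=False):
    # Step 1 (committed first): bump the physical block's refcount.
    storage["refcount"][pba] = storage["refcount"].get(pba, 0) + 1
    # (ordering barrier: step 1 is durable before step 2 is issued)
    if crash_between:
        raise CrashError
    # Step 2: update the L2P mapping.
    storage["l2p"][lba] = pba

storage = {"refcount": {7: 1}, "l2p": {}}
try:
    dedup_write_ordered(storage, lba=3, pba=7, crash_between=True)
except CrashError:
    pass
# After the "crash": refcount 2 with only one actual reference (garbage),
# but the mapping is untouched, so the system stays consistent.
```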

SLIDE 6

I/O Ordering for Failure-Consistency

• I/O efficiency
  • No consistency-induced additional I/O
  • Metadata writes residing on the same metadata block can be merged, as long as they are not subject to any ordering constraint

Example: two new writes with duplicated content
[Diagram: the "incr. P1 ref." and "incr. P2 ref." updates fall on metadata block 1 and merge into one write; the corresponding L2P mapping updates fall on metadata block 2 and merge likewise]
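The merging idea can be sketched as a small write buffer keyed by metadata block: unordered updates that land on the same block coalesce into a single device write. A sketch with hypothetical names, not the system's code:

```python
# Sketch of merging unordered metadata updates that land on the same
# metadata block into one device write. Illustrative names only.

class MetadataWriteBuffer:
    def __init__(self):
        self.pending = {}       # metadata block id -> merged updates
        self.writes_issued = 0  # device writes actually performed

    def update(self, block_id, field, value):
        # Updates to the same metadata block merge in the buffer as
        # long as no ordering constraint forces an earlier flush.
        self.pending.setdefault(block_id, {})[field] = value

    def flush(self):
        # One device write per dirty metadata block, not per update.
        self.writes_issued += len(self.pending)
        self.pending.clear()

buf = MetadataWriteBuffer()
buf.update(1, "ref_P1", 2)    # incr. P1 ref.  \ same metadata block:
buf.update(1, "ref_P2", 2)    # incr. P2 ref.  / merged into one write
buf.update(2, "l2p_L5", "P1") # mapping update on another block
buf.flush()                   # two device writes for three updates
```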

SLIDE 7

I/O Ordering for Failure-Consistency

• Cyclic dependencies
  • Prevent metadata I/O merging and complicate the implementation
  • Make soft updates costly for file systems [Seltzer et al. 2000]

Example: (1) an overwrite with duplicated content, then (2) a new write with duplicated content
[Diagram: operation (1) ("incr. P1 ref.", then its L2P mapping update) requires metadata block 1 to be written before metadata block 2, while operation (2) (with its "decr. P2 ref." and "incr. P3 ref." updates and L2P mapping update) requires the reverse order, forming a cycle between the two metadata blocks]

SLIDE 8

I/O Ordering for Failure-Consistency

• Resolving cyclic dependencies
  • We carefully design all deduplication I/O paths
  • Delay non-critical metadata I/O (writes that the completion signal to the client does not depend on)

Example: the same operations (1) and (2) as before
[Diagram: the non-critical ref. ctr. update is delayed until after completion is signaled to the client, breaking the cycle between metadata blocks 1 and 2]

SLIDE 9

I/O Ordering for Failure-Consistency

The four write I/O paths and their ordering (each step is committed before the next; steps after "completion to client" are non-critical and delayed):

1. Write new block L, duplicating existing physical block P:
   inc. P's ref. ctr. → map L to P → completion to client
2. Write new block L, no duplicate:
   write to new physical block P → set P's ref. ctr. to 2 → map L to P → completion to client → add P's fingerprint
3. Overwrite block L (mapped to physical block Pold), duplicating physical block Pdup:
   inc. Pdup's ref. ctr. → map L to Pdup → completion to client → dec. Pold's ref. ctr.
4. Overwrite block L (mapped to physical block Pold), no duplicate:
   write to new physical block Pnew → set Pnew's ref. ctr. to 2 → map L to Pnew → completion to client → dec. Pold's ref. ctr. → add Pnew's fingerprint
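The four paths can be restated as explicit operation sequences, split at the completion signal. A minimal sketch (path names and helpers are illustrative, not from the prototype):

```python
# The four write paths above as ordered operation lists: steps before
# "completion to client" are on the critical path; steps after it are
# delayed, non-critical metadata writes. Illustrative sketch only.

COMPLETION = "completion to client"

WRITE_PATHS = {
    # 1. Write new block L; duplicating existing physical block P
    "new, duplicate": [
        "inc. P's ref. ctr.", "map L to P", COMPLETION],
    # 2. Write new block L; no duplicate
    "new, no duplicate": [
        "write to new physical block P", "set P's ref. ctr. to 2",
        "map L to P", COMPLETION, "add P's fingerprint"],
    # 3. Overwrite block L (mapped to Pold); duplicating Pdup
    "overwrite, duplicate": [
        "inc. Pdup's ref. ctr.", "map L to Pdup",
        COMPLETION, "dec. Pold's ref. ctr."],
    # 4. Overwrite block L (mapped to Pold); no duplicate
    "overwrite, no duplicate": [
        "write to new physical block Pnew", "set Pnew's ref. ctr. to 2",
        "map L to Pnew", COMPLETION,
        "dec. Pold's ref. ctr.", "add Pnew's fingerprint"],
}

def critical_path(path_name):
    """Steps that must be durable before the client sees completion."""
    ops = WRITE_PATHS[path_name]
    return ops[: ops.index(COMPLETION)]

def delayed_ops(path_name):
    """Non-critical steps that may be deferred (and merged) afterwards."""
    ops = WRITE_PATHS[path_name]
    return ops[ops.index(COMPLETION) + 1 :]
```

Note that every path signals completion before the delayed fingerprint additions and old-block decrements, which is what breaks the cyclic dependencies of the earlier slides.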

SLIDE 10

Metadata I/O Merging for Efficiency

• Anticipatory I/O delay and merging
  • Delay a metadata write in anticipation of near-future merging opportunities
  • A limited delay duration (e.g., 1 millisecond) keeps the performance impact slight
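The anticipatory delay can be sketched as a flusher that holds dirty metadata briefly so near-future updates to the same metadata block merge into one device write. A sketch with hypothetical names (the 1 ms window is the example duration from the slide; this is not the prototype's code):

```python
# Sketch of anticipatory I/O delay/merging: hold a metadata write for a
# short window (e.g. 1 ms) so updates to the same metadata block that
# arrive in the meantime merge into one device write. Illustrative only.
import time

class AnticipatoryFlusher:
    def __init__(self, delay_s=0.001):
        self.delay_s = delay_s
        self.dirty = {}          # metadata block id -> merged updates
        self.deadline = None     # end of the current anticipation window
        self.writes_issued = 0

    def update(self, block_id, field, value):
        self.dirty.setdefault(block_id, {})[field] = value
        if self.deadline is None:
            # Start the anticipation window on the first dirty update.
            self.deadline = time.monotonic() + self.delay_s

    def maybe_flush(self):
        # Flush once the window expires: one write per dirty block.
        if self.deadline is not None and time.monotonic() >= self.deadline:
            self.writes_issued += len(self.dirty)
            self.dirty.clear()
            self.deadline = None

f = AnticipatoryFlusher(delay_s=0.001)
f.update(1, "ref_P1", 2)
f.update(1, "ref_P2", 3)  # arrives within the window: merges with above
time.sleep(0.002)
f.maybe_flush()           # one device write instead of two
```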

} We name our approach OrderMergeDedup

SLIDE 11

Evaluation Setup

• Prototype of OrderMergeDedup
  • A custom device mapper target on Linux 3.14.29
• Mobile system workloads (Atom-based tablet)
  • Ubuntu package update and installation
  • BBench web browsing
  • Vehicle counting for intelligent traffic sensing
• Server system workloads (Xeon-based server machine)
  • Hadoop
  • YCSB/MongoDB

SLIDE 12

Evaluation

• We save 18–63% of I/O writes (on workloads with 23–73% write duplication)

[Bar chart: normalized I/O volume (0 to 1) for six workloads: Package index update, Package install, BBench web browsing, Vehicle counting, Hadoop, YCSB/MongoDB; series: Original, Dmdedup, failure-consistent write ordering, + anticipatory I/O delay/merging; a marker indicates deduplicated physical block writes]

SLIDE 13

Evaluation (Strong Persistence Model)

• We save 15–51% of I/O writes (on workloads with 23–73% write duplication)
• Anticipatory I/O delay/merging is particularly effective

[Bar chart: normalized I/O volume for the same six workloads under the strong persistence model; off-scale bars are annotated 12×, 12×, 5×, 10×, 11×, and 11×; series: Original, Dmdedup, failure-consistent write ordering, + anticipatory I/O delay/merging; a marker indicates deduplicated physical block writes]

SLIDE 14

Conclusion

• OrderMergeDedup
  • Efficient, failure-consistent I/O deduplication on Flash
  • A soft updates-style data/metadata write ordering for failure consistency (in particular, we resolve all possible cyclic dependencies with carefully designed I/O ordering and by delaying non-critical metadata writes)
  • Anticipatory I/O delay and merging further reduce metadata I/O writes
  • We save 18–63% of I/O writes (on workloads with 23–73% write duplication)
  • Anticipatory I/O delay/merging is particularly effective under the strong persistence model