OrderMergeDedup: Efficient, Failure-Consistent Deduplication on Flash
Zhuan Chen and Kai Shen
University of Rochester
Background
} I/O deduplication
} Eliminate I/O writes with redundant content
} Reduce the storage space usage
} Write reduction: reduce the Flash wear, improve performance
} Broad usage in data centers, personal computers, data-driven sensing
Motivation
} I/O deduplication is not free: metadata maintenance
} (1) Logical-physical block mapping
} (2) Physical block fingerprints
} (3) Physical block reference counts
} Need to maintain failure-consistency for data and metadata
[Figure: logical blocks L1 and L2 are both written with content ABC; the L2P mapping points both to one physical block P on Flash (ref. ctr. = 2)]
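The three metadata structures above can be sketched as a small in-memory model. This is only an illustration, not the prototype's layout: the class name `DedupMetadata`, the use of SHA-256 as the fingerprint, and counting one reference per mapped logical block (as in the figure) are all assumptions.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class DedupMetadata:
    """Hypothetical in-memory model of the three metadata structures."""
    l2p: dict = field(default_factory=dict)           # logical block -> physical block
    fingerprints: dict = field(default_factory=dict)  # content hash -> physical block
    refcount: dict = field(default_factory=dict)      # physical block -> reference count
    next_physical: int = 0

    def write(self, logical: int, content: bytes) -> bool:
        """Record a write to a previously unmapped logical block.

        Returns True if a physical data write was needed, False if the
        content was deduplicated against an existing physical block.
        """
        fp = hashlib.sha256(content).hexdigest()  # assumed fingerprint function
        if fp in self.fingerprints:
            physical = self.fingerprints[fp]
            self.refcount[physical] += 1
            self.l2p[logical] = physical
            return False
        physical = self.next_physical
        self.next_physical += 1
        self.fingerprints[fp] = physical
        self.refcount[physical] = 1
        self.l2p[logical] = physical
        return True


meta = DedupMetadata()
wrote_first = meta.write(1, b"ABC")    # first copy: data is written
wrote_second = meta.write(2, b"ABC")   # duplicate: only metadata changes
```

After the two writes, both logical blocks map to the same physical block and its reference count is 2, mirroring the figure. The model ignores persistence entirely, which is the problem the following slides address.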
Challenge
} Existing approaches for failure-consistency
} Rely on non-volatile RAM or supercapacitors/batteries [Srinivasan et al. 2012; Chen et al. 2011; Gupta et al. 2011]
} Checking/repair tools [Quinlan et al. 2002]
} Redo logging [Meister et al. 2010] (additional I/O for logging writes)
} Shadowing [Tarasov et al. 2014] (additional I/O for index block writes)
} Challenge: metadata and failure-consistency-induced I/O cost shouldn’t significantly diminish the deduplication I/O saving
} We look into soft updates-style I/O ordering
I/O Ordering for Failure-Consistency
} Define an order for data/metadata writes
} Ordered writes are committed one by one
} A failure still keeps a deduplication system consistent
} A failure can only leave garbage (which can be reclaimed asynchronously)
Example: new write (duplicated content)
Incr. P1 ref. → L2P mapping
✗ A failure between the two writes leaves a higher-than-actual ref. count, which only leaves garbage
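The example above can be sketched as a tiny crash simulation. It is a minimal illustration under stated assumptions: the helper names (`new_duplicate_write`, `actual_references`, `Crash`) and the dictionary-based metadata are hypothetical, and each assignment stands in for a committed metadata write.

```python
class Crash(Exception):
    """Simulated power failure."""


def new_duplicate_write(refcount, l2p, logical, physical, crash_after=None):
    """Apply the two ordered metadata writes; optionally crash after a step."""
    refcount[physical] = refcount.get(physical, 0) + 1   # step 1: incr. ref.
    if crash_after == 1:
        raise Crash
    l2p[logical] = physical                              # step 2: L2P mapping
    if crash_after == 2:
        raise Crash


def actual_references(l2p, physical):
    """Number of logical blocks that really map to a physical block."""
    return sum(1 for p in l2p.values() if p == physical)


# State before the write: L1 already maps to P1.
refcount = {"P1": 1}
l2p = {"L1": "P1"}

try:
    new_duplicate_write(refcount, l2p, "L2", "P1", crash_after=1)
except Crash:
    pass

# The counter can only be too high after a failure, never too low.
leaked = refcount["P1"] - actual_references(l2p, "P1")
```

Here `leaked` is 1: the counter overstates the references, so P1 may linger as garbage but is never freed while still mapped. Reversing the two steps would risk the opposite, a mapped block with too low a count.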
I/O Ordering for Failure-Consistency
} I/O efficiency
} No consistency-induced additional I/O
} We can merge metadata writes residing on the same metadata block as long as they are not subject to any ordering constraint
Example: new write (duplicated content)
Incr. P1 ref. → L2P mapping
Incr. P2 ref. → L2P mapping
[Figure: these updates are placed on Metadata block 1 and Metadata block 2]
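A rough sketch of the merging idea follows. The block placement is an assumption made for illustration (both reference counters on one metadata block, both mapping entries on another), and `merge_by_block` is a hypothetical helper, not part of the prototype.

```python
from collections import defaultdict


def merge_by_block(updates):
    """Group pending metadata updates by the block they reside on.

    `updates` is a list of (block_id, description) pairs that have no
    ordering constraint among them; each group becomes one block write.
    """
    merged = defaultdict(list)
    for block_id, description in updates:
        merged[block_id].append(description)
    return dict(merged)


# Assumed placement: both ref. counters share one block, both mappings another.
ref_updates = [("meta-1", "incr. P1 ref."), ("meta-1", "incr. P2 ref.")]
map_updates = [("meta-2", "map L1 -> P1"), ("meta-2", "map L2 -> P2")]

# Each ref. update must still reach storage before the mapping that depends
# on it, so the two ordering stages are merged separately.
stage1 = merge_by_block(ref_updates)
stage2 = merge_by_block(map_updates)

unmerged_writes = len(ref_updates) + len(map_updates)
merged_writes = len(stage1) + len(stage2)
```

Under this placement the four metadata updates collapse into two block writes, while the increment-before-mapping order is preserved between the stages.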
I/O Ordering for Failure-Consistency
} Cyclic dependencies
} Prevent metadata I/O merging & complicate the implementation
} Make soft updates costly for file systems [Seltzer et al. 2000]
Example:
(1) Overwrite (duplicated content): Incr. P1 ref. → L2P mapping → Decr. P2 ref.
(2) New write (duplicated content): Incr. P3 ref. → L2P mapping
[Figure: the updates of (1) and (2) are placed on Metadata block 1 and Metadata block 2]
I/O Ordering for Failure-Consistency
} Resolve cyclic dependencies
} We carefully design all deduplication I/O paths
} Delay non-critical metadata I/O (the I/O that the completion signal does not depend on)
Example:
(1) Overwrite (duplicated content): Incr. P1 ref. → L2P mapping → Completion to client; Decr. P2 ref. (delayed)
(2) New write (duplicated content): Incr. P3 ref. → L2P mapping
[Figure: the updates of (1) are placed on Metadata block 1 and Metadata block 2]
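One way to see why the delay helps is to lift the per-operation ordering to the metadata blocks and check for a cycle. The sketch below does that under an assumed layout: all reference counters sit on "block 1" and all L2P entries on "block 2". That placement, and the helper names `block_edges` and `has_cycle`, are hypothetical and chosen only to make the effect visible.

```python
def block_edges(operations, placement):
    """Turn per-operation step chains into block-level 'must precede' edges."""
    edges = set()
    for steps in operations:
        for earlier, later in zip(steps, steps[1:]):
            a, b = placement[earlier], placement[later]
            if a != b:
                edges.add((a, b))
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in a directed graph given as edge pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    state = {}  # node -> "visiting" or "done"

    def visit(node):
        if state.get(node) == "visiting":
            return True          # reached a node already on the current path
        if state.get(node) == "done":
            return False
        state[node] = "visiting"
        found = any(visit(nxt) for nxt in graph[node])
        state[node] = "done"
        return found

    return any(visit(node) for node in graph)


# Assumed placement of each metadata update on a metadata block.
placement = {
    "incr P1 ref": "block 1", "decr P2 ref": "block 1", "incr P3 ref": "block 1",
    "map (1)": "block 2", "map (2)": "block 2",
}
overwrite = ["incr P1 ref", "map (1)", "decr P2 ref"]
new_write = ["incr P3 ref", "map (2)"]

# With the decrement delayed, only the critical steps are ordered here; the
# decrement is issued later as a separate write of its block.
overwrite_critical = ["incr P1 ref", "map (1)"]

cyclic_before = has_cycle(block_edges([overwrite, new_write], placement))
cyclic_after = has_cycle(block_edges([overwrite_critical, new_write], placement))
```

With the full overwrite chain, block 1 must be written both before and after block 2, so the two blocks cannot each be merged into a single write. Once the decrement leaves the ordered chain, every edge points from block 1 to block 2 and the cycle disappears.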
I/O Ordering for Failure-Consistency
1. Write new block L; duplicating existing physical block P
   inc. P’s ref. ctr. → map L to P → completion to client
2. Write new block L; no duplicate
   write to new physical block P → set P’s ref. ctr. to 2 → map L to P → completion to client; add P’s fingerprint
3. Overwrite block L mapped to physical block Pold; duplicating physical block Pdup
   inc. Pdup’s ref. ctr. → map L to Pdup → completion to client; dec. Pold’s ref. ctr.
4. Overwrite block L mapped to physical block Pold; no duplicate
   write to physical block Pnew → set Pnew’s ref. ctr. to 2 → map L to Pnew → completion to client; dec. Pold’s ref. ctr.; add Pnew’s fingerprint
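The four I/O paths can be summarized as one function that returns the ordered critical steps and the steps left for later. This is a sketch of my reading of the slide, not the prototype's code: treating the reference-counter decrement and the fingerprint insertion as the delayed steps is an assumption, the initial counter value simply repeats the slide, and `write_path` is a hypothetical name.

```python
def write_path(old_physical, duplicate_of):
    """Return (critical_steps, delayed_steps) for a write to logical block L.

    old_physical: physical block L currently maps to, or None for a new write.
    duplicate_of: existing physical block with the same content, or None.
    """
    critical, delayed = [], []
    if duplicate_of is not None:
        target = duplicate_of
        critical.append(f"inc. {target} ref. ctr.")
    else:
        target = "Pnew"
        critical.append(f"write data to {target}")
        critical.append(f"set {target} ref. ctr. to 2")  # value as on the slide
        delayed.append(f"add {target} fingerprint")
    critical.append(f"map L to {target}")
    critical.append("completion to client")
    if old_physical is not None:
        # Assumed to be off the critical path: a failure before it only
        # leaves the old block's counter too high (garbage).
        delayed.insert(0, f"dec. {old_physical} ref. ctr.")
    return critical, delayed


case1 = write_path(None, "P")          # new write, duplicate
case2 = write_path(None, None)         # new write, no duplicate
case3 = write_path("Pold", "Pdup")     # overwrite, duplicate
case4 = write_path("Pold", None)       # overwrite, no duplicate
```

In every case the critical list ends with the completion signal, and nothing in the delayed list is needed for the data to be reachable after a failure.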
Metadata I/O Merging for Efficiency
} Anticipatory I/O delay and merging
} Delay a metadata write in anticipation of near-future merging opportunities
} Limited delay duration (e.g., 1 millisecond), slight performance impact
} We name our approach OrderMergeDedup
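The effect of a short anticipatory delay can be sketched with a deterministic simulation over arrival timestamps. The function `count_block_writes`, the arrival trace, and the block IDs are made up for illustration, and the sketch ignores ordering constraints between blocks.

```python
def count_block_writes(arrivals, delay_ms):
    """Count metadata block writes when each write waits up to delay_ms to merge.

    arrivals: list of (time_ms, block_id) pairs sorted by time.
    """
    deadline = {}   # block_id -> time at which its pending write is flushed
    writes = 0
    for time_ms, block_id in arrivals:
        if block_id in deadline and time_ms <= deadline[block_id]:
            continue                      # merged into the pending write
        writes += 1                       # a new block write is scheduled
        deadline[block_id] = time_ms + delay_ms
    return writes


# Hypothetical trace: four updates to block A and one to block B.
arrivals = [(0.0, "A"), (0.2, "A"), (0.4, "B"), (0.9, "A"), (2.5, "A")]
no_delay = count_block_writes(arrivals, delay_ms=0.0)
with_delay = count_block_writes(arrivals, delay_ms=1.0)
```

On this trace a 1 ms window cuts the block writes from 5 to 3; the update at 2.5 ms arrives after the window has closed and needs its own write. The bounded window is what keeps the added latency small.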
Evaluation Setup
} Prototype of OrderMergeDedup
} A custom device mapper target of Linux 3.14.29
} Mobile system workloads (Atom-based tablet)
} Ubuntu package update & installation
} BBench web browsing
} Vehicle counting for intelligent traffic sensing
} Server system workloads (Xeon-based server machine)
} Hadoop
} YCSB/MongoDB
Evaluation
} We save 18-63% I/O writes (on workloads with 23-73% write duplication)
[Figure: normalized I/O volume (axis 0 to 1) for Package index update, Package install, BBench web browsing, Vehicle counting, Hadoop, and YCSB/MongoDB. Legend: Deduplicated physical block writes, Original Dmdedup, Failure-consistent write ordering, + Anticipatory I/O delay/merging]
Evaluation (Strong Persistence Model)
} We save 15-51% I/O writes (on workloads with 23-73% write duplication)
} Anticipatory I/O delay/merging is particularly effective
[Figure: normalized I/O volume (axis 0 to 2) for Package index update, Package install, BBench web browsing, Vehicle counting, Hadoop, and YCSB/MongoDB. Off-scale bar labels: 12×, 12×, 5×, 10×, 11×, 11×. Legend: Deduplicated physical block writes, Original Dmdedup, Failure-consistent write ordering, + Anticipatory I/O delay/merging]