Reducing Replication Bandwidth for Distributed Document Databases
Lianghong Xu¹, Andy Pavlo¹, Sudipta Sengupta², Jin Li², Greg Ganger¹
¹Carnegie Mellon University, ²Microsoft Research
Document-oriented Databases
{
  "_id" : "55ca4cf7bad4f75b8eb5c25c",
  "pageId" : "46780",
  "revId" : "41173",
  "timestamp" : "2002-03-30T20:06:22",
  "sha1" : "6i81h1zt22u1w4sfxoofyzmxd",
  "text" : "The Peer and the Peri is a comic [[Gilbert and Sullivan]] [[operetta ]] in two acts… just as predicting,…The fairy Queen, however, appears to … all live happily ever after. "
}

Update:

{
  "_id" : "55ca4cf7bad4f75b8eb5c25d",
  "pageId" : "46780",
  "revId" : "128520",
  "timestamp" : "2002-03-30T20:11:12",
  "sha1" : "q08x58kbjmyljj4bow3e903uz",
  "text" : "The Peer and the Peri is a comic [[Gilbert and Sullivan]] [[operetta ]] in two acts… just as predicted, …The fairy Queen, on the other hand, is ''not'' happy, and appears to … all live happily ever after. "
}
[Figure: the primary database sends operation logs to two secondaries.]
[Figure, three builds: incoming data and dedup'ed data, with chunk boundaries, modified regions, and duplicate regions marked. The last build is labeled "Delta!".]
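The chunk boundaries in the figure can be illustrated with a small content-defined chunking sketch. This is a minimal illustration, not the deck's implementation: it uses a simple polynomial rolling hash in place of a true Rabin fingerprint, and the function name `split_chunks`, the window size, and the boundary mask are all assumptions chosen for the example. It shows why a one-byte modification leaves most chunks of a new revision identical to chunks already seen.

```python
import random

def split_chunks(data: bytes, window: int = 8, mask: int = 0x1F) -> list[bytes]:
    """Split data at positions chosen by a rolling hash of the last `window` bytes.

    Assumption: a polynomial rolling hash stands in for Rabin fingerprinting.
    """
    base, mod = 257, (1 << 31) - 1
    top = pow(base, window - 1, mod)      # weight of the byte leaving the window
    chunks, start, h = [], 0, 0
    for i, b in enumerate(data):
        if i >= window:
            h = (h - data[i - window] * top) % mod   # drop the oldest byte
        h = (h * base + b) % mod                      # add the newest byte
        # Declare a boundary when the low bits of the hash are all zero,
        # as long as the current chunk is at least `window` bytes long.
        if i + 1 - start >= window and (h & mask) == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])
    return chunks

# Hypothetical data: a pseudo-random "document" and a revision with one extra byte.
rng = random.Random(0)
old = bytes(rng.getrandbits(8) for _ in range(4000))
new = old[:2000] + b"!" + old[2000:]

old_chunks = set(split_chunks(old))
new_chunks = split_chunks(new)
unseen = [c for c in new_chunks if c not in old_chunks]
print(len(new_chunks), "chunks in the revision,", len(unseen), "not seen before")
```

Because boundaries depend only on nearby content, chunking resynchronizes shortly after the modified region. A chunk that contains the modification still has to be sent in full under chunk-level deduplication, which is the gap that byte-level delta encoding addresses.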
20GB sampled Wikipedia dataset; MongoDB v2.7; 4MB oplog batches
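[Figure: sDedup workflow. Primary node: client insertions and updates reach the database and the oplog; the sDedup encoder combines unsynchronized oplog entries with source documents to produce dedup'ed oplog entries. Secondary node: the oplog syncer receives the dedup'ed entries; the sDedup decoder re-constructs them using source documents (from the database and the source document cache); the re-constructed oplog is replayed into the database.]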
[Figure: finding similar documents.
Target document -> Rabin chunking -> chunk hashes 32 17 25 41 12
Consistent sampling -> similarity sketch 41 32
Feature index table lookup -> candidate documents:
  Doc #1: 32 25 38 41 12
  Doc #2: 32 17 38 41 12
  Doc #3: 39 32 22 15
Each candidate receives a similarity score.]
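The lookup in the figure can be sketched in a few lines. This is a hedged illustration under stated assumptions: consistent sampling is modeled as "keep the k largest chunk hashes", the numbers listed under each candidate are treated as that document's indexed features, and the function names (`similarity_sketch`, `build_feature_index`, `score_candidates`) are hypothetical.

```python
from collections import defaultdict

def similarity_sketch(chunk_hashes, k=2):
    """Consistent sampling, modeled here as the k largest distinct chunk hashes."""
    return sorted(set(chunk_hashes), reverse=True)[:k]

def build_feature_index(doc_features):
    """Map each feature to the set of documents that contain it."""
    index = defaultdict(set)
    for doc_id, features in doc_features.items():
        for feature in features:
            index[feature].add(doc_id)
    return index

def score_candidates(sketch, index):
    """Score each candidate by how many sketch features it shares with the target."""
    scores = defaultdict(int)
    for feature in sketch:
        for doc_id in index.get(feature, ()):
            scores[doc_id] += 1
    return dict(scores)

# Values as read from the figure (the grouping per document is an assumption).
target_chunks = [32, 17, 25, 41, 12]
stored = {
    "Doc #1": [32, 25, 38, 41, 12],
    "Doc #2": [32, 17, 38, 41, 12],
    "Doc #3": [39, 32, 22, 15],
}

sketch = similarity_sketch(target_chunks)
index = build_feature_index(stored)
scores = score_candidates(sketch, index)
print(sketch, scores)
```

With these inputs the sketch is `[41, 32]`; Doc #1 and Doc #2 each share both features and Doc #3 shares one, which matches the initial scores shown in the ranking table.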
Source Document Cache

Initial Ranking
Rank    Candidates    Score
1       Doc #1        2
1       Doc #2        2
2       Doc #3        1

Is doc cached? If yes, reward +2

Final Ranking
Rank    Candidates    Cached?    Score
1       Doc #1        Yes        4
1       Doc #3        Yes        3
2       Doc #2        No         2
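The cache-aware re-ranking rule can be written down directly. The sketch below is a minimal illustration, assuming the initial scores from the table and assuming Doc #1 and Doc #3 are the cached documents; the function name `rerank` and the `reward` parameter are hypothetical.

```python
def rerank(initial_scores, cached, reward=2):
    """Add `reward` to every cached candidate, then sort by final score (highest first)."""
    final = {
        doc: score + (reward if doc in cached else 0)
        for doc, score in initial_scores.items()
    }
    return sorted(final.items(), key=lambda item: item[1], reverse=True)

initial_scores = {"Doc #1": 2, "Doc #2": 2, "Doc #3": 1}
cached = {"Doc #1", "Doc #3"}          # assumed contents of the source document cache

final_ranking = rerank(initial_scores, cached)
print(final_ranking)
```

The reward trades a little similarity for locality: Doc #3 overtakes Doc #2 because using a cached source document avoids a database read, even though its initial similarity score is lower.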
Compression ratio by chunk size (20GB sampled Wikipedia dataset)

Chunk size    sDedup    trad-dedup
4KB           9.9       2.3
1KB           26.3      4.6
256B          38.4      9.1
64B           38.9      15.2
Memory (MB) by chunk size (20GB sampled Wikipedia dataset)

Chunk size    sDedup    trad-dedup
4KB           34.1      80.2
1KB           47.9      133.0
256B          57.3      272.5
64B           61.0      780.5
Compression ratio by chunk size (10GB sampled StackExchange dataset)

Chunk size    sDedup    trad-dedup
4KB           1.0       1.0
1KB           1.2       1.0
256B          1.3       1.1
64B           1.8       1.2
Memory (MB) by chunk size (10GB sampled StackExchange dataset)

Chunk size    sDedup    trad-dedup
4KB           83.9      302.0
1KB           115.4     439.8
256B          228.4     899.2
64B           414.3     3,082.5
20GB sampled Wikipedia dataset.
Compression ratio by number of shards (20GB sampled Wikipedia dataset)

Number of shards    Compression ratio
1                   38.4
3                   38.2
5                   38.1
9                   37.9
– Based on the xDelta algorithm
– Improved speed with minimal loss of compression
– Descriptors about duplicate/unique regions + unique bytes
– Use source doc + encoded output
– Concatenate byte regions in order
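The encode and decode steps in the bullets above can be sketched as follows. This is not xDelta and not the deck's implementation: matching regions are found with Python's standard `difflib.SequenceMatcher`, and the descriptor format (`("copy", offset, length)` and `("insert", bytes)`) and the function names are assumptions made for illustration.

```python
from difflib import SequenceMatcher

def delta_encode(source: bytes, target: bytes):
    """Describe target as copies from source plus the unique bytes to insert."""
    ops = []
    matcher = SequenceMatcher(None, source, target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2 - i1))        # duplicate region: offset, length
        elif j2 > j1:
            ops.append(("insert", target[j1:j2]))    # unique region: raw bytes
        # A pure deletion contributes nothing to the target, so nothing is emitted.
    return ops

def delta_decode(source: bytes, ops) -> bytes:
    """Rebuild the target by concatenating the described byte regions in order."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, offset, length = op
            out += source[offset:offset + length]
        else:
            out += op[1]
    return bytes(out)

# Fragments adapted from the two Wikipedia revisions shown earlier in the deck.
source = b"just as predicting, The fairy Queen, however, appears to"
target = b"just as predicted, The fairy Queen, on the other hand, is ''not'' happy, and appears to"

ops = delta_encode(source, target)
unique_bytes = sum(len(op[1]) for op in ops if op[0] == "insert")
print(len(target), "target bytes,", unique_bytes, "sent as unique bytes")
```

Only the descriptors and the unique bytes need to travel to the secondary, which holds the source document and can run `delta_decode` locally.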