 
Reducing Replication Bandwidth for Distributed Document Databases
Lianghong Xu (1), Andy Pavlo (1), Sudipta Sengupta (2), Jin Li (2), Greg Ganger (1)
(1) Carnegie Mellon University, (2) Microsoft Research
#1 – You can sleep with grad students but not undergrads.
#2 – Keep a bottle of water in your office in case a student breaks down crying.
#3 – Kids love MongoDB, but they want to go work for Google.
Reducing Replication Bandwidth for Distributed Document Databases. In ACM Symposium on Cloud Computing, pp. 1-12, August 2015.
More info: http://cmudb.io/doc-dbs
Replication Bandwidth
[Diagram labels: Primary, Database, Operation logs, WAN, Secondary, MMS]
Goal: Reduce bandwidth for WAN geo-replication.
Why Deduplication?
• Why not just compress?
  – Oplog batches are small and do not have enough overlap.
• Why not just use diff?
  – Needs application guidance to identify the source.
• Deduplication finds and removes redundancies.
Traditional Dedup
[Figure: incoming data is split at chunk boundaries into chunks 1 to 5; the legend marks modified and duplicate regions. The deduped data keeps chunks 1, 2, 4, and 5.]
Send deduped data to replicas.
Traditional Dedup
[Figure: same legend (chunk boundary, modified region, duplicate region); incoming data vs. deduped data.]
Must send the entire document.
Similarity Dedup
[Figure: same legend; the modified regions of the incoming data are each marked "Delta!".]
Only send delta encoding.
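To make the contrast between the two approaches concrete, here is a minimal sketch of chunk-level dedup. It assumes fixed-size 8-byte chunks and SHA-1 digests purely for illustration (the slides use Rabin chunking, which is not implemented here), and the function names are hypothetical.

```python
import hashlib

def chunks(data: bytes, size: int = 8):
    """Split data into fixed-size chunks (a simplification of content-defined chunking)."""
    return [data[i:i + size] for i in range(0, len(data), size)]

def dedup_bytes_to_send(data: bytes, seen: set, size: int = 8) -> int:
    """Return how many bytes must be sent when only previously unseen chunks are shipped."""
    sent = 0
    for chunk in chunks(data, size):
        digest = hashlib.sha1(chunk).digest()
        if digest not in seen:
            seen.add(digest)
            sent += len(chunk)
    return sent

old = b"AAAAAAAABBBBBBBBCCCCCCCCDDDDDDDD"
one_edit = b"AAAAAAAABBBBxBBBCCCCCCCCDDDDDDDD"    # a single chunk is modified
scattered = b"AxAAAAAABBBBxBBBCCCxCCCCDDDDDDxD"   # every chunk is modified

seen = set()
dedup_bytes_to_send(old, seen)                    # prime the chunk index with the old version
print(dedup_bytes_to_send(one_edit, set(seen)))   # only the modified chunk is sent
print(dedup_bytes_to_send(scattered, set(seen)))  # all chunks are new: the whole document is sent
```

With one small edit per chunk, every chunk hash changes and chunk-level dedup saves nothing, which is the case where sending only a delta against a similar document pays off.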
Compress vs. Dedup
[Chart: 20GB sampled Wikipedia dataset; MongoDB v2.7; 4MB oplog batches.]
sDedup: Similarity Dedup
[Architecture diagram spanning a primary node and a secondary node. Labels: client insertions & updates, oplog, oplog syncer, unsynchronized oplog entries, source documents, source document cache, delta compressor, delta decompressor, database, deduplicated oplog entries, re-constructed oplog entries, replay.]
Encoding Steps
• Identify Similar Documents
• Select the Best Match
• Delta Compression
Identify Similar Documents
[Figure, reconstructed from the extracted labels:
Target document → Rabin chunking → chunk hashes 32 17 25 41 12
Consistent sampling → similarity sketch: 41 32
Feature index table: 32 → Doc #1, Doc #2, Doc #3; 41 → Doc #2, Doc #3
Candidate documents and similarity scores:
  Doc #1 (39 32 22 15): score 1
  Doc #2 (32 25 38 41 12): score 2
  Doc #3 (32 17 38 41 12): score 2]
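The lookup on this slide can be sketched as follows. This is an illustration rather than the sDedup code: it takes chunk hashes as given integers (no Rabin chunking), treats "consistent sampling" as keeping the k largest hashes, and indexes every chunk hash of the stored documents, which is an assumption made so that the scores match the slide's example. All function names are hypothetical.

```python
from collections import Counter, defaultdict

def similarity_sketch(chunk_hashes, k=2):
    """Consistent sampling: keep the k largest distinct chunk hashes."""
    return sorted(set(chunk_hashes), reverse=True)[:k]

def build_feature_index(docs):
    """Map each feature (chunk hash) to the ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, chunk_hashes in docs.items():
        for h in chunk_hashes:
            index[h].add(doc_id)
    return index

def candidate_scores(target_hashes, index, k=2):
    """Score each candidate by how many features of the target's sketch it shares."""
    scores = Counter()
    for feature in similarity_sketch(target_hashes, k):
        for doc_id in index.get(feature, ()):
            scores[doc_id] += 1
    return dict(scores)

# Chunk hashes taken from the slide's example.
docs = {
    "Doc #1": [39, 32, 22, 15],
    "Doc #2": [32, 25, 38, 41, 12],
    "Doc #3": [32, 17, 38, 41, 12],
}
target = [32, 17, 25, 41, 12]

index = build_feature_index(docs)
print(similarity_sketch(target))        # [41, 32]
print(candidate_scores(target, index))  # Doc #1 scores 1; Doc #2 and Doc #3 score 2
```

Keeping the same k largest hashes for every document means two similar documents tend to select the same features, so a small sketch is enough to find candidates without comparing full documents.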
Selecting the Best Match
[Figure, reconstructed from the extracted labels:
Initial ranking (by score): Doc #2 (score 2), Doc #3 (score 2), Doc #1 (score 1)
Source document cache: Is doc cached? If yes, reward 3x.
Final ranking: Doc #3 (cached: yes, score 6), Doc #1 (cached: yes, score 3), Doc #2 (cached: no, score 2)]
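A small sketch of the re-ranking step, using the scores as I read them from the slide's example. It interprets "reward 3x" as multiplying the score of a cached candidate by 3; that interpretation, the tie-break by document id, and the function name are assumptions.

```python
def rank_candidates(scores, cached, reward=3):
    """Multiply the score of cached candidates by `reward`, then sort best-first."""
    final = {
        doc: (score * reward if doc in cached else score)
        for doc, score in scores.items()
    }
    # Sort by descending score; break ties by document id for determinism.
    return sorted(final.items(), key=lambda item: (-item[1], item[0]))

initial = {"Doc #1": 1, "Doc #2": 2, "Doc #3": 2}
cached = {"Doc #1", "Doc #3"}   # documents present in the source document cache

ranking = rank_candidates(initial, cached)
print(ranking)   # [('Doc #3', 6), ('Doc #1', 3), ('Doc #2', 2)]
```

The reward biases selection toward sources that are already in memory, so the delta can be computed without first fetching the source document from the database.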
Delta Compression
• Byte-level diff between source and target docs:
  – Based on the xDelta algorithm
  – Improved speed with minimal loss of compression
• Encoding:
  – Descriptors about duplicate/unique regions + unique bytes
• Decoding:
  – Use source doc + encoded output
  – Concatenate byte regions in order
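The encode/decode structure described above can be illustrated with a minimal sketch. It is not xDelta: it swaps in Python's standard `difflib.SequenceMatcher` to find matching regions, and the `COPY`/`INSERT` tuple format is a hypothetical stand-in for the descriptors mentioned on the slide.

```python
from difflib import SequenceMatcher

def delta_encode(source: bytes, target: bytes):
    """Encode target as COPY (offset, length) regions of source plus INSERT (unique bytes)."""
    ops = []
    matcher = SequenceMatcher(None, source, target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("COPY", i1, i2 - i1))          # duplicate region: reference the source
        elif tag in ("replace", "insert"):
            ops.append(("INSERT", target[j1:j2]))      # unique region: ship the bytes
        # "delete": source bytes that are absent from the target need no output
    return ops

def delta_decode(source: bytes, ops) -> bytes:
    """Rebuild the target by concatenating the described regions in order."""
    out = bytearray()
    for op in ops:
        if op[0] == "COPY":
            _, offset, length = op
            out += source[offset:offset + length]
        else:
            out += op[1]
    return bytes(out)

source = b"The quick brown fox jumps over the lazy dog"
target = b"The quick red fox jumps over the lazy dogs"

ops = delta_encode(source, target)
assert delta_decode(source, ops) == target
unique = sum(len(op[1]) for op in ops if op[0] == "INSERT")
print(f"{unique} unique bytes sent for a {len(target)}-byte target")
```

Only the `INSERT` payloads and the small `COPY` descriptors cross the network; the secondary already holds the source document, so decoding needs nothing else.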
Evaluation
• MongoDB setup (v2.7)
  – 1 primary, 1 secondary node, 1 client
  – Node config: 4 cores, 8GB RAM, 100GB HDD storage
• Datasets:
  – Wikipedia dump (20GB out of ~12TB)
  – Stack Exchange data dump (10GB out of ~100GB)
Compression: Wikipedia
[Bar chart: compression ratio (axis 0 to 50) by chunk size (4KB, 1KB, 256B, 64B) for sDedup and trad-dedup on the 20GB sampled Wikipedia dataset. Data labels as extracted: 38.9, 38.4, 26.3, 15.2, 9.9, 9.1, 4.6, 2.3.]
Memory: Wikipedia
[Bar chart: memory in MB (axis 0 to 800) by chunk size (4KB, 1KB, 256B, 64B) for sDedup and trad-dedup on the 20GB sampled Wikipedia dataset. Data labels as extracted: 780.5, 272.5, 133.0, 80.2, 61.0, 57.3, 47.9, 34.1.]
Compression: StackExchange
[Bar chart: compression ratio (axis 0 to 5) by chunk size (4KB, 1KB, 256B, 64B) for sDedup and trad-dedup on the 10GB sampled StackExchange dataset. Data labels as extracted: 1.8, 1.3, 1.2, 1.2, 1.1, 1.0, 1.0, 1.0.]
Throughput Overhead
Dedup + Sharding
[Bar chart: compression ratio (axis 0 to 50) by number of shards on the 20GB sampled Wikipedia dataset.]
# of shards: 1, 3, 5, 9
Compression ratio: 38.4, 38.2, 38.1, 37.9
Failure Recovery
[Chart with a marked failure point; 20GB sampled Wikipedia dataset.]
Conclusion
• Similarity-based deduplication for replicated document databases.
• sDedup for MongoDB (v2.7)
  – Much greater data reduction than traditional dedup
  – Up to 38x compression ratio for Wikipedia
  – Resource-efficient design for inline deduplication with negligible performance overhead
What’s Next?
• Port code to MongoDB v3.1.
• Integrate sDedup into the WiredTiger storage manager.
• Test with more workloads.
• Try not to get anyone pregnant.
WiredTiger vs. sDedup
Compression ratio (20GB sampled Wikipedia dataset):
• Snappy: 1.6x
• zLib: 3.0x
• sDedup (no compress): 38.4x
• sDedup + Snappy: 60.8x
• sDedup + zLib: 114.5x
END @andy_pavlo