
Reducing Replication Bandwidth for Distributed Document Databases



  1. Reducing Replication Bandwidth for Distributed Document Databases. Lianghong Xu¹, Andy Pavlo¹, Sudipta Sengupta², Jin Li², Greg Ganger¹ (¹Carnegie Mellon University, ²Microsoft Research)

  2. Document-oriented Databases
     Before update:
     { "_id": "55ca4cf7bad4f75b8eb5c25c",
       "pageId": "46780",
       "revId": "41173",
       "timestamp": "2002-03-30T20:06:22",
       "sha1": "6i81h1zt22u1w4sfxoofyzmxd",
       "text": "The Peer and the Peri is a comic [[Gilbert and Sullivan]] [[operetta]] in two acts… just as predicting,…The fairy Queen, however, appears to … all live happily ever after." }
     After update:
     { "_id": "55ca4cf7bad4f75b8eb5c25d",
       "pageId": "46780",
       "revId": "128520",
       "timestamp": "2002-03-30T20:11:12",
       "sha1": "q08x58kbjmyljj4bow3e903uz",
       "text": "The Peer and the Peri is a comic [[Gilbert and Sullivan]] [[operetta]] in two acts… just as predicted, …The fairy Queen, on the other hand, is ''not'' happy, and appears to … all live happily ever after." }
     Update: reading a recent document and writing back a similar one.

  3. Replication Bandwidth
     [Diagram: the primary node ships operation logs (oplogs) over the WAN for geo-replication to secondary nodes; each oplog entry carries a full document like the pair shown on slide 2.]
     Goal: reduce replication bandwidth.

  4. Why Deduplication?
     • Why not just compress? Oplog batches are small and contain too little internal overlap.
     • Why not just use diff? Diff needs application guidance to identify the source document.
     • Dedup finds and removes redundancy across the entire data corpus.

  5. Traditional Dedup: Ideal
     [Diagram: an incoming byte stream split at chunk boundaries into chunks 1-5, with the modified region falling entirely inside chunk 3.]
     In the ideal case, chunk boundaries isolate the modified region: chunks 1, 2, 4, and 5 duplicate data the replicas already have, so only the modified chunk needs to be sent. (A minimal sketch of this chunk-index scheme follows.)
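     To make the scheme concrete, here is a minimal sketch of traditional chunk-based dedup under simplifying assumptions: fixed-size chunks stand in for the content-defined (Rabin) chunking real systems use, and chunk_index is an illustrative in-memory stand-in for the replica-synchronized chunk index. All names are hypothetical, not from the talk's implementation.

```python
import hashlib

CHUNK_SIZE = 4096          # assumed chunk size; the slides vary it from 4KB to 64B
chunk_index = set()        # hashes of chunks the replicas are already known to have

def dedup(data: bytes):
    """Split data into fixed-size chunks and emit a reference for every
    chunk whose hash has been seen before; only new chunks carry bytes."""
    out = []
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        h = hashlib.sha1(chunk).digest()
        if h in chunk_index:
            out.append(("dup", h))       # duplicate region: send only a reference
        else:
            chunk_index.add(h)
            out.append(("new", chunk))   # modified/unique region: send the bytes
    return out
```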

  6. Traditional Dedup: Reality
     [Diagram: the same byte stream, but the modified region straddles chunk boundaries.]
     In reality, edits rarely line up with chunk boundaries: here only chunk 4 is still a duplicate, so almost the entire document must be sent.

  7. Similarity Dedup
     [Diagram: the same byte stream, deduplicated against a similar document.]
     Instead of requiring exact chunk matches, similarity dedup finds a similar source document and sends only a delta encoding of the changes against it.

  8. Compress vs. Dedup
     [Bar chart comparing compression of oplog batches against deduplication.]
     20GB sampled Wikipedia dataset; MongoDB v2.7; 4MB oplog batches.

  9. sDedup: Similarity Dedup
     [Architecture diagram] On the primary node, client insertions and updates are appended to the oplog; an oplog syncer feeds unsynchronized oplog entries to the sDedup encoder, which consults the database (for source documents) and a source document cache to produce dedup'ed oplog entries. On the secondary node, the sDedup decoder reconstructs the original oplog entries from its own copy of the source documents and replays them.

  10. sDedup Encoding Steps
     • Identify similar documents
     • Select the best match
     • Delta compression
     (Each step is detailed on the following slides; a sketch of the overall pipeline is shown below.)
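     As a rough illustration of how the three steps compose, here is a hypothetical encoder loop. It reuses the helper names from the sketches under slides 11, 12, and 24 (identify_similar, select_best_match, delta_encode); fetch_doc is an assumed hook that retrieves a document from the source document cache or the database. None of these names come from the actual sDedup implementation.

```python
def sdedup_encode(target_doc: bytes):
    """Sketch of the three-step sDedup encoding pipeline."""
    candidates = identify_similar(target_doc)   # step 1: similarity sketch lookup
    best_id = select_best_match(candidates)     # step 2: rank; prefer cached sources
    if best_id is None:
        return ("literal", target_doc)          # no similar doc: ship the raw bytes
    source_doc = fetch_doc(best_id)             # assumed cache/database accessor
    return ("delta", best_id, delta_encode(source_doc, target_doc))  # step 3
```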

  11. Identify Similar Documents
     The target document is Rabin-chunked and each chunk is hashed to a 32-bit feature (here: 32 17 25 41 12). Consistent sampling keeps the top features as the similarity sketch (here: 41 32). Each sketch feature is looked up in a feature index table that maps features to the documents containing them:
       Doc #1: 39 32 22 15      → similarity score 1 (shares 32)
       Doc #2: 32 25 38 41 12   → similarity score 2 (shares 32 and 41)
       Doc #3: 32 17 38 41 12   → similarity score 2 (shares 32 and 41)
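     A minimal sketch of this step, with simplifying assumptions: fixed-size chunks stand in for Rabin chunking, SHA-1 prefixes stand in for the 32-bit feature hash, and a plain dict plays the role of the feature index table. All names are illustrative.

```python
import hashlib

SKETCH_SIZE = 2     # the slide keeps the top-2 features as the sketch
feature_index = {}  # feature -> ids of documents containing it (feature index table)

def features(doc: bytes, chunk_size: int = 64):
    """Hash every chunk to a 32-bit feature (fixed-size chunks for brevity)."""
    return [int.from_bytes(hashlib.sha1(doc[i:i + chunk_size]).digest()[:4], "big")
            for i in range(0, len(doc), chunk_size)]

def sketch(doc: bytes):
    """Consistent sampling: deterministically keep the k largest features."""
    return sorted(set(features(doc)), reverse=True)[:SKETCH_SIZE]

def index_doc(doc_id, doc: bytes):
    """Register a stored document's sketch features in the index."""
    for f in sketch(doc):
        feature_index.setdefault(f, []).append(doc_id)

def identify_similar(doc: bytes):
    """Score candidate documents by how many sketch features they share."""
    scores = {}
    for f in sketch(doc):
        for doc_id in feature_index.get(f, []):
            scores[doc_id] = scores.get(doc_id, 0) + 1
    return scores   # e.g. {"doc2": 2, "doc3": 2, "doc1": 1}
```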

  12. Select the Best Match
     Initial ranking (similarity score only):
       Rank 1: Doc #2 (score 2)
       Rank 1: Doc #3 (score 2)
       Rank 2: Doc #1 (score 1)
     Is the doc in the source document cache? If yes, reward +2.
     Final ranking:
       Rank 1: Doc #3 (cached: yes, score 4)
       Rank 2: Doc #1 (cached: yes, score 3)
       Rank 3: Doc #2 (cached: no, score 2)
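     A sketch of the re-ranking under the same illustrative assumptions; source_doc_cache is a stand-in for the encoder's source document cache, and the +2 reward is the value quoted on the slide.

```python
CACHE_REWARD = 2        # slide: cached candidates get a +2 reward
source_doc_cache = {}   # doc_id -> document bytes held in the encoder's cache

def select_best_match(scores: dict):
    """Return the candidate with the highest final score. Candidates already
    in the source document cache are rewarded, since using a cached source
    avoids a database read to fetch the document."""
    if not scores:
        return None
    return max(scores,
               key=lambda d: scores[d] + (CACHE_REWARD if d in source_doc_cache else 0))
```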

  13. Evaluation
     • MongoDB setup (v2.7): 1 primary node, 1 secondary node, 1 client.
       Node config: 4 cores, 8GB RAM, 100GB HDD storage.
     • Datasets: Wikipedia dump (20GB sampled out of ~12TB); additional datasets evaluated in the paper.

  14. Compression
     Compression ratio by chunk size, 20GB sampled Wikipedia dataset:
       Chunk size    sDedup    trad-dedup
       4KB           38.9      2.3
       1KB           38.4      4.6
       256B          26.3      9.1
       64B           9.9       15.2

  15. Memory
     Index memory by chunk size, 20GB sampled Wikipedia dataset:
       Chunk size    sDedup      trad-dedup
       4KB           47.9 MB     34.1 MB
       1KB           57.3 MB     80.2 MB
       256B          61.0 MB     272.5 MB
       64B           133.0 MB    780.5 MB

  16. Other Results (See Paper)
     • Negligible client performance overhead
     • Failure recovery is quick and easy
     • Sharding does not hurt compression ratio
     • More datasets: Microsoft Exchange, Stack Exchange

  17. Conclusion & Future Work
     • sDedup: similarity-based deduplication for replicated document databases.
       – Much greater data reduction than traditional dedup
       – Up to 38x compression ratio for Wikipedia
       – Resource-efficient design with negligible overhead
     • Future work:
       – More diverse datasets
       – Dedup for local database storage
       – Different similarity search schemes (e.g., super-fingerprints)

  18. Backup Slides

  19. Compression: StackExchange
     [Bar chart: compression ratio by chunk size (4KB, 1KB, 256B, 64B) for sDedup and trad-dedup. Both achieve little compression on this dataset: ratios range from 1.0 to 1.8.]
     10GB sampled StackExchange dataset.

  20. Memory: StackExchange
     [Bar chart: index memory by chunk size (4KB, 1KB, 256B, 64B). Values range from 83.9 MB to 3,082.5 MB, with trad-dedup's index growing sharply at small chunk sizes.]
     10GB sampled StackExchange dataset.

  21. Throughput Overhead

  22. Failure Recovery
     [Chart with the failure point marked; 20GB sampled Wikipedia dataset.]

  23. Dedup + Sharding
     Compression ratio by number of shards, 20GB sampled Wikipedia dataset:
       Shards:   1      3      5      9
       Ratio:    38.4   38.2   38.1   37.9

  24. Delta Compression
     • Byte-level diff between source and target documents:
       – Based on the xDelta algorithm
       – Improved speed with minimal loss of compression
     • Encoding: descriptors of duplicate/unique regions, plus the unique bytes
     • Decoding: using the source doc and the encoded output, concatenate the byte regions in order
     (A sketch of this encode/decode format follows.)
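     To make the descriptor format concrete, here is a toy delta coder. It is not xDelta: Python's difflib stands in for the matcher (far slower, but it yields the same kind of copy/insert structure), and the op-tuple format is invented for illustration.

```python
from difflib import SequenceMatcher

def delta_encode(source: bytes, target: bytes):
    """Encode target against source as copy descriptors (duplicate regions,
    expressed as offset+length into source) plus literal inserts (unique bytes)."""
    ops = []
    for tag, i1, i2, j1, j2 in SequenceMatcher(a=source, b=target).get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2 - i1))       # duplicate region descriptor
        elif j2 > j1:                               # 'replace' or 'insert'
            ops.append(("insert", target[j1:j2]))   # unique bytes
    return ops

def delta_decode(source: bytes, ops) -> bytes:
    """Rebuild the target by concatenating the byte regions in order."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, offset, length = op
            out += source[offset:offset + length]
        else:
            out += op[1]
    return bytes(out)

# Round-trip check: the decoder reproduces the target exactly.
src = b"The fairy Queen, however, appears to ..."
tgt = b"The fairy Queen, on the other hand, appears to ..."
assert delta_decode(src, delta_encode(src, tgt)) == tgt
```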

