Rollerchain: a DHT for Efficient Replication (IEEE NCA'13), João Paiva

Rollerchain: a DHT for Efficient Replication. IEEE NCA'13. João Paiva, João Leitão, Luís Rodrigues. Instituto Superior Técnico / Inesc-ID, Lisboa, Portugal. August 22, 2013. Outline: Introduction, Our approach, Evaluation


  1. Rollerchain: a DHT for Efficient Replication. IEEE NCA'13. João Paiva, João Leitão, Luís Rodrigues. Instituto Superior Técnico / Inesc-ID, Lisboa, Portugal. August 22, 2013

  2. Outline Introduction Our approach Evaluation Conclusions

  3. Motivation ◮ Distributed Hash Tables (DHTs) are structured overlays where nodes organize into a predefined topology that supports routing. ◮ DHTs allow for scalable key-value storage.

  4. Motivation ◮ In dynamic environments, replication is paramount to maintaining data. ◮ However, predefined topologies are expensive to maintain in dynamic environments (churn). ◮ DHTs do not handle churn as well as unstructured networks.

  6. Main Approaches to DHT replication 1. Neighbour Replication 2. Multi-Publication

  7. Neighbour Replication Each node replicates its data on its R closest neighbours ◮ Good control on replication degree ◮ Simple to locate replicas ◮ Expensive replication: data is moved to respect topological constraints ◮ Not resilient under churn: each node acts on its own ◮ Poor load balancing: no active mechanisms to balance load

  9. Neighbour Replication: operation
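
A minimal Python sketch of how neighbour replication might pick replica holders on a ring. It assumes that "closest neighbours" means the key's successor followed by the next nodes clockwise, and that the replica set has R members in total; the function name `neighbour_replicas` and the integer identifiers are hypothetical and do not come from the presentation.

```python
import bisect


def neighbour_replicas(node_ids, key_id, r):
    """Return the nodes holding key_id under neighbour replication.

    Assumption: the owner is the first node clockwise from key_id, and the
    replica set is the owner plus the nodes that follow it on the ring.
    """
    ring = sorted(node_ids)
    # Index of the first node whose identifier is >= key_id, wrapping around.
    start = bisect.bisect_left(ring, key_id) % len(ring)
    count = min(r, len(ring))
    return [ring[(start + i) % len(ring)] for i in range(count)]


before = neighbour_replicas([10, 20, 30, 40], key_id=25, r=3)
# Node 30 fails: the replica set shifts along the ring to restore r copies.
after = neighbour_replicas([10, 20, 40], key_id=25, r=3)
newly_loaded = set(after) - set(before)
```

With this ring, `before` is `[30, 40, 10]` and `after` is `[40, 10, 20]`, so node 20 must receive a fresh copy of the object. This is the sense in which data is moved only to respect the topology.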

  14. Multi-Publication Each object is attributed R different identifiers to be stored by R different nodes. ◮ Better load balancing ◮ Reduced correlated failures ◮ Expensive overlay maintenance: each object has a different set of replicas ◮ Expensive replication: data is moved to respect topological constraints ◮ Not resilient under churn: each node acts on its own
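
As a hedged illustration of multi-publication, the sketch below derives R identifiers for one object by hashing the key together with a replica index, then maps each identifier to its successor on the ring. The hashing scheme, the 32-bit identifier space, and the names `ring_id`, `replica_ids`, `successor`, and `multi_publication_replicas` are assumptions made for this example, not details from the presentation.

```python
import bisect
import hashlib


def ring_id(name: str, bits: int = 32) -> int:
    """Hash a string to a position on a ring of 2**bits identifiers."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << bits)


def replica_ids(key: str, r: int, bits: int = 32) -> list:
    """Assumed scheme: identifier i is the hash of 'key#i'."""
    return [ring_id(f"{key}#{i}", bits) for i in range(r)]


def successor(node_ids, ident: int) -> int:
    """First node clockwise from ident, wrapping around the ring."""
    ring = sorted(node_ids)
    return ring[bisect.bisect_left(ring, ident) % len(ring)]


def multi_publication_replicas(node_ids, key: str, r: int) -> list:
    # Note: this simple version does not force the r holders to be distinct;
    # two identifiers can land on the same node in a small ring.
    return [successor(node_ids, i) for i in replica_ids(key, r)]
```

Because each object's identifiers are scattered independently, two objects stored on the same node generally have different replica sets, which is why overlay maintenance becomes more expensive than with neighbour replication.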

  16. Current DHTs Based on structured networks Characterized by: ◮ Nodes with fixed positions in the overlay ◮ Static replication degree ◮ Poor performance under churn

  17. Main challenges Challenges: 1. Increase churn resilience 2. Minimize replication costs 3. Improve load balancing

  18. Outline Introduction Our approach Evaluation Conclusions

  19. Our approach: Architecture overview ◮ Ring-based overlay: Composed of virtual nodes

  21. Our approach: Dynamic topology overview

  28. Our approach: beating the challenges 1. Increase churn resilience: unstructured networks 2. Minimize replication costs: variable replication degree 3. Improve load balancing: dynamic key distribution

  30. Increasing churn resilience ◮ Ring maintained through gossip mechanisms

  31. Increasing churn resilience ◮ Gossip to keep virtual node membership up-to-date

  32. Increasing churn resilience ◮ Gossip to trade connections between virtual nodes
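
The slides do not spell out the gossip protocol, so the following is only a generic push-pull membership sketch, not Rollerchain's actual mechanism. It assumes each node keeps a set of known members, that every known peer is alive, and that a gossip step merges the views of two nodes; `exchange` and `gossip_round` are hypothetical names.

```python
import random


def exchange(views, a, b):
    """Push-pull step: both nodes end up with the union of their views."""
    merged = views[a] | views[b] | {a, b}
    views[a] = set(merged)
    views[b] = set(merged)


def gossip_round(views, rng):
    """Each node gossips once with a random peer taken from its own view."""
    for node in sorted(views):
        peers = sorted(views[node] - {node})
        if peers:
            exchange(views, node, rng.choice(peers))


# Three members of one virtual node, each initially knowing one other member.
views = {"a": {"a", "b"}, "b": {"b", "c"}, "c": {"c", "a"}}
gossip_round(views, random.Random(0))
```

In this three-node example a single round is enough for every node to learn the full membership, whichever peers the random generator selects.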

  33. Increasing churn resilience

  38. Our approach: beating the challenges 1. Increase churn resilience: unstructured networks 2. Minimize replication costs: variable replication degree 3. Improve load balancing: dynamic key distribution

  39. Minimizing replication costs: node failure ◮ Variable replication degree: No data movement on failure

  42. Minimizing replication costs: node join ◮ Nodes can select where to join: may join recently-failed virtual nodes

  45. Minimizing replication costs: node join ◮ New nodes can replace failed nodes: Blue’s data was moved only once and never discarded
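
A small sketch of the failure and join behaviour described on these slides, under stated assumptions. It models a virtual node as a group of physical members that all hold the same keys, so the replication degree is simply the group size. The `VirtualNode` class and the policy in `choose_virtual_node` (join the group with the fewest members) are illustrative assumptions; the presentation only says that nodes may join recently-failed virtual nodes.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualNode:
    name: str
    members: set = field(default_factory=set)
    keys: set = field(default_factory=set)

    def fail(self, node: str) -> int:
        """A member fails: the degree drops, and no object is transferred."""
        self.members.discard(node)
        return 0

    def join(self, node: str) -> int:
        """A newcomer copies the group's keys once; returns objects copied."""
        self.members.add(node)
        return len(self.keys)


def choose_virtual_node(vnodes):
    """Assumed policy: prefer the virtual node with the fewest members."""
    return min(vnodes, key=lambda v: (len(v.members), v.name))


blue = VirtualNode("blue", {"n1", "n2", "n3"}, {"k1", "k2"})
red = VirtualNode("red", {"n4", "n5", "n6"}, {"k3"})

moved_on_failure = blue.fail("n2")        # replication degree 3 -> 2
target = choose_virtual_node([blue, red])  # the depleted group
moved_on_join = target.join("n7")         # newcomer fetches blue's keys
```

Here the failure moves no objects, and the replacement node copies blue's two objects exactly once, which mirrors the claim that the data is moved only once and never discarded.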

  48. Our approach: beating the challenges 1. Increase churn resilience: unstructured networks 2. Minimize replication costs: variable replication degree 3. Improve load balancing: dynamic key distribution

  49. Improving load balancing: creating dynamic key distribution ◮ Virtual nodes store a number of keys proportional to their size: Blue’s data is split proportionally among its children
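
One way to read a proportional split is as a division of the parent's contiguous key range into sub-ranges whose widths follow the sizes of the child virtual nodes. The sketch below illustrates that reading only; the half-open integer ranges and the function name `split_range` are assumptions, and the presentation does not describe the actual splitting procedure.

```python
def split_range(lo, hi, sizes):
    """Split the half-open key range [lo, hi) proportionally to sizes.

    sizes holds the number of physical nodes in each child virtual node.
    Boundaries are computed from cumulative sizes so the sub-ranges tile
    the parent range exactly, with no gaps or overlaps.
    """
    total = sum(sizes)
    if total <= 0:
        raise ValueError("at least one child must have a positive size")
    span = hi - lo
    ranges = []
    start = lo
    cumulative = 0
    for size in sizes:
        cumulative += size
        end = lo + span * cumulative // total
        ranges.append((start, end))
        start = end
    return ranges


# A parent covering keys 0..99 splits into children of 3 and 1 members.
children = split_range(0, 100, [3, 1])
```

In this example `children` is `[(0, 75), (75, 100)]`: the child with three members takes three quarters of the key range.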

  52. Outline Introduction Our approach Evaluation Conclusions

  53. Experimental settings ◮ Overlay simulation in PeerSim ◮ 100K nodes ◮ 50K keys ◮ Replication degree = 7 ◮ 5M queries

  54. Churn resilience [Figure: objects reachable (%) versus churn rate (churn=1, churn=10, churn=100) for Rollerchain, Neighbour, and Multi-Pub]

  55. Replication costs [Figure: objects moved per node versus churn rate (churn=1, churn=10, churn=100) for Rollerchain, Neighbour, and Multi-Pub]

  56. Load Balancing [Figure: standard deviation (STDEV) of the number of queries processed for Rollerchain, Neighbour, and Multi-Pub]
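
For reference, a load-balancing metric of this kind could be computed as follows. This is a sketch under assumptions: the slides do not say whether the population or the sample standard deviation was used (the population form is chosen here), and `query_load_stdev` is a hypothetical helper rather than part of the evaluation code.

```python
from collections import Counter
import statistics


def query_load_stdev(nodes, handled_by):
    """Standard deviation of per-node query counts.

    nodes: every node in the system, including those that served no query.
    handled_by: one entry per query, naming the node that processed it.
    """
    counts = Counter(handled_by)
    # Counter returns 0 for idle nodes, so they still count toward the spread.
    return statistics.pstdev([counts[n] for n in nodes])


skewed = query_load_stdev(["a", "b"], ["a"] * 20)
balanced = query_load_stdev(["a", "b"], ["a", "b"] * 10)
```

A lower value means queries are spread more evenly: here `skewed` is 10.0 and `balanced` is 0.0.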

  57. Outline Introduction Our approach Evaluation Conclusions

  58. Conclusions ◮ DHT based on Virtual Nodes ◮ Designed with replication in mind ◮ Unstructured Networks: Increase churn resilience ◮ Variable replication degree: Minimize replication costs ◮ Dynamic key distribution: Improve load balancing

  60. Thank you
