  1. Experiences in Building and Operating ePOST, a Reliable Peer-to-Peer Application
     Alan Mislove †‡, Ansley Post †‡, Andreas Haeberlen †‡, Peter Druschel †
     † Max Planck Institute for Software Systems   ‡ Rice University

  2. Reliable P2P Systems: Myth or Reality?
     • For the past few years there has been much research interest in p2p: highly scalable in nodes and data, utilization of underused resources, robust to a large range of workloads and failures
     • Most deployed systems are not reliable [Kazaa, Skype, etc.]; none attempt to store data reliably, durably, or securely
     • This has led some to conclude that p2p can't support reliable applications
     • Question: Can peer-to-peer systems provide reliable service?

  3. Demonstration Application: ePOST
     • ePOST is an email service built entirely from decentralized components: no 'email servers'
     • Email is one of the most important Internet applications
     • Requirements: privacy, integrity, durability, availability
     • We wanted to develop the system to the point where people rely on it

  4. ePOST: Deployment
     • Built and deployed ePOST within our group; running for over 2 years
     • Processed well over 500,000 email messages
     • Built ePOST to be more reliable than existing email systems
     • 16 users used ePOST as their primary email service (even my advisor!)
     • Many challenges were found by building the system
     • With those challenges solved, ePOST provides reliable service
     • Robust: on numerous occasions ePOST was the only mail service working

  5. Rest of Talk
     • ePOST in detail
     • Challenges faced in building and deploying ePOST
     • Conclusion

  6. ePOST: Architecture
     [Diagram: participating nodes form an overlay; each node runs local IMAP, POP3, and SMTP servers for its user]
     • Each participating node runs mail servers for the local user
     • The email service looks the same to users
     • Data is stored cooperatively on the participating machines
     • Machines form an overlay; data is replicated for redundancy
     • All data is encrypted and signed, preventing others from reading your email (see the sketch below)
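As a minimal Java sketch of the "encrypted and signed" bullet above, assuming standard JCA primitives (AES for the body, an RSA signature by the owner, a SHA-256 content hash as the storage key): the class and field names are hypothetical, and the actual ePOST scheme, especially its key management, may differ.

```java
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.PrivateKey;
import java.security.Signature;
import javax.crypto.Cipher;
import javax.crypto.SecretKey;

// Hypothetical sketch: encrypt an email body, sign the ciphertext with the
// owner's key, and derive a content-hash storage key for the overlay.
// Not the actual ePOST code; names and algorithm choices are illustrative.
public class SecureEmailBlob {
    final byte[] ciphertext;  // AES-encrypted email body
    final byte[] signature;   // owner's RSA signature over the ciphertext
    final byte[] storageKey;  // SHA-256 of the ciphertext, used as the overlay key

    public SecureEmailBlob(byte[] body, SecretKey aesKey, PrivateKey ownerKey)
            throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES");  // only key holders can read
        cipher.init(Cipher.ENCRYPT_MODE, aesKey);
        this.ciphertext = cipher.doFinal(body);

        Signature sig = Signature.getInstance("SHA256withRSA");  // tampering is detectable
        sig.initSign(ownerKey);
        sig.update(ciphertext);
        this.signature = sig.sign();

        this.storageKey = MessageDigest.getInstance("SHA-256").digest(ciphertext);
    }
}
```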

  7. ePOST: Metadata Storage
     [Diagram: a log head pointing to the most recent entry in a chain: Add Email #4 → Add Email #3 → Delete Email #2 → Mark #2 Read → Add Email #2]
     • Folders are represented using logs; entries represent changes
     • All entries are self-authenticating
     • The log head points to the most recent entry
     • The log head is signed by its owner, since it is mutable
     • Only the local node has the key material; all writes are performed by the owner
     • This maps the multi-access problem to a single-writer one (see the sketch below)
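Below is a minimal Java sketch of the log structure just described, with hypothetical class names: each entry names an operation and hashes its predecessor, which is what makes the chain self-authenticating, while the mutable head is re-signed by the owner on every write.

```java
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.PrivateKey;
import java.security.Signature;

// Hypothetical sketch of the folder log: immutable, hash-chained entries
// plus a mutable, owner-signed head. Names are illustrative, not ePOST's.
class LogEntry {
    final String operation;     // e.g. "Add Email #3" or "Mark #2 Read"
    final byte[] previousHash;  // hash of the preceding entry; null for the first entry

    LogEntry(String operation, byte[] previousHash) {
        this.operation = operation;
        this.previousHash = previousHash;
    }

    // Entries are self-authenticating: each is named by the hash of its content
    byte[] hash() throws NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        digest.update(operation.getBytes());
        if (previousHash != null) digest.update(previousHash);
        return digest.digest();
    }
}

class LogHead {
    byte[] newestEntryHash;  // mutable pointer to the most recent entry
    byte[] signature;        // owner's signature; only the owner holds the key

    // Every write goes through the owner, who re-signs the updated head
    void update(LogEntry newest, PrivateKey ownerKey) throws GeneralSecurityException {
        newestEntryHash = newest.hash();
        Signature sig = Signature.getInstance("SHA256withRSA");
        sig.initSign(ownerKey);
        sig.update(newestEntryHash);
        signature = sig.sign();
    }
}
```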

  8. Challenges Faced
     • Network partitions
     • Complex failure modes
     • NATs and firewalls
     • Very unsynchronized clocks
     • Routing anomalies
     • Lost key material
     • Node churn
     • Disconnected nodes
     • Correlated failures
     • Power failures
     • Resource consumption
     • Resource exhaustion
     • Data storage
     • Spam attacks on relays
     • Slow nodes
     • Java eccentricities
     • Hidden single points of failure
     • Congested links
     • Data corruption
     • PlanetLab slice deletion
     • Comatose nodes
     • ...
     (Highlighted in the talk: network partitions, very unsynchronized clocks, routing anomalies, correlated failures, resource consumption.)

  9. Challenge: Network Partitions
     • The overlay originally had no special provisions for network partitions; we did not envision partitions as a significant problem
     • When a network failure occurs, nodes detect the others as dead, and multiple overlays re-form
     • The network usually fails at access links, so generally one large overlay and one small overlay result

  10. How frequent are partitions?
     [Plot: number of partitions (0 to 6) observed over a 90-day period]
     • Partitions occur often in PlanetLab
     • Usually a single subnet (a PlanetLab site) becomes partitioned

  11. Impact of Network Partitions
     • There is a well-known tradeoff between consistency and availability under partitions
     • ePOST resolves this tradeoff in favor of availability
     • Partitions cause consistency problems: nodes in small partitions cannot access some data, and mutable data can diverge
     • Partitions persist unless action is taken

  12. Partitions: Overlay Reintegration
     • To reintegrate the overlay, nodes remember recently deceased nodes, periodically query them, and integrate any missing nodes back into the overlay (see the sketch below)
     • The protocol is periodic, and therefore stable
     • Tested on simulated failures as well as on PlanetLab; the overlay heals as expected
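A minimal Java sketch of the reintegration protocol described above, with placeholder ping/join calls standing in for the overlay's real messaging layer; the class and method names are assumptions for illustration, not FreePastry's API.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: remember recently deceased nodes and re-probe them
// periodically, so that nodes on the far side of a healed partition are
// pulled back into the overlay. ping()/requestOverlayJoin() are placeholders.
class PartitionHealer {
    private final Set<String> recentlyDeceased = ConcurrentHashMap.newKeySet();
    private final ScheduledExecutorService timer =
            Executors.newSingleThreadScheduledExecutor();

    // Called by the failure detector whenever a neighbor is declared dead
    void nodeDeclaredDead(String nodeAddress) {
        recentlyDeceased.add(nodeAddress);
    }

    // Periodic rather than one-shot, which is what makes the protocol stable
    void start(long periodSeconds) {
        timer.scheduleAtFixedRate(this::probeDeceased,
                periodSeconds, periodSeconds, TimeUnit.SECONDS);
    }

    private void probeDeceased() {
        for (String node : recentlyDeceased) {
            if (ping(node)) {               // the "dead" node answers: partition healed?
                requestOverlayJoin(node);   // reintegrate it (and its overlay) into ours
                recentlyDeceased.remove(node);
            }
        }
    }

    private boolean ping(String node) { return false; }  // placeholder transport call
    private void requestOverlayJoin(String node) { }     // placeholder join handshake
}
```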

  13. Partitions: Data Divergence
     [Diagram: a log forked into two branches, e.g. one with Add Email #4 and Mark #4 Read and the other with Delete Folder, both descending from the chain Add Email #3 → Delete Email #2 → Mark #2 Read → Add Email #2]
     • ePOST uses a log-based data structure, so forked logs must be merged
     • Data divergence is unlikely in practice due to the single-writer behavior
     • To repair logs, merge the entries and cancel destructive operations (see the sketch below)
     • This ensures no data is lost
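A minimal sketch of the merge described above, treating entries as plain strings for brevity: the two forked branches are unioned, and a destructive operation that appears on only one side of the fork is replaced by a compensating "cancelled" entry, so nothing referenced by the other branch is lost. The entry representation and the cancellation rule here are simplified assumptions, not ePOST's actual merge logic.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of repairing a forked log: merge both branches and
// cancel destructive operations seen on only one side of the fork, so the
// merge never loses data. Real entries are structured objects, not strings.
class LogMerge {
    static List<String> merge(List<String> branchA, List<String> branchB) {
        // Union of both branches, preserving branch-A order first
        List<String> union = new ArrayList<>(branchA);
        for (String entry : branchB) {
            if (!union.contains(entry)) union.add(entry);
        }

        List<String> repaired = new ArrayList<>();
        for (String entry : union) {
            boolean destructive = entry.startsWith("Delete");
            boolean forked = !(branchA.contains(entry) && branchB.contains(entry));
            if (destructive && forked) {
                // Compensating entry: record the cancellation instead of replaying it
                repaired.add("Cancelled: " + entry);
            } else {
                repaired.add(entry);
            }
        }
        return repaired;
    }
}
```

For instance, merging a branch holding "Add Email #4" and "Mark #4 Read" with a branch holding "Delete Folder" keeps all three entries but records the delete as cancelled, matching the no-data-loss goal on the slide.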
