
Tolerating Latency in Replicated State Machines through Client Speculation


  1. Tolerating Latency in Replicated State Machines through Client Speculation
     April 22, 2009
     Benjamin Wester (1), James Cowling (2), Edmund B. Nightingale (3), Peter M. Chen (1), Jason Flinn (1), Barbara Liskov (2)
     (1) University of Michigan, (2) MIT CSAIL, (3) Microsoft Research

  2. Simple Service Configuration
     [Diagram: a single service; labels: x=1, ++x, 1]
     NSDI'09, Benjamin Wester, University of Michigan CSE

  3. Replicated State Machines (RSM)
     [Diagram: each replica applies ++x, moving from x=1 to x=2 and replying 2]
     • Agree on request
     • All non-faulty replies are identical
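The replication idea on this slide can be sketched in a few lines. This is a minimal illustration, not the authors' code: `CounterReplica` and `accept_reply` are hypothetical names, and the rule of accepting a value once f + 1 replies match is an assumption borrowed from the usual Byzantine setting.

```python
from collections import Counter


class CounterReplica:
    """Deterministic state machine holding one integer, as on the slide."""

    def __init__(self, x=1):
        self.x = x

    def apply(self, request):
        # Every non-faulty replica runs the same deterministic transition.
        if request != "++x":
            raise ValueError(f"unknown request: {request!r}")
        self.x += 1
        return self.x


def accept_reply(replies, f):
    """Return the value reported by at least f + 1 replicas, else None.

    The f + 1 threshold is an assumption for illustration: with at most
    f faulty replicas, f + 1 matching replies include a non-faulty one.
    """
    if not replies:
        return None
    value, count = Counter(replies).most_common(1)[0]
    return value if count >= f + 1 else None


replicas = [CounterReplica(x=1) for _ in range(4)]
replies = [replica.apply("++x") for replica in replicas]
replies_with_fault = replies[:3] + [99]  # pretend the fourth replica lies
```

Because each replica starts from the same state and applies the same request, the honest replies agree, and a single wrong reply cannot reach the threshold on its own.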

  4. RSMs have high latency
     1. Need many replies
     2. Agreement
     3. Geographic distribution

  5. Hide the Latency
     • Use speculative execution inside RSM
     • Speculate before consensus is reached
       – Without faults, any reply predicts consensus value
       – Let client continue after receiving one reply
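A rough way to see why continuing after one reply helps: the client's wait is set by the k-th fastest reply, where k is the number of replies it needs. The sketch below is a simplified model with made-up round-trip times; `reply_latency` is a hypothetical helper, and the model ignores the agreement rounds that replicas run among themselves.

```python
def reply_latency(round_trips_ms, replies_needed):
    """Time until the client holds `replies_needed` replies.

    Assumes each replica answers independently after its own round trip,
    so the wait equals the k-th smallest round-trip time.
    """
    if not 1 <= replies_needed <= len(round_trips_ms):
        raise ValueError("replies_needed must be between 1 and the replica count")
    return sorted(round_trips_ms)[replies_needed - 1]


# Hypothetical round trips: one nearby replica, three distant ones.
round_trips_ms = [2.5, 15.0, 15.0, 15.0]

speculative_wait = reply_latency(round_trips_ms, 1)  # continue on first reply
blocking_wait = reply_latency(round_trips_ms, 3)     # wait for three replies
```

Under these assumed numbers the speculating client resumes once the nearby replica answers, while the blocking client waits on the distant replicas.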

  6. Overview
     • Introduction
     • Improving RSMs with speculation
     • Application to PBFT
     • Performance
     • Conclusion

  7. Speculative Execution in RSM
     [Diagram: client timeline. Labels: Take Checkpoint, Predict: 1, Speculate! (instead of Blocked), Commit; replies x=1]
     • Continue processing while waiting
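The checkpoint, predict, and commit steps named on this slide could look roughly like the following. It is a minimal in-memory sketch: `SpeculativeClient` is a hypothetical class, and the rollback branch is an assumption about what happens when the prediction turns out wrong.

```python
import copy


class SpeculativeClient:
    """Keeps working on a predicted reply and undoes the work if it was wrong."""

    def __init__(self, state):
        self.state = state
        self.checkpoint = None
        self.prediction = None

    def speculate(self, predicted_reply):
        # Take a checkpoint before acting on the unconfirmed reply.
        self.checkpoint = copy.deepcopy(self.state)
        self.prediction = predicted_reply

    def resolve(self, consensus_reply):
        """Commit if the prediction matched, otherwise roll back.

        Returns True on commit and False on rollback.
        """
        if self.checkpoint is None:
            raise RuntimeError("no speculation in progress")
        committed = consensus_reply == self.prediction
        if not committed:
            self.state = self.checkpoint  # discard speculative work
        self.checkpoint = None
        self.prediction = None
        return committed


good = SpeculativeClient({"x": None})
good.speculate(predicted_reply=1)
good.state["x"] = 1                  # work done while waiting
good_committed = good.resolve(consensus_reply=1)

bad = SpeculativeClient({"x": None})
bad.speculate(predicted_reply=1)
bad.state["x"] = 1
bad_committed = bad.resolve(consensus_reply=7)
```

The deep copy stands in for whatever checkpoint mechanism the real system uses; it keeps the example self-contained.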

  8. Critical path: first reply
     • Completion latency less relevant
     • First reply latency sets critical path
       – Speed
       – Accuracy
     • Other desirable properties
       – Throughput
       – Stability under contention
       – Smaller number of replicas

  9. Requests while speculative
     Client program:
       while !check_lottery(): submit_tps()
       buy_corvette()
     [Diagram: the client predicts win? = yes and issues the buy request while still speculative]
     What do we do with this?
     1. Hold request
        – Bad performance
     2. Distributed commit/rollback
        – State tracking complex

  10. Resolve speculations on the replicas
      Client program:
        while !check_lottery(): submit_tps()
        buy_corvette()
      [Diagram: the client predicts win? = yes and sends the buy request with that predicate attached; replicas evaluate "if win?=yes: buy"]
      • Explicitly encode dependencies as predicates
      • No special request handling needed
      • Replicas need to log past replies
      • Local decision at replicas matches client
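One way to picture a predicated request is shown below. This is a sketch under assumptions: the `Replica` class, the request ids, and the dictionary-shaped predicate are all hypothetical, and the `won` argument stands in for the outcome the replicas agreed on.

```python
class Replica:
    """Logs its past replies so later requests can be checked against them."""

    def __init__(self):
        self.reply_log = {}   # request id -> reply this replica produced
        self.purchases = []

    def check_lottery(self, request_id, won):
        reply = "yes" if won else "no"
        self.reply_log[request_id] = reply  # remember what was answered
        return reply

    def buy_corvette(self, predicate):
        """Execute only if every reply the client assumed matches the log.

        `predicate` maps a request id to the reply the client speculated on.
        """
        for request_id, assumed_reply in predicate.items():
            if self.reply_log.get(request_id) != assumed_reply:
                return "aborted"
        self.purchases.append("corvette")
        return "bought"


winner = Replica()
winner.check_lottery(request_id=1, won=True)
winner_result = winner.buy_corvette(predicate={1: "yes"})

loser = Replica()
loser.check_lottery(request_id=1, won=False)
loser_result = loser.buy_corvette(predicate={1: "yes"})
```

Each replica reaches the decision locally from its own log, so the client does not need a separate commit or rollback message for the dependent request.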

  11. Overview
      • Introduction
      • Improving RSMs with speculation
      • Application to PBFT
      • Performance
      • Conclusion

  12. Practical BFT-CS [Castro and Liskov 1999]
      [Diagram: protocol message flow; labels: client, primary; f=1]
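For readers unfamiliar with the f=1 label, the standard PBFT sizing rules can be computed as follows. The formulas come from the cited PBFT work rather than from this slide, and `pbft_sizes` is a hypothetical helper name.

```python
def pbft_sizes(f):
    """Group sizes for tolerating f Byzantine replicas in standard PBFT."""
    if f < 0:
        raise ValueError("f must be non-negative")
    return {
        "replicas": 3 * f + 1,      # total replicas required
        "quorum": 2 * f + 1,        # replicas that must agree in each phase
        "matching_replies": f + 1,  # identical replies a client waits for
    }


sizes_for_slide = pbft_sizes(1)  # the slide's configuration, f=1
```

With f=1 this gives four replicas, which matches the small configuration the diagram labels.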

  13. Additional Details
      • Tentative execution
        – PBFT/PBFT-CS complete in 4 phases
      • Read-only optimization
        – Accurate answer from backup replica
      • Failure threshold
        – Bound worst-case failure
      • Correctness

  14. Overview
      • Introduction
      • Improving RSMs with speculation
      • Application to PBFT
      • Performance
      • Conclusion

  15. Benchmarks
      • Shared counter
        – Simple checkpoint
        – No computation
      • NFS: Apache httpd build
        – Complex checkpoint
        – Significant computation

  16. Topology
      1. Primary-local
      2. Primary-remote
      3. Uniform
      [Diagram: network links with delays of 2.5 or 15 ms; the primary is marked]

  17. Base case: no replication
      1. Primary-local
      2. Primary-remote
      3. Uniform
      [Diagram: network links with delays of 2.5 or 15 ms]

  18. Shared Counter, primary-local topology
      [Plot: run time (sec) vs. network delay (ms, 0 to 15); series: PBFT, PBFT-CS, No replication]

  19. Shared Counter, primary-local topology
      [Plot: run time (sec) vs. network delay (ms, 0 to 15); series: PBFT, PBFT-CS, No replication, Zyzzyva [Kotla et al. 07]]

  20. Shared Counter, uniform and primary-remote topology
      [Plot: run time (sec) vs. network delay (ms, 0 to 15); series: PBFT, PBFT-CS, No replication]

  21. Shared Counter, uniform and primary-remote topology
      [Plot: run time (sec) vs. network delay (ms, 0 to 15); series: PBFT, PBFT-CS, No replication, Zyzzyva]

  22. NFS: Apache build, primary-local topology
      [Plot: run time (min) vs. network delay (ms, 0 to 15); series: PBFT, PBFT-CS, No replication]

  23. NFS: Apache build, uniform topology
      [Plot: run time (min) vs. network delay (ms, 0 to 15); series: PBFT, PBFT-CS, No replication]

  24. NFS: Apache build, primary-remote topology
      [Plot: run time (min) vs. network delay (ms, 0 to 15); series: PBFT, PBFT-CS, No replication]

  25. NFS: With Failure, primary-local topology
      [Plot: run time (min) vs. network delay (ms, 0 to 15); series: PBFT, PBFT-CS, No replication, PBFT-CS (1% fail)]

  26. Throughput (Shared Counter), LAN topology
      [Plot: throughput (KOps/sec) vs. number of clients (1 to 100); series: PBFT, PBFT-CS, Zyzzyva]

  27. Conclusion
      • Integrate client speculation within RSMs
      • Predicated requests: performance without complexity
      • Clients less sensitive to latency between replicas
      • 5x speedup over non-speculative protocol
      Makes WAN deployments more practical
