Performance of UDP-based Byzantine Fault Tolerant Consensus (Final Talk)



SLIDE 1

Chair of Network Architectures and Services Department of Informatics Technical University of Munich

Performance of UDP-based Byzantine Fault Tolerant Consensus

Final talk for the Bachelor’s Thesis by

Lucas Mair

advised by Richard von Seck, M. Sc., and Johannes Schleger, M. Sc.

Wednesday 8th July, 2020

SLIDE 2

Structure

  • 1. Introduction
      1.1 State Machine Replication
      1.2 Byzantine Fault Tolerance
      1.3 HotStuff
  • 2. Research Questions
  • 3. Related Work
  • 4. Problem Analysis
  • 5. Approach and Implementation
      5.1 Library Structure
      5.2 Implementation Approach
      5.3 Conduction of Measurements
  • 6. Experiment Evaluation
  • 7. Conclusion

Lucas Mair — BFT Consensus 2

SLIDE 3

Introduction

State Machine Replication (SMR)

State machine replication provides a fault-tolerant service for processing client requests.

Figure 1: Example system using SMR


SLIDE 4

Introduction

State Machine Replication (SMR)

  • The input order of multiple client requests matters
  • Each replica generates a separate output
  • The system has to agree on a single output
  • The system functions correctly as long as the number of faulty replicas stays below a threshold:
      • Fault tolerance for crash failures: at least 2f + 1 replicas for f faulty ones (majority decision)
      • Fault tolerance for Byzantine failures: at least 3f + 1 replicas for f faulty ones


SLIDE 5

Introduction

Byzantine Fault Tolerance

  • Arbitrary behaviour (e.g. sending wrong output) of faulty replicas instead of crashing
  • Decision has to be reached with N − f results (for N replicas)
  • Those results could include f faulty ones ⇒ at least f + 1 correct results are needed as well:

N − f ≥ f + f + 1 ⇒ N ≥ 3f + 1

  • Consensus protocols facilitate the agreement process using those results
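The two thresholds can be condensed into a short sketch (illustrative Python helpers, not code from the thesis or from libhotstuff):

```python
# Replica counts needed to tolerate f faulty replicas, per the slides above.

def min_replicas_crash(f: int) -> int:
    """Crash faults: a majority must survive, so N >= 2f + 1."""
    return 2 * f + 1

def min_replicas_byzantine(f: int) -> int:
    """Byzantine faults: from N - f >= 2f + 1 it follows that N >= 3f + 1."""
    return 3 * f + 1

def byzantine_quorum(n: int) -> int:
    """Results a replica waits for before deciding: N - f, with f = (N - 1) // 3."""
    f = (n - 1) // 3
    return n - f
```

For the configurations measured later, this gives quorums of 3 votes for n = 4 replicas and 5 votes for n = 7 replicas.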


SLIDE 6

Introduction

HotStuff

The thesis focuses on the BFT consensus protocol "HotStuff" [8], which improves upon the complexity bounds of previous algorithms such as PBFT.

  • Improves the footprint of authentication messages per consensus round from O(n³) in PBFT to O(n²) in the case of O(n) consecutive leader failures

  • Proposed as feasible for large-scale applications (100 or more replicas)
  • Example implementation in C++ freely accessible on GitHub [7]
  • Uses TCP and TLS for network communication
  • Communication model assumes reliable point-to-point connections


SLIDE 7

Research Questions

SLIDE 8

Research Questions

Optimizing HotStuff using UDP

Central question of the thesis: Is TCP necessary for HotStuff's network communication?
⇒ Potential for optimization by using UDP instead.
⇒ The analysis, optimization, and benchmarks are based on the example implementation.

  • Which TCP features are actually used or required by HotStuff?
  • Will a UDP-based implementation speed up the consensus process even if TCP features such as retransmissions are omitted?
  • Quantify the tradeoff between the UDP speedup and failed consensus rounds due to a potentially higher transmission error rate.
  • Compare measurements to the benchmarks of the HotStuff authors and analyze the results.


SLIDE 9

Related Work

SLIDE 10

Related Work

Previous Analysis of UDP and Other Optimization Approaches

Previous UDP-based protocol PBFT [3]:

  • Criticized by Chondros et al. [4] regarding packet loss
  • Problematic behaviour of UDP-based client ⇔ replica communication

Comparison of a TCP and UDP implementation by Aublin et al. (RBFT [1]):

  • Similar performance, lower latency when using UDP
  • Identification of cryptographic operations as actual bottleneck

Optimizations based on:

  • Reduction of cryptographic operations and speculative execution (Zyzzyva [5])
  • Usability and versatility, including dynamic reconfiguration (BFT-SMaRt [2])
  • Parallel consensus rounds with multiple leaders (Mir-BFT [6])


SLIDE 11

Problem Analysis

SLIDE 12

Problem Analysis

Usage of TCP Features in HotStuff

Current state of HotStuff, using TCP-based message transmission:

  • Acknowledgements cause overhead.
  • Ordered data transfer is not necessary.
  • Flow control and congestion control are largely redundant:
    ⇒ Consensus rounds automatically slow down while waiting for messages.
    ⇒ The message transmission frequency drops on its own (no bandwidth or payload changes).
  • The TCP handshake signals connection establishment, but causes overhead.
  • Client ⇔ replica communication must stay TCP-based (command order & reliability).
  • Retransmission after message loss is important.


SLIDE 13

Problem Analysis

Resulting Changes from Using UDP

Effects of UDP-based message transmission in HotStuff:

  • Socket startup on the receiving side has to be checked separately:
    ⇒ ICMP Destination Unreachable can serve as an indicator.
  • Message arrival can only be inferred from successful progress of the consensus round.
  • Possible consensus failure due to lost messages without retransmission.
  • DTLS requires differentiating server and client roles.
  • Custom retransmission may be needed.
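The ICMP indicator can be sketched with a connected UDP socket. This is an illustrative Python helper, not libhotstuff code, and it assumes Linux behaviour: an ICMP Port Unreachable answer on a connected UDP socket surfaces as ConnectionRefusedError on a subsequent send/recv.

```python
import socket

def probe_peer(addr, payload=b"ping", timeout=0.5):
    """Best-effort check whether a peer's UDP socket has started.

    probe_peer, its payload, and its timeout are hypothetical; the point is
    only how the ICMP Destination Unreachable error becomes visible.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.settimeout(timeout)
    try:
        s.connect(addr)      # no handshake: connect() only pins the peer address
        s.send(payload)
        try:
            s.recv(1)        # a further syscall is needed for the queued ICMP error to surface
        except socket.timeout:
            pass             # silence is not an error: the peer may simply not answer probes
        return True
    except ConnectionRefusedError:
        return False         # ICMP Port Unreachable: receiving socket not up yet
    finally:
        s.close()
```

Note the asymmetry: a False result is a reliable "not started", while True only means no error was observed within the timeout.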


SLIDE 14

Problem Analysis

Discussion on Retransmission

Possible model for custom retransmission:

  • Estimate an optimal retransmission timer based on the Round Trip Time (RTT).
    ⇒ Send the message again if the consensus round has made no progress when the timer expires.

Problems:

  • Only the leader can properly estimate the RTT with the current message types.
  • Unnecessary retransmissions are possible.

Solutions:

  • Estimate and update the RTT timer for replicas using a ping mechanism.
  • Implement explicit acknowledgement messages.
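One way to realize the proposed timer is to borrow TCP's smoothed-RTT estimator (RFC 6298). The class below is a sketch under that assumption, not code from the thesis; constants follow the RFC (alpha = 1/8, beta = 1/4, minimum granularity 10 ms).

```python
class RetransmitTimer:
    """RTO estimator in the style of TCP's SRTT/RTTVAR (RFC 6298).

    Each completed round (or ping) feeds one RTT sample; a message is resent
    if the round makes no progress within the current `rto` (seconds).
    """

    def __init__(self, initial_rto=1.0):
        self.srtt = None          # smoothed RTT
        self.rttvar = None        # RTT variance estimate
        self.rto = initial_rto    # conservative default before the first sample

    def sample(self, rtt):
        if self.srtt is None:
            # first measurement initializes both estimators
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            self.rttvar = 0.75 * self.rttvar + 0.25 * abs(self.srtt - rtt)
            self.srtt = 0.875 * self.srtt + 0.125 * rtt
        self.rto = self.srtt + max(4 * self.rttvar, 0.010)
        return self.rto
```

Because the variance term shrinks as samples agree, the timer tightens on a stable testbed link and widens again when latency fluctuates, which limits the unnecessary retransmissions noted above.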


SLIDE 15

Approach and Implementation

SLIDE 16

Approach and Implementation

The libhotstuff library structure

[Diagram: replicas receive commands and send output via the ClientNetwork and communicate with each other via the PeerNetwork. Components: libhotstuff (replica implementation, with the ClientNetwork, PeerNetwork, and HotStuff classes), which uses salticidae (network implementation) and secp256k1 (signature implementation), plus hotstuff_client and hotstuff_app.]

Figure 2: Diagram showing the library structure of libhotstuff and the communication between client and replicas.


SLIDE 17

Approach and Implementation

Implementation Approach

  • Create a single UDP socket for sending and receiving messages.
  • Add the UDP implementation via additional functions (e.g. _recv_data → _recv_data_udp).
  • Call the new functions only for replica ⇔ replica communication, not for client messages:
    ⇒ Modify the PeerNetwork class and check the newly introduced enable_tls option.
  • Implement DTLS using OpenSSL and assign deterministic server and client roles.
  • No retransmission implemented due to time constraints.
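The shape of the datagram variants (e.g. _recv_data → _recv_data_udp) can be illustrated as follows. The real helpers live in salticidae and are C++; the names, signatures, and MAX_DGRAM constant below are hypothetical stand-ins.

```python
import socket

MAX_DGRAM = 65507  # largest UDP payload over IPv4

def make_peer_socket(bind_addr):
    """One socket per replica, used for both sending and receiving."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(bind_addr)
    return sock

def send_data_udp(sock, msg, peer_addr):
    # UDP preserves message boundaries, so the stream-style length-prefix
    # framing of the TCP path is not needed here
    sock.sendto(msg, peer_addr)

def recv_data_udp(sock):
    """Return one whole message and the sender's address."""
    return sock.recvfrom(MAX_DGRAM)
```

The single-socket design also explains why a separate startup check (previous slides) is needed: there is no accept() event to signal that the peer is listening.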


SLIDE 18

Approach and Implementation

Conduction of Measurements

  • Measurements on physical machines (testbed) with 4 and 7 replicas (tolerating 1 and 2 faults, respectively)
  • Benchmarks modeled on the original paper for comparison purposes: measuring throughput (Kops/sec) vs. latency (ms)
  • Different batch sizes (100, 400, 800) with 0/0 payload (request/reply)
  • Different payload sizes (0/0, 128/128, 1024/1024 bytes) with batch size 400
  • Robustness benchmark using the average time difference between a successful consensus round and a consensus round with a single leader failure
  • Benchmark with 1% packet loss


SLIDE 19

Experiment Evaluation

SLIDE 20

Experiment Evaluation

Batchsize Experiments

Figure 3: TCP (a,b) vs. UDP (c,d) results for a batchsize of 100. Panels (a)/(c): throughput (Kops/sec) over time; panels (b)/(d): latency (ms) vs. throughput.


  • Throughput: +23%
  • Latency: -17%
SLIDE 21

Experiment Evaluation

Batchsize Experiments

batchsize:                  100          400          800
TCP   throughput:           26 Kops/s    106 Kops/s   146 Kops/s
      latency:              19.1 ms      18.8 ms      27.4 ms
UDP   throughput:           32 Kops/s    120 Kops/s   141 Kops/s
      latency:              15.7 ms      16.5 ms      28.4 ms
UDP throughput difference:  +23%         +13%         −3%

Table 1: Average throughput and latency for varying batchsizes.


SLIDE 22

Experiment Evaluation

Payload Experiments

Figure 4: TCP (a,b) vs. UDP (c,d) results for a payload size of 128/128 bytes and a batchsize of 400. Panels (a)/(c): throughput (Kops/sec) over time; panels (b)/(d): latency (ms) vs. throughput.


  • Throughput: +2%
  • Latency: -3%
SLIDE 23

Experiment Evaluation

Payload Experiments

payload size:               0/0 bytes    128/128 bytes   1024/1024 bytes
TCP   throughput:           106 Kops/s   99 Kops/s       35 Kops/s
      latency:              18.8 ms      20.4 ms         57.4 ms
UDP   throughput:           120 Kops/s   101 Kops/s      35 Kops/s
      latency:              16.5 ms      19.9 ms         57.3 ms
UDP throughput difference:  +13%         +2%             0%

Table 2: Average throughput and latency for varying client request/response payload sizes with a batchsize of 400.


SLIDE 24

Experiment Evaluation

Scaling Experiments

Figure 5: TCP (a,b) and UDP (c,d) results for batchsize = 400 with n = 7 replicas and a payload size of 0/0. Panels (a)/(c): throughput (Kops/sec) over time; panels (b)/(d): latency (ms) vs. throughput.


  • Throughput: +6%
  • Latency: -7%
SLIDE 25

Experiment Evaluation

Scaling Experiments

number of replicas:         n = 4        n = 7       relative change
TCP   throughput:           106 Kops/s   79 Kops/s   −26%
      latency:              18.8 ms      25.6 ms     +36%
UDP   throughput:           120 Kops/s   84 Kops/s   −30%
      latency:              16.5 ms      23.8 ms     +44%
UDP throughput difference:  +13%         +6%

Table 3: Average throughput and latency for differing replica counts (batchsize 400, payload 0/0).


SLIDE 26

Experiment Evaluation

Loss Experiment

Figure 6: UDP throughput (Kops/sec) over time for a batchsize of 1 with a loss rate of 1%.

Robustness benchmark: average processing overhead of a failed round = 4.1 ms (averaged over 250 samples, measured with a 3-second timeout).


  • Very low throughput.
  • Successive failed consensus rounds.
  • Reduction of throughput over time.

⇒ Not usable for Internet deployment in its current state.

SLIDE 27

Conclusion

SLIDE 28

Conclusion

Project Conclusion

  • The impact of using UDP is higher for low batchsizes, low payload sizes, and small numbers of replicas.
  • At higher values (batchsize = 800 or payload = 1024/1024) TCP has superior performance and equal latency.
  • Batchsize affects the time required for serialization and cryptographic operations.
  • Payload size affects the time required for serialization and hashing.


SLIDE 29

Conclusion

Project Conclusion

  • Cryptographic operations are still a limiting factor for scaling.
    ⇒ Threshold signatures are not implemented yet.
  • No direct comparison to the benchmarks in the HotStuff paper is possible.
    ⇒ Missing parameter information.
  • A retransmission mechanism is required for environments with possible packet loss.


SLIDE 30

Questions?

SLIDE 31

Backup Slides

SLIDE 32

Backup Slides

DTLS Server and Client Assignment

Replica s0 — 1: client, 2: server, 3: client
Replica s1 — 0: server, 2: client, 3: server
Replica s2 — 0: client, 1: server, 3: client
Replica s3 — 0: server, 1: client, 2: server

Figure 7: Replica sockets determine their role based on their own index and the order of replica entries. Each client initiates the DTLS handshake with the corresponding server.

The formula is: (index + created sockets) mod 2 = 0 → client, 1 → server
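The assignment rule can be checked with a short sketch. dtls_roles is an illustrative helper reconstructing the rule from Figure 7, not code from the thesis.

```python
def dtls_roles(own_index, peer_indices):
    """DTLS role per peer socket.

    Sockets are created in the order the peers appear in the replica list;
    the role is (own index + sockets created so far) mod 2, with
    0 -> client and 1 -> server.
    """
    roles = {}
    for created, peer in enumerate(peer_indices):
        roles[peer] = "client" if (own_index + created) % 2 == 0 else "server"
    return roles
```

With replicas listed in ascending index order, every pair of replicas ends up with complementary roles, so exactly one side of each connection initiates the DTLS handshake.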


SLIDE 33

Backup Slides

Original HotStuff Benchmark Plot

Figure 8: A typical benchmark plot from the HotStuff paper [8].


SLIDE 34

Backup Slides

Phases of HotStuff

[Diagram: one consensus round passes through the Prepare, Pre-Commit, Commit, and Decide phases. The leader sends prepare, pre-commit, commit, and decide messages; replicas N1–N4 answer with new-view, prepare-vote, pre-commit-vote, and commit-vote messages. A Next-View interrupt after the timeout period triggers New-View (v+1) at the next leader.]

Figure 9: Diagram showing the phases HotStuff goes through during one consensus round.


SLIDE 35

Bibliography

[1] P. Aublin, S. B. Mokhtar, and V. Quéma. RBFT: Redundant Byzantine Fault Tolerance. In 2013 IEEE 33rd International Conference on Distributed Computing Systems, pages 297–306, 2013.

[2] A. Bessani, J. Sousa, and E. E. P. Alchieri. State Machine Replication for the Masses with BFT-SMaRt. In 2014 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, pages 355–362, 2014.

[3] M. Castro and B. Liskov. Practical Byzantine Fault Tolerance. In Proceedings of the Third Symposium on Operating Systems Design and Implementation, OSDI '99, pages 173–186. USENIX Association, 1999.

[4] N. Chondros, K. Kokordelis, and M. Roussopoulos. On the Practicality of Practical Byzantine Fault Tolerance. In P. Narasimhan and P. Triantafillou, editors, Middleware 2012, pages 436–455. Springer, Berlin, Heidelberg, 2012.

[5] R. Kotla, L. Alvisi, M. Dahlin, A. Clement, and E. Wong. Zyzzyva: Speculative Byzantine Fault Tolerance. In Proceedings of Twenty-First ACM SIGOPS Symposium on Operating Systems Principles, SOSP '07, pages 45–58. Association for Computing Machinery, 2007.

[6] C. Stathakopoulou, T. David, and M. Vukolić. Mir-BFT: High-Throughput BFT for Blockchains, 2019.

[7] M. Yin. libhotstuff: General-purpose BFT state machine replication library, 2019. GitHub repository: https://github.com/hot-stuff/libhotstuff.


SLIDE 36

Bibliography

[8] M. Yin, D. Malkhi, M. K. Reiter, G. G. Gueta, and I. Abraham. HotStuff: BFT Consensus in the Lens of Blockchain. arXiv preprint arXiv:1803.05069, v6, July 2019.
