Evaluating BFT Protocols for Spire Henry Schuh & Sam Beckley 600.667 Advanced Distributed Systems & Networks

• SCADA & Spire Overview
• High-Performance, Scalable Spire
• Trusted Platform Module
• Known Network Characteristics
• Evaluating BFT-SMART
• Benchmarking Results
• Conclusions

Power Grid Overview

SCADA Overview

SCADA Requirements
• Must have very low latencies (100-200ms)
• Must have very high reliability
• Must be able to run for decades

SCADA Adopting IP & Internet
• In the past, SCADA used proprietary protocols on air-gapped systems
• Now moving to both IP & the Internet to reduce costs

“These devices were not only internet facing, they did not have security mechanisms to prevent unauthorized access” - Trend Micro Incorporated, Who’s Really Attacking Your ICS Systems

Attacks on SCADA Systems
• 28 days: 39 attacks
• All targeted specifically at SCADA systems
• The first attack was within 18 hours of the honeypot going live
Source: Trend Micro Incorporated, Who’s Really Attacking Your ICS Systems

Distributed Replication
• Several machines that coordinate their actions such that they appear to be a single unified machine to a client
• Pros: High Availability and Performance
• Cons: Cost of Synchronization

Intrusion Tolerant Replication
• Somewhat Formally: The ability to make progress in the presence of some number of malicious replicas with guaranteed correctness. Some protocols also guarantee a level of performance under attack.
• Informally: If some of the replicas get hacked, the system still works.

Defense Across Space & Time
• Defense Across Time: Have to periodically regain control of a compromised machine to stop the attacker from eventually gaining control of the entire network.
• Defense Across Space: Every replica must present a unique attack surface so that one attack cannot be used to compromise every replica.

Spire
Open source SCADA system that provides both standard cryptographic defense mechanisms and an intrusion-tolerant SCADA Master.
Spire uses several different technologies:
• Prime
• Spines
• PVBrowser

Spire
[Architecture diagram. Recoverable labels: four SCADA Master replicas, each with a Prime instance; Internal Spines Network; External Spines Network; pvbrowser HMI; two Proxies in front of the RTU / PLC devices]

Scaling Spire
• In order to tolerate more intrusions, we need more replicas
• The more replicas, the higher the latency becomes
• We rely on having very low latency
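The link between replica count and tolerated intrusions can be made concrete. The sketch below assumes the standard bound for Byzantine fault-tolerant replication, n ≥ 3f + 1; the slides do not state which bound Spire uses, and the function names are illustrative only.

```python
def max_faults(n: int) -> int:
    """Largest f satisfying n >= 3f + 1 (assumed standard BFT bound)."""
    if n < 1:
        raise ValueError("need at least one replica")
    return (n - 1) // 3


def min_replicas(f: int) -> int:
    """Smallest n that tolerates f Byzantine replicas under the same bound."""
    if f < 0:
        raise ValueError("f must be non-negative")
    return 3 * f + 1


if __name__ == "__main__":
    for n in (4, 7, 10, 12):
        print(f"n = {n:2d} tolerates up to f = {max_faults(n)}")
```

Under this assumption, the n = 12 deployment used in the benchmarks tolerates f = 3, and each additional tolerated intrusion costs three more replicas.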

Our Mission
Find a way to make Spire more scalable, allowing more replicas and thus tolerating more intrusions

3 Angles of Attack
• Trusted Hardware - using a TPM
• Taking Advantage of Known Network Characteristics
• Hierarchy of Protocols

Trusted Platform Module
• Specialized chip that holds a secret key and can perform cryptographic functions for the rest of the machine
• The key never leaves the TPM
• Too slow :’(

Leverage Network Characteristics
SCADA deployments are static and predictable. Most importantly, we know:
• Geographically close - low latency communication
• Consistent number of clients and messaging pattern

The Three BFT Protocol Families
• PBFT
• Spinning
• Prime

PBFT
• When the leader fails we must perform a “view change”
• This is by far the most expensive operation in PBFT
• “[The view change] is the Achilles Heel” - Yair Amir
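To see why view changes matter, recall that in PBFT the leader of view v is replica v mod n, so each faulty leader in the rotation forces one more view change. The following is a minimal sketch of that rotation only, not of the view-change protocol itself; the helper names and the example set of faulty replicas are assumptions for illustration.

```python
def primary(view: int, n: int) -> int:
    """Replica that leads a given view under round-robin rotation."""
    return view % n


def view_changes_needed(start_view: int, n: int, faulty: set) -> int:
    """Count view changes until a non-faulty replica becomes leader."""
    if len(faulty & set(range(n))) >= n:
        raise ValueError("at least one replica must be correct")
    changes = 0
    view = start_view
    while primary(view, n) in faulty:
        view += 1      # a view change installs the next replica as leader
        changes += 1
    return changes


if __name__ == "__main__":
    # Hypothetical worst case: the faulty replicas are consecutive in the rotation.
    print(view_changes_needed(0, 12, {0, 1, 2}))
```

With n = 12 and three consecutive faulty leaders, three view changes occur back to back before a correct leader takes over.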

Spinning
• Every ordering is done by a different leader
• A bad leader can delay exactly one ordering before it is evicted from the protocol
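The rotation idea can be sketched as follows. This is a simplified illustration that assumes the leader for each ordering is chosen round-robin by sequence number and that evicted replicas are tracked in a blacklist; the function name and the blacklist representation are hypothetical and omit most of the actual protocol.

```python
def spinning_leader(seq: int, n: int, blacklist: frozenset = frozenset()) -> int:
    """Pick the leader for ordering number `seq`, skipping evicted replicas."""
    candidate = seq % n
    for _ in range(n):
        if candidate not in blacklist:
            return candidate
        candidate = (candidate + 1) % n   # skip an evicted replica
    raise RuntimeError("every replica is blacklisted")


if __name__ == "__main__":
    n = 4
    print([spinning_leader(s, n) for s in range(6)])
    # After replica 1 delays one ordering it is evicted and skipped from then on.
    print([spinning_leader(s, n, frozenset({1})) for s in range(6)])
```

Because leadership moves on after every ordering, a slow leader affects only the single ordering it was responsible for.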

Prime
• Designed to remove load from the leader to allow for many clients without performance degradation
• Performs one ordering every X milliseconds

BFT-SMART
• Implements the “Yet Another Visit to Paxos” protocol (IBM Zurich) in Java
• Modular, multi-threaded server replicas
• Standard BFT message pattern
• Modern protocol with ongoing development

Multithreaded Design
[Diagram: client and server thread layout. Recoverable labels: Client (Request, Reply); Server (Request, Reply, Service); Request Timer Thread; Replica Thread; Message Processor Thread; Leader; Sender Threads 1 … n-1; Receiver Threads 1 … n-1; layers labeled Server, Consensus, Communication]

BFT-SMART and Performance Attacks
• Consensus relies on leader to order messages
• A malicious leader could delay progress
• Timeouts limit the leader’s worst-case performance
[Diagram: message pattern between the Client, Replica 0 (leader / primary), and Replicas 1-3, showing a malicious delay before the Propose (Pre-Prepare) message]

Simulating a SCADA Network
• 3 replicas per site
• n = 12, f = 3
[Map: four sites (NYC, JHU, SVG, WAS); link latencies shown: 4ms, 4ms, 3ms, 3ms, 2ms, 2ms]

Normal-Case Latency
[Plot: Mean Latency vs. Number of Clients. x-axis: Number of Clients (0-100); y-axis: Mean Latency (ms) (0-45); series: BFT-SMART, Prime]

Normal-Case Latency
• Significantly lower with BFT-SMART, but increasing with number of clients
• Matches expectations given fewer consensus rounds
• Constant with Prime, due to batch ordering on a preset interval of 20ms
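One way to see why interval-based batching gives client-independent latency: a request simply waits for the next ordering tick, however many other requests arrive in the same interval. The sketch below assumes orderings fire at fixed multiples of the 20ms interval and ignores network and protocol time; the function name is illustrative.

```python
import math


def wait_until_next_ordering(arrival_ms: float, interval_ms: float = 20.0) -> float:
    """Time a request waits before the next fixed-interval ordering begins."""
    next_tick = math.ceil(arrival_ms / interval_ms) * interval_ms
    return next_tick - arrival_ms


if __name__ == "__main__":
    arrivals = range(0, 20)  # one request per millisecond within a single interval
    waits = [wait_until_next_ordering(t) for t in arrivals]
    print(max(waits), sum(waits) / len(waits))
```

The wait depends only on where in the interval a request arrives, never on the number of clients, which is consistent with the flat Prime curve.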

Performance Attack Latency
• Tested 4 timeouts, chosen based on normal performance
1. 8ms (aggressive)
2. 10ms (conservative)
3. 16ms (aggressive, forwarding request at 8ms)
4. 20ms (conservative, forwarding request at 10ms)

Performance Attack Latency
• Developed a malicious replica to delay sending pre-prepare messages as leader
• Experimentally maximized delay up to each view change timeout
• Measured worst-case latency seen by client under this condition

Performance Attack Latency
[Plot: Measured Latency vs. Timeout. x-axis: Pre-Prepare Timeout (ms) (5-23); y-axis: Mean Worst-Case Latency (ms) (0-35); series: Worst-Case Latency, Normal Latency]

Performance Attack Latency
• With a tight timeout, performance degradation is minimal
• With a conservative timeout, performance degradation approaches 50% (26ms latency)
• In either case, latency is lower than normal-case Prime and well within the required performance
• This performance attack would not pose a risk to the SCADA system

View Change
• 50-70ms depending on number of pending requests
• Slow due to unoptimized serialization, data structures, taking up to 40ms
• Sequential view changes are an issue with multiple faulty replicas
• With f ≥ 3, view change must be improved to meet the 200ms requirement
• Prime view changes are on the order of 60-90ms
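The f ≥ 3 concern follows from simple arithmetic on the measured 50-70ms range. The sketch below assumes the worst case is one view change per faulty leader, performed back to back, and ignores the timeout that triggers each one; the names are illustrative.

```python
REQUIREMENT_MS = 200          # upper end of the SCADA latency requirement
PER_CHANGE_MS = (50, 70)      # measured BFT-SMART view change range


def sequential_view_change_ms(faulty_leaders: int, per_change_ms=PER_CHANGE_MS):
    """Best and worst total time spent in back-to-back view changes."""
    low, high = per_change_ms
    return faulty_leaders * low, faulty_leaders * high


if __name__ == "__main__":
    for f in (1, 2, 3):
        low, high = sequential_view_change_ms(f)
        verdict = "within" if high <= REQUIREMENT_MS else "may exceed"
        print(f"f = {f}: {low}-{high}ms, {verdict} the {REQUIREMENT_MS}ms budget")
```

With three sequential view changes the total is 150-210ms, so the upper end already exceeds 200ms before any timeout or ordering latency is counted.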

Scalability Overhead
[Plot: LAN Latency vs. Number of Replicas. x-axis: Number of replicas (n) (0-25); y-axis: Latency (µs) (0-600)]

Scalability Overhead
• Shows the computational overhead of increasing n
• Latency appears linear with n, and grows at a reasonable rate
• Actual latency determined by location of added replicas
• Another geographic site vs. more replicas per site

BFT-SMART: Pros & Cons
PROS
• Lightweight protocol & implementation
• Possible to apply aggressive timeout
• Low normal-case latency
• Support for dynamic state transfer, reconfiguration/recovery
CONS
• Latency increases with number of clients, concurrent requests
• High view change cost
• Java implementation

Prime: Pros & Cons
PROS
• Leader is not burdened by client requests
• Bounded performance guarantee under attack
• Latency remains constant as number of clients increases
• Measurements performed so replicas can adapt to network conditions
CONS
• 2 more consensus rounds per ordering
• High view change cost
• Significantly higher normal-case latency

Conclusions
• Strict limit on performance attacks possible with a lightweight protocol and bounded network latencies
• View change still a high cost, but could be optimized
• A viable path to scaling Spire
• However, BFT-SMART introduces some new issues
