Securing Passive Replication Through Verification
Bruno Vavala 1,2, Nuno Neves 1, Peter Steenkiste 2
1 University of Lisbon (Portugal), 2 Carnegie Mellon University (U.S.)
IEEE Symposium on Reliable Distributed Systems (SRDS), 2015

Outline
• Motivation and background
• Goals
• Architecture Design & System Operations
• Evaluation
• Takeaways

Fault-Tolerance
• Service continuity has to be ensured in case of failure
• Components have to be replicated (replication)
• Replicas must be coordinated (coordination)
• Arbitrary failures require more replicas and more coordination

Replication
2 main design choices:
• Active Replication (State Machine Replication)
• Passive Replication

Active Replication (AR)
State Machine approach:
1. System receives the requests
2. Requests are ordered ("many" messages)
3. Enough replicas execute them
4. Each replica returns an answer
5. Answers are voted
[Diagram: client C, request ordering protocol, replicas R1 to R4, with the steps above marked]
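The voting step (step 5) can be sketched as follows. This is a minimal illustration, not the voter of any specific AR scheme: it assumes at most f replicas are faulty, so an answer reported by at least f+1 replicas is backed by at least one correct replica. The function name `vote` and the sample replies are hypothetical.

```python
from collections import Counter

def vote(answers, f):
    """Return the answer reported by at least f+1 replicas, or None."""
    if not answers:
        return None
    answer, count = Counter(answers).most_common(1)[0]
    # f+1 matching answers guarantee that one correct replica agrees.
    return answer if count >= f + 1 else None

# Four replicas; the third one is faulty and returns a wrong answer.
replies = ["42", "42", "17", "42"]
result = vote(replies, f=1)
```

With these replies, `result` is the value reported by the three agreeing replicas; if no answer reaches f+1 votes, the function returns `None` and the client would have to wait for more replies.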

Passive Replication (PR)
1. Primary receives the requests
2. Requests are executed
3. State updates are broadcast
4. Backups apply updates and return ACK
5. Primary votes on ACKs
6. Primary replies to client
[Diagram: client C, primary R1, backups R2 to R4, with the steps above marked]
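The six steps can be sketched in a few lines. This is a toy, single-process illustration of plain (crash-tolerant) passive replication, not of V-PR: the key-value "service", the class names, and the majority rule on ACKs are assumptions made for the example.

```python
class Backup:
    def __init__(self):
        self.state = {}

    def apply(self, update):
        """Step 4: apply the state update (no re-execution) and ACK."""
        self.state.update(update)
        return True  # ACK


class Primary:
    def __init__(self, backups):
        self.state = {}
        self.backups = backups

    def handle(self, key, value):
        # Steps 1-2: receive and execute the request (here: a key-value write).
        update = {key: value}
        self.state.update(update)
        # Steps 3-4: broadcast the state update and collect ACKs.
        acks = sum(1 for b in self.backups if b.apply(update))
        # Step 5: vote on ACKs; assumed rule: a majority of all replicas,
        # counting the primary itself.
        n = len(self.backups) + 1
        if acks + 1 <= n // 2:
            raise RuntimeError("not enough ACKs")
        # Step 6: reply to the client.
        return "OK"


backups = [Backup(), Backup()]
primary = Primary(backups)
reply = primary.handle("x", 1)
```

Only the primary executes the request; the backups merely install the resulting update, which is where the O(1) re-computation cost of PR comes from.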

Current BFT Solutions
AR:
• PBFT (OSDI'99): Seminal practical SMR work
• Correia et al. (SRDS'04): Hybrid model with TTCB
• Zyzzyva (SOSP'07): Speculative executions
• Prime (DSN'08): Bounded Delay Guarantee
• MinBFT (TC'11): Fewer replicas in hybrid model
• CheapBFT (Eurosys'12): Hybrid model, activation of passive replicas upon failures
• BFT-SMaRt (DSN'14): High performance
• …and many many others!
PR:
• ∅

Why no PR solutions?
[Diagram: AR: client, replicas R1 to R4, voter, correct answer ✔. PR: replicas R1 to R3, correct answer?]
• AR: enough redundancy to extract the correct answer
• PR challenge: how to verify the result efficiently?
• Trivial inefficient solution: re-execute the service

Pros & Cons (AR vs. PR)
• Byzantine FT: AR ✔, PR ✗
• Replicas: AR 2f+1, PR 2f+1
• Re-Computations: AR O(n), PR O(1)
• Message size: AR |request| + |input|, PR |reply| + |update|
• Non-determinism: AR ✗, PR ✔
“While some consensus algorithms, such as Paxos […] have started to find their way into those systems, their uses are limited mostly to the maintenance of the global configuration information in the system, not for the actual data replication.” (L. Lamport et al.)

Goals
Fault-tolerant & resource-efficient & simple replicated architecture for unmodified services
Challenges
• Protect the service results from malicious failures
• Efficient verification of the results
• Ensure that state updates are correctly propagated
• Ensure that client gets correct and consistent results

V-PR: Verified Passive Replication

TCC Overview
• Trusted Computing Component
  o It performs actual general-purpose computation
  o It provides trusted services (TPM-like)
  o It has internal registers that store the identity (i.e., hash) of running code
• No different assumptions with respect to previous works, just a more powerful TCC!
• Primitives
  o put(data, ID) / get(data, ID). TCC-backed and ID-based secure external storage. Only the same ID can store and retrieve data
  o execute(code, input). TCC-backed isolated execution of arbitrary code. Running code is identified for ID-based operations
  o attest(). TCC signature that could carry information on running code and results
  o create / get / incr_counter(ID, name). Access-controlled trusted counters. Only ID can read or modify them
  o verify(). Check validity of attestation, through manufacturer certificate
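To make the primitives concrete, here is a toy in-memory stand-in written in Python. It is an assumption-laden sketch, not the TrustVisor interface: the class `ToyTCC`, its method signatures, the convention that executed code defines a function named `run`, and the use of an HMAC in place of a TCC signature checked against a manufacturer certificate are all hypothetical, and the code offers no real isolation or security.

```python
import hashlib
import hmac


class ToyTCC:
    """Toy stand-in for the TCC primitives (illustration only, not secure)."""

    def __init__(self, device_key=b"hypothetical-device-key"):
        self._key = device_key  # stands in for the certified TCC key
        self._storage = {}      # ID -> data
        self._counters = {}     # (ID, name) -> int

    @staticmethod
    def identity(code):
        # The identity of running code is the hash of its source text.
        return hashlib.sha256(code.encode()).hexdigest()

    def attest(self, ident, result):
        # HMAC replaces the TCC signature in this sketch.
        msg = f"{ident}|{result!r}".encode()
        return hmac.new(self._key, msg, hashlib.sha256).hexdigest()

    def verify(self, ident, result, tag):
        return hmac.compare_digest(self.attest(ident, result), tag)

    def execute(self, code, value):
        """Run code (which must define run(value)) and attest the result."""
        ident = self.identity(code)
        namespace = {}
        exec(code, namespace)  # no isolation here, unlike a real TCC
        result = namespace["run"](value)
        return result, ident, self.attest(ident, result)

    def put(self, data, ident):
        self._storage[ident] = data

    def get(self, ident):
        # Data is only reachable under the ID that stored it.
        return self._storage.get(ident)

    def create_counter(self, ident, name):
        self._counters[(ident, name)] = 0

    def incr_counter(self, ident, name):
        self._counters[(ident, name)] += 1
        return self._counters[(ident, name)]


tcc = ToyTCC()
code = "def run(x):\n    return x + 1\n"
result, ident, tag = tcc.execute(code, 41)
ok = tcc.verify(ident, result, tag)
```

A verifier that knows the expected code identity can check both that the result is authentic and that it was produced by that code; a tampered result or a different identity fails `verify`.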

Model
• TCC is crash-only; the rest of the system can fail arbitrarily (Byzantine)
• TCC only usable through primitives
• Correct majority of replicas
• Asynchronous model for safety, partially synchronous otherwise (i.e., for liveness)
• Model does not consider:
  o Denial of Service attacks
  o Physical tampering (at least not to the TCC hardware)
  o Service vulnerabilities

V-PR Architecture
[Diagram: client, primary, and backup nodes connected by a network. Components shown: Service Client, Security MW, Service, Update Svc, Manager, U-Manager, OS, TCC. Each replica has a trusted and an untrusted environment.]
• Core components: SMW, Manager, U-Manager
• Update service only applies state updates
• Service Client and Service are not modified
• Important effort to make V-PR service-oblivious
• Dual failure model (crash + Byzantine)
• Two execution environments with different trust assumptions
• Entry point: execute(Manager) to call TCC service

Read Requests
[Diagram: client, primary, and backup; step 1: client request/reply; step 2: execute]
• Client SMW can verify the primary's execution and establish a session key with the Manager
• No state updates => read request
• 2 messages

Write Requests
[Diagram: client, primary, and backup; step 3: state updates/ACKs; step 4: trusted updates; step 6: check ACKs]
• Available state update => write request
• 4 steps (of message passing) overall
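The read/write distinction above depends only on whether executing a request produces a state update. A minimal sketch of that rule, assuming a dictionary as service state and a hypothetical key-value service (`kv_service`); deletions are ignored for brevity:

```python
import copy


def execute_and_classify(service, state, request):
    """Execute a request on a copy of the state and derive the state update."""
    new_state = copy.deepcopy(state)
    reply = service(new_state, request)
    # The update is the set of keys whose values were added or changed.
    update = {k: v for k, v in new_state.items() if state.get(k) != v}
    kind = "write" if update else "read"
    return reply, update, kind


def kv_service(state, request):
    """Hypothetical service: ('get', key) or ('set', key, value)."""
    if request[0] == "get":
        return state.get(request[1])
    state[request[1]] = request[2]
    return "OK"


state = {"x": 1}
read_result = execute_and_classify(kv_service, state, ("get", "x"))
write_result = execute_and_classify(kv_service, state, ("set", "x", 2))
```

A request classified as a read can be answered directly, while a write carries an update that must first be propagated to and acknowledged by the backups.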

Evaluation

Implementation
• Message passing with ZeroMQ
• TCC with XMHF-TrustVisor (S&P'10, S&P'13)
• Full SQLite database engine
  o VPR-ed SQLite
• OS-free implementation
  o very small TCB
• Compared against recent AR schemes:
  o BFT-SMaRt (IEEE DSN'14)
  o Prime (IEEE TDSC'11)
[Diagram: trusted environment stack: Service and Manager on top of TrustVisor, XMHF, and the hardware TCC]

Performance
• Overhead comparison among BFT-SMaRt, Prime and V-PR
[Figure: read latency (ms) and write latency (ms) vs. batch size (1, 5, 10, 20) for BFT-SMaRt, Prime, and V-PR]

VPR-ed SQLite
[Figure: read and write latency (ms) vs. batch size (1, 2, 5, 7)]
• Realistic trusted executions are the bottleneck
  o 2 TCC executions at the primary (for write requests)
  o in pessimistic runs, 1 more TCC execution at backups

Takeaways
• Easy to design fault-tolerant protocols using hardware-based security
• V-PR is the first fully-passive replication scheme that tolerates Byzantine failures
• No additional assumptions (compared to previous literature)
• Linear factor reduction in executing replicas
• Non-determinism supported by design
• Main limitation is the current technology
  o …but it's making progress, check out Intel SGX

Thanks.

System Initialization
• Need to form a secure group
  o If other replicas participate, they could later be shut down (state loss)
• Share a unique key K (use TCC secure storage for confidentiality)
• Start from same initial state
[Diagram: Admin, Primary (M), and Backup (M) exchange JOIN, attested TCC cert. + encr.{K}, attested initial state, ACK, and ACCEPT messages; actions shown: check attestation, install initial state, check ACKs]

Primary Change
• Primary identified through local view counter
  o Each replica answers to only one specific primary
• Detect primary's failure through timeouts (partial synchrony)
  o Start primary change protocol, but always answer to primary's updates
  o Exchange messages to increment view counter
  o Eventually, no progress => new primary
• Extreme cases
  o Multiple primaries: safe, because only one can make progress
  o Only one view increment:
    • replica waits for others to change primary
    • replica can make progress through consecutive updates anyway
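The "multiple primaries" and "only one view increment" cases can be illustrated with a small sketch. Two things here are assumptions, not taken from the slides: the round-robin mapping from view counter to primary (view mod n) and the rule that a primary needs ACKs from a majority of replicas to make progress. The message exchange that increments views and the timeout handling are not modeled.

```python
class Replica:
    def __init__(self, n):
        self.n = n
        self.view = 0  # local view counter

    def primary(self):
        # Assumption: round-robin mapping from view number to primary index.
        return self.view % self.n

    def on_update(self, sender):
        """ACK an update only if it comes from the primary of the local view."""
        return sender == self.primary()


def can_make_progress(sender, replicas):
    """Assumed rule: a primary progresses only with ACKs from a majority."""
    acks = sum(1 for r in replicas if r.on_update(sender))
    return acks > len(replicas) // 2


replicas = [Replica(n=3) for _ in range(3)]
replicas[0].view = 1  # only one replica has incremented its view

old_primary_ok = can_make_progress(0, replicas)  # primary of view 0
new_primary_ok = can_make_progress(1, replicas)  # primary of view 1
```

Because each replica answers to exactly one primary at a time, two would-be primaries cannot both gather a majority: here the primary of view 0 still progresses, and the primary of view 1 has to wait until more replicas increment their view.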
