Securing Passive Replication Through Verification
Bruno Vavala (1,2), Nuno Neves (1), Peter Steenkiste (2)
(1) University of Lisbon (Portugal), (2) Carnegie Mellon University (U.S.)
IEEE Symposium on Reliable and Distributed Systems, 2015
Outline
Fault-Tolerance
replication (+replicas)
coordination (+coordination)
Replication
2 main design choices:
Active Replication (State Machine Replication) vs. Passive Replication
Active Replication (AR)
State Machine approach:
1. System receives the requests
2. Requests are ordered (“many” messages)
3. Enough replicas execute them
4. Each replica returns an answer
5. Answers are voted
[Figure: client C and replicas R1 to R4 running the request ordering protocol, steps 1 to 5]
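The five steps above can be sketched in a few lines of Python. This is a toy model, not the protocol of any system named in these slides: the ordering protocol is replaced by a stand-in that keeps arrival order, every replica executes (the slide only says "enough"), and the voter accepts an answer once f + 1 replicas agree. All function names are hypothetical.

```python
from collections import Counter

def order_requests(requests):
    """Step 2 stand-in: a real system runs an ordering protocol among the
    replicas; here arrival order is simply taken as the agreed order."""
    return list(requests)

def make_counter_service():
    """A deterministic toy service: each request adds to a running total."""
    state = {"total": 0}
    def execute(amount):
        state["total"] += amount
        return state["total"]
    return execute

def vote(answers, f):
    """Step 5: accept an answer only if at least f + 1 replicas agree on it
    (assumption: at most f replicas are faulty)."""
    answer, count = Counter(answers).most_common(1)[0]
    if count < f + 1:
        raise RuntimeError("no answer reached f + 1 matching votes")
    return answer

def active_replication(requests, replicas, f):
    results = []
    for req in order_requests(requests):                   # steps 1 and 2
        answers = [replica(req) for replica in replicas]   # steps 3 and 4
        results.append(vote(answers, f))                   # step 5
    return results

f = 1
faulty = lambda req: -1   # a replica that always answers wrongly
replicas = [make_counter_service(), make_counter_service(), faulty]
print(active_replication([5, 7], replicas, f))   # prints [5, 12]
```

Every request is executed once per replica, which is the cost that the later comparison with passive replication refers to.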
Passive Replication (PR)
[Figure: client C and replicas R1 to R4, steps 1 to 6]
requests
broadcast
and return ACK
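A minimal primary-backup sketch, assuming the usual passive replication flow: only the primary executes, it broadcasts the resulting state update, and the backups apply it and return an ACK. The class names, the toy "issue a token" request, and the choice to wait for an ACK from every backup are assumptions made for illustration; the slide text does not give these details.

```python
import secrets

class Backup:
    """Holds a copy of the state; never executes requests itself."""
    def __init__(self):
        self.state = {}

    def apply_update(self, update):
        self.state.update(update)
        return "ACK"

class Primary:
    def __init__(self, backups):
        self.state = {}
        self.backups = backups

    def handle(self, user):
        # Only the primary executes. The result is non-deterministic on
        # purpose: re-executing elsewhere would yield a different token.
        token = secrets.token_hex(8)
        update = {user: token}
        self.state.update(update)
        # Broadcast the state update and collect the ACKs.
        acks = [b.apply_update(update) for b in self.backups]
        if acks.count("ACK") < len(self.backups):   # assumption: wait for all
            raise RuntimeError("missing ACK from a backup")
        return token

backups = [Backup(), Backup()]
primary = Primary(backups)
reply = primary.handle("alice")
print(all(b.state == primary.state for b in backups))   # prints True
```

Because backups copy the update instead of re-running the request, the random token does not cause the replicas to diverge.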
Current BFT Solutions
Seminal practical SMR work
Hybrid model with TTCB
Speculative executions
Bounded Delay Guarantee
Fewer replicas in hybrid model
Hybrid model, activation of passive replicas upon failures
High performance
AR PR … and many, many more
Why no PR solutions?
[Figure: AR: replicas R1 to R4 in the system, a voter, and the client, which receives the correct answer (✔︎). PR: client and replicas R1 to R3; is the answer correct? (?)]
Pros & Cons
                  AR                    PR
Byzantine FT      ✔︎                    ✗
Replicas          2f+1                  2f+1
Re-Computations   O(n)                  O(1)
Message size      |request| + |input|   |reply| + |update|
Non-determinism   ✗                     ✔
“While some consensus algorithms, such as Paxos […] have started to find their way into those systems, their uses are limited mostly to the maintenance of the global configuration information in the system, not for the actual data replication.” – L. Lamport et al.
Outline
Goals
Fault-tolerant & resource-efficient & simple replicated architecture for unmodified services
Challenges
Outline
V-PR
Verified Passive Replication
Best of Both Worlds
                  AR                    PR                   V-PR
Byzantine FT      ✔︎                    ✗                    ✔ (w/ trust assumptions)
Replicas          2f+1                  2f+1                 2f+1
Executions        O(n)                  O(1)                 O(1)
Message size      |request| + |input|   |reply| + |update|   |reply| + |update|
Non-determinism   ✗                     ✔                    ✔
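To make the rows above concrete, the following sketch plugs numbers into them. It assumes n = 2f + 1 replicas, that every AR replica executes each request once, and that sizes are given in bytes; the function name and the sample sizes are made up for illustration and are not measurements from the talk.

```python
def costs(f, request, inp, reply, update):
    """Per-request cost of each scheme, following the comparison rows."""
    n = 2 * f + 1
    ar = {"replicas": n, "executions": n, "message_size": request + inp}
    pr = {"replicas": n, "executions": 1, "message_size": reply + update}
    # V-PR shares the PR cost profile in this comparison.
    return {"AR": ar, "PR": pr, "V-PR": dict(pr)}

table = costs(f=2, request=100, inp=4000, reply=50, update=300)
for name, row in table.items():
    print(name, row)
```

With f = 2 this gives 5 executions per request for AR against 1 for PR and V-PR. Which scheme sends less data depends on whether the input or the state update is larger.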
TCC Overview
Running code is identified for ID-based operations
can read or modify them
No different assumptions with respect to previous works, just a more powerful TCC!
Model
Rest of the system can fail arbitrarily (Byzantine)
V-PR Architecture
[Figure: a client (with Security MW) connected over the network to a primary and a backup. Each replica runs an OS with a TCC; the components shown are the Service, the Manager, the U-Manager, and the Update Svc. Trusted and untrusted parts are marked.]
Read Requests
[Figure: V-PR architecture; marked steps: 1. client request/reply, 2. execute]
establish a session key with the Manager
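One way a session key could be used on the read path is to authenticate each reply, so that the untrusted parts of the replica cannot alter it unnoticed. The sketch below is an assumption about such a use, not the V-PR protocol itself: the key establishment is replaced by a random key, HMAC-SHA256 is an arbitrary choice, and all function names are hypothetical.

```python
import hashlib
import hmac
import secrets

def _mac(session_key, request, reply):
    # Length-prefix the request so (request, reply) pairs cannot be confused.
    msg = len(request).to_bytes(4, "big") + request + reply
    return hmac.new(session_key, msg, hashlib.sha256).digest()

def manager_answer(session_key, request, execute):
    """Manager side: execute the read and bind the reply to the request."""
    reply = execute(request)
    return reply, _mac(session_key, request, reply)

def client_accepts(session_key, request, reply, tag):
    """Client side: accept the reply only if the tag verifies."""
    return hmac.compare_digest(_mac(session_key, request, reply), tag)

session_key = secrets.token_bytes(32)   # stand-in for the established key
execute = lambda req: b"value-for-" + req
reply, tag = manager_answer(session_key, b"get x", execute)
print(client_accepts(session_key, b"get x", reply, tag))         # prints True
print(client_accepts(session_key, b"get x", b"tampered", tag))   # prints False
```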
Write Requests
[Figure: V-PR architecture; marked steps: 3. state updates/ACKs, 4. trusted updates, 6. check ACKs]
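The surviving step labels (state updates/ACKs, trusted updates, check ACKs) suggest a flow in which the primary's Manager vouches for each update and later checks the backups' ACKs. The sketch below illustrates that idea under loud assumptions: a key shared by the Managers, HMAC tags, per-update sequence numbers, and a requirement that every backup acknowledges. None of these details come from the slides, and the class names are hypothetical.

```python
import hashlib
import hmac

def tag(key, kind, seq, payload):
    """MAC over the message kind, sequence number, and payload."""
    msg = kind + seq.to_bytes(8, "big") + payload
    return hmac.new(key, msg, hashlib.sha256).digest()

class BackupManager:
    def __init__(self, key):
        self.key, self.next_seq, self.applied = key, 0, []

    def on_update(self, seq, update, update_tag):
        genuine = hmac.compare_digest(tag(self.key, b"UPD", seq, update), update_tag)
        if not genuine or seq != self.next_seq:
            return None   # reject forged, replayed, or out-of-order updates
        self.applied.append(update)
        self.next_seq += 1
        return tag(self.key, b"ACK", seq, update)

class PrimaryManager:
    def __init__(self, key, backups):
        self.key, self.backups, self.seq = key, backups, 0

    def write(self, update):
        seq = self.seq
        update_tag = tag(self.key, b"UPD", seq, update)
        acks = [b.on_update(seq, update, update_tag) for b in self.backups]
        expected = tag(self.key, b"ACK", seq, update)
        valid = sum(a is not None and hmac.compare_digest(a, expected) for a in acks)
        if valid < len(self.backups):   # assumption: every backup must ACK
            raise RuntimeError("update not acknowledged by every backup")
        self.seq += 1
        return valid

key = b"k" * 32   # hypothetical shared key
backups = [BackupManager(key), BackupManager(key)]
primary = PrimaryManager(key, backups)
print(primary.write(b"x=1"))                            # prints 2
print(backups[0].on_update(1, b"x=2", b"\x00" * 32))    # prints None
```

The sequence number keeps an old, correctly tagged update from being replayed to a backup.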
Outline
Evaluation
Implementation
[Figure: software stack: Hardware, XMHF, TrustVisor, Manager, Service; trusted environment marked]
TCC
(S&P’10, S&P’13)
Performance
BFT-SMaRt, Prime and V-PR
[Figure: two plots, read latency (ms) and write latency (ms) against batch size (1, 5, 10, 20), comparing BFT-SMaRt, Prime, and V-PR]
VPR-ed SQLite
[Figure: read and write latency (ms) against batch size (1, 2, 5, 7)]
Outline
Takeaways
using hardware-based security
Thanks.
System Initialization
[Figure: message sequence among MPrimary, MBackup, and Admin. Labels: attested JOIN; check attestation; attested ACCEPT + encr.{K}; check ACKs, install initial state; ACK; initial state, TCC cert.; check attestation]
Primary Change
(partial synchrony)
Implementation
performance library ZeroMQ
and TrustVisor (S&P’10)
[Figure: primary, backup, and client on the network, with a client broker and a replica broker]
Implementation
[Figure: software stack: Hardware, XMHF, TrustVisor, Manager, Service; trusted environment marked]
trusted counters
to devices, like disk): created custom APIs (memory allocation, debugging, etc.), custom filesystem (as a module, so no modification to SQLite)
TCC
Reducing TCC Demand
[Figure: V-PR architecture; marked step: 4. untrusted updates]
(yes, all of them, so at least a correct one always available)
Blinder
[Figure: V-PR architecture; marked steps: 2. execute/blind reply, 5. unblind reply]
consistency
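The step labels suggest that the reply is produced in a blinded form and only unblinded once the replicas are consistent. The slides do not say how blinding is done, so the following is purely an illustrative sketch: it assumes a one-time pad kept by the trusted side and released after the ACK check succeeds. The `Blinder` class and its methods are hypothetical names.

```python
import secrets

def xor(data, pad):
    """XOR two equal-length byte strings."""
    return bytes(d ^ p for d, p in zip(data, pad))

class Blinder:
    """Keeps the pads on the trusted side until release is allowed."""
    def __init__(self):
        self._pads = {}

    def blind(self, request_id, reply):
        # Step 2 (execute/blind reply): hand out only the blinded reply.
        pad = secrets.token_bytes(len(reply))
        self._pads[request_id] = pad
        return xor(reply, pad)

    def release(self, request_id, acks_ok):
        # Step 5 (unblind reply): give up the pad only after the ACK check.
        if not acks_ok:
            raise PermissionError("backups not yet consistent")
        return self._pads.pop(request_id)

blinder = Blinder()
blinded = blinder.blind(1, b"balance=42")
pad = blinder.release(1, acks_ok=True)
print(xor(blinded, pad))   # prints b'balance=42'
```

Until the pad is released, the blinded bytes reveal nothing about the reply, so forwarding them early cannot expose a result that the backups have not yet acknowledged.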
Code size
[Figure: bar chart of code size for AR, VPR Primary, and VPR Backup, broken down into Update, Network, SQLite, and V-PR, with the AR average marked]