Efficient Data Structures for Tamper-Evident Logging



slide-1
SLIDE 1

Efficient Data Structures for Tamper-Evident Logging

Scott A. Crosby Dan S. Wallach

Rice University

slide-2
SLIDE 2

Everyone has logs

slide-3
SLIDE 3

Tamper evident solutions

  • Current commercial solutions

– ‘Write only’ hardware appliances – Security depends on correct operation

  • Would like cryptographic techniques

– Logger proves correct behavior – Existing approaches too slow

slide-4
SLIDE 4

Our solution

  • History tree

– Logarithmic for all operations – Benchmarks at >1,750 events/sec – Benchmarks at >8,000 audits/sec

  • In addition

– Propose new threat model – Demonstrate the importance of auditing

slide-5
SLIDE 5

Threat model

  • Forward integrity

– Events prior to Byzantine failure are tamper-evident

  • Don’t know when logger becomes evil

– Clients are trusted

  • Strong insider attacks

– Malicious administrator

  • Evil logger

– Clients may be mostly evil

  • Only trusted during insertion protocol
slide-6
SLIDE 6

Limitations and Assumptions

  • Limitations

– Detect misbehaviour, not prevent it – Cannot prevent ‘junk’ from being logged

  • Assumptions

– Privacy is outside our scope

  • Data may be encrypted

– Crypto is secure

slide-7
SLIDE 7

System design

  • Logger

– Stores events – Never trusted

  • Clients

– Little storage – Create events to be logged – Trusted only at time of event creation – Sends commitments to auditors

  • Auditors

– Verify correct operation – Little storage – Trusted, at least one is honest

Client Client Client Auditor Auditor

Logger

slide-8
SLIDE 8

This talk

  • Discuss the necessity of auditing
  • Describe the history tree
  • Evaluation
  • Scaling the log
slide-9
SLIDE 9

Tamper evident log

  • Events come in
  • Commitments go out

– Each commits to the entire past

Logger

Cn-3 Cn-2 Cn-1

Xn-3 Xn-2 Xn-1

slide-10
SLIDE 10

Hash chain log

  • Existing approach [Kelsey,Schneier]

– Cn=H(Cn-1 || Xn) – Logger signs Cn

Xn-5 Xn-4 Xn-3

Cn-3
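The recurrence Cn=H(Cn-1 || Xn) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes SHA-256 (the talk's benchmarks used SHA-1), an all-zero starting value, and it omits the logger's signature on each commitment. The function name chain_commitments is hypothetical.

```python
import hashlib

def chain_commitments(events, genesis=b"\x00" * 32):
    """Return [C_1, ..., C_n] with C_i = H(C_{i-1} || X_i)."""
    commitments = []
    prev = genesis  # assumed starting value C_0; not specified in the slides
    for event in events:
        prev = hashlib.sha256(prev + event).digest()
        commitments.append(prev)
    return commitments

honest = chain_commitments([b"login alice", b"sudo bob", b"logout alice"])
forged = chain_commitments([b"login alice", b"sudo mallory", b"logout alice"])

# Changing event 2 changes C_2 and every later commitment, but not C_1.
print(honest[0] == forged[0], honest[1] == forged[1], honest[2] == forged[2])
```

Because each commitment folds in the previous one, a single commitment fixes the entire prefix of the log.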

slide-11
SLIDE 11

Hash chain log

  • Existing approach [Kelsey,Schneier]

– Cn=H(Cn-1 || Xn) – Logger signs Cn

Xn-5 Xn-4 Xn-3 Xn-2

Cn-2

slide-12
SLIDE 12

Hash chain log

  • Existing approach [Kelsey,Schneier]

– Cn=H(Cn-1 || Xn) – Logger signs Cn

Xn-5 Xn-4 Xn-3 Xn-2 Xn-1

Cn-1

slide-13
SLIDE 13

Problem

  • We don’t trust the logger!

Cn Cn-2 Cn-1

Logger returns a stream of commitments

Each corresponds to a log

Cn

Xn-4 Xn-3 Xn-2 Xn-1 Xn

slide-14
SLIDE 14

Problem

  • We don’t trust the logger!

Does Cn really contain the just-inserted Xn?

Do Cn-2 and Cn-1 really commit the same historical events?

Is the event at index i in log Cn really Xi?

[Figure: log Xn-4 Xn-3 Xn-2 Xn-1 Xn with commitments Cn-2, Cn-1, Cn]

slide-15
SLIDE 15

Problem

  • We don’t trust the logger!

– Logger signs stream of log heads – Each corresponds to some log

Does Cn-3 really contain the just-inserted Xn-3?

Do Cn-2 and Cn-1 really commit the same historical events?

Is the event at index i in log Cn really Xi?

slide-16
SLIDE 16

Solution: Audit the logger

  • Only way to detect tampering

– Check the returned commitments

  • For consistency
  • For correct event lookup
  • Previously

– Auditing = looking up historical events

  • Assumed to be infrequent
  • Performance was ignored

Cn-2 Cn-1

Cn-3

Xn-3 

slide-17
SLIDE 17

Solution

  • Auditors check the returned commitments

– For consistency – For correct event lookup

  • Previously

– Auditing = looking up historical events

  • Assumed to be infrequent
  • Performance was ignored

Cn-2 Cn-1

Cn-3

Xn-3 

slide-18
SLIDE 18

Auditing is a frequent operation

  • If the logger knows this commitment will not be audited for consistency with a later commitment, it can alter already-logged events undetected.

X’n-3 Xn-2 Xn-1

C’n-1

Xn-6 Xn-4 Xn-3

Cn-3

Xn-5

slide-19
SLIDE 19

Auditing is a frequent operation

  • Successfully tampered with a ‘tamper evident’ log
  • Auditing required in forward integrity threat model

X’n-3 Xn-2 Xn-1

C’n-1

Xn-6 Xn-4 Xn-3

Cn-3

Xn-5

slide-20
SLIDE 20

Auditing is a frequent operation

  • Every commitment must have a non-zero probability of being audited

X’n-3 Xn-2 Xn-1

C’n-1

Xn-4 Xn-3

Cn-3

Xn-5 Xn-6

slide-21
SLIDE 21

Forking the log

  • Rolls back the log and adds on different events

– Attack requires that two commitments on different forks disagree on the contents of one event – If the system has historical integrity, audits must fail or be skipped

Xn-5 X’n-4 Xn-3 Xn-2 Xn-1

C’n-1

Xn-6 Xn-4 Xn-3

Cn-3

Xn-5

slide-22
SLIDE 22

New paradigm

  • Auditing cannot be avoided
  • Audits should occur

– On every event insertion – Between commitments returned by logger

  • How to make inserts and audits cheap

– CPU – Communications complexity – Storage

slide-23
SLIDE 23

Two kinds of audits

Ci Cn

  • Membership auditing

– Verify proper insertion – Lookup historical events

  • Incremental auditing

– Prove consistency between two commitments

Cn

Xi 

slide-24
SLIDE 24

Membership auditing a hash chain

  • Is Xn-5 ∈ Cn-3?

slide-25
SLIDE 25

Membership auditing a hash chain

  • Is Xn-5 ∈ Cn-3?

[Figure: proof P over the chain Xn-5 Xn-4 Xn-3 ending in Cn-3, shown against an alternate history X’n-6 X’n-5 X’n-4 X’n-3]

slide-26
SLIDE 26

Membership auditing a hash chain

  • Is Xn-5 ∈ Cn-3?

[Figure: chain Xn-5 Xn-4 Xn-3 ending in Cn-3, shown against an alternate history X’n-6 X’n-5 X’n-4 X’n-3]

slide-27
SLIDE 27

Incremental auditing a hash chain

  • Are C’’n-5 and C’n-1 consistent?

slide-28
SLIDE 28

Incremental auditing a hash chain

X’n-5 X’n-4 X’n-3 X’n-2 X’n-1

C’n-1

X’’n-6 X’’n-5

C’’n-5

Xn-4 Xn-3 Xn-2 Xn-1 X’n-6 Xn-5 Xn-6

P

slide-29
SLIDE 29

Incremental auditing a hash chain

X’n-5 X’n-4 X’n-3 X’n-2 X’n-1

C’n-1

X’’n-6 X’’n-5

C’’n-5

Xn-4 Xn-3 Xn-2 Xn-1 X’n-6 Xn-5 Xn-6

Cn-5

P

slide-30
SLIDE 30

Incremental auditing a hash chain

X’n-5 X’n-4 X’n-3 X’n-2 X’n-1

C’n-1

X’’n-6 X’’n-5

C’’n-5

Xn-4 Xn-3 Xn-2 Xn-1

Cn-1

X’n-6 Xn-5 Xn-6

P

slide-31
SLIDE 31

Incremental auditing a hash chain

X’n-5 X’n-4 X’n-3 X’n-2 X’n-1

C’n-1

X’’n-6 X’’n-5

C’’n-5

Xn-4 Xn-3 Xn-2 Xn-1 X’n-6 Xn-5 Xn-6

Cn-5 Cn-1

P

slide-32
SLIDE 32

Existing tamper evident log designs

  • Hash chain

– Auditing is linear time – Historical lookups

  • Very inefficient
  • Skiplist history [Maniatis,Baker]

– Auditing is still linear time – O(log n) historical lookups
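Why auditing a hash chain is linear time: to check that a later commitment Cj extends an earlier Ci, the auditor has to replay every event in between. The sketch below assumes SHA-256, an all-zero starting value, and hypothetical helper names (extend, build_chain, audit_incremental); it is an illustration, not code from the talk.

```python
import hashlib

def extend(commitment, event):
    """One hash-chain step: C_k = H(C_{k-1} || X_k)."""
    return hashlib.sha256(commitment + event).digest()

def build_chain(events, genesis=b"\x00" * 32):
    heads = [genesis]
    for event in events:
        heads.append(extend(heads[-1], event))
    return heads  # heads[k] commits to the first k events

def audit_incremental(c_i, c_j, events_between):
    """Replay the events after C_i and compare against C_j."""
    head = c_i
    for event in events_between:  # O(j - i) hash evaluations
        head = extend(head, event)
    return head == c_j

log = [b"e1", b"e2", b"e3", b"e4", b"e5"]
heads = build_chain(log)
fork = build_chain([b"e1", b"e2-forged", b"e3", b"e4", b"e5"])

ok = audit_incremental(heads[2], heads[5], log[2:5])
caught = not audit_incremental(heads[2], fork[5], log[2:5])
```

Here ok is True for the honest log, and caught is True because the forked head cannot be reached from the commitment the auditor already holds.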

slide-33
SLIDE 33

Our solution

  • History tree

– O(log n) instead of O(n) for all operations – Variety of useful features

  • Write-once append-only storage format
  • Predicate queries + safe deletion
  • May probabilistically detect tampering

– Auditing random subset of events – Not beneficial for skip-lists or hash chains

slide-34
SLIDE 34

History Tree

  • Merkle binary tree

– Events stored on leaves – Logarithmic path length

  • Random access

– Permits reconstruction of past version and past commitments

slide-35
SLIDE 35

History Tree

X1

C2

X2

slide-36
SLIDE 36

History Tree

X1 X2 X3

C3

slide-37
SLIDE 37

History Tree

X1 X2 X3 X4

C4

slide-38
SLIDE 38

History Tree

X1 X2 X3 X4 X5

C5

slide-39
SLIDE 39

History Tree

X1 X2 X3 X4 X5 X6

C6

slide-40
SLIDE 40

History Tree

X1 X2 X3 X4 X5 X7 X6

C7

slide-41
SLIDE 41

History Tree

X1 X2 X3 X4 X5 X7 X6
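A rough sketch of how a commitment Cn could be computed over such a tree, and how an older commitment can be reconstructed from the same stored events. The hashing conventions here are assumptions for illustration (SHA-256, a 0x00 prefix for leaves, a 0x01 prefix for interior nodes, empty right subtrees simply omitted) and the function names are hypothetical. This naive version recomputes every hash, so it is O(n) per commitment; a real implementation would cache nodes that no longer change.

```python
import hashlib

def H(*parts):
    h = hashlib.sha256()
    for part in parts:
        h.update(part)
    return h.digest()

def node_hash(events, start, level):
    """Hash of the subtree covering events[start : start + 2**level]."""
    if level == 0:
        return H(b"\x00", events[start])
    half = 1 << (level - 1)
    left = node_hash(events, start, level - 1)
    if start + half >= len(events):      # right subtree is empty in this version
        return H(b"\x01", left)
    right = node_hash(events, start + half, level - 1)
    return H(b"\x01", left, right)

def commitment(events):
    """Root hash of the tree holding exactly these events."""
    if not events:
        raise ValueError("need at least one event")
    depth = max(1, (len(events) - 1).bit_length())
    return node_hash(events, 0, depth)

events = [f"X{i}".encode() for i in range(1, 8)]
c7 = commitment(events)
c3 = commitment(events[:3])   # past commitment rebuilt from the same data
```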

slide-42
SLIDE 42

Incremental auditing

slide-43
SLIDE 43

Auditor

X1 X2 X3

C3 C3

slide-44
SLIDE 44

Auditor

X1 X2 X3 X4

C4 C3

slide-45
SLIDE 45

Auditor

X1 X2 X3 X4 X5

C5 C3

slide-46
SLIDE 46

Auditor

X1 X2 X3 X4 X5 X6

C6 C3

slide-47
SLIDE 47

Auditor

X1 X2 X3 X4 X5 X7 X6

C7 C3 C7

slide-48
SLIDE 48

Incremental proof 

X1 X2 X3 X4 X5 X7 X6

Auditor

C3 C7 C7 C3 C7 C3

slide-49
SLIDE 49

Incremental proof 

  • P is consistent with C7
  • P is consistent with C3
  • Therefore C7 and C3 are consistent.

X1 X2 X3 X4 X5 X7 X6

C7 C3 C7 C3

Auditor

C3 C7 C7 C3 C7 C3

slide-50
SLIDE 50

Auditor

Incremental proof 

  • P is consistent with C7
  • P is consistent with C3
  • Therefore C7 and C3 are consistent.

X1 X2 X3 X4 X5 X7 X6

C7 C3 C7 C3 C3 C7 C7 C3 C7 C3

slide-51
SLIDE 51

Auditor

Incremental proof 

  • P is consistent with C7
  • P is consistent with C3
  • Therefore C7 and C3 are consistent.

X1 X2 X3 X4 X5 X7 X6

C3 C7 C7 C3 C7 C3 C7 C3 C7 C3

slide-52
SLIDE 52

Incremental proof 

  • P is consistent with C7
  • P is consistent with C3
  • Therefore C7 and C3 are consistent.

X1 X2 X3 X4 X5 X7 X6

Auditor

C3 C7 C7 C3 C7 C3 C7 C3 C7 C3

slide-53
SLIDE 53

Pruned subtrees

X1 X2 X3 X4 X5 X7 X6

  • Although not sent to auditor

– Fixed by hashes above them – C3, C7 fix the same (unknown) events

Auditor

C3 C7 C7 C3

slide-54
SLIDE 54

Membership proof that X3 ∈ C’’7

X1 X2 X3 X4 X5 X7 X6

  • Verify that C’’7 has the same contents as P
  • Read out event X3
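A membership proof of this kind can be sketched as a Merkle path: the logger sends the sibling hashes along the path from the leaf to the root, and the auditor recomputes the root. The conventions below are assumptions for illustration (SHA-256, 0x00/0x01 prefixes, None for an empty right sibling, 0-based indexes), and all function names are hypothetical. A real verifier would also need to check the version number n, which this sketch leaves out.

```python
import hashlib

def H(*parts):
    h = hashlib.sha256()
    for part in parts:
        h.update(part)
    return h.digest()

def subtree(events, start, level):
    """Hash of the subtree covering events[start : start + 2**level]."""
    if level == 0:
        return H(b"\x00", events[start])
    half = 1 << (level - 1)
    left = subtree(events, start, level - 1)
    if start + half >= len(events):
        return H(b"\x01", left)
    return H(b"\x01", left, subtree(events, start + half, level - 1))

def depth(n):
    return max(1, (n - 1).bit_length())

def commitment(events):
    return subtree(events, 0, depth(len(events)))

def prove(events, i):
    """Logger side: sibling hashes from leaf i up to the root (O(log n) entries)."""
    path = []
    for level in range(depth(len(events))):
        start = (i >> level) << level
        width = 1 << level
        if (i >> level) & 1:                      # we are a right child
            path.append(subtree(events, start - width, level))
        elif start + width < len(events):         # left child, sibling exists
            path.append(subtree(events, start + width, level))
        else:                                     # left child, sibling empty
            path.append(None)
    return path

def verify(root, i, event, path):
    """Auditor side: rebuild the root from the event and the sibling hashes."""
    h = H(b"\x00", event)
    for level, sibling in enumerate(path):
        if (i >> level) & 1:
            if sibling is None:
                return False
            h = H(b"\x01", sibling, h)
        elif sibling is None:
            h = H(b"\x01", h)
        else:
            h = H(b"\x01", h, sibling)
    return h == root

events = [f"X{k}".encode() for k in range(1, 8)]
c7 = commitment(events)
proof = prove(events, 2)                  # X3 sits at 0-based index 2
accepted = verify(c7, 2, b"X3", proof)
rejected = verify(c7, 2, b"X3-forged", proof)
```

The auditor stores only the commitment; the proof carries one hash per tree level.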

slide-55
SLIDE 55

Merkle aggregation

slide-56
SLIDE 56

Merkle aggregation

  • Annotate events with attributes

$1 $8 $3 $2 $5 $2 $2

slide-57
SLIDE 57

Aggregate them up the tree

  • Max()

$1 $8 $8

$8

$8 $3 $3 $2 $5 $5 $5 $4 $4 $2

Included in hashes and checked during audits

slide-58
SLIDE 58

Querying the tree

  • Max()

$1 $8 $8

$8

$8 $3 $3 $2 $5 $5 $5 $4 $4 $2

Find all transactions over $6
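A small sketch of a Max() query over an annotated tree. Each node carries the maximum dollar value beneath it, and that value is fed into the node's hash, so a logger cannot misreport it without changing the commitment. The tree shape (simple bottom-up pairing), the byte encoding, and the function names are assumptions for illustration; the leaf values are the seven amounts from the annotation slide.

```python
import hashlib

def leaf(index, amount):
    digest = hashlib.sha256(b"\x00" + amount.to_bytes(8, "big")).digest()
    return {"max": amount, "hash": digest, "index": index, "kids": []}

def parent(kids):
    agg = max(k["max"] for k in kids)            # aggregate attribute
    h = hashlib.sha256(b"\x01" + agg.to_bytes(8, "big"))
    for k in kids:
        h.update(k["hash"])                      # attribute is bound into the hash
    return {"max": agg, "hash": h.digest(), "kids": kids}

def build(amounts):
    level = [leaf(i, a) for i, a in enumerate(amounts)]
    while len(level) > 1:
        level = [parent(level[i:i + 2]) for i in range(0, len(level), 2)]
    return level[0]

def query_over(node, threshold, visited):
    """Return leaf indexes with amount > threshold, skipping subtrees that cannot match."""
    visited.append(node)
    if node["max"] <= threshold:
        return []                                # prune: nothing below exceeds threshold
    if not node["kids"]:
        return [node["index"]]
    hits = []
    for kid in node["kids"]:
        hits += query_over(kid, threshold, visited)
    return hits

amounts = [1, 8, 3, 2, 5, 2, 2]
root = build(amounts)
visited = []
hits = query_over(root, 6, visited)              # transactions over $6
```

Only the $8 transaction (index 1) matches, and the query touches a fraction of the tree because subtrees whose maximum is at most $6 are skipped.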

slide-59
SLIDE 59

Safe deletion

  • Max()

$1 $8 $8

$8

$8 $3 $3 $2 $5 $5 $5 $4 $4 $2

Authorized to delete all transactions under $4

$3 $2 $3 $2 $3

slide-60
SLIDE 60

Merkle aggregation is flexible

  • Many ways to map events to attributes

– Arbitrary computable function

  • Many attributes

– Timestamps, dollar values, flags, tags

  • Many aggregation strategies

+, *, min(), max(), ranges, and/or, Bloom filters

slide-61
SLIDE 61

Generic aggregation

  • (τ, ⊕, Γ)

– τ : Type of attributes on each node in history – ⊕ : Aggregation function – Γ : Maps an event to its attributes

  • For any predicate P, as long as:

– P(x) OR P(y) IMPLIES P(x ⊕ y) – Then:

  • Can query for events matching P
  • Can safe-delete events not matching P
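The condition on P can be checked by brute force over a small sample domain. This is only a sanity check under assumed inputs (integer dollar values 0 to 9), not a proof, and the helper name is_safe_predicate is hypothetical.

```python
from itertools import product

def is_safe_predicate(pred, combine, domain):
    """True if pred(x) or pred(y) implies pred(combine(x, y)) for every pair in domain."""
    return all(
        pred(combine(x, y))
        for x, y in product(domain, repeat=2)
        if pred(x) or pred(y)
    )

def over_6(v):
    return v > 6

def under_4(v):
    return v < 4

amounts = range(10)

max_over_6 = is_safe_predicate(over_6, max, amounts)    # holds
max_under_4 = is_safe_predicate(under_4, max, amounts)  # fails, e.g. max(3, 8) = 8
min_under_4 = is_safe_predicate(under_4, min, amounts)  # holds
```

With Max() as the aggregation, "over $6" can be queried, but "under $4" cannot: a subtree whose maximum is $8 may still hide a $3 event. Switching the aggregation to Min() makes "under $4" queryable.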
slide-62
SLIDE 62

Evaluating the history tree

  • Big-O performance
  • Syslog implementation
slide-63
SLIDE 63

Big-O performance

                                    Incremental proof   Membership proof   Insert
                                    (Ci → Cj)           (Xi ∈ Cj)
History tree                        O(log n)            O(log n)           O(log n)
Hash chain                          O(j-i)              O(j-i)             O(1)
Skip-list history [Maniatis,Baker]  O(j-i) or O(n)      O(log n) or O(n)   O(1)

slide-64
SLIDE 64

Skiplist history [Maniatis,Baker]

  • Hash chain with extra links

– Extra links cannot be trusted without auditing

  • Checking them

– Best case: only events since last audit – Worst case: examining the whole history

– If extra links are valid

  • Using them for historical lookups

– O(log n) time and space

slide-65
SLIDE 65

Syslog implementation

  • We ran at an 80-bit security level

– 1024 bit DSA signatures – 160 bit SHA-1 Hash

  • We recommend 112-bit security level

– 224 bit ECDSA signatures

  • 66% faster

– SHA-224 (Truncated SHA-256)

  • 33% slower
  • [NIST SP800-57 Part 1, Recommendation for Key Management – Part 1: General (Revised 2007)]

slide-66
SLIDE 66

Syslog implementation

  • Syslog

– Trace from Rice CS departmental servers – 4M events, 11 hosts over 4 days, 5 attributes per event

  • Repeated 20 times to create 80M event trace
slide-67
SLIDE 67

Syslog implementation

  • Implementation

– Hybrid C++ and Python – Single threaded – MMAP-based append-only write-once storage for log – 1024-bit DSA signatures and 160-bit SHA-1 hashes

  • Machine

– Dual-core 2007 desktop machine – 4 GB RAM

slide-68
SLIDE 68

Performance

  • Insert performance: 1,750 events/sec

– 2.4% : Parse – 2.6% : Insert – 11.8% : Get commitment – 83.3% : Sign commitment

  • Auditing performance

– With locality (last 5M events)

  • 10,000-18,000 incremental proofs/sec
  • 8,600 membership proofs/sec

– Without locality

  • 30 membership proofs/sec

– < 4,000 byte self-contained proof size

  • Compression reduces performance and proof size by 50%
slide-69
SLIDE 69

Improving performance

  • Increasing audit throughput above

– 8,000 audits/sec

  • Increasing insert throughput above

– 1,750 inserts/sec

slide-70
SLIDE 70

Increasing audit throughput

  • Audits require read-only access to the log

– Trivially offloaded to additional cores

  • For infinite scalability

– May replicate the log server

  • Master assigns event indexes
  • Slaves build history tree locally
slide-71
SLIDE 71

Increasing insert throughput

  • Public key signatures are slow

– 83% of runtime

  • Three easy optimizations

– Sign only some commitments – Use faster signatures – Offload to other hosts

  • Increase throughput to 10k events/sec
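A back-of-the-envelope check of these numbers, assuming a simple linear cost model in which signing accounts for 83.3% of per-event time at 1,750 events/sec and everything else is unchanged. The function names are hypothetical, and this is not necessarily how the authors arrived at their figure.

```python
def rate_without_signing(rate, sign_fraction):
    """Throughput if signing cost were removed from the insert path entirely."""
    return rate / (1.0 - sign_fraction)

def rate_signing_every_kth(rate, sign_fraction, k):
    """Throughput if only every k-th commitment is signed."""
    relative_cost = (1.0 - sign_fraction) + sign_fraction / k
    return rate / relative_cost

upper_bound = rate_without_signing(1750, 0.833)        # roughly 10,500 events/sec
every_tenth = rate_signing_every_kth(1750, 0.833, 10)  # roughly 7,000 events/sec
```

Under this model, removing signing from the critical path gives roughly 10,500 events/sec, which is in line with the 10k figure on the slide.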
slide-72
SLIDE 72

More concurrency with replication

  • Processing pipeline:

– Inserting into history tree

  • O(1). Serialization point
  • Fundamental limit

– Must be done on each replica – 38,000 events/sec using only one core

– Commitment or proofs generation

  • O(log n).

– Signing commitments

  • O(1), but expensive. Concurrently on other hosts
slide-73
SLIDE 73

Storing on secondary storage

  • Nodes are frozen (no longer ever change)

– In post-order traversal

  • Static order

– Map into an array

1 3 7 2 4 6 5 8 10 11 9 X1 X2 X3 X4 X5 X7 X6
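The array layout can be sketched by listing nodes in the order they freeze. A leaf freezes when it is written; an interior node freezes once its right child is complete. Writing nodes in that order yields a post-order traversal, so the file is append-only. The (level, index) naming and the function frozen_order are assumptions for illustration.

```python
def frozen_order(n_events):
    """Return (level, index) pairs in the order nodes freeze as events are appended."""
    order = []
    for i in range(n_events):          # 0-based leaf index
        order.append((0, i))           # the leaf itself is frozen immediately
        level, idx = 0, i
        # A parent freezes when its right child completes, i.e. idx is odd.
        while idx % 2 == 1:
            idx //= 2
            level += 1
            order.append((level, idx))
    return order

layout = frozen_order(7)
position = {node: pos for pos, node in enumerate(layout, start=1)}
```

For seven events this gives 11 frozen nodes: X1 and X2 land at positions 1 and 2, their parent at 3, and the root of the first four events at 7, matching the numbering in the figure.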

slide-74
SLIDE 74

Partial proofs

  • Can re-use node hashes from prior audits

– (eg, incremental proof from C3 to C4 )

X1 X2 X3 X4 X5 X7 X6

C’’7 C’4

slide-75
SLIDE 75

Conclusion

  • New paradigm

– Importance of frequent auditing

  • History tree

– Efficient auditing – Efficient predicate queries and safe deletion – Scalable

  • Proofs of tamper-evidence will be in my PhD Thesis

slide-76
SLIDE 76

Questions

?

slide-77
SLIDE 77

Historical integrity

X’n-4 X’n-5

C’n-4

slide-78
SLIDE 78

Historical integrity

X’n-4 X’n-5

C’n-4

slide-79
SLIDE 79

X’n-5 X’n-4

C’n-4

Xn-5 Xn-4 Xn-3 Xn-2 Xn-1

Cn-1

Historical integrity

slide-80
SLIDE 80

Historical integrity

X’n-4 Xn-4 X’n-5

C’n-4

Xn-5 Xn-3 Xn-2 Xn-1

Cn-1

slide-81
SLIDE 81

Defining historical integrity

  • A logging system is tamper-evident when:

– If there is a verified incremental proof between commitments Cj and Ck (j<k), then for all i<j and all verifiable membership proofs that event i in log Cj is Xi and event i in log Ck is X’i, we must have Xi=X’i.

X’n-5 X’n-4 X’n-3

C’n-3

Xn-5 Xn-4 Xn-3 Xn-2 Xn-1

Cn-1

slide-82
SLIDE 82

Safe deletion

  • Unimportant events may be deleted

– When auditor requests deleted event

  • Logger supplies proof that ancestor was not important

X1 R X2 X3 X4 X5 X7 X6