Chair of Network Architectures and Services Department of Informatics Technical University of Munich
MA Final Talk: Tamper-Evident Publication of Internet Measurements
Max Helm
Advisors: Oliver Gasser, Benjamin Hof, Quirin Scheitle
July 16,
SLIDE 1
SLIDE 2
Motivation
- What?
- Tamper-evident: Cryptographically provable append-only property
- Publish Internet measurements (e.g. certificates)
- Why?
- Documentation of measurement claim, timestamping service, attribution to cryptographic identity: "I found and uploaded this specific certificate at this exact time"
- Ease the joint use of different certificate sources: Certificate Transparency ↔ Active Scans [6]
- Bind measurement results to their meta data
- M. Helm — MA: Tamper-Evident Measurements
SLIDE 3
Motivation
Requirements:
- CT compatibility: Upload, download, proofs
- Extensibility: Add additional modules
- New spam protection mechanism
SLIDE 4
Related Work
- Data archival and reproducibility guidelines: BSI[2], TUM UB[5], Censys[3]
- Tamper-evident data structures:
- Chain-based: Linear hash chains, ...
- Tree-based: Merkle Trees, Certificate Transparency, Persistent Authenticated Dictionaries, ...
- Merkle Tree extensions: Revocation Transparency[4], CT for DNSSEC[7], ...
SLIDE 5
Background
- Merkle Tree:
- Binary hash tree (leaves are certificates, nodes are hashes of children)
- Root hash gets published
- Efficient way of proving inclusion of certificates and consistency between tree versions
- Certificate Transparency (CT):
- Public, tamper-evident logs of certificates
- Anyone can submit certificates with a valid chain
- Logs provide simple HTTP(S) GET/POST endpoints for upload and download
- Use Merkle tree to achieve tamper-evident property
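The inclusion proof mentioned above can be illustrated with a minimal sketch. It assumes the RFC 6962 hashing convention used by CT (SHA-256 with a 0x00 prefix for leaves and 0x01 for interior nodes); the function names are illustrative and not taken from the talk or from Trillian.

```python
import hashlib
from typing import List


def leaf_hash(data: bytes) -> bytes:
    # RFC 6962 domain separation: 0x00 prefix for leaves.
    return hashlib.sha256(b"\x00" + data).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    # 0x01 prefix for interior nodes.
    return hashlib.sha256(b"\x01" + left + right).digest()


def split_point(n: int) -> int:
    """Largest power of two strictly smaller than n (requires n >= 2)."""
    k = 1
    while k * 2 < n:
        k *= 2
    return k


def merkle_root(leaves: List[bytes]) -> bytes:
    if not leaves:
        return hashlib.sha256(b"").digest()
    if len(leaves) == 1:
        return leaf_hash(leaves[0])
    k = split_point(len(leaves))
    return node_hash(merkle_root(leaves[:k]), merkle_root(leaves[k:]))


def inclusion_proof(leaves: List[bytes], index: int) -> List[bytes]:
    """Sibling hashes for leaves[index], ordered from the leaf up to the root."""
    if len(leaves) == 1:
        return []
    k = split_point(len(leaves))
    if index < k:
        return inclusion_proof(leaves[:k], index) + [merkle_root(leaves[k:])]
    return inclusion_proof(leaves[k:], index - k) + [merkle_root(leaves[:k])]


def verify_inclusion(leaf: bytes, index: int, tree_size: int,
                     proof: List[bytes], root: bytes) -> bool:
    """Recompute the root from one leaf and its sibling hashes."""
    def rebuild(h: bytes, idx: int, size: int, path: List[bytes]) -> bytes:
        if size == 1:
            if path:
                raise ValueError("proof too long")
            return h
        if not path:
            raise ValueError("proof too short")
        k = split_point(size)
        if idx < k:
            return node_hash(rebuild(h, idx, k, path[:-1]), path[-1])
        return node_hash(path[-1], rebuild(h, idx - k, size - k, path[:-1]))

    if not 0 <= index < tree_size:
        return False
    try:
        return rebuild(leaf_hash(leaf), index, tree_size, proof) == root
    except ValueError:
        return False
```

A verifier needs only the leaf, its index, the tree size, the published root hash, and a number of sibling hashes logarithmic in the tree size, which is what makes the proof efficient.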
SLIDE 6
Approach
- Three leaf types:
- Certificates (comparable to CT)
- Scan Data (Meta data about whole scan as well as single scan items)
- Derived Data (Additional data: e.g. linter results for certificates)
- Use three Merkle trees in conjunction (additional data, keep CT compatibility)
SLIDE 7
Approach
Workflow for researchers:
- Upload scan meta data → Returns: Inclusion promise for scan + leaf hash of scan
- Upload scan results → Returns: Inclusion promise for each result
- Researcher has to store or publish inclusion promises → Possible to prove misbehavior of the log
- Log triggers derived data collection
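The researcher workflow can be sketched with a toy in-memory log. Everything here is an assumption made for illustration: the class and function names, the promise fields, the single leaf list standing in for the separate trees, and the HMAC tag that stands in for the log's real signature. The actual promise structure is the one shown in the talk's Figure 10, which is not reproduced here.

```python
import hashlib
import hmac
import json
import time
from typing import Dict, List, Tuple

LOG_KEY = b"demo-log-key"  # hypothetical secret; a real log would sign with a private key


def leaf_hash(data: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + data).digest()


class ToyLog:
    """In-memory stand-in for the log server (not the Trillian API)."""

    def __init__(self) -> None:
        self.leaves: List[bytes] = []  # leaf hashes in append order

    def _promise(self, lh: bytes) -> Dict[str, str]:
        ts = str(int(time.time()))
        tag = hmac.new(LOG_KEY, lh + ts.encode(), hashlib.sha256).hexdigest()
        return {"leaf_hash": lh.hex(), "timestamp": ts, "tag": tag}

    def add_scan(self, meta: dict) -> Tuple[Dict[str, str], bytes]:
        # Step 1: scan meta data -> inclusion promise + leaf hash of the scan.
        lh = leaf_hash(json.dumps(meta, sort_keys=True).encode())
        self.leaves.append(lh)
        return self._promise(lh), lh

    def add_result(self, scan_leaf: bytes, result: bytes) -> Dict[str, str]:
        # Step 2: each result refers to its scan via the scan's leaf hash.
        if scan_leaf not in self.leaves:
            raise ValueError("unknown scan")
        lh = leaf_hash(scan_leaf + result)
        self.leaves.append(lh)
        return self._promise(lh)


def promise_is_authentic(p: Dict[str, str]) -> bool:
    msg = bytes.fromhex(p["leaf_hash"]) + p["timestamp"].encode()
    expected = hmac.new(LOG_KEY, msg, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, p["tag"])


def broken_promises(log: ToyLog, promises: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Step 3: authentic promises whose leaf is missing are evidence of misbehavior."""
    return [p for p in promises
            if promise_is_authentic(p)
            and bytes.fromhex(p["leaf_hash"]) not in log.leaves]
```

The stored promises matter because they are the only material the researcher holds independently of the log; without them, a log that silently drops an entry could not be challenged.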
SLIDE 10
Design
scan2 scan1 scan0
Figure 1: Scan series tree (scan meta data)
derived3 derived2 derived1 derived0
Figure 2: Derived data tree (additional data derived from certs)
cert3 cert2 cert1 cert0
Figure 3: Certificate tree (certificate data)
SLIDE 11
Design
Scan Data: scan2, scan1, scan0
Certificates: cert3, cert2, cert1, cert0
Derived Data: derived3, derived2, derived1, derived0
Figure 4: The three tree types and their final interconnections. Solid lines represent tamper-evident links, dashed lines represent non-tamper-evident links.
SLIDE 12
Implementation
- Based on github.com/google/trillian (log server)
and github.com/google/certificate-transparency-go (log client)
- Extended server to support active scans with meta data and derived data
- Extended client to upload scan data and added different upload modes
SLIDE 13
Implementation
[Diagram components: add-scan, add-cert, verifyPGP, GPG Public Key Ring, addToCache, Memcached Instance, checkCache, Tree Mode, startDerivation, add-derived, createTree, queueLeaf, DB, gRPC]
Figure 5: Overview of the implementation of the Trillian personality.
SLIDE 14
Implementation
Large scans cause problems (protobuf restrictions) ⇒ Two modes, chosen by scan size:
- Default (Scans < 5M entries):
- Three static trees
- One scan consists of one node of the scan tree
- Dynamic Tree Generation (Scans ≥ 5M entries):
- Three static trees + Dynamically generated trees
- Each scan triggers generation of new tree
- Those dynamic trees are sub trees of scan tree
- One scan consists of one node of scan tree + One dynamic tree
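One possible reading of the two modes is sketched below. The 5M threshold comes from the slide; the rest is assumed for illustration, in particular that the scan leaf in dynamic mode commits to the root hash of its own sub tree, and that the default mode hashes all entries into the single scan leaf. The function names and the length-prefixed encoding are hypothetical.

```python
import hashlib
from typing import List, Tuple

DYNAMIC_THRESHOLD = 5_000_000  # entries per scan, as stated on the slide


def _sha(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(items: List[bytes]) -> bytes:
    """RFC 6962-style root over the scan entries (requires at least one entry)."""
    if len(items) == 1:
        return _sha(b"\x00" + items[0])
    k = 1
    while k * 2 < len(items):
        k *= 2
    return _sha(b"\x01" + merkle_root(items[:k]) + merkle_root(items[k:]))


def _framed(chunks: List[bytes]) -> bytes:
    # Length-prefix each chunk so that concatenation is unambiguous.
    return b"".join(len(c).to_bytes(4, "big") + c for c in chunks)


def scan_leaf(meta: bytes, items: List[bytes],
              threshold: int = DYNAMIC_THRESHOLD) -> Tuple[str, bytes]:
    """Return (mode, leaf hash) for one scan in the scan tree."""
    if len(items) < threshold:
        # Default mode: the whole scan is a single node of the scan tree.
        return "default", _sha(b"\x00" + _framed([meta] + items))
    # Dynamic mode: entries go into their own tree; the scan node stores its root.
    return "dynamic", _sha(b"\x00" + _framed([meta, merkle_root(items)]))
```

Under this reading, the scan node in dynamic mode stays small regardless of scan size, because it stores one root hash in place of the entries themselves.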
SLIDE 15
Implementation
Scan Data scan2 scan1 scan0
datum3 datum2 datum1 datum0 datum3 datum2 datum1 datum0 datum3 datum2 datum1 datum0
Figure 6: The scan tree on top with three dynamically generated sub trees for the three scans in the top tree.
SLIDE 16
Implementation
Different upload modes for certificates:
- Default (like in CT):
- One cert per request
- Batches:
- 1,000 certs per request
- Concurrent batches:
- Parallel upload of batches
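The three upload modes differ only in how certificates are grouped into requests. The sketch below shows that difference on the client side; `send` is a hypothetical callable standing in for one HTTP request to the log, and the worker count of 4 is an arbitrary assumption (the talk does not state the degree of parallelism). Only the batch size of 1,000 comes from the slide.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterator, List, Sequence

BATCH_SIZE = 1000  # certs per request, as stated on the slide

Send = Callable[[List[bytes]], None]  # hypothetical: performs one upload request


def batches(certs: Sequence[bytes], size: int = BATCH_SIZE) -> Iterator[List[bytes]]:
    for start in range(0, len(certs), size):
        yield list(certs[start:start + size])


def upload_default(certs: Sequence[bytes], send: Send) -> int:
    """One cert per request (as in CT). Returns the number of requests."""
    for cert in certs:
        send([cert])
    return len(certs)


def upload_batched(certs: Sequence[bytes], send: Send, size: int = BATCH_SIZE) -> int:
    """Sequential requests carrying up to `size` certs each."""
    count = 0
    for batch in batches(certs, size):
        send(batch)
        count += 1
    return count


def upload_concurrent(certs: Sequence[bytes], send: Send,
                      size: int = BATCH_SIZE, workers: int = 4) -> int:
    """Same batches as above, but sent from a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return len(list(pool.map(send, batches(certs, size))))
```

For 2,500 certificates this yields 2,500 requests in the default mode and 3 requests in either batch mode, which is the request-count reduction the batch modes rely on.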
SLIDE 17
Evaluation
Setup:
- Single VM
- Debian Jessie, 64GB RAM, 8 × 2.67GHz cores
Scan upload (1M entries per scan):
- Default: Four minutes
- Dynamic Trees: 165 minutes
Certificate upload (1M certs):
- Default (like in CT): 660 minutes
- Batches: 161 minutes
- Concurrent batches: 70 minutes
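The upload times above translate directly into throughput and speedup figures. The following sketch only does that arithmetic on the numbers from the slide; the dictionary keys and function names are illustrative.

```python
ENTRIES = 1_000_000  # entries per upload, as stated on the slide

# Measured durations in minutes, copied from the slide.
SCAN_UPLOAD = {"default": 4, "dynamic_trees": 165}
CERT_UPLOAD = {"default": 660, "batches": 161, "concurrent_batches": 70}


def entries_per_second(minutes: float, entries: int = ENTRIES) -> float:
    return entries / (minutes * 60)


def speedup(baseline_minutes: float, minutes: float) -> float:
    return baseline_minutes / minutes


cert_rates = {mode: entries_per_second(m) for mode, m in CERT_UPLOAD.items()}
cert_speedups = {mode: speedup(CERT_UPLOAD["default"], m) for mode, m in CERT_UPLOAD.items()}
scan_rates = {mode: entries_per_second(m) for mode, m in SCAN_UPLOAD.items()}
```

For certificates this gives roughly 25, 104, and 238 certs per second for the default, batch, and concurrent batch modes, so batching is about 4.1 times and concurrent batching about 9.4 times faster than the default mode.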
SLIDE 18
Evaluation
(a) Memory and CPU usage (b) Network usage and number of unsequenced rows
Figure 7: Measurement results for the scan upload in the default mode
- Short upload time
- Not resource bound
SLIDE 19
Evaluation
(a) Memory and CPU usage (b) Network usage and number of unsequenced rows
Figure 8: Measurement results for the scan upload in the dynamic trees mode
- Upload takes more time
- Batch pattern in unsequenced rows
SLIDE 20
Evaluation
(a) Rocketeer log size over time[1] (b) Argon 2017 log size over time[1]
Figure 9: CT logs over time
- Rocketeer: mean of 300,000 per day
- Argon 2017: maximum of 2,100,000 per day
⇒ Only estimation of lower bound
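To put these growth rates next to the measured upload times (660, 161, and 70 minutes per 1M certs for the default, batch, and concurrent batch modes), the sketch below estimates how long each mode would need to ingest one day of log growth. It assumes upload time scales linearly with the number of entries, which the talk does not claim; the names are illustrative.

```python
MINUTES_PER_MILLION = {"default": 660, "batches": 161, "concurrent_batches": 70}
DAILY_GROWTH = {"rocketeer_mean": 300_000, "argon2017_max": 2_100_000}
MINUTES_PER_DAY = 24 * 60


def minutes_to_ingest(entries: int, minutes_per_million: float) -> float:
    # Linear-scaling assumption: twice the entries take twice as long.
    return entries / 1_000_000 * minutes_per_million


ingest_minutes = {
    (log, mode): minutes_to_ingest(entries, per_million)
    for log, entries in DAILY_GROWTH.items()
    for mode, per_million in MINUTES_PER_MILLION.items()
}
```

Under that assumption, the Argon 2017 peak day would take about 1,386 minutes in the default mode, close to the 1,440 minutes in a day, and about 147 minutes with concurrent batches; the Rocketeer mean day would take about 198 and 21 minutes respectively.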
SLIDE 21
Future Work
- Improve performance:
- Timeout issues of implementation
- Contexts get canceled
- Gossiping:
- Security goals like tamper-evidence, spam protection, accountability reached
- Other goals require lots of assumptions → e.g. no split view attacks
SLIDE 22
Conclusion
- CT compatibility: Upload ✗, download ✓, proof ✓
- Extensibility: Add derived module via git link ✓
- New spam protection mechanism: GPG ✓
Questions?
SLIDE 23
Backup
Implementation details
- Trillian implements a highly scalable Merkle tree in Go
- certificate-transparency-go implements the CT personality
- Altered the CT personality to work with arbitrary data (not only certs)
- Added a second tree and communication between trees
- Removed: root store containing valid root certs
- Added: PGP signature verification on scan upload
- User store: implemented as a public key ring file
- Memcached for caching cert hashes
- Only accept cert uploads where cert is part of existing scan
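The cache-based admission rule in the last two bullets can be sketched as follows. An in-memory set stands in for the Memcached instance, the handler names are hypothetical, and it is an assumption that a scan upload carries the SHA-256 hashes of its certificates. PGP verification of the scan upload is left out of this sketch.

```python
import hashlib
from typing import Iterable, Set


class CertHashCache:
    """In-memory stand-in for the Memcached instance."""

    def __init__(self) -> None:
        self._hashes: Set[bytes] = set()

    def add_to_cache(self, cert_hashes: Iterable[bytes]) -> None:
        self._hashes.update(cert_hashes)

    def check_cache(self, cert: bytes) -> bool:
        return hashlib.sha256(cert).digest() in self._hashes


def handle_add_scan(cache: CertHashCache, cert_hashes: Iterable[bytes]) -> None:
    # After the scan is accepted, remember which certs it announced.
    cache.add_to_cache(cert_hashes)


def handle_add_cert(cache: CertHashCache, cert: bytes) -> bool:
    # Reject certificates that no previously uploaded scan refers to.
    return cache.check_cache(cert)
```

Because only contributors with a key in the key ring can upload scans, tying every certificate to an existing scan means anonymous certificate uploads are rejected, which is how this check supports the spam protection goal.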
SLIDE 24
Backup
Design details
Roles:
- Log owner (create and delete log, add admins)
- Log admin (freeze log, add contributors)
- Log contributor (upload data)
- Log user (download data)
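The role list can be written down as a small permission table. This is a sketch only: the action names are invented for illustration, and each role is given exactly the actions listed on the slide, since the talk does not say whether higher roles inherit the permissions of lower ones.

```python
from enum import Enum
from typing import Dict, FrozenSet


class Role(Enum):
    OWNER = "owner"
    ADMIN = "admin"
    CONTRIBUTOR = "contributor"
    USER = "user"


# Hypothetical action names; the grouping follows the slide.
PERMISSIONS: Dict[Role, FrozenSet[str]] = {
    Role.OWNER: frozenset({"create_log", "delete_log", "add_admin"}),
    Role.ADMIN: frozenset({"freeze_log", "add_contributor"}),
    Role.CONTRIBUTOR: frozenset({"upload_data"}),
    Role.USER: frozenset({"download_data"}),
}


def is_allowed(role: Role, action: str) -> bool:
    return action in PERMISSIONS[role]
```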
Attacker model:
- External attacker: User of or contributor to log → Alter (own) contributions
- Internal attacker: Owner or admin of the log → Alter contributions to fit long-term measurement results
- Internal data corruption: Bit flips, accidental overrides, ...
SLIDE 25
Backup
Figure 10: Structure of an inclusion promise.
SLIDE 26
Bibliography
[1] CT Monitor. ct.grahamedgecombe.com. Accessed: 2018-07-15.
[2] BSI. M 4.170. https://www.bsi.bund.de/DE/Themen/ITGrundschutz/ITGrundschutzKataloge/Inhalt/_content/m/m04/m04170.html.
[3] Z. Durumeric, D. Adrian, A. Mirian, M. Bailey, and J. A. Halderman. A search engine backed by Internet-wide scanning. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, pages 542–553. ACM, 2015.
[4] B. Laurie and E. Kasper. Revocation transparency. Google Research, September, 2012.
[5] U. TUM. Forschungsdatenmanagement. https://www.ub.tum.de/forschungsdaten-archivieren.
[6] B. VanderSloot, J. Amann, M. Bernhard, Z. Durumeric, M. Bailey, and J. A. Halderman. Towards a Complete View of the Certificate Ecosystem. In Proceedings of the 2016 ACM on Internet Measurement Conference, pages 543–549. ACM, 2016.
[7] D. Zhang. Certificate Transparency for Domain Name System Security Extensions. Work In Progress, 2016. https://tools.ietf.org/html/draft-zhang-trans-ct-dnssec-03.