Censys Retrospective Zakir Durumeric

Censys Timeline
• 2013: ZMap Internet Scanner Release. We release ZMap, an open source network scanner capable of scanning IPv4 on one port in 45 minutes.
• 2014: Internet-Wide Scan Data Repository. We launch scans.io, a repository of active Internet scan data. Initially Michigan and Rapid7 data.
• 2015: Censys Public Launch. We launch initial version of Censys query engine. Initially contains records for IPv4 hosts and Alexa.
• 2018: Censys, Inc. We realize we built a monster we can’t maintain. Censys spins out into standalone org.
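As a rough check on the 45-minute figure in the timeline, the average probe rate it implies can be worked out directly. This is a back-of-the-envelope sketch, not a description of ZMap's internals: it assumes one probe per address across the full 32-bit space, which is an upper bound because real scans skip reserved and excluded ranges. The function name is made up for illustration.

```python
# Back-of-the-envelope probe rate for a single-port IPv4 sweep.
IPV4_ADDRESSES = 2 ** 32   # full 32-bit address space (an upper bound)
SCAN_SECONDS = 45 * 60     # the 45-minute figure from the timeline


def required_probe_rate(addresses: int, seconds: int) -> float:
    """Return the average probes per second needed to cover `addresses`."""
    return addresses / seconds


rate = required_probe_rate(IPV4_ADDRESSES, SCAN_SECONDS)
print(f"{rate / 1e6:.2f} million probes per second")  # prints "1.59 million probes per second"
```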

Censys Launch (2015)
Observations
• Painful to run ZMap scans in the real world
• We regularly answer questions for others
• Researchers who cannot perform scans also cannot download 1TB datasets
Goals
• Primary: enable researchers to easily answer their own questions about Internet and web composition
• Secondary: consistently collect and store scan data to answer our own questions
Deployed Solution
• Scan popular protocols weekly and annotate with device metadata
• Stitch scans into a single cohesive dataset and annotate with IP metadata
• Provide web search, BigQuery SQL interface, and raw data downloads
Initial Coverage
• HTTP, HTTPS, CWMP, POP3, IMAP, SMTP, FTP, Telnet, SSH, Modbus, DNP3, as well as TLS weaknesses like Heartbleed
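The "stitch scans into a single cohesive dataset" step can be pictured as a merge keyed by IP address. The sketch below is only an illustration under assumed inputs: the record shapes, field names such as `asn`, and the function `stitch_scans` are hypothetical and are not taken from the Censys codebase.

```python
from collections import defaultdict


def stitch_scans(scans, ip_metadata):
    """Merge per-protocol scan results into one record per IP address.

    scans: dict mapping protocol name -> {ip: protocol-specific result}
    ip_metadata: dict mapping ip -> metadata dict (hypothetical fields)
    """
    hosts = defaultdict(lambda: {"protocols": {}, "metadata": {}})
    # Fold every protocol's results into the record for the same IP.
    for protocol, results in scans.items():
        for ip, result in results.items():
            hosts[ip]["protocols"][protocol] = result
    # Annotate each stitched host with whatever IP metadata is known.
    for ip, record in hosts.items():
        record["metadata"] = ip_metadata.get(ip, {})
    return dict(hosts)


# Example inputs use documentation-only addresses and made-up values.
scans = {
    "https": {"192.0.2.10": {"tls_version": "TLSv1.2"}},
    "ssh": {
        "192.0.2.10": {"banner": "SSH-2.0-example"},
        "192.0.2.20": {"banner": "SSH-2.0-example"},
    },
}
ip_metadata = {"192.0.2.10": {"asn": 64500}}

dataset = stitch_scans(scans, ip_metadata)
```

A host seen by only one scan still gets a record, and a host with no known metadata gets an empty annotation rather than being dropped.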

Censys Architecture (2015)
Diagram components, as labeled on the slide:
• Celery Scheduler
• Scan pipeline, repeated per scan: ZMap (IPs) → ZGrab (App) → ZTag (Annotated)
• 3 Servers
• 12 GCP Instances
• RocksDB Based Storage Engine (1 Server)
• Certificate Transparency
• Raw Storage (ZFS + NFS)
• scans.io
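The three-stage flow in the diagram (host discovery, application-layer grab, annotation) can be modeled as chained generators. This is a toy model only: the functions below are stand-ins that read from in-memory sample data, perform no network access, and do not reflect the real interfaces of ZMap, ZGrab, or ZTag.

```python
def discover(candidates, responsive):
    """Stage 1 (ZMap's role): yield addresses that answered a probe."""
    for ip in candidates:
        if ip in responsive:
            yield ip


def grab(ips, banners):
    """Stage 2 (ZGrab's role): attach application-layer data to each address."""
    for ip in ips:
        yield {"ip": ip, "banner": banners.get(ip, "")}


def tag(records):
    """Stage 3 (ZTag's role): annotate each record with derived tags."""
    for record in records:
        tags = ["ssh"] if record["banner"].startswith("SSH-") else []
        yield {**record, "tags": tags}


# Made-up sample data standing in for live scan results.
candidates = ["192.0.2.1", "192.0.2.2", "192.0.2.3"]
responsive = {"192.0.2.1", "192.0.2.3"}
banners = {"192.0.2.1": "SSH-2.0-example"}

annotated = list(tag(grab(discover(candidates, responsive), banners)))
```

Because each stage is a generator, records flow through one at a time instead of being materialized between stages.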

Where did our time go?
Successes
• Scanning infrastructure
• Easy to schedule scans and capture raw data about hosts
• Hosting data in Google BigQuery
• Helping researchers and non-researchers understand hosts
• Operator response
Challenges
• Data pipeline maintenance. Difficult to build/deploy pipeline for handling data with a changing schema
• Stitching scans together from a one week period. Far too much noise.
• Building APIs that meet everyone’s different needs. Merging datasets.
• Very difficult to allow “fair” usage to large numbers of users

Reflection
Was Censys Successful?
• Yes, but I don’t think we built the best tool for researchers
What would I do differently?
• Be more opinionated. Focus solely on getting data into Google BigQuery
• Never store data in files, worry about web interface, or design APIs
• Move slowly transforming schema problems from collection to query time
• Pure Go-based solution that we could verify at compilation time
• Build fully streaming solution with sharded append-only BigQuery log
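Two of the ideas above, an append-only sharded log and handling schema differences at query time rather than at collection time, can be sketched together. This is an in-memory illustration under stated assumptions, not the design the talk had in mind for BigQuery: the shard count, the class `AppendOnlyLog`, and the field names `server` and `http_server` are all hypothetical.

```python
import zlib

NUM_SHARDS = 4  # hypothetical shard count


def shard_for(ip: str) -> int:
    """Pick a stable shard for an IP so one host's records stay together."""
    return zlib.crc32(ip.encode("utf-8")) % NUM_SHARDS


class AppendOnlyLog:
    """In-memory stand-in for a sharded, append-only record log."""

    def __init__(self):
        self.shards = [[] for _ in range(NUM_SHARDS)]

    def append(self, ip: str, raw: dict) -> None:
        # Store the record as collected; existing entries are never updated.
        self.shards[shard_for(ip)].append((ip, dict(raw)))

    def query(self, ip: str, normalize):
        # Apply the schema only when reading, so schema changes need no rewrite.
        for stored_ip, raw in self.shards[shard_for(ip)]:
            if stored_ip == ip:
                yield normalize(raw)


def normalize(raw: dict) -> dict:
    """Query-time schema: accept both an old and a new field name."""
    return {"server": raw.get("server") or raw.get("http_server", "")}


log = AppendOnlyLog()
log.append("192.0.2.10", {"http_server": "nginx"})  # older collector output
log.append("192.0.2.10", {"server": "nginx"})       # newer collector output
rows = list(log.query("192.0.2.10", normalize))
```

Both records come back in the same shape even though the collectors disagreed on the field name, and nothing already written had to be migrated.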

Some Thoughts on Technology
Google BigQuery
• Split storage from processing. Allows us to publish data and let researchers do their own querying, merging with their datasets.
• Fast. We’ll upload and run SQL instead of writing a local script. One headache: max 10K columns.
Colaboratory
• Hosted, easy to use notebook-based analysis
Elasticsearch
• $$ to scale. ~48 hosts for 20TB. Need to define your own DSL, not use Lucene’s, to be useful.
Kafka
• Scales wonderfully, but library support isn’t necessarily stable. Difficult to not drop data.
Go Language
• None of this would have happened without Go. We will not use C/C++/Python for anything real today.
Off the Shelf Databases
• Popular databases like Mongo, Cassandra, InfluxDB do not scale cheaply. BigTable works. Excited about FoundationDB, ClickHouse.
Apache Beam
• Merges ideas from most other processing frameworks. Combines both streaming + batch.
Airflow
• Best DAG-based scheduler. Still young. Many companies do this type of scheduling today.
JSON
• Nightmare streaming. Now use Protobuf and Avro.
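The 10K column headache is easy to run into when each host record nests per-port and per-protocol fields, since every leaf field becomes its own column once flattened. The sketch below counts flattened columns for a nested record. The record layout and the helper `flatten_columns` are hypothetical and only illustrate how the column count grows.

```python
MAX_COLUMNS = 10_000  # the column limit mentioned on the slide


def flatten_columns(record, prefix=""):
    """Return dotted column names for every leaf field in a nested dict."""
    columns = []
    for key, value in record.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse so each nested leaf gets its own dotted column name.
            columns.extend(flatten_columns(value, name))
        else:
            columns.append(name)
    return columns


# Made-up host record: one sub-record per port and protocol.
host = {
    "ip": "192.0.2.10",
    "p443": {"https": {"tls": {"version": "TLSv1.2", "cipher": "example"}}},
    "p22": {"ssh": {"banner": "SSH-2.0-example"}},
}
columns = flatten_columns(host)
within_limit = len(columns) <= MAX_COLUMNS
```

With many ports, protocols, and parsed fields per protocol, the product of those three grows quickly, which is one plausible way a host schema approaches the limit.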

Censys, Inc.
Story
• We spun Censys out into an Ann Arbor based company at the start of 2018
• Provide raw data about IPs/certificates and building security services
Additional Coverage
• Open source application layer scanners
• Added RDBMS, NoSQL, printers, remote access, system protocols and light-weight scanning of top 1K ports
Community Interaction
• Discontinued unrestricted public access to raw data and unlimited API access
• Provide full access to raw data and BigQuery tables for non-commercial researchers. Generally short email.

Research Requests
• 223 research requests (CY’18)
• 143 (64%) from academic groups
• Granted vast majority of requests
Denied Requests
• Typically doing research on behalf of large company for Black Hat etc.
• Non-academic individual with no clear objective
Challenges
• Groups have varying definitions of research. What about research at for-profit companies?
• Significant language barriers for a non-negligible number of requests.
• Groups are resistant to BigQuery and bandwidth costs are non-negligible. ~$70 to download 1TB from GCP.
• Difficult to turn down support requests
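The figures on this slide can be combined into a quick estimate of why bandwidth matters. The sketch below recomputes the academic share and then prices a purely hypothetical scenario in which every request downloads one 1TB dataset at the slide's approximate rate. That scenario is an assumption for illustration, not something the talk reports.

```python
COST_PER_TB_USD = 70     # approximate per-terabyte figure from the slide
TOTAL_REQUESTS = 223     # research requests in CY'18
ACADEMIC_REQUESTS = 143  # requests from academic groups


def download_cost(terabytes: float, cost_per_tb: float = COST_PER_TB_USD) -> float:
    """Estimate the bandwidth cost in USD for downloading `terabytes` of data."""
    return terabytes * cost_per_tb


academic_share = ACADEMIC_REQUESTS / TOTAL_REQUESTS
# Hypothetical scenario: each request results in exactly one 1TB download.
cost_all = download_cost(1) * TOTAL_REQUESTS

print(f"academic share: {academic_share:.0%}")               # prints "academic share: 64%"
print(f"223 one-terabyte downloads: ~${cost_all:,.0f}")      # prints "223 one-terabyte downloads: ~$15,610"
```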
