VAST: Interactive Network Forensics, Matthias Vallentin

  1. VAST: Interactive Network Forensics
     Matthias Vallentin, matthias@bro.org
     BroCon, August 5, 2015

  2. Demo I (2 / 26)

  3. Data Pyramid (3 / 26)
     Layers from top (low fidelity) to bottom (high fidelity); the pyramid's width is labeled Data Volume:
     ◮ Filtered Data
     ◮ Aggregated Data
     ◮ Structured Data
     ◮ Raw Data

  4. Data Pyramid (4 / 26)
     Layers from top (low fidelity) to bottom (high fidelity); the pyramid's width is labeled Data Volume:
     ◮ Alarms
     ◮ Bro Logs
     ◮ Bro Events
     ◮ Packets

  5. Data Pyramid (5 / 26)
     Layers from top (low fidelity) to bottom (high fidelity); the pyramid's width is labeled Data Volume:
     ◮ Exit Status
     ◮ Process Events
     ◮ System Calls
     ◮ Instruction Stream

  6. VAST: Visibility Across Space and Time (6 / 26)
     Components: Import, Archive, Index, Export
     Key Features
     ◮ Interactive response times
     ◮ Type-rich data model
     ◮ Horizontal scaling over a cluster
     ◮ Strongly typed query language
     ◮ Iterative query refinement
     ◮ Historical & continuous queries

  10. High-Level Architecture of VAST (7 / 26)
      Sample events shown in the diagram:
        10.0.0.1  10.0.0.254  53/udp
        10.0.0.2  10.0.0.254  80/tcp
      Import
      ◮ Sources produce events
      ◮ PCAP, Bro logs, BGPdump, ...
      Archive
      ◮ Key-value store (IDs → events)
      ◮ Stores raw data as events
      Index
      ◮ Bitmap indexes over event data
      ◮ Hits are event IDs in archive
      Export
      ◮ Sinks consume events
      ◮ PCAP, Bro logs, ASCII, JSON
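
The import, archive, index, and export flow on this slide can be sketched in a few lines. This is a toy, in-memory illustration only, not VAST's implementation: the class `ToyStore`, its method names, the field names `resp_h` and `port`, and the use of Python sets in place of bitmap indexes are all assumptions made for the example.

```python
from collections import defaultdict


class ToyStore:
    """Hypothetical toy model of the import -> archive/index -> export flow."""

    def __init__(self):
        self.archive = {}              # key-value store: event ID -> raw event
        self.index = defaultdict(set)  # (field, value) -> set of event IDs
        self.next_id = 0

    def import_event(self, event):
        """Assign an ID, store the event in the archive, and index each field."""
        event_id = self.next_id
        self.next_id += 1
        self.archive[event_id] = event
        for field, value in event.items():
            self.index[(field, value)].add(event_id)
        return event_id

    def lookup(self, field, value):
        """Index lookup: the hits are event IDs, not events."""
        return set(self.index.get((field, value), set()))

    def export(self, ids):
        """Resolve event IDs against the archive, in ID order."""
        return [self.archive[i] for i in sorted(ids)]


store = ToyStore()
store.import_event({"orig_h": "10.0.0.1", "resp_h": "10.0.0.254", "port": "53/udp"})
store.import_event({"orig_h": "10.0.0.2", "resp_h": "10.0.0.254", "port": "80/tcp"})

hits = store.lookup("port", "80/tcp")  # IDs only
events = store.export(hits)            # full events, fetched from the archive
```

Keeping the index separate from the archive means a query touches only IDs until the final export step, which is the split the slide describes.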

  13. VAST & Big Data (8 / 26)
      MapReduce (Hadoop)
      Batch-oriented processing: full scan of data
      + Expressive: no restriction on algorithms
      - Speed & Interactivity: full scan for each query
      In-memory Cluster Computing (Spark)
      Load full data set into memory and then run query
      + Speed & Interactivity: fast on arbitrary queries over working set
      - Thrashing when working set too large
      Distributed Indexing (VAST)
      Distributed building and querying of bitmap indexes
      + Fast: only access space-efficient indexes
      + Caching of index hits enables iterative analyses
      - Lookup only, not arbitrary computation

  16. VAST & SIEM (9 / 26)
                   Splunk              ElasticSearch       VAST
      Data Model   Unstructured text   Rich (Lucene)       Rich (Bro)
      Index        B-tree              Inverted (Lucene)   Bitmap Indexes
      Computation  MapReduce           Index Lookup        Index Lookup
      Code         Closed-source       Open-source         Open-source
      License      Data-volume based   Apache 2.0          BSD (3-clause)

  17. Types: Interpretation of Data (10 / 26)
      Basic types: bool, int, count, real, duration, time, string, pattern, address, subnet, port, none
      Container types: vector (of TYPE), set (of TYPE), table (KEY TYPE → VALUE TYPE)
      Compound types: record (field 1: TYPE, …, field n: TYPE)
      Recursive types: containers and records nest other types (TYPE)

  18. Query Language (11 / 26)
      Boolean Expressions
      ◮ Conjunctions &&
      ◮ Disjunctions ||
      ◮ Negations !
      ◮ Predicates
        ◮ LHS op RHS
        ◮ (expr)
      Examples
      ◮ A && B || !(C && D)
      ◮ orig_h == 10.0.0.1 && &time < now - 2h
      ◮ &type == "conn" || "foo" in :string
      ◮ duration > 60s && service == "tcp"
      Extractors
      ◮ &type
      ◮ &time
      ◮ x.y.z.arg
      ◮ :type
      Relational Operators
      ◮ <, <=, ==, >=, >
      ◮ in, ni, [+, +]
      ◮ !in, !ni, [-, -]
      ◮ ~, !~
      Values
      ◮ T, F
      ◮ +42, 1337, 3.14
      ◮ "foo"
      ◮ 10.0.0.0/8
      ◮ 80/tcp, 53/?
      ◮ {1, 2, 3}
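
To make the structure of these expressions concrete, here is a minimal sketch of how a predicate tree could be evaluated against a single event. It is not VAST's parser or evaluator: the classes `Predicate`, `And`, `Or`, and `Not` are hypothetical, events are plain dictionaries, and the duration `60s` is represented as the float `60.0`.

```python
import operator
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Predicate:
    """LHS op RHS, where LHS names an event field (a simplified extractor)."""
    field: str
    op: Callable[[Any, Any], bool]
    value: Any

    def evaluate(self, event: dict) -> bool:
        # A predicate on a field the event lacks does not match.
        return self.field in event and self.op(event[self.field], self.value)


@dataclass
class And:
    lhs: Any
    rhs: Any

    def evaluate(self, event: dict) -> bool:
        return self.lhs.evaluate(event) and self.rhs.evaluate(event)


@dataclass
class Or:
    lhs: Any
    rhs: Any

    def evaluate(self, event: dict) -> bool:
        return self.lhs.evaluate(event) or self.rhs.evaluate(event)


@dataclass
class Not:
    operand: Any

    def evaluate(self, event: dict) -> bool:
        return not self.operand.evaluate(event)


# Hand-built tree for: duration > 60s && service == "tcp"
expr = And(
    Predicate("duration", operator.gt, 60.0),
    Predicate("service", operator.eq, "tcp"),
)

events = [
    {"duration": 90.0, "service": "tcp"},
    {"duration": 5.0, "service": "tcp"},
    {"duration": 120.0, "service": "udp"},
]
matches = [e for e in events if expr.evaluate(e)]
non_matches = [e for e in events if Not(expr).evaluate(e)]
```

This sketch scans events one by one; the slides that follow describe answering the same predicates through bitmap index lookups instead.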

  19. Index Hits: Sets of Event IDs (12 / 26)
      Bitvector: ordered set of IDs
      ◮ Query result ≡ set of event IDs from [0, 2^64 − 1)
        → Model as bit vector: [4, 7, 8] = 0000100110···
      ◮ Run-length encoded
      ◮ Append-only
      ◮ Bitwise operations do not require decoding
      Bitmap: maps values to bit vectors
      ◮ push_back(T x): append value x of type T
      ◮ lookup(T x, Op ∘): get bit vector for x under ∘
      Example (one row per appended value; column B_v holds a 1 where the value equals v):
        Data  B_0  B_1  B_2  B_3
        2     0    0    1    0
        1     0    1    0    0
        2     0    0    1    0
        0     1    0    0    0
        0     1    0    0    0
        1     0    1    0    0
        3     0    0    0    1
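
The bitmap described on this slide can be sketched as follows. This is an illustrative toy, not VAST's data structure: Python integers stand in for bit vectors (bit i is set when the i-th appended value matches), there is no run-length encoding, and the string form of the `op` argument and the helper `to_ids` are assumptions for the example.

```python
class Bitmap:
    """Toy equality-encoded bitmap: one bit vector per distinct value."""

    def __init__(self):
        self.vectors = {}  # value -> int used as a bit vector
        self.size = 0      # number of values appended so far

    def push_back(self, x):
        """Append value x: set bit number `size` in the vector for x."""
        self.vectors[x] = self.vectors.get(x, 0) | (1 << self.size)
        self.size += 1

    def lookup(self, x, op="=="):
        """Return the bit vector of positions whose value satisfies `value op x`."""
        if op == "==":
            return self.vectors.get(x, 0)
        if op == "!=":
            all_bits = (1 << self.size) - 1
            return all_bits & ~self.vectors.get(x, 0)
        if op == "<":
            result = 0
            for value, bits in self.vectors.items():
                if value < x:
                    result |= bits
            return result
        raise ValueError(f"unsupported operator: {op}")


def to_ids(bits):
    """Decode a bit vector into the sorted list of set positions (event IDs)."""
    return [i for i in range(bits.bit_length()) if (bits >> i) & 1]


bitmap = Bitmap()
for value in [2, 1, 2, 0, 0, 1, 3]:  # the Data column from the slide
    bitmap.push_back(value)

equal_two = to_ids(bitmap.lookup(2))       # positions where the value is 2
less_two = to_ids(bitmap.lookup(2, "<"))   # positions where the value is 0 or 1
```

With the slide's data, the vector for value 2 has bits 0 and 2 set, matching the B_2 column of the table.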

  20. Composing Results via Bitwise Operations (13 / 26)
      Combining Predicates
      ◮ Query Q = X ∧ Y ∧ Z
      ◮ x = 1.2.3.4 ∧ y < 42 ∧ z ∈ "foo"
      ◮ Bitmap index lookup yields X → B_1, Y → B_2, and Z → B_3
      ◮ Result R = B_1 & B_2 & B_3
      Diagram: B_1 & B_2 & B_3 = R
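
A small worked example of the combination step, using Python integers as stand-in bit vectors. The hit sets below are made up for illustration, and the helpers `to_bits` and `to_ids` are hypothetical names, not part of VAST.

```python
def to_bits(ids):
    """Encode a collection of event IDs as a bit vector (a Python int)."""
    bits = 0
    for i in ids:
        bits |= 1 << i
    return bits


def to_ids(bits):
    """Decode a bit vector back into a sorted list of event IDs."""
    return [i for i in range(bits.bit_length()) if (bits >> i) & 1]


# Hypothetical index hits for the three predicates X, Y, and Z.
b1 = to_bits([4, 7, 8])     # hits for X
b2 = to_bits([1, 4, 8, 9])  # hits for Y
b3 = to_bits([0, 4, 8])     # hits for Z

# A conjunction of predicates is a bitwise AND of their hit vectors.
r = b1 & b2 & b3
result_ids = to_ids(r)
```

Only IDs 4 and 8 appear in all three hit sets, so they are the only bits left in the result. A disjunction would use `|` in the same way.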

  24. What happened since BroCon'14? (14 / 26)
      New Features
      ◮ Continuous queries
        ◮ Apply queries to arriving data
      ◮ Time Machine
        ◮ Full indexes on time stamp and connection tuple
        ◮ Bidirectional flow cut-off
      ◮ New event sources
        ◮ BGPdump
        ◮ JSON/Kafka (not yet merged)
      ◮ Distributed Architecture
        ◮ Commutativity: support message reordering
        ◮ Associativity: parallel query engine
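
One possible reading of the commutativity and associativity bullets, which the slide does not spell out, is that partial results can be merged in any order and in any grouping. The sketch below assumes partial index hits are bit vectors merged with bitwise OR; the hit values and the `merge` function are hypothetical.

```python
import random
from functools import reduce

# Hypothetical partial hits, e.g. one bit vector per index partition.
partial_hits = [0b0001, 0b0100, 0b0101, 0b1000]


def merge(a, b):
    """Combine two partial results; bitwise OR is commutative and associative."""
    return a | b


# Merge in arrival order.
in_order = reduce(merge, partial_hits, 0)

# Commutativity: a reordered message stream yields the same result.
shuffled = partial_hits[:]
random.shuffle(shuffled)
reordered = reduce(merge, shuffled, 0)

# Associativity: two workers can each merge a subset, then combine.
left = reduce(merge, partial_hits[:2], 0)
right = reduce(merge, partial_hits[2:], 0)
grouped = merge(left, right)
```

All three variables end up holding the same bit vector, so the merge needs no ordering guarantees from the transport and can be split across workers.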

  26. Distributed VAST (15 / 26)
      Diagram: a node containing importer (I), archive (A), index (X), and exporter (E)
      node: the logical unit of deployment
      ◮ A container for actors/components
      ◮ Message serialization only at node boundaries
      → Maps to single OS process, typically one per machine

  27. Distributed VAST: Replicated Cores (16 / 26)
      Diagram: three nodes, each containing importer (I), archive (A), index (X), and exporter (E)

  28. Distributed VAST: Replicated Cores (17 / 26)
      Diagram: three nodes, each containing importer (I), archive (A), index (X), and exporter (E), plus an external source and sink

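  29. Distributed VAST: Custom Deployment (18 / 26)
      Diagram: two sources, three importers (I), two archives (A), three indexes (X), two exporters (E), and one sink; storage is labeled HDD (once) and SSD (twice)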

  30. Demo II (19 / 26)

  31. Demo Topology: Import (20 / 26)
      Diagram: nodes foo and bar, each with importer (I), archive (A), and index (X); one component labeled ID

  32. Demo Topology: Import (21 / 26)
      Diagram: a source feeding nodes foo and bar, each with importer (I), archive (A), and index (X); one component labeled ID

  33. Demo Topology: Export (naive) (22 / 26)
      Diagram: nodes foo and bar, each with archive (A) and index (X); a single exporter (E) and a sink

  34. Demo Topology: Export (better) (23 / 26)
      Diagram: nodes foo and bar, each with archive (A) and index (X); two exporters (E)

  35. Demo Topology: Export (good) (24 / 26)
      Diagram: nodes foo and bar, each with archive (A) and index (X); two exporters (E) and a sink

  38. Future Work: Moving Forward (25 / 26)
      Next Milestone: Release
      ◮ Architecture converging: feature freeze for 0.1 soon
      ◮ Thorough testing of distributed architecture
      ◮ Improve index size of strings and containers
      Down The Line
      ◮ Improved Bro integration
        ◮ Unify data model with Broker
        ◮ VAST writer for Bro
      ◮ Fault tolerance
        ◮ Data replication (replicate archive & index)
        ◮ Query snapshotting (resume failed execution)
        ◮ Use Raft to manage global state (large-scale clusters)
