VAST: A Unified Platform for Interactive Network Forensics

VAST: A Unified Platform for Interactive Network Forensics. Matthias Vallentin (1, 2), Vern Paxson (1, 2), Robin Sommer (2, 3). (1) UC Berkeley, (2) International Computer Science Institute (ICSI), (3) Lawrence Berkeley National Laboratory (LBNL). March 17, 2016.


slide-1
SLIDE 1

VAST

A Unified Platform for Interactive Network Forensics

Matthias Vallentin (1, 2), Vern Paxson (1, 2), Robin Sommer (2, 3)

(1) UC Berkeley
(2) International Computer Science Institute (ICSI)
(3) Lawrence Berkeley National Laboratory (LBNL)

March 17, 2016 USENIX NSDI

1 / 28

slide-2
SLIDE 2

Omnipresent Data Breaches

2 / 28

slide-3
SLIDE 3

Breach Timeline

Compromise Forensics Detection Time

3 / 28

slide-5
SLIDE 5

Breach Timeline

Compromise Detection Time

?

3 / 28

slide-8
SLIDE 8

Network Forensics — Characteristics

Organization

4 / 28

slide-15
SLIDE 15

Network Forensics — Characteristics

Interactive data exploration

◮ Iterative query refinement
◮ High-dimensional search

Disparate data access

◮ Temporal
◮ Spatial

Massive data volumes

◮ 50–100K events/sec
◮ 10s of TBs/day

?

4 / 28

slide-16
SLIDE 16

Log Example — Bro Connection Log

#separator \x09
#set_separator ,
#empty_field (empty)
#unset_field -
#path conn
#open 2016-01-06-15-28-58
#fields ts uid id.orig_h id.orig_p id.resp_h id.resp_p proto service duration orig_bytes resp_bytes conn_..
#types time string addr port addr port enum string interval count count string bool bool count string
1258531.. Cz7SRx3.. 192.168.1.102 68 192.168.1.1 67 udp dhcp 0.163820 301 300 SF - - 0 Dd 1 329 1 328 (empty)
1258531.. CTeURV1.. 192.168.1.103 137 192.168.1.255 137 udp dns 3.780125 350 0 S0 - - 0 D 7 546 0 0 (empty)
1258531.. CUAVTq1.. 192.168.1.102 137 192.168.1.255 137 udp dns 3.748647 350 0 S0 - - 0 D 7 546 0 0 (empty)
1258531.. CYoxAZ2.. 192.168.1.103 138 192.168.1.255 138 udp - 46.725380 560 0 S0 - - 0 D 3 644 0 0 (empty)
1258531.. CvabDq2.. 192.168.1.102 138 192.168.1.255 138 udp - 2.248589 348 0 S0 - - 0 D 2 404 0 0 (empty)
1258531.. CViJEOm.. 192.168.1.104 137 192.168.1.255 137 udp dns 3.748893 350 0 S0 - - 0 D 7 546 0 0 (empty)
1258531.. CSC2Hd4.. 192.168.1.104 138 192.168.1.255 138 udp - 59.052898 549 0 S0 - - 0 D 3 633 0 0 (empty)
1258531.. Cd3RNm1.. 192.168.1.103 68 192.168.1.1 67 udp dhcp 0.044779 303 300 SF - - 0 Dd 1 331 1 328 (empty)
1258531.. CEwuIl2.. 192.168.1.102 138 192.168.1.255 138 udp - - - - S0 - - 0 D 1 229 0 0 (empty)
1258532.. CXxLc94.. 192.168.1.104 68 192.168.1.1 67 udp dhcp 0.002103 311 300 SF - - 0 Dd 1 339 1 328 (empty)
1258532.. CIFDQJV.. 192.168.1.102 1170 192.168.1.1 53 udp dns 0.068511 36 215 SF - - 0 Dd 1 64 1 243 (empty)
1258532.. CXFISh5.. 192.168.1.104 1174 192.168.1.1 53 udp dns 0.170962 36 215 SF - - 0 Dd 1 64 1 243 (empty)
1258532.. CQJw4C3.. 192.168.1.1 5353 224.0.0.251 5353 udp dns 0.100381 273 0 S0 - - 0 D 2 329 0 0 (empty)
1258532.. ClfEd43.. fe80::219:e3ff:fee7:5d23 5353 ff02::fb 5353 udp dns 0.100371 273 0 S0 - - 0 D 2 369 0 0
1258532.. C67zf02.. 192.168.1.103 137 192.168.1.255 137 udp dns 3.873818 350 0 S0 - - 0 D 7 546 0 0 (empty)
1258532.. CG1FKF1.. 192.168.1.102 137 192.168.1.255 137 udp dns 3.748891 350 0 S0 - - 0 D 7 546 0 0 (empty)
1258532.. CNFkeF2.. 192.168.1.103 138 192.168.1.255 138 udp - 2.257840 348 0 S0 - - 0 D 2 404 0 0 (empty)
1258532.. Cq4eis4.. 192.168.1.102 1173 192.168.1.1 53 udp dns 0.000267 33 497 SF - - 0 Dd 1 61 1 525 (empty)
1258532.. CHpqv31.. 192.168.1.102 138 192.168.1.255 138 udp - 2.248843 348 0 S0 - - 0 D 2 404 0 0 (empty)
1258532.. CFoJjT3.. 192.168.1.1 5353 224.0.0.251 5353 udp dns 0.099824 273 0 S0 - - 0 D 2 329 0 0 (empty)
1258532.. Cc3Ayyz.. fe80::219:e3ff:fee7:5d23 5353 ff02::fb 5353 udp dns 0.099813 273 0 S0 - - 0 D 2 369 0 0

5 / 28
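
To make the log layout above concrete, here is a minimal Python sketch that reads the header lines (#separator, #fields) and turns each record into a dictionary. The function name parse_bro_log and the sample values are hypothetical, and the sketch assumes a tab-separated log as announced by the #separator line; it is not the parser VAST uses.

```python
from typing import Dict, Iterator, List


def parse_bro_log(lines: List[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per record, keyed by the names in the #fields header."""
    separator = "\t"  # assumed default; overridden by a #separator header
    fields: List[str] = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("#separator"):
            # The separator is written as an escape such as \x09 (tab).
            raw = line.split(" ", 1)[1]
            separator = raw.encode("ascii").decode("unicode_escape")
        elif line.startswith("#fields"):
            fields = line.split(separator)[1:]
        elif line.startswith("#"):
            continue  # other metadata: #types, #path, #open, ...
        else:
            yield dict(zip(fields, line.split(separator)))


# Hypothetical, shortened sample (made-up ts and uid, fewer columns).
sample = [
    "#separator \\x09",
    "#fields\tts\tuid\tid.orig_h\tid.orig_p\tproto\tservice",
    "#types\ttime\tstring\taddr\tport\tenum\tstring",
    "0.000000\tCexample\t192.168.1.102\t68\tudp\tdhcp",
]
records = list(parse_bro_log(sample))
```

Each record keeps its values as strings; mapping them to typed values via the #types line would be the next step toward the strongly typed events discussed later in the talk.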

slide-18
SLIDE 18

Existing Solutions

MapReduce (Hadoop)

✓ Scalability
✗ Batch-oriented: no iterative, exploratory analysis

In-Memory Cluster Computing (Spark)

✓ Efficient & complex analysis
✗ Thrashing when working set does not fit in aggregate memory

6 / 28

slide-21
SLIDE 21

Contribution

VAST

Visibility Across Space and Time

Architecture

◮ Performance: concurrent & modular design
◮ Scaling: intra-machine & inter-machine
◮ Typing: strong & rich

Implementation

◮ Composition: high-level bitmap indexing framework
◮ Adaptation: fine-grained component flow-control
◮ Asynchrony: finite state machines for query execution

7 / 28

slide-22
SLIDE 22

Outline

  1. Architecture
  2. Implementation
  3. Evaluation
slide-24
SLIDE 24

VAST Architecture — Single Machine

[Figure: a single VAST node with the components source, importer, archive, index, exporter, and sink. Example events: 10.0.0.1 10.0.0.254 53/udp and 10.0.0.2 10.0.0.254 80/tcp.]

8 / 28

slide-29
SLIDE 29

VAST Architecture — Ingestion

Input: 10.0.0.1 53/udp, 10.0.0.2 80/tcp, …

◮ source: generate event batch
◮ importer: assign IDs
◮ archive: compress batch
◮ index: append data to bitmap index

Example events in a batch: (type, 10.0.0.1, 53/udp, meta), (type, 10.0.0.2, 80/tcp, meta)
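
The ingestion path on this slide (assign IDs, compress the batch, append to a bitmap index) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not VAST's implementation: the class names Importer, Archive, and Index are hypothetical, batches are assumed non-empty, JSON plus zlib stands in for the real serialization and compression, and a Python integer stands in for a compressed bitmap.

```python
import json
import zlib
from collections import defaultdict
from typing import Dict, List, Tuple

Event = Tuple[str, str]  # (address, port), as in the slide's example


class Importer:
    """Assigns consecutive, unique IDs to the events of each batch."""

    def __init__(self) -> None:
        self.next_id = 0

    def assign_ids(self, batch: List[Event]) -> List[Tuple[int, Event]]:
        tagged = [(self.next_id + i, event) for i, event in enumerate(batch)]
        self.next_id += len(batch)
        return tagged


class Archive:
    """Stores each batch compressed, keyed by the first ID it contains."""

    def __init__(self) -> None:
        self.batches: Dict[int, bytes] = {}

    def store(self, tagged: List[Tuple[int, Event]]) -> None:
        payload = json.dumps(tagged).encode("utf-8")
        self.batches[tagged[0][0]] = zlib.compress(payload)


class Index:
    """Keeps one bitmap per distinct value; bit i is set if event i has it."""

    def __init__(self) -> None:
        self.bitmaps: Dict[str, int] = defaultdict(int)

    def append(self, tagged: List[Tuple[int, Event]]) -> None:
        for event_id, event in tagged:
            for value in event:
                self.bitmaps[value] |= 1 << event_id


importer, archive, index = Importer(), Archive(), Index()
batch = [("10.0.0.1", "53/udp"), ("10.0.0.2", "80/tcp")]
tagged = importer.assign_ids(batch)
archive.store(tagged)
index.append(tagged)
```

Because event IDs double as bit positions, the index never stores events itself; it only records which IDs match a value, and the archive resolves IDs back to data.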

9 / 28

slide-31
SLIDE 31

VAST Architecture — Index

[Figure: the index with a meta index, several partitions, and indexers. Example values: conn, 10.0.0.2, 53/udp, 8.8.4.4, 53/udp, “dns”.]

10 / 28

slide-39
SLIDE 39

VAST Architecture — Querying

Query: X in 10.0.0.0/8 || X == 80/tcp

◮ exporter: X in 10.0.0.0/8 || X == 80/tcp
◮ index: lookup bit vectors from partitions (80/tcp == X, 10.0.0.0/8 in X)
◮ archive: locate & ship event batch for ID
◮ decompress batch, candidate check
◮ sink: render results

Example results: (type, 10.0.0.1, 53/udp, meta), (type, 10.0.0.2, 80/tcp, meta)
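
As a rough illustration of the query path on this slide, the following Python sketch combines two bit vectors with a bitwise OR, fetches and decompresses the matching batch, and re-checks each candidate against the query. All data and names here are hypothetical: the bit vectors are written by hand rather than looked up in partitions, and JSON plus zlib stands in for the real batch format.

```python
import ipaddress
import json
import zlib
from typing import Dict, List

# Hypothetical archived batch holding four events with IDs 0..3.
events = [
    {"id": 0, "addr": "10.0.0.1", "port": "53/udp"},
    {"id": 1, "addr": "10.0.0.2", "port": "80/tcp"},
    {"id": 2, "addr": "192.168.1.1", "port": "80/tcp"},
    {"id": 3, "addr": "192.168.1.2", "port": "53/udp"},
]
archive: Dict[int, bytes] = {0: zlib.compress(json.dumps(events).encode("utf-8"))}

# Bit vectors as an index lookup might return them (bit i = event ID i).
subnet_hits = 0b0011  # X in 10.0.0.0/8 -> IDs 0 and 1
port_hits = 0b0110    # X == 80/tcp    -> IDs 1 and 2


def matches(event: dict) -> bool:
    """Candidate check: evaluate the full query on a concrete event."""
    subnet = ipaddress.ip_network("10.0.0.0/8")
    in_subnet = ipaddress.ip_address(event["addr"]) in subnet
    return in_subnet or event["port"] == "80/tcp"


def run_query() -> List[dict]:
    hits = subnet_hits | port_hits  # the || in the query becomes a bitwise OR
    batch = json.loads(zlib.decompress(archive[0]))  # locate and decompress
    candidates = [e for e in batch if (hits >> e["id"]) & 1]
    return [e for e in candidates if matches(e)]  # results for the sink


results = run_query()
```

The candidate check is redundant in this toy setup, but it shows why the step appears on the slide: the bit vectors identify candidate IDs, and the actual events are only confirmed after decompression.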

11 / 28

slide-40
SLIDE 40

VAST Architecture — Distributed

12 / 28

slide-48
SLIDE 48

Outline

  1. Architecture
  2. Implementation
  3. Evaluation
slide-49
SLIDE 49

Indexing Basics — Tree Indexes

13 / 28

slide-50
SLIDE 50

Indexing Basics — Composition

14 / 28

slide-52
SLIDE 52

Indexing Basics — Inverted Index

[Figure: inverted index over the values A, B, C, D and positions 1 to 9.]

15 / 28

slide-53
SLIDE 53

Indexing Basics — Bitmap Index

[Figure: bitmap index with one bit vector per value A, B, C, D over positions 1 to 9.]

16 / 28

slide-57
SLIDE 57

Indexing Basics — Bitmap Composition

X ∈ 192.168.0.0/24
Y ≥ 60s
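
A small Python sketch may help illustrate how the two predicates on this slide compose. Each predicate yields one bit vector over row positions, and a conjunction or disjunction becomes a bitwise AND or OR. The rows are made up, and the bit vectors are built by scanning them purely for illustration; a real bitmap index would maintain them incrementally at ingestion.

```python
import ipaddress
from typing import Callable, List, Tuple

Row = Tuple[str, float]  # (X = address, Y = duration in seconds)

# Hypothetical rows; the position in the list is the bit position.
rows: List[Row] = [
    ("192.168.0.42", 75.0),
    ("192.168.0.7", 2.5),
    ("10.0.0.1", 120.0),
    ("192.168.0.99", 60.0),
]


def bitmap(predicate: Callable[[Row], bool]) -> int:
    """Return an integer whose bit i is set if rows[i] satisfies predicate."""
    bits = 0
    for position, row in enumerate(rows):
        if predicate(row):
            bits |= 1 << position
    return bits


subnet = ipaddress.ip_network("192.168.0.0/24")
x_hits = bitmap(lambda row: ipaddress.ip_address(row[0]) in subnet)
y_hits = bitmap(lambda row: row[1] >= 60.0)

both = x_hits & y_hits    # X ∈ 192.168.0.0/24 and Y ≥ 60s
either = x_hits | y_hits  # X ∈ 192.168.0.0/24 or Y ≥ 60s
matching_rows = [i for i in range(len(rows)) if (both >> i) & 1]
```

Here rows 0 and 3 satisfy both predicates, so matching_rows is [0, 3].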

17 / 28

slide-58
SLIDE 58

Indexing Challenges

High-cardinality values

◮ Represent millions of distinct values compactly
◮ Provide low-latency lookups

High-level operations

◮ Support type-specific operations
◮ Relational operators: {<, ≤, =, ≠, ≥, >, ∈, ∉}

18 / 28

slide-59
SLIDE 59

Query Language

Boolean Expressions

◮ Conjunctions: &&
◮ Disjunctions: ||
◮ Negations: !
◮ Predicates
  ◮ LHS op RHS
  ◮ (expr)

Examples

◮ A && B || !(C && D)
◮ orig_h in 10.0.0.1 && &time < now - 2h
◮ &type == "conn" || "foo" in :string
◮ duration > 60s && service == "tcp"

Extractors

◮ &tag
◮ x.y.z
◮ :type

Relational Operators

◮ <, <=, ==, >=, >
◮ in, ni, [+, +]
◮ !in, !ni, [-, -]
◮ ∼, !∼

Values

◮ T, F
◮ +42, 1337, 3.14
◮ "foo"
◮ 10.0.0.0/8
◮ 80/tcp, 53/?
◮ {1, 2, 3}
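
To show how such boolean expressions can be represented, here is a minimal Python sketch of an expression tree with predicates, conjunctions, disjunctions, and negations, evaluated against a single event. The class names and the event layout are hypothetical, durations are assumed to be plain seconds, and a missing field is assumed to make a predicate false; none of this is taken from VAST's actual parser or evaluator.

```python
import operator
from dataclasses import dataclass
from typing import Any, Callable, Dict, Union

Event = Dict[str, Any]


@dataclass
class Predicate:
    lhs: str                        # field name, standing in for an extractor
    op: Callable[[Any, Any], bool]  # relational operator
    rhs: Any                        # value

    def evaluate(self, event: Event) -> bool:
        return self.lhs in event and self.op(event[self.lhs], self.rhs)


@dataclass
class And:
    left: "Expr"
    right: "Expr"

    def evaluate(self, event: Event) -> bool:
        return self.left.evaluate(event) and self.right.evaluate(event)


@dataclass
class Or:
    left: "Expr"
    right: "Expr"

    def evaluate(self, event: Event) -> bool:
        return self.left.evaluate(event) or self.right.evaluate(event)


@dataclass
class Not:
    operand: "Expr"

    def evaluate(self, event: Event) -> bool:
        return not self.operand.evaluate(event)


Expr = Union[Predicate, And, Or, Not]

# duration > 60s && service == "tcp", with the duration given in seconds
query = And(
    Predicate("duration", operator.gt, 60.0),
    Predicate("service", operator.eq, "tcp"),
)

long_tcp = {"duration": 75.0, "service": "tcp"}
short_tcp = {"duration": 5.0, "service": "tcp"}
```

Evaluating query on long_tcp yields True and on short_tcp yields False. In an index-backed system each Predicate would instead produce a bit vector, with And, Or, and Not mapping to bitwise operations.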

19 / 28

slide-60
SLIDE 60

Data Model

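TYPE is one of:

◮ basic types: bool, int, count, real, duration, time, string, pattern, address, subnet, port, none
◮ container types: vector (of TYPE), set (of TYPE), table (KEY, VALUE)
◮ compound types: record (field 1: TYPE, …, field n: TYPE)
◮ recursive types: container and compound types, which contain further TYPEs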

20 / 28

slide-62
SLIDE 62

Bitmap Index for IP Addresses

192.168.0.42

21 / 28

slide-63
SLIDE 63

Bitmap Index for IP Addresses

11000000.10101000.00000000.00101010

21 / 28

slide-67
SLIDE 67

Bitmap Index for IP Addresses

X ∈ 192.168.0.0/27
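
One way to picture a bitmap index for IP addresses is a bit-sliced layout: one bitmap per address bit, so that a subnet lookup only has to combine the bitmaps for the prefix bits. The Python sketch below assumes this layout for IPv4 only. The class name AddressIndex and the sample addresses are hypothetical, and a Python integer stands in for a compressed bitmap.

```python
import ipaddress
from typing import List


class AddressIndex:
    """Bit-sliced index: slice i records which rows have address bit i set."""

    WIDTH = 32  # IPv4 only in this sketch

    def __init__(self) -> None:
        self.slices: List[int] = [0] * self.WIDTH
        self.size = 0  # number of rows appended so far

    def append(self, address: str) -> None:
        value = int(ipaddress.IPv4Address(address))
        for i in range(self.WIDTH):
            # Slice 0 holds the most significant address bit.
            if (value >> (self.WIDTH - 1 - i)) & 1:
                self.slices[i] |= 1 << self.size
        self.size += 1

    def lookup_subnet(self, subnet: str) -> int:
        """Return a bitmap of rows whose address lies inside subnet."""
        network = ipaddress.IPv4Network(subnet)
        prefix = int(network.network_address)
        all_rows = (1 << self.size) - 1
        hits = all_rows
        for i in range(network.prefixlen):
            bit = (prefix >> (self.WIDTH - 1 - i)) & 1
            # Keep rows whose bit i equals the prefix bit.
            hits &= self.slices[i] if bit else (~self.slices[i] & all_rows)
        return hits


index = AddressIndex()
for addr in ["192.168.0.42", "192.168.0.7", "192.168.1.42", "10.0.0.1", "192.168.0.31"]:
    index.append(addr)

hits_27 = index.lookup_subnet("192.168.0.0/27")
hits_24 = index.lookup_subnet("192.168.0.0/24")
```

With these rows, the /27 lookup selects rows 1 and 4 (192.168.0.7 and 192.168.0.31); 192.168.0.42 falls outside 192.168.0.0/27 because that subnet ends at 192.168.0.31. The lookup touches only 27 of the 32 slices, regardless of how many distinct addresses are stored.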

21 / 28

slide-68
SLIDE 68

Outline

  1. Architecture
  2. Implementation
  3. Evaluation
slide-70
SLIDE 70

Data Set

Single-Machine

Data:

◮ 10 M packets from a 24-hour trace (5 fields/event)
◮ 3.4 M derived Bro connection logs (20 fields/event)

Machine:

◮ 2 × 8-core Intel Xeon CPUs
◮ 128 GB RAM
◮ 4 × 3 TB SAS 7.2 K disks
◮ 64-bit FreeBSD

Cluster

Data:

◮ 1.24 B Bro connection logs (152 GB)
◮ Split into N slices for N nodes
◮ N ∈ [1, 24]

Nodes:

◮ 2 × 8-core Intel Xeon CPUs
◮ 12 GB of RAM
◮ 2 × 500 MB SATA disks
◮ 64-bit FreeBSD

22 / 28

slide-71
SLIDE 71

Queries

23 / 28

slide-72
SLIDE 72

Performance – Index Latency

[Figure: index latency in seconds (y-axis) versus number of cores (x-axis), one series per query A to I.]

24 / 28

slide-73
SLIDE 73

Performance — Scaling

[Figure, two panels. Import: 1 / Utilization (y-axis) versus number of nodes (x-axis). Export: latency in seconds (y-axis) versus number of nodes (x-axis).]

25 / 28

slide-74
SLIDE 74

Details in the paper

26 / 28

slide-75
SLIDE 75

Conclusion

Network Forensics Challenges

◮ Explorative high-dimensional search
◮ Disparate data access
◮ Massive data volumes

VAST: Visibility Across Space and Time

◮ Platform for network forensics
◮ Interactive & iterative search
◮ Inter-machine and intra-machine scaling
◮ Open-source, permissive license (BSD)

27 / 28

slide-76
SLIDE 76

Questions?

http://vast.io

28 / 28