Politecnico di Torino – Seminario su Traffic Classification - 10/2009 1
Capturing Traffic Traces with Ground-Truth Information
Niccolo' Cascarano, Politecnico di Torino
Joint work with
The problem
- The ground truth in traffic traces
Current solutions
- Manual inspection
– Do you really believe this is possible?
- DPI
– Is that true?
- Ad-hoc created traffic
– Is it realistic?
– Does it contain other kinds of traffic?
The GT toolset
[Figure: GT architecture. Labels: hosts running the GT daemon, border router, Internet, capturing host, GT logs, GT metadata, GT SQL database, IPClass]
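The architecture on this slide implies a join between the traffic seen by the capturing host and the records logged by the GT daemons. A minimal Python sketch of such a join follows. It assumes, hypothetically, that each GT record carries a 5-tuple, an application name, and the time interval over which the socket was observed; the record layout, the field names, and the `slack` tolerance are illustrative and are not GT's actual schema.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple

# (proto, src_ip, src_port, dst_ip, dst_port)
FiveTuple = Tuple[str, str, int, str, int]


@dataclass(frozen=True)
class GTRecord:
    """Hypothetical GT log entry: which application owned a socket, and when."""
    five_tuple: FiveTuple
    app: str
    first_seen: float
    last_seen: float


@dataclass(frozen=True)
class Flow:
    """A flow reconstructed from the captured trace."""
    five_tuple: FiveTuple
    start: float
    end: float


def reverse(ft: FiveTuple) -> FiveTuple:
    """Swap source and destination so both directions of a flow match."""
    proto, a_ip, a_port, b_ip, b_port = ft
    return (proto, b_ip, b_port, a_ip, a_port)


def label_flow(flow: Flow, records: Iterable[GTRecord],
               slack: float = 1.0) -> Optional[str]:
    """Return the application of the first GT record that matches the flow.

    A record matches when the endpoints agree (in either direction) and the
    time intervals overlap, allowing `slack` seconds of clock skew (assumed).
    """
    for rec in records:
        same_endpoints = rec.five_tuple in (flow.five_tuple,
                                            reverse(flow.five_tuple))
        overlaps = (rec.first_seen - slack <= flow.end
                    and flow.start <= rec.last_seen + slack)
        if same_endpoints and overlaps:
            return rec.app
    return None  # no ground truth available for this flow
```

The time check matters because ports are reused: the same 5-tuple may belong to different applications at different times.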
Characteristics (1)
- Polling based
– Simple
- Easy to develop
- Limited intrusiveness (no kernel hooks)
– Portable
– Not 100% coverage in terms of bytes/flows
- Solutions exist
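The polling approach can be sketched as a loop that periodically snapshots the host's socket table and remembers which application owns each socket. The Python sketch below is an illustration under stated assumptions, not GT's implementation: `take_snapshot` is a hypothetical callable standing in for whatever OS-specific, user-space query a real daemon would use, and it is injected so that the loop itself needs no kernel hooks.

```python
import time
from typing import Callable, Dict, Iterable, Tuple

# (proto, local_ip, local_port, remote_ip, remote_port)
Socket = Tuple[str, str, int, str, int]
# One snapshot: the sockets open right now and the application owning each.
Snapshot = Iterable[Tuple[Socket, str]]


def poll_sockets(take_snapshot: Callable[[], Snapshot],
                 rounds: int,
                 interval: float = 1.0,
                 sleep: Callable[[float], None] = time.sleep) -> Dict[Socket, str]:
    """Poll the socket table `rounds` times, `interval` seconds apart.

    Returns a socket -> application map. A socket that opens and closes
    between two consecutive polls never appears in any snapshot, which is
    why polling cannot guarantee 100% coverage.
    """
    seen: Dict[Socket, str] = {}
    for i in range(rounds):
        for sock, app in take_snapshot():
            seen.setdefault(sock, app)  # keep the first owner observed
        if i + 1 < rounds:
            sleep(interval)
    return seen
```

A shorter `interval` catches more short-lived sockets at the cost of more work on the monitored host.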
Characteristics (2)
- Non-intrusive with respect to sent/received traffic
– E.g., preserves timestamps
- User-friendly
– Both a command-line interface and a GUI are available
Characteristics (3)
- Open source and freely downloadable
– http://www.ing.unibs.it/ntw/tools/gt
- Finally, something that works
– FreeBSD, Linux, Windows, Mac OS X
Limitations (1)
- Intrusive
– Needs to be installed on each host
– What about monitoring large crowds?
– Monitored users are aware of it
- Do they modify their behavior?
Limitations (2)
- Not 100% coverage in terms of flows/bytes
– Polling mechanism
– Cannot mark traffic that doesn’t create a socket (e.g. attacks on a closed port, ICMP traffic, …)
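Coverage "in terms of flows/bytes" can be made concrete with a small calculation. The Python sketch below assumes a hypothetical input format, one `(byte_count, label)` pair per flow with `label` set to `None` when no ground truth was recorded; the numbers in the usage note are invented for illustration and are not measurements from GT.

```python
from typing import Iterable, Optional, Tuple


def coverage(flows: Iterable[Tuple[int, Optional[str]]]) -> Tuple[float, float]:
    """Return (flow_coverage, byte_coverage) as fractions in [0, 1].

    flow_coverage: marked flows / all flows
    byte_coverage: bytes in marked flows / all bytes
    """
    flows = list(flows)
    total_flows = len(flows)
    total_bytes = sum(nbytes for nbytes, _ in flows)
    marked = [(nbytes, label) for nbytes, label in flows if label is not None]
    marked_bytes = sum(nbytes for nbytes, _ in marked)

    flow_cov = len(marked) / total_flows if total_flows else 0.0
    byte_cov = marked_bytes / total_bytes if total_bytes else 0.0
    return flow_cov, byte_cov
```

For example, with flows `[(1000, "http"), (3000, "imap"), (500, None), (500, None)]`, half of the flows are marked but they carry 4000 of the 5000 bytes. Unmarked flows tend to be short-lived or socket-less, so byte coverage is typically the higher of the two figures.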
Overhead on monitored hosts
Polling and coverage
[Figure: panels for TCP and UDP]
Coverage with Service-Based Classification
- 95% completeness in terms of flows
- 99% completeness in terms of bytes
[Figure labels: IP = IP1, Port = P1; IP = IP1, Port = P2]
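One plausible reading of this slide is that a label obtained for one flow toward a service endpoint is reused for other flows toward the same endpoint, which raises completeness. The Python sketch below illustrates that idea under explicit assumptions: a service is identified by `(proto, server_ip, server_port)`, and the dictionary keys and the function name are hypothetical. It is not the implementation from the cited work.

```python
from typing import Dict, List, Optional, Tuple

Service = Tuple[str, str, int]  # (proto, server_ip, server_port)


def service_of(flow: dict) -> Service:
    """Extract the service endpoint a flow talks to."""
    return (flow["proto"], flow["server_ip"], flow["server_port"])


def propagate_labels(flows: List[dict]) -> List[dict]:
    """Fill in missing labels from other flows toward the same service.

    Each flow is a dict with keys: proto, server_ip, server_port, label.
    `label` is None when no ground truth was recorded for that flow.
    Returns new dicts; the input list is left unchanged.
    """
    # Pass 1: learn one label per service from the flows that have one.
    known: Dict[Service, Optional[str]] = {}
    for flow in flows:
        if flow["label"] is not None:
            known.setdefault(service_of(flow), flow["label"])

    # Pass 2: reuse the learned label for unlabelled flows.
    result = []
    for flow in flows:
        label = flow["label"]
        if label is None:
            label = known.get(service_of(flow))
        result.append(dict(flow, label=label))
    return result
```

Two services on the same IP but different ports, as in the slide's labels, are kept apart because the port is part of the key.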
Applications and Protocols (1)
[Figure: protocol labels HTTP, HTTPS, FTP, HTTP, HTTPS, IMAP, POP3, POP3S]
Applications and Protocols (2)
[Figure labels: GT metadata, captured traffic, DPI traffic classifier (Regex proto1 to Regex proto8), GT protocol metadata]
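The labels on this slide suggest a regex-based DPI classifier whose protocol verdicts sit alongside the application information from GT. The Python sketch below shows one way such a comparison could look. The three signatures are simplified illustrations chosen for this example, not the rules used in the talk, and `compare_with_gt` and its input format are hypothetical.

```python
import re
from typing import Iterable, List, Optional, Tuple

# Illustrative signatures only; real DPI rule sets are far more thorough.
SIGNATURES = {
    "http": re.compile(rb"^(GET|POST|HEAD|PUT|DELETE|OPTIONS) \S+ HTTP/1\.[01]\r\n"),
    "pop3": re.compile(rb"^\+OK"),
    "ssh": re.compile(rb"^SSH-\d\.\d+-"),
}


def dpi_classify(payload: bytes) -> Optional[str]:
    """Return the first protocol whose regex matches the start of the payload."""
    for proto, pattern in SIGNATURES.items():
        if pattern.match(payload):
            return proto
    return None  # unknown to this (tiny) rule set


def compare_with_gt(
    samples: Iterable[Tuple[bytes, str]],
) -> List[Tuple[str, Optional[str], bool]]:
    """Compare DPI verdicts with a reference protocol label per flow.

    samples: (first payload bytes of the flow, reference protocol) pairs.
    Returns (reference, dpi_verdict, agree) triples.
    """
    report = []
    for payload, reference in samples:
        verdict = dpi_classify(payload)
        report.append((reference, verdict, verdict == reference))
    return report
```

Disagreements are the interesting rows: they point either at a weak signature or at an application speaking a protocol one would not expect from its name.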
Future work
- GT is a good start for associating ground truth with the traces
- But…
– Can we capture raw data?
– How can we share files?
- Full payload is often needed
References
- Ground truth
– F. Gringoli, L. Salgarelli, M. Dusi, N. Cascarano, F. Risso, K.C. Claffy, “GT: picking up the truth from the ground for Internet traffic,” ACM Computer Communication Review, October 2009.
- Service-based traffic classification
– M. Baldi, F. Risso, N. Cascarano, A. Baldini, “Service-Based Traffic