Capturing Traffic Traces with Ground-Truth Information



SLIDE 1

Politecnico di Torino – Seminar on Traffic Classification - 10/2009

Capturing Traffic Traces with Ground-Truth Information

Niccolò Cascarano, Politecnico di Torino
Joint work with the Telecommunication Networks Group at Università di Brescia (Italy)

SLIDE 2

The problem

  • The ground truth in traffic traces

SLIDE 3

Current solutions

  • Manual inspection

– Do you really believe this is possible?

  • DPI

– Is that true?

  • Ad-hoc created traffic

– Is it realistic?
– Does it contain other kinds of traffic?

SLIDE 4

The GT toolset

[Figure: GT toolset architecture. Labels: hosts running GT daemon, border router, Internet, capturing host, GT logs, GT metadata, GT SQL database, IPClass.]
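The labels on this slide suggest that the logs produced on each monitored host are combined with the trace recorded at the capturing host. The following is a minimal sketch of such a join, assuming (the slide does not state this) that each GT record carries a flow 5-tuple and an application name. `FlowKey`, `build_gt_index`, and `label_flow` are hypothetical names for illustration, not part of the GT toolset.

```python
from typing import NamedTuple, Optional


class FlowKey(NamedTuple):
    """5-tuple identifying a flow (field names are illustrative)."""
    proto: str
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


def build_gt_index(gt_records):
    """Index hypothetical GT log records, given as (FlowKey, application) pairs."""
    return {key: app for key, app in gt_records}


def label_flow(key: FlowKey, gt_index) -> Optional[str]:
    """Return the application for a captured flow, or None if it is unmarked.

    The reverse direction is tried as well, because the capturing host
    sees packets in both directions of the same socket.
    """
    if key in gt_index:
        return gt_index[key]
    reverse = FlowKey(key.proto, key.dst_ip, key.dst_port, key.src_ip, key.src_port)
    return gt_index.get(reverse)


# Made-up records, as a host daemon might report them.
gt_index = build_gt_index([
    (FlowKey("TCP", "10.0.0.5", 50123, "93.184.216.34", 80), "firefox"),
    (FlowKey("UDP", "10.0.0.7", 40000, "8.8.8.8", 53), "skype"),
])

captured = [
    FlowKey("TCP", "10.0.0.5", 50123, "93.184.216.34", 80),  # outbound
    FlowKey("TCP", "93.184.216.34", 80, "10.0.0.5", 50123),  # inbound, same socket
    FlowKey("TCP", "10.0.0.9", 51000, "10.0.0.1", 22),       # no GT record
]
labels = [label_flow(key, gt_index) for key in captured]
```

Flows without a matching record stay unlabelled (`None`) rather than being guessed.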

SLIDE 5

Characteristics (1)

  • Polling based

– Simple

  • Easy to develop
  • Limited intrusiveness (no kernel hooks)

– Portable
– Not 100% coverage in terms of bytes/flows

  • Solutions exist

SLIDE 6

Characteristics (2)

  • Non-intrusive with respect to sent/received traffic

– E.g., preserves timestamps

  • User-friendly

– Both a command-line interface and a GUI are available

SLIDE 7

Characteristics (3)

  • Open source and freely downloadable

– http://www.ing.unibs.it/ntw/tools/gt

  • Finally, something that works

– FreeBSD, Linux, Windows, MacOS X

SLIDE 8

Limitations (1)

  • Intrusive

– Needs to be installed on each host
– What about monitoring large crowds?
– Monitored users are aware of it

  • Do they modify their behavior?

SLIDE 9

Limitations (2)

  • Not 100% coverage in terms of flows/bytes

– Polling mechanism
– Cannot mark traffic that doesn’t create a socket (e.g. attacks on a closed port, ICMP traffic, …)
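Why a polling mechanism leaves some flows unmarked can be illustrated with a small simulation. This is a sketch under stated assumptions, not GT's actual code: polls are assumed to happen at fixed intervals, a socket is assumed to be seen only if it is open at a poll instant, and the function name, the interval, and the socket list are all made up for the example.

```python
def poll_coverage(sockets, poll_interval_ms):
    """Return (seen_names, flow_coverage, byte_coverage).

    `sockets` holds (name, open_ms, close_ms, nbytes) tuples. Polls are
    assumed to happen at t = 0, poll_interval_ms, 2 * poll_interval_ms, ...
    A socket is seen only if some poll instant falls in [open_ms, close_ms).
    """
    seen = []
    seen_bytes = 0
    total_bytes = 0
    for name, open_ms, close_ms, nbytes in sockets:
        total_bytes += nbytes
        # First poll instant at or after the socket was opened (ceiling division).
        first_poll = -(-open_ms // poll_interval_ms) * poll_interval_ms
        if first_poll < close_ms:
            seen.append(name)
            seen_bytes += nbytes
    return seen, len(seen) / len(sockets), seen_bytes / total_bytes


# Made-up sockets: the short-lived ones also carry few bytes.
sockets = [
    ("dns-lookup", 200, 500, 300),
    ("web-download", 500, 30_000, 5_000_000),
    ("short-http", 2_000, 2_400, 20_000),
    ("probe", 3_100, 3_900, 120),
]
seen, flow_cov, byte_cov = poll_coverage(sockets, poll_interval_ms=1_000)
```

With these made-up numbers, two of the four sockets open and close between polls and are missed, yet almost all bytes are covered because the missed sockets are small. Traffic that never creates a socket would not appear in such a list at all.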

SLIDE 10

Overhead on monitored hosts

SLIDE 11

Polling and coverage

[Figure: polling and coverage, shown for TCP and UDP.]

SLIDE 12

Coverage with Service-Based Classification

  • 95% completeness in terms of flows
  • 99% completeness in terms of bytes

[Figure: two services on the same host, (IP = IP1, Port = P1) and (IP = IP1, Port = P2).]
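One way to read this slide is that flows left unmarked by polling can inherit a label from marked flows that reach the same service endpoint. The sketch below assumes a service is identified by (server IP, transport protocol, server port) and that the most frequent label wins; the function names and the flow tuples are hypothetical and the numbers are made up, so the output does not reproduce the 95% and 99% figures above.

```python
from collections import Counter, defaultdict


def learn_services(flows):
    """Map each (server_ip, proto, server_port) to its most common label."""
    votes = defaultdict(Counter)
    for server_ip, proto, server_port, app, _nbytes in flows:
        if app is not None:
            votes[(server_ip, proto, server_port)][app] += 1
    return {svc: counts.most_common(1)[0][0] for svc, counts in votes.items()}


def extend_labels(flows, services):
    """Fill in missing labels from the service table; keep existing labels."""
    extended = []
    for server_ip, proto, server_port, app, nbytes in flows:
        if app is None:
            app = services.get((server_ip, proto, server_port))
        extended.append((server_ip, proto, server_port, app, nbytes))
    return extended


def completeness(flows):
    """Return (fraction of flows labelled, fraction of bytes labelled)."""
    total_bytes = sum(f[4] for f in flows)
    marked = [f for f in flows if f[3] is not None]
    return len(marked) / len(flows), sum(f[4] for f in marked) / total_bytes


# (server_ip, proto, server_port, label or None, bytes); values are made up.
flows = [
    ("IP1", "TCP", 80, "firefox", 1000),
    ("IP1", "TCP", 80, None, 200),    # same service as the flow above
    ("IP1", "TCP", 25, None, 50),     # same host, different port
    ("IP2", "TCP", 443, "firefox", 750),
]
before = completeness(flows)
after = completeness(extend_labels(flows, learn_services(flows)))
```

The flow to port 25 stays unlabelled: sharing an IP address is not enough, since the same host can offer several services on different ports.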

SLIDE 13

Applications and Protocols (1)

[Figure: applications and the protocols they use. Protocol labels: HTTP, HTTPS, FTP, HTTP, HTTPS, IMAP, POP3, POP3S.]

SLIDE 14

Applications and Protocols (2)

[Figure: DPI traffic classifier. Labels: GT metadata, captured traffic, DPI traffic classifier, GT protocol metadata, Regex proto1 through Regex proto8.]
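The labels on this slide point to a regex-based DPI step that adds a protocol to the application-level ground truth. A minimal sketch of that idea follows, assuming one regular expression per protocol applied to the first payload bytes of a flow. The signatures are deliberately simplified illustrations, not the ones used by the authors, and `classify_payload` and `protocol_metadata` are hypothetical names.

```python
import re

# Toy signatures, checked in order. Real DPI rules are far more careful:
# for example, the "220" greeting is also used by SMTP, not only by FTP.
SIGNATURES = [
    ("HTTP", re.compile(rb"^(GET|POST|HEAD|PUT|DELETE) \S+ HTTP/1\.[01]\r\n")),
    ("POP3", re.compile(rb"^\+OK ")),
    ("IMAP", re.compile(rb"^\* OK ")),
    ("FTP", re.compile(rb"^220[ -]")),
]


def classify_payload(payload: bytes):
    """Return the first protocol whose signature matches, or None."""
    for proto, pattern in SIGNATURES:
        if pattern.search(payload):
            return proto
    return None


def protocol_metadata(flows):
    """Pair each flow's application label with the DPI protocol verdict."""
    return [(app, classify_payload(payload)) for app, payload in flows]


# (application label, first payload bytes); values are made up.
flows = [
    ("firefox", b"GET /index.html HTTP/1.1\r\nHost: example.org\r\n\r\n"),
    ("thunderbird", b"+OK POP3 server ready\r\n"),
    ("thunderbird", b"\x16\x03\x01\x00\xa5"),  # TLS record, no plaintext signature
]
metadata = protocol_metadata(flows)
```

Encrypted flows such as the last one return `None` here, which is one reason a payload-only classifier cannot serve as ground truth by itself.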

SLIDE 15

Future work

  • GT is a good start for associating ground truth with the traces
  • But…

– Can we capture raw data?
– How can we share files?

  • Full payload is often needed

SLIDE 16

References

  • Ground truth

  • F. Gringoli, L. Salgarelli, M. Dusi, N. Cascarano, F. Risso, K.C. Claffy, “GT: picking up the truth from the ground for Internet traffic,” ACM Computer Communication Review, October 2009.

  • Service-based traffic classification

  • M. Baldi, F. Risso, N. Cascarano, A. Baldini, “Service-Based Traffic Classification: Principles and Validation,” IEEE Sarnoff Symposium, Princeton, NJ (USA), pp. 1-6, March 2009.