How is SUNET really used? Results of traffic classification on backbone data



SLIDE 1

MonNet – a project for network and traffic monitoring

How is SUNET really used?

Results of traffic classification on backbone data

Wolfgang John and Sven Tafvelin
Dept. of Computer Science and Engineering
Chalmers University of Technology
Göteborg, Sweden

SLIDE 2

2007-11-15 SUNET TrefPunkt 17

Introduction: Measurement location

[Figure: network diagram of the measurement location. Labels: Internet, regional ISPs, Göteborg, Stockholm, Chalmers, GU, GSIX, other smaller universities and institutes.]

  • 2x 10 Gbit/s (OC-192)
  • capturing headers only
  • IP addresses anonymized
  • tightly synchronized
  • bidirectional per-flow analysis
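
A minimal sketch of what bidirectional per-flow analysis could look like on header-only data. The record layout (`src_ip`, `src_port`, `dst_ip`, `dst_port`, `proto`, `length`) and the function names are assumptions made for illustration; the slides do not describe the actual MonNet implementation.

```python
from collections import defaultdict

def flow_key(src_ip, src_port, dst_ip, dst_port, proto):
    """Direction-independent key: both directions of a connection map to the same tuple."""
    a, b = (src_ip, src_port), (dst_ip, dst_port)
    first, second = (a, b) if a <= b else (b, a)
    return (first, second, proto)

def merge_directions(packets):
    """Aggregate packet headers into bidirectional flows with per-direction counters."""
    flows = defaultdict(lambda: {"fwd_pkts": 0, "rev_pkts": 0,
                                 "fwd_bytes": 0, "rev_bytes": 0})
    for p in packets:
        key = flow_key(p["src_ip"], p["src_port"], p["dst_ip"], p["dst_port"], p["proto"])
        # "fwd" means: sent by the endpoint that sorts first in the key (hypothetical convention)
        direction = "fwd" if (p["src_ip"], p["src_port"]) == key[0] else "rev"
        flows[key][direction + "_pkts"] += 1
        flows[key][direction + "_bytes"] += p["length"]
    return dict(flows)
```

Packets captured on the two link directions end up in one flow record, which is what makes per-connection statistics (for example, unanswered connection attempts) possible.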


SLIDE 3

Introduction: Motivation

  • Problem:

– Operators don’t know the type of their traffic
– How to:

  • Improve network design and provisioning?
  • Support QoS or security monitoring?
  • Enhance accounting possibilities?
  • Reveal trends and changes in network applications?
SLIDE 4

Introduction: Motivation (2)

  • Solution: Network classification

– Four approaches in the literature:

  • 1. Port numbers

+ easy to implement
- unreliable (P2P, malicious traffic)

  • 2. Packet payloads

+ accurate
- requires updated payload signatures
- privacy and legal issues
- high processing requirements
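
To illustrate why port-based classification is easy to implement but unreliable, here is a small sketch. The port table is an illustrative subset chosen for this example, not the list used in the study.

```python
# Illustrative subset of well-known ports (assumption, not the study's table)
WELL_KNOWN_PORTS = {80: "web", 443: "web", 25: "mail", 53: "dns", 22: "ssh"}

def classify_by_port(src_port, dst_port):
    """Return the application class of the first well-known port found, else 'unknown'."""
    for port in (dst_port, src_port):
        if port in WELL_KNOWN_PORTS:
            return WELL_KNOWN_PORTS[port]
    return "unknown"
```

A P2P client that listens on port 80 would be labelled "web" by this lookup, and one using a random high port would be labelled "unknown", which is the weakness the slide refers to.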
SLIDE 5

Introduction: Motivation (3)

  • Solution: Network classification (contd.)
  • 3. Statistical fingerprinting

+ no detailed packet information needed
- depends on the quality of training data
- promising, but still immature

  • 4. Connection patterns

+ no payload required
+ no training data required
- not perfect accuracy
SLIDE 6

Introduction: Overview

  • Connection classification

– Overview of proposed heuristics
– Verification of methodology

  • Results

– Traffic volumes
– Diurnal patterns
– Signaling behavior

  • Summary of more results
SLIDE 7

Methodology: Traffic Classification

  • Two articles classify P2P flows according to connection patterns:

– Karagiannis et al., 2004
– Perenyi et al., 2006

  • Updated classification heuristics:

– Refined the heuristics in prior articles
– Added new, necessary heuristics

SLIDE 8

Methodology: Proposed Heuristics

  • Rules based on connection patterns and port numbers

– 5 rules for P2P traffic
– 10 rules to classify other types of traffic

  • remove ‘false positives’ from P2P

– Rules are applied:

  • On flows in 10-minute intervals
  • Independently on all flows, and prioritized when fetched from the database
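
The application of the rules can be sketched as follows: flows are grouped into 10-minute intervals, every rule is evaluated independently, and a priority decides the final label when several rules match. The field name `start`, the rule tuples and the priority values are hypothetical; the slides do not give the actual priorities or database schema.

```python
from collections import defaultdict

INTERVAL_SECONDS = 600  # 10-minute intervals, as stated on the slide

def bucket_flows(flows):
    """Group flows by the 10-minute interval in which they start."""
    buckets = defaultdict(list)
    for f in flows:
        buckets[int(f["start"]) // INTERVAL_SECONDS].append(f)
    return dict(buckets)

def classify(flow, rules):
    """rules: list of (name, priority, predicate).

    Every rule is evaluated independently; among the matching rules the one
    with the lowest priority number wins (assumed convention).
    """
    matches = [(prio, name) for name, prio, pred in rules if pred(flow)]
    return min(matches)[1] if matches else "unclassified"
```

In this sketch a false-positive rule would simply be given a better priority than the P2P rule it is meant to override.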

SLIDE 9

Methodology: Proposed Heuristics (2)

– Heuristics for potential P2P traffic (H1-H5)

  • All traffic to and from potential P2P hosts is marked as P2P traffic
  • H1: TCP and UDP traffic between IP pair
  • H2: Well-known P2P ports
  • H3: Re-usage of source port within short time
  • H4: Non-parallel connections to endpoint (IP/Port)
  • H5: unclassified, long flows

– unclassified by H1-H4 and F1-F10
– more than 1MB in one direction or
– duration of more than 10 minutes
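
As a rough illustration of H1 and H5, the following sketch works on bidirectional flow records. The field names and the reading of 1MB as 10^6 bytes are assumptions, and any refinements the authors apply to these rules are not modelled here.

```python
def h1_p2p_pairs(flows):
    """H1: return the IP pairs that exchange both TCP and UDP traffic."""
    protos = {}
    for f in flows:
        pair = frozenset((f["src_ip"], f["dst_ip"]))
        protos.setdefault(pair, set()).add(f["proto"])
    return {pair for pair, seen in protos.items() if {"tcp", "udp"} <= seen}

def h5_long_flow(flow):
    """H5 size/duration test, to be applied only to flows no other rule has classified."""
    one_mb = 1_000_000  # assumption: 1MB taken as 10^6 bytes
    big = max(flow["bytes_fwd"], flow["bytes_rev"]) > one_mb
    long_lived = flow["duration_s"] > 600  # more than 10 minutes
    return big or long_lived
```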

SLIDE 10

Methodology: Proposed Heuristics (3)

– Heuristics for other traffic (F1-F10)

  • F1 and F2: Web servers:

– parallel connections to Web ports
– All traffic to and from Web servers is Web traffic

  • F3: common services (DNS, BGP)

– Equal source and destination port and port<501

  • F4: Mail servers:

– Hosts receiving traffic on mail ports (smtp, imap, pop) while sending traffic via smtp
– All traffic to and from Mail servers is Mail traffic
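
F3 and F4 translate fairly directly into code. The sketch below assumes unidirectional flow records with `src_ip`, `dst_ip`, `src_port` and `dst_port` fields and uses the standard ports 25 (smtp), 110 (pop3) and 143 (imap); the exact port set used in the study is not given on the slide.

```python
MAIL_PORTS = {25, 110, 143}  # smtp, pop3, imap (standard ports; assumed set)
SMTP_PORT = 25

def f3_common_service(flow):
    """F3: equal source and destination port, and port below 501."""
    return flow["src_port"] == flow["dst_port"] and flow["src_port"] < 501

def f4_mail_servers(flows):
    """F4: hosts that receive traffic on mail ports while also sending via smtp."""
    receives_mail, sends_smtp = set(), set()
    for f in flows:
        if f["dst_port"] in MAIL_PORTS:
            receives_mail.add(f["dst_ip"])
        if f["dst_port"] == SMTP_PORT:
            sends_smtp.add(f["src_ip"])
    return receives_mail & sends_smtp
```

All traffic to and from the hosts returned by `f4_mail_servers` would then be labelled as mail traffic.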

SLIDE 11

Methodology: Proposed Heuristics (4)

– Heuristics for other traffic (F1-F10)

  • F5 and F6: Messenger and Gaming

– Hosts connected to by a number of different IPs on well-known messenger, chat or gaming ports within a period of 10 days
– All traffic to and from these hosts is messenger or gaming traffic

  • F7: FTP

– Active FTP with initiating port number of 20

  • F8: non-P2P ports:

– Some well-known, privileged port numbers typically not used by P2P, such as dns, telnet, ssh, ftp, mail, rtp, bgp …
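
The idea behind F5 and F6 can be sketched as counting distinct client IPs per server endpoint. The port set and the threshold below are hypothetical placeholders, since the slide only says "a number of different IPs"; the flows passed in are assumed to be already restricted to the observation period.

```python
from collections import defaultdict

CHAT_PORTS = {6667, 5222}   # illustrative only (IRC, XMPP); not the study's list
MIN_DISTINCT_PEERS = 3      # hypothetical threshold

def f5_service_hosts(flows, ports=CHAT_PORTS, min_peers=MIN_DISTINCT_PEERS):
    """Return hosts contacted by at least min_peers different IPs on the given ports."""
    peers = defaultdict(set)
    for f in flows:
        if f["dst_port"] in ports:
            peers[f["dst_ip"]].add(f["src_ip"])
    return {host for host, srcs in peers.items() if len(srcs) >= min_peers}
```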

SLIDE 12

Methodology: Proposed Heuristics (5)

– Heuristics for other traffic (F1-F10)

  • F9: malicious and attack traffic

– Scans through IP ranges
– Scans through port ranges
– DoS or “hammering” attacks on a few hosts at high frequency

  • F10: unclassified, known non-P2P port

– unclassified by H1-H4 and F1-F9 (no connection pattern)
– Well-known ports including Web, messenger and gaming
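
The two scan patterns in F9 can be approximated by counting how many distinct targets a source touches. This is only a sketch under assumed field names and a hypothetical threshold (`min_targets`); the slides do not state the thresholds used, and the DoS or hammering case is not covered here.

```python
from collections import defaultdict

def find_scanners(flows, min_targets=50):
    """Return (ip_scanners, port_scanners) based on distinct-target counts."""
    ip_targets = defaultdict(set)    # (src, dst_port) -> destination IPs
    port_targets = defaultdict(set)  # (src, dst_ip)   -> destination ports
    for f in flows:
        ip_targets[(f["src_ip"], f["dst_port"])].add(f["dst_ip"])
        port_targets[(f["src_ip"], f["dst_ip"])].add(f["dst_port"])
    # Same port on many hosts: scan through an IP range
    ip_scanners = {src for (src, _), t in ip_targets.items() if len(t) >= min_targets}
    # Many ports on the same host: scan through a port range
    port_scanners = {src for (src, _), t in port_targets.items() if len(t) >= min_targets}
    return ip_scanners, port_scanners
```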

SLIDE 13

Verification of proposed heuristics

  • Comparison of classification for P2P traffic

[Figure: two charts, number of connections (in 10^6) and amount of data (in TB).]
SLIDE 14

Results: Traffic Volumes

  • Application breakdown April 2006
SLIDE 15

Results: Traffic Volumes (2)

  • Application breakdown April till Nov. 2006
SLIDE 16

Results: Diurnal Patterns

  • Fractions of P2P data, April to November

[Figure: fraction of P2P data (y-axis, 0.1 to 1) over time (x-axis, Unix timestamps 1143000000 to 1163000000), with linear trend lines for P2P data at 2 AM, 10 AM, 2 PM and 8 PM.]
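
A per-hour P2P fraction of the kind plotted here could be computed from classified traffic samples roughly as follows. The sample layout `(unix_ts, app_class, n_bytes)` is an assumption, and the hour is taken in UTC for simplicity, whereas the slide presumably refers to local time.

```python
from collections import defaultdict
from datetime import datetime, timezone

def p2p_fraction_by_hour(samples):
    """samples: iterable of (unix_ts, app_class, n_bytes); returns {hour: P2P byte fraction}."""
    total = defaultdict(int)
    p2p = defaultdict(int)
    for ts, app_class, n_bytes in samples:
        hour = datetime.fromtimestamp(ts, tz=timezone.utc).hour  # UTC hour (assumption)
        total[hour] += n_bytes
        if app_class == "p2p":
            p2p[hour] += n_bytes
    return {h: p2p[h] / total[h] for h in total if total[h] > 0}
```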

SLIDE 17

Results: Signaling Behavior

  • Connection establishment for P2P, Web and malicious traffic

SLIDE 18

Summary of Results

  • Traffic is increasing for TCP and UDP
  • Highest activity during evening hours
  • P2P dominating (~90 % of data volume)
  • P2P peak time at evening and night-time
  • Web peak time during office hours
  • Fractions of P2P and Web constant
  • Malicious traffic constant in absolute numbers (‘background noise’)
SLIDE 19

Summary of Results (2)

  • Major differences in signaling behavior

– 43% of TCP P2P connections are 1-packet flows (attempts)
– 80% of malicious TCP traffic consists of 1-packet flows (scans)
– Web traffic behaving ‘nicely’

  • Different TCP options deployment

– P2P behaves as expected
– Web traffic shows artifacts of the client-server pattern, e.g. popular web servers neglecting the SACK option
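
The share of 1-packet flows per traffic class, as quoted above, is a simple ratio over classified flows. The sketch assumes each flow record carries an `app_class` label and a `packets` count; both names are hypothetical.

```python
from collections import Counter

def one_packet_share(flows):
    """Return {app_class: fraction of flows that consist of exactly one packet}."""
    total, single = Counter(), Counter()
    for f in flows:
        total[f["app_class"]] += 1
        if f["packets"] == 1:
            single[f["app_class"]] += 1
    return {c: single[c] / total[c] for c in total}
```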

SLIDE 20

References

  • W. John and S. Tafvelin, Analysis of Internet Backbone Traffic and Anomalies observed, ACM IMC07, San Diego, USA, 2007.
  • W. John and S. Tafvelin, Differences between in- and outbound Internet Backbone Traffic, TNC07, Copenhagen, DK, 2007.

Available on: http://www.ce.chalmers.se/~johnwolf

  • W. John and S. Tafvelin, Heuristics to Classify Internet Backbone Traffic based on Connection Patterns, accepted at IEEE ICOIN08
  • W. John, S. Tafvelin and Tomas Olovsson, Trends and Differences in Connection Behavior within Classes of Internet Backbone Traffic, submitted for publication

Available on request: johnwolf@ce.chalmers.se or as paper copy
SLIDE 21

MonNet – a project for network and traffic monitoring

Thank you very much for your attention!

Questions?