Classifying Internet One-way Traffic Eduard Glatz, Xenofontas - - PowerPoint PPT Presentation

SLIDE 1

Classification Scheme One-way Traffic Composition Service Availability Monitoring

Classifying Internet One-way Traffic

Eduard Glatz, Xenofontas Dimitropoulos

ETH Zurich

May 15, 2012

Eduard Glatz, Xenofontas Dimitropoulos Classifying Internet One-way Traffic

SLIDE 2

Overview

◮ Classification scheme for dissecting one-way traffic that relies solely on flow-level data
◮ Observations on one-way traffic based on a massive dataset of 457 billion flows
◮ Show how one-way flows are useful for service availability monitoring

SLIDE 4

Preliminaries

◮ Study incoming one-way traffic at the network level: connections that do not receive a reply.
◮ Example causes of one-way traffic:
    ◮ Failures & Policies
    ◮ Attacks
    ◮ Special application behavior
◮ Sampling and asymmetric routing can result in artificial one-way traffic
◮ One-way traffic can be measured in edge networks

SLIDE 5

Classification Scheme

◮ Associate each one-way flow with a number of signs
◮ Introduce 18 signs, exploiting techniques from the literature in 4 cases
◮ Classify flows based on their signs
◮ Classes:
    ◮ Unreachable services
    ◮ P2P applications
    ◮ Scanning
    ◮ Backscatter
    ◮ Suspected Benign
    ◮ Bogon

SLIDE 7

Signs: Host pair behavior

Figure (panels a to d): Mixture of incoming one- and two-way flows exchanged between a host pair. Hosts are represented by nodes and the presence of inflow/outflow/biflows by arrows.

◮ End-hosts-communicating: One-way flow between productive host pair
◮ Limited dialog: One-way flows between unproductive host pair

SLIDE 8

Signs: Local host behavior

◮ Unused local address: Unpopulated local IP address
◮ Service unreachable: Unanswered request to local service
◮ Peer-to-peer [1]: Flow towards local P2P host

[1] W. John and S. Tafvelin. Heuristics to classify internet backbone traffic based on connection patterns. International Conference on Information Networking (ICOIN), 2008

SLIDE 9

Signs: Remote host behavior

◮ Service sole reply: no biflow on srcIP ∧ dstPort ≥ 1024 ∧ srcPort < 1024
◮ Remote scanner 1 [2]: TRW algorithm (suspected scanner)
◮ Remote scanner 2 [3]: Host classification (suspected scanner)
◮ Remote non-scanner: TRW algorithm (suspected regular host)

[2] J. Jung, V. Paxson, A. Berger, and H. Balakrishnan. Fast portscan detection using sequential hypothesis testing. In Proceedings of the IEEE Symposium on Security and Privacy, 2004
[3] M. Allman, V. Paxson, and J. Terrell. A brief history of scanning. In Proceedings of the 7th ACM SIGCOMM IMC, 2007
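The two TRW-based signs rest on sequential hypothesis testing: each connection attempt by a remote host updates a likelihood ratio until it crosses one of two thresholds. The following is a minimal sketch of that idea only. The parameter values, the function name `trw_verdict`, and the way outcomes are fed in are assumptions for illustration; the slides do not say how the test is adapted to flow-level data.

```python
from typing import Iterable, Optional

# Illustrative parameters (assumptions, not taken from the slides).
THETA0 = 0.8   # P(attempt is answered | regular host)
THETA1 = 0.2   # P(attempt is answered | scanner)
ALPHA = 0.01   # tolerated false-positive rate
BETA = 0.99    # desired detection rate

ETA1 = BETA / ALPHA              # upper threshold: suspected scanner
ETA0 = (1 - BETA) / (1 - ALPHA)  # lower threshold: suspected regular host


def trw_verdict(outcomes: Iterable[bool]) -> Optional[str]:
    """Run a sequential likelihood-ratio test over connection outcomes.

    outcomes: True for an answered attempt, False for an unanswered one.
    Returns "scanner", "non-scanner", or None while still undecided.
    """
    ratio = 1.0
    for answered in outcomes:
        if answered:
            ratio *= THETA1 / THETA0              # evidence for a regular host
        else:
            ratio *= (1 - THETA1) / (1 - THETA0)  # evidence for a scanner
        if ratio >= ETA1:
            return "scanner"
        if ratio <= ETA0:
            return "non-scanner"
    return None
```

With these assumed parameters, four unanswered attempts in a row are enough to flag a host as a suspected scanner, while three leave the test undecided.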

SLIDE 10

Signs: Flow feature

◮ Artifact: UDP/TCP flow with both port numbers = 0
◮ Single packet: Flow contains one packet only
◮ Large flow: Flow carries ≥ 10 packets or ≥ 10240 bytes
◮ Bogon: Source IP belongs to bogon space
◮ Protocol: IP protocol type of flow
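The flow-feature signs above are simple per-flow predicates, which a short sketch can make concrete. The `Flow` record, the function name, and the tiny `BOGON_NETS` list are hypothetical; a real deployment would presumably load a maintained bogon list. The sign labels reuse the short names that appear in the classification rules (Onepkt, Large, bogon, TCP, UDP, ICMP).

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Set

# Hypothetical and deliberately incomplete bogon list (assumption).
BOGON_NETS = [ip_network(n) for n in
              ("0.0.0.0/8", "10.0.0.0/8", "127.0.0.0/8", "192.168.0.0/16")]

PROTO_NAMES = {1: "ICMP", 6: "TCP", 17: "UDP"}  # IP protocol numbers


@dataclass
class Flow:
    src_ip: str
    src_port: int
    dst_port: int
    protocol: int   # IP protocol number
    packets: int
    n_bytes: int


def flow_feature_signs(flow: Flow) -> Set[str]:
    """Return the flow-feature signs that apply to a single flow."""
    signs = set()
    if flow.protocol in (6, 17) and flow.src_port == 0 and flow.dst_port == 0:
        signs.add("Artifact")   # TCP/UDP flow with both ports equal to 0
    if flow.packets == 1:
        signs.add("Onepkt")     # single-packet flow
    if flow.packets >= 10 or flow.n_bytes >= 10240:
        signs.add("Large")      # thresholds as stated on the slide
    if any(ip_address(flow.src_ip) in net for net in BOGON_NETS):
        signs.add("bogon")      # source address in bogon space
    signs.add(PROTO_NAMES.get(flow.protocol, "OtherProto"))
    return signs
```

For example, a one-packet UDP flow from 10.1.2.3 with both ports set to 0 would carry the signs Artifact, Onepkt, bogon, and UDP under these assumptions.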

SLIDE 11

Classification Rules

Final classifier includes 17 classification rules

Class Name / Rule # / Flow Membership Rules

Malicious Scanning
    1   {TRWscan, HCscan, PotOk} ⇒ Scanner
    2   {HCscan, TRWscan, TRWnom, PotOk} ⇒ Scanner
    3   {TRWscan, HCscan, PotOk} ⇒ Scanner
    4   {TRWnom, HCscan} ⇒ Scanner
    5   {GreyIP, Onepkt, TRWscan, HCscan, Backsc, ICMP, UDP, bogon} ⇒ Scanner
    6   {GreyIP, TRWscan, HCscan, Onepkt, ICMP, Backsc, bogon} ⇒ Scanner
    7   {Onepkt, GreyIP, ICMP, TRWscan, HCscan, TRWnom, bogon, P2P, Unreach, PotOk, Backsc, Large} ⇒ Scanner
    8   {GreyIP, Onepkt, TRWscan, HCscan, Backsc, ICMP, TCP, bogon} ⇒ Scanner
    9   {ICMP, TRWscan, TRWnom, HCscan, InOut, bogon, PotOk} ⇒ Scanner
Backscatter
    10  {Backsc, TRWscan, HCscan, P2P, InOut, PotOk} ⇒ Backscatter
Service Unreachable
    11  {Unreach, TRWscan, HCscan, bogon, P2P} ⇒ Unreachable
Benign P2P Scanning
    12  {P2P, TRWscan, HCscan, bogon} ⇒ P2P
Suspected Benign
    13  {PotOk, Unreach, P2P, TRWnom, bogon} ⇒ Benign
    14  {Large, GreyIP, TRWscan, HCscan, P2P, Unreach, PotOk, ICMP, Backsc, bogon, TRWnom} ⇒ Benign
    15  {TRWnom, GreyIP, HCscan, P2P, Unreach, bogon, Backsc} ⇒ Benign
    16  {ICMP, InOut, TRWscan, HCscan, TRWnom, bogon, PotOk} ⇒ Benign
Bogon
    17  {bogon, TRWscan, HCscan, Backsc} ⇒ Bogon
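Each rule maps a combination of signs to a class. A rough sketch of how such a rule list could be evaluated follows. It assumes that a rule names signs that must be present and signs that must be absent, and that the first matching rule wins; neither assumption is confirmed by the slide. The three rules in `EXAMPLE_RULES` are made up for illustration and are not the classifier's 17 rules.

```python
from typing import FrozenSet, List, NamedTuple, Set


class Rule(NamedTuple):
    number: int
    present: FrozenSet[str]   # signs the flow must carry
    absent: FrozenSet[str]    # signs the flow must not carry
    label: str


# Hypothetical rules, for illustration only (NOT the rules from the slide).
EXAMPLE_RULES: List[Rule] = [
    Rule(1, frozenset({"TRWscan"}), frozenset({"PotOk"}), "Scanner"),
    Rule(2, frozenset({"P2P"}), frozenset({"bogon"}), "P2P"),
    Rule(3, frozenset({"bogon"}), frozenset(), "Bogon"),
]


def classify(signs: Set[str], rules: List[Rule] = EXAMPLE_RULES,
             default: str = "Other") -> str:
    """Return the label of the first rule whose conditions the signs satisfy."""
    for rule in rules:
        if rule.present <= signs and not (rule.absent & signs):
            return rule.label
    return default
```

Under these made-up rules, a flow with signs {TRWscan, Onepkt} is labeled Scanner, while a flow that matches no rule falls into Other.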

SLIDE 12

Data-Sets

◮ Use data from the Swiss academic backbone network (SWITCH)
◮ Analyze the first 400 hours of each February and August between 2004 and 2011
◮ The studied traffic data correspond to:
    ◮ 457 billion flows
    ◮ 7.41 petabytes
    ◮ 9% of the total number of flows

SLIDE 13

Data Sanitization

◮ Double-counting elimination reduces total traffic volume by 32.3%
◮ Defragmentation reduces the number of flows by a fraction ranging between 20.6% and 39.6% for different years
◮ Bi-flow Pairing:
    ◮ For TCP and UDP based on standard 5-tuple
    ◮ For other protocols based on 3-tuple
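Bi-flow pairing is what separates one-way from two-way flows: a flow is two-way if a flow with the reversed key exists. The sketch below illustrates the idea under several assumptions. The 3-tuple is taken to be (source IP, destination IP, protocol), the `Flow` record and function names are hypothetical, and timing constraints between the two directions are ignored.

```python
from typing import Iterable, List, NamedTuple, Tuple


class Flow(NamedTuple):
    src_ip: str
    dst_ip: str
    protocol: int        # IP protocol number: 6 = TCP, 17 = UDP
    src_port: int = 0
    dst_port: int = 0


def pairing_key(f: Flow, reverse: bool = False) -> Tuple:
    """5-tuple key for TCP/UDP, 3-tuple key otherwise; optionally reversed."""
    src, dst = (f.dst_ip, f.src_ip) if reverse else (f.src_ip, f.dst_ip)
    if f.protocol in (6, 17):
        sport, dport = ((f.dst_port, f.src_port) if reverse
                        else (f.src_port, f.dst_port))
        return (src, dst, f.protocol, sport, dport)
    return (src, dst, f.protocol)


def split_one_way(flows: Iterable[Flow]) -> Tuple[List[Flow], List[Flow]]:
    """Split flows into (one_way, two_way) by looking for a reverse flow."""
    flows = list(flows)
    seen = {pairing_key(f) for f in flows}
    one_way = [f for f in flows if pairing_key(f, reverse=True) not in seen]
    two_way = [f for f in flows if pairing_key(f, reverse=True) in seen]
    return one_way, two_way
```

A TCP request with a matching reply ends up in the two-way list, while an unanswered connection attempt to another port stays in the one-way list.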

SLIDE 15

Evolution of One- and Two-way Traffic

◮ One-way flows are a large fraction of all flows:
    ◮ In 2004, 2 out of every 3 flows were one-way
    ◮ From 2007 to 2010, 1 out of every 3 flows was one-way
◮ The number of one-way flows in 2011 is almost equal to that in 2004
◮ The fraction of one-way flows has declined

Figure: Mean flows per 24 h (y-axis, 0e+00 to 8e+06) by period (x-axis, '4.2 to '1.2) for the series Inbound, One-Way, and Two-Way.

SLIDE 16

Composition of One-way Traffic

Class               % of flows   % of pkts   pkts/flow
Scanning            83.5%        62.6%       1.6
P2P applications    6.7%         13.0%       6.8
Unreach services    4.8%         10.1%       4.1
Suspected Benign    2.6%         9.1%        12.1
Other               2.2%         4.7%        4.6
Backscatter         0.3%         0.5%        3.3

◮ The top sources of one-way traffic are scanning, P2P protocols, and unreachable services

Figure: One-way flows per 24 h (y-axis, 0e+00 to 4e+08) by period (x-axis, 2004.01 to 2011.08), broken down into SuspBenign, SrvUnreach, Other, MalScan, Bogon, BenignP2P, and Backscat.

SLIDE 18

Service Availability Monitoring

◮ One-way flows are very useful for service availability monitoring
◮ Traditional service availability monitoring is based on active probing
◮ Advantages of flow-based approach:
    ◮ Provides a tangible assessment of the impact of disruptions
    ◮ Discovers running services without requiring manual configuration
    ◮ Exploits passive measurements

SLIDE 21

Outages and Misconfigurations in ETH Zurich

◮ Examine a week of NetFlow data from the EE Department of ETH Zurich
◮ Found 32 main services (> 99% availability) and 11 transient services
◮ Identified a coinciding global outage
◮ During the identified interval 287,583 unique IP addresses failed to access target services!

Figure: Fraction of failed clients (y-axis, 0.2 to 1) by day of June 2011 (x-axis, 20 to 27) for 993/tcp (tardis.ee), 25/tcp (tranquility.ee), 80/tcp (yosemite.ee), and 25/tcp (smtp.ee).
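The "fraction of failed clients" metric plotted here can be derived from flow records alone. The sketch below shows one plausible way to compute it. It assumes that a client counts as failed for a service on a given day if none of its flows to that service was answered; the record layout and function name are hypothetical, and the slides do not give the exact definition.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical record layout: (day, service, client_ip, answered)
Record = Tuple[int, str, str, bool]


def failed_client_fraction(records: Iterable[Record]) -> Dict[Tuple[int, str], float]:
    """Per (day, service): share of clients that never received a reply."""
    clients = defaultdict(set)   # all clients that contacted the service
    served = defaultdict(set)    # clients with at least one two-way flow
    for day, service, client, answered in records:
        key = (day, service)
        clients[key].add(client)
        if answered:
            served[key].add(client)
    return {key: len(all_clients - served[key]) / len(all_clients)
            for key, all_clients in clients.items()}
```

A client with a mix of answered and unanswered flows is counted as served here, so only clients whose every attempt stayed one-way contribute to the fraction.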

SLIDE 22

Conclusions

◮ Classification scheme for one-way traffic that relies on 18 signs derived from flow data
◮ Observations based on a very large data-set:
    ◮ One-way flows are a large fraction of all flows
    ◮ In terms of flows, the share of one-way traffic has declined since 2004
    ◮ The top sources of one-way traffic are scanning, P2P protocols, and unreachable services
◮ One-way traffic is very useful for assessing the impact of failures

SLIDE 23

Questions?

Contact: fontas@gmail.com

E. Glatz and X. Dimitropoulos. Classifying Internet One-way Traffic. TIK-Report 336, ETH Zurich, May 2012

SLIDE 25

Validation

◮ Collect packet traces from a small campus network
◮ Exploit additional information:
    ◮ Extended host profiles
    ◮ ICMP types and codes
    ◮ TCP flags (Check protocol state machine)
    ◮ DPI-based application identification [4]
    ◮ Precise timestamps

Class Name             Recall [%]   Precision [%]
Malicious Scanning     99.9         99.8
Service Unreachable    99.6         96.1
Benign P2P Scanning    95.3         95.5
Backscatter            62.4         88.4
Suspected Benign       85.1         75.0
Bogon                  40.4         100.0

[4] H. Kim, K. Claffy, M. Fomenkov, D. Barman, M. Faloutsos, and K. Lee. Internet traffic classification demystified: myths, caveats, and the best practices. ACM CoNEXT, 2008
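Recall and precision in the validation table follow the usual definitions: recall is the share of flows truly in a class that the classifier assigns to it, and precision is the share of flows assigned to a class that truly belong to it. A small sketch follows; the function name and the counts in the usage note are made up for illustration and are not the validation data.

```python
from typing import Tuple


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Compute (precision, recall) from per-class counts.

    tp: flows correctly assigned to the class
    fp: flows wrongly assigned to the class
    fn: flows of the class that were assigned elsewhere
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall
```

With hypothetical counts tp=404, fp=0, fn=596, the function returns a precision of 1.0 and a recall of 0.404, the same pattern as the Bogon row: everything labeled Bogon is correct, but many bogon flows are labeled as something else.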

SLIDE 27

Outages and Misconfigurations in ETH Zurich

◮ Found a server that, in total, 2.2 million unique clients could not reach during the studied week!
◮ What was this server? Hint: Switzerland is famous for chocolate, banking, Swiss army knives, and watches
◮ Popular NTP server swisstime.ee.ethz.ch preconfigured in NTP clients and used in NTP “hello world” examples
◮ It was unreachable for 12.9% of its clients, caused by invalid CRC checksums and a filtering policy

SLIDE 28

Impact of the Interval Size

Doubling the interval size:

◮ decreases absolute count metrics by 3-5%.
◮ decreases relative volume metrics by 1.2%, with no further decrease as the interval size increases.

Figure: Variation with respect to the 600 s interval size (y-axis, 0.95 to 1.05) versus interval size [s] (x-axis: 300, 450, 600, 720, 900, 1200) for two-way flows, one-way flows, one-way/total flows, and total flows.

SLIDE 29

Signs

Sign Type / Sign Name: Detection Criterion/Algorithm

Host pair behavior
    End-hosts-communicating: One-way flow between productive host pair
    Limited dialog: One-way flows between unproductive host pair
Remote host behavior
    Service sole reply: no biflow on srcIP ∧ dstPort ≥ 1024 ∧ srcPort < 1024
    Remote scanner 1: TRW algorithm (suspected scanner)
    Remote scanner 2: Host classification (suspected scanner)
    Remote non-scanner: TRW algorithm (suspected regular host)
Local host behavior
    Unused local address: Unpopulated local IP address
    Service unreachable: Unanswered request to local service
    Peer-to-peer: Flow towards local P2P host
Flow feature
    Artifact: UDP/TCP flow with both port numbers = 0
    Single packet: Flow contains one packet only
    Large flow: Flow carries ≥ 10 packets or ≥ 10240 bytes
    Bogon: Source IP belongs to bogon space
    Protocol: IP protocol type of flow
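Several of the detection criteria above are plain predicates over a flow and some host-level state. As an example, the "Service sole reply" criterion can be written down almost literally. The sketch assumes that "no biflow on srcIP" means the remote source address takes part in no two-way flow during the observation interval; the `Flow` record and the `biflow_src_ips` set are hypothetical names.

```python
from typing import NamedTuple, Set


class Flow(NamedTuple):
    src_ip: str
    src_port: int
    dst_port: int


def is_service_sole_reply(flow: Flow, biflow_src_ips: Set[str]) -> bool:
    """no biflow on srcIP AND dstPort >= 1024 AND srcPort < 1024."""
    return (flow.src_ip not in biflow_src_ips
            and flow.dst_port >= 1024
            and flow.src_port < 1024)
```

For instance, a one-way flow from port 80 of a remote host to a high local port, where that remote host appears in no two-way flow, would carry the sign.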