SLIDE 1

Traffic Classification based on Visualization

Zhibin Yu
Realtime Image Processing & Telecommunication Lab.
Kyungpook National University, South Korea

December 8th, 2011

SLIDE 2

Overview

16/12/11

Realtime Image Processing & Telecommunication Lab.

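Is it possible to use face recognition technology to classify network traffic?

Figure A: Face recognition compared with network traffic "face recognition". Just as face images are matched to Person A and Person B, Flow A, Flow B and Flow C are turned into Flow face A, Flow face B and Flow face C and matched to traffic types (HTTP, FTP, P2P, ...).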

SLIDE 3

Contents

  • Introduction
  • Related work
  • Proposed algorithm
  • Evaluation
  • Conclusion

SLIDE 4

Introduction

Figure B: Flow chart

Input traffic flows → Normalization of packet size and interval → Display 2-D images → Image enhancement → Pattern recognition (PCA) → Results

SLIDE 5

Related Work

  • Port-based approaches

– Fast but unreliable

  • Signature-based approaches

– Accurate but inflexible

  • Statistics-based approaches

– Weak on small flows

  • Machine learning approaches

– Accurate but time-consuming

  • Traffic classification metric

$$F = \frac{2\,TP}{2\,TP + FN + FP}$$

– where TP is true positive, FN is false negative and FP is false positive.
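As a small illustration of the metric on this slide, the following Python sketch computes F from raw counts. The function name `f_measure` and the example counts are made up for illustration, and returning 0.0 when all counts are zero is an assumption, not something the slide states.

```python
def f_measure(tp, fn, fp):
    """F = 2*TP / (2*TP + FN + FP), using the counts defined on the slide."""
    denominator = 2 * tp + fn + fp
    if denominator == 0:
        # Assumption: treat the score as 0 when there is nothing to evaluate.
        return 0.0
    return 2 * tp / denominator


# Hypothetical counts for one traffic class: 2*90 / (2*90 + 5 + 15) = 180 / 200
print(f_measure(tp=90, fn=5, fp=15))
```

Both kinds of error (FN and FP) lower the score, so a class is only rated highly when it is rarely missed and rarely confused with other classes.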

SLIDE 6

Proposed Method (1/5)

  • Feature selection

– Packet size
– Packet inter-arrival time

Figure C: Cumulative Distribution of Packet Size and Packet Inter-arrival time using our experiment dataset
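The two selected features can be read directly off a packet trace. The sketch below is one possible way to do it, assuming a flow is available as a time-sorted list of `(timestamp_seconds, size_bytes)` pairs; that input layout and the names `flow_features` and `empirical_cdf` are illustrative assumptions, not part of the slides.

```python
def flow_features(packets):
    """packets: list of (timestamp_seconds, size_bytes), sorted by time.

    Returns (sizes, intervals). The first packet has no predecessor,
    so intervals has one entry fewer than sizes.
    """
    sizes = [size for _, size in packets]
    times = [t for t, _ in packets]
    intervals = [later - earlier for earlier, later in zip(times, times[1:])]
    return sizes, intervals


def empirical_cdf(values):
    """Return sorted (value, fraction of samples <= value) pairs."""
    ordered = sorted(values)
    n = len(ordered)
    cdf = {}
    for i, v in enumerate(ordered):
        # Later duplicates overwrite earlier ones, leaving the fraction <= v.
        cdf[v] = (i + 1) / n
    return sorted(cdf.items())


# Hypothetical four-packet flow
sizes, intervals = flow_features([(0.0, 60), (0.5, 1500), (0.75, 1500), (2.75, 60)])
print(intervals)             # inter-arrival times in seconds
print(empirical_cdf(sizes))  # cumulative distribution of packet size
```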

SLIDE 7

Proposed Method (2/5)

  • Image Normalization

– Definition

$$X = 512 \cdot \frac{\text{Packet\_size}}{\text{MTU}}$$

$$Y = 512 \cdot \left(\frac{\text{Packet\_Interval}}{\text{Max\_Interval}}\right)^{a}$$

– MTU = 1500 bytes
– Max_Interval = 3600 s
– a = 0.1

Figure: each packet becomes a point in a 512*512 image, with X given by packet size and Y by packet inter-arrival time.
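The normalization on this slide can be sketched in a few lines of Python. The constants come from the slide; everything else is an assumption made for illustration: the names `to_pixel` and `flow_to_image`, the truncation to integer pixel indices, the clamping into 0..511, and the choice to store a hit count per pixel.

```python
MTU = 1500             # bytes, from the slide
MAX_INTERVAL = 3600.0  # seconds, from the slide
A = 0.1                # exponent a, from the slide
SIZE = 512             # image side length


def to_pixel(packet_size, interval):
    """Map one packet to (column, row) using the X and Y definitions."""
    # Assumption: clamp both ratios into [0, 1] so outliers stay in the image.
    size_ratio = min(max(packet_size / MTU, 0.0), 1.0)
    time_ratio = min(max(interval / MAX_INTERVAL, 0.0), 1.0)
    x = SIZE * size_ratio
    y = SIZE * time_ratio ** A
    # Assumption: truncate to integers; a ratio of 1.0 lands on the last pixel.
    return min(int(x), SIZE - 1), min(int(y), SIZE - 1)


def flow_to_image(packets):
    """packets: iterable of (packet_size_bytes, interval_seconds).

    Returns a SIZE x SIZE grid (list of rows) of per-pixel hit counts.
    """
    image = [[0] * SIZE for _ in range(SIZE)]
    for packet_size, interval in packets:
        col, row = to_pixel(packet_size, interval)
        image[row][col] += 1
    return image


print(to_pixel(750, 3.6))  # a half-MTU packet arriving 3.6 s after the previous one
```

With a = 0.1 the time axis is strongly compressed: an interval of only one thousandth of Max_Interval already maps to about the middle of the Y axis, which spreads short intervals over most of the image.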

SLIDE 8

Proposed Method (3/5)

  • Image Normalization

– Initialized Images

Figure D: Four local images generated by FTP-data and OICQ flows. (a) FTP-data1. (b) FTP-data2. (c) OICQ1. (d) OICQ2.

SLIDE 9

Proposed Method (4/5)

  • Image enhancement

– Mountain clustering and visualization

$$M_a = \sum_{i=1}^{N} \exp\left(-\frac{\|a - a_i\|^2}{2\sigma^2}\right) \qquad (1)$$

$$B_a = 255 \cdot \left(\frac{M_a}{M_{\max}}\right)^{b}$$

– N is the total number of packets in this flow
– M_a is the mountain height of point a, calculated by equation 1
– b is the parameter which controls the difference between peak and plain
– B_a is the brightness value of point a
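The enhancement step can be illustrated with a brute-force Python sketch: for every pixel a it sums a Gaussian contribution from each packet point, then scales the result to a brightness in 0..255. Several details are assumptions, since the slide does not spell them out: a_i is taken to be the pixel position of the i-th packet, M_max the largest mountain height in the image, and brightness is rounded to the nearest integer. The function name `mountain_brightness` is hypothetical.

```python
import math


def mountain_brightness(points, size=512, sigma=2.0, b=0.3):
    """points: list of (x, y) pixel positions, one per packet.

    Returns a size x size grid (list of rows) of brightness values 0..255.
    Brute force, O(size^2 * N), so keep size small when experimenting.
    """
    heights = [[0.0] * size for _ in range(size)]
    for y in range(size):
        for x in range(size):
            total = 0.0
            for px, py in points:
                dist_sq = (x - px) ** 2 + (y - py) ** 2
                total += math.exp(-dist_sq / (2.0 * sigma ** 2))
            heights[y][x] = total  # mountain height M_a at pixel a = (x, y)

    m_max = max(max(row) for row in heights)
    if m_max == 0.0:
        # No packets: nothing to enhance, return a black image.
        return [[0] * size for _ in range(size)]
    return [[int(round(255.0 * (m / m_max) ** b)) for m in row] for row in heights]


# Tiny example: one packet point in an 8 x 8 image
image = mountain_brightness([(4, 3)], size=8, sigma=2.0, b=0.3)
print(image[3][4])  # the peak pixel gets full brightness, 255
```

Because b is below 1 in the slides' settings, the exponent lifts low mountain values toward the peak, so sparse regions stay visible next to dense ones.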

SLIDE 10

Proposed Method (5/5)

  • Image enhancement

– Mountain clustering


Figure E: Image enhancement after mountain clustering. (a) Original image. (b) Mountain clustering value. (c) Image after enhancement with b=0.3, σ=2.

SLIDE 11

Evaluation (1/5)

  • Evaluation

– Data Description

Table 1: Data description

Traffic name      Average flow size (Kbytes)  Average interval (Seconds)  Average packet size (Bytes)
HTTP              37.9                        0.759                       646
FTP               7.64                        11.621                      84
FTP-data          21386.7                     0.0666                      576
OICQ (Chatting)   18.9                        3.531                       200
POP3              35.3                        0.246                       317
SMTP              19.9                        0.0835                      464
Web-download      21734.4                     0.0425                      1142
PPStream (P2PTV)  1911.6                      0.284                       371

SLIDE 12

Evaluation (2/5)

  • Performance

– Comparison with different parameters

Figure F: (a) Initialized image generated by a PPS flow (11.4 MB). (b) b=0.1, σ=4. (c) b=0.2, σ=4. (d) b=0.5, σ=4. (e) b=0.2, σ=1. (f) b=0.2, σ=2. (g) b=0.2, σ=4.

SLIDE 13

Evaluation (3/5)

  • Performance

– Comparison with different flow sizes


Figure G: Performance with different flow sizes. (a) Initialized image generated by an elephant FTP-data flow (53,401 KB). (b) Initialized image generated by a mice FTP-data flow (4.43 KB). (c) Image (a) after enhancement with parameters b=0.3, σ=2. (d) Image (b) after enhancement with parameters b=0.3, σ=2.

SLIDE 14

Evaluation (4/5)

  • Performance

– Performance on encryption detection
– 386 FTP-data flows based on the SFTP protocol using SSH2 are tested

Figure H: Performance of encrypted traffic classification. (a) An image generated from an SSH flow (13.4 MB) with parameters b=0.3, σ=2. (b) An image generated from an FTP-data flow (13.0 MB) with parameters b=0.3, σ=2. (c) Recognition result by PCA.

SLIDE 15

Evaluation (5/5)

  • Evaluation

– Recognition result

Figure 8: Anomalies identified both methods, sorted by traffic

F-measure (%)                        b=0.2                   b=0.3                   b=0.4                   b=0.5
Traffic type      Initialized image  σ=1     σ=2     σ=4     σ=1     σ=2     σ=4     σ=1     σ=2     σ=4     σ=1     σ=2     σ=4
FTP               77.78              91.67   97.22   97.12   91.67   97.22   97.72   91.67   97.22   97.22   91.67   97.22   97.22
FTP-data          59.76              90.28   81.94   78.32   91.39   88.67   83.89   93.06   91.39   88.33   91.38   93.06   91.11
HTTP              86.39              73.06   75.86   60.56   74.72   80.55   56.94   76.94   86.11   72.22   78.89   87.22   80.56
OICQ (Chatting)   83.39              75.02   94.44   86.11   77.78   99.08   98.79   86.11   98.33   100     90.00   98.33   98.73
POP3              76.12              97.21   100     98.33   99.39   100     100     97.22   91.67   98.33   88.89   88.89   90.12
SMTP              94.44              100     99.27   100     100     100     100     98.33   95.00   100     92.22   93.33   95.69
PPStream (P2PTV)  80.56              91.67   97.22   96.72   94.44   97.22   97.92   94.44   97.22   100     86.11   97.22   98.77
Web-download      87.23              92.21   90.57   89.44   90.56   90.56   89.44   90.56   90.56   89.44   85.83   88.06   89.44
Average           80.72              88.76   92.07   87.95   90.04   94.16   90.59   91.04   93.43   93.19   88.13   92.91   92.71
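The "Average" row appears to be the unweighted mean of the eight per-traffic-type scores. The short Python check below reproduces it for the b=0.3, σ=2 column, using the values copied from the results table; the variable names are illustrative.

```python
# F-measure (%) per traffic type for b=0.3, sigma=2, copied from the results table
f_measure_b03_s2 = {
    "FTP": 97.22,
    "FTP-data": 88.67,
    "HTTP": 80.55,
    "OICQ (Chatting)": 99.08,
    "POP3": 100.0,
    "SMTP": 100.0,
    "PPStream (P2PTV)": 97.22,
    "Web-download": 90.56,
}

# Unweighted mean over traffic types: every class counts equally,
# regardless of how many flows it contains.
average = sum(f_measure_b03_s2.values()) / len(f_measure_b03_s2)
print(average)  # close to the 94.16 reported in the Average row
```

This column has the highest average in the table, and HTTP is the class that pulls it down the most.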

SLIDE 16

Conclusion

  • We proposed a simple method to classify network traffic based on visualization and pattern recognition.
  • The method achieves an F-measure of more than 93% in classification.
  • The proposed method is able to detect encrypted flows.
  • The algorithm can reduce the gap between elephant flows and mice flows.

SLIDE 17

Thank you!
