Combining Machine and Automata Learning for Network Traffic Classification

Zeynab Sabahi, Fatemeh Ghassemi, and Zahra Alimadadi. TTCS 2020, Tehran, Iran.


  1. In the name of God, the Most Gracious, the Most Merciful. Combining Machine and Automata Learning for Network Traffic Classification. Zeynab Sabahi, Fatemeh Ghassemi, and Zahra Alimadadi. TTCS 2020, Tehran, Iran.

  2. Network Traffic Classification, What & Why?
     Given an interleaved packet trace, we want to detect which applications are running. This supports network management tasks:
     - Anomaly detection
     - Balancing bandwidth usage
     - Firewalling, gateways, ...

  3. Network Traffic Classification, How?
     - Port-based classification: inefficient, because applications use random or non-standard ports.
     - Payload inspection: useless on encrypted traffic.
     - Statistical methods: use flow/packet statistical features. Fast but less accurate; they ignore the temporal relation among flows.
     - Behavioral classification: specific to the category of application.

  4. Our solution
     Intuition:
     - A network application is a program that calls different well-known protocols such as HTTP, TCP, SSL, and TLS.
     - Each application has its own network communication language when calling these well-known protocols.
     [Figure: a user application calling HTTP, TLS, and TCP]

  5. Research Goals
     - Automatically learning the network language of each application whose source code is not available
     - Classifying an interleaved packet trace of applications according to the learned languages

  6. Research Goals: the network language learned for each application is a k-TSS language.

  7. - Introduction
     - Preliminary: k-TSS Language
     - NeTLang Framework
     - Evaluation
     - Conclusion

  8. Formal Foundation: k-TSS Language
     - A k-Testable language in the Strict Sense (k-TSS) is a regular language defined by a window of size k. Its learning is decidable.
     - Words are determined by three sets of allowed prefixes, suffixes, and segments.

  9. Formal Foundation: k-TSS Language. Example with a window of size 3 over the word abaaabababb; the window is shown in brackets: [aba]aabababb. Segments = {aba}, Prefixes = {}, Suffixes = {}

  10. Formal Foundation: k-TSS Language. a[baa]abababb. Segments = {aba, baa}, Prefixes = {}, Suffixes = {}

  11. Formal Foundation: k-TSS Language. ab[aaa]bababb. Segments = {aba, baa, aaa}, Prefixes = {}, Suffixes = {}

  12. Formal Foundation: k-TSS Language. Once the window has covered the whole word, the prefix is read off: [ab]aaabababb. Segments = {aba, baa, aaa, aab, bab, abb}, Prefixes = {ab}, Suffixes = {}

  13. Formal Foundation: k-TSS Language. Finally the suffix is read off: abaaababa[bb]. Segments = {aba, baa, aaa, aab, bab, abb}, Prefixes = {ab}, Suffixes = {bb}
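The window-of-size-3 example can be reproduced with a few lines of Python. This is only a sketch of the sliding-window idea for a single word; the helper name `windows` is hypothetical and does not come from the slides.

```python
def windows(word, k):
    """Return every length-k window of `word`, from left to right."""
    return [word[i:i + k] for i in range(len(word) - k + 1)]


word = "abaaabababb"  # the example word from the slides
k = 3

segments = set(windows(word, k))  # allowed segments (length k)
prefix = word[:k - 1]             # allowed prefix (length k-1)
suffix = word[len(word) - (k - 1):]  # allowed suffix (length k-1)

print(sorted(segments))
print(prefix, suffix)
```

Using a set for the segments drops the repeated windows (`aba` and `bab` each occur more than once in the word), which is why only six distinct segments remain.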

  14. Formal Definition of k-TSS Language
      - Definition 1 (k-test vector). Let $k > 0$. A k-test vector is a 5-tuple $Z = \langle \Sigma, I, F, T, C \rangle$ where:
        - $I \subseteq \Sigma^{k-1}$ is a set of allowed prefixes
        - $F \subseteq \Sigma^{k-1}$ is a set of allowed suffixes
        - $T \subseteq \Sigma^{k}$ is a set of allowed segments
        - $C \subseteq \Sigma^{<k}$ is a set of allowed short strings
      - Definition 2 (k-TSS language). Let $Z = \langle \Sigma, I, F, T, C \rangle$ be a k-test vector, for some $k > 0$. Then
        $L(Z) = [(I\Sigma^{*} \cap \Sigma^{*}F) - \Sigma^{*}(\Sigma^{k} - T)\Sigma^{*}] \cup C$

  15. Formal Definition of k-TSS Language. Open question about the alphabet $\Sigma$ in Definition 1: What is it? How should it be defined for the network domain?
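Definition 2 can be read as a membership test: a word is accepted if it is an allowed short string, or if it starts with an allowed prefix, ends with an allowed suffix, and contains no forbidden segment. The sketch below follows that reading for words given as plain strings over single-character symbols; the class name `KTestVector`, the function `accepts`, and the choice of an empty set of short strings in the example are assumptions made for illustration.

```python
from typing import FrozenSet, NamedTuple


class KTestVector(NamedTuple):
    k: int
    alphabet: FrozenSet[str]  # Sigma
    prefixes: FrozenSet[str]  # I, strings of length k-1
    suffixes: FrozenSet[str]  # F, strings of length k-1
    segments: FrozenSet[str]  # T, strings of length k
    short: FrozenSet[str]     # C, strings shorter than k


def accepts(z, word):
    """Check whether `word` is in L(Z) as given by Definition 2."""
    if word in z.short:  # the "union C" part
        return True
    m = z.k - 1
    if len(word) < m:
        return False
    # I Sigma* intersected with Sigma* F: allowed prefix and allowed suffix
    if word[:m] not in z.prefixes or word[len(word) - m:] not in z.suffixes:
        return False
    # minus Sigma* (Sigma^k - T) Sigma*: every length-k window must be allowed
    return all(word[i:i + z.k] in z.segments
               for i in range(len(word) - z.k + 1))


# Vector from the window-of-size-3 example; C is assumed to be empty here.
Z = KTestVector(
    k=3,
    alphabet=frozenset("ab"),
    prefixes=frozenset({"ab"}),
    suffixes=frozenset({"bb"}),
    segments=frozenset({"aba", "baa", "aaa", "aab", "bab", "abb"}),
    short=frozenset(),
)

print(accepts(Z, "abaaabababb"))  # the word the vector was read from
print(accepts(Z, "abbb"))         # contains the segment "bbb", not in T
```

The language is usually larger than the sample it was learned from: any word that reuses only the observed prefix, suffix, and segments is accepted, for example `abb`.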

  16. - Introduction
      - Preliminary: k-TSS Language
      - NeTLang Framework
      - Evaluation
      - Conclusion

  17. Translating Network Concepts to Automata Learning
      Intuition: some packets always appear together, due to the control phase of protocols or the specific functionality of an application.
      - A sequence of related packets: a symbol of the alphabet
      - A packet trace of an application: a word of the language
      - From the set of all packet traces of an application, its k-TSS language can be learned.

  18. NeTLang Framework
      - Network Traffic Language Learner: NeTLang
      - Architectural View: [Figure: NeTLang architecture]

  19. NeTLang Framework. Architectural View: [Figure: NeTLang architecture with three numbered modules: 1) Trace Generator, 2) Language Learner, 3) Classifier]

  20. 1) Trace Generator
      - Colors in the figure denote the protocol.
      - The clustering algorithm is k-means++.
      - "Stats" denotes statistical features based on the length, number, and inter-arrival time (IAT) of packets.
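As a rough illustration of the "Stats" step, the sketch below computes three features from the length, number, and IAT of the packets in one group of related packets. The exact feature set used by NeTLang is not given on the slide, so the three features, the function names, and the `PROTOCOL-CLUSTER` symbol format (guessed from symbols such as `T-1` in the running example) are all assumptions. The clustering itself is left out.

```python
from statistics import mean


def flow_stats(packets):
    """Compute simple statistics for a list of (timestamp_s, length_bytes)
    pairs that is already sorted by timestamp."""
    times = [t for t, _ in packets]
    lengths = [n for _, n in packets]
    # Inter-arrival times between consecutive packets
    iats = [later - earlier for earlier, later in zip(times, times[1:])]
    return {
        "count": len(packets),
        "mean_length": mean(lengths),
        "mean_iat": mean(iats) if iats else 0.0,
    }


def symbol(protocol, cluster_id):
    """Hypothetical symbol naming: protocol abbreviation plus cluster id."""
    return f"{protocol}-{cluster_id}"


stats = flow_stats([(0.0, 100), (0.5, 300), (1.5, 200)])
print(stats)
print(symbol("T", 1))
```

In a full pipeline, feature vectors like `stats` would be the input to the clustering step, and the resulting cluster id would be combined with the protocol to name the symbol.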

  21. 2) Language Learner
      - The k-TSS vector is learned by sliding a parser window of size k over the traces.
      - For the running example (k=3):
        - Σ = {H-2, SL-2, SL-3, SL-4, SL-5, T-1, T-10, TL-2, U-0, U-1}
        - T = {SL-2 T-1 U-0, SL-4 SL-2 T-1, T-1 SL-5 H-2, T-1 U-0 U-1, T-10 TL-2 T-1, TL-2 T-1 SL-5, U-0 U-1 SL-3}
        - I = {SL-4 SL-2, T-10 TL-2}
        - F = {SL-5 H-2, U-1 SL-3}
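The learning step can be sketched as a single pass over a set of symbolic traces. The slide does not list the traces behind the running example, so the two traces below are hypothetical: they were chosen only because, with k=3, they yield exactly the Σ, T, I, and F shown above. The function name `learn_ktss` is also an assumption.

```python
def learn_ktss(traces, k):
    """Learn a k-test vector from symbolic traces (sequences of symbols)."""
    sigma, prefixes, suffixes, segments, short = set(), set(), set(), set(), set()
    for trace in traces:
        trace = tuple(trace)
        sigma.update(trace)
        if len(trace) < k:
            short.add(trace)  # too short to contain a full window
        if len(trace) >= k - 1:
            prefixes.add(trace[:k - 1])
            suffixes.add(trace[len(trace) - (k - 1):])
        for i in range(len(trace) - k + 1):
            segments.add(trace[i:i + k])  # slide the window of size k
    return {"Sigma": sigma, "I": prefixes, "F": suffixes, "T": segments, "C": short}


# Hypothetical traces, consistent with the vector of the running example.
traces = [
    "SL-4 SL-2 T-1 U-0 U-1 SL-3".split(),
    "T-10 TL-2 T-1 SL-5 H-2".split(),
]
z_app = learn_ktss(traces, k=3)

for segment in sorted(z_app["T"]):
    print(" ".join(segment))
```

Because the sets only grow as more traces are read, the vector can be updated incrementally when new traces of the same application become available.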

  22. 3) Classifier. Inputs: the automata of the applications (App1, App2, ...) and the interleaved packet trace.

  23. 3) Classifier. The trace generator module is used to split the interleaved packet trace into symbolic sub-traces by timing features.

  24. 3) Classifier. Each sub-trace s1 is compared against the automata of the applications (App1, App2, ...). Automata word inclusion is not a suitable approach, because sub-traces are incomplete and the network is noisy.

  25. 3) Classifier. Window-based similarity: the k-test vector of a sub-trace, $Z(s1) = \langle \Sigma, I, F, T \rangle$, is compared with the vector of each application, for example $Z(App_1) = \langle \Sigma_1, I_1, F_1, T_1 \rangle$.

  26. 3) Classifier. Window-based similarity uses a percentage change metric:
      $\Delta T = \frac{|T - T_1|}{|T|}$, $\Delta T_1 = \frac{|T_1 - T|}{|T_1|}$, $\Delta\Sigma = \frac{|\Sigma - \Sigma_1|}{|\Sigma|}$, $\Delta I = \frac{|I - I_1|}{|I|}$, $\Delta F = \frac{|F - F_1|}{|F|}$
      distance(s1, App1) = $\Delta T\ \Delta T_1\ \Delta\Sigma\ \Delta I\ \Delta F$ (the five components, written in this order)

  27. 3) Classifier. In general, for a trace w and an application $App_i$:
      $D(Z(w), Z(App_i)) = \Delta T\ \Delta T_i\ \Delta\Sigma\ \Delta I\ \Delta F$

  28. 3) Classifier. The distance from sub-trace s1 to every application is computed: distance(s1, App1), distance(s1, App2), ... If the minimum is distance(s1, Appj), then Class(s1) = $App_j$.

  29. 3) Classifier. Formally, for the set of applications $A$:
      $\mathrm{Class}(w) = j$ if $D(L(w), L(App_j)) = \min_{App_i \in A} D(L(w), L(App_i))$
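The classification rule can be sketched as follows. Each Δ is computed as the share of elements of one set that is missing from the other, and the application with the smallest distance is chosen. How the five components are combined into a single comparable value is not fully spelled out on the slides; this sketch assumes they are compared as a tuple in the order ΔT, ΔT_i, ΔΣ, ΔI, ΔF, so earlier components dominate. The function names, the toy vectors, and the handling of empty sets are likewise assumptions.

```python
def percent_change(a, b):
    """|a - b| / |a|: share of the elements of `a` that are missing from `b`.
    Returns 0.0 for an empty `a` (an assumption, to avoid dividing by zero)."""
    return len(a - b) / len(a) if a else 0.0


def distance(z_w, z_app):
    """Distance between the vector of a trace and the vector of an application,
    as the tuple (dT, dT_i, dSigma, dI, dF)."""
    return (
        percent_change(z_w["T"], z_app["T"]),
        percent_change(z_app["T"], z_w["T"]),
        percent_change(z_w["Sigma"], z_app["Sigma"]),
        percent_change(z_w["I"], z_app["I"]),
        percent_change(z_w["F"], z_app["F"]),
    )


def classify(z_w, apps):
    """Return the name of the application with the minimum distance."""
    return min(apps, key=lambda name: distance(z_w, apps[name]))


# Toy vectors over single-character symbols, for illustration only.
z_w = {"Sigma": {"a", "b"}, "I": {"ab"}, "F": {"bb"}, "T": {"aba", "abb"}}
apps = {
    "App1": {"Sigma": {"a", "b"}, "I": {"ab"}, "F": {"bb"}, "T": {"aba", "abb", "bab"}},
    "App2": {"Sigma": {"c"}, "I": {"cc"}, "F": {"cc"}, "T": {"ccc"}},
}

print(distance(z_w, apps["App1"]))
print(classify(z_w, apps))
```

Unlike word inclusion, this comparison still produces a usable ranking when a sub-trace contains a few segments the application never produced.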

  30. Classifier Result for the Running Example
      - Z(w = SL-4 SL-2 T-10 TL-2 T-1 U-2):
        - Σ = {SL-2, TL-2, T-1, U-2, SL-4, T-10}
        - T = {SL-4 SL-2 T-10, SL-2 T-10 TL-2, T-10 TL-2 T-1, TL-2 T-1 U-2}
        - I = {SL-4 SL-2}
        - F = {T-1 U-2}
      - Z(App):
        - Σ' = {H-2, SL-2, SL-3, SL-4, SL-5, T-1, T-10, TL-2, U-0, U-1}
        - T' = {SL-2 T-1 U-0, SL-4 SL-2 T-1, T-1 SL-5 H-2, T-1 U-0 U-1, T-10 TL-2 T-1, TL-2 T-1 SL-5, U-0 U-1 SL-3}
        - I' = {SL-4 SL-2, T-10 TL-2}
        - F' = {SL-5 H-2, U-1 SL-3}
      - ΔT = 0.75, ΔT_i = 0.85, ΔΣ = 0.16, ΔI = 0, ΔF = 1
      - D(L(w), L(app)) = 7484160099
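The five Δ values can be checked by hand, assuming each one is the percentage change defined on the classifier slides, with T', Σ', I', F' belonging to the application. Only one segment of T (T-10 TL-2 T-1) also occurs in T', only U-2 is missing from Σ', the single prefix of w is in I', and the single suffix of w is not in F':

```latex
\begin{align*}
\Delta T      &= \frac{|T - T'|}{|T|}                = \frac{3}{4} = 0.75 \\
\Delta T_i    &= \frac{|T' - T|}{|T'|}               = \frac{6}{7} \approx 0.857 \\
\Delta \Sigma &= \frac{|\Sigma - \Sigma'|}{|\Sigma|} = \frac{1}{6} \approx 0.167 \\
\Delta I      &= \frac{|I - I'|}{|I|}                = \frac{0}{1} = 0 \\
\Delta F      &= \frac{|F - F'|}{|F|}                = \frac{1}{1} = 1
\end{align*}
```

The slide reports 0.85 and 0.16 for the second and third values, which suggests they were truncated to two decimals rather than rounded.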

  31. - Introduction
      - Preliminary: k-TSS Language
      - NeTLang Framework
      - Evaluation
      - Conclusion

  32. Dataset Description
      We divided the pcaps into three sets:
      - Train: 65%
      - Validation: 15%
      - Test: 20%
      Metrics:
      - Precision (P)
      - Recall (R)
      - F1-Measure $= \frac{2 \cdot P \cdot R}{P + R}$
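For reference, the three metrics can be computed from per-class counts as in the sketch below. The function name and the convention of returning 0.0 when a denominator is zero are assumptions, and the counts in the example are made up for illustration, not taken from the evaluation.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from true-positive, false-positive,
    and false-negative counts for one class."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up counts, for illustration only.
p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=2)
print(p, r, f1)
```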

  33. Evaluation Results: best configurations on the validation set
      Parameter         | Application Identification | Traffic Characterization
      Session Threshold | 15                         | 15
      Inactive Timeout  | 5                          | 15
      Flow Duration     | 10                         | 10
      k                 | 3                          | 3

  34. Comparison with Statistical Classifiers: Application Identification. [Figure: precision, recall, and F1-measure]
