Combining Machine and Automata Learning for Network Traffic Classification
Zeynab Sabahi, Fatemeh Ghassemi, and Zahra Alimadadi
TTCS 2020, Tehran, Iran
In the name of God, the Most Gracious, the Most Merciful
1
Network Traffic Classification, What & Why?
2
For the network management tasks:
...
[Figure: bit stream 010011011001111 at a gateway]
3
Port-based classification:
Ineffective when applications use random or non-standard ports
Payload inspection:
Useless for encrypted traffic
Statistical methods (flow/packet statistical features):
Fast but less accurate
Ignore the temporal relations among flows
Behavioral classification:
Specific to the category of application
Intuition:
4
User
HTTP TLS TCP
Learning the network language for each application that we
Classifying interleaved packet traces of applications
5
k-TSS language
7
k-Testable language in the Strict Sense is a regular k-size
Words are determined by three sets of allowed substrings: prefixes, suffixes, and segments
8
Example with a window of size 3, sliding step by step:
Segments = {aba} Prefixes = {} Suffixes = {}
Segments = {aba, baa} Prefixes = {} Suffixes = {}
Segments = {aba, baa, aaa} Prefixes = {} Suffixes = {}
Segments = {aba, baa, aaa, aab, bab, abb} Prefixes = {ab} Suffixes = {}
Segments = {aba, baa, aaa, aab, bab, abb} Prefixes = {ab} Suffixes = {bb}
13
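Definition 1 (k-test vector)
Let k > 0. A k-test vector is a 5-tuple $Z = \langle \Sigma, I, F, T, C \rangle$ where:
$\Sigma$ is a finite alphabet
$I \subseteq \Sigma^{k-1}$ is a set of allowed prefixes
$F \subseteq \Sigma^{k-1}$ is a set of allowed suffixes
$T \subseteq \Sigma^{k}$ is a set of allowed segments
$C \subseteq \Sigma^{<k}$ is a set of allowed short strings
Definition 2 (k-TSS Language)
Let $Z = \langle \Sigma, I, F, T, C \rangle$ be a k-test vector, for some k > 0. The language associated with Z is
$L(Z) = [(I\Sigma^{*} \cap \Sigma^{*}F) - \Sigma^{*}(\Sigma^{k} - T)\Sigma^{*}] \cup C$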
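Definition 2 can be read as a membership test. The following is a minimal Python sketch, not the authors' implementation. It takes a k-test vector in the standard notation Z = <Σ, I, F, T, C> (prefixes I, suffixes F, segments T, short strings C) and checks one word against it. The names `in_ktss_language` and `as_tuples` and the sample words are illustrative assumptions; the sets are the final ones from the window-of-size-3 example.

```python
def as_tuples(strings):
    """Turn strings such as "aba" into tuples of symbols such as ('a', 'b', 'a')."""
    return {tuple(s) for s in strings}


def in_ktss_language(word, k, prefixes, suffixes, segments, short_strings):
    """Return True if `word` is in L(Z) = [(I S* & S* F) - S*(S^k - T)S*] | C."""
    w = tuple(word)
    # Words listed explicitly in C are accepted.
    if w in short_strings:
        return True
    # A word shorter than k-1 cannot start with a prefix from I.
    if len(w) < k - 1:
        return False
    # The word must start with an allowed prefix and end with an allowed suffix.
    if w[:k - 1] not in prefixes or w[len(w) - (k - 1):] not in suffixes:
        return False
    # Every window of length k must be an allowed segment.
    return all(w[i:i + k] in segments for i in range(len(w) - k + 1))


# Sets taken from the window-of-size-3 example on the slides.
K = 3
PREFIXES = as_tuples(["ab"])
SUFFIXES = as_tuples(["bb"])
SEGMENTS = as_tuples(["aba", "baa", "aaa", "aab", "bab", "abb"])
SHORT = set()  # the example lists no short strings

if __name__ == "__main__":
    for candidate in ["ababb", "abbb", "baabb"]:  # hypothetical test words
        print(candidate, in_ktss_language(candidate, K, PREFIXES, SUFFIXES, SEGMENTS, SHORT))
```

In this sketch "ababb" is accepted, "abbb" is rejected because the segment bbb is not allowed, and "baabb" is rejected because ba is not an allowed prefix.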
14
What is the alphabet Σ? How should it be defined for the network domain?
16
Intuition: some packets always appear together due to the control phase of protocols or the specific functionality of an application
17
Given the set of all packet traces of an application, its k-TSS language can be learned, where:
A packet trace of an application: a word of the language
A sequence of related packets: a symbol of the alphabet
18
Network Traffic Language Learner (NeTLang): architectural view
[Figure: NeTLang architecture, with parts labeled 1, 2, 3]
20
By moving a k-window sliding parser the k-TSS vector is
For the running example (k = 3):
Σ = {H-2, SL-2, SL-3, SL-4, SL-5, T-1, T-10, TL-2, U-0, U-1}
T = {SL-2 T-1 U-0, SL-4 SL-2 T-1, T-1 SL-5 H-2, T-1 U-0 U-1, T-10 TL-2 T-1, TL-2 T-1 SL-5, U-0 U-1 SL-3}
I = {SL-4 SL-2, T-10 TL-2}
F = {SL-5 H-2, U-1 SL-3}
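The sliding-window construction can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the NeTLang code: `learn_ktss` is a hypothetical name, and the two sample traces are reconstructed from the sets of the running example (the traces themselves are not shown on the slides).

```python
def learn_ktss(words, k):
    """Build a k-test vector (Sigma, I, F, T, C) from sample words.

    Each word is a sequence of symbols; prefixes, suffixes and segments
    are stored as tuples of symbols.
    """
    sigma, prefixes, suffixes, segments, short = set(), set(), set(), set(), set()
    for word in words:
        w = tuple(word)
        sigma.update(w)
        if len(w) < k:
            short.add(w)                          # C: words shorter than k
        if len(w) >= k - 1:
            prefixes.add(w[:k - 1])               # I: first k-1 symbols
            suffixes.add(w[len(w) - (k - 1):])    # F: last k-1 symbols
        for i in range(len(w) - k + 1):
            segments.add(w[i:i + k])              # T: every window of size k
    return sigma, prefixes, suffixes, segments, short


# Assumption: two traces consistent with the sets of the running example.
TRACES = [
    "SL-4 SL-2 T-1 U-0 U-1 SL-3".split(),
    "T-10 TL-2 T-1 SL-5 H-2".split(),
]

if __name__ == "__main__":
    sigma, I, F, T, C = learn_ktss(TRACES, k=3)
    print("Sigma =", sorted(sigma))
    print("I =", sorted(" ".join(p) for p in I))
    print("F =", sorted(" ".join(s) for s in F))
    print("T =", sorted(" ".join(t) for t in T))
```

With these two traces the sketch yields the same Σ, T, I and F as listed for the running example, and an empty set of short strings.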
22
The interleaved packet trace is classified against the automata of the applications (App1, App2).
The trace generator module divides the interleaved packet trace into symbolic sub-traces using timing features.
Each sub-trace s1 is compared with the automaton of each application by a window-based similarity over their k-test vectors:
$Z(\mathrm{App1}) = \langle \Sigma_1, I_1, F_1, T_1 \rangle$
$Z(s1) = \langle \Sigma, I, F, T \rangle$
26
Percentage change metric:
$\Delta_T = \frac{|T - T_1|}{|T|}, \quad \Delta_{T_1} = \frac{|T_1 - T|}{|T_1|}, \quad \Delta_\Sigma = \frac{|\Sigma - \Sigma_1|}{|\Sigma|}, \quad \Delta_I = \frac{|I - I_1|}{|I|}, \quad \Delta_F = \frac{|F - F_1|}{|F|}$
$distance(s1, \mathrm{App1}) = \Delta_T \, \Delta_{T_1} \, \Delta_\Sigma \, \Delta_I \, \Delta_F$ (the five values written in this order)
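The five percentage-change terms are plain set computations. The sketch below is a hedged illustration, assuming each term is the size of a set difference divided by the size of the first set. The application vector is the one from the running example; the sub-trace `TRACE` is hypothetical, and `k_test_vector` and `window_deltas` are illustrative names.

```python
def k_test_vector(word, k):
    """k-test vector of one word (assumes len(word) >= k); items are space-joined strings."""
    join = " ".join
    return {
        "Sigma": set(word),
        "I": {join(word[:k - 1])},
        "F": {join(word[len(word) - (k - 1):])},
        "T": {join(word[i:i + k]) for i in range(len(word) - k + 1)},
    }


def change(a, b):
    """Fraction of the elements of `a` that are missing from `b` (0.0 for empty `a`, an assumption)."""
    return len(a - b) / len(a) if a else 0.0


def window_deltas(trace, app):
    """Return (delta_T, delta_T_app, delta_Sigma, delta_I, delta_F)."""
    return (
        change(trace["T"], app["T"]),
        change(app["T"], trace["T"]),
        change(trace["Sigma"], app["Sigma"]),
        change(trace["I"], app["I"]),
        change(trace["F"], app["F"]),
    )


# Application vector from the running example (k = 3).
APP = {
    "Sigma": {"H-2", "SL-2", "SL-3", "SL-4", "SL-5", "T-1", "T-10", "TL-2", "U-0", "U-1"},
    "I": {"SL-4 SL-2", "T-10 TL-2"},
    "F": {"SL-5 H-2", "U-1 SL-3"},
    "T": {"SL-2 T-1 U-0", "SL-4 SL-2 T-1", "T-1 SL-5 H-2", "T-1 U-0 U-1",
          "T-10 TL-2 T-1", "TL-2 T-1 SL-5", "U-0 U-1 SL-3"},
}
# Hypothetical sub-trace, chosen only for illustration.
TRACE = "T-10 TL-2 T-1 U-2 U-0 U-1".split()

if __name__ == "__main__":
    print(window_deltas(k_test_vector(TRACE, 3), APP))
```

For this hypothetical sub-trace the terms come out as 3/4, 6/7, 1/6, 0 and 1: three of its four segments and one of its six symbols are unknown to the application, its prefix is known, and its suffix is not.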
27
Sub-trace s1 is compared with the automaton of every application: distance(s1, App1), distance(s1, App2).
Class(s1) = App_j, where distance(s1, App_j) is the minimum distance.
$Class(w) = j \ \text{if} \ D(L(w), L(App_j)) = \min_{App_i \in A} D(L(w), L(App_i))$
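The minimum-distance rule can be sketched as follows. This is an illustration, not the NeTLang implementation. It assumes the distance is the tuple of five percentage-change terms compared lexicographically (first term most significant), and the two application vectors and the sub-trace vector are small made-up examples.

```python
def change(a, b):
    """Fraction of the elements of `a` that are missing from `b` (0.0 for empty `a`, an assumption)."""
    return len(a - b) / len(a) if a else 0.0


def distance(trace, app):
    """Tuple distance; Python compares tuples lexicographically (an assumed ordering)."""
    return (
        change(trace["T"], app["T"]),
        change(app["T"], trace["T"]),
        change(trace["Sigma"], app["Sigma"]),
        change(trace["I"], app["I"]),
        change(trace["F"], app["F"]),
    )


def classify(trace, apps):
    """Return the name of the application whose vector is closest to the sub-trace."""
    return min(apps, key=lambda name: distance(trace, apps[name]))


# Hypothetical vectors, for illustration only.
APPS = {
    "App1": {"Sigma": {"a", "b"}, "I": {"a b"}, "F": {"b a"}, "T": {"a b a", "b a b"}},
    "App2": {"Sigma": {"a", "c"}, "I": {"c a"}, "F": {"a c"}, "T": {"c a c", "a c a"}},
}
SUB_TRACE = {"Sigma": {"a", "b"}, "I": {"a b"}, "F": {"b a"}, "T": {"a b a"}}

if __name__ == "__main__":
    print(classify(SUB_TRACE, APPS))
```

Here the sub-trace shares its only segment with App1 and none with App2, so the first term alone (0 versus 1) already decides the class.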
31
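Example: $\Delta_T = 0.75$, $\Delta_{T_i} = 0.85$, $\Delta_\Sigma = 0.16$, $\Delta_I = 0$, $\Delta_F = 1$, with D(L(w), L(app)) = 7484160099.
[Figure residue: fragments of the k-test vectors of the sub-trace w and of the application. The sub-trace's segment set ends with "…2 T-1, TL-2 T-1 U-2}"; the application's Σ and T end as in the running example.]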
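The slides do not say how the five values become the single number 7484160099. One encoding that reproduces it is to scale each value to a two-digit code with round(99 · Δ) and concatenate the codes in order. This is purely an assumption made for illustration; `two_digit_code` is a hypothetical name and the actual NeTLang encoding may differ.

```python
def two_digit_code(deltas):
    """Concatenate round(99 * delta) as two-digit codes (assumed encoding, not from the slides)."""
    return "".join(f"{round(99 * d):02d}" for d in deltas)


# Values as displayed on the slide, in the order of the distance definition.
SLIDE_DELTAS = [0.75, 0.85, 0.16, 0, 1]

if __name__ == "__main__":
    print(two_digit_code(SLIDE_DELTAS))
```

With fixed-width codes, comparing the concatenated numbers is the same as comparing the five values in order of priority, which would make the first term the most significant one.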
32
33
We divided the pcaps into three sets:
Metrics: precision (P), recall (R), and F1-measure, $F_1 = \frac{2PR}{P + R}$
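The metrics named on the result slides (precision, recall, F1-measure) follow the usual definitions and can be computed from per-class counts. The sketch below uses those standard formulas; the function name and the counts in the usage example are hypothetical and are not results from the presentation.

```python
def precision_recall_f1(tp, fp, fn):
    """Standard precision, recall and F1 from true-positive, false-positive and false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical counts for one class.
    p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=2)
    print(f"P={p:.2f} R={r:.2f} F1={f1:.2f}")
```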
34
The best configurations on the validation set:
Parameter           Application Identification   Traffic Characterization
Session Threshold   15                           15
Inactive Timeout    5                            15
Flow Duration       10                           10
k                   3                            3
35
[Figure: Precision, Recall, F1-Measure]
36
[Figure: Precision, Recall, F1-Measure]
37
We have combined unsupervised machine learning and automata learning techniques:
Using a window language to handle partial observation of network traffic
Taking into account the temporal relations among flows and packets
Automatically generating the alphabet of the automata (k-means++ and the elbow method)
Improving word acceptance with a new proximity metric
NeTLang outperforms the state-of-the-art methods:
More accurate, faster, more noise tolerant, better granularity, and not application-specific
38
Evaluate NeTLang using a public dataset
Take the protocol phases into account in the network unit
39
40
41
Stats #   Features
Group 1   TotalPktf/b, TotalLf/b, MinLf/b, MeanLf/b, MaxLf/b, StdLf/b
Group 2   MinLf/b, MeanLf/b, MaxLf/b, StdLf/b, PktCntRf/b, DtSizeRf/b, AvgInvalf/b

Feature       Description
MinLf/b       Minimum length of packets sent/received within a network unit
MeanLf/b      Average length of packets sent/received within a network unit
MaxLf/b       Maximum length of packets sent/received within a network unit
StdLf/b       Standard deviation of the lengths of packets sent/received within a network unit
PktCntRf/b    Ratio of TotalPktf/b to the total number of packets within a network unit
DtSizeRf/b    Ratio of TotalLf/b to the sum of all packet lengths within a network unit
AvgInvalf/b   Average time interval between sent/received packets within a network unit
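The Group 2 features can be computed directly from the packets of one network unit. The sketch below is a hedged illustration, not the authors' code: the packet representation (timestamp, length, direction), the use of the population standard deviation, and the value 0.0 for the interval when a direction has fewer than two packets are all assumptions. Feature names follow the table, with the suffix f for sent and b for received packets.

```python
from statistics import mean, pstdev


def unit_features(packets):
    """Group 2 features for one network unit.

    packets: list of (timestamp, length, direction), direction being
    'f' (sent) or 'b' (received). This layout is an assumption.
    """
    features = {}
    total_count = len(packets)
    total_bytes = sum(length for _, length, _ in packets)
    for d in ("f", "b"):
        lengths = [length for _, length, direction in packets if direction == d]
        if not lengths:
            continue  # no packets in this direction
        times = sorted(t for t, _, direction in packets if direction == d)
        gaps = [later - earlier for earlier, later in zip(times, times[1:])]
        features[f"MinL{d}"] = min(lengths)
        features[f"MeanL{d}"] = mean(lengths)
        features[f"MaxL{d}"] = max(lengths)
        features[f"StdL{d}"] = pstdev(lengths)  # population std (assumption)
        features[f"PktCntR{d}"] = len(lengths) / total_count
        features[f"DtSizeR{d}"] = sum(lengths) / total_bytes if total_bytes else 0.0
        features[f"AvgInval{d}"] = mean(gaps) if gaps else 0.0
    return features


if __name__ == "__main__":
    # Hypothetical network unit with two sent and two received packets.
    unit = [(0.0, 100, "f"), (0.5, 300, "b"), (1.0, 200, "f"), (2.0, 500, "b")]
    for name, value in unit_features(unit).items():
        print(name, value)
```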
42
43
Param               Application Identification   Traffic Characterization
Session Threshold   15                           15
Inactive Timeout    5                            15
Flow Duration       10                           10
Feature Group       2                            2
k                   3                            3
44
45
[Figure panels: Application Identification, Traffic Characterization]