Detection of HTTP-GET Attack with Clustering and Information Theoretic Measurements



SLIDE 1

Problem definition Clustering Detection Measurements Result analysis Future work

Detection of HTTP-GET Attack with Clustering and Information Theoretic Measurements

Pawel Chwalinski Roman Belavkin Xiaochun Cheng

Middlesex University School of Science and Technology London, United Kingdom

FOUNDATIONS & PRACTICE OF SECURITY, 2012

Chwalinski, Belavkin, Xiaochun Detection of HTTP-GET Attack

SLIDE 2

Outline

1  Problem definition
2  Clustering
     Web Dataset and Sessions
     Web clusters as joint distributions
     Entropy-based clustering
     Attacking Strategies
     Cluster assignment
     Results and comparison against K-means
3  Detection Measurements
     Mutual Information
     Mahalanobis Distance
     Likelihood of the same request segments
4  Result analysis
5  Future work

SLIDE 4

Flash crowd: a surge of connections made up of illegitimate and legitimate sessions.

  • Resource exhaustion
  • Increased number of attacking sessions = increased arrival rate
  • No bandwidth flooding
  • The only difference is in intent, rather than in behaviour

SLIDE 6

Dataset D is divided into DT for training and DV for validating.

Example:
si = (homepage, homepage, homepage, news, news, news, homepage, homepage, weather, tv)
si = (1, 1, 1, 2, 2, 2, 1, 1, 5, 7)

Ω = {1, 2, . . . , nC}, where each ω represents a numerical label of an actual category, and nC = 17.

Pairs of two consecutive requests are analysed, such that (x, y) ∈ Ω × Ω.
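The session encoding above can be illustrated with a minimal Python sketch. The label map contains only the four category labels visible in the example; the remaining labels of Ω are not given on the slide, and the helper names are hypothetical.

```python
from collections import Counter

# Only these four labels are known from the example; the rest of Ω is not shown.
LABELS = {"homepage": 1, "news": 2, "weather": 5, "tv": 7}

def to_labels(session):
    """Translate category names into numerical labels from Ω."""
    return [LABELS[name] for name in session]

def consecutive_pairs(labels):
    """Return the (x, y) pairs formed by two consecutive requests."""
    return list(zip(labels, labels[1:]))

def joint_distribution(sessions):
    """Estimate P(x, y) as the relative frequency of each pair."""
    counts = Counter(p for s in sessions for p in consecutive_pairs(s))
    total = sum(counts.values())
    return {pair: c / total for pair, c in counts.items()}

session = ["homepage"] * 3 + ["news"] * 3 + ["homepage"] * 2 + ["weather", "tv"]
s_i = to_labels(session)
pairs = consecutive_pairs(s_i)
p_xy = joint_distribution([s_i])
print(s_i)
print(pairs)
```

A session of ten requests yields nine pairs; the joint distribution of a cluster would be estimated from all sessions assigned to it.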

SLIDE 7

Figure: Joint distribution of the pairs of requests observed in a cluster (i.e. one pattern of “behaviour”). Axes: category label × category label (1 to 17), with P(x, y) as the plotted value.

SLIDE 8

Figure: Joint distribution of the pairs of requests observed in a cluster (i.e. one pattern of “behaviour”). Axes: category label × category label (1 to 17), with P(x, y) as the plotted value.

SLIDE 9

For n random variables {X1, . . . , Xn} with a discrete joint distribution p(x1, . . . , xn), entropy is calculated as:

\[ h(x_1, \dots, x_n) = -\sum_{x_1, \dots, x_n} p(x_1, \dots, x_n) \log p(x_1, \dots, x_n) \tag{1} \]

For a joint distribution p(x, y), entropy h(x, y) is calculated as:

\[ h(x, y) = -\sum_{x} \sum_{y} p(x, y) \log p(x, y) \tag{2} \]

SLIDE 10

Table: Legitimate Coins

      H      T
H     0.25   0.25
T     0.25   0.25

Table: Biased Coins

      H      T
H     0.02   0.02
T     0.02   0.94

Entropy of legitimate distribution: h(x, y), x, y ∈ {H, T} = 2 (bits)
Entropy of biased distribution: h(x, y), x, y ∈ {H, T} = 0.3952 (bits)
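The two-coin example can be checked with a short Python sketch of equation (2), using base-2 logarithms so that the result is in bits. The function name is hypothetical. Evaluating the biased table exactly as printed gives roughly 0.42 bits, so the 0.3952 figure on the slide may rest on slightly different probabilities; either way the biased distribution has far lower entropy than the legitimate one.

```python
import math

def joint_entropy(table):
    """Entropy in bits of a joint distribution given as rows of probabilities."""
    return -sum(p * math.log2(p) for row in table for p in row if p > 0)

legitimate = [[0.25, 0.25],
              [0.25, 0.25]]
biased = [[0.02, 0.02],
          [0.02, 0.94]]

h_legit = joint_entropy(legitimate)
h_biased = joint_entropy(biased)
print(f"legitimate: {h_legit:.4f} bits, biased: {h_biased:.4f} bits")
```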

SLIDE 11

Clustering metric

Given a set C = {C1, C2, . . . , Ck} of k clusters, the goal is to minimise the average entropy of the clusters, which is computed as follows:

\[ E\{H\} = \sum_{j=1}^{k} \frac{|C_j|}{|D_T|}\, h_j \tag{3} \]

where hj is the entropy of the joint distribution of cluster Cj.

Having grouped b, 1≤b≤nT, sequences from DT according to (3), every time a sequence si, b<i≤nT, is to be added to C, the cluster Cj, 1≤j≤k, is picked for which adding si decreases (3) the most.

Problems:
  • Unwanted effect of sequence order (Re-clustering & Merging).
  • Minimisation of (3) is computationally expensive (Partitioning).
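The greedy assignment step can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' implementation: each cluster is represented by a counter of request pairs, the weights in (3) are normalised by the number of sequences assigned so far, and all function names are hypothetical.

```python
import math
from collections import Counter

def pairs(seq):
    """Pairs of consecutive requests in one sequence."""
    return list(zip(seq, seq[1:]))

def entropy(counts):
    """Entropy (bits) of the joint distribution implied by pair counts."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum(c / total * math.log2(c / total) for c in counts.values() if c > 0)

def expected_entropy(clusters, sizes):
    """Equation (3): cluster entropies weighted by their share of sequences."""
    n = sum(sizes)
    return sum(size / n * entropy(c) for c, size in zip(clusters, sizes) if size)

def assign(seq, clusters, sizes):
    """Add seq to the cluster for which the resulting value of (3) is lowest."""
    best_j, best_value = None, float("inf")
    for j in range(len(clusters)):
        trial = [Counter(c) for c in clusters]  # copy, then try cluster j
        trial[j].update(pairs(seq))
        trial_sizes = list(sizes)
        trial_sizes[j] += 1
        value = expected_entropy(trial, trial_sizes)
        if value < best_value:
            best_j, best_value = j, value
    clusters[best_j].update(pairs(seq))
    sizes[best_j] += 1
    return best_j

# Two seed clusters: one "stay on one category", one "keep moving".
clusters = [Counter(pairs([1, 1, 1, 1])), Counter(pairs([2, 3, 4, 5]))]
sizes = [1, 1]
chosen = assign([1, 1, 1], clusters, sizes)
print(chosen, sizes)
```

Every candidate cluster is re-evaluated for each new sequence, which is why the slide lists the cost of minimising (3) as a problem.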

SLIDE 12

Re-clustering

Suppose that si has been added to Cj, 1≤j≤k. Having grouped another b sequences, there may exist another cluster Cj′, 1≤j′≤k, j′ ≠ j, such that placing the previously added si inside Cj′ minimises (3) further. Therefore, after processing a batch of b sequences, the algorithm is stopped, and the whole set C is re-clustered.

SLIDE 13

Merging

Suppose there are two joint distributions pi(x, y) and pj(x, y), 1≤i≤k, 1≤j≤k. Their similarity is measured by applying the Kullback-Leibler (KL) divergence formula:

\[ D_{KL}(p_i \,\|\, p_j) = \sum_{x, y \in \Omega} p_i(x, y) \log \frac{p_i(x, y)}{p_j(x, y)} \tag{4} \]

Two clusters Ci and Cj are similar, and are merged, when DKL(pi||pj) ≤ ηKL and DKL(pj||pi) ≤ ηKL.
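A minimal Python sketch of the two-sided merge test follows. The slides do not say how pairs with zero probability in pj are handled, so the small floor `eps` is an assumption, as are the function names and the threshold value used in the example.

```python
import math

def kl_divergence(p, q, eps=1e-9):
    """D_KL(p || q) in bits for joint distributions stored as {(x, y): prob}.

    eps is an assumed floor so that pairs missing from q do not cause a
    division by zero.
    """
    return sum(px * math.log2(px / max(q.get(pair, 0.0), eps))
               for pair, px in p.items() if px > 0)

def should_merge(p_i, p_j, eta_kl):
    """Merge only when the divergence is small in both directions."""
    return (kl_divergence(p_i, p_j) <= eta_kl
            and kl_divergence(p_j, p_i) <= eta_kl)

p_a = {(1, 1): 0.50, (1, 2): 0.50}
p_b = {(1, 1): 0.45, (1, 2): 0.55}
p_c = {(3, 3): 1.00}

ETA_KL = 0.05  # hypothetical threshold, not taken from the slides
print(should_merge(p_a, p_b, ETA_KL), should_merge(p_a, p_c, ETA_KL))
```

Because the KL divergence is not symmetric, checking both directions prevents merging a narrow cluster into a broad one that merely contains it.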

SLIDE 14

Partitioning

DT is divided into p same-length blocks {B1, B2, . . . , Bp}.

For each block Bi, 1 ≤ i ≤ p:
  • minimisation of (3)
  • merging
  • re-clustering

Having combined {B1, B2, . . . , Bp} into C:
  • merging
  • re-clustering

SLIDE 15

Decision-making problem for attackers

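Having requested a link from category c(ωt), ωt ∈ Ω, a programmed zombie decides whether to remain in that category or move to another one:

\[ c(\omega_{t+1}) = \begin{cases} \text{remain} & \text{with probability } p_R \\ \text{move} & \text{with probability } p_M = 1 - p_R \end{cases} \]

Uniformly-changing zombies, where \( p_R = \frac{1}{|\Omega|} \) and \( p_M = 1 - p_R = \frac{|\Omega| - 1}{|\Omega|} \), are too easy to detect.

Frequently-changing hosts change categories less frequently than the uniformly-changing zombies, such that pM = pR = 0.5.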
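The remain/move decision can be simulated with a short Python sketch. Two details are assumptions: the first category is drawn uniformly, and a "move" picks uniformly among the other categories (which makes the pR = 1/|Ω| case fully uniform). The function name is hypothetical.

```python
import random

N_CATEGORIES = 17  # nC from the dataset slide

def zombie_session(n_requests, p_remain, rng=None):
    """Generate one attacking session under the remain/move model."""
    rng = rng or random.Random()
    categories = list(range(1, N_CATEGORIES + 1))
    current = rng.choice(categories)  # assumed: uniform starting category
    session = [current]
    for _ in range(n_requests - 1):
        if rng.random() >= p_remain:
            # "move": assumed to pick uniformly among the other categories
            current = rng.choice([c for c in categories if c != current])
        session.append(current)
    return session

rng = random.Random(0)
uniform = zombie_session(20, 1 / N_CATEGORIES, rng)   # uniformly-changing
frequent = zombie_session(20, 0.5, rng)               # frequently-changing
print(uniform)
print(frequent)
```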

SLIDE 16

How to build attacking profile similar to legitimate behaviour?

Solution: create sequences that visit (on average) a similar number of categories, so that DA ∼ DT.

A vector eT holds the expected number of distinct categories, given the number of requests:

\[ e_T = \left[ E\{n^c_{r^\wedge}\},\, E\{n^c_{r^\wedge+1}\},\, \dots,\, E\{n^c_{r^\wedge+t}\},\, \dots,\, E\{n^c_{r^\vee}\} \right] \]

where \( E\{n^c_{r^\wedge+t}\} \) denotes the expected number of visited categories after r∧ + t requests.

Setting pR = 0.92, pM = 0.08 gives eA ∼ eT.
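One way to estimate such a vector from data is sketched below in Python. It assumes that r∧ and r∨ are the smallest and largest request counts of interest and that the expectation is taken over session prefixes; the slide does not spell out either point, and the function name is hypothetical.

```python
from collections import defaultdict

def expected_categories(sessions, r_min, r_max):
    """Mean number of distinct categories seen after r requests,
    for r = r_min .. r_max (None where no session is long enough)."""
    totals = defaultdict(int)
    counts = defaultdict(int)
    for s in sessions:
        seen = set()
        for r, category in enumerate(s, start=1):
            seen.add(category)
            if r_min <= r <= r_max:
                totals[r] += len(seen)
                counts[r] += 1
    return [totals[r] / counts[r] if counts[r] else None
            for r in range(r_min, r_max + 1)]

sessions = [[1, 1, 1, 2, 2, 2, 1, 1, 5, 7], [3, 3, 3, 3]]
e_T = expected_categories(sessions, 2, 4)
print(e_T)
```

An attacker following the slide's idea would compute the same vector eA for generated sessions and tune pR until eA is close to eT.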

SLIDE 17

The probability of observing sequence si, 1≤i≤nVA, inside cluster Cj, 1≤j≤k, with corresponding joint distribution pj(x, y), can be thought of in the following way (assuming that each request si,l+1, 1≤l<ni, depends only on the preceding request si,l):

\[ p_j(s_{i,1}, s_{i,2}, \dots, s_{i,n_i}) = p_j(s_{i,1}) \prod_{l=2}^{n_i} p_j(s_{i,l} \mid s_{i,l-1}) \tag{5} \]

Subsequently, the probability of generating sequence si is calculated for each cluster, and the cluster Cj is chosen for which (6) attains the highest value:

\[ C_j = \arg\max_{C \in \{C_1, C_2, \dots, C_k\}} \log L(s_i; C), \qquad L(s_i; C) = P(s_i \mid C) \tag{6} \]
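The assignment rule can be sketched in Python as below. The sketch derives pj(x) as the row marginal of pj(x, y) and pj(y|x) as pj(x, y)/pj(x); it also floors zero probabilities at a small `eps` so the logarithm stays finite. Both choices, and the function names, are assumptions rather than details from the slides.

```python
import math

def log_likelihood(seq, joint, eps=1e-9):
    """Log of equation (5) under one cluster's joint distribution {(x, y): p}."""
    marginal = {}
    for (x, _y), p in joint.items():
        marginal[x] = marginal.get(x, 0.0) + p
    ll = math.log(max(marginal.get(seq[0], 0.0), eps))
    for prev, cur in zip(seq, seq[1:]):
        if marginal.get(prev):
            cond = joint.get((prev, cur), 0.0) / marginal[prev]
        else:
            cond = 0.0
        ll += math.log(max(cond, eps))  # assumed floor for unseen transitions
    return ll

def assign_cluster(seq, cluster_joints):
    """Equation (6): index of the cluster with the highest log-likelihood."""
    scores = [log_likelihood(seq, joint) for joint in cluster_joints]
    return max(range(len(scores)), key=scores.__getitem__)

joint_a = {(1, 1): 0.6, (1, 2): 0.2, (2, 1): 0.2}
joint_b = {(5, 5): 0.5, (5, 7): 0.25, (7, 5): 0.25}
print(assign_cluster([1, 1, 2, 1], [joint_a, joint_b]))
print(assign_cluster([5, 7, 5], [joint_a, joint_b]))
```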

SLIDE 18

Results of the algorithm

Figure: Comparison of sequence distribution with (3b), and without (3a) attacking sequences. Evidently, the distribution changes with the arrival of the frequently-changing sequences. Axes: entropy range H (horizontal), percentage of sequences in sample (vertical).
(a) Percentage of sequences from datasets DT and DV (training sequences DT, validating sequences DV).
(b) Percentage of sequences from datasets DT and DV∪A (training sequences DT, combined sequences DV∪A).

SLIDE 19

Results of the algorithm

Figure: Comparison of sequence distribution with (4b), and without (4a) attacking sequences. The distribution of the set DV∪A with rarely-changing hosts shifts towards the expected distribution (Fig. 4a), but still differs. Axes: entropy range H (horizontal), percentage of sequences in sample (vertical).
(a) Percentage of sequences from sets DT and DV (training sequences DT, validating sequences DV).
(b) Percentage of sequences from sets DT and DV∪A (training sequences DT, combined sequences DV∪A).

SLIDE 20

Comparison against K-means algorithm

Figure: Distribution of sessions with K-means clustering. Despite using different sets (i.e. validating sequences (6a), validating and frequently-changing sequences (5b) or validating and rarely-changing sequences (6b)), the distributions are similar and do not give any insight into the nature of the processed batch. Axes: entropy range H (horizontal), percentage of sequences in sample (vertical).
(a) Training and Validating Sequences (training sequences DT, validating sequences DV).
(b) Training, Validating and Frequently-Changing Sequences (training sequences DT, combined sequences DV∪A).

SLIDE 21

Comparison against K-means algorithm

Figure: Distribution of sessions with K-means clustering. Despite using different sets (i.e. validating sequences (6a), validating and frequently-changing sequences (5b) or validating and rarely-changing sequences (6b)), the distributions are similar and do not give any insight into the nature of the processed batch. Axes: entropy range H (horizontal), percentage of sequences in sample (vertical).
(a) Training and Validating Sequences (training sequences DT, validating sequences DV).
(b) Training, Validating and Rarely-Changing Sequences (training sequences DT, combined sequences DV∪A).

SLIDE 23

\[ I(X; Y) = \sum_{y \in Y,\, x \in X} p(x, y) \log \frac{p(x, y)}{p(x)\, p(y)} \]

Subsequently, for each cluster Ci a two-dimensional vector mi is introduced, where each j-th column contains the minimum \( m^{\wedge}_{i,j} \) and maximum \( m^{\vee}_{i,j} \) value of mutual information while requesting the j-th category.
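The mutual information formula can be evaluated with the following Python sketch for a joint distribution stored as a dictionary of pair probabilities. It computes only I(X; Y) itself; how the per-category minimum and maximum values are collected is not detailed on the slide and is not attempted here. The function name is hypothetical.

```python
import math

def mutual_information(joint):
    """I(X; Y) in bits for a joint distribution {(x, y): p}."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return sum(p * math.log2(p / (px[x] * py[y]))
               for (x, y), p in joint.items() if p > 0)

# Independent requests carry no information about each other ...
independent = {("H", "H"): 0.25, ("H", "T"): 0.25,
               ("T", "H"): 0.25, ("T", "T"): 0.25}
# ... whereas a fully determined next request carries one bit here.
dependent = {(0, 0): 0.5, (1, 1): 0.5}

print(mutual_information(independent), mutual_information(dependent))
```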

SLIDE 24

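si = (1, 1, 1, 2, 2, 2, 1, 1, 5, 7)

The category changes in si (1 to 2, 2 to 1, 1 to 5 and 5 to 7) are checked against the minimum and maximum values of mutual information:

\[ m^{\wedge}_{1,2} \le I(1, 2) \le m^{\vee}_{1,2} \]
\[ m^{\wedge}_{2,1} \le I(2, 1) \le m^{\vee}_{2,1} \]
\[ m^{\wedge}_{1,5} \le I(1, 5) \le m^{\vee}_{1,5} \]
\[ m^{\wedge}_{5,7} \le I(5, 7) \le m^{\vee}_{5,7} \]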

SLIDE 25

For each cluster Ci, 1≤i≤k, a corresponding covariance matrix Σi is calculated, together with a vector of average categorical requests µi = [µi,1, µi,2, . . . , µi,nC]. Subsequently, each sequence sj, 1≤j≤|Ci|, from Ci is transformed into a vector form vj = [vj,1, vj,2, . . . , vj,l, . . . , vj,nC], where each vj,l denotes how many times the l-th category has been requested during session sj. As a result, the Mahalanobis distance can be calculated in the following way:

\[ d_M(v_j, C_i) = \sqrt{(v_j - \mu_i)^{T}\, \Sigma_i^{-1}\, (v_j - \mu_i)} \tag{7} \]

Subsequently, vectors of Mahalanobis distances

\[ m^{M}_{i} = \left[ m^{M}_{i,r^\wedge},\, m^{M}_{i,r^\wedge+1},\, \dots,\, m^{M}_{i,t},\, \dots,\, m^{M}_{i,r^\vee} \right] \]

are obtained, where each t-th component contains the maximum Mahalanobis value observed inside Ci for t-request-long sequences.
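Equation (7) can be sketched in plain Python as follows. Count vectors of sessions often give a singular covariance matrix (the counts of a fixed-length session sum to a constant), so the sketch adds a small ridge term to the diagonal; that regularisation, the use of the sample covariance, and all function names are assumptions not stated on the slide.

```python
import math

def count_vector(seq, n_categories):
    """v_j: how many times each category was requested in the session."""
    v = [0.0] * n_categories
    for c in seq:
        v[c - 1] += 1
    return v

def mean_and_cov(vectors, ridge=1e-6):
    """Mean vector and sample covariance (with an assumed ridge on the diagonal)."""
    n, d = len(vectors), len(vectors[0])
    mu = [sum(v[a] for v in vectors) / n for a in range(d)]
    cov = [[sum((v[a] - mu[a]) * (v[b] - mu[b]) for v in vectors) / max(n - 1, 1)
            + (ridge if a == b else 0.0) for b in range(d)] for a in range(d)]
    return mu, cov

def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    d = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(d):
        pivot = max(range(col, d), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, d):
            f = aug[r][col] / aug[col][col]
            for c in range(col, d + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * d
    for r in range(d - 1, -1, -1):
        x[r] = (aug[r][d] - sum(aug[r][c] * x[c] for c in range(r + 1, d))) / aug[r][r]
    return x

def mahalanobis(v, mu, cov):
    """Equation (7), computed without forming the inverse explicitly."""
    diff = [a - b for a, b in zip(v, mu)]
    z = solve(cov, diff)  # z = cov^{-1} (v - mu)
    return math.sqrt(max(sum(a * b for a, b in zip(diff, z)), 0.0))

vectors = [count_vector(s, 3) for s in ([1, 1, 2], [1, 2, 2], [3, 3, 1])]
mu, cov = mean_and_cov(vectors)
print(mahalanobis(vectors[0], mu, cov))
```

With an identity covariance the function reduces to the Euclidean distance, which gives a quick sanity check.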

SLIDE 26

For si = (si,1, si,2, . . . , si,l, . . . , si,ni), 1≤i≤nT, a segment

\[ s^{\vee}_{i,t} = \{ s_{i,l-t},\, s_{i,l-t+1},\, \dots,\, s_{i,l+t} \} \]

denotes the longest substring observed inside si after the t-th request that is composed of the same elements, such that \( s_{i,l-t} = s_{i,l-t+1} = \dots = s_{i,l+t} \).

Example: si = (1, 1, 1, 2, 2, 2, 1, 1, 5, 7)
Length of the longest same-element segment after successive requests (as shown on the slide): 1 2 3 3 3 3 3
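The running length of the longest same-request segment can be computed with a few lines of Python. This is an illustrative sketch with a hypothetical function name; it returns one value per request, and its first values match the ones shown for the example sequence.

```python
def longest_run_lengths(seq):
    """Length of the longest run of identical requests seen after each request."""
    lengths, best, current = [], 0, 0
    prev = object()  # sentinel that equals no category label
    for item in seq:
        current = current + 1 if item == prev else 1
        best = max(best, current)
        lengths.append(best)
        prev = item
    return lengths

s_i = [1, 1, 1, 2, 2, 2, 1, 1, 5, 7]
runs = longest_run_lengths(s_i)
print(runs)
```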

SLIDE 27

For each sequence si belonging to Cj, the conditional probability \( P_j(s^{\vee}_{i} \mid n_i) \) of the longest segments of the same categorical requests, given the number of requests ni, is calculated in the following way:

\[ P_j(s^{\vee}_{i,\cdot} \mid n_i) = \prod_{l=r^\wedge}^{n_i} P_j(s^{\vee}_{i,l} \mid l) \tag{8} \]

Subsequently, for each cluster Cj, the set Lj = {ℓ1(s∨, n1), ℓ2(s∨, n2), . . . , ℓt(s∨, nt), . . . , ℓ|Cj|(s∨, n|Cj|)} is introduced, where ℓt(s∨, nt) denotes the log-likelihood of the longest same-element segments observed in sequence st.

SLIDE 28

Classification of sequences from DV∪A should vary for clusters with low and high entropy, so the entropy values are divided into two ranges: “hard” HH = {h1, . . . , ht}, 1≤t<k, and “soft” HS = {ht+1, . . . , hk}, with H = HH ∪ HS. In addition, the percentages \( f^{T}_{i} \) and \( f^{VA}_{i} \), 1≤i≤k, of sequences from batches DT and DV∪A assigned to cluster Ci are calculated. If, for any cluster Cj, 1≤j<k, \( f^{T}_{j} > f^{VA}_{j} \) holds (i.e. there are more sequences inside cluster Cj during the training stage than during the validating stage), then the “hard” setting is alternated with the “soft” detection scheme inside the less populated clusters.

Type      Frequently-changing           Rarely-changing
“Soft”    Mut. Info.                    min Lj, 1≤j≤k
“Hard”    Mut. Info. & Mah. distance    Mut. Info. & 1/2 min Lj, 1≤j≤k

SLIDE 30

Figure: Performance of the detection measurements against two attacking strategies: frequently-changing (7a), and rarely-changing (7b). Axes: False Positive (horizontal), True Positive (vertical).
(a) Performance curve for DV∪A composed of validating and frequently-changing hosts. Labelled points: FP: 0.1205, TP: 0.7531; FP: 0.1597, TP: 0.8093; FP: 0.1647, TP: 0.8122.
(b) Performance curve for DV∪A composed of validating and rarely-changing hosts. Labelled points: FP: 0.0271, TP: 0.2018; FP: 0.2029, TP: 0.8280; FP: 0.2228, TP: 0.8489.

SLIDE 32

Future work

  • Attack-independent measurement √
  • Improvement of detection performance √
  • Decrease of the algorithm execution time
  • Real-time detection

SLIDE 33

Thank you for your attention!

Have you got any questions?

SLIDE 34

Figure: Mahalanobis distance vs. Euclidean distance. Axes: Weight, Height.

Tall Objects ∼ N(µT, σT), µT = (120.50, 254.94)
Short Objects ∼ N(µS, σS), µS = (118.10, 122.84)
First Object (FO) = (90.50, 32.84)
Second Object (SO) = (120.50, 154.94)

Object   Reference   dM      dE
FO       µT          3.89    224.11
FO       µS          6.75    94.14
SO       µT          1.24    100.00
SO       µS          2.32    32.18

SLIDE 35

Comparison against K-means algorithm

        1       2       3       . . .   nC
1       1,1     1,2     1,3     . . .   1,nC
2       2,1     2,2     2,3     . . .   2,nC
3       3,1     3,2     3,3     . . .   3,nC
. . .   . . .   . . .   . . .   . . .   . . .
nC      nC,1    nC,2    nC,3    . . .   nC,nC

Pair-wise representation: [(1, 1), (1, 2), . . . , (1, nC), (2, 1), . . . , (2, nC), . . . , (nC, nC)]

si = (1, 1, 1, 2, 2, 2, 1, 1, 5, 7) → si = [3, 1, 0, 0, 1, . . . , 0, 1, 2, . . . , 0, . . . , 0]

# of iterations = 100
On-line updates (similar to re-clustering)
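The pair-wise representation fed to K-means can be reproduced with a short Python sketch. It flattens the nC × nC table row by row, so pair (x, y) lands at index (x − 1)·nC + (y − 1); the function name is hypothetical.

```python
def pair_count_vector(seq, n_categories=17):
    """Count each consecutive pair (x, y) in a flat vector of length nC * nC."""
    vec = [0] * (n_categories * n_categories)
    for x, y in zip(seq, seq[1:]):
        vec[(x - 1) * n_categories + (y - 1)] += 1
    return vec

s_i = [1, 1, 1, 2, 2, 2, 1, 1, 5, 7]
v_i = pair_count_vector(s_i)
print(v_i[:5])      # counts for pairs (1,1) .. (1,5)
print(v_i[17:19])   # counts for pairs (2,1) and (2,2)
```

The resulting 289-dimensional vectors are mostly zeros, since a single session touches only a handful of the possible pairs.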
