An SVM-based Masquerade Detection Method with Online Update Using Co-occurrence Matrix

Liangwen Chen, Masayoshi Aritsugi


SLIDE 1

An SVM-based Masquerade Detection Method with Online Update Using Co-occurrence Matrix

Liangwen Chen, Masayoshi Aritsugi
Gunma University, Japan

SLIDE 2

Outline

  • Background
  • Conventional results
  • Our proposal
  • Experiments
  • Conclusion

SLIDE 3

Background

  • A computer can provide multiple services to multiple users
  • Users can log in to a computer through a network

Security management costs increase

  • It is hard to protect computers completely from malicious access

Masquerade detection

SLIDE 4

Conventional results

Researchers           Approach                         Hit Rate   False Positive Rate
Schonlau et al.       Uniqueness                       39.4%      1.4%
Schonlau et al.       Bayes one-step Markov            69.3%      6.7%
Schonlau et al.       Hybrid multistep Markov          49.3%      3.2%
Schonlau et al.       Compression                      34.2%      5.0%
Schonlau et al.       Sequence Matching                36.8%      3.7%
Schonlau et al.       IPAM                             41.1%      2.7%
Maxion and Townsend   Naïve Bayes (updating)           61.5%      1.3%
Maxion and Townsend   Naïve Bayes (no updating)        66.2%      4.6%
Oka et al.            ECM                              72.3%      2.5%
Kim and Cha           SVM-based approach with voting   80.1%      9.7%
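
The two metrics in this table can be made concrete with a small sketch. It assumes the usual definitions (hit rate: detected masquerade blocks over all masquerade blocks; false positive rate: legitimate blocks flagged over all legitimate blocks). The function name `rates` and the toy labels are illustrative and do not come from the slides.

```python
def rates(truth, flagged):
    """Return (hit_rate, false_positive_rate).

    truth[i]   -- True if block i really is a masquerade block
    flagged[i] -- True if the detector raised an alarm on block i
    """
    hits = sum(1 for t, f in zip(truth, flagged) if t and f)
    false_alarms = sum(1 for t, f in zip(truth, flagged) if not t and f)
    positives = sum(1 for t in truth if t)
    negatives = len(truth) - positives
    return hits / positives, false_alarms / negatives


# Toy data (hypothetical): 4 masquerade blocks followed by 6 legitimate blocks.
truth = [True, True, True, True, False, False, False, False, False, False]
flagged = [True, True, True, False, True, False, False, False, False, False]
hit_rate, fp_rate = rates(truth, flagged)
print(f"hit rate = {hit_rate:.2%}, false positive rate = {fp_rate:.2%}")
```

A detector is preferable when it raises the first number without raising the second, which is why the table reports both.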

SLIDE 5

Problems

  • Conventional research has attempted to improve the accuracy rate
  • Users' behaviors change with time

ECM
False Positive    2.5%
Hit Rate          72.3%
Training cost     1046.37 min.
Detection cost    22.13 sec.
ROC Score         0.918
Memory Size       4GB
CPU               Xeon 3.2GHz

Need to adapt to changes

SLIDE 6

Our strategy

  • To borrow the same data
      • To compare results with conventional work
  • To borrow ECM
      • Low false positive rate
      • High hit rate
      • High ROC score
  • To exploit SVM
      • Low training cost
      • Adapt to changes of users' behaviors

SLIDE 7

Correlation of commands

Command sequences (in time order):

User1 : cd ls less ls less cd ls cd cd ls
User2 : emacs gcc gdb emacs ls gcc gdb ls ls emacs
User3 : mkdir cp cd ls cp ls cp cp cp cp

Example (User1): cd ls less ls less cd ls cd cd ls

Strength of correlation of ls and less : 2+1=3

SLIDE 8

Co-occurrence matrix

User1 : cd ls less ls less cd ls cd cd ls
User2 : emacs gcc gdb emacs ls gcc gdb ls ls emacs
User3 : mkdir cp cd ls cp ls cp cp cp cp

SLIDE 9

Our co-occurrence matrix

        cd  ls  less  emacs  gcc  gdb  mkdir  cp
cd       0   0     0      0    0    0      0   0
ls       0   3     0      3    1    1      0   0
less     0   0     0      0    0    0      0   0
emacs    0   4     0      1    3    3      0   0
gcc      0   4     0      2    1    3      0   0
gdb      0   5     0      2    1    1      0   0
mkdir    0   0     0      0    0    0      0   0
cp       0   0     0      0    0    0      0   0

Reordered matrix: commands in the legitimate training data come first (high freq. to low freq.), followed by all other commands. The four regions are labeled A, B, C, D.

        emacs  ls  gcc  gdb  cd  less  mkdir  cp
emacs       2   4    3    3   0     0      0   0
ls          3   3    1    1   0     0      0   0
gcc         2   4    1    3   0     0      0   0
gdb         2   5    1    1   0     0      0   0
cd          0   0    0    0   0     0      0   0
less        0   0    0    0   0     0      0   0
mkdir       0   0    0    0   0     0      0   0
cp          0   0    0    0   0     0      0   0
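
The reordering described on this slide can be sketched as follows, using User2's sequence. The sketch assumes pairs are counted within the next 6 commands (a window that happens to reproduce the non-zero block of the reordered matrix), that ties in frequency keep first-appearance order, and that the SVM feature vector is the matrix flattened row by row. All three are assumptions, and the name `ordered_matrix` is hypothetical.

```python
from collections import Counter


def ordered_matrix(commands, vocabulary, max_distance=6):
    """Dense co-occurrence matrix with the user's own commands first.

    Rows and columns list the commands seen in `commands`, most frequent
    first, followed by the remaining entries of `vocabulary`.
    """
    freq = Counter(commands)
    # sorted() is stable, so equally frequent commands keep first-seen order.
    own = sorted(freq, key=lambda c: -freq[c])
    others = [c for c in vocabulary if c not in freq]
    order = own + others
    index = {c: k for k, c in enumerate(order)}
    matrix = [[0] * len(order) for _ in order]
    for i, first in enumerate(commands):
        for second in commands[i + 1 : i + 1 + max_distance]:
            matrix[index[first]][index[second]] += 1
    return order, matrix


vocabulary = ["cd", "ls", "less", "emacs", "gcc", "gdb", "mkdir", "cp"]
user2 = "emacs gcc gdb emacs ls gcc gdb ls ls emacs".split()
order, matrix = ordered_matrix(user2, vocabulary)

# Assumed feature vector: the matrix flattened row by row.
features = [value for row in matrix for value in row]
print(order)
for name, row in zip(order, matrix):
    print(f"{name:6s}", row)
```

Placing the user's frequent commands first concentrates the non-zero counts in one corner of the matrix, so every user's feature vector has the same length and a comparable layout.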

SLIDE 10

System overview

  • Co-occurrence matrix generation
  • SVM feature vector generation
  • SVM processing
  • Results
  • Refinement

[Diagram labels: Training data, New sequence, Co-occurrence matrix generation, Co-occurrence matrix, Feature vector, SVM training, Model, Results, Refinement]

SLIDE 11

Comparison with ECM

                  ECM            Our method (based on 2-class SVM)
False Positive    2.5%           3.0%
Hit Rate          72.3%          72.74%
Training cost     1046.37 min.   117.33 sec.
Detection cost    22.13 sec.     0.04 sec.
ROC Score         0.918          0.926
Memory Size       4GB            512MB
CPU               Xeon 3.2GHz    Pentium III 1.4GHz
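
For readers unfamiliar with the ROC score row, here is a minimal sketch. It assumes "ROC Score" means the area under the ROC curve, computed here as the fraction of (masquerade, legitimate) block pairs in which the masquerade block receives the higher anomaly score. The function name `roc_score` and the toy scores are illustrative only.

```python
def roc_score(scores, truth):
    """Area under the ROC curve by pairwise comparison (ties count one half)."""
    pos = [s for s, t in zip(scores, truth) if t]
    neg = [s for s, t in zip(scores, truth) if not t]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy anomaly scores (hypothetical): higher means "more likely a masquerader".
scores = [0.9, 0.8, 0.3, 0.7, 0.2, 0.1]
truth = [True, True, True, False, False, False]
auc = roc_score(scores, truth)
print(round(auc, 3))
```

A score of 1.0 would mean every masquerade block outranks every legitimate block, and 0.5 corresponds to random ranking.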

SLIDE 12

Comparison with ECM

(Same table as Slide 11.) Note: Almost the same

SLIDE 13

Comparison with ECM

(Same table as Slide 11.) Note: With lower power machine

SLIDE 14

Comparison with ECM

(Same table as Slide 11.) Note: Smaller

SLIDE 15

Comparison with ECM

(Same table as Slide 11.)

With a lower-power machine:
Training cost: 535 times smaller
Detection cost: 553 times smaller
Achieved almost the same good characteristics
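
The two ratios follow directly from the training and detection costs reported for ECM and for the proposed method. A quick check, with the ECM training cost converted from minutes to seconds (variable names are illustrative):

```python
ecm_training_s = 1046.37 * 60  # ECM training cost: 1046.37 min. in seconds
our_training_s = 117.33        # proposed method: 117.33 sec.
ecm_detection_s = 22.13        # ECM detection cost in seconds
our_detection_s = 0.04         # proposed method detection cost in seconds

training_ratio = ecm_training_s / our_training_s
detection_ratio = ecm_detection_s / our_detection_s
print(f"training: {training_ratio:.0f}x, detection: {detection_ratio:.0f}x")
```

These ratios are not normalized for hardware: the proposed method was measured on the slower machine, so the raw figures understate its advantage.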

SLIDE 16

Online update

  • To run the system as soon as possible, even if we don't have enough data for training
  • To adapt to changes of users' behaviors
  • Our proposal has a low computational cost
  • Online update of the training model
  • By modifying how the data is applied

SLIDE 17

2-class and 1-class based methods

  • 2-class vs. 1-class
      • Data: 2-class > 1-class
      • Cost: 2-class > 1-class
      • Accuracy: 2-class > 1-class
  • We examine these concretely through experiments

SLIDE 18

Update Under 2-class SVM

# trained commands    20 blks. (10000)   30 blks. (15000)   40 blks. (20000)   50 blks. (25000)
False Positive        8%                 6%                 5%                 3%
Hit Rate              68%                69%                68%                72.74%
ROC Score             0.89               0.90               0.91               0.93
Update costs          43.86 s            59.53 s            89.65 s            107.30 s
SVM training costs    3.36 s             7.04 s             6.90 s             10.03 s
Detection cost        0.04 s

SLIDE 19

Update Under 2-class SVM

(Same table as Slide 18, with one "improved" annotation.)

SLIDE 20

Update Under 2-class SVM

(Same table as Slide 18, with two "improved" annotations.)

SLIDE 21

Update Under 2-class SVM

(Same table as Slide 18, with three "improved" annotations.)

SLIDE 22

Update Under 1-class SVM

# trained commands    20 blks. (2000)   30 blks. (3000)   40 blks. (4000)   50 blks. (5000)
False Positive        12%               8%                7%                6%
Hit Rate              68%               64%               61%               62.77%
ROC Score             0.85              0.86              0.87              0.88
Update costs          0.88 s            1.53 s            1.79 s            2.15 s
SVM training costs    0.17 s            0.18 s            0.22 s            0.27 s
Detection cost        0.04 s

SLIDE 23

Update Under 1-class SVM

(Same table as Slide 22, with one "improved" annotation.)

SLIDE 24

Update Under 1-class SVM

(Same table as Slide 22, with two "improved" annotations.)

SLIDE 25

Results: 2-class vs. 1-class

2-class

# trained commands    20 blks. (10000)   30 blks. (15000)   40 blks. (20000)   50 blks. (25000)
False Positive        8%                 6%                 5%                 3%
Hit Rate              68%                69%                68%                72.74%
ROC Score             0.89               0.90               0.91               0.93
Update costs          43.86 s            59.53 s            89.65 s            107.30 s
SVM training costs    3.36 s             7.04 s             6.90 s             10.03 s
Detection cost        0.04 s

1-class

# trained commands    20 blks. (2000)   30 blks. (3000)   40 blks. (4000)   50 blks. (5000)
False Positive        12%               8%                7%                6%
Hit Rate              68%               64%               61%               62.77%
ROC Score             0.85              0.86              0.87              0.88
Update costs          0.88 s            1.53 s            1.79 s            2.15 s
SVM training costs    0.17 s            0.18 s            0.22 s            0.27 s
Detection cost        0.04 s

SLIDE 26

Conclusion

  • Results
      • Extension of ECM with low computing costs
      • Availability with online update
  • Future work
      • To do more experiments with other data
      • To improve accuracy by integrating several methods
      • To test and extend our proposal to other applications like databases (SQL injections)

SLIDE 27

Thank you