 
An SVM-based Masquerade Detection Method with Online Update Using Co-occurrence Matrix
Liangwen Chen, Masayoshi Aritsugi
Gunma University, Japan

Outline
• Background
• Conventional results
• Our proposal
• Experiments
• Conclusion

Background
• A computer can provide multiple services to multiple users
• Users can log in to a computer through a network
• Security management costs increase
• Hard to protect computers from malicious access completely
→ Masquerade detection

Conventional results

Researchers           Approaches                        False Positive Rate   Hit Rate
Schonlau et al.       Uniqueness                        1.4%                  39.4%
                      Bayes one-step Markov             6.7%                  69.3%
                      Hybrid multistep Markov           3.2%                  49.3%
                      Compression                       5.0%                  34.2%
                      Sequence Matching                 3.7%                  36.8%
                      IPAM                              2.7%                  41.1%
Maxion and Townsend   Naïve Bayes (updating)            1.3%                  61.5%
                      Naïve Bayes (no updating)         4.6%                  66.2%
Kim and Cha           SVM-based approach with voting    9.7%                  80.1%
Oka et al.            ECM                               2.5%                  72.3%

Problems
• Conventional research has attempted to improve the accuracy rate
• Users' behaviors change over time
→ Need to adapt to changes

ECM
False Positive    2.5%
Hit Rate          72.3%
ROC Score         0.918
Training cost     1046.37 min.
Detection cost    22.13 sec.
CPU               Xeon 3.2GHz
Memory Size       4GB

Our strategy
• To borrow the same data
  • To compare results with conventional work
• To borrow ECM
  • Low false positive rate
  • High hit rate
  • High ROC score
• To exploit SVM
  • Low training cost
  • Adapt to changes of users' behaviors

Correlation of commands

Command sequences (in time order):
User1: cd ls less ls less cd ls cd cd ls
User2: emacs gcc gdb emacs ls gcc gdb ls ls emacs
User3: mkdir cp cd ls cp ls cp cp cp cp

Example (User1): cd ls less ls less cd ls cd cd ls
Strength of correlation of ls and less: 2 + 1 = 3

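The count on this slide can be reproduced with a short sketch. It assumes that the strength of correlation between two commands is the number of times the second command appears within a fixed window of commands following each occurrence of the first. The function name `correlation_strength` and the window length of 6 are assumptions for illustration; the slide does not state the window length.

```python
def correlation_strength(sequence, first, second, window=6):
    """Count occurrences of `second` within `window` commands after each `first`.

    The window length is an assumption; the slide does not give it.
    """
    total = 0
    for i, cmd in enumerate(sequence):
        if cmd != first:
            continue
        # Commands that follow position i, limited to the window.
        following = sequence[i + 1 : i + 1 + window]
        total += following.count(second)
    return total


user1 = "cd ls less ls less cd ls cd cd ls".split()
strength = correlation_strength(user1, "ls", "less")
print(strength)  # 2 (from the first ls) + 1 (from the second ls) = 3
```

The first `ls` is followed by two occurrences of `less` and the second `ls` by one, which gives the 2 + 1 = 3 shown on the slide.
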
Co-occurrence matrix

User1: cd ls less ls less cd ls cd cd ls
User2: emacs gcc gdb emacs ls gcc gdb ls ls emacs
User3: mkdir cp cd ls cp ls cp cp cp cp

Our co-occurrence matrix

Original ordering of commands:

        cd  ls  less  emacs  gcc  gdb  mkdir  cp
cd       0   0     0      0    0    0      0   0
ls       0   3     0      3    1    1      0   0
less     0   0     0      0    0    0      0   0
emacs    0   4     0      2    3    3      0   0
gcc      0   4     0      2    1    3      0   0
gdb      0   5     0      2    1    1      0   0
mkdir    0   0     0      0    0    0      0   0
cp       0   0     0      0    0    0      0   0

Reordered: commands in legitimate training data first (high freq. to low freq.), then all other commands:

        emacs  ls  gcc  gdb  cd  less  mkdir  cp
emacs       2   4    3    3   0     0      0   0
ls          3   3    1    1   0     0      0   0
gcc         2   4    1    3   0     0      0   0
gdb         2   5    1    1   0     0      0   0
cd          0   0    0    0   0     0      0   0
less        0   0    0    0   0     0      0   0
mkdir       0   0    0    0   0     0      0   0
cp          0   0    0    0   0     0      0   0

The reordering splits the matrix into four blocks:

                                         Commands in legitimate training data   All other commands
Commands in legitimate training data     A                                      B
All other commands                       C                                      D

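The reordered matrix can be reproduced with a small sketch. It rests on assumptions that the slide does not state: the training data is taken to be User2's sequence from the earlier slides, co-occurrence is counted over the 6 commands that follow each command, and commands are ordered by descending frequency with ties broken by first appearance. The helper names are hypothetical.

```python
from collections import Counter


def cooccurrence_counts(sequence, window=6):
    """Count ordered pairs (first, second) where second follows first within the window."""
    counts = Counter()
    for i, first in enumerate(sequence):
        for second in sequence[i + 1 : i + 1 + window]:
            counts[(first, second)] += 1
    return counts


def as_matrix(counts, commands):
    """Lay the pair counts out as a square matrix in the given command order."""
    return [[counts[(row, col)] for col in commands] for row in commands]


# Assumed training data: User2's sequence from the earlier slides.
training = "emacs gcc gdb emacs ls gcc gdb ls ls emacs".split()
all_commands = ["cd", "ls", "less", "emacs", "gcc", "gdb", "mkdir", "cp"]

# Commands seen in training, most frequent first. sorted() is stable and
# Counter keeps first-appearance order, so ties keep that order.
freq = Counter(training)
in_training = sorted(freq, key=lambda cmd: -freq[cmd])
others = [cmd for cmd in all_commands if cmd not in freq]
ordered = in_training + others

matrix = as_matrix(cooccurrence_counts(training), ordered)
k = len(in_training)
block_a = [row[:k] for row in matrix[:k]]  # training x training
block_b = [row[k:] for row in matrix[:k]]  # training x other
block_c = [row[:k] for row in matrix[k:]]  # other x training
block_d = [row[k:] for row in matrix[k:]]  # other x other

for cmd, row in zip(ordered, matrix):
    print(f"{cmd:6s}", row)
```

Under these assumptions only block A holds non-zero counts, since the other commands never occur in the training sequence.
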
System overview
• Co-occ. matrix generation
• SVM feature vector generation
• SVM processing
• Results
• Refinement

[Figure: flow diagram with nodes "Training data", "New sequence", "Co-occ. matrix gen.", "Co-occ. matrix", "Feature vector", "SVM training", "model", "results", and "Refinement"]

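The first two steps of the overview can be sketched as follows. The slides do not say how a co-occurrence matrix is turned into an SVM feature vector, so row-major flattening is assumed here purely for illustration. The function `feature_vector`, the window length of 6, the command list, and the sequence `new_sequence` are all hypothetical.

```python
def feature_vector(sequence, commands, window=6):
    """Build a co-occurrence matrix over `commands` and flatten it row by row.

    Commands outside `commands` are ignored. The flattening scheme and the
    window length are assumptions, not taken from the slides.
    """
    index = {cmd: i for i, cmd in enumerate(commands)}
    n = len(commands)
    vec = [0] * (n * n)
    for i, first in enumerate(sequence):
        if first not in index:
            continue
        for second in sequence[i + 1 : i + 1 + window]:
            if second in index:
                vec[index[first] * n + index[second]] += 1
    return vec


commands = ["emacs", "ls", "gcc", "gdb"]
training = "emacs gcc gdb emacs ls gcc gdb ls ls emacs".split()
new_sequence = "ls ls emacs gcc".split()  # hypothetical new input

train_vec = feature_vector(training, commands)
new_vec = feature_vector(new_sequence, commands)
print(len(train_vec), train_vec)
print(len(new_vec), new_vec)
```

Vectors of this kind, labeled as legitimate or masquerade, are the sort of input a 2-class SVM library would take for training and prediction. The refinement step is not sketched because the slides give no detail on it.
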
Comparison with ECM

                  ECM             Our method (based on 2-class SVM)
False Positive    2.5%            3.0%
Hit Rate          72.3%           72.74%
ROC Score         0.918           0.926
CPU               Xeon 3.2GHz     Pentium III 1.4GHz
Memory Size       4GB             512MB
Training cost     1046.37 min.    117.33 sec.
Detection cost    22.13 sec.      0.04 sec.

Comparison with ECM: accuracy is almost the same (ECM vs. our method: False Positive 2.5% vs. 3.0%, Hit Rate 72.3% vs. 72.74%, ROC Score 0.918 vs. 0.926).
Comparison with ECM: our method was run on a lower-power machine (ECM: Xeon 3.2GHz with 4GB memory; our method: Pentium III 1.4GHz with 512MB memory).
Comparison with ECM: costs are smaller with our method (training cost 1046.37 min. vs. 117.33 sec.; detection cost 22.13 sec. vs. 0.04 sec.).

Comparison with ECM: summary
With a lower-power machine, our method (based on 2-class SVM) achieved:
• Training cost: 535 times smaller
• Detection cost: 553 times smaller
• Almost the same good characteristics (False Positive, Hit Rate, ROC Score)

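The two cost ratios follow directly from the figures in the comparison table once the ECM training cost is converted from minutes to seconds. A minimal check, with variable names chosen here for illustration:

```python
# Costs as reported in the comparison table.
ecm_training_sec = 1046.37 * 60   # ECM training cost, minutes converted to seconds
ours_training_sec = 117.33        # our method, seconds
ecm_detection_sec = 22.13         # ECM detection cost, seconds
ours_detection_sec = 0.04         # our method, seconds

training_ratio = ecm_training_sec / ours_training_sec
detection_ratio = ecm_detection_sec / ours_detection_sec

print(f"training cost ratio:  {training_ratio:.2f}")
print(f"detection cost ratio: {detection_ratio:.2f}")
```

Both ratios round to the values on the slide. The two systems were measured on different machines, so the ratios compare reported timings rather than timings on equal hardware.
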