Practical Anomaly Detection based on Classifying Frequent Traffic Patterns (presentation transcript)


  1. Practical Anomaly Detection based on Classifying Frequent Traffic Patterns
     Ignasi Paredes-Oliva (1), Ismael Castell-Uroz (1), Pere Barlet-Ros (1), Xenofontas Dimitropoulos (2), Josep Solé-Pareta (1)
     (1) UPC BarcelonaTech, Spain, {iparedes,icastell,pbarlet,pareta}@ac.upc.edu
     (2) ETH Zürich, Switzerland, fontas@tik.ee.ethz.ch
     15th IEEE Global Internet Symposium (GI), Orlando, FL, United States, March 30th, 2012

  2. Outline
     1. Introduction
     2. Related Work
     3. Our Proposal
     4. Performance Evaluation
     5. Conclusions


  4. The problem
     - Growth of cyber-attacks [1]
     - Anomaly detection systems are not widely deployed
       (e.g., too many false positives, complex black boxes)
     - Anomaly classification and root-cause analysis are still open issues
       (e.g., manual analysis is error-prone, complex, slow and expensive) [2]
     Our goal
     - A simple system for automatic anomaly detection and classification
     - High classification accuracy and low false positives
     - A conceptually simple working scheme
     [1] Kim-Kwang Raymond Choo, "The cyber threat landscape: Challenges and future research directions", Computers & Security, 2011.
     [2] M. Molina et al., "Anomaly Detection in Backbone Networks: Building a Security Service Upon an Innovative Tool", TNC 2010.



  7. Related work and contributions
     - Many proposals exist on anomaly detection, but anomaly classification has been only marginally studied
     Contributions of this paper
     - A novel approach for automatic anomaly detection and classification, based on classifying frequent traffic patterns
     - Evaluated using data from two large networks
     - High classification accuracy and a low false-positive ratio
     - System deployed in the Catalan NREN



  10. System Overview
     Two phases:
     - Offline (build a model to classify anomalies):
       Frequent Item-Set Mining -> Feature Extraction -> Machine Learning -> Model
     - Online (use the model to classify incoming traffic):
       Frequent Item-Set Mining -> Feature Extraction -> Classification (using the Model)
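The two-phase scheme above can be sketched as plain function composition. The function names and callables here are placeholders for illustration, not the authors' code:

```python
def build_model(labeled_windows, mine, extract, learn):
    """Offline phase: mine frequent item-sets from labeled traffic windows,
    extract one feature vector per item-set, and train a classifier.
    mine/extract/learn are placeholder callables (assumptions)."""
    X, y = [], []
    for flows, label in labeled_windows:
        for itemset in mine(flows):
            X.append(extract(itemset))
            y.append(label)
    return learn(X, y)

def classify_traffic(flows, model, mine, extract):
    """Online phase: mine item-sets in incoming traffic and label each
    one with the previously learned model."""
    return [(itemset, model.predict(extract(itemset)))
            for itemset in mine(flows)]
```

The point of the split is that the same mining and feature-extraction steps run in both phases; only the last stage differs (learning offline, prediction online).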

  11. Frequent Item-Set Mining
     - Originally used in market-basket analysis to find products that are frequently bought together and make appealing offers (e.g., beer and chips)
     - What is an item-set? A compact summarization of elements occurring together
     - Why is it useful for anomaly detection? Many attacks involve a high volume of flows with common features (e.g., a Port Scan: many flows with the same sIP and dIP)

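The market-basket intuition can be sketched with a toy pair-counting miner. The baskets are made up for illustration; real miners such as Apriori or FP-growth generalize this to item-sets of any size:

```python
from collections import Counter
from itertools import combinations

def frequent_pairs(baskets, min_support=2):
    """Toy market-basket miner: count co-occurring item pairs and keep
    those appearing in at least min_support baskets."""
    counts = Counter()
    for basket in baskets:
        counts.update(combinations(sorted(basket), 2))
    return {pair: n for pair, n in counts.items() if n >= min_support}

baskets = [{"beer", "chips", "milk"}, {"beer", "chips"}, {"bread", "milk"}]
print(frequent_pairs(baskets))  # -> {('beer', 'chips'): 2}
```

The same counting idea carries over to flows: instead of products per basket, the "items" are field values per flow record.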

  15. Frequent Item-Set Mining: Port Scan example

                     sIP          dIP            sPort   dPort
     1st flow        X.77.17.59   Y.88.243.209   41393   21209
     2nd flow        X.77.17.59   Y.88.243.209   41393   54766
     3rd flow        X.77.17.59   Y.88.243.209   41393   31448
     4th flow        X.77.17.59   Y.88.243.209   41393   58514
     ...
     2911th flow     X.77.17.59   Y.88.243.209   41393   48732

     item-set        X.77.17.59   Y.88.243.209   41393   *

     Further information is needed per item-set in order to classify it.

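The wildcard extraction in the example can be sketched as follows. This per-field dominant-value heuristic is a simplification (an assumption, not the paper's algorithm), but it recovers the item-set shown in the table:

```python
from collections import Counter

def frequent_itemset(flows, min_support=0.9):
    """Keep a field's value if it is shared by at least min_support of the
    flows, else emit the wildcard '*'. Simplified sketch; the paper runs
    a full frequent item-set miner over the flow records."""
    itemset = {}
    for field in ("sIP", "dIP", "sPort", "dPort"):
        value, count = Counter(f[field] for f in flows).most_common(1)[0]
        itemset[field] = value if count / len(flows) >= min_support else "*"
    return itemset

# Four of the flows from the Port Scan example above
flows = [{"sIP": "X.77.17.59", "dIP": "Y.88.243.209",
          "sPort": 41393, "dPort": p}
         for p in (21209, 54766, 31448, 58514)]
print(frequent_itemset(flows))
# -> {'sIP': 'X.77.17.59', 'dIP': 'Y.88.243.209', 'sPort': 41393, 'dPort': '*'}
```

The wildcard on dPort is exactly the port-scan signature: one source sweeping many destination ports from a fixed source port.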

  18. Feature Extraction
     Computed features for each frequent item-set:

     Feature                      Value if defined          Value if undefined
     Src IP / Dst IP              True                      False
     Src / Dst Port               Port number               NaN
     Protocol                     Protocol number           NaN
     URG/ACK/PSH/RST/SYN/FIN      True                      False
     # Bytes / # Packets          Bytes per packet (bpp)
     # Packets / # Flows          Packets per flow (ppf)
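The feature table can be sketched as a small extraction function. The field names and the aggregate byte/packet/flow counts below are assumptions for illustration, not values from the paper:

```python
import math

def extract_features(itemset, n_bytes, n_packets, n_flows):
    """One feature vector per frequent item-set, following the table
    above. '*' marks an undefined item; undefined ports become NaN."""
    def port(field):
        v = itemset.get(field, "*")
        return v if v != "*" else math.nan
    return {
        "src_ip_defined": itemset.get("sIP", "*") != "*",
        "dst_ip_defined": itemset.get("dIP", "*") != "*",
        "src_port": port("sPort"),
        "dst_port": port("dPort"),
        "bpp": n_bytes / n_packets,    # bytes per packet
        "ppf": n_packets / n_flows,    # packets per flow
    }

f = extract_features({"sIP": "X.77.17.59", "dIP": "Y.88.243.209",
                      "sPort": 41393, "dPort": "*"},
                     n_bytes=120000, n_packets=3000, n_flows=2911)
# f["bpp"] == 40.0; f["dst_port"] is NaN because dPort is the wildcard
```

Encoding "undefined" explicitly (False/NaN) matters: for a port scan, the *absence* of a fixed destination port is itself a discriminative feature.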

  19. Building the classifier (offline)
     Goal: build a model from manually labeled frequent item-sets.
     Output classes:
     - Anomalous: DoS (DDoS, SYN/ACK/UDP/ICMP floods), Network Scans (ICMP/Other Network Scans), Port Scans (SYN/ACK/UDP Port Scans)
     - Normal (legitimate traffic)
     - Unknown (neither normal nor fitting any anomalous class)
     Labeled item-sets + features + output classes are fed to the C5.0 machine-learning algorithm, whose output is the classification model.

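Since C5.0 is a commercial tool, a sketch with scikit-learn's decision tree can stand in (an assumption, not the authors' setup) to illustrate training on labeled item-set features:

```python
from sklearn.tree import DecisionTreeClassifier

# Toy feature vectors: [src_port_defined, dst_port_defined, syn_flag, bpp, ppf]
# Labels are illustrative, not taken from the paper's dataset.
X = [
    [1, 0, 1, 40.0, 1.0],    # fixed src port, wildcard dst port, SYNs -> port scan
    [1, 1, 0, 900.0, 50.0],  # long flows with large packets -> normal
    [0, 1, 1, 40.0, 1.0],    # wildcard src port, one dst port, SYNs -> network scan
]
y = ["port_scan", "normal", "network_scan"]

model = DecisionTreeClassifier(random_state=0).fit(X, y)
print(model.predict([[1, 0, 1, 40.0, 1.0]])[0])  # -> port_scan
```

A decision tree fits the slides' goals well: it is a white-box model whose branching rules (e.g., "dst port undefined and SYN set") can be read and audited by an operator, unlike a black-box detector.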
