
An Empirical Evaluation of Entropy-based Traffic Anomaly Detection



  1. An Empirical Evaluation of Entropy-based Traffic Anomaly Detection. George Nychis, Vyas Sekar, David Andersen, Hyong Kim, Hui Zhang (Carnegie Mellon University)

  2. Entropy-based Anomaly Detection
     - Goal: detect abnormal behavior, e.g., scan activity, DDoS, bandwidth floods
     - Traditional: raw traffic volume (insufficient), e.g., total number of packets in an epoch
     - Modern: entropy-based traffic metrics, e.g., relative randomness in the distribution of packets across ports
     - Example anomaly (figure): detectable in entropy, undetected in traffic volume
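
As a rough illustration of the contrast on this slide, the sketch below computes a normalized Shannon entropy over packets-per-port counts for two epochs. The slides do not give the formula: the normalization by log2 of the number of distinct ports, the function name `normalized_entropy`, and the toy packet counts are all assumptions made for this example.

```python
import math
from collections import Counter


def normalized_entropy(counts):
    """Shannon entropy of a histogram, scaled to [0, 1].

    Assumption: normalize by log2(number of distinct keys), a common
    convention; the slides do not state which normalization is used.
    """
    total = sum(counts.values())
    if total == 0 or len(counts) < 2:
        return 0.0
    h = 0.0
    for c in counts.values():
        if c > 0:
            p = c / total
            h -= p * math.log2(p)
    return h / math.log2(len(counts))


# Toy epochs (made-up numbers): packets observed per destination port.
normal_epoch = Counter({80: 600, 443: 300, 53: 100})
# Hypothetical scan: 50 extra packets spread over 50 otherwise unused ports.
scan_epoch = normal_epoch + Counter({port: 1 for port in range(1000, 1050)})

volume_change = sum(scan_epoch.values()) / sum(normal_epoch.values())
print(f"volume ratio:   {volume_change:.2f}")
print(f"normal entropy: {normalized_entropy(normal_epoch):.3f}")
print(f"scan entropy:   {normalized_entropy(scan_epoch):.3f}")
```

In this toy case the packet count grows by only 5%, which a volume threshold could easily miss, while the port distribution gains 50 new keys and its entropy shifts by a much larger relative amount.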

  3-11. Motivation
     - Pipeline (diagram): NetFlow data → traffic feature timeseries → anomaly detection → alarm
     - Volume feature: sum(packets), with detection output A(pkts)
     - Entropy-based features:
       - H(addresses): distribution of packets across addresses, with detection output A(addr)
       - H(ports): distribution of packets across ports, with detection output A(port)
       - H(flow-size): distribution of flow sizes (in packets), with detection output A(FSD)
       - H(degree): distribution of host communication, with detection output A(deg)
     - Goal: understanding the features
       1. How unique are their detection capabilities?
       2. How effective are they?
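
To make the four feature families concrete, here is a minimal sketch of how their histograms might be built from flow records for one epoch. The `Flow` record layout, the function name `feature_histograms`, and the choice to define degree as the number of distinct destinations per source are assumptions for illustration; the slides only name the distributions.

```python
from collections import Counter, defaultdict, namedtuple

# Hypothetical flow record; real NetFlow exports carry more fields.
Flow = namedtuple("Flow", "src dst sport dport packets")


def feature_histograms(flows):
    """Build one histogram per feature family for a single epoch."""
    src_addr = Counter()   # packets per source address
    dst_port = Counter()   # packets per destination port
    flow_size = Counter()  # number of flows per flow size (in packets)
    peers = defaultdict(set)

    for f in flows:
        src_addr[f.src] += f.packets
        dst_port[f.dport] += f.packets
        flow_size[f.packets] += 1
        peers[f.src].add(f.dst)

    # Assumed definition: out-degree = distinct destinations contacted;
    # the histogram counts how many hosts have each out-degree.
    out_degree = Counter(len(dsts) for dsts in peers.values())
    return {
        "src_addr": src_addr,
        "dst_port": dst_port,
        "flow_size": flow_size,
        "out_degree": out_degree,
    }


flows = [
    Flow("a", "b", 1000, 80, 10),
    Flow("a", "c", 1001, 80, 2),
    Flow("d", "b", 1002, 443, 2),
]
hists = feature_histograms(flows)
print(hists["flow_size"])
print(hists["out_degree"])
```

Each histogram can then be reduced to a single entropy value per epoch, giving one timeseries per feature.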

  12-17. Analysis Method
     - NetFlow data: 5 one-month-long traces (CMU-2005, CMU-2008, GATech-2008, GEANT-2005, Internet2-2006)
     - Entropy timeseries: H(addresses), H(ports), H(flow-size), H(degree)
     - Timeseries correlation: are the distributions structurally similar?
     - Anomaly detection: A(addr), A(port), A(FSD), A(deg)
     - Anomaly correlation
     - Goal (1): Uniqueness

  18-22. Entropy Timeseries (February 2005)
     [Figure: timeseries panels for In-degree, Out-degree, Flow-size, Src. Address, Dst. Address, Src. Port, Dst. Port, and raw traffic volume]


  24. Correlation in Entropy Timeseries
     - Pairwise correlation scores for CMU-2005
     - All 4 other traces exhibit similar behavior!
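
A pairwise correlation table like the one on this slide could be computed along the following lines. The slides do not say which correlation measure was used; Pearson correlation is assumed here, and the series names and values are invented for the example.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / math.sqrt(vx * vy)


def pairwise_correlation(series):
    """Correlation score for every unordered pair of named timeseries."""
    return {
        (a, b): pearson(series[a], series[b])
        for a, b in combinations(sorted(series), 2)
    }


# Made-up entropy values, one per epoch.
entropy_series = {
    "src_port": [0.80, 0.82, 0.60, 0.81, 0.79],
    "dst_port": [0.78, 0.80, 0.58, 0.79, 0.77],
    "flow_size": [0.50, 0.48, 0.51, 0.30, 0.49],
}
scores = pairwise_correlation(entropy_series)
for pair, score in scores.items():
    print(pair, round(score, 3))
```

Here the two port series move together by construction, so their score is close to 1, while the flow-size series dips in a different epoch.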

  25-27. Why Entropy is Structurally Correlated
     1. Port / Address Correlation: properties of network traffic
        - The same traffic contributes X packets to address A and X packets to port B
        - If hosts have few connections and ports are uniformly random → similar distributions
     2. Source / Destination Correlation: flow accounting
        - Bi-directional: Addr1(23) → Addr2(53) yields Saddr(23), Daddr(53)
        - Uni-directional: Addr1 → Addr2 (23) and Addr2 → Addr1 (53) yield Saddr(23), Daddr(23), Saddr(53), Daddr(53)
        - Uni-directionality destroys 2 unique distributions
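
The accounting example can be replayed in code. This sketch follows one reading of the slide: Addr1 sends 23 packets and Addr2 sends 53 in the same conversation. The tuple layout and both function names are assumptions for illustration.

```python
from collections import Counter


def bidirectional_histograms(sessions):
    """One record per conversation: each side is credited only its own packets."""
    src, dst = Counter(), Counter()
    for initiator, responder, fwd_pkts, rev_pkts in sessions:
        src[initiator] += fwd_pkts
        dst[responder] += rev_pkts
    return src, dst


def unidirectional_histograms(sessions):
    """Two records per conversation: every packet count is credited to
    both the sender (as source) and the receiver (as destination)."""
    src, dst = Counter(), Counter()
    for initiator, responder, fwd_pkts, rev_pkts in sessions:
        src[initiator] += fwd_pkts
        dst[responder] += fwd_pkts
        src[responder] += rev_pkts
        dst[initiator] += rev_pkts
    return src, dst


sessions = [("Addr1", "Addr2", 23, 53)]
print(bidirectional_histograms(sessions))
print(unidirectional_histograms(sessions))
```

Under uni-directional accounting every host appears in both histograms, so the source and destination distributions carry largely the same information.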

  28-29. Why Anomalies are Correlated
     - Root-cause analysis approach (flowchart, labeled "Traffic volume" in the final build): anomaly → analyze top-k flows → remove → recompute entropy → anomaly subsides? If yes, those flows are the cause; if no, repeat
     - Our results:
       - Ports & addresses: only detect alpha flows (correlation)
       - FSD: detects scans; Degree: SYN flood
       - FSD & Degree are unique (no correlation)
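
The root-cause loop can be sketched as follows. This is a simplified reading of the flowchart, not the authors' tooling: the dictionary-based flow records, the names `shannon_entropy` and `find_root_cause`, ranking flows by packet count, and the "within a tolerance of a baseline" test for "subsides" are all assumptions.

```python
import math
from collections import Counter


def shannon_entropy(flows, key):
    """Entropy (in bits) of the packet distribution over flows[key]."""
    counts = Counter()
    for f in flows:
        counts[f[key]] += f["packets"]
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def find_root_cause(flows, key, baseline, tolerance, k=1, max_rounds=10):
    """Remove the top-k flows by packets until entropy returns near baseline.

    Returns the removed flows if the anomaly subsides, otherwise None.
    """
    remaining = sorted(flows, key=lambda f: f["packets"], reverse=True)
    removed = []
    for _ in range(max_rounds):
        if not remaining:
            break
        removed.extend(remaining[:k])
        remaining = remaining[k:]
        if remaining and abs(shannon_entropy(remaining, key) - baseline) <= tolerance:
            return removed
    return None


# Four ordinary flows plus one hypothetical alpha flow dominating the epoch.
epoch = [
    {"dport": 80, "packets": 10},
    {"dport": 443, "packets": 10},
    {"dport": 53, "packets": 10},
    {"dport": 22, "packets": 10},
    {"dport": 9999, "packets": 1000},
]
cause = find_root_cause(epoch, "dport", baseline=2.0, tolerance=0.1)
print(cause)
```

With the alpha flow removed, the four remaining ports are equally loaded and the entropy returns to 2 bits, so the loop reports that single flow as the cause.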

  30. Summary of Goal (1): Uniqueness
     - Strong correlation in ports and addresses
     - Flow-size and degree: unique
     - Structural correlation: properties of traffic
     - Anomaly correlation: types of anomalies seen

  31. Understanding Effectiveness
     - Inject synthetic anomalies into the NetFlow data
     - Same pipeline (diagram): entropy timeseries, timeseries correlation, anomaly detection, anomaly correlation
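
A minimal sketch of the injection step, assuming a port scan can be modeled as one single-packet flow per probed port. The slides do not describe how the synthetic anomalies were generated, so the record layout, the function names, and the addresses below are hypothetical.

```python
from collections import Counter


def inject_port_scan(flows, scanner, target, ports):
    """Return a new flow list with one single-packet flow per scanned port."""
    scan = [
        {"src": scanner, "dst": target, "dport": port, "packets": 1}
        for port in ports
    ]
    return list(flows) + scan


def flow_size_histogram(flows):
    """Number of flows observed for each flow size (in packets)."""
    return Counter(f["packets"] for f in flows)


# Made-up background traffic for one epoch.
background = [
    {"src": "10.0.0.1", "dst": "10.0.0.9", "dport": 80, "packets": 40},
    {"src": "10.0.0.2", "dst": "10.0.0.9", "dport": 443, "packets": 25},
    {"src": "10.0.0.3", "dst": "10.0.0.9", "dport": 53, "packets": 2},
]
with_scan = inject_port_scan(background, "10.0.0.66", "10.0.0.9", range(1, 101))

print(flow_size_histogram(background))
print(flow_size_histogram(with_scan))
```

The injected scan adds only 100 packets but 100 new one-packet flows, which reshapes the flow-size distribution far more than the packet total.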

  32. Best Distribution for an Anomaly?
     - Anomalies: BW Flood, Scanner, Multiple Scanners, Port Scan, and SYN Flood
     - Figure annotation: FSD best
     - Other results:
       - BW Flood: ports & addresses; already detectable by traffic volume detector
       - Scans: difficult to detect … FSD and degree

  33. Implications and Conclusions
     - Look beyond ports and addresses
     - Select complementary traffic distributions
     - Uni-directional accounting introduces biases in traffic distributions
     - Future work: can correlations be leveraged?
       - During anomalies found in flow-size & degree, correlation drops between ports & addresses

  34. Questions?
