The Subspace Method for Diagnosing Network-Wide Traffic Anomalies

The Subspace Method for Diagnosing Network-Wide Traffic Anomalies Anukool Lakhina, Mark Crovella, Christophe Diot

What’s happening in my network?
• Is my customer being attacked? probed? infected?
• Is there a sudden traffic shift?
• An external route change?
• A routing loop?
• An equipment outage?
Automated methods for reliably and generally answering such questions are lacking.

A General Framework
• We can treat all such problems as special cases of the general question: Is my network experiencing unusual conditions?
• Then, adopt the following framework:
– Detection: Is there an unusual event?
– Identification: Which of the possible explanations fits best?
– Quantification: How serious is the problem?

Statistical Approach
The advantage of such a framework is that it lends itself to a statistical approach:
– Detection: Outlier detection [slide callout: Anomaly]
– Identification: Hypothesis testing [slide callout: Diagnosis]
– Quantification: Estimation

A Need for Whole-Network Diagnosis
Our Thesis: Effective diagnosis of network anomalies requires a whole-network approach.
For example, diagnosing traffic anomalies requires analyzing traffic from all links.

But, This Is Difficult!
• Need to study traffic from all links in a network simultaneously
– Large amount of data
– Traffic is nonstationary
– Varying link utilization levels
– 100s of links → High dimensionality
How do we extract meaning from such high-dimensional data in a systematic manner?
[Figure: traffic time series, one panel each for DNVR−SNVA, HSTN−KSCY, LOSA−SNVA, CHIN−NYCM, ATLA−ATLA, DNVR−DNVR, KSCY−KSCY, SNVA−SNVA, LOSA−LOSA, HSTN−HSTN, STTL−STTL, WASH−WASH]

Low Intrinsic Dimensionality of Link Traffic
Studied via Principal Component Analysis
Key result: Normal traffic is well approximated by a low-dimensional space
For example: Traffic on 40+ links is well approximated in a space of only 4 dimensions
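As a rough illustration of the low-dimensionality claim, the sketch below runs PCA on a synthetic link-traffic matrix and reports how much variance the first few principal components capture. The data is a stand-in, not the measurements behind the slides: the shapes (1008 time bins, 40 links), the 4 shared patterns, the noise level, and the function name `variance_captured` are all assumptions chosen for the example.

```python
import numpy as np

def variance_captured(Y):
    """Cumulative fraction of variance captured by the first k principal
    components of Y (rows = time bins, columns = links), for k = 1..m."""
    Yc = Y - Y.mean(axis=0)                    # center each link's time series
    s = np.linalg.svd(Yc, compute_uv=False)    # singular values, descending
    var = s ** 2                               # proportional to PCA eigenvalues
    return np.cumsum(var) / var.sum()

# Synthetic stand-in: 40 links driven by 4 shared temporal patterns plus noise.
rng = np.random.default_rng(0)
t, m, k_true = 1008, 40, 4
patterns = rng.normal(size=(t, k_true))        # hypothetical common trends
mixing = rng.normal(size=(k_true, m))          # how each link mixes the trends
Y = patterns @ mixing + 0.05 * rng.normal(size=(t, m))

captured = variance_captured(Y)
print(captured[:6])                            # values near 1 from k = 4 onward
```

On real traffic the curve would typically flatten less abruptly; the point is only that a sharp rise followed by a plateau suggests a small k.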

Reasons for Low Dimensionality of Traffic
• Generally, traffic on different links is not independent
• Link traffic is the superposition of origin-destination flows (OD flows)
– The same OD flow passes over multiple links, inducing correlation among links
– All OD flows tend to vary according to common daily and weekly cycles, and so are themselves correlated
[See SIGMETRICS 2004 paper]
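The superposition argument can be sketched with a toy routing matrix. Everything here is hypothetical: the 3-link, 3-flow topology, the matrix name `A`, and the independent exponential flow volumes (real OD flows share daily and weekly cycles, which would add further correlation).

```python
import numpy as np

# Hypothetical routing matrix: A[i, j] = 1 if OD flow j traverses link i.
A = np.array([
    [1, 1, 0],   # link 0 carries flows 0 and 1
    [1, 0, 1],   # link 1 carries flows 0 and 2
    [0, 1, 1],   # link 2 carries flows 1 and 2
])

rng = np.random.default_rng(1)
t = 2000
X = rng.exponential(scale=1.0, size=(t, 3))   # independent OD flow volumes
Y = X @ A.T                                   # link traffic = sum of flows on each link

# Links that share an OD flow are correlated even though the flows are not.
corr = np.corrcoef(Y, rowvar=False)
print(np.round(corr, 2))
```

Each pair of links shares exactly one flow, so the off-diagonal correlations come entirely from the shared routing.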

The Subspace Method
• An approach to separate normal from anomalous traffic
• Define $S$ as the space spanned by the first k principal components (the normal subspace)
• Define $\tilde{S}$ as the space spanned by the remaining principal components (the anomalous subspace)
• Then, decompose traffic on all links by projecting onto $S$ and $\tilde{S}$ to obtain: $y = \hat{y} + \tilde{y}$
– $y$: traffic vector of all links at a particular point in time
– $\hat{y}$: normal traffic vector
– $\tilde{y}$: residual traffic vector
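A minimal sketch of the decomposition, assuming the principal components are obtained from an SVD of the mean-centered traffic matrix. The function names `fit_subspaces` and `decompose`, the choice k = 4, and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def fit_subspaces(Y, k):
    """Return the column means and the projectors onto the normal subspace
    (first k principal components) and the residual subspace (the rest)."""
    mean = Y.mean(axis=0)
    # Rows of Vt are principal axes, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Y - mean, full_matrices=False)
    P = Vt[:k].T                          # m x k orthonormal basis
    C = P @ P.T                           # projector onto the normal subspace
    C_res = np.eye(Y.shape[1]) - C        # projector onto the residual subspace
    return mean, C, C_res

def decompose(y, mean, C, C_res):
    """Split one (centered) link-traffic vector into normal and residual parts."""
    yc = y - mean
    return C @ yc, C_res @ yc

# Synthetic stand-in for a (time bins x links) traffic matrix.
rng = np.random.default_rng(2)
Y = rng.normal(size=(500, 4)) @ rng.normal(size=(4, 12)) \
    + 0.1 * rng.normal(size=(500, 12))

mean, C, C_res = fit_subspaces(Y, k=4)
y = Y[0]
y_hat, y_res = decompose(y, mean, C, C_res)
print(np.linalg.norm(y_hat), np.linalg.norm(y_res))
```

Because the two projectors are complementary, the two parts are orthogonal and sum back to the centered traffic vector.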

The Subspace Method, Geometrically
In general, anomalous traffic results in a large value of $\tilde{y}$
[Figure: axes are Traffic on Link 1 and Traffic on Link 2]

Outline
• Subspace Method applied to Link Traffic
– Problem: Volume Anomaly Diagnosis
– Detection, Identification, Quantification
– Validation
• Subspace Method applied to Flow Traffic
– Problem: General Anomaly Detection
– Sample Results
• Conclusions

Diagnosing Volume Anomalies
• A volume anomaly is a sudden change in an OD flow’s traffic (i.e., point-to-point traffic)
• Problem Statement: Given link traffic measurements, diagnose the volume anomalies
• A first application of the subspace method

An Illustration
Sprint-Europe Backbone Network
The Diagnosis Problem requires analyzing traffic on all links to:
1) Detect the time of the anomaly
2) Identify the source & destination
3) Quantify the size of the anomaly
[Figure: time series from Fri to Sun for OD flow i−b and for links c−b, d−c, f−d, i−f]

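Subspace Method: Detection
• Error Bounds on Squared Prediction Error: $\mathrm{SPE} = \|\tilde{y}\|^2$
• Assuming multivariate Gaussian data, traffic is normal when $\mathrm{SPE} \le \delta_\alpha^2$, where $\delta_\alpha^2$ is the threshold at the $1-\alpha$ confidence level
Result due to [Jackson and Mudholkar, 1979]
[Figure: axes are Traffic on Link 1 and Traffic on Link 2]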
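The slide does not reproduce the threshold formula. The sketch below uses the Q-statistic approximation usually attributed to Jackson and Mudholkar, computed from the eigenvalues of the residual principal components; whether the authors used exactly this form is an assumption. The function name `q_threshold`, the eigenvalue list, and the choice of alpha are illustrative only.

```python
import numpy as np
from statistics import NormalDist

def q_threshold(eigvals, k, alpha=0.001):
    """Approximate SPE threshold at confidence level 1 - alpha.

    eigvals: covariance eigenvalues in descending order.
    k: number of components assigned to the normal subspace.
    """
    lam = np.asarray(eigvals, dtype=float)[k:]        # residual eigenvalues
    phi1, phi2, phi3 = (np.sum(lam ** i) for i in (1, 2, 3))
    h0 = 1.0 - 2.0 * phi1 * phi3 / (3.0 * phi2 ** 2)
    c_alpha = NormalDist().inv_cdf(1.0 - alpha)       # standard normal quantile
    term = (c_alpha * np.sqrt(2.0 * phi2 * h0 ** 2) / phi1
            + 1.0
            + phi2 * h0 * (h0 - 1.0) / phi1 ** 2)
    return phi1 * term ** (1.0 / h0)

# Hypothetical spectrum: 2 dominant components, 10 residual ones with unit variance.
eigvals = [5.0, 4.0] + [1.0] * 10
threshold = q_threshold(eigvals, k=2, alpha=0.01)

def is_anomalous(y_res, threshold):
    """Flag a time bin whose squared residual norm (SPE) exceeds the threshold."""
    return float(y_res @ y_res) > threshold

print(threshold)
```

With ten unit-variance residual dimensions the SPE follows a chi-square distribution with 10 degrees of freedom, so the result can be sanity-checked against its 99th percentile (about 23.2).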

SPE vs. All Traffic
[Figure: value of $\|y\|^2$ over time and value of $\|\tilde{y}\|^2$ over time]
SPE ($\|\tilde{y}\|^2$) at anomaly time points clearly stands out

Results on True Anomalies: Sprint-1
40 largest deviations in OD flows via Fourier
[Figure panels: Detection, Identification, Quantification]
“Knee” in curve: natural cutoff for detection
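The slides name identification and quantification but do not spell out the procedure. One plausible reading of hypothesis-based identification is sketched here: treat each OD flow as a candidate explanation, fit the anomaly size that best explains the residual traffic under that hypothesis, and keep the candidate with the smallest unexplained residual. The routing matrix `A`, the one-dimensional normal subspace, the function name, and the 0/1 routing assumption used to convert the fitted size into a per-flow volume are all assumptions.

```python
import numpy as np

def identify_and_quantify(y_res, A, C_res):
    """Return (index of the best-fitting OD flow, estimated anomaly volume)."""
    best = None
    for i in range(A.shape[1]):
        norm_i = np.linalg.norm(A[:, i])
        theta_res = C_res @ (A[:, i] / norm_i)   # flow direction seen in the residual space
        denom = theta_res @ theta_res
        if denom < 1e-12:
            continue                             # flow invisible in the residual space
        f_hat = (theta_res @ y_res) / denom      # least-squares anomaly magnitude
        err = np.linalg.norm(y_res - theta_res * f_hat)
        if best is None or err < best[1]:
            # With 0/1 routing, each link on the path carries f_hat / norm_i extra traffic.
            best = (i, err, f_hat / norm_i)
    return best[0], best[2]

# Hypothetical 5-link network with 4 OD flows (columns of A).
A = np.array([
    [1, 0, 0, 1],
    [1, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 1, 1],
], dtype=float)

# Hypothetical normal subspace: a single direction where all links move together.
u = np.ones(5) / np.sqrt(5)
C_res = np.eye(5) - np.outer(u, u)

# Inject an anomaly of 100 units into OD flow 2 and look only at its residual.
y_res = C_res @ (A[:, 2] * 100.0)
flow, volume = identify_and_quantify(y_res, A, C_res)
print(flow, volume)
```

In this noise-free toy case the injected flow explains the residual exactly, so it is selected and its volume is recovered; with measured traffic the fit would only be approximate.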

Outline
• Subspace Method applied to Link Traffic
– Problem: Volume Anomaly Diagnosis
– Detection, Identification, Quantification
– Validation
• Subspace Method applied to OD Flow Traffic
– Problem: General Anomaly Detection
– Sample Results
• Conclusions

Beyond Volume Anomalies
• Volume anomalies: important, but not the entire set of anomalies of interest to operators.
• Operators are also interested in:
– DOS attacks, flash crowds, port scans, worm propagation, network equipment outages, changes in ingress/egress traffic patterns, ...
• Link data doesn't seem to hold enough information to accurately detect such a wide range of anomaly types.
• Therefore, we turn to IP flow data

Characterization Methodology
• Extend subspace method to diagnose anomalies directly in OD flow traffic timeseries
– Detection in both normal and anomalous subspaces
• Examine OD flow traffic as three separate views: # Bytes, # Packets, # IP-flows
• Manually inspect each anomaly found over a 4-week period in the Abilene network
– Using 5-tuple headers of sampled flow data
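The labels used on the example slides (BP, PF, BPF) name the views in which an anomaly was detected. The slides do not say how detections from the three views are merged; the sketch below assumes detections are matched on the same time bin and OD flow. The function name and the sample detections are hypothetical.

```python
def label_anomalies(flagged):
    """Merge per-view detections into labeled events.

    flagged: dict mapping a view tag ('B' bytes, 'P' packets, 'F' IP-flows)
             to a set of (time_bin, od_flow) pairs flagged in that view.
    Returns a dict mapping (time_bin, od_flow) to a label such as 'BP' or 'PF'.
    """
    events = {}
    for tag in "BPF":                       # fixed order keeps labels canonical
        for key in flagged.get(tag, set()):
            events[key] = events.get(key, "") + tag
    return events

# Hypothetical detections from the three views.
flagged = {
    "B": {(10, 3)},
    "P": {(10, 3), (42, 7)},
    "F": {(42, 7), (55, 1)},
}
events = label_anomalies(flagged)
print(sorted(events.items()))
```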

An example BP anomaly (heavy flow)
Dominant Source IP: 192.88.112.0, which accounts for 32% of B, 20% of P and 0.15% of F.
Dominant Dest. IP: 160.91.192.0, which accounts for 32% of B, 20% of P and 0.15% of F.
Dominant Pair: 192.88.112.0-160.91.192.0, for 32% of B, 20% of P and 0.15% of F.
Dominant Dest. Port: 5002 (iperf port, used by SLAC)

An example PF anomaly (DOS attack)
Dominant Source IP: No dominant single source
Dominant Dest. IP: 211.65.112.0 accounts for 80% of P traffic and 92% of F traffic.
Dominant Pair: No single pair dominant
Dominant Ports: No dominant source or destination port found
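The "dominant" attributes in these examples are shares of bytes, packets, or flows attributed to one value of a header field. A minimal sketch of that bookkeeping follows, assuming flow records are available as dictionaries; the record layout, the field names, and the sample addresses (taken from documentation address ranges, not from the Abilene data) are all hypothetical.

```python
from collections import Counter

def dominant(records, key, weight):
    """Return the most common value of `key`, weighted by `weight`,
    together with its share of the total weight."""
    totals = Counter()
    for r in records:
        totals[r[key]] += r[weight]
    value, amount = totals.most_common(1)[0]
    return value, amount / sum(totals.values())

# Hypothetical sampled flow records for one anomalous time bin.
records = [
    {"src": "192.0.2.1",  "dst": "198.51.100.0", "packets": 50},
    {"src": "192.0.2.9",  "dst": "198.51.100.0", "packets": 30},
    {"src": "192.0.2.77", "dst": "203.0.113.5",  "packets": 20},
]

dst, dst_share = dominant(records, key="dst", weight="packets")
src, src_share = dominant(records, key="src", weight="packets")
print(dst, dst_share, src, src_share)
```

What share counts as "dominant" is a judgment call that the slides do not state; a cutoff would have to be chosen by the analyst.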

An example BPF anomaly (ingress-shift)
Multihomed customer CALREN reroutes around the LOSA-CHIN (scheduled) outage

Species of anomalies found
Anomaly: Definition
• ALPHA: Unusually high rate point-to-point byte transfer
• DOS, DDOS: (Distributed) Denial of service attack against a single victim
• FLASH CROWD: Unusually large demand for a resource/service emerging from a common set of sources
• SCAN: Scanning a host for a vulnerable port (port scan) or scanning the network for a target port (network scan)
• WORM: Self-propagating code that spreads across a network by exploiting security flaws
• POINT to MULTIPOINT: Distribution of content from one server to many servers
• OUTAGE: Equipment-related events that decrease traffic exchanged by an OD pair
• INGRESS-SHIFT: Customer shifts traffic from one ingress point to another

Summary of Anomalies Found
[Figure: chart of anomaly counts by type. Legend: Alpha, DOS, Scan, Flash-Crowd, Point-Multi, Worm, Outage, Ingress-Shift, Unknown, False Alarm]

Conclusions
• Subspace method for anomaly diagnosis allows whole-network approach
– Significant benefit accrues from whole-network analysis
• Diagnosing Volume Anomalies from Link Traffic:
– High detection rate, low false alarm rate
– Hypothesis-based identification is easily formalized and extended
• Detecting General Anomalies from Flow Traffic:
– Anomalies detected span remarkable breadth
– Almost all of the anomalies found are operationally relevant
• Whole-Network Anomaly Diagnosis with the Subspace Method is promising
– ... more to come!

Thanks!
Help with Abilene Data
• Rick Summerhill, Mark Fullmer (Internet2)
• Matthew Davy (Indiana University)
Help with Sprint-Europe Data
• Bjorn Carlsson, Jeff Loughridge (SprintLink)
• Supratik Bhattacharyya, Richard Gass (ATL)
