A Signal Analysis of Network Traffic Anomalies
Paul Barford with Jeffery Kline, David Plonka, Amos Ron
University of Wisconsin-Madison
Fall, 2002
pb@cs.wisc.edu
Overview
- Motivation: Anomaly detection remains difficult
- Objective: Improve understanding of traffic anomalies
- Approach: Multiresolution analysis of a data set that includes IP flow, SNMP and an anomaly catalog
- Method: Integrated Measurement Analysis Platform for Internet Traffic (IMAPIT)
- Results: Identify anomaly characteristics using wavelets and develop a new method for exposing short-lived events
Our Data Sets
- Consider anomalies in IP flow and SNMP data
  – Collected at UW border router (Juniper M10)
  – Archive of ~6 months' worth of data (packets, bytes, flows)
  – Includes catalog of anomalies (after-the-fact analysis)
- Group observed anomalies into four categories
  – Network anomalies (41)
    - Steep drop-offs in service followed by quick return to normal behavior
  – Flash crowd anomalies (4)
    - Steep increase in service followed by slow return to normal behavior
  – Attack anomalies (46)
    - Steep increase in flows in one direction followed by quick return to normal behavior
  – Measurement anomalies (18)
    - Short-lived anomalies which are not network anomalies or attacks
Multiresolution Analysis
- Wavelets provide a means for describing time series data that considers both frequency and time
  – Powerful means for characterizing data with sharp spikes and discontinuities
  – Using wavelets can be quite tricky
- We use tools developed at UW which together make up IMAPIT
  – FlowScan software
  – The IDR Framenet software
Our Wavelet System
- After evaluating different candidates we selected a wavelet system called Pseudo Splines(4,1) Type 2.
  – A framelet system developed by Daubechies et al. '00
  – Very good frequency localization properties
- Three output signals are extracted
  – Low Frequency (L): synthesis of all wavelet coefficients from level 9 and up
  – Mid Frequency (M): synthesis of wavelet coefficients from levels 6, 7, 8
  – High Frequency (H): synthesis of wavelet coefficients from levels 1 to 5
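The three-band split can be illustrated with a minimal sketch. It does not use the Pseudo Splines(4,1) Type 2 framelet from the slides; it substitutes the much simpler Haar wavelet, for which the synthesis of levels 1 to j is the signal minus its block means over 2^j samples. The function names `block_means` and `split_bands` are hypothetical, and only the level boundaries (1 to 5, 6 to 8, 9 and up) come from the slide.

```python
def block_means(x, width):
    """Replace each run of `width` samples by its mean.

    For the Haar wavelet this is the approximation left after removing
    detail levels 1..log2(width).
    """
    out = []
    for start in range(0, len(x), width):
        block = x[start:start + width]
        mean = sum(block) / len(block)
        out.extend([mean] * len(block))
    return out


def split_bands(x, mid_from=6, low_from=9):
    """Split x into (low, mid, high) bands by Haar detail level.

    high = levels 1 .. mid_from-1, mid = levels mid_from .. low_from-1,
    low = everything from level low_from up (including the coarse average).
    Assumes len(x) is a multiple of 2**(low_from-1) for a clean dyadic split.
    """
    approx_after_high = block_means(x, 2 ** (mid_from - 1))  # levels 1..5 removed
    approx_after_mid = block_means(x, 2 ** (low_from - 1))   # levels 1..8 removed
    high = [xi - ai for xi, ai in zip(x, approx_after_high)]
    mid = [ai - bi for ai, bi in zip(approx_after_high, approx_after_mid)]
    low = approx_after_mid
    return low, mid, high
```

By construction the three bands sum back to the original signal, which is a useful sanity check for any replacement wavelet system. Haar has poor frequency localization compared with the framelet chosen in the slides, so this sketch shows the mechanics of the split and not the quality of the resulting bands.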
Ambient IP Flow Traffic
[Figure: One Autonomous System to Campus, Inbound, 2001-DEC-16 through 2001-DEC-23. Panels: bytes, original signal (Bytes/sec); bytes, high-band; bytes, mid-band; bytes, low-band.]
Ambient SNMP Traffic
[Figure: One Interface to Campus, Inbound, 2001-DEC-16 through 2001-DEC-23. Panels: bytes, original signal (Bytes/sec); bytes, high-band; bytes, mid-band; bytes, low-band.]
Byte Traffic for Flash Crowd
[Figure: Class-B Network, Outbound, 2001-SEP-30 through 2001-NOV-25. Panels: Outbound Class-B Network Bytes, original signal (Bytes/sec); mid-band; low-band.]
Average Packet Size for Flash Crowd
[Figure: Campus HTTP, Outbound, 2001-SEP-30 through 2001-NOV-25. Panels: Outbound HTTP average packet size signal (Bytes); mid-band; low-band.]
Flow Traffic During DoS Attacks
[Figure: Campus TCP, Inbound, 2002-FEB-03 through 2002-FEB-10. Panels: Inbound TCP Flows, original signal (Flows/sec); high-band; mid-band; low-band.]
Byte Traffic During Measurement Anomalies
[Figure: Campus TCP, Inbound, 2002-FEB-10 through 2002-FEB-17. Panels: Inbound TCP Bytes, original signal (Bytes/sec); high-band; mid-band; low-band.]
Anomaly Detection via Deviation Score
- Short-lived anomalies can be identified automatically based on variability in H and M signals
  1. Compute local variability (using specified window) of H and M parts of signal
  2. Combine local variability of H and M signals (using a weighted sum) and normalize by total variability to get deviation score V
  3. Apply threshold to V then measure peaks
- Analysis shows that V peaks over 2.0 indicate short-lived anomalies with high confidence
  – We threshold at V = 1.25 and set window size to 3 hours
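The three steps can be sketched as follows. The slides do not give the exact formula, so several choices here are assumptions: variability is measured as variance, the H and M weights default to 0.5 each, "total variability" is taken to be the same weighted sum computed over the whole series, and the window is expressed in samples (turning 3 hours into a sample count depends on the sampling interval, which is not stated here). The function names are hypothetical.

```python
def variance(values):
    """Population variance of a non-empty list."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def local_variance(signal, window):
    """Variance inside a centred sliding window (truncated at the edges)."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)
        out.append(variance(signal[lo:hi]))
    return out


def deviation_score(high, mid, window, w_high=0.5, w_mid=0.5):
    """Steps 1 and 2: weighted local variability, normalised by the total."""
    total = w_high * variance(high) + w_mid * variance(mid)
    if total == 0:
        return [0.0] * len(high)  # flat signals: nothing deviates
    lv_high = local_variance(high, window)
    lv_mid = local_variance(mid, window)
    return [(w_high * h + w_mid * m) / total for h, m in zip(lv_high, lv_mid)]


def peaks_above(score, threshold=1.25):
    """Step 3: return (start, end, peak) for each run with score > threshold."""
    events = []
    start = None
    for i, v in enumerate(score):
        if v > threshold and start is None:
            start = i
        elif v <= threshold and start is not None:
            events.append((start, i - 1, max(score[start:i])))
            start = None
    if start is not None:
        events.append((start, len(score) - 1, max(score[start:])))
    return events
```

With this shape, the threshold of 1.25 selects candidate intervals, and the peak value of each interval can then be compared against 2.0, the level the slides associate with high-confidence short-lived anomalies.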
Deviation Score for Three Anomalies
[Figure: Campus TCP, Inbound, 2002-FEB-03 through 2002-FEB-10. Panels: Inbound TCP Packets (Packets/sec); Deviation Score.]
Deviation Score for Network Outage
[Figure: Bytes/sec, Pkts/sec, and Flows/sec over one week (Sun through Sat), with inbound and outbound deviation scores.]
Anomalies in Aggregate Signals
[Figure: two sets of panels, each showing Bytes/sec, Pkts/sec, and Flows/sec over one week (Sun through Sat), with inbound and outbound deviation scores.]
Hidden Anomalies in Low Frequency
[Figure: Class-B Network, Outbound, 2001-NOV-25 through 2001-DEC-23. Panels: Outbound Bytes, original signal (Bytes/sec); high-band; mid-band; low-band.]
Deviation Score Evaluation
- How effective is deviation score at detecting anomalies?
  – Compare versus set of 39 anomalies
    - Set is unlikely to be complete so we don't treat false positives
  – Compare versus Holt-Winters Forecasting
    - Time series technique
    - Requires some configuration
    - Holt-Winters reported many more positives and sometimes oscillated between values
Candidates detected by Holt-Winters      37
Candidates detected by Deviation Score   38
Total Candidate Anomalies                39
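For readers unfamiliar with the comparison method, here is a minimal sketch of additive Holt-Winters forecasting used as an anomaly flagger. It is not the configuration used in the evaluation, which the slides do not describe. The smoothing parameters `alpha`, `beta`, `gamma`, the band width `delta`, and the floor `min_band` are all illustrative assumptions, as is the function name.

```python
def holt_winters_flags(series, period, alpha=0.1, beta=0.01, gamma=0.1,
                       delta=3.0, min_band=1.0):
    """Flag samples that fall outside an additive Holt-Winters band.

    The first `period` samples initialise the level and seasonal terms.
    Flagging starts after two full periods, once each seasonal slot has
    a smoothed absolute-deviation estimate. `min_band` is a hypothetical
    floor that keeps the band from collapsing to zero on very regular data.
    """
    if len(series) < period:
        raise ValueError("need at least one full period to initialise")
    level = sum(series[:period]) / period
    trend = 0.0
    season = [y - level for y in series[:period]]
    deviation = [0.0] * period
    flags = [False] * len(series)
    for t in range(period, len(series)):
        slot = t % period
        forecast = level + trend + season[slot]
        error = abs(series[t] - forecast)
        band = max(delta * deviation[slot], min_band)
        if t >= 2 * period and error > band:
            flags[t] = True
        # Standard additive Holt-Winters updates for level, trend, season.
        prev_level = level
        level = alpha * (series[t] - season[slot]) + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        season[slot] = gamma * (series[t] - level) + (1 - gamma) * season[slot]
        # Smoothed absolute deviation per seasonal slot sets the band width.
        deviation[slot] = gamma * error + (1 - gamma) * deviation[slot]
    return flags
```

The number of tunable parameters in this sketch illustrates the "requires some configuration" point. Because a large sample also shifts the smoothed level, forecasts after an anomaly can stay off for a while, which is one plausible way such a detector ends up reporting additional positives.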
Conclusion and Next Steps
- We present an evaluation of signal characteristics of network traffic anomalies
  – Using IP flow and SNMP data collected at UW border router
  – IMAPIT developed to apply wavelet analysis to data
  – Deviation score developed to automate anomaly detection
- Results
  – Characteristics of anomalies exposed using different filters and data
  – Deviation score appears promising as a detection method
- Future