A Signal Analysis of Network Traffic Anomalies


SLIDE 1

A Signal Analysis of Network Traffic Anomalies

Paul Barford

with Jeffery Kline, David Plonka, Amos Ron

University of Wisconsin – Madison

Fall, 2002

SLIDE 2

pb@cs.wisc.edu

Overview

  • Motivation: Anomaly detection remains difficult
  • Objective: Improve understanding of traffic anomalies
  • Approach: Multiresolution analysis of a data set that includes IP flow, SNMP and an anomaly catalog
  • Method: Integrated Measurement Analysis Platform for Internet Traffic (IMAPIT)
  • Results: Identify anomaly characteristics using wavelets and develop a new method for exposing short-lived events

SLIDE 3

Our Data Sets

  • Consider anomalies in IP flow and SNMP data
– Collected at UW border router (Juniper M10)
– Archive of ~6 months' worth of data (packets, bytes, flows)
– Includes catalog of anomalies (after-the-fact analysis)
  • Group observed anomalies into four categories
– Network anomalies (41)
    • Steep drop-offs in service followed by quick return to normal behavior
– Flash crowd anomalies (4)
    • Steep increase in service followed by slow return to normal behavior
– Attack anomalies (46)
    • Steep increase in flows in one direction followed by quick return to normal behavior
– Measurement anomalies (18)
    • Short-lived anomalies which are not network anomalies or attacks
SLIDE 4

SLIDE 5

Multiresolution Analysis

  • Wavelets provide a means for describing time series data that considers both frequency and time
– Powerful means for characterizing data with sharp spikes and discontinuities
– Using wavelets can be quite tricky
  • We use tools developed at UW which together make up IMAPIT
– FlowScan software
– The IDR Framenet software

SLIDE 6

Our Wavelet System

  • After evaluating different candidates we selected a wavelet system called Pseudo Splines(4,1) Type 2.
– A framelet system developed by Daubechies et al. '00
– Very good frequency localization properties
  • Three output signals are extracted
– Low Frequency (L): synthesis of all wavelet coefficients from level 9 and up
– Mid Frequency (M): synthesis of wavelet coefficients 6, 7, 8
– High Frequency (H): synthesis of wavelet coefficients 1 to 5
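
The three-band split can be illustrated with a small sketch. It is not the Pseudo Splines(4,1) Type 2 framelet used in the talk: it swaps in the much simpler orthonormal Haar wavelet, which has far worse frequency localization, purely to show how coefficient levels are grouped into L, M and H and synthesized back into signals. The function names, the synthetic trace and the 5-minute sampling interval are assumptions for illustration.

```python
import math


def haar_decompose(signal, levels):
    """Orthonormal Haar transform; details[j - 1] holds the level-j coefficients."""
    approx = list(signal)
    details = []
    root2 = math.sqrt(2.0)
    for _ in range(levels):
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(a - b) / root2 for a, b in pairs])
        approx = [(a + b) / root2 for a, b in pairs]
    return approx, details


def haar_reconstruct(approx, details):
    """Invert haar_decompose, starting from the coarsest level."""
    root2 = math.sqrt(2.0)
    current = list(approx)
    for detail in reversed(details):
        finer = []
        for a, d in zip(current, detail):
            finer.extend(((a + d) / root2, (a - d) / root2))
        current = finer
    return current


def split_bands(signal, levels=10):
    """Return (low, mid, high) using the level grouping given on the slide."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be a multiple of 2**levels")
    approx, details = haar_decompose(signal, levels)

    def synthesize(keep, keep_approx):
        # Zero every level outside `keep`, then run the inverse transform.
        kept = [d if j + 1 in keep else [0.0] * len(d)
                for j, d in enumerate(details)]
        base = approx if keep_approx else [0.0] * len(approx)
        return haar_reconstruct(base, kept)

    high = synthesize(range(1, 6), False)         # levels 1 to 5
    mid = synthesize(range(6, 9), False)          # levels 6, 7, 8
    low = synthesize(range(9, levels + 1), True)  # level 9 and up, plus the coarse average
    return low, mid, high


# Synthetic trace (hypothetical data): a daily cycle plus one single-sample spike.
SAMPLES_PER_DAY = 288  # assumes 5-minute samples; the slide does not state the interval
SPIKE_AT = 1000
trace = [10e6 + 3e6 * math.sin(2 * math.pi * i / SAMPLES_PER_DAY)
         for i in range(2048)]
trace[SPIKE_AT] += 8e6

low, mid, high = split_bands(trace)
```

Because the transform is linear and invertible, the three bands add back up to the original trace, and the short spike lands almost entirely in the high band while the daily cycle stays mostly in the lower bands.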

SLIDE 7

[Figure: One Autonomous System to Campus, Inbound, 2001-DEC-16 through 2001-DEC-23. Panels: bytes, original signal (Bytes/sec); bytes, high-band; bytes, mid-band; bytes, low-band. Time axis: Sat through Sun.]

Ambient IP Flow Traffic

SLIDE 8

[Figure: One Interface to Campus, Inbound, 2001-DEC-16 through 2001-DEC-23. Panels: bytes, original signal (Bytes/sec); bytes, high-band; bytes, mid-band; bytes, low-band. Time axis: Sat through Sun.]

Ambient SNMP Traffic

SLIDE 9

[Figure: Class-B Network, Outbound, 2001-SEP-30 through 2001-NOV-25. Panels: Outbound Class-B Network Bytes, original signal (Bytes/sec); mid-band; low-band. Time axis: Oct-01 through Nov-26.]

Byte Traffic for Flash Crowd

SLIDE 10

[Figure: Campus HTTP, Outbound, 2001-SEP-30 through 2001-NOV-25. Panels: Outbound HTTP average packet size signal (Bytes); mid-band; low-band. Time axis: Oct-01 through Nov-26.]

Average Packet Size for Flash Crowd

SLIDE 11

[Figure: Campus TCP, Inbound, 2002-FEB-03 through 2002-FEB-10. Panels: Inbound TCP Flows, original signal (Flows/sec); high-band; mid-band; low-band. Time axis: Sun through Sat.]

Flow Traffic During DoS Attacks

SLIDE 12

[Figure: Campus TCP, Inbound, 2002-FEB-10 through 2002-FEB-17. Panels: Inbound TCP Bytes, original signal (Bytes/sec); high-band; mid-band; low-band. Time axis: Sun through Sat.]

Byte Traffic During Measurement Anomalies

SLIDE 13

Anomaly Detection via Deviation Score

  • Short-lived anomalies can be identified automatically based on variability in H and M signals
1. Compute local variability (using specified window) of H and M parts of signal
2. Combine local variability of H and M signals (using a weighted sum) and normalize by total variability to get deviation score V
3. Apply threshold to V then measure peaks
  • Analysis shows that V peaks over 2.0 indicate short-lived anomalies with high confidence
– We threshold at V = 1.25 and set window size to 3 hours
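
The three steps can be sketched as follows. This is one possible reading of the slide, not the authors' implementation: "local variability" is taken to be the variance inside a sliding window, each band is normalized by its own variance over the whole series, and the two bands are weighted equally. The 36-sample window corresponds to 3 hours only under an assumed 5-minute sampling interval. All function names and the synthetic H and M inputs are hypothetical.

```python
import math


def _variance(values):
    """Population variance of a non-empty list."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def local_variance(series, window):
    """Step 1: variance inside a window centred on each sample (clipped at the ends)."""
    half = window // 2
    n = len(series)
    return [_variance(series[max(0, i - half):min(n, i + half)]) for i in range(n)]


def deviation_score(high, mid, window=36, w_high=0.5, w_mid=0.5):
    """Step 2: weighted sum of local variances, each normalized by total variance.

    The weights and the per-band normalization are assumptions.
    """
    total_high = _variance(high)
    total_mid = _variance(mid)
    return [w_high * vh / total_high + w_mid * vm / total_mid
            for vh, vm in zip(local_variance(high, window),
                              local_variance(mid, window))]


def flag_anomalies(score, threshold=1.25):
    """Step 3: return (start, end, peak) for each run of samples above the threshold."""
    runs, start = [], None
    for i, v in enumerate(score + [float("-inf")]):  # sentinel closes a trailing run
        if v > threshold and start is None:
            start = i
        elif v <= threshold and start is not None:
            runs.append((start, i - 1, max(score[start:i])))
            start = None
    return runs


# Hypothetical inputs: a quiet high band with one hour-long burst, and a smooth mid band.
n = 2016  # one week at the assumed 5-minute interval
high = [(1.0 if i % 2 == 0 else -1.0) * (10.0 if 500 <= i < 512 else 1.0)
        for i in range(n)]
mid = [0.5 * math.sin(2 * math.pi * i / 64) for i in range(n)]

score = deviation_score(high, mid)
for start, end, peak in flag_anomalies(score):
    print(f"samples {start}..{end}: peak V = {peak:.2f}")
```

A run whose peak exceeds 2.0 would fall in the high-confidence range the slide describes; runs that only clear the 1.25 threshold are weaker candidates.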

SLIDE 14

[Figure: Campus TCP, Inbound, 2002-FEB-03 through 2002-FEB-10. Panels: Inbound TCP Packets (Packets/sec); Deviation Score (Score). Time axis: Sun through Sat.]

Deviation Score for Three Anomalies

SLIDE 15

[Figure: Bytes/sec, Pkts/sec and Flows/sec signals with Inbound and Outbound deviation scores (Score). Time axis: Sun through Sat.]

Deviation Score for Network Outage

SLIDE 16

[Figure: Two sets of panels showing Bytes/sec, Pkts/sec and Flows/sec signals with Inbound and Outbound deviation scores (Score). Time axis: Sun through Sat.]

Anomalies in Aggregate Signals

SLIDE 17

[Figure: Class-B Network, Outbound, 2001-NOV-25 through 2001-DEC-23. Panels: Outbound Bytes, original signal (Bytes/sec); high-band; mid-band; low-band. Time axis: Nov-27 through Dec-25.]

Hidden Anomalies in Low Frequency

SLIDE 18

Deviation Score Evaluation

  • How effective is deviation score at detecting anomalies?
– Compare versus set of 39 anomalies
    • Set is unlikely to be complete so we don't treat false-positives
– Compare versus Holt-Winters Forecasting
    • Time series technique
    • Requires some configuration
    • Holt-Winters reported many more positives and sometimes oscillated between values

Total Candidate Anomalies: 39
Candidates detected by Deviation Score: 38
Candidates detected by Holt-Winters: 37
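
The slide names Holt-Winters forecasting as the comparison technique without giving its configuration. For readers unfamiliar with it, here is a minimal sketch of additive Holt-Winters smoothing with a deviation band: a sample is flagged when it falls outside the forecast by more than delta times a smoothed absolute error. The function name, the initialization scheme, the parameter values and the synthetic series are all assumptions, not the setup used in the evaluation.

```python
import math


def holt_winters_flags(series, season, alpha=0.2, beta=0.05, gamma=0.1, delta=3.0):
    """Return indices whose value lies outside forecast +/- delta * deviation."""
    if len(series) < 2 * season:
        raise ValueError("need at least two full seasons of data")
    # Simple initialization (an assumption): first-season mean, zero trend.
    level = sum(series[:season]) / season
    trend = 0.0
    seasonal = [series[i] - level for i in range(season)]
    # Starting deviation: mean absolute change between the first two seasons.
    start_dev = sum(abs(series[i + season] - series[i])
                    for i in range(season)) / season
    deviation = [start_dev] * season

    flagged = []
    for t in range(season, len(series)):
        k = t % season
        forecast = level + trend + seasonal[k]
        error = series[t] - forecast
        if abs(error) > delta * deviation[k]:
            flagged.append(t)
        # Standard additive Holt-Winters updates.
        previous_level = level
        level = alpha * (series[t] - seasonal[k]) + (1 - alpha) * (level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
        seasonal[k] = gamma * (series[t] - level) + (1 - gamma) * seasonal[k]
        deviation[k] = gamma * abs(error) + (1 - gamma) * deviation[k]
    return flagged


# Hypothetical series: a 24-sample cycle, small deterministic jitter, one large spike.
SEASON = 24
series = [10 + 5 * math.sin(2 * math.pi * i / SEASON)
          + 0.3 * (((i * 37) % 11) - 5) / 5
          for i in range(240)]
series[200] += 20

flags = holt_winters_flags(series, SEASON)
print(flags)
```

The four tunables (alpha, beta, gamma, delta) plus the season length are the kind of configuration the slide alludes to; tighter bands flag more samples, which is consistent with the remark that Holt-Winters reported many more positives.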

SLIDE 19

Conclusion and Next Steps

  • We present an evaluation of signal characteristics of network traffic anomalies
– Using IP flow and SNMP data collected at UW border router
– IMAPIT developed to apply wavelet analysis to data
– Deviation score developed to automate anomaly detection
  • Results
– Characteristics of anomalies exposed using different filters and data
– Deviation score appears promising as a detection method
  • Future
– Development of anomaly classification methods
– Application of results in (distributed) detection systems