SLIDE 1
Collecting and Analysing Traffic Profiles within a Commercial Network
- Prof. D. J. Parish
- P. Sandford
High Speed Networks Group, Loughborough University
SLIDE 2
Presentation Summary
- Overview of the ‘Detecting and Preventing Criminal Activities on the Internet’ project
- Discussion of Architecture
- Example Network Patterns and Anomalies
SLIDE 3
Overview of the Network Abuse Detection Project
- EPSRC Funded
- Partnered by NTL, CESG and SPSS
- 3 Years, starting April 2004
SLIDE 4
Project Objectives
Identify illegal activity inside the network core
- Sometimes Necessary
- Some prevention best done here
Use statistical traffic summaries of headers to identify anomalies
- Processing Overhead
- High Throughput
- No user data
- Cheap Equipment
SLIDE 5
Approach Summary
Anomaly Based
- Describing Normality
- Classifying Deviation
Data Mining
- Identify Relationships
- Update view of normality
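The slides do not say which statistical model describes "normality". As a minimal sketch of the anomaly-based idea, the example below assumes a single numeric statistic (for instance a data rate), keeps a running mean and standard deviation with Welford's method, and classifies a new value by its distance from the mean. The class name `RunningBaseline` and the threshold of 3 standard deviations are illustrative assumptions, not project details.

```python
import math


class RunningBaseline:
    """Running mean/variance of one statistic (Welford's online method)."""

    def __init__(self):
        self.n = 0        # number of samples seen
        self.mean = 0.0   # running mean
        self.m2 = 0.0     # running sum of squared deviations from the mean

    def update(self, x):
        """Fold a new observation into the view of normality."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def stddev(self):
        """Sample standard deviation, or 0.0 with fewer than two samples."""
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0

    def classify(self, x, threshold=3.0):
        """Label x by how many standard deviations it lies from the mean."""
        sd = self.stddev()
        if self.n < 2 or sd == 0.0:
            return "unknown"  # not enough variation to judge
        z = abs(x - self.mean) / sd
        return "anomalous" if z > threshold else "normal"


baseline = RunningBaseline()
for rate_mbit in [400, 410, 390, 405, 395]:  # made-up sample values
    baseline.update(rate_mbit)
print(baseline.classify(402), baseline.classify(600))
```

Calling `update` only for values judged normal is one way to "update the view of normality" without letting an attack shift the baseline; whether the project does this is not stated.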
SLIDE 6
Approach Summary (Cont.)
[Architecture diagram. Labels: Broadband Network (x6), Gatherer (x6), Central Processor, Loughborough Controller]
SLIDE 7
Current Status
Hardware/Basic software installed
- 6 Monitored PoPs
- 1 Data-Mining Engine
- Control from Loughborough/NTL
Modules In Place For
- Summary Gathering
- Baselining Data
- Simple Signature Detection
- Basic Alerting
- Outgoing Spam Email Detection
SLIDE 8
PoP Software Summary
[Diagram. Labels: Kernel Space; Capture Process & Statistic Gatherer; Shared Memory Block 1; Shared Memory Block 2; Signature Detection & Summary Generator]
SLIDE 9
PoP Software Summary
Fast Packet Processing
Purely Counter Increments
Pseudo Real-Time Statistics
- Processing Time Constant
- Variable Statistic Window Size
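A rough illustration of the points above, assuming per-packet work is limited to counter increments and that statistics are read back over a configurable number of recent intervals. The class `IntervalStats`, its counter keys, and the interval handling are hypothetical; the real gatherer's data layout is not described in the slides.

```python
from collections import Counter, deque


class IntervalStats:
    """Per-interval counters with a variable-size summary window."""

    def __init__(self, max_intervals=60):
        self.current = Counter()                    # counters for the open interval
        self.history = deque(maxlen=max_intervals)  # closed intervals, oldest first

    def on_packet(self, length, protocol, ttl):
        """Per-packet path: a fixed number of counter increments only."""
        self.current["packets"] += 1
        self.current["bytes"] += length
        self.current[f"proto:{protocol}"] += 1
        self.current[f"ttl:{ttl}"] += 1

    def close_interval(self):
        """Called by a timer: archive the open interval and start a new one."""
        self.history.append(self.current)
        self.current = Counter()

    def window(self, n):
        """Sum the counters of the last n closed intervals (n >= 1)."""
        if n < 1:
            raise ValueError("window size must be at least 1")
        total = Counter()
        for interval in list(self.history)[-n:]:
            total.update(interval)
        return total


stats = IntervalStats()
stats.on_packet(100, "tcp", 64)
stats.on_packet(50, "udp", 128)
stats.close_interval()
stats.on_packet(40, "tcp", 64)
stats.close_interval()
print(stats.window(2)["bytes"])  # bytes seen over the last two intervals
```

The cost of `on_packet` does not depend on traffic history, and the window size is chosen at read time rather than at capture time.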
SLIDE 10
Note on Capture Interfaces
Dealing With Interrupts
Poll Buffer
Context Switching
Memory Mapping
SLIDE 11
Signature Detection Without Data
[Diagram. Labels: Original Payload, Known Payload, TCP Header, TCP Checksum, New Checksum]
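One reading of this slide's diagram: because the TCP checksum covers the payload, a monitor that stores only headers can pair a known payload with the captured TCP header, recompute the checksum, and compare the result with the checksum the packet actually carried. The sketch below assumes IPv4 and that the payload length is known from the headers; the function names are illustrative and not taken from the project.

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """One's-complement checksum over 16-bit words (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


def tcp_checksum(src_ip: str, dst_ip: str, tcp_header: bytes, payload: bytes) -> int:
    """TCP checksum over IPv4 pseudo-header, header (checksum zeroed) and payload."""
    header = tcp_header[:16] + b"\x00\x00" + tcp_header[18:]
    pseudo = (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
              + struct.pack("!BBH", 0, 6, len(header) + len(payload)))
    return internet_checksum(pseudo + header + payload)


def matches_known_payload(src_ip, dst_ip, tcp_header, payload_len, known_payload):
    """True if the captured checksum equals the checksum of the known payload."""
    if payload_len != len(known_payload):
        return False
    observed = struct.unpack("!H", tcp_header[16:18])[0]
    return tcp_checksum(src_ip, dst_ip, tcp_header, known_payload) == observed
```

A 16-bit checksum collides easily, so a match is best treated as a hint that warrants further inspection rather than as proof that the payload was present.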
SLIDE 12
Data Mining - Example
TTL Field
- Initially Thought to be of Limited Use
- Data Mining Highlighted TTL in Lab Tests
SLIDE 13
Data Mining - Example
TTL Field
- Default Values (Depending on OS)
- Consistent by Default
- Can be used to Identify Spoofing
- Shows Daily Pattern (Windows / Linux, File Sharing / Web Browsing)
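To make the TTL points concrete, here is a small sketch that assumes the commonly seen initial TTL values 32, 64, 128 and 255 (64 is typical of Linux, 128 of Windows). It estimates the initial TTL and hop count from an observed TTL, and flags a source whose TTL departs from the value first seen for it. The class `TtlBaseline` and its tolerance are hypothetical choices, not the project's method.

```python
COMMON_INITIAL_TTLS = (32, 64, 128, 255)  # widely used OS defaults


def infer_initial_ttl(observed: int) -> int:
    """Smallest common default that is not below the observed TTL."""
    if not 1 <= observed <= 255:
        raise ValueError("TTL must be in the range 1..255")
    for default in COMMON_INITIAL_TTLS:
        if observed <= default:
            return default
    raise AssertionError("unreachable: 255 covers every valid TTL")


def hop_estimate(observed: int) -> int:
    """Estimated number of router hops between sender and monitor."""
    return infer_initial_ttl(observed) - observed


class TtlBaseline:
    """Remembers the first TTL seen per source and checks later packets."""

    def __init__(self, tolerance: int = 2):
        self.tolerance = tolerance  # allowed drift, e.g. from route changes
        self.first_seen = {}

    def is_consistent(self, src_ip: str, ttl: int) -> bool:
        baseline = self.first_seen.setdefault(src_ip, ttl)
        return abs(ttl - baseline) <= self.tolerance


seen = TtlBaseline()
seen.is_consistent("192.0.2.1", 117)         # first sight sets the baseline
print(seen.is_consistent("192.0.2.1", 60))   # large jump: possible spoofing
```

A sudden TTL change for a known source address is only suggestive, since route changes and NAT can also shift the value.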
SLIDE 14
Patterns
- Data Rates
- Port Numbers (Applications)
- An Anomaly Example
SLIDE 15
Data Rates
SLIDE 16
Data Rates (Cont.)
Time of Day Variation
Peak Times
Late evening (8pm)
Low Times
Early Morning (5am)
SLIDE 17
Model of Data Rate
[Figure: Data Rate (axis ticks at 200, 400 and 600 MBit) against Time (Day 0 to Day 20)]
SLIDE 18
Port Numbers
Percentage of All Packets
Port   Common Application          January 05   June 05
80     HTTP                        6.5%         8.8%
6346   Gnutella                    4.893%       7.9%
4662   eMule                       10.269%      6.4%
119    NNTP / UseNet               5.3%         5.9%
6881   BitTorrent                  3.55%        4.1%
6699   WinMX                       7.178%       2.3%
6348   Gnutella (Secondary Port)   0.01%        0.1%
SLIDE 19
Port Numbers (Cont.)
SLIDE 20
Anomaly Example
SLIDE 21
Anomaly Example – Data Rate
SLIDE 22
Anomaly Example – Average Packet Size
SLIDE 23
Anomaly Example – TCP Packet Count
SLIDE 24
Anomaly Example – Destination Count
SLIDE 25
Anomaly Example – FIN Count
SLIDE 26
Anomaly Example – ACK Count
SLIDE 27
Anomaly - Conclusion
- Large Traffic Amounts
- Synchronised TCP from many sources
- Small number of destinations
- FIN/ACK Packets
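The characteristics listed above could be combined into a simple rule over one interval's summary counters, compared against a baseline interval. This is only a sketch: the `IntervalSummary` fields, the multipliers and the source-to-destination ratio are hypothetical values chosen for illustration, and the slides do not state how the project classified this event.

```python
from dataclasses import dataclass


@dataclass
class IntervalSummary:
    """Hypothetical header-only counters for one statistics interval."""
    bytes_total: int
    tcp_packets: int
    fin_packets: int
    ack_packets: int
    distinct_sources: int
    distinct_destinations: int


def resembles_anomaly(current: IntervalSummary, baseline: IntervalSummary,
                      volume_factor: float = 2.0, flag_factor: float = 2.0,
                      min_source_ratio: float = 10.0) -> bool:
    """True if all four characteristics from the conclusion slide are present."""
    # Large traffic amounts relative to the baseline interval.
    high_volume = current.bytes_total > volume_factor * baseline.bytes_total
    # Surge in TCP packets.
    tcp_surge = current.tcp_packets > volume_factor * baseline.tcp_packets
    # Surge in both FIN and ACK packet counts.
    fin_ack_surge = (current.fin_packets > flag_factor * baseline.fin_packets
                     and current.ack_packets > flag_factor * baseline.ack_packets)
    # Many sources sending to a small number of destinations.
    destinations = max(current.distinct_destinations, 1)
    many_to_few = current.distinct_sources >= min_source_ratio * destinations
    return high_volume and tcp_surge and fin_ack_surge and many_to_few


normal = IntervalSummary(1_000_000, 2_000, 100, 1_500, 300, 250)
suspect = IntervalSummary(5_000_000, 9_000, 4_000, 8_000, 900, 5)
print(resembles_anomaly(suspect, normal))
```

Requiring every condition keeps the rule conservative; a scoring scheme that tolerates one missing condition would be an equally plausible design.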
SLIDE 28
Anomaly TTL
SLIDE 29
Anomaly TTL - 63
SLIDE 30
Summary
System in Place
- Monitoring High Data Rates
- Modelling Network Patterns
- Discovering Deviations from ‘Normal’
Architecture Design
- Scalable
- Functional
SLIDE 31
Future Work
- Cross-site Correlation
- Automated Alerting
- Investigation of Mitigation Techniques
SLIDE 32
Questions