Building a real-time Grid protocol analyser (Jonathan Paisley)


slide-1
SLIDE 1

Building a real-time Grid protocol analyser

Jonathan Paisley Department of Computing Science

slide-2
SLIDE 2

The Grid

Wide-area distributed computing
Lots of funding
Network operators need to support it
Traffic dominated by bulk data transfer

slide-3
SLIDE 3

Elephants and Mice

slide-4
SLIDE 4

Multiple Elephants and Mice

slide-5
SLIDE 5

Elephants and Mixed Mice

slide-6
SLIDE 6

Elephants and Cipher Mice

?

slide-7
SLIDE 7

Grid Monitoring

[Diagram labels: Grid Application Users; Institution Network; ISP's Network; Monitoring point; Monitoring System; Signal; ISP Operator; Servers with lots of storage; Other ISPs and the Internet]

slide-8
SLIDE 8

Approach

Interpret protocol to learn about associated bulk connections
Report on transfer sizes
Be able to deal with mixed control-data flows
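As an illustration of learning about an associated bulk connection from a control channel, here is a minimal sketch that uses the FTP `PORT` command (RFC 959) as a stand-in. The deck does not show the Grid protocol's own control messages, so the choice of FTP, the `ExpectedFlow` struct and the `parse_ftp_port` function are all assumptions made for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical record of a data connection announced on a control channel.
struct ExpectedFlow {
    uint32_t ip;    // IPv4 address, host byte order
    uint16_t port;  // TCP port the bulk transfer will use
};

// Parse an FTP "PORT h1,h2,h3,h4,p1,p2" command line.
// Returns true and fills *out if the line is a well-formed PORT command.
bool parse_ftp_port(const std::string& line, ExpectedFlow* out) {
    assert(out != nullptr);
    unsigned h1, h2, h3, h4, p1, p2;
    int n = std::sscanf(line.c_str(), "PORT %u,%u,%u,%u,%u,%u",
                        &h1, &h2, &h3, &h4, &p1, &p2);
    if (n != 6) return false;
    // Every field is a single octet.
    if (h1 > 255 || h2 > 255 || h3 > 255 || h4 > 255 || p1 > 255 || p2 > 255)
        return false;
    out->ip = (h1 << 24) | (h2 << 16) | (h3 << 8) | h4;
    out->port = static_cast<uint16_t>(p1 * 256 + p2);
    return true;
}
```

A monitor could store the resulting address and port in a lookup table, so that the matching bulk flow is recognised when it appears and only its size needs to be counted.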

slide-9
SLIDE 9

DAG-based Network Monitor

Just a PC with a special network monitoring card.
Example: 2.8 GHz dual Xeon, 2+ GB memory

Image Source: Endace Measurement Systems

slide-10
SLIDE 10

Similarity to NIDS

NIDS = Network Intrusion Detection System

For example: Bro
  Does protocol analysis (FTP, SMTP, etc)
  Needs port-based filter
  Full reassembly of every monitored flow
  Too slow

slide-11
SLIDE 11

Design Goals

Leverage DAG ring buffer architecture
Capable of processing at GigE line rate
Support full cleartext protocol analysis
Efficiently handle mixed control/data

slide-12
SLIDE 12

Assumptions and Principles

TCP only
Applications under study not used maliciously
Minimise memory copies
Minimise heap allocation
Process packets as soon as possible
Single-threaded, data-driven

slide-13
SLIDE 13

DAG Ring Buffer

[Diagram]
Pointers: Application Acknowledged Pointer; DAG Write Pointer
Regions: Processed; Unprocessed; Not-yet-written

slide-14
SLIDE 14

DAG Ring Buffer

[Diagram]
Pointers: Application Processed Pointer; Application Acknowledged Pointer; DAG Write Pointer
Regions: Processed; Unprocessed; Retained; Not-yet-written

slide-15
SLIDE 15

DAG Ring Buffer

[Diagram]
Pointers: Application Processed Pointer; Application Acknowledged Pointer; DAG Write Pointer
Regions: Processed; Unprocessed; Retained; Not-yet-written
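One way to read the three-pointer diagrams is sketched below as a small bookkeeping model. It assumes that the acknowledged pointer is allowed to lag behind the processed pointer, so that already-processed packets stay in the ring as "retained" until the application releases them. The class `RingModel` and its methods are hypothetical and model only the pointer arithmetic; they are not the DAG API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the ring's pointer bookkeeping. Pointers are kept as
// ever-increasing byte counters; a physical offset is counter % capacity.
class RingModel {
public:
    explicit RingModel(uint64_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
    }

    // Space the card may still write without overwriting unacknowledged data.
    uint64_t writable() const { return capacity_ - (write_ - acked_); }
    // Bytes written by the card that the application has not yet looked at.
    uint64_t unprocessed() const { return write_ - processed_; }
    // Bytes already processed but deliberately kept in the ring.
    uint64_t retained() const { return processed_ - acked_; }

    // Card side: advance the write pointer by n bytes.
    bool card_write(uint64_t n) {
        if (n > writable()) return false;  // would overwrite retained data
        write_ += n;
        return true;
    }

    // Application side: advance the processed pointer, keeping the data.
    bool process(uint64_t n) {
        if (n > unprocessed()) return false;
        processed_ += n;
        return true;
    }

    // Application side: release the oldest n retained bytes to the card.
    bool acknowledge(uint64_t n) {
        if (n > retained()) return false;
        acked_ += n;
        return true;
    }

    uint64_t physical_offset(uint64_t counter) const { return counter % capacity_; }

private:
    uint64_t capacity_;
    uint64_t acked_ = 0;      // Application Acknowledged Pointer
    uint64_t processed_ = 0;  // Application Processed Pointer
    uint64_t write_ = 0;      // DAG Write Pointer
};
```

In this model, retaining packets costs no copy: the data simply stays where the card wrote it, at the price of less free space for new packets.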

slide-16
SLIDE 16

Writing Protocol Analysers

Passive monitor sees both flow directions
Code to track state -> generate events
State machines can be complex
Threaded programming style is easier
  ... but runtime cost normally higher

slide-17
SLIDE 17

ProtoThreads

Similar to co-routines/continuations
Implemented using a C switch statement
State maintained in a structure
Context switching by stack unwinding
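The four points above can be sketched with the well-known switch-based coroutine trick. The macro names `CO_BEGIN`, `CO_WAIT_UNTIL` and `CO_END` and the `SumThree` class are hypothetical, chosen for this example; they are not the macros used in the talk or in any particular ProtoThreads library.

```cpp
#include <cassert>

// Resume point is stored in state_; the switch jumps straight back to it.
#define CO_BEGIN()        switch (state_) { case 0:
// Record the current line as the resume point; if the condition does not
// hold, "block" by returning, which unwinds the stack.
#define CO_WAIT_UNTIL(c)  do { state_ = __LINE__; case __LINE__: \
                               if (!(c)) return false; } while (0)
#define CO_END()          } state_ = -1; return true

// A thread-style routine that waits for three inputs and sums them.
// Everything that must survive a block lives in the object, not on the stack.
class SumThree {
public:
    void feed(int v) {
        assert(!has_input_);  // previous input must have been consumed
        input_ = v;
        has_input_ = true;
    }

    // Returns false if blocked waiting for input, true once finished.
    bool run() {
        CO_BEGIN();
        for (i_ = 0; i_ < 3; i_++) {
            CO_WAIT_UNTIL(has_input_);
            sum_ += input_;
            has_input_ = false;
        }
        CO_END();
    }

    int sum() const { return sum_; }

private:
    int state_ = 0;  // 0 = not started, -1 = finished, else a line number
    int i_ = 0;
    int sum_ = 0;
    int input_ = 0;
    bool has_input_ = false;
};
```

The loop reads like blocking code, yet each call to `run()` returns as soon as input is missing. The caller re-enters it when more data arrives, for example `t.feed(5); t.run();` repeated until `run()` returns true.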

slide-18
SLIDE 18

Analyser Example

    void AnalyserClass::AnalyserMain()
    {
        // Read function id and num args
        READ(OrigFlow, 2);
        func_id = *(uint16_t*)data;
        READ(OrigFlow, 2);
        num_args = *(uint16_t*)data;

        for (i=0;i<num_args;i++) {
            READ(OrigFlow, 4);
            len = *(uint32_t*)data;

            // Read the argument, but we
            // only need the first 200 bytes
            READ_AND_SKIP(OrigFlow, len, 200);
            // ... process the argument
        }

        READ(RespFlow, 4);
        result_value = *(uint32_t*)data;
    }

slide-19
SLIDE 19

Analyser Example

    void AnalyserClass::AnalyserMain()
    {
        // Read function id and num args
        READ(OrigFlow, 2);
        func_id = *(uint16_t*)data;
        READ(OrigFlow, 2);
        num_args = *(uint16_t*)data;

        for (i=0;i<num_args;i++) {
            READ(OrigFlow, 4);
            len = *(uint32_t*)data;

            // Read the argument, but we
            // only need the first 200 bytes
            READ_AND_SKIP(OrigFlow, len, 200);
            // ... process the argument
        }

        READ(RespFlow, 4);
        result_value = *(uint32_t*)data;
    }

May block here!
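To make the blocking behaviour concrete, here is a minimal sketch of how a `READ`-style macro could be built on a switch-based resume point. The deck does not show the real macro, so the `Flow` struct, the macro bodies and the simplified `Analyser` class (which leaves out the argument loop and `READ_AND_SKIP`) are assumptions. Values are copied with `memcpy` in host byte order, mirroring the casts on the slide.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical per-direction byte stream fed by the packet path.
struct Flow {
    std::vector<uint8_t> buf;  // bytes received so far
    size_t pos = 0;            // bytes already consumed by the analyser
    size_t available() const { return buf.size() - pos; }
    void append(const uint8_t* p, size_t n) {
        assert(p != nullptr || n == 0);
        buf.insert(buf.end(), p, p + n);
    }
};

#define AN_BEGIN()  switch (state_) { case 0:
// Block (return false) until n bytes are available on the flow, then point
// data_ at them and consume them. data_ is only valid until run() returns.
#define READ(flow, n) do { state_ = __LINE__; case __LINE__:          \
        if ((flow).available() < (n)) return false;                   \
        data_ = (flow).buf.data() + (flow).pos; (flow).pos += (n);    \
    } while (0)
#define AN_END()    } state_ = -1; return true

class Analyser {
public:
    Flow orig, resp;  // originator and responder directions
    uint16_t func_id = 0, num_args = 0;
    uint32_t result_value = 0;

    // Call whenever new bytes arrive; returns true once the exchange is parsed.
    bool run() {
        AN_BEGIN();
        READ(orig, 2);
        std::memcpy(&func_id, data_, 2);
        READ(orig, 2);
        std::memcpy(&num_args, data_, 2);
        READ(resp, 4);
        std::memcpy(&result_value, data_, 4);
        AN_END();
    }

private:
    int state_ = 0;
    const uint8_t* data_ = nullptr;
};
```

Each `READ` must sit on its own source line, because the line number doubles as the `case` label. When a read cannot be satisfied, `run()` returns and later resumes at exactly that `READ`.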

slide-20
SLIDE 20

Scalability

Presently limited to single processor
Auxiliary flow tracing complicated by concurrent processing
Could use retained packet scheme for all flows: gives extra 1-2 seconds buffering

slide-21
SLIDE 21

Evaluation

Informal testing carried out during development
Negligible load for ~900Mbps mixed control/data SRB flow
Initial testing with larger number of connections with flat-out* replay of IP header traces
  ~10-15% load

* 250Mbps, 5000 new connections per second.

slide-22
SLIDE 22

Encrypted Analysis Ideas

What can we know about encrypted traffic?
Messages: (direction, size*, timing)
Lack of messages (timeouts)
If we understand framing protocol: can get application-level messages

* with some bounded error
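A minimal sketch of how (direction, size, timing) tuples might be recovered from an encrypted flow without decrypting it. The grouping rule used here, where a new message starts on a change of direction or after an idle gap, is an assumption for illustration and is not described in the talk. The names `Segment`, `Message` and `group_messages` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

enum class Dir { ClientToServer, ServerToClient };

// One observed TCP segment: only header-level facts, no payload contents.
struct Segment {
    double time;           // seconds
    Dir dir;
    uint32_t payload_len;  // bytes of TCP payload
};

// One inferred application-level message.
struct Message {
    Dir dir;
    double start;
    double end;
    uint64_t size;  // ciphertext bytes, so only approximately the plaintext size
};

// Group consecutive same-direction segments into messages. A new message
// starts when the direction flips or the flow was idle longer than idle_gap.
std::vector<Message> group_messages(const std::vector<Segment>& segs,
                                    double idle_gap) {
    assert(idle_gap >= 0.0);
    std::vector<Message> out;
    for (const Segment& s : segs) {
        if (s.payload_len == 0) continue;  // pure ACKs carry no message bytes
        bool start_new = out.empty() || out.back().dir != s.dir ||
                         (s.time - out.back().end) > idle_gap;
        if (start_new) out.push_back(Message{s.dir, s.time, s.time, 0});
        out.back().end = s.time;
        out.back().size += s.payload_len;
    }
    return out;
}
```

The sizes include cipher padding and framing overhead, which is one source of the bounded error noted on the slide.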

slide-23
SLIDE 23

Requests and Responses

[Diagram labels: Client; Server; Monitoring Point]
Look out for associated bulk data flow!

slide-24
SLIDE 24

Approaches

Hidden Markov Models?
Naïve Bayesian Classifier?
Other work:
  SSH password typing analysis
  HTTPS request analysis by URL lengths
  Sideband attacks on encryption algorithms

slide-25
SLIDE 25

Summary

Built (hopefully) fast system for real-time protocol analysis work. Evaluation pending.
Support for efficient handling of mixed control/data protocols.
Coding of protocol analysers simplified by rich lightweight threaded interface.
Starting work on classifying and event reporting of encrypted traffic.
