A Case for Packet Sampling, Tanja Zseby



SLIDE 1

A Case for Packet Sampling

Tanja Zseby, zseby@fokus.fhg.de Competence Center for Autonomic Networking Technologies

SLIDE 2

FloCon 2006 Panel

Motivation: FloCon 2005

FloCon05 participants: “We don’t believe in Sampling”

– Happy to use flow data
– Very skeptical of packet sampling

SLIDE 3

The Problem: Limited Resources

Full packet capture at each node not feasible

– Increasing data rates
– Hardware costs
– Privacy concerns

Resources are limited

– Storage
– Processing
– Transmission

We cannot measure everything

[Figure: Additional CPU load for running NetFlow on different routers. Source: NetFlow Performance Analysis, Cisco white paper, http://www.cisco.com/warp/public/cc/pd/iosw/prodlit/ntfo_wpa.jpg]

SLIDE 4

Solution 1: Flow Data

Grouping of packets into flows (classification)
Reporting of flow information only
Disadvantages:

– Per-packet information is lost
– Information and effort depend on flow definition

[Figure: Classification, Record Generation, Flow Info: 2x, 1x, 5x]

SLIDE 5

Flow Data Generation

  • Information about packets is discarded
  • Available information depends on

– Flow definition
– Flow characteristics that are reported

[Figure: Classification, Aggregation, Record Generation]

FlowID 1: <s1, t1, c1> <s4, t4, c4> <s8, t8, c8>
FlowID 2: <s2, t2, c2> <s3, t3, c3> <s6, t6, c6>
FlowID 3: <s5, t5, c5> <s7, t7, c7> <s9, t9, c9>

Flow Characteristics: <s1, t1, c1>, <s2, t2, c2>, ... <sN, tN, cN>
Traffic Mix:
Flows: <Nf, µf, f, …> <Nf, µf, f, …> <Nf, µf, f, …>
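
The classification and aggregation steps can be sketched in a few lines of Python. This is a minimal illustration, not the method of any particular flow exporter: the packet tuples, the 5-tuple flow definition, and the reported characteristics (packet count, byte count, mean and standard deviation of packet size) are all assumptions chosen for the example.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical packet records: (src, dst, src_port, dst_port, proto, size_bytes).
packets = [
    ("10.0.0.1", "10.0.0.2", 1234, 80, "tcp", 60),
    ("10.0.0.1", "10.0.0.2", 1234, 80, "tcp", 1500),
    ("10.0.0.3", "10.0.0.2", 5555, 53, "udp", 80),
    ("10.0.0.1", "10.0.0.2", 1234, 80, "tcp", 1500),
]


def flow_key(pkt):
    """Classification: here a flow is the 5-tuple (an assumed flow definition)."""
    return pkt[:5]


def aggregate(pkts):
    """Aggregation: collect packet sizes per flow, then keep only summaries."""
    sizes = defaultdict(list)
    for pkt in pkts:
        sizes[flow_key(pkt)].append(pkt[5])
    # Record generation: the per-packet sizes are discarded at this point.
    return {
        key: {
            "packets": len(s),
            "bytes": sum(s),
            "mean_size": mean(s),
            "std_size": pstdev(s),
        }
        for key, s in sizes.items()
    }


flow_records = aggregate(packets)
```

Changing `flow_key` (for example, to source and destination address only) changes both the number of records and what can be learned from them, which is the dependence on the flow definition noted above.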

SLIDE 6

Solution 2: Packet Sampling

Random selection of some packets

– Report partial or full packet information
– Estimation of metrics based on sample

Provides different viewpoint

– Packet data can reveal further information
– Sampled data sufficient for some metrics

Helps to protect measurement infrastructure during attack

[Figure: Sampling, Packet Inspection]
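
A minimal sketch of random packet selection followed by estimation, assuming the packets of one measurement interval are available as a list. The packet dictionaries, the `syn` field, and the function names are hypothetical and only serve the illustration.

```python
import random


def sample_n_of_N(population, n, seed=None):
    """Random n-of-N selection: draw exactly n packets without replacement."""
    rng = random.Random(seed)
    return rng.sample(population, n)


def estimate_proportion(sample, has_property):
    """Estimate the proportion of packets for which has_property(pkt) is true."""
    hits = sum(1 for pkt in sample if has_property(pkt))
    return hits / len(sample)


# Hypothetical interval: 10,000 packets, every tenth one flagged as TCP SYN.
packets = [{"id": i, "syn": i % 10 == 0} for i in range(10_000)]

sample = sample_n_of_N(packets, n=100, seed=1)
p_hat = estimate_proportion(sample, lambda pkt: pkt["syn"])    # estimate from sample
true_p = estimate_proportion(packets, lambda pkt: pkt["syn"])  # real proportion
```

Only the 100 selected packets need to be inspected or exported; `p_hat` then stands in for `true_p`, and how far the two may differ is a question of estimation accuracy.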

SLIDE 7

Sampling: State of the Art

[Figure: timeline of sampling research; axis ticks 1990, 1995, 2000, 2005 and 2001, 2002, 2003, 2004. Labels:]

– adaptive [EsKM04]
– sample+hold [EsVa01]
– flow volume
– adaptive [DrCh98]
– packet-count total volume
– adaptive [ChPZ02]
– [AmCa89]
– [JePP92]
– packet-count per flow
– 2-run [KoLM04]
– flow sampling [DuLT01]
– time vs. count [ClPB93]
– First Sampling Workshop 2005
– stratified [Zseb05]
– SLA/QoS
– ATM [CoGi98]
– proportion [Zseb02]
– stratified [Zseb03]
– (trajectory) [DuGr00]
– hash emulation [NiMD04], [MoND05]
– IPFIX
– anomaly detection with hypothesis testing
– load change detection
– sFlow [RFC3176]
– PSAMP
– attack detection as target application
– DDoS detection
– protect infrastructure

SLIDE 8

Packet Sampling

Real metric substituted by estimate
Accuracy statement is essential
Accuracy depends on

– Sampling scheme
– Estimation method
– Position of sampling process in measurement sequence
– Population characteristics (e.g. variance of metric of interest)

SLIDE 9

A Simple Example

Goal: Estimation of packet proportions (e.g. TCP-SYN packets in a flow)

Real proportion: $P = \frac{M}{N}$

Estimate: $\hat{P} = \frac{m}{n}$

Estimation accuracy (random n-of-N):

$$\sigma_{\hat{P}} = \sqrt{\frac{P \cdot (1 - P)}{n} \cdot \frac{N - n}{N - 1}}$$

Confidence limits:

$$\mathrm{Prob}\left(\hat{P} - z_c \cdot \sigma_{\hat{P}} \le P \le \hat{P} + z_c \cdot \sigma_{\hat{P}}\right) = 1 - \alpha$$

Example:

  • Measurement interval with N=10,000 packets
  • Random packet selection 1% (n=100)
  • $\hat{P} = 0.9$: $\sigma_{\hat{P}} = 0.03$, so $0.8226 \le P \le 0.977$ with 99% confidence
  • $\hat{P} = 0.1$: same accuracy
  • $\hat{P} = 0.5$ (worst case): $\sigma_{\hat{P}} = 0.05$, so $0.371 \le P \le 0.629$ with 99% confidence

Works with other packet properties, too!
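
The example figures (N=10,000, n=100, 99% confidence) can be checked numerically. The sketch below is one way to do so under two assumptions that the slide does not state: the estimate $\hat{P}$ is plugged in for the unknown $P$ in the standard-error formula, and $z_c$ is taken from the standard normal distribution. The function names are chosen for this illustration.

```python
from math import sqrt
from statistics import NormalDist


def std_error(p, n, N):
    """Standard error of a proportion for random n-of-N sampling,
    including the finite population correction (N - n) / (N - 1)."""
    return sqrt(p * (1 - p) / n * (N - n) / (N - 1))


def confidence_limits(p_hat, n, N, confidence=0.99):
    """Two-sided limits p_hat -/+ z_c * sigma, using p_hat in place of P."""
    z_c = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    sigma = std_error(p_hat, n, N)
    return p_hat - z_c * sigma, p_hat + z_c * sigma


N, n = 10_000, 100
limits_09 = confidence_limits(0.9, n, N)  # estimate of 0.9
limits_05 = confidence_limits(0.5, n, N)  # worst case: p * (1 - p) is largest
```

The computed limits agree with the slide's values to about three decimal places. The small remaining differences are consistent with the slide rounding the standard error to 0.03 and 0.05 and using $z_c \approx 2.58$.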

SLIDE 10

Advice

  • Don’t restrict your analysis to flow data

– Include further viewpoints
– Use sampling in addition to, or as an alternative to, flow data

  • Trust the power of statistics

– It’s a mature and well-established field with a full range of proven techniques

  • Use sampling where applicable

– Applicability depends on traffic profile, metric of interest, accuracy demand
    Sampled data sufficient to detect large events (high volumes, high packet counts)
    May be sufficient to estimate #pkts with specific properties (e.g. SYN, VoIP packets, small packets, packets with same content, etc.)
    Others depend on scenario
– Difficulties with rare events (stealth attacks, slow port scans)
– Not suitable to re-assemble connections (but filtering may be)
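
The difficulty with rare events can be made concrete with a short calculation. Assuming random n-of-N selection, the probability that none of M packets of interest lands in the sample is C(N−M, n) / C(N, n). The sketch below evaluates this for a hypothetical case of M=10 packets (say, probes of a slow port scan) in an interval of 10,000 packets sampled at 1%; the value of M and the function name are assumptions for illustration.

```python
from math import comb


def prob_event_missed(N, n, M):
    """Probability that a random n-of-N sample contains none of the
    M packets of interest (hypergeometric probability of zero hits)."""
    return comb(N - M, n) / comb(N, n)


# Hypothetical rare event: 10 relevant packets among 10,000, 1% sampling.
p_miss_rare = prob_event_missed(N=10_000, n=100, M=10)

# A large event for comparison: 1,000 relevant packets in the same interval.
p_miss_large = prob_event_missed(N=10_000, n=100, M=1_000)
```

Under these assumptions the rare event goes entirely unseen in roughly nine out of ten intervals, while the large event is almost never missed.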

SLIDE 11

Thank you for your attention!