Inverting Sampled Traffic
Nicolas Hohn, Darryl Veitch
Australian Research Council Special Research Center for Ultra-Broadband Information Networks
The University of Melbourne
Inverting Sampled Traffic
Motivation
Sampling Techniques
– Packet Sampling
– Flow Sampling
Comparison of sampling techniques
– Distribution of the number of packets per flow
– Spectral density of packet arrival process
Application to traffic modelling
Introduction
Motivation
- Traffic statistics collected by routers don't scale well with link speed: exact traffic logging is impossible for backbone links.
- Need to sample the traffic and export partial statistics.
- Aim: infer statistics of the original traffic from partial measurements.
Short history
- 1993: Claffy et al. advocate sampling techniques at the packet level to reduce the load on the measuring infrastructure.
- 2002-2003: Duffield et al. give estimates of first order quantities from packet level sampled traffic: average rate, mean number of packets per flow.
Packet Sampling
[Figure: original traffic thinned by i.i.d. sampling with probability q to give the sampled traffic; horizontal axis: time.]
Simple example: recover original packet rate
- Sample packets with probability q.
- Measure rate of sampled traffic: λ(q).
- Infer rate of original traffic: λ(q)/q
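The three steps above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the Poisson arrivals, the rate of 50 packets per second, the duration and the function names are all assumptions made for the example.

```python
import random

def sample_packets(timestamps, q, rng):
    """Keep each packet independently with probability q (i.i.d. thinning)."""
    return [t for t in timestamps if rng.random() < q]

def estimate_original_rate(sampled, duration, q):
    """Rate of the sampled stream divided by the sampling probability."""
    sampled_rate = len(sampled) / duration
    return sampled_rate / q

rng = random.Random(0)
duration = 1000.0    # seconds (illustrative)
true_rate = 50.0     # packets per second (illustrative)

# Synthetic Poisson arrivals, used only as stand-in traffic.
timestamps = []
t = rng.expovariate(true_rate)
while t < duration:
    timestamps.append(t)
    t += rng.expovariate(true_rate)

q = 0.1
sampled = sample_packets(timestamps, q, rng)
original_rate = len(timestamps) / duration
estimate = estimate_original_rate(sampled, duration, q)
print(f"original rate = {original_rate:.2f}, estimate = {estimate:.2f}")
```

The estimate is unbiased, but its relative error grows as q shrinks because fewer packets survive the sampling.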
Terminology
IP flow: set of packets with the same 5-tuple (IP protocol, source address, destination address, source port, destination port).
[Figure: the same traffic viewed at the flow level and at the packet level; horizontal axes: time.]
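As a rough illustration of the flow definition, packets can be grouped by their 5-tuple. The `FlowKey` type, the `group_into_flows` helper and the sample packets are hypothetical names and data introduced for this sketch. It also ignores flow timeouts, which a practical flow definition would need.

```python
from collections import defaultdict
from typing import NamedTuple

class FlowKey(NamedTuple):
    """The 5-tuple identifying an IP flow."""
    protocol: int
    src_addr: str
    dst_addr: str
    src_port: int
    dst_port: int

def group_into_flows(packets):
    """Map each 5-tuple to the list of timestamps of its packets."""
    flows = defaultdict(list)
    for timestamp, key in packets:
        flows[key].append(timestamp)
    return dict(flows)

# Illustrative packet-level view: (timestamp, 5-tuple) pairs.
web = FlowKey(6, "10.0.0.1", "10.0.0.2", 1234, 80)
dns = FlowKey(17, "10.0.0.3", "10.0.0.4", 5353, 53)
packets = [(0.00, web), (0.01, dns), (0.02, web)]

# Flow-level view: number of packets per flow.
flows = group_into_flows(packets)
sizes = {key: len(times) for key, times in flows.items()}
print(sizes)
```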
[Figure: original traffic over time, compared with its flow-sampled and packet-sampled versions.]
Flow Sampling: no 'inversion' problems.
Packet Sampling: recovering original flow sizes is not straightforward.
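A small simulation can make the contrast concrete. It is only a sketch under assumed inputs: the flow sizes, the value q = 0.5 and the helper names are illustrative and not taken from the slides.

```python
import random

def packet_sample(flow_sizes, q, rng):
    """Thin each packet independently; flows with no surviving packet vanish."""
    thinned = []
    for size in flow_sizes:
        kept = sum(1 for _ in range(size) if rng.random() < q)
        if kept > 0:
            thinned.append(kept)
    return thinned

def flow_sample(flow_sizes, q, rng):
    """Keep or drop whole flows; surviving flows keep all their packets."""
    return [size for size in flow_sizes if rng.random() < q]

def mean(values):
    return sum(values) / len(values)

rng = random.Random(1)
original = [rng.choice([1, 2, 5, 20, 100]) for _ in range(20000)]
q = 0.5
by_packet = packet_sample(original, q, rng)
by_flow = flow_sample(original, q, rng)

print(f"mean flow size: original {mean(original):.1f}, "
      f"flow sampled {mean(by_flow):.1f}, packet sampled {mean(by_packet):.1f}")
```

Flow sampling leaves the sizes of surviving flows untouched, so their empirical distribution estimates the original one directly. Packet sampling shrinks every flow and removes some short flows entirely, which is why an inversion step is needed.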
Distribution of number of packets per flow
[Figure: original traffic over time, with its packet-sampled and flow-sampled versions.]
Packet sampling: potential inversion problems.
Flow sampling: no 'inversion' problems.
Distribution of number of packets per flow
Packet sampling
$p_j$: probability that a flow had $j$ packets before sampling.
$p^{(q)}_k$: probability that a flow has $k$ packets after sampling.

$$p^{(q)}_k = \sum_{j=k}^{\infty} \Pr\{k \text{ packets after thinning} \mid j \text{ packets before thinning}\}\, p_j = \sum_{j=k}^{\infty} \binom{j}{k} q^k (1-q)^{j-k}\, p_j \qquad (1)$$

Aim: express $p_j$ as a function of $p^{(q)}_k$ by inverting (1).
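The forward relation (1) can be evaluated directly for a distribution with finite support. The sketch below is illustrative only: the function name `thin_distribution` and the example distribution are assumptions. The entry at k = 0 is the probability that a flow loses all its packets; such flows would not be observed in practice.

```python
from math import comb

def thin_distribution(p, q):
    """Apply (1) to p, where p[j] = Pr{j packets before sampling}.

    Returns p_q with p_q[k] = Pr{k packets after sampling}, k = 0..len(p)-1.
    """
    n = len(p)
    return [
        sum(comb(j, k) * q**k * (1 - q) ** (j - k) * p[j] for j in range(k, n))
        for k in range(n)
    ]

p = [0.0, 0.5, 0.3, 0.2]   # illustrative flow sizes 1, 2, 3
p_q = thin_distribution(p, q=0.5)
print(p_q)
```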
Inverting (1) with generating functions
Definition:

$$G_P(z) = \sum_{j=0}^{\infty} p_j z^j, \quad z \in D(0, 1).$$

$D(z, r)$: open disc centered at $z$ with radius $r$.
Singularity at $z = 1$ if the distribution is heavy tailed.
From (1):

$$G^{(q)}_P(z) = \sum_{k} p^{(q)}_k z^k = G_P(1 - q + qz), \quad z \in D(0, 1)$$

$$G_P(z) = G^{(q)}_P\!\left(\frac{z - (1 - q)}{q}\right), \quad z \in D(1 - q, q)$$

Aim: find the power series expansion of $G_P$ at $z = 0$.
Methods:
– Analytic Continuation
– Cauchy Integral
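For completeness, the step from (1) to the generating-function identity can be written out. This derivation is a sketch added for the reader: it swaps the order of summation, which is justified for $|z| < 1$, and then applies the binomial theorem.

```latex
\begin{align*}
G^{(q)}_P(z)
  &= \sum_{k=0}^{\infty} z^k \sum_{j=k}^{\infty} \binom{j}{k} q^k (1-q)^{j-k}\, p_j \\
  &= \sum_{j=0}^{\infty} p_j \sum_{k=0}^{j} \binom{j}{k} (qz)^k (1-q)^{j-k} \\
  &= \sum_{j=0}^{\infty} p_j\, (1 - q + qz)^j
   = G_P(1 - q + qz).
\end{align*}
```

Writing $w = 1 - q + qz$, so that $z = (w - (1 - q))/q$, gives the second relation. As $z$ ranges over $D(0, 1)$, $w$ ranges over $D(1 - q, q)$, which is where $G_P$ is known from the sampled data.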
Scheme 1: Analytic Continuation
q = 0.6
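[Figure: discs of convergence in the complex plane (both axes from −1 to 1), with expansion points z0 and z1.]

$$p_j = \sum_{n=j}^{\infty} \binom{n}{j} \frac{(-1)^{n-j}}{q^{n}}\, (1-q)^{n-j}\, p^{(q)}_n \qquad (2)$$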
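Formula (2) can be checked on a toy case. The sketch below assumes a distribution with finite support, so that both (1) and (2) are finite sums, and uses exact rational arithmetic. The function names and the example distribution are illustrative. With estimated $p^{(q)}_n$ and heavy tails the situation is much harder, since the factors $1/q^n$ amplify any error.

```python
from fractions import Fraction
from math import comb

def thin(p, q):
    """Forward map (1): distribution after packet thinning."""
    n = len(p)
    return [
        sum(comb(j, k) * q**k * (1 - q) ** (j - k) * p[j] for j in range(k, n))
        for k in range(n)
    ]

def invert(p_q, q):
    """Inverse map (2), truncated to the finite support of p_q."""
    n = len(p_q)
    return [
        sum(
            comb(m, j) * (-1) ** (m - j) * (1 - q) ** (m - j) / q**m * p_q[m]
            for m in range(j, n)
        )
        for j in range(n)
    ]

q = Fraction(3, 5)
p = [Fraction(0), Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]
recovered = invert(thin(p, q), q)
print(recovered == p)
```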
Scheme 1: Analytic Continuation
q = 0.1
[Figure: chain of discs of convergence in the complex plane (both axes from −1 to 1), with successive expansion points z0, z1, z2, z3, z4, z5.]
pj = ...
Scheme 2: Cauchy Integral
$$p_j = \frac{1}{2\pi i} \oint_S \frac{G_P(z)}{z^{j+1}}\, dz \qquad (3)$$

S: any closed contour containing the origin, for instance D(0, 1).
Inversion methods work well when $G_P$ can be directly evaluated on S.
Values of $G_P$ on D(0, 1) are unknown: they are obtained with Padé approximants.
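The contour integral (3) can be approximated numerically when the generating function can be evaluated on the contour. The sketch below assumes a geometric distribution with a known closed-form generating function and a circular contour of radius 0.5. These choices and the function names are illustrative. It does not cover the Padé approximation step needed when $G_P$ is only known through sampled data.

```python
import numpy as np

def coefficients_by_cauchy(G, num_coeffs, radius=0.5, num_points=256):
    """Approximate p_j from (3) on the circle |z| = radius.

    With z = radius * exp(i*theta), dz = i*z*dtheta, so (3) reduces to the
    average over theta of G(z) * z**(-j), computed here by the trapezoidal rule.
    """
    theta = 2 * np.pi * np.arange(num_points) / num_points
    z = radius * np.exp(1j * theta)
    values = G(z)
    return np.array([np.mean(values * z ** (-j)) for j in range(num_coeffs)]).real

a = 0.3

def G(z):
    """Generating function of the geometric law p_j = (1 - a) * a**j."""
    return (1 - a) / (1 - a * z)

p = coefficients_by_cauchy(G, num_coeffs=6)
expected = (1 - a) * a ** np.arange(6)
print(np.max(np.abs(p - expected)))
```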
Distribution of number of packets per flow
q = 0.6
[Figure: log-log plot of Pr(P = j) against j (number of packets per flow). Curves: theoretical original density; flow thinning; packet thinning: scheme 1; packet thinning: scheme 2.]
Distribution of number of packets per flow
Packet sampling
- Easy to implement.
- Need for a consistent flow definition for sampled traffic (new timeout T0).
- Problems estimating $p^{(q)}$ from sampled data.
- Severe numerical issues in recovering the packet distribution ("impossible" for q < 0.5!).
Flow sampling
- Need on-line processing to create flows.
- No need to change the flow definition.
- No inversion to recover the packet distribution.
- q plays no theoretical role: only the remaining number of flows matters for the estimation.
Spectral density of packet arrival process
[Figure: original traffic over time, with its packet-sampled and flow-sampled versions.]
Packet sampling: potential inversion problems.
Flow sampling: potential inversion problems.
Spectral density of packet arrival process
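$\Gamma_X(\omega)$: spectral density of original traffic.
$\Gamma^{(q)}_X(\omega)$: spectral density of sampled traffic.
Packet sampling
Results from the theory of thinned point processes give a direct inversion, where $\lambda^{(q)}$ is the rate of the sampled traffic:

$$\Gamma_X(\omega) = \frac{1}{q^2}\left[\Gamma^{(q)}_X(\omega) - (1 - q)\,\lambda^{(q)}\right]$$

Flow sampling
Assumptions needed:
- Flow arrivals follow a Poisson process.
- Flows are uncorrelated.

$$\Gamma_X(\omega) = \frac{1}{q}\,\Gamma^{(q)}_X(\omega)$$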
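Both inversions are simple pointwise operations on an estimated spectrum. The sketch below is illustrative: the function names are hypothetical, and the sanity check assumes a normalisation in which a Poisson stream of rate λ has a flat spectrum equal to λ.

```python
import numpy as np

def invert_packet_sampled_spectrum(gamma_q, q, rate_q):
    """Packet sampling: remove the thinning noise floor, then rescale by 1/q^2."""
    return (np.asarray(gamma_q, dtype=float) - (1 - q) * rate_q) / q**2

def invert_flow_sampled_spectrum(gamma_q, q):
    """Flow sampling (Poisson, uncorrelated flows): rescale by 1/q."""
    return np.asarray(gamma_q, dtype=float) / q

# Sanity check on a Poisson stream (assumed flat spectrum equal to its rate):
# thinning a Poisson stream of rate lam gives a Poisson stream of rate q * lam.
lam, q = 100.0, 0.1
gamma_q = np.full(8, q * lam)
recovered = invert_packet_sampled_spectrum(gamma_q, q, rate_q=q * lam)
print(recovered)
```

For small q the packet-sampling inversion subtracts two nearly equal quantities and divides by q², so estimation noise in the sampled spectrum is strongly amplified.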
Study Second Order Structure
Analysis tools: Discrete Wavelet Transform
Definition:
Comparison of a signal $X(t)$ with a family of functions $\psi_{j,k}$ by means of inner products $d_X(j, k) = \langle X, \psi_{j,k} \rangle$, where $\psi_{j,k}(t) = 2^{-j/2}\,\psi(2^{-j} t - k)$, and $\psi$ is the mother wavelet, localised both in time and frequency.
Properties:
- $\{d_X(j, k),\ k \in \mathbb{Z}\}$ is stationary and short range dependent for $j$ fixed, with variance$(j) = E|d_X(j, k)|^2$.
- For scaling processes: $E|d_X(j, k)|^2 = 2^{j\alpha}\, E|d_X(0, k)|^2$.
- For LRD processes: $E|d_X(j, k)|^2 \sim 2^{j\alpha}\, E|d_X(0, k)|^2$ for large $j$.
Wavelet Spectrum Estimate:

$$\log_2\!\left(\frac{1}{n_j} \sum_{k} |d_X(j, k)|^2\right) \text{ vs } j$$

Link with power spectral density:

$$E|d_X(j, k)|^2 = \int \Gamma_X(\nu)\, 2^j\, |\Psi(2^j \nu)|^2\, d\nu$$
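The wavelet spectrum estimate can be sketched with the Haar wavelet, which is chosen here only because it needs no library support; the slides do not say which mother wavelet was used. The function name and the white-noise test signal are likewise assumptions.

```python
import numpy as np

def haar_wavelet_spectrum(x, num_octaves):
    """Return log2 of the mean squared Haar detail coefficients per octave."""
    approx = np.asarray(x, dtype=float)
    spectrum = []
    for _ in range(num_octaves):
        approx = approx[: len(approx) // 2 * 2]   # drop a trailing odd sample
        even, odd = approx[0::2], approx[1::2]
        detail = (even - odd) / np.sqrt(2)        # d_X(j, k) at this octave
        approx = (even + odd) / np.sqrt(2)        # input to the next octave
        spectrum.append(np.log2(np.mean(detail**2)))
    return np.array(spectrum)

rng = np.random.default_rng(0)
white = rng.standard_normal(2**14)
spectrum = haar_wavelet_spectrum(white, num_octaves=5)
print(spectrum)
```

For unit-variance white noise the estimate is roughly flat around 0 across octaves. A scaling or LRD process would instead show a slope close to α over the relevant range of j.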
Spectral density: q = 0.1
[Figure: wavelet spectrum, log2 Var(d_j) against octave j = log2(a), with the corresponding time scales on a second axis. Curves: Original; Packet Thinned; Inferred from Packet Thinned; Flow Thinned; Inferred from Flow Thinned.]
Spectral density: q = 0.001
[Figure: wavelet spectrum, log2 Var(d_j) against octave j = log2(a), with the corresponding time scales on a second axis. Curves: Original; Packet Thinned; Inferred from Packet Thinned; Flow Thinned; Inferred from Flow Thinned.]
Conclusions
Packet Sampling
- Easy to implement.
- Need for a consistent flow definition for sampled traffic (new timeout T0).
- Problems estimating $p^{(q)}$ from sampled data.
- Severe numerical issues in recovering the packet distribution ("impossible" for q < 0.5!).
- Inaccurate estimation of the spectrum from sampled traffic for small q.
Flow Sampling
- Need on-line processing to create flows.
- No need to change the flow definition.
- No inversion to recover the packet distribution.
- q plays no theoretical role: only the remaining number of flows matters for the estimation.
- Accurate spectrum estimation.
Application to traffic modelling
Aim
- Fit a model to the sampled traffic.
- Infer the model parameters for the unsampled traffic.
Theory
Closure properties of the Bartlett-Lewis Point Process under both packet and flow sampling.
Practice
Only flow thinning is applicable.
Sampling the Bartlett-Lewis Point Process
[Figure: wavelet spectrum, log2 Var(d_j) against octave j = log2(a), with the corresponding time scales on a second axis. Curves: Original; BLPP matched to Original; Flow Thinned; BLPP matched to Flow Thinned; BLPP reconstructed from Thinned.]