Observed structure of addresses in IP traffic

SLIDE 1

Observed structure of addresses in IP traffic

Eddie Kohler, Jinyang Li, Vern Paxson, Scott Shenker

ICSI Center for Internet Research
Thanks to David Donoho and Dick Karp

SLIDE 2

Problem

  • How can we model the set of destination IP addresses visible on some link? (And does it matter?)

Example from a 4-hour trace at a university access link:

[Figure: active destination addresses across the address space, axis marked at 0.0.0.0, 64.0.0.0, 128.0.0.0, 192.0.0.0, and 255.255.255.255]

In particular, can we model how the addresses aggregate? We call this address structure.

  • Applications might include average-case route lookup, analysis of aggregate-based congestion control, realistic sets of addresses for simulations, . . .

SLIDE 3

Results

  • Address structure dominates the characteristics of medium-scale prefix aggregates, such as /16s.

  • The medium-scale aggregation behavior of real addresses is well modeled by a multifractal Cantor set construction with two parameters.

The model captures both fractal metrics and metrics we developed for address structures.

  • Address structure can serve as a site “fingerprint”.

Structural metrics differ between sites.
At a given site, these metrics are stable over short time scales.
New communication dynamics, such as worm propagation, show up in the metrics.

SLIDE 4

Outline

  • Terminology
  • Address structure and aggregate packet counts
  • Model
  • Metrics
  • Fingerprints

SLIDE 5

Terminology

  • Active address: an IP address visible in the trace as a destination
  • N: the number of active addresses in a trace

N ≤ 2^32 by definition; N ≪ 2^32 for all our traces

  • p-aggregate: a set of addresses that share the same p-bit address prefix (0 ≤ p ≤ 32)

Also called a /p
1.0.0.0 and 1.99.130.14 are in the same /8, but different /10s

  • Active p-aggregate: a /p containing at least one active address
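
The terminology above can be sketched in a few lines of Python. This is only an illustration: the helper names `aggregate_of`, `same_aggregate`, and `active_aggregates` are hypothetical, and addresses are handled as dotted-quad strings via the standard `ipaddress` module.

```python
import ipaddress

def aggregate_of(addr: str, p: int) -> int:
    """Return the p-bit prefix of an IPv4 address as an integer (its /p)."""
    if not 0 <= p <= 32:
        raise ValueError("prefix length must be in 0..32")
    return int(ipaddress.IPv4Address(addr)) >> (32 - p)

def same_aggregate(a: str, b: str, p: int) -> bool:
    """True if a and b share the same p-bit prefix."""
    return aggregate_of(a, p) == aggregate_of(b, p)

def active_aggregates(active_addrs, p: int) -> set:
    """The set of active /p aggregates: every /p holding an active address."""
    return {aggregate_of(a, p) for a in active_addrs}

# The slide's example: same /8, different /10s.
print(same_aggregate("1.0.0.0", "1.99.130.14", 8))
print(same_aggregate("1.0.0.0", "1.99.130.14", 10))
```

The second octets 0 and 99 differ in their second-highest bit, which is why the two addresses separate at prefix length 10.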

SLIDE 6

Traces

Name  Description                    ∆T       # pkts  N
U1    large university access link   ∼ 4 h    62M     69,196
U2    large university access link   ∼ 1 h    101M    144,244
A1    ISP                            ∼ 0.6 h  34M     82,678
A2    ISP                            1 h      29M     154,921
R1    link from regional ISP         1 h      1.5M    168,318 §
R2    link from regional ISP         2 h      1M      110,783 §
W1    large Web site access link     ∼ 2 h    5M      124,454

  • Collected between 1998 and 2001

Most anonymized while preserving prefix and class relationships
§ means sampled (1 in 256)

SLIDE 7

Does address structure matter?

  • Assume that aggregate packet counts matter.

Accounting, fairness, congestion control . . .

  • What factors affect aggregate packet counts?

Packet counts per address: probably a heavy-tailed distribution
Addresses per aggregate = address structure
Correlation

  • Analyze the contributions of these factors to an observed packet count distribution

Medium scales are most interesting (/16s and thereabouts)

SLIDE 8

R1 packet count distributions

[Figure: complementary CDF versus packet count (log-log axes) for R1 16-aggregates, R1 addresses, and R1 flows; annotated slopes −1.46, −1.16, and −1.13]

SLIDE 9

Semi-experiments

  • Manipulate the data, destroying one factor at a time; see which factors impact aggregate packet counts

  • “Random counts”: destroy per-address packet counts

Replace the (heavy-tailed) per-address packet count distribution with a uniform distribution over [0, 17.54]

  • “Random addresses”: destroy address structure

Replace address structure with a uniform random distribution over the entire IP address space

  • “Permuted counts”: destroy correlation

Permute per-address packet counts among the active addresses
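
A minimal sketch of the three manipulations, assuming a trace has already been reduced to a dict mapping each active address (as a 32-bit integer) to its packet count. The function names are hypothetical. The upper bound of twice the mean count in `random_counts` is an assumption chosen to preserve the mean; the slide only states the interval [0, 17.54] without saying how it was chosen.

```python
import random
from collections import Counter

def aggregate_counts(counts, p=16):
    """Sum per-address packet counts into per-/p packet counts."""
    agg = Counter()
    for addr, c in counts.items():
        agg[addr >> (32 - p)] += c
    return agg

def random_counts(counts, rng):
    """Destroy per-address counts: draw each uniformly from [0, 2 * mean].
    The 2 * mean bound is an assumption, not taken from the slides."""
    upper = 2 * sum(counts.values()) / len(counts)
    return {a: rng.uniform(0, upper) for a in counts}

def random_addresses(counts, rng):
    """Destroy address structure: move each count to a distinct address
    drawn uniformly from the whole 32-bit space."""
    new_addrs = rng.sample(range(2**32), len(counts))
    return dict(zip(new_addrs, counts.values()))

def permuted_counts(counts, rng):
    """Destroy correlation: shuffle counts among the same active addresses."""
    values = list(counts.values())
    rng.shuffle(values)
    return dict(zip(counts.keys(), values))

rng = random.Random(0)
trace = {0x0A000001: 5, 0x0A000002: 7, 0x0B000001: 1}
for name, variant in [("random counts", random_counts),
                      ("random addresses", random_addresses),
                      ("permuted counts", permuted_counts)]:
    print(name, sorted(aggregate_counts(variant(trace, rng)).values()))
```

Comparing the /16 packet count distribution of each variant with that of the unmodified trace shows which factor the aggregate counts depend on.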

SLIDE 10

Address structure matters most

[Figure: complementary CDF versus 16-aggregate packet count (log-log axes) for R1, Permuted counts, Random counts, and Random addresses]

SLIDE 11

Tour of U1’s address structure

[Figure: successive zooms into U1’s address structure: 0.0.0.0 to 255.255.255.255, then 192.0.0.0 to 200.0.0.0, then 195.128.0.0 to 195.192.0.0, then 195.188.0.0 to 195.190.0.0]

SLIDE 12

Self-similarity?

  • Interesting structure all the way down

Visually “self-similar” characteristics

  • Might address structure be usefully modeled by a fractal?

Treat an address structure as a subset of the unit interval
Fractal dimension D ∈ [0, 1]?

SLIDE 13

Fractal dimension for address structure

  • Use lattice box-counting dimension

Corresponds nicely to prefix aggregation

  • Let n_p equal the number of active /ps in a trace

n_32 = N
n_p ≤ n_{p+1} ≤ 2 n_p: each /p contains and is covered by 2 disjoint /(p + 1)s

  • Then D = lim_{p→∞} (log n_p) / (p log 2)

But p ≤ 32 here, and expect sampling effects for high p
Examine medium p to see if the limit exists
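
Since the limit cannot be taken with p ≤ 32, one way to examine medium p is to fit a line to log2 n_p against p and read D off the slope. The sketch below does that with an ordinary least-squares fit; the function names and the choice of fitting range are assumptions, and addresses are 32-bit integers.

```python
import math

def active_counts(addrs):
    """n[p] = number of active /p aggregates, for p = 0..32."""
    addrs = set(addrs)
    return [len({a >> (32 - p) for a in addrs}) for p in range(33)]

def box_counting_slope(n, p_lo, p_hi):
    """Least-squares slope of log2(n_p) against p over p_lo..p_hi.
    A slope near D over a range of medium p suggests the limit exists."""
    ps = list(range(p_lo, p_hi + 1))
    ys = [math.log2(n[p]) for p in ps]
    mean_p = sum(ps) / len(ps)
    mean_y = sum(ys) / len(ys)
    num = sum((p - mean_p) * (y - mean_y) for p, y in zip(ps, ys))
    den = sum((p - mean_p) ** 2 for p in ps)
    return num / den

# One active address in every /16: n_p doubles with p up to p = 16.
n = active_counts(range(0, 2**32, 2**16))
print(box_counting_slope(n, 4, 12))
```

For this fully spread example the slope over p = 4..12 is 1; a trace with D = 0.79 would show a shallower line over its medium scales.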

SLIDE 14

log n_p is linearly related to p at medium scales

[Figure: n_p (log scale) versus prefix length p for R1, A2, and U1; annotated D = 0.79]

SLIDE 15

Multifractality

  • Monofractal may not be sufficient

Same scaling behavior everywhere
Not what we saw in the tour

  • Examine the multifractal spectrum to test for multifractality (different local scaling behavior)

Binned approximation (Histogram Method)
If multifractal, spectrum will cover a wide range of scaling exponents

SLIDE 16

Address structure is multifractal at /16

[Figure: multifractal spectrum f_16(x) versus scaling exponent for R1 and for a Cantor dust with D = 0.79]

SLIDE 17

Multifractal model

  • Make a multifractal Cantor measure matching this spectrum
  • Start with a Cantor dust with dimension D

Repeatedly remove middle subinterval with proportion h = 1 − 2^{1−1/D}

  • Sample unequally from left and right subintervals

Distribute a unit of “mass” between subintervals; left gets m_0, middle gets 0 (removed), right gets m_2 = 1 − m_0
Produces a sequence of measures µ_k that weakly converge to µ
Sample an address with probability equal to its measure
Result: different local scaling behavior
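
A rough sketch of sampling from such a measure, under stated assumptions: the recursion is cut off at a fixed depth (32 here), a point is drawn uniformly inside the final subinterval, and a point x in the unit interval is mapped to the address floor(x · 2^32). None of these choices comes from the slides, and the function names are hypothetical.

```python
import math
import random

def removed_fraction(D):
    """Middle proportion h removed at each step: h = 1 - 2^(1 - 1/D)."""
    return 1.0 - 2.0 ** (1.0 - 1.0 / D)

def sample_point(D, m0, depth, rng):
    """Draw one point of the unit interval from the depth-limited measure."""
    h = removed_fraction(D)
    keep = (1.0 - h) / 2.0          # relative length of each kept subinterval
    lo, width = 0.0, 1.0
    for _ in range(depth):
        if rng.random() >= m0:      # right subinterval, with mass 1 - m0
            lo += width * (keep + h)
        width *= keep               # left subinterval keeps lo unchanged
    return lo + rng.random() * width

def sample_addresses(D, m0, n, depth=32, seed=0):
    """Collect n distinct 32-bit addresses sampled from the measure."""
    rng = random.Random(seed)
    addrs = set()
    while len(addrs) < n:
        x = sample_point(D, m0, depth, rng)
        addrs.add(min(int(x * 2**32), 2**32 - 1))
    return addrs

# Sanity check: the classic middle-thirds Cantor set has D = log 2 / log 3.
print(removed_fraction(math.log(2) / math.log(3)))
print(len(sample_addresses(0.79, 0.80, 1000)))
```

With m0 = 0.5 the mass is split evenly and the result is the plain Cantor dust; moving m0 away from 0.5 concentrates addresses on one side at every scale, which is what produces different local scaling behavior.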

SLIDE 18

The model fits well

[Figure: multifractal spectrum f_16(x) versus scaling exponent for R1, a Cantor dust (D = 0.79, m_0 = 0.5), and the R1 model (D = 0.79, m_0 = 0.80)]

SLIDE 19

The model fits well

[Figure: multifractal spectrum f_16(x) versus scaling exponent for A2 and the A2 model (D = 0.80, m_0 = 0.70)]

SLIDE 20

Why multifractal?

  • Perhaps it’s due to a cascade

Recursive subdivision plus a rule for distributing mass

  • For example, address allocation

Pure speculation!
ICANN allocates short prefixes to providers
Providers allocate longer prefixes to their customers
All parties might allocate basically from left to right

SLIDE 21

Does the multifractal spectrum matter?

  • Certainly the model doesn’t look like real data:

[Figure: address-space plots from 0.0.0.0 to 255.255.255.255 for U1 and for the U1 model]

How do we know whether we’ve captured relevant properties?

  • Develop application metrics for address structures

Contrast metrics among traces
Compare with model

SLIDE 22

Active aggregate counts: n_p and γ_p

  • n_p again equals the number of active /ps in a trace
  • n_p measures how densely addresses are packed

If N = 2^16 and n_16 = 1, addresses are closely packed
If N = 2^16 and n_16 = 2^16, addresses are well spread out
Useful for algorithms keeping track of aggregates: shows how many aggregates there tend to be

  • γ_p = n_{p+1}/n_p more convenient for graphs

N = ∏_{0≤p<32} γ_p (the product telescopes to n_32/n_0, and n_0 = 1)
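
The two packing extremes and the telescoping product can be checked directly. This is a small sketch with hypothetical function names, treating addresses as 32-bit integers.

```python
from math import prod

def active_counts(addrs):
    """n[p] = number of active /p aggregates, for p = 0..32."""
    addrs = set(addrs)
    return [len({a >> (32 - p) for a in addrs}) for p in range(33)]

def gammas(n):
    """gamma[p] = n[p+1] / n[p] for p = 0..31; each lies in [1, 2]."""
    return [n[p + 1] / n[p] for p in range(32)]

packed = active_counts(range(2**16))              # one fully populated /16
spread = active_counts(range(0, 2**32, 2**16))    # one address per /16

print(packed[16], spread[16])
print(prod(gammas(packed)), prod(gammas(spread)))
```

Both sets have N = 2^16, so both products come out to 65536, but the packed set has γ_p = 1 up to p = 16 and γ_p = 2 afterwards, while the spread set is the reverse.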

SLIDE 23

γ_p

[Figure: γ_p versus prefix length p for R1, A2, U1, and W1]

SLIDE 24

Models’ γ_p

[Figure: γ_p versus prefix length p for R1, the R1 model, A2, and the A2 model]

SLIDE 25

Discriminating prefixes

  • The discriminating prefix of an active address, a, is the prefix length of the largest aggregate that contains only one active address, namely a.

[Figure: example with 4-bit addresses, annotated with prefix lengths]

  • Measures address separation

If many addresses have d.p. < 20, say, then addresses are well separated
How depopulated do aggregates become?
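
One way to compute discriminating prefixes, sketched under the assumption that addresses are integers of a fixed bit width: sort the active addresses, and for each one take the longest prefix it shares with either sorted neighbor, plus one. The function names are hypothetical, and the 4-bit address set below is made up for illustration rather than taken from the slide's figure.

```python
def common_prefix_len(a, b, bits=32):
    """Number of leading bits shared by two distinct bits-wide integers."""
    return bits - (a ^ b).bit_length()

def discriminating_prefixes(addrs, bits=32):
    """Map each active address to its discriminating prefix length."""
    s = sorted(set(addrs))
    dp = {}
    for i, a in enumerate(s):
        shared = -1  # no other active address: the /0 already isolates a
        if i > 0:
            shared = max(shared, common_prefix_len(a, s[i - 1], bits))
        if i + 1 < len(s):
            shared = max(shared, common_prefix_len(a, s[i + 1], bits))
        dp[a] = shared + 1
    return dp

# Hypothetical 4-bit example: 0000, 0001, 0100, 1000.
for addr, d in discriminating_prefixes([0b0000, 0b0001, 0b0100, 0b1000], bits=4).items():
    print(format(addr, "04b"), d)
```

Checking only the sorted neighbors is enough because the address sharing the longest prefix with a is always adjacent to a in sorted order.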

SLIDE 26

Discriminating prefixes: π_p

  • Let π_p equal the number of addresses with d.p. p

Σ_p π_p = N

Turns discriminating prefixes into a metric

SLIDE 27

π_p

[Figure: CDF of π_p (log scale) versus prefix length p for R1, A2, U1, and W1]

SLIDE 28

Models’ π_p

[Figure: CDF of π_p (log scale) versus prefix length p for R1, the R1 model, A2, and the A2 model]

SLIDE 29

Aggregate population distribution

  • Like aggregate packet count distribution, but count the number of active addresses per aggregate

Expect a wide range of variation, just as with the other metrics
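
A short sketch of this metric, with hypothetical function names and addresses as 32-bit integers. The complementary CDF here is taken as the fraction of aggregates with population at least x; whether the plots use "at least" or "greater than" is not stated on the slides, so that convention is an assumption.

```python
from collections import Counter

def aggregate_populations(addrs, p=16):
    """Number of active addresses in each active /p."""
    return Counter(a >> (32 - p) for a in set(addrs))

def ccdf(values):
    """Sorted (x, fraction of values >= x) pairs, one per distinct x."""
    values = sorted(values)
    n = len(values)
    points = []
    for i, x in enumerate(values):
        if i == 0 or x != values[i - 1]:
            points.append((x, (n - i) / n))
    return points

active = {0x0A000001, 0x0A000002, 0x0A000003, 0x0B000001}
pops = aggregate_populations(active, p=16)
print(ccdf(pops.values()))
```

Plotting these points on log-log axes for p = 16 and p = 8 gives curves comparable in form to the aggregate population plots.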

SLIDE 30

Aggregate population distribution

[Figure: complementary CDF versus aggregate population (log-log axes), /16s and /8s, for R1, A2, U1, and W1]

SLIDE 31

Models’ aggregate population distribution

[Figure: complementary CDF versus aggregate population (log-log axes), /16s and /8s, for R1, the R1 model, A2, and the A2 model]

SLIDE 32

A tough metric

  • The model for A2 doesn’t match A2’s aggregate populations

R1 and W1 match well; A2 and U1 do not
Significant aggregation in A2, U1 at long prefixes . . . ?

  • Aggregate population distribution is difficult to match
  • Consider random allocation constrained to match γ_p and π_p exactly

Heck, match “generalized discriminating prefixes” (d.p.s for aggregates) as well
Call this the “Match-DP” model
How well does this do?

SLIDE 33

Match-DP fails aggregate population distribution

[Figure: complementary CDF versus aggregate population (log-log axes), /16s and /8s, for R1, R1 Model, R1 DP-Model, A2, A2 Model, and A2 DP-Model]

SLIDE 34

Another tough metric: The multifractal spectrum

[Figure: multifractal spectrum f_16(x) versus scaling exponent for R1, R1 Model, and R1 DP-Model]

SLIDE 35

Properties of γ_p: Sampling effects?

  • Turn from the multifractal model to properties of our γ_p metric
  • First: Is γ_p dominated by sampling effects?

N is effectively a sample size
How does the shape of the γ_p curve depend on N?

  • Plot γ_p for longer and shorter sections of trace U1

24 hours → 6 minutes; N = 161,560 → 11,838

SLIDE 36

Shape of γ_p similar for wide range of sample sizes

[Figure: γ_p versus prefix length p for sections of U1: 24 hours (N = 161,560), 4 hours (N = 69,196), 40 minutes (N = 26,108), and 6 minutes (N = 11,838)]

SLIDE 37

Shape of γ_p similar for wide range of sample sizes

[Figure: γ_p versus prefix length p for sections of U1: 24 hours (N = 161,560), 4 hours (N = 69,196), 40 minutes (N = 26,108), and 6 minutes (N = 11,838); with R1 and A2 added for comparison]

SLIDE 38

Short-term stability?

  • Is γ_p stable over short time scales?
  • Divide traces into short sections, each with N = 32,768

Plot maximum, minimum, and mean γ_p over all sections
R1, A2, and U2; sections last about 6–7 minutes each

SLIDE 39

Shape of γ_p relatively stable over short time scales

[Figure: γ_p versus prefix length p for R1 sections, A2 sections, and U2 sections]

SLIDE 40

New communication dynamics?

  • How does γ_p change given a different communication pattern, such as worm propagation?

Expect worm propagation to significantly change the destination addresses visible at an access link, since every possible internal address will be contacted.
Not the best detection metric . . .

  • Take a new data set, collected at a national laboratory, before and after Code Reds 1 and 2

Consider γ_p and aggregate population distribution

SLIDE 41

Shape of γ_p changes during worm propagation

[Figure: γ_p versus prefix length p for 18 Jul (pre-Code Red), 19 Jul (Code Red 1), 3 Aug (pre-Code Red 2), and 4 Aug (Code Red 2)]

SLIDE 42

Agg. packet counts change during worm propagation

[Figure: complementary CDF versus packet count of 24-aggregates (log-log axes) for 18 Jul (pre-Code Red), 19 Jul (Code Red 1), 3 Aug (pre-Code Red 2), and 4 Aug (Code Red 2)]

SLIDE 43

Address stability

  • Divide a trace into sections, each lasting t seconds.
  • How many addresses in section 1 recur in section 2?

. . . in sections 1, 2, and 3? And so forth
Indicates how quickly address sets change

  • Model: there are long-lived addresses and short-lived addresses

Every section contains n_S short-lived and n_L long-lived
Addresses survive into the next section with probabilities p_S and p_L (where p_L > p_S)
How well does this model match?
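
A minimal sketch of both sides of this comparison, with hypothetical function names. The model curve is written as n_L · p_L^(x−1) + n_S · p_S^(x−1), which is how the fitted curves on the following slides read; treating the larger survival probability in each fit as p_L is an assumption based on p_L > p_S.

```python
def expected_survivors(x, n_long, p_long, n_short, p_short):
    """Model: addresses expected to appear in all of sections 1..x."""
    return n_long * p_long ** (x - 1) + n_short * p_short ** (x - 1)

def surviving_addresses(sections):
    """Data: size of the running intersection of per-section address sets."""
    alive = set(sections[0])
    sizes = [len(alive)]
    for section in sections[1:]:
        alive &= set(section)
        sizes.append(len(alive))
    return sizes

# Fit quoted for U2 with 6-minute sections.
for x in range(1, 5):
    print(x, round(expected_survivors(x, 9779.9, 0.937, 22260.1, 0.255), 1))

print(surviving_addresses([{1, 2, 3}, {2, 3, 4}, {3, 5}]))
```

Because the short-lived term decays quickly, the curve drops steeply over the first few sections and then flattens into the slow decay of the long-lived addresses.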

SLIDE 44

U2, 6-minute sections

[Figure: surviving addresses versus number of sections (x); real data and fit 9779.9 · 0.937^(x−1) + 22260.1 · 0.255^(x−1)]

SLIDE 45

Other time scales

[Figure: four panels of surviving addresses versus number of sections (x), each showing real data and a fit]

1.5 minute sections: 2895.0 · 0.988^(x−1) + 11756.6 · 0.418^(x−1)
3 minute sections: 4310.0 · 0.982^(x−1) + 17208.4 · 0.367^(x−1)
4.5 minute sections: 6581.5 · 0.964^(x−1) + 20755.9 · 0.314^(x−1)
7.5 minute sections: 12718.1 · 0.907^(x−1) + 23998.4 · 0.214^(x−1)

SLIDE 46

Conclusions

  • Demonstrated importance of address structure
  • Real address structure well modeled by a two-parameter multifractal

Captures some aggregation behavior better than models built using metrics from real data

  • Use of structural metrics as site fingerprints

Metrics differ between sites, are stable over short time scales

SLIDE 47

Future work

  • ???

SLIDE 48

Analysis details

  • Sections are numbered 1 . . . k.

n[A] is the number of active addresses in the intersection of sections A.

  • n_L long-lived addresses per section, n_S short-lived addresses.
  • p_L long-lived survival probability, p_S short-lived.
  • p_L ∼ n[1 . . . k]/n[1 . . . k − 1].
  • n_L = n[1 . . . k]/p_L^k.
  • n_S = n[1] − n_L.
  • p_S = (n[1, 2] − n_L p_L)/n_S.
