Confidence Weighted Marginal Utility Analyses of Internet Mapping Techniques

Craig Prince and Danny Wyatt, December 6, 2004, CSE561 Networks


slide-1
SLIDE 1

Confidence Weighted Marginal Utility Analyses of Internet Mapping Techniques

Craig Prince and Danny Wyatt
December 6, 2004, CSE561 Networks

slide-2
SLIDE 2

Internet Mapping

What is it?

  • Figure out what the internet looks like
  • Find routers and their interconnections
  • Discern a topology

What is it good for?

  • Research
  • Simulations
  • Problem diagnosis
  • Routing in overlay networks
  • Spying on competing ISPs

slide-5
SLIDE 5

How do you map the internet?

  • Cannot directly observe it
  • Have to send traceroutes through it
  • What you see depends on:
      • Source
      • Target
      • Routing policies
      • ... and the topology itself!
  • Can only control source and target...
  • Errors can occur that do not reflect true topology...
  • Things change over time...
  • ...so just add as many as you can

Is more really better?

slide-6
SLIDE 6

Aims

  • How do different mapping tools compare in their efficient use of data?
  • Are some kinds of measurements more valuable than others?
  • If we are uncertain of our observations, how would different methods address that uncertainty?

slide-9
SLIDE 9

The Data: 3 Mapping Tools

Skitter

  • 24 distributed sources
  • Each uses 1 or more of 4 lists of preselected targets
  • Each continually loops through its lists
  • We use 3 days: December 18-20, 2002

Scriptroute

  • 70 distributed PlanetLab nodes
  • Each used the same list of 125,000 address prefixes
  • Attempted all traces once a day for three days (the same three days as Skitter)

Rocketfuel

  • 837 distributed public traceroute servers
  • ≈ 60,000 targets
  • Heuristic pruning of source-target pairs to maximize coverage
  • Data collected during January 2002

slide-10
SLIDE 10

Some Definitions

  • A map is a directed graph $G = (V, E)$
  • There is some true map $\hat{G} = (\hat{V}, \hat{E})$ with 100% perfect coverage, which is impossible to obtain
  • A map is made by aggregating many measurements
      • Sources
      • Targets
  • Coverage is how well one map approximates another
  • Marginal coverage is how much each measurement contributes to its map
  • We evaluate the marginal coverage of each of the three tools
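
The idea of marginal coverage can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `marginal_edge_coverage`, the toy router names, and the choice to measure a source's contribution as the number of edges it adds that no earlier source has seen are all assumptions made for the example.

```python
from typing import Dict, Iterable, List, Set, Tuple

Edge = Tuple[str, str]  # directed edge (from_router, to_router); hypothetical naming


def marginal_edge_coverage(
    order: Iterable[str],
    edges_by_source: Dict[str, Set[Edge]],
) -> List[Tuple[str, int]]:
    """Return, for each source in `order`, how many previously unseen edges it adds."""
    seen: Set[Edge] = set()
    gains: List[Tuple[str, int]] = []
    for source in order:
        new_edges = edges_by_source[source] - seen  # edges no earlier source observed
        gains.append((source, len(new_edges)))
        seen |= new_edges
    return gains


# Toy data: three sources whose traceroutes overlap.
edges_by_source = {
    "s1": {("a", "b"), ("b", "c")},
    "s2": {("b", "c"), ("c", "d")},
    "s3": {("a", "b")},
}
gains = marginal_edge_coverage(["s1", "s2", "s3"], edges_by_source)
print(gains)  # [('s1', 2), ('s2', 1), ('s3', 0)]
```

In this toy ordering the third source contributes nothing new, which is the kind of diminishing return the analysis is meant to expose.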

slide-11
SLIDE 11

More Definitions: Confidence Weighting

  • Traceroutes are noisy sensors with probability of error $d$
  • $n(e)$ is the number of observations of edge $e \in E$
  • Probability that $e$ exists is $P(e) = 1 - d^{n(e)}$
  • Edge coverage of $G$ is the mean probability of all edges:

    $$\frac{\sum_{e \in E} P(e)}{|E|}$$

  • Node coverage is defined similarly
  • For each analysis, also consider how it compares according to different values of $d$
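
A short Python sketch of the confidence weighting described above, under stated assumptions: each element of `observations` is one sighting of an edge in some traceroute, observations are treated as independent, and the helper names `edge_confidence` and `confidence_weighted_coverage` are hypothetical.

```python
from collections import Counter
from typing import Hashable, Iterable


def edge_confidence(n_obs: int, d: float) -> float:
    """P(e) = 1 - d**n(e): probability that an edge seen n_obs times really exists."""
    return 1.0 - d ** n_obs


def confidence_weighted_coverage(observations: Iterable[Hashable], d: float) -> float:
    """Mean of P(e) over all distinct observed edges (or nodes)."""
    counts = Counter(observations)  # n(e) for every distinct edge
    if not counts:
        return 0.0
    return sum(edge_confidence(n, d) for n in counts.values()) / len(counts)


# Toy data: edge (a, b) is seen three times, edge (b, c) once.
observations = [("a", "b"), ("a", "b"), ("a", "b"), ("b", "c")]
coverage_half = confidence_weighted_coverage(observations, d=0.5)
coverage_zero = confidence_weighted_coverage(observations, d=0.0)
print(coverage_half)  # (0.875 + 0.5) / 2 = 0.6875
print(coverage_zero)  # 1.0: with no error, a single sighting is enough
```

With $d = 0$ every observed edge counts fully, so repeated sightings only matter once the error probability is positive.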

slide-12
SLIDE 12

Node Coverage per Source

[Figure: node coverage versus number of sources, in three panels (Skitter, Scriptroute, Rocketfuel). x-axis: Source; y-axis: Node Coverage. One curve per error probability (0.0, 0.3, 0.5, 0.9), plus Total Probed.]

slide-13
SLIDE 13

Entropy

$$H(A) = \sum_{a \in A} -P(a) \log P(a)$$

  • Average number of bits needed to encode each event $a$
  • We take the entropy of the mean node and edge distributions
  • Should always be changing
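
The entropy formula can be tried out with a small Python sketch. The slides do not say how the node and edge weights are turned into a probability distribution, so this example assumes the weights are simply normalized to sum to 1, and it uses base-2 logarithms so that the result is in bits. The function name `entropy_bits` is hypothetical.

```python
import math
from typing import Sequence


def entropy_bits(weights: Sequence[float]) -> float:
    """Shannon entropy, in bits, of non-negative weights after normalizing them."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    h = 0.0
    for w in weights:
        if w > 0:  # terms with zero probability contribute nothing
            p = w / total
            h -= p * math.log2(p)
    return h


uniform = entropy_bits([1.0, 1.0, 1.0, 1.0])  # four equally likely events
skewed = entropy_bits([0.875, 0.5, 0.5, 0.5])  # one edge seen more often
print(uniform)  # 2.0 bits
print(skewed)   # slightly below 2 bits
```

A uniform distribution maximizes entropy, so any unevenness in how often elements are observed lowers the value.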

slide-14
SLIDE 14

Edge Entropy per Source

[Figure: edge entropy versus number of sources, in three panels (Skitter, Scriptroute, Rocketfuel). x-axis: Source; y-axis: Edge Entropy. One curve per error probability (0.0, 0.3, 0.5, 0.9), plus Opt.]

slide-15
SLIDE 15

Kullback-Leibler Divergence

$$KL(A \| B) = \sum_{a} p_A(a) \log \frac{p_A(a)}{p_B(a)}$$

  • Also known as relative entropy
  • Average extra bits per event for encoding according to the wrong distribution
  • We measure divergence between coverage up to a measurement and final coverage
  • Marginal utility is the decrease in K-L divergence between measurements
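
A minimal Python sketch of marginal utility as a drop in K-L divergence follows. Several choices here are assumptions, since the slides do not specify them: the partial map is taken as $A$ and the final map as $B$ (which keeps the divergence finite, because every edge in a partial map also appears in the final one), weights are normalized into distributions, and logarithms are base 2. The names `normalize` and `kl_divergence_bits` are hypothetical.

```python
import math
from typing import Dict, Hashable


def normalize(weights: Dict[Hashable, float]) -> Dict[Hashable, float]:
    """Turn positive weights into a probability distribution."""
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items() if w > 0}


def kl_divergence_bits(p: Dict[Hashable, float], q: Dict[Hashable, float]) -> float:
    """KL(p || q) in bits; infinite if p puts mass where q has none."""
    kl = 0.0
    for a, pa in p.items():
        if pa == 0:
            continue
        qa = q.get(a, 0.0)
        if qa == 0:
            return math.inf
        kl += pa * math.log2(pa / qa)
    return kl


# Toy data: the final map has four equally weighted edges.
final = normalize({"e1": 1.0, "e2": 1.0, "e3": 1.0, "e4": 1.0})
after_one = normalize({"e1": 1.0})             # map after the first measurement
after_two = normalize({"e1": 1.0, "e2": 1.0})  # map after the second measurement

kl_one = kl_divergence_bits(after_one, final)  # 2.0
kl_two = kl_divergence_bits(after_two, final)  # 1.0
marginal_utility = kl_one - kl_two             # 1.0 bit gained by the second measurement
print(kl_one, kl_two, marginal_utility)
```

Once the partial map matches the final map the divergence reaches zero, so later measurements have no marginal utility under this definition.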

slide-16
SLIDE 16

K-L Divergence per Source

[Figure: K-L divergence versus number of sources, in three panels (Skitter, Scriptroute, Rocketfuel). x-axis: Source; y-axis: Edge KL Divergence (labeled "Target KL Divergence" in the Skitter panel). One curve per error probability (0.0, 0.3, 0.5, 0.9), plus Opt.]

slide-17
SLIDE 17

K-L Divergence per Target

[Figure: K-L divergence versus number of targets, in two panels (Scriptroute, Rocketfuel). x-axis: Target; y-axis: Edge KL Divergence. One curve per error probability (0.0, 0.3, 0.5, 0.9), plus Opt.]

slide-18
SLIDE 18

Conclusions

  • Adding targets is more useful than adding sources
      • Half of all coverage comes from the first few sources
  • Rocketfuel does increase its per-measurement return
      • More targets always yield more information
      • More sources yield diminishing returns, but higher returns than with the other tools
  • There is a pronounced trade-off in confidence
      • Rocketfuel has more divergence between different error probabilities
      • More redundant tools are less affected

slide-19
SLIDE 19

Conclusions

  • These metrics can be used as heuristics for quicker mapping
  • Reordering the second two days of Skitter data according to the first day:

[Figure: Skitter node coverage versus number of sources, in two panels (Day 1; Days 2 and 3, reordered). x-axis: Source; y-axis: Node Coverage. One curve per error probability (0.0, 0.3, 0.5, 0.9), plus Total Probed.]
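
One way such a reordering heuristic might look is sketched below in Python. The slides do not state how the order was derived, so this is an assumption: sources are ranked greedily by how many new elements each adds on the first day, with ties broken alphabetically. The function name `greedy_source_order` and the toy data are hypothetical.

```python
from typing import Dict, Hashable, List, Set


def greedy_source_order(seen_by_source: Dict[str, Set[Hashable]]) -> List[str]:
    """Order sources so that each pick adds the most not-yet-covered elements."""
    remaining = dict(seen_by_source)
    covered: Set[Hashable] = set()
    order: List[str] = []
    while remaining:
        # sorted() makes tie-breaking deterministic: max keeps the first maximal name
        best = max(sorted(remaining), key=lambda s: len(remaining[s] - covered))
        order.append(best)
        covered |= remaining.pop(best)
    return order


# Toy "day 1" data: the nodes each source discovered.
day1 = {
    "s1": {"n1", "n2"},
    "s2": {"n1", "n2", "n3", "n4"},
    "s3": {"n5"},
}
order = greedy_source_order(day1)
print(order)  # ['s2', 's3', 's1']
```

The order learned from the first day could then decide which sources to probe first on later days, on the assumption that a source's usefulness is stable from day to day.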

slide-20
SLIDE 20

Conclusions

Questions?