The Shape of the Internet Slides assembled by Jeff Chase Duke - - PowerPoint PPT Presentation



slide-1
SLIDE 1

The Shape of the Internet

Slides assembled by Jeff Chase Duke University (thanks to Vishal Misra and C. Faloutsos)

slide-2
SLIDE 2

The Shape of the Network

Characterizing “shape”:

  • AS-level topology: who connects to whom
  • Router-level topology: what connects with what
  • POP-level topology: where connects with where

Why does it matter?

  • Survivability/robustness to node/POP/AS failure
  • Path lengths / diameter
  • Congestion / hot spots / bottlenecks
  • Redundancy

Star? Tree? Mesh? Random?

slide-3
SLIDE 3

Why study topology?

  • Correctness of network protocols typically independent of topology
  • Performance of networks critically dependent on topology – e.g., convergence of route information
  • Internet impossible to replicate
  • Modeling of topology needed to generate test topologies

Vishal Misra

slide-4
SLIDE 4

Internet topologies

[Figure: example topologies for AT&T, SPRINT, and MCI, shown at the router level and at the Autonomous System (AS) level]

Vishal Misra

slide-5
SLIDE 5

More on topologies..

  • Router level topologies reflect physical connectivity between nodes
    – Inferred from tools like traceroute or well known public measurement projects like Mercator and Skitter
  • AS graph reflects a peering relationship between two providers/clients
    – Inferred from inter-domain routers that run BGP and public projects like Oregon Route Views
  • Inferring both is difficult, and often inaccurate

Vishal Misra

slide-6
SLIDE 6

Early work

  • Early models of topology used variants of Erdős-Rényi random graphs
    – Nodes randomly distributed on 2-dimensional plane
    – Nodes connected to each other w/ probability inversely proportional to distance
  • Soon researchers observed that random graphs did not represent real world networks

Vishal Misra
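
The distance-dependent random graph described on this slide can be sketched roughly as follows. This is only an illustration: the function name, the unit-square placement, and the scaling constant `c` are assumptions, not taken from the slides or from any particular topology generator.

```python
import math
import random


def distance_random_graph(n, c=0.05, seed=0):
    """Place n nodes uniformly in the unit square and link each pair
    with probability min(1, c / distance).

    The constant c is a hypothetical tuning knob (an assumption);
    larger values give denser graphs.
    """
    rng = random.Random(seed)
    pos = [(rng.random(), rng.random()) for _ in range(n)]
    edges = set()
    for u in range(n):
        for v in range(u + 1, n):
            d = math.dist(pos[u], pos[v])
            # Probability inversely proportional to distance, capped at 1.
            p = 1.0 if d == 0 else min(1.0, c / d)
            if rng.random() < p:
                edges.add((u, v))
    return pos, edges


positions, links = distance_random_graph(100)
```

Nearby nodes are linked far more often than distant ones, but nothing in this construction produces hierarchy or a small set of very high degree nodes.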

slide-7
SLIDE 7

Real world topologies

  • Real networks exhibit
    – Hierarchical structure
    – Specialized nodes (transit, stub..)
    – Connectivity requirements
    – Redundancy
  • Characteristics incorporated into the Georgia Tech Internetwork Topology Models (GT-ITM) simulator (E. Zegura, K. Calvert and M.J. Donahoo, 1995)

Vishal Misra

slide-8
SLIDE 8

So…are we done?

  • No!
  • In 1999, Faloutsos, Faloutsos and Faloutsos published a paper demonstrating power law relationships in Internet graphs
  • Specifically, the node degree distribution exhibited power laws

That Changed Everything…

Vishal Misra

slide-9
SLIDE 9

Power laws in AS level topology

Vishal Misra

slide-10
SLIDE 10

AS graph is “scale-free”

  • Power law in the AS degree distribution [SIGCOMM99]

[Figure: log(degree) vs. log(rank) for Internet domains, slope -0.82; labeled points include att.com and ibm.com]

  • C. Faloutsos
slide-11
SLIDE 11

Power Laws

[Figure: P(k > d) vs. degree (d)]

  • Faloutsos³ (SIGCOMM ’99)
    – frequency vs. degree
    – empirical ccdf P(d > x) ~ x^(-α), α ≈ 1.15

Vishal Misra

topology from BGP tables
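
As a rough illustration of the empirical ccdf on this slide, the sketch below computes P(d > x) from a list of node degrees and estimates the exponent as the least-squares slope on a log-log scale. The function names are hypothetical, and a straight-line fit on log-log axes is only a quick estimate, not necessarily the exact procedure behind the reported α.

```python
import math
from collections import Counter


def empirical_ccdf(degrees):
    """Return sorted (x, P(d > x)) pairs for the observed degree values."""
    n = len(degrees)
    counts = Counter(degrees)
    remaining = n
    points = []
    for x in sorted(counts):
        remaining -= counts[x]  # nodes with degree strictly greater than x
        points.append((x, remaining / n))
    return points


def loglog_slope(points):
    """Least-squares slope of log(y) against log(x); zero values are skipped."""
    logs = [(math.log(x), math.log(y)) for x, y in points if x > 0 and y > 0]
    mean_x = sum(lx for lx, _ in logs) / len(logs)
    mean_y = sum(ly for _, ly in logs) / len(logs)
    sxx = sum((lx - mean_x) ** 2 for lx, _ in logs)
    sxy = sum((lx - mean_x) * (ly - mean_y) for lx, ly in logs)
    return sxy / sxx


# For a ccdf that follows x^(-α), the fitted slope is an estimate of -α.
```

Given degrees extracted from a real AS graph, `loglog_slope(empirical_ccdf(degrees))` would give a first guess at -α; the largest degree always has ccdf 0 and is dropped from the fit.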

slide-12
SLIDE 12

GT-ITM abandoned..

  • GT-ITM did not give power law degree graphs
  • New topology generators and explanation for power law degrees were sought
  • Focus of generators to match degree distribution of observed graph

Vishal Misra

slide-13
SLIDE 13

Generating power law graphs

Goal: construct network of size N with degree power law, P(d > x) ~ x^(-α)

  • power law random graph (PLRG)(Aiello et al)
  • Inet (Chen et al)
  • incremental growth (BA) (Barabasi et al)
  • general linear preference (GLP) (Bu et al)

Vishal Misra
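
One simple way to meet the goal on this slide is to draw a degree for every node from a power law and then pair up edge "stubs" at random. The sketch below is in the spirit of a power law random graph but is not a faithful reimplementation of PLRG, Inet, BA, or GLP; the function names, the degree cap, and the default α are assumptions made for illustration.

```python
import random


def sample_power_law_degrees(n, alpha=1.15, seed=0):
    """Draw n integer degrees with P(d >= k) = k^(-alpha) for k >= 1."""
    rng = random.Random(seed)
    degrees = []
    for _ in range(n):
        u = 1.0 - rng.random()         # uniform on (0, 1], never zero
        d = int(u ** (-1.0 / alpha))   # inverse-transform sampling
        degrees.append(min(d, n - 1))  # cap is an assumption for a toy graph
    return degrees


def match_stubs(degrees, seed=0):
    """Pair edge stubs uniformly at random.

    The result may contain self-loops and multi-edges; if the total
    number of stubs is odd, one stub is discarded.
    """
    rng = random.Random(seed)
    stubs = [node for node, d in enumerate(degrees) for _ in range(d)]
    rng.shuffle(stubs)
    if len(stubs) % 2:
        stubs.pop()
    return list(zip(stubs[0::2], stubs[1::2]))


degrees = sample_power_law_degrees(1000)
edges = match_stubs(degrees)
```

This matches a target degree distribution and nothing else, so any structure beyond node degree is left to chance.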

slide-14
SLIDE 14

Barabasi model: fixed exponent

  • incremental growth
    – initially, m0 nodes
    – step: add a new node with m edges
  • linear preferential attachment
    – connect the new node to existing node i with probability Π(k_i) = k_i / Σ_j k_j

[Figure: a new node connecting to existing nodes, annotated with attachment probabilities]

may contain multi-edges, self-loops

Vishal Misra
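
A minimal sketch of incremental growth with linear preferential attachment, as outlined on this slide. The function name and the ring of seed nodes are assumptions; the slide only says the process starts from m0 nodes.

```python
import random


def preferential_attachment(n, m=2, m0=3, seed=0):
    """Grow a graph to n nodes; each new node adds m edges to existing
    nodes chosen with probability proportional to their current degree.
    """
    rng = random.Random(seed)
    # Assumption: the m0 seed nodes form a ring so that every node
    # starts with nonzero degree.
    edges = [(i, (i + 1) % m0) for i in range(m0)]
    # Each node appears in `endpoints` once per incident edge, so a
    # uniform draw from this list selects node i with probability
    # k_i / sum_j k_j.
    endpoints = [v for edge in edges for v in edge]
    for new in range(m0, n):
        # Targets are drawn before the new node is added, so this variant
        # yields no self-loops, but repeated targets give multi-edges.
        targets = [rng.choice(endpoints) for _ in range(m)]
        for t in targets:
            edges.append((new, t))
            endpoints.extend((new, t))
    return edges


graph_edges = preferential_attachment(1000)
```

Keeping a flat list of edge endpoints avoids recomputing the degree sum at every step, which is a common trick for this kind of simulation.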

slide-15
SLIDE 15

“Scale-free” graphs

  • Preferential attachment leads to “scale free” structure in connectivity
  • Implications of “scale free” structure
    – Few centrally located and highly connected hubs
    – Network robust to random attack/node removal (probability of targeting hub very low)
    – Network susceptible to catastrophic failure by targeted attacks (“Achilles heel of the Internet”, Albert, Jeong, Barabasi, Nature 2000)

Vishal Misra
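
The contrast between random and targeted removal can be illustrated on a toy hub-and-spoke graph by measuring the largest connected component that survives. The graph and function name below are made up for illustration; this is not an experiment from the cited paper.

```python
from collections import deque


def largest_component(adj, removed=frozenset()):
    """Size of the largest connected component after deleting `removed`.

    `adj` maps each node to the set of its neighbours.
    """
    seen = set(removed)
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:  # breadth-first search over surviving nodes
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        best = max(best, size)
    return best


# Hypothetical toy topology: node 0 is a hub with nine leaf neighbours.
star = {0: set(range(1, 10))}
for leaf in range(1, 10):
    star[leaf] = {0}

after_leaf_failure = largest_component(star, removed={5})
after_hub_attack = largest_component(star, removed={0})
```

Losing a leaf leaves the rest of the graph connected, while losing the hub isolates every remaining node. A randomly chosen failure hits a leaf nine times out of ten here, which is the intuition behind the two robustness bullets.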

slide-16
SLIDE 16

Is the router-level Internet graph scale-free?

  • No…(There is no Memphis!)
  • Emphasis on degree distribution - structure ignored
  • Real Internet very structured
  • Evolution of graph is highly constrained

Vishal Misra

slide-17
SLIDE 17

Topology constraints

  • Technology
    – Router out degree is constrained by processing speed
    – Routers can either have a large number of low bandwidth connections, or..
    – A small number of high bandwidth connections
  • Geography
    – Router connectivity highly driven by geographical proximity
  • Economy
    – Capacity of links constrained by the technology that nodes can afford, redundancy/performance they desire etc.

Vishal Misra

slide-18
SLIDE 18

Network and graph mining

[Figures: Food Web [Martinez ’91]; Protein Interactions [genomebiology.com]; Friendship Network [Moody ’01]]

Graphs are everywhere!

  • C. Faloutsos
slide-19
SLIDE 19

Network and graph mining

  • What does the Internet look like?
  • What does the web look like?
  • What constitutes a ‘normal’ social network?
  • What is the ‘network value’ of a customer?
  • Which gene/species affects the others the most?

  • C. Faloutsos
slide-20
SLIDE 20

Why

Given a graph:

  • which node to market-to / defend / immunize first?
  • Are there un-natural subgraphs? (e.g., criminals’ rings)

[Figure: from Lumeta: ISPs 6/1999]

  • C. Faloutsos
slide-21
SLIDE 21

Patterns?

  • avg degree is, say 3.3
  • pick a node at random – guess its degree, exactly (-> “mode”)

[Figure: count vs. degree, avg: 3.3]

  • C. Faloutsos
slide-22
SLIDE 22

Patterns?

  • avg degree is, say 3.3
  • pick a node at random – guess its degree, exactly (-> “mode”)
  • A: 1!!

[Figure: count vs. degree, avg: 3.3]

  • C. Faloutsos
slide-23
SLIDE 23

Patterns?

  • avg degree is, say 3.3
  • pick a node at random – what is the degree you expect it to have?
  • A: 1!!
  • A’: very skewed distr.
  • Corollary: the mean is meaningless!
  • (and std -> infinity (!))

[Figure: count vs. degree, avg: 3.3]

  • C. Faloutsos
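
A small numerical illustration of why the mean says little about a typical node in a skewed degree distribution. The degree list below is invented so that the average comes out to 3.3; it is not data from the slides.

```python
import statistics

# Hypothetical degrees for 100 nodes: mostly degree 1, plus a few hubs.
# Sum is 70 + 30 + 24 + 40 + 166 = 330, so the mean is 3.3 by construction.
degrees = [1] * 70 + [2] * 15 + [3] * 8 + [10] * 4 + [50, 55, 61]

mean_degree = statistics.mean(degrees)
mode_degree = statistics.mode(degrees)      # most common degree
median_degree = statistics.median(degrees)  # degree of a "middle" node
share_below_mean = sum(d < mean_degree for d in degrees) / len(degrees)
```

In this toy list the mode and the median are both 1, and 93% of the nodes sit below the average: three hubs account for half of all edge endpoints and pull the mean up.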
slide-24
SLIDE 24

Power laws - discussion

  • do they hold, over time?
  • Yes! for multiple years [Siganos+]
  • do they hold on other graphs/domains?
  • Yes!

    – web sites and links [Tomkins+], [Barabasi+]
    – peer-to-peer graphs (gnutella-style)
    – who-trusts-whom (epinions.com)

  • C. Faloutsos
slide-25
SLIDE 25

Time Evolution: rank R

[Figure: rank exponent vs. instances in time (Nov'97 and on)]

  • The rank exponent has not changed!

[Siganos+]

[Figure: domain level, log(degree) vs. log(rank), slope -0.82; labeled points include att.com and ibm.com]

  • C. Faloutsos
slide-26
SLIDE 26

The Peer-to-Peer Topology

  • Number of immediate peers (= degree), follows a power-law [Jovanovic+]

[Figure: count vs. degree]

  • C. Faloutsos
slide-27
SLIDE 27

epinions.com

  • who-trusts-whom [Richardson + Domingos, KDD 2001]

[Figure: count vs. (out) degree]

  • C. Faloutsos
slide-28
SLIDE 28

Why care about these patterns?

  • better graph generators [BRITE, INET]
    – for simulations
    – extrapolations

  • ‘abnormal’ graph and subgraph detection
  • C. Faloutsos
slide-29
SLIDE 29

Even more power laws:

library science (Lotka’s law of publication count); and citation counts (citeseer.nj.nec.com 6/2001)

[Figure: log(count) vs. log(#citations); labeled point: Ullman]

  • C. Faloutsos
slide-30
SLIDE 30

Even more power laws:

  • web hit counts [w/ A. Montgomery]

[Figure: Web Site Traffic, axes log(freq) and log(count), Zipf; labeled point: “yahoo.com”]

  • C. Faloutsos
slide-31
SLIDE 31

Power laws, cont’d

  • In- and out-degree distribution of web sites [Barabasi], [IBM-CLEVER]

[Figure: log(freq) vs. log indegree, from [Ravi Kumar, Prabhakar Raghavan, Sridhar Rajagopalan, Andrew Tomkins]]

  • C. Faloutsos
slide-32
SLIDE 32

Power laws, cont’d

  • In- and out-degree distribution of web sites [Barabasi], [IBM-CLEVER]

[Figure: log(freq) vs. log indegree, from [Ravi Kumar, Prabhakar Raghavan, Sridhar Rajagopalan, Andrew Tomkins]]

  • C. Faloutsos
slide-33
SLIDE 33

Mapping the Internet

  • At this point in the session, we discussed the SIGCOMM 2002 RocketFuel paper, based on slides in pdf form from Neil Spring.

www.cs.umd.edu/~nspring/talks/sigcomm-rocketfuel.pdf