Archipelago Measurement Infrastructure: Updates and Analyses




slide-1
SLIDE 1

Archipelago Measurement Infrastructure

Updates and Analyses

Young Hyun

CAIDA

ISMA 2009 AIMS Workshop

Feb 12, 2009

slide-2
SLIDE 2

Outline

  • Focus and Architecture
  • Monitor Deployment
  • Measurements
  • Future Work

slide-3
SLIDE 3

Introduction

Archipelago (Ark) is CAIDA’s next-generation active measurement infrastructure

evolution of the skitter infrastructure

in production since Sep 12, 2007


slide-4
SLIDE 4

Focus

easy development and rapid prototyping

lower barriers => implement better measurements faster with lower cost

  • measurement infrastructures notoriously lack funding

raise level of abstraction with high-level API and scripting language

  • inspiration from Scriptroute, Metasploit, Scapy, Racket


slide-5
SLIDE 5

Focus

dynamic and coordinated measurements

take advantage of multiple distributed measurement nodes in sophisticated ways

  • one measurement triggers another measurement
  • use multiple nodes to divide and conquer
  • synchronize measurements

for example: Doubletree; tomography; Rocketfuel-like targeted discovery of a single network’s topology


slide-6
SLIDE 6

Focus

measurement services

build upon the work of others; share services between measurement activities

  • for example, on-demand traceroute/ping service; IP-to-AS mapping service

similar in goal to service-oriented architecture (SOA) but at finer granularity and without the complexity

slide-7
SLIDE 7

Architecture

Ark is composed of measurement nodes (machines) located in various networks worldwide

  • many thanks to the organizations hosting Ark boxes
  • please contact us if you want to host an Ark box

Ark employs a tuple space to enable communication and coordination

a tuple space is a distributed shared memory combined with a small number of easy-to-use operations

a tuple space stores tuples, which are arrays of simple values (strings and numbers), and clients retrieve tuples by pattern matching
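The store-and-match idea can be sketched in a few lines. This is a toy, single-process illustration and not the Marinda API: the class name, the method names `write`, `read`, and `take`, the use of `None` as a wildcard, and the `TRACE_REQUEST` tuples are all assumptions made for the example. A real tuple space would also offer blocking retrieval and network access, which are omitted here.

```python
import threading


class TupleSpace:
    """Toy in-memory tuple space; None in a pattern matches any value."""

    def __init__(self):
        self._tuples = []
        self._lock = threading.Lock()

    def write(self, tup):
        """Store a tuple of simple values."""
        with self._lock:
            self._tuples.append(tuple(tup))

    @staticmethod
    def _matches(pattern, tup):
        # Same length, and every non-wildcard field must be equal.
        return len(pattern) == len(tup) and all(
            p is None or p == v for p, v in zip(pattern, tup)
        )

    def read(self, pattern):
        """Return the first matching tuple without removing it, or None."""
        with self._lock:
            for tup in self._tuples:
                if self._matches(pattern, tup):
                    return tup
        return None

    def take(self, pattern):
        """Remove and return the first matching tuple, or None."""
        with self._lock:
            for i, tup in enumerate(self._tuples):
                if self._matches(pattern, tup):
                    return self._tuples.pop(i)
        return None


space = TupleSpace()
space.write(("TRACE_REQUEST", "192.0.2.1", 1))
space.write(("TRACE_REQUEST", "198.51.100.7", 2))

# One client claims the oldest request; another only peeks at request 2.
first = space.take(("TRACE_REQUEST", None, None))
peek = space.read(("TRACE_REQUEST", None, 2))
```

Because `take` removes the tuple it returns, two clients matching the same pattern never receive the same request, which is what makes the structure usable for coordination.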


slide-8
SLIDE 8

Architecture

use tuple space for decentralized (that is, peer-to-peer) communication, interaction, and coordination

[Diagram: monitor1, monitor2, monitor3, monitor4, and monitor5 with a central server]

slide-9
SLIDE 9

Monitor Deployment

33 monitors in 22 countries

Continent:

  • North America: 12
  • South America: 2
  • Europe: 11
  • Africa: 1
  • Asia: 5
  • Oceania: 2

Organization:

  • academic: 19
  • research network: 9
  • network infrastructure: 2
  • commercial network: 1
  • community network: 1
  • military research: 1

slide-10
SLIDE 10

Measurements

  • IPv4 Routed /24 Topology
  • IPv4 Routed /24 AS Links
  • IPv6 Topology
  • DNS Names
  • DNS Query/Response Traffic
  • Spoofer Project Collaboration

slide-11
SLIDE 11

IPv4 Routed /24 Topology

ongoing large-scale topology measurements

ICMP Paris traceroute to every routed /24 (7.4 million)

running scamper

  • written by Matthew Luckie of WAND, University of Waikato

group monitors into teams and dynamically divide up the measurement work among team members

13-member team probes every /24 in 48 hrs at 100pps

only one monitor probes each /24 per cycle

3 teams active
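One way to picture the dynamic division of work is a shared pool of /24s from which each team member pulls the next prefix whenever it becomes free, so faster monitors end up probing more prefixes and each prefix is probed by exactly one member per cycle. The sketch below simulates that idea; it is an assumption about the general technique, not Ark's actual scheduler, and `divide_cycle`, the monitor names, and the placeholder prefixes are hypothetical. The last two lines work out the per-member budget implied by the slide's figures, assuming each member sends at 100 pps for the full 48 hours.

```python
import heapq
from collections import deque


def divide_cycle(prefixes, seconds_per_trace):
    """Assign each prefix to whichever team member becomes free first.

    seconds_per_trace maps a (hypothetical) monitor name to the time one
    traceroute takes it. Returns {monitor: [prefixes it probed]}.
    """
    pending = deque(prefixes)
    assigned = {name: [] for name in seconds_per_trace}
    # Min-heap of (time at which the monitor is free, monitor name).
    free_at = [(0.0, name) for name in sorted(seconds_per_trace)]
    heapq.heapify(free_at)
    while pending:
        now, name = heapq.heappop(free_at)
        assigned[name].append(pending.popleft())
        heapq.heappush(free_at, (now + seconds_per_trace[name], name))
    return assigned


# Placeholder prefixes and speeds, for illustration only.
prefixes = [f"10.0.{i}.0/24" for i in range(12)]
work = divide_cycle(prefixes, {"fast": 1.0, "slow": 3.0})

# Budget implied by 7.4 million /24s, 13 members, 48 hours, 100 pps each.
traces_per_second = 7_400_000 / 13 / (48 * 3600)   # about 3.3 per member
packets_per_trace = 100 / traces_per_second        # about 30 probe packets
```

Under those assumptions each member completes roughly 3.3 traceroutes per second, which leaves a budget of about 30 probe packets per destination.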

slide-12
SLIDE 12

IPv4 Routed /24 Topology

[Figure: timeline of monitors, Sep 07 to Mar 09, y-axis ticks 5 to 30. Legend, with team number in parentheses and asterisks as in the original:]

  • Team 1: amw-us, cjj-kr*, dub-ie, hel-fi*, mnl-ph*, nrt-jp*, san-us*, syd-au*, laf-us*, lej-de, hlz-nz, bcn-es*, yto-ca*
  • Team 2: iad-us*, vie-at, cbg-uk*, lax-us, hnl-us, cmn-ma, gig-br, sjc-us, zrh-ch, bwi-us, ams-nl, tpe-tw, yow-ca
  • Team 3: she-cn, scl-cl, her-gr, pna-es, dfw-us, eug-us, nap-it

Sep 2007 to Jan 2009 (17 months): 2.5 billion traceroutes; 1.0TB data

slide-13
SLIDE 13

IPv4 Routed /24 Topology

[Figure: same monitor timeline chart as the previous slide, with the annotation "software failure" added]

slide-14
SLIDE 14

IPv4 Routed /24 Topology

[Figure: same monitor timeline chart as the previous slide, with the annotation "hardware failure" added]

slide-15
SLIDE 15

IPv4 Routed /24 Topology

[Figure: same monitor timeline chart as the previous slide, with the annotations "power supply died" and "replacement power supply died" added]

slide-16
SLIDE 16

IPv4 Routed /24 AS Links

AS links from Routed /24 Topology traces

map IP addresses to ASes with RouteViews BGP table
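A minimal sketch of this step, assuming the BGP table has already been reduced to (prefix, origin AS) pairs: each hop is mapped to an AS by longest-prefix match, and consecutive hops in different ASes yield an AS link. The function names, the documentation-range prefixes, and the AS numbers are made up for the example, and the choice not to bridge unmapped or non-responding hops is an assumption, since the slides do not say how such gaps are handled.

```python
import ipaddress


def build_table(routes):
    """routes: iterable of (prefix string, origin AS number) pairs."""
    return [(ipaddress.ip_network(prefix), asn) for prefix, asn in routes]


def ip_to_as(table, addr):
    """Longest-prefix match; returns None if no prefix covers addr."""
    ip = ipaddress.ip_address(addr)
    best = None
    for net, asn in table:
        if ip in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, asn)
    return best[1] if best else None


def as_links(table, hops):
    """Collapse one traceroute path into a set of adjacent-AS pairs."""
    links = set()
    prev = None
    for hop in hops:
        asn = ip_to_as(table, hop) if hop else None
        if asn is None:
            prev = None  # assumption: never infer a link across a gap
            continue
        if prev is not None and prev != asn:
            links.add((min(prev, asn), max(prev, asn)))
        prev = asn
    return links


# Hypothetical table using documentation prefixes and reserved AS numbers.
table = build_table([
    ("192.0.2.0/24", 64496),
    ("198.51.100.0/24", 64497),
    ("198.51.100.128/25", 64498),
    ("203.0.113.0/24", 64499),
])
# None stands for a hop that did not respond.
path = ["192.0.2.1", "192.0.2.9", "198.51.100.5", None,
        "198.51.100.200", "203.0.113.7"]
links = as_links(table, path)
```

A linear scan is fine for a sketch; a full BGP table would normally be loaded into a radix tree or similar structure for fast lookups.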


slide-17
SLIDE 17

IPv4 Routed /24 AS Links

statistics for 1 month of AS links from three sources (Dec 2008):

“avg neighbor deg” = average neighbor degree of the average k-degree node, averaged over all k

“mean clustering” = (average number of links between neighbors of k-degree nodes) / (maximum possible such links for k), averaged over all k

                          Ark      DIMES    RouteViews (rv2)
nodes                     23,425   22,995   30,760
links                     56,760   74,140   65,775
max degree                2,509    3,590    2,328
average degree            4.85     6.45     4.28
average neighbor degree   467.3    705.4    487.2
mean clustering           0.354    0.446    0.241
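The six statistics can be computed from a plain list of AS links. The sketch below follows the two definitions on this slide as literally as they can be read, giving every degree k equal weight in the final average; it is an illustration, not the analysis code behind the table. The function name `graph_stats` and the four-link toy graph are assumptions, and skipping k = 1 in the clustering average (where no neighbor-to-neighbor link is possible) is a choice the slide does not spell out.

```python
from collections import defaultdict
from itertools import combinations


def mean(values):
    return sum(values) / len(values)


def graph_stats(links):
    """Summary statistics for an undirected graph given as (a, b) pairs."""
    adj = defaultdict(set)
    for a, b in links:
        if a != b:  # ignore self-loops
            adj[a].add(b)
            adj[b].add(a)
    deg = {node: len(nbrs) for node, nbrs in adj.items()}
    n_links = sum(deg.values()) // 2

    nbr_deg_by_k = defaultdict(list)    # k -> per-node average neighbor degree
    nbr_links_by_k = defaultdict(list)  # k -> per-node links among neighbors
    for node, nbrs in adj.items():
        k = deg[node]
        nbr_deg_by_k[k].append(sum(deg[m] for m in nbrs) / k)
        nbr_links_by_k[k].append(
            sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
        )

    # k * (k - 1) / 2 is the maximum number of links among k neighbors.
    clustering_by_k = [
        mean(counts) / (k * (k - 1) / 2)
        for k, counts in nbr_links_by_k.items()
        if k >= 2
    ]
    return {
        "nodes": len(deg),
        "links": n_links,
        "max degree": max(deg.values()),
        "average degree": 2 * n_links / len(deg),
        "average neighbor degree": mean([mean(v) for v in nbr_deg_by_k.values()]),
        "mean clustering": mean(clustering_by_k) if clustering_by_k else 0.0,
    }


# Toy graph: a triangle (1, 2, 3) with one extra node hanging off node 3.
stats = graph_stats([(1, 2), (2, 3), (1, 3), (3, 4)])
```

The average degree is simply 2 × links / nodes, which reproduces the table row: for Ark, 2 × 56,760 / 23,425 ≈ 4.85.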

slide-18
SLIDE 18

3 AS Links Sources: 1 Month

[Figure: CCDF of node degree, log-log axes, for DIMES AS links (2008-12), Ark AS links (2008-12), and RouteViews (rv2) AS links (2008-12)]
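For readers unfamiliar with the plot type, a degree CCDF gives, for each degree k, the fraction of nodes whose degree is at least k. The sketch below assumes the "at least" convention (some authors use "greater than"); the function name and the sample degrees are made up.

```python
from collections import Counter


def degree_ccdf(degrees):
    """Return sorted (k, fraction of nodes with degree >= k) pairs."""
    total = len(degrees)
    counts = Counter(degrees)
    remaining = total  # nodes with degree >= the current k
    ccdf = []
    for k in sorted(counts):
        ccdf.append((k, remaining / total))
        remaining -= counts[k]
    return ccdf


points = degree_ccdf([1, 1, 2, 3])
```

Plotting these pairs on log-log axes gives a curve of the kind shown on this slide.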

slide-19
SLIDE 19

3 AS Links Sources: 1 Month

[Figure: average neighbor degree versus node degree, log-log axes, for DIMES AS links (2008-12), Ark AS links (2008-12), and RouteViews (rv2) AS links (2008-12)]

slide-20
SLIDE 20

3 AS Links Sources: 1 Month

[Figure: clustering versus node degree, log-log axes, for DIMES AS links (2008-12), Ark AS links (2008-12), and RouteViews (rv2) AS links (2008-12)]

slide-21
SLIDE 21

AS Links Growth

AS links seem to accumulate linearly without bound

  • in skitter, Ark, DIMES; possibly in BGP
  • even with fixed traceroute sources and destination list (which happened with skitter for 4 years)

AS graph densification: average degree increases

for example:

  • 1 year of Ark (2008): 104k AS links, 28k ASes
  • 2 years of DIMES: 356k AS links, 29k ASes
  • 7.5 years of skitter: 209k AS links, 27k ASes
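The densification claim can be checked with the rounded totals on this slide, since average degree is 2 × links / nodes. The second function sketches what accumulation means here: taking the union of the link sets seen in successive months. Function names and the toy monthly sets are assumptions for illustration.

```python
def average_degree(links, nodes):
    return 2 * links / nodes


# Rounded totals from the slide: (AS links, ASes).
totals = {
    "Ark, 1 year (2008)": (104_000, 28_000),
    "DIMES, 2 years": (356_000, 29_000),
    "skitter, 7.5 years": (209_000, 27_000),
}
degrees = {
    name: round(average_degree(links, nodes), 2)
    for name, (links, nodes) in totals.items()
}


def accumulated_sizes(monthly_links):
    """Size of the union of link sets after each successive month."""
    seen = set()
    sizes = []
    for links in monthly_links:
        seen |= set(links)
        sizes.append(len(seen))
    return sizes


# Toy data: each month repeats some old links and adds some new ones.
toy_months = [{(1, 2), (2, 3)}, {(2, 3), (3, 4)}, {(1, 2), (4, 5)}]
sizes = accumulated_sizes(toy_months)
```

This gives average degrees of about 7.43 for Ark, 24.55 for DIMES, and 15.48 for skitter, compared with 4.85 for a single month of Ark data in the Dec 2008 statistics, while the number of ASes stays in the 27k to 29k range.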

slide-22
SLIDE 22

AS Links Growth

hard to determine the “natural” time period to aggregate AS links

  • 1 month? 6 months? years?
  • when do we get a representative AS graph?

slide-23
SLIDE 23

AS Links Growth

hard to compare different infrastructures

you can always make AS graph bigger by aggregating


slide-24
SLIDE 24

AS Links Growth

hard to compare different infrastructures

you can always make AS graph bigger by aggregating


in fact, got spam on this ...


slide-26
SLIDE 26

Ark AS Links Growth

[Figure: number of nodes (axis 23,000 to 29,000) and number of links (axis 55,000 to 105,000) versus months of accumulation (1 to 12)]

slide-27
SLIDE 27

Ark AS Links Growth

[Figure: max degree (axis 2,250 to 4,500) and average degree (axis 4.5 to 7.5) versus months of accumulation (1 to 12)]

slide-28
SLIDE 28

Ark AS Links Growth

[Figure: average neighbor degree (axis 450 to 850) and clustering (axis 0.34 to 0.54) versus months of accumulation (1 to 12)]

slide-29
SLIDE 29

Ark AS Links: 1, 6, 12 Months

[Figure: CCDF of node degree, log-log axes, for Ark AS links accumulated over 2008 months 1 to 12, 7 to 12, and 12 to 12]

slide-30
SLIDE 30

Ark AS Links: 1, 6, 12 Months

[Figure: average neighbor degree versus node degree, log-log axes, for Ark AS links accumulated over 2008 months 1 to 12, 7 to 12, and 12 to 12]

slide-31
SLIDE 31

Ark AS Links: 1, 6, 12 Months

[Figure: clustering versus node degree, log-log axes, for Ark AS links accumulated over 2008 months 1 to 12, 7 to 12, and 12 to 12]

slide-32
SLIDE 32

Ark IPv6 Topology

ongoing “large-scale” IPv6 measurements since Dec 12, 2008

6 monitors: 3 in US, 3 in Europe

  • 2 IPv6 boxes down
  • 3 more IPv6 boxes coming Real Soon Now

ICMP Paris traceroute to every routed prefix

  • each monitor probes a random destination in every routed prefix in every cycle; 1,553 prefixes <= /48
  • reduced probing rate to take 2 days per cycle
  • running scamper
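Picking a random destination inside a routed prefix can be sketched with Python's standard `ipaddress` module. The slides do not say how Ark draws the address, so the uniform choice over the whole prefix is an assumption, as are the function name and the documentation prefixes used as stand-ins for routed prefixes.

```python
import ipaddress
import random


def random_destination(prefix, rng=random):
    """Return a uniformly chosen address inside the given prefix."""
    net = ipaddress.ip_network(prefix)
    offset = rng.randrange(net.num_addresses)
    return net[offset]


# Stand-in prefixes from the IPv6 documentation range, not real routes.
routed_prefixes = ["2001:db8::/32", "2001:db8:1234::/48"]
rng = random.Random(0)  # seeded only to make the example repeatable
destinations = {p: random_destination(p, rng) for p in routed_prefixes}
```

Drawing a fresh address in every cycle means repeated cycles sample different parts of each prefix.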

slide-33
SLIDE 33

Ark IPv6 Topology

statistics for 8 weeks of AS links from six sources:

Dec 12, 2008 to Feb 7, 2009

                          IPv6 (8 weeks)   IPv4 (4 weeks)
nodes                     520              23,425
links                     1,181            56,760
max degree                94               2,509
average degree            4.54             4.85
average neighbor degree   36.3             467.3
mean clustering           0.265            0.354

slide-34
SLIDE 34

Ark IPv6 AS Links

[Figure: CCDF of node degree, log-log axes, for Ark IPv4 AS links (2008-12, 4 weeks) and Ark IPv6 AS links (2008-12, 8 weeks)]

slide-35
SLIDE 35

Ark IPv6 AS Links

[Figure: average neighbor degree versus node degree, log-log axes, for Ark IPv4 AS links (2008-12, 4 weeks) and Ark IPv6 AS links (2008-12, 8 weeks)]

slide-36
SLIDE 36

Ark IPv6 AS Links

[Figure: clustering versus node degree, log-log axes, for Ark IPv4 AS links (2008-12, 4 weeks) and Ark IPv6 AS links (2008-12, 8 weeks)]

slide-37
SLIDE 37

DNS Names

automated ongoing DNS lookup of IP addresses seen in the Routed /24 Topology traces

  • all intermediate addresses and responding destinations

using our in-house bulk DNS lookup service (HostDB)

  • can look up millions of addresses per day

213M lookups since March 2008
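HostDB's internals are not described in these slides, so the sketch below only shows the generic first step of any bulk reverse lookup: deduplicating the addresses seen in traces and deriving the PTR query name for each one. The function name and the sample addresses are assumptions; `reverse_pointer` is part of Python's standard `ipaddress` module.

```python
import ipaddress


def ptr_names(addresses):
    """Map each distinct address to the name queried in a reverse lookup."""
    names = {}
    for text in addresses:
        ip = ipaddress.ip_address(text)
        names.setdefault(str(ip), ip.reverse_pointer)
    return names


# Addresses as they might appear, with repeats, across several traces.
seen_in_traces = ["192.0.2.1", "198.51.100.7", "192.0.2.1"]
queries = ptr_names(seen_in_traces)
```

Deduplicating before querying matters at this scale because the same intermediate addresses appear in many traces.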

slide-38
SLIDE 38

DNS Traffic

tcpdump capture of DNS query/response traffic

only for lookups of Routed /24 Topology addresses

continuous collection of 3-5M packets per day

can download most recent 30 days of pcap files

a broad sampling of the nameservers on the Internet due to the broad coverage of the routed space in traces

  • how many nameservers have IPv6 glue records? DNSSEC records? support EDNS? typical TTLs?

slide-39
SLIDE 39

Alias Resolution

Goal: collapse interfaces observed in traceroute paths into routers

toward a router-level map of the Internet

alias resolution work led by Ken Keys


slide-40
SLIDE 40

Spoofer Project

collaboration with Rob Beverly on MIT Spoofer Project

how many networks allow packets with spoofed IP addresses to leave their network?

Ark monitors act as targets for spoofed probes sent by willing participants

forwards received probe data to MIT server


slide-41
SLIDE 41

Spoofer Project

[Diagram: three Ark monitors, MIT, and CAIDA; labels "UDP port 53" and "tuple space"]

slide-42
SLIDE 42

Ark Statistics Pages

per-monitor analysis of IPv4 topology data

RTT, path length, RTT vs. distance


www.caida.org/projects/ark/statistics

slide-43
SLIDE 43

Future Work

  • release Marinda tuple space under GPL
  • implement large-scale RadarGun measurements
  • more in-depth analysis of data for stats pages
  • investigate AS link densification
  • DNS open resolver surveys?
  • high-level packet generation, capture, and analysis API
  • allow semi-trusted 3rd parties to conduct measurements

slide-44
SLIDE 44

Thanks!

For more information and to request data:

www.caida.org/projects/ark