Internet Atlas: A Geographical Database of the Physical Internet


Active Internet Measurement Systems Workshop (AIMS), February 6-8, 2013
Ram Durairajan, Computer Sciences, University of Wisconsin


SLIDE 1

Internet Atlas: A Geographical Database of the Physical Internet

Ram Durairajan Computer Sciences University of Wisconsin

  • Active Internet Measurement Systems Workshop (AIMS)

February 6-8, 2013

SLIDE 2

Motivation

2 rkrish@cs.wisc.edu

SLIDE 3

Objectives of our work

  • Create and maintain a comprehensive catalog of the physical Internet

– Geographic locations of nodes (buildings that house PoPs, IXPs, etc.) and links (fiber conduits)

  • Deploy portal for visualization and analysis
  • Extend with relevant related data

– Active probes, BGP updates, Twitter, weather, etc.

  • Apply maps to problems of interest

– Robustness, performance, security

SLIDE 4

Related work

  • Many prior Internet mapping efforts

– S. Gorman studies from the early 2000s
– CAIDA
– DIMES

  • Commercial activities

– TeleGeography
– Renesys
– Lumeta

  • Internet Topology Zoo

SLIDE 5

Compiling a physical repository

  • Step #1: Identification

– Utilize search to find maps of physical locations

  • Step #2: Transcription

– Multiple methods to automate data entry

  • Step #3: Verification

– Ensure that data reflects latest network maps

  • Our hypothesis is that physical sites are limited in number and fixed in location

– But the raw number is still large!

SLIDE 6

Challenges

  • Accuracy

– How accurate are the node locations?
– How accurate are the link paths and connections?

  • Completeness

– How much of the physical Internet is in the catalog?

  • Varying data formats

– Each format requires its own processing approach

  • Verification problems

– Networks change, and manual annotation introduces data entry errors

SLIDE 7

Internet Atlas @ UW

  • Effort began in September 2011

– Capture everything from maps discovered by search
– Use all relevant data sources (ISP maps, colocation, data centers, NTP, traceroute, etc.)

  • Data extraction tools
  • Comprehensive database

– Developed using MySQL

  • Alpha web portal – http://atlas.wail.wisc.edu

– Includes ArcGIS for visualization and analysis
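
The slides say only that the database was developed using MySQL; the schema itself is not shown. As a rough sketch of how networks, nodes, and links with coordinates might be stored, the following uses Python's built-in sqlite3 module as a stand-in for MySQL. Every table name, column name, and sample row here is hypothetical.

```python
import sqlite3

# Hypothetical schema: the deck does not show the real Internet Atlas tables.
SCHEMA = """
CREATE TABLE network (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE node (
    id         INTEGER PRIMARY KEY,
    network_id INTEGER NOT NULL REFERENCES network(id),
    kind       TEXT NOT NULL,   -- e.g. 'PoP', 'IXP', 'data center'
    lat        REAL NOT NULL,
    lon        REAL NOT NULL
);
CREATE TABLE link (
    id         INTEGER PRIMARY KEY,
    network_id INTEGER NOT NULL REFERENCES network(id),
    src_node   INTEGER NOT NULL REFERENCES node(id),
    dst_node   INTEGER NOT NULL REFERENCES node(id)
);
"""


def build_demo_db():
    """Create an in-memory database with one made-up network and link."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO network (id, name) VALUES (1, 'ExampleNet')")
    conn.executemany(
        "INSERT INTO node (id, network_id, kind, lat, lon) VALUES (?, ?, ?, ?, ?)",
        [(1, 1, "PoP", 43.07, -89.40), (2, 1, "PoP", 41.88, -87.63)],
    )
    conn.execute(
        "INSERT INTO link (id, network_id, src_node, dst_node) VALUES (1, 1, 1, 2)"
    )
    conn.commit()
    return conn


def nodes_per_network(conn):
    """Return [(network_name, node_count), ...] sorted by network name."""
    rows = conn.execute(
        "SELECT n.name, COUNT(*) FROM node "
        "JOIN network n ON n.id = node.network_id "
        "GROUP BY n.name ORDER BY n.name"
    )
    return rows.fetchall()
```

Keeping links as rows that reference two node rows is only one possible design; the geo-accurate link paths described later in the deck would need an additional geometry column or table.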

SLIDE 8

Current DB

  • Number of networks: 372
  • Number of tier 1 networks: 10 (all)
  • Number of data centers: 2,179
  • Number of NTP servers: 744
  • Number of traceroute servers: 221
  • Number and type of other nodes: IXP (358), DNS root (282)
  • Total number of nodes: 13,734
  • Number of unique locations of nodes: 7,932
  • Maximum overlap at any one node: 90
  • Total number of links: 13,228
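
With 13,734 nodes at 7,932 unique locations, the average is about 1.73 nodes per location, and the busiest site hosts 90. The deck does not say how a shared location is determined; the sketch below assumes two nodes overlap when their coordinates match after rounding, and the sample records are made up.

```python
from collections import Counter


def location_stats(nodes, precision=4):
    """Return (total_nodes, unique_locations, max_overlap).

    nodes is a list of (network, lat, lon) tuples. Treating nodes as
    co-located when their rounded coordinates match is an assumption.
    """
    counts = Counter(
        (round(lat, precision), round(lon, precision)) for _, lat, lon in nodes
    )
    max_overlap = max(counts.values()) if counts else 0
    return len(nodes), len(counts), max_overlap


# Made-up sample: three networks share one site, one node stands alone.
sample = [
    ("net-a", 41.8781, -87.6298),
    ("net-b", 41.8781, -87.6298),
    ("net-c", 41.8781, -87.6298),
    ("net-a", 43.0731, -89.4012),
]
total, unique, overlap = location_stats(sample)
```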

SLIDE 9

Identifying relevant data

  • Internet search reveals significant information

– ISPs and data center hosts routinely publish maps and locations of their infrastructure
– Other elements, such as NTP servers, list precise locations

  • Creating a corpus of search terms

– Geography is important

  • Timely representations require repetition
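
One simple way to build a search-term corpus in which geography matters is to cross infrastructure terms with place names and rerun the resulting queries periodically. This is only an illustration; the term lists and the function name are hypothetical, not the project's actual corpus.

```python
from itertools import product


def build_queries(resource_terms, places):
    """Combine every infrastructure term with every place name."""
    return [f"{term} {place}" for term, place in product(resource_terms, places)]


# Hypothetical inputs, for illustration only.
queries = build_queries(["fiber map", "network map"], ["Wisconsin", "Chicago"])
```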

SLIDE 10

Example: Telstra world wide

SLIDE 11

Example: Sprint IP network (US)

SLIDE 12

Example: Regional fiber

SLIDE 13

Example: Metro fiber maps

SLIDE 14

Automating transcription

  • Web pages contain Internet resource information in a variety of formats

– Text, Flash, images, Google Maps-based, etc.

  • Our goal is to extract information and enter it into our DB automatically

– Requires identification of relevant page

  • Library of parsing scripts for various formats
  • Sometimes manual entry and annotation are necessary

SLIDE 15

Geo-coding node locations

  • Physical locations of nodes from search

– Lat/Lon
– Street address
– City

  • All locations decomposed in DB to Lat/Lon

– Google geocoder
– http://maps.googleapis.com/maps/api/geocode/xml?address="+address+"&sensor=false
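
The geocoding step can be sketched as building the request URL and reading the latitude and longitude out of the XML reply. The version below makes no network call: it URL-encodes the address and parses a hand-written sample in the shape of a geocoder XML response. The function names and the sample coordinates are assumptions. The service today also expects HTTPS and an API key, so the URL from the slide may not work unchanged.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

GEOCODE_URL = "http://maps.googleapis.com/maps/api/geocode/xml"


def geocode_url(address):
    """Build the request URL from the slide, with the address URL-encoded."""
    return GEOCODE_URL + "?" + urlencode({"address": address, "sensor": "false"})


def parse_geocode_xml(xml_text):
    """Return (lat, lon) of the first result, or None if the lookup failed."""
    root = ET.fromstring(xml_text)
    if root.findtext("status") != "OK":
        return None
    location = root.find("result/geometry/location")
    if location is None:
        return None
    return float(location.findtext("lat")), float(location.findtext("lng"))


# Hand-written sample in the shape of a geocoder reply (not real output).
SAMPLE = """<GeocodeResponse>
  <status>OK</status>
  <result>
    <geometry>
      <location><lat>43.0715</lat><lng>-89.4066</lng></location>
    </geometry>
  </result>
</GeocodeResponse>"""

coords = parse_geocode_xml(SAMPLE)
```

Encoding the address with urlencode avoids the broken requests that plain string concatenation produces when an address contains spaces, commas, or ampersands.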

SLIDE 16

Geo-accurate link transcription

  • Transcribing geographic information for links is much more challenging than for nodes

  • Step #1: Copy images

– Max zoom required for max accuracy

  • Step #2: Image patching via feature matching
  • Step #3: Link image extraction from base map
  • Step #4: Geographic projection

– Key step uses ArcGIS registration functionality

  • Step #5: Link vectorization
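
Step #4 relies on ArcGIS registration, which the slides do not detail. As a stand-in that shows the underlying idea, the sketch below fits an affine transform from three control points (pixel position to longitude/latitude) and applies it to other pixels. It is not the ArcGIS API, it assumes the map image is close to linear in longitude and latitude, and the control points in the example are made up.

```python
def _det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def _solve3(m, rhs):
    """Solve a 3x3 linear system with Cramer's rule."""
    det = _det3(m)
    if abs(det) < 1e-12:
        raise ValueError("control points are collinear")
    solution = []
    for col in range(3):
        replaced = [list(row) for row in m]
        for r in range(3):
            replaced[r][col] = rhs[r]
        solution.append(_det3(replaced) / det)
    return solution


def fit_affine(pixel_pts, geo_pts):
    """Fit lon = a*x + b*y + c and lat = d*x + e*y + f from 3 control points.

    pixel_pts: [(x, y), ...]; geo_pts: [(lon, lat), ...] in the same order.
    Returns a function mapping a pixel (x, y) to (lon, lat).
    """
    m = [[x, y, 1.0] for x, y in pixel_pts]
    lon_coef = _solve3(m, [lon for lon, _ in geo_pts])
    lat_coef = _solve3(m, [lat for _, lat in geo_pts])

    def to_geo(x, y):
        lon = lon_coef[0] * x + lon_coef[1] * y + lon_coef[2]
        lat = lat_coef[0] * x + lat_coef[1] * y + lat_coef[2]
        return lon, lat

    return to_geo


# Made-up control points: image corners tied to round coordinates.
to_geo = fit_affine(
    [(0, 0), (100, 0), (0, 100)],
    [(-90.0, 45.0), (-80.0, 45.0), (-90.0, 35.0)],
)
lon, lat = to_geo(50, 50)
```

With more than three control points a least-squares fit would be the usual choice; three points keep the example exact and dependency-free.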

SLIDE 17

Structure in link maps

SLIDE 18

Image extraction

SLIDE 19

Geo-specific link encoding

SLIDE 20

Internet Atlas – Full View

SLIDE 21

Internet Atlas – Layers

SLIDE 22

Internet Atlas – Identify

SLIDE 23

Internet Atlas – Identify

SLIDE 24

Internet Atlas – Zoom

SLIDE 25

Internet Atlas – Search

SLIDE 26

Internet Atlas – Search

SLIDE 27

Target applications

  • Many potential applications for an accurate, but incomplete graph of the physical Internet

  • Application 1: link characterization

– What are the physical distances of links?

  • Application 2: robustness

– Are there vulnerabilities in the current infrastructure?

  • Application 3: intra-domain routing

– Given peering relationships, can we identify inefficiencies?
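
For Application 1, once a link is stored as a sequence of latitude/longitude points, its physical length can be estimated by summing great-circle distances between consecutive points. The sketch below uses the haversine formula with a mean Earth radius of 6371 km; representing a link as a list of (lat, lon) pairs is an assumption about the data, not something the slides specify.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, p)
    lat2, lon2 = map(math.radians, q)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    a = (
        math.sin(dlat / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def link_length_km(path):
    """Length of a link given as an ordered list of (lat, lon) vertices."""
    return sum(haversine_km(a, b) for a, b in zip(path, path[1:]))
```

A link that follows a fiber conduit is longer than the straight-line distance between its endpoints, so summing over the vectorized path gives a more realistic figure than a single endpoint-to-endpoint distance.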

SLIDE 28

Improving network availability

  • Given an outage event risk profile, how can network availability be improved?

– Backup routes within an infrastructure
– Additional provisioning to extend infrastructure

  • RiskRoute optimization framework

– Identifies backup routes and provisioning options
– Considers historical and/or real-time outage events

  • Case study using networks and disaster event data from the US

– Many opportunities to reduce risk!
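
The slides do not give the RiskRoute formulation, so the following is not that framework. It is a generic illustration of trading distance against outage risk: Dijkstra's algorithm over links whose cost is distance scaled by (1 + risk_weight × risk). The cost function, the risk values in [0, 1], and the three-node graph are all assumptions made for the example.

```python
import heapq


def risk_aware_path(graph, src, dst, risk_weight=1.0):
    """Cheapest path where each link costs distance * (1 + risk_weight * risk).

    graph: {node: [(neighbor, distance_km, risk), ...]} with risk in [0, 1].
    Returns the list of nodes from src to dst, or None if dst is unreachable.
    """
    best = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == dst:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale heap entry
        for nbr, dist, risk in graph.get(node, []):
            new_cost = cost + dist * (1.0 + risk_weight * risk)
            if new_cost < best.get(nbr, float("inf")):
                best[nbr] = new_cost
                prev[nbr] = node
                heapq.heappush(heap, (new_cost, nbr))
    if dst not in best:
        return None
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


# Made-up graph: the direct A-B link is short but crosses a high-risk area.
demo = {
    "A": [("B", 100.0, 0.9), ("C", 70.0, 0.0)],
    "C": [("B", 70.0, 0.0)],
    "B": [],
}
shortest = risk_aware_path(demo, "A", "B", risk_weight=0.0)
safer = risk_aware_path(demo, "A", "B", risk_weight=1.0)
```

With risk ignored the direct link wins; once risk is weighted in, the longer detour through C becomes the cheaper route, which is the kind of backup path the slide describes.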

pb@cs.wisc.edu 28

SLIDE 29

Level3 and Hurricane Irene

SLIDE 30

Internet Atlas – Risk Analysis

SLIDE 31

Data Sharing

  • NO!
  • Questions? Enquiries?

– Prof. Barford (pb@cs.wisc.edu)

  • Accounts?

– Prof. Barford (pb@cs.wisc.edu)
– Ram Durairajan (rkrish@cs.wisc.edu)

SLIDE 32

Thank you!

  • Paul Barford
  • Brian Eriksson
  • Xin Tang
  • Subhadip Ghosh
