SLIDE 1

On the Analysis of the Internet from a Geographic and Economic Perspective via BGP Raw Data

Alessandro Improta

Advisors:

  • Prof. Luciano Lenzini
  • Prof. Gigliola Vaglini
  • Ing. Enrico Gregori

Pisa - May 24th, 2013

Alessandro Improta On the Analysis of the Internet from a Geographic and Economic Perspective via BGP Raw Data

SLIDE 2

The Internet role in society

The Internet is becoming more important for everyone day by day

SLIDE 3

... but what truly is the Internet?

Nevertheless, the Internet is perceived by every user as a magic box

What happens to my data packets once they leave my home router?

SLIDE 4

First... a little bit of history

  • 1969 - ARPANET
  • 1985 - NSFNET
  • 1995 - Commercial Internet

ARPANET was one of the first networks to implement TCP/IP (1983)

SLIDE 5

First... a little bit of history

  • 1969 - ARPANET
  • 1985 - NSFNET
  • 1995 - Commercial Internet

Use of NSFNET by for-profit organizations was acceptable when it supported open research and education

SLIDE 6

What about today?

  • 1969 - ARPANET
  • 1985 - NSFNET
  • 1995 - Commercial Internet

Since then, its real structure has been hidden, along with its potential structural weaknesses

SLIDE 7

Why is it important to discover the Internet structure?

  • To understand how packets are routed in the Internet
  • To understand how to optimize Internet paths by analyzing existing deficiencies
  • To develop more scalable interdomain routing protocols and architectures
  • To construct economy-based models of global Internet growth
  • To develop better topology generators to simulate the Internet
  • To select data centers for server replicas by taking Internet paths into account
  • To properly select peers or upstream providers based on their connectivity

SLIDE 8

What can researchers do?

No dedicated tool exists to discover the Internet topology at any level...

... but there are two main workarounds:

  • traceroute (IP-level)
  • BGP route collectors (AS-level)

SLIDE 9

BGP Route Collectors

A Route Collector (RC) is a device which collects BGP routing data from co-operating ASes.

SLIDE 10

The Internet AS-level topology

“An AS is a connected group of one or more IP prefixes run by one or more network operators which has a single and clearly defined routing policy”. [RFC 1930]

44,389 AS numbers and 170,204 inter-AS connections were found in the January 2013 topology
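
As a rough illustration of how AS and connection counts like these can be derived from route collector data, the sketch below turns AS paths into a set of undirected inter-AS connections. It assumes each AS path is already available as a list of AS numbers (parsing the collectors' raw dump formats is left out); the function name `edges_from_as_paths` and the sample paths are hypothetical and are not taken from the presentation.

```python
def edges_from_as_paths(as_paths):
    """Return (set of ASes, set of undirected inter-AS edges) seen in the paths."""
    ases = set()
    edges = set()
    for path in as_paths:
        # Collapse AS path prepending: consecutive repeats of the same AS
        # do not represent a connection between two ASes.
        hops = [asn for i, asn in enumerate(path) if i == 0 or asn != path[i - 1]]
        ases.update(hops)
        for left, right in zip(hops, hops[1:]):
            # frozenset makes (a, b) and (b, a) the same undirected edge.
            edges.add(frozenset((left, right)))
    return ases, edges


# Hypothetical AS paths using AS numbers reserved for documentation.
sample_paths = [
    [64496, 64497, 64498],
    [64496, 64497, 64497, 64499],  # 64497 is prepended once
    [64500, 64498, 64497],
]
ases, edges = edges_from_as_paths(sample_paths)
print(len(ases), "ASes,", len(edges), "inter-AS connections")
```

Storing each edge as a `frozenset` is a deliberate simplification: it discards the direction in which the connection appeared in the path, which is enough for counting connections but not for the economic tagging discussed later.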

SLIDE 11

PhD Contribution

  1. Inter-AS economic relationship inference from BGP data
  2. Geographic AS path inference from BGP data
  3. Quantification of BGP data completeness
  4. Identification of ideal BGP feeders to improve the current situation

SLIDE 13

BGP export policies

The Internet AS-level topology is not an undirected graph

  • provider-customer: the customer pays the provider to reach every AS
  • peer-to-peer: the two ASes exploit each other to reach their customer cones (typically free of charge)
  • sibling-to-sibling: each AS acts as a provider for the other

SLIDE 14

AS path valley-free property

In other words, an AS should not transit traffic between

  ✗ two of its peers
  ✗ two of its providers
  ✗ a peer and a provider

A p2c edge can be followed only by p2c or s2s edges
A p2p edge can be followed only by p2c or s2s edges

Knowing a set of provider-free ASes, it is possible to draw economic inferences on AS paths
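
The two edge rules above can be checked mechanically. The following is a minimal sketch, assuming an AS path has already been translated into a sequence of edge labels; the label `c2p` (the reverse of p2c) and the function name `is_valley_free` are introduced here for illustration only.

```python
EDGE_TYPES = ("c2p", "p2c", "p2p", "s2s")


def is_valley_free(edge_types):
    """Check a sequence of edge labels against the two rules stated above.

    c2p = customer to provider, p2c = provider to customer,
    p2p = peer to peer, s2s = sibling to sibling.
    """
    descending = False  # becomes True after the first p2c or p2p edge
    for label in edge_types:
        if label not in EDGE_TYPES:
            raise ValueError(f"unknown edge type: {label}")
        # After a p2c or p2p edge, only p2c or s2s edges may follow.
        if descending and label in ("c2p", "p2p"):
            return False
        if label in ("p2c", "p2p"):
            descending = True
    return True


print(is_valley_free(["c2p", "p2p", "p2c"]))  # uphill, one peering, downhill
print(is_valley_free(["p2c", "c2p"]))         # transit between two providers
print(is_valley_free(["p2p", "p2p"]))         # transit between two peers
```

In this sketch s2s edges never change the state, so they may appear anywhere in the path, which is one reading of the rules as stated on the slide.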

SLIDE 15

Transient AS paths

[Figure: CCDF of the AS path lifespan in seconds, for inter-T1 routes and all routes]

Transient AS paths may not comply with the valley-free rule because of BGP misconfigurations that appear only during the BGP convergence process after a network failure

SLIDE 16

A time-aware tagging algorithm

Inferences are merged together only if their lifespan ratio is larger than NMAG

[1] E.Gregori, A.Improta, L.Lenzini, L.Sani, L.Rossi, “BGP and Inter-AS Economic Relationships”, in Proceedings of the 11th International IFIP TC-6 Conference on Networking (NETWORKING ’11), vol.2, pp. 54-67, 2011

SLIDE 17

A filtering pre-phase to improve the algorithm

Not every transient AS path is affected by BGP misconfigurations though...

[Figure: distribution p(x) of the AS path length x, for all routes, routes lasting less than 60 s, routes lasting less than 1 h, and routes with inter-PF valleys]

[2] E.Gregori, A.Improta, L.Lenzini, L.Sani, L.Rossi, “On Improving the Reliability of Inter-AS Economic Inferences Through an Hygiene Phase on BGP Data”, submitted to Computer Networks, 2013

SLIDE 18

PhD Contribution

  1. Inter-AS economic relationship inference from BGP data
  2. Geographic AS path inference from BGP data
  3. Quantification of BGP data completeness
  4. Identification of ideal BGP feeders to improve the current situation

SLIDE 19

BGP data and geography

The global Internet is the result of the interconnection of multiple geographic networks ...

... but the analysis of the Internet from a global perspective may hide several regional characteristics

e.g. Telstra (AS4637) is fundamental for Australian connectivity, but represents just a non-stub AS in the global view

SLIDE 20

How to infer geographic information from BGP data

“An AS is a connected group of one or more IP prefixes run by one or more network operators which has a single and clearly defined routing policy” (RFC 1930)

For each AS:

  1. We collect its IP prefixes from BGP data
  2. We geolocate the AS by geolocating its prefixes (via the MaxMind GeoLite database)
  3. We apply a heuristic to infer geographic AS paths and geographic topologies

41,796 out of 44,389 ASes are located in only one continent
40,347 out of 44,389 ASes are located in only one country
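
Steps 1 and 2 above can be sketched as follows. This is only an illustration under stated assumptions: the small `GEO_TABLE` dictionary stands in for a real geolocation database (it does not use or imitate the MaxMind API), the prefixes and AS numbers are reserved documentation values, and the heuristic of step 3 is not reproduced because the slide does not describe it.

```python
import ipaddress

# Hypothetical geolocation table: network -> (continent, country).
GEO_TABLE = {
    ipaddress.ip_network("192.0.2.0/24"): ("EU", "IT"),
    ipaddress.ip_network("198.51.100.0/24"): ("EU", "FR"),
    ipaddress.ip_network("203.0.113.0/24"): ("OC", "AU"),
}


def locate_prefix(prefix):
    """Return (continent, country) of the table entry containing the prefix, or None."""
    net = ipaddress.ip_network(prefix)
    for geo_net, place in GEO_TABLE.items():
        if net.subnet_of(geo_net):
            return place
    return None


def locate_ases(prefixes_by_as):
    """Map each AS to the sets of continents and countries of its prefixes."""
    result = {}
    for asn, prefixes in prefixes_by_as.items():
        places = [p for p in map(locate_prefix, prefixes) if p is not None]
        result[asn] = {
            "continents": {continent for continent, _ in places},
            "countries": {country for _, country in places},
        }
    return result


# Hypothetical prefixes announced by two ASes (step 1 is assumed done).
prefixes_by_as = {
    64496: ["192.0.2.0/25", "192.0.2.128/25"],
    64497: ["198.51.100.0/24", "203.0.113.0/24"],
}
locations = locate_ases(prefixes_by_as)
single_country = [asn for asn, loc in locations.items() if len(loc["countries"]) == 1]
print(single_country)
```

Counting the ASes whose `countries` or `continents` set has exactly one element gives the kind of single-country and single-continent figures reported above.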

[3] E.Gregori, A.Improta, L.Lenzini, L.Sani, L.Rossi, “Inferring Geography from BGP Raw Data”, in Proceedings of the 4th IEEE International Workshop on Network Science for Communication Networks (NETSCICOM ’12), pp. 208-213, 2012
[4] E.Gregori, A.Improta, L.Lenzini, L.Sani, L.Rossi, “Discovering the geographic properties of the Internet AS-level topology”, submitted to Networking Science, 2013

SLIDE 21

PhD Contribution

  1. Inter-AS economic relationship inference from BGP data
  2. Geographic AS path inference from BGP data
  3. Quantification of BGP data completeness
  4. Identification of ideal BGP feeders to improve the current situation

SLIDE 22

BGP Route Collector Status (Jan 2013)

                RouteViews   RIS   PCH   BGPmon
N. of RC                12    13    49        1
N. of feeders          116   309   936       33

SLIDE 23

Feeder Contribution

Only 152 feeders announce their full IPv4 routing table to the RCs
Only 76 feeders announce their full IPv6 routing table to the RCs

SLIDE 24

Feeder characterization

[Figure: four panels (RouteViews, RIS, PCH, BGPmon), each showing the CCDF P(X>x) of the degree x on a logarithmic axis from 10^0 to 10^4, for minor, partial, and full feeders]

About 80% of full feeders have a degree higher than 100

SLIDE 25

A view from the top

Connections that can be discovered: (A, C), (A, D), (A, E), (A, F), (B, E)

RCs connected to large ISPs will fail to retrieve a large amount of p2p connectivity

SLIDE 26

A view from the bottom

Connections that can be discovered: (A, B), (A, C), (A, D), (A, E), (A, F), (B, E), (C, D)

RCs need to be connected to ASes in the lowest part of the Internet hierarchy to discover the missing p2p connectivity

SLIDE 27

A new metric: p2c distance

p2c distance of AS X from AS Y: Minimum number of consecutive p2c links that connect X to Y

AS    p2c-distance from R
A     1
B     1
C     -
D     -
E     2
F     -

The farther an AS is from an RC, the greater the chance of losing AS-level connectivity due to BGP decision processes
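
The metric defined above can be computed with a breadth-first search. The sketch below is one possible reading, not the author's implementation: it assumes the distance of X from Y counts the p2c links on a chain that descends from X (provider side) down to Y, and that an AS with no such chain has no finite distance. The function name `p2c_distances` and the toy links, which use AS numbers reserved for documentation, are hypothetical.

```python
from collections import deque


def p2c_distances(p2c_links, target):
    """Return {AS: p2c distance from target} for every AS that has one.

    p2c_links is an iterable of (provider, customer) pairs. ASes with no
    chain of consecutive p2c links down to the target are left out.
    """
    providers_of = {}
    for provider, customer in p2c_links:
        providers_of.setdefault(customer, set()).add(provider)

    dist = {target: 0}
    queue = deque([target])
    while queue:
        node = queue.popleft()
        # Walk each p2c link backwards, from the customer to its providers.
        for provider in providers_of.get(node, ()):
            if provider not in dist:
                dist[provider] = dist[node] + 1
                queue.append(provider)
    del dist[target]
    return dist


# Toy topology: 64500 plays the role of the AS feeding the route collector.
links = [
    (64496, 64500),
    (64497, 64500),
    (64498, 64496),
    (64498, 64497),
    (64496, 64499),  # 64499 is a customer, so it has no p2c chain to 64500
]
distances = p2c_distances(links, 64500)
print(distances)
```

Because breadth-first search visits ASes in order of increasing hop count, the first distance recorded for each AS is already the minimum.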

SLIDE 28

Focusing the target

Thoughts

  • Every AS becoming a feeder: infeasible and not useful
  • The vast majority of missing links are p2p
  • Stub ASes are not likely to establish many p2p connections (only 7% are members of at least one IXP)

Goals

  • Discover the connectivity of non-stub ASes ...
  • ... without connecting to all of them ...
  • ... and identify a list of the most important ASes that have to be connected

Note: Stub ASes may still be exploited as feeders to achieve this objective

SLIDE 29

Tailored set covering problem

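Goal rephrased: select new BGP feeders such that each non-stub AS has a finite and bounded p2c distance from the route collector infrastructure

Set Covering

\begin{align}
\text{Minimize} \quad & \sum_{AS_i \in U} x_{AS_i} \tag{1} \\
\text{subject to} \quad & \sum_{AS_i \,:\, n \in S^{(d)}_{AS_i}} x_{AS_i} \ge 1, \quad \forall n \in N \tag{2} \\
& x_{AS_i} \in \{0, 1\}, \quad \forall AS_i \in U \tag{3}
\end{align}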
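
To make the formulation concrete, here is a small greedy sketch for a set covering instance of this shape. It is a standard approximation heuristic, swapped in for illustration, and not the solution method used in the thesis; it may select more candidates than the optimum of (1)-(3). The interpretation of `cover_sets[AS_i]` as the set of non-stub ASes within p2c distance d of candidate AS_i, as well as all names and sample data, are assumptions.

```python
def greedy_set_cover(universe, cover_sets):
    """Pick candidates until every element of `universe` is covered.

    cover_sets maps a candidate feeder to the set of ASes it covers.
    Returns the chosen candidates in selection order.
    """
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Take the candidate covering the most still-uncovered ASes.
        best = max(cover_sets, key=lambda c: len(cover_sets[c] & uncovered))
        gained = cover_sets[best] & uncovered
        if not gained:
            raise ValueError(f"cannot cover: {sorted(uncovered)}")
        chosen.append(best)
        uncovered -= gained
    return chosen


# Hypothetical instance: four non-stub ASes, four candidate feeders.
non_stubs = {"n1", "n2", "n3", "n4"}
cover_sets = {
    "AS_a": {"n1", "n2"},
    "AS_b": {"n2", "n3", "n4"},
    "AS_c": {"n1"},
    "AS_d": {"n4"},
}
chosen = greedy_set_cover(non_stubs, cover_sets)
print(chosen)
```

Each chosen candidate corresponds to a variable x set to 1 in (3), and the loop terminates only when constraint (2) holds for every element of N.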

[5] E.Gregori, A.Improta, L.Lenzini, L.Sani, L.Rossi, “On the Incompleteness of the AS-level Graph: a Novel Methodology for BGP Route Collector Placement”, in Proceedings of the 12th ACM SIGCOMM Conference on Internet Measurement (IMC ’12), pp. 253-264, 2012

SLIDE 30

MSC results

The number of feeders required far exceeds the current number of (full) feeders

The largest group of candidates consists of multihomed stub ASes, which rarely feed the RCs nowadays

SLIDE 31

Tailored max coverage problem

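Goal rephrased: identify a priority index for each candidate BGP feeder

Max Coverage

\begin{align}
\text{Maximize} \quad & \sum_{AS_j \in N} y_{AS_j} \tag{4} \\
\text{subject to} \quad & \sum_{AS_i \in I} x_{AS_i} \le k \tag{5} \\
& \sum_{AS_i \in I \,\wedge\, AS_j \in S_{AS_i}} x_{AS_i} \ge y_{AS_j}, \quad \forall AS_j \in N \tag{6} \\
& y_{AS_j} \in \{0, 1\}, \quad \forall AS_j \in N \tag{7} \\
& x_{AS_i} \in \{0, 1\}, \quad \forall AS_i \in U \tag{8}
\end{align}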
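
One simple way to obtain a priority order from a max coverage instance is the greedy heuristic sketched below: with a budget of k feeders, it repeatedly takes the candidate that adds the most newly covered ASes. This is a generic approximation offered for illustration, not the procedure of the thesis, and the names `greedy_max_coverage` and `cover_sets` and the sample data are hypothetical.

```python
def greedy_max_coverage(cover_sets, k):
    """Select at most k candidates, each time taking the one that covers
    the most not-yet-covered ASes. Returns (ranking, covered)."""
    covered = set()
    ranking = []
    remaining = dict(cover_sets)
    for _ in range(k):
        if not remaining:
            break
        best = max(remaining, key=lambda c: len(remaining[c] - covered))
        if not remaining[best] - covered:
            break  # no remaining candidate adds coverage
        ranking.append(best)
        covered |= remaining.pop(best)
    return ranking, covered


# Hypothetical candidates and the non-stub ASes each one would cover.
cover_sets = {
    "AS_a": {"n1", "n2"},
    "AS_b": {"n2", "n3", "n4"},
    "AS_c": {"n1"},
    "AS_d": {"n4"},
}
ranking, covered = greedy_max_coverage(cover_sets, k=3)
print(ranking, sorted(covered))
```

The position of a candidate in `ranking` can serve as a rough priority index: candidates selected earlier contribute more new coverage under this heuristic.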

[6] E.Gregori, A.Improta, L.Lenzini, L.Sani, L.Rossi, “A Novel Methodology to Address the Internet AS-level Data Incompleteness”, submitted to IEEE/ACM Transactions on Networking, 2013

SLIDE 32

Thank you for your attention

Data presented in this presentation and many others can be found at www.isolario.it

Any questions?

alessandro.improta@iet.unipi.it
