Sibyl: A Practical Internet Route Oracle Ethan Katz-Bassett - - PowerPoint PPT Presentation


1 Sibyl: A Practical Internet Route Oracle Ethan Katz-Bassett (University of Southern California) with: Pietro Marchetta (University of Napoli Federico II), Matt Calder, Yi-Ching Chiu (USC), Italo Cunha (UFMG), Harsha Madhyastha


slide-1
SLIDE 1

Sibyl: A Practical Internet Route Oracle

1

Ethan Katz-Bassett (University of Southern California) with: Pietro Marchetta (University of Napoli Federico II),
 Matt Calder, Yi-Ching Chiu (USC), Italo Cunha (UFMG),
 Harsha Madhyastha (Michigan), Vasileios Giotsas (CAIDA)

Supported By:


slide-3
SLIDE 3

Traceroute Widely Used 
 by Operators and Researchers

“The number one go-to tool is traceroute.”

NANOG Network operators troubleshooting tutorial, 2009.

! Lots of use cases:

  ! topology and AS relationships
  ! route performance and inflation
  ! location of congestion
  ! outages
  ! prefix hijacks
  ! etc

3

slide-4
SLIDE 4

Traceroute Widely Used 
 by Operators and Researchers

“The number one go-to tool is traceroute.”

NANOG Network operators troubleshooting tutorial, 2009.

! Lots of use cases:

  ! topology and AS relationships
  ! route performance and inflation
  ! location of congestion
  ! outages
  ! prefix hijacks
  ! etc

4

! Lots of vantage points:

  ! PlanetLab, Ark
  ! RIPE Atlas, BISmark
  ! traceroute servers
  ! MobiPerf, Dasu
  ! RIPE RIS, RouteViews
  ! etc

slide-5
SLIDE 5

Traceroute Is Extremely Limited

“The number one go-to tool is traceroute.”

NANOG Network operators troubleshooting tutorial, 2009.

! Lots of use cases:

  ! topology and AS relationships
  ! route performance and inflation
  ! location of congestion
  ! outages
  ! prefix hijacks
  ! etc

! But traceroute only supports one query:

“What is the path from vantage point s to destination d?”

5

! Lots of vantage points:

  ! PlanetLab, Ark
  ! RIPE Atlas, BISmark
  ! traceroute servers
  ! MobiPerf, Dasu
  ! RIPE RIS, RouteViews
  ! etc

slide-6
SLIDE 6

6

slide-7
SLIDE 7

Current path Historical path

How I’d Like to Use Vantage Points

7

What’s the path from
 AT&T mobile in LA 
 to YouTube?

Query for:

! Path from a certain network

slide-8
SLIDE 8

Current path Historical path MobiPerf

How I’d Like to Use Vantage Points

8

Did paths from
 AT&T mobile in LA 
 always go to Seattle?

Query for:

! Path from a certain network
! Historical path

slide-9
SLIDE 9

Current path Historical path MobiPerf

How I’d Like to Use Vantage Points

9

Do any paths through
 AT&T in LA to YouTube
 still go to LA server?

Query for:

! Path from a certain network
! Historical path
! Paths through series of hops

slide-10
SLIDE 10

Current path Historical path

How I’d Like to Use Vantage Points

10

Other paths that went
 to YouTube LA and now
 go to YouTube Seattle?

Query for:

! Path from a certain network
! Historical path
! Paths through series of hops
! Relationship between historical path and current path

slide-11
SLIDE 11

Current path Historical path MeasrDroid

How I’d Like to Use Vantage Points

11

Query for:

! Path from a certain network
! Historical path
! Paths through series of hops
! Relationship between historical path and current path

slide-12
SLIDE 12

Current path Historical path MeasrDroid MobiPerf

How I’d Like to Use Vantage Points

12

YouTube appears to
 map these clients
 together.


Query for:

! Path from a certain network
! Historical path
! Paths through series of hops
! Relationship between historical path and current path

slide-13
SLIDE 13

Current path Historical path MeasrDroid MobiPerf

How I’d Like to Use Vantage Points

13

Or…other paths
 traversing AT&T-NTT?


Query for:

! Path from a certain network
! Historical path
! Paths through series of hops
! Relationship between historical path and current path

slide-14
SLIDE 14

Current path Historical path MeasrDroid MobiPerf

How I’d Like to Use Vantage Points

14

Or…other paths
 traversing GTT-NTT?


Query for:

! Path from a certain network
! Historical path
! Paths through series of hops
! Relationship between historical path and current path

slide-15
SLIDE 15

Current path Historical path MeasrDroid MobiPerf

How I’d Like to Use Vantage Points

15

Or…other paths
 traversing NTT
 but not AT&T or GTT?


Query for:

! Path from a certain network
! Historical path
! Paths through series of hops
! Relationship between historical path and current path

slide-16
SLIDE 16

Current path Historical path MeasrDroid MobiPerf

How I’d Like to Use Vantage Points

16

Or…other paths
 traversing NTT
 but not LA or SEA?


Query for:

! Path from a certain network
! Historical path
! Paths through series of hops
! Relationship between historical path and current path

slide-17
SLIDE 17

2014 Measurement vs 2016 Measurement

! What I do

17

[Diagram: today, an experiment separately asks individual MobiPerf and RIPE Atlas vantage points for the path to a destination. With a unified probing platform spanning MobiPerf, MeasrDroid, and RIPE Atlas, the experiment asks "Give me paths like X" and receives "Here are some paths."]

! What I want to do

slide-18
SLIDE 18

Benefit of Combining Platforms

! Combining platforms improves coverage

18

[Figure: % of ASes hosting vantage points vs. minimum customer cone size; series: RIPE+PL+TS+Dasu, RIPE, TS, Dasu, PL]

slide-19
SLIDE 19

Challenge of Combining Platforms

! Combining platforms improves coverage

19

[Figure: CDF of destinations vs. number of ASes seen on path; series: PL]

slide-20
SLIDE 20

Challenge of Combining Platforms

20

[Figure: CDF of destinations vs. number of ASes seen on path; series: RIPE, PL]

! Combining platforms improves coverage

slide-21
SLIDE 21

Challenge of Combining Platforms

21

[Figure: CDF of destinations vs. number of ASes seen on path; series: RIPE, PL, Rate Limited - RIPE]

! Combining platforms improves coverage
! … but exhaustive probing is infeasible

slide-22
SLIDE 22

Challenge of Combining Platforms

22

[Figure: CDF of destinations vs. number of ASes seen on path; series: RIPE, Rate Limited - TS+RIPE+PL, PL, Rate Limited - RIPE]

! Combining platforms improves coverage
! … but exhaustive probing is infeasible

slide-23
SLIDE 23

Challenge of Combining Platforms

23

[Figure: CDF of destinations vs. number of ASes seen on path; series: RIPE, Rate Limited(G) - TS+RIPE+PL, Rate Limited - TS+RIPE+PL, PL, Rate Limited(G) - RIPE, Rate Limited - RIPE]

! Combining platforms improves coverage
! … but exhaustive probing is infeasible
! Rate limits mean you have to be smart about what to issue

slide-24
SLIDE 24

Goals

! Take advantage of diverse vantage points
! Efficient use of probing budgets
  ! High rate, low diversity vantage points like Ark and PL
  ! Low rate, high diversity vantage points like Atlas and LG
! Support rich queries

Service queries:

! as if we had traceroutes from all vantage points to all Internet destinations
! even though probing budgets are very restricted

24

slide-25
SLIDE 25

Related Work

! IXPs: Mapped?, RocketFuel, Lord of the Links, Reverse Traceroute, etc
  ! Issued measurements likely to traverse particular links
! iPlane
  ! Predicted source/destination paths
! TopHat
  ! Unified historical and current
  ! Multiple testbeds?
  ! Does it exist anymore?
! MPlane
! Srikanth’s talk
! Others?

25


slide-27
SLIDE 27

Goal:

! In each round, allocate probing budget to best serve queries

Optimize Use of Probing Budget

27


slide-29
SLIDE 29

Goal:

! In each round, allocate probing budget to best serve queries

Optimize Use of Probing Budget

29

! Max utility of traceroutes

slide-30
SLIDE 30

Goal:

! In each round, allocate probing budget to best serve queries

Optimize Use of Probing Budget

30

! Max utility of traceroutes
! Tr: set of traceroutes in the round is union of those from each platform V

slide-31
SLIDE 31

Goal:

! In each round, allocate probing budget to best serve queries

Optimize Use of Probing Budget

31

! Max utility of traceroutes
! Tr: set of traceroutes in the round is union of those from each platform V
! Utility of set of traceroutes Tr is sum of utility in matching each query q

slide-32
SLIDE 32

Goal:

! In each round, allocate probing budget to best serve queries

Optimize Use of Probing Budget

32

! Max utility of traceroutes
! Tr: set of traceroutes in the round is union of those from each platform V
! Utility of set of traceroutes Tr is sum of utility in matching each query q
! Each platform V has a rate limit budget CV
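
The bullets above can be written as one optimization problem. This is only a sketch of the formulation as the slide describes it: the slide's Tr and CV are written T_r and C_V, and the symbols Q (the set of outstanding queries), u (the per-query utility), and T_r^V (the traceroutes issued from platform V in round r) are notation assumed here, not taken from the talk.

```latex
\max_{T_r} \; U(T_r) = \sum_{q \in Q} u(q, T_r)
\quad \text{subject to} \quad
T_r = \bigcup_{V} T_r^{V},
\qquad
\lvert T_r^{V} \rvert \le C_V \;\; \text{for every platform } V
```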

slide-33
SLIDE 33

Goal:

! In each round, allocate probing budget to best serve queries

Need to solve (rest of talk):

  • What is query language?

  • How to find paths to match queries, given we can’t issue every traceroute (or know the results before issuing)?

  • How to solve the optimization efficiently?

Challenges in Optimization

33

slide-34
SLIDE 34

Query language

! How to express what paths you care about?
! Sibyl in its current state:
  ! Regular expressions over hops
    ! All evaluation with AS-level hops, but internal representation is PoPs
  ! Supports two types of queries:
    ! Existence query: Give me (at least) one path that matches
      ! A path that goes through Sprint on the way to USC?
        Sprint-.*-USC$
    ! Diversity query: Give me as diverse a set of matching paths as possible
      ! Suspected problem on peering between GTT and Level3?
        ^.*-(GTT-Level3|Level3-GTT)-.*$
      ! Actually, NTT-GTT too! Other GTT peers?
        ^.*-[^NTT-Level3]-GTT-[^NTT-Level3]-.*$
! Thoughts?
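
A minimal sketch of how such queries could be evaluated, assuming each path is stored as a string of hop names joined by hyphens. The sample paths and the helper names `first_match` and `all_matches` are hypothetical and are not Sibyl's actual interface. The two patterns are the Sprint-to-USC existence query and the GTT/Level3 diversity query from this slide.

```python
import re

# Hypothetical AS-level paths, written as hyphen-joined hop names.
paths = [
    "ATT-Sprint-Internet2-USC",
    "Comcast-GTT-Level3-Google",
    "Telia-Level3-GTT-Akamai",
    "Comcast-NTT-Cogent-USC",
]

# Existence query: a path through Sprint on the way to USC.
existence = re.compile(r"Sprint-.*-USC$")
# Diversity query: paths crossing the GTT/Level3 peering in either direction.
diversity = re.compile(r"^.*-(GTT-Level3|Level3-GTT)-.*$")


def first_match(query, candidates):
    """Existence query: return one matching path, or None if there is none."""
    return next((p for p in candidates if query.search(p)), None)


def all_matches(query, candidates):
    """Return every matching path; a diversity query would then pick a varied subset."""
    return [p for p in candidates if query.search(p)]


print(first_match(existence, paths))
print(all_matches(diversity, paths))
```

Picking a diverse subset of the matches needs a diversity measure, which the slide does not define, so the sketch stops at collecting the matches.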

34

slide-35
SLIDE 35

Goal:

! In each round, allocate probing budget to best serve queries

Need to solve (rest of talk):

  • What is query language? Regular expressions over hops.

  • How to find paths to match queries, given we can’t issue every traceroute (or know the results before issuing)?

  • How to solve the optimization efficiently?

Challenges in Optimization

35

slide-36
SLIDE 36

Finding Paths to Match Queries

1. Find candidate (v1 to d2) by splicing existing traceroutes
   a. Find historical traceroute t1 from a vantage point v1 to destination d1 that matches a prefix qp of q
   b. Find historical traceroute t2 from a vantage point v2 to destination d2 that matches a suffix qs of q s.t.:
      i. t1 and t2 intersect at a common PoP p
      ii. v1…p…d2 = qp qs = q
   c. Nominate traceroute (v1 to d2) as a candidate to match q
2. Predict probability that traceroute (v1 to d2) matches q
   a. Generate all possible spliced paths from v1 to d2
   b. Rate how likely each spliced path is to be correct
   c. Probability of matching q is sum of likelihood of the spliced paths that match
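
The two steps above can be sketched as follows, assuming a traceroute is a list of hop names that starts at the vantage point and ends at the destination. The function names, sample hops, and likelihood values are hypothetical. In the talk the likelihoods come from a trained scoring system; here they are simply given.

```python
import re


def splices(t1, t2):
    """Step 1: follow t1 up to a hop it shares with t2, then follow t2 onward."""
    out = []
    for i, hop in enumerate(t1):
        if hop in t2:
            j = t2.index(hop)
            out.append(tuple(t1[:i]) + tuple(t2[j:]))
    return out


def match_probability(query, scored_splices):
    """Step 2c: sum the likelihoods of the spliced paths that match the query.

    scored_splices maps each spliced path (a tuple of hops) to its likelihood.
    """
    return sum(
        likelihood
        for path, likelihood in scored_splices.items()
        if query.search("-".join(path))
    )


# Hypothetical historical traceroutes: t1 is v1 to d1, t2 is v2 to USC.
t1 = ["v1", "ATT", "Sprint", "d1"]
t2 = ["v2", "Cogent", "Sprint", "Internet2", "USC"]
candidate = splices(t1, t2)[0]  # the two traceroutes share the hop "Sprint"

query = re.compile(r"Sprint-.*-USC$")
# Made-up likelihoods for two possible spliced paths from v1 to USC.
scored = {candidate: 0.6, ("v1", "ATT", "Level3", "USC"): 0.3}
probability = match_probability(query, scored)
print(candidate, probability)
```

Only the first spliced path passes through Sprint, so only its likelihood counts toward the probability that a new traceroute from v1 to USC would satisfy the query.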

36


slide-39
SLIDE 39

How Likely is a Spliced Path Correct?

! Train system to recognize likely vs unlikely splices
! Features include:
  ! AS relationships at splice point
  ! AS path similarity with other possible splices
  ! AS path inflation vs shortest prediction
! System scores spliced path based on features
! Evaluation shows a positive score from system usually means the prediction is correct

39

[Figure: probability density vs. RuleFit score; series: Correct Predictions, Incorrect Predictions]

slide-40
SLIDE 40

Goal:

! In each round, allocate probing budget to best serve queries

Need to solve (rest of talk):

  • What is query language? Regular expressions over hops.

  • How to find paths to match queries, given we can’t issue every traceroute (or know the results before issuing)? Splice existing traceroutes to predict new ones, learn which predictions are good.

  • How to solve the optimization efficiently?

Challenges in Optimization

40

slide-41
SLIDE 41

! In each round, allocate probing budget to best serve queries
! Solve greedily
  ! Greedy heuristic known to perform well for submodular function subject to partition constraints

Optimize Use of Probing Budget

41

! Submodular function: function on sets that exhibits diminishing returns
! Partition constraints: each traceroute subject to at most one constraint
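
A minimal sketch of greedy allocation under per-platform budgets. The utility used here is an assumption, not the talk's definition: for each query, the probability that at least one chosen traceroute matches it. That choice shows diminishing returns, so it fits the submodular setting described above. All names, probabilities, and budgets are hypothetical.

```python
def utility(chosen, match_prob, queries):
    """Assumed utility: sum over queries of P(at least one chosen traceroute matches)."""
    total = 0.0
    for q in queries:
        miss = 1.0
        for t in chosen:
            miss *= 1.0 - match_prob.get((t, q), 0.0)
        total += 1.0 - miss
    return total


def greedy_allocate(candidates, budgets, match_prob, queries):
    """Repeatedly pick the candidate with the largest marginal gain.

    candidates maps each traceroute to its platform, so each traceroute counts
    against exactly one budget (the partition constraint).
    """
    chosen = []
    remaining = dict(budgets)
    pool = set(candidates)
    while pool:
        base = utility(chosen, match_prob, queries)
        best, best_gain = None, 0.0
        for t in sorted(pool):  # sorted only to make tie-breaking deterministic
            if remaining[candidates[t]] <= 0:
                continue  # this platform's budget for the round is spent
            gain = utility(chosen + [t], match_prob, queries) - base
            if gain > best_gain:
                best, best_gain = t, gain
        if best is None:
            break  # nothing affordable improves the utility
        chosen.append(best)
        remaining[candidates[best]] -= 1
        pool.remove(best)
    return chosen


# Hypothetical round: two PlanetLab probes allowed, one RIPE Atlas probe allowed.
candidates = {"pl1->d1": "PL", "pl2->d1": "PL", "ripe1->d2": "RIPE", "ripe2->d2": "RIPE"}
budgets = {"PL": 2, "RIPE": 1}
queries = ["q1", "q2"]
match_prob = {
    ("pl1->d1", "q1"): 0.9,
    ("pl2->d1", "q1"): 0.8,
    ("ripe1->d2", "q2"): 0.7,
    ("ripe2->d2", "q2"): 0.6,
}
plan = greedy_allocate(candidates, budgets, match_prob, queries)
print(plan)
```

After pl1 is chosen, pl2 adds little because q1 is already likely satisfied, so the single RIPE probe is spent on q2 before the second PlanetLab probe is used. This is the diminishing-returns behavior the slide refers to.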

slide-42
SLIDE 42

Goal:

! In each round, allocate probing budget to best serve queries

Need to solve (rest of talk):

  • What is query language? Regular expressions over hops.

  • How to find paths to match queries, given we can’t issue every traceroute (or know the results before issuing)? Splice existing traceroutes to predict new ones, learn which predictions are good.

  • How to solve the optimization efficiently? Greedily.

Challenges in Optimization

42

slide-43
SLIDE 43

Evaluation: Does Sibyl efficiently allocate probing budget? (I)

! Prediction is effective: Sibyl satisfies 80% as many queries as an Oracle that knows all paths but is subject to rate limits
! Important to assess likelihood: Sibyl satisfies 107% more queries than randomly selecting among spliced candidates

43

[Figure: fraction of queries satisfied vs. round; series: Oracle Limited, Sibyl, Random Candidates]

slide-44
SLIDE 44

Evaluation: Does Sibyl efficiently allocate probing budget? (II)

! By combining diverse but rate limited RIPE Atlas vantage points with the smaller number of PlanetLab vantage points, Sibyl matches nearly as many queries as RIPE Atlas without rate limits

44

[Figure: fraction of queries satisfied vs. round; series: RIPE unlimited, Sibyl - PL unlimited and RIPE limited, PL unlimited, Sibyl - RIPE limited]

slide-45
SLIDE 45

Conclusion

! Lots of route vantage points exist, but our interface to them is extremely limited
  ! Little unification of historical and live measurements
  ! Each platform is independent
  ! Can only ask one question: “What is the path from here to there?”
! Sibyl: A unified platform for Internet path queries
  ! Goal: Get relevant paths with properties of interest, without limiting queries by testbed, source, or destination
  ! Approach: Predict paths based on previously measured ones, and optimize budget use based on predictions

45