Active BGP Measurement with BGP-Mux, Ethan Katz-Bassett (USC)


SLIDE 1

Active BGP Measurement with BGP-Mux

Ethan Katz-Bassett (USC), with testbed and some slides hijacked from Nick Feamster and Valas Valancius

SLIDE 2

Before I Start

• Georgia Tech system; I am just an enthusiastic user
  • Nick Feamster and his students: Valas Valancius and Bharath Ravi
• Questions for the audience:
  • What would you use this system for? What should we use it for?
  • How do we get more ASes to connect to us?
    • Getting them to agree to peer
    • Then, getting the connection to work

SLIDES 3-12

Networks Use BGP to Interconnect

• BGP sessions
• Route advertisements
• Traffic over those routes
• BGP controls both inbound and outbound traffic

[Figure, built up over several slides: AS paths toward WS as its route advertisement propagates: WS; ATT → WS; Sprint → ATT → WS; L3 → ATT → WS; UW → L3 → ATT → WS]
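As a rough illustration of how the paths in the figure arise, the sketch below floods one announcement from WS through a small topology, with each AS prepending itself before re-advertising. The adjacency map is an assumption read off the figure labels, and the selection rule (shortest path only) ignores the routing policy that real BGP speakers apply.

```python
from collections import deque

# Hypothetical adjacency, inferred from the figure labels (not given on the slide).
NEIGHBORS = {
    "WS": ["ATT"],
    "ATT": ["WS", "Sprint", "L3"],
    "Sprint": ["ATT"],
    "L3": ["ATT", "UW"],
    "UW": ["L3"],
}

def propagate(origin, neighbors):
    """Flood one announcement; each AS prepends itself before re-advertising."""
    best = {origin: [origin]}
    queue = deque([origin])
    while queue:
        asn = queue.popleft()
        for peer in neighbors[asn]:
            if peer in best[asn]:
                continue  # loop prevention: peer is already on this path
            candidate = [peer] + best[asn]
            # Simplification: keep the shortest path seen so far, no policy.
            if peer not in best or len(candidate) < len(best[peer]):
                best[peer] = candidate
                queue.append(peer)
    return best

paths = propagate("WS", NEIGHBORS)
for asn, path in paths.items():
    print(asn, ":", " → ".join(path))
```

Each printed path reads from the receiving AS back to the origin, matching the labels in the figure.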

SLIDE 13

Virtual Networks Need BGP, too

Say I have some neat new routing ideas. I want to test them:

• Emulate the type of AS (CDN, stub, etc.) of my choice
• Choose a set of providers, peers, and customers
• Inbound:
  • Choose routes from those providers
  • Send traffic along those routes
• Outbound:
  • Announce my prefix(es) to neighbors of choice, with communities, etc.
  • Receive traffic to prefix(es)
• And everyone else should be able to do this, too

SLIDE 14

Traditionally, BGP Experiments are Hard

I have some neat new routing ideas. How do I test them?

• Passive observation
  • E.g., RouteViews, RIPE
  • Receive feeds only
• Limited "active" measurements
  • E.g., Beacons
  • Generally, regular announcements and withdrawals
• Know the right people
  • Negotiate the ability to make announcements
  • High overhead, limited deployment

All limit what you can do

SLIDE 15

What I Need to Get What I Want

• Resources
  • IP address space
  • AS number
• Connectivity & contracts
  • BGP peering with real ASes
  • Data plane forwarding
• Time and money

SLIDE 16

BGP-Mux Provides All This For You

• Resources
  • IP address space: 184.164.224.0/19
  • AS number: AS47065
• Connectivity & contracts
  • BGP peering with real ASes: 5 universities as providers
  • Data plane forwarding: send & receive traffic
• Time and money: one-time cost

[Figure: virtual networks attach to BGP-Mux, which connects through UW and GT to the Internet]
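To get a feel for how much room the testbed's address block offers, the sketch below splits 184.164.224.0/19 into /24s with Python's standard `ipaddress` module. The choice of one /24 per experiment is an assumption for illustration; the slides do not say how the block is divided among virtual networks.

```python
import ipaddress

# The address block named on the slide.
block = ipaddress.ip_network("184.164.224.0/19")

# Assumption: carve the block into /24s, one per experiment.
slices = list(block.subnets(new_prefix=24))

print(len(slices))            # 32
print(slices[0], slices[-1])  # 184.164.224.0/24 184.164.255.0/24
```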

SLIDE 17

Design Requirements

• Session transparency: BGP updates should appear as they would with a direct connection
• Session stability: Upstreams should not see transient behavior
• Isolation: Individual networks should be able to set their own policies, forward independently, etc.
• Scalability: BGP-Mux should support many networks

SLIDE 18

A Project Using BGP-Mux

LIFEGUARD: Locating Internet Failures Effectively and Generating Usable Alternate Routes Dynamically

• Locate the ISP / link causing the problem
• Suggest that other ISPs reroute around the problem
  • What would we like to add to BGP to enable this?
  • What can we deploy today, using only available protocols and router support?

SLIDE 19

Our Goal for Failure Avoidance

• Enable content / service providers to repair persistent routing problems affecting them, regardless of which ISP is causing them

Setting

• Assume we can locate the problem
• Assume we are multi-homed / have multiple data centers
• Assume we speak BGP
• We use BGP-Mux to speak BGP to the real Internet: 5 US universities as providers

SLIDES 20-21

LIFEGUARD: Practical Repair of Persistent Route Failures

Self-Repair of Forward Paths

Straightforward: Choose a path that avoids the problem.

SLIDE 22

A Mechanism for Failure Avoidance

Forward path: Choose a route that avoids the ISP or ISP-ISP link
Reverse path: Want others to choose paths to my prefix P that avoid ISP or ISP-ISP link X

• Want a BGP announcement AVOID(X,P):
  • Any ISP with a route to P that avoids X uses such a route
  • Any ISP not using X need only pass on the announcement
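The intended behavior of AVOID(X,P) at a single ISP can be sketched as a filter over candidate routes. This is a minimal illustration, not part of BGP: the function names are hypothetical, X may be an AS name or an (AS, AS) link, and returning `None` when every route uses X is an assumption, since the slide does not say what such an ISP should do.

```python
def uses(path, x):
    """True if AS path `path` traverses X (an AS name or an (AS, AS) link)."""
    if isinstance(x, tuple):
        hops = list(zip(path, path[1:]))
        return x in hops or (x[1], x[0]) in hops
    return x in path

def select_route(candidates, x):
    """Prefer the shortest candidate that avoids X; None if all routes use X."""
    clean = [p for p in candidates if not uses(p, x)]
    return min(clean, key=len) if clean else None

# Routes to WS as a hypothetical ISP might learn them from two neighbors.
routes = [["L3", "ATT", "WS"], ["Sprint", "Qwest", "WS"]]
via_isp = select_route(routes, "L3")            # avoid an ISP
via_link = select_route(routes, ("ATT", "WS"))  # avoid an ISP-ISP link
print(via_isp, via_link)
```

In both calls only the route through Sprint and Qwest survives the filter, so it is the one selected.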

SLIDES 23-27

Ideal Self-Repair of Reverse Paths

[Figure, built up over several slides: the announcement AVOID(L3,WS) propagates from ISP to ISP]

SLIDES 28-34

Practical Self-Repair of Reverse Paths

[Figure, built up over several slides: AS paths toward WS before the repair: WS; ATT → WS; Qwest → WS; L3 → ATT → WS; Sprint → Qwest → WS; AISP → Qwest → WS; UW → L3 → ATT → WS]

SLIDES 35-43

Practical Self-Repair of Reverse Paths

BGP loop prevention encourages switch to working path.

[Figure, built up over several slides: to approximate AVOID(L3,WS), WS announces the poisoned path WS → L3 → WS. The announcement propagates as Qwest → WS → L3 → WS, AISP → Qwest → WS → L3 → WS, Sprint → Qwest → WS → L3 → WS, and ATT → WS → L3 → WS. L3 no longer holds its route L3 → ATT → WS (shown as "?"), and UW moves from UW → L3 → ATT → WS to UW → Sprint → Qwest → WS → L3 → WS]
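The mechanism these slides rely on is ordinary BGP loop prevention: a router rejects any path that already contains its own AS. The sketch below is a simplified model with hypothetical function names, showing why a poisoned announcement is accepted by other ISPs but rejected by the poisoned one.

```python
def accepts(local_as, as_path):
    """BGP loop prevention: reject a path that already contains our own AS."""
    return local_as not in as_path

def poisoned_announcement(origin, poisoned_as):
    """Origin places the AS to avoid between two copies of itself."""
    return [origin, poisoned_as, origin]

announcement = poisoned_announcement("WS", "L3")  # WS → L3 → WS
att_path = ["ATT"] + announcement                 # as re-advertised by ATT

for asn in ["ATT", "Qwest"]:
    print(asn, accepts(asn, announcement))  # both print True
print("L3", accepts("L3", att_path))        # L3 False
```

Because L3 drops the route, ISPs that forwarded through L3 have to fall back to a path that avoids it, which is the effect AVOID(L3,WS) was meant to have.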

SLIDES 44-51

Naive Poisoning Causes Transient Loss

• Some ISPs may have working paths that avoid problem ISP X
• Naively, poisoning causes path exploration even for these ISPs
• Path exploration causes transient loss

[Figure, built up over several slides: origin O and ISPs A, B, C, D, E, F. Before the poison, the paths are O, A-O, B-A-O, D-A-O, E-D-A-O, F-B-A-O. To approximate AVOID(X,P), O announces O-X-O. During propagation, old and new paths coexist (for example F-B-A-O and E-D-A-O alongside B-A-O-X-O and D-A-O-X-O) before the network settles on A-O-X-O, B-A-O-X-O, D-A-O-X-O, E-D-A-O-X-O, F-B-A-O-X-O]

SLIDES 52-57

Prepend to Reduce Path Exploration

• Most routing decisions are based on (1) next-hop ISP and (2) path length
• Keep these fixed to speed convergence
• Prepending prepares ISPs for a later poison

[Figure, built up over several slides: O first announces the prepended path O-O-O, giving A-O-O-O, B-A-O-O-O, D-A-O-O-O, E-D-A-O-O-O, F-B-A-O-O-O. To approximate AVOID(X,P), O then announces O-X-O, and the paths become A-O-X-O, B-A-O-X-O, D-A-O-X-O, E-D-A-O-X-O, F-B-A-O-X-O. The final build adds the labels UW, GT, BGP-Mux, and LIFEGUARD next to the O-X-O announcement]
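The reasoning behind prepending can be checked with a small comparison. The sketch below reduces a route to the two attributes the slide names, next hop and path length, and compares the change an ISP sees with and without a prior prepend. The `decision_key` helper and the example ISP are hypothetical, and paths are written as the ISP receives them (its own AS omitted).

```python
def decision_key(as_path):
    """Next-hop AS and path length, the two attributes the slide singles out."""
    return (as_path[0], len(as_path))

# Paths toward O as seen by a hypothetical ISP whose next hop is A.
plain = ["A", "O"]                # baseline announcement O
prepended = ["A", "O", "O", "O"]  # after O announces O-O-O
poisoned = ["A", "O", "X", "O"]   # after O announces O-X-O

naive_changes = decision_key(plain) != decision_key(poisoned)
prepared_changes = decision_key(prepended) != decision_key(poisoned)
print(naive_changes, prepared_changes)  # True False
```

With the prepend in place, the poison leaves both attributes unchanged for an ISP that does not route through X, so it has little reason to explore other paths.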

SLIDE 58

Tested Idea Using BGP-Mux

• With no prepend, only 65% of unaffected ISPs converge instantly
• With prepending, 95% of unaffected ISPs re-converge instantly, and 98% in under half a minute
• Prepending also speeds convergence to new paths for affected peers

[Figure: CDF of peer convergence time in minutes; y-axis: cumulative fraction of convergences; series: "Prepend, no change" and "No prepend, no change"]

SLIDE 59

Summary

• BGP-Mux lets researchers experiment with BGP in the wild
  • Transparent to experiments and stable to upstream
• Initial experiments using it:
  • LIFEGUARD: reroute around ASes or links
  • PoiRoot: root cause analysis of BGP path changes
    • Expose routing preferences
    • Induce changes to use as ground truth
  • PECAN: joint content and network routing
    • Measure performance of alternate paths

SLIDE 60

Those Three Questions

• Data sharing
  • Reverse traceroute data now online
  • Other researchers passively observed our active BGP updates
  • Use the testbed yourself
• Visualization: http://tp.gtnoise.net/

SLIDE 62

Conclusion

• BGP-Mux lets researchers experiment with BGP in the wild
  • Transparent to experiments and stable to upstream
  • Georgia Tech system; I am just an enthusiastic user
• LIFEGUARD: Let edge networks reroute around failures
• Questions for the audience:
  • What would you use this system for? What should we use it for?
  • How do we get more ASes to connect to us?
    • Getting them to agree to
    • Then, getting the connection to work
      • VLAN between BGP-Mux and border router
      • Ability to advertise BGP routes