Improving network agility with seamless BGP reconfigurations - - PowerPoint PPT Presentation


Improving network agility with seamless BGP reconfigurations. Laurent Vanbever, vanbever@cs.princeton.edu. IRTF Open Meeting, IETF87, July 30, 2013. Based on joint work with Stefano Vissicchio, Luca Cittadini, Cristel Pelsser, Pierre François and Olivier Bonaventure.


slide-1
SLIDE 1

Improving network agility with seamless BGP reconfigurations

Laurent Vanbever, vanbever@cs.princeton.edu

IRTF Open Meeting, IETF87, July 30, 2013

Based on joint work with

Stefano Vissicchio, Luca Cittadini, Cristel Pelsser, Pierre François and Olivier Bonaventure

slide-2
SLIDE 2
“When you are changing the tires of a moving car ...”

- Vijay Gill

slide-3
SLIDE 3

“When you are changing the tires of a moving car, make sure one wheel is on the ground at all times”

- Vijay Gill

slide-4
SLIDE 4

Why do seamless BGP reconfigurations matter?

BGP is critical for ISPs: it enforces business relationships and is responsible for most of the traffic

BGP configuration is often changed: on average, 400+ changes recorded per month in a Tier1

Changing a BGP configuration can impact availability, even if the initial and final configurations are safe

slide-5
SLIDE 5

Improving network agility with seamless BGP reconfigurations

1 BGP reconfiguration: A crash course
2 Finding an ordering: Is it easy? Does it exist?
3 Reconfiguration framework: Overcome complexity

slide-6
SLIDE 6

Improving network agility with seamless BGP reconfigurations

1 BGP reconfiguration: A crash course
Finding an ordering: Is it easy? Does it exist?
Reconfiguration framework: Overcome complexity

slide-7
SLIDE 7

BGP is the only inter-domain routing protocol used today

[Figure: Autonomous Systems AS1, AS10, AS20, AS30, AS40 and AS50; BGP stands for Border Gateway Protocol, AS for Autonomous System]

slide-8
SLIDE 8

BGP comes in two flavors

AS10 AS20 AS30 AS40 AS50

AS1

slide-9
SLIDE 9

AS10 AS20 AS30 AS40 AS50 eBGP sessions

AS1

external BGP (eBGP) exchanges reachability information between ASes

slide-10
SLIDE 10

internal BGP (iBGP) distributes externally learned routes within the AS

AS10 AS20 AS30 AS40 AS50

AS1

iBGP sessions

slide-11
SLIDE 11

Plain iBGP mandates a full-mesh of iBGP sessions

Fair warning: some sessions are missing

O(n²) iBGP sessions, where n is the number of routers ... quickly becomes totally unmanageable
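To make the quadratic growth concrete, here is a minimal sketch that counts the sessions of a full mesh. The function name is illustrative; the count n(n-1)/2 simply follows from connecting every pair of routers once.

```python
def full_mesh_sessions(n: int) -> int:
    """Number of iBGP sessions needed to connect every pair of n routers."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return n * (n - 1) // 2


# The session count grows quadratically with the number of routers.
for n in (10, 100, 1000):
    print(n, full_mesh_sessions(n))
```

With 1000 routers, a full mesh already needs 499500 sessions.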

slide-12
SLIDE 12

With Route Reflection, iBGP routers are hierarchically organized

slide-13
SLIDE 13

Route Reflectors Clients

Route Reflectors relay route updates between iBGP neighbors

slide-14
SLIDE 14

Route Reflectors Clients

Lower layers rely on upper layers to learn and propagate routing information

Route Reflectors relay route updates between iBGP neighbors

slide-15
SLIDE 15

iBGP and eBGP need to be carefully configured

A BGP configuration is composed of
iBGP: Route-reflector sessions, Client sessions, Peer sessions
eBGP: External sessions, Routing policies

slide-16
SLIDE 16

Each part of a BGP configuration can be changed

iBGP: Route-reflector sessions, Client sessions, Peer sessions
eBGP: External sessions, Routing policies

Typical reconfiguration scenarios consist in
iBGP: Add sessions, Remove sessions, Change type

slide-17
SLIDE 17

Each part of a BGP configuration can be changed

iBGP: Route-reflector sessions, Client sessions, Peer sessions
eBGP: External sessions, Routing policies

Typical reconfiguration scenarios consist in
iBGP: Add sessions, Remove sessions, Change type
eBGP: Add sessions, Remove sessions, Modify policies
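As a rough illustration of these scenarios, the sketch below derives the list of session changes (add, remove, change type) that separate an initial configuration from a final one. The Session type, the kind labels and the router names are hypothetical, chosen only for the example; policy changes are not modeled.

```python
from typing import NamedTuple


class Session(NamedTuple):
    a: str
    b: str
    kind: str  # hypothetical labels, e.g. "peer" or "rr-client"


def reconfiguration_steps(initial, final):
    """Return (operation, session) pairs that turn `initial` into `final`."""
    def key(session):
        # A session is identified by its two endpoints, in any order.
        return frozenset((session.a, session.b))

    init = {key(s): s for s in initial}
    fin = {key(s): s for s in final}
    steps = []
    for k, s in fin.items():
        if k not in init:
            steps.append(("add", s))
        elif init[k].kind != s.kind:
            steps.append(("change type", s))
    for k, s in init.items():
        if k not in fin:
            steps.append(("remove", s))
    return steps


initial = [Session("R1", "R2", "peer"), Session("R1", "R3", "peer")]
final = [Session("R2", "R1", "rr-client"), Session("R1", "RR1", "rr-client")]
steps = reconfiguration_steps(initial, final)
for op, session in steps:
    print(op, session)
```

The sketch only enumerates the changes; it says nothing about the order in which they can be applied safely.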

slide-18
SLIDE 18

Reconfiguring BGP can be disruptive

BGP reconfigurations can create
signaling anomalies [Griffin, SIGCOMM02]
dissemination anomalies [Vissicchio, INFOCOM12]
forwarding anomalies [Griffin, SIGCOMM02]
or any combination of those

slide-19
SLIDE 19

Reconfiguring BGP can be disruptive

BGP reconfigurations can create
signaling anomalies
dissemination anomalies
forwarding anomalies
or any combination of those

routing oscillations, black holes, forwarding loops, traffic shifts

slide-20
SLIDE 20

Reconfiguring BGP can be disruptive

BGP reconfigurations can create
signaling anomalies
dissemination anomalies
forwarding anomalies
or any combination of those

How much?

slide-21
SLIDE 21

Let’s migrate from a full-mesh to a RR topology

slide-22
SLIDE 22

Let’s migrate from a full-mesh to a RR topology, following best practices

Establish the RR sessions in a bottom-up manner, then remove the full-mesh sessions [Herrero10]
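The best-practice ordering can be sketched as follows. This is an illustrative reading of the rule, not the procedure from [Herrero10]: the layer names and routers are hypothetical, top-layer routers are assumed to stay fully meshed, and adding an RR session over an existing full-mesh session is treated as a plain "add" rather than a type change.

```python
from itertools import combinations


def best_practice_ordering(layers, rr_sessions):
    """layers: router lists, from the top layer down to the bottom layer.
    rr_sessions: (client, reflector) pairs of the target hierarchy."""
    depth = {router: level for level, layer in enumerate(layers) for router in layer}
    # Bottom-up: sessions whose client sits in the lowest layer come first.
    adds = sorted(rr_sessions, key=lambda s: depth[s[0]], reverse=True)
    # Sessions that survive the migration: RR sessions plus the top-layer mesh.
    final = {frozenset(s) for s in rr_sessions}
    final |= {frozenset(pair) for pair in combinations(layers[0], 2)}
    routers = [router for layer in layers for router in layer]
    removes = [pair for pair in combinations(routers, 2) if frozenset(pair) not in final]
    return ([("add RR session", s) for s in adds]
            + [("remove full-mesh session", pair) for pair in removes])


layers = [["T1"], ["M1", "M2"], ["B1", "B2"]]
rr_sessions = [("M1", "T1"), ("M2", "T1"), ("B1", "M1"), ("B2", "M2")]
steps = best_practice_ordering(layers, rr_sessions)
for step in steps:
    print(step)
```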

slide-23
SLIDE 23

Best practices do not work

[Plot: cumulative frequency of experiments, Tier1 (50), vs. % of migration steps with anomalies; series: Loops]

60% of the experiments were subject to loops for > 35% of the steps

slide-24
SLIDE 24

Best practices do not work

[Plot: cumulative frequency of experiments, Tier1 (50), vs. % of migration steps with anomalies; series: Traffic shifts, Loops]

100% of the experiments were subject to traffic shifts for > 40% of the steps

slide-25
SLIDE 25

AS3 AS4 AS2

E4 E1 E2 E3 E5

P P P P P

AS1

Let’s tune BGP policies

slide-26
SLIDE 26

AS3 AS4 AS2

E4 E1 E2 E3 E5

AS1 learns a destination P via 5 egress points

P P P P P

AS1

slide-27
SLIDE 27

AS3 AS4 AS2

60 60 60 60 60

E4 E1 E2 E3 E5

preference

Initially, each egress point is equally preferred

AS1

slide-28
SLIDE 28

AS3 AS4 AS2

60 60 60 60 60

E4 E1 E2 E3 E5

preference

Depending on its position, each egress receives a percentage of the traffic

40% 20% 10% 10% 10% usage

AS1

slide-29
SLIDE 29

AS3 AS4 AS2

60 60

E4 E1 E2 E3 E5

preference

Let’s say that AS2 becomes more preferred

60 60 60 40% 20% 10% 10% 10% usage

AS1

slide-30
SLIDE 30

AS3 AS4 AS2

60 60

E4 E1 E2 E3 E5

preference

Let’s say that AS2 becomes more preferred

120 60 60 40% 20% 10% 10% 10% usage

AS1

slide-31
SLIDE 31

AS1 AS3 AS4 AS2

60 60

E4 E1 E2 E3 E5

preference

Let’s say that AS2 becomes more preferred

120 60 60 100% 0% 0% 0% 0% usage

  • 10%
  • 10%
  • 10%
  • 20%

60% of the traffic experiences a traffic shift

slide-32
SLIDE 32

AS3 AS4 AS2

60 60

E4 E1 E2 E3 E5

preference

Let’s say that AS2 becomes more preferred

120 60 60 100% 0% 0% 0% 0% usage

AS1

slide-33
SLIDE 33

AS3 AS4 AS2

60 60

E4 E1 E2 E3 E5

preference

Let’s say that AS2 becomes more preferred

120 120 60 100% 0% 0% 0% 0% usage

AS1

slide-34
SLIDE 34

AS3 AS4 AS2

60 60

E4 E1 E2 E3 E5

preference

Let’s say that AS2 becomes more preferred

120 120 60 67% 33% 0% 0% 0% usage

AS1

33% of the traffic experiences a traffic shift

  • 33%

60% of the traffic experiences a traffic shift

slide-35
SLIDE 35

AS3 AS4 AS2

60 60

E4 E1 E2 E3 E5

preference

Let’s say that AS2 becomes more preferred

120 120 120 67% 33% 0% 0% 0% usage

AS1

slide-36
SLIDE 36

AS3 AS4 AS2

60 60

E4 E1 E2 E3 E5

preference

Let’s say that AS2 becomes more preferred

120 120 120 56% 28% 16% 0% 0% usage

AS1

  • 5%
  • 11%

60% of the traffic experiences a traffic shift
33% of the traffic experiences a traffic shift
16% of the traffic experiences a traffic shift

slide-37
SLIDE 37

AS3 AS4 AS2

60 60

E4 E1 E2 E3 E5

preference

During the migration, 109% of the traffic has been shifted

120 120 120 56% 28% 16% 0% 0% usage

AS1

60% of the traffic experiences a traffic shift
33% of the traffic experiences a traffic shift
16% of the traffic experiences a traffic shift
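The 109% figure is the sum of the three per-step shifts (60% + 33% + 16%). The sketch below recomputes the last two steps from the usage percentages shown on the slides, assuming that a shift is counted as the traffic leaving an egress point; the first step is taken as stated (60%) rather than recomputed.

```python
def shifted_traffic(before, after):
    """Percentage of traffic that leaves its egress point between two usage vectors."""
    return sum(max(b - a, 0) for b, a in zip(before, after))


# Usage per egress point (in %) after each preference change, as shown on the slides.
usage = [
    [100, 0, 0, 0, 0],
    [67, 33, 0, 0, 0],
    [56, 28, 16, 0, 0],
]
first_step = 60  # taken from the slides, not recomputed here
later_steps = [shifted_traffic(b, a) for b, a in zip(usage, usage[1:])]
total = first_step + sum(later_steps)
print(later_steps, total)
```

More traffic is shifted during the migration than the network carries, because some flows move more than once.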

slide-38
SLIDE 38

Tuning eBGP policies can create huge traffic shifts

[Plot: cumulative frequency of Tier1 experiments vs. avg # traffic shifts per router per prefix (0.0 to 3.0); series: max LP]

50% of the routers experience > 1 TS for each prefix

slide-39
SLIDE 39

Improving network agility with seamless BGP reconfigurations

BGP reconfiguration: A crash course
2 Finding an ordering: Is it easy? Does it exist?
Reconfiguration framework: Overcome complexity

slide-40
SLIDE 40

To avoid reconfiguration problems, a proper operational ordering must be enforced

Given an initial and a final, anomaly-free BGP configuration, find a sequence of configuration changes such that signaling anomalies, dissemination anomalies and forwarding anomalies never occur during any migration step
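The problem statement can be sketched as a brute-force search over all orderings of the changes. Everything here is illustrative: the is_anomaly_free predicate stands in for a real anomaly check (which this sketch does not implement), and the session names are hypothetical. Note that trying every permutation costs factorial time in the number of changes.

```python
from itertools import permutations


def find_safe_ordering(initial, changes, is_anomaly_free):
    """Return the first ordering of `changes` whose every intermediate
    configuration passes `is_anomaly_free`, or None if there is none."""
    for order in permutations(changes):
        config = set(initial)
        safe = True
        for op, session in order:
            if op == "add":
                config.add(session)
            else:
                config.discard(session)
            if not is_anomaly_free(frozenset(config)):
                safe = False
                break
        if safe:
            return list(order)
    return None


changes = [("remove", "old-session"), ("add", "new-session")]
# Toy predicate: at least one session must be up at every step.
ordering = find_safe_ordering({"old-session"}, changes, lambda c: len(c) >= 1)
# Toy predicate that no ordering can satisfy: exactly one session at every step.
impossible = find_safe_ordering({"old-session"}, changes, lambda c: len(c) == 1)
print(ordering, impossible)
```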

slide-41
SLIDE 41

Find a sequence of configuration changes

slide-42
SLIDE 42

Find a sequence of configuration changes

Does it always exist?

slide-43
SLIDE 43

Find a sequence of configuration changes

Does it always exist?
Is it easy to compute?

slide-44
SLIDE 44

E1 E2 R1 R2

E2 E1 E2 E1

P P

We model iBGP configurations by using extended Stable Path Problem instances

slide-45
SLIDE 45

E1 E2 R1 R2

E2 E1 E2 E1

P P Egress-point to prefix P

We model iBGP configurations by using extended Stable Path Problem instances

slide-46
SLIDE 46

E1 E2 R1 R2

E2 E1 E2 E1

P P Egress-point to prefix P Egress-points in decreasing preference order

We model iBGP configurations by using extended Stable Path Problem instances

slide-47
SLIDE 47

E1 E2 R1 R2

E2 E1 E2 E1

P P Egress-point to prefix P Egress-points in decreasing preference order Best-learned egress point

We model iBGP configurations by using extended Stable Path Problem instances
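The annotations above (egress points in decreasing preference order, best-learned egress point) suggest a simple selection rule, sketched below. This is a simplification for illustration only, not the extended Stable Path Problem formalism itself, and the rankings and learned sets are hypothetical.

```python
def best_learned_egress(ranking, learned):
    """Pick the most preferred egress point among those the router has learned."""
    for egress in ranking:  # ranking is in decreasing preference order
        if egress in learned:
            return egress
    return None  # the router has no route to prefix P


# Hypothetical rankings and learned egress points for two routers.
ranking = {"R1": ["E2", "E1"], "R2": ["E1", "E2"]}
learned = {"R1": {"E1"}, "R2": {"E1", "E2"}}
selected = {r: best_learned_egress(ranking[r], learned[r]) for r in ranking}
print(selected)
```

Here R1 prefers E2 but only learns E1, so it selects E1; what a router learns depends on the iBGP sessions, which is why changing sessions can change the selected egress point.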

slide-48
SLIDE 48

E1 E2 R1 R2

1 2 1 1 1

E1 E2 R1 R2

E2 E1 E2 E1

P P

A stable BGP configuration determines the forwarding paths being used

BGP configuration IGP configuration

resulting forwarding paths

slide-49
SLIDE 49

A seamless migration ordering might not always exist

E1 E2 R1 R2 RR1 RR2 S E1 E2 R1 R2 RR1 RR2 S

Initial BGP configuration Final BGP configuration

P P P P P P

slide-50
SLIDE 50

A seamless migration ordering might not always exist

E1 E2 R1 R2 RR1 RR2 S E1 E2 R1 R2 RR1 RR2 S

Initial BGP configuration Final BGP configuration

removed session

P P P P P P

slide-51
SLIDE 51

A seamless migration ordering might not always exist

E1 E2 R1 R2 RR1 RR2 S E1 E2 R1 R2 RR1 RR2 S

Initial BGP configuration Final BGP configuration

added session

P P P P P P

slide-52
SLIDE 52

E1 E2 R1 R2 RR1 RR2 S 1 1 E1 E2 R1 R2 RR1 RR2 S 1 1 1 100 100 1

E1 E2 S E2 E1 S E2 E1 S E1 E2 S

Path preferences IGP configuration

P P P

slide-53
SLIDE 53

E1 E2 R1 R2 RR1 RR2 S E1 E2 R1 R2 RR1 RR2 S 1 1 1 1 1 1

E1 E2 S E2 E1 S E2 E1 S E1 E2 S

The initial configuration is anomaly-free

P P P

slide-54
SLIDE 54

E1 E2 R1 R2 RR1 RR2 S E1 E2 R1 R2 RR1 RR2 S 1 1 1 1 1 1

E1 E2 S

E2 E1 S

E2 E1 S E1 E2 S

The final configuration is anomaly-free

P P P

slide-55
SLIDE 55

E1 E2 R1 R2 RR1 RR2 S E1 E2 R1 R2 RR1 RR2 S 1 1 1 1 1 1

E1 E2 S E2 E1 S E2 E1 S E1 E2 S

Let’s add the final session before removing the initial one

P P P

slide-56
SLIDE 56

E1 E2 R1 R2 RR1 RR2 S E1 E2 R1 R2 RR1 RR2 S 1 1 1 1 1 1

E1 E2 S E2 E1 S E2 E1 S E1 E2 S

Let’s add the final session before removing the initial one

P P P

slide-57
SLIDE 57

E1 E2 R1 R2 RR1 RR2 S E1 E2 R1 R2 RR1 RR2 S 1 1 1 1 1 1

E1 E2 S E2 E1 S E2 E1 S E1 E2 S

R1 now learns and selects E2, forcing RR1 to use E2 as well

P P P

slide-58
SLIDE 58

E1 E2 R1 R2 RR1 RR2 S E1 E2 R1 R2 RR1 RR2 S 1 1 1 1 1 1

E1 E2 S E2 E1 S E2 E1 S E1 E2 S

RR1 uses RR2 to reach E2, and RR2 uses RR1 to reach E1 ...

P P P

slide-59
SLIDE 59

E1 E2 R1 R2 RR1 RR2 S E1 E2 R1 R2 RR1 RR2 S

E1 E2 S E2 E1 S E2 E1 S E1 E2 S

Forwarding Loop

... which creates a forwarding loop

P P P
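A forwarding loop of this kind can be detected by following next hops until a router repeats. The sketch below is a minimal illustration: the next-hop map encodes only the situation described on the previous slide (RR1 forwards to RR2 and RR2 forwards to RR1), and the function name is an assumption.

```python
def find_forwarding_loop(next_hop, source):
    """Follow next hops from `source`; return the loop as a list of routers,
    or None if the packet leaves the set of routers in `next_hop`."""
    path = []
    position = {}
    node = source
    while node in next_hop:
        if node in position:
            # The router was already visited: everything since then is the loop.
            return path[position[node]:] + [node]
        position[node] = len(path)
        path.append(node)
        node = next_hop[node]
    return None


# RR1 uses RR2 to reach E2, and RR2 uses RR1 to reach E1.
next_hop = {"RR1": "RR2", "RR2": "RR1"}
loop = find_forwarding_loop(next_hop, "RR1")
print(loop)
```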

slide-60
SLIDE 60

E1 E2 R1 R2 RR1 RR2 S E1 E2 R1 R2 RR1 RR2 S 1 1 1 1 1 1

E1 E2 S E2 E1 S E2 E1 S E1 E2 S

Let’s remove the initial session before adding the final one

P P P

slide-61
SLIDE 61

E1 E2 R1 R2 RR1 RR2 S E1 E2 R1 R2 RR1 RR2 S 1 1 1 1 1 1

E1 E2 S E2 E1 S E2 E1 S E1 E2 S

Let’s remove the initial session before adding the final one

P P P

slide-62
SLIDE 62

E1 E2 R1 R2 RR1 RR2 S E1 E2 R1 R2 RR1 RR2 S 1 1 1 1 1 1

E1 E2 S E2 E1 S E2 E1 S E1 E2 S

When we remove the session, R2 and RR2 stop learning E1 and switch to E2

P P P

slide-63
SLIDE 63

E1 E2 R1 R2 RR1 RR2 S E1 E2 R1 R2 RR1 RR2 S 1 1 1 1 1 1

E1 E2 S E2 E1 S E2 E1 S E1 E2 S

R1 uses R2 to reach E1, and R2 uses R1 to reach E2

P P P

slide-64
SLIDE 64

E1 E2 R1 R2 RR1 RR2 S E1 E2 R1 R2 RR1 RR2 S

E1 E2 S E2 E1 S E2 E1 S E1 E2 S

Forwarding Loop

which creates a forwarding loop as well...

P P P

slide-65
SLIDE 65

Find a sequence of configuration changes

Does it always exist? No.

slide-66
SLIDE 66

Find a sequence of configuration changes

Does it always exist? No.
Is it easy to compute?

slide-67
SLIDE 67

Finding a seamless migration ordering is computationally hard

Deciding if an ordering free from signaling anomalies exists is NP-hard (reduction in polynomial time from 3-SAT)

slide-68
SLIDE 68

Finding a seamless migration ordering is computationally hard

Deciding if an ordering free from signaling anomalies exists is NP-hard (reduction in polynomial time from 3-SAT)

The same reduction applies for
dissemination anomalies
forwarding anomalies
iBGP or eBGP reconfigurations

slide-69
SLIDE 69

Find a sequence of configuration changes

Does it always exist? No.
Is it easy to compute? No.

slide-70
SLIDE 70

Find a sequence of configuration changes

Does it always exist? No.
Is it easy to compute? No.

An algorithmic approach is not viable

slide-71
SLIDE 71

Improving network agility with seamless BGP reconfigurations

BGP reconfiguration: A crash course
Finding an ordering: Is it easy? Does it exist?
3 Reconfiguration framework: Overcome complexity

slide-72
SLIDE 72

Why is BGP reconfiguration so complex ?

Local reconfiguration can have global impact in an unpredictable manner

slide-73
SLIDE 73

Why is BGP reconfiguration so complex ?

Local reconfiguration can have global impact in an unpredictable manner

To avoid that, we could run each configuration in an independent routing plane

Similar to IGP reconfiguration [Vanbever, SIGCOMM11]
Shadow configuration [Alimi, SIGCOMM08]

slide-74
SLIDE 74

The reconfiguration framework leverages Ships-In-The-Night (SITN) migration for BGP

SITN migrations consist in
1 running multiple BGP routing planes
2 waiting for each plane to converge
3 modifying the plane responsible for forwarding

Abstract model of a router
Control-plane: init BGP
Data-plane: init forwarding paths

slide-75
SLIDE 75

The reconfiguration framework leverages Ships-In-The-Night (SITN) migration for BGP

SITN migrations consist in
1 running multiple BGP routing planes
2 waiting for each plane to converge
3 modifying the plane responsible for forwarding

Abstract model of a router
Control-plane: init BGP, final BGP
Data-plane: init forwarding paths

slide-76
SLIDE 76

The reconfiguration framework leverages Ships-In-The-Night (SITN) migration for BGP

SITN migrations consist in
1 running multiple BGP routing planes
2 waiting for each plane to converge
3 modifying the plane responsible for forwarding

Abstract model of a router
Control-plane: init BGP, final BGP
Data-plane: init forwarding paths

slide-77
SLIDE 77

The reconfiguration framework leverages Ships-In-The-Night (SITN) migration for BGP

SITN migrations consist in
1 running multiple BGP routing planes
2 waiting for each plane to converge
3 modifying the plane responsible for forwarding

Abstract model of a router
Control-plane: init BGP, final BGP
Data-plane: final forwarding paths

slide-78
SLIDE 78

The reconfiguration framework leverages Ships-In-The-Night (SITN) migration for BGP

SITN migrations consist in
1 running multiple BGP routing planes
2 waiting for each plane to converge
3 modifying the plane responsible for forwarding

Abstract model of a router
Control-plane: init BGP, final BGP
Data-plane: final forwarding paths

BGP SITN can be deployed on today’s routers using BGP/MPLS VPN technology
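The three SITN steps can be sketched on the abstract router model as follows. This only models the control flow: the Router class, the plane names and the has_converged callback are assumptions made for the example, and a real deployment would rely on router features such as BGP/MPLS VPNs rather than on code like this.

```python
from dataclasses import dataclass, field


@dataclass
class Router:
    name: str
    planes: set = field(default_factory=lambda: {"init"})  # control-plane
    forwarding_plane: str = "init"                          # data-plane


def sitn_migrate(routers, has_converged, new_plane="final"):
    """Apply the three SITN steps; return False if a plane has not converged."""
    # Step 1: run the new BGP routing plane next to the existing one.
    for router in routers:
        router.planes.add(new_plane)
    # Step 2: wait for each plane to converge (a single check here, no polling).
    if not all(has_converged(r, plane) for r in routers for plane in r.planes):
        return False
    # Step 3: modify the plane responsible for forwarding.
    for router in routers:
        router.forwarding_plane = new_plane
    return True


routers = [Router("R1"), Router("R2")]
migrated = sitn_migrate(routers, has_converged=lambda router, plane: True)
print(migrated, [r.forwarding_plane for r in routers])
```

Forwarding keeps using the initial plane until the final plane has converged, so the intermediate BGP states never reach the data-plane.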

slide-79
SLIDE 79

Let’s reconfigure a network from an iBGP full-mesh ...

GEANT: European research network, 36 routers (virtualized), 53 links

slide-80
SLIDE 80

Let’s reconfigure a network from an iBGP full-mesh to an iBGP hierarchy

GEANT: European research network, 36 routers (virtualized), 53 links

iBGP hierarchy: Top, Middle, Bottom

slide-81
SLIDE 81

Following best practices, traffic was lost for 30% of the process

[Plot: # of failed pings (median, 5% and 95%) vs. migration steps; series: current best practices, our approach]

Current best practices: losses from 7 routers, 60% of the GEANT routing table is impacted!

Average results (30 repetitions) computed on 120+ pings per step from every router to 16 summary prefixes

slide-82
SLIDE 82

Following our approach, lossless reconfiguration was achieved

[Plot: # of failed pings (median, 5% and 95%) vs. migration steps; series: current best practices, our approach]

Current best practices: losses from 7 routers, 60% of the GEANT routing table is impacted!
No loss occurred with our approach

Average results (30 repetitions) computed on 120+ pings per step from every router to 16 summary prefixes

slide-83
SLIDE 83

Improving network agility with seamless BGP reconfigurations

BGP reconfiguration: A crash course
Finding an ordering: Is it easy? Does it exist?
Reconfiguration framework: Overcome complexity

slide-84
SLIDE 84

Contributions

1 Study BGP reconfiguration, both practically and theoretically
2 Show that a (seamless) operational ordering might be needed, might not exist, and is computationally hard to find
3 Implement and validate a BGP reconfiguration framework

slide-85
SLIDE 85

Laurent Vanbever, http://vanbever.eu

IRTF Open Meeting, IETF87, July 30, 2013

Improving network agility with seamless BGP reconfigurations