External Routing: BGP, Jean-Yves Le Boudec, Fall 2009



slide-1
SLIDE 1

External Routing: BGP

ÉCOLE POLYTECHNIQUE FÉDÉRALE DE LAUSANNE

Jean-Yves Le Boudec, Fall 2009

Self-Organization

slide-2
SLIDE 2

Contents

  • A. What Inter-Domain Routing does
      • 1. Inter-Domain Routing
      • 2. Policy Routing
      • 3. Route Aggregation
  • B. How BGP works
      • 1. How it works
      • 2. Interaction BGP / IGP / Packet Forwarding
      • 3. Other Bells and Whistles
  • C. Examples
  • D. Illustrations and Statistics

slide-3
SLIDE 3

  • A. What BGP Does
  • 1. Inter-Domain Routing

Why invented? The Internet is too large to be run by one routing protocol.

Hierarchical routing is used:

  • the Internet is split into Domains, or Autonomous Systems
  • with OSPF: large domains are split into Areas

Routing protocols are said to be

  • interior (Interior Gateway Protocols, IGPs): inside ASs: RIP, OSPF (standard), IGRP (Cisco)
  • exterior: between ASs: EGP (old) and BGP-1 to BGP-4 (today), IDRP (tomorrow, maybe)

slide-4
SLIDE 4

What is an ARD? An AS?

ARD = Autonomous Routing Domain = routing domain under one single administration

  • one or more border routers
  • all subnetworks inside an ARD should be connected
  • should learn about other subnetwork prefixes: the routing tables of internal routers should contain entries for all destinations of the Internet

AS = Autonomous System = ARD with a number ("AS number")

  • AS number is 16 bits
  • public: 1 - 64511
  • private: 64512 - 65535

ARDs that do not need a number are typically those with a default route to the rest of the world.

Examples

  • AS1942 - CICG-GRENOBLE, AS2200 - Renater
  • AS559 - SWITCH Teleinformatics Services
  • AS5511 - OPENTRANSIT
  • EPFL: one ARD, no number
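The public and private ranges stated on this slide can be checked mechanically. The following is a minimal sketch; the function name `classify_as_number` is hypothetical and the ranges are exactly those given above for 16-bit AS numbers.

```python
def classify_as_number(asn):
    """Classify a 16-bit AS number using the ranges given on the slide."""
    if not 1 <= asn <= 65535:
        raise ValueError("not a 16-bit AS number: %r" % asn)
    # 64512 - 65535 is the private range; everything below is public
    return "private" if asn >= 64512 else "public"

print(classify_as_number(559))    # SWITCH, one of the examples above
print(classify_as_number(64512))  # first private number
```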

slide-5
SLIDE 5

The figure shows several domains, or ARDs. ARDs can be transit (B and D), stub (A) or multihomed (C). Only non-stub domains need an AS number.

[Figure: ARDs A, B, C and D, each with numbered border routers (A1 to A4, B1 to B4, C1 to C4, D1 to D6). BGP-4 runs between border routers of different ARDs; IGRP or OSPF runs inside the ARDs, and ARD D is split into OSPF areas 0, 1 and 2.]

slide-6
SLIDE 6

What does BGP do?

BGP is a routing protocol between ARDs. It is used to compute paths from one router in one ARD to any network prefix anywhere in the world.

There are only two levels for BGP:

  • inter-domain
  • within domain

The method of routing is

  • path vector
  • with policy

slide-7
SLIDE 7

Path Vector Routing

What is the requirement?

  • Find best routes, in a sense that can be decided by every ARD using its own criteria

How does it work?

  • a route between neighbours is (path: dest), where path is a sequence of AS numbers and dest is an IP prefix; example: B A:n1
  • every AS uses its own rules for deciding which path is better
  • the BGP table keeps a record of the best paths to all destinations
  • an AS announces only the best paths it knows

Q. Explain how E can choose the best paths to n1 and n2.
Q. How can loops be avoided?

[Figure: ASs A, B, C, D and E. A announces "A:n1,n2" to B and to C; B announces "B A:n1,n2" and "B:n5" to E; C announces "C A:n1,n2" and "C:n3" to D; D announces "D C A:n1,n2", "D C:n3" and "D:n4" to E.]

BGP table in E

  dest   AS path
  n1     B A
  n2     B A
  n3     D C
  n4     D
  n5     B
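The path-vector mechanism of this slide can be sketched in a few lines of Python. This is an illustration under stated assumptions, not BGP itself: the names `receive` and `best_path` are hypothetical, and "shortest AS path wins" is only one possible rule, since every AS uses its own criteria. An announcement whose path already contains the local AS is dropped, which is one way loops can be avoided.

```python
def receive(local_as, table, path, dest):
    """Record an announcement (path: dest) unless it loops through local_as."""
    if local_as in path:
        return False  # our own AS is already on the path: would create a loop
    table.setdefault(dest, []).append(list(path))
    return True

def best_path(table, dest):
    """Illustrative rule only (an assumption): prefer the shortest AS path."""
    candidates = table.get(dest, [])
    return min(candidates, key=len) if candidates else None

# Announcements received by E, as listed for the figure
table_e = {}
announcements = [
    (["B", "A"], ["n1", "n2"]), (["B"], ["n5"]),
    (["D", "C", "A"], ["n1", "n2"]), (["D", "C"], ["n3"]), (["D"], ["n4"]),
]
for path, dests in announcements:
    for dest in dests:
        receive("E", table_e, path, dest)

best = {dest: best_path(table_e, dest) for dest in sorted(table_e)}
print(best)
```

With this rule the result coincides with the BGP table in E shown on the slide; a different local rule could select "D C A" for n1 instead.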

slide-8
SLIDE 8

Border Gateways, E- and I-BGP

A router that runs BGP is called a BGP speaker.

At the boundary between 2 ARDs there are 2 BGP speakers, one in each domain.

  • Q: compare to OSPF

Inside one ARD there are usually several BGP speakers.

  • They all talk to each other, to exchange what they have learnt
  • Using "Internal BGP" (I-BGP)
  • Over TCP connections, in a full mesh called the "BGP mesh"
  • I-BGP is the same as E-BGP except for one rule: routes learned from a node in the mesh are not repeated inside the mesh

[Figure: domains D1 to D5 and BGP speakers A to H, with E-BGP sessions between domains and I-BGP sessions inside a domain. Announcements "X:n1" arrive over E-BGP; example messages: "A->C: D1 X:n1", "C->E: D1 X:n1", "E->G: D3 D1,X:n1".]

slide-9
SLIDE 9

Q1: Which of the following announcements do you think router D will send to router E?

  • 1. D->E: D2 X:n1
  • 2. D->E: D1 X:n1

A: 1 only; 2 cannot be sent because over I-BGP we do not repeat a route learnt from another BGP speaker in the same ARD.

Q2: Assume that in the figure below the router R does not run BGP. Is this possible?
A: yes.

Q3: Do BGP speakers in the same ARD need to be on-link?
A: no; C and F are not directly connected (are not on-link).

[Figure: same topology as on the previous slide (domains D1 to D5, BGP speakers A to H, E-BGP and I-BGP sessions), with an additional router R that does not run BGP.]

slide-10
SLIDE 10

  • 2. Policy Routing

Why invented? Interconnection of ASs is self-organized

  • point to point links between networks: ex: EPFL to Switch, Switch to Telianet
  • interconnection points: NAP (Network Access Point), MAE (Metropolitan Area Ethernet), CIX (Commercial Internet eXchange), GIX (Global Internet eXchange), IXP, SFINX, LINX

Mainly 3 types of relations, depending on money flows

  • customer: EPFL is customer of Switch. EPFL pays Switch
  • provider: Switch is provider for EPFL; Switch is paid by EPFL
  • peer: EPFL and CERN are peers: the cost of interconnection is shared

The type of relation is negotiated in bilateral agreements. There is no architecture rule, just business.

slide-11
SLIDE 11

What is the Goal of Policy Routing?

Example:

  • ISP3-ISP2 is a transatlantic link, cost shared between ISP2 and ISP3
  • ISP3-ISP1 is a local, inexpensive link
  • Ci is customer of ISPi, ISPs are peers

It is advantageous for ISP3 to send traffic to n2 via ISP1.

ISP1 does not agree to carry traffic from C3 to C2.

  • ISP1 offers a "transit service" to C1 and a "non-transit" service to ISP2 and ISP3

The goal of "policy routing" is to support this and other similar requirements.

[Figure: ISP1, ISP2 and ISP3 are peers of each other; C1, C2 and C3 are their respective customers; network n2 is in C2. Legend: provider, customer, peers.]

slide-12
SLIDE 12

How does Policy Routing Work?

Policy routing is implemented by rules imposed on BGP routers inside an AS, which may

  • refuse to import or announce some paths
  • modify the attributes that control which path is preferred (see later)

Example

  • ISP1 announces to ISP3 all networks of C1, so that C1 can be reached by all sources in the world
  • ISP1 announces to C1 all routes it has learnt from ISP3 and ISP2, so that C1 can send traffic to all destinations in the world
  • ISP2 announces "ISP2 n2" to ISP3 and ISP1; assume that ISP1 announces "ISP1 ISP2 n2" to ISP3
  • ISP3 has two routes to n2: "ISP2 n2" and "ISP1 ISP2 n2"; assume that ISP3 gives preference to the latter
  • packets from n3 to n2 are routed via ISP1: undesired
  • solution: ISP1 announces to ISP3 only routes to ISP1's customers

[Figure: ISP1, ISP2, ISP3 with customers C1, C2 (network n2) and C3 (network n3).]

slide-13
SLIDE 13

Typical Policy Routing Rules

Provider (ISP1) to customer (C1)

  • announce all routes learnt from other ISPs
  • import only routes that belong to domain C1
  • example: import from EPFL only one route 128.178/15

Customer (C1) to Provider (ISP1)

  • announce all routes that belong to domain C1
  • import all routes

Peers (ISP1 to ISP3)

  • announce only routes to all customers of ISP1
  • import only routes to ISP3's customers
  • these routes are defined as part of the peering agreement

The rules are defined by every AS (self-organization) and implemented in all BGP speakers in one AS.

[Figure: ISP1, ISP2, ISP3 with customers C1, C2, C3.]
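The export side of these typical rules can be written as a small predicate. This is a sketch under assumptions: the function `should_export`, the label `"own"` for an AS's own prefixes, and the treatment of a customer's own customers as "routes that belong to the domain" are illustrative choices, not part of the slides.

```python
CUSTOMER, PROVIDER, PEER = "customer", "provider", "peer"

def should_export(learnt_from, send_to):
    """Decide whether a route may be announced to a neighbour.

    learnt_from: relation of the neighbour the route was learnt from
                 (CUSTOMER, PROVIDER, PEER) or "own" for local prefixes.
    send_to:     relation of the neighbour the route would be sent to.
    """
    if send_to == CUSTOMER:
        return True  # a customer receives all routes
    # towards providers and peers: only own routes and customer routes
    return learnt_from in ("own", CUSTOMER)

# ISP1 learnt "ISP2 n2" from its peer ISP2: it is not re-announced to peer ISP3,
# but it is announced to customer C1.
print(should_export(PEER, PEER), should_export(PEER, CUSTOMER))
```

Applied to the example of the previous slide, the predicate keeps ISP1 from announcing "ISP1 ISP2 n2" to ISP3, which is the undesired transit case.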

slide-14
SLIDE 14

  • 3. Aggregation

Domains that do not have a default route (i.e. all transit ISPs) must know all routes in the world (> 170 000 prefixes)

  • in IP routing tables, unless default routes are used
  • in BGP announcements

Aggregation is a way to reduce the number of routes.

slide-15
SLIDE 15

Aggregation Example 1

[Figure: AS1 (193.212.0.0/24) and AS2 (193.212.1.0/24) are connected to AS3; AS3 is connected to AS4. AS3 and AS4 announce 193.212.0.0/23.]

Assume AS3 aggregates the routes received from AS1 and AS2

  AS1: 193.212.0.0/24   AS_PATH: 1
  AS2: 193.212.1.0/24   AS_PATH: 2
  AS3: 193.212.0.0/23   AS_PATH: 3 {1 2}
  AS4: 193.212.0.0/23   AS_PATH: 4 3 {1 2}
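The aggregation performed by AS3 in this example can be reproduced with Python's standard `ipaddress` module. This is a minimal sketch: the helper `aggregate` and its representation of the AS path (local AS followed by an unordered set, written {1 2} on the slide) are assumptions made for illustration.

```python
import ipaddress

def aggregate(routes, local_as):
    """routes: list of (prefix string, AS path list).

    Returns the collapsed prefixes and an AS path made of the local AS
    followed by the set of all ASs seen in the aggregated routes.
    """
    nets = [ipaddress.ip_network(prefix) for prefix, _ in routes]
    merged = ipaddress.collapse_addresses(nets)  # merges adjacent prefixes
    as_set = set()
    for _, path in routes:
        as_set.update(path)
    return [str(n) for n in merged], [local_as, as_set]

prefixes, as_path = aggregate(
    [("193.212.0.0/24", [1]), ("193.212.1.0/24", [2])], local_as=3)
print(prefixes, as_path)
```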

slide-16
SLIDE 16

Aggregation Example 2

[Figure: AS1 (197.8.2.0/24) and AS2 (n2 = 197.8.3.0/24) are connected to AS3; AS3 is connected to AS4 (network n4); AS2 is also connected directly to AS4. AS3 announces 197.8.2.0/23, AS2 announces 197.8.3.0/24, AS4 announces 197.8.2.0/23.]

AS4 receives

  197.8.2.0/23   AS_PATH: 3 {1 2}
  197.8.3.0/24   AS_PATH: 2

and assume that AS4 exports only the first route (aggregation is done).

Q: Assume both routes are injected into AS4's local routing tables; what happens to packets from n4 to n2?
Q: If AS4 injects only the aggregate route into its local routing tables, what is the answer now?
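Both questions on this slide come down to longest prefix match in AS4's forwarding table. The sketch below illustrates it; the function `longest_prefix_match`, the next-hop labels "via AS2" / "via AS3" and the destination address 197.8.3.7 are hypothetical and chosen only for the example.

```python
import ipaddress

def longest_prefix_match(table, address):
    """table: dict prefix string -> next-hop label. Most specific match wins."""
    addr = ipaddress.ip_address(address)
    best_net, best_hop = None, None
    for prefix, hop in table.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best_net is None or net.prefixlen > best_net.prefixlen):
            best_net, best_hop = net, hop
    return best_hop

both_routes = {"197.8.2.0/23": "via AS3", "197.8.3.0/24": "via AS2"}
only_aggregate = {"197.8.2.0/23": "via AS3"}

dest_in_n2 = "197.8.3.7"  # hypothetical host inside n2 = 197.8.3.0/24
print(longest_prefix_match(both_routes, dest_in_n2))
print(longest_prefix_match(only_aggregate, dest_in_n2))
```

With both entries installed the /24 is more specific and wins; with only the aggregate, the same packet follows the /23.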

slide-17
SLIDE 17

Example Without Aggregation

[Figure: same topology as Aggregation Example 2 (AS1 with 197.8.2.0/24, AS2 with 197.8.3.0/24, AS3, AS4), showing the /24 and /23 routes exchanged between the ASs.]

Q: If AS3 does not aggregate, what are the routes announced by AS4? Is there any benefit?

slide-18
SLIDE 18

Moral

Aggregation should be performed whenever possible

  • when all aggregated prefixes have the same path (example 1)
  • when all aggregated prefixes have the same path before the aggregation point (example 2)

An AS can decide to

  • aggregate several routes when exporting them
  • but still maintain different routing entries inside its domain (example 2)

slide-19
SLIDE 19

  • B. BGP (Border Gateway Protocol)
  • 1. How it works, Fundamental Examples

BGP-4, RFC 1771

BGP routers talk to each other over TCP connections

  • Maintain adjacency to peers (over TCP connection): open, keep-alive
  • Transmit only modifications

How a BGP router works (see also next slide)

  • Receives routes from neighbours
      • Accepts / rejects them (import policy)
      • Processes attributes
      • Stores in Adj-RIB-in (one per BGP peer, internal or external)
  • Applies decision process and stores results in Loc-RIB (global to BGP speaker) (= decides which routes to accept among all present in the different Adj-RIB-ins)
      • For every subnetwork prefix, at most 1 route is selected
  • Sends to neighbours
      • Decides whether to send or not (export policy)
      • Aggregates multiple routes into one, if applicable
      • Stores result in Adj-RIB-out (one per BGP peer) and sends to neighbour
      • Only routes learnt from E-BGP are sent to an I-BGP neighbour
      • Sends updates when Adj-RIB-out changes
  • Writes forwarding entries in its routing table, either directly or by re-distribution into IGP; this is explained in detail in Section 2.

slide-20
SLIDE 20

Inside BGP

[Figure: block diagram of one BGP router. Components shown: updates in, Adj-RIB-in (one per peer), import policy, decision process (one best route per destination), Loc-RIB, export policy, Adj-RIB-out (one per peer), updates out. The router also shows internal routes, IGP, static configuration and the routing table, with an arrow "write forwarding entries" into the routing table.]

slide-21
SLIDE 21

Routes, RIBs, Routing Table

The records sent in BGP messages are called "Routes". Routes + their attributes are stored in the Adj-RIB-in, Loc-RIB, Adj-RIB-out.

A route is made of:

  • destination (subnetwork prefix)
  • path to the destination (AS-PATH)
  • attributes
      • Well-known Mandatory
          • ORIGIN (route learnt from IGP, BGP or static)
          • AS-PATH
          • NEXT-HOP
      • Well-known Discretionary
          • LOCAL-PREF (see later)
          • ATOMIC-AGGREGATE (= route cannot be dis-aggregated)
      • Optional Transitive
          • AGGREGATOR (who aggregated this route)
      • Optional Nontransitive
          • MULTI-EXIT-DISC (MED) (see later)
          • WEIGHT (see later; local to the router, never sent in announcements)

In addition, like any IP host or router, a BGP router also has a Routing Table = IP forwarding table

  • Used for packet forwarding, in real time
  • This is not the same as a RIB; we will see the interaction later

slide-22
SLIDE 22

The Decision Process

The decision process decides which route is selected; its output is written into Loc-RIB.

At most one best route to exactly the same prefix is chosen

  • Only one route to 2.2/16 can be chosen
  • But there can be different routes to 2.2.2/24 and 2.2/16

Routes are compared against each other using the following sequence of criteria, until only one route remains:

  • 0. Highest weight (Cisco proprietary)
  • 1. Highest LOCAL-PREF
  • 2. Shortest AS-PATH
  • 3. Lowest MED, if taken seriously by this network
  • 4. E-BGP > I-BGP
  • 5. Shortest path to NEXT-HOP, according to IGP
  • 6. Lowest BGP identifier
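The sequence of criteria above amounts to a lexicographic comparison, which can be sketched with a sort key. This is an illustration, not a router implementation: the `Route` fields, their default values (for example LOCAL-PREF 100) and the choice to always compare MED are assumptions.

```python
from dataclasses import dataclass

@dataclass
class Route:
    prefix: str
    as_path: tuple
    next_hop: str
    weight: int = 0
    local_pref: int = 100          # default value is an assumption
    med: int = 0
    learnt_via_ebgp: bool = True
    igp_distance: int = 0          # IGP distance to NEXT-HOP
    bgp_id: int = 0                # BGP identifier of the announcing peer

def preference_key(r):
    """Smaller key = better route; fields follow criteria 0 to 6 in order."""
    return (-r.weight, -r.local_pref, len(r.as_path), r.med,
            0 if r.learnt_via_ebgp else 1, r.igp_distance, r.bgp_id)

def decide(routes):
    """Return the single best route among routes to exactly the same prefix."""
    return min(routes, key=preference_key)

# Two I-BGP routes to 10.2/16 that differ only in the IGP distance to NEXT-HOP
a = Route("10.2/16", ("y",), "2.2.2.1", learnt_via_ebgp=False, igp_distance=3)
b = Route("10.2/16", ("y",), "3.3.3.1", learnt_via_ebgp=False, igp_distance=2)
print(decide([a, b]).next_hop)
```

Negating weight and LOCAL-PREF turns "highest wins" into "smallest key wins", so one `min` call applies all criteria in order.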
slide-23
SLIDE 23

Fundamental Examples

[Figure: AS x contains R1 and R2, connected by I-BGP. AS y contains R3 and R4 and the networks 10.1/16 and 10.2/16. R1 (1.1.1.1) has an E-BGP session with R3 (1.1.1.2); R2 (2.2.2.2) has an E-BGP session with R4 (2.2.2.1). R3 and R4 each announce "10.1/16 AS=(y)" and "10.2/16 AS=(y)".]

Write on the board

In this simple example there are 4 BGP routers. They communicate directly or indirectly via E-BGP or I-BGP, as shown on the figure. There are 2 ASs, x and y. We do not show the details of the internals of y. R3 and R4 send the BGP messages shown. In the next slide, we show only the following attributes in the RIBs: destination, path, NEXT-HOP.

slide-24
SLIDE 24

R1 has the following RIBs, initially empty:

  Adj-RIB-In from R3:   (empty)
  Adj-RIB-In from R2:   (empty)
  Loc-RIB:              (empty)
  Adj-RIB-Out to R3:    (empty)
  Adj-RIB-Out to R2:    (empty)

R1 receives from R3 two announcements, made of the routes 10.1/16 AS=(y) and 10.2/16 AS=(y). R1 associates the NEXT-HOP attribute 1.1.1.2 to these routes and writes them in its Adj-RIB-In:

  Adj-RIB-In from R3:   10.1/16 AS=y NEXT-HOP=1.1.1.2
                        10.2/16 AS=y NEXT-HOP=1.1.1.2
  Adj-RIB-In from R2:   (empty)
  Loc-RIB:              (empty)
  Adj-RIB-Out to R3:    (empty)
  Adj-RIB-Out to R2:    (empty)

R1 applies its decision process, and selects both routes:

  Adj-RIB-In from R3:   10.1/16 AS=y NEXT-HOP=1.1.1.2
                        10.2/16 AS=y NEXT-HOP=1.1.1.2
  Adj-RIB-In from R2:   (empty)
  Loc-RIB:              10.1/16 AS=y NEXT-HOP=1.1.1.2
                        10.2/16 AS=y NEXT-HOP=1.1.1.2
  Adj-RIB-Out to R3:    (empty)
  Adj-RIB-Out to R2:    (empty)

R1 prepares to send these new routes to all its E-BGP peers by writing them into Adj-RIB-Out. However, the message is not sent to R3 as it would create a loop in the AS-path.

R1 prepares to send these new routes to all its I-BGP peers by writing them into Adj-RIB-Out to R2:

  Adj-RIB-In from R3:   10.1/16 AS=y NEXT-HOP=1.1.1.2
                        10.2/16 AS=y NEXT-HOP=1.1.1.2
  Adj-RIB-In from R2:   (empty)
  Loc-RIB:              10.1/16 AS=y NEXT-HOP=1.1.1.2
                        10.2/16 AS=y NEXT-HOP=1.1.1.2
  Adj-RIB-Out to R3:    (empty)
  Adj-RIB-Out to R2:    10.1/16 AS=y NEXT-HOP=1.1.1.2
                        10.2/16 AS=y NEXT-HOP=1.1.1.2

Since Adj-RIB-Out to R2 changed, a BGP message is sent to R2.

Similarly, at some time R1 will receive from R2 a BGP message with the routes 10.1/16 AS=y NEXT-HOP=2.2.2.1 and 10.2/16 AS=y NEXT-HOP=2.2.2.1:

  Adj-RIB-In from R3:   10.1/16 AS=y NEXT-HOP=1.1.1.2
                        10.2/16 AS=y NEXT-HOP=1.1.1.2
  Adj-RIB-In from R2:   10.1/16 AS=y NEXT-HOP=2.2.2.1
                        10.2/16 AS=y NEXT-HOP=2.2.2.1
  Loc-RIB:              10.1/16 AS=y NEXT-HOP=1.1.1.2
                        10.2/16 AS=y NEXT-HOP=1.1.1.2
  Adj-RIB-Out to R3:    (empty)
  Adj-RIB-Out to R2:    10.1/16 AS=y NEXT-HOP=1.1.1.2
                        10.2/16 AS=y NEXT-HOP=1.1.1.2

slide-25
SLIDE 25

R1 applies again its decision process. Now it has several possible routes to each prefix. The first applicable rule in the decision process (slide "The Decision Process") says that if a route is learnt from E-BGP it has precedence over a route learnt from I-BGP. Since all routes in Adj-RIB-In from R2 are learnt from I-BGP, and all routes in Adj-RIB-In from R3 are learnt from E-BGP, the winners are the latter, so there is no change in Loc-RIB:

  Adj-RIB-In from R3:   10.1/16 AS=y NEXT-HOP=1.1.1.2
                        10.2/16 AS=y NEXT-HOP=1.1.1.2
  Adj-RIB-In from R2:   10.1/16 AS=y NEXT-HOP=2.2.2.1
                        10.2/16 AS=y NEXT-HOP=2.2.2.1
  Loc-RIB:              10.1/16 AS=y NEXT-HOP=1.1.1.2
                        10.2/16 AS=y NEXT-HOP=1.1.1.2
  Adj-RIB-Out to R3:    (empty)
  Adj-RIB-Out to R2:    10.1/16 AS=y NEXT-HOP=1.1.1.2
                        10.2/16 AS=y NEXT-HOP=1.1.1.2

Since there is no change in Loc-RIB there is no change in Adj-RIB-Out and therefore no message is sent by R1.

slide-26
SLIDE 26

Fundamental Examples, continued

[Figure: AS x contains R1, R21 and R22, connected pairwise by I-BGP. AS y contains R3 and R4 and the networks 10.1/16 and 10.2/16. E-BGP sessions run between AS x and AS y on the links 1.1.1.1 / 1.1.1.2, 2.2.2.2 / 2.2.2.1 and 3.3.3.2 / 3.3.3.1. The announcement "10.1/16 AS=(y)" is sent on one link and "10.2/16 AS=(y)" on the two others.]

There are now 3 BGP routers in AS x. Note that the 3 BGP routers in AS x must have TCP connections with each other (same in AS y, but not shown on the figure). An IGP (for example OSPF) also runs on R1, R21 and R22. All link costs are equal to 1. The announcements made by R3 and R4 are different, as shown on the figure.

slide-27
SLIDE 27

Here is a possible sequence of messages and the resulting states of RIBs, at R1, after each message is processed:

R1 receives message 10.1/16 AS=(y) from R3 over E-BGP:

  Adj-RIB-In from R3:   10.1/16 AS=y NEXT-HOP=1.1.1.2
  Adj-RIB-In from R22:  (empty)
  Adj-RIB-In from R21:  (empty)
  Loc-RIB:              10.1/16 AS=y NEXT-HOP=1.1.1.2
  Adj-RIB-Out to R3:    (empty)
  Adj-RIB-Out to R22:   (empty)
  Adj-RIB-Out to R21:   (empty)

R1 applies its decision process, accepts the route and sends it to R22 and R21:

  Adj-RIB-In from R3:   10.1/16 AS=y NEXT-HOP=1.1.1.2
  Adj-RIB-In from R22:  (empty)
  Adj-RIB-In from R21:  (empty)
  Loc-RIB:              10.1/16 AS=y NEXT-HOP=1.1.1.2
  Adj-RIB-Out to R3:    (empty)
  Adj-RIB-Out to R22:   10.1/16 AS=y NEXT-HOP=1.1.1.2
  Adj-RIB-Out to R21:   10.1/16 AS=y NEXT-HOP=1.1.1.2

Assume R22 also received the E-BGP messages shown on the figure and did its job. R1 receives message 10.2/16 AS=y, NEXT-HOP=2.2.2.1 from R22 over I-BGP, and the decision process accepts it:

  Adj-RIB-In from R3:   10.1/16 AS=y NEXT-HOP=1.1.1.2
  Adj-RIB-In from R22:  10.2/16 AS=y NEXT-HOP=2.2.2.1
  Adj-RIB-In from R21:  (empty)
  Loc-RIB:              10.1/16 AS=y NEXT-HOP=1.1.1.2
                        10.2/16 AS=y NEXT-HOP=2.2.2.1
  Adj-RIB-Out to R3:    (empty)
  Adj-RIB-Out to R22:   10.1/16 AS=y NEXT-HOP=1.1.1.2
  Adj-RIB-Out to R21:   10.1/16 AS=y NEXT-HOP=1.1.1.2

Since the new message is learnt from I-BGP it is not put into any Adj-RIB-Out.

Assume R21 also received the E-BGP messages shown on the figure and did its job. R1 receives message 10.2/16 AS=y, NEXT-HOP=3.3.3.1 from R21 over I-BGP:

  Adj-RIB-In from R3:   10.1/16 AS=y NEXT-HOP=1.1.1.2
  Adj-RIB-In from R22:  10.2/16 AS=y NEXT-HOP=2.2.2.1
  Adj-RIB-In from R21:  10.2/16 AS=y NEXT-HOP=3.3.3.1
  Loc-RIB:              10.1/16 AS=y NEXT-HOP=1.1.1.2
                        10.2/16 AS=y NEXT-HOP=2.2.2.1
  Adj-RIB-Out to R3:    (empty)
  Adj-RIB-Out to R22:   10.1/16 AS=y NEXT-HOP=1.1.1.2
  Adj-RIB-Out to R21:   10.1/16 AS=y NEXT-HOP=1.1.1.2

The decision process now has to choose between two routes with the same destination prefix 10.2/16. Both were learnt from I-BGP, so we apply criterion 5 in slide "The Decision Process". The distance, computed by the IGP, to 2.2.2.1 is 3 and the distance to 3.3.3.1 is 2. Thus the route that has NEXT-HOP=3.3.3.1 is preferred by the decision process:

  Adj-RIB-In from R3:   10.1/16 AS=y NEXT-HOP=1.1.1.2
  Adj-RIB-In from R22:  10.2/16 AS=y NEXT-HOP=2.2.2.1
  Adj-RIB-In from R21:  10.2/16 AS=y NEXT-HOP=3.3.3.1
  Loc-RIB:              10.1/16 AS=y NEXT-HOP=1.1.1.2
                        10.2/16 AS=y NEXT-HOP=3.3.3.1
  Adj-RIB-Out to R3:    (empty)
  Adj-RIB-Out to R22:   10.1/16 AS=y NEXT-HOP=1.1.1.2
  Adj-RIB-Out to R21:   10.1/16 AS=y NEXT-HOP=1.1.1.2

Since the new message is learnt from I-BGP it is not put into any Adj-RIB-Out.

slide-28
SLIDE 28

Questions

  • What is the difference between I-BGP and IGP?
  • What is the difference between I-BGP and E-BGP?
  • What is the difference between a RIB and a Routing Table?

slide-29
SLIDE 29

How are routes injected into BGP

BGP propagates route information, but how is this bootstrapped?

Two methods

  • Static configuration: tell this BGP router which are the prefixes owned by this domain
  • Import from IGP: tell this router to copy all the prefixes that the IGP has learnt
      • Assumes the IGP either does not propagate external prefixes, or has a way to differentiate them
      • Such routes are sent to E-BGP neighbours only, with ORIGIN=IGP

slide-30
SLIDE 30

  • 2. How Forwarding Entries learnt by BGP are written into Routing Tables

So far, we have seen how BGP routers learn about all the prefixes in the world. It remains to see how they inform all the routers in their AS about these prefixes ("write forwarding entries" in slide "Inside BGP"). This is the topic of this section.

There are two possible ways for this, usually mutually exclusive:

  • Redistribution: routes learnt by BGP are passed to IGP (ex: OSPF)
      • Called "redistribution of BGP into OSPF"
      • OSPF propagates the routes using type 5 (AS-external) LSAs to all routers in the OSPF cloud
      • Not found much in practice
  • Injection: routes learnt by BGP are written into the forwarding table of this router
      • Routes do not propagate; this helps only this router

slide-31
SLIDE 31

Redistribution Example

[Figure: R5 in AS x owns 18.1/16 (address 2.2.2.2) and has an E-BGP session with R6. R6, R1 and R2 are in AS z, which runs OSPF as IGP; R6 and R2 are connected by I-BGP. R2 has an E-BGP session with R4 in AS y.]

  • R5 advertises 18.1/16 to R6 via E-BGP
  • R6 transmits it to R2 via I-BGP
      • TCP connection between R6 and R2
  • (redistribute BGP into IGP) R6 injects 18.1/16 into IGP (OSPF)
      • OSPF propagates 18.1/16 (type 5 LSA) and updates forwarding tables
  • After OSPF converges, R1 and R2 now have a route to 18.1/16
  • R2 advertises the route to R4 via E-BGP
      • (synchronize with IGP) R2 must wait for the OSPF entry to 18.1/16 before advertising via E-BGP
  • Packet to 18.1/16 from AS y finds forwarding table entries in R2, R1 and R6
slide-32
SLIDE 32

32

Example with Re‐Distribution

by ___, F learns from G the route D2‐D1‐n1 C redistributes the external route D2:n2 into OSPF b D l th t D2 2 b D l th t DYD2 2 D by ____, D learns the route D2:n2; by ____ D learns the route DYD2:n2; D selects D2:n2 and does not redistribute it to OSPF by ___, B learns the route D2:n2 by ___, A learns the route DX:D2:n2 by ___, L learns the route to n2 via C

D1 = EPFL D2 = UCB n1 D1: n1 n2 A C F D2 : n2 G

32

DX DY D2 : n2 DY D2: n2 D1: n1 B D E L physical link TCP connection

solution

slide-33
SLIDE 33

Re-Distribution is Considered Harmful

In practice, operators avoid re-distribution of BGP into IGP

  • Large number of routing entries in IGP
  • Convergence time after failures is large if IGP has many routing table entries

Therefore, injection is used instead.

Injection assumes that routers in the AS employ recursive table lookup

  • When an IP packet is submitted to a router, the forwarding table may indicate a "NEXT-HOP" which is not on-link with the router
  • A second table lookup needs to be done to resolve the next-hop into an on-link neighbour
  • In practice, the second lookup may be done in advance (not in real time) by pre-processing the routing table

slide-34
SLIDE 34

Example: Recursive Table Lookup

At R1, a data packet to 10.1.x.y is received. The forwarding table at R1 is looked up.

Q: what are the next events?

Forwarding Table at R1

  To          NEXT-HOP    layer-2 addr
  10.1/16     2.2.2.63    N/A
  2.2.2.63    2.2.2.33    x09:F1:6A:33:76:21

[Figure: R1 is connected to R2 (2.2.2.33); further routers R3 and an unnamed router R carry the addresses 2.2.2.63 and 2.2.2.93; 10.1/16 is reached through 2.2.2.63.]
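Recursive table lookup on the forwarding table of this slide can be sketched as follows. This is an illustration under assumptions: the functions `lookup` and `resolve`, the encoding of the host entry 2.2.2.63 as a /32 prefix, the use of `None` for "N/A" and the destination 10.1.2.3 are all hypothetical.

```python
import ipaddress

def lookup(table, address):
    """Longest prefix match; returns (next_hop, layer2_addr) or None."""
    addr = ipaddress.ip_address(address)
    best = None
    for prefix, entry in table.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, entry)
    return best[1] if best else None

def resolve(table, address, max_depth=8):
    """Look up again on NEXT-HOP until an entry has a layer-2 address (on-link)."""
    for _ in range(max_depth):
        entry = lookup(table, address)
        if entry is None:
            return None
        next_hop, layer2 = entry
        if layer2 is not None:
            return next_hop, layer2   # on-link neighbour found
        address = next_hop            # NEXT-HOP is not on-link: second lookup
    return None

table_r1 = {
    "10.1.0.0/16": ("2.2.2.63", None),                  # layer-2 addr N/A
    "2.2.2.63/32": ("2.2.2.33", "x09:F1:6A:33:76:21"),
}
print(resolve(table_r1, "10.1.2.3"))
```

The `max_depth` bound is a defensive choice so that a misconfigured table cannot make the resolution loop forever.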

slide-35
SLIDE 35

Injection Example

[Figure: same topology as in the Redistribution Example: R5 in AS x (18.1/16, address 2.2.2.2), R6, R1 and R2 in AS z with I-BGP between R6 and R2 and OSPF as IGP, R4 in AS y.]

  • R5 advertises 18.1/16, NEXT-HOP = 2.2.2.2 to R6 via E-BGP
  • R6 injects 18.1/16, NEXT-HOP = 2.2.2.2 into its local forwarding table (does not re-distribute into OSPF)
  • R2 learns the route from R6 via I-BGP
  • R2 injects 18.1/16, NEXT-HOP = 2.2.2.2 into its local forwarding table
  • A data packet to 18.1.2.3 is received by R2
      • Recursive table lookup at R2 can be used
  • Q: there is a problem at R1: how can we solve it?

slide-36
SLIDE 36

Injection in Practice Requires all Routers to Run BGP

[Figure: R5 in AS x (18.1/16, address 2.2.2.2), R6, R1 and R2 in AS z (address 2.2.20.1 shown) with I-BGP and OSPF as IGP, R4 in AS y; E-BGP between the ASs.]

Practical solution often deployed

  • All core routers also run I-BGP (in addition to IGP)
  • Routes learnt from BGP are injected into the local forwarding table
  • Recursive table lookup is done at all routers
  • Q: repeat the sequence of the previous slide with this new assumption

Potential problem: I-BGP mesh -> use reflectors

IGP handles only internal networks: very few

slide-37
SLIDE 37

Questions

  • What is the difference between injection and re-distribution?
  • Can BGP routes be learnt by a non-BGP router?
  • What is recursive table lookup? Where is it needed?

slide-38
SLIDE 38

  • 3. Other Bells and Whistles

MULTI-EXIT-DISC (MED)

[Figure: AS x (R1, R2) is connected to AS y (R3, R4, networks 10.1/16 and 10.2/16) over two E-BGP links. On one link AS y advertises 10.1/16 MED=10 and 10.2/16 MED=50; on the other it advertises 10.1/16 MED=50 and 10.2/16 MED=10. A packet to 10.1.2.3 and a packet to 10.2.3.4 each take the link with the lower MED.]

One AS connected to another over several links

  • ex: multinational company connected to worldwide ISP
  • AS y advertises its prefixes with different MEDs (low = preferred)
  • If AS x accepts to use the MEDs put by AS y: traffic goes on the preferred link

slide-39
SLIDE 39

MED Example

Q1: by which mechanisms will R1 and R2 make sure that packets to ASy use the preferred links?
Q2: router R3 crashes; can 10.1/16 still be reached? Explain the sequence of actions.

slide-40
SLIDE 40

MED Question

Q1: Assume now ASx and ASy are peers (ex: both are ISPs). Explain why ASx is not interested in taking MED into account.
Q2: By which mechanisms can ASx pick the nearest route to ASy?

slide-41
SLIDE 41

LOCAL-PREF

[Figure: AS x contains R1, R2 and R6, connected by I-BGP. R6 and R2 each receive routes over E-BGP; R6 assigns pref=100, R2 assigns pref=10.]

  • Used inside an AS to select a best AS path
  • Assigned by the border router when receiving a route over E-BGP
      • Propagated without change over I-BGP

Example

  • R6 associates pref=100, R2 pref=10
  • R1 chooses the largest preference
  • configuration command: bgp default local-preference pref-value

slide-42
SLIDE 42

LOCAL-PREF Example

Q1: The link AS2-AS4 is expensive. How should AS4 set local-prefs on routes received from AS3 and AS2 in order to route traffic preferably through AS3?
Q2: Explain the sequence of events for R1, R2 and R3.

[Figure: AS1 (10.1/16) announces "AS1: 10.1/16" to AS2 and AS3; AS2 and AS3 are both connected to AS4, which contains R1, R2 and R3; AS5 is attached to AS4.]

slide-43
SLIDE 43

LOCAL-PREF Question

Q: Compare MED to LOCAL-PREF

slide-44
SLIDE 44

Weight

This is a route attribute given by a Cisco or Zebra router

  • It remains local to this router
  • Never propagated to other routers, even in the same cloud
  • Therefore there is no weight attribute in route announcements

slide-45
SLIDE 45

Route Flap Dampening

Route modification propagates everywhere.

Sometimes routes are flapping

  • successive UPDATE and WITHDRAW
  • caused for example by a BGP speaker that often crashes and reboots

Solution:

  • decision process eliminates flapping routes

How

  • withdrawn routes are kept in Adj-RIB-in
  • if a route comes up again soon (i.e. it flaps), it receives a penalty
  • if penalty ≥ suppress-limit, the route is not selected
  • the penalty fades out exponentially, see next slide

slide-46
SLIDE 46

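Route Flap Dampening

[Figure: penalty as a function of time, with two horizontal thresholds, suppress-limit and reuse-limit. The penalty rises above suppress-limit at time t1 and decays below reuse-limit at time t2.]

Route suppressed at t1, restored at t2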
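The exponential fade-out of the penalty can be illustrated numerically. This is a sketch under assumptions: the half-life parametrisation, the function names `penalty_at` and `time_to_reuse`, and all numeric values (penalty 3000, reuse-limit 750, half-life 900 s) are hypothetical and not taken from the slides.

```python
import math

def penalty_at(penalty0, elapsed, half_life):
    """Penalty remaining after `elapsed` seconds of exponential decay."""
    return penalty0 * math.exp(-math.log(2) * elapsed / half_life)

def time_to_reuse(penalty0, reuse_limit, half_life):
    """Seconds until a penalty decays from penalty0 down to reuse_limit."""
    if penalty0 <= reuse_limit:
        return 0.0
    return half_life * math.log2(penalty0 / reuse_limit)

# Hypothetical values: the route is restored once the penalty reaches reuse-limit
print(penalty_at(3000, 900, 900))
print(time_to_reuse(3000, 750, 900))
```

In the notation of the figure, `time_to_reuse` corresponds to t2 - t1 when no further flap occurs after t1 and the penalty at t1 is `penalty0`.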

slide-47
SLIDE 47

Avoid I-BGP Mesh: Confederations

[Figure: AS z is split into sub-ASs P1, P2 and P3. I-BGP runs inside each sub-AS, E-BGP runs between the sub-ASs and towards the outside of AS z.]

AS decomposed into sub-ASs

  • private AS number
  • similar to OSPF areas
  • I-BGP inside a sub-AS (full interconnection)
  • E-BGP between sub-ASs

slide-48
SLIDE 48

Avoid I-BGP Mesh: Route reflectors

[Figure: AS z contains three clusters (cluster 1, cluster 2, cluster 3), each with a route reflector (RR). I-BGP sessions connect the clients to their RR and the RRs to each other; E-BGP sessions lead outside AS z.]

Cluster of routers

  • one I-BGP session between one client and RR
  • CLUSTER_ID

Route reflector

  • re-advertises a route learnt via I-BGP
  • to avoid loops: ORIGINATOR_ID attribute associated with the advertisement

slide-49
SLIDE 49

I-BGP configuration

[Figure: two routers in AS z with an I-BGP session between their loopback interfaces (lo0).]

I-BGP configured on the loopback interface (lo0)

  • interface always up
  • IP address associated with the interface
  • IGP routing guarantees packet forwarding to the interface

slide-50
SLIDE 50

Avoid E-BGP mesh: Route server

[Figure: several routers connected by E-BGP to one route server.]

At interconnection point

  • Instead of n(n-1)/2 peer to peer E-BGP connections: n connections to the Route Server
  • To avoid loops, the ADVERTISER attribute indicates which router in the AS generated the route
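The saving obtained with a route server follows directly from the two counts on this slide. The short sketch below compares them; the function names and the example value n = 20 are hypothetical.

```python
def full_mesh_sessions(n):
    """Number of peer to peer E-BGP connections among n routers."""
    return n * (n - 1) // 2

def route_server_sessions(n):
    """One connection per router to the route server."""
    return n

n = 20  # hypothetical number of routers at the interconnection point
print(full_mesh_sessions(n), route_server_sessions(n))
```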

slide-51
SLIDE 51

Communities

Other attributes can be associated with routes in order to simplify rules. They are called "communities".

  • Pre-defined: example: NO-EXPORT (a well known, pre-defined attribute); see later for an example
  • Defined by one AS (a label of the form ASN:x where ASN = AS number, x = a 2-byte number)

slide-52
SLIDE 52

NO-EXPORT

Written on E-BGP by one AS, transmitted on I-BGP by the accepting AS, not forwarded.

Example: AS2 has different routes to AS1 but AS2 sends only one aggregate route to AS3

  • simplifies the aggregation rules at AS2

What is the route followed by a packet sent to 2.2.48 received by R4?

[Figure: routers R1 to R5. The routes 2.2.0/17 and 2.2.128/17 are each announced on one link with NO-EXPORT, and 2.2/16 is announced on both; further on, only 2.2/16 is announced towards R4.]

slide-53
SLIDE 53

  • C. Examples

A network operator implements their policy by selecting the rules for

  • Import policy (accept route or not, if required modify/add/delete attributes)
  • Export policy (export or not a route, decide which attribute value)

We illustrate two scenarios:

  • Dual Homing
  • Hot potato routing

slide-54
SLIDE 54

Ex1: Stub Area

[Figure: router R1 in the Client AS is connected to router R2 in the Operator AS.]

  • BGP not needed between Client and Operator
  • No AS number for client
  • R2 learns all prefixes in Client by static configuration or RIP on link R1-R2
  • Example: EPFL and Switch
  • what if R1 fails?

slide-55
SLIDE 55

55

Ex2: Stub Area, Dual Homing to Single Provider

[Figure: Provider AS x (routers R1, R2) and Client AS y (routers R3, R4; prefixes 10.1/16 and 10.2/16) connected by two E-BGP links; announcements 10.1/16 MED=10 and 10.2/16 MED=50 on one link, 10.1/16 MED=50 and 10.2/16 MED=10 on the other]

With numbered Client AS
  • Use MED to share traffic from ISP to Client on two links
  • Use Client IGP configuration to share traffic from Client to two links

Q1: is it possible to avoid distributing BGP routes into Client IGP ?
Q2: is it possible to avoid assigning an AS number to Client ?
Q3: is it possible to avoid BGP between Client and Provider ?
solution

slide-56
SLIDE 56

56

Ex3: Stub Area, Dual Homing to Several Providers

[Figure: Client AS 100 (routers R3, R4; prefixes 2.0/17 and 2.1/17) with E-BGP links to the Providers; ASs x, y and z; routers R1, R2]

Client has own address space and AS number
Q: how can routes be announced between AS 100 and AS x ? AS x and AS z ?
Q: assume Client wants most traffic to favour AS x. How can that be done ?
solution

slide-57
SLIDE 57

57

Ex4: Hot Potato Routing

[Figure: Customer 1, ISP1 (routers R11, R12), ISP2 (routers R21, R22), Customer 2]

Packets from Customer 2 to Customer 1
  • Both R21 and R22 have a route to Customer 1
  • Shortest path routing favours R21
  • Q1: by which mechanism is that done ?

Q2: what is the path followed in the reverse direction ?

solution

slide-58
SLIDE 58

58

  • D. Illustrations: The Switch Network

www.switch.ch


slide-59
SLIDE 59

59

An Interconnection Point


slide-60
SLIDE 60

60


slide-61
SLIDE 61

61

from www.ris.ripe.net: all routes to 128.178.0.0/15 on RIPE Route Collectors

Type | Prefix | Time | Peer | Next HOP | MED | Origin | AS path | Community | RRC ID
A | 128.178.0.0/15 | 2003-10-02 05:05:49Z | 129.250.0.232 | 129.250.0.232 | 9 | Not defined | 2914 1299 559 | 2914:420 2914:2000 2914:3000 | RIPE NCC
A | 128.178.0.0/15 | 2003-10-02 06:16:00Z | 193.10.252.5 | 193.10.252.5 | | IGP | 2603 3356 1299 559 | 2603:666 3356:2 3356:86 3356:507 3356:666 3356:2076 | Netnod
A | 128.178.0.0/15 | 2003-10-02 06:16:17Z | 194.68.48.1 | 194.68.48.1 | | IGP | 12381 1653 2603 20965 559 | 12381:1653 | Netnod
A | 128.178.0.0/15 | 2003-10-02 06:16:37Z | 194.68.48.1 | 194.68.48.1 | | IGP | 12381 1653 2603 3356 1299 559 | 12381:1653 | Netnod
A | 128.178.0.0/15 | 2003-10-02 06:21:08Z | 193.10.252.5 | 193.10.252.5 | | IGP | 2603 20965 559 | 2603:222 2603:666 20965:155 | Netnod
A | 128.178.0.0/15 | 2003-10-02 06:21:17Z | 194.68.48.1 | 194.68.48.1 | | IGP | 12381 1653 2603 20965 559 | 12381:1653 | Netnod
A | 128.178.0.0/15 | 2003-10-02 07:24:06Z | 129.250.0.232 | 129.250.0.232 | 9 | Not defined | 2914 3549 559 | 2914:420 |

slide-62
SLIDE 62

62

The World seen from EPFL


http://www.ris.ripe.net/bgpviz/

slide-63
SLIDE 63

63

Some statistics

Source: http://www.cidr‐report.org

[Figures: number of address prefixes; number of ASs]

slide-64
SLIDE 64

64

Number of hosts


slide-65
SLIDE 65

65


slide-66
SLIDE 66

66

Exercise

What ASs does EPFL receive service from ?
What ASs does Switch receive service from ?
Find the names of the networks that have these AS numbers

slide-67
SLIDE 67

67

Exercise

Lookup http://rpsl.info.ucl.ac.be to find out the relationships between Switch and other providers
How does the software on this site decide whether a relationship is client, provider or peer ?

slide-68
SLIDE 68

68

Further Reading

Slow convergence after route suppression – “BGP path exploration”, similar to (but worse than) distance vector slow convergence. Is in the nature of path vector routing with explicit suppression.

Craig Labovitz, Abha Ahuja, Abhijit Bose, Farnam Jahanian: Delayed Internet routing convergence. IEEE/ACM Trans. Netw. 9(3): 293‐306 (2001)

Route flap dampening slows down convergence

Zhuoqing Morley Mao, Ramesh Govindan, George Varghese, Randy H. Katz: Route flap damping exacerbates internet routing convergence. SIGCOMM 2002: 221‐233

Path vector + policy may suffer from incompatibilities (loops)

The stable paths problem and interdomain routing Griffin, T.G.; Shepherd, F.B.; Wilfong, G. ACM/IEEE ToN April 2002, Page(s): 232‐243

slide-69
SLIDE 69

69

References

Timothy Griffin’s home page at Intel
RFC 1771 (BGP‐4)
C. Huitema, “Le Routage dans l’Internet”
John W. Stewart III, “BGP 4”
www.ris.ripe.net: AS paths
www.cidr‐report.org: aggregation statistics
www.caida.org: map of Internet
rpsl.info.ucl.ac.be: relations between ASs

slide-70
SLIDE 70

70

Solutions


slide-71
SLIDE 71

71

Path Vector Routing

  • Q. Explain how E can choose the best paths to n1 and n2
  • A. E receives the routes “B A n1” and “D C A n1”. E selects as best routes the ones with shorter AS path.
  • Q. How can loops be avoided ?
  • A. BGP routers recognize looping announcements by the repetition of the same AS in the path. Such announcements are discarded

[Figure: ASs A, B, C, D, E with networks n1 to n5. Announcements: A: n1,n2; B A: n1,n2; B: n5; C A: n1,n2; C: n3; D C A: n1,n2; D C: n3; D: n4]

BGP table in E

dest   AS path
n1     B A
n2     B A
n3     D C
n4     D
n5     B
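The two answers above (prefer the shorter AS path, discard paths that already contain one's own AS) can be sketched as follows. This is a simplified illustration, not the full BGP decision process; the function names and the looping announcement are assumptions added for the example.

```python
def accept(as_path, own_as):
    """Discard an announcement whose AS path already contains our own AS (a loop)."""
    return own_as not in as_path


def best_paths(announcements, own_as):
    """Keep, for each destination, the accepted announcement with the shortest AS path."""
    best = {}
    for dest, as_path in announcements:
        if not accept(as_path, own_as):
            continue
        if dest not in best or len(as_path) < len(best[dest]):
            best[dest] = as_path
    return best


# Announcements received by E, taken from the slide, plus one hypothetical loop.
received_by_e = [
    ("n1", ["B", "A"]), ("n1", ["D", "C", "A"]),
    ("n2", ["B", "A"]), ("n2", ["D", "C", "A"]),
    ("n3", ["D", "C"]), ("n4", ["D"]), ("n5", ["B"]),
    ("n1", ["B", "E", "A"]),  # hypothetical looping announcement, must be discarded
]

table_in_e = best_paths(received_by_e, "E")
for dest in sorted(table_in_e):
    print(dest, " ".join(table_in_e[dest]))
```

With these inputs the result matches the BGP table in E shown on the slide.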

slide-72
SLIDE 72

72

Q1: Which of the following announcements do you think router D will send to router E ?
  • 1. D‐>E: D2 X:n1
  • 2. D‐>E: D1 X:n1
A: 1 only; 2 cannot be sent because over I‐BGP we do not repeat a route learnt from another BGP speaker in the same ARD

Q2: Assume that in the figure below the router R does not run BGP. Is this possible ?
A: yes.

Q3: Do BGP speakers in the same ARD need to be on‐link ?
A: no; C and F are not directly connected (are not on‐link)

[Figure: domains D1 to D5 and X; routers A to H and R; E-BGP sessions between domains, I-BGP sessions inside a domain; announcement X:n1]

slide-73
SLIDE 73

73

Aggregation Example 2

[Figure: AS1 (197.8.2.0/24), AS2 (197.8.3.0/24, with n2), AS3 and AS4 (with n4); AS4 receives 197.8.2.0/23 and 197.8.3.0/24]

Q: what happens to packets from n4 to n2 ?
A: There are two routing entries in AS4 routers: one for 197.8.2.0/23 and one for 197.8.3.0/24. Longest prefix match in the packet forwarding algorithm ensures that packets to n2 go on the direct route (sent to AS2).

Q: If AS4 injects only the aggregate route into its local routing tables, what is the answer now ?
A: There is now only one entry (197.8.2.0/23) and all packets to n2 are sent to AS3
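Both answers follow from longest prefix match, which can be reproduced with Python's standard `ipaddress` module. The sketch below is illustrative only: the address chosen for n2 (197.8.3.7) and the next-hop labels "AS2" and "AS3" are assumptions.

```python
import ipaddress


def longest_prefix_match(table, destination):
    """Return the next hop of the most specific entry that contains the destination."""
    addr = ipaddress.ip_address(destination)
    matching = [(net, next_hop) for net, next_hop in table if addr in net]
    if not matching:
        return None
    return max(matching, key=lambda entry: entry[0].prefixlen)[1]


# Routing entries in AS4 when both routes are injected.
both_routes = [
    (ipaddress.ip_network("197.8.2.0/23"), "AS3"),
    (ipaddress.ip_network("197.8.3.0/24"), "AS2"),
]
# Routing entries in AS4 when only the aggregate is injected.
aggregate_only = both_routes[:1]

n2 = "197.8.3.7"  # hypothetical address of n2 inside 197.8.3.0/24

print(longest_prefix_match(both_routes, n2))
print(longest_prefix_match(aggregate_only, n2))
```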

slide-74
SLIDE 74

74

Aggregation Example 3

[Figure: AS1 (197.8.2.0/24), AS2 (197.8.3.0/24, with n2), AS3, AS4 (with n4), AS5 and AS6; announcements 197.8.2.0/23 and 197.8.3.0/24]

AS4 receives
  197.8.2.0/23 AS_PATH: 3 {1 2}
  197.8.3.0/24 AS_PATH: 6 5 2

Both routes are received by AS4
Q: what happens for packets from n4 to n2 ?
A: they go the long way

Assume now that only shortest AS paths routes are injected into routing tables
Q: what happens to packets from n4 to n2 ?
A: they go via AS3

slide-75
SLIDE 75

75

Example Without Aggregation

[Figure: AS1, AS2, AS3 and AS4 with prefixes 197.8.0.0/23, 197.8.2.0/24 and 197.8.3.0/24, announced without aggregation]

Q: If AS3 does not aggregate, what are the routes announced by AS 4 ? Is there any benefit ?
A:
  197.8.0.0/23 AS_PATH: 4 3
  197.8.2.0/24 AS_PATH: 4 3 1
  197.8.3.0/24 AS_PATH: 4 2
A: there is no benefit since all routes go via AS 4 anyhow. AS4 should aggregate.

slide-76
SLIDE 76

76

NEXT‐HOP

[Figure: AS x, AS y (10.1/16, 10.2/16) and AS z; routers R1 to R6; E-BGP sessions between ASs, I-BGP sessions inside an AS]

R3 advertises 10.2/16 to R1, NEXT‐HOP = R4 IP address
R6 advertises 10.2/16 to R5, NEXT‐HOP = R6 IP address

  • Q. where is such a scenario likely to happen ?
  • A. in interconnection points with many providers interconnected on one LAN

slide-77
SLIDE 77

77

MED Example

Q1: by which mechanisms will R1 and R2 make sure that packets to ASy use the preferred links ?
A:
  • R1 and R2 exchange their routes to AS y via I‐BGP
  • R1 has 2 routes to 10.1/16, one learnt by E‐BGP (MED=10), one learnt by I‐BGP (MED=50); prefers the former
  • R1 has 2 routes to 10.2/16, prefers the latter (route via R2)

Q2: router R3 crashes; can 10.1/16 still be reached ? explain the sequence of actions.
A:
  • R1 clears routes to ASy learnt from R3 (keep‐alive mechanism)
  • R2 is informed of the route suppression by I‐BGP
  • R2 has now only 1 route to 10.1/16 and 1 route to 10.2/16; traffic to 10.1/16 now goes to R2
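The choice made by R1 can be sketched as "lowest MED wins" among routes to the same prefix. Only this one step of the decision process is modelled; the `Route` class and its field names are assumptions, and the MED values are those of the Ex2 figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str
    med: int
    learnt_via: str  # "E-BGP" (directly from AS y) or "I-BGP" (from R2)


def prefer_lowest_med(routes):
    """Among routes to the same prefix, keep the one with the lowest MED."""
    return min(routes, key=lambda route: route.med)


# Routes known to R1 after R1 and R2 have exchanged their routes over I-BGP.
routes_to_10_1 = [Route("10.1/16", 10, "E-BGP"), Route("10.1/16", 50, "I-BGP")]
routes_to_10_2 = [Route("10.2/16", 50, "E-BGP"), Route("10.2/16", 10, "I-BGP")]

print(prefer_lowest_med(routes_to_10_1).learnt_via)
print(prefer_lowest_med(routes_to_10_2).learnt_via)
```

If the E-BGP route to 10.1/16 is cleared, as in Q2, the I-BGP route is the only candidate left and is selected.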

slide-78
SLIDE 78

78

MED Question

Q1: Assume now ASx and ASy are peers (ex: both are ISPs). Explain why ASx is not interested in taking MED into account.
A: ASx is interested in sending traffic to ASy to the nearest exit, avoiding transit inside ASx as much as possible. Thus ASx will choose the nearest route to ASy, and will ignore MEDs

Q2: By which mechanisms can ASx pick the nearest route to ASy ?
A: it depends on the IGP. With OSPF: all routes to ASy are injected into OSPF by means of type 5 LSAs. These LSAs say: send to router R3 or R4. Every OSPF router inside ASx knows the cost (determined by OSPF weights) of the path from self to R3 and R4. Packets to 10.1/16 and 10.2/16 are routed to the nearest among R3 and R4 (nearest = lowest OSPF cost).
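The "nearest exit" rule amounts to picking the exit router with the lowest IGP cost. The sketch below assumes the OSPF costs are already known to the router; the cost values and the function name are made up for illustration.

```python
def nearest_exit(igp_cost_to_exit):
    """Return the exit router with the lowest IGP (e.g. OSPF) cost from this router."""
    return min(igp_cost_to_exit, key=igp_cost_to_exit.get)


# Hypothetical OSPF costs from two different routers inside ASx to the exits R3 and R4.
costs_seen_by_first_router = {"R3": 12, "R4": 30}
costs_seen_by_second_router = {"R3": 25, "R4": 7}

print(nearest_exit(costs_seen_by_first_router))
print(nearest_exit(costs_seen_by_second_router))
```

Different routers inside ASx may therefore use different exits for the same destination prefix.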

slide-79
SLIDE 79

79

LOCAL‐PREF Example

Q1: The link AS2‐AS4 is expensive. How should AS 4 set local‐prefs on routes received from AS 3 and AS 2 in order to route traffic preferably through AS 3 ?
A: for example: set LOCAL‐PREF to 100 to all routes received from AS 3 and to 50 to all routes received from AS 2

[Figure: AS 1 (10.1/16), AS 2, AS 3, AS 4 (routers R1, R2, R3) and AS 5; announcements AS1: 10.1/16 and AS3 AS1: 10.1/16]

  • Q2: Explain the sequence of events for R1, R2 and R3
A:
  • R1 receives the route AS2 AS1 10.1/16 over E-BGP; sets LOCAL-PREF to 50
  • R2 receives the route AS3 AS1 10.1/16 over E-BGP; sets LOCAL-PREF to 100
  • R3 receives AS2 AS1 10.1/16, LOCAL-PREF=50 from R1 over I-BGP and AS3 AS1 10.1/16, LOCAL-PREF=100 from R2 over I-BGP
  • R3 selects AS3 AS1 10.1/16, LOCAL-PREF=100 and installs it into local-RIB
  • R3 announces only AS3 AS1 10.1/16 to AS 5
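The sequence for R1, R2 and R3 can be sketched as an import policy that tags routes with LOCAL-PREF, followed by a selection that keeps the highest value. Only this decision step is modelled; the class, the function names and the string labels for ASs are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str
    as_path: tuple
    local_pref: int


# Import policy of AS 4, following the answer to Q1.
LOCAL_PREF_BY_NEIGHBOUR = {"AS3": 100, "AS2": 50}


def import_route(prefix, as_path):
    """Tag a route received over E-BGP with the LOCAL-PREF of the neighbour AS it came from."""
    return Route(prefix, tuple(as_path), LOCAL_PREF_BY_NEIGHBOUR[as_path[0]])


def select(routes):
    """Keep the route with the highest LOCAL-PREF."""
    return max(routes, key=lambda route: route.local_pref)


at_r1 = import_route("10.1/16", ["AS2", "AS1"])  # received by R1 over E-BGP
at_r2 = import_route("10.1/16", ["AS3", "AS1"])  # received by R2 over E-BGP
best_at_r3 = select([at_r1, at_r2])              # R3 learns both over I-BGP

print(best_at_r3)
```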

slide-80
SLIDE 80

80

LOCAL‐PREF Question

Q: Compare MED to LOCAL‐PREF
A:
  • MED is used between ASs (i.e. over E‐BGP); LOCAL‐PREF is used inside one AS (over I‐BGP)
  • MED is used to tell one provider AS which entry link to prefer; LOCAL‐PREF is used to tell the rest of the world which AS path we want to use, by not announcing the other ones.

slide-81
SLIDE 81

81

Example with Re‐Distribution

  • by I‐BGP, F learns from G the route to D2‐D1‐n1
  • C redistributes the external route D2:n2 into OSPF
  • by I‐BGP, D learns the route D2:n2; by E‐BGP D learns the route DY D2:n2; D selects D2:n2 and does not redistribute it to OSPF
  • by I‐BGP, B learns the route D2:n2 from C
  • by E‐BGP, A learns the route DX:D2:n2
  • by OSPF, L learns the route to n2 via C

[Figure: D1 = EPFL (network n1), D2 = UCB (network n2), domains DX and DY; routers A, B, C, D, E, F, G, L; announcements D1: n1, D2: n2, DY D2: n2. Legend: I-BGP = internal BGP, E-BGP = external BGP, physical link, TCP connection]

slide-82
SLIDE 82

82

Example: Recursive Table Lookup

At R1, data packet to 10.1.x.y is received
The forwarding table at R1 is looked up

Q: what are the next events ?
A: first, the next‐hop 2.2.2.63 is found; a second lookup for 2.2.2.63 is done; the packet is sent to MAC address x09:F1:6A:33:76:21

Forwarding table at R1

To          NEXT-HOP    layer-2 addr
10.1/16     2.2.2.63    N/A
2.2.2.63    2.2.2.33    x09:F1:6A:33:76:21

[Figure: routers R1, R2, R3; addresses 2.2.2.33, 2.2.2.63, 2.2.2.93; destination 10.1/16]
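The two-step lookup can be sketched as a loop that repeats the table lookup on the next hop until an entry with a layer-2 address is found. The table contents are those of the slide; the function names, the recursion limit and the destination 10.1.7.9 are assumptions for illustration.

```python
import ipaddress

# Forwarding table at R1: (destination, next hop, layer-2 address or None for N/A).
TABLE = [
    (ipaddress.ip_network("10.1.0.0/16"), "2.2.2.63", None),
    (ipaddress.ip_network("2.2.2.63/32"), "2.2.2.33", "x09:F1:6A:33:76:21"),
]


def lookup(table, address):
    """Longest prefix match: return the most specific entry containing the address."""
    addr = ipaddress.ip_address(address)
    matching = [entry for entry in table if addr in entry[0]]
    if not matching:
        return None
    return max(matching, key=lambda entry: entry[0].prefixlen)


def resolve(table, destination, max_depth=8):
    """Repeat the lookup on the next hop until an entry has a layer-2 address."""
    target = destination
    for _ in range(max_depth):
        entry = lookup(table, target)
        if entry is None:
            raise LookupError(f"no route to {target}")
        _, next_hop, layer2 = entry
        if layer2 is not None:
            return next_hop, layer2
        target = next_hop  # recurse on the next hop
    raise LookupError("recursion limit reached")


print(resolve(TABLE, "10.1.7.9"))  # hypothetical destination inside 10.1/16
```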

slide-83
SLIDE 83

83

Avoid Redistribution: Combine Recursive Lookup and NEXT‐HOP

[Figure: AS x, AS y and AS z; routers R1, R2, R4, R5, R6; prefix 18.1/16; address 2.2.2.2; E-BGP between ASs, I-BGP and IGP (OSPF) inside an AS]

Data packet to 18.1.2.3 is received by R2

Q: there is a problem at R1: how can we solve it ?
A: the problem is: Packet is sent by R2 towards R1, which is the neighbour towards 2.2.2.2. At R1, there is no forwarding entry to 18.1/16 – blackhole ! A solution would be to use loose source routing: R2 adds 2.2.2.2 as loose source routing info into packet. Another solution is label switching (see MPLS module), also called tag switching: R2 establishes a label switched path to 2.2.2.2 and forwards the packet on this path. In practice, however, another solution is used, as shown next.

slide-84
SLIDE 84

84

Avoid Redistribution: Practical Solution

[Figure: AS x, AS y and AS z; routers R1, R2, R4, R5, R6; prefix 18.1/16; addresses 2.2.2.2 and 2.2.20.1; E-BGP between ASs, I-BGP and IGP (OSPF) inside an AS]

Q: repeat the sequence of previous slide with this new assumption
  • R5 advertises 18.1/16, NEXT-HOP = 2.2.2.2 to R6 via E‐BGP
  • R6 transmits 18.1/16, NEXT-HOP = 2.2.2.2 to R1 and R2 via I‐BGP
  • R6 injects 18.1/16, NEXT-HOP = 2.2.2.2 into its local forwarding table
  • R2 injects 18.1/16, NEXT-HOP = 2.2.2.2 into its local forwarding table
  • Independently, IGP finds that, at R2, packets to 2.2.10.1 should be sent to R1
  • Data packet to 18.1.2.3 is received by R2
  • At R2, recursive table lookup determines that packet should be forwarded to R1
  • At R1, recursive table lookup determines that packet should be forwarded to R6
  • At R6, recursive table lookup determines that packet should be forwarded to 2.2.2.2

back

slide-85
SLIDE 85

85

NO‐EXPORT

Q: What is the route followed by a packet sent to 2.2.48 received by R4 ?
A: the packet is sent via R3 and R1

[Figure: routers R1, R2, R3, R4, R5; announcements 2.2.0/17 NO-EXPORT, 2.2.128/17 NO-EXPORT and 2.2/16]

slide-86
SLIDE 86

86

  • Sol. Ex2: Stub Area, Dual Homing to Single Provider

[Figure: Provider AS x (routers R1, R2) and Client AS y (routers R3, R4; prefixes 10.1/16 and 10.2/16) connected by two E-BGP links; announcements 10.1/16 MED=10 and 10.2/16 MED=50 on one link, 10.1/16 MED=50 and 10.2/16 MED=10 on the other]

Q1: is it possible to avoid distributing BGP routes into Client IGP ?
A: yes, for example: configure R3 and R4 as default routers in Client AS; traffic from Client AS is forwarded to nearest of R3 and R4. If R3 or R4 fails, to the remaining one

Q2: is it possible to avoid assigning an AS number to Client ?
A: Yes, it is sufficient to assign to Client a private AS number: Provider translates this number to its own.

Q3: is it possible to avoid BGP between Client and Provider ?
A: Yes, by running a protocol like RIP between Client and Provider and redistributing Client routes into Provider IGP. Thus Provider pretends to the rest of the world that the prefixes of Client are its own.

slide-87
SLIDE 87

87

  • Sol. Ex3: Stub Area, Dual Homing to Several Providers

[Figure: Client AS 100 (routers R3, R4; prefixes 2.0/17 and 2.1/17) with E-BGP links to the Providers; ASs x, y and z; routers R1, R2]

Client has own address space and AS number

Q: how can routes be announced between AS 100 and AS x ? AS x and AS z ?
A: R3 announces 2.0/17 and 2.0/16; traffic from AS x to 2.0/17 will flow via AS x; if R3 fails, it will use the less specific prefix 2.0/16 and flow via AS y. AS x announces 2.0/17 and 2.0/16 to AS z

Q: assume Client wants most traffic to prefer AS y. How can that be done ?
A: R3 announces an artificially inflated path: 100 100 100 100 : 2.0/17. AS z will favour the path via AS y which has a shorter AS path length
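The effect of the inflated path can be sketched by comparing AS path lengths as seen by AS z. The example assumes that only AS path length decides, and that AS z hears the prefix both via AS x and via AS y; the function names and the string labels "x" and "y" are illustrative.

```python
def prepend(as_path, own_as, times):
    """Return the AS path with `times` extra copies of own_as in front (path inflation)."""
    return [own_as] * times + list(as_path)


def shortest_as_path(candidates):
    """Pick the candidate with the fewest ASs in its path."""
    return min(candidates, key=len)


# Path announced by R3 towards AS x: 100 100 100 100, i.e. three extra copies of AS 100.
announced_to_x = prepend([100], 100, 3)
# Path announced towards AS y without inflation (assumption).
announced_to_y = [100]

# Paths as seen by AS z, after AS x and AS y have added their own AS.
via_x = ["x"] + announced_to_x
via_y = ["y"] + announced_to_y

print(shortest_as_path([via_x, via_y]))
```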

slide-88
SLIDE 88

88

  • Sol. Ex4: Hot Potato Routing

[Figure: Customer 1, ISP1 (routers R11, R12), ISP2 (routers R21, R22), Customer 2]

Packets from Customer 2 to Customer 1
  • Both R21 and R22 have a route to Customer 1
  • Shortest path routing favours R21

Q1: by which mechanism is that done ?
A: « Choice of the best route » (criterion 5), assuming all routers in ISP2 run BGP

Q2: what is the path followed in the reverse direction ?
A: see picture. Note the asymmetric routing

slide-89
SLIDE 89

89

Exercise

What ASs does EPFL receive service from ?
  • from the previous routes, we find AS 559 (Switch)

What ASs does Switch receive service from ?
  • from the previous routes we see that there are at least: AS 1299, AS 20965, AS 3549

Find the names of the networks that have these AS numbers
  • from whois on www.ripe.net:
    AS 1299: Telianet
    AS 20965: Geant
    AS 3549: Global Crossing

slide-90
SLIDE 90

90

Exercise

Lookup http://rpsl.info.ucl.ac.be to find out the relationships between Switch and other providers
How does the software on this site decide whether a relationship is client, provider or peer ?
  • AS X is client of Switch if AS X accepts ANY path and announces only self (AS X)
  • AS X is provider of Switch if AS X announces ANY path and accepts only AS Switch
  • AS X is a peer if AS X accepts and announces only a small set of routes
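The three rules above can be written as a small classifier. This is only a sketch of the rules as stated on the slide, not the actual software of the site; the policy representation (the string "ANY" or a set of AS numbers) and the example policies are assumptions. AS 559 is Switch; AS 64512 is a private AS number used as a placeholder.

```python
def classify(accepts, announces, as_x, as_switch):
    """Classify AS X relative to Switch from what AS X accepts from and announces to Switch."""
    if accepts == "ANY" and announces == {as_x}:
        return "client"
    if announces == "ANY" and accepts == {as_switch}:
        return "provider"
    return "peer"


SWITCH = 559

# Hypothetical policies, for illustration only.
print(classify("ANY", {64512}, 64512, SWITCH))
print(classify({SWITCH}, "ANY", 3549, SWITCH))
print(classify({SWITCH, 64513}, {64514, 64515}, 64514, SWITCH))
```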

slide-91
SLIDE 91

91
