External Routing
ÉCOLE POLYTECHNIQUE FÉDÉRALE DE LAUSANNE
External Routing BGP
Jean‐Yves Le Boudec Fall 2009 SelfOrganization
[Figure: Autonomous Routing Domains (ARD) A, B, C and D with routers A1-A4, B1-B4, C1-C4 and D1-D6. Labels shown: OSPF (area 0, area 1, area 2), IGRP, and BGP-4 between domains.]
Find best routes, in a sense that can be decided by every ARD using their own criteria
A route between neighbours is (path: dest), where path is a sequence of AS numbers and dest is an IP prefix.
Example: B A:n1
Every AS uses its own rules for deciding which path is better.
The BGP table keeps a record of the best paths to all destinations.
An AS announces only the best paths it knows.
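The (path: dest) notation above can be sketched in a few lines. This is a minimal illustration, not the protocol itself: the function names `announce` and `receive` are hypothetical, and the loop check (an AS discards a route whose path already contains its own number) is standard BGP behaviour rather than something stated on this slide.

```python
from typing import Optional, Tuple

# A route is (AS path, destination prefix), e.g. (("B", "A"), "n1") for "B A:n1".
Route = Tuple[Tuple[str, ...], str]


def announce(own_as: str, route: Route) -> Route:
    """Prepend our own AS number before announcing the route to a neighbour."""
    path, dest = route
    return ((own_as,) + path, dest)


def receive(own_as: str, route: Route) -> Optional[Route]:
    """Keep the route unless our own AS number is already on its path (loop)."""
    path, _dest = route
    if own_as in path:
        return None
    return route


# A originates n1 with an empty path, then B re-announces it as "B A:n1".
from_a = announce("A", ((), "n1"))
from_b = announce("B", from_a)
```

Here `from_b` corresponds to the slide's example "B A:n1"; E would store it, while A would discard it.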
[Figure: ASs A, B, C, D and E with networks n1 to n5. Announcements shown: "A:n1,n2"; "B A:n1,n2  B:n5"; "C A:n1,n2  C:n3"; "D C A:n1,n2  D C:n3  D:n4".]
BGP table in E
dest   AS path
n1     B A
n2     B A
n3     D C
n4     D
n5     B
[Figure: domains D1 to D5 with routers A to H and R; network n1 is announced as "X:n1". Announcements shown: A->C: "D1 X:n1"; C->E: "D1 X:n1"; E->G: "D3 D1 X:n1".]
[Figure: aggregation example 1: prefixes 193.212.0.0/24 and 193.212.1.0/24, announced as 193.212.0.0/23.]
[Figure: aggregation example 2: prefixes 197.8.2.0/24 and 197.8.3.0/24; announcements shown carry 197.8.2.0/23 and 197.8.3.0/24.]
[Figure: aggregation example 3: prefixes 197.8.2.0/24 and 197.8.3.0/24; announcements shown carry 197.8.2.0/23, 197.8.2.0/24 and 197.8.3.0/24.]
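The prefixes on these slides can be checked with Python's standard `ipaddress` module. The sketch below is only an illustration: `aggregate` and `longest_match` are hypothetical helper names, and the second function shows why announcing a more specific /24 next to the /23 aggregate matters, since forwarding uses the longest matching prefix.

```python
import ipaddress


def aggregate(prefixes):
    """Merge adjacent prefixes into the smallest set of covering prefixes."""
    nets = [ipaddress.ip_network(p) for p in prefixes]
    return [str(n) for n in ipaddress.collapse_addresses(nets)]


def longest_match(table, addr):
    """Return the most specific prefix of `table` containing `addr`, or None."""
    ip = ipaddress.ip_address(addr)
    nets = [ipaddress.ip_network(p) for p in table]
    matches = [n for n in nets if ip in n]
    if not matches:
        return None
    return str(max(matches, key=lambda n: n.prefixlen))


example_1 = aggregate(["193.212.0.0/24", "193.212.1.0/24"])
example_2 = aggregate(["197.8.2.0/24", "197.8.3.0/24"])
```

With both 197.8.2.0/23 and 197.8.3.0/24 in the table, a packet to an address in 197.8.3.0/24 follows the /24 entry and the rest of the /23 follows the aggregate.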
BGP routers talk to each other over TCP connections
Maintain adjacency to peers (over TCP connection): open, keep‐alive
Transmit only modifications
Receives routes from neighbours
Accepts / rejects them (import policy) Processes attributes Stores in Adj‐RIB‐in (one per BGP peer, internal or external)
Applies decision process and stores results in Loc‐RIB (global to BGP speaker) (= decides which routes to accept among all present in the different Adj‐RIB‐ins)
For every subnetwork prefix, at most 1 route is selected
Sends to neighbours
Decides whether to send or not (export policy)
Aggregates multiple routes into one, if applicable
Stores result in Adj‐RIB‐out (one per BGP peer) and sends to neighbour
Only routes learnt from E‐BGP are sent to an I‐BGP neighbour
Sends updates when Adj‐RIB‐out changes
Write forwarding entries in its routing table, either directly or by re‐distribution into IGP ; this is explained in detail in Section 2.
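The operations listed above (import policy, Adj-RIB-In, decision process, Loc-RIB, export policy, Adj-RIB-Out) can be sketched as a small class. This is a simplified illustration under stated assumptions: the class and method names are hypothetical, both policies accept everything, the decision rule is a stand-in (E-BGP before I-BGP, then shorter AS path), and withdrawals, aggregation and timers are ignored.

```python
from typing import Dict, List, NamedTuple, Tuple


class Route(NamedTuple):
    prefix: str
    as_path: Tuple[str, ...]
    next_hop: str


class Speaker:
    def __init__(self, peers: Dict[str, str]):
        # peers maps a peer name to its session type: "ebgp" or "ibgp"
        self.peers = peers
        self.adj_rib_in = {p: {} for p in peers}   # one per peer
        self.loc_rib = {}                          # prefix -> (peer, route)
        self.adj_rib_out = {p: {} for p in peers}  # one per peer

    def import_policy(self, peer: str, route: Route) -> bool:
        return True  # assumption: accept every route

    def export_policy(self, peer: str, route: Route) -> bool:
        return True  # assumption: export every selected route

    def decide(self, candidates):
        # Stand-in decision process: E-BGP before I-BGP, then shorter AS path.
        return min(candidates,
                   key=lambda c: (self.peers[c[0]] != "ebgp", len(c[1].as_path)))

    def receive(self, peer: str, route: Route) -> List[str]:
        """Process one announcement; return the peers that must be updated."""
        if not self.import_policy(peer, route):
            return []
        self.adj_rib_in[peer][route.prefix] = route
        candidates = [(p, rib[route.prefix])
                      for p, rib in self.adj_rib_in.items() if route.prefix in rib]
        best_peer, best = self.decide(candidates)
        self.loc_rib[route.prefix] = (best_peer, best)  # at most 1 route per prefix
        changed = []
        for p, kind in self.peers.items():
            if p == best_peer:
                continue  # do not send a route back to the peer it came from
            if kind == "ibgp" and self.peers[best_peer] != "ebgp":
                continue  # only E-BGP-learnt routes go to I-BGP neighbours
            if not self.export_policy(p, best):
                continue
            if self.adj_rib_out[p].get(route.prefix) != best:
                self.adj_rib_out[p][route.prefix] = best
                changed.append(p)  # Adj-RIB-Out changed: send an update
        return changed
```

For example, a speaker created with `Speaker({"R3": "ebgp", "R2": "ibgp"})` that receives `Route("10.1/16", ("y",), "1.1.1.2")` from R3 returns `["R2"]`; a later copy of the same prefix learnt from R2 changes nothing and returns an empty list.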
[Figure: Decision Process (best route per destination), followed by Export Policy.]
Well‐known Mandatory
ORIGIN (route learnt from IGP, BGP or static)
AS‐PATH
NEXT‐HOP
Well‐known Discretionary
LOCAL‐PREF (see later)
ATOMIC‐AGGREGATE (= route cannot be dis‐aggregated)
Optional Transitive
AGGREGATOR (who aggregated this route)
Optional Nontransitive
MULTI‐EXIT‐DISC (MED) (see later)
WEIGHT (see later)
Forwarding table: used for packet forwarding, in real time.
This is not the same as a RIB; we will see the interaction later.
[Figure: AS x (routers R1, R2) and AS y (networks 10.1/16 and 10.2/16), with routers R3 and R4 at the boundary. Announcements shown: 10.1/16 AS=(y), 10.2/16 AS=(y).]
Write on the table
R1 has the following RIBs, initially empty:
  Adj-RIB-In from R3:  (empty)
  Adj-RIB-In from R2:  (empty)
  Loc-RIB:             (empty)
  Adj-RIB-Out to R3:   (empty)
  Adj-RIB-Out to R2:   (empty)

R1 receives from R3 two announcements, made of the routes 10.1/16 AS=(y) and 10.2/16 AS=(y). R1 associates the NEXT-HOP attribute 1.1.1.2 to these routes and writes them in its Adj-RIB-In:
  Adj-RIB-In from R3:  10.1/16 AS=y NEXT-HOP=1.1.1.2
                       10.2/16 AS=y NEXT-HOP=1.1.1.2
  Adj-RIB-In from R2:  (empty)
  Loc-RIB:             (empty)
  Adj-RIB-Out to R3:   (empty)
  Adj-RIB-Out to R2:   (empty)

R1 applies its decision process, and selects both routes:
  Adj-RIB-In from R3:  10.1/16 AS=y NEXT-HOP=1.1.1.2
                       10.2/16 AS=y NEXT-HOP=1.1.1.2
  Adj-RIB-In from R2:  (empty)
  Loc-RIB:             10.1/16 AS=y NEXT-HOP=1.1.1.2
                       10.2/16 AS=y NEXT-HOP=1.1.1.2
  Adj-RIB-Out to R3:   (empty)
  Adj-RIB-Out to R2:   (empty)

R1 prepares to send these new routes to all its I-BGP peers by writing them into Adj-RIB-Out to R2:
  Adj-RIB-In from R3:  10.1/16 AS=y NEXT-HOP=1.1.1.2
                       10.2/16 AS=y NEXT-HOP=1.1.1.2
  Adj-RIB-In from R2:  (empty)
  Loc-RIB:             10.1/16 AS=y NEXT-HOP=1.1.1.2
                       10.2/16 AS=y NEXT-HOP=1.1.1.2
  Adj-RIB-Out to R3:   (empty)
  Adj-RIB-Out to R2:   10.1/16 AS=y NEXT-HOP=1.1.1.2
                       10.2/16 AS=y NEXT-HOP=1.1.1.2

R1 prepares to send these new routes to all its E-BGP peers by writing them into Adj-RIB-Out. However, the message is not sent to R3 as it would create a loop in the AS-path.

Since Adj-RIB-Out to R2 changed, a BGP message is sent to R2. Similarly, at some time R1 will receive from R2 a BGP message with the routes 10.1/16 AS=y NEXT-HOP=2.2.2.1 and 10.2/16 AS=y NEXT-HOP=2.2.2.1:
  Adj-RIB-In from R3:  10.1/16 AS=y NEXT-HOP=1.1.1.2
                       10.2/16 AS=y NEXT-HOP=1.1.1.2
  Adj-RIB-In from R2:  10.1/16 AS=y NEXT-HOP=2.2.2.1
                       10.2/16 AS=y NEXT-HOP=2.2.2.1
  Loc-RIB:             10.1/16 AS=y NEXT-HOP=1.1.1.2
                       10.2/16 AS=y NEXT-HOP=1.1.1.2
  Adj-RIB-Out to R3:   (empty)
  Adj-RIB-Out to R2:   10.1/16 AS=y NEXT-HOP=1.1.1.2
                       10.2/16 AS=y NEXT-HOP=1.1.1.2

R1 applies again its decision process. Now it has several possible routes to each prefix. The first applicable rule in the decision process (slide "The Decision Process") says that if a route is learnt from E-BGP it has precedence over a route learnt from I-BGP. Since all routes in Adj-RIB-In from R2 are learnt from I-BGP, and all routes in Adj-RIB-In from R3 are learnt from E-BGP, the winners are the latter, so there is no change in Loc-RIB:
  Adj-RIB-In from R3:  10.1/16 AS=y NEXT-HOP=1.1.1.2
                       10.2/16 AS=y NEXT-HOP=1.1.1.2
  Adj-RIB-In from R2:  10.1/16 AS=y NEXT-HOP=2.2.2.1
                       10.2/16 AS=y NEXT-HOP=2.2.2.1
  Loc-RIB:             10.1/16 AS=y NEXT-HOP=1.1.1.2
                       10.2/16 AS=y NEXT-HOP=1.1.1.2
  Adj-RIB-Out to R3:   (empty)
  Adj-RIB-Out to R2:   10.1/16 AS=y NEXT-HOP=1.1.1.2
                       10.2/16 AS=y NEXT-HOP=1.1.1.2

Since there is no change in Loc-RIB there is no change in Adj-RIB-Out and therefore no message is sent by R1.
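[Figure: AS x with routers R1, R21 and R22 connected by I-BGP; AS y with networks 10.1/16 and 10.2/16; routers R3 and R4 at the boundary. Link addresses shown: 1.1.1.1, 1.1.1.2, 2.2.2.2, 2.2.2.1, 3.3.3.1, 3.3.3.2. Announcements shown: 10.1/16 AS=(y), 10.2/16 AS=(y).]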
Here is a possible sequence of messages and the resulting states of the RIBs at R1, after each message is processed.

R1 receives message 10.1/16 AS=(y) from R3 over E-BGP:
  Adj-RIB-In from R3:   10.1/16 AS=y NEXT-HOP=1.1.1.2
  Adj-RIB-In from R22:  (empty)
  Adj-RIB-In from R21:  (empty)
  Loc-RIB:              10.1/16 AS=y NEXT-HOP=1.1.1.2
  Adj-RIB-Out to R3:    (empty)
  Adj-RIB-Out to R22:   (empty)
  Adj-RIB-Out to R21:   (empty)

R1 applies its decision process, accepts the route and sends it to R22 and R21:
  Adj-RIB-In from R3:   10.1/16 AS=y NEXT-HOP=1.1.1.2
  Adj-RIB-In from R22:  (empty)
  Adj-RIB-In from R21:  (empty)
  Loc-RIB:              10.1/16 AS=y NEXT-HOP=1.1.1.2
  Adj-RIB-Out to R3:    (empty)
  Adj-RIB-Out to R22:   10.1/16 AS=y NEXT-HOP=1.1.1.2
  Adj-RIB-Out to R21:   10.1/16 AS=y NEXT-HOP=1.1.1.2

Assume R22 also received the E-BGP messages shown on the figure and did its job. R1 receives message 10.2/16 AS=y, NEXT-HOP=2.2.2.1 from R22 over I-BGP, and the decision process accepts it:
  Adj-RIB-In from R3:   10.1/16 AS=y NEXT-HOP=1.1.1.2
  Adj-RIB-In from R22:  10.2/16 AS=y NEXT-HOP=2.2.2.1
  Adj-RIB-In from R21:  (empty)
  Loc-RIB:              10.1/16 AS=y NEXT-HOP=1.1.1.2
                        10.2/16 AS=y NEXT-HOP=2.2.2.1
  Adj-RIB-Out to R3:    (empty)
  Adj-RIB-Out to R22:   10.1/16 AS=y NEXT-HOP=1.1.1.2
  Adj-RIB-Out to R21:   10.1/16 AS=y NEXT-HOP=1.1.1.2

Since the new message is learnt from I-BGP it is not put into any Adj-RIB-Out.

Assume R21 also received the E-BGP messages shown on the figure and did its job. R1 receives message 10.2/16 AS=y, NEXT-HOP=3.3.3.1 from R21 over I-BGP:
  Adj-RIB-In from R3:   10.1/16 AS=y NEXT-HOP=1.1.1.2
  Adj-RIB-In from R22:  10.2/16 AS=y NEXT-HOP=2.2.2.1
  Adj-RIB-In from R21:  10.2/16 AS=y NEXT-HOP=3.3.3.1
  Loc-RIB:              10.1/16 AS=y NEXT-HOP=1.1.1.2
                        10.2/16 AS=y NEXT-HOP=2.2.2.1
  Adj-RIB-Out to R3:    (empty)
  Adj-RIB-Out to R22:   10.1/16 AS=y NEXT-HOP=1.1.1.2
  Adj-RIB-Out to R21:   10.1/16 AS=y NEXT-HOP=1.1.1.2

The decision process now has to choose between two routes with the same destination prefix 10.2/16. Both were learnt from I-BGP, so we apply criterion 5 in slide "The Decision Process". The distance, computed by the IGP, to 2.2.2.1 is 3 and the distance to 3.3.3.1 is 2. Thus the route that has NEXT-HOP=3.3.3.1 is preferred by the decision process:
  Adj-RIB-In from R3:   10.1/16 AS=y NEXT-HOP=1.1.1.2
  Adj-RIB-In from R22:  10.2/16 AS=y NEXT-HOP=2.2.2.1
  Adj-RIB-In from R21:  10.2/16 AS=y NEXT-HOP=3.3.3.1
  Loc-RIB:              10.1/16 AS=y NEXT-HOP=1.1.1.2
                        10.2/16 AS=y NEXT-HOP=3.3.3.1
  Adj-RIB-Out to R3:    (empty)
  Adj-RIB-Out to R22:   10.1/16 AS=y NEXT-HOP=1.1.1.2
  Adj-RIB-Out to R21:   10.1/16 AS=y NEXT-HOP=1.1.1.2

Since the new message is learnt from I-BGP it is not put into any Adj-RIB-Out.
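The two tie-breaks used in these examples can be replayed in code. The sketch below implements only the two criteria that the examples invoke (E-BGP before I-BGP, then IGP distance to the NEXT-HOP); the full decision process is on a slide not reproduced here, and the names `Candidate` and `select` are hypothetical.

```python
from typing import Dict, List, NamedTuple


class Candidate(NamedTuple):
    prefix: str
    next_hop: str
    learnt_via: str  # "ebgp" or "ibgp"


def select(candidates: List[Candidate], igp_distance: Dict[str, int]) -> Candidate:
    """Prefer E-BGP-learnt routes, then the smallest IGP distance to NEXT-HOP."""
    def rank(c: Candidate):
        # False sorts before True, so E-BGP-learnt routes come first.
        return (c.learnt_via != "ebgp", igp_distance.get(c.next_hop, float("inf")))
    return min(candidates, key=rank)


# Second example: two I-BGP routes to 10.2/16, IGP distances 3 and 2.
via_r22 = Candidate("10.2/16", "2.2.2.1", "ibgp")
via_r21 = Candidate("10.2/16", "3.3.3.1", "ibgp")
chosen = select([via_r22, via_r21], {"2.2.2.1": 3, "3.3.3.1": 2})
```

Here `chosen` is the route with NEXT-HOP 3.3.3.1, as in the walkthrough. In the first example the route learnt from R3 over E-BGP wins regardless of IGP distance.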
[Figure: AS x (R5), AS z (R6, R1, R2, running OSPF as IGP, with I-BGP between R6 and R2), AS y (R4); E-BGP between AS x and AS z and between AS z and AS y; prefix 18.1/16; address 2.2.2.2; TCP connection between R6 and R2.]
OSPF propagates 18.1/16 (type 5 LSA) and updates forwarding tables.
After OSPF converges, R1 and R2 have a route to 18.1/16.
R2 advertises the route to R4 via E‐BGP.
[Figure: D1 = EPFL, D2 = UCB; routers A to G and L; networks n1 and n2; transit domains DX and DY. Announcements shown: "D1: n1", "D2: n2", "DY D2: n2". Legend: physical link, TCP connection.]
Forwarding table columns: To | NEXT-HOP | layer-2 addr
[Figure: AS x (R5), AS z (R6, R1, R2, with IGP (OSPF) and I-BGP), AS y (R4); E-BGP between AS x and AS z and between AS z and AS y; prefix 18.1/16; address 2.2.2.2.]
[Figure: same topology, with addresses 2.2.20.1 and 2.2.2.2 shown.]
[Figure: AS x (R1, R2) and AS y (R3, R4, networks 10.1/16 and 10.2/16). One link announces 10.1/16 MED=10 and 10.2/16 MED=50; the other announces 10.1/16 MED=50 and 10.2/16 MED=10.]
[Figure: AS x with routers R1, R2 and R6 connected by I-BGP; E-BGP links labelled pref=10 and pref=100.]
[Figure: announcements of 10.1/16 originated by AS1 ("AS1: 10.1/16").]
[Figure: route flap damping: penalty as a function of time, with thresholds suppress-limit and reuse-limit and instants t1 and t2.]
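The penalty curve in this figure can be sketched as follows. This is an illustration under assumptions that are not on the slide: the penalty grows by a fixed amount at each flap and decays exponentially with a half-life, the route is suppressed once the penalty reaches suppress-limit, and it is reused once the penalty falls below reuse-limit. The class name and all numeric defaults are hypothetical.

```python
class Damping:
    def __init__(self, suppress_limit=2000.0, reuse_limit=750.0,
                 half_life=900.0, flap_penalty=1000.0):
        # All default values are illustrative assumptions.
        self.suppress_limit = suppress_limit
        self.reuse_limit = reuse_limit
        self.half_life = half_life        # seconds for the penalty to halve
        self.flap_penalty = flap_penalty  # added at each flap
        self.penalty = 0.0
        self.last = 0.0                   # time of the last update
        self.suppressed = False

    def _decay(self, now: float) -> None:
        # Exponential decay since the last update.
        self.penalty *= 0.5 ** ((now - self.last) / self.half_life)
        self.last = now

    def _update_state(self) -> None:
        if self.penalty >= self.suppress_limit:
            self.suppressed = True
        elif self.penalty < self.reuse_limit:
            self.suppressed = False
        # Between the two limits the previous state is kept (hysteresis).

    def flap(self, now: float) -> None:
        self._decay(now)
        self.penalty += self.flap_penalty
        self._update_state()

    def is_suppressed(self, now: float) -> bool:
        self._decay(now)
        self._update_state()
        return self.suppressed
```

With these defaults, two flaps at time 0 suppress the route; it stays suppressed after one half-life (penalty 1000, still above reuse-limit) and is reused after two half-lives (penalty 500).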
[Figure: AS z split into sub-ASs P1, P2 and P3, with E-BGP links to the outside.]
I‐BGP inside sub‐AS (full interconnection)
E‐BGP between sub‐AS
[Figure: AS z with three route reflectors (RR) in cluster 1, cluster 2 and cluster 3; I-BGP sessions inside AS z, E-BGP to the outside.]
ORIGINATOR_ID attribute associated with the advertisement
[Figure: AS z; I-BGP sessions between loopback interfaces (lo0); E-BGP to the outside.]
[Figure: announcements shown: 2.2.0/17 NO-EXPORT, 2.2.128/17 NO-EXPORT, 2.2/16.]
Operator AS p
Client AS
[Figure: AS x (R1, R2) and AS y (R3, R4, networks 10.1/16 and 10.2/16). Announcements shown: 10.1/16 MED=10, 10.1/16 MED=50, 10.2/16 MED=50, 10.2/16 MED=10.]
[Figure: AS x, AS y and AS z with routers R1 to R4; AS 100 with prefixes 2.0/17 and 2.1/17.]
[Figure: routers R11, R12, R21 and R22.]
Both R21 and R22 have a route to Customer 1.
Shortest path routing favours R21.
Q1: by which mechanism is that done?
Type | Prefix | Time | Peer | Next HOP | MED | Origin | AS path | Community | RRC ID
A | 128.178.0.0/15 | 2003-10-02 05:05:49Z | 129.250.0.232 | 129.250.0.232 | 9 | Not defined | 2914 1299 559 | 2914:420 2914:2000 2914:3000 | RIPE NCC
A | 128.178.0.0/15 | 2003-10-02 06:16:00Z | 193.10.252.5 | 193.10.252.5 | | IGP | 2603 3356 1299 559 | 2603:666 3356:2 3356:86 3356:507 3356:666 3356:2076 | Netnod
A | 128.178.0.0/15 | 2003-10-02 06:16:17Z | 194.68.48.1 | 194.68.48.1 | | IGP | 12381 1653 2603 20965 559 | 12381:1653 | Netnod
A | 128.178.0.0/15 | 2003-10-02 06:16:37Z | 194.68.48.1 | 194.68.48.1 | | IGP | 12381 1653 2603 3356 1299 559 | 12381:1653 | Netnod
A | 128.178.0.0/15 | 2003-10-02 06:21:08Z | 193.10.252.5 | 193.10.252.5 | | IGP | 2603 20965 559 | 2603:222 2603:666 20965:155 | Netnod
A | 128.178.0.0/15 | 2003-10-02 06:21:17Z | 194.68.48.1 | 194.68.48.1 | | IGP | 12381 1653 2603 20965 559 | 12381:1653 | Netnod
A | 128.178.0.0/15 | 2003-10-02 07:24:06Z | 129.250.0.232 | 129.250.0.232 | 9 | Not defined | 2914 3549 559 | 2914:420 |
[Figure: aggregation example 2 with AS paths: prefixes 197.8.2.0/24 and 197.8.3.0/24; 197.8.2.0/23 is announced with AS_PATH: 3 {1 2}; 197.8.3.0/24 is announced with AS_PATH: 6 5 2.]
[Figure: aggregation example 3: prefixes 197.8.2.0/24 and 197.8.3.0/24; announcements shown carry 197.8.0.0/23, 197.8.2.0/24 and 197.8.3.0/24.]
[Figure: AS x, AS y (networks 10.1/16 and 10.2/16) and AS z; routers R1 to R6; I-BGP sessions inside an AS, E-BGP sessions between ASs.]
For R1, R2 and R3:
A:
R1 receives the route AS2 AS1 10.1/16.
R2 receives the route AS3 AS1 10.1/16.
R3 receives AS2 AS1 10.1/16, LOCAL-PREF=50 from R1 over I-BGP and AS3 AS1 10.1/16, LOCAL-PREF=100 from R2 over I-BGP.
R3 selects AS3 AS1 10.1/16, LOCAL-PREF=100 and installs it into its Loc-RIB.
R3 announces only AS3 AS1 10.1/16 to AS 5.
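R3's choice in this answer can be written as a one-line selection on LOCAL-PREF. This is a minimal sketch: the `Route` type and the `select` function are hypothetical names, ties between equal LOCAL-PREF values are not handled, and the peer names follow the reading that the LOCAL-PREF=50 route comes from R1 and the LOCAL-PREF=100 route from R2.

```python
from typing import List, NamedTuple, Tuple


class Route(NamedTuple):
    prefix: str
    as_path: Tuple[str, ...]
    local_pref: int
    learnt_from: str


def select(routes: List[Route]) -> Route:
    """Highest LOCAL-PREF wins; later tie-breaks are out of scope here."""
    return max(routes, key=lambda r: r.local_pref)


at_r3 = [
    Route("10.1/16", ("AS2", "AS1"), 50, "R1"),
    Route("10.1/16", ("AS3", "AS1"), 100, "R2"),
]
best = select(at_r3)
```

Only `best`, the route with AS path AS3 AS1, is installed and announced further; the LOCAL-PREF=50 route stays in Adj-RIB-In as a backup.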
[Figure: AS x (R5), AS z (R6, R1, R2, with IGP (OSPF) and I-BGP), AS y (R4); E-BGP between AS x and AS z and between AS z and AS y; prefix 18.1/16; address 2.2.2.2.]
Q: repeat the sequence of the previous slide with this new assumption.
R5 advertises 18.1/16, NEXT-HOP = 2.2.2.2 to R6 via E‐BGP.
R6 transmits 18.1/16, NEXT-HOP = 2.2.2.2 to R1 and R2 via I‐BGP.
R6 injects 18.1/16, NEXT-HOP = 2.2.2.2 into its local forwarding table.
R2 injects 18.1/16, NEXT-HOP = 2.2.2.2 into its local forwarding table.
Independently, IGP finds that, at R2, packets to 2.2.10.1 should be sent to R1.
Data packet to 18.1.2.3 is received by R2.
At R2, recursive table lookup determines that packet should be forwarded to R1.
At R1, recursive table lookup determines that packet should be forwarded to R6.
At R6, recursive table lookup determines that packet should be forwarded to 2.2.2.2.
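Recursive table lookup can be sketched as a loop over longest-prefix matches. This is an illustration, not a router implementation: the table layout, the entry kinds `"next_hop"` and `"neighbour"`, and the function name `resolve` are hypothetical, and the example table for R2 assumes that the IGP entry covers the NEXT-HOP address 2.2.2.2.

```python
import ipaddress


def resolve(table, dest, max_depth=8):
    """Follow NEXT-HOP entries until a directly reachable neighbour is found.

    table maps a prefix to ("next_hop", ip) for a BGP-injected entry, or to
    ("neighbour", name) for an entry the IGP resolved to an adjacent router.
    """
    addr = dest
    for _ in range(max_depth):  # guard against circular entries
        ip = ipaddress.ip_address(addr)
        matches = [(ipaddress.ip_network(p), entry)
                   for p, entry in table.items()
                   if ip in ipaddress.ip_network(p)]
        if not matches:
            return None
        # Longest prefix match among the entries containing the address.
        _net, (kind, value) = max(matches, key=lambda m: m[0].prefixlen)
        if kind == "neighbour":
            return value
        addr = value  # BGP entry: look up its NEXT-HOP in turn
    return None


# Forwarding table at R2 (18.1/16 written in full as 18.1.0.0/16).
table_r2 = {
    "18.1.0.0/16": ("next_hop", "2.2.2.2"),  # injected from BGP
    "2.2.2.2/32": ("neighbour", "R1"),       # found by the IGP (assumed entry)
}
out = resolve(table_r2, "18.1.2.3")
```

The first lookup matches 18.1.0.0/16 and yields NEXT-HOP 2.2.2.2; the second lookup resolves 2.2.2.2 to the neighbour R1, which is where R2 forwards the packet.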