Border Gateway Protocol:
The Good, Bad, and Ugly of Internet Routing
Jim Cowie, Chief Scientist
@jimcowie / @DynResearch Stanford EE Computer Systems Colloquium 11 February 2015
NOTE: Some cities host multiple collectors. Cable Map credit: Telegeography
§ This is good: Great flexibility to support business objectives.
§ This is bad: Vulnerability to bogus route propagation.
AS 197039: Nomura, 194.36.241.0/24, London, UK
This prefix has 256 IPv4 addresses (32 - 24 = 8 host bits, and 2^8 = 256).
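The prefix arithmetic above can be checked with Python's standard `ipaddress` module. This is only a small sketch; the variable names are illustrative and not from the talk.

```python
import ipaddress

# The /24 from the slide. ip_network() rejects prefixes with host bits set.
prefix = ipaddress.ip_network("194.36.241.0/24")

# An IPv4 address has 32 bits; a /24 fixes 24 of them, leaving 8 for hosts.
host_bits = prefix.max_prefixlen - prefix.prefixlen

print(host_bits)             # 8
print(prefix.num_addresses)  # 256, i.e. 2 ** host_bits
```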
[Diagram: AS 197039 (Nomura, 194.36.241.0/24, London, UK) connects to AS 8220 and AS 702 over links marked $$.]
Transit: I guarantee delivery to the entire Internet.
Peering: I only guarantee delivery to my customers.
[Diagram adds Deutsche Telekom (AS 3320) and Level3 (AS 3356), connected over links marked $.]
Wholesale Transit: prices per megabit tend to drop as the volumes exchanged increase (aggregation).
[Diagram adds Comcast (AS 7922), Rostelecom (AS 12389), and Verizon Wireless (AS 6167) over links marked $, and Siemens AG (AS 29308) over a link marked $$.]
[Same diagram, without the payment markers.]
Money and route announcements go out. Traffic comes back.
[Same diagram, with AS 197039 now announcing 194.36.252.0/24 (London, UK), owner marked “???”.]
Incorrect route announcements go out. Does traffic still come back?
[Diagram: AS 197039 announces 194.36.252.0/24 (Wedgwood UK, London, UK) to AS 8220 and AS 702.]
Their customer, Nomura, has no business advertising the (unused, unrouted) address space of Wedgwood China!
If they fail to filter this mistake, and propagate the route to their providers and peers, it will probably be accepted everywhere on Earth within a few seconds.
No customer filtering = global propagation
Customer filtering is somewhat laborious and error-prone. Hacks include:
prefix originations
entries in various routing registries
No customer filtering = global propagation
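A minimal sketch of the customer filtering idea, assuming the provider keeps a per-customer allow-list of prefixes. The `CUSTOMER_PREFIXES` table and `accept_from_customer` function are hypothetical names for illustration; real filters are usually built from routing registry data and expressed in router configuration.

```python
import ipaddress

# Hypothetical allow-list: prefixes each customer AS is known to hold.
CUSTOMER_PREFIXES = {
    197039: [ipaddress.ip_network("194.36.241.0/24")],
}

def accept_from_customer(customer_asn: int, announced: str) -> bool:
    """Accept an announcement only if it lies inside the customer's listed space."""
    net = ipaddress.ip_network(announced)
    allowed = CUSTOMER_PREFIXES.get(customer_asn, [])
    return any(net.version == a.version and net.subnet_of(a) for a in allowed)

print(accept_from_customer(197039, "194.36.241.0/24"))  # True: the customer's own prefix
print(accept_from_customer(197039, "194.36.252.0/24"))  # False: someone else's space
print(accept_from_customer(197039, "190.36.241.0/24"))  # False: someone else's space
```

With a check like this at the customer edge, the bad announcement is dropped before it reaches providers and peers.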
[Diagram: AS 197039 announces 190.36.241.0/24 (CANTV, Venezuela) to AS 8220 and AS 702.]
What if the space is already routed and in active use?
[Diagram adds CANTV (AS 8048) and Globenet (AS 52320), with CANTV’s own route, 190.36.0.0/16 (Venezuela), alongside the 190.36.241.0/24 announced by AS 197039.]
190.36.0.0/16 versus 190.36.241.0/24
“Hole” punched in CANTV’s /16.
BGP tells everyone: send traffic towards the ASN that made the most specific announcement.
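The "most specific announcement wins" rule is longest-prefix matching. The sketch below applies it to the two routes on the slide; the `ROUTES` list and `best_origin` function are illustrative names, and a real router uses a trie rather than a linear scan.

```python
import ipaddress

# The two competing routes from the slide: (prefix, origin ASN).
ROUTES = [
    (ipaddress.ip_network("190.36.0.0/16"), 8048),      # CANTV's own /16
    (ipaddress.ip_network("190.36.241.0/24"), 197039),  # the more-specific /24
]

def best_origin(destination: str):
    """Return the origin ASN of the longest matching prefix, or None if no route matches."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, asn) for net, asn in ROUTES if addr in net]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]

print(best_origin("190.36.241.7"))  # 197039: inside the hole
print(best_origin("190.36.10.7"))   # 8048: the rest of the /16
```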
Traffic for these 256 addresses is silently diverted to London.
The Venezuelans would need to be monitoring the global BGP table to detect this as anything other than a mysterious drop in traffic.
can consult to say whether or not provider X is entitled to originate or transit address space Y
control, it would be a massive SPOF (single point of failure), subject to inappropriate influence
classification, don’t feel bad if you miss it
Traffic between two floors of the same office building in Singapore takes over 350ms round trip, taking a long detour via a mysterious datacenter in San Jose, California
Provider1 won’t peer with Provider2 in Singapore; Provider2 must drag traffic to San Jose to hand it off to Provider1, who drags it home again to Singapore.
Traffic from Western Europe to the US takes around 70ms round trip, traveling via Iceland’s incumbent provider
Iceland’s Siminn advertised other people’s routes in London, attracted the traffic, and reinjected it in Canada. Transatlantic Internet traffic should probably never come ashore in Iceland (cost, latency, geography).
0-50ms RTTs to the anycasted 8.8.8.8 recursive resolver
ICMP traceroutes from Brazil, Chile, Argentina suddenly take 120-220ms
Google anycast instances in Brazil stopped responding to South American consumer queries. DNS queries were being answered from California instead.
Google anycast instances returned to Brazil in February 2014, and it’s faster than ever.
“Anycast” injects the same route in many places, and the BGP-closest instance serves.
One day, latencies to a particular content provider network (hosting important domains for software download) decrease by 90% from Eastern Europe
ICMP round-trip measurements improve transiently from 25ms to less than 5ms
Speed of light violation sends a clear signal that the site is now “somewhere else”. Traffic is landing on an alternative responder. Need to assess:
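The speed-of-light argument can be made concrete with a rough bound. The sketch assumes light in optical fiber covers about 200 km per millisecond (roughly two thirds of its vacuum speed); that constant and the function name are assumptions for illustration, not figures from the talk.

```python
# Assumed propagation speed in fiber: about 200 km per millisecond.
FIBER_KM_PER_MS = 200.0

def max_distance_km(rtt_ms: float) -> float:
    """Upper bound on the one-way distance to a responder, given a round-trip time."""
    return (rtt_ms / 2.0) * FIBER_KM_PER_MS

print(max_distance_km(25.0))  # 2500.0
print(max_distance_km(5.0))   # 500.0
```

Under this assumption, a responder answering in under 5ms cannot be more than about 500 km of fiber away, so a site normally measured at 25ms that suddenly answers that quickly is very likely being served from a different, closer location.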
Russian provider Vimpelcom advertises 7,000 customer prefixes to its peer, China Telecom, who in turn announces them to its peers. China Telecom then announces more than 300,000 global routes to Vimpelcom in return.
Traffic increases significantly between the two peers.
Remember, peers are only supposed to provide mutual visibility into each other’s customers. When peers announce peer routes to other peers, it quickly turns into traffic misdirection. Not a hijacking, but a policy breakdown.
Trace from Moscow to Manchester, NH at 12:09 Aug 05, 2014
 1  *
 2  194.154.89.125   (Vimpelcom, Moscow, RU)                       0.743ms
 3  79.104.235.66    mx01.Frankfurt.gldn.net                      40.574ms
 4  118.85.204.53    beeline-gw3.china-telecom.net                43.198ms
 5  202.97.58.57     (China Telecom, Shanghai, CN)               302.433ms
 6  202.97.58.238    (China Telecom, Los Angeles, US)            479.642ms
 7  202.97.49.14     (China Telecom, Los Angeles, US)            487.225ms
 8  38.104.139.77    te0-7-0-24.ccr21.sjc03.atlas.cogentco.com   380.087ms
 9  154.54.6.105     be2000.ccr21.sjc01.atlas.cogentco.com       375.079ms
10  154.54.28.33     be2164.ccr21.sfo01.atlas.cogentco.com       371.727ms
11  154.54.30.54     be2132.ccr21.mci01.atlas.cogentco.com       372.585ms
12  154.54.6.86      be2156.ccr41.ord01.atlas.cogentco.com       370.596ms
13  154.54.44.86     be2351.ccr21.cle04.atlas.cogentco.com       367.498ms
14  154.54.25.89     be2009.ccr21.alb02.atlas.cogentco.com       371.972ms
15  38.104.52.78     (Cogent, Albany, US)                        367.334ms
16  70.109.168.139   burl-lnk.ngn.east.myfairpoint.net           321.980ms
17  64.222.166.166   (Fairpoint Communications, Concord, US)     315.036ms
18  64.223.189.66    static.man.east.myfairpoint.net             321.682ms
Trace from Moscow to the US East Coast …through Shanghai and California
This will become more common as IPv4 address space nears exhaustion!
and deny it to them
Basic denial of service. You only get the portion
(shorter ASPath). Your victim may not even notice!
victim’s address space
Victim sees 100% denial of service on the specific address range you’re hijacking; even people very close to the victim may be fooled. (YouTube/Pakistan)
take all the traffic to all of the victim’s address space
Example: split the victim’s /22 into two /23s or four /24s. Very evil!
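The /22 example can be enumerated with Python's `ipaddress` module. The prefix `10.20.0.0/22` is a hypothetical victim block chosen from private address space for illustration.

```python
import ipaddress

# Hypothetical victim prefix (private address space, for illustration only).
victim = ipaddress.ip_network("10.20.0.0/22")

# Every address in the /22 is also covered by these more-specific prefixes,
# which win under longest-prefix matching.
halves = [str(n) for n in victim.subnets(new_prefix=23)]
quarters = [str(n) for n in victim.subnets(new_prefix=24)]

print(halves)    # ['10.20.0.0/23', '10.20.2.0/23']
print(quarters)  # ['10.20.0.0/24', '10.20.1.0/24', '10.20.2.0/24', '10.20.3.0/24']
```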
modify it, and then quietly deliver it to the victim ✓✓✓
Man in the Middle Hijacking requires two paths:
Traffic comes in the dirty path, leaves by the clean path.
What you do with it in the meantime is up to your imagination.
[Diagram: Hijacker and Victim, joined by a dirty path (“This way to the victim, folks”) and a clean path (“Normal routing!”).]
Classic MITM was demonstrated live by Pilosov & Kapela on the show network at Defcon 16 (2008):
path from attacker to victim
ASPath poisoning cleverly injects the victim’s own ASN into the path.
Loop detection algorithm blinds the victim to the manipulation.
Everyone else, of course, can see what’s going on.
Subsequent refinements will all focus on careful control of visibility of path manipulation.
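A toy model of why path poisoning hides the route from the victim: BGP loop detection makes a router discard any route whose AS path already contains its own ASN. The ASNs below are hypothetical values from the documentation range, and `accepts` is an illustrative name; this sketch ignores everything else a real router checks.

```python
# Hypothetical ASNs from the range reserved for documentation.
VICTIM, ATTACKER, OBSERVER = 64500, 64510, 64501

def accepts(local_asn: int, as_path: list) -> bool:
    """Loop detection: reject a route if our own ASN already appears in its AS path."""
    return local_asn not in as_path

# The attacker announces the hijacked prefix with the victim's ASN inserted in the path.
poisoned_path = [ATTACKER, VICTIM]

print(accepts(VICTIM, poisoned_path))    # False: the victim drops the route and never sees it
print(accepts(OBSERVER, poisoned_path))  # True: everyone else accepts it
```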
At least two possible approaches beyond Pilosov/Kapela simple path poisoning, both of which we first saw in 2013:
Requires provider with many peers in order to get appreciable amounts of traffic
AS “origins” for hijacked prefixes
through Beltelecom, not the claimed origin
“origins”
Siminn, not the claimed
redirection of traffic, but claimed router bug
Hijacker announces prefix p to regional provider P with BGP community P:71990
Whois tells us:
remarks  To deny prefix propagation,
remarks  use P:7DNNA, where
remarks  D – Deny announcements:
remarks    1 – International peers
remarks    2 – In-country peers
…
remarks  NN – Upstream:
remarks    01 – Provider 1
remarks    02 – Provider 2
…
remarks    99 – All Upstreams
…
remarks  A – Action:
remarks    0 – Do not announce prefix
announce p to any of its international peers, implying that it should announce p only to its domestic peers and customers.
constrained to a geographic region.
misdirected to the attacker, who can then forward it onward to its rightful owner via a clean path through another provider.
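The community value on this slide can be decoded positionally against the P:7DNNA scheme from the whois remarks. This sketch only splits the digits and looks them up in the entries visible on the slide; the lookup tables and the `decode` function are illustrative, the elided entries are not filled in, and how the provider combines the D and NN fields is not modeled.

```python
# Only the entries shown in the whois remarks; elided ones are left out.
DENY_TARGETS = {"1": "International peers", "2": "In-country peers"}
UPSTREAMS = {"01": "Provider 1", "02": "Provider 2", "99": "All Upstreams"}
ACTIONS = {"0": "Do not announce prefix"}

def decode(community: str) -> dict:
    """Split a P:7DNNA community into its D, NN, and A fields."""
    provider, value = community.split(":")
    if len(value) != 5 or value[0] != "7":
        raise ValueError("not a P:7DNNA community")
    d, nn, a = value[1], value[2:4], value[4]
    return {
        "provider": provider,
        "deny": DENY_TARGETS.get(d, "unknown"),
        "upstream": UPSTREAMS.get(nn, "unknown"),
        "action": ACTIONS.get(a, "unknown"),
    }

print(decode("P:71990"))
```

For `P:71990` the fields are D = 1, NN = 99, A = 0, which matches the slide's reading that the prefix is withheld from international peers.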
MITM
providers.
techniques employed
Geolocation of network subject to traffic detouring
We continued to see ‘traditional’ hijackings:
Impacts were significant for those whose networks are involved, but this seemed like “benign” hijacking compared to 2013’s MITM series.
All that changed in the second half of 2014, and in 2015, hijackers are back in force. We think it’s being driven by
Each of the 49,000+ ASes on the Internet could …
allowing others (who?) to do the monitoring.
authority for every change
that validate, and a lot of compute cycles per BGP update
“When a BGPsec update message is received by a BGP speaker, the BGP speaker can validate the message as follows. For each signature, the BGP speaker first needs to determine if there is a valid RPKI Router certificate matching the SKI and containing the appropriate AS
from this BGPsec router certificate. If all the signatures can be verified in this fashion, the BGP speaker is assured that the update message it received actually came via the AS path specified in the update message.
In the above example, upon receiving the BGPsec update message, a BGP speaker for AS 4 would do the following. First, it would look at the SKI for the first signature and see if this corresponds to a valid BGPsec Router certificate for AS 1. Next, it would verify the first signature using the key found in this valid certificate. Finally, it
would repeat this process for the second and third signatures, checking to see that there are valid BGPsec router certificates for AS 2 and AS 3 (respectively) and that the signatures can be verified with the keys found in these certificates. Note that the BGPsec speaker for AS 4 should additionally perform origin validation as per RFC 6483 [RFC6483]. However, such origin validation is independent
that answer (e.g., Turkey answering for Google’s DNS servers).
AS393327 (THE GEORGE W. BUSH FOUNDATION, US) announced 1 pfx(s) that are potentially hijacks of currently routed address space: 12.203.53.0/24 (GEORGE W. BUSH PRESIDENTIAL CENT, US), seen by 394 peers for an average
12.128.0.0/9 (AT&T Bell Laboratories, US) Origin=7018 (AT&T, US) *** 8 MAY 2014 – ROUTING ACCOMPLISHED
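An alert like this one can come from a simple covering-prefix check: a new origination is flagged when it falls inside an already-routed prefix that has a different origin AS. The sketch below is a hedged illustration using the two prefixes from the alert; the `ROUTED` table and `covering_conflicts` function are hypothetical names, not Dyn's actual system.

```python
import ipaddress

# Routed prefix and origin taken from the alert text.
ROUTED = {ipaddress.ip_network("12.128.0.0/9"): 7018}

def covering_conflicts(prefix: str, origin: int) -> list:
    """List routed prefixes that cover the new prefix but have a different origin AS."""
    new = ipaddress.ip_network(prefix)
    return [
        (str(net), asn)
        for net, asn in ROUTED.items()
        if net.version == new.version and new.subnet_of(net) and asn != origin
    ]

print(covering_conflicts("12.203.53.0/24", 393327))  # [('12.128.0.0/9', 7018)]
print(covering_conflicts("12.203.53.0/24", 7018))    # []
```

A check this naive flags every legitimate customer who announces a slice of a provider's block, as in this example, which is why alert suppression rules and human review are needed.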
Suppress alerts for new originations that meet certain conditions:
infrastructure)
suspicious originations for review
Dyn Research generates a handful of internal reports per day. Many are false positives and easily dismissed. Occasionally, we get a “live one”. Here is one automated report entry from April 2014:
AS6697 (Beltelecom, BY) announced 1 pfx(s) that are potentially hijacks of currently routed address space: XXX.YYY.151.0/24 (ZZZZ, GB), seen by 310 peers for an average of 1.16
Origin=NNNN (ZZZZZ, GB)
specific enterprise typically occur beyond the enterprise’s perimeter and may be invisible from the enterprise’s vantage points
necessary to minimize false negatives (otherwise, regional or provider specific events can be missed)
analytics are necessary to detect deviations from “normal” and minimize false positives
How do we distinguish malicious/suboptimal from legitimate?
Best case: An enterprise knows “Ground Truth” for what it cares about. Monitoring system then alerts on any deviations.
Actual case: “Ground Truth” is rarely fully and correctly specified, and never will be for most organizations.
Monitor long enough to determine the “approximately correct”★ state, alert on deviations, automate response processing
★ with thanks to Leslie Valiant, author of Probably Approximately Correct.
Internet, and there are no simple solutions within reach of global deployment
if everyone commits to watching carefully.
Each of these repositories of datasets has associated caveats, but finding them is part of the challenge, and you can’t beat the price.
– Seek fellowship with people who make their living pushing packets
– Attend conferences where network
– Buy them refreshments
– Sanity-check your great ideas against their experience