The Impact of Router Outages on the AS-Level Internet



SLIDE 1

The Impact of Router Outages on the AS-Level Internet

Matthew Luckie* - University of Waikato
Robert Beverly - Naval Postgraduate School
*work started while at CAIDA, UC San Diego

SIGCOMM 2017, August 24th 2017

SLIDE 2

Internet Resilience

Where are the Single Points of Failure?

(Figure: two example topologies, Example #A and Example #B, built from Customer Edge (CE) and Provider Edge (PE) routers.)

SLIDE 3

Internet Resilience

Where are the Single Points of Failure?

If the CE router fails, the network is disconnected, so the CE router is a Single Point of Failure (SPoF).

(Figure: Example #A, with Customer Edge (CE) and Provider Edge (PE) routers.)

SLIDE 4

Internet Resilience

Where are the Single Points of Failure?

If the CE router fails, the network has an alternate path available, so the CE router is NOT a Single Point of Failure (SPoF).

(Figure: Example #B, with Customer Edge (CE) and Provider Edge (PE) routers.)

SLIDE 5

Internet Resilience

Where are the Single Points of Failure?

If the PE router fails, the customer network is disconnected, so the PE router is a Single Point of Failure (SPoF).

(Figure: Example #B, with Customer Edge (CE) and Provider Edge (PE) routers.)

SLIDE 6

Challenges in topology analysis

  • Prior approaches analyzed static AS-level and router-level topology graphs, e.g. Nature 2000
  • Important AS-level and router-level topology might be invisible to measurement, such as backup paths, e.g. INFOCOM 2002
  • A router that appears to be central to a network’s connectivity might not be, e.g. AMS 2009

SLIDE 7

What we did

Large-scale (Internet-wide), longitudinal (2.5 years) measurement study to characterize the prevalence of Single Points of Failure (SPoF):

1. Efficiently inferred IPv6 router outage time windows
2. Associated routers with IPv6 BGP prefixes
3. Correlated router outages with the BGP control plane
4. Correlated router outages with the data plane
5. Validated inferences of SPoF with network operators

SLIDE 8

What we did

Identified IPv6 router interfaces from traceroute: 83K to 2.4M interfaces from CAIDA’s Archipelago traceroute measurements.

SLIDE 9

What we did

Probed router interfaces to infer outage windows. We used a single vantage point located at CAIDA, UC San Diego for the duration of this study.

SLIDE 10

What we did

(Animation, slides 10-19: successive probes of a router whose central counter advances 9290, 9291, 9292, 9293, 9294, … Each probe records the returned value: T1: 9290, T2: 9291, T3: 9292, T4: 9293, T5: 9294. The router then reboots, resetting the counter to 1, and subsequent probes record T6: 1, T7: 2, T8: 3.)

SLIDE 20

What we did

Probed router interfaces to infer outage windows using IPID. We infer a reboot when the time series of values returned from a router is discontinuous, indicating the router was restarted: T1: 9290, T2: 9291, T3: 9292, T4: 9293, T5: 9294, [Outage Window], T6: 1, T7: 2, T8: 3.
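The discontinuity rule can be sketched in a few lines. This is an illustration of the idea, not the authors' scamper implementation; `reset_ceiling` is an assumed threshold for deciding that a backwards jump means the counter restarted near zero rather than wrapped.

```python
# Sketch: a router assigning IPv6 fragment IDs from a central counter
# returns a monotonically increasing series; a sample that drops back
# toward zero indicates the counter was reset by a reboot.

def infer_outage_windows(samples, reset_ceiling=10_000):
    """samples: list of (timestamp, ipid) in probe order.
    Returns (t_last_before, t_first_after) windows bracketing inferred reboots.
    reset_ceiling (assumed parameter): how far the counter could plausibly
    have advanced since a restart for the drop to be called a reset."""
    windows = []
    for (t_prev, id_prev), (t_cur, id_cur) in zip(samples, samples[1:]):
        # Discontinuity: the value went backwards AND restarted near zero.
        if id_cur < id_prev and id_cur < reset_ceiling:
            windows.append((t_prev, t_cur))
    return windows

series = [(1, 9290), (2, 9291), (3, 9292), (4, 9293), (5, 9294),
          (6, 1), (7, 2), (8, 3)]
print(infer_outage_windows(series))  # [(5, 6)]
```

The returned window brackets the reboot between the last pre-outage probe and the first post-outage probe, matching the [Outage Window] marker in the series above.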

SLIDE 21

Why IPv6 fragment IDs?

  • IPv4 Fragment IDs: 16 bits, bursty velocity: every packet requires a unique ID
  • At 100 Mbps and 1500-byte packets, the Nyquist rate dictates a 4-second probing interval
  • IPv6 Fragment IDs: 32 bits, low velocity: IPv6 routers rarely send fragments
  • We average a 15-minute probing interval
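The 4-second figure for IPv4 follows from back-of-the-envelope arithmetic, which can be reproduced directly (a sketch of the calculation only; real probing must also tolerate loss and bursts):

```python
# How fast must we probe to observe every wrap of an ID counter
# at a given line rate?

def max_probe_interval(link_bps, pkt_bytes, id_bits):
    pkts_per_sec = link_bps / (pkt_bytes * 8)    # packets the link can carry
    wrap_period = (2 ** id_bits) / pkts_per_sec  # seconds to exhaust ID space
    return wrap_period / 2                       # Nyquist: sample twice per wrap

print(round(max_probe_interval(100e6, 1500, 16), 1))  # 3.9 (seconds, IPv4)
```

The same link would take over 143 hours to exhaust a 32-bit IPv6 counter, which is why a ~15-minute probing interval suffices.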

SLIDE 22

What we did

Correlated routers with prefixes using traceroute paths.

SLIDE 23

What we did

Correlated routers with prefixes using traceroute paths: 50-60 Ark VPs traceroute every routed IPv6 prefix every day.

(Figure: Ark VPs tracing toward prefixes 2001:db8:1::/48 and 2001:db8:2::/48.)
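A minimal sketch of the association step, assuming only a traceroute path and the routed prefix being traced (the study used CAIDA Ark data; the addresses below are hypothetical documentation-space examples):

```python
# Associate router interfaces seen on a traceroute path with the routed
# prefix being traced, noting which interfaces fall inside the prefix.
import ipaddress

def routers_for_prefix(trace_path, prefix):
    """trace_path: ordered router interface addresses from a traceroute.
    Returns (interface, inside_prefix) for each hop on the path."""
    net = ipaddress.ip_network(prefix)
    return [(hop, ipaddress.ip_address(hop) in net) for hop in trace_path]

path = ["2001:db8:ffff::1", "2001:db8:1::1"]  # hypothetical hops
print(routers_for_prefix(path, "2001:db8:1::/48"))
# [('2001:db8:ffff::1', False), ('2001:db8:1::1', True)]
```

Hops outside the prefix are still relevant: an upstream router can be a SPoF for a prefix it does not itself number.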

SLIDE 24

(Duplicate of Slide 23.)

SLIDE 25

What we did

Computed the distance of each router from the AS announcing the network.

(Figure: path from an Ark VP toward 2001:db8:1::/48, with the CE router labeled 1 and the PE router labeled 2 hops from the announcing AS. CE: Customer Edge; PE: Provider Edge.)
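One way to compute such a distance, assuming a traceroute path already annotated with the AS that owns each router interface. The sign convention here (first router inside the destination AS at distance 1, routers before the border counting down through 0) is an assumption for illustration, not necessarily the paper's exact convention:

```python
# Hop distance of each router on a path from the AS announcing the prefix.

def hop_distances(path_asns, dest_asn):
    """path_asns: AS number owning each router interface, in path order."""
    try:
        border = next(i for i, a in enumerate(path_asns) if a == dest_asn)
    except StopIteration:
        return None  # destination AS never appears on the path
    return [i - border + 1 for i in range(len(path_asns))]

# Hypothetical path: two hops in provider AS 64500, then destination AS 64501.
print(hop_distances([64500, 64500, 64501, 64501], 64501))  # [-1, 0, 1, 2]
```

Under this convention, the PE router of the figure sits at a non-positive distance (outside the destination AS) and the CE router at distance 1.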

SLIDE 26

What we did

Correlated router outage windows with the BGP control plane.

(Animation, slides 26-28: for prefix 2001:db8:2::/48, the router's IPID series, T1: 9290 through T5: 9294, [Outage Window], T6: 1, T7: 2, T8: 3, is aligned with BGP events from RouteViews: withdrawals (W) from Peer-1 and Peer-2 at T5.2 and from Peer-3 and Peer-4 at T5.3, then announcements (A) from all four peers at T5.8.)

SLIDE 29

What we did

Classified impact on BGP according to observed activity overlapping with the inferred outage:

  • Complete Withdrawal: all peers simultaneously withdrew the route for at least 70 seconds; the router is a Single Point of Failure (SPoF)
  • Partial Withdrawal: at least one peer withdrew the route for at least 70 seconds, but not all did
  • Churn: BGP activity for the prefix
  • No Impact: no observed BGP activity for the prefix
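These rules can be sketched as a small classifier. This is a simplification (an illustration, not the authors' pipeline): each peer contributes either `None` (it never withdrew the prefix during the window) or a single assumed (withdraw_time, reannounce_time) interval in seconds.

```python
MIN_WITHDRAWN = 70  # seconds, per the slide's threshold

def classify(intervals, churn_seen):
    """intervals: per-peer (withdraw, reannounce) interval or None.
    churn_seen: whether any other BGP activity was observed for the prefix."""
    held = [iv for iv in intervals if iv is not None]
    if held and len(held) == len(intervals):
        # overlap common to every peer's withdrawal interval
        start = max(w for w, a in held)
        end = min(a for w, a in held)
        if end - start >= MIN_WITHDRAWN:
            return "Complete Withdrawal (SPoF)"
    if any(a - w >= MIN_WITHDRAWN for w, a in held):
        return "Partial Withdrawal"
    return "Churn" if churn_seen else "No Impact"

print(classify([(100, 400), (110, 390), (105, 395)], True))  # Complete Withdrawal (SPoF)
print(classify([(100, 400), None], True))                    # Partial Withdrawal
```

Note the "simultaneously" requirement: the classifier intersects all peers' withdrawal intervals and demands the common overlap itself last 70 seconds.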

SLIDE 30

Data Collection Summary

What we did

  • Probed IPv6 routers at ~15 minute intervals from 18 Jan 2015 to 30 May 2017 (approx. 2.5 years)
  • 149,560 routers allowed reboots to be detected
  • We inferred 59,175 (40%) rebooted at least once, 750K reboots in total

(Figure: CDF of the number of outages per router.)

SLIDE 31

What we found

  • 2,385 (4%) of routers that rebooted (59K) we inferred to be a SPoF for at least one IPv6 prefix in BGP
  • Of SPoF routers, we inferred 59% to be customer edge routers; 8% provider edge; 29% within the destination AS
  • No covering prefix for 70% of withdrawn prefixes
  • During a one-week sample, covering prefix presence during withdrawal did not imply data plane reachability
  • IPv6 router reboots correlated with IPv4 BGP control plane activity

SLIDE 32

Limitations

  • Applicability to IPv4 depends on the router being dual-stack
  • Requires the IPID to be assigned from a counter
  • Cisco, Huawei, Vyatta, MikroTik, HP assign from a counter
  • 27.1% responsive for 14 days assigned from a counter
  • A router outage might end before all peers withdraw the route
  • Path exploration + Minimum Route Advertisement Interval (MRAI) + Route Flap Dampening (RFD)
  • Complex events: multiple router outages but one detected
  • We observed some complex events and filtered them out

SLIDE 33

Validation

Network              Reboots (✔/✘/?)   SPoF (✔/✘/?)
US University        7 / 0 / 8         7 / 0 / 8
US R&E backbone #1   2 / 0 / 3         3 / 2 / 0
US R&E backbone #2   3 / 0 / 1         0 / 0 / 4
NZ R&E backbone      11 / 0 / 22       4 / 2 / 27
Total                23 / 0 / 34       14 / 4 / 39

✔ = Validated Inference   ✘ = Incorrect Inference   ? = Not Validated

SLIDE 34

Validation

(Same table as Slide 33.)

Challenging to get validation data: operators often could only tell us about the last reboot.

SLIDE 35

Validation

(Same table as Slide 33.)

No falsely inferred reboots: we correctly observed the last known reboot of each router.

SLIDE 36

Validation

(Same table as Slide 33.)

We did not detect some SPoFs.

SLIDE 37

Data Collection Summary

(Figure: number of interfaces probed over time, Jan ’15 to Jan ’17, all vs. incrementing, with probing phases (a), (b), (c) marked.)

      PPS   List             Unresponsive
(a)   100   Static, 83K      12-24 hours
(b)   225   Static, 1.1M     12-24 hours
(c)   200   Dynamic, ~2.4M   7-14 days

SLIDE 38

Correlating BGP/router outages

Control: six hours prior to inferred outages, Feb 2015

(Figure: fraction of reboot/prefix pairs classified as Complete Withdrawal, Partial Withdrawal, or Churn, by distance of the router from the destination AS in IP hops, covering routers both outside and inside the destination AS.)

SLIDE 39

Correlating BGP/router outages

During the inferred outages, Feb 2015

(Figure: fraction of reboot/prefix pairs classified as Complete Withdrawal, Partial Withdrawal, or Churn, by distance of the router from the destination AS in IP hops.)

SLIDE 40

BGP Prefix Withdrawals: SPoF

(Figure: CDF of complete withdrawal duration, from under 1 minute to 16+ hours.)

44% less than 5 minutes, suggestive of router maintenance or router crash.

SLIDE 41

SPoF prefixes mostly single homed

(Figure: fraction of the population whose prefix was announced through a single upstream vs. multiple upstreams, by router hop distance across the PE/CE boundary.)

Especially SPoFs outside the destination AS, as expected.

SLIDE 42

Impact on IPv4 prefixes in BGP

(Figure: cumulative fraction of router outages vs. withdrawn peers / advertising peers, for a control period before the outage and during the outage.)

We examined IPv4 prefixes for a 5% sample of reboots. 19% of correlated IPv4 prefixes were withdrawn by at least 90% of peers during the router outage window.

SLIDE 43

Summary

  • Step towards root-cause analysis of inter-domain routing outages and events
  • Explore applicability of the method to measurement of other critical Internet infrastructure: DNS, Web, Email
  • In our 2.5 year sample of 59K routers that rebooted:
  • 4% (2.3K) were SPoF
  • SPoF were mostly confined to the edge: 59% customer edge
  • We released our code as part of scamper

https://www.caida.org/tools/measurement/scamper/

SLIDE 44

Backup Slides

SLIDE 45

Impact on IPv4 Services

We examined IPv4 prefixes for a 5% sample of reboots where at least 90% of peers withdrew during the router outage window.

Active Hosts: 39,107
Web: HTTP 25,592; HTTPS 16,321
SSH: 11,277
DNS: 7,922
Email: SMTP 7,383; IMAP 5,127

(Source: censys.io, April 2017)

SLIDE 46

Partial Withdrawals

(Figure: CDF of the fraction of peers withdrawing the route.)

50% of pairs had 1-2 peers withdraw the prefix; 10% of pairs had nearly all peers withdraw the prefix.

SLIDE 47

Degrees of ASes monitored

(Figure: cumulative fraction of rebooting ASes vs. AS degree, for Single Points of Failure and the monitored population.)

ASes that were inferred to have a SPoF were disproportionately low-degree ASes.

SLIDE 48

Activity for IPv4 prefixes in BGP

(Figure: cumulative fraction of router outages vs. peers sending updates / total peers, before and during the outage.)

At least 70% of peers reported BGP activity on IPv4 prefixes for 50% of the inferred router outages.

SLIDE 49

Reboot Window Durations

(Figure: CDF of reboot window duration, from under 1 minute to 16+ hours.)

Half the maximum reboot lengths were less than 30 minutes (~two probing rounds).

SLIDE 50

Router + BGP outage correlation

Four patterns of BGP (W)ithdraw and (A)nnounce events relative to the inferred outage window in the router's IP-ID sequence (… 10, 11, 12, [outage], 1, 2, 3 …):

  • Withdraw-Contained: W and A both fall inside the outage window
  • Outage-Contained: the outage window falls between W and A
  • Withdraw-Before: W precedes the outage window
  • Announce-After: A follows the outage window
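The temporal alignment can be sketched as a comparison of the (w, a) event pair against the window (o1, o2). The four names come from the slide; the exact boundary handling (<= vs <) here is an assumption for illustration:

```python
# Classify how a withdraw/announce pair aligns with an inferred outage window.

def correlate(w, a, o1, o2):
    if o1 <= w and a <= o2:
        return "Withdraw-Contained"  # both BGP events inside the window
    if w <= o1 and o2 <= a:
        return "Outage-Contained"    # outage window inside the W..A span
    if w < o1 <= a <= o2:
        return "Withdraw-Before"     # withdrawal preceded the window
    if o1 <= w <= o2 < a:
        return "Announce-After"      # re-announcement followed the window
    return "Uncorrelated"

print(correlate(3, 7, 2, 8))  # Withdraw-Contained
print(correlate(1, 9, 2, 8))  # Outage-Contained
```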

SLIDE 51

Data processing pipeline

(Figure: CAIDA IPv6 Topology supplies router targets to the Uptime Prober, which stores <ip, time, ipid> tuples in Cassandra; Route Views supplies <peer, time, prefix> tuples; inferred reboots, BGP correlation, and AS border distance combine to identify single points of failure.)

SLIDE 52

Inferring router position

(Figure: routers R1-R5 spanning AS X and AS Y, with interfaces x1-x3 and y1-y2. Case (a): interface addresses routed by Y appear in traceroute. Case (b): no interface addresses routed by Y appear in traceroute. The cases distinguish whether the border router is the Customer Edge (CE) or the Provider Edge (PE) router.)

SLIDE 53

Data Collection Summary

Period         (a) 18 Jan ’15 - 18 Oct ’16   (b) 18 Oct ’16 - 24 Feb ’17   (c) 24 Feb ’17 - 30 May ’17
Probing rate   100 pps                       225 pps                       200 pps
Interfaces     83K, seen Dec ’14             1.1M, seen Jun to Oct ’16     Dynamic, 2.4M in May ’17
Responsive     every round (~15 mins)        every round (~15 mins)        every round (~15 mins)
Unresponsive   12-24 hours                   12-24 hours                   7-14 days

SLIDE 54

Why IPv6 fragment IDs?

IPv4 ID values are 16 bits with bursty velocity, as every packet requires a unique value. At 100 Mbps and 1500-byte packets, the Nyquist rate dictates a 4-second probing interval.

(Figure: IPv4 header layout, highlighting the 16-bit identification field.)

SLIDE 55

Why IPv6 fragment IDs?

IPv6 ID values are 32 bits with low velocity, as systems rarely send fragmented packets.

(Figure: IPv6 header and Fragment extension header layout, highlighting the 32-bit identification field.)

SLIDE 56

Soliciting IPv6 Fragment IDs

(Packet exchange: we send a 1300-byte echo request and receive a 1300-byte echo reply; we then send a Packet Too Big message advertising an MTU of 1280; subsequent 1300-byte echo requests elicit fragmented echo replies (first fragment 1280 bytes) carrying a Fragment ID, e.g. 12345.)
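The solicited ID arrives in the IPv6 Fragment extension header, whose 8-byte layout RFC 8200 fixes as: next header (1 byte), reserved (1), fragment offset plus flags (2), identification (4). A minimal stdlib parser for that header (the sample bytes below are a hypothetical first fragment carrying the slide's example ID):

```python
# Parse the 8-byte IPv6 Fragment extension header (RFC 8200).
import struct

def parse_frag_header(hdr8):
    nxt, _res, off_flags, ident = struct.unpack("!BBHI", hdr8)
    return {"next_header": nxt,
            "offset": off_flags >> 3,       # fragment offset, 8-byte units
            "more_fragments": off_flags & 1,
            "id": ident}

# Hypothetical header: ICMPv6 payload (next header 58), first fragment
# with the M (more fragments) flag set, identification 12345.
hdr = struct.pack("!BBHI", 58, 0, 1, 12345)
print(parse_frag_header(hdr))
# {'next_header': 58, 'offset': 0, 'more_fragments': 1, 'id': 12345}
```

Collecting this identification field from successive fragmented echo replies yields exactly the per-probe counter samples used in the reboot inference.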