Measuring and Characterizing IPv6 Router Availability




SLIDE 1

Measuring and Characterizing IPv6 Router Availability

Robert Beverly∗, Matthew Luckie†, Lorenza Mosley∗, kc claffy†

∗Naval Postgraduate School †UCSD/CAIDA

March 20, 2015

PAM 2015 - 16th Passive and Active Measurement Conference

  • R. Beverly et al.

(NPS/CAIDA) IPv6 Router Uptime PAM 2015 1 / 28

SLIDE 2

Infrastructure Uptime

Outline

1. Infrastructure Uptime
2. Methodology
3. Experiments
4. Conclusion

SLIDE 3

Infrastructure Uptime Motivation

Infrastructure "Uptime":
  • More formally: uninterrupted system availability
  • Duration between device restarts
  • Restarts due e.g. to planned device reboots, crashes, power failures

Our Work:
1. Development of an active network measurement technique to infer infrastructure uptime
2. Uptime measurement survey of ~21,000 IPv6 router interfaces over a 5-month period
3. Validation of our uptime inferences by five autonomous systems

SLIDE 5

Infrastructure Uptime Motivation

Why

Who wants uptime data?
  • Researchers
  • Operators
  • Policy makers
  • Regulators: for instance, the FCC mandates reporting of voice network outages (but not of broadband network services)

Despite the importance of the Internet as critical infrastructure, little quantitative data on Internet device availability exists!

SLIDE 7

Infrastructure Uptime Motivation

Uptime and Security

Security Implications
  • Understand whether a reboot-based security update/patch could possibly have been applied to a device (or whether the device is likely still vulnerable)
  • Determine if an attack designed to reboot a device is successful
  • Gain knowledge of a network's operational practices and maintenance windows

SLIDE 8

Infrastructure Uptime Motivation

Obtaining Remote Uptime

How to remotely obtain uptime?
  • Just login?
  • Management protocols (e.g. SNMP)?
    ...requires access privilege

Prior Network Availability Work:
  • nmap, netcraft: use TCP timestamp rate to estimate uptime
    ...only for old operating systems w/ low-frequency clocks
    ...restricted to infrastructure w/ listening TCP
  • Prevalence and persistence of BGP routes [P97, RWXZ02]
  • Operational mailing lists [FB05]
    ...indirect measures unreliable, miss events
  • Edge probing [QHP13]
    ...not infrastructure, not uptime
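
To make the TCP-timestamp approach from prior work concrete, here is a minimal sketch. It assumes the remote stack starts its timestamp counter at zero at boot and increments it at a fixed rate; the function names and sample values are illustrative and are not taken from nmap or Netcraft.

```python
def estimate_tick_rate_hz(ts1: int, t1: float, ts2: int, t2: float) -> float:
    """Estimate the remote timestamp clock rate from two (TSval, local time) samples."""
    if t2 <= t1:
        raise ValueError("samples must be ordered in time")
    # TSval is a 32-bit counter, so take the difference modulo 2**32.
    delta_ticks = (ts2 - ts1) % 2**32
    return delta_ticks / (t2 - t1)


def estimate_uptime_seconds(tsval: int, rate_hz: float) -> float:
    """Assumes the counter started at 0 at boot and has not wrapped since."""
    return tsval / rate_hz


# Hypothetical samples: two TSvals observed 10 seconds apart.
rate = estimate_tick_rate_hz(8_640_000, 0.0, 8_641_000, 10.0)
uptime = estimate_uptime_seconds(8_641_000, rate)
print(f"rate={rate:.0f} Hz, uptime={uptime / 3600:.2f} hours")
```

One reason the clock frequency matters: a 32-bit counter ticking at 1000 Hz wraps after about 49.7 days, so longer uptimes cannot be read from it directly.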

SLIDE 10

Infrastructure Uptime Motivation

Objective

Instead, our objective: find the uptime of remote routers...
  • which don't accept TCP connections from untrusted sources...
  • without privileged access...
  • using active measurement

SLIDE 12

Methodology

Obtaining an Identifier

Fundamentally, our work is active fingerprinting
  • Uses an identifier from the router's IPv6 control plane stack

Obtaining an Identifier for IPv6 Routers
  • We leverage our prior work on IPv6 alias resolution: too-big-trick (PAM 2013), speedtrap (IMC 2013)
  • To remotely obtain an identifier without privileged access

SLIDE 14

Methodology

Obtaining an Identifier

IPv6 Fragmentation Background
  • No in-network fragmentation in IPv6
  • If the next-hop interface MTU is smaller than the packet, routers:
      • drop the packet
      • send ICMP6 "packet too big" (PTB) to the source
  • IPv6 stack receiving a PTB:
      • caches the per-destination maximum MTU
      • sends packets with length > PMTU using the IPv6 fragment header extension
  • IPv6 fragment header contains an ID

Prior Insight: a router's control plane also implements the PTB cache and sends fragments if necessary, providing an ID
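
The fragmentation arithmetic can be sketched as follows. This is an illustration only: it assumes a 1300-byte packet, a cached path MTU of 1280 bytes (the IPv6 minimum MTU), and no extension headers other than the fragment header. The function name `fragment_layout` is hypothetical.

```python
IPV6_HEADER_LEN = 40      # fixed IPv6 header size in bytes
FRAG_HEADER_LEN = 8       # IPv6 fragment extension header size in bytes


def fragment_layout(packet_len: int, pmtu: int) -> list[tuple[int, int]]:
    """Return (offset, length) of the payload carried by each fragment.

    packet_len is the full unfragmented packet size, including the IPv6 header.
    Assumes no other extension headers are present.
    """
    payload = packet_len - IPV6_HEADER_LEN
    if packet_len <= pmtu:
        return [(0, payload)]          # fits within the PMTU: sent unfragmented
    # Every fragment except the last must carry a multiple of 8 bytes.
    chunk = (pmtu - IPV6_HEADER_LEN - FRAG_HEADER_LEN) // 8 * 8
    layout = []
    offset = 0
    while offset < payload:
        length = min(chunk, payload - offset)
        layout.append((offset, length))
        offset += length
    return layout


print(fragment_layout(1300, 1280))   # [(0, 1232), (1232, 28)]
```

Both fragments of one packet carry the same fragment ID, which is the value the measurement relies on.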

SLIDE 16

Methodology Too-Big Trick

Too-Big Trick

  • Our prober sends ICMP6 echo requests and fake PTBs
  • Inducing the remote IPv6 router to originate fragmented packets

[Figure: message sequence between the Prober and the remote IPv6 Interface]
  Prober → Interface: ICMP6 Echo Req 1300B, Seq=0
  Interface → Prober: ICMP6 Echo Resp 1300B
  Prober → Interface: ICMP6 Too Big
  Prober → Interface: ICMP6 Echo Req 1300B, Seq=1
  Interface → Prober: Frag ID=x, Offset=0; Frag ID=x, Offset=1232
  Prober → Interface: ICMP6 Echo Req 1300B, Seq=2
  Interface → Prober: Frag ID=x+1, Offset=0; Frag ID=x+1, Offset=1232

Fragment identifier is (frequently) monotonically increasing and resets to 0 after a reboot on (most) IPv6 stacks, including routers

SLIDE 18

Methodology Too-Big Trick

Methodology

High-Level:
  • Periodically probe IPv6 routers with PTB and ICMP6 echo request (using the scamper packet prober)
  • For interface k, obtain a time series of fragment IDs and timestamps: F_k = (f_1, t_1), (f_2, t_2), ..., (f_n, t_n) where t_i < t_{i+1}
  • If f_{i+1} < f_i, then k rebooted between t_i and t_{i+1}

Real example, 3 probes per cycle:
  Mar 4 21:30:01: 0x00000001, 0x00000002, 0x00000003
  Mar 5 04:25:05: 0x00000004, 0x00000005, 0x00000006
  ...
  Apr 21 09:39:12: 0x000001b0, 0x000001b1, 0x000001b2
  Apr 21 16:42:54: 0x00000001, 0x00000002, 0x00000003
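
The reboot rule above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the year 2014 is assumed, the helper names are hypothetical, and 32-bit counter wrap-around and noisy samples are not handled.

```python
from datetime import datetime


def infer_reboots(series):
    """Given [(timestamp, fragment_id), ...] in time order, return reboot windows.

    A reboot is inferred inside (t_i, t_{i+1}) whenever the ID decreases.
    """
    windows = []
    for (t_prev, f_prev), (t_next, f_next) in zip(series, series[1:]):
        if f_next < f_prev:
            windows.append((t_prev, t_next))
    return windows


def flatten(cycles):
    """Expand [(cycle_time, [id, id, id]), ...] into one (time, id) pair per probe."""
    return [(t, f) for t, ids in cycles for f in ids]


# Probe cycles from the example above (year assumed to be 2014).
cycles = [
    (datetime(2014, 3, 4, 21, 30, 1), [0x00000001, 0x00000002, 0x00000003]),
    (datetime(2014, 3, 5, 4, 25, 5), [0x00000004, 0x00000005, 0x00000006]),
    (datetime(2014, 4, 21, 9, 39, 12), [0x000001B0, 0x000001B1, 0x000001B2]),
    (datetime(2014, 4, 21, 16, 42, 54), [0x00000001, 0x00000002, 0x00000003]),
]

reboots = infer_reboots(flatten(cycles))
for start, end in reboots:
    print(f"reboot between {start} and {end}")
```

The result is a window rather than an instant: the reboot is only known to fall between two consecutive probe cycles.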

SLIDE 19

Methodology Too-Big Trick

Real-world heterogeneity

Not as easy in practice:
  • Odd behaviors and corner cases require de-noising, e.g.:
    ..., 405, 406, 407, 850815256, 408, 409, ...
  • Different router vendors == different IPv6 stacks
  • BSD-based devices (notably Juniper) return random fragment IDs
  • Linux-based devices return cyclic fragment IDs
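
The talk does not spell out the de-noising rules, so the following is only one possible heuristic, shown as a sketch: drop a single sample when its two neighbours agree with each other but the sample lies outside them. The function name and the `max_gap` threshold are assumptions.

```python
def drop_isolated_outliers(ids, max_gap=1000):
    """Remove single samples that break an otherwise increasing run.

    A sample is dropped when its two neighbours are increasing and close to
    each other (gap <= max_gap) but the sample itself lies outside them.
    max_gap is a hypothetical tuning knob, not a value from the talk.
    """
    cleaned = []
    for i, f in enumerate(ids):
        if 0 < i < len(ids) - 1:
            prev, nxt = ids[i - 1], ids[i + 1]
            neighbours_consistent = 0 < nxt - prev <= max_gap
            if neighbours_consistent and not (prev <= f <= nxt):
                continue  # isolated outlier: skip it
        cleaned.append(f)
    return cleaned


print(drop_isolated_outliers([405, 406, 407, 850815256, 408, 409]))
# [405, 406, 407, 408, 409]
```

A genuine reset such as 432, 433, 434, 1, 2, 3 is left untouched, because the neighbours on either side of the drop do not form an increasing pair.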

SLIDE 20

Methodology Too-Big Trick

Cyclic Fragment IDs

Linux Kernel 3.1-3.9:
  • Sets the fragment counter per inet peer using a keyed hash of the destination IP
  • The per-inet-peer data structure times out or is garbage collected
  • Hence, we get the same repeating sequence every probe cycle
  • Can still detect reboots, because the random secret for the hash is recomputed at system start!

Real example, 3 probes per cycle:
  Mar 27 16:42:31: 0x7943f889, 0x7943f890, 0x7943f891
  Mar 27 22:01:41: 0x7943f889, 0x7943f890, 0x7943f891
  ...
  Apr 26 17:45:02: 0x7943f889, 0x7943f890, 0x7943f891
  Apr 26 22:52:12: 0xc2f9dcd7, 0xc2f9dcd8, 0xc2f9dcd9
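
For cyclic IDs the comparison changes from "did the ID decrease?" to "did the repeating sequence change?". A minimal sketch, assuming every probe cycle hits a freshly created per-peer entry so that the sequence restarts each time; the function name is hypothetical and the ID values are copied from the example as shown.

```python
def infer_reboots_cyclic(cycles):
    """cycles: [(label, (id1, id2, id3)), ...] in time order.

    Returns (previous_label, current_label) pairs between which the repeating
    sequence changed, taken here as evidence that the hash secret (and thus
    the system) was reinitialized.
    """
    changes = []
    for (t_prev, ids_prev), (t_cur, ids_cur) in zip(cycles, cycles[1:]):
        if ids_cur != ids_prev:
            changes.append((t_prev, t_cur))
    return changes


cycles = [
    ("Mar 27 16:42:31", (0x7943f889, 0x7943f890, 0x7943f891)),
    ("Mar 27 22:01:41", (0x7943f889, 0x7943f890, 0x7943f891)),
    ("Apr 26 17:45:02", (0x7943f889, 0x7943f890, 0x7943f891)),
    ("Apr 26 22:52:12", (0xc2f9dcd7, 0xc2f9dcd8, 0xc2f9dcd9)),
]

cyclic_reboots = infer_reboots_cyclic(cycles)
print(cyclic_reboots)   # [('Apr 26 17:45:02', 'Apr 26 22:52:12')]
```

If probes arrived before the per-peer entry expired, the sequence would continue instead of repeating, so a production version would need to tolerate that case.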

SLIDE 23

Experiments

Data Collection

Data Gathered
  • 66,471 IPv6 interfaces from CAIDA's Ark traceroutes (31,170 unresponsive, 13,330 random)
  • We probed 21,539 distinct IPv6 router interfaces that return monotonic or cyclic fragment IDs
  • Probed each on average every 6 hours from March 5 to July 31, 2014 from a single native IPv6 vantage point

Interface Reboots → Router Reboots (see paper for details)
  • Use Speedtrap to resolve aliases
  • Separate into "core" routers (intra-AS) versus border routers (inter-AS)

SLIDE 24

Experiments

Results

[Figure: cumulative fraction of reboots over time (04-2014 to 07-2014) for interfaces, interfaces (core), routers, and routers (core)]

  • Reboots occur throughout the duration of the experiment
  • Core routers exhibit more variation, suggesting correlated events

SLIDE 25

Experiments

Results

[Figure: CCDF of interfaces/routers versus observed reboots (log-log axes) for interfaces, interfaces (core), routers, and routers (core)]

  • Overall, 68% of interfaces had no reboots, while 22% had one
  • Core routers and interfaces are relatively more stable
  • 78% of core routers had no reboots, 98% rebooted ≤ 2 times

SLIDE 26

Experiments

Results

[Figure: cumulative fraction versus observed uptime (hours, log scale) for interfaces, interfaces (core), routers, and routers (core)]

  • Experiment duration: about 150 days
  • 15% of uptimes were less than 1 day
  • Median uptime of 23 days
  • 10% had uptime ≥ 125 days
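
As a rough illustration of how such uptime statistics could be derived from inferred reboot times, the sketch below treats each uptime as the interval between two consecutive reboots of one device. This definition, the handling of devices that never reboot, and the sample dates are all assumptions; the talk does not give these details.

```python
from datetime import datetime
from statistics import median


def uptimes_in_days(reboot_times):
    """Durations, in days, between consecutive inferred reboots of one device."""
    ordered = sorted(reboot_times)
    return [(b - a).total_seconds() / 86400 for a, b in zip(ordered, ordered[1:])]


# Hypothetical reboot times for a single device.
reboot_times = [datetime(2014, 3, 10), datetime(2014, 3, 20), datetime(2014, 4, 19)]

uptimes = uptimes_in_days(reboot_times)
median_uptime = median(uptimes)
fraction_under_one_day = sum(1 for u in uptimes if u < 1) / len(uptimes)

print(uptimes, median_uptime, fraction_under_one_day)
```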

SLIDE 27

Experiments

Validation

Solicited Validation from Operators of 12 ASes:
  • 5 operators confirmed our inferences
  • Total of 15 router restarts validated
  • No false positives
  • Reboots on May 18 and June 1, 2014:
      • Operators confirmed; due to TCAM exhaustion
      • Predates 512K FIB bug discussion in August 2014!

SLIDE 28

Experiments

When do Routers Reboot

  • Geolocate routers to infer timezone using NetAcuity
  • Weekend reboots much less likely (maintenance windows during week)

Reboots by day-of-week

  Day          Core            All
  Monday        110   9.7%      925  11.2%
  Tuesday       226  20.0%     1684  20.4%
  Wednesday     227  20.0%     1553  18.8%
  Thursday      197  17.4%     1313  15.9%
  Friday        157  13.9%     1120  13.5%
  Saturday      115  10.2%      864  10.4%
  Sunday        101   8.9%      813   9.8%
  Total        1133            8272
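
Bucketing reboots by local day of week requires converting each UTC event time into the router's timezone first. A minimal sketch, assuming the UTC offset for each router is already known (the talk obtains timezones via NetAcuity geolocation, which is not reproduced here) and ignoring daylight saving time; the sample events and offsets are hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def reboots_by_local_weekday(events):
    """events: [(timezone-aware UTC datetime, utc_offset_hours), ...]."""
    counts = Counter()
    for utc_time, offset_hours in events:
        local = utc_time.astimezone(timezone(timedelta(hours=offset_hours)))
        counts[WEEKDAYS[local.weekday()]] += 1
    return counts


# Hypothetical events: both fall on a Sunday in UTC.
events = [
    (datetime(2014, 5, 18, 23, 30, tzinfo=timezone.utc), 10),   # UTC+10 router
    (datetime(2014, 6, 1, 3, 0, tzinfo=timezone.utc), -5),      # UTC-5 router
]

weekday_counts = reboots_by_local_weekday(events)
print(weekday_counts)
```

Here the first event lands on a local Monday and the second on a local Saturday, which shows why the conversion matters when looking for maintenance windows.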

SLIDE 29

Experiments

Control Plane Correlation

Correlation
  • Finally, we sought to determine whether the reboot events we infer are also observed in the control plane
  • Manually searched routeviews BGP data for a prefix withdrawal corresponding to a reboot
  • Focused on customer routers single-homed to a provider (where a globally visible withdrawal is likely)

SLIDE 30

Experiments

Example Reboot Correlation w/ BGP

  • CPE router at AAD, customer of AARNet
  • Upper dots represent our inferred reboot events for the router with interface 2001:388:1:700d::2
  • Lower dots represent global BGP events for the prefix (2405:7100::/33) announced by the router

[Figure: timeline (UTC), Apr 29th to May 1st, of fragment ID samples for 2001:388:1:700d::2 and BGP events (W, A) for 2405:7100::/33]
  (a) IDs 544, 545, 546 at 22:32; BGP W 4:46, A 4:49; IDs 1, 2, 3 at 5:36
  (b) IDs 10, 11, 12 at 22:35; BGP W 1:57, A 2:01, W 2:12, A 2:13; IDs 1, 2, 3 at 5:05
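
The manual correlation can be expressed as a simple window check: does a BGP withdrawal fall between the last probe before a reboot and the first probe after it? The sketch below uses the times from the timeline figure; the year 2014, the assignment of times to calendar days, and the reading of W as a withdrawal are assumptions, and the function name is hypothetical.

```python
from datetime import datetime


def withdrawals_in_window(window, bgp_events):
    """Return withdrawal times that fall inside an inferred reboot window."""
    start, end = window
    return [t for t, kind in bgp_events if kind == "W" and start <= t <= end]


# Inferred reboot windows: (last probe before reset, first probe after reset).
window_a = (datetime(2014, 4, 29, 22, 32), datetime(2014, 4, 30, 5, 36))
window_b = (datetime(2014, 4, 30, 22, 35), datetime(2014, 5, 1, 5, 5))

# BGP events for the prefix: "W" or "A" as labelled in the figure.
bgp_events = [
    (datetime(2014, 4, 30, 4, 46), "W"),
    (datetime(2014, 4, 30, 4, 49), "A"),
    (datetime(2014, 5, 1, 1, 57), "W"),
    (datetime(2014, 5, 1, 2, 1), "A"),
    (datetime(2014, 5, 1, 2, 12), "W"),
    (datetime(2014, 5, 1, 2, 13), "A"),
]

matches_a = withdrawals_in_window(window_a, bgp_events)
matches_b = withdrawals_in_window(window_b, bgp_events)
print(matches_a, matches_b)
```

Under these assumptions each inferred reboot window contains at least one withdrawal, and the BGP timestamps narrow the reboot time well below the polling interval.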

SLIDE 32

Summary

Summary

  • Developed a technique to infer the uptime of remote IPv6 devices without privileged access
  • First quantitative wide-scale study of IPv6 router availability and reboot behavior

Thanks! Questions?
http://www.cmand.org/ipv6/

SLIDE 33

Summary

Backup Slides

SLIDE 34

Summary

Limitations

Limitations of methodology:
  • Only applicable to IPv6; IPv4 is the subject of current research
  • Does not work for random fragment IDs (Juniper)
  • Inferred reboot granularity is limited to the polling rate
  • Can't detect multiple reboots that occur between polls
  • Can't attribute a reboot to a root cause (power failure, software fault, upgrade)

SLIDE 35

Summary

Future Directions

  • Probe and characterize other IPv6 critical infrastructure, e.g. web and DNS servers
  • Smarter/faster probing techniques to increase the granularity of reboot time inferences
  • Broader correlation with IPv4 and IPv6 BGP events
  • Develop uptime inferences for IPv4
