fault tolerance and security

Fault Tolerance and Security, Heechul Yun


  1. Fault Tolerance and Security, Heechul Yun

  2. Safety Failures in CPS. Therac-25: • Computer-controlled medical X-ray treatments • Six people died or were injured due to massive overdoses (1985-1987) • Caused by synchronization mistakes. Ariane 5: • 7 billion dollar rocket was destroyed after 40 secs (6/4/1996) • “caused by the complete loss of guidance and attitude information” • Caused by a 64-bit floating-point to 16-bit integer conversion
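The Ariane 5 failure comes down to an unprotected narrowing conversion. A minimal sketch of two defensive alternatives follows, assuming nothing about the actual flight code (which was written in Ada): a checked conversion that raises on out-of-range input, and a saturating one that clamps. The function names are hypothetical and chosen for illustration only.

```python
import math

# Range of a 16-bit signed integer.
INT16_MIN, INT16_MAX = -32768, 32767


def to_int16_checked(value: float) -> int:
    """Truncate toward zero; raise OverflowError if the result cannot fit."""
    if math.isnan(value) or math.isinf(value):
        raise OverflowError(f"{value} has no int16 representation")
    truncated = int(value)
    if not INT16_MIN <= truncated <= INT16_MAX:
        raise OverflowError(f"{value} does not fit in int16")
    return truncated


def to_int16_saturating(value: float) -> int:
    """Clamp to the int16 range instead of failing (NaN maps to 0 here)."""
    if math.isnan(value):
        return 0  # assumption: a neutral value is acceptable for NaN
    clamped = max(float(INT16_MIN), min(float(INT16_MAX), value))
    return int(clamped)
```

Which variant is appropriate depends on the system: raising makes the fault visible to a handler, while saturating keeps the computation going with a bounded but inaccurate value.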

  3. Safety Failures in CPS • Failures in CPS have consequences • Sources: http://www.nytimes.com/2015/01/28/us/white-house-drone.html http://petapixel.com/2015/12/23/crashing-camera-drone-narrowly-misses-top-skiier/ http://rochester.nydatabases.com/map/domestic-drone-accidents http://www.nytimes.com/interactive/2016/07/01/business/inside-tesla-accident.html

  4. Air France 447 (2009) • Airbus A330 crashed into the Atlantic Ocean in 2009 • Caused in part by the computer’s misguidance – Pitot tube (speed sensor) failure → Flight Director (FD) malfunction (shows “head up”) → pilots follow the faulty FD → enter stall [Figure: normal flight vs. stall] http://www.slate.com/blogs/the_eye/2015/06/25/air_france_flight_447_and_the_safety_paradox_of_airline_automation_on_99.html http://www.spiegel.de/international/world/experts-say-focus-on-manual-flying-skills-needed-after-air-france-crash-a-843421.html

  5. Lion Air Flight 610 (2018) • Boeing 737 MAX crashed into the Java Sea in 2018 • Caused by the stall prevention system (MCAS) – sensor error (plane reported as “stall”) → nose down (toward the ocean)

  6. Ethiopian Air 302 (2019) https://www.seattletimes.com/business/boeing-aerospace/failed-certification-faa-missed-safety-issues-in-the-737-max-system-implicated-in-the-lion-air-crash

  7. Boeing 737 MAX • Controversial design of the MCAS – Designed to use a single AoA sensor • Even though there are two AoA sensors • Single point of failure – More powerful than the pilots • Overrode the pilots’ pitch-up commands • Yet classified as “hazardous” (Lvl B), not critical (Lvl A) • Planned software update – Use both sensors, limit the power https://www.seattletimes.com/business/boeing-aerospace/failed-certification-faa-missed-safety-issues-in-the-737-max-system-implicated-in-the-lion-air-crash/
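The two ideas in the planned update (use both sensors, limit the power) can be sketched as a cross-check plus an authority limit. This is an illustrative sketch only, not Boeing's implementation: the function name and every threshold below are hypothetical values picked for the example.

```python
# All constants are hypothetical illustration values, not real aircraft data.
MAX_DISAGREEMENT_DEG = 5.5  # largest tolerated difference between the two AoA sensors
STALL_AOA_DEG = 14.0        # angle of attack above which the function intervenes
MAX_TRIM_CMD_DEG = 2.5      # authority limit on a single nose-down command


def nose_down_trim(aoa_left_deg: float, aoa_right_deg: float) -> float:
    """Return a nose-down trim command in degrees; 0.0 means no intervention."""
    # Cross-check: if the sensors disagree, trust neither and do nothing,
    # leaving pitch control entirely with the pilots.
    if abs(aoa_left_deg - aoa_right_deg) > MAX_DISAGREEMENT_DEG:
        return 0.0
    aoa_deg = (aoa_left_deg + aoa_right_deg) / 2.0
    if aoa_deg <= STALL_AOA_DEG:
        return 0.0
    # Limit authority so the command can never exceed a fixed bound.
    return min(aoa_deg - STALL_AOA_DEG, MAX_TRIM_CMD_DEG)
```

With a single faulty sensor (for example 5.0 vs. 74.5 degrees) the cross-check returns 0.0 instead of commanding nose down. Two sensors can only detect a disagreement, not decide which one is right, so the safe reaction here is to disengage.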

  8. Lufthansa A321 (2014) • A similar prior incident that did not kill anyone • Faulty AoA sensor readings (ice) triggered an automated stall prevention system called ‘Alpha Prot’, resulting in a 4,000 ft loss of altitude • Triple redundant sensors with a voting mechanism, but two sensors iced up simultaneously, so the only working sensor’s value was discarded • “When Alpha Prot is activated due to blocked AOA probes, the flight control laws order a continuous nose down pitch rate that, in a worst case scenario, cannot be stopped with backward sidestick inputs, even in the full backward position.” https://avherald.com/h?article=47d74074
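Why voting discards the only working sensor can be seen in a small mid-value-select voter. This is a generic sketch of triple-redundant voting, not the Airbus flight control logic; the function name, tolerance, and readings are hypothetical.

```python
from statistics import median
from typing import List, Sequence, Tuple


def vote(readings: Sequence[float], tolerance: float) -> Tuple[float, List[int]]:
    """Mid-value select over redundant sensors.

    Returns the voted value and the indices of readings rejected because
    they differ from the voted value by more than `tolerance`.
    """
    voted = median(readings)
    rejected = [i for i, r in enumerate(readings) if abs(r - voted) > tolerance]
    return voted, rejected


# Single fault: the one bad sensor (index 2) is outvoted, as intended.
single_fault = vote([5.0, 5.1, 20.0], tolerance=1.0)

# Common-mode fault: two sensors fail the same way (both iced), so the
# voter keeps the wrong value and rejects the healthy sensor (index 2).
common_mode = vote([20.0, 20.0, 5.0], tolerance=1.0)
```

The voter masks one independent fault, but it rests on the assumption that faults are independent. Icing violates that assumption because it affects several probes at once.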

  9. Tesla Autopilot (2016) • Tesla Autopilot failed to recognize a trailer, resulting in the death of the driver http://www.nytimes.com/interactive/2016/07/01/business/inside-tesla-accident.html

  10. NHTSA Report • Both the radar and camera sub-systems are designed for front-to-rear collision prediction mitigation or avoidance. • The system requires agreement from both sensor systems to initiate automatic braking. • The camera system uses Mobileye’s EyeQ3 processing chip, which uses a large dataset of the rear images of vehicles to make its target classification decisions. • Complex or unusual vehicle shapes may delay or prevent the system from classifying certain vehicles as targets/threats. https://static.nhtsa.gov/odi/inv/2016/INCLA-PE16007-7876.PDF

  11. NHTSA Report • Object classification algorithms in the Tesla and peer vehicles with AEB technologies are designed to avoid false positive brake activations. • The Florida crash involved a target image (side of a tractor trailer) that would not be a “true” target in the EyeQ3 vision system dataset. • The tractor trailer was not moving in the same longitudinal direction as the Tesla, which is the vehicle kinematic scenario the radar system is designed to detect. https://static.nhtsa.gov/odi/inv/2016/INCLA-PE16007-7876.PDF
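Requiring agreement from both sensors before braking trades missed detections for fewer false alarms. A rough worked sketch follows, assuming (hypothetically) that radar and camera errors are statistically independent. The function names and all rates are made up for illustration and do not come from the NHTSA report.

```python
from typing import Tuple


def should_brake(radar_sees_threat: bool, camera_sees_threat: bool) -> bool:
    """AND-fusion: brake only when both sensor systems agree on a threat."""
    return radar_sees_threat and camera_sees_threat


def and_fusion_rates(miss_radar: float, miss_camera: float,
                     fa_radar: float, fa_camera: float) -> Tuple[float, float]:
    """Return (miss rate, false-alarm rate) of AND-fusion.

    Assumes independent sensor errors, which real sensors may not satisfy.
    """
    # A real threat is detected only if neither sensor misses it.
    fused_miss = 1.0 - (1.0 - miss_radar) * (1.0 - miss_camera)
    # A false alarm requires both sensors to raise one at the same time.
    fused_false_alarm = fa_radar * fa_camera
    return fused_miss, fused_false_alarm


# Hypothetical per-sensor rates: 10% / 20% misses, 1% / 2% false alarms.
fused_miss, fused_fa = and_fusion_rates(0.10, 0.20, 0.01, 0.02)
```

Under these made-up numbers the fused false-alarm rate drops to 0.02%, while the fused miss rate rises to 28%, higher than either sensor alone. A target that one sensor is not designed to recognize is therefore enough to suppress braking.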

  12. Uber Self-Driving Car (2018) • Killed a pedestrian crossing a road in Arizona https://www.nytimes.com/2018/03/19/technology/uber-driverless-fatality.html

  13. NTSB Report • The system first registered radar and LIDAR observations of the pedestrian about 6 seconds before impact • Software classified the pedestrian as an unknown object, as a vehicle, and then as a bicycle, with varying expectations of future travel path • At 1.3 seconds before impact, the system determined that an emergency braking maneuver was needed • Emergency braking maneuvers are not enabled while the vehicle is under computer control, to reduce the potential for erratic vehicle behavior. Failures in CPS have consequences. https://www.ntsb.gov/investigations/AccidentReports/Reports/HWY18MH010-prelim.pdf

  14. Challenges for Safe CPS • Time Predictability • Complexity • Reliability • Security

  15. Real-Time Predictability [Figure: victim and attacker tasks on Core1-Core4 sharing the LLC] • Observed worst-case: >300X slowdown – On simple in-order multicores (Raspberry Pi 3, Odroid C2). Difficult to guarantee predictable timing. Michael G. Bechtel and Heechul Yun. “Denial-of-Service Attacks on Shared Cache in Multicore: Analysis and Prevention.” In RTAS, 2019 (to appear, Outstanding Paper Award)

  16. Complexity • Software complexity increases [Figures: “Growth in Software Size” by flight vehicle (Apollo 1968, Space Shuttle, Orion (est.)) and “Lines of Code in Typical GM Car” by model year (1970-2010), in thousands of lines of code] More bugs, unintended side-effects. Figures are from NASA JPL, “Flight Software Complexity,” 2008

  17. Hardware Reliability • Transient hardware faults (soft errors) – Due to environmental factors (e.g., alpha particles, cosmic radiation) – Manifested as software failures – Bigger problem in advanced CPUs • Increased density → higher soft error rate (SER) per chip. More susceptible to environmental factors. Ibe et al., “Scaling Effects on Neutron-Induced Soft Error in SRAMs Down to 22nm Process” (Hitachi) http://www.cotsjournalonline.com/articles/view/102279
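One common way to tolerate a soft error in software (a general technique, not one the slides prescribe) is to keep three copies of a value and take a bitwise majority vote when reading it back. A minimal sketch, with a hypothetical function name and example values:

```python
def bitwise_majority(a: int, b: int, c: int) -> int:
    """Each result bit takes the value held by at least two of the three copies."""
    return (a & b) | (a & c) | (b & c)


stored = 0b1011_0010

# One copy suffers a single bit flip (a simulated soft error).
one_flip = bitwise_majority(stored, stored ^ 0b0000_1000, stored)

# Two copies are hit, but in different bit positions: still recoverable,
# because every individual bit is correct in two of the three copies.
two_flips = bitwise_majority(stored ^ 0b0000_0001, stored ^ 0b1000_0000, stored)
```

The vote fails only when the same bit is corrupted in two copies, which is why the copies should not share a failure cause.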

  18. Hardware Reliability [Figure: DRAM rows on a wordline; an aggressor row is opened and closed (V_HIGH/V_LOW) between victim rows] • Repeatedly opening and closing a row induces disturbance errors in adjacent rows. Hardware can be exploited by attackers. This slide is from Dr. Yoongu Kim’s ISCA 2014 presentation

  19. Software Security • Insecure software in CPS → safety hazards • Stuxnet: first reported act of cyber warfare, targeting Iranian nuclear plants (destroying centrifuges) • Vermont power grid hack by Russia • Remote hack into cars (Jeep) • Police drone hacking. CPS software can be attacked by hackers

  20. Hardware Security https://meltdownattack.com/ Hardware can leak secrets to attackers

  21. How to Improve Safety of CPS? • Correct by design – Formal-method-based software development • Difficult for a complex system – Use reliable hardware • e.g., radiation-hardened processors • Expensive and low performance • Deal with failures – Run-time monitoring and redundancy
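Run-time monitoring with redundancy can be sketched as a decision module that forwards the output of a complex, untrusted controller only while it stays inside a safety envelope, and otherwise falls back to a simple, trusted controller. This follows the general idea of the Simplex architecture named in the reading list, but it is not the paper's implementation; the altitude envelope, controllers, and all names are hypothetical.

```python
from typing import Callable

# Hypothetical safety envelope for an altitude-hold example.
MIN_ALT_M, MAX_ALT_M = 10.0, 120.0
DT_S = 1.0  # assumed control period in seconds


def is_safe(altitude_m: float, climb_rate_mps: float) -> bool:
    """Accept a command only if the predicted next altitude stays in the envelope."""
    predicted = altitude_m + climb_rate_mps * DT_S
    return MIN_ALT_M <= predicted <= MAX_ALT_M


def safety_controller(altitude_m: float) -> float:
    """Simple trusted fallback: steer toward mid-envelope at no more than 2 m/s."""
    target = (MIN_ALT_M + MAX_ALT_M) / 2.0
    return max(-2.0, min(2.0, target - altitude_m))


def decision_module(altitude_m: float,
                    complex_controller: Callable[[float], float]) -> float:
    """Use the complex controller's command unless it crashes or is unsafe."""
    try:
        command = complex_controller(altitude_m)
    except Exception:
        return safety_controller(altitude_m)  # controller crashed
    if is_safe(altitude_m, command):
        return command
    return safety_controller(altitude_m)  # command would leave the envelope


def crashing_controller(altitude_m: float) -> float:
    raise RuntimeError("simulated software fault")
```

For example, at 20 m a buggy controller that commands -50 m/s is overridden with the fallback's +2 m/s, while a well-behaved command of +1 m/s passes through unchanged. The design keeps the trusted part (monitor and fallback) small enough to verify, which addresses the complexity problem without requiring the complex controller to be proven correct.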

  22. This Week: Fault Tolerance/Security • A Simplex Architecture for Intelligent and Safe Unmanned Aerial Vehicles, RTCSA 2016 • Comprehensive Experimental Analyses of Automotive Attack Surfaces, USENIX Security, 2011 (Dalton)

  23. arXiv: https://arxiv.org/abs/1811.12555 Video: https://www.youtube.com/watch?v=poRbH__kB2M
