What can nuclear engineering teach us about software? Todd Lewis



  1. What can nuclear engineering teach us about software? Todd Lewis & Eduardo Bellani tlewis@brickabode.com emb@brickabode.com 24 April 2017

  2. Read every word of Lamport
     ● Leslie Lamport (1977): "Proving the Correctness of Multiprocess Programs"
     ● This paper is amazing
     ● Leslie Lamport is amazing
     ● He did "Time, clocks, and the ordering of events in a distributed system" only a year later
     ● (Has there ever been a computer science decade as great as the 1970s?)

  3. System properties come in two kinds!
     ● Computing is great at liveness: lots of features!
     ● Benefit of features often outweighs cost of failure, so “Move Fast & Break Things”
     ● However, we often do safety so badly that there is opportunity there; lots of low-hanging fruit

               Liveness      Safety
     When      Sometimes     Always
     Where     Somewhere     Everywhere
     Nature    Good thing    Bad thing
     Action    Happens       Does not happen
     Means     Feature       Control
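The two columns of slide 3 can be read as two kinds of checks over a recorded run of a system. The following is a minimal sketch, not something from the talk: the trace format and the field names `coolant_flow` and `request_served` are hypothetical, chosen only for illustration. Note that a finite trace can refute a safety property, but it can only witness a liveness property, never refute it, since the good thing might still happen later.

```python
from typing import Callable, Dict, Sequence

# Hypothetical: one snapshot of system state, as a plain dictionary.
State = Dict[str, object]


def holds_always(trace: Sequence[State], bad: Callable[[State], bool]) -> bool:
    """Safety: always, everywhere, the bad thing does not happen."""
    return not any(bad(state) for state in trace)


def holds_eventually(trace: Sequence[State], good: Callable[[State], bool]) -> bool:
    """Liveness (finite approximation): sometimes, somewhere, a good thing happens."""
    return any(good(state) for state in trace)


# Illustrative trace; the values are made up for this example.
trace = [
    {"coolant_flow": 1.0, "request_served": False},
    {"coolant_flow": 0.8, "request_served": False},
    {"coolant_flow": 0.9, "request_served": True},
]

# Safety: coolant flow never drops below a (hypothetical) threshold.
no_coolant_loss = holds_always(trace, lambda s: s["coolant_flow"] < 0.5)

# Liveness: the request is served at some point in the run.
request_served = holds_eventually(trace, lambda s: s["request_served"])
```

A safety check corresponds to a control (something that must never be allowed), while a liveness check corresponds to a feature (something that must eventually be delivered).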

  4. Let’s talk about saving lives
     ● Starting in the 1970s, human factors analysis started happening in aviation
     ● What used to be called “pilot error” is now recognized as “bad interface design”
     ● Hundreds of thousands of people are alive today who would otherwise be dead because of this advance

  5. Compare and contrast

  6. Let’s design a nuclear plant!
     ● We are putting a nuclear plant next to the ocean
     ● Your mother lives next door
     ● What failures would you want the designers to care about?

  7. Multi-system failures (Oceanic edition)

     Bad outcome            Cause                                    Control
     Multi-system failure   Tsunami                                  Put critical infrastructure up high
     Multi-system failure   Corrosion                                Annual inspections
     Multi-system failure   Flooding                                 Sea wall and drainage
     Multi-system failure   Loss of coolant (biomass clogs pipes)    Inspect & clean pipes
     Multi-system failure   Sea-borne attack                         Sea walls
     Multi-system failure   Erosion kills plant                      Sea walls
     Multi-system failure   Sedimentation blocks coolant             Inspect & dredge

  8. We can do this systematically
     1) What failures matter? (“Bad business outcome” is a useful criterion)
     2) For each failure, what can cause it?
     3) How do you address each cause?
     ● Gives you a finite list of hazards handled
     ● Gives you a clear model to give to your operators: here are the risks we manage, and how
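The three questions on slide 8 map naturally onto a small data structure. The sketch below is one possible way to record a Failures → Causes → Controls analysis and to list the causes that still lack a control. The class and function names are assumptions made for this example, and the "Grid power loss" entry is hypothetical (it is not in the talk) and is included only to show what a gap looks like.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Cause:
    description: str
    controls: List[str] = field(default_factory=list)


@dataclass
class Failure:
    outcome: str
    causes: List[Cause] = field(default_factory=list)


def uncontrolled(failures: List[Failure]) -> List[Tuple[str, str]]:
    """Return (outcome, cause) pairs for every cause with no control listed."""
    return [
        (failure.outcome, cause.description)
        for failure in failures
        for cause in failure.causes
        if not cause.controls
    ]


# Entries taken from the slide 7 table, plus one hypothetical gap.
register = [
    Failure(
        outcome="Multi-system failure",
        causes=[
            Cause("Tsunami", ["Put critical infrastructure up high"]),
            Cause("Corrosion", ["Annual inspections"]),
            Cause("Flooding", ["Sea wall and drainage"]),
            # Hypothetical cause, not from the talk: no control recorded yet.
            Cause("Grid power loss (hypothetical)"),
        ],
    )
]

for outcome, cause in uncontrolled(register):
    print(f"No control for: {outcome} <- {cause}")
```

Keeping the register as data gives the finite list of handled hazards that the slide describes, and the same structure can be handed to operators as the model of which risks are managed and how.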

  9. Pro tip: Create a Red Team
     ● It is psychologically difficult to look at your own designs critically
     ● You need distance in order to tease out assumptions and blind spots
     ● Bring an outsider into your analysis, and encourage them to ask “dumb questions”

  10. How to do this
     1) Get a few hours of whiteboard time: your team, plus a smart outsider
     2) Failures → Causes → Controls
     3) Write it up
     4) Start sharing it with others: here are some new options to improve our system

  11. Where to find more
     ● Engineering a Safer World, by Nancy Leveson
     ● Resilience Engineering, by Hollnagel, Woods and Leveson
     ● Drift into Failure, by Sidney Dekker

  12. “Correct, On-Time, On-Budget”
     ● Do you like building systems that work?
     ● Are you a Haskell, ML, or Lisp programmer?
     ● Meet us after the talk!
     ● jobs@brickabode.com
