Systems Failure Case Studies Michael Collins. 18-849, Section B


SLIDE 1

Systems Failure Case Studies

Michael Collins. 18-849, Section B Spring 1999

SLIDE 2

Systems Failure Case Studies

■ What exists
■ A simple taxonomy
■ Common horrors
■ Conclusions

SLIDE 3

Hunting Down Resources

■ Lots of apocrypha

◆ comp.risks, Software Engineering Notes
◆ political issues and spin

■ Some good sources

◆ Have to be studied
◆ Air Failures Page

SLIDE 4

Taxonomy

■ Task isn’t finding data, it’s sorting data
■ Taxonomy needed for sorting purposes
■ Primitive taxonomies

◆ PGN, lists failures
◆ ODC
◆ Yours truly

SLIDE 5

Taxonomy Issues

■ It’s never one thing

◆ Shrug-off answers: bad design, human error
◆ Sorting by keyword

■ Validity

◆ Urban legends

■ Ceiling on consequences

◆ Deaths/Cost meaningless past a certain point

SLIDE 6

Operator Complacency

■ Therac-25

◆ Cry Wolf

■ Chernobyl

◆ Experimental Situation

■ STS-1 (Counterexample)

◆ Alert operators

SLIDE 7

Lack/Erosion Of Safety Culture

■ Ottawa Elevator Death

◆ Corporate musical chairs

■ Challenger Disaster

◆ Operator arrogance

■ Bhopal

◆ Corporate downsizing

SLIDE 8

Human Interfaces

■ Vincennes Airbus

◆ Faulty interpretation of IFF data

■ Therac-25

◆ Faulty error reporting

■ Cali/Puerto Plata

◆ Complacency/Complexity

SLIDE 9

Poor Design

■ AP Outage

◆ Dual cables

■ 1986 Arpanet Outage

◆ Single linking cable

■ Chernobyl

◆ RBMK design

SLIDE 10

Hysteria

■ Sudden Acceleration Syndrome

◆ Media exposure

■ Year 2000 IV pump

◆ Endorsed by speakers

■ Year 2000 Fire Truck ladders

◆ From the Senate

SLIDE 11

Subsidiary Issues

■ Repair Failures

◆ SS-7

■ Sabotage

◆ Lithuanian Plant

■ Language Debate

◆ Ariane V
◆ AT&T System Runaway, 1990

SLIDE 12

Conclusions

■ System failure

◆ Very rarely any one thing
◆ Failures pile up into catastrophes

■ Human factors are critical

◆ Morale, security, inspections
◆ ICHH Syndrome