Normal Accidents: A Book Report

slide-1
SLIDE 1

Normal Accidents: A Book Report

Bill Tetzlaff
September 6, 2001

slide-2
SLIDE 2

Normal Accidents

Charles Perrow
Princeton University Press, 1999
ISBN 0-691-00412-9
First published by Basic Books, 1984
Discipline: Sociology of Organizations

slide-3
SLIDE 3

What are Normal Accidents?

Accidents that are seemingly extremely rare, but that are in fact "normal"
Also called "system accidents"
They are multiple-failure accidents in which there are unforeseen interactions that make them either worse or harder to diagnose

slide-4
SLIDE 4

Some terms

Interactive Complexity

Failures of two components interact in an unexpected way

Tightly Coupled

Processes that are parts of a system, that happen quickly, and that cannot be turned off or isolated

Perrow Thesis: Tightly coupled systems with high interactive complexity will have Normal Accidents
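
The thesis above reads as a two-factor rule, and a small sketch can make that explicit. The enum and function names below are illustrative assumptions, not terminology from the book; the two example placements are the ones this report itself gives later (chemical plant as tight and complex, an assembly line with storage between workstations as loose and linear).

```python
from enum import Enum


class Coupling(Enum):
    LOOSE = "loose"
    TIGHT = "tight"


class Interaction(Enum):
    LINEAR = "linear"
    COMPLEX = "complex"


def prone_to_normal_accidents(coupling: Coupling, interaction: Interaction) -> bool:
    """Perrow's thesis as stated on the slide: both properties must hold."""
    return coupling is Coupling.TIGHT and interaction is Interaction.COMPLEX


# Placements taken from later slides in this report.
chemical_plant = (Coupling.TIGHT, Interaction.COMPLEX)
buffered_assembly_line = (Coupling.LOOSE, Interaction.LINEAR)

print(prone_to_normal_accidents(*chemical_plant))          # True
print(prone_to_normal_accidents(*buffered_assembly_line))  # False
```

The rule is deliberately a conjunction: under this reading, a system that is tight but linear, or complex but loose, is not predicted to have normal accidents.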

slide-5
SLIDE 5

Operator Error

In his experience, postmortems blame "operator error" 60 to 80 percent of the time
He feels that operators are scapegoated by people with 20/20 hindsight
Mostly these are errors that are designed into the system

slide-6
SLIDE 6

Three Mile Island

Unit Number 2 in a nuclear plant near Harrisburg, Pennsylvania
March 28, 1979
Many of us watched this unfold on the evening news for days - pretty scary

slide-7
SLIDE 7

TMI System

[Diagram of the TMI system. Labels: PORV, Feedwater, HPI]

slide-8
SLIDE 8

Cooling System

Primary Cooling System

High-pressure, radioactive water circulating through the reactor
Heat exchanger transfers heat to the secondary system

Secondary Cooling System

Cools the primary cooling system
Creates steam to run the turbines to generate electricity
Due to thin tubes in the turbine, the water must be very pure
Continuously cleaned by a "polisher system"

slide-9
SLIDE 9

How it started

The polisher leaked about a cup a day of water through a seal
Water vapor got into a pneumatic system that drives some instruments
This water vapor interrupted pressure to two valves in the feedwater system, which caused two feedwater pumps to shut down
Lack of flow in the secondary system triggered a safety system that shut down the turbines
This was the first indication of trouble to the operators
At this point the reactor still needs to be cooled - or else

slide-10
SLIDE 10

Emergency feedwater takes over

An emergency feedwater system starts up to pump stored cold water through the secondary system to remove the accumulating heat
The pumps were running, but valves on the pipes had been incorrectly left closed after prior maintenance
The operators insist they were left open
The checklists say they were opened
A repair tag on a broken indicator hung over the indicator on the control panel that showed that the valves were closed
Redundant pipes, redundant pumps, and redundant valves, all thwarted by having the two valves physically in the same place and both mis-set
Eight minutes later they noticed the valves were shut; by then the damage was done
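
The point about redundancy being thwarted can be put in numbers. The sketch below compares two redundant valves that fail independently with the same pair when one shared event (for example a single maintenance job on co-located valves) can mis-set both. All probabilities are hypothetical values chosen for illustration; none come from the book or from TMI records.

```python
def independent_failure(p_valve: float, n_valves: int) -> float:
    """Chance that all n redundant valves are closed if each fails independently."""
    return p_valve ** n_valves


def common_cause_failure(p_valve: float, n_valves: int, p_shared: float) -> float:
    """Chance that all valves are closed when one shared event can close
    every valve at once, in addition to independent mistakes."""
    return p_shared + (1.0 - p_shared) * independent_failure(p_valve, n_valves)


P_VALVE = 0.01    # hypothetical: one valve left closed by mistake
P_SHARED = 0.005  # hypothetical: one job mis-sets both co-located valves

print(independent_failure(P_VALVE, 2))             # about 1 in 10,000
print(common_cause_failure(P_VALVE, 2, P_SHARED))  # about 1 in 200
```

Under these assumed numbers the shared cause dominates: the second valve adds almost no protection once both can be mis-set by the same action.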

slide-11
SLIDE 11

With no cooling the reactor got hot

Due to overheating the reactor "scrammed" automatically

This shuts down the reaction

Enough heat remains in the reactor that it takes a normally working cooling system several days to cool it off
Without cooling the pressure goes up
An ASD (Automatic Safety Device) takes over to temporarily relieve the pressure: the Pilot Operated Relief Valve (PORV)

slide-12
SLIDE 12

PORV

The PORV is supposed to vent pressure briefly, and then reclose

If it stays open too long, liquid escapes, pressure in the reactor drops, steam forms causing voids in the water, cooling is impaired, and some places get hotter still

Thirty-two thousand gallons of water eventually went out this unclosed valve
There was an indication on the control panel that the message to reseat had been sent to the valve

However, no indication was available that it had reseated

We are now thirteen seconds into the "transient"

An indicator shows that there is extra water from an unknown source
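
The gap between "the message to reseat was sent" and "the valve reseated" is the difference between a commanded state and an observed state. Here is a minimal sketch of a panel indicator that keeps the two apart; the class, field, and function names are hypothetical and do not describe the actual TMI instrumentation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReliefValve:
    commanded_closed: bool         # what the control logic asked for
    sensed_closed: Optional[bool]  # what a position sensor reports; None if no sensor


def panel_light(valve: ReliefValve) -> str:
    """Status text for a hypothetical control panel."""
    if valve.sensed_closed is None:
        # Only the command is known: say so instead of implying the valve moved.
        state = "closed" if valve.commanded_closed else "open"
        return f"commanded {state}, position unknown"
    if valve.sensed_closed != valve.commanded_closed:
        return "MISMATCH: valve did not follow command"
    return "closed" if valve.sensed_closed else "open"


print(panel_light(ReliefValve(commanded_closed=True, sensed_closed=None)))
print(panel_light(ReliefValve(commanded_closed=True, sensed_closed=False)))
```

The first call corresponds to the situation on the slide: the panel can only honestly report the command. The second shows what a direct position sensor would have allowed it to say.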

slide-13
SLIDE 13

Automatic Coolant Pump Starts

This is another automatic safety system, which pumps water to cool the reactor; it automatically starts at 13 seconds. The second pump was manually started by the operator

For three minutes it looked like the core was being cooled successfully

However, apparently due to the steam voids, the cooling was not happening

The secondary steam generators were not getting water and boiled dry - at the same time water was flowing out of the primary cooling system through the stuck pressure relief valve

slide-14
SLIDE 14

High Pressure Injection (HPI) Starts

This is an automatic emergency device that forces cold water into the reactor to cool it down
The reactor was flooded for two minutes, and then the operators drastically cut back the flow

This was regarded as the key operator error
What they did not realize was that the water was flowing out the PORV and the core would become uncovered

Two dials confused the operators:

One said the pressure in the reactor was rising
The other said it was falling

The Kemeny commission thought the operators should have realized this meant a LOCA (Loss of Coolant Accident)

slide-15
SLIDE 15

Conditions in the control room

Three audible alarms are making a din
Many of the 1,600 indicator lights are blinking
The computer is way behind in printing out error messages

It turns out they can only be printed, not spooled to disk; to see the current condition the operators would have to purge the printer and lose potentially valuable information

The reactor coolant pumps begin to bang and shake, due to cavitation from lack of water to pump - they are shut off

slide-16
SLIDE 16

Stuck open PORV valve discovered!

The operators checked the valve and found it open
They closed it

With some trepidation, since they were messing with a safety system

The reactor core had been uncovered at this point and had partially melted
Another 30 minutes without coolant and it would probably have been a total meltdown

slide-17
SLIDE 17

The Hydrogen Bubble

If the cladding on the uranium pills gets too hot in the presence of water, hydrogen gas is given off
At one point, 33 hours into the incident, there was an explosion and spiking of the instruments
Pressure reached half the rated pressure of the containment building

The containment building had been significantly over-engineered out of concern about being hit by an airplane from a nearby airport
Three years later they found the damage done in the containment building by the missiles thrown by the explosion

The working systems cooling and controlling the reactor might have been damaged, but were not

slide-18
SLIDE 18

Finally under control

At this point the reactor eventually was cooled down and the investigation heated up
In the end the operators were blamed

though the commission members could not agree on what the errors were

slide-19
SLIDE 19

Is this typical?

Perrow chronicles a number of other nuclear incidents, without the magnitude, but with the characteristic errors
Indian Point Number 2

An indicator light is viewed as faulty, while 100,000 gallons of cold Hudson River water accumulate around the reactor from a broken pipe
Another indicator, meant to measure water, does not detect it because it is designed to detect hot water
An unrelated operator error caused the reactor to shut down
When they went into the containment building they found 9 feet of water around the reactor

Dresden Number 2 in Chicago, Fermi in Detroit, etc.

slide-20
SLIDE 20

Common characteristics

The whole system is never all up and working as designed

thus it is hard to understand

When things start to fail the system is even harder to understand
Safety systems are not always working

some are down, and known to be
some are accidentally turned off
some are not set properly
others fail to work when needed

There are often no direct indicators of what is happening

operators figure it out indirectly
slide-21
SLIDE 21

Defense in Depth

Nuclear power systems are as safe as they are because of defense in depth

Many levels of systems and containment
Ultimately the containment building is supposed to contain a meltdown

(Early Russian reactors did not have them)

The containment building has negative pressure, so even if cracked, air will not escape

slide-22
SLIDE 22

Some Definitions:

Coupling

Tight

direct and immediate connection and interaction between components

Loose

slack or buffering between components

Interactions, and transformation processes

Linear

orderly, step by step, with interactions only between adjacent steps; easy isolation of components

Complex

many connections and interrelationships

slide-23
SLIDE 23

Assembly Line of workstations

Loose and linear

assuming that there is space to store work in progress between workstations

Tight and linear

assuming something like automobile assembly, where the frame moves by and there is only so much time to add each item
Bread baking

[Diagram of workstation lines. Box labels: Process, Queue]
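
The difference between the loose and the tight line comes down to whether stored work in progress can absorb a stall. A minimal sketch, assuming a two-station line in which each station handles at most one item per tick; the function name and the stall schedule are made up for illustration.

```python
from typing import Sequence


def downstream_idle_ticks(upstream_up: Sequence[bool], stock: int) -> int:
    """Count ticks in which the downstream station has nothing to work on.

    upstream_up[t] is True when the upstream station delivers one item in tick t.
    stock is the work in progress already stored between the two stations.
    """
    idle = 0
    for delivers in upstream_up:
        if delivers:
            stock += 1
        if stock > 0:
            stock -= 1  # downstream takes one item this tick
        else:
            idle += 1   # nothing stored and nothing delivered
    return idle


# Hypothetical schedule: upstream stalls for three ticks out of eight.
schedule = [True, True, False, False, False, True, True, True]

print(downstream_idle_ticks(schedule, stock=0))  # 3: tight, every stall is felt downstream
print(downstream_idle_ticks(schedule, stock=3))  # 0: loose, stored work absorbs the stall
```

With no stock between stations, a three-tick upstream stall idles the downstream station for three ticks; with three items stored, the downstream station never notices.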

slide-24
SLIDE 24

Chemical Plant

Tight and complex
Heat given off by one process is used to heat a step of another by transferring the heat

One step is cooled and another is heated by a heat pump

[Diagram of two process lines. Labels: Process, Gives off Heat, Needs Heat, Heat Pump, Needs Cooling]

slide-25
SLIDE 25

Interactions and coupling

[Chart: coupling (tight/loose) plotted against interaction (linear/complex). Systems shown: University, Trade School, Bread Baking, Nuclear Plant, Chemical Plant, Aircraft, Space Mission, R&D, Assembly Line, Most Manufacturing, Marine Transport]

slide-26
SLIDE 26

Air Transportation

Structurally favors safety

Industry elites, regulatory elites, and politicians fly
Lots of independent redundant equipment
Pilot/co-pilot relationship

They talk a lot and agree
If they are wrong, they are both wrong

Cockpit automation

"the burning question... is not how much a man can work but how little"

slide-27
SLIDE 27

Air Transportation becoming less coupled

Built in buffers

spacing
ability to be late
abort takeoff or landing
planes can move in three dimensions

Restricted air space
flight lanes by type of aircraft
Less use of voice communication

aircraft reports altitude all the time

Technology is primarily improving throughput

slide-28
SLIDE 28

Chemical Plants

Bhopal: December 1984

Nothing complex, just component failure

Declining profits

operations crew cut in half

maintenance crew cut in half

"it takes the right combination of circumstances to produce a catastrophe"

No warning, no evacuation plan, no alarms, people asleep nearby, light wind in the right direction

Similar plant at Institute, West Virginia - OSHA exemplary
August 11, 1985 - West Virginia - weather right to disperse gas

OSHA "accident waiting to happen"

slide-29
SLIDE 29

Marine Accidents

Structurally encourages accidents

no high-profile riders; Senators don't travel on freighters
risk is easily accepted as part of the tradition of the sea
emphasis on throughput
poor maintenance
safety systems not turned on or not working
The captain is god
The mate stands by silently while they ground

Technology improvements all go to throughput

no reduction in accidents

Lots of interesting sea stories, but not terribly relevant to us

slide-30
SLIDE 30

Marine Accidents - Passing in rivers

Passing on rivers is a big problem

Pilots agree by radio about port to port or starboard to starboard
Classical one-bit agreement problem
Gets more complex because convoys form
Poor radio discipline
A missed message and they hit each other
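
The one-bit agreement problem above can be sketched in a few lines: one pilot proposes a side, and the pass is safe only if the other pilot acts on the same choice. The function names and the idea of a "default habit" for a pilot who hears nothing are assumptions made for illustration, not details from the book.

```python
from typing import Optional

PORT_TO_PORT = "port to port"
STARBOARD_TO_STARBOARD = "starboard to starboard"


def plan_of_receiver(message: Optional[str], default: str) -> str:
    """The receiving pilot follows the proposal if heard, else a default habit."""
    return message if message is not None else default


def passing_is_safe(proposal: str, delivered: bool, receiver_default: str) -> bool:
    """Both vessels must act on the same side choice to pass safely."""
    heard = proposal if delivered else None
    return plan_of_receiver(heard, receiver_default) == proposal


# Hypothetical case: the proposal differs from what the other pilot would do by default.
print(passing_is_safe(STARBOARD_TO_STARBOARD, delivered=True, receiver_default=PORT_TO_PORT))   # True
print(passing_is_safe(STARBOARD_TO_STARBOARD, delivered=False, receiver_default=PORT_TO_PORT))  # False
```

A lost message is harmless only when the proposal happens to match the receiver's default, which is why a single missed call can be enough.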

slide-31
SLIDE 31

Pisces and Trade Master

slide-32
SLIDE 32

NORAD

Cheyenne Mountain, Colorado
Early warning command center
When something goes "off" a "missile display conference" is called

In 1979 there were 1544
In the first half of 1980 there were 2159

They tolerate lots of false positives to eliminate false negatives
Two major unconnected systems

Satellites pick up launch
Radar picks up incoming flight

First Alaska, second North Dakota
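
The two conference counts are easier to compare as daily rates. The short calculation below uses only the figures on the slide and assumes that "first half of 1980" means January 1 through June 30.

```python
conferences_1979 = 1544     # from the slide
conferences_1980_h1 = 2159  # from the slide, first half of 1980

days_1979 = 365
days_1980_h1 = 31 + 29 + 31 + 30 + 31 + 30  # January to June of a leap year

rate_1979 = conferences_1979 / days_1979
rate_1980_h1 = conferences_1980_h1 / days_1980_h1

print(f"1979: {rate_1979:.1f} per day")                      # 4.2 per day
print(f"first half of 1980: {rate_1980_h1:.1f} per day")     # 11.9 per day
print(f"ratio: {rate_1980_h1 / rate_1979:.1f}x")             # 2.8x
```

On that assumption the rate went from roughly four conferences a day to roughly twelve.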

slide-33
SLIDE 33

Common Threads I found

Multiple errors are commonplace

birthday paradox

Complex systems are never actually fully working or working properly

some things are known and some not

lights and sensors not working

they become hard to understand

Backup and automatic safety systems

not all working properly, only some known
not fully or regularly tested

Indirect measurements or no measurements

hard to figure out the state in an emergency

PORV open, but told to close; is the core uncovered?

Tendency to blame the Operator
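
The "birthday paradox" remark on this slide can be illustrated numerically: when there are many parts, it becomes likely that at least two are faulty at once even if each is individually reliable. A sketch under stated assumptions: faults are independent, the per-part fault probability of 0.001 is hypothetical, and the count of 1,600 is borrowed from the indicator lights mentioned earlier in this report.

```python
def birthday_collision(people: int, days: int = 365) -> float:
    """Classic birthday paradox: chance that at least two people share a birthday."""
    p_all_distinct = 1.0
    for k in range(people):
        p_all_distinct *= (days - k) / days
    return 1.0 - p_all_distinct


def at_least_two_faulty(n_parts: int, p_fault: float) -> float:
    """Chance that two or more of n independent parts are faulty at the same time."""
    p_none = (1.0 - p_fault) ** n_parts
    p_exactly_one = n_parts * p_fault * (1.0 - p_fault) ** (n_parts - 1)
    return 1.0 - p_none - p_exactly_one


print(birthday_collision(23))            # just over one half
print(at_least_two_faulty(1600, 0.001))  # a little under one half
```

Under these assumed numbers a simultaneous double fault is close to a coin flip at any given moment, which fits the observation that multiple errors are commonplace.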

slide-34
SLIDE 34

Place these systems:

microprocessor thermostat
multiprogramming and multiprocessing
many applications on one OS
Single server groved for one application
Operating system
The internet

slide-35
SLIDE 35

[Blank chart: coupling (tight/loose) plotted against interaction (linear/complex)]

slide-36
SLIDE 36

High Confidence Computing Questions?

Are there "system accidents" in computing?
How does defense in depth relate to computing systems?
What computing systems are "complex" and what are "linear"?
Can computing systems be made more "linear"?
What computing systems have "tight" coupling and "loose" coupling?
How can computing systems be designed to have more "loose" coupling?
Storage overlay
Tendency to use single-purpose servers

slide-37
SLIDE 37

Promising References

Leveson, Nancy, 1995, Safeware: System Safety and Computers, New York, Addison Wesley
Neumann, Peter G, 1999, "Risk Digest," www.comp.risks
Rushby, John, 1994, "Critical System Properties: Survey and Taxonomy," Reliability Engineering and System Safety, 43:189-210
Myers, Glenford, 1975, Reliable Software through Composite Design

slide-38
SLIDE 38

Finish