CPSC 875, John D. McGregor: C10 Error Design


SLIDE 1

CPSC 875

John D. McGregor
C10 – Error Design

SLIDE 2

Uncertainty

  • Make uncertainty a first-class entity in design
  • Assume things fail
  • Watchdog timers check that an operation has not frozen (as opposed to the modal dialog with a cancel button)
  • Google File System is designed to recognize a failed disk drive and to work around it
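
The watchdog idea in the list above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the class name `Watchdog`, the helper `run_steps`, and the use of `time.sleep` as a stand-in for real work are all assumptions made for the example. The operation must "kick" the watchdog after each unit of progress; if it goes silent for longer than the timeout, the callback fires.

```python
import threading
import time


class Watchdog:
    """Calls on_timeout unless kick() arrives at least every timeout_s seconds."""

    def __init__(self, timeout_s, on_timeout):
        self._timeout_s = timeout_s
        self._on_timeout = on_timeout
        self._timer = None
        self._lock = threading.Lock()

    def _arm(self):
        # threading.Timer runs the callback once after the interval elapses.
        self._timer = threading.Timer(self._timeout_s, self._on_timeout)
        self._timer.daemon = True
        self._timer.start()

    def start(self):
        with self._lock:
            self._arm()

    def kick(self):
        """Signal progress: cancel the pending timer and arm a fresh one."""
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()
            self._arm()

    def stop(self):
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()
                self._timer = None


def run_steps(step_durations, timeout_s):
    """Run simulated work steps; return True if the watchdog fired."""
    fired = threading.Event()
    dog = Watchdog(timeout_s, fired.set)
    dog.start()
    for duration in step_durations:
        time.sleep(duration)  # hypothetical stand-in for one unit of real work
        dog.kick()
    dog.stop()
    return fired.is_set()
```

Unlike a modal dialog with a cancel button, nothing here depends on a person noticing the freeze: the design assumes the operation can fail and detects it automatically.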

SLIDE 3

Autonomous Robot

http://www.cs.ait.ac.th/~mdailey/papers/Limsoonthrakul-Arch.pdf

SLIDE 4

Module structure

SLIDE 5

Publish/Subscribe Style

SLIDE 6
SLIDE 7

Utility

  • Usefulness
  • We assume that as a design satisfies more and more of the desired qualities its usefulness is increased
  • But it costs more and more, so at some point the increase in utility is not worth the increase in cost

[Figure: utility and total cost]
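
The trade-off above amounts to a marginal comparison: keep adding qualities while the next increment of utility exceeds the next increment of cost. The sketch below assumes utility and cost are expressed in the same units and uses invented figures; the function name `best_stopping_point` and both lists are hypothetical.

```python
def best_stopping_point(utility, cost):
    """Return how many qualities to satisfy before marginal cost overtakes marginal utility.

    utility[k] and cost[k] are the cumulative utility and cost of a design
    that satisfies the first k desired qualities (index 0 means none).
    """
    k = 0
    while k + 1 < len(utility):
        gain = utility[k + 1] - utility[k]
        extra_cost = cost[k + 1] - cost[k]
        if gain <= extra_cost:
            break  # the next quality is no longer worth what it costs
        k += 1
    return k


# Hypothetical figures: utility flattens out while cost keeps accelerating.
utility = [0, 10, 18, 24, 28, 30]
cost = [0, 2, 5, 10, 18, 30]
stop = best_stopping_point(utility, cost)  # 3 qualities for these figures
```

With these numbers the fourth quality adds 4 units of utility for 8 units of cost, so the search stops at three.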

SLIDE 8

Design for Errors

SLIDE 9

Nothing can go wrong

From: http://academic.csuohio.edu/duffy_s/Section_03.pdf

SLIDE 10

www.artemis-ia.eu/publication/download/?publication=98

SLIDE 11

AADL Error Annex

  • https://wiki.sei.cmu.edu/aadl/images/4/42/ErrorModelOverview-04182012.pdf

SLIDE 12

Error design

SLIDE 13

Exception handling

  • Always clean up after yourself
  • Never use exceptions for flow control
  • Do not suppress or ignore exceptions
  • Do not catch top-level exceptions
  • Log exceptions just once
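
A short Python sketch shows how the five guidelines above might look together. The scenario (reading a port number from a file) and every name in it (`ConfigError`, `parse_port`, `load_port`, `main`) are assumptions made for illustration only.

```python
import logging

log = logging.getLogger("example")


class ConfigError(Exception):
    """Hypothetical domain-specific error used in this sketch."""


def parse_port(text):
    # Catch the specific exception we expect rather than a top-level one,
    # and translate it without logging here: the caller logs it once.
    try:
        port = int(text)
    except ValueError as exc:
        raise ConfigError(f"port is not a number: {text!r}") from exc
    # An ordinary condition, not a try/except, handles the range check.
    if not 0 < port < 65536:
        raise ConfigError(f"port out of range: {port}")
    return port


def load_port(path):
    # The with-statement cleans up: the file is closed even if parsing raises.
    with open(path) as handle:
        return parse_port(handle.read().strip())


def main(path):
    """Program boundary: the single place where the failure is logged."""
    try:
        return load_port(path)
    except ConfigError:
        # Not suppressed: recorded with its traceback, then reported as None.
        log.exception("could not load configuration from %s", path)
        return None
```

Errors outside the expected set, such as a missing file, are deliberately left to propagate instead of being swallowed by a catch-all handler.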
SLIDE 14

PrimaryBackupPattern

system implementation PrimaryBackupPattern.impl
subcomponents
  -- Each replica is active only in its own mode.
  primary: system sys in modes (Primarymode);
  backup: system sys in modes (Backupmode);
connections
  -- Input and output connections follow whichever replica is active.
  inprimary: data port insignal -> primary.insignal in modes (Primarymode);
  inbackup: data port insignal -> backup.insignal in modes (Backupmode);
  outprimary: data port primary.outsignal -> outsignal in modes (Primarymode);
  outbackup: data port backup.outsignal -> outsignal in modes (Backupmode);
modes
  Primarymode: initial mode;
  Backupmode: mode;
end PrimaryBackupPattern.impl;
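
The runtime behaviour this pattern aims at can be sketched in Python. The AADL listing above declares the two modes but not what triggers the switch, so the trigger used here (the primary raising `RuntimeError`) is an assumption, as are the names `ModeSwitch` and `make_flaky`.

```python
class ModeSwitch:
    """Routes each input to the primary replica, or to the backup after a failure."""

    def __init__(self, primary, backup):
        self._replicas = {"Primarymode": primary, "Backupmode": backup}
        self.mode = "Primarymode"  # initial mode, as in the AADL model

    def process(self, insignal):
        try:
            return self._replicas[self.mode](insignal)
        except RuntimeError:
            if self.mode == "Backupmode":
                raise  # no further replica to fall back on
            # Assumed trigger: a failure of the primary causes the mode change.
            self.mode = "Backupmode"
            return self._replicas[self.mode](insignal)


def make_flaky(fail_after):
    """Hypothetical replica that doubles its input, then fails after N calls."""
    calls = {"n": 0}

    def replica(x):
        calls["n"] += 1
        if calls["n"] > fail_after:
            raise RuntimeError("primary failed")
        return x * 2

    return replica


switch = ModeSwitch(make_flaky(2), lambda x: x * 2)
outputs = [switch.process(i) for i in range(4)]
```

The caller sees an uninterrupted stream of outputs; only `switch.mode` reveals that the backup has taken over.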

SLIDE 15

Error design

error model Example1
features
  ErrorFree: initial error state;
  Failed: error state;
  Fail, Repair: error event;
  CorruptedData: out error propagation {Occurrence => fixed 0.8};
end Example1;

error model implementation Example1.basic
transitions
  ErrorFree-[Fail]->Failed;
  Failed-[out CorruptedData]->Failed;
  Failed-[Repair]->ErrorFree;
properties
  Occurrence => poisson 1.0e-3 applies to Fail;
  Occurrence => poisson 1.0e-4 applies to Repair;
end Example1.basic;
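
To get a feel for what this error model implies, the two-state machine can be mimicked in Python. This sketch assumes that `poisson r` means events arriving at rate r per unit time (so waiting times are exponentially distributed) and it leaves out the CorruptedData propagation; the function names are hypothetical.

```python
import random

# Transition table copied from Example1.basic: (state, event) -> next state.
TRANSITIONS = {
    ("ErrorFree", "Fail"): "Failed",
    ("Failed", "Repair"): "ErrorFree",
}
# Assumed reading of "poisson r": events arrive at rate r per unit time.
RATES = {"Fail": 1.0e-3, "Repair": 1.0e-4}


def step(state, event):
    """Apply one event; events with no transition from this state are ignored."""
    return TRANSITIONS.get((state, event), state)


def steady_state_error_free(fail_rate, repair_rate):
    """Long-run fraction of time in ErrorFree for a two-state Markov model."""
    return repair_rate / (fail_rate + repair_rate)


def simulate(total_time, seed=0):
    """Estimate the fraction of time spent in ErrorFree by random simulation."""
    rng = random.Random(seed)
    state, clock, error_free_time = "ErrorFree", 0.0, 0.0
    while clock < total_time:
        event = "Fail" if state == "ErrorFree" else "Repair"
        dwell = min(rng.expovariate(RATES[event]), total_time - clock)
        if state == "ErrorFree":
            error_free_time += dwell
        clock += dwell
        state = step(state, event)
    return error_free_time / total_time
```

Under these assumptions the rates in the listing put the component in ErrorFree only about 9% of the time (1.0e-4 / 1.1e-3), because repairs are ten times rarer than failures. A check like this is a quick way to notice when the numbers in a model may not say what was intended.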

SLIDE 16

Using error model

system computer
end computer;

system implementation computer.personal
subcomponents
  CPU: processor Intel.DualCore;
  RAM: memory SDRAM;
  FSB: bus FrontSideBus;
annex Error_Model {**
  Model => My_ErrorModels::Example1.basic applies to CPU;
  Occurrence => fixed 0.9 applies to error CPU.CorruptedData;
**};
end computer.personal;

SLIDE 17

Propagation

SLIDE 18

Error Propagation

SLIDE 19

Propagations

SLIDE 20

Full spec

SLIDE 21
SLIDE 22
SLIDE 23
SLIDE 24

Next steps

  • Read:
    – http://hbswk.hbs.edu/item/5699.html At the bottom of the page there is a place to download “Full Working Paper Text”
    – http://www.sei.cmu.edu/reports/07tn043.pdf
  • Continue to expand your AADL model
    – Add at least one state machine
    – Define and bind to a platform
    – Identify at least one type of error and add to your model

  • Create the DSMs for your architecture so far