Dependability Engineering of Complex Computing Systems (M. Kaâniche et al.)



SLIDE 1

Dependability Engineering of Complex Computing Systems

  • M. Kaâniche, LAAS / LIS, kaaniche@laas.fr
  • J.-C. Laprie, LAAS / LIS, laprie@laas.fr
  • J.-P. Blanquart, Astrium / LIS, blanquar@laas.fr

6th International Conference on Engineering of Complex Computer Systems (ICECCS 2000), September 11-14, 2000, Tokyo, Japan

SLIDE 2

Dependability

Property of a system such that reliance can justifiably be placed on the service it delivers (IFIP WG 10.4 - Dependable Computing and Fault Tolerance)

  • Impairments: Fault, Error, Failure
  • Means: Fault Prevention, Fault Tolerance, Fault Removal, Fault Forecasting
  • Attributes: Availability, Reliability, Safety, Confidentiality, Integrity, Maintainability

SLIDE 3

Motivation

  • Developing dependable systems able to deliver critical services with a justified level of confidence is not easy
      • increasing complexity, fault diversity, conflicting objectives, …
  • Traditional development models do not explicitly incorporate all activities needed for the production of dependable systems
      • Hardware (BSI 5760 Standard)
          • incorporation of assessments
          • fault tolerance activities focused on physical faults only
      • Software (Waterfall, V model, spiral, incremental, process oriented, …)
          • structuring of activities
          • focus on verification
      • System engineering (EIA 632, IEEE 1220, …)
          • generic multidisciplinary framework integrating products, processes and people
          • dependability related issues are not detailed
  • Hence the need for a dependability-explicit development model

SLIDE 4

Basic Model

Dependability processes

SLIDE 5

Basic activities

System Creation Process

  • Requirements
  • Design
  • Realization
  • Integration

Fault Prevention Process

  • Formalisms & Languages
  • Project organization
  • Project planning & risk assessment

Fault Tolerance Process

  • System behavior in presence of faults
  • System partitioning
  • Error & fault handling mechanisms

Fault Removal Process

  • Verification
  • Diagnosis
  • Modification

Fault Forecasting Process

  • Dependability objectives
  • Allocation
  • Evaluation
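
As a rough illustration, the five processes and their basic activities can be held in a small lookup table, which makes it easy to ask which process owns a given activity. The names `PROCESSES` and `process_of` are hypothetical and exist only for this sketch; the activity names are taken from the slide.

```python
# Hypothetical encoding of the five processes and their basic activities.
PROCESSES = {
    "System Creation": ["Requirements", "Design", "Realization", "Integration"],
    "Fault Prevention": [
        "Formalisms & Languages",
        "Project organization",
        "Project planning & risk assessment",
    ],
    "Fault Tolerance": [
        "System behavior in presence of faults",
        "System partitioning",
        "Error & fault handling mechanisms",
    ],
    "Fault Removal": ["Verification", "Diagnosis", "Modification"],
    "Fault Forecasting": ["Dependability objectives", "Allocation", "Evaluation"],
}


def process_of(activity: str) -> str:
    """Return the name of the process that owns `activity`."""
    for process, activities in PROCESSES.items():
        if activity in activities:
            return process
    raise KeyError(activity)
```

For instance, `process_of("Diagnosis")` looks the activity up under the Fault Removal process.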
SLIDE 6

Interactions

[Figure: interactions between the System Creation activities (Requirements, Design, Realization, Integration) and the activities of the Fault Prevention, Fault Tolerance, Fault Removal and Fault Forecasting processes]

Legend:

  • Fault Prevention: For = Formalisms & languages; Org = Project organization; Pla = Project planning & risk assessment
  • Fault Tolerance: Beh = Behavior in the presence of faults; Par = System partitioning; Han = Error & fault handling
  • Fault Removal: Ver = Verification; Dia = Diagnosis; Mod = Modification
  • Fault Forecasting: Obj = Objectives; All = Allocation; Eva = Evaluation

SLIDE 7

Interactions: examples

  • Fault prevention process activities should be tightly coupled with the activities of the system creation process and of the other dependability processes
  • Fault tolerance and fault forecasting
      • Definition of dependability related requirements and functions
      • Allocation of dependability requirements
      • Assessment of the efficiency of fault tolerance mechanisms (coverage)
  • Fault removal and fault tolerance
      • Verification of fault assumptions for traceability, consistency, completeness and verifiability
      • Verification of fault tolerance mechanisms by means of fault injection, formal verification or static analyses
  • Fault removal and fault forecasting
      • Validation of fault forecasting assumptions and results
      • Definition of test stopping criteria based on the dependability level achieved
      • Evaluation of dependability based on test results
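
To make the coverage assessment and the stopping criterion more concrete, here is a minimal sketch that estimates coverage from a fault injection campaign as the fraction of injected faults that were handled correctly, with a Wilson score confidence interval. The slides do not prescribe any estimator; the Wilson interval, the function names and the stopping rule (stop once the lower bound reaches a target) are assumptions made for illustration, and the sketch assumes the injected faults are a representative sample.

```python
import math


def coverage_estimate(handled: int, injected: int, z: float = 1.96):
    """Return (point estimate, lower bound, upper bound) for coverage.

    Uses the Wilson score interval; z = 1.96 is roughly a 95% level.
    """
    if injected <= 0 or not 0 <= handled <= injected:
        raise ValueError("need injected > 0 and 0 <= handled <= injected")
    p = handled / injected
    denom = 1 + z * z / injected
    centre = (p + z * z / (2 * injected)) / denom
    half = z * math.sqrt(
        p * (1 - p) / injected + z * z / (4 * injected ** 2)
    ) / denom
    return p, max(0.0, centre - half), min(1.0, centre + half)


def may_stop_testing(handled: int, injected: int, target: float) -> bool:
    """Hypothetical stopping rule: lower confidence bound meets the target."""
    _, low, _ = coverage_estimate(handled, injected)
    return low >= target
```

With 990 faults handled out of 1000 injected, the point estimate is 0.99, but the lower bound is below that, so a target of 0.99 would not yet justify stopping under this rule.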

SLIDE 8

Fault Assumptions

  • Fault assumptions should be defined at each system refinement step
      • Support for the definition of fault tolerance strategies and mechanisms
      • Check for traceability, consistency, completeness and verifiability

Fault Tolerance Coverage

  • Error and Fault Handling Coverage
  • Fault Assumption Coverage
      • Failure Mode Coverage
      • Failure Independence Coverage
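
The traceability and verifiability checks on fault assumptions can be sketched as a simple bookkeeping exercise. The record layout below (`FaultAssumption`, with a refinement `level`, a `parent` link and a `verified_by` list) is hypothetical and not taken from the slides; the sketch flags assumptions that cannot be traced to a higher-level assumption, that have no verification means recorded, or that are not refined at the next step. It does not attempt a consistency check, which would need a formal statement of each assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FaultAssumption:
    ident: str                       # hypothetical identifier
    level: int                       # refinement step, 0 = top level
    statement: str                   # informal text of the assumption
    parent: Optional[str] = None     # identifier of the refined assumption
    verified_by: List[str] = field(default_factory=list)


def check_assumptions(assumptions: List[FaultAssumption]) -> List[str]:
    """Return findings as strings; an empty list means nothing was flagged."""
    by_id = {a.ident: a for a in assumptions}
    findings = []
    for a in assumptions:
        # Traceability: every refined assumption needs a known parent.
        if a.level > 0 and a.parent not in by_id:
            findings.append(f"{a.ident}: not traceable to a higher-level assumption")
        # Verifiability: some verification means must be recorded.
        if not a.verified_by:
            findings.append(f"{a.ident}: no verification means recorded")
    # Completeness (rough): every non-leaf-level assumption is refined.
    refined = {a.parent for a in assumptions if a.parent}
    max_level = max((a.level for a in assumptions), default=0)
    for a in assumptions:
        if a.level < max_level and a.ident not in refined:
            findings.append(f"{a.ident}: not refined at the next step")
    return findings
```

Running the check after each refinement step gives a list of open points to resolve before the fault tolerance mechanisms for that step are designed.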

SLIDE 9

A meta-model not a life-cycle model

[Figure: the system development process passes "system requirements allocated to software" to the software development process, which delivers the software product. Panels show different arrangements of the Rq, De, Re and In activities labelled: traditional Waterfall, Prototyping, reuse with adjustments, reuse without changes.]

Legend:

  • Rq = Requirements; De = Design; Re = Realization; In = Integration
  • FP = Fault Prevention; FT = Fault Tolerance; FR = Fault Removal; FF = Fault Forecasting

SLIDE 10

Checklist

Requirements

System Creation

❍ Functional specification

  • functions (value, time)
  • mission phases & sequencing
  • operation / maintenance modes

❍ Environment description

  • boundaries and interactions

❍ Development and validation constraints

  • foreseeable evolutions
  • interoperability, portability
  • reusability, testability, …

Fault Prevention

❍ Formalisms & languages

  • standards, rules, tools, formalisms

❍ Project organization

  • life cycle model
  • resource management

❍ Project planning & risk assessment

  • risk identification & mitigation
  • development stages, transition criteria
  • planning of project reviews, certification, configuration management

Fault Tolerance

❍ System behavior / failures

  • dependability properties
  • criticality / mission phase
  • acceptable degraded modes
  • maximum tolerable duration of service interruption
  • number of simultaneous / consecutive failures to be tolerated for each mode
  • fault tolerance means provided by the environment

Fault Removal

❍ Verification planning

  • static analyses and testing strategies (criteria, input generation)
  • test-beds, environment simulators

❍ Verification assumptions

  • classes of functions / behavior
  • predicates

❍ Requirements verification

  • traceability analysis
  • functional / behavioral analyses
  • reviews & inspections

❍ Functional / behavioral verification scenarios

Fault Forecasting

❍ Dependability objectives

❍ Failure modes analysis

  • classification by severity

❍ FF assumptions

❍ Function-by-function dependability allocation

  • classification of functions by criticality levels

❍ Fault forecasting planning

❍ Data collection and analysis
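
The fault tolerance items of the requirements checklist (criticality, maximum tolerable duration of service interruption, number of failures to be tolerated for each mode) lend themselves to a structured record that later verification steps can check against. The following is only a sketch: the class `ModeRequirement`, its field names, the units and the `violations` helper are assumptions, not part of the presented model.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ModeRequirement:
    mode: str                    # operating or degraded mode name
    criticality: str             # hypothetical label, e.g. "high"
    max_interruption_s: float    # maximum tolerable service interruption
    simultaneous_failures: int   # simultaneous failures to be tolerated
    consecutive_failures: int    # consecutive failures to be tolerated


def violations(req: ModeRequirement,
               observed_interruption_s: float,
               tolerated_simultaneous: int) -> List[str]:
    """Compare observed or predicted figures against one mode's requirement."""
    out = []
    if observed_interruption_s > req.max_interruption_s:
        out.append(
            f"{req.mode}: interruption {observed_interruption_s}s exceeds "
            f"{req.max_interruption_s}s"
        )
    if tolerated_simultaneous < req.simultaneous_failures:
        out.append(
            f"{req.mode}: tolerates {tolerated_simultaneous} simultaneous "
            f"failure(s), {req.simultaneous_failures} required"
        )
    return out
```

Keeping one record per mode mirrors the checklist's "for each mode" wording and gives the allocation and evaluation activities an explicit target to refer to.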
SLIDE 11

Checklist

Design

System Creation

❍ Architecture

  • structure
  • behavior
  • data

❍ Low level requirements

❍ Reusable components?

❍ Operation and maintenance procedures definition

❍ System integration strategy

Fault Prevention

❍ Formalisms & languages

❍ Project organization

❍ Project planning & risk assessment

Fault Tolerance

❍ System behavior / faults

  • fault assumptions

❍ System partitioning

  • fault / error containment regions
  • FT application layers

❍ Fault tolerance strategies

  • redundancy, design diversity, exception handling

❍ Error & fault handling mechanisms

  • error detection, diagnosis, recovery
  • fault diagnosis, passivation, reconfiguration

❍ Single points of failure?

Fault Removal

❍ Verification assumptions

❍ Design verification

  • behavioral analysis, reviews, inspections, prototyping

❍ Fault tolerance verification

  • (formal) verification
  • simulation-based fault injection

❍ Unit / integration testing planning

❍ Functional / structural verification scenarios

❍ Verification of FF results

Fault Forecasting

❍ FF assumptions

❍ Failure mode analysis

❍ Allocation / component

❍ Preliminary dependability assessment

❍ Data collection & analysis
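
The "Single points of failure?" question in the design checklist can be illustrated with a deliberately simple model. The sketch below assumes the architecture is a series of stages, each made of interchangeable replicas of which a minimum number must work, and that components fail independently; both the model and the function name are assumptions for illustration, not the authors' method.

```python
from typing import Dict, List, Tuple


def single_points_of_failure(
    stages: Dict[str, Tuple[List[str], int]]
) -> List[str]:
    """Return components whose individual failure brings a stage down.

    `stages` maps a stage name to (replica names, replicas required).
    A stage has no spare when losing one replica leaves fewer than
    the required number, so every replica in it is a single point
    of failure.
    """
    spofs = []
    for _name, (replicas, required) in stages.items():
        if len(replicas) - 1 < required:
            spofs.extend(replicas)
    return spofs


# Hypothetical architecture: triplicated sensors (2 needed),
# one bus, and two processors that must both work.
example = {
    "sensor": (["s1", "s2", "s3"], 2),
    "bus": (["b1"], 1),
    "cpu": (["c1", "c2"], 2),
}
```

In this hypothetical architecture the sensors survive one loss, while the bus and both processors are flagged, which points to where redundancy or containment regions would have to be added.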

SLIDE 12

Conclusion

  • Structuring and controlling the development process is a prerequisite for the successful integration of fault tolerance and dependability-related mechanisms in complex systems
  • The proposed model provides a generic framework for structuring fault prevention, fault tolerance, fault removal and fault forecasting activities
      • iterative process
      • tradeoffs
  • The guidelines aim to ensure that dependability related issues are not overlooked, but rather considered at each stage of the development
  • The proposed framework can be used to define and structure the evidence needed to support certification