Layered Dependability Modeling of an Air Traffic Control System
Olivia Das and C. Murray Woodside
WADS Workshop at ICSE 2003, May 3, 2003


SLIDE 1

WADS Workshop at ICSE 2003 Olivia Das, Murray Woodside, May 3, 2003 1


Layered Dependability Modeling of an Air Traffic Control System

Olivia Das and C. Murray Woodside Department of Systems and Computer Engineering Carleton University, Ottawa, Canada

  • das@sce.carleton.ca, cmw@sce.carleton.ca
SLIDE 2

Overview

  • dependability of complex systems
  • dependability for systems with layered software architecture
  • effect on coverage due to management subsystem failures
  • performability measures
SLIDE 3

Layered Application Model

Tasks, Interactions and Dependencies, and Processors

[Diagram: layered application model showing tasks, their entries, and the interactions between them. Tasks: Controller (up to four controllers); UI (user Interface); Display Management (display Radar data, display Flight Plan, conflict Alert, modify Display, modify FlightPlan); Conflict Resolution (detect and resolve conflicts); Flight Plan Management (get flight Plan, modify flightPlan); Trajectory Management (get trajectory, modify trajectory); Flight Plan Database (read Flight Plan, update Flight Plan); Surveillance Processing (process radar data); Radars (two radars). Arrows mark synchronous and asynchronous service requests.]

SLIDE 4

Replication Mechanisms

Primary-standby, load-balancing, active, primary-standby-active

[Diagram: example system. User groups UserA (N = 50) and UserB (N = 100) call application tasks AppA (eA, serviceA) and AppB (eB, serviceB); these call the replicated tasks Server1 (entries eA1, eB1) and Server2 (entries eA2, eB2), which log to a Log Server (entry log). Replica tags #1 and #2 mark the replicas. Processors: procA, procB, proc1, proc2, proc3, proc4.]
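The primary-standby selection implied by the diagram can be sketched in a few lines of Python. Task and processor names follow the slide; the selection logic itself is an illustrative assumption, not the paper's model:

```python
# Sketch of primary-standby replica selection: given which processors
# are up, the first replica whose host is alive serves the requests.

def serving_replica(replicas, proc_up):
    """Return the first replica (primary first, then standbys) whose
    host processor is up, or None if the whole group has failed."""
    for name, proc in replicas:
        if proc_up.get(proc, False):
            return name
    return None

# Server group: replica #1 (Server1) on proc3 is primary,
# replica #2 (Server2) on proc4 is the standby.
server_group = [("Server1", "proc3"), ("Server2", "proc4")]

# proc3 fails (the scenario of the next slides): Server2 takes over.
up = {"proc3": False, "proc4": True}
print(serving_replica(server_group, up))  # Server2
```

The same function covers the load-balanced and active cases only loosely; those mechanisms route requests differently, but the availability question (is any replica's host up?) is the same.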

SLIDE 5

Example Configuration (1)

proc3 fails and causes Server1 failure...Server2 used instead

[Same system diagram as Slide 4.]

SLIDE 6

Example Configuration (2)

proc1 fails and takes AppA down, so group UserA fails. Here the failure cannot be compensated by standby servers.

[Same system diagram as Slide 4.]

SLIDE 7

Centralized Fault Management Model

[Diagram: a central Manager task (MT) linked by connectors to the application tasks (AT) UserA, AppA, Server1, UserB, AppB, and Server2.]

Components

  • Application Tasks
  • Management components

Connectors

  • Alive-Watch
  • Notify
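Slides 8 and 9 illustrate how coverage depends on this management subsystem: with centralized management, recovering from a primary failure requires the Manager itself to be alive. A minimal sketch of that dependency (hypothetical function, illustrative only):

```python
# Coverage sketch for centralized management: failover to the standby
# only happens if the Manager is alive to detect and react.

def system_works(primary_up, standby_up, manager_up):
    """The server group delivers service if the primary is up, or if
    the Manager is alive to activate the standby."""
    if primary_up:
        return True
    return manager_up and standby_up

# proc3 (primary) fails, Manager alive -> standby takes over.
print(system_works(False, True, True))   # True
# Same failure, but the Manager has also failed -> system fails.
print(system_works(False, True, False))  # False
```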
SLIDE 8

Perfect detection and reconfiguration

proc3 fails and causes Server1 failure... Full coverage: Server2 used instead

[Same system diagram as Slide 4.]

SLIDE 9

Partial coverage for centralized management

proc3 fails and causes Server1 failure... Partial coverage: Manager failed, so system failed

[Same system diagram as Slide 4.]

SLIDE 10

Analysis - currently

Level 1:
  • Determine the distinct operational configurations C_i
  • Compute the probability Prob(C_i) of each operational configuration
Level 2:
  • Compute the reward R(C_i) of each operational configuration using Layered Queueing Models

Mean Reward = Σ_i R(C_i) · Prob(C_i)
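The two-level computation can be sketched as follows. The configurations, probabilities, and rewards below are placeholder values; in the analysis, Prob(C_i) comes from the fault management model and R(C_i) (e.g. a throughput) from a layered queueing solver:

```python
# Level 1: configurations C_i with their probabilities.
# Level 2: a reward per configuration from a performance model.
configs = ["all_up", "Server1_down", "AppA_down"]            # placeholder C_i
prob    = {"all_up": 0.85, "Server1_down": 0.10, "AppA_down": 0.05}
reward  = {"all_up": 1.00, "Server1_down": 0.60, "AppA_down": 0.00}

# Mean Reward = sum_i R(C_i) * Prob(C_i)
mean_reward = sum(reward[c] * prob[c] for c in configs)
print(mean_reward)
```

Separating the two levels is what makes the technique scale: the expensive performance model is solved once per operational configuration, not once per raw failure state.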

SLIDE 11

Probabilities of Operational Configurations

Non-coherent fault tree

[Fault tree diagram: the non-coherent fault tree over the replicas #1 and #2; the text labels were rendered in a symbol font and are unrecoverable.]
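One way to picture how configuration probabilities arise from independent component failures is to enumerate component states and group them by the configuration they produce. This is a toy two-component example with assumed failure probabilities, not the paper's fault-tree algorithm:

```python
from itertools import product

# Independent failure probabilities (illustrative values).
fail_p = {"proc3": 0.05, "proc4": 0.05}

def configuration(state):
    """Map an up/down state vector to an operational configuration
    (illustrative mapping for a Server1/Server2 primary-standby pair)."""
    if state["proc3"]:
        return "Server1 serves"
    if state["proc4"]:
        return "Server2 serves"
    return "system failed"

# Sum the probability of every state that yields each configuration.
cfg_prob = {}
for bits in product([True, False], repeat=len(fail_p)):
    state = dict(zip(fail_p, bits))
    p = 1.0
    for comp, is_up in state.items():
        p *= (1 - fail_p[comp]) if is_up else fail_p[comp]
    cfg = configuration(state)
    cfg_prob[cfg] = cfg_prob.get(cfg, 0.0) + p

print(cfg_prob)
```

Exhaustive state enumeration is exponential in the number of components; grouping states into a much smaller set of operational configurations is what keeps the analysis tractable.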
SLIDE 12

Layered Model of ATC En Route System

[Diagram: layered model of the ATC en route system, partitioned into three subsystems.
Console subsystem: Controller (three controllers), UI (user Interface), Display Management (display Radar data, display Flight Plan, conflict Alert, modify Display, modify FlightPlan).
Central subsystem: Conflict Resolution (detect and resolve conflicts), Flight Plan Management (get flight Plan, modify flightPlan), Trajectory Management (get trajectory, modify trajectory), Flight Plan Database (read Flight Plan, update Flight Plan).
Radar subsystem: Surveillance Processing (process radar data), Radars (two radars).
Replication annotations: three load-balanced replicas, two primary-standby replicas, three primary-standby-active replicas. OR and AND gates combine the subsystems' operational conditions.]

SLIDE 13

Fault Management Model of ATC En Route System

[Diagram: fault management model of the ATC en route system. Each processor hosts application tasks (AT), a management task (MT), and the management components P2PSM and gSAM:
Console Processor: UI and Display Mgmt application tasks; Name Server MT; P2PSM; gSAM.
Central Processor: FlightPlan Mgmt, Trajectory Mgmt, Conflict Resolution, and FlightPlan Database application tasks; MT; P2PSM; gSAM.
Radar Processor: Surveillance Processing application task; MT; P2PSM.
Monitor and Control subsystem on the Name Server Processor (three active replicas).]

SLIDE 14

Results

  • Number of components (tasks and processors): 51
  • Number of connectors in the fault management model: 118
  • Failure probability of all processors: 0.05
  • Failure probability of all tasks (including management tasks): 0.1
  • Total number of nodes in the graph combining the fault propagation and knowledge propagation graphs: 715
  • Number of operational configurations: 14
  • Time to generate and compute probabilities of configurations: 277 secs
  • Probability of the system being in a working state: 0.33
  • Average throughput for the Controller task: 0.067 requests/sec
  • If the failure probability of management tasks is decreased to 0.05, the probability of the system being in a working state rises to 0.45 and the average Controller throughput to 0.093 requests/sec

SLIDE 15

Conclusions

  • Dependability evaluation for layered software architectures
  • Scalable technique
  • separation of performance analysis from failure-repair analysis
  • much smaller set of configurations, thanks to the layered architecture, than of raw failure states
  • Operational configurations take into account:
  • layered dependencies
  • "knowledge failure" effects, which depend on the status of the management system and limit the reconfiguration capability
  • Explosion of configurations is a limitation