Mixed-Criticality Systems with Permitted Failure Probability - PowerPoint PPT Presentation



slide-1
SLIDE 1

EDF Schedulability Analysis on Mixed-Criticality Systems with Permitted Failure Probability

Zhishan Guo, Luca Santinelli*, and Kecheng Yang

Department of Computer Science, UNC Chapel Hill *ONERA The French Aerospace Lab at Toulouse


slide-4
SLIDE 4

The Multi-WCET MC Task Model

  • The Liu & Layland (LL) sporadic task model:

Task τi = (ci, Ti)

– Worst-case execution requirement
– Minimum inter-arrival separation (period)

  • Provisioning assumptions (e.g., WCET-analysis tools) may be more or less conservative

  • Example: x := a + b (3~321 cycles)

[Figure: execution-time axis t with two WCET estimates: ciLO (measurement based; optimistic) and ciHI (static analysis; pessimistic). Deadlines are implicit.]


slide-6
SLIDE 6

Our MC Task Model

  • Liu & Layland (LL) sporadic task: τi = (ci, Ti)
  • MC sporadic task: τi = (ciLO, ciHI, Ti, HI)
  • HI task τi = (ciLO, ciHI, fi, Ti, HI)

– Worst-case execution estimates, along with failure probability
– Minimum inter-arrival separation (period)
– Criticality level

[Figure: execution-time axis t with ciLO, ciHI, and the failure probability of exceeding ciLO. Deadlines are implicit.]

slide-7
SLIDE 7

Our MC Task Model

  • Liu & Layland (LL) sporadic task: τi = (ci, Ti)
  • MC sporadic task: τi = (ciLO, ciHI, Ti, HI)
  • HI task τi = (ciLO, ciHI, fi, Ti, HI)


For each HI-criticality task τi, within a time interval of one hour, no job of τi has an execution requirement greater than ciHI, and the probability that any job of τi has an execution requirement greater than ciLO is fi; we would expect fi to be a very small positive value.

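The task model above can be sketched in code. This is an illustrative encoding only; the class and field names (MCTask, c_lo, c_hi, f, period, level) and the utilization helpers are ours, not the paper's:

```python
from dataclasses import dataclass

# Illustrative encoding of the MC task model; names are not from the paper.
@dataclass
class MCTask:
    c_lo: float    # LO WCET estimate (optimistic, e.g. measurement based)
    c_hi: float    # HI WCET estimate (pessimistic, e.g. static analysis)
    period: float  # minimum inter-arrival separation; implicit deadline == period
    level: str     # criticality level, "LO" or "HI"
    f: float = 0.0 # per-hour prob. of any job exceeding c_lo (HI tasks only)

    def u_lo(self) -> float:
        return self.c_lo / self.period

    def u_hi(self) -> float:
        return self.c_hi / self.period

tau = MCTask(c_lo=2.0, c_hi=5.0, period=10.0, level="HI", f=1e-9)
print(tau.u_lo(), tau.u_hi())  # 0.2 0.5
```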


slide-10
SLIDE 10

Why Failure Probability?

  • Steve Vestal. Preemptive scheduling of multi-criticality systems with varying degrees of execution time assurance. RTSS 2007.

Abstract

…the more confidence one needs in a task execution time bound, the larger and more conservative that bound tends to become in practice. … We assume a task may have a set of alternative worst-case execution times, each assured to a different level of confidence.


fi is a quantized form of confidence


slide-13
SLIDE 13

Why Failure Probability?

  • Steve Vestal. Preemptive scheduling of multi-criticality systems with varying degrees of execution time assurance. RTSS 2007.

Abstract

This paper is based on a conjecture that the more confidence one needs in a task execution time bound (the less tolerant one is of missed deadlines), the larger and more conservative that bound tends to become in practice. … We assume a task may have a set of alternative worst-case execution times, each assured to a different level of confidence.

Evaluation

Discussions with individuals from various Honeywell sites indicated that execution time measurements obtained from instrumented platforms were the primary but not only data used to determine worst-case execution time parameters. Testing was influenced by the design assurance level of a task…. In some cases experience or special factors associated with a particular application domain were taken into account, sometimes including added safety margins.

Remark

One would obtain WCET bounds only at the precision and level of assurance required.

An allowed system failure probability FS is specified. It describes the permitted probability of the system failing to meet timing constraints during one hour of execution. FS may be very close to zero (e.g., 10^-12 for some safety-critical avionics systems).

slide-14
SLIDE 14

Why Failure Probability?

  • Steve Vestal. Preemptive scheduling of multi-criticality systems with varying degrees of execution time assurance. RTSS 2007.

Remark

One would obtain WCET bounds only at the precision and level of assurance required. Tractable exact WCET analysis would reduce but not entirely eliminate the utility of these methods. For example, the longest execution paths might be sufficiently infrequent in practice (e.g. error handling paths) that they should be ignored for low-to-moderate criticality tasks. Occasional deadline misses may be tolerable, especially by tasks at lower criticality levels.


slide-17
SLIDE 17

Why Failure Probability?

  • Alan Burns and Robert I. Davis. Mixed Criticality Systems - A Review. 6th Ed., Aug 2015. http://www-users.cs.york.ac.uk/burns/review.pdf
    – 272 papers cited
    – The multi-WCET task model has been thoroughly studied.
    – Without fi and FS, most existing work must consider the case that ALL HI-criticality tasks SIMULTANEOUSLY exceed their CLO's, which, in some cases, might be too pessimistic.

  • Schedulability analysis based on pWCET
    – Dedicated to probabilistic analysis (e.g. EVT)
    – Convolution-based analysis is slow (exponential)
    – MC is not yet involved

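The cost of convolution-based analysis can be seen in a small sketch: composing discrete execution-time distributions (a toy stand-in for pWCETs) multiplies the sizes of their supports, which blows up across many tasks. The dict-based representation here is purely illustrative:

```python
def convolve(pa, pb):
    """Convolve two discrete execution-time distributions given as
    {time: probability} dicts; a toy stand-in for pWCET convolution."""
    out = {}
    for ta, qa in pa.items():
        for tb, qb in pb.items():
            out[ta + tb] = out.get(ta + tb, 0.0) + qa * qb
    return out

# Two tasks with 2-point distributions -> up to 4 support points;
# over n tasks the support (and the work) can grow exponentially.
d = convolve({1: 0.9, 3: 0.1}, {2: 0.5, 4: 0.5})
print(d)  # {3: 0.45, 5: 0.5, 7: 0.05}
```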


slide-19
SLIDE 19

Our Problem

  • Given
    – An MC sporadic task set (HI tasks with failure probability fi);
    – A uni-processor platform (with system failure probability FS).

  • We seek to determine the probabilistic schedulability:
    – Strongly probabilistic schedulable: upon any hour of execution, the probability of missing any deadline is less than FS.
    – Weakly probabilistic schedulable: upon any hour of execution, the probability of missing any HI-criticality deadline is less than FS.
    – In either case, all deadlines are met during system runs where no job exceeds its LO-WCET.

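The strong definition suggests a very conservative screening test, sketched below. This is not the paper's pMC test: it assumes all deadlines are met whenever no job exceeds its LO-WCET, and then applies a union bound over the per-hour exceedance probabilities fi, which needs no independence assumption:

```python
def strongly_schedulable_hint(f_list, F_S):
    """Sufficient (very conservative) screen: assuming all deadlines are
    met whenever no job exceeds its LO-WCET, the per-hour probability of
    any HI task exceeding c_lo is at most sum(f_list) by a union bound,
    so sum(f_list) < F_S rules out every failure scenario that requires
    at least one exceedance."""
    return sum(f_list) < F_S

print(strongly_schedulable_hint([1e-9, 2e-9, 5e-10], 1e-6))  # True
print(strongly_schedulable_hint([8e-7, 9e-7], 1e-6))         # False
```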


slide-21
SLIDE 21

About Independence

  • Between jobs of the same task, no independence is required
    – Jobs generated from the same task typically represent execution of the same piece of code
    – WCETs of consecutive such jobs are often correlated
    – Recall the definition of fi: the probability of any failure in an hour

  • Between tasks
    – To simplify, and to start with, independence is assumed
    – pET vs pWCET: execution times are observed with other tasks executing in parallel, so the measurements embed dependency [on-going discussions…]
    – Independence (i.i.d.) may be achieved with a proper choice of CiLO (setting a threshold) [Liu, Mills, and Anderson, RTSS'14]
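The threshold idea in the last sub-bullet can be illustrated as follows. The quantile rule, the jobs_per_hour parameter, and the per-job union bound are assumptions made for this sketch only; a real analysis of rare exceedance levels would rely on EVT rather than raw empirical quantiles:

```python
def c_lo_from_samples(samples, f_i, jobs_per_hour):
    """Pick c_lo as an empirical quantile so that the per-job exceedance
    probability is at most f_i / jobs_per_hour; a union bound over the
    jobs released in one hour then keeps the per-hour probability below
    f_i. A sketch only: tiny f_i values are far below what empirical
    quantiles can certify, which is where EVT comes in."""
    per_job = f_i / jobs_per_hour
    s = sorted(samples)
    k = min(len(s) - 1, int((1.0 - per_job) * len(s)))
    return s[k]

measurements = list(range(100))  # toy execution-time samples
print(c_lo_from_samples(measurements, f_i=0.1, jobs_per_hour=1))  # 90
```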


slide-24
SLIDE 24

Our Strategy - A Quick View

  • HI-criticality tasks have slight chances to exceed their CLO’s.
  • LFF-Clustering -- HI-criticality tasks are put into groups, s.t.:

– The likelihood of more than one task per group exceeding their CLO's is much smaller than FS (safe to ignore).
– Each group is assigned a HI-criticality server that "takes care of" any reasonable potential failure behavior of its HI tasks.

  • Why do we need groups?

Task: Heptagon


slide-25
SLIDE 25

Our Strategy - A Quick View

  • HI-criticality tasks have slight chances to exceed their CLO’s.
  • LFF-Clustering -- HI-criticality tasks are put into groups, s.t.:

– The likelihood of more than one task per group exceeding their CLO's is much smaller than FS (safe to ignore).
– Each group is assigned a HI-criticality server that "takes care of" any reasonable potential failure behavior of its HI tasks.

  • Why do we need groups?

Task: Heptagon


If the whole task set is one group, we need to consider the probability of more than one task exceeding their CLO's -- Exponential Time!


slide-27
SLIDE 27

Our Strategy - A Quick View

  • HI-criticality tasks have slight chances to exceed their CLO’s.
  • LFF-Clustering -- HI-criticality tasks are put into groups, s.t.:

– The likelihood of more than one task per group exceeding their CLO's is much smaller than FS (safe to ignore).
– Each group is assigned a HI-criticality server that "takes care of" any reasonable potential failure behavior of its HI tasks.

  • Why do we need groups?

Task: Heptagon


We are covering the case that all tasks SIMULTANEOUSLY exceed their CLO's -- No Improvement!


slide-29
SLIDE 29

Our Strategy - A Quick View

  • HI-criticality tasks have slight chances to exceed their CLO’s.
  • LFF-Clustering -- HI-criticality tasks are put into groups, s.t.:

– The likelihood of more than one task per group exceeding their CLO's is much smaller than FS (safe to ignore).
– Each group is assigned a HI-criticality server that "takes care of" any reasonable potential failure behavior of its HI tasks.

  • Schedulability test pMC

– May return strongly, weakly, or not schedulable
– Sufficient conditions provided (utilization based)
– Theoretically proved
– Experimental comparison against EDF-VD
Please see the paper for details.

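A plausible greedy sketch of the grouping step is below. The function names, the largest-f-first ordering, and the pairwise union bound on P(>1 exceedance) are all assumptions made here; the paper's LFF-Clustering may use a different ordering and acceptance rule:

```python
def pair_bound(fs):
    """Union bound on P(two or more exceedances in the group):
    sum of f_i * f_j over pairs, assuming independent events."""
    return sum(fs[i] * fs[j]
               for i in range(len(fs)) for j in range(i + 1, len(fs)))

def greedy_cluster(task_fs, budget):
    """Greedy grouping sketch: place tasks (largest f first) into the
    first group where the bound on P(>1 exceedance) stays below the
    per-group budget (a slice of F_S); open a new group otherwise."""
    groups = []
    for f in sorted(task_fs, reverse=True):
        for g in groups:
            if pair_bound(g + [f]) < budget:
                g.append(f)
                break
        else:
            groups.append([f])
    return groups

print(len(greedy_cluster([1e-3, 1e-3, 1e-3, 1e-3], budget=2e-6)))  # 2
```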

slide-30
SLIDE 30

Experimental Comparison

UUniFast; FS = 10^-6; UHI ~ U[0.9, 1]
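The task sets behind these plots are generated with UUniFast (Bini and Buttazzo), which draws n per-task utilizations summing to a target value, uniformly over the valid simplex. A minimal version:

```python
import random

def uunifast(n, total_u):
    """UUniFast (Bini & Buttazzo): draw n task utilizations that sum to
    total_u, uniformly distributed over the valid utilization simplex."""
    utils, remaining = [], total_u
    for i in range(1, n):
        nxt = remaining * random.random() ** (1.0 / (n - i))
        utils.append(remaining - nxt)
        remaining = nxt
    utils.append(remaining)
    return utils

random.seed(7)
us = uunifast(5, 0.9)
print(len(us), abs(sum(us) - 0.9) < 1e-9)  # 5 True
```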


slide-33
SLIDE 33

Experimental Comparison

UUniFast; FS = 10^-6; UHI ~ U[0.9, 1]

About 83% of the randomly generated sets with LO utilization in the range [0.75, 0.8) are strongly schedulable by our method.

slide-37
SLIDE 37

Contribution

  • Extended the multi-WCET MC task model of [Vestal, 2007] by introducing a probability-based parameter
  • Defined (strong and weak) probabilistic schedulability accordingly
  • Proposed a server-based grouping mechanism to solve the problem efficiently
  • Outperformed EDF-VD in an experimental study on randomly generated task sets

slide-38
SLIDE 38

Restrictions of this work

  • Uni-processor
  • Implicit deadline
  • Two criticality levels
  • Inter-task WCET independence assumption
  • The server introduces too many preemptions
  • More comparative studies are needed…
slide-39
SLIDE 39

Some Related Work

  • J. Ren and L.T.X. Phan. Mixed-criticality scheduling on multiprocessors using task grouping. In Proc. 27th ECRTS, pages 25–36. IEEE, July 2015.

  • X. Gu, K.-M. Phan, A. Easwaran, and I. Shin. Resource efficient isolation mechanisms in mixed-criticality scheduling. In Proc. 27th ECRTS, pages 13–24. IEEE, July 2015.

  • Done in parallel to our work, with similar ideas:
    – grouping tasks together, and
    – assuming that only a few tasks in a group will exceed their CLO.
slide-40
SLIDE 40

Acknowledgement

  • We would like to thank:

– Prof. Sanjoy Baruah, for his kind help and support, and
– Dr. Dorin Maxim, for his insightful comments.

slide-41
SLIDE 41

Thank you!

Zhishan Guo zsguo@cs.unc.edu
