Statistical Model Checking and Rare Events
Paolo Zuliani, joint work with Edmund M. Clarke, Computer Science Department, CMU


SLIDE 1

Statistical Model Checking and Rare Events

Paolo Zuliani

Joint work with Edmund M. Clarke Computer Science Department, CMU

SLIDE 2

Probabilistic Verification

  • Verification of stochastic system models via statistical model checking
  • Temporal logic specification:
    “the amount of p53 exceeds 10^5 within 20 minutes”
  • If Φ = “p53 exceeds 10^5 within 20 minutes”:
    Probability(Φ) = ?

SLIDE 3

Equivalently

  • A biased coin (Bernoulli random variable):
    Prob(Heads) = p
    Prob(Tails) = 1 - p
  • p is unknown
  • Question: What is p?
  • A solution: flip the coin a number of times, collect the outcomes, and use statistical estimation
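The coin-flip view can be sketched in a few lines of Python (my illustration, not from the slides; the bias 0.3 is an arbitrary example standing in for the unknown p):

```python
import random

def estimate_p(flip, k, seed=0):
    """Crude Monte Carlo: flip the coin k times and return the fraction of heads."""
    rng = random.Random(seed)
    return sum(flip(rng) for _ in range(k)) / k

# A biased coin whose (normally unknown) true p is 0.3
coin = lambda rng: rng.random() < 0.3
print(estimate_p(coin, 100_000))  # close to 0.3
```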

SLIDE 4

Statistical Model Checking

Key idea (Håkan Younes, 2001)

  • System behavior w.r.t. property Φ can be modeled by a Bernoulli random variable of parameter p:
  • System satisfies Φ with (unknown) probability p
  • Question: What is p?
  • Draw a sample of system simulations and use:
  • Statistical estimation: returns “p in interval (a, b)” with high probability

SLIDE 5

Statistical Model Checking

  • Statistical Model Checking is a Monte Carlo method
  • Problems arise when p is very small (rare event)
  • The number of simulations (coin flips) needed to estimate p accurately grows too large
  • Need to deal with this …


SLIDE 7

Rare events

  • Estimate Prob(X ≥ t) = p_t, when p_t is small (say 10^-9)
  • Standard (Crude) Monte Carlo: generate K i.i.d. samples of X; return the estimator
    e_K = (1/K) Σ_{i=1}^{K} I(X_i ≥ t)
  • Prob(e_K → p_t) = 1 for K → ∞ (strong law of large numbers)
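The failure mode is easy to reproduce (an illustrative Python sketch; the Bernoulli below stands in for “run one system simulation and check Φ”):

```python
import random

def crude_mc(p_t, K, seed=1):
    """Crude Monte Carlo estimate of a Bernoulli parameter p_t from K samples."""
    rng = random.Random(seed)
    return sum(rng.random() < p_t for _ in range(K)) / K

# A common event is estimated well...
print(crude_mc(0.5, 10_000))   # close to 0.5
# ...but for a rare event (p_t = 1e-6) with K = 1000, the event is
# almost never observed, so the estimate is (almost certainly) 0.
print(crude_mc(1e-6, 1_000))
```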


SLIDE 9

Rare events

  • E[e_K] = p_t
  • Var[e_K] = p_t (1 - p_t) / K
  • By the Central Limit Theorem (CLT), the distribution of e_K converges to a normal distribution with:
    • mean p_t
    • variance p_t (1 - p_t) / K
  • Relative Error (RE) = √(Var[e_K]) / E[e_K] = √(p_t (1 - p_t) / K) / p_t

SLIDE 10

Rare events

  • RE = √(p_t (1 - p_t) / K) / p_t
  • Fix K; then RE is unbounded as p_t → 0
  • More accuracy → more samples
  • Want a confidence interval of relative accuracy δ and coverage probability c, i.e., the estimate e_K must satisfy:
    Prob(|e_K - p_t| < δ·p_t) ≥ c
  • How many samples do we need?
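The blow-up of the relative error is easy to see numerically (a small sketch of the RE formula above):

```python
def relative_error(p_t, K):
    """Relative error of the crude MC estimator: sqrt(p_t (1 - p_t) / K) / p_t."""
    return (p_t * (1.0 - p_t) / K) ** 0.5 / p_t

# With a fixed budget K = 10**6, RE grows without bound as p_t -> 0:
for p_t in (1e-2, 1e-4, 1e-6, 1e-9):
    print(f"p_t = {p_t:.0e}  RE = {relative_error(p_t, 10**6):.3f}")
```

At p_t = 10^-6 the relative error is already about 100%, and at 10^-9 the estimator is useless.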


SLIDE 12

Rare events

  • From the CLT, a 99% (approximate) confidence interval of relative accuracy δ needs about
    K ≈ z² (1 - p_t) / (δ² p_t) samples   (z ≈ 2.58, the 99% normal quantile)
    Thus, Prob(|e_K - p_t| < δ·p_t) ≈ 0.99
  • Examples:
    • p_t = 10^-9 and δ = 10^-2 (i.e., 1% relative accuracy): we need about 10^13 samples!!
    • Bayesian estimation requires about 6×10^6 samples with p_t = 10^-4 and δ = 10^-1
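Plugging the slide’s numbers into the sample-size formula (a sketch; z ≈ 2.576 is the two-sided 99% normal quantile, an assumption about the constant hidden in the garbled original):

```python
def samples_needed(p_t, delta, z=2.576):
    """K ~ z**2 * (1 - p_t) / (delta**2 * p_t): sample size for a confidence
    interval of relative accuracy delta with ~99% coverage (CLT approximation)."""
    return z * z * (1.0 - p_t) / (delta * delta * p_t)

# p_t = 1e-9, delta = 1e-2 (1% relative accuracy): on the order of 10**13 samples
print(f"{samples_needed(1e-9, 1e-2):.3e}")
```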

SLIDE 13

A solution

  • Importance Sampling (1940s)
  • A variance-reduction technique
  • Can result in dramatic reduction in sample size

SLIDE 15

Importance Sampling

  • The fundamental Importance Sampling identity:
    p_t = E_f[I(X ≥ t)] = ∫ I(x ≥ t) f(x) dx = ∫ I(x ≥ t) (f(x)/f*(x)) f*(x) dx = E_{f*}[I(X ≥ t) W(X)]
    where f is the density of X and W(x) = f(x)/f*(x) is the likelihood ratio


SLIDE 17

Importance Sampling

  • Estimate p_t = E[I(X ≥ t)] = Prob(X ≥ t)
  • A sample X_1, …, X_K i.i.d. as f
  • The crude Monte Carlo estimator is
    e_K = (1/K) Σ_{i=1}^{K} I(X_i ≥ t)   (sampling from f)


SLIDE 19

Importance Sampling

  • Define a biasing density f*
  • Compute the IS estimator
    e_K = (1/K) Σ_{i=1}^{K} I(X_i ≥ t) W(X_i)   (sampling from f* !)
    where W(x) = f(x)/f*(x) is the likelihood ratio
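A self-contained importance-sampling sketch (my own toy example, not from the slides): estimate p_t = Prob(X ≥ 30) for X ~ Exponential(1), where p_t = e^-30 ≈ 9.4×10^-14 is far too rare for crude Monte Carlo, using a biasing exponential f* with mean 30:

```python
import math
import random

def is_estimate(t, K, lam_star, seed=0):
    """Importance sampling for p_t = Prob(X >= t), X ~ Exponential(rate 1).
    Samples are drawn from f*(x) = lam_star * exp(-lam_star * x); each hit
    is weighted by the likelihood ratio W(x) = f(x) / f*(x)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(K):
        x = rng.expovariate(lam_star)                        # sample from f*
        if x >= t:                                           # I(x >= t)
            total += math.exp(-x) / (lam_star * math.exp(-lam_star * x))
    return total / K

est = is_estimate(30.0, 100_000, 1.0 / 30.0)
print(est, math.exp(-30))  # the estimate lands close to the true e**-30
```

With 100,000 biased samples the relative error is a few percent; a crude Monte Carlo run of the same size would almost surely return 0.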


SLIDE 21

Importance Sampling

  • Need to choose a “good” biasing density (low variance)
  • Optimal density:
    f*(x) = I(x ≥ t) f(x) / p_t
  • With this f*, every term of the IS estimator equals p_t:
    e_K = (1/K) Σ_{i=1}^{K} I(X_i ≥ t) W(X_i) = (1/K) Σ_{i=1}^{K} I(X_i ≥ t) f(X_i)/f*(X_i) = (1/K) Σ_{i=1}^{K} p_t = p_t
  • Zero variance! (But … p_t is unknown)

SLIDE 22

Cross-Entropy Method (R. Rubinstein)

  • Suppose the density of X is in a family of densities {f( · ;v)}
  • the “nominal” f is f(x;u)
  • Key idea: choose a parameter v such that the distance between f* and f( · ;v) is minimal
  • The Kullback-Leibler divergence (cross-entropy) is a measure of “distance” between two densities
  • First used for rare-event simulation by Rubinstein (1997)
SLIDE 26

Cross-Entropy Method

  • The KL divergence (cross-entropy) of densities g, h is
    D(g, h) = E_g[ln(g(X)/h(X))] = ∫ g(x) ln g(x) dx - ∫ g(x) ln h(x) dx
  • D(g, h) ≥ 0   (= 0 iff g = h)
  • D(g, h) ≠ D(h, g)
  • Optimal density f*, family {f( · ;v)}:
    min_v D(f*, f( · ;v))
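These two properties can be checked numerically for discrete densities (an illustrative Python sketch with arbitrary example distributions):

```python
import math

def kl(g, h):
    """D(g, h) = sum_x g(x) * ln(g(x) / h(x)) for discrete densities g, h."""
    return sum(gi * math.log(gi / hi) for gi, hi in zip(g, h) if gi > 0)

g = [0.5, 0.5]
h = [0.9, 0.1]
print(kl(g, h))  # > 0
print(kl(h, g))  # > 0, but a different value: D is not symmetric
print(kl(g, g))  # 0.0: D(g, h) = 0 iff g = h
```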


SLIDE 31

Cross-Entropy Method

  • The Cross-Entropy Method has two basic steps:
  • 1. find v* = argmin_v D(f*, f( · ;v))
  • 2. run importance sampling with biasing density f( · ;v*)
  • Step 2 is “easy”
  • Step 1 is not so easy


SLIDE 38

Cross-Entropy Method

  • Step 1:
    v* = argmin_v E_{f*}[ln(f*(X) / f(X;v))]
       = argmin_v [ ∫ f*(x) ln f*(x) dx - ∫ f*(x) ln f(x;v) dx ]
       = argmax_v ∫ f*(x) ln f(x;v) dx   (the first integral does not depend on v)
       = argmax_v ∫ I(x ≥ t) (f(x;u) / p_t) ln f(x;v) dx   (substituting f*)
       = argmax_v ∫ I(x ≥ t) f(x;u) ln f(x;v) dx   (p_t is a constant, always > 0, so it drops out)
       = argmax_v E_u[I(X ≥ t) ln f(X;v)]


SLIDE 40

Cross-Entropy Method

  • For certain families {f( · ;v)} (e.g., one-dimensional exponential) the problem
    v* = argmax_v E_u[I(X ≥ t) ln f(X;v)]
    can be solved analytically:
    v* = E_u[I(X ≥ t) X] / E_u[I(X ≥ t)]


SLIDE 42

Cross-Entropy Method

  • In practice: get X_1, …, X_K samples i.i.d. as f( · ;u) and compute the approximation
    v* ≈ Σ_{i=1}^{K} I(X_i ≥ t) X_i / Σ_{i=1}^{K} I(X_i ≥ t)
  • In general, one would have to (numerically) solve the problem
    argmax_v (1/K) Σ_{i=1}^{K} I(X_i ≥ t) ln f(X_i;v)
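For the one-dimensional exponential case the empirical update is just a conditional sample mean (a sketch with an assumed Exp(1) nominal density; by memorylessness E[X | X ≥ 2] = 3 for Exp(1)):

```python
import random

def ce_update(xs, t):
    """Empirical CE update v* ~ sum_i I(x_i >= t) x_i / sum_i I(x_i >= t):
    the mean of the samples that reached level t."""
    hits = [x for x in xs if x >= t]
    if not hits:
        raise ValueError("rare event: no sample reached t, the update fails")
    return sum(hits) / len(hits)

rng = random.Random(0)
xs = [rng.expovariate(1.0) for _ in range(10_000)]
print(ce_update(xs, 2.0))  # near 3.0 = E[X | X >= 2] for Exp(1)
```

The ValueError branch is the rare-event failure mode: if {X ≥ t} is rare, no sample reaches t and both sums are empty.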


SLIDE 44

Cross-Entropy Method

  • Problem: If {X ≥ t} is a rare event, then the approximation
    v* ≈ Σ_{i=1}^{K} I(X_i ≥ t) X_i / Σ_{i=1}^{K} I(X_i ≥ t)
    fails
  • Most terms in both sums will be zero!


SLIDE 49

Cross-Entropy with Rare Events

  • Rubinstein gave an algorithm that computes v* adaptively
  • Estimate Prob{S(X) ≥ t}   [S(X) = sample performance]
  • Idea: compute v* for a non-rare event {S(X) ≥ t’}, and iterate until t’ converges to t
  • Fix ρ, say 0.01 < ρ < 0.1; get X_1, …, X_K samples i.i.d. as f( · ;u)
  • Compute t’ = the (1 - ρ) sample quantile of the S(X_j);
    then Prob{S(X) ≥ t’} ≈ ρ
  • Now compute v* “as usual”. Iterate until t’ = t
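The adaptive loop can be sketched for the toy exponential setting used earlier (my illustration: nominal Exp(mean u = 1), S(X) = X, target level t = 30; the likelihood-ratio weight W(x;u,v) = f(x;u)/f(x;v) is needed because later iterations sample at the current v rather than at u):

```python
import math
import random

def adaptive_ce(t, u=1.0, rho=0.1, K=2000, seed=0, max_iter=50):
    """Adaptive CE sketch for X ~ Exponential(mean v), S(X) = X.
    Raises the level t' through (1 - rho) sample quantiles until t' = t,
    applying the likelihood-ratio-weighted CE update at each step."""
    rng = random.Random(seed)
    v = u
    for _ in range(max_iter):
        xs = sorted(rng.expovariate(1.0 / v) for _ in range(K))
        t_prime = min(xs[int((1.0 - rho) * K)], t)  # (1 - rho) quantile, clipped at t
        # W(x; u, v) = f(x; u) / f(x; v) for exponentials with means u, v
        w = lambda x: (v / u) * math.exp(-x * (1.0 / u - 1.0 / v))
        hits = [x for x in xs if x >= t_prime]
        v = sum(w(x) * x for x in hits) / sum(w(x) for x in hits)
        if t_prime >= t:
            return v
    return v

v_star = adaptive_ce(30.0)
print(v_star)  # near 31 = E[X | X >= 30] under the nominal Exp(1)
```

Each iteration roughly multiplies the level by a constant factor, so t’ reaches t after a handful of rounds even though {X ≥ 30} has probability e^-30 under the nominal density.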
SLIDE 50

Cross-Entropy with Rare Events

  • Does NOT work with statistical model checking
  • Problem: sample quantile computation
  • Order the sample performances:
    S_(1) ≤ … ≤ S_(i) ≤ … ≤ S_(K); the (1 - ρ) sample quantile is S_(⌈(1 - ρ)K⌉)
  • In statistical model checking, sample performances are either 0 (property false) or 1 (property true)

SLIDE 51

Cross-Entropy with Rare Events

  • However …
    v* = E_u[I(X ≥ t) X] / E_u[I(X ≥ t)] = E_w[I(X ≥ t) W(X;u,w) X] / E_w[I(X ≥ t) W(X;u,w)]
    where W(x;u,w) = f(x;u) / f(x;w) for an arbitrary parameter w
  • Work in progress

SLIDE 52

Example: Fuel Control System

The Stateflow/Simulink model

SLIDE 53

Verification

  • We want to estimate the probability that
    M, FaultRate ⊨ F^100 G^1 (FuelFlowRate = 0)
  • “It is the case that within 100 seconds, FuelFlowRate is zero for 1 second”
  • FaultRate = 1/3600 s (same value for the three sensors)
SLIDE 54

Importance Sampling

  • Ran the cross-entropy method to estimate the optimal biasing density with FaultRates = {1/7, 1/8, 1/9}
  • Used 100 samples for this, and obtained NewRates* = {1/2.007, 1/1.0113, 1/1.7277}
  • Ran importance sampling with 1,000 samples and NewRates*
  • Probability estimate: 9.1855×10^-15
SLIDE 55

Conclusions

  • Need to be able to deal with rare events in statistical model checking
  • The Cross-Entropy method is an interesting, semi-automatic technique
  • Research: adaptive technique for statistical model checking
  • [Further benefit: the cross-entropy method also applies to optimization, e.g., finding policies for MDPs]
SLIDE 56

The End

Questions?