
Lecture 13: Reachability in MDPs (Dr. Dave Parker)



  1. Probabilistic Model Checking, Michaelmas Term 2011
     Lecture 13: Reachability in MDPs
     Dr. Dave Parker, Department of Computer Science, University of Oxford

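  2. Recall - MDPs
     • Markov decision process: $M = (S, s_{init}, \mathit{Steps}, L)$
     • Adversary $\sigma \in \mathit{Adv}$ resolves nondeterminism
     • $\sigma$ induces set of paths $\mathit{Path}^{\sigma}(s)$ and DTMC $D^{\sigma}$
     • $D^{\sigma}$ yields probability space $\mathit{Pr}^{\sigma}_{s}$ over $\mathit{Path}^{\sigma}(s)$
     • $\mathit{Prob}^{\sigma}(s, \psi) = \mathit{Pr}^{\sigma}_{s}\{\, \omega \in \mathit{Path}^{\sigma}(s) \mid \omega \models \psi \,\}$
     • MDP yields minimum/maximum probabilities:
       $p_{\min}(s, \psi) = \inf_{\sigma \in \mathit{Adv}} \mathit{Prob}^{\sigma}(s, \psi)$
       $p_{\max}(s, \psi) = \sup_{\sigma \in \mathit{Adv}} \mathit{Prob}^{\sigma}(s, \psi)$
     DP/Probabilistic Model Checking, Michaelmas 2011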

  3. Probabilistic reachability
     • Minimum and maximum probability of reaching target set
       − target set = all states labelled with atomic proposition a
       $p_{\min}(s, F\,a) = \inf_{\sigma \in \mathit{Adv}} \mathit{Prob}^{\sigma}(s, F\,a)$
       $p_{\max}(s, F\,a) = \sup_{\sigma \in \mathit{Adv}} \mathit{Prob}^{\sigma}(s, F\,a)$
     • Vectors: $p_{\min}(F\,a)$ and $p_{\max}(F\,a)$
       − minimum/maximum probabilities for all states of MDP

  4. Overview
     • Qualitative probabilistic reachability
       − case where $p_{\min} > 0$ or $p_{\max} > 0$
     • Optimality equation
     • Memoryless adversaries suffice
       − finitely many adversaries to consider
     • Computing reachability probabilities
       − value iteration (fixed point computation)
       − linear programming problem
       − policy iteration

  5. Qualitative probabilistic reachability
     • Consider the problem of determining states for which $p_{\min}(s, F\,a)$ or $p_{\max}(s, F\,a)$ is zero (or non-zero)
       − max case: $S^{\max=0} = \{\, s \in S \mid p_{\max}(s, F\,a) = 0 \,\}$
       − this is just (non-probabilistic) reachability

         R := Sat(a)
         done := false
         while (done = false)
           R′ := R ∪ { s ∈ S | ∃ (a,µ) ∈ Steps(s) . ∃ s′ ∈ R . µ(s′) > 0 }
           if (R′ = R) then done := true
           R := R′
         endwhile
         return S \ R

  6. Qualitative probabilistic reachability
     • Min case: $S^{\min=0} = \{\, s \in S \mid p_{\min}(s, F\,a) = 0 \,\}$
       − note: quantification over all choices (∀ instead of ∃)

         R := Sat(a)
         done := false
         while (done = false)
           R′ := R ∪ { s ∈ S | ∀ (a,µ) ∈ Steps(s) . ∃ s′ ∈ R . µ(s′) > 0 }
           if (R′ = R) then done := true
           R := R′
         endwhile
         return S \ R
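The two precomputation loops on slides 5 and 6 differ only in the quantifier over choices, so they can be sketched as one Python function. This is a minimal illustration, not the lecture's implementation: the dictionary encoding of `Steps`, the name `zero_states` and the `universal` flag are all assumptions made here. The example MDP is read off from the value iteration computations on slide 14, with s2 and s3 assumed to be absorbing.

```python
from typing import Dict, List, Set

# Assumed encoding: each state maps to its list of choices, and each
# choice is a distribution {successor: probability}. Action labels are omitted.
MDP = Dict[str, List[Dict[str, float]]]


def zero_states(steps: MDP, target: Set[str], universal: bool) -> Set[str]:
    """Return S \\ R, where R is the backward-reachability fixed point.

    universal=False: some choice can step into R  -> S^{max=0}
    universal=True:  every choice can step into R -> S^{min=0}
    Assumes every state has at least one choice.
    """
    quantifier = all if universal else any
    reach = set(target)
    while True:
        new = reach | {
            s
            for s, choices in steps.items()
            if quantifier(
                any(p > 0 and t in reach for t, p in mu.items())
                for mu in choices
            )
        }
        if new == reach:  # R' = R: fixed point reached
            return set(steps) - reach
        reach = new


# Example MDP (assumed encoding; s2 and s3 are taken to be absorbing).
EXAMPLE: MDP = {
    "s0": [{"s1": 1.0}, {"s0": 0.25, "s3": 0.25, "s2": 0.5}],
    "s1": [{"s0": 0.1, "s1": 0.5, "s2": 0.4}],
    "s2": [{"s2": 1.0}],
    "s3": [{"s3": 1.0}],
}

print(zero_states(EXAMPLE, {"s2"}, universal=False))  # S^{max=0}: {'s3'}
print(zero_states(EXAMPLE, {"s2"}, universal=True))   # S^{min=0}: {'s3'}
```

Under these assumptions both sets come out as {s3}, which agrees with the value of S^{min=0} stated on slide 14.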

  7. Optimality (min)
     • The values $p_{\min}(s, F\,a)$ are the unique solution of the following equations:
       $$x_s = \begin{cases} 1 & \text{if } s \in \mathit{Sat}(a) \\ 0 & \text{if } s \in S^{\min=0} \\ \min\left\{ \sum_{s' \in S} \mu(s') \cdot x_{s'} \;\middle|\; (a,\mu) \in \mathit{Steps}(s) \right\} & \text{otherwise} \end{cases}$$
       − $S^{\min=0} = \{\, s \mid p_{\min}(s, F\,a) = 0 \,\}$
       − optimal solution for state s uses optimal solution for successors s′
     • This is an instance of the Bellman equation
       − (basis of dynamic programming techniques)

  8. Optimality (max)
     • Likewise, the values $p_{\max}(s, F\,a)$ are the unique solution of the following equations:
       $$x_s = \begin{cases} 1 & \text{if } s \in \mathit{Sat}(a) \\ 0 & \text{if } s \in S^{\max=0} \\ \max\left\{ \sum_{s' \in S} \mu(s') \cdot x_{s'} \;\middle|\; (a,\mu) \in \mathit{Steps}(s) \right\} & \text{otherwise} \end{cases}$$
       − $S^{\max=0} = \{\, s \mid p_{\max}(s, F\,a) = 0 \,\}$

  9. Memoryless adversaries
     • Memoryless adversaries suffice for probabilistic reachability
       − i.e. there exist memoryless adversaries $\sigma_{\min}$ and $\sigma_{\max}$ such that:
       − $\mathit{Prob}^{\sigma_{\min}}(s, F\,a) = p_{\min}(s, F\,a)$ for all states $s \in S$
       − $\mathit{Prob}^{\sigma_{\max}}(s, F\,a) = p_{\max}(s, F\,a)$ for all states $s \in S$
     • Construct adversaries from optimal solution:
       $$\sigma_{\min}(s) = \operatorname{argmin}\left\{ \sum_{s' \in S} \mu(s') \cdot p_{\min}(s', F\,a) \;\middle|\; (a,\mu) \in \mathit{Steps}(s) \right\}$$
       $$\sigma_{\max}(s) = \operatorname{argmax}\left\{ \sum_{s' \in S} \mu(s') \cdot p_{\max}(s', F\,a) \;\middle|\; (a,\mu) \in \mathit{Steps}(s) \right\}$$

  10. Computing reachability probabilities
     • Several approaches…
     • 1. Value iteration (preferable in practice, e.g. in PRISM)
       − approximate with iterative solution method
       − corresponds to fixed point computation
     • 2. Reduction to a linear programming (LP) problem (better complexity; good for small examples)
       − solve with linear optimisation techniques
       − exact solution using well-known methods
     • 3. Policy iteration
       − iteration over adversaries

  11. Method 1 - Value iteration (min)
     • For minimum probabilities $p_{\min}(s, F\,a)$ it can be shown that:
       − $p_{\min}(s, F\,a) = \lim_{n \to \infty} x_s^{(n)}$ where:
       $$x_s^{(n)} = \begin{cases} 1 & \text{if } s \in \mathit{Sat}(a) \\ 0 & \text{if } s \in S^{\min=0} \\ 0 & \text{if } s \in S^{?} \text{ and } n = 0 \\ \min\left\{ \sum_{s' \in S} \mu(s') \cdot x_{s'}^{(n-1)} \;\middle|\; (a,\mu) \in \mathit{Steps}(s) \right\} & \text{if } s \in S^{?} \text{ and } n > 0 \end{cases}$$
       − where: $S^{?} = S \setminus (\mathit{Sat}(a) \cup S^{\min=0})$
     • Approximate iterative solution technique
       − iterations terminated when solution converges sufficiently

  12. Method 1 - Value iteration (max)
     • Value iteration applies to maximum probabilities in the same way…
       − $p_{\max}(s, F\,a) = \lim_{n \to \infty} x_s^{(n)}$ where:
       $$x_s^{(n)} = \begin{cases} 1 & \text{if } s \in \mathit{Sat}(a) \\ 0 & \text{if } s \in S^{\max=0} \\ 0 & \text{if } s \in S^{?} \text{ and } n = 0 \\ \max\left\{ \sum_{s' \in S} \mu(s') \cdot x_{s'}^{(n-1)} \;\middle|\; (a,\mu) \in \mathit{Steps}(s) \right\} & \text{if } s \in S^{?} \text{ and } n > 0 \end{cases}$$
       − where: $S^{?} = S \setminus (\mathit{Sat}(a) \cup S^{\max=0})$
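The recurrences on slides 11 and 12 translate almost line by line into code. The following Python sketch is one possible reading, with several assumptions of its own: the dictionary encoding of `Steps`, the function name `value_iteration`, and in particular the stopping rule (maximum absolute change below `eps`), since the slides only say that iteration stops "when solution converges sufficiently". The MDP is the lecture's four-state example as read off from slide 14, with s2 and s3 assumed absorbing.

```python
from typing import Callable, Dict, List, Set

# Assumed encoding: state -> list of choices, each a {successor: probability} map.
MDP = Dict[str, List[Dict[str, float]]]


def value_iteration(
    steps: MDP,
    target: Set[str],          # Sat(a): value fixed to 1
    zero: Set[str],            # S^{min=0} or S^{max=0}: value fixed to 0
    opt: Callable = min,       # min for p_min, max for p_max
    eps: float = 1e-8,         # assumed convergence threshold
    max_iter: int = 10_000,
) -> Dict[str, float]:
    x = {s: (1.0 if s in target else 0.0) for s in steps}   # x^(0)
    unknown = set(steps) - target - zero                     # S^?
    for _ in range(max_iter):
        new = dict(x)
        for s in unknown:
            # x_s^(n) is computed from the previous vector x^(n-1) only
            new[s] = opt(
                sum(p * x[t] for t, p in mu.items()) for mu in steps[s]
            )
        delta = max((abs(new[s] - x[s]) for s in unknown), default=0.0)
        x = new
        if delta < eps:
            break
    return x


# Example MDP (assumed encoding; s2 and s3 are taken to be absorbing).
EXAMPLE: MDP = {
    "s0": [{"s1": 1.0}, {"s0": 0.25, "s3": 0.25, "s2": 0.5}],
    "s1": [{"s0": 0.1, "s1": 0.5, "s2": 0.4}],
    "s2": [{"s2": 1.0}],
    "s3": [{"s3": 1.0}],
}

p_min = value_iteration(EXAMPLE, {"s2"}, {"s3"}, opt=min)
p_max = value_iteration(EXAMPLE, {"s2"}, {"s3"}, opt=max)
print(p_min)  # s0 close to 2/3, s1 close to 14/15
print(p_max)  # s0 and s1 close to 1
```

Passing `max_iter=2` reproduces the second iterate [0.4, 0.6, 1, 0] of the worked example up to floating-point rounding. An absolute-difference stopping rule does not by itself bound the distance to the true probabilities, so the threshold here should be read as a heuristic.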

  13. Example
     • Minimum/maximum probability of reaching an a-state
     [Figure: example MDP with states s0, s1, s2, s3; s2 is labelled {a}; the edges carry probabilities 1, 0.5, 0.4, 0.25 and 0.1.]

  14. Example - Value iteration (min)
     • Compute: $p_{\min}(s_i, F\,a)$
     • $\mathit{Sat}(a) = \{s_2\}$, $S^{\min=0} = \{s_3\}$, $S^{?} = \{s_0, s_1\}$
     • Iterates $[\, x_0^{(n)}, x_1^{(n)}, x_2^{(n)}, x_3^{(n)} \,]$:
       n=0: [ 0, 0, 1, 0 ]
       n=1: [ min(1·0, 0.25·0+0.25·0+0.5·1), 0.1·0+0.5·0+0.4·1, 1, 0 ] = [ 0, 0.4, 1, 0 ]
       n=2: [ min(1·0.4, 0.25·0+0.25·0+0.5·1), 0.1·0+0.5·0.4+0.4·1, 1, 0 ] = [ 0.4, 0.6, 1, 0 ]
       n=3: …
     [Figure: the example MDP, with s2 marked as Sat(a) and s3 marked as S^{min=0}.]

  15. Example - Value iteration (min)
     • Iterates $[\, x_0^{(n)}, x_1^{(n)}, x_2^{(n)}, x_3^{(n)} \,]$:
       n=0:  [ 0.000000, 0.000000, 1, 0 ]
       n=1:  [ 0.000000, 0.400000, 1, 0 ]
       n=2:  [ 0.400000, 0.600000, 1, 0 ]
       n=3:  [ 0.600000, 0.740000, 1, 0 ]
       n=4:  [ 0.650000, 0.830000, 1, 0 ]
       n=5:  [ 0.662500, 0.880000, 1, 0 ]
       n=6:  [ 0.665625, 0.906250, 1, 0 ]
       n=7:  [ 0.666406, 0.919688, 1, 0 ]
       n=8:  [ 0.666602, 0.926484, 1, 0 ]
       …
       n=20: [ 0.666667, 0.933332, 1, 0 ]
       n=21: [ 0.666667, 0.933332, 1, 0 ] ≈ [ 2/3, 14/15, 1, 0 ]
     • $p_{\min}(F\,a) = [\, 2/3, 14/15, 1, 0 \,]$
     [Figure: the example MDP, with s2 marked as Sat(a) and s3 marked as S^{min=0}.]
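The limit [2/3, 14/15, 1, 0] can be cross-checked by hand. As a sketch, assume the minimum at $s_0$ is attained by its probabilistic choice (the one with weights 0.25, 0.25, 0.5 used in the computation on slide 14) and fix $x_2 = 1$, $x_3 = 0$; the optimality equations for $S^{?} = \{s_0, s_1\}$ then become linear:

```latex
\begin{align*}
x_0 &= \tfrac14 x_0 + \tfrac14 \cdot 0 + \tfrac12 \cdot 1
  &&\Rightarrow\; \tfrac34 x_0 = \tfrac12
  &&\Rightarrow\; x_0 = \tfrac23 \\
x_1 &= \tfrac1{10} x_0 + \tfrac12 x_1 + \tfrac25 \cdot 1
  &&\Rightarrow\; \tfrac12 x_1 = \tfrac1{15} + \tfrac25 = \tfrac7{15}
  &&\Rightarrow\; x_1 = \tfrac{14}{15}
\end{align*}
```

Since $x_0 = 2/3 \le 14/15 = x_1$, the assumption is consistent: the other choice at $s_0$ (moving to $s_1$ with probability 1) would not give a smaller value, so these numbers solve the min optimality equations.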

  16. Generating an optimal adversary
     • Min adversary $\sigma_{\min}$
     • Iterates $[\, x_0^{(n)}, x_1^{(n)}, x_2^{(n)}, x_3^{(n)} \,]$:
       …
       n=20: [ 0.666667, 0.933332, 1, 0 ]
       n=21: [ 0.666667, 0.933332, 1, 0 ] ≈ [ 2/3, 14/15, 1, 0 ]
     • $s_0$: min(1·14/15, 0.5·1 + 0.25·0 + 0.25·2/3) = min(14/15, 2/3) = 2/3, so $\sigma_{\min}$ picks the probabilistic choice in $s_0$
     [Figure: the example MDP, with s2 marked as Sat(a) and s3 marked as S^{min=0}.]

  17. Generating an optimal adversary
     • DTMC $D^{\sigma_{\min}}$
     • Iterates $[\, x_0^{(n)}, x_1^{(n)}, x_2^{(n)}, x_3^{(n)} \,]$:
       …
       n=20: [ 0.666667, 0.933332, 1, 0 ]
       n=21: [ 0.666667, 0.933332, 1, 0 ] ≈ [ 2/3, 14/15, 1, 0 ]
     • $s_0$: min(1·14/15, 0.5·1 + 0.25·0 + 0.25·2/3) = min(14/15, 2/3)
     [Figure: the DTMC induced on the example MDP by the adversary σ_min.]
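The argmin construction from slide 9, applied to the example, can be sketched in a few lines of Python. The names `memoryless_adversary` and `induced_dtmc`, the dictionary encoding and the use of list indices to identify choices are all assumptions of this sketch; exact fractions are used so that the comparison 2/3 versus 14/15 is not affected by rounding. The values are the limits quoted on the slides, and s2 and s3 are assumed absorbing.

```python
from fractions import Fraction as Fr

# Example MDP (assumed encoding): state -> list of choices,
# each choice a {successor: probability} map.
EXAMPLE = {
    "s0": [{"s1": Fr(1)}, {"s0": Fr(1, 4), "s3": Fr(1, 4), "s2": Fr(1, 2)}],
    "s1": [{"s0": Fr(1, 10), "s1": Fr(1, 2), "s2": Fr(2, 5)}],
    "s2": [{"s2": Fr(1)}],
    "s3": [{"s3": Fr(1)}],
}

# p_min(F a) as given on the slides.
P_MIN = {"s0": Fr(2, 3), "s1": Fr(14, 15), "s2": Fr(1), "s3": Fr(0)}


def memoryless_adversary(steps, values, opt=min):
    """For each state, pick the index of a choice with optimal expected value.

    Ties are broken by list order, an arbitrary decision made in this sketch.
    """
    adversary = {}
    for s, choices in steps.items():
        expected = [sum(p * values[t] for t, p in mu.items()) for mu in choices]
        adversary[s] = expected.index(opt(expected))
    return adversary


def induced_dtmc(steps, adversary):
    """Keep only the chosen distribution in every state."""
    return {s: steps[s][i] for s, i in adversary.items()}


sigma_min = memoryless_adversary(EXAMPLE, P_MIN, opt=min)
print(sigma_min)  # {'s0': 1, 's1': 0, 's2': 0, 's3': 0}
dtmc = induced_dtmc(EXAMPLE, sigma_min)
```

In `s0` the expected values are 14/15 for the first choice and 2/3 for the second, so index 1 is selected, matching the comparison min(14/15, 2/3) on the slide.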

  18. Value iteration as a fixed point
     • Can view value iteration as a fixed point computation over vectors of probabilities $y \in [0,1]^{S}$, e.g. for minimum:
       $$F(y)(s) = \begin{cases} 1 & \text{if } s \in \mathit{Sat}(a) \\ 0 & \text{if } s \in S^{\min=0} \\ \min\left\{ \sum_{s' \in S} \mu(s') \cdot y(s') \;\middle|\; (a,\mu) \in \mathit{Steps}(s) \right\} & \text{otherwise} \end{cases}$$
     • Let:
       − $x^{(0)} = 0$ (i.e. $x^{(0)}(s) = 0$ for all s)
       − $x^{(n+1)} = F(x^{(n)})$
     • Then:
       − $x^{(0)} \le x^{(1)} \le x^{(2)} \le x^{(3)} \le \dots$
       − $p_{\min}(F\,a) = \lim_{n \to \infty} x^{(n)}$

  19. Linear programming
     • Linear programming
       − optimisation of a linear objective function
       − subject to linear (in)equality constraints
     • General form:
       − n variables: $x_1, x_2, \dots, x_n$
       − maximise (or minimise): $c_1 x_1 + c_2 x_2 + \dots + c_n x_n$
       − subject to constraints:
         $a_{11} x_1 + a_{12} x_2 + \dots + a_{1n} x_n \le b_1$
         $a_{21} x_1 + a_{22} x_2 + \dots + a_{2n} x_n \le b_2$
         …
         $a_{m1} x_1 + a_{m2} x_2 + \dots + a_{mn} x_n \le b_m$
     • In matrix/vector form: maximise (or minimise) $c \cdot x$ subject to $A \cdot x \le b$
     • Many standard solution techniques exist, e.g. Simplex, ellipsoid method, interior point method

  20. Method 2 - Linear programming problem
     • Min probabilities $p_{\min}(s, F\,a)$ can be computed as follows:
       − $p_{\min}(s, F\,a) = 1$ if $s \in \mathit{Sat}(a)$
       − $p_{\min}(s, F\,a) = 0$ if $s \in S^{\min=0}$
       − values for remaining states in the set $S^{?} = S \setminus (\mathit{Sat}(a) \cup S^{\min=0})$ can be obtained as the unique solution of the following linear programming problem:
         maximize $\sum_{s \in S^{?}} x_s$ subject to the constraints:
         $$x_s \le \sum_{s' \in S^{?}} \mu(s') \cdot x_{s'} + \sum_{s' \in \mathit{Sat}(a)} \mu(s')$$
         for all $s \in S^{?}$ and for all $(a,\mu) \in \mathit{Steps}(s)$
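As an illustration, the LP can be written out for the lecture's four-state example, where $\mathit{Sat}(a) = \{s_2\}$, $S^{\min=0} = \{s_3\}$ and $S^{?} = \{s_0, s_1\}$. The transition probabilities below are read off from the value iteration computation on slide 14, so treat this instantiation as a sketch under that reading. There is one constraint per state in $S^{?}$ and per choice in that state:

```latex
\begin{align*}
\text{maximize}\quad   & x_0 + x_1 \\
\text{subject to}\quad & x_0 \le x_1
      && \text{($s_0$, choice to $s_1$ with probability 1)} \\
                       & x_0 \le \tfrac14 x_0 + \tfrac12
      && \text{($s_0$, probabilistic choice; the $s_3$ term contributes 0)} \\
                       & x_1 \le \tfrac1{10} x_0 + \tfrac12 x_1 + \tfrac25
      && \text{($s_1$, single choice)}
\end{align*}
```

The second constraint gives $x_0 \le 2/3$ and the third gives $x_1 \le \tfrac15 x_0 + \tfrac45 \le 14/15$. Both bounds are attained together and satisfy $x_0 \le x_1$, so the optimum is $x_0 = 2/3$, $x_1 = 14/15$ with objective value $8/5$, in agreement with the limit of value iteration.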
