SLIDE 1
Dirac-based Reduction Techniques for Quantitative Analysis of Discrete-time Markov Models
Mohammadsadegh Mohagheghi and Behrang Chaboki
Vali-e-Asr University of Rafsanjan
SLIDE 2
SLIDE 3
Probabilistic Model Checking
- A Markov Decision Process (MDP) is a tuple
M = (S, s0, Act, P, R) where:
- S is a set of states,
- s0 is the initial state,
- Act is a finite set of actions,
- P is a probabilistic transition function,
- R is a reward function.
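The tuple above can be sketched as plain Python data. This is a hypothetical toy example, not a model from the paper; the state names, actions, and numbers are invented for illustration.

```python
# A minimal MDP M = (S, s0, Act, P, R) as plain Python dicts (toy example).
# P[s][a] maps a successor state to its probability; R gives state rewards.
mdp = {
    "S": ["s0", "s1", "goal"],
    "init": "s0",                       # the initial state s0 of the tuple
    "Act": ["a", "b"],
    "P": {
        "s0": {"a": {"s1": 0.5, "s0": 0.5}, "b": {"goal": 0.1, "s0": 0.9}},
        "s1": {"a": {"goal": 1.0}},
        "goal": {"a": {"goal": 1.0}},
    },
    "R": {"s0": 1.0, "s1": 2.0, "goal": 0.0},
}

# Sanity check: every distribution P(s, a, .) sums to 1.
for s, choices in mdp["P"].items():
    for a, dist in choices.items():
        assert abs(sum(dist.values()) - 1.0) < 1e-9
```

Note that `P` is partial: `Act(s)` is simply the set of actions for which `P[s]` has an entry.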
SLIDE 4
Probabilistic Model Checking
- A policy is used to resolve the non-deterministic
choices of an MDP.
- A policy π: S → Act selects one action for
each state s.
- For an MDP M, every possible policy π induces
a quotient DTMC.
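The induced DTMC can be sketched by projecting the MDP's transition function onto the policy's choices. The toy MDP below is hypothetical, invented for illustration.

```python
# Resolving an MDP's nondeterminism with a memoryless policy pi: S -> Act:
# each state keeps only the distribution of its chosen action.
P = {
    "s0": {"a": {"s1": 0.5, "s0": 0.5}, "b": {"goal": 1.0}},
    "s1": {"a": {"goal": 1.0}},
    "goal": {"a": {"goal": 1.0}},
}

def induced_dtmc(P, policy):
    """DTMC transition function induced by a memoryless policy."""
    return {s: choices[policy[s]] for s, choices in P.items()}

pi = {"s0": "b", "s1": "a", "goal": "a"}
dtmc = induced_dtmc(P, pi)
```

With `pi` choosing `b` in `s0`, the induced DTMC moves from `s0` to `goal` with probability 1.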
SLIDE 5
Numeric Computations
- Reachability Probabilities: The (maximal or
minimal) probability of eventually reaching one of the goal states [for MDPs]
- Expected Rewards: The (maximal or minimal)
expectation of accumulated rewards until reaching a goal state
SLIDE 6
Numeric Computations
- Extremal Reachability Probabilities
& Expected Rewards:
  - Solving a linear program (exact solutions)
  - Using iterative methods (in practice)
SLIDE 7
Jacobi Iterative Method (DTMCs)
- Starting from an initial vector V of state
values, in each iteration update V_s for every state s ∈ S as:

  Reachability probabilities:
    V_s = Σ_{s' ∈ Post(s)} P(s, s') · V_{s'}

  Expected rewards:
    V_s = R(s) + Σ_{s' ∈ Post(s)} P(s, s') · V_{s'}
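The expected-reward update can be sketched on a hypothetical toy DTMC where `goal` is absorbing with reward 0:

```python
# Jacobi-style iteration for expected rewards on a toy DTMC (sketch).
# Update: V_s := R(s) + sum_{s' in Post(s)} P(s, s') * V_{s'},
# always computed from the previous iteration's vector.
P = {
    "s0": {"s1": 0.5, "s0": 0.5},
    "s1": {"goal": 1.0},
    "goal": {"goal": 1.0},
}
R = {"s0": 1.0, "s1": 2.0, "goal": 0.0}

V = {s: 0.0 for s in P}
for _ in range(200):
    V = {s: R[s] + sum(p * V[t] for t, p in P[s].items()) for s in P}
```

The exact fixpoint here is V(s1) = 2 and, from V(s0) = 1 + 0.5·V(s1) + 0.5·V(s0), V(s0) = 4; the iteration converges to it geometrically.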
SLIDE 8
Value Iteration (MDPs)
- Starting from an initial vector V of state
values, in each iteration update V_s for every state s ∈ S as:

  Reachability probabilities:
    V_s = max_{α ∈ Act(s)} Σ_{s' ∈ Post(s, α)} P(s, α, s') · V_{s'}

  Expected rewards:
    V_s = max_{α ∈ Act(s)} ( R(s) + Σ_{s' ∈ Post(s, α)} P(s, α, s') · V_{s'} )
SLIDE 9
Value Iteration (MDPs)
- Starting from an initial vector V of state
values, in each iteration update V_s for every state s ∈ S as:

  V_s = max_{α ∈ Act(s)} ( R(s) + Σ_{s' ∈ Post(s, α)} P(s, α, s') · V_{s'} )

- Termination criterion, for some tiny ε (e.g. 10^-6):

  max_{s ∈ S} | V_s - V_s^old | ≤ ε
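Value iteration with this stopping rule can be sketched as follows, here for maximal reachability probabilities on a hypothetical toy MDP (goal and sink values are kept fixed at 1 and 0):

```python
# Value iteration for maximal reachability probabilities (sketch).
# Update: V_s := max_{a in Act(s)} sum_{s'} P(s, a, s') * V_{s'};
# stop when max_s |V_new(s) - V_old(s)| <= eps.
P = {
    "s0": {"a": {"s1": 0.5, "lost": 0.5}, "b": {"goal": 0.3, "lost": 0.7}},
    "s1": {"a": {"goal": 1.0}},
}
eps = 1e-6

V = {s: 0.0 for s in P} | {"goal": 1.0, "lost": 0.0}
while True:
    V_new = dict(V)
    for s in P:
        V_new[s] = max(sum(p * V[t] for t, p in dist.items())
                       for dist in P[s].values())
    delta = max(abs(V_new[s] - V[s]) for s in V)
    V = V_new
    if delta <= eps:
        break
```

In this example action `a` in `s0` is optimal: it reaches the goal via `s1` with probability 0.5, beating `b`'s 0.3.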
SLIDE 10
Policy Iteration
Select a policy π
repeat
    Compute the values of the induced DTMC
    Update π
until no change in the policy
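The loop above can be sketched in Python, again for maximal reachability on a hypothetical toy MDP: evaluate the DTMC induced by the current policy, then greedily improve the choice in every state.

```python
# Policy iteration sketch: evaluate the induced DTMC, improve greedily.
P = {
    "s0": {"a": {"s1": 0.5, "lost": 0.5}, "b": {"goal": 0.3, "lost": 0.7}},
    "s1": {"a": {"goal": 1.0}},
}
V_fixed = {"goal": 1.0, "lost": 0.0}   # goal and sink values stay fixed

def evaluate(policy, iters=200):
    """Reachability values of the DTMC induced by `policy` (iterative)."""
    V = {s: 0.0 for s in P} | V_fixed
    for _ in range(iters):
        V = {s: sum(p * V[t] for t, p in P[s][policy[s]].items())
             for s in P} | V_fixed
    return V

policy = {"s0": "b", "s1": "a"}        # initial guess
while True:
    V = evaluate(policy)
    improved = {s: max(P[s],
                       key=lambda a: sum(p * V[t] for t, p in P[s][a].items()))
                for s in P}
    if improved == policy:
        break
    policy = improved
```

Here the evaluation step is itself iterative; the paper's point is that this inner DTMC computation is exactly where the Dirac-based reduction can be plugged in.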
SLIDE 11
Dirac-based Reduction Technique
- Idea: If P(s, s') = 1 in a DTMC, the reachability
probabilities of s and s' are equal.
SLIDE 12
Dirac-based Reduction Technique
- Dirac transitions are used to partition S into classes.
- The states of each class are connected by
Dirac transitions and have the same reachability probabilities.
- Apply the iterative computations on the reduced
DTMC.
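One way to sketch the classification is to follow Dirac transitions from each state to a representative, so that every state in a Dirac chain shares its representative's reachability probability. The toy DTMC below is hypothetical, invented for illustration.

```python
# Follow Dirac transitions P(s, s') = 1 to a representative state;
# the iterations then only need to run on the representatives.
P = {
    "s0": {"s1": 1.0},                  # Dirac transition
    "s1": {"s2": 1.0},                  # Dirac transition
    "s2": {"goal": 0.5, "lost": 0.5},
    "goal": {"goal": 1.0},
    "lost": {"lost": 1.0},
}

def representative(s):
    """Follow Dirac transitions until a non-Dirac (or absorbing) state."""
    seen = set()                         # guard against Dirac cycles
    while s not in seen:
        seen.add(s)
        dist = P[s]
        if len(dist) == 1 and s not in dist:   # Dirac, not a self-loop
            (s,) = dist
        else:
            break
    return s
```

Here `s0` and `s1` collapse onto `s2`, so the reduced DTMC has states `{s2, goal, lost}` only.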
SLIDE 13
Dirac-based Reduction Technique
- The DTMC reduction can also be used within policy
iteration.
- Time complexity: linear in the size of the DTMC.
SLIDE 14
Dirac-based Reduction Technique
- Expected rewards: If P(s, s') = 1, then
E_s = E_{s'} + R(s).
- State rewards should therefore be modified for the reduced
DTMC.
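The reward adjustment can be sketched on a hypothetical toy chain: with P(s, s') = 1, a chained state's value is its representative's value plus the rewards accumulated along the chain.

```python
# Recover E_s for states removed by the Dirac reduction:
# follow the Dirac chain, summing R along the way (toy example).
P = {"s0": {"s1": 1.0}, "s1": {"s2": 1.0},
     "s2": {"goal": 1.0}, "goal": {"goal": 1.0}}
R = {"s0": 1.0, "s1": 2.0, "s2": 3.0, "goal": 0.0}

def value_via_chain(s, E_rep):
    """E_s for a state reaching its representative through Dirac steps."""
    acc = 0.0
    while len(P[s]) == 1 and s not in P[s]:   # follow Dirac, skip self-loops
        acc += R[s]
        (s,) = P[s]
    return acc + E_rep[s]

E_rep = {"s2": 3.0, "goal": 0.0}   # values computed on the reduced DTMC
```

For `s0` this gives R(s0) + R(s1) + E_{s2} = 1 + 2 + 3 = 6, matching E_{s0} = R(s0) + E_{s1} applied along the chain.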
SLIDE 15
Experimental Results
- We implemented the Dirac-based methods in
PRISM.
- Available at:
https://github.com/sadeghrk/prism/tree/DiracBased-Improving
SLIDE 16
Experimental Results
SLIDE 17