Dirac-based Reduction Techniques for Quantitative Analysis of Discrete-time Markov Models – Mohmmadsadegh Mohagheghi and Behrang Chaboki – PowerPoint PPT Presentation

  1. Dirac-based Reduction Techniques for Quantitative Analysis of Discrete-time Markov Models Mohmmadsadegh Mohagheghi and Behrang Chaboki Vali-e-Asr University of Rafsanjan

  2. Probabilistic Model Checking

  3. Probabilistic Model Checking • A Markov Decision Process (MDP) is a tuple M = (S, s₀, Act, P, R) where: • S is a set of states, • s₀ is the initial state, • Act is a finite set of actions, • P is a probabilistic transition function, • R is a reward function.
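To make the tuple concrete, here is a minimal Python sketch of one possible encoding of such an MDP; the dictionary layout and all state, action, and reward values are illustrative assumptions, not the notation or implementation used in the talk.

```python
# A hypothetical, minimal encoding of an MDP M = (S, s0, Act, P, R) as plain
# Python dictionaries (illustration only).
mdp = {
    "states": ["s0", "s1", "goal"],   # S
    "init": "s0",                     # s0
    # Act(s) and P together: P[s][a] is the successor distribution of action a in s.
    "P": {
        "s0": {"a": {"s1": 0.5, "s0": 0.5}, "b": {"goal": 1.0}},
        "s1": {"a": {"goal": 1.0}},
        "goal": {},
    },
    "R": {"s0": 1.0, "s1": 2.0, "goal": 0.0},   # R(s)
}

# Sanity check: every action's successor distribution sums to 1.
for s, acts in mdp["P"].items():
    for a, dist in acts.items():
        assert abs(sum(dist.values()) - 1.0) < 1e-9
```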

  4. Probabilistic Model Checking • A policy is used to resolve the non-deterministic choices of an MDP. • A (memoryless) policy π: S → Act selects one action for each state s. • For an MDP M, every possible policy π induces a DTMC.
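As a rough illustration of how a policy resolves the non-determinism, the sketch below fixes one action per state and keeps only that action's distribution, which yields a DTMC. The P[s][a] dictionary encoding and the treatment of states without a choice are assumptions.

```python
# Resolving the non-determinism of an MDP with a memoryless policy pi: S -> Act.
# P[s][a] is the successor distribution of action a in state s (assumed encoding).
def induce_dtmc(P, policy):
    dtmc = {}
    for s, acts in P.items():
        if s in policy and policy[s] in acts:
            dtmc[s] = acts[policy[s]]   # keep only the chosen action's distribution
        else:
            dtmc[s] = {s: 1.0}          # states without a choice are made absorbing
    return dtmc

P = {"s0": {"a": {"s1": 0.5, "s0": 0.5}, "b": {"goal": 1.0}},
     "s1": {"a": {"goal": 1.0}},
     "goal": {}}
print(induce_dtmc(P, {"s0": "a", "s1": "a"}))
```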

  5. Numeric Computations • Reachability Probabilities: the (maximal or minimal) probability of eventually reaching one of the goal states [for MDPs] • Expected Rewards: the (maximal or minimal) expectation of the accumulated reward until reaching a goal state

  6. Numeric Computations • Extremal Reachability Probabilities & Expected Rewards can be computed by: • solving a linear program (exact solutions) • using iterative methods (in practice)

  7. Jacobi Iterative Method (DTMCs) • Starting from an initial vector V of state values, in each iteration and for every state s ∈ S, update V(s) as: • V(s) ← Σ_{s′ ∈ Post(s)} P(s, s′) · V(s′) (reachability probabilities) • V(s) ← R(s) + Σ_{s′ ∈ Post(s)} P(s, s′) · V(s′) (expected rewards)
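A minimal sketch of this Jacobi update for reachability probabilities in a DTMC, where every update of the new vector reads only the vector from the previous iteration. The dictionary encoding, the goal set, and the stopping threshold are assumptions, not the PRISM implementation.

```python
# Jacobi iteration for reachability probabilities in a DTMC.
# P[s] is the successor distribution of state s (assumed encoding).
def reachability_jacobi(P, goals, eps=1e-6, max_iter=100_000):
    V = {s: (1.0 if s in goals else 0.0) for s in P}
    for _ in range(max_iter):
        # Jacobi: every state reads the previous vector V, writes into V_new.
        V_new = {s: (1.0 if s in goals else
                     sum(p * V[t] for t, p in P[s].items()))
                 for s in P}
        if max(abs(V_new[s] - V[s]) for s in P) < eps:
            return V_new
        V = V_new
    return V

P = {"s0": {"s1": 0.5, "bad": 0.5},
     "s1": {"goal": 0.9, "s0": 0.1},
     "goal": {"goal": 1.0},
     "bad": {"bad": 1.0}}
print(reachability_jacobi(P, {"goal"}))   # V(s0) is roughly 0.4737
```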

  8. Value Iteration (MDPs) • Starting from an initial vector V of state values, in each iteration and for every state s ∈ S, update V(s) as: • V(s) ← max_{α ∈ Act(s)} Σ_{s′ ∈ Post(s, α)} P(s, α, s′) · V(s′) (reachability probabilities) • V(s) ← max_{α ∈ Act(s)} ( R(s) + Σ_{s′ ∈ Post(s, α)} P(s, α, s′) · V(s′) ) (expected rewards)

  9. Value Iteration (MDPs) • Starting from an initial vector V of state values, in each iteration and for every state s ∈ S, update V(s) as: • V(s) ← max_{α ∈ Act(s)} ( R(s) + Σ_{s′ ∈ Post(s, α)} P(s, α, s′) · V(s′) ) • Termination criterion: max_{s ∈ S} |V(s) − V_old(s)| < ε for some tiny ε (e.g., 10⁻⁶)
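A minimal sketch of this value-iteration loop for maximal reachability probabilities, including the termination test max_{s ∈ S} |V(s) − V_old(s)| < ε. The dictionary encoding and the toy model are assumptions made for illustration.

```python
# Value iteration for maximal reachability probabilities in an MDP.
# P[s][a] is the successor distribution of action a in state s (assumed encoding).
def value_iteration_max(P, goals, eps=1e-6):
    V = {s: (1.0 if s in goals else 0.0) for s in P}
    while True:
        V_old = V
        V = {}
        for s in P:
            if s in goals:
                V[s] = 1.0
            elif P[s]:
                # Bellman update: best action with respect to the previous vector.
                V[s] = max(sum(p * V_old[t] for t, p in dist.items())
                           for dist in P[s].values())
            else:
                V[s] = 0.0   # no enabled action and not a goal state
        # Termination criterion: the largest change has fallen below epsilon.
        if max(abs(V[s] - V_old[s]) for s in P) < eps:
            return V

P = {"s0": {"a": {"goal": 0.3, "s0": 0.7}, "b": {"bad": 1.0}},
     "goal": {}, "bad": {}}
print(value_iteration_max(P, {"goal"}))   # V(s0) converges towards 1
```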

  10. Policy Iteration • Select a policy π • repeat: • compute the values of the induced DTMC • update π • until there is no change in the policy
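A rough sketch of this loop for maximal reachability probabilities: evaluate the DTMC induced by the current policy, then greedily improve the policy, until it no longer changes. The encoding, the helper names, and the use of simple iteration for the evaluation step are assumptions, not the authors' implementation.

```python
# Hypothetical policy-iteration sketch for maximal reachability probabilities.
# P[s][a] is the successor distribution of action a in state s (assumed encoding).
def evaluate_dtmc(P, policy, goals, eps=1e-8):
    """Values of the DTMC induced by a fixed memoryless policy."""
    V = {s: (1.0 if s in goals else 0.0) for s in P}
    while True:
        V_new = {s: (1.0 if s in goals else
                     sum(p * V[t] for t, p in P[s][policy[s]].items()) if s in policy
                     else 0.0)
                 for s in P}
        if max(abs(V_new[s] - V[s]) for s in P) < eps:
            return V_new
        V = V_new

def policy_iteration(P, goals):
    # Start from an arbitrary policy: the first enabled action of each state.
    policy = {s: next(iter(P[s])) for s in P if P[s]}
    while True:
        V = evaluate_dtmc(P, policy, goals)       # values of the induced DTMC
        improved = {s: max(P[s], key=lambda a: sum(p * V[t] for t, p in P[s][a].items()))
                    for s in P if P[s]}
        if improved == policy:                    # no change in policies: done
            return policy, V
        policy = improved

P = {"s0": {"a": {"goal": 0.3, "s0": 0.7}, "b": {"bad": 1.0}},
     "goal": {}, "bad": {}}
print(policy_iteration(P, {"goal"}))
```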

  11. Dirac-based Reduction Technique • Idea: if P(s, s′) = 1 in a DTMC, then the reachability probabilities of s and s′ are equal.

  12. Dirac-based Reduction Technique • Dirac transitions are used to classify the states of S. • The states of each class are connected by Dirac transitions and have the same reachability probabilities. • The iterative computations are then applied to the reduced DTMC.
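A minimal sketch of the classification step under the assumed dictionary encoding: every state is mapped to the representative it reaches by following Dirac (probability-1) transitions, so all states of a class share one reachability value and the iterative computation only needs one entry per class. The names and the cycle handling are illustrative assumptions.

```python
# Group states by following Dirac transitions: if P(s, s') = 1, then s belongs
# to the class of s' and inherits its reachability probability.
def dirac_classes(P):
    rep = {}
    def representative(s, seen=()):
        if s in rep:
            return rep[s]
        succ = P.get(s, {})
        # A Dirac transition: exactly one successor, taken with probability 1.
        if len(succ) == 1 and abs(next(iter(succ.values())) - 1.0) < 1e-12:
            t = next(iter(succ))
            if t not in seen:                 # stop on Dirac cycles / self-loops
                rep[s] = representative(t, seen + (s,))
                return rep[s]
        rep[s] = s
        return s
    for s in P:
        representative(s)
    return rep

P = {"s0": {"s1": 1.0}, "s1": {"goal": 0.5, "s0": 0.5}, "goal": {"goal": 1.0}}
print(dirac_classes(P))   # s0 is merged into the class of s1; goal stays alone
```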

  13. Dirac-based Reduction Technique • The DTMC reduction can also be used within policy iteration. • Time complexity: linear in the size of the DTMC.

  14. Dirac-based Reduction Technique • Expected rewards: if P(s, s′) = 1, then V(s) = V(s′) + R(s). • State rewards therefore have to be modified in the reduced DTMC.
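A minimal sketch of this reward modification under the assumed encoding: when a state s with a Dirac transition to s′ is redirected past s′, the reward of s′ is folded into s, so the relation V(s) = R(s) + V(s′) still holds on the reduced DTMC. The function and the single elimination pass are illustrative assumptions.

```python
# Bypass the Dirac successor of each state and fold its reward in, so that
# expected accumulated rewards are preserved (a single elimination pass).
def collapse_dirac_for_rewards(P, R):
    P2, R2 = dict(P), dict(R)
    for s in list(P2):
        succ = P2[s]
        if len(succ) == 1 and abs(next(iter(succ.values())) - 1.0) < 1e-12:
            t = next(iter(succ))
            if t != s:
                P2[s] = P2[t]           # s now moves directly past its Dirac target
                R2[s] = R2[s] + R2[t]   # ...and accumulates the target's reward
    return P2, R2

# Toy chain: s0 --1.0--> s1, then s1 reaches goal with probability 0.5 per step.
P = {"s0": {"s1": 1.0}, "s1": {"goal": 0.5, "s1": 0.5}, "goal": {"goal": 1.0}}
R = {"s0": 2.0, "s1": 1.0, "goal": 0.0}
print(collapse_dirac_for_rewards(P, R))   # the expected reward from s0 stays 4.0
```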

  15. Experimental Results • We implemented the Dirac-based methods in PRISM. • Available at: https://github.com/sadeghrk/prism/tree/DiracBased-Improving

  16. Experimental Results

  17. Questions?
