PAC Statistical Model Checking for Markov Decision Processes and Stochastic Games


  1. PAC Statistical Model Checking for Markov Decision Processes and Stochastic Games
     Pranav Ashok, Jan Křetínský, Maximilian Weininger
     Technical University of Munich
     Highlights of Logic, Automata and Games, Warsaw, Poland, September 19, 2019
     Based on a paper presented at CAV 2019.

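  2. Stochastic Game Reachability
     [Figure: example stochastic game with actions a, b, c and transition probabilities 0.2 and 0.8]
     Objective: one player maximizes $\mathbb{P}(\mathsf{F}\,\mathit{target})$, the other player minimizes $\mathbb{P}(\mathsf{F}\,\mathit{target})$.
     Reachability in limited information stochastic games 2/6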

  5. This work: black-box (limited information) setting
     Unknown successor distribution
     Problem statement: compute
     $$V(s) = \max_{\sigma} \min_{\tau} \mathbb{P}_s^{\sigma,\tau}(\mathsf{F}\,\mathit{target}) = \min_{\tau} \max_{\sigma} \mathbb{P}_s^{\sigma,\tau}(\mathsf{F}\,\mathit{target})$$
     with guarantees.
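The max-min value in the problem statement can be illustrated on a small game whose probabilities are known. The following is a minimal sketch, assuming a hypothetical turn-based game (`GAME`, the states `s0`, `s1`, `target`, `sink`, and all probabilities are invented for illustration) and plain value iteration from below. In the black-box setting of the talk these probabilities are not available and would have to be estimated from simulation.

```python
from typing import Dict, Tuple

MAX, MIN = "max", "min"
TARGET, SINK = "target", "sink"

# Hypothetical game: state -> (owner, {action: {successor: probability}}).
Game = Dict[str, Tuple[str, Dict[str, Dict[str, float]]]]

GAME: Game = {
    "s0": (MAX, {"a": {"s1": 1.0},
                 "b": {TARGET: 0.5, SINK: 0.5}}),
    "s1": (MIN, {"c": {TARGET: 0.8, SINK: 0.2},
                 "d": {TARGET: 0.6, "s0": 0.4}}),
}


def value_iteration(game: Game, iterations: int = 100) -> Dict[str, float]:
    """Approximate the reachability value V(s) from below."""
    values = {s: 0.0 for s in game}
    values[TARGET] = 1.0   # the target is reached with probability 1
    values[SINK] = 0.0     # the sink can never reach the target
    for _ in range(iterations):
        new_values = dict(values)
        for state, (owner, actions) in game.items():
            # Expected value of each action under the current estimate.
            action_values = [
                sum(p * values[succ] for succ, p in dist.items())
                for dist in actions.values()
            ]
            # The maximizer picks the best action, the minimizer the worst.
            new_values[state] = max(action_values) if owner == MAX else min(action_values)
        values = new_values
    return values


if __name__ == "__main__":
    print(value_iteration(GAME))
```

In this invented game both `s0` and `s1` converge to 0.8: the minimizer's action `c` caps the value at 0.8, and the maximizer cannot do better than moving to `s1`.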

  8. Background
     ◮ Seminal paper on stochastic games [Condon 90]: quadratic programming, strategy iteration, value iteration
     ◮ Algorithms not directly applicable to general SG
     ◮ First practical algorithm for general SG giving guarantees [Kelmendi et al. 2018]
     ◮ This work: first algorithm for limited information SG

  13. The Algorithm
      Similar to Kelmendi et al. 2018:
      while U − L is large
        1. Simulate and estimate
        2. Back-propagate
      The how
      ◮ Simulation finds important parts of the state space
      ◮ Simulation computes Hoeffding confidence intervals: a ball around the estimate such that the real probability falls in the ball with high confidence
      ◮ Information is conservatively back-propagated
      ◮ Other tricks ensure fixpoint convergence
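The "simulate and estimate" and "back-propagate" steps can be sketched for a single state-action pair. This is a rough illustration under stated assumptions, not the formulas from the paper: it uses the standard two-sided Hoeffding radius, lowers each empirical transition probability by that radius, and assigns the leftover probability mass to value 0 for the lower bound and to value 1 for the upper bound. The function names, the `counts` dictionary, and the per-successor `delta` are hypothetical; a full algorithm would also have to split the error budget across all estimated transitions.

```python
import math
from typing import Dict, Tuple


def hoeffding_radius(n: int, delta: float) -> float:
    """Radius c with P(|estimate - true| >= c) <= delta after n samples."""
    return math.sqrt(math.log(2.0 / delta) / (2.0 * n))


def conservative_bounds(
    counts: Dict[str, int],      # successor -> number of times it was sampled
    lower: Dict[str, float],     # current lower bounds L(t) of the successors
    upper: Dict[str, float],     # current upper bounds U(t) of the successors
    delta: float,                # allowed error probability per successor (assumption)
) -> Tuple[float, float]:
    """Back-propagate successor bounds through one sampled state-action pair."""
    n = sum(counts.values())
    c = hoeffding_radius(n, delta)
    # Under-approximate every transition probability by the confidence radius.
    p_low = {t: max(0.0, k / n - c) for t, k in counts.items()}
    # Probability mass that the samples cannot attribute to any successor.
    rest = 1.0 - sum(p_low.values())
    # Leftover mass counts as value 0 for the lower bound and 1 for the upper bound.
    lo = sum(p * lower[t] for t, p in p_low.items())
    hi = sum(p * upper[t] for t, p in p_low.items()) + rest
    return lo, hi


if __name__ == "__main__":
    counts = {"target": 800, "sink": 200}
    bounds = {"target": 1.0, "sink": 0.0}
    print(conservative_bounds(counts, bounds, bounds, delta=0.01))
```

With more samples the radius shrinks, the leftover mass goes to zero, and the interval between the two bounds tightens around the true expected value.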

  14. Conclusion
      ◮ Algorithm for reachability in limited information MDP/SG: result ∈ [0.6 − ε, 0.6 + ε] with probability of going wrong $10^{-8}$
      ◮ Implemented and benchmarked in PRISM Model Checker
      ◮ First algorithm to do so for SG
      ◮ First practical algorithm for MDPs
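To give a feel for what a guarantee such as an error probability of $10^{-8}$ costs in samples, the sketch below inverts the Hoeffding bound. It assumes, purely for illustration, that the overall error budget is split evenly over a hypothetical number of estimated transitions by a union bound; the function name and all numbers in the usage example are invented and do not come from the talk's benchmarks.

```python
import math


def samples_needed(radius: float, delta_total: float, num_transitions: int) -> int:
    """Samples per transition so that every estimate is within `radius`
    of the truth, with overall failure probability at most `delta_total`.

    Assumes an even split of the error budget (union bound) and the
    two-sided Hoeffding inequality 2 * exp(-2 * n * radius**2) <= delta.
    """
    delta_each = delta_total / num_transitions
    return math.ceil(math.log(2.0 / delta_each) / (2.0 * radius ** 2))


if __name__ == "__main__":
    # Hypothetical setting: 100 transitions, radius 0.01, total error 1e-8.
    print(samples_needed(radius=0.01, delta_total=1e-8, num_transitions=100))
```

The required sample count grows only logarithmically as the error probability shrinks, but quadratically as the radius shrinks, so the precision ε dominates the cost far more than the confidence level does.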
