Expectations or Guarantees? I Want It All! A Crossroad between Games and MDPs

V. Bruyère (UMONS), E. Filiot (ULB), M. Randour (UMONS-ULB), J.-F. Raskin (ULB). Grenoble, 05.04.2014. SR 2014, 2nd International Workshop on Strategic Reasoning.


  1. Expectations or Guarantees? I Want It All! A Crossroad between Games and MDPs. V. Bruyère (UMONS), E. Filiot (ULB), M. Randour (UMONS-ULB), J.-F. Raskin (ULB). Grenoble, 05.04.2014. SR 2014, 2nd International Workshop on Strategic Reasoning.

  2. The talk in two slides (1/2)
     Sections: Context, BWC Synthesis, Mean-Payoff, Shortest Path, Conclusion
     Verification and synthesis:
     - a reactive system to control,
     - an interacting environment,
     - a specification to enforce.
     Focus on quantitative properties.
     Several ways to look at the interactions and, in particular, at the nature of the environment.
     Footer: Beyond Worst-Case Synthesis. Bruyère, Filiot, Randour, Raskin. 1 / 26

  4. The talk in two slides (2/2)
     - Games → antagonistic adversary → guarantees on the worst case
     - MDPs → stochastic adversary → optimize the expected value
     - Games ∧ MDPs: BWC synthesis → ensure both
     Studied value functions: Mean-Payoff, Shortest Path

  7. Advertisement
     Featured in STACS'14 [BFRR14].
     Full paper available on arXiv: abs/1309.5439

  8. Outline
     1 Context
     2 BWC Synthesis
     3 Mean-Payoff
     4 Shortest Path
     5 Conclusion

  10. Quantitative games on graphs
     Graph $\mathcal{G} = (S, E, w)$ with $w \colon E \to \mathbb{Z}$
     Two-player game $G = (\mathcal{G}, S_1, S_2)$
     - $\mathcal{P}_1$ states and $\mathcal{P}_2$ states are drawn with different shapes in the figure
     Plays have values
     - $f \colon \mathsf{Plays}(\mathcal{G}) \to \mathbb{R} \cup \{-\infty, \infty\}$
     Players follow strategies
     - $\lambda_i \colon \mathsf{Prefs}_i(G) \to \mathcal{D}(S)$
     - Finite memory ⇒ stochastic-output Moore machine $\mathcal{M}(\lambda_i) = (\mathsf{Mem}, m_0, \alpha_u, \alpha_n)$
     [Figure: example game graph with edge weights 2, 2, 5, −1, 7, −4]
     Then, $(2, 5, 2)^\omega$
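
The definitions above can be made concrete with a small sketch. The state names, the edge layout and the choice of mean-payoff as the value function $f$ are assumptions made for illustration; only the edge weights and the weight sequence $(2, 5, 2)^\omega$ appear on the slide. With pure memoryless strategies for both players, the outcome is an ultimately periodic play, and its mean-payoff is the average weight of the cycle.

```python
from fractions import Fraction

# Hypothetical arena: states "a" and "c" belong to P1, state "b" to P2.
owner = {"a": 1, "b": 2, "c": 1}

# w : E -> Z as a dict from edges to integer weights (layout is assumed).
w = {
    ("a", "b"): 2, ("b", "c"): 5, ("c", "a"): 2,
    ("a", "c"): -1, ("b", "a"): 7, ("c", "b"): -4,
}

def outcome(s_init, strat1, strat2):
    """Follow pure memoryless strategies until a state repeats.

    Returns (prefix, cycle): the play is prefix . cycle^omega.
    """
    first_visit = {}
    path = []
    s = s_init
    while s not in first_visit:
        first_visit[s] = len(path)
        path.append(s)
        s = strat1[s] if owner[s] == 1 else strat2[s]
    i = first_visit[s]
    return path[:i], path[i:]

def mean_payoff(cycle):
    """Mean-payoff of prefix . cycle^omega: the finite prefix is irrelevant."""
    edges = zip(cycle, cycle[1:] + cycle[:1])
    return Fraction(sum(w[e] for e in edges), len(cycle))

lam1 = {"a": "b", "c": "a"}   # strategy of P1
lam2 = {"b": "c"}             # strategy of P2
prefix, cycle = outcome("a", lam1, lam2)
value = mean_payoff(cycle)    # weights along the cycle: 2, 5, 2
```

Here the play visits a, b, c forever, so the weight sequence is $(2, 5, 2)^\omega$ and the mean-payoff is 3.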

  17. Markov decision processes
     MDP $P = (\mathcal{G}, S_1, S_\Delta, \Delta)$ with $\Delta \colon S_\Delta \to \mathcal{D}(S)$
     - $\mathcal{P}_1$ states and stochastic states are drawn with different shapes in the figure
     MDP = game + strategy of $\mathcal{P}_2$
     - $P = G[\lambda_2]$
     [Figure: the example graph with edge weights 2, 2, 5, −1, 7, −4 and probabilities 1/2, 1/2 on the edges leaving the stochastic state]
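
As a rough illustration of "MDP = game + strategy of $\mathcal{P}_2$", the sketch below fixes a randomized memoryless strategy $\lambda_2$ and turns the $\mathcal{P}_2$ states into stochastic states. The arena and the function name `fix_player2` are hypothetical; the 1/2, 1/2 split mirrors the probabilities shown on the slide. Strategies with memory would require taking the product with the Moore machine first, which is not shown here.

```python
from fractions import Fraction

# Hypothetical arena (same shape as a small three-state example).
owner = {"a": 1, "b": 2, "c": 1}
edges = {("a", "b"), ("b", "c"), ("c", "a"), ("a", "c"), ("b", "a"), ("c", "b")}

def fix_player2(edges, owner, lam2):
    """Build (S_1, S_Delta, Delta) for P = G[lam2], lam2 memoryless."""
    s1 = {s for s, o in owner.items() if o == 1}
    s_delta = {s for s, o in owner.items() if o == 2}
    delta = {}
    for s in s_delta:
        dist = lam2[s]
        if sum(dist.values()) != 1:
            raise ValueError(f"lam2[{s!r}] is not a probability distribution")
        if any((s, t) not in edges for t in dist):
            raise ValueError(f"lam2[{s!r}] uses an edge that is not in E")
        delta[s] = dict(dist)
    return s1, s_delta, delta

# P2 picks each successor of "b" with probability 1/2.
lam2 = {"b": {"a": Fraction(1, 2), "c": Fraction(1, 2)}}
s1, s_delta, delta = fix_player2(edges, owner, lam2)
```

The $\mathcal{P}_1$ states are untouched, so $\mathcal{P}_1$ still has choices to make in the resulting MDP.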

  18. Markov chains
     MC $M = (\mathcal{G}, \delta)$ with $\delta \colon S \to \mathcal{D}(S)$
     MC = MDP + strategy of $\mathcal{P}_1$ = game + both strategies
     - $M = P[\lambda_1] = G[\lambda_1, \lambda_2]$
     Event $\mathcal{A} \subseteq \mathsf{Plays}(\mathcal{G})$
     - probability $\mathbb{P}^{M}_{s_{\mathrm{init}}}(\mathcal{A})$
     Measurable $f \colon \mathsf{Plays}(\mathcal{G}) \to \mathbb{R} \cup \{-\infty, \infty\}$
     - expected value $\mathbb{E}^{M}_{s_{\mathrm{init}}}(f)$
     [Figure: the example graph with edge weights 2, 2, 5, −1, 7, −4 and probabilities 1/4, 3/4 and 1/2, 1/2 on the edges]
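
To make the two quantities above tangible, here is a minimal sketch on a hypothetical three-state Markov chain. The transition layout is assumed; only the probabilities 1/4, 3/4, 1/2, 1/2 come from the slide. The event is a cylinder (all plays starting with a given finite prefix), and the measurable function is the sum of the first $n$ edge weights, used here as a simple stand-in for $f$.

```python
from fractions import Fraction

# delta : S -> D(S), hypothetical transition structure.
delta = {
    "a": {"b": Fraction(1, 4), "c": Fraction(3, 4)},
    "b": {"a": Fraction(1, 2), "c": Fraction(1, 2)},
    "c": {"a": Fraction(1)},
}
w = {("a", "b"): 2, ("a", "c"): -1, ("b", "a"): 7, ("b", "c"): 5, ("c", "a"): 2}

def cylinder_probability(prefix):
    """Probability that a play starting in prefix[0] begins with prefix."""
    p = Fraction(1)
    for s, t in zip(prefix, prefix[1:]):
        p *= delta[s].get(t, Fraction(0))
    return p

def expected_truncated_sum(s_init, n):
    """Expected sum of the first n edge weights, by backward recursion."""
    value = {s: Fraction(0) for s in delta}
    for _ in range(n):
        value = {
            s: sum(p * (w[(s, t)] + value[t]) for t, p in delta[s].items())
            for s in delta
        }
    return value[s_init]

p_abc = cylinder_probability(["a", "b", "c"])   # 1/4 * 1/2
e2 = expected_truncated_sum("a", 2)
```

Exact rationals (`Fraction`) are used so that the results can be compared against hand computations without rounding.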

  20. Classical interpretations
     System trying to ensure a specification = $\mathcal{P}_1$
     - whatever the actions of its environment
     The environment can be seen as
     - antagonistic: two-player game, worst-case threshold problem for $\mu \in \mathbb{Q}$
       $\exists?\, \lambda_1 \in \Lambda_1,\ \forall \lambda_2 \in \Lambda_2,\ \forall \pi \in \mathsf{Outs}_G(s_{\mathrm{init}}, \lambda_1, \lambda_2),\ f(\pi) \geq \mu$
     - fully stochastic: MDP, expected value threshold problem for $\nu \in \mathbb{Q}$
       $\exists?\, \lambda_1 \in \Lambda_1,\ \mathbb{E}^{P[\lambda_1]}_{s_{\mathrm{init}}}(f) \geq \nu$
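
The contrast between the two threshold problems can be sketched on a toy arena. Everything here is an assumption made for illustration: the arena, the stochastic model `Delta`, and above all the value function, which is the sum of the first $n$ edge weights rather than the mean-payoff or shortest-path functions studied in the talk. For this finite-horizon stand-in, both problems reduce to a backward recursion that differs only in how environment states are evaluated: minimum for an antagonistic adversary, average for a stochastic one.

```python
from fractions import Fraction

owner = {"a": 1, "b": 2, "c": 1}          # "b" is the environment state
succ = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"]}
w = {
    ("a", "b"): 2, ("b", "c"): 5, ("c", "a"): 2,
    ("a", "c"): -1, ("b", "a"): 7, ("c", "b"): -4,
}
# Stochastic model of the environment (used only in the MDP reading).
Delta = {"b": {"a": Fraction(1, 2), "c": Fraction(1, 2)}}

def truncated_values(n, stochastic):
    """Best value P1 can secure for the sum of the first n weights.

    stochastic=False: worst case over environment choices (game).
    stochastic=True:  expectation under Delta (MDP).
    """
    value = {s: Fraction(0) for s in owner}
    for _ in range(n):
        new = {}
        for s in owner:
            options = {t: w[(s, t)] + value[t] for t in succ[s]}
            if owner[s] == 1:
                new[s] = max(options.values())
            elif stochastic:
                new[s] = sum(p * options[t] for t, p in Delta[s].items())
            else:
                new[s] = min(options.values())
        value = new
    return value

worst = truncated_values(2, stochastic=False)["a"]
expected = truncated_values(2, stochastic=True)["a"]
mu, nu = 7, 8
worst_case_ok = worst >= mu        # worst-case threshold problem
expectation_ok = expected >= nu    # expected value threshold problem
```

On this arena the worst-case value from state a is 7 and the expected value is 8, so a threshold of 8 is met in expectation but cannot be guaranteed against an antagonistic environment.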


  24. What if you want both?
     In practice, we want both
     1 nice expected performance in the everyday situation,
     2 strict (but relaxed) performance guarantees even in the event of very bad circumstances.
