Reconciling Rationality and Stochasticity: Rich Behavioral Models in Two-Player Games

  1. Reconciling Rationality and Stochasticity: Rich Behavioral Models in Two-Player Games. Mickael Randour, Computer Science Department, ULB - Université libre de Bruxelles, Belgium. July 24, 2016. GAMES 2016 - 5th World Congress of the Game Theory Society.

  2. The talk in one slide. Two traditional paradigms for agents in complex systems: fully rational (system = (multi-player) game) and fully stochastic (system = large stochastic process). In some fields (e.g., computer science), we need to go beyond both: rich behavioral models. Illustration: planning a journey in an uncertain environment.

  3. Advertisement. Full paper available on arXiv [Ran16a]: abs/1603.05072.

  4. Outline: 1. Rationality & stochasticity; 2. Planning a journey in an uncertain environment; 3. Synthesis of reliable reactive systems; 4. Conclusion.

  6. Rationality hypothesis. Rational agents [OR94] have clear personal objectives, are aware of their alternatives, form sound expectations about any unknowns, and choose their actions coherently (i.e., regarding some notion of optimality). ⇒ In the particular setting of zero-sum games: antagonistic interactions between the players. This is a well-founded abstraction in computer science, e.g., processes competing for access to a shared resource.

  7. Stochasticity. Stochastic agents are often a sufficient abstraction to reason about macroscopic properties of a complex system; agents follow stochastic models that can be based on experimental data (e.g., traffic in a town). Several models of interest: fully stochastic agents ⇒ Markov chain [Put94]; rational agent against stochastic agent ⇒ Markov decision process [Put94]; two rational agents + one stochastic agent ⇒ stochastic game or competitive MDP [FV97].

  8. Choosing the appropriate paradigm matters! As an agent having to choose a strategy, the assumptions made on the other agents are crucial. ⇒ They define our objective, hence the adequate strategy. ⇒ Illustration: planning a journey.

  10. Aim of this illustration. Flavor of different types of useful strategies in stochastic environments. Based on a series of papers, most in a computer science setting (more on that later) [Ran13, BFRR14b, BFRR14a, RRS15a, RRS15b, BCH+16]. Applications to the shortest path problem. [Figure: weighted graph on vertices A, B, C, D, E with edge weights 5, 30, 10, 20, 20, 10, 5.] Find a path of minimal length in a weighted graph (Dijkstra, Bellman-Ford, etc.) [CGR96].
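
As a reminder of the deterministic baseline mentioned on this slide, here is a minimal sketch of Dijkstra's algorithm. The slide's figure only survives as a list of vertex names and weights, so the edge placement in `GRAPH` below is hypothetical (it merely reuses the same vertices and the same seven weights), and the names `GRAPH` and `dijkstra` are illustrative.

```python
import heapq

# Hypothetical edge placement: the vertices and the seven weights come from
# the slide, but which weight sits on which edge is an assumption.
GRAPH = {
    "A": {"B": 5, "C": 10},
    "B": {"D": 30, "C": 20},
    "C": {"D": 20, "E": 5},
    "D": {"E": 10},
    "E": {},
}


def dijkstra(graph, source):
    """Return the length of a shortest path from `source` to each reachable vertex."""
    dist = {source: 0}
    queue = [(0, source)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry, a shorter path was already found
        for nxt, weight in graph[node].items():
            candidate = d + weight
            if candidate < dist.get(nxt, float("inf")):
                dist[nxt] = candidate
                heapq.heappush(queue, (candidate, nxt))
    return dist


DISTANCES = dijkstra(GRAPH, "A")
```

This only works because every edge is taken with certainty; the following slides replace that assumption with stochastic transitions.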

  11. Aim of this illustration (continued). What if the environment is uncertain? E.g., in case of heavy traffic, some roads may be crowded.

  12. Planning a journey in an uncertain environment. [Figure: MDP of the commute. From home: railway (2 min) leads to the train with probability 0.9 and to the waiting room with probability 0.1; car (1 min) leads to light, medium or heavy traffic with probabilities 0.2, 0.7 and 0.1; bike (45 min) leads to work. From the waiting room: wait (3 min) leads to the train with probability 0.9 and stays in the waiting room with probability 0.1; go back (2 min) returns home. From the train: relax (35 min) leads to work. From light, medium and heavy traffic: drive (20, 30 and 70 min respectively) leads to work.] Each action takes time; target = work. What kind of strategies are we looking for when the environment is stochastic (MDP)?

  13. Solution 1: minimize the expected time to work. [Figure: the commute MDP.] “Average” performance: meaningful when you journey often. Simple strategies suffice: no memory, no randomness. Taking the car is optimal: $\mathbb{E}^{\sigma}_{D}(\mathrm{TS}^{\mathrm{work}}) = 33$.
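
The value 33 on this slide can be checked with a short value-iteration sketch. The dictionary `MDP` is an assumed reading of the figure (state, action, duration in minutes, successor probabilities); `min_expected_time` and `q_value` are illustrative names, not part of the talk.

```python
# Commute MDP: state -> action -> (duration in minutes, {successor: probability}).
# The transition structure is an assumed reading of the slide's figure.
MDP = {
    "home": {
        "railway": (2, {"waiting room": 0.1, "train": 0.9}),
        "car": (1, {"light traffic": 0.2, "medium traffic": 0.7, "heavy traffic": 0.1}),
        "bike": (45, {"work": 1.0}),
    },
    "waiting room": {
        "wait": (3, {"train": 0.9, "waiting room": 0.1}),
        "go back": (2, {"home": 1.0}),
    },
    "train": {"relax": (35, {"work": 1.0})},
    "light traffic": {"drive": (20, {"work": 1.0})},
    "medium traffic": {"drive": (30, {"work": 1.0})},
    "heavy traffic": {"drive": (70, {"work": 1.0})},
}


def q_value(mdp, value, state, action):
    """Expected time to the target when playing `action` in `state`, then following `value`."""
    cost, successors = mdp[state][action]
    return cost + sum(p * value[nxt] for nxt, p in successors.items())


def min_expected_time(mdp, target="work", sweeps=200):
    """Value iteration for the minimal expected time to reach `target`."""
    value = {state: 0.0 for state in mdp}
    value[target] = 0.0  # the target is absorbing and costs nothing
    for _ in range(sweeps):
        for state, actions in mdp.items():
            value[state] = min(q_value(mdp, value, state, a) for a in actions)
    # A memoryless, deterministic strategy is read off greedily from the values.
    policy = {
        state: min(actions, key=lambda a: q_value(mdp, value, state, a))
        for state, actions in mdp.items()
    }
    return value, policy


VALUE, POLICY = min_expected_time(MDP)
```

Under these assumptions `POLICY["home"]` is the car and `VALUE["home"]` is 1 + 0.2·20 + 0.7·30 + 0.1·70 = 33, matching the slide; the greedy policy needs neither memory nor randomness.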

  14. Solution 2: traveling without taking too many risks. [Figure: the commute MDP.] Minimizing the expected time to destination makes sense if we travel often and it is not a problem to be late. With car, in 10% of the cases, the journey takes 71 minutes.

  15. Solution 2: traveling without taking too many risks (continued). Most bosses will not be happy if we are late too often... What if we are risk-averse and want to avoid that?

  16. Solution 2: maximize the probability to be on time. [Figure: the commute MDP.] Specification: reach work within 40 minutes with 0.95 probability. Sample strategy: take the train, which gives $\mathbb{P}^{\sigma}_{D}[\mathrm{TS}^{\mathrm{work}} \le 40] = 0.99$. Bad choices: car (0.9) and bike (0.0).
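
The three probabilities on this slide can be reproduced by evaluating each memoryless strategy against a time budget. This is a sketch under assumptions: `MDP` is an assumed reading of the figure, "take the train" is interpreted as always waiting in the waiting room, and `on_time_probability` is an illustrative name.

```python
from functools import lru_cache

# Commute MDP: state -> action -> (duration in minutes, {successor: probability}).
# The transition structure is an assumed reading of the slide's figure.
MDP = {
    "home": {
        "railway": (2, {"waiting room": 0.1, "train": 0.9}),
        "car": (1, {"light traffic": 0.2, "medium traffic": 0.7, "heavy traffic": 0.1}),
        "bike": (45, {"work": 1.0}),
    },
    "waiting room": {
        "wait": (3, {"train": 0.9, "waiting room": 0.1}),
        "go back": (2, {"home": 1.0}),
    },
    "train": {"relax": (35, {"work": 1.0})},
    "light traffic": {"drive": (20, {"work": 1.0})},
    "medium traffic": {"drive": (30, {"work": 1.0})},
    "heavy traffic": {"drive": (70, {"work": 1.0})},
}

# Choices outside home are the same for all three strategies (assumption:
# "take the train" means waiting for as long as needed).
ELSEWHERE = {
    "waiting room": "wait",
    "train": "relax",
    "light traffic": "drive",
    "medium traffic": "drive",
    "heavy traffic": "drive",
}


def on_time_probability(mdp, strategy, start="home", target="work", budget=40):
    """Probability that a memoryless `strategy` reaches `target` within `budget` minutes."""

    @lru_cache(maxsize=None)
    def prob(state, remaining):
        if state == target:
            return 1.0
        cost, successors = mdp[state][strategy[state]]
        if cost > remaining:
            return 0.0  # the next action alone would exceed the budget
        return sum(p * prob(nxt, remaining - cost) for nxt, p in successors.items())

    return prob(start, budget)


P_TRAIN = on_time_probability(MDP, {**ELSEWHERE, "home": "railway"})
P_CAR = on_time_probability(MDP, {**ELSEWHERE, "home": "car"})
P_BIKE = on_time_probability(MDP, {**ELSEWHERE, "home": "bike"})
```

The recursion terminates because every action takes at least one minute, so the remaining budget strictly decreases. Only the train strategy meets the 0.95 threshold of the specification.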

  17. Solution 3: strict worst-case guarantees. [Figure: the commute MDP.] Specification: guarantee that work is reached within 60 minutes (to avoid missing an important meeting). Sample strategy: bike, with worst-case reaching time = 45 minutes. Bad choices: train (wc = ∞) and car (wc = 71).

  18. Solution 3: strict worst-case guarantees (continued). Worst-case analysis → two-player zero-sum game against a rational antagonistic adversary (bad guy) → forget about probabilities and give the choice of transitions to the adversary.
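
The game view described here can be sketched by ignoring the probabilities and letting an adversary pick the successor after each action. The encoding `MDP` is an assumed reading of the figure, "train" is interpreted as always waiting in the waiting room, and `worst_case_time` is an illustrative name.

```python
import math

# Commute MDP: state -> action -> (duration in minutes, {successor: probability}).
# The transition structure is an assumed reading of the slide's figure.
MDP = {
    "home": {
        "railway": (2, {"waiting room": 0.1, "train": 0.9}),
        "car": (1, {"light traffic": 0.2, "medium traffic": 0.7, "heavy traffic": 0.1}),
        "bike": (45, {"work": 1.0}),
    },
    "waiting room": {
        "wait": (3, {"train": 0.9, "waiting room": 0.1}),
        "go back": (2, {"home": 1.0}),
    },
    "train": {"relax": (35, {"work": 1.0})},
    "light traffic": {"drive": (20, {"work": 1.0})},
    "medium traffic": {"drive": (30, {"work": 1.0})},
    "heavy traffic": {"drive": (70, {"work": 1.0})},
}

ELSEWHERE = {
    "waiting room": "wait",
    "train": "relax",
    "light traffic": "drive",
    "medium traffic": "drive",
    "heavy traffic": "drive",
}


def worst_case_time(mdp, strategy, state="home", target="work", on_path=frozenset()):
    """Longest time to `target` when an adversary resolves every transition."""
    if state == target:
        return 0
    if state in on_path:
        return math.inf  # the adversary can repeat this cycle forever
    cost, successors = mdp[state][strategy[state]]
    # Probabilities are dropped: any successor with positive probability is
    # a legal move for the adversary, who picks the worst one for us.
    return cost + max(
        worst_case_time(mdp, strategy, nxt, target, on_path | {state})
        for nxt, p in successors.items()
        if p > 0
    )


WC_TRAIN = worst_case_time(MDP, {**ELSEWHERE, "home": "railway"})
WC_CAR = worst_case_time(MDP, {**ELSEWHERE, "home": "car"})
WC_BIKE = worst_case_time(MDP, {**ELSEWHERE, "home": "bike"})
```

The cycle check is sound for memoryless strategies: revisiting a state means the same action will be played again, so the adversary can keep the play away from work indefinitely, which is why the train strategy has an infinite worst case.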

  19. Solution 4: minimize the expected time under strict worst-case guarantees. [Figure: the commute MDP.] Expected time: car → E = 33 but wc = 71 > 60. Worst-case: bike → wc = 45 < 60 but E = 45 >>> 33.
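
The tension between the two criteria can be made explicit with a few lines of arithmetic. The outcome lists below are read off the figure for the car and bike choices (action duration plus the following drive, if any); the names `OUTCOMES`, `summarize` and `WORST_CASE_BOUND` are illustrative.

```python
# Possible journeys as (probability, total minutes) for two choices at home,
# as read off the slide's figure.
OUTCOMES = {
    "car": [(0.2, 1 + 20), (0.7, 1 + 30), (0.1, 1 + 70)],
    "bike": [(1.0, 45)],
}
WORST_CASE_BOUND = 60  # minutes, from the worst-case specification


def summarize(outcomes):
    """Return (expected time, worst-case time) of a list of (probability, minutes)."""
    expected = sum(p * minutes for p, minutes in outcomes)
    worst = max(minutes for p, minutes in outcomes if p > 0)
    return expected, worst


SUMMARY = {choice: summarize(outcomes) for choice, outcomes in OUTCOMES.items()}
# Choices that respect the strict worst-case bound.
SAFE = [choice for choice, (_, worst) in SUMMARY.items() if worst <= WORST_CASE_BOUND]
# Choice with the best expected time, ignoring the bound.
BEST_EXPECTED = min(SUMMARY, key=lambda choice: SUMMARY[choice][0])
```

Here `BEST_EXPECTED` is the car while `SAFE` contains only the bike, so neither pure choice is satisfactory on both criteria at once.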
