Strategy recovery for stochastic mean payoff games
Marcello Mamino
TU Dresden
GRASTA ’15, October 19–23, 2015, Montreal

Outline
• Stochastic games
• What is the solution of a game?
• Complexity of stochastic games
• Strategy recovery
• Proof

Stochastic games
Definition (stochastic game)
• Two-player zero-sum complete information game.
• Finite directed graph $G$; a token rests on one of the vertices.
• Each vertex $v$ has an owner $o(v)$, which is one of the players.
• Each directed edge $x \xrightarrow{A,\,p} y$ has an action $A \in \{a, b, c, \dots\}$ and a probability $p \in \mathbb{Q} \cap [0, 1]$.
• Each action $A$ has a reward $r(A) \in \mathbb{Q}$.
• Play starts at some vertex $v_0$.
• Play never ends.

Stochastic games
A play of a stochastic game $G$ produces an infinite sequence of vertices and actions
$$v_0 \xrightarrow{A_0} v_1 \xrightarrow{A_1} v_2 \xrightarrow{A_2} \cdots$$
Definition
For $0 < \beta < 1$, the $\beta$-discounted payoff is
$$v_\beta(A_0, A_1, \dots) = (1 - \beta) \sum_{i=0}^{\infty} r(A_i)\, \beta^i$$
The mean payoff is
$$v_1(A_0, A_1, \dots) = \liminf_{n \to \infty} \frac{1}{n+1} \sum_{i=0}^{n} r(A_i)$$
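The two payoff definitions can be tried out on a finite prefix of a play. The sketch below is illustrative only: the function names and the representation of a play as a plain list of rewards r(A_0), r(A_1), ... are assumptions, not part of the talk. A finite prefix only approximates the infinite sum and the lim inf.

```python
from fractions import Fraction


def discounted_payoff(rewards, beta):
    """(1 - beta) * sum_i r(A_i) * beta**i, truncated to the given prefix."""
    return (1 - beta) * sum(r * beta**i for i, r in enumerate(rewards))


def running_means(rewards):
    """Averages (1/(n+1)) * sum_{i<=n} r(A_i) for n = 0, 1, ...

    The mean payoff of an infinite play is the lim inf of this sequence.
    """
    total = 0
    means = []
    for n, r in enumerate(rewards):
        total += r
        means.append(Fraction(total) / (n + 1))
    return means


# Hypothetical prefix of a play whose rewards alternate 1, 0, 1, 0.
prefix = [1, 0, 1, 0]
print(discounted_payoff(prefix, Fraction(1, 2)))
print(running_means(prefix))
```

For the infinite play that alternates rewards 1, 0, 1, 0, ... the discounted payoff is 1/(1 + β), which tends to the mean payoff 1/2 as β tends to 1.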

Stochastic games
• Introduced by Gillette in 1957, generalizing Shapley.
• Used to model reactive systems with randomized and adversarial behaviour (competitive Markov decision processes).
• Pseudo-polynomial time algorithms in some cases (discounted payoff, ergodic mean payoff if most states are deterministic).
• No polynomial time algorithm known.
Theorem (Gillette ’57, Liggett–Lippman ’69)
Stochastic discounted payoff and mean payoff games are determined. Moreover, the optimal strategies are positional.
Corollary
Stochastic discounted payoff and mean payoff games are in NP ∩ co-NP.

What is the solution of a game?
Definition
We call strategic solution a pair of optimal strategies.
Definition
We call quantitative solution a method to evaluate all possible positions in a game.
Observation
If the plays of a class of games have finite length, then, under reasonable hypotheses, the problems of finding a strategic solution and a quantitative solution are equivalent.

What is the solution of a game?
Observation
In general, finding a quantitative solution, given a strategic solution, is no harder than playing two strategies against each other (quantitative ≺ strategic).
Fact
There are imperfect information stochastic games whose ε-optimal strategies require exponential space to be represented in binary.
Question (strategy recovery)
Given the quantitative solution of a specific game, how hard is it to derive a strategic solution?
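As an illustration of "playing two strategies against each other", here is a minimal sketch for the β-discounted case. Everything about the encoding is an assumption made for the example: the game is a dictionary `game[x][A]` listing `(y, p)` pairs, `reward[A]` is r(A), and the pair of positional strategies is merged into one map from vertices to actions (each vertex has a single owner). The values are approximated by repeatedly applying the one-step equation val(x) = (1 − β) r(A) + β Σ p · val(y).

```python
def evaluate(game, reward, strategy, beta, sweeps=2000):
    """Approximate the expected beta-discounted payoff from every vertex
    when both players follow the positional choices in `strategy`."""
    val = {x: 0.0 for x in game}
    for _ in range(sweeps):
        # One synchronous sweep of the one-step equation; the error
        # shrinks by a factor of at most beta per sweep.
        val = {
            x: (1 - beta) * reward[strategy[x]]
            + beta * sum(p * val[y] for y, p in game[x][strategy[x]])
            for x in game
        }
    return val


# Hypothetical two-vertex game: from s, action a (reward 1) stays in s or
# moves to t with probability 1/2 each; t loops forever with reward 0.
game = {"s": {"a": [("s", 0.5), ("t", 0.5)]}, "t": {"b": [("t", 1.0)]}}
reward = {"a": 1.0, "b": 0.0}
values = evaluate(game, reward, {"s": "a", "t": "b"}, beta=0.5)
print(values)
```

Here the value at s solves val(s) = 1/2 + 1/4 · val(s), so it is about 2/3. A fixed number of sweeps is a simplification: for β very close to 1, an exact linear solve would be the more sensible choice.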

Complexity of stochastic games Theorem (Andersson–Miltersen ’09) The following are polynomial time Turing equivalent.

Strategy recovery
Observation
For discounted payoff stochastic games, strategy recovery can be performed in linear time.
Theorem (Andersson–Miltersen ’09)
Strategy recovery for terminal and simple stochastic games can be done in linear time.
Theorem
For mean payoff stochastic games, strategy recovery is as hard as it can possibly be, namely polynomial time Turing equivalent to finding a strategic solution.
Idea of the proof: reduce all stochastic mean payoff games to a subclass of games with the property that, for reasons of symmetry, all positions have expected value zero.
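One plausible reading of the linear-time observation for discounted games is the usual greedy argument: given the value of every vertex, each owner picks an action that attains the optimum of the one-step expression (1 − β) r(A) + β Σ p · val(y). The sketch below follows that reading. The dictionary encoding, the owner labels "Max" and "Min", and the function name are assumptions made for illustration, not taken from the talk.

```python
from fractions import Fraction


def recover_strategy(game, owner, reward, val, beta):
    """Pick, at every vertex, an action that is optimal against `val`.

    Each edge is inspected once, so the running time is linear in the
    size of the game (counting arithmetic operations as unit cost).
    """
    strategy = {}
    for x, actions in game.items():

        def one_step(a):
            # Expected payoff of playing a now and receiving val afterwards.
            return (1 - beta) * reward[a] + beta * sum(
                p * val[y] for y, p in actions[a]
            )

        pick = max if owner[x] == "Max" else min
        strategy[x] = pick(actions, key=one_step)
    return strategy


# Hypothetical game: at s the owner either stays (reward 1) or goes to t
# (reward 0); t loops forever with reward 0.
game = {
    "s": {"stay": [("s", Fraction(1))], "go": [("t", Fraction(1))]},
    "t": {"loop": [("t", Fraction(1))]},
}
reward = {"stay": 1, "go": 0, "loop": 0}
beta = Fraction(1, 2)
print(recover_strategy(game, {"s": "Max", "t": "Max"}, reward, {"s": 1, "t": 0}, beta))
```

The values passed in must be the true discounted values of the game. With incorrect values, the greedy choice carries no optimality guarantee.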

Steps of the proof
1. The mean payoff game on $G$ is strategically equivalent to the $\beta$-discounted game on $G$ for $\beta$ close enough to 1.
2. Fix a vertex $v$ of $G$ and replace all edges $x \xrightarrow{A,\,p} y$ with $x \xrightarrow{A,\,\beta p} y$ and $x \xrightarrow{A,\,(1-\beta) p} v$, yielding a new game $G_v$.
3. This immediately forces the expected mean payoff of all initial positions of $G_v$ to be the same.
4. Moreover, the expected mean payoff of $G_v$ coincides with the expected $\beta$-discounted value of $G$ starting at $v$.
5. Summarizing, if we can find optimal strategies for all $G_v$, then we can evaluate all $G_v$, hence we can compute the $\beta$-discounted value of all positions in $G$, and by a previous observation we can compute optimal $\beta$-discounted strategies, which coincide with optimal mean payoff strategies.
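The edge replacement in step 2 can be sketched in a few lines. The encoding is an assumption for illustration: `game[x][A]` is a list of `(y, p)` pairs, and the function name `restart_game` is hypothetical. Every edge with probability p is split into a β·p share that continues to y and a (1 − β)·p share that jumps back to the fixed vertex v; shares that land on the same target are merged.

```python
from fractions import Fraction


def restart_game(game, v, beta):
    """Build G_v from G: each edge x -(A, p)-> y becomes
    x -(A, beta*p)-> y together with x -(A, (1-beta)*p)-> v."""
    new_game = {}
    for x, actions in game.items():
        new_game[x] = {}
        for a, edges in actions.items():
            probs = {}
            for y, p in edges:
                probs[y] = probs.get(y, 0) + beta * p
                probs[v] = probs.get(v, 0) + (1 - beta) * p
            new_game[x][a] = list(probs.items())
    return new_game


# Hypothetical two-vertex game, restarted at s with beta = 3/4.
game = {
    "s": {"a": [("s", Fraction(1, 2)), ("t", Fraction(1, 2))]},
    "t": {"b": [("t", Fraction(1))]},
}
g_s = restart_game(game, "s", Fraction(3, 4))
print(g_s)
```

Rewards are attached to actions and so are left untouched. In the new game every step returns to v with probability 1 − β regardless of the action chosen.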

Steps of the proof
[Figure: the game G_v with the distinguished vertex v. Annotation: flip the signs of the rewards in this component.]

Thank you!
