
Finding Optimal Mixed Strategies to Commit to - PowerPoint PPT Presentation



  1. Finding Optimal Mixed Strategies to Commit to in Security Games
     Vincent Conitzer, Departments of Computer Science and Economics, Duke University
     Co-authors on various parts:
     @Duke: Dmytro (Dima) Korzhyk, Josh Letchford, Kamesh Munagala, Ron Parr
     @USC: Zhengyu Yin, Chris Kiekintveld, Milind Tambe
     @CMU: Tuomas Sandholm

  2. What is game theory?
     • Game theory studies settings where multiple parties (agents) each have
       – different preferences (utility functions),
       – different actions that they can take
     • Each agent’s utility (potentially) depends on all agents’ actions
       – What is optimal for one agent depends on what other agents do
         • Very circular!
     • Game theory studies how agents can rationally form beliefs over what other agents will do, and (hence) how agents should act
       – Useful for acting as well as predicting behavior of others

  3. Penalty kick example
     [Figure: the players' actions, annotated with probabilities (.7 and .3; 1; .6 and .4).]
     Is this a “rational” outcome? If not, what is?

  4. Rock-paper-scissors
     Row player (aka. player 1) chooses a row; column player (aka. player 2) chooses a column.

          0, 0     -1, 1     1, -1
          1, -1    0, 0      -1, 1
          -1, 1    1, -1     0, 0

     A row or column is called an action or (pure) strategy.
     Row player’s utility is always listed first, column player’s second.
     Zero-sum game: the utilities in each entry sum to 0 (or a constant).
     Three-player game would be a 3D table with 3 utilities per entry, etc.

  5. Matching pennies (~penalty kick)

               L        R
          L    1, -1    -1, 1
          R    -1, 1    1, -1

  6. “Chicken”
     • Two players drive cars towards each other
     • If one player goes straight, that player wins
     • If both go straight, they both die

               D        S
          D    0, 0     -1, 1
          S    1, -1    -5, -5

     Not zero-sum.

  7. How to play matching pennies

                     Them
                     L        R
          Us    L    1, -1    -1, 1
                R    -1, 1    1, -1

     • Assume opponent knows our strategy…
       – hopeless?
     • … but we can use randomization
     • If we play L 60%, R 40%...
     • … opponent will play R…
     • … we get .6*(-1) + .4*(1) = -.2
     • What’s optimal for us? What about rock-paper-scissors?

  8. Matching pennies with a sensitive target

                     Them
                     L        R
          Us    L    1, -1    -1, 1
                R    -2, 2    1, -1

     • If we play 50% L, 50% R, opponent will attack L
       – We get .5*(1) + .5*(-2) = -.5
     • What if we play 55% L, 45% R?
     • Opponent has choice between
       – L: gives them .55*(-1) + .45*(2) = .35
       – R: gives them .55*(1) + .45*(-1) = .1
     • We get -.35 > -.5

  9. Matching pennies with a sensitive target

                     Them
                     L        R
          Us    L    1, -1    -1, 1
                R    -2, 2    1, -1

     • What if we play 60% L, 40% R?
     • Opponent has choice between
       – L: gives them .6*(-1) + .4*(2) = .2
       – R: gives them .6*(1) + .4*(-1) = .2
     • We get -.2 either way
     • This is the maximin strategy
       – Maximizes our minimum utility
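
The arithmetic on slides 8 and 9 can be replayed with a few lines of Python. This is only a sketch: the names `OUR_UTILITY` and `guaranteed_utility` are illustrative and not from the slides, and the opponent is assumed to know our mix and pick the column that is worst for us.

```python
# Our utility in the sensitive-target game.
# Rows are our actions (L, R); columns are the opponent's actions (L, R).
# The game is zero-sum, so the opponent's utility is the negative of ours.
OUR_UTILITY = [[1, -1],
               [-2, 1]]


def guaranteed_utility(p_left):
    """Our expected utility when we play L with probability p_left and the
    opponent responds with the column that minimizes our expected utility."""
    mix = [p_left, 1 - p_left]
    per_column = [
        sum(mix[row] * OUR_UTILITY[row][col] for row in range(2))
        for col in range(2)
    ]
    return min(per_column)


# The three commitments discussed on the slides.
for p in (0.5, 0.55, 0.6):
    print(p, round(guaranteed_utility(p), 4))
```

Going past 60% L makes things worse again (for example, 70% L guarantees only -.4), which is why 60/40 is the maximin mix.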

  10. Let’s change roles

                     Them
                     L        R
          Us    L    1, -1    -1, 1
                R    -2, 2    1, -1

     • Suppose we know their strategy
     • If they play 50% L, 50% R,
       – We play L, we get .5*(1) + .5*(-1) = 0
     • If they play 40% L, 60% R,
       – If we play L, we get .4*(1) + .6*(-1) = -.2
       – If we play R, we get .4*(-2) + .6*(1) = -.2
     • This is the minimax strategy
     von Neumann’s minimax theorem [1927]: maximin value = minimax value (~LP duality)
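
As a check on the claim that the maximin and minimax values coincide here, the following sketch solves the 2x2 game exactly by making each side's opponent indifferent. It assumes the game has no pure-strategy saddle point (true for this matrix), and the function name `solve_2x2_zero_sum` is made up for illustration.

```python
from fractions import Fraction

# Our utility in the sensitive-target game (rows: our L, R; columns: their L, R).
A = [[Fraction(1), Fraction(-1)],
     [Fraction(-2), Fraction(1)]]


def solve_2x2_zero_sum(a):
    """Return (p, q, maximin_value, minimax_value) for a 2x2 zero-sum game.

    p is our probability of the first row, q is their probability of the
    first column. Assumption: no pure-strategy saddle point, so both
    optimal strategies make the other side indifferent.
    """
    denom = a[0][0] - a[0][1] - a[1][0] + a[1][1]
    p = (a[1][1] - a[1][0]) / denom   # makes the two columns equally good for them
    q = (a[1][1] - a[0][1]) / denom   # makes the two rows equally good for us
    maximin_value = p * a[0][0] + (1 - p) * a[1][0]  # our utility against column L
    minimax_value = q * a[0][0] + (1 - q) * a[0][1]  # our utility when we play row L
    return p, q, maximin_value, minimax_value


p, q, maximin_value, minimax_value = solve_2x2_zero_sum(A)
print(p, q, maximin_value, minimax_value)
```

The result reproduces the 60/40 mix for us, the 40/60 mix for them, and the common value -.2 from the slides.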

  11. Minimax theorem falls apart in nonzero-sum games

               D        S
          D    0, 0     -1, 1
          S    1, -1    -5, -5

     • Let’s say we play S
     • The most they could hurt us is by playing S as well
     • But that is not rational for them
     • If we can commit to S, they will play D
       – Commitment advantage

  12. Nash equilibrium [Nash 1950]
     • A profile (= strategy for each player) so that no player wants to deviate

               D        S
          D    0, 0     -1, 1
          S    1, -1    -5, -5

     • This game has another Nash equilibrium in mixed strategies
       – both play D with probability 80%
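
The 80% figure follows from an indifference condition: in a mixed equilibrium, each player's two actions must give the same expected utility against the other's mix. A small sketch of that calculation (the names `U` and `expected` are illustrative, not from the slides):

```python
from fractions import Fraction

# Row player's utility in Chicken, indexed by (own action, other's action).
# The game is symmetric, so the same table serves both players.
U = {("D", "D"): 0, ("D", "S"): -1, ("S", "D"): 1, ("S", "S"): -5}


def expected(action, q_d):
    """Expected utility of `action` when the other player plays D with prob q_d."""
    return q_d * U[(action, "D")] + (1 - q_d) * U[(action, "S")]


# Solve expected("D", q) == expected("S", q) for q; both sides are linear in q.
q = Fraction(
    U[("S", "S")] - U[("D", "S")],
    U[("D", "D")] - U[("D", "S")] - U[("S", "D")] + U[("S", "S")],
)
print(q, expected("D", q), expected("S", q))
```

At q = 4/5 both actions yield -1/5, so neither player gains by deviating from the 80/20 mix.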

  13. The presentation game

                                             Presenter
                                             Put effort into      Do not put effort into
                                             presentation (E)     presentation (NE)
          Audience   Pay attention (A)            2, 2                 -8, -7
                     Do not pay attention (NA)    0, -1                0, 0

     • Pure-strategy Nash equilibria: (A, E), (NA, NE)
     • Mixed-strategy Nash equilibrium: ((1/10 A, 9/10 NA), (4/5 E, 1/5 NE))
       – Utility 0 for audience, -7/10 for presenter
       – Can see that some equilibria are strictly better for both players than other equilibria, i.e. some equilibria Pareto-dominate other equilibria
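
The equilibria listed on this slide can be verified mechanically. The sketch below assumes the audience is the row player (first utility in each cell) and the presenter the column player; the helper names `pure_nash` and `expected` are illustrative only.

```python
from fractions import Fraction
from itertools import product

AUDIENCE = ["A", "NA"]
PRESENTER = ["E", "NE"]
# (audience utility, presenter utility) for each action pair.
PAYOFF = {
    ("A", "E"): (2, 2), ("A", "NE"): (-8, -7),
    ("NA", "E"): (0, -1), ("NA", "NE"): (0, 0),
}


def pure_nash():
    """Action pairs where neither player gains by deviating unilaterally."""
    result = []
    for a, e in product(AUDIENCE, PRESENTER):
        u_aud, u_pres = PAYOFF[(a, e)]
        aud_ok = all(PAYOFF[(a2, e)][0] <= u_aud for a2 in AUDIENCE)
        pres_ok = all(PAYOFF[(a, e2)][1] <= u_pres for e2 in PRESENTER)
        if aud_ok and pres_ok:
            result.append((a, e))
    return result


def expected(mix_aud, mix_pres):
    """Expected (audience, presenter) utilities under independent mixes."""
    u_aud = sum(mix_aud[a] * mix_pres[e] * PAYOFF[(a, e)][0]
                for a, e in product(AUDIENCE, PRESENTER))
    u_pres = sum(mix_aud[a] * mix_pres[e] * PAYOFF[(a, e)][1]
                 for a, e in product(AUDIENCE, PRESENTER))
    return u_aud, u_pres


mix_aud = {"A": Fraction(1, 10), "NA": Fraction(9, 10)}
mix_pres = {"E": Fraction(4, 5), "NE": Fraction(1, 5)}
print(pure_nash(), expected(mix_aud, mix_pres))
```

The mixed equilibrium's utilities (0, -7/10) are worse for both players than the (2, 2) of (A, E), which illustrates the Pareto-dominance remark.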

  14. Properties of Nash equilibrium in two-player games
     • In zero-sum games, same thing as maximin/minimax strategies
     • Any (finite) game has at least one Nash equilibrium [Nash 1950]
     • PPAD-complete to compute one Nash equilibrium [Daskalakis, Goldberg, Papadimitriou 2006; Chen & Deng, 2006]
     • NP-hard & inapproximable to compute the “best” Nash equilibrium [Gilboa & Zemel 1989; Conitzer & Sandholm 2008]

  15. Nash isn’t optimal if one player can commit

                  Left     Right
          Up      2, 1     4, 0
          Down    1, 0     3, 1

     Unique Nash equilibrium: (Up, Left), with utilities 2, 1
     • Suppose the game is played as follows:
       – Player 1 commits to playing one of the rows,
       – Player 2 observes the commitment and then chooses a column
     • Optimal strategy for player 1: commit to Down
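
To see why Down is the better commitment, one can compute the follower's best response to each row and compare what the leader ends up with. A minimal sketch, with illustrative names and no tie-breaking logic (this game has no ties among the follower's responses to a pure row):

```python
# Utilities from the slide: player 1 (leader) picks a row, player 2 a column.
LEADER = {"Up": {"Left": 2, "Right": 4}, "Down": {"Left": 1, "Right": 3}}
FOLLOWER = {"Up": {"Left": 1, "Right": 0}, "Down": {"Left": 0, "Right": 1}}


def follower_response(row):
    """Column that maximizes the follower's utility against a committed row."""
    return max(FOLLOWER[row], key=FOLLOWER[row].get)


def best_pure_commitment():
    """Row to commit to, the follower's reply, and the leader's utility."""
    outcomes = {row: LEADER[row][follower_response(row)] for row in LEADER}
    best_row = max(outcomes, key=outcomes.get)
    return best_row, follower_response(best_row), outcomes[best_row]


print(best_pure_commitment())
```

Committing to Up yields 2 (the follower answers Left), while committing to Down yields 3 (the follower answers Right).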

  16. Commitment as an extensive-form game
     • For the case of committing to a pure strategy:

          Player 1:   Up                       Down
          Player 2:   Left      Right          Left      Right
          Utilities:  2, 1      4, 0           1, 0      3, 1

  17. Commitment to mixed strategies

          Probability (two example commitments)
          .49    .5     2, 1     4, 0
          .51    .5     1, 0     3, 1

     • Assume follower breaks ties in leader’s favor
       – In generic games this is the unique SPNE outcome of the extensive-form game [von Stengel & Zamir 2010]
       – We will also refer to this as a Stackelberg strategy
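
The two example commitments on this slide (.49/.51 and .5/.5) can be compared directly. The sketch below uses exact fractions so that the tie at .5/.5 is detected reliably, and breaks follower ties in the leader's favor as the slide assumes; the function name `commit` is illustrative.

```python
from fractions import Fraction

# Rows: Up, Down. Columns: Left, Right.
LEADER = [[2, 4], [1, 3]]
FOLLOWER = [[1, 0], [0, 1]]


def commit(p_up):
    """Return (follower's column index, leader's expected utility) when the
    leader commits to Up with probability p_up. Follower ties are broken in
    the leader's favor."""
    mix = [p_up, 1 - p_up]
    follower_eu = [sum(mix[r] * FOLLOWER[r][c] for r in range(2)) for c in range(2)]
    leader_eu = [sum(mix[r] * LEADER[r][c] for r in range(2)) for c in range(2)]
    best = max(follower_eu)
    candidates = [c for c in range(2) if follower_eu[c] == best]
    col = max(candidates, key=lambda c: leader_eu[c])
    return col, leader_eu[col]


print(commit(Fraction(49, 100)))  # follower strictly prefers Right
print(commit(Fraction(1, 2)))     # follower indifferent, tie goes to Right
print(commit(Fraction(51, 100)))  # follower switches to Left
```

Pushing the probability of Up past one half flips the follower to Left and the leader's utility drops sharply, which is why the optimum sits exactly at the tie.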

  18. Commitment as an extensive-form game…
     • … for the case of committing to a mixed strategy:

          Player 1:   (1,0) (=Up)           (.5,.5)                (0,1) (=Down)      … …
          Player 2:   Left     Right        Left      Right        Left     Right
          Utilities:  2, 1     4, 0         1.5, .5   3.5, .5      1, 0     3, 1

     • Economist: Just an extensive-form game, nothing new here
     • Computer scientist: Infinite-size game! Representation matters

  19. Computing the optimal mixed strategy to commit to [Conitzer & Sandholm 2006, von Stengel & Zamir 2010]
     • Separate LP for every possible follower’s action t*
       [The LP itself was an image on the slide; only its annotations survive: leader utility (objective), distributional constraint, follower optimality.]
     • Choose t* for which the LP is feasible and has the highest objective. The leader plays the corresponding strategy <p_s>.

  20. Easy polynomial-time algorithm for two players [Conitzer & Sandholm 2006; von Stengel & Zamir 2010]
     • For every column t separately, we solve separately for the best mixed row strategy (defined by $p_s$) that induces player 2 to play t
     • maximize $\sum_s p_s u_1(s, t)$
     • subject to
         for any t’, $\sum_s p_s u_2(s, t) \ge \sum_s p_s u_2(s, t')$
         $\sum_s p_s = 1$
     • (May be infeasible)
     • Pick the t that is best for player 1
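
A hedged sketch of the one-LP-per-column idea, restricted to the special case where the leader has exactly two rows. With two rows the mixed strategy is a single number p (the probability of the first row), each follower-optimality constraint becomes a bound on p, and the linear objective is maximized at an endpoint of the feasible interval, so no LP solver is needed. The function name and the restriction to two rows are assumptions made for this illustration; a general implementation would hand each LP to a solver and would also need the constraint p_s ≥ 0, which is implicit here in 0 ≤ p ≤ 1.

```python
from fractions import Fraction


def optimal_commitment_two_rows(u1, u2):
    """Best mixed commitment for a leader with two rows.

    u1[s][t] and u2[s][t] are leader and follower utilities.
    Returns (leader value, probability of row 0, induced column) or None.
    """
    n_cols = len(u1[0])
    best = None
    for t in range(n_cols):                 # one small "LP" per follower column t
        lo, hi = Fraction(0), Fraction(1)   # feasible interval for p
        feasible = True
        for t_alt in range(n_cols):
            if t_alt == t:
                continue
            # Follower optimality: a*p + b >= 0 means t is at least as good as t_alt.
            b = Fraction(u2[1][t] - u2[1][t_alt])
            a = Fraction(u2[0][t] - u2[0][t_alt]) - b
            if a > 0:
                lo = max(lo, -b / a)
            elif a < 0:
                hi = min(hi, -b / a)
            elif b < 0:
                feasible = False
        if not feasible or lo > hi:
            continue                        # this column cannot be induced
        for p in (lo, hi):                  # linear objective: check both endpoints
            value = p * u1[0][t] + (1 - p) * u1[1][t]
            if best is None or value > best[0]:
                best = (value, p, t)
    return best


# The 2x2 commitment game from the earlier slides (rows Up/Down, columns Left/Right).
result = optimal_commitment_two_rows([[2, 4], [1, 3]], [[1, 0], [0, 1]])
print(result)
```

On the commitment game this returns the .5/.5 mix inducing Right with leader utility 3.5. On the zero-sum sensitive-target game it returns the 60/40 maximin mix with value -.2.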

  21. Visualization

               L       C       R
          U    0,1     1,0     0,0
          M    4,0     0,1     0,0
          D    0,0     1,0     1,1

     [Figure: simplex of leader mixed strategies with vertices (1,0,0) = U, (0,1,0) = M, (0,0,1) = D, divided into regions labeled L, C, and R.]

  22. Observations about commitment to a mixed strategy in a two-player game
     • Coincides with minimax strategies in zero-sum games
     • Leader’s payoff always at least as good as in any Nash equilibrium (see [von Stengel & Zamir 2010])
       – Can simply commit to the Nash equilibrium strategy
       – Follower breaks ties in your favor
       – Actually at least as good as any correlated equilibrium
       – Close relationship to LP for correlated equilibrium [Conitzer 2010 draft]
     • No equilibrium selection problem
     • Natural notion of approximation

  23. (a particular kind of) Bayesian games

          leader utilities     follower utilities (type 1)     follower utilities (type 2)
               2    4                   1    0                          1    0
               1    3                   0    1                          1    3
                                   probability .6                  probability .4

  24. Multiple types - visualization
     [Figure: simplices with vertices (1,0,0), (0,1,0), (0,0,1), divided into follower best-response regions L, C, R, and a combined simplex with regions such as (R,C).]

  25. LAX techniques [Paruchuri et al. 2008, Pita et al. 2009]
     • Uses Bayesian games framework
     • Mixed integer programming formulation for solving Bayesian games optimally
       – Much faster than converting game to normal form, solving that
