
Game Theory: Zero-Sum Games, The Minimax Theorem. CSC304 - Nisarg Shah



  1. CSC304 Lecture 6. Game Theory: Zero-Sum Games, The Minimax Theorem. CSC304 - Nisarg Shah

  2. Zero-Sum Games
     • Special case of games
       ➢ Total reward to all players is constant in every outcome
       ➢ Without loss of generality, sum of rewards = 0
       ➢ Inspired terms like “zero-sum thinking” and “zero-sum situation”
     • Focus on two-player zero-sum games (2p-zs)
       ➢ “The more I win, the more you lose”

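  3. Zero-Sum Games
     Zero-sum game: Rock-Paper-Scissors. P1 picks the row, P2 picks the column; each cell is (reward of P1, reward of P2).

     | P1 \ P2  | Rock    | Paper   | Scissors |
     |----------|---------|---------|----------|
     | Rock     | (0, 0)  | (-1, 1) | (1, -1)  |
     | Paper    | (1, -1) | (0, 0)  | (-1, 1)  |
     | Scissors | (-1, 1) | (1, -1) | (0, 0)   |

     Non-zero-sum game: Prisoner’s dilemma. Sam picks the row, John picks the column; each cell is (reward of Sam, reward of John).

     | Sam \ John  | Stay Silent | Betray   |
     |-------------|-------------|----------|
     | Stay Silent | (-1, -1)    | (-3, 0)  |
     | Betray      | (0, -3)     | (-2, -2) |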
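
The difference between the two tables on this slide can be checked mechanically. The following is a minimal sketch in Python; the function name `is_constant_sum` and the list-of-tuples layout are assumptions made for illustration, not something the slides define.

```python
# Payoff tables copied from the slide; each cell is
# (row player's reward, column player's reward).
RPS = [
    [(0, 0), (-1, 1), (1, -1)],
    [(1, -1), (0, 0), (-1, 1)],
    [(-1, 1), (1, -1), (0, 0)],
]
PRISONERS_DILEMMA = [
    [(-1, -1), (-3, 0)],
    [(0, -3), (-2, -2)],
]


def is_constant_sum(game):
    """Return True if the two rewards add up to the same total in every cell."""
    totals = {a + b for row in game for (a, b) in row}
    return len(totals) == 1


print(is_constant_sum(RPS))                # True: every cell sums to 0
print(is_constant_sum(PRISONERS_DILEMMA))  # False: cells sum to -2, -3, -3, -4
```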

  4. Zero-Sum Games
     • Why are they interesting?
       ➢ Many physical games we play are zero-sum: chess, tic-tac-toe, rock-paper-scissors, …
       ➢ (win, lose), (lose, win), (draw, draw)
       ➢ (1, -1), (-1, 1), (0, 0)
     • Why are they technically interesting?
       ➢ We’ll see.

  5. Zero-Sum Games
     • Reward for P2 = - Reward for P1
       ➢ Only need to write a single entry in each cell (say reward of P1)
       ➢ Hence, we get a matrix $A$
       ➢ P1 wants to maximize the value, P2 wants to minimize it

     | P1 \ P2  | Rock | Paper | Scissors |
     |----------|------|-------|----------|
     | Rock     | 0    | -1    | 1        |
     | Paper    | 1    | 0     | -1       |
     | Scissors | -1   | 1     | 0        |

  6. Rewards in Matrix Form
     • Say P1 uses mixed strategy $x_1 = (x_{1,1}, x_{1,2}, \dots)$
       ➢ What are the rewards of P1 for different actions chosen by P2?
     [Figure: reward matrix with rows weighted by $x_{1,1}, x_{1,2}, x_{1,3}, \dots$ and the column of P2’s action $s_j$ marked]

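  7. Rewards in Matrix Form
     • Say P1 uses mixed strategy $x_1 = (x_{1,1}, x_{1,2}, \dots)$
       ➢ What are the rewards for P1 corresponding to different possible actions of P2?
     [Figure: reward matrix with rows weighted by $x_{1,1}, x_{1,2}, x_{1,3}, \dots$ and the column of P2’s action $s_j$ marked]
       ❖ Reward of P1 when P2 chooses $s_j$ = $x_1^T A_j$, where $A_j$ is the $j$-th column of the reward matrix $A$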

  8. Rewards in Matrix Form
     • Reward for P1 when…
       ➢ P1 uses a mixed strategy $x_1$
       ➢ P2 uses a mixed strategy $x_2$

     $$\left[\, x_1^T A_1,\; x_1^T A_2,\; x_1^T A_3,\; \dots \right] \begin{bmatrix} x_{2,1} \\ x_{2,2} \\ x_{2,3} \\ \vdots \end{bmatrix} = x_1^T A\, x_2$$
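
As a small illustration of the bilinear form on this slide, the sketch below evaluates P1’s expected reward for the Rock-Paper-Scissors matrix. The helper name `expected_reward` and the example strategies are assumptions chosen for illustration; `Fraction` is used only to keep the arithmetic exact.

```python
from fractions import Fraction

# Reward matrix of P1 for Rock-Paper-Scissors (rows: P1, columns: P2).
A = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def expected_reward(x1, A, x2):
    """Expected reward of P1: sum over i, j of x1[i] * A[i][j] * x2[j]."""
    return sum(
        x1[i] * A[i][j] * x2[j]
        for i in range(len(x1))
        for j in range(len(x2))
    )


uniform = [Fraction(1, 3)] * 3
always_rock = [1, 0, 0]
# Hypothetical P2 strategy: Rock 1/4, Paper 1/4, Scissors 1/2.
mostly_scissors = [Fraction(1, 4), Fraction(1, 4), Fraction(1, 2)]

print(expected_reward(uniform, A, uniform))              # 0
print(expected_reward(always_rock, A, mostly_scissors))  # 1/4
```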

  9. How would the two players act in this zero-sum game? John von Neumann, 1928

  10. Maximin Strategy
      • Worst-case thinking by P1…
        ➢ Suppose I don’t know anything about what P2 would do.
        ➢ If I choose a mixed strategy $x_1$, in the worst case, P2 chooses an $x_2$ that minimizes my reward (i.e., maximizes his reward)
        ➢ Let me choose $x_1$ to maximize this “worst-case reward”

      $$V_1^* = \max_{x_1} \min_{x_2} \; x_1^T A\, x_2$$

  11. Maximin Strategy

      $$V_1^* = \max_{x_1} \min_{x_2} \; x_1^T A\, x_2$$

      • $V_1^*$: maximin value of P1
      • $x_1^*$ (maximizer): maximin strategy of P1
      • “By playing $x_1^*$, I guarantee myself at least $V_1^*$”
      • P2 can similarly think of her worst case.
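
The “guarantee” of a given mixed strategy can be computed directly. For a fixed $x_1$ the reward is linear in P2’s strategy, so the inner minimum is attained at one of P2’s pure actions, and it is enough to scan the columns of the reward matrix. The sketch below does this for Rock-Paper-Scissors; the function name `guarantee` and the example strategies are assumptions for illustration.

```python
from fractions import Fraction

# Reward matrix of P1 for Rock-Paper-Scissors (rows: P1, columns: P2).
A = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def guarantee(x1, A):
    """Worst-case reward of P1 when playing x1.

    Scans P2's pure actions j and returns the smallest x1^T A_j.
    """
    rows, cols = range(len(A)), range(len(A[0]))
    return min(sum(x1[i] * A[i][j] for i in rows) for j in cols)


always_rock = [1, 0, 0]
rock_or_paper = [Fraction(1, 2), Fraction(1, 2), 0]
uniform = [Fraction(1, 3)] * 3

print(guarantee(always_rock, A))    # -1   (P2 answers with Paper)
print(guarantee(rock_or_paper, A))  # -1/2 (P2 answers with Paper)
print(guarantee(uniform, A))        # 0
```

Among these three candidates the uniform strategy has the best worst case, which matches the intuition that randomizing evenly cannot be exploited in this game.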

  12. Maximin vs Minimax

      | Player 1 | Player 2 |
      |----------|----------|
      | Choose $x_1$ to maximize my reward in the worst case over P2’s strategy | Choose $x_2$ to minimize P1’s reward in the worst case over P1’s strategy |
      | $V_1^* = \max_{x_1} \min_{x_2} x_1^T A\, x_2$ | $V_2^* = \min_{x_2} \max_{x_1} x_1^T A\, x_2$ |
      | Optimizer: $x_1^*$ | Optimizer: $x_2^*$ |

      Question: Relation between $V_1^*$ and $V_2^*$?

  13. Maximin vs Minimax

      $$V_1^* = \max_{x_1} \min_{x_2} \; x_1^T A\, x_2 \quad (\text{maximizer } x_1^*), \qquad V_2^* = \min_{x_2} \max_{x_1} \; x_1^T A\, x_2 \quad (\text{minimizer } x_2^*)$$

      • What if (P1, P2) play $(x_1^*, x_2^*)$ simultaneously?
        ➢ P1’s guarantee: P1 must get reward at least $V_1^*$
        ➢ P2’s guarantee: P1 must get reward at most $V_2^*$
        ➢ $V_1^* \le V_2^*$

  14. Maximin vs Minimax

      $$V_1^* = \max_{x_1} \min_{x_2} \; x_1^T A\, x_2 \quad (\text{maximizer } x_1^*), \qquad V_2^* = \min_{x_2} \max_{x_1} \; x_1^T A\, x_2 \quad (\text{minimizer } x_2^*)$$

      • Another way to see this:

      $$V_1^* = \min_{x_2} \; (x_1^*)^T A\, x_2 \;\le\; (x_1^*)^T A\, x_2^* \;\le\; \max_{x_1} \; x_1^T A\, x_2^* = V_2^*$$

  15. The Minimax Theorem
      • John von Neumann [1928]
      • Theorem: For any 2p-zs game,
        ➢ $V_1^* = V_2^* = V^*$ (called the minimax value of the game)
        ➢ Set of Nash equilibria = $\{ (x_1^*, x_2^*) : x_1^* = \text{maximin for P1},\; x_2^* = \text{minimax for P2} \}$
      • Corollary: $x_1^*$ is best response to $x_2^*$ and vice-versa.
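
The corollary gives a simple test for whether a pair of mixed strategies is a Nash equilibrium of a 2p-zs game: neither player may gain from a unilateral deviation, and it is enough to check pure deviations because the reward is linear in each player’s own strategy. The sketch below applies that test to Rock-Paper-Scissors. The function name `is_nash_equilibrium` is an assumption for illustration, and the exact equality checks rely on exact arithmetic (integers or `Fraction`), not floats.

```python
from fractions import Fraction

# Reward matrix of P1 for Rock-Paper-Scissors (rows: P1, columns: P2).
A = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def is_nash_equilibrium(x1, x2, A):
    """Check that x1 and x2 are best responses to each other.

    P1 (maximizer) must not gain by switching to any pure row,
    P2 (minimizer) must not gain by switching to any pure column.
    """
    rows, cols = range(len(A)), range(len(A[0]))
    value = sum(x1[i] * A[i][j] * x2[j] for i in rows for j in cols)
    best_for_p1 = max(sum(A[i][j] * x2[j] for j in cols) for i in rows)
    best_for_p2 = min(sum(x1[i] * A[i][j] for i in rows) for j in cols)
    return best_for_p1 == value and best_for_p2 == value


uniform = [Fraction(1, 3)] * 3
always_rock = [1, 0, 0]

print(is_nash_equilibrium(uniform, uniform, A))          # True
print(is_nash_equilibrium(always_rock, always_rock, A))  # False: P1 prefers Paper
```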

  16. The Minimax Theorem
      • An alternative interpretation of maximin strategies
        ➢ $x_1^*$ is the strategy P1 would choose if she were to commit to her strategy first, and P2 were to choose her strategy after observing P1’s strategy
        ➢ Similarly, $x_2^*$ is the strategy P2 would choose if P2 were to commit first
        ➢ However, $x_1^*$ and $x_2^*$ are best responses to each other.
        ➢ Hence, in zero-sum games, it doesn’t matter which player commits first (or if both players commit together).

  17. The Minimax Theorem
      • John von Neumann [1928]
        “As far as I can see, there could be no theory of games … without that theorem … I thought there was nothing worth publishing until the Minimax Theorem was proved”

  18. Proof of the Minimax Theorem
      • Simpler proof using Nash’s theorem
        ➢ But the minimax theorem predates Nash’s theorem
      • Suppose $(\tilde{x}_1, \tilde{x}_2)$ is a NE
        ➢ Note: A Nash equilibrium exists due to Nash’s theorem
      • P1 gets value $\tilde{v} = \tilde{x}_1^T A\, \tilde{x}_2$
      • $\tilde{x}_1$ is best response for P1: $\tilde{v} = \max_{x_1} x_1^T A\, \tilde{x}_2$
      • $\tilde{x}_2$ is best response for P2: $\tilde{v} = \min_{x_2} \tilde{x}_1^T A\, x_2$

  19. Proof of the Minimax Theorem

      $$V_2^* = \min_{x_2} \max_{x_1} \; x_1^T A\, x_2 \;\le\; \max_{x_1} \; x_1^T A\, \tilde{x}_2 = \tilde{v} = \min_{x_2} \; \tilde{x}_1^T A\, x_2 \;\le\; \max_{x_1} \min_{x_2} \; x_1^T A\, x_2 = V_1^*$$

      • But we already saw $V_1^* \le V_2^*$
        ➢ $V_1^* = V_2^*$

  20. Proof of the Minimax Theorem

      $$V_2^* = \min_{x_2} \max_{x_1} \; x_1^T A\, x_2 \;=\; \max_{x_1} \; x_1^T A\, \tilde{x}_2 = \tilde{v} = \min_{x_2} \; \tilde{x}_1^T A\, x_2 \;=\; \max_{x_1} \min_{x_2} \; x_1^T A\, x_2 = V_1^*$$

      • When $(\tilde{x}_1, \tilde{x}_2)$ is a NE, $\tilde{x}_1$ and $\tilde{x}_2$ must be maximin and minimax strategies for P1 and P2, respectively.
      • The reverse direction is also easy to prove.

  21. Computing Nash Equilibria
      • Recall that in general games, computing a Nash equilibrium is hard even with two players.
      • For 2p-zs games, a Nash equilibrium can be computed in polynomial time.
        ➢ Polynomial in #actions of the two players: $m_1$ and $m_2$
        ➢ Exploits the fact that a Nash equilibrium is simply composed of maximin strategies, which can be computed using linear programming

  22. Computing Nash Equilibria

      Maximize $v$
      Subject to
      $$x_1^T A_j \ge v, \quad j \in \{1, \dots, m_2\}$$
      $$x_{1,1} + \dots + x_{1,m_1} = 1$$
      $$x_{1,i} \ge 0, \quad i \in \{1, \dots, m_1\}$$
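
For larger games this linear program would be handed to an LP solver. As a self-contained illustration that avoids any solver dependency, the sketch below handles only the special case where P1 has exactly two actions. Then the strategy is (p, 1 - p), each constraint is a line in p, and the worst-case reward is a concave piecewise-linear function of p, so its maximum lies at p = 0, p = 1, or a crossing of two constraint lines. This is a substitute technique for that special case, not a general LP method; the function name `solve_two_row_game` and the 2x3 example matrix are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def solve_two_row_game(A):
    """Maximin strategy and value for P1 when P1 has exactly two actions.

    P1 plays row 0 with probability p and row 1 with probability 1 - p.
    Against column j the reward is bottom[j] + p * (top[j] - bottom[j]).
    """
    top = [Fraction(t) for t in A[0]]
    bottom = [Fraction(b) for b in A[1]]
    slopes = [t - b for t, b in zip(top, bottom)]

    def worst_case(p):
        # Smallest reward over P2's pure actions for this p.
        return min(b + p * s for b, s in zip(bottom, slopes))

    # Candidate optima: the endpoints and every crossing of two lines in [0, 1].
    candidates = {Fraction(0), Fraction(1)}
    for j, k in combinations(range(len(top)), 2):
        if slopes[j] != slopes[k]:
            p = (bottom[k] - bottom[j]) / (slopes[j] - slopes[k])
            if 0 <= p <= 1:
                candidates.add(p)

    best_p = max(candidates, key=worst_case)
    return (best_p, 1 - best_p), worst_case(best_p)


# Hypothetical 2x3 reward matrix, made up for illustration.
EXAMPLE = [
    [3, 1, 4],
    [0, 2, 1],
]
strategy, value = solve_two_row_game(EXAMPLE)
print(strategy, value)  # (Fraction(1, 2), Fraction(1, 2)) 3/2
```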

  23. Limitation of Minimax Theorem
      • It only makes sense to play your maximin strategy $x_1^*$ if you know the other player is rational enough to choose the best response $x_2^*$
      • If the other player is choosing a suboptimal strategy $x_2$, the best response to $x_2$ might be different
      • This is what computer programs playing Chess exploit when they play against human players
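
A small Rock-Paper-Scissors example makes this limitation concrete. The sketch below assumes a hypothetical P2 who plays Rock too often (1/2, 1/4, 1/4) and compares P1’s best response with the uniform strategy; the function name `best_response` and the biased strategy are assumptions for illustration.

```python
from fractions import Fraction

ACTIONS = ["Rock", "Paper", "Scissors"]

# Reward matrix of P1 for Rock-Paper-Scissors (rows: P1, columns: P2).
A = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def best_response(x2, A):
    """Return P1's best pure action against x2 and its expected reward."""
    rewards = [sum(A[i][j] * x2[j] for j in range(len(x2))) for i in range(len(A))]
    best = max(range(len(A)), key=lambda i: rewards[i])
    return ACTIONS[best], rewards[best]


# Hypothetical suboptimal P2: Rock 1/2, Paper 1/4, Scissors 1/4.
biased = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]

action, reward = best_response(biased, A)
print(action, reward)  # Paper 1/4

# The uniform strategy still earns only 0 against this opponent.
uniform = [Fraction(1, 3)] * 3
uniform_reward = sum(
    uniform[i] * A[i][j] * biased[j] for i in range(3) for j in range(3)
)
print(uniform_reward)  # 0
```

Against this opponent, always playing Paper earns 1/4 per round on average, while the uniform strategy only secures 0.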

  24. Minimax Theorem in Real Life?

      | Kicker \ Goalie | L    | R    |
      |-----------------|------|------|
      | L               | 0.58 | 0.95 |
      | R               | 0.93 | 0.70 |

      Kicker: Maximize $v$ subject to
      $$0.58\,p_L + 0.93\,p_R \ge v$$
      $$0.95\,p_L + 0.70\,p_R \ge v$$
      $$p_L + p_R = 1$$
      $$p_L \ge 0,\; p_R \ge 0$$

      Goalie: Minimize $v$ subject to
      $$0.58\,q_L + 0.95\,q_R \le v$$
      $$0.93\,q_L + 0.70\,q_R \le v$$
      $$q_L + q_R = 1$$
      $$q_L \ge 0,\; q_R \ge 0$$

  25. Minimax Theorem in Real Life?

      | Kicker \ Goalie | L    | R    |
      |-----------------|------|------|
      | L               | 0.58 | 0.95 |
      | R               | 0.93 | 0.70 |

      |         | Kicker                     | Goalie                       |
      |---------|----------------------------|------------------------------|
      | Maximin | $p_L = 0.38$, $p_R = 0.62$ | $q_L = 0.42$, $q_R = 0.58$   |
      | Reality | $p_L = 0.40$, $p_R = 0.60$ | $q_L = 0.423$, $q_R = 0.577$ |

      Some evidence that people may play minimax strategies.
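
The maximin numbers on this slide can be reproduced by hand. The sketch below assumes the equilibrium is fully mixed, so that each player randomizes to make the opponent indifferent between L and R; this indifference shortcut is an assumption that happens to hold for this 2x2 matrix and is not a general method. The variable names are chosen for illustration.

```python
from fractions import Fraction

# Reward matrix from the slide (rows: kicker L/R, columns: goalie L/R).
a, b = Fraction("0.58"), Fraction("0.95")
c, d = Fraction("0.93"), Fraction("0.70")

# Kicker plays L with probability p_L so that both goalie actions
# give the same reward:  p*a + (1-p)*c = p*b + (1-p)*d.
# Goalie plays L with probability q_L so that both kicker actions
# give the same reward:  q*a + (1-q)*b = q*c + (1-q)*d.
denom = a - b - c + d
p_L = (d - c) / denom
q_L = (d - b) / denom
value = (a * d - b * c) / denom

print(round(float(p_L), 2), round(float(1 - p_L), 2))  # 0.38 0.62
print(round(float(q_L), 2), round(float(1 - q_L), 2))  # 0.42 0.58
print(value)                                           # 191/240
```

The value 191/240 (about 0.796) is the common optimum of the kicker’s and the goalie’s programs for this matrix, in line with the minimax theorem.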

  26. Minimax Theorem
      • We proved it using Nash’s theorem
        ➢ Cheating. Typically, Nash’s theorem (for the special case of 2p-zs games) is proved using the minimax theorem.
      • Useful for proving Yao’s principle, which provides lower bounds for randomized algorithms
      • Equivalent to linear programming duality
      [Photos: John von Neumann, George Dantzig]
