CSC304 Lecture 6
Game Theory: Zero-Sum Games, The Minimax Theorem
CSC304 - Nisarg Shah 1
Zero-Sum Games
➢ Special case of games
➢ Total reward to all players is constant in every outcome
➢ Without loss of generality, sum of rewards = 0
➢ Inspired terms like “zero-sum thinking” and “zero-sum situation”
➢ “The more I win, the more you lose”
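Prisoner's dilemma (not zero-sum: the rewards do not sum to the same total in every cell)

Sam \ John     Stay Silent    Betray
Stay Silent    (-1, -1)       (-3, 0)
Betray         (0, -3)        (-2, -2)

Rock-paper-scissor (zero-sum)

P1 \ P2    Rock       Paper      Scissor
Rock       (0, 0)     (-1, 1)    (1, -1)
Paper      (1, -1)    (0, 0)     (-1, 1)
Scissor    (-1, 1)    (1, -1)    (0, 0)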
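The two payoff tables can be told apart mechanically: a game is constant-sum when the two rewards add up to the same total in every cell. The following is a minimal sketch, assuming each game is stored as a nested list of (P1 reward, P2 reward) pairs; the names `reward_sums` and `is_constant_sum` are illustrative and not part of the lecture.

```python
def reward_sums(game):
    """Return the set of distinct totals (P1 reward + P2 reward) over all cells."""
    return {r1 + r2 for row in game for (r1, r2) in row}


def is_constant_sum(game):
    """A game is constant-sum if every cell has the same total reward."""
    return len(reward_sums(game)) == 1


# Rows are the first player's actions, columns the second player's actions.
prisoners_dilemma = [
    [(-1, -1), (-3, 0)],
    [(0, -3), (-2, -2)],
]
rock_paper_scissor = [
    [(0, 0), (-1, 1), (1, -1)],
    [(1, -1), (0, 0), (-1, 1)],
    [(-1, 1), (1, -1), (0, 0)],
]

print(is_constant_sum(prisoners_dilemma))   # False
print(is_constant_sum(rock_paper_scissor))  # True
```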
➢ Many physical games we play are zero-sum: chess, tic-tac-toe, rock-paper-scissor, …
➢ (win, lose), (lose, win), (draw, draw)
➢ (1, -1), (-1, 1), (0, 0)
➢ We’ll see.
➢ Only need to write a single entry in each cell (say, the reward of P1)
➢ Hence, we get a matrix $A$
➢ P1 wants to maximize the value, P2 wants to minimize it

P1 \ P2    Rock    Paper    Scissor
Rock        0      -1        1
Paper       1       0       -1
Scissor    -1       1        0
➢ Suppose P1 uses a mixed strategy $x_1 = (x_{1,1}, x_{1,2}, x_{1,3}, \ldots)$
➢ What are the rewards for P1 corresponding to different possible actions $s_j$ of P2?
❖ Reward of P1 when P2 chooses $s_j$ = $(x_1^T A)_j$, the $j$-th entry of the row vector $x_1^T A$
➢ P1 uses a mixed strategy $x_1$
➢ P2 uses a mixed strategy $x_2$
➢ Expected reward of P1:
$\left[ (x_1^T A)_1, (x_1^T A)_2, (x_1^T A)_3, \ldots \right] \cdot \left[ x_{2,1}, x_{2,2}, x_{2,3}, \ldots \right]^T = x_1^T A x_2$
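The product $x_1^T A x_2$ can be evaluated in the same two steps as on the slide: first the reward of P1 against each column, then the average of those rewards under P2's mixed strategy. This is a small sketch in plain Python, assuming $A$ is a list of rows and strategies are lists of probabilities; `expected_reward` is an illustrative name.

```python
def expected_reward(A, x1, x2):
    """Expected reward of P1, x1^T A x2, when P1 mixes over rows and P2 over columns."""
    n_rows, n_cols = len(A), len(A[0])
    # (x1^T A)_j: reward of P1 when P2 plays column j
    per_column = [sum(x1[i] * A[i][j] for i in range(n_rows)) for j in range(n_cols)]
    # Average the per-column rewards under P2's mixed strategy
    return sum(per_column[j] * x2[j] for j in range(n_cols))


# Rock-paper-scissor, entries are rewards of P1 (row player)
A = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]

uniform = [1 / 3, 1 / 3, 1 / 3]
print(expected_reward(A, uniform, uniform))         # 0, up to floating-point rounding
print(expected_reward(A, [1, 0, 0], [0, 0, 1]))     # 1 (Rock beats Scissor)
```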
➢ Suppose I don’t know anything about what P2 would do.
➢ If I choose a mixed strategy $x_1$, in the worst case, P2 chooses an $x_2$ that minimizes my reward (i.e., maximizes his reward)
➢ Let me choose $x_1$ to maximize this “worst-case reward”
$V_1^* = \max_{x_1} \min_{x_2} x_1^T A x_2$
$V_1^* = \max_{x_1} \min_{x_2} x_1^T A x_2$
➢ $V_1^*$: maximin value of P1
➢ $x_1^*$ (maximizer): maximin strategy of P1
➢ “By playing $x_1^*$, I guarantee myself at least $V_1^*$”
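The inner minimum in the maximin definition is easy to compute for a fixed $x_1$: a linear function over the probability simplex attains its minimum at a pure strategy, so it is enough to look at each column of $A$. The sketch below is an illustration under that observation; `worst_case_reward` is a hypothetical helper name.

```python
def worst_case_reward(A, x1):
    """Reward that P1 is guaranteed by playing x1: the minimum over columns j of (x1^T A)_j.

    P2's best reply to a fixed x1 can always be taken to be a pure strategy,
    so checking every column is sufficient.
    """
    n_rows, n_cols = len(A), len(A[0])
    return min(sum(x1[i] * A[i][j] for i in range(n_rows)) for j in range(n_cols))


# Rock-paper-scissor, entries are rewards of P1 (row player)
A = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]

print(worst_case_reward(A, [1, 0, 0]))              # -1: always playing Rock loses to Paper
print(worst_case_reward(A, [1 / 3, 1 / 3, 1 / 3]))  # 0, up to floating-point rounding
```

Mixing uniformly raises the guarantee from -1 to 0, which is the sense in which a mixed strategy can be a better maximin strategy than any pure one.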
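➢ P1’s maximin value: $V_1^* = \max_{x_1} \min_{x_2} x_1^T A x_2$, attained by the maximin strategy $x_1^*$
➢ P2’s minimax value: $V_2^* = \min_{x_2} \max_{x_1} x_1^T A x_2$, attained by the minimax strategy $x_2^*$
➢ What is the relation between $V_1^*$ and $V_2^*$?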
$V_1^* = \max_{x_1} \min_{x_2} x_1^T A x_2$, with maximizer $x_1^*$
$V_2^* = \min_{x_2} \max_{x_1} x_1^T A x_2$, with minimizer $x_2^*$
➢ What if (P1, P2) play $(x_1^*, x_2^*)$ simultaneously?
➢ P1’s guarantee: P1 must get reward at least $V_1^*$
➢ P2’s guarantee: P1 must get reward at most $V_2^*$
➢ $V_1^* \le V_2^*$
$V_1^* = \max_{x_1} \min_{x_2} x_1^T A x_2$, with maximizer $x_1^*$
$V_2^* = \min_{x_2} \max_{x_1} x_1^T A x_2$, with minimizer $x_2^*$
$V_1^* = \min_{x_2} (x_1^*)^T A x_2 \le (x_1^*)^T A x_2^* \le \max_{x_1} x_1^T A x_2^* = V_2^*$
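The same max-min versus min-max inequality holds when both players are restricted to pure strategies, and there it can be strict. The sketch below computes those two pure-strategy quantities; note that they are not $V_1^*$ and $V_2^*$ themselves, which range over mixed strategies. The function names are illustrative.

```python
def pure_maximin(A):
    """Best reward P1 can guarantee using a single row: max over rows of the row minimum."""
    return max(min(row) for row in A)


def pure_minimax(A):
    """Lowest cap P2 can enforce using a single column: min over columns of the column maximum."""
    n_rows, n_cols = len(A), len(A[0])
    return min(max(A[i][j] for i in range(n_rows)) for j in range(n_cols))


# Rock-paper-scissor, entries are rewards of P1 (row player)
rps = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]

print(pure_maximin(rps), pure_minimax(rps))  # -1 1
```

With pure strategies only, P1 can guarantee no more than -1 while P2 can cap P1 at no less than 1, so the gap is strict in this game.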
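The minimax theorem: in any two-player zero-sum game,
➢ $V_1^* = V_2^* = V^*$ (called the minimax value of the game)
➢ Set of Nash equilibria = $\{ (x_1^*, x_2^*) : x_1^* \text{ is a maximin strategy for P1}, \; x_2^* \text{ is a minimax strategy for P2} \}$
➢ $x_1^*$ is a best response to $x_2^*$ and vice versa.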
➢ $x_1^*$ is the strategy P1 would choose if she were to commit to her strategy first, and P2 were to choose her strategy after observing P1’s strategy
➢ Similarly, $x_2^*$ is the strategy P2 would choose if P2 were to commit first
➢ However, $x_1^*$ and $x_2^*$ are best responses to each other.
➢ Hence, in zero-sum games, it doesn’t matter which player commits first (or if both players commit together).
➢ The minimax theorem predates Nash’s theorem
➢ Note: a Nash equilibrium exists by Nash’s theorem
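➢ Already shown: $V_1^* \le V_2^*$
➢ Want to show: $V_1^* = V_2^*$
➢ Let $(\tilde{x}_1, \tilde{x}_2)$ be a Nash equilibrium and let $v$ be the reward of P1 in it. Each strategy is a best response to the other, so
$V_2^* = \min_{x_2} \max_{x_1} x_1^T A x_2 \le \max_{x_1} x_1^T A \tilde{x}_2 = v = \min_{x_2} \tilde{x}_1^T A x_2 \le \max_{x_1} \min_{x_2} x_1^T A x_2 = V_1^*$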
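➢ Since $V_1^* \le V_2^*$ also holds, both inequalities must be equalities, where $(\tilde{x}_1, \tilde{x}_2)$ is a Nash equilibrium with reward $v$ to P1:
$V_2^* = \min_{x_2} \max_{x_1} x_1^T A x_2 = \max_{x_1} x_1^T A \tilde{x}_2 = v = \min_{x_2} \tilde{x}_1^T A x_2 = \max_{x_1} \min_{x_2} x_1^T A x_2 = V_1^*$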
➢ Polynomial in #actions of the two players: $m_1$ and $m_2$
➢ Exploits the fact that a Nash equilibrium is simply composed of maximin strategies, which can be computed using linear programming
Maximize $v$
Subject to
  $(x_1^T A)_j \ge v$, $j \in \{1, \ldots, m_2\}$
  $x_{1,1} + \cdots + x_{1,m_1} = 1$
  $x_{1,i} \ge 0$, $i \in \{1, \ldots, m_1\}$
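To make the constraints concrete, the following sketch checks whether a candidate pair $(x_1, v)$ is feasible for this linear program. It is only a constraint checker written in plain Python, not a solver; in practice an off-the-shelf LP solver would search over the feasible pairs for the largest $v$. The name `is_feasible` and the tolerance `tol` are assumptions made for illustration.

```python
def is_feasible(A, x1, v, tol=1e-9):
    """Check the constraints of P1's maximin LP for a candidate (x1, v)."""
    m1, m2 = len(A), len(A[0])
    if len(x1) != m1:
        return False
    # x1 must be a probability distribution over P1's m1 actions
    if any(p < -tol for p in x1):
        return False
    if abs(sum(x1) - 1) > tol:
        return False
    # Against every column j of P2, P1's expected reward must be at least v
    return all(sum(x1[i] * A[i][j] for i in range(m1)) >= v - tol for j in range(m2))


# Rock-paper-scissor, entries are rewards of P1 (row player)
A = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]
uniform = [1 / 3, 1 / 3, 1 / 3]

print(is_feasible(A, uniform, 0.0))     # True: uniform play guarantees reward 0
print(is_feasible(A, uniform, 0.1))     # False: it cannot guarantee more than 0
print(is_feasible(A, [1, 0, 0], -1.0))  # True: pure Rock only guarantees -1
```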
∗ if you know the other player is rational enough
∗
Kicker \ Goalie    L       R
L                  0.58    0.95
R                  0.93    0.70

Kicker
Maximize $v$
Subject to
  $0.58 p_L + 0.93 p_R \ge v$
  $0.95 p_L + 0.70 p_R \ge v$
  $p_L + p_R = 1$
  $p_L \ge 0, \; p_R \ge 0$

Goalie
Minimize $v$
Subject to
  $0.58 q_L + 0.95 q_R \le v$
  $0.93 q_L + 0.70 q_R \le v$
  $q_L + q_R = 1$
  $q_L \ge 0, \; q_R \ge 0$
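For a 2x2 game like this one, the two linear programs can be solved by hand. When no pure-strategy saddle point exists, both players mix, and each player's optimal mix makes the opponent indifferent between their two actions. The sketch below uses that indifference shortcut rather than a general LP solver; `solve_2x2` is an illustrative name, and the function raises an error when the shortcut does not apply.

```python
def solve_2x2(A):
    """Solve a 2x2 zero-sum game with no pure saddle point via indifference conditions.

    A[i][j] is the reward of the row player. Returns (row strategy, column strategy, value).
    """
    (a, b), (c, d) = A
    denom = a - b - c + d
    if denom == 0:
        raise ValueError("indifference conditions are degenerate for this matrix")
    p = (d - c) / denom  # probability of row 0: equalizes the row player's reward across columns
    q = (d - b) / denom  # probability of column 0: equalizes the row player's reward across rows
    if not (0 <= p <= 1 and 0 <= q <= 1):
        raise ValueError("game has a pure saddle point; the shortcut does not apply")
    value = (a * d - b * c) / denom
    return (p, 1 - p), (q, 1 - q), value


# Rows: kicker plays L or R. Columns: goalie plays L or R.
penalty_kick = [
    [0.58, 0.95],
    [0.93, 0.70],
]

(p_L, p_R), (q_L, q_R), v = solve_2x2(penalty_kick)
print(round(p_L, 2), round(p_R, 2))  # 0.38 0.62
print(round(q_L, 2), round(q_R, 2))  # 0.42 0.58
print(round(v, 4))                   # 0.7958
```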
Kicker
  Maximin: $p_L = 0.38$, $p_R = 0.62$
  Reality: $p_L = 0.40$, $p_R = 0.60$
Goalie
  Maximin: $q_L = 0.42$, $q_R = 0.58$
  Reality: $q_L = 0.423$, $q_R = 0.577$
Some evidence that people may play minimax strategies.
➢ Cheating. Typically, Nash’s theorem (for the special case of two-player zero-sum games) is proved using the minimax theorem.
[Photos: John von Neumann, George Dantzig]
George Dantzig loves to tell the story of his meeting with John von Neumann on October 3, 1947 at the Institute for Advanced Study at Princeton. Dantzig went to that meeting with the express purpose of describing the linear programming problem to von Neumann and asking him to suggest a computational procedure. He was actually looking for methods to benchmark the simplex method. Instead, he got a 90-minute lecture on Farkas' Lemma and duality (Dantzig's notes of this session formed the source of the modern perspective on linear programming duality). Not wanting Dantzig to be completely amazed, von Neumann admitted: "I don't want you to think that I am pulling all this out of my sleeve like a magician. I have recently completed a book with Morgenstern on the theory of games. What I am doing is conjecturing that the two problems are equivalent. The theory that I am outlining is an analogue to the one we have developed for games."