CSC304 Lecture 5 Game Theory : Zero-Sum Games, The Minimax Theorem
CSC304 - Nisarg Shah 1
Recap
• Last lecture: cost-sharing games
  o Price of anarchy (PoA) can be $n$ (the number of players)
  o Price of stability (PoS) is $O(\log n)$
• Potential functions and pure Nash equilibria
• Congestion games
• Braess' paradox
• Updated (slightly more detailed) slides
• Common term: “zero-sum situation”
• Psychology literature: “zero-sum thinking”
• “Strictly competitive games”
• “The more I win, the more you lose”
Sam \ John     Stay Silent    Betray
Stay Silent    (-1, -1)       (-3, 0)
Betray         (0, -3)        (-2, -2)

P1 \ P2     Rock       Paper      Scissors
Rock        (0, 0)     (-1, 1)    (1, -1)
Paper       (1, -1)    (0, 0)     (-1, 1)
Scissors    (-1, 1)    (1, -1)    (0, 0)
• Most games we play are zero-sum: chess, tic-tac-toe, …
• (win, lose), (lose, win), (draw, draw)
• (1, -1), (-1, 1), (0, 0)
• Relation between the rewards of P1 and P2
• P1 maximizes his reward
• P2 maximizes his reward = minimizes reward of P1
• Only need a single matrix $A$: reward for P1
• P1 wants to maximize, P2 wants to minimize

P1 \ P2     Rock    Paper    Scissors
Rock         0      -1        1
Paper        1       0       -1
Scissors    -1       1        0
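As a minimal sketch of the single-matrix idea, the snippet below checks whether a two-player game is zero-sum and, if so, keeps only P1's rewards. The names `RPS`, `PRISONERS_DILEMMA`, `is_zero_sum`, and `p1_matrix` are illustrative choices, not part of the lecture.

```python
# Each entry [i][j] is (reward to P1, reward to P2) when P1 plays row i and P2 plays column j.
RPS = [
    [(0, 0), (-1, 1), (1, -1)],   # Rock
    [(1, -1), (0, 0), (-1, 1)],   # Paper
    [(-1, 1), (1, -1), (0, 0)],   # Scissors
]
PRISONERS_DILEMMA = [
    [(-1, -1), (-3, 0)],          # Stay Silent
    [(0, -3), (-2, -2)],          # Betray
]


def is_zero_sum(bimatrix):
    """Return True if the two rewards sum to zero in every outcome."""
    return all(r1 + r2 == 0 for row in bimatrix for (r1, r2) in row)


def p1_matrix(bimatrix):
    """Keep only P1's reward; in a zero-sum game P2's reward is its negation."""
    return [[r1 for (r1, _) in row] for row in bimatrix]


A = p1_matrix(RPS)
print(is_zero_sum(RPS), is_zero_sum(PRISONERS_DILEMMA))
print(A)
```

Rock-Paper-Scissors passes the check, while the prisoner's dilemma does not, so only the former can be described by a single matrix.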
• What are the rewards for P1 corresponding to different possible actions of P2?
• Reward for P1 when P2 chooses $s_j$: $[x_{1,1}, x_{1,2}, x_{1,3}, \ldots] \cdot (\text{column } j \text{ of } A) = (x_1^T A)_j$
• P1 uses mixed strategy $x_1$
• P2 uses mixed strategy $x_2$
• Reward for P1: $[(x_1^T A)_1, (x_1^T A)_2, (x_1^T A)_3, \ldots] \cdot [x_{2,1}, x_{2,2}, x_{2,3}, \ldots]^T = x_1^T A x_2$
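The expected reward $x_1^T A x_2$ can be computed in two steps, mirroring the slide: first the row vector $x_1^T A$, then its dot product with $x_2$. This is a small sketch on the Rock-Paper-Scissors matrix; the function name `expected_reward` and the example strategies are assumptions made for illustration.

```python
def expected_reward(x1, A, x2):
    """Return x1^T A x2, P1's expected reward under mixed strategies x1 and x2."""
    n_rows, n_cols = len(A), len(A[0])
    # (x1^T A)_j: P1's expected reward when P2 plays pure action j.
    x1_A = [sum(x1[i] * A[i][j] for i in range(n_rows)) for j in range(n_cols)]
    # Weight each column's reward by the probability that P2 plays it.
    return sum(x1_A[j] * x2[j] for j in range(n_cols))


A = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]  # Rock-Paper-Scissors, reward for P1
x1 = [0.5, 0.5, 0.0]  # P1 mixes Rock and Paper equally
x2 = [1.0, 0.0, 0.0]  # P2 always plays Rock
print(expected_reward(x1, A, x2))  # 0.5: Paper beats Rock half the time, Rock ties otherwise
```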
• If I choose mixed strategy $x_1$…
• P2 would choose $x_2$ to minimize my reward (i.e., maximize his reward)
• Let me choose $x_1$ to maximize this “worst-case reward”
• $V_1^* = \max_{x_1} \min_{x_2} x_1^T A x_2$
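Because $x_1^T A x_2$ is linear in $x_2$, the inner minimum is attained at one of P2's pure actions, so the worst-case reward of $x_1$ is $\min_j (x_1^T A)_j$. The sketch below uses that fact and then brute-forces the outer maximum over a grid of mixed strategies for Rock-Paper-Scissors. The grid search is only an illustration for three actions (the helper names and the step count are assumptions), not an efficient algorithm.

```python
from fractions import Fraction

A = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]  # Rock-Paper-Scissors, reward for P1


def worst_case_reward(x1, A):
    """Return min over P2's pure actions j of (x1^T A)_j."""
    return min(
        sum(x1[i] * A[i][j] for i in range(len(A)))
        for j in range(len(A[0]))
    )


def grid_maximin(A, steps=30):
    """Search 3-action mixed strategies with probabilities k/steps for the best worst case."""
    best_value, best_x1 = None, None
    for a in range(steps + 1):
        for b in range(steps + 1 - a):
            x1 = (Fraction(a, steps), Fraction(b, steps), Fraction(steps - a - b, steps))
            value = worst_case_reward(x1, A)
            if best_value is None or value > best_value:
                best_value, best_x1 = value, x1
    return best_value, best_x1


print(worst_case_reward((1, 0, 0), A))  # -1: always playing Rock loses to Paper
value, x1_star = grid_maximin(A)
print(value, x1_star)  # value 0, attained by the uniform strategy
```

Exact fractions are used so that the uniform strategy $(1/3, 1/3, 1/3)$ lies on the grid and comparisons are not affected by rounding.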
• $V_1^* = \max_{x_1} \min_{x_2} x_1^T A x_2$
• $V_1^*$: maximin value of P1
• $x_1^*$ (maximizer): maximin strategy of P1
• “By playing $x_1^*$, I guarantee myself at least $V_1^*$”
• If P1 plays $x_1^*$, P2's best response is $\hat{x}_2$
• Will $x_1^*$ be the best response to $\hat{x}_2$?
• P1: choose my strategy to maximize my reward, worst-case over P2's response
  $V_1^* = \max_{x_1} \min_{x_2} x_1^T A x_2$ (maximizer: $x_1^*$)
• P2: choose my strategy to minimize P1's reward, worst-case over P1's response
  $V_2^* = \min_{x_2} \max_{x_1} x_1^T A x_2$ (minimizer: $x_2^*$)
• What is the relation between $V_1^*$ and $V_2^*$?
• What if P1 and P2 play $(x_1^*, x_2^*)$?
• P1 must get at least $V_1^*$ (ensured by P1)
• P1 must get at most $V_2^*$ (ensured by P2)
• Hence $V_1^* \le V_2^*$
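The argument above can be written as one chain of inequalities. Here $x_1^*$ denotes a maximizer in the definition of $V_1^*$ and $x_2^*$ a minimizer in the definition of $V_2^*$; the first inequality holds because a minimum over $x_2$ is at most the value at $x_2^*$, and the second because the value at $x_1^*$ is at most the maximum over $x_1$.

```latex
V_1^* \;=\; \min_{x_2} (x_1^*)^T A\, x_2
\;\le\; (x_1^*)^T A\, x_2^*
\;\le\; \max_{x_1} x_1^T A\, x_2^*
\;=\; V_2^*
```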
• $V_1^* = V_2^* = V^*$ (called the minimax value of the game)
• Set of Nash equilibria = $\{(x_1^*, x_2^*) : x_1^* \text{ is a maximin strategy for P1},\ x_2^* \text{ is a minimax strategy for P2}\}$
• $x_1^*$ is a best response to $x_2^*$ and vice versa
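As a quick sanity check on Rock-Paper-Scissors, the sketch below verifies that the uniform strategies are best responses to each other and that the resulting value is 0. The helper names are illustrative, and checking pure deviations is enough because a mixed deviation is a weighted average of pure ones.

```python
from fractions import Fraction

A = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]  # Rock-Paper-Scissors, reward for P1
third = Fraction(1, 3)
x1_star = [third, third, third]
x2_star = [third, third, third]


def p1_rewards_against(x2, A):
    """(A x2)_i: P1's expected reward for each pure action i against x2."""
    return [sum(A[i][j] * x2[j] for j in range(len(x2))) for i in range(len(A))]


def p1_rewards_under(x1, A):
    """(x1^T A)_j: P1's expected reward when P2 plays each pure action j."""
    return [sum(x1[i] * A[i][j] for i in range(len(x1))) for j in range(len(A[0]))]


rewards_vs_x2 = p1_rewards_against(x2_star, A)
value = sum(x1_star[i] * rewards_vs_x2[i] for i in range(len(x1_star)))

# P1 cannot gain by deviating: no pure action earns more than the value.
p1_cannot_improve = max(rewards_vs_x2) <= value
# P2 cannot gain by deviating: no pure action pushes P1's reward below the value.
p2_cannot_improve = min(p1_rewards_under(x1_star, A)) >= value

print(value, p1_cannot_improve, p2_cannot_improve)  # 0 True True
```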
“As far as I can see, there could be no theory of games … without that theorem … I thought there was nothing worth publishing until the Minimax Theorem was proved”
• Optimal strategies for P1 and P2 (up to ties)
• Optimal rewards for P1 and P2 under rational play
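• Proof via Nash's theorem (but note that the minimax theorem predates Nash's theorem)
• We already know $V_1^* \le V_2^*$; it remains to show $V_1^* = V_2^*$
• Take a Nash equilibrium $(\tilde{x}_1, \tilde{x}_2)$ and let $\tilde{v} = \tilde{x}_1^T A \tilde{x}_2$
• Each strategy is a best response to the other: $\max_{x_1} x_1^T A \tilde{x}_2 = \tilde{v} = \min_{x_2} \tilde{x}_1^T A x_2$
• $\tilde{v} = \min_{x_2} \tilde{x}_1^T A x_2 \le \max_{x_1} \min_{x_2} x_1^T A x_2 = V_1^*$
• $V_2^* = \min_{x_2} \max_{x_1} x_1^T A x_2 \le \max_{x_1} x_1^T A \tilde{x}_2 = \tilde{v}$
• Hence $V_2^* \le \tilde{v} \le V_1^*$; combined with $V_1^* \le V_2^*$, every inequality holds with equality, so $V_1^* = V_2^*$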
• Polynomial in the number of actions of the two players: $m_1$ and $m_2$
• $(x_1^T A)_j \ge v, \quad j \in \{1, \ldots, m_2\}$
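For reference, the constraint above fits into a linear program for P1's maximin strategy along the following lines. This is a sketch of the standard formulation, written to match the kicker's program in the penalty-kick example, with variables $v$ and $x_{1,1}, \ldots, x_{1,m_1}$; the exact layout on the original slide is an assumption.

```latex
\begin{aligned}
\text{maximize}\quad   & v \\
\text{subject to}\quad & (x_1^T A)_j \ge v, && j \in \{1, \ldots, m_2\} \\
                       & \sum_{i=1}^{m_1} x_{1,i} = 1 \\
                       & x_{1,i} \ge 0, && i \in \{1, \ldots, m_1\}
\end{aligned}
```

The program has $m_1 + 1$ variables and $m_2$ reward constraints, which is why its size is polynomial in $m_1$ and $m_2$.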
Kicker \ Goalie    L       R
L                  0.58    0.95
R                  0.93    0.70

Kicker
Maximize $v$
Subject to
  $0.58 p_L + 0.93 p_R \ge v$
  $0.95 p_L + 0.70 p_R \ge v$
  $p_L + p_R = 1$
  $p_L \ge 0,\ p_R \ge 0$

Goalie
Minimize $v$
Subject to
  $0.58 q_L + 0.95 q_R \le v$
  $0.93 q_L + 0.70 q_R \le v$
  $q_L + q_R = 1$
  $q_L \ge 0,\ q_R \ge 0$
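For a 2x2 game like this one, the two programs can be solved by hand: when neither player has a pure strategy that is optimal, each player mixes so that the opponent is indifferent between the two actions. The sketch below applies that indifference condition instead of a general LP solver; `solve_2x2_mixed` is a hypothetical helper and assumes the game has no saddle point in pure strategies.

```python
def solve_2x2_mixed(A):
    """Solve a 2x2 zero-sum game whose optimal strategies are fully mixed.

    A[i][j] is the row player's reward. Assumes there is no saddle point in
    pure strategies, so each player makes the other indifferent.
    Returns (p, q, v): probability of the first row, probability of the first
    column, and the value of the game.
    """
    (a, b), (c, d) = A
    denom = a - b - c + d
    p = (d - c) / denom          # row player: p*a + (1-p)*c == p*b + (1-p)*d
    q = (d - b) / denom          # column player: q*a + (1-q)*b == q*c + (1-q)*d
    v = (a * d - b * c) / denom  # reward both players can guarantee
    return p, q, v


# Rows: kicker shoots L or R. Columns: goalie dives L or R. Entries: scoring probability.
penalty_kicks = [[0.58, 0.95], [0.93, 0.70]]
p_L, q_L, v = solve_2x2_mixed(penalty_kicks)
print(round(p_L, 2), round(q_L, 2), round(v, 3))  # 0.38 0.42 0.796
```

Both of the kicker's constraints are tight at this solution, which is what the indifference condition expresses.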
Kicker
• Maximin: $p_L = 0.38$, $p_R = 0.62$
• Reality: $p_L = 0.40$, $p_R = 0.60$

Goalie
• Maximin: $q_L = 0.42$, $q_R = 0.58$
• Reality: $q_L = 0.423$, $q_R = 0.577$

Some evidence that people may play minimax strategies.
• Cheating. Typically, Nash's theorem (for the special case of 2p-zs games) is proved using the minimax theorem.

[Photos: John von Neumann, George Dantzig]
George Dantzig loves to tell the story of his meeting with John von Neumann on October 3, 1947 at the Institute for Advanced Study at Princeton. Dantzig went to that meeting with the express purpose of describing the linear programming problem to von Neumann and asking him to suggest a computational procedure. He was actually looking for methods to benchmark the simplex method. Instead, he got a 90-minute lecture on Farkas' Lemma and Duality (Dantzig's notes of this session formed the source of the modern perspective on linear programming duality). Not wanting Dantzig to be completely amazed, von Neumann admitted: "I don't want you to think that I am pulling all this out of my sleeve like a magician. I have recently completed a book with Morgenstern on the theory of games. What I am doing is conjecturing that the two problems are equivalent. The theory that I am outlining is an analogue to the one we have developed for games."