CSC304 Lecture 5 Game Theory: Zero-Sum Games, The Minimax Theorem



SLIDE 1

CSC304 Lecture 5 Game Theory: Zero-Sum Games, The Minimax Theorem

CSC304 - Nisarg Shah 1

SLIDE 2

Recap

  • Last lecture

➢ Cost-sharing games

  • Price of anarchy (PoA) can be $n$
  • Price of stability (PoS) is $O(\log n)$

➢ Potential functions and pure Nash equilibria
➢ Congestion games
➢ Braess' paradox
➢ Updated (slightly more detailed) slides

  • Assignment 1 to be posted
  • Volunteer note-taker
SLIDE 3

Zero-Sum Games

  • Total reward constant in all outcomes (w.l.o.g. 0)

➢ Common term: "zero-sum situation"
➢ Psychology literature: "zero-sum thinking"
➢ "Strictly competitive games"

  • Focus on two-player zero-sum games (2p-zs)

➢ "The more I win, the more you lose"

SLIDE 4

Zero-Sum Games

Non-zero-sum game: Prisoner's dilemma

Sam \ John     Stay Silent   Betray
Stay Silent    (-1, -1)      (-3, 0)
Betray         (0, -3)       (-2, -2)

Zero-sum game: Rock-Paper-Scissor

P1 \ P2    Rock       Paper      Scissor
Rock       (0, 0)     (-1, 1)    (1, -1)
Paper      (1, -1)    (0, 0)     (-1, 1)
Scissor    (-1, 1)    (1, -1)    (0, 0)

SLIDE 5

Zero-Sum Games

  • Why are they interesting?

➢ Most games we play are zero-sum: chess, tic-tac-toe, rock-paper-scissor, …
➢ (win, lose), (lose, win), (draw, draw)
➢ (1, -1), (-1, 1), (0, 0)

  • Why are they technically interesting?

➢ Relation between the rewards of P1 and P2
➢ P1 maximizes his reward
➢ P2 maximizes his reward = minimizes reward of P1

SLIDE 6

Zero-Sum Games

  • Reward for P2 = - Reward for P1

➢ Only need a single matrix $A$: reward for P1
➢ P1 wants to maximize, P2 wants to minimize

P1 \ P2    Rock    Paper    Scissor
Rock         0      -1        1
Paper        1       0       -1
Scissor     -1       1        0
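The reduction from a table of reward pairs to a single matrix can be sketched as follows. This is a minimal illustration, assuming the game is stored as nested lists of (P1 reward, P2 reward) pairs; the helper names `is_zero_sum` and `p1_matrix` are hypothetical and not from the lecture.

```python
# Games as nested lists of (P1 reward, P2 reward) pairs (an assumed encoding).
RPS = [
    [(0, 0), (-1, 1), (1, -1)],   # P1 plays Rock
    [(1, -1), (0, 0), (-1, 1)],   # P1 plays Paper
    [(-1, 1), (1, -1), (0, 0)],   # P1 plays Scissor
]
PRISONERS_DILEMMA = [
    [(-1, -1), (-3, 0)],          # Sam stays silent
    [(0, -3), (-2, -2)],          # Sam betrays
]


def is_zero_sum(game):
    """Return True if the two rewards sum to 0 in every outcome."""
    return all(r1 + r2 == 0 for row in game for (r1, r2) in row)


def p1_matrix(game):
    """Keep only P1's reward; in a zero-sum game P2's reward is its negation."""
    if not is_zero_sum(game):
        raise ValueError("not a zero-sum game")
    return [[r1 for (r1, _) in row] for row in game]


A = p1_matrix(RPS)
print(A)                               # [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]
print(is_zero_sum(PRISONERS_DILEMMA))  # False
```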

SLIDE 7

Rewards in Matrix Form

  • Say P1 uses mixed strategy $x_1 = (x_{1,1}, x_{1,2}, \ldots)$

➢ What are the rewards for P1 corresponding to different possible actions of P2?

[Figure: reward matrix with rows weighted by $x_{1,1}, x_{1,2}, x_{1,3}, \ldots$ and the column for P2's action $s_j$ highlighted]

SLIDE 8

Rewards in Matrix Form

  • Say P1 uses mixed strategy $x_1 = (x_{1,1}, x_{1,2}, \ldots)$

➢ What are the rewards for P1 corresponding to different possible actions of P2?

[Figure: row vector $(x_{1,1}, x_{1,2}, x_{1,3}, \ldots)$ multiplied by the column of $A$ for P2's action $s_j$]

❖ Reward for P1 when P2 chooses $s_j$ is $x_1^T (A)_j$, where $(A)_j$ is column $j$ of $A$

SLIDE 9

Rewards in Matrix Form

  • Reward for P1 when…

➢ P1 uses mixed strategy $x_1$
➢ P2 uses mixed strategy $x_2$

$$\left[\, x_1^T (A)_1,\; x_1^T (A)_2,\; x_1^T (A)_3,\; \ldots \,\right] \begin{bmatrix} x_{2,1} \\ x_{2,2} \\ x_{2,3} \\ \vdots \end{bmatrix} = x_1^T A\, x_2$$

SLIDE 10

How would the two players act in this zero-sum game?

John von Neumann, 1928

SLIDE 11

Maximin Strategy

  • Worst-case thinking by P1…

➢ If I choose mixed strategy $x_1$…
➢ P2 would choose $x_2$ to minimize my reward (i.e., maximize his reward)
➢ Let me choose $x_1$ to maximize this "worst-case reward"

$$V_1^* = \max_{x_1} \min_{x_2} x_1^T A\, x_2$$

SLIDE 12

Maximin Strategy

$$V_1^* = \max_{x_1} \min_{x_2} x_1^T A\, x_2$$

  • $V_1^*$: maximin value of P1
  • $x_1^*$ (maximizer): maximin strategy of P1
  • "By playing $x_1^*$, I guarantee myself at least $V_1^*$"
  • But if P1 → $x_1^*$, P2's best response → $\hat{x}_2$

➢ Will $x_1^*$ be the best response to $\hat{x}_2$?
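The inner minimization can be sketched in a few lines. For a fixed $x_1$, the reward $x_1^T A\, x_2$ is a weighted average of the per-column rewards $x_1^T (A)_j$, so P2's minimum is attained at a pure action and it suffices to scan the columns. The function name `worst_case_reward` is an illustrative assumption.

```python
def worst_case_reward(x1, A):
    """Worst-case reward of mixed strategy x1 for P1.

    The minimum over P2's mixed strategies equals the minimum over P2's
    pure actions j of x1^T (A)_j, so only the columns are scanned.
    """
    n_rows, n_cols = len(A), len(A[0])
    return min(
        sum(x1[i] * A[i][j] for i in range(n_rows))
        for j in range(n_cols)
    )


# Rock-Paper-Scissor reward matrix for P1.
A = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]

# A predictable P1 is exploited: P2 answers Rock with Paper.
print(worst_case_reward([1, 0, 0], A))  # -1

# The uniform mix guarantees (up to floating-point error) a reward of 0.
uniform = [1 / 3, 1 / 3, 1 / 3]
print(abs(worst_case_reward(uniform, A)) < 1e-12)  # True
```

Since the game is symmetric, no strategy for P1 can guarantee more than 0, so the uniform mix is a maximin strategy for Rock-Paper-Scissor.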

SLIDE 13

Maximin vs Minimax

Player 1

Choose my strategy to maximize my reward, worst-case over P2's response

$$V_1^* = \max_{x_1} \min_{x_2} x_1^T A\, x_2 \quad \text{(maximizer: } x_1^*\text{)}$$

Player 2

Choose my strategy to minimize P1's reward, worst-case over P1's response

$$V_2^* = \min_{x_2} \max_{x_1} x_1^T A\, x_2 \quad \text{(minimizer: } x_2^*\text{)}$$

Question: Relation between $V_1^*$ and $V_2^*$?

SLIDE 14

Maximin vs Minimax

$$V_1^* = \max_{x_1} \min_{x_2} x_1^T A\, x_2 \quad \text{(maximizer: } x_1^*\text{)}$$

$$V_2^* = \min_{x_2} \max_{x_1} x_1^T A\, x_2 \quad \text{(minimizer: } x_2^*\text{)}$$

  • What if (P1, P2) play $(x_1^*, x_2^*)$?

➢ P1 must get at least $V_1^*$ (ensured by P1)
➢ P1 must get at most $V_2^*$ (ensured by P2)
➢ $V_1^* \le V_2^*$
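The argument above can be written out as a single chain. This is a sketch using only the definitions of $V_1^*$, $V_2^*$, $x_1^*$ and $x_2^*$; the primed symbols are dummy variables introduced here for the inner optimizations.

```latex
% x_1^* attains the outer max in V_1^*, and x_2^* attains the outer min in V_2^*.
\begin{align*}
V_1^* &= \min_{x_2'} \; (x_1^*)^T A\, x_2'
      && \text{(definition of } x_1^* \text{)} \\
      &\le (x_1^*)^T A\, x_2^*
      && \text{(a minimum is at most the value at } x_2' = x_2^* \text{)} \\
      &\le \max_{x_1'} \; (x_1')^T A\, x_2^*
      && \text{(a maximum is at least the value at } x_1' = x_1^* \text{)} \\
      &= V_2^*
      && \text{(definition of } x_2^* \text{)}
\end{align*}
```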

SLIDE 15

The Minimax Theorem

  • John von Neumann [1928]
  • Theorem: For any 2p-zs game,

➢ $V_1^* = V_2^* = V^*$ (called the minimax value of the game)
➢ Set of Nash equilibria = $\{ (x_1^*, x_2^*) : x_1^* \text{ is a maximin strategy for P1},\; x_2^* \text{ is a minimax strategy for P2} \}$

  • Corollary: $x_1^*$ is a best response to $x_2^*$ and vice versa.

SLIDE 16

The Minimax Theorem

  • John von Neumann [1928]

"As far as I can see, there could be no theory of games … without that theorem … I thought there was nothing worth publishing until the Minimax Theorem was proved"

  • An unequivocal way to "solve" zero-sum games

➢ Optimal strategies for P1 and P2 (up to ties)
➢ Optimal rewards for P1 and P2 under rational play

SLIDE 17

Proof of the Minimax Theorem

  • Simpler proof using Nash's theorem

➢ But the minimax theorem predates Nash's theorem

  • Suppose $(\tilde{x}_1, \tilde{x}_2)$ is a NE
  • P1 gets value $\tilde{v} = \tilde{x}_1^T A\, \tilde{x}_2$
  • $\tilde{x}_1$ is a best response for P1: $\tilde{v} = \max_{x_1} x_1^T A\, \tilde{x}_2$
  • $\tilde{x}_2$ is a best response for P2: $\tilde{v} = \min_{x_2} \tilde{x}_1^T A\, x_2$

SLIDE 18

Proof of the Minimax Theorem

$$V_2^* = \min_{x_2} \max_{x_1} x_1^T A\, x_2 \;\le\; \max_{x_1} x_1^T A\, \tilde{x}_2 = \tilde{v} = \min_{x_2} \tilde{x}_1^T A\, x_2 \;\le\; \max_{x_1} \min_{x_2} x_1^T A\, x_2 = V_1^*$$

  • But we already saw $V_1^* \le V_2^*$

➢ $V_1^* = V_2^*$

SLIDE 19

Proof of the Minimax Theorem

$$V_2^* = \min_{x_2} \max_{x_1} x_1^T A\, x_2 \;=\; \max_{x_1} x_1^T A\, \tilde{x}_2 = \tilde{v} = \min_{x_2} \tilde{x}_1^T A\, x_2 \;=\; \max_{x_1} \min_{x_2} x_1^T A\, x_2 = V_1^*$$

  • When $(\tilde{x}_1, \tilde{x}_2)$ is a NE, $\tilde{x}_1$ and $\tilde{x}_2$ must be maximin and minimax strategies for P1 and P2, respectively.
  • The reverse direction is also easy to prove.

SLIDE 20

Computing Nash Equilibria

  • Can I practically compute a maximin strategy (and thus a Nash equilibrium of the game)?
  • Wasn't it computationally hard even for 2-player games?
  • For 2p-zs games, a Nash equilibrium can be computed in polynomial time using linear programming.

➢ Polynomial in #actions of the two players: $m_1$ and $m_2$

SLIDE 21

Computing Nash Equilibria

Maximize $v$
Subject to
$x_1^T (A)_j \ge v, \quad j \in \{1, \ldots, m_2\}$
$x_{1,1} + \cdots + x_{1,m_1} = 1$
$x_{1,i} \ge 0, \quad i \in \{1, \ldots, m_1\}$
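By the symmetric argument, P2's minimax strategy solves the mirrored program below. This is a sketch rather than a slide from the lecture; the notation $(A\, x_2)_i$ for the $i$-th entry of the vector $A\, x_2$ (P1's reward for pure action $i$ against $x_2$) is an assumption introduced here.

```latex
\begin{align*}
\text{Minimize }   \quad & v \\
\text{Subject to } \quad & (A\, x_2)_i \le v, && i \in \{1, \ldots, m_1\} \\
                         & x_{2,1} + \cdots + x_{2,m_2} = 1 \\
                         & x_{2,j} \ge 0, && j \in \{1, \ldots, m_2\}
\end{align*}
```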

SLIDE 22

Minimax Theorem in Real Life?

  • If you were to play a 2-player zero-sum game (say, as player 1), would you always play a maximin strategy?
  • What if you were convinced your opponent is an idiot?
  • What if you start playing the maximin strategy, but observe that your opponent is not best responding?

SLIDE 23

Minimax Theorem in Real Life?

Kicker \ Goalie    L       R
L                  0.58    0.95
R                  0.93    0.70

Kicker

Maximize $v$
Subject to
$0.58\, p_L + 0.93\, p_R \ge v$
$0.95\, p_L + 0.70\, p_R \ge v$
$p_L + p_R = 1$
$p_L \ge 0, \; p_R \ge 0$

Goalie

Minimize $v$
Subject to
$0.58\, q_L + 0.95\, q_R \le v$
$0.93\, q_L + 0.70\, q_R \le v$
$q_L + q_R = 1$
$q_L \ge 0, \; q_R \ge 0$

SLIDE 24

Minimax Theorem in Real Life?

Kicker \ Goalie    L       R
L                  0.58    0.95
R                  0.93    0.70

Kicker
Maximin: $p_L = 0.38$, $p_R = 0.62$
Reality: $p_L = 0.40$, $p_R = 0.60$

Goalie
Maximin: $q_L = 0.42$, $q_R = 0.58$
Reality: $q_L = 0.423$, $q_R = 0.577$

Some evidence that people may play minimax strategies.
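The maximin numbers above can be reproduced by hand for a 2x2 game. The sketch below assumes the equilibrium is fully mixed (no pure saddle point), so each player makes the opponent indifferent between the two actions; it swaps in this closed form for a general LP solver, and the function name `solve_2x2` is hypothetical.

```python
def solve_2x2(A):
    """Solve a 2x2 zero-sum game with a fully mixed equilibrium (assumed).

    A[i][j] is the row player's reward. Returns (p, q, v) where
    p = row player's probability of row 0 (maximin strategy),
    q = column player's probability of column 0 (minimax strategy),
    v = value of the game.
    """
    (a, b), (c, d) = A
    denom = a - b - c + d
    if denom == 0:
        raise ValueError("no fully mixed equilibrium: indifference equations are degenerate")
    # Row player: p*a + (1-p)*c == p*b + (1-p)*d  (column player indifferent)
    p = (d - c) / denom
    # Column player: q*a + (1-q)*b == q*c + (1-q)*d  (row player indifferent)
    q = (d - b) / denom
    v = (a * d - b * c) / denom
    if not (0 <= p <= 1 and 0 <= q <= 1):
        raise ValueError("equilibrium is not fully mixed; check pure strategies")
    return p, q, v


# Penalty-kick matrix: rows are the kicker's side, columns the goalie's side.
PENALTY = [[0.58, 0.95], [0.93, 0.70]]

p_L, q_L, v = solve_2x2(PENALTY)
print(round(p_L, 2), round(q_L, 2), round(v, 3))  # 0.38 0.42 0.796
```

Under these assumptions the kicker shoots left about 38% of the time, the goalie dives left about 42% of the time, and the kicker scores with probability about 0.796.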

SLIDE 25

Minimax Theorem

  • We proved it using Nash's theorem

➢ Cheating. Typically, Nash's theorem (for the special case of 2p-zs games) is proved using the minimax theorem.

  • Useful for proving Yao's principle, which provides lower bounds for randomized algorithms
  • Equivalent to linear programming duality

[Photos: John von Neumann, George Dantzig]

SLIDE 26

von Neumann and Dantzig

George Dantzig loves to tell the story of his meeting with John von Neumann on October 3, 1947 at the Institute for Advanced Study at Princeton. Dantzig went to that meeting with the express purpose of describing the linear programming problem to von Neumann and asking him to suggest a computational procedure. He was actually looking for methods to benchmark the simplex method. Instead, he got a 90-minute lecture on Farkas Lemma and Duality (Dantzig's notes of this session formed the source of the modern perspective on linear programming duality). Not wanting Dantzig to be completely amazed, von Neumann admitted: "I don't want you to think that I am pulling all this out of my sleeve like a magician. I have recently completed a book with Morgenstern on the theory of games. What I am doing is conjecturing that the two problems are equivalent. The theory that I am outlining is an analogue to the one we have developed for games."

  • (Chandru & Rao, 1999)