Game Theory 2



  1. 15-382 Collective Intelligence – S18. Lecture 27: Game Theory 2. Instructor: Gianni A. Di Caro

  2. The Professor's Dilemma
      - Simultaneous move
      - Non-cooperative game
      - Complete information
      - Imperfect information
      - Solution concept: predict how the game will be played with rational agents
      - Prediction ≡ Solution
      - Nash: equilibrium concept

      Payoffs are listed as (Professor, Class):

                                 Class: Listen    Class: Sleep
      Professor: Make effort     10^6, 10^6       -10, 0
      Professor: Slack off       0, -10           0, 0

      Dominant strategies? No. Listen is not dominant for the Class: if the Professor slacks off, Sleep gives the Class a higher payoff (0 rather than -10). A dominant strategy is one that is best no matter what the other player's strategy is, and neither player has one here.

      15781 Fall 2016: Lecture 22

  3. Nash Equilibrium (1951)
      - Can we find an equilibrium even in the absence of a dominant strategy?
      - At equilibrium, each player's strategy is a best response to the strategies of the others.
      - Formally, a Nash equilibrium is a strategy profile $s = (s_1, \dots, s_n) \in S^n$ such that
        $\forall i \in N,\ \forall s_i' \in S:\quad u_i(s) \ge u_i(s_i', s_{-i})$
      - John F. Nash, Nobel Prize in Economics, 1994

  4. Nash Equilibrium
      - In equilibrium, each player is playing the strategy that is a "best response" to the strategies of the other players. No one has an incentive to change strategy given the strategy choices of the others.
      - A NE is an equilibrium where each player's strategy is optimal given the strategies of all other players.
      - A Nash equilibrium exists when no player has a profitable unilateral deviation.
      - Nash equilibria are self-enforcing: when players are at a Nash equilibrium they have no desire to move, because moving unilaterally cannot make them better off → equilibrium in the policy space.
      - Dominant strategy ⟹ Nash equilibrium: all solutions in dominant strategies are also Nash equilibria, but the converse is not necessarily (and not usually) true.

  5. Nash Equilibrium
      Equilibrium is not:
      - The best possible outcome of the game. The equilibrium in the one-shot prisoner's dilemma is for both players to confess, which is not the best possible outcome (it is not Pareto optimal).
      - A situation where players always choose the same action. Sometimes equilibrium involves changing action choices (mixed strategy equilibrium).

  6. Nash Equilibrium
      - How many Nash equilibria does the Professor's Dilemma have?

                                 Class: Listen    Class: Sleep
      Professor: Make effort     10^6, 10^6       -10, 0
      Professor: Slack off       0, -10           0, 0

      Answer: two, (Make effort, Listen) and (Slack off, Sleep).

  7. Nash Equilibria: How Do We Find Them?
      - Nash equilibrium: a play of the game in which each strategy is a best reply to the given strategy of the other player.
      - Examine every pure strategy profile (X, Y), where X is the Professor's action and Y is the Class's action, and check whether one player could improve its payoff given the strategy of the other. (M = Make effort, L = Listen, S = Slack off for the Professor and Sleep for the Class.)
        - (M, L)? Yes. If the Professor plays M, then L is the best reply given M, and M is the best reply given L. Neither player can increase its payoff by choosing a different action.
        - (S, L)? No. If the Professor plays S, then S is the best reply given S, not L.
        - (M, S)? No. If the Professor plays M, then L is the best reply given M, not S.
        - (S, S)? Yes. If the Professor plays S, then S is the best reply given S, and S is the best reply given S for the Professor as well. Neither player can increase its payoff by choosing a different action.
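The profile-by-profile check described on this slide can be sketched in a few lines of Python. This is only an illustration: the names `PAYOFFS` and `pure_nash_equilibria` are hypothetical, and the payoffs are read as (Professor, Class) from the Professor's Dilemma table.

```python
from itertools import product

# Payoffs as (Professor, Class); rows = Professor, columns = Class.
PROF = ["Make effort", "Slack off"]
CLASS = ["Listen", "Sleep"]
PAYOFFS = {
    ("Make effort", "Listen"): (10**6, 10**6),
    ("Make effort", "Sleep"): (-10, 0),
    ("Slack off", "Listen"): (0, -10),
    ("Slack off", "Sleep"): (0, 0),
}

def pure_nash_equilibria(rows, cols, payoffs):
    """Return every profile from which neither player gains by deviating alone."""
    equilibria = []
    for r, c in product(rows, cols):
        u_row, u_col = payoffs[(r, c)]
        # The row player must not do better with any other row, holding c fixed.
        row_ok = all(payoffs[(r2, c)][0] <= u_row for r2 in rows)
        # The column player must not do better with any other column, holding r fixed.
        col_ok = all(payoffs[(r, c2)][1] <= u_col for c2 in cols)
        if row_ok and col_ok:
            equilibria.append((r, c))
    return equilibria

print(pure_nash_equilibria(PROF, CLASS, PAYOFFS))
# [('Make effort', 'Listen'), ('Slack off', 'Sleep')]
```

Exhaustive enumeration is fine for a 2x2 game; it scales with the number of pure profiles, so it is only practical for small games.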

  8. Nash Equilibrium for Prisoner's Dilemma

      Payoffs are listed as (Prisoner A, Prisoner B):

                                     Prisoner B: Don't confess    Prisoner B: Confess
      Prisoner A: Don't confess      -1, -1                       -9, 0
      Prisoner A: Confess            0, -9                        -6, -6
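Since a solution in dominant strategies is also a Nash equilibrium, one way to find the equilibrium of this table is to test each action for strict dominance. The sketch below assumes the payoffs are ordered (Prisoner A, Prisoner B); `dominant_strategy` is a hypothetical helper name.

```python
# Payoffs as (Prisoner A, Prisoner B); first key = A's action, second = B's action.
ACTIONS = ["Don't confess", "Confess"]
PAYOFFS = {
    ("Don't confess", "Don't confess"): (-1, -1),
    ("Don't confess", "Confess"): (-9, 0),
    ("Confess", "Don't confess"): (0, -9),
    ("Confess", "Confess"): (-6, -6),
}

def dominant_strategy(player):
    """Return the action that is strictly best against every opponent action, or None.

    player is 0 for Prisoner A and 1 for Prisoner B.
    """
    def payoff(own, other):
        profile = (own, other) if player == 0 else (other, own)
        return PAYOFFS[profile][player]

    for a in ACTIONS:
        if all(payoff(a, o) > payoff(b, o)
               for o in ACTIONS for b in ACTIONS if b != a):
            return a
    return None

print(dominant_strategy(0), "/", dominant_strategy(1))  # Confess / Confess
print(PAYOFFS[("Confess", "Confess")])                  # (-6, -6)
```

Both prisoners have Confess as a dominant strategy, so (Confess, Confess) is the equilibrium even though (Don't confess, Don't confess) would give both a better payoff of -1.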

  9. Coordination Game: Stag Hunt (originally from J.J. Rousseau)
      - Two equilibria, at (stag, stag) and (rabbit, rabbit) → each player's optimal strategy depends on its expectation of what the other player will do.
      - This game has been used as an analogy for social cooperation and mutual trust.
      - In the Prisoner's Dilemma, the Nash equilibrium corresponds to defecting, not cooperating!

  10. Competition Game
      - Both players simultaneously choose an integer from 0 to 3.
      - They both win the smaller of the two numbers in points.
      - In addition, if one player chooses a larger number than the other, then that player has to give up two points to the other.

      Does the (unique) NE at (0, 0) make sense?
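The claim that (0, 0) is the unique equilibrium can be checked by brute force. The sketch below encodes the rules as stated on the slide, reading "give up two points" as a transfer of 2 from the player with the larger number to the other; the function names are hypothetical.

```python
from itertools import product

CHOICES = range(4)  # each player picks an integer from 0 to 3

def payoff(mine, other):
    """Points for the player choosing `mine` when the opponent chooses `other`."""
    base = min(mine, other)      # both win the smaller number
    if mine > other:
        return base - 2          # the larger number gives up two points
    if mine < other:
        return base + 2          # the smaller number receives them
    return base

def pure_nash_equilibria():
    """Return all profiles (a, b) where neither player gains by deviating alone."""
    equilibria = []
    for a, b in product(CHOICES, repeat=2):
        a_ok = all(payoff(a2, b) <= payoff(a, b) for a2 in CHOICES)
        b_ok = all(payoff(b2, a) <= payoff(b, a) for b2 in CHOICES)
        if a_ok and b_ok:
            equilibria.append((a, b))
    return equilibria

print(pure_nash_equilibria())        # [(0, 0)]
print(payoff(3, 3), payoff(0, 0))    # 3 0
```

The output makes the slide's question concrete: at (3, 3) each player would earn 3 points, yet undercutting by one always pays, so play unravels to (0, 0) where both earn nothing.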

  11. Rock-Paper-Scissors

             R         P         S
      R      0, 0      -1, 1     1, -1
      P      1, -1     0, 0      -1, 1
      S      -1, 1     1, -1     0, 0

      Nash equilibrium? Is there a pure strategy that is a best response?

  12. Rock-Paper-Scissors

             R         P         S
      R      0, 0      -1, 1     1, -1
      P      1, -1     0, 0      -1, 1
      S      -1, 1     1, -1     0, 0

      - For every pure strategy profile (X, Y), there is a different strategy choice that increases the payoff of one player.
      - E.g., for profile (P, R), player B can get a higher payoff by playing S instead of R.
      - E.g., for profile (S, R), player A can get a higher payoff by playing P instead of S.
      - No equilibrium can be settled on: players have an incentive to keep switching their strategy.

      No (pure) Nash equilibria. Best response: randomize!

  13. Mixed Strategies
      - Mixed strategy: a probability distribution over (pure) strategies.
      - The mixed strategy of player $i \in N$ is $x_i$, where $x_i(s_i) = \Pr[i \text{ plays } s_i]$ (e.g., $x_i(R) = 0.3$, $x_i(P) = 0.5$, $x_i(S) = 0.2$).
      - The (expected) utility of player $i \in N$ is
        $u_i(x_1, \dots, x_n) = \sum_{(s_1, \dots, s_n) \in S^n} u_i(s_1, \dots, s_n) \cdot \prod_{j=1}^{n} x_j(s_j)$
        where $(x_1, \dots, x_n)$ is the mixed strategy profile, $(s_1, \dots, s_n)$ is a pure strategy profile, $u_i(s_1, \dots, s_n)$ is the utility of that pure strategy profile, and $\prod_{j=1}^{n} x_j(s_j)$ is the joint probability of the pure strategy profile given the mixed profile.
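For two players, the expected-utility sum reduces to a double loop over pure profiles. The sketch below evaluates it on the Rock-Paper-Scissors payoffs using exact fractions. The opponent strategy (always Rock) is an assumption chosen only for illustration, and `U1` and `expected_utility` are hypothetical names.

```python
from fractions import Fraction as F
from itertools import product

ACTIONS = ["R", "P", "S"]

# Row player's payoff u_1(s_1, s_2) in Rock-Paper-Scissors.
U1 = {
    ("R", "R"): 0,  ("R", "P"): -1, ("R", "S"): 1,
    ("P", "R"): 1,  ("P", "P"): 0,  ("P", "S"): -1,
    ("S", "R"): -1, ("S", "P"): 1,  ("S", "S"): 0,
}

def expected_utility(u, x1, x2):
    """Sum over pure profiles of utility times the joint probability x1(s1) * x2(s2)."""
    return sum(u[(s1, s2)] * x1[s1] * x2[s2]
               for s1, s2 in product(ACTIONS, repeat=2))

# The example distribution from the slide: x_i(R)=0.3, x_i(P)=0.5, x_i(S)=0.2.
x1 = {"R": F(3, 10), "P": F(1, 2), "S": F(1, 5)}
# Assumed opponent for illustration only: always plays Rock.
x2 = {"R": F(1), "P": F(0), "S": F(0)}

print(expected_utility(U1, x1, x2))  # 3/10
```

Against Rock, only Paper (payoff 1, probability 1/2) and Scissors (payoff -1, probability 1/5) contribute, giving 1/2 - 1/5 = 3/10.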

  14. Exercise: Mixed NE

             R         P         S
      R      0, 0      -1, 1     1, -1
      P      1, -1     0, 0      -1, 1
      S      -1, 1     1, -1     0, 0

      - Player 1 plays $(\tfrac{1}{2}, \tfrac{1}{2}, 0)$, player 2 plays $(0, \tfrac{1}{2}, \tfrac{1}{2})$. What is $u_1$?
      - Both players play $(\tfrac{1}{3}, \tfrac{1}{3}, \tfrac{1}{3})$. What is $u_1$?

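  15. Exercise: Mixed NE

             R         P         S
      R      0, 0      -1, 1     1, -1
      P      1, -1     0, 0      -1, 1
      S      -1, 1     1, -1     0, 0

      - Player 1 plays $(\tfrac{1}{2}, \tfrac{1}{2}, 0)$, player 2 plays $(0, \tfrac{1}{2}, \tfrac{1}{2})$. What is $u_1$?
        The profiles (R, P), (R, S), (P, P), and (P, S) each occur with probability $\tfrac{1}{4}$, and (P, P) pays 0, so
        $u_1 = \tfrac{1}{4}(-1) + \tfrac{1}{4}(1) + \tfrac{1}{4}(-1) = -\tfrac{1}{4}$
      - In the second case, because of symmetry, the utility is zero: it's a zero-sum game.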

  16. Mixed Strategies Equilibrium Is Nash
      - The mixed strategy profile $x^*$ in a strategic game is a mixed strategy Nash equilibrium if
        $\forall i,\ \forall x_i:\quad u_i(x_i^*, x_{-i}^*) \ge u_i(x_i, x_{-i}^*)$
      - $u_i(x)$ is player $i$'s expected utility with mixed strategy profile $x$.
      - → Same definition as in the case of pure strategies, where $u_i$ was the utility of a pure strategy profile instead of a mixed strategy profile.

  17. Mixed Strategies Nash Equilibrium
      - Using best response functions, $x^*$ is a mixed strategy NE iff $x_i^*$ is a best response to $x_{-i}^*$ for every player $i$.
      - If a mixed strategy $x_i^*$ is a best response, then each of the pure strategies in the mix must itself be a best response: they must all yield the same expected payoff (otherwise it would make sense to just choose the one with the better payoff).
      - → If a mixed strategy is a best response for player $i$, then the player must be indifferent among the pure strategies in the mix.
      - E.g., in the RPS game, if the mixed strategy of player $i$ assigns non-zero probabilities $p_R$ to playing R and $p_P$ to playing P, then $i$'s expected utility for playing R or P has to be the same.

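  18. Exercise: Mixed NE

             R         P         S
      R      0, 0      -1, 1     1, -1
      P      1, -1     0, 0      -1, 1
      S      -1, 1     1, -1     0, 0

      Which is a NE?
      1. $\left( (\tfrac{1}{2}, \tfrac{1}{2}, 0),\ (\tfrac{1}{2}, \tfrac{1}{2}, 0) \right)$
      2. $\left( (\tfrac{1}{2}, \tfrac{1}{2}, 0),\ (\tfrac{1}{2}, 0, \tfrac{1}{2}) \right)$
      3. $\left( (\tfrac{1}{3}, \tfrac{1}{3}, \tfrac{1}{3}),\ (\tfrac{1}{3}, \tfrac{1}{3}, \tfrac{1}{3}) \right)$
      4. $\left( (\tfrac{1}{3}, \tfrac{2}{3}, 0),\ (\tfrac{2}{3}, 0, \tfrac{1}{3}) \right)$

      Any other NE?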
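The indifference condition gives a mechanical test for the four candidate profiles: a mixed strategy is a best response exactly when every pure strategy it plays with positive probability attains the maximum expected payoff against the opponent's mix. The sketch below applies that test to Rock-Paper-Scissors; `is_mixed_ne` and the other names are hypothetical, and player 2's payoff is taken as the negative of player 1's because the game is zero-sum.

```python
from fractions import Fraction as F

ACTIONS = ["R", "P", "S"]

# Player 1's payoff u_1(row, column); player 2's payoff is -u_1 (zero-sum).
U1 = {
    ("R", "R"): 0,  ("R", "P"): -1, ("R", "S"): 1,
    ("P", "R"): 1,  ("P", "P"): 0,  ("P", "S"): -1,
    ("S", "R"): -1, ("S", "P"): 1,  ("S", "S"): 0,
}

def is_best_response(mix, pure_payoffs):
    """True if every action played with positive probability attains the max payoff."""
    best = max(pure_payoffs)
    return all(p == 0 or v == best for p, v in zip(mix, pure_payoffs))

def is_mixed_ne(x1, x2):
    """x1, x2 are probability tuples over (R, P, S) for players 1 and 2."""
    # Expected payoff of each pure action of player 1 against x2.
    pay1 = [sum(q * U1[(a, b)] for q, b in zip(x2, ACTIONS)) for a in ACTIONS]
    # Expected payoff of each pure action of player 2 against x1.
    pay2 = [sum(p * -U1[(a, b)] for p, a in zip(x1, ACTIONS)) for b in ACTIONS]
    return is_best_response(x1, pay1) and is_best_response(x2, pay2)

half, third = F(1, 2), F(1, 3)
candidates = [
    ((half, half, 0), (half, half, 0)),
    ((half, half, 0), (half, 0, half)),
    ((third, third, third), (third, third, third)),
    ((third, F(2, 3), 0), (F(2, 3), 0, third)),
]
print([is_mixed_ne(x1, x2) for x1, x2 in candidates])
# [False, False, True, False]
```

Only the uniform profile passes. In candidate 4, player 1 is indifferent between R and P, but player 2 puts weight on R, which earns less against player 1's mix than P or S.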

  19. Nash's Theorem
      - Theorem [Nash, 1950]: In any game with a finite number of strategies there exists at least one (possibly mixed) Nash equilibrium.

                         Player B: Left    Player B: Right
      Player A: Up       1, 2              0, 4
      Player A: Down     0, 5              3, 2

      This game has no pure strategy Nash equilibria, but it does have a Nash equilibrium in mixed strategies. How is it computed?

  20. Computation of MS NE

                         Player B: Left    Player B: Right
      Player A: Up       1, 2              0, 4
      Player A: Down     0, 5              3, 2

      Player A plays Up with probability $p_U$ and Down with probability $1 - p_U$.
      Player B plays Left with probability $p_L$ and Right with probability $1 - p_L$.

  21. Computation of MS NE

                                  Player B: L, $p_L$    Player B: R, $1 - p_L$
      Player A: U, $p_U$          1, 2                  0, 4
      Player A: D, $1 - p_U$      0, 5                  3, 2

  22. Computation of MS NE

                                  Player B: L, $p_L$    Player B: R, $1 - p_L$
      Player A: U, $p_U$          1, 2                  0, 4
      Player A: D, $1 - p_U$      0, 5                  3, 2

      If B plays Left, its expected utility is $2 p_U + 5 (1 - p_U)$.

  23. Computation of MS NE

                                  Player B: L, $p_L$    Player B: R, $1 - p_L$
      Player A: U, $p_U$          1, 2                  0, 4
      Player A: D, $1 - p_U$      0, 5                  3, 2

      If B plays Right, its expected utility is $4 p_U + 2 (1 - p_U)$.

  24. Computation of MS NE

                                  Player B: L, $p_L$    Player B: R, $1 - p_L$
      Player A: U, $p_U$          1, 2                  0, 4
      Player A: D, $1 - p_U$      0, 5                  3, 2

      If $2 p_U + 5 (1 - p_U) > 4 p_U + 2 (1 - p_U)$, then B would play only Left, which would be a pure strategy. But there are no (pure) Nash equilibria in which B plays only Left.
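Following this argument, B can only mix if its two expected utilities are equal, and the same holds for A. Solving the two indifference equations gives the mixing probabilities. The sketch below does this for a generic 2x2 game under the assumption that a fully mixed equilibrium exists (non-zero denominators, probabilities strictly between 0 and 1); `mixed_ne_2x2` is a hypothetical helper name.

```python
from fractions import Fraction as F

# Rows = A's action (Up, Down); columns = B's action (Left, Right).
A = [[1, 0], [0, 3]]   # Player A's payoffs
B = [[2, 4], [5, 2]]   # Player B's payoffs

def mixed_ne_2x2(A, B):
    """Solve the indifference conditions of a 2x2 game (interior equilibrium assumed)."""
    # p_U makes B indifferent between Left and Right:
    #   p*B[0][0] + (1-p)*B[1][0] == p*B[0][1] + (1-p)*B[1][1]
    p_u = F(B[1][1] - B[1][0], B[0][0] - B[1][0] - B[0][1] + B[1][1])
    # p_L makes A indifferent between Up and Down:
    #   q*A[0][0] + (1-q)*A[0][1] == q*A[1][0] + (1-q)*A[1][1]
    p_l = F(A[1][1] - A[0][1], A[0][0] - A[0][1] - A[1][0] + A[1][1])
    return p_u, p_l

p_u, p_l = mixed_ne_2x2(A, B)
print(p_u, p_l)  # 3/5 3/4

# Sanity check: B's expected utility is the same for Left and Right at p_U.
left = 2 * p_u + 5 * (1 - p_u)
right = 4 * p_u + 2 * (1 - p_u)
print(left, right)  # 16/5 16/5
```

Note the pattern: A's probability $p_U$ is determined by B's payoffs, and B's probability $p_L$ by A's payoffs, because each player mixes so as to make the other indifferent.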
