  1. Game Theory - Lecture #12 Outline:
     • Randomized actions
     • vNM & Bernoulli payoff functions
     • Mixed strategies & Nash equilibrium
     • Hawk/Dove & Mixed strategies

  2. Randomized action profiles
     • Original strategic setup:
       – Set of players, {1, 2, ..., n}
       – For each player, a set of actions A_i
       – For each player, preferences on action profiles characterized by a payoff function U_i : A → R
     • Question: How do we extend preferences to lotteries over action profiles?
     • Extension: Strategic game with vNM (von Neumann and Morgenstern) preferences
       – Set of players
       – For each player, a set of actions A_i
       – For each player, preferences on lotteries over action profiles characterized by a (vNM) payoff function U_i : ∆(A) → R
         Notation: ∆(Set) denotes the probability distributions over a Set of outcomes
     • Important special case: vNM preferences given by expected utility over action profiles (Bernoulli payoff)
     • Key observation: Payoff values define preferences over distributions
       – Original setting: preferences ⇔ payoffs over profiles
       – Extension: preferences ⇔ payoffs over profiles ⇒ preferences over distributions
     • Concern: Moving further away from true preferences

  3. Example
     • Original setting: preferences ⇔ payoffs over profiles
       – Fact: Several payoff functions reflect the same preferences

              C      D              C      D
         C   2, 2   0, 3       C   3, 3   0, 4
         D   3, 0   1, 1       D   4, 0   1, 1

       – These are the same game (Prisoner's Dilemma) in terms of the original ordinal preferences
     • Extension: preferences ⇔ payoffs over profiles ⇒ preferences over distributions
       – These are different games in terms of preferences over probability distributions
       – Player 1's vNM utility depends on the probabilities of {CC, CD, DC, DD}:
         ∗ Left game:  U_1(p) = p_CC · 2 + p_CD · 0 + p_DC · 3 + p_DD · 1
         ∗ Right game: U_1(p) = p_CC · 3 + p_CD · 0 + p_DC · 4 + p_DD · 1
         (Similarly for Player 2)
       – Compare the following probability distributions: (2/5, 3/5, 0, 0) vs (0, 0, 0, 1)
     • Payoff values take on heightened importance in the extended setting.
     • Dependence on payoff values can result in peculiar outcomes.
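The comparison of the two distributions above can be checked numerically. The sketch below evaluates Player 1's expected utility in both games; the same ordinal game ranks the two lotteries in opposite orders, which is the point of the example.

```python
# Expected utility for Player 1 over a distribution on {CC, CD, DC, DD}.
# Payoff vectors are read off the two tables above.

def expected_utility(payoffs, dist):
    """Expected payoff: sum of payoff * probability over the four profiles."""
    return sum(u * p for u, p in zip(payoffs, dist))

left  = [2, 0, 3, 1]   # Player 1's payoffs in the left game
right = [3, 0, 4, 1]   # Player 1's payoffs in the right game

d1 = (2/5, 3/5, 0, 0)  # mix over CC and CD
d2 = (0, 0, 0, 1)      # DD for sure

# Left game: 0.8 vs 1.0, so d2 is preferred;
# right game: 1.2 vs 1.0, so d1 is preferred.
print(expected_utility(left, d1), expected_utility(left, d2))
print(expected_utility(right, d1), expected_utility(right, d2))
```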

  4. Expected payoff peculiarities
     • In the new framework, the preferences are over probability distributions
     • Issue: Are expected payoffs “reasonable”?
     • Example: Allais paradox
       – Consider the following two lotteries (payouts in millions):

               $10    $2     $0             $10    $2     $0
          A     0      1      0    vs  a    0.1   0.89   0.01

         Most prefer A to a ...
       – Consider another two lotteries:

               $10    $2     $0             $10    $2     $0
          B    0.1     0     0.9   vs  b     0    0.11   0.89

         Most prefer B to b ...
       – Q: Are there choices of u(10), u(2), u(0) such that expected utilities result in preferences (A > a) and (B > b)?
       – Preference evaluation for (A > a):
           u(2) > 0.1 u(10) + 0.89 u(2) + 0.01 u(0).
       – Subtract 0.89 u(2) from, and add 0.89 u(0) to, each side:
           0.11 u(2) + 0.89 u(0) > 0.1 u(10) + 0.9 u(0)
       – This says the expected payoff of lottery b exceeds that of lottery B!
     • Conclusion: A decision maker’s preferences cannot always be represented by an expected payoff function. Nonetheless, we will make use of expected payoffs.
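The algebraic contradiction above can also be probed numerically. This sketch searches random utility assignments u(10), u(2), u(0) and confirms that none makes both A > a and B > b hold under expected utility, since the two inequalities are exact negatives of each other.

```python
# Numerical check of the Allais contradiction: no (u10, u2, u0) yields
# both E[A] > E[a] and E[B] > E[b].
import random

def expected(u10, u2, u0, probs):
    """Expected utility of a lottery given as probabilities (p10, p2, p0)."""
    p10, p2, p0 = probs
    return p10 * u10 + p2 * u2 + p0 * u0

A, a = (0, 1, 0), (0.1, 0.89, 0.01)
B, b = (0.1, 0, 0.9), (0, 0.11, 0.89)

random.seed(0)
found = False
for _ in range(100_000):
    u10, u2, u0 = (random.uniform(-10, 10) for _ in range(3))
    if (expected(u10, u2, u0, A) > expected(u10, u2, u0, a)
            and expected(u10, u2, u0, B) > expected(u10, u2, u0, b)):
        found = True
        break

print(found)  # False: the common preference pattern has no EU representation
```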

  5. Mixed strategies
     • A mixed strategy is a probability distribution over a player's actions. Specifically, player i selects α_i ∈ ∆(A_i)
     • Consequences:
       – Joint action probabilities are products of the individual players' probabilities
       – The Bernoulli payoff becomes an expected utility with independent players
       – New notation: U_i(α_i, α_{−i})
     • Continuing the previous example:
       – Player 1 chooses α_1 = (α_1C, α_1D)
       – Player 2 chooses α_2 = (α_2C, α_2D)
       – The resulting probability distribution over joint actions is
           (p_CC, p_CD, p_DC, p_DD) = (α_1C α_2C, α_1C α_2D, α_1D α_2C, α_1D α_2D)
       – Inherited expected utilities:
         ∗ Left game:  U_1(α_1, α_2) = 2 · α_1C α_2C + 0 · α_1C α_2D + 3 · α_1D α_2C + 1 · α_1D α_2D
         ∗ Right game: U_1(α_1, α_2) = 3 · α_1C α_2C + 0 · α_1C α_2D + 4 · α_1D α_2C + 1 · α_1D α_2D
         (Likewise for U_2(·))
     • Reconciled viewpoint: The new setup is the same as the old setup with
       – Set of players
       – "New" set of actions: α_i ∈ ∆(A_i)
       – "New" payoff functions U_i(α_i, α_{−i}), the expected value of the original payoff functions assuming independent players
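Because joint probabilities are products of the marginals, the expected payoff is a bilinear form in the two mixed strategies. A minimal sketch with the left game's Player-1 payoff matrix:

```python
# Expected utility under independent mixed strategies:
# U_1(alpha1, alpha2) = sum_ij M[i][j] * alpha1[i] * alpha2[j].

def mixed_payoff(M, alpha1, alpha2):
    """Bilinear expected payoff for the row player with matrix M."""
    return sum(M[i][j] * alpha1[i] * alpha2[j]
               for i in range(len(alpha1)) for j in range(len(alpha2)))

# Player 1's payoffs in the left Prisoner's Dilemma table: rows C, D; cols C, D.
M_left = [[2, 0],
          [3, 1]]

alpha1 = (0.5, 0.5)    # Player 1 mixes 50/50 over C and D
alpha2 = (0.25, 0.75)  # Player 2 plays C with probability 1/4

print(mixed_payoff(M_left, alpha1, alpha2))  # → 1.0
```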

  6. Mixed strategy best response
     • Define the best response function, B_i(·), as
         B_i(α_{−i}) = { α_i : U_i(α_i, α_{−i}) ≥ U_i(α'_i, α_{−i}) for all α'_i ∈ ∆(A_i) }
       Note that the best response "function" is actually set-valued
     • This definition is exactly as before except:
       – Player actions are replaced with mixed strategies
       – Player utilities are replaced with expected utilities assuming independent players
     • Example: Generic two-player/two-action game

              L      R
         T   a, A   b, B
         B   c, C   d, D

       – Assume mixed strategies α_1 = (p, 1 − p) for the row player and α_2 = (q, 1 − q) for the column player
       – Player 1 must maximize over p ∈ [0, 1]:
           p (q · a + (1 − q) · b) + (1 − p)(q · c + (1 − q) · d)
       – Fact:
           B_row(q) = 1       if (q · a + (1 − q) · b) > (q · c + (1 − q) · d)
           B_row(q) = 0       if (q · a + (1 − q) · b) < (q · c + (1 − q) · d)
           B_row(q) = [0, 1]  if (q · a + (1 − q) · b) = (q · c + (1 − q) · d)
       – A similar analysis yields B_col(p)
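The three-case fact above translates directly into code. This sketch compares the expected payoff of the top row against the bottom row, given the column player's probability q on L; the tolerance parameter is an assumption to handle floating-point ties.

```python
# Best response of the row player in a generic 2x2 game with payoffs
# a, b (top row) and c, d (bottom row).

def best_response_row(a, b, c, d, q, tol=1e-12):
    """Return the optimal probability of playing T: 1.0, 0.0, or the
    string 'any' when the player is indifferent (all of [0, 1] is optimal)."""
    top    = q * a + (1 - q) * b   # expected payoff of playing T
    bottom = q * c + (1 - q) * d   # expected payoff of playing B
    if top > bottom + tol:
        return 1.0
    if top < bottom - tol:
        return 0.0
    return "any"

# Prisoner's Dilemma (left table): a=2, b=0 for C; c=3, d=1 for D.
# Defecting strictly dominates, so the best response is p = 0 for any q.
print(best_response_row(2, 0, 3, 1, q=0.5))  # → 0.0
```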

  7. Mixed strategy Nash equilibrium
     • The mixed strategy profile α* = (α*_1, ..., α*_n) is a mixed strategy Nash equilibrium if, for every player i, α*_i ∈ B_i(α*_{−i})
     • Celebrated Nash theorem: Every strategic game with vNM preferences in which each player has finitely many actions has a mixed strategy Nash equilibrium.
     • The Nash result relies on (advanced) fixed point theory
       – Want to find (α*_1, ..., α*_n) that the best response map (B_1(·), ..., B_n(·)) sends back to itself
       – Illustration: A continuous function f mapping the closed interval [0, 1] into itself must have a "fixed point", i.e., an x ∈ [0, 1] such that x = f(x)
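The one-dimensional illustration can be made concrete. For a continuous f : [0, 1] → [0, 1], g(x) = f(x) − x is ≥ 0 at 0 and ≤ 0 at 1, so bisection locates a fixed point; the example function cos(x) is an assumption for illustration, chosen because it maps [0, 1] into itself.

```python
# Bisection on g(x) = f(x) - x to find a fixed point of a continuous
# self-map of [0, 1].
import math

def fixed_point(f, lo=0.0, hi=1.0, iters=60):
    """Keep g(lo) >= 0 and g(hi) <= 0 while halving the interval."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) - mid >= 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

x = fixed_point(math.cos)
print(round(x, 6))  # ≈ 0.739085, where cos(x) = x
```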

  8. Hawk/Dove

              H      D
         H   0, 0   6, 1
         D   1, 6   3, 3

     • Setup:
       – H: hawk = aggressive
       – D: dove = passive
       – A model of the game of "chicken" or a traffic intersection
     • First look: What are the pure (i.e., non-randomized) action NE?
       – Best response function for the row player: B_row(H) = D and B_row(D) = H
       – Symmetric for the column player
       – NE: (H, D) and (D, H)
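The pure NE claim can be verified by brute force: a profile is a pure NE exactly when neither player gains by a unilateral deviation. A sketch over the Hawk/Dove table:

```python
# Brute-force pure-strategy NE check for the 2x2 Hawk/Dove game.
U_row = {("H", "H"): 0, ("H", "D"): 6, ("D", "H"): 1, ("D", "D"): 3}
U_col = {("H", "H"): 0, ("H", "D"): 1, ("D", "H"): 6, ("D", "D"): 3}

actions = ["H", "D"]
pure_ne = []
for r in actions:
    for c in actions:
        # No profitable unilateral deviation for either player.
        row_ok = all(U_row[(r, c)] >= U_row[(r2, c)] for r2 in actions)
        col_ok = all(U_col[(r, c)] >= U_col[(r, c2)] for c2 in actions)
        if row_ok and col_ok:
            pure_ne.append((r, c))

print(pure_ne)  # → [('H', 'D'), ('D', 'H')]
```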

  9. Hawk/Dove: Mixed strategies

              H      D
         H   0, 0   6, 1
         D   1, 6   3, 3

     • Second look: What are the mixed strategy NE?
     • As before, we construct best response functions, but for mixed strategies
       – Row: Pr(H) = p and Pr(D) = 1 − p
       – Column: Pr(H) = q and Pr(D) = 1 − q
       – Players select from {H, D} independently
     • Best response for the row player: Maximize the expected payoff, i.e.,
         max over 0 ≤ p ≤ 1 of  p (0 · q + 6 · (1 − q)) + (1 − p)(1 · q + 3 · (1 − q))
       which gives
         B_row(q) = 1       if (0 · q + 6 · (1 − q)) > (1 · q + 3 · (1 − q))
         B_row(q) = [0, 1]  if (0 · q + 6 · (1 − q)) = (1 · q + 3 · (1 − q))
         B_row(q) = 0       if (0 · q + 6 · (1 − q)) < (1 · q + 3 · (1 − q))
     • Conclusion:
         B_row(q) = 1 if q < 3/4,  [0, 1] if q = 3/4,  0 if q > 3/4
         B_col(p) = 1 if p < 3/4,  [0, 1] if p = 3/4,  0 if p > 3/4
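The threshold 3/4 comes from the indifference condition 0·q + 6·(1 − q) = 1·q + 3·(1 − q). A sketch that solves this condition symbolically for a general 2x2 row-player payoff table:

```python
# Indifference point for the row player: q* equates the expected payoffs
# of H and D, i.e., uHH*q + uHD*(1-q) = uDH*q + uDD*(1-q).

def indifference_q(uHH, uHD, uDH, uDD):
    """q* = (uDD - uHD) / (uHH - uHD - uDH + uDD); assumes the
    denominator is nonzero (neither action dominates)."""
    return (uDD - uHD) / (uHH - uHD - uDH + uDD)

q_star = indifference_q(0, 6, 1, 3)  # row-player payoffs from the table
print(q_star)  # → 0.75, so the symmetric mixed NE is (p*, q*) = (3/4, 3/4)
```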

  10. H/D: Best response plots

      [Figure: best response plots B_row(q) and B_col(p) on the unit square, p vs q]

     • NE occur at intersections of the best response plots
       – The NE of the original pure strategy game are still present
       – New "mixed strategy" NE: (p*, q*) = (3/4, 3/4)
     • Peculiarity: At the mixed strategy NE, players are indifferent, i.e.,
         B_row(3/4) = [0, 1] and B_col(3/4) = [0, 1]
       i.e., at the NE, the best response is to play (H, D) with any probability combination.
     • The mixed strategy NE makes both players indifferent
     • Question: Are there other outcomes that could lead to more desirable behavior?
