15-382 COLLECTIVE INTELLIGENCE S18
LECTURES 29-31: GAME THEORY 4-6 / EVOLUTIONARY GAME THEORY 1-3
INSTRUCTOR: GIANNI A. DI CARO
15781 Fall 2016: Lecture 22
THE QUEST FOR ALTRUISM AND COOPERATION
§ Cooperation does not seem to be the strategy to follow for rational, self-interested agents
§ Ritualized animal behavior in a conflict situation: “why are animals so gentlemanly or ladylike in contests for resources?”
CAN WE DO ANY BETTER?
§ Prisoner’s dilemma
§ R = REWARD for mutual cooperation = 3
§ S = SUCKER's payoff = 0
§ T = TEMPTATION to defect = 5
§ P = PUNISHMENT for mutual defection = 1
§ with T > R > P > S

Payoff matrix for the row player (symmetric for the column player):

         C    D
   C     R    S
   D     T    P

§ Classical game theory → both players play D
§ A shame, because they'd both do better by cooperating
§ Cooperation is a very general problem in biology, and not only there!
§ Import tariffs: should countries remove them?
§ Price fixing: why not cheat?
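With the numeric payoffs above (T = 5, R = 3, P = 1, S = 0), a short Python sketch (not part of the original slides) confirms that D is the best response to either move, i.e., defection strictly dominates:

```python
# Prisoner's dilemma payoffs for the row player, keyed by (my move, opponent's move).
# T > R > P > S with T, R, P, S = 5, 3, 1, 0.
payoff = {('C', 'C'): 3, ('C', 'D'): 0,
          ('D', 'C'): 5, ('D', 'D'): 1}

for opp in ('C', 'D'):
    best = max(('C', 'D'), key=lambda my: payoff[(my, opp)])
    print(f"vs {opp}: best response is {best}")  # D in both cases
```

Since D is best against both C and D, (D, D) is the unique Nash equilibrium, even though (C, C) would pay more to both players.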
CAN WE LEARN TO BE ALTRUISTIC?
§ Iterated prisoner’s dilemma
§ Let's repeat the game with the same opponent over and over: new strategies become possible because of the iterated nature
§ What if we observe the outcomes and adapt the strategy?
§ Axelrod's tournament (1984): let's play with multiple opponents
§ Each strategy was paired with each other strategy for 200 iterations of the game, and scored on the total points accumulated through the tournament. A strategy could adapt based on the observed outcomes
§ The winner was … Tit-For-Tat!
§ Cooperates on the first move, and subsequently echoes (reciprocates) what the other player did on the previous move: retaliation with forgiveness
§ Cheap to implement (a plausible model for animals too)
§ Alternatives to one-shot normal-form games?
§ How can players update / improve their policies?
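The tournament setup can be sketched in a few lines of Python (the strategy set is illustrative; Axelrod's actual tournament had many more entrants):

```python
# Iterated prisoner's dilemma: Tit-For-Tat vs. two simple opponents,
# 200 rounds per pairing as in Axelrod's tournament.
PAYOFF = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
          ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

def tit_for_tat(opp_history):          # cooperate first, then echo the opponent
    return opp_history[-1] if opp_history else 'C'

def always_defect(opp_history):
    return 'D'

def always_cooperate(opp_history):
    return 'C'

def play(s1, s2, rounds=200):
    h1, h2, score1, score2 = [], [], 0, 0
    for _ in range(rounds):
        m1, m2 = s1(h2), s2(h1)        # each sees the other's past moves
        p1, p2 = PAYOFF[(m1, m2)]
        score1, score2 = score1 + p1, score2 + p2
        h1.append(m1); h2.append(m2)
    return score1, score2

print(play(tit_for_tat, always_cooperate))  # (600, 600): mutual cooperation
print(play(tit_for_tat, always_defect))     # (199, 204): TFT loses only round 1
```

Tit-For-Tat never beats its opponent in a single pairing, but it accumulates the highest total across a round-robin because it elicits cooperation from cooperators while limiting losses against defectors.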
EVOLUTIONARY GAME THEORY
§ Drop the assumption of rational players, able to perfectly predict each other's thought process and make rational choices (classical game theory)
§ Let's consider a (large) population of decision makers (players)
§ A player is not always faced with the same opponents
§ There are many local interactions at the same time
§ Individuals / players do not make choices, but implement given strategies
§ The frequency with which a particular decision is made depends on the fraction of individuals in the population that have it in their strategy
§ The frequency is time-varying: the distribution of players' actions changes towards those that are (currently) better
§ We can think of this as an evolutionary dynamics
EVOLUTIONARY GAME THEORY
Justifications of evolutionary dynamics:
§ Biological interpretation: payoff == reproductive fitness, so agents following better strategies have more children, and the proportion playing such strategies grows over time
§ Economics: unsuccessful firms are driven out of business, while successful ones expand
§ Imitation: players can look around, see whether a rival player is doing better than themselves, and if so copy their strategy
§ Best response: observe how the game is currently being played, and play the current best response to it
EVOLUTIONARY GAME THEORY
§ Agents do not make choices, as in classical game theory, but implement their currently adopted / given strategy
§ An agent with a good strategy τ (i.e., higher fitness) will reproduce more → in the next time step the population will contain proportionally more individuals with strategy τ
EVOLUTIONARY GAME THEORY
§ Natural selection replaces rational behavior
BASIC NOTIONS
§ Population of individuals that can use some set T of pure strategies
§ Population profile: vector y that gives the probability y(t) with which each strategy t ∈ T is played in the population
§ A population profile need not correspond to a strategy adopted by any member of the population
§ E.g., a population that can use two strategies, t1, t2
§ If every member of the population randomizes by playing each of the two pure strategies with probability ½ → y = (½, ½), and the population profile is the same as the mixed strategy adopted by all members
§ If half of the population adopts strategy t1 and the other half t2 → the population profile is still y = (½, ½), but no member adopts it
§ Individual payoff: if an individual uses a mixed strategy τ in a population with profile y, its payoff (= number of descendants) is:

ρ(τ, y) = Σ_{t∈T} q(t) ρ(t, y)
DESCENDANTS
§ N agents / animals, programmed to use one of the pure strategies t1 or t2
§ Let's assume that 50% of the agents use each strategy: y = (½, ½)
§ Given payoffs: ρ(t1, y) = 6, ρ(t2, y) = 4
§ Next generation: 6N/2 individuals using t1 and 4N/2 individuals using t2
§ → New population profile: y = (0.6, 0.4)
Core question: in order to determine the next population, how does ρ(s, y) behave as a function of y?
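The generation update on this slide can be sketched as follows (function name illustrative):

```python
# One generation of the "payoff = #descendants" dynamics from the slide:
# each agent leaves payoff-many descendants, then shares are renormalized.
def next_profile(profile, payoffs):
    counts = [p * f for p, f in zip(profile, payoffs)]
    total = sum(counts)
    return [c / total for c in counts]

print(next_profile([0.5, 0.5], [6, 4]))  # -> [0.6, 0.4]
```

Each strategy's share is weighted by its payoff and renormalized; this is one step of a discrete replicator-style dynamics.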
TYPES OF GAMES
§ Game against the field: there is no specific opponent for a given individual; their payoff depends on what everyone in the population is doing
§ Population-wide interactions
§ Mean-field game theory ≠ classical game theory
§ The payoff might not be linear in the probabilities y(t) with which each pure strategy is played by population members
§ Frequency-dependent selection
§ Analogous to the mean-field assumption used in dynamical-systems population studies
TYPES OF GAMES
§ Pairwise contest: a given individual plays against an opponent that has been randomly selected (by Nature) from the population, and the payoff depends on what the pair of individuals do
§ Payoffs are linear in the probabilities y(t) with which each pure strategy is played by population members:

ρ(τ, y) = Σ_{t∈T} Σ_{t'∈T} q(t) y(t') ρ(t, t')
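A minimal sketch of this bilinear payoff (the 2×2 payoff matrix below is the one used later in the deck; any matrix works):

```python
# Pairwise-contest payoff rho(tau, y) = sum_t sum_t' q(t) y(t') rho(t, t'),
# i.e. the bilinear form tau^T Pi y.
def contest_payoff(tau, y, Pi):
    return sum(q * yp * Pi[j][k]
               for j, q in enumerate(tau)
               for k, yp in enumerate(y))

Pi = [[8, 3], [7, 5]]                          # illustrative payoff matrix
print(contest_payoff([1, 0], [0.5, 0.5], Pi))  # pure t1 vs uniform profile -> 5.5
```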
IMPORTANT QUESTIONS
§ Does a stable population (equilibrium) exist? What is it?
§ How do we define an equilibrium?
§ Will the population evolve toward an equilibrium?
§ Will the population move away from the equilibrium (stability of the equilibrium)?
§ What are the dynamics of population change (towards equilibria)?
John Maynard Smith and George Price, ~1973
EQUILIBRIA
§ Stability: under which strategy profiles is the population stable?
§ Let y* be the profile generated by a population where all individuals adopt strategy τ* (i.e., y* = τ*)
§ Necessary condition for evolutionary stability: τ* ∈ argmax_{τ∈Σ} ρ(τ, y*)
At equilibrium the strategy adopted by individuals must be a best response to the population profile that generates it
§ If τ* is the unique best response to y*, then the evolution of the population stops
§ If there are multiple stable strategies, the population could drift to any other of these strategies
MUTANTS / IMMIGRANTS
§ A population where initially all individuals adopt some strategy τ*
§ A genetic mutation (or immigration) occurs, and a small proportion ε of individuals use some other strategy τ
§ The new population is the post-entry population, with profile y_ε
§ E.g., T = {t1, t2}, τ* = (½, ½), and the mutant strategy is τ = (¾, ¼):

y_ε = (1 − ε) τ* + ε τ = (1 − ε)(½, ½) + ε(¾, ¼) = (½ + ε/4, ½ − ε/4)
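The post-entry profile computation can be sketched as (the ε value is illustrative):

```python
# Post-entry profile y_eps = (1 - eps) * tau_star + eps * tau, per the slide.
def post_entry(tau_star, tau, eps):
    return [(1 - eps) * a + eps * b for a, b in zip(tau_star, tau)]

# Slide example: tau* = (1/2, 1/2), mutant tau = (3/4, 1/4), eps = 0.1
print(post_entry([0.5, 0.5], [0.75, 0.25], 0.1))  # ~ [0.525, 0.475]
```

The result matches the closed form (½ + ε/4, ½ − ε/4) with ε = 0.1.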
EVOLUTIONARY STABLE STRATEGY (ESS)
§ ESS: a mixed strategy τ* is an ESS if mutants adopting any other strategy τ leave fewer offspring in the post-entry population, provided the proportion of mutants is small. That is, if there exists an ε̄ such that for every 0 < ε < ε̄ and for every τ ≠ τ*:

ρ(τ*, y_ε) > ρ(τ, y_ε)

Let's see pairwise contest examples…
§ ESS is a notion of equilibrium that is resistant to the invasion of mutants: it's an end point of evolution
§ Not all Nash equilibria can guarantee resistance to mutant invasion
HAWK-DOVE PAIRWISE CONTEST GAME
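A numeric sketch of the Hawk-Dove contest, using the payoff matrix given later in the deck and assuming resource value W and fight cost D > W (the values 2 and 4 are illustrative): at the candidate mixed ESS q = W/D, Hawk and Dove earn the same payoff, as a mixed equilibrium requires.

```python
# Hawk-Dove pairwise contest: Hawk vs Hawk -> (W - D)/2, Hawk vs Dove -> W,
# Dove vs Hawk -> 0, Dove vs Dove -> W/2.  Mixed ESS: play Hawk with q = W/D.
W, D = 2.0, 4.0                      # resource value, cost of escalation (D > W)
Pi = [[(W - D) / 2, W],              # row: Hawk
      [0.0,         W / 2]]          # row: Dove

def payoff(tau, y):                  # bilinear contest payoff tau^T Pi y
    return sum(tau[j] * y[k] * Pi[j][k] for j in range(2) for k in range(2))

q = W / D
ess = [q, 1 - q]
# At the mixed equilibrium both pure strategies earn the same payoff:
print(payoff([1, 0], ess), payoff([0, 1], ess))  # both 0.5 here
```

The common value (W/2)(1 − W/D) reappears on the group-selection slide below.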
GROUP-SELECTION VS. DARWINIAN SELECTION
§ Darwinism assumes that natural selection operates at the level of the individual, aiming to maximize individual fitness. In our modeling, this amounts to maximizing the number of descendants adopting the same strategy.
§ A different view postulates that selection operates on a larger unit, a group, maximizing the benefit of this unit (e.g., a population, a species, etc.) → group selection
§ In a monomorphic population (all players play the same randomized profile), individual and population profiles are the same, which means that individual and average expected fitness are the same. For the Hawk-Dove game (play Hawk with probability q, population plays Hawk with probability y), individual fitness is:

ρ(τ, y) = q y (W − D)/2 + q (1 − y) W + (1 − q)(1 − y) W/2

and with y = q this reduces to W/2 − (D/2) q²
§ Group selection should maximize average (group) fitness → q = 0, which means nobody plays aggressively and individual fitness is W/2
§ Instead, at the ESS equilibrium, q = W/D, and individual fitness is (W/2)(1 − W/D) < W/2
§ → When selection operates at the individual level, fitness is lower than in the case of group selection. However, under group selection the population is not resistant to mutants (it is not an ESS), so that strategy cannot be considered an end point of evolution
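The group-selection vs. ESS comparison can be checked numerically (same illustrative W and D as above):

```python
# Average fitness in a monomorphic Hawk-Dove population playing Hawk with
# probability q: f(q) = W/2 - (D/2) * q**2 (W, D values illustrative).
W, D = 2.0, 4.0

def avg_fitness(q):
    return W / 2 - (D / 2) * q ** 2

print(avg_fitness(0.0))    # group-selection optimum: W/2 = 1.0
print(avg_fitness(W / D))  # at the ESS q = W/D: (W/2) * (1 - W/D) = 0.5
```

The all-Dove population is fitter on average, but it is invadable by a Hawk mutant; only q = W/D resists invasion.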
GEOMETRICAL INTUITIONS ON EQUILIBRIA
§ How does ESS relate to Nash equilibrium? Let's consider pairwise contest games, since Nash has no direct meaning for a game against the field, and let's first revise notation and introduce a number of new concepts related to equilibria (next 13 slides)
§ The payoff to a focal individual using mixed strategy τ in a population with profile y is: ρ(τ, y) = Σ_{t∈T} Σ_{t'∈T} q(t) y(t') ρ(t, t')
§ Let's consider the set of pure strategies T = {t1, t2, …, tn} (note: we said that in a one-shot game, actions and pure strategies are in one-to-one correspondence)
§ Each pure strategy t_j can be cast as a mixed strategy vector, e.g., (0, …, 1, …, 0)
§ The focal player's mixed strategy τ = (q(t1), q(t2), …, q(tn)) and the population's strategy profile y are normalized vectors in ℝⁿ
§ ρ(t_j, t_k) is the payoff (utility) for playing pure strategy j against pure (population) strategy k in a contest game → payoff matrix: Π = [ρ(t_j, t_k)], j, k = 1, …, n
§ In the cases considered so far the matrix is square, but if individuals with different roles / actions are present in the population, it can be rectangular
ALGEBRAIC AND GEOMETRICAL ASPECTS OF EQUILIBRIA
§ In a generic two-player game, each player has different payoffs for each pair of pure strategies they can play, so a distinction between the players is needed when defining the expected utility function for a pair of mixed strategy vectors τ_B, τ_C:

ρ_B(τ_B, τ_C) = Σ_{j=1..n} Σ_{k=1..n} q_B(t_j) q_C(t_k) ρ_B(t_j, t_k)
ρ_C(τ_B, τ_C) = Σ_{j=1..n} Σ_{k=1..n} q_B(t_j) q_C(t_k) ρ_C(t_j, t_k)

§ We can rewrite these expected utility functions as scalar products in vector notation: ρ_B(τ_B, τ_C) = τ_B^T Π_B τ_C
§ In a game with n pure strategies, the utility function of a player i is a function ρ_i : [0,1] × [0,1] × … × [0,1] → ℝ
§ A player's expected utility ρ_i is linear in the pure-strategy payoffs ρ_i(t_j, t_k)
§ For a pairwise contest game in an evolutionary population game, the focal player's utility function is expressed as ρ(τ, y) = τ · Π y, defined over a convex polytope in ℝ₊ⁿ (as a function of the mixed strategy probabilities)
STATE SPACE OF THE GAME
§ Given a mixed strategy vector τ, its support set is the subset of pure strategies that are used in τ: supp(τ) = {t_j : 1 ≤ j ≤ n, q(t_j) > 0}
§ If all individuals are equivalent, pairwise contest games are symmetric games: the payoff matrix (for the pure strategies) is square and symmetric
§ In a monomorphic population all players play the same randomized profile, which corresponds to the population's one: τ = y
§ The set Σ of all possible population mixed strategy vectors

Σ = { y = (y1, y2, …, yn) ∈ ℝ₊ⁿ : Σ_{j=1..n} y_j = 1 }

forms an (n − 1)-dimensional subspace: every possible mixed strategy profile of the population must be a vector in this subspace
§ The mixed strategy (currently) adopted by the population is the state of the population → Σ represents the population state space
GAME SIMPLEX
§ Σ = { y = (y1, y2, …, yn) ∈ ℝ₊ⁿ : Σ_{j=1..n} y_j = 1 } is also the probability simplex (the (n − 1)-dimensional unit simplex)
§ Σ is a convex set: it is the set of all convex combinations of the pure strategies
§ More in general, a k-simplex is a k-dimensional polytope: the convex hull of its k + 1 vertices w1, w2, …, w_{k+1}:

{ y1 w1 + y2 w2 + ⋯ + y_{k+1} w_{k+1} : y_j ≥ 0, Σ_j y_j = 1 }
15781 Fall 2016: Lecture 13
CONVEX SETS
§ A set F ⊆ ℝⁿ is convex if for all y, z ∈ F and λ ∈ [0, 1], λy + (1 − λ)z ∈ F
§ A set is convex if, given any two points in it, it contains all their convex combinations
[Figure: a convex set vs. a nonconvex set]
CONVEX COMBINATION
§ Given k points Q_j ∈ ℝⁿ, j = 1, …, k, a point A ∈ ℝⁿ is a convex combination of the points Q_j if:

A = Σ_{j=1..k} μ_j Q_j,  with μ_j ≥ 0 ∀j and Σ_{j=1..k} μ_j = 1

§ If k = 2: A = μQ1 + (1 − μ)Q2, with μ1 = μ, μ2 = 1 − μ
§ Q1 = (2, 1), Q2 = (6, 3), μ = 0.75 → A = (3, 1.5)
§ Q1 = (0, 0), Q2 = (1, 0), Q3 = (0, 1), μ_j = {0.5, 0.2, 0.3} → A = (0.2, 0.3)
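A sketch of the convex-combination computation, reproducing the two slide examples:

```python
# Convex combination A = sum_j mu_j * Q_j of points Q_j.
def convex_combination(points, weights):
    assert all(w >= 0 for w in weights) and abs(sum(weights) - 1.0) < 1e-9
    dim = len(points[0])
    return tuple(sum(w * p[i] for w, p in zip(weights, points))
                 for i in range(dim))

print(convex_combination([(2, 1), (6, 3)], [0.75, 0.25]))            # (3.0, 1.5)
print(convex_combination([(0, 0), (1, 0), (0, 1)], [0.5, 0.2, 0.3]))  # (0.2, 0.3)
```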
CONVEX HULL
§ Given a set Q of k points of ℝⁿ, Q = {Q1, Q2, …, Q_k}, the smallest convex set conv(Q) that includes Q is Q's convex hull, Q ⊆ conv(Q)
§ conv(Q) is the set of all convex combinations of the points in Q:

conv(Q) = { A ∈ ℝⁿ : A = Σ_{j=1..k} μ_j Q_j,  μ_j ≥ 0 ∀j = 1, …, k,  Σ_{j=1..k} μ_j = 1 }
UTILITY FUNCTIONS AND DOT PRODUCTS
A two-player symmetric game:

ρ_B(τ_B, τ_C) = Σ_{j=1..n} Σ_{k=1..n} q_B(t_j) q_C(t_k) ρ_B(t_j, t_k)
ρ_C(τ_B, τ_C) = Σ_{j=1..n} Σ_{k=1..n} q_B(t_j) q_C(t_k) ρ_C(t_j, t_k)

§ As a dot product: ρ_B(τ_B, τ_C) = τ_B^T Π_B τ_C
§ τ_ν = (q_ν(t1), q_ν(t2)) = (q_ν(t1), 1 − q_ν(t1)), ν = B, C
§ Π_B = [ρ_B(t1,t1) ρ_B(t1,t2); ρ_B(t2,t1) ρ_B(t2,t2)] = [8 3; 7 5]
§ Π_C = [ρ_C(t1,t1) ρ_C(t1,t2); ρ_C(t2,t1) ρ_C(t2,t2)] = [8 7; 3 5]
§ → ρ_B(τ_B, τ_C) = (q_B(t1), q_B(t2))^T [8 3; 7 5] (q_C(t1), q_C(t2)) = (q_B(t1), q_B(t2)) · (8 q_C(t1) + 3 q_C(t2), 7 q_C(t1) + 5 q_C(t2))
UTILITY FUNCTIONS AND DOT PRODUCTS
§ Let's consider a game with uniform payoffs: Π_B = Π_C = [1 1; 1 1]
§ ρ_B(τ_B, τ_C) = (q_B(t1), q_B(t2))^T [1 1; 1 1] (q_C(t1), q_C(t2)) = (q_B(t1), q_B(t2)) · (q_C(t1) + q_C(t2), q_C(t1) + q_C(t2)) = (q_B(t1), q_B(t2)) · (1, 1) = 1
§ Mixed strategy vectors (q_B(t1), q_B(t2)) lie on the segment between (1, 0) and (0, 1)
§ Dot product: w^T x = |w| |x| cos(θ_wx) → the scalar projection of w on x, scaled by x's length
§ If we take the previous matrix game instead, the projection direction and scaling factor (the vector x) change completely and depend on C's mixed strategy: x = (8 q_C(t1) + 3 q_C(t2), 7 q_C(t1) + 5 q_C(t2))
[Figure: shown for q_C(t1) = 0 (dashed vector, partial)]
PAYOFF FUNCTION AND NASH EQUILIBRIUM
A two-player symmetric game, with y1 = q_B(t1), y2 = q_C(t1), so q_B(t2) = 1 − y1 and q_C(t2) = 1 − y2:

ρ_B(τ_B, τ_C) = 8 q_B(t1) q_C(t1) + 3 q_B(t1) q_C(t2) + 7 q_B(t2) q_C(t1) + 5 q_B(t2) q_C(t2)
             = 8 y1 y2 + 3 y1 (1 − y2) + 7 (1 − y1) y2 + 5 (1 − y1)(1 − y2)
             = 3 y1 y2 − 2 y1 + 2 y2 + 5

§ A quadratic function in (y1, y2), the probabilities of playing strategy 1 for the two players
§ Resulting from the linear relations with the pure-strategy payoffs
PAYOFF FUNCTION AND NASH EQUILIBRIUM
With y1 = q_B(t1), y2 = q_C(t1):

ρ_B(τ_B, τ_C) = 3 y1 y2 − 2 y1 + 2 y2 + 5
ρ_C(τ_B, τ_C) = 3 y1 y2 + 2 y1 − 2 y2 + 5

§ Which strategy combinations (τ_B, τ_C) are pure NE?
§ If any, they correspond to the vertices, where both players adopt pure strategies
§ Two pure NE:
§ (y1 = 0, y2 = 0): ρ_B((0,1), (0,1)) = 5
§ (y1 = 1, y2 = 1): ρ_B((1,0), (1,0)) = 8
§ Which (τ_B, τ_C) are mixed NE?
§ They correspond to interior points
§ Equilibrium points: if player j sticks with its strategy, player k can't improve its utility by a small change in strategy → conditions on the partial derivatives
PAYOFF FUNCTION AND NASH EQUILIBRIUM
ρ_B(τ_B, τ_C) = 3 y1 y2 − 2 y1 + 2 y2 + 5,  ρ_C(τ_B, τ_C) = 3 y1 y2 + 2 y1 − 2 y2 + 5, with y1 = q_B(t1), y2 = q_C(t1)

§ ∂ρ_B(τ_B, τ_C)/∂y1 |_{τ_C = τ*_C} = 0
§ ∂ρ_C(τ_B, τ_C)/∂y2 |_{τ_B = τ*_B} = 0

3 y2 − 2 = 0, 3 y1 − 2 = 0 → y1 = 2/3, y2 = 2/3, and ρ_B(τ*_B, τ*_C) = ρ_C(τ*_B, τ*_C) = 6.33

§ Analyzing the Hessian, it turns out that the equilibrium point is a saddle
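A quick numeric check of the mixed equilibrium (y1 = y2 = 2/3 plugged into both payoff functions):

```python
# Mixed NE of the 2x2 game: solve d(rho_B)/dy1 = 3*y2 - 2 = 0 and
# d(rho_C)/dy2 = 3*y1 - 2 = 0, then evaluate both payoff functions.
y1, y2 = 2 / 3, 2 / 3

rho_B = 3 * y1 * y2 - 2 * y1 + 2 * y2 + 5
rho_C = 3 * y1 * y2 + 2 * y1 - 2 * y2 + 5
print(round(rho_B, 2), round(rho_C, 2))  # -> 6.33 6.33
```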
USING CALCULUS TO FIND NASH EQUILIBRIA
§ Why does imposing conditions on the partial derivatives make sense for finding mixed Nash equilibria?
§ In the example, we wanted to find (y1, y2) such that τ_B and τ_C form a NE: they are mutual best responses. To be best responses, each player aims to maximize its expected utility (given the strategy of the other, but let's first consider the general idea of maximizing the expected utility)
§ For player B, this means finding the value of y1 maximizing the function ρ_B : [0,1] × [0,1] → ℝ, ρ_B(τ_B, τ_C) = 3 y1 y2 − 2 y1 + 2 y2 + 5
§ The maximum can be on the frontier of the function's domain (which is a compact set) → in one of the vertices, which in turn correspond to both players playing pure strategies
§ Or it can be at interior points, which correspond to mixed strategies
§ Finding the optimizing point of a continuous function → finding the critical points, where the partial derivatives of the function vanish, indicating no local improvement
USING CALCULUS TO FIND NASH EQUILIBRIA
§ ∂ρ_B(τ_B, τ_C)/∂y1 |_{τ_C = τ*_C} = 0 → 3 y2 − 2 = 0 → y2 = 2/3
§ Any interior value of y1 is a candidate maximum as long as y2 = 2/3:
ρ_B(τ_B, τ_C) = 3 y1 y2 − 2 y1 + 2 y2 + 5 → 3 y1 (2/3) − 2 y1 + 2 (2/3) + 5 = 6.33
It doesn't matter which strategy y1 player B picks: B's utility is always 6.33 as long as player C plays y2 = 2/3 → any strategy of B is a best response
§ If player C doesn't play y2 = 2/3:
§ If he plays y2 > 2/3, then ∂ρ_B/∂y1 > 0 everywhere, and the best response for B is the pure strategy y1 = 1, to maximize his utility
§ If he plays y2 < 2/3, then ∂ρ_B/∂y1 < 0 everywhere, and the best response for B is the pure strategy y1 = 0
§ An identical reasoning can be done for player C. Putting all the results together, we reach the previous conclusions
SYMMETRIC TWO-PLAYER GAMES
§ Let's go back to our population games and consider again pairwise contest games
§ In a pairwise-contest population game, the payoff to a focal individual using mixed strategy τ in a population with profile y is: ρ(τ, y) = Σ_{t∈T} Σ_{t'∈T} q(t) y(t') ρ(t, t')
§ This payoff is the same that would be obtained in a two-player game against an opponent using a strategy τ' that assigns q'(t) = y(t), ∀t ∈ T
§ → We can always associate a two-player game with a population game involving a pairwise contest, and vice versa
§ In a monomorphic population all players play the same randomized profile, which corresponds to the population's one
§ → The pairwise-contest two-player games have a symmetric payoff matrix: ρ_F(t_j, t_k) = ρ_P(t_k, t_j), for all pure strategies t_j, t_k ∈ T of the Focal and Population players

Hawk-Dove bimatrix (F = focal / row, P = population / column):

          H                      D
  H   (W−D)/2, (W−D)/2       W, 0
  D   0, W                   W/2, W/2

Only row / F player payoffs:

          H          D
  H   (W−D)/2        W
  D   0              W/2
NASH EQUILIBRIUM IN SYMMETRIC GAMES
§ Since ρ_F(t_j, t_k) = ρ_P(t_k, t_j) for all pure strategies t_j, t_k ∈ T, we can drop the player superscript, referring always to the focal player → ρ(t_j, t_k)
§ If (τ*_F, τ*_P) is a mixed Nash equilibrium, then for any mixed strategy τ ∈ Σ:

ρ_F(τ*_F, τ*_P) ≥ ρ_F(τ, τ*_P)
ρ_P(τ*_F, τ*_P) ≥ ρ_P(τ*_F, τ)

§ In a symmetric game, like population-based pairwise contests, this becomes equivalent to: ρ(τ*, τ*) ≥ ρ(τ, τ*), ∀τ ∈ Σ
§ → The NE mixed strategy is a best response against itself! (in terms of strategy)
§ From Nash's theorem we know that each game has at least one mixed NE, which belongs to: c(τ) = {τ' ∈ Σ : τ'^T Π τ = max_{σ∈Σ} σ^T Π τ}, where c : Σ → Σ is the best response mapping
§ The existence of the Nash equilibrium can be derived precisely from the existence of a fixed point of the best response mapping
FROM NASH EQUILIBRIUM TO ESS
§ If ρ(τ*, τ*) > ρ(τ, τ*) for all τ ≠ τ*, the NE is strict
§ In a monomorphic population, at a strict NE any mutant strategy would have a lower fitness than τ*, and would not reproduce if rare in the population
§ If the NE is not strict, then there may be infinitely many mutant strategies that have the same fitness as the NE (because of the =), and they might reproduce
§ → A non-strict NE is not protected against a mutant invasion
§ → We need(ed) a different definition of equilibrium → ESS
§ ESS: a form of NE that is resistant to mutant invasions
§ An evolutionarily stable strategy is the best response to the population profile it generates
FORMAL RELATION BETWEEN ESS AND NASH
§ Theorem: let τ* be an ESS in a pairwise contest. Then ∀τ ≠ τ*, τ ∈ Σ, either:
1. ρ(τ*, τ*) > ρ(τ, τ*) (strict Nash equilibrium), OR
2. ρ(τ*, τ*) = ρ(τ, τ*) and ρ(τ*, τ) > ρ(τ, τ) (equally good mutants won't spread)
The first condition is a strict NE, which is inherently protected against mutant invasion. The second condition admits non-strict NE, but only when the mutant strategies that are as good as the NE strategy won't spread in the population, since they lose all contests against τ* focal individuals. Condition (2) in practice removes some (undesired) NE from consideration: there may be a NE in the two-player game with no corresponding ESS in the population game.
§ Theorem: conversely, if either (1) or (2) holds for each τ ≠ τ* in a two-player game, then τ* is an ESS in the corresponding population game
FINDING ESS
§ From the previous theorem, an alternative way to find an ESS:
§ Write down the associated two-player game
§ Find the symmetric Nash equilibria of the game
§ Test the NE using conditions 1 and 2
§ Any NE strategy τ* that passes these tests is an ESS, leading to a population profile y* = τ*
DO ALL GAMES HAVE AN ESS?
Rock-Paper-Scissors (row player's payoff listed first):

          R         P         S
  R     0, 0     −1, 1     1, −1
  P     1, −1     0, 0    −1, 1
  S    −1, 1     1, −1     0, 0

Unique symmetric NE: ((1/3, 1/3, 1/3), (1/3, 1/3, 1/3))

This mixed strategy is not an ESS; let's check the conditions:
1. ρ(τ*, τ*) > ρ(τ, τ*), or
2. ρ(τ*, τ*) = ρ(τ, τ*) and ρ(τ*, τ) > ρ(τ, τ)
E.g., for the mutant τ = R: ρ(τ*, τ*) = ρ(R, τ*), so (1) fails, and ρ(τ*, R) = 0 = ρ(R, R), so (2) fails too

§ Theorem: all generic two-action symmetric games have an ESS
§ Not all games have an ESS
§ Not all interesting / useful NE are ESS (e.g., Tit-For-Tat in the iterated prisoner's dilemma)

Generic two-action symmetric game:

         A        B
  A    a, a     b, c
  B    c, b     d, d

§ The theorem can be demonstrated by first applying an affine transformation of the payoffs, which leaves the NE unchanged: ρ'(t1, t2) = β ρ(t1, t2) + γ(t2)
§ (a, a) → (a − c, a − c); (b, c) and (c, b) → (0, 0); (d, d) → (d − b, d − b)
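A numeric check that the uniform RPS strategy violates both ESS conditions (a sketch; ρ here is the bilinear contest payoff):

```python
# Rock-Paper-Scissors: test the ESS conditions for tau* = (1/3, 1/3, 1/3)
# against the pure-Rock mutant.
Pi = [[0, -1, 1],
      [1, 0, -1],
      [-1, 1, 0]]  # row player's payoffs

def rho(tau, sigma):
    return sum(tau[j] * sigma[k] * Pi[j][k] for j in range(3) for k in range(3))

star = [1 / 3] * 3
rock = [1, 0, 0]
print(rho(star, star), rho(rock, star))  # equal -> condition 1 fails
print(rho(star, rock), rho(rock, rock))  # equal -> condition 2 fails: no ESS
```

Every pure mutant does exactly as well as τ* against τ* and against itself, so neither condition of the theorem holds: RPS has a symmetric NE but no ESS.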