Dynamics in Near-Potential Games. Ozan Candogan, Asu Ozdaglar, and Pablo Parrilo - PowerPoint PPT Presentation


SLIDE 1

Dynamics in Near-Potential Games

Ozan Candogan, Asu Ozdaglar, and Pablo Parrilo Laboratory for Information and Decision Systems Department of Electrical Engineering and Computer Science Massachusetts Institute of Technology

Innovations in Algorithmic Game Theory Workshop May 2011

SLIDE 2

Introduction

Motivating Example

Two games that are close (in terms of payoffs) may have significantly different limiting dynamics: for any small θ > 0, consider the two games Ĝ and G:

Ĝ:      A      B              G:      A      B
A     0, 1   0, 0             A     0, 1   0, 0
B     1, 0   θ, 2             B     1, 0   −θ, 2

The unique Nash equilibrium of Ĝ: (B, B). The unique Nash equilibrium of G: the mixed profile (2/3 A + 1/3 B, θ/(1+θ) A + 1/(1+θ) B).

We consider convergence of the sequence of pure strategy profiles generated by better-response dynamics (at any strategy profile, a player chosen at random unilaterally updates its strategy to one that yields a better payoff). For Ĝ, the sequence converges to the Nash equilibrium (B, B). For G, the sequence follows the better-response cycle (A,A), (B,A), (B,B), (A,B), hence it is not contained in any (pure) ε-equilibrium set for ε < 2.

SLIDE 3

Introduction

This Paper

Can we identify classes of games in which convergence of adaptive dynamics is robust to small perturbations or misspecifications of payoffs, i.e., in which limiting dynamics are contained in approximate equilibrium sets?

As is well known, in (finite) potential games many reasonable adaptive dynamics, including best-response and fictitious play dynamics, "converge" to a Nash equilibrium. [Monderer, Shapley 96, 96], [Young 93, 04]

Does this convergence behavior extend to near-potential games? Relatedly, for a given game, can we find a nearby potential game and use the distance between these games to obtain a quantitative measure of the size of the limiting approximate equilibrium set?

SLIDE 4

Introduction

Our Contributions

We study convergence properties of dynamics in finite strategic form games by exploiting their relation to close potential games. Our approach relies on using the potential function of a close potential game for the analysis of dynamics.
- We show that for a given game, we can find the "closest" potential game by solving a convex optimization problem.
- We show that many reasonable adaptive dynamics converge to an approximate equilibrium set, whose size is a function of the distance from a close potential game.
- For near-potential games, we obtain convergence to a small approximate equilibrium set.

SLIDE 5

Introduction

This Talk

We focus on three commonly studied update rules and show the following.
- Discrete time better-response dynamics: the sequence of pure strategy profiles converges to a pure ε-equilibrium set (ε is proportional to the distance of the game to a potential game).
- Discrete time fictitious play dynamics: the sequence of empirical frequencies converges to a neighborhood of a (mixed) equilibrium (the size of the neighborhood increases with the distance of the game to a potential game).
- Logit-response dynamics: the stochastically stable strategy profiles are pure ε-equilibria (ε is proportional to the distance of the game to a potential game).

SLIDE 6

Preliminaries

Preliminaries: Strategies and Nash Equilibrium

We consider finite strategic form games G = ⟨M, {E^m}_{m∈M}, {u^m}_{m∈M}⟩.
- M: set of players.
- E^m: set of strategies of player m; E = ∏_{m∈M} E^m: set of strategy profiles.
- u^m: payoff function of player m (u^m : E → R).
Notation: p ∈ E, p^−m ∈ E^−m = ∏_{k≠m} E^k.

Let ΔE^m denote the set of probability distributions on E^m. We refer to x^m ∈ ΔE^m as a mixed strategy of player m, and to a collection of mixed strategies x = {x^m}_m as a mixed strategy profile. A mixed strategy profile x is a mixed ε-(Nash) equilibrium if u^m(x^m, x^−m) ≥ u^m(y^m, x^−m) − ε for all m ∈ M and y^m ∈ ΔE^m. If ε = 0, then x is a Nash equilibrium. If x^m is degenerate for all m, i.e., it assigns probability 1 to a single strategy, then we refer to the strategy profile as a pure equilibrium (or pure ε-equilibrium).

SLIDE 7

Preliminaries

Preliminaries: Potential Games

A game G is an exact potential game if there exists a function φ : E → R such that

φ(p^m, p^−m) − φ(q^m, p^−m) = u^m(p^m, p^−m) − u^m(q^m, p^−m),

for all m ∈ M, p^m, q^m ∈ E^m, and p^−m ∈ E^−m.

Let γ = (p_0, ..., p_N) be a simple closed path (i.e., p_i and p_{i+1} differ in the strategy of only one player, and p_0 = p_N). Define I(γ) to be the total utility improvement along the path, i.e.,

I(γ) = Σ_{i=1}^{N} [u^{m_i}(p_i) − u^{m_i}(p_{i−1})],

where m_i denotes the player whose strategy differs between p_{i−1} and p_i.

Proposition (Monderer and Shapley)

A game is a potential game if and only if I(γ) = 0 for all simple closed paths γ.
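The cycle condition above is easy to check computationally for small games. Below is a minimal sketch applied to the game G from the motivating example and to an identical-interest game; the payoff encoding and function names are our own illustrative choices, not from the talk.

```python
# Checking the Monderer-Shapley cycle condition I(gamma) = 0 for 2x2
# games. Payoffs are nested lists u[row strategy][column strategy];
# this encoding is our own illustrative choice.

def improvement_sum(u1, u2, cycle):
    """Total utility improvement I(gamma) along a closed path whose
    consecutive profiles differ in one player's strategy."""
    total = 0.0
    for p, q in zip(cycle, cycle[1:]):
        if p[0] != q[0]:      # player 1 deviated
            total += u1[q[0]][q[1]] - u1[p[0]][p[1]]
        else:                 # player 2 deviated
            total += u2[q[0]][q[1]] - u2[p[0]][p[1]]
    return total

def is_potential_2x2(u1, u2, tol=1e-9):
    # In a 2x2 game there is a single simple closed path up to rotation
    # and direction, so one cycle test suffices.
    cycle = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
    return abs(improvement_sum(u1, u2, cycle)) < tol

theta = 0.1
u1 = [[0, 0], [1, -theta]]   # game G from the motivating example
u2 = [[1, 0], [0, 2]]
print(is_potential_2x2(u1, u2))   # False: G is not a potential game

c1 = [[1, 0], [0, 2]]             # identical-interest game
c2 = [[1, 0], [0, 2]]
print(is_potential_2x2(c1, c2))   # True
```

For games with more strategies one would enumerate all simple closed 4-cycles (pairs of players, pairs of strategies each), which is the standard reduction of the proposition.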

SLIDE 8

Near-Potential Games

Maximal Pairwise Difference

We adopt the following metric to measure the distance between games.

Definition (Maximal pairwise difference)

Let G and Ĝ be two games with set of players M, set of strategy profiles E, and utility functions {u^m} and {û^m}. The maximal pairwise difference (MPD) between these games is defined as

d(G, Ĝ) = max_{m, p, q : p^−m = q^−m} |(u^m(p) − u^m(q)) − (û^m(p) − û^m(q))|.

MPD captures how different two games are in terms of utility improvements due to unilateral deviations. We use the difference of utility improvements rather than the difference of utility values, since the former is a better representation of strategic similarities (equilibrium and dynamic properties) [Candogan, Menache, Ozdaglar, Parrilo 10]. We refer to games with small MPD to a potential game as near-potential games.
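The MPD is directly computable by enumerating unilateral deviations. The sketch below evaluates it for the two games of the motivating example, where the only differing deviation improvement gives d(G, Ĝ) = 2θ; the encoding and function name are our own.

```python
# Maximal pairwise difference (MPD) between two two-player games,
# applied to G-hat and G from the motivating example. The nested-list
# payoff encoding is our own illustrative choice.

def mpd_2player(u1, u2, v1, v2):
    """max over unilateral-deviation pairs of
    |(u(p) - u(q)) - (v(p) - v(q))|."""
    n1, n2 = len(u1), len(u1[0])
    best = 0.0
    # Player 1 deviations: p = (a, b) -> q = (a2, b).
    for b in range(n2):
        for a in range(n1):
            for a2 in range(n1):
                du = u1[a][b] - u1[a2][b]
                dv = v1[a][b] - v1[a2][b]
                best = max(best, abs(du - dv))
    # Player 2 deviations: p = (a, b) -> q = (a, b2).
    for a in range(n1):
        for b in range(n2):
            for b2 in range(n2):
                du = u2[a][b] - u2[a][b2]
                dv = v2[a][b] - v2[a][b2]
                best = max(best, abs(du - dv))
    return best

theta = 0.1
u1 = [[0, 0], [1, theta]];  u2 = [[1, 0], [0, 2]]   # G-hat
v1 = [[0, 0], [1, -theta]]; v2 = [[1, 0], [0, 2]]   # G
print(mpd_2player(u1, u2, v1, v2))   # 2 * theta = 0.2
```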

SLIDE 9

Near-Potential Games

Finding Close Potential Games

We consider the problem of finding the closest potential game to a given game, where the distance is measured in terms of the MPD. Potential games are characterized by linear equality constraints. Given a game with utility functions {u^m}, the closest potential game, with utility functions {û^m}, can be obtained by solving the following convex optimization problem:

min_{φ, {û^m}}  max_{m ∈ M, p ∈ E, q^m ∈ E^m}  |(u^m(q^m, p^−m) − u^m(p^m, p^−m)) − (û^m(q^m, p^−m) − û^m(p^m, p^−m))|

s.t.  φ(q̄^m, p̄^−m) − φ(p̄^m, p̄^−m) = û^m(q̄^m, p̄^−m) − û^m(p̄^m, p̄^−m),  for all m ∈ M, p̄ ∈ E, q̄^m ∈ E^m.

We study extensions to other norms and to weighted potential games in [Candogan, Ozdaglar, Parrilo 2010].
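Because the objective is a min-max of absolute values and the constraints are linear, the problem can be written as a linear program. The sketch below does this for the 2x2 game G from the motivating example; the use of scipy, the variable layout, and the helper names are our own assumptions, not from the talk.

```python
# A sketch of the closest-potential-game problem as an LP: introduce a
# scalar t bounding every deviation-improvement discrepancy, minimize t.
# Variable layout (our own choice): phi[0:4], hu1[4:8], hu2[8:12], t = x[12],
# with profile (a, b) mapped to index 2*a + b.

import numpy as np
from scipy.optimize import linprog

theta = 0.1
u1 = np.array([[0.0, 0.0], [1.0, -theta]])   # row player payoffs in G
u2 = np.array([[1.0, 0.0], [0.0, 2.0]])      # column player payoffs in G

nv = 13
idx = lambda off, a, b: off + 2 * a + b
A_eq, b_eq, A_ub, b_ub = [], [], [], []

def add_edge(phi_p, phi_q, hu_p, hu_q, D):
    # Potential condition: phi(p) - phi(q) = hu(p) - hu(q).
    row = np.zeros(nv)
    row[phi_p] += 1; row[phi_q] -= 1; row[hu_p] -= 1; row[hu_q] += 1
    A_eq.append(row); b_eq.append(0.0)
    # |D - (hu(p) - hu(q))| <= t, as two linear inequalities.
    for s in (1.0, -1.0):
        r = np.zeros(nv)
        r[hu_p] += s; r[hu_q] -= s; r[12] = -1.0
        A_ub.append(r); b_ub.append(s * D)

for b in range(2):   # player 1 deviations within column b
    add_edge(idx(0, 1, b), idx(0, 0, b), idx(4, 1, b), idx(4, 0, b),
             u1[1, b] - u1[0, b])
for a in range(2):   # player 2 deviations within row a
    add_edge(idx(0, a, 1), idx(0, a, 0), idx(8, a, 1), idx(8, a, 0),
             u2[a, 1] - u2[a, 0])

c = np.zeros(nv); c[12] = 1.0
bounds = [(None, None)] * 12 + [(0, None)]
res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
              A_eq=np.array(A_eq), b_eq=np.array(b_eq),
              bounds=bounds, method="highs")
print(res.fun)   # MPD distance of G to its closest potential game
```

For this game the optimum works out to (4 + θ)/4: the total improvement around the single better-reply cycle must be cancelled, spread evenly over its four deviations.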

SLIDE 10

Dynamics Discrete Time Better-Response Dynamics

Discrete Time Better-Response Dynamics – 1

We first focus on discrete time better-response dynamics: At each time step t, a single player is chosen at random (using a probability distribution with full support over the set of players). Suppose player m is chosen and r ∈ E is the current strategy profile. Player m updates its strategy to a strategy in {q^m ∈ E^m | u^m(q^m, r^−m) > u^m(r)}, chosen uniformly at random. We consider convergence of the sequence of generated pure strategy profiles {p_t}_{t=0}^∞, which we refer to as the trajectory of the dynamics.

In finite potential games, convergence of the trajectory to a Nash equilibrium is established using the fact that the potential strictly increases with each update.
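The contrast from the motivating example can be reproduced with a small simulation. The sketch below uses a simplified variant (we pick uniformly among all improving unilateral deviations rather than first picking a player); the encoding and helper names are our own, and the recurrent behavior on these two games is the same.

```python
# Better-response dynamics on the two games from the motivating example:
# in G-hat the trajectory is absorbed at (B, B); in G it cycles forever.
# Game encoding and the update variant are our own illustrative choices.

import random

def better_responses(u1, u2, profile):
    """Improving unilateral deviations from a 2x2 profile."""
    a, b = profile
    moves = []
    if u1[1 - a][b] > u1[a][b]:
        moves.append((1 - a, b))
    if u2[a][1 - b] > u2[a][b]:
        moves.append((a, 1 - b))
    return moves

def run(u1, u2, steps, seed=0):
    rng = random.Random(seed)
    p = (0, 0)                       # start at (A, A)
    visited = {p}
    for _ in range(steps):
        moves = better_responses(u1, u2, p)
        if moves:                    # pick an improving move at random
            p = rng.choice(moves)
            visited.add(p)
    return p, visited

theta = 0.1
ghat = ([[0, 0], [1, theta]], [[1, 0], [0, 2]])
g    = ([[0, 0], [1, -theta]], [[1, 0], [0, 2]])

print(run(*ghat, steps=100))   # ends absorbed at (1, 1), i.e., (B, B)
print(run(*g, steps=100))      # cycles through all four profiles
```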

SLIDE 11

Dynamics Discrete Time Better-Response Dynamics

Discrete Time Better Response Dynamics – 2

Theorem

Consider a game G and let Ĝ be a close potential game with d(G, Ĝ) = δ. In G, the trajectory of the better-response dynamics is contained in the pure ε-equilibrium set after finite time with probability 1, where ε = δ|E|.

Proof Sketch: The evolution of trajectories can be represented by a Markov chain: the set of states is the set of strategy profiles, and there is a nonzero transition probability from r to q if r and q differ in the strategy of a single player, say m, and q^m is a (strictly) better response of player m to r^−m. With probability 1, the trajectory converges to a recurrence class in finite time. For any transition between two states in the same recurrence class, we can construct a closed improvement path. Using the zero total utility improvement along this path in the close potential game, and the proximity of our game to the potential game, we can establish a bound on the utility improvement between any two states in this recurrence class.

SLIDE 12

Dynamics Discrete Time Fictitious Play Dynamics

Discrete Time Fictitious Play – 1

In fictitious play, agents form predictions about opponent strategies using the entire history of play: they forecast other players' strategies to be (independent) empirical frequency distributions.

Let 1(p^m_t = p^m) be the indicator function, which is equal to 1 if p^m_t = p^m, and 0 otherwise. The empirical frequency at time T with which player m uses strategy q^m is given by

μ^m_T(q^m) = (1/T) Σ_{t=0}^{T−1} 1(p^m_t = q^m).

Let μ^m_T denote the empirical frequency distribution (vector) of player m at time T. At each time instant t, every player m chooses a strategy p^m_t such that

p^m_t ∈ arg max_{q^m ∈ E^m} u^m(q^m, μ^−m_t).

It is known that empirical frequency distributions converge to the (mixed) equilibrium set in potential games. [Monderer and Shapley 96]
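This convergence is easy to observe in simulation. Below is a minimal fictitious play sketch on a 2x2 identical-interest coordination game (our own choice of game and encoding, not one from the talk); starting from a miscoordinated seed, the empirical frequencies converge to the pure equilibrium (B, B).

```python
# Minimal fictitious play on a 2x2 coordination game (an exact potential
# game). Payoffs are indexed u[own strategy][opponent strategy]; the
# encoding and helper names are our own illustrative choices.

def best_response(u, opp_freq):
    # Expected payoff of own strategies 0/1 against the opponent's
    # empirical frequency; ties broken toward strategy 1.
    vals = [u[s][0] * opp_freq[0] + u[s][1] * opp_freq[1] for s in (0, 1)]
    return 0 if vals[0] > vals[1] else 1

def fictitious_play(u1, u2, T, seed_profile=(0, 1)):
    counts1, counts2 = [0, 0], [0, 0]
    a, b = seed_profile               # one seed round of actual play
    counts1[a] += 1; counts2[b] += 1
    for t in range(1, T):
        f1 = [c / t for c in counts1]  # empirical frequency of player 1
        f2 = [c / t for c in counts2]
        a = best_response(u1, f2)
        b = best_response(u2, f1)
        counts1[a] += 1; counts2[b] += 1
    return [c / T for c in counts1], [c / T for c in counts2]

u1 = [[1, 0], [0, 2]]   # coordination game: both prefer matching,
u2 = [[1, 0], [0, 2]]   # with (B, B) the high-potential profile
f1, f2 = fictitious_play(u1, u2, T=2000)
print(f1, f2)           # both concentrate on strategy B
```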

SLIDE 13

Dynamics Discrete Time Fictitious Play Dynamics

Discrete Time Fictitious Play – 2

Observe that the evolution of the (joint) empirical frequency vector μ_t = {μ^m_t}_m can be represented by

μ_{t+1} = (t/(t+1)) μ_t + (1/(t+1)) I_t,

where I_t = {I^m_t}_{m∈M}, and I^m_t is a vector such that I^m_t(p^m) = 1(p^m_t = p^m).

Since μ^m_t, I^m_t ∈ ΔE^m for all m, this implies

μ_{t+1} − μ_t = (1/(t+1)) (I_t − μ_t) = O(1/t).

Thus, for large t, the change in the empirical frequency vector μ_t is small.

SLIDE 14

Dynamics Discrete Time Fictitious Play Dynamics

Improvement in Potential Function

Let M denote the number of players.

Lemma

Consider a game G and let Ĝ be a near-potential game with potential function φ and d(G, Ĝ) = δ. Assume that at time t > 0 the empirical frequency vector μ_t is outside the ε-equilibrium set of G for some ε > 0. Then

φ(μ_{t+1}) − φ(μ_t) ≥ (ε − Mδ)/(t + 1) + O(1/t²).

Implications: If μ_t is not in the ε-equilibrium set for some ε > Mδ, and t is sufficiently large, then the potential evaluated at the empirical frequencies increases when players update their strategies. Hence, the empirical frequency vector eventually reaches the ε-equilibrium set.

SLIDE 15

Dynamics Discrete Time Fictitious Play Dynamics

Proof Sketch

Consider the Taylor expansion of φ around μ_t. Since φ is a multilinear function of its argument, the first order terms in φ(μ_{t+1}) − φ(μ_t) have the form (recall p^m_t denotes the strategy of player m at time t)

[φ(p^m_t, μ^−m_t) − φ(μ^m_t, μ^−m_t)] / (t + 1).

Note that, by the definition of MPD,

φ(p^m_t, μ^−m_t) − φ(μ^m_t, μ^−m_t) ≥ u^m(p^m_t, μ^−m_t) − u^m(μ^m_t, μ^−m_t) − δ.

In fictitious play, players best respond to the empirical frequencies; thus the first order terms are
- at least −δ for each player,
- at least ε − δ for one player (since μ_t is not an ε-equilibrium).

Thus, φ(μ_{t+1}) − φ(μ_t) ≥ (ε − Mδ)/(t + 1) + O(1/t²).

SLIDE 16

Dynamics Discrete Time Fictitious Play Dynamics

Convergence to a Level Set of the Potential Function

Let X_α denote the set of mixed α-equilibria.

Theorem

Consider a game G and let Ĝ be a near-potential game with potential function φ and d(G, Ĝ) = δ. In G, the empirical frequency vector of fictitious play converges to

C = {x ∈ ∏_m ΔE^m | φ(x) ≥ min_{y ∈ X_{Mδ}} φ(y)}.

Proof sketch: Assume that t ≫ 0 and μ_t ∉ X_{Mδ+ε′} for some ε′ > 0. Then

φ(μ_{t+1}) − φ(μ_t) ≥ ε′/(t + 1) + O(1/t²) ≥ ε′/(2(t + 1)) > 0.

Thus X_{Mδ+ε′} is reached in finite time. After X_{Mδ+ε′} is reached, the potential cannot fall below min_{y ∈ X_{Mδ+ε′}} φ(y). The result follows by taking ε′ → 0.

SLIDE 17

Dynamics Discrete Time Fictitious Play Dynamics

Convergence to a Level Set of the Potential Function

If the original game is a potential game, δ = 0, and the above theorem implies convergence to

C = {x ∈ ∏_m ΔE^m | φ(x) ≥ min_{y ∈ X_0} φ(y)},

i.e., the set of strategy profiles whose potential is (weakly) larger than the minimum potential attained at an equilibrium. If the game has multiple equilibria (with different potential function values), this result is not tight: in potential games, dynamics converge to equilibria. But we can strengthen it using the structure of approximate equilibrium sets.

SLIDE 18

Dynamics Discrete Time Fictitious Play Dynamics

Approximate Equilibrium Sets

Observation: Consider a game with finitely many equilibria. For small ε, ε-equilibrium sets are contained in disjoint neighborhoods of the equilibria.

        O      F
O     3, 2   0, 0
F     0, 0   2, 3

Figure: Payoffs in battle of the sexes. Equilibria: (i) (O, O), (ii) (F, F), (iii) (0.6 O + 0.4 F, 0.4 O + 0.6 F).

SLIDE 19

Dynamics Discrete Time Fictitious Play Dynamics

Approximate Equilibrium Sets

Figure: Approximate equilibrium sets in BoS, plotted over (probability the row player uses O, probability the column player uses O): ε = 0.2 (upper left), ε = 0.3 (upper right), ε = 0.4 (bottom).
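The plotted sets can be reproduced by testing the ε-equilibrium condition pointwise over mixed profiles. The sketch below does this for battle of the sexes; the helper names are our own, and we use a tiny ε in place of exactly zero to absorb floating-point error.

```python
# Pointwise epsilon-equilibrium test in battle of the sexes, mirroring
# the plotted sets. Payoffs u[own strategy][opponent strategy] over
# (O, F); helper names are our own illustrative choices.

def regret(u, x, y):
    """Best pure-strategy payoff minus current payoff, for own mix x
    against opponent mix y."""
    vals = [u[s][0] * y[0] + u[s][1] * y[1] for s in (0, 1)]
    cur = x[0] * vals[0] + x[1] * vals[1]
    return max(vals) - cur

def is_eps_equilibrium(p, q, eps):
    # p, q: probabilities that the row / column player use O.
    u_row = [[3, 0], [0, 2]]
    u_col = [[2, 0], [0, 3]]
    return (regret(u_row, [p, 1 - p], [q, 1 - q]) <= eps and
            regret(u_col, [q, 1 - q], [p, 1 - p]) <= eps)

# The three equilibria pass the test for any eps >= 0 (up to rounding).
for p, q in [(1, 1), (0, 0), (0.6, 0.4)]:
    print((p, q), is_eps_equilibrium(p, q, eps=1e-9))

print(is_eps_equilibrium(0.5, 0.5, eps=0.1))   # False: regret is 0.25
```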

SLIDE 20

Dynamics Discrete Time Fictitious Play Dynamics

Upper Semicontinuity of the Equilibrium Correspondence

This is due to the upper semicontinuity of the approximate equilibrium mapping:

Definition

A correspondence g : X ⇒ Y is upper semicontinuous at x*, if for any open neighborhood V of g(x*) there exists a neighborhood U of x* such that g(x) ⊂ V for all x ∈ U.

Lemma

Let g : R ⇒ ∏_{m∈M} ΔE^m be the correspondence such that g(α) = X_α. This correspondence is upper semicontinuous.

Upper semicontinuity at α = 0 implies that ε-equilibrium sets with small ε are contained in disjoint neighborhoods of equilibria.

SLIDE 21

Dynamics Discrete Time Fictitious Play Dynamics

Size of Approximate Equilibrium Sets

Consider a game with finitely many equilibria x_1, ..., x_l. Let f(α) capture the size of the neighborhood of the equilibria that contains the α-equilibria of the game:

f(α) = max_{x ∈ g(α) = X_α}  min_{k ∈ {1,...,l}} ||x − x_k||.

By definition f is weakly increasing, and it satisfies f(0) = 0. By upper semicontinuity of g and Berge's theorem, f is upper semicontinuous and f(α) → 0 as α → 0.
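The quantity f(α) can be estimated numerically on a grid of mixed profiles. Below is a sketch for battle of the sexes; the grid resolution, helper names, and the grid-based approximation itself are our own choices, so this gives an estimate rather than the exact value of f.

```python
# Grid estimate of f(alpha) for battle of the sexes: the largest
# distance from a grid point in the alpha-equilibrium set to the
# nearest equilibrium. Encoding and helper names are our own.

import math

u_row = [[3, 0], [0, 2]]
u_col = [[2, 0], [0, 3]]
equilibria = [(1.0, 1.0), (0.0, 0.0), (0.6, 0.4)]

def regret(u, x, y):
    vals = [u[s][0] * y[0] + u[s][1] * y[1] for s in (0, 1)]
    return max(vals) - (x[0] * vals[0] + x[1] * vals[1])

def f_estimate(alpha, n=200):
    worst = 0.0
    for i in range(n + 1):
        for j in range(n + 1):
            p, q = i / n, j / n
            if (regret(u_row, [p, 1 - p], [q, 1 - q]) <= alpha and
                    regret(u_col, [q, 1 - q], [p, 1 - p]) <= alpha):
                d = min(math.hypot(p - e[0], q - e[1]) for e in equilibria)
                worst = max(worst, d)
    return worst

print(f_estimate(0.0), f_estimate(0.2))   # f(0) = 0, f increasing in alpha
```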

SLIDE 22

Dynamics Discrete Time Fictitious Play Dynamics

Convergence Result

Let L denote the Lipschitz constant of the mixed extension of φ.

Theorem

Consider a game G and let Ĝ be a close potential game with d(G, Ĝ) = δ. Assume that G has finitely many equilibria. There exist some δ̄ > 0 and ε̄ > 0 such that if δ < δ̄, then in G the empirical frequency vector of fictitious play converges to

{x : ||x − x_k|| ≤ 4f(Mδ)ML/ε + f(Mδ + ε), for some equilibrium x_k},

for any ε with 0 < ε ≤ ε̄. If G is a potential game, i.e., δ = 0, then letting ε → 0, convergence to an equilibrium follows.

SLIDE 23

Dynamics Discrete Time Fictitious Play Dynamics

Proof Sketch

Step 1: We can provide a lower bound on the increase in the potential function outside an approximate equilibrium set in terms of the length of the path traveled outside this set.

Assume that μ_t leaves the (Mδ + ε)-equilibrium set at some time T and returns to it at some T′ > T. As established before, when μ_t is outside this equilibrium set,

φ(μ_{t+1}) − φ(μ_t) ≥ ε/(t + 1) + O(1/t²) > ε/(2t).

Since ||μ_{t+1} − μ_t|| = O(1/t), it follows that, for some constant c,

φ(μ_{T′}) − φ(μ_T) ≥ Σ_{t=T}^{T′} ε/(2t) ≥ εc ||μ_{T′} − μ_T||.

SLIDE 24

Dynamics Discrete Time Fictitious Play Dynamics

Proof Sketch

Step 2: Using the previous bound and the fact that approximate equilibrium sets are contained in disjoint neighborhoods of equilibria, we can establish that μ_t can visit the approximate equilibrium set infinitely often only around one equilibrium. For small δ and ε, the (Mδ + ε)-equilibrium set of the game is contained in disjoint neighborhoods of the equilibria of the game. Assume μ_T is in the neighborhood of equilibrium x_k, and μ_{T′} is in the neighborhood of x_{k′}. Then between T and T′ the potential increases significantly, and μ_t cannot revisit the neighborhood of x_k.

SLIDE 25

Dynamics Discrete Time Fictitious Play Dynamics

Proof Sketch

Step 3: We can now get a bound on how far μ_t can travel away from the approximate equilibrium set component (around one equilibrium). Assume μ_t leaves the component of the (Mδ + ε)-equilibrium set (in the neighborhood of x_{k′}) at time T_1, reaches up to distance d from this set, and returns to it at time T_2. As before, the change in the potential is proportional to the length of the path traveled outside the (Mδ + ε)-equilibrium set: for some constant c, φ(μ_{T_2}) − φ(μ_{T_1}) > cεd. Since μ_{T_2} and μ_{T_1} are both in the neighborhood of x_{k′}, it follows that φ(μ_{T_2}) − φ(μ_{T_1}) ≤ 2f(Mδ + ε)L, where L is the Lipschitz constant of φ. Thus d ≤ 2f(Mδ + ε)L/(cε).

We can obtain a tighter bound on d by considering different approximate equilibrium sets for the upper and lower bounds on φ(μ_{T_2}) − φ(μ_{T_1}): d < 4f(Mδ)ML/ε.

SLIDE 26

Dynamics Discrete Time Fictitious Play Dynamics

Proof Sketch

Since the (Mδ + ε)-equilibrium set is contained in the f(Mδ + ε)-neighborhood of x_{k′}, we establish convergence to the r = d + f(Mδ + ε) = 4f(Mδ)ML/ε + f(Mδ + ε) neighborhood of the equilibria.

SLIDE 27

Dynamics Logit-Response Dynamics

Logit-Response Dynamics

We finally consider logit-response dynamics: At each time step t, a single player is chosen at random (using a probability distribution with full support over the set of players). Suppose player m is chosen and r ∈ E is the current strategy profile. Player m chooses q^m ∈ E^m with probability

P^m_τ(q^m | r) = exp((1/τ) u^m(q^m, r^−m)) / Σ_{p^m ∈ E^m} exp((1/τ) u^m(p^m, r^−m)),

where τ is a fixed smoothing parameter that determines how often players choose their best responses. We can associate a Markov chain with the evolution of the strategy profiles. This Markov chain has a unique stationary distribution, denoted by μ_τ. We refer to a strategy profile q such that lim_{τ→0} μ_τ(q) > 0 as a stochastically stable strategy profile.
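The update rule above is small enough to verify numerically. Below is a sketch that builds the 4-state logit-response chain for a 2x2 identical-interest game (our own choice of game, encoding, and parameter τ) and checks that its stationary distribution is the Gibbs distribution of the potential, μ_τ(p) proportional to exp(φ(p)/τ), which is the explicit form available in exact potential games.

```python
# Stationary distribution of logit response in a 2x2 exact potential
# game, compared with the Gibbs distribution exp(phi/tau)/Z. The game,
# tau, and the power-iteration setup are our own illustrative choices.

import math

phi = [[1.0, 0.0], [0.0, 2.0]]        # potential = common payoff
tau = 0.5
states = [(a, b) for a in (0, 1) for b in (0, 1)]

def logit(vals):
    w = [math.exp(v / tau) for v in vals]
    s = sum(w)
    return [x / s for x in w]

def step_matrix():
    # Each step: pick a player with probability 1/2, who logit-updates.
    P = {s: {t: 0.0 for t in states} for s in states}
    for a, b in states:
        p1 = logit([phi[0][b], phi[1][b]])   # player 1 revises
        p2 = logit([phi[a][0], phi[a][1]])   # player 2 revises
        for a2 in (0, 1):
            P[(a, b)][(a2, b)] += 0.5 * p1[a2]
        for b2 in (0, 1):
            P[(a, b)][(a, b2)] += 0.5 * p2[b2]
    return P

P = step_matrix()
pi = {s: 0.25 for s in states}
for _ in range(5000):                  # power iteration to stationarity
    pi = {t: sum(pi[s] * P[s][t] for s in states) for t in states}

Z = sum(math.exp(phi[a][b] / tau) for (a, b) in states)
gibbs = {(a, b): math.exp(phi[a][b] / tau) / Z for (a, b) in states}
print(max(abs(pi[s] - gibbs[s]) for s in states))   # near zero
```

As τ → 0 the Gibbs weights concentrate on the potential maximizer (here (B, B)), which is the stochastic stability statement on this slide.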

SLIDE 28

Dynamics Logit-Response Dynamics

Logit-Response Dynamics – 2

In exact potential games:
- μ*_τ can be written explicitly in terms of the potential function.
- Stochastically stable strategy profiles are those that maximize the potential.

In near-potential games:
- We can obtain a bound on the difference of μ_τ from μ*_τ in terms of the distance to a potential game.
- Stochastically stable strategy profiles are pure ε-equilibria, where ε is proportional to the distance to a potential game.

The key idea of the proof is a novel multiplicative perturbation result for Markov chains.

SLIDE 29

Conclusions and Future Work

Conclusions

We presented a framework for studying the convergence behavior of adaptive dynamics in finite strategic-form games by using closeness to potential games. This framework allows convergence properties of potential games to be extended to near-potential games. The approach is applicable to various other update processes (such as continuous time update rules).

Future Directions:
- Dynamics in "near" zero-sum and supermodular games.
- Designing incentive mechanisms for guaranteeing desirable limiting behavior in near-potential games.
- Approximate characterization of limiting behavior in potential games where players use heterogeneous update rules.
