Multi-agent learning

The replicator dynamic

Gerard Vreeswijk, Intelligent Software Systems, Computer Science Department, Faculty of Sciences, Utrecht University, The Netherlands.

Wednesday 10th June, 2020

Topics of today

■ Symmetric games, symmetric Nash equilibria in symmetric games, asymmetric Nash equilibria in symmetric games.
■ Evolutionary game theory: proportions, fitness, average fitness.
■ The replicator dynamic.
  • Fitness vector, score-matrix (= grand table if species are reply rules).
  • The continuous replicator equation.
  • The discrete replicator equation.
  • The discrete step equation.
  • Derivation of the DRE from the DSE.
  • Derivation of the CRE from the DRE.
  • Properties of the replicator dynamic, connection with Nash equilibria.

Symmetric games in normal form

Ways to denote the payoff matrix of a symmetric game

Symmetric game: only your actions matter, not your role. For 2-player games it is not important whether you are the row or the column player. The same game can be written in three ways:

■ As a bi-matrix:

          A1       A2       A3
    A1   1, 1     2, 3    −4, 6
    A2   3, 2     0, 0     8, −7
    A3   6, −4   −7, 8     5, 5

■ As a partially filled bi-matrix (symmetry makes the lower triangle redundant):

          A1       A2       A3
    A1   1, 1     2, 3    −4, 6
    A2            0, 0     8, −7
    A3                     5, 5

■ As a plain matrix (row player's payoffs only):

          A1   A2   A3
    A1     1    2   −4
    A2     3    0    8
    A3     6   −7    5

Hawk vs. Dove

Symmetric normal-form games

■ Example. Hawk-dove game (share V or threaten [possibly fight: −C]). In general (plain matrix, row player's payoffs), and instantiated with V = 2, C = 6 (as a bi-matrix):

           H          D                     H        D
    H   (V − C)/2     V              H   −2, −2    2, 0
    D       0        V/2             D    0, 2     1, 1

  Other instantiations: prisoner's dilemma, chicken (= hawk-dove), matching pennies, stag hunt.

■ Definition. A game is symmetric when players have equal actions and payoffs:

    ui(a1, . . . , ai, . . . , aj, . . . , an) = uj(a1, . . . , aj, . . . , ai, . . . , an)

  for all i and j. So a 2-player game G = (A, B) is symmetric iff m = n and B = Aᵀ.
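The B = Aᵀ condition is easy to check mechanically. Below is a minimal sketch (Python with numpy assumed; not part of the original slides) that rebuilds the bi-matrix of the 3 × 3 example above from its plain matrix:

    import numpy as np

    # Plain matrix of the 3x3 example above: row player's payoffs.
    A = np.array([[ 1,  2, -4],
                  [ 3,  0,  8],
                  [ 6, -7,  5]])

    # In a symmetric 2-player game the column player's payoffs are B = A^T,
    # so the plain matrix determines the full bi-matrix.
    B = A.T

    # Profile (A3, A1): row earns A[2, 0], column earns B[2, 0] = A[0, 2].
    print(A[2, 0], B[2, 0])   # 6 -4, the bi-matrix entry "6, -4"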

Symmetric equilibrium

■ Definition. Let p be a strategy in an n-player symmetric game. If the n-vector (p, . . . , p) is a NE, then p is called a symmetric equilibrium.
■ !! Symmetric equilibria can be identified with strategies !!
■ (Theorem.) Every symmetric game has at least one symmetric equilibrium.
■ (Fact.) Symmetric games can have asymmetric equilibria. For example Hawk-Dove:

           H        D
    H   −2, −2    2, 0
    D    0, 2     1, 1

  Two asymmetric equilibria, (H, D) and (D, H), and one symmetric equilibrium in which both players play H with probability 1/3.
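To see where the 1/3 comes from, here is a minimal sketch (Python with numpy assumed; not part of the original slides) that solves the indifference condition for the hawk-dove instance above:

    import numpy as np

    # Hawk-dove with V = 2, C = 6: row player's payoffs.
    A = np.array([[-2., 2.],
                  [ 0., 1.]])

    # In a symmetric mixed equilibrium (p, 1 - p), Hawk and Dove must earn
    # the same expected payoff against it:
    #   p*A[0,0] + (1-p)*A[0,1] = p*A[1,0] + (1-p)*A[1,1].
    p = (A[1, 1] - A[0, 1]) / (A[0, 0] - A[1, 0] - A[0, 1] + A[1, 1])
    print(p)   # 0.333...: both players play Hawk with probability 1/3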

Evolutionary game theory

Evolutionary game theory: the idea

■ There are n, say 5, species. An encounter between individuals of different species yields payoffs for both. For row, these payoffs are recorded in a 5 × 5 score matrix A = (aij) over the species s1, . . . , s5.
■ The population consists of a very large number of individuals, each playing a pure strategy. Individuals interact randomly.
■ We are interested in the proportions: p = (p1, . . . , p5).
■ The fitness of species i is fi = ∑j pj aij = (Ap)i.
■ The average fitness is f̄ = ∑i pi fi = pᵀAp.

The replicator equation

History of the replicator equation

■ Defined for a single species by Taylor and Jonker (1978), and named by Schuster and Sigmund (1983): "Several evolutionary models in distinct biological fields—population genetics, population ecology, early biochemical evolution and sociobiology—lead independently to the same class of replicator dynamics."
■ The replicator equation is the first game dynamics studied in connection with evolutionary game theory (as developed by Maynard Smith and Price).

Taylor, P.D., Jonker, L. "Evolutionarily stable strategies and game dynamics". Math. Biosci. 1978;40(1), pp. 145-156.
Schuster, P., Sigmund, K. "Replicator dynamics". J. Theor. Biol. 1983;100(3), pp. 533-538.

The replicator equation

■ The replicator equation models how n different species grow (or decline) due to mutual interaction.
■ It is assumed that if an individual of species i interacts with an individual of species j, the expected reward for the individual of type i is a constant aij. These rewards are summarised in a relative score matrix:

        ( a11  · · ·  a1n )
    A = (  ⋮     ⋱     ⋮  )
        ( an1  · · ·  ann )

Proportions

■ The number of individuals of species i is denoted by qi, or qi(t).
■ pi =Def qi/q is the proportion of species i, where q = q1 + · · · + qn.
■ So pi ∝ qi and p1 + · · · + pn = 1.

Fitness

■ The fitness of an individual is its expected reward when it encounters a random individual in the population.
■ Example. Suppose

        ( 1 3 1 )             ( 0.1 )
    A = ( 1 2 3 )   and   p = ( 0.4 ) .
        ( 4 1 3 )             ( 0.5 )

  The fitness vector f can now be computed as follows:

    f = Ap = ( 1 3 1 ) ( 0.1 )   ( 1.8 )
             ( 1 2 3 ) ( 0.4 ) = ( 2.4 ) .
             ( 4 1 3 ) ( 0.5 )   ( 2.3 )

■ Average fitness: f̄(t) = ∑i pi fi(t) = p · (Ap) = 2.29.
■ Fitness of species 1: f1 = ∑j pj a1j = (Ap)1 = 1.8. So species 1 does worse than average.
■ Species 2 and 3 have fitness 2.4 and 2.3, respectively.
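The same numbers can be reproduced mechanically. A minimal sketch, assuming Python with numpy (not part of the original slides):

    import numpy as np

    # Relative score matrix and proportions from the example above.
    A = np.array([[1., 3., 1.],
                  [1., 2., 3.],
                  [4., 1., 3.]])
    p = np.array([0.1, 0.4, 0.5])

    f = A @ p        # fitness vector: f_i = (Ap)_i
    f_bar = p @ f    # average fitness: p.(Ap) = p^T A p

    print(f)         # [1.8 2.4 2.3]
    print(f_bar)     # 2.29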


The continuous replicator equation

The continuous replicator equation has an extremely intuitive reading:

    ṗi(t) = pi(t) [ fi(t) − f̄(t) ],

where ṗi(t) is shorthand for the change of pi in time: ṗi(t) = p′i(t) = dpi(t)/dt.

Example 1. Suppose the proportion of species 7 at time t is p7(t) = 0.2, the fitness of species 7 at time t is f7(t) = 6, and the average fitness at time t is f̄(t) = 9. How fast does p7 grow at time t?

Answer. ṗ7(t) = p7(t) [ f7(t) − f̄(t) ] = 0.2 (6 − 9) = −0.6.

Example 2. Suppose p5(t) = 0.2, f5(t) = 6, and f̄(t) = 4. Same question.

Answer. ṗ5(t) = p5(t) [ f5(t) − f̄(t) ] = 0.2 (6 − 4) = 0.4.

The dynamics of the replicator equation

Relative score matrix

        ( 1 3 1 )                             ( 1/3 )
    A = ( 1 2 3 ) ,   start proportions   p = ( 1/3 ) .
        ( 4 1 3 )                             ( 1/3 )
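The trajectories this slide plots can be traced numerically. A minimal sketch, assuming Python with numpy and a simple forward-Euler discretisation of the CRE (the step size dt = 0.01 is an arbitrary choice, not from the slides):

    import numpy as np

    # Score matrix and start proportions from the slide above.
    A = np.array([[1., 3., 1.],
                  [1., 2., 3.],
                  [4., 1., 3.]])
    p = np.array([1/3, 1/3, 1/3])

    dt = 0.01
    for _ in range(10_000):   # forward-Euler steps of the CRE
        f = A @ p             # fitness vector f = Ap
        f_bar = p @ f         # average fitness p^T A p
        p = p + dt * p * (f - f_bar)

    print(p, p.sum())         # end of the trajectory; proportions still sum to 1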

Phase space of the replicator on the previous page

Circled rest points indicate Nash equilibria of the score-matrix, interpreted as the payoff matrix of a symmetric game in normal form.

A replicator dynamic in a higher dimension

Rest point, stable point, asymptotically stable point

The continuous replicator equation

    ṗi(t) = pi(t) [ fi(t) − f̄(t) ]

is a system of differential equations. We have p = (p1, . . . , pn) ∈ ∆n and ṗ = (ṗ1, . . . , ṗn) ∈ Rn. Definitions:

■ p is called a rest point if ṗ = 0. ("If at p, then stays at p.")
■ A rest point p is called (Lyapunov) stable if for every neighborhood U of p there is another neighborhood U′ of p such that states in U′, if iterated, remain within U. ("If close to p, then always close to p.")
■ A rest point p is called asymptotically stable if p has a neighborhood U such that all proportion vectors in U, if iterated, converge to p. ("If close to p, then convergence to p.")

Relation with Nash equilibria

State p is a Nash equilibrium:

    ∀q : q · (Ap) ≤ p · (Ap)   ⇔   ∀q : q · f ≤ p · f   ⇔   ∀q : q1 f1 + · · · + qn fn ≤ f̄.

If fi ≤ f̄ for all i, then it must be that fi = f̄ for every i with pi > 0 (check!), which means we have a rest point. Such a rest point is called saturated.

■ Nash equilibrium ⇔ saturated rest point. Proof. ⇒: take pure q. ⇐: if fi ≤ f̄ for all i, then no convex combination of those fi can exceed f̄.
■ Nash equilibrium ⇒ rest point. (Trivial.)
■ Fully mixed rest point ⇒ Nash equilibrium. (Because fully mixed implies saturated.)
■ Strict Nash equilibrium ⇒ asymptotically stable.
■ Limit point in the interior of ∆n ⇒ Nash equilibrium.
■ Asymptotically stable in the interior of ∆n ⇒ isolated trembling-hand perfect Nash equilibrium.
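These facts can be checked numerically on the 3 × 3 example used earlier. A minimal sketch (Python with numpy assumed; not part of the original slides) that locates the fully mixed rest point, at which every fitness equals the average, and which is therefore a Nash equilibrium:

    import numpy as np

    A = np.array([[1., 3., 1.],
                  [1., 2., 3.],
                  [4., 1., 3.]])

    # A fully mixed rest point solves (Ap)_1 = (Ap)_2 = (Ap)_3 with sum(p) = 1.
    M = np.array([A[0] - A[1],      # f_1 - f_2 = 0
                  A[1] - A[2],      # f_2 - f_3 = 0
                  [1., 1., 1.]])    # p_1 + p_2 + p_3 = 1
    p = np.linalg.solve(M, np.array([0., 0., 1.]))

    print(p)        # [2/11, 6/11, 3/11]: fully mixed, hence Nash
    print(A @ p)    # all entries 23/11: every species has average fitness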

Not all Nash equilibria are Lyapunov stable

(1, 0, 0) is Nash but not Lyapunov stable. (The picture is merely suggestive, since it only contains a few traces of the dynamics.)

The discrete replicator equation

The discrete step equation

■ The discrete step equation is given by

    qi(t + 1) =Def qi(t) [ 1 + β + fi(t) ],

  where 1 is the reproduction factor, β is the birth and death rate, and fi(t) indicates the percentage that is added or subtracted due to fitness.
■ To prevent negative proportions and sudden extinction, it is required that fitness, hence scores, be defined such that 1 + β + fi(t) > 0 for all t and i. This requirement is often left implicit in the literature.
■ The absolute growth of species i is

    ∆qi(t) = qi(t + 1) − qi(t) = qi(t)[1 + β + fi(t)] − qi(t) = qi(t)[β + fi(t)].
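A minimal sketch of one DSE step (Python with numpy assumed; the score matrix is the 3 × 3 example used earlier, and β = 0.05 is an arbitrary choice, not from the slides):

    import numpy as np

    A = np.array([[1., 3., 1.],
                  [1., 2., 3.],
                  [4., 1., 3.]])
    beta = 0.05

    def dse_step(q):
        p = q / q.sum()                    # current proportions
        f = A @ p                          # fitness of each species
        assert np.all(1 + beta + f > 0)    # the positivity requirement above
        return q * (1 + beta + f)          # q_i(t+1) = q_i(t)[1 + beta + f_i(t)]

    q = np.array([100., 400., 500.])       # absolute numbers of individuals
    print(dse_step(q))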

The discrete replicator equation

■ The discrete replicator equation:

    pi(t + 1) = pi(t) · (1 + β + fi(t)) / (1 + β + f̄(t)).

■ The idea of the discrete replicator equation: between two generations, an individual of type i produces 1 + β + fi(t) offspring. Here, β is the intrinsic birth and death ratio, and fi is the fitness dependent birth ratio.
■ The DRE follows from the discrete step equation:

    qi(t + 1) =Def qi(t) [ 1 + β + fi(t) ].

Derivation of the DRE from the DSE

    pi(t + 1) = qi(t + 1) / ∑j qj(t + 1)
              = qi(t)[1 + β + fi(t)] / ∑j qj(t)[1 + β + fj(t)]
              = ( (1/q(t)) qi(t)[1 + β + fi(t)] ) / ( (1/q(t)) ∑j qj(t)[1 + β + fj(t)] )
              = pi(t)[1 + β + fi(t)] / ∑j pj(t)[1 + β + fj(t)]
              = pi(t)[1 + β + fi(t)] / ( ∑j pj(t) + β ∑j pj(t) + ∑j pj(t) fj(t) )
              = pi(t)[1 + β + fi(t)] / ( 1 + β + f̄(t) ).

(All sums run over j = 1, . . . , n.)
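The derivation can also be confirmed numerically: normalising one DSE step on absolute counts gives exactly one DRE step on proportions. A minimal sketch (Python with numpy assumed; β and the populations are arbitrary choices):

    import numpy as np

    A = np.array([[1., 3., 1.],
                  [1., 2., 3.],
                  [4., 1., 3.]])
    beta = 0.05
    q = np.array([100., 400., 500.])
    p = q / q.sum()

    f = A @ p
    f_bar = p @ f

    q_next = q * (1 + beta + f)                          # DSE on counts
    p_via_dse = q_next / q_next.sum()                    # then normalise
    p_via_dre = p * (1 + beta + f) / (1 + beta + f_bar)  # DRE on proportions

    print(np.allclose(p_via_dse, p_via_dre))             # True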

SLIDE 116

Properties of the DRE


■ Claim. If a species is present, it was present and will be present forever. The same holds for absence.
■ Proof. Just look at the discrete replicator equation:

pi(t + 1) = pi(t) (1 + β + fi(t)) / (1 + β + f̄(t)),

and recall that 1 + β + fi(t) > 0 for all t and i, hence 1 + β + f̄(t) > 0 for all t. So all the pi are always multiplied by a positive number.

■ If pi was 0 it remains 0.
■ If pi was positive it remains positive.

SLIDE 119

If a species is absent, it will be absent forever


Phase space of a replicator. Notice that corners, edges, and the interior map into themselves. This is always the case.

SLIDE 124

Properties of the DRE


■ Claim. Species i grows if and only if pi(t) > 0 and it has above-average fitness.
■ Proof. The following inequalities are equivalent:

pi(t + 1) > pi(t)
pi(t) (1 + β + fi(t)) / (1 + β + f̄(t)) > pi(t)
1 + β + fi(t) > 1 + β + f̄(t), with pi(t) > 0
fi(t) > f̄(t), with pi(t) > 0.

■ Question. What if β is large? Answer. If β is large then the differences in growth among species are smaller, and the dynamics is slower (“bluer”).
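
A quick numerical check of this answer, with illustrative fitness values of my own (not from the slides): the same fitness gap moves the proportions less per step when β is larger, because the update ratio (1 + β + fi(t)) / (1 + β + f̄(t)) is pushed towards 1.

import numpy as np

f = np.array([0.3, 0.1])    # hypothetical fitnesses; species 0 is fitter
p = np.array([0.5, 0.5])

for beta in (0.1, 10.0):
    f_bar = p @ f
    p_next = p * (1 + beta + f) / (1 + beta + f_bar)
    print(beta, p_next[0] - p[0])   # one-step growth of the fitter species

With these numbers, the fitter species gains about 0.038 per step for β = 0.1, but only about 0.004 for β = 10.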

SLIDE 125

The continuous replicator equation


SLIDE 130

Derivation of the CRE from the DRE


Discrete step equation: qi(t + 1) = qi(t)[1 + β + fi(t)].

■ Idea: reduce time steps of size ∆t = 1 to smaller time steps of size ∆t = δ, where 0 ≤ δ ≤ 1.

■ The idea is the following: per small step of size δ, the largest part 1 − δ of species i remains unchanged, while a smaller part δ of species i does change:

qi(t + δ) = (1 − δ) qi(t) [remains] + δ qi(t)(1 + β + fi(t)) [changes].

■ What if δ = 0? What if δ = 1? (For δ = 0 nothing changes; for δ = 1 the discrete step equation is recovered.)

SLIDE 131

Derivation of the CRE from the DRE


We have

(qi(t + δ) − qi(t)) / δ = [(1 − δ) qi(t) + δ qi(t)(1 + β + fi(t)) − qi(t)] / δ
                        = δ qi(t)(β + fi(t)) / δ
                        = qi(t)(β + fi(t)).

So

q̇i = dqi(t)/dt = lim δ→0 (qi(t + δ) − qi(t)) / δ = lim δ→0 qi(t)(β + fi(t)) = qi(t)(β + fi(t)).

SLIDE 134

Derivation of the CRE from the DRE


pi(t + δ) = qi(t + δ) / ∑j qj(t + δ)

          = [(1 − δ) qi(t) + δ qi(t)(1 + β + fi(t))] / ∑j [(1 − δ) qj(t) + δ qj(t)(1 + β + fj(t))]

          = [(1 − δ) pi(t) + δ pi(t)(1 + β + fi(t))] / ∑j [(1 − δ) pj(t) + δ pj(t)(1 + β + fj(t))]     (divide above and below by q(t); this yields proportions)

          = pi(t)[1 + δ(β + fi(t))] / ∑j pj(t)[1 + δ(β + fj(t))]

          = pi(t) (1 + δ(β + fi(t))) / (1 + δ(β + f̄(t)))                                               (∑j pj(t) = 1).

■ What if δ = 0? What if δ = 1?

SLIDE 135

Derivation of the CRE from the DRE


Now

(pi(t + δ) − pi(t)) / δ = [ pi(t) (1 + δ(β + fi(t))) / (1 + δ(β + f̄(t))) − pi(t) ] / δ

  = pi(t) [ (1 + δ(β + fi(t))) / (1 + δ(β + f̄(t))) − 1 ] / δ

  = pi(t) [ (1 + δ(β + fi(t))) − (1 + δ(β + f̄(t))) ] / [ δ (1 + δ(β + f̄(t))) ]     (multiply above and below by 1 + δ(β + f̄(t)))

  = pi(t) (fi(t) − f̄(t)) / (1 + δ(β + f̄(t))).

So, writing C = β + f̄(t),

ṗi = lim δ→0 (pi(t + δ) − pi(t)) / δ
   = lim δ→0 pi(t) (fi(t) − f̄(t)) / (1 + δ C)
   = pi(t) (fi(t) − f̄(t)) / (1 + 0 · C)
   = pi(t)[ fi(t) − f̄(t) ].
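
In practice the continuous replicator equation can be integrated numerically. A minimal Python sketch using explicit Euler steps, with the fitness vector taken to be f = Ap as in the stationary-point example on the following slides; the 2×2 score matrix and the step size are my own illustrative choices, not from the slides:

import numpy as np

A = np.array([[1.0, 3.0],
              [2.0, 1.0]])    # hypothetical 2x2 score matrix
p = np.array([0.9, 0.1])      # initial proportions
dt = 0.01                     # Euler step size: crude but simple

for _ in range(2000):
    f = A @ p                 # fitness vector
    f_bar = p @ f             # average fitness
    p = p + dt * p * (f - f_bar)   # Euler step of dp_i/dt = p_i (f_i - f_bar)

print(p)   # tends to (2/3, 1/3), the interior rest point of this game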

SLIDE 136

Calculating stationary points of the replicator


SLIDE 148

Finding stationary points of the replicator: example


Consider the replicator with

A = | 6  1  6 |
    | 4 10  1 |      and p = (x, y, z)ᵀ.
    | 8  5  1 |

Stationary points (fixed points, rest points):

p ∈ ∆²,
x((Ap)x − p·(Ap)) = 0,
y((Ap)y − p·(Ap)) = 0,
z((Ap)z − p·(Ap)) = 0.

This is equivalent with

(x, y, z) ∈ ∆², i.e., x, y, z ∈ [0, 1] and x + y + z = 1,
x = 0 or 6x − 6x² + y − 5xy − 10y² + 6z − 14xz − 6yz − z² = 0,
y = 0 or 4x − 6x² + 10y − 5xy − 10y² + z − 14xz − 6yz − z² = 0,
z = 0 or 8x − 6x² + 5y − 5xy − 10y² + z − 14xz − 6yz − z² = 0.

Solving with Maple / Mathematica / SciPy / . . . gives the stationary points (on the original slide the Nash equilibria were marked in blue):

(1, 0, 0), (0, 1, 0), (0, 0, 1), (25/71, 20/71, 26/71), (5/7, 0, 2/7), (9/11, 2/11, 0).
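
As the slide suggests, the solving can be delegated to SciPy. A minimal sketch of my own (not from the slides) that verifies the reported interior rest point and recovers it from the equal-fitness conditions:

import numpy as np
from scipy.optimize import fsolve

A = np.array([[6, 1, 6],
              [4, 10, 1],
              [8, 5, 1]], dtype=float)

def replicator_rhs(p):
    f = A @ p
    return p * (f - p @ f)    # component-wise: p_i ((Ap)_i - p.(Ap))

# Verify the interior rest point (25/71, 20/71, 26/71) from the slide.
p_star = np.array([25.0, 20.0, 26.0]) / 71.0
print(np.allclose(replicator_rhs(p_star), 0.0))     # True

# Recover it: an interior rest point equalises all fitnesses on the simplex.
def interior_conditions(p):
    f = A @ p
    return [f[0] - f[1], f[1] - f[2], p.sum() - 1.0]

print(fsolve(interior_conditions, np.ones(3) / 3))  # approx. [0.352, 0.282, 0.366]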
SLIDE 149

Finding stationary points of the replicator: example


Phase space of the replicator as discussed. Circled rest points indicate Nash equilibria in the corresponding symmetric game.

SLIDE 150

Summary


SLIDE 151

Implications


[Implication diagram over SN, ESS, NSS, GSS, ASS, LSS, LR, NE and FP.] SN = strict Nash, ESS = evolutionarily stable strategy, NSS = neutrally stable strategy, GSS = globally stable state, ASS = asymptotically stable state, LSS = Lyapunov stable state, LR = limit of replicator, NE = Nash equilibrium, FP = fixed point; * = only if fully mixed, i = isolated NE. Dotted: indirect implication. Blue: game theory; olive: evolutionary game theory; green: the replicator dynamic.

SLIDE 152

Justifications of the implications


■ SN ⇒ ESS: cf. the slides on evolutionary games and, e.g., Th. 7.7.12 of Sh&LB.
■ ESS ⇒ NSS: cf. the slides on evolutionary games and, e.g., Game Theory Evolving (2nd ed.) by H. Gintis.
■ ESS ⇒ NE: cf. the slides on evolutionary games and, e.g., Sh&LB Th. 7.7.11.
■ ESS ⇒* GSS: cf., e.g., Th. 12.7 Gintis.
■ ESS ⇒ ASS: cf., e.g., Th. 7.7.13 Sh&LB, Th. 12.7 Gintis, Sec. 3.5 (begin) of Evolutionary Game Theory by J. W. Weibull.
■ NSS ⇒ LSS: cf. Sec. 3.5 Weibull.
■ GSS ⇒ ASS: by definition of the two concepts.
■ ASS ⇒ LSS: by definition of the two concepts.
■ ASS ⇒ LR: by definition of the two concepts.
■ ASS ⇒i NE: Th. 7.7.8 Sh&LB, Th. 12.6 Gintis.
■ LSS ⇒ NE: Th. 7.7.6 Sh&LB, Th. 7.2.1(c) Hofbauer & Sigmund.
■ LR ⇒* NE: Th. 7.2.1(b) H&S.
■ NE ⇒ FP: Th. 7.2.1(a) H&S, Th. 7.7.5 Sh&LB, Th. 12.6 Gintis.
■ LR ⇒ FP: Ch. 6 Weibull.

SLIDE 153

Sources referred to


Weibull, J. W. (1997). Evolutionary Game Theory. MIT Press.
Hofbauer, J., & Sigmund, K. (1998). Evolutionary Games and Population Dynamics. Cambridge University Press.
Gintis, H. (2001). Game Theory Evolving: A Problem-Centered Introduction to Modeling Strategic Interaction (2nd ed.). Princeton: Princeton University Press.
Shoham, Y., & Leyton-Brown, K. (2008). Multiagent Systems: Algorithmic, Game-Theoretic, and Logical Foundations. Cambridge University Press.
Vreeswijk, G. A. W. (2011). Evolutionary Game Theory. (Slides)