Multi-agent learning: Emergence of Conventions, Gerard Vreeswijk



  1. Multi-agent learning: Emergence of Conventions. Gerard Vreeswijk, Intelligent Systems Group, Computer Science Department, Faculty of Sciences, Utrecht University, The Netherlands. Last modified on March 29th, 2011 at 11:53.

  2. Motivation

  3. Simple example of a Markov process
     • Return probabilities are usually omitted in diagrams.
     • In this case it can be derived that, on average, P(Sun) = 6/7 and P(Rain) = 1/7.
     • How? We'll see...

  4. Plan for today
     1. Markov processes. (Ergodic process, communicating states/class, transient state/class, recurrent state/class, periodic state/class, absorbing state, irreducible process, stationary distribution.)
        Compute stationary distributions:
        • Solve n linear equations.
        • Compare n so-called z-trees (Freidlin and Wentzell, 1984).
     2. Perturbed Markov processes. (Regular perturbed Markov process, punctuated equilibrium, stochastically stable state.)
        Compute stochastically stable states:
        • Compare k so-called z-trees, where k is the number of so-called recurrent classes (Peyton Young, 1993).

  5. Plan for today
     3. Applications.
        • Emergence of a currency standard.
        • Competing technologies: operating system A vs. operating system B.
        • Competing technologies: cell phone company A vs. cell phone company B. (If time allows.)
        • Schelling's model of segregation (1969).

  6. Part 1: Markov processes

  7. State transitions

  8. Communication classes

  9. Start state matters

  10. Start state matters... but here it does not

  11. The stationary distribution (and computing one)

      $$P(A) = P(A \mid A')\,P(A') + P(A \mid B')\,P(B') + P(A \mid C')\,P(C') + P(A \mid D')\,P(D')$$

      Let us assume that visiting probabilities are stationary ($A = A'$, $B = B'$, ...):

      $$\begin{aligned}
      P(A) &= P(A \mid A)\,P(A) + P(A \mid B)\,P(B) + P(A \mid C)\,P(C) + P(A \mid D)\,P(D) \\
           &= 0 \cdot P(A) + 0 \cdot P(B) + 1 \cdot P(C) + 0 \cdot P(D) \\
           &= P(C)
      \end{aligned}$$

      Let us write this as $A = C$. Similarly, $B = 0.8A$, $C = D$, and $D = 0.2A + B$. Four equations with four unknowns. (Always regular, i.e. $\det \neq 0$?)
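The four equations can be solved exactly with a few lines of Python. The sketch below is only an illustration: the transition matrix `P` is read off the slide's equations (A → B with 0.8, A → D with 0.2, B → D, D → C, C → A), and the helper name `stationary` is made up for this example. As for the slide's question: the balance equations on their own are linearly dependent (each row of P sums to 1, so the columns of P − I sum to zero), so the sketch assumes one of them is replaced by the normalisation constraint that the probabilities sum to 1.

```python
from fractions import Fraction as F

# Transition matrix read off the slide's equations (rows = from, columns = to).
# State order: A, B, C, D.
P = [
    [F(0), F(4, 5), F(0), F(1, 5)],  # A -> B with 0.8, A -> D with 0.2
    [F(0), F(0), F(0), F(1)],        # B -> D
    [F(1), F(0), F(0), F(0)],        # C -> A
    [F(0), F(0), F(1), F(0)],        # D -> C
]


def stationary(P):
    """Solve pi = pi P together with sum(pi) = 1 by Gauss-Jordan elimination."""
    n = len(P)
    # Balance equation for state j: sum_i pi_i * (P[i][j] - [i == j]) = 0.
    rows = [[P[i][j] - (1 if i == j else 0) for i in range(n)] + [F(0)]
            for j in range(n)]
    # The n balance equations are linearly dependent, so the last one is
    # replaced by the normalisation constraint sum(pi) = 1.
    rows[-1] = [F(1)] * n + [F(1)]
    for col in range(n):
        # Find a row with a non-zero entry in this column and move it up.
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [x / rows[col][col] for x in rows[col]]
        # Clear this column in every other row.
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[col])]
    return [rows[i][n] for i in range(n)]


pi = stationary(P)
print(dict(zip("ABCD", pi)))
```

For this chain the result is A = C = D = 5/19 and B = 4/19, which agrees with the relations A = C, B = 0.8A and C = D above.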

  12. Theory of discrete Markov processes
      Definitions:
      • Stationary distribution: fixed point of transition probabilities.
      • Empirical distribution: long run normalised frequency of visits.
      • Limit distribution: long run probability to visit a node.
      • Process is path-dependent: empirical distribution depends on start state. Ergodic otherwise.
      • Class is recurrent: process cannot escape. Transient otherwise.
      • Process is irreducible: all states can reach each other.
      Facts:
      • Node is recurrent: process will return to it a.s.
      • If finite number of states:
        – At least one recurrence class.
        – If precisely one recurrence class then ergodic, and conversely.
      • Stationary distribution always exists. Unique iff ergodic. In that case, stationary distr. ≡ empirical distr.
      • If ergodic and a-periodic, then stationary distr. ≡ limit distr.
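The class definitions above can be checked mechanically on a small chain. The following Python sketch is an illustration under assumptions, not part of the slides: the four-state matrix `P` is hypothetical, the function names are made up, and it relies on the fact that in a finite chain a communicating class is recurrent exactly when the process cannot escape from it.

```python
def reachable(P, s):
    """States reachable from s in zero or more positive-probability steps."""
    seen, stack = {s}, [s]
    while stack:
        u = stack.pop()
        for v, p in enumerate(P[u]):
            if p > 0 and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def classify(P):
    """Return the communicating classes, each labelled recurrent or transient."""
    n = len(P)
    reach = [reachable(P, s) for s in range(n)]
    classes, assigned = [], set()
    for s in range(n):
        if s in assigned:
            continue
        # s and t communicate if each can reach the other.
        cls = frozenset(t for t in reach[s] if s in reach[t])
        assigned |= cls
        # Finite chain: the class is recurrent iff nothing outside it is reachable.
        closed = reach[s] <= cls
        classes.append((sorted(cls), "recurrent" if closed else "transient"))
    return classes


# Hypothetical chain: state 0 is transient, {1, 2} is a periodic recurrent
# class, and state 3 is absorbing.
P = [
    [0.5, 0.25, 0.0, 0.25],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]

classes = classify(P)
print(classes)
```

This example has two recurrent classes, so by the facts above it is not ergodic: where the process ends up depends on the start state and on the first move out of state 0.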

  13. Finding stationary distributions with many states is difficult
      • Solve n equations in n unknowns. What if S is large?

      $$\begin{pmatrix}
      0.1 & 0.2 & 0.0 & 0.1 & 0.0 & 0.1 & 0.0 & 0.3 & 0.0 & 0.2 \\
      0.1 & 0.2 & 0.0 & 0.1 & 0.0 & 0.1 & 0.0 & 0.3 & 0.0 & 0.2 \\
      0.1 & 0.2 & 0.0 & 0.1 & 0.0 & 0.1 & 0.0 & 0.3 & 0.0 & 0.2 \\
      0.0 & 0.1 & 0.1 & 0.2 & 0.0 & 0.1 & 0.0 & 0.3 & 0.0 & 0.2 \\
      0.5 & 0.2 & 0.0 & 0.1 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.2 \\
      0.1 & 0.2 & 0.0 & 0.1 & 0.0 & 0.1 & 0.0 & 0.3 & 0.0 & 0.2 \\
      0.0 & 0.1 & 0.1 & 0.2 & 0.0 & 0.1 & 0.0 & 0.3 & 0.0 & 0.2 \\
      0.1 & 0.2 & 0.0 & 0.1 & 0.0 & 0.1 & 0.0 & 0.3 & 0.0 & 0.2 \\
      0.3 & 0.1 & 0.2 & 0.0 & 0.1 & 0.0 & 0.0 & 0.0 & 0.3 & 0.0 \\
      0.1 & 0.2 & 0.0 & 0.1 & 0.0 & 0.1 & 0.0 & 0.3 & 0.0 & 0.2
      \end{pmatrix}$$

      • Freidlin & Wentzell (1984): only look at so-called state trees.

  14. An irreducible (and finite) Markov process

  15. One possible A-tree

  16. Another possible A-tree

  17. A perhaps easier way to compute the stationary distribution
      • An s-tree, $T_s$, is a complete collection of disjoint paths from states $\neq s$ to $s$.
      • The likelihood of an s-tree $T_s$, written $\ell(T_s)$, $=_{\text{Def}}$ the product of its edge probabilities.
      • The likelihood of a state $s$, written $\ell(s)$, $=_{\text{Def}}$ the sum of the likelihoods of all s-trees.
      Theorem (Freidlin & Wentzell, 1984). Let P be an irreducible finite Markov process. Then, for all states, the likelihood of that state is proportional to the stationary probability of that state.
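The theorem can be tried out by brute force on a small chain. The Python sketch below is illustrative only. It assumes the four-state chain implied by the equations on the stationary-distribution slide (A → B with 0.8, A → D with 0.2, B → D, D → C, C → A), and it reads an s-tree as a choice of exactly one outgoing edge for every state other than s such that following the edges always ends in s. The function names are made up for this example, and enumerating all edge choices is only feasible for very small state spaces.

```python
from fractions import Fraction as F
from itertools import product

# State order: A, B, C, D (rows = from, columns = to).
P = [
    [F(0), F(4, 5), F(0), F(1, 5)],
    [F(0), F(0), F(0), F(1)],
    [F(1), F(0), F(0), F(0)],
    [F(0), F(0), F(1), F(0)],
]


def leads_to_root(nxt, s, root):
    """Follow the chosen edges from s; True if the walk reaches root without cycling."""
    seen = set()
    while s != root:
        if s in seen:
            return False
        seen.add(s)
        s = nxt[s]
    return True


def state_likelihood(P, root):
    """Sum, over all root-trees, of the product of their edge probabilities."""
    n = len(P)
    others = [s for s in range(n) if s != root]
    # Every non-root state picks one outgoing edge with positive probability.
    choices = [[t for t in range(n) if t != s and P[s][t] > 0] for s in others]
    total = F(0)
    for succ in product(*choices):
        nxt = dict(zip(others, succ))
        if all(leads_to_root(nxt, s, root) for s in others):
            weight = F(1)
            for s, t in nxt.items():
                weight *= P[s][t]
            total += weight
    return total


v = [state_likelihood(P, s) for s in range(len(P))]
mu = [x / sum(v) for x in v]
print(dict(zip("ABCD", mu)))
```

For this chain the state likelihoods are 1, 4/5, 1 and 1, which normalise to 5/19, 4/19, 5/19 and 5/19. That is the same distribution obtained by solving the four linear equations directly.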

  18. Counting s-trees with Freidlin & Wentzell: example
      Freidlin & Wentzell (1984):

      $$\mu(s) = \frac{v(s)}{\sum_{t \in S} v(t)}, \quad \text{where } v(s) =_{\text{Def}} \sum_{T \in \mathcal{T}_s} \ell(T)$$

      The unique C-tree is coloured red. Computing $\ell(T_C) = 10\epsilon \cdot 1/4 \cdot \ldots = 5\epsilon^3/12$. Similarly:

      State: A B C D E F G
      $\epsilon^2/24$, $5\epsilon^3/9$, $5\epsilon^3/12$, $5\epsilon^2/24$, $\epsilon^2/24$
      Distribution: $\epsilon/48$, $\epsilon/32$

      Note what happens if $\epsilon \to 0$.

  19. Part 2: Perturbed Markov processes

  20. Motivation

  21. Most Markov processes are path-dependent (non-ergodic)

  22. Make them ergodic by perturbing with $\epsilon^{r(s,s')}$ here and there

  23. Compute s-trees from $P^0$-recurrent classes only (!)

  24. Compute s-trees from $P^0$-recurrent classes only (!)

  25. Class {B, D, E} possesses lowest stochastic potential, viz. 4.
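The smallest possible case makes the idea of stochastic potential concrete. The sketch below is a hypothetical two-state illustration, not the seven-state example from the slides: states X and Y are both absorbing in the unperturbed process, and the perturbation lets the process leave X with probability ε² (resistance 2) and leave Y with probability ε (resistance 1). These resistances and the function name are assumptions chosen for the example. The only X-tree is the edge Y → X (resistance 1) and the only Y-tree is the edge X → Y (resistance 2), so X has the lower stochastic potential.

```python
def two_state_stationary(eps, r_xy=2, r_yx=1):
    """Stationary distribution of a two-state chain perturbed with eps**resistance.

    For eps = 0 both X and Y would be absorbing (a path-dependent process).
    For eps > 0 the chain leaves X with probability eps**r_xy and leaves Y
    with probability eps**r_yx, which makes it ergodic.
    """
    a = eps ** r_xy  # P(X -> Y)
    b = eps ** r_yx  # P(Y -> X)
    # Balance equation pi_X * a = pi_Y * b, together with pi_X + pi_Y = 1.
    return b / (a + b), a / (a + b)


for eps in (0.1, 0.01, 0.001):
    pi_x, pi_y = two_state_stationary(eps)
    print(f"eps={eps}: pi_X={pi_x:.4f}, pi_Y={pi_y:.4f}")
```

Here the stationary probability of X equals 1/(1 + ε), so it tends to 1 as ε → 0. Under these assumptions the state with the lowest stochastic potential is the stochastically stable one.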

  26. Example of $P^0$ and $P^\epsilon$

      $$\begin{pmatrix}
      0.0 & 0.2 & 0.2 & 0.1 & 0.5 \\
      0.3 & 0.0 & 0.1 & 0.1 & 0.5 \\
      0.1 & 0.2 & 0.2 & 0.0 & 0.5 \\
      0.7 & 0.1 & 0.2 & 0.0 & 0.0 \\
      0.1 & 0.2 & 0.2 & 0.0 & 0.5 \\
      0.0 & 0.0 & 0.1 & 0.0 & 0.9
      \end{pmatrix}
      = \lim_{\epsilon \to 0}
      \begin{pmatrix}
      0.0 & 0.2 & 0.2 & 0.1 & 0.5 \\
      0.3 & \epsilon^7 & 0.1 & 0.1 & 0.5 - \epsilon^7 \\
      0.1 & 0.2 & 0.2 & 0.0 & 0.5 \\
      0.7 & 0.1 & 0.2 & 0.0 & 0.0 \\
      0.1 & 0.2 & 0.2 - \epsilon^2/2 & \epsilon^2 & 0.5 - \epsilon^2/2 \\
      0.0 & 0.0 & 0.1 & 0.0 & 0.9
      \end{pmatrix}$$

      • Notice that some $P^0$-positive probabilities "have to give way" in order to perturb $P^0$-zero probabilities with $\epsilon$. (Because row probabilities must add up to 1.)
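The "give way" remark can be verified numerically. The Python sketch below assumes the following reading of the slide's two perturbed rows: in row 2, a zero entry becomes ε⁷ and the last entry gives up ε⁷; in row 5, a zero entry becomes ε² and two positive entries each give up ε²/2. The entries are taken as six rows of five numbers, as they appear on the slide, and the function name `perturbed` is made up for this example.

```python
import math

# Unperturbed rows as they appear on the slide (six rows of five entries).
P0 = [
    [0.0, 0.2, 0.2, 0.1, 0.5],
    [0.3, 0.0, 0.1, 0.1, 0.5],
    [0.1, 0.2, 0.2, 0.0, 0.5],
    [0.7, 0.1, 0.2, 0.0, 0.0],
    [0.1, 0.2, 0.2, 0.0, 0.5],
    [0.0, 0.0, 0.1, 0.0, 0.9],
]


def perturbed(eps):
    """Return the perturbed rows for a given eps (assumed reading of the slide)."""
    P = [row[:] for row in P0]
    # Row 2: a zero entry becomes eps**7; the last entry gives way.
    P[1][1] += eps ** 7
    P[1][4] -= eps ** 7
    # Row 5: a zero entry becomes eps**2; two positive entries share the loss.
    P[4][3] += eps ** 2
    P[4][2] -= eps ** 2 / 2
    P[4][4] -= eps ** 2 / 2
    return P


P_eps = perturbed(0.1)
# Every row still sums to 1, and eps = 0 recovers the unperturbed rows.
print(all(math.isclose(sum(row), 1.0, abs_tol=1e-12) for row in P_eps))
print(perturbed(0.0) == P0)
```

Both checks print `True`: the perturbation moves probability mass within a row without changing the row total.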
