

SLIDE 1

Multi-agent learning

Emergence of Conventions

Gerard Vreeswijk, Intelligent Systems Group, Computer Science Department, Faculty of Sciences, Utrecht University, The Netherlands. Last modified on March 29th, 2011.

SLIDE 2

Motivation

SLIDE 3

Simple example of a Markov process

  • Return probabilities are usually omitted in diagrams.
  • In this case it can be derived that, on average, P(Sun) = 6/7 and P(Rain) = 1/7.
  • How? We’ll see . . .
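
The transition probabilities live in the (not reproduced) diagram; a minimal sketch that assumes P(Rain | Sun) = 0.1 and P(Sun | Rain) = 0.6, one choice that yields the 6 : 1 ratio, and recovers the stationary distribution numerically:

```python
import numpy as np

# Assumed transition matrix (rows = current state, columns = next state).
# States: 0 = Sun, 1 = Rain.  These numbers are NOT from the slide; they are
# one choice that reproduces the 6/7 vs 1/7 stationary distribution.
P = np.array([[0.9, 0.1],
              [0.6, 0.4]])

# Stationary distribution: left eigenvector of P for eigenvalue 1, normalised.
eigvals, eigvecs = np.linalg.eig(P.T)
pi = np.real(eigvecs[:, np.argmax(np.real(eigvals))])
pi /= pi.sum()

print(dict(zip(["Sun", "Rain"], pi)))   # approximately {'Sun': 0.857, 'Rain': 0.143}
```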

SLIDE 4

Plan for today

  • 1. Markov processes. (Ergodic process, communicating states/class, transient state/class, recurrent state/class, periodic state/class, absorbing state, irreducible process, stationary distribution.) Compute stationary distributions:
    • Solve n linear equations.
    • Compare n so-called z-trees (Freidlin and Wentzell, 1984).
  • 2. Perturbed Markov processes. (Regular perturbed Markov process, punctuated equilibrium, stochastically stable state.) Compute stochastically stable states:
    • Compare k so-called z-trees, where k is the number of so-called recurrent classes (Peyton Young, 1993).

SLIDE 5

Plan for today

  • 3. Applications.
    • Emergence of a currency standard.
    • Competing technologies: operating system A vs. operating system B.
    • Competing technologies: cell phone company A vs. cell phone company B. (If time allows.)
    • Schelling’s model of segregation (1969).

SLIDE 6

Part 1: Markov processes

SLIDE 7

State transitions

SLIDE 8

Communication classes

SLIDE 9

Start state matters

SLIDE 10

Start state matters. . . but here it does not

SLIDE 11

The stationary distribution (and computing one)

P(A) = P(A|A′)P(A′) + P(A|B′)P(B′) + P(A|C′)P(C′) + P(A|D′)P(D′)

Let us assume that visiting probabilities are stationary (A = A′, B = B′, . . . ):

    P(A) = P(A|A)P(A) + P(A|B)P(B) + P(A|C)P(C) + P(A|D)P(D)
         = 0 · P(A) + 0 · P(B) + 1 · P(C) + 0 · P(D)
         = P(C)

Let us write this as A = C. Similarly, B = 0.8A, C = D, and D = 0.2A + B. Four equations with four unknowns. (Always regular, i.e. Det ≠ 0?)
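
With the transition matrix read off from these four equations and the normalisation P(A) + P(B) + P(C) + P(D) = 1 added, the system can be solved directly; a minimal sketch:

```python
import numpy as np

# Transition matrix reconstructed from the equations on the slide
# (rows = current state A, B, C, D; columns = next state A, B, C, D).
P = np.array([
    [0.0, 0.8, 0.0, 0.2],   # A -> B (0.8), A -> D (0.2)
    [0.0, 0.0, 0.0, 1.0],   # B -> D
    [1.0, 0.0, 0.0, 0.0],   # C -> A
    [0.0, 0.0, 1.0, 0.0],   # D -> C
])

# Solve pi (P - I) = 0 together with sum(pi) = 1 as one overdetermined system.
n = P.shape[0]
A_sys = np.vstack([P.T - np.eye(n), np.ones(n)])
b_sys = np.concatenate([np.zeros(n), [1.0]])
pi, *_ = np.linalg.lstsq(A_sys, b_sys, rcond=None)

print(dict(zip("ABCD", pi.round(4))))   # A = C = D = 5/19, B = 4/19
```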

SLIDE 12

Theory of discrete Markov processes

Definitions:

  • Stationary distribution: fixed point of the transition probabilities.
  • Empirical distribution: long-run normalised frequency of visits.
  • Limit distribution: long-run probability to visit a node.
  • Process is path-dependent: empirical distribution depends on start state. Ergodic otherwise.
  • Class is recurrent: process cannot escape. Transient otherwise.
  • Process is irreducible: all states can reach each other.

Facts:

  • Node is recurrent: process will return to it a.s.
  • If finite number of states:
    – At least one recurrence class.
    – If precisely one recurrence class then ergodic, and conversely.
  • Stationary distribution always exists. Unique iff ergodic. In that case, stationary distr. ≡ empirical distr.
  • If ergodic and a-periodic, then stationary distr. ≡ limit distr.
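
A small sketch of the last fact, using a two-state period-2 chain (an assumed example, not from the slides): the stationary distribution exists, but the limit distribution does not.

```python
import numpy as np

# Period-2 chain: the process alternates deterministically between two states.
P = np.array([[0.0, 1.0],
              [1.0, 0.0]])

pi = np.array([0.5, 0.5])          # stationary: pi @ P == pi
print(pi @ P)                      # [0.5 0.5]

# But the limit distribution does not exist: starting in state 0,
# the t-step distribution keeps oscillating instead of converging.
d = np.array([1.0, 0.0])
for t in range(4):
    d = d @ P
    print(t + 1, d)                # (0,1) -> (1,0) -> (0,1) -> ...
```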

SLIDE 13

Finding stationary distributions with many states is difficult

  • Solve n equations in n unknowns. What if S is large? For example:

        0.1 0.2 0.0 0.1 0.0 0.1 0.0 0.3 0.0 0.2
        0.1 0.2 0.0 0.1 0.0 0.1 0.0 0.3 0.0 0.2
        0.1 0.2 0.0 0.1 0.0 0.1 0.0 0.3 0.0 0.2
        0.0 0.1 0.1 0.2 0.0 0.1 0.0 0.3 0.0 0.2
        0.5 0.2 0.0 0.1 0.0 0.0 0.0 0.0 0.0 0.2
        0.1 0.2 0.0 0.1 0.0 0.1 0.0 0.3 0.0 0.2
        0.0 0.1 0.1 0.2 0.0 0.1 0.0 0.3 0.0 0.2
        0.1 0.2 0.0 0.1 0.0 0.1 0.0 0.3 0.0 0.2
        0.3 0.1 0.2 0.0 0.1 0.0 0.0 0.0 0.3 0.0
        0.1 0.2 0.0 0.1 0.0 0.1 0.0 0.3 0.0 0.2

  • Freidlin & Wentzell (1984): only look at so-called state trees.
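
Before turning to state trees: numerically, one can also avoid the n × n linear system with plain power iteration, a different (purely numerical) route than the tree construction named above. A minimal sketch using the 10 × 10 matrix above:

```python
import numpy as np

P = np.array([
    [0.1, 0.2, 0.0, 0.1, 0.0, 0.1, 0.0, 0.3, 0.0, 0.2],
    [0.1, 0.2, 0.0, 0.1, 0.0, 0.1, 0.0, 0.3, 0.0, 0.2],
    [0.1, 0.2, 0.0, 0.1, 0.0, 0.1, 0.0, 0.3, 0.0, 0.2],
    [0.0, 0.1, 0.1, 0.2, 0.0, 0.1, 0.0, 0.3, 0.0, 0.2],
    [0.5, 0.2, 0.0, 0.1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.2],
    [0.1, 0.2, 0.0, 0.1, 0.0, 0.1, 0.0, 0.3, 0.0, 0.2],
    [0.0, 0.1, 0.1, 0.2, 0.0, 0.1, 0.0, 0.3, 0.0, 0.2],
    [0.1, 0.2, 0.0, 0.1, 0.0, 0.1, 0.0, 0.3, 0.0, 0.2],
    [0.3, 0.1, 0.2, 0.0, 0.1, 0.0, 0.0, 0.0, 0.3, 0.0],
    [0.1, 0.2, 0.0, 0.1, 0.0, 0.1, 0.0, 0.3, 0.0, 0.2],
])

# Power iteration: push a distribution through P until it stabilises.
pi = np.full(P.shape[0], 1.0 / P.shape[0])
for _ in range(1000):
    new = pi @ P
    if np.allclose(new, pi, atol=1e-12):
        break
    pi = new

print(pi.round(4))   # converges here because this chain has a single aperiodic recurrent class
```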

SLIDE 14

An irreducible (and finite) Markov process

SLIDE 15

One possible A-tree

SLIDE 16

Another possible A-tree

SLIDE 17

A perhaps easier way to compute the stationary distribution

  • An s-tree, Ts, is a complete collection of disjoint paths from states ≠ s to s.
  • The likelihood of an s-tree Ts, written ℓ(Ts), =Def the product of its edge probabilities.
  • The likelihood of a state s, written ℓ(s), =Def the sum of the likelihoods of all s-trees.

Theorem (Freidlin & Wentzell, 1984). Let P be an irreducible finite Markov process. Then, for all states, the likelihood of that state is proportional to the stationary probability of that state.
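
A brute-force reading of the theorem (exponential in the number of states, so toy sizes only; the Sun/Rain numbers are the assumed ones from the sketch on slide 3):

```python
import itertools
import numpy as np

def tree_likelihoods(P):
    """Freidlin & Wentzell / Markov chain tree theorem, brute force:
    for every root s, sum the products of edge probabilities over all
    spanning in-trees directed towards s."""
    n = len(P)
    likelihood = np.zeros(n)
    for root in range(n):
        others = [v for v in range(n) if v != root]
        # every non-root state gets exactly one outgoing edge ...
        for succ in itertools.product(range(n), repeat=len(others)):
            edge = dict(zip(others, succ))
            # ... and the choice is an s-tree iff every state reaches the root
            def reaches(v):
                seen = set()
                while v != root:
                    if v in seen:
                        return False
                    seen.add(v)
                    v = edge[v]
                return True
            if all(reaches(v) for v in others):
                likelihood[root] += np.prod([P[v][edge[v]] for v in others])
    return likelihood

# Example: the assumed Sun/Rain chain from the earlier sketch.
P = np.array([[0.9, 0.1],
              [0.6, 0.4]])
ell = tree_likelihoods(P)
print(ell / ell.sum())     # proportional to the stationary distribution: [6/7, 1/7]
```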

SLIDE 18

Counting s-trees with Freidlin & Wentzell: example

Freidlin & Wentzell (1984):

    µ(s) = v(s) / ∑_{t∈S} v(t),   where   v(t) =Def ∑_{T∈T_t} ℓ(T)

The unique C-tree is coloured red. Computing ℓ(T_C) = 10ǫ · 1/4 · . . . = 5ǫ³/12. Similarly:

    State:         A       B      C       D       E      F     G
    Distribution:  ǫ²/24   5ǫ³/9  5ǫ³/12  5ǫ²/24  ǫ²/24  ǫ/48  ǫ/32

Note what happens if ǫ → 0.

SLIDE 19

Part 2: Perturbed Markov processes

SLIDE 20

Motivation

SLIDE 21

Most Markov processes are path-dependent (non-ergodic)

SLIDE 22

Make them ergodic by perturbing with ǫ^{r(s,s′)} here and there

SLIDE 23

Compute s-trees from P0-recurrent classes only (!)

SLIDE 24

Compute s-trees from P0-recurrent classes only (!)

SLIDE 25

Class {B, D, E} possesses lowest stochastic potential, viz. 4.

SLIDE 26

Example of P0 and Pǫ

lim_{ǫ→0} Pǫ = P0:

    Pǫ:
        0.0   0.2          0.2   0.1   0.5
        0.3   ǫ⁷           0.1   0.1   0.5 − ǫ⁷
        0.1   0.2          0.2   0.0   0.5
        0.7   0.1          0.2   0.0   0.0
        0.1   0.2 − ǫ²/2   0.2   ǫ²    0.5 − ǫ²/2
        0.0   0.0          0.1   0.0   0.9

    P0:
        0.0   0.2   0.2   0.1   0.5
        0.3   0.0   0.1   0.1   0.5
        0.1   0.2   0.2   0.0   0.5
        0.7   0.1   0.2   0.0   0.0
        0.1   0.2   0.2   0.0   0.5
        0.0   0.0   0.1   0.0   0.9

  • Notice that some P0-positive probabilities “have to give way” when P0-zero probabilities are perturbed with ǫ. (Because row probabilities must add up to 1.)

SLIDE 27

Perturbed Markov processes

  • P0 is a Markov process on a finite state space S.
  • Let, for each ǫ ∈ (0, ǫ∗], Pǫ be a Markov process on the same state space.
  • The collection {Pǫ | ǫ ∈ (0, ǫ∗]} is a regular perturbation of P0 if
    1. Each Pǫ is ergodic.
    2. It holds that lim_{ǫ→0} Pǫ = P0.
    3. If Pǫ_{s,s′} > 0 for some ǫ > 0, then 0 < lim_{ǫ→0} Pǫ_{s,s′} / ǫ^{r(s,s′)} < ∞ for some r(s, s′) ≥ 0. This number is called the resistance from s to s′.
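
In practice the resistance of an entry can be read off as the order of ǫ at which it vanishes; a minimal numerical sketch (the entries are taken from the example Pǫ on the previous slide):

```python
import numpy as np

def resistance(p_eps, eps=1e-4):
    """Estimate r(s, s') as the exponent with which p_eps(eps) vanishes:
    r = lim log p_eps(eps) / log eps, and 0 for entries that stay positive."""
    value = p_eps(eps)
    if value <= 0.0:
        return float("inf")            # transition impossible for every eps
    return round(np.log(value) / np.log(eps))

# Entries taken from the example P-epsilon above.
print(resistance(lambda e: 0.3))             # 0  (P0-positive entry)
print(resistance(lambda e: e**7))            # 7
print(resistance(lambda e: e**2))            # 2
print(resistance(lambda e: 0.2 - e**2 / 2))  # 0
```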

SLIDE 28

Resistance

  1. Each Pǫ is ergodic.
  2. It holds that lim_{ǫ→0} Pǫ = P0.
  3. If Pǫ_{s,s′} > 0 for some ǫ > 0, then 0 < lim_{ǫ→0} Pǫ_{s,s′} / ǫ^{r(s,s′)} < ∞ for some r(s, s′) ≥ 0.
  4. For transitions s → s′ where P0_{s,s′} = Pǫ_{s,s′} = 0 the resistance is defined to be ∞.

Note:

  • The number r(s, s′) is well-defined!
  • If P0_{s,s′} > 0 then r(s, s′) = 0.
  • If r(s, s′) = 0 then P0_{s,s′} > 0.

SLIDE 29

Stochastic stability

  • Because each Pǫ is ergodic, the stationary distribution µǫ is uniquely defined, for every ǫ ∈ (0, ǫ∗].
  • It can be shown that lim_{ǫ→0} µǫ(s) exists for every s. Let us call this distribution µ0.
  • A state s is said to be stochastically stable if µ0(s) > 0.

Remarks:

  • It can be shown that µ0 is a stationary distribution of P0.
  • It follows that every regular perturbed Markov process possesses at least one stochastically stable state.

SLIDE 30

A way to compute stochastically stable states

  • Recurrent classes 1, . . . , K.
  • The resistance of a path from i to j =Def the sum of edge resistances. (Why the sum?)
  • Construct edges r_ij (between classes) with the minimum resistance from i to j.
  • The resistance of a j-tree Tj, written r(Tj), =Def the sum of edge resistances (in the class graph).
  • The stochastic potential of recurrence class j, written p(j), =Def the minimum resistance over all j-trees. (A brute-force sketch follows after the theorem below.)

Theorem (Young, 1993). Let {Pǫ | ǫ ∈ (0, ǫ∗]} be a regular perturbed Markov process, and let µǫ be the unique stationary distribution of Pǫ, ǫ > 0. Then

  • lim_{ǫ→0} µǫ = µ0 exists.
  • µ0 is a stationary distribution of P0.
  • The stochastically stable states are precisely those that are contained in the recurrent class(es) of P0 with minimum stochastic potential.
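
A minimal brute-force sketch of this recipe; the two-class resistance matrix uses the values 1 and 2 that appear later in the “Revisit earlier example” slide:

```python
import itertools

def stochastic_potentials(r):
    """Brute force over all j-trees on the class graph (Young, 1993):
    the potential of class j is the minimum, over in-trees directed
    towards j, of the summed edge resistances."""
    K = len(r)
    potential = [float("inf")] * K
    for root in range(K):
        others = [i for i in range(K) if i != root]
        for succ in itertools.product(range(K), repeat=len(others)):
            edge = dict(zip(others, succ))
            def reaches(i):
                seen = set()
                while i != root:
                    if i in seen:
                        return False
                    seen.add(i)
                    i = edge[i]
                return True
            if all(reaches(i) for i in others):
                cost = sum(r[i][edge[i]] for i in others)
                potential[root] = min(potential[root], cost)
    return potential

# Two recurrent classes with minimum resistances r(E1 -> E2) = 1 and r(E2 -> E1) = 2.
r = [[0, 1],
     [2, 0]]
print(stochastic_potentials(r))   # [2, 1]: E2 has minimum stochastic potential
```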

SLIDE 31

Minimum path resistance: example

  • Compute path resistance between all K recurrent classes.
  • With K recurrent classes there are always K(K − 1) minimum path resistances to be computed. (We work on K_K [unfortunate notation].) See the sketch below.

Example:

  • Suppose there are three recurrent classes E1, E2, and E3.
  • Minimum path resistances here are 1, 5, 6, 7, 8, 9.
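
Minimum path resistances are ordinary shortest paths once edge weights are resistances; a minimal Floyd-Warshall sketch (the 3 × 3 edge-resistance matrix is illustrative, not the one behind the slide’s numbers):

```python
def min_path_resistances(r):
    """Floyd-Warshall: minimum total resistance of a path between every pair."""
    n = len(r)
    d = [row[:] for row in r]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                d[i][j] = min(d[i][j], d[i][k] + d[k][j])
    return d

# Assumed single-edge resistances (inf = no direct perturbed transition).
INF = float("inf")
r = [
    [0,   2,   INF],
    [1,   0,   3],
    [INF, 1,   0],
]
for row in min_path_resistances(r):
    print(row)
```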

SLIDE 32

Nine j-trees generated by three recurrence classes

SLIDE 33

Revisit earlier example

  1. The unperturbed Markov process P0 possesses two recurrent classes, viz. E1 = {A} and E2 = {F, G}.
  2. Least resistance from E1 to E2 is 10ǫ · . . . = ǫ¹/32. Resistance 1.
  3. Least resistance from E2 to E1 is 1/3 · ǫ · . . . = ǫ²/24. Resistance 2.
  4. There is only one resistance tree to either side, hence one minimum resistance tree.
  5. Stochastic potential of E1 is 2; stochastic potential of E2 is 1.
  6. Conclusion: E2 is stochastically stable, E1 is not.

SLIDE 34

Part 3: Applications

SLIDE 35

Technology adoption

                              Other: Operating system A    Operating system B
    You: Operating system A          (a, a)                      (0, 0)
         Operating system B          (0, 0)                      (b, b)

Total number of players: n, for example n = 5
Sample size: s, for example s = 3
Total number of players currently playing A: m, for example m = 2

    P( individual chooses A | AABBB ) = 3 / (5 choose 3) = 3/10
    P( #A′s = k | AABBB ) = (5 choose k) · (3/10)^k · (7/10)^(5−k)

This process is path-dependent (non-ergodic): for example always BABBB, BABBB, etc. → BBBBB. With b ≫ a even BAABB, etc. → BBBBB.
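
A minimal simulation sketch of this revision process; it assumes that every player revises each round, samples s = 3 of the n = 5 current choices (whole population as pool), best-responds, and that a = b. The function name and revision scheme are illustrative.

```python
import random

def revise(profile, s=3, a=1.0, b=1.0):
    """All players revise at once: each samples s current choices and
    best-responds (choose A if a * #A in sample >= b * #B in sample)."""
    new = []
    for _ in profile:
        sample = random.sample(profile, s)
        n_a = sample.count("A")
        new.append("A" if a * n_a >= b * (s - n_a) else "B")
    return "".join(new)

random.seed(1)
profile = "AABBB"
for _ in range(10_000):
    if profile in ("AAAAA", "BBBBB"):
        break
    profile = revise(profile)
print(profile)   # locks into one convention; which one depends on the sample path
```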

SLIDE 36

Idiosyncratic play in technology adoption

“How, then, might institutional change occur? Because best-response play renders both conventions absorbing states, it is clear that in order to understand institutional change, some kind of nonbest-response play must be introduced. Suppose there is a probability ǫ that when individuals are in the process of updating, each may switch their type for idiosyncratic reasons. Thus, 1 − ǫ represents the probability that the individual pursues the best-response updating process described above. The idiosyncratic play accounting for nonbest responses need not be irrational or odd; it simply represents actions whose reasons are not explicitly modeled. Included is experimentation, whim, error, and intentional acts seeking to affect game outcomes but whose motivations are not captured by the above game.”

From Microeconomics: Behavior, Institutions, and Evolution (Bowles, 2003).

SLIDE 37

The tipping effect

Total number of players: n, for example n = 5
Sample size: s, for example s = 3
Total number of players currently playing A: m, for example m = 2

Suppose a = b. Let

    E1 = { s ∈ S | s ∼ AAAAA } = {AAAAA}
    T1 = { s ∈ S | s ∼ AAAAB } = {AAAAB, AAABA, . . . , BAAAA}
    T2 = { s ∈ S | s ∼ AAABB } = {AAABB, AABAB, . . . , BBBAA}
    T3 = { s ∈ S | s ∼ ABBBB }
    E2 = { s ∈ S | s ∼ BBBBB }

  • How many idiosyncratic transitions must be made to move from E1 to E2?
  • What is the resistance from E1 to E2? From E2 to E1?
  • What is (are) the stochastically stable state(s)?

SLIDE 38

Tipping point (general case)

  • Suppose we’re in all-B.
  • Generally, an individual will choose A when a·k ≥ b(s − k), i.e. when k ≥ bs/(a + b). (Why “≥” instead of “>”?)
  • Thus, ⌈bs/(a + b)⌉ idiosyncratic choices (ok, errors) must be made to move from BBBBB . . . into the first transient class that, without further idiosyncrasies, leads to AAAAA (see the numeric sketch below).
  • With probability ǫ of idiosyncratic choice, the probability of this happening is (ǫ/2)^⌈bs/(a+b)⌉. Indeed ǫ/2, if we assume that idiosyncrasy is uniformly distributed among A and B: in that case, half of the idiosyncratic choices are counter-productive again!
  • With this payoff matrix, the Pareto-optimal outcome is favoured, provided s is large enough.
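
A small numeric sketch of the threshold (the payoff values a, b and the sample size s are illustrative):

```python
from math import ceil

def tipping_threshold(a, b, s):
    """Idiosyncratic choices needed before a best-responder leaves all-B for A:
    the smallest k with a*k >= b*(s - k), i.e. ceil(b*s / (a + b))."""
    return ceil(b * s / (a + b))

a, b, s = 2.0, 1.0, 9                        # illustrative payoffs: A is Pareto-superior
escape_all_B = tipping_threshold(a, b, s)    # errors needed to tip all-B towards A
escape_all_A = tipping_threshold(b, a, s)    # errors needed to tip all-A towards B
print(escape_all_B, escape_all_A)            # 3 6: leaving all-A takes more errors,
                                             # so the all-A convention is stochastically stable
```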

SLIDE 39

Part 4: Schelling’s model of segregation

SLIDE 40

Schelling’s model in 2D (torus)

SLIDE 41

Schelling’s model in 1D (circle)

  • Schelling (1969, 1971, 1978).
  • Isolated people are discontent. (Other people are content.)
  • Possible swaps:

        Trade      Profit
        DD → CC      2
        DC → CC      1
        CD → DC      0
        CC → CD     −1
        CC → DD     −2

  • This “problem” can be “solved” in “hundreds” of ways. (Analytically, stochastically, whatever.)

SLIDE 42

Young’s take on Schelling’s model

  • Possible trades:

        Trade      Profit      Probability
        DD → CC    2 − 2m      high
        DC → CC    1 − 2m      high
        CD → DC    0 − 2m      low: ǫ^a
        CC → CD   −1 − 2m      lower: ǫ^b
        CC → DD   −2 − 2m      lowest: ǫ^c

    where 0 < a < b < c, and m are the moving costs.

  • The resulting Markov process is ergodic and regular.

SLIDE 43

Recurrent classes are just {{a} | a ∈ Absorbing }

  1. To determine all recurrent classes of P0:
     • All absorbing states A are recurrent.
     • If not in an absorbing state, then a mutually advantageous swap is possible. Thus, if not in an absorbing state, then transient state. Therefore, all and only recurrent classes are singletons of absorbing states: R = {{a} | a ∈ A}.

SLIDE 44

Completely segregated vs. dispersed states

  • Absorbing states are either completely segregated or dispersed: A = S ∪ D.
  • For each s, s′ ∈ A, let r(s, s′) be defined as usual.

Claims:

  1. If s ∈ D, there does not exist an s-tree from A\{s} with only a-edges.
  2. If s ∈ S, there does exist an s-tree from A\{s} with only a-edges.
  3. The classes with lowest potential are L = {{s} | s ∈ S}.

SLIDE 45

Claim 1

If s ∈ D, there does not exist an s-tree from A\{s} with only a-edges.

SLIDE 46

Claim 2: a resistance

If s ∈ S, there does exist an s-tree from A\{s} with only a-edges.

SLIDE 47

Claim 2: a resistance, discontent individual (no problem)

If s ∈ S, there does exist an s-tree from A\{s} with only a-edges.

SLIDE 48

Absorbing state ≠ state with low potential

Claim 1: If s ∈ D, there does not exist an s-tree from A\{s} with only a-edges.

  • Let s ∈ D. We must show that some edges from A to s have resistance > a.
  • Well, edges from S to s, at least, necessarily involve moves that create at least one discontent (= isolated) individual.
  • Therefore, all j-trees from A to D have resistance b > a or c > a.

Claim 2: If s ∈ S, there does exist an s-tree from A\{s} with only a-edges.

  • Let s ∈ S. We must show that all edges from A to s have resistance a.
    i) From elements in S to other elements in S: ok! Put head to tail repeatedly.
    ii) From elements in D to elements in S: ok! Put head to tail of small groups repeatedly. If one large cluster, continue as in i).
