Multi-agent learning: Methodology of MAL research
Erik Berbee & Bas van Gijzel


SLIDE 1

Multi-agent learning

Methodology of MAL research

Erik Berbee & Bas van Gijzel, Master Students AT, Utrecht University

SLIDE 2

Overview

Today we will talk about...

  • Formal setting
  • Characteristics of multi-agent learning
  • Classes of techniques
  • Types of results
  • Agendas and criticism
  • Loose ends and questions


SLIDE 3

The Problem

  • No unified goals/agendas

– No unified formal setting


SLIDE 4

Formal setting: Stochastic games

  • Represented as a tuple (N, S, A, R, T)

    – N is the set of agents
    – S is the set of n-agent stage games
    – A = A1, ..., An, with Ai the set of actions (pure strategies) of agent i
    – R = R1, ..., Rn, with Ri : S × A → ℝ the reward function of agent i
    – T : S × A → Π(S), a stochastic transition function
    – Restricted versions: repeated game, MDP
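As a rough illustration, the tuple can be carried around as a small data structure. The sketch below is ours, not from the slides; all names (StochasticGame, reward, transition) are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

State = int
JointAction = Tuple[int, ...]          # one action index per agent

@dataclass
class StochasticGame:
    n_agents: int                      # N, taken here as {0, ..., n_agents - 1}
    states: List[State]                # S, the stage games
    actions: List[List[int]]           # actions[i] = Ai, the pure strategies of agent i
    # Ri: reward(i, s, a) is agent i's reward for joint action a in state s
    reward: Callable[[int, State, JointAction], float]
    # T: transition(s, a) is a probability distribution over successor states
    transition: Callable[[State, JointAction], Dict[State, float]]

# Restrictions: a repeated game is a StochasticGame with a single state that
# always transitions to itself; an MDP is the single-agent case (n_agents == 1).
```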


SLIDE 5

Sidetrack: Replicator Dynamics

  • Represented as a tuple (A, P0, R)
  • A: the set of possible pure strategies/actions for the agents, indexed 1, ..., m
  • P0: the initial distribution of agents across possible strategies, with ∑_{i=1}^{m} P0(i) = 1
  • R : A × A → ℝ, the immediate reward function for each agent
  • Each Pt(a) is adjusted according to how the reward of a compares to the population-average reward
  • Can be seen as a repeated game between two agents playing the same mixed strategy
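A discrete-time version of the Pt update can be sketched as follows. This is our own minimal illustration, assuming a positive reward matrix (needed so the normalization is well defined); the Rock-Paper-Scissors payoffs are an arbitrary example.

```python
import numpy as np

def replicator_step(p, R):
    """One discrete-time replicator update of the strategy distribution p.

    p: current distribution P_t over the m pure strategies (sums to 1)
    R: m x m reward matrix with positive entries, R[i, j] = reward of i against j
    """
    fitness = R @ p            # expected reward of each pure strategy vs. the population
    avg = p @ fitness          # population-average reward
    return p * fitness / avg   # above-average strategies grow, below-average ones shrink

# Rock-Paper-Scissors with positive payoffs (win = 2, tie = 1, loss = 0):
R = np.array([[1.0, 0.0, 2.0],
              [2.0, 1.0, 0.0],
              [0.0, 2.0, 1.0]])
p = np.array([0.5, 0.3, 0.2])
for _ in range(100):
    p = replicator_step(p, R)
```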


SLIDE 6

Formal setting: Available Information

What information does an agent have?

  • Play is fully observable
  • Game is known
  • Opponent's strategy is not known a priori


SLIDE 7

Sidetrack: Consequences of Restrictions on Information

  • fi(z) maps each state z to a probability distribution over i's actions next period
  • fi(z) is uncoupled if it does not depend on opponents' payoffs

Theorem 3. Given a finite action space A and positive integer s, there exist no uncoupled rules fi(z) whose state variable z is the last s plays, such that, for every game G on A, the period-by-period behaviors converge almost surely to a Nash equilibrium of G, or even to an ε-equilibrium of G, for all sufficiently small ε > 0.

* H.P. Young (2007): The possible and the impossible in multi-agent learning. In: Artificial Intelligence 171, pp. 429-433, 2007.


SLIDE 8

Characteristics of multi-agent learning

  • Learning and Teaching

– Teaching assumes learning

  • Equilibrium play is not always best. Example: a Stackelberg game

                Left      Right
    Up         (1, 0)    (3, 2)
    Down       (2, 1)    (4, 0)

  • An agent can either learn the opponent's strategy, or learn a strategy that does well without learning the opponent's strategy.
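To see the teaching point concretely: Down strictly dominates Up for the row player, so myopic play converges to the stage-game equilibrium (Down, Left) with payoffs (2, 1), yet a row player who stubbornly "teaches" Up induces Right and earns 3. A tiny sketch (our own encoding, assuming the column player learns to best-respond to a fixed row action):

```python
# Row payoffs u1 and column payoffs u2 (rows: Up = 0, Down = 1; cols: Left = 0, Right = 1)
u1 = [[1, 3], [2, 4]]
u2 = [[0, 2], [1, 0]]

def col_best_response(r):
    # a learning column player ends up best-responding to the row player's fixed action
    return max((0, 1), key=lambda c: u2[r][c])

for r, name in [(0, "Up"), (1, "Down")]:
    c = col_best_response(r)
    print(f"teach {name}: column learns to play {['Left', 'Right'][c]}, row earns {u1[r][c]}")
# teach Up: column learns to play Right, row earns 3
# teach Down: column learns to play Left, row earns 2
```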


SLIDE 9

Model-based Learning

  • Learn the opponent's strategy and play a best response to it
  • General scheme (a skeleton follows below):
    1. Start with some model of the opponent's strategy.
    2. Compute and play the best response.
    3. Observe the opponent's play and update your model of her strategy.
    4. Go to step 2.
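The four steps could be phrased as the following skeleton; every argument here (init_model, best_response, update, observe) is a placeholder to be supplied by a concrete scheme such as fictitious play on the next slide.

```python
def model_based_play(init_model, best_response, update, observe, rounds=1000):
    """Generic model-based learning loop; the four callables are placeholders."""
    model = init_model                     # 1. start with some model of the opponent
    for _ in range(rounds):
        action = best_response(model)      # 2. compute and play the best response
        opp_action = observe(action)       # 3. observe the opponent's play ...
        model = update(model, opp_action)  #    ... and update the model
    return model                           # 4. the loop is the "go to step 2"
```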


SLIDE 10

Model-based Learning: Fictitious Play

  • The model is a count of the opponent's past plays
  • The model after (R, S, P, R, P) is (R = 0.4, P = 0.4, S = 0.2)
  • Other examples are: smooth fictitious play, exponential fictitious play, rational learning
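A minimal sketch of the counting model and the induced best response, assuming a payoff table indexed as payoff[(mine, theirs)]; the function names are ours.

```python
from collections import Counter

def empirical_model(history):
    """Fictitious play's opponent model: normalized counts of past plays."""
    counts = Counter(history)
    total = sum(counts.values())
    return {a: n / total for a, n in counts.items()}

print(empirical_model(["R", "S", "P", "R", "P"]))
# {'R': 0.4, 'S': 0.2, 'P': 0.4}

def best_response(model, payoff):
    """Maximize expected payoff against the empirical model."""
    my_actions = {mine for mine, _ in payoff}
    return max(my_actions,
               key=lambda a: sum(p * payoff[(a, o)] for o, p in model.items()))
```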


SLIDE 11

Model-free Learning

  • Learns how well the agent's own actions do
  • Most are based on the Bellman equation
  • Basic algorithm (value iteration):
    – Initial value function V0 : S → ℝ
    – Vk+1(s) ← R(s) + γ max_a ∑_{s′} T(s, a, s′) Vk(s′)
    – Optimal policy: in each s, select the a that maximizes ∑_{s′} T(s, a, s′) Vk(s′)

  • Q-learning: computes an optimal policy with unknown reward and transition functions
  • MAL variants: minimax-Q (zero-sum games), joint-action learners and Friend-or-Foe Q (team games), Nash-Q and CE-Q (general-sum games)
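For reference, single-agent tabular Q-learning in a few lines. The env interface (reset, step, actions) is an assumption of this sketch, not anything from the slides.

```python
import random
from collections import defaultdict

def q_learning(env, episodes=500, alpha=0.1, gamma=0.9, eps=0.1):
    """Tabular Q-learning: needs only sampled (s, a, r, s') transitions,
    not the reward or transition model. `env` is assumed to provide
    reset() -> s, step(a) -> (s2, r, done), and a list env.actions."""
    Q = defaultdict(float)
    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            # epsilon-greedy exploration
            if random.random() < eps:
                a = random.choice(env.actions)
            else:
                a = max(env.actions, key=lambda x: Q[(s, x)])
            s2, r, done = env.step(a)
            # move Q(s, a) toward the sampled Bellman target
            target = r + gamma * max(Q[(s2, x)] for x in env.actions)
            Q[(s, a)] += alpha * (target - Q[(s, a)])
            s = s2
    return Q
```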


SLIDE 12

Regret Minimization: No-Regret Learning

  • No-regret learning
  • Regret of action aj for agent i after t rounds: r_i^t(a_j, s_i | s_{−i}) = ∑_{k=1}^{t} [R(a_j, s_{−i}^k) − R(s_i^k, s_{−i}^k)]
  • If its regret is positive, the agent selects each of its actions with probability proportional to that regret
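A sketch of that selection rule (regret matching, in our own naming), with cum_regret mapping each action to its cumulative regret so far:

```python
def regret_matching_policy(cum_regret):
    """Play each action with probability proportional to its positive
    cumulative regret; fall back to uniform if no regret is positive."""
    pos = {a: max(r, 0.0) for a, r in cum_regret.items()}
    total = sum(pos.values())
    if total == 0:
        return {a: 1.0 / len(cum_regret) for a in cum_regret}
    return {a: r / total for a, r in pos.items()}

def update_regrets(cum_regret, payoff, my_action, opp_action):
    """One round's contribution R(a, s_-i) - R(s_i, s_-i) for every action a."""
    realized = payoff[(my_action, opp_action)]
    for a in cum_regret:
        cum_regret[a] += payoff[(a, opp_action)] - realized
```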


SLIDE 13

Types of results from Learning Algorithms

  1. Convergence of the strategy profile to an (e.g., Nash) equilibrium of the stage game in self-play (that is, when all agents adopt the learning procedure under consideration).
  2. Successful learning of an opponent's strategy (or opponents' strategies).
  3. Obtaining payoffs that exceed a specified threshold:
     – Safe: at least the minimax payoff
     – Consistent: at least as well as the best response to the empirical distribution of play


SLIDE 14

Discussion

  1. Convergence in self-play to equilibrium play of the stage game:
     – Is a Nash equilibrium of the stage game useful?
     – Convergence of play vs. convergence of payoff
     – Is convergence in self-play necessary?
  2. Most work assumes 2-player, 2-action games; why?
  3. Obtaining payoffs that exceed a specified threshold (safe/consistent):
     – Does this exclude teaching?


SLIDE 15

Agendas of MAL

Shoham et al. try to make a classification of the agendas in multi-agent learning. What are possible purposes of the current (and possibly future) research done in MAL?


SLIDE 16

Introducing the Agenda (Shoham et al.)

  • Computational
  • Descriptive
  • Normative
  • Prescriptive - cooperative
  • Prescriptive - non-cooperative


SLIDE 17

Computational Agenda

  • Learning algorithms as an iterative way to compute certain properties, such as Nash equilibria, for a certain class of games:
    – Fictitious play calculates Nash equilibria for zero-sum games.
    – Replicator dynamics calculates Nash equilibria for symmetric games.
  • Quick and dirty


SLIDE 18

Computational Agenda: Addenda by Sandholm

  • Computing properties of games by using direct algorithms.
  • Quick-and-dirty MAL algorithms as a last resort, when there is no good direct algorithm available.
  • (MAL algorithms can be easier to program, though.)

* T. Sandholm (2007): Perspectives on multiagent learning. In: Artificial Intelligence 171, pp. 382-391, 2007.


SLIDE 19

Descriptive Agenda

  • MAL as a description of social and economic behaviour.
  • Formal models of learning possibly correspond to people's behaviours.
  • Can be extended to the modelling of large populations.
  • The descriptive agenda corresponds to (most) usage of MAL in the social sciences.

SLIDE 20

Descriptive Agenda: Addenda by Sandholm

Problem: Humans might not have the required rationality to act according to game-theoretic equilibrium. But this is exactly what the descriptive agenda wants to model!


SLIDE 21

Descriptive Agenda: Addenda by Sandholm (2)

Solution:

  • Find "human" MAL techniques/learning algorithms that lead to equilibrium play.
  • Give a clear definition of what constitutes a MAL algorithm. (He does not get to a clear definition himself.)


SLIDE 22

Descriptive Agenda: Addenda by Sandholm (2)

Solution:

  • Find "human" MAL techniques/learning algorithms that lead to equilibrium play.
  • Give a clear definition of what constitutes a MAL algorithm. (He does not get to a clear definition himself.)

Conclusion by Sandholm: more work to do!


SLIDE 23

Descriptive Agenda: More criticism

More authors, especially from the social sciences, criticize game theory's descriptive adequacy for human populations.


SLIDE 24

Descriptive Agenda: More criticism(2)

  • Empirical results gained from experiments (on human populations) contradict intuitions gained from game theory:
    – People do not play the Prisoner's Dilemma according to game theory (Guala). Again, this can be criticized (also by Guala himself).
    – Backward induction is not always supported by results from evolutionary game theory (for example Mailath).
  • Complete rationality of humans (or even bounded rationality) is not an obvious fact.
  • Is the common knowledge of rationality (CKR) assumption a correct one?

* F. Guala (2006): Has Game Theory Been Refuted? In: Journal of Philosophy 103, pp. 239-263, 2006.
* G. Mailath (1998): Do People Play Nash Equilibrium? Lessons from Evolutionary Game Theory. In: Journal of Economic Literature 36, pp. 1347-1374, 1998.


SLIDE 25

Normative Agenda

"Normative for lack of a better term".

  • Determines sets of learning rules that are in
equilib rium with each other.

– Q-Learning and fictitious play if appropriately initialised. – A lot of learning algorithms when doing self-play.

  • Most research is originated in AI.


SLIDE 26

Normative Agenda: Addenda by Sandholm

Sandholm considers the normative agenda as a special case of the non-cooperative prescriptive agenda. More on this later.


SLIDE 27

Prescriptive Agenda

  • Prescribes how an agent should learn.
  • Can be split into two different types:
    – Cooperative
    – Non-cooperative


SLIDE 28

Prescriptive Agenda - cooperative

  • Multiple agents playing the same game in a team setting.
  • One setting is distributed control of a dynamic system:
    – Naturally modelled as a stochastic game with common payoffs
    – Can be seen as a distributed system with a common purpose
    – The distributed agents all run the same (learning) algorithm
  • Close to distributed computing.

This can actually be argued not to be a learning setting, because Shoham et al. assume a full-information stochastic game. (See next slide.)


SLIDE 29

Prescriptive Agenda - cooperative: Addenda by Sandholm

First the criticism:

  • A team setting is not really a learning setting. If the full game is known, agents can compute the optimal policy and apply it.
  • The setting is not necessarily even multi-agent:
    – Incentives of agents are aligned and payoffs are identical.
    – Considering that all agents run the same (learning) algorithm, why not use one agent?


SLIDE 30

Prescriptive Agenda - cooperative: Addenda by Sandholm

Now the constructive part:

  • Learning:
    – Instead, consider unknown games (with unknown payoffs), thus requiring agents to learn the game and to learn to play the game.
  • Multi-agent:
    – A problem might be considered multi-agent because it is inherently distributed (e.g., robot soccer).
    – More agents can give fault-tolerance to a system.


SLIDE 31

Prescriptive Agenda - non-cooperative

  • Convergence to an equilibrium is not a goal in itself.
  • The environment is determined by the classes of opponent agents (with possibly different strategies).
  • Agents should obtain "high reward" payoffs in a non-cooperative (stochastic game) setting.


SLIDE 32

Prescriptive Agenda - non-cooperative (2)

Multiple demands are possible on the learning algorithms or the rewards gained:

  • Learning algorithms should:
    – converge to a stationary policy
    – converge to a best response if the opponent is stationary
  • Possible categories of high reward:
    – a small amount of regret
    – payoffs within a certain ε of the optimal payoff possible against an opponent
    – payoffs within a certain ε of the security level (minimax)
    – performing "well" in self-play


SLIDE 33

Prescriptive Agenda - non-cooperative: Addenda by Sandholm

  • Convergence to an equilibrium in a non-cooperative setting is just one of the possible goals. This special case of the non-cooperative prescriptive agenda replaces the normative agenda. (No further motivation is given.)
  • Suggestions for properties of learning algorithms, which are actually similar to those suggested in Shoham et al.
  • AWESOME: Adapt When Everybody is Stationary, Otherwise Move to Equilibrium.


SLIDE 34

Discussion (if time permits)

  • Game theory is almost immediately assumed to be the tool to use for MAL. Are there any good reasons to use, or not to use, game theory?
  • Are all agendas really necessary, or should some of them be considered inappropriate for MAL?
  • Are the "high rewards" defined by Shoham et al. good approximations?
  • Do you miss a category? Consider for example modelling and design by Gordon (also homework).

* G. Gordon (2007): Agendas for multi-agent learning. In: Artificial Intelligence 171, 2007.


SLIDE 35

Questions?
