
Ad Hoc Autonomous Agent Teams: Collaboration without Pre-Coordination
Peter Stone
Director, Learning Agents Research Group
Department of Computer Science, The University of Texas at Austin
Joint work with Gal A. Kaminka and Sarit Kraus, Bar Ilan University, and Jeffrey S. Rosenschein, Hebrew University

Teamwork
• Typical scenario: pre-coordination
− People practice together
− Robots given coordination languages, protocols
− “Locker room agreement” [Stone & Veloso, ’99]
© 2010 Peter Stone

Ad Hoc Teams
• Ad hoc team player is an individual
− Unknown teammates (programmed by others)
• May or may not be able to communicate
• Teammates likely sub-optimal: no control
Challenge: Create a good team player

Illustration
• An individual
• With teammates
• Made by others
• Heterogeneous
• May not communicate
• May have different capabilities
• And/or maneuverability
• May be a previously unknown type

Human Ad Hoc Teams
• Military and industrial settings
− Outsourcing
• Agents support human ad hoc team formation [Just et al., 2004; Kildare, 2004]
• Autonomous agents (robots) deployed for short times
− Teams developed as cohesive groups
− Tuned to interact well together

Challenge Statement
Create an autonomous agent that is able to efficiently and robustly collaborate with previously unknown teammates on tasks to which they are all individually capable of contributing as team members.
• Aspects can be approached theoretically
• Ultimately an empirical challenge

Empirical Evaluation: A Metric
• Agents under comparison: $a_0$ and $a_1$
− Most meaningful when $a_0$ and $a_1$ have similar individual competencies
• Domain $D$ consisting of tasks
• Set $A$ of possible teammates
• Draw a random task from $D$
• Draw a random team from $A$ and check its competence
• Replace a random team member with $a_0$
• Then replace it with $a_1$ and evaluate the difference
• Repeat

Evaluate($a_0$, $a_1$, $A$, $D$)
• Initialize performance (reward) counters $r_0$ and $r_1$ for agents $a_0$ and $a_1$ respectively to $r_0 = r_1 = 0$.
• Repeat:
− Sample a task $d$ from $D$.
− Randomly draw a subset of agents $B$, $|B| \geq 2$, from $A$ such that $E[s(B, d)] \geq s_{\min}$.
− Randomly select one agent $b \in B$ to remove from the team to create the team $B^-$.
− Increment $r_0$ by $s(\{a_0\} \cup B^-, d)$.
− Increment $r_1$ by $s(\{a_1\} \cup B^-, d)$.
• If $r_0 > r_1$ then we conclude that $a_0$ is a better ad hoc team player than $a_1$ in domain $D$ over the set of possible teammates $A$.
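The Evaluate procedure can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function name `evaluate`, the callables `score` (standing in for s) and `expected_score` (standing in for E[s(B, d)]), the fixed trial count, and the rejection-sampling loop used to find a competent team are all assumptions introduced here.

```python
import random


def evaluate(a0, a1, agents, tasks, score, expected_score, s_min,
             n_trials=1000, max_draws=100, seed=0):
    """Return the reward counters (r0, r1) for candidate agents a0 and a1.

    score(team, task) and expected_score(team, task) are hypothetical
    callables supplied by the caller; team is a frozenset of agents.
    """
    rng = random.Random(seed)
    r0 = r1 = 0.0
    for _ in range(n_trials):
        d = rng.choice(tasks)  # sample a task d from D
        # Assumption: find a competent team B (|B| >= 2) by rejection sampling.
        for _ in range(max_draws):
            size = rng.randint(2, len(agents))
            team = rng.sample(agents, size)
            if expected_score(frozenset(team), d) >= s_min:
                break
        else:
            continue  # no competent team found for this task; skip the trial
        b = rng.choice(team)               # teammate to remove
        b_minus = frozenset(team) - {b}    # the team B^-
        r0 += score(b_minus | {a0}, d)     # a0 fills the vacated slot
        r1 += score(b_minus | {a1}, d)     # a1 fills the same slot
    return r0, r1
```

If the returned `r0` exceeds `r1`, the procedure concludes that `a0` is the better ad hoc team player over the given teammates and tasks. Both candidates are scored with the same task and the same remaining teammates in each trial, so the difference between the counters reflects only the substituted agent.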

Technical Requirements
• Assess capabilities of other agents (teammate modeling)
• Assess the other agents’ knowledge states
• Estimate effects of actions on teammates
• Be prepared to interact with many types of teammates:
− May or may not be able to communicate
− May be more or less mobile
− May be better or worse at sensing
A good team player’s best actions will differ depending on its teammates’ characteristics.

Preliminary Theoretical Progress
• Aspects can be approached theoretically
• Ultimately an empirical challenge
Be prepared to interact with many types of teammates
• Minimal representative scenarios
− One teammate, no communication
− Fixed and known behavior

Scenarios
• Cooperative iterated normal form game [w/ Kaminka & Rosenschein, AMEC’09]

  M1    b0   b1   b2
  a0    25    1    0
  a1    10   30   10
  a2     0   33   40

• Cooperative k-armed bandit [w/ Kraus, AAMAS’10]
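To make the payoff matrix M1 concrete, the sketch below plays the repeated game against one possible fixed and known teammate. The slides do not specify the teammate's behavior beyond that description, so the rule used here (teammate B best-responds to agent A's previous action, and starts at b0) is purely an illustrative assumption, as are the names `best_response` and `play`.

```python
# Joint payoff matrix M1: rows are A's actions a0..a2, columns are B's b0..b2.
M1 = [
    [25, 1, 0],
    [10, 30, 10],
    [0, 33, 40],
]


def best_response(a):
    """Column index that maximizes the joint payoff against A's action a."""
    return max(range(len(M1[a])), key=lambda j: M1[a][j])


def play(a_actions, b_start=0):
    """Total joint payoff when A plays a_actions in order.

    Assumption: both agents act simultaneously each round, and B's next
    action is its best response to A's action in the current round.
    """
    b = b_start
    total = 0
    for a in a_actions:
        total += M1[a][b]
        b = best_response(a)
    return total


print(play([0, 0, 0]))  # A stays at a0
print(play([1, 2, 2]))  # A moves through a1 to a2
print(play([2, 2, 2]))  # A jumps straight to a2
```

Under this assumed teammate, the three action sequences yield totals of 75, 83, and 80, which shows how A's choices change where the teammate ends up and what that costs along the way.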

3-armed bandit
• Each arm: random value from a distribution
• Expected value $\mu$
Arm ∗, Arm 1, Arm 2: $\mu_* > \mu_1 > \mu_2$
• Agent A: teacher
− Knows payoff distributions
− Objective: maximize expected sum of payoffs
− If alone, always Arm ∗
• Agent B: learner
− Can only pull Arm 1 or Arm 2
− Selects arm with highest observed sample average

Assumptions
Arm ∗, Arm 1, Arm 2: $\mu_* > \mu_1 > \mu_2$
• Alternate actions (teacher first)
• Results of all actions fully observable (to both)
• Number of rounds remaining finite, known to teacher
Objective: maximize expected sum of payoffs
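The teacher/learner setup under these assumptions can be simulated with a short sketch. Several details are not given in the slides and are assumed here for illustration only: payoffs are Gaussian with a common standard deviation `sigma`, the learner starts with one prior observation of each of Arm 1 and Arm 2, and the teacher is an arbitrary caller-supplied policy function. The names `simulate`, `always_star`, and `always_one` are hypothetical.

```python
import random


def simulate(teacher_policy, means, sigma=1.0, rounds=10, seed=0):
    """Total team payoff over a finite number of rounds.

    means = (mu_star, mu_1, mu_2). teacher_policy(samples, remaining)
    returns "star", "1", or "2"; it sees every observed payoff and the
    number of rounds remaining.
    """
    rng = random.Random(seed)
    mu = dict(zip(("star", "1", "2"), means))

    def pull(arm):
        # Assumption: Gaussian payoffs; the slides only say "a distribution".
        return rng.gauss(mu[arm], sigma)

    # Assumption: one prior sample per learner arm so averages are defined.
    samples = {"star": [], "1": [pull("1")], "2": [pull("2")]}
    total = 0.0
    for remaining in range(rounds, 0, -1):
        # Teacher acts first.
        a = teacher_policy(samples, remaining)
        x = pull(a)
        samples[a].append(x)
        total += x
        # Learner pulls the arm (1 or 2) with the highest sample average.
        b = max(("1", "2"), key=lambda arm: sum(samples[arm]) / len(samples[arm]))
        y = pull(b)
        samples[b].append(y)
        total += y
    return total


def always_star(samples, remaining):
    return "star"


def always_one(samples, remaining):
    return "1"


print(simulate(always_star, (3.0, 2.0, 1.0)))
print(simulate(always_one, (3.0, 2.0, 1.0)))
```

Because all results are observable to both agents, the teacher's pulls enter the learner's sample averages. Swapping in different `teacher_policy` functions lets one compare the total payoff of always pulling Arm ∗ against policies that sometimes pull Arm 1 or Arm 2 to influence the learner.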

Summary of Findings
Arm ∗, Arm 1, Arm 2
