
Collaborative online learning of an action model
Christophe Rodrigues*, Henry Soldano*, Gauvain Bourgne† and Céline Rouveirol*
* LIPN (Université Paris 13, UMR-CNRS 7030)
† LIP6 (Université Pierre et Marie Curie, UMR-CNRS 7606)
ILP - Nancy 14/09/2014
C. Rodrigues et al. Collaborative online learning of an action model 14/09/2014

Introduction
Collaborative learning
Given:
- a relational revision algorithm, IRALe, that performs online learning of a deterministic conditional STRIPS-like action model, and
- a multi-agent learning protocol, SMILE.

Incremental refinement of relational action model

SMILE protocol
SMILE: a general consistency maintenance protocol
- Local consistency mechanism (a-consistency): consistency with respect to internal counter-examples.
- Global revision mechanism, triggered by an agent a_i upon direct observation of a contradictory observation x (an internal counter-example).
- A set of interactions I(a_i, a_j), j ∈ [1..n], j ≠ i, between the learner agent a_i and the other agents a_j, acting as critics.

SMILE protocol
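The learner/critic interaction can be illustrated with a minimal Python sketch. It is not the authors' implementation. The `Agent` class, the dictionary used as a toy stand-in for the action model T_i, and the "memorise the example" revision are all assumptions made for illustration, and IRALe's relational revision is not reproduced. The sketch also assumes that a critic answers with one of its stored counter-examples when it rejects the learner's hypothesis, and that the domain is deterministic (otherwise the loop may not terminate).

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """Hypothetical agent: `model` is a toy stand-in for T_i, `memory` for O_i."""
    name: str
    model: dict = field(default_factory=dict)   # (state, action) -> effect
    memory: list = field(default_factory=list)  # stored counter-examples

    def revise(self, example):
        """Toy local revision: store the counter-example and memorise its effect."""
        state, action, effect = example
        self.memory.append(example)
        self.model[(state, action)] = effect

    def criticise(self, model):
        """Return one stored counter-example that contradicts `model`, or None."""
        for state, action, effect in self.memory:
            if model.get((state, action)) != effect:
                return (state, action, effect)
        return None


def global_revision(learner, critics, x):
    """Revise the learner on internal counter-example x, then consult every
    critic until none of them can contradict the learner's model.
    Returns the number of learner-to-critic interactions performed."""
    learner.revise(x)
    interactions = 0
    accepted = False
    while not accepted:
        accepted = True
        for critic in critics:
            interactions += 1
            counter_example = critic.criticise(learner.model)
            if counter_example is not None:
                # Assumption: the learner stores the external counter-example.
                learner.revise(counter_example)
                accepted = False
    return interactions
```

When the loop exits, the learner's model agrees with every counter-example stored by the critics, which is the sense in which the sketch mirrors the consistency maintenance described above.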

SMILE properties
An IRALe agent a_i is MAS-consistent iff T_i is consistent with respect to O, i.e., to all counter-examples stored by the agents of the n-MAS.
- Provided the agents of the n-MAS are in independent environments, each agent is MAS-consistent.
- The process always terminates.
Let d be the cost of an interaction and c the cost of a revision. When a MAS of n agents has received n_e examples, in the worst case:
1. The total number of local revisions performed during the history of the MAS is less than n_e · n.
2. The total cost of interactions is less than n_e · (n + 1) · (n − 1) · d.
3. The total revision cost is less than n_e · n · c(n_e).
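To get a feel for the orders of magnitude, the three worst-case bounds can be evaluated numerically. The sketch below reads the second bound as n_e · (n + 1) · (n − 1) · d and treats c as a function of the number of examples, as the notation c(n_e) suggests. The function name and the sample values are hypothetical and are not taken from the experiments.

```python
def smile_worst_case_bounds(n, n_e, d, c):
    """Strict upper bounds for a MAS of n agents that has received n_e examples.

    d: cost of one interaction.
    c: callable giving the cost of one revision for a given number of examples.
    """
    return {
        "local_revisions": n_e * n,
        "interaction_cost": n_e * (n + 1) * (n - 1) * d,
        "revision_cost": n_e * n * c(n_e),
    }


# Illustrative values only: 5 agents, 100 examples, unit interaction cost,
# and a revision cost assumed to grow linearly with the number of examples.
bounds = smile_worst_case_bounds(n=5, n_e=100, d=1.0, c=lambda m: float(m))
```

With these sample values the bounds are 500 local revisions, an interaction cost of 2400 and a revision cost of 50000.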

Experiments
- A community of n agents (n = 1, 5 or 30), each acting in its own environment.
- Agents are individualistic: each maintains and modifies its own current hypothesis.
- Proof of concept in the blocks world domain, in which color predicates for blocks are introduced.
2 rules (out of 7) in the 7-blocks, 2-colors (b and w) world:
Rule 1
  Preconditions: bl(A), bl(B), bl(C), cl(A), cl(B), w(A), w(B), on(A, C), on(B, D)
  Action: move(A, B)
  Effect: on(A, B), ¬on(A, C), cl(C), ¬cl(B)
Rule 2
  Preconditions: bl(A), bl(B), bl(C), cl(A), cl(B), w(A), b(B), on(A, C), on(B, D)
  Action: move(A, B)
  Effect: b(A), ¬w(A)
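As an illustration of how such a STRIPS-like rule is read, here is a minimal sketch that applies the first rule listed on this slide to a ground state. The tuple encoding of literals, the split of the effect into `add` and `delete` lists, and the helper names `ground` and `apply_rule` are assumptions made for this example. They are not IRALe's internal representation.

```python
# Literals are tuples such as ("on", "A", "C"); upper-case names are rule variables.
RULE_1 = {
    "action": ("move", "A", "B"),
    "pre": [("bl", "A"), ("bl", "B"), ("bl", "C"), ("cl", "A"), ("cl", "B"),
            ("w", "A"), ("w", "B"), ("on", "A", "C"), ("on", "B", "D")],
    "add": [("on", "A", "B"), ("cl", "C")],       # positive effects
    "delete": [("on", "A", "C"), ("cl", "B")],    # negated effects
}


def ground(literal, subst):
    """Replace each variable of a literal by the constant given in `subst`."""
    pred, *args = literal
    return (pred, *(subst[a] for a in args))


def apply_rule(rule, subst, state):
    """Return the successor state, or None if the preconditions do not hold."""
    preconditions = {ground(lit, subst) for lit in rule["pre"]}
    if not preconditions <= state:
        return None
    added = {ground(lit, subst) for lit in rule["add"]}
    deleted = {ground(lit, subst) for lit in rule["delete"]}
    return (state - deleted) | added


# Hypothetical ground instance: variables A, B, C, D bound to objects a, b, c, d.
subst = {"A": "a", "B": "b", "C": "c", "D": "d"}
state = {ground(lit, subst) for lit in RULE_1["pre"]}
next_state = apply_rule(RULE_1, subst, state)
```

After the move, `next_state` contains on(a, b) and cl(c) and no longer contains on(a, c) or cl(b).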

Integration with a planner
At a given moment, agent a_i is in state s_i and has its own current action model T_i and the corresponding counter-example memory O_i. Each agent is given a random goal G_i to reach.
At each time t, the agent tries to build a plan P = (a_1, ..., a_m) to reach G_i (here a_1, ..., a_m denote actions).
- If planning succeeds, the agent applies a_1 and observes the effect e. Let ê = predict(a_1, T_i, s_i).
  - If e = ê, the agent applies the next action of the plan.
  - Otherwise, this yields a new counter-example x = (s_i, a_1, e): T_i is revised locally into T'_i, which is transmitted to the other agents, thereby triggering a global revision.
- If planning fails, random actions are selected and performed.
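The decision step above can be sketched as a single function. This is a sketch under stated assumptions, not the authors' code. The callables `plan`, `predict`, `execute` and `revise` are hypothetical placeholders for the planner, the model's prediction, the environment and the local revision. The sketch also assumes that a randomly chosen action is checked against the model's prediction in the same way as a planned one, which the slide does not state.

```python
import random


def agent_step(state, model, memory, goal, plan, predict, execute, revise,
               actions, rng=random):
    """One decision step of an agent; returns (new_state, model, revised).

    plan(state, goal, model)      -> list of actions, or None if planning fails
    predict(action, model, state) -> effect expected by the current model
    execute(state, action)        -> (observed_effect, new_state)
    revise(model, example)        -> locally revised model
    """
    p = plan(state, goal, model)
    # Planning failure (None or an empty plan) falls back to a random action.
    action = p[0] if p else rng.choice(actions)

    expected = predict(action, model, state)
    observed, new_state = execute(state, action)
    if observed == expected:
        return new_state, model, False

    # Prediction error: store the counter-example x = (s_i, a_1, e) and revise.
    x = (state, action, observed)
    memory.append(x)
    revised_model = revise(model, x)
    # The revised model would be sent to the other agents at this point,
    # triggering the global revision; that exchange is omitted here.
    return new_state, revised_model, True
```

Calling `agent_step` repeatedly with the returned state and model gives the online loop; the third return value signals when a global revision should be started.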

Results: accuracy
[Figure: average accuracy of each agent (0 to 1) against the number of actions performed by each agent (0 to 500). Curves: Tilde (1 agent), 1 agent, 5 agents, 30 agents.]

Results: communication cost
[Figure: average number of messages sent by each agent (0 to 300) against the number of actions performed by each agent (0 to 500). Curves: 5 agents, 30 agents.]

Results: number of examples
[Figure: average number of examples stored by each agent (0 to 30) against the number of actions performed by each agent (0 to 500). Curves: 1 agent; 5 agents (only internal); 30 agents (only internal); 5 agents (all examples); 30 agents (all examples).]

Results: number of achieved goals
[Figure: average number of goals achieved by each agent (0 to 12) against the number of actions performed by each agent (0 to 300). Curves: 1 agent, 5 agents, 30 agents.]

Results - Vote protocol
7 blocks - 2 colors
[Figure: average and voting accuracy of an agent in a 5-MAS (accuracy, 0.8 to 1) against the number of examples received by each agent (0 to 800). Curves: Average, Voting.]

Thank you for your attention
