Online Learning within Cooperative Planning
Alborz Geramifard, September 2010, agf@mit.edu
Joint work: Finale Doshi, Josh Redding, Nicholas Roy, Jonathan How
Supported by: AFOSR
Problem
[Figure: mission scenario with waypoints, obstacles, and a base]

Why is this a hard problem?
[Figure: comparison of candidate approaches, with entries marked ✕ or Limited]
[Figure sequence: the proposed architecture couples a Planner with a Learner; the Learner interacts with the environment through its policy (π), sending actions a and receiving rewards r]
Representation: incremental Feature Dependency Discovery (iFDD)
Starting from base features ϕ1, ϕ2, ϕ3, iFDD grows the representation online: pairs of features that are active together accumulate credit from the temporal-difference error, and once a pair's credit crosses a threshold its conjunction (e.g., ϕ1∧ϕ2, then ϕ2∧ϕ3) is added as a new feature; repeating the process yields higher-order conjunctions such as ϕ1∧ϕ2∧ϕ3.
[Figure: animation of the feature lattice growing from ϕ1, ϕ2, ϕ3 to ϕ1∧ϕ2, ϕ2∧ϕ3, and ϕ1∧ϕ2∧ϕ3]
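To make the discovery mechanism concrete, here is a minimal Python sketch (not from the slides) of how candidate conjunctions arise: features are represented as frozensets of base-feature indices, and every pair of co-active features proposes their union as a candidate.

from itertools import combinations

# Active base features at the current state: phi1, phi2, phi3 are all on.
active = [frozenset({1}), frozenset({2}), frozenset({3})]

# Candidate conjunctions are the unions of co-active pairs, i.e.
# phi1^phi2, phi1^phi3, phi2^phi3; Discover (Algorithm 1, later in the
# deck) decides which of them accumulate enough TD error to be promoted.
candidates = {g | h for g, h in combinations(active, 2)}
print(sorted(map(sorted, candidates)))  # [[1, 2], [1, 3], [2, 3]]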
UAV mission planning scenario
[Figure: a team of UAVs (each with fuel = 10) servicing targets; vehicle tasks include Maintenance, Refuel, Communication, and Target; available actions are Advance, Retreat, and Loiter]
[Figure: Return vs. Steps (up to 10 x 10^4) on the UAV mission domain, comparing the Initial, ATC, SDM, Tabular, and Initial+iFDD representations]
[Figure: an eight-node stochastic domain with rewards of +1, +5, and +10 and transition success probabilities of 0.8 and 0.5]
[Figure: Return vs. Steps (up to 10 x 10^4) on this domain, comparing the Initial+iFDD, ATC, Initial, Tabular, and SDM representations]
Planner + Learner
intelligent Cooperative Control Architecture (iCCA)
[Figure: the Cooperative Planner, Learning Algorithm, and Performance Analysis modules interact with the Agent/Vehicle and the World]
[Redding, Geramifard, How, ACC 2010]
Cooperative Learner
[Figure: iCCA instantiation in which the Cooperative Planner is the Consensus-Based Bundle Algorithm (CBBA) and the Learner is paired with a Risk Analysis module, acting on the Agent/Vehicle in the World]
[Figure: Return vs. Steps (0 to 10,000), comparing Optimal, NAC, CNAC, and the Planner]
[Figure: Return vs. Steps (0 to 10,000), comparing Optimal, Sarsa, CSarsa, and the Planner]
[Figure: an eight-node UAV scenario with stochastic transitions (success probabilities 0.5, 0.6, 0.7), rewards of +100, +200, and +300, and target visit windows [2,3] and [3,4]]
[Figure: Optimality (40-100) vs. P(Crash) (0%-100%) for the Planner alone, the Learner alone, and the combined Planner + Learner]
Algorithm 1: Discover
Input: φ(s), δt, ξ, F, ψ
Output: F, ψ
foreach (g, h) ∈ {(i, j) | φi(s) φj(s) = 1} do
    f ← g ∧ h
    if f ∉ F then
        ψf ← ψf + |δt|
        if ψf > ξ then
            F ← F ∪ {f}
        end
    end
end
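A minimal Python sketch of Discover under assumptions the slide leaves open: features are frozensets of base-feature indices, F is the set of promoted features, and psi is a dictionary of accumulated TD-error credit; the names are illustrative, not the original implementation.

from itertools import combinations

def discover(active, delta_t, xi, F, psi):
    # Credit |delta_t| to every candidate conjunction of co-active
    # features; promote a candidate once its credit psi exceeds xi.
    for g, h in combinations(active, 2):
        f = g | h                          # conjunction = union of index sets
        if f not in F:
            psi[f] = psi.get(f, 0.0) + abs(delta_t)
            if psi[f] > xi:
                F.add(f)                   # promote candidate to a feature
    return F, psi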
Algorithm 2: Activate Features
Input: φ0(s), F
Output: φ(s)
φ(s) ← 0̄
activeInitialFeatures ← {i | φ0_i(s) = 1}
Candidates ← ℘(activeInitialFeatures)   (* sorted by set size *)
while activeInitialFeatures ≠ ∅ do
    f ← Candidates.next()
    if f ∈ F then
        activeInitialFeatures ← activeInitialFeatures − f
        φf(s) ← 1
    end
end
return φ(s)
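A hedged Python sketch of the activation step. One assumption is made explicit here: candidates are visited from largest to smallest set size, so a learned conjunction claims the base features it covers before any smaller subset can; the subset guard plays the role of removing indices from activeInitialFeatures.

from itertools import combinations

def activate_features(phi0_active, F):
    # phi0_active: set of active base-feature indices for state s.
    # F: set of frozensets, the learned features (including singletons).
    remaining = set(phi0_active)
    active = set()
    for size in range(len(phi0_active), 0, -1):    # largest subsets first
        for cand in combinations(sorted(phi0_active), size):
            f = frozenset(cand)
            if f in F and f <= remaining:
                active.add(f)                      # feature turns on
                remaining -= f                     # its members are claimed
            if not remaining:
                return active
    return active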
[Figure: balancing task, Balancing Steps vs. Steps (up to 10 x 10^4), comparing the Initial, Tabular, Gaussian, ATC, and Initial+iFDD representations]
[Figure: Return vs. Steps (up to 10 x 10^4), comparing the Tabular, Initial, ATC, and Initial+iFDD representations]
[Figure: the Agent couples the CBBA Planner with an Actor-Critic learner and a Risk Analysis controller (iCCA), interacting with an MDP]
Algorithm 1: Cooperative Natural Actor-Critic (CNAC)
Input: πp, ξ
Output: a
a ∼ πAC(s, ·)
if not safe(s, a) then
    P(s, a) ← P(s, a) − ξ
    a ← πp(s)
end
Q(s, a) ← Q(s, a) + α δt(Q)
P(s, a) ← P(s, a) + α Q(s, a)
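A minimal Python sketch of one CNAC step, with simplifying assumptions not in the slide: a tabular softmax actor over preferences P, dictionaries for P and Q, a caller-supplied td_error(s, a) for the critic, and pi_p / safe standing in for the planner policy and Algorithm 2.

import math
import random

def cnac_step(s, actions, pi_p, safe, td_error, P, Q, alpha, xi):
    # Sample an action from the softmax (Gibbs) actor over preferences P.
    weights = [math.exp(P.get((s, a), 0.0)) for a in actions]
    a = random.choices(actions, weights=weights)[0]

    if not safe(s, a):
        # Penalize the risky action's preference and defer to the planner.
        P[(s, a)] = P.get((s, a), 0.0) - xi
        a = pi_p(s)

    # Critic update with the TD error, then actor update toward Q.
    Q[(s, a)] = Q.get((s, a), 0.0) + alpha * td_error(s, a)
    P[(s, a)] = P.get((s, a), 0.0) + alpha * Q[(s, a)]
    return a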
Algorithm 2: safe
Input: s, a
Output: isSafe
risk ← 0
for i ← 1 to M do
    t ← 1
    st ∼ T^p(s, a)
    while not constrained(st) and not isTerminal(st) and t < H do
        st+1 ∼ T^p(st, πp(st))
        t ← t + 1
    end
    risk ← risk + (1/i) (constrained(st) − risk)
end
isSafe ← (risk < ψ)
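A sketch of the Monte Carlo risk check in Python, assuming a generative planner model sample_next(s, a) for T^p and indicator predicates constrained and is_terminal; M, H, and the threshold psi are illustrative values.

def safe(s, a, pi_p, sample_next, constrained, is_terminal,
         M=100, H=50, psi=0.1):
    risk = 0.0
    for i in range(1, M + 1):
        t = 1
        st = sample_next(s, a)            # one rollout under the model T^p
        while not constrained(st) and not is_terminal(st) and t < H:
            st = sample_next(st, pi_p(st))
            t += 1
        # Incremental mean of the constraint-violation indicator.
        risk += (float(constrained(st)) - risk) / i
    return risk < psi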
Algorithm 3: Cooperative Learning
Input: N, πp, s, learner
Output: a
a ← πp(s)
πl ← learner.π
knownness ← min{1, count(s, a)/N}
if rand() < knownness then
    a′ ∼ πl(s, a)
    if safe(s, a′) then
        a ← a′
    end
else
    count(s, a) ← count(s, a) + 1
end
learner.update()
return a
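A sketch of the knownness-based blending in Python; the learner interface (sample_action, update) and the shared visit counter are assumptions, since the slide leaves them unspecified.

import random
from collections import defaultdict

def cooperative_learning_step(s, N, pi_p, safe, learner, count):
    a = pi_p(s)                              # start from the planner's action
    knownness = min(1.0, count[(s, a)] / N)  # trust grows with experience
    if random.random() < knownness:
        a_prime = learner.sample_action(s)   # learner proposes an override
        if safe(s, a_prime):
            a = a_prime                      # accept only if deemed safe
    else:
        count[(s, a)] += 1                   # otherwise credit the pair
    learner.update()                         # learner trains on the outcome
    return a

Callers would share count = defaultdict(int) across steps so knownness rises as (s, a) pairs are revisited.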
[Figure: the eight-node UAV scenario with stochastic transitions (success probabilities 0.5, 0.6, 0.7), rewards of +100, +200, and +300, and visit windows [2,3] and [3,4]]
(a) Step-based performance: [Figure: Return vs. Steps (up to 10 x 10^4), comparing Actor-Critic, CBBA, Optimal, and iCCA]
(b) Optimality after training: [Figure: Optimality (30-100) for iCCA, CBBA, and Actor-Critic]
References
[...] "methods," in Symposium on Abstraction, Reformulation, and Approximation (SARA), 2005.
J. Z. Kolter and A. Y. Ng, "Regularization and feature selection in least-squares temporal difference learning," in ICML '09: Proceedings of the 26th Annual International Conference on Machine Learning. New York, NY, USA: ACM, 2009, pp. 521-528.
S. Whiteson, M. E. Taylor, and P. Stone, "Adaptive tile coding for value function approximation," University of Texas at Austin, Tech. Rep. AI-TR-07-339, 2007.
[...] "networks," in Proceedings of the Twentieth International Conference on Machine Learning. AAAI Press, 2003, pp. 632-639.
P. Abbeel and A. Y. Ng, "Exploration and apprenticeship learning in reinforcement learning," in Proceedings of the 22nd International Conference on Machine Learning (ICML), 2005.
P. Geibel and F. Wysotzki, "Risk-sensitive reinforcement learning applied to chance constrained control," JAIR, vol. 24, 2005.
[...] in Proceedings of the 11th International Conference, 1994, pp. 105-111.
[...] in Proceedings of the Tenth International Conference on Machine Learning. Morgan Kaufmann, 1993, pp. 314-321.
[...] "Adaptive resolution model-free reinforcement learning: Decision boundary partitioning," in Proceedings of the 17th International Conference on Machine Learning. Morgan Kaufmann, 2000, pp. 783-790.
[...] "reinforcement learning," in ECML, ser. Lecture Notes in Computer Science, J. F. Boulicaut, F. Esposito, F. Giannotti, and D. Pedreschi, Eds., vol. 3201. Springer, 2004.
[...] "difference prediction learning with eligibility traces," in Proceedings of the Third Conference on Artificial General Intelligence (AGI-10), Lugano, Switzerland, 2010.
W. B. Knox and P. Stone, "Combining manual feedback with subsequent MDP reward signals for reinforcement learning," in Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2010), May 2010.
Publications
[...] J. How, "Incremental Feature Dependency Discovery," International Conference on Machine Learning, 2011 (submitted).
[...] J. How, "UAV Cooperative Control with Stochastic Risk Model," American Control Conference, 2011 (submitted).
[...] J. How, "Actor-critic policy learning in cooperative planning," in AIAA Guidance, Navigation, and Control Conference (GNC), 2010.
[...] "Actor-critic policy learning in cooperative planning," in AAAI Spring Symposium Series, 2010.
[...] "An intelligent cooperative control architecture," in American Control Conference, 2010.
[...] "Vehicle to Track and Avoid Adversaries," International Journal of Robotics Research (IJRR), 2008.
[...] "Coordinated Planning Under Uncertainty with Air and Ground Vehicles," in Proceedings of the 11th International Symposium on Experimental Robotics (ISER), 2008.
[...] "with Linear Function Approximation and Prioritized Sweeping," in Proceedings of the 24th Conference on Uncertainty in Artificial Intelligence (UAI), pp. 528-536, 2008. [28% acceptance]
[...] in Proceedings of the 7th International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS), pp. 379-386, 2008. [22% acceptance]
[...] "Traces & Convergence Analysis," in B. Schölkopf, J. C. Platt, and T. Hofmann, editors, Advances in Neural Information Processing Systems 19 (NIPS), pp. 440-448, 2007. [24% acceptance]
[...] "Difference Learning," in Proceedings of the 21st Conference of the American Association for Artificial Intelligence (AAAI), pp. 356-361, 2006. [30% acceptance]
[...] Chubak, V. Bulitko, "Biased Cost Pathfinding," in Proceedings of the Second Conference on Artificial Intelligence and Interactive Digital Entertainment (AIIDE), 2006. [73% acceptance]
[...] Nayei, R. Zamaninasab, J. Habibi, "A Hybrid Three Layer Architecture for Fire Agent Management in Rescue Simulation Environment," International Journal of Advanced Robotic Systems, vol. [...]
A. Nouri, R. Zamani-Nasab, J. Habibi, A. Geramifard, "Task Allocation in Complex Multiagent Systems with Parallel Scheduling," W[...] & its Disciplines, Kish Island, Iran, February 2004.
[...] "Agents: A Set of Implemented Agents for RoboCup Rescue Simulation Environment," in Proceedings of the RoboCup Symposium, Padova, Italy, 2003.