
Open World Planning for Robots via Hindsight Optimization

  1. Open World Planning for Robots via Hindsight Optimization
     Scott Kiesel¹, Ethan Burns¹, Wheeler Ruml¹, J. Benton², Frank Kreimendahl¹
     We are grateful for funding from the DARPA CSSG program (grant HR0011-09-1-0021) and NSF (grant IIS-0812141).
     Scott Kiesel (UNH) Open World Planning for Robots – 1 / 19

  2. Open World Planning - Search and Rescue
     Outline: Introduction (Open World, Search & Rescue, Previous Approaches, Hindsight Opt), OH-wOW, Results, Conclusion

  3. Search and Rescue Domain
     ■ Robot agent
     ■ Unknown building/map layout
     ■ Unknown victim locations
     ■ Unknown number of victims
     ■ Search time limit

  4. Previous Approaches
     ■ Talamadupula et al. (ICAPS ’09, AAAI ’10, TIST ’10)
       ad-hoc assumption: roomExists(x) → personExistsIn(x)
     ■ Joshi et al. (ICRA ’12)
       based on FODD approximations
       hours of offline planning
     ■ Optimization in Hindsight with Open Worlds (OH-wOW)
       general
       principled
       easy to implement (and extend)

  5. Hindsight Optimization
     Select action that maximizes expected reward.
     reward = cumulative reward following optimal plan

  6. Hindsight Optimization
     Select action leading to states with highest expected reward.
     reward = reward of the plan, out of all possible plans, with the best average reward over all configurations
     $$V^*(s_1) = \min_{A=\langle a_1,\dots,a_{|A|}\rangle} \mathbb{E}_{\langle s_2,\dots,s_{|A|}\rangle}\left[\sum_{i=1}^{|A|} R(s_i,a_i)\right]$$

  7. Hindsight Optimization
     Select action leading to states with highest expected reward.
     reward ≈ reward of the plan, out of all possible plans, with the best average reward across sampled configurations
     $$\hat{V}(s_1) = \min_{A=\langle a_1,\dots,a_{|A|}\rangle} \mathbb{E}_{\langle s_2,\dots,s_{|A|}\rangle}\left[\sum_{i=1}^{|A|} R(s_i,a_i)\right]$$

  8. Hindsight Optimization
     Select action leading to states with highest expected reward.
     reward ≈ average reward of the best plan in each sampled configuration
     $$\hat{V}(s_1) = \mathbb{E}_{\langle s_2,s_3,\dots\rangle}\left[\min_{A=\langle a_1,\dots,a_{|A|}\rangle} \sum_{i=1}^{|A|} R(s_i,a_i)\right]$$
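The sampled estimate on this slide (average, over sampled configurations, of the best plan in each one) can be illustrated with a small Python sketch. Everything concrete here is an assumption made for illustration and not taken from the slides: the three rooms on a line, the start position, the travel-distance cost, and the names `ROOM_POS`, `plan_cost`, and `hindsight_value`. Following the min in the formula, R is treated as a cost to be minimized.

```python
from itertools import permutations
from statistics import mean

ROOM_POS = {"A": -2, "B": 1, "C": 4}  # hypothetical room positions on a line
START = 0                             # hypothetical robot start position


def plan_cost(plan, victim_room):
    """Distance travelled along `plan` until the victim's room is reached."""
    pos, cost = START, 0
    for room in plan:
        cost += abs(ROOM_POS[room] - pos)
        pos = ROOM_POS[room]
        if room == victim_room:
            break
    return cost


def hindsight_value(first_room, sampled_worlds):
    """Average over sampled worlds of the cheapest plan that starts at first_room."""
    plans = [p for p in permutations(ROOM_POS) if p[0] == first_room]
    return mean(min(plan_cost(p, w) for p in plans) for w in sampled_worlds)


# Each sampled world is a fully known configuration: the room holding the victim.
worlds = ["A", "B", "B", "C"]
q = {room: hindsight_value(room, worlds) for room in ROOM_POS}
best_first_action = min(q, key=q.get)
print(q, best_first_action)  # best first action is "B" (average cost 2.5)
```

Because the minimum is taken inside the average, each sampled world is solved as a deterministic problem, which is what lets an ordinary planner be used per sample.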

  9. Optimization in Hindsight with Open Worlds (OH-wOW)
     Section outline: Implementation, Sense, Sample, Plan, Act

  10. OH-wOW Implementation for Search and Rescue
      1. Sense
      2. Sample
      3. Plan
      4. Act
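The four steps above can be sketched as a single online loop. This is a minimal toy under stated assumptions, not the authors' implementation: the `ToyRescue` environment, the `ROOMS` layout, the single hidden victim, and the helper names `hindsight_cost` and `run` are all hypothetical.

```python
import random

ROOMS = {"A": -2, "B": 1, "C": 4}  # hypothetical room positions along a corridor


class ToyRescue:
    """Toy stand-in for robot and building: one victim hidden in one room."""

    def __init__(self, victim_room):
        self.victim_room = victim_room
        self.pos = 0
        self.searched = []
        self.found = False

    def sense(self):
        unsearched = [r for r in ROOMS if r not in self.searched]
        return {"pos": self.pos, "unsearched": unsearched, "found": self.found}

    def act(self, room):
        self.pos = ROOMS[room]
        self.searched.append(room)
        self.found = self.found or room == self.victim_room


def hindsight_cost(knowledge, first_room, victim_room):
    """Cheapest plan cost in a fully known world, given the first room to visit."""
    first = ROOMS[first_room]
    return abs(first - knowledge["pos"]) + abs(ROOMS[victim_room] - first)


def run(env, n_samples=32, seed=0):
    rng = random.Random(seed)
    while True:
        knowledge = env.sense()                                      # 1. Sense
        if knowledge["found"] or not knowledge["unsearched"]:
            return env.searched
        candidates = knowledge["unsearched"]
        worlds = [rng.choice(candidates) for _ in range(n_samples)]  # 2. Sample

        def avg_cost(room):                                          # 3. Plan
            return sum(hindsight_cost(knowledge, room, w) for w in worlds) / n_samples

        env.act(min(candidates, key=avg_cost))                       # 4. Act


env = ToyRescue(victim_room="C")
print(run(env))
```

Only the first action of the best choice is executed before the loop senses again, so new observations change the next round of samples.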

  11. Sensing and Observations
      ■ SLAM (ROS gmapping)
        laser rangefinder
      ■ Topological Map
        rough construction
      ■ Person Detector

  12. Sensing and Observations
      [Figure: Sensed Occupancy Grid with Topological Graph Overlaid]

  13. Sampling Possible Worlds
      ■ Current Knowledge
        observed
        known to be true
      ■ Expectation
        prior domain knowledge
        bias

  14. Sampling Possible Worlds
      [Figure: Known Partial World State]

  15. Sampling Possible Worlds
      [Figure: Sampled “Complete” World State]
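The sampling step on these slides, turning a known partial world state into a sampled complete one, might look like the following sketch. The representation is an assumption: the world is reduced to a mapping from room name to "victim present", and `sample_complete_world`, the room names, and the prior probabilities are hypothetical. The slides also treat the map layout and number of victims as unknown, which this toy does not model.

```python
import random


def sample_complete_world(observed, unknown_rooms, prior, rng):
    """Fill in unobserved facts by sampling; observed facts are kept unchanged.

    observed:      room -> bool, facts already sensed (known to be true)
    unknown_rooms: rooms whose contents have not been observed
    prior:         room -> probability that a victim is there (expectation/bias)
    """
    world = dict(observed)
    for room in unknown_rooms:
        world[room] = rng.random() < prior.get(room, 0.5)
    return world


rng = random.Random(0)
observed = {"office1": False, "office2": True}  # hypothetical sensed facts
prior = {"lab": 0.8, "kitchen": 0.1}            # hypothetical domain bias
samples = [
    sample_complete_world(observed, ["lab", "kitchen"], prior, rng)
    for _ in range(4)
]
for world in samples:
    print(world)
```

Every sample agrees with what has been observed and differs only in the unobserved part, so each one is a complete world that a deterministic planner can work with.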

  16. Planning in Sampled Worlds
      ■ Fully Known
      ■ Deterministic
      ■ Classical Planners or Domain Specific Planners

  17. Planning in Sampled Worlds
      [Figure: A Single Sample]
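Because a sampled world is fully known and deterministic, any off-the-shelf deterministic planner can be applied to it. As a stand-in for such a planner, the sketch below runs Dijkstra's algorithm over a small topological map. The graph, its edge costs, and the function name `shortest_path` are hypothetical and are not the planner used in the talk.

```python
import heapq


def shortest_path(graph, start, goal):
    """Dijkstra's algorithm over a weighted graph given as node -> {neighbor: cost}."""
    frontier = [(0, start, [start])]
    best = {start: 0}
    while frontier:
        cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry; a cheaper route was already found
        for neighbor, step in graph[node].items():
            new_cost = cost + step
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                heapq.heappush(frontier, (new_cost, neighbor, path + [neighbor]))
    return float("inf"), []


# Hypothetical topological map of one sampled world; the victim is in office2.
sampled_map = {
    "hall": {"office1": 2, "office2": 5},
    "office1": {"hall": 2, "office2": 1},
    "office2": {"hall": 5, "office1": 1},
}
print(shortest_path(sampled_map, "hall", "office2"))  # (3, ['hall', 'office1', 'office2'])
```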

  18. Acting in Sampled Worlds
      ■ Execute Best Currently Available Action
        maximize expected reward

  19. Acting in Sampled Worlds

  20. Results
      Section outline: Rescue, Rescue (sim), Omelette (sim)

  21. Search and Rescue
      UNH CS Offices, Pioneer 3-DX, SICK LMS500, ROS Fuerte

      victims found:
      deadline    | 0 | 1 | 2 | 3
      1 minute    | 4 | 6 | 0 | 0
      5 minutes   | 0 | 7 | 3 | 0
      10 minutes  | 0 | 3 | 4 | 3

      Joshi et al.:
        4 hours precomputation, 3 victims
        constant time table lookup
      OH-wOW:
        no precomputation
        0.18 sec avg max step time, 3 victims (256 samples)
        2.7 sec avg max step time, 10 victims (256 samples)

  22. Search and Rescue
      UNH CS Offices, Pioneer 3-DX, SICK LMS500, ROS Fuerte

      victims found:
      deadline    | 0 | 1 | 2 | 3
      1 minute    | 4 | 6 | 0 | 0
      5 minutes   | 0 | 7 | 3 | 0
      10 minutes  | 0 | 3 | 4 | 3

      OH-wOW:
      ■ is online,
      ■ computes the next action quickly,
      ■ and handles the tradeoff between hard and soft goals.

  23. Search and Rescue in Simulation
      [Figure: cost over optimal (y-axis, 0 to 10) for 32 samples, 256 samples, and ctlr, in three groups: none, south, southwest]
      OH-wOW:
      ■ leverages domain specific knowledge,
      ■ and can beat a handcoded controller.

  24. Omelette Domain in Simulation
      Levesque (AAAI ’96)

      planning time (seconds)   | 3 eggs | step | 4 eggs | step
      Bonet et al. (IJCAI ’01)  | 185    | -    | -      | -
      Levesque (IJCAI ’05)      | 1.4    | -    | 1,681  | -
      OH-wOW                    | 12.9   | 0.52 | 76.7   | 1.57

      Levesque's plans are longer than OH-wOW's.
      OH-wOW:
      ■ is online,
      ■ computes the next action quickly,
      ■ and finds lower-cost solutions.

  25. Conclusion
      Section outline: Limitations, Summary, Advertising

  26. Limitations
      ■ Scalability of the underlying planner
        leverage large body of literature
      ■ Calls underlying planner repeatedly
        embarrassingly parallel
      ■ Vulnerable to black swans during sampling
        importance sampling
      ■ Regenerates world samples at every step
        reuse samples until world “changes”
      (see Yoon et al. ICAPS ’10 for HO Optimizations)
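The slide names importance sampling as a possible remedy for black swans but gives no details. The sketch below only illustrates the general technique under invented numbers: a rare world with probability `p_rare` and a high cost is drawn more often from a proposal distribution (`q_rare`), and each draw is reweighted by p/q so the estimate of the expected cost stays unbiased. The function name and all parameter values are hypothetical.

```python
import random


def importance_sampled_cost(n, rng, p_rare=0.01, q_rare=0.5,
                            rare_cost=1000.0, common_cost=1.0):
    """Estimate expected cost under the true distribution p using draws from q."""
    total = 0.0
    for _ in range(n):
        if rng.random() < q_rare:
            # Rare world, oversampled by the proposal: weight is p_rare / q_rare.
            total += rare_cost * (p_rare / q_rare)
        else:
            # Common world, undersampled: weight is (1 - p_rare) / (1 - q_rare).
            total += common_cost * ((1 - p_rare) / (1 - q_rare))
    return total / n


true_value = 0.01 * 1000.0 + 0.99 * 1.0  # 10.99
estimate = importance_sampled_cost(20_000, random.Random(0))
print(true_value, estimate)
```

With plain sampling from the true distribution and only a few dozen samples, the rare world would often not be drawn at all, and its cost would be missing from the average.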

  27. Summary
      The OH-wOW framework is a:
      ■ Fast,
      ■ Simple,
      ■ General,
      ■ Online,
      ■ Approximate,
      ■ Way of Handling Open Worlds.
