SLIDE 1

Scott Kiesel (UNH) Open World Planning for Robots – 1 / 19

Open World Planning for Robots via Hindsight Optimization

Scott Kiesel¹, Ethan Burns¹, Wheeler Ruml¹, J. Benton², Frank Kreimendahl¹

1 2

We are grateful for funding from the DARPA CSSG program (grant HR0011-09-1-0021) and NSF (grant IIS-0812141).

SLIDE 2

Open World Planning - Search and Rescue

Introduction ■ Open World ■ Search & Rescue ■ Previous Approaches ■ Hindsight Opt OH-wOW Results Conclusion

SLIDE 3

Search and Rescue Domain

Robot agent

Unknown building/map layout

Unknown victim locations

Unknown number of victims

Search time limit

SLIDE 4

Previous Approaches

Talamadupula et al. (ICAPS ’09, AAAI ’10, TIST ’10): ad-hoc assumption, roomExists(x) → personExistsIn(x)

Joshi et al. (ICRA ’12): based on FODD approximations; hours of offline planning

Optimization in Hindsight with Open Worlds (OH-wOW): general, principled, easy to implement (and extend)

SLIDE 5

Hindsight Optimization

Select action that maximizes expected reward.

reward = cumulative reward following optimal plan

SLIDE 6

Hindsight Optimization

Select action leading to states with highest expected reward.

reward = reward of the plan, out of all possible plans, with the best average reward over all configurations

$$V^*(s_1) = \min_{A = a_1,\dots,a_{|A|}} \; \mathbb{E}_{s_2,\dots,s_{|A|}} \left[ \sum_{i=1}^{|A|} R(s_i, a_i) \right]$$

SLIDE 7

Hindsight Optimization

Select action leading to states with highest expected reward.

reward ≈ reward of the plan, out of all possible plans, with the best average reward across sampled configurations

$$\hat{V}(s_1) = \min_{A = a_1,\dots,a_{|A|}} \; \mathbb{E}_{s_2,\dots,s_{|A|}} \left[ \sum_{i=1}^{|A|} R(s_i, a_i) \right]$$

SLIDE 8

Hindsight Optimization

Select action leading to states with highest expected reward.

reward ≈ average reward of best plan in each sampled configuration

$$\hat{V}(s_1) = \mathbb{E}_{s_2,s_3,\dots} \left[ \min_{A = a_1,\dots,a_{|A|}} \sum_{i=1}^{|A|} R(s_i, a_i) \right]$$

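One way to read the step from the min-of-expectation form to the expectation-of-min form, assuming R is treated as a cost (an assumption made here to match the min in the formulas): for any fixed plan A, the best plan in each individual configuration costs no more than A does in that configuration. Taking expectations and then minimizing over A gives a bound, so the hindsight estimate is optimistic. This is a sketch of the standard argument, not a derivation taken from the slides.

```latex
\hat{V}(s_1)
  = \mathbb{E}_{s_2,s_3,\dots}\left[ \min_{A = a_1,\dots,a_{|A|}} \sum_{i=1}^{|A|} R(s_i, a_i) \right]
  \le \min_{A = a_1,\dots,a_{|A|}} \; \mathbb{E}_{s_2,\dots,s_{|A|}}\left[ \sum_{i=1}^{|A|} R(s_i, a_i) \right]
  = V^*(s_1)
```

The gap between the two sides is the price of letting the planner see each sampled configuration in full before choosing a plan.
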
SLIDE 9

Optimization in Hindsight with Open Worlds

Introduction OH-wOW ■ Implementation ■ Sense ■ Sample ■ Plan ■ Act Results Conclusion

SLIDE 10

OH-wOW Implementation for Search and Rescue

1. Sense

2. Sample

3. Plan

4. Act
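
The four steps can be sketched as a single loop. This is a minimal illustration, not the authors' implementation: the function name `ohwow_loop` and the four callables it takes are hypothetical hooks, and the sketch assumes lower cost is better, matching the min in the hindsight formulas.

```python
import random

def ohwow_loop(sense, sample_world, plan_cost, execute,
               n_samples=32, max_steps=100, seed=0):
    """Repeat sense -> sample -> plan -> act until no actions remain.

    All four callables are hypothetical hooks supplied by the caller:
      sense()                         -> (knowledge, available_actions)
      sample_world(knowledge, rng)    -> one complete world consistent with knowledge
      plan_cost(knowledge, action, w) -> cost of the best plan in world w that starts with action
      execute(action)                 -> performs the action on the robot or simulator
    """
    rng = random.Random(seed)
    taken = []
    for _ in range(max_steps):
        knowledge, actions = sense()                                  # 1. Sense
        if not actions:
            break
        worlds = [sample_world(knowledge, rng)                        # 2. Sample
                  for _ in range(n_samples)]
        average = {a: sum(plan_cost(knowledge, a, w) for w in worlds) / n_samples
                   for a in actions}                                  # 3. Plan
        best = min(average, key=average.get)
        execute(best)                                                 # 4. Act
        taken.append(best)
    return taken
```

Each action is scored by its average cost across the same set of sampled worlds, and only the first action of the winning plan is executed before the loop senses again.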

SLIDE 11

Sensing and Observations

SLAM (ROS gmapping): laser rangefinder

Topological Map: rough construction

Person Detector

SLIDE 12

Sensing and Observations

[Figure: Sensed Occupancy Grid with Topological Graph Overlaid]

SLIDE 13

Sampling Possible Worlds

Current Knowledge: observed, known to be true

Expectation: prior domain knowledge, bias
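
A minimal sketch of the sampling step, assuming the world is a set of cells that each may or may not hold a victim. The function `sample_world`, the cell coordinates, and the prior probabilities are illustrative assumptions and do not come from the talk.

```python
import random

def sample_world(observed, unknown_cells, prior, rng):
    """Complete a partial world: keep observed facts, draw the rest from a prior.

    observed:      dict mapping cell -> bool (victim seen there or not); never resampled
    unknown_cells: iterable of cells that have not been observed yet
    prior:         function cell -> probability that the cell holds a victim
    """
    world = dict(observed)                       # current knowledge is copied unchanged
    for cell in unknown_cells:
        world[cell] = rng.random() < prior(cell) # expectation fills in the unknowns
    return world

# Usage with a hypothetical bias toward cells in rows 2 and beyond.
rng = random.Random(0)
south_bias = lambda cell: 0.3 if cell[0] >= 2 else 0.05
world = sample_world({(0, 0): False}, [(1, 0), (2, 0), (3, 0)], south_bias, rng)
```

Calling the function repeatedly yields different complete worlds that all agree with what has been observed so far.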

SLIDE 14

Sampling Possible Worlds

[Figure: Known Partial World State]

SLIDE 15

Sampling Possible Worlds

[Figure: Sampled “Complete” World State]

SLIDE 16

Planning in Sampled Worlds

Fully Known

Deterministic

Classical Planners or Domain-Specific Planners
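
Because each sampled world is fully known and deterministic, an ordinary deterministic search can solve it. As an illustration only, here is breadth-first search on a small grid; the grid encoding and the function name `shortest_path_length` are assumptions, and the slides do not say which planner was used.

```python
from collections import deque

def shortest_path_length(grid, start, goal):
    """Breadth-first search on a fully known grid.

    grid is a list of equal-length strings where '#' marks a wall.
    Moves are 4-connected with unit cost. Returns the number of steps,
    or None if the goal cannot be reached.
    """
    rows, cols = len(grid), len(grid[0])
    frontier = deque([(start, 0)])
    seen = {start}
    while frontier:
        (r, c), dist = frontier.popleft()
        if (r, c) == goal:
            return dist
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] != '#' and (nr, nc) not in seen:
                seen.add((nr, nc))
                frontier.append(((nr, nc), dist + 1))
    return None

# Usage: plan from the top-left corner to the bottom-right corner of one sampled map.
sampled_map = ["..#",
               "..#",
               "..."]
steps = shortest_path_length(sampled_map, (0, 0), (2, 2))
```

The returned length can serve as the per-world plan cost that is later averaged across samples.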

SLIDE 17

Planning in Sampled Worlds

[Figure: A Single Sample]

SLIDE 18

Acting in Sampled Worlds

Execute Best Currently Available Action: maximize expected reward

SLIDE 19

Acting in Sampled Worlds

SLIDE 20

Results

Introduction OH-wOW Results ■ Rescue ■ Rescue (sim) ■ Omelette (sim) Conclusion

SLIDE 21

Search and Rescue

UNH CS Offices, Pioneer 3-DX, SICK LMS500, ROS Fuerte

victims found
deadline: 1 2 3
1 minute: 4 6
5 minutes: 7 3
10 minutes: 3 4 3

Joshi et al.: 4 hours precomputation, 3 victims; constant time table lookup

OH-wOW: no precomputation
0.18 sec avg max step time, 3 victims (256 samples)
2.7 sec avg max step time, 10 victims (256 samples)

SLIDE 22

Search and Rescue

OH-wOW:

is online,

computes the next action quickly,

and handles the tradeoff between hard and soft goals.

SLIDE 23

Search and Rescue in Simulation

[Figure: cost over optimal (y-axis) for 32 samples, 256 samples, and the controller (ctlr), in three groups labeled none, south, and southwest]

OH-wOW:

leverages domain-specific knowledge,

and can beat a hand-coded controller.

SLIDE 24

Omelette Domain in Simulation

Levesque (AAAI ’96)

planning time (seconds)     3 eggs   step   4 eggs   step
Bonet et al. (IJCAI ’01)    185      -
Levesque (IJCAI ’05)        1.4      -      1,681    -
OH-wOW                      12.9     0.52   76.7     1.57

Levesque's plans are longer than OH-wOW's.

OH-wOW:

is online,

computes the next action quickly,

and finds lower-cost solutions.

SLIDE 25

Conclusion

Introduction OH-wOW Results Conclusion ■ Limitations ■ Summary ■ Advertising

SLIDE 26

Limitations

Scalability of the underlying planner: leverage large body of literature

Calls underlying planner repetitively: embarrassingly parallel

Vulnerable to black swans during sampling: importance sampling

Regenerates world samples at every step: reuse samples until world “changes” (see Yoon et al. ICAPS ’10 for HO Optimizations)
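
To illustrate the importance sampling remedy for black swans, here is a small sketch that estimates an expected plan cost when one kind of world is rare but expensive. The world kinds, probabilities, and costs are made-up numbers for illustration, and `importance_estimate` is a hypothetical helper, not part of OH-wOW as presented.

```python
import random

def importance_estimate(costs, p, q, n_samples, rng):
    """Estimate the expected cost under p by sampling world kinds from q.

    costs: dict kind -> plan cost in a world of that kind
    p:     dict kind -> true probability of that kind
    q:     dict kind -> proposal probability used for sampling (nonzero wherever p is)
    Each sample is reweighted by p/q so the estimate stays unbiased.
    """
    kinds = list(q)
    weights = [q[k] for k in kinds]
    total = 0.0
    for kind in rng.choices(kinds, weights=weights, k=n_samples):
        total += (p[kind] / q[kind]) * costs[kind]
    return total / n_samples

# Hypothetical numbers: a rare world kind with a very high cost.
costs = {"common": 1.0, "swan": 1000.0}
p = {"common": 0.99, "swan": 0.01}
q = {"common": 0.5, "swan": 0.5}      # proposal deliberately over-samples the rare kind
estimate = importance_estimate(costs, p, q, 20000, random.Random(1))
```

With these numbers the exact expectation is 0.99 · 1 + 0.01 · 1000 = 10.99. Sampling directly from p with only a few dozen samples would often miss the rare kind entirely, which is the vulnerability the slide points at.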

SLIDE 27

Summary

The OH-wOW framework is a:

Fast,

Simple,

General,

Online,

Approximate,

Way of Handling Open Worlds.

SLIDE 28

The University of New Hampshire

Tell your students to apply to grad school in CS at UNH!

friendly faculty

funding

individual attention

beautiful campus

low cost of living

easy access to Boston, White Mountains

strong in AI, infoviz, networking, systems, bioinformatics