
SA3 - 1 of 141 Greedy On-Line Planning; (c) Sven Koenig; Georgia Tech; January 2002.

Greedy On-Line Planning

Sven Koenig

http://www.cc.gatech.edu/fac/Sven.Koenig/

Collaborators: David Furcy, Yaxin Liu, Yuri Smirnov, Craig Tovey, Maxim Likhachev (Additional Programming: Colin Bauer, William Halliburton)


Greedy On-Line Planning

  • abstract overview: what is greedy on-line planning?
  • greedy on-line planning makes planning tractable
  • greedy on-line planning is reactive to the current situation

(plus other advantages)

  • fast replanning for greedy on-line planning

Part 1: greedy on-line planning makes planning tractable - example: greedy localization
Part 2: greedy on-line planning is reactive to the current situation - examples: greedy mapping, moving a robot to goal coordinates in unknown terrain
Part 3: fast replanning for greedy on-line planning - heuristic search-based replanning and calculating the heuristics for heuristic search-based planning; examples: greedy mapping, moving a robot to goal coordinates in unknown terrain, symbolic planning, replanning of shortest paths


Nondeterministic Planning - The Problem


planning in nondeterministic domains is time consuming due to the many contingencies



Nondeterministic Planning - A Solution


agent-centered search makes it more efficient by interleaving planning with limited lookahead and plan execution

Agent-Centered Search [Koenig; 2001]


state space can even become deterministic

[Nourbakhsh, 1997]




Nondeterministic Planning - A Solution


Agent-Centered Search

traditional search: first plan completely, then execute the plan
agent-centered search: small (bounded) planning time between plan executions (depends on the search area); small sum of planning and execution time






Nondeterministic Planning - Another Solution


assumption-based planning makes it more efficient by making assumptions about the outcomes of action executions

Assumption-Based Planning





Nondeterministic Planning: Greedy On-Line Planning

Both agent-centered search and assumption-based planning are

  • greedy planning methods, because they make simplifying assumptions to make planning tractable
  • on-line planning methods, because they interleave planning and plan execution

Note: without additional assumptions, it is not guaranteed that greedy on-line planning methods achieve the goal!


Nondeterministic Planning: Robot Navigation under Incomplete Information

  • localization: the robot knows the map but not its location
  • mapping: the robot knows its location but not the map
  • goal-directed navigation in unknown terrain

Sensor-Based Planning [Choset and Burdick, 1994]



Part 1 Greedy On-line Planning makes Planning Tractable


Greedy Localization

Setting (discretized space, short-range sensor): The robot is always in exactly one cell.* The robot has a compass on board, has no sensor or actuator uncertainty, and knows the map, but initially does not know where it is. The robot always senses which of the four adjacent cells are empty and can move to one of the four adjacent empty cells.

The task of the robot is to find out where it is with the shortest travel distance in the worst case (that is, for the worst possible start location), or to detect that this is impossible. (Example: 5 moves)

* We also have results for continuous terrain that are similar to the ones presented in the following for discretized terrain.


Hardness of (Approximately) Optimal Localization

Theorem [Tovey and Koenig, 2000]: It is in NP to determine whether there exists a valid localization plan that executes no more movements than a given value. It is NP-hard to find a localization plan in gridworlds of size m × n whose worst-case number of movements to localization is within a factor O(log(mn)) of optimum, even in connected gridworlds in which localization is possible.

contrast with: [Dudek, Romanik, Whitesides, 1995]

Example: elements e1, ..., e5 and sets S1, S2, S3; x = 5 (number of elements), y = 3 (number of sets), y* = 2 (number of sets that form a smallest set cover).

Set Cover Theorem [Lund and Yannakakis, 1994]: It is NP-hard to find a set cover whose number of sets is within a factor O(log(x)) of minimum (for sufficiently small constants).

To prove the theorem above, we reduce set cover problems to our localization problems.
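The O(log(x)) threshold in the Set Cover Theorem is essentially matched from above by the classic greedy algorithm, which repeatedly picks the set covering the most still-uncovered elements and achieves a factor of H(x) ≤ ln(x) + 1. A minimal Python sketch; the concrete universe and sets below are illustrative, chosen only to match the slide's sizes x = 5, y = 3, y* = 2:

```python
def greedy_set_cover(universe, sets):
    """Classic greedy set cover: repeatedly pick the set that covers
    the most still-uncovered elements. The resulting cover is within
    a factor H(x) <= ln(x) + 1 of a minimum cover, matching the
    O(log(x)) hardness threshold stated above."""
    uncovered = set(universe)
    cover = []
    while uncovered:
        best = max(sets, key=lambda s: len(s & uncovered))
        if not best & uncovered:
            raise ValueError("universe is not coverable by the given sets")
        cover.append(best)
        uncovered -= best
    return cover

# Illustrative instance with the slide's sizes: x = 5 elements,
# y = 3 sets, and a smallest cover of y* = 2 sets.
universe = {1, 2, 3, 4, 5}
sets = [{1, 2, 3}, {2, 4}, {4, 5}]
print(len(greedy_set_cover(universe, sets)))  # → 2
```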

Hardness of (Approximately) Optimal Localization



Hardness of (Approximately) Optimal Localization

Whenever ei ∈ Sj, we make the corresponding horizontal corridor i cells shorter. To localize, the robot has to visit all the horizontal corridors that correspond to a set cover.

The construction is replicated x³ times, for a total of m = 3x³y + 1 cells; n = (xy+2)(x+1) cells; each corridor is xy cells long, and a signature region lies near the start. (We leave out some small technical details.) Convention: vertical = column, horizontal = row.

Consider the following localization plan: find the closest signature (= gives the robot its current column), then move into all vertical corridors that correspond to a smallest set cover (= gives the robot its current row). The number of movements of this localization plan is at most 3y*xy. Thus, the number of movements of an optimal localization plan is at most 3y*xy. Thus, the number of movements of a localization plan whose worst-case number of movements to localization is within a factor O(log(mn)) of optimum is at most O(log(mn)) 3y*xy = O(log(x)) 3y*xy ≤ O(3x³y). Thus, such a plan cannot leave its current east-west corridor and can only localize by moving into all corridors that correspond to a set cover. Let y' denote the cardinality of this set cover. Then, the number of movements is at least (2y'-1)(xy-x-1). Thus, the number of movements is at least (2y'-1)(xy-x-1) and at most O(log(x)) 3y*xy, implying that y' = O(log(x)) y* and thus that the set cover is within a factor O(log(x)) of minimum. However, it is NP-hard to find a set cover whose number of sets is within a factor O(log(x)) of minimum. qed

Hardness of (Approximately) Optimal Localization


Theorem [Tovey and Koenig, 2000]: For every gridworld of size m × n, there exists a valid localization plan that executes O(mn) movements to localization and that can be found in time O(mn). This result is the best possible in the sense that there exist gridworlds of size m × n in which every valid localization plan must execute Ω(mn) movements to localization and can only be found in time Ω(mn).

Cost of (Approximately) Optimal Localization



Cost of (Approximately) Optimal Localization

[Proof sketch figure: the map and robot trajectory, the knowledge of the robot, and the matching of the map and the knowledge of the robot, in a gridworld of m × n cells. qed]



Greedy Localization

Greedy Localization repeatedly makes the robot execute a shortest (deterministic) movement sequence (subplan) that is guaranteed to reduce the number of possible robot cells by at least one. Greedy localization uses new information right away.

[Genesereth and Nourbakhsh, 1993][Koenig and Simmons, 1998]

Example (cells addressed by row letter A-F and column number 1-8): the possible robot cells are {A1,C1,E1,B4,D4}. After "move east" and sensing, the belief splits into {A2,B5} and {C2,E2,D5}; from {C2,E2,D5}, after "move south" and sensing, it splits into {D2,E5} and {F2}; and so on.
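One way to realize the subplan search is breadth-first search in belief space: a belief is a set of possible robot cells, a move is part of a subplan only if it is unblocked from every cell in the belief (so execution is deterministic), and the search stops as soon as sensing is guaranteed to split the belief. A minimal sketch under these assumptions; the 4-connected `GRID` and the helper names are hypothetical, not from the talk:

```python
from collections import deque

# 0 = free, 1 = obstacle; a small hypothetical map (not from the talk).
GRID = [
    [0, 0, 0, 1],
    [1, 0, 1, 1],
    [0, 0, 0, 0],
]
MOVES = {"N": (-1, 0), "E": (0, 1), "S": (1, 0), "W": (0, -1)}

def free(r, c):
    return 0 <= r < len(GRID) and 0 <= c < len(GRID[0]) and GRID[r][c] == 0

def signature(r, c):
    # Short-range sensor: which of the four adjacent cells are empty.
    return tuple(free(r + dr, c + dc) for dr, dc in MOVES.values())

def greedy_subplan(belief):
    """BFS in belief space for a shortest deterministic movement
    sequence after which sensing is guaranteed to split the belief,
    i.e., to reduce the number of possible robot cells."""
    start = frozenset(belief)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        cells, plan = queue.popleft()
        # Group the possible cells by what the robot would sense there.
        groups = {}
        for r, c in cells:
            groups.setdefault(signature(r, c), set()).add((r, c))
        if len(groups) > 1 or len(cells) == 1:
            return plan, groups                # gain in information (or done)
        for name, (dr, dc) in MOVES.items():
            # The subplan must be executable from *every* possible cell.
            if all(free(r + dr, c + dc) for r, c in cells):
                nxt = frozenset((r + dr, c + dc) for r, c in cells)
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, plan + [name]))
    return None, None                          # localization impossible

# Cells (0,0) and (2,0) look identical to the sensor; one step east
# makes their neighbourhoods differ, so sensing then splits the belief.
plan, groups = greedy_subplan({(0, 0), (2, 0)})
print(plan)  # → ['E']
```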


Greedy Localization = Agent-Centered Search


Greedy Localization repeatedly makes the robot execute a shortest (deterministic) movement sequence (subplan) that is guaranteed to reduce the number of possible robot cells by at least one. Thus, it plans in the deterministic part of the nondeterministic state space until a plan is found that achieves a gain in information. Note: assume that localization is possible and that the state space is safely explorable. Then Greedy Localization always achieves a gain in information and thus localizes the robot.




Cost of Greedy Localization

Theorem: The planning and plan-execution times of Greedy Localization are guaranteed to be low-order polynomials in the size of the gridworld.

Greedy Localization makes planning tractable.


Random Acyclic Mazes

gridworld   obstacle   av. number of subplans   av. number of steps   av. total number of movements
size        density    to localization          per subplan           to localization
11 x 11     41.3 %     2.4                      x 1.5                 = 3.6
21 x 21     45.4 %     3.3                      x 1.7                 = 5.4
31 x 31     46.8 %     3.8                      x 1.7                 = 6.6
41 x 41     47.6 %     4.1                      x 1.8                 = 7.5
51 x 51     48.1 %     4.5                      x 1.8                 = 8.0
61 x 61     48.4 %     4.7                      x 1.8                 = 8.6
71 x 71     48.6 %     4.9                      x 1.9                 = 9.1

Cost of Greedy Localization

Greedy Localization is fast in practice.

(The 71 x 71 mazes have 5041 cells.)

Example for a Corridor-Like Terrain. Theorem [Tovey and Koenig, 2000]: The worst-case number of movements of Greedy Localization can be a factor Ω((mn)^(1/3)) worse than the optimal worst-case number of movements to localization in gridworlds of size m × n, even in connected gridworlds in which localization is possible.

Cost of Greedy Localization

However, its plan-execution time cannot be optimal.

[Proof figure: corridor-like terrain, starting cell, and so on. qed]

Cost of Greedy Localization


gridworld   obstacle   av. number of subplans   av. number of steps   av. total number of movements
size        density    to localization          per subplan           to localization
11 x 25     50.2 %      4.5                     x 2.3                 =  10.2
13 x 36     50.2 %      5.9                     x 2.9                 =  16.9
15 x 49     50.2 %      7.4                     x 3.2                 =  23.7
17 x 64     50.2 %      8.9                     x 3.4                 =  30.6
19 x 81     50.2 %     10.4                     x 4.0                 =  42.0
21 x 100    50.1 %     11.5                     x 4.4                 =  50.0
23 x 121    50.1 %     13.4                     x 4.5                 =  60.4
25 x 144    50.1 %     14.4                     x 4.9                 =  71.1
27 x 169    50.1 %     16.0                     x 5.2                 =  82.5
29 x 196    50.1 %     18.0                     x 5.4                 =  98.0
31 x 225    50.1 %     19.4                     x 5.7                 = 110.5
33 x 256    50.1 %     20.8                     x 5.8                 = 121.5
35 x 289    50.1 %     22.5                     x 6.1                 = 137.7

Our Acyclic Mazes

Cost of Greedy Localization

(The 29 x 196 mazes have 5684 cells; the 27 x 169 mazes have 4563 cells.)

Example for a Room-Like Terrain.* Theorem: The worst-case number of movements of Greedy Localization can be a factor Ω(mn / log(mn)) worse than the optimal worst-case number of movements to localization in gridworlds of size m × n, even in connected gridworlds in which localization is possible.

Cost of Greedy Localization

However, its plan-execution time cannot be optimal. * We also have even better lower bounds (although in more complex gridworlds) and small upper bounds.


Cost of Greedy Localization

[Proof figure: room-like terrain whose rooms encode binary signatures 0 0 0 0 0 1 1 ... qed]

                                        planning time           plan-execution time
Greedy Localization                     low-order polynomial    low-order polynomial
(Approximately) Optimal Localization    (likely) exponential    low-order polynomial

Cost of Greedy Localization - Summary



Extension: Actuator and Sensor Noise

so far: no sensor uncertainty, no actuator uncertainty (minimax model)
more realistic on robots: sensor uncertainty, actuator uncertainty (probabilistic model) - POMDP-based ("Markov") Localization

Mobile robots have

  • noisy actuators
  • noisy sensors (for example, a sonar ring feeding an occupancy grid)


Extension: Actuator and Sensor Noise

Landmark-based navigation is "sensitive to the environment"; metric-based navigation is "sensitive to robot movements". To be sensitive to both the environment and the robot movements, maintain a probability distribution over all locations (location distribution): either discretize the locations but allow arbitrary location distributions (POMDPs = Partially Observable Markov Decision Process models), or restrict the location distributions but do not discretize the locations (Kalman Filters).

[Figure: example location distribution 0.20 0.10 0.10 0.20 0.10 0.10 0.05 0.05.]

[Architecture diagram: a Destination Planner performs path planning to the goal location; policy generation compiles the path and a topological map with prior actuator, sensor, and distance models into a POMDP (using GROW-BW, based on Baum-Welch); location estimation updates the current location distribution from sensor and motion reports (Bayes' rule), with raw sonar data interpreted via an occupancy grid [Elfes] and raw odometer data via motion reports; directive selection maps location distributions to directives ("policy"); motion generation turns the desired directive into motor commands. Layers: navigation, obstacle avoidance, real-time control.]


POMDP-Based Navigation on Xavier

Extension: Actuator and Sensor Noise

now very popular with large amount of follow-up work

  • operated for three years with > 200 km travel distance

[Simmons and Koenig, 1995] [Koenig and Simmons, 1998] [Thrun, 2000]



  • explicitly models all uncertainty using probabilities
  • maintains arbitrary probability distributions over the locations
  • utilizes all available sensor data (landmarks, robot movements)
  • robust towards sensor errors (no explicit exception handling required)
  • uniform, theoretically grounded framework for localization

POMDP-based (“Markov”) Localization

Extension: Actuator and Sensor Noise


Extension: Actuator and Sensor Noise

minimax model (no sensor uncertainty, no actuator uncertainty) vs. probabilistic model (sensor uncertainty, actuator uncertainty): POMDP-based ("Markov") Localization

[Figure: simplified example of a location distribution, 0.1 on eight cells and 0.2 on one, with sensor readings 50 20 20 10.]



Extension: Actuator and Sensor Noise

It is NP-hard to find an optimal homing sequence for a colored finite state automaton [Schapire, 1992]. Add more structure (the robot can only move north, east, south, or west): it is NP-hard to find an optimal localization sequence in a gridworld.

It is PSPACE-hard to find an optimal policy for a POMDP [Papadimitriou, Tsitsiklis, 1987]. Add more structure (the robot can only move north, east, south, or west): ?????

[Figure: a colored finite state automaton with states colored a/b and transitions labeled n, e, s, w.]


Extension: Actuator and Sensor Noise

minimax model (no sensor uncertainty, no actuator uncertainty): Greedy Localization repeatedly makes the robot execute a shortest (deterministic) movement sequence (subplan) that is guaranteed to reduce the number of possible robot cells by at least one.

probabilistic model (sensor uncertainty, actuator uncertainty), POMDP-based ("Markov") Localization: Greedy Localization repeatedly makes the robot execute a shortest movement sequence (subplan) that reduces the entropy of the probability distribution over the possible robot cells.

[Burgard, Fox, Thrun, 1997]
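The entropy-based variant can be made concrete with a one-line Bayes observation update and the Shannon entropy of the location distribution. A small sketch; the four-cell belief and the sensor model below are made up purely for illustration:

```python
import math

def bayes_update(belief, likelihood):
    """One observation update of POMDP-based ("Markov") localization:
    posterior(cell) is proportional to P(obs | cell) * prior(cell)."""
    posterior = {c: likelihood.get(c, 0.0) * p for c, p in belief.items()}
    z = sum(posterior.values())
    if z == 0.0:
        raise ValueError("observation impossible under current belief")
    return {c: p / z for c, p in posterior.items()}

def entropy(belief):
    """Shannon entropy (bits) of the location distribution - the
    quantity the noisy-sensor variant of Greedy Localization tries
    to drive down with each subplan."""
    return -sum(p * math.log2(p) for p in belief.values() if p > 0.0)

# Hypothetical four-cell belief and sensor model (illustrative only).
belief = {"A": 0.25, "B": 0.25, "C": 0.25, "D": 0.25}
likelihood = {"A": 0.9, "B": 0.1, "C": 0.1, "D": 0.1}  # P(obs | cell)
posterior = bayes_update(belief, likelihood)
print(entropy(belief), entropy(posterior))  # entropy drops after sensing
```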



Part 2 Greedy On-line Planning is Reactive to the Current Situation (plus other advantages)


Greedy Mapping

Greedy Mapping always moves the robot on a shortest path to the closest unobserved (or unvisited) cell.


we assume here that the robot can move in eight directions

[Koenig, Tovey, Halliburton, 2001] [Thrun et al. 1998] [Romero, Morales, Sucar, 2001]
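A single decision of Greedy Mapping reduces to one breadth-first search from the robot's cell to a closest unobserved cell. A minimal sketch under simplifying assumptions (cells are hashable coordinates and the robot's own cell is always observed; `neighbors`, `known_free`, and `observed` are hypothetical names, not from the talk):

```python
from collections import deque

def next_step_greedy_mapping(known_free, observed, pos, neighbors):
    """One decision of Greedy Mapping: breadth-first search from the
    robot's cell (assumed observed) through known-traversable cells
    toward a closest unobserved cell; returns the first step to take."""
    queue = deque([pos])
    parent = {pos: None}
    while queue:
        cell = queue.popleft()
        if cell not in observed:
            # Walk the parent chain back to recover the first step.
            while parent[cell] != pos:
                cell = parent[cell]
            return cell
        for nxt in neighbors(cell):
            # Expand through known-free cells; unobserved cells are
            # valid targets but are not expanded further.
            if (nxt in known_free or nxt not in observed) and nxt not in parent:
                parent[nxt] = cell
                queue.append(nxt)
    return None  # everything reachable has been observed: mapping is done

# Illustrative 1 x 3 corridor: the robot at (0,0) has observed the two
# left cells; the closest unobserved cell is (0,2), so step to (0,1).
cells = {(0, 0), (0, 1), (0, 2)}
def neighbors(cell):
    r, c = cell
    return [n for n in ((r, c - 1), (r, c + 1), (r - 1, c), (r + 1, c)) if n in cells]
print(next_step_greedy_mapping({(0, 0), (0, 1)}, {(0, 0), (0, 1)}, (0, 0), neighbors))  # → (0, 1)
```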


Greedy Mapping = Agent-Centered Search


Greedy Mapping always moves the robot on a shortest path to the closest unobserved (or unvisited) cell. Thus, it plans in the deterministic part of the nondeterministic state space until a plan is found that achieves a gain in information. Note: assume that mapping is possible and that the state space is safely explorable. Then Greedy Mapping always achieves a gain in information and thus maps the terrain.




can easily be integrated into robot architectures (“reactive planning”)


does not need to be in control of the robot at all times (“reactive planning”)

Greedy Mapping - Advantages

for example, our implementation combines greedy mapping and schema-based navigation (MissionLab) [Mackenzie, Arkin, Cameron, 1997]

we assume here that the robot can move in eight directions




  • utilizes prior map knowledge, if available
  • can be used by multiple robots that share their maps

Greedy Mapping - Advantages

we assume here that the robot can move in eight directions

[Figure: test environment, 28 feet by 20 feet.]

Greedy Mapping - Robot Implementation




Greedy Mapping - Travel Distance



[Plot: travel distance (up to 1800) versus number of vertices (200-2000).]




Greedy Mapping - Travel Distance

we assume here that the robot can move in eight directions


Here: Greedy Mapping always moves the robot on a shortest path to the closest unvisited cell. This version of Greedy Mapping works on any strongly connected graph (for example, any connected undirected graph).

[Figure: example graph with vertices 1-12; legend: visited (known) vertex, unvisited known vertex, known edge, current vertex of the robot, a shortest path to a closest unvisited vertex, start.]


Trivial Theorem: The worst-case number of movements of Greedy Mapping is Ω(s) and O(s^2), where s is the number of vertices of the graph, even for undirected planar graphs.

More Interesting Theorem [Koenig, Tovey, Smirnov, 2001]: The worst-case number of movements of Greedy Mapping is Ω(s log s / log log s) and O(s log s), where s is the number of vertices of the graph, even for undirected planar graphs.


Here: Greedy Mapping always moves the robot on a shortest path to the closest unvisited cell.

Greedy Mapping - Travel Distance



[Plot (logscale): travel distance versus |V|, showing the identity function, the lower bound of order |V| log|V| / log log|V| for Greedy Mapping (a tight bound for chronological backtracking), and the upper bound of order |V| log|V| for Greedy Mapping.]

n     |V|            travel distance   travel distance / |V|
3     80             207               2.59
4     778            2279              2.93
5     9612           31253             3.25
6     144014         515085            3.58
7     2542528       9928271            3.90
8     51744018      219130987          4.23
9     1193201300    5448100629         4.57
10    30753086422   150617283953       4.90

Greedy Mapping - Travel Distance

can we use structure to decrease the travel distance?


[Brumitt and Stentz, 1998] [Hebert, McLachlan, Chang, 1999] [Matthies et al., 2000] [Stentz and Hebert, 1995] [Thayer et al., 2000]

Planning with the Freespace Assumption

Planning with the Freespace Assumption always moves the robot on a shortest potentially unblocked path to the goal cell.


we assume here that the robot can move in eight directions
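The strategy can be sketched as "plan as if every unknown cell were traversable, take one step, sense, replan". The sketch below uses plain BFS for the shortest-path step (the systems cited here use more sophisticated incremental planners); the 3 x 3 world, `sense`, and `neighbors` are illustrative assumptions, not from the talk:

```python
from collections import deque

def plan_with_freespace(known_blocked, start, goal, neighbors):
    """Freespace assumption: treat every cell not known to be blocked
    as traversable and return a shortest such path (BFS sketch)."""
    queue = deque([start])
    parent = {start: None}
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        for nxt in neighbors(cell):
            if nxt not in known_blocked and nxt not in parent:
                parent[nxt] = cell
                queue.append(nxt)
    return None  # the goal is provably unreachable

def navigate(start, goal, neighbors, sense):
    """Sense, replan under the freespace assumption, take one step."""
    pos, known_blocked, route = start, set(), [start]
    while pos != goal:
        known_blocked |= sense(pos)         # learn nearby blocked cells
        path = plan_with_freespace(known_blocked, pos, goal, neighbors)
        if path is None:
            return None
        pos = path[1]                       # execute one step, then replan
        route.append(pos)
    return route

# Illustrative 3 x 3 world: two blocked cells force a detour that the
# robot only discovers while moving (sensing reveals adjacent cells).
cells = {(r, c) for r in range(3) for c in range(3)}
truly_blocked = {(0, 1), (1, 1)}
def neighbors(cell):
    r, c = cell
    return [n for n in ((r, c - 1), (r, c + 1), (r - 1, c), (r + 1, c)) if n in cells]
def sense(pos):
    return {n for n in neighbors(pos) if n in truly_blocked}
print(navigate((0, 0), (0, 2), neighbors, sense))
```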

[Brumitt and Stentz, 1998] [Hebert, McLachlan, Chang, 1999] [Matthies et al., 2000] [Thayer et al., 2000]

Planning with the Freespace Assumption

Planning with the Freespace Assumption always moves the robot on a shortest potentially unblocked path to the goal cell.

HMMWV that navigated 1,410 meters of natural outdoor terrain in 1995

  • Demo Vehicles of the Darpa UGV II Program
  • Mars Rover Prototype
  • Prototypes of Urban Reconnaissance Robots

[Stentz and Hebert, 1995]


Freespace Assumption = Assumption-Based Planning

Planning with the Freespace Assumption always moves the robot on a shortest potentially unblocked path to the goal cell.

Thus, it makes assumptions about outcomes of actions that make the nondeterministic state space deterministic.



Note: assume that moving to the goal is possible and that the state space is safely explorable. Then Planning with the Freespace Assumption always achieves a gain in information and thus moves the robot to the goal.



Freespace Assumption - Travel Distance

Here: Planning with the Freespace Assumption always moves the robot on a shortest (potentially unblocked) path to the goal vertex.


[Figure: graph with vertices 1-8; legend: edge known to be unblocked, edge assumed to be unblocked, edge known to be blocked, a shortest potentially traversable path to the goal.]

The robot can move north, east, south, or west.

Freespace Assumption - Travel Distance

Here: Planning with the Freespace Assumption always moves the robot on a shortest (potentially unblocked) path to the goal vertex.

[Figure: gridworld, the corresponding graph, and the robot's initial knowledge of the graph; legend: unblocked edge / edge known to be unblocked, blocked edge / edge known to be blocked, edge assumed to be unblocked.]

The robot can move forward, move backward, or turn 90 degrees left (or right).


Freespace Assumption - Travel Distance

Planning with the Freespace Assumption results in small travel distances if the freespace assumption is approximately satisfied, that is, if the obstacle density is small. However, the travel distances are also small if the freespace assumption is not satisfied.


[Plot: travel distance (up to 400) versus number of vertices (200-2000).]

Freespace Assumption - Travel Distance




Theorem [Koenig, Tovey, Smirnov, 2001]*: The worst-case number of movements of Planning with the Freespace Assumption is Ω(s log s / log log s) and O(s^(3/2)), where s is the number of vertices of the graph, even for undirected planar graphs.

* we also have even better bounds

Freespace Assumption - Travel Distance


Here: Planning with the Freespace Assumption always moves the robot on a shortest (potentially unblocked) path to the goal vertex.


Part 3 Fast Replanning for Greedy On-line Planning


[Figure: gridworld annotated with goal distances.]

Greedy Mapping - Implementation

Greedy Mapping always moves the robot on a shortest path to the closest unobserved (or unvisited) cell.

we assume here that the robot can move in eight directions

[Figure: goal distances before and after the robot observes new obstacles.]

Freespace Assumption - Implementation

[Figure: goal-distance grids for successive replanning episodes.]

Planning with the Freespace Assumption always moves the robot on a shortest potentially unblocked path to the goal cell.

we assume here that the robot can move in eight directions



original eight-connected gridworld

Path Planning - Example

we assume here that the robot can move in eight directions



changed eight-connected gridworld

Path Planning - Example

we assume here that the robot can move in eight directions



original eight-connected gridworld

Path Planning - Example

we assume here that the robot can move in eight directions

[Figure: g-values of all cells in the original eight-connected gridworld, from sstart to sgoal.]


changed eight-connected gridworld

Path Planning - Example

we assume here that the robot can move in eight directions

[Figure: g-values of all cells in the changed eight-connected gridworld, from sstart to sgoal; changed values highlighted.]



Path Planning - Example

heuristic search (Artificial Intelligence): how to search efficiently, using heuristics to guide the search
incremental search (Algorithm Theory): how to search efficiently, by reusing information from previous searches


                     complete search                      incremental search
uninformed search    Breadth-First Search                 DynamicSWSF-FP [Ramalingam, Reps, 1996] (with early termination - our addition)
heuristic search     A* [Hart, Nilsson, Raphael, 1968]    Lifelong Planning A*

Path Planning - Lifelong Planning A*


Path Planning - Experimental Evaluation

original eight-connected gridworld

[Figure: vertices expanded from sstart to sgoal by uninformed vs. heuristic and complete vs. incremental search (Lifelong Planning A*).]


Path Planning - Experimental Evaluation

changed eight-connected gridworld

[Figure: vertices expanded from sstart to sgoal by uninformed vs. heuristic and complete vs. incremental search (Lifelong Planning A*).]



Path Planning - Experimental Evaluation

changed eight-connected gridworld - first implementation
(ve = vertex expansions, va = vertex accesses, hp = heap percolates; with the same tie-breaking as LPA*)

                     complete search            incremental search
uninformed search    ve =  1331.7 +/-   4.4     ve =   173.0 +/-   4.9
                     va = 26207.2 +/-  84.0     va =  5697.4 +/- 167.0
                     hp =  5985.3 +/-  19.7     hp =   956.2 +/-  26.6
heuristic search     ve =   284.0 +/-   5.9     ve =    25.6 +/-   2.0  (Lifelong Planning A*)
                     va =  6177.3 +/- 129.3     va =  1235.9 +/-  75.0
                     hp =  1697.3 +/-  39.9     hp =   240.1 +/-  16.9


Path Planning - Experimental Evaluation

changed eight-connected gridworld - second implementation
(ve = vertex expansions, hp = heap percolates; with the same tie-breaking as LPA*)

                     complete search       incremental search
uninformed search    ve = 801.76           ve = 115.95
                     hp = 2359.60          hp = 561.48
heuristic search     ve = 172.20           ve =  18.80  (Lifelong Planning A*)
                     hp = 724.60           hp = 182.15


Path Planning - Experimental Evaluation

heuristic search (A*, with better tie-breaking than LPA*): ve = 68.17, hp = 547.72, t1 = 13.62, t2 = 18.61
Lifelong Planning A*: ve = 18.80, hp = 182.15, t1 = 6.66, t2 = 13.22
(ve = vertex expansions, hp = heap percolates, t1 = time in main search routine, t2 = total runtime including maze generation etc.)

Observations: tie-breaking matters; A* expands vertices faster than LPA*; the time speedup is about x1.5 in the long run; after the third replanning episode, the total planning time of LPA* over all episodes is less than that of A*.


procedure CalculateKey(s)
  return [min(g(s), rhs(s)) + h(s,sgoal); min(g(s), rhs(s))];

procedure Initialize()
  U = Ø;
  for all s ∈ S: rhs(s) = g(s) = ∞;
  rhs(sstart) = 0;
  U.Insert(sstart, CalculateKey(sstart));

procedure UpdateVertex(u)
  if (u ≠ sstart) rhs(u) = min_{s' ∈ Pred(u)} (g(s') + c(s',u));
  if (u ∈ U) U.Remove(u);
  if (g(u) ≠ rhs(u)) U.Insert(u, CalculateKey(u));

procedure ComputeShortestPath()
  while (U.TopKey() < CalculateKey(sgoal) OR rhs(sgoal) ≠ g(sgoal))
    u = U.Pop();
    if (g(u) > rhs(u))
      g(u) = rhs(u);
      for all s ∈ Succ(u): UpdateVertex(s);
    else
      g(u) = ∞;
      for all s ∈ Succ(u) ∪ {u}: UpdateVertex(s);

procedure Main()
  Initialize();
  forever
    ComputeShortestPath();
    Wait for changes in edge costs;
    for all directed edges (u,v) with changed edge costs
      Update the edge cost c(u,v);
      UpdateVertex(v);

Path Planning - Lifelong Planning A*

This version of LPA* can be optimized further without changing its overall operation. We also have versions of LPA* that

  • break ties differently
  • work with inconsistent heuristics
  • terminate earlier
  • contain several runtime optimizations.

U.TopKey() returns the smallest priority of all vertices in the priority queue U. If U is empty, then U.TopKey() returns [∞; ∞]. U.Pop() deletes the vertex with the smallest priority in priority queue U and returns the vertex. U.Insert(s,k) inserts vertex s into priority queue U with priority k. Finally, U.Remove(s) removes vertex s from priority queue U.

The heuristics need to be nonnegative and (forward) consistent: h(sgoal,sgoal) = 0 and h(s,sgoal) ≤ c(s,s') + h(s',sgoal) for all vertices s ∈ S and s' ∈ Succ(s).

[Koenig, Likhachev, 2001]
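The pseudocode above can be sketched in Python with a binary heap. This is a hedged sketch, not the authors' implementation: the class and method names are ours, and stale queue entries are skipped lazily on pop instead of being removed eagerly by U.Remove.

```python
import heapq

class LPAStar:
    """Sketch of Lifelong Planning A* on a directed graph given as
    graph[u][v] = c(u, v) >= 0. Names are ours, not from the slides;
    stale heap entries are skipped lazily instead of calling U.Remove."""

    def __init__(self, graph, start, goal, h):
        self.graph, self.start, self.goal, self.h = graph, start, goal, h
        self.pred = {u: {} for u in graph}           # pred[v][u] = c(u, v)
        for u, succs in graph.items():
            for v, c in succs.items():
                self.pred.setdefault(v, {})[u] = c
        nodes = set(graph) | set(self.pred)
        self.g = {u: float('inf') for u in nodes}
        self.rhs = dict(self.g)
        self.rhs[start] = 0.0
        self.U = [(self.key(start), start)]          # the priority queue

    def key(self, s):
        m = min(self.g[s], self.rhs[s])
        return (m + self.h(s), m)                    # [min(g,rhs)+h; min(g,rhs)]

    def update_vertex(self, u):
        if u != self.start:
            self.rhs[u] = min((self.g[p] + c for p, c in self.pred[u].items()),
                              default=float('inf'))
        if self.g[u] != self.rhs[u]:                 # locally inconsistent
            heapq.heappush(self.U, (self.key(u), u))

    def update_edge(self, u, v, c):
        """Change the cost of directed edge (u, v) and repair rhs(v)."""
        self.graph[u][v] = c
        self.pred[v][u] = c
        self.update_vertex(v)

    def compute_shortest_path(self):
        while self.U:
            k, u = self.U[0]
            if k != self.key(u) or self.g[u] == self.rhs[u]:
                heapq.heappop(self.U)                # stale or consistent entry
                continue
            if not (k < self.key(self.goal)
                    or self.rhs[self.goal] != self.g[self.goal]):
                break                                # termination test of the slides
            heapq.heappop(self.U)
            if self.g[u] > self.rhs[u]:              # locally overconsistent
                self.g[u] = self.rhs[u]
                for s in self.graph.get(u, {}):
                    self.update_vertex(s)
            else:                                    # locally underconsistent
                self.g[u] = float('inf')
                for s in list(self.graph.get(u, {})) + [u]:
                    self.update_vertex(s)
        return self.g[self.goal]
```

A replanning episode then consists of calling `update_edge` for every changed edge and `compute_shortest_path` once, mirroring Main() above.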


Lifelong Planning A*

  • applies to the same finite search problems as A*
  • produces the same (optimal) solution as A*
  • handles arbitrary edge cost changes
  • is algorithmically very similar to A*
  • is more efficient than A* in many situations
  • applies to
  • route planning problems (traffic, networking, ...)
  • robot control
  • symbolic artificial intelligence planning
  • has nice theoretical properties
  • ...

Path Planning - Lifelong Planning A*


Path Planning - Lifelong Planning A*

[figure: eight-connected gridworld (rows A-D, columns 1-6) annotated with g-values; start and goal marked]


Path Planning - Lifelong Planning A*

[figure: a cell's rhs-value is computed from its neighbors' g-values, here rhs = min(3+1, 5+1) = 4]

  • g-value = rhs-value: cell is locally consistent
  • g-value > rhs-value: cell is locally overconsistent
  • g-value < rhs-value: cell is locally underconsistent
  • g-value ≠ rhs-value: cell is locally inconsistent

The priority queue contains exactly the locally inconsistent vertices s; their priority is [min(g(s),rhs(s))+h(s,sgoal); min(g(s),rhs(s))], smaller priorities first according to a lexicographic ordering.
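These case distinctions translate directly into code; a small sketch (function names ours) that classifies a cell and computes its queue priority from the definitions above:

```python
def classify(g, rhs):
    """Local consistency of a cell from its g- and rhs-value."""
    if g == rhs:
        return "locally consistent"
    return "locally overconsistent" if g > rhs else "locally underconsistent"

def priority(g, rhs, h):
    """Queue priority [min(g,rhs)+h; min(g,rhs)], compared lexicographically,
    which Python tuples do natively."""
    m = min(g, rhs)
    return (m + h, m)
```

For example, a cell with g = 2, rhs = 4, and h = 2 gets priority (4, 2), matching the [4;2] entries in the trace that follows.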


Path Planning - Lifelong Planning A*

[figure: the example gridworld with its g- and rhs-values]


Path Planning - Lifelong Planning A*

[figure: gridworld with g- and rhs-values; cell C3 is locally inconsistent with priority [min(2,4)+2; min(2,4)] = [4;2]]

priority queue: C3: [4;2]


Path Planning - Lifelong Planning A*

[figure: gridworld with updated g- and rhs-values and priority computations]

priority queue: D3: [4;3]; C3: [6;4]


Path Planning - Lifelong Planning A*

[figure: gridworld with updated g- and rhs-values and priority computations]

priority queue: D2: [4;4]; D4: [6;4]; D3: [6;5]


Path Planning - Lifelong Planning A*

[figure: gridworld with updated g- and rhs-values and priority computations]

priority queue: D4: [6;4]; D3: [6;5]; D2: [6;6]


Path Planning - Lifelong Planning A*

[figure: gridworld with updated g- and rhs-values and priority computations]

priority queue: D2: [6;6]; D5: [8;5]; D4: [8;6]


Path Planning - Lifelong Planning A*

[figure: gridworld with updated g- and rhs-values and priority computations]

priority queue: D5: [8;5]; D4: [8;6]; D3: [8;7]


Path Planning - Lifelong Planning A*

[figure: gridworld after the search terminates]

priority queue: D5: [8;5]; D4: [8;6]; D3: [8;7]


Theorem [Likhachev and Koenig, 2001]: ComputeShortestPath() expands every vertex at most twice and thus terminates.

Path Planning - Lifelong Planning A*

Theorem [Likhachev and Koenig, 2001]: After ComputeShortestPath() terminates, one can trace back a shortest path from the start to the goal by always moving from the current vertex s, starting at the goal, to any predecessor s' that minimizes g(s') + c(s',s) until the start is reached (ties can be broken arbitrarily).
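The trace-back rule of the theorem is mechanical; a sketch (the graph encoding is ours), where pred[s] maps each predecessor s' of s to the edge cost c(s', s):

```python
def trace_back(pred, g, start, goal):
    """Follow the theorem: starting at the goal, repeatedly move to a
    predecessor s' minimizing g(s') + c(s', s) until the start is reached,
    then return the path in start-to-goal order."""
    path = [goal]
    while path[-1] != start:
        s = path[-1]
        path.append(min(pred[s], key=lambda p: g[p] + pred[s][p]))
    return path[::-1]
```

Ties in the minimization can be broken arbitrarily, as the theorem states; `min` simply picks one of the minimizers.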


In the worst case, replanning cannot be more efficient than planning from scratch. [Nebel, Koehler, 1995]

Path Planning - Lifelong Planning A*

[figure: old and new search trees from start to goal, before and after an edge cost change]


Theorem [Likhachev and Koenig, 2001]: ComputeShortestPath() does not expand any vertices whose g-values were equal to their respective start distances before ComputeShortestPath() was called.

Path Planning - Lifelong Planning A*

= LPA* is efficient because it uses incremental search

Theorem [Likhachev and Koenig, 2001]: ComputeShortestPath() expands at most those vertices s with [f(s); g*(s)] ≤ [f(sgoal); g*(sgoal)] or [gold(s)+h(s); gold(s)] ≤ [f(sgoal); g*(sgoal)], where f(s) = g*(s)+h(s) and gold(s) is the g-value of s directly before the call to ComputeShortestPath().

= LPA* is efficient because it uses heuristic search


"Theorem" [Likhachev and Koenig, 2001]: The first search of Lifelong Planning A* is the same as that of A*. Afterwards, Lifelong Planning A* operates in a very similar way to A*. (The theorem makes this more concrete. For example, ComputeShortestPath() expands locally overconsistent vertices with finite f-values in the same order as A*.)

Path Planning - Lifelong Planning A*


Freespace Assumption - Implementation

[figure: snapshots of the gridworld with the robot's goal-distance estimates as obstacles are discovered]

Planning with the Freespace Assumption always moves the robot on a shortest potentially unblocked path to the goal cell.

we assume here that the robot can move in eight directions
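A minimal sketch of this rule (names ours; unit step costs, so diagonal moves count the same as straight ones, as in the slides' gridworld examples): compute the length of a shortest eight-connected path that treats every cell not known to be blocked as traversable.

```python
from collections import deque

def freespace_distance(known_blocked, width, height, start, goal):
    """Length of a shortest potentially unblocked eight-connected path,
    treating every cell not known to be blocked as free (the freespace
    assumption). Breadth-first search suffices for unit step costs.
    Returns None if every potential path is known to be blocked."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        x, y = queue.popleft()
        if (x, y) == goal:
            return dist[(x, y)]
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                n = (x + dx, y + dy)
                if (dx, dy) == (0, 0) or n in known_blocked or n in dist:
                    continue
                if 0 <= n[0] < width and 0 <= n[1] < height:
                    dist[n] = dist[(x, y)] + 1
                    queue.append(n)
    return None
```

The robot then takes the first step of such a path, observes new obstacles, and replans; the slides' point is that D* Lite makes the replanning step cheap.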


Transforming Planning with the Freespace Assumption to Path Planning

here: search from the goal location towards the robot location

  • allows one to reuse parts of the search tree after the robot has moved
  • allows one to use heuristics to focus the search

(this additional argument holds for Greedy Mapping later)

h(sstart,s) = approximation of the distance from the robot to vertex s
g(s) = approximation of the goal distance of vertex s


Transforming Planning with the Freespace Assumption to Path Planning

here: search from the goal location towards the robot location

  • makes incremental search efficient

[figure: old and new search trees rooted at sgoal; the robot (sstart) has moved]


Freespace Assumption - D* Lite (Basic Version)

procedure CalculateKey(s)
  return [min(g(s), rhs(s)) + h(sstart, s); min(g(s), rhs(s))];

procedure Initialize()
  U = Ø;
  for all s ∈ S: rhs(s) = g(s) = ∞;
  rhs(sgoal) = 0;
  U.Insert(sgoal, CalculateKey(sgoal));

procedure UpdateVertex(u)
  if (u ≠ sgoal) rhs(u) = min_{s' ∈ Succ(u)} (c(u,s') + g(s'));
  if (u ∈ U) U.Remove(u);
  if (g(u) ≠ rhs(u)) U.Insert(u, CalculateKey(u));

procedure ComputeShortestPath()
  while (U.TopKey() < CalculateKey(sstart) OR rhs(sstart) ≠ g(sstart))
    u = U.Pop();
    if (g(u) > rhs(u))
      g(u) = rhs(u);
      for all s ∈ Pred(u): UpdateVertex(s);
    else
      g(u) = ∞;
      for all s ∈ Pred(u) ∪ {u}: UpdateVertex(s);

procedure Main()
  Initialize();
  ComputeShortestPath();
  while (sstart ≠ sgoal)
    /* if (g(sstart) = ∞) then there is no known path */
    sstart = argmin_{s' ∈ Succ(sstart)} (c(sstart,s') + g(s'));
    Move to sstart;
    Scan graph for changed edge costs;
    if any edge costs changed
      for all directed edges (u,v) with changed edge costs
        Update the edge cost c(u,v);
        UpdateVertex(u);
      for all s ∈ U: U.Update(s, CalculateKey(s));
      ComputeShortestPath();

U.TopKey() returns the smallest priority of all vertices in the priority queue U. If U is empty, then U.TopKey() returns [∞; ∞]. U.Pop() deletes the vertex with the smallest priority in priority queue U and returns the vertex. U.Insert(s,k) inserts vertex s into priority queue U with priority k. Finally, U.Remove(s) removes vertex s from priority queue U.

The heuristics need to be nonnegative and backward consistent no matter what the start vertex is: h(sstart,sstart) = 0 and h(sstart,s) ≤ h(sstart,s') + c(s',s) for all vertices s ∈ S and s' ∈ Pred(s).

[Koenig, Likhachev, 2002]


Freespace Assumption - D* Lite (Basic Version)

Idea [Stentz, 1994]: When the robot moves, the goal of the search (sstart) moves. This influences the priorities of the vertices in the priority queue (but not which vertices are in the priority queue): vertex s is locally inconsistent iff vertex s is in the priority queue with priority [min(g(s),rhs(s))+h(soldstart,s); min(g(s),rhs(s))]. This value changes to [min(g(s),rhs(s))+h(snewstart,s); min(g(s),rhs(s))] when the robot moves from soldstart to snewstart. Thus, one needs to reorder the priority queue.


Freespace Assumption - D* Lite (Basic Version) - Fictitious Example

priority queue before the move: A: [8;5]; B: [8;6]; C: [8;7]
priority queue after reordering: C: [7;7]; B: [8;6]; A: [9;5]


Freespace Assumption - D* Lite (Final Version)

procedure CalculateKey(s)
  return [min(g(s), rhs(s)) + h(sstart, s) + km; min(g(s), rhs(s))];

procedure Initialize()
  U = Ø; km = 0;
  for all s ∈ S: rhs(s) = g(s) = ∞;
  rhs(sgoal) = 0;
  U.Insert(sgoal, CalculateKey(sgoal));

procedure UpdateVertex(u)
  if (u ≠ sgoal) rhs(u) = min_{s' ∈ Succ(u)} (c(u,s') + g(s'));
  if (u ∈ U) U.Remove(u);
  if (g(u) ≠ rhs(u)) U.Insert(u, CalculateKey(u));

procedure ComputeShortestPath()
  while (U.TopKey() < CalculateKey(sstart) OR rhs(sstart) ≠ g(sstart))
    kold = U.TopKey();
    u = U.Pop();
    if (kold < CalculateKey(u))
      U.Insert(u, CalculateKey(u));
    else if (g(u) > rhs(u))
      g(u) = rhs(u);
      for all s ∈ Pred(u): UpdateVertex(s);
    else
      g(u) = ∞;
      for all s ∈ Pred(u) ∪ {u}: UpdateVertex(s);

procedure Main()
  slast = sstart;
  Initialize();
  ComputeShortestPath();
  while (sstart ≠ sgoal)
    /* if (g(sstart) = ∞) then there is no known path */
    sstart = argmin_{s' ∈ Succ(sstart)} (c(sstart,s') + g(s'));
    Move to sstart;
    Scan graph for changed edge costs;
    if any edge costs changed
      km = km + h(slast,sstart);
      slast = sstart;
      for all directed edges (u,v) with changed edge costs
        Update the edge cost c(u,v);
        UpdateVertex(u);
      ComputeShortestPath();

The heuristics need to be nonnegative and forward-backward consistent: h(s,s'') ≤ h(s,s') + h(s',s'') for all vertices s,s',s'' ∈ S (a triangle inequality). The heuristics also need to be admissible no matter what the goal vertex is: h(s,s') ≤ shortest distance from s to s' for all vertices s,s' ∈ S.

[Koenig, Likhachev, 2002]

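On a finite state set, the triangle-inequality requirement can be checked by brute force; a sketch (function name ours):

```python
def forward_backward_consistent(h, states):
    """Check h(s, s'') <= h(s, s') + h(s', s'') for all triples of states,
    i.e. the triangle inequality required of D* Lite's heuristics."""
    return all(h(s, s2) <= h(s, s1) + h(s1, s2)
               for s in states for s1 in states for s2 in states)
```

Manhattan and straight-line distances, for instance, satisfy this by construction; the check is mainly useful for hand-built heuristic tables.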

Freespace Assumption - D* Lite (Final Version)

Idea [Stentz, 1995]: Reordering the priority queue is time consuming. Vertex s is locally inconsistent iff vertex s is in the priority queue with priority [min(g(s),rhs(s))+h(soldstart,s); min(g(s),rhs(s))]. We use lower bounds on the new priorities instead of the new priorities themselves:

[min(g(s),rhs(s))+h(soldstart,s); min(g(s),rhs(s))] ≤ [min(g(s),rhs(s))+h(soldstart,snewstart)+h(snewstart,s); min(g(s),rhs(s))]
[min(g(s),rhs(s))+h(soldstart,s)-h(soldstart,snewstart); min(g(s),rhs(s))] ≤ [min(g(s),rhs(s))+h(snewstart,s); min(g(s),rhs(s))]

The term h(soldstart,snewstart) is the same across vertices in the priority queue. Instead of deleting it from all vertices in the priority queue, we add it to the vertices added to the priority queue in the future. When ComputeShortestPath() selects a vertex for expansion, it checks first whether its priority is correct. If so, it expands the vertex. If it is a lower bound, it calculates the correct priority and reinserts the vertex into the queue.
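The bookkeeping behind this idea is tiny; a sketch (names ours) of the D* Lite key with the km offset, together with the lower-bound check done on a popped vertex:

```python
def calculate_key(g, rhs, h_from_start, km):
    """D* Lite priority [min(g,rhs) + h(s_start, s) + km; min(g,rhs)].
    km accumulates h(s_last, s_start) over robot moves, so priorities of
    vertices already in the queue become valid lower bounds instead of
    having to be recomputed and reordered."""
    m = min(g, rhs)
    return (m + h_from_start + km, m)

def needs_reinsert(popped_key, current_key):
    """If the popped priority is only a lower bound (strictly smaller than
    the vertex's current key), reinsert with the correct key instead of
    expanding the vertex."""
    return popped_key < current_key
```

This is why ComputeShortestPath() in the final version compares kold against CalculateKey(u) before doing anything else with u.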


Freespace Assumption - D* Lite (Final Version) - Fictitious Example

  • priority queue A: [8;5]; B: [8;6]; C: [8;7]; add vertex D with priority [10;5]
  • priority queue A: [6;5]; B: [6;6]; C: [6;7]; add vertex D with priority [10;5]
  • priority queue A: [8;5]; B: [8;6]; C: [8;7]; add vertex D with priority [12;5]
  • priority queue A: [8;5]; B: [8;6]; C: [8;7] (correct priority is A: [9;5])
  • priority queue B: [8;6]; C: [8;7]; A: [9;5] (correct priority is B: [8;6]; expand B)


Freespace Assumption - D* Lite

knowledge before the movement sequence of the robot

[figure: gridworld annotated with the g-values known before the movement sequence; sstart and sgoal marked; we assume here that the robot can move in eight directions]

Freespace Assumption - D* Lite

knowledge after the movement sequence of the robot

[figure: gridworld annotated with the g-values known after the movement sequence; sstart and sgoal marked; we assume here that the robot can move in eight directions]

Freespace Assumption - D* Lite

[figure: vertices expanded by uninformed vs. heuristic and complete vs. incremental search (D* Lite / Lifelong Planning A*) before the movement sequence of the robot, from sstart to sgoal]


Freespace Assumption - D* Lite

[figure: vertices expanded by uninformed vs. heuristic and complete vs. incremental search (D* Lite / Lifelong Planning A*) after the movement sequence of the robot, from sstart to sgoal]


Freespace Assumption - D* Lite

[chart: ve, va, and hp overhead of A = D* Lite without incremental search (A*) and B = D* Lite without heuristic search]


Freespace Assumption - D* Lite

[chart: ve, va, and hp overhead of Focussed D* [Stentz, 1995]]

probably the first truly incremental heuristic search method (note: Focussed D* is likely a bit faster than D* Lite per vertex expansion)


Greedy Mapping - Implementation

[figure: gridworld snapshots with each cell's distance to the closest unobserved cell]

Greedy Mapping always moves the robot on a shortest path to the closest unobserved (or unvisited) cell.

we assume here that the robot can move in eight directions
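The target-selection step can be sketched as a breadth-first search from the robot's cell to the closest unobserved cell (names ours; eight-connected neighborhood, matching the slides' assumption):

```python
from collections import deque

def closest_unobserved(free, unobserved, start):
    """Return (cell, distance) of the closest unobserved cell reachable
    from start through cells known to be free or still unobserved;
    None if no unobserved cell is reachable."""
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        cell, d = queue.popleft()
        if cell in unobserved:
            return cell, d
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                n = (cell[0] + dx, cell[1] + dy)
                if ((dx, dy) != (0, 0) and n not in seen
                        and (n in free or n in unobserved)):
                    seen.add(n)
                    queue.append((n, d + 1))
    return None
```

When this returns None, the map is complete; otherwise the robot moves one step along a shortest path to the returned cell, which is exactly the step D* Lite recomputes incrementally.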

Transforming Greedy Mapping to Planning with the Freespace Assumption

[figure: the robot R in a partially known gridworld; a new vertex is added and serves as the goal vertex]


knowledge before the movement sequence of the robot

Greedy Mapping - D* Lite

[figure: gridworld annotated with the g-values known before the movement sequence (∞ for some cells); sstart marked; we assume here that the robot can move in eight directions]

knowledge after the movement sequence of the robot

Greedy Mapping - D* Lite

[figure: gridworld annotated with the g-values known after the movement sequence (∞ for some cells); sstart marked; we assume here that the robot can move in eight directions]

Greedy Mapping - D* Lite

[figure: vertices expanded by uninformed vs. heuristic and complete vs. incremental search (D* Lite / Lifelong Planning A*) before the movement sequence of the robot; sstart marked]


Greedy Mapping - D* Lite

[figure: vertices expanded by uninformed vs. heuristic and complete vs. incremental search (D* Lite / Lifelong Planning A*) after the movement sequence of the robot; sstart marked]


Greedy Mapping - D* Lite

[chart: ve, va, and hp overhead of A = D* Lite without incremental search (A*) and B = D* Lite without heuristic search]


Greedy Mapping - D* Lite

[chart: ve, va, and hp overhead of Focussed D* [Stentz, 1995]]

probably the first truly incremental heuristic search method (note: Focussed D* is likely a bit faster than D* Lite per vertex expansion)


Other Examples of Lifelong Planning

  • the world changes over time
  • the model of the world changes over time
  • what-if analyses

example: emergency management

planning task 1, slightly different planning task 2, slightly different planning task 3, ...

replanning (and plan reuse) is important!


Other Examples of Lifelong Planning

  • symbolic planning (with HSP)
  • continual planning
  • one-time planning
  • mobile robotics
  • mapping
  • goal-directed navigation in unknown terrain
  • computer games
  • reinforcement learning and on-line dynamic programming
  • control (with the Parti-Game algorithm)
  • route planning
  • in traffic networks
  • in computer networks

Game Playing

Total Annihilation

  • plan adaptation
  • repair-based planning
  • learning search control knowledge
  • case-based planning
  • lifelong planning
  • transformational planning
  • iterative repair methods in scheduling

CHEF, GORDIUS, LS-ADJUST-PLAN, MRL, NoLimit, PLEXUS, PRIAR, SPA, ...: plan quality of replanning is usually worse than plan quality of planning from scratch. SHERPA: plan quality of replanning is as good as plan quality of planning from scratch.

Symbolic Planning (with HSP) - Continual Planning


STRIPS-type planning in the elevator domain. Operators:

  • The elevator moves from floor fi to floor fj with i ≠ j.
  • Person pk boards the elevator on floor fi, provided that the elevator is currently on floor fi and floor fi is the origin of person pk.
  • Person pk gets off the elevator on floor fi, provided that person pk is in the elevator, the elevator is currently on floor fi, and floor fi is the destination of person pk.

Symbolic Planning (with HSP) - Continual Planning


planning problem 1, planning problem 2, planning problem 3, ...

SHERPA: Speedy HEuristic search-based RePlAnner

[S. Koenig, D. Furcy, C. Bauer, 2002]

Symbolic Planning (with HSP) - Continual Planning

note: in the following, we consider only finding shortest plans


first search in the elevator domain, similar to HSP 2.0 with the hmax heuristic [Bonet, Geffner, 2001], using SHERPA

[figure: search tree from start to goal; vertices annotated with g-, h-, and rhs-values; the order of vertex expansion (1-10) is marked; legend: ground operator deleted after the search, generated vertex, expanded vertex]

Symbolic Planning (with HSP) - Continual Planning


second search in the elevator domain from scratch, similar to HSP 2.0 with the hmax heuristic [Bonet, Geffner, 2001]

[figure: search tree with g-, h-, and rhs-values; the order of vertex expansion (1-12) is marked; legend: generated vertex, expanded vertex]

Symbolic Planning (with HSP) - Continual Planning


second search in the elevator domain using SHERPA

[figure: search tree with g-, h-, and rhs-values; the order of vertex expansion (1-8) is marked; legend: untouched vertex, generated vertex, expanded vertex]

Symbolic Planning (with HSP) - Continual Planning

[chart: savings percentage in vertex expansions (ve) for the elevator domain (5 floors) vs. number of people, planning from scratch vs. planning with SHERPA]

SHERPA achieves speedups of up to 80 percent.

Symbolic Planning (with HSP) - Continual Planning


Symbolic Planning (with HSP) - Continual Planning

[figure: old and new search trees from start to goal]

[chart: savings percentage in ve for blocksworld vs. number of edges between the deleted edge in the plan and the goal state; the speedup becomes negative]

Symbolic Planning (with HSP) - Continual Planning

[chart: savings percentage in ve for blocksworld vs. number of deleted ground operators]

Symbolic Planning (with HSP) - Continual Planning


planning problem 1, planning problem 2, planning problem 3, ...

PINCH: Prioritized, INCremental Heuristics calculation

[Liu, Koenig, Furcy, 2002]

each planning problem requires tens of thousands of calculations of heuristic values (calculate heuristic value 1, 2, ..., n)

Symbolic Planning (with HSP) - One-Time Planning


PINCH: Prioritized, INCremental Heuristics calculation

here: for HSP 2.0 with the hadd heuristic [Bonet, Geffner, 2001]

hadd(state) = Σ_{proposition in goal state} gstate(proposition)

gstate(proposition) = 0 if proposition ∈ state, and min_{operator with proposition in add list} (1 + gstate(operator)) otherwise

gstate(operator) = Σ_{proposition on precondition list of operator} gstate(proposition)

[figure: order of state expansions, with g-, h-, and rhs-values]
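The hadd equations can be evaluated by iterating to a fixed point, which is the value-iteration style HSP 2.0 uses (this sketch uses our names, with each operator given as a pair (preconditions, add_list); PINCH's contribution is making this computation incremental):

```python
def h_add(state, goal, operators):
    """hadd: g(p) = 0 if p holds in state, otherwise the minimum over
    operators adding p of 1 + the sum of g over the preconditions;
    hadd(state) is the sum of g over the goal propositions."""
    INF = float('inf')
    props = set(state) | set(goal)
    for pre, add in operators:
        props |= set(pre) | set(add)
    g = {p: (0 if p in state else INF) for p in props}
    changed = True
    while changed:                    # value iteration to a fixed point
        changed = False
        for pre, add in operators:
            cost = sum(g[p] for p in pre)   # gstate(operator)
            if cost == INF:
                continue
            for p in add:
                if 1 + cost < g[p]:
                    g[p] = 1 + cost
                    changed = True
    return sum(g[p] for p in goal)
```

Because this fixed point is recomputed for every state HSP expands, small changes between consecutive computations are exactly what an incremental method can exploit.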

Symbolic Planning (with HSP) - One-Time Planning


PINCH achieves speedups up to (another!) 80 percent.

[chart: savings percentage vs. problem size for the method used by HSP 2.0 (value iteration), generalized Bellman-Ford, and PINCH; up to 80%]

Symbolic Planning (with HSP) - One-Time Planning


Reinforcement Learning and On-Line DP

Both methods fit the same schema: while there exists at least one state with g(s) ≠ rhs(s), pick a state s with g(s) ≠ rhs(s) and then set g(s) := rhs(s).

Prioritized Sweeping [Moore and Atkeson; 1993]:
  • chooses the g-value of which state to update
  • updates the g-value of the chosen state in a particular way
  • minimizes the expected or worst-case plan-execution cost for MDPs

Minimax LPA*:
  • chooses the g-value of which state to update
  • updates the g-value of the chosen state in a particular way
  • minimizes the worst-case plan-execution cost for MDPs
  • uses heuristics to focus the search
  • terminates immediately once a shortest path is found


Reinforcement Learning and On-Line DP

Prioritized Sweeping [Moore and Atkeson; 1993]

[figure: six snapshots of a five-state example (s1-s5, edge costs 10, 2, 1, 1, 1) showing the g-values and priorities ∆ as Prioritized Sweeping propagates an edge cost change]

and so on, for a total of 22 g-value updates. Minimax LPA* needs only 6. Note: Minimax LPA* expands every state at most twice.


Control (with the Parti-Game algorithm)

state spaces of control problems are often continuous and sometimes high-dimensional

  • coarse-grained discretization: might not be able to find a plan
  • fine-grained discretization: is very inefficient


Control (with the Parti-Game algorithm)

the Parti-Game algorithm [Moore and Atkeson; 1995] avoids these problems with a nonuniform discretization


Control (with the Parti-Game algorithm)

[figure: a sequence of increasingly refined nonuniform discretizations around the goal]

here: using a deterministic state space for illustration


Control (with the Parti-Game algorithm)

[figure: the same action from cell A leads to different outcomes (O1, O2, O3) at times t0, t1, t2]

the state space is really nondeterministic; we thus use Minimax LPA* instead of LPA*


Control (with the Parti-Game algorithm)

Implementation / Planning Time (terrains of size 2000 x 2000): Uninformed Search from Scratch; Informed Search from Scratch; Uninformed Incremental Search; Informed Incremental Search (Minimax LPA*): 362 minutes, 135 minutes, 14 minutes, 13 minutes, 55 seconds, 15 seconds, 53 seconds, 53 seconds


References (in order of their appearance)

  • S. Koenig, Agent-Centered Search, Artificial Intelligence Magazine, 22(4), 2001, 109-131.
  • I. Nourbakhsh, Interleaving Planning and Execution for Autonomous Robots, Kluwer, 1997.
  • H. Choset, J. Burdick, Sensor-based planning and nonsmooth analysis, Proceedings of the International Conference on Robotics and Automation, 1994, 3034-3041.
  • C. Tovey, S. Koenig, Gridworlds as Testbeds for Planning with Incomplete Information, Proceedings of the National Conference on Artificial Intelligence, 2000, 819-824.
  • G. Dudek, K. Romanik, S. Whitesides, Localizing a robot with minimum travel, Proceedings of the ACM-SIAM Symposium on Discrete Algorithms, 1995, 437-446.
  • C. Lund, M. Yannakakis, On the hardness of approximating minimization problems, Journal of the ACM, 41, 1994, 960-981.
  • M. Genesereth, I. Nourbakhsh, Time-saving tips for problem solving with incomplete information, Proceedings of the National Conference on Artificial Intelligence, 1993, 724-730.
  • S. Koenig, R. Simmons, Solving robot navigation problems with initial pose uncertainty using real-time heuristic search, Proceedings of the International Conference on Artificial Intelligence Planning Systems, 1998, 145-153.
  • R. Simmons, S. Koenig, Probabilistic Robot Navigation in Partially Observable Environments, Proceedings of the International Joint Conference on Artificial Intelligence, 1993, 99-105.
  • S. Koenig, R. Simmons, Xavier: A Robot Navigation Architecture Based on Partially Observable Markov Decision Process Models, in: Artificial Intelligence Based Mobile Robots: Case Studies of Successful Robot Systems, D. Kortenkamp, R. Bonasso, R. Murphy (Eds.), MIT Press, 1998.
  • S. Thrun, Probabilistic Algorithms in Robotics, Artificial Intelligence Magazine, 21(4), 2000, 93-109.
  • W. Burgard, D. Fox, S. Thrun, Active Mobile Robot Localization, Proceedings of the International Joint Conference on Artificial Intelligence, 1997.
  • R. Schapire, The Design and Analysis of Efficient Learning Algorithms, MIT Press, 1992.
  • C. Papadimitriou, J. Tsitsiklis, The complexity of Markov decision processes, Mathematics of Operations Research, 12(3), 1987, 441-450.


  • S. Koenig, C. Tovey, W. Halliburton, Greedy Mapping of Terrain, Proceedings of the International Conference on Robotics and Automation, 2001, 3594-3599.
  • S. Thrun, A. Buecken, W. Burgard, D. Fox, T. Froehlinghaus, D. Hennig, T. Hofmann, M. Krell, T. Schmidt, Map learning and high-speed navigation in RHINO, in: Artificial Intelligence Based Mobile Robotics: Case Studies of Successful Robot Systems, D. Kortenkamp, R. Bonasso, R. Murphy (Eds.), MIT Press, 1998, 21-52.
  • L. Romero, E. Morales, E. Sucar, An exploration and navigation approach for indoor mobile robots considering sensor's perceptual limitations, Proceedings of the International Conference on Robotics and Automation, 2001, 3092-3097.
  • D. Mackenzie, R. Arkin, J. Cameron, Multiagent mission specification and execution, Autonomous Robots, 4(1), 1997, 29-57.
  • S. Koenig, C. Tovey, Y. Smirnov, Performance Bounds for Planning in Unknown Terrain, 2001.
  • B. Brumitt, A. Stentz, GRAMMPS: a generalized mission planner for multiple mobile robots, Proceedings of the International Conference on Robotics and Automation, 1998.
  • M. Hebert, R. McLachlan, P. Chang, Experiments with driving modes for urban robots, Proceedings of the SPIE Mobile Robots, 1999.
  • L. Matthies, Y. Xiong, R. Hogg, D. Zhu, A. Rankin, B. Kennedy, M. Hebert, R. Maclachlan, C. Won, T. Frost, G. Sukhatme, M. McHenry, S. Goldberg, A portable, autonomous, urban reconnaissance robot, Proceedings of the International Conference on Intelligent Autonomous Systems, 2000.
  • A. Stentz, M. Hebert, A complete navigation system for goal acquisition in unknown environments, Autonomous Robots, 2(2), 1995, 127-145.
  • S. Thayer, B. Digney, M. Diaz, A. Stentz, B. Nabbe, M. Hebert, Distributed robotic mapping of extreme environments, Proceedings of the SPIE: Mobile Robots XV and Telemanipulator and Telepresence Technologies VII, Vol. 4195, 2000.
  • P. Hart, N. Nilsson, B. Raphael, A Formal Basis for the Heuristic Determination of Minimum Cost Paths in Graphs, IEEE Transactions on Systems Science and Cybernetics, SSC-4(2), 1968, 100-107.
  • G. Ramalingam, T. Reps, On the computational complexity of dynamic graph problems, Theoretical Computer Science, 158(1-2), 1996, 233-277.


  • S. Koenig, M. Likhachev, Incremental A*, Advances in Neural Information Processing Systems, 2001.
  • M. Likhachev, S. Koenig, Lifelong Planning A* and Dynamic A* Lite: The Proofs, 2001.
  • B. Nebel, J. Koehler, Plan reuse versus plan generation: A theoretical and empirical analysis, Artificial Intelligence, 76(1-2), 1995, 427-454.
  • A. Stentz, Optimal and Efficient Path Planning for Partially-Known Environments, Proceedings of the International Conference on Robotics and Automation, 1994, 3310-3317.
  • S. Koenig, M. Likhachev, D* Lite, Proceedings of the National Conference on Artificial Intelligence, 2002.
  • A. Stentz, The focussed D* algorithm for real-time replanning, Proceedings of the International Joint Conference on Artificial Intelligence, 1995, 1652-1659.
  • S. Koenig, D. Furcy, C. Bauer, Heuristic Search-Based Replanning, Proceedings of the International Conference on Artificial Intelligence Planning Systems, 2002.
  • B. Bonet, H. Geffner, Heuristic Search Planner 2.0, Artificial Intelligence Magazine, 22(3), 2001, 77-80.
  • Y. Liu, S. Koenig, D. Furcy, Speeding up the calculation of the heuristics for heuristic search-based planning, Proceedings of the National Conference on Artificial Intelligence, 2002.



Greedy On-Line Planning and Lifelong Planning

Related Work:

  • K. Hammond, Explaining and repairing plans that fail, Artificial Intelligence, 45, 1990, 173-228.
  • R. Simmons, A theory of debugging plans and interpretations, Proceedings of the National Conference on Artificial Intelligence, 1988, 94-99.
  • A. Gerevini, I. Serina, Fast plan adaptation through planning graphs: Local and systematic search techniques, Proceedings of the International Conference on Artificial Intelligence Planning and Scheduling, 2000, 112-121.
  • J. Koehler, Flexible plan reuse in a formal framework, in: C. Baeckstroem, E. Sandewall (Eds.), Current Trends in AI Planning, IOS Press, 1994, 171-184.
  • M. Veloso, Planning and Learning by Analogical Reasoning, Springer, 1994.
  • R. Alterman, Adaptive Planning, Cognitive Science, 12(3), 1988, 393-421.
  • S. Kambhampati, J. Hendler, A validation-structure-based theory of plan modification and reuse, Artificial Intelligence, 55, 1992, 193-258.
  • S. Edelkamp, Updating Shortest Paths, Proceedings of the European Conference on Artificial Intelligence, 1998, 655-659.
  • S. Hanks, D. Weld, A domain-independent algorithm for plan adaptation, Journal of Artificial Intelligence Research, 2, 1995, 319-360.

... and many more

Artificial Intelligence


Greedy On-Line Planning and Lifelong Planning

Related Work:

  • G. Ausiello, G. Italiano, A. Marchetti-Spaccamela, U. Nanni, Incremental algorithms for minimal length paths, Journal of Algorithms, 12(4), 1991, 615-638.
  • S. Even, H. Gazit, Updating distances in dynamic graphs, Methods of Operations Research, 49, 1985, 371-387.
  • E. Feuerstein, A. Marchetti-Spaccamela, Dynamic algorithms for shortest paths in planar graphs, Theoretical Computer Science, 116(2), 1993, 359-371.
  • P. Franciosa, D. Frigioni, R. Giaccio, Semi-dynamic breadth-first search in digraphs, Theoretical Computer Science, 250(1-2), 2001, 201-217.
  • D. Frigioni, A. Marchetti-Spaccamela, U. Nanni, Fully dynamic output bounded single source shortest path problem, Proceedings of the Symposium on Discrete Algorithms, 1996, 212-221.
  • S. Goto, A. Sangiovanni-Vincentelli, A new shortest path updating algorithm, Networks, 8(4), 1978, 341-372.
  • G. Italiano, Finding paths and deleting edges in directed acyclic graphs, Information Processing Letters, 28(1), 1988, 5-11.
  • P. Klein, S. Subramanian, Fully dynamic approximation schemes for shortest path problems in planar graphs, Proceedings of the International Workshop on Algorithms and Data Structures, 1993, 443-451.
  • C. Lin, R. Chang, On the dynamic shortest path problem, Journal of Information Processing, 13(4), 1990, 470-476.
  • H. Rohnert, A dynamization of the all pairs least cost path problem, Proceedings of the Symposium on Theoretical Aspects of Computer Science, 1985, 279-286.
  • P. Spira, A. Pan, On finding and updating spanning trees and shortest paths, SIAM Journal on Computing, 4, 1975, 375-380.

... and many more

Algorithm Theory


Greedy On-Line Planning and Lifelong Planning

Related Work:

  • V. Lumelsky, A. Stepanov, Path planning strategies for a point mobile automaton moving amidst unknown obstacles of arbitrary shape, Algorithmica, 2, 1987, 403-430.
  • M. Barbehenn, S. Hutchinson, Efficient search and hierarchical motion planning by dynamically maintaining single-source shortest paths trees, IEEE Transactions on Robotics and Automation, 11(2), 1995, 198-214.
  • T. Ersson, X. Hu, Path planning and navigation of mobile robots in unknown environments, Proceedings of the International Conference on Intelligent Robots and Systems, 2001.
  • Y. Huiming, C. Chia-Jung, S. Tong, B. Qiang, Hybrid evolutionary motion planning using follow boundary repair for mobile robots, Journal of Systems Architecture, 47(7), 2001, 635-647.
  • L. Podsedkowski, J. Nowakowski, M. Idzikowski, I. Vizvary, A new solution for path planning in partially known or unknown environments for nonholonomic mobile robots, Robotics and Autonomous Systems, 34, 2001, 145-152.
  • M. Tao, A. Elssamadisy, N. Flann, B. Abbott, Optimal route re-planning for mobile robots: A massively parallel incremental A* algorithm, Proceedings of the International Conference on Robotics and Automation, 1997, 2727-2732.
  • K. Trovato, Differential A*: An adaptive search method illustrated with robot path planning for moving obstacles and goals, and an uncertain environment, Journal of Pattern Recognition and Artificial Intelligence, 4(2), 1990.

... and many more

Robotics


Greedy On-Line Planning and Lifelong Planning Theoretical Results

Related Work:

  • S. Carlsson, H. Jonsson, Computing a shortest watchman path in a simple polygon in polynomial time, in: S. Akl, F. Dehne, J. Sack, N. Santoro (Eds.), Proceedings of the Workshop on Algorithms and Data Structures, Vol. 955 of Lecture Notes in Computer Science, Springer, 1995, 122-134.
  • X. Tan, T. Hirata, Constructing shortest watchman routes by divide-and-conquer, in: K. Ng, P. Raghavan, N. Balasubramanian, F. Chin (Eds.), Proceedings of the International Symposium on Algorithms and Computation, Vol. 762 of Lecture Notes in Computer Science, Springer, 1993, 68-77.
  • S. Ntafos, Watchman routes under limited visibility, Proceedings of the Canadian Conference on Computational Geometry, 1990, 89-92.
  • X. Deng, T. Kameda, C. Papadimitriou, How to learn an unknown environment I: the rectilinear case, Journal of the ACM, 45(2), 1998, 215-245.
  • F. Hoffman, C. Icking, R. Klein, K. Kriegel, A competitive strategy for learning a polygon, Proceedings of the Symposium on Discrete Algorithms, 1997, 166-174.
  • V. Lumelsky, Algorithmic and complexity issues of robot motion in an uncertain environment, Journal of Complexity, 3, 1987, 146-182.
  • A. Blum, P. Raghavan, B. Schieber, Navigating in unfamiliar geometric terrain, SIAM Journal on Computing, 26(1), 1997, 110-137.
  • C. Icking, R. Klein, E. Langetepe, An optimal competitive strategy for walking in streets, in: C. Meinel, S. Tison (Eds.), Proceedings of the Symposium on Theoretical Aspects of Computer Science, Vol. 1563 of Lecture Notes in Computer Science, Springer, 1999, 110-120.
  • X. Deng, C. Papadimitriou, Exploring an unknown graph, Proceedings of the Symposium on Foundations of Computer Science, 1990, 355-361.
  • S. Albers, M. Henzinger, Exploring unknown environments, Proceedings of the Symposium on Theory of Computing, 1997, 416-425.

... and many more



Lifelong Planning Techniques - Our Work

Please see http://www.cc.gatech.edu/fac/Sven.Koenig/greedyonline

We gratefully acknowledge funding from NSF and IBM.

The views and conclusions contained in this material are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the sponsoring organizations and agencies or the U.S. government.