Planning: AI Slides (5e), © Lin Zuoquan@PKU 2003-2019, Chapter 8




SLIDE 1

Planning

8

SLIDE 2

8 Planning
8.1 The planning problem
8.2 STRIPS operators
8.3 PDDL (Planning Domain Definition Language)
8.4 Situation calculus
8.5 Partial-order planning
8.6 Conditional planning∗
8.7 Replanning∗

SLIDE 3

The planning problem

function Simple-Planning-Agent(percept) returns an action
  persistent: KB, a knowledge base (includes action descriptions)
              p, a plan, initially NoPlan
              t, a counter, initially 0, indicating time
  local variables: G, a goal
                   current, a current state description

  Tell(KB, Make-Percept-Sentence(percept, t))
  current ← State-Description(KB, t)
  if p = NoPlan then
     G ← Ask(KB, Make-Goal-Query(t))
     p ← Ideal-Planner(current, G, KB)
  if p = NoPlan or p is empty then action ← NoOp
  else
     action ← First(p)
     p ← Rest(p)
  Tell(KB, Make-Action-Sentence(action, t))
  t ← t + 1
  return action

SLIDE 4

Search vs. planning

Consider the task: get milk, bananas, and a cordless drill. Standard search algorithms seem to fail miserably:

[Figure: search tree from Start — every action is applicable (Go To Pet Store, Buy a Dog, Talk to Parrot, Go To School, Go To Supermarket, Buy Tuna Fish, Buy Milk, Sit in Chair, Go To Sleep, Read A Book, etc.), so the branching factor is enormous and almost no branch leads toward Finish]

After-the-fact heuristic/goal test inadequate

SLIDE 5

State space vs. plan space

Standard search: node = concrete world state
Planning search: node = partial plan
Open condition: a precondition of a step not yet fulfilled
Operators (actions) on partial plans:
  add a link from an existing action to an open condition
  add a step to fulfill an open condition
  order one step with respect to another
Gradually move from incomplete/vague plans to complete, correct plans

SLIDE 6

Search vs. planning

Planning systems do the following:
1) open up action and goal representation to allow selection
2) divide-and-conquer by subgoaling
3) relax requirement for sequential construction of solutions

            Search                  Planning
States      Lisp data structures    Logical sentences
Actions     Lisp code               Preconditions/outcomes
Goal        Lisp code               Logical sentence (conjunction)
Plan        Sequence from S0        Constraints on actions

SLIDE 7

Planning as state space search

Planning as a search problem: search from the initial state through the space of states, looking for a goal
Algorithms for planning:
– Progression: forward state-space search
– Regression: backward relevant-space search
Heuristics for planning: need to find good domain-specific heuristics for planning problems
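Progression can be sketched as ordinary breadth-first search over sets of ground fluents. This is a minimal illustration, not the slides' own code; the mini shopping domain (action names, fluents) is made up for the example:

```python
from collections import deque

# Hypothetical mini-domain: each action maps to (precondition, add list, delete list),
# and a state is a frozenset of ground fluents.
ACTIONS = {
    "Go(SM)":    ({"At(Home)"}, {"At(SM)"}, {"At(Home)"}),
    "Buy(Milk)": ({"At(SM)", "Sells(SM,Milk)"}, {"Have(Milk)"}, set()),
}

def progression_search(init, goal):
    """Breadth-first forward (progression) search from the initial state."""
    frontier = deque([(frozenset(init), [])])
    seen = {frozenset(init)}
    while frontier:
        state, plan = frontier.popleft()
        if goal <= state:                      # all goal literals satisfied
            return plan
        for name, (pre, add, delete) in ACTIONS.items():
            if pre <= state:                   # action applicable in this state
                succ = frozenset((state - delete) | add)
                if succ not in seen:
                    seen.add(succ)
                    frontier.append((succ, plan + [name]))
    return None

plan = progression_search({"At(Home)", "Sells(SM,Milk)"}, {"Have(Milk)"})
print(plan)  # ['Go(SM)', 'Buy(Milk)']
```

Regression would run the same loop backwards over goal sets; only the successor computation changes.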

SLIDE 8

STRIPS operators

Restricted planning language with tidily arranged action descriptions

Action: Buy(x)
  Precondition: At(p), Sells(p, x)
  Effect: Have(x)

[Figure: the Buy(x) action as a box, with preconditions At(p) and Sells(p,x) entering and effect Have(x) leaving]

Note: this abstracts away many important details
Restricted language ⇒ efficient algorithm
  Precondition: conjunction of positive literals
  Effect: conjunction of literals
Hint: A complete set of STRIPS operators can be translated into a set of successor-state axioms (see later)

SLIDE 9

PDDL(Planning Domain Definition Language)

PDDL extends STRIPS by allowing preconditions and goals to contain negative literals
State: a conjunction of fluents that are ground, functionless atoms
Actions: a set of action schemas, each standing for a set of ground actions
E.g., Action: Fly(p, from, to)
  Precond: At(p, from) ∧ Plane(p) ∧ Airport(from) ∧ Airport(to)
  Effect: ¬At(p, from) ∧ At(p, to)
The precondition and effect are conjunctions of literals that may contain variables
The action schema lifts the level of reasoning from propositional logic to a restricted subset of first-order logic

SLIDE 10

PDDL

The result of executing action a in state s is a state s′
  Del(a): delete the fluents that appear as negative literals in the action's effects
  Add(a): add the fluents that appear as positive literals in the action's effects
  Result(s, a) = (s − Del(a)) ∪ Add(a)
A planning domain is defined by a set of action schemas
– A planning problem within the domain: an initial state and a goal (conjunctions of literals)
By propositionalizing the action schemas we can use a propositional solver (say SATPlan) to find a solution
The complexity of planning decision problems is in the class PSPACE
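The set equation Result(s, a) = (s − Del(a)) ∪ Add(a) maps directly onto set operations. A minimal sketch, using a hand-grounded instance of the Fly schema (the plane and airport names are illustrative, not from a real PDDL file):

```python
def result(state, add, delete):
    """Result(s, a) = (s - Del(a)) ∪ Add(a) over sets of ground fluents."""
    return (state - delete) | add

s = frozenset({"At(P1,SFO)", "Plane(P1)", "Airport(SFO)", "Airport(JFK)"})
# Ground action Fly(P1, SFO, JFK): Del = {At(P1,SFO)}, Add = {At(P1,JFK)}
s2 = result(s, add={"At(P1,JFK)"}, delete={"At(P1,SFO)"})
print(sorted(s2))
```

Note that fluents not mentioned in the effect (Plane, Airport) persist unchanged, which is exactly the STRIPS/PDDL answer to the frame problem at the propositional level.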

SLIDE 11

Situation Calculus

Situation calculus is a dialect of FOL to represent change by actions
– situations: add a situation argument to non-eternal predicates
  E.g., now in Holding(gold, now) denotes a situation
– actions: e.g., pickup(r, x): robot r picks up object x

[Figure: wumpus-world grids with pits and gold, showing situation S0 and, after action Forward, situation S1]

SLIDE 12

The language: situations

Let L be a first-order language with especially
– constant: an initial situation S0
– terms: situations s1, s2, · · ·
– function symbol: do
– predicate symbol: Poss
Facts hold in situations, rather than eternally
  e.g., Holding(gold, now) rather than just Holding(gold)
Situations denote possible world histories
  a distinguished constant S0 and function symbol do are used
– S0: the initial situation, before any actions have been performed
– do(a, s) (or result(a, s)): the situation that results from doing action a in situation s
E.g., do(put(A, B), do(put(B, C), S0))
  the situation that results from putting A on B after putting B on C in the initial situation

SLIDE 13

The language: fluents

A function or predicate (relation) that can vary from one situation to the next is a fluent
– written as a function/predicate whose last argument is a situation
E.g., Holding(r, x, s): robot r is holding object x in situation s
A distinguished predicate symbol Poss(a, s) is used to state that a may be performed in s
E.g., Poss(pickup(r, x), S0)
  it is possible for the robot r to pick up object x in the initial situation

SLIDE 14

Actions

It is necessary to include in a KB not only facts about the initial situation, but also about world dynamics
Actions typically have preconditions: what needs to be true for the action to be performed
E.g., Poss(pickup(r, x), s) ⇔ ∀z.¬Holding(r, z, s) ∧ ¬Heavy(x) ∧ NextTo(r, x, s)
  a robot can pick up an object iff it is not holding anything, the object is not too heavy, and the robot is next to the object
Actions typically have effects: the fluents that change as the result of performing the action
E.g., Fragile(x) ⇒ Broken(x, do(drop(r, x), s))
  dropping a fragile object causes it to break
∀s AtGold(s) ⇒ Holding(Gold, do(grab, s))
These are called effect axioms

SLIDE 15

The frame problem

What fluents are unaffected (non-changes) by performing an action?
∀s HaveArrow(s) ⇒ HaveArrow(do(grab, s))
  called frame axioms
Frame problem: find an elegant way to handle non-change
(a) representation: avoid frame axioms
  a vast number of such axioms; most actions leave most fluents invariant
(b) reasoning: avoid repeated "copy-overs" to keep track of state
Qualification problem: true descriptions of real actions require endless caveats
– what if gold is slippery or nailed down or . . .
Ramification problem: real actions have many secondary consequences
– what about the dust on the gold, wear and tear on gloves, . . .

SLIDE 16

The frame problem

Solutions
– Building a KB to write down all the effect axioms: for each fluent F and action A that can cause the truth value of F to change, an axiom of the form R(s) ⇒ F(do(A, s)) (and R(s) ⇒ ¬F(do(A, s))), where R(s) is some condition on s
– Want a systematic procedure for generating all the frame axioms from these effect axioms
– If possible, also want a parsimonious representation for them (since in their simplest form, there are too many)
Frame axioms are necessary to reason about actions and are not entailed by the other axioms

SLIDE 17

Normal form for effect axioms

Suppose there are two positive effect axioms for the fluent Broken
  Fragile(x) ⇒ Broken(x, do(drop(r, x), s))
  NextTo(b, x, s) ⇒ Broken(x, do(explode(b), s))
These can be rewritten as
  ∃r{a = drop(r, x) ∧ Fragile(x)} ∨ ∃b{a = explode(b) ∧ NextTo(b, x, s)} ⇒ Broken(x, do(a, s))
Similarly, consider the negative effect axiom
  ¬Broken(x, do(repair(r, x), s))
which can be rewritten as
  ∃r{a = repair(r, x)} ⇒ ¬Broken(x, do(a, s))

SLIDE 18

Normal form for effect axioms

In general, for any fluent F, rewrite all the effect axioms as two formulas of the form
  (1) PF(x, a, s) ⇒ F(x, do(a, s))
  (2) NF(x, a, s) ⇒ ¬F(x, do(a, s))
where PF(x, a, s) and NF(x, a, s) are formulas whose free variables are among x1, . . . , xn, a, and s (x = (x1, . . . , xn))

SLIDE 19

Explanation closure

Completeness assumption regarding these effect axioms: assume that (1) and (2) characterize all the conditions under which an action a changes the value of fluent F
This can be formalized by explanation closure axioms
  (3) ¬F(x, s) ∧ F(x, do(a, s)) ⇒ PF(x, a, s)
    if F was false and was made true by doing action a, then condition PF must have been true
  (4) F(x, s) ∧ ¬F(x, do(a, s)) ⇒ NF(x, a, s)
    if F was true and was made false by doing action a, then condition NF must have been true
These explanation closure axioms are in fact disguised versions of frame axioms
  ¬F(x, s) ∧ ¬PF(x, a, s) ⇒ ¬F(x, do(a, s))
  F(x, s) ∧ ¬NF(x, a, s) ⇒ F(x, do(a, s))

SLIDE 20

Successor state axioms

Assume that the KB entails the following
– integrity of the effect axioms: ¬∃x, a, s. PF(x, a, s) ∧ NF(x, a, s)
– unique names for actions
– – for each distinct pair of action names Ai and Aj: Ai(x, · · ·) ≠ Aj(y, · · ·)
– – for each action name Ai, two uses of that action name are equal iff all their arguments are equal: Ai(x1, · · · , xn) = Ai(y1, · · · , yn) ⇔ x1 = y1 ∧ · · · ∧ xn = yn
Then it can be shown that the KB entails that (1), (2), (3), and (4) together are logically equivalent to
  F(x, do(a, s)) ⇔ PF(x, a, s) ∨ (F(x, s) ∧ ¬NF(x, a, s))
This is called the successor state axiom for F

SLIDE 21

Successor state axioms: examples

Each axiom is "about" a predicate (not an action per se):
  F true afterwards ⇔ [an action made F true ∨ F true already and no action made F false]
The successor state axiom for the Broken fluent
  Broken(x, do(a, s)) ⇔ ∃r{a = drop(r, x) ∧ Fragile(x)} ∨ ∃b{a = explode(b) ∧ NextTo(b, x, s)} ∨ (Broken(x, s) ∧ ¬∃r{a = repair(r, x)})
An object x is broken after doing action a iff a is a dropping action and x is fragile, or a is a bomb exploding where x is next to the bomb, or x was already broken and a is not the action of repairing it

SLIDE 22

Successor state axioms: examples

F true afterwards ⇔ [an action made F true ∨ F true already and no action made F false]
For holding the gold:
  ∀a, s Holding(gold, do(a, s)) ⇔ [(a = grab ∧ AtGold(s)) ∨ (Holding(gold, s) ∧ a ≠ release)]
Notes
– Successor state axioms solve the representational frame problem
– [Situation calculus is a FOL and hence not nonmonotonic; see later for nonmonotonic reasoning]
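The successor-state axiom for Holding(gold) can be evaluated step by step along an action sequence, which makes the "made true ∨ stayed true" shape concrete. A minimal sketch (the trace of actions and AtGold observations is invented for illustration):

```python
def holding_next(holding, at_gold, action):
    """Holding(gold, do(a,s)) ⇔ (a = grab ∧ AtGold(s)) ∨ (Holding(gold,s) ∧ a ≠ release)."""
    return (action == "grab" and at_gold) or (holding and action != "release")

# Roll the axiom forward from S0 along a hypothetical trace:
h = False  # ¬Holding(gold, S0)
for action, at_gold in [("forward", False), ("grab", True), ("forward", False)]:
    h = holding_next(h, at_gold, action)
print(h)  # True: grab succeeded at the gold and no later action released it
```

Only the one axiom is consulted at every step; no frame axioms are needed, which is the representational point of the slide.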

SLIDE 23

A simple solution to frame problem

This simple solution to the frame problem yields the following axioms
– one successor state axiom per fluent
– one precondition axiom per action
– unique name axioms for actions
Moreover, we do not get fewer axioms at the expense of long ones:
  the length of a successor state axiom is proportional to the number of actions that affect the truth value of the fluent
This solution relies on
– quantification over actions
– the assumption that relatively few actions affect each fluent
– the completeness assumption (for effects), which depends on the fact that actions always have deterministic effects

SLIDE 24

Making plans

Initial condition in KB:
  At(agent, [1, 1], S0)
  At(gold, [1, 2], S0)
Query: Ask(KB, ∃s Holding(gold, s))
  i.e., in what situation will I be holding the gold?
Answer: {s/do(grab, do(forward, S0))}
  i.e., go forward and then grab the gold
This assumes that the agent is interested in plans starting at S0 and that S0 is the only situation described in the KB

SLIDE 25

Planning in situation calculus

Represent a plan p as a sequence of actions [a1, a2, . . . , an]
The planning problem can be formulated as follows
Given a formula Goal(s), find a sequence of actions a = [a1, a2, . . . , an] s.t.
  KB |= Goal(do(a, S0)) ∧ Legal(do(a, S0))
where do([a1, a2, . . . , an], S0) is an abbreviation for
  do(an, do(an−1, · · · , do(a2, do(a1, S0)) · · ·))
and where Legal([a1, a2, . . . , an], S0) is an abbreviation for
  Poss(a1, S0) ∧ Poss(a2, do(a1, S0)) ∧ · · · ∧ Poss(an, do([a1, · · · , an−1], S0))

SLIDE 26

Planning in situation calculus

PlanResult(p, s) is the result of executing p in s
Then the query Ask(KB, ∃p Holding(gold, PlanResult(p, S0)))
  has the solution {p/[forward, grab]}
Definition of PlanResult in terms of do
  ∀s PlanResult([], s) = s
  ∀a, p, s PlanResult([a|p], s) = PlanResult(p, do(a, s))
Using resolution with answer extraction to find a sequence of actions such that
  KB |= ∃s. Goal(s) ∧ Legal(s)
Planning systems are special-purpose reasoners designed to do this type of inference more efficiently than a general-purpose reasoner

SLIDE 27

Planning in situation calculus: example

Initial state
  At(home, S0) ∧ ¬Have(milk, S0) ∧ . . .
Actions as successor state axioms
  Have(milk, do(a, s)) ⇔ [(a = Buy(milk) ∧ At(supermarket, s)) ∨ (Have(milk, s) ∧ a ≠ . . .)]
Query
  s = PlanResult(p, S0) ∧ At(home, s) ∧ Have(milk, s) ∧ . . .
Solution
  p = [Go(supermarket), Buy(milk), Buy(bananas), Go(hws), . . .]
Principal difficulty: unconstrained branching, hard to apply heuristics

SLIDE 28

Limitations of the situation calculus

There are a number of limitations in the situation calculus
– no time: cannot talk about how long actions take, or when they occur
– no concurrency: cannot talk about doing two actions at once
– only discrete situations: no continuous actions
– only primitive actions: no actions made up of other parts, like conditionals or iterations, etc.
GOLOG (Algol in logic) is a programming language that generalizes conventional imperative programming languages to overcome these limitations
Ref: Reiter R. Knowledge in Action: Logical Foundations for Specifying and Implementing Dynamical Systems. MIT Press, Cambridge, MA, 2001

SLIDE 29

Partially ordered plans

A partially ordered collection of steps with
– a Start step that has the initial state description as its effect
– a Finish step that has the goal description as its precondition
– causal links from the outcome of one step to the precondition of another
– temporal orderings between pairs of steps
Open condition = precondition of a step not yet causally linked
A plan is complete iff every precondition is achieved
A precondition is achieved iff it is the effect of an earlier step and no possibly intervening step undoes it

SLIDE 30

Example

[Figure: initial plan — Start (effects: At(Home), Sells(HWS,Drill), Sells(SM,Milk), Sells(SM,Ban.)) and Finish (preconditions: Have(Drill), Have(Milk), Have(Ban.), At(Home))]

SLIDE 31

Example

[Figure: partial plan — Buy(Drill) and Buy(Milk) added to achieve Have(Drill) and Have(Milk), with preconditions Sells(HWS,Drill) ∧ At(HWS) and Sells(SM,Milk) ∧ At(SM); a Go step with the open precondition At(x) supports the At conditions]

SLIDE 32

Example

[Figure: complete plan — Start → Go(HWS) → Buy(Drill) → Go(SM) → Buy(Milk), Buy(Ban.) → Go(Home) → Finish, with causal links supplying At(HWS), At(SM), At(Home) and the Sells conditions]

SLIDE 33

Planning process

Operators on partial plans:
– add a link from an existing action to an open condition
– add a step to fulfill an open condition
– order one step with respect to another to remove possible conflicts
Gradually move from incomplete/vague plans to complete, correct plans
Backtrack if an open condition is unachievable or if a conflict is unresolvable

SLIDE 34

POP algorithm sketch

function POP(initial, goal, operators) returns plan
  plan ← Make-Minimal-Plan(initial, goal)
  loop do
     if Solution?(plan) then return plan
     Sneed, c ← Select-Subgoal(plan)
     Choose-Operator(plan, operators, Sneed, c)
     Resolve-Threats(plan)
  end

function Select-Subgoal(plan) returns Sneed, c
  pick a plan step Sneed from Steps(plan)
     with a precondition c that has not been achieved
  return Sneed, c

SLIDE 35

POP algorithm

procedure Choose-Operator(plan, operators, Sneed, c)
  choose a step Sadd from operators or Steps(plan) that has c as an effect
  if there is no such step then fail
  add the causal link Sadd -c-> Sneed to Links(plan)
  add the ordering constraint Sadd ≺ Sneed to Orderings(plan)
  if Sadd is a newly added step from operators then
     add Sadd to Steps(plan)
     add Start ≺ Sadd ≺ Finish to Orderings(plan)

procedure Resolve-Threats(plan)
  for each Sthreat that threatens a link Si -c-> Sj in Links(plan) do
     choose either
        Demotion: add Sthreat ≺ Si to Orderings(plan)
        Promotion: add Sj ≺ Sthreat to Orderings(plan)
     if not Consistent(plan) then fail
  end

SLIDE 36

Clobbering and promotion/demotion

A clobberer is a potentially intervening step that destroys the condition achieved by a causal link. E.g., Go(Home) clobbers At(Supermarket):

[Figure: the shopping plan with the threatening step Go(Home); the DEMOTION and PROMOTION orderings are shown as dashed arrows around Go(HWS) and Buy(Drill)]

Demotion: put the clobberer before Go(Supermarket)
Promotion: put the clobberer after Buy(Milk)

SLIDE 37

Properties of POP

Nondeterministic algorithm: backtracks at choice points on failure:
– choice of Sadd to achieve Sneed
– choice of demotion or promotion for a clobberer
– selection of Sneed is irrevocable
POP is sound, complete, and systematic (no repetition)
Extensions for disjunction, universals, negation, conditionals
Can be made efficient with good heuristics derived from the problem description
Particularly good for problems with many loosely related subgoals

SLIDE 38

Example: Blocks world

Start state: C on A; A and B on the table
Goal state: A on B, B on C

PutOn(x,y)
  Precond: Clear(x), On(x,z), Clear(y)
  Effect: On(x,y), Clear(z), ~On(x,z), ~Clear(y)
PutOnTable(x)
  Precond: Clear(x), On(x,z)
  Effect: On(x,Table), Clear(z), ~On(x,z)
+ several inequality constraints

The "Sussman anomaly" problem

SLIDE 39

Example

[Figure: blocks-world start and goal states]

FINISH preconditions: On(A,B), On(B,C)
START effects: On(C,A), On(A,Table), Cl(B), On(B,Table), Cl(C)

SLIDE 40

Example

[Figure: blocks-world start and goal states]

START and FINISH as before, plus the step PutOn(B,C) (preconditions Cl(B), On(B,z), Cl(C)) achieving On(B,C)

SLIDE 41

Example

[Figure: blocks-world start and goal states]

As before, plus the step PutOn(A,B) (preconditions Cl(A), On(A,z), Cl(B)) achieving On(A,B)

PutOn(A,B) clobbers Cl(B) => order after PutOn(B,C)

SLIDE 42

Example

[Figure: blocks-world start and goal states]

As before, plus the step PutOnTable(C) (preconditions On(C,z), Cl(C)) achieving Cl(A)

PutOn(A,B) clobbers Cl(B) => order after PutOn(B,C)

PutOn(B,C) clobbers Cl(C) => order after PutOnTable(C)

SLIDE 43

Planning in the real world

Flat-tire example:

START effects: On(Tire1), Flat(Tire1), Off(Spare), Intact(Spare), ~Flat(Spare)
FINISH preconditions: On(x), ~Flat(x)

Remove(x)
  Precond: On(x)
  Effect: Off(x), ClearHub, ~On(x)
Puton(x)
  Precond: Off(x), ClearHub
  Effect: On(x), ~ClearHub
Inflate(x)
  Precond: Intact(x), Flat(x)
  Effect: ~Flat(x)

SLIDE 44

Things go wrong

Incomplete information
  Unknown preconditions, e.g., Intact(Spare)?
  Disjunctive effects, e.g., Inflate(x) causes Inflated(x) ∨ SlowHiss(x) ∨ Burst(x) ∨ BrokenPump ∨ . . .
Incorrect information
  Current state incorrect, e.g., spare NOT intact
  Missing/incorrect postconditions in operators
Qualification problem: can never finish listing all the required preconditions and possible conditional outcomes of actions

SLIDE 45

Solutions

Sensorless planning
  Devise a plan that works regardless of state or outcome
  Such plans may not exist
Conditional planning
  Plan to obtain information (observation actions)
  Subplan for each contingency
  Expensive because it plans for many unlikely cases
Monitoring/Replanning
  Assume normal states, outcomes
  Check progress during execution, replan if necessary
  Unanticipated outcomes may lead to failure

SLIDE 46

Conditional planning

If the world is nondeterministic or partially observable then percepts usually provide information, i.e., split up the belief state

[Figure: belief-state tree alternating ACTION and PERCEPT layers, with percepts splitting the belief state]

SLIDE 47

Conditional planning

Conditional plans check (any consequence of KB +) percept
  [. . . , if C then PlanA else PlanB, . . .]
Execution: check C against current KB, execute "then" or "else"
Need some plan for every possible percept
(Cf. game playing: some response for every opponent move)
(Cf. backward chaining: some rule such that every premise satisfied)
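Executing such a plan is a simple recursive walk that branches on the condition checked against the current KB. A minimal interpreter sketch (the step encoding, action names, and percept set are invented for illustration):

```python
def execute(plan, kb):
    """Run a conditional plan: a step is an action name, or a
    ("if", condition, plan_a, plan_b) tuple checked against the KB (a set of facts)."""
    trace = []
    for step in plan:
        if isinstance(step, tuple) and step[0] == "if":
            _, cond, plan_a, plan_b = step
            # Check C against the current KB, then execute "then" or "else"
            trace += execute(plan_a if cond in kb else plan_b, kb)
        else:
            trace.append(step)
    return trace

plan = ["Left", ("if", "AtR", ["Right"], ["Suck"]), "NoOp"]
print(execute(plan, kb={"CleanL"}))  # ['Left', 'Suck', 'NoOp']
print(execute(plan, kb={"AtR"}))     # ['Left', 'Right', 'NoOp']
```

In a real agent the KB would be updated from percepts between steps; here it is fixed only to keep the branching logic visible.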

SLIDE 48

Example

Double Murphy: sucking or arriving may dirty a clean square

[Figure: AND-OR tree for the double-Murphy world — branches labeled Left, Suck, Right; leaves marked GOAL or LOOP]

SLIDE 49

Example

Triple Murphy: also sometimes stays put instead of moving

[Figure: cyclic plan graph with Left and Suck actions leading to GOAL]

[L1 : Left, if AtR then L1 else [if CleanL then [] else Suck]] “Infinite loop” but will eventually work unless action always fails

SLIDE 50

Replanning

“Failure” = preconditions of remaining plan not met
Preconditions of remaining plan
  = all preconditions of remaining steps not achieved by remaining steps
  = all causal links crossing current time point
On failure, resume POP to achieve open conditions from current state
IPEM (Integrated Planning, Execution, and Monitoring):
  keep updating Start to match current state
  links from actions replaced by links from Start when done

SLIDE 51

Example

[Figure: the shopping plan before execution — Start supplies At(Home), Sells(HWS,Drill), Sells(SM,Milk), Sells(SM,Ban.); all steps and causal links intact]

SLIDE 52

Example

[Figure: after executing Go(HWS), Start is updated to supply At(HWS); the executed step's links are replaced by links from Start]

SLIDE 53

Example

[Figure: after Buy(Drill), Start supplies At(HWS), Have(Drill), Sells(SM,Milk), Sells(SM,Ban.)]

SLIDE 54

Example

[Figure: after Go(SM), Start supplies At(SM), Have(Drill), Sells(SM,Milk), Sells(SM,Ban.)]

SLIDE 55

Example

[Figure: after Buy(Milk) and Buy(Ban.), Start supplies At(SM), Have(Milk), Have(Ban.), Have(Drill)]

SLIDE 56

Example

[Figure: after Go(Home), Start supplies At(Home), Have(Milk), Have(Ban.), Have(Drill); all of Finish's preconditions are now satisfied]

SLIDE 57

Emergent behavior

[Figure: plan START → Get(Red) → Paint(Red) → FINISH, with causal links ~Have(Red), Have(Red), Color(Chair,Red)]

Failed precondition: Have(Red). Response: fetch more red paint

SLIDE 58

Emergent behavior

[Figure: the same painting plan]

Failed precondition: Color(Chair,Red). Response: apply an extra coat of paint

SLIDE 59

Emergent behavior

[Figure: the same painting plan]

Failed precondition: Color(Chair,Red). Response: apply an extra coat of paint

“Loop until success” behavior emerges from the interaction between the monitor/replan agent design and an uncooperative environment

SLIDE 60

Planners

  • Planner as Boolean satisfiability
  • Planner by progression with good heuristics
  • Planner by search of using a planning graph
  • Planner as constraint satisfaction
  • Planner as refinement of partially ordered plans
  • Planner as logical deduction
  • Planner as decision making

SLIDE 61

Planner as SAT solver

function SatPlan(init, successor-states, goal, Tmax) returns solution or failure
  inputs: init, a collection of assertions about the initial state
          successor-states, the successor-state axioms (or transitions) for all
             possible actions at each time up to some maximum time t
          goal, the assertion that the goal is achieved at time t
          Tmax, an upper limit for plan length

  for t = 1 to Tmax do
     cnf ← Translate-To-Sat(init, successor-states, goal, t)
     model ← Sat-Solver(cnf)
     if model is not null then return Extract-Solution(model)
  return failure
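The inner call can be made concrete with a toy CNF encoding of a one-step problem. This sketch is not a full Translate-To-Sat: the encoding (variables 1 = Have@0, 2 = Buy@0, 3 = Have@1) is written out by hand, and the "solver" simply tries every assignment, which is only sensible for a handful of variables:

```python
from itertools import product

def brute_force_sat(num_vars, clauses):
    """Toy SAT solver by exhaustive enumeration. A clause is a list of nonzero
    ints: v means variable |v| is true, -v means it is false."""
    for bits in product([False, True], repeat=num_vars):
        model = {i + 1: b for i, b in enumerate(bits)}
        if all(any(model[abs(l)] == (l > 0) for l in c) for c in clauses):
            return model
    return None

# Hand-built CNF for: initially ¬Have, action Buy makes Have true, goal Have@1.
clauses = [
    [-1],              # initial state: ¬Have@0
    [-3, 2, 1],        # successor state: Have@1 → (Buy@0 ∨ Have@0)
    [3, -2], [3, -1],  #                  (Buy@0 ∨ Have@0) → Have@1
    [3],               # goal: Have@1
]
model = brute_force_sat(3, clauses)
print(model[2])  # True: extracting the action variables yields the plan [Buy@0]
```

A real SatPlan would generate such clauses automatically for increasing horizons t and hand them to an industrial SAT solver; only the plan extraction step (read off the true action variables) is the same.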
