CMU-Q 15-381 Lecture 5: Classical Planning Factored Representations (STRIPS)


SLIDE 1

CMU-Q 15-381

Lecture 5: Classical Planning Factored Representations STRIPS

Teacher: Gianni A. Di Caro

SLIDE 2

AUTOMATED PLANNING: FACTORED STATE REPRESENTATIONS

SLIDE 3

PLANNING, FOR (MORE) COMPLEX WORLDS


§ Searching for a plan of action to achieve one’s goal is a critical part of AI (in both open and closed loop)
§ In fact, planning is glorified search
§ But to be effective, search needs more powerful state representations than those used so far
§ So far: states are indivisible, they have no internal structure
§ → Planning exploiting a structured representation of states
§ … And let’s keep living in deterministic, known, fully observable, single-agent worlds
§ This is what is commonly termed “Classical Planning”

SLIDE 4

STATE REPRESENTATIONS


(Figure: the same blocks B and C under three state representations: (a) Atomic, used so far; (b) Factored, used from now on; (c) Structured.)

SLIDE 5

STATE REPRESENTATIONS


The vacuum-world example

SLIDE 6

THE NEED FOR FACTORED STATES


§ The goal is to reach the banana, but achieving it requires achieving, in the correct sequence, a number of sub-goals that together make up a Plan

SLIDE 7

PROPOSITIONAL STRIPS PLANNING

§ STRIPS = Stanford Research Institute Problem Solver (1971)

  • Originally based on first-order logic, later simplified to include only propositional logic

§ (Logic-based) Language expressive enough to describe a wide variety of problems, but restrictive enough to allow efficient algorithms to operate over it
§ PDDL = Planning Domain Definition Language (1998 - ), the standard language for defining planning domains and problems; it includes the original STRIPS + more advanced features. The last version is 3.1, from 2014. Compact representation of planning.

§ A state is a conjunction of propositions, e.g., at(Truck1,Shadyside) ∧ at(Truck2,Oakland)

  • Proposition: A statement that is either true or false → A fact
  • Predicate: a proposition that contains variables/parameters, such as at(truck, place)

§ States are transformed via operators (actions) that have the form Preconditions ⇒ Postconditions (effects)


STRIPS / PDDL language(s) to represent / solve planning problems based on propositional (factored) state representation

SLIDE 8

RUNNING EXAMPLE: BLOCKS WORLD


(Figure: blocks-world Start and Goal configurations with blocks A, B, C.)

Predicates that can be used to describe the world:

§ onTable(X) ( on(X, [Table | Y]) )
§ on(X, Y)
§ clear(X)
§ holding(X)
§ handEmpty(X)
§ Negation of all the above

Objects of the world:
§ Block A
§ Block B
§ Block C
§ Table
§ Hand

Actions:
§ …

SLIDE 9

REPRESENTING STATES AS SET OF FACTS

§ Fact(ored) representation of states!
§ State 1 = { holding(A), clear(B), on(B,C), onTable(C) }

World states are represented as sets of facts: a conjunction of propositions (conditions)

Closed World Assumption (CWA): Facts not listed in a state are assumed to be false. Under CWA the agent is assumed to have full observability, so only positive facts need to be stated
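A minimal sketch (not slide material) of a factored state and the CWA truth test, with fact strings as an assumed encoding:

```python
# A factored state is just a set of ground facts (State 1 from this slide).
state1 = frozenset({"holding(A)", "clear(B)", "on(B,C)", "onTable(C)"})

def holds(state, fact):
    """Under the Closed World Assumption, a fact is true iff it is listed."""
    return fact in state

print(holds(state1, "on(B,C)"))   # listed, so true
print(holds(state1, "on(A,B)"))   # not listed, so assumed false (CWA)
```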

SLIDE 10

STATES


§ State:

  • Propositional literals: Poor ∧ Unknown
  • Ground first-order literals: At(Plane1, Rome) ∧ At(Plane2, Tokyo)
  • No variables allowed: At(y, Rome) ∧ At(z, Tokyo) does not describe a single state
  • Function-free: At(Father(Tom), NY) is not allowed; write At(Alex, NY) ∧ Father(Alex, Tom) instead

The world is represented through a set of features / objects (e.g., planes, people, cities), and each proposition states a fact that attributes “values” to features

SLIDE 11

REPRESENTING GOALS AS SET OF FACTS

§ Goals (being world states) are also represented as sets of facts
§ Example: state { on(A,B) } can be set as a goal

§ State 1 is not a goal state for the goal { on(A,B) }
§ State 2 is a goal state for the goal { on(A,B) }

A goal state is any state that includes all the goal facts

SLIDE 12

GOALS


§ Goals: A conjunction of facts, At(P1, JFK) ∧ At(P2, SFO), that may also contain variables, such as At(p, JFK) ∧ Plane(p) → to have any plane at JFK
§ The aim is to reach a state that entails the goal: OnTable(A) ∧ OnTable(B) ∧ OnTable(D) ∧ On(C, D) ∧ Clear(A) ∧ Clear(B) ∧ Clear(C) satisfies the goal to stack C on D

We can focus on getting individual sub-goals. Not possible in atomic representations!

A goal g is a conjunction of sub-goals! g = g1 ∧ g2 ∧… ∧ gn

Goals are reached through sequences of actions (the plan)

SLIDE 13

ACTIONS

§ Pre-cond is a conjunction of positive and negative conditions that must be satisfied to apply the operation
§ Post-cond is a conjunction of positive and negative conditions that become true when the operation is applied

A STRIPS action definition specifies:
✓ A set PRE of precondition facts
✓ A set ADD of add-effect facts (added to the state facts)
✓ A set DEL of delete-effect facts (deleted from the state facts)

PutDown(A,B): as PRE, the robot hand is holding A + B’s top is clear → the action puts A down on top of B

Actions: Operators with Preconditions + Effects (Postconditions)

In STRIPS only positive preconditions are used

SLIDE 14

EXAMPLE: MOVE OPERATOR

SLIDE 15

ACTION SCHEMA


An action schema to fly a plane from one location to another:

Action(Fly(p, from, to),
  PRECOND: At(p, from) ∧ Plane(p) ∧ Airport(from) ∧ Airport(to)
  EFFECT: ¬At(p, from) ∧ At(p, to))

§ An action is applicable in state s if s entails the preconditions
§ The facts negated by the effect of the action are removed from s, while the positive facts resulting from the action are added to s

Action schema: a family of different actions that can be derived by universal quantification of the variables

SLIDE 16

ACTION SCHEMA


§ Action schema:

Action(Name(v1, v2, …, vn),
  PRECONDITIONS: P1(v) ∧ P2(v) ∧ … ∧ Pm(v)
  ADD-LIST: {F1(v), F2(v), …, Fq(v)}
  DELETE-LIST: {S1(v), S2(v), …, Sk(v)})

§ RESULT(s, a) = (s – DELETE(a)) ∪ ADD(a)
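The RESULT rule can be sketched directly on sets; the named-tuple encoding of PRE/ADD/DEL is an assumption of this sketch, not slide notation:

```python
from collections import namedtuple

# A ground STRIPS action: precondition, add, and delete fact sets.
Action = namedtuple("Action", ["name", "pre", "add", "delete"])

def applicable(state, a):
    return a.pre <= state                 # the state entails the preconditions

def result(state, a):
    assert applicable(state, a)
    return (state - a.delete) | a.add     # RESULT(s, a) = (s - DEL(a)) ∪ ADD(a)

fly = Action("Fly(P1,SFO,JFK)",
             pre={"At(P1,SFO)", "Plane(P1)", "Airport(SFO)", "Airport(JFK)"},
             add={"At(P1,JFK)"},
             delete={"At(P1,SFO)"})

s = {"At(P1,SFO)", "Plane(P1)", "Airport(SFO)", "Airport(JFK)"}
print(result(s, fly))  # At(P1,SFO) removed, At(P1,JFK) added
```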

SLIDE 17

(CLASSICAL, PROPOSITIONAL) PLANNING PROBLEM


§ Planning domain:

  • Set of Action schemas (actions)
  • Set of Predicates (conjunction of predicates → states)

§ Planning problem (instance):

  • Planning domain
  • Set of Objects (world features)
  • Initial state (facts/propositions about the objects)
  • Goal(s)

§ Solution of the planning problem: A sequence of actions that, starting from the initial state, ends in a state s that entails the goal

SLIDE 18

AUTOMATED PLANNING PROBLEM


An action-state model Q = ⟨ S, s_start, S_goal, A, T, c, G ⟩

§ S: the set of states (can be atomic, factored)
§ s_start ∈ S: the initial state
§ S_goal ⊆ S: the subset of goal states
§ A: the set of possible actions; can be defined per state as A(s)
§ T: S × A → S: the successor / state transition function
§ c: S × A → ℝ: the step cost for taking action a in state s
§ G: S → {0,1}: criterion to check whether or not a state is a goal / terminal state

Solution plan: Path [s_start, s1, s2, …, s_goal] associated with the feasible sequence of actions [a1, a2, …, an] such that cost(path) is minimized

SLIDE 19

EXAMPLE: AIR CARGO TRANSPORTATION


Air cargo transportation problem (from R&N)
§ Predicates: At, Cargo, Plane, Airport, In
§ Objects: C1 (cargo container), C2, P1 (plane), P2, SFO, JFK
§ Actions: Load, Unload, Fly

SLIDE 20

BLOCKS WORLD


(Figure: blocks-world Start and Goal configurations with blocks A, B, C, as on Slide 8.)

§ MoveToTable(X, Y): Pre: clear(X) ∧ on(X,Y) ⇒ on(X,Table) ∧ clear(X) ∧ ¬on(X,Y)
§ Move(X, From, To): clear(X) ∧ on(X, From) ∧ clear(To) ∧ block(X) ∧ block(To) ⇒ on(X,To) ∧ ¬clear(To) ∧ ¬on(X,From)
§ MoveFromTable(X, Y)

Goal: on(A,B), on(B,C)

Plan: MoveToTable(C, A), MoveFromTable(B, C), MoveFromTable(A, B)
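The three-step plan can be checked by hand with ground STRIPS actions; the exact add/delete lists below are an illustrative encoding of the schemas above (assuming Start has C on A, with A and B on the table):

```python
def apply(state, pre, add, dele):
    """Apply a ground STRIPS action: (state - DEL) ∪ ADD, if PRE holds."""
    assert pre <= state, "preconditions not satisfied"
    return (state - dele) | add

start = {"on(C,A)", "onTable(A)", "onTable(B)", "clear(C)", "clear(B)"}

# MoveToTable(C, A): C goes from A to the table, A becomes clear
s1 = apply(start, {"clear(C)", "on(C,A)"},
           {"onTable(C)", "clear(A)"}, {"on(C,A)"})
# MoveFromTable(B, C): B goes from the table onto C
s2 = apply(s1, {"clear(B)", "onTable(B)", "clear(C)"},
           {"on(B,C)"}, {"onTable(B)", "clear(C)"})
# MoveFromTable(A, B): A goes from the table onto B
s3 = apply(s2, {"clear(A)", "onTable(A)", "clear(B)"},
           {"on(A,B)"}, {"onTable(A)", "clear(B)"})

print({"on(A,B)", "on(B,C)"} <= s3)  # the goal facts are entailed
```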

SLIDE 21

COMPLEXITY OF PLANNING

§ PLANSAT is the problem of determining whether a given planning problem is satisfiable (a plan exists)
§ In general, PLANSAT is PSPACE-complete (solvable in polynomial space, but believed to require more than polynomial time)
§ Bounded PLANSAT = decide whether a plan of a given length exists
§ (Bounded) PLANSAT is decidable, but PSPACE-hard
§ Disallowing negative effects: (Bounded) PLANSAT is NP-hard
§ Disallowing negative preconditions: PLANSAT is in P, but finding the optimal (shortest) plan is still NP-hard

SLIDE 22

COMPLEXITY RESULTS FOR PLANSAT

SLIDE 23

PLANNING AS SEARCH

§ (Forward) Search from initial state to goal § Can use search techniques, including heuristic search


(Figure: forward search from At(P1,A) ∧ At(P2,A): Fly(P1,A,B) leads to At(P1,B) ∧ At(P2,A), Fly(P2,A,B) leads to At(P1,A) ∧ At(P2,B).)

SLIDE 24

(FORWARD) STATE-SPACE SEARCH

§ In the absence of function symbols, the state space of a planning problem is finite → any graph search algorithm that is complete will be a complete planning algorithm
§ Irrelevant action problem: all applicable actions are considered at each state!
§ The resulting branching factor b is typically large and the state space is exponential in b → need for good heuristics!


At home → get milk, bananas and a cordless drill → return home
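Forward state-space search is just graph search over factored states; a minimal BFS sketch on the two-plane example (the tuple encoding of actions is an assumption of this sketch):

```python
from collections import deque

def bfs_plan(start, goal, actions):
    """Breadth-first forward search; actions are (name, pre, add, delete)."""
    frontier = deque([(frozenset(start), [])])
    seen = {frozenset(start)}
    while frontier:
        state, plan = frontier.popleft()
        if goal <= state:                         # goal facts entailed
            return plan
        for name, pre, add, dele in actions:
            if pre <= state:                      # applicable action
                nxt = frozenset((state - dele) | add)
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, plan + [name]))
    return None

acts = [("Fly(P1,A,B)", {"At(P1,A)"}, {"At(P1,B)"}, {"At(P1,A)"}),
        ("Fly(P2,A,B)", {"At(P2,A)"}, {"At(P2,B)"}, {"At(P2,A)"}),
        ("Fly(P1,B,A)", {"At(P1,B)"}, {"At(P1,A)"}, {"At(P1,B)"}),
        ("Fly(P2,B,A)", {"At(P2,B)"}, {"At(P2,A)"}, {"At(P2,B)"})]
plan = bfs_plan({"At(P1,A)", "At(P2,A)"}, {"At(P1,B)", "At(P2,B)"}, acts)
print(plan)  # ['Fly(P1,A,B)', 'Fly(P2,A,B)']
```

Note that even here every applicable action is expanded at every state, which is exactly the irrelevant-action problem at scale.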

SLIDE 25

(FORWARD) STATE-SPACE SEARCH

§ Air Cargo Example
§ Initial state: 10 airports, each airport has 5 planes and 20 pieces of cargo
§ Goal: transport all the cargo at airport A to airport B
§ Solution: load the 20 pieces of cargo at A into one of the planes at A and fly it to B
§ Avg branching factor b: each of the 50 planes can fly to 9 other airports, and each of the 200 packages can be either unloaded (if it is loaded) or loaded into any plane at its airport (if it is unloaded) → ~2000 possible actions per state
§ Number of states to explore: O(b^d) ∼ 2000^41
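A rough back-of-the-envelope check of the slide’s counts (the exact number of applicable actions depends on the state; this only reproduces the order of magnitude):

```python
import math

planes, airports, packages = 10 * 5, 10, 10 * 20
fly_actions = planes * (airports - 1)    # each of 50 planes flies to 9 others
cargo_actions = packages * 5             # each package loadable into ~5 local planes

print(fly_actions)                       # 450 fly actions
print(fly_actions + cargo_actions)       # ~1500, i.e. on the order of 2000
print(41 * math.log10(2000))             # 2000^41 has roughly 136 digits
```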

SLIDE 26

FIND A HEURISTIC: RELAX THE PROBLEM


§ Define a Relaxed problem:
  • (Potentially) easy to solve
  • The solution of the relaxed problem gives an admissible heuristic for A*

Any ideas about how to perform a general relaxation?

§ Relaxation: remove all preconditions from actions
§ → Every action is always applicable, and any condition (sub-goal) can potentially be achieved in one step (if there is an action that sets the sub-goal literal to true; otherwise the problem is impossible)
§ h(y) = cost-to-go(al) of the relaxed problem from state y
§ Equivalent to adding edges to the state graph: including forbidden actions

SLIDE 27

DOMAIN-INDEPENDENT HEURISTIC

§ Solving the relaxed problem should be easy enough; we can even take a shortcut by setting the solution to be the number of unsatisfied sub-goals from the current state y …

§ h(y) ≈ number of unsatisfied sub-goals from the current state y
  • It looks like a correct estimate and also admissible … or maybe not? (we need admissibility!)

§ Impossible to derive such a heuristic with atomic states! There the successor function is a black box; here we exploit the structure of the representation
§ The heuristic is domain-independent!
§ ☛ With atomic states, in general only domain-specific heuristics are possible
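The shortcut heuristic is a one-liner once states and goals are fact sets, which is exactly what the factored representation buys us:

```python
def h_unsat(state, goal):
    """Count goal facts not yet true in the current state."""
    return len(goal - state)

goal = {"on(A,B)", "on(B,C)"}
print(h_unsat({"on(B,C)", "onTable(C)", "holding(A)"}, goal))  # 1 sub-goal missing
print(h_unsat(goal | {"onTable(C)"}, goal))                    # 0 at a goal state
```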

SLIDE 28

HEURISTIC: IGNORE PRECONDITIONS

§ Complications that could make the heuristic function h(y) neither admissible nor the solution of the relaxed problem:

a. Some operations achieve multiple sub-goals (have multiple post-conditions)
b. Some operations undo the effects of others

To get an admissible and efficient heuristic, ignore preconditions and, in addition, ignore (i.e., further relax):

1. Just a
2. Just b
3. Both a and b

SLIDE 29

HEURISTIC: IGNORE PRECONDITIONS

§ Complications that could make the heuristic function h(y) neither admissible nor the solution of the relaxed problem:

a. Some operations achieve multiple sub-goals (have multiple post-conditions)
  § h(y) doesn’t correspond to the solution of the relaxed problem + it violates admissibility
b. Some operations undo the effects of others
  § h(y) doesn’t correspond to the solution of the relaxed problem

SLIDE 30

IGNORE PRECONDITIONS + NON-GOAL EFFECTS

§ To avoid actions that can cancel each other’s effects: remove all the effects of actions, except those that are facts gi, i=1,…,n, in the goal g (i.e., sub-goals) → exploit the factored structure!
§ h(y) = from y, the min number of actions such that the union of their effects contains all n sub-goals gi → Admissible
§ Computing h(y) = solving a SET COVER problem: NP-hard!
§ Greedy log n approximation:
  • Admissibility is lost!
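A sketch of the greedy set-cover approximation: repeatedly pick the action whose effects cover the most remaining sub-goals (action and sub-goal names here are illustrative):

```python
def greedy_cover(subgoals, effects):
    """effects: action name -> set of sub-goals it achieves.
    Returns a greedy (log n approximate) set of covering actions."""
    remaining, picked = set(subgoals), []
    while remaining:
        best = max(effects, key=lambda a: len(effects[a] & remaining))
        if not effects[best] & remaining:
            return None            # some sub-goal is unreachable
        picked.append(best)
        remaining -= effects[best]
    return picked

effects = {"A1": {"g1", "g2"}, "A2": {"g2", "g3"}, "A3": {"g3"}}
print(greedy_cover({"g1", "g2", "g3"}, effects))  # a cover of size 2
```

The length of the returned cover is the heuristic value; because greedy can overshoot the optimal cover, admissibility is lost, as the slide notes.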

(Figure: set-cover view of the heuristic: actions A2..A7 as columns, sub-goals G1..G4 as rows, with x marking which action achieves which sub-goal.)
SLIDE 31

IGNORE (SPECIFIC) PRECONDITIONS

§ Ignore specific preconditions to derive domain-specific heuristics
§ Sliding block puzzle, move(t, s1, s2) action:
§ On(t, s1) ∧ Blank(s2) ∧ Adjacent(s1, s2) ⇒ On(t, s2) ∧ Blank(s1) ∧ ¬On(t, s1) ∧ ¬Blank(s2)
§ Consider two options for removing specific preconditions from move():

a. Removing Blank(s2) ∧ Adjacent(s1, s2)
b. Removing Blank(s2)

§ Poll: Match option to heuristic:

1. a ↔ ∑Manhattan, b ↔ #misplaced tiles
2. a ↔ #misplaced tiles, b ↔ ∑Manhattan
3. b ↔ #misplaced tiles, a is inadmissible
4. b ↔ ∑Manhattan, a is inadmissible

(Figure: example 8-puzzle state and goal state.)
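The two heuristics named in the poll, sketched on (row, col) tile positions; the tiny three-tile layout below is illustrative, not the slide’s puzzle:

```python
def misplaced(state, goal):
    """Number of tiles not on their goal square."""
    return sum(1 for t in state if state[t] != goal[t])

def manhattan(state, goal):
    """Sum of |row| + |col| distances from each tile to its goal square."""
    return sum(abs(r1 - r2) + abs(c1 - c2)
               for t, (r1, c1) in state.items()
               for (r2, c2) in [goal[t]])

goal  = {1: (0, 0), 2: (0, 1), 3: (0, 2)}
state = {1: (0, 1), 2: (0, 0), 3: (2, 2)}
print(misplaced(state, goal))   # 3 tiles misplaced
print(manhattan(state, goal))   # 1 + 1 + 2 = 4
```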

SLIDE 32

BACKWARD STATE-SPACE SEARCH

§ Searching from a goal state to the initial state (regression)

§ We only need to consider actions that are relevant to the goal (or current state) → relevant-states search
§ This can make a strong reduction in the branching factor, such that it could be more efficient than forward (progression) search
§ “Imagine trying to figure out how to get to some small place with few traffic connections from somewhere with a lot of traffic connections”

(Figure: backward (regression) search tree over At(P1/P2, A/B) conditions, regressing over Fly(P1,A,B) and Fly(P2,A,B).)

SLIDE 33

BACKWARD STATE-SPACE SEARCH

§ Regression from a (goal) state g over the action a gives state g′:

  • g′ = (g – ADD(a)) ∪ Preconditions(a)

§ DEL(a) doesn’t appear: we don’t know whether the facts deleted by DEL(a) were true or not before a, therefore nothing can be said about them
§ Variables can be included, such that a set of states is defined:

  • Goal At(C2, SFO) → Unload(C2, p, SFO) → g′ = In(C2, p) ∧ At(p, SFO) ∧ Cargo(C2) ∧ Plane(p) ∧ Airport(SFO)
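The regression rule is one set expression; replaying the Unload example (fact strings, including the unbound variable p, are an illustrative encoding):

```python
def regress(goal, pre, add):
    """g' = (g - ADD(a)) ∪ PRE(a): what must hold before the action."""
    return (goal - add) | pre

g  = {"At(C2,SFO)"}
g2 = regress(g,
             pre={"In(C2,p)", "At(p,SFO)", "Cargo(C2)", "Plane(p)", "Airport(SFO)"},
             add={"At(C2,SFO)"})
print(sorted(g2))  # the achieved goal fact is gone, the preconditions remain
```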

SLIDE 34

BACKWARD STATE-SPACE SEARCH

§ How to select actions?
§ Relevant actions only

  • Have an effect which is in the set of (current) goal conditions
  • Goal: At(C1, JFK) ∧ At(C2, SFO) → Unload(C2, p, SFO) is relevant, Fly(p, JFK, SFO) is not relevant

§ Consistent actions only

  • Have no effect which negates an element of the goal
  • Goal: A ∧ B ∧ C; an action a with effect A ∧ B ∧ ¬C is not consistent
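The two filters can be sketched as predicates over effect sets; encoding negative literals as "~fact" strings is an assumption of this sketch:

```python
def negate(f):
    """'A' <-> '~A' for string-encoded literals."""
    return f[1:] if f.startswith("~") else "~" + f

def relevant(effects, goal):
    """The action achieves at least one goal condition."""
    return bool(effects & goal)

def consistent(effects, goal):
    """No effect negates an element of the goal."""
    return not any(negate(e) in goal for e in effects)

goal = {"A", "B", "C"}
print(relevant({"A", "B", "~C"}, goal))    # True: achieves A and B
print(consistent({"A", "B", "~C"}, goal))  # False: ~C negates goal element C
```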


§ How to define good heuristics?

SLIDE 35

CONCLUSIONS, SO FAR

§ Representations using factored states allow us to reason about the (factored) structure of the states (in terms of sets of variables) and exploit it
§ A goal state is a conjunction of sub-goals that can be individually satisfied
§ STRIPS / PDDL: languages to express problem domains and problem instances in a way which is expressive enough while allowing for efficient solutions
§ Forward search can in principle be applied, but the state space and, more importantly, the branching factor b are expected to be very large
§ Uninformed search can’t really be used
§ Informed search, A*, is an option, but it needs very good heuristics!
§ It may not be obvious how to define general, problem-independent, admissible heuristics (in polynomial time)
§ Backward search can potentially be “easier” since it partially overcomes the problem of irrelevant actions; however, defining heuristics for backward search is even more difficult than for forward search
§ We need a better tool, possibly a way of automatically generating tight admissible heuristics for A*

SLIDE 36

PLANNING GRAPHS

§ Graph-based data structure representing a polynomial-size/time approximation of the exponential search tree
§ Can be used to automatically produce good heuristic estimates (e.g., for A*)
§ Can be used to search for a solution using the GRAPHPLAN algorithm

SLIDE 37

Planning Graphs


§ Leveled graph: vertices organized into levels/stages, with edges only between levels
§ Two types of vertices on alternating levels:

  • Conditions
  • Operations

§ Two types of edges:

  • Precondition: from condition to operation
  • Postcondition: from operation to condition
SLIDE 38

GENERIC PLANNING GRAPH


§ T0 contains all the conditions that hold in the initial state
SLIDE 39

GENERIC PLANNING GRAPH


§ Add an operation to level Pj if its preconditions appear in level Tj

SLIDE 40

GENERIC PLANNING GRAPH


§ Add a condition to level Tj+1 if it is the postcondition of an operation (it is in the ADD or DELETE lists) in level Pj
§ Keep a previous condition if no action negates it (persistence, via a no-op action)

SLIDE 41

CONDITIONS MONOTONICALLY INCREASE


(Figure: condition levels grow monotonically, always carried forward by no-ops: {q, ¬r, ¬s} → {q, ¬r, ¬s, ¬q} → {q, ¬r, ¬s, ¬q}.)

SLIDE 42

GENERIC PLANNING GRAPH


  • Repeat …
SLIDE 43

GENERIC PLANNING GRAPH


§ → The level j at which a condition first appears is a (good) estimate of how difficult it is to achieve that condition
§ → We can optimistically estimate how many steps it takes to reach a goal g (or sub-goal gi) from the initial state: admissible heuristic!

Idea: Tj contains all conditions that could hold at stage j based on past actions; Pj contains all operations that could have their preconditions satisfied at time j

No ordering among the operations is assumed at each stage; they could be executed in parallel
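A sketch of the level construction, further relaxed by ignoring delete lists and mutexes, with the first-appearance level read off as the optimistic estimate (actions and facts are illustrative):

```python
def build_levels(init, actions, n_levels):
    """actions: (name, pre, add) tuples; returns condition levels T0..Tn."""
    T = [frozenset(init)]
    for _ in range(n_levels):
        ops = [a for a in actions if a[1] <= T[-1]]   # preconditions appear in Tj
        nxt = set(T[-1])                              # no-ops persist conditions
        for _, _, add in ops:
            nxt |= add                                # postconditions join Tj+1
        T.append(frozenset(nxt))
    return T

def first_level(T, cond):
    """Level at which a condition first appears (optimistic cost estimate)."""
    return next((j for j, lvl in enumerate(T) if cond in lvl), None)

acts = [("Op1", {"q"}, {"r"}), ("Op2", {"r"}, {"s"})]
T = build_levels({"q"}, acts, 3)
print([first_level(T, c) for c in ("q", "r", "s")])  # [0, 1, 2]
```

The monotonic growth of the condition levels described on Slide 41 is visible here: each Tj is a superset of the previous one.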

SLIDE 44

OPERATIONS MONOTONICALLY INCREASE


(Figure: operation levels also grow monotonically, as a result of the monotonic increase of conditions, which keeps previous preconditions holding and makes new preconditions true.)

SLIDE 45

MUTUAL EXCLUSION LINKS

§ As it is, the graph would be too optimistic!
§ The graph data structure also records conflicts between actions or conditions: two operations or conditions are mutually exclusive (mutex) if no valid plan can contain both at the same time
§ A bit more formally:

  • Two operations are mutex if their preconditions or postconditions are mutex (inconsistent effects, competing needs, interference)
  • Two conditions are mutex if one is the negation of the other, or all action pairs that achieve them are mutex (inconsistent support)
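The two operation-level mutex tests that depend only on the pair itself can be sketched as follows (dict encoding with "~" for negation is an assumption of this sketch; the Eat / no-op pair previews the cake example on the next slides):

```python
def negate(f):
    return f[1:] if f.startswith("~") else "~" + f

def inconsistent_effects(a, b):
    """One operation's postcondition negates the other's."""
    return any(negate(e) in b["eff"] for e in a["eff"])

def interference(a, b):
    """A postcondition of one negates a precondition of the other."""
    return (any(negate(e) in b["pre"] for e in a["eff"]) or
            any(negate(e) in a["pre"] for e in b["eff"]))

eat  = {"pre": {"Have"}, "eff": {"Eaten", "~Have"}}
noop = {"pre": {"Have"}, "eff": {"Have"}}       # persistence of Have
print(inconsistent_effects(eat, noop))  # True: ~Have vs Have
print(interference(eat, noop))          # True: ~Have negates the no-op's precondition
```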

SLIDE 46

A RUNNING EXAMPLE

§ “Have cake and eat cake too” problem

SLIDE 47

A RUNNING EXAMPLE


§ Only Eat(Cake) is applicable

SLIDE 48

A RUNNING EXAMPLE

SLIDE 49

MUTEX CASES

§ Inconsistent postconditions (two ops): one operation negates the effect of the other; Eat(Cake) and the no-op of Have(Cake)
§ Interference (two ops): a postcondition of one operation negates a precondition of the other; Eat(Cake) and the no-op of Have(Cake) (an issue in parallel execution: the order should not matter, but here it would)

(Figure: inconsistent postconditions and interference, each shown as a pair of actions producing B and ¬B.)

SLIDE 50

A RUNNING EXAMPLE

(Figure: cake-example planning graph annotated with mutex links: inconsistent postconditions, negation of each other, interference.)

SLIDE 51

MUTEX CASES

§ Competing needs (two ops): a precondition of one operation is mutex with a precondition of the other, because they are the negation of each other, as for Bake(Cake) and Eat(Cake), or because they have inconsistent support
§ Inconsistent support (two conditions): each possible pair of operations that achieve the two conditions is mutex. Have(Cake) and Eaten(Cake) are mutex in S1 but not in S2, because there they can be achieved by Bake(Cake) and the no-op of Eaten(Cake)

(Figure: inconsistent support and competing needs examples.)

SLIDE 52

A RUNNING EXAMPLE

(Figure: cake example annotated with inconsistent support and competing needs mutex links.)

SLIDE 53

SUMMARY


§ STRIPS Language for Automated Planning
§ Factored state representation
§ Actions and action schemas
§ Planning problem
§ Complexity of planning
§ Planning as a search problem
§ Forward and Backward search
§ Need for domain-independent heuristics
§ Difficulty of defining admissible heuristics for A*
§ Planning graph data structure: construction and stored information