PLANNING WITH INCOMPLETE USER PREFERENCES AND DOMAIN MODELS
Tuan Anh Nguyen Graduate Committee Members: Subbarao Kambhampati (Chair) Chitta Baral Minh B. Do Joohyung Lee David E. Smith
MOTIVATION
Automated Planning In Practice
Actions Preconditions Effects
Deterministic Non-deterministic Stochastic
Initial situation Goal conditions What a user wants
Find a (best) plan!
Action models are not
Cost of modeling
Error-prone
Users usually don’t
Always want to see more than one plan
Planning with incomplete user preferences and domain models
Classical Model: “Closed world” assumption
All preferences are assumed to be fully specified/available
If no preferences are specified, then the user is
If preferences/objectives are specified, find a plan
Full Knowledge
Real World: Preferences not fully known
Full Knowledge lacking
Unknown preferences
For all we know, the user may care about everything: … times, the type of flight, the airport, time of travel and cost of travel…
Partially known
We know that the user cares only about travel time and cost, but we don’t know how she combines them…
Classical Model: “Closed world” assumption
Fully specified preconditions and effects
Known exact probabilities of outcomes
Full Knowledge
(:action pick-up
  :parameters (?b - ball ?r - room)
  :precondition (and (at ?b ?r) (at-robot ?r) (free-gripper))
  :effect (and (carry ?b) (not (at ?b ?r)) (not (free-gripper))))
Completely modeling the domain dynamics is:
Time consuming
Error-prone
Sometimes impossible
What does it mean to plan with incompletely specified domain models?
The plan could fail! Prefer plans that are more likely to succeed…
How to define such a solution concept?
Incompleteness representation
Solution concepts
Planning techniques
“Model-lite” Planning
Preference incompleteness
  Representation: two levels
    User preferences exist, but are totally unknown
    Partially specified: complete set of plan attributes; parameterized value function with unknown trade-off values
  Solution concept: plan sets
  Solving techniques: synthesizing high quality plan sets
Domain incompleteness
  Representation: actions with possible preconditions / effects, optionally with weights for being the real ones
  Solution concept: “robust” plans
  Solving techniques: synthesizing robust plans
“Model-lite” Planning: preference incompleteness
Representation: two levels
  User preferences exist, but are totally unknown
    Solution concept: plan sets with quality given by distance measures w.r.t. base-level features of plans (actions, states, causal links)
    Solving techniques: CSP-based and local-search based planners
  Partially specified: full set of plan attributes; parameterized value function with unknown trade-off values
    Solution concept: plan sets with quality given by the IPF/ICP measure
    Solving techniques: Sampling, ICP and Hybrid approaches
Publications (preference incompleteness)
Domain independent approaches for finding diverse plans. IJCAI (2007)
Planning with partial preference
Generating diverse plans to handle unknown and partially known user preferences. AIJ 190 (2012)
(with Biplav Srivastava, Subbarao Kambhampati, Minh Do, Alfonso Gerevini and Ivan Serina)
Publications (domain incompleteness)
Assessing and Generating Robust Plans with Partial Domain Models. ICAPS-WS (2010)
Synthesizing Robust Plans under Incomplete Domain Models. AAAI-WS (2011), NIPS (2013)
A Heuristic Approach to Planning with Incomplete STRIPS Action Models. ICAPS (2014)
(with Subbarao Kambhampati, Minh Do)
Predicate set R: clear(x – object), on-
Operators O:
  Name (signature): pick-up(x - object)
  Preconditions: hand-empty, clear(x)
  Effects: ~hand-empty, holding(x), ~clear(x)
A single complete model!
Set of typed objects $\{o_1, \dots, o_k\}$
Together with predicate set $P$, we have a set of propositions $F$
Together with operators $O$, we have a set of actions
Initial state: $I \subseteq F$
Goals: $G \subseteq F$
Find: a plan $\pi$ that achieves $G$ starting from $I$
Transition function: $\gamma$
Applying $\pi = \langle a_1, \dots, a_n \rangle$ at state $s$ yields $\gamma(\pi, s)$
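A minimal Python sketch of the transition function for a single complete STRIPS model may help fix the notation. The `Action` tuple, the helper names and the grounded `pick-up(A)` instance are illustrative assumptions, not taken from the slides; an unmet precondition is modeled here as plan failure (`None`).

```python
from typing import FrozenSet, NamedTuple, Optional, Sequence

State = FrozenSet[str]


class Action(NamedTuple):
    name: str
    pre: FrozenSet[str]     # propositions that must hold before execution
    add: FrozenSet[str]     # propositions made true by the action
    delete: FrozenSet[str]  # propositions made false by the action


def apply_action(state: State, action: Action) -> Optional[State]:
    """One-step transition; None signals an unmet precondition."""
    if not action.pre <= state:
        return None
    return (state - action.delete) | action.add


def apply_plan(state: State, plan: Sequence[Action]) -> Optional[State]:
    """Apply the actions of the plan in order, stopping at the first failure."""
    current: Optional[State] = state
    for action in plan:
        if current is None:
            return None
        current = apply_action(current, action)
    return current


def achieves(plan: Sequence[Action], init: State, goal: State) -> bool:
    """True if the plan executes from init and the goal holds in the final state."""
    final = apply_plan(init, plan)
    return final is not None and goal <= final


# Hypothetical grounded instance of the pick-up operator for a block named A.
pick_up_a = Action(
    name="pick-up(A)",
    pre=frozenset({"hand-empty", "clear(A)"}),
    add=frozenset({"holding(A)"}),
    delete=frozenset({"hand-empty", "clear(A)"}),
)
init = frozenset({"hand-empty", "clear(A)", "on-table(A)"})
goal = frozenset({"holding(A)"})
```

Calling `achieves([pick_up_a], init, goal)` checks goal achievement for the one-action plan in this toy instance.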
Predicate set $R$: clear(x - object), on-table(x - object), …
Operators $O$:
  Name (signature): pick-up(x - object)
  Preconditions: hand-empty, clear(x)
  Possible preconditions: light(x)
  Effects: ~hand-empty, holding(x), ~clear(x)
  Possible effects: dirty(x)
Incomplete domain $D$
At “schema” level with typed variables (no objects)
With $K$ “annotations”, we have $2^K$ possible complete models
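To make the count of complete models concrete, the following Python sketch enumerates every way of realizing the annotations of the pick-up operator. The tuple encoding of an annotation is an assumption made for illustration only.

```python
from itertools import product
from typing import FrozenSet, Iterator, List, Tuple

# Each annotation is (operator, kind, predicate); this encoding is hypothetical.
Annotation = Tuple[str, str, str]

annotations: List[Annotation] = [
    ("pick-up", "possible-precondition", "light(x)"),
    ("pick-up", "possible-effect", "dirty(x)"),
]


def complete_models(items: List[Annotation]) -> Iterator[FrozenSet[Annotation]]:
    """Yield one complete model per subset of annotations that is realized."""
    for choice in product([False, True], repeat=len(items)):
        yield frozenset(a for a, realized in zip(items, choice) if realized)


models = list(complete_models(annotations))
```

With the two annotations above, `models` contains one entry per subset, from the model in which neither annotation is real to the model in which both are.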
Incompleteness in deterministic domains
Stochastic domains
Set of typed objects $\{o_1, \dots, o_k\}$
Together with predicate set $P$, we have a set of propositions $F$
Together with operators $O$, we have a set of actions
Initial state: $I \subseteq F$
Goals: $G \subseteq F$
Find: a plan $\pi$ that “achieves” $G$ starting from $I$
An ill-defined solution concept!
Need a definition for “goal achievement”
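Under $D$: $\gamma(\pi, s) = \bigcup_{D_i \in \langle\langle D \rangle\rangle} \gamma^{D_i}(\pi, s)$
The probability of ending up in $s' \in \gamma(\pi, s)$ is equal to $\sum_{D_i \in \langle\langle D \rangle\rangle,\; s' = \gamma^{D_i}(\pi, s)} \Pr(D_i)$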
STRIPS Execution (SE): $\gamma^{D}(\langle a \rangle, s) = (s \setminus Del_{D}(a)) \cup Add_{D}(a)$, …
Generous Execution (GE): $\gamma^{D}(\langle a \rangle, s) = (s \setminus Del_{D}(a)) \cup Add_{D}(a)$, …
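The two execution semantics can be contrasted in a short Python sketch. The side conditions of the two formulas are not recoverable from the slide, so the behavior on an unmet precondition is an assumption here: under SE the plan fails (modeled as `None`), while under GE the action is skipped and the state is left unchanged. The example action is hypothetical.

```python
from typing import FrozenSet, Optional

State = FrozenSet[str]


def step_se(state: Optional[State], pre: State, add: State, delete: State) -> Optional[State]:
    """STRIPS Execution (assumed): an unmet precondition makes the plan fail."""
    if state is None or not pre <= state:
        return None
    return (state - delete) | add


def step_ge(state: State, pre: State, add: State, delete: State) -> State:
    """Generous Execution (assumed): an unmet precondition leaves the state unchanged."""
    if not pre <= state:
        return state
    return (state - delete) | add


# Hypothetical action requiring p1 and adding p3, executed in a state with only p2.
s = frozenset({"p2"})
pre, add, delete = frozenset({"p1"}), frozenset({"p3"}), frozenset()
```

Under these assumptions the two functions differ only when the precondition is unmet; when it holds, both return the same successor state.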
Proposition set $F = \{p_1, p_2, p_3\}$
Initial state $I = \{p_2\}$
Goal $G = \{p_3\}$
Naturally, we prefer a plan that succeeds in as many complete models as possible
Predicate set $R$: clear(x - object), on-table(x - object), …
Operators $O$:
  Name (signature): pick-up(x - object)
  Preconditions: hand-empty, clear(x)
  Possible preconditions: light(x) with a weight of 0.8
  Effects: ~hand-empty, holding(x), ~clear(x)
  Possible effects: dirty(x) with an unspecified weight
Treat weights as probabilities with random variables
Robustness measure: $R(\pi) = \sum_{D_i \in \langle\langle D \rangle\rangle :\; \gamma^{D_i}(\pi, I) \models G} \Pr(D_i)$
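As a worked illustration of the robustness measure, the Python sketch below sums the probabilities of the complete models under which a one-action plan, pick-up(A), reaches the goal. Several things are assumptions made for the example: the annotations are treated as independent, the unspecified weight of dirty is set to 0.5, an unmet precondition makes the plan fail, and the block name A, the initial states and the goal are hypothetical.

```python
from itertools import product
from typing import Dict, Set

# Weight = probability that the annotation is real. 0.8 is from the slide;
# 0.5 for the unspecified weight is an assumption.
POSSIBLE: Dict[str, float] = {"pre:light(A)": 0.8, "add:dirty(A)": 0.5}

KNOWN_PRE = {"hand-empty", "clear(A)"}
KNOWN_ADD = {"holding(A)"}
KNOWN_DEL = {"hand-empty", "clear(A)"}


def succeeds(realized: Set[str], init: Set[str], goal: Set[str]) -> bool:
    """Execute pick-up(A) under one complete model and test the goal."""
    pre, add = set(KNOWN_PRE), set(KNOWN_ADD)
    if "pre:light(A)" in realized:
        pre.add("light(A)")
    if "add:dirty(A)" in realized:
        add.add("dirty(A)")
    if not pre <= init:
        return False  # assumed: an unmet precondition means failure
    return goal <= ((init - KNOWN_DEL) | add)


def robustness(init: Set[str], goal: Set[str]) -> float:
    """Sum Pr(D_i) over all complete models D_i in which the plan succeeds."""
    names = list(POSSIBLE)
    total = 0.0
    for choice in product([False, True], repeat=len(names)):
        prob = 1.0
        for name, on in zip(names, choice):
            prob *= POSSIBLE[name] if on else 1.0 - POSSIBLE[name]
        realized = {name for name, on in zip(names, choice) if on}
        if succeeds(realized, init, goal):
            total += prob
    return total


goal = {"holding(A)"}
dark_init = {"hand-empty", "clear(A)"}                # light(A) does not hold
light_init = {"hand-empty", "clear(A)", "light(A)"}   # light(A) holds
```

In this toy setting the plan is fully robust from `light_init`, whereas from `dark_init` it only succeeds in the models where light(x) is not a real precondition.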
A measure for plan quality
  Robustness of plan $R(\pi) \in [0, 1]$
Plan robustness assessment
  Reduced to weighted model counting
  Complexity
Synthesizing robust plans
  Compilation approach
  Heuristic search approach
Computation: given $D$ and a plan $\pi$
Construct a set of correctness constraints $\Sigma(\pi)$ for the execution of $\pi$:
State transitions caused by actions are correct.
The goal $G$ is satisfied in the last state.
Then: $R(\pi)$ is computed from the weighted model count of $\Sigma(\pi)$
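$\bigvee_{C_p^i \le k \le i-1,\; p \in Add(a_k)} p_{a_k}^{add}$
$p_{a_i}^{pre} \Rightarrow \bigvee_{C_p^i \le k \le i-1,\; p \in Add(a_k)} p_{a_k}^{add}$
$p_{a_m}^{del} \Rightarrow \bigvee_{C_p^i \le k \le i-1,\; p \in Add(a_k)} p_{a_k}^{add}$
$p_{a_i}^{pre} \Rightarrow \left( p_{a_m}^{del} \Rightarrow \bigvee_{C_p^i \le k \le i-1,\; p \in Add(a_k)} p_{a_k}^{add} \right)$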
Complexity
The problem of computing $R(\pi)$ for a plan $\pi$ to a problem $\langle D, I, G \rangle$ is #P-complete.
Membership:
Have a counting Turing machine non-deterministically guess a complete model and check the correctness of the plan under it.
The number of accepting branches is the number of complete models under which the plan succeeds.
Hardness:
There exists a counting reduction from the problem of counting satisfying assignments of Monotone-2-SAT to the Robustness-Assessment (RA) problem.
The realization of possible preconditions / effects is represented by Boolean variables $p_a^{pre}$, $p_a^{add}$, $p_a^{del}$
Thus, they can be compiled away using “conditional effects”
If $p_a^{pre} = true$ then $p$ is a precondition of $a$.
Domain incompleteness becomes state incompleteness
Conformant probabilistic planning problem!
Compiled “pick-up”
Using Probabilistic-FF planner (Domshlak &
Synthesizing Robust Plans under Incomplete Domain Models (NIPS 2013)
Normally fails with large problem instances
Incomplete Logistics domain
Do not explicitly maintain the set of resulting states $\gamma(\pi, s) = \bigcup_{D_i \in \langle\langle D \rangle\rangle} \gamma^{D_i}(\pi, s)$
Successor state:
Possible delete effects might not take effect!
Recursive definition for $\gamma$
Completeness: Any solution in the complete STRIPS action model exists in the solution space of the problem with incomplete domain.
Soundness: For any plan returned under incomplete STRIPS domain semantics, there is one complete STRIPS model under which the plan succeeds.
Reduce exact weighted model counting
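If $G \subseteq s$ and $U(\pi) > \delta$, then compute $wmc(\pi)$
$U(\pi) \ge wmc(\pi)$ (upper bound for $R(\pi)$)
How to compute $h(s, \delta) = |\tilde{\pi}|$?
[Figure: initial state $I$, current state $s$ reached by $\pi_k$, relaxed plan $\tilde{\pi}$ from $s$ to a goal state $s_G$]
(Lower bound for $R(\pi)$)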
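$\bigvee_{C_p^i \le k \le i-1,\; p \in Add(a_k)} p_{a_k}^{add}$
$p_{a_i}^{pre} \Rightarrow \bigvee_{C_p^i \le k \le i-1,\; p \in Add(a_k)} p_{a_k}^{add}$
$p_{a_m}^{del} \Rightarrow \bigvee_{C_p^i \le k \le i-1,\; p \in Add(a_k)} p_{a_k}^{add}$
$p_{a_i}^{pre} \Rightarrow \left( p_{a_m}^{del} \Rightarrow \bigvee_{C_p^i \le k \le i-1,\; p \in Add(a_k)} p_{a_k}^{add} \right)$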
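$\bigvee_{C_p^i \le k \le i-1,\; p \in Add(a_k)} p_{a_k}^{add}$
$\neg p_{a_i}^{pre} \vee \bigvee_{C_p^i \le k \le i-1,\; p \in Add(a_k)} p_{a_k}^{add}$
$\neg p_{a_m}^{del} \vee \bigvee_{C_p^i \le k \le i-1,\; p \in Add(a_k)} p_{a_k}^{add}$
$\neg p_{a_i}^{pre} \vee \neg p_{a_m}^{del} \vee \bigvee_{C_p^i \le k \le i-1,\; p \in Add(a_k)} p_{a_k}^{add}$
$w(\neg p) = 1 - w(p)$
$\Sigma(\pi)$ as a set of clauses with positive literals.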
$\Pr\left( \bigwedge_{c_i \in \Sigma(\pi)} c_i \right) \ge \prod_{c_i \in \Sigma(\pi)} \Pr(c_i)$
(Equality holds when all clauses are independent)
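The product bound can be checked numerically on a small instance. The Python sketch below compares the product of clause probabilities against an exact weighted model count obtained by brute-force enumeration, for clauses with positive literals only and independent variables. The variable names, weights and clause sets are made up for illustration.

```python
from itertools import product
from math import prod
from typing import Dict, FrozenSet, List

Clause = FrozenSet[str]  # a disjunction of positive literals


def clause_prob(clause: Clause, w: Dict[str, float]) -> float:
    """Probability that at least one variable of the clause is true."""
    return 1.0 - prod(1.0 - w[v] for v in clause)


def exact_wmc(clauses: List[Clause], w: Dict[str, float]) -> float:
    """Exact weighted model count by enumerating all truth assignments."""
    variables = sorted({v for c in clauses for v in c})
    total = 0.0
    for values in product([False, True], repeat=len(variables)):
        assign = dict(zip(variables, values))
        if all(any(assign[v] for v in c) for c in clauses):
            total += prod(w[v] if assign[v] else 1.0 - w[v] for v in variables)
    return total


def lower_bound(clauses: List[Clause], w: Dict[str, float]) -> float:
    """Product of the individual clause probabilities."""
    return prod(clause_prob(c, w) for c in clauses)


w = {"x1": 0.5, "x2": 0.5, "x3": 0.5, "x4": 0.5, "x5": 0.5}
overlapping = [frozenset({"x1", "x2"}), frozenset({"x2", "x3"})]
disjoint = [frozenset({"x1", "x2"}), frozenset({"x4", "x5"})]
```

For `overlapping` the product is strictly below the exact count because the clauses share `x2`; for `disjoint` the clauses are independent and the two values coincide.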
$\Sigma(\pi) = \{c_1, \dots, c_n\}$
A trivial upper bound: $\min_{c_i \in \Sigma(\pi)} \Pr(c_i)$
A much tighter bound:
Clause sets $\Sigma_j$, each corresponding to one specific predicate
Decomposable into “connected components”, bounded using $\min_{c \in \Sigma_j} \Pr(c)$
Example variable sets of clauses: $\{x_1, x_2\}$, $\{x_2, x_3\}$, $\{x_4, x_5\}$
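The decomposition into connected components can be sketched in Python as follows, assuming that two clauses belong to the same component whenever they share a variable (directly or through a chain of other clauses). The function name and the representation of clauses as sets of variable names are illustrative choices.

```python
from typing import FrozenSet, List, Set, Tuple

Clause = FrozenSet[str]


def connected_components(clauses: List[Clause]) -> List[List[Clause]]:
    """Group clauses so that clauses sharing variables end up together."""
    components: List[Tuple[Set[str], List[Clause]]] = []
    for clause in clauses:
        merged_vars: Set[str] = set(clause)
        merged_clauses: List[Clause] = [clause]
        untouched: List[Tuple[Set[str], List[Clause]]] = []
        for comp_vars, comp_clauses in components:
            if comp_vars & clause:
                # The new clause links this component to the merged group.
                merged_vars |= comp_vars
                merged_clauses = comp_clauses + merged_clauses
            else:
                untouched.append((comp_vars, comp_clauses))
        components = untouched + [(merged_vars, merged_clauses)]
    return [comp_clauses for _, comp_clauses in components]


example = [
    frozenset({"x1", "x2"}),
    frozenset({"x2", "x3"}),
    frozenset({"x4", "x5"}),
]
components = connected_components(example)
```

On the three variable sets from the slide, the first two clauses share `x2` and form one component, while the third stands alone.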
Build relaxed planning graph
  Ignoring known & possible delete effects
Propagate clauses for propositions and actions
Extract relaxed plan
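PROPOSITIONAL LAYER $L_1$
[Figure: initial state $I$, current state $s$ reached by $\pi_k$, relaxed plan $\tilde{\pi}$ to a goal state $s_G$; propositions $p_j$, $\neg p_j$]
$\Sigma_{p_j}(1)$: establishment constraints (if needed) and protection constraints for $p_j$ at state $s_{k+1}$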
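ACTION LAYER $A_t$
Action $a_m$ with precondition $p_i$ and possible precondition $p_j$:
$\Sigma_{a_m}(t) = \Sigma_{p_i}(t) \wedge \left( (p_j)_{a_m}^{pre} \Rightarrow \Sigma_{p_j}(t) \right)$
No-op actions: $noop_{p_i}$ carries $\Sigma_{p_i}(t)$ and $noop_{p_j}$ carries $\Sigma_{p_j}(t)$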
PROPOSITIONAL LAYER $L_{t+1}$
$\Sigma_{p_i}(t+1) = \arg\max_{\Sigma} \, l(\Sigma \wedge \Sigma_k)$, where $\Sigma \in \{ \Sigma_{a_m}(t),\; (p_i)_{a_l}^{pre} \Rightarrow \Sigma_{a_l}(t) \}$
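[Figure: relaxed planning graph with layers $L_1, A_1, L_2, \dots, L_{T-1}, A_{T-1}, L_T$ and subgoal $g$]
Best supporting action for $g$ at layer $T$: $\Sigma_{p_i}(t+1) = \arg\max_{\Sigma} \, l(\Sigma \wedge \Sigma_k)$
Relaxed plan $\tilde{\pi}$ in total order
Succeed when: all known preconditions are supported and $l(\Sigma_k \wedge \Sigma_{\pi'}) > \delta$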
WHEN TO INSERT ACTIONS?
Whether a supporting action $a_{best}$ is inserted depends on:
  Relation between subgoal and “relaxed plan state”
  Robustness of the current $\tilde{\pi}$ and $\tilde{\pi} \cup \{a_{best}\}$
SUBGOAL VS. RP STATE
[Figures: position of subgoal $p$ relative to the relaxed plan states $s_{\to a}$ and $s^{+}_{\to a}$]
$p \in Pre(a)$, $p \notin s_{\to a}$: insert $a_{best}$ into $\tilde{\pi}$. This type of subgoal makes the relaxed plan …
No actions in $\pi_k$ and $\tilde{\pi}$ support this subgoal.
For these subgoals, supporting actions are inserted if the insertion increases the robustness of the current relaxed plan.
For these subgoals, no supporting actions are needed!
Stochastic local search with failed bounded restarts (Coles et al., 2007)
$h(s, \delta)$: approximately how far it is from $s$ to a goal state such that the resulting plan has approximate robustness $> \delta$.
[Figure: probes from $s$ with $h(s, \delta) = 100$, limited by a depth bound, reach a better state $s'$ with $h(s', \delta) = 55$ and then $s''$ with $h(s'', \delta) = 0$ (goal reached); failcount and probecount annotate the probes]
If probecount = probebound then increment failcount
If failcount = failbound then double depth bound
failbound = 32, 64, 128, …
Better state found.
Goal reached: $\delta = R(\pi)$
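The interplay of the probe and fail counters can be sketched as follows. This is a loose Python illustration built around the counter names on the slide, not the procedure of Coles et al. or of PISA: the counter resets after an improvement, the default bounds, the probe budget and the toy integer search space are all assumptions, and the heuristic here takes only a state rather than a state and a robustness threshold.

```python
import random
from typing import Callable, List, Optional


def local_search(
    start: int,
    h: Callable[[int], int],
    successors: Callable[[int], List[int]],
    rng: random.Random,
    depth_bound: int = 4,
    probe_bound: int = 3,
    fail_bound: int = 2,
    max_probes: int = 10_000,
) -> Optional[int]:
    """Random probes of bounded depth; widen the depth bound after repeated failures."""
    current = start
    probecount = failcount = 0
    for _ in range(max_probes):
        if h(current) == 0:
            return current
        # One probe: a random walk of at most depth_bound steps from current.
        state, improved = current, None
        for _ in range(depth_bound):
            options = successors(state)
            if not options:
                break
            state = rng.choice(options)
            if h(state) < h(current):
                improved = state
                break
        if improved is not None:
            current = improved            # better state found
            probecount = failcount = 0    # assumed: counters reset on improvement
            continue
        probecount += 1
        if probecount == probe_bound:
            failcount += 1
            probecount = 0
        if failcount == fail_bound:
            depth_bound *= 2
            failcount = 0
    return current if h(current) == 0 else None


# Toy search space: integers, goal at 0, neighbors one step away.
result = local_search(5, abs, lambda s: [s - 1, s + 1], random.Random(0))
```

In the planner described on the slides, the role of `h` would be played by the relaxed-plan heuristic $h(s, \delta)$; the toy heuristic `abs` only stands in for it.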
Number of instances for which PISA produces more robust, equally robust and less robust plans compared to DeFault.
Domains:
Zenotravel, Freecell, Satellite, Rover (215 domains x 10 problems = 2150 instances)
Parc Printer (300 instances)
Total time in seconds (log scale) to generate plans with the same robustness by PISA and DeFault.