

SLIDE 1

Planning under uncertainty as Golog Programs

Jorge Baier∗

∗Co-work with Javier Pinto

SLIDE 2

Outline

  • Objectives, contributions and motivation.
  • The Probabilistic Situation Calculus.
  • An extension to Golog.
  • An algorithm for planning with conditional Golog plans.
  • Loop induction.
  • Conclusions.
SLIDE 3

Objectives and Contributions

  • Main objective of the work: use the Probabilistic Situation Calculus

(PSC) to model and program agents in domains with uncertainty.

  • Application: program execution and planning under uncertainty.
  • Contributions:

– An (offline) extension to Golog to handle non-determinism in the effects of actions à la PSC.
– An algorithm for planning under uncertainty for fully observable worlds.
– The algorithm may generate plans with loops.

SLIDE 4

Why conditional planning?

  • For efficiency reasons, one of the fundamental ideas of cognitive robotics

is to program agents instead of letting them plan.

  • Nevertheless, planning is still necessary; we don’t want the programmer

to “think about everything”. We want agents to be flexible.

  • In the context of uncertainty, conditional plans must be generated since

there might be different contingencies under which different courses of action might be chosen.

SLIDE 5

The Probabilistic Situation Calculus

  • The Probabilistic Situation Calculus [PSSM00] is a many-sorted second-order (first-order + induction) family of logical languages. It is an extension of the standard Situation Calculus that handles uncertain effects of actions with (discrete) probability distributions.
  • Language elements:

    – Actions. An action is a pair ⟨i, e⟩, where i is the deterministic part (input) and e is the non-deterministic, "by-nature" part (outcome). There are sorts I and E for inputs and outcomes. Example: if the input action of tossing a coin is denoted by Toss, then ⟨Toss, Tails⟩ and ⟨Toss, Heads⟩ are two standard SC actions.

SLIDE 6

The Probabilistic Situation Calculus

    – Situations. The same as in the standard SC.

[Figure: situation tree — from S0, the input Toss branches on its outcomes Tails and Heads into do(⟨Toss, Tails⟩, S0) and do(⟨Toss, Heads⟩, S0).]

SLIDE 7

The Probabilistic Situation Calculus

    – Fluents. Treated as objects of sort F.
    – Distinguished predicate holds ⊆ F × S, which can be extended naturally to support fluent formulas.
    – Distinguished predicates Poss_i ⊆ I × S and Poss ⊆ A × S state preconditions of inputs and actions. Example:

        Poss_i(drop(x), s) ≡ holds(holding(x), s)
        Poss(⟨drop(x), e⟩, s) ≡ Poss_i(drop(x), s) ∧ (e = tails(x) ∨ e = heads(x))

    – In our version, a distinguished function Outcome : I × E × S → [0, 1] assigns a multinomial probability distribution on the outcomes of actions. Example:

        ¬holds(biased(x), s) ⊃ Outcome(drop(x), tails(x), s) = 0.5
        holds(biased(x), s) ⊃ Outcome(drop(x), heads(x), s) = 0.8
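The Outcome function and the precondition predicates can be mirrored directly in code. Below is a minimal sketch (not part of the talk) of this coin-world fragment in Python; the representation of situations as sets of fluent atoms, and the names `outcome_dist` and `poss_i`, are my own illustrative assumptions.

```python
def outcome_dist(inp, situation):
    """Outcome(i, e, s): distribution over outcomes of input `inp` in `situation`."""
    action, coin = inp
    if action == "drop":
        if ("biased", coin) in situation:        # biased coin: heads with 0.8
            return {("heads", coin): 0.8, ("tails", coin): 0.2}
        return {("heads", coin): 0.5, ("tails", coin): 0.5}
    if action == "grab":                         # grabbing has a single outcome
        return {("grabbed", coin): 1.0}
    raise ValueError(f"unknown input {inp}")

def poss_i(inp, situation):
    """Poss_i(i, s): precondition of input `inp` in `situation`."""
    action, coin = inp
    if action == "drop":
        return ("holding", coin) in situation    # can only drop a held coin
    return True

# Sanity checks: the drop is only possible while holding, and each
# distribution over outcomes sums to 1.
s0 = {("holding", "C1")}
dist = outcome_dist(("drop", "C1"), s0)
assert poss_i(("drop", "C1"), s0)
assert abs(sum(dist.values()) - 1.0) < 1e-9
```

The set-of-atoms encoding keeps the example self-contained; a real PSC theory would instead derive `holds` from successor state axioms.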

SLIDE 8

Theories of action in the Probabilistic Situation Calculus

As usual, a theory of action in the Probabilistic Situation Calculus is composed of foundational axioms. Among them there are additional ones for handling probabilities,

    (∀i, s). Poss_i(i, s) ⊃ Σ_{e ∈ E} Outcome(i, e, s) = 1,

and axioms for describing:

  • the initial situation,
  • preconditions for actions,
  • successor state axioms. Example:

        Poss(a, s) ⊃ [holds(headsUp(x), do(a, s)) ≡
            out(a) = heads(x) ∨ (holds(headsUp(x), s) ∧ ¬(out(a) = tails(x)))],

  • probability distribution and unique-name axioms.
SLIDE 9

Computing probabilities for U-GOLOG programs

  • We extended a subset of CONGOLOG with non-deterministic effects of actions. Features:

    – Primitive actions in a U-GOLOG program are inputs, not standard SC actions.
    – The execution of an input in a U-GOLOG program may result in multiple situations.

  • Here is the main change to CONGOLOG's semantics:

        Trans(α, s, δ, s′) ≡ α = NoOp ∧ δ = {} ∧ (∃e). Poss(⟨α, e⟩, s) ∧ s′ = do(⟨α, e⟩, s).

SLIDE 10

Computing probabilities for U-GOLOG programs

  • We also define the function ProbSG(σ, s, s′), which is the probability that after executing σ in s, the program ends in situation s′. Some of the axioms for ProbSG are the following.

        ProbSG(α, s, s′) = Outcome(α, e, s)    if s′ = do(⟨α, e⟩, s)

        ProbSG((σ1; σ2), s, s′) = ProbSG(σ1, s, s′′) × ProbSG(σ2, s′′, s′)
                                               if Do(σ1, s, s′′) ∧ Do(σ2, s′′, s′)

        ProbSG(if φ then σ1 else σ2 endIf, s, s′) =
            { ProbSG(σ1, s, s′)    if holds(φ, s)
            { ProbSG(σ2, s, s′)    otherwise

        ProbSG(while φ do σ endWhile, s, s′) =
            { 1                                            if ¬holds(φ, s) ∧ s = s′
            { ProbSG(σ; while φ do σ endWhile, s, s′)      otherwise
SLIDE 11

Computing probabilities for U-GOLOG programs

  • We also define the function ProbG such that ProbG(g, σ, s) is the probability that fluent (or fluent formula) g holds after executing program σ in s:

        ProbG(g, σ, s) = Σ_{s′ ∈ {s′′ | Do(σ, s, s′′)}} ProbSG(σ, s, s′) × holds(g, s′),

    where holds(g, s′) is 1 when holds(g, s′) is true and 0 otherwise.
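The ProbSG/ProbG definitions can be computed by exhaustively enumerating the outcome branches of a program. The following Python sketch (my own encoding, not the authors' code) does this for the fair-coin world; the tuple syntax for programs is hypothetical, and while-loops are omitted because they unroll into an infinite sum over iterations.

```python
# Program encoding (hypothetical, mine):
#   ("do", input) | ("seq", p1, p2) | ("if", fluent, p1, p2)

def step(inp, s):
    """Return {successor situation: probability} for one input, PSC-style."""
    act, coin = inp
    if act == "grab":
        return {frozenset((s - {("onFloor", coin)}) | {("holding", coin)}): 1.0}
    if act == "drop":  # fair coin: heads up or tails up, 0.5 each
        base = s - {("holding", coin), ("headsUp", coin)}
        return {
            frozenset(base | {("onFloor", coin), ("headsUp", coin)}): 0.5,
            frozenset(base | {("onFloor", coin)}): 0.5,
        }
    raise ValueError(f"unknown input {inp}")

def run(prog, s):
    """ProbSG as a distribution: {final situation s': ProbSG(prog, s, s')}."""
    kind = prog[0]
    if kind == "do":
        return step(prog[1], s)
    if kind == "seq":  # chain the two distributions, multiplying probabilities
        out = {}
        for s1, p1 in run(prog[1], s).items():
            for s2, p2 in run(prog[2], s1).items():
                out[s2] = out.get(s2, 0.0) + p1 * p2
        return out
    if kind == "if":   # branch on whether the condition fluent holds in s
        return run(prog[2] if prog[1] in s else prog[3], s)
    raise ValueError(f"unknown construct {prog}")

def prob_g(goal, prog, s):
    """ProbG(g, sigma, s): probability that fluent `goal` holds afterwards."""
    return sum(p for s1, p in run(prog, s).items() if goal in s1)

s0 = frozenset({("onTable", "C1")})
delta1 = ("seq", ("do", ("grab", "C1")), ("do", ("drop", "C1")))
print(prob_g(("headsUp", "C1"), delta1, s0))  # 0.5
```

The final line evaluates the plan δ1 = grab(C1); drop(C1) from the slides: it yields the goal fluent with probability 0.5.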

SLIDE 12

Planning under uncertainty

  • In classical planning, it is assumed that actions have deterministic effects.

Solution: linear plans (simple sequences of actions). Example (world of coins): in this domain there are two coins, C1 and C2, that are initially on a table with tails up. An agent can drop and grab coins. The drop action has non-deterministic effects.

  • In the PSC we model this world using

    – Inputs: grab(x), drop(x).
    – Outcomes: grab(x), heads(x), tails(x).
    – Fluents: headsUp(x), tailsUp(x), onTable(x), onFloor(x).

  • Initial conditions: two coins, C1 and C2, on a table.
SLIDE 13

Conditional plans through refinement

  • Suppose we want a plan that, with probability at least 0.7, achieves the following goal: have coin C1 heads up and on the floor,

        G def= headsUp(C1) ∧ onFloor(C1).

  • Consider the following plan:

        δ1 def= grab(C1); drop(C1)

  • This program achieves the goal with a positive probability, but not with

the required 0.7.

  • It is not hard to see that no linear sequence of actions achieves the goal

with the required probability.

  • Solution: Conditional plans.
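The claim that no linear plan suffices can be checked by brute force: every fair drop re-randomizes the coin's face, so only the last drop matters and no sequence can beat 0.5 &lt; 0.7. A small Python check (mine, with a deliberately simplified state representation) over all plans up to length 6:

```python
from itertools import product

def simulate(plan):
    """Probability that C1 ends heads-up on the floor after a linear `plan`.
    States are (holding, on_floor, heads_up) triples; a drop while holding
    splits the probability mass 50/50 over the two faces."""
    dist = {(False, False, False): 1.0}          # on the table, tails up
    for act in plan:
        new = {}
        for (hold, floor, heads), p in dist.items():
            if act == "grab":
                outs = [((True, False, heads), 1.0)]
            elif hold:                           # drop a held coin: re-randomize
                outs = [((False, True, True), 0.5), ((False, True, False), 0.5)]
            else:                                # drop without holding: no effect
                outs = [((hold, floor, heads), 1.0)]
            for st, q in outs:
                new[st] = new.get(st, 0.0) + p * q
        dist = new
    return sum(p for (h, f, heads), p in dist.items() if heads and f)

best = max(simulate(plan)
           for n in range(1, 7)
           for plan in product(["grab", "drop"], repeat=n))
print(best)  # 0.5 — below the required 0.7
```

No linear plan over these inputs exceeds 0.5, which is why the algorithm must turn to conditional plans.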
SLIDE 14

A more general case

  • Let i1; i2; i3; i4 be an arbitrary sequence of inputs that achieves G with a

positive probability. Consider that the following is the tree of situations that result from its execution.

[Figure: tree of situations reached by executing i1; i2; i3; i4 from S0, with each resulting situation marked as good or bad.]

  • Our algorithm for conditional planning starts with an arbitrary sequence
SLIDE 15

A more general case

of actions that achieves the goal with a positive probability (a threshold is given as a parameter) and then refines it.

  • The algorithm starts simulating the program until it finds one or more

bad situations. Once a bad situation is found, an if-then-else construct is introduced into the program.

  • A situation is bad for a goal G and plan σ in situation S if

ProbG(G, σ, S) = 0.

  • Intuitively, refinement of the plan corresponds to the following program:

        i1; i2;
        if in situation S4 then new plan for G in S4
        else recursive refinement of i3; i4 for the rest of the current situations
        endIf

  • The condition "in situation S4" cannot be included directly in the program.

SLIDE 16

A more general case

  • The algorithm finds a discriminating fluent (true in good situations, false in bad ones).

  • For a theory of actions for the world of coins, the plan returned in FinalPlan by the execution of CRefine(headsUp(C1), {}, FinalPlan, {S0}, 0.4, 0, 2) is

        grab(C1); drop(C1);
        if headsUp(C1) then NoOp;
        else grab(C1); drop(C1);
             if headsUp(C1) then NoOp;
             else grab(C1); drop(C1);
             endIf
        endIf

    which achieves the goal with probability 0.875.
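The probability 0.875 can be verified by hand: this plan fails only when all three fair drops land tails, so it succeeds with probability 1 − 0.5³ = 0.875. Under my reading of the depth parameter, a depth-k plan retries grab(C1); drop(C1) up to k + 1 times, giving 1 − 0.5^(k+1) in general:

```python
def conditional_plan_prob(k, p_heads=0.5):
    """Success probability of the depth-k conditional retry plan (my reading):
    the plan fails only if all k + 1 independent drops come up tails."""
    return 1 - (1 - p_heads) ** (k + 1)

assert conditional_plan_prob(2) == 0.875    # the plan from CRefine(..., 0, 2)
assert conditional_plan_prob(3) == 0.9375   # one extra level of nesting
```

The `p_heads` parameter is an illustrative generalization; the slides only use the fair coin.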

SLIDE 17

CRefine Operator

CRefine(Goal, CandPlan, FinalPlan, CurSits, T, Level, Top) ←
    BadSits = {s | s ∈ CurSits ∧ Bad(s, Goal, CandPlan)}
    if BadSits = {} then
        if CandPlan = {} then
            FinalPlan = NoOp
        else if (∃α, σ) CandPlan = α; σ then
            NewSits = {s | (∃s′) s′ ∈ CurSits ∧ Do(α, s′, s)}
            CRefine(Goal, σ, σ′, NewSits, T, Level, Top)
            FinalPlan = α; σ′
        endIf
    else
        GoodSits = CurSits − BadSits
        FirstBad = an element of BadSits
        FindSeq(Goal, CandPlanForBads, FirstBad, T)
        if Level < Top then
            CRefine(Goal, CandPlanForBads, PlanForBads, BadSits, T, Level + 1, Top)
        else
            PlanForBads = CandPlanForBads
        endIf
        if GoodSits = {} then
            FinalPlan = PlanForBads
        else
            Property = fluent literal l | (∀s) s ∈ GoodSits ⊃ holds(l, s) ∧
                                          (∀s) s ∈ BadSits ⊃ ¬holds(l, s)
            CRefine(Goal, CandPlan, PlanForGoods, GoodSits, T, Level + 1, Top)
            FinalPlan = if Property then PlanForGoods else PlanForBads
        endIf
    endIf
end

Figure 1: Pseudo-Prolog code for a simple algorithm for planning under uncertainty with complete knowledge

SLIDE 18

Loop induction

  • If CRefine is invoked on the same arguments but with the depth limit set to 3, we obtain the following program:

    grab(C1); drop(C1);
    if headsUp(C1) then NoOp;
    else grab(C1); drop(C1);
         if headsUp(C1) then NoOp;
         else grab(C1); drop(C1);
              if headsUp(C1) then NoOp;
              else grab(C1); drop(C1);
              endIf
         endIf
    endIf,

which achieves the goal with probability 0.9375.

  • This suggests that loops could be induced when repeated sequences of

if-then-else conditionals appear involving the same body...

SLIDE 19

A sufficient condition to induce a (correct) loop

  • From the program

        σp; σℓ; if l then σt else σℓ; σt endIf

    for goal G in situation S, it is always possible to induce a "sound" loop

        σp; σℓ; while ¬l do σℓ endWhile; σt

    if the set of states after executing σp is equal to the set of states after executing σp; σℓ; l?.

  • It is possible to extend this operator to convert loops nested inside a con-

ditional into a single loop.

SLIDE 20

Planning with loop induction

  • To plan with loop induction we have to interleave the conditional refine-

ment operator and the loop induction operator.

  • It is possible to prove the following: let σ = σl; while ¬f do σl endWhile; σt be a program generated by a refinement of CRefine and LRefine for a goal g in a situation s. Then the probability that fluent g holds after the execution of σ in s is

        ProbG(g, σ, s) = ProbG(g, σl; σt, s) / ProbG(f, σl, s).

  • How to interleave them is not a trivial problem, although some good

results were obtained by executing one followed by the other.
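The identity above can be checked numerically on the coin example. Taking σl = grab(C1); drop(C1), f = g = headsUp(C1), and an empty σt (my instantiation, not from the slides), the loop's success probability is the geometric series Σ_{k≥0} (1 − p)^k · p with p = ProbG(f, σl, s), which matches ProbG(g, σl; σt, s) / ProbG(f, σl, s):

```python
# Numeric check (a sketch) of the loop-probability identity for the fair coin.
p = 0.5                                  # ProbG(f, sigma_l, s): one drop lands heads
prob_g_seq = 0.5                         # ProbG(g, sigma_l; sigma_t, s), sigma_t empty

# Left-hand side: sum over the number of loop iterations until f first holds.
series = sum((1 - p) ** k * p for k in range(200))   # truncated geometric series

# Right-hand side of the identity.
identity = prob_g_seq / p

print(series, identity)  # both 1.0 (up to truncation): the loop makes f certain
```

The loop drives the success probability to 1, which is exactly what the division by ProbG(f, σl, s) captures.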

SLIDE 21

An example in the world of coins

  • A plan to have two coins picked up and heads up.

    grab(c2); drop(c2);
    if headsUp(c2) then
        grab(c1); drop(c1); grab(c2); grab(c1); NoOp;
    else
        grab(c2); drop(c2); grab(c1); drop(c1); grab(c2); grab(c1); NoOp;
    endIf                                                  P = 0.375

    grab(c2); drop(c2);
    while ¬headsUp(c2) do grab(c2); drop(c2); endWhile
    grab(c1); drop(c1); grab(c2); grab(c1); NoOp;          P = 0.5

SLIDE 22

An example in the world of coins

    grab(c2); drop(c2);
    while ¬headsUp(c2) do grab(c2); drop(c2); endWhile
    grab(c1); drop(c1);
    if headsUp(c1) then
        grab(c2); grab(c1); NoOp;
    else
        grab(c1); drop(c1); grab(c2); grab(c1); NoOp;
    endIf                                                  P = 0.75

    grab(c2); drop(c2);
    while ¬headsUp(c2) do grab(c2); drop(c2); endWhile
    grab(c1); drop(c1);
    while ¬headsUp(c1) do grab(c1); drop(c1); endWhile
    grab(c2); grab(c1); NoOp;                              P = 1
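That the final plan achieves P = 1 follows because each while-loop re-drops its coin until it lands heads up. A small Monte Carlo sketch (mine, not from the talk) illustrates this: every run reaches the goal, at the cost of a random number of drops, two expected per fair coin:

```python
import random

def run_final_plan(rng):
    """Execute the looped plan for both coins; return the total number of drops.
    Each coin is re-dropped (grab; drop) while it is not heads up, so the
    loop always terminates with the coin heads up."""
    drops = 0
    for _coin in ("c2", "c1"):
        while True:
            drops += 1
            if rng.random() < 0.5:   # heads: exit this coin's loop
                break
    return drops

rng = random.Random(0)               # fixed seed for reproducibility
trials = 10_000
total_drops = sum(run_final_plan(rng) for _ in range(trials))

# Every trial ends with both coins heads up, so the success rate is exactly 1;
# the average number of drops is close to 4 (two coins, geometric mean 2 each).
print(total_drops / trials)
```

The trade-off the slides show numerically (P rising from 0.375 to 1) is thus bought with an unbounded, but finite-in-expectation, number of actions.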

SLIDE 23

Related Work

  • Planners under uncertainty:

    – Buridan [KHW95], C-Buridan [DHW94].
    – Cassandra [PC96], CNLP [PS92].
    – Decision-theoretic planning [BDH99], [BRST00].

  • Other Gologs:

– DT-Golog [BRST00]. – pGolog [GL01].

SLIDE 24

Conclusions and Discussion

  • We have given algorithms that generate Golog programs for planning:

    – with actions with non-deterministic effects,
    – with uncertainty about the initial situation.

  • Some problems: Depending on parameters, the algorithm may find plans

of relatively poor quality.

  • Future work: extend algorithms to partially observable worlds.
SLIDE 25

Thank you!

SLIDE 26


References

[BDH99] Craig Boutilier, Thomas Dean, and Steve Hanks. Decision-Theoretic Planning: Structural Assumptions and Computational Leverage. Journal of AI Research (JAIR), 11:1–94, 1999.

[BRST00] Craig Boutilier, Raymond Reiter, Mikhail Soutchanski, and Sebastian Thrun. Decision-Theoretic, High-Level Agent Programming in the Situation Calculus. In Proceedings of the Seventeenth National Conference on Artificial Intelligence (AAAI-2000), 2000.

[DHW94] Denise Draper, Steve Hanks, and Daniel Weld. Probabilistic Planning with Information Gathering and Contingent Execution. In Kristian Hammond, editor, Proceedings of the Second Conference on AI Planning Systems, pages 31–36, Chicago, IL, USA, 1994. AAAI Press.

[GL01] Henrik Grosskreutz and Gerhard Lakemeyer. Belief Update in the pGolog Framework. In 24th German / 9th Austrian Conference on Artificial Intelligence (KI-2001), Vienna, Austria, September 2001.

[KHW95] Nicholas Kushmerick, Steve Hanks, and Daniel Weld. An algorithm for probabilistic planning. Artificial Intelligence, 76(1-2):239–286, 1995.

SLIDE 27


[PC96] Louise Pryor and Gregg Collins. Planning for contingencies: A decision-based approach. Journal of AI Research (JAIR), 4:287–339, 1996.

[PS92] Mark A. Peot and David E. Smith. Conditional Nonlinear Planning. In Proceedings of the First International Conference on Artificial Intelligence Planning Systems, pages 189–197, Maryland, 1992. Springer-Verlag.

[PSSM00] Javier Pinto, Amílcar Sernadas, Cristina Sernadas, and Paulo Mateus. Non-determinism and uncertainty in the situation calculus. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 8(2):127–149, 2000.