Knowledge-Based Programs as Explainable Policies for Contingent Planning


Knowledge-Based Programs as Explainable Policies for Contingent Planning. J. Lang, A. Saffidine, F. Schwarzentruber, B. Zanuttini. MAFTEC, April 1, 2019.


SLIDE 1

Knowledge-Based Programs as Explainable Policies for Contingent Planning

  • J. Lang, A. Saffidine, F. Schwarzentruber, B. Zanuttini

MAFTEC, April 1, 2019

Lang, Saffidine, Schwarzentruber, Zanuttini KBPs as Explainable Policies 1/40

SLIDE 2

Planning Problems

Let’s design an agent for solving problems!

? ? ?
? 1 ?
? 2 ?
? ? ?

Maybe even let the agent compute its policy by itself.

SLIDE 4

Standard Policies

Before we send the agent to the minefield... let’s just check how it is planning to behave.

[Figure: policy tree for a 4 × 3 Minesweeper instance; nodes are the cells to click, edges are labelled with observations]

Wouldn’t this lack a little readability, verifiability... explainability?

SLIDE 5

Knowledge-Based Programs

What about this behaviour?

while not sure that all positions except mines have been cleared do
    if sure that there is no mine at (1, 1) then click_{1,1} fi
    if sure that there is no mine at (1, 2) then click_{1,2} fi
    ...
    if sure that there is no mine at (H, W) then click_{H,W} fi
od

Wouldn’t this be perfectly readable, verifiable... explainable?

SLIDE 6

Outline

◮ Contingent Planning Problems
◮ Standard Representations
◮ Knowledge-Based Programs
◮ The Bright Side of KBPs as Policies
◮ The Dark Side of KBPs as Policies
◮ Conclusion
◮ Multi-Agent KBPs
◮ Synthesis of KBPs
◮ More Succinctness?

SLIDE 7

Outline

Next section: Contingent Planning Problems

SLIDE 8

Partially Observable Domains

◮ $X = \{x_1, \dots, x_n\}$: propositional variables
   → $2^n$ states
◮ $A = \{a_1, \dots, a_k\}$: actions
◮ $O = \{o_1, \dots, o_p\}$: observations
◮ $\varphi^\delta$: transition function

States are not directly observable.

Minesweeper $H \times W$:

◮ variables $X$: $m_{i,j}$, $c_{i,j}$ ($\forall i, j$)
   → space of $2^{2HW}$ states
◮ actions $A$: $\mathit{click}_{i,j}$ ($\forall i, j$)
◮ observations $O$: $o_0, \dots, o_8$ + $o_{\mathit{lost}}$
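
As a quick sanity check of the state count, here is a worked instance. The 4 × 3 board size is an assumption, taken from the example boards that appear in the figures of this talk.

```latex
\[
|X| = \underbrace{HW}_{m_{i,j}} + \underbrace{HW}_{c_{i,j}} = 2HW
\quad\Longrightarrow\quad
2^{|X|} = 2^{2HW},
\qquad\text{e.g. } H = 4,\ W = 3:\quad 2^{24} = 16\,777\,216 \text{ states.}
\]
```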

SLIDE 9

Actions

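Actions:

◮ ontic effects: change current state (nondeterministic)
◮ epistemic effects: yield observation (nondeterministic, ambiguous)

Description for Minesweeper:

$$\varphi^\delta = \bigwedge_{i,j} \left( \mathit{click}_{i,j} \rightarrow \varphi^\delta_{i,j} \right)$$

with

$$\varphi^\delta_{i,j} = c'_{i,j} \wedge (m_{i,j} \rightarrow o_{\mathit{lost}}) \wedge \Big( \neg m_{i,j} \rightarrow \bigwedge_{n=0,\dots,8} (\varphi_{n,i,j} \leftrightarrow o_n) \Big) \wedge \bigwedge_{x \neq c_{i,j}} (x' \leftrightarrow x)$$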
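
The action theory for `click` can be read operationally. Below is a minimal Python sketch of that reading, under assumptions that are not in the slides: a state is encoded as a pair (set of mined cells, set of cleared cells), cells are 1-indexed (row, column) pairs, and $\varphi_{n,i,j}$ is taken to mean "exactly n of the cells adjacent to (i, j) hold a mine". The function names are illustrative.

```python
from typing import FrozenSet, List, Tuple

Cell = Tuple[int, int]
State = Tuple[FrozenSet[Cell], FrozenSet[Cell]]  # (mines, cleared cells)


def neighbours(cell: Cell, height: int, width: int) -> List[Cell]:
    """Cells adjacent to `cell` (at most 8) on a height x width board, 1-indexed."""
    i, j = cell
    return [(a, b)
            for a in (i - 1, i, i + 1)
            for b in (j - 1, j, j + 1)
            if (a, b) != cell and 1 <= a <= height and 1 <= b <= width]


def click(state: State, cell: Cell, height: int, width: int):
    """Apply click_{i,j}; return (observation, next state)."""
    mines, cleared = state
    # c'_{i,j} holds afterwards; every other variable keeps its value (frame axiom).
    next_state = (mines, cleared | {cell})
    if cell in mines:
        return "lost", next_state            # m_{i,j} -> o_lost
    n = sum(nb in mines for nb in neighbours(cell, height, width))
    return n, next_state                     # not m_{i,j} -> o_n, n = adjacent mines
```

For instance, with mines at (2,1) and (4,1) on a 4 × 3 board, `click(state, (3, 1), 4, 3)` yields observation 2, since both mines are adjacent to (3,1).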

SLIDE 10

Planning Problems

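◮ Domain + initial belief state + goal states
◮ Same as POMDPs except for probabilities

Minesweeper:

◮ initial belief state:

$$\bigwedge_{i,j} \neg c_{i,j} \;\wedge \bigvee_{(i,j) \neq (i',j')} (m_{i,j} \wedge m_{i',j'}) \;\wedge \bigwedge_{(i,j) \neq (i',j') \neq (i'',j'')} \neg (m_{i,j} \wedge m_{i',j'} \wedge m_{i'',j''})$$

◮ goals: $\bigwedge_{i,j} (c_{i,j} \oplus m_{i,j})$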
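
Read as a set of states, the initial belief state says: no cell is cleared, at least two cells are mined, and no three cells are mined, so exactly two mines. The sketch below enumerates it and encodes the goal test. The state encoding (mines, cleared) and the function names are assumptions made for illustration only.

```python
from itertools import combinations


def initial_belief(height: int, width: int, n_mines: int = 2):
    """All states with exactly `n_mines` mines and no cleared cell."""
    cells = [(i, j) for i in range(1, height + 1) for j in range(1, width + 1)]
    return {(frozenset(mines), frozenset())
            for mines in combinations(cells, n_mines)}


def is_goal(state, height: int, width: int) -> bool:
    """Goal: every cell is either cleared or mined, but not both (c xor m)."""
    mines, cleared = state
    return all(((i, j) in cleared) != ((i, j) in mines)
               for i in range(1, height + 1)
               for j in range(1, width + 1))
```

On a 4 × 3 board the initial belief state contains one state per pair of cells, that is 12 choose 2 = 66 states.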

SLIDE 11

Policies

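◮ Prescribe which action the agent should take
◮ Cannot be a function of the current state (states are not observable)
◮ Function from histories of actions and observations to actions
◮ Abstract notion

Examples for Minesweeper:

◮ let $(p_t)_t = (1,1), (1,2), (1,3), \dots$
◮ $\pi$ defined by $\pi(h) := \mathit{click}_{p_{|h|}}$
◮ $\pi'$ defined by

$$\pi'(\varepsilon) = \mathit{click}_{p_0}, \qquad \pi'(h) = \begin{cases} \mathit{click}_{p_{t(h)+1}} & \text{if } o_{|h|-1}(h) = o_0 \\ \mathit{click}_{p_{t(h)+2}} & \text{otherwise} \end{cases}$$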

SLIDE 12

Valid policies

Execution model: at each $t = 0, 1, \dots$

◮ current state $s_t$ (nonobservable)
◮ current history $h_t = a_0 o_0 a_1 o_1 \dots a_{t-1} o_{t-1}$
◮ action $a_t = \pi(h_t)$ executed (or “stop”)
◮ observation $o_t$ + new state $s_{t+1}$ chosen nondeterministically w.r.t. $\varphi^\delta$
◮ $o_t$ given to agent
◮ $s_{t+1}$ = new current state
◮ new current history $h_{t+1} = h_t a_t o_t$

Valid policy: $\forall s_0 \models \varphi^I$, execution terminates at some finite time $t$ with $s_t \models \varphi^G$.
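
The execution model and the validity condition can be sketched as a bounded exhaustive check. Everything named here is an assumption for illustration: a policy is a Python function from histories to an action or `None` (for "stop"), `step(state, action)` returns every possible (observation, next state) pair, and `max_steps` is a bound added so the check itself terminates.

```python
def is_valid(policy, initial_states, step, is_goal, max_steps):
    """True iff every execution from every initial state stops within
    `max_steps` actions, and does so in a goal state."""
    # Each frontier entry is one possible execution: (current state, history).
    frontier = [(s, ()) for s in initial_states]
    for _ in range(max_steps + 1):
        next_frontier = []
        for state, history in frontier:
            action = policy(history)
            if action is None:                      # the policy says "stop"
                if not is_goal(state):
                    return False
                continue
            # Nondeterminism: follow every possible outcome of the action.
            for obs, nxt in step(state, action):
                next_frontier.append((nxt, history + ((action, obs),)))
        frontier = next_frontier
        if not frontier:
            return True                             # all executions stopped in a goal
    return False                                    # some execution is still running


# Hypothetical toy domain: one hidden bit x; "look" observes it, "flip" inverts it.
def toy_step(x, action):
    return [(x, x)] if action == "look" else [(None, 1 - x)]


def toy_policy(history):
    if not history:
        return "look"
    last_action, last_obs = history[-1]
    return "flip" if (last_action, last_obs) == ("look", 0) else None
```

With goal `x == 1`, `toy_policy` is valid from both initial states, whereas a policy that stops immediately is not, because it fails when the hidden bit is 0.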

SLIDE 17

Example policy

[Figure: policy tree for the 4 × 3 Minesweeper instance; nodes are the cells to click, edges are labelled with observations]

Execution (successive boards):

? ? ?     1 ? ?     1 1 ?     1 1 0
? 1 ?     ? 1 ?     ? 1 ?     ? 1 ?
? 2 ?     ? 2 ?     ? 2 ?     ? 2 ?
? ? ?     ? ? ?     ? ? ?     ? ? ?

SLIDE 18

Outline

Next section: Standard Representations

SLIDE 19

Policy Trees

The natural representation:

◮ Node = action, edge = observation
◮ One history = one branch
◮ No child: stop

Usage:

◮ typical output of planners
◮ policy typically found as a tree
◮ DAGs when equivalent situations detected
◮ very verbose

SLIDE 20

Finite-State Controllers

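Natural compaction of trees:

[Figure: finite-state controller whose nodes are labelled with actions (a0, a1, a0, a2, a3, a4) and whose edges are labelled with observations (o0, o1)]

Usage:

◮ direct search in policy space
◮ representation of infinite policies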
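
To make the controller idea concrete, here is a minimal sketch of how a finite-state controller could be executed. The controller below is hypothetical (the one drawn on the slide cannot be recovered from the extraction), and the dictionary encoding and the `observe` callback are assumptions.

```python
def run_fsc(actions, transitions, start, observe, max_steps):
    """Execute a finite-state controller.

    actions:     node -> action label
    transitions: (node, observation) -> next node; a missing entry means "stop"
    observe:     callable taking the executed action, returning the observation
    """
    node, trace = start, []
    for _ in range(max_steps):
        action = actions[node]
        obs = observe(action)
        trace.append((action, obs))
        nxt = transitions.get((node, obs))
        if nxt is None:          # no outgoing edge for this observation: stop
            break
        node = nxt
    return trace


# Hypothetical two-node controller: repeat a0 while o0 is observed, then do a1 once.
ACTIONS = {"q0": "a0", "q1": "a1"}
TRANSITIONS = {("q0", "o0"): "q0", ("q0", "o1"): "q1"}
```

The self-loop on `q0` is what lets a fixed-size controller stand for an unboundedly long policy, which a finite tree cannot do.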

SLIDE 21

Common Properties

Note:

◮ there are other representations
◮ implicit representation as DAG (Brafman & Hoffmann 2005)
◮ with “pseudo-epistemic” literals (Albore, Geffner & Palacios 2009)

Common properties:

◮ branching on observations
◮ “reactive”: execution is instantaneous at each timestep
◮ unreadable

SLIDE 22

Outline

Next section: Knowledge-Based Programs

SLIDE 23

Syntax

Essentially defined by Fagin et al. in the 1990s:

κ ::= ε | a | κ ; κ | if Θ then κ else κ fi | while Θ do κ od

with Θ either

◮ a subjective epistemic formula over variables X
◮ jo(o) for some observation o (true when the last observation received is o)

Note: no auxiliary variable
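
The grammar maps directly onto an abstract syntax tree. The sketch below is one possible encoding, not the authors' implementation: conditions Θ are kept as opaque strings, `if ... then ... fi` without `else` is encoded with an empty else-branch, and all class and function names are illustrative.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Empty:
    """The empty program (epsilon)."""


@dataclass(frozen=True)
class Act:
    name: str


@dataclass(frozen=True)
class Seq:
    first: "KBP"
    second: "KBP"


@dataclass(frozen=True)
class If:
    cond: str
    then: "KBP"
    orelse: "KBP"


@dataclass(frozen=True)
class While:
    cond: str
    body: "KBP"


KBP = Union[Empty, Act, Seq, If, While]


def size(k: KBP) -> int:
    """Number of AST nodes, a crude measure of the size of a KBP."""
    if isinstance(k, (Empty, Act)):
        return 1
    if isinstance(k, Seq):
        return 1 + size(k.first) + size(k.second)
    if isinstance(k, If):
        return 1 + size(k.then) + size(k.orelse)
    return 1 + size(k.body)  # While


def minesweeper_kbp(height: int, width: int) -> While:
    """The Minesweeper KBP: one 'if K not m then click fi' per cell, inside a loop."""
    body: KBP = Empty()
    for i in range(height, 0, -1):
        for j in range(width, 0, -1):
            test = If(f"K not m[{i},{j}]", Act(f"click[{i},{j}]"), Empty())
            body = Seq(test, body)
    return While("not K all(c xor m)", body)
```

Under this encoding the Minesweeper program has 4HW + 2 nodes, so its size grows linearly with the board.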

SLIDE 24

Example KBP

Minesweeper:

while ¬K ⋀_{i,j} (c_{i,j} ⊕ m_{i,j}) do
    if K¬m_{1,1} then click_{1,1} fi
    if K¬m_{1,2} then click_{1,2} fi
    ...
    if K¬m_{H,W} then click_{H,W} fi
od

SLIDE 25

Epistemic Logic

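Subjective formulas over X:

$$\Phi ::= K\varphi \mid \hat{K}\varphi \mid \neg\Phi \mid \Phi \vee \Phi \mid \Phi \wedge \Phi$$

with $\varphi$ propositional over X

Minesweeper:

$$K m_{2,1} \wedge K(m_{4,1} \vee m_{4,2} \vee m_{4,3}) \wedge \hat{K} m_{4,1} \wedge \hat{K} m_{4,2} \wedge \neg K(m_{4,1} \wedge m_{4,2})$$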

SLIDE 26

Semantics of Subjective Formulas

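◮ S5 semantics
◮ Belief state = set of possible states $B$
◮ $B \models K\varphi$: $\forall s \in B$, $s \models \varphi$
◮ $B \models \hat{K}\varphi$: $\exists s \in B$, $s \models \varphi$

[Figure: the possible states of the belief state below, with mines marked *]

1 ? ?
? 1 ?
? 2 ?
? ? ?

$$\models K m_{2,1} \wedge K(m_{4,1} \vee m_{4,2} \vee m_{4,3}) \wedge \hat{K} m_{4,1} \wedge \hat{K} m_{4,2} \wedge \neg K(m_{4,1} \wedge m_{4,2})$$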
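
The two clauses for $K$ and $\hat{K}$ are just universal and existential quantification over the belief state, which the following sketch makes explicit. The belief state used here is an assumption: it is the set of mine layouts that appear consistent with the board above when there are exactly two mines (a mine at (2,1) plus one mine somewhere in row 4). States are encoded as sets of mined cells, and the helper names are illustrative.

```python
def K(belief, phi):
    """B |= K phi: phi holds in every state of the belief state."""
    return all(phi(s) for s in belief)


def K_hat(belief, phi):
    """B |= K-hat phi: phi holds in at least one state of the belief state."""
    return any(phi(s) for s in belief)


def mine(i, j):
    """Propositional atom m_{i,j}, as a predicate on states."""
    return lambda state: (i, j) in state


# Assumed belief state: mine at (2,1) and one mine in row 4.
belief = [
    frozenset({(2, 1), (4, 1)}),
    frozenset({(2, 1), (4, 2)}),
    frozenset({(2, 1), (4, 3)}),
]

formula_holds = (
    K(belief, mine(2, 1))
    and K(belief, lambda s: mine(4, 1)(s) or mine(4, 2)(s) or mine(4, 3)(s))
    and K_hat(belief, mine(4, 1))
    and K_hat(belief, mine(4, 2))
    and not K(belief, lambda s: mine(4, 1)(s) and mine(4, 2)(s))
)
```

The agent knows there is a mine at (2,1) and one in row 4, considers (4,1) and (4,2) each possible, but does not know either of them for sure.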

SLIDE 27

Operational Semantics of KBPs

◮ Agent maintains current belief state $B_t$
   → [if Φ then ... else ... fi]: decide $B_t \models \Phi$
◮ Agent maintains last observation received $o_{t-1}$
   → [if jo(o) then ... else ... fi]: decide $o_{t-1} = o$
◮ Same for [while Φ ...] and [while jo(o) ...]
◮ A KBP represents a policy given $B_0$ and $o_{-1}$

Minesweeper: same policy given $B_0$ =

? ? ?
? 1 ?
? 2 ?
? ? ?

and $o_{-1} = \bot$
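
The operational semantics can be illustrated by running the Minesweeper KBP with explicit belief tracking. This is a sketch under several assumptions that are not in the slides: the board is 4 × 3 with exactly two mines, a belief state is stored as a set of mine layouts while the cleared cells are tracked separately (they are determined by the clicks), and a guard stops the loop when a full pass brings no progress, whereas the KBP as written would loop forever in that case.

```python
from itertools import combinations

H, W = 4, 3
CELLS = [(i, j) for i in range(1, H + 1) for j in range(1, W + 1)]


def neighbours(cell):
    i, j = cell
    return [(a, b) for a in (i - 1, i, i + 1) for b in (j - 1, j, j + 1)
            if (a, b) != cell and 1 <= a <= H and 1 <= b <= W]


def observe(mines, cell):
    """Observation received after clicking `cell` when the mines are `mines`."""
    if cell in mines:
        return "lost"
    return sum(nb in mines for nb in neighbours(cell))


def initial_belief(revealed):
    """B_0: all two-mine layouts consistent with the revealed numbers."""
    return {frozenset(m) for m in combinations(CELLS, 2)
            if all(observe(frozenset(m), c) == n for c, n in revealed.items())}


def run_kbp(hidden, revealed):
    """Run the Minesweeper KBP against the hidden layout `hidden`."""
    belief = initial_belief(revealed)
    cleared = set(revealed)
    observations = []

    def goal_known():
        # B |= K AND_{i,j} (c_{i,j} xor m_{i,j})
        return all((c in cleared) != (c in m) for m in belief for c in CELLS)

    while not goal_known():
        before = (len(belief), len(cleared))
        for cell in CELLS:
            if all(cell not in m for m in belief):      # B |= K not m_{i,j}
                o = observe(hidden, cell)               # the environment answers
                observations.append((cell, o))
                # Belief update: keep the layouts that would have produced o.
                belief = {m for m in belief if observe(m, cell) == o}
                cleared.add(cell)
        if (len(belief), len(cleared)) == before:
            break                                       # guard, not part of the KBP
    return belief, cleared, observations
```

Starting from the board above (a 1 at (2,2) and a 2 at (3,2)) with the hidden mines at (2,1) and (4,1), the first three clicks are (1,1), (1,2), (1,3) with observations 1, 1, 0, and the run ends with the layout fully identified and every safe cell cleared. Note that the program re-clicks cells that are already cleared, exactly as the KBP text prescribes; this is harmless.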

SLIDE 28

Outline

Next section: The Bright Side of KBPs as Policies

SLIDE 29

Succinctness

Reactive representation of policies: one that can be executed in polynomial time at each step.

Families of contingent planning problems $(\Pi_n)_n$:

◮ with description polysize in $n$
◮ with valid KBPs $(\kappa_n)_n$ polysize in $n$
◮ with no valid reactive policy $(\pi_n)_n$ polysize in $n$ (if NP ⊄ P/poly)

Notes:

◮ for any reactive representation
◮ more succinct than FSCs in the sense of knowledge compilation (KC), modulo the representation of KBPs
◮ proof based on 3SAT but could have been on Minesweeper

SLIDE 30

A Very Small Clock/A Very Slow KBP

Recall:

◮ no auxiliary variable in KBP → polysize memory
◮ + polysize $\pi$ → termination in at most exponential time, or loop
◮ valid reactive $\pi$ cannot execute for more than $O(|\pi| \cdot 2^n)$ steps

With knowledge:

◮ no auxiliary variable but $2^{2^n}$ belief states
◮ there is a polysize KBP going through all of them and terminating
◮ can be used as a $2^{2^n}$-step clock
◮ a very very small clock... or a very very slow KBP!

SLIDE 31

Readability and Succinctness

Succinctness:

◮ embed policy in low-memory device: robot, satellite...
◮ transmit policy quickly or over low bandwidth: other planet...
◮ enhances readability/explainability by humans

Readable/explainable policies:

◮ succinct + clear and simple semantics
◮ without jo: abstracts away from sensor model

Who cares about readable/explainable policies?

◮ missions with high cost, high risk, high stakes
◮ policy reviewed/written by experts
◮ autonomous agents in society: airport surveillance...

SLIDE 32

Outline

Next section: The Dark Side of KBPs as Policies

SLIDE 33

Slow Execution

All representations in the literature:

◮ reactive representations
◮ execution in low polynomial time at each step

KBPs:

◮ deciding next action to take: complete for $\Theta_2^P$ (polynomial time with logarithmically many calls to an NP oracle)
◮ however belief state is maintained

Still:

◮ $P^{NP}$: use SAT solvers
◮ factored belief tracking (Brafman & Shani, Geffner)

SLIDE 34

Slow Automated Verification

Verification:

◮ given: contingent planning problem $\Pi$
◮ given: some policy $\pi$ (e.g., written by experts)
◮ decide whether $\pi$ is valid for $\Pi$

Complexity:

◮ reactive representations: in PSPACE
◮ KBPs with while: EXPSPACE-hard
◮ without while: $\Pi_2^P$-hard (vs. in NP)

SLIDE 35

Taking the Best of Both Worlds

Use (knowledge) compilation:

◮ let experts write KBPs
◮ automatically verify
◮ keep/store/transmit succinct representation
◮ if needed by application, unroll (parts) into reactive representation
◮ example use case: next plans for robot on Mars

As a specification language:

◮ initially intended to be so
◮ literature taking this view (Van Der Meyden)
◮ automatic refinement of specifications through operational semantics

SLIDE 36

Outline

Next section: Conclusion

SLIDE 37

Summary

KBPs are

◮ succinct (provably more than FSCs)
◮ more high-level than standard policies
◮ abstract away from sensor model
◮ but harder to execute and verify

Possible usage:

◮ as readable specification language for policies
◮ for low memory or low bandwidth
◮ explainable policies for robots in society

SLIDE 38

Outline

Next section: Multi-Agent KBPs

SLIDE 39

Strike Example

Planning:

◮ Hello Bob, will you pick me up at the airport tomorrow?
◮ OK, and at the train station if the airline strike is confirmed.
◮ I broke my phone; how do we let each other know?
◮ I will listen to the radio to find out whether there is a strike...

Execution by Alice (with strike):

◮ Alice takes the train and listens to the radio
◮ So Alice knows that Bob knows that there is an airline strike
◮ So Alice knows that Bob will pick her up at the train station

SLIDE 40

Xmas Example

while ¬K_M(clothesBought) do
    if K_M(DadAtShopA) then
        oneStepToShopB
    else if K_M(DadAtShopB) then
        oneStepToShopA
    else
        observeTraffic

SLIDE 41

Outline

Next section: Synthesis of KBPs

SLIDE 42

From the Problem Specification

Complexity:

◮ same as standard policies
◮ 2-EXPTIME-complete in the general case
◮ better complexity for some subclasses

In practice:

◮ regression from goals
◮ need for efficient data structures (BDD-like)
◮ work in progress
◮ with Anaëlle Wilczynski, Alexandre Niveau, Sébastien Gamblin...

SLIDE 43

From a Standard Policy?

Idea: use power of standard planners...

[Figure: policy tree for the 4 × 3 Minesweeper instance, as output by a standard planner]

... and summarize as KBP

Model-agnostic explanations vs explainable by design

SLIDE 44

Outline

Next section: More Succinctness?

SLIDE 45

Reasoning about the Past/Future?

Universal plan:

while true do
    for all actions a do
        if K(after this action there will be a plan for G) then do a fi
    od
od

Common in logic, but not as specification language!
