Towards an efficient representation for epistemic planning


SLIDE 1

1/24 Context of the work Theory Work Conclusion, perspectives

Towards an efficient representation for epistemic planning

Sebastien Gamblin

Supervised by Alexandre Niveau

University of Caen

December 6, 2018

SLIDE 2

Plan

1. Context of the work
2. Theory
  • Epistemic logic
  • Kripke structure
  • Product update
  • KBP
3. Work
  • V-injectivity
  • Propositional representation
  • Simple querying of the structure
  • Planning
  • Results
4. Conclusion, perspectives

SLIDE 3

Context of the work

Theme: multi-agent planning, using epistemic logic and the event models of DEL to represent the problem.
Goal: find one policy for each agent, in the form of a knowledge-based program (KBP) [LZ12].
We work on the game Hanabi, a collaborative card game in which it is natural to reason about the knowledge of the other agents.

SLIDE 4

Epistemic logic

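Let $A$ be a set of agents and $X$ a set of propositional atoms.
The language $L_{EL}$ is defined by the grammar $\Phi ::= \top \mid p \mid \neg\Phi \mid \Phi \vee \Phi \mid K_a \Phi$, with $p \in X$ and $a \in A$.
$K_a \Phi$ means "agent $a$ knows that $\Phi$".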

SLIDE 5

Kripke structure

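Kripke structure: $M = (W, R_1, \dots, R_n, V)$ [FHMV95]
  • $W$: set of worlds
  • $R_1, \dots, R_n \subseteq W \times W$: indistinguishability relations
  • $V : W \to 2^X$: valuation function

[Figure: two worlds, $w_1$ (where $p$ and $q$ hold) and $w_2$ (where only $q$ holds); each world has a loop labelled 1, 2, and an edge labelled 2 links $w_1$ and $w_2$.]

Figure: Example of the knowledge of agents

A Kripke structure lets us interpret an epistemic formula in a given state of knowledge. Example: in $w_1$, the formulas $K_1 p$, $\neg K_2 p$ and $K_1 \neg K_2 p$ are true.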
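
The example can be checked mechanically with an explicit (non-symbolic) model checker. The sketch below is only an illustration: the tuple encoding of formulas and the function name `holds` are choices made here, and the valuations $w_1 = \{p, q\}$, $w_2 = \{q\}$ are an assumption inferred from the formulas stated to be true in $w_1$.

```python
# Explicit Kripke structure for the two-world example.
# ASSUMPTION: w1 = {p, q} and w2 = {q}, inferred from the truths stated for w1.
V = {"w1": {"p", "q"}, "w2": {"q"}}
R = {
    1: {("w1", "w1"), ("w2", "w2")},                              # agent 1 tells w1 and w2 apart
    2: {("w1", "w1"), ("w2", "w2"), ("w1", "w2"), ("w2", "w1")},  # agent 2 does not
}

def holds(w, phi):
    """Return True iff M, w |= phi, where phi is a nested tuple."""
    op = phi[0]
    if op == "top":
        return True
    if op == "atom":
        return phi[1] in V[w]
    if op == "not":
        return not holds(w, phi[1])
    if op == "or":
        return holds(w, phi[1]) or holds(w, phi[2])
    if op == "K":
        # ("K", agent, sub): sub must hold in every world the agent considers possible
        _, agent, sub = phi
        return all(holds(v, sub) for (u, v) in R[agent] if u == w)
    raise ValueError(f"unknown operator: {op}")

p = ("atom", "p")
K1_p = ("K", 1, p)
not_K2_p = ("not", ("K", 2, p))
K1_not_K2_p = ("K", 1, not_K2_p)

for name, phi in [("K1 p", K1_p), ("not K2 p", not_K2_p), ("K1 not K2 p", K1_not_K2_p)]:
    print(name, "holds in w1:", holds("w1", phi))
```

Each call walks the relation of the agent named in the `K` operator, so the cost grows with the number of edges; this is the explicit representation whose size the rest of the talk tries to avoid.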

SLIDE 6

Event model

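Event model: $\varepsilon = (E, E_1, \dots, E_n, \mathrm{pre}, \mathrm{post})$
  • $E$: set of events (actions)
  • $E_1, \dots, E_n \subseteq E \times E$: indistinguishability relations
  • $\mathrm{pre} : E \to L_{EL}$: precondition function
  • $\mathrm{post} : E \times X \to L_{PROP}$: postcondition function

[Figure: two events, $e_1$ (pre: $p$, post: $p \leftarrow \bot$) and $e_2$ (pre: $q$, post: $q \leftarrow \bot$); each event has a loop labelled 1, 2, and an edge labelled 1, 2 links $e_1$ and $e_2$.]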

Figure: Event model example

SLIDE 7

Product update

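Product update of $\varepsilon$ and $M$: $\varepsilon \otimes M = (W', R'_1, \dots, R'_n, V')$, where
  • $W' = \{(w, e) \in W \times E \mid M, w \models \mathrm{pre}(e)\}$
  • $(w, e) \, R'_i \, (w', e')$ iff $w \, R_i \, w'$ and $e \, E_i \, e'$
  • $V'((w, e)) = \{p \in X \mid M, w \models \mathrm{post}(e, p)\}$

It is a restricted Cartesian product of the Kripke structure and the event model: only the pairs $(w, e)$ in which $w$ satisfies the precondition of $e$ are kept.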
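
The three clauses translate almost line by line into code. The following is a minimal sketch on the running example, not the implementation used in this work. It assumes the valuations $w_1 = \{p, q\}$ and $w_2 = \{q\}$, and it simplifies `pre` and `post` to plain Python functions of the current valuation, which is enough here because the preconditions of the example are propositional.

```python
# ASSUMPTIONS: valuations w1 = {p, q}, w2 = {q}; pre/post are functions of the
# current valuation (sufficient for propositional preconditions).
W = {"w1", "w2"}
V = {"w1": frozenset({"p", "q"}), "w2": frozenset({"q"})}
R = {
    1: {("w1", "w1"), ("w2", "w2")},
    2: {(a, b) for a in W for b in W},
}

E = {"e1", "e2"}
E_rel = {i: {(a, b) for a in E for b in E} for i in (1, 2)}    # nobody tells e1 from e2
pre = {"e1": lambda v: "p" in v, "e2": lambda v: "q" in v}
post = {"e1": lambda v: v - {"p"}, "e2": lambda v: v - {"q"}}  # p <- false, q <- false

def product_update(W, R, V, E, E_rel, pre, post):
    """Return (W', R', V') following the three clauses of the definition."""
    W2 = {(w, e) for w in W for e in E if pre[e](V[w])}
    R2 = {
        i: {(a, b) for a in W2 for b in W2
            if (a[0], b[0]) in R[i] and (a[1], b[1]) in E_rel[i]}
        for i in R
    }
    V2 = {(w, e): post[e](V[w]) for (w, e) in W2}
    return W2, R2, V2

W2, R2, V2 = product_update(W, R, V, E, E_rel, pre, post)
for world in sorted(W2):
    print(world, sorted(V2[world]))
```

The pair $(w_2, e_1)$ is dropped because $w_2$ does not satisfy $p$, which leaves three worlds in the updated structure.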

SLIDE 8

Product update: example

[Figure: the Kripke structure ($w_1$, $w_2$) and the event model ($e_1$, $e_2$) of the previous slides, followed by their product, whose three worlds are $(w_1, e_1)$, $(w_1, e_2)$ and $(w_2, e_2)$.]

Figure: ⊗ Updated structure

SLIDE 9

KBP : Knowledge-based programs [LZ12]

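$A$: set of primitive actions.
A KBP is defined inductively as follows:
  • the empty plan is a KBP;
  • any action $\alpha \in A$ is a KBP;
  • if $\pi$ and $\pi'$ are KBPs, then $\pi; \pi'$ is a KBP;
  • if $\pi$ and $\pi'$ are KBPs, then "if $\Phi$ then $\pi$ else $\pi'$" is a KBP;
  • if $\pi$ is a KBP, then "while $\Phi$ do $\pi$" is a KBP.
$\Phi$ must be subjective for the current agent, i.e. it may only test what that agent knows.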
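
The inductive definition maps directly onto a small interpreter. The sketch below is illustrative only: the tuple encoding, the `run` function and the toy domain (an agent narrowing down which card it holds) are all hypothetical and are not taken from [LZ12]. Branching conditions are Python predicates on the agent's own knowledge state, which mirrors the requirement that $\Phi$ be subjective.

```python
def run(kbp, state, do):
    """Execute a KBP given as a nested tuple; do(state, action) returns the new state."""
    kind = kbp[0]
    if kind == "empty":
        return state
    if kind == "act":        # ("act", name)
        return do(state, kbp[1])
    if kind == "seq":        # ("seq", pi1, pi2)
        return run(kbp[2], run(kbp[1], state, do), do)
    if kind == "if":         # ("if", cond, pi_then, pi_else)
        return run(kbp[2] if kbp[1](state) else kbp[3], state, do)
    if kind == "while":      # ("while", cond, pi)
        while kbp[1](state):
            state = run(kbp[2], state, do)
        return state
    raise ValueError(kind)

# Hypothetical toy domain: the state is the set of cards the agent still
# considers possible for its own hand; a hint rules out the highest one.
log = []

def do(state, action):
    log.append(action)
    return state - {max(state)} if action == "ask_hint" else state

knows_card = lambda s: len(s) == 1
program = ("seq",
           ("while", lambda s: not knows_card(s), ("act", "ask_hint")),
           ("if", lambda s: s == {1}, ("act", "play"), ("act", "discard")))

final = run(program, frozenset({1, 2, 3}), do)
print(log, final)
```

Because every condition only inspects the agent's own set of possibilities, the program can be executed by the agent itself without access to the real world state.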

SLIDE 10

Domain: we are given a set V and a set of actions.
Problem: given an initial state and a goal (an epistemic formula), find a KBP for each agent such that, when the agents execute their KBPs synchronously, turn by turn, the goal is reached in a finite number of steps.

SLIDE 11

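Initial Kripke structure for a Hanabi-like game

[Figure: six worlds, each with a loop labelled 1, 2; three edges are labelled 1 and three are labelled 2. The worlds and their true atoms are:]
  • m1: x11, x22, xp3
  • m2: x11, x23, xp2
  • m3: x12, x21, xp3
  • m4: x12, x23, xp1
  • m5: x13, x21, xp2
  • m6: x13, x22, xp1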

Figure: Example of initial situation with 3 cards and 2 agents.

xic : agent i has card c.

SLIDE 12

Event

[Figure: event model with three events, linked by edges labelled j and A:]
  • "j draws 1": pre: xp1; post: xj1 ← ⊥, xp1 ← ⊤
  • "j draws 2": pre: xp2; post: xj2 ← ⊥, xp2 ← ⊤
  • "j draws 3": pre: xp3; post: xj3 ← ⊥, xp3 ← ⊤

SLIDE 13

Toward an efficient representation

Combinatorial explosion
Naive implementation as an explicit graph: 2 players, 4 cards in hand.

Cards   Number of worlds   Number of relations
9       630                22694
10      3150               58926
11      11550              112266

Contribution
Hanabi has a useful property: two distinct worlds never have the same propositional valuation.
Definition (V-injective). A Kripke structure $M = (W, R_1, \dots, R_n, V)$ is called V-injective if $V$ is injective, i.e., $\forall w, w' \in W : w \neq w' \Rightarrow V(w) \neq V(w')$.
This property was also identified by M. Gattinger [Gat18]. Related idea: adding variables to split worlds [CS17].
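
The world counts in the table can be reproduced with a short computation. The sketch below assumes that a world is a deal of 4 distinct cards to each of the two players, with the remaining cards left in the deck; this reading is an assumption, although it gives exactly the numbers of the table. The helper `is_v_injective` is a hypothetical name for a direct check of the definition on an explicit valuation.

```python
from math import comb

def count_worlds(cards, hand=4):
    """Deals of `hand` cards to each of two players; the rest stays in the deck."""
    return comb(cards, hand) * comb(cards - hand, hand)

def is_v_injective(V):
    """V maps each world to its set of true atoms; injective iff no valuation repeats."""
    valuations = [frozenset(atoms) for atoms in V.values()]
    return len(valuations) == len(set(valuations))

for cards in (9, 10, 11):
    print(cards, count_worlds(cards))

print(is_v_injective({"w1": {"p", "q"}, "w2": {"q"}}))  # two distinct valuations
print(is_v_injective({"w1": {"p"}, "w2": {"p"}}))       # the same valuation twice
```

When the check succeeds, a world can be identified with its valuation, which is what makes a purely propositional encoding of the relations possible.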

SLIDE 14

Definition (Boolean representation of a Kripke structure). Let $M$ be a V-injective Kripke structure for agents $A = \{a_1, \dots, a_m\}$ and propositional variables $X = \{x_1, \dots, x_n\}$. The propositional representation of $M$ is a tuple of Boolean functions $F = (f_1, \dots, f_m)$, where every $f_i : \mathbb{B}^{2n} \to \mathbb{B}$ is defined as follows:

$$f_i(v_1, \dots, v_n, v'_1, \dots, v'_n) = 1 \iff \exists w, w' \in W : \begin{cases} \forall j : V(w)(x_j) = v_j \\ \forall j : V(w')(x_j) = v'_j \\ (w, w') \in R_i \end{cases}$$

Relations for agent 1
p   q   p'  q'  Rel
1   1   1   1   w1 → w1
0   1   0   1   w2 → w2

Relations for agent 2
p   q   p'  q'  Rel
1   1   1   1   w1 → w1
0   1   0   1   w2 → w2
1   1   0   1   w1 → w2
0   1   1   1   w2 → w1

Figure: Representation by a Boolean function of the example
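
The truth tables above can be generated from the explicit structure. In the sketch below, `make_f` is a hypothetical helper that returns $f_i$ as a Python function over the bits $(p, q, p', q')$; the valuations $w_1 = (1, 1)$ and $w_2 = (0, 1)$ are assumed, as in the tables. A real implementation would store each $f_i$ as a BDD instead of a set of rows.

```python
from itertools import product

X = ["p", "q"]
# ASSUMPTION: w1 = (p=1, q=1), w2 = (p=0, q=1), as in the tables of the example.
V = {"w1": {"p": 1, "q": 1}, "w2": {"p": 0, "q": 1}}
R = {
    1: {("w1", "w1"), ("w2", "w2")},
    2: {("w1", "w1"), ("w2", "w2"), ("w1", "w2"), ("w2", "w1")},
}

def make_f(i):
    """Build f_i: it returns 1 exactly on the rows (V(w), V(w')) with (w, w') in R_i."""
    rows = {
        tuple(V[w][x] for x in X) + tuple(V[w2][x] for x in X)
        for (w, w2) in R[i]
    }
    return lambda *bits: int(tuple(bits) in rows)

f1, f2 = make_f(1), make_f(2)

# Print the satisfying rows of f2, in the column order p, q, p', q'.
for bits in product([0, 1], repeat=2 * len(X)):
    if f2(*bits):
        print(bits)
```

V-injectivity is what makes this encoding lossless: each row identifies exactly one pair of worlds.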

SLIDE 15

Model checking

There is a practical algorithm for checking whether a world of a Boolean representation satisfies an epistemic formula.
Goal: find a propositional representation $\Theta(F, \Phi)$ over $X$ of the set of worlds $Q(M, \Phi) = \{w \in W \mid M, w \models \Phi\}$, where $\Phi \in L_{EL}$, i.e. $\mathrm{Mod}(\Theta(F, \Phi)) = Q(M, \Phi)$.

Proposition 1. Let $F$ be the propositional representation of $M$, and let $f_W$ be the formula whose models are exactly the valuations of the worlds in $W$. Then:
1. $\Theta(F, x) = f_W \wedge x$
2. $\Theta(F, \neg\Phi) = f_W \wedge \neg\Theta(F, \Phi)$
3. $\Theta(F, \Phi \wedge \Psi) = \Theta(F, \Phi) \wedge \Theta(F, \Psi)$
4. $\Theta(F, \hat{K}_i \Phi) = \mathrm{Forget}(f_i \wedge F', X')$, where $F' = \Theta(F, \Phi)[X \to X']$

Here $\hat{K}_i \Phi = \neg K_i \neg \Phi$ is the dual of $K_i$. Several languages for representing propositional formulas support these operations efficiently (OBDDs, for example).
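
The four rules can be tried out with sets of models standing in for BDDs. The sketch below is not the CUDD-based implementation: a Boolean function is simply the Python set of its models, conjunction is set intersection, and Forget is a projection. The valuations of $w_1$ and $w_2$ are assumed as in the running example, and the `"K"` case is derived from rule 4 through the duality with $\hat{K}_i$.

```python
# Set-based stand-in for a BDD: a Boolean function is the set of its models.
# ASSUMPTION: valuations w1 = {p, q} and w2 = {q}, as in the running example.
w1, w2 = frozenset({"p", "q"}), frozenset({"q"})
fW = {w1, w2}                      # models of f_W
F = {                              # models of f_i, as pairs (valuation over X, valuation over X')
    1: {(w1, w1), (w2, w2)},
    2: {(w1, w1), (w2, w2), (w1, w2), (w2, w1)},
}

def theta(phi):
    """Models of Theta(F, phi), following the four rules of the proposition."""
    op = phi[0]
    if op == "atom":               # rule 1: f_W and x
        return {v for v in fW if phi[1] in v}
    if op == "not":                # rule 2: f_W and not Theta
        return fW - theta(phi[1])
    if op == "and":                # rule 3
        return theta(phi[1]) & theta(phi[2])
    if op == "Khat":               # rule 4: ("Khat", i, sub)
        target = theta(phi[2])     # plays the role of F' = Theta(F, sub)[X -> X']
        # Conjoin with f_i, then forget the primed half of each pair.
        return {v for (v, v_prime) in F[phi[1]] if v_prime in target}
    if op == "K":                  # K_i phi as not Khat_i not phi
        return theta(("not", ("Khat", phi[1], ("not", phi[2]))))
    raise ValueError(op)

p = ("atom", "p")
print(theta(("K", 1, p)))   # worlds where agent 1 knows p
print(theta(("K", 2, p)))   # worlds where agent 2 knows p
```

With a BDD library, the comprehension in the `"Khat"` case would become a conjunction followed by existential quantification over the primed variables.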

SLIDE 16

Boolean representation for Event model

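  • Plain atoms describe the original world: $p$
  • Plus atoms describe the valuation in the future state: $p^+$
  • Primed atoms describe the propositions of the target event of a relation

Here, $\varphi_{e_1} = p \wedge \neg p^+$ and $\varphi_{e_2} = q \wedge \neg q^+$.
The event formula is obtained as follows:

$$\Phi_e = (\varphi_{e_1} \wedge \varphi_{e_1}[X' \leftarrow X, X^{+\prime} \leftarrow X^+]) \vee (\varphi_{e_1} \wedge \varphi_{e_2}[X' \leftarrow X, X^{+\prime} \leftarrow X^+]) \vee (\varphi_{e_2} \wedge \varphi_{e_1}[X' \leftarrow X, X^{+\prime} \leftarrow X^+]) \vee (\varphi_{e_2} \wedge \varphi_{e_2}[X' \leftarrow X, X^{+\prime} \leftarrow X^+])$$

Relations (1: must be true, 0: must be false, -: unconstrained)
p   p+  q   q+  p'  p'+ q'  q'+ Rel
1   0   -   -   1   0   -   -   e1 → e1
-   -   1   0   -   -   1   0   e2 → e2
1   0   -   -   -   -   1   0   e1 → e2
-   -   1   0   1   0   -   -   e2 → e1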

SLIDE 17

Symbolic product update for propositional representation

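Proposition (product update). With $f_i$ the Boolean representation of the knowledge of agent $i$ and $\Phi_e$ the event formula, the updated representation is:

$$\mathrm{Forget}(f_i \wedge \Phi_e, X \cup X')[X^+ \leftarrow X, X^{+\prime} \leftarrow X']$$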
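
The formula can be exercised on the running example with sets of models in place of BDDs. This is a sketch under assumptions, not the implementation of this work: valuations are `frozenset`s of true atoms, $w_1 = \{p, q\}$ and $w_2 = \{q\}$ are assumed, and each event formula is completed with a frame condition (atoms not assigned by the postcondition keep their value), which is added here so that the future valuation is fully determined.

```python
from itertools import combinations

X = ("p", "q")
VALUATIONS = [frozenset(c) for r in range(len(X) + 1) for c in combinations(X, r)]

# ASSUMPTION: valuations of the running example.
w1, w2 = frozenset({"p", "q"}), frozenset({"q"})
f = {
    1: {(w1, w1), (w2, w2)},
    2: {(w1, w1), (w2, w2), (w1, w2), (w2, w1)},
}

def phi_event(atom):
    """Event 'pre: atom, post: atom <- false' as a relation between v (over X)
    and v_plus (over X+). ASSUMPTION: unassigned atoms keep their value."""
    return lambda v, v_plus: atom in v and v_plus == v - {atom}

phi_e1, phi_e2 = phi_event("p"), phi_event("q")

def Phi_e(v, v_plus, v2, v2_plus):
    """Event formula: any source event paired with any target event."""
    return any(src(v, v_plus) and dst(v2, v2_plus)
               for src in (phi_e1, phi_e2) for dst in (phi_e1, phi_e2))

def symbolic_update(f_i):
    """Forget(f_i and Phi_e, X u X')[X+ <- X, X+' <- X'] on sets of models."""
    return {(v_plus, v2_plus)
            for (v, v2) in f_i                  # models of f_i; v and v2 are forgotten
            for v_plus in VALUATIONS
            for v2_plus in VALUATIONS
            if Phi_e(v, v_plus, v2, v2_plus)}   # the kept pair becomes the new (X, X')

new_f = {i: symbolic_update(f[i]) for i in f}
for i in new_f:
    print(i, len(new_f[i]), "pairs")
```

On this example the result has 5 pairs for agent 1 and 9 pairs for agent 2, over the three valuations $\{q\}$, $\{p\}$ and $\emptyset$ of the worlds of the explicit product.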

SLIDE 18

At this point, we can model the knowledge of the agents. Next, we want to build KBPs. We use regression because formulas include policy...

SLIDE 19

Regression

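Starting from an epistemic goal formula, we want to obtain all the epistemic formulas that can lead to this goal formula through the events of the game.

Definition (regression). The regression of $\Phi_G$ (the goal formula) by $(M, w)$ (a pointed event), written $\mathrm{Reg}_w(\Phi_G)$, is the formula defined as follows (see [Auc12]):

$$\mathrm{Reg}_w(p) = \mathrm{Pre}(w) \wedge \mathrm{Post}(w)(p)$$
$$\mathrm{Reg}_w(\Phi \vee \Psi) = \mathrm{Reg}_w(\Phi) \vee \mathrm{Reg}_w(\Psi)$$
$$\mathrm{Reg}_w(\neg\Phi) = \mathrm{Pre}(w) \wedge \neg\mathrm{Reg}_w(\Phi)$$
$$\mathrm{Reg}_w(K_j \Phi) = \mathrm{Pre}(w) \wedge \bigwedge_{v' \in K_j(w)} K_j(\mathrm{Reg}_{v'}(\Phi))$$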
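
The propositional clauses of the definition can be checked by brute force. The sketch below covers only the cases for atoms, disjunction and negation; the $K_j$ clause is left out. The tuple encoding of formulas and the names `ev`, `regress` and `apply_event` are choices made for this illustration. The final loop compares the regressed formula, evaluated before the event, with the goal evaluated after the event.

```python
from itertools import combinations

X = ("p", "q")

def ev(phi, v):
    """Evaluate a propositional formula (nested tuple) on valuation v."""
    op = phi[0]
    if op == "top":
        return True
    if op == "bot":
        return False
    if op == "atom":
        return phi[1] in v
    if op == "not":
        return not ev(phi[1], v)
    if op == "and":
        return ev(phi[1], v) and ev(phi[2], v)
    if op == "or":
        return ev(phi[1], v) or ev(phi[2], v)
    raise ValueError(op)

def regress(phi, pre, post):
    """Propositional fragment of Reg: atoms, disjunction and negation only."""
    op = phi[0]
    if op == "atom":   # Pre and Post(p); unassigned atoms keep their value
        return ("and", pre, post.get(phi[1], phi))
    if op == "or":
        return ("or", regress(phi[1], pre, post), regress(phi[2], pre, post))
    if op == "not":
        return ("and", pre, ("not", regress(phi[1], pre, post)))
    raise ValueError(op)

def apply_event(v, post):
    """Valuation obtained after the event, used only to validate the regression."""
    return frozenset(x for x in X if ev(post.get(x, ("atom", x)), v))

# Event e1 of the running example: pre: p, post: p <- false.
pre_e1, post_e1 = ("atom", "p"), {"p": ("bot",)}
goal = ("not", ("atom", "p"))
reg = regress(goal, pre_e1, post_e1)

for r in range(len(X) + 1):
    for c in combinations(X, r):
        v = frozenset(c)
        after = ev(pre_e1, v) and ev(goal, apply_event(v, post_e1))
        print(sorted(v), ev(reg, v), after)
```

The two Boolean columns printed by the loop agree on every valuation: the regressed formula holds before the event exactly when the event is executable and the goal holds afterwards.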

SLIDE 20

Algorithm 1: Create plan
Data: n: regression depth
Data: final state
Result: Plan
ΦG ← final_state
Plan ← list((ΦG, 'stop'))
forall i ∈ {1 … n} do
    tmp ← ⊥
    forall a ∈ Actions do
        ΦP ← Reg_a(ΦG)
        Plan.append((ΦP, a))
        tmp ← tmp ∨ ΦP
    end
    ΦG ← tmp
end

We obtain the plan:
if Φ1 then execute action 1
else if Φ2 then execute action 2
else if Φ3 then execute action 3
...
else if ΦG then 'STOP'

SLIDE 21

Algorithm 2: Follow plan
Data: Plan
done ← ⊥
while not done do
    forall (Φ, action) ∈ Plan do
        if evaluate(Φ, state) then
            if action == stop then
                done ← ⊤
            end
            Execute action
        end
    end
end

Open question: what about the program pointers of the other agents in this KBP?
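
Both algorithms can be tried on a deliberately simple, non-epistemic toy domain. Everything in the sketch below is hypothetical: the domain (a position between 0 and 4 with actions `left` and `right`), the use of explicit sets of states in place of formulas, and the function names. One choice also differs from a literal reading of Algorithm 2: the follower fires only the first matching rule at each step, as in the "if / else if" chain of the plan.

```python
STATES = range(5)
ACTIONS = {  # name -> (precondition, effect)
    "right": (lambda s: s < 4, lambda s: s + 1),
    "left": (lambda s: s > 0, lambda s: s - 1),
}

def regress(action, goal):
    """States from which `action` is executable and leads into `goal`."""
    pre, effect = ACTIONS[action]
    return {s for s in STATES if pre(s) and effect(s) in goal}

def create_plan(goal, n):
    """Algorithm 1 on sets of states: n rounds of regression from the goal."""
    plan = [(set(goal), "stop")]
    for _ in range(n):
        reached = set()                 # plays the role of tmp, starting from "false"
        for action in ACTIONS:
            cond = regress(action, goal)
            plan.append((cond, action))
            reached |= cond
        goal = reached
    return plan

def follow_plan(plan, state, max_steps=100):
    """Algorithm 2, taking the first rule whose condition holds at each step."""
    trace = []
    for _ in range(max_steps):
        for cond, action in plan:
            if state in cond:
                break
        else:
            raise RuntimeError(f"no rule applies in state {state}")
        if action == "stop":
            return trace
        state = ACTIONS[action][1](state)
        trace.append(action)
    raise RuntimeError("step budget exhausted")

plan = create_plan({3}, n=3)
print(follow_plan(plan, 0))
print(follow_plan(plan, 4))
```

Rules produced by earlier regression rounds come first in the list, so the first matching rule is always one that moves the state strictly closer to the goal.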

SLIDE 22

Implementation in Python, using the CUDD library for BDDs.

SLIDE 23

Example

SLIDE 24

Conclusion, perspectives

  • How to model a Kripke structure with Boolean formulas
  • Querying this structure
  • "Planning" with regression, but it takes too long
  • Eliminate redundant sub-formulas: requires an efficient data structure for epistemic formulas
    -> Test with an implementation of the tableaux method, but formulas explode
  • Exploit the initial state for the regression?
  • Which heuristic for forward planning?

SLIDE 25

[Auc12] Guillaume Aucher. DEL-sequents for regression and epistemic planning. Journal of Applied Non-Classical Logics, page 29, 2012.
[CS17] Tristan Charrier and Francois Schwarzentruber. A succinct language for dynamic epistemic logic. In Proc. 16th Conference on Autonomous Agents and Multi-Agent Systems (AAMAS 2017), pages 123–131, 2017.
[FHMV95] Ronald Fagin, Joseph Y. Halpern, Yoram Moses, and Moshe Y. Vardi. Reasoning About Knowledge. The MIT Press, Cambridge, Massachusetts, 1995.
[Gat18] Malvin Gattinger. New Directions in Model Checking Dynamic Epistemic Logic. PhD thesis, Universiteit van Amsterdam, 2018.
[LZ12] Jérôme Lang and Bruno Zanuttini. Knowledge-Based Programs as Plans: The Complexity of Plan Verification. In Proc. 20th European Conference on Artificial Intelligence (ECAI 2012), 6 pages, France, August 2012.

SLIDE 26

Example OBDD

$f(x_1, \dots, x_8) = x_1 x_2 + x_3 x_4 + x_5 x_6 + x_7 x_8$
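
This function is a standard illustration of how much the size of an OBDD depends on the variable order. The sketch below does not use a BDD library: it counts the internal nodes of the reduced OBDD directly, using the fact that the nodes at a given level correspond to the distinct cofactors that really depend on that level's variable. The function name `robdd_size` and the two orders compared are choices made for this illustration.

```python
from itertools import product

def f(a):
    """a maps each variable index (1..8) to 0 or 1."""
    return (a[1] & a[2]) | (a[3] & a[4]) | (a[5] & a[6]) | (a[7] & a[8])

def robdd_size(order):
    """Number of internal nodes of the reduced OBDD of f for the given variable order."""
    total = 0
    for level, var in enumerate(order):
        rest = order[level + 1:]
        nodes = set()
        for prefix in product((0, 1), repeat=level):
            fixed = dict(zip(order[:level], prefix))
            # Cofactor as two truth tables over the remaining variables: var = 0, var = 1.
            tables = tuple(
                tuple(f({**fixed, var: bit, **dict(zip(rest, tail))})
                      for tail in product((0, 1), repeat=len(rest)))
                for bit in (0, 1)
            )
            if tables[0] != tables[1]:   # the cofactor really depends on var
                nodes.add(tables)
        total += len(nodes)
    return total

print(robdd_size((1, 2, 3, 4, 5, 6, 7, 8)))   # paired variables kept adjacent
print(robdd_size((1, 3, 5, 7, 2, 4, 6, 8)))   # paired variables separated
```

Keeping each pair $x_{2k-1}, x_{2k}$ adjacent gives 8 internal nodes, while separating the pairs gives 30, which is why variable ordering matters for a BDD-based representation.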

SLIDE 27

Modelling Hanabi

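The atom $x_{pc}$ means that player $p$ holds card $c$.

$$\mathrm{parmi}(0, X) = \bigwedge_{x \in X} \neg x$$
$$\mathrm{parmi}(n, X) = \bigvee_{x \in X} x \;\wedge\; \bigwedge_{x \in X} \bigl(x \to \mathrm{parmi}(n-1, X \setminus \{x\})\bigr)$$

where $X$ is any list of propositional variables ("parmi" is French for "among"), so that $\mathrm{parmi}(n, X)$ holds when exactly $n$ variables of $X$ are true.

Uniqueness and existence of a card: $U = \bigwedge_{c \in C} \mathrm{parmi}(1, \mathit{vars}_c)$

Reciprocity: $R_j = \bigwedge_{p \in P \setminus \{p, j\}} \bigl( \bigwedge_{c \in C} K_j x_{pc} \vee K_j \neg x_{pc} \bigr)$

Counting: $D_j = K_j \, \mathrm{parmi}(n, \mathit{vars}_j)$, where $n$ is the maximum number of cards a player can hold.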
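
The recursive definition of parmi can be evaluated directly on a valuation. The sketch below reads the two missing big operators as a disjunction followed by a conjunction, which is an assumption of this reconstruction, and evaluates the formula instead of building it. The atom names in the example are hypothetical.

```python
from itertools import combinations

def parmi(n, X, v):
    """Evaluate parmi(n, X) under valuation v (the set of true variables)."""
    if n == 0:
        return all(x not in v for x in X)
    return any(x in v for x in X) and all(
        x not in v or parmi(n - 1, [y for y in X if y != x], v) for x in X
    )

# Hypothetical atoms for one card c: held by player 1, by player 2, or in the deck.
cards = ["x1c", "x2c", "xpc"]
for r in range(len(cards) + 1):
    for true_vars in combinations(cards, r):
        print(sorted(true_vars), parmi(1, cards, set(true_vars)))
```

Only the valuations with exactly one true atom satisfy `parmi(1, ...)`, which is the "uniqueness and existence" constraint: each card is in exactly one place.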