
Knowledge-Based Programs as Explainable Policies for Contingent Planning



  1. Knowledge-Based Programs as Explainable Policies for Contingent Planning
     J. Lang, A. Saffidine, F. Schwarzentruber, B. Zanuttini
     MAFTEC, April 1, 2019
     (Footer on every slide: Lang, Saffidine, Schwarzentruber, Zanuttini, KBPs as Explainable Policies)

  2. Planning Problems
     Let’s design an agent for solving problems!

       ? ? ?
       ? 1 ?
       ? 2 ?
       ? ? ?

     Maybe even let the agent compute its policy by itself.

  3. Standard Policies
     Before we send the agent to the minefield...
     Let’s just check how it is planning to behave.

  4. Standard Policies
     Before we send the agent to the minefield...
     Let’s just check how it is planning to behave.
     [Figure: policy tree for a small Minesweeper instance; nodes carry cell coordinates such as 4,1; 4,3 and edges carry observed counts 0, 1, 2.]
     Wouldn’t this lack a little readability, verifiability... explainability?

  5. Knowledge-Based Programs
     What about this behaviour?

       while not sure that all positions except mines have been cleared do
         if sure that there is no mine at ⟨1,1⟩ then click 1,1 fi
         if sure that there is no mine at ⟨1,2⟩ then click 1,2 fi
         ...
         if sure that there is no mine at ⟨H,W⟩ then click H,W fi
       od

     Wouldn’t this be perfectly readable, verifiable... explainable?
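The loop above can be sketched in Python for a tiny board. This is only an illustration under stated assumptions: the belief state is stored as an explicit set of possible mine placements (feasible only for very small boards), the board has exactly two mines, some cells are already cleared at the start, and a progress guard stops the loop when no cell is known to be safe. The names `run_kbp`, `count` and `neighbours` are hypothetical and do not come from the slides.

```python
from itertools import combinations

H, W, N_MINES = 4, 3, 2  # assumed tiny board with exactly two mines
CELLS = [(i, j) for i in range(1, H + 1) for j in range(1, W + 1)]

def neighbours(cell):
    i, j = cell
    return [(a, b) for a in (i - 1, i, i + 1) for b in (j - 1, j, j + 1)
            if (a, b) != cell and (a, b) in CELLS]

def count(cell, mines):
    """Number of mined neighbours of `cell` in the mine placement `mines`."""
    return sum(n in mines for n in neighbours(cell))

def run_kbp(true_mines, initially_cleared):
    """Run the loop; return (cleared cells, whether the goal is known to hold)."""
    # Belief state: every placement of N_MINES mines still considered possible.
    belief = {frozenset(c) for c in combinations(CELLS, N_MINES)}
    cleared = set()

    def click(cell):
        nonlocal belief
        if cell in true_mines:
            raise RuntimeError("lost")
        obs = count(cell, true_mines)
        cleared.add(cell)
        # Keep only the placements consistent with the observation.
        belief = {m for m in belief if cell not in m and count(cell, m) == obs}

    def sure_goal():
        # "Sure that all positions except mines have been cleared."
        return all(set(CELLS) - m <= cleared for m in belief)

    for cell in initially_cleared:
        click(cell)
    progress = True  # guard added for this sketch, not part of the slide's program
    while not sure_goal() and progress:
        progress = False
        for cell in CELLS:
            # "Sure that there is no mine at cell": no possible placement mines it.
            if cell not in cleared and all(cell not in m for m in belief):
                click(cell)
                progress = True
    return cleared, sure_goal()
```

Each epistemic test is evaluated against the whole belief state rather than against a single observation, which is what distinguishes this program from a policy that branches on observations.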

  6. Outline
     ◮ Contingent Planning Problems
     ◮ Standard Representations
     ◮ Knowledge-Based Programs
     ◮ The Bright Side of KBPs as Policies
     ◮ The Dark Side of KBPs as Policies
     ◮ Conclusion
     ◮ Multi-Agent KBPs
     ◮ Synthesis of KBPs
     ◮ More Succinctness?


  8. Partially Observable Domains
     ◮ $X = \{x_1, \dots, x_n\}$: propositional variables (→ $2^n$ states)
     ◮ $A = \{a_1, \dots, a_k\}$: actions
     ◮ $O = \{o_1, \dots, o_p\}$: observations
     ◮ $\varphi_\delta$: transition function
     States are not directly observable.
     Minesweeper $H \times W$ (→ space of $2^{2HW}$ states):
     ◮ variables $X$: $m_{i,j}$, $c_{i,j}$ ($\forall i,j$)
     ◮ actions $A$: $\mathrm{click}_{i,j}$ ($\forall i,j$)
     ◮ observations $O$: $o_0, \dots, o_8$ + $o_{\mathrm{lost}}$

  9. Actions
     Actions have two kinds of effects:
     ◮ ontic effects: change the current state (nondeterministic)
     ◮ epistemic effects: yield an observation (nondeterministic, ambiguous)
     Description for Minesweeper:
       $\varphi_\delta = \bigwedge_{i,j} \left( \mathrm{click}_{i,j} \rightarrow \varphi_\delta^{i,j} \right)$
     with
       $\varphi_\delta^{i,j} = c'_{i,j} \wedge (m_{i,j} \rightarrow o_{\mathrm{lost}}) \wedge \left( \neg m_{i,j} \rightarrow \bigwedge_{n=0,\dots,8} (\varphi_{n,i,j} \leftrightarrow o_n) \right) \wedge \bigwedge_{x \neq c_{i,j}} (x' \leftrightarrow x)$
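Read operationally, the Minesweeper action theory says: clicking a cell marks it as cleared, yields the "lost" observation if the cell is mined and a neighbour count otherwise, and leaves every other variable unchanged. A minimal Python sketch follows, assuming that $\varphi_{n,i,j}$ means "exactly n neighbours of cell (i, j) are mined" (the slide does not spell this out). The function name `click` and the state encoding as two sets are hypothetical.

```python
def click(mines, cleared, cell, height, width):
    """Apply click on `cell`; return (new cleared set, observation)."""
    i, j = cell
    # c'_{i,j} holds afterwards; all other variables keep their value (frame part).
    new_cleared = cleared | {cell}
    if cell in mines:
        return new_cleared, "lost"  # m_{i,j} -> o_lost
    # Assumed reading of phi_{n,i,j}: exactly n of the neighbours are mined.
    n = sum((a, b) in mines
            for a in range(max(1, i - 1), min(height, i + 1) + 1)
            for b in range(max(1, j - 1), min(width, j + 1) + 1)
            if (a, b) != cell)
    return new_cleared, n
```

With this encoding the action happens to be deterministic; the general framework on the slide allows nondeterministic ontic and epistemic effects.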

  10. Planning Problems
     ◮ Domain + initial belief state + goal states
     ◮ Same as POMDPs except for probabilities
     Minesweeper:
     ◮ initial belief state:
       $\bigwedge_{i,j} (\neg c_{i,j}) \wedge \bigvee_{(i,j) \neq (i',j')} (m_{i,j} \wedge m_{i',j'}) \wedge \bigwedge_{(i,j) \neq (i',j') \neq (i'',j'')} \neg (m_{i,j} \wedge m_{i',j'} \wedge m_{i'',j''})$
     ◮ goals: $\bigwedge_{i,j} (c_{i,j} \oplus m_{i,j})$
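For a small board, the initial belief state and the goal test can be enumerated directly. The sketch below reads the initial formula as "no cell is cleared, at least two cells are mined, and no three distinct cells are mined", i.e. exactly two mines. The state encoding (a pair of frozensets) and the names `initial_belief` and `is_goal` are assumptions made for this illustration.

```python
from itertools import combinations

def initial_belief(cells):
    """All states with no cleared cell and exactly two mines."""
    return [(frozenset(m), frozenset()) for m in combinations(cells, 2)]

def is_goal(state, cells):
    """Goal: every cell is either cleared or mined, but not both (c XOR m)."""
    mines, cleared = state
    return all((c in cleared) != (c in mines) for c in cells)
```

On the 4 × 3 board used in the examples, `initial_belief` returns the 66 ways of placing two mines on 12 cells.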

  11. Policies
     ◮ Prescribe to the agent which action to take
     ◮ Cannot be a function of the current state, which is not observable
     ◮ Function from histories of actions/observations to actions
     ◮ Abstract notion
     Examples for Minesweeper:
     ◮ let $(p_t)_t = \langle 1,1 \rangle, \langle 1,2 \rangle, \langle 1,3 \rangle, \dots$
     ◮ $\pi$ defined by $\pi(h) := \mathrm{click}\ p_{|h|}$
     ◮ $\pi'$ defined by
       $\pi'(\epsilon) = \mathrm{click}\ p_0$
       $\pi'(h) = \mathrm{click}\ p_{t(h)+1}$ if $o_{|h|-1}(h) = o_0$
       $\pi'(h) = \mathrm{click}\ p_{t(h)+2}$ otherwise
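The two example policies can be written as Python functions over histories. This is a sketch under assumptions the slide leaves open: a history is a list of (clicked cell, observation) pairs, $|h|$ is the number of such pairs, the sequence $(p_t)$ is indexed from 0 and enumerates cells row by row, and $t(h)$ is taken to be the index of the last clicked cell. The names `positions`, `pi` and `pi_prime` are hypothetical.

```python
def positions(height, width):
    """The sequence p_0, p_1, ...: cells enumerated row by row."""
    return [(i, j) for i in range(1, height + 1) for j in range(1, width + 1)]

def pi(history, p):
    """pi(h) = click p_{|h|}: click the cells in order, ignoring observations."""
    return p[len(history)]

def pi_prime(history, p):
    """pi'(h): move one cell ahead after observing 0, two cells ahead otherwise."""
    if not history:
        return p[0]
    last_cell, last_obs = history[-1]
    t = p.index(last_cell)  # assumed meaning of t(h)
    return p[t + 1] if last_obs == 0 else p[t + 2]
```

Neither function handles running past the end of the sequence; both raise `IndexError` there, which a complete policy would replace by "stop".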

  12. Valid policies
     Execution model: at each $t = 0, 1, \dots$
     ◮ current state $s_t$ (nonobservable)
     ◮ current history $h_t = a_0 o_0 a_1 o_1 \dots a_{t-1} o_{t-1}$
     ◮ action $a_t = \pi(h_t)$ executed (or “stop”)
     ◮ observation $o_t$ + new state $s_{t+1}$ chosen nondeterministically w.r.t. $\varphi_\delta$
     ◮ $o_t$ given to the agent
     ◮ $s_{t+1}$ = new current state
     ◮ new current history $h_{t+1} = h_t a_t o_t$
     Valid policy: for all $s_0 \models \varphi_I$, the execution terminates at some finite time $t$ with $s_t \models \varphi_G$
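The execution model suggests a direct, brute-force validity check: follow every nondeterministic branch from every initial state and verify that the policy stops in a goal state. The sketch below assumes a hypothetical interface in which `step(state, action)` returns the possible (observation, next state) pairs, `policy(history)` returns an action or `None` for "stop", and a `max_steps` bound is imposed so that the check itself terminates. None of these names come from the slides.

```python
def is_valid(policy, initial_states, step, is_goal, max_steps=50):
    """True if every execution stops within max_steps, and does so in a goal state."""
    frontier = [(s, ()) for s in initial_states]  # pairs (current state, history)
    for _ in range(max_steps + 1):
        next_frontier = []
        for state, history in frontier:
            action = policy(history)
            if action is None:  # the policy says "stop"
                if not is_goal(state):
                    return False
                continue
            # Branch over every observation/successor allowed by the domain.
            for obs, new_state in step(state, action):
                next_frontier.append((new_state, history + ((action, obs),)))
        frontier = next_frontier
        if not frontier:
            return True  # every execution has stopped in a goal state
    return False  # some execution is still running after max_steps
```

The check is exponential in general, since the number of histories grows with every branching observation.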

  13. Example policy
     [Figure: policy tree for the 4 × 3 Minesweeper instance; nodes carry cell coordinates such as 4,1; 4,3 and edges carry observed counts 0, 1, 2.]
     Successive boards along one execution (the slide builds them up one at a time):

       ? ? ?     1 ? ?     1 1 ?     1 1 0
       ? 1 ?     ? 1 ?     ? 1 ?     ? 1 ?
       ? 2 ?     ? 2 ?     ? 2 ?     ? 2 ?
       ? ? ?     ? ? ?     ? ? ?     ? ? ?


  19. Policy Trees
     The natural representation:
     ◮ Node = action, edge = observation
     ◮ One history = one branch
     ◮ No child: stop
     Usage:
     ◮ typical output of planners
     ◮ policy typically found as a tree
     ◮ DAGs when equivalent situations are detected
     ◮ very verbose
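A policy tree maps naturally onto a small recursive data structure. The following Python sketch assumes string-valued actions and observations and a caller-supplied `observe` callback that executes an action and returns the observation; the names `Node` and `run_tree` are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    action: str
    children: dict = field(default_factory=dict)  # observation -> Node

def run_tree(root, observe):
    """Follow one branch of the tree; return the resulting history."""
    node, history = root, []
    while node is not None:
        obs = observe(node.action)
        history.append((node.action, obs))
        node = node.children.get(obs)  # no child for this observation: stop
    return history
```

Because each history has its own branch, the tree needs one node per reachable history prefix, which is where the verbosity noted above comes from.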

  20. Finite-State Controllers
     Natural compaction of trees:
     [Figure: a finite-state controller with nodes labelled by actions $a_0, \dots, a_4$ and edges labelled by observations $o_0$, $o_1$.]
     Usage:
     ◮ direct search in policy space
     ◮ representation of infinite policies
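A finite-state controller can be stored as two tables: one assigning an action to each controller node, one giving the successor node for each (node, observation) pair. The Python sketch below uses a made-up controller, since the one drawn on the slide cannot be recovered from the transcript. The function name `run_controller` and the `max_steps` bound are assumptions.

```python
def run_controller(actions, transitions, start, observe, max_steps):
    """Execute the controller for at most max_steps; return the trace."""
    node, trace = start, []
    for _ in range(max_steps):
        action = actions[node]
        obs = observe(action)
        trace.append((action, obs))
        if (node, obs) not in transitions:
            break  # no outgoing edge for this observation: stop
        node = transitions[(node, obs)]
    return trace

# Hypothetical two-node controller that alternates a0 and a1 while observing o0.
ACTIONS = {"q0": "a0", "q1": "a1"}
TRANSITIONS = {("q0", "o0"): "q1", ("q1", "o0"): "q0"}
```

Cycles in the transition table are what let a controller of fixed size represent a policy of unbounded length, which a tree cannot do.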

  21. Common Properties
     Note:
     ◮ there are other representations
     ◮ implicit representation as DAG (Brafman & Hoffmann 2005)
     ◮ with “pseudo-epistemic” literals (Albore, Geffner & Palacios 2009)
     Common properties:
     ◮ branching on observations
     ◮ “reactive”: execution is instantaneous at each timestep
     ◮ unreadable


  23. Syntax
     Essentially as defined by Fagin et al. in the 1990s:
       $\kappa ::= \varepsilon \mid a \mid \kappa ; \kappa \mid \textbf{if}\ \Theta\ \textbf{then}\ \kappa\ \textbf{else}\ \kappa\ \textbf{fi} \mid \textbf{while}\ \Theta\ \textbf{do}\ \kappa\ \textbf{od}$
     with $\Theta$ either
     ◮ a subjective epistemic formula over the variables $X$
     ◮ jo($o$) for some observation $o$
     Note: no auxiliary variables
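The grammar translates into a small abstract syntax tree with one class per production, plus an interpreter that threads a belief state through the program. In this Python sketch a test $\Theta$ is modelled as a plain predicate on the belief state, and `execute` is a caller-supplied function that performs an action and returns the updated belief state. Both are simplifying assumptions, and all class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Union

@dataclass
class Empty:          # epsilon
    pass

@dataclass
class Act:            # a
    name: str

@dataclass
class Seq:            # kappa ; kappa
    first: "KBP"
    second: "KBP"

@dataclass
class If:             # if Theta then kappa else kappa fi
    test: Callable
    then: "KBP"
    orelse: "KBP"

@dataclass
class While:          # while Theta do kappa od
    test: Callable
    body: "KBP"

KBP = Union[Empty, Act, Seq, If, While]

def run(prog, belief, execute):
    """Interpret `prog` from `belief`; return the final belief state."""
    if isinstance(prog, Empty):
        return belief
    if isinstance(prog, Act):
        return execute(prog.name, belief)
    if isinstance(prog, Seq):
        return run(prog.second, run(prog.first, belief, execute), execute)
    if isinstance(prog, If):
        branch = prog.then if prog.test(belief) else prog.orelse
        return run(branch, belief, execute)
    if isinstance(prog, While):
        while prog.test(belief):
            belief = run(prog.body, belief, execute)
        return belief
    raise TypeError(f"not a KBP: {prog!r}")
```

Tests only ever see the belief state, never the actual state, which mirrors the restriction to subjective formulas; there is also no store for auxiliary variables, as the slide notes.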
