Epistemic Planning for Implicit Coordination
Thomas Bolander, DTU Compute, Technical University of Denmark
Joint work with Thorsten Engesser, Robert Mattmüller and Bernhard Nebel from Uni Freiburg
Thomas Bolander, Epistemic Planning, M4M, 9 January 2017 – p. 1/23

Example: The helpful household robot
Essential features:
• No instructions are given to the robot.
• Multi-agent planning: The robot plans for both its own actions and the actions of the human.
• It does (dynamic) epistemic reasoning: It knows that the human doesn’t know the location of the hammer, and plans to inform him.
• It is altruistic: It seeks to minimise the number of actions the human has to execute.

The problem we wish to solve
We are interested in decentralised multi-agent planning where:
• The agents form a single coalition with a joint goal.
• Agents may differ arbitrarily in their uncertainty about the initial state and in their partial observability of actions (including higher-order uncertainty).
• Plans are computed by all agents, for all agents.
• Sequential execution: At every time step during plan execution, one action is randomly chosen among the agents who wish to act.
• No explicit coordination/negotiation/commitments/requests. Coordination is achieved implicitly via observing action outcomes (e.g. ontic actions or announcements).
We call it epistemic planning with implicit coordination.

Another example: Implicit robot coordination under partial observability
Joint goal: Both robots get to their respective goal cells. They can move one cell at a time. A cell can only contain one robot. Both robots only know the location of their own goal cell.

A simpler example: Stealing a diamond

And now, finally, some technicalities...
Setting: Multi-agent planning under higher-order partial observability.
Natural formal framework: Dynamic epistemic logic (DEL) [Baltag et al., 1998]. We use DEL with postconditions [van Ditmarsch and Kooi, 2008].
Language: $\phi ::= p \mid \neg\phi \mid \phi \wedge \phi \mid K_i \phi \mid C\phi \mid (a)\phi$, where $a$ is an (epistemic) action (to be defined later).
• $K_i \phi$ is read “agent $i$ knows that $\phi$”.
• $C\phi$ is read “it is common knowledge that $\phi$”.
• $(a)\phi$ is read “action $a$ is applicable and will result in $\phi$ holding”.

DEL by example: Cutting the red wire
I’m agent 0, my partner in crime is agent 1.
$r$: The red wire is the power cable for the alarm.
$l$: The alarm is activated.
$h$: Have diamond.
All indistinguishability relations are equivalence relations (S5).
[Figure: the product update $s \otimes a$. Epistemic model $s := (M, \{w_1\})$ with worlds $w_1 : r, l$ and $w_2 : l$, joined by an indistinguishability edge for agent 1. Event model $a := (E, \{e_1, e_1\})$ with events $e_1 : \langle r, \neg l \rangle$ and $e_2 : \langle \neg r, \top \rangle$, each written as $\langle$precond., postcond.$\rangle$ and joined by an edge for agents 0, 1. Resulting epistemic model $s \otimes a$ with worlds $w_1 e_1 : r$ and $w_2 e_2 : l$, joined by an edge for agent 1.]
• Designated worlds/events are marked in the figure.
• $s \models C l \wedge K_0 r \wedge \neg K_1 r \wedge K_0 \neg K_1 r$. (Truth in a model means truth in all designated worlds.)
• Event model: the action of cutting the red wire.
• $s \otimes a \models K_0 \neg l \wedge \neg K_1 \neg l \wedge K_0 \neg K_1 \neg l$.
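The product update on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary layout (`val`, `rel`, `des`), the helper names, and the use of plain Python predicates for preconditions are all hypothetical and not taken from the slides. The designated event set is taken to be `{e1}`; designating `e2` as well would give the same result here, because `w1` does not satisfy the precondition of `e2`.

```python
from itertools import product

def equiv(points, links=()):
    """Reflexive, symmetric closure of `links` (enough for two-point models)."""
    rel = {(p, p) for p in points}
    for p, q in links:
        rel |= {(p, q), (q, p)}
    return rel

# Epistemic model s: agent 1 cannot tell w1 from w2, agent 0 can.
s = {
    "val": {"w1": frozenset({"r", "l"}), "w2": frozenset({"l"})},
    "rel": {0: equiv(["w1", "w2"]), 1: equiv(["w1", "w2"], [("w1", "w2")])},
    "des": {"w1"},
}

# Event model a (cutting the red wire): neither agent can tell e1 from e2.
a = {
    "pre": {"e1": lambda v: "r" in v, "e2": lambda v: "r" not in v},
    "post": {"e1": {"l": False}, "e2": {}},  # e1 switches l off, e2 changes nothing
    "rel": {i: equiv(["e1", "e2"], [("e1", "e2")]) for i in (0, 1)},
    "des": {"e1"},  # assumption, see the lead-in
}

def product_update(s, a):
    """Keep pairs (w, e) where e's precondition holds in w, then apply e's postcondition."""
    val = {}
    for w, e in product(s["val"], a["pre"]):
        if a["pre"][e](s["val"][w]):
            atoms = set(s["val"][w])
            for atom, truth in a["post"][e].items():
                (atoms.add if truth else atoms.discard)(atom)
            val[(w, e)] = frozenset(atoms)
    # (w, e) and (v, f) are indistinguishable iff both components are.
    rel = {i: {(p, q) for p in val for q in val
               if (p[0], q[0]) in s["rel"][i] and (p[1], q[1]) in a["rel"][i]}
           for i in s["rel"]}
    des = {(w, e) for (w, e) in val if w in s["des"] and e in a["des"]}
    return {"val": val, "rel": rel, "des": des}

def knows(s, i, w, prop):
    """K_i prop at world w: prop holds in every world agent i cannot tell from w."""
    return all(prop(s, v) for v in s["val"] if (w, v) in s["rel"][i])

def holds(s, prop):
    """Truth in a model: truth in all designated worlds."""
    return all(prop(s, w) for w in s["des"])

not_l = lambda s, w: "l" not in s["val"][w]
k0_not_l = lambda s, w: knows(s, 0, w, not_l)
not_k1_not_l = lambda s, w: not knows(s, 1, w, not_l)
k0_not_k1_not_l = lambda s, w: knows(s, 0, w, not_k1_not_l)

t = product_update(s, a)
print(holds(t, k0_not_l), holds(t, not_k1_not_l), holds(t, k0_not_k1_not_l))
```

The three printed values correspond to the three conjuncts of the last formula on the slide, evaluated in the updated model `t`.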

Planning interpretation of DEL
[Figure: the product update from the previous slide with planning labels. The epistemic model (worlds $w_1 : r, l$ and $w_2 : l$) is the state $s$, the event model (events $e_1 : \langle r, \neg l \rangle$ and $e_2 : \langle \neg r, \top \rangle$) is the action $a$, $\otimes$ is the transition operator, and the model with worlds $w_1 e_1 : r$ and $w_2 e_2 : l$ is the resulting state $s \otimes a$.]
• States: Epistemic models.
• Actions: Event models.
• Result of applying an action in a state: Product update of state with action.
• Semantics: $s \models (a)\phi$ iff $a$ is applicable in $s$ and $s \otimes a \models \phi$.
• Example: $s \models (a)(\neg l \wedge \neg K_1 \neg l)$.

Planning to get the diamond
Definition. A planning task is $\Pi = (s_0, A, \omega, \phi_g)$ where
• $s_0$ is the initial state: an epistemic model.
• $A$ is the action library: a finite set of event models called actions.
• $\omega : A \to Ag$ is an owner function: it specifies who “owns” each action, that is, who is able to execute it.
• $\phi_g$ is a goal formula: a formula of epistemic logic.
Example
• $s_0$ = [Figure: epistemic model with worlds $r, l$ and $l$, joined by an edge for agent 1]
• $A = \{\mathit{cut\_red}, \mathit{take\_diam}\}$
• $\omega(\mathit{cut\_red}) = 0$; $\omega(\mathit{take\_diam}) = 1$
• $\mathit{cut\_red}$ = [Figure: event model with events $\langle r, \neg l \rangle$ and $\langle \neg r, \top \rangle$, joined by an edge for agents 0, 1]
• $\mathit{take\_diam}$ = [Figure: event model with events $\langle \neg l, h \rangle$ and $\langle l, c \rangle$] (where $c$: get caught)
• $\phi_g = h$

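Example continued
Consider again the planning task $\Pi$ from the previous slide (actions are $\mathit{cut\_red}$ and $\mathit{take\_diam}$, goal is $\phi_g = h$). A plan for $\Pi$ exists: $(\mathit{cut\_red}, \mathit{take\_diam})$, since
[Figure: two successive product updates. First, $s_0$ (worlds $r, l$ and $l$, edge for agent 1) updated with $\mathit{cut\_red}$ (events $\langle r, \neg l \rangle$ and $\langle \neg r, \top \rangle$, edge for agents 0, 1) gives $s_0 \otimes \mathit{cut\_red}$ (worlds $r$ and $l$, edge for agent 1). Second, updating with $\mathit{take\_diam}$ (events $\langle \neg l, h \rangle$ and $\langle l, c \rangle$) gives worlds $r, h$ and $l, c$ (edge labelled 1 in the figure), so $s_0 \otimes \mathit{cut\_red} \otimes \mathit{take\_diam} \models \phi_g$.]
Expressed syntactically: $s_0 \models (\mathit{cut\_red})(\mathit{take\_diam})\phi_g$.
This reads: “Executing the plan $(\mathit{cut\_red}, \mathit{take\_diam})$ in the initial state $s_0$ leads to the goal $\phi_g$ being satisfied.”
But the plan is not implicitly coordinated...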

Local states and perspective shifts
Consider the state $s$ after the red wire has been cut:
[Figure: $s$ = epistemic model with worlds $r$ and $l$, joined by an edge for agent 1, with a single designated world.]
$s$ is the global state of the system after the wire has been cut (a state with a single designated world). But $s$ is not the local state of agent 1 in this situation. The associated local state of agent 1, $s^1$, is obtained by closing the designated worlds under the indistinguishability relation of agent 1:
[Figure: $s^1$ = the same model with worlds $r$ and $l$, joined by an edge for agent 1, now closed under agent 1's indistinguishability relation.]
We have $s \models \neg l$ and $s^0 \models \neg l$ but $s^1 \not\models \neg l$. Hence agent 1 does not know that it is safe to take the diamond. Agent 0 can in $s^0 = s$ make a change of perspective to agent 1, that is, compute $s^1$, and conclude that agent 1 will not take the diamond.
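The perspective shift can be illustrated with a small Python sketch. The dictionary layout and the names `local_state` and `holds` are hypothetical choices made for this illustration; the world names `u` and `v` simply stand for the two worlds of the state after the wire has been cut.

```python
# State s after the red wire has been cut: u satisfies r (alarm off),
# v satisfies l (alarm still on). Only agent 1 confuses u and v.
s = {
    "val": {"u": frozenset({"r"}), "v": frozenset({"l"})},
    "rel": {
        0: {("u", "u"), ("v", "v")},
        1: {("u", "u"), ("v", "v"), ("u", "v"), ("v", "u")},
    },
    "des": {"u"},  # global state: a single designated world
}

def local_state(s, i):
    """Close the designated worlds under agent i's indistinguishability relation."""
    des = {v for w in s["des"] for v in s["val"] if (w, v) in s["rel"][i]}
    return {**s, "des": des}

def holds(s, prop):
    """Truth in a state: prop holds in every designated world."""
    return all(prop(s["val"][w]) for w in s["des"])

not_l = lambda atoms: "l" not in atoms

s0_local = local_state(s, 0)  # agent 0's local state, equal to s here
s1_local = local_state(s, 1)  # agent 1's local state, both worlds designated

print(holds(s, not_l), holds(s0_local, not_l), holds(s1_local, not_l))
```

Agent 0 can run `local_state(s, 1)` itself, which is the change of perspective described on the slide: it predicts that agent 1 cannot rule out the world where the alarm is still on.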

Example continued
• Agent 0 knows the plan $(\mathit{cut\_red}, \mathit{take\_diam})$ works: $s_0 \models K_0 (\mathit{cut\_red})(\mathit{take\_diam})\phi_g$.
• Agent 1 does not know the plan works, and agent 0 knows this: $s_0 \models \neg K_1 (\mathit{cut\_red})(\mathit{take\_diam})\phi_g \wedge K_0 (\neg K_1 (\mathit{cut\_red})(\mathit{take\_diam})\phi_g)$.
• Even after the wire has been cut, agent 1 does not know she can achieve the goal by $\mathit{take\_diam}$: $s_0 \models (\mathit{cut\_red}) \neg K_1 (\mathit{take\_diam})\phi_g$.
Consider adding an announcement action $\mathit{tell}_{\neg l}$ with $\omega(\mathit{tell}_{\neg l}) = 0$. Then:
• Agent 0 knows the plan $(\mathit{cut\_red}, \mathit{tell}_{\neg l}, \mathit{take\_diam})$ works: $s_0 \models K_0 (\mathit{cut\_red})(\mathit{tell}_{\neg l})(\mathit{take\_diam})\phi_g$.
• Agent 1 still does not know the plan works: $s_0 \models \neg K_1 (\mathit{cut\_red})(\mathit{tell}_{\neg l})(\mathit{take\_diam})\phi_g$.
• But agent 1 will know in due time, and agent 0 knows this: $s_0 \models K_0 (\mathit{cut\_red})(\mathit{tell}_{\neg l}) K_1 (\mathit{take\_diam})\phi_g$.

Implicitly coordinated sequential plans
Definition. Given a planning task $\Pi = (s_0, A, \omega, \phi_g)$, an implicitly coordinated plan is a sequence $\pi = (a_1, \ldots, a_n)$ of actions from $A$ such that
$s_0 \models K_{\omega(a_1)} (a_1) K_{\omega(a_2)} (a_2) \cdots K_{\omega(a_n)} (a_n) \phi_g$.
In words: The owner of the first action $a_1$ knows that $a_1$ is initially applicable and will lead to a situation where the owner of the second action $a_2$ knows that $a_2$ is applicable and will lead to a situation where... the owner of the $n$th action $a_n$ knows that $a_n$ is applicable and will lead to the goal being satisfied.
Example. For the diamond stealing task, $(\mathit{cut\_red}, \mathit{take\_diam})$ is not an implicitly coordinated plan, but $(\mathit{cut\_red}, \mathit{tell}_{\neg l}, \mathit{take\_diam})$ is.
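The definition can be checked mechanically on the diamond example. The following Python sketch is a simplified illustration, not the authors' implementation, and it rests on several assumptions that are not stated on the slides: formulas are nested tuples, every event of an action is designated, an action counts as applicable at a world when some event precondition holds there, $\mathit{take\_diam}$ is fully observable, and $\mathit{tell}_{\neg l}$ is modelled as a public announcement with the single precondition $\neg l$.

```python
from itertools import product

def equiv(points, links=()):
    """Reflexive, symmetric closure of `links` (enough for these small models)."""
    rel = {(p, p) for p in points}
    for p, q in links:
        rel |= {(p, q), (q, p)}
    return rel

def update(s, a):
    """Product update; assumes every event of a is designated."""
    val = {}
    for w, e in product(s["val"], a["pre"]):
        if sat(s, w, a["pre"][e]):
            atoms = set(s["val"][w])
            for atom, truth in a["post"][e].items():
                (atoms.add if truth else atoms.discard)(atom)
            val[(w, e)] = frozenset(atoms)
    rel = {i: {(p, q) for p in val for q in val
               if (p[0], q[0]) in s["rel"][i] and (p[1], q[1]) in a["rel"][i]}
           for i in s["rel"]}
    return {"val": val, "rel": rel, "des": {p for p in val if p[0] in s["des"]}}

def sat(s, w, f):
    """Truth of formula f at world w of state s."""
    op = f[0]
    if op == "atom":
        return f[1] in s["val"][w]
    if op == "not":
        return not sat(s, w, f[1])
    if op == "K":   # ("K", agent, formula)
        return all(sat(s, v, f[2]) for v in s["val"] if (w, v) in s["rel"][f[1]])
    if op == "do":  # ("do", action, formula): applicable at w, and formula holds afterwards
        t = update({**s, "des": {w}}, f[1])
        return bool(t["des"]) and all(sat(t, v, f[2]) for v in t["des"])
    raise ValueError(op)

def holds(s, f):
    return all(sat(s, w, f) for w in s["des"])

def coordinated(plan, owner, goal):
    """Build K_owner(a1) (a1) ... K_owner(an) (an) goal from a list of (name, action)."""
    f = goal
    for name, action in reversed(plan):
        f = ("K", owner[name], ("do", action, f))
    return f

r, l, h = ("atom", "r"), ("atom", "l"), ("atom", "h")
both = lambda pts: {0: equiv(pts, [tuple(pts)]), 1: equiv(pts, [tuple(pts)])}
none = lambda pts: {0: equiv(pts), 1: equiv(pts)}

s0 = {"val": {"w1": frozenset({"r", "l"}), "w2": frozenset({"l"})},
      "rel": {0: equiv(["w1", "w2"]), 1: equiv(["w1", "w2"], [("w1", "w2")])},
      "des": {"w1"}}
cut_red = {"pre": {"e1": r, "e2": ("not", r)},
           "post": {"e1": {"l": False}, "e2": {}}, "rel": both(["e1", "e2"])}
take_diam = {"pre": {"f1": ("not", l), "f2": l},
             "post": {"f1": {"h": True}, "f2": {"c": True}}, "rel": none(["f1", "f2"])}
tell_not_l = {"pre": {"g": ("not", l)}, "post": {"g": {}}, "rel": none(["g"])}

owner = {"cut_red": 0, "tell_not_l": 0, "take_diam": 1}
short = [("cut_red", cut_red), ("take_diam", take_diam)]
long_ = [("cut_red", cut_red), ("tell_not_l", tell_not_l), ("take_diam", take_diam)]

print(holds(s0, coordinated(short, owner, h)))  # plan without the announcement
print(holds(s0, coordinated(long_, owner, h)))  # plan with the announcement
```

Under these assumptions the first check fails and the second succeeds, matching the example on the slide: without the announcement, agent 1 cannot rule out the world in which the alarm is still on when it is her turn to act.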

Household robot example
$s_0 \models K_r (\mathit{get\_hammer}) K_h (\mathit{hang\_up\_picture}) \phi_g$
$s_0 \models K_r (\mathit{tell\_hammer\_location}) K_h (\mathit{get\_hammer}) K_h (\mathit{hang\_up\_picture}) \phi_g$
If the robot is eager to help, it will prefer implicitly coordinated plans in which it itself acts whenever possible. If it is altruistic, it will try to minimise the actions of the human.

From sequential plans to policies
Sequential plans are not in general sufficient. We need to define policies: mappings from states to actions...

Implicitly coordinated policies by example
Below: Initial segment of the execution tree of an implicitly coordinated policy for the square robot (that is, an implicitly coordinated policy for the planning task where the initial state is $s_0$).
[Figure: execution tree with edges labelled right, right, down, down, left, left, left, left.]
