PDL as a Multi-Agent Strategy Logic

Jan van Eijck
CWI & ILLC, Amsterdam
September 17, 2012

Abstract
We propose a new perspective on PDL as a multi-agent strategic logic (MASL). This logic for strategic reasoning has group strategies as first-class citizens, and brings game logic closer to standard modal logic. We show that MASL can express key notions of game theory, social choice theory and voting theory in a natural way. We then present a sound and complete proof system for MASL. We end by tracing connections to a number of other logics for reasoning about strategies.

Overview
• The Pebble Puzzle
• Reasoning About Programs
• Reasoning About Actions
• Strategic Games: the Prisoner’s Dilemma
• Group Strategies in Games
• Key Notions: Best Response, Nash Equilibrium
• Voting as a Multi-Agent Game
• MASL: Language and Expressiveness
• Soundness and Completeness
• Connections, Further Work

The Pebble Puzzle
An urn contains 70 pebbles; 35 of them are white and 35 are black. There is a pile of black pebbles available outside the urn.
Pebble Algorithm
• While there are at least two pebbles in the urn:
  – pick two pebbles; if they have the same colour, put back a black pebble, otherwise put back the white pebble.
In every step of the algorithm one pebble gets removed. After 69 steps, there is one pebble left. What is its colour?

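module Pebbles where

data Color = W | B deriving (Eq, Show)

-- Repeatedly replace the first two pebbles by one, following the puzzle's rule.
drawPebble :: [Color] -> [Color]
drawPebble []       = []
drawPebble [x]      = [x]
drawPebble (W:W:xs) = drawPebble (B:xs)  -- same colour: put back a black
drawPebble (B:B:xs) = drawPebble (B:xs)  -- same colour: put back a black
drawPebble (W:B:xs) = drawPebble (W:xs)  -- different colours: put back the white
drawPebble (B:W:xs) = drawPebble (W:xs)  -- different colours: put back the white

-- Number of white pebbles.
numberW :: [Color] -> Int
numberW = length . filter (== W)

-- Parity of the number of white pebbles.
parityW :: [Color] -> Int
parityW xs = mod (numberW xs) 2

-- The parity of the white pebbles is invariant under the algorithm.
prop_invariant :: [Color] -> Bool
prop_invariant = \xs -> parityW xs == parityW (drawPebble xs)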
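The invariant can also be cross-checked by brute force. The following is a minimal sketch, not part of the original module: it tracks only the (white, black) counts and explores every possible sequence of draws. The names `Urn`, `step` and `reachableFinals` are illustrative assumptions.

```haskell
module Main where

import Data.List (nub, partition)

-- An urn is modelled by its (white, black) pebble counts.
type Urn = (Int, Int)

-- All urns reachable by one step of the pebble algorithm.
step :: Urn -> [Urn]
step (w, b) = nub $
     [ (w - 2, b + 1) | w >= 2 ]          -- two whites drawn: put back a black
  ++ [ (w, b - 1)     | b >= 2 ]          -- two blacks drawn: put back a black
  ++ [ (w, b - 1)     | w >= 1, b >= 1 ]  -- mixed pair: put back the white

-- Explore level by level; every step removes one pebble, so this terminates.
reachableFinals :: Urn -> [Urn]
reachableFinals start = go [start] []
  where
    go []       done = nub done
    go frontier done =
      let (stuck, moving) = partition (null . step) frontier
      in  go (nub (concatMap step moving)) (stuck ++ done)

main :: IO ()
main = print (reachableFinals (35, 35))
```

Whatever pairs are drawn, the only final urn reachable from (35, 35) is (1, 0): the number of white pebbles stays odd, so the last pebble is white.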

Sir Tony Hoare

Formal Specification With Hoare Triples
In general, a triple of the form initial state, statement, final state, written $\{P\}\ S\ \{Q\}$, has the following operational meaning: if execution of $S$ in a state that satisfies $P$ terminates, then the termination state is guaranteed to satisfy $Q$. Such triples $\{P\}\ S\ \{Q\}$ are called Hoare triples after Tony Hoare. The predicate for the initial state is called the precondition, and the predicate for the final state is called the postcondition.

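assignment ($\varphi^{a}_{v}$ is $\varphi$ with $a$ substituted for $v$):
$$\{\varphi^{a}_{v}\}\ v := a\ \{\varphi\}$$
skip:
$$\{\varphi\}\ \mathrm{SKIP}\ \{\varphi\}$$
sequence:
$$\frac{\{\varphi\}\ C_1\ \{\psi\} \qquad \{\psi\}\ C_2\ \{\chi\}}{\{\varphi\}\ C_1 ; C_2\ \{\chi\}}$$
conditional choice:
$$\frac{\{\varphi \wedge B\}\ C_1\ \{\psi\} \qquad \{\varphi \wedge \neg B\}\ C_2\ \{\psi\}}{\{\varphi\}\ \text{if } B \text{ then } C_1 \text{ else } C_2\ \{\psi\}}$$
guarded iteration:
$$\frac{\{\varphi \wedge B\}\ C\ \{\varphi\}}{\{\varphi\}\ \text{while } B \text{ do } C\ \{\varphi \wedge \neg B\}}$$
precondition strengthening:
$$\frac{\mathbb{N} \models \varphi' \to \varphi \qquad \{\varphi\}\ C\ \{\psi\}}{\{\varphi'\}\ C\ \{\psi\}}$$
postcondition weakening:
$$\frac{\{\varphi\}\ C\ \{\psi\} \qquad \mathbb{N} \models \psi \to \psi'}{\{\varphi\}\ C\ \{\psi'\}}$$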

Vaughan Pratt

Hoare Logic as a Fragment of Dynamic Logic
Hoare logic is a fragment of a more general system of (propositional) dynamic logic. The language of propositional dynamic logic was defined by Pratt in [13, 14] as a generic language for reasoning about computation. Axiomatisations were given independently by Segerberg [16], Fischer/Ladner [8], and Parikh [10]. These axiomatisations make the connection between propositional dynamic logic and modal logic very clear.

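PDL Language
Let $p$ range over the set of basic propositions $P$, and let $a$ range over a set of basic actions $A$. Then the formulae $\varphi$ and programs $\alpha$ of propositional dynamic logic are given by:
$$\varphi ::= \top \mid p \mid \neg\varphi \mid \varphi_1 \vee \varphi_2 \mid \langle\alpha\rangle\varphi$$
$$\alpha ::= a \mid {?\varphi} \mid \alpha_1 ; \alpha_2 \mid \alpha_1 \cup \alpha_2 \mid \alpha^*$$
Abbreviation: $[\alpha]\varphi$ abbreviates $\neg\langle\alpha\rangle\neg\varphi$.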

Expressing Hoare Triples in PDL
Floyd-Hoare correctness assertions are expressible in PDL, as follows. If $\varphi, \psi$ are PDL formulae and $\alpha$ is a PDL program, then $\{\varphi\}\ \alpha\ \{\psi\}$ translates into $\varphi \to [\alpha]\psi$. Clearly, $\{\varphi\}\ \alpha\ \{\psi\}$ holds in a state in a model iff $\varphi \to [\alpha]\psi$ is true in that state in that model.
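To make the translation concrete, here is a minimal sketch of a PDL model checker over a finite Kripke model, with a Hoare triple encoded as an implication into a box formula. All names (`Model`, `holds`, `post`, `hoare`, `while`, `example`) and the small countdown model are assumptions for illustration; they do not come from the slides.

```haskell
module Main where

import Data.List (nub)

type State = Int

-- A finite Kripke model: states, valuation, and one relation per basic action.
data Model = Model
  { states :: [State]
  , val    :: String -> [State]            -- where a proposition holds
  , rel    :: String -> [(State, State)]   -- transitions of a basic action
  }

data Form = Top | Prop String | Neg Form | Or Form Form | Dia Prog Form
data Prog = Act String | Test Form | Seq Prog Prog | Choice Prog Prog | Star Prog

-- Derived connectives: [a]f and f -> g.
box :: Prog -> Form -> Form
box a f = Neg (Dia a (Neg f))

impl :: Form -> Form -> Form
impl f g = Or (Neg f) g

-- Truth of a formula at a state.
holds :: Model -> State -> Form -> Bool
holds _ _ Top       = True
holds m s (Prop p)  = s `elem` val m p
holds m s (Neg f)   = not (holds m s f)
holds m s (Or f g)  = holds m s f || holds m s g
holds m s (Dia a f) = any (\t -> holds m t f) (post m a s)

-- States reachable from s by executing program a.
post :: Model -> Prog -> State -> [State]
post m (Act a)      s = [ t | (s', t) <- rel m a, s' == s ]
post m (Test f)     s = [ s | holds m s f ]
post m (Seq a b)    s = nub (concatMap (post m b) (post m a s))
post m (Choice a b) s = nub (post m a s ++ post m b s)
post m (Star a)     s = closure [s]
  where
    closure seen =
      let new = nub [ t | u <- seen, t <- post m a u, t `notElem` seen ]
      in  if null new then seen else closure (seen ++ new)

-- {pre} prog {post'} as the PDL formula pre -> [prog] post'.
hoare :: Form -> Prog -> Form -> Form
hoare pre prog post' = impl pre (box prog post')

-- WHILE b DO c as the PDL program (?b ; c)* ; ?(not b).
while :: Form -> Prog -> Prog
while b c = Seq (Star (Seq (Test b) c)) (Test (Neg b))

-- Hypothetical model: a counter 0..3, "dec" decrements, "zero" holds at 0.
example :: Model
example = Model
  { states = [0 .. 3]
  , val    = \p -> if p == "zero" then [0] else []
  , rel    = \a -> if a == "dec" then [ (n, n - 1) | n <- [1 .. 3] ] else []
  }

main :: IO ()
main = print [ holds example s triple | s <- states example ]
  where
    triple = hoare Top (while (Neg (Prop "zero")) (Act "dec")) (Prop "zero")
```

In this model the triple {⊤} WHILE ¬zero DO dec {zero} is true at every state, so `main` prints a list of four `True` values.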

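PDL Axiomatisation
Axioms are all propositional tautologies, plus the following axioms (we give box ($[\alpha]$) versions here, but every axiom has an equivalent diamond ($\langle\alpha\rangle$) version):
(K) $\vdash [\alpha](\varphi \to \psi) \to ([\alpha]\varphi \to [\alpha]\psi)$
(test) $\vdash [?\varphi_1]\varphi_2 \leftrightarrow (\varphi_1 \to \varphi_2)$
(sequence) $\vdash [\alpha_1 ; \alpha_2]\varphi \leftrightarrow [\alpha_1][\alpha_2]\varphi$
(choice) $\vdash [\alpha_1 \cup \alpha_2]\varphi \leftrightarrow [\alpha_1]\varphi \wedge [\alpha_2]\varphi$
(mix) $\vdash [\alpha^*]\varphi \leftrightarrow \varphi \wedge [\alpha][\alpha^*]\varphi$
(induction) $\vdash (\varphi \wedge [\alpha^*](\varphi \to [\alpha]\varphi)) \to [\alpha^*]\varphi$
and the following rules of inference:
(modus ponens) From $\vdash \varphi_1$ and $\vdash \varphi_1 \to \varphi_2$, infer $\vdash \varphi_2$.
(modal generalisation) From $\vdash \varphi$, infer $\vdash [\alpha]\varphi$.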

The Loop Invariance Rule
In the presence of the other axioms, the induction axiom is equivalent to the loop invariance rule:
$$\frac{\vdash \varphi \to [\alpha]\varphi}{\vdash \varphi \to [\alpha^*]\varphi}$$

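Deriving Hoare Rules in PDL
The Floyd-Hoare inference rules can now be derived in PDL. As an example we derive the rule for guarded iteration:
$$\frac{\{\varphi \wedge \psi\}\ \alpha\ \{\psi\}}{\{\psi\}\ \text{WHILE } \varphi \text{ DO } \alpha\ \{\neg\varphi \wedge \psi\}}$$
Let the premise $\{\varphi \wedge \psi\}\ \alpha\ \{\psi\}$ be given, i.e. assume (1).
$$\vdash (\varphi \wedge \psi) \to [\alpha]\psi \qquad (1)$$
We wish to derive the conclusion $\{\psi\}\ \text{WHILE } \varphi \text{ DO } \alpha\ \{\neg\varphi \wedge \psi\}$, i.e. we wish to derive (2).
$$\vdash \psi \to [(?\varphi ; \alpha)^* ; {?\neg\varphi}](\neg\varphi \wedge \psi) \qquad (2)$$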

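From (1) by means of propositional reasoning:
$$\vdash \psi \to (\varphi \to [\alpha]\psi).$$
From this, by means of the test and sequence axioms:
$$\vdash \psi \to [?\varphi ; \alpha]\psi.$$
Applying the loop invariance rule gives:
$$\vdash \psi \to [(?\varphi ; \alpha)^*]\psi.$$
Since $\psi$ is propositionally equivalent with $\neg\varphi \to (\neg\varphi \wedge \psi)$, we get from this by propositional reasoning:
$$\vdash \psi \to [(?\varphi ; \alpha)^*](\neg\varphi \to (\neg\varphi \wedge \psi)).$$
By the test axiom, $\neg\varphi \to (\neg\varphi \wedge \psi)$ is equivalent to $[?\neg\varphi](\neg\varphi \wedge \psi)$, so
$$\vdash \psi \to [(?\varphi ; \alpha)^*][?\neg\varphi](\neg\varphi \wedge \psi),$$
and the sequence axiom yields the desired result (2).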

Strategic Games: The Prisoner’s Dilemma

             cooperate   defect
cooperate    c, c        c, d
defect       d, c        d, d

With output function $o : \{c, d\}^2 \to \{x, y, z, u\}^2$:

             cooperate   defect
cooperate    x, x        y, z
defect       z, y        u, u

Fixing the preferences of the players: $z > x > u > y$. With numerical utilities:

             cooperate   defect
cooperate    2, 2        0, 3
defect       3, 0        1, 1

Group Strategies in the PD Game are the Strategy Profiles
[Figure: diagrams over the four strategy profiles cc, cd, dc and dd.]

Key Notions: Best Response
Let $(s'_i, s_{-i})$ be the strategy profile that is like $s$ for all players except $i$, but has $s_i$ replaced by $s'_i$. A strategy $s_i$ is a best response in $s$ if
$$\forall s'_i \in S_i \quad u_i(s) \geq u_i(s'_i, s_{-i}).$$
Example in the PD game. Let $s = (d, c)$: the first player defects, the second player cooperates. Is $d$ a best response for player 1 in $(d, c)$? Yes, because $(d, c)$ gives payoff 3 for player 1, while the alternative $(c, c)$ only gives payoff 2. So player 1 cannot do better than play $d$.
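The best-response check for the PD game can be sketched directly from the definition, using the numerical utilities given earlier. The names `Move`, `payoff`, `utility`, `deviate` and `bestResponse` are illustrative assumptions, and players are numbered 1 and 2.

```haskell
module Main where

data Move = C | D deriving (Eq, Show, Enum, Bounded)

type Profile = (Move, Move)

-- Numerical utilities of the Prisoner's Dilemma: (player 1, player 2).
payoff :: Profile -> (Int, Int)
payoff (C, C) = (2, 2)
payoff (C, D) = (0, 3)
payoff (D, C) = (3, 0)
payoff (D, D) = (1, 1)

-- Utility of player i (1 or 2) in a profile.
utility :: Int -> Profile -> Int
utility 1 = fst . payoff
utility _ = snd . payoff

-- Replace player i's move in a profile, giving (s'_i, s_{-i}).
deviate :: Int -> Move -> Profile -> Profile
deviate 1 m (_, y) = (m, y)
deviate _ m (x, _) = (x, m)

-- Player i's move in s is a best response if no unilateral deviation pays more.
bestResponse :: Int -> Profile -> Bool
bestResponse i s =
  and [ utility i s >= utility i (deviate i m s) | m <- [minBound .. maxBound] ]

main :: IO ()
main = do
  print (bestResponse 1 (D, C))  -- player 1 defecting against a cooperator
  print (bestResponse 2 (D, C))  -- player 2 cooperating against a defector
```

For the profile (d, c), player 1's move is a best response, but player 2's is not: switching to d would raise player 2's payoff from 0 to 1.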

John Nash

Key Notions: Pure Nash Equilibrium
A strategy profile $s$ is a (pure) Nash equilibrium if each $s_i$ is a best response in $s$:
$$\forall i \in N\ \forall s'_i \in S_i \quad u_i(s) \geq u_i(s'_i, s_{-i}).$$
A game $G$ is Nash if $G$ has a (pure) Nash equilibrium. $(d, d)$ is a Nash equilibrium for the PD game, so the PD game is Nash.
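A minimal sketch that enumerates the pure Nash equilibria of a two-player, two-move game. The game `pd` uses the PD utilities from the slides; `pennies` (matching pennies) is an added illustration that is not in the original, included only as an example of a game that is not Nash. All function names are assumptions.

```haskell
module Main where

data Move = C | D deriving (Eq, Show)

type Profile = (Move, Move)
type Game    = Profile -> (Int, Int)  -- utilities for (player 1, player 2)

moves :: [Move]
moves = [C, D]

-- The Prisoner's Dilemma with the numerical utilities from the slides.
pd :: Game
pd (C, C) = (2, 2)
pd (C, D) = (0, 3)
pd (D, C) = (3, 0)
pd (D, D) = (1, 1)

-- Matching pennies (illustration only): player 1 wins on equal moves.
pennies :: Game
pennies (x, y)
  | x == y    = (1, -1)
  | otherwise = (-1, 1)

-- A profile is a pure Nash equilibrium if each move is a best response.
isNash :: Game -> Profile -> Bool
isNash g (x, y) =
     and [ fst (g (x, y)) >= fst (g (x', y)) | x' <- moves ]
  && and [ snd (g (x, y)) >= snd (g (x, y')) | y' <- moves ]

nashEquilibria :: Game -> [Profile]
nashEquilibria g = [ (x, y) | x <- moves, y <- moves, isNash g (x, y) ]

-- A game is Nash if it has at least one pure Nash equilibrium.
isNashGame :: Game -> Bool
isNashGame = not . null . nashEquilibria

main :: IO ()
main = do
  print (nashEquilibria pd)
  print (isNashGame pennies)
```

The only pure equilibrium found for `pd` is (D, D), in line with the slide, while matching pennies has none.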

Charles Dodgson, also known as Lewis Carroll

Voting as a Multi-Agent Game
Voting can be seen as a form of multi-agent decision making, with the voters as agents [7]. Voting is the process of selecting an item or a set of items from a finite set $A$ of alternatives, on the basis of the stated preferences of a set of voters. We assume that the preferences of a voter are represented by a ballot: a linear ordering of $A$. Let $\mathrm{ord}(A)$ be the set of all ballots on $A$. If there are three alternatives $a, b, c$, and a voter prefers $a$ over $b$ and $b$ over $c$, then her ballot is $abc$.

Example
• Assume there are three voters $\{1, 2, 3\}$.
• Assume there are three alternatives $\{a, b, c\}$.
• Then profiles are vectors of ballots.
• Example profile where the first voter has ballot $abc$, the second voter has ballot $abc$, and the third voter has ballot $bca$: $(abc, abc, bca)$.

Voting Rules
A voting rule $V$ for a set of alternatives $A$ is a function from $A$-profiles to $\mathcal{P}^+(A)$ (the set of non-empty subsets of $A$). If $V(P) = B$, then the members of $B$ are called the winners of $P$ under $V$. A voting rule is resolute if $V(P)$ is a singleton for any profile $P$.
Example voting rule: voting by absolute majority. It selects an alternative with more than 50% of the votes (first places on the ballots) as winner, and returns the whole set of alternatives otherwise. For the profile $(abc, abc, bca)$, absolute majority selects $a$ as winner, for $a$ has two votes and $b$ has one.
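The absolute majority rule can be sketched as follows, assuming that a vote for an alternative means that it is ranked first on a ballot, as in the example above. The type and function names are illustrative, and ballots are written as strings with the most preferred alternative first.

```haskell
module Main where

type Alt     = Char
type Ballot  = [Alt]     -- most preferred first, e.g. "abc"
type Profile = [Ballot]

-- Absolute majority: an alternative ranked first on more than half of the
-- ballots is the single winner; otherwise all alternatives are returned.
absoluteMajority :: [Alt] -> Profile -> [Alt]
absoluteMajority alts profile =
  case [ a | a <- alts, 2 * firsts a > length profile ] of
    [w] -> [w]
    _   -> alts
  where
    -- Number of ballots that rank alternative a first.
    firsts a = length [ () | (top : _) <- profile, top == a ]

main :: IO ()
main = do
  print (absoluteMajority "abc" ["abc", "abc", "bca"])
  print (absoluteMajority "abc" ["abc", "bca", "cab"])
```

The first profile is the one from the slide and yields the single winner a; in the second profile no alternative has more than half of the first places, so the whole set of alternatives is returned. The rule is therefore not resolute.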

Strategizing in Voting: Gibbard-Satterthwaite
Ballot Profile: a vector of ballots.
Resolute Voting Rule: a function $V$ from ballot profiles to alternatives.
$P \sim_i P'$: $P$ and $P'$ differ at most in the ballot for $i$.
Strategy-Proofness: $V$ is strategy-proof if $P \sim_i P'$ implies that, from the perspective of $P$ (that is, according to $i$'s ballot in $P$), $V(P)$ is at least as good for $i$ as $V(P')$.
(Weak) Non-Imposition: $V$ has at least three possible outcomes.
Dictatorship: $V$ is a dictatorship if there is some voter $k$ such that $V$ maps any profile to the top-ranking alternative in the $k$-ballot.
GS Theorem: any resolute voting rule that is strategy-proof and weakly non-imposed is a dictatorship.
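These definitions can be tested by brute force on a small instance. The sketch below assumes three voters and three alternatives, and uses a Borda count with alphabetical tie-breaking as an example of a resolute, non-dictatorial rule; that rule and all names here are illustrative assumptions, not taken from the slides.

```haskell
module Main where

import Data.List (elemIndex, nub, permutations, sortOn)
import Data.Maybe (fromJust)

type Alt     = Char
type Ballot  = [Alt]           -- most preferred first
type Profile = [Ballot]
type Rule    = Profile -> Alt  -- a resolute voting rule

alts :: [Alt]
alts = "abc"

ballots :: [Ballot]
ballots = permutations alts

-- All profiles for n voters.
profiles :: Int -> [Profile]
profiles n = sequence (replicate n ballots)

-- Does ballot b rank x at least as high as y?
atLeastAsGood :: Ballot -> Alt -> Alt -> Bool
atLeastAsGood b x y = fromJust (elemIndex x b) <= fromJust (elemIndex y b)

-- All P' with P ~i P': voter i's ballot replaced by an arbitrary ballot.
variants :: Int -> Profile -> [Profile]
variants i p = [ take i p ++ [b] ++ drop (i + 1) p | b <- ballots ]

-- No voter can ever gain by misreporting, judged by the ballot in P.
strategyProof :: Int -> Rule -> Bool
strategyProof n v =
  and [ atLeastAsGood (p !! i) (v p) (v p')
      | p <- profiles n, i <- [0 .. n - 1], p' <- variants i p ]

-- Weak non-imposition: at least three possible outcomes.
nonImposed :: Int -> Rule -> Bool
nonImposed n v = length (nub (map v (profiles n))) >= 3

-- Voter k (counting from 0) dictates the outcome.
dictator :: Int -> Rule
dictator k p = head (p !! k)

-- Borda count; ties are broken alphabetically (an assumption of this sketch).
borda :: Rule
borda p = head (sortOn (\a -> (negate (score a), a)) alts)
  where
    score a = sum [ length alts - 1 - fromJust (elemIndex a b) | b <- p ]

main :: IO ()
main = do
  print (strategyProof 3 (dictator 0), nonImposed 3 (dictator 0))
  print (strategyProof 3 borda, nonImposed 3 borda)
```

As the GS theorem predicts, the dictatorship passes the strategy-proofness check, while the non-imposed, non-dictatorial Borda rule fails it. One manipulation: in (abc, bca, cab) all scores tie and a wins, but if the second voter reports cba instead, c wins, which that voter prefers to a.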
