Probabilistic Model Checking for Games of imperfect information




slide-1
SLIDE 1

Probabilistic Model Checking for Games of imperfect information

  • P. Ballarini, M. Fisher, M. Wooldridge

University of Liverpool

slide-2
SLIDE 2

What is this work about

Uncertainty is relevant for a specific class of games (games of imperfect information).

Q: Can we apply probabilistic model checking to analyse games in which players' behaviour is characterised by uncertainty?

slide-3
SLIDE 3

Motivation

Model checking for Multi-Agent Systems (MAS):

  • LTL model checking of BDI MAS (Bordini): AgentSpeak -> Promela (SPIN), AgentSpeak -> Java (Java-PathFinder)
  • can we extend it to probabilistic model checking so that uncertain behaviour can be accounted for?
  • can we use any existing probabilistic modelling framework (PRISM?) to reason about uncertain MAS?
  • we need a new language for uncertain MAS (Probmela)

slide-4
SLIDE 4

Outline

  • Games, strategies, equilibria
  • strategic games, equilibria
  • extensive games (perfect/imperfect information)
  • Alternating offers negotiation game
  • Markovian model of the Alternating offers game
  • Analysis through Model Checking
  • Conclusion
slide-5
SLIDE 5

Strategic Games

the outcome of the game is achieved in one shot

$G = \langle N, (A_i), (\succsim_i) \rangle$

  • set of players: $N = \{1, \dots, n\}$
  • players' actions: $A_i = \{a_1, a_2, \dots, a_k\}$
  • players' preferences: $\succsim_i$, a relation over outcome utilities

an action profile is a combination of actions: $a = (a_1, a_2, \dots, a_n)$

the outcome of an action profile is denoted $O(a_1, a_2, \dots, a_n)$

slide-6
SLIDE 6

Example- Battle of Sexes

  • two people wish to go out together to a concert of music by either the “Red Hot Chili Peppers” or “Bach”
  • their main concern is to go out together, but one prefers the “Peppers” and the other “Bach”
  • individuals' preferences are represented by payoff functions

Payoffs (Stephen, Jane):

                      Jane: Peppers   Jane: Bach
  Stephen: Peppers    (2,1)           (0,0)
  Stephen: Bach       (0,0)           (1,2)

slide-7
SLIDE 7

Example- Battle of Sexes

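Payoffs (Stephen, Jane):

                      Jane: Peppers   Jane: Bach
  Stephen: Peppers    (2,1)           (0,0)
  Stephen: Bach       (0,0)           (1,2)

Stephen's preferences:

$(\text{Peppers}, \text{Peppers}) \succ_S (\text{Bach}, \text{Bach}) \succ_S (\text{Peppers}, \text{Bach}) \sim_S (\text{Bach}, \text{Peppers})$

Jane's preferences:

$(\text{Bach}, \text{Bach}) \succ_J (\text{Peppers}, \text{Peppers}) \succ_J (\text{Peppers}, \text{Bach}) \sim_J (\text{Bach}, \text{Peppers})$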

slide-8
SLIDE 8

Nash Equilibria

  • a profile of actions is a Nash Equilibrium iff no player has an interest in adopting another strategy, assuming the other player sticks to their own
  • the Battle of Sexes has 2 Equilibria: (Peppers,Peppers) and (Bach,Bach), i.e. togetherness rules

Payoffs (Stephen, Jane):

                      Jane: Peppers   Jane: Bach
  Stephen: Peppers    (2,1)           (0,0)
  Stephen: Bach       (0,0)           (1,2)
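
The equilibrium condition can be checked mechanically by trying every unilateral deviation. The following is a minimal sketch, assuming payoffs are ordered (Stephen, Jane) and profiles are ordered (Stephen's action, Jane's action); the names `PAYOFFS`, `is_nash` and `pure_nash_equilibria` are illustrative and do not come from the slides.

```python
from itertools import product

# Payoffs (Stephen, Jane) for each profile (Stephen's action, Jane's action).
PAYOFFS = {
    ("Peppers", "Peppers"): (2, 1),
    ("Peppers", "Bach"): (0, 0),
    ("Bach", "Peppers"): (0, 0),
    ("Bach", "Bach"): (1, 2),
}
ACTIONS = ("Peppers", "Bach")


def is_nash(profile, payoffs=PAYOFFS, actions=ACTIONS):
    """True if no single player gains by deviating unilaterally."""
    for player in range(len(profile)):
        current = payoffs[profile][player]
        for alternative in actions:
            deviation = list(profile)
            deviation[player] = alternative
            if payoffs[tuple(deviation)][player] > current:
                return False
    return True


def pure_nash_equilibria(payoffs=PAYOFFS, actions=ACTIONS):
    """Enumerate all action profiles and keep those that pass the deviation test."""
    return [p for p in product(actions, repeat=2) if is_nash(p, payoffs, actions)]


print(pure_nash_equilibria())
# [('Peppers', 'Peppers'), ('Bach', 'Bach')]
```

Exhaustive enumeration is only practical for small games like this one, but it mirrors the definition directly.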

slide-9
SLIDE 9

Extensive Games

$G = \langle N, H, P, (\succsim_i) \rangle$

  • set of players $N = \{1, \dots, n\}$
  • set of histories $H$
  • preferences $\succsim_i$ over histories (rather than over action profiles)
  • a player function: $P(h)$ is the player who takes an action after history $h$

They are sequential strategic games (the decision problem is iterated over time)

slide-10
SLIDE 10

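Extensive Games as Trees

  • Ext. Game example: two people propose different allocations for 2 indivisible items

[Figure: game tree. Player 1 proposes one of the allocations (2,0), (1,1) or (0,2); player 2 then answers y, giving outcome 2,0, 1,1 or 0,2 respectively, or n, giving outcome 0,0.]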

slide-11
SLIDE 11

Perfect information: strategies

a strategy in an Ext. Game of perfect information is a function that assigns an action to each non-terminal history

(Perf. Inf. assumption: players are completely informed of past actions)

strategy examples:

1- $s_1(e) = (2,0)$, $s_2((2,0)) = y$, $s_2((1,1)) = n$, ..... outcome: (2,0)

2- $s_1(e) = (2,0)$, $s_2((2,0)) = n$, $s_2((1,1)) = y$, ..... outcome: (?)
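
How a strategy profile determines an outcome can be sketched for the two-item allocation game. This is a minimal sketch, assuming strategies are stored as dictionaries from histories to actions, that `e` denotes the initial (empty) history, and that a rejection yields (0, 0); the slide elides player 2's reply to (0, 2), so the value used for it below is a placeholder assumption.

```python
# Allocations player 1 may propose; each one is a history of length one.
PROPOSALS = [(2, 0), (1, 1), (0, 2)]
EMPTY = ()  # the initial history, written e on the slide


def outcome(s1, s2):
    """Play the game once: player 1 proposes, player 2 answers y or n."""
    proposal = s1[EMPTY]
    reply = s2[proposal]
    # Acceptance implements the proposal; rejection leaves both with nothing.
    return proposal if reply == "y" else (0, 0)


# Strategy example 1. The reply to (0, 2) is elided on the slide; "n" is an assumption.
s1 = {EMPTY: (2, 0)}
s2_first = {(2, 0): "y", (1, 1): "n", (0, 2): "n"}
print(outcome(s1, s2_first))
# (2, 0)
```

Only the reply to the proposal actually made affects the outcome, which is why the elided entry does not matter here. Calling `outcome` with the second example's replies evaluates that profile in the same way.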

slide-12
SLIDE 12

perfect information: equilibria

a Nash Equilibrium of an Ext. Game of perfect information is a strategy profile $s = (s_1, s_2, \dots, s_n)$ such that no player would get a better outcome by choosing a different strategy, assuming all other players stick with theirs

Formally: a profile $s^* = (s^*_1, \dots, s^*_n)$ is a Nash Equilibrium iff, for every player $i$,

\[ O(s^*_{-i}, s^*_i) \succsim_i O(s^*_{-i}, s_i) \quad \text{for every strategy } s_i \text{ of player } i \]

where $O(s^*)$ is the outcome for $s^* = (s^*_1, \dots, s^*_n)$

slide-13
SLIDE 13

Alternating offers game

(Rubinstein)

  • two players aim to split a pie (or bargain over an item)
  • players alternately propose agreements in the set $X = \{(x_1, x_2) \mid x_i \ge 0 \text{ and } x_1 + x_2 = 1\}$
  • players either accept (Y) or reject (N) the most recent offer they receive
  • D: disagreement

slide-14
SLIDE 14

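[Figure: game tree for the first two periods. At t=0 player 1 proposes $x^0$ and player 2 answers Y, ending the game with outcome $(x^0, 0)$, or N. At t=1 player 2 proposes $x^1$ and player 1 answers Y, ending the game with outcome $(x^1, 1)$, or N.]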

slide-15
SLIDE 15

Alternating offers game

(Rubinstein)

histories are of type

  • non-terminal: $(x^0, N, x^1, N, \dots, x^t)$
  • terminal: $(x^0, N, x^1, N, \dots, x^t, Y)$

formally an Alt. Offers Game is given by:

$G = \langle \{1, 2\}, X \cup \{D\}, (\succsim_i) \rangle$

where $\succsim_i$ is defined over $(X \times T) \cup \{D\}$ (preferences are time-dependent)

slide-16
SLIDE 16

Alternating offers preferences

$\succsim_i$ must fulfil some “basic” constraints:

i- disagreement is the worst possible outcome: $(x, t) \succsim_i D$

ii- pie is desirable: $(x, t) \succ_i (y, t) \iff x_i > y_i$

iii- time is valuable: $(x, t) \succsim_i (x, s)$ if $t < s$

slide-17
SLIDE 17

Alternating offers: equilibria

PROPERTY: given an Alt. Offers game

$G = \langle \{1, 2\}, X \cup \{D\}, (\succsim_i) \rangle$

there are infinitely many Nash Equilibria

slide-18
SLIDE 18

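Equilibria example

strategy: players keep asking for the whole pie until time t=n; then they ask for $x^*$, and each player will accept only $x^*$

[Figure: timeline of the game with players 1 and 2 moving at t=0, 1, 2, 3, ..., up to t=n, where the game ends with outcome $(x^*, n)$.]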

slide-19
SLIDE 19

Preferences: more constraints

iv- stationarity: $(x, t) \succsim_i (y, t+1)$ iff $(x, 0) \succsim_i (y, 1)$

v- increasing loss to delay: $x_i - v_i(x_i, 1)$ is an increasing function of $x_i$

slide-20
SLIDE 20

Alternating offers: equilibria

THEOREM: if $\succsim_i$ fulfils all constraints i-v then there exists a unique strategy profile $(\sigma^*, \delta^*)$ which is a Nash Equilibrium

Equilibria

  • Pl. 1 proposes $(x^*_1, x^*_2)$ and Pl. 2 accepts straight away
  • $(x^*_1, x^*_2)$ depends on both $\succsim_1$ and $\succsim_2$

[Figure: player 1 proposes to player 2, who accepts; outcome $((x^*_1, x^*_2), 1)$.]

slide-21
SLIDE 21

Imperfect information: strategies

Imperf. Inf. assumption: players may have only partial info on past actions.

a strategy in an Ext. Game of imperfect information is a function that assigns to each non-terminal history a lottery over possible actions; as a result some actions are determined by chance

$G = \langle N, H, P, f_c, (I_i), (\succsim_i) \rangle$

  • $P(h) = c$: the next action for history $h$ is determined by the lottery $f_c(h)$
  • preferences are over (induced) lotteries on the set of terminal histories

slide-22
SLIDE 22

Markovian model of Negotiation

Markov processes are suitable for modelling past-independent behaviours (hence imperfect-information games)

we consider the imperfect-information variant of the alternating offers game, in which players' actions are assumed to be state-dependent rather than path-dependent

slide-23
SLIDE 23

Markovian model of Negotiation

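the imperfect-info alternating offers game can be naturally encoded as a DTMC (each player's decision is a lottery over the possible actions)

[Figure: DTMC of the negotiation. States: BUYER-BID, SELLER DECIDE, SELLER-BID, BUYER DECIDE and ACCEPT ((x)-agreed). Transition labels: 1:BID(x) from BUYER-BID to SELLER DECIDE; p: accept and (1-p): reject out of a decide state; 1:BID from SELLER-BID to BUYER DECIDE.]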

slide-24
SLIDE 24

Markovian model of Negotiation

players' strategies depend on 2 parameters:

i)- the Offer proposal function $p_{a \to \hat{a}}(t)$

ii)- the Acceptance Probability function: $S\_AP(x)$ for the Seller, $B\_AP(x)$ for the Buyer

for a player b:

  • $IP_b$: initial price proposed by player b
  • $RP_b$: reserve price of player b
  • $T_b$: time-deadline of player b

slide-25
SLIDE 25

Offer Function families

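  • Conceder: player concedes a lot in the early stage of negotiation
  • Boulware: player concedes a lot only close to the deadline

\[ p_{a \to \hat{a}}(t) = \begin{cases} IP_a + \phi_a(t)\,(RP_a - IP_a) & \text{for } a = b \text{ (buyer)} \\ RP_a + (1 - \phi_a(t))\,(IP_a - RP_a) & \text{for } a = s \text{ (seller)} \end{cases} \]

\[ \phi_a(t) = k_a + (1 - k_a) \left( \frac{t}{T_a} \right)^{1/\psi} \]

[Plot: buyer's offered price, from $IP_b$ to $RP_b$, against t/T (0.2 to 1) for the Boulware, Linear and Conceder offer functions.]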
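
The offer function family can be sketched numerically. This is a minimal sketch, assuming the exponent in $\phi_a(t)$ is $1/\psi$ and clamping $t$ at the deadline (the clamp is an added assumption); all parameter values are hypothetical. With this formula, $\psi > 1$ gives a Conceder shape, $\psi < 1$ a Boulware shape, and $\psi = 1$ a linear one. The seller branch, $RP_a + (1 - \phi_a(t))(IP_a - RP_a)$, expands to the same expression as the buyer branch, so one function serves both roles.

```python
def phi(t, deadline, k, psi):
    """Concession level phi_a(t) = k + (1 - k) * (t / T)^(1 / psi)."""
    ratio = min(t, deadline) / deadline  # clamping at the deadline is an assumption
    return k + (1 - k) * ratio ** (1 / psi)


def offer(t, ip, rp, deadline, k, psi):
    """Offer at time t, moving from the initial price ip towards the reserve price rp."""
    return ip + phi(t, deadline, k, psi) * (rp - ip)


# Hypothetical buyer: IP=100, RP=1000, deadline T=10, k=0.
for label, psi in (("Boulware", 0.25), ("Linear", 1.0), ("Conceder", 4.0)):
    print(label, [round(offer(t, 100, 1000, 10, 0.0, psi)) for t in (0, 5, 10)])
```

All three curves start at the initial price and reach the reserve price at the deadline; they differ only in how early the concession happens.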

slide-26
SLIDE 26

Offer Function approximation

with the PRISM model-checker we are forced to use a two-segment linear approximation of non-linear Offer Functions

[Plot: two-segment linear approximation of the offer functions; offer's value (100 to 1000) against time (2 to 10). Series: Buyer-Boul(1/500)-Tswitch=8, Seller-Boul(1/500)-Tswitch=8, Buyer-Conc(490/1)-Tswitch=2, Seller-Conc(490/1)-Tswitch=2. Annotations: low-grad-segment, high-grad-segment, Boulware, Conceder.]
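
A two-segment linear offer function can be sketched as a piecewise-linear function with one switch time. This is a minimal sketch under stated assumptions: the parameter names (`start`, `grad_first`, `grad_second`, `t_switch`) and the numbers are hypothetical and are not taken from the plotted series.

```python
def two_segment_offer(t, start, grad_first, grad_second, t_switch):
    """Piecewise-linear offer: slope grad_first up to t_switch, grad_second after it."""
    if t <= t_switch:
        return start + grad_first * t
    # The second segment starts where the first one ends, so the offer is continuous.
    return start + grad_first * t_switch + grad_second * (t - t_switch)


# Boulware-like buyer: a low gradient until t=8, then a high one.
print([two_segment_offer(t, 100, 10, 400, 8) for t in (0, 4, 8, 10)])
# Conceder-like buyer: a high gradient until t=2, then a low one.
print([two_segment_offer(t, 100, 400, 10, 2) for t in (0, 2, 6, 10)])
```

A seller's decreasing offers follow from the same function with negative gradients.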

slide-27
SLIDE 27

Acceptance Probability functions

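[Plot: bid acceptance probability (0.0 to 1.0) against x = bid/cib value, showing $S\_AP(x)$ and $B\_AP(x)$ for S_RP=1000 and B_RP=10000.]

\[ S\_AP(x, t) = \begin{cases} 0 & \text{if } (x \le S\_RP) \land (t < T_s) \\ 1 - \dfrac{S\_RP}{x} & \text{if } (x > S\_RP) \land (t < T_s) \\ 1 & \text{if } (t \ge T_s) \end{cases} \]

\[ B\_AP(x, t) = \begin{cases} 1 & \text{if } (x \le 0) \lor (t \ge T_b) \\ 1 + \dfrac{S\_RP}{x - (B\_RP + S\_RP)} & \text{if } (S\_RP < x < B\_RP) \land (t < T_b) \\ 0 & \text{if } (x > B\_RP) \land (t < T_b) \end{cases} \]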
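
The two acceptance probability functions can be written out as follows. This is a minimal sketch with several labelled assumptions: the value 0 for the seller when the bid does not exceed S_RP and for the buyer when the offer reaches B_RP is inferred from continuity of the surviving formulas, and the buyer's middle formula is applied on the whole range 0 < x < B_RP because the slide does not cover offers between 0 and S_RP.

```python
def seller_ap(x, t, s_rp, t_s):
    """Probability that the seller accepts a bid x at time t."""
    if t >= t_s:
        return 1.0  # at or past the seller's deadline every bid is accepted
    if x <= s_rp:
        return 0.0  # assumption: bids at or below the reserve price are never accepted
    return 1.0 - s_rp / x


def buyer_ap(x, t, s_rp, b_rp, t_b):
    """Probability that the buyer accepts an offer x at time t."""
    if x <= 0 or t >= t_b:
        return 1.0
    if x >= b_rp:
        return 0.0  # assumption: offers at or above the buyer's reserve price are refused
    # Assumption: the slide states this formula for s_rp < x < b_rp only.
    return 1.0 + s_rp / (x - (b_rp + s_rp))


# Hypothetical values, using the reserve prices shown in the plot legend.
print(seller_ap(2000, 0, 1000, 10))
print(buyer_ap(5500, 0, 1000, 10000, 10))
```

Both functions return values in [0, 1]: the seller's probability grows with the bid and the buyer's shrinks as the offer rises.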

slide-28
SLIDE 28

PCTL Model-Checking

PCTL: probabilistic extension of CTL for referring to Discrete Time Markov Chains

PCTL syntax:

\[ \phi ::= tt \mid a \mid \phi \land \phi \mid \neg \phi \mid P_{\bowtie p}(\varphi) \qquad \varphi ::= \phi \; U^{I} \; \phi \]

slide-29
SLIDE 29

PCTL Model-Checking

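PCTL syntax

[Figure: the DTMC of the negotiation, with states BUYER-BID, SELLER DECIDE, SELLER-BID, BUYER DECIDE and ACCEPT ((x)-agreed), and transition labels 1:BID(x), p: accept, (1-p): reject, 1:BID.]

$\phi_1 \equiv P_{\ge 0.8}[(agreed = 100)]$

$\phi_x \equiv P_{?}[(agreed = x)]$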
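
What checking such a property involves can be sketched on a small chain. This is a minimal sketch under stated assumptions: the property is read as a time-bounded eventuality (the path operator does not survive on the slide), the acceptance probabilities are hypothetical constants, and the buyer's decide state is assumed to mirror the seller's. It is not PRISM's algorithm, only a direct propagation of the state distribution.

```python
P_SELLER, P_BUYER = 0.3, 0.5  # hypothetical constant acceptance probabilities

# DTMC as a dictionary: state -> {successor: probability}.
CHAIN = {
    "buyer_bid": {"seller_decide": 1.0},
    "seller_decide": {"agreed": P_SELLER, "seller_bid": 1 - P_SELLER},
    "seller_bid": {"buyer_decide": 1.0},
    "buyer_decide": {"agreed": P_BUYER, "buyer_bid": 1 - P_BUYER},
    "agreed": {"agreed": 1.0},
}


def bounded_reach_probability(chain, start, target, steps):
    """Probability of reaching target from start within the given number of transitions."""
    dist = {start: 1.0}
    for _ in range(steps):
        nxt = {}
        for state, mass in dist.items():
            # The target is treated as absorbing so that mass reaching it stays there.
            successors = {target: 1.0} if state == target else chain[state]
            for succ, prob in successors.items():
                nxt[succ] = nxt.get(succ, 0.0) + mass * prob
        dist = nxt
    return dist.get(target, 0.0)


for steps in (2, 4, 6, 8):
    p = bounded_reach_probability(CHAIN, "buyer_bid", "agreed", steps)
    print(steps, round(p, 4), p >= 0.8)
```

With these numbers the bound of 0.8 is first met within 8 transitions, which shows how the verdict of a threshold property depends on the time bound.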

slide-30
SLIDE 30

Model Verification

by verifying

$\phi_x \equiv P_{?}[(agreed = x)]$

we derive the probability distribution over the set of possible agreements, hence the expected utility

by comparing a number of strategy profiles we determine how strategy parameters affect the expected outcome of the negotiation
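
The distribution over agreements can be sketched by propagating probability mass round by round. This is a minimal sketch and not the PRISM model behind the slides; it assumes one buyer bid followed by one seller offer per time step, a shared deadline, $k = 0$ in the offer function, and the value 0 wherever the acceptance formulas are incomplete. All names and numbers are hypothetical.

```python
def offer(t, ip, rp, deadline, psi):
    """Offer IP + phi(t) * (RP - IP) with phi(t) = (t / T)^(1 / psi), i.e. k = 0."""
    return ip + (min(t, deadline) / deadline) ** (1 / psi) * (rp - ip)


def seller_ap(x, t, s_rp, t_s):
    if t >= t_s:
        return 1.0
    return 0.0 if x <= s_rp else 1.0 - s_rp / x  # the 0 branch is an assumption


def buyer_ap(x, t, s_rp, b_rp, t_b):
    if x <= 0 or t >= t_b:
        return 1.0
    # Assumption: 0 at or above b_rp, the slide's middle formula everywhere below it.
    return 0.0 if x >= b_rp else 1.0 + s_rp / (x - (b_rp + s_rp))


def agreement_distribution(buyer, seller, deadline):
    """Map each agreed price to its probability."""
    dist, alive = {}, 1.0  # alive: probability that no agreement has been reached yet
    for t in range(deadline + 1):
        bid = offer(t, buyer["ip"], buyer["rp"], deadline, buyer["psi"])
        p = seller_ap(bid, t, seller["rp"], deadline)
        dist[bid] = dist.get(bid, 0.0) + alive * p
        alive *= 1.0 - p
        ask = offer(t, seller["ip"], seller["rp"], deadline, seller["psi"])
        q = buyer_ap(ask, t, seller["rp"], buyer["rp"], deadline)
        dist[ask] = dist.get(ask, 0.0) + alive * q
        alive *= 1.0 - q
    return dist


def expected_price(dist):
    return sum(price * prob for price, prob in dist.items())


BUYER = {"ip": 100, "rp": 1000, "psi": 1.0}
SELLER = {"ip": 1000, "rp": 100, "psi": 1.0}
dist = agreement_distribution(BUYER, SELLER, 10)
print(round(sum(dist.values()), 6), round(expected_price(dist), 2))
```

Re-running `agreement_distribution` with different `psi` values for either player is one way to compare strategy profiles; an expected utility follows once a utility is attached to each agreed price.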

slide-31
SLIDE 31

One (fairly trivial) indication

The less a player concedes the higher his expected utility is going to be

slide-32
SLIDE 32

Conceder-Conceder

[Plot: cumulative acceptance probability (0.2 to 1) against offer value (100 to 1000) for Conceder-Conceder profiles. Series: Conc(10/1)-Conc(100/10)-Ts_sw:4, Conc(10/1)-Conc(100/10)-Ts_sw:8, Conc(100/10)-Conc(10/1)-Tb_sw:4, Conc(100/10)-Conc(10/1)-Tb_sw:8. Annotations: Seller Offer Function, Offer Cumulative Acceptance Prob.]

slide-33
SLIDE 33

Conclusion

we have shown that:

  • under certain assumptions a game of imperfect information can be encoded into a discrete-time Markovian model
  • PCTL model-checking can be used to verify such a model
  • model-checking allows for comparing strategy profiles
  • such an approach differs from both classical game-theory analysis and simulative analysis

issues:

  • can we perform a deeper analysis through model-checking?
  • how about Nash-Equilibria analysis through model-checking?