Probabilistic Model Checking for Games of imperfect information
- P. Ballarini, M. Fisher, M. Wooldridge
University of Liverpool
What is this work about
Uncertainty is relevant for a specific class of Games (games of imperfect information)
Q :
the outcome of the game is achieved in one-shot
an action profile is a combination of actions:
a=(a1, a2, . . . , an)
the outcome of an action profile is denoted: O(a1, a2, . . . , an)
music by either the “Red Hot Chili Peppers” or “Bach”
prefers the “Peppers” and the other one “Bach”
payoff functions (Stephen, Jane):

                     Jane: Peppers   Jane: Bach
Stephen: Peppers     (2,1)           (0,0)
Stephen: Bach        (0,0)           (1,2)
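Stephen’s preferences: $(\mathrm{Peppers},\mathrm{Peppers}) \succ_S (\mathrm{Bach},\mathrm{Bach}) \succ_S (\mathrm{Peppers},\mathrm{Bach}) \sim_S (\mathrm{Bach},\mathrm{Peppers})$
Jane’s preferences: $(\mathrm{Bach},\mathrm{Bach}) \succ_J (\mathrm{Peppers},\mathrm{Peppers}) \succ_J (\mathrm{Peppers},\mathrm{Bach}) \sim_J (\mathrm{Bach},\mathrm{Peppers})$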
Nash Equilibrium: an action profile in which no player has interest in adopting another strategy, assuming the other player sticks to his own
the Battle of Sexes has 2 Equilibria: (Peppers,Peppers), (Bach,Bach)
i.e. : togetherness rules
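The two equilibria can be checked mechanically. The following is a minimal sketch, assuming the payoff pairs are read as (Stephen, Jane) and that only pure strategies are considered; the names `PAYOFFS`, `is_nash` and `pure_nash_equilibria` are illustrative and not part of the original slides.

```python
from itertools import product

ACTIONS = ("Peppers", "Bach")

# Payoffs (Stephen, Jane), indexed by (Stephen's action, Jane's action).
# Assumption: the first component of each pair belongs to Stephen.
PAYOFFS = {
    ("Peppers", "Peppers"): (2, 1),
    ("Peppers", "Bach"): (0, 0),
    ("Bach", "Peppers"): (0, 0),
    ("Bach", "Bach"): (1, 2),
}


def is_nash(profile):
    """Return True if no player gains by deviating unilaterally from `profile`."""
    for player in range(2):
        for alternative in ACTIONS:
            deviation = list(profile)
            deviation[player] = alternative
            if PAYOFFS[tuple(deviation)][player] > PAYOFFS[profile][player]:
                return False
    return True


def pure_nash_equilibria():
    """Enumerate all action profiles and keep those that are Nash equilibria."""
    return [p for p in product(ACTIONS, repeat=2) if is_nash(p)]


print(pure_nash_equilibria())
# prints [('Peppers', 'Peppers'), ('Bach', 'Bach')]
```

Brute-force enumeration is enough here because the game has only four action profiles; it does not scale to large action sets.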
(the decision problem is iterated over time)
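[Figure: game tree. Player 1 proposes an allocation (2,0), (1,1) or (0,2); player 2 answers y or n. Outcomes after y: (2,0), (1,1), (0,2); outcome after n: (0,0)]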
2 indivisible items
a strategy in an Ext. Game of perfect information is a function that assigns an action to each non-terminal history (Perf.Inf. assumption: players are completely informed about past actions)
a Nash Equilibrium of an Ext. Game of perfect information is a strategy profile s=(s1,s2,..,sn) such that no player would get a better outcome by choosing a different strategy, assuming all other players stick with theirs
Formally: a profile $s^* = (s^*_1, \ldots, s^*_n)$ is a Nash Equilibrium iff, for every player $i$, $O(s^*_{-i}, s^*_i) \succsim_i O(s^*_{-i}, s_i)$ for all strategies $s_i$ of player $i$
(Rubinstein)
[Figure: bargaining game tree with players 1 and 2, offer x0, answers Y/N, next round at t=1]
non-terminal terminal
is defined over
[Figure: timeline of the bargaining game over steps t=0, 1, 2, 3, ..., n, with players 1 and 2 alternating; outcome (x∗,n)]
strategy: players keep asking for the whole pie until time t=n, then they ask x∗ and each player will accept only x∗
$((x^*_1, x^*_2), 1)$
$(x^*_1, x^*_2)$ depends on both
and Pl. 2 accepts straight away
Equilibria
Imperf.Inf. assumption: players may have only partial info on past actions.
a strategy in an Ext. Game of imperfect information is a function that assigns to each non-terminal history a lottery over possible actions
as a result some actions are determined by chance
P(h) = c: the next action for history h is determined by the lottery $f_c(h)$
preferences are over (induced) lotteries on the set of terminal histories
(hence imperfect-information games)
(players decision is a lottery over the possible actions)
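[Figure: negotiation state diagram. States: BUYER-BID, SELLER DECIDE, SELLER-BID, BUYER DECIDE, ACCEPT ((x)-agreed). Transitions: BID(x) with probability 1; accept with probability p; reject with probability (1-p)]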
$IP^b$: initial price proposed by player b
$RP^b$: reserved price of player b
$T^b$: time-deadline of player b
$p_{a\to\hat{a}}(t)$
S_AP(x) for the Seller, B_AP(x) for the Buyer
Conceder: player concedes a lot in the early stage of negotiation
Boulware: player concedes a lot only close to the deadline
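$$p_{a\to\hat{a}}(t) = \begin{cases} IP^a + \phi^a(t)\,(RP^a - IP^a) & \text{for } a = b \text{ (buyer)} \\ RP^a + (1 - \phi^a(t))\,(IP^a - RP^a) & \text{for } a = s \text{ (seller)} \end{cases}$$

$$\phi^a(t) = k^a + (1 - k^a)\left(\frac{t}{T^a}\right)^{1/\psi}$$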
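The offer function can be evaluated directly. The sketch below assumes the exponent in φ is 1/ψ and clamps t to [0, T] (the clamping is an added assumption, not stated on the slide); the function names `phi` and `offer` and all numeric values are illustrative.

```python
def phi(t, deadline, k, psi):
    """phi(t) = k + (1 - k) * (t / T)^(1 / psi).

    Assumption: t is clamped to [0, T] so that phi stays within [k, 1].
    """
    t = min(max(t, 0.0), deadline)
    return k + (1.0 - k) * (t / deadline) ** (1.0 / psi)


def offer(t, role, ip, rp, deadline, k=0.0, psi=1.0):
    """Offer of a player at time t, given initial price ip and reserved price rp."""
    f = phi(t, deadline, k, psi)
    if role == "buyer":
        return ip + f * (rp - ip)
    if role == "seller":
        return rp + (1.0 - f) * (ip - rp)
    raise ValueError("role must be 'buyer' or 'seller'")


# Hypothetical buyer: starts at 100, reserved price 1000, deadline 10.
for psi in (0.2, 1.0, 5.0):
    print(psi, [round(offer(t, "buyer", 100, 1000, 10, psi=psi), 1) for t in (0, 5, 10)])
```

With the exponent 1/ψ, a value ψ < 1 keeps φ small until t is close to the deadline (Boulware-like behaviour), ψ = 1 gives a linear concession, and ψ > 1 raises φ early (Conceder-like behaviour).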
[Figure: buyer's offer price between $IP^b$ and $RP^b$ as a function of t/T, for the Boulware, Linear and Conceder tactics]
with the PRISM model checker we are forced to use a two-segment linear approximation of the non-linear Offer Functions
[Figure: two-segment linear approximation of the NDFs. Axes: Time (2 to 10), Offer's value (100 to 1000). Series: Buyer-Boul(1/500)-Tswitch=8, Seller-Boul(1/500)-Tswitch=8, Buyer-Conc(490/1)-Tswitch=2, Seller-Conc(490/1)-Tswitch=2. Annotations: low-grad-segment and high-grad-segment for Boulware and Conceder]
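A two-segment linear approximation can be sketched as a piecewise-linear function of time. The code below assumes that a label such as Boul(1/500)-Tswitch=8 means a gradient of 1 up to the switch time 8 and a gradient of 500 afterwards; this reading of the labels, the function name `two_segment_offer` and the starting value 0 are assumptions, not taken from the original model.

```python
def two_segment_offer(t, start, grad_first, grad_second, t_switch):
    """Piecewise-linear offer: slope grad_first up to t_switch, grad_second afterwards."""
    if t <= t_switch:
        return start + grad_first * t
    return start + grad_first * t_switch + grad_second * (t - t_switch)


# Hypothetical buyer profiles over the time range 0..10, starting from 0.
boulware = [two_segment_offer(t, 0, 1, 500, 8) for t in range(11)]
conceder = [two_segment_offer(t, 0, 490, 1, 2) for t in range(11)]
print(boulware)
print(conceder)
```

A seller profile would use the same function with negative gradients and a high starting value. Integer gradients keep every offer an integer, which suits a model checker that works on finite, discrete state variables.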
[Figure: bid acceptance probability S_AP(x) and B_AP(x) (0.0 to 1.0) against x = bid/cib value, with S_RP=1000 and B_RP=10000]
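$$\mathrm{S\_AP}(x,t) = \begin{cases} 0 & \text{if } (x \le \mathrm{S\_RP}) \wedge (t < T^s) \\ 1 - \dfrac{\mathrm{S\_RP}}{x} & \text{if } (x > \mathrm{S\_RP}) \wedge (t < T^s) \\ 1 & \text{if } (t \ge T^s) \end{cases}$$

$$\mathrm{B\_AP}(x,t) = \begin{cases} 1 & \text{if } (x \le 0) \vee (t \ge T^b) \\ 1 + \dfrac{\mathrm{S\_RP}}{x - (\mathrm{B\_RP} + \mathrm{S\_RP})} & \text{if } (\mathrm{S\_RP} < x < \mathrm{B\_RP}) \wedge (t < T^b) \\ 0 & \text{if } (x > \mathrm{B\_RP}) \wedge (t < T^b) \end{cases}$$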
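The acceptance probabilities can be written as two small functions. This is a sketch under stated assumptions: the value 0 is assumed for the cases whose value did not survive in the slide text (seller below the reserved price, buyer above it), and the buyer's middle-case formula is applied to every 0 < x < B_RP, although the slide states the condition as S_RP < x < B_RP. The function names are illustrative.

```python
def seller_accept_prob(x, t, s_rp, t_s):
    """Probability that the seller accepts bid x at time t."""
    if t >= t_s:
        return 1.0  # deadline reached: the seller accepts
    if x <= s_rp:
        return 0.0  # assumption: bids at or below the reserved price are rejected
    return 1.0 - s_rp / x


def buyer_accept_prob(x, t, s_rp, b_rp, t_b):
    """Probability that the buyer accepts the seller's bid x at time t."""
    if x <= 0 or t >= t_b:
        return 1.0
    if x >= b_rp:
        return 0.0  # assumption: bids at or above the reserved price are rejected
    # Assumption: formula used for all 0 < x < b_rp.
    return 1.0 + s_rp / (x - (b_rp + s_rp))


# Values from the acceptance-probability plot: S_RP=1000, B_RP=10000.
print(seller_accept_prob(2000, 0, 1000, 10))
print(buyer_accept_prob(5500, 0, 1000, 10000, 10))
```

Under these assumptions the buyer's formula reaches exactly 0 at x = B_RP, so the two branches join without a jump.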
$\phi ::= tt \mid a \mid \phi \wedge \phi \mid \neg\phi \mid P_{\bowtie p}(\varphi)$
$\varphi ::= \phi \; U^{I} \; \phi$
[Figure: Acceptance Cumulative Probability (0.2 to 1) against Offer Value (100 to 1000). Series: Conc(10/1)-Conc(100/10)-Ts_sw:4, Conc(10/1)-Conc(100/10)-Ts_sw:8, Conc(100/10)-Conc(10/1)-Tb_sw:4, Conc(100/10)-Conc(10/1)-Tb_sw:8]
[Table columns: Seller Offer Function, T, Offer, Cumulative Acceptance Prob]
we have shown that:
under certain assumptions a game of imperfect information can be encoded into a discrete-time Markovian model
PCTL model-checking can be used to verify such a model
model-checking allows for the comparison of strategy profiles
such an approach differs from both classical game-theory analysis and simulative analysis
issue: can we perform a deeper analysis through model-checking?
how about Nash-Equilibria analysis through model-checking?
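To illustrate the encoding, the sketch below computes the probability of reaching an absorbing agreement state within a bounded number of rounds, which is the kind of quantity a time-bounded PCTL until formula expresses over a discrete-time Markov chain. The chain structure (buyer bids, seller decides, seller bids, buyer decides, one round per time step) is a simplified assumption, not the authors' PRISM model; all function names, offer profiles and numeric values are hypothetical.

```python
def agreement_probabilities(buyer_offer, seller_offer, seller_ap, buyer_ap, rounds):
    """Cumulative probability of having reached 'agreed' after each round."""
    negotiating = 1.0  # probability mass still in the negotiating states
    agreed = 0.0       # probability mass absorbed in the 'agreed' state
    cumulative = []
    for t in range(rounds):
        # Buyer bids, seller accepts with probability p and rejects with 1 - p.
        p = seller_ap(buyer_offer(t), t)
        agreed += negotiating * p
        negotiating *= 1.0 - p
        # Seller bids, buyer accepts with probability q and rejects with 1 - q.
        q = buyer_ap(seller_offer(t), t)
        agreed += negotiating * q
        negotiating *= 1.0 - q
        cumulative.append(agreed)
    return cumulative


# Hypothetical strategy profile: linear offers, reserved prices 100 (seller), 1000 (buyer).
S_RP, B_RP = 100.0, 1000.0


def buyer_offer(t):
    return 50.0 + 95.0 * t


def seller_offer(t):
    return 2000.0 - 190.0 * t


def seller_ap(x, t):
    return 0.0 if x <= S_RP else 1.0 - S_RP / x


def buyer_ap(x, t):
    return 0.0 if x >= B_RP else 1.0 + S_RP / (x - (B_RP + S_RP))


for t, prob in enumerate(agreement_probabilities(buyer_offer, seller_offer, seller_ap, buyer_ap, 10)):
    print(t, round(prob, 3))
```

Running the same function with different offer functions gives one curve per strategy profile, so profiles can be compared on their cumulative acceptance probability.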