Energy and Mean-payoff Games
Laurent Doyen, LSV, ENS Cachan & CNRS
Joint work with Aldric Degorre, Raffaella Gentilini, Jean-François Raskin, Szymon Torunczyk
ACTS 2010, Chennai
Synthesis problem
(Figure: System / Model and Specification, linked by a correctness relation.)
Specification: avoid failure, ensure progress, etc.
Solved as a game (system vs. environment): solution = winning strategy.
This talk: quantitative games (resource-constrained systems).
Maximizer vs. Minimizer; positive weight = reward.
play: (1,4) (4,1) (1,4) (4,1) …
weights: -1 +2 -1 +2 …
energy level: 1 0 2 1 3 2 4 3 … (the first value is the initial credit)
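Strategies: one for the Maximizer, one for the Minimizer.
Play: infinite sequence of edges consistent with both strategies.
Energy level: the initial credit plus the sum of the weights along a prefix of the play.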
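The energy level along a play prefix can be sketched in a few lines. This is only an illustration: the function names are hypothetical, and the sample weights assume the example play alternates between -1 and +2 from an initial credit of 1.

```python
from itertools import accumulate

def energy_levels(initial_credit, weights):
    """Energy level before the first edge and after each edge of a play prefix."""
    return list(accumulate(weights, initial=initial_credit))

def stays_nonnegative(initial_credit, weights):
    """True if the energy level never drops below zero on this prefix."""
    return all(level >= 0 for level in energy_levels(initial_credit, weights))

# Assumed example: weights alternate -1, +2 and the initial credit is 1.
levels = energy_levels(1, [-1, 2, -1, 2, -1, 2, -1])
print(levels)
print(stays_nonnegative(0, [-1, 2]))  # with credit 0 the first edge already fails
```

On a finite prefix this only checks the levels seen so far; the game objective requires the level to stay nonnegative forever.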
Decision problem: decide whether there exist an initial credit c0 and a strategy of the Maximizer that keeps the energy level always nonnegative.
For energy games, memoryless strategies suffice.
A memoryless strategy is winning if all cycles are nonnegative when the strategy is fixed.
(Figure: example graph annotated with the minimum initial credit per state: c0=2, c0=2, c0=1, c0=0.)
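The cycle criterion for memoryless strategies can be checked mechanically: fix the Maximizer's chosen edges, keep all Minimizer edges, and look for a negative cycle. The sketch below uses Bellman-Ford for that check (a standard technique swapped in for illustration, not taken from the talk); the three-state game and all names are hypothetical.

```python
def has_negative_cycle(num_states, edges):
    """Bellman-Ford from a virtual source that reaches every state at cost 0."""
    dist = [0] * num_states
    for _ in range(num_states):
        changed = False
        for (u, v, w) in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            return False  # distances stabilized: no negative cycle
    return True  # still relaxing after num_states rounds: negative cycle

def memoryless_strategy_wins(num_states, chosen_edges, minimizer_edges):
    """Winning (for some initial credit) iff all remaining cycles are nonnegative."""
    return not has_negative_cycle(num_states, chosen_edges + minimizer_edges)

# Hypothetical game: states 0 and 2 belong to the Maximizer, state 1 to the Minimizer.
minimizer_edges = [(1, 0, 2), (1, 2, 0)]
good = [(0, 1, -1), (2, 0, 1)]   # Maximizer leaves state 2 towards state 0
bad = [(0, 1, -1), (2, 2, -1)]   # Maximizer loops on a negative self-loop
print(memoryless_strategy_wins(3, good, minimizer_edges))
print(memoryless_strategy_wins(3, bad, minimizer_edges))
```

This version inspects every cycle of the restricted graph; a per-state variant would only consider cycles reachable from the starting state.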
Initial credit is useful to survive before a cycle is formed.
Q: #states, E: #edges, W: maximal weight
Length(AcyclicPath) ≤ Q
Minimum initial credit is at most Q·W.
The minimum initial credit is such that:
► in a Maximizer state q: some outgoing edge keeps the energy level sufficient at the successor;
► in a Minimizer state q: every outgoing edge does.
Compute successive under-approximations of the minimum initial credit.
Fixpoint algorithm: start with 0 at every state, then iterate the update at Maximizer states and at Minimizer states until the values stabilize.
(Figure: successive approximations on the example graph.)
Termination argument: monotonic operators and finite codomain.
Complexity: O(E·Q·W)
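A minimal sketch of such a fixpoint iteration follows. It is a naive version, not the O(E·Q·W) algorithm of the talk, and it assumes the usual update rule: a Maximizer state takes the cheapest successor requirement, a Minimizer state the most expensive, and values above Q·W are treated as infinite (no finite credit suffices). The four-state game and all names are hypothetical.

```python
import math

def minimum_initial_credit(states, edges, maximizer_states):
    """edges maps each state to a list of (successor, weight) pairs."""
    max_abs = max(abs(w) for succ in edges.values() for (_, w) in succ)
    bound = len(states) * max_abs          # Q*W: largest useful finite credit
    credit = {q: 0 for q in states}        # start from the under-approximation 0
    changed = True
    while changed:
        changed = False
        for q in states:
            # credit needed before taking each edge so that the successor is safe
            needs = [credit[t] - w for (t, w) in edges[q]]
            best = min(needs) if q in maximizer_states else max(needs)
            new = max(0, best)
            if new > bound:
                new = math.inf             # beyond Q*W the state is losing
            if new > credit[q]:
                credit[q] = new
                changed = True
    return credit

# Hypothetical game: states 0, 2, 3 are Maximizer states, state 1 is a Minimizer state.
edges = {
    0: [(1, -1)],
    1: [(0, 2), (2, 0)],
    2: [(2, -1), (0, 1)],
    3: [(3, -1)],                          # only a negative self-loop: losing
}
result = minimum_initial_credit([0, 1, 2, 3], edges, {0, 2, 3})
print(result)
```

Values only ever increase and are capped, which mirrors the termination argument (monotonic operators, finite codomain).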
Maximizer vs. Minimizer; positive weight = reward.
play: (1,4) (4,1) (1,4) (4,1) …
weights: -1 +2 -1 +2 …
mean-payoff value: 1/2 (limit of weight average)
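For a play that eventually repeats a cycle, the limit of the weight averages is the average weight of that cycle. The sketch below illustrates this with exact fractions; the function names are hypothetical and the weights assume the example play alternates between -1 and +2.

```python
from fractions import Fraction

def mean_payoff_of_cycle(cycle_weights):
    """Limit of the weight averages of a play that repeats this cycle forever."""
    return Fraction(sum(cycle_weights), len(cycle_weights))

def average_after(cycle_weights, n):
    """Average weight of the first n edges of the periodic play."""
    total = sum(cycle_weights[i % len(cycle_weights)] for i in range(n))
    return Fraction(total, n)

cycle = [-1, 2]
print(mean_payoff_of_cycle(cycle))
for n in (1, 2, 3, 1001):
    print(n, average_after(cycle, n))
```

The finite averages oscillate but approach the cycle average as n grows.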
Mean-payoff value: either the limsup or the liminf of the weight averages.
Decision problem: given a rational threshold, decide whether there exists a strategy of the Maximizer that ensures a mean-payoff value at least the threshold.
Note: we can assume the threshold is 0, e.g. by shifting all weights by the threshold.
Assuming threshold 0:
► A memoryless strategy is winning if all cycles are nonnegative when the strategy is fixed.
► The decision problem is log-space equivalent to energy games [BFL+08].
                 | Energy games | Mean-payoff games
Decision problem | O(E·Q·W)     | O(E·Q·W) (this talk); O(E·Q²·W) [ZP96]
► Perfect information
► Imperfect information
Strategies should not rely on hidden information
Maximizer states only
(compatible with Maximizer’s action)
Observations
Observation-based strategies
Goal: all outcomes have nonnegative energy level,
                      | Energy games | Mean-payoff games
Perfect information   | O(E·Q·W)     | O(E·Q·W) (this talk); O(E·Q²·W) [ZP96]
Imperfect information | ?            | ?
Two variants for energy games:
► fixed initial credit
► unknown initial credit
Can you win with initial credit = 3?
(Figure: example game with actions and observations.)
Keep track of which states can be the current state, and of the worst-case energy level in each of them.
Initially: (3,⊥,⊥)
(Figure: tree of energy-level vectors explored from (3,⊥,⊥); it contains (⊥,2,2), (⊥,2,1), (⊥,1,3), (⊥,1,4), (2,⊥,⊥) and (3,⊥,⊥).)
Stop the search whenever a negative value or a comparable ancestor is found.
The search terminates because the vectors are well-quasi ordered.
Upper bound: non-primitive recursive
Lower bound: EXPSPACE-hard
Proof (not shown in this talk): reduction from the infinite execution problem of Petri Nets.
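One step of the vector-tracking idea for a fixed initial credit can be sketched as follows. The game below is hypothetical (the one in the figure is not recoverable), and the names `successors`, `is_dead` and `covered_by` are illustrative. A vector is stored as a dictionary from possible current states to worst-case energy levels; states absent from the dictionary play the role of ⊥.

```python
def successors(knowledge, action, transitions, obs):
    """Group the worst-case energy levels after `action` by the next observation."""
    result = {}
    for (q, a, t, w) in transitions:
        if a != action or q not in knowledge:
            continue
        cell = result.setdefault(obs[t], {})
        level = knowledge[q] + w
        cell[t] = min(level, cell.get(t, level))  # keep the worst case per state
    return result

def is_dead(knowledge):
    """A negative worst-case level means this branch is lost."""
    return any(level < 0 for level in knowledge.values())

def covered_by(ancestor, knowledge):
    """Same possible states and no lower level than the ancestor: stop exploring."""
    return (ancestor.keys() == knowledge.keys()
            and all(knowledge[q] >= ancestor[q] for q in ancestor))

# Hypothetical game: state 1 has observation "o1", states 2 and 3 share "o2".
obs = {1: "o1", 2: "o2", 3: "o2"}
transitions = [
    (1, "a", 2, -1), (1, "a", 3, -1),
    (2, "b", 1, 1), (3, "b", 1, 0),
]
step1 = successors({1: 3}, "a", transitions, obs)
step2 = successors(step1["o2"], "b", transitions, obs)
print(step1, step2)
```

A full search would explore every action and observation from each vector, stopping a branch when `is_dead` holds or when some ancestor satisfies `covered_by`.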
                      | Energy games (unknown initial credit) | Mean-payoff games
Perfect information   | O(E·Q·W)                              | O(E·Q·W) (this talk); O(E·Q²·W) [ZP96]
Imperfect information | r.e.                                  | ?
With imperfect information:
Corollary: finite-memory strategies suffice in energy games.
In mean-payoff games: infinite memory may be required.
Memory needed         | Energy games  | Mean-payoff games
Perfect information   | memoryless    | memoryless
Imperfect information | finite memory | infinite memory
Theorem: The unknown initial credit problem for energy games is undecidable (even for blind games).
Proof: by a reduction from the halting problem of 2-counter machines.
q1: inc c1 goto q2
q2: inc c1 goto q3
q3: if c1 == 0 goto q6 else dec c1 goto q4
q4: inc c2 goto q5
q5: inc c2 goto q3
q6: halt
Halting problem: given M and a state q_halt, decide whether q_halt is reachable (i.e., M halts).
Reduction: given M, construct G_M such that M halts iff there exists a winning strategy in G_M (with some initial credit).
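To make the machine model concrete, here is a small interpreter for 2-counter programs of the kind used in the example. The tuple encoding of instructions and the step bound are assumptions made for this sketch; since halting is undecidable in general, the simulator can only report whether the machine halted within the given number of steps.

```python
def run(program, start, max_steps):
    """Simulate a 2-counter machine; return (halted, counters)."""
    counters = {"c1": 0, "c2": 0}
    state = start
    for _ in range(max_steps):
        instr = program[state]
        if instr[0] == "halt":
            return True, counters
        if instr[0] == "inc":
            _, counter, target = instr
            counters[counter] += 1
            state = target
        else:  # "test": branch on zero, otherwise decrement and continue
            _, counter, if_zero, if_nonzero = instr
            if counters[counter] == 0:
                state = if_zero
            else:
                counters[counter] -= 1
                state = if_nonzero
    return False, counters  # step bound reached without halting

# Encoding of the example program q1 ... q6.
program = {
    "q1": ("inc", "c1", "q2"),
    "q2": ("inc", "c1", "q3"),
    "q3": ("test", "c1", "q6", "q4"),
    "q4": ("inc", "c2", "q5"),
    "q5": ("inc", "c2", "q3"),
    "q6": ("halt",),
}
halted, final = run(program, "q1", 100)
print(halted, final)
```

On this program the machine moves twice the initial content of c1 into c2 and then halts.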
Reminder: winning strategy = #AcceptingRun#AcceptingRun#...
Gadget 1: « First symbol is # »
Gadget 2: « Every σ1 is followed by σ2 »
Gadget 3: « Infinitely many # » (guess: this is the last #) (and a bit more…)
Gadget 4: « Counter correctness »: check zero tests on c; check nonzero test on c
Correctness:
► If M halts, then the Maximizer wins with initial credit Length(AcceptingRun).
► If the Maximizer has a winning strategy, then # occurs infinitely often, and finitely many cheats occur. Hence, M has an accepting run.
Theorem: Mean-payoff games are undecidable (not co-r.e.), even for blind games.
Proof: by a reduction from the halting problem of 2-counter machines.
Reduction: given M, construct G_M such that M halts iff there exists a strategy to ensure a strictly positive mean-payoff value.
Nota: the proof works for both limsup and liminf, but only for the strict mean-payoff objective (i.e., MP > 0).
                      | Energy games (unknown initial credit) | Mean-payoff games
Perfect information   | O(E·Q·W)                              | O(E·Q·W) (this talk); O(E·Q²·W) [ZP96]
Imperfect information | r.e., not co-r.e.                     | ?, not co-r.e.
Theorem: Mean-payoff games are undecidable (not r.e.), for games with at least 2 observations.
Proof: by a reduction from the non-halting problem of 2-counter machines.
Reduction: given M, construct G_M such that M does not halt iff there exists a strategy to ensure a nonnegative mean-payoff value. (+ back-edges)
Nota: the proof works only for limsup and non-strict mean-payoff (i.e., MP ≥ 0).
Reminder: winning strategy = NonterminatingRun
Gadget 3: « avoid halting state »
Gadgets 5 and 6: « Counter correctness »: check zero tests on c; check nonzero test on c
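Correctness:
► If M does not halt, then NonterminatingRun is a winning strategy.
► If M halts, then the Maximizer either cheats within L = Size(AcceptingRun) steps or reaches the halting state, so the mean-payoff value is at most -1/L.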
                      | Energy games (unknown initial credit) | Mean-payoff games
Perfect information   | O(E·Q·W)                              | O(E·Q·W) (this talk); O(E·Q²·W) [ZP96]
Imperfect information | r.e., not co-r.e.                     | not r.e., not co-r.e.
Nota: whether there exists a finite-memory winning strategy in mean-payoff games is also undecidable.
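Energy and mean-payoff games with visible weights are decidable (EXPTIME-complete).
Weights are visible if transitions that the Maximizer cannot distinguish by observation carry the same weight.
With visible weights, the weighted subset construction is finite.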
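The visibility condition can be tested directly on a game description. The sketch below assumes one possible formalization: two transitions are indistinguishable when they agree on the observation of the source state, the action, and the observation of the target state. Both this formalization and the names used are assumptions for illustration.

```python
def weights_are_visible(transitions, obs):
    """True if indistinguishable transitions always carry the same weight."""
    seen = {}
    for (q, a, t, w) in transitions:
        key = (obs[q], a, obs[t])       # what the Maximizer can observe
        if seen.setdefault(key, w) != w:
            return False                # same view, different weights
    return True

# Hypothetical game: states 2 and 3 share the observation "o2".
obs = {1: "o1", 2: "o2", 3: "o2"}
visible = [(1, "a", 2, -1), (1, "a", 3, -1), (2, "b", 1, 1), (3, "b", 1, 1)]
hidden = [(1, "a", 2, -1), (1, "a", 3, -1), (2, "b", 1, 1), (3, "b", 1, 0)]
print(weights_are_visible(visible, obs))
print(weights_are_visible(hidden, obs))
```

When the check succeeds, the energy level is the same in every state of the current knowledge set, which is why a single number per subset is enough in the subset construction.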
                                       | Energy games (unknown initial credit) | Mean-payoff games
Perfect information                    | O(E·Q·W)                              | O(E·Q·W) (this talk); O(E·Q²·W) [ZP96]
Imperfect information, visible weights | EXPTIME-complete                      | EXPTIME-complete
Imperfect information                  | r.e., not co-r.e.                     | not r.e., not co-r.e.
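[BFL+08] P. Bouyer, U. Fahrenberg, K. G. Larsen, N. Markey, and J. Srba. "Infinite Runs in Weighted Timed Automata with Energy Constraints", Proc. of FORMATS: Formal Modeling and Analysis of Timed Systems, LNCS 5215, Springer, pp. 33-47, 2008.
[EM79] A. Ehrenfeucht and J. Mycielski. "Positional Strategies for Mean Payoff Games", International Journal of Game Theory, vol. 8, pp. 109-113, 1979.
[CdAHS03] A. Chakrabarti, L. de Alfaro, T. A. Henzinger, and M. Stoelinga. "Resource Interfaces", Proc. of EMSOFT: Embedded Software, LNCS 2855, Springer, pp. 117-133, 2003.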