Strategy Synthesis for Multi-dimensional Quantitative Objectives
Krishnendu Chatterjee¹, Mickael Randour², Jean-François Raskin³
¹IST Austria, ²UMONS, ³ULB
04.09.2012
CONCUR 2012: 23rd International Conference on Concurrency Theory
MEPGs & MMPPGs
Synthesis Randomization Conclusion
[Diagram: the synthesis workflow]
The system description and the environment description are modeled as a game.
Functional properties (e.g., no deadlock) and quantitative requirements (e.g., mean response time, fuel consumption) are modeled as a winning objective.
Synthesis: is there a winning strategy?
Yes: the strategy is the controller.
No: empower the system capabilities or revisit the specification requirements.
Chatterjee, Randour, Raskin 1 / 29
restriction to finite-memory strategies.
Study games with
multi-dimensional quantitative objectives (energy and mean-payoff) and a parity objective.
First study of such a conjunction.
Address questions that revolve around strategies: bounds on memory, synthesis algorithm, randomness as a substitute for memory (randomness ∼ memory?).
Memory bounds
  MEPGs: exp.
  MMPPGs (finite-memory): exp.
  MMPPGs (optimal): infinite [CDHR10]
Strategy synthesis (finite memory)
  MEPGs: EXPTIME
  MMPPGs: EXPTIME
Randomness as a substitute for finite memory
               MEGs   EPGs   MMP(P)Gs   MPPGs
  one-player    ×      ×        √         √
  two-player    ×      ×        ×         √
Outline
1 Multi energy and mean-payoff parity games
2 Memory bounds
3 Strategy synthesis
4 Randomization as a substitute to finite-memory
5 Conclusion
1 Multi energy and mean-payoff parity games
[Figure: game graph over the states s0, …, s5]
G = (S1, S2, s_init, E), with S = S1 ∪ S2, S1 ∩ S2 = ∅, E ⊆ S × S.
P1 states and P2 states are drawn with different node shapes.
Plays, prefixes, pure strategies.
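[Figure: the same game graph, with edges labeled by the 2-dimensional weights (2, 1), (1, −2), (0, −2), (−3, 3), (0, 1), (1, −1), (0, 0), (1, 0)]
G = (S1, S2, s_init, E, w), with w : E → Z^k modeling changes in quantities.
Energy level of a prefix ρ = s_0 s_1 … s_n: $EL(\rho) = v_0 + \sum_{i=0}^{n-1} w(s_i, s_{i+1})$
Mean-payoff of a play π: $MP(\pi) = \liminf_{n \to \infty} \frac{1}{n} EL(\pi(n))$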
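A minimal sketch of these two definitions, assuming hypothetical helper names (`energy_level`, `mean_payoff_of_lasso`) and made-up weight sequences that are not taken from the talk. The energy level of a prefix is a running vector sum; for an ultimately periodic play, the mean-payoff is the average weight of the repeated cycle.

```python
from fractions import Fraction

def energy_level(v0, weights):
    """EL(rho) = v0 + sum of the edge weights along the prefix (componentwise)."""
    level = list(v0)
    for w in weights:
        level = [a + b for a, b in zip(level, w)]
    return tuple(level)

def mean_payoff_of_lasso(cycle_weights):
    """Mean-payoff of a play that eventually repeats `cycle_weights` forever.

    The finite prefix before the cycle does not influence the limit, so the
    result is the average weight of the cycle in each dimension.
    """
    n = len(cycle_weights)
    k = len(cycle_weights[0])
    return tuple(Fraction(sum(w[j] for w in cycle_weights), n) for j in range(k))

# Hypothetical 2-dimensional edge weights, for illustration only.
prefix = [(2, 1), (1, -2), (0, -2)]
print(energy_level((0, 3), prefix))
print(mean_payoff_of_lasso([(1, -1), (0, 1)]))
```

Exact rationals are used so that threshold comparisons such as MP(π) ≥ v are not affected by floating-point rounding.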
Unknown initial credit problem: ∃? v0 ∈ N^k, λ1 ∈ Λ1 s.t. the energy level stays non-negative in every dimension along all plays consistent with λ1.
Mean-payoff threshold problem: given v ∈ Q^k, ∃? λ1 ∈ Λ1 s.t. MP(π) ≥ v for all plays π consistent with λ1.
[Figure: the same weighted game graph, with states annotated by priorities p(s0) = 1, p(s1) = 1, p(s2) = 3, p(s3) = 2, p(s4) = 0, p(s5) = 2]
G_p: the game G extended with a priority function p : S → N.
Par(π) = min {p(s) | s ∈ Inf(π)}
Even parity problem: ∃? λ1 ∈ Λ1 s.t. the parity is even.
Canonical way to express ω-regular objectives.
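As a small illustration of the parity definition, the sketch below evaluates Par(π) on an ultimately periodic play, where Inf(π) is the set of states on the repeated cycle. The priorities are the ones shown on the slide; the two cycles are assumptions, since the edge structure of the figure is not recoverable from this transcript.

```python
def parity_of_lasso(priority, cycle_states):
    """Par(pi) = minimal priority among the states visited infinitely often."""
    return min(priority[s] for s in cycle_states)

def is_even_parity(priority, cycle_states):
    """True if the play repeating `cycle_states` satisfies the even parity objective."""
    return parity_of_lasso(priority, cycle_states) % 2 == 0

# Priorities from the slide.
p = {"s0": 1, "s1": 1, "s2": 3, "s3": 2, "s4": 0, "s5": 2}

# Hypothetical cycles (assumed to exist in the graph, for illustration only).
print(is_even_parity(p, ["s3", "s5"]))
print(is_even_parity(p, ["s0", "s1", "s3"]))
```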
                      Memory (P1)    Decision problem
Energy
  1-dim               memoryless     NP ∩ coNP [CdAHS03, BFL+08]
  k-dim               finite         coNP-c [CDHR10]
  1-dim + parity      exponential    NP ∩ coNP [CD10]
Mean-payoff
  1-dim               memoryless     NP ∩ coNP [EM79, LL69]
  k-dim               infinite       coNP-c (fin.) [CDHR10]
  1-dim + parity      infinite       NP ∩ coNP [CHJ05, BMOU11]
Example for MMPGs, even with only one player! [CDHR10]
[Figure: one-player game with states s0 and s1; edge weights (2, 0), (0, 2), (0, 0), (0, 0)]
To obtain MP(π) = (1, 1) (with lim sup, (2, 2)!), P1 has to visit s0 and s1 for longer and longer intervals before jumping from one to the other.
Any finite-memory strategy alternating between these edges induces an ultimately periodic play s.t. MP(π) = (x, y), x + y < 2.
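The claim about finite memory can be checked numerically. The sketch below assumes one reading of the figure: a self-loop of weight (2, 0) on s0, a self-loop of weight (0, 2) on s1, and weight (0, 0) on the two edges between them. The function name `periodic_mean_payoff` is hypothetical. A strategy that loops n times in each state before switching yields x + y = 2n/(n + 1), which approaches 2 but never reaches it.

```python
from fractions import Fraction

def periodic_mean_payoff(n):
    """Mean-payoff of the periodic play: n loops on s0, switch, n loops on s1, switch."""
    cycle = [(2, 0)] * n + [(0, 0)] + [(0, 2)] * n + [(0, 0)]
    length = len(cycle)
    x = Fraction(sum(w[0] for w in cycle), length)
    y = Fraction(sum(w[1] for w in cycle), length)
    return x, y

for n in (1, 10, 100):
    x, y = periodic_mean_payoff(n)
    print(n, x, y, x + y < 2)
```

Larger n needs more memory to count the loops, which is the intuition behind the infinite-memory requirement for reaching (1, 1) exactly.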
Infinite memory:
needed for MMPGs & MPPGs, practical implementation is unrealistic.
Finite memory:
preserves game determinacy, provides equivalence between energy and mean-payoff settings, the way to go for strategy synthesis.
2 Memory bounds
Memory bounds
  MEPGs: exp.
  MMPPGs (finite-memory): exp.
  MMPPGs (optimal): infinite [CDHR10]
By [CDHR10], we only have to consider MEPGs. Recall that the unknown initial credit decision problem for MEGs (without parity) is coNP-complete.
[Figure: multi energy parity game over the states s0, …, s5, with state priorities and the 2-dimensional edge weights (−1, 1), (0, 2), (0, 1), (0, 0), (1, −1), (−2, 1), (−2, 1), (0, −1), (2, 0)]
A winning strategy λ1 for initial credit v0 = (2, 0) is
λ1(∗s1s3) = s4, λ1(∗s2s3) = s5, λ1(∗s5s3) = s5.
Lemma: To win, P1 must be able to enforce positive cycles of even parity.
Self-covering paths on VASS [Rac78, RY86]. Self-covering trees (SCTs) on reachability games over VASS [BJK10].
[Figure: self-covering tree with node labels ⟨s0, (0, 0)⟩, ⟨s1, (−1, 1)⟩, ⟨s2, (0, 2)⟩, ⟨s3, (−1, 2)⟩, ⟨s3, (0, 2)⟩, ⟨s4, (0, 1)⟩, ⟨s5, (−2, 3)⟩, ⟨s0, (0, 0)⟩, ⟨s3, (0, 3)⟩]
Pebble moves ⇒ strategy.
T = (Q, R) is an epSCT for s0, and Θ : Q → S × Z^k is a labeling function.
The root is labeled ⟨s0, (0, …, 0)⟩.
Non-leaf nodes have a unique child if they belong to P1, and all possible children if they belong to P2.
Leaves have even-descendance energy ancestors: ancestors with a lower label such that the minimal priority on the downward path is even.
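The leaf condition can be sketched as a check along one root-to-leaf path. This is an illustrative reading of the definition, assuming that "lower label" means the same state with a componentwise lower-or-equal energy vector. The function name and the priorities in the example are hypothetical, since the priorities of the figure are not recoverable from this transcript.

```python
def has_even_energy_ancestor(path):
    """`path` lists (state, energy_vector, priority) from the root to a leaf.

    Returns True if some strict ancestor of the leaf has the same state, an
    energy vector that is componentwise <= the leaf's, and the minimal
    priority seen from that ancestor down to the leaf is even.
    """
    leaf_state, leaf_energy, _ = path[-1]
    for i in range(len(path) - 1):
        state, energy, _ = path[i]
        same_state = state == leaf_state
        lower = all(a <= b for a, b in zip(energy, leaf_energy))
        min_priority = min(p for _, _, p in path[i:])
        if same_state and lower and min_priority % 2 == 0:
            return True
    return False

# One branch of the tree from the figure, with assumed priorities.
branch = [("s0", (0, 0), 2), ("s1", (-1, 1), 3), ("s3", (-1, 2), 2),
          ("s4", (0, 1), 3), ("s0", (0, 0), 2)]
print(has_even_energy_ancestor(branch))
```

When the pebble reaches such a leaf, it can be moved back to the matching ancestor: the energy has not decreased and the cycle just closed has even minimal priority.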
P1 wins ⇒ ∃ SCT of depth at most exponential [BJK10]. If there exists a winning strategy, there exists a “compact” one. Idea is to eliminate unnecessary cycles. Limits: weights in {−1, 0, 1}, no parity, depth only.
Bounds on the self-covering tree (depth, then width):
Depth with weights in {−1, 0, 1} and no parity: exp.
⇓ Arbitrary weights, no parity. Naive approach: depth 2-exp. Refined approach (preserve branching, add parity): depth 1-exp.
⇓ Width exp. in depth. Naive approach: width 3-exp. Refined approach (encode parity as additional energy dimensions, merge nodes based on energy levels): width 1-exp.
Lemma: There exists a family of multi energy games (G(K))_{K≥1}, with G(K) = (S1, S2, s_init, E, k = 2 · K, w : E → {−1, 0, 1}^k), s.t. for any initial credit, P1 needs exponential memory to win.
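[Figure: gadgets s_1, …, s_K followed by gadgets t_1, …, t_K; each s_i has successors s_{i,L} and s_{i,R}, each t_i has successors t_{i,L} and t_{i,R}]
∀ 1 ≤ i ≤ K, w((◦, s_i)) = w((◦, t_i)) = (0, …, 0),
w((s_i, s_{i,L})) = −w((s_i, s_{i,R})) = w((t_i, t_{i,L})) = −w((t_i, t_{i,R})),
∀ 1 ≤ j ≤ k,
$$w((s_i, s_{i,L}))(j) = \begin{cases} 1 & \text{if } j = 2 \cdot i - 1, \\ -1 & \text{if } j = 2 \cdot i, \\ 0 & \text{otherwise.} \end{cases}$$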
If P1 plays according to a Moore machine with fewer than 2^K states, he takes the same decision in some state t_x for the two highlighted prefixes (let x = K w.l.o.g.).
⇒ P2 can force a decrease by 2 on some dimension every visit.
⇒ P1 loses for any v0 ∈ N^k.
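The counting argument can be illustrated on the weights of this family. The sketch assumes that P2 picks a side (L or R) in each s_i and P1 then picks a side in each t_i; the function names are hypothetical. Since the s_i and t_i edges carry the same weights, picking the same side as P2 in gadget i adds ±2 on dimensions 2i − 1 and 2i, while picking the opposite side cancels out.

```python
from itertools import product

def edge_weight(i, side, K):
    """Weight of (s_i, s_{i,side}) and of (t_i, t_{i,side}), for 1 <= i <= K."""
    w = [0] * (2 * K)
    w[2 * i - 2] = 1       # dimension j = 2i - 1 (1-indexed)
    w[2 * i - 1] = -1      # dimension j = 2i
    return w if side == "L" else [-x for x in w]

def round_effect(p2_choices, p1_choices):
    """Total weight of one traversal of all s_i and t_i gadgets."""
    K = len(p2_choices)
    total = [0] * (2 * K)
    for i in range(1, K + 1):
        for side in (p2_choices[i - 1], p1_choices[i - 1]):
            total = [a + b for a, b in zip(total, edge_weight(i, side, K))]
    return total

def safe_replies(p2_choices):
    """P1 replies that do not decrease any dimension over the traversal."""
    K = len(p2_choices)
    return [r for r in product("LR", repeat=K)
            if min(round_effect(p2_choices, r)) >= 0]

K = 3
for p2 in product("LR", repeat=K):
    print(p2, safe_replies(p2))
```

Each of the 2^K choice sequences of P2 has exactly one safe reply, and these replies are pairwise distinct, so P1 has to distinguish all 2^K histories.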
3 Strategy synthesis
Algorithm CpreFP for MEPGs and MMPPGs: symbolic (antichains) and incremental, winning strategy of at most exponential size, worst-case exponential time.
Idea: greatest fixed point of a Cpre_C operator.
Compute for each state the set of winning initial credits, represented by the minimal elements of upper-closed sets.
Parameter C: range of energy levels to consider (incremental, ensures convergence).
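The antichain representation mentioned here can be sketched as follows. This is only an illustration of the data structure, with hypothetical function names, not the implementation behind the talk: an upper-closed set of energy vectors is stored through its minimal elements, and membership reduces to a domination test.

```python
def leq(u, v):
    """Componentwise order on energy vectors."""
    return all(a <= b for a, b in zip(u, v))

def minimal_elements(vectors):
    """Antichain of minimal elements generating the upward closure of `vectors`."""
    vs = set(vectors)
    return {v for v in vs if not any(u != v and leq(u, v) for u in vs)}

def in_upward_closure(antichain, v):
    """v belongs to the upper-closed set iff it dominates some minimal element."""
    return any(leq(m, v) for m in antichain)

credits = [(1, 0), (0, 1), (1, 1), (2, 0)]
antichain = minimal_elements(credits)
print(antichain)
print(in_upward_closure(antichain, (3, 0)), in_upward_closure(antichain, (0, 0)))
```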
[Figure: sets of energy levels, bounded by C in each dimension, for the states s, s′ and s′′]
P1 can win for energy levels in the upper closed sets.
Correctness: $(s_{init}, (c_1, \ldots, c_k)) \in Cpre_C^{*}$ ⇒ there is a winning strategy for initial credit $(c_1, \ldots, c_k)$.
Completeness: a winning strategy and an SCT of depth $l$ ⇒ $(s_{init}, (C, \ldots, C)) \in Cpre_C^{*}$ for $C = 2 \cdot l \cdot W$ (cf. max init. credit).
4 Randomization as a substitute to finite-memory
When and how can P1 trade his pure finite-memory strategy for an equally powerful randomized memoryless one?
Sure semantics → almost-sure semantics (i.e., probability 1).
Illustration on single mean-payoff Büchi games.
[Figure: mean-payoff Büchi game with states s0 and s1 and an edge of weight −1]
P1 has to delay his visits of s1 for longer and longer intervals.
Lemma: In MPBGs, ε-optimality can be achieved surely by pure finite-memory strategies and almost-surely by randomized memoryless strategies.
[Figure: game with states s0 and s1 and two edges of weight −1]
1 Uniform memoryless strategies:
$\lambda_1^{gfe}$ ensures any cycle c has EL(c) ≥ 0 [CD10],
$\lambda_1^{\Diamond F}$ ensures reaching F in at most n steps (attractor).
2 Alternate using pure memory or probability distributions.
Frequency of $\lambda_1^{gfe}$ → 1 ⇒ MP → MP*.
Randomness as a substitute for finite memory
               MEGs   EPGs   MMP(P)Gs   MPPGs
  one-player    ×      ×        √         √
  two-player    ×      ×        ×         √
5 Conclusion
Quantitative objectives and parity
Restriction to finite memory (practical interest)
Exponential memory bounds
EXPTIME symbolic and incremental synthesis
Randomness instead of memory
Memory bounds
  MEPGs: exp.
  MMPPGs (finite-memory): exp.
  MMPPGs (optimal): infinite [CDHR10]
Strategy synthesis (finite memory)
  MEPGs: EXPTIME
  MMPPGs: EXPTIME
Randomness as a substitute for finite memory
               MEGs   EPGs   MMP(P)Gs   MPPGs
  one-player    ×      ×        √         √
  two-player    ×      ×        ×         √
[BFL+08] Patricia Bouyer, Ulrich Fahrenberg, Kim Guldstrand Larsen, Nicolas Markey, and Jiří Srba. Infinite runs in weighted timed automata with energy constraints. In Proc. of FORMATS, volume 5215 of LNCS, pages 33–47. Springer, 2008.
[BJK10] Tomáš Brázdil, Petr Jančar, and Antonín Kučera. Reachability games on extended vector addition systems with states. In Proc. of ICALP, volume 6199 of LNCS, pages 478–489. Springer, 2010.
[BMOU11] Patricia Bouyer, Nicolas Markey, Jörg Olschewski, and Michael Ummels. Measuring permissiveness in parity games: Mean-payoff parity games revisited. In Proc. of ATVA, volume 6996 of LNCS, pages 135–149. Springer, 2011.
[CD10] Krishnendu Chatterjee and Laurent Doyen. Energy parity games. In Proc. of ICALP, volume 6199 of LNCS, pages 599–610. Springer, 2010.
[CdAHS03] Arindam Chakrabarti, Luca de Alfaro, Thomas A. Henzinger, and Mariëlle Stoelinga. Resource interfaces. In Proc. of EMSOFT, volume 2855 of LNCS, pages 117–133. Springer, 2003.
[CDHR10] Krishnendu Chatterjee, Laurent Doyen, Thomas A. Henzinger, and Jean-François Raskin. Generalized mean-payoff and energy games. In Proc. of FSTTCS, volume 8 of LIPIcs, pages 505–516. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 2010.
[CHJ05] Krishnendu Chatterjee, Thomas A. Henzinger, and Marcin Jurdziński. Mean-payoff parity games. In Proc. of LICS, pages 178–187. IEEE Computer Society, 2005.
[EM79] A. Ehrenfeucht and J. Mycielski. Positional strategies for mean payoff games. International Journal of Game Theory, 8(2):109–113, 1979.
[LL69] T.M. Liggett and S.A. Lippman. Short notes: Stochastic games with perfect information and time average payoff. SIAM Review, 11(4):604–607, 1969.
[Rac78] Charles Rackoff. The covering and boundedness problems for vector addition systems.
[RY86] Louis E. Rosier and Hsu-Chun Yen. A multiparameter analysis of the boundedness problem for vector addition systems.
Depth bound from [BJK10] (exp.):
$w : E \to \{-1, 0, 1\}^k$, $l = 2^{(d-1) \cdot |S|} \cdot (|S| + 1)^{c \cdot k^2}$.

Naive approach (depth 2-exp., width 3-exp.):
⇓ $w : E \to \mathbb{Z}^k$, W max absolute weight, V bits to encode W:
$l = 2^{(d-1) \cdot W \cdot |S|} \cdot (W \cdot |S| + 1)^{c \cdot k^2} = 2^{(d-1) \cdot 2^V \cdot |S|} \cdot (W \cdot |S| + 1)^{c \cdot k^2}$.
Blow-up by W in the size of the state space.
⇓ Width bounded by $L = d^l$: width increases exponentially with depth.
Overall, 3-exp. memory ≤ L · l, without parity.

Refined approach (depth 1-exp., width 1-exp.):
⇓ $w : E \to \mathbb{Z}^k$, W max absolute weight:
$l = 2^{(d-1) \cdot |S|} \cdot (W \cdot |S| + 1)^{c \cdot k^2}$.
No blow-up in the exponent as branching is preserved; extension to parity.
⇓ Width bounded by $L = |S| \cdot (2 \cdot l \cdot W + 1)^k$: merge equivalent nodes, width is bounded by the number of incomparable labels.
Overall, single exp. memory ≤ L · l, for multi energy along with parity. Initial credit bounded by l · W.
Thanks to the bound on depth for MEPGs, encode parity (2 · m priorities) as m additional energy dimensions.
For each odd priority, add one dimension. Decrease by 1 when this odd priority is visited. Increase by l each time a smaller even priority is visited.
P1 maintains the energy positive on all additional dimensions iff he wins the original parity objective.
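A minimal sketch of this encoding, with a hypothetical function name and hypothetical values for the priorities and for the depth bound l: each odd priority gets its own extra dimension, and visiting a state contributes one weight per extra dimension.

```python
def parity_to_energy(priority, odd_priorities, l):
    """Extra energy weights contributed by visiting a state of the given priority.

    One dimension per odd priority q: -1 when q itself is visited,
    +l when a smaller even priority is visited, 0 otherwise.
    """
    extra = []
    for q in odd_priorities:
        if priority == q:
            extra.append(-1)
        elif priority % 2 == 0 and priority < q:
            extra.append(l)
        else:
            extra.append(0)
    return tuple(extra)

# Priorities {0, 1, 2, 3}: two odd priorities, hence two extra dimensions.
odd = [1, 3]
for p in range(4):
    print(p, parity_to_energy(p, odd, l=5))
```

Seeing an odd priority forever without any smaller even priority drains its dimension, which is how a violated parity objective shows up as a violated energy objective.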
Key idea to reduce width to single exp.
P1 only cares about the energy level. If he can win with energy v, he can win with energy ≥ v.
[Figure: the multi energy game over s0, …, s5 with edge weights (−1, 1), (0, 2), (0, 1), (0, 0), (1, −1), (−2, 1), (0, −1), (2, 0), and its self-covering tree with node labels ⟨s0, (0, 0)⟩, ⟨s1, (−1, 1)⟩, ⟨s2, (0, 2)⟩, ⟨s3, (−1, 2)⟩, ⟨s3, (0, 2)⟩, ⟨s4, (0, 1)⟩, ⟨s5, (−2, 3)⟩, ⟨s0, (0, 0)⟩, ⟨s3, (0, 3)⟩]
$C = 2 \cdot l \cdot W \in \mathbb{N}$,
$U(C) = (S_1 \cup S_2) \times \{0, 1, \ldots, C\}^k$,
$\mathcal{U}(C) = 2^{U(C)}$, the powerset of $U(C)$,
$Cpre_C : \mathcal{U}(C) \to \mathcal{U}(C)$,
$$Cpre_C(V) = \{(s_1, e_1) \in U(C) \mid s_1 \in S_1 \wedge \exists (s_1, s) \in E, \exists (s, e_2) \in V : e_2 \le e_1 + w(s_1, s)\} \cup \{(s_2, e_2) \in U(C) \mid s_2 \in S_2 \wedge \forall (s_2, s) \in E, \exists (s, e_1) \in V : e_1 \le e_2 + w(s_2, s)\}$$
Exponential bound on the size of manipulated sets (∼ width).
Exponential bound on the number of iterations if a winning strategy exists (∼ depth).
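The operator can be prototyped by explicit enumeration on a toy example. The sketch below is not the symbolic antichain algorithm of the talk: it enumerates U(C) explicitly, ignores parity, and runs on a hypothetical two-state game with a small hand-picked C. All names are assumptions made for illustration.

```python
from itertools import product

def cpre(V, S1, S2, E, w, C, k):
    """One application of Cpre_C to a set V of (state, energy) pairs."""
    def covered(t, target):
        # Is there (t, f) in V with f <= target componentwise?
        return any(u == t and all(a <= b for a, b in zip(f, target))
                   for (u, f) in V)

    result = set()
    for s in S1 | S2:
        successors = [t for (u, t) in E if u == s]
        for e in product(range(C + 1), repeat=k):
            good = [covered(t, tuple(a + b for a, b in zip(e, w[(s, t)])))
                    for t in successors]
            # P1 needs one good edge, P2 states need all edges to be good.
            if (s in S1 and any(good)) or (s in S2 and all(good)):
                result.add((s, e))
    return result

def greatest_fixed_point(S1, S2, E, w, C, k):
    """Iterate Cpre_C downwards from the full set U(C) until it stabilizes."""
    V = {(s, e) for s in S1 | S2 for e in product(range(C + 1), repeat=k)}
    while True:
        nxt = cpre(V, S1, S2, E, w, C, k)
        if nxt == V:
            return V
        V = nxt

def minimal_credits(V, s):
    """Minimal winning energy vectors for state s."""
    es = {e for (u, e) in V if u == s}
    return {e for e in es
            if not any(f != e and all(a <= b for a, b in zip(f, e)) for f in es)}

# Hypothetical toy game: P1 owns "a", P2 owns "b".
S1, S2 = {"a"}, {"b"}
E = {("a", "a"), ("a", "b"), ("b", "a")}
w = {("a", "a"): (-1, 1), ("a", "b"): (1, -1), ("b", "a"): (0, 0)}
win = greatest_fixed_point(S1, S2, E, w, C=2, k=2)
print(minimal_credits(win, "a"))
```

In this toy game P1 wins from "a" with one unit of energy in either dimension by alternating between its two edges, but not with credit (0, 0).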
[Figure: game with states s0 and s1 and two edges of weight −1]
1 Let G = (S1, S2, s_init, E, w, F), with F the set of Büchi states. Let n = |S|. Let Win be the set of winning states for the MPB objective with threshold 0 (w.l.o.g.). For all s ∈ Win, P1 has two uniform memoryless strategies $\lambda_1^{gfe}$ and $\lambda_1^{\Diamond F}$ s.t.
$\lambda_1^{gfe}$ ensures that any cycle c of its outcome has EL(c) ≥ 0 [CD10],
$\lambda_1^{\Diamond F}$ ensures reaching F in at most n steps, while staying in Win.
2 For ε > 0, we build a pure finite-memory strategy $\lambda_1^{pf}$ s.t.
(a) it plays $\lambda_1^{gfe}$ for $\frac{2 \cdot W \cdot n}{\varepsilon} - n$ steps, then
(b) it plays $\lambda_1^{\Diamond F}$ for n steps, then again (a).
This ensures that
F is visited infinitely often,
the total cost of phases (a) + (b) is bounded by −2 · W · n, and thus the mean-payoff is at least −ε.
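The arithmetic behind the −ε guarantee can be made explicit. The sketch below uses hypothetical function names, rounds the period up to an integer number of steps, and assumes ε < 2 · W so that phase (a) has positive length. One period lasts at least 2 · W · n / ε steps and costs at least −2 · W · n in total, so the average per step is at least −ε.

```python
from fractions import Fraction
from math import ceil

def phase_lengths(W, n, eps):
    """Lengths of phase (a) and phase (b) for weight bound W, n states and eps > 0."""
    period = ceil(Fraction(2 * W * n) / Fraction(eps))
    return period - n, n

def worst_case_mean_payoff(W, n, eps):
    """Lower bound on the mean-payoff: total cost -2*W*n spread over one period."""
    a, b = phase_lengths(W, n, eps)
    return Fraction(-2 * W * n, a + b)

W, n, eps = 3, 4, Fraction(1, 2)
print(phase_lengths(W, n, eps))
print(worst_case_mean_payoff(W, n, eps) >= -eps)
```

The memory needed is a counter up to the period length, which grows as ε shrinks.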
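3 Based on $\lambda_1^{gfe}$ and $\lambda_1^{\Diamond F}$, we obtain almost-surely ε-optimal randomized memoryless strategies against all strategies λ2 ∈ Λ2 of P2; it suffices to consider the pure memoryless ones, i.e.,
$$\forall \varepsilon > 0, \exists \lambda_1^{rm} \in \Lambda_1^{RM}, \forall \lambda_2^{pm} \in \Lambda_2^{PM}: \; \mathbb{P}_{s_{init}}^{\lambda_1^{rm}, \lambda_2^{pm}}(Par(\pi) \bmod 2 = 0) = 1 \;\wedge\; \mathbb{P}_{s_{init}}^{\lambda_1^{rm}, \lambda_2^{pm}}(MP(\pi) \ge -\varepsilon) = 1.$$
Strategy: ∀ s ∈ S,
$$\lambda_1^{rm}(s) = \begin{cases} \lambda_1^{gfe}(s) & \text{with probability } 1 - \gamma, \\ \lambda_1^{\Diamond F}(s) & \text{with probability } \gamma, \end{cases}$$
for some well-chosen γ ∈ ]0, 1[.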
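The effect of γ can be illustrated on a toy one-player instance. The structure below is an assumption, since the figure is not recoverable from this transcript: in s0, the gfe move is a self-loop of weight 0 and the ♦F move goes to the Büchi state s1 with weight −1; from s1 the play returns to s0 with weight −1. Under the randomized memoryless strategy this is a two-state Markov chain whose long-run average weight is −2γ/(1 + γ).

```python
from fractions import Fraction

def expected_mean_payoff(gamma):
    """Long-run average weight of the assumed two-state chain for a given gamma."""
    gamma = Fraction(gamma)
    pi_s0 = 1 / (1 + gamma)          # stationary probability of s0
    pi_s1 = gamma / (1 + gamma)      # stationary probability of s1
    # In s0 the expected weight is gamma * (-1); in s1 the weight is always -1.
    return pi_s0 * gamma * (-1) + pi_s1 * (-1)

def gamma_for(eps):
    """One sufficient choice of gamma for this toy chain (an assumption)."""
    return Fraction(eps) / 2

eps = Fraction(1, 5)
gamma = gamma_for(eps)
print(gamma, expected_mean_payoff(gamma), expected_mean_payoff(gamma) >= -eps)
```

Because the chain is finite and irreducible for any γ ∈ ]0, 1[, the average along almost every play equals this value, and s1 is visited infinitely often with probability 1.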
Büchi
The probability of playing as $\lambda_1^{\Diamond F}$ for n steps in a row, and thus of ensuring a visit of F, is strictly positive at all times.
Thus $\lambda_1^{rm}$ is almost-sure winning for the Büchi objective.
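The argument can be spelled out as a short calculation, assuming that each move of P1 independently follows $\lambda_1^{\Diamond F}$ with probability γ, and cutting the play into consecutive blocks of n steps:

```latex
\Pr\bigl[\text{P1 plays as } \lambda_1^{\Diamond F} \text{ for } n \text{ consecutive steps}\bigr] \;\ge\; \gamma^{n} \;>\; 0,
\qquad
\Pr\bigl[\text{no visit of } F \text{ during } j \text{ consecutive blocks of } n \text{ steps}\bigr] \;\le\; (1 - \gamma^{n})^{j} \xrightarrow[j \to \infty]{} 0 .
```

Hence, from any point of the play, F is visited again with probability 1, which gives infinitely many visits almost surely.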
Mean-payoff
Consider all end components (ECs) in all Markov chains (MCs) induced by pure memoryless strategies of P2.
Choose γ so that all ECs have expectation > −ε.
Put more probability on lengthy sequences of gfe edges.