 
Stochastic Games with Lexicographic Reachability-Safety Objectives
Krishnendu Chatterjee, Joost-Pieter Katoen, Maximilian Weininger, Tobias Winkler
Highlights 2020
Example: Getting to the (non-virtual) conference

Goal: Plan route from hotel to conference. Consider
• probabilities
• adversarial non-determinism
• multiple objectives

Lexicographic preferences: reach conference > avoid traffic

[Figure: route map starting at the hotel; two edges are labeled p = 0.5 and one location carries a warning sign (!).]
The lex-game problem

Given a turn-based SG, find a strategy $\pi^*$ such that

\[
\inf_{\tau} \bigl[\, v^{\pi^*,\tau}(\Diamond T),\; v^{\pi^*,\tau}(\Box S) \,\bigr]
= \sup_{\pi} \inf_{\tau} \bigl[\, v^{\pi,\tau}(\Diamond T),\; v^{\pi,\tau}(\Box S) \,\bigr]
\]

where $\inf$ and $\sup$ are taken in the lexicographic order. In the running example, $\Diamond T$ is "reach conference" and $\Box S$ is "avoid traffic".
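To make the lexicographic sup-inf concrete, here is a minimal sketch over a finite table of value vectors. The table, the strategy names, and the numbers are invented for illustration; in an actual stochastic game the strategy spaces are not finite tables, so this only demonstrates the order, not a solution method.

```python
# Hypothetical toy table: values[pi][tau] = (P(reach conference), P(avoid traffic)).
# All names and numbers are made up for illustration.
values = {
    "highway":      {"calm": (1.0, 0.5), "jam": (1.0, 0.5)},
    "side_streets": {"calm": (1.0, 0.9), "jam": (0.5, 1.0)},
    "train":        {"calm": (1.0, 0.7), "jam": (1.0, 0.6)},
}

def lex_sup_inf(values):
    """Return the maximizer strategy with the lexicographically best worst case."""
    # Python tuples compare lexicographically, so min/max act as lex inf/sup here.
    worst_case = {pi: min(row.values()) for pi, row in values.items()}
    best_pi = max(worst_case, key=worst_case.get)
    return best_pi, worst_case[best_pi]

best_pi, best_value = lex_sup_inf(values)
print(best_pi, best_value)
```

In this table "side_streets" has the best safety entry in one column, but its worst case loses on the first component, so it is never chosen. The second component only breaks ties among strategies that are equally good for reaching the conference.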
The main idea: Reduction to single objective

\[
\inf_{\tau} \bigl[\, v^{\pi^*,\tau}(\Diamond T),\; v^{\pi^*,\tau}(\Box S) \,\bigr]
= \sup_{\pi} \inf_{\tau} \bigl[\, v^{\pi,\tau}(\Diamond T),\; v^{\pi,\tau}(\Box S) \,\bigr]
\]

Assuming $T$, $S$ are sinks, the problem can be reduced to solving three simple (single-objective) stochastic games.

Proof sketch:
• Solve $\Diamond T$, identify value-0 states, remove sub-optimal actions
• Solve $\Box S$ → strategy $\pi_2$
• Weigh value-0 states with their $\Box S$-value
• Solve (weighted) reachability to the value-0 states → strategy $\pi_1$

Final strategy: Follow $\pi_1$, then follow $\pi_2$ in the value-0 states.
Results | Conclusion

\[
\inf_{\tau} \bigl[\, \ldots,\; v^{\pi^*,\tau}(\star_i T_i),\; \ldots \,\bigr]
= \sup_{\pi} \inf_{\tau} \bigl[\, \ldots,\; v^{\pi,\tau}(\star_i T_i),\; \ldots \,\bigr],
\qquad \star_i \in \{\Diamond, \Box\}
\]

Theorem
The lex-game problem can be reduced to solving
• $O(2^{\#obj})$ (general)
• $O(\#obj)$ (absorbing $T_i$)
many simple (single-objective) stochastic games.

Consequences
• deterministic optimal strategy $\pi^*$ exists
• memory only needed to remember satisfied/violated objectives
• lex-game problem in NP ∩ coNP for absorbing targets or constant $\#obj$
• in practice complexity similar to solving all objectives separately

Conclusion
Lexicographic objectives are a useful add-on to single-objective solvers that allows reasoning about a secondary/tertiary/… objective.

Thanks for your attention!
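The memory consequence above can be illustrated with a small sketch. It assumes that each objective is decided the first time the path enters its set $T_i$ (a reachability objective becomes satisfied, a safety objective read as avoidance becomes violated); this reading and all names below are assumptions, not taken from the slides.

```python
def update_memory(memory, state, target_sets):
    """Add the indices of all objectives whose set contains the current state."""
    hit = {i for i, targets in enumerate(target_sets) if state in targets}
    return memory | frozenset(hit)

def run(path, target_sets):
    """Track along a finite path which objectives have already been decided."""
    memory = frozenset()
    for state in path:
        memory = update_memory(memory, state, target_sets)
    return memory

# Hypothetical objectives: index 0 = reach {"conference"}, index 1 = avoid {"traffic"}.
target_sets = [{"conference"}, {"traffic"}]
print(run(["hotel", "road", "traffic", "conference"], target_sets))
```

Under this assumption the memory is a subset of the objective indices, so there are at most 2^#obj memory states, which matches the exponent in the general bound of the theorem.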