Stochastic Games with Lexicographic Reachability-Safety Objectives
Krishnendu Chatterjee, Joost-Pieter Katoen, Maximilian Weininger, Tobias Winkler
Highlights 2020
Example: Getting to the (non-virtual) conference
Goal: Plan a route from the hotel to the conference. Consider
- probabilities
- adversarial non-determinism
- multiple objectives
Lexicographic preferences: reach conference > avoid traffic
[Figure: route map starting at the hotel, with a random branch (p = 0.5 each way) and a warning sign (!)]
The lex-game problem
Given a turn-based stochastic game (SG), find a strategy $\pi^*$ such that
$$\inf_{\tau} \left[ v^{\pi^*,\tau}(\Diamond T),\ v^{\pi^*,\tau}(\Box S) \right] = \sup_{\pi} \inf_{\tau} \left[ v^{\pi,\tau}(\Diamond T),\ v^{\pi,\tau}(\Box S) \right]$$
where $\inf$ and $\sup$ are taken in the lexicographic order.
Here $\Diamond T$ stands for "reach conference" and $\Box S$ for "avoid traffic".
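To make the lexicographic sup/inf concrete, here is a minimal Python sketch. It assumes, purely for illustration, that both players choose from a small finite table of strategies; the strategy names and all numbers are hypothetical and not taken from the talk. Python tuples already compare lexicographically, so `min` and `max` on pairs do the work.

```python
# Hypothetical table: outcome[pi][tau] = (Pr[reach conference], Pr[avoid traffic]).
outcome = {
    "highway": {"calm": (0.9, 0.8), "rush": (0.5, 0.2)},
    "side":    {"calm": (0.5, 0.9), "rush": (0.5, 0.9)},
    "walk":    {"calm": (0.0, 1.0), "rush": (0.0, 1.0)},
}

def lex_sup_inf(outcome):
    """Return the strategy pi maximizing the lexicographic worst case over tau."""
    # Tuples compare component by component, i.e. in lexicographic order.
    worst = {pi: min(row.values()) for pi, row in outcome.items()}
    best = max(worst, key=worst.get)
    return best, worst[best]

best_pi, value = lex_sup_inf(outcome)
print(best_pi, value)
```

In this table, "highway" and "side" both guarantee reaching the conference with probability 0.5, so the second component breaks the tie in favor of "side". Real stochastic games have far too many strategies to enumerate like this.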
The main idea: Reduction to single objective
$$\inf_{\tau} \left[ v^{\pi^*,\tau}(\Diamond T),\ v^{\pi^*,\tau}(\Box S) \right] = \sup_{\pi} \inf_{\tau} \left[ v^{\pi,\tau}(\Diamond T),\ v^{\pi,\tau}(\Box S) \right]$$
Assuming $T, S$ are sinks, the problem can be reduced to solving three simple (single-objective) stochastic games.
Proof sketch:
- Solve $\Diamond T$, identify the value-0 states, and remove sub-optimal actions.
- Solve $\Box S$ to obtain strategy $\pi_2$.
- Weigh the value-0 states with their $\Box S$ value.
- Solve (weighted) reachability to the value-0 states to obtain strategy $\pi_1$.
- Final strategy: Follow $\pi_1$, then follow $\pi_2$ in the value-0 states.
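The steps of the proof sketch can be tried out on a toy game. The following Python sketch is an illustration under several assumptions that are not from the talk: the game is acyclic and every play ends in a sink (so a few sweeps of value iteration are exact), each sink either is a traffic sink or is not, the safety game is solved on the pruned game, and target sinks also receive their safety payoff in the weighted step. The game `GAME` and all function names are hypothetical.

```python
EPS = 1e-9

# Hypothetical toy game: state -> (owner, {action: {successor: probability}}).
# Sinks have no actions; "fork" belongs to the adversary ("min").
GAME = {
    "hotel": ("max", {"highway": {"conf": 0.5, "fork": 0.5},
                      "side": {"conf": 0.5, "home": 0.5},
                      "walk": {"home": 1.0}}),
    "fork": ("min", {"into_jam": {"jam": 1.0}, "back": {"home": 1.0}}),
    "conf": ("max", {}), "jam": ("max", {}), "home": ("max", {}),
}

def expect(dist, v):
    """Expected value of v under a successor distribution."""
    return sum(p * v[t] for t, p in dist.items())

def solve(game, payoff, keep=None):
    """Max-min expected payoff; states listed in `payoff` are terminal."""
    v = {s: payoff.get(s, 0.0) for s in game}
    for _ in range(len(game)):  # enough sweeps for an acyclic game
        for s, (owner, acts) in game.items():
            if s in payoff:
                continue
            names = acts if keep is None else keep[s]
            vals = [expect(acts[a], v) for a in names]
            v[s] = max(vals) if owner == "max" else min(vals)
    return v

def optimal_actions(game, v):
    """Keep only the actions whose expected value equals the state value."""
    return {s: [a for a, d in acts.items() if abs(expect(d, v) - v[s]) <= EPS]
            for s, (_, acts) in game.items()}

def lex_value(game, target, traffic, start):
    sinks = [s for s, (_, acts) in game.items() if not acts]
    reach_pay = {s: float(s in target) for s in sinks}
    safe_pay = {s: float(s not in traffic) for s in sinks}
    v_reach = solve(game, reach_pay)            # solve the reachability game
    keep = optimal_actions(game, v_reach)       # remove sub-optimal actions
    v_safe = solve(game, safe_pay, keep)        # solve the safety game
    weights = dict(safe_pay)                    # assumption: sinks keep their payoff
    weights.update({s: v_safe[s] for s in game if v_reach[s] <= EPS})
    v_weighted = solve(game, weights, keep)     # weighted reachability
    return v_reach[start], v_weighted[start]

print(lex_value(GAME, {"conf"}, {"jam"}, "hotel"))
```

On this toy game the result is `(0.5, 1.0)`: "highway" and "side" are both optimal for reaching the conference, and the weighted step prefers "side" because the adversary can steer "highway" into the jam. Games with cycles need more care than this sketch provides.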
Results | Conclusion
$$\inf_{\tau} \left[ \ldots, v^{\pi^*,\tau}(\star_i T_i), \ldots \right] = \sup_{\pi} \inf_{\tau} \left[ \ldots, v^{\pi,\tau}(\star_i T_i), \ldots \right], \quad \star_i \in \{\Diamond, \Box\}$$
Theorem: The lex-game problem can be reduced to solving
- $O(2^{\#obj})$ (general)
- $O(\#obj)$ (absorbing $T_i$)
many simple (single-objective) stochastic games.
Consequences
- a deterministic optimal strategy $\pi^*$ exists
- memory is only needed to remember satisfied/violated objectives
- the lex-game problem is in $NP \cap coNP$ for absorbing targets or constant $\#obj$
- in practice, complexity is similar to solving all objectives separately
Conclusion: Lexicographic objectives are a useful add-on to single-objective solvers that allows reasoning about a secondary/tertiary/… objective
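The memory requirement can be pictured with a small Python sketch. It assumes, as an illustration only, that objective i is settled the first time its set T_i is visited (a reachability objective is then satisfied; a safety objective read as "avoid T_i" is then violated). The function names and the example path are hypothetical.

```python
def update_memory(memory: frozenset, state: str, sets: list) -> frozenset:
    """Add the index of every objective whose set contains the current state."""
    return memory | {i for i, t in enumerate(sets) if state in t}

def run(path: list, sets: list) -> frozenset:
    """Memory content after following a path from the empty memory."""
    memory = frozenset()
    for state in path:
        memory = update_memory(memory, state, sets)
    return memory

# Objective 0: reach {"conf"}; objective 1: avoid {"jam"} (hypothetical sets).
sets = [{"conf"}, {"jam"}]
print(run(["hotel", "jam", "conf"], sets))
```

With n objectives this memory takes at most 2^n different values, since it is a subset of the objective indices.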
Thanks for your attention!