SLIDE 1

Stochastic Games with Lexicographic Reachability-Safety Objectives

Krishnendu Chatterjee, Joost-Pieter Katoen, Maximilian Weininger, Tobias Winkler

Highlights 2020

SLIDE 7

Example: Getting to the (non-virtual) conference

Goal: Plan a route from the hotel to the conference. Consider

  • probabilities
  • adversarial non-determinism
  • multiple objectives

Lexicographic preferences: reach conference > avoid traffic

[Figure: map with the hotel, a "!" warning marker, and two branches each labeled p = 0.5]

SLIDE 8

The lex-game problem

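Given a turn-based SG, find a strategy $\pi^*$ such that

\[
\inf_{\tau} \big[\, v^{\pi^*,\tau}(\Diamond T),\; v^{\pi^*,\tau}(\Box S) \,\big]
= \sup_{\pi} \inf_{\tau} \big[\, v^{\pi,\tau}(\Diamond T),\; v^{\pi,\tau}(\Box S) \,\big]
\]

where $\inf$ and $\sup$ are taken in the lexicographic order.

Here $\Diamond T$ is the first objective (reach conference) and $\Box S$ is the second (avoid traffic).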
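To make the lexicographic sup-inf concrete, here is a minimal Python sketch over a finite table of value vectors. It is only a toy: the strategy names (`bus`, `walk`, `taxi`, `calm`, `rush`), the numbers, and the helper `lex_maxmin` are made up for illustration, and a real stochastic game does not come with such a table precomputed. The sketch relies on the fact that Python compares tuples lexicographically.

```python
from typing import Dict, Tuple

# A value vector: (probability of the first objective, probability of the second).
Value = Tuple[float, float]


def lex_maxmin(table: Dict[str, Dict[str, Value]]) -> Tuple[str, Value]:
    """Pick the Max strategy whose worst-case vector is lexicographically largest.

    table[pi][tau] is the value vector when Max plays pi and Min plays tau.
    """
    def worst_case(pi: str) -> Value:
        # Python tuples compare lexicographically, so min() is the lexicographic inf.
        return min(table[pi].values())

    best = max(table, key=worst_case)  # lexicographic sup over Max strategies
    return best, worst_case(best)


# Hypothetical numbers, chosen only to show the tie-breaking behaviour.
table = {
    "bus":  {"calm": (0.5, 0.9),  "rush": (0.5, 0.2)},
    "walk": {"calm": (0.5, 0.6),  "rush": (0.5, 0.4)},
    "taxi": {"calm": (0.9, 0.1),  "rush": (0.25, 0.0)},
}

best, value = lex_maxmin(table)
print(best, value)  # walk (0.5, 0.4)
```

In this toy table, `bus` and `walk` tie on the first component of their worst case (0.5), so the second component decides in favour of `walk`; `taxi` loses on the first component alone.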

SLIDE 10

The main idea: Reduction to single objective

\[
\inf_{\tau} \big[\, v^{\pi^*,\tau}(\Diamond T),\; v^{\pi^*,\tau}(\Box S) \,\big]
= \sup_{\pi} \inf_{\tau} \big[\, v^{\pi,\tau}(\Diamond T),\; v^{\pi,\tau}(\Box S) \,\big]
\]

Assuming $T, S$ are sinks, the problem can be reduced to solving three simple (single-objective) stochastic games.

Proof sketch:

  • Solve $\Diamond T$, identify the value-0 states, remove sub-optimal actions.
  • Solve $\Box S$ → strategy $\pi_2$.
  • Weigh the value-0 states with their $\Box S$ value.
  • Solve (weighted) reachability to the value-0 states → strategy $\pi_1$.
  • Final strategy: follow $\pi_1$, then follow $\pi_2$ in the value-0 states.
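The first step of the proof sketch (solve the reachability objective, identify value-0 states, remove sub-optimal actions) can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the game encoding, the toy game itself, the helper names `reach_values` and `prune`, and the choice to prune at the states of both players (the slide does not say whose actions are removed). Value iteration with a fixed iteration count only approximates the values in general; the toy game below happens to converge exactly after two rounds.

```python
from typing import Dict, List, Set, Tuple

Dist = List[Tuple[float, str]]                 # [(probability, successor), ...]
Game = Dict[str, Tuple[str, Dict[str, Dist]]]  # state -> (owner, {action: Dist})


def expected(dist: Dist, v: Dict[str, float]) -> float:
    """Expected value of a distribution over successors under valuation v."""
    return sum(p * v[t] for p, t in dist)


def reach_values(game: Game, targets: Set[str], iters: int = 1000) -> Dict[str, float]:
    """Approximate reachability values by value iteration from below."""
    v = {s: 1.0 if s in targets else 0.0 for s in game}
    for _ in range(iters):
        new = {}
        for s, (owner, actions) in game.items():
            if s in targets:
                new[s] = 1.0
            else:
                vals = [expected(d, v) for d in actions.values()]
                new[s] = max(vals) if owner == "max" else min(vals)
        v = new
    return v


def prune(game: Game, v: Dict[str, float], eps: float = 1e-9) -> Game:
    """Keep only actions whose expected value matches the state's value.

    Assumption: actions are pruned at the states of both players.
    """
    return {
        s: (owner, {a: d for a, d in actions.items()
                    if abs(expected(d, v) - v[s]) <= eps})
        for s, (owner, actions) in game.items()
    }


# Hypothetical toy game; "conf", "jam" and "lost" are sinks.
game: Game = {
    "hotel":  ("max", {"bus":  [(0.5, "conf"), (0.5, "jam")],
                       "walk": [(1.0, "street")],
                       "taxi": [(0.25, "conf"), (0.75, "jam")]}),
    "street": ("min", {"delay":    [(0.5, "conf"), (0.5, "lost")],
                       "let_pass": [(1.0, "conf")]}),
    "conf":   ("max", {"stay": [(1.0, "conf")]}),
    "jam":    ("max", {"stay": [(1.0, "jam")]}),
    "lost":   ("max", {"stay": [(1.0, "lost")]}),
}

v = reach_values(game, {"conf"})
zero_states = {s for s, x in v.items() if x <= 1e-9}
restricted = prune(game, v)
```

In this toy game the hotel keeps `bus` and `walk` (both achieve 0.5) and loses `taxi`; the remaining steps of the proof sketch would then run on `restricted`, using `zero_states` as the weighted targets. Those later steps are not implemented here.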

SLIDE 13

Results | Conclusion

\[
\inf_{\tau} \big[\, \ldots,\; v^{\pi^*,\tau}(\star_i T_i),\; \ldots \,\big]
= \sup_{\pi} \inf_{\tau} \big[\, \ldots,\; v^{\pi,\tau}(\star_i T_i),\; \ldots \,\big],
\qquad \star_i \in \{\Diamond, \Box\}
\]

Theorem: The lex-game problem can be reduced to solving

  • $O(2^{\#obj})$ (general)
  • $O(\#obj)$ (absorbing $T_i$)

many simple (single-objective) stochastic games.

Consequences

  • a deterministic optimal strategy $\pi^*$ exists
  • memory is only needed to remember satisfied/violated objectives
  • the lex-game problem is in NP ∩ coNP for absorbing targets or constant $\#obj$
  • in practice, complexity is similar to solving all objectives separately

Conclusion: Lexicographic objectives are a useful add-on to single-objective solvers: they allow reasoning about a secondary/tertiary/… objective.

Thanks for your attention!