Parity Objectives in Countable MDPs



slide-1
SLIDE 1

Parity Objectives in Countable MDPs

Stefan Kiefer Richard Mayr Mahsa Shirmohammadi Dominik Wojtczak LICS 2017, Reykjavik 20 June 2017

Kiefer, Mayr, Shirmohammadi, Wojtczak Parity Objectives in Countable MDPs 1

slide-2
SLIDE 2

Countable MDPs

[Figure: an infinite-state MDP whose states carry colour 1 (bad/odd) or colour 2 (good/even) and are either controlled or random. Transition labels include 1/2, 1/2^i and 1 − 1/2^i. The legend shows a random state with outgoing probabilities 0.3 and 0.7.]

slide-3
SLIDE 3

Countable MDPs

[Figure: the same MDP with colours 1 and 2 and transition labels 1/2, 1/2^i and 1 − 1/2^i.]

There is no almost-surely winning strategy.

$\sup_{\sigma} \Pr_{\sigma}(\mathrm{Parity}) = 1$

All finite-memory strategies lose almost surely.

slide-4
SLIDE 4

{1, 2, 3}-Parity

[Figure: a variant of the previous MDP with colours 1, 2 and 3 and transition labels 1/2, 1/2^i and 1 − 1/2^i.]

There exists an almost-surely winning strategy.

All finite-memory strategies lose almost surely.

slide-5
SLIDE 5

Our Results in the Mostowski Hierarchy

Objectives in the hierarchy: Safety; Reach; {0, 1}-Parity; {1, 2}-Parity; {0, 1, 2}-Parity; {1, 2, 3}-Parity; {0, 1, 2, 3}-Parity; {1, 2, 3, 4}-Parity.

slide-11
SLIDE 11

Our Results in the Mostowski Hierarchy

[Figure: the Mostowski hierarchy (Safety, Reach, {0, 1}-Parity, {1, 2}-Parity, {0, 1, 2}-Parity, {1, 2, 3}-Parity, {0, 1, 2, 3}-Parity, {1, 2, 3, 4}-Parity) with regions marked "ε-optimal MD" and "optimal MD", shown next to the {1, 2, 3}-parity example MDP.]

ε-optimal MD means: $\sup_{\sigma} \Pr_{\sigma}(\mathrm{Parity}) = \sup_{\text{MD } \sigma} \Pr_{\sigma}(\mathrm{Parity})$

optimal MD means: if ∃ optimal σ, then ∃ optimal σ that is MD

Dichotomy between MD and infinite memory; contrast to finite MDPs

slide-12
SLIDE 12

Optimal MD-Strategies

Theorem. Consider a countable-state MDP with a {0, 1, 2}-parity objective. If there exists an optimal strategy, then there exists an optimal strategy that is MD.

"Optimal strategies for {0, 1, 2}-parity may be chosen MD."

slide-13
SLIDE 13

Optimal MD-Strategies for Co-Büchi

Theorem. Almost-surely winning strategies for co-Büchi may be chosen MD.

Suppose there is an almost-surely winning strategy σ.
Focus on states used by σ. They all have an a.s. winning strategy.
Set a more ambitious goal: Safety (= never see 1 again).

[Figure: an infinite sequence of states, some of colour 1, with transition probabilities 1/2, 1/2, 3/4, 1/4, 7/8, 1/8, ...]

Always playing for safety is too greedy.

slide-14
SLIDE 14

An Optimal MD-Strategy for Co-Büchi

$\max_{\sigma} \Pr_{\sigma}(\text{never see a state of colour 1 again})$

[Figure: the state space of the MDP; the value 1 appears as a label.]

slide-20
SLIDE 20

An Optimal MD-Strategy for Co-Büchi

$\max_{\sigma} \Pr_{\sigma}(\text{never see a state of colour 1 again})$

[Figure: the state space with a blue region and a dark blue region; the values 1, 2/3 and 1/3 appear as labels.]

  • 0. Playing the safest action everywhere is not ok.
  • 1. Fixing the safest action in the blue region is ok.
  • 2. Once we are in dark blue: with prob ≥ 1/2 we stay in blue.
  • 3. The a.s. winning strategy for 1. gets us in dark blue a.s.
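One way to see where the bound 1/2 in step 2 could come from is the short calculation below. It is only a sketch under assumptions that the slides do not state explicitly: that v(s) denotes the optimal safety value of state s, that the blue region is {s : v(s) ≥ 1/3} and the dark blue region is {s : v(s) ≥ 2/3} (a reading of the labels 1/3 and 2/3 in the figure), and that the safest action attains v(s) in every blue state. The symbols v, s and p are introduced here for illustration only.

```latex
% Assumptions (not from the slides): v(s) = \max_\sigma \Pr_{\sigma,s}(\text{Safety}),
% blue = \{ s : v(s) \ge 1/3 \}, dark blue = \{ s : v(s) \ge 2/3 \}.
% Start in a dark blue state s and let p be the probability of ever leaving blue
% when the safest action is played inside blue.
\begin{align*}
  \tfrac{2}{3} \;\le\; v(s)
    &\;\le\; (1 - p) \cdot 1 \;+\; p \cdot \tfrac{1}{3}
    && \text{(after leaving blue, safety holds with probability } < \tfrac{1}{3}\text{)} \\
  \Longrightarrow\quad \tfrac{2}{3}\, p &\;\le\; \tfrac{1}{3}
  \quad\Longrightarrow\quad p \;\le\; \tfrac{1}{2}.
\end{align*}
% Hence, under these assumptions, the run stays in blue with probability at least 1/2.
```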

slide-21
SLIDE 21

When MD Suffices For Finitely Branching MDPs

[Figure: the Mostowski hierarchy (Safety, Reach, {0, 1}-Parity, {1, 2}-Parity, {0, 1, 2}-Parity, {1, 2, 3}-Parity, {0, 1, 2, 3}-Parity, {1, 2, 3, 4}-Parity) with regions marked "ε-optimal MD" and "optimal MD" for finitely branching MDPs.]

slide-22
SLIDE 22

When MD Suffices For Infinitely Branching MDPs

[Figure: the Mostowski hierarchy (Safety, Reach, {0, 1}-Parity, {1, 2}-Parity, {0, 1, 2}-Parity, {1, 2, 3}-Parity, {0, 1, 2, 3}-Parity, {1, 2, 3, 4}-Parity) with regions marked "ε-optimal MD" and "optimal MD" for infinitely branching MDPs.]

slide-23
SLIDE 23

Context of the Paper

Our work: countable MDPs
Other work: mostly finite MDPs

Our work: maximizing the probability of Parity objectives
Other work: maximizing expected (discounted) total/average reward/cost

Our work: general countable MDPs
Other work: countable MDPs arising from specific models:

  • recursive MDPs
  • nondeterministic probabilistic lossy channel systems
  • VASS-induced MDPs
  • one-counter MDPs
  • controlled queueing systems
  • controlled multitype branching processes
  • · · ·

slide-24
SLIDE 24

Conditioning a Markov Chain

[Figure: a Markov chain with transition probabilities 1/3 and 1, together with two labels: "Pr(A) = Pr(A | Parity)" and "Pr(A | ¬Parity) = Pr(A)".]

slide-25
SLIDE 25

Conditioning a Markov Chain

[Figure: the original Markov chain (transition probabilities 1/3 and 1) next to two derived chains with transition probabilities 1/3, 1/6, 2/3, 1/2 and 1. One derived chain is labelled "Pr(A) = Pr(A | Parity)", the other "Pr(A | ¬Parity) = Pr(A)".]
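The idea of conditioning a Markov chain on an event can be sketched in code. The example below is an illustration, not the construction from the talk: it uses a small hypothetical finite chain (states `a`, `b`, `win`, `lose`) and conditions on reaching the absorbing state `win`, which stands in for the Parity event. The function names are invented for this sketch. It relies on the standard reweighting p'(s, t) = p(s, t) · h(t) / h(s), where h(s) is the probability of the event from state s.

```python
def reach_probabilities(P, target, iterations=200):
    """Approximate h(s) = Pr_s(eventually reach target) by fixed-point iteration from 0."""
    h = {s: 0.0 for s in P}
    for _ in range(iterations):
        h = {
            s: 1.0 if s == target else sum(p * h[t] for t, p in P[s].items())
            for s in P
        }
    return h


def condition_on(P, h):
    """Transition probabilities of the chain conditioned on the event measured by h.

    States and transitions from which the event has probability 0 are dropped.
    """
    return {
        s: {t: p * h[t] / h[s] for t, p in succ.items() if h[t] > 0}
        for s, succ in P.items()
        if h[s] > 0
    }


# Hypothetical chain (not the one on the slide): "win" and "lose" are absorbing.
P = {
    "a": {"b": 1 / 3, "win": 1 / 3, "lose": 1 / 3},
    "b": {"a": 1 / 3, "lose": 2 / 3},
    "win": {"win": 1.0},
    "lose": {"lose": 1.0},
}

h = reach_probabilities(P, "win")
conditioned = condition_on(P, h)
for state, successors in conditioned.items():
    print(state, successors)
```

In the conditioned chain every remaining state has outgoing probabilities that sum to 1 again, and `lose` disappears because the event is impossible from there.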

slide-27
SLIDE 27

Countable Markov Chains

Infinite Markov chains are very different from finite ones.

Gambler's ruin:

[Figure: an infinite chain of states (· · ·) with transition probabilities 1, 1/3 and 2/3.]

Dependence on exact probabilities
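The dependence on exact probabilities can be made concrete with the textbook closed form for gambler's ruin. The sketch below assumes a walk on {0, 1, 2, ...} that moves up with probability `p_up` and down with probability 1 − `p_up`; the function name and parameters are chosen for illustration and do not come from the slides. The transition graph is the same in both cases, yet swapping 1/3 and 2/3 changes whether ruin is certain.

```python
from fractions import Fraction


def ruin_probability(start: int, p_up: Fraction) -> Fraction:
    """Probability that the walk started at `start` ever reaches state 0.

    Standard closed form: 1 if p_up <= 1/2, else ((1 - p_up) / p_up) ** start.
    """
    p_down = 1 - p_up
    if p_up <= p_down:
        return Fraction(1)
    return (p_down / p_up) ** start


# Same graph, different numbers: ruin is certain with a downward drift ...
print(ruin_probability(1, Fraction(1, 3)))  # 1
# ... but not with an upward drift.
print(ruin_probability(1, Fraction(2, 3)))  # 1/2
print(ruin_probability(3, Fraction(2, 3)))  # 1/8
```

In a finite chain, whether an event has probability 1 depends only on which transitions have positive probability; the example shows that this fails once the state space is infinite.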