Approximation of continuous LMPs [1], Alexandre Bouchard-Côté


Approximation of continuous LMPs [1]. Alexandre Bouchard-Côté. Supervisors: Prakash Panangaden, Doina Precup. Reasoning and Learning Lab, McGill University. Sponsored by: NSERC, McGill School of Computer Science. [1] Labelled Markov Processes


SLIDE 1

Approximation of continuous LMPs[1]

[1] Labelled Markov Processes

Alexandre Bouchard-Côté

Supervisors: Prakash Panangaden, Doina Precup
Reasoning and Learning Lab, McGill University
Sponsored by: NSERC, McGill School of Computer Science

SLIDE 2

Motivation: example

Continuous system: state space, dynamics
Possible finite state approx.: “geometric”, better

SLIDE 3

The approx. scheme[2]

[Figure: levels 0 to 3 of the approximation; labels X and (X, 3)]

[2] J. Desharnais, V. Gupta, R. Jagadeesan, P. Panangaden. (2002). Approximating Labelled Markov Processes.

SLIDE 4

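m states in the preceding level: (B, k-1), ...

(X, k) → (B, k-1) with probability $\inf_{t \in X} \tau_a(t, B)$

Partition $P$ of the range of the probability kernels into h-intervals of length ε/m

Partition of the states: $\{\tau_a^{-1}(\cdot, C_i)(A_j) : A_j \in P\}$, for each state (C_i, k-1) of the preceding level and each interval A_j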
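The partition step above can be illustrated on a finite sample of states. This is only a minimal sketch under stated assumptions, not the scheme's actual implementation: `kernel_values`, `interval_index` and `partition_states` are hypothetical names, the kernel values τ_a(s, C_i) are assumed to be already available as numbers, and the intervals are assumed to be half-open of the form (a, b] with width ε/m.

```python
import math
from collections import defaultdict

def interval_index(p, width):
    """Index j of the half-open interval (j*width, (j+1)*width] containing p.

    Probability 0 is placed in the first interval by convention (an assumption).
    """
    if p <= 0.0:
        return 0
    return math.ceil(p / width) - 1

def partition_states(kernel_values, eps, m):
    """Group states whose kernel values fall in the same intervals.

    kernel_values: dict mapping state -> list of probabilities tau_a(s, C_i),
                   one entry per set C_i of the preceding level (hypothetical input).
    """
    width = eps / m
    blocks = defaultdict(list)
    for state, probs in kernel_values.items():
        signature = tuple(interval_index(p, width) for p in probs)
        blocks[signature].append(state)
    return dict(blocks)

# Three sampled states, two sets C_1 and C_2, intervals of width 0.5 / 2 = 0.25.
values = {"s0": [0.10, 0.60], "s1": [0.20, 0.70], "s2": [0.60, 0.10]}
blocks = partition_states(values, eps=0.5, m=2)
```

Here `s0` and `s1` end up in the same block because each of their kernel values lands in the same interval, while `s2` forms its own block.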

SLIDE 5

Implementation difficulties

Quantities to compute: $\inf_{t \in X} \tau_a(t, B)$ and $\{\tau_a^{-1}(\cdot, C_i)(A_j)\}$

Infimum of measurable functions
Generate partition (check if a set is empty)
Invert a measurable function

SLIDE 6

How to “invert” the kernels

Representation of $\tau_a^{-1}(\cdot, C)((a,b])$

Instance’s variables: $f_a$, $C$, $(a,b]$

Operations: check if $s_0 \in \tau_a^{-1}(\cdot, C)((a,b])$

Output true iff $\int_C f_a(s_0, x)\, d\mu(x) \in (a,b]$
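The membership test above can be sketched as a small class that stores the three instance variables and never builds the preimage explicitly. This is an illustration under assumptions, not the author's code: `KernelPreimage` and `shifted_uniform` are hypothetical names, C is assumed to be an interval of the real line, μ is assumed to be Lebesgue measure, and the integral is approximated with a plain midpoint rule.

```python
class KernelPreimage:
    """Lazy representation of the set {s : tau_a(s, C) in (a, b]}."""

    def __init__(self, density, c_low, c_high, a, b, steps=1000):
        self.density = density  # f_a(s, x), assumed density of tau_a(s, .)
        self.c_low = c_low      # C is assumed to be the interval [c_low, c_high]
        self.c_high = c_high
        self.a = a              # lower end of (a, b], excluded
        self.b = b              # upper end of (a, b], included
        self.steps = steps

    def mass(self, s0):
        """Midpoint-rule approximation of the integral of f_a(s0, x) over C."""
        h = (self.c_high - self.c_low) / self.steps
        total = 0.0
        for k in range(self.steps):
            x = self.c_low + (k + 0.5) * h
            total += self.density(s0, x)
        return total * h

    def contains(self, s0):
        """True iff the approximate mass of C from s0 lies in (a, b]."""
        return self.a < self.mass(s0) <= self.b

def shifted_uniform(s, x):
    # Toy density: from state s, the next state is uniform on [s, s + 1].
    return 1.0 if s <= x <= s + 1.0 else 0.0

preimage = KernelPreimage(shifted_uniform, 0.0, 0.5, a=0.4, b=0.6)
in_from_zero = preimage.contains(0.0)   # mass of [0, 0.5] from s0 = 0 is about 0.5
in_from_03 = preimage.contains(0.3)     # mass of [0, 0.5] from s0 = 0.3 is about 0.2
```

Because the test relies on numerical integration, states whose mass lies very close to a or b may be misclassified; the example keeps the masses well away from the interval ends.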

SLIDE 7

Infimum

[Figure: graph of g(x) contrasting inf g(x) with ess inf g(x), which ignores measure zero sets]
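The difference between the two infima shows up as soon as the infimum is estimated from random samples. The sketch below assumes such a sampling-based estimate (the names `sampled_infimum` and `g` and the uniform sampler are hypothetical): a dip of g on a single point, which has measure zero, is almost surely never sampled, so the estimate approaches ess inf g rather than inf g.

```python
import random

def sampled_infimum(g, sample_state, n_samples, seed=0):
    """Minimum of g over n_samples states drawn with sample_state(rng)."""
    rng = random.Random(seed)
    return min(g(sample_state(rng)) for _ in range(n_samples))

def g(x):
    # Equals x^2 + 1 on [0, 1], except for a dip at the single point x = 0.5.
    # Here inf g = -100, while ess inf g = 1.
    return -100.0 if x == 0.5 else x * x + 1.0

# Uniform samples on [0, 1): the dip at 0.5 is a measure zero set.
estimate = sampled_infimum(g, lambda rng: rng.random(), n_samples=10_000)
```

With this setup the estimate stays at or just above 1, the essential infimum, and does not see the value -100.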

SLIDE 8

Proof of correctness (sketch)

[Diagram: Q related through bisimulation (~), approximation, and sampling approximation]

SLIDE 9

ε-homogeneity

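$M = (S, A, R, P)$, $M_1 = (S_1, A, R_1, P_1)$

$M_1$ is ε-homogeneous w.r.t. $M$ if there exists a surjection $\Phi: S \to S_1$ such that, for all $s \in S$ and all $a \in A$,

$$\sum_{s' \in S_1} \Big| P_1(\Phi(s), s', a) - \sum_{t \in \Phi^{-1}(\{s'\})} P(s, t, a) \Big|^k \le \varepsilon^k$$

[Figure: states s, s' and Φ(s) under the map Φ]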
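For finite MDPs the condition above can be checked directly. The following is a minimal sketch, assuming transition probabilities are stored as nested dictionaries `P[s][a][t]` and `P1[s1][a][s1']`; this storage layout and the name `homogeneity_error` are illustrative choices, not part of the slides. The function returns the smallest ε for which the inequality holds at every state and action, so M_1 is ε-homogeneous w.r.t. M under Φ exactly when the returned value is at most ε.

```python
def homogeneity_error(P, P1, phi, actions, k=1):
    """Largest value, over all s and a, of
    (sum_{s' in S1} |P1(phi(s), s', a) - sum_{t in phi^-1({s'})} P(s, t, a)|^k)^(1/k).
    """
    states1 = set(phi.values())  # phi is assumed surjective onto S1
    worst = 0.0
    for s in P:
        for a in actions:
            total = 0.0
            for s_prime in states1:
                # Probability of jumping from s into the preimage of s_prime.
                lumped = sum(p for t, p in P[s][a].items() if phi[t] == s_prime)
                total += abs(P1[phi[s]][a].get(s_prime, 0.0) - lumped) ** k
            worst = max(worst, total ** (1.0 / k))
    return worst

# Toy MDP with one action: y and z behave identically and are merged into "YZ".
P = {
    "x": {"a": {"y": 0.5, "z": 0.5}},
    "y": {"a": {"x": 1.0}},
    "z": {"a": {"x": 1.0}},
}
phi = {"x": "X", "y": "YZ", "z": "YZ"}
P1_exact = {"X": {"a": {"YZ": 1.0}}, "YZ": {"a": {"X": 1.0}}}
P1_rough = {"X": {"a": {"YZ": 0.9, "X": 0.1}}, "YZ": {"a": {"X": 1.0}}}

exact_error = homogeneity_error(P, P1_exact, phi, ["a"])
rough_error = homogeneity_error(P, P1_rough, phi, ["a"])
```

The exact quotient has error 0, while the perturbed one has error about 0.2 in the k = 1 case, coming from the two transitions out of `X` that are each off by 0.1.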

SLIDE 10

Link between 0-homogeneity and bisimulation

Let $R \equiv 0$, and let $M_1 = (S_1, A, R, P_1)$ and $M = (S, A, R, P)$ be MDPs (and therefore LMPs). Then they are 0-homogeneous with mapping $\Phi$ iff $\{\Phi^{-1}(\{s'\}) : s' \in S_1\}$ is the set of equivalence classes of a bisimulation equivalence relation on $M$.
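On a finite MDP with zero rewards, the bisimulation side of this equivalence can be tested by checking that any two states in the same block give the same probability to every block, for every action. The sketch below assumes that finite-state characterization of bisimulation and the nested-dictionary layout `P[s][a][t]`; the name `is_bisimulation_partition` and the tolerance are hypothetical.

```python
def is_bisimulation_partition(P, phi, actions, tol=1e-12):
    """True iff states with the same image under phi agree on the
    probability of jumping into each block phi^-1({s'}), for every action."""
    blocks = set(phi.values())

    def block_mass(s, a, block):
        return sum(p for t, p in P[s][a].items() if phi[t] == block)

    representative = {}
    for s in P:
        # Compare each state with the first state seen in its block.
        rep = representative.setdefault(phi[s], s)
        for a in actions:
            for block in blocks:
                if abs(block_mass(s, a, block) - block_mass(rep, a, block)) > tol:
                    return False
    return True

P = {
    "x": {"a": {"y": 0.5, "z": 0.5}},
    "y": {"a": {"x": 1.0}},
    "z": {"a": {"x": 1.0}},
}
good_phi = {"x": "X", "y": "YZ", "z": "YZ"}  # merges the two equivalent states
bad_phi = {"x": "A", "y": "A", "z": "B"}     # merges x and y, which differ

good = is_bisimulation_partition(P, good_phi, ["a"])
bad = is_bisimulation_partition(P, bad_phi, ["a"])
```

The first mapping passes because `y` and `z` both move to `x` with probability 1; the second fails because `x` and `y` assign different probabilities to block `A`.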

SLIDE 11

Proof idea

Enough to show: if $s_1, s_2$ are s.t. $\Phi(s_1) = \Phi(s_2) = s$, then they satisfy the same formulas in $L_0$.

Structural induction on $L_0$. As usual, the “hard” step is $\langle a \rangle_q \varphi$.

By induction hypothesis, $[[\varphi]]$ has the form: $[[\varphi]] = \bigcup_i \Phi^{-1}(\{s'_i\})$

SLIDE 12

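For each of these $s'_i$, we have, by 0-homogeneity:

$\sum_{t \in \Phi^{-1}(\{s'_i\})} P(s_j, t, a) = P_1(\Phi(s_j), s'_i, a)$ for $j = 1, 2$

∴ $P(s_j, [[\varphi]], a) = \sum_i \sum_{t \in \Phi^{-1}(\{s'_i\})} P(s_j, t, a) = \sum_i P_1(s, s'_i, a)$ for $j = 1, 2$

∴ $P(s_1, [[\varphi]], a) = P(s_2, [[\varphi]], a)$

∴ $s_1 \models \langle a \rangle_q \varphi \iff s_2 \models \langle a \rangle_q \varphi$

[Figure: Φ maps S onto S_1, with $[[\varphi]] = \bigcup_i \Phi^{-1}(\{s'_i\})$]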