SLIDE 1 Approximation of continuous LMPs[1]
[1] Labelled Markov Processes
Alexandre Bouchard-Côté
Supervisors: Prakash Panangaden, Doina Precup
Reasoning and Learning Lab, McGill University
Sponsored by: NSERC, McGill School of Computer Science
SLIDE 2 Motivation: example
[Figure: a continuous system (state space, dynamics) beside possible finite-state approximations, labelled “geometric” and “better”.]
SLIDE 3 The approx. scheme[2]
[Figure: approximation levels 0 to 3; node labels X and (X, 3).]
[2] J. Desharnais, V. Gupta, R. Jagadeesan, P. Panangaden. (2002). Approximating Labelled Markov Processes.
SLIDE 4
m states in the preceding level
inf_{t ∈ X} τa(t, B)
(X, k) (B, k-1)
Partition of the range of the probability kernels into h-intervals of length ε/m
Partition of the states {τa
(Ci, k-1) Aj
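The refinement step on this slide can be sketched roughly as follows. This is only an illustration under stated assumptions: it works on a finite sample of states rather than the continuous space, it groups states whose kernel values fall in the same interval of length ε/m for every label and every set of the preceding level (one reading of the slide), and the names `refine`, `interval_index` and `toy_tau` are hypothetical.

```python
from collections import defaultdict
from math import ceil


def interval_index(p, width):
    # Index j with p in ((j-1)*width, j*width]; index 0 is reserved for p == 0.
    return 0 if p == 0 else ceil(p / width)


def refine(states, tau, labels, blocks, eps):
    # m = number of sets at the preceding level, so each interval has length eps/m.
    width = eps / len(blocks)
    groups = defaultdict(list)
    for t in states:
        # Record which interval tau_a(t, C) hits for every label a and set C.
        signature = tuple(interval_index(tau(a, t, C), width)
                          for a in labels for C in blocks)
        groups[signature].append(t)
    return list(groups.values())


# Hypothetical toy kernel: two sets at the preceding level, one label.
LOW, HIGH = "low", "high"


def toy_tau(a, t, C):
    return t if C == LOW else 1.0 - t


cells = refine([0.05, 0.15, 0.55, 0.95], toy_tau, ["a"], [LOW, HIGH], eps=0.5)
```

Kernel values that sit exactly on an interval boundary are sensitive to floating-point rounding in this sketch; exact rationals would avoid that.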
SLIDE 5 Implementation difficulties
inf_{t ∈ X} τa(t, B) {τa
Infimum of measurable functions
Generate partition (check if a set is empty)
Invert a measurable function
SLIDE 6 How to “invert” the kernels
Representation of τa
Operations: Check if s0 ∈ τa
Output: true iff ∫_C fa(s0, x) dμ(x) ∈ (a, b]
Instance variables: fa, C, (a, b]
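The membership test above can be sketched as a small class. This is a minimal illustration, not the implementation used in the talk: it assumes C is an interval [c_lo, c_hi] of the real line, μ is Lebesgue measure, and the integral is approximated with a midpoint rule. The class name `KernelPreimage`, its fields, and `toy_density` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class KernelPreimage:
    """Symbolic set of states s whose kernel value tau_a(s, C) lies in (a, b]."""
    f_a: Callable[[float, float], float]  # density f_a(s, x) of tau_a(s, .) w.r.t. mu
    c_lo: float  # C is assumed to be the interval [c_lo, c_hi]
    c_hi: float
    a: float     # half-open target interval (a, b]
    b: float

    def kernel_value(self, s0: float, n: int = 10_000) -> float:
        # Midpoint-rule approximation of the integral of f_a(s0, x) over C.
        h = (self.c_hi - self.c_lo) / n
        return h * sum(self.f_a(s0, self.c_lo + (i + 0.5) * h) for i in range(n))

    def contains(self, s0: float) -> bool:
        # True iff the (approximate) integral falls in (a, b].
        return self.a < self.kernel_value(s0) <= self.b


# Toy density: from state s, mass s is spread uniformly over [0, 1].
def toy_density(s: float, x: float) -> float:
    return s if 0.0 <= x <= 1.0 else 0.0


pre = KernelPreimage(toy_density, c_lo=0.0, c_hi=0.5, a=0.25, b=0.5)
```

Here `pre.contains(0.8)` asks whether the integral of the toy density over [0, 0.5], which is about 0.4, lies in (0.25, 0.5]. The set is never enumerated; only membership queries are answered, which matches the "check if s0 ∈ ..." operation on the slide.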
SLIDE 7
Infimum
[Figure: graph of g(x) marking inf g(x), ess inf g(x) and the measure zero sets.]
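For reference, the standard definition of the essential infimum, written with respect to the measure μ used on the earlier slides (the choice of μ as the reference measure here is an assumption):

```latex
\operatorname*{ess\,inf}_{x \in X} g(x)
  = \sup \bigl\{ c \in \mathbb{R} : \mu(\{ x \in X : g(x) < c \}) = 0 \bigr\}
  \;\ge\; \inf_{x \in X} g(x)
```

The two quantities can differ only because of values that g takes on sets of μ-measure zero.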
SLIDE 8 Proof of correctness (sketch)
[Diagram relating Q, bisimulation (~), the approximation, and the sampling approximation.]
SLIDE 9
ε-homogeneity
M = (S, A, R, P), M1 = (S1, A, R1, P1)
M1 is ε-homogeneous w.r.t. M if ∃ Φ: S → S1 surjective s.t. ∀s ∈ S, ∀a ∈ A:
Σ_{s′ ∈ S1} | P1(Φ(s), s′, a) − Σ_{t ∈ Φ⁻¹({s′})} P(s, t, a) |^k ≤ ε^k
[Figure: Φ maps s ∈ S to Φ(s) ∈ S1; s′ ∈ S1.]
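For finite MDPs the condition above can be checked directly. The sketch below is an illustration under assumptions: transition probabilities are stored in dictionaries keyed by (state, next state, action), the trailing k on the slide is read as an exponent (so the test is a k-norm bound), and the function name `homogeneity_gap` is hypothetical. For ε = 0 the reading of k does not matter.

```python
def homogeneity_gap(P, P1, phi, actions, k=1):
    """Max over (s, a) of (sum_{s'} |P1(phi(s), s', a) - sum_{t in phi^-1(s')} P(s, t, a)|^k)^(1/k)."""
    S1 = set(phi.values())  # phi is surjective onto its image by construction
    worst = 0.0
    for s in phi:
        for a in actions:
            total = 0.0
            for s1 in S1:
                # Probability that M moves from s into the block phi^-1({s1}).
                lumped = sum(P.get((s, t, a), 0.0) for t in phi if phi[t] == s1)
                total += abs(P1.get((phi[s], s1, a), 0.0) - lumped) ** k
            worst = max(worst, total ** (1.0 / k))
    return worst


# Toy example: states 1 and 2 of M are merged into state "v" of M1.
phi = {0: "u", 1: "v", 2: "v"}
P = {(0, 1, "a"): 0.5, (0, 2, "a"): 0.5, (1, 0, "a"): 1.0, (2, 0, "a"): 1.0}
P1 = {("u", "v", "a"): 1.0, ("v", "u", "a"): 1.0}
exact_gap = homogeneity_gap(P, P1, phi, ["a"])

# Perturb state 2 so that it no longer behaves like state 1.
P_perturbed = dict(P)
P_perturbed[(2, 0, "a")] = 0.75
P_perturbed[(2, 2, "a")] = 0.25
perturbed_gap = homogeneity_gap(P_perturbed, P1, phi, ["a"])
```

Under this reading, M1 is ε-homogeneous w.r.t. M with mapping Φ exactly when the returned gap is at most ε; a gap of 0 corresponds to the 0-homogeneous case.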
SLIDE 10
Link between 0-homogeneity and bisimulation
Let R ≡ 0 and let M1 = (S1, A, R, P1), M = (S, A, R, P) be MDPs (and therefore LMPs). Then M1 is 0-homogeneous w.r.t. M with mapping Φ iff {Φ⁻¹({s′}) : s′ ∈ S1} is the set of equivalence classes of a bisimulation relation on M.
SLIDE 11
Proof idea
Enough: if s1, s2 are s.t. Φ(s1) = Φ(s2) = s, then they satisfy the same formulas in L0.
Structural induction on L0. As usual, the “hard” step is ⟨a⟩_q φ.
By induction hypothesis, [[φ]] has the form: [[φ]] = ∪_i Φ⁻¹({s′_i})
SLIDE 12
For each of these s′_i, we have, by 0-homogeneity:
Σ_{t ∈ Φ⁻¹({s′_i})} P(sj, t, a) = P1(Φ(sj), s′_i, a) for j = 1, 2
∴ P(sj, [[φ]], a) = Σ_i Σ_{t ∈ Φ⁻¹({s′_i})} P(sj, t, a) = Σ_i P1(s, s′_i, a)
∴ P(s1, [[φ]], a) = P(s2, [[φ]], a)
∴ s1 ⊨ ⟨a⟩_q φ ⇔ s2 ⊨ ⟨a⟩_q φ
[Figure: Φ maps S onto S1; [[φ]] = ∪_i Φ⁻¹({s′_i}).]