On the Total Variation Distance of SMCs
Giorgio Bacci, Giovanni Bacci, Kim G. Larsen, Radu Mardare Aalborg University, Denmark
14 April 2015 - London, UK
FoSSaCS ‘15
tests over execution runs (no internal access!)
artificial intelligence, security, etc.
[Figure: a semi-Markov chain (SMC) with states s0, s1, s2, s3, s4; transition probabilities 1/3, 1/3, 1/3, 1/3, 2/3, 1, 1, 1; residence-time distributions Exp(3), Exp(3), N(2,3), U(2), U(2); state labels p,r / q / p,r / q,r / q,r]
Given an initial state, SMCs can be interpreted as “machines” that emit timed traces of states with a certain probability
π: s0 --t0--> s1 ... sn-1 --tn-1--> sn (ti is the residence-time in si)
Cylinder set (or cone): π ∈ 𝕯(S0,R0, ... ,Rn-1,Sn) (si ∈ Si, ti ∈ Ri and Ri Borel set)
P[s](𝕯(S0,R0, ... ,Rn-1,Sn)) = “probability that, starting from s, the SMC emits a timed path with prefix in S0×R0× ... ×Rn-1×Sn”
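The probability of a cylinder can be unfolded one step at a time. The sketch below is only an illustration under stated assumptions: the three-state SMC (`tau`, `rate`) is a hypothetical toy model, not the one drawn on the slide, residence times are assumed exponential and independent of the next state, and each Ri is given as a half-open interval.

```python
import math

# Hypothetical toy SMC (assumption, not the slide's model):
# tau[s] is the next-state distribution, rate[s] the Exp rate of the residence time.
tau = {
    "a": {"b": 0.5, "c": 0.5},
    "b": {"b": 1.0},
    "c": {"c": 1.0},
}
rate = {"a": 3.0, "b": 1.0, "c": 2.0}


def residence_prob(state, interval):
    """Probability that the Exp(rate[state]) residence time lies in [lo, hi)."""
    lo, hi = interval
    return math.exp(-rate[state] * lo) - math.exp(-rate[state] * hi)


def cylinder_prob(state, state_sets, intervals):
    """P[state] of the cylinder (S0, R0, ..., R_{n-1}, Sn).

    state_sets has n+1 entries (sets of states), intervals has n entries.
    """
    if state not in state_sets[0]:
        return 0.0
    if not intervals:
        # Cylinder of length 0: only the initial state is constrained.
        return 1.0
    stay = residence_prob(state, intervals[0])
    move = sum(
        p * cylinder_prob(nxt, state_sets[1:], intervals[1:])
        for nxt, p in tau[state].items()
    )
    return stay * move


# Start in "a", stay for any amount of time, then be in "b".
p_any_time = cylinder_prob("a", [{"a"}, {"b"}], [(0.0, math.inf)])
# Same, but the residence time in "a" must fall in [0, 1).
p_within_one = cylinder_prob("a", [{"a"}, {"b"}], [(0.0, 1.0)])
```

With Ri = [0, ∞) the residence-time factor is 1, so only the transition probabilities remain.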
P[s0](𝕯(L0,R0, ... ,Rn-1,Ln)) = P[s1](𝕯(L0,R0, ... ,Rn-1,Ln)) for all Trace Cylinders 𝕯(L0,R0, ... ,Rn-1,Ln) (up to label equiv.)
[Figure: the same SMC with two transition probabilities perturbed to 1/3+ε and 2/3-ε]
P[s0](𝕯(L0,ℝ,L1)) = 1/3+ε ≠ 1/3 = P[s1](𝕯(L0,ℝ,L1)), where L0 is the set of states labelled p,r and L1 the set of states labelled q
d(s,s’) = sup_{E ∈ σ(𝓤)} |P[s](E) - P[s’](E)|
σ(𝓤): σ-algebra generated by Trace Cylinders
It’s a Behavioral Distance! d(s,s’) = 0 iff s ≈_T s’
[Figure: a Markov chain with states s0 (p), s1 (p), s2 (q), s3 (q), s4 (r) and transition probabilities taken from 1/4, 1/2 and 1] (from Chen-Kiefer LICS’14)
d(s0,s1) = √2 / 4
irrational number
maximizing event is not ω-regular!
Total Variation Distance (a.k.a. supremum norm)
Given μ,ν: Σ → ℝ+ measures on (X,Σ)
||μ - ν|| = sup_{E ∈ Σ} |μ(E) - ν(E)|
The largest possible difference that μ and ν assign to the same event
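On a finite set the supremum can be computed directly. The sketch below assumes μ and ν are probability distributions given as dictionaries (the distributions `mu` and `nu` are made-up examples); in that case the supremum is reached on the event {x : μ(x) > ν(x)}. A brute-force enumeration of all events is included only as a cross-check.

```python
from fractions import Fraction
from itertools import chain, combinations


def total_variation(mu, nu):
    """sup_E |mu(E) - nu(E)| for probability distributions on a finite set.

    The supremum is reached on E = {x : mu(x) > nu(x)}, so it equals
    the sum of the positive differences.
    """
    support = set(mu) | set(nu)
    return sum(max(mu.get(x, 0) - nu.get(x, 0), 0) for x in support)


def total_variation_brute_force(mu, nu):
    """Same quantity by enumerating every event (exponential; for checking only)."""
    support = sorted(set(mu) | set(nu))
    events = chain.from_iterable(
        combinations(support, k) for k in range(len(support) + 1)
    )
    return max(abs(sum(mu.get(x, 0) - nu.get(x, 0) for x in e)) for e in events)


# Made-up example distributions (assumption, not from the slides).
mu = {"a": Fraction(1, 3), "b": Fraction(2, 3)}
nu = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)}

tv = total_variation(mu, nu)
tv_check = total_variation_brute_force(mu, nu)
```

Here the maximizing event is {b}, where μ exceeds ν by 2/3 - 1/4 = 5/12.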
Application: Probabilistic Model Checking
M0, M1: two models at distance ε
P[M0]({s⊨φ}), P[M1]({s⊨φ}): probability of satisfying φ (a value in [0,1])
|P[M0]({s⊨φ}) - P[M1]({s⊨φ})| ≤ ε for all formulas!
distance bounds the abs. error
(i.e., does it provide a good approximation error?)
SMC ⊨ Linear Real-time Spec.
i.e., measuring the likelihood that a property is satisfied by the probabilistic model
represented as Metric Temporal Logic formulas ... or languages recognized by Timed Automata
φ ≔ p | ⊥ | φ→φ | X^I φ | φ U^I φ (Alur-Henzinger)
X^I: Next, U^I: Until
(*) I ⊆ ℝ closed interval with rational endpoints
π ⊨ φ U^I ψ: φ holds along π until ψ holds, with t0 + ... + ti-1 ∈ I (ψ within time t ∈ I)
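The Until condition can be checked on a finite prefix of a timed path. This is a minimal sketch under assumptions: φ and ψ are plain predicates on states rather than nested formulas, the witness index i is taken to be at least 1 (as the sum t0 + ... + ti-1 suggests), and the example path and labelling are hypothetical. A result of `None` only means the prefix contains no witness.

```python
from fractions import Fraction


def until_within(states, times, phi, psi, interval):
    """Look for a witness of phi U^I psi on a finite timed-path prefix.

    states = [s0, ..., sn], times = [t0, ..., t_{n-1}] (residence times),
    phi and psi are predicates on states, interval = (lo, hi) is the closed I.
    Returns the witnessing index i, or None if the prefix contains none.
    """
    lo, hi = interval
    elapsed = 0
    for i in range(1, len(states)):
        if not phi(states[i - 1]):
            return None  # phi failed before psi was reached in time
        elapsed += times[i - 1]  # elapsed = t0 + ... + t_{i-1}
        if lo <= elapsed <= hi and psi(states[i]):
            return i
    return None


# Hypothetical labelled path prefix (assumption, not taken from the slides).
labels = {"s0": {"p", "r"}, "s1": {"p", "r"}, "s2": {"q"}}
path_states = ["s0", "s1", "s2"]
path_times = [Fraction(1, 2), Fraction(3, 4)]


def has_p(s):
    return "p" in labels[s]


def has_q(s):
    return "q" in labels[s]


witness = until_within(path_states, path_times, has_p, has_q, (1, 2))
no_witness = until_within(path_states, path_times, has_p, has_q, (0, 1))
```

In the first call q is reached after 1/2 + 3/4 = 5/4 time units, which lies in [1,2]; in the second call the same instant falls outside [0,1].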
MTL(s,s’) = sup_{φ ∈ MTL} |P[s]({π⊨φ}) - P[s’]({π⊨φ})| (max error w.r.t. MTL properties)
{π⊨φ}: set of timed paths that satisfy φ, measurable in σ(𝓤)
Relation with Trace Distance
MTL(s,s’) ≤ d(s,s’) = sup_{E ∈ σ(𝓤)} |P[s](E) - P[s’](E)|
In fact equality holds: MTL(s,s’) = d(s,s’)
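[Figure: a timed automaton without invariants (Alur-Dill) with locations ℓ0, ℓ1, ℓ2 and Clocks = {x,y}; edges annotated with label, guard and reset set: "p,r, x≤1/2", "q, y≤1/2", "p,r, x<3, {y}", "p,r, x≥5, {x}", "q, x≥1/4, {x}"]
Clock Guards: g ≔ x ⋈ q | g ∧ g for ⋈ ∈ {<,≤,>,≥}, q∈ℚ
Example run: (ℓ0, x=0 y=0) --p,r , 2--> (ℓ2, x=2 y=0) --q , 1/2--> (ℓ1, x=2.5 y=0.5) --q , 1/2--> ... accepted!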
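The clock valuations of the example run can be replayed mechanically. The sketch below is an assumption-laden illustration: guards are lists of `(clock, op, rational)` atoms, `step` is a hypothetical helper, and the pairing of each step with the guard "x<3, {y}" and then "y≤1/2" is a reading of the figure, not something the slide states.

```python
from fractions import Fraction as F
import operator

OPS = {"<": operator.lt, "<=": operator.le, ">": operator.gt, ">=": operator.ge}


def holds(guard, clocks):
    """A guard is a conjunction, given as a list of (clock, op, rational) atoms."""
    return all(OPS[op](clocks[c], q) for c, op, q in guard)


def step(clocks, delay, guard, resets):
    """Let `delay` time units pass, check the guard, then reset the given clocks.

    Returns the new clock valuation, or None if the guard blocks the edge.
    """
    aged = {c: v + delay for c, v in clocks.items()}
    if not holds(guard, aged):
        return None
    return {c: (F(0) if c in resets else v) for c, v in aged.items()}


v0 = {"x": F(0), "y": F(0)}
# Wait 2 time units, take an edge guarded by x < 3 that resets y.
v1 = step(v0, F(2), [("x", "<", F(3))], {"y"})
# Wait 1/2 time unit, take an edge guarded by y <= 1/2 with no reset.
v2 = step(v1, F(1, 2), [("y", "<=", F(1, 2))], set())
# Waiting 4 time units instead would violate x < 3.
blocked = step(v0, F(4), [("x", "<", F(3))], {"y"})
```

The resulting valuations match the ones shown in the run: x=2, y=0 and then x=2.5, y=0.5.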
TA(s,s’) = sup_{𝓑 ∈ TA} |P[s]({π∈L(𝓑)}) - P[s’]({π∈L(𝓑)})| (max error w.r.t. timed regular properties)
{π∈L(𝓑)}: set of timed paths accepted by 𝓑, measurable in σ(𝓤)
Relation with Trace Distance
TA(s,s’) ≤ d(s,s’) = sup_{E ∈ σ(𝓤)} |P[s](E) - P[s’](E)|
In fact equality holds: TA(s,s’) = d(s,s’)
Representation Theorem
For μ,ν: Σ → ℝ+ finite measures on (X,Σ) and F⊆Σ field such that σ(F)=Σ
||μ - ν|| = sup_{E ∈ F} |μ(E) - ν(E)|
F is much simpler than Σ, nevertheless it suffices to attain the supremum!
MTL(s,s’) = MTL^{¬U}(s,s’)
MTL^{¬U}: max error w.r.t. φ∈MTL without Until
TA(s,s’) = DTA(s,s’) = 1-DTA(s,s’) = 1-RDTA(s,s’)
DTA: max error w.r.t. Deterministic TAs
1-DTA: max error w.r.t. single-clock DTAs
1-RDTA: max error w.r.t. Resetting 1-DTAs
generalizes Chen-Kiefer LICS’14 with timed events
NP-hardness [Lyngsø-Pedersen JCSS’02] (easy to adapt to MCs...)
Approximating the trace distance up to any ε>0 whose size is polynomial in the size of the Interval MC is NP-hard.
Decidability still an open problem!
[Figure: the trace distance d(s,s’) on a line, approached from below by lower approximants l0, l1, ..., lk and from above by upper approximants u0, u1, ..., uk, with lk and uk within ε of each other]
The same scheme for the total variation distance ||μ - ν|| (general version)
recall that... Representation Theorem
||μ - ν|| = sup_{E ∈ F} |μ(E) - ν(E)|, F field that generates Σ
We need F0 ⊆ F1 ⊆ F2 ⊆ ... such that ⋃i Fi = F
li = sup_{E ∈ Fi} |μ(E) - ν(E)|
so that ∀i≥0, li ≤ li+1 (increasing) & supi li = ||μ - ν|| (limiting)
...seen before
Provide F0 ⊆ F1 ⊆ F2 ⊆ ... such that ⋃i Fi is a field that generates σ(𝓤)
Take Fi to be the collection of finite unions of cylinders 𝕯(L0,R0, ... ,Ri-1,Li) ∈ 𝓤
where Rj ∈ {[n/2^i, (n+1)/2^i) | 0≤n≤i2^i} ⋃ {[i,∞)}
[Figure: ℝ+ cut at 0, 1, 2, 3, 4, ..., i-2, i-1, i, each unit interval repartitioned in 2^i [closed-open) intervals, followed by [i,∞)]
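The increasing lower approximants can be observed on a one-step simplification. The sketch below is not the procedure for SMCs: it assumes μ = Exp(3) and ν = Exp(2) on ℝ+ (an arbitrary choice), uses only the interval partitions above with n < i·2^i so that the atoms are disjoint, and relies on the fact that for probability measures the supremum over a finite field is the sum of the positive differences over its atoms. Values are floating-point approximations.

```python
import math


def exp_cdf(rate, t):
    """CDF of Exp(rate) at t (t may be math.inf)."""
    return 1.0 - math.exp(-rate * t)


def partition(i):
    """Cut points 0, 1/2^i, ..., i, followed by infinity.

    Atoms are [n/2^i, (n+1)/2^i) for 0 <= n < i*2^i, plus [i, inf); the strict
    bound on n is chosen here so that the atoms are pairwise disjoint.
    """
    return [n / 2**i for n in range(i * 2**i + 1)] + [math.inf]


def lower_approximant(i, rate_mu=3.0, rate_nu=2.0):
    """l_i = sup over unions of level-i atoms of |mu(E) - nu(E)|."""
    cuts = partition(i)
    total = 0.0
    for lo, hi in zip(cuts, cuts[1:]):
        mu_atom = exp_cdf(rate_mu, hi) - exp_cdf(rate_mu, lo)
        nu_atom = exp_cdf(rate_nu, hi) - exp_cdf(rate_nu, lo)
        total += max(mu_atom - nu_atom, 0.0)
    return total


approximants = [lower_approximant(i) for i in range(9)]
```

For these two rates the densities cross at t = ln(3/2), which gives the exact distance 4/27; the list `approximants` starts at 0 and climbs towards that value as the partition is refined.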
it is known that... Coupling Characterization
||μ - ν|| = min {w(≄) | w∈Ω(μ,ν)}
Coupling as a transportation schedule...
[Figure: μ over s0, s1, s2, s3, s4 and ν over t0, t1, t2, t3, t4; w(s0,t2) is the mass transported from s0 to t2]
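For finite distributions a minimizing coupling can be built by hand. The sketch below assumes μ and ν are probability distributions on the same finite set and that ≄ is plain inequality of elements (a simplification of the relation used on the slide); `maximal_coupling` is a hypothetical helper and the two distributions are made-up examples.

```python
from fractions import Fraction


def maximal_coupling(mu, nu):
    """Return a coupling w of mu and nu (dict over pairs) minimising w(x != y)."""
    support = sorted(set(mu) | set(nu))
    w = {}
    # Keep as much mass as possible on the diagonal.
    for x in support:
        common = min(mu.get(x, 0), nu.get(x, 0))
        if common:
            w[(x, x)] = common
    # Ship the remaining surplus of mu to the remaining deficit of nu.
    surplus = {x: mu.get(x, 0) - nu.get(x, 0)
               for x in support if mu.get(x, 0) > nu.get(x, 0)}
    deficit = {y: nu.get(y, 0) - mu.get(y, 0)
               for y in support if nu.get(y, 0) > mu.get(y, 0)}
    for x in surplus:
        for y in deficit:
            moved = min(surplus[x], deficit[y])
            if moved:
                w[(x, y)] = moved
                surplus[x] -= moved
                deficit[y] -= moved
    return w


# Made-up example distributions (assumption, not from the slides).
mu = {"a": Fraction(1, 3), "b": Fraction(2, 3)}
nu = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)}

w = maximal_coupling(mu, nu)
off_diagonal = sum(p for (x, y), p in w.items() if x != y)
left_marginal = {x: sum(p for (a, _), p in w.items() if a == x) for x in mu}
right_marginal = {y: sum(p for (_, b), p in w.items() if b == y) for y in nu}
```

The mass placed off the diagonal is 5/12, which is exactly the total variation distance between these two distributions, and both marginals are preserved.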
it is known that... Coupling Characterization
||μ - ν|| = min {w(≄) | w∈Ω(μ,ν)}
We need Ω0 ⊆ Ω1 ⊆ Ω2 ⊆ ... such that ⋃i Ωi dense in Ω(μ,ν) w.r.t. total variation
ui = inf {w(≄) | w∈Ωi}
so that ∀i≥0, ui ≥ ui+1 (decreasing) & infi ui = ||μ - ν|| (limiting)
...seen before
Provide Ω0 ⊆ Ω1 ⊆ Ω2 ⊆ ... such that ⋃i Ωi is dense in Ω(P[s],P[s’])
coupling structure: 𝓓: S×S → Δ(S^k × S^k) such that 𝓓(s,s’)∈Ω(P[s]^k,P[s’]^k)
Stochastic process generating pairs of timed paths divided in multisteps of length k
Take Ωi = {P𝓓[s,s’]∈Ω(P[s],P[s’]) | 𝓓 of rank 2^i} where P𝓓[s,s’] is the probability generated by 𝓓
residence-time distributions are computable on [q,q’) with q,q’∈ℚ+
distributions is computable
Not that strong! Exp(λ), N(a,b), U(a,b), ...
For any ε>0, the approximation procedure for the trace distance is decidable.
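For the listed families, the probability of an interval [q,q’) is a difference of CDF values. The sketch below is a floating-point illustration rather than an exact computation, and it makes assumptions the slides do not state: N(a,b) is read as mean a and standard deviation b, U(a,b) as uniform on [a,b], and `interval_prob` is a hypothetical helper.

```python
import math
from fractions import Fraction


def exp_cdf(lam, t):
    """CDF of Exp(lam)."""
    return 1.0 - math.exp(-lam * t) if t > 0 else 0.0


def normal_cdf(mean, std, t):
    """CDF of a normal distribution (second parameter read as std deviation)."""
    return 0.5 * (1.0 + math.erf((t - mean) / (std * math.sqrt(2.0))))


def uniform_cdf(a, b, t):
    """CDF of the uniform distribution on [a, b]."""
    return min(max((t - a) / (b - a), 0.0), 1.0)


def interval_prob(cdf, q, q_prime):
    """Probability of the half-open interval [q, q') with rational endpoints."""
    return cdf(float(q_prime)) - cdf(float(q))


p_exp = interval_prob(lambda t: exp_cdf(3.0, t), Fraction(0), Fraction(1, 3))
p_uniform = interval_prob(lambda t: uniform_cdf(0.0, 2.0, t), Fraction(1, 2), Fraction(1))
p_normal = interval_prob(lambda t: normal_cdf(2.0, 3.0, t), Fraction(-1), Fraction(5))
```

The three values correspond to 1 - e^(-1) for Exp(3) on [0,1/3), 1/4 for the uniform distribution on [0,2] restricted to [1/2,1), and the one-standard-deviation mass of the normal distribution.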
Variation distance: