 
              Advanced Algorithms (XIII) Shanghai Jiao Tong University Chihao Zhang June 1, 2020
Total Variation Distance

Let $\mu$ and $\nu$ be two distributions on $\Omega$. Their total variation distance is

\[
d_{\mathrm{TV}}(\mu,\nu) = \frac{1}{2}\sum_{x\in\Omega} \lvert \mu(x)-\nu(x) \rvert = \max_{A\subseteq\Omega} \bigl(\mu(A)-\nu(A)\bigr).
\]

That is, $d_{\mathrm{TV}}$ is the $\ell_1$-distance scaled by $\frac{1}{2}$.

[Figure: $\mu$, $\nu$, and the set $A$.]
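The two expressions for the total variation distance can be checked against each other on a small example. The following is a minimal sketch, assuming distributions are stored as Python dicts from states to exact fractions; the helper names `tv_half_l1` and `tv_max_event` and the sample values are illustrative choices, not part of the lecture.

```python
from fractions import Fraction
from itertools import combinations

def tv_half_l1(mu, nu):
    """Half the l1-distance between two distributions given as dicts."""
    support = set(mu) | set(nu)
    return sum(abs(mu.get(x, 0) - nu.get(x, 0)) for x in support) / 2

def tv_max_event(mu, nu):
    """Maximum of mu(A) - nu(A) over all events A (brute force, small Omega only)."""
    support = sorted(set(mu) | set(nu))
    best = 0  # the empty event gives mu(A) - nu(A) = 0
    for r in range(1, len(support) + 1):
        for A in combinations(support, r):
            diff = sum(mu.get(x, 0) - nu.get(x, 0) for x in A)
            best = max(best, diff)
    return best

# Illustrative values (assumed for this sketch).
mu = {1: Fraction(1, 2), 2: Fraction(1, 2)}
nu = {1: Fraction(1, 3), 2: Fraction(2, 3)}

print(tv_half_l1(mu, nu), tv_max_event(mu, nu))
```

The brute-force maximum enumerates all subsets of $\Omega$, so it is only meant as a sanity check on tiny state spaces.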
Coupling

Let $\mu$ and $\nu$ be two distributions on $\Omega$. A coupling of $\mu$ and $\nu$ is a joint distribution $\omega$ on $\Omega\times\Omega$ such that:

\[
\forall x\in\Omega,\quad \mu(x)=\sum_{y\in\Omega}\omega(x,y),
\]
\[
\forall y\in\Omega,\quad \nu(y)=\sum_{x\in\Omega}\omega(x,y).
\]
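The marginal conditions are easy to test mechanically. Below is a small sketch, assuming a joint distribution is stored as a dict keyed by pairs $(x, y)$; the function names `is_coupling` and `independent_coupling` are hypothetical helpers introduced here. It confirms that the product distribution is always a valid (though usually not optimal) coupling.

```python
from fractions import Fraction

def is_coupling(omega, mu, nu):
    """Check that omega (dict keyed by (x, y)) has marginals mu and nu."""
    xs, ys = set(mu), set(nu)
    # Summing over y must recover mu(x) for every x.
    rows_ok = all(sum(omega.get((x, y), 0) for y in ys) == mu[x] for x in xs)
    # Summing over x must recover nu(y) for every y.
    cols_ok = all(sum(omega.get((x, y), 0) for x in xs) == nu[y] for y in ys)
    return rows_ok and cols_ok

def independent_coupling(mu, nu):
    """The product distribution omega(x, y) = mu(x) * nu(y)."""
    return {(x, y): mu[x] * nu[y] for x in mu for y in nu}

# Illustrative values (assumed for this sketch).
mu = {1: Fraction(1, 2), 2: Fraction(1, 2)}
nu = {1: Fraction(1, 3), 2: Fraction(2, 3)}
omega = independent_coupling(mu, nu)
print(is_coupling(omega, mu, nu))
```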
Coupling Lemma

Let $\omega$ be a coupling of $\mu$ and $\nu$, so that $(X,Y)\sim\omega \implies X\sim\mu$ and $Y\sim\nu$. Then

\[
\Pr_{(X,Y)\sim\omega}[X\neq Y] \ge d_{\mathrm{TV}}(\mu,\nu).
\]

Moreover, there exists a coupling $\omega^*$ such that

\[
\Pr_{(X,Y)\sim\omega^*}[X\neq Y] = d_{\mathrm{TV}}(\mu,\nu).
\]
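The inequality in the lemma follows in one line from the max-over-events form of the total variation distance. The derivation below is a sketch of that standard argument, written for an arbitrary event $A\subseteq\Omega$ and an arbitrary coupling $\omega$; it is not taken from the slides.

```latex
\begin{align*}
\mu(A) - \nu(A)
  &= \Pr[X \in A] - \Pr[Y \in A] \\
  &= \Pr[X \in A,\ Y \notin A] + \Pr[X \in A,\ Y \in A] - \Pr[Y \in A] \\
  &\le \Pr[X \in A,\ Y \notin A] \\
  &\le \Pr[X \neq Y].
\end{align*}
% Taking the maximum over all events A on the left-hand side gives
% d_TV(mu, nu) <= Pr[X != Y].
```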
Proof of Coupling Lemma

For finite $\Omega$, designing a coupling is equivalent to filling an $\Omega\times\Omega$ matrix so that the marginals are correct.

Example: $\Omega=\{1,2\}$, $\mu=(1/2,1/2)$, $\nu=(1/3,2/3)$. Columns are indexed by $x$ (marginal $\mu$) and rows by $y$ (marginal $\nu$):

               x = 1    x = 2  |  ν
    y = 1       1/3       0    |  1/3
    y = 2       1/6      1/2   |  2/3
    ---------------------------+
    μ           1/2      1/2

$\omega^*$ is the coupling maximizing the sum of the diagonal entries.
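One way to fill the matrix so that the diagonal is as large as possible is sketched below. It puts $\min(\mu(x),\nu(x))$ on each diagonal entry and spreads the leftover mass proportionally off the diagonal. This particular off-diagonal rule is a common textbook choice assumed here, not necessarily the construction used in the lecture, and `maximal_coupling` is a name introduced for the sketch.

```python
from fractions import Fraction

def maximal_coupling(mu, nu):
    """Build a coupling whose diagonal entries are min(mu(x), nu(x))."""
    states = sorted(set(mu) | set(nu))
    m = {x: min(mu.get(x, 0), nu.get(x, 0)) for x in states}
    a = {x: mu.get(x, 0) - m[x] for x in states}  # excess of mu over the minimum
    b = {y: nu.get(y, 0) - m[y] for y in states}  # excess of nu over the minimum
    d = 1 - sum(m.values())                       # leftover mass, equal to d_TV
    omega = {}
    for x in states:
        for y in states:
            mass = m[x] if x == y else 0
            if d > 0:
                # a[x] * b[x] is always 0, so this only adds off-diagonal mass.
                mass += a[x] * b[y] / d
            omega[(x, y)] = mass
    return omega

mu = {1: Fraction(1, 2), 2: Fraction(1, 2)}
nu = {1: Fraction(1, 3), 2: Fraction(2, 3)}
omega_star = maximal_coupling(mu, nu)
off_diagonal = sum(p for (x, y), p in omega_star.items() if x != y)
print(omega_star, off_diagonal)
```

On the example from the slide, the off-diagonal mass equals $d_{\mathrm{TV}}(\mu,\nu)=1/6$, matching the equality case of the lemma.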
Coupling of Markov Chains

Consider two copies of the chain $P$:
• The initial distributions are $\mu_0$ and $\nu_0$.
• $\mu_t^T = \mu_0^T P^t$ and $\nu_t^T = \nu_0^T P^t$.

A coupling of the two chains is a joint distribution $\omega$ of $\{\mu_t\}_{t\ge 0}$ and $\{\nu_t\}_{t\ge 0}$ satisfying the following conditions. $\{(X_t,Y_t)\}_{t\ge 0}\sim\omega$ is a pair of processes such that

\[
\forall a,b\in\Omega,\quad \Pr[X_{t+1}=b \mid X_t=a] = P(a,b),
\]
\[
\forall a,b\in\Omega,\quad \Pr[Y_{t+1}=b \mid Y_t=a] = P(a,b),
\]

that is, marginally $\{X_t\}$ and $\{Y_t\}$ are both the chain $P$, and

\[
\forall t\ge 0,\quad X_t=Y_t \implies X_{t'}=Y_{t'} \text{ for all } t'>t,
\]

that is, the two chains coalesce once they meet.
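A simple coupling that satisfies these conditions moves the two copies independently until they meet and with a shared step afterwards. The simulation below is a minimal sketch of that idea; the 3-state transition matrix and the function names `step` and `coupled_run` are assumptions made for illustration.

```python
import random

def step(P, a, rng):
    """Sample the next state from row a of the transition matrix P."""
    return rng.choices(range(len(P)), weights=P[a])[0]

def coupled_run(P, x0, y0, steps, rng):
    """Run two copies of chain P: independent moves until they meet, shared moves after."""
    x, y = x0, y0
    path = [(x, y)]
    for _ in range(steps):
        if x == y:
            # Coalesced: both copies take the same step, which is still a step of P.
            x = y = step(P, x, rng)
        else:
            x, y = step(P, x, rng), step(P, y, rng)
        path.append((x, y))
    return path

# Illustrative chain (assumed): a lazy walk on three states.
P = [[0.5, 0.5, 0.0],
     [0.25, 0.5, 0.25],
     [0.0, 0.5, 0.5]]
path = coupled_run(P, 0, 2, 50, random.Random(0))
print(path[:10])
```

Each copy on its own is driven only by rows of $P$, so the marginal conditions hold, and the shared step enforces coalescence.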
Fundamental Theorem via Coupling

If a finite chain $P$ is irreducible and aperiodic, then it has a unique stationary distribution $\pi$. Moreover, for any initial distribution $\mu$, it holds that

\[
\lim_{t\to\infty} \mu^T P^t = \pi^T.
\]

Consider two chains $\{X_t\}_{t\ge 0}$ and $\{Y_t\}_{t\ge 0}$:
• $X_0\sim\pi$ and $Y_0\sim\mu_0$, for arbitrary $\mu_0$.
• A coupling where $X_t$ and $Y_t$ run independently (until they meet, after which they move together, as a coupling of Markov chains requires).
irreducible + aperiodic $\implies \exists t,\ \forall x,y,\ P^t(x,y)>0$.

Then for any $z\in\Omega$, there exists some $\theta>0$ s.t.

\[
\Pr[X_t=Y_t] \ge \Pr[X_t=Y_t=z] = \Pr[X_t=z]\cdot\Pr[Y_t=z] = \pi(z)\cdot P^t(Y_0,z) \ge \theta > 0,
\]

and therefore

\[
\Pr[X_t\neq Y_t] \le 1-\theta < 1.
\]

Since the two chains coalesce once they meet, the event $X_{2t}\neq Y_{2t}\wedge X_t=Y_t$ has probability zero, so

\begin{align*}
\Pr[X_{2t}\neq Y_{2t}] &= \Pr[X_{2t}\neq Y_{2t}\wedge X_t=Y_t] + \Pr[X_{2t}\neq Y_{2t}\wedge X_t\neq Y_t] \\
&= \Pr[X_{2t}\neq Y_{2t}\mid X_t\neq Y_t]\cdot\Pr[X_t\neq Y_t] \\
&\le (1-\theta)^2.
\end{align*}

…
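The geometric decay can be observed numerically on a small chain. The sketch below uses exact fractions and a lazy walk on three states, with its stationary distribution stated and verified in code; both are assumptions made for illustration. As a conservative choice that is valid for every block of $t$ steps, it takes $\theta = \max_z \bigl(\min_x P^t(x,z)\bigr)^2$, which is one admissible value and not necessarily the one intended on the slide, and then compares $d_{\mathrm{TV}}(\mu_0^T P^{kt}, \pi)$ with $(1-\theta)^k$.

```python
from fractions import Fraction as F

def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def vec_mat(v, M):
    """Row vector times matrix."""
    n = len(v)
    return [sum(v[i] * M[i][j] for i in range(n)) for j in range(n)]

def tv(mu, nu):
    """Total variation distance between two distributions given as lists."""
    return sum(abs(a - b) for a, b in zip(mu, nu)) / 2

# Illustrative chain (assumed): lazy walk on three states.
P = [[F(1, 2), F(1, 2), F(0)],
     [F(1, 4), F(1, 2), F(1, 4)],
     [F(0), F(1, 2), F(1, 2)]]
pi = [F(1, 4), F(1, 2), F(1, 4)]
assert vec_mat(pi, P) == pi  # pi is stationary for P

# Smallest t such that every entry of P^t is positive.
t, Pt = 1, P
while any(entry == 0 for row in Pt for entry in row):
    Pt = mat_mul(Pt, P)
    t += 1

# Conservative meeting probability per block of t steps (assumed choice).
n = len(P)
theta = max(min(Pt[x][z] for x in range(n)) ** 2 for z in range(n))

# Compare the exact distance to stationarity with the bound (1 - theta)^k.
dist = [F(1), F(0), F(0)]  # mu_0: start deterministically in state 0
gaps = []
for k in range(1, 6):
    dist = vec_mat(dist, Pt)
    gaps.append((k, tv(dist, pi), (1 - theta) ** k))

for k, gap, bound in gaps:
    print(k, gap, bound)
```

By the coupling lemma, $d_{\mathrm{TV}}(\mu_0^T P^{kt},\pi)\le\Pr[X_{kt}\neq Y_{kt}]$, which is why the exact distance should sit below the coupling bound in every row.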