SLIDE 1 Advanced Algorithms (XII)
Shanghai Jiao Tong University
Chihao Zhang
May 25, 2020
SLIDE 2
Random Walk on a Graph
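SLIDE 7 Random Walk on a Graph
[Figure: transition graph on states 1, 2, 3 with edge probabilities 1/2, 1/8, 1/4, 3/8, 1/3, 2/3, 3/4]
$P = [p_{ij}]_{1 \le i,j \le n}$ is the transition matrix of the graph, with $p_{ij} = \Pr[X_{t+1} = j \mid X_t = i]$
$\forall t \ge 0,\ \mu_t^T = \mu_0^T P^t$
Stationary distribution $\pi$: $\pi^T P = \pi^T$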
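The update rule $\mu_t^T = \mu_0^T P^t$ can be sketched in a few lines of Python. The 3-state matrix below is a hypothetical example chosen for illustration (it is not the matrix from the slide), and the names `step` and `evolve` are introduced here.

```python
from fractions import Fraction as F

# Hypothetical 3-state transition matrix (each row sums to 1); not the slide's matrix.
P = [
    [F(1, 2), F(1, 4), F(1, 4)],
    [F(1, 3), F(0), F(2, 3)],
    [F(1, 4), F(0), F(3, 4)],
]

def step(mu, P):
    """One step of the walk: returns the row vector mu^T P."""
    n = len(P)
    return [sum(mu[i] * P[i][j] for i in range(n)) for j in range(n)]

def evolve(mu0, P, t):
    """Returns mu_t^T = mu_0^T P^t by applying `step` t times."""
    mu = list(mu0)
    for _ in range(t):
        mu = step(mu, P)
    return mu

mu0 = [F(1), F(0), F(0)]   # start at state 1 with probability 1
mu2 = evolve(mu0, P, 2)    # distribution after two steps
```

Exact fractions are used so that each $\mu_t$ can be checked to sum to 1 without rounding error.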
SLIDE 8
Fundamental Theorem of Markov Chains
SLIDE 12 Fundamental Theorem of Markov Chains
We study a few basic questions regarding a chain:
- Does a stationary distribution always exist?
- If so, is the stationary distribution unique?
- If so, does any initial distribution converge to it?
SLIDE 13
Existence of Stationary Distribution
SLIDE 17 Existence of Stationary Distribution
Yes, any finite Markov chain has a stationary distribution
Perron-Frobenius
Any positive $n \times n$ matrix $A$ has a positive real eigenvalue $\lambda$ with $\rho(A) = \lambda$. Moreover, its eigenvector is positive.
$\lambda(P^T) = \lambda(P) = 1$
The positive eigenvector of $P^T$, normalized to sum to 1, is $\pi$
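For a small chain, $\pi$ can also be found directly by solving $\pi^T P = \pi^T$ together with $\sum_i \pi(i) = 1$. The following is a minimal sketch, assuming the solution is unique (for instance, the chain is irreducible); the function name `stationary` and the example matrix are illustrative choices, not part of the lecture.

```python
from fractions import Fraction as F

def stationary(P):
    """Solve pi^T P = pi^T with sum(pi) = 1 by Gauss-Jordan elimination.

    Assumes the solution is unique (e.g. the chain is irreducible).
    """
    n = len(P)
    # Augmented rows of (P^T - I) pi = 0.
    A = [[F(P[j][i]) - (1 if i == j else 0) for j in range(n)] + [F(0)]
         for i in range(n)]
    # The rows above sum to zero, so one is redundant: replace it by sum(pi) = 1.
    A[n - 1] = [F(1)] * n + [F(1)]
    for col in range(n):
        # Find a row with a nonzero pivot and move it into place.
        piv = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[piv] = A[piv], A[col]
        A[col] = [x / A[col][col] for x in A[col]]
        for r in range(n):
            if r != col and A[r][col] != 0:
                f = A[r][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][n] for i in range(n)]

# Hypothetical two-state example with p = 1/3 and q = 1/6.
P2 = [[F(2, 3), F(1, 3)], [F(1, 6), F(5, 6)]]
pi2 = stationary(P2)
```

If the chain is reducible the linear system has more than one solution, and this sketch may fail to find a pivot or return just one of the solutions.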
SLIDE 18
Uniqueness and Convergence
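SLIDE 23 Uniqueness and Convergence
[Figure: two-state chain on states 1 and 2; 1 → 2 with probability p, 2 → 1 with probability q, self-loops with probabilities 1 − p and 1 − q]
$P = \begin{bmatrix} 1-p & p \\ q & 1-q \end{bmatrix}$
$\pi = \left( \frac{q}{p+q}, \frac{p}{p+q} \right)^T$ is a stationary dist. of $P$
Start from an arbitrary $\mu_0 = (\mu_0(1), \mu_0(2))^T$
Compute $\| \mu_0^T P^t - \pi^T \|$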
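SLIDE 27
$\Delta_t = |\mu_t(1) - \pi(1)|$
$\Delta_{t+1} = \left| \mu_{t+1}(1) - \frac{q}{p+q} \right| = \left| \mu_t(1)(1-p) + (1 - \mu_t(1))q - \frac{q}{p+q} \right| = \left| (1-p-q) \left( \mu_t(1) - \frac{q}{p+q} \right) \right| = |1-p-q| \cdot \Delta_t$
Since $p, q \in [0,1]$, there are two ways to prohibit $\Delta_t \to 0$: $p = q = 1$ and $p = q = 0$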
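The contraction $\Delta_{t+1} = |1-p-q| \cdot \Delta_t$ can be checked numerically. This is a minimal sketch, assuming $p + q > 0$ so that $\pi(1) = q/(p+q)$ is defined; the function name `delta_sequence` and the sample values of $p$, $q$ are illustrative.

```python
from fractions import Fraction as F

def delta_sequence(p, q, mu1, steps):
    """Return [Delta_0, ..., Delta_steps] for the two-state chain,
    where Delta_t = |mu_t(1) - q/(p+q)| and mu1 is mu_0(1).
    Assumes p + q > 0."""
    pi1 = q / (p + q)
    deltas = []
    for _ in range(steps + 1):
        deltas.append(abs(mu1 - pi1))
        # mu_{t+1}(1) = mu_t(1)(1 - p) + (1 - mu_t(1)) q
        mu1 = mu1 * (1 - p) + (1 - mu1) * q
    return deltas

# With p = 1/2 and q = 1/4 the factor |1 - p - q| is 1/4.
contracting = delta_sequence(F(1, 2), F(1, 4), F(1), 2)
# With p = q = 1 the factor is 1, so the distance never shrinks.
stuck = delta_sequence(F(1), F(1), F(1), 3)
```

In the first run each entry is a quarter of the previous one; in the second the distance stays at its initial value.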
SLIDE 28
p = q = 0
SLIDE 35 $p = q = 0$
[Figure: states 1 and 2, each with a self-loop of probability 1]
$\forall t,\ \Delta_t = \Delta_0$
The graph is disconnected
The chain is called reducible
In this case, the stationary distribution is not unique
Chain = convex combination of small chains
Stationary distribution = convex combination of “small” distributions
SLIDE 36
p = q = 1
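SLIDE 42 $p = q = 1$
[Figure: states 1 and 2; each moves to the other with probability 1]
$\forall t,\ \mu_t(1) - \pi(1) = -(\mu_{t-1}(1) - \pi(1))$, so $\Delta_t = \Delta_{t-1}$
The graph is bipartite
The chain is called periodic
Formally, $\exists v,\ \gcd_{C \in C_v} |C| > 1$ (with $C_v$ the set of cycles through $v$)
In this case, not every initial distribution converges to the stationary distribution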
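The gcd condition can be evaluated mechanically on a small chain. The sketch below computes the gcd of the lengths of closed walks from $v$ back to $v$, which is one common way to state the period; the function name `period` and the cap of $3n$ steps are choices made here, not taken from the slides.

```python
from math import gcd

def period(P, v, max_len=None):
    """gcd of the lengths t <= max_len of closed walks from v back to v.

    Only the zero/nonzero pattern of the transition matrix P is used.
    Returns 0 if v is never revisited within max_len steps.
    """
    n = len(P)
    if max_len is None:
        # Assumed cap: a detour to any simple cycle and back fits within 3n steps.
        max_len = 3 * n
    reach = {v}   # states reachable from v in exactly t steps
    g = 0
    for t in range(1, max_len + 1):
        reach = {j for i in reach for j in range(n) if P[i][j] > 0}
        if v in reach:
            g = gcd(g, t)
    return g

flip = [[0, 1], [1, 0]]          # the p = q = 1 chain
lazy = [[0.5, 0.5], [1, 0]]      # a self-loop at state 0 breaks the periodicity
```

Here `period(flip, 0)` is 2, matching the bipartite case, while `period(lazy, 0)` is 1.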
SLIDE 43
Fundamental Theorem of Markov Chains
SLIDE 45 Fundamental Theorem of Markov Chains
If a finite chain $P$ is irreducible and aperiodic, then it has a unique stationary distribution $\pi$. Moreover, for any initial distribution $\mu$, it holds that
$\lim_{t \to \infty} \mu^T P^t = \pi^T$
(Show on board, see the note for details)
SLIDE 46
Reversible Chains
SLIDE 51
Reversible Chains
We study a special family of Markov chains called reversible chains
Their transition graphs are undirected: $x \to y \iff y \to x$
If a chain $P$ and a distribution $\pi$ satisfy the detailed balance condition
$\forall x, y \in V,\ \pi(x) \cdot P(x, y) = \pi(y) \cdot P(y, x)$
then $\pi$ is a stationary distribution of $P$
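The step from detailed balance to stationarity is a short computation: sum the condition over $x$ and use the fact that each row of $P$ sums to 1.

```latex
\[
(\pi^T P)(y) = \sum_{x \in V} \pi(x) P(x, y)
             = \sum_{x \in V} \pi(y) P(y, x)
             = \pi(y) \sum_{x \in V} P(y, x)
             = \pi(y).
\]
```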
SLIDE 56 We study reversible chains because
- They are quite general. For any $\pi$, one can define a reversible $P$ whose stationary distribution is $\pi$ (helpful for sampling)
- We have powerful tools (spectral method) to analyze reversible chains
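One standard way to realize the first point is the Metropolis rule, which the slides do not name. The sketch below assumes $\pi$ is strictly positive and proposes the next state uniformly among all states; both are illustrative choices, as is the function name `metropolis_chain`.

```python
from fractions import Fraction as F

def metropolis_chain(pi):
    """Build a transition matrix that satisfies detailed balance with pi.

    Proposal: pick y uniformly among the n states; accept the move with
    probability min(1, pi[y] / pi[x]); otherwise stay at x.
    Assumes every pi[x] > 0.
    """
    n = len(pi)
    P = [[F(0)] * n for _ in range(n)]
    for x in range(n):
        for y in range(n):
            if y != x:
                P[x][y] = F(1, n) * min(F(1), F(pi[y]) / F(pi[x]))
        P[x][x] = 1 - sum(P[x])   # rejected proposals stay at x
    return P

pi = [F(1, 6), F(1, 3), F(1, 2)]   # hypothetical target distribution
P = metropolis_chain(pi)
```

For $x \ne y$ this gives $\pi(x) P(x, y) = \frac{1}{n} \min(\pi(x), \pi(y))$, which is symmetric in $x$ and $y$, so detailed balance holds by construction.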
SLIDE 57
Spectral Decomposition Theorem
SLIDE 60 Spectral Decomposition Theorem
An $n \times n$ symmetric matrix $A$ has $n$ real eigenvalues $\lambda_1, \dots, \lambda_n$ with corresponding eigenvectors $v_1, \dots, v_n$ which are orthonormal. Moreover, it holds that
$A = V \Lambda V^T$ where $V = [v_1, \dots, v_n]$ and $\Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_n)$
Equivalently, $A = \sum_{i=1}^n \lambda_i v_i v_i^T$
SLIDE 61
Spectral Decomposition Theorem for Reversible Chains
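SLIDE 66 Spectral Decomposition Theorem for Reversible Chains
$\pi$ is a stationary distribution of a reversible chain $P$
Define an inner product $\langle \cdot, \cdot \rangle_\pi$ on $\mathbb{R}^n$:
$\langle x, y \rangle_\pi = \sum_{i=1}^n \pi(i) \cdot x(i) \cdot y(i) = x^T D_\pi y$, where $D_\pi = \mathrm{diag}(\pi(1), \dots, \pi(n))$
Consider the Hilbert space $\mathbb{R}^n$ endowed with $\langle \cdot, \cdot \rangle_\pi$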
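SLIDE 70
Let $P \in \mathbb{R}^{n \times n}$ be reversible with respect to $\pi$. It has $n$ real eigenvalues $\lambda_1, \dots, \lambda_n$ with corresponding eigenvectors $v_1, \dots, v_n$ which are orthogonal in $(\mathbb{R}^n, \langle \cdot, \cdot \rangle_\pi)$. Moreover
$P = \sum_{i=1}^n \lambda_i v_i v_i^T D_\pi$
symmetric case.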
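The decomposition can be checked by hand on the two-state chain from the earlier slides. Its eigenvalues are $1 - p - q$ and $1$, with eigenvectors proportional to $(p, -q)^T$ and $(1, 1)^T$. The sketch below assumes $p, q > 0$ and normalizes the eigenvectors so that $\langle v, v \rangle_\pi = 1$; the function name is illustrative.

```python
from math import sqrt, isclose

def two_state_decomposition(p, q):
    """Rebuild P = sum_i lambda_i v_i v_i^T D_pi for the two-state chain.

    Assumes p > 0 and q > 0 so that pi is positive and v_1 is well defined.
    """
    pi = [q / (p + q), p / (p + q)]
    lams = [1 - p - q, 1.0]
    # Eigenvectors, normalized so that <v, v>_pi = 1.
    vs = [[sqrt(p / q), -sqrt(q / p)], [1.0, 1.0]]
    # Entry (a, b) of v v^T D_pi is v[a] * v[b] * pi[b].
    return [[sum(l * v[a] * v[b] * pi[b] for l, v in zip(lams, vs))
             for b in range(2)] for a in range(2)]

rebuilt = two_state_decomposition(0.3, 0.5)
```

Up to floating-point rounding, `rebuilt` agrees with $P$ for $p = 0.3$, $q = 0.5$, that is, with rows $(0.7, 0.3)$ and $(0.5, 0.5)$.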
SLIDE 71
Properties of Eigenvalues
SLIDE 77 Properties of Eigenvalues
$\pi$ is a stationary distribution of a reversible chain $P$
The eigenvalues of $P$ are $\lambda_1 \le \lambda_2 \le \dots \le \lambda_n$
- $\lambda_1 \ge -1$, and $\lambda_1 = -1$ if and only if $P$ is bipartite
- $\lambda_{n-1} = 1$ if and only if $P$ is reducible
Proof next week!
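SLIDE 83
$P = \sum_{i=1}^n \lambda_i v_i v_i^T D_\pi$
$P^t = \sum_{i=1}^n \lambda_i^t v_i v_i^T D_\pi$
If $P$ is irreducible ($\lambda_{n-1} < 1$) and aperiodic ($\lambda_1 > -1$), then $\lambda_i^t \to 0$ for every $i < n$ and only the term with $\lambda_n = 1$, $v_n = \mathbf{1}$ survives:
$\lim_{t \to \infty} P^t = \mathbf{1}\mathbf{1}^T D_\pi = \begin{bmatrix} \pi^T \\ \pi^T \\ \vdots \\ \pi^T \end{bmatrix}$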
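The limit can be observed numerically on the two-state chain. This is a rough sketch using plain repeated multiplication; the helper names `mat_mul` and `mat_pow` and the values $p = 0.3$, $q = 0.5$ are illustrative assumptions.

```python
def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(P, t):
    """P^t by repeated multiplication, starting from the identity."""
    n = len(P)
    R = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(t):
        R = mat_mul(R, P)
    return R

p, q = 0.3, 0.5
P = [[1 - p, p], [q, 1 - q]]
pi = [q / (p + q), p / (p + q)]
Pt = mat_pow(P, 50)   # every row should be close to pi

# The periodic chain p = q = 1 has lambda_1 = -1, and its powers keep alternating.
flip_odd = mat_pow([[0.0, 1.0], [1.0, 0.0]], 51)
```

In the first case the second eigenvalue is $1 - p - q = 0.2$, so the deviation from $\pi$ shrinks by a factor of 5 per step; in the second case an odd power is the flip matrix itself and no limit exists.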