Advanced Algorithms (XII), Shanghai Jiao Tong University, Chihao Zhang

slide-1
SLIDE 1

Advanced Algorithms (XII)

Shanghai Jiao Tong University

Chihao Zhang

May 25, 2020

slide-7
SLIDE 7

Random Walk on a Graph

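$P = [p_{ij}]_{1 \le i,j \le n}$, where $p_{ij} = \Pr[X_{t+1} = j \mid X_t = i]$

[Matrix: 3 × 3 transition matrix of the graph in the figure]

$\forall t \ge 0,\ \mu_t^T = \mu_0^T P^t$

Stationary distribution $\pi$: $\pi^T P = \pi^T$

[Figure: transition graph on states 1, 2, 3 with edge probabilities 1/2, 1/8, 1/4, 3/8, 1/3, 2/3, 3/4]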
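The two identities on this slide can be tried out on a small chain. The sketch below uses a hypothetical 3-state matrix (an assumption for illustration, not the matrix from the slide) and exact fractions; the helper names `step` and `evolve` are likewise made up here.

```python
from fractions import Fraction as F

# Hypothetical 3-state chain (NOT the matrix from the slide); each row sums to 1.
P = [
    [F(1, 2), F(1, 4), F(1, 4)],
    [F(1, 3), F(1, 3), F(1, 3)],
    [F(1, 4), F(1, 4), F(1, 2)],
]

def step(mu, P):
    """One step of the walk: returns the row vector mu^T P."""
    n = len(P)
    return [sum(mu[i] * P[i][j] for i in range(n)) for j in range(n)]

def evolve(mu0, P, t):
    """Distribution after t steps: mu_t^T = mu_0^T P^t."""
    mu = list(mu0)
    for _ in range(t):
        mu = step(mu, P)
    return mu

mu0 = [F(1), F(0), F(0)]        # start deterministically at the first state
mu3 = evolve(mu0, P, 3)
assert sum(mu3) == 1            # mu_t is still a probability distribution

# pi is stationary exactly when pi^T P = pi^T.
pi = [F(4, 11), F(3, 11), F(4, 11)]
assert step(pi, P) == pi
```

Exact fractions are used so that the stationarity check is an equality rather than a floating-point comparison.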

slide-12
SLIDE 12

Fundamental Theorem of Markov Chains

We study a few basic questions regarding a chain:

  • Does a stationary distribution always exist?
  • If so, is the stationary distribution unique?
  • If so, does any initial distribution converge to it?
slide-17
SLIDE 17

Existence of Stationary Distribution

Yes, any Markov chain has a stationary distribution

Perron-Frobenius

Any positive $n \times n$ matrix $A$ has a positive real eigenvalue $\lambda$ with $\rho(A) = \lambda$. Moreover, its eigenvector is positive.

$\lambda(P^T) = \lambda(P) = 1$

The positive eigenvector is $\pi$
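For a small chain, the eigenvector $\pi$ can also be found directly by solving $\pi^T P = \pi^T$ together with $\sum_i \pi(i) = 1$. The sketch below does this by Gaussian elimination over exact fractions. The function name `stationary` and the example matrix are assumptions for illustration, and the code assumes the solution is unique (for instance, an irreducible chain).

```python
from fractions import Fraction as F

def stationary(P):
    """Solve pi^T P = pi^T with sum(pi) = 1 by Gauss-Jordan elimination.

    Assumes the solution is unique (e.g. P is irreducible).
    """
    n = len(P)
    # Row j (j < n-1) encodes sum_i pi_i * (P[i][j] - [i == j]) = 0.
    A = [[P[i][j] - (1 if i == j else 0) for i in range(n)] + [F(0)]
         for j in range(n - 1)]
    # The last row encodes the normalisation sum_i pi_i = 1.
    A.append([F(1)] * n + [F(1)])
    for col in range(n):
        pivot = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        A[col] = [x / A[col][col] for x in A[col]]
        for r in range(n):
            if r != col and A[r][col] != 0:
                f = A[r][col]
                A[r] = [x - f * y for x, y in zip(A[r], A[col])]
    return [A[i][n] for i in range(n)]

# Hypothetical two-state example.
P = [[F(3, 4), F(1, 4)],
     [F(1, 2), F(1, 2)]]
pi = stationary(P)
assert pi == [F(2, 3), F(1, 3)]
```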

slide-23
SLIDE 23

Uniqueness and Convergence

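$P = \begin{bmatrix} 1-p & p \\ q & 1-q \end{bmatrix}$

$\pi = \left( \frac{q}{p+q}, \frac{p}{p+q} \right)^T$ is a stationary dist. of $P$

Start from an arbitrary $\mu_0 = (\mu(1), \mu(2))^T$

Compute $\| \mu_0^T P^t - \pi^T \|$

[Figure: two-state chain on states 1 and 2 with edge probabilities $1-p$, $p$, $q$, $1-q$]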

slide-27
SLIDE 27

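$\Delta_t = |\mu_t(1) - \pi(1)|$

$\Delta_{t+1} = \left| \mu_{t+1}(1) - \frac{q}{p+q} \right|$
$= \left| \mu_t(1)(1-p) + (1 - \mu_t(1))\, q - \frac{q}{p+q} \right|$
$= \left| (1-p-q) \left( \mu_t(1) - \frac{q}{p+q} \right) \right|$
$= |1-p-q| \cdot \Delta_t$

Since $p, q \in [0,1]$, there are two ways to prohibit $\Delta_t \to 0$: $p = q = 0$ or $p = q = 1$.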
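The recurrence can be checked numerically: each step should shrink the distance to $\pi(1)$ by exactly the factor $|1-p-q|$. A minimal sketch, with hypothetical values of $p$ and $q$ and a made-up helper name `two_state_step`:

```python
def two_state_step(mu1, p, q):
    """mu_{t+1}(1) for the chain P = [[1-p, p], [q, 1-q]]."""
    return mu1 * (1 - p) + (1 - mu1) * q

p, q = 0.3, 0.5              # hypothetical transition probabilities
pi1 = q / (p + q)            # pi(1) = q / (p + q)
mu1 = 1.0                    # start at state 1 with probability 1

deltas = []
for t in range(6):
    deltas.append(abs(mu1 - pi1))        # Delta_t = |mu_t(1) - pi(1)|
    mu1 = two_state_step(mu1, p, q)

# Delta_{t+1} = |1 - p - q| * Delta_t, up to floating-point error.
for d_now, d_next in zip(deltas, deltas[1:]):
    assert abs(d_next - abs(1 - p - q) * d_now) < 1e-12
```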

slide-35
SLIDE 35

$p = q = 0$

[Figure: states 1 and 2, each with a self-loop of probability 1]

$\forall t,\ \Delta_t = \Delta_0$

The graph is disconnected.
The chain is called reducible.
In this case, the stationary distribution is not unique.

Chain = convex combination of small chains

Stationary distribution = convex combination of “small” distributions

slide-42
SLIDE 42

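$p = q = 1$

[Figure: states 1 and 2 joined by the edges 1 → 2 and 2 → 1, each with probability 1]

$\forall t,\ \mu_t(1) - \pi(1) = -\bigl( \mu_{t-1}(1) - \pi(1) \bigr)$, so $\Delta_t = \Delta_{t-1}$

The graph is bipartite.
The chain is called periodic.
Formally, $\exists v,\ \gcd_{C \in \mathcal{C}_v} |C| > 1$, where $\mathcal{C}_v$ is the set of cycles through $v$.
In this case, not every initial distribution converges to the stationary distribution.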
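Both degenerate cases of the two-state chain are easy to observe directly. The sketch below tracks $\mu_t(1)$ with exact fractions; the helper `trajectory` and the starting value 1/5 are assumptions made for illustration.

```python
from fractions import Fraction as F

def trajectory(mu1, p, q, steps):
    """Values mu_0(1), ..., mu_steps(1) for P = [[1-p, p], [q, 1-q]]."""
    out = [mu1]
    for _ in range(steps):
        mu1 = mu1 * (1 - p) + (1 - mu1) * q
        out.append(mu1)
    return out

start = F(1, 5)

# p = q = 0 (reducible): the distribution never moves, so every mu is stationary.
assert trajectory(start, 0, 0, 4) == [start] * 5

# p = q = 1 (periodic): mu(1) alternates between 1/5 and 4/5 and never
# reaches pi(1) = 1/2 unless it starts there.
assert trajectory(start, 1, 1, 4) == [F(1, 5), F(4, 5), F(1, 5), F(4, 5), F(1, 5)]
assert trajectory(F(1, 2), 1, 1, 4) == [F(1, 2)] * 5
```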

slide-45
SLIDE 45

Fundamental Theorem of Markov Chains

If a finite chain $P$ is irreducible and aperiodic, then it has a unique stationary distribution $\pi$. Moreover, for any initial distribution $\mu$, it holds that

$\lim_{t \to \infty} \mu^T P^t = \pi^T$

(Show on board, see the note for details)

slide-51
SLIDE 51

Reversible Chains

We study a special family of Markov chains called reversible chains.

Their transition graphs are undirected: $x \to y \iff y \to x$

A chain $P$ and a distribution $\pi$ satisfy the detailed balance condition if

$\forall x, y \in V,\ \pi(x) \cdot P(x, y) = \pi(y) \cdot P(y, x)$

Then $\pi$ is a stationary distribution of $P$.
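The final claim follows in one line from detailed balance: sum the condition over $x$ and use the fact that each row of $P$ sums to 1. A short derivation in the slide's notation:

```latex
\begin{align*}
(\pi^T P)(y) &= \sum_{x \in V} \pi(x)\, P(x, y) \\
             &= \sum_{x \in V} \pi(y)\, P(y, x)
                && \text{(detailed balance)} \\
             &= \pi(y) \sum_{x \in V} P(y, x) = \pi(y)
                && \text{(rows of $P$ sum to 1)}
\end{align*}
```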

slide-56
SLIDE 56

We study reversible chains because

  • They are quite general. For any $\pi$, one can define a reversible $P$ whose stationary distribution is $\pi$. Helpful for sampling.
  • We have powerful tools (spectral method) to analyze reversible chains.
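One standard way to build a reversible chain for a given $\pi$ is the Metropolis construction. The slides do not say which construction the lecture has in mind, so the sketch below is only an illustration under stated assumptions: states are $0, \dots, n-1$, every $\pi(x) > 0$, and the proposal is uniform over the other states. The function name `metropolis` is made up here.

```python
from fractions import Fraction as F

def metropolis(pi):
    """Metropolis chain with a uniform proposal over the other n-1 states.

    Assumes n >= 2 and pi[x] > 0 for every state x.
    """
    n = len(pi)
    P = [[F(0)] * n for _ in range(n)]
    for x in range(n):
        for y in range(n):
            if x != y:
                # Propose y with probability 1/(n-1), accept with min(1, pi[y]/pi[x]).
                P[x][y] = F(1, n - 1) * min(F(1), pi[y] / pi[x])
        P[x][x] = 1 - sum(P[x])      # rejected proposals stay at x
    return P

pi = [F(1, 6), F(1, 3), F(1, 2)]     # hypothetical target distribution
P = metropolis(pi)
n = len(pi)

# Detailed balance: pi(x) P(x, y) = pi(y) P(y, x) for all x, y.
assert all(pi[x] * P[x][y] == pi[y] * P[y][x] for x in range(n) for y in range(n))
# Hence pi is stationary: pi^T P = pi^T.
assert [sum(pi[x] * P[x][y] for x in range(n)) for y in range(n)] == pi
```

Detailed balance holds because $\pi(x) P(x, y) = \min(\pi(x), \pi(y)) / (n-1)$, which is symmetric in $x$ and $y$.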

slide-60
SLIDE 60

Spectral Decomposition Theorem

An $n \times n$ symmetric matrix $A$ has $n$ real eigenvalues $\lambda_1, \dots, \lambda_n$ with corresponding eigenvectors $v_1, \dots, v_n$ which are orthonormal. Moreover, it holds that

$A = V \Lambda V^T$

where $V = [v_1, \dots, v_n]$ and $\Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_n)$.

Equivalently, $A = \sum_{i=1}^{n} \lambda_i v_i v_i^T$
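A small worked instance may help make the decomposition concrete. The $2 \times 2$ matrix below is chosen purely for illustration and does not come from the slides:

```latex
\[
A = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}, \qquad
\lambda_1 = 1,\ v_1 = \tfrac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ -1 \end{bmatrix}, \qquad
\lambda_2 = 3,\ v_2 = \tfrac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ 1 \end{bmatrix}
\]
\[
\lambda_1 v_1 v_1^T + \lambda_2 v_2 v_2^T
= \frac{1}{2} \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}
+ \frac{3}{2} \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}
= \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix} = A
\]
```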

slide-66
SLIDE 66

Spectral Decomposition Theorem for Reversible Chains

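$\pi$ is a stationary distribution of a reversible chain $P$

Define an inner product $\langle \cdot, \cdot \rangle_\pi$ on $\mathbb{R}^n$:

$\langle x, y \rangle_\pi = \sum_{i=1}^{n} \pi(i) \cdot x(i) \cdot y(i) = x^T D_\pi y$, where $D_\pi = \mathrm{diag}(\pi_1, \dots, \pi_n)$

Consider the Hilbert space $\mathbb{R}^n$ endowed with $\langle \cdot, \cdot \rangle_\pi$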

slide-70
SLIDE 70

Let $P \in \mathbb{R}^{n \times n}$ be reversible with respect to $\pi$. It has $n$ real eigenvalues $\lambda_1, \dots, \lambda_n$ with corresponding eigenvectors $v_1, \dots, v_n$ which are orthonormal in $(\mathbb{R}^n, \langle \cdot, \cdot \rangle_\pi)$. Moreover

$P = \sum_{i=1}^{n} \lambda_i v_i v_i^T D_\pi$

  • Proof. Reduce to the symmetric case.
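One common way to carry out this reduction is sketched below. It assumes $\pi(i) > 0$ for every $i$, and the auxiliary matrix $S$ and vectors $u_i$ are names introduced here; the lecture's own proof may be organised differently.

```latex
% Assumption: \pi(i) > 0 for all i, so D_\pi^{1/2} is invertible.
\[
S := D_\pi^{1/2} P D_\pi^{-1/2}, \qquad
S(x, y) = \sqrt{\pi(x)}\, P(x, y)\, \frac{1}{\sqrt{\pi(y)}}
\]
% Detailed balance \pi(x) P(x,y) = \pi(y) P(y,x) is equivalent to S(x,y) = S(y,x),
% so S is symmetric and the spectral decomposition theorem applies:
\[
S = \sum_{i=1}^{n} \lambda_i u_i u_i^T, \qquad u_i^T u_j = [i = j]
\]
% Set v_i := D_\pi^{-1/2} u_i. Then P v_i = \lambda_i v_i and
% \langle v_i, v_j \rangle_\pi = v_i^T D_\pi v_j = u_i^T u_j = [i = j]. Finally
\[
P = D_\pi^{-1/2} S D_\pi^{1/2}
  = \sum_{i=1}^{n} \lambda_i \bigl( D_\pi^{-1/2} u_i \bigr) \bigl( D_\pi^{1/2} u_i \bigr)^T
  = \sum_{i=1}^{n} \lambda_i v_i v_i^T D_\pi
\]
```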

slide-77
SLIDE 77

Properties of Eigenvalues

$\pi$ is a stationary distribution of a reversible chain $P$

The eigenvalues of $P$ are $\lambda_1 \le \lambda_2 \le \dots \le \lambda_n$

  • $\lambda_n = 1$
  • $\lambda_1 \ge -1$, and $\lambda_1 = -1$ if and only if $P$ is bipartite
  • $\lambda_{n-1} = 1$ if and only if $P$ is reducible

Proof next week!
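Pending the proof, the three properties can be checked by hand on the two-state chain with parameters $p, q \in [0,1]$ from the earlier slides, where $n = 2$ and so $\lambda_{n-1} = \lambda_1$:

```latex
\[
P = \begin{bmatrix} 1-p & p \\ q & 1-q \end{bmatrix}, \qquad
\det(P - \lambda I) = (1-p-\lambda)(1-q-\lambda) - pq
                    = (\lambda - 1)\bigl(\lambda - (1-p-q)\bigr)
\]
\[
\lambda_2 = 1, \qquad \lambda_1 = 1 - p - q \in [-1, 1]
\]
% \lambda_1 = -1  iff  p + q = 2  iff  p = q = 1  (the bipartite, periodic case)
% \lambda_1 =  1  iff  p + q = 0  iff  p = q = 0  (the disconnected, reducible case)
```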

slide-83
SLIDE 83

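$P = \sum_{i=1}^{n} \lambda_i v_i v_i^T D_\pi$

$P^t = \sum_{i=1}^{n} \lambda_i^t v_i v_i^T D_\pi$

If $P$ is irreducible ($\lambda_{n-1} < 1$) and aperiodic ($\lambda_1 > -1$), then

$\lim_{t \to \infty} P^t = \mathbf{1} \mathbf{1}^T D_\pi = \begin{bmatrix} \pi^T \\ \pi^T \\ \vdots \\ \pi^T \end{bmatrix}$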
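The formula for $P^t$ and its limit can be checked on the two-state chain, whose eigenpairs are available in closed form. In the sketch below the values of $p$ and $q$ are hypothetical, and the helper names `spectral_power` and `close` are introduced only for this illustration.

```python
import math

p, q = 0.3, 0.5                      # hypothetical transition probabilities
P = [[1 - p, p], [q, 1 - q]]
pi = [q / (p + q), p / (p + q)]      # stationary distribution; D_pi = diag(pi)

# Eigenpairs of P, with eigenvectors orthonormal in <x, y>_pi = sum_i pi(i) x(i) y(i).
eigenpairs = [
    (1 - p - q, [math.sqrt(p / q), -math.sqrt(q / p)]),   # lambda_1
    (1.0, [1.0, 1.0]),                                     # lambda_n with the all-ones vector
]

def spectral_power(t):
    """P^t computed as sum_i lambda_i^t v_i v_i^T D_pi."""
    return [[sum(lam ** t * v[a] * v[b] * pi[b] for lam, v in eigenpairs)
             for b in range(2)] for a in range(2)]

def close(A, B, tol=1e-12):
    """Entrywise comparison of two 2 x 2 matrices."""
    return all(abs(A[a][b] - B[a][b]) < tol for a in range(2) for b in range(2))

assert close(spectral_power(1), P)            # t = 1 recovers P itself
assert close(spectral_power(200), [pi, pi])   # for large t every row approaches pi^T
```

Since $|\lambda_1| = |1 - p - q| < 1$ here, the $\lambda_1^t$ term vanishes and only the $\mathbf{1} \mathbf{1}^T D_\pi$ term survives.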