Modern Discrete Probability VI – Spectral Techniques: Background


  1. Modern Discrete Probability VI – Spectral Techniques: Background
Sébastien Roch, UW–Madison Mathematics
December 1, 2014
Sébastien Roch, UW–Madison. Modern Discrete Probability – Spectral Techniques

  2. Outline
1. Review
2. Bounding the mixing time via the spectral gap
3. Applications: random walk on cycle and hypercube
4. Infinite networks

  3. Mixing time I
Theorem (Convergence to stationarity). Consider a finite state space V. Suppose the transition matrix P is irreducible, aperiodic and has stationary distribution π. Then, for all x, y, P^t(x, y) → π(y) as t → +∞.
For probability measures µ, ν on V, let their total variation distance be ‖µ − ν‖_TV := sup_{A ⊆ V} |µ(A) − ν(A)|.
Definition (Mixing time). The mixing time is t_mix(ε) := min{t ≥ 0 : d(t) ≤ ε}, where d(t) := max_{x ∈ V} ‖P^t(x, ·) − π(·)‖_TV.
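The definitions above can be checked numerically. The sketch below, which assumes numpy and uses a made-up example (the lazy simple random walk on a 6-cycle), computes d(t) and t_mix(ε) directly from the definitions; the chain and all names are ours, not from the slides.

```python
import numpy as np

# Lazy simple random walk on a 6-cycle (irreducible and, thanks to
# the holding probability 1/2, aperiodic); pi is uniform.
n = 6
P = np.zeros((n, n))
for x in range(n):
    P[x, x] = 0.5
    P[x, (x - 1) % n] = 0.25
    P[x, (x + 1) % n] = 0.25
pi = np.full(n, 1.0 / n)

def tv_distance(mu, nu):
    # ||mu - nu||_TV = (1/2) * sum_y |mu(y) - nu(y)|
    return 0.5 * np.abs(mu - nu).sum()

def d(t):
    # d(t) = max_x ||P^t(x, .) - pi||_TV
    Pt = np.linalg.matrix_power(P, t)
    return max(tv_distance(Pt[x], pi) for x in range(n))

def t_mix(eps):
    # t_mix(eps) = min{ t >= 0 : d(t) <= eps }
    t = 0
    while d(t) > eps:
        t += 1
    return t

tm = t_mix(0.25)   # mixing time at the conventional threshold eps = 1/4
```

By the convergence theorem, d(t) → 0, so the loop in `t_mix` terminates for any ε > 0.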

  4. Mixing time II
Definition (Separation distance). The separation distance is defined as s_x(t) := max_{y ∈ V} [1 − P^t(x, y)/π(y)], and we let s(t) := max_{x ∈ V} s_x(t). Because both {π(y)} and {P^t(x, y)} are non-negative and sum to 1, we have that s_x(t) ≥ 0.
Lemma (Separation distance vs. total variation distance). d(t) ≤ s(t).
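The lemma can be sanity-checked numerically. A minimal sketch, assuming numpy and a made-up test chain (lazy walk on a 4-cycle), computing both d(t) and s(t) from their definitions:

```python
import numpy as np

# Lazy simple random walk on a 4-cycle; pi is uniform.
n = 4
P = np.zeros((n, n))
for x in range(n):
    P[x, x] = 0.5
    P[x, (x - 1) % n] = 0.25
    P[x, (x + 1) % n] = 0.25
pi = np.full(n, 1.0 / n)

def d_and_s(t):
    Pt = np.linalg.matrix_power(P, t)
    # d(t) = max_x ||P^t(x, .) - pi||_TV
    d = max(0.5 * np.abs(Pt[x] - pi).sum() for x in range(n))
    # s(t) = max_x max_y [1 - P^t(x, y)/pi(y)]
    s = max((1.0 - Pt[x] / pi).max() for x in range(n))
    return d, s

# Lemma: d(t) <= s(t) at every time t.
for t in range(1, 8):
    dt, st = d_and_s(t)
    assert dt <= st + 1e-12
```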

  5. Mixing time III
Proof: Because 1 = Σ_y π(y) = Σ_y P^t(x, y),
Σ_{y : P^t(x,y) < π(y)} [π(y) − P^t(x, y)] = Σ_{y : P^t(x,y) ≥ π(y)} [P^t(x, y) − π(y)].
So
‖P^t(x, ·) − π(·)‖_TV = (1/2) Σ_y |π(y) − P^t(x, y)|
= Σ_{y : P^t(x,y) < π(y)} [π(y) − P^t(x, y)]
= Σ_{y : P^t(x,y) < π(y)} π(y) [1 − P^t(x, y)/π(y)]
≤ s_x(t).

  6. Reversible chains
Definition (Reversible chain). A transition matrix P is reversible w.r.t. a measure η if η(x) P(x, y) = η(y) P(y, x) for all x, y ∈ V. By summing over y, such a measure is necessarily stationary.
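A short numeric sketch of the definition, assuming numpy and a made-up birth-death chain (tridiagonal chains are always reversible w.r.t. their stationary distribution); the helper name `is_reversible` is ours:

```python
import numpy as np

# A made-up 3-state birth-death chain.
P = np.array([[0.7, 0.3, 0.0],
              [0.2, 0.5, 0.3],
              [0.0, 0.4, 0.6]])

# Stationary distribution: normalized left eigenvector of P for eigenvalue 1.
w, V = np.linalg.eig(P.T)
pi = np.real(V[:, np.argmin(np.abs(w - 1.0))])
pi = pi / pi.sum()

def is_reversible(P, eta, tol=1e-10):
    # Detailed balance: the matrix eta(x) P(x, y) is symmetric.
    B = eta[:, None] * P
    return np.allclose(B, B.T, atol=tol)

assert is_reversible(P, pi)       # eta(x)P(x,y) = eta(y)P(y,x)
assert np.allclose(pi @ P, pi)    # summing over y: pi is stationary
```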

  7. Example I
Recall:
Definition (Random walk on a graph). Let G = (V, E) be a finite or countable, locally finite graph. Simple random walk on G is the Markov chain on V, started at an arbitrary vertex, which at each time picks a uniformly chosen neighbor of the current state.
Let (X_t) be simple random walk on a connected graph G. Then (X_t) is reversible w.r.t. η(v) := δ(v), where δ(v) is the degree of vertex v.
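A minimal sketch of this example, assuming numpy and a made-up 4-vertex graph, checking that the degree measure satisfies detailed balance:

```python
import numpy as np

# Made-up connected graph on 4 vertices.
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
n = 4
A = np.zeros((n, n))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0
deg = A.sum(axis=1)
P = A / deg[:, None]      # simple random walk: uniform over neighbors

# Detailed balance: deg(x) P(x, y) = A(x, y) = A(y, x) = deg(y) P(y, x).
B = deg[:, None] * P
assert np.allclose(B, B.T)

# Normalizing eta gives the stationary distribution pi(v) = deg(v) / 2|E|.
pi = deg / deg.sum()
assert np.allclose(pi @ P, pi)
```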

  8. Example II
Definition (Random walk on a network). Let G = (V, E) be a finite or countable, locally finite graph. Let c : E → R_+ be a positive edge weight function on G. We call N = (G, c) a network. Random walk on N is the Markov chain on V, started at an arbitrary vertex, which at each time picks a neighbor of the current state proportionally to the weight of the corresponding edge.
Any countable, reversible Markov chain can be seen as a random walk on a network (not necessarily locally finite) by setting c(e) := π(x) P(x, y) = π(y) P(y, x) for all e = {x, y} ∈ E.
Let (X_t) be random walk on a network N = (G, c). Then (X_t) is reversible w.r.t. η(v) := c(v), where c(v) := Σ_{x ∼ v} c(v, x).
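The weighted case works the same way. A sketch, assuming numpy and made-up positive edge weights on a triangle:

```python
import numpy as np

# Made-up network: triangle with symmetric positive edge weights c(x, y).
c = np.array([[0.0, 2.0, 1.0],
              [2.0, 0.0, 3.0],
              [1.0, 3.0, 0.0]])
cv = c.sum(axis=1)        # c(v) = sum of weights of edges at v
P = c / cv[:, None]       # pick a neighbor proportionally to edge weight

# Detailed balance: c(x) P(x, y) = c(x, y) = c(y, x) = c(y) P(y, x).
B = cv[:, None] * P
assert np.allclose(B, B.T)

pi = cv / cv.sum()        # normalized eta is stationary
assert np.allclose(pi @ P, pi)
```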

  9. Eigenbasis I
We let n := |V| < +∞. Assume that P is irreducible and reversible w.r.t. its stationary distribution π > 0. Define
⟨f, g⟩_π := Σ_{x ∈ V} π(x) f(x) g(x),    ‖f‖²_π := ⟨f, f⟩_π,    (Pf)(x) := Σ_y P(x, y) f(y).
We let ℓ²(V, π) be the Hilbert space of real-valued functions on V equipped with the inner product ⟨·, ·⟩_π (equivalent to the vector space (R^n, ⟨·, ·⟩_π)).
Theorem. There is an orthonormal basis of ℓ²(V, π) formed of eigenfunctions {f_j}_{j=1}^n of P with real eigenvalues {λ_j}_{j=1}^n.

  10. Eigenbasis II
Proof: We work over (R^n, ⟨·, ·⟩_π). Let D_π be the diagonal matrix with π on the diagonal. By reversibility,
M(x, y) := π(x)^{1/2} P(x, y) π(y)^{−1/2} = π(y)^{1/2} P(y, x) π(x)^{−1/2} =: M(y, x).
So M = (M(x, y))_{x,y} = D_π^{1/2} P D_π^{−1/2}, as a symmetric matrix, has real eigenvalues {λ_j}_{j=1}^n with corresponding eigenvectors {φ_j}_{j=1}^n forming an orthonormal basis of R^n. Define f_j := D_π^{−1/2} φ_j. Then
P f_j = P D_π^{−1/2} φ_j = D_π^{−1/2} [D_π^{1/2} P D_π^{−1/2}] φ_j = D_π^{−1/2} M φ_j = λ_j D_π^{−1/2} φ_j = λ_j f_j,
and
⟨f_i, f_j⟩_π = ⟨D_π^{−1/2} φ_i, D_π^{−1/2} φ_j⟩_π = Σ_x π(x) [π(x)^{−1/2} φ_i(x)] [π(x)^{−1/2} φ_j(x)] = ⟨φ_i, φ_j⟩.
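The proof's construction is directly computable. A sketch, assuming numpy and a made-up reversible chain (lazy walk on a 3-vertex path): symmetrize M = D_π^{1/2} P D_π^{−1/2}, diagonalize it, and map the eigenvectors back via f_j = D_π^{−1/2} φ_j.

```python
import numpy as np

# Lazy simple random walk on the path 0-1-2; reversible w.r.t. pi below.
P = np.array([[0.5, 0.5, 0.0],
              [0.25, 0.5, 0.25],
              [0.0, 0.5, 0.5]])
pi = np.array([0.25, 0.5, 0.25])     # stationary, by detailed balance

Dh = np.diag(np.sqrt(pi))            # D_pi^{1/2}
Dhi = np.diag(1.0 / np.sqrt(pi))     # D_pi^{-1/2}
M = Dh @ P @ Dhi
assert np.allclose(M, M.T)           # symmetric by reversibility

lam, Phi = np.linalg.eigh(M)         # real eigenvalues, orthonormal phi_j
F = Dhi @ Phi                        # columns f_j = D_pi^{-1/2} phi_j

# Each f_j is an eigenfunction of P ...
for j in range(3):
    assert np.allclose(P @ F[:, j], lam[j] * F[:, j])
# ... and {f_j} is orthonormal in l2(V, pi): Gram matrix <f_i, f_j>_pi = I.
G = F.T @ np.diag(pi) @ F
assert np.allclose(G, np.eye(3))
```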

  11. Eigenbasis III
Lemma. For all j ≠ 1, Σ_x π(x) f_j(x) = 0.
Proof: By orthonormality, ⟨f_1, f_j⟩_π = 0. Now use the fact that f_1 ≡ 1.
Let δ_x(y) := 1{x = y}.
Lemma. For all x, y, Σ_{j=1}^n f_j(x) f_j(y) = π(x)^{−1} δ_x(y).
Proof: Using the notation of the theorem, the matrix Φ whose columns are the φ_j's is orthogonal, so ΦΦ′ = I. That is, Σ_{j=1}^n φ_j(x) φ_j(y) = δ_x(y), or Σ_{j=1}^n π(x)^{1/2} π(y)^{1/2} f_j(x) f_j(y) = δ_x(y). Rearranging gives the result.
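Both lemmas can be verified numerically. A sketch, assuming numpy and the same made-up lazy path walk as before:

```python
import numpy as np

# Lazy walk on the path 0-1-2, reversible w.r.t. pi.
P = np.array([[0.5, 0.5, 0.0],
              [0.25, 0.5, 0.25],
              [0.0, 0.5, 0.5]])
pi = np.array([0.25, 0.5, 0.25])
Dh, Dhi = np.diag(np.sqrt(pi)), np.diag(1.0 / np.sqrt(pi))
lam, Phi = np.linalg.eigh(Dh @ P @ Dhi)
F = Dhi @ Phi        # columns f_j; eigh orders eigenvalues ascending,
                     # so the eigenvalue-1 eigenfunction is the LAST column

# First lemma: sum_x pi(x) f_j(x) = 0 for every j except the constant one.
for j in range(2):
    assert abs(pi @ F[:, j]) < 1e-10

# Second lemma: sum_j f_j(x) f_j(y) = pi(x)^{-1} delta_x(y),
# i.e. F F^T = diag(1/pi).
assert np.allclose(F @ F.T, np.diag(1.0 / pi))
```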

  12. Eigenbasis IV
Lemma. Let g ∈ ℓ²(V, π). Then g = Σ_{j=1}^n ⟨g, f_j⟩_π f_j.
Proof: By the previous lemma, for all x,
Σ_{j=1}^n ⟨g, f_j⟩_π f_j(x) = Σ_{j=1}^n Σ_y π(y) g(y) f_j(y) f_j(x) = Σ_y π(y) g(y) [π(x)^{−1} δ_x(y)] = g(x).
Lemma. Let g ∈ ℓ²(V, π). Then ‖g‖²_π = Σ_{j=1}^n ⟨g, f_j⟩²_π.
Proof: By the previous lemma,
‖g‖²_π = ⟨Σ_{i=1}^n ⟨g, f_i⟩_π f_i, Σ_{j=1}^n ⟨g, f_j⟩_π f_j⟩_π = Σ_{i,j=1}^n ⟨g, f_i⟩_π ⟨g, f_j⟩_π ⟨f_i, f_j⟩_π = Σ_{j=1}^n ⟨g, f_j⟩²_π,
where the last equality uses orthonormality.
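The expansion and Parseval identities can be checked on an arbitrary function. A sketch, assuming numpy and the same made-up lazy path walk:

```python
import numpy as np

# Lazy walk on the path 0-1-2, reversible w.r.t. pi.
P = np.array([[0.5, 0.5, 0.0],
              [0.25, 0.5, 0.25],
              [0.0, 0.5, 0.5]])
pi = np.array([0.25, 0.5, 0.25])
Dh, Dhi = np.diag(np.sqrt(pi)), np.diag(1.0 / np.sqrt(pi))
lam, Phi = np.linalg.eigh(Dh @ P @ Dhi)
F = Dhi @ Phi                        # columns f_j

g = np.array([1.0, -2.0, 3.0])       # arbitrary test function
coef = F.T @ (pi * g)                # coef[j] = <g, f_j>_pi

# Expansion lemma: g = sum_j <g, f_j>_pi f_j.
assert np.allclose(F @ coef, g)
# Parseval: ||g||_pi^2 = sum_j <g, f_j>_pi^2.
assert np.isclose((coef ** 2).sum(), pi @ g ** 2)
```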

  13. Eigenvalues I
Let P be finite, irreducible and reversible.
Lemma. Any eigenvalue λ of P satisfies |λ| ≤ 1.
Proof: Pf = λf implies |λ| ‖f‖_∞ = ‖Pf‖_∞ = max_x |Σ_y P(x, y) f(y)| ≤ ‖f‖_∞.
We order the eigenvalues 1 ≥ λ_1 ≥ · · · ≥ λ_n ≥ −1. In fact:
Lemma. We have λ_1 = 1 and λ_2 < 1. Also we can take f_1 ≡ 1.
Proof: Because P is stochastic, the all-one vector is a right eigenvector with eigenvalue 1. Any eigenfunction with eigenvalue 1 is P-harmonic. By Corollary 3.22, for a finite, irreducible chain the only harmonic functions are the constant functions. So the eigenspace corresponding to 1 is one-dimensional. Since all eigenvalues are real, we must have λ_2 < 1.
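These spectral facts are easy to observe numerically. A sketch, assuming numpy and the same made-up lazy path walk, checking |λ_j| ≤ 1, λ_1 = 1 with constant eigenfunction, and λ_2 < 1:

```python
import numpy as np

# Lazy walk on the path 0-1-2, reversible w.r.t. pi.
P = np.array([[0.5, 0.5, 0.0],
              [0.25, 0.5, 0.25],
              [0.0, 0.5, 0.5]])
pi = np.array([0.25, 0.5, 0.25])
Dh, Dhi = np.diag(np.sqrt(pi)), np.diag(1.0 / np.sqrt(pi))
lam, Phi = np.linalg.eigh(Dh @ P @ Dhi)   # ascending order
lam = lam[::-1]                           # reorder: lam[0] = lambda_1 >= ...

assert np.all(np.abs(lam) <= 1 + 1e-12)   # all eigenvalues in [-1, 1]
assert np.isclose(lam[0], 1.0)            # lambda_1 = 1
assert lam[1] < 1.0                       # lambda_2 < 1 (spectral gap)

# Eigenfunction for lambda_1 (last column of Phi before reordering):
f1 = (Dhi @ Phi)[:, -1]
assert np.allclose(f1, f1[0])             # constant, i.e. f_1 = 1 up to scaling
```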
