 
Markov Chains on Finite Groups

Maarit Hietalahti
Postgraduate Seminar in Theoretical Computer Science, 17.11.2003

Based on Sections 15 and 16 of
E. Behrends. Introduction to Markov Chains, with Special Emphasis on Rapid Mixing. Vieweg & Sohn, Braunschweig/Wiesbaden, 2000.
and
P. Diaconis. Group Representations in Probability and Statistics. Institute of Mathematical Statistics, Hayward CA, 1988.

Contents

1. Preliminaries: Algebraic terms
2. Markov chains on groups: definition
3. Goal and the path
4. k-step transitions
5. Convolutions
6. Characters
7. Lemma 15.3
8. Fourier transforms
9. Variation distance
10. Conclusion: Rapid mixing in Markov chains on finite commutative groups
11. Remark on the non-commutative case

Refresher on Algebra

group $(G, \circ)$: $G$ is a set and $\circ$ an associative multiplication between elements of $G$: if $g, h \in G$ then $g \circ h \in G$. Identity: $g \circ id = id \circ g = g$. Inverse: $g^{-1} \circ g = g \circ g^{-1} = id$.
subgroup $H \subseteq G$: $id \in H$ and $h_1 \circ h_2 \in H$ when $h_1, h_2 \in H$ ($H$ is closed with respect to $\circ$; in a finite group this already implies that $H$ contains the inverses of its elements).
group generator: $g \in G$ is said to generate the group $G$ if for every element $h \in G$ there is a $k$ s.t. $h = g^k$.
coset: for a subgroup $H$, the sets of the form $H \circ g$ ($g \circ H$) with $g \in G$. These slides call them left (right) conjugacy classes; the standard term is coset.
group homomorphism: a map $f : G \to H$ between two groups $G, H$ such that 1) $f(g_1 g_2) = f(g_1) f(g_2)$ and 2) $f(id_G) = id_H$.

Markov chains on finite commutative groups

$(G, \circ)$ is a finite group; its elements $g, h, \ldots \in G$ are the states of a Markov chain. $P_0$ is a probability measure on $G$.
Transition probabilities: $p_{g, h \circ g} := P_0(\{h\})$

Lemma 15.1
• The $p_{g, h \circ g}$ are the entries of a (doubly) stochastic matrix. Thus the uniform distribution is an equilibrium distribution of the chain.
• Let $H$ be the subgroup generated by $\mathrm{supp}\, P_0 := \{h \mid P_0(\{h\}) > 0\}$. The irreducible subsets of the chain are precisely the sets of the form $H \circ g$ with $g \in G$, that is, the cosets of $H$ (the "left conjugacy classes" in the terminology of these slides). In particular, the chain is irreducible iff $\mathrm{supp}\, P_0$ generates $G$.
• The chain is aperiodic and irreducible iff there is a $k$ s.t. every element of $G$ can be written as the product of $k$ elements, each lying in $\mathrm{supp}\, P_0$.
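
The first claim of Lemma 15.1 can be checked on a small example. The following is a minimal sketch, assuming the cyclic group Z_5 written additively (so $h \circ g$ becomes `(g + h) % n`) and a hypothetical lazy step distribution; the function name `transition_matrix` is an illustration and not taken from the source.

```python
from fractions import Fraction

def transition_matrix(n, p0):
    """Transition matrix of the walk on Z_n with step distribution p0.

    p0 maps a step h (an element of Z_n) to its probability P_0({h}).
    The entry in row g, column (g + h) mod n is P_0({h}).
    """
    matrix = [[Fraction(0)] * n for _ in range(n)]
    for g in range(n):
        for h, prob in p0.items():
            matrix[g][(g + h) % n] += prob
    return matrix

# Hypothetical example: lazy walk on Z_5 (stay put, or step +1 or -1).
n = 5
p0 = {0: Fraction(1, 2), 1: Fraction(1, 4), n - 1: Fraction(1, 4)}
P = transition_matrix(n, p0)

# Doubly stochastic: every row and every column sums to one.
row_sums = [sum(row) for row in P]
col_sums = [sum(P[g][h] for g in range(n)) for h in range(n)]

# The uniform distribution is left unchanged by one step of the chain.
uniform = [Fraction(1, n)] * n
after_one_step = [sum(uniform[g] * P[g][h] for g in range(n)) for h in range(n)]
```

Exact fractions are used so that the row sums, column sums and the invariance of the uniform distribution can be compared without rounding error.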

Outline: the train of thought

Problem: How fast does the chain converge to its equilibrium?
→ What is the distribution after $k$ steps of a walk which starts at 0? Answer: $P_0^{(k*)}$.
Notion: The matrix is doubly stochastic, so the uniform distribution is the equilibrium distribution!
→ How fast does $P_0^{(k*)}$ tend to the uniform distribution?
→ How close is a distribution $P_0$ to the uniform distribution?
Notion: The variation distance can be estimated with the help of the Fourier transformation.
→ How small are the $\hat{P}_0(\chi)$ for the nontrivial characters $\chi$?

k-step transitions

Probability $P_0$ on $G$ for the one-step transitions.
• Start: $g_0$ arbitrary.
• Move to $g_0 + h_0$, where $h_0$ is chosen with probability $P_0(\{h_0\})$.
• Move to $(g_0 + h_0) + h_1$, where $h_1$ is chosen with probability $P_0(\{h_1\})$.
• And so on.
Note that $h_0$ and $h_1$ are independent.

2-step transitions: $g_0 \to (g_0 + h_0) + h_1 = g_0 + h$, for which the probability is
$$\sum_{h_0 + h_1 = h} P_0(\{h_0\})\, P_0(\{h_1\}) = \sum_{h_0} P_0(\{h_0\})\, P_0(\{h - h_0\}).$$

Convolutions of probability measures

Let $P_1, P_2$ be probability measures on $G$.

Definition 15.9
(i) We define the convolution $P_1 * P_2$ of $P_1, P_2$ by
$$(P_1 * P_2)(\{h\}) := \sum_{h_0} P_1(\{h_0\})\, P_2(\{h - h_0\}).$$
(ii) In the special case $P_1 = P_2 = P_0$ we put $P_0^{(2*)} := P_0 * P_0$. This is extended to a definition for arbitrary positive integer exponents by $P_0^{((k+1)*)} := P_0^{(k*)} * P_0$.
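
Definition 15.9 translates almost literally into code. The sketch below assumes the group Z_n with measures stored as lists indexed by group element; the names `convolve` and `convolution_power` and the example distribution are hypothetical.

```python
from fractions import Fraction

def convolve(p1, p2, n):
    """Convolution on Z_n: (p1 * p2)(h) = sum over h0 of p1(h0) * p2(h - h0)."""
    return [sum(p1[h0] * p2[(h - h0) % n] for h0 in range(n)) for h in range(n)]

def convolution_power(p0, k, n):
    """k-fold convolution of p0 with itself, for k >= 1."""
    result = p0
    for _ in range(k - 1):
        result = convolve(result, p0, n)
    return result

# Hypothetical example on Z_4: each step is 0 or +1 with probability 1/2.
n = 4
p0 = [Fraction(1, 2), Fraction(1, 2), Fraction(0), Fraction(0)]
two_step = convolution_power(p0, 2, n)
```

Here `two_step[h]` is the probability that two independent steps add up to `h`, which is the 2-step transition probability from `g` to `g + h` for any starting point `g`.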

Characters

Relating abstract groups to complex numbers:

Definition 15.2
Denote by $(\Gamma, \cdot)$ the multiplicative group of all complex numbers of modulus one. Then a character on $G$ is a group homomorphism $\chi$ from $G$ to $\Gamma$: $\chi(g + h) = \chi(g)\chi(h)$ for all $g, h \in G$.

Properties of characters:
• $\bar{\chi}$, defined by $\bar{\chi}(g) := \overline{\chi(g)}$, is a character. (Also, $\bar{\chi}$ is the inverse $1/\chi$ of $\chi$.)
• $\chi_1 \chi_2$ is a character when $\chi_1, \chi_2$ are.
• The trivial character: $\chi_{triv} : g \mapsto 1$.
• $\hat{G}$, the collection of all characters, forms a commutative group with respect to pointwise multiplication.
• If $G$ has $N$ elements, the range of any character on $G$ is contained in the set of the $N$'th roots of unity ($\exp(2\pi i j/N)$, $j = 0, \ldots, N-1$, $i = \sqrt{-1}$).
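
For a concrete picture, the maps $g \mapsto \exp(2\pi i j g / n)$ are characters of the cyclic group Z_n. The sketch below, with the hypothetical helper `character`, checks the homomorphism property and the roots-of-unity property numerically for one such map; it is an illustration for Z_6 only, not a general construction.

```python
import cmath

def character(j, n):
    """The map chi_j on Z_n sending g to exp(2*pi*i*j*g/n)."""
    return lambda g: cmath.exp(2j * cmath.pi * j * g / n)

n = 6
chi = character(2, n)

# Homomorphism property chi(g + h) = chi(g) * chi(h), up to rounding error.
max_error = max(
    abs(chi((g + h) % n) - chi(g) * chi(h))
    for g in range(n)
    for h in range(n)
)

# Every value has modulus one and is an n-th root of unity: chi(g)**n = 1.
max_modulus_error = max(abs(abs(chi(g)) - 1) for g in range(n))
max_root_error = max(abs(chi(g) ** n - 1) for g in range(n))
```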

Lemma 15.3 and corollary

Let $(G, +)$ be a commutative group with $N$ elements. The $N$-dimensional vector space of all mappings from $G$ to $\mathbb{C}$ will be denoted by $X_G$, and this space will be provided with the scalar product
$$\langle f_1, f_2 \rangle_G := \frac{1}{N} \sum_g f_1(g)\, \overline{f_2(g)}.$$
(i) Let $\chi$ be a character which is not the trivial character $\chi_{triv}$. Then $\sum_g \chi(g) = 0$.
(ii) In the Hilbert space $(X_G, \langle \cdot, \cdot \rangle_G)$ the family of characters forms an orthonormal system.
(iii) Any collection of characters is linearly independent.
(iv) $\hat{G}$ has at most $N$ elements.
(v) In fact there exist $N$ different characters, so that $\hat{G}$ is an orthonormal basis of $X_G$. Also $(G, +)$ is isomorphic with $(\hat{G}, \cdot)$.

Corollary 15.4
(i) Let $f$ be any element of $X_G$. Then $f$ can be written as a linear combination of the $\chi \in \hat{G}$ as follows: $f = \sum_\chi \langle f, \chi \rangle_G\, \chi$.
(ii) For different $g, h \in G$ there is a character $\chi$ s.t. $\chi(g) \neq \chi(h)$.
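
Lemma 15.3 (i), (ii) and Corollary 15.4 (i) can be verified numerically on a small group. The sketch below assumes G = Z_5 with the characters $g \mapsto \exp(2\pi i j g / 5)$; the helper names and the test function `values` are hypothetical.

```python
import cmath

def character(j, n):
    """Character chi_j of Z_n, sending g to exp(2*pi*i*j*g/n)."""
    return lambda g: cmath.exp(2j * cmath.pi * j * g / n)

def inner(f1, f2, n):
    """Scalar product <f1, f2>_G = (1/N) * sum over g of f1(g) * conj(f2(g))."""
    return sum(f1(g) * f2(g).conjugate() for g in range(n)) / n

n = 5
chars = [character(j, n) for j in range(n)]

# Lemma 15.3 (i): a nontrivial character sums to zero over the group.
nontrivial_sums = [sum(chi(g) for g in range(n)) for chi in chars[1:]]

# Lemma 15.3 (ii): the Gram matrix of the characters is the identity.
gram = [[inner(a, b, n) for b in chars] for a in chars]

# Corollary 15.4 (i): rebuild an arbitrary f from its coefficients <f, chi>.
values = [3.0, -1.0, 0.5, 2.0, 0.0]  # hypothetical test function on Z_5
f = lambda g: complex(values[g])
coeffs = [inner(f, chi, n) for chi in chars]
rebuilt = [sum(c * chi(g) for c, chi in zip(coeffs, chars)) for g in range(n)]
```

Up to floating-point rounding, `gram` is the 5×5 identity matrix and `rebuilt` agrees with `values`.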

Fourier transform

Fourier transform of the measure $P_0$:
$$\hat{P}_0 : \hat{G} \to \mathbb{C}, \quad \chi \mapsto \sum_g \chi(g)\, P_0(\{g\}).$$

Fourier transform of convolutions: For probability measures $P_1, P_2$ on $(G, +)$ the Fourier transform of $P_2 * P_1$ is just the (pointwise) product of the functions $\hat{P}_1$ and $\hat{P}_2$. In particular it follows that, for any probability $P_0$, the Fourier transform of $P_0^{(k*)}$ is the $k$'th power of the Fourier transform of $P_0$.
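
The product rule for convolutions is easy to test numerically. The following is a minimal sketch on Z_5, indexing the characters by $j = 0, \ldots, 4$ via $\chi_j(g) = \exp(2\pi i j g / 5)$; the function names and the two example distributions are assumptions made for illustration.

```python
import cmath

def fourier(p, n):
    """Fourier transform on Z_n: entry j is sum over g of chi_j(g) * p[g]."""
    return [
        sum(cmath.exp(2j * cmath.pi * j * g / n) * p[g] for g in range(n))
        for j in range(n)
    ]

def convolve(p1, p2, n):
    """Convolution on Z_n: (p1 * p2)(h) = sum over h0 of p1(h0) * p2(h - h0)."""
    return [sum(p1[h0] * p2[(h - h0) % n] for h0 in range(n)) for h in range(n)]

n = 5
p1 = [0.5, 0.25, 0.0, 0.0, 0.25]  # hypothetical measures on Z_5
p2 = [0.2, 0.3, 0.1, 0.4, 0.0]

# Transform of the convolution versus pointwise product of the transforms.
lhs = fourier(convolve(p1, p2, n), n)
rhs = [a * b for a, b in zip(fourier(p1, n), fourier(p2, n))]
max_gap = max(abs(a - b) for a, b in zip(lhs, rhs))

# Transform of the 3-fold convolution versus third power of the transform.
p1_cubed = convolve(convolve(p1, p1, n), p1, n)
power_gap = max(
    abs(a - b ** 3) for a, b in zip(fourier(p1_cubed, n), fourier(p1, n))
)
```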

Calculating the variation distance

Lemma 15.8
Let $P_0, P_1, P_2$ be probability measures on the finite commutative group $G$. By $U$ we denote the uniform distribution.
(i) $P_0 = U$ iff $\hat{P}_0(\chi)$ is one for the trivial character and zero for the other $\chi$.
(ii) The variation distance $\|P_1 - P_2\|$ can be estimated by
$$\|P_1 - P_2\| \le \frac{1}{2} \Big( \sum_\chi |\hat{P}_1(\chi) - \hat{P}_2(\chi)|^2 \Big)^{1/2};$$
in particular
$$\|P_1 - U\| \le \frac{1}{2} \Big( \sum_{\chi \neq \chi_{triv}} |\hat{P}_1(\chi)|^2 \Big)^{1/2},$$
where the summation runs over all nontrivial characters $\chi$.
(iii) Conversely, the distance of $\hat{P}_1$ and $\hat{P}_2$ with respect to the maximum norm is bounded by $2\|P_1 - P_2\|$.
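
The three parts of Lemma 15.8 can be illustrated on Z_6. The slides do not spell out the definition of the variation distance, so this sketch assumes the common convention that it is half the sum of $|P_1(\{g\}) - P_2(\{g\})|$ over all $g$; the function names and the example measures are hypothetical as well.

```python
import cmath
import math

def fourier(p, n):
    """Fourier transform on Z_n: entry j is sum over g of chi_j(g) * p[g]."""
    return [
        sum(cmath.exp(2j * cmath.pi * j * g / n) * p[g] for g in range(n))
        for j in range(n)
    ]

def variation_distance(p1, p2):
    """Half the l1 distance between two measures (assumed convention)."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p1, p2))

n = 6
p1 = [0.4, 0.1, 0.1, 0.2, 0.1, 0.1]  # hypothetical measures on Z_6
p2 = [0.1, 0.3, 0.2, 0.1, 0.2, 0.1]

# (i): the uniform distribution has transform 1 at chi_triv and 0 elsewhere.
uniform_hat = fourier([1 / n] * n, n)

# (ii) and (iii): compare the distance with both Fourier-side quantities.
diff_hat = [a - b for a, b in zip(fourier(p1, n), fourier(p2, n))]
distance = variation_distance(p1, p2)
upper_bound = math.sqrt(sum(abs(d) ** 2 for d in diff_hat)) / 2
max_norm = max(abs(d) for d in diff_hat)
```

With these definitions one expects `distance <= upper_bound` from part (ii) and `max_norm <= 2 * distance` from part (iii).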
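
Rapid mixing: Conclusion

Combining Lemma 15.8 (ii) with the fact that the Fourier transform of $P_0^{(k*)}$ is the $k$'th power of $\hat{P}_0$ gives us
$$\|P_0^{(k*)} - U\|^2 \le \frac{1}{4} \sum_{\chi \neq \chi_{triv}} |\hat{P}_0(\chi)|^{2k}.$$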
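
To see the bound at work, the sketch below tracks a hypothetical lazy walk on Z_7 (stay with probability 1/2, step +1 or -1 with probability 1/4 each) and records, for each $k$, the squared variation distance to uniform next to the right-hand side of the inequality. As an assumption not stated on the slides, the variation distance is taken to be half the sum of the absolute differences.

```python
import cmath

def fourier(p, n):
    """Fourier transform on Z_n: entry j is sum over g of chi_j(g) * p[g]."""
    return [
        sum(cmath.exp(2j * cmath.pi * j * g / n) * p[g] for g in range(n))
        for j in range(n)
    ]

def convolve(p1, p2, n):
    """Convolution on Z_n: (p1 * p2)(h) = sum over h0 of p1(h0) * p2(h - h0)."""
    return [sum(p1[h0] * p2[(h - h0) % n] for h0 in range(n)) for h in range(n)]

n = 7
p0 = [0.5, 0.25, 0.0, 0.0, 0.0, 0.0, 0.25]  # hypothetical lazy walk on Z_7
uniform = [1 / n] * n
p0_hat = fourier(p0, n)

rows = []  # entries (k, squared distance to uniform, Fourier bound)
pk = p0    # pk holds the k-fold convolution power of p0
for k in range(1, 21):
    squared_distance = (0.5 * sum(abs(a - b) for a, b in zip(pk, uniform))) ** 2
    bound = 0.25 * sum(abs(c) ** (2 * k) for c in p0_hat[1:])
    rows.append((k, squared_distance, bound))
    pk = convolve(pk, p0, n)

for k, squared_distance, bound in rows:
    print(f"k={k:2d}  distance^2={squared_distance:.3e}  bound={bound:.3e}")
```

Since every nontrivial $|\hat{P}_0(\chi)|$ is strictly below one for this walk, the bound decays geometrically in $k$.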

Remark: Generalization to arbitrary finite groups

Relating the abstract group to something more concrete is done by using representations. Characters will no longer do: they are homomorphisms with commutative ranges, so they cannot always distinguish between different elements of a non-commutative group (for instance, $\chi(g \circ h) = \chi(h \circ g)$ always holds).
The use of representations leads to more demanding technicalities. In other respects, the construction follows the same principles.