 
Limiting Spectral Distribution of Stochastic Block Model
Yizhe Zhu, University of Washington
December 30, 2016, SJTU
Joint Work with Ioana Dumitriu

Overview
1. Semicircle Law
2. Erdős-Rényi random graph
3. Stochastic Block Model
4. Proof of Semicircle Law for Erdős-Rényi
5. Spectral Distribution for SBM
Wigner Semicircle Law
Figure: Eugene Wigner, Nobel Prize in Physics (1963)
This law was first observed by Wigner (1955) for certain special classes of random matrices arising in quantum mechanical investigations.
Wigner Semicircle Law
Take two independent families of i.i.d., zero-mean, real-valued random variables $\{Z_{ij}\}_{1 \le i < j}$ and $\{Y_i\}_{1 \le i}$ with $E[Z_{1,2}^2] = 1$ and $\max\{E|Z_{1,2}|^k, E|Y_1|^k\} < \infty$ for all $k \ge 1$. Define the symmetric matrix
\[ X_N(i,j) = X_N(j,i) = \begin{cases} Z_{ij}/\sqrt{N} & \text{if } i < j, \\ Y_i/\sqrt{N} & \text{if } i = j. \end{cases} \]
Let $\lambda_i^N$ denote the eigenvalues of $X_N$, ordered as $\lambda_1^N \le \lambda_2^N \le \dots \le \lambda_N^N$, and define the empirical distribution of the eigenvalues as the probability measure
\[ F_N = \frac{1}{N} \sum_{i=1}^{N} \delta_{\lambda_i^N}. \]
Semicircle distribution density: $\sigma(x) = \frac{1}{2\pi} \sqrt{4 - x^2}\, \mathbf{1}_{|x| \le 2}$.
Theorem (Wigner) For a Wigner matrix, the empirical measure $F_N$ converges weakly in probability to the standard semicircle distribution:
\[ \lim_{N \to \infty} P\big( |\langle F_N, f \rangle - \langle \sigma, f \rangle| > \epsilon \big) = 0, \quad \forall f \in C_b(\mathbb{R}),\ \epsilon > 0. \]
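The theorem can be checked numerically. The sketch below is only an illustration: it assumes Gaussian entries (one admissible choice of $Z_{ij}$ and $Y_i$), and the names `wigner_matrix` and `semicircle_density` are hypothetical helpers, not part of the talk. It compares a histogram of the eigenvalues with the density $\sigma(x)$.

```python
import numpy as np

def wigner_matrix(n, rng):
    """Symmetric n x n Wigner matrix; Gaussian entries are an assumption."""
    z = rng.standard_normal((n, n))
    upper = np.triu(z, k=1)                  # Z_ij for i < j
    diag = np.diag(rng.standard_normal(n))   # Y_i on the diagonal
    return (upper + upper.T + diag) / np.sqrt(n)

def semicircle_density(x):
    """sigma(x) = sqrt(4 - x^2) / (2 pi) on [-2, 2], zero elsewhere."""
    x = np.asarray(x, dtype=float)
    return np.sqrt(np.clip(4.0 - x**2, 0.0, None)) / (2.0 * np.pi)

rng = np.random.default_rng(0)
eigs = np.linalg.eigvalsh(wigner_matrix(600, rng))

# Histogram of the eigenvalues against the semicircle density at bin centers.
hist, edges = np.histogram(eigs, bins=20, range=(-2.0, 2.0), density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
max_gap = np.max(np.abs(hist - semicircle_density(centers)))
```

For moderate `n` the value of `max_gap` should already be small, and the second and fourth empirical moments should be close to 1 and 2.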
Erdős-Rényi random graph
$G(n, p)$: $n$ vertices; $i \sim j$ independently with probability $p$.
Theorem (Tran, Vu and Wang, 2011) For $p = \omega(\frac{1}{n})$, the empirical spectral distribution of the matrix $\frac{1}{\sqrt{n}\,\sigma} A_n$, where $\sigma = \sqrt{p(1-p)}$ is the standard deviation of an entry of the adjacency matrix $A_n$, converges in distribution to the semicircle distribution, which has density $\rho_{sc}(x)$ supported on $[-2, 2]$:
\[ \rho_{sc}(x) := \frac{1}{2\pi} \sqrt{4 - x^2}. \]
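A small simulation makes the normalization concrete. This is a sketch under stated assumptions: $\sigma$ is taken to be $\sqrt{p(1-p)}$, the standard deviation of a Bernoulli($p$) entry, and `er_adjacency` and `scaled_spectrum` are hypothetical helper names. Since $A_n$ is not centered, the largest eigenvalue is an outlier and is set aside before looking at the bulk.

```python
import numpy as np

def er_adjacency(n, p, rng):
    """Adjacency matrix of G(n, p): symmetric, zero diagonal."""
    upper = np.triu(rng.random((n, n)) < p, k=1).astype(float)
    return upper + upper.T

def scaled_spectrum(a, p):
    """Sorted eigenvalues of A_n / (sqrt(n) * sigma), sigma = sqrt(p (1 - p))."""
    n = a.shape[0]
    sigma = np.sqrt(p * (1.0 - p))
    return np.linalg.eigvalsh(a / (np.sqrt(n) * sigma))

n, p = 600, 0.1
rng = np.random.default_rng(1)
eigs = scaled_spectrum(er_adjacency(n, p, rng), p)

outlier = eigs[-1]   # comes from the rank-one mean, roughly n p / (sqrt(n) sigma)
bulk = eigs[:-1]     # remaining eigenvalues, expected to fill out [-2, 2]
bulk_second_moment = np.mean(bulk**2)
```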
How to generalize the Erdős-Rényi model?
Definition (random graph with given expected degree, Chung-Lu-Vu model) $G(w)$: for a sequence $w = (w_1, w_2, \dots, w_n)$, edges are independently assigned to each pair of vertices $(i, j)$ with probability $w_i w_j \rho$, where $\rho = \frac{1}{\sum_{i=1}^{n} w_i}$.
$G(n, p)$ can be viewed as $G(w)$ with $w = (pn, pn, \dots, pn)$.
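The definition translates directly into a sampler. The sketch below is illustrative only: `chung_lu_graph` is a hypothetical name, only pairs $i < j$ are sampled (no self-loops, which is an assumption), and the probability is clipped at 1 for safety. With $w = (pn, \dots, pn)$ every pair gets probability $(pn)^2 / (n \cdot pn) = p$, recovering $G(n, p)$.

```python
import random

def chung_lu_graph(w, rng):
    """Sample edges (i, j), i < j, independently with probability w_i * w_j * rho."""
    rho = 1.0 / sum(w)
    n = len(w)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            prob = min(1.0, w[i] * w[j] * rho)  # clipping is an added safeguard
            if rng.random() < prob:
                edges.append((i, j))
    return edges

# G(n, p) as the special case w = (pn, ..., pn).
n, p = 200, 0.1
rng = random.Random(0)
edges = chung_lu_graph([p * n] * n, rng)
edge_density = len(edges) / (n * (n - 1) / 2)   # should be close to p
```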
Stochastic Block Model
Consider a network with $n$ nodes and $d$ communities $\Omega_1, \dots, \Omega_d$ of sizes $n_1, \dots, n_d$, with $\sum_{i=1}^{d} n_i = n$.
If two nodes belong to different communities, connect them independently with probability $p_0$.
If two nodes are in the same community $\Omega_m$, connect them independently with probability $p_m$, $1 \le m \le d$.
Statistical task: Community Detection / Recovery
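As a minimal sketch of this model, the sampler below draws one graph. The function name `sample_sbm` and its argument names are hypothetical; `p_within[m]` plays the role of $p_m$ and `p_between` the role of $p_0$.

```python
import random

def sample_sbm(sizes, p_within, p_between, rng):
    """Return (labels, edges) for one draw of the stochastic block model."""
    # labels[v] is the community index of vertex v.
    labels = [m for m, size in enumerate(sizes) for _ in range(size)]
    n = len(labels)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            same = labels[i] == labels[j]
            prob = p_within[labels[i]] if same else p_between
            if rng.random() < prob:
                edges.append((i, j))
    return labels, edges

labels, edges = sample_sbm([60, 40], [0.5, 0.3], 0.05, random.Random(0))
```

Community detection is the inverse problem: recover `labels` from `edges` alone.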
Graphon
A graphon is a symmetric measurable function $W : [0, 1]^2 \to [0, 1]$. Graphons are limit objects for graph sequences in the dense case.
Generate a graph in the following way:
1. Each vertex $j$ of the graph is assigned an independent random value $u_j \sim U[0, 1]$.
2. Edge $(i, j)$ is independently included in the graph with probability $W(u_i, u_j)$.
Erdős-Rényi: $W = p$ for some constant $p \in [0, 1]$.
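The two-step recipe can be sketched as follows. All names here (`sample_graphon_graph`, `constant_graphon`, `block_graphon`) are hypothetical, and the two-block step function is just one example of a graphon that yields a stochastic block model with two equal-weight communities.

```python
import random

def sample_graphon_graph(n, w, rng):
    """Step 1: draw u_j ~ U[0, 1]. Step 2: include (i, j) with probability W(u_i, u_j)."""
    u = [rng.random() for _ in range(n)]
    edges = [
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if rng.random() < w(u[i], u[j])
    ]
    return u, edges

def constant_graphon(p):
    """W = p everywhere: the Erdos-Renyi case."""
    return lambda x, y: p

def block_graphon(p_in, p_out):
    """Step function with two blocks [0, 1/2) and [1/2, 1]: a two-community SBM."""
    return lambda x, y: p_in if (x < 0.5) == (y < 0.5) else p_out

u, edges = sample_graphon_graph(100, block_graphon(0.6, 0.1), random.Random(0))
```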
Measure Theory: constant function → step function → measurable function
Random Graph Theory: Erdős-Rényi model → stochastic block model → graphon
Proof of Semicircle Law for Erdős-Rényi random graph
Note that $E[A_n] = pJ_n$, so $M_n = \sigma^{-1}(A_n - pJ_n)$ is centered.
Lemma (Rank Inequality)
\[ \|F^A - F^B\| \le \frac{\operatorname{rank}(A - B)}{n}. \]
Since $pJ_n$ has rank 1, removing it changes the empirical spectral distribution by at most $1/n$, so it is sufficient to show that the semicircle law holds for $M_n$. A standard way is the moment method.
The $k$-th moment of the empirical spectral distribution of a matrix $W_n$ is
\[ \int x^k \, dF^{W_n}(x) = \frac{1}{n} E[\operatorname{Trace}(W_n^k)]. \]
On a compact set, convergence in distribution is the same as convergence of moments. We need to show
\[ \frac{1}{n} E[\operatorname{Trace}(W_n^k)] \to \int_{-2}^{2} x^k \rho_{sc}(x) \, dx. \]
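The rank inequality can be illustrated numerically. The sketch below reads $\|F^A - F^B\|$ as the sup (Kolmogorov) distance between the two empirical spectral distribution functions, which is an assumption about the norm intended on the slide; `esd_sup_distance` is a hypothetical helper name.

```python
import numpy as np

def esd_sup_distance(a, b):
    """Sup distance between the ESDs of two symmetric matrices of the same size."""
    ea, eb = np.linalg.eigvalsh(a), np.linalg.eigvalsh(b)
    # Both distribution functions are right-continuous step functions,
    # so the sup is attained at one of the jump points.
    grid = np.concatenate([ea, eb])
    fa = np.searchsorted(ea, grid, side="right") / len(ea)
    fb = np.searchsorted(eb, grid, side="right") / len(eb)
    return float(np.max(np.abs(fa - fb)))

n, p = 300, 0.2
rng = np.random.default_rng(2)
upper = np.triu(rng.random((n, n)) < p, k=1).astype(float)
a = upper + upper.T                  # Erdos-Renyi adjacency matrix
b = a - p * np.ones((n, n))          # subtract the rank-one matrix p J_n

rank = np.linalg.matrix_rank(a - b)  # rank of p J_n
distance = esd_sup_distance(a, b)    # should not exceed rank / n
```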
For $k = 2m + 1$, $\int_{-2}^{2} x^k \rho_{sc}(x) \, dx = 0$.
For $k = 2m$, $\int_{-2}^{2} x^k \rho_{sc}(x) \, dx = \frac{1}{m+1} \binom{2m}{m}$ ← Catalan number.
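These moment formulas are easy to sanity-check numerically. The sketch below uses a plain midpoint rule (an arbitrary choice made for illustration) and the hypothetical helper names `catalan` and `semicircle_moment`.

```python
import math

def catalan(m):
    """m-th Catalan number, binom(2m, m) / (m + 1)."""
    return math.comb(2 * m, m) // (m + 1)

def semicircle_moment(k, steps=100_000):
    """Midpoint-rule approximation of the integral of x^k * rho_sc(x) over [-2, 2]."""
    h = 4.0 / steps
    total = 0.0
    for i in range(steps):
        x = -2.0 + (i + 0.5) * h
        total += x**k * math.sqrt(4.0 - x * x)
    return total * h / (2.0 * math.pi)

# Even moments against Catalan numbers; odd moments vanish by symmetry.
even_gaps = [abs(semicircle_moment(2 * m) - catalan(m)) for m in range(4)]
odd_moment = semicircle_moment(3)
```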
Let $W_n = \frac{1}{\sqrt{n}} M_n$ and let $\eta_{ij}$ be the $(i, j)$ entry of $M_n$. We have the following expansion for $W_n^k$:
\[ \frac{1}{n} E[\operatorname{Trace}(W_n^k)] = \frac{1}{n^{1+k/2}} E[\operatorname{Trace}(M_n^k)] = \frac{1}{n^{1+k/2}} \sum_{1 \le i_1, \dots, i_k \le n} E\, \eta_{i_1 i_2} \eta_{i_2 i_3} \cdots \eta_{i_k i_1}. \]
Each term (choice of indices) corresponds to a closed walk of length $k$ on the complete graph $K_n$.
Because the entries are independent and centered, a term can be nonzero only if each edge in the closed walk appears at least twice. We call such a walk a good walk.
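The identity between the trace of a matrix power and the sum over closed walks can be verified by brute force on a tiny matrix. This is a sketch for intuition only; `random_symmetric`, `trace_power`, and `closed_walk_sum` are hypothetical names, and the expectation is dropped since the identity holds for every fixed matrix.

```python
import itertools
import random

def random_symmetric(n, rng):
    """Small symmetric matrix with Gaussian entries, as a list of lists."""
    m = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            m[i][j] = m[j][i] = rng.gauss(0.0, 1.0)
    return m

def trace_power(m, k):
    """Trace of m^k (k >= 1) via repeated matrix multiplication."""
    n = len(m)
    power = [row[:] for row in m]
    for _ in range(k - 1):
        power = [
            [sum(power[i][t] * m[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)
        ]
    return sum(power[i][i] for i in range(n))

def closed_walk_sum(m, k):
    """Sum of m[i1][i2] * m[i2][i3] * ... * m[ik][i1] over all index tuples."""
    n = len(m)
    total = 0.0
    for idx in itertools.product(range(n), repeat=k):
        term = 1.0
        for step in range(k):
            term *= m[idx[step]][idx[(step + 1) % k]]
        total += term
    return total

m = random_symmetric(4, random.Random(0))
gap = abs(closed_walk_sum(m, 4) - trace_power(m, 4))   # zero up to rounding
```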
Consider a good walk that uses $l$ different edges $e_1, \dots, e_l$ with multiplicities $m_1, \dots, m_l$, $l \le m$. A bound for the number of good walks with $l$ different edges is $n^{l+1} \times l^k$.
When $k = 2m + 1$,
\[ \frac{1}{n} E[\operatorname{Trace}(W_n^k)] = \frac{1}{n^{1+k/2}} \sum_{l=1}^{m} \sum_{\text{good walk of } l \text{ edges}} E\, \eta_{e_1}^{m_1} \cdots \eta_{e_l}^{m_l} = O\!\left( \frac{1}{\sqrt{np}} \right). \]
When $k = 2m$, classify good walks into two types. The first type uses $l \le m - 1$ different edges; the contribution of these terms is $O(\frac{1}{np})$. The second type uses exactly $l = m$ different edges, and each term has the form $E\, \eta_{e_1}^2 \cdots \eta_{e_l}^2 = 1$. The number of good walks of the second type is
\[ n^{m+1} (1 + O(n^{-1})) \frac{1}{m+1} \binom{2m}{m}. \]
Then the conclusion follows.
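The leading count can be checked by brute force for small $n$ and $m$. The sketch below counts only the tree-like walks (no self-loops, $m$ distinct edges each used exactly twice, $m + 1$ distinct vertices), which is the class that produces the leading term; restricting to this class is a choice made for the illustration, and the helper names are hypothetical. For these walks the exact count is $\frac{1}{m+1}\binom{2m}{m} \cdot n(n-1)\cdots(n-m)$, which equals $n^{m+1}(1 + O(n^{-1}))$ times the Catalan number.

```python
import itertools
import math
from collections import Counter

def count_tree_like_walks(n, m):
    """Brute-force count of closed walks of length 2m on K_n that use m distinct
    edges, each exactly twice, and visit m + 1 distinct vertices."""
    k = 2 * m
    count = 0
    for walk in itertools.product(range(n), repeat=k):
        steps = [(walk[t], walk[(t + 1) % k]) for t in range(k)]
        if any(a == b for a, b in steps):
            continue  # K_n has no self-loops
        multiplicity = Counter(frozenset(step) for step in steps)
        if (
            len(multiplicity) == m
            and all(c == 2 for c in multiplicity.values())
            and len(set(walk)) == m + 1
        ):
            count += 1
    return count

def leading_count(n, m):
    """Catalan number times the number of ways to label m + 1 distinct vertices."""
    catalan = math.comb(2 * m, m) // (m + 1)
    return catalan * math.perm(n, m + 1)

brute = count_tree_like_walks(5, 2)
formula = leading_count(5, 2)
```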
Spectral Distribution for SBM
We consider an $n \times n$ random matrix with rectangular blocks, written as
\[ A_n = \sum_{k,l=1}^{d} E_{kl} \otimes A^{(k,l)}, \]
where $A^{(k,l)}$, $1 \le k \le l \le d$, are independent $n_k \times n_l$ rectangular random matrices. We use $a^{(k,l)}_{rs}$ to denote the entries of the matrix $A^{(k,l)}$ and make the following Assumptions 1:
1. $a^{(k,l)}_{rs} = a^{(l,k)}_{sr}$ for all $r = 1, \dots, n_k$, $s = 1, \dots, n_l$, $1 \le k \le l \le d$, and $n_k / n \to \alpha_k \in [0, \infty)$, $1 \le k \le d$.
2. $\{a^{(k,l)}_{rs},\ 1 \le r \le n_k,\ 1 \le s \le n_l,\ k \le l\}$ are i.i.d. random variables with mean zero and variance $\sigma_{kl}^2$, $1 \le k \le l \le d$.
3. Let $\sigma^2 = \max_{k,l} \sigma_{kl}^2$; we have $\lim_{n \to \infty} \frac{\sigma_{kl}^2}{\sigma^2} = s_{kl}$.
4. Let $M_n = \frac{A_n}{\sigma} = (c^{(k,l)}_{rs})$; then
\[ \lim_{n \to \infty} \frac{1}{n^2} \sum_{k,l} \sum_{r,s} E\Big[ |c^{(k,l)}_{rs}|^2 \, I\big( |c^{(k,l)}_{rs}| \ge \eta \sqrt{n} \big) \Big] = 0. \]
Theorem (Ding, 2015) If $d$ is fixed, then under assumptions (1)-(4), with probability 1, the empirical spectral distribution $F_n$ of the random matrix $\frac{A_n}{\sqrt{n}\,\sigma} = \frac{M_n}{\sqrt{n}}$, where $M_n = (c^{(k,l)}_{rs})$, converges to a probability distribution $F$.
Lemma (Ding, 2015) In order to prove the theorem above, we only need to verify that it holds under the following Assumptions 2:
1. $c^{(k,k)}_{rr} = 0$; $\{a^{(k,l)}_{rs},\ 1 \le r \le n_k,\ 1 \le s \le n_l,\ k \le l\}$ are i.i.d. random variables with mean zero and variance $\sigma_{kl}^2$, $1 \le k \le l \le d$; and $\lim_{n \to \infty} \sigma_{kl}^2 = s_{kl} \le 1$.
2. $|c^{(k,l)}_{rs}| \le \eta_n \sqrt{n}$ for some positive sequence $\eta_n$ such that $\eta_n \to 0$.
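A simulation gives a feel for the limiting distribution $F$. The sketch below is not taken from the talk: it builds a centered stochastic block model adjacency matrix, divides by $\sigma = \max_{k,l} \sigma_{kl}$, and computes the spectrum of $M_n / \sqrt{n}$. The name `centered_sbm_matrix` and the parameter values are hypothetical. As a sanity check that follows from $\frac{1}{n} \operatorname{Trace}(W_n^2) = \frac{1}{n^2} \sum c_{rs}^2$, the empirical second moment should be close to $\sum_{k,l} \alpha_k \alpha_l s_{kl}$, which is the mean of the variance profile.

```python
import numpy as np

def centered_sbm_matrix(sizes, p_in, p_out, rng):
    """Return (M_n, profile): the centered SBM adjacency matrix divided by
    sigma = max std of an entry, and the entrywise variance profile s_kl."""
    labels = np.repeat(np.arange(len(sizes)), sizes)
    same = labels[:, None] == labels[None, :]
    probs = np.where(same, np.asarray(p_in, dtype=float)[labels][:, None], p_out)
    n = len(labels)
    upper = np.triu(rng.random((n, n)) < probs, k=1).astype(float)
    adjacency = upper + upper.T
    variances = probs * (1.0 - probs)
    sigma2 = variances.max()
    centered = adjacency - probs
    np.fill_diagonal(centered, 0.0)        # matches c_rr = 0 in Assumptions 2
    return centered / np.sqrt(sigma2), variances / sigma2

sizes = [200, 200]
rng = np.random.default_rng(3)
m_n, profile = centered_sbm_matrix(sizes, [0.5, 0.3], 0.1, rng)
eigs = np.linalg.eigvalsh(m_n / np.sqrt(sum(sizes)))

empirical_second_moment = np.mean(eigs**2)
predicted_second_moment = profile.mean()   # sum over k, l of alpha_k alpha_l s_kl
```

Unlike the Erdős-Rényi case, a histogram of `eigs` is in general not a semicircle when the block variances differ.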