SLIDE 1 A Nearly-Linear Time Algorithm for Exact Community Recovery in Stochastic Block Model
Peng Wang¹, Zirui Zhou², Anthony Man-Cho So¹
¹Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong
²Department of Mathematics, Hong Kong Baptist University
June 14, 2020
SLIDE 2 Table of Contents
1. Overview
2. Introduction
3. Main Results
4. Experimental Results
5. Conclusions
SLIDE 4 Community Detection
- Community detection refers to the problem of inferring similarity classes of vertices (i.e., communities) in a network by observing their local interactions (Abbe 2017); see the graphs below.
- Broad applications in machine learning, biology, social science, and many other areas.
- Exact recovery requires identifying the entire partition correctly.
SLIDE 5 Overview
- Problem: exactly recover the communities in the binary symmetric stochastic block model (SBM), where n vertices are partitioned into two equal-sized communities and vertices are connected with probability p = α log(n)/n within communities and q = β log(n)/n across communities.
- Goal: propose an efficient algorithm that achieves exact recovery at the information-theoretic limit, i.e., whenever √α − √β > √2.
- Proposed method: a two-stage iterative algorithm:
  (i) 1st stage: power method, coarse estimate; (ii) 2nd stage: generalized power method, refinement.
- Theoretical results: the proposed method achieves exact recovery at the information-theoretic limit in Õ(n) time.
SLIDE 7 Stochastic Block Model
Given n nodes in two equal-sized clusters, we denote by x∗ the true community structure: for every i ∈ [n], x∗_i = 1 if node i belongs to the first cluster and x∗_i = −1 if it belongs to the second one.
Model 1 (Binary symmetric SBM) The elements {a_ij : 1 ≤ i ≤ j ≤ n} of A are generated independently by
\[
a_{ij} \sim
\begin{cases}
\mathrm{Bern}(p), & \text{if } x^*_i x^*_j = 1, \\
\mathrm{Bern}(q), & \text{if } x^*_i x^*_j = -1,
\end{cases}
\]
where p = α log n / n and q = β log n / n for some constants α > β > 0. Besides, we have a_ij = a_ji for all 1 ≤ j < i ≤ n.
The problem of achieving exact recovery is to develop efficient methods that can find x∗ or −x∗ with high probability given the adjacency matrix A.
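As an illustration of Model 1, the following minimal sketch samples an adjacency matrix and a ground-truth label vector with NumPy. The function name `sample_binary_symmetric_sbm`, the choice to put the first n/2 nodes in the first cluster, and the dense n × n representation are assumptions made for readability; they are not taken from the slides.

```python
import numpy as np


def sample_binary_symmetric_sbm(n, alpha, beta, rng=None):
    """Draw (A, x_star) from the binary symmetric SBM of Model 1.

    Hypothetical helper: the name, the label layout, and the dense
    representation are illustrative assumptions.
    """
    if n % 2 != 0:
        raise ValueError("n must be even so that the two clusters have equal size")
    rng = np.random.default_rng(rng)
    p = alpha * np.log(n) / n  # within-community edge probability
    q = beta * np.log(n) / n   # across-community edge probability
    if not (0.0 <= q <= p <= 1.0):
        raise ValueError("need 0 <= q <= p <= 1 for the chosen n, alpha, beta")

    # Ground truth: +1 for the first cluster, -1 for the second.
    x_star = np.concatenate([np.ones(n // 2), -np.ones(n // 2)])

    # Entry (i, j) has probability p if the labels agree and q otherwise.
    prob = np.where(np.equal.outer(x_star, x_star), p, q)

    # Sample the entries with i <= j independently, then mirror them
    # so that a_ij = a_ji.
    upper = np.triu(rng.random((n, n)) < prob)
    A = (upper | upper.T).astype(float)
    return A, x_star
```

For large n a sparse representation would be preferable, since A has on the order of n log n non-zero entries; the dense version is kept here only because it mirrors the definition directly.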
SLIDE 8 Phase Transition
The maximum likelihood (ML) estimator of x∗ in the binary symmetric SBM is the solution of the following problem:
\[
\max \left\{ x^T A x \;:\; \mathbf{1}_n^T x = 0,\ x_i = \pm 1,\ i = 1, \dots, n \right\}. \tag{1}
\]
Theorem 1 (Abbe et al. (2016), Mossel et al. (2014)) In the binary symmetric SBM, exact recovery is impossible if √α − √β < √2, while it is possible and can be achieved by the ML estimator if √α − √β > √2.
In the literature, the condition √α − √β > √2 is called the information-theoretic limit.
Question: Is it possible to develop efficient methods for achieving exact recovery at the information-theoretic limit?
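The threshold in Theorem 1 is easy to evaluate for a given pair (α, β). The sketch below is only a convenience check; the function name `above_recovery_threshold` is hypothetical, and the boundary case √α − √β = √2 is reported as False because the theorem as stated here does not cover it.

```python
import math


def above_recovery_threshold(alpha, beta):
    """Return True when sqrt(alpha) - sqrt(beta) > sqrt(2).

    Hypothetical helper: True means Theorem 1 guarantees that exact
    recovery is possible; False means it does not.
    """
    if alpha < 0 or beta < 0:
        raise ValueError("alpha and beta must be non-negative")
    return math.sqrt(alpha) - math.sqrt(beta) > math.sqrt(2.0)
```

For example, (α, β) = (10, 2) gives √10 − √2 ≈ 1.75 > √2 ≈ 1.41, so exact recovery is possible, whereas (α, β) = (3, 1) gives √3 − 1 ≈ 0.73 < √2, so it is impossible.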
SLIDE 9
Related Works
Table: Methods above the information-theoretic limit
Authors               | Methods         | Time complexity | Recovery bounds
Boppana, 1987         | spectral algo.  | polynomial time | (α − β)²/(α + β) > 72
McSherry, 2001        | spectral algo.  | polynomial time | (α − β)²/(α + β) > 64
Abbe et al., 2016     | SDP             | polynomial time | 3(α − β)² > 24(α + β) + 8(α − β)
Bandeira et al., 2016 | manifold opti.  | polynomial time | (p − q)/√(p + q) ≥ c n^(−1/6)

Table: Methods at the information-theoretic limit
Authors            | Methods         | Time complexity    | Recovery bounds
Hajek et al., 2016 | SDP             | polynomial time    | √α − √β > √2
Abbe et al., 2017  | spectral algo.  | polynomial time    | √α − √β > √2
Gao et al., 2017   | two-stage algo. | polynomial time    | √α − √β > √2
Our paper          | two-stage algo. | nearly-linear time | √α − √β > √2
SLIDE 11 Algorithm
Algorithm 1 A Two-Stage Algorithm for Exact Recovery
 1: Input: adjacency matrix A, positive integer N
 2: set ρ ← 1_n^T A 1_n / n² and B ← A − ρ E_n
 3: choose y^0 randomly with uniform distribution over the unit sphere
 4: for k = 1, 2, . . . , N do            ▷ power method (PM): coarse estimate
 5:     set y^k ← B y^{k−1} / ‖B y^{k−1}‖_2
 6: end for
 7: set x^0 ← √n y^N
 8: for k = 1, 2, . . . do                ▷ generalized power method (GPM): refinement
 9:     set x^k ← B x^{k−1} / |B x^{k−1}|
10:     if x^k = x^{k−1} then             ▷ stopping criterion
11:         terminate and return x^k
12:     end if
13: end for
For any v ∈ R^n, v/|v| denotes the vector of R^n defined as
\[
\left( \frac{v}{|v|} \right)_i =
\begin{cases}
1, & \text{if } v_i \ge 0, \\
-1, & \text{otherwise},
\end{cases}
\qquad i = 1, \dots, n.
\]
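The two stages of Algorithm 1 can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation. It assumes that E_n denotes the n × n all-ones matrix (the slide does not define it), it draws the uniform point on the unit sphere by normalizing a standard Gaussian vector, and it adds a hypothetical `max_gpm_iters` cap as a safeguard that Algorithm 1 itself does not have.

```python
import numpy as np


def two_stage_recovery(A, num_power_iters, max_gpm_iters=1000, rng=None):
    """Sketch of Algorithm 1: power method followed by generalized power method.

    Assumptions: E_n is the all-ones matrix, and max_gpm_iters is an
    illustrative safeguard that is not part of the algorithm on the slide.
    """
    rng = np.random.default_rng(rng)
    n = A.shape[0]
    rho = A.sum() / n**2  # rho = 1^T A 1 / n^2

    def apply_B(v):
        # B v = A v - rho * (1^T v) * 1, computed without forming B.
        return A @ v - rho * v.sum()

    # Stage 1 (PM): a normalized Gaussian vector is uniform on the unit sphere.
    y = rng.standard_normal(n)
    y /= np.linalg.norm(y)
    for _ in range(num_power_iters):
        w = apply_B(y)
        y = w / np.linalg.norm(w)

    # Stage 2 (GPM): entrywise sign map, with v_i >= 0 mapped to +1.
    x = np.sqrt(n) * y
    for _ in range(max_gpm_iters):
        x_next = np.where(apply_B(x) >= 0, 1.0, -1.0)
        if np.array_equal(x_next, x):  # stopping criterion x^k = x^{k-1}
            return x_next
        x = x_next
    return x
```

The returned vector should be compared with both x∗ and −x∗, since the communities are identifiable only up to a global sign flip.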
SLIDE 12 Main Theorem
Theorem 2 (Iteration Complexity for Exact Recovery) Let A be randomly generated by Model 1. If √α − √β > √2, then the following statement holds with probability at least 1 − n^{−Ω(1)}: Algorithm 1 finds x∗ or −x∗ in O(log n/ log log n) power iterations and O(log n/ log log n) generalized power iterations.
Consequences:
- Algorithm 1 achieves exact recovery at the information-theoretic limit.
- Explicit iteration complexity bound for Algorithm 1 to achieve exact recovery.
The number of non-zero entries in A is, with high probability, on the order of n log n.
Corollary 3 (Time Complexity for Exact Recovery) Let A be randomly generated by Model 1. If √α − √β > √2, then with probability at least 1 − n^{−Ω(1)}, Algorithm 1 finds x∗ or −x∗ in O(n log² n) time.
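The time bound follows from the cost of one iteration: if a product with B costs O(nnz(A)) = O(n log n), then O(log n/ log log n) iterations cost O(n log² n/ log log n), which is within O(n log² n). The sketch below shows one way to reach that per-iteration cost. It assumes that E_n is the all-ones matrix, so that Bv = Av − ρ(1ᵀv)1, and that A is a symmetric 0/1 matrix given by the index arrays of its non-zero entries; the name `make_B_matvec` is hypothetical.

```python
import numpy as np


def make_B_matvec(rows, cols, n):
    """Return a function v -> B v that runs in O(nnz(A) + n) time.

    rows, cols: indices of all non-zero entries of the symmetric 0/1
    matrix A (both (i, j) and (j, i) listed). Hypothetical helper that
    assumes B = A - rho * (all-ones matrix).
    """
    rows = np.asarray(rows)
    cols = np.asarray(cols)
    # For a 0/1 matrix, 1^T A 1 equals the number of non-zero entries.
    rho = rows.size / n**2

    def matvec(v):
        v = np.asarray(v, dtype=float)
        # (A v)_i = sum of v_j over the non-zero entries (i, j): O(nnz).
        Av = np.bincount(rows, weights=v[cols], minlength=n)
        # Rank-one correction rho * (1^T v) * 1: O(n).
        return Av - rho * v.sum()

    return matvec
```

The dense matrix B is never formed, which matters because B = A − ρE_n has no zero entries in general even though A is sparse.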
SLIDE 13 Analysis of Power Method
Proposition 1 (Convergence Rate of Power Method) Let {y^k}_{k≥0} be the sequence generated in the first stage of Algorithm 1. Then, it holds with probability at least 1 − n^{−Ω(1)} that
\[
\min_{s \in \{\pm 1\}} \| y^k - s u_1 \|_2 \lesssim \frac{n}{(\log n)^{k/2}}, \qquad \forall\, k \ge 0, \tag{2}
\]
where u_1 is an eigenvector of B associated with the largest eigenvalue.
- With high probability, {y^k}_{k≥0} converges at least linearly to u_1.
- Equation (2) shows that the ratio in the linear rate of convergence tends to 0 as n → ∞.
Lemma 4 (Distance from Leading Eigenvector of B to Ground Truth) It holds with probability at least 1 − n^{−Ω(1)} that
\[
\min_{s \in \{\pm 1\}} \| \sqrt{n}\, u_1 - s x^* \|_2 \lesssim \sqrt{n / \log n}. \tag{3}
\]
- It suffices to compute y^{N_p} such that min_{s∈{±1}} ‖y^{N_p} − s u_1‖_2 ≲ 1/√(log n). By (2), we have N_p = O(log n/ log log n).
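The count of power iterations can be checked with a short calculation. The sketch below ignores the constant hidden in the bound (2) and assumes n > e so that log log n > 0; it asks for the smallest k at which the right-hand side of (2) drops to the target accuracy 1/√(log n).

```latex
\begin{align*}
\frac{n}{(\log n)^{k/2}} \le \frac{1}{\sqrt{\log n}}
  &\iff (\log n)^{(k-1)/2} \ge n \\
  &\iff \frac{k-1}{2}\,\log\log n \ge \log n \\
  &\iff k \ge 1 + \frac{2\log n}{\log\log n}.
\end{align*}
% Hence N_p = O(\log n / \log\log n).
% Illustration (constants ignored): for n = 10^4, \log n \approx 9.21 and
% \log\log n \approx 2.22, so the bound asks for k of roughly 9.3.
```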
SLIDE 14 Analysis of Generalized Power Method
Proposition 2 (Convergence Rate of Generalized Power Method) Let α > β > 0 be fixed such that √α − √β > √2. Suppose that the x^0 in Algorithm 1 satisfies ‖x^0‖_2 = √n and ‖x^0 − x∗‖_2 ≲ √(n/ log n). Then, it holds with probability at least 1 − n^{−Ω(1)} that
\[
\| x^k - x^* \|_2 \le \frac{\| x^0 - x^* \|_2}{(\log n)^{k/2}}. \tag{4}
\]
- Note that ‖x^0 − x∗‖_2 ≤ ‖x^0 − √n u_1‖_2 + ‖√n u_1 − x∗‖_2 ≲ √(n/ log n).
Lemma 5 (One-step Convergence of Generalized Power Iterations) For any fixed α > β > 0 such that √α − √β > √2, the following event happens with probability at least 1 − n^{−Ω(1)}: for all x ∈ {±1}^n such that ‖x − x∗‖_2 ≤ 2, it holds that
\[
\frac{Bx}{|Bx|} = x^*. \tag{5}
\]
- This lemma indicates that the GPM exhibits finite termination.
- If ‖x^0 − x∗‖_2/(log n)^{N_g/2} ≤ 2, then by (4) we have ‖x^{N_g} − x∗‖_2 ≤ 2, and hence x^{N_g+1} = x∗. One can verify that N_g = O(log n/ log log n).
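The bound on N_g can be verified with a short calculation. The sketch below uses only ‖x^0‖_2 = ‖x∗‖_2 = √n, which gives ‖x^0 − x∗‖_2 ≤ 2√n by the triangle inequality, and it assumes n > e so that log log n > 0. It also records why the radius 2 in Lemma 5 is natural: a sign vector within distance 2 of x∗ disagrees with it in at most one entry.

```latex
% For x \in \{\pm 1\}^n, every mismatched entry contributes (\pm 2)^2 = 4:
\[
\| x - x^* \|_2^2 = 4\,\bigl|\{\, i : x_i \ne x^*_i \,\}\bigr|,
\qquad\text{so}\qquad
\| x - x^* \|_2 \le 2 \iff \text{at most one entry differs.}
\]
% Number of GPM iterations needed to reach that radius via (4):
\begin{align*}
\frac{2\sqrt{n}}{(\log n)^{N_g/2}} \le 2
  &\iff (\log n)^{N_g/2} \ge \sqrt{n} \\
  &\iff \frac{N_g}{2}\,\log\log n \ge \frac{1}{2}\,\log n \\
  &\iff N_g \ge \frac{\log n}{\log\log n}.
\end{align*}
```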
SLIDE 16 Phase Transition and Computation Efficiency
- Benchmark methods:
  - SDP-based approach in Amini et al. (2018), solved by ADMM.
  - Manifold optimization (MFO) based approach in Bandeira et al. (2016), solved by the manifold gradient descent (MGD) method.
  - Spectral clustering (SC) approach in Abbe et al. (2017), solved by the MATLAB function eigs.
- Parameter settings:
  - n = 300; α and β vary from 0 to 30 and from 0 to 10, with increments of 0.5 and 0.4, respectively.
  - For each fixed (α, β), we generate 40 instances and calculate the ratio of exact recovery.
Figure: Phase transition: the x-axis is β, the y-axis is α, and darker pixels represent lower empirical probability of success. The red curve is √α − √β = √2. Panels (running time): GPM (25 s), SDP (9313 s), MGD (1064 s), SC (118 s).
SLIDE 17 Convergence Performance
- Parameter settings:
  - α = 10, β = 2.
  - n = 1000, 5000, 10000.
Figure: Convergence performance: the x-axis is the number of iterations, the y-axis for GPM is ‖x^k (x^k)^T − x∗ (x∗)^T‖_F, and the y-axis for MGD is ‖Q^k (Q^k)^T − x∗ (x∗)^T‖_F (log scale), where x^k and Q^k are the iterates generated in the k-th iteration of GPM and MGD, respectively. Panels: n = 1000, n = 5000, and n = 10000, each with α = 10, β = 2, comparing GPM and MGD.
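The distance used on the y-axis can be computed without forming any n × n matrix, because ‖xxᵀ − yyᵀ‖_F² = ‖x‖⁴ + ‖y‖⁴ − 2(xᵀy)². The following sketch applies this identity to vector iterates only (it does not cover a matrix-valued Q^k), and the function name `outer_product_distance` is hypothetical.

```python
import numpy as np


def outer_product_distance(x, x_star):
    """Compute ||x x^T - x_star x_star^T||_F in O(n) time.

    Hypothetical helper based on the identity
    ||x x^T - y y^T||_F^2 = ||x||^4 + ||y||^4 - 2 (x^T y)^2.
    """
    x = np.asarray(x, dtype=float)
    x_star = np.asarray(x_star, dtype=float)
    squared = (np.dot(x, x) ** 2
               + np.dot(x_star, x_star) ** 2
               - 2.0 * np.dot(x, x_star) ** 2)
    # Guard against tiny negative values caused by floating-point rounding.
    return float(np.sqrt(max(squared, 0.0)))
```

The measure is unchanged when x is replaced by −x, which matches the goal of recovering x∗ or −x∗.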
SLIDE 19 Conclusions
1. We propose a two-stage iterative algorithm to solve the problem of exact community recovery in the binary symmetric SBM: (i) 1st stage: power method; (ii) 2nd stage: generalized power method.
2. We show that the proposed method achieves exact recovery at the information-theoretic limit in Õ(n) time.
3. Numerical experiments demonstrate that the proposed approach has strong recovery performance and is highly efficient.