Graphon Estimation: Minimax Rates and Posterior Contraction
Chao Gao Yale University
@Leiden, March 2015
Stochastic Block Model
$z : \{1, 2, \ldots, n\} \to \{1, 2, \ldots, k\}$
$A_{ij} \sim \mathrm{Bernoulli}(\theta_{ij})$
$\theta_{ij} = Q_{z(i)z(j)}$
Goal: recover $\theta_{ij}$
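A minimal sketch of drawing one network from the stochastic block model above, using only the Python standard library. The function name `sample_sbm`, the 0-based labels, the symmetric fill, and the zero diagonal are assumptions of this sketch, not part of the slides.

```python
import random

def sample_sbm(z, Q, seed=0):
    """Draw a symmetric 0/1 matrix A with P(A[i][j] = 1) = Q[z[i]][z[j]].

    z: list of 0-based block labels, one per node (hypothetical encoding).
    Q: symmetric k x k list of lists with entries in [0, 1].
    The diagonal is left at 0 (no self-loops); this is an assumption here.
    """
    rng = random.Random(seed)
    n = len(z)
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            theta_ij = Q[z[i]][z[j]]          # theta_ij = Q_{z(i) z(j)}
            edge = 1 if rng.random() < theta_ij else 0
            A[i][j] = A[j][i] = edge          # undirected graph
    return A

# Two blocks of three nodes, dense within blocks and sparse across.
z = [0, 0, 0, 1, 1, 1]
Q = [[0.9, 0.1], [0.1, 0.9]]
A = sample_sbm(z, Q)
```

Only `A` is observed in the estimation problem; `z` and `Q` are unknown and `theta_ij` is the target.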
$z_1 : \{1, 2, \ldots, n\} \to \{1, 2, \ldots, k\}$, $z_2 : \{1, 2, \ldots, m\} \to \{1, 2, \ldots, l\}$
$E(A_{ij}) = \theta_{ij} = Q_{z_1(i) z_2(j)}$
$x_i \in D$, $\epsilon_i \sim N(0, 1)$
Common assumption: $f$ is smooth on $D$.
Goal: recover f from both x and y
$x_i \in D$, $\epsilon_i \sim N(0, 1)$
Common assumption: $f$ is smooth on $D$.
Goal: recover f from only y
$$\inf_{\hat f} \sup_{f \in \mathcal{F}} E\left( \frac{1}{n} \sum_{i=1}^{n} \big(\hat f(x_i) - f(x_i)\big)^2 \right) \asymp \frac{1}{n}.$$
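$\mathcal{F} = \{ f : f(x) = q_1 \text{ for } x \in (0, 1/2],\ f(x) = q_2 \text{ for } x \in (1/2, 1] \}$
$\Theta = \{ \theta : \text{half of the } \theta_i \text{ are } q_1, \text{ half of the } \theta_i \text{ are } q_2 \}$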
$$\inf_{\hat\theta} \sup_{\theta \in \Theta} E\left( \frac{1}{n} \sum_{i=1}^{n} (\hat\theta_i - \theta_i)^2 \right) \asymp 1.$$
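$y_{ij} = f(\xi_i, \xi_j) + \epsilon_{ij}$, $\xi_i = i/n$, $i, j = 1, 2, \ldots, n$
$$f(x, y) = \begin{cases} q_1, & (x, y) \in [0, 1/2) \times [0, 1/2), \\ q_2, & (x, y) \in [0, 1/2) \times [1/2, 1], \\ q_3, & (x, y) \in [1/2, 1] \times [0, 1/2), \\ q_4, & (x, y) \in [1/2, 1] \times [1/2, 1]. \end{cases}$$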
$$\inf_{\hat f} \sup_{f \in \mathcal{F}} E\left( \frac{1}{n^2} \sum_{1 \le i, j \le n} \big(\hat f(\xi_i, \xi_j) - f(\xi_i, \xi_j)\big)^2 \right) \asymp \frac{1}{n^2}.$$
How about without knowing the design?
$$\inf_{\hat f} \sup_{f \in \mathcal{F}} E\left( \frac{1}{n^2} \sum_{1 \le i, j \le n} \big(\hat f(\xi_i, \xi_j) - f(\xi_i, \xi_j)\big)^2 \right) \asymp \frac{1}{n}.$$
Let θij = f(ξi, ξj). Does θij have any structure?
{θi1, θi2, ..., θin} are from the same row for each i. {θ1j, θ2j, ..., θnj} are from the same column for each j.
$y_{ij} = f(\xi_{ij}) + \epsilon_{ij}$, $\xi_{ij} \in [0, 1]^2$, $i, j = 1, 2, \ldots, n$
Without knowing the design?
$$\inf_{\hat f} \sup_{f \in \mathcal{F}} E\left( \frac{1}{n^2} \sum_{1 \le i, j \le n} \big(\hat f(\xi_{ij}) - f(\xi_{ij})\big)^2 \right) \asymp 1.$$
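$A_{ij} \sim \mathrm{Bernoulli}(\theta_{ij})$
$\Theta_2 = \{ \theta : \theta_{ij} = Q_{z(i)z(j)}, \text{ with } z : [n] \to [2] \}$
$$\inf_{\hat\theta} \sup_{\theta \in \Theta_2} E\left( \frac{1}{n^2} \sum_{1 \le i, j \le n} (\hat\theta_{ij} - \theta_{ij})^2 \right) \asymp \frac{1}{n}.$$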
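$A_{ij} \sim \mathrm{Bernoulli}(\theta_{ij})$
$\Theta_k = \{ \theta : \theta_{ij} = Q_{z(i)z(j)}, \text{ with } z : [n] \to [k] \}$
$$\inf_{\hat\theta} \sup_{\theta \in \Theta_k} E\left\{ \frac{1}{n^2} \sum_{i, j \in [n]} (\hat\theta_{ij} - \theta_{ij})^2 \right\} \asymp \frac{k^2}{n^2} + \frac{\log k}{n}, \quad \text{for any } 1 \le k \le n.$$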
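With $k \asymp n^{\delta}$ for $\delta \in [0, 1]$:
$$\frac{k^2}{n^2} + \frac{\log k}{n} \asymp \begin{cases} n^{-2}, & \delta = 0,\ k = 1, \\ n^{-1}, & \delta = 0,\ k > 1, \\ n^{-1} \log n, & \delta \in (0, 1/2], \\ n^{-2(1-\delta)}, & \delta \in (1/2, 1]. \end{cases}$$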
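The two terms of the rate can be compared numerically. The sketch below is illustrative only: the function names are hypothetical, and constants hidden by the rate notation are ignored.

```python
import math

def sbm_rate_terms(n, k):
    """Return (k^2 / n^2, log(k) / n), the two terms of the rate, up to constants."""
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= n")
    return (k * k) / (n * n), math.log(k) / n

def dominant_term(n, k):
    """Name the larger of the two terms for the given n and k."""
    parametric, clustering = sbm_rate_terms(n, k)
    return "k^2/n^2" if parametric >= clustering else "log(k)/n"

# Few blocks: the log(k)/n term dominates. Many blocks: k^2/n^2 dominates.
for k in (1, 10, 100, 5000):
    print(k, sbm_rate_terms(10000, k), dominant_term(10000, k))
```

For fixed `n`, the switch between the two regimes happens around `k` of order `sqrt(n log n)`, in line with the split at delta = 1/2 in the case display.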
Theorem (Aldous-Hoover). A random array $\{A_{ij}\}$ is jointly exchangeable, in the sense that $\{A_{ij}\} \overset{d}{=} \{A_{\sigma(i)\sigma(j)}\}$ for all permutations $\sigma$, if and only if it can be represented as follows: there is a random function $F : [0, 1]^3 \to \mathbb{R}$ such that $A_{ij} \overset{d}{=} F(\xi_i, \xi_j, \xi_{ij})$, where $\{\xi_i\}$ and $\{\xi_{ij}\}$ are i.i.d. Unif[0, 1].
When the graph is undirected and has no self-loops,
$A_{ij} \mid \xi_i, \xi_j \sim \mathrm{Bernoulli}(\theta_{ij}), \quad \theta_{ij} = f(\xi_i, \xi_j).$
$\xi_i \sim \mathrm{Unif}(0, 1)$ i.i.d.
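The graphon model can be simulated in two stages: draw the latent positions, then draw the edges. This is a sketch under stated assumptions: `example_graphon` is a made-up symmetric function with values in [0, 1], and `sample_graphon_graph` is a hypothetical helper name.

```python
import random

def example_graphon(x, y):
    """A hypothetical symmetric graphon f : [0,1]^2 -> [0,1], for illustration only."""
    return (x + y) / 2

def sample_graphon_graph(f, n, seed=0):
    """Draw xi_i ~ Unif(0,1) i.i.d., set theta_ij = f(xi_i, xi_j),
    then draw A_ij ~ Bernoulli(theta_ij) for i < j and mirror it (undirected, no self-loops).
    """
    rng = random.Random(seed)
    xi = [rng.random() for _ in range(n)]
    theta = [[f(xi[i], xi[j]) for j in range(n)] for i in range(n)]
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            A[i][j] = A[j][i] = 1 if rng.random() < theta[i][j] else 0
    return xi, theta, A

xi, theta, A = sample_graphon_graph(example_graphon, n=5)
```

The estimator only sees `A`. Neither `xi` nor `f` is observed, which is the "unknown design" situation from the regression slides.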
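Theorem 1.2. Consider the Hölder class $\mathcal{F}_\alpha(M)$. Then
$$\inf_{\hat\theta} \sup_{f \in \mathcal{F}_\alpha(M)} \sup_{\xi \sim P_\xi} E\left\{ \frac{1}{n^2} \sum_{i, j \in [n]} (\hat\theta_{ij} - \theta_{ij})^2 \right\} \asymp \begin{cases} n^{-\frac{2\alpha}{\alpha+1}}, & 0 < \alpha < 1, \\ \dfrac{\log n}{n}, & \alpha \ge 1. \end{cases}$$
The expectation is jointly over $\{A_{ij}\}$ and $\{\xi_i\}$.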
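Proposition (Fano). Let $(\Theta, \rho)$ be a metric space and $\{P_\theta : \theta \in \Theta\}$ a collection of probability measures. For a subset $T \subset \Theta$, define the KL diameter of $T$ by
$$d_{KL}(T) = \sup_{\theta, \theta' \in T} D(P_\theta \| P_{\theta'}).$$
Then
$$\inf_{\hat\theta} \sup_{\theta \in \Theta} E_\theta\, \rho^2\big(\hat\theta(X), \theta\big) \ge \sup_{\epsilon > 0} \frac{\epsilon^2}{4} \left( 1 - \frac{d_{KL}(T) + \log 2}{\log M(\epsilon, T, \rho)} \right),$$
where $M(\epsilon, T, \rho)$ is the $\epsilon$-packing number of $T$ under $\rho$.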
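$$T = \left\{ \{\theta_{ij}\} \in [0, 1]^{n \times n} : \theta_{ij} = \tfrac{1}{2} \text{ for } (i, j) \in (S \times S) \cup (S^c \times S^c),\ \theta_{ij} = \tfrac{1}{2} + \tfrac{c}{\sqrt{n}} \text{ for } (i, j) \in (S \times S^c) \cup (S^c \times S), \text{ with some } S \in \mathcal{S} \right\}.$$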
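[Figure: 2 × 2 block layout of $\theta$. Diagonal blocks $S \times S$ and $S^c \times S^c$ equal $\tfrac{1}{2}$; off-diagonal blocks $S \times S^c$ and $S^c \times S$ equal $\tfrac{1}{2} + \tfrac{c}{\sqrt{n}}$.]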
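$$\rho^2(\theta, \theta') = \frac{1}{n^2} \sum_{1 \le i, j \le n} (\theta_{ij} - \theta'_{ij})^2 = \frac{2c^2}{n} \cdot \frac{|I_S - I_{S'}|}{n} \cdot \frac{n - |I_S - I_{S'}|}{n}.$$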
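The closed form for the distance can be checked against a direct entrywise computation. In this sketch the helper names are hypothetical, subsets are represented as Python sets of 0-based indices, and the Hamming distance between the indicator vectors is computed as the size of the symmetric difference.

```python
import math

def theta_from_subset(S, n, c):
    """theta_ij = 1/2 if i and j are on the same side of the split (S vs S^c),
    and 1/2 + c/sqrt(n) if they are on different sides."""
    in_S = [i in S for i in range(n)]
    bump = c / math.sqrt(n)
    return [[0.5 + (bump if in_S[i] != in_S[j] else 0.0) for j in range(n)]
            for i in range(n)]

def rho_sq_direct(theta1, theta2):
    """(1/n^2) * sum over all (i, j) of (theta1_ij - theta2_ij)^2."""
    n = len(theta1)
    total = sum((theta1[i][j] - theta2[i][j]) ** 2
                for i in range(n) for j in range(n))
    return total / n ** 2

def rho_sq_formula(S1, S2, n, c):
    """(2 c^2 / n) * (d / n) * ((n - d) / n), with d the Hamming distance."""
    d = len(set(S1) ^ set(S2))
    return (2 * c * c / n) * (d / n) * ((n - d) / n)

n, c = 8, 0.5
S1, S2 = {0, 1, 2, 3}, {0, 1, 4, 5}
direct = rho_sq_direct(theta_from_subset(S1, n, c), theta_from_subset(S2, n, c))
closed = rho_sq_formula(S1, S2, n, c)
```

The entries of the two matrices differ exactly at pairs where one index lies in the symmetric difference and the other does not, which gives 2d(n - d) differing entries, each contributing c^2/n.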
Construct a subset:
$$\sup_{\theta, \theta' \in T} D(P_\theta \| P_{\theta'}) \le \sup_{\theta, \theta' \in T} 8 \|\theta - \theta'\|^2 \le 8 c^2 n.$$
Lower bound the packing number
View $|I_S - I_{S'}|$ as the Hamming distance between the indicator vectors $I_S$ and $I_{S'}$.
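$$\frac{1}{4} n \le |I_S - I_{S'}| \le \frac{3}{4} n,$$
$$\rho^2(\theta, \theta') = \frac{2c^2}{n} \cdot \frac{|I_S - I_{S'}|}{n} \cdot \frac{n - |I_S - I_{S'}|}{n} \ge \frac{c^2}{8n} =: \epsilon^2.$$
$$M(\epsilon, T, \rho) \ge N \ge \exp(c_1 n)$$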
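$$\inf_{\hat\theta} \sup_{\theta \in \Theta_2} E\left( \frac{1}{n^2} \sum_{1 \le i, j \le n} (\hat\theta_{ij} - \theta_{ij})^2 \right) \ge \frac{c^2}{32n} \left( 1 - \frac{8c^2 n + \log 2}{c_1 n} \right) \gtrsim \frac{1}{n}.$$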
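Plugging the three ingredients into Fano's inequality is plain arithmetic, sketched below. The values `c = 0.05` and `c1 = 0.1` are hypothetical choices made only so that the bracket is positive; the slides only assert that suitable constants exist.

```python
import math

def fano_lower_bound(n, c, c1):
    """Evaluate (eps^2 / 4) * (1 - (d_KL + log 2) / log M) with
    eps^2 = c^2 / (8n), d_KL <= 8 c^2 n and log M >= c1 * n."""
    eps_sq = c * c / (8 * n)
    kl_diameter = 8 * c * c * n
    log_packing = c1 * n
    return (eps_sq / 4) * (1 - (kl_diameter + math.log(2)) / log_packing)

# n times the bound settles near a positive constant, so the bound is of order 1/n.
for n in (100, 1000, 10000):
    print(n, n * fano_lower_bound(n, c=0.05, c1=0.1))
```

The bracket stays bounded away from zero only when `8 * c**2` is smaller than `c1`, which is why `c` has to be chosen small relative to the packing constant.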
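Oracle solution
When the clustering $z$ is known, an obvious estimator
$$\hat\theta_{ij} = \frac{1}{|z^{-1}(a)|\,|z^{-1}(b)|} \sum_{(i', j') \in z^{-1}(a) \times z^{-1}(b)} A_{i'j'}, \quad \text{for } (i, j) \in z^{-1}(a) \times z^{-1}(b),$$
achieves the rate $\|\hat\theta - \theta\|_F^2 \le O_P(\ldots)$.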
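The oracle estimator is a block average, which the following sketch spells out. The function name and the 0-based label encoding are assumptions of the sketch; following the displayed formula, every entry of a block (including diagonal entries) enters the average.

```python
def oracle_block_average(A, z, k):
    """Average A over each block z^{-1}(a) x z^{-1}(b) and copy the block mean
    back to every (i, j) in that block.

    A: n x n list of lists; z: list of 0-based labels in range(k).
    Returns (Q_hat, theta_hat). Empty blocks get mean 0.0 by convention.
    """
    n = len(A)
    sums = [[0.0] * k for _ in range(k)]
    counts = [[0] * k for _ in range(k)]
    for i in range(n):
        for j in range(n):
            sums[z[i]][z[j]] += A[i][j]
            counts[z[i]][z[j]] += 1
    Q_hat = [[sums[a][b] / counts[a][b] if counts[a][b] else 0.0
              for b in range(k)] for a in range(k)]
    theta_hat = [[Q_hat[z[i]][z[j]] for j in range(n)] for i in range(n)]
    return Q_hat, theta_hat

A = [[0, 1, 0, 0],
     [1, 0, 0, 1],
     [0, 0, 0, 1],
     [0, 1, 1, 0]]
Q_hat, theta_hat = oracle_block_average(A, z=[0, 0, 1, 1], k=2)
```

With a symmetric `A`, the block means are symmetric as well, so `Q_hat` automatically satisfies the symmetry constraint.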
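An equivalent form (least squares)
Fixing the known $z$, solve
$$\min_{\theta} \|A - \theta\|_F^2 \quad \text{s.t. } \theta_{ij} = Q_{z(i)z(j)} \text{ for some } Q = Q^T \in [0, 1]^{k \times k}.$$
A natural estimator
Solve
$$\min_{\theta} \|A - \theta\|_F^2 \quad \text{s.t. } \theta_{ij} = Q_{z(i)z(j)} \text{ for some } Q = Q^T \in [0, 1]^{k \times k} \text{ and some } z : \{1, 2, \ldots, n\} \to \{1, 2, \ldots, k\}.$$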
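For very small n, the least-squares problem over both Q and z can be solved by exhaustive search, which makes the definition concrete. This is a brute-force sketch, not a practical algorithm: it enumerates all k^n labelings, and the function names are hypothetical.

```python
from itertools import product

def block_fit(A, z, k):
    """For a fixed labeling z, the least-squares theta is the block mean of A.
    Returns (theta, squared Frobenius loss)."""
    n = len(A)
    sums = [[0.0] * k for _ in range(k)]
    counts = [[0] * k for _ in range(k)]
    for i in range(n):
        for j in range(n):
            sums[z[i]][z[j]] += A[i][j]
            counts[z[i]][z[j]] += 1
    # Every block (z[i], z[j]) that is read below contains at least (i, j).
    theta = [[sums[z[i]][z[j]] / counts[z[i]][z[j]] for j in range(n)]
             for i in range(n)]
    loss = sum((A[i][j] - theta[i][j]) ** 2 for i in range(n) for j in range(n))
    return theta, loss

def least_squares_sbm(A, k):
    """Minimize ||A - theta||_F^2 over all labelings z in {0..k-1}^n."""
    best = None
    for z in product(range(k), repeat=len(A)):
        theta, loss = block_fit(A, z, k)
        if best is None or loss < best[2]:
            best = (z, theta, loss)
    return best

A = [[1, 1, 0, 0],
     [1, 1, 0, 0],
     [0, 0, 1, 1],
     [0, 0, 1, 1]]
z_hat, theta_hat, loss = least_squares_sbm(A, k=2)
```

For symmetric 0/1 input the block means are symmetric and lie in [0, 1], so the constraints on Q hold without being enforced explicitly. The labeling is identified only up to a permutation of the block names.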
$\|\hat\theta - \theta\|_F^2 \le O_P(\ldots)$
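uniform
$\pi(k) \propto \exp(\cdots)$
$f(Q) = \frac{1}{2} \left( \frac{k}{\sqrt{\pi}} \right)^{k^2} \frac{\Gamma(k^2/2)}{\Gamma(k^2)}\, e^{-k\|Q\|}$
uniform
$f(Q) = \frac{1}{2} \left( \frac{k}{\sqrt{\pi}} \right)^{k^2} \frac{\Gamma(k^2/2)}{\Gamma(k^2)}\, e^{-k\|Q\|}$
$\pi(k) \propto \frac{\Gamma(k^2)}{\Gamma(k^2/2)} \exp(\cdots)$
uniform
$\pi(k) \propto \exp(\cdots)$
$f(Q) = \frac{1}{2} \left( \frac{\lambda_k}{\sqrt{\pi}} \right)^{k^2} e^{-\lambda_k\|Q\|}$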
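Theorem 1.3. Consider $\lambda_k = \beta \frac{n}{k}$ for some constant $\beta > 0$. Then
$$E_{\theta^*} \Pi\left( \frac{1}{n^2} \sum_{i, j} (\theta_{ij} - \theta^*_{ij})^2 > M \left( \frac{k^2}{n^2} + \frac{\log k}{n} \right) \,\Big|\, A \right) \le \exp\left( -C' (k^2 + n \log k) \right)$$
for some constants $M, C' > 0$.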
Gao, Chao, Yu Lu, and Harrison H. Zhou. "Rate-optimal Graphon Estimation." arXiv preprint arXiv:1410.5837 (2014).
Thank you