Graph diffusions and matrix functions: fast algorithms and localization results
Thesis defense
Kyle Kloster, Purdue University
Advised by David F. Gleich
Supported by NSF CAREER 1149756-CCF
Erdős Number, Facebook friends, Twitter followers, Search engines, Amazon/Netflix rec., Protein interactions, Power grids, Google Maps, Air traffic control, Sports rankings, Cell tower placement, Scheduling, Parallel programming, Everything, Kevin Bacon
Seed
Newman’s netscience graph: 379 vertices, 924 edges
[Nassar, K., Gleich,
Legend: seed; high diffusion value; low diffusion value
Diffusion from the seed:
$f = \sum_{k=0}^{\infty} c_k p_k = c_0 p_0 + c_1 p_1 + c_2 p_2 + c_3 p_3 + \cdots$
Conductance of a set T: $\phi(T) = \dfrac{\mathrm{cut}(T)}{\min(\mathrm{vol}(T), \mathrm{vol}(T^c))}$
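As a small illustration of the conductance formula, the sketch below computes it for a set of nodes in an unweighted, undirected graph. The adjacency-dictionary representation, the function name `conductance`, and the two-triangle example graph are assumptions made for this example, not something taken from the slides.

```python
def conductance(adj, T):
    """Conductance cut(T) / min(vol(T), vol(T^c)).

    adj: dict mapping each node to the set of its neighbors
         (undirected, unweighted; a hypothetical representation).
    T:   iterable of nodes forming the set.
    """
    T = set(T)
    # cut(T): number of edges with exactly one endpoint in T
    cut = sum(1 for u in T for v in adj[u] if v not in T)
    # vol(.): sum of degrees
    vol_T = sum(len(adj[u]) for u in T)
    vol_Tc = sum(len(adj[u]) for u in adj if u not in T)
    return cut / min(vol_T, vol_Tc)


# Two triangles {0,1,2} and {3,4,5} joined by the single edge (2,3).
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
       3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
phi = conductance(adj, {0, 1, 2})
print(phi)
```

Here one edge leaves the set and each side has volume 7, so the set {0,1,2} has low conductance, which is the sense in which it forms a "local community".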
Legend: seed; high diffusion value; low diffusion value; local community / low-conductance set
$\| D^{-1}(f - \hat{f}) \|_\infty \le \varepsilon$
$f = \sum_{k=0}^{\infty} \alpha^k P^k \tilde{s}$
$f = \sum_{k=0}^{\infty} \frac{t^k}{k!} P^k \tilde{s}$
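The two series above differ only in their coefficients, which a short sketch can make concrete. The code below evaluates a truncated sum of c_k P^k s on a 4-cycle. The helper name `diffusion`, the example graph, the truncation length, and the normalizing factors (1 - alpha) and e^{-t} (chosen so the coefficients sum to 1) are all assumptions for illustration; the slides fold any such scaling into the seed vector.

```python
import math
import numpy as np


def diffusion(P, s, coeffs):
    """Return sum_k coeffs[k] * P^k s for a finite list of coefficients."""
    f = np.zeros_like(s, dtype=float)
    p = s.astype(float)            # p_0 = s
    for c in coeffs:
        f += c * p                 # add c_k * p_k
        p = P @ p                  # p_{k+1} = P p_k
    return f


# 4-cycle; A is symmetric, so dividing each column by its sum gives P = A^T D^{-1}.
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
P = A / A.sum(axis=0)
s = np.array([1.0, 0.0, 0.0, 0.0])   # seed on node 0

alpha, t, N = 0.5, 1.0, 60           # hypothetical parameter choices
ppr = diffusion(P, s, [(1 - alpha) * alpha**k for k in range(N)])
hk = diffusion(P, s, [math.exp(-t) * t**k / math.factorial(k) for k in range(N)])
print(ppr, hk)
```

With these normalizations both vectors sum to (almost exactly) 1, and the geometric-coefficient version agrees with solving the PageRank linear system directly.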
Local Cheeger Inequality [Andersen, Chung, Lang ’06]; [Andersen, Chung, Lang ’06] “PPR-push” is $O(1/(\varepsilon(1-\beta)))$
Local Cheeger Inequality [Chung ’07]; [K., Gleich ’14] “HK-push” is $O(e^{t}C/\varepsilon)$
Open question; [Avron, Horesh ’15] constant-time heuristically
[Ghosh et al. ’14] on L; in preparation with Gleich and Simpson
TDPR
Vectors from the seed: $p_0, p_1, p_2, p_3, \dots$
$f = \sum_{k=0}^{\infty} c_k p_k = c_0 p_0 + c_1 p_1 + c_2 p_2 + c_3 p_3 + \cdots$
Residuals from the seed: $r_0, r_1, r_2, r_3, \dots$
$c_0 p_0 + c_1 p_1 + c_2 p_2 + c_3 p_3 + \cdots$
Residuals $r_0, r_1, r_2, r_3, \dots$ with entries < threshold; terms $c_0 p_0 + c_1 p_1 + c_2 p_2 + c_3 p_3 + \cdots$
Threshold: $\varepsilon \Big/ \Big( \sum_{j=k+1}^{\infty} c_j \Big)$
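$\sum_{k=0}^{\infty} c_k = 1$
and N is chosen so that the tail satisfies
$\sum_{k=N+1}^{\infty} c_k = 1 - \sum_{k=0}^{N} c_k \le \varepsilon/2$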
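To make the role of N concrete, the following sketch finds the smallest N whose coefficient tail is at most ε/2, assuming the coefficients are nonnegative and sum to 1. The function name `terms_needed`, the parameter values, and the normalized coefficient families (geometric and Poisson-type) are assumptions chosen for illustration.

```python
import math


def terms_needed(coeff, eps, max_terms=10_000):
    """Smallest N with 1 - sum_{k=0}^{N} coeff(k) <= eps/2.

    Assumes the coefficients are nonnegative and sum to 1.
    """
    partial = 0.0
    for N in range(max_terms):
        partial += coeff(N)
        if 1.0 - partial <= eps / 2:
            return N
    raise ValueError("tail did not fall below eps/2 within max_terms")


alpha, t, eps = 0.5, 1.0, 1e-3       # hypothetical parameter choices
N_ppr = terms_needed(lambda k: (1 - alpha) * alpha**k, eps)
N_hk = terms_needed(lambda k: math.exp(-t) * t**k / math.factorial(k), eps)
print(N_ppr, N_hk)
```

Because the factorial in the second family makes the tail decay much faster than the geometric one, it needs fewer terms at the same ε for these parameter values.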
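A push on entry $j$ of $r_k$ happens only when $r_k(j) \ge d(j)\varepsilon/(2N)$, that is, $d(j) \le r_k(j)(2N)/\varepsilon$. If the pushes on $r_k$ are at entries $j_1, \dots, j_{m_k}$, then
$\sum_{k=0}^{N-1} \sum_{t=1}^{m_k} d(j_t) \le \sum_{k=0}^{N-1} \sum_{t=1}^{m_k} r_k(j_t)(2N)/\varepsilon$
and
$\sum_{t=1}^{m_k} r_k(j_t) \le 1$ (each push is added to f, which sums to 1),
so the total is at most $N \cdot (2N)/\varepsilon = 2N^2/\varepsilon$.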
Given a seed vector $e_s$ and a graph with $P = A^T D^{-1}$, seeded PageRank is defined as the solution $x$ to
$(I - \alpha P)x = (1 - \alpha)e_s$,
where $\alpha$ is a parameter in (0,1).
Strong localization: if we can approximate $x$ by $\hat{x}$ so that $\|x - \hat{x}\|_1 \le \varepsilon$ and the approximation is sparse, $x$ is strongly localized.
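A minimal sketch of this definition, assuming a small dense graph so that the linear system can be solved directly: it computes seeded PageRank on a path graph and measures the 1-norm error of keeping only the largest entries. The helper names `seeded_pagerank` and `truncate`, the path graph, and the parameter values are hypothetical choices for illustration.

```python
import numpy as np


def seeded_pagerank(A, seed, alpha):
    """Solve (I - alpha P) x = (1 - alpha) e_s with P = A^T D^{-1} (dense solve)."""
    n = A.shape[0]
    d = A.sum(axis=1)                  # degrees
    P = A.T / d                        # column j of A^T scaled by 1/d(j)
    e_s = np.zeros(n)
    e_s[seed] = 1.0
    return np.linalg.solve(np.eye(n) - alpha * P, (1 - alpha) * e_s)


def truncate(x, nnz):
    """Keep the nnz largest entries of x and zero out the rest."""
    x_hat = np.zeros_like(x)
    keep = np.argsort(x)[-nnz:]
    x_hat[keep] = x[keep]
    return x_hat


# Path graph on 50 nodes, seeded at one end.
n = 50
A = np.zeros((n, n))
for i in range(n - 1):
    A[i, i + 1] = A[i + 1, i] = 1.0

x = seeded_pagerank(A, seed=0, alpha=0.5)
x_hat = truncate(x, 10)
err = np.abs(x - x_hat).sum()          # ||x - x_hat||_1
print(err)
```

On this graph the entries decay quickly with distance from the seed, so a 10-nonzero approximation already has a small 1-norm error; this is the sparse-approximation behavior that "strong localization" refers to.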
[Figure: plot(x). X-axis: node index. Y-axis: value at that index in the true PageRank vector.]
[Figure: error $\|x_{\text{true}} - x_{\text{nnz}}\|_1$ versus number of nonzeros.]
Values in the PageRank vector seeded on the center node of a star graph on n nodes: $\frac{1}{1+\alpha}$ at the center and $\frac{\alpha}{(1+\alpha)(n-1)}$ at each of the other nodes. Essentially every entry must be nonzero to get a global error bound.
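The closed-form values $\frac{1}{1+\alpha}$ and $\frac{\alpha}{(1+\alpha)(n-1)}$ can be checked numerically. The sketch below assumes the graph in question is a star (one center joined to n - 1 leaves) and uses hypothetical values of n and α.

```python
import numpy as np

n, alpha = 8, 0.85                     # hypothetical size and parameter
A = np.zeros((n, n))
A[0, 1:] = A[1:, 0] = 1.0              # node 0 is the center of the star
P = A.T / A.sum(axis=1)                # P = A^T D^{-1}, column-stochastic

e_c = np.zeros(n)
e_c[0] = 1.0                           # seed on the center node
x = np.linalg.solve(np.eye(n) - alpha * P, (1 - alpha) * e_c)

center_formula = 1 / (1 + alpha)
leaf_formula = alpha / ((1 + alpha) * (n - 1))
print(x[0], center_formula)
print(x[1], leaf_formula)
```

Every leaf carries the same nonzero value, so dropping any fixed fraction of the entries removes a fixed fraction of the leaf mass α/(1+α), no matter how large n is.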
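$f(x) = (1 - \alpha x)^{-1}$. If a polynomial $p$ satisfies $p(\lambda_i) = f(\lambda_i)$ at the eigenvalues $\lambda_i$ of $P$, then $p(P) = f(P)$:
$(I - \alpha P)^{-1} e_j = f(P) e_j = (c_0 + c_1 P + c_2 P^2) e_j$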
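The interpolation argument can be checked on a small example. This sketch assumes a star graph, whose random-walk matrix P is diagonalizable with only three distinct eigenvalues (1, -1, 0), so a quadratic polynomial suffices; the graph size and α are hypothetical.

```python
import numpy as np

n, alpha = 8, 0.85                       # hypothetical size and parameter
A = np.zeros((n, n))
A[0, 1:] = A[1:, 0] = 1.0                # star graph, node 0 is the center
P = A.T / A.sum(axis=1)                  # P = A^T D^{-1}

# Distinct eigenvalues of P for a star graph.
lams = np.array([1.0, -1.0, 0.0])

# Solve for c0, c1, c2 with c0 + c1*lam + c2*lam^2 = 1 / (1 - alpha*lam).
V = np.vander(lams, 3, increasing=True)  # rows are [1, lam, lam^2]
c0, c1, c2 = np.linalg.solve(V, 1.0 / (1.0 - alpha * lams))

I = np.eye(n)
lhs = np.linalg.inv(I - alpha * P)       # f(P)
rhs = c0 * I + c1 * P + c2 * (P @ P)     # p(P)
print(np.abs(lhs - rhs).max())
```

Since p(P) only involves I, P, and P², column j of the inverse is supported on nodes within two steps of j; on a star graph that is already every node.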
[Yang and Leskovec, ICDM 2015]
Graphs where the k-th largest degree satisfies $d(k) \le \max(d k^{-p}, \delta)$ ($\delta$ is the min degree, $d$ is the max degree, $p$ is the decay exponent)
Due to the maximum degree d, this does not say anything about traditional power-law graphs (e.g. the Pareto case)
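Theorem (Nassar, K., Gleich): Let a graph have max-degree $d$, min-degree $\delta$, $n$ nodes, and let $p$ be the decay exponent. Then Gauss-Southwell computes $x_\varepsilon$ with accuracy $\|x - x_\varepsilon\|_1 \le \varepsilon$, and the number of nonzeros in $x_\varepsilon$ is at most
$\min\left\{ n,\ \frac{1}{\delta} C_p (1/\varepsilon)^{\delta/(1-\alpha)} \right\}$,
where
$C_p = \begin{cases} d(1 + \log d) & p = 1 \\ d\left(1 + \frac{1}{1-p}\left(d^{(1/p)-1} - 1\right)\right) & \text{otherwise} \end{cases}$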
We study the behavior of the Gauss-Southwell or push algorithm for computing PageRank
Algorithm
1. Pick the node with the most residual dye.
2. Assign the dye to that node.
3. Update the residual dye on its neighbors.
4. Repeat.
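The four steps above can be sketched as follows for the system $(I - \alpha P)x = (1 - \alpha)e_s$. The adjacency-dictionary representation, the function name `gauss_southwell_pagerank`, and the stopping rule (stop once the residual sum is at most $(1-\alpha)\varepsilon$, which keeps the 1-norm error at most ε) are assumptions made for this sketch; a practical implementation would track the largest residual with a priority queue rather than scanning.

```python
def gauss_southwell_pagerank(adj, seed, alpha, eps):
    """Push-style solver sketch; assumes every node has at least one neighbor.

    adj: dict mapping each node to the set of its neighbors (undirected).
    Returns a sparse dict x with ||x_true - x||_1 <= eps.
    """
    x = {}
    r = {seed: 1.0 - alpha}                      # residual starts as (1 - alpha) e_s
    while sum(r.values()) > (1.0 - alpha) * eps:
        j = max(r, key=r.get)                    # 1. node with the most residual dye
        m = r.pop(j)
        x[j] = x.get(j, 0.0) + m                 # 2. assign the dye to that node
        share = alpha * m / len(adj[j])
        for i in adj[j]:                         # 3. update residual dye on neighbors
            r[i] = r.get(i, 0.0) + share
    return x                                     # 4. the loop repeats until done


# Star graph on 8 nodes, seeded at the center (node 0).
adj = {0: set(range(1, 8)), **{i: {0} for i in range(1, 8)}}
x = gauss_southwell_pagerank(adj, seed=0, alpha=0.5, eps=1e-6)
print(x)
```

Each push removes a (1 - α) fraction of the pushed dye from the residual, so the residual sum strictly decreases and the loop terminates.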
Residual nonnegative; triangle inequality; P is column-stochastic; residual nonnegative
(definition of average)
$\frac{1}{\delta} C_p \approx \frac{d \log d}{\delta}$
[Nassar, K., Gleich,
[Figure: Netscience, PageRank solution paths. X-axis: 1/ε. Y-axis: degree-normalized PageRank.]
(K. & Gleich) (Jiang, K., Gleich, Gribskov)
advisor and collaborator on all projects
collaborator on PageRank localization and de-localization
collaborator on generalized diffusion work
papers alike: Nichole Eikmeier, Yangyang Hou, Huda Nassar, Bryan Rainey, Yanfei Ren, Ayan Sinha, Varun Vasudevan, Nate Veldt, Tau Wu