Damping Effect on PageRank Distribution

SLIDE 1

Damping Effect on PageRank Distribution

IEEE High Performance Extreme Computing, Waltham, MA, USA, September 26, 2018
Tiancheng Liu, Yuchen Qian, Xi Chen, Xiaobai Sun
Department of Computer Science, Duke University, USA

SLIDE 2

Outline

⋄ Personalized PageRank model: invention by Brin and Page (1998) in need of innovative extension
⋄ The PageRank model family: an analytic apparatus with increased description power and scope
⋄ Analysis: damping effects on PageRank distributions
⋄ Algorithm: exploiting structures of the personalized, stochastic Krylov (PSK) space
⋄ Findings: by experiments on real-world network data

SLIDE 3

Sparse graphs in sparse matrix representations

Link graph $G(V, E)$: directed edge $(u, v) \in E$
[Figure: 20-node link graph, nodes x1 to x20]

Adjacency matrix $A$: $A(v, u) = 1$ for each edge $(u, v)$; $d_{in}$: in-degrees, $d_{out}$: out-degrees
[Figure: 20 × 20 sparsity pattern of A]

Probability transition matrix $P$: $P = A \cdot \mathrm{diag}(1./d_{out})$, kept in factor form in storage
[Figure: 20 × 20 sparsity pattern of P]
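As a rough illustration of the factor form, the sketch below keeps only an edge list and the out-degrees and applies P to a vector without ever forming P. The 4-node edge list and the name `apply_P` are made up for this example, and it assumes every node has at least one out-link (dangling nodes are not handled).

```python
# Hypothetical 4-node link graph; edge (u, v) means u links to v.
edges = [(0, 1), (0, 2), (1, 2), (2, 0), (3, 0), (3, 2)]
n = 4

# Out-degrees d_out: A(v, u) = 1 for each edge (u, v), so column u of A has d_out[u] ones.
d_out = [0] * n
for u, _ in edges:
    d_out[u] += 1

def apply_P(x):
    """Return P x with P = A * diag(1 / d_out), using only the edge list (factor form)."""
    y = [0.0] * n
    for u, v in edges:
        y[v] += x[u] / d_out[u]
    return y

# P is column-stochastic, so a probability vector is mapped to a probability vector.
x = [0.25, 0.25, 0.25, 0.25]
y = apply_P(x)
```

Storing the two factors costs one entry per edge plus one per node, which is the point of the factor form for sparse graphs.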

SLIDE 4

Precursor: Personalized PageRank

Web surfing modeled as a random walk on $M_\alpha(v)$, a Markov chain with a personalized term $S$:

$M_\alpha(v) = \alpha P + (1 - \alpha) S, \quad S = v e^T$

where $\alpha$ is the damping factor, $P$ the link graph, $v$ the personalized vector, and $e^T$ the gathering vector.

[Figure: personalized Markov chain = α · link graph + (1 − α) · personalized direct links, drawn on nodes x1 to x20]

Bernoulli decision at each click: follow P-links with probability α ∈ (0, 1), a.k.a. the damping factor, or S-links with probability 1 − α
The personalized term S: direct links to v-nodes (yellow); gathering/broadcasting; rank-1, stochastic

SLIDE 5

Precursor: Personalized PageRank

The same model in matrix form, with α = 0.85:

$M_\alpha(v) = 0.85 \, P + 0.15 \, S, \quad S = v e^T$

[Figure: 20 × 20 sparsity patterns of $M_\alpha$, P, and S]

SLIDE 6

Equivalent expressions of PageRank distribution vector

Purpose: multi-aspect investigation for interpretation and computational analysis

  • 1. Steady-state distribution of $M_\alpha$ (the power method)

$M_\alpha x = [\alpha P + (1 - \alpha) v e^T] x = x$

[Figure: $M_\alpha^k x_0 \to x$]
Asymptotic walk on $M_\alpha$, memoryless of $x_0$

  • 2. Solution to sparse linear system (many iterative solution methods)

$(I - \alpha P) x = (1 - \alpha) v$

  • 3. Explicit representation, in Neumann series with P, v, α

$x = (1 - \alpha) \sum_k \alpha^k (P^k v)$

[Figure: $(1 - \alpha) \sum_k \alpha^k P^k v \to x$]
Cumulative propagation of v on P

  • 4. Differential transition equation (spectrum-based method)

$\dot{x}(\alpha) = [P (I - \alpha P)^{-1} - (1 - \alpha)^{-1} I] \, x(\alpha)$
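To make the equivalence of the first three expressions concrete, here is a minimal sketch on a made-up 4-node graph. It runs the power method on $M_\alpha$, sums a truncated Neumann series, and checks the residual of the linear system. The graph, the uniform v, and the iteration counts are assumptions chosen for illustration, not settings from the talk.

```python
# Hypothetical 4-node graph; edge (u, w) means u links to w. Every node has an out-link.
edges = [(0, 1), (0, 2), (1, 2), (2, 0), (3, 0), (3, 2)]
n = 4
alpha = 0.85
v = [0.25] * n  # personalized vector (uniform here)

d_out = [0] * n
for u, _ in edges:
    d_out[u] += 1

def apply_P(x):
    y = [0.0] * n
    for u, w in edges:
        y[w] += x[u] / d_out[u]
    return y

# 1. Power method on M_alpha = alpha P + (1 - alpha) v e^T, where e^T x = sum(x).
x_power = [1.0, 0.0, 0.0, 0.0]  # arbitrary starting distribution x0
for _ in range(300):
    Px = apply_P(x_power)
    s = sum(x_power)
    x_power = [alpha * Px[i] + (1 - alpha) * v[i] * s for i in range(n)]

# 3. Neumann series x = (1 - alpha) sum_k alpha^k P^k v, truncated at 300 terms.
x_series = [0.0] * n
term = v[:]  # holds alpha^k P^k v
for _ in range(300):
    x_series = [x_series[i] + (1 - alpha) * term[i] for i in range(n)]
    term = [alpha * t for t in apply_P(term)]

# 2. Residual of the linear system (I - alpha P) x = (1 - alpha) v at the series solution.
Px = apply_P(x_series)
residual = max(abs(x_series[i] - alpha * Px[i] - (1 - alpha) * v[i]) for i in range(n))
gap = max(abs(a - b) for a, b in zip(x_power, x_series))
```

Both `gap` and `residual` should sit at rounding level, since the truncation error shrinks like α^300.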

SLIDE 8

PageRank model family: characterizing various propagation patterns

Model description in equivalent expressions:
⋄ Propagation kernel functions: propagation patterns
⋄ Cumulative propagation on P
⋄ Linear systems
⋄ Differential transitions: PageRank distribution response to damping variation

[Figure: A few particular subfamilies of propagation kernel functions. Panels: Geometric kernels (Brin-Page); Poisson kernels (Chung); Conway-Maxwell-Poisson kernels (slow); Conway-Maxwell-Poisson kernels (fast); Negative Binomial kernels; Logarithmic kernels]

SLIDE 9

Propagation kernel functions

Propagation kernel function $f_\rho(\lambda)$, with $\lambda$ a graph eigenvalue and $\{w_k(\rho)\}$ a discrete pmf:

$f_\rho(\lambda) = \sum_k w_k(\rho) \, \lambda^k$

PageRank vector (model solution) with particular network P and personalized distribution vector v:

$x = f_\rho(P) v = \sum_k w_k(\rho) \cdot P^k v$

where $w_k(\rho)$ is the damping on the k-th step and $P^k v$ is the k-th step propagation.

$\{w_k(\rho)\}$: any probability mass function (pmf) of variable ρ, with or without additional parameters

[Figure: PageRank distributions of 3 propagation patterns with P for link graph Twitter(www) ¹; axes: PageRank value (log scale) and # of nodes (bin counts)]

¹ H. Kwak et al. (2009)
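The model solution $x = \sum_k w_k(\rho) P^k v$ can be sketched as one loop that accepts any truncated pmf. The helper names `propagate` and `poisson_weights`, the 3-node cycle, and the truncation length are assumptions made for this illustration.

```python
import math

def propagate(apply_P, v, weights):
    """Return sum_k weights[k] * P^k v for a finite list of weights (a truncated pmf)."""
    x = [0.0] * len(v)
    pk_v = v[:]  # P^k v, starting at k = 0
    for w_k in weights:
        x = [xi + w_k * pi for xi, pi in zip(x, pk_v)]
        pk_v = apply_P(pk_v)
    return x

def poisson_weights(beta, K):
    """First K Poisson weights e^{-beta} beta^k / k!, built by recurrence."""
    w, out = math.exp(-beta), []
    for k in range(K):
        out.append(w)
        w *= beta / (k + 1)
    return out

# Hypothetical 3-node cycle 0 -> 1 -> 2 -> 0, so P only shifts the entries of a vector.
def apply_P(x):
    return [x[2], x[0], x[1]]

v = [1.0, 0.0, 0.0]
K = 200
geometric = [(1 - 0.85) * 0.85 ** k for k in range(K)]  # Brin-Page weights
poisson = poisson_weights(5.0, K)                       # Chung's weights

x_geo = propagate(apply_P, v, geometric)
x_poi = propagate(apply_P, v, poisson)
```

Swapping the weight list is the only change needed to move between models, which mirrors how the kernel function separates the damping pattern from the graph.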

SLIDE 10

Propagation pattern kernels : CMP sub-family

Conway-Maxwell-Poisson (CMP):

$w_k(\rho, \nu) = \dfrac{\rho^k}{(k!)^\nu \, Z}$

with damping variable ρ, damping speed ν, and normalization constant Z.

Damping speed parameter ν ≥ 0:
  • ν = 0: geometric (B-P, 1998)
  • ν = 1: Poisson (Chung, 2007)
  • ν < 1: slow decaying with k
  • ν > 1: fast decaying with k

[Figure: Slow and fast propagation patterns of CMP distribution. Panels: slow damping speed, 0 ≤ ν ≤ 1 (ρ = 0.9), including BP model and Chung's model; fast damping speed, ν ≥ 1 (ρ = 5)]
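A small sketch of the CMP weights, assuming the series is truncated at K terms and normalized numerically (the slides do not say how Z is computed). It checks the two named special cases, ν = 0 and ν = 1. The function name `cmp_weights` is hypothetical.

```python
import math

def cmp_weights(rho, nu, K=200):
    """Truncated Conway-Maxwell-Poisson pmf: w_k proportional to rho^k / (k!)^nu, k = 0..K-1."""
    raw, term = [], 1.0
    for k in range(K):
        raw.append(term)
        term *= rho / (k + 1) ** nu  # next term rho^(k+1) / ((k+1)!)^nu
    Z = sum(raw)                     # normalization constant (truncated sum)
    return [r / Z for r in raw]

geo = cmp_weights(0.9, 0)    # nu = 0: geometric, needs rho < 1 to be summable
poi = cmp_weights(5.0, 1)    # nu = 1: Poisson with mean rho
fast = cmp_weights(5.0, 2)   # nu > 1: faster decay with k
```

With ν = 0 the ratio of consecutive weights is the constant ρ, while for ν > 0 it is ρ/(k+1)^ν, so a larger ν damps long walks more strongly.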

SLIDE 11

Propagation pattern kernels: NB sub-family

Negative Binomial (NB), at step k:

$w_k(\rho, r) = \binom{k + r - 1}{k} \rho^k (1 - \rho)^r$

with damping variable ρ and distribution shape r.

Distribution shape parameter r:
  • r = 1: geometric distribution
  • r → ∞: Poisson distribution, with $r \cdot \frac{\rho}{1 - \rho} = \text{const}$

[Figure: Propagation patterns of NB distribution]
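The two limiting cases of the NB kernel can be checked numerically. The sketch below builds the weights from the ratio of consecutive terms, which also covers non-integer r. The name `nb_weights`, the value r = 10^6 standing in for r → ∞, and the mean 5 are assumptions for illustration.

```python
import math

def nb_weights(rho, r, K=200):
    """Truncated negative binomial pmf: w_k = C(k + r - 1, k) rho^k (1 - rho)^r, k = 0..K-1."""
    out, term = [], (1.0 - rho) ** r
    for k in range(K):
        out.append(term)
        term *= (k + r) / (k + 1) * rho  # ratio w_{k+1} / w_k
    return out

# r = 1 reduces to the geometric (Brin-Page) weights (1 - rho) rho^k.
geo = nb_weights(0.85, 1)

# Large r with r * rho / (1 - rho) held at lam approaches Poisson with mean lam.
lam, r = 5.0, 1.0e6
rho = lam / (r + lam)
near_poisson = nb_weights(rho, r)
poisson = [math.exp(-lam) * lam ** k / math.factorial(k) for k in range(30)]
max_gap = max(abs(a - b) for a, b in zip(near_poisson, poisson))
```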

SLIDE 12

Propagation pattern kernels: logarithmic distribution

Logarithmic, at step k:

$w_k(\rho) = \dfrac{-1}{\ln(1 - \rho)} \, \dfrac{\rho^k}{k}, \quad \rho \in (0, 1)$

Unique new model in the model family:
  • weight decay faster than geometric distribution
  • weight decay slower than Poisson distribution
  • no extra control parameters

[Figure: Propagation patterns of logarithmic distributions]
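One way to read the two decay claims is through the ratio of consecutive weights. The comparison below is a short derivation from the three pmf definitions, with β standing for the Poisson parameter of Chung's model and the geometric weights taken at the same ρ.

```latex
\left.\frac{w_{k+1}}{w_k}\right|_{\text{geometric}} = \rho,
\qquad
\left.\frac{w_{k+1}}{w_k}\right|_{\text{logarithmic}} = \rho \, \frac{k}{k+1} < \rho,
\qquad
\left.\frac{w_{k+1}}{w_k}\right|_{\text{Poisson}} = \frac{\beta}{k+1} \to 0 \quad (k \to \infty).
```

The logarithmic ratio stays below the geometric ratio ρ at every step, yet it tends to ρ rather than to 0, so the tail sits between the geometric and the Poisson tails.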

SLIDE 13

Propagation pattern kernels: precursor models and new model

Precursor models:
  • Brin-Page ¹ model, geometric distribution: $w_k(\alpha) = (1 - \alpha) \alpha^k$
  • Chung's ² model, Poisson distribution: $w_k(\beta) = e^{-\beta} \dfrac{\beta^k}{k!}$

New model in the family:
  • log-γ model, logarithmic distribution: $w_k(\gamma) = \dfrac{-1}{\ln(1 - \gamma)} \, \dfrac{\gamma^k}{k}$

¹ L. Page and S. Brin, 1998
² F. Chung, PNAS, 2007

[Figure: three panels showing the pmfs of the three models]

SLIDE 14

Cumulative propagation on P

Link graph P and personalized vector v

[Figure: P, v, and the propagation on P: $v, Pv, P^2 v, \cdots, P^{m-1} v$]

[Figure: kernel weights and resulting PageRank vectors, one pair of panels per kernel]

  • Geometric kernel (Brin-Page): $x(\alpha) = z_\alpha \sum_k \alpha^k P^k v$
  • Poisson kernel (Chung): $x(\beta) = z_\beta \sum_k \dfrac{\beta^k}{k!} P^k v$
  • Logarithmic kernel (log-γ): $x(\gamma) = z_\gamma \sum_k \dfrac{\gamma^k}{k} P^k v$

where $z_\alpha$, $z_\beta$, $z_\gamma$ are the normalization constants of the respective pmfs.

SLIDE 15

Linear systems

Closed-form expression of the coefficient matrix:

$A_\rho(P) x = v, \quad A_\rho(P) = f_\rho^{-1}(P)$

Particular instances:
  • Brin-Page model: $A_\alpha(P) = (1 - \alpha)^{-1} (I - \alpha P)$
  • Chung's model: $A_\beta(P) = e^{\beta (I - P)}$, the inverse of $f_\beta(P) = e^{-\beta (I - P)}$
  • log-γ model: $A_\gamma(P) = \ln(1 - \gamma) \, \ln^{-1}(I - \gamma P)$

– Except for the Brin-Page model, explicit formation of the coefficient matrix is unnecessary
– This formulation is used for derivation of the differential transition equation (next)
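As a sanity check of $A_\rho(P) x = v$, the sketch below computes x from the kernel series on a made-up 3-node graph and then applies the coefficient matrix, using only products with P. For Chung's model both $f_\beta(P) = e^{-\beta(I-P)}$ and its inverse $e^{\beta(I-P)}$ are applied as truncated power series. The graph, the parameter values, and the helper `series` are assumptions for illustration.

```python
import math

# Hypothetical 3-node graph with edges 0->1, 0->2, 1->2, 2->0 (column-stochastic P).
def apply_P(x):
    return [x[2], 0.5 * x[0], 0.5 * x[0] + x[1]]

def series(v, coeff, K=80):
    """Return sum_{k<K} coeff(k) * P^k v."""
    x, pk_v = [0.0] * len(v), v[:]
    for k in range(K):
        c = coeff(k)
        x = [xi + c * pi for xi, pi in zip(x, pk_v)]
        pk_v = apply_P(pk_v)
    return x

v = [1.0, 0.0, 0.0]
alpha, beta = 0.5, 2.0

# Brin-Page: x = f_alpha(P) v, then A_alpha(P) x = (1 - alpha)^{-1} (I - alpha P) x.
x_a = series(v, lambda k: (1 - alpha) * alpha ** k)
Px = apply_P(x_a)
v_from_a = [(xi - alpha * pi) / (1 - alpha) for xi, pi in zip(x_a, Px)]

# Chung: x = e^{-beta (I - P)} v; applying e^{beta (I - P)} = e^{beta} e^{-beta P} recovers v.
x_b = series(v, lambda k: math.exp(-beta) * beta ** k / math.factorial(k))
v_from_b = series(x_b, lambda k: math.exp(beta) * (-beta) ** k / math.factorial(k))
```

Both `v_from_a` and `v_from_b` should reproduce v up to rounding, without any matrix being formed explicitly.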

SLIDE 16

Differential transition

Effect of damping variation in one model: node-wise trajectory of PageRank vector $\dot{x}(\rho)$

$\dot{x}(\rho) = \dfrac{d}{d\rho} x(\rho) = \dfrac{\partial}{\partial \rho} f_\rho(P) v = Q_\rho(P) \, x(\rho)$ at any particular value of ρ

Particular instances:
  • Brin-Page model: $Q_\alpha(P) = P (I - \alpha P)^{-1} - (1 - \alpha)^{-1} I$
  • Chung's model: $Q = -(I - P)$
  • log-γ model: $Q_\gamma(P) = [(1 - \gamma) \ln(1 - \gamma)]^{-1} I - P (I - \gamma P)^{-1} [\ln(I - \gamma P)]^{-1}$

Computation:
  • Matrix-vector multiplication for Chung's model
  • Linear solver may be used once again for the Brin-Page model
  • An efficient spectrum-based algorithm for all models, without eigen-decomposition of P
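For Chung's model the trajectory needs only one matrix-vector product, which makes it easy to check against a finite difference. The sketch below does that on a made-up 3-node graph. The graph, β = 2, and the step h are assumptions for illustration.

```python
import math

# Hypothetical 3-node graph with edges 0->1, 0->2, 1->2, 2->0 (column-stochastic P).
def apply_P(x):
    return [x[2], 0.5 * x[0], 0.5 * x[0] + x[1]]

def chung_pagerank(v, beta, K=80):
    """x(beta) = e^{-beta} sum_k beta^k / k! P^k v, truncated at K terms."""
    x, pk_v, w = [0.0] * len(v), v[:], math.exp(-beta)
    for k in range(K):
        x = [xi + w * pi for xi, pi in zip(x, pk_v)]
        pk_v = apply_P(pk_v)
        w *= beta / (k + 1)
    return x

v = [1.0, 0.0, 0.0]
beta, h = 2.0, 1e-4

# Analytical trajectory: x_dot = Q x with Q = -(I - P), i.e. x_dot = P x - x.
x = chung_pagerank(v, beta)
x_dot = [pi - xi for pi, xi in zip(apply_P(x), x)]

# Central finite difference in beta for comparison.
x_plus, x_minus = chung_pagerank(v, beta + h), chung_pagerank(v, beta - h)
x_dot_fd = [(a - b) / (2 * h) for a, b in zip(x_plus, x_minus)]
err = max(abs(a - b) for a, b in zip(x_dot, x_dot_fd))
```

The entries of `x_dot` sum to zero, as they must for a trajectory that stays on the probability simplex.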

SLIDE 18

Inter-model correspondence

Statistically similar damping level of propagation on P: models are matched at the expected propagation weight center

$\mu(w_k(\rho)) = \sum_{k \in N_w} k \cdot w_k(\rho)$

  • Brin-Page ↔ Chung's: $\dfrac{\alpha}{1 - \alpha} = \beta$
  • Brin-Page ↔ log-γ: $\dfrac{\alpha}{1 - \alpha} = \dfrac{\gamma}{1 - \gamma} \cdot \dfrac{-1}{\ln(1 - \gamma)}$

[Figure: pmfs associated with Brin-Page, Chung's, and log-γ model, at corresponding damping variables (α = 0.85, β ≈ 5.67, γ ≈ 0.94)]
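The corresponding damping variables follow from equating the pmf means. The sketch below solves the log-γ equation by bisection, a method chosen here for simplicity (the slides do not say how γ was obtained). The function names are hypothetical.

```python
import math

def corresponding_beta(alpha):
    """Chung's beta whose Poisson mean equals the geometric mean alpha / (1 - alpha)."""
    return alpha / (1.0 - alpha)

def log_mean(gamma):
    """Mean of the logarithmic pmf: gamma / ((1 - gamma) * (-ln(1 - gamma)))."""
    return gamma / ((1.0 - gamma) * -math.log(1.0 - gamma))

def corresponding_gamma(alpha, tol=1e-12):
    """Solve log_mean(gamma) = alpha / (1 - alpha) by bisection.

    Assumes alpha > 0.5: the logarithmic pmf lives on k >= 1, so its mean exceeds 1.
    """
    target = alpha / (1.0 - alpha)
    lo, hi = 1e-9, 1.0 - 1e-12
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if log_mean(mid) < target:  # log_mean is increasing in gamma
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

beta_085 = corresponding_beta(0.85)
gamma_085 = corresponding_gamma(0.85)
beta_095 = corresponding_beta(0.95)
gamma_095 = corresponding_gamma(0.95)
```

For α = 0.85 this gives β = 17/3 and γ close to 0.9415, and for α = 0.95 it gives β = 19 and γ close to 0.9883.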

SLIDE 19

Intra-model damping effect by KL divergence and its derivative

Aggregated effect of damping variation: KL divergence of PageRank vectors (scalar)

$KL(x(\rho), x(\rho_o)) = \sum_i x_i(\rho) \log \dfrac{x_i(\rho)}{x_i(\rho_o)}$

$\dfrac{d}{d\rho} KL(x(\rho), x(\rho_o)) = \dot{x}(\rho)^T (\log x(\rho) - \log x(\rho_o) + e)$

[Figure: Damping variation in KL and dKL/dρ (Twitter-www, Brin-Page model). Curves: KL divergence; analytical derivative; empirical derivative at two settings, 0.004 and 0.002]

* dKL/dρ in red, KL in blue
* reference damping factor denoted as $\rho_o$
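The analytical derivative can be compared with an empirical one on a toy problem. The sketch below uses the Brin-Page model on a made-up 3-node graph, obtains $\dot{x}(\alpha)$ by differentiating the Neumann series term by term, and takes a central difference of KL as the empirical derivative. The graph, the step h, and the truncation length are assumptions, not the settings behind the figure.

```python
import math

# Hypothetical 3-node graph with edges 0->1, 0->2, 1->2, 2->0 (column-stochastic P).
def apply_P(x):
    return [x[2], 0.5 * x[0], 0.5 * x[0] + x[1]]

def brin_page(v, alpha, K=500):
    """Return x(alpha) and x_dot(alpha) from the truncated Neumann series."""
    n = len(v)
    x, x_dot, pk_v = [0.0] * n, [0.0] * n, v[:]
    for k in range(K):
        w = (1 - alpha) * alpha ** k              # w_k(alpha)
        dw = -alpha ** k                          # d w_k / d alpha, first part
        if k > 0:
            dw += (1 - alpha) * k * alpha ** (k - 1)
        x = [a + w * p for a, p in zip(x, pk_v)]
        x_dot = [a + dw * p for a, p in zip(x_dot, pk_v)]
        pk_v = apply_P(pk_v)
    return x, x_dot

def kl(x, y):
    return sum(a * math.log(a / b) for a, b in zip(x, y))

v = [1 / 3] * 3                  # uniform v keeps every PageRank entry positive
alpha0, alpha, h = 0.85, 0.90, 1e-5
x0, _ = brin_page(v, alpha0)
x, x_dot = brin_page(v, alpha)

# Analytical: x_dot^T (log x - log x0 + e).
analytical = sum(d * (math.log(a) - math.log(b) + 1.0) for d, a, b in zip(x_dot, x, x0))
# Empirical: central difference of KL in alpha.
empirical = (kl(brin_page(v, alpha + h)[0], x0) - kl(brin_page(v, alpha - h)[0], x0)) / (2 * h)
```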

SLIDE 21

Personalized, stochastic Krylov space

Personalized, stochastic Krylov (PSK) space:

$PSK(P, v) = \mathrm{span}\{v, Pv, P^2 v, \cdots, P^k v, \cdots\}, \quad v \ge 0, \; e^T v = 1$

Properties:
  • Any convex combination of the Krylov vectors is a probability distribution
  • The same PSK space is shared by all models, housing all model solutions and their trajectories
  • The PSK space is of finite dimension m
  • Let $K = [v, Pv, P^2 v, \cdots, P^{m-1} v]$ and $K = QR$. There exists a Hessenberg matrix H such that $PQ = QH$, $Q e_1 = v$ and that $g(P) v = Q \, g(H) e_1$ for any function g

[Figure: link graph P, personalized vector v, and the Krylov vectors $v, Pv, P^2 v, \cdots, P^{m-1} v$]

PageRank vector: $x(\rho) = f_\rho(P) v \in PSK(P, v)$
PageRank vector trajectory: $\dot{x}(\rho) = Q_\rho(P) x(\rho) \in PSK(P, v)$
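The identity $g(P) v = Q \, g(H) e_1$ can be tried out on a toy graph. The sketch below builds Q and H with an Arnoldi-style modified Gram-Schmidt process (a swapped-in, standard way to obtain them, not necessarily the authors' procedure) and evaluates the geometric kernel through H. It normalizes the first column of Q to unit 2-norm, so a factor $\|v\|_2$ appears explicitly; all names and the 3-node graph are assumptions.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def arnoldi(apply_P, v, max_m, tol=1e-12):
    """Return Q (list of orthonormal columns), H (list of columns), and ||v||_2."""
    scale = math.sqrt(dot(v, v))
    Q, H = [[x / scale for x in v]], []
    for j in range(max_m):
        w = apply_P(Q[j])
        col = []
        for q in Q:                        # modified Gram-Schmidt against earlier columns
            h = dot(q, w)
            col.append(h)
            w = [wi - h * qi for wi, qi in zip(w, q)]
        norm = math.sqrt(dot(w, w))
        if norm < tol or j + 1 == max_m:   # Krylov space is invariant (or cap reached)
            H.append(col)
            break
        col.append(norm)                   # subdiagonal entry of column j
        H.append(col)
        Q.append([wi / norm for wi in w])
    return Q, H, scale

def kernel_via_hessenberg(Q, H, scale, weights):
    """Evaluate sum_k weights[k] P^k v as scale * Q * (sum_k weights[k] H^k e1)."""
    m = len(H)
    y, c = [1.0] + [0.0] * (m - 1), [0.0] * m
    for w_k in weights:
        c = [ci + w_k * yi for ci, yi in zip(c, y)]
        nxt = [0.0] * m
        for j, col in enumerate(H):        # y <- H y, using the column storage
            for i, h in enumerate(col):
                nxt[i] += h * y[j]
        y = nxt
    return [scale * sum(Q[i][r] * c[i] for i in range(m)) for r in range(len(Q[0]))]

# Hypothetical 3-node graph with edges 0->1, 0->2, 1->2, 2->0 (column-stochastic P).
def apply_P(x):
    return [x[2], 0.5 * x[0], 0.5 * x[0] + x[1]]

v = [1.0, 0.0, 0.0]
alpha = 0.85
weights = [(1 - alpha) * alpha ** k for k in range(300)]

Q, H, scale = arnoldi(apply_P, v, max_m=3)
x_krylov = kernel_via_hessenberg(Q, H, scale, weights)

# Reference: the same truncated series applied directly to P.
x_direct, pk_v = [0.0] * 3, v[:]
for w_k in weights:
    x_direct = [a + w_k * p for a, p in zip(x_direct, pk_v)]
    pk_v = apply_P(pk_v)
```

Once Q and H are available, every model and every damping value reuses them, and only the small m × m problem changes.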

SLIDE 22

Efficient algorithm for damping effect analysis

Intra-model and inter-model damping variations, across all models under consideration, based on the PSK properties, without eigen-decomposition

[Diagram: algorithm pipeline]
  • Input: P (n×n), v (n×1)
  • Krylov space construction: K (n×m Krylov matrix)
  • QR decomposition: Q (n×m), R (m×m)
  • PQ = QH: H (m×m Hessenberg matrix)
  • $g(P) v = Q \, g(H) e_1$: PageRank distributions $\{x(\rho)\}$ and PageRank distribution trajectories $\{\dot{x}(\rho)\}$

SLIDE 24

Data: real-world large social and knowledge network snapshots

Dataset          Total #nodes   #nodes in LSCC   [max(d_out), µ(d_out), max(d_in)]
Google ¹              875,713          434,818   [4209, 8.86, 382]
Wikilink ²         12,150,976        7,283,915   [7527, 50.48, 920207]
DBpedia ³          18,268,992        3,796,073   [8104, 26.76, 414924]
Twitter(www) ⁴     41,652,230       33,479,734   [2936232, 42.65, 768552]
Twitter(mpi) ⁵     52,579,682       40,012,384   [778191, 47.57, 3438929]
Friendster ⁶       68,349,466       48,928,140   [3124, 32.76, 3124]

¹ Google Inc. (2002)
² Wikipedia Foundation (2017)
³ DBpedia (2017)
⁴ H. Kwak et al. (2009)
⁵ M. Cha et al. (2010)
⁶ ArchiveTeam (2011)

SLIDE 25

Sparse real-world networks under Dulmage-Mendelsohn permutation

[Figure: block sparsity patterns after Dulmage-Mendelsohn permutation, one panel per dataset: Google (τ = 8); DBpedia (τ = 2); Wikilink (τ = 2); Twitter(www) (τ = 2); Twitter(mpi) (τ = 3); Friendster (τ = 3)]

Each point represents a 1000 × 1000 block; a block with ≥ τ non-zeros is colored blue

SLIDE 26

Personalized stochastic Krylov space: small-world phenomenon

[Figure: one panel per dataset, horizontal axis from 10 to 80: Google (m = 62); DBpedia (m = 19); Wikilink (m = 27); Twitter(www) (m = 25); Twitter(mpi) (m = 30); Friendster (m = 24)]

Effective PSK(P, v) dimension m by $R_{ii}$ in QR decomposition
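One plausible reading of "dimension m by $R_{ii}$" is to count the diagonal entries of R that stay above a small relative threshold. The sketch below does this with modified Gram-Schmidt on the Krylov vectors. The threshold 1e-10, the function names, and the 4-node cycle are assumptions, since the slides do not state the cutoff used.

```python
import math

def krylov_r_diagonal(apply_P, v, steps, tol=1e-10):
    """|R_ii| from modified Gram-Schmidt QR of K = [v, Pv, ..., P^(steps-1) v]."""
    Q, diag, col = [], [], v[:]
    for _ in range(steps):
        w = col[:]
        for q in Q:
            h = sum(a * b for a, b in zip(q, w))
            w = [wi - h * qi for wi, qi in zip(w, q)]
        r = math.sqrt(sum(wi * wi for wi in w))
        diag.append(r)
        if r > tol * diag[0]:          # keep only directions that are numerically new
            Q.append([wi / r for wi in w])
        col = apply_P(col)
    return diag

def effective_dimension(diag, tol=1e-10):
    """Count diagonal entries above a relative threshold (an assumed criterion)."""
    return sum(1 for r in diag if r > tol * diag[0])

# Hypothetical 4-node directed cycle 0 -> 1 -> 2 -> 3 -> 0.
def apply_P(x):
    return [x[3], x[0], x[1], x[2]]

m_point = effective_dimension(krylov_r_diagonal(apply_P, [1.0, 0.0, 0.0, 0.0], steps=6))
m_uniform = effective_dimension(krylov_r_diagonal(apply_P, [0.25] * 4, steps=6))
```

A point mass explores the whole cycle, so its PSK space has dimension 4, while the uniform vector is already stationary and spans a space of dimension 1.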

SLIDE 27

Damping effect: KL and dKL/dρ across models

[Figure: KL divergence, analytical derivative, and empirical derivative (two settings, 0.004 and 0.002) against the damping variable, one panel per reference value: α0 = 0.85; γ0 = 0.94146; β0 = 5.666...; α0 = 0.95; γ0 = 0.98831; β0 = 19]

Twitter(www) dataset:
  • substantially different sensitivity patterns across models
  • the B-P model and the log-γ model are sensitive when the damping parameter approaches 1
  • Chung's model is less sensitive to damping parameter change, especially with large β

SLIDE 28

Damping effect: KL and dKL/dρ across datasets

[Figure: KL divergence, analytical derivative, and empirical derivative against α from 0.7 to 1, one panel per dataset: Google; DBpedia; Wikilink; Twitter(www); Twitter(mpi); Friendster. Empirical derivative settings are 0.004 and 0.002, except 0.008 and 0.002 in the Google panel]

Brin-Page model, α0 = 0.85:
  • similar trend across 6 datasets
  • low variation with relatively small α
  • substantially larger variation when α → 1

SLIDE 29

Intra-model variation: PageRank vector profiles across models

[Figure: PageRank vector profiles, one panel per model: Brin-Page model; Chung's model; log-γ model. Axes: PageRank value (log scale), # of nodes (bin counts), and a damping axis with ticks 0.7 to 0.9, 10 to 30, and 0.8 to 0.95 respectively]

PageRank vector profile: normalized histogram of PageRank values
Twitter(www) dataset

SLIDE 30

Intra-model variation: PageRank vector profiles across datasets

[Figure: PageRank vector profiles, # of nodes (bin counts) against PageRank value (log scale), one panel per dataset: Google; DBpedia; Wikilink; Twitter(www); Twitter(mpi); Friendster]

Brin-Page model, α0 = 0.85

SLIDE 31

Recap

Intellectual merits

  • Rich family of PageRank models: capturing, differentiating various activities and propagation patterns with quantitative form and speed
  • Unified analysis of damping effects: easily instantiated on particular network P and personalized vector v
  • The PSK space: residence for all model solutions, foundation for efficient model solution methods

Experimental findings

⋄ Model utility: inter-model difference in PageRank distribution profile is much greater than intra-model difference
⋄ Bump/peak in PageRank distribution: single, with minority support
⋄ The PSK dimension: with small-world networks, the dimension of personalized, stochastic Krylov space is low, which leads to upper bounds on algorithm complexity

SLIDE 32

Thank you!

Tiancheng Liu – tcliu [at] cs.duke.edu
