

slide-1
SLIDE 1

CS224W: Social and Information Network Analysis
Jure Leskovec, Stanford University

http://cs224w.stanford.edu

slide-2
SLIDE 2

How to organize/navigate it?

First try: Web directories

  • Yahoo
  • DMOZ
  • LookSmart

11/29/2010 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 2

slide-3
SLIDE 3

SEARCH!

Find relevant docs in a small and trusted set:

  • Newspaper articles
  • Patents, etc.

Two traditional problems:

  • Synonymy: buy – purchase, sick – ill
  • Polysemy: jaguar

slide-4
SLIDE 4

Do more documents mean better results?

slide-5
SLIDE 5

What is the “best” answer to the query “Stanford”?

  • Anchor text: “I go to Stanford where I study”

What about the query “newspaper”?

  • No single right answer

Scarcity (IR) vs. abundance (Web) of information

  • Web: many sources of information. Who to “trust”?

Trick:

  • Pages that actually know about newspapers might all be pointing to many newspapers

Ranking!

slide-6
SLIDE 6

Goal (back to the newspaper example):

  • Don’t just find newspapers. Find “experts”: people who link in a coordinated way to good newspapers

Idea: Links as votes

  • A page is more important if it has more links
  • In-coming links? Out-going links?

Hubs and Authorities

  • Quality as an expert (hub): total sum of votes of pages pointed to
  • Quality as content (authority): total sum of votes of experts
  • Principle of repeated improvement

Figure: example vote counts: NYT: 10, Ebay: 3, Yahoo: 3, CNN: 8, WSJ: 9

slide-7
SLIDE 7


slide-8
SLIDE 8


slide-9
SLIDE 9


slide-10
SLIDE 10

[Kleinberg ‘98]

Each page i has 2 kinds of scores:

  • Hub score: $h_i$
  • Authority score: $a_i$

HITS algorithm:

  • Initialize: $a_i = h_i = 1$
  • Then keep iterating:
  • Authority: $a_i = \sum_{j \to i} h_j$
  • Hub: $h_i = \sum_{i \to j} a_j$
  • Normalize: $\sum_i a_i = 1$, $\sum_i h_i = 1$
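The iteration above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `hits`, the dictionary-of-out-links representation, and the toy graph are all hypothetical, and each round uses the freshly computed authority scores when updating hubs.

```python
def hits(out_links, iterations=50):
    """Iterate the hub/authority updates on a dict {page: [pages it links to]}."""
    pages = set(out_links) | {q for targets in out_links.values() for q in targets}
    auth = {p: 1.0 for p in pages}
    hub = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority: a_i = sum of hub scores of pages j that link to i.
        new_auth = {p: 0.0 for p in pages}
        for j, targets in out_links.items():
            for i in targets:
                new_auth[i] += hub[j]
        # Hub: h_i = sum of the (new) authority scores of pages that i links to.
        new_hub = {p: sum(new_auth[j] for j in out_links.get(p, [])) for p in pages}
        # Normalize so that each score vector sums to 1.
        a_total = sum(new_auth.values()) or 1.0
        h_total = sum(new_hub.values()) or 1.0
        auth = {p: v / a_total for p, v in new_auth.items()}
        hub = {p: v / h_total for p, v in new_hub.items()}
    return hub, auth

# Hypothetical toy graph: two "expert" pages pointing to newspapers.
toy_links = {"expert1": ["nyt", "wsj"], "expert2": ["nyt"]}
hub, auth = hits(toy_links)
```

In this toy graph the page linked by both experts ends up with the higher authority score, and the expert linking to more authorities ends up with the higher hub score.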

slide-11
SLIDE 11

[Kleinberg ‘98]

HITS converges to a single stable point

Slightly change the notation:

  • Vector $a = (a_1, \dots, a_n)$, $h = (h_1, \dots, h_n)$
  • Adjacency matrix (n x n): $M_{ij} = 1$ if $i \to j$

Then: $h_i = \sum_{i \to j} a_j \iff h_i = \sum_j M_{ij} a_j$

So: $h = Ma$

And likewise: $a = M^T h$

slide-12
SLIDE 12

Algorithm in new notation:

  • Set: $a = h = 1^n$
  • Repeat:
  • $h = Ma$, $a = M^T h$
  • Normalize

Then: $a = M^T (Ma)$, where $Ma$ is the new h and the result is the new a

a is being updated (in 2 steps): $M^T (Ma) = (M^T M) a$

h is updated (in 2 steps): $M (M^T h) = (M M^T) h$

Thus, in 2k steps (repeated matrix powering):

$a = (M^T M)^k a$

$h = (M M^T)^k h$

slide-13
SLIDE 13

Definition:

  • Let $Ax = \lambda x$ for some scalar $\lambda$, vector $x$ and matrix $A$
  • Then $x$ is an eigenvector, and $\lambda$ is its eigenvalue

Fact:

  • If A is symmetric ($A_{ij} = A_{ji}$) (in our case $M^T M$ and $M M^T$ are symmetric)
  • Then A has n orthogonal unit eigenvectors $w_1 \dots w_n$ that form a basis (coordinate system) with eigenvalues $\lambda_1 \dots \lambda_n$ ($|\lambda_i| \ge |\lambda_{i+1}|$)

slide-14
SLIDE 14

Write x in coordinate system $w_1 \dots w_n$: $x = \sum_i \alpha_i w_i$

  • x has coordinates $(\alpha_1, \dots, \alpha_n)$

Suppose: $\lambda_1 \dots \lambda_n$ ($|\lambda_1| \ge |\lambda_2| \ge \dots \ge |\lambda_n|$)

$A^k x = (\lambda_1^k \alpha_1, \lambda_2^k \alpha_2, \dots, \lambda_n^k \alpha_n) = \sum_i \lambda_i^k \alpha_i w_i$

As $k \to \infty$, if we normalize: $A^k x \to \lambda_1 \alpha_1 w_1$ (all other coordinates $\to 0$)

So authority a is the eigenvector of $M^T M$ associated with the largest eigenvalue $\lambda_1$
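A small numerical check of the eigenvector claim, as a sketch only: the 3-page adjacency matrix, the helper names, and the step count below are hypothetical. It repeatedly applies $M^T M$ to the all-ones vector with normalization, then tests whether the result satisfies $Aa = \lambda a$.

```python
def mat_vec(A, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]

def normalize(x):
    """Rescale so that the absolute values of the entries sum to 1."""
    total = sum(abs(v) for v in x)
    return [v / total for v in x]

def power_method(A, steps=100):
    """Repeatedly apply A and normalize, starting from the all-ones vector."""
    x = [1.0] * len(A)
    for _ in range(steps):
        x = normalize(mat_vec(A, x))
    return x

# Hypothetical 3-page graph: M[i][j] = 1 if page i links to page j.
M = [[0, 1, 1],
     [0, 0, 1],
     [1, 0, 0]]
n = len(M)
# A = M^T M, so A[i][j] = sum_k M[k][i] * M[k][j].
MtM = [[sum(M[k][i] * M[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

a = power_method(MtM)
Aa = mat_vec(MtM, a)
# For an eigenvector, A a = lambda * a; estimate lambda from the largest coordinate.
top = max(range(n), key=lambda i: a[i])
lam = Aa[top] / a[top]
residual = max(abs(Aa[i] - lam * a[i]) for i in range(n))
```

For this matrix the estimated `lam` agrees with the largest eigenvalue of $M^T M$, and `residual` is numerically zero.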

slide-15
SLIDE 15

A vote from an important page is worth more

A page is important if it is pointed to by other important pages

Define a “rank” $r_j$ for node j. $r_j$ should be proportional to:

$r_j = \sum_{i \to j} \frac{r_i}{\mathrm{outdegree}(i)}$

Figure: “The web in 1839”: nodes y, a, m with edge flows y/2, y/2, a/2, a/2, m

Flow equations:

  y = y/2 + a/2
  a = y/2 + m
  m = a/2
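The flow equations fix the ranks only up to a common scale. A worked solution, assuming the extra normalization $y + a + m = 1$ (an assumption added here so that the solution is unique):

```latex
\begin{aligned}
y &= \tfrac{y}{2} + \tfrac{a}{2} &&\Rightarrow\; y = a \\
m &= \tfrac{a}{2} \\
y + a + m &= 1 &&\Rightarrow\; a + a + \tfrac{a}{2} = 1 \;\Rightarrow\; a = \tfrac{2}{5} \\
(y, a, m) &= \left(\tfrac{2}{5}, \tfrac{2}{5}, \tfrac{1}{5}\right)
\end{aligned}
```

The remaining equation is satisfied as well: $y/2 + m = 1/5 + 1/5 = 2/5 = a$.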

slide-16
SLIDE 16

Stochastic adjacency matrix M

  • Let page j have $d_j$ out-links
  • If $j \to i$, then $M_{ij} = 1/d_j$, else $M_{ij} = 0$
  • M is a column stochastic matrix
  • Columns sum to 1

Rank vector r: vector with 1 entry per page

  • $r_i$ is the importance score of page i
  • $|r| = 1$

The flow equations can be written

r = Mr

11/29/2010 Jure Leskovec & Anand Rajaraman, Stanford CS345a: Data Mining 16

slide-17
SLIDE 17

Imagine a random web surfer:

  • At any time t, surfer is on some page u
  • At time t+1, the surfer follows an out-link from u uniformly at random
  • Ends up on some page v linked from u
  • Process repeats indefinitely

Let:

  • p(t) … vector whose i-th coordinate is the prob. that the surfer is at page i at time t
  • p(t) is a probability distribution over pages

slide-18
SLIDE 18

Where is the surfer at time t+1?

  • Follows a link uniformly at random: p(t+1) = Mp(t)

Suppose the random walk reaches a state p(t+1) = Mp(t) = p(t)

  • then p(t) is a stationary distribution of the random walk

Our rank vector r satisfies r = Mr

  • So it is a stationary distribution for the random surfer

slide-19
SLIDE 19

Power Iteration:

  • Set $r_i = 1$
  • $r_j = \sum_{i \to j} r_i / d_i$
  • And iterate

Matrix M (rows and columns ordered y, a, m):

        y     a     m
  y     ½     ½     0
  a     ½     0     1
  m     0     ½     0

Example (successive iterates):

  y     1     1     5/4    9/8    …    6/5
  a  =  1     3/2   1      11/8   …    6/5
  m     1     ½     ¾      ½      …    3/5
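The example iterates can be reproduced with exact fractions. This is a minimal sketch: the function name `power_iteration` and the edge lists (read off the columns of the slide's matrix) are assumptions of this illustration.

```python
from fractions import Fraction

def power_iteration(out_links, steps):
    """Apply r_j = sum over i -> j of r_i / d_i for a fixed number of steps."""
    rank = {page: Fraction(1) for page in out_links}  # Set r_i = 1
    history = [rank]
    for _ in range(steps):
        new_rank = {page: Fraction(0) for page in out_links}
        for i, targets in out_links.items():
            for j in targets:
                new_rank[j] += rank[i] / len(targets)  # d_i = out-degree of i
        rank = new_rank
        history.append(rank)
    return history

# Edges read off the matrix columns: y -> {y, a}, a -> {y, m}, m -> {a}.
links = {"y": ["y", "a"], "a": ["y", "m"], "m": ["a"]}
history = power_iteration(links, steps=3)
for page in links:
    print(page, [str(r[page]) for r in history])
```

The three steps give 1, 5/4, 9/8 for y, 3/2, 1, 11/8 for a, and 1/2, 3/4, 1/2 for m, matching the first columns of the example.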

slide-20
SLIDE 20

Some pages are “dead ends” (have no out-links)

  • Such pages cause importance to leak out

Spider traps (all out-links are within the group)

  • Eventually spider traps absorb all importance

slide-21
SLIDE 21

Power Iteration:

  • Set $r_i = 1$
  • $r_j = \sum_{i \to j} r_i / d_i$
  • And iterate

Matrix M (rows and columns ordered y, a, m):

        y     a     m
  y     ½     ½     0
  a     ½     0     0
  m     0     ½     0

Example (successive iterates):

  y     1     1     ¾     5/8    …
  a  =  1     ½     ½     3/8    …
  m     1     ½     ¼     ¼      …

slide-22
SLIDE 22

Power Iteration:

  • Set $r_i = 1$
  • $r_j = \sum_{i \to j} r_i / d_i$
  • And iterate

Matrix M (rows and columns ordered y, a, m):

        y     a     m
  y     ½     ½     0
  a     ½     0     0
  m     0     ½     1

Example (successive iterates):

  y     1     1     ¾     5/8    …    0
  a  =  1     ½     ½     3/8    …    0
  m     1     3/2   7/4   2      …    3
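Both failure modes can be checked on the three-page example. In this sketch the helper `iterate` and the two edge lists are hypothetical: the first graph makes m a dead end, the second makes m a spider trap that links only to itself.

```python
from fractions import Fraction

def iterate(out_links, steps):
    """Run r_j = sum over i -> j of r_i / d_i, starting from r_i = 1."""
    rank = {p: Fraction(1) for p in out_links}
    for _ in range(steps):
        new_rank = {p: Fraction(0) for p in out_links}
        for i, targets in out_links.items():
            for j in targets:
                new_rank[j] += rank[i] / len(targets)
        rank = new_rank
    return rank

dead_end = {"y": ["y", "a"], "a": ["y", "m"], "m": []}        # m has no out-links
spider_trap = {"y": ["y", "a"], "a": ["y", "m"], "m": ["m"]}  # m links only to itself

leaked = iterate(dead_end, 3)
trapped = iterate(spider_trap, 3)
print(sum(leaked.values()))   # total importance has dropped below the initial 3
print(trapped["m"])           # m keeps accumulating importance
```

With the dead end, the total importance shrinks at every step; with the spider trap, the total stays at 3 but shifts toward m.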

slide-23
SLIDE 23

At each step, random surfer has two options:

  • With probability $1-\beta$, follow a link at random
  • With probability $\beta$, jump to some page uniformly at random

PageRank equation:

$r_j = (1-\beta) \sum_{i \to j} r_i / d_i + \beta$

$d_i$ … outdegree of node i

slide-24
SLIDE 24

PageRank as a principal eigenvector

$r = Mr \iff r_j = \sum_{i \to j} r_i / d_i$

$d_i$ … outdegree of node i

But we really want:

$r_j = (1-\beta) \sum_{i \to j} r_i / d_i + \beta \frac{1}{n} \sum_i r_i$

Define:

$M'_{ij} = (1-\beta) M_{ij} + \beta \frac{1}{n}$

Then: $r = M'r$

What is $\beta$?

  • In practice $\beta = 0.15$ (5 links and jump)
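A minimal sketch of power iteration with teleporting, writing beta for the jump probability (the symbol, the function name `pagerank`, and the example graph are assumptions of this illustration). It assumes every page has at least one out-link; dead ends would need separate handling.

```python
def pagerank(out_links, beta=0.15, steps=100):
    """Power iteration for r = M'r with M'_ij = (1 - beta) * M_ij + beta / n."""
    pages = list(out_links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(steps):
        total = sum(rank.values())
        # Teleport term: beta * (1/n) * sum_i r_i, the same for every page.
        new_rank = {p: beta * total / n for p in pages}
        for i, targets in out_links.items():
            for j in targets:
                # Link-following term: (1 - beta) * r_i / d_i along each edge i -> j.
                new_rank[j] += (1 - beta) * rank[i] / len(targets)
        rank = new_rank
    return rank

# The spider-trap graph: m links only to itself.
spider_trap = {"y": ["y", "a"], "a": ["y", "m"], "m": ["m"]}
rank = pagerank(spider_trap)
```

With teleporting, m still receives the largest score but no longer absorbs all of the importance: y and a keep strictly positive ranks.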

slide-25
SLIDE 25


slide-26
SLIDE 26

Goal: Evaluate pages not just by popularity but by how close they are to the topic

Teleporting can go to:

  • Any page with equal probability (we used this so far)
  • A topic-specific set of “relevant” pages: topic-specific (personalized) PageRank

$M'_{ij} = (1-\beta) M_{ij} + \beta c_i$

(c … teleport vector)

slide-27
SLIDE 27

Graphs and web search:

  • Ranks nodes by “importance”

Personalized PageRank:

  • Ranks proximity of nodes to the teleport nodes c

Proximity on graphs:

  • Q: What is the most related conference to ICDM?
  • Random Walks with Restarts
  • Teleport back: c = (0…0, 1, 0…0)

Figure: conference-author graph with conferences (IJCAI, ICDM, KDD, SDM, AAAI, NIPS, …) and authors (Philip S. Yu, Ning Zhong, M. Jordan, R. Ramakrishnan, …)
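A random walk with restarts can be sketched as personalized PageRank whose teleport vector has a single 1 at the query node. Everything concrete below is hypothetical: the function name, the restart probability of 0.15, and the tiny conference-author edge list, which is not the data behind the figure.

```python
def random_walk_with_restart(neighbors, source, restart=0.15, steps=200):
    """Personalized PageRank where every teleport returns to `source`."""
    score = {node: 0.0 for node in neighbors}
    score[source] = 1.0
    for _ in range(steps):
        new_score = {node: 0.0 for node in neighbors}
        new_score[source] = restart  # teleport vector c = (0...0, 1, 0...0)
        for u, nbrs in neighbors.items():
            for v in nbrs:
                new_score[v] += (1 - restart) * score[u] / len(nbrs)
        score = new_score
    return score

# Hypothetical conference-author edges, treated as undirected.
edges = [("ICDM", "author1"), ("KDD", "author1"), ("ICDM", "author2"),
         ("KDD", "author2"), ("SDM", "author2"), ("NIPS", "author3"),
         ("SDM", "author3")]
neighbors = {}
for venue, author in edges:
    neighbors.setdefault(venue, []).append(author)
    neighbors.setdefault(author, []).append(venue)

score = random_walk_with_restart(neighbors, "ICDM")
conferences = ["KDD", "SDM", "NIPS"]
ranked = sorted(conferences, key=lambda c: score[c], reverse=True)
```

In this toy graph the conference sharing the most authors with ICDM is ranked as the most related one.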

slide-28
SLIDE 28

Link Farms: networks of millions of pages designed to focus PageRank on a few undeserving webpages

To minimize their influence use a teleport set of trusted webpages

  • E.g., homepages of universities

slide-29
SLIDE 29

[Liben-Nowell & Kleinberg ‘03]

Link prediction task:

  • Given G[t0,t0’], a graph on edges up to time t0’
  • Output a ranked list L of links (not in G[t0,t0’]) that are predicted to appear in G[t1,t1’]

Evaluation:

  • n=|Enew|: # new edges that appear during the test period [t1,t1’]
  • Take top n elements of L and count correct edges
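The evaluation step can be sketched as follows. The function name, the treatment of links as undirected pairs, and the sample data are assumptions of this illustration.

```python
def top_n_correct(ranked_links, new_edges):
    """Count how many of the top n = |E_new| predictions are in E_new."""
    # Treat links as undirected pairs (an assumption for this sketch).
    truth = {frozenset(edge) for edge in new_edges}
    n = len(truth)
    top = ranked_links[:n]
    return sum(1 for link in top if frozenset(link) in truth)

# Hypothetical ranked predictions and hypothetical test-period edges.
predicted = [("a", "b"), ("c", "d"), ("a", "e"), ("b", "c")]
appeared = [("b", "a"), ("a", "e"), ("d", "e")]
correct = top_n_correct(predicted, appeared)
print(correct)
```

Here n = 3, so only the first three predictions are examined, and two of them appear among the new edges.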

slide-30
SLIDE 30

[Liben-Nowell & Kleinberg ‘03]

Predict links in an evolving collaboration network

Core: Since network data is very sparse

  • Consider only nodes with in-degree and out-degree of at least 3

slide-31
SLIDE 31

[Liben-Nowell & Kleinberg, CIKM ‘03]

Rank potential links (x,y) based on:

Γ(x) … degree of node x

slide-32
SLIDE 32

[Liben-Nowell & Kleinberg, CIKM ‘03]

slide-33
SLIDE 33

Recommend a list of possible friends

Supervised machine learning setting:

  • Training example: for every node s, have a list of nodes she will create links to {g1, …, gk}
  • Problem: learn a model that will, for a given node s, rank nodes {g1, …, gk} higher than other nodes in the network

How to combine node/edge attributes and network structure?

  • Let’s learn how to bias random walks!

Figure: node s with positive examples (g1, g2, g3) and negative examples (the other nodes)

slide-34
SLIDE 34

[WSDM ’11]

Let s be the center node

Let $f_w(u,v)$ be a function that assigns a strength to each edge:

$a_{uv} = f_w(u,v) = \exp(-w \Psi_{uv})$

  • $\Psi_{uv}$ is a feature vector: features of node u, features of node v, features of edge (u,v)
  • w is the parameter vector we want to learn

Do a random walk from s where transitions are according to edge strengths

How to learn $f_w(u,v)$?

Figure: center node s with neighbors v1, v2, v3

slide-35
SLIDE 35

[WSDM ’11]

Random walk transition matrix:

PageRank transition matrix:

  • with prob. α jump back to s

Compute PageRank vector: $p^T = p^T Q$

Rank nodes by $p_u$
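The two transition matrices did not survive as text, so the following is only a sketch of one plausible construction, not the slide's own formulas: edge strengths $a_{uv} = \exp(-w \Psi_{uv})$ are normalized over each node's out-edges, and with probability alpha the walk jumps back to s. All names, the feature values, and alpha = 0.15 are hypothetical.

```python
import math

def edge_strength(w, psi):
    """a_uv = exp(-w . psi_uv) for parameter vector w and feature vector psi_uv."""
    return math.exp(-sum(w_k * x_k for w_k, x_k in zip(w, psi)))

def walk_scores(features, w, source, alpha=0.15, steps=200):
    """Scores p_u of a random walk from `source` biased by edge strengths.

    `features` maps each directed edge (u, v) to its feature vector psi_uv.
    Assumes every node has at least one out-edge.
    """
    strength = {edge: edge_strength(w, psi) for edge, psi in features.items()}
    nodes = {u for u, _ in features} | {v for _, v in features}
    out_total = {u: 0.0 for u in nodes}
    for (u, _), a_uv in strength.items():
        out_total[u] += a_uv
    p = {u: 0.0 for u in nodes}
    p[source] = 1.0
    for _ in range(steps):
        new_p = {u: 0.0 for u in nodes}
        new_p[source] = alpha * sum(p.values())  # with prob. alpha jump back to s
        for (u, v), a_uv in strength.items():
            # Transition probability proportional to the edge strength a_uv.
            new_p[v] += (1 - alpha) * p[u] * a_uv / out_total[u]
        p = new_p
    return p

# Hypothetical one-dimensional features on a star around s, with edges back to s.
features = {("s", "v1"): [0.0], ("s", "v2"): [1.0], ("s", "v3"): [2.0],
            ("v1", "s"): [0.0], ("v2", "s"): [0.0], ("v3", "s"): [0.0]}
p = walk_scores(features, w=[1.0], source="s")
p_uniform = walk_scores(features, w=[0.0], source="s")
```

With w = [1.0] the walk prefers edges with small feature values, so v1 outranks v2 and v3; with w = [0.0] all edges have equal strength and the three neighbors tie.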

slide-36
SLIDE 36

[WSDM ’11]

Each node u has a score $p_u$

Destination nodes D={v1,…, vk}

No-link nodes L={the rest}

What do we want?

Hard constraints, make them soft

slide-37
SLIDE 37

[WSDM ’11]

Want to minimize:

  • Loss: $h(x) = 0$ if $x < 0$, $x^2$ else

How to minimize F?

$p_l$ and $p_d$ depend on w:

  • Given w, assign edge weights $a_{uv} = f_w(u,v)$
  • Using transition matrix $Q = [a_{uv}]$, compute PageRank scores $p_u$
  • Want to set w such that $p_l < p_d$
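The objective F itself is not recoverable from the slide text. The sketch below covers only the loss h and, as an assumption, a penalty that sums $h(p_l - p_d)$ over all pairs of a destination node d and a no-link node l; any further terms of F are omitted, and the scores are hypothetical.

```python
def h(x):
    """Loss from the slide: 0 if x < 0, x squared otherwise."""
    return 0.0 if x < 0 else x * x

def soft_constraint_penalty(p, destinations, no_link):
    """Sum h(p_l - p_d) over pairs, penalizing every violation of p_l < p_d."""
    return sum(h(p[l] - p[d]) for d in destinations for l in no_link)

# Hypothetical PageRank scores for one center node s.
p = {"v1": 0.30, "v2": 0.10, "v3": 0.25, "v4": 0.05}
penalty = soft_constraint_penalty(p, destinations=["v1", "v2"], no_link=["v3", "v4"])
```

Only the pair (v2, v3) violates $p_l < p_d$ here, so the penalty is $(0.25 - 0.10)^2$.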

slide-38
SLIDE 38

[WSDM ’11]

How to minimize F?

  • Take the derivative!

We know:

i.e.

So:

Looks like the PageRank equation!

slide-39
SLIDE 39

[WSDM ’11]

Iceland Facebook network

  • 174,000 nodes (55% of population)
  • Avg. degree 168
  • Avg. person added 26 new friends/month

For every node s:

  • Positive examples: D={ new friendships of s in Nov ‘09 }
  • Negative examples: L={ other nodes s did not create new links to }

slide-40
SLIDE 40

Node and Edge features for learning:

  • Node: age, gender, degree
  • Edge: age of an edge, communication, profile visits, co-tagged photos

Baselines:

  • Decision trees and logistic regression: above features + 10 network features (PageRank, common friends)

Evaluation:

  • AUC and precision at Top20

slide-41
SLIDE 41

Facebook: predicting your future friends

slide-42
SLIDE 42

Results:

  • 2.3x improvement over previous FB-PYMK system

How to scale to FB size?

  • FB network: >500 million people, >65 billion edges
  • 40 machines, each 72GB of RAM (total 2.8TB)
  • System makes 8.6 million suggestions per second