CS345a: Data Mining
Jure Leskovec and Anand Rajaraman
Stanford University
Instead of generic popularity, can we measure popularity within a topic?
E.g., computer science, health
Bias the
1/28/2010 2 Jure Leskovec & Anand Rajaraman, Stanford CS345a: Data Mining
Suppose S = {1}, β = 0.8

[Figure: example graph on nodes 1 to 4, with edge labels 0.2, 0.4, 0.5, 0.8, and 1]

Node   Iteration 0   1      2      ...   stable
1      1.0           0.2    0.52   ...   0.294
2      0             0.4    0.08   ...   0.118
3      0             0.4    0.08   ...   0.327
4      0             0      0.32   ...   0.261
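The iteration in the table above can be reproduced with a short power-iteration sketch. The text gives only S = {1} and β = 0.8; the edge list used here (1→2, 1→3, 2→1, 3→4, 4→3) is an assumption, chosen because it yields the tabulated values, and the function name and parameters are hypothetical. The sketch also assumes every node has at least one out-link.

```python
def topic_specific_pagerank(out_links, teleport_set, beta, iterations):
    """Power iteration where teleports land only on pages in teleport_set.

    Returns the list of rank vectors, one per iteration (index 0 = start).
    Assumes every node in out_links has at least one out-link.
    """
    nodes = sorted(out_links)
    # Start the random walker on the teleport set, spread uniformly.
    rank = {n: (1.0 / len(teleport_set) if n in teleport_set else 0.0)
            for n in nodes}
    history = [rank]
    for _ in range(iterations):
        # Teleport mass (1 - beta) is shared among teleport-set pages only.
        new_rank = {n: ((1.0 - beta) / len(teleport_set)
                        if n in teleport_set else 0.0)
                    for n in nodes}
        # The remaining mass follows out-links, split evenly per source page.
        for src, targets in out_links.items():
            share = beta * rank[src] / len(targets)
            for dst in targets:
                new_rank[dst] += share
        rank = new_rank
        history.append(rank)
    return history


# ASSUMED graph (not stated in the text): it reproduces the table's numbers.
EDGES = {1: [2, 3], 2: [1], 3: [4], 4: [3]}
history = topic_specific_pagerank(EDGES, teleport_set={1}, beta=0.8,
                                  iterations=100)

for step in (0, 1, 2, 100):
    print(step, [round(history[step][n], 3) for n in sorted(EDGES)])
```

Under these assumptions the printed rows for steps 1, 2, and 100 agree with the table's columns for iterations 1, 2, and "stable".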
NYT: 10, Ebay: 3, Yahoo: 3, CNN: 8, WSJ: 9
[Figure: example graph on Yahoo, Amazon, and M'soft, shown with a transposed matrix; one row, 1 0 1, is legible]
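The hub and authority update on a three-page example like the one above can be sketched as follows. The adjacency matrix is an assumption: the text preserves only the row 1 0 1 of the transposed matrix, and the matrix below is one choice whose transpose has that row for Amazon. Scaling each vector so its largest entry is 1 is also an assumed normalization, and the function name `hits` is hypothetical.

```python
def hits(adjacency, iterations=50):
    """Alternate authority and hub updates, rescaling so the max entry is 1.

    adjacency[i][j] is 1 if page i links to page j, else 0.
    Assumes the graph has at least one link, so no rescale divides by zero.
    """
    n = len(adjacency)
    hubs = [1.0] * n
    auths = [1.0] * n
    for _ in range(iterations):
        # Authority of j: sum of hub scores of the pages that link to j.
        auths = [sum(adjacency[i][j] * hubs[i] for i in range(n))
                 for j in range(n)]
        top = max(auths)
        auths = [a / top for a in auths]
        # Hub of i: sum of authority scores of the pages that i links to.
        hubs = [sum(adjacency[i][j] * auths[j] for j in range(n))
                for i in range(n)]
        top = max(hubs)
        hubs = [h / top for h in hubs]
    return hubs, auths


PAGES = ["Yahoo", "Amazon", "M'soft"]
# ASSUMED link matrix (rows = source page, columns = destination page).
A = [
    [1, 1, 1],  # Yahoo  -> Yahoo, Amazon, M'soft
    [1, 0, 1],  # Amazon -> Yahoo, M'soft
    [0, 1, 0],  # M'soft -> Amazon
]
hubs, auths = hits(A)
for name, h, a in zip(PAGES, hubs, auths):
    print(f"{name}: hub={h:.3f} authority={a:.3f}")
```

For this assumed matrix the scores settle at roughly hub = (1, 0.732, 0.268) and authority = (1, 0.732, 1). A different normalization, such as unit length, would change the scale but not the ranking.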
Hubs and Authorities
Most densely connected core (primary core)
Less densely connected core (secondary core)
[Figure: the web divided into Inaccessible, Accessible, and Own pages, with pages labeled t and 1, 2, ..., M]
N: number of pages on the web
M: number of pages the spammer owns
Very small; ignore
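The labels above match the usual link-farm analysis, which can be sketched as a short derivation. Only N and M come from the text; everything else is an assumption: β is the damping factor, x is the PageRank that accessible pages contribute to the page t, y is the PageRank of t, and the farm is assumed to consist of t linking to each of the M owned pages, each of which links only back to t. The term dropped in the last step is presumably the one annotated "Very small; ignore".

```latex
% Rank of each of the M farm pages (assumed structure: t links to all M pages)
r_{\text{farm}} = \frac{\beta y}{M} + \frac{1-\beta}{N}

% Rank of t: outside contribution, plus what the M farm pages send back,
% plus t's own teleport share
y = x + \beta M \left( \frac{\beta y}{M} + \frac{1-\beta}{N} \right) + \frac{1-\beta}{N}
  = x + \beta^{2} y + \frac{\beta (1-\beta) M}{N} + \frac{1-\beta}{N}

% Dropping the last term (1-\beta)/N and solving for y
y = \frac{x}{1-\beta^{2}} + \frac{\beta}{1+\beta} \cdot \frac{M}{N}
```

As an illustration, with β = 0.85 the multiplier 1/(1 − β²) is about 3.6 and β/(1 + β) is about 0.46, so under these assumptions the rank of t grows linearly with the number of owned pages M.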