PageRank Model of internet: Users click random link on a page. - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Mathematics for Computer Science

MIT 6.042J/18.062J

PageRank

(by Google founder Larry Page)

Google Rankings

Which webpages are “more important”?

Model of internet:

  • Users click random link on a page.
  • Occasionally start over.

A page is “more important” if it is viewed a large fraction of the time.

Albert R Meyer, May 13, 2015

Random Walk on the Web

View the entire web as digraph

  • vertices are webpages
  • edge (V,W) exists if there is a link from page V to page W
  • edges out of V equally likely: Pr[(V,W)] = 1/outdeg(V)
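As a rough illustration of the rule Pr[(V,W)] = 1/outdeg(V), the sketch below assigns a probability to every edge of a small digraph. The three-page `links` dictionary and the function name `transition_probabilities` are made up for this example, and it assumes no page links to the same target twice.

```python
# Hypothetical toy web: each page maps to the list of pages it links to.
links = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
}


def transition_probabilities(links):
    """Return {(V, W): Pr[(V, W)]} where Pr[(V, W)] = 1/outdeg(V)."""
    probs = {}
    for v, targets in links.items():
        outdeg = len(targets)  # number of edges leaving V
        for w in targets:
            probs[(v, w)] = 1 / outdeg
    return probs


probs = transition_probabilities(links)
for (v, w), p in sorted(probs.items()):
    print(f"Pr[({v},{w})] = {p}")
```

The probabilities on the edges leaving any one page add up to 1, which is what lets the graph be read as a random walk.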

To model starting over:

  • add a “super-node” to the graph
  • an edge from super-node to each other node
  • edges from each other node back to super-node
    (may get customized probabilities)
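One possible way to add the super-node to a link graph is sketched below. The slides do not give the edge probabilities, so several choices here are assumptions: the restart probability `restart_prob=0.15`, the uniform choice of page when leaving the super-node, and sending a page with no out-links to the super-node with probability 1. The names `SUPER` and `add_super_node` are hypothetical, and every link target is assumed to appear as a key of `links`.

```python
SUPER = "super"  # hypothetical label for the added vertex

# Hypothetical toy web; page "D" has no outgoing links.
links = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": [],
}


def add_super_node(links, restart_prob=0.15):
    """Return {V: {W: Pr[(V, W)]}} for the graph with a super-node added.

    restart_prob is an assumed value, not one taken from the slides.
    """
    pages = list(links)
    walk = {}
    for v in pages:
        targets = links[v]
        # Edge from V back to the super-node ("start over").
        # A page with no links can only start over.
        row = {SUPER: restart_prob if targets else 1.0}
        # The remaining probability is split equally among V's links.
        for w in targets:
            row[w] = row.get(w, 0.0) + (1 - restart_prob) / len(targets)
        walk[v] = row
    # Edge from the super-node to each other node, equally likely (assumed).
    walk[SUPER] = {w: 1 / len(pages) for w in pages}
    return walk


walk = add_super_node(links)
for v, row in walk.items():
    print(v, row)
```

Changing the probabilities in the `walk[SUPER]` row is one place where "customized probabilities" could be plugged in.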


slide-2
SLIDE 2

Super-node

[Figure: example digraph with an added super-node. Vertices appear to be labeled with coin-flip histories (--, -H, -T, HH, HT, TH, TT) plus win and lose; edges are labeled H or T, each with probability ½.]

PageRank

Compute stationary distribution $\vec{s}$

PageRank(V) ::= $s_V$

Rank V above W when $s_V > s_W$
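A minimal sketch of computing a stationary distribution and ranking pages by it, using repeated multiplication by the transition matrix (power iteration). This is only one way to find the stationary distribution; the slides do not say which method is used. The hand-written `walk` table, its restart probability of 0.2, the tolerance, and the function names are all assumptions for illustration.

```python
# Hypothetical walk with a super-node already added: walk[V][W] = Pr[(V, W)].
walk = {
    "A": {"B": 0.4, "C": 0.4, "super": 0.2},
    "B": {"C": 0.8, "super": 0.2},
    "C": {"A": 0.8, "super": 0.2},
    "super": {"A": 1 / 3, "B": 1 / 3, "C": 1 / 3},
}


def step(dist, walk):
    """Take one step of the random walk: return dist · M."""
    nxt = {v: 0.0 for v in walk}
    for v, mass in dist.items():
        for w, p in walk[v].items():
            nxt[w] += mass * p
    return nxt


def stationary(walk, tol=1e-12, max_steps=10_000):
    """Iterate from the uniform distribution until it stops changing."""
    dist = {v: 1 / len(walk) for v in walk}
    for _ in range(max_steps):
        nxt = step(dist, walk)
        if max(abs(nxt[v] - dist[v]) for v in walk) < tol:
            return nxt
        dist = nxt
    return dist


s = stationary(walk)
# Rank V above W when s[V] > s[W]; the super-node itself is not a page.
ranking = sorted((v for v in walk if v != "super"), key=s.get, reverse=True)
print(ranking)
```

In this toy graph page C collects links from both A and B, so the walk spends more time there than on the other pages.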

Resistance to scamming

  • Creating fake nodes pointing to self
  • Adding links to important nodes

won’t improve PageRank

Importance of Super-node

The super-node ensures

  • unique stable distribution $\vec{s}$
  • every initial distribution $\vec{p}$ converges to $\vec{s}$:
    $\lim_{t\to\infty} \vec{p}\cdot M^t = \vec{s}$
  • convergence is rapid: $t$ is small, so $\vec{s}$ is easy to compute
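The convergence claim can be checked on a small example. The sketch below starts the walk from two different initial distributions and measures how far apart they are after t steps; if both converge to the same limit, the gap should shrink toward 0. The `walk` table (with an assumed restart probability of 0.2), the starting distributions, and the helper names are hypothetical.

```python
# Hypothetical walk with a super-node: walk[V][W] = Pr[(V, W)].
walk = {
    "A": {"B": 0.4, "C": 0.4, "super": 0.2},
    "B": {"C": 0.8, "super": 0.2},
    "C": {"A": 0.8, "super": 0.2},
    "super": {"A": 1 / 3, "B": 1 / 3, "C": 1 / 3},
}


def step(dist, walk):
    """Take one step of the random walk: return dist · M."""
    nxt = {v: 0.0 for v in walk}
    for v, mass in dist.items():
        for w, p in walk[v].items():
            nxt[w] += mass * p
    return nxt


def gap_after(walk, p, q, t):
    """Largest per-vertex difference between p · M^t and q · M^t."""
    for _ in range(t):
        p, q = step(p, walk), step(q, walk)
    return max(abs(p[v] - q[v]) for v in walk)


# Two very different starting points: all mass on A versus all mass on B.
start_a = {"A": 1.0, "B": 0.0, "C": 0.0, "super": 0.0}
start_b = {"A": 0.0, "B": 1.0, "C": 0.0, "super": 0.0}

for t in (0, 5, 20, 200):
    print(t, gap_after(walk, start_a, start_b, t))
```

How quickly the gap shrinks depends on the graph and on the probability of returning to the super-node, so the numbers printed here say nothing about the real web.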


slide-3
SLIDE 3

Actual Google Rank

Google rank rules are a closely held trade secret, using text, location, payment, and other criteria that have evolved for 15 years. But PageRank continues to play a significant role.


slide-4
SLIDE 4

MIT OpenCourseWare https://ocw.mit.edu

6.042J / 18.062J Mathematics for Computer Science

Spring 2015

For information about citing these materials or our Terms of Use, visit: https://ocw.mit.edu/terms.