

SLIDE 1

Google PageRank

Francesco Ricci, Faculty of Computer Science, Free University of Bozen-Bolzano, fricci@unibz.it

SLIDE 2

Content

- Linear Algebra
- Matrices
- Eigenvalues and eigenvectors
- Markov chains
- Google PageRank

SLIDE 3

Literature

- C. D. Manning, P. Raghavan, H. Schütze, Introduction to Information Retrieval, Cambridge University Press, 2008. Chapter 21
- Markov chains description on Wikipedia
- Amy N. Langville & Carl D. Meyer, Google's PageRank and Beyond: The Science of Search Engine Rankings, Princeton University Press, 2006.

SLIDE 4

Google

- Google is the leading search and online advertising company, founded by Larry Page and Sergey Brin (Ph.D. students at Stanford University)
- Google was named after "googol", the mathematical term for 10^100
- Google's success in search is largely based on its PageRank™ algorithm
- Gartner reckons that Google now makes use of more than 1 million servers, spitting out search results, images, videos, emails and ads
- Google reports that it spends some 200 to 250 million US dollars a year on IT equipment.

SLIDE 5

Matrices

- A matrix is a rectangular array of numbers
- a_ij is the element of matrix A in row i and column j
- A is said to be an n x m matrix if it has n rows and m columns
- A square matrix is an n x n matrix
- The transpose A^T of a matrix A is the matrix obtained by exchanging the rows and the columns:

A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{pmatrix} = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix}, \qquad
A^T = \begin{pmatrix} a_{11} & a_{21} \\ a_{12} & a_{22} \\ a_{13} & a_{23} \end{pmatrix} = \begin{pmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{pmatrix}
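The row/column swap in the definition of the transpose can be sketched in a few lines of Python (an illustration, not part of the slides):

```python
def transpose(A):
    """Transpose a matrix stored as a list of rows: (A^T)[j][i] = A[i][j]."""
    n, m = len(A), len(A[0])          # n rows, m columns
    return [[A[i][j] for i in range(n)] for j in range(m)]

A = [[1, 2, 3],
     [4, 5, 6]]                       # the 2 x 3 matrix from the slide
print(transpose(A))                   # [[1, 4], [2, 5], [3, 6]]
```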

SLIDE 6

Exercise

- What is the size of these matrices? (Matrices shown on the slide.)
- Compute their transpose

SLIDE 7

Exercise (solution)

- What is the size of these matrices? They are 2x3, 3x1, and 3x4
- Compute their transpose
- (The matrices themselves did not survive the slide extraction intact.)

SLIDE 8

Matrices

- A square matrix is diagonal iff a_ij = 0 for all i ≠ j
- The identity matrix 1 is the diagonal matrix with 1's along the diagonal
- A symmetric matrix A satisfies the condition A = A^T

A = \begin{pmatrix} a_{11} & 0 \\ 0 & a_{22} \end{pmatrix} \text{ (diagonal)}, \qquad
1 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \text{ (identity)}

SLIDE 9

Exercise

- Is a diagonal matrix symmetric?
- Give an example of a symmetric matrix
- Give an example of a 2x3 symmetric matrix

SLIDE 10

Exercise (solution)

- Is a diagonal matrix symmetric?
  - YES, because if it is diagonal then a_ij = 0 for all i ≠ j, hence a_ij = a_ji for all i ≠ j
- An example of a symmetric matrix:
  \begin{pmatrix} 1 & 2 \\ 2 & 3 \end{pmatrix}
- A 2x3 symmetric matrix is impossible: a symmetric matrix is a square matrix

SLIDE 11

Vectors

- A vector v is a one-dimensional array of numbers (an n x 1 matrix, i.e., a column vector)
- Example: v = \begin{pmatrix} 3 \\ 5 \\ 7 \end{pmatrix}
- The standard form of a vector is a column vector
- The transpose of a column vector, v^T = (3 5 7), is a row vector.

SLIDE 12

Operations on matrices

- Addition: if A = (a_ij) and B = (b_ij), then C = (c_ij) = A + B has c_ij = a_ij + b_ij
- Scalar multiplication: if λ is a number, then λA = (λ a_ij)
- Multiplication: if A and B are compatible, i.e., the number of columns of A is equal to the number of rows of B, then C = (c_ij) = AB with c_ij = Σ_k a_ik b_kj
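The three operations translate directly from the index formulas; a small pure-Python sketch (illustrative, not part of the slides):

```python
def mat_add(A, B):
    """c_ij = a_ij + b_ij (A and B must have the same shape)."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scalar_mul(lam, A):
    """Multiply every entry of A by the number lam."""
    return [[lam * a for a in row] for row in A]

def mat_mul(A, B):
    """c_ij = sum_k a_ik * b_kj; requires columns of A == rows of B."""
    assert len(A[0]) == len(B), "incompatible shapes"
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[1, 2, 3], [4, 5, 6]]
print(mat_mul(A, [[1, 4], [2, 5], [3, 6]]))   # [[14, 32], [32, 77]]
```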

SLIDE 13

Examples

- If AB = 1, then B is said to be the inverse of A and is denoted A^{-1}
- If a matrix has an inverse, it is called invertible or non-singular

\begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix}
\begin{pmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{pmatrix}
= \begin{pmatrix} 1{\cdot}1 + 2{\cdot}2 + 3{\cdot}3 & 1{\cdot}4 + 2{\cdot}5 + 3{\cdot}6 \\ 4{\cdot}1 + 5{\cdot}2 + 6{\cdot}3 & 4{\cdot}4 + 5{\cdot}5 + 6{\cdot}6 \end{pmatrix}
= \begin{pmatrix} 14 & 32 \\ 32 & 77 \end{pmatrix}

The result is symmetric. Is it a general fact? Is AA^T always symmetric?

SLIDE 14

Exercise

- Compute the following operations (matrices shown on the slide)

SLIDE 15

Exercise

- Compute the following operations (solution shown on the slide)

SLIDE 16

Rank of a Matrix

- The row (column) rank of a matrix is the maximum number of rows (columns) that are linearly independent
- The vectors v1, …, vn are linearly independent iff there is no linear combination a1 v1 + … + an vn (with coefficients ai not all 0) of the vectors that is equal to 0
- Example 1: (1 2 3), (1 4 6), and (0 2 3) are linearly dependent: show it
- Example 2: (1 2 3) and (1 4 6) are not linearly dependent: show it
- The kernel of a matrix A is the subspace of vectors v such that Av = 0

SLIDE 17

Exercise solution

- 1·(1 2 3)^T − 1·(1 4 6)^T + 1·(0 2 3)^T = (0 0 0)^T
- Hence (1 −1 1)^T is in the kernel of the matrix whose columns are those vectors:

\begin{pmatrix} 1 & 1 & 0 \\ 2 & 4 & 2 \\ 3 & 6 & 3 \end{pmatrix} \begin{pmatrix} 1 \\ -1 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}

- For Example 2: a·(1 2 3) + b·(1 4 6) = (0 0 0) implies a = −b and also a = −2b, absurd.
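The kernel claim is easy to check mechanically (an illustrative Python sketch, not part of the slides):

```python
def mat_vec(A, v):
    """Multiply matrix A (list of rows) by column vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

A = [[1, 1, 0],
     [2, 4, 2],
     [3, 6, 3]]                 # columns are (1 2 3)^T, (1 4 6)^T, (0 2 3)^T
print(mat_vec(A, [1, -1, 1]))   # [0, 0, 0]: (1 -1 1)^T is in the kernel
```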

SLIDE 18

Rank and Determinant

- Theorem. An n x n square matrix is nonsingular iff it has full rank (i.e., n).
- Theorem. A matrix has full column rank iff it does not have a null vector
- Theorem. An n x n matrix A is singular iff det(A) = 0
- A[ij] is the ij minor, i.e., the matrix obtained by deleting the i-th row and the j-th column from A.

\det(A) = \begin{cases} a_{11} & \text{if } n = 1 \\ \sum_{j=1}^{n} (-1)^{1+j}\, a_{1j} \det(A[1j]) & \text{if } n > 1 \end{cases}
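The recursive definition above translates directly into code; a sketch of the cofactor expansion along the first row (illustrative and exponential-time, fine for small matrices):

```python
def det(A):
    """Determinant via cofactor expansion along the first row."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        # A[1j]: delete row 1 (index 0) and column j
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        total += (-1) ** j * A[0][j] * det(minor)
    return total

print(det([[1, 1], [2, 4]]))                    # 2
print(det([[1, 1, 0], [2, 4, 2], [3, 6, 3]]))   # 0 (the matrix is singular)
```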

SLIDE 19

Exercise

- Compute the determinant of the following matrices:

\begin{pmatrix} 1 & 1 & 0 \\ 2 & 4 & 2 \\ 3 & 6 & 3 \end{pmatrix}, \qquad \begin{pmatrix} 1 & 1 \\ 2 & 4 \end{pmatrix}

SLIDE 20

Exercise (solution)

- For the 2x2 matrix: det = 1·4 − 1·2 = 2
- For the 3x3 matrix: det = 1·(4·3 − 2·6) − 1·(2·3 − 2·3) = 0
- A handy online tool: http://www.bluebit.gr/matrix-calculator/

SLIDE 21

Eigenvectors and Eigenvalues

- Definition. If M is a square matrix, v is a nonzero vector, and λ is a number such that M v = λ v, then v is said to be a (right) eigenvector of M with eigenvalue λ
- If v is an eigenvector of M with eigenvalue λ, then so is any nonzero multiple of v
- Only the direction matters.

SLIDE 22

Example

- The matrix
  M = \begin{pmatrix} 2 & -3 \\ 1 & -2 \end{pmatrix}
- has two (right) eigenvectors: v1 = (1 1)^t and v2 = (3 1)^t
- Prove that. Is it singular?

SLIDE 23

Example

- The matrix M = \begin{pmatrix} 2 & -3 \\ 1 & -2 \end{pmatrix} has two eigenvectors: v1 = (1 1)^t and v2 = (3 1)^t
- M v1 = (−1 −1)^t = −1·v1: the eigenvalue is −1
- M v2 = (3 1)^t = 1·v2: the eigenvalue is 1
- Is it singular?

SLIDE 24

Transformation

- The matrix
  M = \begin{pmatrix} 2 & -3 \\ 1 & -2 \end{pmatrix}
  produces a lot of distortion in the directions (1 0)^t, (1 1)^t, (0 1)^t

SLIDE 25

Transformation along eigenvectors

- There are two independent directions which are not twisted at all by the matrix M: (1 1) and (3 1)
- One of them, (1 1), is flipped
- We see less distortion if our box is oriented along the two special directions.

SLIDE 26

Results

- Theorem: every square matrix has at least one eigenvector
- The usual situation is that an n x n matrix has n linearly independent eigenvectors
- If there are n of them, they are a useful basis for R^n
- Unfortunately, it can happen that there are fewer than n of them.

SLIDE 27

Finding Eigenvectors

- M v = λ v: v is an eigenvector and λ is an eigenvalue
- If λ = 0, then finding eigenvectors is the same as finding nonzero vectors in the null space; these exist iff det(M) = 0, i.e., the matrix is singular
- If λ ≠ 0, then finding the eigenvectors is equivalent to finding the null space of the matrix M − λ1 (1 is the identity matrix)
- The matrix M − λ1 has a nonzero vector in the null space iff det(M − λ1) = 0
- det(M − λ1) = 0 is called the characteristic equation.
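For a 2x2 matrix the characteristic equation is λ² − trace(M)·λ + det(M) = 0, so the eigenvalues come straight from the quadratic formula. A sketch, assuming real eigenvalues (illustrative, not part of the slides):

```python
import math

def eigvals_2x2(M):
    """Roots of det(M - lambda*1) = lambda^2 - tr*lambda + det = 0."""
    (a, b), (c, d) = M
    tr, det = a + d, a * d - b * c
    disc = math.sqrt(tr * tr - 4 * det)   # assumes real eigenvalues
    return ((tr + disc) / 2, (tr - disc) / 2)

M = [[2, -3],
     [1, -2]]
print(eigvals_2x2(M))   # (1.0, -1.0)
```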

SLIDE 28

Exercise

M = \begin{pmatrix} 2 & -3 \\ 1 & -2 \end{pmatrix}

Find the eigenvalues and the eigenvectors of this matrix:
1) Find the solutions λ of the characteristic equation (the eigenvalues)
2) Find the eigenvectors corresponding to the found eigenvalues.

SLIDE 29

Exercise Solution

M = \begin{pmatrix} 2 & -3 \\ 1 & -2 \end{pmatrix}

- det(M − λ1) = 0
- (2 − λ)(−2 − λ) + 3 = λ² − 1
- The solutions are +1 and −1

SLIDE 30

Exercise Solution

- det(M − λ1) = (2 − λ)(−2 − λ) + 3 = λ² − 1; the solutions are +1 and −1
- Now we have to solve the set of linear equations M v = v (for the first eigenvalue)

SLIDE 31

Exercise Solution

- Solving M v = v for the eigenvalue 1:

2x − 3y = x
x − 2y = y

- has solution x = 3y, i.e., (3 1)^t, and all vectors obtained by multiplying it with a scalar.

SLIDE 32

Algorithm

- To find the eigenvalues and eigenvectors of M:
  - First find the eigenvalues by solving the characteristic equation. Call the solutions λ1, ..., λn. (There is always at least one eigenvalue, and there are at most n of them.)
  - For each λk, the existence of a nonzero vector in the null space of M − λk·1 is guaranteed. Any such vector is an eigenvector.
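In the 2x2 case the second step is also mechanical: a nonzero null-space vector of M − λ·1 can be read off the first row, since (a − λ)x + b·y = 0. A sketch (illustrative, not part of the slides):

```python
def eigvec_2x2(M, lam):
    """A nonzero vector v with (M - lam*1) v = 0, for a 2x2 matrix."""
    (a, b), (c, d) = M
    if b != 0:
        return (b, lam - a)       # first row: (a-lam)*b + b*(lam-a) = 0
    if c != 0:
        return (lam - d, c)       # same trick on the second row
    return (1, 0) if lam == a else (0, 1)   # M is diagonal

M = [[2, -3],
     [1, -2]]
print(eigvec_2x2(M, 1))    # (-3, -1), a multiple of (3, 1)
print(eigvec_2x2(M, -1))   # (-3, -3), a multiple of (1, 1)
```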

SLIDE 33

Graphs

- A directed graph G is a pair (V, E), where V is a finite set and E is a binary relation on V
  - V is the vertex set of G: it contains the vertices
  - E is the edge set of G: it contains the edges
- In an undirected graph G = (V, E) the edges consist of unordered pairs of vertices
- The in-degree of a vertex v (in a directed graph) is the number of edges entering v
- The out-degree of a vertex v (in a directed graph) is the number of edges leaving v.

SLIDE 34

The Web as a Directed Graph

- Assumption 1: A hyperlink between pages denotes author-perceived relevance (quality signal)
- Assumption 2: The anchor of the hyperlink describes the target page (textual context)

(Figure: Page A links to Page B through a hyperlink with anchor text.)

SLIDE 35

Ranking web pages

- To count inlinks, enter link:www.mydomain.com in the Google search form
- Web pages are not equally "important"
  - www.unibz.it vs. www.stanford.edu
  - Inlinks as votes
    - www.stanford.edu has 3200 inlinks
    - www.unibz.it has 352 inlinks (Feb 2013)
- Are all inlinks equal?
  - Recursive question!

SLIDE 36

Simple recursive formulation

- Each link's vote is proportional to the importance of its source page
- If page P with importance x has n outlinks, each link gets x/n votes

(Figure: a page with importance 1000 and three outlinks passes 333 along each link.)

SLIDE 37

Simple "flow" model

The web in 1839: three pages, Yahoo (y), Amazon (a), and Microsoft (m), where y, a, and m are the importance of these pages. Yahoo links to itself and to Amazon, Amazon links to Yahoo and Microsoft, and Microsoft links to Amazon, so:

y = y/2 + a/2
a = y/2 + m
m = a/2

SLIDE 38

Solving the flow equations

- 3 equations, 3 unknowns, no constant terms
  - No unique solution
  - If you multiply a solution by a constant λ you obtain another solution; try with (2 2 1)
- An additional constraint forces uniqueness
  - y + a + m = 1 (normalization)
  - y = 2/5, a = 2/5, m = 1/5
  - These are the scores of the pages under the assumptions of the flow model
- Gaussian elimination works for small examples, but we need a better method for large graphs.
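For a graph this small, replacing the redundant third flow equation with the normalization constraint and running Gaussian elimination gives the answer exactly; a sketch with exact rationals (illustrative, not part of the slides):

```python
from fractions import Fraction as F

def gauss_solve(A, b):
    """Gaussian elimination (first nonzero pivot) for A x = b, square A."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]    # augmented matrix
    for col in range(n):
        piv = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        M[col] = [x / M[col][col] for x in M[col]]  # normalize pivot row
        for r in range(n):
            if r != col and M[r][col] != 0:
                M[r] = [x - M[r][col] * y for x, y in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

# y/2 - a/2 = 0,  -y/2 + a - m = 0,  and the normalization y + a + m = 1
A = [[F(1, 2), F(-1, 2), F(0)],
     [F(-1, 2), F(1), F(-1)],
     [F(1), F(1), F(1)]]
b = [F(0), F(0), F(1)]
print(gauss_solve(A, b))   # [Fraction(2, 5), Fraction(2, 5), Fraction(1, 5)]
```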

SLIDE 39

Matrix formulation

- Matrix M has one row and one column for each web page (a square matrix)
- Suppose page i has n outlinks
  - If i links to j, then M_ij = 1/n
  - Else M_ij = 0
- M is a row-stochastic matrix: rows sum to 1
- Suppose r is a vector with one entry per web page
  - r_i is the importance score of page i
  - Call it the rank vector
slide-40
SLIDE 40

40

Example

y ½ ½ 0 a ½ 0 ½ m 0 1 0 y a m

y = y /2 + a /2 a = y /2 + m m = a /2

(y a m) = (y a m)M Yahoo Microsoft Amazon y a m y/2 y/2 a/2 a/2 m

= M

slide-41
SLIDE 41

41

Power Iteration Solution

Yahoo Microsoft Amazon

(1/3 1/3 1/3) (1/3 1/3 1/3)M = (1/3 1/2 1/6) (1/3 1/2 1/6)M = (5/12 1/3 1/4) (5/12 1/3 1/4)M = (3/8 11/24 1/6) … (2/5 2/5 1/5)

y ½ ½ 0 a ½ 0 ½ m 0 1 0 y a m (y a m) = (y a m)M

= M
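The iteration above can be sketched in a few lines (illustrative Python, not part of the slides); r is a row vector repeatedly multiplied by the row-stochastic M:

```python
def power_iteration(M, r, steps=100):
    """Repeatedly apply r <- r M (r a row vector, M row-stochastic)."""
    n = len(M)
    for _ in range(steps):
        r = [sum(r[i] * M[i][j] for i in range(n)) for j in range(n)]
    return r

M = [[0.5, 0.5, 0.0],   # Yahoo     -> Yahoo, Amazon
     [0.5, 0.0, 0.5],   # Amazon    -> Yahoo, Microsoft
     [0.0, 1.0, 0.0]]   # Microsoft -> Amazon
r = power_iteration(M, [1 / 3, 1 / 3, 1 / 3])
print([round(x, 4) for x in r])   # [0.4, 0.4, 0.2], i.e., (2/5 2/5 1/5)
```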

SLIDE 42

Example

(Figure only on this slide.)

SLIDE 43

States and probabilities

(Figure: a two-state chain with states DRY and RAIN; P(dry|dry) = 0.85, P(rain|dry) = 0.15, P(dry|rain) = 0.38, P(rain|rain) = 0.62.)

SLIDE 44

Composing transitions

- 0.44 = 0.38·0.15 + 0.62·0.62
- What kind of operation is this on the matrix?

SLIDE 45

Composing transitions

- The probabilities of the 12-hour transitions are given by squaring the matrix of the 6-hour transition probabilities
  - P(rain-in-12h | rain-now) = P(rain-in-12h | rain-in-6h)·P(rain-in-6h | rain-now) + P(rain-in-12h | dry-in-6h)·P(dry-in-6h | rain-now) = .62·.62 + .15·.38 = .44
  - P(dry-in-12h | rain-now) = P(dry-in-12h | rain-in-6h)·P(rain-in-6h | rain-now) + P(dry-in-12h | dry-in-6h)·P(dry-in-6h | rain-now) = .38·.62 + .85·.38 = .56

A^2 = \begin{pmatrix} .85 & .15 \\ .38 & .62 \end{pmatrix} \begin{pmatrix} .85 & .15 \\ .38 & .62 \end{pmatrix} = \begin{pmatrix} .78 & .22 \\ .56 & .44 \end{pmatrix}

(rows and columns ordered: dry, rain)

SLIDE 46

Behavior in the limit

A   = ( .85 .15 ; .38 .62 )
A^2 = ( .78 .22 ; .56 .44 )
A^3 = ( .75 .25 ; .64 .36 )
A^4 = ( .73 .27 ; .68 .32 )
A^5 = ( .72 .28 ; .70 .30 )
A^6 = ( .72 .28 ; .71 .29 )
A^7 = ( .72 .28 ; .71 .29 )
A^8 = ( .72 .28 ; .72 .28 )
A^9 = ( .72 .28 ; .72 .28 )
…
A^∞ = ( .72 .28 ; .72 .28 )
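The convergence of the powers A^n can be replayed directly (illustrative Python, not part of the slides):

```python
def mat_mul2(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[0.85, 0.15],   # row dry:  -> (dry, rain)
     [0.38, 0.62]]   # row rain: -> (dry, rain)
P = A
for n in range(2, 10):            # compute A^2 ... A^9
    P = mat_mul2(P, A)
    print(n, [[round(x, 2) for x in row] for row in P])
# by A^9 both rows have converged (to two decimals) to [0.72, 0.28]
```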

SLIDE 47

Behavior in the limit

- If a, b ≤ 1 and a + b = 1, i.e., (a b) is a generic state with a certain probability a to be dry and b = 1 − a to be rain, then

(a b) A^∞ = (a b) \begin{pmatrix} .72 & .28 \\ .72 & .28 \end{pmatrix} = (.72 .28)

- In particular (.72 .28) A = (.72 .28), i.e., it is a (left) eigenvector with eigenvalue 1
- The eigenvector (.72 .28) represents the limit situation starting from a generic state (a b): it is called the stationary distribution.

SLIDE 48

Exercise

- Find one (left) eigenvector of the matrix below:
  - First solve the characteristic equation (to find the eigenvalues)
  - Then find the left eigenvector corresponding to the largest eigenvalue

\begin{pmatrix} .85 & .15 \\ .38 & .62 \end{pmatrix}

SLIDE 49

Exercise Solution

- Characteristic equation:

det \begin{pmatrix} .85 - λ & .15 \\ .38 & .62 - λ \end{pmatrix} = (0.85 − λ)(0.62 − λ) − 0.15·0.38 = λ² − 1.47λ + 0.47

λ = (1.47 ± sqrt(1.47² − 4·0.47)) / 2

- The solutions are λ = 1 and λ = 0.47
- Left eigenvector for λ = 1:

(x y) \begin{pmatrix} .85 & .15 \\ .38 & .62 \end{pmatrix} = (x y)

0.85x + 0.38y = x, with x + y = 1
0.85x + 0.38(1 − x) = x
−0.53x + 0.38 = 0
x = 0.38/0.53 = 0.72
y = 1 − 0.72 = 0.28

SLIDE 50

Markov Chain

- A Markov chain is a sequence X1, X2, X3, ... of random variables (with Σ_v P(X = v) = 1 over all possible values v of X) with the following property:
- Markov property: the conditional probability distribution of the next state X_{n+1}, given the present and past states, is a function of the present state X_n alone
- If the state space is finite then the transition probabilities can be described with a matrix P_ij = P(X_{n+1} = j | X_n = i), i, j = 1, …, m

\begin{pmatrix} .85 & .15 \\ .38 & .62 \end{pmatrix} = \begin{pmatrix} P(X_{n+1}{=}1 \mid X_n{=}1) & P(X_{n+1}{=}2 \mid X_n{=}1) \\ P(X_{n+1}{=}1 \mid X_n{=}2) & P(X_{n+1}{=}2 \mid X_n{=}2) \end{pmatrix}

SLIDE 51

Example: Web

- X_t is the page visited by a user (random surfer) at time t
- At every time t the user can be in one among m pages (states)
- We assume that when a user is on page i at time t, the probability of being on page j at time t+1 depends only on the fact that the user is on page i, and not on the pages previously visited.

SLIDE 52

Probabilities

States P0, P1, P2, P3, P4 and a Goal state, with transition probabilities:

P(P2|P1) = 0.4, P(P1|P1) = 0.1, P(P0|P1) = 0.05, P(P3|P1) = 0.3, P(P4|P1) = 0.15
P(P1|P0) = 1.0, P(P1|P2) = 1.0, P(P4|P3) = 0.5, P(P1|P3) = 0.5

In this example there are 5 states, and the probability of jumping from one page/state to another is not constant (it is not 1/(# of outlinks of the node)), as we had assumed before in the simple web graph. This is not a Markov chain! (Why?)

SLIDE 53

Examples

- P_ij = P(X_{n+1} = j | X_n = i), i, j = 1, …, m
- (1, 0, 0, …, 0) P = (P11, P12, P13, …, P1n)
  - If at time n the chain is in state 1, then at time n+1 it is in state j with probability P_1j, i.e., the first row of P gives the probabilities of moving to the other states
- (0.5, 0.5, 0, …, 0) P = (P11·0.5 + P21·0.5, …, P1n·0.5 + P2n·0.5)
  - This is the linear combination of the first two rows.

SLIDE 54

Stationary distribution

- A stationary distribution is an m-dimensional vector π (with entries summing to 1) which satisfies the equation π^T P = π^T
- π is a (column) vector and π^T (a row vector) is the transpose of π
- A stationary distribution always exists, but is not guaranteed to be unique (can you make an example of a Markov chain with more than one stationary distribution?)
- If there is only one stationary distribution, then

lim_{n→∞} x^T P^n = π^T

- where x is a generic distribution over the m states (i.e., an m-dimensional vector whose entries are ≤ 1 and sum to 1)

SLIDE 55

Random Walk Interpretation

- Imagine a random web surfer
  - At any time t, the surfer is on some page P
  - At time t+1, the surfer follows an outlink from P uniformly at random
  - He ends up on some page Q linked from P
  - The process repeats indefinitely
- Let p(t) be a vector whose i-th component is the probability that the surfer is at page i at time t
  - p(t) is a probability distribution on pages

SLIDE 56

The stationary distribution

- Where is the surfer at time t+1?
  - He follows a link uniformly at random: p(t+1) = p(t) M
- Suppose the random walk reaches a state such that p(t+1) = p(t) M = p(t)
  - Then p(t) is a stationary distribution for the random walk
- Our rank vector r = p(t) satisfies r = rM.

SLIDE 57

Ergodic Markov chains

- A Markov chain is ergodic if:
  - Informally: there is a path from any state to any other, and the states are not partitioned into sets such that all state transitions occur cyclically from one set to another
  - Formally: for any start state, after a finite transient time T0, the probability of being in any state at any fixed time T > T0 is nonzero
- (Figure: a chain that alternates deterministically between two states is not ergodic (even/odd): the probability of being in a state at a fixed time, e.g., after 500 transitions, is always either 0 or 1 according to the initial state.)

SLIDE 58

Ergodic Markov chains

- For any ergodic Markov chain, there is a unique long-term visit rate for each state
  - The steady-state probability distribution
- Over a long time period, we visit each state in proportion to this rate
- It doesn't matter where we start.
- Note: non-ergodic Markov chains may still have a steady state.

SLIDE 59

Non Ergodic Example

P = \begin{pmatrix} .5 & .5 & 0 \\ 0 & .2 & .8 \\ 0 & 0 & 1 \end{pmatrix}

(States 1, 2, 3: state 1 loops with probability 0.5 and moves to 2 with 0.5; state 2 loops with 0.2 and moves to 3 with 0.8; state 3 loops with probability 1.0.)

- It is easy to show that the steady state (left eigenvector) is π^T = (0 0 1), π^T P = π^T, i.e., state 3
- The user will always reach state 3 and stay there (spider trap)
- This is a non-ergodic Markov chain (with a steady state).

SLIDE 60

Random teleports

- The Google solution for spider traps (not for dead ends)
- At each time step, the random surfer has two options:
  - With probability β, follow a link at random
  - With probability 1−β, jump to some page uniformly at random
  - Common values for β are in the range 0.8 to 0.9
- The surfer will teleport out of a spider trap within a few time steps

SLIDE 61

Matrix formulation

- Suppose there are N pages
  - Consider a page i with set of outlinks O(i)
  - We have M_ij = 1/|O(i)| when i links to j, and M_ij = 0 otherwise
  - The random teleport is equivalent to:
    - adding a teleport link from i to every other page with probability (1−β)/N
    - reducing the probability of following each outlink from 1/|O(i)| to β/|O(i)|
  - Equivalently: tax each page a fraction (1−β) of its score and redistribute it evenly.

SLIDE 62

Example

- Simple example with 6 pages, where page 1 links to pages 2, 3, 4, and 5
- P(5|1) = P(4|1) = P(3|1) = P(2|1) = β/4 + (1−β)/6
- P(1|1) = P(6|1) = (1−β)/6
- P(*|1) = 4[β/4 + (1−β)/6] + 2(1−β)/6 = 1

SLIDE 63

Google Page Rank

- Construct the NxN matrix A as follows: A_ij = β M_ij + (1−β)/N
- Verify that A is a stochastic matrix
- The page rank vector r is the principal eigenvector of this matrix, satisfying r = rA
- The score of each page r_i satisfies:

r_i = β \sum_{k \in I(i)} \frac{r_k}{|O(k)|} + \frac{1-\beta}{N}

- where I(i) is the set of nodes that have a link to page i and O(k) is the set of links exiting from k
- r is the stationary distribution of the random walk with teleports.
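The construction A_ij = βM_ij + (1−β)/N can be sketched directly; this illustrative snippet uses the 6-page graph of the next slide's example (page 1 linking to 2, 3, 4, 5, and so on) and checks that A is row stochastic:

```python
def google_matrix(links, beta=0.85):
    """A_ij = beta * M_ij + (1 - beta)/N for a graph without dead ends.
    `links` maps each page to the list of pages it links to."""
    pages = sorted(links)
    N = len(pages)
    A = []
    for i in pages:
        out = links[i]
        A.append([beta * (1 / len(out) if j in out else 0.0) + (1 - beta) / N
                  for j in pages])
    return A

links = {1: [2, 3, 4, 5], 2: [3, 6], 3: [5], 4: [2], 5: [6], 6: [4]}
A = google_matrix(links)
assert all(abs(sum(row) - 1) < 1e-12 for row in A)   # row stochastic
print(A[0][3])   # P(4|1) = 0.85/4 + 0.15/6 = 0.2375, shown rounded as 0.24
```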

SLIDE 64

Example (β = 0.85)

A = \begin{pmatrix}
0.03 & 0.24 & 0.24 & 0.24 & 0.24 & 0.03 \\
0.03 & 0.03 & 0.45 & 0.03 & 0.03 & 0.45 \\
0.03 & 0.03 & 0.03 & 0.03 & 0.88 & 0.03 \\
0.03 & 0.88 & 0.03 & 0.03 & 0.03 & 0.03 \\
0.03 & 0.03 & 0.03 & 0.03 & 0.03 & 0.88 \\
0.03 & 0.03 & 0.03 & 0.88 & 0.03 & 0.03
\end{pmatrix}

P(4|1) = 0.24 = 0.85/4 + 0.15/6
P(6|1) = 0.03 = 0.15/6
P(4|6) = 0.88 = 0.85/1 + 0.15/6

A^30 has all rows equal to the stationary distribution (0.03 0.23 0.13 0.24 0.14 0.24)
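The stationary distribution of this example can be reproduced with power iteration, accumulating the link contributions instead of materializing A (an illustrative sketch, not the slides' code; the link lists are read off the rows of A above):

```python
def pagerank(links, beta=0.85, iters=100):
    """Power iteration for r = rA without building A explicitly."""
    pages = sorted(links)
    N = len(pages)
    r = {p: 1 / N for p in pages}
    for _ in range(iters):
        new = {p: (1 - beta) / N for p in pages}   # teleport contribution
        for p, out in links.items():
            for q in out:                          # each outlink gets an equal share
                new[q] += beta * r[p] / len(out)
        r = new
    return r

# read off the rows of A: 1 -> {2,3,4,5}, 2 -> {3,6}, 3 -> {5},
# 4 -> {2}, 5 -> {6}, 6 -> {4}
links = {1: [2, 3, 4, 5], 2: [3, 6], 3: [5], 4: [2], 5: [6], 6: [4]}
r = pagerank(links)
print(r)   # close to the slide's rounded values (0.03 0.23 0.13 0.24 0.14 0.24)
```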

SLIDE 65

Dead ends

- Pages with no outlinks are "dead ends" for the random surfer (dangling nodes): nowhere to go on the next step
- When there are dead ends, the matrix is no longer stochastic (the sum of the row elements is not 1)
- This is true even if we add the teleport, because the probability of following each teleport link is only (1−β)/N, and there are just N of these teleports, so together they account for only (1−β).

SLIDE 66

Dealing with dead-ends

- 1) Teleport
  - Follow random teleport links with probability 1.0 from dead-ends (i.e., for those pages set β = 0)
  - Adjust the matrix accordingly
- 2) Prune and propagate
  - Preprocess the graph to eliminate dead-ends
  - Might require multiple passes (why?)
  - Compute page rank on the reduced graph
  - Approximate values for dead ends by propagating values from the reduced graph

SLIDE 67

Computing page rank

- The key step is a matrix-vector multiply: r_new = r_old A
- Easy if we have enough main memory to hold A, r_old, r_new
- Say N = 1 billion pages
  - We need 4 bytes (32 bits) for each entry (say)
  - 2 billion entries for the vectors r_new and r_old: approx 8GB
  - Matrix A has N² entries, i.e., 10^18: a large number!

SLIDE 68

Sparse matrix formulation

- Although A is a dense matrix, it is obtained from a sparse matrix M
  - With about 10 links per node: approx 10N entries
- We can restate the page rank equation: r = β r M + [(1−β)/N]_N (see slide 63)
  - [(1−β)/N]_N is an N-vector with all entries equal to (1−β)/N
- So in each iteration, we need to:
  - Compute r_new = β r_old M
  - Add the constant value (1−β)/N to each entry in r_new

SLIDE 69

Sparse matrix encoding

- Encode the sparse matrix using only its nonzero entries
  - Space proportional roughly to the number of links
  - Say 10N, or 4·10·1 billion = 40GB
  - Still won't fit in memory, but will fit on disk

source node | degree | destination nodes
0           | 3      | 1, 5, 7
1           | 5      | 17, 64, 113, 117, 245
2           | 2      | 13, 23
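One update pass over this encoding can be sketched as follows (illustrative Python with a hypothetical 3-page graph; the (source, degree, destinations) triples mirror the table above):

```python
def sparse_update(encoded, r_old, beta=0.8):
    """One PageRank iteration over a sparse (source, degree, destinations)
    encoding: r_new = beta * r_old M + (1 - beta)/N in each entry."""
    N = len(r_old)
    r_new = [(1 - beta) / N] * N            # start from the teleport share
    for src, degree, dests in encoded:
        share = beta * r_old[src] / degree  # each outlink gets an equal vote
        for d in dests:
            r_new[d] += share
    return r_new

# hypothetical tiny graph: 0 -> {1, 2}, 1 -> {0}, 2 -> {0}
encoded = [(0, 2, [1, 2]), (1, 1, [0]), (2, 1, [0])]
r = [1 / 3, 1 / 3, 1 / 3]
for _ in range(100):
    r = sparse_update(encoded, r)
print([round(x, 2) for x in r])   # [0.48, 0.26, 0.26]
```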

SLIDE 70

Basic Algorithm

- Assume we have enough RAM to fit r_new, plus some working memory
  - Store r_old and the matrix M on disk
- Basic Algorithm:
  - Initialize: r_old = [1/N]_N
  - Iterate:
    - Update: perform a sequential scan of M and r_old and update r_new
    - Write out r_new to disk as r_old for the next iteration
    - Every few iterations, compute |r_new − r_old| and stop if it is below a threshold
- We still need to read both vectors into memory

SLIDE 71

Update step

src | degree | destinations
0   | 3      | 1, 5, 6
1   | 4      | 17, 64, 113, 117
2   | 2      | 13, 23

Initialize all entries of r_new to (1−β)/N
For each page p (out-degree n):
  Read into memory: p, n, dest_1, …, dest_n, r_old(p)
  for j = 1…n:
    r_new(dest_j) += β · r_old(p) / n

The old value in 0 contributes to updating only the new values in 1, 5, and 6.