DATA MINING LECTURE 11
Link Analysis Ranking PageRank -- Random walks HITS Absorbing Random Walks and Label Propagation
Network Science: a number of complex systems can be modeled as networks (graphs), e.g., the Web and (online) social networks. We cannot fully understand such systems unless we understand the underlying network. This lecture is about network data mining.
Searching large document collections (newspaper articles, patents, etc.) is a "needle-in-a-haystack" problem, made harder by spam. The value of a page is not determined by its content alone: on the web, the link structure is also important. Link analysis ranking assigns an authority/centrality/importance value to every node in the graph.
The PageRank intuition, on an example graph with nodes w1, …, w5: good authority nodes are those that other good authority nodes point to. Initially we distribute a 1/n amount of authority to all nodes. Each node then distributes its authority equally to its neighbors, and the authority of a node is the sum of the authority fractions it collects from its neighbors.
Think of the nodes in the graph as containers with a capacity of one liter each. We distribute one liter of liquid equally among all containers.
The edges act like pipes that transfer liquid between nodes.
The contents of each node are distributed to its neighbors.
The system reaches an equilibrium state where the amount of liquid in each node remains constant. The amount of liquid in a node determines its importance: a large quantity means a large incoming flow from other nodes that themselves hold large quantities.
Each node starts with a 1/n amount of authority, distributes it equally to its neighbors, and collects the authority fractions of its neighbors:

x_v = Σ_{u→v} x_u / dout(u)

where x_v is the PageRank value of node v and dout(u) is the out-degree of u. This is a recursive definition.
For the example graph:

w1 = 1/3 w4 + 1/2 w5
w2 = 1/2 w1 + w3 + 1/3 w4
w3 = 1/2 w1 + 1/3 w4
w4 = 1/2 w5
w5 = w2
      x1    x2    x3    x4    x5
t=0   0.2   0.2   0.2   0.2   0.2
t=1   0.16  0.36  0.16  0.1   0.2
t=2   0.13  0.28  0.11  0.1   0.36
t=3   0.22  0.22  0.1   0.18  0.28
t=4   0.2   0.27  0.17  0.14  0.22
Think of the weight as a fluid: there is a constant amount of it in the graph, but it moves around until it stabilizes.
By t=25 the values have stabilized:

      x1    x2    x3    x4    x5
t=25  0.18  0.27  0.13  0.13  0.27
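The iteration above can be sketched in a few lines; the edge list below encodes the example graph implied by the equations (node i here stands for w(i+1)):

```python
# Power-iteration sketch of the PageRank equations above (no random
# jumps yet); edges[u] lists the out-neighbors of node u (0-indexed).
edges = {0: [1, 2], 1: [4], 2: [1], 3: [0, 1, 2], 4: [0, 3]}
n = 5
x = [1.0 / n] * n                       # start with 1/n authority each
for _ in range(500):
    nxt = [0.0] * n
    for u, outs in edges.items():
        for v in outs:
            nxt[v] += x[u] / len(outs)  # u gives equal fractions away
    x = nxt
# x approaches (2/11, 3/11, 3/22, 3/22, 3/11) ~ (0.18, 0.27, 0.14, 0.14, 0.27)
```

Solving the linear system exactly gives the same stationary weights, matching the t=25 row up to rounding.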
Random-walk interpretation: a random surfer starts from a node chosen uniformly at random with probability 1/n, and at each step follows one of the outgoing edges of the current node uniformly at random.
After t steps there is a probability q_v(t) of being at node v.
q_i(0) = 1/5 for every node i

q1(t) = 1/3 q4(t−1) + 1/2 q5(t−1)
q2(t) = 1/2 q1(t−1) + q3(t−1) + 1/3 q4(t−1)
q3(t) = 1/2 q1(t−1) + 1/3 q4(t−1)
q4(t) = 1/2 q5(t−1)
q5(t) = q2(t−1)
The equations are the same as those for the PageRank computation
A random walk is a Markov chain: a process over a set of states S = {s1, s2, …, sn} that moves according to a transition probability matrix P = {P_ij}, where Σ_j P[i, j] = 1 for every row i. A matrix with this property is called (row) stochastic. The state probability vector q(t) = (q1(t), q2(t), …, qn(t)) stores the probability of being at state s_i after t steps. The process is memoryless: the next state depends only on the current state and not on the past of the process (a first-order Markov chain). The state probability vector converges to a unique distribution if the chain is irreducible and aperiodic. For the random walk on a graph, at each step we follow an edge from one node to another, so P[i, j] = 1/dout(i) if there is an edge i→j, and 0 otherwise. The update is q(t+1) = q(t) P.
For the example graph (nodes w1, …, w5), the adjacency matrix A and the transition matrix P are:

A = [ 0  1  1  0  0 ]      P = [  0  1/2 1/2  0   0 ]
    [ 0  0  0  0  1 ]          [  0   0   0   0   1 ]
    [ 0  1  0  0  0 ]          [  0   1   0   0   0 ]
    [ 1  1  1  0  0 ]          [ 1/3 1/3 1/3  0   0 ]
    [ 1  0  0  1  0 ]          [ 1/2  0   0  1/2  0 ]
Stochastic matrices have maximum eigenvalue 1, and the stationary distribution is the corresponding left eigenvector.
The power method:
Initialize q(0) to some distribution
Repeat: q(t) = q(t−1) P
Until convergence
Does the random walk converge? After t steps, q(t) = q(0) P^t (the entries of P² give the probabilities of paths of length 2, and so on). If the chain is irreducible and aperiodic, q(t) converges to a unique stationary vector in the limit of an (infinite) number of steps, and the starting point of the walk does not matter.
What happens at nodes without any outgoing links (sinks)? The row of P for a sink is all zeros, so probability mass leaks out of the walk. The fix: when the walk reaches a sink, jump to a node chosen uniformly at random. In matrix form,

P' = P + d v^T

where d_i = 1 if node i is a sink and 0 otherwise, and v is the uniform vector with v_i = 1/n (so a sink row of P' becomes [1/5, 1/5, 1/5, 1/5, 1/5] in the example).
The walk can still get stuck in cycles or small components, so PageRank adds random jumps to the random walk:

P'' = (1 − α) P' + α 𝟏 v^T  (a random walk with restarts)

where 𝟏 is the vector of all 1s and v is the uniform jump vector. The stationary distribution then satisfies

x_v = (1 − α) Σ_{u→v} x_u / dout(u) + α (1/n)

The random surfer: with probability 1 − α follow one of the links in the current page; with probability α jump to a random page.
q(0) = v
q(1) = (1 − α) q(0) P + α v = (1 − α) v P + α v
q(2) = (1 − α) q(1) P + α v = (1 − α)² v P² + (1 − α) α v P + α v
q(3) = (1 − α) q(2) P + α v = (1 − α)³ v P³ + (1 − α)² α v P² + (1 − α) α v P + α v
⋮
q(∞) = α v + (1 − α) α v P + (1 − α)² α v P² + ⋯ = α v [I − (1 − α) P]^{−1}
Short paths are more important, since the weight of a path decreases exponentially with its length.
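A quick numerical check of this closed form on the example transition matrix (a sketch; α = 0.15 is just a common choice, not mandated by the slides):

```python
# Verify that iterating q <- (1-a)*q*P + a*v matches the closed form
# q(inf) = a * v * inv(I - (1-a)P), on the 5-node example graph.
import numpy as np

P = np.array([[0, .5, .5, 0, 0],
              [0, 0, 0, 0, 1],
              [0, 1, 0, 0, 0],
              [1/3, 1/3, 1/3, 0, 0],
              [.5, 0, 0, .5, 0]])
alpha = 0.15                    # restart probability (illustrative value)
v = np.full(5, 1 / 5)           # uniform restart vector
q = v.copy()
for _ in range(200):
    q = (1 - alpha) * (q @ P) + alpha * v
closed = alpha * v @ np.linalg.inv(np.eye(5) - (1 - alpha) * P)
print(np.allclose(q, closed))   # → True
```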
If the restart vector v is concentrated on some node y, nodes close to y have a higher probability of being visited. The resulting personalized PageRank scores favor the nodes that are close to y, measuring proximity rather than global popularity, so the interpretation as a single global importance distribution no longer holds.
Computation:

Repeat: q(t+1) = (P'')^T q(t)
Until: ‖q(t+1) − q(t)‖ < δ

Efficient computation of y = (P'')^T x, without materializing the dense matrix:

y = (1 − α) P^T x
β = ‖x‖₁ − ‖y‖₁
y = y + β v

where P is the (sparse) normalized adjacency matrix, P' = P + d v^T with d_i = 1 if i is a sink and 0 otherwise, and P'' = (1 − α) P' + α 𝟏 v^T with 𝟏 the vector of all 1s. The correction β adds back the probability mass lost to sinks and to the random jumps.
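A sketch of this sparse trick in Python, on the running five-node example (function and variable names are ours):

```python
# One efficient PageRank step: y = (1-alpha) * P^T x using only the
# sparse edges, then redistribute the lost mass (sinks + jumps) uniformly.
def step(x, out_edges, n, alpha):
    y = [0.0] * n
    for u, outs in out_edges.items():
        if not outs:                 # sink: handled by the correction below
            continue
        share = (1 - alpha) * x[u] / len(outs)
        for v in outs:
            y[v] += share
    beta = sum(x) - sum(y)           # mass lost to sinks and random jumps
    return [yi + beta / n for yi in y]

out_edges = {0: [1, 2], 1: [4], 2: [1], 3: [0, 1, 2], 4: [0, 3]}
x = [0.2] * 5
for _ in range(100):
    x = step(x, out_edges, 5, alpha=0.15)
# sum(x) stays 1 throughout; no dense matrix is ever built
```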
PageRank proved useful in many different ways, although today other signals are probably more important for ranking.
HITS starts from a Root Set obtained from a text-only search engine. The Root Set is expanded with pages that point into it (IN) and pages it points out to (OUT), giving the Base Set on which the algorithm runs.
HITS assigns two scores to every page: a hub score h and an authority score a.

I operation: a_i = Σ_{j: j→i} h_j   (an authority collects the weight of the hubs that point to it)
O operation: h_i = Σ_{j: i→j} a_j   (a hub collects the weight of the authorities it points to)
Example run (five nodes):

                                  authorities                   hubs
Initialize:                       1, 1, 1, 1, 1                 1, 1, 1, 1, 1
Step 1, O operation:              1, 1, 1, 1, 1                 1, 2, 3, 2, 1
Step 1, I operation:              6, 5, 5, 2, 1                 1, 2, 3, 2, 1
Step 1, normalization (max norm): 1, 5/6, 5/6, 2/6, 1/6         1/3, 2/3, 1, 2/3, 1/3
Step 2, O operation:              1, 5/6, 5/6, 2/6, 1/6         1, 11/6, 16/6, 7/6, 1/6
Step 2, I operation:              33/6, 27/6, 23/6, 7/6, 1/6    1, 11/6, 16/6, 7/6, 1/6
Step 2, normalization:            1, 27/33, 23/33, 7/33, 1/33   6/16, 11/16, 1, 7/16, 1/16
Convergence:                      1, 0.8, 0.6, 0.14, …          0.4, 0.75, 1, 0.3, …
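The two operations plus max-norm normalization are easy to sketch; the edge list below is a made-up example, not the graph from the slides:

```python
# HITS sketch: I operation (authorities from hubs), O operation (hubs
# from authorities), then max-norm normalization, repeated until stable.
edges = [(0, 2), (1, 2), (3, 2), (0, 3), (2, 3), (3, 4)]  # hypothetical
n = 5
h = [1.0] * n
a = [1.0] * n
for _ in range(50):
    a = [sum(h[i] for i, j in edges if j == v) for v in range(n)]  # I op
    h = [sum(a[j] for i, j in edges if i == v) for v in range(n)]  # O op
    a = [x / max(a) for x in a]    # max norm, as in the example above
    h = [x / max(h) for x in h]
# node 2, with the most in-links, ends up with the top authority score
```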
In matrix terms, using the adjacency matrix A of the graph: a = A^T h and h = A a, so a converges to the principal eigenvector of A^T A and h to the principal eigenvector of A A^T.
This connects HITS to the Singular Value Decomposition of the matrix A:

A = U Σ V^T,  with U of size [n×r], Σ of size [r×r], V^T of size [r×n]

where U = [u1, u2, …, ur], V = [v1, v2, …, vr], and Σ = diag(σ1, σ2, …, σr). Equivalently,

A = σ1 u1 v1^T + σ2 u2 v2^T + ⋯ + σr ur vr^T
Why does the iteration converge? A symmetric matrix M (such as A^T A) has eigenvalue/eigenvector pairs (λ1, w1), (λ2, w2), …, (λr, wr), and the eigenvectors define a basis of the vector space, so any start vector can be written y = b1 w1 + ⋯ + br wr. Then

M^t y = λ1^t b1 w1 + λ2^t b2 w2 + ⋯ + λr^t br wr

and since λ1 dominates, the direction of M^t y converges to the principal eigenvector w1.
Other centrality measures:

Closeness centrality: c(v) = 1 / Σ_u d(v, u), the inverse of the total distance from v to every other node.

Betweenness centrality: B(v) = Σ_{s≠v≠t} σ_st(v) / σ_st, where σ_st is the number of shortest paths between s and t, and σ_st(v) the number of those that pass through v.
A related citation measure is the impact factor of a journal: the average number of citations received this year by the papers published in the journal in the previous two years, i.e., it counts the citation edges directed to journal j.
Random walks with absorbing nodes: what is the stationary distribution? In the limit, all the probability mass ends up in the absorbing nodes.
In a graph with two absorbing nodes, Red and Blue, the probability that a walk starting from a node is absorbed at Red (or at Blue) measures how close the node is to Red or Blue. Absorbing nodes have probability 1 of being absorbed in themselves and zero of being absorbed in another node. For every other node, the absorption probability is the weighted average of the absorption probabilities of its neighbors. For the weighted example graph (non-absorbing nodes Yellow, Green, Pink):

P(Red | Red) = 1,  P(Red | Blue) = 0
P(Red | Pink) = 2/3 P(Red | Yellow) + 1/3 P(Red | Green)
P(Red | Green) = 1/5 P(Red | Yellow) + 1/5 P(Red | Pink) + 1/5
P(Red | Yellow) = 1/6 P(Red | Green) + 1/3 P(Red | Pink) + 1/3
Similarly for Blue: P(Blue | Blue) = 1, P(Blue | Red) = 0, and since the walk is eventually absorbed at either Red or Blue, P(Blue | v) = 1 − P(Red | v) for every node v.
In the limit, all probability flows to the sink (absorbing) nodes. In general, we can choose to make some nodes absorbing: we simply remove their outgoing edges. The absorption probabilities then measure the proximity of the non-absorbing nodes to the chosen nodes. For example, if the absorbing nodes represent a positive and a negative opinion, to which opinion is a non-absorbing node closer?
P(Red | Pink) = 2/3 P(Red | Yellow) + 1/3 P(Red | Green)
P(Red | Green) = 1/5 P(Red | Yellow) + 1/5 P(Red | Pink) + 1/5
P(Red | Yellow) = 1/6 P(Red | Green) + 1/3 P(Red | Pink) + 1/3

with P(Blue | v) = 1 − P(Red | v) for each node. Solving the system, the (P(Red), P(Blue)) pairs for the three non-absorbing nodes come out to approximately (0.52, 0.48), (0.42, 0.58), and (0.57, 0.43).
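The fixed point can be found by simple repeated averaging. The edge weights below are inferred from the coefficients of the equations above, so treat them as a reconstruction of the figure rather than ground truth:

```python
# Absorption probabilities P(Red | v) by repeated neighbor averaging.
# Red and Blue are absorbing; the weighted edges reproduce the
# coefficients in the equations above (inferred, illustrative).
weighted_edges = [("Yellow", "Pink", 2), ("Yellow", "Red", 2),
                  ("Yellow", "Blue", 1), ("Yellow", "Green", 1),
                  ("Green", "Pink", 1), ("Green", "Red", 1),
                  ("Green", "Blue", 2)]
nbrs = {}
for u, v, w in weighted_edges:
    nbrs.setdefault(u, []).append((v, w))
    nbrs.setdefault(v, []).append((u, w))

p_red = {"Red": 1.0, "Blue": 0.0, "Yellow": 0.0, "Green": 0.0, "Pink": 0.0}
for _ in range(200):
    for v in ("Yellow", "Green", "Pink"):   # absorbing nodes never change
        total = sum(w for _, w in nbrs[v])
        p_red[v] = sum(w * p_red[u] for u, w in nbrs[v]) / total
# converges to Yellow 11/19 ~ 0.58, Pink 10/19 ~ 0.53, Green 8/19 ~ 0.42
```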
A node attached only to Yellow (say, an added Orange node) simply inherits Yellow's probabilities: P(Red | Orange) = P(Red | Yellow) and P(Blue | Orange) = P(Blue | Yellow), approximately (0.57, 0.43).
Penalizing long paths: add a restart probability α. With probability α the random walk dies; with probability (1 − α) it continues as before. The longer the path from a node to an absorbing node, the more likely the walk dies along the way, and the lower the absorption probability, e.g.,

P(Red | Green) = (1 − α) [ 1/5 P(Red | Yellow) + 1/5 P(Red | Pink) + 1/5 ]
This seems similar to PageRank. Indeed, personalized PageRank and Absorbing Random Walks (ARW) both measure a similarity between nodes u and v, and are similar but not the same: personalized PageRank sums the probabilities of paths from the restart node u to a node v, while ARW sums the probabilities of paths from a node v to the absorbing node u.
Instead of absorption probabilities we can propagate values: assign value +1 to the positive starting nodes and −1 to the negative starting nodes, and compute the value of each remaining node by repeatedly averaging the values of its neighbors. The value of a node u is then the expected value at the absorbing end point for a random walk that starts from u.
With Red = +1 and Blue = −1:

V(Pink) = 2/3 V(Yellow) + 1/3 V(Green)
V(Green) = 1/5 V(Yellow) + 1/5 V(Pink) + 1/5 − 2/5
V(Yellow) = 1/6 V(Green) + 1/3 V(Pink) + 1/3 − 1/6
Solving gives V(Yellow) ≈ 0.16, V(Pink) ≈ 0.05, and V(Green) ≈ −0.16.
Electrical-network interpretation: apply voltage +1 at the Red node and negative voltage −1 at the Blue node; the edges act as resistors with conductances proportional to the weights. The node values are the resulting voltages.
Opinion formation: each user v has an internal opinion s_v and an expressed opinion z_v, and incurs the cost

c(z_v) = (s_v − z_v)² + Σ_{w: w is a friend of v} w_vw (z_v − z_w)²

Minimizing this cost, the best thing to do is to set z_v to the weighted average of the opinions:

z_v = ( s_v + Σ_{w: w is a friend of v} w_vw z_w ) / ( 1 + Σ_{w: w is a friend of v} w_vw )
Example: five users on the weighted graph with internal opinions s = +0.5, −0.3, −0.1, +0.2, +0.8. This is an intuitive model: my opinion is a combination of what I believe and what my social network believes. The expressed opinion of each user is computed with the value propagation we described before: one absorbing node per user holding the internal opinion of the user, and one non-absorbing node per user that links to the corresponding absorbing node (with weight 1). The computation gives expressed opinions z = +0.22, +0.17, −0.03, +0.04, −0.01.
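The averaging update is easy to sketch. Below, the graph and weights are purely illustrative (a hypothetical ring of five users), not the example from the slides:

```python
# Opinion-formation sketch: z_v = (s_v + sum_w w_vw*z_w) / (1 + sum_w w_vw),
# iterated Jacobi-style until it stabilizes.
s = [0.5, -0.3, -0.1, 0.2, 0.8]                  # internal opinions
edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0),  # hypothetical weights
         (3, 4, 1.0), (4, 0, 1.0)]
nbrs = {i: [] for i in range(5)}
for u, v, w in edges:
    nbrs[u].append((v, w))
    nbrs[v].append((u, w))

z = s[:]                                 # start from the internal opinions
for _ in range(200):
    z = [(s[v] + sum(w * z[u] for u, w in nbrs[v]))
         / (1 + sum(w for _, w in nbrs[v]))
         for v in range(5)]
# each z_v ends up between min(s) and max(s), pulled toward the neighbors
```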
Related random-walk measures: the hitting time h(u, v) is the expected number of steps for a random walk starting from node u to end up in v for the first time; the commute time h(u, v) + h(v, u) is the expected number of steps to reach v and return to u.
Label propagation: if we have labels on some nodes, we can propagate them to the remaining nodes; the absorbing random walk (value propagation) computes labels for the rest of the graph.
This is transductive learning: it does not produce a model, but just labels the unlabeled data that is at hand. Labeling a new example requires re-running the computation. Implementation: repeatedly pick a node u and update its value to the (weighted) average of its neighbors, unless u is absorbing, in which case the value of the node is not updated; stop when the change in values becomes small.
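A minimal sketch of that update loop on a toy path graph (the labels and the graph are illustrative):

```python
# Label propagation sketch: absorbing (labeled) nodes keep their values;
# every other node is repeatedly set to the average of its neighbors,
# until the largest change per sweep is small.
labeled = {0: 1.0, 4: -1.0}                        # +1 / -1 seed labels
nbrs = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}  # path 0-1-2-3-4
val = {v: labeled.get(v, 0.0) for v in nbrs}
delta = 1.0
while delta > 1e-12:
    delta = 0.0
    for u in nbrs:
        if u in labeled:                           # absorbing: not updated
            continue
        new = sum(val[w] for w in nbrs[u]) / len(nbrs[u])
        delta = max(delta, abs(new - val[u]))
        val[u] = new
# on the path, the values interpolate linearly: 1, 0.5, 0, -0.5, -1
```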