CoupledLP: Link Prediction in Coupled Networks
Yuxiao Dong#, Jing Zhang+, Jie Tang+, Nitesh V. Chawla#, Bai Wang*
# University of Notre Dame, + Tsinghua University, * Beijing University of Posts and Telecommunications
Mobile Networks
[Figure: timeline of timestamps 2015.08.08 10:30, 2015.08.08 10:48, 2015.08.08 11:01, 2015.08.08 11:29, ……, 2016.01.01 00:00, ……]
K.I. Goh, M. E. Cusick, D. Valle, B. Childs, M. Vidal, and A.-L. Barabási. The human disease network. PNAS 2007.
[Figure: disease network, cross network, gene network]
A source network G_S = (V_S, E_S) and a target network G_T = (V_T, E_T) form coupled networks if there exists a cross link e_ij with one node v_i ∈ V_S and the other node v_j ∈ V_T. The cross network G_C = (V_C, E_C) is a bipartite network containing all the cross links in the coupled networks.
[Figure: source network, target network, cross network, coupled networks]
Input: source network, cross network
Output: target network
Given the source network G_S and the cross network G_C in coupled networks G = (G_S, G_T, G_C), the task is to find a predictive function f : (G_S, G_C) → Y_T, where Y_T is the set of labels for the potential links in the target network G_T.
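The definitions above can be sketched in a few lines of Python. This is a minimal illustration, assuming undirected links and disjoint node sets; the helper names `split_links` and `target_labels` and the toy node names are hypothetical, not from the slides.

```python
from itertools import combinations

def split_links(V_S, V_T, edges):
    """Partition undirected links into source links, target links, and cross links."""
    V_S, V_T = set(V_S), set(V_T)  # assumed disjoint
    E_S, E_T, E_C = set(), set(), set()
    for u, v in edges:
        link = frozenset((u, v))
        if not link <= (V_S | V_T):
            raise ValueError(f"unknown node in link {(u, v)}")
        if link <= V_S:
            E_S.add(link)   # both endpoints in the source network
        elif link <= V_T:
            E_T.add(link)   # both endpoints in the target network
        else:
            E_C.add(link)   # one endpoint in V_S, the other in V_T
    return E_S, E_T, E_C

def target_labels(V_T, E_T):
    """Y_T: one 0/1 label per potential (unordered) link in the target network."""
    return {frozenset(p): int(frozenset(p) in E_T)
            for p in combinations(sorted(V_T), 2)}

# Toy coupled networks (hypothetical data).
V_S = {"s1", "s2", "s3"}
V_T = {"t1", "t2", "t3"}
edges = [("s1", "s2"), ("s2", "s3"),                 # source links
         ("s1", "t1"), ("s2", "t2"), ("s3", "t3"),   # cross links
         ("t1", "t2")]                               # target link (ground truth)
E_S, E_T, E_C = split_links(V_S, V_T, edges)
Y_T = target_labels(V_T, E_T)
```

In the prediction task only E_S and E_C are given to f; the labels Y_T built here from E_T would serve as ground truth for evaluation.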
[Figure: A network, self-linkage network, B network]
Implicit Target Network Construction
[Figure: source network, cross network, target network; values shown: 99%, 80%, 75%, 87%]
Atomic propagations for constructing an implicit target network
[Figure: three panels over nodes v1^T, v2^T, v3^S, v4^S, one per propagation]
- Direct: MM
- Coupling: MM^T
- Co-citation: M^T M
top z%
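A minimal sketch of the three atomic propagations follows. It rests on several assumptions not stated on the slide: M is taken to be a directed adjacency matrix over all nodes, holding the observed source and cross links; the three propagations are summed with equal weight; a pair's score adds both directions; and "top z%" is read as keeping the highest-scoring z% of candidate target pairs. The function name `implicit_target_links` is hypothetical.

```python
import math
import numpy as np

def implicit_target_links(M, target_idx, z):
    """Score target pairs by MM + MM^T + M^T M and keep the top z% (nonzero only)."""
    M = np.asarray(M, dtype=float)
    # Direct (MM), coupling (MM^T), co-citation (M^T M); equal weights are an assumption.
    P = M @ M + M @ M.T + M.T @ M
    block = P[np.ix_(target_idx, target_idx)]
    rows, cols = np.triu_indices(len(target_idx), k=1)
    # Unordered pair score: add both directions (assumption).
    scores = block[rows, cols] + block[cols, rows]
    k = math.ceil(len(scores) * z / 100)
    order = np.argsort(-scores, kind="stable")[:k]
    return [(target_idx[rows[i]], target_idx[cols[i]], float(scores[i]))
            for i in order if scores[i] > 0]

# Toy example: nodes 0-1 are source nodes, nodes 2-4 are target nodes.
M = np.zeros((5, 5))
M[0, 1] = 1   # source link  0 -> 1
M[2, 0] = 1   # cross link   2 -> 0
M[3, 0] = 1   # cross link   3 -> 0
M[3, 1] = 1   # cross link   3 -> 1
M[1, 4] = 1   # cross link   1 -> 4
links = implicit_target_links(M, [2, 3, 4], z=50)
```

For an undirected (symmetric) M the three products coincide, so the distinction between them only matters for directed links.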
Jie Tang, Tiancheng Lou, and Jon Kleinberg. Inferring social ties across heterogeneous networks. In WSDM '12.
[Figure: input network and factor graph]
CoupledFG
[Figure: coupled networks (nodes v0 to v6) mapped to the CoupledFG factor graph. Observations are candidate node pairs (v1, v2), (v1, v3), (v2, v3), (v3, v4), (v0, v6), (v5, v6), with labels y12 = 1, y56 = 0, and y13 = ?, y23 = ?, y34 = ?, y06 = ?. A factor f(vi, vj, yij) is attached to each pair; factors g(y12, y13), g(y13, y23), g(y12, y23), g(y06, y56) connect pairs of labels.]
Attribute factor (asymmetry): model the source and target networks separately
Meta-path factor (heterogeneity): meta-paths bridge the source and target networks
[Figure: meta-path examples over disease and gene nodes, with "express" and "associate" relations]
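One way to make meta-paths concrete is to count path instances with matrix products. The sketch below assumes a target-source-source-target meta-path (for example gene, disease, disease, gene); this path type, the function name `metapath_counts`, and the toy matrices are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def metapath_counts(A_S, C):
    """Count T-S-S-T path instances between every pair of target nodes.

    A_S: |V_S| x |V_S| source adjacency matrix.
    C:   |V_S| x |V_T| cross-link matrix (C[s, t] = 1 if s and t share a cross link).
    """
    A_S, C = np.asarray(A_S), np.asarray(C)
    return C.T @ A_S @ C

# Toy example: 3 source nodes in a chain, 3 target nodes.
A_S = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]])
C = np.array([[1, 0, 1],    # s0 links to t0 and t2
              [0, 1, 0],    # s1 links to t1
              [0, 0, 1]])   # s2 links to t2
counts = metapath_counts(A_S, C)
```

Here `counts[1, 2]` is 2 because t1 reaches t2 through s1-s0 and through s1-s2.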
- Factor initialization: exponential-linear
- Objective function:
Model source & target networks separately; meta-paths bridge source & target networks
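A minimal sketch of exponential-linear factors, under stated assumptions: each candidate pair i has a feature vector X[i] and a factor exp(y_i * alpha·X[i]); each connected pair of labels (a, b) has a factor exp(beta * [y_a = y_b]); and the objective is taken to be the log-likelihood of a label assignment. These functional forms, the parameter names `alpha` and `beta`, and the brute-force normalization (feasible only for tiny graphs) are illustrative choices, not the authors' exact formulation.

```python
import itertools
import math

def log_potential(y, X, pairs, alpha, beta):
    """Log of the unnormalized product of all factors for label assignment y."""
    # Per-pair factors: exp(y_i * alpha . x_i)
    attr = sum(y[i] * sum(a * x for a, x in zip(alpha, X[i])) for i in range(len(y)))
    # Factors between connected labels: exp(beta * [y_a == y_b])
    corr = sum(beta * (y[a] == y[b]) for a, b in pairs)
    return attr + corr

def log_likelihood(y, X, pairs, alpha, beta):
    """log P(y) = log potential minus log partition function (brute force)."""
    log_z = math.log(sum(math.exp(log_potential(c, X, pairs, alpha, beta))
                         for c in itertools.product((0, 1), repeat=len(y))))
    return log_potential(y, X, pairs, alpha, beta) - log_z

# Toy example: 3 candidate links, 2 features each, labels 0-1 and 1-2 connected.
X = [[1.0, 0.2], [1.0, 0.9], [1.0, 0.5]]
pairs = [(0, 1), (1, 2)]
ll = log_likelihood((0, 1, 1), X, pairs, alpha=[-0.5, 1.0], beta=0.3)
```

With all parameters at zero every assignment is equally likely, so the log-likelihood of any assignment is -3 log 2 in this example.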
Learning: gradient descent method
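As a rough illustration of gradient descent learning, the sketch below fits the parameters of an exponential-linear model in a simplified setting. It assumes per-link factors only (no factors between labels), in which case the model reduces to logistic regression and the gradient has a closed form. The learning rate, step count, toy data, and function names are hypothetical; the full model on the slides would need approximate inference that is not shown here.

```python
import numpy as np

def nll(alpha, X, y):
    """Mean negative log-likelihood when P(y=1 | x) = sigmoid(alpha . x)."""
    z = X @ alpha
    return float(np.mean(np.log1p(np.exp(z)) - y * z))

def fit_alpha(X, y, lr=0.5, steps=500):
    """Plain gradient descent on the negative log-likelihood."""
    alpha = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ alpha)))   # predicted link probabilities
        grad = X.T @ (p - y) / len(y)            # gradient of the mean NLL
        alpha -= lr * grad
    return alpha

# Toy data: a bias column plus one feature that tends to be higher for true links.
X = np.array([[1, 0.1], [1, 0.2], [1, 0.3], [1, 0.7], [1, 0.8], [1, 0.9]])
y = np.array([0, 0, 1, 0, 1, 1], dtype=float)
alpha = fit_alpha(X, y)
```

Starting from zero the loss equals log 2, and each step with this small learning rate lowers it.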
Healthcare Networks: Disease (D)---Gene (G)
Mobile Phone Call Networks, Two Operators: Aa---Ab
Mobile Phone Call Networks, Three Operators: Ea---Eb---Ec
Evaluation metrics: AUPR, AUROC, or Precision@k
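The three metrics can be computed from a list of predicted scores and 0/1 labels. This is a small pure-Python sketch; it uses average precision as the estimate of AUPR (one common choice, assumed here) and the pairwise-comparison form of AUROC.

```python
def precision_at_k(scores, labels, k):
    """Fraction of true links among the k highest-scored candidates."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    return sum(label for _, label in ranked[:k]) / k

def auroc(scores, labels):
    """Probability that a random true link outscores a random non-link (ties count 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def aupr(scores, labels):
    """Average precision: mean of the precision values at each true link's rank."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / hits

# Toy example: five candidate links, two of them true.
scores = [0.9, 0.8, 0.7, 0.6, 0.5]
labels = [1, 0, 1, 0, 0]
p_at_2 = precision_at_k(scores, labels, 2)
roc = auroc(scores, labels)
pr = aupr(scores, labels)
```

Both `auroc` and `aupr` assume at least one true link (and `auroc` at least one non-link) among the candidates.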
- Common Neighbors (CN)
- Adamic Adar (AA)
- Jaccard Coefficient (JC)
- Preferential Attachment (PA)
- PropFlow (PF)
- Implicit Target Network (IT)
- Logistic Regression (LRC)
- Decision Tree (DT)
- CoupledLP
- CoupledLP-IT
LRC-IT, DT-IT, CoupledLP-IT: no implicit target network construction features
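Four of the baselines above have standard neighborhood-based definitions, sketched below on a generic undirected graph. How they are applied to coupled networks is not specified here, so the toy graph and helper names are assumptions; PropFlow and the implicit target network baseline are not covered.

```python
import math

def neighbors(edges):
    """Build an undirected neighbor map from an edge list."""
    nbrs = {}
    for u, v in edges:
        nbrs.setdefault(u, set()).add(v)
        nbrs.setdefault(v, set()).add(u)
    return nbrs

def common_neighbors(nbrs, u, v):
    return len(nbrs[u] & nbrs[v])

def adamic_adar(nbrs, u, v):
    # Each common neighbor w has degree >= 2, so log(deg(w)) > 0.
    return sum(1.0 / math.log(len(nbrs[w])) for w in nbrs[u] & nbrs[v])

def jaccard(nbrs, u, v):
    union = nbrs[u] | nbrs[v]
    return len(nbrs[u] & nbrs[v]) / len(union) if union else 0.0

def preferential_attachment(nbrs, u, v):
    return len(nbrs[u]) * len(nbrs[v])

# Toy graph: a and b share neighbors c and d; d also links to e.
nbrs = neighbors([("a", "c"), ("b", "c"), ("a", "d"), ("b", "d"), ("d", "e")])
cn = common_neighbors(nbrs, "a", "b")
aa = adamic_adar(nbrs, "a", "b")
jc = jaccard(nbrs, "a", "b")
pa = preferential_attachment(nbrs, "a", "b")
```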
- source links between nodes with cross links; 1% of target links
- 99% of target links
[Figures: results in AUPR, AUROC, and Precision@k]
[Figure: x-axis: pruning threshold z; y-axis: AUPR / AUROC; +5% AUPR, +8% AUROC]