Mining Rich Graphs: Ranking, Classification, and Anomaly Detection
Leman Akoglu, Feb 9th 2018
Networks are ubiquitous!
- Internet Map [Koren 2009]
- Food Web [2007]
- Protein Network [Salthe 2004]
- Social Network [Newman 2005]
- Web Graph
- Terrorist Network [Krebs 2002]
Graph problems
- ranking,
- classification,
- clustering & anomaly mining,
- link prediction,
- role discovery,
- similarity search,
- influence,
- evolution,
- …
Ranking in networks
Src: wiki/PageRank
Classification in networks
Src: [Adamic+ 2005]
Community detection in networks
Src: [McAuley & Leskovec 2012]
Rich networks
Rich networks also ubiquitous!
[Figure label: Read the Web]
Graph problems on rich networks
- ranking,
- clustering & anomaly mining,
- classification,
- link prediction,
- role discovery,
- similarity search,
- influence,
- evolution,
- …
Ranking in rich networks: example
- Medical referral network (weighted, directed)
- Medical referral network + physician expertise
- Medical referral network + physician expertise + location (Town A, Town B)
Ranking in rich networks
Ranking Problem: Which are the top k nodes of a certain type?
e.g.: Who are the best cardiologists in the network, in my town, etc.?
Ranking in Heterogeneous Networks with Geo-Location Information. Abhinav Mishra & Leman Akoglu. SIAM SDM 2017.
Modeling the ranking problem
Goal: ranking in directed heterogeneous information networks (HIN) with geo-location
- HINside model
  - 1. Relation strength
  - 2. Relation distance
  - 3. Neighbor authority
  - 4. Authority transfer rates
  - 5. Competition
  - Closed form solution
- Parameter estimation
HINside model
Relation Strength and Distance
- edge weights: let $W$ denote the matrix such that $W(i, j) = \log(w(i, j) + 1)$
- pair-wise distances: let $D$ denote the matrix such that $D(i, j) = \log(d(l_i, l_j) + 1)$
- for the relation distance, we combine: $M = W \odot D$  (3.1)
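A minimal sketch of building the relation matrix from raw data. It assumes the combination in (3.1) is an element-wise product, that node locations are 2-D coordinates, and that plain Euclidean distance stands in for $d(l_i, l_j)$; the function name `relation_matrix` and the toy inputs are hypothetical, not from the slides.

```python
import math

def relation_matrix(weights, locations):
    """Build M = W (element-wise) D from raw edge weights and coordinates.

    weights[i][j]: raw weight w(i, j) of edge i -> j (0 if there is no edge).
    locations[i]:  (x, y) coordinates of node i (assumed Euclidean setting).
    """
    n = len(weights)
    M = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            strength = math.log(weights[i][j] + 1.0)      # W(i, j)
            dist = math.dist(locations[i], locations[j])
            distance = math.log(dist + 1.0)               # D(i, j)
            M[i][j] = strength * distance                 # element-wise product
    return M

# Toy data: 3 nodes, edges 0->1 (weight 3), 1->2 (weight 1), 2->0 (weight 2).
weights = [[0, 3, 0], [0, 0, 1], [2, 0, 0]]
locations = [(0.0, 0.0), (3.0, 4.0), (0.0, 1.0)]
M = relation_matrix(weights, locations)
```

Missing edges have $w(i, j) = 0$, so $\log(0 + 1) = 0$ keeps the corresponding entries of $M$ at zero.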
HINside model
In-neighbor authority
$r_i = \sum_{j \in V} M(j, i)\, r_j$  (3.2)
Authority Transfer Rates (ATR)
$r_i = \sum_{j \in V} \Gamma(t_j, t_i)\, M(j, i)\, r_j$  (3.3)
$t_i$: type of node $i$; $r_i$: authority score of node $i$
HINside model
Competition: other nodes of type $t_i$ in the vicinity of node $j$
$N(u, v) = g(d(l_u, l_v))$ for $u, v \in V,\ u \neq v$, and $N(u, v) = 0$ for $u = v$,
for monotonically decreasing $g$, e.g. $g(z) = e^{-z}$
The authority scores become:
$r_i = \sum_{j} \Gamma(t_j, t_i)\, M(j, i) \left( r_j + \sum_{v:\, t_v = t_i} N(v, j)\, r_v \right)$  (3.4)
Closed-form
- Authority scores vector $r$ written in closed form (& computed by power iterations) as:
$r = \left[ L' + (L' N' \odot E) \right] r = H r$
- where
  - $L = M \odot (T \Gamma T')$
  - $T$ is $(n \times m)$, with $T(i, c) = 1$ if $t_i = \mathcal{T}(c)$, 0 otherwise
  - $\Gamma$ is $(m \times m)$: authority transfer rates (ATR)
  - $E = T T'$, i.e. $E(u, v) = 1$ if $t_u = t_v$, 0 otherwise
  - $n$: #nodes, $m$: #types
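The fixed point $r = H r$ can be approximated by power iteration. The sketch below applies equation (3.4) entry by entry instead of materializing $H$. The function name `hinside_scores`, the L1 re-normalization after every sweep, the stopping tolerance, and the toy graph are all assumptions made for illustration, not details taken from the slides or the authors' code.

```python
import math

def hinside_scores(M, types, gamma, N, iters=200, tol=1e-12):
    """Power iteration on equation (3.4).

    M[j][i]:     relation matrix entry for edge j -> i
    types[i]:    integer type index t_i of node i
    gamma[a][b]: authority transfer rate from type a to type b
    N[v][j]:     competition weight between nodes v and j
    """
    n = len(M)
    r = [1.0 / n] * n
    for _ in range(iters):
        new_r = []
        for i in range(n):
            total = 0.0
            for j in range(n):
                if M[j][i] == 0.0:
                    continue
                # Authority of type-t_i nodes in the vicinity of j.
                competition = sum(N[v][j] * r[v]
                                  for v in range(n) if types[v] == types[i])
                total += gamma[types[j]][types[i]] * M[j][i] * (r[j] + competition)
            new_r.append(total)
        scale = sum(new_r)
        if scale == 0.0:
            return new_r
        new_r = [x / scale for x in new_r]   # assumed L1 normalization
        converged = max(abs(a - b) for a, b in zip(new_r, r)) < tol
        r = new_r
        if converged:
            break
    return r

# Toy example: node 0 has type 0, nodes 1 and 2 have type 1.
locations = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
types = [0, 1, 1]
M = [[0.0, 1.0, 1.0],
     [1.0, 0.0, 1.0],
     [1.0, 0.0, 0.0]]
gamma = [[1.0, 1.0], [1.0, 1.0]]
# N(u, v) = g(d(l_u, l_v)) with g(z) = exp(-z), and 0 on the diagonal.
N = [[0.0 if u == v else math.exp(-math.dist(locations[u], locations[v]))
      for v in range(3)] for u in range(3)]
r = hinside_scores(M, types, gamma, N)
```

Working entry by entry avoids forming the dense $n \times n$ matrix $L' N'$, at the cost of recomputing the competition sum for each edge.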
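Parameter estimation
- HINside’s parameters consist of the $m^2$ authority transfer rates (ATR)
- $r_i$ as a vector-vector product:
$r_i = \sum_{j} \Gamma(t_j, t_i)\, M(j, i) \left( r_j + \sum_{v:\, t_v = t_i} N(v, j)\, r_v \right)$  (3.4)
$r_i = \sum_{t} \Gamma(t, t_i) \sum_{j:\, t_j = t} \left[ M(j, i) \left( r_j + \sum_{v:\, t_v = t_i} N(v, j)\, r_v \right) \right]$
$r_i = \sum_{t} \Gamma(t, t_i)\, X(t, i) = \Gamma'(t_i, :) \cdot X(:, i) = \Gamma'_{t_i} \cdot x_i$  (4.8)
- $r_i$ is thus a linear function of a feature vector $x_i$, $r_i = f(x_i) = \langle w, x_i \rangle$, the representation to be used for parameter estimation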
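To make the regrouping in (4.8) concrete, the sketch below builds the $m \times n$ feature matrix $X$ from the current scores and then recovers $r_i$ as a dot product with the ATR column for type $t_i$. The function names `feature_matrix` and `scores_from_features` and all numeric values are hypothetical.

```python
def feature_matrix(M, types, N, r, num_types):
    """X[t][i] = sum over in-neighbors j of type t of
    M(j, i) * (r_j + sum_{v: t_v = t_i} N(v, j) * r_v)."""
    n = len(M)
    X = [[0.0] * n for _ in range(num_types)]
    for i in range(n):
        for j in range(n):
            if M[j][i] == 0.0:
                continue
            competition = sum(N[v][j] * r[v]
                              for v in range(n) if types[v] == types[i])
            X[types[j]][i] += M[j][i] * (r[j] + competition)
    return X

def scores_from_features(gamma, X, types):
    """r_i = sum_t gamma[t][t_i] * X[t][i], the vector-vector product in (4.8)."""
    m = len(gamma)
    return [sum(gamma[t][types[i]] * X[t][i] for t in range(m))
            for i in range(len(types))]

# Toy values (assumed): 3 nodes, 2 types.
types = [0, 1, 1]
M = [[0.0, 1.0, 2.0], [1.0, 0.0, 1.0], [3.0, 0.0, 0.0]]
N = [[0.0, 0.5, 0.1], [0.5, 0.0, 0.2], [0.1, 0.2, 0.0]]
gamma = [[0.2, 0.8], [0.6, 0.4]]
r = [0.5, 0.3, 0.2]
X = feature_matrix(M, types, N, r, num_types=2)
new_r = scores_from_features(gamma, X, types)
```

Each column `X[:][i]` plays the role of the feature vector $x_i$, and the weights for a node of type $t_i$ are the entries $\Gamma(\cdot, t_i)$.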
An alternating optimization scheme:
Given: graph G, (partial) lists ranking a subset of nodes of a certain type
- Randomly initialize $\Gamma$
- Compute authority scores $r$ using $\Gamma$
- Repeat
  - compute feature vectors $X$ using $r$
  - learn new parameters $\Gamma$ by learning-to-rank
  - compute authority scores $r$ using $\Gamma$
- Until convergence
Output: estimated $\Gamma$
RankSVM formulation
Given partial ranked lists:
- create all pairs $(u, v)$ of ranked nodes
- add training data $((x_u, x_v), 1)$ if $u$ is ranked ahead of $v$, and $((x_u, x_v), -1)$ otherwise, giving a training set $\{((x_d^1, x_d^2), y_d)\}_{d=1}^{|D|}$
- for each type $t$, solve:
$\min_{\Gamma_t} \|\Gamma_t\|_2^2 + \sum_{d \in D} \epsilon_d$
s.t. $\Gamma_t' (x_d^1 - x_d^2)\, y_d \ge 1 - \epsilon_d,\ \forall d \in D$ with $t_{x_d^1} = t_{x_d^2} = t$
$\epsilon_d \ge 0,\ \forall d \in D$
$\Gamma_t(c) \ge 0,\ \forall c = 1, \ldots, m$
Cross-entropy based objective, by gradient descent
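A rough sketch of the pairwise training step for one node type. It swaps the constrained quadratic program for projected subgradient descent on the equivalent hinge-loss form, which is a simplification chosen here for brevity and not the solver used by the authors. The names `make_pairs` and `fit_rank_svm`, the learning rate, and the toy features are assumptions.

```python
def make_pairs(ranked, features):
    """ranked: node ids, best first. Emits ((x_u, x_v), y) in both orders."""
    data = []
    for a in range(len(ranked)):
        for b in range(a + 1, len(ranked)):
            u, v = ranked[a], ranked[b]
            data.append((features[u], features[v], 1))    # u ahead of v
            data.append((features[v], features[u], -1))   # v not ahead of u
    return data

def fit_rank_svm(data, dim, epochs=200, lr=0.01):
    """Minimize ||w||_2^2 + sum_d max(0, 1 - y_d * w.(x1_d - x2_d)), w >= 0."""
    w = [0.0] * dim
    for _ in range(epochs):
        grad = [2.0 * wc for wc in w]                 # gradient of ||w||_2^2
        for x1, x2, y in data:
            diff = [p - q for p, q in zip(x1, x2)]
            margin = y * sum(wc * dc for wc, dc in zip(w, diff))
            if margin < 1.0:                          # violated pair: hinge subgradient
                for c in range(dim):
                    grad[c] -= y * diff[c]
        # Gradient step, then projection onto the non-negative orthant.
        w = [max(0.0, wc - lr * gc) for wc, gc in zip(w, grad)]
    return w

# Toy data: three nodes of one type, ranked 0 > 1 > 2; second feature is uninformative.
features = {0: [3.0, 1.0], 1: [2.0, 1.0], 2: [1.0, 1.0]}
data = make_pairs([0, 1, 2], features)
w = fit_rank_svm(data, dim=2)
```

The projection `max(0.0, ...)` enforces the non-negativity constraint $\Gamma_t(c) \ge 0$ after every step.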
Attributed graphs
Attributed graph: each node has 1+ properties
[Figure labels: teenager, adult; telemarketer, skater, doctor, data scientist]
Communities in rich networks
Attributed graph: each node has 1+ properties
Anomalous subgraphs
Given a set of attributed subgraphs* (e.g. Google+ circles), find the poorly-defined ones
* social circles, communities, egonetworks, …
Communities in attributed networks
Given an attributed subgraph*, how to quantify its quality?
* social circles, communities, egonetworks, …
- Structure-only
  - Internal-only: e.g. average degree
  - Boundary-only: cut edges
  - Internal + Boundary: conductance
- Structure + Attributes?
Scalable Anomaly Ranking of Attributed Neighborhoods. Bryan Perozzi and Leman Akoglu. SIAM SDM 2016.
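The three structure-only measures can be computed directly from an adjacency structure. This is a small sketch assuming an undirected, unweighted graph stored as a dict of neighbor sets; the function name `structure_scores` and the toy graph are hypothetical, and conductance follows the common definition of cut edges divided by the smaller of the two volumes.

```python
def structure_scores(adj, C):
    """Return (average degree, cut edges, conductance) for node set C.

    adj: dict mapping each node to the set of its neighbors (undirected graph).
    C:   set of nodes forming the subgraph.
    """
    # Each internal edge is seen from both endpoints, hence the division by 2.
    internal = sum(1 for u in C for v in adj[u] if v in C) // 2
    cut = sum(1 for u in C for v in adj[u] if v not in C)
    avg_degree = 2.0 * internal / len(C)              # internal-only
    vol_C = 2 * internal + cut                        # sum of degrees inside C
    vol_rest = sum(len(adj[u]) for u in adj if u not in C)
    denom = min(vol_C, vol_rest)
    conductance = cut / denom if denom else 0.0       # internal + boundary
    return avg_degree, cut, conductance

# Toy graph: a triangle {0, 1, 2} with a tail 2 - 3 - 4.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4}, 4: {3}}
avg_degree, cut_edges, conductance = structure_scores(adj, {0, 1, 2})
```

None of these scores look at node attributes, which is the gap the "Structure + Attributes?" question points at.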
What’s an Anomaly, Anyhow?
Given an attributed subgraph, how to quantify quality?
Normality (intuition)
- Internal
  - structural density, AND
  - attribute coherence: neighborhood “focus” (figure labels: chess, biking)
- Boundary
  - structural sparsity, OR
  - external separation: “exoneration”
[Figure: example subgraphs on a scale from high to low]
Normality (intuition)
- Motivation:
  - no good cuts in real-world graphs [Leskovec+ ‘08]
  - social circles overlap [McAuley+ ‘14]
- “exoneration”: by (a) null model, (b) attributes
  - (a) hub effect: edges expected, not surprising
  - (b) neighborhood overlap: separable by different “focus”
The measure of Normality
$N = I + E = \sum_{i \in C,\, j \in C} \left( A_{ij} - \frac{k_i k_j}{2m} \right) s(x_i, x_j \mid w) \;-\; \sum_{i \in C,\, b \in B,\, (i,b) \in \mathcal{E}} \left( 1 - \min\left(1, \frac{k_i k_b}{2m}\right) \right) s(x_i, x_b \mid w)$  (3.4)
- $I$, internal consistency (first sum): null model $\frac{k_i k_j}{2m}$; similarity $s(\cdot, \cdot \mid w)$ under the “focus” vector $w$ (figure labels: chess, biking)
- $E$, external separability (second sum)
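A small sketch of evaluating the Normality score for one subgraph and a given focus vector. It assumes the similarity is a weighted dot product, $s(x_i, x_j \mid w) = \sum_f w_f\, x_i(f)\, x_j(f)$, which is one plausible choice and is not spelled out on the slides. The function name `normality`, the graph, and the attribute values are hypothetical.

```python
def normality(adj, attrs, C, w):
    """N = I + E for node set C under focus vector w (undirected graph)."""
    m = sum(len(nbrs) for nbrs in adj.values()) / 2.0   # number of edges
    deg = {u: len(adj[u]) for u in adj}

    def sim(a, b):
        # Assumed similarity: weighted dot product of attribute vectors.
        return sum(wf * xa * xb for wf, xa, xb in zip(w, attrs[a], attrs[b]))

    internal = 0.0
    for i in C:
        for j in C:
            a_ij = 1.0 if j in adj[i] else 0.0
            internal += (a_ij - deg[i] * deg[j] / (2.0 * m)) * sim(i, j)

    external = 0.0
    for i in C:
        for b in adj[i]:
            if b not in C:                               # boundary edge (i, b)
                surprise = 1.0 - min(1.0, deg[i] * deg[b] / (2.0 * m))
                external -= surprise * sim(i, b)
    return internal + external

# Toy graph: triangle {0, 1, 2} with tail 2 - 3 - 4; two binary attributes.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4}, 4: {3}}
attrs = {0: [1.0, 0.0], 1: [1.0, 0.0], 2: [1.0, 1.0], 3: [0.0, 1.0], 4: [0.0, 1.0]}
C = {0, 1, 2}
n_focus_first = normality(adj, attrs, C, [1.0, 0.0])   # attribute shared inside C
n_focus_second = normality(adj, attrs, C, [0.0, 1.0])  # attribute shared across the boundary
```

With the focus on the first attribute the triangle scores higher, because that attribute is shared inside the subgraph and absent across its boundary.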
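Anomaly Mining of Entity Neighborhoods (AMEN)
- Given an attributed subgraph, can we find the attribute weights? The weight vector $w$ in the Normality measure is latent:
$N = I + E = \sum_{i \in C,\, j \in C} \left( A_{ij} - \frac{k_i k_j}{2m} \right) s(x_i, x_j \mid w) \;-\; \sum_{i \in C,\, b \in B,\, (i,b) \in \mathcal{E}} \left( 1 - \min\left(1, \frac{k_i k_b}{2m}\right) \right) s(x_i, x_b \mid w)$  (3.4)
- Find the weights that maximize Normality:
$\max_{w_C}\; w_C^T \cdot \left[ \sum_{i \in C,\, j \in C} \left( A_{ij} - \frac{k_i k_j}{2m} \right) s(x_i, x_j) \;-\; \sum_{i \in C,\, b \in B,\, (i,b) \in \mathcal{E}} \left( 1 - \min\left(1, \frac{k_i k_b}{2m}\right) \right) s(x_i, x_b) \right]$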
Optimizing Normality
$\max_{w_C}\; w_C^T \cdot \left[ \sum_{i \in C,\, j \in C} \left( A_{ij} - \frac{k_i k_j}{2m} \right) s(x_i, x_j) \;-\; \sum_{i \in C,\, b \in B,\, (i,b) \in \mathcal{E}} \left( 1 - \min\left(1, \frac{k_i k_b}{2m}\right) \right) s(x_i, x_b) \right]$
Writing the first sum as $\hat{x}_I$ and the negated second sum as $\hat{x}_E$:
$\max_{w_C}\; w_C^T \cdot (\hat{x}_I + \hat{x}_E)$  (4.6)
s.t. $\|w_C\|_p = 1,\ w_C(f) \ge 0,\ \forall f = 1 \ldots d$
Optimizing Normality
$\max_{w_C}\; w_C^T \cdot (\hat{x}_I + \hat{x}_E)$  (4.6)
s.t. $\|w_C\|_p = 1,\ w_C(f) \ge 0,\ \forall f = 1 \ldots d$
With $x = \hat{x}_I + \hat{x}_E$:
- $p = 1$, or the $L_1$ norm: one attribute $f$ with largest $x$; that is, $w_C(f) = 1$
- $p = 2$, or the $L_2$ norm: all $f$ with positive $x$; $w_C(f) = \frac{x(f)}{\sqrt{\sum_{x(i) > 0} x(i)^2}}$ if $x(f) > 0$, and 0 otherwise, where $w$ is unit-normalized
- Normality becomes $N = w_C^T \cdot x = \sqrt{\sum_{x(i) > 0} x(i)^2} = \|x_+\|_2$
- Linear in number of attributes!
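The two closed-form solutions take one pass over the attribute vector $x$. A minimal sketch, where the function name `optimal_weights` and the example vector are hypothetical; the fallback for a vector with no positive entry under $p = 2$ (pick the single largest entry) is an assumption added here so that the constraint $\|w\|_2 = 1$ still holds.

```python
import math

def optimal_weights(x, p):
    """Closed-form maximizer of w.x subject to ||w||_p = 1 and w >= 0."""
    d = len(x)
    w = [0.0] * d
    best = max(range(d), key=lambda f: x[f])
    if p == 1:
        w[best] = 1.0                      # single attribute with largest x
    elif p == 2:
        norm = math.sqrt(sum(v * v for v in x if v > 0))
        if norm > 0:
            w = [v / norm if v > 0 else 0.0 for v in x]   # all positive entries
        else:
            w[best] = 1.0                  # assumed fallback: no positive entry
    else:
        raise ValueError("only p = 1 and p = 2 are handled in this sketch")
    normality = sum(wf * xf for wf, xf in zip(w, x))
    return w, normality

x = [3.0, -1.0, 4.0]
w1, n1 = optimal_weights(x, p=1)
w2, n2 = optimal_weights(x, p=2)
```

For $p = 2$ the returned score equals the Euclidean norm of the positive part of $x$, matching $N = \|x_+\|_2$.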
Example neighborhoods
Motivating Problem
Connotation Mining: finding the dash of sentiment beneath “seemingly objective” words & senses
[Figure: example words cheesecake, emission, fine, fine]
Words+Senses edge-typed network
Classification
- A collective classification approach
  - Objective utilizes pairwise Markov Random Fields
[Equation not recovered: objective maximized over $y$; annotated terms: node labels as random variables, prior belief, edge potential (label-label), edge potential (label-observed label), edge type $t$]
Edge potentials depend on edge type
Inference
- A collective classification approach
  - Objective utilizes pairwise Markov Random Fields
  - Inference problem (NP-hard)
- Loopy Belief Propagation (LBP)
  - 1) Repeat for each node: [equation not recovered; annotated terms: edge type, edge potential]
  - 2) At convergence: [equation not recovered]
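A compact sketch of sum-product loopy belief propagation in which the edge potential is looked up by edge type. This is the generic textbook update, not the authors' implementation. The function name `loopy_bp`, the edge type names `"syn"` and `"ant"`, the potential values, and the priors are all made up for illustration; potentials are assumed symmetric and each node pair is assumed to share at most one edge.

```python
def loopy_bp(nodes, edges, priors, potentials, iters=50):
    """edges: list of (u, v, edge_type); potentials[edge_type][y_u][y_v];
    priors[node]: list of prior beliefs, one per label."""
    K = len(next(iter(priors.values())))
    nbrs = {u: [] for u in nodes}
    msgs = {}                                   # msgs[(u, v)]: message from u to v
    for u, v, t in edges:
        nbrs[u].append((v, t))
        nbrs[v].append((u, t))
        msgs[(u, v)] = [1.0 / K] * K
        msgs[(v, u)] = [1.0 / K] * K
    # 1) Repeat for each node: send an updated message to every neighbor.
    for _ in range(iters):
        new = {}
        for u in nodes:
            for v, t in nbrs[u]:
                out = []
                for yv in range(K):
                    total = 0.0
                    for yu in range(K):
                        prod = priors[u][yu] * potentials[t][yu][yv]
                        for k, _ in nbrs[u]:
                            if k != v:          # exclude the recipient's own message
                                prod *= msgs[(k, u)][yu]
                        total += prod
                    out.append(total)
                z = sum(out)
                new[(u, v)] = [o / z for o in out]
        msgs = new
    # 2) At convergence: belief = prior times all incoming messages, normalized.
    beliefs = {}
    for u in nodes:
        b = list(priors[u])
        for k, _ in nbrs[u]:
            b = [bi * mi for bi, mi in zip(b, msgs[(k, u)])]
        z = sum(b)
        beliefs[u] = [bi / z for bi in b]
    return beliefs

# Toy chain a - b - c with labels (0 = positive, 1 = negative).
potentials = {"syn": [[0.8, 0.2], [0.2, 0.8]],   # same label favored
              "ant": [[0.2, 0.8], [0.8, 0.2]]}   # opposite label favored
priors = {"a": [0.9, 0.1], "b": [0.5, 0.5], "c": [0.5, 0.5]}
edges = [("a", "b", "syn"), ("b", "c", "ant")]
beliefs = loopy_bp(["a", "b", "c"], edges, priors, potentials)
```

On this tree-shaped example the messages settle after a few sweeps; on graphs with cycles LBP is only an approximation and convergence is not guaranteed.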
Summary
- Ranking in node-typed graphs with location
  - motivating domain: physician referrals
  - HINside model for ranking w/ parameter learning
- Anomalous subgraphs in node-attributed graphs
  - motivating domain: social networks
  - AMEN model for quality scoring
- Classification in edge-typed graphs
  - motivating application: connotation mining/NLP
  - LBP with type-specific edge potentials
References
- Ranking in Heterogeneous Networks with Geo-Location Information. Abhinav Mishra & Leman Akoglu. SIAM SDM 2017. Code: https://github.com/abhimm/HINSIDE
- Scalable Anomaly Ranking of Attributed Neighborhoods. Bryan Perozzi & Leman Akoglu. SIAM SDM 2016. Code: https://github.com/phanein/amen
- ConnotationWordNet: Learning Connotation of the Word+Sense Network. Jun S. Kang, Song Feng, Leman Akoglu, Yejin Choi. ACL 2014. http://www3.cs.stonybrook.edu/~junkang/connotation_wordnet/