Mining Rich Graphs: Ranking, Classification, and Anomaly Detection - PowerPoint PPT Presentation



slide-1
SLIDE 1

Mining Rich Graphs

Ranking, Classification, and Anomaly Detection

Leman Akoglu Feb 9th 2018

slide-2
SLIDE 2

Networks are ubiquitous!

  • Internet Map [Koren 2009]
  • Food Web [2007]
  • Protein Network [Salthe 2004]
  • Social Network [Newman 2005]
  • Web Graph
  • Terrorist Network [Krebs 2002]

slide-3
SLIDE 3

Graph problems

  • ranking,
  • classification,
  • clustering & anomaly mining,
  • link prediction,
  • role discovery,
  • similarity search,
  • influence,
  • evolution,
  • …

slide-4
SLIDE 4

Ranking in networks

Src: wiki/PageRank


slide-5
SLIDE 5

Classification in networks

Src: [Adamic+ 2005]


slide-6
SLIDE 6

Community detection in networks

Src: [McAuley & Leskovec 2012]

slide-7
SLIDE 7

Rich networks


slide-8
SLIDE 8

Read the Web

Rich networks are also ubiquitous!

slide-9
SLIDE 9

Graph problems on rich networks

Read the Web

  • ranking,
  • clustering & anomaly mining,
  • classification,
  • link prediction,
  • role discovery,
  • similarity search,
  • influence,
  • evolution,
  • …

slide-10
SLIDE 10

Graph problems on rich networks

Read the Web

  • ranking,
  • clustering & anomaly mining,
  • classification,
  • link prediction,
  • role discovery,
  • similarity search,
  • influence,
  • evolution,
  • …

slide-11
SLIDE 11

Ranking in rich networks: example

Medical referral network (weighted, directed)


slide-12
SLIDE 12

Ranking in rich networks: example

Medical referral network + physician expertise


slide-13
SLIDE 13

Ranking in rich networks: example

Medical referral network + physician expertise + location

Town A Town B


slide-14
SLIDE 14

Ranking in rich networks

Ranking Problem: Which are the top k nodes of a certain type?

e.g.: Who are the best cardiologists in the network, in my town, etc.?

[Figure labels: Town A, Town B]

Ranking in Heterogeneous Networks with Geo-Location Information. Abhinav Mishra & Leman Akoglu. SIAM SDM 2017.

slide-15
SLIDE 15

Modeling the ranking problem

Goal: ranking in directed heterogeneous information networks (HIN) with geo-location

  • HINside model
      1. Relation strength
      2. Relation distance
      3. Neighbor authority
      4. Authority transfer rates
      5. Competition
      • Closed-form solution
  • Parameter estimation

slide-16
SLIDE 16

HINside model

Relation Strength and Distance

  • edge weights: $W$ denotes the matrix such that $W(i, j) = \log(w(i, j) + 1)$
  • pair-wise distances: $D$ denotes the distance matrix such that $D(i, j) = \log(d(l_i, l_j) + 1)$

Relation strength and relation distance are combined element-wise:

$$M = W \odot D \tag{3.1}$$
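A minimal NumPy sketch of how the matrix in (3.1) could be built. It assumes that the combination of W and D is an element-wise product; the function name `relation_matrix` and the toy weights and distances are hypothetical and are not taken from the HINside code.

```python
import numpy as np

def relation_matrix(w, d):
    """Combine raw edge weights w and pairwise distances d (both n x n) into M."""
    W = np.log(w + 1.0)  # relation strength: W(i, j) = log(w(i, j) + 1)
    D = np.log(d + 1.0)  # relation distance: D(i, j) = log(d(l_i, l_j) + 1)
    return W * D         # assumed element-wise product for (3.1)

# Hypothetical toy data: 3 nodes, w[i, j] = weight of the edge i -> j.
w = np.array([[0, 3, 0],
              [1, 0, 2],
              [0, 0, 0]], dtype=float)
d = np.array([[0, 10, 50],
              [10, 0, 40],
              [50, 40, 0]], dtype=float)
M = relation_matrix(w, d)
print(M)
```

Under this assumption, pairs with no edge keep a zero entry in M regardless of their distance, because log(0 + 1) = 0.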

slide-17
SLIDE 17

HINside model

In-neighbor authority

$$r_i = \sum_{j \in V} M(j, i)\, r_j \tag{3.2}$$

Authority Transfer Rates (ATR)

$$r_i = \sum_{j \in V} \Gamma(t_j, t_i)\, M(j, i)\, r_j \tag{3.3}$$

$t_i$: type of node $i$
$r_i$: authority score of node $i$

slide-18
SLIDE 18

HINside model

Competition

Other nodes of type $t_i$ in the vicinity of node $j$:

$$N(u, v) = \begin{cases} g(d(l_u, l_v)) & u, v \in V,\ u \neq v \\ 0 & u = v \end{cases}$$

for monotonically decreasing $g$, e.g. $g(z) = e^{-z}$. The authority scores become:

$$r_i = \sum_{j} \Gamma(t_j, t_i)\, M(j, i) \Big( r_j + \sum_{v : t_v = t_i} N(v, j)\, r_v \Big) \tag{3.4}$$
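The competition matrix can be sketched in a few lines of NumPy. This is an illustration only: it assumes the decreasing function is g(z) = exp(-z) and that N is zero on the diagonal; `competition_matrix` and the toy distances are hypothetical names and values.

```python
import numpy as np

def competition_matrix(dist, g=lambda z: np.exp(-z)):
    """N(u, v) = g(d(l_u, l_v)) for u != v, and 0 on the diagonal."""
    N = g(np.asarray(dist, dtype=float))  # apply the decreasing function entry-wise
    np.fill_diagonal(N, 0.0)              # a node does not compete with itself
    return N

# Hypothetical pairwise distances between 3 node locations.
dist = np.array([[0.0, 1.0, 4.0],
                 [1.0, 0.0, 2.0],
                 [4.0, 2.0, 0.0]])
N = competition_matrix(dist)
print(N)
```

Closer nodes receive larger entries, so a nearby competitor of the same type contributes more to the inner sum of (3.4).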

slide-19
SLIDE 19

Closed-form

  • Authority scores vector $r$ written in closed form (& computed by power iterations) as:

$$r = \left[ L' + (L' N' \odot E) \right] r = H r$$

      • $L = M \odot (T \Gamma T')$, where
          • $T$ ($n \times m$): $T(i, c) = 1$ if $t_i = \mathcal{T}(c)$, and $0$ otherwise
          • $\Gamma$ ($m \times m$): authority transfer rates (ATR)
      • $E = T T'$, that is, $E(u, v) = 1$ if $t_u = t_v$, and $0$ otherwise

$n$: #nodes, $m$: #types
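The closed form can be checked numerically. The sketch below builds H from M, N, the node types, and Γ, then runs power iterations. It assumes the lost operators in the formulas are element-wise products, and it normalizes r to sum to 1 at every step, which is a common choice rather than something stated on the slide. The function name and the random toy inputs are hypothetical.

```python
import numpy as np

def hinside_scores(M, N, types, Gamma, iters=200, tol=1e-12):
    """Build H = L' + (L' N') * E and iterate r <- H r (normalized)."""
    n, m = M.shape[0], Gamma.shape[0]
    T = np.zeros((n, m))
    T[np.arange(n), types] = 1.0      # T(i, c) = 1 if node i has type c
    L = M * (T @ Gamma @ T.T)         # L(j, i) = Gamma(t_j, t_i) * M(j, i)
    E = T @ T.T                       # E(u, v) = 1 if t_u == t_v
    H = L.T + (L.T @ N.T) * E
    r = np.full(n, 1.0 / n)
    for _ in range(iters):
        r_new = H @ r
        total = r_new.sum()
        if total == 0.0:              # degenerate input: nothing to normalize
            return r_new, H
        r_new /= total                # assumed normalization step
        converged = np.abs(r_new - r).sum() < tol
        r = r_new
        if converged:
            break
    return r, H

# Hypothetical toy problem: 5 nodes, 2 types, random non-negative inputs.
rng = np.random.default_rng(0)
types = np.array([0, 1, 0, 1, 1])
M = rng.random((5, 5))
N = rng.random((5, 5))
np.fill_diagonal(N, 0.0)
Gamma = rng.random((2, 2))
r, H = hinside_scores(M, N, types, Gamma)
print(r)
```

Multiplying H by any vector reproduces the double sum of (3.4) entry by entry, which is a quick way to validate the matrix form.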

slide-20
SLIDE 20

Modeling the ranking problem

Goal: ranking in directed heterogeneous information networks (HIN) with geo-location

  • HINside model
      1. Relation strength
      2. Relation distance
      3. Neighbor authority
      4. Authority transfer rates
      5. Competition
      • Closed-form solution
  • Parameter estimation

slide-21
SLIDE 21

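Parameter estimation

  • HINside's parameters consist of the $m^2$ authority transfer rates (ATR)
      • $r_i$ as a vector-vector product

$$r_i = \sum_{j} \Gamma(t_j, t_i)\, M(j, i) \Big( r_j + \sum_{v : t_v = t_i} N(v, j)\, r_v \Big) \tag{3.4}$$

$$r_i = \sum_{t} \Gamma(t, t_i) \sum_{j : t_j = t} \Big[ M(j, i) \Big( r_j + \sum_{v : t_v = t_i} N(v, j)\, r_v \Big) \Big]$$

$$r_i = \sum_{t} \Gamma(t, t_i)\, X(t, i) = \Gamma'(t_i, :) \cdot X(:, i) = \Gamma'_{t_i} \cdot x_i \tag{4.8}$$

So $r_i$ is a linear function of a feature vector $x_i$, i.e. $r_i = f(x_i) = \langle w, x_i \rangle$, which is the representation used for learning the parameters.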
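To make the regrouping in (4.8) concrete, the following sketch computes the feature matrix X (one column x_i per node, one row per source type t) and recovers r_i as the product of column t_i of Γ with x_i. The helper names `feature_matrix` and `scores_from_features` and the random inputs are hypothetical.

```python
import numpy as np

def feature_matrix(M, N, types, r, m):
    """X[t, i] = sum over j of type t of M[j, i] * (r[j] + sum_{v: t_v = t_i} N[v, j] r[v])."""
    n = M.shape[0]
    T = np.zeros((n, m))
    T[np.arange(n), types] = 1.0
    E = T @ T.T                      # same-type indicator
    C = (E * r) @ N                  # C[i, j] = sum_{v: t_v = t_i} N[v, j] * r[v]
    B = M * (r[:, None] + C.T)       # B[j, i] = bracketed term of the second equation
    return T.T @ B                   # group the rows j by their type t

def scores_from_features(Gamma, X, types):
    """r_i = Gamma[:, t_i] . X[:, i], the vector-vector product of (4.8)."""
    return np.einsum("ti,ti->i", Gamma[:, types], X)

# Hypothetical toy problem: 5 nodes, 2 types.
rng = np.random.default_rng(1)
types = np.array([0, 1, 0, 1, 1])
M = rng.random((5, 5))
N = rng.random((5, 5))
np.fill_diagonal(N, 0.0)
Gamma = rng.random((2, 2))
r = rng.random(5)
X = feature_matrix(M, N, types, r, m=2)
scores = scores_from_features(Gamma, X, types)
print(scores)
```

With r held fixed, each score is linear in the corresponding column of Γ, which is what lets a linear learning-to-rank method estimate the transfer rates.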

slide-22
SLIDE 22

An alternating optimization scheme:

Given: graph G, (partial) lists ranking a subset of nodes of a certain type

  • Randomly initialize $\Gamma^0$
  • Compute authority scores $r$ using $\Gamma^0$
  • Repeat
      • compute feature vectors $X$ using $r$
      • learn new parameters $\Gamma$ by learning-to-rank
      • compute authority scores $r$ using $\Gamma$
  • Until convergence

slide-23
SLIDE 23

An alternating optimization scheme:

Given: graph G, (partial) lists ranking a subset of nodes of a certain type

  • Randomly initialize $\Gamma^0$
  • Compute authority scores $r$ using $\Gamma^0$
  • Repeat
      • compute feature vectors $X$ using $r$
      • learn new parameters $\Gamma$ by learning-to-rank
      • compute authority scores $r$ using $\Gamma$
  • Until convergence

slide-24
SLIDE 24

RankSVM formulation

Given partial ranked lists:

  • create all pairs of nodes $(u, v)$
  • add training data: instance $((x_u, x_v), +1)$ if $u$ is ranked ahead of $v$, and $((x_u, x_v), -1)$ otherwise, giving training data $\{((x_d^1, x_d^2), y_d)\}_{d=1}^{|D|}$
  • for each type $t$, solve:

$$\min_{\Gamma_t} \; \|\Gamma_t\|_2^2 + \sum_{d \in D} \epsilon_d$$

$$\text{s.t. } \Gamma_t' (x_d^1 - x_d^2)\, y_d \ge 1 - \epsilon_d, \quad \forall d \in D \text{ with } t_{x_d^1} = t_{x_d^2} = t$$

$$\epsilon_d \ge 0, \ \forall d \in D; \qquad \Gamma_t(c) \ge 0, \ \forall c = 1, \ldots, m$$

Cross-entropy based objective, by gradient descent
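As a rough illustration of the per-type problem above, the constrained objective can be rewritten with a hinge loss and minimized by projected subgradient descent. This solver is a stand-in chosen for brevity and is not claimed to be the one used by the authors; the function name, step size, and toy pairs are hypothetical.

```python
import numpy as np

def rank_svm_nonneg(X1, X2, y, lr=0.01, epochs=500):
    """Minimize ||g||^2 + sum_d max(0, 1 - y_d * g.(x1_d - x2_d)) subject to g >= 0."""
    diff = (X1 - X2) * y[:, None]          # rows: y_d * (x_d^1 - x_d^2)
    gamma = np.zeros(diff.shape[1])
    for _ in range(epochs):
        margins = diff @ gamma
        active = margins < 1.0             # pairs that violate the margin
        grad = 2.0 * gamma - diff[active].sum(axis=0)
        gamma = np.maximum(gamma - lr * grad, 0.0)  # project onto gamma >= 0
    return gamma

# Hypothetical pairs for one node type: the first feature decides the order.
X1 = np.array([[2.0, 0.0], [3.0, 1.0], [0.0, 1.0]])
X2 = np.array([[1.0, 0.0], [1.0, 1.0], [2.0, 1.0]])
y = np.array([1.0, 1.0, -1.0])             # +1 if the first node is ranked ahead
gamma = rank_svm_nonneg(X1, X2, y)
print(gamma)
```

The projection step enforces the non-negativity constraint on the transfer rates, and the slack variables of the slide correspond to the hinge terms of the active pairs.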

slide-25
SLIDE 25

Graph problems on rich networks

Read the Web

  • ranking,
  • clustering & anomaly mining,
  • classification,
  • link prediction,
  • role discovery,
  • similarity search,
  • influence,
  • evolution,
  • …

slide-26
SLIDE 26

Attributed graphs

Attributed graph: each node has 1+ properties

[Figure: example attributed graph; node labels include teenager, adult, skater, doctor, data scientist, telemarketer]

slide-27
SLIDE 27

Communities in rich networks

Attributed graph: each node has 1+ properties


slide-28
SLIDE 28

Anomalous subgraphs

Given a set of attributed subgraphs* (e.g. Google+ circles), find the poorly-defined ones.

* social circles, communities, egonetworks, …

slide-29
SLIDE 29

Communities in attributed networks

Given an attributed subgraph*, how to quantify its quality?

* social circles, communities, egonetworks, …

slide-30
SLIDE 30

Communities in attributed networks

v Given a subgraph,

how to quantify its quality?

30

slide-31
SLIDE 31

Communities in attributed networks

v Given a subgraph,

how to quantify its quality?

q Structure-only

n Internal measures

q e.g. average degree

31

slide-32
SLIDE 32

Communities in attributed networks

v Given a subgraph,

how to quantify its quality?

q Structure-only

n Internal-only

q average degree

n Boundary-only

q cut edges

n Internal + Boundary

q conductance

32
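The three structure-only measures can be computed directly from an adjacency list. The sketch below uses common textbook definitions, stated here as assumptions: internal average degree 2|E_C|/|C|, the number of edges leaving C, and conductance as cut edges divided by the smaller of the two volumes. The function name and the toy graph (two triangles joined by one edge) are hypothetical.

```python
def subgraph_measures(adj, C):
    """Return (internal average degree, cut edges, conductance) of node set C."""
    C = set(C)
    # Each internal edge is seen from both endpoints, hence the division by 2.
    internal_edges = sum(1 for u in C for v in adj[u] if v in C) // 2
    cut_edges = sum(1 for u in C for v in adj[u] if v not in C)
    vol_C = sum(len(adj[u]) for u in C)              # sum of degrees inside C
    vol_total = sum(len(nbrs) for nbrs in adj.values())
    avg_degree = 2 * internal_edges / len(C)
    denom = min(vol_C, vol_total - vol_C)
    conductance = cut_edges / denom if denom else 0.0
    return avg_degree, cut_edges, conductance

# Toy undirected graph: triangles {0, 1, 2} and {3, 4, 5} joined by edge 2-3.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
       3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
avg_degree, cut_edges, conductance = subgraph_measures(adj, [0, 1, 2])
print(avg_degree, cut_edges, conductance)
```

None of these measures looks at node attributes, which is the gap the "Structure + Attributes?" question points at.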

slide-33
SLIDE 33

Communities in attributed networks

v Given an attributed subgraph,

how to quantify its quality?

q Structure-only

n Internal-only

q average degree

n Boundary-only

q cut edges

n Internal + Boundary

q conductance q Structure + Attributes?

Scalable Anomaly Ranking of Attributed Neighborhoods Bryan Perozzi and Leman Akoglu SIAM SDM 2016.

33

slide-34
SLIDE 34

What’s an Anomaly, Anyhow?

  • Given an attributed subgraph, how to quantify quality?

[Figure labels: high, low]

slide-35
SLIDE 35

Normality (intuition)

  • Given an attributed subgraph, how to quantify quality?
      • Internal
          • structural density

[Figure labels: high, low]

slide-36
SLIDE 36

Normality (intuition)

  • Given an attributed subgraph, how to quantify quality?
      • Internal
          • structural density AND
          • attribute coherence
              • neighborhood "focus"

[Figure labels: chess, biking; high, low]

slide-37
SLIDE 37

Normality (intuition)

n Given an attributed subgraph

how to quantify quality?

q Internal

n structural density AND n attribute coherence

v neighborhood “focus”

q Boundary

n structural sparsity, OR n external separation

v “exoneration”

high low

37

slide-38
SLIDE 38

n Motivation:

q no good cuts in real-world graphs q social circles overlap

n “exoneration” : by (a) null model, (b) attributes

Normality (intuition)

[Leskovec+ ‘08] [McAuley+ ‘14] (b) neighborhood overlap (a) hub effect edges expected, not surprising

separable by different “focus”

38

slide-39
SLIDE 39

The measure of Normality

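$$N = I + E = \sum_{i \in C,\, j \in C} \Big( A_{ij} - \frac{k_i k_j}{2m} \Big)\, s(x_i, x_j \mid w) \;-\; \sum_{\substack{i \in C,\, b \in B \\ (i,b) \in E}} \Big( 1 - \min\big(1, \frac{k_i k_b}{2m}\big) \Big)\, s(x_i, x_b \mid w) \tag{3.4}$$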

slide-40
SLIDE 40

The measure of Normality

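$$N = I + E = \sum_{i \in C,\, j \in C} \Big( A_{ij} - \frac{k_i k_j}{2m} \Big)\, s(x_i, x_j \mid w) \;-\; \sum_{\substack{i \in C,\, b \in B \\ (i,b) \in E}} \Big( 1 - \min\big(1, \frac{k_i k_b}{2m}\big) \Big)\, s(x_i, x_b \mid w) \tag{3.4}$$

  • $I$: internal consistency
      • $k_i k_j / 2m$: null model
      • $s(x_i, x_j \mid w)$: similarity
      • $w$: "focus" vector (e.g. chess, biking)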

slide-41
SLIDE 41

The measure of Normality

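$$N = I + E = \sum_{i \in C,\, j \in C} \Big( A_{ij} - \frac{k_i k_j}{2m} \Big)\, s(x_i, x_j \mid w) \;-\; \sum_{\substack{i \in C,\, b \in B \\ (i,b) \in E}} \Big( 1 - \min\big(1, \frac{k_i k_b}{2m}\big) \Big)\, s(x_i, x_b \mid w) \tag{3.4}$$

  • $E$: external separability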
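A small NumPy sketch of the normality score for one subgraph C. It makes two assumptions that the slides do not spell out: the similarity is the weighted dot product s(x_i, x_j | w) = Σ_f w(f) x_i(f) x_j(f), and the internal sum runs over all ordered pairs of nodes in C. The function name and the toy graph are hypothetical.

```python
import numpy as np

def normality(A, X, C, w):
    """N = I + E for node set C of an undirected graph with adjacency matrix A."""
    A = np.asarray(A, dtype=float)
    X = np.asarray(X, dtype=float)
    k = A.sum(axis=1)                      # node degrees
    two_m = A.sum()                        # 2m for a symmetric adjacency matrix
    null = np.outer(k, k) / two_m          # null model k_i k_j / 2m
    S = (X * w) @ X.T                      # assumed similarity s(x_i, x_j | w)
    in_C = np.zeros(len(A), dtype=bool)
    in_C[list(C)] = True
    internal = ((A - null) * S)[np.ix_(in_C, in_C)].sum()
    # Boundary edges: one endpoint in C, the other outside C.
    boundary = (A > 0) & (in_C[:, None] & ~in_C[None, :])
    external = -((1.0 - np.minimum(1.0, null)) * S)[boundary].sum()
    return internal + external

# Toy graph: triangles {0, 1, 2} and {3, 4, 5} joined by edge 2-3.
A = np.zeros((6, 6))
for u, v in [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]:
    A[u, v] = A[v, u] = 1.0
# Two binary attributes; node 3 carries both.
X = np.array([[1, 0], [1, 0], [1, 0], [1, 1], [0, 1], [0, 1]], dtype=float)
focused = normality(A, X, [0, 1, 2], np.array([1.0, 0.0]))
unfocused = normality(A, X, [0, 1, 2], np.array([0.0, 1.0]))
print(focused, unfocused)
```

Weighting the attribute shared by the first triangle yields a higher score than weighting the attribute its members lack, which matches the "focus" intuition.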

slide-42
SLIDE 42

Anomaly Mining of Entity Neighborhoods (AMEN)

n Given an attributed subgraph, can we find the

attribute weights?

max

wC

wC

T ·

 X

i∈C,j∈C

  • Aij kikj

2m

  • s(xi, xj)
  • X

i∈C,b∈B (i,b)∈E

  • 1 min(1, kikb

2m )

  • s(xi, xb)
  • 2

1

N = I + E = X

i∈C,j∈C

  • Aij − kikj

2m

  • s(xi, xj|w)

− X

i∈C,b∈B (i,b)∈E

  • 1 − min(1, kikb

2m )

  • s(xi, xb|w)

(3.4)

latent

42

slide-43
SLIDE 43

Optimizing Normality

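$$N = I + E = \sum_{i \in C,\, j \in C} \Big( A_{ij} - \frac{k_i k_j}{2m} \Big)\, s(x_i, x_j \mid w) \;-\; \sum_{\substack{i \in C,\, b \in B \\ (i,b) \in E}} \Big( 1 - \min\big(1, \frac{k_i k_b}{2m}\big) \Big)\, s(x_i, x_b \mid w) \tag{3.4}$$

$$\max_{w_C} \; w_C^T \cdot \Big[ \sum_{i \in C,\, j \in C} \Big( A_{ij} - \frac{k_i k_j}{2m} \Big)\, s(x_i, x_j) \;-\; \sum_{\substack{i \in C,\, b \in B \\ (i,b) \in E}} \Big( 1 - \min\big(1, \frac{k_i k_b}{2m}\big) \Big)\, s(x_i, x_b) \Big]$$

$$\max_{w_C} \; w_C^T \cdot (\hat{x}_I + \hat{x}_E) \tag{4.6}$$

$$\text{s.t. } \|w_C\|_p = 1, \quad w_C(f) \ge 0, \ \forall f = 1 \ldots d$$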

slide-44
SLIDE 44

Optimizing Normality

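$$\max_{w_C} \; w_C^T \cdot (\hat{x}_I + \hat{x}_E) \tag{4.6}$$

$$\text{s.t. } \|w_C\|_p = 1, \quad w_C(f) \ge 0, \ \forall f = 1 \ldots d$$

With $x = \hat{x}_I + \hat{x}_E$:

  • $p = 1$ ($L_1$ norm): one attribute $f$ with largest $x(f)$; that is, $w_C(f) = 1$
  • $p = 2$ ($L_2$ norm): all $f$ with positive $x(f)$; that is, $w_C(f) = \dfrac{x(f)}{\sqrt{\sum_{x(i) > 0} x(i)^2}}$ if $x(f) > 0$ and $0$ otherwise, where $w_C$ is unit-normalized

Normality becomes

$$N = w_C^T \cdot x = \sqrt{\sum_{x(i) > 0} x(i)^2} = \|x_+\|_2$$

Linear in number of attributes!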
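The two closed-form solutions are short enough to write out. The sketch below takes a vector x (standing for the sum of the internal and external terms) and returns the maximizing weights and the resulting normality for p = 1 and p = 2. The function name is hypothetical, and returning all-zero weights when no entry of x is positive under p = 2 is an edge-case choice made here, not something the slide specifies.

```python
import numpy as np

def optimal_weights(x, p):
    """Closed-form maximizer of w.x subject to ||w||_p = 1 and w >= 0, for p in {1, 2}."""
    x = np.asarray(x, dtype=float)
    w = np.zeros_like(x)
    if p == 1:
        w[int(np.argmax(x))] = 1.0        # all weight on the single best attribute
    elif p == 2:
        pos = np.maximum(x, 0.0)          # keep only attributes with positive x
        norm = np.linalg.norm(pos)
        if norm > 0.0:
            w = pos / norm                # unit-normalized positive part
    else:
        raise ValueError("only p = 1 and p = 2 are handled in this sketch")
    return w, float(w @ x)

x = np.array([3.0, -1.0, 4.0])
w1, n1 = optimal_weights(x, p=1)
w2, n2 = optimal_weights(x, p=2)
print(w1, n1)
print(w2, n2)
```

Both cases need a single pass over the d entries of x, which is where the linear cost in the number of attributes comes from.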

slide-45
SLIDE 45

Example neighborhoods


slide-46
SLIDE 46

Graph problems on rich networks

Read the Web

  • ranking,
  • clustering & anomaly mining,
  • classification,
  • link prediction,
  • role discovery,
  • similarity search,
  • influence,
  • evolution,
  • …

slide-47
SLIDE 47

Motivating Problem

Connotation Mining: finding dash of sentiment beneath “seemingly objective” words & senses

cheesecake emission fine fine


slide-48
SLIDE 48

Words+Senses edge-typed network


slide-49
SLIDE 49

Classification

n A collective classification approach

q Objective utilizes pairwise Markov Random Fields

Node labels as random variables edge potential (label-label) prior belief edge potential (label-observed label)

max y edge type

t t

49

slide-50
SLIDE 50

Edge potentials depend on edge type


slide-51
SLIDE 51

Inference

n A collective classification approach

q Objective utilizes pairwise Markov Random Fields

  • Inference problem (NP-hard)

n Loopy Belief Propagation (LBP)

edge type 1) Repeat for each node: 2) At convergence: edge potential

51
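The two LBP steps can be sketched for a small graph with typed edges. This is a generic sum-product implementation under stated assumptions, not the authors' code: potentials are symmetric K x K matrices looked up by edge type, there is at most one edge per node pair, and messages are normalized after every update. The names `loopy_bp`, "syn", and "ant" and the toy numbers are hypothetical.

```python
import numpy as np

def loopy_bp(n, edges, priors, potentials, iters=50, tol=1e-6):
    """Sum-product LBP; potentials[t][a, b] scores labels (a, b) on an edge of type t."""
    priors = np.asarray(priors, dtype=float)
    K = priors.shape[1]
    nbrs = {u: [] for u in range(n)}
    for u, v, t in edges:
        nbrs[u].append((v, t))
        nbrs[v].append((u, t))
    msgs = {(u, v): np.full(K, 1.0 / K) for u in nbrs for v, _ in nbrs[u]}
    for _ in range(iters):
        new = {}
        for u in nbrs:                      # 1) repeat for each node
            for v, t in nbrs[u]:
                prod = priors[u].copy()
                for k, _ in nbrs[u]:
                    if k != v:              # all incoming messages except the one from v
                        prod *= msgs[(k, u)]
                m = potentials[t].T @ prod  # sum over the label of u
                new[(u, v)] = m / m.sum()
        delta = max((np.abs(new[e] - msgs[e]).max() for e in msgs), default=0.0)
        msgs = new
        if delta < tol:
            break
    beliefs = priors.copy()                 # 2) at convergence
    for v in nbrs:
        for k, _ in nbrs[v]:
            beliefs[v] *= msgs[(k, v)]
    return beliefs / beliefs.sum(axis=1, keepdims=True)

# Toy chain 0 - 1 - 2 with two labels and two hypothetical edge types.
potentials = {"syn": np.array([[0.9, 0.1], [0.1, 0.9]]),   # similar labels favored
              "ant": np.array([[0.1, 0.9], [0.9, 0.1]])}   # opposite labels favored
edges = [(0, 1, "syn"), (1, 2, "ant")]
priors = np.array([[0.9, 0.1], [0.5, 0.5], [0.5, 0.5]])
beliefs = loopy_bp(3, edges, priors, potentials)
print(beliefs)
```

On this loop-free chain the procedure is exact: node 1 inherits the leaning of node 0 through the "syn" edge, and node 2 leans the other way through the "ant" edge.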

slide-52
SLIDE 52

Summary

n Ranking in node-typed graphs with location

q motivating domain: physician referrals q HINside model for ranking w/ parameter learning

n Anomalous subgraphs in node-attributed graphs

q motivating domain: social networks q AMEN model for quality scoring

n Classification in edge-typed graphs

q motivating application: connotation mining/NLP q LBP with type-specific edge potentials

52

slide-53
SLIDE 53

References

n Ranking in Heterogeneous Networks with Geo-Location

Information Abhinav Mishra & Leman Akoglu. SIAM SDM 2017. Code: https://github.com/abhimm/HINSIDE

n Scalable Anomaly Ranking of Attributed Neighborhoods

Bryan Perozzi & Leman Akoglu. SIAM SDM 2016. Code: https://github.com/phanein/amen

n ConnotationWordNet: Learning Connotation of the Word

+Sense Network Jun S. Kang, Song Feng, Leman Akoglu, Yejin Choi. ACL 2014.

http://www3.cs.stonybrook.edu/~junkang/connotation_wordnet/

53