Axioms for graph clustering objective functions, Twan van Laarhoven



SLIDE 1

Introduction Axioms Modularity Adaptive Modularity Conclusion

Axioms for graph clustering objective functions

Twan van Laarhoven

Institute for Computing and Information Sciences Radboud University Nijmegen, The Netherlands

28th June 2013

SLIDE 2

Outline

  • Introduction
  • Axioms
  • Modularity
  • Adaptive Modularity
  • Conclusion

SLIDE 3

The motivation

  • There is no strict definition of clustering.
  • Can we formalize our intuition?
  • Previous work is about distance-based clustering (hierarchical clustering, K-means, etc.).

  • What about graphs?

SLIDE 4

The setting

Definition (Graph)

A symmetric weighted graph is a pair (V, E) of

  • a finite set V of nodes, and
  • a function E : V × V → ℝ≥0 of edge weights,

such that E(i, j) = E(j, i) for all i, j ∈ V.

  • Larger weight = stronger connection.
  • We allow self loops.

SLIDE 5

The setting (cont.)

Definition (Clustering)

A clustering C of a graph G = (V , E) is a partition of its nodes.

Definition (Clustering function)

A graph clustering function f is a function from graphs G to clusterings of G.

Definition (Objective function)

A graph clustering objective function Q is a function from graphs G and clusterings of G to R.

  • Larger objective value = better.

SLIDE 7

The form of axioms

Things that define clusterings

     Form                  Notation
  1  Clustering function   f(G) = argmax_C Q(G, C)
  2  Objective function    Q(G, C)
  3  Objective relation    Q(G, C) ≥ Q(G, D), or C ≥_G D
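The first form, a clustering function obtained as the argmax of an objective, can be sketched in Python. This is only an illustration under stated assumptions: the graph is stored as a dict from unordered node pairs to weights, the search enumerates every partition (so it is only usable on tiny graphs), and `toy_objective` is a made-up objective that does not come from the talk.

```python
from itertools import combinations

def partitions(nodes):
    """Yield every partition of `nodes` as a list of blocks (lists)."""
    nodes = list(nodes)
    if not nodes:
        yield []
        return
    first, rest = nodes[0], nodes[1:]
    for smaller in partitions(rest):
        # Put `first` into each existing block in turn ...
        for k in range(len(smaller)):
            yield smaller[:k] + [[first] + smaller[k]] + smaller[k + 1:]
        # ... or into a new block of its own.
        yield [[first]] + smaller

def weight(E, i, j):
    """Symmetric edge weight E(i, j); missing pairs have weight 0."""
    return E.get(frozenset((i, j)), 0.0)

def toy_objective(V, E, clustering):
    """Hypothetical Q (not from the talk): each within-cluster pair scores weight - 0.5."""
    return sum(weight(E, i, j) - 0.5
               for c in clustering for i, j in combinations(c, 2))

def clustering_function(Q, V, E):
    """f(G) = argmax_C Q(G, C), by exhaustive search over all partitions of V."""
    return max(partitions(V), key=lambda C: Q(V, E, C))

V = ["a", "b", "c"]
E = {frozenset("ab"): 1.0}  # a single edge a-b of weight 1
best = clustering_function(toy_objective, V, E)
print(sorted(sorted(c) for c in best))  # [['a', 'b'], ['c']]
```

Any function with the signature `Q(V, E, clustering)` can be passed in place of `toy_objective`.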

SLIDE 8

Basic axioms

Axiom 1: Scale invariance (first form)

A graph clustering function f is scale invariant if

  • for all graphs G = (V , E),
  • all constants α > 0,

f(G) = f(αG),
where αG = (V, (i, j) ↦ αE(i, j)).

Example

[Figure: f applied to a graph on nodes a, b, c, d, e and to a rescaled copy of it gives the same clustering.]

SLIDE 9

Basic axioms

Axiom 1: Scale invariance (second form)

A graph clustering objective function Q is scale invariant if

  • for all graphs G = (V , E),
  • all constants α > 0,
  • all clusterings C of G,

Q(G, C) = Q(αG, C),
where αG = (V, (i, j) ↦ αE(i, j)).

Example

[Figure: Q of a clustered graph on nodes a, b, c, d, e equals Q of a rescaled copy with the same clustering.]

SLIDE 10

Basic axioms

Axiom 1: Scale invariance (second form)

A graph clustering objective function Q is scale invariant if

  • for all graphs G = (V , E),
  • all constants α > 0,
  • all clusterings C of G,

Q(G, C) = αQ(αG, C) ???
where αG = (V, (i, j) ↦ αE(i, j)).

Example

[Figure: Q of a clustered graph on nodes a, b, c, d, e compared with α times Q of a rescaled copy.]

SLIDE 11

Basic axioms

Axiom 1: Scale invariance (third form)

A graph clustering objective function Q is scale invariant if

  • for all graphs G = (V , E),
  • all constants α > 0,
  • all clusterings C1, C2 of G,

Q(G, C1) ≥ Q(G, C2) if and only if Q(αG, C1) ≥ Q(αG, C2),
where αG = (V, (i, j) ↦ αE(i, j)).

Example

[Figure: if Q ranks one clustering of a graph at least as high as another, it ranks them the same way on the rescaled graph.]

SLIDE 12

Basic axioms

Axiom 2: permutation invariance

A graph clustering objective function Q is permutation invariant if

  • for all graphs G = (V, E),
  • all clusterings C of G and
  • all isomorphisms f : V → V′,

it is the case that Q(G, C) = Q(f(G), f(C)),
where f is extended to graphs and clusterings in the obvious way.

Example

[Figure: Q of a clustered graph on nodes a, b, c, d, e equals Q of the same graph and clustering relabeled with nodes u, v, x, y, z.]

SLIDE 13

Basic axioms

Axiom 3: Richness

A graph clustering objective function Q is rich if

  • for all sets V and
  • all partitions C* of V,

there is a graph G = (V, E) such that C* is the optimal clustering of G.

Intuition:

  • No trivial objective functions.
  • No fixed number of clusters.

SLIDE 14

Basic axioms

Definition (Consistent improvement)

Let

  • G = (V, E) and G′ = (V, E′) be graphs, and
  • C be a clustering of G and G′.

Then G′ is a C-consistent improvement of G if

  • E′(i, j) ≥ E(i, j) for all i ∼C j (nodes in the same cluster) and
  • E′(i, j) ≤ E(i, j) for all i ≁C j (nodes in different clusters).

Intuition:

  • Consistent improvements make a clustering fit better.
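A consistent improvement may only raise weights inside clusters and only lower weights between clusters. A minimal Python check of that condition, assuming a graph is stored as a dict from unordered node pairs to weights (a representation chosen here for illustration, not one given in the talk):

```python
from itertools import combinations_with_replacement

def weight(E, i, j):
    """Symmetric edge weight E(i, j); missing pairs have weight 0."""
    return E.get(frozenset((i, j)), 0.0)

def is_consistent_improvement(V, E, E_new, clustering):
    """True if E_new is a C-consistent improvement of E for the given clustering."""
    label = {i: k for k, c in enumerate(clustering) for i in c}
    # combinations_with_replacement also visits (i, i), i.e. self loops.
    for i, j in combinations_with_replacement(V, 2):
        old, new = weight(E, i, j), weight(E_new, i, j)
        if label[i] == label[j]:
            if new < old:      # within-cluster weight went down
                return False
        elif new > old:        # between-cluster weight went up
            return False
    return True

V = ["a", "b", "c", "d"]
C = [["a", "b"], ["c"], ["d"]]
E = {frozenset("ab"): 1.0, frozenset("cd"): 1.0}
E_weaker_between = {frozenset("ab"): 1.0, frozenset("cd"): 0.1}
E_weaker_within = {frozenset("ab"): 0.5, frozenset("cd"): 1.0}
print(is_consistent_improvement(V, E, E_weaker_between, C))  # True
print(is_consistent_improvement(V, E, E_weaker_within, C))   # False
```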

SLIDE 15

Basic axioms

Axiom 4: Monotonicity

A graph clustering objective function Q is monotonic if

  • for all graphs G,
  • all clusterings C of G and
  • all C-consistent improvements G′ of G

it is the case that Q(G′, C) ≥ Q(G, C).

Example

[Figure: Q of the improved graph ≥ Q of the original graph, for a clustered graph on nodes a, b, c, d, e.]

SLIDE 16

Local changes

Definition (agreement)

Let

  • G1 = (V1, E1) and G2 = (V2, E2) be two graphs and
  • Va ⊆ V1 ∩ V2.

The graphs agree on Va if E1(i, j) = E2(i, j) for all i, j ∈ Va.

Definition (agreement on neighborhood)

The graphs also agree on the neighborhood of Va if

  • E1(i, j) = E2(i, j) for all i ∈ Va, j ∈ V1 ∩ V2,
  • E1(i, j) = 0 for all i ∈ Va, j ∈ V1 \ V2, and
  • E2(i, j) = 0 for all i ∈ Va, j ∈ V2 \ V1.

What this means:

  • For nodes/clusters in Va, all incident edges are the same.
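The two agreement definitions translate almost line by line into code. The sketch below assumes graphs are stored as a node set plus a dict from unordered node pairs to weights; the function names are chosen here for illustration.

```python
def weight(E, i, j):
    """Symmetric edge weight E(i, j); missing pairs have weight 0."""
    return E.get(frozenset((i, j)), 0.0)

def agree_on(E1, E2, Va):
    """E1(i, j) = E2(i, j) for all i, j in Va."""
    return all(weight(E1, i, j) == weight(E2, i, j) for i in Va for j in Va)

def agree_on_neighborhood(V1, E1, V2, E2, Va):
    """All edges incident to Va are the same in both graphs."""
    V1, V2, Va = set(V1), set(V2), set(Va)
    if not Va <= V1 & V2:
        raise ValueError("Va must be a subset of both node sets")
    shared = all(weight(E1, i, j) == weight(E2, i, j) for i in Va for j in V1 & V2)
    only_1 = all(weight(E1, i, j) == 0 for i in Va for j in V1 - V2)
    only_2 = all(weight(E2, i, j) == 0 for i in Va for j in V2 - V1)
    return shared and only_1 and only_2

V1, E1 = {"a", "b", "c"}, {frozenset("ab"): 1.0, frozenset("bc"): 2.0}
V2, E2 = {"a", "b", "d"}, {frozenset("ab"): 1.0, frozenset("bd"): 3.0}
print(agree_on(E1, E2, {"a", "b"}))                        # True
print(agree_on_neighborhood(V1, E1, V2, E2, {"a"}))        # True
print(agree_on_neighborhood(V1, E1, V2, E2, {"a", "b"}))   # False: b has an edge to c
```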

SLIDE 18

Local changes

Axiom 5: Locality

A graph clustering objective function Q is local if

  • for all graphs G1 = (V1, E1) and G2 = (V2, E2) that agree on a set Va and its neighborhood, and
  • for all clusterings C1 of V1 \ Va, C2 of V2 \ Va and Ca, Da of Va,

if Q(G1, Ca ∪ C1) ≥ Q(G1, Da ∪ C1) then Q(G2, Ca ∪ C2) ≥ Q(G2, Da ∪ C2).

SLIDE 19

Local changes

Example

Q   a b c · · ·   ≥ Q   a b c · · ·  

  • Q

     a b c · · ·      ≥ Q      a b c · · ·     

SLIDE 20

Local changes

Special cases

  • G1 = G2: change part of a clustering. In practice: optimize parts separately (divide and conquer).
  • Va = ∅: union of two disjoint graphs.

SLIDE 21

Interlude: Related work

Theorem (Kleinberg 2002)

There is no clustering function that is permutation invariant, scale invariant, monotonic and rich.

Theorem (Ackerman, Ben-David 2008)

There is a clustering quality function that is permutation invariant, scale invariant, monotonic and rich.

SLIDE 22

Discontinuity is magic

Theorem

There is a graph clustering function that is scale invariant, permutation invariant, monotonic, rich and local.

Connected components

f_coco(G) = the connected components of G
Q_coco(G, C) = 1[C are the connected components of G]

Huh!?!?

  • Doesn’t this contradict Kleinberg’s theorem?
  • No: edge weight 0 = distance ∞.

SLIDE 25

Discontinuity is magic

Why I don’t like it

  • Adding/removing an edge with tiny weight ε changes the graph slightly, but the clustering completely.

  • Possibly unstable.
  • So don’t allow it.
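The instability is easy to see in code. The sketch below implements the connected-components clustering function and its indicator objective with a small union-find, assuming a graph is stored as a dict from unordered node pairs to weights (an illustrative representation, not one given in the talk), and then adds a single edge of weight 1e-9.

```python
def connected_components(V, E):
    """f_coco: nodes joined by a positive-weight edge end up in the same cluster."""
    parent = {v: v for v in V}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for pair, w in E.items():
        if w > 0:
            ends = tuple(pair)             # a self loop has a single element
            parent[find(ends[0])] = find(ends[-1])

    clusters = {}
    for v in V:
        clusters.setdefault(find(v), set()).add(v)
    return {frozenset(c) for c in clusters.values()}

def q_coco(V, E, clustering):
    """Q_coco: 1 if the clustering equals the connected components, else 0."""
    return 1 if {frozenset(c) for c in clustering} == connected_components(V, E) else 0

V = ["a", "b", "c", "d"]
E = {frozenset("ab"): 1.0, frozenset("cd"): 1.0}
E_eps = dict(E)
E_eps[frozenset("bc")] = 1e-9              # one extra edge with tiny weight

before = connected_components(V, E)        # two clusters: {a, b} and {c, d}
after = connected_components(V, E_eps)     # a single cluster {a, b, c, d}
print(len(before), len(after))             # 2 1
print(q_coco(V, E, before), q_coco(V, E_eps, before))  # 1 0
```

The objective value of the original clustering jumps from 1 to 0 for an arbitrarily small change in the graph, which is exactly what a continuity requirement rules out.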

Axiom 6: continuity

An objective function Q is continuous if a small change in the graph leads to a small change in the objective value.

SLIDE 27

An objective function

Modularity

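\[ Q_{\text{modularity}}(G, C) = \sum_{c \in C} \left( \frac{w_c}{v_V} - \left( \frac{v_c}{v_V} \right)^2 \right), \]

where

\[ v_c = \sum_{i \in c} \sum_{j \in V} E(i, j) \quad \text{(volume of cluster } c\text{)}, \]

\[ w_c = \sum_{i \in c} \sum_{j \in c} E(i, j) \quad \text{(within-cluster weight)}. \]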

SLIDE 28

Properties

The obvious:

  • Modularity is permutation invariant.
  • Modularity is scale invariant.
  • Modularity is continuous.

The less obvious:

  • Modularity is rich.

The bad:

  • Modularity is not local.
  • Modularity is not monotonic.

SLIDE 29

What goes wrong?

Modularity is not monotonic.

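Qmodularity([graph on nodes a, b, c, d with edge weights 1 and 1]) = 0.125

Qmodularity([graph on nodes a, b, c, d with edge weights 0.1 and 1]) = 0.079

Qmodularity([graph on nodes a, b, c, d with edge weights 1 and 10]) = 0.079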
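The three modularity values can be reproduced numerically. The drawings did not survive extraction, so the graph used here is an assumption: edges a-b and c-d, with the clustering {a, b}, {c}, {d}. Under that reading, weakening the between-cluster edge c-d from 1 to 0.1, or strengthening the within-cluster edge a-b from 1 to 10, are both consistent improvements, yet modularity drops from 0.125 to about 0.079. The dict-of-pairs graph representation is also chosen here for illustration.

```python
def weight(E, i, j):
    """Symmetric edge weight E(i, j); missing pairs have weight 0."""
    return E.get(frozenset((i, j)), 0.0)

def modularity(V, E, clustering):
    """Sum over clusters of w_c / v_V - (v_c / v_V)^2."""
    v_total = sum(weight(E, i, j) for i in V for j in V)      # v_V
    q = 0.0
    for c in clustering:
        w_c = sum(weight(E, i, j) for i in c for j in c)      # within-cluster weight
        v_c = sum(weight(E, i, j) for i in c for j in V)      # cluster volume
        q += w_c / v_total - (v_c / v_total) ** 2
    return q

V = ["a", "b", "c", "d"]
C = [["a", "b"], ["c"], ["d"]]              # assumed clustering
base = {frozenset("ab"): 1.0, frozenset("cd"): 1.0}
weaker_between = {frozenset("ab"): 1.0, frozenset("cd"): 0.1}
stronger_within = {frozenset("ab"): 10.0, frozenset("cd"): 1.0}

print(round(modularity(V, base, C), 3))             # 0.125
print(round(modularity(V, weaker_between, C), 3))   # 0.079
print(round(modularity(V, stronger_within, C), 3))  # 0.079
```

The last two graphs differ only by a factor of 10, so their equal scores also illustrate scale invariance.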

SLIDE 31

Fixed Scale modularity

Idea 1

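Fix the scale:

\[ Q_{M\text{-fixed}}(G, C) = \sum_{c \in C} \left( \frac{w_c}{M} - \left( \frac{v_c}{M} \right)^2 \right) \]

Is it monotonic? Take v_c = w_c + b_c (within + between):

\[ \frac{\partial Q_{M\text{-fixed}}(G, C)}{\partial w_c} = \frac{1}{M} - \frac{2 w_c + 2 b_c}{M^2}. \]

This is negative when 2v_c > M

⇒ not monotonic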

SLIDE 34

Adaptive Scale Modularity

Idea 2

Add some v_c to the denominator:

\[ Q_{M,\gamma}(G, C) = \sum_{c \in C} \left( \frac{w_c}{M + \gamma v_c} - \left( \frac{v_c}{M + \gamma v_c} \right)^2 \right). \]

Theorem

Adaptive scale modularity is monotonic for all M ≥ 0 and γ ≥ 2.

Theorem

Adaptive scale modularity is rich for all M ≥ 0 and γ ≥ 1.

Theorem

Adaptive scale modularity is scale invariant for M = 0.
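A small numerical sketch of adaptive scale modularity, computing for each cluster w_c / (M + γ v_c) − (v_c / (M + γ v_c))². It is a spot check on one hand-picked example, not a proof of the theorems. Assumptions made here: the graph is a dict from unordered node pairs to weights, the example graph has edges a-b and c-d with clustering {a, b}, {c}, {d}, and a cluster whose denominator is zero (only possible when M = 0 and the volume is zero) is skipped.

```python
def weight(E, i, j):
    """Symmetric edge weight E(i, j); missing pairs have weight 0."""
    return E.get(frozenset((i, j)), 0.0)

def adaptive_scale_modularity(V, E, clustering, M, gamma):
    q = 0.0
    for c in clustering:
        w_c = sum(weight(E, i, j) for i in c for j in c)   # within-cluster weight
        v_c = sum(weight(E, i, j) for i in c for j in V)   # cluster volume
        denom = M + gamma * v_c
        if denom == 0:
            continue  # assumption: such a cluster contributes nothing
        q += w_c / denom - (v_c / denom) ** 2
    return q

V = ["a", "b", "c", "d"]
C = [["a", "b"], ["c"], ["d"]]
E = {frozenset("ab"): 1.0, frozenset("cd"): 1.0}
E_scaled = {pair: 10 * w for pair, w in E.items()}
E_improved = {frozenset("ab"): 1.0, frozenset("cd"): 0.1}  # weaker between-cluster edge

# gamma = 0 and M = v_V gives back ordinary modularity.
v_V = sum(weight(E, i, j) for i in V for j in V)
print(adaptive_scale_modularity(V, E, C, M=v_V, gamma=0))        # 0.125

# M = 0: rescaling all weights leaves the value unchanged.
print(adaptive_scale_modularity(V, E, C, M=0, gamma=2),
      adaptive_scale_modularity(V, E_scaled, C, M=0, gamma=2))   # -0.25 -0.25

# gamma = 2: the consistent improvement does not lower the value.
print(adaptive_scale_modularity(V, E, C, M=1, gamma=2)
      <= adaptive_scale_modularity(V, E_improved, C, M=1, gamma=2))  # True
```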

SLIDE 38

Adaptive Scale Modularity: related objectives

  • When γ = 0, we get fixed scale modularity. Equivalent to other modularity variants.
  • When γ = 0 and M = v_V, we get modularity.
  • When M = 0 we get

    \[ Q_{0,\gamma}(G, C) \propto \sum_{c \in C} \left( \frac{w_c}{v_c} - \frac{1}{\gamma} \right), \]

    i.e. normalized cut.
  • When M → ∞ we get

    \[ Q_{\infty,\gamma}(G, C) \propto \sum_{c \in C} w_c, \]

    i.e. unnormalized cut.

SLIDE 39

Adaptive Scale Modularity: behavior

Take a simple graph:

[Figure: two cliques, each labeled w, joined by edges labeled b.]

  • Two cliques each with w within weight
  • Connected by edges with total weight b.
  • Total volume 2w + 2b.
  • What is the behavior of adaptive scale modularity?
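For this two-clique graph, two candidate clusterings can be scored from w and b alone: one cluster per clique (within weight w, volume w + b each) and a single cluster holding everything (within weight and volume both 2w + 2b). The sketch below compares only these two; clusterings that split a clique would need node-level detail that the slide does not give. The function names are chosen here for illustration.

```python
def asm_term(w_c, v_c, M, gamma):
    """One cluster's contribution to adaptive scale modularity."""
    denom = M + gamma * v_c
    if denom == 0:
        return 0.0  # assumption: a zero denominator contributes nothing
    return w_c / denom - (v_c / denom) ** 2

def two_clique_scores(w, b, M, gamma):
    """Return (score with one cluster per clique, score with a single cluster)."""
    separate = 2 * asm_term(w, w + b, M, gamma)
    merged = asm_term(2 * w + 2 * b, 2 * w + 2 * b, M, gamma)
    return separate, merged

# With M = 0 and gamma = 2 the cliques stay separate only when w > 3b.
sep_strong, merged_strong = two_clique_scores(w=40, b=10, M=0, gamma=2)
sep_weak, merged_weak = two_clique_scores(w=20, b=10, M=0, gamma=2)
print(sep_strong > merged_strong)  # True
print(sep_weak > merged_weak)      # False
```

For M = 0 and γ = 2 the two scores simplify to w/(w + b) − 1/2 and 1/4, which gives the threshold w > 3b used in the comment.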

SLIDE 40

[Figure: grid of plots with axes w and b (each ranging over 10 to 50), one row for each of M = 0, 10, 100, 1000 and one column for each of γ = 0, 1, 2, 10. Regions are labeled 1, 2 or 3; the legend entries explaining these labels did not survive extraction.]

SLIDE 42

Summary

  • 6 axioms for graph clustering objectives.
  • Graph setting allows for locality.
  • Modularity is not monotonic.
  • Non-monotonicity leads to splitting of cliques.
  • Adaptive scale modularity satisfies all axioms (when M = 0).
  • Generalizes both modularity and normalized cut.

SLIDE 43

Thank you for your attention.

Axioms for graph clustering objective functions

Twan van Laarhoven

Institute for Computing and Information Sciences Radboud University Nijmegen, The Netherlands

28th June 2013
